215754 Commits

Author SHA1 Message Date
John Baldwin
c97a3872fe Document PCI_HP and PCI_IOV kernel options and various tunables in pci(4).
Describe PCI-related kernel options for HotPlug and SR-IOV support in the
pci(4) manual page.  While here, add a section describing the various
tunables supported by the PCI bus driver as well.

Reviewed by:	wblock
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D7754
2016-09-08 19:42:49 +00:00
Bruce Evans
0c01bcb9ff Sprinkle DOINGASYNC() checks so as to do delayed writes for async
mounts in almost all cases instead of in most cases.  Don't override
DOINGASYNC() by any condition except IO_SYNC.

Fix previous sprinking of DOINGASYNC() checks.  Don't override IO_SYNC
by DOINGASYNC().  In ffs_write() and ffs_extwrite(), there were
intentional overrides that just broke O_SYNC of data.  In
ffs_truncate(), there are 5 calls to ffs_update(), 4 with
apparently-unintentional overrides and 1 without; this had no effect
due to the main async mount hack descibed below.

Fix 1 place in ffs_truncate() where the caller's IO_ASYNC was overridden
for the soft updates case too (to do a delayed write instead of a sync
write).  This is supposed to be the only change that affects anything
except async mounts.

In ffs_update(), remove the 19 year old efficiency hack of ignoring
the waitfor flag for async mounts, so that fsync() almost works for
async mounts.  All callers are supposed to be fixed to not ask for a
sync update unless they are for fsync() or [I]O_SYNC operations.
fsync() now almost works for async mounts.  It used to sync the data
but not the most important metdata (the inode).  It still doesn't sync
associated directories.

This gave 10-20% fewer writes for my makeworld benchmark with async
mounted tmp and obj directories from an already small number.

Style fixes:
- in ffs_balloc.c, remove rotted quadruplicated comments about the
  simplest part of the DOING*() decisions and rearrange the nearly-
  quadruplicated code to be more nearly so.
- in ufs_vnops.c, use a consistent style with less negative logic and
  no manual "optimization" of || to | in DOING*() expressions.

Reviewed by:	kib (previous version)
2016-09-08 17:40:40 +00:00
Ruslan Bukin
84aec472fc Allow the use of soft-interrupts for sending IPIs.
This will be required for SMP support on MIPS Malta platform.

Reviewed by:	adrian
Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
Differential Revision:	https://reviews.freebsd.org/D7835
2016-09-08 17:37:13 +00:00
Eric van Gyzen
1a04446f08 etcupdate: preserve the metadata of the destination file
When using diff3 to perform a three-way merge, etcupdate lost the destination
file's metadata. The metadata from the temporary file were used instead.
This was unpleasant for rc.d scripts, which require execute permission.
Use "cat >" to overwrite the destination file's contents while preserving its
metadata.

Reviewed by:	bapt
Sponsored by:	Dell Technologies
Differential Revision:	https://reviews.freebsd.org/D7817
2016-09-08 15:53:49 +00:00
Gabor Kovesdan
a6be469014 - Fix typo
PR:		211245
Submitted by:	Christoph Schonweiler <public2016@hauptsignal.at>
MFC after:	5 days
2016-09-08 14:50:23 +00:00
Bruce Evans
8b530941f4 Fix single-stepping of instructions emulated by vm86.
In vm86.c, fix 2 (rarely used) cases where the return code lost the
single-step indicator.  While here, fix 2 misspellings of PSL_T as
PSL_TF (TF is the CPU manufacturer's spelling, but we use T).

In trap.c, turn T_PROTFLT and T_STKFLT into T_TRCTRAP if
vm86_emulate() asked for this (it does this when the instruction is
being traced and was successully emulated).  In the kernel case, we
used to deliver the trap as SIGTRAP to the process, where it always
terminated the process; now we deliver the trap as T_TRCTRAP to kdb,
where it normally gives single-stepping.  In the user case, the only
difference is that we now clear PSL_T and initialize ucode properly.

Reviewed by:	kib
2016-09-08 14:43:39 +00:00
Ed Maste
e62264e2dd Update capabilities.conf comment
getdtablesize is per-process state, not global state
2016-09-08 14:04:04 +00:00
Alexander Motin
cd3752643c Don't report to devd statuses that CAM doesn't consider errors.
Some statuses, such as "ATA pass through information available", are part
part of absolutely normal operation and do not worth reporting.

MFC after:	2 weeks
2016-09-08 13:33:33 +00:00
Alexander Motin
5d18110a7f "Extended copy information available" is not an error either.
MFC after:	2 weeks
2016-09-08 13:03:49 +00:00
Alexander Motin
6867747328 "ATA pass through information available" is not an error.
MFC after:	2 weeks
2016-09-08 12:58:33 +00:00
Andrew Turner
13db69623b Trap msr/mrs instructions. These are privileged arm64 instructions and
shouldn't normally be used.

Obtained from:	ABT Systems Ltd
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2016-09-08 12:53:01 +00:00
Andriy Gapon
19afdc91b9 intpm: make sure to register smbus driver before intpm driver
Otherwise we can fail to create an smbus child of intpm.

MFC after:	1 week
2016-09-08 12:43:24 +00:00
Andrew Turner
e0c6c1d1fd Don't panic when we don't handle a userland exception, not all we may see
are currently handled.

Obtained from:	ABT Systems Ltd
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2016-09-08 12:39:03 +00:00
Andriy Gapon
a2f51f57d1 intpm: better clean up resources after a failed attachment
bus_generic_detach() fails when called from attach method
thus preventing further clean up actions.

MFC after:	1 week
2016-09-08 12:27:34 +00:00
Andriy Gapon
c47117f43a intpm: do not try attaching to unsupported controller revisions
While there set a different device description for the controllers
found in various FCHs (Hudson, Bolton, CPU integrated).

MFC after:	1 week
2016-09-08 12:24:46 +00:00
Andriy Gapon
6c29523e00 intpm: fix attachment to supported AMD FCHs 2016-09-08 12:12:39 +00:00
Konstantin Belousov
63876b3ba2 On rename, do not perform truncation of dirhash if the vnode
truncation failed.

Doing so resulted in inconsistent state of the ufs dirhash with regard
to the actual directory inode state, and could lead to spurious ENOENT
errors for lookups of existing files in production kernels, or
assertion failures in the debugging kernels.

Change the logic of calling ufsdirhash_dirtrunc() to be same as in
ufs_direnter().  Execute UFS_TRUNCATE() first, log error, and only do
dirtrunc() if UFS_TRUNCATE() succeeded.

Note that the problem was exacerbated by the bug in the
flush_newblk_dep() function (see r305599), which caused in the spurios
errors from ffs_sync() and then ffs_truncate().

In collaboration with:	pho
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-09-08 12:09:34 +00:00
Andriy Gapon
f0c4119c9a amdsbwd.4: update supported hardware list
And place it into its own section.

MFC after:	1 week
2016-09-08 12:09:13 +00:00
Konstantin Belousov
7b05b8a29c Do not leak transient ENOLCK error from flush_newblk_dep() loop.
The buffer lock is retried on failed LK_SLEEPFAIL attempt, and error
from the failed attempt is irrelevant.  But since there is path after
retry which does not clear error, it is possible to return spurious
error from the function.

The issue resulted in a spurious failure of softdep_sync_buf(),
causing further spurious failure of ffs_sync().

In collaboration with:	pho
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-09-08 12:08:54 +00:00
Konstantin Belousov
76db05eb14 When logging unlikely UFS_TRUNCATE() failure in ufs_direnter(),
include error code.

Reported and tested by:	pho
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-09-08 12:08:08 +00:00
Konstantin Belousov
ea16af59a1 When externding directory inode in ufs_direnter(), adjust i_endoff.
This change is formally not needed, since i_endoff not used in all
code paths after the call to ufs_direnter(), and i_endoff is
recalculated by the next lookup.  But having the value correct makes
the reasoning about code simpler.

Reported and tested by:	pho
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-09-08 12:07:25 +00:00
Andriy Gapon
6cb0fae25e intpm.4 update supported hardware list
MFC after:	1 week
2016-09-08 12:07:25 +00:00
Konstantin Belousov
e599d951e3 In dqsync(), when called from quotactl(), um_quotas entry might appear
cleared since nothing prevents completion of the parallel quotaoff.
There is nothing to sync in this case, and no reason to panic.

Reported and tested by:	pho
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-09-08 12:06:43 +00:00
Konstantin Belousov
60f1c000f3 In softdep_prealloc(), return early not only for snapshots, but for
the quota files as well.

Reported and tested by:	pho
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-09-08 12:05:13 +00:00
Konstantin Belousov
ccb19123e5 There is no need to upgrade the last dvp lock on lookups for modifying
operations.  Instead of upgrading, assert that the lock is exclusive.
Explain the cause in comments.

This effectively reverts r209367.

Tested by:	pho
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-09-08 12:04:45 +00:00
Konstantin Belousov
df4265774e Partially lift suspension when ffs_reload() finished with cgs and
going to re-read inodes.

Secondary write initiators, e.g. ufs_inactive(), might need to start a
write while owning the vnode lock.  Since the suspended state
established by /dev/ufssuspend prevents them from entering
vn_start_secondary_write(), we get deadlock otherwise.

Note that it is arguably not very useful to re-read inodes after
/dev/ufssuspend suspension, because the suspension does not block
readers, and other threads might read existing files in parallel with
suspension owner (for now, only growfs(8)) operations.  This
effectively means that suspension owner cannot safely modify existing
inodes, and then there is no sense in re-reading.  But keep the code
enabled for now.

Reported and tested by:	pho
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-09-08 12:01:28 +00:00
Alexander Motin
d4a08767ba Decode ATA Status Return descriptor.
MFC after:	2 weeks
2016-09-08 12:00:02 +00:00
Hans Petter Selasky
2b5b3a0923 Correctly map the USB mouse tilt delta values into buttons 5 and 6
instead of 3 and 4 which is used for the scroll wheel, according to
X.org.

PR:		170358
MFC after:	1 week
2016-09-08 10:10:05 +00:00
Sepherosa Ziehau
ff9eac2e6d pxeboot: Add nfs.read_size tunable.
Increase this tunable improves kernel loading speed.

Submitted by:	Jun Su <junsu microsoft com>
Reviewed by:	rpokala, wblock (previous version)
Obtained from:	DragonFlyBSD
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7756
2016-09-08 09:11:13 +00:00
Sepherosa Ziehau
b33720da59 hyperv/hn: Factor out NVS NDIS initialization
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7811
2016-09-08 07:45:20 +00:00
Sepherosa Ziehau
5152795ab9 hyperv/hn: Function renaming.
While I'm here, remove obvious comment.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7810
2016-09-08 07:34:31 +00:00
Sepherosa Ziehau
74decee899 hyperv/kvp: Fix IPv4/IPv6 address injection support.
The GUID string provided by hypervisor has leading and trailing braces,
while our GUID string does not have braces at all.  Both braces should
be ignored, when the GUID strings are compared.

Submitted by:	Hongjiang Zhang <honzhan microsoft com>
Modified by:	sephe
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7809
2016-09-08 07:16:56 +00:00
Sepherosa Ziehau
a74e025394 hyperv/hn: Pass MTU around.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7808
2016-09-08 06:42:30 +00:00
Sepherosa Ziehau
021deece8f hyperv/hn: Factor out function to do NVS initialization.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7807
2016-09-08 06:23:08 +00:00
Sepherosa Ziehau
f7a9af2829 hyperv/hn: Push RXBUF size adjustment down.
It is not used in other places.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7806
2016-09-08 06:06:54 +00:00
Sepherosa Ziehau
c8fca9324a hyperv/hn: Pull vmbus channel open up.
While I'm here, pull up the channel callback related code too.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7805
2016-09-08 05:27:43 +00:00
Allan Jude
2b53c51767 Fix typo in skein amd64 assembly
Sponsored by:	ScaleEngine Inc.
2016-09-08 02:38:55 +00:00
Kevin Lo
cee4a05669 In m_devget(), if the data fits in a packet header mbuf, check the amount
of data is less than or equal to MHLEN instead of MLEN when placing initial
small packet header at end of mbuf.

Reviewed by:	glebius
MFC after:	3 days
2016-09-08 01:02:53 +00:00
Brooks Davis
ef13681631 Remove a pointless translation of struct ioc_toc_header.
struct ioc_toc_header will be the same size (and thus IOREADTOCHEADER
will have the same value on all supported platforms).

Sponsored by:	DARPA, AFRL
2016-09-08 00:38:50 +00:00
Jung-uk Kim
00f3ae2678 Fix an obvious typo. 2016-09-07 23:37:10 +00:00
Jung-uk Kim
fd59353302 Suffix short month names with "월" and replace %b with %_m for date formats.
This change is analogous to r199179, r199271, and r289041 for japanese and
chinese locales.
2016-09-07 23:35:38 +00:00
Alexander Motin
4605bf63c4 MFV r305562: 7259 DS_FIELD_LARGE_BLOCKS is unused
The DS_FIELD_LARGE_BLOCKS macro has been unused since the integration of
this patch:

    commit ca0cc3918a1789fa839194af2a9245f801a06b1a
    Author: Matthew Ahrens <mahrens@delphix.com>
    Date:   Fri Jul 24 09:53:55 2015 -0700

        5959 clean up per-dataset feature count code
        Reviewed by: Toomas Soome <tsoome@me.com>
        Reviewed by: George Wilson <george@delphix.com>
        Reviewed by: Alex Reece <alex@delphix.com>
        Approved by: Richard Lowe <richlowe@richlowe.net>

This patch simply removes this macro from dsl_dataset.h.

Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Author: Matthew Ahrens <mahrens@delphix.com>
2016-09-07 20:09:24 +00:00
Alexander Motin
ccb5d7b8e8 7259 DS_FIELD_LARGE_BLOCKS is unused
The DS_FIELD_LARGE_BLOCKS macro has been unused since the integration of
this patch:

    commit ca0cc3918a1789fa839194af2a9245f801a06b1a
    Author: Matthew Ahrens <mahrens@delphix.com>
    Date:   Fri Jul 24 09:53:55 2015 -0700

        5959 clean up per-dataset feature count code
        Reviewed by: Toomas Soome <tsoome@me.com>
        Reviewed by: George Wilson <george@delphix.com>
        Reviewed by: Alex Reece <alex@delphix.com>
        Approved by: Richard Lowe <richlowe@richlowe.net>

This patch simply removes this macro from dsl_dataset.h.

Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Author: Matthew Ahrens <mahrens@delphix.com>
2016-09-07 20:07:39 +00:00
Alexander Motin
de1fdddeda MFV r305560: 7278 tuning zfs_arc_max does not impact arc_c_min
When changing zfs_arc_max (e.g. as zdb does), it may be set to less
than the default arc_c_min. arc_c_min should decrease to not be more than
arc_c_max, but it doesn't; therefore tuning of arc_c_max is ineffective.

Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Author: Matthew Ahrens <mahrens@delphix.com>

openzfs/openzfs@608764bead
2016-09-07 20:05:10 +00:00
Alexander Motin
03d951aef3 7278 tuning zfs_arc_max does not impact arc_c_min
When changing zfs_arc_max (e.g. as zdb does), it may be set to less
than the default arc_c_min. arc_c_min should decrease to not be more than
arc_c_max, but it doesn't; therefore tuning of arc_c_max is ineffective.

Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Author: Matthew Ahrens <mahrens@delphix.com>

openzfs/openzfs@608764bead
2016-09-07 20:00:22 +00:00
John Baldwin
6af45170c1 Chelsio T4/T5 VF driver.
The cxgbev/cxlv driver supports Virtual Function devices for Chelsio
T4 and T4 adapters.  The VF devices share most of their code with the
existing PF4 driver (cxgbe/cxl) and as such the VF device driver
currently depends on the PF4 driver.

Similar to the cxgbe/cxl drivers, the VF driver includes a t4vf/t5vf
PCI device driver that attaches to the VF device.  It then creates
child cxgbev/cxlv devices representing ports assigned to the VF.
By default, the PF driver assigns a single port to each VF.

t4vf_hw.c contains VF-specific routines from the shared code used to
fetch VF-specific parameters from the firmware.

t4_vf.c contains the VF-specific PCI device driver and includes its
own attach routine.

VF devices are required to use a different firmware request when
transmitting packets (which in turn requires a different CPL message
to encapsulate messages).  This alternate firmware request does not
permit chaining multiple packets in a single message, so each packet
results in a firmware request.  In addition, the different CPL message
requires more detailed information when enabling hardware checksums,
so parse_pkt() on VF devices must examine L2 and L3 headers for all
packets (not just TSO packets) for VF devices.  Finally, L2 checksums
on non-UDP/non-TCP packets do not work reliably (the firmware trashes
the IPv4 fragment field), so IPv4 checksums for such packets are
calculated in software.

Most of the other changes in the non-VF-specific code are to expose
various variables and functions private to the PF driver so that they
can be used by the VF driver.

Note that a limited subset of cxgbetool functions are supported on VF
devices including register dumps, scheduler classes, and clearing of
statistics.  In addition, TOE is not supported on VF devices, only for
the PF interfaces.

Reviewed by:	np
MFC after:	2 months
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7599
2016-09-07 18:13:57 +00:00
John Baldwin
e06ab612d2 Don't break out of the m_advance() loop if len drops to zero.
If a packet contains the Ethernet header (14 bytes) in the first mbuf
and the payload (IP + UDP + data) in the second mbuf, then the attempt
to fetch the l3hdr will return a NULL pointer.  The first loop iteration
will drop len to zero and exit the loop without setting 'p'.  However,
the desired data is at the start of the second mbuf, so the correct
behavior is to loop around and let the conditional set 'p' to m_data of
the next mbuf (and leave offset as 0).

Reviewed by:	np
Sponsored by:	Chelsio Communications
2016-09-07 18:08:43 +00:00
Andrew Turner
77c02eccb8 When synchronising the instruction and data caches we only need to clean
the data cache to the point of unification. This is the point where the
two caches are unified to a single unified cache so cleaning past here
is just extra unneeded work.

This was noticed when investigating r305545.

Reported by:	bz
Obtained from:	ABT Systems Ltd
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2016-09-07 16:46:54 +00:00
Andrew Turner
3b34364450 Only call cpu_icache_sync_range when inserting an executable page. If the
page is non-executable the contents of the i-cache are unimportant so this
call is just adding unneeded overhead when inserting pages.

While doing research using gem5 with an O3 pipeline and 1k/32k/1M iTLB/L1
iCache/L2 Bjoern Zeeb (bz@) observed a fairly high rate of calls into
arm64_icache_sync_range() from pmap_enter() along with a high number of
instruction fetches and iTLB/iCache hits.

Limiting the calls to arm64_icache_sync_range() to only executable pages,
we observe the iTLB and iCache Hit going down by about 43%. These numbers
are quite misleading when looked at alone as at the same time instructions
retired were reduced by 19.2% and instruction fetches were reduced by 38.8%.
Overall this reduced the runtime of the test program by 22.4%.

On Juno hardware, in steady-state, running the same test, using the cycle
count to determine runtime, we do see a reduction of up to 28.9% in runtime.

While these numbers certainly depend on the program executed, we expect an
overall performance improvement.

Reported by:	bz
Obtained from:	ABT Systems Ltd
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2016-09-07 16:22:05 +00:00
Andriy Voskoboinyk
d204cea9f8 rum: fix possible panic on device detach (similar to r302034).
Tested with WUSB54GC, STA/AP modes.
2016-09-07 16:19:20 +00:00