Commit Graph

113112 Commits

Author SHA1 Message Date
John Baldwin
1fedfdd5c1 Remove explicit device_verbose() from the t4iov driver detach routine
now that this case is handled generically.
2016-09-12 18:07:06 +00:00
John Baldwin
71499f6a2d Make device_quiet() an attachment property.
In particular, reset the DF_QUIET flag when detaching from a device so
that a driver that marks a device quiet doesn't dictate policy for a
different driver that may claim the device in the future.

Reviewed by:	rpokala, wblock
MFC after:	2 weeks
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7803
2016-09-12 18:06:42 +00:00
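The "attachment property" idea above can be sketched with a toy model. This is not the kernel's newbus code: `toy_device`, `toy_attach()`, `toy_detach()` and the value of `TOY_DF_QUIET` are hypothetical names chosen for illustration. The only point taken from the commit message is that the quiet flag is cleared on detach so it cannot leak to the next driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_DF_QUIET 0x1u	/* hypothetical flag value, not the kernel's */

struct toy_device {
	unsigned flags;
	const char *driver;	/* name of the attached driver, or NULL */
};

/* Attach a driver; the driver may ask for quiet operation. */
void
toy_attach(struct toy_device *dev, const char *driver, bool quiet)
{
	assert(dev != NULL && driver != NULL);
	dev->driver = driver;
	if (quiet)
		dev->flags |= TOY_DF_QUIET;
}

/* Detach the driver and drop the quiet flag along with it. */
void
toy_detach(struct toy_device *dev)
{
	assert(dev != NULL);
	dev->driver = NULL;
	dev->flags &= ~TOY_DF_QUIET;
}

bool
toy_is_quiet(const struct toy_device *dev)
{
	assert(dev != NULL);
	return ((dev->flags & TOY_DF_QUIET) != 0);
}
```

In this model, a device that was quiet under a first driver is no longer quiet after `toy_detach()`, so a second driver starts from the default policy.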
Oleksandr Tymoshenko
e0cfa1bc82 Remove semicolon from the end of the macro definition
Reported by: hans
2016-09-12 17:29:20 +00:00
Andriy Voskoboinyk
bf5e39f7a7 urtwn: fix possible driver hang when beacon miss is detected. 2016-09-12 16:46:14 +00:00
Konstantin Belousov
1a9ded46bd Fix typo in comment.
MFC after:	3 days
2016-09-12 16:44:21 +00:00
Ruslan Bukin
693b6aeede Add SMP support for MTI Malta 34kf CPU.
Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-09-12 16:38:51 +00:00
Emmanuel Vadot
d853a418ed Remove CUBIEBOARD kernel config file.
Every Allwinner board should either use ALLWINNER (SMP) or ALLWINNER_UP kernel
config files.

MFC after:	2 weeks
2016-09-12 16:13:27 +00:00
Sepherosa Ziehau
e0e273e2d4 hyperv/hn: Pull ether address and link status extraction up.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7831
2016-09-12 06:12:28 +00:00
Sepherosa Ziehau
2a3a8a823c hyperv/hn: Reorganize RNDIS attach
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7830
2016-09-12 05:59:39 +00:00
Sepherosa Ziehau
2cd02514a1 hyperv/hn: Reorganize sub-channel allocation.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7829
2016-09-12 05:37:44 +00:00
Sepherosa Ziehau
b5f59ae0e2 hyperv/hn: Function rename.
- Minor style changes.
- Nuke unnecessary indirection.
- Nuke unapplied comment.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7827
2016-09-12 05:28:50 +00:00
Warner Losh
1648ac5095 Report the Silicon Revisions for the AM335x SoCs correctly. 2016-09-12 05:19:56 +00:00
Sepherosa Ziehau
0c84fbafef hyperv/hn: Rename chimney sending buffer connect/disconnect functions.
Minor cleanup and wording in error messages.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7825
2016-09-12 05:18:30 +00:00
Sepherosa Ziehau
a711c28f62 hyperv/hn: Rename RXBUF connect/disconnect functions.
Minor cleanup and wording in error messages.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7823
2016-09-12 05:09:45 +00:00
Adrian Chadd
b60cfea4fd [ath_hal] quieten a bit of the boot messages - this stuff has been working for a while. 2016-09-12 04:58:59 +00:00
Sepherosa Ziehau
b9f62e3a74 x86: Use sx lock for interrupt sources.
- Certain pic_assign_cpu, e.g. msi_assign_cpu can have quite a long
  call chain.  For msi_assign_cpu, mutex makes complex PCI bridge
  drivers more tricky, e.g. sleep cannot be called, etc, it will
  be pretty tricky for upcoming Hyper-V PCI bridge driver for PCI
  pass-through.
- It is not used on any hot code path nor non-sleepable context, so
  sx should have the same effect as mutex.

PIC list is still protected by mutex to keep suspend/resume work.

Discussed with: jhb
Reviewed by:	jhb
MFC after:	3 weeks
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7784
2016-09-12 04:57:58 +00:00
Adrian Chadd
5abc0b2590 [ath] set the relevant TOA/TOD locationing bits when trying to do locationing.
* Don't do RTS/CTS - experiments show that we get ACK frames for each of them
  and this ends up causing the timestamps to look all funny.
* Set the HAL_TXDESC_POS bit, so the AR9300 HAL sets up the hardware to return
  location and CSI information.
2016-09-12 04:55:13 +00:00
Adrian Chadd
bd3b336251 [ath] tweak the TX EDMA debugging a bit.
I've used this to debug some amusing issues with the EDMA code.

Tested:

* AR9380, STA mode
* AR9380, TDMA mode (master, slave)
2016-09-12 04:50:40 +00:00
Oleksandr Tymoshenko
b763171b46 Cleanup evdev support for TI ADC/TS
- evdev_set_methods call is not required if actual methods are no-ops
- evdev_set_serial is also optional if there is no meaningful input device
    identifier
- evdev_set_id on the other hand is mandatory, so set virtual bus with
    dummy vendor/product/version

Suggested by:	Vladimir Kondratiev
2016-09-12 01:18:25 +00:00
Navdeep Parhar
4cf3aa135b cxgbe(4): Catch up with the rename of tlscaps -> cryptocaps. TLS is one
of the capabilities of the crypto engine in T6.

Sponsored by:	Chelsio Communications
2016-09-12 00:15:40 +00:00
Navdeep Parhar
9113e53d54 cxgbe(4): Add support for additional port types and link speeds.
Sponsored by:	Chelsio Communications
2016-09-11 23:08:57 +00:00
Conrad Meyer
f44abf1fe7 ioat(4): Start poll timer when descriptors are released to HW
Rather than when the software creates the descriptors.

Sponsored by:	Dell EMC Isilon
2016-09-11 20:15:41 +00:00
Conrad Meyer
dc46505973 ioat(4): De-spam ioat_process_events KTR logs
Sponsored by:	Dell EMC Isilon
2016-09-11 20:14:19 +00:00
Oleksandr Tymoshenko
fbeb453ba9 Add evdev support to TI ADC/touchscreen driver
Add generic evdev support to touchscreen part of ti_adc: two absolute
coordinates + button touch to indicate pen position. Pressure value
reporting is not implemented yet.

Tested on: Beaglebone Black + 4DCAPE-43T + tslib
2016-09-11 19:08:21 +00:00
Oleksandr Tymoshenko
2b3f6d6650 Add evdev protocol implementation
evdev is a generic input event interface compatible with Linux
evdev API at ioctl level. It allows using unmodified (apart from
header name) input evdev drivers in Xorg, Wayland, Qt.

This commit has only generic kernel API. evdev support for individual
hardware drivers like ukbd, ums, atkbd, etc. will be committed later.

Project was started by Jakub Klama as part of GSoC 2014. Jakub's
evdev implementation was later used as a base, updated and finished
by Vladimir Kondratiev.

Submitted by:	Vladimir Kondratiev <wulf@cicgroup.ru>
Reviewed by:	adrian, hans
Differential Revision:	https://reviews.freebsd.org/D6998
2016-09-11 18:56:38 +00:00
Navdeep Parhar
769ef07a38 cxgbe(4): Rename the debug_flags driver tunable/sysctl to dflags.
Tunables that end with _flags are special.

Sponsored by:	Chelsio Communications
2016-09-11 18:05:37 +00:00
Navdeep Parhar
82c1d6b762 cxgbe(4): Deal with the slightly different SGE_STAT_CFG in T6.
Sponsored by:	Chelsio Communications
2016-09-11 17:57:53 +00:00
Navdeep Parhar
ed7e5640a5 cxgbe(4): Use smaller min/max bursts for fl descriptors with a T6.
Sponsored by:	Chelsio Communications
2016-09-11 17:51:17 +00:00
Allan Jude
c2b475d0ee MFV r268120:
4936 lz4 could theoretically overflow a pointer with a certain input

  illumos/illumos-gate@58d0718061

Reviewed by:	delphij
MFC after:	2 weeks
Sponsored by:	ScaleEngine Inc.
Differential Revision:	https://reviews.freebsd.org/D7850
2016-09-11 17:48:06 +00:00
Navdeep Parhar
0dbc6cfd75 cxgbe(4): Update the pad_boundary calculation for T6, which has a
different range of boundaries.

Sponsored by:	Chelsio Communications
2016-09-11 17:22:54 +00:00
Navdeep Parhar
472a6004cf cxgbe(4): Use correct macro for header length with T6 ASICs. This
affects the transmit of the VF driver only.

Sponsored by:	Chelsio Communications
2016-09-11 16:11:51 +00:00
Navdeep Parhar
774168be39 cxgbe(4): Set up fl_starve_threshold2 accurately for T6.
Sponsored by:	Chelsio Communications
2016-09-11 16:06:17 +00:00
Konstantin Belousov
cf1c47763f Add FPU_KERN_NOCTX flag to the fpu_kern_enter() function on amd64.
The flag specifies that the block which uses FPU must be executed in
critical section, i.e. take no context switches, and does not need an
FPU save area during the execution.

It is intended to be applied around fast and short code paths where
save area allocation is impossible or undesirable, due to context or
due to the relative cost of calculation vs. allocation.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-09-11 09:14:07 +00:00
Emmanuel Vadot
2fb194c968 a10_mmc: Remove completely the PIO code now all access is done by DMA.
Rename registers as in the manual.
Do a hard reset of the controller before a soft one.
Since DMA is always used remove dependency on allwinner_soc_family, it was used
to differentiate SoC as the fdt compatible string were the same.

Tested on A10, A20, H3 and A64.

Reviewed by:	jmcneill
Differential Revision:	https://reviews.freebsd.org/D6868
2016-09-10 17:45:35 +00:00
Alan Cox
8cb0c1029d Various changes to pmap_ts_referenced()
Move PMAP_TS_REFERENCED_MAX out of the various pmap implementations and
into vm/pmap.h, and describe what its purpose is.  Eliminate the archaic
"XXX" comment about its value.  I don't believe that its exact value, e.g.,
5 versus 6, matters.

Update the arm64 and riscv pmap implementations of pmap_ts_referenced()
to opportunistically update the page's dirty field.

On amd64, use the PDE value already cached in a local variable rather than
dereferencing a pointer again and again.

Reviewed by:	kib, markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D7836
2016-09-10 16:49:25 +00:00
Mateusz Guzik
a27815330c cache: improve scalability by introducing bucket locks
An array of bucket locks is added.

All modifications still require the global cache_lock to be held for
writing. However, most readers only need the relevant bucket lock and in
effect can run concurrently to the writer as long as they use a
different lock. See the added comment for more details.

This is an intermediate step towards removal of the global lock.

Reviewed by:	kib
Tested by:	pho
2016-09-10 16:29:53 +00:00
Alexander Motin
20e45e033c Switch random_get_pseudo_bytes() shim to arc4rand().
Our shim for Solaris random_get_bytes() uses read_random(), that looks
reasonable, since it guarantees reliably seeded random data.  On the other
side Solaris random_get_pseudo_bytes() does not provide this guarantee,
and its original Solaris implementation is equivalent to our arc4rand(),
using software crypto without stressing slower hardware RNG.
2016-09-10 09:37:41 +00:00
Konstantin Belousov
2e4fd101fa Fix build 2016-09-10 09:00:12 +00:00
Justin Hibbits
44dd680963 Add ehci to the MPC85XX build
Many QorIQ and MPC85xx SoCs have USB support, so add it to the kernel.

MFC after:	1 week
2016-09-10 01:09:58 +00:00
Jilles Tjoelker
d30e66e53a wait: Do not copyout uninitialized status/rusage/wrusage.
If wait4() or wait6() return 0 because of WNOHANG, the status, rusage and
wrusage information should not be returned.

PR:		212048
Reported by:	Casey Lucas
MFC after:	2 weeks
2016-09-09 21:58:48 +00:00
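From the caller's side, the rule in the wait commit above means the status is meaningful only when a child was actually reaped. A minimal userland sketch, assuming a hypothetical helper named `reap_nonblocking()` (it uses the standard `waitpid()` with `WNOHANG`, not the kernel code touched by the commit):

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Poll a child without blocking.
 * Returns 1 and stores the exit code if the child exited normally,
 * 0 if nothing was reaped (the child is still running), and -1 on
 * error or abnormal termination.
 * *exit_code is written only when 1 is returned.
 */
int
reap_nonblocking(pid_t pid, int *exit_code)
{
	int status;
	pid_t r;

	assert(exit_code != NULL);
	r = waitpid(pid, &status, WNOHANG);
	if (r == 0)
		return (0);	/* nothing reaped: status must not be read */
	if (r < 0)
		return (-1);	/* e.g. ECHILD */
	if (!WIFEXITED(status))
		return (-1);	/* killed by a signal */
	*exit_code = WEXITSTATUS(status);
	return (1);
}
```

A caller would typically loop on this helper from an event loop and look at the exit code only after it returns 1.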
Mateusz Guzik
a0d45f0fc8 locks: add backoff for spin mutexes and thread lock
Reviewed by:	jhb
2016-09-09 19:13:02 +00:00
Ed Maste
82b3cec52b ANSIfy uipc_syscalls.c
Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D7839
2016-09-09 17:40:26 +00:00
Navdeep Parhar
3aea32935c cxgbe(4): Avoid a NULL dereference in the clearstats ioctl handler.
Port softc's are not initialized when the adapter is in recovery mode.
2016-09-09 17:15:16 +00:00
Bruce Evans
5c48342f16 Pass the trap type and code down from db_trap() to db_stop_at_pc() so
that the latter can easily determine what the trap type actually is
after callers are fixed to encode the type unambiguously.

ddb currently barely understands breakpoints, and it treats all
non-breakpoints as single-step traps.  This works OK for stopping
after every instruction when single-stepping, but is broken for
single-stepping with a count > 1 (especially with a large count).
ddb needs to stop on the first non-single-step trap while single-
stepping.  Otherwise, ddb doesn't even stop the first time for
fatal traps and external breakpoints like the one in kdb_enter().
2016-09-09 15:53:42 +00:00
Ruslan Bukin
930b6b4dce Add support for SMP on MIPS Malta platform.
Tested in QEMU on Malta32, Malta64.

Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-09-09 14:50:44 +00:00
Bruce Evans
10c458cc3b Fix stopping when the specified breakpoint count is reached. The
countdown was done correctly, but the action when the count was not
reduced to 0 was to fall through to generic code which almost always
stopped.
2016-09-09 14:09:50 +00:00
Mateusz Guzik
6a3e46059a nullfs: plug vnode ref leak in null_vptocnp
The lower vnode is already referenced and nodeget is supposed to consume
the reference. Thus the extra vref call was causing a leak.

Reported by:	pho
Reviewed by:	kib
MFC after:	1 week
2016-09-09 10:40:55 +00:00
Navdeep Parhar
9e7cb06c17 cxgbe(4): Do not prescreen frames before attempting LRO.
Sponsored by:	Chelsio Communications
2016-09-09 07:34:14 +00:00
Adrian Chadd
01decb509d [gpio] include intr.h when building with INTRNG.
Trying to build a MIPS platform that uses INTRNG needs this
for this to work right in gpiobusvar.h :

#ifdef INTRNG
struct intr_map_data_gpio {
        struct intr_map_data    hdr;
...
};
#endif
2016-09-09 04:54:41 +00:00
Adrian Chadd
c028fb5098 [net80211] add in ToA/ToD based location mbuf tags for some experimenting. 2016-09-09 04:47:48 +00:00
Adrian Chadd
90d3a30a16 [ath_hal] fixes for finer grain timestamping, some 11n macros
* change the HT_RC_2_MCS to do MCS0..23
* Use it when looking up the ht20/ht40 array for bits-per-symbol
* add a clk_to_psec (picoseconds) routine, so we can get sub-microsecond
  accuracy for the math
* .. and make that + clk_to_usec public, so higher layer code that is
  returning clocks (eg the ANI diag routines, some upcoming locationing
  experiments) can be converted to microseconds.

Whilst here, add a comment in ar5416 so I or someone else can revisit the
latency values.
2016-09-09 04:45:25 +00:00
Justin Hibbits
51d025a596 Correct the type of db_cmd_loop_done.
On big endian hardware that uses 1 byte bool a type mismatch of bool vs int will
cause the least significant byte of db_cmd_loop_done to be set, but the MSB to be
read, and read as 0.  This causes ddb to stay in an infinite loop.

MFC after:	1 week
2016-09-09 04:16:53 +00:00
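The byte-order effect described in the db_cmd_loop_done commit above can be illustrated in portable C. This sketch does not reproduce the mismatched declarations themselves (that would be undefined behaviour); it only shows what a reader sees when it looks at the first byte of a wider integer. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Detect the host byte order by inspecting the first byte of 1. */
bool
host_is_little_endian(void)
{
	const uint32_t probe = 1;
	unsigned char first;

	memcpy(&first, &probe, 1);
	return (first == 1);
}

/*
 * Read only the lowest-addressed byte of a 32-bit object, the way a
 * reader that wrongly believes the object is a 1-byte bool would.
 */
unsigned char
read_as_one_byte(uint32_t stored)
{
	unsigned char first;

	memcpy(&first, &stored, 1);
	return (first);
}
```

On a little-endian host `read_as_one_byte(1)` sees the byte that was set; on a big-endian host it sees the most significant byte, which is zero, matching the infinite loop the commit message describes.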
Conrad Meyer
06b9366795 queue(3): Enhance queue debugging macros
Split the QUEUE_MACRO_DEBUG into QUEUE_MACRO_DEBUG_TRACE and
QUEUE_MACRO_DEBUG_TRASH.

Add the debug macros QMD_IS_TRASHED() and QMD_SLIST_CHECK_PREVPTR().

Document these in queue.3.

Reviewed by:	emaste
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D3984
2016-09-08 21:20:01 +00:00
John Baldwin
b3db2736b1 Don't check aq64_minfree which is unsigned for negative values.
This fixes a tautological comparison warning.

Reviewed by:	rwatson
Differential Revision:	https://reviews.freebsd.org/D7682
2016-09-08 19:47:57 +00:00
Bruce Evans
0c01bcb9ff Sprinkle DOINGASYNC() checks so as to do delayed writes for async
mounts in almost all cases instead of in most cases.  Don't override
DOINGASYNC() by any condition except IO_SYNC.

Fix previous sprinkling of DOINGASYNC() checks.  Don't override IO_SYNC
by DOINGASYNC().  In ffs_write() and ffs_extwrite(), there were
intentional overrides that just broke O_SYNC of data.  In
ffs_truncate(), there are 5 calls to ffs_update(), 4 with
apparently-unintentional overrides and 1 without; this had no effect
due to the main async mount hack described below.

Fix 1 place in ffs_truncate() where the caller's IO_ASYNC was overridden
for the soft updates case too (to do a delayed write instead of a sync
write).  This is supposed to be the only change that affects anything
except async mounts.

In ffs_update(), remove the 19 year old efficiency hack of ignoring
the waitfor flag for async mounts, so that fsync() almost works for
async mounts.  All callers are supposed to be fixed to not ask for a
sync update unless they are for fsync() or [I]O_SYNC operations.
fsync() now almost works for async mounts.  It used to sync the data
but not the most important metadata (the inode).  It still doesn't sync
associated directories.

This gave 10-20% fewer writes for my makeworld benchmark with async
mounted tmp and obj directories from an already small number.

Style fixes:
- in ffs_balloc.c, remove rotted quadruplicated comments about the
  simplest part of the DOING*() decisions and rearrange the nearly-
  quadruplicated code to be more nearly so.
- in ufs_vnops.c, use a consistent style with less negative logic and
  no manual "optimization" of || to | in DOING*() expressions.

Reviewed by:	kib (previous version)
2016-09-08 17:40:40 +00:00
Ruslan Bukin
84aec472fc Allow the use of soft-interrupts for sending IPIs.
This will be required for SMP support on MIPS Malta platform.

Reviewed by:	adrian
Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
Differential Revision:	https://reviews.freebsd.org/D7835
2016-09-08 17:37:13 +00:00
Bruce Evans
8b530941f4 Fix single-stepping of instructions emulated by vm86.
In vm86.c, fix 2 (rarely used) cases where the return code lost the
single-step indicator.  While here, fix 2 misspellings of PSL_T as
PSL_TF (TF is the CPU manufacturer's spelling, but we use T).

In trap.c, turn T_PROTFLT and T_STKFLT into T_TRCTRAP if
vm86_emulate() asked for this (it does this when the instruction is
being traced and was successfully emulated).  In the kernel case, we
used to deliver the trap as SIGTRAP to the process, where it always
terminated the process; now we deliver the trap as T_TRCTRAP to kdb,
where it normally gives single-stepping.  In the user case, the only
difference is that we now clear PSL_T and initialize ucode properly.

Reviewed by:	kib
2016-09-08 14:43:39 +00:00
Ed Maste
e62264e2dd Update capabilities.conf comment
getdtablesize is per-process state, not global state
2016-09-08 14:04:04 +00:00
Alexander Motin
cd3752643c Don't report to devd statuses that CAM doesn't consider errors.
Some statuses, such as "ATA pass through information available", are
part of absolutely normal operation and are not worth reporting.

MFC after:	2 weeks
2016-09-08 13:33:33 +00:00
Alexander Motin
5d18110a7f "Extended copy information available" is not an error either.
MFC after:	2 weeks
2016-09-08 13:03:49 +00:00
Alexander Motin
6867747328 "ATA pass through information available" is not an error.
MFC after:	2 weeks
2016-09-08 12:58:33 +00:00
Andrew Turner
13db69623b Trap msr/mrs instructions. These are privileged arm64 instructions and
shouldn't normally be used.

Obtained from:	ABT Systems Ltd
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2016-09-08 12:53:01 +00:00
Andriy Gapon
19afdc91b9 intpm: make sure to register smbus driver before intpm driver
Otherwise we can fail to create an smbus child of intpm.

MFC after:	1 week
2016-09-08 12:43:24 +00:00
Andrew Turner
e0c6c1d1fd Don't panic when we don't handle a userland exception, not all we may see
are currently handled.

Obtained from:	ABT Systems Ltd
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2016-09-08 12:39:03 +00:00
Andriy Gapon
a2f51f57d1 intpm: better clean up resources after a failed attachment
bus_generic_detach() fails when called from attach method
thus preventing further clean up actions.

MFC after:	1 week
2016-09-08 12:27:34 +00:00
Andriy Gapon
c47117f43a intpm: do not try attaching to unsupported controller revisions
While there set a different device description for the controllers
found in various FCHs (Hudson, Bolton, CPU integrated).

MFC after:	1 week
2016-09-08 12:24:46 +00:00
Andriy Gapon
6c29523e00 intpm: fix attachment to supported AMD FCHs 2016-09-08 12:12:39 +00:00
Konstantin Belousov
63876b3ba2 On rename, do not perform truncation of dirhash if the vnode
truncation failed.

Doing so resulted in inconsistent state of the ufs dirhash with regard
to the actual directory inode state, and could lead to spurious ENOENT
errors for lookups of existing files in production kernels, or
assertion failures in the debugging kernels.

Change the logic of calling ufsdirhash_dirtrunc() to be same as in
ufs_direnter().  Execute UFS_TRUNCATE() first, log error, and only do
dirtrunc() if UFS_TRUNCATE() succeeded.

Note that the problem was exacerbated by the bug in the
flush_newblk_dep() function (see r305599), which caused spurious
errors from ffs_sync() and then ffs_truncate().

In collaboration with:	pho
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-09-08 12:09:34 +00:00
Konstantin Belousov
7b05b8a29c Do not leak transient ENOLCK error from flush_newblk_dep() loop.
The buffer lock is retried on failed LK_SLEEPFAIL attempt, and error
from the failed attempt is irrelevant.  But since there is path after
retry which does not clear error, it is possible to return spurious
error from the function.

The issue resulted in a spurious failure of softdep_sync_buf(),
causing further spurious failure of ffs_sync().

In collaboration with:	pho
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-09-08 12:08:54 +00:00
Konstantin Belousov
76db05eb14 When logging unlikely UFS_TRUNCATE() failure in ufs_direnter(),
include error code.

Reported and tested by:	pho
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-09-08 12:08:08 +00:00
Konstantin Belousov
ea16af59a1 When extending directory inode in ufs_direnter(), adjust i_endoff.
This change is formally not needed, since i_endoff is not used in all
code paths after the call to ufs_direnter(), and i_endoff is
recalculated by the next lookup.  But having the value correct makes
the reasoning about code simpler.

Reported and tested by:	pho
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-09-08 12:07:25 +00:00
Konstantin Belousov
e599d951e3 In dqsync(), when called from quotactl(), um_quotas entry might appear
cleared since nothing prevents completion of the parallel quotaoff.
There is nothing to sync in this case, and no reason to panic.

Reported and tested by:	pho
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-09-08 12:06:43 +00:00
Konstantin Belousov
60f1c000f3 In softdep_prealloc(), return early not only for snapshots, but for
the quota files as well.

Reported and tested by:	pho
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-09-08 12:05:13 +00:00
Konstantin Belousov
ccb19123e5 There is no need to upgrade the last dvp lock on lookups for modifying
operations.  Instead of upgrading, assert that the lock is exclusive.
Explain the cause in comments.

This effectively reverts r209367.

Tested by:	pho
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-09-08 12:04:45 +00:00
Konstantin Belousov
df4265774e Partially lift suspension when ffs_reload() finished with cgs and
going to re-read inodes.

Secondary write initiators, e.g. ufs_inactive(), might need to start a
write while owning the vnode lock.  Since the suspended state
established by /dev/ufssuspend prevents them from entering
vn_start_secondary_write(), we get deadlock otherwise.

Note that it is arguably not very useful to re-read inodes after
/dev/ufssuspend suspension, because the suspension does not block
readers, and other threads might read existing files in parallel with
suspension owner (for now, only growfs(8)) operations.  This
effectively means that suspension owner cannot safely modify existing
inodes, and then there is no sense in re-reading.  But keep the code
enabled for now.

Reported and tested by:	pho
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-09-08 12:01:28 +00:00
Alexander Motin
d4a08767ba Decode ATA Status Return descriptor.
MFC after:	2 weeks
2016-09-08 12:00:02 +00:00
Hans Petter Selasky
2b5b3a0923 Correctly map the USB mouse tilt delta values into buttons 5 and 6
instead of 3 and 4 which is used for the scroll wheel, according to
X.org.

PR:		170358
MFC after:	1 week
2016-09-08 10:10:05 +00:00
Sepherosa Ziehau
ff9eac2e6d pxeboot: Add nfs.read_size tunable.
Increasing this tunable improves kernel loading speed.

Submitted by:	Jun Su <junsu microsoft com>
Reviewed by:	rpokala, wblock (previous version)
Obtained from:	DragonFlyBSD
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7756
2016-09-08 09:11:13 +00:00
Sepherosa Ziehau
b33720da59 hyperv/hn: Factor out NVS NDIS initialization
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7811
2016-09-08 07:45:20 +00:00
Sepherosa Ziehau
5152795ab9 hyperv/hn: Function renaming.
While I'm here, remove obvious comment.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7810
2016-09-08 07:34:31 +00:00
Sepherosa Ziehau
74decee899 hyperv/kvp: Fix IPv4/IPv6 address injection support.
The GUID string provided by hypervisor has leading and trailing braces,
while our GUID string does not have braces at all.  Both braces should
be ignored, when the GUID strings are compared.

Submitted by:	Hongjiang Zhang <honzhan microsoft com>
Modified by:	sephe
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7809
2016-09-08 07:16:56 +00:00
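The brace-insensitive GUID comparison described in the hyperv/kvp commit above can be sketched as follows. `guid_equal_ignoring_braces()` is a hypothetical helper, not the function changed by the commit, and it assumes an exact, case-sensitive match of the remaining characters.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Skip one leading '{' and one trailing '}' if present. */
static void
strip_braces(const char **s, size_t *len)
{
	*len = strlen(*s);
	if (*len > 0 && (*s)[0] == '{') {
		(*s)++;
		(*len)--;
	}
	if (*len > 0 && (*s)[*len - 1] == '}')
		(*len)--;
}

/* Compare two GUID strings, ignoring surrounding braces on either one. */
bool
guid_equal_ignoring_braces(const char *a, const char *b)
{
	size_t alen, blen;

	assert(a != NULL && b != NULL);
	strip_braces(&a, &alen);
	strip_braces(&b, &blen);
	return (alen == blen && memcmp(a, b, alen) == 0);
}
```

Stripping both inputs keeps the helper symmetric, so it does not matter which side supplies the braced form.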
Sepherosa Ziehau
a74e025394 hyperv/hn: Pass MTU around.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7808
2016-09-08 06:42:30 +00:00
Sepherosa Ziehau
021deece8f hyperv/hn: Factor out function to do NVS initialization.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7807
2016-09-08 06:23:08 +00:00
Sepherosa Ziehau
f7a9af2829 hyperv/hn: Push RXBUF size adjustment down.
It is not used in other places.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7806
2016-09-08 06:06:54 +00:00
Sepherosa Ziehau
c8fca9324a hyperv/hn: Pull vmbus channel open up.
While I'm here, pull up the channel callback related code too.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7805
2016-09-08 05:27:43 +00:00
Allan Jude
2b53c51767 Fix typo in skein amd64 assembly
Sponsored by:	ScaleEngine Inc.
2016-09-08 02:38:55 +00:00
Kevin Lo
cee4a05669 In m_devget(), if the data fits in a packet header mbuf, check the amount
of data is less than or equal to MHLEN instead of MLEN when placing initial
small packet header at end of mbuf.

Reviewed by:	glebius
MFC after:	3 days
2016-09-08 01:02:53 +00:00
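Why the bound in the m_devget() commit above matters can be shown with toy numbers. A packet header mbuf has less data space than a plain mbuf, so an end-of-buffer placement computed against the larger limit would start before the usable area. All sizes below are hypothetical and chosen only to make the arithmetic visible; they are not the kernel's values.

```c
#include <stddef.h>

#define TOY_MSIZE	256			/* hypothetical mbuf size */
#define TOY_MHDR	32			/* hypothetical common header */
#define TOY_PKTHDR	56			/* hypothetical packet header */
#define TOY_MLEN	(TOY_MSIZE - TOY_MHDR)	/* data space, plain mbuf */
#define TOY_MHLEN	(TOY_MLEN - TOY_PKTHDR)	/* data space, pkthdr mbuf */

/*
 * Offset at which len bytes would be placed so that they end exactly
 * at the end of a packet header mbuf's data area, or -1 if they do
 * not fit.  The check uses TOY_MHLEN, not TOY_MLEN.
 */
long
toy_pkthdr_tail_offset(size_t len)
{
	if (len > TOY_MHLEN)
		return (-1);
	return ((long)(TOY_MHLEN - len));
}
```

With these toy sizes, a 200-byte payload would pass a check against `TOY_MLEN` (224) but does not fit in the 168 bytes a packet header mbuf really has.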
Brooks Davis
ef13681631 Remove a pointless translation of struct ioc_toc_header.
struct ioc_toc_header will be the same size (and thus IOREADTOCHEADER
will have the same value on all supported platforms).

Sponsored by:	DARPA, AFRL
2016-09-08 00:38:50 +00:00
Alexander Motin
4605bf63c4 MFV r305562: 7259 DS_FIELD_LARGE_BLOCKS is unused
The DS_FIELD_LARGE_BLOCKS macro has been unused since the integration of
this patch:

    commit ca0cc3918a1789fa839194af2a9245f801a06b1a
    Author: Matthew Ahrens <mahrens@delphix.com>
    Date:   Fri Jul 24 09:53:55 2015 -0700

        5959 clean up per-dataset feature count code
        Reviewed by: Toomas Soome <tsoome@me.com>
        Reviewed by: George Wilson <george@delphix.com>
        Reviewed by: Alex Reece <alex@delphix.com>
        Approved by: Richard Lowe <richlowe@richlowe.net>

This patch simply removes this macro from dsl_dataset.h.

Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Author: Matthew Ahrens <mahrens@delphix.com>
2016-09-07 20:09:24 +00:00
Alexander Motin
de1fdddeda MFV r305560: 7278 tuning zfs_arc_max does not impact arc_c_min
When changing zfs_arc_max (e.g. as zdb does), it may be set to less
than the default arc_c_min. arc_c_min should decrease to not be more than
arc_c_max, but it doesn't; therefore tuning of arc_c_max is ineffective.

Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Author: Matthew Ahrens <mahrens@delphix.com>

openzfs/openzfs@608764bead
2016-09-07 20:05:10 +00:00
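The invariant described in the zfs_arc_max commit above is that the minimum must never exceed the maximum after tuning. A minimal sketch, assuming a hypothetical `toy_arc_limits` structure and setter rather than the real ARC code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_arc_limits {
	uint64_t c_min;	/* lower bound on cache size, in bytes */
	uint64_t c_max;	/* upper bound on cache size, in bytes */
};

/* Set a new maximum and pull the minimum down if it would exceed it. */
void
toy_arc_set_max(struct toy_arc_limits *l, uint64_t new_max)
{
	assert(l != NULL);
	l->c_max = new_max;
	if (l->c_min > l->c_max)
		l->c_min = l->c_max;
}
```

Without the clamp, a maximum set below the default minimum would leave `c_min > c_max`, which is the ineffective tuning the commit message describes.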
John Baldwin
6af45170c1 Chelsio T4/T5 VF driver.
The cxgbev/cxlv driver supports Virtual Function devices for Chelsio
T4 and T5 adapters.  The VF devices share most of their code with the
existing PF4 driver (cxgbe/cxl) and as such the VF device driver
currently depends on the PF4 driver.

Similar to the cxgbe/cxl drivers, the VF driver includes a t4vf/t5vf
PCI device driver that attaches to the VF device.  It then creates
child cxgbev/cxlv devices representing ports assigned to the VF.
By default, the PF driver assigns a single port to each VF.

t4vf_hw.c contains VF-specific routines from the shared code used to
fetch VF-specific parameters from the firmware.

t4_vf.c contains the VF-specific PCI device driver and includes its
own attach routine.

VF devices are required to use a different firmware request when
transmitting packets (which in turn requires a different CPL message
to encapsulate messages).  This alternate firmware request does not
permit chaining multiple packets in a single message, so each packet
results in a firmware request.  In addition, the different CPL message
requires more detailed information when enabling hardware checksums,
so parse_pkt() on VF devices must examine L2 and L3 headers for all
packets (not just TSO packets) for VF devices.  Finally, L2 checksums
on non-UDP/non-TCP packets do not work reliably (the firmware trashes
the IPv4 fragment field), so IPv4 checksums for such packets are
calculated in software.

Most of the other changes in the non-VF-specific code are to expose
various variables and functions private to the PF driver so that they
can be used by the VF driver.

Note that a limited subset of cxgbetool functions are supported on VF
devices including register dumps, scheduler classes, and clearing of
statistics.  In addition, TOE is not supported on VF devices, only for
the PF interfaces.

Reviewed by:	np
MFC after:	2 months
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7599
2016-09-07 18:13:57 +00:00
John Baldwin
e06ab612d2 Don't break out of the m_advance() loop if len drops to zero.
If a packet contains the Ethernet header (14 bytes) in the first mbuf
and the payload (IP + UDP + data) in the second mbuf, then the attempt
to fetch the l3hdr will return a NULL pointer.  The first loop iteration
will drop len to zero and exit the loop without setting 'p'.  However,
the desired data is at the start of the second mbuf, so the correct
behavior is to loop around and let the conditional set 'p' to m_data of
the next mbuf (and leave offset as 0).

Reviewed by:	np
Sponsored by:	Chelsio Communications
2016-09-07 18:08:43 +00:00
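The boundary case in the m_advance() commit above can be reproduced with a toy buffer chain. This is a sketch of the looping behaviour the message describes, not the driver's code: `toy_buf` and `toy_advance()` are hypothetical, and the NULL return at the end of the chain is an assumption added so the example is safe to run.

```c
#include <assert.h>
#include <stddef.h>

struct toy_buf {
	const unsigned char *data;
	int len;
	struct toy_buf *next;
};

/*
 * Advance len bytes from (*pm, *poffset) and return a pointer to the
 * byte reached.  The loop does not stop just because len has dropped
 * to zero: if the target is the first byte of the next buffer, one
 * more iteration is needed to land on it with offset 0.
 */
const unsigned char *
toy_advance(struct toy_buf **pm, int *poffset, int len)
{
	struct toy_buf *m = *pm;
	int offset = *poffset;
	const unsigned char *p;

	assert(len > 0);
	for (;;) {
		if (offset + len < m->len) {
			offset += len;
			p = m->data + offset;
			break;
		}
		len -= m->len - offset;
		m = m->next;
		offset = 0;
		if (m == NULL)
			return (NULL);	/* ran off the end of the chain */
	}
	*poffset = offset;
	*pm = m;
	return (p);
}
```

With a 14-byte first buffer and the payload in a second buffer, advancing 14 bytes from offset 0 returns the start of the second buffer instead of NULL.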
Andrew Turner
77c02eccb8 When synchronising the instruction and data caches we only need to clean
the data cache to the point of unification. This is the point where the
two caches are unified to a single unified cache so cleaning past here
is just extra unneeded work.

This was noticed when investigating r305545.

Reported by:	bz
Obtained from:	ABT Systems Ltd
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2016-09-07 16:46:54 +00:00
Andrew Turner
3b34364450 Only call cpu_icache_sync_range when inserting an executable page. If the
page is non-executable the contents of the i-cache are unimportant so this
call is just adding unneeded overhead when inserting pages.

While doing research using gem5 with an O3 pipeline and 1k/32k/1M iTLB/L1
iCache/L2 Bjoern Zeeb (bz@) observed a fairly high rate of calls into
arm64_icache_sync_range() from pmap_enter() along with a high number of
instruction fetches and iTLB/iCache hits.

Limiting the calls to arm64_icache_sync_range() to only executable pages,
we observe the iTLB and iCache Hit going down by about 43%. These numbers
are quite misleading when looked at alone as at the same time instructions
retired were reduced by 19.2% and instruction fetches were reduced by 38.8%.
Overall this reduced the runtime of the test program by 22.4%.

On Juno hardware, in steady-state, running the same test, using the cycle
count to determine runtime, we do see a reduction of up to 28.9% in runtime.

While these numbers certainly depend on the program executed, we expect an
overall performance improvement.

Reported by:	bz
Obtained from:	ABT Systems Ltd
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2016-09-07 16:22:05 +00:00
Andriy Voskoboinyk
d204cea9f8 rum: fix possible panic on device detach (similar to r302034).
Tested with WUSB54GC, STA/AP modes.
2016-09-07 16:19:20 +00:00
Ruslan Bukin
9a6eb971a1 o Update QEMU device tree.
QEMU was updated to privileged architecture v1.9
and we now fully support it.

Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-09-07 15:48:44 +00:00
Andriy Gapon
f13826052b work around AMD erratum 793 for family 16h, models 00h-0Fh 2016-09-07 14:24:29 +00:00
Alexander Motin
07c3504304 Fix channel initialization in FBS mode.
Due to reading an uninitialized variable, FIS receive area was always allocated
as 256 bytes, suitable for command-based switching, instead of 4096 bytes,
required for FIS-based switching.  This caused memory corruption in case of
port multipliers used on FBS-capable HBAs (Marvell).

MFC after:	1 week
2016-09-07 13:51:34 +00:00
Andriy Gapon
9626ccde7f amdsbwd: add support for FCH in family 16h models 30h-3Fh processors
Requested by:	Mike Tancsa <mike@sentex.net>
Tested by:	Mike Tancsa <mike@sentex.net>
MFC after:	1 week
2016-09-07 13:45:35 +00:00
Andriy Voskoboinyk
4c90f11b3c rum: use mgmt frame rate for EAPOL frames. 2016-09-07 12:07:02 +00:00
Stanislav Galabov
2b99b9f3d2 Fix MIPS INTRNG (both FDT and non-FDT) behaviour broken by r304459
More changes to MIPS may be required, as commented in D7692, but this
revision aims to restore MIPS INTRNG functionality so we can move on
with working interrupts.

Reported by:	yamori813@yahoo.co.jp
Tested by:	mizhka (on BCM), sgalabov (on Mediatek)
Reviewed by:	adrian, nwhitehorn (older version)
Sponsored by:	Smartcom - Bulgaria AD
Differential Revision:	https://reviews.freebsd.org/D7692
2016-09-07 09:31:10 +00:00
Sepherosa Ziehau
b67f3d2873 hyperv/hn: Nuke unused bits
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7795
2016-09-07 09:20:58 +00:00
Sepherosa Ziehau
b44fb279e8 hyperv/hn: Simplify per-packet-info construction.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7794
2016-09-07 06:02:29 +00:00
Sepherosa Ziehau
61ac564fd0 hyperv/hn: Cleanup RNDIS packet message encapsulation.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7793
2016-09-07 05:41:01 +00:00
Wojciech Macek
4192788cb2 Remove messy machdep code for Alpine V1 and use proper drivers instead
Let drivers for Alpine CCU, NB and Serdes take care of internal SoC configuration.

Obtained from:         Semihalf
Submitted by:          Michal Stanek <mst@semihalf.com>
Sponsored by:          Annapurna Labs
Reviewed by:           imp,wma
Differential Revision: https://reviews.freebsd.org/D7566
2016-09-07 05:36:55 +00:00
Wojciech Macek
9d6cd3d858 Introduce support for Annapurna Alpine CCU and NB devices
This commit adds drivers for Alpine Cache Coherency Unit
and North Bridge Service whose task is to configure
the system fabric and enable cache coherency.

Obtained from:         Semihalf
Submitted by:          Michal Stanek <mst@semihalf.com>
Sponsored by:          Annapurna Labs
Reviewed by:           wma
Differential Revision: https://reviews.freebsd.org/D7565
2016-09-07 05:34:41 +00:00
Sepherosa Ziehau
5761f5dfdd hyperv/hn: Avoid bit fields for TXCSUM setup.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7792
2016-09-07 05:27:43 +00:00
Justin Hibbits
a2fe9079a8 Disable the qoriq errata fix for now
It seems to hang more often than it actually works.  Further debugging is
needed to determine why, but for now the system needs to be able to boot.
2016-09-07 04:13:28 +00:00
Justin Hibbits
bdee3435fa Allow pmap_early_io_unmap() to reclaim memory
pmap_early_io_map()/pmap_early_io_unmap(), if used in pairs, should be used in
the form:

pmap_early_io_map()
..do stuff..
pmap_early_io_unmap()

Without other allocations in the middle.  Without reclaiming memory this can
leave large holes in the device space.
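
A minimal sketch of the reclaim rule, assuming a simple bump allocator over a made-up address window. The names `early_io_map()`, `early_io_unmap()`, `DEV_BASE` and `dev_next` are hypothetical and are not the pmap's actual implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical device-space window; the base address is made up. */
#define DEV_BASE	0xf0000000u
static uint32_t dev_next = DEV_BASE;	/* next free address */

/* Hand out 'size' bytes of device space from the bump pointer. */
uint32_t
early_io_map(uint32_t size)
{
	uint32_t va = dev_next;

	assert(size > 0);
	dev_next += size;
	return (va);
}

/*
 * Release a mapping.  Only the most recent mapping can be reclaimed;
 * releasing anything else leaves a hole.  Returns 1 if reclaimed.
 */
int
early_io_unmap(uint32_t va, uint32_t size)
{
	if (va + size == dev_next) {
		dev_next = va;
		return (1);
	}
	return (0);
}
```

A map/unmap pair with no allocation in between returns the space, so the next mapping reuses the same address.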

While here, make a simple change to the unmap loop which now permits it to unmap
multiple TLB entries in the range.
2016-09-07 03:26:55 +00:00
Jared McNeill
b78c83e321 Add support for Allwinner A83T CPU frequency scaling. 2016-09-07 01:10:16 +00:00
Jared McNeill
b3868b9f16 Attach later so axp81x attaches after aw_nmi. 2016-09-07 01:09:25 +00:00
Mark Johnston
4bfb585351 Don't treat an error from g_mirror_clear_metadata() as fatal.
Such errors can occur as the result of a write error or because the disk
backing the mirror element was removed. They result in a generation ID bump
on all active elements of the mirror, so we can safely disconnect the mirror
component rather than destroy it.

MFC after:	2 weeks
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D7750
2016-09-06 23:42:59 +00:00
Mark Johnston
40c5032d32 Add some fail points to gmirror.
These are useful for testing changes to I/O error handling, and for
reproducing existing bugs in a controlled manner. The fail points are

    g_mirror_regular_request_read
    g_mirror_regular_request_write
    g_mirror_sync_request_read
    g_mirror_sync_request_write
    g_mirror_metadata_write

They all effectively allow one to inject an error value into the bio_error
field of a corresponding BIO request as it is being completed.
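
The idea can be modelled in a few lines of userland C. This is a sketch only: `struct failpoint`, `struct io_req` and `io_complete()` are hypothetical stand-ins, not the kernel's fail point framework or gmirror's code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical fail point: when armed, it supplies an error value. */
struct failpoint {
	const char *name;
	int armed;
	int error;
};

/* Hypothetical stand-in for a BIO request; only the error field matters. */
struct io_req {
	int error;
};

/* Completion hook: overwrite the request's error if the fail point is armed. */
void
io_complete(struct io_req *req, const struct failpoint *fp)
{
	assert(req != NULL && fp != NULL);
	if (fp->armed)
		req->error = fp->error;
}
```

An unarmed fail point leaves the request untouched; arming it with EIO makes the completed request report EIO.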

MFC after:	2 weeks
Sponsored by:	EMC / Isilon Storage Division
2016-09-06 23:35:48 +00:00
Marius Strobl
b2fd098e58 Disable vt(4) by default on sparc64 as creator_vt(4) and vt_ofwfb(4)
have the serious problem of not actually attaching the hardware they
are driving at the bus level. This causes creator(4) and machfb(4)
to attach and drive the very same hardware in parallel when both
syscons(4) and vt(4) as well as their associated hardware drivers
are built into a kernel, i.e. GENERIC, at the same time.
Also, syscons(4) and its drivers still are way superior to vt(4) and
its equivalents; unlike the syscons(4) counterparts the vt(4) drivers
don't provide hardware acceleration resulting in considerably slower
screen drawing, creator_vt(4) doesn't provide a /dev/fb node as
required by the Xorg sunffb(4) etc. In theory, vt_ofwfb(4) should be
able to handle more devices than machfb(4). However, testing shows
that it hardly works with any hardware machfb(4) isn't also able to
drive, making vt(4) and vt_ofwfb(4) not favorable for the time being
from that perspective either.

MFC after:	3 days
2016-09-06 22:18:08 +00:00
Brooks Davis
ed6d876b19 Modernize the initialization of sigproptbl.
Use C99 designators to set the value of each slot and the nitems macro to
check for valid entries. In the process, switch to indexing by signal
number rather than signal-1 for improved clarity.
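
A minimal sketch of the pattern, assuming hypothetical property flags and a hypothetical table name. `sigprop_table`, `sigprop_lookup()` and the `PROP_*` values are illustrative and are not the kernel's definitions.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* FreeBSD's <sys/param.h> provides nitems(); define it if it is absent. */
#ifndef nitems
#define nitems(x)	(sizeof((x)) / sizeof((x)[0]))
#endif

/* Hypothetical property flags, not the kernel's actual values. */
#define PROP_KILL	0x01
#define PROP_STOP	0x02
#define PROP_IGNORE	0x04

/* Indexed directly by signal number; slots not named here default to 0. */
static const int sigprop_table[] = {
	[SIGHUP]  = PROP_KILL,
	[SIGINT]  = PROP_KILL,
	[SIGSTOP] = PROP_STOP,
	[SIGCHLD] = PROP_IGNORE,
};

/* Return the properties of 'sig', or 0 if it has no valid entry. */
int
sigprop_lookup(int sig)
{
	assert(nitems(sigprop_table) > 0);
	if (sig > 0 && (size_t)sig < nitems(sigprop_table))
		return (sigprop_table[sig]);
	return (0);
}
```

Indexing by the signal number itself removes the off-by-one arithmetic, and the nitems() bound rejects numbers past the last initialized slot.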

Obtained from:	CheriBSD (a6053c5abf03a5f53bbfcdd3a26429383f67e09f)
Sponsored by:	DARPA, AFRL
Reviewed by:	kib
2016-09-06 22:03:53 +00:00
Jared McNeill
fa1cbf00d7 Add generic device-tree cpufreq driver. 2016-09-06 21:36:20 +00:00
Mateusz Guzik
2740551545 nullfs: stop special-casing directories in null_vptocnp
The previous code was forcing an expensive walk in vop_stdvptocnp,
which was causing performance issues on highly contended zfs.

No objections:	kib
MFC after:	2 weeks
2016-09-06 21:22:03 +00:00
Jared McNeill
0fbb017195 Add generic device-tree cpufreq driver. 2016-09-06 21:18:14 +00:00
John Baldwin
da0fc9250c Reset PCI pass through devices via PCI-e FLR during VM start and end.
Add routines to trigger a function level reset (FLR) of a PCI-express
device via the PCI-express device control register.  This also includes
support routines to wait for pending transactions to complete as well
as calculating the maximum completion timeout permitted by a device.

Change the ppt(4) driver to reset pass through devices before attaching
to a VM during startup and before detaching from a VM during shutdown.

Reviewed by:	imp, wblock (earlier version)
MFC after:	1 month
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7751
2016-09-06 21:15:35 +00:00
Jared McNeill
b90d9a3752 Add "pci" as a dependency to ichss.
Reviewed by:	jhibbits
2016-09-06 21:01:38 +00:00
Jared McNeill
319336a943 Add generic device-tree cpufreq driver.
This driver supports two bindings:
 - cpufreq-dt: systems which share clock and voltage across all CPUs
 - arm_big_little_dt: systems which share clock and voltage across all
   CPUs in a single cluster

Reviewed by:		andrew, imp
Relnotes:		yes
Differential Revision:	https://reviews.freebsd.org/D7741
2016-09-06 20:43:26 +00:00
John Baldwin
64414cc00f Update the I/O MMU in bhyve when PCI devices are added and removed.
When the I/O MMU is active in bhyve, all PCI devices need valid entries
in the DMAR context tables. The I/O MMU code does a single enumeration
of the available PCI devices during initialization to add all existing
devices to a domain representing the host. The ppt(4) driver then moves
pass through devices in and out of domains for virtual machines as needed.
However, when new PCI devices were added at runtime either via SR-IOV or
HotPlug, the I/O MMU tables were not updated.

This change adds a new set of EVENTHANDLERS that are invoked when PCI
devices are added and deleted. The I/O MMU driver in bhyve installs
handlers for these events which it uses to add and remove devices to
the "host" domain.

Reviewed by:	imp
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7667
2016-09-06 20:17:54 +00:00
Oleksandr Tymoshenko
a4554c3736 Let knlist_add do the locking part
Remove explicit mtx_lock/mtx_unlock around knlist_add and pass 0 as
locked parameter so knlist_add does the locking itself

Suggested by:	kib@
2016-09-06 19:36:28 +00:00
John Baldwin
db4b3cdad8 Remove remnants of PERFMON and I586_PMC_GUPROF from amd64.
These options were never fully ported over from i386.
2016-09-06 19:25:32 +00:00
John Baldwin
5fb03c3780 Leave ppt devices in the host domain when they are not attached to a VM.
This allows a pass through device to be reset to a normal device driver
on the host and reused on the host.  ppt devices are now always active in
some I/O MMU domain when the I/O MMU is active, either the host domain
or the domain of a VM they are attached to.

Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7666
2016-09-06 18:53:17 +00:00
Will Andrews
d945328992 loader.efi: Bump the staging size to 64M.
This is required on my system, which loads nvidia, vmm, and zfs, and 48M is
no longer enough for that.  nvidia-driver's recent update increased its size
by several megabytes.

Reviewed by:	jhb
MFC after:	1 week
2016-09-06 17:58:58 +00:00
Mateusz Guzik
5b7d9ae2fd cv: do a lockless check for no waiters in cv_signal and cv_broadcastpri
For some consumers, such as zfs, there are no waiters the vast majority of
the time.
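
The fast path can be sketched as follows. This is a single-threaded model under stated assumptions: `struct cv_model` and `cv_model_signal()` are hypothetical, the lock is represented only by a counter, and C11 atomics stand in for whatever the kernel uses for the unlocked read.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical condition-variable model; the lock is only counted. */
struct cv_model {
	atomic_int waiters;	/* threads currently blocked */
	int lock_acquisitions;	/* how often the slow path took the lock */
};

/*
 * Wake one waiter.  Returns 1 if a waiter was woken, 0 otherwise.
 * The unlocked read lets the common no-waiter case skip the lock.
 */
int
cv_model_signal(struct cv_model *cv)
{
	int woke = 0;

	assert(cv != NULL);
	if (atomic_load(&cv->waiters) == 0)
		return (0);		/* fast path: nobody to wake */

	cv->lock_acquisitions++;	/* stands in for taking the lock */
	if (atomic_load(&cv->waiters) > 0) {	/* recheck under the lock */
		atomic_fetch_sub(&cv->waiters, 1);
		woke = 1;
	}
	return (woke);
}
```

When the waiter count is zero the function returns without touching the lock, which is the case the commit optimizes.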

Reviewed by:	jhb
MFC after:	1 week
2016-09-06 17:16:59 +00:00
Warner Losh
949b56ba66 Renumber the advertising clause. 2016-09-06 15:17:35 +00:00
Wojciech Macek
aab9fdaf6f Import missing enum declaration in pci_host_generic header file
Other files including pci_host_generic.h failed to compile
due to missing declaration of enum pci_id_type.

Obtained from:         Semihalf
Submitted by:          Michal Stanek <mst@semihalf.com>
Sponsored by:          Annapurna Labs
Reviewed by:           wma
Differential Revision: https://reviews.freebsd.org/D7561
2016-09-06 15:11:37 +00:00
Wojciech Macek
d8f1c69cc2 Remove check for 64-bit FDT ranges in pci-host-generic
This allows 32-bit platforms to use pci-host-generic.

Obtained from:         Semihalf
Submitted by:          Michal Stanek <mst@semihalf.com>
Sponsored by:          Annapurna Labs
Reviewed by:           wma
Differential Revision: https://reviews.freebsd.org/D7560
2016-09-06 15:06:08 +00:00
Wojciech Macek
3fc36ee018 Update Annapurna Alpine HAL to a newer version.
HAL version: 2.7a

Import from vendor-sys, r305475
2016-09-06 14:59:13 +00:00
Ed Maste
68c6ae00ad bhnd: remove redundant ;s at the end of functions or switch statements 2016-09-06 13:34:10 +00:00
Andriy Voskoboinyk
93ae47478c rum: use m_get2() in Rx path. 2016-09-06 12:00:16 +00:00
Andriy Voskoboinyk
b7c8904780 rtwn: fix firmware readiness check in rtwn_load_firmware(). 2016-09-06 11:08:32 +00:00
Andriy Voskoboinyk
9afea60f98 iwm: fix scanning for hidden SSIDs.
Setup SSIDs in scan command so firmware will send direct probe request(s)
while scanning.

Tested by:	dbkirk@gmail.com

PR:		211519
MFC after:	1 week
2016-09-06 10:08:32 +00:00
Andriy Voskoboinyk
c84bb70268 rum: fix frame length checks in Rx path.
Split usbd_xfer_status() check:
- Check xfer length: must be longer than Rx descriptor size.
- Check frame size: must be shorter than xfer length.
- Discard too short frames.
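
The three checks above can be sketched as one validation function. The sizes `RX_DESC_SIZE` and `MIN_FRAME_LEN`, the `struct rx_xfer` type and `rx_frame_ok()` are assumptions made for illustration; the driver takes its real values from its own headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes chosen only for this sketch. */
#define RX_DESC_SIZE	24
#define MIN_FRAME_LEN	10

/* Hypothetical summary of one completed Rx transfer. */
struct rx_xfer {
	size_t xfer_len;	/* bytes delivered by the USB transfer */
	size_t frame_len;	/* frame length claimed by the Rx descriptor */
};

/* Return 1 if the frame may be passed up the stack, 0 to discard it. */
int
rx_frame_ok(const struct rx_xfer *x)
{
	assert(x != NULL);
	if (x->xfer_len <= RX_DESC_SIZE)
		return (0);	/* transfer must be longer than the descriptor */
	if (x->frame_len >= x->xfer_len)
		return (0);	/* frame must be shorter than the transfer */
	if (x->frame_len < MIN_FRAME_LEN)
		return (0);	/* too short to be a valid frame */
	return (1);
}
```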

Tested with WUSB54GC, STA/MONITOR modes.
2016-09-06 06:40:59 +00:00
Andriy Gapon
1a82707cd7 fix zfs pool creation accidentally broken by r305331
The upstream change introduced a new load state, SPA_LOAD_CREATE,
and vdev_geom code needs to be aware of it.

Tested by:	cy
MFC after:	1 week
X-MFC with:	r305331
2016-09-06 06:09:12 +00:00
Sepherosa Ziehau
50002d3dfa hyperv/hn: Avoid bit fields for LSOv2 setup.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7786
2016-09-06 04:37:53 +00:00
Sepherosa Ziehau
14ee29ba93 hyperv/hn: Fix VLAN tag setup for outgoing VLAN packets.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7785
2016-09-06 03:31:31 +00:00
Sepherosa Ziehau
b349357819 hyperv/hn: Stringent RNDIS packet message length/offset check.
While I'm here, use definition in net/rndis.h

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7782
2016-09-06 03:20:06 +00:00
Navdeep Parhar
48214203c4 Fix send/recv limit mixup. 2016-09-05 23:12:24 +00:00
Landon J. Fuller
824b48eff3 bhnd(4): Implement backplane interrupt handling.
This adds bhnd(4) bus-level support for querying backplane interrupt vector
routing, and delegating machine/bridge-specific interrupt handling to the
concrete bhnd(4) driver implementation.

On bhndb(4) bridged PCI devices, we provide the PCI/MSI interrupt directly
to attached cores.

On MIPS devices, we report a backplane interrupt count of 0, effectively
disabling the bus-level interrupt assignment. This allows mips/broadcom
to temporarily continue using hard-coded MIPS IRQs until bhnd_mips PIC
support is implemented.

Reviewed by:	mizhka
Approved by:	adrian (mentor, implicit)
2016-09-05 22:11:46 +00:00
Landon J. Fuller
7d6162806b bwn(4): ignore BCM4321's unpopulated USB11 host controller core.
Broadcom Intensi-fi chipsets provided a common set of IP cores; on PCI/PCIe
devices, the USB11 host controller is left floating.

Approved by:	adrian (mentor, implicit)
2016-09-05 21:55:27 +00:00
Landon J. Fuller
fb88110c09 bhnd(4): Add device classes for USB host/dev/dual-mode controller cores.
Approved by:	adrian (mentor, implicit)
2016-09-05 21:48:16 +00:00
Andriy Voskoboinyk
c57ee45ba9 rum: do not restart device when protmode / rtsthreshold is changed. 2016-09-05 19:42:35 +00:00
Navdeep Parhar
5aaa3bc3b9 cxgbe/t4_tom: toepcb should be all-zero on allocation because the code
that cleans up on failure assumes that non-NULL values indicate
initialized items.

Sponsored by:	Chelsio Communications
2016-09-05 19:37:47 +00:00
Luiz Otavio O Souza
dc3155c1be Revert r305119, move the control module register data to am335x_scm.h and
fix if_cpsw.c to include the correct header.

Discussed with:	bz
2016-09-05 18:42:21 +00:00
Michael Zhilin
29d492ace1 [BHND/USB] Port of EHCI/OHCI support from ZRouter
This patch adds driver implementation for BHND USB core. Driver has been
imported from the ZRouter project with small adaptations for FreeBSD 11.

It is also enabled for Broadcom MIPS74k boards by default. It's fully tested
on Asus boards (RT-N16: external USB, RT-N53: USB bus between SoC and WiFi
chips).

Reviewed by:    adrian (mentor), ray
Approved by:	adrian (mentor)
Obtained from:	ZRouter
Differential Revision:  https://reviews.freebsd.org/D7781
2016-09-05 16:06:52 +00:00
Mark Johnston
1720ac9d5d Remove an unreachable return statement from ARM's minidumpsys().
Submitted by:	Dominik Ermel <der@semihalf.com>
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D7787
2016-09-05 16:04:40 +00:00
Hans Petter Selasky
64cb5e2a26 Resolve deadlock between device_detach() and usbd_do_request_flags()
by reviving the SX control request lock and refining which lock
protects the common scratch area in "struct usb_device".

The SX control request lock was removed by r246759 because it caused a
lock order reversal with the USB enumeration lock inside
usbd_transfer_setup() as a function of r246616. It was thought that
reducing the number of locks would resolve the LOR, but because some
USB device drivers use usbd_do_request_flags() inside callback
functions, like in taskqueues, a deadlock may occur when these are
drained from device_detach(). By restoring the SX control request
lock usbd_do_request_flags() is allowed to complete its execution
when a USB device driver is detaching. By using the SX control request
lock to protect the scratch area, the LOR introduced by r246616 is
also resolved.

Bump the FreeBSD version while at it to force recompilation of all USB
kernel modules.

Found by:	avos@
MFC after:	1 week
2016-09-05 15:35:58 +00:00
Jared McNeill
22a07618ae Add sy8106a to Allwinner kernel. This regulator is used to control VDD_CPUX
and is connected to R_TWI on some H3-based Orange Pi boards.
2016-09-05 13:45:45 +00:00
Jared McNeill
8dd48c60a0 Add driver for Silergy Corp. SY8106A buck regulator. 2016-09-05 13:39:54 +00:00
Jared McNeill
a995bf1b7d Add support for Allwinner H3 PLL_CPUX.
The H3 PLL_CPUX register looks exactly like the one found in A23, but we
need to follow a specific protocol when making adjustments to the clock.
2016-09-05 12:36:54 +00:00
Jared McNeill
4e7f43bab6 Add support for the Allwinner H3 Thermal Sensor Controller. The H3 embeds
a single thermal sensor located in the CPU.
2016-09-05 11:05:14 +00:00
Sepherosa Ziehau
0bbb7d483b hyperv/hn: Stringent RNDIS control message length check.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7758
2016-09-05 05:07:40 +00:00
Sepherosa Ziehau
a8197ee35e net/rndis: Define RNDIS status message, which could be sent by device.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7757
2016-09-05 04:56:56 +00:00
Sepherosa Ziehau
7a466137f0 hyperv/hn: Stringent NVS RNDIS packets length checks.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7755
2016-09-05 04:47:31 +00:00
Sepherosa Ziehau
dc65be7a3d hyperv/hn: Stringent NVS notification length check.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7753
2016-09-05 03:39:04 +00:00
Sepherosa Ziehau
19c8ea1086 hyperv/vmbus: Stringent header length and total length check.
While I'm here, minor style changes.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7752
2016-09-05 03:21:31 +00:00
Jared McNeill
96fe97154e A64 thermal sensor IRQ is GIC_SPI 31, not 41. 2016-09-04 22:30:46 +00:00
Alan Cox
0cbdb05a46 Replace the number 4 in pmap_ts_referenced() by PMAP_TS_REFERENCED_MAX,
like we've done elsewhere, e.g., amd64.

As an optimization to the machine-independent layer, change the machine-
dependent pmap_ts_referenced() so that it updates the page's dirty field
if a modified bit is found while counting reference bits.  This
opportunistic update can be performed at low cost and can eliminate the
need for some future calls to pmap_is_modified() by the machine-
independent layer.

MFC after:	3 weeks
2016-09-04 22:08:04 +00:00
Dimitry Andric
4e8a91fb6c Make some additional -Wconstant-conversion warnings from clang 3.9.0 in
bwn(4) non-fatal for now.
2016-09-04 17:56:55 +00:00
Dimitry Andric
1c7c2b26e8 For kernel builds, instead of suppressing certain clang warnings, make
them non-fatal, so there is some incentive to fix them eventually.
2016-09-04 17:55:22 +00:00
Andrew Turner
705cb30cae Enable superpages on arm64 by default. These seem to be stable, having
survived multiple world and kernel builds, and poudriere builds of full
package sets.

I have observed a 3% reduction in buildworld times with superpages enabled,
however further testing is needed to see if this is observed in other
workloads.

Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-09-04 17:50:23 +00:00
Dimitry Andric
6c01c0e0c6 With clang 3.9.0, compiling sys/netinet/igmp.c results in the following
warning:

sys/netinet/igmp.c:546:21: error: implicit conversion from 'int' to 'char' changes value from 148 to -108 [-Werror,-Wconstant-conversion]
        p->ipopt_list[0] = IPOPT_RA;    /* Router Alert Option */
                         ~ ^~~~~~~~
sys/netinet/ip.h:153:19: note: expanded from macro 'IPOPT_RA'
#define IPOPT_RA                148             /* router alert */
                                ^~~

This is because ipopt_list is an array of char, so IPOPT_RA is wrapped
to a negative value.  It would be nice to change ipopt_list to an array
of u_char, but it changes the signature of the public struct ipoption,
so add an explicit cast to suppress the warning.
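
A small userland sketch of the wrap and the cast. `OPT_RA`, `store_option()` and `load_option()` are hypothetical names; `signed char` is used so the behaviour does not depend on whether plain char is signed on the target. The negative result assumes the usual two's-complement wrap that clang and gcc implement.

```c
#include <assert.h>
#include <stddef.h>

#define OPT_RA	148	/* same value as IPOPT_RA in the warning above */

/*
 * Store the option byte into a signed char slot.  The explicit cast
 * states that the wrap to a negative value is intended, which is what
 * silences -Wconstant-conversion.
 */
void
store_option(signed char *slot)
{
	assert(slot != NULL);
	*slot = (signed char)OPT_RA;
}

/* Read the byte back as the unsigned value that goes on the wire. */
unsigned int
load_option(const signed char *slot)
{
	assert(slot != NULL);
	return ((unsigned char)*slot);
}
```

The stored value compares as -108, but the bit pattern is unchanged, so reading it back through unsigned char gives 148 again.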

Reviewed by:	imp
MFC after:	3 days
Differential Revision: https://reviews.freebsd.org/D7777
2016-09-04 17:23:10 +00:00
Dimitry Andric
02d4a225db With clang 3.9.0, compiling uplcom results in the following warnings:
sys/dev/usb/serial/uplcom.c:543:29: error: implicit conversion from 'int' to 'int8_t' (aka 'signed char') changes value from 192 to -64 [-Werror,-Wconstant-conversion]
        if (uplcom_pl2303_do(udev, UT_READ_VENDOR_DEVICE, UPLCOM_SET_REQUEST, 0x8484, 0, 1)
            ~~~~~~~~~~~~~~~~       ^~~~~~~~~~~~~~~~~~~~~
sys/dev/usb/usb.h:179:53: note: expanded from macro 'UT_READ_VENDOR_DEVICE'
#define UT_READ_VENDOR_DEVICE   (UT_READ  | UT_VENDOR | UT_DEVICE)
                                 ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~

This is because UT_READ is 0x80, so the int8_t argument is wrapped to a
negative value.  Fix this by using uint8_t instead.

Reviewed by:	imp, hselasky
MFC after:	3 days
Differential Revision: https://reviews.freebsd.org/D7776
2016-09-04 16:59:35 +00:00
Mateusz Guzik
591df14528 cache: defer freeing entries until after the global lock is dropped
This also defers vdrop for held vnodes.

Glanced at by:	kib
2016-09-04 16:52:14 +00:00
Mateusz Guzik
1c1c35c74e fd: fix up fdeget_file
It was supposed to return NULL if a fp is not installed.

Facepalm-by: mjg
2016-09-04 13:31:57 +00:00
Mateusz Guzik
31977b420a cache: manage negative entry list with a dedicated lock
Since negative entries are managed with a LRU list, a hit requires a
modification.

Currently the code tries to upgrade the global lock if needed and is
forced to retry the lookup if it fails.

Provide a dedicated lock for use when the cache is only shared-locked.

Reviewed by:	kib
MFC after:	1 week
2016-09-04 08:58:35 +00:00
Mateusz Guzik
b9042ae1bf cache: put all negative entry management code into dedicated functions
Reviewed by:	kib
MFC after:	1 week
2016-09-04 08:55:15 +00:00
Landon J. Fuller
eb83f2e1ea bhndb(4): Fix probing of bhndb-attached bhnd_nvram devices.
This fixes bhnd(4) nvram handling on devices that map SPROM CSRs via PCI
configuration space.

The probe method previously required that a bhnd(4) device be attached to the
parent bridge; now that the bhnd_nvram device is always attached first, this
unnecessary sanity check always failed.

Approved by:	adrian (mentor, implicit)
2016-09-04 01:47:21 +00:00
Landon J. Fuller
63fb0e8236 bhndb(4): Skip disabled cores when performing bridge configuration probing.
On BCM4321 chipsets, both PCI and PCIe cores are included, with one of
the cores potentially left floating.

Since the PCI core appears first in the device table, and the PCI
profiles appear first in the resource configuration tables, this resulted in
incorrectly matching and using the PCI/v1 resource configuration on PCIe
devices, rather than the correct PCIe/v1 profile.

Approved by:	adrian (mentor, implicit)
2016-09-04 01:43:54 +00:00
Landon J. Fuller
f32befd1a2 siba(4): Add missing bhnd_device/bhnd_device_quirk table terminator entries.
This resulted in an over-read on siba chipsets that failed to match the
existing entries.
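
A minimal sketch of a sentinel-terminated table and the lookup that depends on it. `struct dev_entry`, `DEV_ENTRY_END`, `dev_table` and `dev_lookup()` are hypothetical and only mirror the shape of such tables, not the bhnd definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device table entry; a NULL desc marks the end. */
struct dev_entry {
	uint16_t id;
	const char *desc;
};

#define DEV_ENTRY_END	{ 0, NULL }

const struct dev_entry dev_table[] = {
	{ 0x0800, "chip common" },
	{ 0x0812, "802.11 MAC" },
	DEV_ENTRY_END	/* without this, a failed lookup reads past the array */
};

/* Walk the table up to the terminator; return NULL if 'id' is unknown. */
const struct dev_entry *
dev_lookup(const struct dev_entry *table, uint16_t id)
{
	assert(table != NULL);
	for (; table->desc != NULL; table++) {
		if (table->id == id)
			return (table);
	}
	return (NULL);
}
```

A lookup that matches stops early either way; only a lookup that fails to match walks to the end, which is why the missing terminator went unnoticed on chipsets with matching entries.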

Approved by:	adrian (mentor, implicit)
2016-09-04 01:25:46 +00:00
Landon J. Fuller
111d7cb2e3 Migrate bhndb(4) to the new bhnd_erom API.
Adds support for probing and initializing bhndb(4) bridge state using
the bhnd_erom API, ensuring that full bridge configuration is available
*prior* to actually attaching and enumerating the bhnd(4) child device,
allowing us to safely allocate bus-level agent/device resources during
bhnd(4) bus enumeration.

- Add a bhnd_erom_probe() method usable by bhndb(4). This is an analogue
  to the existing bhnd_erom_probe_static() method, and allows the bhndb
  bridge to discover the best available erom parser class prior to newbus
  probing of its children.
- Add support for supplying identification hints when probing erom
  devices. This is required on early EXTIF-only chipsets, where chip
  identification registers are not available.
- Migrate bhndb over to the new bhnd_erom API, using bhnd_core_info
  records rather than bridged bhnd(4) device_t references to determine
  the bridged chipsets' capability/bridge configuration.
- The bhndb parent (e.g. if_bwn) is now required to supply a hardware
  priority table to the bridge. The default table is currently sufficient
  for our supported devices.
- Drop the two-pass attach approach we used for compatibility with bhndb(4) in
  the bhnd(4) bus drivers, and instead perform bus enumeration immediately,
  and allocate bridged per-child bus-level resources during that enumeration.

Approved by:	adrian (mentor)
Differential Revision:	https://reviews.freebsd.org/D7768
2016-09-04 00:58:19 +00:00
Mark Johnston
3da0f3c9ae Micro-optimize sleepq_signal().
Lift a comparison out of the loop that finds the highest-priority thread
on the queue.

MFC after:	1 week
2016-09-04 00:29:48 +00:00
Mark Johnston
dd9cb6da0b Respect the caller's hints when performing swap readahead.
The pager getpages interface allows the caller to bound the number of
readahead and readbehind pages, and vm_fault_hold() makes use of this
feature. These bounds were ignored after r305056, causing the swap pager
to potentially page in more than the specified number of pages.

Reported and reviewed by:	alc
X-MFC with:	r305056
2016-09-04 00:25:49 +00:00
Landon J. Fuller
664a749708 Implement a generic bhnd(4) device enumeration table API.
This defines a new bhnd_erom_if API, providing a common interface to device
enumeration on siba(4) and bcma(4) devices, for use both in the bhndb bridge
and SoC early boot contexts, and migrates mips/broadcom over to the new API.

This also replaces the previous adhoc device enumeration support implemented
for mips/broadcom.

Migration of bhndb to the new API will be implemented in a follow-up commit.


- Defined new bhnd_erom_if interface for bhnd(4) device enumeration, along
  with bcma(4) and siba(4)-specific implementations.
- Fixed a minor bug in bhndb that logged an error when we attempted to map the
  full siba(4) bus space (18000000-17FFFFFF) in the siba EROM parser.
- Reverted use of the resource's start address as the ChipCommon enum_addr in
  bhnd_read_chipid(). When called from bhndb, this address is found within the
  host address space, resulting in an invalid bridged enum_addr.
- Added support for falling back on standard bus_activate_resource() in
  bhnd_bus_generic_activate_resource(), enabling allocation of the bhnd_erom's
  bhnd_resource directly from a nexus-attached bhnd(4) device.
- Removed BHND_BUS_GET_CORE_TABLE(); it has been replaced by the erom API.
- Added support for statically initializing bhnd_erom instances, for use prior
  to malloc availability. The statically allocated buffer size is verified both
  at runtime, and via a compile-time assertion (see BHND_EROM_STATIC_BYTES).
- bhnd_erom classes are registered within a module via a linker set, allowing
  mips/broadcom to probe available EROM parser instances without creating a
  strong reference to bcma/siba-specific symbols.
- Migrated mips/broadcom to bhnd_erom_if, replacing the previous MIPS-specific
  device enumeration implementation.

Approved by:	adrian (mentor)
Differential Revision:	https://reviews.freebsd.org/D7748
2016-09-03 23:57:17 +00:00
Mark Johnston
dbbaf04f1e Remove support for idle page zeroing.
Idle page zeroing has been disabled by default on all architectures since
r170816 and has some bugs that make it seemingly unusable. Specifically,
the idle-priority pagezero thread exacerbates contention for the free page
lock, and yields the CPU without releasing it in non-preemptive kernels. The
pagezero thread also does not behave correctly when superpage reservations
are enabled: its target is a function of v_free_count, which includes
reserved-but-free pages, but it is only able to zero pages belonging to the
physical memory allocator.

Reviewed by:	alc, imp, kib
Differential Revision:	https://reviews.freebsd.org/D7714
2016-09-03 20:38:13 +00:00
Dimitry Andric
2db7b9f259 With clang 3.9.0, compiling cxgb results in the following warning:
sys/dev/cxgb/cxgb_sge.c:2873:44: error: implicit conversion from 'int'
to 'char' changes value from 128 to -128 [-Werror,-Wconstant-conversion]
                        *mtod(m, char *) = CPL_ASYNC_NOTIF;
                                         ~ ^~~~~~~~~~~~~~~

This is because CPL_ASYNC_NOTIF is 0x80, so the plain char argument is
wrapped to a negative value.  Fix this by using uint8_t instead.

Reviewed by:	np
MFC after:	3 days
Differential Revision: https://reviews.freebsd.org/D7772
2016-09-03 19:01:11 +00:00
Navdeep Parhar
5a63431265 Use correct CTR<n> variant. 2016-09-03 18:54:26 +00:00
Andrew Turner
6f0c70d446 Explicitly include all .rodata.* sections in the kernel .rodata. This
helps link the kernel with lld as it will then put all these into a single
.rodata section.

MFC after:	1 week
Sponsored by:	ABT Systems Ltd
2016-09-03 17:23:24 +00:00
Jared McNeill
1403e695b7 Use the root key in the Security ID EFUSE (when valid) to generate a
MAC address instead of creating a random one each boot.
2016-09-03 15:28:09 +00:00
Warner Losh
155d3e43ff Don't use -N to set the OMAGIC with data and text writeable and data
not page aligned. To do this, use the ld script gnu ld installs on my
system.

This is imperfect: LDFLAGS_BIN and LD_FLAGS_BIN describe different
things. The loader script could be better named and take into account
other architectures. And having two different mechanisms to do
basically the same thing needs study. However, it's blocking forward
progress on lld, so I'll work in parallel to sort these out.

Differential Revision: https://reviews.freebsd.org/D7409
Reviewed by: emaste
2016-09-03 15:26:28 +00:00
Jared McNeill
d69d5ab04f Add support for Allwinner A64 thermal sensors. 2016-09-03 15:26:00 +00:00
Jared McNeill
1738b325d0 Add cpu-supply xref to cpu@0 2016-09-03 15:24:30 +00:00
Jared McNeill
b18b1b0015 Add SID, THS, and CPU operating points. 2016-09-03 15:23:59 +00:00
Jared McNeill
0503b90dde Add support for reading root key on A83T/A64. 2016-09-03 15:22:50 +00:00
Dimitry Andric
3128fa9a5a With clang 3.9.0, compiling ppbus(4) results in the following warnings:
sys/dev/ppbus/ppb_1284.c:296:46: error: implicit conversion from 'int'
to 'char' changes value from 144 to -112 [-Werror,-Wconstant-conversion]
        if ((error = do_peripheral_wait(bus, SELECT | nBUSY, 0))) {
                     ~~~~~~~~~~~~~~~~~~      ~~~~~~~^~~~~~~
sys/dev/ppbus/ppb_1284.c:785:48: error: implicit conversion from 'int'
to 'char' changes value from 240 to -16 [-Werror,-Wconstant-conversion]
                if (do_1284_wait(bus, nACK | SELECT | PERROR | nBUSY,
                    ~~~~~~~~~~~~      ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~
sys/dev/ppbus/ppb_1284.c:786:29: error: implicit conversion from 'int'
to 'char' changes value from 240 to -16 [-Werror,-Wconstant-conversion]
                                        nACK | SELECT | PERROR | nBUSY)) {
                                        ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~

This is because nBUSY is 0x80, so the plain char argument is wrapped to
a negative value.  Fix this in a minimal fashion, by using uint8_t in a
few places.

Reviewed by:	emaste
MFC after:	3 days
Differential Revision: https://reviews.freebsd.org/D7771
2016-09-03 13:48:44 +00:00
Dimitry Andric
402e32a8af Define drmP.h's __OS_HAS_AGP and __OS_HAS_MTRR macros in a defined and
portable way.
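
One common reading of "defined and portable" here is that a macro whose expansion contains the defined operator has undefined behaviour when used in #if. The sketch below assumes that this was the issue; the commit message does not say so, and the EXAMPLE_* names are hypothetical and are not the drmP.h macros.

```c
/* Hypothetical configuration knob; not taken from drmP.h. */
#define EXAMPLE_CONFIG_AGP 1

/*
 * Non-portable form, shown only as a comment: "defined" produced by
 * macro expansion inside #if is undefined behaviour.
 *
 *   #define EXAMPLE_HAS_AGP (defined(EXAMPLE_CONFIG_AGP))
 */

/* Portable form: evaluate "defined" directly in #if and record 0 or 1. */
#if defined(EXAMPLE_CONFIG_AGP)
#define EXAMPLE_HAS_AGP 1
#else
#define EXAMPLE_HAS_AGP 0
#endif

/* EXAMPLE_CONFIG_MTRR is deliberately left undefined in this sketch. */
#if defined(EXAMPLE_CONFIG_MTRR)
#define EXAMPLE_HAS_MTRR 1
#else
#define EXAMPLE_HAS_MTRR 0
#endif
```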

Reviewed by:	dumbbell
MFC after:	3 days
Differential Revision: https://reviews.freebsd.org/D7770
2016-09-03 13:33:28 +00:00
Ed Maste
8aa5c6cfeb remove CONSTRUCTORS from MIPS uboot linker script
The linker script CONSTRUCTORS keyword is only meaningful "when linking
object file formats which do not support arbitrary sections, such as
ECOFF and XCOFF"[1] and is ignored for other object file formats.

LLVM's lld does not yet accept (and ignore) CONSTRUCTORS, so just remove
CONSTRUCTORS from the linker script as it has no effect.

[1] https://sourceware.org/binutils/docs/ld/Output-Section-Keywords.html
2016-09-03 13:01:37 +00:00
Alexander Motin
9b9258a12a Missed FreeBSD-specific piece of r305338. 2016-09-03 11:17:33 +00:00
Alexander Motin
d7e781bda3 MFC r305337: 7004 dmu_tx_hold_zap() does dnode_hold() 7x on same object
Using a benchmark which has 32 threads creating 2 million files in the
same directory, on a machine with 16 CPU cores, I observed poor
performance. I noticed that dmu_tx_hold_zap() was using about 30% of
all CPU, and doing dnode_hold() 7 times on the same object (the ZAP
object that is being held).

dmu_tx_hold_zap() keeps a hold on the dnode_t the entire time it is
running, in dmu_tx_hold_t:txh_dnode, so it would be nice to use the
dnode_t that we already have in hand, rather than repeatedly calling
dnode_hold(). To do this, we need to pass the dnode_t down through
all the intermediate calls that dmu_tx_hold_zap() makes, making these
routines take the dnode_t* rather than an objset_t* and a uint64_t
object number. In particular, the following routines will need to have
analogous *_by_dnode() variants created:

dmu_buf_hold_noread()
dmu_buf_hold()
zap_lookup()
zap_lookup_norm()
zap_count_write()
zap_lockdir()
zap_count_write()

This can improve performance on the benchmark described above by 100%,
from 30,000 file creations per second to 60,000. (This improvement is on
top of that provided by working around the object allocation issue. Peak
performance of ~90,000 creations per second was observed with 8 CPUs;
adding CPUs past that decreased performance due to lock contention.) The
CPU used by dmu_tx_hold_zap() was reduced by 88%, from 340 CPU-seconds
to 40 CPU-seconds.
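
The *_by_dnode() pattern described above can be sketched roughly as follows. Every type and function here (ex_dnode, ex_objset, ex_dnode_hold, ex_count, ex_count_by_dnode) is a hypothetical, heavily simplified stand-in and not the ZFS API; the point is only that the by-dnode variant reuses a hold the caller already owns instead of repeating the lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for dnode_t and objset_t. */
struct ex_dnode { uint64_t object; int holds; };
struct ex_objset { struct ex_dnode *nodes; size_t count; unsigned lookups; };

/* Look up an object by number and take a hold; this is the costly step. */
static struct ex_dnode *
ex_dnode_hold(struct ex_objset *os, uint64_t object)
{
	os->lookups++;
	for (size_t i = 0; i < os->count; i++) {
		if (os->nodes[i].object == object) {
			os->nodes[i].holds++;
			return (&os->nodes[i]);
		}
	}
	return (NULL);
}

static void
ex_dnode_rele(struct ex_dnode *dn)
{
	assert(dn->holds > 0);
	dn->holds--;
}

/* by_dnode variant: works on a dnode the caller already holds. */
static uint64_t
ex_count_by_dnode(struct ex_dnode *dn)
{
	return (dn->object * 2);	/* placeholder for the real work */
}

/* Original-style variant: hold, delegate, release on every call. */
static uint64_t
ex_count(struct ex_objset *os, uint64_t object)
{
	struct ex_dnode *dn = ex_dnode_hold(os, object);
	uint64_t r;

	if (dn == NULL)
		return (0);
	r = ex_count_by_dnode(dn);
	ex_dnode_rele(dn);
	return (r);
}
```

A caller that already holds the dnode calls ex_count_by_dnode() directly, so the lookup counter stops growing with each intermediate call.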

Sponsored by: Intel Corp.

Closes #109

Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Ned Bass <bass6@llnl.gov>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Author: Matthew Ahrens <mahrens@delphix.com>

openzfs/openzfs@d3e523d489
2016-09-03 11:00:29 +00:00
Alexander Motin
4ad4b70e77 MFV r305336: 7247 zfs receive of deduplicated stream fails
This resolves two 'zfs recv' issues. First, when receiving into an
existing filesystem, a snapshot created during the receive process is
not added to the guid->dataset map for the stream, resulting in failed
lookups for deduped streams when a WRITE_BYREF record refers to a
snapshot received earlier in the stream. Second, the newly created
snapshot was also not set properly, referencing the snapshot before the
new receiving dataset rather than the existing filesystem.

Closes #159

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Author: Chris Williamson <chris.williamson@delphix.com>

openzfs/openzfs@b09697c8c1
2016-09-03 10:59:05 +00:00
Alexander Motin
070da3f779 MFV r305335: 7003 zap_lockdir() should tag hold
zap_lockdir() / zap_unlockdir() should take a "void *tag" argument which
tags the hold on the zap. This will help diagnose programming errors
which misuse the hold on the ZAP.
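
As an illustration of why tagging a hold helps diagnosis, here is a minimal sketch. The ex_zap structure and the ex_lockdir()/ex_unlockdir() functions are hypothetical and do not reflect the real zap_lockdir() signature; they only show how a release with the wrong tag becomes detectable.

```c
#include <assert.h>
#include <stddef.h>

#define EX_MAX_HOLDS 8

/* Hypothetical object that records which tag owns each hold. */
struct ex_zap { const void *tags[EX_MAX_HOLDS]; int nholds; };

/* Take a hold and remember the caller's tag; returns -1 if the table is full. */
static int
ex_lockdir(struct ex_zap *z, const void *tag)
{
	assert(z != NULL && tag != NULL);
	if (z->nholds == EX_MAX_HOLDS)
		return (-1);
	z->tags[z->nholds++] = tag;
	return (0);
}

/* Drop the hold taken with this tag; returns -1 if no such hold exists. */
static int
ex_unlockdir(struct ex_zap *z, const void *tag)
{
	assert(z != NULL && tag != NULL);
	for (int i = 0; i < z->nholds; i++) {
		if (z->tags[i] == tag) {
			/* Fill the gap with the last entry. */
			z->tags[i] = z->tags[--z->nholds];
			return (0);
		}
	}
	return (-1);	/* mismatched release: a programming error */
}
```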

Sponsored by: Intel Corp.

Closes #108

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Author: Matthew Ahrens <mahrens@delphix.com>

openzfs/openzfs@0780b3eab5
2016-09-03 10:58:14 +00:00
Alexander Motin
d3ec2cdb4a MFV r304157:
7230 add assertions to dmu_send_impl() to verify that stream includes BEGIN and END records

illumos/illumos-gate@12b90ee2d3
https://github.com/illumos/illumos-gate/commit/12b90ee2d3b10689fc45f4930d2392f5fe1d9cfa

https://www.illumos.org/issues/7230
  A test failure occurred where a send stream had only a BEGIN record. This
  should not be possible if the send returns without error. Prevented this from
  happening in the future by adding an assertion to dmu_send_impl() to verify
  that if the function returns 0 (success) both a BEGIN and END record are
  present. Did this by adding flags to dmu_sendarg_t (indicating whether BEGIN or
  END records sent), having dump_record() set flags appropriately, adding VERIFY
  statement to dmu_send_impl().
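
The flag-and-assert approach described above could look roughly like this. It is a sketch under assumptions: ex_sendarg, ex_dump_record() and ex_send() are hypothetical simplifications, not dmu_sendarg_t, dump_record() or dmu_send_impl(), and a plain assert() stands in for VERIFY.

```c
#include <assert.h>
#include <stdbool.h>

enum ex_rec_type { EX_REC_BEGIN, EX_REC_WRITE, EX_REC_END };

/* Hypothetical send state; the two flags mirror the idea in the text. */
struct ex_sendarg { bool sent_begin; bool sent_end; int nrecords; };

/* Emit one record and note whether it was a BEGIN or an END. */
static int
ex_dump_record(struct ex_sendarg *sa, enum ex_rec_type type)
{
	if (type == EX_REC_BEGIN)
		sa->sent_begin = true;
	else if (type == EX_REC_END)
		sa->sent_end = true;
	sa->nrecords++;
	return (0);
}

/* Send BEGIN, nwrites WRITE records, then END. */
static int
ex_send(struct ex_sendarg *sa, int nwrites)
{
	int err = ex_dump_record(sa, EX_REC_BEGIN);

	for (int i = 0; err == 0 && i < nwrites; i++)
		err = ex_dump_record(sa, EX_REC_WRITE);
	if (err == 0)
		err = ex_dump_record(sa, EX_REC_END);

	/* On success the stream must be framed by BEGIN and END. */
	if (err == 0)
		assert(sa->sent_begin && sa->sent_end);
	return (err);
}
```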

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matt Krantz <matt.krantz@delphix.com>
2016-09-03 10:10:58 +00:00
Alexander Motin
7aafc9d4c8 MFV r304156: 7235 remove unused func dsl_dataset_set_blkptr
illumos/illumos-gate@bd56f80007
https://github.com/illumos/illumos-gate/commit/bd56f80007857b960e0981ed0797ad8ec844a96b

https://www.illumos.org/issues/7235
  The function dsl_dataset_set_blkptr() is unused. We should remove it.

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
2016-09-03 10:09:23 +00:00
Alexander Motin
c9fa25c110 MFV r304155: 7090 zfs should improve allocation order and throttle allocations
illumos/illumos-gate@0f7643c737
https://github.com/illumos/illumos-gate/commit/0f7643c7376dd69a08acbfc9d1d7d548b10c846a

https://www.illumos.org/issues/7090
  When write I/Os are issued, they are issued in block order but the ZIO pipeline
  will drive them asynchronously through the allocation stage which can result in
  blocks being allocated out-of-order. It would be nice to preserve as much of
  the logical order as possible.
  In addition, the allocations are equally scattered across all top-level VDEVs
  but not all top-level VDEVs are created equally. The pipeline should be able to
  detect devices that are more capable of handling allocations and should
  allocate more blocks to those devices. This allows for dynamic allocation
  distribution when devices are imbalanced as fuller devices will tend to be
  slower than empty devices.
  The change includes a new pool-wide allocation queue which would throttle and
  order allocations in the ZIO pipeline. The queue would be ordered by issued
  time and offset and would provide an initial amount of allocation of work to
  each top-level vdev. The allocation logic utilizes a reservation system to
  reserve allocations that will be performed by the allocator. Once an allocation
  is successfully completed it's scheduled on a given top-level vdev. Each top-
  level vdev maintains a maximum number of allocations that it can handle
  (mg_alloc_queue_depth). The pool-wide reserved allocations (top-levels *
  mg_alloc_queue_depth) are distributed across the top-level vdevs' metaslab
  groups, going round robin across all eligible metaslab groups to distribute
  the work. As top-levels complete their work, they receive additional work from the
  pool-wide allocation queue until the allocation queue is emptied.
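
The reservation scheme described above can be sketched in miniature as follows. This is an illustration under assumptions, not the ZFS implementation: ex_group, ex_throttle, ex_reserve() and ex_complete() are hypothetical names, queue_depth plays the role of mg_alloc_queue_depth, and ordering by issue time and offset is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-top-level-vdev state. */
struct ex_group { int queue_depth; int reserved; };

/* Hypothetical pool-wide throttle with a round-robin rotor. */
struct ex_throttle { struct ex_group *groups; size_t ngroups; size_t rotor; };

/*
 * Reserve one allocation slot, rotating across groups that still have
 * room. Returns the group index, or -1 when every group is at its depth
 * and the allocation has to wait in the queue.
 */
static int
ex_reserve(struct ex_throttle *t)
{
	for (size_t n = 0; n < t->ngroups; n++) {
		size_t i = (t->rotor + n) % t->ngroups;
		struct ex_group *g = &t->groups[i];

		if (g->reserved < g->queue_depth) {
			g->reserved++;
			t->rotor = (i + 1) % t->ngroups;
			return ((int)i);
		}
	}
	return (-1);
}

/* A group finished one allocation and can accept more work. */
static void
ex_complete(struct ex_throttle *t, int idx)
{
	assert(idx >= 0 && (size_t)idx < t->ngroups);
	assert(t->groups[idx].reserved > 0);
	t->groups[idx].reserved--;
}
```

With depths of 2 and 1, three reservations succeed and the fourth is refused until one of the groups completes an allocation.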

Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: George Wilson <george.wilson@delphix.com>
2016-09-03 10:04:37 +00:00
Alexander Motin
0b51a59fc7 MFV r303078:
7086 ztest attempts dva_get_dsize_sync on an embedded blockpointer

illumos/illumos-gate@926549256b
https://github.com/illumos/illumos-gate/commit/926549256b71acd595f69b236779ff6b78fa08ef

https://www.illumos.org/issues/7086
  In dbuf_dirty(), we need to grab the dn_struct_rwlock before looking at the
  db_blkptr, to prevent it from being changed by syncing context.
  Otherwise we may see that ztest got a segfault from this stack:
  libzpool.so.1`dva_get_dsize_sync+0x98(872f000, b32b240, fed7811b, 0, b4cda20, 0)
  libzpool.so.1`bp_get_dsize+0x60(872f000, b32b240, 0, 97cb780, 9d4c1a8, 0)
  libzpool.so.1`dbuf_dirty+0x9b3(ce0a100, 97cb780, 9, fecd2530)
  libzpool.so.1`dmu_buf_will_dirty+0xc3(ce0a100, 97cb780, ea293d6c, 1)
  libzpool.so.1`zap_lockdir+0x1a0(8aaa3c0, 1, 0, 97cb780, 1, 1)
  libzpool.so.1`zap_remove_norm+0x30(8aaa3c0, 1, 0, 8728b10, 0, 97cb780)
  libzpool.so.1`zap_remove+0x29(8aaa3c0, 1, 0, 8728b10, 97cb780, a)
  ztest_replay_remove+0x225(ea294588, 8728ae8, 0, 38010000, 0, 0)
  ztest_remove+0x9f(ea294588, ea293f50, 4, 3)
  ztest_object_init+0x78(ea294588, ea293f50, 4e0, 1)
  ztest_dmu_object_alloc_free+0x71(ea294588, 13)
  ztest_dmu_objset_create_destroy+0x224(80cef08, 13, 0, 805d36c, 9017ad44, 0)
  ztest_execute+0x89(a, 807c720, 13, 0)
  ztest_thread+0xea(13, 0, 0, 0)
  libc.so.1`_thrp_setup+0x88(f0983240)
  libc.so.1`_lwp_start(f0983240, 0, 0, 0, 0, 0)
  Looking into it a bit, we see that this is an embedded blockpointer, so
  BP_GET_NDVAS should have returned 0:
       b32b240::blkptr
  EMBEDDED [L0 ZAP_OTHER] et=0 LZ4 size=200L/4aP birth=80L
  Instead, it looks like another thread is modifying this blockpointer:
       b32b240::ugrep | ::whatis
  f47a0e0c is in [ stack tid=0x19f ]
  ebd6ec40 is in [ stack tid=0x226 ]
  ea293bd0 is in [ stack tid=0x244 ]
  ea293be4 is in [ stack tid=0x244 ]

Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
2016-09-03 08:43:43 +00:00
Alexander Motin
84c3781ac9 MFV r303077:
7072 zfs fails to expand if lun added when os is in shutdown state

illumos/illumos-gate@c39a2aae1e

https://www.illumos.org/issues/7072
  upstream:
  38733 zfs fails to expand if lun added when os is in shutdown state
  DLPX-36910 spares and caches should not display expandable space
  DLPX-39262 vdev_disk_open spam zfs_dbgmsg buffer

Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: George Wilson <george.wilson@delphix.com>
2016-09-03 08:42:12 +00:00
Alexander Motin
efa0867fb0 MFV r302991: 6950 ARC should cache compressed data
illumos/illumos-gate@dcbf3bd6a1

https://www.illumos.org/issues/6950
  When reading compressed data from disk, the ARC should keep the compressed
  block cached and only decompress it when consumers access the block. The
  uncompressed data should be short-lived allowing the ARC to cache a much larger
  amount of data. The DMU would also maintain a smaller cache of uncompressed
  blocks to minimize the impact of decompressing frequently accessed blocks.

Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Don Brady <don.brady@intel.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: George Wilson <george.wilson@delphix.com>
2016-09-03 08:30:51 +00:00