Commit Graph

112170 Commits

Author SHA1 Message Date
Ruslan Bukin
98f50c44e3 Update RISC-V port to Privileged Architecture Version 1.9.
Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-08-02 14:50:14 +00:00
Andrey V. Elsukov
723758b7ce Fix NULL pointer dereference.
ro pointer can be NULL when IPSec consumes mbuf.

PR:		211486
MFC after:	3 days
2016-08-02 12:18:06 +00:00
Sepherosa Ziehau
05cde7efa6 tcp/lro: Implement hash table for LRO entries.
This significantly improves HTTP workload performance and reduces
HTTP workload latency.

Reviewed by:	rrs, gallatin, hps
Obtained from:	rrs, gallatin
Sponsored by:	Netflix (rrs, gallatin), Microsoft (sephe)
Differential Revision:	https://reviews.freebsd.org/D6689
2016-08-02 06:36:47 +00:00
Mateusz Guzik
fa5000a4f3 locks: fix compilation for KDTRACE_HOOKS && !ADAPTIVE_* case
Reported by:	Michael Butler <imb protected-networks.net>
2016-08-02 03:05:59 +00:00
Mateusz Guzik
0412689595 locks: fix up ifdef guards introduced in r303643
Both sx and rwlocks had copy-pasted ADAPTIVE_MUTEXES instead of the correct
define.

MFC after:	1 week
2016-08-02 00:15:08 +00:00
Conrad Meyer
9809e9dc3a rtentry: Initialize rt_mtx with MTX_NEW
The "rtentry" zone does not use UMA_ZONE_ZINIT, so it is invalid to assume the
mutex's memory will be zero.  Without MTX_NEW, garbage backing memory may
trigger the "re-initializing a mutex" assertion.

PR:		200991
Submitted by:	Chang-Hsien Tsai <luke.tw AT gmail.com>
2016-08-01 23:07:31 +00:00
Conrad Meyer
bf4c239e47 opencrypto AES-ICM: Fix heap corruption typo
This error looks like it was a simple copy-paste typo in the original commit
for this code (r275732).

PR:		204009
Reported by:	Chang-Hsien Tsai <luke.tw AT gmail.com>
Sponsored by:	EMC / Isilon Storage Division
2016-08-01 22:57:03 +00:00
Conrad Meyer
5f00f45775 Fix ddb "show proc" to show full arguments
PR:		200052
Submitted by:	Chang-Hsien Tsai <luke.tw AT gmail.com>
2016-08-01 22:41:50 +00:00
John Baldwin
315048f2ad Store the offset of the KDOORBELL and GTS registers in the softc.
VF devices use a different register layout than PF devices.  Storing
the offset in a value in the softc allows code to be shared between the
PF and VF drivers.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7389
2016-08-01 22:39:51 +00:00
Mark Johnston
49fe7b5029 ipoib: Bound the number of egress mbufs buffered during pathrec lookups.
In pathological situations where the master subnet manager becomes
unresponsive for an extended period, we may otherwise end up queuing all
of the system's mbufs while waiting for a response to a path record lookup.

This addresses the same issue as commit 1e85b806f9 in Linux.

Reviewed by:	cem, ngie
MFC after:	2 weeks
Sponsored by:	EMC / Isilon Storage Division
2016-08-01 22:22:11 +00:00
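The bounding idea in 49fe7b5029 can be illustrated with a minimal sketch of a capped FIFO. This is not the ipoib code: `struct pkt`, `struct pending_queue`, `pending_enqueue()` and the `limit` field are hypothetical names standing in for the mbuf queue and its bound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf; only the link field matters here. */
struct pkt {
	struct pkt *next;
};

/* FIFO of packets waiting for a lookup to finish, with a hard cap. */
struct pending_queue {
	struct pkt *head;
	struct pkt *tail;
	size_t len;	/* packets currently queued */
	size_t limit;	/* maximum packets allowed (assumed tunable) */
};

/*
 * Append a packet unless the queue is already at its limit.
 * Returns false when the packet was rejected, so the caller can
 * free it instead of letting the queue grow without bound.
 */
static bool
pending_enqueue(struct pending_queue *q, struct pkt *p)
{
	assert(q != NULL && p != NULL);

	if (q->len >= q->limit)
		return false;

	p->next = NULL;
	if (q->tail != NULL)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
	q->len++;
	return true;
}
```

With the cap in place, an unresponsive peer costs at most `limit` buffered packets per queue rather than all of the system's buffers.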
John Baldwin
2611037c1c Disable PCI hotplug support for slots with power controllers.
After further review of the spec, I do not think the current HotPlug
code handles slots with power controllers correctly.  In particular,
the power state of the slot is to be inferred from other events, not
from examining the state of the power control bit in SLOT_CTL.  For now,
disable PCI hotplug support on such slots.

PR:		211081
Tested by:	Jeffrey E Pieper <jeffrey.e.pieper@intel.com>
MFC after:	3 days
2016-08-01 22:19:23 +00:00
Mateusz Guzik
1ada904147 Implement trivial backoff for locking primitives.
All current spinning loops retry an atomic op the first chance they get,
which leads to performance degradation under load.

One classic solution to the problem consists of delaying the test to an
extent. This implementation has a trivial linear increment and a random
factor for each attempt.

For simplicity, this first touch implementation only modifies spinning
loops where the lock owner is running. Spin mutexes and thread lock were
not modified.

Current parameters are autotuned on boot based on mp_cpus.

Autotune factors are very conservative and are subject to change later.

Reviewed by:	kib, jhb
Tested by:	pho
MFC after:	1 week
2016-08-01 21:48:37 +00:00
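The backoff scheme described in 1ada904147 (a linear increment plus a random factor per attempt) can be sketched roughly as follows. This is a userspace illustration using C11 atomics, not the kernel code: `struct backoff_cfg`, `backoff_next()`, `spin_acquire()`, the xorshift generator and the parameter values are all assumptions, and the real parameters are autotuned at boot.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical tunables; the commit derives its own values at boot. */
struct backoff_cfg {
	unsigned base;	/* linear increment added after each failed attempt */
	unsigned max;	/* upper bound on the linear part of the delay */
};

struct backoff_state {
	unsigned delay;	/* current linear delay */
	uint32_t seed;	/* per-caller PRNG state, must be nonzero */
};

/* Small xorshift32 generator, used only to de-synchronize spinners. */
static uint32_t
backoff_rand(uint32_t *seed)
{
	uint32_t x = *seed;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*seed = x;
	return x;
}

/* Compute how many iterations to wait before the next attempt. */
static unsigned
backoff_next(const struct backoff_cfg *cfg, struct backoff_state *st)
{
	assert(cfg->base > 0 && cfg->max >= cfg->base && st->seed != 0);

	st->delay += cfg->base;
	if (st->delay > cfg->max)
		st->delay = cfg->max;
	/* Linear part plus a random factor in [0, base). */
	return st->delay + backoff_rand(&st->seed) % cfg->base;
}

/* Spin on a flag, delaying between attempts instead of retrying at once. */
static void
spin_acquire(atomic_flag *lock, const struct backoff_cfg *cfg, uint32_t seed)
{
	struct backoff_state st = { 0, seed };

	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire)) {
		unsigned n = backoff_next(cfg, &st);

		for (volatile unsigned i = 0; i < n; i++)
			;	/* busy-wait; a kernel would use a pause hint */
	}
}
```

The random factor keeps contending CPUs from retrying in lockstep, while the cap keeps a waiter from sleeping through the moment the lock becomes free.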
Sean Bruno
e72a746a6e r293331 mistakenly failed to add an assignment of paddr to the rxbuf,
but only in the NETMAP code.  This led to the NETMAP code paths
passing nothing up to userland.

Submitted by:	Ad Schellevis <ad@opnsense.org>
Reported by:	Franco Fichtner <franco@opnsense.org>
MFC after:	1 day
2016-08-01 21:19:51 +00:00
Andrey V. Elsukov
0428336393 Do not invoke resize event if initial disk size is zero. Some disks
report the size only after first opening.  Because the events are
asynchronous, some consumers can receive this event too late, which
confuses them. This partially restores the previous behaviour and
should also fix the problem where an already opened provider loses
the resize event.

PR:		211028
MFC after:	3 weeks
2016-08-01 20:54:54 +00:00
Mark Johnston
4e071758a7 MFV be9130cc9: "IB/cma: Check for GID on listening devices first"
This is an optimization that improves IB connection setup times.

Discussed with:	hselasky
Obtained from:	Linux
MFC after:	2 weeks
Sponsored by:	EMC / Isilon Storage Division
2016-08-01 20:29:09 +00:00
Mark Johnston
82f1d3ea2f MFV 29f27e847: "IB/cma: Use cached gids"
This addresses a regression from an earlier upstream change which caused
cma_acquire_dev() to bypass the port GID cache and instead query the HCA
for each entry in its GID table. These queries can become extremely slow on
multiport devices, which has a negative impact on connection setup times.

Discussed with:	hselasky
Obtained from:	Linux
MFC after:	2 weeks
Sponsored by:	EMC / Isilon Storage Division
2016-08-01 20:27:11 +00:00
Allan Jude
4deb8929ea Make boot code and loader check for unsupported ZFS feature flags
OpenZFS uses feature flags instead of a zpool version number to track
features since the split from Oracle. In addition to avoiding confusion
on ZFS vs OpenZFS version numbers, this also allows features to be added
to different operating systems that use OpenZFS in different order.

The previous zfs boot code (gptzfsboot) and loader (zfsloader) blindly
tried to read the pool and, if that failed, provided only a vague error message.

With this change, both the boot code and loader check the MOS features
list in the ZFS label and compare it against the list of features that
the loader supports. If any unsupported feature is active, the pool is
not considered as a candidate for booting, and a helpful diagnostic
message is printed to the screen. Features that are merely enabled via
zpool upgrade, but not in use, do not block booting from the pool.

Submitted by:	Toomas Soome <tsoome@me.com>
Reviewed by:	delphij, mav
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D6857
2016-08-01 19:37:43 +00:00
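The check described in 4deb8929ea amounts to comparing a pool's active features against a supported list. A minimal sketch, assuming a simplified in-memory view of the feature list: `struct pool_feature`, `feature_supported()` and `first_blocking_feature()` are hypothetical names, and the real code reads the features from the on-disk pool metadata.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One entry from a pool's feature list (hypothetical layout). */
struct pool_feature {
	const char *name;
	bool active;	/* in use on disk, not merely enabled */
};

/* Look a feature name up in a NULL-terminated list of supported names. */
static bool
feature_supported(const char *name, const char *const *supported)
{
	for (size_t i = 0; supported[i] != NULL; i++) {
		if (strcmp(name, supported[i]) == 0)
			return true;
	}
	return false;
}

/*
 * Return the name of the first active feature the loader does not
 * support, or NULL if the pool is a valid boot candidate.  Features
 * that are enabled but not active never block booting.
 */
static const char *
first_blocking_feature(const struct pool_feature *feats, size_t nfeats,
    const char *const *supported)
{
	assert(feats != NULL || nfeats == 0);
	assert(supported != NULL);

	for (size_t i = 0; i < nfeats; i++) {
		if (feats[i].active &&
		    !feature_supported(feats[i].name, supported))
			return feats[i].name;
	}
	return NULL;
}
```

Returning the offending name, rather than a bare error code, is what lets the caller print a specific diagnostic instead of a vague failure.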
Alan Cox
87ff568c26 Restore the historical behavior of "sysctl vm.swap_idle_enabled=1". Prior
to r254304, we had separate functions for reclamation and laundering
(vm_pageout_scan) versus updating usage information, i.e., "reference
bits", on active pages (vm_pageout_page_stats), and we only performed
vm_req_vmdaemon(VM_SWAP_IDLE) if vm_pages_needed was true.  However, since
r254303, if vm_swap_idle_enabled was "1", we have performed
vm_req_vmdaemon(VM_SWAP_IDLE) regardless of whether we are short of free
pages.  This was unintended and too aggressive, so I suspect no one uses
this feature.  With this change, we restore the historical behavior and
only perform vm_req_vmdaemon(VM_SWAP_IDLE) when we are short of free
pages.

Reviewed by:	kib, markj
2016-08-01 17:25:07 +00:00
Andrew Gallatin
d4c22202e6 Rework IPV6 TCP path MTU discovery to match IPv4
- Re-write tcp_ctlinput6() to closely mimic the IPv4 tcp_ctlinput()

- Now that tcp_ctlinput6() updates t_maxseg, we can allow ip6_output()
  to send TCP packets without looking at the tcp host cache for every
  single transmit.

- Make the icmp6 code mimic the IPv4 code & avoid returning
  PRC_HOSTDEAD because it is so expensive.

Without these changes in place, every TCP6 pmtu discovery or host
unreachable ICMP resulted in a call to in6_pcbnotify() which walks the
tcbinfo table with the write lock held.  Because the tcbinfo table is
shared between IPv4 and IPv6, this causes huge scalability issues on
servers with lots of (~100K) TCP connections, to the point where even
a small percent of IPv6 traffic had a disproportionate impact on
overall throughput.

Reviewed by:	bz, rrs, ae (all earlier versions), lstewart (in Netflix's tree)
Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D7272
2016-08-01 17:02:21 +00:00
Landon J. Fuller
6ed73911fe [mips/broadcom] Fetch UART console configuration from CFE.
Relying on the boot loader console configuration allows us to use a
common set of device hints for all SENTRY5 devices.

Approved by:	adrian (mentor)
Differential Revision:	https://reviews.freebsd.org/D7376
2016-08-01 16:29:32 +00:00
Andrew Turner
727c18a84f Split out the FDT parts of the GICv2 interrupt controller driver. This will
allow us to add an ACPI attachment for arm64.

Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D7307
2016-08-01 16:29:04 +00:00
Landon J. Fuller
3cbea0b15c Sync CFE interface with upstream cfe-1.4.2 release.
Approved by:	adrian (mentor)
Obtained from:	https://www.broadcom.com/support/communications-processors
Differential Revision:	https://reviews.freebsd.org/D7375
2016-08-01 16:26:08 +00:00
Andrew Turner
698c14e189 Add a kernel variable to let the user select their preferred order
between ACPI and FDT. This will be needed on machines with both, e.g. the
SoftIron Overdrive 3000. The kernel will accept one or more comma separated
values of either 'acpi' or 'fdt'. Any other values are skipped.

The user can set it either on the loader command line or in loader.conf,
e.g. in loader.conf:
kern.cfg.order=acpi,fdt

This will try using ACPI then FDT. If none of the selected options work the
kernel tries to use one to get the serial console, then panics.

Reviewed by:	emaste (earlier version)
Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D7274
2016-08-01 12:17:44 +00:00
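The parsing rule stated in 698c14e189 (comma-separated values of 'acpi' or 'fdt', anything else skipped) can be sketched as below. This is an illustration only: `enum cfg_method` and `parse_cfg_order()` are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical identifiers for the two configuration methods. */
enum cfg_method {
	CFG_ACPI,
	CFG_FDT
};

/*
 * Parse a string such as "acpi,fdt" into an ordered list of methods.
 * Unknown tokens are skipped.  Returns the number of entries stored
 * in out[], which is never more than max.
 */
static size_t
parse_cfg_order(const char *order, enum cfg_method *out, size_t max)
{
	size_t n = 0;

	assert(order != NULL && out != NULL);

	while (*order != '\0' && n < max) {
		/* Length of the current token, up to ',' or end of string. */
		size_t len = strcspn(order, ",");

		if (len == 4 && strncmp(order, "acpi", 4) == 0)
			out[n++] = CFG_ACPI;
		else if (len == 3 && strncmp(order, "fdt", 3) == 0)
			out[n++] = CFG_FDT;
		/* Any other token is silently skipped. */

		order += len;
		if (*order == ',')
			order++;
	}
	return n;
}
```

A caller would then try each returned method in order and fall back to its failure path when the list is exhausted.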
Julian Elischer
d7373c820e netgraph module for reconstructing checksums
PR:		206108
Submitted by:	Dmitry Vagin  daemon.hammer@ya.ru
MFC after:	1 month
2016-08-01 12:09:04 +00:00
Julian Elischer
bf909fc9a4 Slight style changes. There is an incoming patch that rewrites a
lot of this module and I want to get the style and whitespace changes in
a separate commit (or maybe more).

PR: 206185
Submitted by:	Dmitry Vagin
MFC after:	1 month
2016-08-01 11:34:12 +00:00
Andrew Turner
49a92cd4a5 Add the fields for the PAR_EL1 register. This is used when performing an
address lookup with the AT instructions.

Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-08-01 10:36:58 +00:00
Sepherosa Ziehau
05f7a8a69c hyperv/storvsc: Stringent PRP list assertions
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7361
2016-08-01 05:09:11 +00:00
Sepherosa Ziehau
88cb8d7812 hyperv/storvsc: Set maxio to 128KB.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7360
2016-08-01 04:51:31 +00:00
Sepherosa Ziehau
9d6016a773 hyperv/vmbus: Remove the artificial entry limit of SG and PRP list.
Just make sure that the total channel packet size does not exceed 1/2
data size of the TX bufring.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7359
2016-08-01 04:26:24 +00:00
Adrian Chadd
52fe68b878 [ath] update comments.
2016-08-01 00:36:29 +00:00
Andrew Turner
63512a1276 Add the Data Fault Status Code values to the ESR_ELx registers for when the
fault code is a Data Abort.

Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-07-31 18:58:20 +00:00
Andrew Turner
ed24579135 Extract the common parts of pmap_kenter_device to a new function. This will
be used when superpage support is added.

Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-07-31 17:53:09 +00:00
Andrew Turner
7592321ad1 Fix the comment above pmap_invalidate_page. tlbi will invalidate the tlb
on all CPUs.

Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-07-31 14:59:44 +00:00
Andrew Turner
c5b3b20907 Relax the barriers around a TLB invalidation to only wait on
inner-shareable memory accesses. There is no need for full system barriers.

Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-07-31 12:59:10 +00:00
Mateusz Guzik
61852185ba locks: change sleep_cnt and spin_cnt types to u_int
Both variables are uint64_t, but they only count spins or sleeps.
All reasonable values which we can get here comfortably fit in 32-bit range.

Suggested by: kib
MFC after:	1 week
2016-07-31 12:11:55 +00:00
Mateusz Guzik
b7ff4d59de amd64: implement pagezero using rep stos
The current implementation uses non-temporal writes. This turns out to
be detrimental to performance if the page is used shortly after, which
is the typical case with page faults.

Switch to rep stos.

Reviewed by:	kib
MFC after:	1 week
2016-07-31 11:34:08 +00:00
Adrian Chadd
c2346bfa26 [wdr4300] invert the GPIO LED polarity.
This makes them behave correctly.

Submitted by:	Dan Nelson <dnelson_1901@yahoo.com>
2016-07-31 06:52:19 +00:00
Adrian Chadd
accdd12e71 [ar71xx_gpio] handle AR934x and QCA953x GPIO OE polarity.
For reasons I won't comment on, the AR934x and QCA953x GPIO_OE register
value is inverted - bit set == input, bit clear == output.

So, fix this in the output setting, in reading the initial state from
the boot loader, and also setting any gpiofunc pins that are necessary.
2016-07-31 06:51:34 +00:00
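The polarity rule from accdd12e71 (on the affected chips, bit set == input and bit clear == output) can be captured in two small helpers. This is a sketch over a plain register value: `gpio_oe_set_direction()`, `gpio_oe_is_output()` and the `oe_inverted` flag are hypothetical names, and the non-inverted case is assumed to mean bit set == output.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return the new GPIO_OE register value after configuring one pin.
 * oe_inverted selects the inverted encoding, where a set bit means
 * input and a clear bit means output.
 */
static uint32_t
gpio_oe_set_direction(uint32_t oe, unsigned pin, bool output, bool oe_inverted)
{
	assert(pin < 32);

	uint32_t mask = UINT32_C(1) << pin;
	/* Decide whether the bit must be set for the requested direction. */
	bool set_bit = oe_inverted ? !output : output;

	return set_bit ? (oe | mask) : (oe & ~mask);
}

/* Decode the direction of one pin from a GPIO_OE register value. */
static bool
gpio_oe_is_output(uint32_t oe, unsigned pin, bool oe_inverted)
{
	assert(pin < 32);

	bool bit = (oe & (UINT32_C(1) << pin)) != 0;

	return oe_inverted ? !bit : bit;
}
```

Keeping the inversion in one place means the same helpers serve both the output-setting path and the code that reads the initial state left by the boot loader.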
Enji Cooper
4fcae4df7e Conditionalize code which defines sysctls per _KERNEL #ifdef guard
This resolves several issues when compiling libzpool (userspace library), i.e.
-Wimplicit-function-declaration and -Wmissing-declarations issues.

MFC after:	2 weeks
Reported by:	clang
Tested with:	clang 3.8.1, gcc 4.2.1, gcc 5.3.0
Sponsored by:	EMC / Isilon Storage Division
2016-07-31 06:34:49 +00:00
Adrian Chadd
d75e8c5089 [gpioled] add support for inverting the LED polarity.
No, this isn't a star trek science joke - sometimes LEDs are wired
up to be active low, so this is needed.

Submitted by:	Dan Nelson <dnelson_1901@yahoo.com>
2016-07-31 06:24:26 +00:00
Mateusz Guzik
e0c45af904 sx: increment spin_cnt before cpu_spinwait in xlock
The change is a no-op only done for consistency with the rest of the file.
2016-07-30 22:23:31 +00:00
Mateusz Guzik
7a54be1870 rwlock: s/READER/WRITER/ in wlock lockstat annotation
2016-07-30 22:21:48 +00:00
Alexander Motin
7e58465356 Wrap previous MSIX workaround into #ifndef EARLY_AP_STARTUP.
With EARLY_AP_STARTUP we can successfully negotiate MSIX earlier.

Requested by:	jhb@
2016-07-30 21:06:59 +00:00
Bjoern A. Zeeb
6ca2d09437 Try to declare _hw_pci for all sysctl cases needed after r303497.
MFC after:	5 days
X-MFC with:	r303497
2016-07-30 20:31:12 +00:00
Imre Vadász
a3c0e7f2fb [iwm] Fix iwm_poll_bit() usage in iwm_stop_device(), fixup r303418.
* iwm_poll_bit() returns 1 on success and 0 on failure, whereas
  iwl_poll_bit() in Linux's iwlwifi returns >= 0 on success and < 0 on
  failure.

* Because of the wrong iwm_poll_bit return code check, no warning was
  printed if tx DMA stopping failed.

Approved by:	adrian (mentor)
Differential Revision:	https://reviews.freebsd.org/D7371
2016-07-30 19:03:32 +00:00
Allan Jude
aafbb33897 Improve boot loader quote parsing
parse() in the boot loader's interp_parse.c is too naive about quotes.

Both single and double quotes were allowed to be mixed, and single
quotes did not follow the usual semantics (re variable expansion).

The old code did not check for terminating quotes.

This update implements:
 * distinguishing single and double quotes
 * variable expansion will not be done inside a single-quote protected area
 * will preserve inner quotes for values like "value 'some list'"
 * ending quote check.

This diff does not implement an ending quote order check. It shouldn't
be too hard, but needs some improvements to the parser state machine.

PR:		204602
Submitted by:	Toomas Soome <tsoome@me.com>
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D6000
2016-07-30 17:53:37 +00:00
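The quoting rules listed in aafbb33897 can be illustrated with a small scanner. This is a sketch under stated assumptions, not the loader's parse(): `count_expandable()` is a hypothetical helper, backslash escapes are ignored, and the outermost quote is assumed to decide whether a `$` is expandable (shell-like behaviour, which the commit message does not spell out).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Scan a command line, tracking which quote (if any) is open.
 * Stores in *count the number of '$' characters eligible for
 * variable expansion, i.e. those not protected by single quotes.
 * Returns 0 on success, or -1 if a quote is left unterminated.
 */
static int
count_expandable(const char *s, size_t *count)
{
	char quote = '\0';	/* currently open quote character, or 0 */
	size_t n = 0;

	assert(s != NULL && count != NULL);

	for (; *s != '\0'; s++) {
		if (quote == '\0') {
			if (*s == '\'' || *s == '"')
				quote = *s;	/* open a quoted area */
			else if (*s == '$')
				n++;		/* unquoted: expandable */
		} else if (*s == quote) {
			quote = '\0';		/* matching close quote */
		} else if (*s == '$' && quote == '"') {
			n++;			/* double quotes allow expansion */
		}
		/* The other quote character inside a quoted area is literal. */
	}
	*count = n;
	return quote == '\0' ? 0 : -1;
}
```

Because only the matching character closes a quoted area, a value such as "value 'some list'" keeps its inner single quotes intact, and a missing closing quote is reported instead of being silently accepted.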
Allan Jude
6bf8d16009 bcache should support reads shorter than sector size
dosfs (FAT file systems) can perform reads of partial sectors;
bcache should support such reads.

Submitted by:	Toomas Soome <tsoome@me.com>
Reviewed by:	cem
Differential Revision:	https://reviews.freebsd.org/D6475
2016-07-30 17:45:56 +00:00
Alexander Motin
a8ec75016f Block MSIX negotiation until SMP started and IRQ reshuffled.
2016-07-30 15:56:36 +00:00
Alexander Motin
8260941c76 Make MAC address generation more random.
'ticks' approach does not work at boot time.
2016-07-30 15:51:16 +00:00
Alexander Motin
cf2e151f65 Fix infinite loops introduced at r303429.
2016-07-30 10:32:28 +00:00