Commit Graph

112162 Commits

Author SHA1 Message Date
John Baldwin
315048f2ad Store the offset of the KDOORBELL and GTS registers in the softc.
VF devices use a different register layout than PF devices.  Storing
the offset in a value in the softc allows code to be shared between the
PF and VF drivers.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7389
2016-08-01 22:39:51 +00:00
Mark Johnston
49fe7b5029 ipoib: Bound the number of egress mbufs buffered during pathrec lookups.
In pathological situations where the master subnet manager becomes
unresponsive for an extended period, we may otherwise end up queuing all
of the system's mbufs while waiting for a response to a path record lookup.

This addresses the same issue as commit 1e85b806f9 in Linux.

Reviewed by:	cem, ngie
MFC after:	2 weeks
Sponsored by:	EMC / Isilon Storage Division
2016-08-01 22:22:11 +00:00
John Baldwin
2611037c1c Disable PCI hotplug support for slots with power controllers.
After further review of the spec, I do not think the current HotPlug
code handles slots with power controllers correctly.  In particular,
the power state of the slot is to be inferred from other events, not
from examining the state of the power control bit in SLOT_CTL.  For now,
disable PCI hotplug support on such slots.

PR:		211081
Tested by:	Jeffrey E Pieper <jeffrey.e.pieper@intel.com>
MFC after:	3 days
2016-08-01 22:19:23 +00:00
Mateusz Guzik
1ada904147 Implement trivial backoff for locking primitives.
All current spinning loops retry an atomic op the first chance they get,
which leads to performance degradation under load.

One classic solution to the problem consists of delaying the test to an
extent. This implementation has a trivial linear increment and a random
factor for each attempt.

For simplicity, this first thouch implementation only modifies spinning
loops where the lock owner is running. spin mutexes and thread lock were
not modified.

Current parameters are autotuned on boot based on mp_cpus.

Autotune factors are very conservative and are subject to change later.

Reviewed by:	kib, jhb
Tested by:	pho
MFC after:	1 week
2016-08-01 21:48:37 +00:00
Sean Bruno
e72a746a6e r293331 mistakingly failed to add an assignment of paddr to the rxbuf
but only in the NETMAP code.  This lead to the NETMAP code paths
passing nothing up to userland.

Submitted by:	Ad Schellevis <ad@opnsense.org>
Reported by:	Franco Fichtner <franco@opnsense.org>
MFC after:	1 day
2016-08-01 21:19:51 +00:00
Andrey V. Elsukov
0428336393 Do not invoke resize event if initial disk size is zero. Some disks
report the size only after first opening.  And due to the events are
asynchronous, some consumers can receive this event too late and
this confuses them. This partially restores previous behaviour, and
at the same time this should fix the problem, when already opened
provider loses resize event.

PR:		211028
MFC after:	3 weeks
2016-08-01 20:54:54 +00:00
Mark Johnston
4e071758a7 MFV be9130cc9: "IB/cma: Check for GID on listening devices first"
This is an optimization that improves IB connection setup times.

Discussed with:	hselasky
Obtained from:	Linux
MFC after:	2 weeks
Sponsored by:	EMC / Isilon Storage Division
2016-08-01 20:29:09 +00:00
Mark Johnston
82f1d3ea2f MFV 29f27e847: "IB/cma: Use cached gids"
This addresses a regression from an earlier upstream change which caused
cma_acquire_dev() to bypass the port GID cache and instead query the HCA
for each entry in its GID table. These queries can become extremely slow on
multiport devices, which has a negative impact on connection setup times.

Discussed with:	hselasky
Obtained from:	Linux
MFC after:	2 weeks
Sponsored by:	EMC / Isilon Storage Division
2016-08-01 20:27:11 +00:00
Allan Jude
4deb8929ea Make boot code and loader check for unsupported ZFS feature flags
OpenZFS uses feature flags instead of a zpool version number to track
features since the split from Oracle. In addition to avoiding confusion
on ZFS vs OpenZFS version numbers, this also allows features to be added
to different operating systems that use OpenZFS in different order.

The previous zfs boot code (gptzfsboot) and loader (zfsloader) blindly
tries to read the pool, and if failed provided only a vague error message.

With this change, both the boot code and loader check the MOS features
list in the ZFS label and compare it against the list of features that
the loader supports. If any unsupported feature is active, the pool is
not considered as a candidate for booting, and a helpful diagnostic
message is printed to the screen. Features that are merely enabled via
zpool upgrade, but not in use, do not block booting from the pool.

Submitted by:	Toomas Soome <tsoome@me.com>
Reviewed by:	delphij, mav
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D6857
2016-08-01 19:37:43 +00:00
Alan Cox
87ff568c26 Restore the historical behavior of "sysctl vm.swap_idle_enabled=1". Prior
to r254304, we had separate functions for reclamation and laundering
(vm_pageout_scan) versus updating usage information, i.e., "reference
bits", on active pages (vm_pageout_page_stats), and we only performed
vm_req_vmdaemon(VM_SWAP_IDLE) if vm_pages_needed was true.  However, since
r254303, if vm_swap_idle_enabled was "1", we have performed
vm_req_vmdaemon(VM_SWAP_IDLE) regardless of whether we are short of free
pages.  This was unintended and too aggressive, so I suspect no one uses
this feature.  With this change, we restore the historical behavior and
only perform vm_req_vmdaemon(VM_SWAP_IDLE) when we are short of free
pages.

Reviewed by:	kib, markj
2016-08-01 17:25:07 +00:00
Andrew Gallatin
d4c22202e6 Rework IPV6 TCP path MTU discovery to match IPv4
- Re-write tcp_ctlinput6() to closely mimic the IPv4 tcp_ctlinput()

- Now that tcp_ctlinput6() updates t_maxseg, we can allow ip6_output()
  to send TCP packets without looking at the tcp host cache for every
  single transmit.

- Make the icmp6 code mimic the IPv4 code & avoid returning
  PRC_HOSTDEAD because it is so expensive.

Without these changes in place, every TCP6 pmtu discovery or host
unreachable ICMP resulted in a call to in6_pcbnotify() which walks the
tcbinfo table with the write lock held.  Because the tcbinfo table is
shared between IPv4 and IPv6, this causes huge scalabilty issues on
servers with lots of (~100K) TCP connections, to the point where even
a small percent of IPv6 traffic had a disproportionate impact on
overall throughput.

Reviewed by:	bz, rrs, ae (all earlier versions), lstewart (in Netflix's tree)
Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D7272
2016-08-01 17:02:21 +00:00
Landon J. Fuller
6ed73911fe [mips/broadcom] Fetch UART console configuration from CFE.
Relying on the boot loader console configuration allows us to use a
common set of device hints for all SENTRY5 devices.

Approved by:	adrian (mentor)
Differential Revision:	https://reviews.freebsd.org/D7376
2016-08-01 16:29:32 +00:00
Andrew Turner
727c18a84f Split out the FDT parts of the GICv2 interrupt controller driver. This will
allow us to add an ACPI attachment for arm64.

Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D7307
2016-08-01 16:29:04 +00:00
Landon J. Fuller
3cbea0b15c Sync CFE interface with upstream cfe-1.4.2 release.
Approved by:	adrian (mentor)
Obtained from:	https://www.broadcom.com/support/communications-processors
Differential Revision:	https://reviews.freebsd.org/D7375
2016-08-01 16:26:08 +00:00
Andrew Turner
698c14e189 Add a kernel variable to let the user to select their preferred order
between ACPI and FDT. This will be needed on machines with both, e.g. the
SoftIron Overdrive 3000. The kernel will accept one or more comma separated
values of either 'acpi' or 'fdt'. Any other values are skipped.

To set it the user can either set it on the loader command line, or
in loader.conf e.g. in loader.conf:
kern.cfg.order=acpi,fdt

This will try using ACPI then FDT. If none of the selected options work the
kernel tries to use one to get the serial console, then panics.

Reviewed by:	emaste (earlier version)
Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D7274
2016-08-01 12:17:44 +00:00
Julian Elischer
d7373c820e netgraph module for reconstructing checksums
PR:		206108
Submitted by:	Dmitry Vagin  daemon.hammer@ya.ru
MFC after:	1 month
2016-08-01 12:09:04 +00:00
Julian Elischer
bf909fc9a4 slite style changes. There is an incoming patch that rewrites a
lot of this module and I want to get the style and whitespace changes in
a separate commit (or maybe more).

PR: 206185
Submitted by:	Dmitry Vagin
MFC after:	1 month
2016-08-01 11:34:12 +00:00
Andrew Turner
49a92cd4a5 Add the fields for the PAR_EL1 register. This is used when performing an
address lookup with the AT instructions.

Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-08-01 10:36:58 +00:00
Sepherosa Ziehau
05f7a8a69c hyperv/storvsc: Stringent PRP list assertions
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7361
2016-08-01 05:09:11 +00:00
Sepherosa Ziehau
88cb8d7812 hyperv/storvsc: Set maxio to 128KB.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7360
2016-08-01 04:51:31 +00:00
Sepherosa Ziehau
9d6016a773 hyperv/vmbus: Remove the artificial entry limit of SG and PRP list.
Just make sure that the total channel packet size does not exceed 1/2
data size of the TX bufring.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7359
2016-08-01 04:26:24 +00:00
Adrian Chadd
52fe68b878 [ath] update comments. 2016-08-01 00:36:29 +00:00
Andrew Turner
63512a1276 Add the Data Fault Status Code values to the ESR_ELx registers for when the
fault code is a Data Abort.

Obtained from:	AT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-07-31 18:58:20 +00:00
Andrew Turner
ed24579135 Extract the common parts of pmap_kenter_device to a new function. This will
be used when superpage support is added.

Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-07-31 17:53:09 +00:00
Andrew Turner
7592321ad1 Fix the comment above pmap_invalidate_page. tlbi will invalidate the tlb
on all CPUs.

Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-07-31 14:59:44 +00:00
Andrew Turner
c5b3b20907 Relax the barriers around a TLB invalidation to only wait on
inner-shareable memory accesses. There is no need for full system barriers.

Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-07-31 12:59:10 +00:00
Mateusz Guzik
61852185ba locks: change sleep_cnt and spin_cnt types to u_int
Both variables are uint64_t, but they only count spins or sleeps.
All reasonable values which we can get here comfortably hit in 32-bit range.

Suggested by: kib
MFC after:	1 week
2016-07-31 12:11:55 +00:00
Mateusz Guzik
b7ff4d59de amd64: implement pagezero using rep stos
The current implementation uses non-temporal writes. This turns out to
be detrimental to performance if the page is used shortly after, which
is the typical case with page faults.

Switch to rep stos.

Reviewed by:	kib
MFC after:	1 week
2016-07-31 11:34:08 +00:00
Adrian Chadd
c2346bfa26 [wdr4300] invert the GPIO LED polarity.
This makes them behave correctly.

Submitted by:	Dan Nelson <dnelson_1901@yahoo.com>
2016-07-31 06:52:19 +00:00
Adrian Chadd
accdd12e71 [ar71xx_gpio] handle AR934x and QCA953x GPIO OE polarity.
For reasons I won't comment on, the AR934x and QCA953x GPIO_OE register
value is inverted - bit set == input, bit clear == output.

So, fix this in the output setting, in reading the initial state from
the boot loader, and also setting any gpiofunc pins that are necessary.
2016-07-31 06:51:34 +00:00
Enji Cooper
4fcae4df7e Conditionalize code which defines sysctls per _KERNEL #ifdef guard
This resolves several issues when compiling libzpool (userspace library), i.e.
-Wimplicit-function-declaration and -Wmissing-declarations issues.

MFC after:	2 weeks
Reported by:	clang
Tested with:	clang 3.8.1, gcc 4.2.1, gcc 5.3.0
Sponsored by:	EMC / Isilon Storage Division
2016-07-31 06:34:49 +00:00
Adrian Chadd
d75e8c5089 [gpioled] add support for inverting the LED polarity.
No, this isn't a star trek science joke - sometimes LEDs are wired
up to be active low, so this is needed.

Submitted by:	Dan Nelson <dnelson_1901@yahoo.com>
2016-07-31 06:24:26 +00:00
Mateusz Guzik
e0c45af904 sx: increment spin_cnt before cpu_spinwait in xlock
The change is a no-op only done for consistency with the rest of the file.
2016-07-30 22:23:31 +00:00
Mateusz Guzik
7a54be1870 rwlock: s/READER/WRITER/ in wlock lockstat annotation 2016-07-30 22:21:48 +00:00
Alexander Motin
7e58465356 Wrap previous MSIX workaround into #ifndef EARLY_AP_STARTUP.
With EARLY_AP_STARTUP we can successfully negotiate MSIX earlier.

Requested by:	jhb@
2016-07-30 21:06:59 +00:00
Bjoern A. Zeeb
6ca2d09437 Try to declare _hw_pci for all sysctl cases needed after r303497.
MFC after:	5 days
X-MFC with:	r303497
2016-07-30 20:31:12 +00:00
Imre Vadász
a3c0e7f2fb [iwm] Fix iwm_poll_bit() usage in iwm_stop_device(), fixup r303418.
* iwm_poll_bit() returns 1 on success and 0 on failure, whereas
  iwl_poll_bit() in Linux's iwlwifi returns >= 0 on success and < 0 on
  failure.

* Because of the wrong iwm_poll_bit return code check, no warning was
  printed if tx DMA stopping failed.

Approved by:	adrian (mentor)
Differential Revision:	https://reviews.freebsd.org/D7371
2016-07-30 19:03:32 +00:00
Allan Jude
aafbb33897 Improve boot loader quote parsing
parse() is the boot loader's interp_parse.c is too naive about quotes

both single and double quotes were allowed to be mixed, and single
quotes did not follow the usual semantics (re variable expansion).

The old code did not check for terminating quotes

This update implements:
 * distinguishing single and double quote
 * variable expansion will not be done inside single quote protected area
 * will preserve inner quote for values like "value 'some list'"
 * ending quote check.

this diff does not implement ending quote order check, it shouldn't
be too hard, needs some improvements on parser state machine.

PR:		204602
Submitted by:	Toomas Soome <tsoome@me.com>
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D6000
2016-07-30 17:53:37 +00:00
Allan Jude
6bf8d16009 bcache should support reads shorter than sector size
dosfs (fat file systems) can perform reads of partial sectors
bcache should support such reads.

Submitted by:	Toomas Soome <tsoome@me.com>
Reviewed by:	cem
Differential Revision:	https://reviews.freebsd.org/D6475
2016-07-30 17:45:56 +00:00
Alexander Motin
a8ec75016f Block MSIX negotiation until SMP started and IRQ reshuffled. 2016-07-30 15:56:36 +00:00
Alexander Motin
8260941c76 Make MAC address generation more random.
'ticks' approach does not work at boot time.
2016-07-30 15:51:16 +00:00
Alexander Motin
cf2e151f65 Fix infinite loops introduced at r303429. 2016-07-30 10:32:28 +00:00
Konstantin Belousov
50c22263bb Cache getbintime(9) answer in timehands, similarly to getnanotime(9)
and getmicrotime(9).

Suggested and reviewed by:	bde (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
2016-07-30 09:25:57 +00:00
Mark Johnston
57185c52de Restore an ifdef that should not have been removed in r303535.
X-MFC-With:	r303535
2016-07-30 07:05:32 +00:00
Mark Johnston
6d1ffb50fc Include fasttrap handling for DATAMODEL_ILP32 when compiling for amd64.
MFC after:	1 month
2016-07-30 03:11:53 +00:00
Baptiste Daroussin
999c1fd64b Remove usage of _WITH_DPRINTF 2016-07-30 01:16:06 +00:00
John Baldwin
8dd07eab0e Various fixes to the t4/5nex character device.
- Remove null open/close methods.
- Don't set d_flags to 0 explicitly.
- Remove t5_cdevsw as the .d_name member isn't really used and doesn't
  warrant a separate cdevsw just for the name.
- Use ENOTTY as the error value for an unknown ioctl request.
- Use make_dev_s() to close race with setting si_drv1.

Sponsored by:	Chelsio Communications
2016-07-29 22:11:29 +00:00
Mark Johnston
897d0c6617 Use vm_page_undirty() instead of manually setting a page field.
Reviewed by:	alc
MFC after:	3 days
2016-07-29 21:05:37 +00:00
Alexander Motin
718e8abba9 Fix NTBT_QP_LINKS negotiation.
I believe it never worked correctly for more the one queue even in Linux.
This fixes case when one of consumer drivers is not loaded on one side,
but its queues still announced as ready if something else brought link up.

While there, remove some pointless NULL checks.
2016-07-29 21:03:30 +00:00
Mark Johnston
adaf3c4906 sdp: Destroy the RDMA ID after destroying the connection's queue pair.
This is the ordering documented by rdma_destroy_qp(). Also add a useful
KASSERT to sdp_pcbfree().

Sponsored by:	EMC / Isilon Storage Division
2016-07-29 21:03:02 +00:00