Commit Graph

118569 Commits

Author SHA1 Message Date
Conrad Meyer
54d89ef114 intpm(4): Do not attach if io_res can not be allocated
Attempts to use the driver without an io_res result in immediate panic.

Sponsored by:	Dell EMC Isilon
2017-09-13 16:23:59 +00:00
Mark Johnston
2934eb8a22 Fix a logic error in the item size calculation for internal UMA zones.
Kegs for internal zones always keep the slab header in the slab itself.
Therefore, when determining the allocation size, we need to take the
slab header size into account.

Reported and tested by:	ae, rakuco
Reviewed by:	avg
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D12342
2017-09-13 15:44:54 +00:00
Sean Bruno
19ebd288fb Don't (try to) build lio(4) if the SOURCELESS_UCODE is set.
Submitted by:	Fabian Keil <fk@fabiankeil.de>
2017-09-13 15:17:35 +00:00
Toomas Soome
0a0c72ff93 libefi: efipart_realstrategy rsize pointer may be NULL
Need to check rsize before dereferencing it.
2017-09-13 14:27:13 +00:00
Andriy Gapon
ceb1a4fb2d jedec_ts: add many more devices from various vendors
The new IDs are taken from the hardware to which I have access
and from open datasheets.

Also, the hardware probing is moved to the device probe method.

Reviewed by:	rpokala
MFC after:	1 week
Differential Revision: https://reviews.freebsd.org/D11730
2017-09-13 13:03:29 +00:00
Ed Maste
eadaf05db0 qlnx: exclude if WITHOUT_SOURCELESS_UCODE set
PR:		222277
Submitted by:	Fabian Keil
Obtained from:	ElectroBSD
MFC after:	1 week
2017-09-13 12:16:27 +00:00
Ilya Bakulin
a9bfc8d2ae Add MMCCAM-enabled kernel config for IMX6, reduce debug noise in MMCCAM kernels
CAM_DEBUG_TRACE results in far more debug output than is needed now.
When debugging, it's always possible to turn on trace level using camcontrol.

Approved by:	imp (mentor)
Differential Revision:	https://reviews.freebsd.org/D12110
2017-09-13 10:56:02 +00:00
Andriy Gapon
86261a95ed slightly simplify zfs_vptocnp
It's not necessary to look up the parent's ID to check if the node is
the root node of the filesystem.

MFC after:	2 weeks
2017-09-13 07:09:58 +00:00
Navdeep Parhar
efeb46889f cxgbe(4): Ignore capabilities that depend on TOE when the firmware
reports TOE is not available.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-09-13 06:07:02 +00:00
Sean Bruno
68467b1206 Jenkins i386 LINT build uses NOTES to generate its LINT kernel config.
ixl(4) isn't in here either, so I'll remove lio(4) too.
2017-09-13 03:56:03 +00:00
Stephen Hurd
ea4c57fe0c Fix GCC build failure caused by r323516
No need to declare cold when we #include <sys/systm.h>

Reported by:	Jenkins
Reviewed by:	sbruno
Approved by:	sbruno (mentor)
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12347
2017-09-13 02:44:50 +00:00
Stephen Hurd
d300df0182 Roll up iflib commits from github. This pulls in most of the work done
by Matt Macy as well as other changes which he has accepted via pull
request to his github repo at https://github.com/mattmacy/networking/

This should bring -CURRENT and the github repo into close enough sync to
allow small feature branches rather than a large chain of interdependent
patches being developed out of tree.  The rest of the synchronization
should be able to be completed on github by splitting the remaining
changes that are not yet ready into short feature branches for later
review as smaller commits.

Here is a summary of changes included in this patch:

1)  More checks when INVARIANTS are enabled for earlier problem
    detection
2)  Group Task Queue cleanups
    - Fix use of duplicate shortdesc for gtaskqueue malloc type.
      Some interfaces such as memguard(9) use the short description to
      identify malloc types, so duplicates should be avoided.
3)  Allow gtaskqueues to use ithreads in addition to taskqueues
    - In some cases, this can improve performance
4)  Better logging when taskqgroup_attach*() fails to set interrupt
    affinity.
5)  Do not start gtaskqueues until they're needed
6)  Have mp_ring enqueue function enter the ABDICATED rather than BUSY
    state.  This moves the TX to the gtaskq and allows processing to
    continue faster as well as make TX batching more likely.
7)  Add an ift_txd_errata function to struct if_txrx.  This allows
    drivers to inspect/modify mbufs before transmission.
8)  Add a new IFLIB_NEED_ZERO_CSUM for drivers to indicate they need
    checksums zeroed for checksum offload to work.  This avoids modifying
    packet data in the TX path when possible.
9)  Use ithreads for iflib I/O instead of taskqueues
10) Clean up ioctl and support async ioctl functions
11) Prefetch two cache lines from each mbuf instead of one up to 128B.  We
    often need to parse packet header info beyond 64B.
12) Fix potential memory corruption due to fence post error in
    bit_nclear() usage.
13) Improved hang detection and handling
14) If the packet is smaller than MTU, disable the TSO flags.
    This avoids extra packet parsing when not needed.
15) Move TCP header parsing inside the IS_TSO?() test.
    This avoids extra packet parsing when not needed.
16) Pass chains of mbufs that are not consumed by lro to if_input()
    rather than calling if_input() for each mbuf.
17) Re-arrange packet header loads to get as much work as possible done
    before a cache stall.
18) Lock the context when calling IFDI_ATTACH_PRE()/IFDI_ATTACH_POST()/
    IFDI_DETACH();
19) Attempt to distribute RX/TX tasks across cores more sensibly,
    especially when RX and TX share an interrupt.  RX will attempt to
    take the first threads on a core, and TX will attempt to take
    successive threads.
20) Allow iflib_softirq_alloc_generic() to request affinity to the same
    cpus an interrupt has affinity with.  This allows TX queues to
    ensure they are serviced by the socket the device is on.
21) Add new iflib sysctls to net.iflib:
    - timer_int - interval at which to run per-queue timers in ticks
    - force_busdma
22) Add new per-device iflib sysctls to dev.X.Y.iflib
    - rx_budget allows tuning the batch size on the RX path
    - watchdog_events Count of watchdog events seen since load
23) Fix error where netmap_rxq_init() could get called before
    IFDI_INIT()
24) e1000: Fixed version of r323008: post-cold sleep instead of DELAY
    when waiting for firmware
    - After interrupts are enabled, convert all waits to sleeps
    - Eliminates e1000 software/firmware synchronization busy waits after
      startup
25) e1000: Remove special case for budget=1 in em_txrx.c
    - Premature optimization which may actually be incorrect with
      multi-segment packets
26) e1000: Split out TX interrupt rather than share an interrupt for
    RX and TX.
    - Allows better performance by keeping RX and TX paths separate
27) e1000: Separate igb from em code where suitable
    Much easier to understand separate functions and "if (is_igb)" than
    previous tests like "if (reg_icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC))"

#blamebruno

Reviewed by:	sbruno
Approved by:	sbruno (mentor)
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12235
2017-09-13 01:18:42 +00:00
Matt Joras
fdbf11746a Allow vlan interfaces to rx through netmap(4).
Normally after receiving a packet, a vlan(4) interface sends the packet
back through its parent interface's rx routine so that it can be
processed as an untagged frame. It does this by using the parent's
ifp->if_input. This is incompatible with netmap(4), which replaces the
vlan(4) interface's if_input with a netmap(4) hook. Fix this by using
the vlan(4) interface's ifp instead of the parent's directly.

Reported by:	Harry Schmalzbauer <freebsd@omnilan.de>
Reviewed by:	rstone
Approved by:	rstone (mentor)
MFC after:	3 days
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12191
2017-09-13 00:25:09 +00:00
Sean Bruno
be17336036 Leave the Cavium Liquid IO driver exist in files, not files.amd64
Submitted by:	imp
2017-09-12 23:58:38 +00:00
Warner Losh
d7fa1ab02d cam iosched: Limit the quanta default to hz if it's below 200
The cam_iosched_ticker() can't be scheduled more than once per tick.
Some limiters depend on quanta matching the number of calls per second
to enforce the proper limits. Limit the quanta to no faster than 1 per
clock tick. This fixes some features when running in VMs where the
default HZ is 100.

PR: 221953
Obtained from: ElectroBSD
Differential Revision: https://reviews.freebsd.org/D12337
Submitted by: Fabian Keil
2017-09-12 23:46:33 +00:00
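The quanta limit described in d7fa1ab02d above can be sketched roughly as follows. This is a minimal illustration only: `clamp_quanta` is a hypothetical name, not the function used in the CAM I/O scheduler, and the default of 200 is inferred from the commit subject.

```c
#include <assert.h>

/*
 * Hypothetical helper (not the kernel's code): the ticker cannot be
 * scheduled more than once per clock tick, so the number of quanta per
 * second is capped at hz.
 */
static int
clamp_quanta(int quanta, int hz)
{
    assert(quanta > 0 && hz > 0);
    return (quanta > hz ? hz : quanta);
}
```

With a default of 200 quanta per second, a VM running at HZ=100 would end up with 100, while a system at HZ=1000 keeps 200.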
Sean Bruno
e460f3adbb Do not try to build the Cavium Liquidio driver on all architectures.
For now, limit to amd64 only.
2017-09-12 23:42:52 +00:00
Sean Bruno
f173c2b77e The diff is the initial submission of Cavium Liquidio 2350/2360 10/25G
Intelligent NIC driver.

The submission consists of a firmware binary file and driver sources.

Submitted by:	pkanneganti@cavium.com (Prasad V Kanneganti)
Relnotes:	Yes
Sponsored by:	Cavium Networks
Differential Revision:	https://reviews.freebsd.org/D11927
2017-09-12 23:36:58 +00:00
Michael Tuexen
292efb1bc0 Export the UDP encapsulation port and the path state.
2017-09-12 21:08:50 +00:00
Alan Somers
71cd87c66c Remove spaces from CTL devices' default serial numbers
It's awkward to have spaces in CAM device serial numbers. That leads to
such things as device nodes named "/dev/diskid/MYSERIAL%20%20%201". Better
to replace the spaces with "0"s. This change only affects the default
serial numbers for users who don't provide their own.

Reviewed by:	ken, mav
MFC after:	Never
Relnotes:	Yes
Sponsored by:	Spectra Logic Corp
Differential Revision:	https://reviews.freebsd.org/D12263
2017-09-12 19:36:24 +00:00
John Baldwin
b4e9a36bf7 Handle relocations for newer non-PIC MIPS ABI.
Newer binutils supports extensions to the MIPS ABI for non-PIC code
that is used when compiling O32 binaries with clang 5 (but not used
for N64 oddly enough).  These extensions require support for
R_MIPS_COPY relocations as well as a second PLT GOT using
R_MIPS_JUMP_SLOT relocations.

For R_MIPS_COPY, use the same approach as on other architectures where
fixups are deferred to the MD do_copy_relocations.

The additional PLT GOT for jump slots is located in a .got.plt section
which is identified by a DT_MIPS_PLTGOT dynamic entry.  This GOT also
requires fixups for the first two GOT entries just as the normal GOT.
However, the entry point for this second GOT uses a different calling
convention. Rather than passing an offset into the GOT, it passes an
offset into the .rel.plt section.  This requires a second entry point
(_rtld_pltbind_start) which calls the normal _rtld_bind() rather than
_mips_rtld_bind().  This also means providing a real version of
reloc_jmpslot() which is used by _rtld_bind().

In addition, add real implementations of reloc_plt() and
reloc_jmpslots() which walk .rel.plt handling R_MIPS_JUMP_SLOT
relocations.

Reviewed by:	kib
Sponsored by:	DARPA / AFRL
Differential Revision:	https://reviews.freebsd.org/D12326
2017-09-12 17:46:30 +00:00
Toomas Soome
c7847a364a libefi: efipart_open should check the status from disk_open
In case of error from disk_open(), we should clean up properly.

Reviewed by:	allanjude, imp
Differential Revision:	https://reviews.freebsd.org/D12340
2017-09-12 14:18:45 +00:00
Toomas Soome
b40aaca6dd loader should support large_dnode
The zfsonlinux feature large_dnode is not yet supported by the loader.

Reviewed by:	avg, allanjude
Differential Revision:	https://reviews.freebsd.org/D12288
2017-09-12 13:45:04 +00:00
Michael Tuexen
e5cccc35c3 Add support to print the TCP stack being used.
Sponsored by:	Netflix, Inc.
2017-09-12 13:34:43 +00:00
Andriy Gapon
1a2ddb2997 fix a fallout from the ZTOV tightening, r323479
MFC after:	13 days
X-MFC with:	r323479
2017-09-12 13:21:14 +00:00
Olivier Houchard
69d14913fc Some devices come with the same name as TI devices, so we can't rely on the
"probe" method of those drivers to mean we're on a TI SoC. Introduce a new
function, ti_soc_is_supported(), and use it to be sure we're really a TI
system.

PR:	222250
2017-09-12 10:43:02 +00:00
Andriy Gapon
bcab65cab5 zfsctl_snapdir_lookup should be able to handle an uncovered vnode
The uncovered vnode is possible because there is no guarantee that
its hold count would go to zero (and it would be inactivated and reclaimed)
immediately after a covering filesystem is unmounted.
So, such a vnode should be expected and it is possible to re-use it
without any trouble.

MFC after:	3 weeks
Sponsored by:	Panzura
2017-09-12 06:06:58 +00:00
Andriy Gapon
c09d0da8d1 zfs_ctldir: remove obsolete / bogus ARGSUSED lint directives
None of the tagged functions had unused parameters.

MFC after:	1 week
2017-09-12 06:05:30 +00:00
Andriy Gapon
65b38f7311 zfsvfs_hold: assert that the busied filesystem can not be unmounted
This is a FreeBSD specific feature.

MFC after:	3 weeks
Sponsored by:	Panzura
2017-09-12 06:04:50 +00:00
Andriy Gapon
d092f79489 zfs_get_vfs: reference a requested filesystem instead of vfs_busy-ing it
The only consumer of zfs_get_vfs, zfs_unmount_snap, does not need
the filesystem to be busy, it just needs a reference that it can pass
to dounmount.

Also, previously the code was racy as it unbusied the filesystem
before taking a reference on it.

Now the code should be simpler and safer.

MFC after:	2 weeks
Sponsored by:	Panzura
2017-09-12 06:04:01 +00:00
Andriy Gapon
f7519dbb76 zfs: tighten debug versions of ZTOV and VTOZ
MFC after:	2 weeks
Sponsored by:	Panzura
2017-09-12 06:02:21 +00:00
Cy Schubert
54e485fd3c Improve the wording of a comment describing why EAGAIN is the error code.
MFC after:	3 days
2017-09-12 04:21:04 +00:00
Ian Lepore
813c1b27fe Add a default implementation that returns ENODEV for start, repeat_start,
stop, read, and write methods.  Some controllers don't implement these
individual operations and have only a transfer method.  In that case, we
should return an indication that the device is present but doesn't support
the method, as opposed to the kobj default error ENXIO which makes it
look like the whole device is missing.  Userland tools such as i2c(8) can
use the differing return values to switch between the two different i2c
IO mechanisms.
2017-09-11 23:47:49 +00:00
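A userland caller could use the distinction described in 813c1b27fe above roughly like this. The sketch is an assumption about how a tool might react; `choose_mechanism` and the enum are hypothetical names and are not taken from i2c(8).

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical outcome of probing the per-operation interface. */
enum i2c_mechanism {
    I2C_USE_INDIVIDUAL_OPS, /* start/stop/read/write are implemented */
    I2C_USE_TRANSFER,       /* device present, only transfer is supported */
    I2C_UNAVAILABLE         /* ENXIO or any other failure */
};

/* Classify the error returned by an attempted start operation. */
static enum i2c_mechanism
choose_mechanism(int start_error)
{
    assert(start_error >= 0); /* errno values are non-negative */
    switch (start_error) {
    case 0:
        return (I2C_USE_INDIVIDUAL_OPS);
    case ENODEV:
        /* Device exists but lacks this method: fall back to transfer. */
        return (I2C_USE_TRANSFER);
    default:
        return (I2C_UNAVAILABLE);
    }
}
```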
Conrad Meyer
d63edb4dc6 MCA: Rename AMD MISC bits/masks
They apply to all AMD MCAi_MISC0 registers, not just MCA4 (NB).

No functional change.

Sponsored by:	Dell EMC Isilon
2017-09-11 20:42:07 +00:00
Conrad Meyer
f739be66e6 x86 MCA: Extract CMCI support predicate into function
On AMD, the MCG_CAP feature bit is reserved -- not explicitly zero.  Do not
use it to determine CMCI support.

Reviewed by:	avg, markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12320
2017-09-11 20:41:25 +00:00
Marcin Wojtas
5a2f997cb5 Restore alphabetical order in UART Makefile
Commit r323359 introduced new Marvell UART controller driver
and by mistake it broke correct order in the Makefile. Fix this.

Reported by: emaste
2017-09-11 19:07:53 +00:00
Ilya Bakulin
61df30cfd4 Add MMCCAM-enabled kernel config for arm64
Approved by:	imp (mentor)
Differential Revision:	https://reviews.freebsd.org/D12114
2017-09-11 19:07:42 +00:00
Marcin Wojtas
885a74181c Expand Marvell NIC description in arm64 GENERIC config
Suggested by: emaste
2017-09-11 19:00:53 +00:00
Konstantin Belousov
809f2d8b8b Fix ioapic acpi id matching on PCI attach and rid calculation.
Sponsored by:	The FreeBSD Foundation
MFC after:	11 days
2017-09-11 18:29:09 +00:00
Conrad Meyer
e8be4e41c6 Decode new AMD SVM feature bits on family 17h
Sponsored by:	Dell EMC Isilon
2017-09-11 18:11:53 +00:00
Ed Maste
e9346a94d1 boot1: remove BOOT1_MAXSIZE default value
This Makefile relies on Makefile.fat providing the correct value for
BOOT1_MAXSIZE and BOOT1_OFFSET. Since BOOT1_OFFSET had no default value
here the build would already fail if Makefile.fat did not provide
correct values.

Sponsored by:	The FreeBSD Foundation
2017-09-11 14:33:04 +00:00
Andriy Gapon
970165f190 MFV r323111: 8569 problem with inline functions in abd.h
illumos/illumos-gate@37e84ab74e

https://www.illumos.org/issues/8569
  C [C99] has peculiar rules for inline functions that are different from the
  C++ rules.  Unlike C++ where inline is "fire and forget", in C a programmer
  must pay attention to the function's storage class / visibility.  The main
  problem is with the case where a compiler decides to not inline a call to the
  function declared as inline.
  Some relevant links:
  - http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka15831.html
  - http://www.drdobbs.com/the-new-c-inline-functions/184401540
  The summary is that either the inline functions should be declared 'static
  inline' or one of the compilation units (.c files) must provide a callable
  externally visible function definition.  In the former case, the compiler would
  automatically create a local non-inlined function instance in every compilation
  unit where it's needed.  In the latter case the single external definition is
  used to satisfy any non-inlined calls in all compilation units.  As things
  stand right now, we can get an undefined reference error under certain
  combinations of compilers and compiler options.  For example, this is what I
  get on FreeBSD when compiling with clang 4.0.0 and -O1:
    In function `abd_free': /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/abd.c:385:
    undefined reference to `abd_is_linear'

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>

MFC after:	1 week
2017-09-11 12:15:49 +00:00
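The two options summarized in 970165f190 above can be illustrated with a small C99 sketch. The names `buf_is_linear` and `twice` are made up for illustration and are not the abd.h functions; both options are shown in one file only to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct buf {
    unsigned flags;
};
#define BUF_FLAG_LINEAR 0x1u

/*
 * Option 1: 'static inline'.  Every compilation unit that includes this
 * definition gets its own local copy if the compiler declines to inline.
 */
static inline bool
buf_is_linear(const struct buf *b)
{
    assert(b != NULL);
    return ((b->flags & BUF_FLAG_LINEAR) != 0);
}

/*
 * Option 2: a plain 'inline' definition (typically in a header) ...
 */
inline int
twice(int x)
{
    return (2 * x);
}

/*
 * ... plus an 'extern inline' declaration in exactly one .c file, which
 * makes that file emit the single externally visible definition used by
 * any call that was not inlined.
 */
extern inline int twice(int x);
```

Without the `extern inline` line (or an equivalent non-inline declaration) in some compilation unit, a non-inlined call to `twice` could fail at link time with an undefined reference, which matches the failure quoted in the commit message.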
Andriy Gapon
25625d8746 Revert r322601, Mark ZFS ABD inline functions static
An alternative fix is to be merged from illumos shortly.
2017-09-11 12:08:20 +00:00
Andriy Gapon
3d9a0e564d MFV r323110: 8558 lwp_create() returns EAGAIN on system with more than 80K ZFS filesystems
illumos/illumos-gate@216d7723a1

https://www.illumos.org/issues/8558
  On a system with more than 80K ZFS filesystems, we've seen cases where
  lwp_create() will start to fail by returning EAGAIN. The problem being,
  for each of those 80K ZFS filesystems, a taskq will be created for each
  dataset as part of the ZIL for each dataset.
  For each of these taskq's, a kernel thread will be created which results
  in 24KB being allocated for each thread. With enough of these 24KB
  allocations, we eventually exhaust the memory region set aside for these
  allocations. Currently, segkpsize is set to a value of 2GB, which means
  we can only support about 80K filesystems; 2GB / 24KB = ~80K.
  The lwp_create() failure comes into play due to the fact that LWP
  creation also allocates 24KB from this same region of memory. Thus, if
  we've exhausted this region of memory due to the number of ZIL taskq's,
  there won't be any memory available to allow the call to lwp_create() to
  succeed.

FreeBSD note: I haven't created sysctl-s for the new ZIL clean
parameters.  Let's add them if anyone needs to tune them.

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Prakash Surya <prakash.surya@delphix.com>
MFC after:	3 weeks
2017-09-11 11:31:43 +00:00
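The "2GB / 24KB" estimate in 3d9a0e564d above can be checked with a few lines of C. The helper name is hypothetical, and the use of binary units (2 GiB, 24 KiB) is an assumption; the commit message does not say which units it means.

```c
#include <assert.h>
#include <stdint.h>

/* How many fixed-size allocations fit into a region (integer division). */
static uint64_t
max_allocations(uint64_t region_bytes, uint64_t per_thread_bytes)
{
    assert(per_thread_bytes != 0);
    return (region_bytes / per_thread_bytes);
}
```

Under that assumption, `max_allocations(2ULL << 30, 24ULL << 10)` evaluates to 87381, the same order of magnitude as the "~80K" figure quoted in the message.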
Marcin Wojtas
be2d15eae6 Improve HW type checking in mv_ehci driver
This patch adds hwtype parameter which keeps information about hardware
revision of Marvell EHCI controller. It allows replacing multiple
calls to ofw_bus_is_compatible with comparing hwtype value during driver
initialization.

Submitted by: Patryk Duda <pdk@semihalf.com>
Suggested by: ian
Obtained from: Semihalf
Sponsored by: Semihalf
2017-09-11 10:41:42 +00:00
Toomas Soome
4f152c5b8a r323389 breaks the kernel build when WITHOUT_ZFS is defined in src.conf
Need to add #ifdef EFI_ZFS_BOOT guard into efi/loader/main.c

PR:		222215
Reported by:	Sylvain Garrigues
2017-09-11 07:38:53 +00:00
Scott Long
3c5ac992c7 Add infrastructure for allocating multiple MSI-X interrupts. Also
add more fine-tuned controls for allocating requests and replies.

Sponsored by:	Netflix
2017-09-11 01:51:27 +00:00
Ed Maste
5e66298138 boot1 generate-fat: generate all templates at once
In advance of other changes to the fat template generation process, have
generate-fat.sh create all template files at the same time so that they
cannot get out of sync.

Also correct a longstanding bug where BOOT1_OFFSET was overwritten on
each invocation. A previous version of this patch stored a per-arch
offset (e.g. BOOT1_arm64_OFFSET) but that was deemed unnecessary.
Instead just hardcode the known offset that applies to all archs (0x2d)
and fail if the offset happens to be different.

Ongoing work (using newfs_msdos in bsdinstall and adding msdosfs support
to makefs) will eventually allow us to do away with this fat template
hack altogether, but in the near term we have a few improvements that
will build on this.

Reviewed by:	allanjude, imp, Eric McCorkle
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D10931
2017-09-11 00:37:00 +00:00
Ed Maste
3ac77a9151 newvers.sh: speed up failing git-svn revision search
In the case of running newvers.sh on a git tree w/o git-svn-id notes we
previously piped the entire 'git log' to grep. Add --grep to the log
invocation to avoid processing log entries of no interest.

This saves about 2-3 seconds of newvers.sh run time on my SSD laptop.
Later changes will bring further speedups.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2017-09-11 00:14:04 +00:00
Ed Maste
a489102080 newvers.sh: accept "git-svn-id:" at the start of a line only
This prevents incorrect subversion revision detection when "git svn" is
not being used to get the sources but git is available. Previously old
subversion revisions included in commit messages were favoured over the
more recent and correct revisions in git notes.

For example cf1f35574722 represents r315395 but was treated as r313908
which is referenced in the commit message. Commits following
r315395/cf1f35574722 but before another commit with a git-svn-id
reference in the commit message would be treated as r313908 as well.

Patch from PR updated to accommodate the initial four space indent in
`git log` output.

PR:		221848
Submitted by:	Fabian Keil
Obtained from:	ElectroBSD
MFC after:	2 weeks
2017-09-10 19:12:01 +00:00
Mateusz Guzik
1c0b34417b Move vmmeter atomic counters into dedicated cache lines
Prior to the change they were subject to extreme false sharing.
In particular this change shaves about 3 seconds of real time off -j 80 buildkernel.

Reviewed by:	alc, markj
Differential Revision:	https://reviews.freebsd.org/D12281
2017-09-10 19:00:38 +00:00
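The idea behind 1c0b34417b above, giving each hot counter its own cache line, can be sketched in C11 as follows. The 64-byte line size and all type and field names are assumptions for illustration; they are not the vmmeter definitions.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64 /* assumed line size; it is platform-specific */

/* One counter per cache line, so writers do not invalidate each other. */
struct padded_counter {
    alignas(CACHE_LINE_SIZE) uint64_t value;
};

static_assert(sizeof(struct padded_counter) == CACHE_LINE_SIZE,
    "each counter should occupy exactly one cache line");

/* Illustrative stand-in for a structure holding several hot counters. */
struct vm_counters_sketch {
    struct padded_counter free_count;
    struct padded_counter wire_count;
};

/* True if two addresses fall into the same cache line. */
static bool
share_cache_line(const void *a, const void *b)
{
    return ((uintptr_t)a / CACHE_LINE_SIZE ==
        (uintptr_t)b / CACHE_LINE_SIZE);
}
```

Without the alignment, two adjacent 8-byte counters would normally sit in the same line, and updates from different CPUs would bounce that line between caches.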
Ian Lepore
e1275c6805 Add gpio methods to read/write/configure up to 32 pins simultaneously.
Sometimes it is necessary to combine several gpio pins into an ad-hoc bus
and manipulate the pins as a group. In such cases manipulating the pins
individually is not an option, because the value on the "bus" assumes
potentially-invalid intermediate values as each pin is changed in turn. Note
that the "bus" may be something as simple as a bi-color LED where changing
colors requires changing both gpio pins at once, or something as complex as
a bitbanged multiplexed address/data bus connected to a microcontroller.

In addition to the absolute requirement of simultaneously changing the
output values of driven pins, a desirable feature of these new methods is to
provide a higher-performance mechanism for reading and writing multiple
pins, especially from userland where pin-at-a-time access incurs a noticeable
syscall time penalty.

These new interfaces are NOT intended to abstract away all the ugly details
of how gpio is implemented on any given platform. In fact, to use these
properly you absolutely must know something about how the gpio hardware is
organized. Typically there are "banks" of gpio pins controlled by registers
which group several pins together. A bank may be as small as 2 pins or as
big as "all the pins on the device, hundreds of them." In the latter case, a
driver might support this interface by allowing access to any 32 adjacent
pins within the overall collection. Or, more likely, any 32 adjacent pins
starting at any multiple of 32. Whatever the hardware restrictions may be,
you would need to understand them to use this interface.

In addition to defining the interfaces, two example implementations are
included here, for imx5/6, and allwinner. These represent the two primary
types of gpio hardware drivers. imx6 has multiple gpio devices, each
implementing a single bank of 32 pins. Allwinner implements a single large
gpio number space from 1-n pins, and the driver internally translates that
linear number space to a bank+pin scheme based on how the pins are grouped
into control registers. The allwinner implementation imposes the restriction
that the first_pin argument to the new functions must always be pin 0 of a
bank.

Differential Revision:	https://reviews.freebsd.org/D11810
2017-09-10 18:08:25 +00:00
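The bank+pin translation and the "first_pin must be pin 0 of a bank" restriction mentioned in e1275c6805 above might look roughly like this. Everything here is a hypothetical sketch: the 32-pins-per-bank layout and the function names are assumptions, not the Allwinner or imx driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PINS_PER_BANK 32u /* assumed bank width */

static_assert(PINS_PER_BANK <= 32, "one bank must fit in a 32-bit mask");

struct bank_pin {
    uint32_t bank;
    uint32_t pin;
};

/* Translate a linear pin number into a (bank, pin-within-bank) pair. */
static struct bank_pin
split_pin(uint32_t linear_pin)
{
    struct bank_pin bp = {
        linear_pin / PINS_PER_BANK,
        linear_pin % PINS_PER_BANK
    };
    return (bp);
}

/* The restriction described above: first_pin must be pin 0 of a bank. */
static bool
first_pin_is_bank_aligned(uint32_t first_pin)
{
    return (split_pin(first_pin).pin == 0);
}

/*
 * Compute a new register value in which only the pins selected by
 * change_mask take their bits from new_values; a single register write
 * of the result changes all of those pins at once.
 */
static uint32_t
write_pins_masked(uint32_t reg, uint32_t change_mask, uint32_t new_values)
{
    return ((reg & ~change_mask) | (new_values & change_mask));
}
```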
Alan Cox
d027ed2e7a To analyze the allocation of swap blocks by blist functions, add a method
for analyzing the radix tree structures and reporting on the number, and
sizes, of maximal intervals of free blocks.  The report includes the number
of maximal intervals, and also the number of them in each of several size
ranges, from small (size 1, or 3 to 4) to large (28657 to 46367) with size
boundaries defined by Fibonacci numbers.  The report is written in the test
tool with the 's' command, or in a running kernel by sysctl.

The analysis of the radix tree frequently computes the position of the lone
bit set in a u_daddr_t, a computation that also appears in leaf allocation.
That computation has been moved into a function of its own, and optimized
for cases where an inlined machine instruction can replace the usual binary
search.

Submitted by:	Doug Moore <dougm@rice.edu>
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11906
2017-09-10 17:46:03 +00:00
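Two pieces of d027ed2e7a above lend themselves to a short sketch: finding the position of the lone set bit by binary search, and sorting interval sizes into ranges bounded by Fibonacci numbers. Both functions below are illustrative assumptions with hypothetical names, not the blist code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Portable binary search for the index of the single set bit in mask.
 * Where the compiler provides one, a single bit-scan instruction could
 * replace this loop.
 */
static int
lone_bit_pos(uint64_t mask)
{
    int lo = 0, hi = 64;

    assert(mask != 0 && (mask & (mask - 1)) == 0); /* exactly one bit */
    while (lo + 1 < hi) {
        int mid = (lo + hi) / 2;

        if ((mask >> mid) != 0)
            lo = mid; /* the bit is at position mid or above */
        else
            hi = mid; /* the bit is below position mid */
    }
    return (lo);
}

/*
 * Index of the Fibonacci-bounded range containing size, using the
 * sequence 1, 2, 3, 5, 8, ...: range k covers [F(k), F(k+1) - 1].
 */
static int
fib_bucket(uint64_t size)
{
    uint64_t lo = 1, hi = 2;
    int bucket = 0;

    assert(size >= 1);
    while (size >= hi) {
        uint64_t next = lo + hi;

        lo = hi;
        hi = next;
        bucket++;
    }
    return (bucket);
}
```

With these boundaries, sizes 3 and 4 share one range and sizes 28657 through 46367 share another, matching the examples in the commit message.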
Dag-Erling Smørgrav
008a09355b If the user tries to set kern.randompid to 1 (which is meaningless), set
it to a random value between 100 and 1123, rather than 0 as before.

Submitted by:	Marie Helene Kvello-Aune <marieheleneka@gmail.com>
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D5336
2017-09-10 15:01:29 +00:00
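The range "between 100 and 1123" in 008a09355b above is consistent with adding 100 to a value reduced modulo 1024. The sketch below shows that arithmetic only; the function name, the constants' names, and passing the random bits in as a parameter are assumptions made to keep the example deterministic.

```c
#include <assert.h>

#define RANDOMPID_MIN  100u
#define RANDOMPID_SPAN 1024u

static_assert(RANDOMPID_MIN + RANDOMPID_SPAN - 1 == 1123u,
    "upper bound quoted in the commit message");

/*
 * Hypothetical handler: a requested value of 1 is replaced by a random
 * value in [100, 1123]; any other value is passed through unchanged.
 */
static unsigned
pick_randompid(unsigned requested, unsigned random_bits)
{
    if (requested == 1)
        return (RANDOMPID_MIN + random_bits % RANDOMPID_SPAN);
    return (requested);
}
```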
Toomas Soome
44832ad99d loader.efi: chain loader should provide proper device handle
Since the efipart rewrite, the chain command was looking for device
handle using interface applicable only for net devices. Disk
partitions and zfs pools need their own approach to find the proper handle.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D12287
2017-09-10 13:53:42 +00:00
Konstantin Belousov
263f801fef Fix typo, TC0->TCO.
Submitted by:	jhb
MFC after:	1 week
2017-09-10 13:21:54 +00:00
Konstantin Belousov
8bbe154276 Add definitions of (new) bits for TCO registers from the
Lewisburg/Sunrise Point documentation.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-09-10 12:10:27 +00:00
Konstantin Belousov
d703e54899 Style: tab after #define.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-09-10 11:57:02 +00:00
Mateusz Guzik
0bbae6f364 namecache: clean up struct namecache_ts handling
namecache_ts differs from mere namecache by a few fields placed mid struct.
The access to the last element (the name) is thus special-cased.

The standard solution is to put new fields at the very beginning and
embed the original struct. The pointer shuffled around points to the
embedded part. If needed, access to new fields can be gained through
__containerof.

MFC after:	1 week
2017-09-10 11:17:32 +00:00
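The layout described in 0bbae6f364 above follows a common C pattern, sketched here with stand-in types. `struct entry`, `struct entry_ts`, the flag, and the `CONTAINER_OF` macro are hypothetical; the kernel uses its own `__containerof` and the real namecache structures.

```c
#include <assert.h>
#include <stddef.h>

/* Recover a pointer to the enclosing struct from a pointer to a member. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for the original struct; the name stays the last element. */
struct entry {
    int flags;
    char name[16];
};
#define ENTRY_HAS_TS 0x1

/* Stand-in for the extended variant: new fields first, original last. */
struct entry_ts {
    long timestamp;
    struct entry base;
};

/* Code passes 'struct entry *' around; extra fields are reached on demand. */
static long
entry_timestamp(struct entry *e)
{
    assert(e != NULL && (e->flags & ENTRY_HAS_TS) != 0);
    return (CONTAINER_OF(e, struct entry_ts, base)->timestamp);
}
```

Because the embedded struct keeps its own layout, access to the name no longer needs a special case for the extended variant.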
Scott Long
a4bb51a4a2 Fix intrhook release in MPR and MPS for EARLY_AP_STARTUP.
Reported by:	Limelight
Sponsored by:	Netflix
2017-09-10 07:10:40 +00:00
Scott Long
1415db6ca2 More code refactoring in preparation for enabling multiqueue.
Sponsored by:	Netflix
2017-09-10 04:09:18 +00:00
Scott Long
2bf620cb8d Convert some in-line printing of diagnostic into tables.
Sponsored by:	Netflix
2017-09-09 22:02:36 +00:00
Warner Losh
618a000b0c It's been pointed out that init_script at least is useful w/o
re-rooting. Remove deprecation notice for it. init_chroot likely is
still better served with reroot.
2017-09-09 21:33:43 +00:00
Michael Tuexen
7b5f06fbcc Fix MTU computation. Coverity scanning usrsctp pointed to this code...
MFC after:	3 days
2017-09-09 21:03:40 +00:00
Michael Tuexen
2e8bb5ddf4 Fix a locking issue found by Coverity scanning the usrsctp library.
MFC after:	3 days
2017-09-09 20:51:54 +00:00
Michael Tuexen
f55c326691 Fix locking issues found by Coverity scanning the usrsctp library.
MFC after:	3 days
2017-09-09 20:44:56 +00:00
Warner Losh
834214a023 Don't build uart_dev_mvebu unless we're on arm64.
This module is specific to a single Marvell board that we currently
only support in 64-bit mode. Remove it from the build otherwise. It
likely should be completely removed, but this unbreaks x86 building.

Noticed by: sbruno@
2017-09-09 20:14:18 +00:00
Michael Tuexen
0c4622dab2 Silence a Coverity warning from scanning the usrsctp library.
MFC after:	3 days
2017-09-09 20:08:26 +00:00
Sean Bruno
d87eabeea9 revert r323371 in preparation for a proper fix
Submitted by:	imp
2017-09-09 20:07:04 +00:00
Michael Tuexen
6c2cfc0419 Safely remove a chunk from the control queue.
This bug was found by Coverity scanning the usrsctp library.

MFC after:	3 days
2017-09-09 19:49:50 +00:00
Sean Bruno
141bf584e4 r323359 introduced an ARMv8 only uart(4) device to the tree but placed
the driver in a place where it will be built for all targets.  x86 doesn't
have all the required build bits for this device.

Move the uart(4) device mvebu to arm64 only.
2017-09-09 19:19:13 +00:00
Scott Long
a7d065b3af Remove the unnecessary use of a temporary string buffer.
Sponsored by:	Netflix
2017-09-09 18:39:55 +00:00
Scott Long
bec09074ca Start separating the LSI drivers into per-queue structures. No
functional change.

Sponsored by:	Netflix
2017-09-09 18:03:40 +00:00
Konstantin Belousov
93c5d3a46a Add a vm_page_change_lock() helper, the common code to not relock page
lock if both old and new pages use the same underlying lock.  Convert
existing places to use the helper instead of inlining it.  Use the
optimization in vm_object_page_remove().

Suggested and reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-09-09 17:35:19 +00:00
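The helper described in 93c5d3a46a above avoids a needless unlock/lock pair when consecutive pages map to the same lock. The following is a toy model of that pattern using a fake lock type; it is an assumption about the shape of the logic, not the kernel's vm_page_change_lock().

```c
#include <assert.h>
#include <stddef.h>

/* Toy lock that only records its state and how often it was acquired. */
struct toy_lock {
    int held;
    int acquisitions;
};

static void
toy_lock_acquire(struct toy_lock *l)
{
    assert(!l->held);
    l->held = 1;
    l->acquisitions++;
}

static void
toy_lock_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = 0;
}

/*
 * Switch from the currently held lock (*held, possibly NULL) to wanted.
 * If both are the same lock, nothing is released or re-acquired.
 */
static void
change_lock(struct toy_lock *wanted, struct toy_lock **held)
{
    if (*held == wanted)
        return;
    if (*held != NULL)
        toy_lock_release(*held);
    *held = wanted;
    toy_lock_acquire(wanted);
}
```

A loop over many pages would call `change_lock` once per page and release `*held` once at the end.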
Warner Losh
e6d697124f Mark init_chroot and init_script variables as deprecated.
2017-09-09 16:04:49 +00:00
Hans Petter Selasky
6263f8b78d Only search the scope ID in ip6_find_dev() for IPv6 addresses which
have a scope ID. Change size of the searched scope ID to the full
16-bits. There can typically be more than 255 interfaces.

Suggested by:		ae @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-09-09 12:50:12 +00:00
Marcin Wojtas
7ca8a2b385 Enable compilation of Marvell NETA controller with arm64 GENERIC
This patch enables network operation on Marvell Armada 3700 SoC.

Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12259
2017-09-09 11:56:48 +00:00
Marcin Wojtas
e314ac07f4 Add support for Armada 3700 in the NETA driver
This patch enables using NETA driver on Marvell Armada 3700 SoC
by introducing new compatible string, modifying clock source
obtaining and also excluding unnecessary parts.
The driver is added as a build option for arm64 platforms as well.

Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12258
2017-09-09 11:54:04 +00:00
Marcin Wojtas
e7843f1dd6 Store virtual address of buffer in mvneta_rx_ring
Now the virtual address of the received buffer is taken from a software ring.
Thanks to this, we can use the NETA driver on 64-bit architectures and
avoid the 32-bit limitation of the buf_cookie descriptor field.

Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12257
2017-09-09 11:49:36 +00:00
Marcin Wojtas
e49e3ec31e Add support for uart_mvebu driver arm64 GENERIC config
This patch enables console output on Armada 3700 SoCs
with kernel GENERIC.

Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12251
2017-09-09 11:46:34 +00:00
Marcin Wojtas
ac0770ddb3 Introduce UART driver module for Armada 3700
This patch adds support for UART in Armada 3700 family.
It exposes both low-level UART interface, as well as
standard driver methods.

Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12250
2017-09-09 11:42:32 +00:00
Marcin Wojtas
840d633f12 Enable compilation of Marvell EHCI driver in arm64 GENERIC
Enabled driver can be used on boards equipped with Marvell Armada 3700 SoC.

Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12256
2017-09-09 11:16:10 +00:00
Marcin Wojtas
6c2c61060d Add support for Armada 3700 EHCI
This patch reuses the ehci_mv driver by adding support for the new
compatible string and adding ehci_mv.c to the list of available options
for arm64 platforms.

Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12255
2017-09-09 11:06:58 +00:00
Marcin Wojtas
e75791056b Add support for AHCI in Armada 3700
This patch simply reuses the generic AHCI driver by extending its compatible list.

Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12254
2017-09-09 11:01:44 +00:00
Andriy Gapon
90354c3200 MFV r323107: 8414 Implemented zpool scrub pause/resume
illumos/illumos-gate@1702cce751

FreeBSD note:  rather than merging the zpool.8 update I copied the zpool
scrub section from the illumos zpool.1m to FreeBSD zpool.8 almost
verbatim.  Now that the illumos page uses the mdoc format, it was an
easier option.  Perhaps the change is not in perfect compliance with the
FreeBSD style, but I think that it is acceptable.

https://www.illumos.org/issues/8414
  This issue tracks the port of scrub pause from ZoL: https://github.com/zfsonlinux/zfs/pull/6167
  Currently, there is no way to pause a scrub. Pausing may be useful when
  the pool is busy with other I/O to preserve bandwidth.

  Description

  This patch adds the ability to pause and resume scrubbing.  This is achieved
  by maintaining a persistent on-disk scrub state.  While the state is 'paused'
  we do not scrub any more blocks.  We do however perform regular scan
  housekeeping such as freeing async destroyed and deadlist blocks while paused.

  Motivation and Context

  Scrubbing can be an I/O intensive operation and people have been asking
  for the ability to pause a scrub for a while. This allows one to preserve scrub
  progress while freeing up bandwidth for other I/O.

Reviewed by: George Melikov <mail@gmelikov.ru>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Alek Pinchuk <apinchuk@datto.com>

MFC after:	2 weeks
2017-09-09 11:00:07 +00:00
Marcin Wojtas
705f4b2ceb Enable compilation of Marvell XHCI driver in arm64 GENERIC
Enabled driver can be used on boards equipped with Marvell Armada
3700/7k/8k SoCs.

Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12253
2017-09-09 10:58:45 +00:00
Marcin Wojtas
0fd9794286 Add support for xhci in Armada 3700 and 7k/8k
This driver will be used by Marvell Armada 3700 and 7k/8k SoC families.
The same generic xhci device also appears in Armada 380, so we are reusing
the driver.

This patch also adds xhci_mv.c entry to the arm64 files list.

Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12252
2017-09-09 10:54:13 +00:00
Hans Petter Selasky
f4cf3177a2 Resolve IPv6 scope ID issues when using ip6_find_dev() in the LinuxKPI.
Work around the problem that ifa_ifwithaddr() also matches the scope ID of
the IPv6 address when searching for a matching IPv6 address. For now
simply try all valid scope IDs until a match is found.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-09-09 07:21:27 +00:00
Hans Petter Selasky
65c5a7a879 Remove unsafe access to the LinuxKPI file structure from ibcore.
selwakeup() is now done by the wake_up() family of functions.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-09-09 06:34:20 +00:00
Hans Petter Selasky
6dec7efa83 Properly implement poll_wait() in the LinuxKPI. This prevents direct
use of the linux_poll_wakeup() function from unsafe contexts, which
can lead to use-after-free issues.

Instead of calling linux_poll_wakeup() directly use the wake_up()
family of functions in the LinuxKPI to do this.

Bump the FreeBSD version to force recompilation of external kernel modules.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-09-09 06:29:29 +00:00
Hans Petter Selasky
5b1cfc99cf Add more sanity checks to linux_fget() in the LinuxKPI. This prevents
returning pointers to file descriptors which were not created by the
LinuxKPI.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-09-09 06:04:05 +00:00
Mateusz Guzik
4dabeda46c Fix riscv and powerpc compilation after r323329.
On these archs bzero is a C function, which triggers a compilation error
as the compiler tries to expand the macro.
2017-09-09 05:56:04 +00:00
Navdeep Parhar
8d6ae10af6 cxgbe(4): Fix a couple of problems in the sge_wrq data path.
- start_wrq_wr must not drain the wr_list if there are incomplete_wrs
  pending.  This can happen when a t4_wrq_tx runs between two
  start_wrq_wr.

- commit_wrq_wr must examine the cookie's pidx and ndesc with the
  queue's lock held.  Otherwise there is a bad race when incomplete WRs
  are being completed and commit_wrq_wr for the WR that is ahead in the
  queue updates the next incomplete WR's cookie's pidx/ndesc but the
  commit_wrq_wr for the second one is using stale values that it read
  without the lock.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-09-09 05:12:14 +00:00
Justin Hibbits
c5fea8adf0 Add P5021 and P5040 conditions for LAW count check.
P5040/P5021 have the same number of LAWs as P5020.  There may be a better way of
getting the count from the FDT (fsl,num-laws property on soc/corenet-law or
soc/ecm-law), but that's not supported everywhere, so we still need this check
for those other cases.
2017-09-09 02:19:44 +00:00
Justin Hibbits
dc72081153 Add some more PVR and SVR defines
These processors may not be supported yet, but add them for completeness.

POWER9 is planned for support.  e300 may work (based on 603e core).
P5040/P5021 are similar to P5020, so should work as well.  One addition is
needed for P5040, to support the number of LAWs, and will be a separate commit.
2017-09-09 02:08:22 +00:00
Conrad Meyer
ea5eee641e Fix information leak in geli(8) integrity mode
In integrity mode, a larger logical sector (e.g., 4096 bytes) spans several
physical sectors (e.g., 512 bytes) on the backing device.  Due to hash
overhead, a 4096 byte logical sector takes 8.5625 512-byte physical sectors.
This means that only 288 bytes (256 data + 32 hash) of the last 512 byte
sector are used.

The memory allocation used to store the encrypted data to be written to the
physical sectors comes from malloc(9) and does not use M_ZERO.

Previously, nothing initialized the final physical sector backing each
logical sector, aside from the hash + encrypted data portion.  So 224 bytes
of kernel heap memory was leaked to every block :-(.

This patch addresses the issue by initializing the trailing portion of the
physical sector in every logical sector to zeros before use.  A much simpler
but higher overhead fix would be to tag the entire allocation M_ZERO.

PR:		222077
Reported by:	Maxim Khitrov <max AT mxcrypt.com>
Reviewed by:	emaste
Security:	yes
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12272
2017-09-09 01:41:01 +00:00
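The sector arithmetic quoted in the geli(8) entry above (8.5625 physical sectors, 288 bytes used, 224 bytes left over) can be reproduced with a short sketch. It assumes, as the quoted numbers imply, that every 512-byte physical sector carries a 32-byte hash followed by data; the struct and function names are hypothetical and are not taken from the geli sources.

```c
#include <stddef.h>

/* Assumed: each physical sector stores a 32-byte hash plus data. */
#define HASH_BYTES 32u

struct integrity_layout {
	size_t full_sectors;  /* physical sectors that are completely used */
	size_t last_used;     /* bytes used in the final, partial sector */
	size_t last_unused;   /* trailing bytes that have to be zeroed */
};

static struct integrity_layout
compute_integrity_layout(size_t logical_size, size_t physical_size)
{
	struct integrity_layout l;
	size_t data_per_sector = physical_size - HASH_BYTES; /* 480 of 512 */
	size_t rem = logical_size % data_per_sector;         /* 256 for 4096 */

	l.full_sectors = logical_size / data_per_sector;     /* 8 for 4096 */
	l.last_used = rem != 0 ? rem + HASH_BYTES : 0;       /* 256 + 32 */
	l.last_unused = rem != 0 ? physical_size - l.last_used : 0;
	return (l);
}
```

For a 4096-byte logical sector on 512-byte physical sectors this gives 8 full sectors plus 288 used bytes in the ninth (288 / 512 = 0.5625), leaving the 224 uninitialized bytes the commit describes.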
Scott Long
3d96cd7873 Refactor interrupt allocation and deallocation. Add some extra
diagnostics.  No other functional changes.

Sponsored by:	Netflix
2017-09-08 20:20:35 +00:00
Mateusz Guzik
2e2baf2ec3 Allow __builtin_memset instead of bzero for small buffers of known size
In particular this eliminates function calls and related register save/restore
when only a few writes would suffice.

Example speed up can be seen in a fstat microbenchmark on AMD Ryzen cpus, where
the throughput went up by ~4.5%.

Thanks to cem@ for benchmarking and reviewing the patch.

MFC after:	1 week
2017-09-08 20:09:14 +00:00
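The technique named in the __builtin_memset entry above can be sketched as a macro that picks the compiler builtin only when the length is a small compile-time constant. This is a userland illustration for GCC or Clang, not the committed kernel macro: the name zero_buf and the 64-byte threshold are assumptions.

```c
#include <string.h>
#include <strings.h>

/* Assumed cut-off; the commit only says "small buffers of known size". */
#define SMALL_ZERO_MAX 64

/*
 * Zero a buffer.  When len is a compile-time constant no larger than
 * SMALL_ZERO_MAX the compiler can expand __builtin_memset() into a few
 * plain stores; otherwise fall back to the bzero() function call.
 * Note that len is evaluated more than once.
 */
#define zero_buf(buf, len) do {						\
	if (__builtin_constant_p(len) && (len) <= SMALL_ZERO_MAX)	\
		__builtin_memset((buf), 0, (len));			\
	else								\
		bzero((buf), (len));					\
} while (0)
```

Calling zero_buf(&st, sizeof(st)) on a small struct takes the builtin path, while a length only known at run time still goes through bzero().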
Konstantin Belousov
3c700e2e4c Enhance qpi.c to make it usable on all Core-microarchitecture Xeons.
Scan all buses for CSR bus, not stopping on the first failed
match. Scan all slots for function 0 on the found bus, for instance on
IvyBridge the slot 0 is not decoded at all. Since the scan is quite
unsafe, and access to the buses is mostly useful for developers,
enable the csr buses scan with the tunable.

Current qpi.c makes too many assumptions about the uncore
configuration buses location and about slots occupied.  Also it
restricts itself only to Nehalem CPUs.  It is needed on all Core-based
Xeons.  On the 2600 v2 (IvyBridge) machine I have access to, the CSR
buses have numbers 31 (BSP socket) and 63 (second socket), and there
are no functions pci0.31.0.0 or pci0.63.0.0.  According to the CPU
datasheet, all devices on the uncore bus occupy slots >= 8.

Practically, the attach to config buses is required for the intel-pcm
pcm-memory.x tool to work, for instance.

Reviewed by:	jhb (previous version)
Sponsored by:	Mellanox Technologies
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D12268
2017-09-08 19:51:03 +00:00
Konstantin Belousov
fd15fee1ed Use IOAPIC PCI rid as the interrupt TLP source id for DMAR interrupt
remapping.

VT-d specification requires use of PCI rid as source id for IOAPICs
enumerated by PCI bus.  The values from the DMAR ACPI table should be
only used when IOAPIC is not on PCI.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
Hardware provided by:	Intel
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D12205
2017-09-08 19:45:37 +00:00
Konstantin Belousov
3fd0053a50 Add an ioapic_get_rid() function to obtain PCIe TLP requester-id for
the interrupt messages from given IOAPIC, if the IOAPIC can be
enumerated on PCI bus.

If IOAPIC has PCI binding, match the PCI device against MADT
enumerated IOAPIC.  Match is done first by registers window physical
address, then by IOAPIC ID as read from the APIC ID register.

PCI bsf address of the matched PCI device is the rid.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
Hardware provided by:	Intel
MFC after:	2 weeks
X-Differential revision:	https://reviews.freebsd.org/D12205
2017-09-08 19:39:20 +00:00
Konstantin Belousov
1a92c8402d Add a constant specifying the min size of the IOAPIC registers window.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-09-08 19:25:11 +00:00
Maxim Sobolev
c24e7f3fd9 Correct bintime32 declaration: uint32_t sec -> time32_t sec.
Submitted by:	jhb
MFC after:	1 month
2017-09-08 18:32:13 +00:00
Stephen Hurd
47516844a3 Added support for displaying HW port stats using sysctl.
This provides port stats (updated once per second) in
dev.bnxt.X.port_stats for PFs.  VFs do not have access to the port stats.

Submitted by:	Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Reviewed by:	shurd, sbruno
Approved by:	sbruno (mentor)
Sponsored by:	Broadcom Limited
Differential Revision:	https://reviews.freebsd.org/D11914
2017-09-08 18:03:34 +00:00
Scott Long
3e0ca40a54 Fix intrhook release in MFI as well 2017-09-08 17:51:19 +00:00
Scott Long
55550830cf As with r323317, hold off on releasing the intrhook during boot until
we're ready to accept probing from GEOM.  Untested, but the pattern is
the same as with aac.
2017-09-08 17:40:29 +00:00
Scott Long
cc336c7805 Move the intrhook release to later in the function so that GEOM knows to wait longer
for possible root devices to come online.  This fixes a race that seems to be
triggered by EARLY_AP_STARTUP.

Submitted by:	cgull@glup.org
2017-09-08 16:52:59 +00:00
Konstantin Belousov
dc63dc00cb Fix malloc() uses in em_get_regs().
Do not use malloc(M_NOWAIT), wait is possible there, and the malloc
failures were not checked.  Do not forget to free malloced memory.

Reported and tested by:	pho
Approved by:	sbruno
Sponsored by:	The FreeBSD Foundation
2017-09-08 14:54:07 +00:00
Konstantin Belousov
6ff9ce94ce Consistently use tabs for indent.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-09-08 10:39:28 +00:00
Andrew Turner
c1649fdb02 Not all CPUs handle reading ID_AA64MMFR2_EL1 (e.g. qemu), disable it for now.
Sponsored by:	DARPA, AFRL
2017-09-08 08:02:06 +00:00
Mateusz Guzik
dad74ce924 namecache: fold the unlock label into the only consumer
No functional changes.

MFC after:	1 week
2017-09-08 06:57:11 +00:00
Mateusz Guzik
da8f32a7f1 namecache: factor out dot lookup into a dedicated function
The intent is to move uncommon cases out of the way.

MFC after:	1 week
2017-09-08 06:51:33 +00:00
Mateusz Guzik
6a569d3525 Annotate Giant with __exclusive_cache_line 2017-09-08 06:46:24 +00:00
Mateusz Guzik
3e72c8449b Annotate global process locks with __exclusive_cache_line
MFC after:	1 week
2017-09-08 06:46:02 +00:00
Conrad Meyer
01a20b9875 mca: Fix printf types from r323289 on i386
Reported by:	Michael Butler <imb AT protected-networks.net>
Sponsored by:	Dell EMC Isilon
2017-09-08 01:06:35 +00:00
Mark Johnston
f93f7cf199 Speed up vm_page_array initialization.
We currently initialize the vm_page array in three passes: one to zero
the array, one to initialize the "order" field of each page (necessary
when inserting them into the vm_phys buddy allocator one-by-one), and
one to initialize the remaining non-zero fields and individually insert
each page into the allocator.

Merge the three passes into one following a suggestion from alc:
initialize vm_page fields in a single pass, and use vm_phys_free_contig()
to efficiently insert physical memory segments into the buddy allocator.
This reduces the initialization time to a third or a quarter of what it
was before on most systems that I tested.

Reviewed by:	alc, kib
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D12248
2017-09-07 21:43:39 +00:00
Conrad Meyer
092c0e867a x86 MCA: Helpfully, print why ECC thresholding is not enabled on AMD
Sponsored by:	Dell EMC Isilon
2017-09-07 21:33:27 +00:00
Conrad Meyer
d848ecfb7e x86 MCA: Enable AMD thresholding support on 17h
17h supports MCA thresholding in the same way as 16h and earlier.
Supposedly a ScalableMca feature bit in CPUID 8000_0007:EBX must be set, but
that was not true for earlier models, so be careful about relying on it.

While here, document a missing bit in LS MCA MISC0.

Reviewed by:	truckman
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12237
2017-09-07 21:31:07 +00:00
Conrad Meyer
cd8c258198 Store AMD RAS Capabilities cpuid value and name flags
Reviewed by:	truckman
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12237
2017-09-07 21:29:51 +00:00
Conrad Meyer
2e81566368 cpufreq(4) hwpstate: Yield CPU awaiting frequency change
It doesn't seem necessary to busy the CPU while waiting to transition
into a different p-state.

PR:		221621 (related, but does not completely address)
Reviewed by:	truckman
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12260
2017-09-07 20:20:12 +00:00
Andrew Turner
f9fc9faa3a Fix the SVE ID field shift.
Sponsored by:	DARPA, AFRL
2017-09-07 19:52:04 +00:00
Andrew Turner
130be885e6 Add the ATS1E1 case to the ID_AA64MMFR1_EL1 decoding.
Sponsored by:	DARPA, AFRL
2017-09-07 19:51:17 +00:00
Mark Johnston
97310754e5 Fix indentation.
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2017-09-07 19:15:31 +00:00
Andrew Turner
d28b950a4d Use the correct mask when printing undecoded fields from the
ID_AA64MMFR2_EL1 register.

Sponsored by:	DARPA, AFRL
2017-09-07 18:58:55 +00:00
Andrew Turner
d1c2e46d9c Fix a mismerge, make sure PRINT_ID_AA64_MMFR2 has a unique value.
Sponsored by:	DARPA, AFRL
2017-09-07 16:43:12 +00:00
Andrew Turner
1a2e5c004d Fix the value of ID_AA64ISAR1_DPB_SHIFT, the field is bits 3:0.
Sponsored by:	DARPA, AFRL
2017-09-07 16:12:56 +00:00
Warner Losh
84ed53b2c0 Be consistent and do return (1);
Noticed by: tsoome@
Sponsored by: Netflix
2017-09-07 15:46:44 +00:00
Andrew Turner
f45dc6945b Add the ARMv8.2 ID register additions and use them to decode the register
values. As not all assemblers understand the new ID_AA64MMFR2_EL1 register
add a macro to access it. This seems to be safe for older CPUs to read this
new register, with them returning zero.

Sponsored by:	DARPA, AFRL
2017-09-07 15:45:56 +00:00
Andrew Turner
0f962c6deb Uppercase the special register names in identcpu to be more consistent with
the other source files.

Sponsored by:	DARPA, AFRL
2017-09-07 15:30:13 +00:00
Andrew Turner
a0f16159bd Make the bit mask of ARMv8 ID registers to print sparse to keep values
close, but without having to change all values when new registers are added.

Sponsored by:	DARPA, AFRL
2017-09-07 15:24:47 +00:00
Andrew Turner
5ad42f79fb Add more ARM Ltd parts to the list of known CPUs.
Submitted by:	Jon Brawn <jon@brawn.org>
2017-09-07 15:02:57 +00:00
Warner Losh
1a822dfd33 Fix armv6 build
We need to extend the -Wno-format hack to yet another Makefile to cope
with %S meaning (CHAR16 *) not (wchar_t *) in the context of the EFI
boot loaders.

Sponsored by: Netflix
2017-09-07 07:30:24 +00:00
Warner Losh
1d21184075 ucs2len
Rename boot1's wcslen to ucs2len, which we can't use in userland
because wchar in userland is unsigned, not short. Move it into
efichar.c. Also spell '* 2' as '* sizeof(efi_char)' and add 1 for the
trailing NUL to transition the FreeBSD boot env vars to being NUL
terminated on the same line...

Sponsored by: Netflix
2017-09-07 07:30:05 +00:00
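A minimal sketch of the length helper described in the ucs2len entry above, including the "(length + 1) * sizeof(efi_char)" size calculation the message mentions. The efi_char typedef here is an assumption (a 16-bit code unit), and ucs2_storage_size() is a hypothetical name used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t efi_char;	/* assumed: one 16-bit UCS-2 code unit */

/* Number of code units before the terminating NUL. */
static size_t
ucs2len(const efi_char *str)
{
	size_t n = 0;

	while (str[n] != 0)
		n++;
	return (n);
}

/* Bytes needed to store the string together with its trailing NUL. */
static size_t
ucs2_storage_size(const efi_char *str)
{
	return ((ucs2len(str) + 1) * sizeof(efi_char));
}
```

Spelling the element size as sizeof(efi_char) rather than a literal 2 keeps the byte count correct if the typedef ever changes.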
Conrad Meyer
b1631dfb46 cam(4): Fix some warnings
When bcopy is treated as memcpy/memmove, Clang produces warnings that the
size argument doesn't match the type of the source.  This is true, it
doesn't match; we're aliasing the source.

Explicitly cast the source pointer to the expected type to remove the
warning.

No functional change.

Sponsored by:	Dell EMC Isilon
2017-09-07 07:24:22 +00:00
Maxim Sobolev
afbd12c110 In the recvmsg32() system call iterate over returned structure(s)
and convert any messages of types SCM_BINTIME, SCM_TIMESTAMP,
SCM_REALTIME and SCM_MONOTONIC from 64-bit to its 32-bit
representation. Otherwise we either run out of user-supplied
buffer to copy those out resulting in the MSG_CTRUNC or simply
return values that the userland 32-bit code is not going
to parse correctly. This fixes at least two regression tests
failing to function properly in 32-bit compat mode:

    tools/regression/sockets/udp_pingpong
    tools/regression/sockets/unix_cmsg

PR:             kern/222039
MFC after:	30 days
2017-09-07 04:29:57 +00:00
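The kind of conversion the recvmsg32() entry above describes can be illustrated for a timestamp payload. The two struct definitions are simplified assumptions, not the kernel's compat types; the point is only that the 64-bit form is twice as large as what a 32-bit process expects, which is why copying it out unconverted overruns or confuses the caller's buffer.

```c
#include <stdint.h>

/* Assumed layouts for illustration only. */
struct timeval64 {
	int64_t tv_sec;
	int64_t tv_usec;
};

struct timeval32 {
	int32_t tv_sec;
	int32_t tv_usec;
};

/* Narrow a native timestamp to the layout a 32-bit process parses. */
static struct timeval32
timeval_to_32(const struct timeval64 *tv)
{
	struct timeval32 out;

	out.tv_sec = (int32_t)tv->tv_sec;	/* narrows to 32 bits */
	out.tv_usec = (int32_t)tv->tv_usec;
	return (out);
}
```

A compat layer would apply such a conversion to each matching control message and adjust the message length to the smaller payload before copying it out.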
Landon J. Fuller
f526d86d75 bhnd: Remove unsupported USB core IDs from the bhnd_usb device table.
This resolves a SoC reset triggered by attempting to attach to the
BCM5365's USB 1.1 controller.

Approved by:	adrian (mentor, implicit)
2017-09-06 23:43:20 +00:00
Mateusz Guzik
1539873a32 vxge: plug void casts from memcpy/memzero calls
Most of places using them did not have the cast in the first place.

No functional changes.

MFC after:	1 week
2017-09-06 21:38:07 +00:00
Mateusz Guzik
574adb65c8 Sprinkle __read_frequently on a few obvious places.
Note that some of annotated variables should probably change their types
to something smaller, preferably bit-sized.
2017-09-06 20:33:33 +00:00
Mateusz Guzik
cf558f10a7 Introduce __read_frequently
While __read_mostly groups variables together, their placement is not
specified. In particular 2 frequently used variables can end up in
different lines.

This annotation is only expected to be used for variables read all the time,
e.g. on each syscall entry.

MFC after:	1 week
2017-09-06 20:32:49 +00:00
Mateusz Guzik
fe933c1d88 Start annotating global _padalign locks with __exclusive_cache_line
While these locks are guaranteed to not share their respective cache lines,
their current placement leaves unnecessary holes in lines which preceded them.

For instance the annotation of vm_page_queue_free_mtx allows 2 neighbour
cachelines (previously separated by the lock) to be collapsed into 1.

The annotation is only effective on architectures which have it implemented in
their linker script (currently only amd64). Thus locks are not converted to
their not-padaligned variants as to not affect the rest.

MFC after:	1 week
2017-09-06 20:28:18 +00:00
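The __read_frequently and __exclusive_cache_line entries above both rely on placing variables into dedicated linker sections. A rough userland sketch of that mechanism with GCC/Clang attributes on an ELF target follows; the macro names, section names, and the 64-byte line size are assumptions for illustration, and without a matching linker script the sections are only grouped, not positioned.

```c
#include <stdint.h>

#define CACHE_LINE 64	/* assumed cache line size */

/* Hypothetical equivalents of the kernel annotations. */
#define my_read_frequently \
	__attribute__((section(".data.read_frequently")))
#define my_exclusive_cache_line \
	__attribute__((aligned(CACHE_LINE), \
	    section(".data.exclusive_cache_line")))

/* Read on every hot path, rarely written: group with similar data. */
static int hot_flag my_read_frequently = 1;

/* Heavily contended object: starts on its own cache line boundary. */
static long big_lock my_exclusive_cache_line;
```

Collecting all line-aligned objects in one section avoids the padding holes that appear when an aligned variable is dropped between ordinary ones.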
Stephen Hurd
8705464353 bnxt: Update firmware header file with the latest one
hsi_struct_def.h file contains all firmware (HWRM) data structs, updated
that with the latest one which was released on 30th Aug.

After this upgrade, HWRM version will be 1.8.1.5 (earlier it was 1.4.0).

Submitted by:	Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Reviewed by:	shurd, sbruno
Approved by:	sbruno (mentor)
Sponsored by:	Broadcom Limited
Differential Revision:	https://reviews.freebsd.org/D12203
2017-09-06 20:19:30 +00:00
Stephen Hurd
96b2e63f10 bnxt: Use correct firmware call for number of queues supported
1) Based on the suggestion from firmware team, derive
   scctx->isc_ntxqsets_max & scctx->isc_nrxqsets_max based on FUNC_QCFG
   (instead of FUNC_QCAPS).
2) Bump-up driver version to "1.0.0.2".

Submitted by:	Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Reviewed by:	shurd, sbruno
Approved by:	sbruno (mentor)
Sponsored by:	Broadcom Limited
Differential Revision:	https://reviews.freebsd.org/D12128
2017-09-06 20:14:34 +00:00
Konstantin Belousov
b99b705d9c Skylake server core PMC support for hwpmc(4).
Reviewed by:	emaste
Sponsored by:	The FreeBSD Foundation
Hardware provided by:	Intel
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D12221
2017-09-06 17:19:48 +00:00
Konstantin Belousov
85d88d8799 Do not leak empty swblk.
In swp_pager_meta_build(), if the requested operation results in
freeing the last swap pointer in the swblk, free the trie node.  Other
swap pager code does not expect to find completely empty swblk.

Reviewed by:	alc, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-09-06 16:18:53 +00:00
Konstantin Belousov
eed99cb81b In swp_pager_meta_build(), handle a race with other thread allocating
swapblk for our index while we dropped the object lock.

Noted by:	jeff
Reviewed by:	alc, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-09-06 16:16:11 +00:00
Navdeep Parhar
aa0186bc36 Make LACP based lagg work with interfaces (like 100Gbps and 25Gbps) that
report extended media types.

lacp_aggregator_bandwidth() uses the media to determine the speed of the
interface and returns 0 for IFM_OTHER without the bits in the extended
range.

Reported by:	kbowling@
Reviewed by:	eugen_grosbein.net, mjoras@
MFC after:	1 week
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D12188
2017-09-06 14:36:35 +00:00
Hans Petter Selasky
5b4b1bb3a1 Add new USB quirk.
PR:			221775
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-09-06 13:59:57 +00:00
Hans Petter Selasky
95ed5015ec Add support for generic backpressure indicator for ratelimited
transmit queues as well as non-ratelimited ones.

Add the required structure bits in order to support a backpressure
indication with ratelimited connections as well as non-ratelimited
ones. The backpressure indicator is a value between zero and 65535
inclusive, indicating if the destination transmit queue is empty or
full respectively. Applications can use this value as a decision point
for when to stop transmitting data to avoid endless ENOBUFS error
codes upon transmitting an mbuf. This indicator is also useful to
reduce the latency for ratelimited queues.

Reviewed by:		gallatin, kib, gnn
Differential Revision:	https://reviews.freebsd.org/D11518
Sponsored by:		Mellanox Technologies
2017-09-06 13:56:18 +00:00
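One way a driver could derive the 0..65535 backpressure value described in the entry above is a linear scale of queue occupancy. The commit does not say how the value is computed, so the function below is purely a hypothetical sketch; the name and the linear mapping are assumptions.

```c
#include <stdint.h>

/*
 * Map queue occupancy to a backpressure value: 0 means the transmit
 * queue is empty, 65535 means it is full.  A zero-sized queue is
 * reported as full so callers back off.
 */
static uint16_t
queue_backpressure(uint32_t used, uint32_t capacity)
{
	if (capacity == 0 || used >= capacity)
		return (65535);
	return ((uint16_t)(((uint64_t)used * 65535u) / capacity));
}
```

An application could then stop queueing data once the value crosses a threshold of its choosing instead of retrying on ENOBUFS.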
Konstantin Belousov
fd9bc183bb Fix typos. Stop claiming that two children are created.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-09-06 11:47:59 +00:00
Scott Long
6eea4f463d Checkpoint the next phase in debug message cleanup, this time focusing on
error recovery messages.

Sponsored by:	Netflix
2017-09-06 09:19:54 +00:00
Kurt Lidl
a8273e4371 Enable dtrace support for mips64 and the ERL kernel config
Turn on the required options in the ERL config file, and ensure
that the fbt module is listed as a dependency for mips in
the modules/dtrace/dtraceall/dtraceall.c file.

PR: 		220346
Reviewed by:	gnn, markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D12227
2017-09-06 03:19:52 +00:00
Conrad Meyer
1b93938ae4 amdsmn(4): Do not probe not matching hostbridges
Similar to r323195, but for amdsmn(4) driver (which borrowed some design).

Ignore hostbs that do not match our PCI device id criteria.

Sponsored by:	Dell EMC Isilon
2017-09-05 21:00:33 +00:00
Conrad Meyer
40f7bccb5d amdtemp(4): Do not probe not matching hostbridges
Some systems have hostbs that do not match our PCI device id criteria.
Detect and ignore these devices in probe.

PR:		218264
Sponsored by:	Dell EMC Isilon
2017-09-05 20:35:25 +00:00
Alan Somers
0df3d85af4 Fix remounting ZFS filesystem with "zfs mount"
"zfs mount -o" passes a list of mount options directly to nmount(2) after
sanity checking them. In particular, zfs(8) will refuse to mount an already
existing file system unless "remount" is specified in the option list.
However, the "remount" option only exists in Illumos. FreeBSD's equivalent is
"update".

PR:		221985
Reviewed by:	avg
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
Differential Revision:	https://reviews.freebsd.org/D12233
2017-09-05 19:40:04 +00:00
Conrad Meyer
a03d621bfa amdtemp(4): Add support for Family 17h temperature sensor
The sensor value is formatted similarly to previous models (same
bitfield sizes, same units), but must be read off of the internal
System Management Network (SMN) from the System Management Unit (SMU)
co-processor.

PR:		218264
Reported and tested by:	Nils Beyer <nbe AT renzel.net>
Reviewed by:	avg (no +1), mjoras, truckman
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12217
2017-09-05 15:19:14 +00:00
Conrad Meyer
907f50fe04 Add smn(4) driver for AMD System Management Network
AMD Family 17h CPUs have an internal network used to communicate between
the host CPU and the PSP and SMU coprocessors.  It exposes a simple
32-bit register space.

Reviewed by:	avg (no +1), mjoras, truckman
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12217
2017-09-05 15:13:41 +00:00
Edward Tomasz Napierala
b0618cda03 Make root_mount_rel(9) ignore NULL arguments, like it used to before r313351.
It would be better to fix API consumers to not pass NULL there - most of them,
such as gmirror, already contain the necessary checks - but this is easier
and much less error-prone.

One known user-visible result is that it fixes panic on a failed "graid label".

PR:		221846
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2017-09-05 14:32:56 +00:00
Ed Schouten
733ba7f881 Merge pipes and socket pairs.
Now that CloudABI's sockets API has been changed to be addressless and
only connected socket instances are used (e.g., socket pairs), they have
become fairly similar to pipes. The only difference on CloudABI is that
socket pairs additionally support shutdown(), send() and recv().

To simplify the ABI, we've therefore decided to remove pipes as a
separate file descriptor type and just let pipe() return a socket pair
of type SOCK_STREAM. S_ISFIFO() and S_ISSOCK() are now defined
identically.
2017-09-05 07:46:45 +00:00
Sepherosa Ziehau
7f1e5ebba8 hyperv/hn: Log RSS capabilities mask.
This helps to detect when UDP hash types can be supported.

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12177
2017-09-05 06:20:02 +00:00
Sepherosa Ziehau
8c068aa51e hyperv/hn: Implement SIOCGIFRSS{KEY,HASH}.
The conditional compiling in the review request is removed, since
these IOCTLs will be available in stable/10 and stable/11.

Reviewed by:	gallatin
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12175
2017-09-05 06:05:48 +00:00
Marcin Wojtas
cb4226ba86 Fix loader bug causing too many pages allocation when bootloader is U-Boot
The FreeBSD loader expects the mmsz variable to be set by the bootloader.
U-Boot's behaviour is that if the buffer is not big enough to hold the
whole memory map, it assigns the smallest sufficient buffer size to sz
and returns an error.

In other words U-Boot assumes that nobody will need mmsz value when buffer
is not filled with memory map, which is not true, so calculated pages value
was too big to allocate.

Solution: Simply assign default value to mmsz.

Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12194
2017-09-05 05:53:43 +00:00
Marcin Wojtas
eedd5eafbe Add Marvell RTC driver to arm64 GENERIC config
The Marvell Armada 80x0/70x0 SoC family uses the same RTC IP as
Armada 38x. This patch adds the necessary files and enables the driver in
the GENERIC config.

Submitted by: Rafal Kozik <rk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12200
2017-09-05 05:50:01 +00:00
Marcin Wojtas
ee1c891dbc Add Armada 80x0/70x0 compatible to 38x RTC driver
Marvell Armada 80x0/70x0 SoC family uses same RTC IP as Armada 38x.
This patch adds Armada 8k compatible to Marvell RTC driver.

Submitted by: Rafal Kozik <rk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12186
2017-09-05 05:45:57 +00:00
Marcin Wojtas
cb0c98fce0 Change name of Marvell Armada38x RTC driver
Two modules with the same name cannot be loaded, so Marvell specific drivers
cannot have the same name as generic drivers.
Files with the same name, even in different folders, produce conflicting .o files.
Renaming armada38x/rtc.c to armada38x/armada38x_rtc.c fixes this.
Preparation for adding this driver to GENERIC config for ARMv7
Marvell platforms.

Submitted by: Rafal Kozik <rk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12185
2017-09-05 05:42:37 +00:00
Sepherosa Ziehau
0f3af0411d if: Add ioctls to get RSS key and hash type/function.
It will be needed by hn(4) to configure its RSS key and hash
type/function in the transparent VF mode in order to match VF's
RSS settings. The description of the transparent VF mode and
the RSS hash value issue are here:
https://svnweb.freebsd.org/base?view=revision&revision=322299
https://svnweb.freebsd.org/base?view=revision&revision=322485

These are generic enough to promise two independent IOCs instead
of abusing SIOCGDRVSPEC.

Setting RSS key and hash type/function is a different story,
which probably requires more discussion.

Comments about UDP_{IPV4,IPV6,IPV6_EX} were only in the patch
in the review request; these hash types are standardized now.

Reviewed by:	gallatin
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12174
2017-09-05 05:28:52 +00:00
Kirk McKusick
855662c611 The new fsck recovery information to enable it to find backup
superblocks created in revision 322297 only works on disks
with sector sizes up to 4K. This update allows the recovery
information to be created by newfs and used by fsck on disks
with sector sizes up to 64K. Note that FFS currently limits
filesystems to being mounted from disks with up to 8K sectors.
Expanding this limitation will be the subject of another
commit.

Reported by: Peter Holm
Reviewed with: kib
2017-09-04 20:19:36 +00:00
Kurt Lidl
77e430c686 Fix whitespace on "options" to be <space><tab>, no functional change
2017-09-04 20:10:34 +00:00
Roger Pau Monné
45ff071d6e acpi/srat: zero the SRAT cpu array
Fix fallout from r322348, which moved the cpus array to a
dynamic allocation without zeroing the area.

Reported by:		mjg
MFC with:		r322348
Reviewed by:		mjg
Differential revision:	https://reviews.freebsd.org/D12220
2017-09-04 10:08:42 +00:00
Andrew Turner
ff6ea3fbc5 Disable the ARM generic timers before interrupts are enabled. Some
Raspberry Pi firmware images leave them enabled causing an interrupt storm.

Sponsored by:	ABT Systems Ltd
2017-09-03 09:41:40 +00:00
Marcin Wojtas
d7d8ab0316 Add ARM Cortex A72 to CPU list
This change is required to properly detect CPUs
on Marvell Armada 80x0/70x0 SoC family.

Submitted by: Rafal Kozik <rk@semihalf.com>
Reviewed by: andrew, cognet (mentor)
Approved by: cognet (mentor)
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12184
2017-09-03 08:32:33 +00:00
Ian Lepore
79a8182b65 Change leading spaces to tabs, no functional change.
2017-09-02 19:22:16 +00:00
Ian Lepore
ac0a9a9064 The latest RPi firmware leaves secondary cores in a wait-for-event (WFE)
state to save power, so after writing the entry point address for a core to
the mailbox, use a dsb() to synchronize the execution pipeline to the data
written, then use an sev() to wake up the core.

Submitted by:	Sylvain Garrigues <sylgar@gmail.com>
2017-09-02 19:20:11 +00:00
Warner Losh
d419d89ec9 Revert r322941: Eliminate redundant device matching functions
Turns out, they are subtly different. I'll refactor anew in the future
if it's worth it then.

Sponsored by: Netflix
Reported by: Tomoaki AOKI-san
2017-09-02 18:18:49 +00:00
Alexander Motin
3dcd9d784a Increase negotiation polling period from 10ms to 100ms.
There is little need to burn CPU if the other side may not be there yet.  For
example, the PLX hardware by default enables the NTB link up on reset, not
depending on the driver to do it.  In the case of Intel hardware this also
reduces the race between MSI-X workaround negotiation and upper layers, which
use the same scratchpad registers at different times.

MFC after:	12 days
2017-09-02 13:28:45 +00:00
Alexander Motin
b7dd3fbede Make NTB drivers report more info via NewBus methods.
MFC after:	12 days
2017-09-02 11:56:16 +00:00
Warner Losh
a905e3962c The hard drive media device path contains the size of the partition,
not its end. This makes the GEOM efimedia attribute match the
FreeBSD:Boot1Device environment variable now.

Sponsored by: Netflix
2017-09-02 07:04:06 +00:00
Alexander Motin
03782c83a1 Sync NTB options between amd64 and i386.
Somehow they had become different.

MFC after:	13 days
2017-09-01 19:15:53 +00:00
Warner Losh
ab4effdc68 Add efimedia attribute for all GPT partitions.
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D12206
2017-09-01 17:55:25 +00:00
Josh Paetzel
9d0ec2a920 Revert r323087
This needs more thinking out and consensus, and the commit message
was wrong AND there was a typo in the commit.

pointyhat:	jpaetzel
2017-09-01 17:03:48 +00:00
Josh Paetzel
0be04b100c Take options IPSEC out of GENERIC
PR:	220170
Submitted by:	delphij
Reviewed by:	ae, glebius
MFC after:	2 weeks
Differential Revision:	D11806
2017-09-01 15:54:53 +00:00
Andrey V. Elsukov
6f9e437bdc Fix possible double releasing for SA reference.
This is the missing part of r318734. When the crypto subsystem returns an
error, the xform code handles the error independently.

PR:		221849
MFC after:	5 days
2017-09-01 11:51:07 +00:00
Alexander Motin
08dc78166d Link Interface has no Link Error registers.
MFC after:	13 days
2017-09-01 09:48:19 +00:00
Navdeep Parhar
95bb5b694e cxgbe/iw_cxgbe: Set TCP_NODELAY before initiating connection so that
t4_tom picks it up right away.  This is less work than waiting for
the connection to be established before applying the setting.

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2017-09-01 01:34:12 +00:00
Navdeep Parhar
0a3bf7fb7e cxgbe/t4_tom: There may not be a tid to update if the connection isn't
established.

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2017-08-31 23:34:08 +00:00
Gleb Smirnoff
2cc3b2eec5 Do not abuse flag that is clearly marked as unused.
This creates conflicts with FreeBSD variations that may use it.  The
usage of the flag M_TOOBIG is limited to iflib queue, thus using
one of M_PROTO flags is fine.  There is no need to grab global flag.

Silence from:	kmacy, sbruno (2 weeks)
2017-08-31 23:19:18 +00:00
Jung-uk Kim
2f6a1a81bb Merge ACPICA 20170831.
2017-08-31 22:47:04 +00:00
Alexander Motin
84f8cfec2f Clear doorbell bits after masking them before processing.
In theory this avoids one more expensive doorbell register read
later in some scenarios.  In practice it also significantly increases the
packet rate on PLX hardware, which I can't explain yet; possibly it works
around some interrupt delays.

MFC after:	13 days
Sponsored by:	iXsystems, Inc.
2017-08-31 21:37:22 +00:00
Andrew Turner
5ea26aec6e Add support for quirks while enabling secondary CPUs. This uses the fdt
compatible string to check if the board is compatible with a given quirk.
It's possible this will be moved later, however as it's currently only used
by the MP code put it there.

So far the only instance of a quirk is when the list of CPUs may be
incorrect. This can happen on virtual machines with a hard coded
devicetree, but where the user may then set the number of CPUs as an
argument. This is the case on the ARM models so include the model specific
compat strings for these, including the spelling mistake found in some of
the OpenplatformPkg dtb files.

Sponsored by:	DARPA, AFRL
2017-08-31 20:48:05 +00:00
Navdeep Parhar
3ef7429927 cxgbe/t4_tom: Add a knob to select the congestion control algorithm
used by the TOE hardware for fully offloaded connections.  The knob
affects new connections only.

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2017-08-31 20:33:22 +00:00
Josh Paetzel
3b65550eec Allow kldload tcpmd5
PR:	220170
MFC after:	2 weeks
2017-08-31 20:16:28 +00:00
Warner Losh
6b4a1e856f Save where we're booted from
Record the file path for boot1.efi as the UEFI environment variable
FreeBSDBootVarGUID:Boot1Path. Record the device this came from as
FreeBSDBootVarGUID:Boot1Dev. While later stages of the boot may be
able to guess these values by retrieving UEFIGlobal:BootCurrent and
groveling through the correct UEFIGlobal:BootXXXX, this provides
certainty in the face of behavior from any part of the boot loader
chain that might "guess" what to do next. These env variables are
volatile and will disappear on reboot.

Sponsored by: Netflix
2017-08-31 17:32:24 +00:00
Warner Losh
b048c3564d Exit rather than panic for most errors.
In the FreeBSD UEFI boot protocol, boot1.efi exits back to UEFI if it
can't boot the image for most reasons (so that further items in the
EFI boot manager list can be tried). Rename panic to efi_panic, make it
static and give it an extra status argument. Exit back to UEFI with
that status argument so the next loader can be tried.

Use malloc/free exclusively instead of mixing malloc/free and
AllocatePool/FreePool. The code is smaller.

Sponsored by: Netflix
2017-08-31 17:32:19 +00:00
Warner Losh
08576441a4 boot1.efi: print more info about where boot1.efi is loaded from
Print the device that boot1.efi was loaded from. Print the path as
well (since it isn't included in DeviceHandle). Move block where we do
this earlier so all the block handle code is now together.

Sponsored by: Netflix
2017-08-31 17:32:14 +00:00
Warner Losh
2c4b0cefde Make efichar.c routines available to libefi.
Make efichar.c routines available to libefi as well as
libefivar. Define LIBEFI when building so we can conditionally include
stand.h vs the normal userland stuff.
2017-08-31 17:32:09 +00:00
Alexander Motin
171e3d1a1f Remove unneeded pmap_change_attr() calls.
Reported by:	kib
MFC after:	13 days
2017-08-31 17:02:06 +00:00
Alexander Motin
0faf59149c Add/polish some defines.
MFC after:	13 days
2017-08-31 16:32:11 +00:00
Konstantin Belousov
5a21cd1941 The nvme module should explicitly declare dependency on the cam.
If both nvme and cam are compiled as modules, nvme cannot be kldloaded
otherwise.

Reviewed by:	imp
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-08-31 14:21:32 +00:00
Alexander Motin
b6c46372af Fix port control for PEX 8749.
That chip has three Station Ports, so previous address math was incorrect.

MFC after:	13 days
Sponsored by:	iXsystems, Inc.
2017-08-31 13:41:44 +00:00
Baptiste Daroussin
91f1caeccb Add sysctls for arc shrinking and growing values
The default value for arc_no_grow_shift may not be optimal when using
an ARC of several GiB. Exposing it via sysctl allows users to tune it easily.

Also expose arc_grow_retry via sysctl for the same reason. The default value of
60s might, in case of intensive load, be too long.

Submitted by:	Nikita Kozlov <nikita.kozlov@blade-group.com>
Reviewed by:	mav, manu, bapt
MFC after:	2 weeks
Sponsored by:	blade
Differential Revision:	https://reviews.freebsd.org/D12144
2017-08-31 13:02:17 +00:00
Alexander Motin
c7dabb6563 Make ntb_set_ctx() always generate fake link event.
It allows the application driver to get the initial link state without racing
with hardware interrupts, thanks to the context rmlock held here.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2017-08-31 10:59:39 +00:00
Alexander Motin
ba4b25cbba Make ntb_transport(4) ready to receive early link events.
Those events may be reported as soon as callback is registered, if the link
is enabled by hardware or some other application.

While there, clear the link_is_up variable on link down event.

MFC after:	1 week
2017-08-31 10:53:10 +00:00
Navdeep Parhar
2f318252cb cxgbe(4): Add two new debug flags -- one to allow manual firmware
install after full initialization, and another to disable the TCB
cache (T6+).  The latter works as a tunable only.

Note that debug_flags are for debugging only and should not be set
normally.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-08-30 23:41:04 +00:00
Ed Maste
118ee8bdea xls_ehci: eliminate 'format string is not a string literal' warning
Reported by:	Clang
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2017-08-30 22:58:11 +00:00
Ed Maste
7a8d38a717 octeon_ebt3000_cf: eliminate 'format string is not a string literal' warning
Reported by:	Clang
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2017-08-30 22:54:11 +00:00
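Note on 118ee8bdea and 7a8d38a717: Clang emits 'format string is not a
string literal' when a non-literal string is passed as the format argument
of a printf-style function. The usual fix is to pass the string as an
argument to a literal "%s" format. The following is a minimal sketch of that
pattern, not the code from either commit; the helper name format_desc is
hypothetical.

```c
#include <stdio.h>

/*
 * Copy an externally supplied description into buf.
 * Hypothetical helper, for illustration only.
 */
int
format_desc(char *buf, size_t len, const char *desc)
{
	/*
	 * snprintf(buf, len, desc) would interpret any '%' in desc as a
	 * conversion specification; a literal "%s" format copies it verbatim.
	 */
	return (snprintf(buf, len, "%s", desc));
}
```

With this form a description such as "100%d" is copied unchanged instead of
being parsed as a format.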
Alexander Motin
ed9652da5f Add NTB driver for PLX/Avago/Broadcom PCIe switches.
This driver supports both NTB-to-NTB and NTB-to-Root Port modes (though
the second with predictable complications on hot-plug and reboot events).
I tested it with PEX 8717 and PEX 8733 chips, but expect it should work
with many other compatible ones too.  It supports up to two NT bridges
per chip, each of which can have up to 2 64-bit or 4 32-bit memory windows,
6 or 12 scratchpad registers and 16 doorbells.  There are also 4 DMA engines
in those chips, but they are not yet supported.

While there, rename the Intel NTB driver from the generic ntb_hw(4) to the
more specific ntb_hw_intel(4), so that it is on par with the new
ntb_hw_plx(4) driver and matches the Linux naming.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2017-08-30 21:16:32 +00:00
John Baldwin
b8b3a64000 Apply 64k padding to stack pointer for 32-bit processes.
In particular, MIPS now has COMPAT_FREEBSD32 for n64 kernels so this
cannot be ignored for n64.  On the other hand, it is unneeded for o32
MIPS kernels as the issue is only present when using 64-bit registers,
so remove the workaround from o32 kernels.

Reviewed by:	jmallett
Sponsored by:	DARPA / AFRL
2017-08-30 19:21:11 +00:00
Sean Bruno
a969350226 Revert r323008 and its conversion of e1000/iflib to using SX locks.
This seems to be missing something on the 82574L causing NFS root mounts
to hang.

Reported by:	kib
2017-08-30 18:56:24 +00:00
Navdeep Parhar
e3d00a3e03 cxgbe(4): Zero out the memory allocated for the debug dump.
cudbg_collect seems to expect it this way.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-08-30 18:46:38 +00:00
Konstantin Belousov
405058aa7a Only make the if_ix module depend on netmap when netmap is configured.
Approved by:	erj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-08-30 18:19:25 +00:00
Ed Maste
53d534724a bhnd: initialize variable before use
Reported by:	Clang
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2017-08-30 17:39:51 +00:00
Ed Maste
3460f018f9 arge: correct bzero sizeof (pointed-to object, not pointer)
Reported by:	Clang
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2017-08-30 17:38:55 +00:00
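Note on 3460f018f9: the class of bug named in the subject is passing
sizeof(pointer) where sizeof(*pointer) was intended, which clears only
pointer-size bytes of the object. The following is a minimal sketch under
assumed names; struct ring_desc and ring_desc_clear are hypothetical and are
not taken from the arge driver, and memset() stands in for bzero() to keep
the example portable.

```c
#include <string.h>

/* Hypothetical descriptor type, for illustration only. */
struct ring_desc {
	unsigned int addr;
	unsigned int ctrl;
	unsigned int pad[6];
};

/* Clear the whole descriptor that d points to. */
void
ring_desc_clear(struct ring_desc *d)
{
	/* sizeof(*d) is the object size; sizeof(d) is only the pointer size. */
	memset(d, 0, sizeof(*d));
}
```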
Maxim Sobolev
f76de5dd51 Add proper support for the md_label into md(4) ioctl compat layer.
While I am here, declare struct md_ioctl32 as packed which allows
us to stop playing tricks with sizeof(md_ioctl32)+y as well as
simplifies md_pad handling. Both were necessary because of different
alignment preferences on amd64 vs i386.

MFC after:	4 weeks
2017-08-30 15:07:10 +00:00
Konstantin Belousov
35872e79b7 Adjust interface of swapon_check_swzone() to its actual usage.
The function return value is not used.  Its argument is always
swap_total/PAGE_SIZE, so make it not take any arguments.

Submitted by:	ota@j.email.ne.jp
PR:	221356
MFC after:	1 week
2017-08-30 10:17:00 +00:00
Konstantin Belousov
f08b30995a Make the swap_pager_full variable static.
r290920 removed the use of the variable from vm/vm_pageout.c.

Submitted by:	ota@j.email.ne.jp
PR:	221356
MFC after:	1 week
2017-08-30 09:44:05 +00:00
Ed Schouten
b53b978a6c Complete the CloudABI networking refactoring.
Now that all of the packaged software has been adjusted to either use
Flower (https://github.com/NuxiNL/flower) for making incoming/outgoing
network connections or can have connections injected, there is no longer
need to keep accept() around. It is now a lot easier to write networked
services that are address family independent, dual-stack, testable, etc.

Remove all of the bits related to accept(), but also to
getsockopt(SO_ACCEPTCONN).
2017-08-30 07:30:06 +00:00
Ed Maste
89fdc67c9c usb: Add external "Intenso Memory" disk UQ_MSC_NO_INQUIRY quirk
PR:		221852
Submitted by:	Fabian Keil
Reviewed by:	hselasky
Obtained from:	ElectroBSD
MFC after:	1 week
2017-08-30 01:44:11 +00:00
Sean Bruno
e17e5b4134 Continuation of lock cleanup in e1000.
Post-cold sleep instead of DELAY when waiting for firmware.

Convert softc mutex to an SX lock.  Change all waits to sleeps
once interrupts are enabled (and it is safe to sleep).

Submitted by:	Matt Macy <matt@mattmacy.io>
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12101
2017-08-30 00:20:43 +00:00
John Baldwin
cccc713394 Allow execution of FreeBSD/mips shared objects.
This was applied to all other FreeBSD architectures in r169846 but wasn't
included in the initial MIPS import.

Sponsored by:	DARPA / AFRL
2017-08-30 00:14:36 +00:00
Navdeep Parhar
fc740a161b cxgbe(4): Update T6/T5/T4 firmwares to 1.16.59.0.
These firmwares come from a pre-release snapshot.  The final firmwares
in this Chelsio release cycle will likely be .61.0 or later and those
will be the next "long lived" firmwares in FreeBSD head and stable
branches.  .59 is being provided in head (only) for wider test exposure.

Obtained from:	Chelsio Communications
Sponsored by:	Chelsio Communications
2017-08-29 23:37:26 +00:00
Ed Maste
3c3d2ba6fe zfs: do not advertise edonr which is not yet supported
illumos 4185 ("add new cryptographic checksums to ZFS: SHA-512,
Skein, Edon-R") was intentionally merged only partially in r289422,
without adding support for skein, sha512 and edonr on FreeBSD.

Support for skein and sha512 was added later on, but edonr is still not
implemented in FreeBSD.

Prior to this commit zfs(8) correctly rejected edonr, but with an error
message that claimed support:

fk@r500 ~ $zfs set checksum=edonr tank
cannot set property for 'tank': 'checksum' must be one of 'on | off | fletcher2 | fletcher4 | sha256 | sha512 | skein | edonr'

PR:		204055
Submitted by:	Fabian Keil
Approved by:	allanjude
Obtained from:	ElectroBSD
MFC after:	1 week
2017-08-29 22:24:22 +00:00
Warner Losh
a3cf03a519 Add missing test for NVME CCBs for nvme passthru support.
Submitted by: Chuck Tuffli
2017-08-29 21:04:29 +00:00
Warner Losh
9f8ed7e40b Fix NVMe's use of XPT_GDEV_TYPE
This patch changes the way XPT_GDEV_TYPE works for NVMe. The current
ccb_getdev structure includes pointers to the NVMe Identify Controller
and Namespace structures, but these are kernel virtual addresses which
are not accessible from user space.

As an alternative, the patch changes the pointers into padding in
ccb_getdev and adds two new types to ccb_dev_advinfo to retrieve the
Identify Controller (CDAI_TYPE_NVME_CNTRL) and Namespace
(CDAI_TYPE_NVME_NS) data structures.

Reviewed By: rpokala, imp
Differential Revision: https://reviews.freebsd.org/D10466
Submitted by: Chuck Tuffli
2017-08-29 17:03:30 +00:00
Warner Losh
c2005bba77 Fix a few overlooked spots where the code uses 16-bit NSIDs. Chuck
Tuffli had submitted a more thorough patch that I was unaware of when
I did my work and this brings in the bits I missed from that patch.

PR: 220267
Submitted by: Chuck Tuffli
2017-08-29 15:46:34 +00:00
Warner Losh
519772814d Add CAM/NVMe support for CAM_DATA_SG
This adds support in pass(4) for data to be described with a
scatter-gather list (sglist) to augment the existing (single) virtual
address.

Differential Revision: https://reviews.freebsd.org/D11361
Submitted by: Chuck Tuffli
Reviewed by: imp@, scottl@, kenm@
2017-08-29 15:29:57 +00:00
Warner Losh
850564b948 Add new compile-time option NVME_USE_NVD that sets the default value
of the runtime hw.nvme.use_nvd tunable. We still default to nvd unless
otherwise requested.

Sponsored by: Netflix
2017-08-28 23:54:25 +00:00
Warner Losh
c02565f9fa Set the max transactions for NVMe drives better.
Provide a better estimate for the number of transactions that can be
pending at one time. This will be the number of queues * the number of
trackers / 4, as suggested by Jim Harris. This gives a better estimate
of the number of transactions that CAM should queue before applying
back pressure. This should be revisited when we have real multi-queue
support in CAM and the upper layers of the I/O stack.

Sponsored by: Netflix
2017-08-28 23:54:20 +00:00
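Note on c02565f9fa: the estimate stated in the message is queues * trackers
/ 4. The following sketch only restates that arithmetic; the function and
parameter names are hypothetical and the real driver code may differ.

```c
/*
 * Estimate of pending transactions as stated in the commit message:
 * number of queues * number of trackers / 4 (integer division).
 * Hypothetical helper, for illustration only.
 */
unsigned int
nvme_max_trans_estimate(unsigned int num_queues, unsigned int num_trackers)
{
	return (num_queues * num_trackers / 4);
}
```

For example, 8 queues with 128 trackers each would give an estimate of 256.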
Warner Losh
be650b3469 Add nvme_sim.c since that's not runtime switchable.
Sponsored by: Netflix
2017-08-28 23:54:16 +00:00
Navdeep Parhar
1ba8c29ca8 cxgbe(4): Do not access the mailbox without appropriate locks while
creating hardware VIs.

This fixes a bad race on systems with hw.cxgbe.num_vis > 1.

Reported by:	olivier@
MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-08-28 22:41:15 +00:00
Conrad Meyer
2744a0b69b Drop CACHE_LINE_SIZE to 64 bytes on x86
The actual cache line size has always been 64 bytes.

The 128 number arose as an optimization for Core 2 era Intel processors.  By
default (configurable in BIOS), these CPUs would prefetch adjacent cache
lines unintelligently.  Newer CPUs prefetch more intelligently.

The latest Core 2 era CPU was introduced in September 2008 (Xeon 7400
series, "Dunnington").  If you are still using one of these CPUs, especially
in a multi-socket configuration, consider locating the "adjacent cache line
prefetch" option in BIOS and disabling it.

Reported by:	mjg
Reviewed by:	np
Discussed with:	jhb
Sponsored by:	Dell EMC Isilon
2017-08-28 22:28:41 +00:00
Andriy Voskoboinyk
0cc18edf18 rtwn(4): some initial preparations for (basic) VHT support.
Rename RTWN_RIDX_MCS to RTWN_RIDX_HT_MCS before adding 802.11ac
MCS rate indexes (they have different offset).

No functional change intended.
2017-08-28 22:14:16 +00:00
Mark Johnston
aed9aaaa76 Synchronize page laundering with pmap_extract_and_hold().
Before r207410, the hold count of a page in a page queue was protected
by the queue lock, and, before laundering a page, the page daemon
removed managed writeable mappings of the page before releasing the
queue lock. This ensured that other threads could not concurrently
create transient writeable mappings using pmap_extract_and_hold() on a
user map, as is done for example by vmapbuf(). With that revision,
however, a race can allow the creation of such a mapping, meaning that
the page might be modified as it is being laundered, potentially
resulting in it being marked clean when its contents do not match
those given to the pager. Close the race by using the page lock to
synchronize the hold count check in vm_pageout_cluster() with the
removal of writeable managed mappings.

Reported by:	alc
Reviewed by:	alc, kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D12084
2017-08-28 22:10:15 +00:00
Marius Strobl
e023501c6a Don't set any WOL enabling hardware bits if WOL isn't requested
according to the enabled interface capability bits. Also remove
some dead code, which tried to preserve already set contents of
E1000_WUC while that register is completely overwritten shortly
after in all cases.
2017-08-28 22:09:12 +00:00
Navdeep Parhar
7023d9d4c6 cxgbe(4): Maintain one ifmedia per physical port instead of one per
Virtual Interface (VI).  All autonomous VIs that share a port share the
same media.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-08-28 21:44:25 +00:00
Konstantin Belousov
4eeec01fee Style.
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-08-28 21:04:56 +00:00
Konstantin Belousov
fbcbbe78dc Verify that the BPB media descriptor and FAT ID match.
FAT specification requires that for valid FAT, FAT cluster 0 has a
specific value derived from the BPB media descriptor.  The lowest
(little-endian) byte must be equal to bpb.bpbMedia, other bits in the
cluster number must be all 1's.  Implement the check to reduce the
chance of the randomly corrupted FAT to pass the mount attempt.

Submitted by:	Siva Mahadevan <smahadevan@freebsdfoundation.org>
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D12124
2017-08-28 20:52:32 +00:00
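Note on fbcbbe78dc: the rule in the message (low byte of FAT entry 0 equals
the BPB media descriptor, all other bits set) can be sketched as below. This
is an illustration, not the msdosfs code; fat_id_matches and the mask macro
names are hypothetical, and the sketch assumes that only the bits inside the
per-width FAT mask are compared (for FAT32, the low 28 bits).

```c
#include <stdbool.h>
#include <stdint.h>

/* Valid bits of a FAT entry for each FAT width (names are illustrative). */
#define FAT12_MASK 0x00000fffu
#define FAT16_MASK 0x0000ffffu
#define FAT32_MASK 0x0fffffffu

/*
 * Return true when FAT entry 0 is consistent with the BPB media descriptor:
 * the low byte equals the descriptor and every other bit in the mask is 1.
 */
bool
fat_id_matches(uint32_t fat0, uint8_t bpb_media, uint32_t fat_mask)
{
	uint32_t expected = (fat_mask & ~0xffu) | bpb_media;

	return ((fat0 & fat_mask) == expected);
}
```

For a FAT16 volume with media descriptor 0xf8, entry 0 must read 0xfff8; a
value such as 0xfff0 would be rejected.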
Alexander Motin
546ec4e544 Mask doorbells while processing them.
This fixes interrupt storms on hardware using legacy level-triggered
interrupts, since doorbell processing could take time after interrupt
handler completion, which triggered extra interrupts in a loop.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2017-08-28 20:00:21 +00:00
Alexander Motin
e6f000753e Fix fake interrupt when set doorbell is unmasked.
Since the doorbell bit is already set when interrupt handler is called,
the event was not propagated to upper layer.  It was working normally
because present code was not using masking actively, but that is going
to change.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2017-08-28 19:52:57 +00:00
Bryan Drewery
8359a6b7b3 Allow vdrop() of a vnode not yet on the per-mount list after r306512.
The old code allowed calling vdrop() before insmntque() to place the vnode back
onto the freelist for later recycling.  Some downstream consumers may rely on
this support.  Normally insmntque() failing is fine since it uses vgone() and
immediately frees the vnode rather than attempting to add it to the freelist if
vdrop() were used instead.

Also assert that vhold() cannot be used on such a vnode.

Reviewed by:	kib, cem, markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12126
2017-08-28 19:29:51 +00:00
Warner Losh
9754579b01 Add comment about where we need to place this routine, and why.
Sponsored by: Netflix
2017-08-28 19:27:33 +00:00
Warner Losh
43effc8ca8 Add comment about where we need to place this routine, and why.
Sponsored by: Netflix
2017-08-28 19:25:49 +00:00
Alan Cox
ee620ea47d Update a couple vm_object lock assertions in the swap pager to reflect the
new use of the vm_object's lock to synchronize updates to a radix trie
mapping per-vm object page indices to on-disk swap blocks.

Fix a typo in a nearby comment.

Reviewed by:	kib, markj
X-MFC with:	r322913
Differential Revision:	https://reviews.freebsd.org/D12134
2017-08-28 17:02:25 +00:00
Alan Cox
d5efa0a475 Switching from a global hash table to per-vm_object radix tries for mapping
vm_object page indices to on-disk swap space (r322913) has changed the
synchronization requirements for a couple swap pager functions.  Whereas
before a read lock on the vm object sufficed because of the global mutex
on the hash table, a write lock on the vm object may now be required.  In
particular, calls to vm_pager_page_unswapped() now require a write lock on
the vm_object.  Consequently, vm_fault()'s fast path cannot call
vm_pager_page_unswapped().  The swap space will have to be released at a
later point.

Reviewed by:	kib, markj
X-MFC with:	r322913
Differential Revision:	https://reviews.freebsd.org/D12134
2017-08-28 16:55:43 +00:00
Maxim Sobolev
f7ca2bbe44 Add ability to label md(4) devices.
This feature comes from the fact that we rely heavily on memory-backed
md(4) in our build process. However, if the build goes haywire
the allocated resources (i.e. swap and memory-backed md(4)'s) need
to be purged. It is extremely useful to have the ability to attach
arbitrary labels to each of the virtual disks so that they can
be identified and GC'ed if necessary.

MFC after:	4 weeks
Differential Revision:	https://reviews.freebsd.org/D10457
2017-08-28 15:54:07 +00:00
Michael Tuexen
3d5af7a127 Fix blackhole detection.
There were two bugs related to the blackhole detection:
* The smallest size was tried more than two times.
* The restored MSS was not the original one, but the second
  candidate.

MFC after:	1 week
Sponsored by:	Netflix, Inc.
2017-08-28 11:41:18 +00:00
Ed Schouten
5d17a1d629 Make _Static_assert() work with GCC in older C++ standards.
GCC only activates C11 keywords in C mode, not C++ mode. This means
that when targeting an older C++ standard, we cannot fall back to using
_Static_assert(). In this case, do define _Static_assert() as a macro
that uses a typedef'ed array.

Discussed in:	r322875 commit thread
Reported by:	Mark Millard
MFC after:	1 month
2017-08-28 09:35:17 +00:00
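Note on 5d17a1d629: a static assertion built from a typedef'ed array relies
on a negative array size being a compile-time error. The following is a
minimal sketch of the technique, not the macro from sys/cdefs.h; the names
MY_STATIC_ASSERT and SA_CONCAT are hypothetical.

```c
/* Paste two tokens after macro expansion so __LINE__ becomes a number. */
#define SA_CONCAT_(a, b) a##b
#define SA_CONCAT(a, b) SA_CONCAT_(a, b)

/*
 * Hypothetical fallback: declare a typedef of a char array whose size is 1
 * when cond holds and -1 (a compile error) when it does not. The message
 * argument is accepted for source compatibility but is not used.
 */
#define MY_STATIC_ASSERT(cond, msg) \
	typedef char SA_CONCAT(my_static_assert_, __LINE__)[(cond) ? 1 : -1]

MY_STATIC_ASSERT(sizeof(char) == 1, "char is one byte");
MY_STATIC_ASSERT(sizeof(int) >= 2, "int is at least 16 bits");
```

Changing either condition to a false one makes the translation unit fail to
compile, which is the intended effect in both C and older C++ modes.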
Navdeep Parhar
573443c542 cxgbe(4): vi_mac_funcs should include the base Ethernet function. It is
already used in the driver as if it does.

MFC after:	3 days
Sponsored by:	Chelsio Communications
2017-08-28 07:50:54 +00:00
Navdeep Parhar
358346b456 cxgbe(4): Remove write only variable from t4_port_init.
MFC after:	3 days
2017-08-28 04:06:40 +00:00
Navdeep Parhar
0fa7560dfd cxgbe(4): Fix some assertions during driver detach. The netmap queues
can't be initialized if the VI isn't.

MFC after:	3 days
2017-08-28 03:25:41 +00:00
Navdeep Parhar
f6d9d14b93 cxgbe(4): Verify that the driver accesses the firmware mailbox in a
thread-safe manner.

MFC after:	3 days
2017-08-28 03:13:16 +00:00
Andriy Voskoboinyk
191ccdf545 net80211: fix a typo (premable -> preamble).
2017-08-27 22:13:03 +00:00
Conrad Meyer
4ae2ade114 Enhance debuggability of sysctl leaf re-use warnings
Print the full conflicting oid path, and include the function name in the
warning so it is clear that the warnings are sysctl-related.

PR:		221853
Submitted by:	Fabian Keil <fk AT fabiankeil.de> (earlier version)
Sponsored by:	Dell EMC Isilon
2017-08-27 17:12:30 +00:00