Commit Graph

125864 Commits

Author SHA1 Message Date
Navdeep Parhar
c0a248ef93 cxgbev(4): Initialize debug_flags from the environment like in the PF driver. 2019-02-08 03:31:38 +00:00
Andrew Turner
4f33c38083 Add missing data barriers after storeing a new valid pagetable entry.
When moving from an invalid to a valid entry we don't need to invalidate
the tlb, however we do need to ensure the store is ordered before later
memory accesses. This is because this later access may be to a virtual
address within the newly mapped region.

Add the needed barriers to places where we don't later invalidate the
tlb. When we do invalidate the tlb there will be a barrier to correctly
order this.

This fixes a panic on boot on ThunderX2 when INVARIANTS is turned off:
panic: vm_fault_hold: fault on nofault entry, addr: 0xffff000040c11000

Reported by:	jchandra
Tested by:	jchandra
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19097
2019-02-07 20:58:45 +00:00
Andrew Turner
8308d2a251 Add a missing data barrier to the start of arm64_tlb_flushID.
We need to ensure the page table store has happened before the tlbi.

Reported by:	jchandra
Tested by:	jchandra
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19097
2019-02-07 20:50:39 +00:00
Emmanuel Vadot
bcdb59e767 arm64: dtb: allwinner: Add the new pine64-lts dtb file to the build
MFC after:	1 month
X-MFC-With:	r342936
2019-02-07 18:07:17 +00:00
Leandro Lupori
59a8224976 [ppc64] fix /dev/kmem
For direct mapped kernel addresses, ppc64 function was not
performing the dmap to physical conversion, before jumping
to the code that fetched the value from physical memory.

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D19086
2019-02-07 17:30:44 +00:00
Vincenzo Maffione
1ef2a88149 netmap: revert netmap_attach_ext() to pre-r343772
Reported by:	marius
MFC after:	1 week
2019-02-07 11:28:53 +00:00
Navdeep Parhar
644b22ae36 cxgbe(4): Auto-dump the CIM block's logic analyzer on a TIMER0 interrupt.
Sponsored by:	Chelsio Communications
2019-02-07 05:40:51 +00:00
Navdeep Parhar
286fd42ba6 cxgbe(4): Auto-dump the device log on a mailbox timeout or when the
firmware reports an error in pcie_fw.

Sponsored by:	Chelsio Communications
2019-02-07 05:06:29 +00:00
Jayachandran C.
13607f6db5 pci_host_generic_acpi: use IORT data for MSI/MSI-X
Use the information from IORT parsing to translate the PCI RID to
GIC ITS device ID. And similarly, use the information to find the
PIC XREF identifier to be used for PCI devices.

Reviewed by:	andrew
Differential Revision:	https://reviews.freebsd.org/D18004
2019-02-07 04:50:16 +00:00
Gleb Smirnoff
ad66f95865 Now that there is only one way to allocate a slab, remove uz_slab method.
Discussed with:	jeff
2019-02-07 03:55:05 +00:00
Gleb Smirnoff
b47acb0a4d Report cache zones in UMA stats sysctl, that 'vmstat -z' uses. This
should had been part of r251826.
2019-02-07 03:32:45 +00:00
Jayachandran C.
73d8c81f38 arm64 gicv3: add IORT and NUMA support
acpi_iort.c has added support to query GIC proximity and MSI XREF
ID for GIC ITS blocks. Use this when GIC ITS blocks are initialized
from ACPI.

Reviewed by:	andrew
Differential Revision:	https://reviews.freebsd.org/D18003
2019-02-07 03:01:54 +00:00
Jayachandran C.
9088a4751c arm64 acpi: Add support for IORT table
Add new file arm64/acpica/acpi_iort.c to support the "IO Remapping
Table" (IORT). The table is specified in ARM document "ARM DEN 0049D"
titled "IO Remapping Table Platform Design Document".  The IORT table
has information on the associations between PCI root complexes, SMMU
blocks and GIC ITS blocks in the system.

The changes are to parse and save the information in the IORT table.
The API to use this information is added to sys/dev/acpica/acpivar.h.

The acpi_iort.c also has code to check the GIC ITS nodes seen in the
IORT table with corresponding entries in MADT table (for validity)
and with entries in SRAT table (for proximity information).

Reviewed by:	andrew
Differential Revision:	https://reviews.freebsd.org/D18002
2019-02-07 02:30:33 +00:00
Konstantin Belousov
eb785fab3b Port sysctl kern.elf32.read_exec from amd64 to i386.
Make it more comprehensive on i386, by not setting nx bit for any
mapping, not just adding PF_X to all kernel-loaded ELF segments.  This
is needed for the compatibility with older i386 programs that assume
that read access implies exec, e.g. old X servers with hand-rolled
module loader.

Reported and tested by:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-02-07 02:17:34 +00:00
Konstantin Belousov
f76b5ab6cc Fix resume on i386 PAE.
It was broken before PAE/no-PAE merge, but since now PAE is the
default, resume is apparently becomes for all machines.

The corrected issues:
- the trampoline page is not mapped executable, so machine faults when
  paging is on;
- MSR.EFER and %cr4 both should be loaded before paging is enabled,
  otherwise paging structures are invalid (cr4.PAE and EFER.NX).
- MSR.EFER and %cr4 should be only loaded if present.  I attempt to handle
  this by not touching the registers if the value is zero.

There are some more bits still not quite correct, e.g. unconditional
access to %cr4 in resumectx.

Reported and debugging help by:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-02-07 02:09:34 +00:00
Konstantin Belousov
d22ff6e6a2 contigmalloc: handle M_EXEC.
Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D19092
2019-02-07 02:00:23 +00:00
Gavin Atkinson
afa9b0368f Support the Lenovo OneLink in ure(4).
MFC after:	1 week
2019-02-06 20:18:22 +00:00
Ed Maste
ac979af451 riscv: default to non-executable stack
There's no need to worry about potential backwards compatibility issues
in a brand-new architecture, so avoid stack PROT_EXEC as with arm64.

Discussed with:	br
2019-02-06 19:22:15 +00:00
Ed Maste
e26563b8c7 Retire SPX_HACK option unused after r342244 2019-02-06 17:21:25 +00:00
Andriy Voskoboinyk
545619f3f2 net80211(4): validate supplied roam:rate values from ifconfig(8)
MFC after:	4 days
2019-02-06 13:01:21 +00:00
Michal Meloun
64d41eddd7 Adapt FreeBSD specific DT stub for Jetson TK1 board to be consistent with
update of devicetree to 4.19 in r340337.
Our build system doesn't provide dependencies for included DTS files, so
nobody noticed this issue for long time.

PR:		235362
MFC after:	1 week
2019-02-06 06:03:44 +00:00
Justin Hibbits
4290b4b849 powerpc: Bind IRQs to only one interrupt on QorIQ SoCs
The QorIQ SoCs don't actually support multicast interrupts, and the
references state explicitly that multicast is undefined behavior.  Avoid the
undefined behavior by binding to only a single CPU, using a quirk to
determine if this is necessary.

MFC after:	3 weeks
2019-02-06 03:52:14 +00:00
Andriy Voskoboinyk
61b38ede81 iwn(4): plug initialization path vs interrupt handler races
There are few places in interrupt handler where the driver
lock is dropped; ensure that device is still running before
processing remaining ring entries.

PR:		192641
MFC after:	5 days
2019-02-06 01:34:14 +00:00
Warner Losh
a49077d365 Add quirk for Sansisk X400 drives
Certain versions of Sandisk x400 firmware can hang under extremely
heavly load of large I/Os for prolonged periods of time. Newer /
current versions work fine, and should be used where possible. Where
not possible, this quirk ensures that I/O requests are limited to 128k
to avoids the bug, even under extreme load. Since MAXPHYS is 128k,
only users with custom kernels are at risk on the older firmware.
Once all known users of the older firmware have upgraded, this quirk
will be removed.

Sponsored by: Netflix, Inc.
2019-02-05 22:53:36 +00:00
Warner Losh
e9f9c34796 Remove obsolete controller
We removed support for the super-old samsung s3xxxx parts, but this is
a straggler. Remove it too.
2019-02-05 21:37:45 +00:00
Warner Losh
d3f1313287 Remove All Rights Reserved
Remove the all rights reserved clause from my copyright, and make
other minor tweaks needed where that might have created ambiguity.
2019-02-05 21:37:34 +00:00
Warner Losh
8590b14e9d Remove a few stray "All Rights Reserved." declarations on stuff I've
written.
2019-02-05 21:28:29 +00:00
Konstantin Belousov
2648ed9275 Make it possible to override PAE mode on boot.
Initialize the static kenv in pmap_cold() and fetch user opinion on
vm.pmap.pae_mode tunable if hardware is capable.  Note that the static
environment is reinitilized in init386() later when paging is enabled.

Reviewed by:	bde
Discussed with:	kevans
Sponsored by:	The FreeBSD Foundation
MFC after:	2 months
2019-02-05 20:09:31 +00:00
Konstantin Belousov
c7301c6b2c Remove pointless initial value for i386 vm.pmap.pat_works sysctl definition.
The OID is served by external data.

Submitted by:	bde
MFC after:	3 days
2019-02-05 20:02:16 +00:00
Leandro Lupori
4a8450ceff [ppc64] llan: fix fatal kernel trap when system is low on memory
When running several builders in parallel, on QEMU, with 8GB of
memory, a fatal kernel trap (0x300 (data storage interrupt))
caused by llan driver is sometimes observed, when the system
starts to run out of swap space.

This happens because, at llan_intr(), a phyp call to add a
logical LAN buffer is always made when llan_add_rxbuf() fails,
even if it fails to allocate a new buffer.

PR:	235489
Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D19084
2019-02-05 18:16:14 +00:00
Mark Johnston
401ca034cd Avoid leaking fp references when truncating SCM_RIGHTS control messages.
Reported by:	pho
Approved by:	so
MFC after:	0 minutes
Security:	CVE-2019-5596
Sponsored by:	The FreeBSD Foundation
2019-02-05 17:55:08 +00:00
Konstantin Belousov
762138f78f amd64: clear callee-preserved registers on syscall exit.
%r8, %r10, and on non-KPTI configuration %r9 were not restored on fast
return from a syscall.

Reviewed by:	markj
Approved by:	so
Security:	CVE-2019-5595
Sponsored by:	The FreeBSD Foundation
MFC after:	0 minutes
2019-02-05 17:49:27 +00:00
Bruce Evans
78223db29c Fix missing translation of old ioctls for KDSETMODE, KDSBORDER and
CONS_SETWINORG.  After translation, the last 2 are not supported, but
the first one has incomplete support that is enough to run old versions
of X.
2019-02-05 17:17:12 +00:00
Bruce Evans
3a19918442 My recent fix for programmable function keys in syscons only worked
when TEKEN_CONS25 is configured.  Fix this by adding a function to
set the flag that enables the fix and always calling this function
for syscons.

Expand the man page for teken_set_cons25().  This function is not
very useful since it can only set but not clear 1 flag.  In practice,
it is only used when TEKEN_CONS25 is configured and all that does is
choose the the default emulation for syscons at compile time.
2019-02-05 16:59:29 +00:00
Bruce Evans
6fd2dcd428 Fix zapping of static hints and env in init_static_kenv(). Environments
are terminated by 2 NULs, but only 1 NUL was zapped.  Zapping only 1
NUL just splits the first string into an empty string and a corrupted
string.  All other strings in static hints and env remained live early
in the boot when they were supposed to be disabled.

Support calling init_static_kenv() very early in the boot, so as to
use the env very early in the boot.  Then the pointer to the loader
env may change after the first call due to enabling paging or otherwise
remapping the pointer.  Another call is needed to register the change.
Don't use the previous pointer in this (or any) later call.

Reviewed by:	kib
2019-02-05 15:34:55 +00:00
Vincenzo Maffione
75f4f3ed51 netmap: refactor logging macros and pipes
Changelist:
    - Replace ND, D and RD macros with nm_prdis, nm_prinf, nm_prerr
      and nm_prlim, to avoid possible naming conflicts.
    - Add netmap_krings_mode_commit() helper function and use that
      to reduce code duplication.
    - Refactor pipes control code to export some functions that
      can be reused by the veth driver (on Linux) and epair(4).
    - Add check to reject API requests with version less than 11.
    - Small code refactoring for the null adapter.

MFC after:	1 week
2019-02-05 12:10:48 +00:00
Michael Tuexen
baed5270e1 Only reduce the PMTU after the send call. The only way to increase it, is
via PMTUD.

This fixes an MTU issue reported by Timo Voelker.

MFC after:		3 days
2019-02-05 10:29:31 +00:00
Michael Tuexen
e4c42fa266 Fix an off-by-one error in the input validation of the SCTP_RESET_STREAMS
socketoption.

This was found by running syzkaller.

MFC after:		3 days
2019-02-05 10:13:51 +00:00
Jayachandran C.
cc9b5471a0 arm, acpi: increase size of memory region arrays
Bump up MAX_HWCNT and MAX_EXCNT to 32 when ACPI is enabled. These are
the sizes of the hwregions and exregions arrays respectively. ACPI
firmware typically has more memory regions and the current value of
16 is not sufficient for some platforms.

This commit fixes a failure seen with AMI firmware on Cavium's Sabre
ThunderX2 reference platform. This platform needs 21 physical memory
regions and 18 excluded regions to boot correctly with the current
firmware release.

Reviewed by:	andrew
Differential Revision:	https://reviews.freebsd.org/D19073
2019-02-05 06:25:35 +00:00
Justin Hibbits
9c22a13345 powerpc: Don't idle with the wait instruction on booke
It appears idling via 'wait' on e5500 causes strange behaviors, such as
top(1) simply hanging sporadically, until input.  Until this can possibly be
sorted out (interrupt issue?), just don't idle on this hardware.  The SoCs
are low power already, and the wait state doesn't save much anyway.
2019-02-05 04:47:41 +00:00
Conrad Meyer
e682df5397 extattr_list_vp: Narrow locked section somewhat
Suggested by:	mjg
Reviewed by:	kib, mjg
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19083
2019-02-05 04:47:21 +00:00
Conrad Meyer
c3eb848ce3 extattr_list_vp: Only take shared vnode lock
List is a 'read'-type operation that does not modify shared state; it's safe
for multiple thread to proceed concurrently.  This is reflected in the vnode
operation LISTEXTATTR locking protocol specification, which only requires a
shared lock.

(Similar to previous r248933.)

Reported by:	Case van Rij <case.vanrij AT isilon.com>
Reviewed by:	kib, mjg
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19082
2019-02-05 03:32:58 +00:00
Konstantin Belousov
ccc2d07e77 Update CPUID bits definitions and CPU identification based on changes
in SDM rev. 069.

Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2019-02-04 23:57:59 +00:00
Warner Losh
52467047aa Regularize the Netflix copyright
Use recent best practices for Copyright form at the top of
the license:
1. Remove all the All Rights Reserved clauses on our stuff. Where we
   piggybacked others, use a separate line to make things clear.
2. Use "Netflix, Inc." everywhere.
3. Use a single line for the copyright for grep friendliness.
4. Use date ranges in all places for our stuff.

Approved by: Netflix Legal (who gave me the form), adrian@ (pmc files)
2019-02-04 21:28:25 +00:00
Marius Strobl
bfce461ee9 o As illustrated by e. g. figure 7-14 of the Intel 82599 10 GbE
controller datasheet revision 3.3, in the context of Ethernet
  MACs the control data describing the packet buffers typically
  are named "descriptors". Each of these descriptors references
  one buffer, multiple of which a packet can be composed of.
  By contrast, in comments, messages and the names of structure
  members, iflib(4) refers to DMA resources employed for RX and
  TX buffers (rather than control data) as "desc(riptors)".
  This odd naming convention of iflib(4) made reviewing r343085
  and identifying wrong and missing bus_dmamap_sync(9) calls in
  particular way harder than it already is. This convention may
  also explain why the netmap(4) part of iflib(4) pairs the DMA
  tags for control data with DMA maps of buffers and vice versa
  in calls to bus_dma(9) functions.
  Therefore, change iflib(4) to refer to buf(fers) when buffers
  and not the usual understanding of descriptors is meant. This
  change does not include corrections to the DMA resources used
  in the netmap(4) parts. However, it revises error messages to
  state which kind of allocation/creation failed. Specifically,
  the "Unable to allocate tx_buffer (map) memory" copy & pasted
  inappropriately on several occasions was replaced with proper
  messages.
o Enhance some other error messages to indicate which half - RX
  or TX - they apply to instead of using identical text in both
  cases and generally canonicalize them.
o Correct the descriptions of iflib_{r,t}xsd_alloc() to reflect
  reality; current code doesn't use {r,t}x_buffer structures.
o In iflib_queues_alloc():
  - Remove redundant BUS_DMA_NOWAIT of iflib_dma_alloc() calls,
  - change the M_WAITOK from malloc(9) calls into M_NOWAIT. The
    return values are already checked, deferred DMA allocations
    not being an option at this point, BUS_DMA_NOWAIT has to be
    used anyway and prior malloc(9) calls in this function also
    specify M_NOWAIT.

Reviewed by:	shurd
Differential Revision:	https://reviews.freebsd.org/D19067
2019-02-04 20:46:57 +00:00
Alexander Motin
ed0a3e8637 s/Maximal/Maximum/ in sysctl description.
Submitted by:	smh
MFC after:	1 week
2019-02-04 20:09:22 +00:00
Dimitry Andric
0f166953f7 Use NLDT to get number of LDTs on i386
Compiling a GENERIC kernel for i386 with clang 8.0 results in the
following warning:

/usr/src/sys/i386/i386/sys_machdep.c:542:40: error: 'sizeof ((ldt))' will return the size of the pointer, not the array itself [-Werror,-Wsizeof-pointer-div]
        nldt = pldt != NULL ? pldt->ldt_len : nitems(ldt);
                                              ^~~~~~~~~~~
/usr/src/sys/sys/param.h:299:32: note: expanded from macro 'nitems'
#define nitems(x)       (sizeof((x)) / sizeof((x)[0]))
                         ~~~~~~~~~~~ ^

Indeed, 'ldt' is declared as 'union descriptor *', so nitems() is not
the right way to determine the number of LDTs.  Instead, the NLDT define
from sys/x86/include/segments.h should be used.

Reviewed by:	kib
MFC after:	3 days
Differential Revision: https://reviews.freebsd.org/D19074
2019-02-04 18:07:03 +00:00
Andrew Turner
2d01f2dee3 Only enable trace-cmp on Clang and modern GCC.
It's was only added to GCC 8.1 so don't try to enable it for earlier
releases.

Reported by:	lwhsu
Sponsored by:	DARPA, AFRL
2019-02-04 16:55:24 +00:00
Alexander Motin
ef08154150 Add missed tunables/sysctls for some new vdev variables.
While there, make few existing sysctls writeable, since there is no reason
not to.

MFC after:	1 week
2019-02-04 16:13:41 +00:00
Leandro Lupori
6174048251 powerpc64: Add a trap stack area
Currently, the trap code switches to the the temporary stack in the dbtrap
section. It works in most cases, but in the beginning of the execution, the
temp stack is being used, as starting in the powerpc_init() code.

In this current scenario, the stack is being overwritten, which causes the
return of breakpoint() to take abnormal execution.

This current patchset create a small stack to use by the dbtrap: codepath
avoiding the corruption of the temporary stack.

PR:		224872
Submitted by:	breno.leitao_gmail.com
Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D14484
2019-02-04 16:02:03 +00:00
Cy Schubert
5c4611cd3f Remove two more #ifdefs missed in r343701.
MFC after:	1 month
X-MFC with:	r343701
2019-02-04 05:37:16 +00:00
Alexander Motin
6a69d2a400 Use switch instead of chained if/else to improve readability.
Submitted by:	Ryan Moeller <ryan@freqlabs.com>
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D19051
2019-02-04 01:20:56 +00:00
Konstantin Belousov
f02bc51c09 Do not call PHOLD() while owning the allproc_lock sx.
Otherwise the lock might recurse in faultin() if the process is
swapped out.

Reported by:	zeising
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-02-03 21:31:40 +00:00
Konstantin Belousov
cbb65b7ec5 i386: Do not ever store to other-CPU counter64 slot.
On CPUs supporting cmpxchg8b, fetch is performed by cmpxchg8b on
corresponding CPU slot, which unconditionally write to the slot.  If
for that slot, the owner CPU increments it, then both CPUs might run
the cmpxchg8b instruction concurrently and this might race and
override the incremental write.  So the counter update would be lost.

Fix it by implementing fetch as IPI and accumulation of result.  It is
acceptable for rare counter64 fetch operation to be more expensive.

Diagnosed and tested by:	Andreas Longwitz <longwitz@incore.de>
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2019-02-03 21:28:58 +00:00
Mark Johnston
1e2b3e6f92 Allow vm_page_free_prep() to dequeue pages without the page lock.
This is a step towards being able to free pages without the page
lock held.  The approach is simply to add an implementation of
vm_page_dequeue_deferred() which does not assert that the page
lock is held.  Formally, the page lock is required to set
PGA_DEQUEUE, but in the case of vm_page_free_prep() we get the
same mutual exclusion for free by virtue of the fact that no
other references to the page may exist.

No functional change intended.

Reviewed by:	kib (previous version)
MFC after:	2 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D19065
2019-02-03 18:43:20 +00:00
Mark Johnston
d0488e698f Fix a race in vm_page_dequeue_deferred().
To detect the case where the page is already marked for a deferred
dequeue, we must read the "queue" and "aflags" fields in a
precise order.  Otherwise, a race with a concurrent
vm_page_dequeue_complete() could leave the page with PGA_DEQUEUE
set despite it already having been dequeued.  Fix the problem by
using vm_page_queue() to check the queue state, which correctly
handles the race.

Reviewed by:	kib
Tested by:	pho
MFC after:	3 days
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D19039
2019-02-03 18:38:58 +00:00
Andrew Turner
634a8a8873 Enable COVERAGE and KCOV by default on arm64 and amd64.
This allows userspace to trace the kernel using the coverage sanitizer
found in clang. It will also allow other coverage tools to be built as
modules and attach into the same framework.

Sponsored by:	DARPA, AFRL
2019-02-03 12:46:27 +00:00
Gleb Smirnoff
3ca1c423aa Teach pfil_ioctl() about VIMAGE.
Submitted by:	gallatin
2019-02-03 08:28:02 +00:00
Cy Schubert
4ca6f22e91 new_kmem_alloc(9) is a Solaris/illumos malloc(9). FreeBSD and NetBSD
never get here, however a test for SOLARIS, as redundant as this test is,
serves to document that this is the illumos definition. This should help
those who come after me to follow the code more easily.

MFC after:	1 month
2019-02-03 05:26:10 +00:00
Cy Schubert
e82e8246fc Remove a reference to HP-UX in a comment.
MFC after:	1 month
2019-02-03 05:26:04 +00:00
Cy Schubert
0fcd8cab4e ipfilter #ifdef cleanup.
Remove #ifdefs for ancient and irrelevant operating systems from
ipfilter.

When ipfilter was written the UNIX and UNIX-like systems in use
were diverse and plentiful. IRIX, Tru64 (OSF/1) don't exist any
more. OpenBSD removed ipfilter shortly after the first time the
ipfilter license terms changed in the early 2000's. ipfilter on AIX,
HP/UX, and Linux never really caught on. Removal of code for operating
systems that ipfilter will never run on again will simplify the code
making it easier to fix bugs, complete partially implemented features,
and extend ipfilter.

Unsupported previous version FreeBSD code and some older NetBSD code
has also been removed.

What remains is supported FreeBSD, NetBSD, and illumos. FreeBSD and
NetBSD have collaborated exchanging patches, while illumos has expressed
willingness to have their ipfilter updated to 5.1.2, provided their
zone-specific updates to their ipfilter are merged (which are of interest
to FreeBSD to allow control of ipfilters in jails from the global zone).

Reviewed by:	glebius@
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D19006
2019-02-03 05:25:49 +00:00
Andriy Voskoboinyk
1c4cb65153 net80211(4): do not setup Tx parameters for unsupported modes.
That should shorten 'ifconfig <wlan> list txparam' output since
unsupported modes will not be shown.

Checked with RTL8188EE, STA mode.

MFC after:	2 weeks
2019-02-03 04:31:50 +00:00
Andriy Voskoboinyk
2ce6d2b58c net80211(4): fix rate check when 'roaming' ifconfig(8) option is set to 'auto'
Do not try to clear 'basic rate' bit from roamRate; it cannot be here and,
actually, this operation clears 'MCS rate' bit instead, breaking comparison
for 11n / 11ac modes.

Tested with RTL8188CUS, HOSTAP mode + RTL8821AU, STA mode.

MFC after:	3 days
2019-02-03 02:32:13 +00:00
Andriy Voskoboinyk
511e2766f1 net80211(4): do not setup roaming parameters for unsupported modes.
ifconfig(8) prints per-mode parameters if they are non-zero; since
we have 13 possible modes with 3...5 typically supported this change
should greatly reduce amount of information for 'ifconfig <wlan> list roam'
command.

While here ensure that sta_roam_check() will not use roaming parameters
for unsupported modes (it should not).

This change effectively reverts r188776.

MFC after:	2 weeks
2019-02-03 01:32:02 +00:00
Vincenzo Maffione
5faab77822 netmap: upgrade sync-kloop support
Add SYNC_KLOOP_MODE option, and add support for direct mode, where application
executes the TXSYNC and RXSYNC in the context of the ioeventfd wake up callback.

MFC after:	5 days
2019-02-02 22:39:29 +00:00
Patrick Kelsey
769d56eccf Fix interrupt index configuratoin when using MSI interrupts.
When in MSI mode, the device was only being configured with one
interrupt index, but it needs two - one for the actual interrupt and
one to park the tx queue at.

Also clarified comments relating to interrupt index assignment.

Reported by:	Yuri Pankov <yuripv@yuripv.net>
MFC after:	1 day
2019-02-02 21:14:53 +00:00
Andriy Voskoboinyk
378478f9fc Drop unused M_80211_COM malloc(9) type.
It is not used since r287197.

MFC after:	3 days
2019-02-02 16:23:45 +00:00
Andriy Voskoboinyk
4ab4d681f3 Do not acquire IEEE80211_LOCK twice in cac_timeout(); reuse
locked function instead.

It is externally visible since r257065.

MFC after:	5 days
2019-02-02 16:21:23 +00:00
Andriy Voskoboinyk
4215ce4820 sys/dev/wtap: Check return value from malloc(..., M_NOWAIT) and
drop unneeded cast.

MFC after:	3 days
2019-02-02 16:15:46 +00:00
Andriy Voskoboinyk
6ecec3817e run(4): fix allocated memory type for ieee80211_node(4).
PR:		177366
MFC after:	3 days
2019-02-02 16:07:56 +00:00
Andriy Voskoboinyk
943607571a run(4): revert previous commit; there were no compiler warning
(at least, from clang(1)).
2019-02-02 16:06:06 +00:00
Andriy Voskoboinyk
bce0fd800a run(4): fix allocated memory type and -Wincompatible-pointer-types
compiler warning.

PR:		177366
MFC after:	3 days
2019-02-02 16:01:16 +00:00
Gleb Smirnoff
d38ca3297c Return PFIL_CONSUMED if packet was consumed. While here gather all
the identical endings of pf_check_*() into single function.

PR:		235411
2019-02-02 05:49:05 +00:00
Justin Hibbits
d49fc192c1 powerpc/powernv: Add a driver for the POWER9 XIVE interrupt controller
The XIVE (External Interrupt Virtualization Engine) is a new interrupt
controller present in IBM's POWER9 processor.  It's a very powerful,
very complex device using queues and shared memory to improve interrupt
dispatch performance in a virtualized environment.

This yields a ~10% performance improvment over the XICS emulation mode,
measured in both buildworld, and 'dd' from nvme to /dev/null.

Currently, this only supports native access.

MFC after:	1 month
2019-02-02 04:15:16 +00:00
Alexander Motin
59568a0e52 Fix integer math overflow in UMA hash_alloc().
512GB of ZFS ABD ARC means abd_chunk zone of 128M 4KB items.  To manage
them UMA tries to allocate 2GB hash table, which size does not fit into
the int variable, causing later allocation failure, which makes ARC shrink
back below the 512GB, not letting it to use more RAM.  With this change I
easily reached >700GB ARC size on 768GB RAM machine.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2019-02-02 04:11:59 +00:00
Conrad Meyer
f4d8b4f81c qlnxr(4), qlnxe(4): Unbreak gcc build
Remove redundant definitions and conditionalize Clang-specific CFLAGS.

Sponsored by:	Dell EMC Isilon
2019-02-01 23:04:45 +00:00
Konstantin Belousov
a6786c1799 Disable boot-time memory test on i386 be default.
With the current 24G memory limit for GENERIC, the boot time test
causes quite visible delay, amplified by the default
debug.late_console = 0.

The comment text is copied from the same setting explanation for
amd64.

Suggested by:	bde
Discussed with:	emaste
Sponsored by:	The FreeBSD Foundation
MFC after:	2 months
2019-02-01 21:09:36 +00:00
Konstantin Belousov
c3f5a36651 x86: correctly limit max memory resource address..
CPU and buses can manage up to the limit reported by cpu_maxphyaddr,
so set mem_rman to the value returned by cpu_getmaxphyaddr().  For the
PAE mode, it was missed both when rman_res_t was increased to
uintmax_t, and from the PAE merge commit.

When importing smaps or dump_avail chunks into memory rman, do not
blindly ignore resources which ends above the limit, chomp them
instead if start is below the limit.  The same change was already done
to i386 add_physmap_entry().

Based on the submission by:	bde
MFC after:	2 months
2019-02-01 20:46:47 +00:00
Navdeep Parhar
cb7c3f124a cxgbe(4): Improved error reporting and diagnostics.
"slow" interrupt handler:
- Expand the list of INT_CAUSE registers known to the driver.
- Add decode information for many more bits but decouple it from the
  rest of intr_info so that it is entirely optional.
- Call t4_fatal_err exactly once, and from the top level PL intr handler.

t4_fatal_err:
- Use t4_shutdown_adapter from the common code to stop the adapter.
- Stop servicing slow interrupts after the first fatal one.

Driver/firmware interaction:
- CH_DUMP_MBOX: note whether the mailbox being dumped is a command or a
  reply or something else.
- Log the raw value of pcie_fw for some errors.
- Use correct log levels (debug vs. error).

Sponsored by:	Chelsio Communications
2019-02-01 20:42:49 +00:00
Bruce Evans
90bdbe955c Fix function keys for syscons in cons25 mode (vidcontrol -T cons25).
kbd(4) (but only documented in atkbd(4)) maintains a table of strings
for 96 function keys.  Using teken broke this 9+ years ago for the
most usable first 12 function keys and for 10 cursor keys, by supplying
its own non-programmable strings so that the keyboard driver's strings
are not used.

Fix this by supplying NULL in the teken layer for syscons in cons25 mode
so that the the strings are found in the kbd(4) layer.

vt needs more changes to use kbd(4)'s tables.  Teken's cons25 table is
still needed to supply nonempty strings for vt in cons25 mode.

Keep using teken's xterm tables for both syscons and vt in xterm mode.
Function keys should at least default to xterm values in xterm mode,
and kbd(4) doesn't support this.

teken_set_cons25() sets a sticky flag to ask for the fix, and space is
reserved for another new flag.  vt should set this flag when it uses
kbd(4)'s tables.

PR:		226553 (for vt)
2019-02-01 16:07:49 +00:00
Michael Tuexen
116ef4d6e7 When handling SYN-ACK segments in the SYN-RCVD state, set tp->snd_wnd
consistently.

This inconsistency was observed when working on the bug reported in
PR 235256, although it does not fix the reported issue. The fix for
the PR will be a separate commit.

PR:			235256
Reviewed by:		rrs@, Richard Scheffenegger
MFC after:		3 days
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D19033
2019-02-01 12:33:00 +00:00
Gleb Smirnoff
547392731f Repair siftr(4): PFIL_IN and PFIL_OUT are defines of some value, relying
on them having particular values can break things.
2019-02-01 08:10:26 +00:00
Gleb Smirnoff
647b604144 Unbreak call to ipf_check(): it expects the out parameter to be 0 or 1.
Pointy hat to:	glebius
Reported by:	cy
2019-02-01 07:48:37 +00:00
Gleb Smirnoff
2790ca97d9 Fix build without INET6. 2019-02-01 00:33:17 +00:00
Brooks Davis
12aec82c09 Remove iBCS2: also remove xenix syscall function support.
Missed in r342243.
2019-01-31 23:01:12 +00:00
Gleb Smirnoff
b252313f0b New pfil(9) KPI together with newborn pfil API and control utility.
The KPI have been reviewed and cleansed of features that were planned
back 20 years ago and never implemented.  The pfil(9) internals have
been made opaque to protocols with only returned types and function
declarations exposed. The KPI is made more strict, but at the same time
more extensible, as kernel uses same command structures that userland
ioctl uses.

In nutshell [KA]PI is about declaring filtering points, declaring
filters and linking and unlinking them together.

New [KA]PI makes it possible to reconfigure pfil(9) configuration:
change order of hooks, rehook filter from one filtering point to a
different one, disconnect a hook on output leaving it on input only,
prepend/append a filter to existing list of filters.

Now it possible for a single packet filter to provide multiple rulesets
that may be linked to different points. Think of per-interface ACLs in
Cisco or Juniper. None of existing packet filters yet support that,
however limited usage is already possible, e.g. default ruleset can
be moved to single interface, as soon as interface would pride their
filtering points.

Another future feature is possiblity to create pfil heads, that provide
not an mbuf pointer but just a memory pointer with length. That would
allow filtering at very early stages of a packet lifecycle, e.g. when
packet has just been received by a NIC and no mbuf was yet allocated.

Differential Revision:	https://reviews.freebsd.org/D18951
2019-01-31 23:01:03 +00:00
Brooks Davis
90f2d5012a Regen after r342190.
Differential Revision:	https://reviews.freebsd.org/D18444
2019-01-31 22:58:17 +00:00
Konstantin Belousov
7674dce0a4 nvdimm: only enumerate present nvdimm devices
Not all child devices of the NVDIMM root device represent DIMM devices
which are present in the system. The spec says (ACPI 6.2, sec 9.20.2):

    For each NVDIMM present or intended to be supported by platform,
    platform firmware also exposes an NVDIMM device ... under the
    NVDIMM root device.

Present NVDIMM devices are found by walking all of the NFIT table's
SPA ranges, then walking the NVDIMM regions mentioned by those SPA
ranges.

A set of NFIT walking helper functions are introduced to avoid the
need to splat the enumeration logic across several disparate
callbacks.

Submitted by:	D Scott Phillips <d.scott.phillips@intel.com>
Sponsored by:	Intel Corporation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D18439
2019-01-31 22:47:04 +00:00
Konstantin Belousov
7dcbca8d67 nvdimm: enumerate NVDIMM SPA ranges from the root device
Move the enumeration of NVDIMM SPA ranges from the spa GEOM class
initializer into the NVDIMM root device. This will be necessary for a
later change where NVDIMM namespaces require NVDIMM device enumeration
to be reliably ordered before SPA enumeration.

Submitted by:	D Scott Phillips <d.scott.phillips@intel.com>
Sponsored by:	Intel Corporation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D18734
2019-01-31 22:43:20 +00:00
Gleb Smirnoff
eec189c70b Add new m_ext type for data for M_NOFREE mbufs, which doesn't actually do
anything except several assertions.  This type is going to be used for
temporary on stack mbufs, that point into data in receive ring of a NIC,
that shall not be freed.  Such mbuf can not be stored or reallocated, its
life time is current context.
2019-01-31 22:37:28 +00:00
Mark Johnston
919e7b5359 Prevent some kobj memory allocation failures from panicking the system.
Parts of the kobj(9) KPI assume a non-sleepable context for the purpose
of internal memory allocations, but currently have no way to signal an
allocation failure to the caller, so they just panic in this case.  This
can occur even when kobj_create() is called with M_WAITOK.  Fix some
instances of the problem by plumbing wait flags from kobj_create() through
internal subroutines.  Change kobj_class_compile() to assume a sleepable
context when called externally, since all existing callers use it in a
sleepable context.

To fix the problem fully the kobj_init() KPI must be changed.

Reported and tested by:	pho
Reviewed by:	kib (previous version)
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19023
2019-01-31 22:27:39 +00:00
Eric Joyner
7aad1f4edc ix(4),ixv(4): Fix TSO offloads when TXCSUM is disabled
This patch and commit message are based on r340256 created by Jacob Keller:

The iflib stack does not disable TSO automatically when TXCSUM is
disabled, instead assuming that the driver will correctly handle TSOs
even when CSUM_IP is not set.

This results in iflib calling ixgbe_isc_txd_encap with packets which have
CSUM_IP_TSO, but do not have CSUM_IP or CSUM_IP_TCP set. Because of
this, ixgbe_tx_ctx_setup will not setup the IPv4 checksum offloading.

This results in bad TSO packets being sent if a user disables TXCSUM
without disabling TSO.

Fix this by updating the ixgbe_tx_ctx_setup function to check both
CSUM_IP and CSUM_IP_TSO when deciding whether to enable checksums.

Once this is corrected, another issue for TSO packets is revealed. The
driver sets IFLIB_NEED_ZERO_CSUM in order to enable a work around that
causes the ip->sum field to be zero'd. This is necessary for ix
hardware to correctly perform TSOs.

However, if TXCSUM is disabled, then the work around is not enabled, as
CSUM_IP will not be set when the iflib stack checks to see if it should
clear the sum field.

Fix this by adding IFLIB_TSO_INIT_IP to the iflib flags for the ix and
ixv interface files.

Once both of these changes are made, the ix and ixv drivers should
correctly offload TSO packets when TSO offload is enabled, regardless
of whether TXCSUM is enabled or disabled.

Submitted by:	Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Reviewed by:	IntelNetworking
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D18470
2019-01-31 21:53:03 +00:00
Eric Joyner
b2c1e8e620 ix(4): Run {mod,msf,mbx,fdir,phy}_task in if_update_admin_status
From Piotr:

This patch introduces adapter->task_requests register responsible for
recording requests for mod_task, msf_task, mbx_task, fdir_task and
phy_task calls. Instead of enqueueing these tasks with
GROUPTASK_ENQUEUE, handlers will be called directly from
ixgbe_if_update_admin_status() while holding ctx lock.

SIOCGIFXMEDIA ioctl() call reads adapter->media list. The list is
deleted and rewritten in ixgbe_handle_msf() task without holding ctx
lock. This change is needed to maintain data coherency when sharing
adapter info via ioctl() calls.

Patch co-authored by Krzysztof Galazka <krzysztof.galazka@intel.com>.

PR:		221317
Submitted by:	Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Reviewed by:	sbruno@, IntelNetworking
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D18468
2019-01-31 21:44:33 +00:00
John Baldwin
829c56fc08 Don't set IFCAP_TXRTLMT during lagg_clone_create().
lagg_capabilities() will set the capability once interfaces supporting
the feature are added to the lagg.  Setting it on a lagg without any
interfaces is pointless as the if_snd_tag_alloc call will always fail
in that case.

Reviewed by:	hselasky, gallatin
MFC after:	2 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D19040
2019-01-31 21:35:37 +00:00
Gleb Smirnoff
f712b16127 Revert r316461: Remove "IPFW static rules" rmlock, and use pfil's global lock.
The pfil(9) system is about to be converted to epoch(9) synchronization, so
we need [temporarily] go back with ipfw internal locking.

Discussed with:	ae
2019-01-31 21:04:50 +00:00
Konstantin Belousov
f8d49128a9 Make iflib a loadable module: add seemingly missed header.
Reported by:	CI (i.e. it is not reproducable in my local builds)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2019-01-31 20:04:18 +00:00
Konstantin Belousov
c75f49f7d8 Make iflib a loadable module.
iflib is already a module, but it is unconditionally compiled into the
kernel.  There are drivers which do not need iflib(4), and there are
situations where somebody might not want iflib in kernel because of
using the corresponding driver as module.

Reviewed by:	marius
Discussed with:	erj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D19041
2019-01-31 19:05:56 +00:00
Gleb Smirnoff
37125720b9 In zone_alloc_bucket() max argument was calculated based on uz_count.
Then bucket_alloc() also selects bucket size based on uz_count. However,
since zone lock is dropped, uz_count may reduce. In this case max may
be greater than ub_entries and that would yield into writing beyond end
of the allocation.

Reported by:	pho
2019-01-31 17:52:48 +00:00
Konstantin Belousov
75fe717698 Reserve a bit in the FreeBSD feature control note for marking the
image as not compatible with ASLR.

Requested by:	emaste
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
Differential revision:	https://reviews.freebsd.org/D5603
2019-01-31 15:44:49 +00:00
Andriy Voskoboinyk
838b61c1f0 bwn(4): reuse ieee80211_tx_complete function.
MFC after:	1 week
2019-01-31 11:12:31 +00:00