133576 Commits

Author SHA1 Message Date
erj
fd57b3917c Revert r339634.
That commit is causing kernel panics in em(4), so this will be reverted
until those are fixed.

Reported by:	ae@, pho@, et al
Sponsored by:	Intel Corporation
2018-10-23 17:06:36 +00:00
markj
6c262608dd Refactor domainset iterators for use by malloc(9) and UMA.
Before this change we had two flavours of vm_domainset iterators: "page"
and "malloc".  The latter was only used for kmem_*() and hard-coded its
behaviour based on kernel_object's policy.  Moreover, its use contained
a race similar to that fixed by r338755 since the kernel_object's
iterator was being run without the object lock.

In some cases it is useful to be able to explicitly specify a policy
(domainset) or policy+iterator (domainset_ref) when performing memory
allocations.  To that end, refactor the vm_domainset_* KPI to permit
this, and get rid of the "malloc" domainset_iter KPI in the process.

Reviewed by:	jeff (previous version)
Tested by:	pho (part of a larger patch)
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17417
2018-10-23 16:35:58 +00:00
ae
91cf1d92ac Add the check that current VNET is ready and access to srchash is allowed.
This change is similar to r339646. The callback that checks for appearing
and disappearing of tunnel ingress address can be called during VNET
teardown. To prevent access to already freed memory, add a check to the
callback and an epoch_wait() call to be sure that the callback has finished its
work.

MFC after:	20 days
2018-10-23 13:11:45 +00:00
ae
4af93b38a4 Add the check that current VNET is ready and access to srchash is
allowed.

The ipsec_srcaddr() callback can be called during VNET teardown, since
the ingress address checking subsystem isn't VNET specific. Thus the
callback can access already freed memory. To prevent this,
use the V_ipsec_idhtbl pointer as an indicator of VNET readiness, and call
epoch_wait() after resetting it to NULL in vnet_ipsec_uninit() to
be sure that ipsec_srcaddr() has finished its work.

Reported by:	kp
MFC after:	20 days
2018-10-23 13:03:03 +00:00
glebius
8050a0601b Fix ipw_start(), where logic was reverted in r287197.
PR:		232554
Submitted by:	gl00my@mail.ru
2018-10-23 12:53:09 +00:00
ae
ef5ae8bd0d Remove softc from idhash when interface is destroyed.
MFC after:	20 days
2018-10-23 12:50:28 +00:00
vmaffione
7b9456a050 netmap: align codebase to the current upstream (sha 8374e1a7e6941)
Changelist:
    - Move large parts of VALE code to a new file and header netmap_bdg.[ch].
      This is useful to reuse the code within upcoming projects.
    - Improvements and bug fixes to pipes and monitors.
    - Introduce nm_os_onattach(), nm_os_onenter() and nm_os_onexit() to
      handle differences between FreeBSD and Linux.
    - Introduce some new helper functions to handle more host rings and fake
      rings (netmap_all_rings(), netmap_real_rings(), ...)
    - Added new sysctl to enable/disable hw checksum in emulated netmap mode.
    - nm_inject: add support for NS_MOREFRAG

Approved by:	gnn (mentor)
Differential Revision:	https://reviews.freebsd.org/D17364
2018-10-23 08:55:16 +00:00
erj
e76e1c151d iflib: drain enqueued tasks before detaching from taskqgroup
The taskqgroup_detach function does not check if a task is already enqueued when
detaching it. This may lead to a kernel panic if an enqueued task starts after
the context state lock is destroyed. Ensure that the already enqueued admin tasks
are executed before detaching them.

The issue was discovered during validation of D16429. Unloading of if_ixlv
followed by immediate removal of VFs with iovctl -D may lead to panic on
NODEBUG kernel.

As well, check if iflib is in detach before enqueueing new admin or iov
tasks, to prevent new tasks from executing while the taskqgroup tasks
are being drained.

Submitted by:	Krzysztof Galazka <krzysztof.galazka@intel.com>
Reviewed by:	shurd@, erj@
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D17404
2018-10-23 04:37:29 +00:00
jhibbits
59f44d20be dpaa: Mark BMan and QMan as earlier driver modules
The BMan softc must exist when dtsec devices are created, else a NULL
pointer is dereferenced.  QMan likely as well.  Until now, we have relied on
order within the fdt parsing to attach correctly, but this obviously is not
foolproof.  Mark these as BUS_PASS_SUPPORTDEV so they're probed and attached
explicitly before dtsec devices.
2018-10-23 01:56:52 +00:00
np
95d351a2dc cxgbe(4): improve the accuracy of various TSO limits reported to the kernel.
Sponsored by:	Chelsio Communications
2018-10-22 23:57:59 +00:00
np
4ebaf1bd4b cxgbe(4): Use automatic cidx updates with ofld and ctrl queues.
The bits that explicitly request cidx updates do not work reliably with
all possible WRs that can be sent over the queue.  The F_FW_WR_EQUIQ
requests that still remain may also have to be replaced with explicit
credit flush WRs in the future.

MFC after:	2 days
Sponsored by:	Chelsio Communications
2018-10-22 23:06:23 +00:00
brooks
fcc5d25798 Consolidate identical ELF auxargs type definitions.
All platforms except powerpc use the same values and powerpc shares a
majority of them.

Go ahead and declare AT_NOTELF, AT_UID, and AT_EUID in favor of the
unused AT_DCACHEBSIZE, AT_ICACHEBSIZE, and AT_UCACHEBSIZE for powerpc.

Reviewed by:	jhb, imp
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17397
2018-10-22 22:24:32 +00:00
brooks
bb485f0240 Remove the need for backslashes in syscalls.master.
Join non-special lines together until we hit a line containing a '}'
character. This allows the function declaration body to be split
across multiple lines without backslash continuation characters.

Continue to join lines ending with backslashes to allow gradual
migration and to support out-of-tree syscall vectors.

Reviewed by:	emaste, kib
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17488
2018-10-22 22:13:00 +00:00
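The line-joining rule described in bb485f0240 can be modelled with a short Python sketch. This is an illustration, not the actual generator script. The definition of a "special" line (blank, or starting with ';' or '#') and the rule that a declaration is opened by a line containing '{' are assumptions made for this example, and the function name join_syscall_lines is hypothetical.

```python
def join_syscall_lines(lines):
    """Join the physical lines of a syscalls.master-like input into logical lines.

    Assumptions made for this sketch (not taken from the commit):
    - blank lines and lines starting with ';' or '#' are "special" and are
      passed through untouched;
    - a declaration is opened by a line containing '{' and is complete once
      the accumulated text contains '}'.
    """
    out = []
    pending = ""      # text accumulated for the current logical line
    joining = False   # True while inside an unfinished declaration

    for raw in lines:
        line = raw.rstrip("\n")

        # Special lines are only recognised outside of a declaration.
        if not joining and not pending:
            if line.strip() == "" or line.lstrip().startswith((";", "#")):
                out.append(line)
                continue

        # Legacy style: a trailing backslash always continues the line.
        if line.endswith("\\"):
            pending += line[:-1]
            continue

        pending += line
        if joining or "{" in pending:
            if "}" in pending:
                # Declaration is complete: emit it as one logical line.
                out.append(pending)
                pending = ""
                joining = False
            else:
                joining = True
        else:
            # Lines without a declaration body are emitted as-is.
            out.append(pending)
            pending = ""

    if pending:
        out.append(pending)
    return out
```

Both the new multi-line style and the old backslash style collapse to the same kind of logical line, which is what allows a gradual migration.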
brooks
e6edb81da3 Regen after r339622.
Note: changes to freebsd32 syscalls.master impacted no generated files.
2018-10-22 21:51:59 +00:00
brooks
94492cb893 Remove __restrict qualifiers from syscalls.master.
The restrict qualifier is intended to aid code generation in the
compiler, but the only access to storage through these pointers is via
structs using copyin/copyout and the like which can not be written in C
or C++ and thus the compiler gains nothing from the qualifiers.

As such, the qualifiers add no value in current usage.

Reviewed by:	kib
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17574
2018-10-22 21:50:43 +00:00
jhb
439be2ca2b A couple of style fixes in recent TCP changes.
- Add a blank line before a block comment to match other block comments
  in the same function.
- Sort the prototype for sbsndptr_adv and fix whitespace between return
  type and function name.

Reviewed by:	gallatin, bz
Differential Revision:	https://reviews.freebsd.org/D17474
2018-10-22 21:17:36 +00:00
tijl
babda82d29 Define linuxkpi readq for 64-bit architectures. It is used by drm-kmod.
Currently the compiler picks up the definition in machine/cpufunc.h.

Add compiler memory barriers to read* and write*.  The Linux x86
implementation of these functions uses inline asm with "memory" clobber.
The Linux x86 implementation of read_relaxed* and write_relaxed* uses the
same inline asm without "memory" clobber.

Implement ioread* and iowrite* in terms of read* and write* so they also
have memory barriers.

Qualify the addr parameter in write* as volatile.

Like Linux, define macros with the same name as the inline functions.

Only define 64-bit versions on 64-bit architectures because generally
32-bit architectures can't do atomic 64-bit loads and stores.

Regroup the functions a bit and add brief comments explaining what they do:
- __raw_read*, __raw_write*: atomic, no barriers, no byte swapping
- read_relaxed*, write_relaxed*: atomic, no barriers, little-endian
- read*, write*: atomic, with barriers, little-endian

Add a comment that says our implementation of ioread* and iowrite*
only handles MMIO and does not support port IO.

Reviewed by:	hselasky
MFC after:	3 days
2018-10-22 20:55:35 +00:00
markj
0dd92926d8 Make it possible to disable NUMA support with a tunable.
This provides a chicken switch for anyone negatively impacted by
enabling NUMA in the amd64 GENERIC kernel configuration.  With
NUMA disabled at boot-time, information about the NUMA topology
is not exposed to the rest of the kernel, and all of physical
memory is viewed as coming from a single domain.

This method still has some performance overhead relative to disabling
NUMA support at compile time.

PR:		231460
Reviewed by:	alc, gallatin, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17439
2018-10-22 20:13:51 +00:00
cem
7355e92510 Update to Zstandard 1.3.7
Relnotes:	yes
Sponsored by:	Dell EMC Isilon
2018-10-22 18:29:12 +00:00
cem
7451598e04 Conditionalize kern.tty_info_kstacks feature on STACKS option
Fix tinderbox (mips XLPN32) after r339471.

Reported by:	tinderbox
X-MFC-With:	r339471
Sponsored by:	Dell EMC Isilon
2018-10-22 17:42:57 +00:00
markj
816ca801ac Fix the build after r339601.
I committed some patches out of order and didn't build-test one of them.

Reported by:	Jenkins, O. Hartmann <ohartmann@walstatt.org>
X-MFC with:	r339601
2018-10-22 17:19:48 +00:00
markj
87de4fcb0a Avoid a redundancy in a comment updated by r339601.
Reported by:	alc
X-MFC with:	r339601
2018-10-22 17:17:30 +00:00
markj
9ce499cca5 Swap in processes unless there's a global memory shortage.
On NUMA systems, we would not swap in processes unless all domains
had some free pages.  This is too conservative in general.  Instead,
permit swapins so long as at least one domain has free pages, and add
a kernel stack NUMA policy which ensures that we will try to allocate
kernel stack pages from any domain.

Reported and tested by:	pho, Jan Bramkamp <crest@bultmann.eu>
Reviewed by:	alc, kib
Discussed with:	jeff
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17304
2018-10-22 17:04:04 +00:00
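The policy change in 9ce499cca5 amounts to replacing an "all domains" test with an "any domain" test, plus a stack allocation that may fall back to other domains. The Python sketch below is a simplified model: "has free pages" is reduced to a count greater than zero, and the function names and the preferred-domain parameter are hypothetical, not kernel interfaces.

```python
def may_swap_in_old(free_pages_by_domain):
    """Old behaviour: every NUMA domain must have some free pages."""
    return all(n > 0 for n in free_pages_by_domain)


def may_swap_in_new(free_pages_by_domain):
    """New behaviour: one domain with free pages is enough."""
    return any(n > 0 for n in free_pages_by_domain)


def pick_stack_domain(free_pages_by_domain, preferred):
    """Model of a kernel stack policy that may try any domain.

    Starts at the preferred domain and walks the others in order;
    returns the chosen domain index, or None if none has free pages.
    """
    count = len(free_pages_by_domain)
    for offset in range(count):
        domain = (preferred + offset) % count
        if free_pages_by_domain[domain] > 0:
            return domain
    return None
```

With one depleted domain, for example free counts of [0, 128], the old test refuses the swap-in while the new one permits it, and the stack allocation falls through to domain 1.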
hselasky
782d004253 Make sure returned value is checked and assert a valid refcount.
While at it fix a print: Unsigned types cannot be negative.

Reviewed by:		kib, mjg
Differential revision:	https://reviews.freebsd.org/D17616
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-10-22 16:21:50 +00:00
markj
786751e464 Don't import 0 into vmem quantum caches.
vmem uses UMA cache zones to implement the quantum cache.  Since
uma_zalloc() returns 0 (NULL) to signal an allocation failure, UMA
should not be used to cache resource 0.  Fix this by ensuring that 0 is
never cached in UMA in the first place, and by modifying vmem_alloc()
to fall back to a search of the free lists if the cache is depleted,
rather than blocking in qc_import().

Reported by and discussed with:	Brett Gutstein <bgutstein@rice.edu>
Reviewed by:	alc
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D17483
2018-10-22 16:16:42 +00:00
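The problem fixed by 786751e464 is that a cache whose "empty" return value is 0 cannot also hold the resource 0. The toy Python model below illustrates the two parts of the fix under stated assumptions: the classes QuantumCache and Arena and their methods are hypothetical stand-ins, resources are unit-sized integers, and an exhausted arena is signalled with None purely for the sake of the example.

```python
class QuantumCache:
    """Toy cache whose alloc() returns 0 on failure, like a NULL return."""

    def __init__(self):
        self._items = []

    def alloc(self):
        return self._items.pop() if self._items else 0

    def free(self, addr):
        self._items.append(addr)


class Arena:
    """Toy arena of unit-sized resources fronted by a QuantumCache."""

    def __init__(self, start, end):
        self.free_list = list(range(start, end))
        self.cache = QuantumCache()

    def _search_free_list(self):
        return self.free_list.pop(0) if self.free_list else None

    def alloc(self):
        addr = self.cache.alloc()
        if addr != 0:
            return addr
        # Cache is depleted: fall back to searching the free list
        # instead of waiting for the cache to be refilled.
        return self._search_free_list()

    def release(self, addr):
        if addr == 0:
            # Never cache resource 0: it is indistinguishable from failure.
            self.free_list.insert(0, addr)
        else:
            self.cache.free(addr)
```

Resource 0 stays allocatable through the free list, while every value that the cache hands out is unambiguous.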
markj
741a850d6c Fix style bugs in in6_pcblookup_lbgroup().
This should have been a part of r338470.  No functional changes
intended.

Reported by:	gallatin
Reviewed by:	gallatin, Johannes Lundberg <johalun0@gmail.com>
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17109
2018-10-22 16:09:01 +00:00
glebius
e30533f381 If we lost the race or were migrated during bucket allocation for the per-CPU
cache, then we put the new bucket on the generic bucket cache. However, the code didn't
honor the UMA_ZONE_NOBUCKETCACHE flag, so potentially we could start a cache
on a zone that clearly forbids that. Fix this.

Reviewed by:	markj
2018-10-22 15:48:07 +00:00
avg
c98294c273 nfsrvd_readdirplus: for some errors, do not fail the entire request
Instead, a failing entry is skipped.
This change consists of two logical changes.

A failure to vget or lookup an entry is considered to be a result of a
concurrent removal, which is the only reasonable explanation given that
the filesystem is busied.  So, the entry would be silently skipped.

In the case of a failure to get attributes of an entry for an NFSv3
request, the entry would be silently skipped.  There can be legitimate
reasons for the failure, but NFSv3 does not provide any means to report
the error, so we have two options: either fail the whole request or
ignore the failed entry.  Traditionally, the old NFS server used the
latter option, so the code is reverted to it.  Making the whole
directory unreadable because of a single entry seems to be impractical.

Additionally, some bits of code are slightly re-arranged to account for
the new control flow and to honor style(9).

Reviewed by:	rmacklem
Sponsored by:	Panzura
Differential Revision: https://reviews.freebsd.org/D15424
2018-10-22 15:33:05 +00:00
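The control flow described in c98294c273 can be sketched in Python as follows. This is a model only: the function readdirplus and its callback parameters are hypothetical, failures are represented by Python exceptions, and the behaviour for versions other than NFSv3 (the error is simply propagated) is an assumption, since the commit message does not describe it.

```python
def readdirplus(names, lookup, get_attrs, nfs_version):
    """Build a list of (name, attrs) entries, skipping some failures.

    lookup(name) returns a node or raises LookupError.
    get_attrs(node) returns attributes or raises OSError.
    """
    entries = []
    for name in names:
        try:
            node = lookup(name)
        except LookupError:
            # Treated as a concurrent removal: skip the entry silently.
            continue
        try:
            attrs = get_attrs(node)
        except OSError:
            if nfs_version == 3:
                # NFSv3 cannot report a per-entry error: skip the entry.
                continue
            # Assumption of this sketch: other versions propagate the error.
            raise
        entries.append((name, attrs))
    return entries
```

A single bad entry therefore no longer makes the whole directory unreadable for an NFSv3 client.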
andrew
f0a3a25e73 Stop advertising ARMv8.3 Pointer Authentication
This needs firmware and kernel support before userspace can use it. Until
then don't advertise it's available.

MFC after:	3 days
2018-10-22 15:18:49 +00:00
andrew
c84f66e25a Fix the ID_AA64ISAR0_EL1 dot product field shift.
It's 44 in the documentation, use this correct value.

MFC after:	3 days
2018-10-22 15:06:14 +00:00
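To see why the shift value in c84f66e25a matters, the following Python sketch extracts the dot product field from an ID_AA64ISAR0_EL1 value. The constant and function names are illustrative, and the 4-bit field width follows the usual layout of the AArch64 ID registers.

```python
# Dot product (DP) field of ID_AA64ISAR0_EL1: bits [47:44].
ID_AA64ISAR0_DP_SHIFT = 44
ID_AA64ISAR0_DP_MASK = 0xF << ID_AA64ISAR0_DP_SHIFT


def dp_field(isar0):
    """Return the 4-bit DP field from a raw ID_AA64ISAR0_EL1 value."""
    return (isar0 & ID_AA64ISAR0_DP_MASK) >> ID_AA64ISAR0_DP_SHIFT
```

With a wrong shift, the mask would cover a neighbouring field, so the dot product feature could be reported incorrectly.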
andrew
3711ea9758 Correctly set the DAIF bits in new threads
We should only unmask interrupts when creating a new thread and leave the
other exceptions in the same state as before creating the thread.

Reported by:	jhibbits
Reviewed by:	jhibbits
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D17497
2018-10-22 14:58:59 +00:00
avg
3d9c3161e3 ichwd: add support for TCO watchdog timer in Lewisburg PCH (C620)
The change is based on public documents listed below as well as Linux
changes and the code developed by Kostik.

The documents:
- Intel® C620 Series Chipset Platform Controller Hub Datasheet
- Intel® 100 Series and Intel® C230 Series Chipset Family Platform
  Controller Hub (PCH) Datasheet - Volume 2 of 2

Interesting Linux commits:
- 9424693035
- 2a7a0e9bf7

The peculiarity of the new chipsets is that the watchdog resources are
configured in PCI registers of SMBus controller and Power Management
function as opposed to the LPC bridge.  I took a simplistic approach of
querying the resources from the respective PCI devices.  ichwd is still
a device on isa bus.  The PCI devices are found by their slot and
function defined in the datasheets as siblings of the upstream LPC
bridge.

There are some shortcuts and missing features.

First of all, I have not implemented the functionality required to clear
the no-reboot bit.  That would require writing to a special PCI
configuration register of a hidden / invisible PCI device after which
the device would start responding to accesses to other registers.  The
no-reboot bit was not set on my test hardware, so I decided to leave its
handling for a later time.

Also, I did not try to handle the case where the watchdog resources are
not configured by the hardware as well as the case where ACPI defined
operational region conflicts with the watchdog resources.  My test
system did not have either of those problems, so, again, I decided to
leave those cases until later.
See this Linux commit for some details of the ACPI problem:
a7ae81952c

Finally, I have added only the PCI ID found on my test system.  I think
that more IDs can be added as the change gets tested.

Tested on Dell PowerEdge R740.

PR:		222079
Reviewed by:	mav, kib
MFC after:	3 weeks
Relnotes:	maybe
Sponsored by:	Panzura
Differential Revision: https://reviews.freebsd.org/D17585
2018-10-22 14:44:44 +00:00
luporl
be45c5c587 ppc64: limited 32-bit DMA address range
Further investigation of issues with 32-bit DMA on PowerNV revealed that
its window is hardcoded by OPAL (at least in skiboot version 5.4.9) and
cannot be changed by the OS.
Thus, now jhb's suggestion of limiting the range in PCI DMA tag seems
the best way to deal with it.

Reviewed by:	jhibbits, nwhitehorn, sbruno
Approved by:	jhibbits(mentor)
Differential Revision:	https://reviews.freebsd.org/D17601
2018-10-22 13:40:50 +00:00
hselasky
aec6da6f62 Resolve deadlock between epoch(9) and various network interface
SX-locks, during if_purgeaddrs(), by not allowing the epoch read lock to be
held over typical network IOCTL code paths. This is a regression
after r334305.

Reviewed by:		ae (network)
Differential revision:	https://reviews.freebsd.org/D17647
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-10-22 13:25:26 +00:00
hselasky
d4ecaca1a8 Added support for formula-based arbitrary baud rates, in contrast to
the current fixed values, which enables use of rates above 1 Mbps.
Improved the detection of HXD chips, and the status flag handling as
well.

Submitted by:		Gabor Simon <gabor.simon75@gmail.com>
PR:			225932
Differential revision:	https://reviews.freebsd.org/D16639
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-10-22 11:58:30 +00:00
whu
9de81ae09f Do not drop UDP traffic when TXCSUM_IPV6 flag is on
PR:		231797
Submitted by:	whu
Reviewed by:	dexuan
Obtained from:	Kevin Morse
MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://bugs.freebsd.org/bugzilla/attachment.cgi?id=198333&action=diff
2018-10-22 11:23:51 +00:00
slavash
cb4d66c6b2 mlx5: Notify user that the ConnectX-6 shut down its port due to power limitation
If power exceeds the slot limit, or the slot limit is unknown, the ConnectX-6
firmware will shut down its port.
Inform the user via a debug message.

MFC after:      3 days
Approved by:    hselasky (mentor), kib (mentor)
Sponsored by:   Mellanox Technologies
2018-10-22 10:38:38 +00:00
hselasky
b54bba1abf The event bytes should be unsigned char.
Found by:		Peter Holm <peter@holm.cc>
MFC after:		3 days
Sponsored by:		Mellanox Technologies
2018-10-22 08:59:20 +00:00
hselasky
22025d42bc Drop sequencer mutex around uiomove() and make sure we don't move more bytes
than are available, else a panic might happen.

Found by:		Peter Holm <peter@holm.cc>
MFC after:		3 days
Sponsored by:		Mellanox Technologies
2018-10-22 08:58:27 +00:00
hselasky
6e163bee84 Fix off-by-one which can lead to panics.
Found by:		Peter Holm <peter@holm.cc>
MFC after:		3 days
Sponsored by:		Mellanox Technologies
2018-10-22 08:55:58 +00:00
mjg
ac1eb54956 amd64: finish the tail in memset with an overlapping store
Instead of finding the exact size to fit in we can just shift the target
by -8 + tail. Doing a blind write to a previously rep stosq'ed area comes
with a penalty so do it conditionally.

Sample win on EPYC when zeroing a 257 sized buffer (tail = 1) aligned to
16 bytes:
before: 44782846 ops/s
after:  46118614 ops/s

Idea stolen from NetBSD.

Sponsored by:	The FreeBSD Foundation
2018-10-22 06:44:20 +00:00
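The overlapping-store idea from ac1eb54956 can be modelled in Python on a bytearray. This is a sketch of the technique, not the amd64 assembly: the function memset_model is hypothetical, the loop of 8-byte slice writes stands in for rep stosq, and sizes below 8 are handled by a plain byte loop as a simplification.

```python
def memset_model(buf, byte, n):
    """Fill buf[0:n] with byte, finishing the tail with an overlapping store."""
    if n < 8:
        # Simplification: small sizes are filled byte by byte.
        for i in range(n):
            buf[i] = byte
        return

    word = bytes([byte]) * 8
    words, tail = divmod(n, 8)

    # Bulk fill in 8-byte words (stands in for rep stosq).
    pos = 0
    for _ in range(words):
        buf[pos:pos + 8] = word
        pos += 8

    # Only touch the already filled area when there is a tail to finish.
    if tail:
        # Shift the target by -8 + tail so that one 8-byte store ends
        # exactly at offset n, overlapping the last filled word.
        pos += -8 + tail
        buf[pos:pos + 8] = word
```

For n = 257 the bulk loop fills bytes 0 to 255, and the final store covers bytes 249 to 256, so no separate 1-byte, 2-byte, or 4-byte tail cases are needed.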
bwidawsk
2127b447f4 acpi: Add an interface to obtain DSM information
The Device Specific Method (_DSM) is an optional object that defines
device specific controls. This will be useful for our power management
controller in upcoming patches. More information can be found in ACPI
spec 6.2 section 9.1.1

https://www.uefi.org/sites/default/files/resources/ACPI_6_2.pdf

This patch had a minor modification changing ENOMEM to AE_NO_MEMORY
after it got review and approval but before committing.

Test Plan: Tested in my s0ix branch

Reviewed by:	kib
Approved by:	emaste (mentor)
Differential Revision: https://reviews.freebsd.org/D17121
2018-10-22 03:29:54 +00:00
imp
906b9eae84 Remove the long obsolete SYM_SETUP_LP_PROBE_MAP option. It's not been
needed for almost 20 years, and is totally useless now that ncr(4) has
been removed.

Relnotes: yes
2018-10-22 02:36:31 +00:00
imp
d44887731c Remove the ncr(4) driver.
This driver has been obsolete since FreeBSD 4.x. It should have
been removed then since the sym(4) driver had subsumed it. The driver
was commented out of GENERIC in 2000.

RelNotes: Yes
2018-10-22 02:36:18 +00:00
imp
0c21ab179e Retire scsi_low
scsi_low was a common set of routines to do the SCSI bus sequencing
for the ncv, nsp and stg drivers. Those have been removed, so it's no
longer needed since nothing else in the tree uses it and nothing
likely ever will (it's for super-low-end 8-bit parallel SCSI cards).
2018-10-22 02:36:07 +00:00
imp
c912dbecce Remove stg(4) driver
stg(4) is marked as gone in 12. Remove it. There are no sightings of
it in the nycbug dmesg database. It was for an obscure SCSI card that
sold mostly in Japan, and was especially popular among pc98 hackers in
the 4.x time frame. It was also only enabled on i386.

Relnote: Yes
2018-10-22 02:35:50 +00:00
imp
2c200a05c0 Remove nsp(4) driver
nsp(4) is marked as gone in 12. Remove it. There are no sightings of
it in the nycbug dmesg database. It was for an obscure SCSI card that
sold mostly in Japan, and was especially popular among pc98 hackers in
the 4.x time frame. It was also only enabled on i386.

Relnote: Yes
2018-10-22 02:35:38 +00:00
imp
bd9eaea6c1 Remove ncv(4) driver
ncv(4) is marked as gone in 12. Remove it. There are no sightings of
it in the nycbug dmesg database. It was for an obscure SCSI card that
sold mostly in Japan, and was especially popular among pc98 hackers in
the 4.x time frame.

Relnote: Yes
2018-10-22 02:35:26 +00:00
imp
a5023bb106 Retire dpt(4)
Marked as gone in 12 and not relevant since the early 90s. No
sightings in nycbug's dmesg database.

Relnotes: yes
2018-10-22 02:35:12 +00:00
imp
3b90b89d12 Remove bt(4) driver
The buslogic scsi driver has been tagged as gone in 12 for some time
now. Remove it. The nycbug dmesg database shows only one sighting in 6
for this driver. It was very popular in the early days of the project,
but that popularity seems to have died by 2004 when the nycbug
database started up.

Relnotes: yes
2018-10-22 02:34:59 +00:00