Commit Graph

268879 Commits

Author SHA1 Message Date
Warner Losh
4aed5c3c9d time_t is pathological: use %j + cast to print it.
Sponsored by:		Netflix
2021-10-01 12:16:10 -06:00
Gleb Smirnoff
b984d153e0 Don't set GELI UMA zone as UMA_ZONE_NOFREE.
That fixes memory leak on last GELI provider destroyed, introduced
in 2dbc9a388e. This patch was originally developed late 2019 and
the flag was necessary to prevent zone drainage under memory pressure.
Today, with f09cbea31a the UMA is fixed not to drain into reserves.

Discussed with:	jtl, markj
Fixes:		2dbc9a388e
PR:		258787
2021-10-01 10:31:17 -07:00
Warner Losh
4b3da659bf nvme: Only reset once on attach.
The FreeBSD nvme driver has reset the nvme controller twice on attach to
address a theoretical issue assuring the hardware is in a known
state. However, exierence has shown the second reset is unnecessary and
increases the time to boot. Eliminate the second reset. Should there be
a situation when you need a second reset (for buggy or at least somewhat
out of the mainstream hardware), the hardware option NVME_2X_RESET will
restore the old behavior. Document this in nvme(4).

If there's any trouble at all with this, I'll add a sysctl tunable to
control it.

Sponsored by:		Netflix
Reviewed by:		cperciva, mav
Differential Revision:	https://reviews.freebsd.org/D32241
2021-10-01 11:09:34 -06:00
Warner Losh
e5e26e4a24 nvme: Remove pause while resetting
After some study of the code and the standard, I think we can just drop
the pause(), unconditionally.  If we're not initialized, then there's
nothing to wait for from a software perspective.  If we are initialized,
then there might be outstanding I/O. If so, then the qpair 'recovery
state' will transition to WAITING in nvme_ctrlr_disable_qpairs, which
will ignore any interrupts for items that complete before we complete
the reset by setting cc.en=0.

If we go on to fail the controller, we'll cancel the outstanding I/O
transactions.  If we reset the controller, the hardware throws away
pending transactions and we retry all the pending I/O transactions. Any
transactions that happend to complete before cc.en=0 will have the same
effect in the end (doing the same transaction twice is just inefficient,
it won't affect the state of the device any differently than having done
it once).

The standard imposes no wait times here, so it isn't needed from that
perspective.

Unanswered Question: Do we may need to disable interrupts while we
disable in legacy mode since those are level-sensitive.

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D32248
2021-10-01 11:09:05 -06:00
Warner Losh
77054a897f nvme: Explain a workaround a little better
The don't touch the mmio of the drive after we do a EN 1->0 transition
is only for a tiny number of dirves that have this unforunate issue.

Sponsored by:		Netflix
2021-10-01 10:56:10 -06:00
Warner Losh
a245627a4e nvme_ctrlr_enable: Small style nits
Rewrite the nested if's using the preferred FreeBSD style for branches
of ifs that return. NFC. Minor tweaks to the comments to better fit new
code layout.

Sponsored by:		Netflix
Reviewed by:		mav, chuck (prior rev, but comments rolled in)
Differential Revision:	https://reviews.freebsd.org/D32245
2021-10-01 10:56:10 -06:00
Warner Losh
26259f6ab9 nvme: Use MS_2_TICKS rather than rolling our own
Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D32246
2021-10-01 10:56:10 -06:00
Warner Losh
d5fca1dc1d nvme_ctrlr_enable: Remove unnecessary 5ms delays
Remove the 5ms delays after writing the administrative queue
registers. These delays are from the very earliest days of the driver
(they are in the first commit) and were most likely vestiges of the
Chatham NVMe prototype card that was used to create this driver. Many of
the workarounds necessary for it aren't necessary for standards
compliant cards. The original driver had other areas marked for Chatham,
but these were not. They are unneeded. There's three lines of supporting
evidence.

First, the NVMe standards make no mention of a delay time after these
registers are written. Second, the Linux driver doesn't have them, even
as an option. Third, all my nvme cards work w/o them.

To be safe, add a write barrier between setting up the admin queue and
enabling the controller.

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D32247
2021-10-01 10:56:10 -06:00
Eric van Gyzen
35e4527e88 sem_clockwait_np test: fix usage of ATF API
ATF_REQUIRE_ERRNO requires the given errno iff the given expression is
true.  These test cases used it incorrectly, potentially allowing
sem_clockwait_np to succeed when it was expected to fail.  Use separate
ATF calls to require failure and the expected errno.

MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2021-10-01 06:39:34 -05:00
Eric van Gyzen
2334abfd01 sem test: move sem_clockwait_np tests into individual cases
Move these tests into individual test cases for all the usual reasons.
No functional change intended.

MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2021-10-01 06:39:34 -05:00
Eric van Gyzen
31466594cd sem_clockwait_np test: relax time constraint on VMs
In a guest on a busy hypervisor, the time remaining after an
interrupted sleep could be much lower than other environments.
Relax the lower bound on VMs.

MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2021-10-01 06:39:30 -05:00
Randall Stewart
a36230f75e tcp: Make dsack stats available in netstat and also make sure its aware of TLP's.
DSACK accounting has been for quite some time under a NETFLIX_STATS ifdef. Statistics
on DSACKs however are very useful in figuring out how much bad retransmissions you
are doing. This is further complicated, however, by stacks that do TLP. A TLP
when discovering a lost ack in the reverse path will cause the generation
of a DSACK. For this situation we introduce a new dsack-tlp-bytes as well
as the more traditional dsack-bytes and dsack-packets. These will now
all display in netstat -p tcp -s. This also updates all stacks that
are currently built to keep track of these stats.

Reviewed by: tuexen
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D32158
2021-10-01 10:36:27 -04:00
Hans Petter Selasky
5b40c0aa73 mixer(3): Fix buildworld after 38c857d956 .
s/default_unit/dunit/g

Differential Revision:	https://reviews.freebsd.org/D32254
Sponsored by:	NVIDIA Networking
2021-10-01 16:34:10 +02:00
Hans Petter Selasky
433be7f21f mixer(3): Add some manual page symlinks.
Submitted by:	christos@
Differential Revision:	https://reviews.freebsd.org/D32254
Sponsored by:	NVIDIA Networking
2021-10-01 14:18:43 +02:00
Hans Petter Selasky
38c857d956 mixer(3): Add symbol versioning.
Suggested by:	kib
Differential Revision:	https://reviews.freebsd.org/D32254
Sponsored by:	NVIDIA Networking
2021-10-01 14:18:43 +02:00
Shteryana Shopova
8b959dd6a3 Fix bsnmpd(1) crash with ill-formed Discovery message
RFC 3414 Section 4. Discovery specifies that a discovery request message has a
varBindList left empty. Nonetheless, bsnmpd(1) should not crash when receiving
a non-zero var-bindings list in a Discovery Request message.

PR:		255214
MFC after:	2 weeks
2021-10-01 14:10:39 +03:00
Andrew Turner
18c2139495 Add a gic interface to allocate MSI interrupts
The previous update to handle the gicv2m as a child of the gicv3 driver
assumed there was only a single gicv2m child. On some hardware there
are multiple children. Support this by removing the mbi ivars and
adding a new interface to handle MSI allocation in a given range.

Tested by:	mw, trasz
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32224
2021-10-01 11:27:33 +01:00
Andrew Turner
3d2533f5c2 Allow ddb and dtrace use the DMAP region on arm64
When writing to memory on arm64 we may be trying to be accessing a
read-only page. In this case try to access via the DMAP region to
get a writable location.

While here simplify writing data in DDB and stop trashing the size as
it is passed into the cache handling functions.

Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32053
2021-10-01 11:27:33 +01:00
Andrew Turner
7ec86b6609 Also print symbols when printing arm64 registers
When printing arm64 registers because of an exception in the kernel
also print the symbol and offset. This can be used to track down why
the exception occured without needing external tools.

Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32077
2021-10-01 11:27:32 +01:00
Kornel Duleba
ca4a6606f0 enetc_mdio: Fix devclass name
Use correct devclass name, due to the mismatch miibus would attach
to the wrong thing causing mii_attach to silently fail.

Fixes: dfcaa2c18b (enetc_mdio: Support building the driver ...)
2021-10-01 11:24:08 +02:00
Kornel Duleba
a75400c5ad modules: felix: Remove etherswitch_if.c from Makefile
Having it included confuses KOBJOPLOOKUP resulting in kobj_error_method
being called instead of a devmethod from the switch driver.
That in turn returns ENXIO which was treated as a pointer and
dereferenced by etherswitch ioctl logic causing the kernel to panic.

Fixes: b542c9e42b (modules: felix: Add needed dependencies)
2021-10-01 11:24:08 +02:00
Kornel Duleba
8cbbe35105 arm64: std.nxp Add enetc NIC driver
It was missed during the conversion of kernel configs.
Although the driver is already built as a kernel module we might
want to have it built-in for diskless booting and such.
2021-10-01 11:24:08 +02:00
Kyle Evans
4dbd8c72d3 tcp_wrappers: get rid of duplicate fgets declarations
This is declared in stdio.h, no need for this one.
2021-09-30 23:55:27 -05:00
Kyle Evans
5487294d79 libc: ssp: sprinkle around some __dead2
This is consistent with, e.g., NetBSD's implementation, which declares
these as noreturn in ssp/ssp.h.
2021-09-30 23:55:17 -05:00
Kyle Evans
cfb9be5062 bootp: remove the USE_BFUNCS knob
We'd likely be better served by converting these to the equivalent mem*
calls, but just kill the knob for now. The b* macros being defined get
in the way of _FORTIFY_SOURCE.

Reviewed by:	imp, markj
Differential Revision:	https://reviews.freebsd.org/D32235
2021-09-30 23:47:06 -05:00
Kyle Evans
0f43c5b55c kqueue: clean up some igor and mandoc -Tlint warnings 2021-09-30 21:31:28 -05:00
Kyle Evans
4b5554cebb kqueue: document how timers with low/past timeouts are handled
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D32237
2021-09-30 21:31:28 -05:00
Kyle Evans
9c999a259f kqueue: don't arbitrarily restrict long-past values for NOTE_ABSTIME
NOTE_ABSTIME values are converted to values relative to boottime in
filt_timervalidate(), and negative values are currently rejected.  We
don't reject times in the past in general, so clamp this up to 0 as
needed such that the timer fires immediately rather than imposing what
looks like an arbitrary restriction.

Another possible scenario is that the system clock had to be adjusted
by ~minutes or ~hours and we have less than that in terms of uptime,
making a reasonable short-timeout suddenly invalid. Firing it is still
a valid choice in this scenario so that applications can at least
expect a consistent behavior.

Reviewed by:	kib, markj
Discussed with:	allanjude
Differential Revision:	https://reviews.freebsd.org/D32230
2021-09-30 21:31:24 -05:00
Jung-uk Kim
1b7a2680fb Import ACPICA 20210930
(cherry picked from commit c509b6ab0d7e5bafc5348b08653b8738bd40716e)
2021-09-30 22:05:52 -04:00
Colin Percival
ce73f768b7 EFI loader: Don't free bcache for DEVT_DISK devs
Booting on an EC2 c5.xlarge instance, this reduces the number of I/Os
performed from 609 to 432, reduces the total number of blocks read
from 61963 to 60797, and reduces the time spent in the loader by 39 ms.

Note that b4cb3fe0e3 allowed the bcache to be retained for most of
the boot process, but relies on mounting filesystems; this commit
allows the bcache to be retained at the start of the boot process,
before the root filesystem has been located.

Reviewed by:	imp, tsoome
MFC after:	1 week
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D32239
2021-09-30 14:48:14 -07:00
Kyle Evans
a6499c56ab jail(3lua): add jail.attach()/jail.remove() methods
These aren't a part of or use libjail(3), but rather are direct
syscalls.  Still, they seem like good additions, allowing us to attach
to already-running jails.

Reviewed by:	freqlabs
Differential Revision:	https://reviews.freebsd.org/D26927
2021-09-30 16:31:04 -05:00
Kyle Evans
6a7647eccd jail(3lua): add a jail.list() method
This is implemented as an iterator, reusing parts of the earlier logic
to populate jailparams from a passed in table.

The user may request any number of parameters to pull in while we're
searching, but we'll force jid and name to appear at a minimum.

Reviewed by:	freqlabs
Differential Revision:	https://reviews.freebsd.org/D26756
2021-09-30 16:30:57 -05:00
Warner Losh
9eb5fd3599 uart: Match simple comm
Match the PCI simple comm devices (or try to). Be conservative and use
legacy interrupts rather than msi messages by default for this 'catch
all' since it matches what Linux does (it has opt-in generally for MSI,
but also matches more devices because it does a catch-all like
implemented in this commit).

Sponsored by:		Netflix
Reviewed by:		kbowling
Differential Revision:	https://reviews.freebsd.org/D32228
2021-09-30 14:16:19 -06:00
Warner Losh
bf40080762 uart: Allow PCI quirk for not using MSI interrupts
Some setups claim to have one MSI, but they don't actually work. Allow
these to be flagged.

Sponsored by:		Netflix
Reviewed by:		kbowling
Differential Revision:	https://reviews.freebsd.org/D32229
2021-09-30 14:15:32 -06:00
Ed Maste
1ad2d87778 mgb: Fix nop admin interrupt handling
Previously mgb_admin_intr printed a diagnostic message if no interrupt
status bits were set, but it's not valid to call device_printf() from a
filter.  Just drop the message as it has no user-facing value.

Also return FILTER_STRAY in this case - there is nothing further for
the driver to do.

Reviewed by:	kbowling
MFC after:	1 week
Fixes:		8890ab7758 ("Introduce if_mgb driver...")
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32231
2021-09-30 11:50:00 -04:00
Mathy Vanhoef
ffc19cf52d net80211: prevent plaintext injection by A-MSDU RFC1042/EAPOL frames
No longer accept plaintext A-MSDU frames that start with an RFC1042
header with EtherType EAPOL.  This is done by only accepting EAPOL
packets that are included in non-aggregated 802.11 frames.

Note that before this patch, FreeBSD also only accepted EAPOL frames
that are sent in a non-aggregated 802.11 frame due to bugs in
processing EAPOL packets inside A-MSDUs. In other words,
compatibility with legitimate devices remains the same.

This relates to section 6.5 in the 2021 Usenix "FragAttacks" (Fragment
and Forge: Breaking Wi-Fi Through Frame Aggregation and Fragmentation)
paper.

Submitted by:	Mathy Vanhoef (Mathy.Vanhoef kuleuven.be)
Security:	CVE-2020-26144
PR:		256120
MFC after:	7 days
Differential Revision: https://reviews.freebsd.org/D30665
2021-09-30 14:54:04 +00:00
Mathy Vanhoef
f024bdf115 net80211: mitigation against A-MSDU design flaw
Mitigate A-MSDU injection attacks by detecting if the destination address
of a subframe equals an RFC1042 (i.e., LLC/SNAP) header, and if so
dropping the complete A-MSDU frame.  This mitigates known attacks,
although new (unknown) aggregation-based attacks may remain possible.

This defense works because in A-MSDU aggregation injection attacks, a
normal encrypted Wi-Fi frame is turned into an A-MSDU frame. This means
the first 6 bytes of the first A-MSDU subframe correspond to an RFC1042
header. In other words, the destination MAC address of the first A-MSDU
subframe contains the start of an RFC1042 header during an aggregation
attack. We can detect this and thereby prevent this specific attack.

This relates to section 7.2 in the 2021 Usenix "FragAttacks" (Fragment
and Forge: Breaking Wi-Fi Through Frame Aggregation and Fragmentation)
paper.

Submitted by:	Mathy Vanhoef (Mathy.Vanhoef kuleuven.be)
Security:	CVE-2020-24588
PR:		256119
Differential Revision: https://reviews.freebsd.org/D30664
2021-09-30 14:50:45 +00:00
Mathy Vanhoef
11572d7d7f net80211: reject mixed plaintext/encrypted fragments
ieee80211_defrag() accepts fragmented 802.11 frames in a protected Wi-Fi
network even when some of the fragments are not encrypted.
Track whether the fragments are encrypted or not and only accept
successive ones if they match the state of the first fragment.

This relates to section 6.3 in the 2021 Usenix "FragAttacks" (Fragment
and Forge: Breaking Wi-Fi Through Frame Aggregation and Fragmentation)
paper.

Submitted by:	Mathy Vanhoef (Mathy.Vanhoef kuleuven.be)
Security:	CVE-2020-26147
PR:		256118
Differential Revision: https://reviews.freebsd.org/D30663
2021-09-30 14:47:41 +00:00
Mitchell Horne
a20c10893e libpmc: add some AMD pmu counter aliases
Make it mostly compatible with what's defined for Intel. Except where
noted, these are defined for all of amdzen(1|2|3).

Reviewed by:	emaste
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32162
2021-09-30 11:15:26 -03:00
Mitchell Horne
937539e0a3 libpmc: fix the 'cycles' event alias on x86
Looking for "tsc-tsc" in the pmu tables will fail every time. Instead,
make this an alias for the static TSC event defined in pmc_events.h.
This fixes 'pmcstat -s cycles' on Intel and AMD.

Reviewed by:	emaste
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32197
2021-09-30 11:15:26 -03:00
Mateusz Guzik
69ab528386 linprocfs: find cwd and root handling
The code would incorrectly use curthread instead of the target proc to
resolve vnodes.

Fixes:	8d03b99b9d ("fd: move vnodes out of filedesc into a dedicated structure")
PR:	258729
Noted by:	 Damjan Jovanovic <damjan.jov@gmail.com>
2021-09-30 12:59:58 +02:00
Mateusz Guzik
85c855d31b fd: add pwd_hold_proc 2021-09-30 12:49:51 +02:00
Ram Kishore Vegesna
41e9466943 ocs_fc: Fix device lost timer where device is not getting deleted.
Issue: Devices wont go away after the link down.

Device lost timer functionality in ocs_fc is broken,
`is_target` flag is not set in the target database and target delete is skipped.

Fix: Remove unused flags and delete the device when timer expires.

Reported by: ken@kdm.org
Reviewed by: mav, ken
2021-09-30 13:01:17 +05:30
Ram Kishore Vegesna
d063d1bc92 ocs_fc: When commands complete with an error, freeze the device queue.
Proper error recovery depends on freezing the device queue when an
error occurs, so we can recover from an error before sending
additional commands.

The ocs_fc(4) driver was not freezing the device queue for most
SCSI errors, and that broke error recovery.

sys/dev/ocs_fc/ocs_cam.c:
	In ocs_scsi_initiator_io_cb(), freeze the device queue if
        we're passing back status other than CAM_REQ_CMP.

Submitted by: ken@kdm.org
Reviewed by: mav, ken
2021-09-30 13:01:17 +05:30
Ram Kishore Vegesna
1af49c2eeb ocs_fc: Fix CAM status reporting in ocs_fc(4) when no data is returned.
In ocs_scsi_initiator_io_cb(), if the SCSI command that is
        getting completed had a residual equal to the transfer length,
        it was setting the CCB status to CAM_REQ_CMP.

        That breaks the expected behavior for commands like READ ATTRIBUTE.
        For READ ATTRIBUTE, if the first attribute requested doesn't exist,
        the command is supposed to return an error (Illegal Request,
        Invalid Field in CDB).  The broken behavior for READ ATTRIBUTE
        caused LTFS tape formatting to fail.  It looks for attribute
        0x1623, and expects to see an error if the attribute isn't present.

        In addition, if the residual is negative (indicating an overrun),
        only set the CCB status to CAM_DATA_RUN_ERR if we have not already
        reported an error.  The SCSI sense data will have more detail about
        what went wrong.

        sys/dev/ocs_fc/ocs_cam.c:
                In ocs_scsi_initiator_io_cb(), don't set the status to
                CAM_REQ_CMP if the residual is equal to the transfer length.

                Also, only set CAM_DATA_RUN_ERR if we didn't get SCSI
                status.

Submitted by: ken@kdm.org
Reviewed by: mav, ken
2021-09-30 13:01:16 +05:30
Ram Kishore Vegesna
322dbb8ce8 ocs_fc: Increase maximum supported SG elements to support larger transfer sizes.
Reported by: ken@kdm.org
Reviewed by: mav, ken
2021-09-30 13:01:16 +05:30
Ram Kishore Vegesna
3bf42363b0 ocs_fc: Emulex Gen 7 HBA support.
Emulex Gen7 adapter support in ocs_fc driver.

Reviewed by: mav, ken
2021-09-30 13:01:15 +05:30
Kyle Evans
335c4f8edb modules: iichid: needs opt_acpi.h
This fixes the standalone build.
2021-09-29 23:10:35 -05:00
Kyle Evans
a335f76f2a modules: a lot: need opt_kern_tls.h
This fixes the standalone build.
2021-09-29 23:10:31 -05:00
Kyle Evans
1d8c9d3c0c modules: p2sb: need opt_platform.h
This fixes the standalone build.
2021-09-29 23:09:45 -05:00