Commit Graph

141381 Commits

Author SHA1 Message Date
Konstantin Belousov
f4cdb9d7c3 vm/vm_pager.h: use sys/systm.h header
it is needed for __read_mostly attribute definition, which right now
comes from vm/vm_page.h including sys/systm.h

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34089
2022-02-01 05:55:35 +02:00
Konstantin Belousov
54d34bfbdf Introduce sys/kassert.h
It contains assert-related definitions previously provided by
sys/systm.h.  The new header is leaner than whole systm.h.
Include kassert.h from systm.h for compatibility.

The copyright assignment to Eivind Eklund was suggested by Kirk McKusick
and is based in the commit 5526d2d920.

Suggested by:	jhb
Reviewed by:	alc, imp, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34089
2022-02-01 05:14:14 +02:00
John Baldwin
53e938e408 hyperv storvsc: Don't abuse struct sglist to hold virtual addresses.
struct sglist is intended for holding S/G lists of physical address
ranges, not virtual address ranges.  GCC 9.x issues several warnings
due to casts between pointers and integers of different sizes as a
result (vm_paddr_t is 64-bits on i386).  Instead, add a local 'struct
hv_sglist' which uses an array of 'struct iovec' to hold the S/G list
of virtual address ranges.

Differential Revision:	https://reviews.freebsd.org/D31933
2022-01-31 17:11:27 -08:00
John Baldwin
d782385e9b tcp_ratelimit: Handle some edge cases with TLS + RL send tags.
- After a connection has fallen back from NIC TLS to SW TLS, any
  pacing rate changes should modify the inpcb send tag even though
  SB_TLS_IFNET is set.

- If a connection tries to modify the pacing rate before the send
  tag has been converted from plain TLS to TLS + RL, don't fail
  the rate request set but let it fall through to setting the rate
  on the non-TLS inpcb RL tag.

Reviewed by:	gallatin, rrs, hselasky
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D34085
2022-01-31 16:40:04 -08:00
John Baldwin
d958bc7963 ktls: Try to enable TOE TLS after marking existing data not ready.
At the moment this is mostly a no-op but in the future there will be
in-flight encrypted data which requires software decryption.  This
same setup is also needed for NIC TLS RX.

Note that this does break TOE TLS RX for AES-CBC ciphers since there
is no software fallback for AES-CBC receive.  This will be resolved
one way or another before 14.0 is released.

Reviewed by:	hselasky
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D34082
2022-01-31 16:39:21 -08:00
Mark Johnston
773e3a71b2 pf: Initialize pf_kpool mutexes earlier
There are some error paths in ioctl handlers that will call
pf_krule_free() before the rule's rpool.mtx field is initialized,
causing a panic with INVARIANTS enabled.

Fix the problem by introducing pf_krule_alloc() and initializing the
mutex there.  This does mean that the rule->krule and pool->kpool
conversion functions need to stop zeroing the input structure, but I
don't see a nicer way to handle this except perhaps by guarding the
mtx_destroy() with a mtx_initialized() check.

Constify some related functions while here and add a regression test
based on a syzkaller reproducer.

Reported by:	syzbot+77cd12872691d219c158@syzkaller.appspotmail.com
Reviewed by:	kp
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34115
2022-01-31 16:14:00 -05:00
Konstantin Belousov
66c5fbca77 insmntque1(): remove useless arguments
Also remove once-used functions to clean up after failed insmntque1(),
which were destructor callbacks in previous life.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D34071
2022-01-31 16:49:08 +02:00
Kornel Duleba
1a6d987b7f enetc: Wait for pending transmissions before disabling TX queues
According to the RM it's not safe to disable a TX ring while it is busy
transmitting frames.
In order to be safe wait until the ring is empty. (cidx==pidx)
Use this opportunity to remove a set-but-unused variable.

Obtained from: Semihalf
Sponsored by: Alstom Group
2022-01-31 08:57:48 +01:00
Kornel Duleba
a6bda3e1ef enetc: Simply TX ring credits counting logic
According to the RM rings can hold at most ring_size - 1 descriptors at any time.
No additional logic is needed since iflib already respects this constrain.
Thanks to that the pidx == cidx situation is not ambiguous and indicates an
empty ring.
Use that to simplify the logic that calculates the amount of processed frames.

Obtained from: Semihalf
Sponsored by: Alstom Group
2022-01-31 08:57:48 +01:00
Kornel Duleba
f485d733e8 enetc: Disable HW IP packet alignment
The NIC can IP align received packets.
It was observed that it caused some rare stalls, that required full board reset.
Disable this feature for now. It doesn't provide any significant performance
improvement anyway.

Obtained from: Semihalf
Sponsored by: Alstom Group
2022-01-31 08:57:48 +01:00
Konstantin Belousov
8d8589b385 ufs: be more persistent with finishing some operations
when the vnode is doomed after relock.  The mere fact that the vnode is
doomed does not prevent us from doing UFS operations on it while it is
still belongs to UFS, which is determined by non-NULL v_data.  Not
finishing some operations, e.g. not syncing the inode block only because
the vnode started reclamation, is not correct.

Add macro IS_UFS() which incapsulates the v_data != NULL, and use it
instead of VN_IS_DOOMED() for places where the operation completion is
important.

Reviewed by:	markj, mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34072
2022-01-31 04:46:21 +02:00
Konstantin Belousov
4559700a0a ffs_snapblkfree(): add a comment explaining lockmgr invocation
Reviewed by:	markj, mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34072
2022-01-31 04:46:21 +02:00
Konstantin Belousov
0cdc603308 ufs: Use IS_SNAPSHOT()
Reviewed by:	markj, mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34072
2022-01-31 04:46:21 +02:00
Konstantin Belousov
3d68c4e175 syncer VOP_FSYNC(): unlock syncer vnode around call to VFS_SYNC()
The lock is unneccessary since the mount point is busied, which prevents
unmount and syncer vnode deallocation.  Having the vnode locked causes
innocent LoRs and complicates debugging.

Also stop starting write accounting around it.  Any caller of
VOP_FSYNC() must do it already, and sync_vnode() does.

Reported and tested by:	pho
Reviewed by:	markj, mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34072
2022-01-31 04:46:21 +02:00
Konstantin Belousov
5875b94c74 buf_alloc(): lock the buffer with LK_NOWAIT
The buffer must not be accessed by any other thread, it is freshly
allocated.  As such, LK_NOWAIT should be nop but also it prevents
recording the order between the buffer lock and any other locks we might
own in the call to getnewbuf().  In particular, if we own FFS snap lock,
it should avoid triggering false positive warning.

Reviewed by:	markj, mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34072
2022-01-31 04:46:21 +02:00
Konstantin Belousov
531f8cfea0 Use dedicated lock name for pbufs
Also remove a pointer to array variable, use array address directly.

Reviewed by:	markj, mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34072
2022-01-31 04:46:14 +02:00
Konstantin Belousov
9cd59de2e1 ext2fs: remove remnants of the UFS snapshot code
Noted and reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34095
2022-01-31 04:37:16 +02:00
Kirk McKusick
85f7e9a4f0 In GEOM debugging output, show consumer for cloned and duplicated bio's.
When using bio's created by g_clone_bio() or g_duplicate_bio()
their consumer device (the device to which their I/O requests
are sent) is listed by the geom debugging facility as [unknown].
If available, this update lists the consumer associated with
the bio's parent.

MFC after:    2 weeks
Sponsored by: Netflix
2022-01-30 17:21:13 -08:00
Jason A. Harmening
a01ca46b9b unionfs: use VV_ROOT to check for root vnode in unionfs_lock()
This avoids a potentially wild reference to the mount object.
Additionally, simplify some of the checks around VV_ROOT in
unionfs_nodeget().

Reviewed by:	kib
Differential Revision: https://reviews.freebsd.org/D33914
2022-01-29 22:38:44 -06:00
Alexander Motin
67c58cd729 GEOM: Remove g_wait_sim.
It seems never been used since addition.
2022-01-29 22:12:43 -05:00
Alexander Motin
10ae42ccbd GEOM: Set G_CF_DIRECT_SEND/RECEIVE for taste consumers.
All I/O requests through the taste consumers are synchronous, done
with g_read_data() and without any locks held.  It makes no sense
to delegate the I/O to g_down/g_up threads.

This removes many of context switches during disk retaste.

MFC after:	2 weeks
2022-01-29 21:59:03 -05:00
Peter Jeremy
afcd121024
geom_gate: Distinguish between classes of errors
The geom_gate API provides 2 distinct paths for exchanging error
details between the kernel and the userland client: Including an error
code in the g_gate_ctl_io structure passed in the ioctl(2) call or
having the ioctl(2) call return -1 with an error code in errno. The
latter reflects errors in the ioctl(2) call itself whilst the former
reflects errors within the geom_gate instance.

The G_GATE_CMD_START ioctl blocks waiting for an I/O request to be
directed to the geom_gate instance and the wait can fail
(necessitating an error return) if the geom_gate instance is destroyed
or if the msleep(9) fails. The code previously treated both error
cases indentically: Returning ECANCELED as a geom_gate instance error
(which the ggatec treats as a fatal error).  Whilst this is the correct
behaviour if the geom_gate instance is destroyed, a msleep(9) failure
is unrelated to the geom_gate instance itself and should be reported
as an ioctl(2) "failure".  The distinction is important because
msleep(9) can return ERESTART, which means the system call should be
retried (and this will occur automatically as part of the generic
syscall return processing).

This change alters the msleep(9) handling to directly return the error
code from msleep(9), which ensures ERESTART is correctly handled,
rather than being treated as a fatal error.

Reviewed by:    Johannes Totz <jo@bruelltuete.com>
MFC after:      1 week
Differential Revision:  https://reviews.freebsd.org/D33996
2022-01-29 21:15:51 +11:00
Alexander V. Chernikov
217481a333 u3g: Add support Quectel EM12-G modem.
Submitted by:	<tda.77793 at gmail.com>
PR:		260218
MFC after:	2 weeks
2022-01-29 09:59:20 +00:00
Kristof Provost
9dac026822 dummynet: dn_dequeue() may return NULL
If there are no more entries, or if we fail to restore the rcvif of a
queued mbuf dn_dequeue() can return NULL.
Cope with this.

Reviewed by:	glebius
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D34078
2022-01-28 23:09:08 +01:00
Kristof Provost
703e533da5 mbuf: do not restore dying interfaces
When we remove an interface it is first removed from the interface list
V_ifnet (by if_unlink_ifnet()) and marked as IFF_DYING. We then wait for
any possible references to stop being used (i.e.
epoch_wait/epoch_drain_callbacks) before we tear it fully down.

However, the index in ifindex_table is not removed, so m_rcvif_restore()
can still find the (now dying) interface.

This results in panics, for example when dummynet restores the rcvif
pointer and passes a packet to ip6_input() we can panic because the
AF_INET6 domain has already been removed (so we end up dereferencing a
NULL pointer there).

Check that the interface is not dying before we restore it, which is
equivalent to checking its presence in V_ifnet, and thus ensures that
future accesses (while in NET_EPOCH) are safe.

Reviewed by:	glebius
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D34076
2022-01-28 23:09:08 +01:00
John Baldwin
29d481ae6a Make <vm/vm_extern.h> more self-contained.
Add a nested include of <sys/systm.h> for recently added assertions.
Without this, existing code (such as in drm-kmod) needs to be patched
to add the newly required header.

While here, rewrite the assertions using KASSERT().

Reviewed by:	dougm, alc, imp, kib
Differential Revision:	https://reviews.freebsd.org/D34070
2022-01-28 13:14:03 -08:00
John Baldwin
2e8d1a5525 iscsi: Allocate a dummy PDU for the internal nexus reset task.
When an iSCSI target session is terminated, an internal nexus reset
task is posted to abort existing tasks belonging to the session.
Previously, the ctl_io for this internal nexus reset stored a pointer
to the session in the slot that normally holds a pointer to the PDU
from the initiator that triggered the I/O request.  The completion
handler then assumed that any nexus reset I/O was due to an internal
request and fetched the session pointer (instead of the PDU pointer)
from the ctl_io.  However, it is possible to trigger a nexus reset via
an on-the-wire task management PDU.  If such a PDU were sent to the
target, then the completion handler would incorrectly treat this
request as an internal request and treat the pointer to the received
PDU as a pointer to the session instead.

To fix, allocate a dummy PDU for the internal reset task and use an
invalid opcode to differentiate internal nexus resets from resets
requested by the initiator.

PR:		260449
Reported by:	Robert Morris <rtm@lcs.mit.edu>
Reviewed by:	mav
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D34055
2022-01-28 13:07:04 -08:00
Mitchell Horne
b1ab9568bc hwpmc: remove mips event definitions
Reviewed by:	imp, emaste
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34084
2022-01-28 16:37:28 -04:00
Alexander Motin
29998bf2ac glabel: Set G_CF_DIRECT_SEND/RECEIVE for taste consumer.
All I/O requests through the taste consumer are synchronous, done
with g_read_data() and without any locks held.  It makes no sense
to delegate the I/O to g_down/g_up threads.

This removes many of context switches during disk retaste.

MFC after:	2 weeks
2022-01-28 14:22:41 -05:00
Alexander Motin
ffc1cc95e7 GEOM: Relax direct dispatch for GEOM threads.
The only cases when direct dispatch does not make sense is for I/O
submission from down thread and for completion from up thread.  In
all other cases, if both consumer and producer are OK about it, we
can save on context switches.

MFC after:	2 weeks
2022-01-28 14:21:21 -05:00
Gleb Smirnoff
964b8f8b99 ifnet: garbage collect unused function ifaddr_byindex().
Last use was removed in 5adea417d4.
2022-01-28 09:51:52 -08:00
Alexander Motin
0d8cec7658 graid: Set G_CF_DIRECT_SEND for task consumer.
Unlike normal consumers all taste consumer I/O is synchronous, done
with g_read_data() and without any locks held.  It makes no sense to
delegate I/O submission to g_down thread.

This should remove number of context switches during disk retaste.

MFC after:	2 weeks
2022-01-28 11:09:30 -05:00
Gordon Bergling
4bd030b369 sctp(4): Fix a typo in an INVARIANTS panic message
- s/failes/fails/

MFC after:	1 week
2022-01-28 13:20:52 +01:00
Edward Tomasz Napierala
99454d3e98 linux: Provide dummy seccomp(2)
Don't emit messages; this isn't any different from a Linux kernel
built without OPTIONS_SECCOMP, so the userspace already needs to know
how to deal with it.  This is also similar with how we handle seccomp
in linux_prctl().

Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D33808
2022-01-28 11:45:41 +00:00
Kirk McKusick
ddf162d1d1 ufs: handle LoR between snap lock and vnode lock
When a filesystem is mounted all of its associated snapshots must
be activated. It first allocates a snapshot lock (snaplk) that will
be shared by all the snapshot vnodes associated with the filesystem.
As part of each snapshot file activation, it must replace its own
ufs vnode lock with the snaplk. In this way acquiring the snaplk
gives exclusive access to all the snapshots for the filesystem.

A write to a ufs vnode first acquires the ufs vnode lock for the
file to be written then acquires the snaplk. Once it has the snaplk,
it can check all the snapshots to see if any of them needs to make
a copy of the block that is about to be written. This ffs_copyonwrite()
code path establishes the ufs vnode followed by snaplk locking
order.

When a filesystem is unmounted it has to release all of its snapshot
vnodes. Part of doing the release is to revert the snapshot vnode
from using the snaplk to using its original vnode lock. While holding
the snaplk, the vnode lock has to be acquired, the vnode updated
to reference it, then the snaplk released. Acquiring the vnode lock
while holding the snaplk violates the ufs vnode then snaplk order.
Because the vnode lock is unused, using LK_EXCLUSIVE | LK_NOWAIT
to acquire it will always succeed and the LK_NOWAIT prevents the
reverse lock order from being recorded.

This change was made in January 2021 (173779b98f) to avoid an LOR
violation in ffs_snapshot_unmount(). The same LOR issue was recently
found again when removing a snapshot in ffs_snapremove() which must
also revert the snaplk to the original vnode lock as part of freeing it.

The unwind in ffs_snapremove() deals with the case in which the
snaplk is held as a recursive lock holding multiple references.
Specifically an equal number of references are made on the vnode
lock. This change factors out the lock reversion operations into a
new function revert_snaplock() which handles both the recursive
locks and avoids the LOR. The new revert_snaplock() function is
then used in both ffs_snapshot_unmount() and in ffs_snapremove().

Reviewed by:  kib
Tested by:    Peter Holm
MFC after:    2 weeks
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D33946
2022-01-27 23:03:35 -08:00
Rick Macklem
98c788737f nfsclient: Delete unused function nfscl_getcookie()
The function nfscl_getcookie(), which is essentially the
same as ncl_getcookie(), is never called, so delete it.
This is probably cruft left over from the port of the
NFSv4 code to FreeBSD several years ago.

Found while modifying the code to better use the
directory offset cookies.

MFC after:	2 weeks
2022-01-27 15:30:26 -08:00
John Baldwin
ac4643ef78 Remove terasic drivers used on the Cambridge BERI tablet.
Reviewed by:	brooks
Sponsored by:	The University of Cambridge, Google Inc.
Differential Revision:	https://reviews.freebsd.org/D34057
2022-01-27 11:01:51 -08:00
Richard Scheffenegger
4531b3450b tcp: Tidying up the conditionals for unwinding a spurious RTO
- Use the semantically correct TSTMP_xx macro when comparing
  timestamps. (No functional change)
- check for bad retransmits only when TSopt is present in ACK
  (don't assume there will be a valid TSopt in the TCP options struct)
- exclude tsecr == 0, since that most likely indicates an
  invalid ts echo return (tsecr) value.

Reviewed By: tuexen, #transport
MFC after:   3 days
Sponsored by:        NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D34062
2022-01-27 18:59:55 +01:00
Richard Scheffenegger
68e623c3f0 tcp: Rewind erraneous RTO only while performing RTO retransmissions
Under rare circumstances, a spurious retranmission is
incorrectly detected and rewound, messing up various tcpcb values,
which can lead to a panic when SACK is in use.

Reviewed By: tuexen, chengc_netapp.com, #transport
MFC after:   3 days
Sponsored by:        NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D33979
2022-01-27 18:49:42 +01:00
Gleb Smirnoff
6abb5043a6 rtsock: always set m_pkthdr.rcvif when queueing on netisr
netisr uses global workstreams and after dequeueing an mbuf it
uses rcvif to get the VNET of the mbuf.  Of course, this is not
needed when kernel is compiled without VIMAGE.  It came out that
routing socket does not set rcvif if compiled without VIMAGE.
Make this assignment not depending on VIMAGE option.

Fixes:	6871de9363
2022-01-27 09:41:31 -08:00
Gleb Smirnoff
f59fa11280 mbuf: make M_ASSERT_NO_SND_TAG() as strict as other similar asserts
Fixes:	17cbcf33c3
2022-01-27 09:41:31 -08:00
Andriy Gapon
6fd84a627f mmc_da: create disk(9) for pre-2.0 SD cards
It does not look like there is anything in mmc_da code that actually
requires protocol 2.0 or later.  dev/mmc code also does not have such a
restriction.

Tested with a very old 2GB mini-SD card.  Prior to this change mmc_da
would claim the card but would not expose it to GEOM.

Without MMCCAM:
 mmc0: <MMC/SD bus> on sdhci_pci0
 mmc0: Probing bus
 mmc0: SD probe: OK (OCR: 0x00ff8000)
 mmc0: Current OCR: 0x00ff8000
 mmc0: CMD8 failed, RESULT: 1
 mmc0: Probing cards
 mmc0: New card detected (CID 1c53565344432020100002982e007600)
 mmc0: New card detected (CSD 005e00325f5a83d02db7ffbf96800000)
 mmc0: Card at relative address 0xb368 added:
 mmc0:  card: SD SDC   1.0 SN 0002982E MFG 06/2007 by 28 SV
 mmc0:  quirks: 0
 mmc0:  bus: 4bit, 50MHz (high speed timing)
 mmc0:  memory: 3998720 blocks, erase sector 256 blocks
 mmc0: setting transfer rate to 50.000MHz (high speed timing)
 GEOM: new disk mmcsd0
 mmcsd0: 2GB <SD SDC   1.0 SN 0002982E MFG 06/2007 by 28 SV> at mmc0 50.0MHz/4bit/65535-block
 mmc0: setting bus width to 4 bits high speed timing

With MMCCAM and this change:
 sdda0 at sdhci_slot0 bus 0 scbus2 target 0 lun 0
 sdda0: Relative addr: 0000b368
 Card features: <Memory>
 sdda0: Serial Number 0002982E
 sdda0: SD SDC   1.0 SN 0002982E MFG 06/2007 by 28 SV
 GEOM: new disk sdda0

Reviewed by:	manu
MFC after:	3 weeks
2022-01-27 18:59:54 +02:00
Mateusz Guzik
2a7e4cf843 Revert b58ca5df0b ("vfs: remove the now unused insmntque1")
I was somehow convinced that insmntque calls insmntque1 with a NULL
destructor. Unfortunately this worked well enough to not immediately
blow up in simple testing.

Keep not using the destructor in previously patched filesystems though
as it avoids unnecessary casts.

Noted by:	kib
Reported by:	pho
2022-01-27 16:32:22 +00:00
Andrew Gallatin
8a7404b2ae tcp: fix leaks in tcp_chg_pacing_rate error paths
tcp_chg_pacing_rate() is expected to release the hw rate limit table,
but failed to do so in several error cases, leading to ever
increasing counts of flows using the rate.

This patch was mostly done by rrs

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D34058
Reviewed by: hselasky, rrs,  jhb (inital version, outside of Differential)
2022-01-27 10:35:03 -05:00
Andrew Gallatin
9ba117960e Fix a memory leak when ip_output_send() returns EAGAIN due to send tag issues
When ip_output_send() returns EAGAIN due to issues with send tags (route
change, lagg failover, etc), it must free the mbuf. This is because
ip_output_send() was written as a wrapper/replacement for a direct
call to  if_output(), and the contract with if_output() has
historically been that it owns the mbufs once called. When
ip_output_send() failed to free mbufs, it violated this assumption
and lead to leaked mbufs.

This was noticed when using NIC TLS in combination with hardware
rate-limited connections. When seeing lots of NIC output drops
triggered ratelimit send tag changes, we noticed we were leaking
ktls_sessions, send tags and mbufs. This was due ip_output_send()
leaking mbufs which held references to ktls_sessions, which in
turn held references to send tags.

Many thanks to jbh, rrs, hselasky and markj for their help in
debugging this.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D34054
Reviewed by: hselasky, jhb, rrs
MFC after: 2 weeks
2022-01-27 10:34:34 -05:00
Mark Johnston
38da0c96dc geom: Assert that BIO_SPEEDUP BIOs have bio_data set to NULL
Like BIO_FLUSH, there is no reason for consumers to pass a BIO_SPEEDUP
request with non-NULL bio_data, so assert this.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2022-01-27 09:58:19 -05:00
Mark Johnston
a2dfffb989 shsec: Allocate data blocks only for BIO_READ/WRITE requests
In particular, there is no need to allocate a data block when passing
BIO_FLUSH requests to child providers, and g_io_request() asserts that
bp->bio_data == NULL for such requests.

PR:		255131
Reported and tested by:	nvass@gmx.com
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2022-01-27 09:56:07 -05:00
Andrew Turner
548a2ec49b Add PT_GETREGSET
This adds the PT_GETREGSET and PT_SETREGSET ptrace types. These can be
used to access all the registers from a specified core dump note type.
The NT_PRSTATUS and NT_FPREGSET notes are initially supported. Other
machine-dependant types are expected to be added in the future.

The ptrace addr points to a struct iovec pointing at memory to hold the
registers along with its length. On success the length in the iovec is
updated to tell userspace the actual length the kernel wrote or, if the
base address is NULL, the length the kernel would have written.

Because the data field is an int the arguments are backwards when
compared to the Linux PTRACE_GETREGSET call.

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19831
2022-01-27 11:40:34 +00:00
Andriy Gapon
5d5f44623e g_mirror: don't fail reads while losing next-to-last disk
I observed a situation where some read requests failed when a 2-way geom
mirror lost one disk.  The problem appears to be in the logic that skips
retrying a failed request when a mirror has only one active disk.
Generally, that makes sense.  But during a transition from two disks to
one it is possible that the request failed on the failing disk before it
was inactivated and, so, the remaining active disk is the disk that
should be tried.

This change adds an additional check to ensure that it was the (only)
active disk that was already tried.

Reviewed by:	mav
MFC after:	3 weeks
2022-01-27 13:22:52 +02:00
Gleb Smirnoff
6871de9363 netisr: serialize/restore m_pkthdr.rcvif when queueing mbufs
Reviewed by:		kp
Differential revision:	https://reviews.freebsd.org/D33268
2022-01-26 21:58:50 -08:00
Gleb Smirnoff
165746f4e4 dummynet: use m_rcvif_serialize/restore when queueing packets
This fixed panic with interface being removed while packet
was sitting on a queue.  This allows to pass all dummynet
tests including forthcoming dummynet:ipfw_interface_removal
and dummynet:pf_interface_removal and demonstrates use of
m_rcvif_serialize() and m_rcvif_restore().

Reviewed by:		kp
Differential revision:	https://reviews.freebsd.org/D33267
2022-01-26 21:58:50 -08:00
Gleb Smirnoff
e1882428dc ifnet/mbuf: provide KPI to serialize/restore m->m_pkthdr.rcvif
Supplement ifindex table with generation count and use it to
serialize & restore an ifnet pointer.

Reviewed by:		kp
Differential revision:	https://reviews.freebsd.org/D33266
Fun note:		git show e6abef0918
2022-01-26 21:58:50 -08:00
Gleb Smirnoff
91f44749c6 ifnet: make if_index global
Now that ifindex is static to if.c we can unvirtualize it.  For lifetime
of an ifnet its index never changes.  To avoid leaking foreign interfaces
the net.link.generic.system.ifcount sysctl and the ifnet_byindex() KPI
filter their returned value on curvnet.  Since if_vmove() no longer
changes the if_index, inline ifindex_alloc() and ifindex_free() into
if_alloc() and if_free() respectively.

API wise the only change is that now minimum interface index can be
greater than 1.  The holes in interface indexes were always allowed.

Reviewed by:		kp
Differential revision:	https://reviews.freebsd.org/D33672
2022-01-26 21:58:44 -08:00
Mateusz Guzik
d35991d327 nullfs: ansify fs/nullfs/null_subr.c 2022-01-27 01:01:45 +01:00
Mateusz Guzik
b58ca5df0b vfs: remove the now unused insmntque1
Bump __FreeBSD_version to 1400052.
2022-01-27 01:00:24 +01:00
Mateusz Guzik
3150cf0c13 unionfs: stop using insmntque1
It adds nothing of value over insmntque.
2022-01-27 00:57:37 +01:00
Mateusz Guzik
5ccdfdabc8 tmpfs: stop using insmntque1
It adds nothing of value over insmntque.
2022-01-27 00:56:12 +01:00
Mateusz Guzik
4e91a0b9fe nullfs: stop using insmntque1
It adds nothing of value over insmntque.
2022-01-27 00:54:47 +01:00
Mateusz Guzik
ade1367ba8 fdescfs: stop using insmntque1
It adds nothing of value over insmntque.
2022-01-27 00:54:38 +01:00
Mateusz Guzik
3af3e99ce4 devfs: stop using insmntque1
It adds nothing of value over insmntque.
2022-01-27 00:54:30 +01:00
Vladimir Kondratyev
c974c22a4f Revert "LinuxKPI: Allow wake_up to be executed within a critical section"
This change was based on currently reverted commit 7dea0c9e6eba.

This reverts commit 89889ab470.
2022-01-27 01:27:01 +03:00
Vladimir Kondratyev
11ef1d975f Revert "LinuxKPI: Allow spin_lock_irqsave to be called within a critical section"
This change results in deadlocks on UP systems

This reverts commit 7dea0c9e6eba4dc127cd67667c81fa2c250f1024.

Requested by:	kib, hselasky
2022-01-27 01:27:01 +03:00
Kyle Evans
773fa8cd13 execve: disallow argc == 0
The manpage has contained the following verbiage on the matter for just
under 31 years:

"At least one argument must be present in the array"

Previous to this version, it had been prefaced with the weakening phrase
"By convention."

Carry through and document it the rest of the way.  Allowing argc == 0
has been a source of security issues in the past, and it's hard to
imagine a valid use-case for allowing it.  Toss back EINVAL if we ended
up not copying in any args for *execve().

The manpage change can be considered "Obtained from: OpenBSD"

Reviewed by:	emaste, kib, markj (all previous version)
Differential Revision:	https://reviews.freebsd.org/D34045
2022-01-26 13:40:27 -06:00
Gordon Bergling
9966757dd6 hwpmc(4): Fix a typo in a sysctl description
- s/avalable/available/

MFC after:	3 days
2022-01-26 20:18:57 +01:00
Ryan Moeller
47e46b1123 zfs: Fix zvol_cdev_open locking
First open locking changes were correctly applied to zvol_geom_open but
incorrectly applied to zvol_cdev_open, causing spa_namespace_lock to be
held indefinitely.

Make the first open locking in zvol_cdev_open match zvol_geom_open.

This change has been accepted upstream in openzfs/zfs#13016 but is not
yet merged.

Reviewed by:	mav
Fixes:		e92ffd9b62
Sponsored by:	iXsystems, Inc.
2022-01-26 18:37:52 +00:00
Gordon Bergling
9e58cca3e8 extra_tcp_stacks: Fix two typos in source code comments
- s/differnt/different/

MFC after;	3 days
2022-01-26 18:02:55 +01:00
Ed Maste
9c296a2105 geom: Add HiFive boot partitions
As documented in the HiFive Unmatched Software Reference Manual.

Reviewed by:	imp, mhorne
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34010
2022-01-26 10:54:45 -05:00
Hans Petter Selasky
9e2cce7e6a Implement a function to get the next TCP- and TLS- receive sequence number.
This function will be used by coming TLS hardware receive offload support.

Differential Revision:	https://reviews.freebsd.org/D32356
Discussed with:	jhb@
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-01-26 12:55:00 +01:00
Hans Petter Selasky
c8f2c290e4 Add definitions for TLS receive tags using the existing send tag infrastructure.
Although send tags are strictly used for transmit, the name might be changed
in the future to be more generic.

The TLS receive tags support regular IPv4 and IPv6 traffic, and also over any
VLAN. If prio-tagging is enabled, VLAN ID zero, this must be checked in the
network driver itself when creating the TLS RX decryption offload filter.

TLS receive tags have a modify callback to tell the network driver about
the progress of decryption. Currently decryption is done IP packet by IP
packet, even if the IP packet contains a partial TLS record. The modify
callback allows the network driver to keep track of TCP sequence numbers
pointing to the beginning of TLS records after TCP packet reassembly.
These callbacks only happen when encrypted or partially decrypted data is
received and are used to verify the decryptions starting point for the
hardware. Typically the hardware will guess where TLS headers start and
needs help from the software to know if the guess was correct. This is
the purpose of the modify callback.

Differential Revision:	https://reviews.freebsd.org/D32356
Discussed with:	jhb@
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-01-26 12:55:00 +01:00
Hans Petter Selasky
17cbcf33c3 mbuf(9): Assert receive mbufs don't carry a send tag.
Else we would start leaking reference counts.

Discussed with:	jhb@
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-01-26 12:55:00 +01:00
Hans Petter Selasky
a6d4524323 mbuf(9): Properly declare some function macros when debugging is disabled.
No functional change intended.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-01-26 12:54:59 +01:00
Emmanuel Vadot
81de556105 linuxkpi: i2c: Add MODULE_DEPEND for iicbus
MFC after:	1 month
MFC with:	1961a14a47
Fixes:	1961a14a47 ("linuxkpi: Add i2c support")
Reported by:	GregV
Sponsored by:	Beckhoff Automation GmbH & Co. KG
2022-01-26 10:44:07 +01:00
Andriy Gapon
f4a041af29 add overlay for enabling spi0 on allwinner h3
At least on Orange Pi PC Plus it is routed to the 40-pin header, so it
can used to communicate with external devices.

MFC after:	2 weeks
2022-01-26 11:42:20 +02:00
Andriy Gapon
a471646a08 add overlay for enabling i2c1 on allwinner h3
At least on Orange Pi PC Plus it is routed to the 40-pin header, so it
can used to communicate with external devices.

MFC after:	2 weeks
2022-01-26 11:42:20 +02:00
Gordon Bergling
b3df222eae extra_tcp_stacks: Fix a few common typos
TCP_BBR:
- Fix a typo introducted in 1b90dfa5d2, which was reported by tuexen@

TCP_RACK:
- Correct two sysctl descriptions: s/corret/correct/

tcp_bbr(4): Also fix s/measurment/measurement/ in the man page

MFC after:	1 week
2022-01-26 10:35:17 +01:00
Andriy Gapon
173d0fb616 add overlay for enabling serial1 / uart1 on rk3328
On Rock64 the uart is routed to pins on the "Pi-2" header, so it is
potentially useful.

Pin mapping:
----------------------------
| ID | Name     | Function |
----------------------------
| 15 | GPIO3_A4 | TX       |
| 16 | GPIO3_A5 | RTS      |
| 18 | GPIO3_A6 | RX       |
| 22 | GPIO3_A7 | CTS      |
----------------------------

MFC after:	2 weeks
2022-01-26 11:31:59 +02:00
Andriy Gapon
f41f98f0f0 add overlay for enabling i2c0 on rk3328
On Rock64 it is routed to pins 3 and 5 of the so called Pi-2 header.

MFC after:	2 weeks
2022-01-26 11:30:53 +02:00
Andriy Gapon
94ff1d9cc8 sdhci: fix dumping support in MMCCAM configuration
This change fixes interaction with recently added sddadump.

MFC after:	1 week
2022-01-26 09:31:45 +02:00
Warner Losh
e35816c1c9 mpr/mps: Fix a race in diagnostic reset
There's a small race in freezing the simq when performing a diagnostic
reset. During this time, a transaction can slip through and encounter
the target id of 0. If we're still in diagnostic reset when we detect
this, return a CAM_DEVICE_NOT_THERE status. Instead, freeze the queue
and return a requeue status, similar to what we do when we're resetting
a target and a transaction get here. The race is unavoidable due to
separate locks for queue and SIM, but easy enough to detect and make
harmless.

Sponsored by:		Netflix
Reviewed by:		scottl, mav
Differential Revision:	https://reviews.freebsd.org/D34017
2022-01-25 19:15:46 -07:00
John Baldwin
5fcb5ae8dc Remove a stale comment.
The intr_disable as a macro was only a problem on arm and mips and
is no longer relevant after the mips removal.
2022-01-25 17:19:36 -08:00
John Baldwin
46f69eba96 opencrypto/xform_*.h: Trim scope of included headers.
Reviewed by:	markj, emaste
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34022
2022-01-25 15:21:22 -08:00
John Baldwin
f6459a7aa8 opencrypto/cryptodev.h: Add includes to make more self-contained.
Reviewed by:	markj, emaste
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34021
2022-01-25 15:20:46 -08:00
Jessica Clarke
d930ec4ff9 dp83822phy: Add missing MII_PHY_END to avoid buffer overread on probe
Found by:	CHERI
Fixes:		0c9156faec ("Introduce DP83822 PHY driver")
2022-01-25 20:34:55 +00:00
Jessica Clarke
3f707064a5 dp83867phy: Add missing MII_PHY_END to avoid buffer overread on attach
Found by:	CHERI
Fixes:		e85c94b8d6 ("Introduce DP83867 PHY driver")
2022-01-25 20:34:55 +00:00
Emmanuel Vadot
59d465e200 Bump __FreeBSD_version for LinuxKPI changes
Sponsored by:	Beckhoff Automation GmbH & Co. KG
2022-01-25 16:15:46 +01:00
Emmanuel Vadot
1961a14a47 linuxkpi: Add i2c support
Add i2c support to linuxkpi. This is needed by drm-kmod.
For every i2c_adapter added by i2c_add_adapter we add a child to the
device named "lkpi_iic". This child handle the conversion between
Linux i2c_msgs to FreeBSD iic_msgs.
For every i2c_adapter added by i2c_bit_add_bus we add a child to the
device named "lkpi_iicbb". This child handle the conversion between
Linux i2c_msgs to FreeBSD iic_msgs.
With the help of iic(4), this expose the i2c controller to userspace
allowing a user to query DDC information from a monitor.
e.g.: i2c -f /dev/iic0 -a 0x28 -c 128 -d r
will query the standard EDID from the monitor if plugged.

The bitbang part (lkpi_iicbb) isn't tested at all for now as I don't have
compatible hardware (all my hardware have native i2c controller).

Tested on:	Intel (SandyBridge, Skylake, ApolloLake)
Tested on:	AMD (Picasso, Polaris (amd64 and arm64))

MFC after:	1 month
Reviewed by:	hselasky
Sponsored by:	Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D33053
2022-01-25 16:15:39 +01:00
Edward Tomasz Napierala
9caeb82eab Revert "linux: Provide dummy seccomp(2)"
This reverts commit 56981629f9.

Wrong patch; fails to build on i386.
2022-01-20 22:25:15 +00:00
Edward Tomasz Napierala
56981629f9 linux: Provide dummy seccomp(2)
Don't emit warnings; this isn't any different from a Linux kernel
built without OPTIONS_SECCOMP, so the userspace already needs to know
how to deal with it.  This is also similar with how we handle seccomp
in linux_prctl().

Sponsored By:	EPSRC
Differential Revision: https://reviews.freebsd.org/D33808
2022-01-25 11:54:00 +00:00
Gleb Smirnoff
6d1808f051 if_clone: correctly destroy a clone from a different vnet
Try to live with cruel reality fact - if_vmove doesn't move an
interface from previous vnet cloning infrastructure to the new
one.  Let's admit this as design feature and make it work better.

* Delete two blocks of code that would fallback to vnet0, if a
  cloner isn't found.  They didn't do any good job and also whole
  idea of treating vnet0 as special one is wrong.
* When deleting a cloned interface, lookup its cloner using it's
  home vnet.

With this change simple sequence works correctly:

  ifconfig foo0 create
  jail -c name=jj persist vnet vnet.interface=foo0
  jexec jj ifconfig foo0 destroy

Differential revision:	https://reviews.freebsd.org/D33942
2022-01-24 21:07:16 -08:00
Gleb Smirnoff
54712fc423 if_vmove: improve restoration in cloner's ifgroup membership
* Do a single call into if_clone.c instead of two.  The cloner
  can't disappear since the interface sits on its list.
* Make restoration smarter - check that cloner with same name
  exists in the new vnet.

Differential revision:	https://reviews.freebsd.org/D33941
2022-01-24 21:06:59 -08:00
Thomas Steen Rasmussen
bc6abdd97e nd6: use CARP link level address in SLLAO for NS sent out
When sending an NS, check if we are using a IPv6 CARP address
and if we do, then put proper CARP link level address into
ND_OPT_SOURCE_LINKADDR option and also put PACKET_TAG_CARP tag
on the packet.  The latter will enforce CARP link level address
at the data link layer too, which might be necessary for broken
implementations.
The code really follows what NA sending code has been doing since
introduction of carp(4).  While here, bring to style(9) the whole
block of code.

PR:			193280
Differential revision:	https://reviews.freebsd.org/D33858
2022-01-24 21:02:47 -08:00
Eric Joyner
e438f0a975
ice_ddp: Update to 1.3.27.0
This is intended to be used with forthcoming ice(4) driver version 1.34.2.

Signed-off-by: Eric Joyner <erj@FreeBSD.org>

Sponsored by:	Intel Corporation
2022-01-24 18:25:56 -08:00
Eric Joyner
213e91399b
iflib: Allow drivers to determine which queue to TX on
Adds a new function pointer to struct if_txrx in order to allow
drivers to set their own function that will determine which queue
a packet should be sent on.

Since this includes a kernel ABI change, bump the __FreeBSD_version
as well.

(This motivation behind this is to allow the driver to examine the
UP in the VLAN tag and determine which queue to TX on based on
that, in support of HW TX traffic shaping.)

Signed-off-by: Eric Joyner <erj@FreeBSD.org>

Reviewed by:	kbowling@, stallamr@netapp.com
Tested by:	jeffrey.e.pieper@intel.com
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D31485
2022-01-24 18:22:02 -08:00
John Baldwin
2c4b65cc3d Bump __FreeBSD_version for the addition of <crypto/curve25519.h>.
Sponsored by:	The FreeBSD Foundation
2022-01-24 15:28:36 -08:00
John Baldwin
16cf646a6f crypto: Remove xform.c and compile xform_*.c standalone.
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33995
2022-01-24 15:27:40 -08:00
John Baldwin
faf470ffdc xform_*.c: Add headers when needed to compile standalone.
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33994
2022-01-24 15:27:40 -08:00
John Baldwin
991b84eca9 Retire now-unused M_XDATA.
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33993
2022-01-24 15:27:39 -08:00
John Baldwin
35d9e00dba IPsec: Use protocol-specific malloc types instead of M_XDATA.
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33992
2022-01-24 15:27:39 -08:00
John Baldwin
8f3f3fdf73 cryptodev: Use a private malloc type (M_CRYPTODEV) instead of M_XDATA.
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33991
2022-01-24 15:27:39 -08:00
John Baldwin
1d95c6f9c0 Don't implicitly pull in most of 'device crypto' for 'options IPSEC'.
options IPSEC is already documented as requiring 'device crypto' and
duplicating the dependencies is harder to read and not always
consistent.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33990
2022-01-24 15:27:39 -08:00
John Baldwin
0c6274a819 crypto: Add an API supporting curve25519.
This adds a wrapper around libsodium's curve25519 support matching
Linux's curve25519 API.  The intended use case for this is WireGuard.

Note that this is not integrated with OCF as it is not related to
symmetric operations on data.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33935
2022-01-24 15:27:39 -08:00
John Baldwin
a8c4147edc cxgbei: Parse all PDUs received prior to enabling offload mode.
Previously this would only handle a single PDU that did not contain
any data.  This should now handle an arbitrary number of PDUs.

While here check for these PDUs in the T6-specific CPL_RX_ISCSI_CMP
handler in addition to CPL_RX_ISCSI_DDP.

Reported by:	Jithesh Arakkan @ Chelsio
Sponsored by:	Chelsio Communications
2022-01-24 14:20:02 -08:00
Warner Losh
802f8d4afe mpr/mps: Remove write-only flag and callout
The discovery callout is initialized and cancelled only, making it
write-only. Remove a state flag associated with it being pending as well
as two defines that aren't used that are associated with it. Remove
MP?SAS_SHUTDOWN flag, which is unused.

Sponsored by:		Netflix
Reviewed by:		ken, scottl, mav
Differential Revision:	https://reviews.freebsd.org/D33925
2022-01-24 13:21:09 -07:00
John Baldwin
308fc7e5b1 user_getpeername: Use 'bool' for the compat argument.
This matches user_getsockname.

Reviewed by:	brooks, kib
Sponsored by:	The University of Cambridge, Google Inc.
Differential Revision:	https://reviews.freebsd.org/D33987
2022-01-24 09:51:35 -08:00
Kevin Lo
dea952c3e2 modules: mgb: need opt_platform.h
This fixes the standalone build.
2022-01-24 13:38:39 +08:00
Philippe Michaud-Boudreault
45f0e57105 sound: add patch for Lenovo Legion 5 AMD
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D30333
2022-01-23 15:04:25 -05:00
Michal Krawczyk
8a5b4859c7 ena: update ENA version to v2.5.0
Some of the changes in this release:
- IPv6 L4 checksum offload fixes.
- Optimization of the Tx req_id validation.
- Timer service adjustments.
- NUMA awareness for the kernel RSS mode.

Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
2022-01-23 20:48:33 +01:00
Dawid Gorecki
d10ec3ad77 ena: do not call reset if device is unresponsive
If the device becomes unresponsive, the driver will not be able to
finish the reset process correctly. Timeout during version validation
indicates that the device is currently not responding. In that case
do not perform the reset and instead reschedule timer service. Because
of that the driver will continue trying to reset the device until it
succeeds or is detached.

Submitted by: Dawid Gorecki <dgr@semihalf.com>
Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
2022-01-23 20:48:33 +01:00
Dawid Gorecki
78554d0c70 ena: start timer service on attach
The timer service was started when the interface was brought up and it
was stopped when it was brought down. Since ena_up requires the device
to be responsive, triggering the reset would become impossible if the
device became unresponsive with the interface down.

Since most of the functions in timer service already perform the check
to see if the device is running, this only requires starting the callout
in attach and stopping it when bringing the interface up or down to
avoid race between different admin queue calls.

Since callout functions for timer service are always called with the
same arguments, replace callout_{init,reset,drain} calls with
ENA_TIMER_{INIT,RESET,DRAIN} macros.

Submitted by: Dawid Gorecki <dgr@semihalf.com>
Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
2022-01-23 20:48:32 +01:00
Artur Rojek
b168d0c850 ena: rework tx req_id validation logic
Since `ena_com_tx_comp_req_id_get` already checks for `req_id` validity,
the logic was exiting early, never giving `validate_tx_req_id` a chance
to trigger device reset.
Rewrite the logic so that device reset is called based on return value
of `ena_com_tx_comp_req_id_get` instead.

Submitted by: Artur Rojek <ar@semihalf.com>
Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
2022-01-23 20:38:12 +01:00
Dawid Gorecki
2bbef9d95d ena: properly handle IPv6 L4 checksum offload
ena_tx_csum function did not check if IPv6 checksum offload was
requested it only checked checksum offloading flags for IPv4 packets.
Because of that, when encountering CSUM_IP6_* flags, the function simply
returned without actually setting checksum offloading in ena_ctx.
Check CUSM_IP6_* flags to enable IPv6 checksum offload.

Additionally, only IPv4 header was being parsed regardless of EtherType
field, because of that, value of L4 protocol read when actually trying
to send IPv6 packets was wrong. Use ip6_lasthdr function to get length
of all IPv6 headers and payload protocol.

Set the DF flag to 1 in order to allow the device to offload the IPv6
checksum calculation and achieve optimal performance.

Add CSUM6_OFFLOAD and CSUM_OFFLOAD definitions into ena_datapath.h.

Submitted by: Dawid Gorecki <dgr@semihalf.com>
Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
2022-01-23 20:38:01 +01:00
Marcin Wojtas
eb4c4f4a2e ena: merge ena-com v2.5.0 upgrade
Merge commit '2530eb1fa01bf28fbcfcdda58bd41e055dcb2e4a'

Adjust the driver to the upgraded ena-com part twofold:

First update is related to the driver's NUMA awareness.

Allocate I/O queue memory in NUMA domain local to the CPU bound to the
given queue, improving data access time. Since this can result in
performance hit for unaware users, this is done only when RSS
option is enabled, for other cases the driver relies on kernel to
allocate memory by itself.

Information about first CPU bound is saved in adapter structure, so
the binding persists after bringing the interface down and up again.

If there are more buckets than interface queues, the driver will try to
bind different interfaces to different CPUs using round-robin algorithm
(but it will not bind queues to CPUs which do not have any RSS buckets
associated with them). This is done to better utilize hardware
resources by spreading the load.

Add (read-only) per-queue sysctls in order to provide the following
information:
- queueN.domain: NUMA domain associated with the queue
- queueN.cpu:    CPU affinity of the queue

The second change is for the CSUM_OFFLOAD constant, as ENA platform
file has removed its definition. To align to that change, it has been
added to the ena_datapath.h file.

Submitted by: Artur Rojek <ar@semihalf.com>
Submitted by: Dawid Gorecki <dgr@semihalf.com>
Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
2022-01-23 20:27:13 +01:00
Martin Matuska
5025e85013 zfs: fix kernel build after e92ffd9b6 if ZFS is compiled in
Add missing source file lz4_zfs.c to sys/conf/files
2022-01-23 09:27:27 +01:00
Martin Matuska
e92ffd9b62 zfs: merge openzfs/zfs@17b2ae0b2 (master) into main
Notable upstream pull request merges:
  #12766 Fix error propagation from lzc_send_redacted
  #12805 Updated the lz4 decompressor
  #12851 FreeBSD: Provide correct file generation number
  #12857 Verify dRAID empty sectors
  #12874 FreeBSD: Update argument types for VOP_READDIR
  #12896 Reduce number of arc_prune threads
  #12934 FreeBSD: Fix zvol_*_open() locking
  #12947 lz4: Cherrypick fix for CVE-2021-3520
  #12961 FreeBSD: Fix leaked strings in libspl mnttab
  #12964 Fix handling of errors from dmu_write_uio_dbuf() on FreeBSD
  #12981 Introduce a flag to skip comparing the local mac when raw sending
  #12985 Avoid memory allocations in the ARC eviction thread

Obtained from:	OpenZFS
OpenZFS commit:	17b2ae0b24
2022-01-22 23:05:15 +01:00
Michał Górny
028a372fe2 gdb(4): Do not use run length encoding for 3-symbol repetitions
Disable the gdb packet run length encoding for 3-symbol repetitions.
While it is technically possible to encode them, they have no advantage
over sending the characters verbatim (the resulting length is the same)
and they result in sending non-printable \x1f character.  The protocol
has been designed with the intent of avoiding non-printable characters
and therefore the run length encoding is biased to emit \x20 (a space)
with the minimal intended run length of 4.

While at it, simplify the logic by merging the different 'if' blocks
into a single while loop, and moving 'runlen == 0' check lower.

Reviewed by:	cem, emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33686
2022-01-22 14:46:06 -05:00
Ed Maste
2075d00fab hwpmc: drop 0x before %p printf format string
%p already includes the 0x.

Sponsored by:	The FreeBSD Foundation
2022-01-22 13:39:05 -05:00
Konstantin Belousov
fe6db72708 Add security.bsd.allow_ptrace sysctl
that disables any access to ptrace(2) for all processes.

Reviewed by:	emaste
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33986
2022-01-22 19:36:56 +02:00
Konstantin Belousov
55a0aa2162 p_candebug(), p_cansee(): always allow for curproc
Privilege checks in both functions should allow the current process to
infer information about itself, as well as use the interfaces that are
proclaimed 'debugging', for instance, procctl(2).

Note that in p_cansee() case, explicit comparision of curproc and p
avoids a race where the process might change credentials and cause
thread to compare its cached stale credentials against updated process
creds, effectively disallowing the process to observe itself.

Reviewed by:	emaste
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33986
2022-01-22 19:36:56 +02:00
Konstantin Belousov
3de96d664a vm_pageout_scans: correct detection of active object
For non-anonymous swap objects, there is always a reference from the
owner to the object to keep it from recycling.  Account for it when
deciding should we query pmap for hardware active references for the
page.

As result, we avoid unneeded calls to pmap_ts_referenced(), which for
non-mapped page means avoiding unneccessary lock and unlock of the pv list.

Reviewed by:	markj
Discussed with:	alc
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33924
2022-01-22 19:34:32 +02:00
Wojciech Macek
0daa28057c ip_mroute: add unlock in early-exit
Add missing unlock if V_ip_mrotue is not set

Obtained from:		Semihalf
2022-01-22 14:48:47 +01:00
Wojciech Macek
889c60500d ip_mroute: release epoch lock if mrouter is not configured
Add mising "else" branch to release a lock if mrouter is not
configured.

Obtained from:		Semihalf
Sponsored by:		Stormshield
2022-01-22 11:48:30 +01:00
Ka Ho Ng
fa66950534 iscsi: Fix missing is_lock unlock after cam_simq_alloc() failed
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2022-01-21 16:34:18 -05:00
Takanori Watanabe
eb815a7419 atrtc: Install address space handler for \_SB and its descendant.
SystemCMOS address space is accessible for system wide.
 So install address handler in \_SB space.

Reviewed by: jhb

Differential Revision: https://reviews.freebsd.org/D33892
2022-01-21 15:32:30 +09:00
Takanori Watanabe
5c69be7084 acpi: Ignore _STA and never disable AT RTC devices
atrtc(4) should always install a SystemCMOS address space handler unless
the RTC Not Present bit is not set in IAPC_BOOT_ARCH in the FADT.
The atrtc(4) driver already checks this bit, but _STA can return not-present
even when this bit is clear.

Reviewed by : jhb
Differential Revision: https://reviews.freebsd.org/D33891
2022-01-21 15:30:46 +09:00
Wojciech Macek
9ce46cbc95 ip_mroute: move ip_mrouter_done outside lock
X_ip_mrouter_done might sleep, which triggers INVARIANTS to
print additional errors on the screen.
Move it outside the lock, but provide some basic synchronization
to avoid race condition during module uninit/unload.

Obtained from:		Semihalf
Sponsored by:		Stormshield
2022-01-21 06:17:19 +01:00
Wojciech Macek
58630bdd13 Revert "ip_mroute: do not call epoch_waitwhen lock is taken"
This reverts commit 2e72208b6c.
2022-01-21 06:17:19 +01:00
Piotr Kubaj
a0f3abb098 powerpc: enable ice in GENERIC64LE
Approved by:	erj
Differential Revision: https://reviews.freebsd.org/D33974
2022-01-21 02:17:46 +01:00
John Baldwin
89e0ee0db4 chacha20_poly1305: Use the correct license disclaimer.
Reviewed by:	emaste
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33976
2022-01-20 14:36:48 -08:00
Mark Johnston
6be8944d96 ktls: Zero out TLS_GET_RECORD control messages
Otherwise we end up copying one uninitialized byte into the socket
buffer.

Reported by:	KMSAN
Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33953
2022-01-20 15:42:46 -05:00
Mark Johnston
d91d2b513e geom: Handle partial I/O in g_{read,write,delete}_data()
These routines are used internally by GEOM to dispatch I/O requests to a
provider, typically for tasting or for updating GEOM class metadata
blocks.

These routines assumed that partial I/O did not occur without setting
BIO_ERROR, but this is possible in at least two cases:
- Some or all of the I/O range is beyond the provider's mediasize.
  In this scenario g_io_check() truncates the bounds of the request
  before it is handed to the target provider.
- A read from vnode-backed md(4) device returns EOF (the backing vnode
  is allowed to be smaller than the device itself) or partial vnode I/O
  occurs.
In these scenarios g_read_data() could return a partially uninitialized
buffer.  Many consumers are not affected by the first case, since the
offsets used for provider metadata or tasting are relative to the
provider's mediasize, but in some cases metadata is read at fixed
offsets, such as when searching for a UFS superblock using the offsets
defined by SBLOCKSEARCH.

Thus, modify the routines to explicitly check for a non-zero residual
and return EIO in that case.  Remove a related check from the
DIOCGDELETE ioctl handler, it is handled within g_delete_data() now.

Reviewed by:	mav, imp, kib
Reported by:	KMSAN
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31293
2022-01-20 08:29:39 -05:00
Mark Johnston
526ddf174e vtnet: Mark MRG_RXBUF headers as initialized before loading fields
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2022-01-20 08:25:14 -05:00
Mark Johnston
3d8562348c fusefs: Address -Wunused-but-set-variable warnings
Reviewed by:	asomers
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33957
2022-01-20 08:25:00 -05:00
Mark Johnston
c3196306f0 clockcalib: Fix an overflow bug
tc_counter_mask is an unsigned int and in the TSC timecounter is equal
to UINT_MAX, so the addition tc->tc_counter_mask + 1 can overflow to 0,
resulting in a hang during boot.

Fixes:		c2705ceaeb ("x86: Speed up clock calibration")
Reviewed by:	cperciva
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33956
2022-01-20 08:23:38 -05:00
Mitchell Horne
eb81812fb7 riscv: fix unused var in page_fault_handler()
clang warns that p is set-but-not-used, so let's use it.
2022-01-19 17:21:25 -04:00
Alan Somers
170a0a8ebb ses: minor cleanup
* Prefer variables of small scope rather than large scope
* Remove a magic number
* style(9) for return statements
* Remove the get_enc_status method, which never did anything
* Fix a variable type in the handle_string method
* Proofread some comments

MFC after:	2 weeks
Sponsored by:	Spectra Logic, Axcient
Reviewed by:	ken, mav
Differential Revision: https://reviews.freebsd.org/D31686
2022-01-19 12:08:03 -07:00
Mark Johnston
6c7e4d72b1 vt: Use a taskqueue to clear splash_cpu logos
vt_fini_logos() calls vtbuf_grow(), which reallocates the console
window's buffer using malloc(M_WAITOK).  Because vt_fini_logos() is
called via a callout, we end up panicking if INVARIANTS is enabled.

Fix the problem simply by clearing the logos using a timed taskqueue.
taskqueue_thread is formally allowed to sleep; of course, if we actually
end up sleeping to satisfy the allocation, then we have bigger problems.

PR:		260896
Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33932
2022-01-19 10:53:15 -05:00
Andrew Turner
2ad1999722 Add the Armv8.3-SPE registers 2022-01-19 12:07:35 +00:00
Andrew Turner
b5876847ac Teach DTrace about BTI on arm64
The Branch Target Identification (BTI) Armv8-A extension adds new
instructions that can be placed where we may indirrectly branch to,
e.g. at the start of a function called via a function pointer. We can't
emulate these in DTrace as the kernel will have raised a different
exception before the DTrace handler has run.

Skip over the BTI instruction if it's used as the first instruction in
a function.

Sponsored by:	The FreeBSD Foundation
2022-01-19 12:07:35 +00:00
Doug Moore
0ce7909cd0 vm_phys: add essential segment bounds check
A lower-bound segment check is necessary in vm_phys_alloc_seg_contig.
Add one.

Reported by:	jenkins
Reviewed by:	alc
Fixes:	da92ecbc0d vm_phys: fix seg->end test in alloc_seg_contig
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33945
2022-01-19 00:42:39 -06:00
Alan Somers
89d57b94d7 fusefs: implement VOP_DEALLOCATE
MFC after:	Never
Reviewed by:	khng
Differential Revision: https://reviews.freebsd.org/D33800
2022-01-18 21:13:02 -07:00
Alexander Motin
b7ff445ffa Reduce bufdaemon/bufspacedaemon shutdown time.
Before this change bufdaemon and bufspacedaemon threads used
kthread_shutdown() to stop activity on system shutdown.  The problem is
that kthread_shutdown() has no idea about the wait channel and lock used
by specific thread to wake them up reliably.  As result, up to 9 threads
could consume up to 9 seconds to shutdown for no good reason.

This change introduces specific shutdown functions, knowing how to
properly wake up specific threads, reducing wait for those threads on
shutdown/reboot from average 4 seconds to effectively zero.

MFC after:	2 weeks
Reviewed by:	kib, markj
Differential Revision:  https://reviews.freebsd.org/D33936
2022-01-18 19:26:16 -05:00
John Baldwin
dd2f7a4b45 Bump __FreeBSD_version for the addition of <crypto/chacha20_poly1305.h>.
Sponsored by:	The FreeBSD Foundation
2022-01-18 14:49:24 -08:00
John Baldwin
42876a039e crypto: Stop compiling in chacha20poly1305 AEAD ciphers from libsodium.
These ciphers are now supported via OCF or 'struct enc_xform'.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33889
2022-01-18 14:48:40 -08:00
John Baldwin
e71680049b crypto: Add a simple API for [X]ChaCha20-Poly1035 on flat buffers.
This is a synchronous software API which wraps the existing software
implementation shared with OCF.  Note that this will not currently
use optimized backends (such as ossl(4)) but may be appropriate for
operations on small buffers.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33524
2022-01-18 14:47:13 -08:00
Vladimir Kondratyev
89889ab470 LinuxKPI: Allow wake_up to be executed within a critical section
by replaceing of spin_lock() call with spin_lock_irqsave()

This fixes following panic in drm-kmod:

panic: mi_switch: switch in a critical section
cpuid = 2
time = 1636939794
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b
vpanic() at vpanic+0x187
panic() at panic+0x43
mi_switch() at mi_switch+0x198
__mtx_lock_sleep() at __mtx_lock_sleep+0x1c9
__mtx_lock_flags() at __mtx_lock_flags+0xa2
linux_wake_up() at linux_wake_up+0x38
__active_retire() at __active_retire+0xb7
dma_fence_signal() at dma_fence_signal+0x100
dma_resv_add_shared_fence() at dma_resv_add_shared_fence+0x96
i915_gem_do_execbuffer() at i915_gem_do_execbuffer+0x11d0
i915_gem_execbuffer2_ioctl() at i915_gem_execbuffer2_ioctl+0x19a
drm_ioctl_kernel() at drm_ioctl_kernel+0x72
drm_ioctl() at drm_ioctl+0x2c4
linux_file_ioctl() at linux_file_ioctl+0x297
kern_ioctl() at kern_ioctl+0x1dc
sys_ioctl() at sys_ioctl+0x124
amd64_syscall() at amd64_syscall+0x124
fast_syscall_common() at fast_syscall_common+0xf8
--- syscall (54, FreeBSD ELF64, sys_ioctl)

MFC after:	1 week
Reviewed by:	manu
Reported by:	Graham Perrin <grahamperrin_AT_gmail_DOT_com>
PR:		261166
Differential Revision:	https://reviews.freebsd.org/D33888
2022-01-18 23:14:13 +03:00
Vladimir Kondratyev
02ea603302 LinuxKPI: Allow spin_lock_irqsave to be called within a critical section
with spinning on spin_trylock. dma-buf part of drm-kmod depends on this
property and absence of it support results in "mi_switch: switch in a
critical section" assertions [1][2].

[1] https://github.com/freebsd/drm-kmod/issues/116
[2] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=261166

MFC after:	1 week
Reviewed by:	manu
Differential Revision:	https://reviews.freebsd.org/D33887
2022-01-18 23:14:12 +03:00
Robert Wing
a50e92cc20 geom: add kqfilter support for geom dev
The only event hooked up is NOTE_ATTRIB, which is triggered when the
device is resized. Support for other NOTE_* events to follow.

Reviewed by:	kib, jhb
Differential Revision:	https://reviews.freebsd.org/D33402
2022-01-18 10:54:59 -09:00
Kenneth D. Merry
6e8a2f0400 Update sa(4) comments and man page after review.
sys/cam/scsi/scsi_sa.c:
	Add comments explaining the priority order of the various
	sources of timeout values.  Also, explain that the probe
	that pulls in drive recommended timeouts via the REPORT
	SUPPORTED OPERATION CODES command is in a race with the
	thread that creates the sysctl variables.  Because of that
	race, it is important that the sysctl thread not load any
	timeout values from the kernel environment.

share/man/man4/sa.4:
	Use the Sy macro to emphasize thousandths of a second
	instead of capitalizing it.

Requested by:	Warner Losh <imp@freebsd.org>
Requested by:	Daniel Ebdrup Jensen <debdrup@freebsd.org>
Sponsored by:	Spectra Logic
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33883
2022-01-18 13:50:31 -05:00
Kenneth D. Merry
5719b5a1bb Switch to using drive-supplied timeouts for the sa(4) driver.
Summary:
The sa(4) driver has historically used tape drive timeouts that
were one-size fits all, with compile-time options to adjust a few
of them.

LTO-9 drives (and presumably other tape drives in the future)
implement a tape characterization process that happens the first
time a tape is loaded.  The characterization process formats the
tape to account for the temperature and humidity in the environment
it is being used in.  The process for LTO-9 tapes can take from 20
minutes (I have observed 17-18 minutes) to 2 hours according to the
documentation.

As a result, LTO-9 drives have significantly longer recommended
load times than previous LTO generations.

To handle this, change the sa(4) driver over to using timeouts
supplied by the tape drive using the timeout descriptors obtained
through the REPORT SUPPORTED OPERATION CODES command.  That command
was introduced in SPC-4.  IBM tape drives going back to at least
LTO-5 report timeout values.  Oracle/Sun/StorageTek tape drives
going back to at least the T10000C report timeout values.  HP LTO-5
and newer drives report timeout values.  The sa(4) driver only
queries drives that claim to support SPC-4.

This makes the timeout settings automatic and accurate for newer
tape drives.

Also, add loader tunable and sysctl support so that the user can
override individual command type timeouts for all tape drives in
the system, or only for specific drives.

The new global (these affect all tape drives) loader tunables are:

	kern.cam.sa.timeout.erase
	kern.cam.sa.timeout.load
	kern.cam.sa.timeout.locate
	kern.cam.sa.timeout.mode_select
	kern.cam.sa.timeout.mode_sense
	kern.cam.sa.timeout.prevent
	kern.cam.sa.timeout.read
	kern.cam.sa.timeout.read_position
	kern.cam.sa.timeout.read_block_limits
	kern.cam.sa.timeout.report_density
	kern.cam.sa.timeout.reserve
	kern.cam.sa.timeout.rewind
	kern.cam.sa.timeout.space
	kern.cam.sa.timeout.tur
	kern.cam.sa.timeout.write
	kern.cam.sa.timeout.write_filemarks

The new per-instance loader tunable / sysctl variables are:

	kern.cam.sa.%d.timeout.erase
	kern.cam.sa.%d.timeout.load
	kern.cam.sa.%d.timeout.locate
	kern.cam.sa.%d.timeout.mode_select
	kern.cam.sa.%d.timeout.mode_sense
	kern.cam.sa.%d.timeout.prevent
	kern.cam.sa.%d.timeout.read
	kern.cam.sa.%d.timeout.read_position
	kern.cam.sa.%d.timeout.read_block_limits
	kern.cam.sa.%d.timeout.report_density
	kern.cam.sa.%d.timeout.reserve
	kern.cam.sa.%d.timeout.rewind
	kern.cam.sa.%d.timeout.space
	kern.cam.sa.%d.timeout.tur
	kern.cam.sa.%d.timeout.write
	kern.cam.sa.%d.timeout.write_filemarks

The values are reported and set in units of thousandths of a
second.

share/man/man4/sa.4:
	Document the new loader tunables in the sa(4) man page.

sys/cam/scsi/scsi_sa.c:
	Add a new timeout_info array to the softc.

	Add a default timeouts array, along with descriptions.

	Add a new sysctl tree to the softc to handle the timeout
	sysctl values.

	Add a new function, saloadtotunables(), that will load
	the global loader tunables first and then any per-instance
	loader tunables second.

	Add creation of the new timeout sysctl variables in
	sasysctlinit().

	Add a new, optional probe state to the sa(4) driver.  We
	previously didn't do any probing, but now we probe for
	timeout descriptors if the drive claims to support SPC-4 or
	later.  In saregister(), we check the SCSI revision and
	either launch the probe state machine, or announce the
	device and become ready.

	In sastart() and sadone(), add support for the new
	SA_STATE_PROBE.  If we're probing, we don't go through
	saerror(), since that is currently only written to handle
	I/O errors in the normal state.

	Change every place in the sa(4) driver that fills in
	timeout values in a CCB to use the new timeout_info[] array
	in the softc.

	Add a new saloadtimeouts() routine to parse the returned
	timeout descriptors from a completed REPORT SUPPORTED
	OPERATION CODES command, and set the values for the
	commands we support.

MFC after:	1 week
Sponsored by:	Spectra Logic

Test Plan:
Try this out with a variety of tape drives and make sure the timeouts that
result (sysctl kern.cam.sa to see them) are reasonable.

Reviewers: #manpages, #cam

Subscribers: imp

Differential Revision: https://reviews.freebsd.org/D33883
2022-01-18 13:50:30 -05:00
Doug Moore
da92ecbc0d vm_phys: fix seg->end test in alloc_seg_contig
In vm_phys_alloc_seg_contig, in allocating multiple memory blocks for
a huge allocation, ensure that the end of the allocated range does not
exceed the upper segment limit.

Reorder a couple of checks to improve code layout.

Reviewed by:	alc
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33870
2022-01-18 12:49:09 -06:00