Commit Graph

136354 Commits

Author SHA1 Message Date
Rick Macklem
a5f9fe2bab copy_file_range(2): Fix for small values of input file offset and len
r366302 broke copy_file_range(2) for small values of
input file offset and len.

It was possible for rem to be greater than len and then
"len - rem" was a large value, since both variables are
unsigned.

Reported by: koobs, Pablo <pablogsal gmail com> (Python)
Reviewed by:	asomers, koobs
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D28981
2021-03-01 06:31:10 -08:00
Alex Richardson
0e4ff0acbe AArch64: Don't set flush-subnormals-to-zero flag on startup
This flag has been set on startup since 65618fdda0.
However, This causes some of the math-related tests to fail as they report
zero instead of a tiny number. This fixes at least
/usr/tests/lib/msun/ldexp_test and possibly others.
Additionally, setting this flag prevents printf() from printing subnormal
numbers in decimal form.
See also https://www.openwall.com/lists/musl/2021/02/26/1

PR:		253847
Reviewed By:	mmel
Differential Revision: https://reviews.freebsd.org/D28938
2021-03-01 14:27:30 +00:00
Mitchell Horne
e152c88273 arm64: add definition for IS_SSTEP_TRAP()
arm64 has a distinct exception code for single-step, so we can use this
to detect when an unexpected SS trap is encountered, or when an expected
one is not. See db_stop_at_pc().

Reviewed by:	markj, jhb
MFC after:	5 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D28942
2021-03-01 10:04:23 -04:00
Mitchell Horne
bd0b7cbf5a arm64: update kdb_thrctx->pcb_lr with BKPT_SKIP
This value should be kept in sync with updates to kdb_frame->tf_elr,
since it is queried by PC_REGS() in several places.

Reviewed by:	markj, jhb
MFC after:	5 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D28943
2021-03-01 10:04:22 -04:00
Mitchell Horne
874635e381 arm64: fix hardware single-stepping from EL1
The main issue is that debug exceptions must to be disabled for the
entire duration that SS bit in MDSCR_EL1 is set. Otherwise, a
single-step exception will be generated immediately. This can occur
before returning from the debugger (when MDSCR is written to) or before
re-entering it after the single-step (when debug exceptions are unmasked
in the exception handler).

Solve this by delaying the unmask to C code for EL1, and avoid unmasking
at all while handling debug exceptions, thus avoiding any recursive
debug traps.

Reviewed by:	markj, jhb
MFC after:	5 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D28944
2021-03-01 10:04:22 -04:00
Michal Meloun
ce5a4083de pci_dw_mv: Don't enable unhandled interrupts.
Mainly link errors interrupts should only be activated on fully linked port,
otherwise noise on lanes can cause livelock. But we don't have error
counters yet, so leave these interrupts disabled.
2021-03-01 14:03:34 +01:00
Elliott Mitchell
a2c0e94ccf xen: remove x86-ism from Xen common code
PAT_WRITE_BACK is x86-only, whereas sys/dev/xen could be shared
between multiple architectures.

Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D28831
2021-03-01 13:33:01 +01:00
Mateusz Guzik
2c1c1255e4 kcsan: add atomic_interrupt_fence
Unbreaks building GENERIC-KCSAN.
2021-03-01 07:43:27 +00:00
Rick Macklem
15bed8c46b nfsclient: add nfs node locking around uses of n_direofoffset
During code inspection I noticed that the n_direofoffset field
of the NFS node was being manipulated without any lock being
held to make it SMP safe.
This patch adds locking of the NFS node's mutex around
handling of n_direofoffset to make it SMP safe.

I have not seen any failure that could be attributed to n_direofoffset
being manipulated concurrently by multiple processors, but I think this
is possible, since directories are read with shared vnode
locking, plus locks only on individual buffer cache blocks.
However, there have been as yet unexplained issues w.r.t reading
large directories over NFS that could have conceivably been caused
by concurrent manipulation of n_direofoffset.

MFC after:	2 weeks
2021-02-28 14:53:54 -08:00
Rick Macklem
3e04ab36ba nfsclient: add checks for a server returning the current directory
Commit 3fe2c68ba2 dealt with a panic in cache_enter_time() where
the vnode referred to the directory argument.
It would also be possible to get these panics if a broken
NFS server were to return the directory as an new object being
created within the directory or in a Lookup reply.

This patch adds checks to avoid the panics and logs
messages to indicate that the server is broken for the
file object creation cases.

Reviewd by:	kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D28987
2021-02-28 14:15:32 -08:00
Elliott Mitchell
530d38441d armv8crypto: add missing newline
The missing newline mildly garbles boot-time messages and this can be
troublesome if you need those.

Fixes:		a520f5ca58 ("armv8crypto: print a message on probe failure")
Reported by:	Mike Karels (mike@karels.net)
Reviewed By:	gonzo
Differential Revision:	https://reviews.freebsd.org/D28988
2021-02-28 16:03:55 -04:00
Bjoern A. Zeeb
a9cc796fa7 net80211: rx_stats add 160Mhz channel width.
Add the missing receive stat(u)s flag for 160Mhz channel width.
While here correct the comment for c_phytype to reference the correct
flags.

MFC-after:	3 days
Sponsored-by:	Rubicon Communications, LLC ("Netgate")
2021-02-28 19:24:22 +00:00
Richard Scheffenegger
e9071000c9 Improve PRR initial transmission timing
Reviewed By:	tuexen, #transport
MFC after:	3 days
Sponsored by:	NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D28953
2021-02-28 15:46:54 +01:00
Alexander Motin
d01032736c Fix diroffdiroff, probably copy/paste bug.
Too long name looks bad in `vmstat -m`.

MFC after:	1 week
2021-02-28 09:08:31 -05:00
Rick Macklem
3fe2c68ba2 nfsclient: fix panic in cache_enter_time()
Juraj Lutter (otis@) reported a panic "dvp != vp not true" in
cache_enter_time() called from the NFS client's nfsrpc_readdirplus()
function.
This is specific to an NFSv3 mount with the "rdirplus" mount
option. Unlike NFSv4, NFSv3 replies to ReaddirPlus
includes entries for the current directory.

This trivial patch avoids doing a cache_enter_time()
call for the current directory to avoid the panic.

Reported by:	otis
Tested by:	otis
Reviewed by:	mjg
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D28969
2021-02-27 17:54:05 -08:00
Konstantin Belousov
b5449c92b4 Use atomic_interrupt_fence() instead of bare __compiler_membar()
for the which which definitely use membar to sync with interrupt handlers.

libc and rtld uses of __compiler_membar() seems to want compiler barriers
proper.

The barrier in sched_unpin_lite() after td_pinned decrement seems to be not
needed and removed, instead of convertion.

Reviewed by:	markj
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D28956
2021-02-28 01:27:29 +02:00
Mateusz Guzik
1d8510c1a6 zfs: add missing seqc write begin/end around zfs_acl_chown_setattr
It happens to trip over an assert but does not matter for correctness at
this time. However, do it for future proofing.

Reported by:	avg
2021-02-27 22:29:50 +00:00
Mateusz Guzik
1239a72221 cache: temporarily drop the assert that dvp != vp when adding an entry
Historically it was allowed for any names, but arguably should never be
even attempted. Allow it again since there is a release pending and
allowing it is bug-compatible with previous behavior.

Reported by:	otis
2021-02-27 22:29:50 +00:00
Michael Tuexen
70e95f0b69 sctp: avoid integer overflow when starting the HB timer
MFC after:	3 days
Reported by:	syzbot+14b9d7c3c64208fae62f@syzkaller.appspotmail.com
2021-02-27 23:27:30 +01:00
Robert Watson
a92c6b24c0 Add a comment on why the call to mac_vnode_relabel() might be in the wrong
place -- in the VOP rather than vn_setexttr() -- and that it is for historic
reasons.  We might wish to relocate it in due course, but this way at least
we document the asymmetry.
2021-02-27 16:25:26 +00:00
Alexander Motin
9d9fd8b79f Micro-optimize OOA queue processing.
- Move ctl_get_cmd_entry() calls from every OOA traversal to when
  the requests first inserted, storing seridx in struct ctl_scsiio.
- Move some checks out of the loop in ctl_check_ooa().
- Replace checks for errors that can not happen with asserts.
- Transpose ctl_serialize_table, so that any OOA traversal accessed
  only one row (cache line).  Compact it from enum to uint8_t.
- Optimize static branch predictions in hottest places.

Due to O(n) nature on deep LUN queues this can be the hottest code
path in CTL, and additional 20% of IOPS I see in some 4KB I/O tests
are good to have in reserve.  About 50% of CPU time here according
to the profiles is now spent in two memory accesses per traversed
request in OOA.

Sponsored by:	iXsystems, Inc.
MFC after:	2 weeks
2021-02-27 10:40:24 -05:00
Warner Losh
c01da939b0 cardbus: Be sure to acquire Giant when calling into newbus
Acquire Giant in cardbus_detach_card. This used to be done above us, but no
more.

Tested by: kargl@
MFC After: 3 days
2021-02-27 01:23:09 -07:00
Martin Matuska
c170aa9f37 zfs: add missing checks for unsupported features
After the merge of OpenZFS master-9312e0fd1 it has become possible to
import ZFS pools witn an active org.illumos:edonr feature on FreeBSD,
leading to a panic.

In addition, "zpool status" reported all pools without edonr as upgradable
and "zpool upgrade -v" lists edonr in the list of upgradable features.

This is an accepted but not yet included bugfix by upstream.

Obtained from:		https://github.com/openzfs/zfs/pull/11653
Differential Revision:	https://reviews.freebsd.org/D28935
Reported by:		garga (on freebsd-current@)
Reviewed by:		freqlabs
X-MFC-with:		ba27dd8be8
2021-02-27 00:05:50 +01:00
Richard Scheffenegger
9e83a6a556 Include new data sent in PRR calculation
Reviewed By:	#transport, kbowling
MFC after:	3 days
Sponsored by:	NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D28941
2021-02-26 22:31:58 +01:00
Warner Losh
34d6961108 cam: add new ASC and ASCQ values related to drive depopulation
Add 04/25 Depopulation restoration in progress, 31/04 Depopulation failed, and
31/05 Depopulation restoration failed.

These are defined in SPC-6r2 (though 31/4 was added in an earlier draft). They
relate to different aspects of in-progress or failed depopulation removal and
restoration commands.
2021-02-26 11:45:06 -07:00
Navdeep Parhar
dfff1de729 cxgbe(4): Read the rx 'c' channel for a port and make it available.
MFC after:	1 week
Sponsored by:	Chelsio Communications
2021-02-25 23:46:14 -08:00
Jamie Gritton
589e4c1df4 jail: Add safety around prison_deref() flags.
do_jail_attach() now only uses the PD_XXX flags that refer to lock
status, so make sure that something else like PD_KILL doesn't slip
through.

Add a KASSERT() in prison_deref() to catch any further PD_KILL misuse.
2021-02-25 20:10:42 -08:00
Jamie Gritton
108a9384e9 jail: Fix locking on an early jail_set error.
I had locked allprison_lock without immediately setting PD_LIST_LOCKED.
2021-02-25 19:52:58 -08:00
Mitchell Horne
1fc928770b Remove stale references to opt_sio.h
The sio(4) driver was removed entirely in 2019, commit 71f0077631.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D28929
2021-02-25 21:43:12 -04:00
Alexander Motin
a9bd22814f Remove pointless lun->be_lun checks.
There is no such thing as LUN without backend, at least for years.

MFC after:	1 week
2021-02-25 19:48:03 -05:00
Mark Johnston
aac25e2225 pmap: Fix largemap restart checks in the kernel_maps sysctl handler
The purpose of these checks is to ensure that the address of the
next-level page table page is valid, since nothing is synchronizing with
a concurrent update of the large map and large map PTPs are freed to the
system.  However, if PG_PS is set, there is no next level.

Reported by:	rpokala
Reviewed by:	kib
Tested by:	rpokala
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D28922
2021-02-25 18:49:47 -05:00
John Baldwin
90972f0402 ktls: Use COUNTER_U64_DEFINE_EARLY for the ktls_toe_chacha20 counter.
I missed updating this counter when rebasing the changes in
9c64fc4029 after the switch to
COUNTER_U64_DEFINE_EARLY in 1755b2b989.

Fixes:		9c64fc4029 Add Chacha20-Poly1305 as a KTLS cipher suite.
Sponsored by:	Netflix
2021-02-25 15:00:13 -08:00
Kristof Provost
5f1b1f184b pf: Fix incorrect fragment handling
A sequence of overlapping IPv4 fragments could crash the kernel in
pf due to an assertion.

Reported by:	Alexander Bluhm
Obtained from:	OpenBSD
MFC after:	3 days
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-02-25 21:51:08 +01:00
Adrian Chadd
547739cc00 [ar71xx] Fix routerstation / routerstation pro redboot FIS probing
Some changes back in ye olde times somewhere has changed the default
block size the flash device exposes.  So, the default geom redboot
FIS probing (to find the partition table structure in flash!)
is no longer finding it.

So, force it to probe at the last 64k of flash regardless of the
underlying flash block size.

Tested:

* Ubiquiti Routerstation pro, boots -HEAD MIPS
2021-02-25 13:14:55 -08:00
Brandon Bergren
5001c579ba [PowerPC64LE] pseries: Fix input buffering logic.
In uart_phyp_get(), when the internal buffer is empty, we make a
hypercall to retrieve up to 16 bytes of input data from the
hypervisor. As this is specified to be returned in BE format, we need
to do a 64-bit byte swap on the first and second half of the data.

If the buffer being passed in was insufficient to return the fetched
data, we store the remainder in the internal buffer and use it to
satisfy the following calls to uart_phyp_get() until it is drained.

However, in this case, we were accidentally byteswapping the internal
buffer again.

Move the byteswapping code to just after the hypercall so it only gets
swapped when we're filling the buffer.

Fixes arrow keys in qemu on pseries, among other console oddities.

Sponsored by:	Tag1 Consulting, Inc.
MFC after:	3 days
2021-02-25 14:50:13 -06:00
Ryan Libby
d7671ad8d6 Close races in vm object chain traversal for unlock
We were unlocking the vm object before reading the backing_object field.
In the meantime, the object could be freed and reused.  This could cause
us to go off the rails in the object chain traversal, failing to unlock
the rest of the objects in the original chain and corrupting the lock
state of the victim chain.

Reviewed by:	bdrewery, kib, markj, vangyzen
MFC after:	3 days
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D28926
2021-02-25 12:11:19 -08:00
Edward Tomasz Napierala
e7a5b3bd05 Modify lock_delay() to increase the delay time after spinning
Modify lock_delay() to increase the delay time after spinning,
not before.  Previously we would spin at least twice instead of once.
In NetApp's benchmarks this fixes a performance regression compared
to FreeBSD 10, which called cpu_spinwait() directly.

Reviewed By:	mjg
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D27331
2021-02-25 18:55:26 +00:00
Richard Scheffenegger
2593f858d7 A TCP server has to take into consideration, if TCP_NOOPT is preventing
the negotiation of TCP features. This affects most TCP options but
adherance to RFC7323 with the timestamp option will prevent a session
from getting established.

PR:	253576
Reviewed By:	tuexen, #transport
MFC after:	3 days
Sponsored by:	NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D28652
2021-02-25 19:12:20 +01:00
Richard Scheffenegger
31d7a27c6e PRR: Avoid accounting left-edge twice in partial ACK.
Reviewed By:	#transport, kbowling
MFC after:	3 days
Sponsored by:	NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D28819
2021-02-25 18:37:47 +01:00
Richard Scheffenegger
48396dc779 Address two incorrect calculations and enhance readability of PRR code
- address second instance of cwnd potentially becoming zero
- fix sublte bug due to implicit int to uint typecase in max()
- fix bug due to typo in hand-coded CEILING() function by using howmany() macro
- use int instead of long, and add a missing long typecast
- replace if conditionals with easier to read imax/imin (as in pseudocode)

Reviewed By: #transport, kbowling
MFC after: 3 days
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D28813
2021-02-25 18:32:04 +01:00
Mark Johnston
369706a6f8 buf: Fix the dirtybufthresh check
dirtybufthresh is a watermark, slightly below the high watermark for
dirty buffers.  When a delayed write is issued, the dirtying thread will
start flushing buffers if the dirtybufthresh watermark is reached.  This
helps ensure that the high watermark is not reached, otherwise
performance will degrade as clustering and other optimizations are
disabled (see buf_dirty_count_severe()).

When the buffer cache was partitioned into "domains", the dirtybufthresh
threshold checks were not updated.  Fix this.

Reported by:	Shrikanth R Kamath <kshrikanth@juniper.net>
Reviewed by:	rlibby, mckusick, kib, bdrewery
Sponsored by:	Juniper Networks, Inc., Klara, Inc.
Fixes:		3cec5c77d6
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D28901
2021-02-25 10:04:44 -05:00
Mark Johnston
faa998f6ff sendfile: Use the pager size to determine the file extent when possible
Previously sendfile would issue a VOP_GETATTR and use the returned size,
i.e., the file size.  When paging in file data, sendfile_swapin() will
use the pager to determine whether it needs to zero-fill, most often
because of a hole in a sparse file.  An attempt to page in beyond the
end of a file is treated this way, and occurs when the requested page is
past the end of the pager.  In other words, both the file size and pager
size were used interchangeably.

With ZFS, updates to the pager and file sizes are not synchronized by
the exclusive vnode lock, at least partially due to its use of
MNTK_SHARED_WRITES.  In particular, the pager size is updated after the
file size, so in the presence of a writer concurrently extending the
file, sendfile could incorrectly instantiate "holes" in the page cache
pages backing the file, which manifests as data corruption when reading
the file back from the page cache.  The on-disk copy is unaffected.

Fix this by consistently using the pager size when available.

Reported by:	dumbbell
Reviewed by:	chs, kib
Tested by:	dumbbell, pho
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D28811
2021-02-25 10:04:44 -05:00
Andrew Turner
3fd63ddfdf Limit when we call DELAY from KCSAN on amd64
In some cases the DELAY implementation on amd64 can recurse on a spin
mutex in the i8254 early delay code. Detect when this is going to
happen and don't call delay in this case. It is safe to not delay here
with the only issue being KCSAN may not detect data races.

Reviewed by:	kib
Tested by:	arichardson
Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D28895
2021-02-25 12:38:05 +00:00
Andrew Turner
59f6ddb2bc Use pmap_qenter in the N1SDP PCIe driver
In the Neoverse N1 SDP PCIe driver we need to map a page shared
between the firmware and the kernel. Previously we would use
pmap_kenter for this, however as this is not standardised between
architectures switch to the common pmap_qenter.

While here fix the error handling code to clean up on failure.

Reviewed by:	br
Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D28890
2021-02-25 12:38:05 +00:00
Kristof Provost
f5537cd069 bridgestp: Ensure we send STP on VLAN interfaces
Reviewed by:	donner@
MFC after:	1 week
X-MFC-with:	711ed156b9
Sponsored by:	Orange Business Services
Differential Revision:	https://reviews.freebsd.org/D28916
2021-02-25 10:16:25 +01:00
Kristof Provost
f3245be349 net: remove legacy in_addmulti()
Despite the comment to the contrary neither pf nor carp use
in_addmulti(). Nothing does, so get rid of it.

Carp stopped using it in 08b68b0e4c
(2011). It's unclear when pf stopped using it, but before
d6d3f01e0a (2012).

Reviewed by:	bz@, melifaro@
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D28918
2021-02-25 10:13:52 +01:00
Jamie Gritton
c861373bdf jail: re-commit 811e27fa3c with fixes
Make sure PD_KILL isn't passed to do_jail_attach, where it might end
up trying to kill the caller's prison (even prison0).

Fix the child jail loop in prison_deref_kill, which was doing the
post-order part during the pre-order part.  That's not a system-
killer, but make jails not always die correctly.
2021-02-24 21:54:49 -08:00
Jamie Gritton
ddfffb41a2 jail: back out 811e27fa3c until it doesn't break Jenkins
Reported by:	arichardson
2021-02-24 21:10:47 -08:00
Marcin Wojtas
ef567155d3 Fix powerpc build after 6dd69f0064
Commit 6dd69f0064 ("iflib: introduce isc_dma_width")
failed to build on powerpc due to implicit type conversion
error. Fix that.

Submitted by: Artur Rojek <ar@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
2021-02-25 02:35:41 +01:00
Ryan Libby
bf667f282a ofed: quiet gcc -Wint-in-bool-context
The int in the argument to the ternary triggered -Wint-in-bool-context
from gcc.  Upstream linux has a larger and more entangled patch,
12f727721eee61b3d19dedb95cb893b2baa9fe41, which doesn't apply cleanly.
When we eventually sync that, we can just drop this change.

Reviewed by:	hselasky, imp, kib
MFC after:	3 days
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D28762
2021-02-24 15:56:16 -08:00