All I/O requests through the taste consumer are synchronous, done
with g_read_data() and without any locks held. It makes no sense
to delegate the I/O to g_down/g_up threads.
This removes many of context switches during disk retaste.
MFC after: 2 weeks
The only cases when direct dispatch does not make sense is for I/O
submission from down thread and for completion from up thread. In
all other cases, if both consumer and producer are OK about it, we
can save on context switches.
MFC after: 2 weeks
Unlike normal consumers all taste consumer I/O is synchronous, done
with g_read_data() and without any locks held. It makes no sense to
delegate I/O submission to g_down thread.
This should remove number of context switches during disk retaste.
MFC after: 2 weeks
Per RFC2822 the maximum transmitted line length is "998 characters...
excluding the CRLF." In a file the maximum is 999 with the \n included.
Previously mail containing a line with exactly 999 characters would
bounce.
PR: 208261
Reported by: Helge Oldach
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Don't emit messages; this isn't any different from a Linux kernel
built without OPTIONS_SECCOMP, so the userspace already needs to know
how to deal with it. This is also similar with how we handle seccomp
in linux_prctl().
Sponsored By: EPSRC
Differential Revision: https://reviews.freebsd.org/D33808
Import bsddialog 0.1 Utility and Library, fully refatorized, API stable,
manuals completed, easier to maintain and improve.
Update deps for new API:
add mixedgauge consts, delete __DECONST and add bsddialog_geterror()
info to avoid silent errors
* tzsetup
* kbdmap
* distextract
Differential Revision: https://reviews.freebsd.org/D34066
When a filesystem is mounted all of its associated snapshots must
be activated. It first allocates a snapshot lock (snaplk) that will
be shared by all the snapshot vnodes associated with the filesystem.
As part of each snapshot file activation, it must replace its own
ufs vnode lock with the snaplk. In this way acquiring the snaplk
gives exclusive access to all the snapshots for the filesystem.
A write to a ufs vnode first acquires the ufs vnode lock for the
file to be written then acquires the snaplk. Once it has the snaplk,
it can check all the snapshots to see if any of them needs to make
a copy of the block that is about to be written. This ffs_copyonwrite()
code path establishes the ufs vnode followed by snaplk locking
order.
When a filesystem is unmounted it has to release all of its snapshot
vnodes. Part of doing the release is to revert the snapshot vnode
from using the snaplk to using its original vnode lock. While holding
the snaplk, the vnode lock has to be acquired, the vnode updated
to reference it, then the snaplk released. Acquiring the vnode lock
while holding the snaplk violates the ufs vnode then snaplk order.
Because the vnode lock is unused, using LK_EXCLUSIVE | LK_NOWAIT
to acquire it will always succeed and the LK_NOWAIT prevents the
reverse lock order from being recorded.
This change was made in January 2021 (173779b98f) to avoid an LOR
violation in ffs_snapshot_unmount(). The same LOR issue was recently
found again when removing a snapshot in ffs_snapremove() which must
also revert the snaplk to the original vnode lock as part of freeing it.
The unwind in ffs_snapremove() deals with the case in which the
snaplk is held as a recursive lock holding multiple references.
Specifically an equal number of references are made on the vnode
lock. This change factors out the lock reversion operations into a
new function revert_snaplock() which handles both the recursive
locks and avoids the LOR. The new revert_snaplock() function is
then used in both ffs_snapshot_unmount() and in ffs_snapremove().
Reviewed by: kib
Tested by: Peter Holm
MFC after: 2 weeks
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D33946
The function nfscl_getcookie(), which is essentially the
same as ncl_getcookie(), is never called, so delete it.
This is probably cruft left over from the port of the
NFSv4 code to FreeBSD several years ago.
Found while modifying the code to better use the
directory offset cookies.
MFC after: 2 weeks
I tested the original commit as part of a series that culminates in
removing this header and installing LLVM libunwind's unwind.h in its
place so missed updating this header as was done in b84693501a.
Pointy hat to: jhb
Reported by: kevans
Fixes: 3a502289d3 Use uintptr_t for return type of _Unwind_GetCFA.
This doesn't work with musl, which defines stdout as FILE * const.
Instead, explicitly pass the desired output stream to ar_read_archive().
No functional change intended.
Reviewed by: emaste
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34064
This matches the type in other unwind headers (LLVM libunwind,
libcxxrt, glibc).
NB: include/unwind.h is not installed but is only used by libthr
Reviewed by: imp, dim, emaste
Differential Revision: https://reviews.freebsd.org/D34049
- Use the semantically correct TSTMP_xx macro when comparing
timestamps. (No functional change)
- check for bad retransmits only when TSopt is present in ACK
(don't assume there will be a valid TSopt in the TCP options struct)
- exclude tsecr == 0, since that most likely indicates an
invalid ts echo return (tsecr) value.
Reviewed By: tuexen, #transport
MFC after: 3 days
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D34062
Under rare circumstances, a spurious retranmission is
incorrectly detected and rewound, messing up various tcpcb values,
which can lead to a panic when SACK is in use.
Reviewed By: tuexen, chengc_netapp.com, #transport
MFC after: 3 days
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D33979
As noted in the PR, cp -R has some surprising behavior. Typically, when
you `cp -R foo bar` where both foo and bar exist, foo is cleanly copied
to foo/bar. When you `cp -R foo foo` (where foo clearly exists), cp(1)
goes a little off the rails as it creates foo/foo, then discovers that
and creates foo/foo/foo, so on and so forth, until it eventually fails.
POSIX doesn't seem to disallow this behavior, but it isn't very useful.
GNU cp(1) will detect the recursion and squash it, but emit a message in
the process that it has done so.
This change seemingly follows the GNU behavior, but it currently doesn't
warn about the situation -- the author feels that the final product is
about what one might expect from doing this and thus, doesn't need a
warning. The author doesn't feel strongly about this.
PR: 235438
Reviewed by: bapt
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D33944
netisr uses global workstreams and after dequeueing an mbuf it
uses rcvif to get the VNET of the mbuf. Of course, this is not
needed when kernel is compiled without VIMAGE. It came out that
routing socket does not set rcvif if compiled without VIMAGE.
Make this assignment not depending on VIMAGE option.
Fixes: 6871de9363
gzip has SMALL conditionals which enable building a reduced size version
of the binary. These exist as part of the introduction of BSD licensed
gzip in 2004 in NetBSD and appear to have been required to reach a size
for inclusion in their install media. For more information see commits
to gzip in the NetBSD tree on the 28th of March 2004.
SMALL doesn't appear to be hooked up to our build system and
complicates gzip quite a bit.
Reviewed by: kevans, imp
Sponsored by: Klara Inc.
Differential Revision: https://reviews.freebsd.org/D34047
This test was written because execvp was found to improperly handle the
argc == 0 case when it falls back from an ENOEXEC. We could probably
mostly revert it now, but let's just fix the test for the time being and
circle back later to decide if we want to simplify execvp. The test
will likely remain either way just to make sure execvp isn't working
around the newly enforced restriction with the fallback.
Fixes: 301cb491ea ("execvp: fix up the ENOEXEC fallback")
Reported by: jenkins via lwhsu@
When run with test, verbose and list we need to parse the file otherwise
the test output is "NOT OK" even for the file is valid.
Reviewed by: kevans, allanjude, imp
Sponsored by: Klara Inc.
Differential Revision: https://reviews.freebsd.org/D34046
When error is called with a message with spaces (and probably multiple
lines) these are passed into dialog unquoted and an error message was
presented, wrap with quotes.
Reviewed by: bapt, allanjude
Sponsored by: Ampere Computing LLC
Sponsored by: Klara Inc.
Differential Revision: https://reviews.freebsd.org/D33918
It does not look like there is anything in mmc_da code that actually
requires protocol 2.0 or later. dev/mmc code also does not have such a
restriction.
Tested with a very old 2GB mini-SD card. Prior to this change mmc_da
would claim the card but would not expose it to GEOM.
Without MMCCAM:
mmc0: <MMC/SD bus> on sdhci_pci0
mmc0: Probing bus
mmc0: SD probe: OK (OCR: 0x00ff8000)
mmc0: Current OCR: 0x00ff8000
mmc0: CMD8 failed, RESULT: 1
mmc0: Probing cards
mmc0: New card detected (CID 1c53565344432020100002982e007600)
mmc0: New card detected (CSD 005e00325f5a83d02db7ffbf96800000)
mmc0: Card at relative address 0xb368 added:
mmc0: card: SD SDC 1.0 SN 0002982E MFG 06/2007 by 28 SV
mmc0: quirks: 0
mmc0: bus: 4bit, 50MHz (high speed timing)
mmc0: memory: 3998720 blocks, erase sector 256 blocks
mmc0: setting transfer rate to 50.000MHz (high speed timing)
GEOM: new disk mmcsd0
mmcsd0: 2GB <SD SDC 1.0 SN 0002982E MFG 06/2007 by 28 SV> at mmc0 50.0MHz/4bit/65535-block
mmc0: setting bus width to 4 bits high speed timing
With MMCCAM and this change:
sdda0 at sdhci_slot0 bus 0 scbus2 target 0 lun 0
sdda0: Relative addr: 0000b368
Card features: <Memory>
sdda0: Serial Number 0002982E
sdda0: SD SDC 1.0 SN 0002982E MFG 06/2007 by 28 SV
GEOM: new disk sdda0
Reviewed by: manu
MFC after: 3 weeks
I was somehow convinced that insmntque calls insmntque1 with a NULL
destructor. Unfortunately this worked well enough to not immediately
blow up in simple testing.
Keep not using the destructor in previously patched filesystems though
as it avoids unnecessary casts.
Noted by: kib
Reported by: pho
tcp_chg_pacing_rate() is expected to release the hw rate limit table,
but failed to do so in several error cases, leading to ever
increasing counts of flows using the rate.
This patch was mostly done by rrs
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D34058
Reviewed by: hselasky, rrs, jhb (inital version, outside of Differential)
When ip_output_send() returns EAGAIN due to issues with send tags (route
change, lagg failover, etc), it must free the mbuf. This is because
ip_output_send() was written as a wrapper/replacement for a direct
call to if_output(), and the contract with if_output() has
historically been that it owns the mbufs once called. When
ip_output_send() failed to free mbufs, it violated this assumption
and lead to leaked mbufs.
This was noticed when using NIC TLS in combination with hardware
rate-limited connections. When seeing lots of NIC output drops
triggered ratelimit send tag changes, we noticed we were leaking
ktls_sessions, send tags and mbufs. This was due ip_output_send()
leaking mbufs which held references to ktls_sessions, which in
turn held references to send tags.
Many thanks to jbh, rrs, hselasky and markj for their help in
debugging this.
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D34054
Reviewed by: hselasky, jhb, rrs
MFC after: 2 weeks
Like BIO_FLUSH, there is no reason for consumers to pass a BIO_SPEEDUP
request with non-NULL bio_data, so assert this.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
In particular, there is no need to allocate a data block when passing
BIO_FLUSH requests to child providers, and g_io_request() asserts that
bp->bio_data == NULL for such requests.
PR: 255131
Reported and tested by: nvass@gmx.com
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
This adds the PT_GETREGSET and PT_SETREGSET ptrace types. These can be
used to access all the registers from a specified core dump note type.
The NT_PRSTATUS and NT_FPREGSET notes are initially supported. Other
machine-dependant types are expected to be added in the future.
The ptrace addr points to a struct iovec pointing at memory to hold the
registers along with its length. On success the length in the iovec is
updated to tell userspace the actual length the kernel wrote or, if the
base address is NULL, the length the kernel would have written.
Because the data field is an int the arguments are backwards when
compared to the Linux PTRACE_GETREGSET call.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D19831
I observed a situation where some read requests failed when a 2-way geom
mirror lost one disk. The problem appears to be in the logic that skips
retrying a failed request when a mirror has only one active disk.
Generally, that makes sense. But during a transition from two disks to
one it is possible that the request failed on the failing disk before it
was inactivated and, so, the remaining active disk is the disk that
should be tried.
This change adds an additional check to ensure that it was the (only)
active disk that was already tried.
Reviewed by: mav
MFC after: 3 weeks
These had been disabled due to panics with queued packets keeping
pointers (in m->m_pkthdr.rcvif) to removed interfaces.
This issue has been resolved in 165746f4e4, so the tests can be run
again.
Sponsored by: Rubicon Communications, LLC ("Netgate")
Dummynet queues packets with an associated struct ifnet pointer. Ensure
that things do not explode if that interface goes away with packets
still in the queue.
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D33065
We provide the hostid (which is the state creatorid) to the kernel as a
big endian number (see pfctl/pfctl.c pfctl_set_hostid()), so convert it
back to system endianness when we get it from the kernel.
This avoids a confusing mismatch between the value the user configures
and the value displayed in the state.
MFC after: 3 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D33989
If an invalid (i.e. overly long) interface name is specified error out
immediately, rather than in expand_rule() so we point at the incorrect
line.
PR: 260958
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D34008
This fixed panic with interface being removed while packet
was sitting on a queue. This allows to pass all dummynet
tests including forthcoming dummynet:ipfw_interface_removal
and dummynet:pf_interface_removal and demonstrates use of
m_rcvif_serialize() and m_rcvif_restore().
Reviewed by: kp
Differential revision: https://reviews.freebsd.org/D33267
Supplement ifindex table with generation count and use it to
serialize & restore an ifnet pointer.
Reviewed by: kp
Differential revision: https://reviews.freebsd.org/D33266
Fun note: git show e6abef0918