Extend the dnctl (dummynet config) tool to be able to read commands from
a file, just like ipfw already does.
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D33627
The implementation simply passes the text ref to the appropriate
underlying vnode. Without this, the default [un]set_text
implementation will only manage the text ref on the unionfs vnode,
causing it to be out of sync with the underlying filesystems and
potentially allowing corruption of executable file contents.
On INVARIANTS kernels, it also readily produces a panic on process
termination because the VM object representing the executable mapping
is backed by the underlying vnode, not the unionfs vnode.
PR: 251342
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D33611
Use atomics to track the writecount granted to the underlying FS,
and avoid holding the vnode interlock while calling the underlying FS'
VOP_ADD_WRITECOUNT(). This also fixes a WITNESS warning about nesting
the same lock type. Also add comments explaining why we need to track
the writecount on the unionfs vnode in the first place. Finally,
simplify writecount management to only use the upper vnode and assert
that we shouldn't have an active writecount on the lower vnode through
unionfs.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D33611
OpenSSH dropped libwrap support in OpenSSH 6.7p in 2014
(f2719b7c in github.com/openssh/openssh-portable) and we
have maintained the patch ourselves since 2016 (a0ee8cc636).
Over the years, the libwrap support has deteriorated, and that was
probably the reason for its removal upstream. The original idea of
libwrap was to drop illegitimate connections as soon as possible, but
over the years the code was pushed further and further down and ended
up in the forked client connection handler.
The negative effect of late dropping is an increased attack surface
for hosts that are to be dropped anyway. Apart from hypothetical
future vulnerabilities in connection handling, today a malicious
host listed in /etc/hosts.allow can still trigger sshd to enter
connection throttling mode, which is enabled by default (see
MaxStartups in sshd_config(5)), effectively mounting a DoS attack.
Note that on OpenBSD this attack isn't possible, since they enable
MaxStartups together with UseBlacklist.
The only negative effect of early dropping that I can imagine is that
the main listener now parses a file in /etc, and if our root filesystem
goes bad, it would get stuck. But it is unlikely you'd be able to log in
in that case anyway.
Implementation details:
- For brevity we reuse the same struct request_info. This isn't
a documented feature of libwrap, but code review, viewing data
in a debugger and real life testing show that if we clear
RQ_CLIENT_NAME and RQ_CLIENT_ADDR every time, it works as intended.
- We set SO_LINGER on the socket to force immediate connection reset.
- We log the message exactly as libwrap's refuse() would.
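The SO_LINGER item above can be illustrated with a minimal sketch. It
assumes a connected TCP socket; force_reset_on_close() is a hypothetical
helper name and is not taken from the sshd sources. Enabling SO_LINGER
with a zero timeout makes close(2) abort the connection rather than
perform the normal orderly shutdown.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Arrange for close(2) on fd to discard unsent data and abort the
 * connection instead of shutting it down gracefully.
 * Returns 0 on success, -1 with errno set on failure.
 * Hypothetical helper for illustration only.
 */
int
force_reset_on_close(int fd)
{
	struct linger l;

	assert(fd >= 0);
	l.l_onoff = 1;	/* enable lingering ... */
	l.l_linger = 0;	/* ... with a zero timeout: abort on close */
	return (setsockopt(fd, SOL_SOCKET, SO_LINGER, &l, sizeof(l)));
}
```

A caller would invoke this on the accepted socket just before close(2)
when the peer is to be refused.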
Differential revision: https://reviews.freebsd.org/D33044
in handling the cpuset sizes different from sizeof(cpuset_t).
For both cases, cpuset size shorter than sizeof(cpuset_t) results
in EINVAL on Linux.
For sched_getaffinity(), be more permissive and accept cpuset size
larger than our cpuset_t, by clipping the syscall argument and zeroing
the rest of the output buffer. For sched_setaffinity(), we should allow
shorter cpusets than current ABI size, again zeroing the rest of the bits.
With this change, python os.sched_get/setaffinity functions work.
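The size handling described above might look roughly like the following
sketch. It works on plain byte buffers, not on the real cpuset_t;
NATIVE_SET_SIZE, mask_copyout() and mask_copyin() are hypothetical names,
and rejecting an oversized mask in the "set" direction is a simplifying
assumption that the text above does not specify.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define NATIVE_SET_SIZE 32	/* stand-in for sizeof(cpuset_t); assumed */

/*
 * "get" direction: a caller buffer larger than the native set is
 * accepted; the native mask is copied and the tail is zeroed.
 * A shorter buffer is rejected with EINVAL.
 */
int
mask_copyout(const unsigned char *native, unsigned char *user,
    size_t user_size)
{
	assert(native != NULL && user != NULL);
	if (user_size < NATIVE_SET_SIZE)
		return (EINVAL);
	memcpy(user, native, NATIVE_SET_SIZE);
	memset(user + NATIVE_SET_SIZE, 0, user_size - NATIVE_SET_SIZE);
	return (0);
}

/*
 * "set" direction: a caller mask shorter than the native set is
 * accepted and zero-extended. Rejecting a longer mask is an assumption
 * made only to keep the sketch short.
 */
int
mask_copyin(const unsigned char *user, size_t user_size,
    unsigned char *native)
{
	assert(native != NULL && user != NULL);
	if (user_size > NATIVE_SET_SIZE)
		return (EINVAL);
	memset(native, 0, NATIVE_SET_SIZE);
	memcpy(native, user, user_size);
	return (0);
}
```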
Reported by: se
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
The MINIMUM_SUPPORTED_OSREL is 1002501 (FreeBSD 10.3), and xlocale is
supported there.
While I'm here, explicitly use config.h generated with --disable-bzlib
--disable-xzlib instead of deleting them manually.
MFC after: 2 weeks
With crafted input to the G_GATE_CMD_CREATE ioctl, geom_gate can be made
to print kernel memory to the system console, potentially revealing
sensitive data from whatever was previously in that memory page.
But but but: this is a case of the sys admin misconfiguring, and you'd
need root privileges to do this.
Submitted By: Johannes Totz <jo@bruelltuete.com>
MFC after: 2 weeks
Reviewed By: asomers
Differential Revision: https://reviews.freebsd.org/D31727
If I'm not mistaken, the underlying sendmsg() for nvlist_send() is
failing with ENOBUFS. In turn, nvlist_recv() returns NULL because it
didn't receive the expected number of file descriptors.
Adjusting net.local.dgram.recvspace worked on my local machine, but on
CI the test still fails consistently.
PR: 260891
An earlier version of this code computed the TSC frequency in kHz.
When the code was changed to compute the frequency more accurately,
the variable name was not updated.
Reviewed by: markj
Fixes: 22875f8879 x86: Implement deferred TSC calibration
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D33696
It's possible that the "early" TSC calibration gave us a value which
is known to be exact; in that case, skip the later re-calibration.
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D33695
The code here tries to be smart and zeroes out both di_db and di_ib with
a single bzero call, thereby overrunning the di_db subobject. This is
fine on most architectures, if a little dodgy. However, on CHERI, the
compiler can optionally restrict the bounds on pointers to subobjects to
just that subobject, in order to mitigate intra-object buffer overflows,
and this is enabled in CheriBSD's pure-capability kernels.
Instead, use separate bzero calls for each array, and let the compiler
optimise it as it sees fit; even if it's not generating inline zeroing
code, Clang will happily optimise two consecutive bzero's to a single
larger call.
Reviewed by: mckusick
Differential Revision: https://reviews.freebsd.org/D33651
Shortlinks occupy the space of both di_db and di_ib when used. However,
everywhere that wants to read or write a shortlink takes a pointer to
di_db and promptly runs off the end of it into di_ib. This is fine on
most architectures, if a little dodgy. However, on CHERI, the compiler
can optionally restrict the bounds on pointers to subobjects to just
that subobject, in order to mitigate intra-object buffer overflows, and
this is enabled in CheriBSD's pure-capability kernels.
Instead, clean this up by inserting a union such that a new di_shortlink
can be added with the right size and element type, avoiding the need to
cast and allowing the use of the DIP macro to access the field. This
also mirrors how the ext2fs code implements extents support, with the
exact same structure other than having a uint32_t i_data[] instead of a
char di_shortlink[].
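The shape of such a union can be sketched as follows. struct demo_inode,
the DEMO_* constants and demo_set_shortlink() are hypothetical and do not
reproduce the real structure definitions; the sketch relies on C11
anonymous structs and unions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NDADDR	12	/* illustrative array sizes */
#define DEMO_NIADDR	3

struct demo_inode {
	union {
		struct {
			int64_t	di_db[DEMO_NDADDR];
			int64_t	di_ib[DEMO_NIADDR];
		};
		/* The same storage viewed as an inline symlink target. */
		char	di_shortlink[(DEMO_NDADDR + DEMO_NIADDR) *
			    sizeof(int64_t)];
	};
};

/* Store a short target inline; returns 0, or -1 if it does not fit. */
int
demo_set_shortlink(struct demo_inode *ip, const char *target)
{
	size_t len;

	assert(ip != NULL && target != NULL);
	len = strlen(target);
	if (len >= sizeof(ip->di_shortlink))
		return (-1);
	memset(ip->di_shortlink, 0, sizeof(ip->di_shortlink));
	memcpy(ip->di_shortlink, target, len);
	return (0);
}
```

Because di_shortlink has its own type and size, accesses to it need no
cast and never leave the bounds of the member being used.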
Reviewed by: mckusick, jhb
Differential Revision: https://reviews.freebsd.org/D33650
A recent change introduced an off-by-one error into a test allowing
coalescing chunks into segments. This fixes that error.
A recent change (c606ab59e7) broke a check in _bus_dmamap_addseg on
many architectures. This change makes it clear that it is not a
particular range that is being boundary-checked, but the proposed union
of the two adjacent ranges.
Reported by: se
Reviewed by: se
Fixes: c606ab59e7 vm_extern: use standard address checkers everywhere
Differential Revision: https://reviews.freebsd.org/D33715
When comparing signed with unsigned, the signed value is cast
to unsigned. Make this explicit, as it might lead to compilation
warnings otherwise.
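A small sketch of the conversion in question. Both function names are
hypothetical. The first keeps the behaviour of the implicit conversion
and only makes the cast visible; the second is an alternative for code
where negative values are actually possible and is not something this
change claims to do.

```c
#include <assert.h>	/* for assert()-based checks by a caller */

/* Same result as the implicit conversion, but the cast is explicit. */
int
less_as_unsigned(int s, unsigned int u)
{
	return ((unsigned int)s < u);
}

/* Value-correct comparison: a negative s is less than any unsigned u. */
int
less_by_value(int s, unsigned int u)
{
	return (s < 0 || (unsigned int)s < u);
}
```

With these definitions, less_as_unsigned(-1, 1u) is false because -1
converts to a very large unsigned value, while less_by_value(-1, 1u) is
true.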
Obtained from: Stormshield
Always make ofw_bus_if.h. While it's only used when option FDT is in the
kernel, it can always be generated. In theory we could omit it if option
FDT isn't present, but none of the rest of sys/modules does that. That
fine-grained control likely won't be reliable w/o a redesign of the
kernel/module config system.
Sponsored by: Netflix
Expand on the terse comments for where each of these files is used.
Reviewed by: emaste
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D33716
Using 8 width is too wide for large numbers like 1379991K;
1330M is easier to read.
Submitted by: ota_j.email.ne.jp
Reviewed by: mckusick
MFC after: 2 weeks
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D33495
Based on some feedback, clarify the man page for
- how to load the driver currently
- status of the driver with respect to iwm(4)
and leave a comment to (automatically) add a full list of chipsets
to the man page.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Reviewed by: debdrup
Differential Revision: https://reviews.freebsd.org/D33713
Now posix_fallocate will be correctly forwarded to fuse file system
servers, for those that support it.
MFC after: 2 weeks
Reviewed by: pfg
Differential Revision: https://reviews.freebsd.org/D33389
By default, FUSE file systems are assumed not to support lookups for "."
and "..". They must opt-in to that. To cope with this limitation, the
fusefs kernel module caches every fuse vnode's parent's inode number,
and uses that during VOP_LOOKUP for "..". But if the parent's vnode has
been reclaimed, that won't be possible. Previously we panicked in this
situation. Now, we'll return ESTALE instead. Or, if the file system
has opted into ".." lookups, we'll just do that instead.
This commit also fixes VOP_LOOKUP to respect the cache timeout for ".."
lookups, if the FUSE file system specified a finite timeout.
PR: 259974
MFC after: 2 weeks
Reviewed by: pfg
Differential Revision: https://reviews.freebsd.org/D33239
In an earlier version of the revision that created that sysctl (D20519)
the sysctl was gated by INVARIANTS, so the test had to check for it.
But in the committed version it is always available.
MFC after: 2 weeks
If FUSE_COPY_FILE_RANGE returns successfully, update the atime of the
source and the mtime and ctime of the destination.
MFC after: 2 weeks
Reviewers: pfg
Differential Revision: https://reviews.freebsd.org/D33159
VOPs like VOP_SETATTR can change a file's size, with the vnode
exclusively locked. But VOPs like VOP_LOOKUP look up the file size from
the server without the vnode locked. So a race is possible. For
example:
1) One thread calls VOP_SETATTR to truncate a file. It locks the vnode
and sends FUSE_SETATTR to the server.
2) A second thread calls VOP_LOOKUP and fetches the file's attributes from
the server. Then it blocks trying to acquire the vnode lock.
3) FUSE_SETATTR returns and the first thread releases the vnode lock.
4) The second thread acquires the vnode lock and caches the file's
attributes, which are now out-of-date.
Fix this race by recording a timestamp in the vnode of the last time
that its filesize was modified. Check that timestamp during VOP_LOOKUP
and VFS_VGET. If it's newer than the time at which FUSE_LOOKUP was
issued to the server, ignore the attributes returned by FUSE_LOOKUP.
PR: 259071
Reported by: Agata <chogata@moosefs.pro>
Reviewed by: pfg
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D33158
When a softclock thread prepares to go off-CPU, the following happens in
the context of the thread:
1. callout state is locked
2. thread state is set to IWAIT
3. thread lock is switched from the tdq lock to the callout lock
4. tdq lock is released
5. sched_switch() sets td_lock to &blocked_lock
6. sched_switch() releases old td_lock (callout lock)
7. sched_switch() removes td from its runqueue
8. cpu_switch() sets td_lock back to the callout lock
Suppose a timer interrupt fires while the softclock thread is switching
off, and callout_process() schedules the softclock thread. Then there
is a window between steps 5 and 8 where callout_process() can call
sched_add() while td_lock is &blocked_lock, but this is not correct
since the thread is not logically locked.
callout_process() thus needs to spin waiting for the softclock thread to
finish switching off (i.e., after step 8 completes) before rescheduling
it, since callout_process() does not acquire the thread lock directly.
Reported by: syzbot+fb44dbf6734ff492c337@syzkaller.appspotmail.com
Fixes: 74cf7cae4d ("softclock: Use dedicated ithreads for running callouts.")
Reviewed by: mav, kib, jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33709