This change adds a counter (kqueue_users) to keep track of how many
kqueue users are referencing a given struct nm_selinfo.
In this way, nm_os_selwakeup() can schedule the kevent notification
task only when kqueue is actually being used.
This is important to avoid wasting CPU in the common case where
kqueue is not used.
Reviewed by: Aleksandr Fedorov <aleksandr.fedorov@itglobal.com>
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D19177
Ensure __bss_start is associated with the next section
in case orphan sections are placed directly after .sdata,
as has been seen to happen with LLD.
Submitted by: "J.R.T. Clarke" <jrtc4@cam.ac.uk>
Differential Revision: https://reviews.freebsd.org/D18429
Loader does fail to properly match the file name in directory record and
does open file based on prefix match.
For fix, we check the name lengths first.
Reviewed by: allanjude
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D19213
loaderdev variable works correctly.
The uboot_devdesc struct is variously cast back and forth between
uboot_devdesc and disk_devdesc as pointers are handed off through various
opaque interfaces. uboot_devdesc attempted to mimic the layout of
disk_devdesc by having a devdesc struct, followed by a union of some
device-specific stuff that included a struct that contains the same fields
as a disk_devdesc. However, one of those fields inside the struct is 64-bit
which causes the entire union to be 64-bit aligned -- 32 bits of padding
is added between the struct devdesc and the union, so the whole mess ends
up NOT properly mimicking a disk_devdesc after all. (In disk_devdesc there
is also 32 bits of padding, but it shows up immediately before the d_offset
field, rather than before the whole collection of d_* fields.)
This fixes the problem by using an anonymous union to overlay the devdesc
field uboot network devices need with the disk_devdesc that uboot storage
devices need. This is a different solution than the one contributed with
the PR (so if anything goes wrong, the blame goes to me), but 95% of the
credit for this fix goes to Pawel Worach and Manuel Stuhn who analyzed the
problem and proposed a fix.
PR: 233097
Comment for CAPFAIL_LOOKUP refered only to paths containing ".." but
it is returned for other restricted VFS lookup cases, such as absolute
paths or openat(AT_FDCWD, ...).
This was previously an unconditional screen clear, regardless of whether or
not we would be prompting for any passwords. This is pointless, given that
the screen clear is only there to put our screen into a consistent state
before we draw the prompts and do cursor manipulation.
This is also the only screen clear besides that to draw the menu. One can
now see early pre-loader and loader output with the menu disabled, which may
be useful for diagnostics.
Reported by: ian
MFC after: 3 days
Summary:
Now that mpc85xx can boot via ubldr, move ubldr to a separate
filesystem, mounted on /boot/uboot, so that a fresh install can boot correctly.
Reviewed By: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D18709
the size field and a tab between the partition type and the size.
Changes this
disk devices:
disk0 (MMC)
disk0s1: DOS/Windows 49MB
disk0s2: FreeBSD 14GB
disk0s2a: FreeBSD UFS 14GB
disk0s2b: Unknown 2048KB
disk0s2d: FreeBSD UFS 2040KB
to this
disk devices:
disk0 (MMC)
disk0s1: DOS/Windows 49MB
disk0s2: FreeBSD 14GB
disk0s2a: FreeBSD UFS 14GB
disk0s2b: Unknown 2048KB
disk0s2d: FreeBSD UFS 2040KB
I'm pretty sure this used to work at one time, perhaps long ago. It has
been failing recently because if you call disk_open() with dev->d_partition
set to -1 when d_slice refers to a bsd slice, it assumes you want it to
open the first partition within that slice. When you then pass that open
dev instance to ptable_open(), it tries to read the start of the 'a'
partition and decides there is no recognizable partition type there.
This restores the old functionality by resetting d_offset to the start
of the raw slice after disk_open() returns. For good measure, d_partition
is also set back to -1, although that doesn't currently affect anything.
I would have preferred to make disk_open() avoid such rude assumptions and
if you ask for partition -1 you get the raw slice. But the commit history
shows that someone already did that once (r239058), and had to revert it
(r239232), so I didn't even try to go down that road.
In r343986 we introduced a double free. The structure was already
freed fixed in the r302966. This problem was introduced
because the GitHub version was out of sync with the FreeBSD one.
Submitted by: Mindaugas Rasiukevicius <rmind@netbsd.org>
MFC with: r343986
This commit fixes a remaining output buffer overrun in the
single-sector case when there is a non-zero tail.
Reviewed by: allanjude, tsoome
MFC after: 3 months
MFC with: r344226
Differential Revision: https://reviews.freebsd.org/D19220
Some argument validation error paths would return without releasing the
file reference obtained at the beginning of the function.
While here, fix some style bugs and remove trivial debug prints.
Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D19214
Use the object pointer itself to determine whether the object is locked.
No functional change intended.
Reviewed by: kib
MFC after: 1 week
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D19215
This is consistent with the removal of whole-disk vdev support from
libsa/zfs/zfs.c in r342151, and is part way to having the LBAs read
during probe be fully constrained by partition tables when present.
Reviewed by: tsoome
MFC after: 3 months
Differential Revision: https://reviews.freebsd.org/D19142
The bug occurred when a bounce buffer was used and the requested read
size was greater than the size of the bounce buffer. This commit also
rewrites the read logic so that it is easier to systematically verify
all alignment and size cases.
Reviewed by: allanjude, tsoome
MFC after: 3 months
Differential Revision: https://reviews.freebsd.org/D19140
The charging current can be set using steps
from 0: 200mA to 13: 2800mA (200mA/step).
While there, fix battery charging current related
sensor descriptions.
Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D19212
Mentioned in mdconfig(8), malloc-backed md(4) can be unstable unless
required memory is allocated up front with -o reserve. Furthermore, panics
have been observed with md used in fstab on 12.0-RELEASE. Choose the stable
route and pass -o reserve.
Submitted by: Paul Vixie
MFC after: 1 week
- Drop profile libraries; MK_PROFILE=no is set in all Makefile's.
- Correct library path to libmlx5.so.1 and libibverbs.so.1
MFC after: 5 days
MFC with: 344207
- Add missing /usr/sbin/pmc, pmcformat.h, libpmcstat.h and pmc.haswellxeon.3
to the list.
- Correct man page section for pmcstudy.8.
- Include recently added libipt and libopencsd for corresponding TARGET_ARCH
MFC after: 5 days
libifconfig is built as a static-only PRIVATELIB (and there is no _pie.a
version) so disable PIE in libifconfig's consumer.
Sponsored by: The FreeBSD Foundation
OptionalObsoleteFiles.inc
Note: only files with conditional installation logic were
included from the PR.
PR: 233046
Submitted by: <rozhuk.im@gmail.com>
MFC after: 5 days
Specifically, this allows (via "-V vhostname") telling nfsd what principal
to use, instead of the hostname. This is used at iXsystems for fail-over in
HA systems.
Reviewed by: macklem
Sponsored by: iXsystems Inc.
Differential Revision: https://reviews.freebsd.org/D19191
back to the lever before r343030. For 64-bit machines reduce it slightly,
too. Together with r343030 I bumped the limit up to the value we use at
Netflix to serve 100 Gbit/s of sendfile traffic, and it probably isn't a
good default.
Provide a loader tunable to change vnode pager pbufs count. Document it.
The cached fvdat->filesize is indepedent of the (mostly unused)
cached_attrs, and we failed to update it when a cached (but perhaps
inactive) vnode was found during VOP_LOOKUP to have a different size than
cached.
As noted in the code comment, this can occur in distributed filesystems or
with other kinds of irregular file behavior (anything is possible in FUSE).
We do something similar in fuse_vnop_getattr already.
PR: 230258 (as reported in description; other issues explored in
comments are not all resolved)
Reported by: MooseFS FreeBSD Team <freebsd AT moosefs.com>
Submitted by: Jakub Kruszona-Zawadzki <acid AT moosefs.com> (earlier version)
At least prior to 7.23 (which adds FUSE_WRITEBACK_CACHE), the FUSE protocol
specifies only clean data to be cached.
Prior to this change, we implement and default to writeback caching. This
is ok enough for local only filesystems without hardlinks, but violates the
general design contract with FUSE and breaks distributed filesystems or
concurrent access to hardlinks of the same inode.
In this change, add cache mode as an extension of cache enable/disable. The
new modes are UC (was: cache disabled), WT (default), and WB (was: cache
enabled).
For now, WT caching is implemented as write-around, which meets the goal of
only caching clean data. WT can be better than WA for workloads that
frequently read data that was recently written, but WA is trivial to
implement. Note that this has no effect on O_WRONLY-opened files, which
were already coerced to write-around.
Refs:
* https://sourceforge.net/p/fuse/mailman/message/8902254/
* https://github.com/vgough/encfs/issues/315
PR: 230258 (inspired by)
Most users of fuse_vnode_setsize() set the cached fvdat->filesize and update
the buf cache bounds as a result of either a read from the underlying FUSE
filesystem, or as part of a write-through type operation (like truncate =>
VOP_SETATTR). In these cases, do not set the FN_SIZECHANGE flag, which
indicates that an inode's data is dirty (in particular, that the local buf
cache and fvdat->filesize have dirty extended data).
PR: 230258 (related)
The FUSE protocol demands that kernel implementations cache user filesystem
path components (lookup/cnp data) for a maximum period of time in the range
of [0, ULONG_MAX] seconds. In practice, typical requests are for 0, 1, or
10 seconds; or "a long time" to represent indefinite caching.
Historically, FreeBSD FUSE has ignored this client directive entirely. This
works fine for local-only filesystems, but causes consistency issues with
multi-writer network filesystems.
For now, respect 0 second cache TTLs and do not cache such metadata.
Non-zero metadata caching TTLs in the range [0.000000001, ULONG_MAX] seconds
are still cached indefinitely, because it is unclear how a userspace
filesystem could do anything sensible with those semantics even if
implemented.
Pass fuse_entry_out to fuse_vnode_get when available and only cache lookup
if the user filesystem did not set a zero second TTL.
PR: 230258 (inspired by; does not fix)
The FUSE protocol demands that kernel implementations cache user filesystem
file attributes (vattr data) for a maximum period of time in the range of
[0, ULONG_MAX] seconds. In practice, typical requests are for 0, 1, or 10
seconds; or "a long time" to represent indefinite caching.
Historically, FreeBSD FUSE has ignored this client directive entirely. This
works fine for local-only filesystems, but causes consistency issues with
multi-writer network filesystems.
For now, respect 0 second cache TTLs and do not cache such metadata.
Non-zero metadata caching TTLs in the range [0.000000001, ULONG_MAX] seconds
are still cached indefinitely, because it is unclear how a userspace
filesystem could do anything sensible with those semantics even if
implemented.
In the future, as an optimization, we should implement notify_inval_entry,
etc, which provide userspace filesystems a way of evicting the kernel cache.
One potentially bogus access to invalid cached attribute data was left in
fuse_io_strategy. It is restricted behind the undocumented and non-default
"vfs.fuse.fix_broken_io" sysctl or "brokenio" mount option; maybe these are
deadcode and can be eliminated?
Some minor APIs changed to facilitate this:
1. Attribute cache validity is tracked in FUSE inodes ("fuse_vnode_data").
2. cache_attrs() respects the provided TTL and only caches in the FUSE
inode if TTL > 0. It also grows an "out" argument, which, if non-NULL,
stores the translated fuse_attr (even if not suitable for caching).
3. FUSE VTOVA(vp) returns NULL if the vnode's cache is invalid, to help
avoid programming mistakes.
4. A VOP_LINK check for potential nlink overflow prior to invoking the FUSE
link op was weakened (only performed when we have a valid attr cache). The
check is racy in a multi-writer network filesystem anyway -- classic TOCTOU.
We have to trust any userspace filesystem that rejects local caching to
account for it correctly.
PR: 230258 (inspired by; does not fix)
Building binaries as PIE allows the executable itself to be loaded at a
random address when ASLR is enabled (not just its shared libraries).
With this change PIE objects have a .pieo extension and INTERNALLIB
libraries libXXX_pie.a.
MK_PIE is disabled for some kerberos5 tools, Clang, and Subversion, as
they explicitly reference .a libraries in their Makefiles. These can
be addressed on an individual basis later. MK_PIE is also disabled for
rtld-elf because it is already position-independent using bespoke
Makefile rules.
Currently only dynamically linked binaries will be built as PIE.
Discussed with: dim
Reviewed by: kib
MFC after: 1 month
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18423