Currently, sysctls which enable KDB in some way are flagged with
CTLFLAG_SECURE, meaning that you can't modify them if securelevel > 0.
This is so that KDB cannot be used to lower a running system's
securelevel, see commit 3d7618d8bf. However, the newer mac_ddb(4)
restricts DDB operations which could be abused to lower securelevel
while retaining some ability to gather useful debugging information.
To enable the use of KDB (specifically, DDB) on systems with a raised
securelevel, change the KDB sysctl policy: rather than relying on
CTLFLAG_SECURE, add a check of the current securelevel to kdb_trap().
If the securelevel is raised, only pass control to the backend if MAC
specifically grants access; otherwise simply check to see if mac_ddb
vetoes the request, as before.
Add a new secure sysctl, debug.kdb.enter_securelevel, to override this
behaviour. That is, the sysctl lets one enter a KDB backend even with a
raised securelevel, so long as it is set before the securelevel is
raised.
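
As an illustration only, the decision described above can be modelled
as a small truth function. This is a sketch under stated assumptions,
not the kernel code: the function name and its boolean parameters are
hypothetical, and the override is modelled as a simple on/off switch.

```python
def may_enter_backend(securelevel_raised, override_set, mac_grants, mac_vetoes):
    """Hypothetical model of the policy described in the commit message.

    securelevel_raised: the securelevel is raised (> 0)
    override_set:       the override sysctl was set before raising securelevel
    mac_grants:         MAC specifically grants access to the backend
    mac_vetoes:         mac_ddb vetoes the request
    """
    if securelevel_raised and not override_set:
        # Raised securelevel: enter only if MAC explicitly grants access.
        return mac_grants
    # Otherwise, as before: enter unless mac_ddb vetoes the request.
    return not mac_vetoes
```

In this model a raised securelevel flips the default from "allowed
unless vetoed" to "denied unless granted".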
Reviewed by: mhorne, stevek
MFC after: 1 month
Sponsored by: Juniper Networks
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D37122
getopt(3) returns '?' when it encounters a flag not present in the
optstring or if a flag is missing its option argument. We can
handle this case with the "default" failure case with no loss of
legibility.
Obtained from: OpenBSD makefs.c 1.22
Those const qualifiers declare that the function doesn't change the
values internally. It makes no sense to add them in the header file.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D39318
Use full-featured ifa_ifwithroute() to guess route ifa/ifp
instead of ifa_ifwithnet(). This change makes the route addition
logic closer to the rt_getifa_fib() used by rtsock.
Reported by: glebius
Tested by: glebius
Differential Revision: https://reviews.freebsd.org/D39335
MFC after: 2 weeks
This is a total hack/bare minimum which follows inet4.
Otherwise 2 threads removing the same address can easily crash.
Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D39317
The algorithm for laying out new directories was devised in the 1980s
and markedly improved the performance of the filesystem. In those days
large disks had at most 100 cylinder groups and often as few as 10-20.
Modern multi-terabyte disks have thousands of cylinder groups. The
original algorithm does not handle these large sizes well. This change
attempts to expand the scope of the original algorithm to work well
with these much larger disks while still retaining the properties
of the original algorithm for small disks.
The filesystem implementation is divided into policy routines and
implementation routines. The policy routines can be changed in any
way desired without risk of corrupting the filesystem. The policy
requests are handled by the implementation layer. If the policy
asks for an available resource, it is granted. But if it asks for
an already in-use resource, then the implementation will provide
an available one near the request. Thus it is impossible for a
policy to double allocate. This change is limited to the policy
implementation.
This change updates the ffs_dirpref() routine which is responsible
for selecting the cylinder group into which a new directory should
be placed. If we are near the root of the filesystem we aim to
spread them out as much as possible. As we descend deeper from the
root we cluster them closer together around their parent as we
expect them to be more closely interactive. Higher-level directories
like usr/src/sys and usr/src/bin should be separated while the
directories in these areas are more likely to be accessed together
so should be closer. And directories within commands or kernel
subsystems should be closer still.
We pick a range of cylinder groups around the cylinder group of the
directory in which we are being created. The size of the range for
our search is based on our depth from the root of our filesystem.
We then probe that range based on how many directories are already
present. The first new directory is at 1/2 (middle) of the range;
the second is in the first 1/4 of the range, then at 3/4, 1/8, 3/8,
5/8, 7/8, 1/16, 3/16, 5/16, etc.
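
The probe order above can be reproduced with a short sketch. This is
only an illustration of the sequence listed in the text, not the
ffs_dirpref() code; the function name probe_fraction is hypothetical.

```python
from fractions import Fraction


def probe_fraction(n):
    """Return the position in the range for the n-th new directory (n >= 1).

    Produces 1/2, 1/4, 3/4, 1/8, 3/8, 5/8, 7/8, 1/16, ... by walking the
    odd numerators of each successive power-of-two denominator.
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    level = n.bit_length()              # 1 for n=1, 2 for n=2..3, 3 for n=4..7
    denominator = 1 << level            # 2, 4, 8, 16, ...
    first_in_level = 1 << (level - 1)   # first n that uses this denominator
    numerator = 2 * (n - first_in_level) + 1
    return Fraction(numerator, denominator)
```

Each new denominator fills in the midpoints between positions already
used, so successive directories keep landing in the largest remaining
gaps of the range.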
It is desirable to store the depth of a directory in its on-disk
inode so that it is available when we need it. We add a new field
di_dirdepth to track the depth of each directory. Because there are
few spare fields left in the inode, we choose to share an existing
field in the inode rather than having one of our own. Specifically
we create a union with the di_freelink field. The di_freelink field
is used to track inodes that have been unlinked but remain referenced.
It is not needed until a rmdir(2) operation has been done on a
directory. At that point, the directory has no contents and even
if it is kept active as a current directory is no longer able to
have any new directories or files created in it. Thus the use of
di_dirdepth and di_freelink will never coincide.
Reported by: Timo Voelker
Reviewed by: kib
Tested by: Peter Holm
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39246
We changed the CheckHostIP default to "no" years ago. Upstream has now
made the same change, so do not list it as a local change any longer.
I did not just remove the "Modified client-side defaults" section to
avoid having to renumber everything, and we may add a new local change
in the future.
Sponsored by: The FreeBSD Foundation
This does not remove LLVM_TARGET_MIPS. Note that the only
MACHINE_ARCH values ending in 'hf' were all MIPS architectures, hence
removing the pattern matches for 'hf'.
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D39331
These libraries are linked to directly by applications rather than
opened at runtime via dlopen().
Discussed with: oshogbo
Reviewed by: markj, emaste
Differential Revision: https://reviews.freebsd.org/D39245
Nfsd can now be run in an appropriately
configured vnet jail.
This man page update adds some information
for this case.
This is a content change.
Reviewed by: karels, markj
MFC after: 3 months
Differential Revision: https://reviews.freebsd.org/D39219
This reverts commit ab80f0b21f. The intent
of this change was to avoid possible compilation errors when certain
.inc files were not regenerated, but the method turns out to cause way
more rebuilds than anticipated. Another method will have to be found,
and in the meantime, WITH_CLEAN is the solution that always works.
Fixes: ab80f0b21f
This flag ensures that the tblgen tools do not actually touch the
produced .inc file, if there are no changes to the contents. In turn,
this may prevent a number of rebuilds of files that include such .inc
files, saving build time.
While here, ensure that the shell invocations to locate the used tblgen
binary do not show unnecessary error messages.
Reported by: des
MFC after: 1 week
Those input routines are identical.
Also inline two fast paths.
No functional change intended.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D39251
creds have not used the refcount API for a long time now, but this
previously failed to fail to compile because the type remained int.
Now it broke due to the conversion to long.
There are FAT12 and FAT16 file systems, but FAT13 was an
unintentional invention of mine ...
Reported by: Ravi Pokala <rpokala@freebsd.org>
MFC after: 1 month
The prior implementation of xen_intr_resume() was wiping
xen_intr_port_to_isrc[] and then rebuilding from the x86 interrupt
table. Rework to instead wipe the channel numbers (->xi_port) and then
scan the table for sources with invalid channels.
This will be slower due to scanning the whole table, but this removes
the dependency on the x86 interrupt code.
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D30599
[royger]
Split line over 80 characters.
The portions of xen_rebind_ipi() and xen_rebind_virq() were already
near-identical. While xen_rebind_ipi() should panic() on
single-processor, still having the functionality to invoke seems
harmless.
Meanwhile much of the loop from xen_intr_resume() seemed to want to be
closer to this same code. This pushes related bits closer together.
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D30598
Remove these no longer needed headers. Key for making xen_intr.c
machine-independent as they don't exist on other architectures.
Originally this was part of a much larger commit, but was broken off
for submission to the FreeBSD project.
Reviewed by: royger
Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Original implementation: Julien Grall <julien@xen.org>, 2015-10-20 09:14:56
MFC after: 1 week
Now that the atomic macros are always genuinely atomic on x86, they can
be used for synchronization with Xen. A single core VM isn't too
unusual, but actual single core hardware is uncommon.
Replace an open-coding of evtchn_clear_port() with the inline.
Substantially inspired by work done by Julien Grall <julien@xen.org>,
2014-01-13 17:40:58.
Reviewed by: royger
MFC after: 1 week
While unusual, intr_register_source() can return failure. A likely
cause might be another device grabbing from Xen's interrupt range.
This should NOT happen, but could happen due to a bug. As such check
for this and fail if it occurs.
This theoretical situation also affects xen_intr_find_unused_isrc().
There, .is_pic must be tested to ensure such an intrusion doesn't cause
misbehavior.
Reviewed by: royger
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31995
Consistently use ~0 instead of 0 when clearing xenisrc structures.
0 is a valid event channel number, even though it is reserved by Xen.
Whereas ~0 is guaranteed invalid.
Reviewed by: royger
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D30743
In xen_intr_release_isrc(), the isrc should only be removed if it is
assigned to a valid port. This had been mitigated by using 0 for not
having a port, but this is actually corrupting the table. Fix this bug
as modifying the code would cause this bug to manifest as kernel memory
corruption. Similar issue for the vCPU bitmap masks.
The KASSERT() doesn't need lock protection.
Reviewed by: royger
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D30743
The comparison was wrong. Hopefully this never occurred in the wild,
but now ensure the error message will occur before damage is caused.
This appears non-exploitable as exploitation would require a guest to
force Domain 0 to allocate all event channels, which a guest shouldn't
be able to do.
Adjust the error message to better describe what has occurred.
Reviewed by: royger
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D30743
It appears errors are uncommon, since calling xen_intr_release_isrc()
on a xenisrc with xi_close in an undefined state could be bad. Fix this
problematic lurking nasty.
Reviewed by: royger
MFC after: 1 week
This update implements tallying of free directory entries during
create, delete, or rename operations on FAT12 and FAT16 file systems.
Prior to this change, the total number of root directory entries
was reported as number of inodes, but 0 as the number of free
inodes, causing system health monitoring software to warn about
a suspected disk full issue.
The FAT12 and FAT16 file systems provide a limited number of
root directory entries, e.g. 512 on typical hard disk formats.
The valid range of values is 1 to 65535, but the msdosfs code
will effectively round up "odd" values to the next multiple of 16
(e.g. 513 would allow for 528 root directory entries).
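
A minimal sketch of that rounding, for illustration only. The constant
16 comes from the text above; attributing it to 512-byte sectors holding
32-byte directory entries is an assumption, and the function name is
hypothetical.

```python
# Assumption: 16 entries per sector (512-byte sector / 32-byte entry).
ENTRIES_PER_SECTOR = 16


def effective_root_entries(declared):
    """Round a declared root directory entry count up to a multiple of 16."""
    sectors = -(-declared // ENTRIES_PER_SECTOR)  # ceiling division
    return sectors * ENTRIES_PER_SECTOR
```

With this sketch a declared value of 513 yields 528, while 512 is
already a multiple of 16 and is left unchanged.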
This update implements tracking of directory entries during create,
delete, or rename operations, with initial values determined by
scanning the directory when the file system is mounted.
Total and free directory entries are reported in the f_files and
f_ffree elements of struct statfs, despite differences in semantics
of these values:
- There is no limit on the number of files and directories that can
be created on a FAT file system. Only the root directory of FAT12
and FAT16 file systems is limited, any number of files can still be
created in sub-directories, even when 0 free "inodes" are reported.
- A single file can require 1 to 21 directory entries, depending on
the character set, structure, and length of the name. The DOS 8.3
style file name takes up 1 entry, and if the name does not comply
with the syntax of a DOS 8.3 file name, 1 additional entry is used
for each 13 characters of the file name. Since all these entries
have to be contiguous, it is possible that a file or directory with
a long name can not be created, despite a sufficient total number of
free directory entries.
- Renaming a file can require more directory entries than currently
allocated to store its long name, which may prevent an in-place
update of the name if more entries are needed. This may cause a
rename operation to fail if no contiguous range of free entries for
the new name can be found.
- The volume label is stored in a directory entry. An empty FAT file
system with a volume label will therefore show 1 used "inode" in
df.
- The percentage of free inodes shown in df or monitoring tools
represents only the state of the root directory of a FAT12 or FAT16
file system. A reported value of 0% free inodes does not prevent
files from being created in sub-directories, nor does a value of
50% free inodes guarantee that even a single file with a "long"
name can be created in the root directory (if every other
directory entry is occupied and there are no 2 contiguous entries).
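
The entry arithmetic in the list above can be sketched as follows. This
is an illustrative model, not the msdosfs code; both function names are
hypothetical, and the directory is modelled as a plain list of booleans
(True meaning the slot is in use).

```python
import math


def entries_needed(name_len, is_dos_83):
    """Directory entries needed for one file name.

    A DOS 8.3 name takes 1 entry; any other name takes 1 entry plus
    1 additional entry per 13 characters (or part thereof).
    """
    if is_dos_83:
        return 1
    return 1 + math.ceil(name_len / 13)


def can_place(used, needed):
    """Return True if `used` contains `needed` contiguous free slots."""
    run = 0
    for slot_in_use in used:
        run = 0 if slot_in_use else run + 1
        if run >= needed:
            return True
    return False
```

For example, a root directory in which every other entry is occupied is
50% free, yet can_place() finds no room for a name that needs 2 entries.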
The statfs(2) and df(1) man pages have been updated with a notice
regarding the possibly different semantics of values reported as
total and free inodes for non-Unix file systems.
PR: 270053
Reported by: Ben Woods <woodsb02@freebsd.org>
Approved by: mckusick
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D38987
On 64-bit platforms this sorts out worries about mitigating bugs which
overflow the counter, all while not pessimizing anything -- most notably
it avoids whacking per-thread operation in favor of refcount(9) API.
The struct, 256 bytes in size, already had two instances of 4-byte
padding; cr_flags gets moved around to avoid growing it.
32-bit platforms could also get the extended counter, but I did not do
it as one day(tm) the mutex protecting centralized operation should be
replaced with atomics and 64-bit ops on 32-bit platforms remain quite
penalizing.
While worries of counter overflow are addressed, the following is not
(just like it would not be with conversion to refcount(9)):
- counter *underflows*
- buffer overruns from adjacent allocations
- UAF due to stale cred pointer
- .. and other goodies
As such, while lipstick was placed, the pig should not be participating
in any beauty pageants.
Prodded by: emaste
Differential Revision: https://reviews.freebsd.org/D39220