The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.
Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix
As documented in listen.2 manual page, the kernel emits a LOG_DEBUG
syslog message if a socket listen queue overflows. For some appliances,
it may be desirable to change the priority to some higher value
like LOG_INFO while keeping other debugging suppressed.
OTOH there are cases when such overflows are normal and expected.
Then it may be desirable to suppress overflow logging altogether,
so that dmesg buffer is not flooded over long run.
In addition to existing sysctl kern.ipc.sooverinterval,
introduce new sysctl kern.ipc.sooverprio that defaults to 7 (LOG_DEBUG)
to preserve current behavior. It may be changed to any value
in a range of 0..7 for corresponding priority or to -1 to suppress logging.
Document it in the listen.2 manual page.
MFC after: 1 month
For a process supervisor using the reaper API to track process subtrees,
it is very useful to know the state of the processes on the list.
Sponsored by: https://www.patreon.com/valpackett
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D39585
by making it accept some open(2) flags. More precisely, only
O_CLOEXEC is supported, the flag is translated into the KQUEUE_CLOEXEC flag
for kqueuex(2), and O_NONBLOCK is silently ignored.
Reported and tested by: vishwin
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D39377
There is no real need to close descriptors before a process exits, but
these close calls demonstrate by example that kqueue descriptors occupy
the same namespace as other file descriptors.
Reviewed by: fernape, markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39376
There are FAT12 and FAT16 file systems, but FAT13 of was an
unintentional invention of mine ...
Reported by: Ravi Pokala <rpokala@freebsd.org>
MFC after: 1 month
This update implements tallying of free directory entries during
create, delete, or rename operations on FAT12 and FAT16 file systems.
Prior to this change, the total number of root directory entries
was reported as number of inodes, but 0 as the number of free
inodes, causing system health monitoring software to warn about
a suspected disk full issue.
The FAT12 and FAT16 file systems provide a limited number of
root directory entries, e.g. 512 on typical hard disk formats.
The valid range of values is 1 to 65535, but the msdosfs code
will effectively round up "odd" values to the next multiple of 16
(e.g. 513 would allow for 528 root directory entries).
This update implements tracking of directory entries during create,
delete, or rename operations, with initial values determined by
scanning the directory when the file system is mounted.
Total and free directory entries are reported in the f_files and
f_ffree elements of struct statfs, despite differences in semantics
of these values:
- There is no limit on the number of files and directories that can
be created on a FAT file system. Only the root directory of FAT12
and FAT16 file systems is limited, any number of files can still be
created in sub-directories, even when 0 free "inodes" are reported.
- A single file can require 1 to 21 directory entries, depending on
the character set, structure, and length of the name. The DOS 8.3
style file name takes up 1 entry, and if the name does not comply
with the syntax of a DOS 8.3 file name, 1 additional entry is used
for each 13 characters of the file name. Since all these entries
have to be contiguous, it is possible that a file or directory with
a long name can not be created, despite a sufficient total number of
free directory entries.
- Renaming a file can require more directory entries than currently
allocated to store its long name, which may prevent an in-place
update of the name if more entries are needed. This may cause a
rename operation to fail if no contiguous range of free entries for
the new name can be found.
- The volume label is stored in a directory entry. An empty FAT file
system with a volume label will therefore show 1 used "inode" in
df.
- The perceentage of free inodes shown in df or monitoring tools does
only represent the state of the root directory of a FAT12 or FAT16
file system. Neither does a reported value of 0% free inodes does
prevent files from being created in sub-directories, nor does a
value of 50% free inodes guarantee that even a single file with
a "long" name can be created in the root directory (if every other
directory entry is occupied and there are no 2 contiguous entries).
The statfs(2) and df(1) man pages have been updated with a notice
regarding the possibly different semantics of values reported as
total and free inodes for non-Unix file systems.
PR: 270053
Reported by: Ben Woods <woodsb02@freebsd.org>
Approved by: mckusick
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D38987
- This function no longer disables interrupts
- MLINK to reboot.9
- The mentions of autoconfiguration is more about shutdown_nice(),
coming in the next commit.
- Describe the RB_* flags relevant to this function
- Describe behaviour when shutdown hooks fail the reset
- Describe expected execution contexts
- Add FF copyright
- xref panic(9)
- xref this page in reboot(2)
Reviewed by: markj
Discussed with: rpokala, Pau Amma <pauamma@gundo.com>
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39133
Summary:
All cap_* system calls would fail when capability mode support is
not present.
MFC after: 2 weeks
Reviewed by: emaste, pauamma
Differential Revision: https://reviews.freebsd.org/D38976
The behavior is the same as in capability mode, it does not actually
return EINVAL for absolute lookups:
openat(AT_FDCWD,"/tmp/test",O_RDONLY|O_DIRECTORY,00) = 3 (0x3)
openat(3,"../../",O_RDONLY|0x800000,00) ERR#93 'Capabilities insufficient'
openat(3,"/etc/passwd",O_RDONLY|0x800000,00) ERR#93 'Capabilities insufficient'
Fixes: 1f305be43 ("Document {O,AT}_RESOLVE_BENEATH...")
Reviewed by: kib, pauamma (manpages), emaste
Sponsored by: https://www.patreon.com/valpackett
Pull Request: https://github.com/freebsd/freebsd-src/pull/680
Differential Revision: https://reviews.freebsd.org/D38675
profil(II) is in the scanned 3rd edition manual that we have. We don't
have the 3rd edition sources, nor do we have the 4th edition souces. We
have a mostly complete (missing pipes) 4th edition C rewrite where
profil system call number is reserved, but it's not implemented (it's in
the manx section for things that apeared to have been in 3rd edition but
weren't yet part of the reimplemented 4th edition). The 5th edition
sources we have do have it, however. For other items that have appeared
in earlier manuals, we've added the simple verbage to the manual and
relegated the rest of the data for that file to the commit message.
Simplify the tests for 32-bit arm soft float support. For the files
included only on arm, drop the test entirely. For others, test
MACHINE_CPUARCH against arm.
No functional change intended. File lists appear the same before / after
the change.
Sponsored by: Netflix
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D38582
While here, move notes about FreeBSD-specific functionality to the
COMPATIBILITY section, and document the ECAPMODE error for shm_open().
Reviewed by: pauamma, kib
MFC after: 2 weeks
Sponsored by: Klara, Inc.
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38282
Under Linux to sched_[g|s]etaffinity() functions the value returned from a call
to gettid(2) (thread id) can be passed in the argument pid. Specifying pid as 0
will set the attribute for the calling thread, and passing the value returned
from a call to getpid(2) (process id) will set the attribute for the main thread
of the thread group.
Native cpuset(2) family of system calls has "which" argument to determine how
the value of id argument is interpreted, i.e., CPU_WHICH_TID is used to pass
a thread id and CPU_WHICH_PID - to pass a process id.
For now native sched_[g|s]etaffinity() implementation is wrong as uses "which"
CPU_WHICH_PID to pass both (process and thread id) to the kernel. To fix this
adding a new "which" CPU_WHICH_TIDPID intended to handle both id's.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D38209
MFC after: 1 week
Refer to sockets rather than processes, since one can have multiple
sockets in a load-balancing group within the same process.
MFC after: 1 week
Sponsored by: Modirum MDPay
Sponsored by: Klara, Inc.
This reverts commit 76e6e4d72f.
Several programs in the tree use -1 instead of INT_MAX to use
the maximum value. Thanks to Eugene Grosbein for pointing this
out.
Ensure that a negative backlog argument is handled as it if was 0.
Reviewed by: markj@, glebius@
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D31821
This reverts commit 1c2be25f60.
kib@ pointed out that it is perfectly fine to write at arbitrary regular
file offsets. For example, in a 4K block size character device, geom
doesn't support writing / reading 515 byte blocks. The description is
perhaps not applicable to all EINVALs returned.
The read system call will return EINVAL if the current file offset is
not a multiple of the block size. This also applies to write(2). Add an
entry for EINVAL about this error to both man pages.
PR: 91149
Event: Aberdeen Hackathon 2022
Differential Revision: https://reviews.freebsd.org/D24617
Long ago, ktr_tid was ktr_buffer which pointed to the buffer following
the header and was used internally in the kernel. Use was removed in
efbbbf570d and it was repurposed as ktr_kid in c6854c347f. For
ABI reasons, it stayed an intptr_t rather than becoming an lwpid_t at
the time. Since it doesn't hold a pointer any more (unless you have
a ktrace.out from 2005), change the type to long which is alwasy the
same size on all supported architectures. Add a suggestion to change
the type to lwpid_t (__int32_t) on a future ABI break.
Remove most remaining references to ktr_buffer, retaing a comment in
kdump.c explaining why negative values are treated as 0. While here,
accept that pid_t and lwpid_t are of type int and simplify casts in
printf.
This changed was motivated by CheriBSD where intptr_t is 16-bytes
in the pure-capability ABI.
Reviewed by: kib, markj
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D36599
The divert(4) is not a protocol of IPv4. It is a socket to
intercept packets from ipfw(4) to userland and re-inject them
back. It can divert and re-inject IPv4 and IPv6 packets today,
but potentially it is not limited to these two protocols. The
IPPROTO_DIVERT does not belong to known IP protocols, it
doesn't even fit into u_char. I guess, the implementation of
divert(4) was done the way it is done basically because it was
easier to do it this way, back when protocols for sockets were
intertwined with IP protocols and domains were statically
compiled in.
Moving divert(4) out of inetsw accomplished two important things:
1) IPDIVERT is getting much closer to be not dependent on INET.
This will be finalized in following changes.
2) Now divert socket no longer aliases with raw IPv4 socket.
Domain/proto selection code won't need a hack for SOCK_RAW and
multiple entries in inetsw implementing different flavors of
raw socket can merge into one without requirement of raw IPv4
being the last member of dom_protosw.
Differential revision: https://reviews.freebsd.org/D36379
o Undocument sockets that are no longer supported, or never were.
o Add AF_HYPERV. Note: PF_HYPERV isn't defined, no typo here.
o Point at ip(4) and ip6(4) instead of unwelcoming "not described here".
Reviewed by: gbe, markj
Differential revision: https://reviews.freebsd.org/D36284
This has already been done for most files that have the Foundation as
the only listed copyright holder. Do it now for files that list
multiple copyright holders, but have the Foundation copyright in its own
section.
Sponsored by: The FreeBSD Foundation
Implement Linux-variant of MSG_TRUNC input flag used in recv(), recvfrom() and recvmsg().
Posix defines MSG_TRUNC as an output flag, indicating packet/datagram truncation.
Linux extended it a while (~15+ years) ago to act as input flag,
resulting in returning the full packet size regarless of the input
buffer size.
It's a (relatively) popular pattern to do recvmsg( MSG_PEEK | MSG_TRUNC) to get the
packet size, allocate the buffer and issue another call to fetch the packet.
In particular, it's popular in userland netlink code, which is the primary driving factor of this change.
This commit implements the MSG_TRUNC support for SOCK_DGRAM sockets (udp, unix and all soreceive_generic() users).
PR: kern/176322
Reviewed by: pauamma(doc)
Differential Revision: https://reviews.freebsd.org/D35909
MFC after: 1 month
As per the updated FreeBSD copyright template. These were unambiguous
cases where the Foundation was the only listed copyright holder.
Sponsored by: The FreeBSD Foundation