Commit Graph

2112 Commits

Author SHA1 Message Date
Eugene Grosbein
4824d78872 listen(2): improve administrator control over logging
As documented in listen.2 manual page, the kernel emits a LOG_DEBUG
syslog message if a socket listen queue overflows. For some appliances,
it may be desirable to change the priority to some higher value
like LOG_INFO while keeping other debugging suppressed.

OTOH there are cases when such overflows are normal and expected.
Then it may be desirable to suppress overflow logging altogether,
so that dmesg buffer is not flooded over long run.

In addition to existing sysctl kern.ipc.sooverinterval,
introduce new sysctl kern.ipc.sooverprio that defaults to 7 (LOG_DEBUG)
to preserve current behavior. It may be changed to any value
in a range of 0..7 for corresponding priority or to -1 to suppress logging.
Document it in the listen.2 manual page.

MFC after:	1 month
2023-05-01 03:26:44 +07:00
Konstantin Belousov
93ca6ff295 umtx: allow to configure minimal timeout (in nanoseconds)
PR:	270785
Reviewed by:	markj, mav
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D39584
2023-04-19 02:22:28 +03:00
Val Packett
77f0e198d9 procctl: add state flags to PROC_REAP_GETPIDS reports
For a process supervisor using the reaper API to track process subtrees,
it is very useful to know the state of the processes on the list.

Sponsored by:   https://www.patreon.com/valpackett
Reviewed by:    kib
MFC after:	1 week
Differential Revision: https://reviews.freebsd.org/D39585
2023-04-16 13:48:20 +03:00
Konstantin Belousov
54579376c0 Change kqueue1() to be compatible with NetBSD
by making it accept some open(2) flags.  More precisely, only
O_CLOEXEC is supported, the flag is translated into the KQUEUE_CLOEXEC flag
for kqueuex(2), and O_NONBLOCK is silently ignored.

Reported and tested by:	vishwin
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D39377
2023-04-05 06:29:49 +03:00
Ed Maste
20c9c3be5a kqueue: add close() calls to man page example
There is no real need to close descriptors before a process exits, but
these close calls demonstrate by example that kqueue descriptors occupy
the same namespace as other file descriptors.

Reviewed by:	fernape, markj
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39376
2023-04-04 09:29:53 -04:00
Konstantin Belousov
dac3102488 Rename kqueue1(2) to kqueuex(2) to avoid compat issues with NetBSD
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D39377
2023-04-04 16:19:08 +03:00
Ed Maste
d860991a72 kqueue: tidy up indentation in man page example
Fixes: e07b0c12ba ("[patch][doc] Fix EXAMPLE in kqueue(2)")
Sponsored by:	The FreeBSD Foundation
2023-03-31 14:48:39 -04:00
Stefan Eßer
9d33a9d96f Fix typo in statfs man page
There are FAT12 and FAT16 file systems, but FAT13 of was an
unintentional invention of mine ...

Reported by:	Ravi Pokala <rpokala@freebsd.org>
MFC after:	1 month
2023-03-29 10:11:19 +02:00
Stefan Eßer
c33db74b53 fs/msdosfs: add tracking of free root directory entries
This update implements tallying of free directory entries during
create, delete,	or rename operations on FAT12 and FAT16 file systems.

Prior to this change, the total number of root directory entries
was reported as number of inodes, but 0 as the number of free
inodes, causing system health monitoring software to warn about
a suspected disk full issue.

The FAT12 and FAT16 file systems provide a limited number of
root directory entries, e.g. 512 on typical hard disk formats.
The valid range of values is 1 to 65535, but the msdosfs code
will effectively round up "odd" values to the next multiple of 16
(e.g. 513 would allow for 528 root directory entries).

This update implements tracking of directory entries during create,
delete, or rename operations, with initial values determined by
scanning the directory when the file system is mounted.

Total and free directory entries are reported in the f_files and
f_ffree elements of struct statfs, despite differences in semantics
of these values:

- There is no limit on the number of files and directories that can
  be created on a FAT file system. Only the root directory of FAT12
  and FAT16 file systems is limited, any number of files can still be
  created in sub-directories, even when 0 free "inodes" are reported.

- A single file can require 1 to 21 directory entries, depending on
  the character set, structure, and length of the name. The DOS 8.3
  style file name takes up 1 entry, and if the name does not comply
  with the syntax of a DOS 8.3 file name, 1 additional entry is used
  for each 13 characters of the file name. Since all these entries
  have to be contiguous, it is possible that a file or directory with
  a long name can not be created, despite a sufficient total number of
  free directory entries.

- Renaming a file can require more directory entries than currently
  allocated to store its long name, which may prevent an in-place
  update of the name if more entries are needed. This may cause a
  rename operation to fail if no contiguous range of free entries for
  the new name can be found.

- The volume label is stored in a directory entry. An empty FAT file
  system with a volume label will therefore show 1 used "inode" in
  df.

- The perceentage of free inodes shown in df or monitoring tools does
  only represent the state of the root directory of a FAT12 or FAT16
  file system. Neither does a reported value of 0% free inodes does
  prevent files from being created in sub-directories, nor does a
  value of 50% free inodes guarantee that even a single file with
  a "long" name can be created in the root directory (if every other
  directory entry is occupied and there are no 2 contiguous entries).

The statfs(2) and df(1) man pages have been updated with a notice
regarding the possibly different semantics of values reported as
total and free inodes for non-Unix file systems.

PR:		270053
Reported by:	Ben Woods <woodsb02@freebsd.org>
Approved by:	mckusick
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D38987
2023-03-29 08:46:01 +02:00
Konstantin Belousov
f2ec444be5 kqueue1(2): document
Reviewed by:	emaste, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D39271
2023-03-28 02:39:26 +03:00
Konstantin Belousov
375732cc6e kqueue1(2): export the symbol from libc
Reviewed by:	emaste, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D39271
2023-03-28 02:39:26 +03:00
Mitchell Horne
d55c187738 kern_reboot(9): some updates
- This function no longer disables interrupts
 - MLINK to reboot.9
 - The mentions of autoconfiguration is more about shutdown_nice(),
   coming in the next commit.
 - Describe the RB_* flags relevant to this function
 - Describe behaviour when shutdown hooks fail the reset
 - Describe expected execution contexts
 - Add FF copyright
 - xref panic(9)
 - xref this page in reboot(2)

Reviewed by:	markj
Discussed with:	rpokala, Pau Amma <pauamma@gundo.com>
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D39133
2023-03-20 17:12:12 -03:00
Xin LI
75798f9b01 cap_*(2): Document ENOSYS behavior.
Summary:
All cap_* system calls would fail when capability mode support is
not present.

MFC after:	2 weeks
Reviewed by:	emaste, pauamma
Differential Revision: https://reviews.freebsd.org/D38976
2023-03-09 18:10:50 -08:00
Val Packett
c7a8502bdf open.2: describe O_RESOLVE_BENEATH errors correctly
The behavior is the same as in capability mode, it does not actually
return EINVAL for absolute lookups:

    openat(AT_FDCWD,"/tmp/test",O_RDONLY|O_DIRECTORY,00) = 3 (0x3)
    openat(3,"../../",O_RDONLY|0x800000,00)          ERR#93 'Capabilities insufficient'
    openat(3,"/etc/passwd",O_RDONLY|0x800000,00)     ERR#93 'Capabilities insufficient'

Fixes:          1f305be43 ("Document {O,AT}_RESOLVE_BENEATH...")
Reviewed by:    kib, pauamma (manpages), emaste
Sponsored by:   https://www.patreon.com/valpackett
Pull Request:	https://github.com/freebsd/freebsd-src/pull/680
Differential Revision: https://reviews.freebsd.org/D38675
2023-03-02 15:58:00 -05:00
Warner Losh
5f044c4e05 profil(2): profil(II) was in the v3 sources
profil(II) is in the scanned 3rd edition manual that we have. We don't
have the 3rd edition sources, nor do we have the 4th edition souces. We
have a mostly complete (missing pipes) 4th edition C rewrite where
profil system call number is reserved, but it's not implemented (it's in
the manx section for things that apeared to have been in 3rd edition but
weren't yet part of the reimplemented 4th edition). The 5th edition
sources we have do have it, however. For other items that have appeared
in earlier manuals, we've added the simple verbage to the manual and
relegated the rest of the data for that file to the commit message.
2023-02-15 12:44:32 -07:00
Warner Losh
ae902a5be9 libc: Simplify soft-float on 32-bit arm
Simplify the tests for 32-bit arm soft float support. For the files
included only on arm, drop the test entirely. For others, test
MACHINE_CPUARCH against arm.

No functional change intended. File lists appear the same before / after
the change.

Sponsored by:		Netflix
Reviewed by:		emaste
Differential Revision:	https://reviews.freebsd.org/D38582
2023-02-14 09:53:08 -07:00
Mark Johnston
5f03f96fbe shm: Document shm_create_largepage()
While here, move notes about FreeBSD-specific functionality to the
COMPATIBILITY section, and document the ECAPMODE error for shm_open().

Reviewed by:	pauamma, kib
MFC after:	2 weeks
Sponsored by:	Klara, Inc.
Sponsored by:	Juniper Networks, Inc.
Differential Revision:	https://reviews.freebsd.org/D38282
2023-02-03 11:48:25 -05:00
Dmitry Chagin
c21b080f3d cpuset: Fix sched_[g|s]etaffinity() for better compatibility with Linux.
Under Linux to sched_[g|s]etaffinity() functions the value returned from a call
to gettid(2) (thread id) can be passed in the argument pid. Specifying pid as 0
will set the attribute for the calling thread, and passing the value returned
from a call to getpid(2) (process id) will set the attribute for the main thread
of the thread group.

Native cpuset(2) family of system calls has "which" argument to determine how
the value of id argument is interpreted, i.e., CPU_WHICH_TID is used to pass
a thread id and CPU_WHICH_PID - to pass a process id.

For now native sched_[g|s]etaffinity() implementation is wrong as uses "which"
CPU_WHICH_PID to pass both (process and thread id) to the kernel. To fix this
adding a new "which" CPU_WHICH_TIDPID intended to handle both id's.

Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D38209
MFC after:		1 week
2023-01-29 16:17:33 +03:00
Alexander V. Chernikov
b0286ee504 man: add Netlink reference to socket(2)
Reviewed by:	lwhsu, pauamma, gbe
Differential Revision: https://reviews.freebsd.org/D38054
2023-01-15 11:27:43 +00:00
Val Packett
0b4531511e copyright: chase my name and email change
Reviewed by:	imp, emaste
Differential Revision:	https://reviews.freebsd.org/D37945
2023-01-06 15:28:42 -05:00
Konstantin Belousov
a98613f238 ptrace(2): document PT_SC_REMOTE
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37590
2022-12-22 23:11:42 +02:00
Konstantin Belousov
0e07241c37 ptrace(2): explain how to select specific thread to operate on
Requested by:	emaste
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37590
2022-12-22 23:11:35 +02:00
John Baldwin
120eff994a ptrace.2: Fix warnings from igor.
Reviewed by:	pauamma, imp
Differential Revision:	https://reviews.freebsd.org/D37689
2022-12-15 11:26:57 -08:00
Alan Somers
34120c0c52 [skip ci] document first appearance of fhlink et al
MFC after:	1 week
Sponsored by:	Axcient
Reviewed by:	imp
Differential Revision: https://reviews.freebsd.org/D37575
2022-11-30 14:57:56 -07:00
Mark Johnston
d0f8e31761 getsockopt.2: Clarify the SO_REUSEPORT_LB text a bit
Refer to sockets rather than processes, since one can have multiple
sockets in a load-balancing group within the same process.

MFC after:	1 week
Sponsored by:	Modirum MDPay
Sponsored by:	Klara, Inc.
2022-11-02 13:46:24 -04:00
John Baldwin
c9c9057c77 ktrace.2: Document KTRFAC_STRUCT_ARRAY.
Sponsored by:	DARPA
2022-11-02 10:35:26 -07:00
Michael Tuexen
bc0d407676 Revert "listen(): improve POSIX compliance"
This reverts commit 76e6e4d72f.

Several programs in the tree use -1 instead of INT_MAX to use
the maximum value. Thanks to Eugene Grosbein for pointing this
out.
2022-10-12 04:33:00 +02:00
Michael Tuexen
76e6e4d72f listen(): improve POSIX compliance
Ensure that a negative backlog argument is handled as it if was 0.

Reviewed by:		markj@, glebius@
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D31821
2022-10-11 22:46:51 +02:00
Benedict Reuschling
44b0b943b8 Revert "Add extra EINVAL information about wrong block size to read(2)/write(2)"
This reverts commit 1c2be25f60.

kib@ pointed out that it is perfectly fine to write at arbitrary regular
file offsets. For example, in a 4K block size character device, geom
doesn't support writing / reading 515 byte blocks. The description is
perhaps not applicable to all EINVALs returned.
2022-10-08 10:23:51 +00:00
Benedict Reuschling
1c2be25f60 Add extra EINVAL information about wrong block size to read(2)/write(2)
The read system call will return EINVAL if the current file offset is
not a multiple of the block size. This also applies to write(2). Add an
entry for EINVAL about this error to both man pages.

PR:			91149
Event:			Aberdeen Hackathon 2022
Differential Revision:	https://reviews.freebsd.org/D24617
2022-10-07 11:32:37 +00:00
Brooks Davis
bb23932803 ktrace: make ktr_tid a long not intptr_t (NFC)
Long ago, ktr_tid was ktr_buffer which pointed to the buffer following
the header and was used internally in the kernel.  Use was removed in
efbbbf570d and it was repurposed as ktr_kid in c6854c347f.  For
ABI reasons, it stayed an intptr_t rather than becoming an lwpid_t at
the time.  Since it doesn't hold a pointer any more (unless you have
a ktrace.out from 2005), change the type to long which is alwasy the
same size on all supported architectures.  Add a suggestion to change
the type to lwpid_t (__int32_t) on a future ABI break.

Remove most remaining references to ktr_buffer, retaing a comment in
kdump.c explaining why negative values are treated as 0.  While here,
accept that pid_t and lwpid_t are of type int and simplify casts in
printf.

This changed was motivated by CheriBSD where intptr_t is 16-bytes
in the pure-capability ABI.

Reviewed by:	kib, markj
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D36599
2022-09-17 09:21:59 +01:00
Jens Schweikhardt
714300faec Change sysctl section to 3 as suggested by Benjamin Kaduk. 2022-09-17 09:35:49 +02:00
Jens Schweikhardt
e9e615c88a Fix dead references (wrong section) to sysctl(8). 2022-09-16 20:00:49 +02:00
Gleb Smirnoff
8624f4347e divert: declare PF_DIVERT domain and stop abusing PF_INET
The divert(4) is not a protocol of IPv4.  It is a socket to
intercept packets from ipfw(4) to userland and re-inject them
back.  It can divert and re-inject IPv4 and IPv6 packets today,
but potentially it is not limited to these two protocols.  The
IPPROTO_DIVERT does not belong to known IP protocols, it
doesn't even fit into u_char.  I guess, the implementation of
divert(4) was done the way it is done basically because it was
easier to do it this way, back when protocols for sockets were
intertwined with IP protocols and domains were statically
compiled in.

Moving divert(4) out of inetsw accomplished two important things:

1) IPDIVERT is getting much closer to be not dependent on INET.
   This will be finalized in following changes.
2) Now divert socket no longer aliases with raw IPv4 socket.
   Domain/proto selection code won't need a hack for SOCK_RAW and
   multiple entries in inetsw implementing different flavors of
   raw socket can merge into one without requirement of raw IPv4
   being the last member of dom_protosw.

Differential revision:	https://reviews.freebsd.org/D36379
2022-08-30 15:09:21 -07:00
Gleb Smirnoff
a358db5603 socket(2): bring documentation up tp date
o Undocument sockets that are no longer supported, or never were.
o Add AF_HYPERV. Note: PF_HYPERV isn't defined, no typo here.
o Point at ip(4) and ip6(4) instead of unwelcoming "not described here".

Reviewed by:		gbe, markj
Differential revision:	https://reviews.freebsd.org/D36284
2022-08-26 08:16:15 -07:00
Ed Maste
5b5fa75acf libc: drop "All rights reserved" from Foundation copyrights
This has already been done for most files that have the Foundation as
the only listed copyright holder.  Do it now for files that list
multiple copyright holders, but have the Foundation copyright in its own
section.

Sponsored by:	The FreeBSD Foundation
2022-08-04 16:57:50 -04:00
Alexander V. Chernikov
426afc8f52 recv: bump manpage date after be1f485d7d.
MFC after:	1 month
2022-08-02 13:24:14 +00:00
Alexander V. Chernikov
be1f485d7d sockets: add MSG_TRUNC flag handling for recvfrom()/recvmsg().
Implement Linux-variant of MSG_TRUNC input flag used in recv(), recvfrom() and recvmsg().
Posix defines MSG_TRUNC as an output flag, indicating packet/datagram truncation.
Linux extended it a while (~15+ years) ago to act as input flag,
resulting in returning the full packet size regarless of the input
buffer size.
It's a (relatively) popular pattern to do recvmsg( MSG_PEEK | MSG_TRUNC) to get the
packet size, allocate the buffer and issue another call to fetch the packet.
In particular, it's popular in userland netlink code, which is the primary driving factor of this change.

This commit implements the MSG_TRUNC support for SOCK_DGRAM sockets (udp, unix and all soreceive_generic() users).

PR:		kern/176322
Reviewed by:	pauamma(doc)
Differential Revision: https://reviews.freebsd.org/D35909
MFC after:	1 month
2022-07-30 18:21:51 +00:00
Ed Maste
560e22c8fe Remove "All Rights Reserved" from FreeBSD Foundation libc copyrights
As per the updated FreeBSD copyright template.  These were unambiguous
cases where the Foundation was the only listed copyright holder.

Sponsored by:	The FreeBSD Foundation
2022-07-21 09:53:31 -04:00
Mateusz Piotrowski
16e4487e5f clock_gettime.2: Clarify CLOCK_*
Clarify that CLOCK_* (e.g., CLOCK_REALTIME) do not necessarily default
to CLOCK_*_FAST.

PR:		259642
2022-07-08 21:57:24 +02:00
Mateusz Piotrowski
13a9da7d55 clock_gettime.2: Add cross references and fix linter warnings
MFC after:	3 days
2022-07-08 21:51:03 +02:00
Warner Losh
ef86876b84 pselect(2): Document what a null pointer for the signalmask means
When pselect is passed a null pointer for the signal mask, the standard
says it shall behave like select (except for the different timeout
arg). Make a note of that here.

Sponsored by:		Netflix
2022-07-02 13:41:46 -06:00
Mark Johnston
b8ec0ce5b4 wait.2: Remove sys/types.h from the list of required headers
wait.h is self-contained.

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2022-06-30 10:31:26 -04:00
Gleb Smirnoff
48a55bbfe9 unix: change error code for recvmsg() failed due to RLIMIT_NOFILE
Instead of returning EMSGSIZE pass the error code from fdallocn() directly
to userland.  That would be EMFILE, which makes much more sense.  This
error code is not listed in the specification[1], but the specification
doesn't cover such edge case at all.  Meanwhile the specification lists
EMSGSIZE as the error code for invalid value of msg_iovlen, and FreeBSD
follows that, see sys_recmsg().  Differentiating these two cases will make
a developer/admin life much easier when debugging.

[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/recvmsg.html

Reviewed by:		markj
Differential revision:	https://reviews.freebsd.org/D35640
2022-06-29 09:42:58 -07:00
Mark Johnston
6405997f45 kevent.2: Add an xref to listen.2
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2022-06-20 12:48:14 -04:00
Sergey Kandaurov
5dd1f6f144 utimensat(2): Remove description of compatibility code
Commit e0e0323354 removed compat stubs for
kernels that did not have futimens() and utimensat() system calls, but
removed the documentation for them in the manual page only partially.

Remove the rest of the documentation of the compatibility code.

MFC after:	1 week
2022-06-12 22:57:31 +02:00
Dmitry Chagin
14c99b43ef kqueue: Fix kqueue(2) man page.
Remove bogus BUGS note about timeout limit to 24 hours, that's not true
since callouting project import.

Reviewed by:		mav
Differential revision:	https://reviews.freebsd.org/D35206
MFC after:		2 weeks
2022-05-14 14:52:51 +03:00
Dmitry Chagin
f35093f8d6 Use Linux semantics for the thread affinity syscalls.
Linux has more tolerant checks of the user supplied cpuset_t's.

Minimum cpuset_t size that the Linux kernel permits in case of
getaffinity() is the maximum CPU id, present in the system / NBBY,
the maximum size is not limited.
For setaffinity(), Linux does not limit the size of the user-provided
cpuset_t, internally using only the meaningful part of the set, where
the upper bound is the maximum CPU id, present in the system, no larger
than the size of the kernel cpuset_t.
Unlike FreeBSD, Linux ignores high bits if set in the setaffinity(),
so clear it in the sched_setaffinity() and Linuxulator itself.

Reviewed by:		Pau Amma (man pages)
In collaboration with:	jhb
Differential revision:	https://reviews.freebsd.org/D34849
MFC after:		2 weeks
2022-05-11 10:36:01 +03:00
Gordon Bergling
4b7f35db44 libc: Add HISTORY sections to the manual pages
There are some sections which could be improved
and work to do so is on going. The work will be
covered via 'X-MFC-WITH' commits.

Obtained from:	OpenBSD
MFC after:	1 month
Differential Revision: https://reviews.freebsd.org/D34759
2022-05-05 18:46:32 +02:00
Dmitry Chagin
89ecdff2c3 Fix sigtimedwait manpage.
Historically, sigtimedwait() blocks indefinitely if timeout is NULL.

Reviewed by:		jilles, imp
Differential Revision:	https://reviews.freebsd.org/D34985
MFC after:		2 weeks
2022-04-21 10:52:29 +03:00