1844 Commits

Author SHA1 Message Date
Ed Maste
f6a10ccc53 Use consistent struct stat arg name in stat man page
stat, lstat, and fstat use `*sb` as the stat struct pointer arg name,
while fstatat previously used `*buf`.

MFC after:	1 week
2019-03-13 15:18:14 +00:00
Ed Maste
d95826c43d poll.2: POLLNVAL is returned also for insufficient rights
Reported by:	"Bora Özarslan" <borako.ozarslan@gmail.com>
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2019-02-27 17:52:22 +00:00
Konstantin Belousov
9fb91a0a7d procctl(2): document ASLR knobs.
Reviewed by:	0mp
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D19308
2019-02-26 17:41:41 +00:00
Konstantin Belousov
80a3fa4893 procctl(2): fix -width parameter to .Bl.
According to 0mp, macros are not expanded in the argument provided to
-width.  Use plain identifiers for width specification.

Noted and reviewed by:	0mp
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
Differential revision:	https://reviews.freebsd.org/D19308
2019-02-26 17:35:06 +00:00
Gleb Smirnoff
5bfb2e008d Imaginary cat jumped my keyboard! 2019-02-15 23:46:34 +00:00
Gleb Smirnoff
66fb0b1ad7 For 32-bit machines rollback the default number of vnode pager pbufs
back to the lever before r343030.  For 64-bit machines reduce it slightly,
too.  Together with r343030 I bumped the limit up to the value we use at
Netflix to serve 100 Gbit/s of sendfile traffic, and it probably isn't a
good default.

Provide a loader tunable to change vnode pager pbufs count. Document it.
2019-02-15 23:36:22 +00:00
Sergey Kandaurov
78c8b9477c Document the ENOBUFS errno in setsockopt(2).
In particular, it is the case if SO_SNDBUF/SO_RCVBUF would exceed sb_max_adj.

PR:		200649
MFC after:	1 week
2019-02-09 21:33:32 +00:00
Enji Cooper
2b9ecf4896 Document that sendfile will return an invalid value for sbytes if provided an invalid address
This is meant to clarify the fact that the system call will not fail
with -1/EFAULT, as one might expect, when reading the sendfile(2)
manpage today.

While here, pet the mandoc linter, when dealing with the section that
describes valid values for `flags`.

PR:	232210
MFC after:	2 weeks
Approved by:	emaste (mentor)
Reviewed by:	glebius, 0mp
Differential Revision: https://reviews.freebsd.org/D18949
2019-01-25 19:56:02 +00:00
Kirk McKusick
88640c0e8b Create new EINTEGRITY error with message "Integrity check failed".
An integrity check such as a check-hash or a cross-correlation failed.
The integrity error falls between EINVAL that identifies errors in
parameters to a system call and EIO that identifies errors with the
underlying storage media. EINTEGRITY is typically raised by intermediate
kernel layers such as a filesystem or an in-kernel GEOM subsystem when
they detect inconsistencies. Uses include allowing the mount(8) command
to return a different exit value to automate the running of fsck(8)
during a system boot.

These changes make no use of the new error, they just add it. Later
commits will be made for the use of the new error number and it will
be added to additional manual pages as appropriate.

Reviewed by:    gnn, dim, brueffer, imp
Discussed with: kib, cem, emaste, ed, jilles
Differential Revision: https://reviews.freebsd.org/D18765
2019-01-17 06:35:45 +00:00
Konstantin Belousov
ea7e7006db Implement shmat(2) flag SHM_REMAP.
Based on the description in Linux man page.

Reviewed by:	markj, ngie (previous version)
Sponsored by:	Mellanox Technologies
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D18837
2019-01-16 05:15:57 +00:00
Konstantin Belousov
3fbc2e00d1 Add a tunable which changes mincore(2) algorithm to only report data
from the local mapping.

Enable the setting by default.
The article behind the change: https://arxiv.org/abs/1901.01161

Reviewed by:	markj
Discussed with:	emaste
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D18764
2019-01-07 22:10:48 +00:00
Jilles Tjoelker
8cc4b29d5a thr_wake(2): Minor mdoc fixes
MFC after:	1 week
2019-01-06 21:34:05 +00:00
Mark Johnston
2f2ddd68a5 Support MSG_DONTWAIT in send*(2).
As it does for recv*(2), MSG_DONTWAIT indicates that the call should
not block, returning EAGAIN instead.  Linux and OpenBSD both implement
this, so the change makes porting easier, especially since we do not
return EINVAL or so when unrecognized flags are specified.

Submitted by:	Greg V <greg@unrelenting.technology>
Reviewed by:	tuexen
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D18728
2019-01-04 17:31:50 +00:00
Konstantin Belousov
eba8ab0e3e Remove special case handling for getfhat(fd, NULL, handle).
There is no reason for it to behave differently from openat(fd, NULL).
Also the handling did not worked because the substituted path was from
the system address space, causing EFAULT.

Submitted by:	Jack Halford <jack@gandi.net>
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D18501
2018-12-11 02:48:49 +00:00
Konstantin Belousov
d1fd400a80 Add new file handle system calls.
Namely, getfhat(2), fhlink(2), fhlinkat(2), fhreadlink(2).  The
syscalls are provided for a NFS userspace server (nfs-ganesha).

Submitted by:	Jack Halford <jack@gandi.net>
Sponsored by:	Gandi.net
Tested by:	pho
Feedback from:	brooks, markj
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D18359
2018-12-07 15:17:29 +00:00
Alan Somers
006678fd05 stat(2): clarify which syscalls modify file timestamps
The list of syscalls that modify st_atim, st_mtim, and st_ctim was quite out
of date and probably not accurate to begin with.  Update it, and make it
clear that the list is open-ended.

Reviewed by:	kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D18410
2018-12-05 17:28:40 +00:00
Alan Somers
a14a34ef62 fcntl.2: document an additional error condition
MFC after:	2 weeks
2018-11-15 16:13:25 +00:00
Konstantin Belousov
1c4ca77890 Add d_off support for multiple filesystems.
The d_off field has been added to the dirent structure recently.
Currently filesystems don't support this feature.  Support has been
added and tested for zfs, ufs, ext2fs, fdescfs, msdosfs and unionfs.
A stub implementation is available for cd9660, nandfs, udf and
pseudofs but hasn't been tested.

Motivation for this feature: our usecase is for a userspace nfs server
(nfs-ganesha) with zfs.  At the moment we cache direntry offsets by
calling lseek once per entry, with this patch we can get the offset
directly from getdirentries(2) calls which provides a significant
speedup.

Submitted by:	Jack Halford <jack@gandi.net>
Reviewed by:	mckusick, pfg, rmacklem (previous versions)
Sponsored by:	Gandi.net
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D17917
2018-11-14 14:18:35 +00:00
Konstantin Belousov
5b1fb8ec66 First draft of documentation for AT/O_BENEATH handling of the absolute
paths.

It was decided that committing the code and drafting of the man page
update is better than allowing the code to rot until wordsmithing
happens.

Reviewed by:	jilles (previous version)
Discussed with:	brooks, jilles, emaste
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D17714
2018-11-11 01:46:48 +00:00
Conrad Meyer
78c2a9806e kern_poll: Restore explanatory comment removed in r177374
The comment isn't stale.  The check is bogus in the sense that poll(2)
does not require pollfd entries to be unique in fd space, so there is no
reason there cannot be more pollfd entries than open or even allowed
fds.  The check is mostly a seatbelt against accidental misuse or
abuse.  FD_SETSIZE, while usually unrelated to poll, is used as an
arbitrary floor for systems with very low kern.maxfilesperproc.

Additionally, document this possible EINVAL condition in the poll.2
manual.

No functional change.

Reviewed by:	markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D17671
2018-11-01 23:46:23 +00:00
Warner Losh
f64bccc6d9 Bump .Dd forgotten in last commit. 2018-10-28 03:02:09 +00:00
Warner Losh
5669c6748d Note that the kenrel doesn't keep track daylight savings time, nor
timezone offset. These values are generally zero.

While one still theoreticall could set these values, that's almost
never done. Users wishing to have an offset between the time of day
clock hardware and UTC use adjkerntz(8) instead.

localtime(3) should be used to find these values for the current
timezone.
2018-10-28 02:58:22 +00:00
Konstantin Belousov
4f77f48884 Implement O_BENEATH and AT_BENEATH.
Flags prevent open(2) and *at(2) vfs syscalls name lookup from
escaping the starting directory.  Supposedly the interface is similar
to the same proposed Linux flags.

Reviewed by:	jilles (code, previous version of manpages), 0mp (manpages)
Discussed with:	allanjude, emaste, jonathan
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D17547
2018-10-25 22:16:34 +00:00
Mark Johnston
b8e4cdda35 Clarify slightly the interaction between wait*() and pdfork().
There are multiple ways to wait for any child process to return a
status (e.g., waitpid(-1, ...), waitid(P_ALL, ...)), so don't be so
specific.

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2018-10-24 18:42:13 +00:00
Poul-Henning Kamp
01652e9c8e Update example to something people less than 40 years old have heard about. 2018-10-21 07:30:26 +00:00
Edward Tomasz Napierala
edbedaf4dc Add .Xrs to kqueue(2) from pdfork(2) and procdesc(4), to make EVFILT_PROCDESC
easier to find.

Approved by:	re (rgrimes)
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2018-10-14 18:42:54 +00:00
Allan Jude
c452913091 Document that sendfile(2) can return ENOTCAPABLE
PR:		232207
Submitted by:	Enji Cooper <yaneurabeya@gmail.com>
Approved by:	re (rgrimes)
2018-10-13 02:20:16 +00:00
Michael Tuexen
6b01d4d433 Add SOL_SOCKET level socket option with name SO_DOMAIN to get
the domain of a socket.

This is helpful when testing and Solaris and Linux have the same
socket option using the same name.

Reviewed by:		bcr@, rrs@
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D16791
2018-08-21 14:04:30 +00:00
Mateusz Piotrowski
c8b8b38e5f Document socket control message routines for ancillary data access (CMSG_DATA).
PR:		227777
Reviewed by:	bcr, eadler
Approved by:	mat (mentor), manpages (bcr)
Obtained from:	OpenBSD
Differential Revision:	https://reviews.freebsd.org/D15215
2018-08-19 17:42:49 +00:00
Jamie Gritton
c542c43ef1 Revert r337922, except for some documention-only bits. This needs to wait
until user is changed to stop using jail(2).

Differential Revision:	D14791
2018-08-16 19:09:43 +00:00
Jamie Gritton
284001a222 Put jail(2) under COMPAT_FREEBSD11. It has been the "old" way of creating
jails since FreeBSD 7.

Along with the system call, put the various security.jail.allow_foo and
security.jail.foo_allowed sysctls partly under COMPAT_FREEBSD11 (or
BURN_BRIDGES).  These sysctls had two disparate uses: on the system side,
they were global permissions for jails created via jail(2) which lacked
fine-grained permission controls; inside a jail, they're read-only
descriptions of what the current jail is allowed to do.  The first use
is obsolete along with jail(2), but keep them for the second-read-only use.

Differential Revision:	D14791
2018-08-16 18:40:16 +00:00
Conrad Meyer
ba9ace7436 settimeofday(2): Remove stale note about timezone
Contrary to the removed comment, the kernel does appear to use the timezone
argument of settimeofday.  The comment dates to the BSD4.4 import; I assume it
is just stale.
2018-08-04 22:08:24 +00:00
Ruslan Bukin
42570cd1d4 MAXLOGNAME changed to 33 in r243023.
Update man pages.

Sponsored by:	DARPA, AFRL
2018-08-03 16:05:03 +00:00
David Bright
95c05062ec Allow a EVFILT_TIMER kevent to be updated.
If a timer is updated (re-added) with a different time period
(specified in the .data field of the kevent), the new time period has
no effect; the timer will not expire until the original time has
elapsed. This violates the documented behavior as the kqueue(2) man
page says (in part) "Re-adding an existing event will modify the
parameters of the original event, and not result in a duplicate
entry."

This modification, adapted from a patch submitted by cem@ to PR214987,
fixes the kqueue system to allow updating a timer entry. The
kevent timer behavior is changed to:

  * When a timer is re-added, update the timer parameters to and
    re-start the timer using the new parameters.
  * Allow updating both active and already expired timers.
  * When the timer has already expired, dequeue any undelivered events
    and clear the count of expirations.

All of these changes address the original PR and also bring the
FreeBSD and macOS kevent timer behaviors into agreement.

A few other changes were made along the way:

  * Update the kqueue(2) man page to reflect the new timer behavior.
  * Fix man page style issues in kqueue(2) diagnosed by igor.
  * Update the timer libkqueue system test to test for the updated
    timer behavior.
  * Fix the (test) libkqueue common.h file so that it includes
    config.h which defines various HAVE_* feature defines, before the
    #if tests for such variables in common.h. This enables the use of
    the actual err(3) family of functions.
  * Fix the usages of the err(3) functions in the tests for incorrect
    type of variables. Those were formerly undiagnosed due to the
    disablement of the err(3) functions (see previous bullet point).

PR:		214987
Reported by:	Brian Wellington <bwelling@xbill.org>
Reviewed by:	kib
MFC after:	1 week
Relnotes:	yes
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D15778
2018-07-27 13:49:17 +00:00
Konstantin Belousov
b3042426d0 Remove bits of the old NUMA.
Remove numactl(1), edit numa(4) to bring it some closer to reality,
provide libc ABI shims for old NUMA syscalls.

Noted and reviewed by:	brooks (previous version)
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D16142
2018-07-10 22:00:20 +00:00
Brooks Davis
7cc923f8a8 Get rid of netbsd_lchown and netbsd_msync syscall entries.
No valid FreeBSD binary very called them (they would call lchown and
msync directly) and we haven't supported NetBSD binaries in ages.

This is a respin of r335983 with a workaround for the ancient BFD linker
in the libc stubs.

Reviewed by:	kib
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16193
2018-07-10 13:32:04 +00:00
Warner Losh
bdea3adca6 Tweak documentation to RB_ constants to reflect current use
RB_ASKNAME is no longer instructions to the boot loader to request a
prompt for which kernel to boot. Instead, it asks for what the root
file system to use. RB_INITNAME is unused, and never has been in
FreeBSD as far as I can tell. Remove it from the documentation and fix
comment. RB_SELFTEST and RB_MINIROOT likewise (though they were
completely undocumented). These last three constants can likely just
be deleted as nothing references them (even to set useless bits).

RB_ASKNAME doesn't actually survive reboot, however, so needs to be
communicated to the bootloader via other means. If the bootloader sets
it, though, it will be honored.
2018-07-10 00:01:14 +00:00
Brooks Davis
714c03c81e Revert r335983.
The bfd linker in tree doesn't support multiple names for the same
symbol (at least with current flags).
2018-07-05 16:03:03 +00:00
Brooks Davis
5b04a71dae Get rid of netbsd_lchown and netbsd_msync syscall entries.
No valid FreeBSD binary ever called them (they would call lchown and
msync directly) and we haven't supported NetBSD binaries in ages.

Reviewed by:	kib
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15814
2018-07-05 14:12:56 +00:00
Conrad Meyer
e02d32f72e sigaction.2: Minor cleanups
Add vertical space between struct definition and function prototype.

Use "NULL" to describe zero pointers, instead of "zero."

Remove perhaps unclear "can not" and replace.  Tag struct member names used
with appropriate tags.
2018-06-28 18:17:20 +00:00
Ian Lepore
25b10ed4b7 Add some words clarifying that rename(2) does nothing when the 'from' and
'to' args are the same file.  Wording borrowed from POSIX.1-2017, but
the freebsd code to implement this behavior was added in 2002 (r103180).
2018-06-21 15:21:17 +00:00
Sean Bruno
1a43cff92a Load balance sockets with new SO_REUSEPORT_LB option.
This patch adds a new socket option, SO_REUSEPORT_LB, which allow multiple
programs or threads to bind to the same port and incoming connections will be
load balanced using a hash function.

Most of the code was copied from a similar patch for DragonflyBSD.

However, in DragonflyBSD, load balancing is a global on/off setting and can not
be set per socket. This patch allows for simultaneous use of both the current
SO_REUSEPORT and the new SO_REUSEPORT_LB options on the same system.

Required changes to structures:
Globally change so_options from 16 to 32 bit value to allow for more options.
Add hashtable in pcbinfo to hold all SO_REUSEPORT_LB sockets.

Limitations:
As DragonflyBSD, a load balance group is limited to 256 pcbs (256 programs or
threads sharing the same socket).

This is a substantially different contribution as compared to its original
incarnation at svn r332894 and reverted at svn r332967.  Thanks to rwatson@
for the substantive feedback that is included in this commit.

Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Obtained from:	DragonflyBSD
Relnotes:	Yes
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D11003
2018-06-06 15:45:57 +00:00
Mark Johnston
9f9c9b22ec Reimplement brk() and sbrk() to avoid the use of _end.
Previously, libc.so would initialize its notion of the break address
using _end, a special symbol emitted by the static linker following
the bss section.  Compatibility issues between lld and ld.bfd could
cause the wrong definition of _end (libc.so's definition rather than
that of the executable) to be used, breaking the brk()/sbrk()
interface.

Avoid this problem and future interoperability issues by simply not
relying on _end.  Instead, modify the break() system call to return
the kernel's view of the current break address, and have libc
initialize its state using an extra syscall upon the first use of the
interface.  As a side effect, this appears to fix brk()/sbrk() usage
in executables run with rtld direct exec, since the kernel and libc.so
no longer maintain separate views of the process' break address.

PR:		228574
Reviewed by:	kib (previous version)
MFC after:	2 months
Differential Revision:	https://reviews.freebsd.org/D15663
2018-06-04 19:35:15 +00:00
Justin Hibbits
2e65567500 Added ptrace support for reading/writing powerpc VSX registers
Summary:
Added ptrace support for getting/setting the remaining part of the VSX registers
(the part that's not already covered by FPR or VR registers).

This is necessary to add support for VSX registers in debuggers.

Submitted by:	Luis Pires
Differential Revision: https://reviews.freebsd.org/D15458
2018-06-02 19:17:11 +00:00
Mark Johnston
e2c1730299 Remove an inaccuracy from mincore.2.
Super pages are supported on non-x86 architectures, so just remove the
incorrect note.  While here, change terminology to be consistent with
mmap.2.

MFC after:	1 week
2018-06-01 23:40:43 +00:00
Brooks Davis
7351a8bdb5 Make vadvise compat freebsd11.
The vadvise syscall (aka ovadvise) is undocumented and has always been
implmented as returning EINVAL.  Put the syscall under COMPAT11 and
provide a userspace implementation.

Reviewed by:	kib
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15557
2018-05-25 20:40:23 +00:00
Brooks Davis
2357535254 Indicate the brk/sbrk are deprecated and not portable.
More firmly suggest mmap(2) instead.

Include the history of arm64 and riscv shipping without brk/sbrk.

Mention that sbrk(0) produces unreliable results.

Reviewed by:	emaste, Marcin Cieślak
MFC after:	3 days
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15535
2018-05-24 18:32:54 +00:00
Konstantin Belousov
84ffdd6a81 Note that PT_SETSTEP is auto-cleared.
Wording and reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
Differential revision:	https://reviews.freebsd.org/D15054
2018-05-23 17:55:30 +00:00
Sevan Janiyan
d3fff23be8 Use St macro for specifying C standards.
Reported by:	rgrimes@
2018-05-20 21:56:08 +00:00
Sevan Janiyan
d55b77df03 Fix a typo and remove an unneeded Tn macro as highlighted by mandoc -Tlint.
Submitted by:		Mateusz Piotrowski
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D15204
2018-05-20 20:28:17 +00:00