1776 Commits

Author SHA1 Message Date
Konstantin Belousov
3449821376 Update getdirentries(2) page for new struct dirent layout.
Sponsored by:	The FreeBSD Foundation
2017-05-28 09:29:53 +00:00
Edward Tomasz Napierala
a9a393b390 Don't end up manpage titles with a full stop.
MFC after:	2 weeks
2017-05-24 21:02:53 +00:00
Glen Barber
4adb408018 Update the "first appeared in" version in several manual pages.
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2017-05-24 17:50:34 +00:00
Allan Jude
f299c47b52 Allow cpuset_{get,set}affinity in capabilities mode
bhyve was recently sandboxed with capsicum, and needs to be able to
control the CPU sets of its vcpu threads

Reviewed by:	emaste, oshogbo, rwatson
MFC after:	2 weeks
Sponsored by:	ScaleEngine Inc.
Differential Revision:	https://reviews.freebsd.org/D10170
2017-05-24 00:58:30 +00:00
Konstantin Belousov
6992112349 Commit the 64-bit inode project.
Extend the ino_t, dev_t, nlink_t types to 64-bit ints.  Modify
struct dirent layout to add d_off, increase the size of d_fileno
to 64-bits, increase the size of d_namlen to 16-bits, and change
the required alignment.  Increase struct statfs f_mntfromname[] and
f_mntonname[] array length MNAMELEN to 1024.

ABI breakage is mitigated by providing compatibility using versioned
symbols, ingenious use of the existing padding in structures, and
by employing other tricks.  Unfortunately, not everything can be
fixed, especially outside the base system.  For instance, third-party
APIs which pass struct stat around are broken in backward and
forward incompatible ways.

Kinfo sysctl MIBs ABI is changed in backward-compatible way, but
there is no general mechanism to handle other sysctl MIBS which
return structures where the layout has changed. It was considered
that the breakage is either in the management interfaces, where we
usually allow ABI slip, or is not important.

Struct xvnode changed layout, no compat shims are provided.

For struct xtty, dev_t tty device member was reduced to uint32_t.
It was decided that keeping ABI compat in this case is more useful
than reporting 64-bit dev_t, for the sake of pstat.

Update note: strictly follow the instructions in UPDATING.  Build
and install the new kernel with COMPAT_FREEBSD11 option enabled,
then reboot, and only then install new world.

Credits: The 64-bit inode project, also known as ino64, started life
many years ago as a project by Gleb Kurtsou (gleb).  Kirk McKusick
(mckusick) then picked up and updated the patch, and acted as a
flag-waver.  Feedback, suggestions, and discussions were carried
by Ed Maste (emaste), John Baldwin (jhb), Jilles Tjoelker (jilles),
and Rick Macklem (rmacklem).  Kris Moore (kris) performed an initial
ports investigation followed by an exp-run by Antoine Brodin (antoine).
Essential and all-embracing testing was done by Peter Holm (pho).
The heavy lifting of coordinating all these efforts and bringing the
project to completion were done by Konstantin Belousov (kib).

Sponsored by:	The FreeBSD Foundation (emaste, kib)
Differential revision:	https://reviews.freebsd.org/D10439
2017-05-23 09:29:05 +00:00
Enji Cooper
9227de8c73 kill(2): add missing section for sysctl(9)
Reported by:	make manlint
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-05-23 07:46:10 +00:00
Enji Cooper
945cb7775f ptrace(2): clean up trailing whitespace
Reviewed by:	make manlint
MFC after:	2 weeks
2017-05-23 07:45:29 +00:00
Enji Cooper
7661c8b028 open(2): fix manlint warnings
- Sort SEE ALSO .Xr entries.
- Sort sections (HISTORY comes after STANDARDS).

Reported by:	make manlint
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-05-23 07:44:43 +00:00
Enji Cooper
c0f64185c6 rctl_add_rule(2): fix manlint warnings
- Fix commas (either missing or misused) after .Nm entries in SYNOPSIS

Reported by:	make manlint
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-05-23 07:32:57 +00:00
Enji Cooper
60f14b3186 cap_enter(2): fix manlint issues
- Sort SEE ALSO section appropriately.
- Correct section for sysctl(9).

Reported by:	make manlint
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-05-23 07:31:03 +00:00
Enji Cooper
bb09af5f58 _umtx_op(2): fix minor manlint issues
- Sort .Xr entries in SEE ALSO section.
- Sort SEE ALSO and STANDARDS sections properly, in terms of the
  entire document.

Reported by:	make manlint
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-05-23 07:26:45 +00:00
Stephen J. Kiernan
f55d7dd1c9 Add information to open(2) man page about the O_VERIFY flag.
Reviewed by:	bjk wblock
Approved by:	sjg (mentor)
Obtained from:	Juniper Networks, Inc.
2017-05-15 19:32:26 +00:00
Brooks Davis
f19351aad8 Provide a freebsd32 implementation of sigqueue()
The previous misuse of sys_sigqueue() was sending random register or
stack garbage to 64-bit targets.  The freebsd32 implementation preserves
the sival_int member of value when signaling a 64-bit process.

Document the mixed ABI implementation of union sigval and the
incompability of sival_ptr with pointer integrity schemes.

Reviewed by:	kib, wblock
MFC after:	1 week
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D10605
2017-05-05 18:49:39 +00:00
Conrad Meyer
9ac9bc28bb cpuset.2: Document new API options
A follow-up to r317756.  Adrian will chase up the userspace cpuset(1)
additions.

Reported by:	kib@
Sponsored by:	Dell EMC Isilon
2017-05-03 18:46:33 +00:00
Sergey Kandaurov
b11ef6e971 Document kevent EVFILT_EMPTY.
Reviewed by:	hiren
X-MFC with:	r312277
2017-04-18 15:36:13 +00:00
Eric van Gyzen
16fe28bb9f clock_gettime.2: add some clock IDs
Add the CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID clock_id
values to the clock_gettime(2) man page.  Reformat the excessively
long paragraph (sentence!) into a tag list.

Reported by:	jilles in https://reviews.freebsd.org/D10020
MFC after:	3 days
Sponsored by:	Dell EMC
2017-03-22 00:50:36 +00:00
Xin LI
73065ae826 Make space style consistent with earlier entries.
X-MFC with:	r315526
2017-03-20 03:47:15 +00:00
Eric van Gyzen
3f8455b090 Add clock_nanosleep()
Add a clock_nanosleep() syscall, as specified by POSIX.
Make nanosleep() a wrapper around it.

Attach the clock_nanosleep test from NetBSD. Adjust it for the
FreeBSD behavior of updating rmtp only when interrupted by a signal.
I believe this to be POSIX-compliant, since POSIX mentions the rmtp
parameter only in the paragraph about EINTR. This is also what
Linux does. (NetBSD updates rmtp unconditionally.)

Copy the whole nanosleep.2 man page from NetBSD because it is complete
and closely resembles the POSIX description. Edit, polish, and reword it
a bit, being sure to keep any relevant text from the FreeBSD page.

Reviewed by:	kib, ngie, jilles
MFC after:	3 weeks
Relnotes:	yes
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D10020
2017-03-19 00:51:12 +00:00
Maxim Konovalov
a6c1047fce More trap_enotcap spelling fixes.
PR:		217839
Submitted by:	tobik
2017-03-16 13:19:38 +00:00
Maxim Konovalov
f24fc4834a Spell kern.trap_enotcap.
PR:		217836
Submitted by:	tobik
2017-03-16 12:16:23 +00:00
Xin LI
78d7964b46 Implement INHERIT_ZERO for minherit(2).
INHERIT_ZERO is an OpenBSD feature.

When a page is marked as such, it would be zeroed
upon fork().

This would be used in new arc4random(3) functions.

PR:	182610
Reviewed by:	kib (earlier version)
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D427
2017-03-14 17:10:42 +00:00
Warner Losh
fbbd9655e5 Renumber copyright clause 4
Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.

Submitted by:	Jan Schaumann <jschauma@stevens.edu>
Pull Request:	https://github.com/freebsd/freebsd/pull/96
2017-02-28 23:42:47 +00:00
Eric van Gyzen
90c1b723a5 Make several improvements and corrections in the kenv(2) man page
MFC after:	3 days
Sponsored by:	Dell EMC
2017-02-21 19:51:41 +00:00
Ed Schouten
8eb15797b1 Remove unnecessary #includes from the kqueue(2) man page.
Now that <sys/event.h> can be included on its own, adjust the manual
page accordingly. Remove both unnecessary #include statements from the
synopsis and the example code.

While there, also add a note to the BUGS section to mention that
previous versions of this header file still depend on <sys/types.h>.

Reviewed by:	ngie, vangyzen
Differential Revision:	https://reviews.freebsd.org/D9605
2017-02-16 06:52:53 +00:00
Konstantin Belousov
987ff18184 Consistently handle negative or wrapping offsets in the mmap(2) syscalls.
For regular files and posix shared memory, POSIX requires that
[offset, offset + size) range is legitimate.  At the maping time,
check that offset is not negative.  Allowing negative offsets might
expose the data that filesystem put into vm_object for internal use,
esp. due to OFF_TO_IDX() signess treatment.  Fault handler verifies
that the mapped range is valid, assuming that mmap(2) checked that
arithmetic gives no undefined results.

For device mappings, leave the semantic of negative offsets to the
driver.  Correct object page index calculation to not erronously
propagate sign.

In either case, disallow overflow of offset + size.

Update mmap(2) man page to explain the requirement of the range
validity, and behaviour when the range becomes invalid after mapping.

Reported and tested by:	royger (previous version)
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-02-12 21:05:44 +00:00
Jilles Tjoelker
e301fd984a Clean up documentation of AF_UNIX control messages.
Document AF_UNIX control messages in unix(4) only, not split between unix(4)
and recv(2).

Also, warn about LOCAL_CREDS effective uid/gid fields, since the write could
be from a setuid or setgid program (with the explicit SCM_CREDS and
LOCAL_PEERCRED, the credentials are read at such a time that it can be
assumed that the process intends for them to be used in this context).

Reviewed by:	wblock
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D9298
2017-02-03 20:33:23 +00:00
Maxim Sobolev
dd1badb4a3 Improve wording around SO_TS_CLOCK documentation.
Submitted by:	wblock
Differential Revision:	https://reviews.freebsd.org/D9171
2017-01-20 18:37:14 +00:00
Warren Block
7fd5cf0544 Mention sendfile(2) by popular demand.
Submitted by:	alc, kib
MFC after:	1 week
Sponsored by:	iXsystems
Differential Revision:	https://reviews.freebsd.org/D9259
2017-01-20 17:29:59 +00:00
Enji Cooper
d0fd0203fb Replace dot-dot relative pathing with SRCTOP-relative paths where possible
This reduces build output, need for recalculating paths, and makes it clearer
which paths are relative to what areas in the source tree. The change in
performance over a locally mounted UFS filesystem was negligible in my testing,
but this may more positively impact other filesystems like NFS.

LIBC_SRCTOP was left alone so Juniper (and other users) can continue to
manipulate lib/libc/Makefile (and other Makefile.inc's under lib/libc) as
include Makefiles with custom options.

Discussed with:	marcel, sjg
MFC after:	1 week
Reviewed by:	emaste
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D9207
2017-01-20 03:23:24 +00:00
Hans Petter Selasky
f3e7afe2d7 Implement kernel support for hardware rate limited sockets.
- Add RATELIMIT kernel configuration keyword which must be set to
enable the new functionality.

- Add support for hardware driven, Receive Side Scaling, RSS aware, rate
limited sendqueues and expose the functionality through the already
established SO_MAX_PACING_RATE setsockopt(). The API support rates in
the range from 1 to 4Gbytes/s which are suitable for regular TCP and
UDP streams. The setsockopt(2) manual page has been updated.

- Add rate limit function callback API to "struct ifnet" which supports
the following operations: if_snd_tag_alloc(), if_snd_tag_modify(),
if_snd_tag_query() and if_snd_tag_free().

- Add support to ifconfig to view, set and clear the IFCAP_TXRTLMT
flag, which tells if a network driver supports rate limiting or not.

- This patch also adds support for rate limiting through VLAN and LAGG
intermediate network devices.

- How rate limiting works:

1) The userspace application calls setsockopt() after accepting or
making a new connection to set the rate which is then stored in the
socket structure in the kernel. Later on when packets are transmitted
a check is made in the transmit path for rate changes. A rate change
implies a non-blocking ifp->if_snd_tag_alloc() call will be made to the
destination network interface, which then sets up a custom sendqueue
with the given rate limitation parameter. A "struct m_snd_tag" pointer is
returned which serves as a "snd_tag" hint in the m_pkthdr for the
subsequently transmitted mbufs.

2) When the network driver sees the "m->m_pkthdr.snd_tag" different
from NULL, it will move the packets into a designated rate limited sendqueue
given by the snd_tag pointer. It is up to the individual drivers how the rate
limited traffic will be rate limited.

3) Route changes are detected by the NIC drivers in the ifp->if_transmit()
routine when the ifnet pointer in the incoming snd_tag mismatches the
one of the network interface. The network adapter frees the mbuf and
returns EAGAIN which causes the ip_output() to release and clear the send
tag. Upon next ip_output() a new "snd_tag" will be tried allocated.

4) When the PCB is detached the custom sendqueue will be released by a
non-blocking ifp->if_snd_tag_free() call to the currently bound network
interface.

Reviewed by:		wblock (manpages), adrian, gallatin, scottl (network)
Differential Revision:	https://reviews.freebsd.org/D3687
Sponsored by:		Mellanox Technologies
MFC after:		3 months
2017-01-18 13:31:17 +00:00
Maxim Sobolev
339efd75a4 Add a new socket option SO_TS_CLOCK to pick from several different clock
sources to return timestamps when SO_TIMESTAMP is enabled. Two additional
clock sources are:

o nanosecond resolution realtime clock (equivalent of CLOCK_REALTIME);
o nanosecond resolution monotonic clock (equivalent of CLOCK_MONOTONIC).

In addition to this, this option provides unified interface to get bintime
(equivalent of using SO_BINTIME), except it also supported with IPv6 where
SO_BINTIME has never been supported. The long term plan is to depreciate
SO_BINTIME and move everything to using SO_TS_CLOCK.

Idea for this enhancement has been briefly discussed on the Net session
during dev summit in Ottawa last June and the general input was positive.

This change is believed to benefit network benchmarks/profiling as well
as other scenarios where precise time of arrival measurement is necessary.

There are two regression test cases as part of this commit: one extends unix
domain test code (unix_cmsg) to test new SCM_XXX types and another one
implementis totally new test case which exchanges UDP packets between two
processes using both conventional methods (i.e. calling clock_gettime(2)
before recv(2) and after send(2)), as well as using setsockopt()+recv() in
receive path. The resulting delays are checked for sanity for all supported
clock types.

Reviewed by:    adrian, gnn
Differential Revision:  https://reviews.freebsd.org/D9171
2017-01-16 17:46:38 +00:00
Warren Block
b44047f3df Update the shm_open.2 man page to reflect objective reality.
PR:		215612
Submitted by:	rwatson
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D9066
2017-01-13 19:41:02 +00:00
John Baldwin
34ed0c63c8 Rename the 'flags' argument to getfsstat() to 'mode' and validate it.
This argument is not a bitmask of flags, but only accepts a single value.
Fail with EINVAL if an invalid value is passed to 'flag'.  Rename the
'flags' argument to getmntinfo(3) to 'mode' as well to match.

This is a followup to r308088.

Reviewed by:	kib
MFC after:	1 month
2016-12-27 20:21:11 +00:00
Eric van Gyzen
ff07dd913e thr_set_name(): silently truncate the given name as needed
Instead of failing with ENAMETOOLONG, which is swallowed by
pthread_set_name_np() anyway, truncate the given name to MAXCOMLEN+1
bytes.  This is more likely what the user wants, and saves the
caller from truncating it before the call (which was the only
recourse).

Polish pthread_set_name_np(3) and add a .Xr to thr_set_name(2)
so the user might find the documentation for this behavior.

Reviewed by:	jilles
MFC after:	3 days
Sponsored by:	Dell EMC
2016-12-03 01:14:21 +00:00
Mark Johnston
64910ddbff Launder VPO_NOSYNC pages upon vnode deactivation.
As of r234483, vnode deactivation causes non-VPO_NOSYNC pages to be
laundered. This behaviour has two problems:

1. Dirty VPO_NOSYNC pages must be laundered before the vnode can be
   reclaimed, and this work may be unfairly deferred to the vnlru process
   or an unrelated application when the system is under vnode pressure.
2. Deactivation of a vnode with dirty VPO_NOSYNC pages requires a scan of
   the corresponding VM object's memq for non-VPO_NOSYNC dirty pages; if
   the laundry thread needs to launder pages from an unreferenced such
   vnode, it will reactivate and deactivate the vnode with each laundering,
   potentially resulting in a large number of expensive scans.

Therefore, ensure that all dirty pages are laundered upon deactivation,
i.e., when all maps of the vnode are removed and all references are
released.

Reviewed by:	alc, kib
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D8641
2016-11-26 21:00:27 +00:00
Jilles Tjoelker
295159dfa3 open(2): Clarify non-POSIX error when opening a symlink with O_NOFOLLOW.
We return [EMLINK] instead of [ELOOP] when trying to open a symlink with
O_NOFOLLOW, so that the original case of [ELOOP] can be distinguished. Code
like cmp -h and xz takes advantage of this.

PR:		214633
Reviewed by:	kib, imp
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D8586
2016-11-22 22:30:55 +00:00
Gleb Smirnoff
00b5ffde8e Add flag SF_USER_READAHEAD to sendfile(2). When specified, the syscall won't
do any speculations about readahead, and use exactly the amount of readahead
specified by user.  E.g. setting SF_FLAGS(0, SF_USER_READAHEAD) will guarantee
that no readahead at all will be performed.
2016-11-17 21:36:18 +00:00
Edward Tomasz Napierala
319f1fd2ea Document that getfsstat(2) called with MNT_NOWAIT skips file systems
that are in the process of being unmounted.

Reviewed by:	des@ (earlier version)
MFC after:	1 month
2016-11-06 19:37:22 +00:00
John Baldwin
3abf45a148 Use 'cmd' rather than 'command' to match the function prototype. 2016-10-17 22:36:37 +00:00
Bryan Drewery
3617efe593 Improve grammar.
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2016-10-06 17:35:50 +00:00
Conrad Meyer
c038bae74c open.2: Document Capsicum behavior
Document open(2) and openat(2) behavior in Capsicum capability mode.

Reviewed by:	ed (previous version), emaste, rwatson (previous version),
		wblock
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D7947
2016-09-30 23:01:37 +00:00
Konstantin Belousov
6d8f097966 Reword the statement.
Submitted by:	wblock
MFC after:	3 days
2016-09-30 16:02:25 +00:00
Konstantin Belousov
98003d078f Add an article.
Submitted by:	wblock
MFC after:	3 days
2016-09-30 15:47:13 +00:00
Dag-Erling Smørgrav
58d2f848e2 Reinstate Xr macros that were accidentally removed in a previous
commit.  Add some missing cross-references to the SEE ALSO section.
Bump date now that there are content changes.

MFC after:	1 week
2016-09-30 13:05:32 +00:00
Dag-Erling Smørgrav
2224742ff4 Minor markup and wording fixes.
MFC after:	1 week
2016-09-30 13:04:18 +00:00
Dag-Erling Smørgrav
1577b7750e After perusal of the documentation and some experimentation, I found a
version that works with both groff and mandoc.

Hat tip to:	kib
MFC after:	1 week
2016-09-30 11:05:29 +00:00
Dag-Erling Smørgrav
ef14f6a19e Format the table correctly, using cell separators instead of relying
on *roff or mandoc to guess where one cell ends and the next begins.

MFC after:	1 week
2016-09-30 09:23:29 +00:00
Konstantin Belousov
5925fff002 Editing fixes for r306257, documentation for trapcap.
Suggested by:	wblock
Discussed with:	jilles
Reviewed by:	cem (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D8023
2016-09-27 11:31:53 +00:00
Konstantin Belousov
fd6c95c09f Document thr_suspend(2) and thr_wake(2).
Reviewed by:	bjk, jilles
Discussed with:	emaste, wblock
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D8016
2016-09-26 08:18:34 +00:00
Konstantin Belousov
23670cf40a Document r306081, i.e. procctl(PROC_TRAPCAP) and sysctl kern.trap_enocap.
Reviewed by:	cem
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D8003
2016-09-23 09:26:40 +00:00