1694 Commits

Author SHA1 Message Date
markj
e06021a945 Launder VPO_NOSYNC pages upon vnode deactivation.
As of r234483, vnode deactivation causes non-VPO_NOSYNC pages to be
laundered. This behaviour has two problems:

1. Dirty VPO_NOSYNC pages must be laundered before the vnode can be
   reclaimed, and this work may be unfairly deferred to the vnlru process
   or an unrelated application when the system is under vnode pressure.
2. Deactivation of a vnode with dirty VPO_NOSYNC pages requires a scan of
   the corresponding VM object's memq for non-VPO_NOSYNC dirty pages; if
   the laundry thread needs to launder pages from an unreferenced such
   vnode, it will reactivate and deactivate the vnode with each laundering,
   potentially resulting in a large number of expensive scans.

Therefore, ensure that all dirty pages are laundered upon deactivation,
i.e., when all maps of the vnode are removed and all references are
released.

Reviewed by:	alc, kib
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D8641
2016-11-26 21:00:27 +00:00
jilles
56c196dc25 open(2): Clarify non-POSIX error when opening a symlink with O_NOFOLLOW.
We return [EMLINK] instead of [ELOOP] when trying to open a symlink with
O_NOFOLLOW, so that the original case of [ELOOP] can be distinguished. Code
like cmp -h and xz takes advantage of this.

PR:		214633
Reviewed by:	kib, imp
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D8586
2016-11-22 22:30:55 +00:00
glebius
b739d60344 Add flag SF_USER_READAHEAD to sendfile(2). When specified, the syscall won't
do any speculations about readahead, and use exactly the amount of readahead
specified by user.  E.g. setting SF_FLAGS(0, SF_USER_READAHEAD) will guarantee
that no readahead at all will be performed.
2016-11-17 21:36:18 +00:00
trasz
a92655722a Document that getfsstat(2) called with MNT_NOWAIT skips file systems
that are in the process of being unmounted.

Reviewed by:	des@ (earlier version)
MFC after:	1 month
2016-11-06 19:37:22 +00:00
jhb
09b070758a Use 'cmd' rather than 'command' to match the function prototype. 2016-10-17 22:36:37 +00:00
bdrewery
7f8c084baf Improve grammar.
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2016-10-06 17:35:50 +00:00
cem
003df6649c open.2: Document Capsicum behavior
Document open(2) and openat(2) behavior in Capsicum capability mode.

Reviewed by:	ed (previous version), emaste, rwatson (previous version),
		wblock
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D7947
2016-09-30 23:01:37 +00:00
kib
ba7c9b12dc Reword the statement.
Submitted by:	wblock
MFC after:	3 days
2016-09-30 16:02:25 +00:00
kib
c67a6b88f9 Add an article.
Submitted by:	wblock
MFC after:	3 days
2016-09-30 15:47:13 +00:00
des
feff8c35f0 Reinstate Xr macros that were accidentally removed in a previous
commit.  Add some missing cross-references to the SEE ALSO section.
Bump date now that there are content changes.

MFC after:	1 week
2016-09-30 13:05:32 +00:00
des
6b103b8437 Minor markup and wording fixes.
MFC after:	1 week
2016-09-30 13:04:18 +00:00
des
407f7867e3 After perusal of the documentation and some experimentation, I found a
version that works with both groff and mandoc.

Hat tip to:	kib
MFC after:	1 week
2016-09-30 11:05:29 +00:00
des
e60f4ea968 Format the table correctly, using cell separators instead of relying
on *roff or mandoc to guess where one cell ends and the next begins.

MFC after:	1 week
2016-09-30 09:23:29 +00:00
kib
b0a0f2de47 Editing fixes for r306257, documentation for trapcap.
Suggested by:	wblock
Discussed with:	jilles
Reviewed by:	cem (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D8023
2016-09-27 11:31:53 +00:00
kib
e4920be6f2 Document thr_suspend(2) and thr_wake(2).
Reviewed by:	bjk, jilles
Discussed with:	emaste, wblock
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D8016
2016-09-26 08:18:34 +00:00
kib
131da443e7 Document r306081, i.e. procctl(PROC_TRAPCAP) and sysctl kern.trap_enocap.
Reviewed by:	cem
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D8003
2016-09-23 09:26:40 +00:00
cem
589ec3ab2d posix_openpt.2: Sort includes per style(9)
Sponsored by:	Dell EMC Isilon
2016-09-21 17:51:27 +00:00
badger
ec84c9826b Add manpage for rctl_* system calls
Reviewed by:	trasz, wblock
Approved by:	kib (mentor)
MFC after:	3 days
Sponsored by:	Dell Technologies
Differential Revision:	https://reviews.freebsd.org/D7877
2016-09-19 02:25:30 +00:00
emaste
e52a38003b cap_enter.2: describe flag returned by cap_getmode
Previously the flag returned by cap_getmode was not described explicitly
in the man page.

Reviewed by:	wblock
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D7822
2016-09-11 01:11:47 +00:00
brooks
5072762802 Fix spelling in comment.
Submitted by:	brueffer
2016-09-09 16:18:44 +00:00
brooks
dc591d96b2 Reduce duplicate NOASM and PSEUDO definitions
The initial value of NOASM is nearly the same in all cases and the
initial value of PSEUDO is the same in all cases so reduce duplication
(and hopefully, future merge conflicts) by machine independent defaults.

Also document the PSEUDO variable.

Reviewed by:	jhb, kib
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D7820
2016-09-08 22:38:20 +00:00
jilles
34c9a9c899 intro(2),_exit(2): Update for reaper (procctl(PROC_REAP_ACQUIRE)).
MFC after:	1 week
2016-09-08 21:50:03 +00:00
kib
7c7eb882c0 Typesetting fixes.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2016-08-29 19:53:13 +00:00
kib
baf2e0796c Restore the requirement of setting errno to zero before calling
ptrace(2).  Describe the behaviour of automatically zeroing errno as
historical feature.

Requested by:	ache, jhb
Reviewed by:	ache, bjk
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-29 19:33:07 +00:00
kib
fffac29068 Rewrite ptrace(2) wrappers in C.
Besides removing hand-translation to assembler, this also adds missing
wrappers for arm64 and risc-v.

Reviewed by:	emaste, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D7694
2016-08-29 18:47:51 +00:00
kib
11a326dc31 Do not obliterate errno value in the main thread during ptrace(2) call on x86.
Since ptrace(2) syscall can return -1 for non-error situations, libc
wrappers set errno to 0 before performing the syscall, as the service
to the caller.  On both i386 and amd64, the errno symbol was directly
referenced, which only works correctly in single-threaded process.

Change assembler wrappers for ptrace(2) to get current thread errno
location by calling __error().  Allow __error interposing, as
currently allowed in cerror().

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-27 23:03:23 +00:00
jhb
66478f22b3 Fix various nits in the aio operation manpages.
- Avoid double use of "request" in a single sentence.  Instead, describe
  aio_sigevent as being used to request notification of the associated
  operation's completion.  This matches the language used to describe
  aio_sigevent in aio(4).
- Simplify the prohibition on modifying buffers while requests are in
  flight.
- Fix case mismatch.
- Drop note about not using stack variables. C programmers should be able
  to figure out if a stack variable is safe based on the later warning
  about the life cycle requirements of control blocks.
- Remove prohibition on modifying the I/O buffer for aio_fsync() since
  it does not use an I/O buffer.  For aio_mlock(), prohibit modifications
  to the mapping (e.g. due to mprotect, munmap, mmap, etc.) but do not
  prohibit modifications to the memory backing the buffer (stores into
  the pages backing the buffer).

Requested by:	wblock (1,2), kib (4)
Reviewed by:	kib, rpokala, wblock
MFC after:	3 days
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7462
2016-08-19 17:37:32 +00:00
kevlo
52cd198be4 Remove <sys/types.h> from the SYNOPSIS. 2016-08-18 06:39:09 +00:00
bdrewery
4eb119b8e7 Garbage collect _umtx_lock(2)/_umtx_unlock(2) references removed in r263318.
This has no real impact on the resulting libc.so file.

MFC after:	3 days
Sponsored by:	EMC / Isilon Storage Division
2016-08-17 10:20:05 +00:00
kib
21a721373f Add fdatasync(2) man page, combined with fsync(2).
Reviewed by:	emaste, rpokala, wblock
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D7522
2016-08-17 10:16:42 +00:00
kib
e56264ca17 Implement userspace gettimeofday(2) with HPET timecounter.
Right now, userspace (fast) gettimeofday(2) on x86 only works for
RDTSC.  For older machines, like Core2, where RDTSC is not C2/C3
invariant, and which fall to HPET hardware, this means that the call
has both the penalty of the syscall and of the uncached hw behind the
QPI or PCIe connection to the sought bridge.  Nothing can me done
against the access latency, but the syscall overhead can be removed.
System already provides mappable /dev/hpetX devices, which gives
straight access to the HPET registers page.

Add yet another algorithm to the x86 'vdso' timehands. Libc is updated
to handle both RDTSC and HPET.  For HPET, the index of the hpet device
to mmap is passed from kernel to userspace, index might be changed and
libc invalidates its mapping as needed.

Remove cpu_fill_vdso_timehands() KPI, instead require that
timecounters which can be used from userspace, to provide
tc_fill_vdso_timehands{,32}() methods.  Merge i386 and amd64
libc/<arch>/sys/__vdso_gettc.c into one source file in the new
libc/x86/sys location.  __vdso_gettc() internal interface is changed
to move timecounter algorithm detection into the MD code.

Measurements show that RDTSC even with the syscall overhead is faster
than userspace HPET access.  But still, userspace HPET is three-four
times faster than syscall HPET on several Core2 and SandyBridge
machines.

Tested by:	Howard Su <howard0su@gmail.com>
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
Differential revision:	https://reviews.freebsd.org/D7473
2016-08-17 09:52:09 +00:00
kib
e04600d300 The fdatasync(2) call must be cancellation point.
Sponsored by:	The FreeBSD Foundation
MFC after:	13 days
2016-08-16 08:27:03 +00:00
kib
aa6b4fc56a Add an implementation of fdatasync(2).
The syscall is a trivial wrapper around new VOP_FDATASYNC(), sharing
code with fsync(2).  For all filesystems, this commit provides the
implementation which delegates the work of VOP_FDATASYNC() to
VOP_FSYNC().  This is functionally correct but not efficient.

This is not yet POSIX-compliant implementation, because it does not
ensure that queued AIO requests are completed before returning.

Reviewed by:	mckusick
Discussed with:	avg (ZFS), jhb (AIO part)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D7471
2016-08-15 19:08:51 +00:00
jhb
17008900fa Remove obsolete manpage that is not currently installed. 2016-08-09 22:10:40 +00:00
ed
f14bc84311 mprotect(): Change prototype to comply to POSIX.
Our mprotect() function seems to take a "const void *" address to the
pages whose permissions need to be adjusted. POSIX uses "void *". Simply
stick to the POSIX one to prevent us from writing unportable code.

PR:		211423 (exp-run)
Tested by:	antoine@ (Thanks!)
2016-08-03 06:33:04 +00:00
jhb
21e4e9482d Note that not all optional ptrace events use SIGTRAP.
New child processes attached due to PTRACE_FORK use SIGSTOP instead of
SIGTRAP.  All other ptrace events use SIGTRAP.
2016-07-28 20:51:29 +00:00
ed
4eb594a8c4 Change the return type of msgrcv() to ssize_t as required by POSIX.
It looks like the msgrcv() system call is already written in such a way
that the size is internally computed as a size_t and written into all of
td_retval[0]. This means that it is effectively already returning
ssize_t. It's just that the userspace prototype doesn't match up.
2016-07-28 12:22:01 +00:00
jhb
6db41da768 Add more documentation regarding unsafe AIO requests.
The asynchronous I/O changes made previously result in different
behavior out of the box. Previously all AIO requests failed with
ENOSYS / SIGSYS unless aio.ko was explicitly loaded. Now, some AIO
requests complete and others ("unsafe" requests) fail with EOPNOTSUPP.

Reword the introductory paragraph in aio(4) to add a general
description of AIO before describing the vfs.aio.enable_unsafe sysctl.

Remove the ENOSYS error description from aio_fsync(2), aio_read(2),
and aio_write(2) and replace it with a description of EOPNOTSUPP.

Remove the ENOSYS error description from aio_mlock(2).

Log a message to the system log the first time a process requests an
"unsafe" AIO request that fails with EOPNOTSUPP. This is modeled on
the log message used for processes using the legacy pty devices.

Reviewed by:	kib (earlier version)
MFC after:	1 week
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7151
2016-07-21 22:49:47 +00:00
zeising
5ee345da81 Change wording to use function rather than system call in the description
as well.

Reviewed by:	brooks
MFC after:	5 days
2016-07-20 18:16:58 +00:00
brooks
824bffcfe6 Update to reflect the fact that pipe() is a wrapper around the pipe2()
system call.

Reviewed by:	jhb, wblock
MFC after:	5 days
Sponsored by:	DAPRA, AFRL
Differential Revision:	https://reviews.freebsd.org/D6948
2016-07-20 18:02:07 +00:00
jhb
f39e6951ab Add PTRACE_VFORK to trace vfork events.
First, PL_FLAG_FORKED events now also set a PL_FLAG_VFORKED flag when
the new child was created via vfork() rather than fork().  Second, a
new PL_FLAG_VFORK_DONE event can now be enabled via the PTRACE_VFORK
event mask.  This new stop is reported after the vfork parent resumes
due to the child calling exit or exec.  Debuggers can use this stop to
reinsert breakpoints in the vfork parent process before it resumes.

Reviewed by:	kib
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D7045
2016-07-18 14:53:55 +00:00
jhb
91d07047c4 Add a mask of optional ptrace() events.
ptrace() now stores a mask of optional events in p_ptevents.  Currently
this mask is a single integer, but it can be expanded into an array of
integers in the future.

Two new ptrace requests can be used to manipulate the event mask:
PT_GET_EVENT_MASK fetches the current event mask and PT_SET_EVENT_MASK
sets the current event mask.

The current set of events include:
- PTRACE_EXEC: trace calls to execve().
- PTRACE_SCE: trace system call entries.
- PTRACE_SCX: trace syscam call exits.
- PTRACE_FORK: trace forks and auto-attach to new child processes.
- PTRACE_LWP: trace LWP events.

The S_PT_SCX and S_PT_SCE events in the procfs p_stops flags have
been replaced by PTRACE_SCE and PTRACE_SCX.  PTRACE_FORK replaces
P_FOLLOW_FORK and PTRACE_LWP replaces P2_LWP_EVENTS.

The PT_FOLLOW_FORK and PT_LWP_EVENTS ptrace requests remain for
compatibility but now simply toggle corresponding flags in the
event mask.

While here, document that PT_SYSCALL, PT_TO_SCE, and PT_TO_SCX both
modify the event mask and continue the traced process.

Reviewed by:	kib
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D7044
2016-07-15 15:32:09 +00:00
jhb
35927cc862 Add documentation for the sigevent structure.
- Add a sigevent(3) manpage to give a general overview of the sigevent
  structure and the available notification mechanisms.
- Document that AIO requests contain a nested sigevent structure that can
  be used to request completion notification.
- Expand the sigevent details in other manuals to note details such as
  the extra values stored in a queued signal's information or in a posted
  kevent.

Reviewed by:	kib
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D7122
2016-07-15 15:12:56 +00:00
trasz
6062dd4af2 Add some .Xrs to getloginclass(2).
MFC after:	1 month
2016-07-12 06:00:57 +00:00
jilles
1b626dc6a2 fcntl(2): Document interrupt/restart for file locks.
Since r302216, thread suspension causes advisory file locks to restart
(instead of continuing to wait) and for a long time SA_RESTART has
affected advisory file locks. These are both not compliant to POSIX.1.

To clarify that restarting means something, add a paragraph about fair
queuing. Note that the network lock manager does not implement fair
queuing.

Reviewed by:	kib (previous version)
Approved by:	re (gjb)
2016-07-07 21:44:59 +00:00
brooks
ecead64b41 Replace use of the pipe(2) system call with pipe2(2) with a zero flags
value.

This eliminates the need for machine dependant assembly wrappers for
pipe(2).

It also make passing an invalid address to pipe(2) return EFAULT rather
than triggering a segfault.  Document this behavior (which was already
true for pipe2(2), but undocumented).

Reviewed by:	andrew
Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D6815
2016-06-22 21:11:27 +00:00
jilles
563b27479d utimes(2),utime(3): Add deprecation in favour of utimensat(2) and futimens(2).
Setting time by seconds or microseconds may cause unexpected effects
especially if sysctl vfs.timestamp_precision=3 (not default).

Calling the obsolete functions with NULL timestamps is acceptable.
2016-06-09 22:14:58 +00:00
oshogbo
17ff1e5b26 Introduce the PD_CLOEXEC for pdfork(2).
Reviewed by:	mjg
2016-06-08 02:09:14 +00:00
kib
fb4ba6efe3 Fix markup.
Sponsored by:	The FreeBSD Foundation
2016-06-04 20:20:14 +00:00
vangyzen
c46e1b80b9 Improve errno documentation in pthread_create(3) and thr_new(2)
Add some missing errno values to thr_new(2) and pthread_create(3).
In particular, EDEADLK was not documented in the latter.
While I'm here, improve some English and cross-references.

Reviewed by:	kib
Sponsored by:	Dell Inc.
Differential Revision:	https://reviews.freebsd.org/D6663
2016-06-03 14:30:32 +00:00