Commit Graph

1701 Commits

Author SHA1 Message Date
Edward Tomasz Napierala
6c316535e2 Remove useless comment.
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2015-02-07 13:11:45 +00:00
Jilles Tjoelker
2205e0d1bd Add futimens and utimensat system calls.
The core kernel part is patch file utimes.2008.4.diff from
pluknet@FreeBSD.org. I updated the code for API changes, added the manual
page and added compatibility code for old kernels. There is also audit and
Capsicum support.

A new UTIME_* constant might allow setting birthtimes in future.

Differential Revision:	https://reviews.freebsd.org/D1426
Submitted by:	pluknet (partially)
Reviewed by:	delphij, pluknet, rwatson
Relnotes:	yes
2015-01-23 21:07:08 +00:00
Konstantin Belousov
677258f7e7 Add procctl(2) PROC_TRACE_CTL command to enable or disable debugger
attachment to the process.  Note that the command is not intended to
be a security measure, rather it is an obfuscation feature,
implemented for parity with other operating systems.

Discussed with:	jilles, rwatson
Man page fixes by:	rwatson
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-01-18 15:13:11 +00:00
Konstantin Belousov
397d851d66 Reduce the size of the interposing table and amount of
cancellation-handling code in the libthr.  Translate some syscalls
into their more generic counterpart, and remove translated syscalls
from the table.

List of the affected syscalls:
creat, open -> openat
raise -> thr_kill
sleep, usleep -> nanosleep
pause -> sigsuspend
wait, wait3, waitpid -> wait4

Suggested and reviewed by:	jilles (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-01-11 22:16:31 +00:00
John Baldwin
e275993995 Document CPU_WHICH_DOMAIN and bump Dd for cpuset.1.
Missed in:	r276829
2015-01-08 18:53:11 +00:00
Konstantin Belousov
1a744fefc2 Avoid calling internal libc function through PLT or accessing data
though GOT, by staticizing and hiding.  Add setter for
__error_selector to hide it as well.

Suggested and reviewed by:	jilles
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-01-05 01:06:54 +00:00
Konstantin Belousov
8495e8b1e9 Fix known issues which blow up the process after dlopen("libthr.so")
(or loading a dso linked to libthr.so into process which was not
linked against threading library).

- Remove libthr interposers of the libc functions, including
  __error(). Instead, functions calls are indirected through the
  interposing table, similar to how pthread stubs in libc are already
  done.  Libc by default points either to syscall trampolines or to
  existing libc implementations.  On libthr load, libthr rewrites the
  pointers to the cancellable implementations already in libthr.  The
  interposition table is separate from pthreads stubs indirection
  table to not pull pthreads stubs into static binaries.

- Postpone the malloc(3) internal mutexes initialization until libthr
  is loaded.  This avoids recursion between calloc(3) and static
  pthread_mutex_t initialization.

- Reinstall signal handlers with wrapper on libthr load.  The
  _rtld_is_dlopened(3) is used to avoid useless calls to sigaction(2)
  when libthr is statically referenced from the main binary.

In the process, fix openat(2), swapcontext(2) and setcontext(2)
interposing.  The libc symbols were exported at different versions
than libthr interposers.  Export both libc and libthr versions from
libc now, with default set to the higher version from libthr.

Remove unused and disconnected swapcontext(3) userspace implementation
from libc/gen.

No objections from:	deischen
Tested by:	pho, antoine (exp-run) (previous versions)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-01-03 18:38:46 +00:00
Christian Brueffer
0aee91e1fb Various mdoc fixes and a few EOL whitespace removals.
Found with:	mandoc -Tlint
2014-12-21 12:36:36 +00:00
Bryan Drewery
4bb90cbe18 Bump Dd for r275846
MFC after:	3 weeks
2014-12-17 01:36:00 +00:00
Kirk McKusick
27ae6f4af7 Add some additional clarification and fix a few gammer nits.
Reviewed by: kib
MFC after:   3 weeks
2014-12-17 01:32:27 +00:00
Konstantin Belousov
19eaed5353 Markup fixes for kqueue(2), no content changes.
Reviewed by:	brueffer (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2014-12-15 14:58:10 +00:00
Konstantin Belousov
237623b028 Add a facility for non-init process to declare itself the reaper of
the orphaned descendants.  Base of the API is modelled after the same
feature from the DragonFlyBSD.

Requested by:	bapt
Reviewed by:	jilles (previous version)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
2014-12-15 12:01:42 +00:00
Ed Maste
294246bb7d Revert r274772: it is not valid on MIPS
Reported by:	sbruno
2014-11-25 03:50:31 +00:00
Baptiste Daroussin
2c900f1907 Ta is only allowed with Bl -column not in Bl -item 2014-11-23 23:35:16 +00:00
Joel Dahl
d4d112e34a Misc mdoc fixes:
- Remove superfluous paragraph macros.
- Remove/fix empty or incorrect macros.
- Sort sections into conventional order.
- Terminate quoted strings properly.
- Remove EOL whitespace.
2014-11-23 21:00:00 +00:00
Ed Maste
688fd61ae8 Use canonical __PIC__ flag
It is automatically set when -fPIC is passed to the compiler.

Reviewed by:	dim, kib
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D1179
2014-11-21 02:05:48 +00:00
Dmitry Chagin
186d9c3473 Add the ppoll() system call.
Export kern_poll() needed by an upcoming Linuxulator change.

Differential Revision:	https://reviews.freebsd.org/D1133
Reviewed by:	kib, wblock
MFC after:	1 month
2014-11-13 05:26:14 +00:00
Dag-Erling Smørgrav
4b52e0d84f <sys/param.h> is a superset of <sys/types.h> and must always come
first.  Coincidentally, today is the 11th anniversary of this man
page's (and this bug's) first appearance in FreeBSD.

MFC after:	3 days
2014-11-01 09:10:21 +00:00
Gavin Atkinson
ff8d5270bc Slightly improve grammar in EAGAIN description.
PR:		176806
Submitted by:	Jeremy Chadwick
MFC after:	3 days
2014-10-15 23:39:47 +00:00
Xin LI
b888b86e6f accept(2) may and can return EAGAIN, document it.
MFC after:	1 week
2014-10-10 03:05:55 +00:00
Bryan Drewery
c1efb88730 Document [EPERM] for UNIX sockets.
MFC after:	2 weeks
2014-09-30 00:06:53 +00:00
John Baldwin
7a9f047ba7 - Remove mention of MAP_INHERIT. It hasn't been implemented for thirteen
years.
- Remove mention of unimplemented MAP_SWAP.  There are no future plans to
  implement it.

Submitted by:	alc (2)
2014-09-17 19:45:34 +00:00
Enji Cooper
2bcdab32c3 Bump .Dd for the content change done to access(2) in r271655
PR: 181155
Sponsored by: EMC / Isilon Storage Division
2014-09-16 00:59:08 +00:00
Enji Cooper
257597a434 Validate the mode argument in access, eaccess, and faccessat for optional
POSIX compliance and to improve compatibility with Linux and NetBSD

The issue was identified with lib/libc/sys/t_access:access_inval from
NetBSD

Update the manpage accordingly

PR: 181155
Reviewed by: jilles (code), jmmv (code), wblock (manpage), wollman (code)
MFC after: 4 weeks
Phabric: D678 (code), D786 (manpage)
Sponsored by: EMC / Isilon Storage Division
2014-09-16 00:56:47 +00:00
John-Mark Gurney
e2cc4003e2 document mqueuefs is required for mq_open... 2014-09-15 22:32:35 +00:00
John Baldwin
5fd3f8b3b6 Add stricter checking of some mmap() arguments:
- Fail with EINVAL if an invalid protection mask is passed to mmap().
- Fail with EINVAL if an unknown flag is passed to mmap().
- Fail with EINVAL if both MAP_PRIVATE and MAP_SHARED are passed to mmap().
- Require one of either MAP_PRIVATE or MAP_SHARED for non-anonymous
  mappings.

Reviewed by:	alc, kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D698
2014-09-15 17:20:13 +00:00
Joel Dahl
cfaa96f327 Minor mdoc nit. 2014-09-09 14:34:54 +00:00
Baptiste Daroussin
42e62eca52 Extend kqueue's EVFILT_TIMER by adding precision unit flags support
Define the precision macros as bits sets to conform with XNU equivalent.
Test fflags passed for EVFILT_TIMER and return EINVAL in case an invalid flag
is passed.

Phabric:	https://phabric.freebsd.org/D421
Reviewed by:	kib
2014-07-18 14:27:04 +00:00
Kevin Lo
92511d108b Document that listen(2) can fail with EDESTADDRREQ. 2014-07-15 02:21:51 +00:00
Mark Johnston
d3fe75eb62 Fix a typo.
MFC after:	3 days
2014-07-09 01:33:35 +00:00
Konstantin Belousov
c22de76166 Note that most errors are possible for all syscalls from utimes(2)
family.  Minor wording corrections.

Based on the suggestions by bde.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-07-03 11:19:16 +00:00
Sergey Kandaurov
521aa90cb8 Document EINVAL as per POSIX.
This also follows r124335-r124336, r225827.

PR:		191382
MFC after:	1 week
Sponsored by:	Nginx, Inc.
2014-06-26 10:21:00 +00:00
Garrett Wollman
775a76844f Catch up with many years of changes:
o Document PF_LOCAL as being an alias for PF_UNIX
o Document POSIX standardization of this interface using AF_*
  constants rather than PF_* constants, and note the three particular
  families which POSIX standardizes.
o Note anticipated POSIX standardization of SOCK_CLOEXEC.
o Delete from listing protocol families that FreeBSD doesn't support
  (in some cases, like PF_PUP, has never supported).
o Add to listing some current protocol families that have been
  introduced in the last decade or so.
o Document the correspondence of PF_* and AF_* constants.

We should probably change the documentation to make the AF_* constants
primary, but this commit does not do so.

Reviewed by:	kevlo@
MFC after:	1 month
2014-06-24 20:23:18 +00:00
Joel Dahl
df2d82e003 mdoc: remove superfluous paragraph macros. 2014-06-23 18:40:21 +00:00
Baptiste Daroussin
8fbf3d50e3 use .Mt to mark up email addresses consistently (part4)
PR:		191174
Submitted by:	Franco Fichtner  <franco at lastsummer.de>
2014-06-23 08:25:03 +00:00
Konstantin Belousov
11c42bcc54 Add MAP_EXCL flag for mmap(2). It should be combined with MAP_FIXED,
and prevents the request from deleting existing mappings in the
region, failing instead.

Reviewed by:	alc
Discussed with:	jhb
Tested by:	markj, pho (previous version, as part of the bigger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-06-19 05:00:39 +00:00
Konstantin Belousov
8efb419829 The time come to remove the wrapper, most likely, but tidy up it code
instead for now.  Remove spurious blank line, use C89 definition, wrap
long line.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-06-19 04:55:00 +00:00
Benjamin Kaduk
245d93279b Minor mdoc fix
Submitted by:	hrs
Approved by:	hrs (mentor, implicit)
2014-05-30 02:16:28 +00:00
Benjamin Kaduk
6953d7db5c Correct documentation of the limit on how much memory can be mlock()ed
vm.max_wired is a system-wide limit, not per-process.  Reword the
section to make this more clear.

PR:		docs/189214
Submitted by:	Lawrence Chen (original text)
Approved by:	hrs (mentor)
2014-05-17 03:05:52 +00:00
Peter Holm
e103f5b1c0 msync(2) must return ENOMEM and not EINVAL when the address is outside the
allowed range or when one or more pages are not mapped. This according to
The Open Group Base Specifications Issue 7.

Discussed with:	 attilio, Bruce Evans
Reviewed by:	 alc, Garrett Cooper
Reported by:	 ATF
MFC after:	 2 weeks
Sponsored by:	EMC / Isilon storage division
2014-05-07 08:38:02 +00:00
Ed Schouten
f56cfe8d61 Fix table alignment. EVFILT_PROCDESC is longer than the existing filters. 2014-04-07 18:17:31 +00:00
Ed Schouten
38219d6acd Implement kqueue(2) for procdesc(4).
kqueue(2) already supports EVFILT_PROC. Add an EVFILT_PROCDESC that
behaves the same, but operates on a procdesc(4) instead. Only implement
NOTE_EXIT for now. The nice thing about NOTE_EXIT is that it also
returns the exit status of the process, meaning that we can now obtain
this value, even if pdwait4(2) is still unimplemented.

Notes:

- Simply reuse EVFILT_NETDEV for EVFILT_PROCDESC. As both of these will
  be used on totally different descriptor types, this should not clash.

- Let procdesc_kqops_event() reuse the same structure as filt_proc().
  The only difference is that procdesc_kqops_event() should also be able
  to deal with the case where the process was already terminated after
  registration. Simply test this when hint == 0.

- Fix some style(9) issues in filt_proc() to keep it consistent with the
  newly added procdesc_kqops_event().

- Save the exit status of the process in pd->pd_xstat, as we cannot pick
  up the proctree_lock from within procdesc_kqops_event().

Discussed on:	arch@
Reviewed by:	kib@
2014-04-07 18:10:49 +00:00
Warner Losh
a5fc5b6223 Convert from WITHOUT_SYSCALL_COMPAT to MK_SYSCALL_COMPAT. 2014-04-05 17:54:43 +00:00
Ed Schouten
2ad6bba714 Correct return type of pdfork(2).
The pdfork(2) man page states:

	"pdfork() returns a PID, 0 or -1, as fork(2) does."

As it returns a PID, the return type should obviously be pid_t. As int
and pid_t have the same size on all architectures, this change does not
affect the ABI in any way.
2014-04-04 19:53:45 +00:00
Eitan Adler
45ebf5d172 Use the correct variable name in the example code. 2014-03-30 04:40:41 +00:00
Robert Watson
cf321a51b1 Update system man pages for s/capability.h/capsicum.h/.
MFC after:	3 weeks
2014-03-27 21:43:00 +00:00
Marcel Moolenaar
8876613dc5 Replace use of ${.CURDIR} by ${LIBC_SRCTOP} and define ${LIBC_SRCTOP}
if not already defined. This allows building libc from outside of
lib/libc using a reach-over makefile.

A typical use-case is to build a standard ILP32 version and a COMPAT32
version in a single iteration by building the COMPAT32 version using a
reach-over makefile.

Obtained from:	Juniper Networks, Inc.
2014-03-04 02:19:39 +00:00
Benjamin Kaduk
af1e239814 syncer(4) is a kernel process, not a user process
Noticed by: Geoffrey Thomas <gthomas@mokafive.com>
Approved by:	hrs (mentor)
2014-02-27 04:06:34 +00:00
Christian Brueffer
9cba0f9670 Match the correct variable to the variable description.
PR:		121173
Submitted by:	Thomas Mueller <tmueller at sysgo.com>
MFC after:	1 week
2014-02-21 13:53:41 +00:00
John-Mark Gurney
ad6a53db5f document _JAIL as a possible option to set a cpuset for a jail..
MFC after:	3 days
2014-02-15 07:01:45 +00:00
Christian Brueffer
a578215eed Fix a typo.
MFC after:	1 week
2014-02-03 22:16:46 +00:00
Konstantin Belousov
49d39308ba The posix_madvise(3) and posix_fadvise(2) should return error on
failure, same as posix_fallocate(2).

Noted by:	Bob Bishop <rb@gid.co.uk>
Discussed with:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-01-30 18:04:39 +00:00
Ulrich Spörlein
d7d8b00bec mdoc: fix several uses of the Fx macro to point to actual releases.
Found by:  make manlint
2014-01-28 21:40:10 +00:00
Konstantin Belousov
2852de0489 The posix_fallocate(2) syscall should return error number on error,
without modifying errno.

Reported and tested by:	Gennady Proskurin <gpr@mail.ru>
Reviewed by:	mdf
PR:	standards/186028
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-01-23 17:24:26 +00:00
Sergey Kandaurov
dccae053f7 Update EINVAL description.
This matches current POSIX standards and actual FreeBSD behavior.

MFC after:	1 week
2014-01-23 09:37:03 +00:00
Jilles Tjoelker
b83686c8fe Add some missing .Nm for newer syscalls in existing man pages.
MFC after:	1 week
2014-01-11 22:00:16 +00:00
Sergey Kandaurov
d3178d7d27 - Fix EBADF description, in following the future POSIX tc and what FreeBSD
actually implements.
- Improve grammar: use more preferred "can", not "could".

Submitted by:	jilles
2013-12-27 16:57:38 +00:00
Sergey Kandaurov
4ca1cd1d63 Fix an apparent typo.
MFC after:	3 days
2013-12-26 19:18:43 +00:00
Sergey Kandaurov
e44aa9fde0 Provide the manual page for aio_fsync(2).
Reviewed by:	davidxu
MFC after:	1 week
2013-12-26 19:16:30 +00:00
Sergey Kandaurov
5acf8f8325 The compile time constant limit on number of swap devices was removed in 5.2.
As such, remove the EINVAL error saying so.  Currently the vm.nswapdev sysctl
just represents the number of added swap devices.

MFC after:	1 week
2013-12-25 16:01:29 +00:00
Ruslan Ermilov
0f987f1f08 shm_open(2): Fixed the history information.
While here, sort xrefs.

Reviewed by:	jhb
2013-12-18 12:18:17 +00:00
Joel Dahl
2727e97436 mdoc: remove EOL whitespace. 2013-12-06 21:22:33 +00:00
John Baldwin
d4e3c0a2d7 Various updates and tweaks to the wait(2) manpage.
PR:		docs/183904
Submitted by:	Michael Galassi <michaelgalassi@gmail.com>
Reviewed by:	kib, wblock (earlier version)
2013-12-03 21:00:13 +00:00
Jilles Tjoelker
b865f8ef40 chmod(2): Document S_ISVTX following SUSv3/SUSv4.
S_ISTXT is non-standard.

While here, also update fchmodat() standards entry to POSIX.1-2008.
2013-12-01 12:24:57 +00:00
Jilles Tjoelker
09466daf8c waitid(2): Do not tell userland programmers to include <sys/signal.h>.
Userland should get these definitions by including <signal.h>.
2013-12-01 11:59:37 +00:00
Pawel Jakub Dawidek
f2b525e6b9 Make process descriptors standard part of the kernel. rwhod(8) already
requires process descriptors to work and having PROCDESC in GENERIC
seems not enough, especially that we hope to have more and more consumers
in the base.

MFC after:	3 days
2013-11-30 15:08:35 +00:00
Sergey Kandaurov
dc211b3d40 Fix extattr(2) MLINKS.
MFC after:	1 week
2013-11-09 00:36:09 +00:00
Pawel Jakub Dawidek
6f62d278e8 - Add manual pages for capability rights (rights(4)), cap_rights_init(3)
family of functions and cap_rights_get(3) function.
- Update remaining Capsicum-related manual pages.

Reviewed by:	bdrewery
MFC after:	3 days
2013-11-04 14:10:22 +00:00
Jilles Tjoelker
1947c8a6d1 kqueue: Change error for kqueues rlimit from EMFILE to ENOMEM and document
this error condition in the kqueue(2) manual page.

Discussed with:	kib
2013-11-03 23:06:24 +00:00
Konstantin Belousov
85a0ddfd0b Add a resource limit for the total number of kqueues available to the
user.  Kqueue now saves the ucred of the allocating thread, to
correctly decrement the counter on close.

Under some specific and not real-world use scenario for kqueue, it is
possible for the kqueues to consume memory proportional to the square
of the number of the filedescriptors available to the process.  Limit
allows administrator to prevent the abuse.

This is kernel-mode side of the change, with the user-mode enabling
commit following.

Reported and tested by:	pho
Discussed with:	jmg
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2013-10-21 16:46:12 +00:00
Jilles Tjoelker
0f49c96cfc accept(2): Update portability note for accept4().
The accept(2) man page warns that O_NONBLOCK and other properties on the
new socket may vary across implementations. However, this issue only
applies to accept() and not to accept4(). On the other hand, accept4()
is not commonly available yet.

Reported by:	pluknet
Reviewed by:	bjk
Approved by:	re (kib)
2013-10-01 21:17:18 +00:00
Joel Dahl
828378a6d3 Minor mdoc improvements.
Approved by:	re (blanket)
2013-09-19 19:43:38 +00:00
John Baldwin
55648840de Extend the support for exempting processes from being killed when swap is
exhausted.
- Add a new protect(1) command that can be used to set or revoke protection
  from arbitrary processes.  Similar to ktrace it can apply a change to all
  existing descendants of a process as well as future descendants.
- Add a new procctl(2) system call that provides a generic interface for
  control operations on processes (as opposed to the debugger-specific
  operations provided by ptrace(2)).  procctl(2) uses a combination of
  idtype_t and an id to identify the set of processes on which to operate
  similar to wait6().
- Add a PROC_SPROTECT control operation to manage the protection status
  of a set of processes.  MADV_PROTECT still works for backwards
  compatability.
- Add a p_flag2 to struct proc (and a corresponding ki_flag2 to kinfo_proc)
  the first bit of which is used to track if P_PROTECT should be inherited
  by new child processes.

Reviewed by:	kib, jilles (earlier version)
Approved by:	re (delphij)
MFC after:	1 month
2013-09-19 18:53:42 +00:00
Bryan Drewery
c36029e6dc Consistently reference file descriptors as "fd". 55 other manpages
used "fd", while these used "d" and "filedes".

MFC after:	1 week
Approved by:	gjb
Approved by:	re (delphij)
2013-09-12 00:53:38 +00:00
John Baldwin
edb572a38c Add a mmap flag (MAP_32BIT) on 64-bit platforms to request that a mapping use
an address in the first 2GB of the process's address space.  This flag should
have the same semantics as the same flag on Linux.

To facilitate this, add a new parameter to vm_map_find() that specifies an
optional maximum virtual address.  While here, fix several callers of
vm_map_find() to use a VMFS_* constant for the findspace argument instead of
TRUE and FALSE.

Reviewed by:	alc
Approved by:	re (kib)
2013-09-09 18:11:59 +00:00
Jilles Tjoelker
550ac4a8e8 wait(2): Add some possible caveats to standards section. 2013-09-07 11:41:52 +00:00
Jilles Tjoelker
75b1cda430 Update some signal man pages for multithreading. 2013-09-06 09:08:40 +00:00
Pawel Jakub Dawidek
7008be5bd7 Change the cap_rights_t type from uint64_t to a structure that we can extend
in the future in a backward compatible (API and ABI) way.

The cap_rights_t represents capability rights. We used to use one bit to
represent one right, but we are running out of spare bits. Currently the new
structure provides place for 114 rights (so 50 more than the previous
cap_rights_t), but it is possible to grow the structure to hold at least 285
rights, although we can make it even larger if 285 rights won't be enough.

The structure definition looks like this:

	struct cap_rights {
		uint64_t	cr_rights[CAP_RIGHTS_VERSION + 2];
	};

The initial CAP_RIGHTS_VERSION is 0.

The top two bits in the first element of the cr_rights[] array contain total
number of elements in the array - 2. This means if those two bits are equal to
0, we have 2 array elements.

The top two bits in all remaining array elements should be 0.
The next five bits in all array elements contain array index. Only one bit is
used and bit position in this five-bits range defines array index. This means
there can be at most five array elements in the future.

To define new right the CAPRIGHT() macro must be used. The macro takes two
arguments - an array index and a bit to set, eg.

	#define	CAP_PDKILL	CAPRIGHT(1, 0x0000000000000800ULL)

We still support aliases that combine few rights, but the rights have to belong
to the same array element, eg:

	#define	CAP_LOOKUP	CAPRIGHT(0, 0x0000000000000400ULL)
	#define	CAP_FCHMOD	CAPRIGHT(0, 0x0000000000002000ULL)

	#define	CAP_FCHMODAT	(CAP_FCHMOD | CAP_LOOKUP)

There is new API to manage the new cap_rights_t structure:

	cap_rights_t *cap_rights_init(cap_rights_t *rights, ...);
	void cap_rights_set(cap_rights_t *rights, ...);
	void cap_rights_clear(cap_rights_t *rights, ...);
	bool cap_rights_is_set(const cap_rights_t *rights, ...);

	bool cap_rights_is_valid(const cap_rights_t *rights);
	void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src);
	void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src);
	bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);

Capability rights to the cap_rights_init(), cap_rights_set(),
cap_rights_clear() and cap_rights_is_set() functions are provided by
separating them with commas, eg:

	cap_rights_t rights;

	cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);

There is no need to terminate the list of rights, as those functions are
actually macros that take care of the termination, eg:

	#define	cap_rights_set(rights, ...)				\
		__cap_rights_set((rights), __VA_ARGS__, 0ULL)
	void __cap_rights_set(cap_rights_t *rights, ...);

Thanks to using one bit as an array index we can assert in those functions that
there are no two rights belonging to different array elements provided
together. For example this is illegal and will be detected, because CAP_LOOKUP
belongs to element 0 and CAP_PDKILL to element 1:

	cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);

Providing several rights that belongs to the same array's element this way is
correct, but is not advised. It should only be used for aliases definition.

This commit also breaks compatibility with some existing Capsicum system calls,
but I see no other way to do that. This should be fine as Capsicum is still
experimental and this change is not going to 9.x.

Sponsored by:	The FreeBSD Foundation
2013-09-05 00:09:56 +00:00
Robert Watson
7b223d2286 Xref capsicum(4) and procdesc(4) from pdfork(2).
Suggested by:	sbruno
MFC after:	3 days
2013-08-28 20:00:25 +00:00
Joel Dahl
4d5c7c633b Remove EOL whitespace. 2013-08-22 16:02:20 +00:00
Kenneth D. Merry
7da1a731c6 Expand the use of stat(2) flags to allow storing some Windows/DOS
and CIFS file attributes as BSD stat(2) flags.

This work is intended to be compatible with ZFS, the Solaris CIFS
server's interaction with ZFS, somewhat compatible with MacOS X,
and of course compatible with Windows.

The Windows attributes that are implemented were chosen based on
the attributes that ZFS already supports.

The summary of the flags is as follows:

UF_SYSTEM:	Command line name: "system" or "usystem"
		ZFS name: XAT_SYSTEM, ZFS_SYSTEM
		Windows: FILE_ATTRIBUTE_SYSTEM

		This flag means that the file is used by the
		operating system.  FreeBSD does not enforce any
		special handling when this flag is set.

UF_SPARSE:	Command line name: "sparse" or "usparse"
		ZFS name: XAT_SPARSE, ZFS_SPARSE
		Windows: FILE_ATTRIBUTE_SPARSE_FILE

		This flag means that the file is sparse.  Although
		ZFS may modify this in some situations, there is
		not generally any special handling for this flag.

UF_OFFLINE:	Command line name: "offline" or "uoffline"
		ZFS name: XAT_OFFLINE, ZFS_OFFLINE
		Windows: FILE_ATTRIBUTE_OFFLINE

		This flag means that the file has been moved to
		offline storage.  FreeBSD does not have any special
		handling for this flag.

UF_REPARSE:	Command line name: "reparse" or "ureparse"
		ZFS name: XAT_REPARSE, ZFS_REPARSE
		Windows: FILE_ATTRIBUTE_REPARSE_POINT

		This flag means that the file is a Windows reparse
		point.  ZFS has special handling code for reparse
		points, but we don't currently have the other
		supporting infrastructure for them.

UF_HIDDEN:	Command line name: "hidden" or "uhidden"
		ZFS name: XAT_HIDDEN, ZFS_HIDDEN
		Windows: FILE_ATTRIBUTE_HIDDEN

		This flag means that the file may be excluded from
		a directory listing if the application honors it.
		FreeBSD has no special handling for this flag.

		The name and bit definition for UF_HIDDEN are
		identical to the definition in MacOS X.

UF_READONLY:	Command line name: "urdonly", "rdonly", "readonly"
		ZFS name: XAT_READONLY, ZFS_READONLY
		Windows: FILE_ATTRIBUTE_READONLY

		This flag means that the file may not written or
		appended, but its attributes may be changed.

		ZFS currently enforces this flag, but Illumos
		developers have discussed disabling enforcement.

		The behavior of this flag is different than MacOS X.
		MacOS X uses UF_IMMUTABLE to represent the DOS
		readonly permission, but that flag has a stronger
		meaning than the semantics of DOS readonly permissions.

UF_ARCHIVE:	Command line name: "uarch", "uarchive"
		ZFS_NAME: XAT_ARCHIVE, ZFS_ARCHIVE
		Windows name: FILE_ATTRIBUTE_ARCHIVE

		The UF_ARCHIVED flag means that the file has changed and
		needs to be archived.  The meaning is same as
		the Windows FILE_ATTRIBUTE_ARCHIVE attribute, and
		the ZFS XAT_ARCHIVE and ZFS_ARCHIVE attribute.

		msdosfs and ZFS have special handling for this flag.
		i.e. they will set it when the file changes.

sys/param.h:		Bump __FreeBSD_version to 1000047 for the
			addition of new stat(2) flags.

chflags.1:		Document the new command line flag names
			(e.g. "system", "hidden") available to the
			user.

ls.1:			Reference chflags(1) for a list of file flags
			and their meanings.

strtofflags.c:		Implement the mapping between the new
			command line flag names and new stat(2)
			flags.

chflags.2:		Document all of the new stat(2) flags, and
			explain the intended behavior in a little
			more detail.  Explain how they map to
			Windows file attributes.

			Different filesystems behave differently
			with respect to flags, so warn the
			application developer to take care when
			using them.

zfs_vnops.c:		Add support for getting and setting the
			UF_ARCHIVE, UF_READONLY, UF_SYSTEM, UF_HIDDEN,
			UF_REPARSE, UF_OFFLINE, and UF_SPARSE flags.

			All of these flags are implemented using
			attributes that ZFS already supports, so
			the on-disk format has not changed.

			ZFS currently doesn't allow setting the
			UF_REPARSE flag, and we don't really have
			the other infrastructure to support reparse
			points.

msdosfs_denode.c,
msdosfs_vnops.c:	Add support for getting and setting
			UF_HIDDEN, UF_SYSTEM and UF_READONLY
			in MSDOSFS.

			It supported SF_ARCHIVED, but this has been
			changed to be UF_ARCHIVE, which has the same
			semantics as the DOS archive attribute instead
			of inverse semantics like SF_ARCHIVED.

			After discussion with Bruce Evans, change
			several things in the msdosfs behavior:

			Use UF_READONLY to indicate whether a file
			is writeable instead of file permissions, but
			don't actually enforce it.

			Refuse to change attributes on the root
			directory, because it is special in FAT
			filesystems, but allow most other attribute
			changes on directories.

			Don't set the archive attribute on a directory
			when its modification time is updated.
			Windows and DOS don't set the archive attribute
			in that scenario, so we are now bug-for-bug
			compatible.

smbfs_node.c,
smbfs_vnops.c:		Add support for UF_HIDDEN, UF_SYSTEM,
			UF_READONLY and UF_ARCHIVE in SMBFS.

			This is similar to changes that Apple has
			made in their version of SMBFS (as of
			smb-583.8, posted on opensource.apple.com),
			but not quite the same.

			We map SMB_FA_READONLY to UF_READONLY,
			because UF_READONLY is intended to match
			the semantics of the DOS readonly flag.
			The MacOS X code maps both UF_IMMUTABLE
			and SF_IMMUTABLE to SMB_FA_READONLY, but
			the immutable flags have stronger meaning
			than the DOS readonly bit.

stat.h:			Add definitions for UF_SYSTEM, UF_SPARSE,
			UF_OFFLINE, UF_REPARSE, UF_ARCHIVE, UF_READONLY
			and UF_HIDDEN.

			The definition of UF_HIDDEN is the same as
			the MacOS X definition.

			Add commented-out definitions of
			UF_COMPRESSED and UF_TRACKED.  They are
			defined in MacOS X (as of 10.8.2), but we
			do not implement them (yet).

ufs_vnops.c:		Add support for getting and setting
			UF_ARCHIVE, UF_HIDDEN, UF_OFFLINE, UF_READONLY,
			UF_REPARSE, UF_SPARSE, and UF_SYSTEM in UFS.
			Alphabetize the flags that are supported.

			These new flags are only stored, UFS does
			not take any action if the flag is set.

Sponsored by:	Spectra Logic
Reviewed by:	bde (earlier version)
2013-08-21 23:04:48 +00:00
Pawel Jakub Dawidek
fe0670cfb3 Correct function name and return value. 2013-08-17 14:55:31 +00:00
John Baldwin
5aa60b6f21 Add new mmap(2) flags to permit applications to request specific virtual
address alignment of mappings.
- MAP_ALIGNED(n) requests a mapping aligned on a boundary of (1 << n).
  Requests for n >= number of bits in a pointer or less than the size of
  a page fail with EINVAL.  This matches the API provided by NetBSD.
- MAP_ALIGNED_SUPER is a special case of MAP_ALIGNED.  It can be used
  to optimize the chances of using large pages.  By default it will align
  the mapping on a large page boundary (the system is free to choose any
  large page size to align to that seems best for the mapping request).
  However, if the object being mapped is already using large pages, then
  it will align the virtual mapping to match the existing large pages in
  the object instead.
- Internally, VMFS_ALIGNED_SPACE is now renamed to VMFS_SUPER_SPACE, and
  VMFS_ALIGNED_SPACE(n) is repurposed for specifying a specific alignment.
  MAP_ALIGNED(n) maps to using VMFS_ALIGNED_SPACE(n), while
  MAP_ALIGNED_SUPER maps to VMFS_SUPER_SPACE.
- mmap() of a device object now uses VMFS_OPTIMAL_SPACE rather than
  explicitly using VMFS_SUPER_SPACE.  All device objects are forced to
  use a specific color on creation, so VMFS_OPTIMAL_SPACE is effectively
  equivalent.

Reviewed by:	alc
MFC after:	1 month
2013-08-16 21:13:55 +00:00
Jilles Tjoelker
fdafa7840f pselect(2): Add xref to sigsuspend(2). 2013-08-16 14:06:29 +00:00
Jilles Tjoelker
5219e2caba Add man page dup3(3). 2013-08-16 13:16:27 +00:00
Jilles Tjoelker
f57087b21c sigsuspend(2): Add xrefs to pselect(2) and sigwait-alikes. 2013-08-15 22:33:27 +00:00
John Baldwin
513bfc4fe2 Enhance the description of NOTE_TRACK:
- NOTE_TRACK has never triggered a NOTE_TRACK event from the parent pid.
  If NOTE_FORK is set, the listener will get a NOTE_FORK event from
  the parent pid, but not a separate NOTE_TRACK event.
- Explicitly note that the event added to monitor the child process
  preserves the fflags from the original event.
- Move the description of NOTE_TRACKERR under NOTE_TRACK as it is not a
  bit for the user to set (which is what this list pupports to be).
  Also, explicitly note that if an error occurs, the NOTE_CHILD event
  will not be generated.

MFC after:	1 week
2013-07-25 19:34:24 +00:00
Ed Maste
4e1d691281 Document EINVAL error return from PT_LWPINFO 2013-07-22 18:18:21 +00:00
Joel Dahl
2a82581d9b Minor mdoc fixes. 2013-06-09 07:15:43 +00:00
Jilles Tjoelker
172886a93e sigaction(2): Document various non-POSIX functions as async-signal safe. 2013-06-08 13:45:43 +00:00
Gleb Smirnoff
6160e12c10 Add new system call - aio_mlock(). The name speaks for itself. It allows
to perform the mlock(2) operation, which can consume a lot of time, under
control of aio(4).

Reviewed by:	kib, jilles
Sponsored by:	Nginx, Inc.
2013-06-08 13:27:57 +00:00
Jilles Tjoelker
4b08438c22 dup(2): Clarify return value, in particular of dup2(). 2013-05-31 22:09:31 +00:00
Jilles Tjoelker
4e3f0e45cf sigaction(2): *at system calls are async-signal safe. 2013-05-31 21:31:38 +00:00
Jilles Tjoelker
f8732c7fc3 sigaction(2): Extend description of async-signal safe functions:
* Improve description when unsafe functions are unsafe.
* Add various safe functions from POSIX.1-2008 and Austin Group issue #692.
2013-05-31 21:25:51 +00:00
Jilles Tjoelker
0bbe34c35d fork(2): Add information about fork() in multi-threaded processes.
There is nothing about pthread_atfork(3) or extensions like calling
malloc(3) in the child process as this may be unreliable or broken.
2013-05-31 20:46:08 +00:00
Jilles Tjoelker
45100a722a fork(2): #include <sys/types.h> is not needed. 2013-05-31 14:48:37 +00:00
Ed Maste
e2e9c35fa4 Remove the advertising clause from the Regents of the University of
California's license, per the letter dated July 22, 1999.
2013-05-28 21:05:06 +00:00
Jilles Tjoelker
24f3b0bcd0 cap_rights_limit(2): CAP_ACCEPT also permits accept4(2). 2013-05-27 21:37:19 +00:00
Jilles Tjoelker
0bbacb9c66 sigreturn(2): Remove ancient compatibility warning about 4.2BSD.
The HISTORY subsection still says that sigreturn() was added in 4.3BSD.
2013-05-25 13:59:40 +00:00
Julian Elischer
956e8eee53 Update the setfib man page to reflect recent changes. 2013-05-20 20:47:40 +00:00
Sergey Kandaurov
e0906c9a0d POSIX 1003.1-2008: add ENOTRECOVERABLE, EOWNERDEAD errnos. 2013-05-04 19:07:22 +00:00
Jilles Tjoelker
ed5987bd08 accept(2), pipe(2): Fix .Dd. 2013-05-01 22:47:47 +00:00
Jilles Tjoelker
dc570d5e56 Add pipe2() system call.
The pipe2() function is similar to pipe() but allows setting FD_CLOEXEC and
O_NONBLOCK (on both sides) as part of the function.

If p points to two writable ints, pipe2(p, 0) is equivalent to pipe(p).

If the pointer is not valid, behaviour differs: pipe2() writes into the
array from the kernel like socketpair() does, while pipe() writes into the
array from an architecture-specific assembler wrapper.

Reviewed by:	kan, kib
2013-05-01 22:42:42 +00:00
Jilles Tjoelker
da7d2afb6d Add accept4() system call.
The accept4() function, compared to accept(), allows setting the new file
descriptor atomically close-on-exec and explicitly controlling the
non-blocking status on the new socket. (Note that the latter point means
that accept() is not equivalent to any form of accept4().)

The linuxulator's accept4 implementation leaves a race window where the new
file descriptor is not close-on-exec because it calls sys_accept(). This
implementation leaves no such race window (by using falloc() flags). The
linuxulator could be fixed and simplified by using the new code.

Like accept(), accept4() is async-signal-safe, a cancellation point and
permitted in capability mode.
2013-05-01 20:10:21 +00:00
Jilles Tjoelker
3143f63a23 intro(2): Fix some errors in ENFILE and EMFILE descriptions.
MFC after:	1 week
2013-04-27 11:55:23 +00:00
Jilles Tjoelker
e160aec9a5 getdtablesize(2): Describe what this function actually does.
getdtablesize() returns the limit on new file descriptors; this says nothing
about existing descriptors.

MFC after:	1 week
2013-04-24 21:24:35 +00:00
Sergey Kandaurov
89bbe1496d Keep up with negative addrlen check removal in r249649. 2013-04-22 09:18:50 +00:00
Jilles Tjoelker
2cd19a510a dup(2): Remove incorrect sentence about getdtablesize().
There are no getdtablesize() bounds on the file descriptor to be duplicated;
it only has to be open. If the RLIMIT_NOFILE rlimit was decreased after
opening the file descriptor, it may be greater than or equal to
getdtablesize() but still valid.

MFC after:	1 week
2013-04-21 19:42:04 +00:00
Joel Dahl
cd088fc43a Remove cross-references to nonexistent CPU_SET(3) manpage.
Also fix cpu_getaffinity(2) document title.

PR:		176317
Submitted by:	brucec
2013-04-21 06:46:41 +00:00
George V. Neville-Neil
599c412493 Correct the returned message lengths for timeval and bintime control
messages (SO_BINTIME, SO_TIMEVAL).

Obtained from:	phk
2013-04-05 18:09:43 +00:00
Matthew D Fleming
e324bf91e8 Fix return type of extattr_set_* and fix rmextattr(8) utility.
extattr_set_{fd,file,link} is logically a write(2)-like operation and
should return ssize_t, just like extattr_get_*.  Also, the user-space
utility was using an int for the return value of extattr_get_* and
extattr_list_*, both of which return an ssize_t.

MFC after:	1 week
2013-04-02 05:30:41 +00:00
Jilles Tjoelker
de9dcfba06 accept(2): Mention inheritance of O_ASYNC and signal destination.
While almost nobody uses O_ASYNC, and rightly so, the inheritance of the
related properties across accept() is a portability issue like the
inheritance of O_NONBLOCK.
2013-03-26 22:46:56 +00:00
Pawel Jakub Dawidek
2883fbd521 Document chflagsat(2).
Obtained from:	jilles
2013-03-21 23:05:44 +00:00
Pawel Jakub Dawidek
e948704e4b Implement chflagsat(2) system call, similar to fchmodat(2), but operates on
file flags.

Reviewed by:	kib, jilles
Sponsored by:	The FreeBSD Foundation
2013-03-21 22:59:01 +00:00
Pawel Jakub Dawidek
b4b2596b97 - Make 'flags' argument to chflags(2), fchflags(2) and lchflags(2) of type
u_long. Before this change it was of type int for syscalls, but prototypes
  in sys/stat.h and documentation for chflags(2) and fchflags(2) (but not
  for lchflags(2)) stated that it was u_long. Now some related functions
  use u_long type for flags (strtofflags(3), fflagstostr(3)).
- Make path argument of type 'const char *' for consistency.

Discussed on:	arch
Sponsored by:	The FreeBSD Foundation
2013-03-21 22:44:33 +00:00
Jilles Tjoelker
46f10cc265 Allow O_CLOEXEC in posix_openpt() flags.
PR:		kern/162374
Reviewed by:	ed
2013-03-21 21:39:15 +00:00
Jilles Tjoelker
c2e3c52e0d Implement SOCK_CLOEXEC, SOCK_NONBLOCK and MSG_CMSG_CLOEXEC.
This change allows creating file descriptors with close-on-exec set in some
situations. SOCK_CLOEXEC and SOCK_NONBLOCK can be OR'ed in socket() and
socketpair()'s type parameter, and MSG_CMSG_CLOEXEC to recvmsg() makes file
descriptors (SCM_RIGHTS) atomically close-on-exec.

The numerical values for SOCK_CLOEXEC and SOCK_NONBLOCK are as in NetBSD.
MSG_CMSG_CLOEXEC is the first free bit for MSG_*.

The SOCK_* flags are not passed to MAC because this may cause incorrect
failures and can be done later via fcntl() anyway. On the other hand, audit
is expected to cope with the new flags.

For MSG_CMSG_CLOEXEC, unp_externalize() is extended to take a flags
argument.

Reviewed by:	kib
2013-03-19 20:58:17 +00:00
Gleb Smirnoff
8863cc408c There are actually two different cases when mlock(2) returns
ENOMEM. Clarify this, taking text from SUS.

Reviewed by:	kib
2013-03-19 05:44:25 +00:00
Pawel Jakub Dawidek
136cbf84ef Add a note to the HISTORY section about lchflags(2) being introduced in
FreeBSD 5.0.
2013-03-16 22:44:14 +00:00
Pawel Jakub Dawidek
7493f24ee6 - Implement two new system calls:
int bindat(int fd, int s, const struct sockaddr *addr, socklen_t addrlen);
	int connectat(int fd, int s, const struct sockaddr *name, socklen_t namelen);

  which allow to bind and connect respectively to a UNIX domain socket with a
  path relative to the directory associated with the given file descriptor 'fd'.

- Add manual pages for the new syscalls.

- Make the new syscalls available for processes in capability mode sandbox.

- Add capability rights CAP_BINDAT and CAP_CONNECTAT that has to be present on
  the directory descriptor for the syscalls to work.

- Update audit(4) to support those two new syscalls and to handle path
  in sockaddr_un structure relative to the given directory descriptor.

- Update procstat(1) to recognize the new capability rights.

- Document the new capability rights in cap_rights_limit(2).

Sponsored by:	The FreeBSD Foundation
Discussed with:	rwatson, jilles, kib, des
2013-03-02 21:11:30 +00:00
Joel Dahl
fdf25068b7 mdoc: remove superfluous paragraph macro. 2013-03-02 06:55:55 +00:00
Pawel Jakub Dawidek
2609222ab4 Merge Capsicum overhaul:
- Capability is no longer separate descriptor type. Now every descriptor
  has set of its own capability rights.

- The cap_new(2) system call is left, but it is no longer documented and
  should not be used in new code.

- The new syscall cap_rights_limit(2) should be used instead of
  cap_new(2), which limits capability rights of the given descriptor
  without creating a new one.

- The cap_getrights(2) syscall is renamed to cap_rights_get(2).

- If CAP_IOCTL capability right is present we can further reduce allowed
  ioctls list with the new cap_ioctls_limit(2) syscall. List of allowed
  ioctls can be retrived with cap_ioctls_get(2) syscall.

- If CAP_FCNTL capability right is present we can further reduce fcntls
  that can be used with the new cap_fcntls_limit(2) syscall and retrive
  them with cap_fcntls_get(2).

- To support ioctl and fcntl white-listing the filedesc structure was
  heavly modified.

- The audit subsystem, kdump and procstat tools were updated to
  recognize new syscalls.

- Capability rights were revised and eventhough I tried hard to provide
  backward API and ABI compatibility there are some incompatible changes
  that are described in detail below:

	CAP_CREATE old behaviour:
	- Allow for openat(2)+O_CREAT.
	- Allow for linkat(2).
	- Allow for symlinkat(2).
	CAP_CREATE new behaviour:
	- Allow for openat(2)+O_CREAT.

	Added CAP_LINKAT:
	- Allow for linkat(2). ABI: Reuses CAP_RMDIR bit.
	- Allow to be target for renameat(2).

	Added CAP_SYMLINKAT:
	- Allow for symlinkat(2).

	Removed CAP_DELETE. Old behaviour:
	- Allow for unlinkat(2) when removing non-directory object.
	- Allow to be source for renameat(2).

	Removed CAP_RMDIR. Old behaviour:
	- Allow for unlinkat(2) when removing directory.

	Added CAP_RENAMEAT:
	- Required for source directory for the renameat(2) syscall.

	Added CAP_UNLINKAT (effectively it replaces CAP_DELETE and CAP_RMDIR):
	- Allow for unlinkat(2) on any object.
	- Required if target of renameat(2) exists and will be removed by this
	  call.

	Removed CAP_MAPEXEC.

	CAP_MMAP old behaviour:
	- Allow for mmap(2) with any combination of PROT_NONE, PROT_READ and
	  PROT_WRITE.
	CAP_MMAP new behaviour:
	- Allow for mmap(2)+PROT_NONE.

	Added CAP_MMAP_R:
	- Allow for mmap(PROT_READ).
	Added CAP_MMAP_W:
	- Allow for mmap(PROT_WRITE).
	Added CAP_MMAP_X:
	- Allow for mmap(PROT_EXEC).
	Added CAP_MMAP_RW:
	- Allow for mmap(PROT_READ | PROT_WRITE).
	Added CAP_MMAP_RX:
	- Allow for mmap(PROT_READ | PROT_EXEC).
	Added CAP_MMAP_WX:
	- Allow for mmap(PROT_WRITE | PROT_EXEC).
	Added CAP_MMAP_RWX:
	- Allow for mmap(PROT_READ | PROT_WRITE | PROT_EXEC).

	Renamed CAP_MKDIR to CAP_MKDIRAT.
	Renamed CAP_MKFIFO to CAP_MKFIFOAT.
	Renamed CAP_MKNODE to CAP_MKNODEAT.

	CAP_READ old behaviour:
	- Allow pread(2).
	- Disallow read(2), readv(2) (if there is no CAP_SEEK).
	CAP_READ new behaviour:
	- Allow read(2), readv(2).
	- Disallow pread(2) (CAP_SEEK was also required).

	CAP_WRITE old behaviour:
	- Allow pwrite(2).
	- Disallow write(2), writev(2) (if there is no CAP_SEEK).
	CAP_WRITE new behaviour:
	- Allow write(2), writev(2).
	- Disallow pwrite(2) (CAP_SEEK was also required).

	Added convinient defines:

	#define	CAP_PREAD		(CAP_SEEK | CAP_READ)
	#define	CAP_PWRITE		(CAP_SEEK | CAP_WRITE)
	#define	CAP_MMAP_R		(CAP_MMAP | CAP_SEEK | CAP_READ)
	#define	CAP_MMAP_W		(CAP_MMAP | CAP_SEEK | CAP_WRITE)
	#define	CAP_MMAP_X		(CAP_MMAP | CAP_SEEK | 0x0000000000000008ULL)
	#define	CAP_MMAP_RW		(CAP_MMAP_R | CAP_MMAP_W)
	#define	CAP_MMAP_RX		(CAP_MMAP_R | CAP_MMAP_X)
	#define	CAP_MMAP_WX		(CAP_MMAP_W | CAP_MMAP_X)
	#define	CAP_MMAP_RWX		(CAP_MMAP_R | CAP_MMAP_W | CAP_MMAP_X)
	#define	CAP_RECV		CAP_READ
	#define	CAP_SEND		CAP_WRITE

	#define	CAP_SOCK_CLIENT \
		(CAP_CONNECT | CAP_GETPEERNAME | CAP_GETSOCKNAME | CAP_GETSOCKOPT | \
		 CAP_PEELOFF | CAP_RECV | CAP_SEND | CAP_SETSOCKOPT | CAP_SHUTDOWN)
	#define	CAP_SOCK_SERVER \
		(CAP_ACCEPT | CAP_BIND | CAP_GETPEERNAME | CAP_GETSOCKNAME | \
		 CAP_GETSOCKOPT | CAP_LISTEN | CAP_PEELOFF | CAP_RECV | CAP_SEND | \
		 CAP_SETSOCKOPT | CAP_SHUTDOWN)

	Added defines for backward API compatibility:

	#define	CAP_MAPEXEC		CAP_MMAP_X
	#define	CAP_DELETE		CAP_UNLINKAT
	#define	CAP_MKDIR		CAP_MKDIRAT
	#define	CAP_RMDIR		CAP_UNLINKAT
	#define	CAP_MKFIFO		CAP_MKFIFOAT
	#define	CAP_MKNOD		CAP_MKNODAT
	#define	CAP_SOCK_ALL		(CAP_SOCK_CLIENT | CAP_SOCK_SERVER)

Sponsored by:	The FreeBSD Foundation
Reviewed by:	Christoph Mallon <christoph.mallon@gmx.de>
Many aspects discussed with:	rwatson, benl, jonathan
ABI compatibility discussed with:	kib
2013-03-02 00:53:12 +00:00
Pawel Jakub Dawidek
d6f122f4fb Provide cap_sandboxed(3) function, which is a wrapper around cap_getmode(2)
system call, which has a nice property - it never fails, so it is a bit
easier to use. If there is no support for capability mode in the kernel
the function will return false (not in a sandbox). If the kernel is compiled
with the support for capability mode, the function will return true or false
depending if the calling process is in the capability mode sandbox or not
respectively.

Sponsored by:	The FreeBSD Foundation
2013-03-02 00:11:27 +00:00
Pawel Jakub Dawidek
1f2ce2a086 Put one file per line so it is easier to read diffs against those files. 2013-02-16 22:21:46 +00:00
Ian Lepore
74938cbb7f Make the F_READAHEAD option to fcntl(2) work as documented: a value of zero
now disables read-ahead.  It used to effectively restore the system default
readahead hueristic if it had been changed; a negative value now restores
the default.

Reviewed by:	kib
2013-02-13 15:09:16 +00:00
Jilles Tjoelker
08851ea288 sigqueue(2): Fix typo (EEPERM -> EPERM).
MFC after:	3 days
2013-02-10 13:20:23 +00:00
Eitan Adler
4b3d880203 Fix logic inversion.
PR:		docs/174966
Submitted by:	Christian Ullrich <chris+freebsd@chrullrich.net>
Approved by:	bcr (mentor)
2013-02-09 17:13:51 +00:00
Konstantin Belousov
2ee83a16c0 Document the detail of interaction between vfork and PT_TRACEME.
MFC after:	2 weeks
2013-02-07 15:36:24 +00:00
Konstantin Belousov
9be7626dee Document the ERESTART translation to EINTR for devfs nodes.
Based on the submission by:	jilles
MFC after:	2 weeks
2013-02-07 15:11:43 +00:00
Konstantin Belousov
150facd256 Rework the __vdso_* symbols attributes to only make the symbols weak,
but use normal references instead of weak.  This makes the statically
linked binaries to use fast gettimeofday(2) by forcing the linker to
resolve references and providing the neccessary functions.

Reported by:	bde
Tested by:	marius (sparc64)
MFC after:	2 weeks
2013-01-30 12:48:16 +00:00
Gleb Smirnoff
422d45aa3b posix_fadvise(2) first appeared in FreeBSD 9.1 2013-01-23 10:50:52 +00:00
Pawel Jakub Dawidek
ffce2a5b46 Note that SIGCHLD is special and if ignored, won't be recorded by the filter. 2013-01-21 22:07:34 +00:00
Andrey Zonov
69d649baa9 - Use standard RETURN VALUES section.
Approved by:	kib (mentor)
MFC after:	1 week
2013-01-15 14:09:08 +00:00
Andrey Zonov
bde505592f - Update manual pages accordingly to r244384 and r244385.
Approved by:	kib (mentor)
MFC after:	1 week
2012-12-25 13:43:01 +00:00
Kevin Lo
0ff48e7194 Document that socket(2) may fail with EAFNOSUPPORT if the family cannot
be found.

Reviewed by:	glebius
Obtained from:	NetBSD
2012-12-07 02:26:08 +00:00
Kevin Lo
5e48557ef0 Document that bind(2) can fail with EAFNOSUPPORT.
Reviewed by:	glebius
2012-12-04 09:53:09 +00:00
Kevin Lo
51fb3b32c4 Document that getpeername(2) and getsockname(2) can fail with EINVAL.
Reviewed by:	glebius
2012-11-23 10:14:54 +00:00
Kevin Lo
b95b21c7ab Document that rtprio(2) and rtprio_thread(2) can fail with EFAULT
due to the invoked copyout(9).

Reviewed by:	davidxu
2012-11-16 09:56:25 +00:00
Kevin Lo
d9b3cfecfd Document that sendfile(2) can fail with ENOBUFS.
Reviewed by:	glebius
2012-11-14 01:45:10 +00:00
Konstantin Belousov
0b92b54045 Document wait6() and waitid().
PR:	standards/170346
Submitted by:	"Jukka A. Ukkonen" <jau@iki.fi>
MFC after:	1 month
2012-11-13 12:56:42 +00:00
Konstantin Belousov
eb3d4e1fbd Implement the waitid() SUSv4 function using wait6() system call.
PR:	standards/170346
Submitted by:	"Jukka A. Ukkonen" <jau@iki.fi>
MFC after:	1 month
2012-11-13 12:55:52 +00:00
Jilles Tjoelker
252462b3bd fcntl(2): Fix typos in name of constant "F_DUP2FD_CLOEXEC".
MFC after:	1 week
2012-11-01 09:38:28 +00:00
Eitan Adler
c965707311 Update the kill(2) and killpg(2) man pages to the modern permission
checks. Also indicate killpg(2) is POSIX compliant.

Reviewed by:	jilles
Reviewed by:	wblock
Approved by:	cperciva
MFC after:	3 days
2012-10-22 03:37:00 +00:00
Andre Oppermann
dc00208ec4 Grammar fixes to r241781.
Submitted by:	alc
2012-10-20 19:38:22 +00:00
Andre Oppermann
2bdf61ca29 Hide the unfortunate named sysctl kern.ipc.somaxconn from sysctl -a
output and replace it with a new visible sysctl kern.ipc.acceptqueue
of the same functionality.  It specifies the maximum length of the
accept queue on a listen socket.

The old kern.ipc.somaxconn remains available for reading and writing
for compatibility reasons so that existing programs, scripts and
configurations continue to work.  There no plans to ever remove the
orginal and now hidden kern.ipc.somaxconn.
2012-10-20 12:53:14 +00:00
Jilles Tjoelker
b30cd8df7c sigaction(2),sigwait(2),sigwaitinfo(2): Remove [EFAULT] error condition.
Passing an invalid pointer results in undefined behaviour.

The wrappers in libthr access some of the data pointed to by the arguments
in userland, so that an invalid pointer will cause a signal and not an
[EFAULT] error return.

Furthermore, if the [EFAULT] error occurs when the kernel is writing, it is
not a proper error in the sense that the call still commits (changing the
signal disposition or accepting the signal).

MFC after:	1 week
2012-09-27 17:48:04 +00:00
Kevin Lo
cc77b2d8e2 Remove the restrict qualifier to match function prototype. 2012-09-20 02:25:18 +00:00
Gleb Smirnoff
be81cc14ab Describe in detail required conditions for receiving the SCM_CREDS
control message and suggest to use LOCAL_CREDS setsockopt() for
reliability.
2012-09-12 09:50:17 +00:00
John Baldwin
9baeec2982 When WIFCONTINUED was added, the number of "first" macros grew from
three to four.

MFC after:	1 week
2012-09-05 11:55:53 +00:00
Niclas Zeising
faf04960ff Add missing .Pp macro.
PR:		docs/170380
Submitted by:	Garrett Cooper <yanegomi@gmail.com>
Approved by:	joel (mentor)
2012-08-21 16:35:14 +00:00
David Xu
d65f1abca7 Implement syscall clock_getcpuclockid2, so we can get a clock id
for process, thread or others we want to support.
Use the syscall to implement POSIX API clock_getcpuclock and
pthread_getcpuclockid.

PR:	168417
2012-08-17 02:26:31 +00:00
Konstantin Belousov
d5a53d996c Document F_DUP2FD_CLOEXEC.
MFC after:	1 week
2012-07-27 10:41:53 +00:00
Konstantin Belousov
a53cab2c6c (Incomplete) fixes for symbols visibility issues and style in fcntl.h.
Append '__' prefix to the tag of struct oflock, and put it under BSD
namespace. Structure is needed both by libc and kernel, thus cannot be
hidden under #ifdef _KERNEL.

Move a set of non-standard F_* and O_* constants into BSD namespace.
SUSv4 explicitely allows implemenation to pollute F_* and O_* names
after fcntl.h is included, but it costs us nothing to adhere
to the specification if exact POSIX compliance level is requested by
user code.

Change some spaces after #define to tabs.

Noted by and discussed with:	     bde
MFC after:   1 week
2012-07-21 13:02:11 +00:00
Konstantin Belousov
39c5964c5a Document F_DUPFD_CLOEXEC. Also provide some wording changes for
F_DUPFD to make it less confusing, at least for me.

MFC after:	1 week
2012-07-19 10:23:59 +00:00
Lawrence Stewart
19220a8330 Move the ffclock symbols from FBSD_1.2 to FBSD_1.3 where they should have been
put initially. They were added to head during development of 10-CURRENT, not
9-CURRENT.

Submitted by:	glebius
Reviewed by:	kib
2012-07-10 08:31:28 +00:00
Konstantin Belousov
869fd80fd4 Use struct vdso_timehands data to implement fast gettimeofday(2) and
clock_gettime(2) functions if supported. The speedup seen in
microbenchmarks is in range 4x-7x depending on the hardware.

Only amd64 and i386 architectures are supported. Libc uses rdtsc and
kernel data to calculate current time, if enabled by kernel.

Hopefully, this code is going to migrate into vdso in some future.

Discussed with:	bde
Reviewed by:	jhb
Tested by:	flo
MFC after:	1 month
2012-06-22 07:13:30 +00:00
John Baldwin
cd4ecf3cd2 Further refine the implementation of POSIX_FADV_NOREUSE.
First, extend the changes in r230782 to better handle the common case
of using NOREUSE with sequential reads.  A NOREUSE file descriptor
will now track the last implicit DONTNEED request it made as a result
of a NOREUSE read.  If a subsequent NOREUSE read is adjacent to the
previous range, it will apply the DONTNEED request to the entire range
of both the previous read and the current read.  The effect is that
each read of a file accessed sequentially will apply the DONTNEED
request to the entire range that has been read.  This allows NOREUSE
to properly handle misaligned reads by flushing each buffer to cache
once it has been completely read.

Second, apply the same changes made to read(2) by r230782 and this
change to writes.  This provides much better performance in the
sequential write case as it allows writes to still be clustered.  It
also provides much better performance for misaligned writes.  It does
mean that NOREUSE will be generally ineffective for non-sequential
writes as the current implementation relies on a future NOREUSE
write's implicit DONTNEED request to flush the dirty buffer from the
current write.

MFC after:	2 weeks
2012-06-19 18:42:24 +00:00
Ed Schouten
0089e0c430 Remove invalid remark about pipes.
The stat structures returned on pipes seems to contain all the
information required by POSIX. Especially the wording "and thus to a
pipe" makes little sense, because it seems to imply a certain
relationship between sockets and pipes that simply isn't there.

MFC after:	2 weeks
2012-06-02 10:50:25 +00:00
Konstantin Belousov
86b0103262 Clarify the SEEK_HOLE description, it repositions the file pointer.
MFC after:  3 days
2012-05-26 05:25:55 +00:00
Joel Dahl
9650117163 Remove tab from kernel configuration option. This is consistent with the rest
of our manual pages.
2012-05-12 16:08:05 +00:00
Glen Barber
9f63b42217 General mdoc(7) and typo fixes.
PR:		167713
Submitted by:	Nobuyuki Koganemaru (kogane!jp.freebsd.org)
2012-05-08 18:56:21 +00:00
Robert Watson
ceebc4ca95 fix a further typo in the pdfork(2) man page.
Submitted by:	Norman Hardy
MFC after:	3 days
2012-04-30 08:00:52 +00:00
Robert Watson
aa5d4b9247 The returned file descriptor from pdfork(2) is via fdp, not pidp.
Submitted by:	Norman Hardy
MFC after:	3 days
2012-04-30 07:32:39 +00:00
Eitan Adler
74af13c61a pread(2) might fail with EBUSY, so document it
PR:		docs/167201
Submitted by:	Kurt Jaeger <fbsd-ports@opsec.eu>
Approved by:	cperciva
MFC after:	3 days
2012-04-29 22:23:00 +00:00
Jamie Gritton
91b24c185b A new jail(8) with a configuration file, ultimately to replace the work
currently done by /etc/rc.d/jail.

MFC after:	3 months
2012-04-26 17:36:05 +00:00
Xin LI
e3a635d15f - Use quote when tab is used;
- Follow the same macros used in device driver manual pages.
2012-04-22 07:51:49 +00:00
Jaakko Heinonen
5c7f2335d4 Additional manual page updates for r234103.
Submitted by:	bde
2012-04-13 05:40:26 +00:00
Eitan Adler
847d0034e3 Return EBADF instead of EMFILE from dup2 when the second argument is
outside the range of valid file descriptors

PR:		kern/164970
Submitted by:	Peter Jeremy <peterjeremy@acm.org>
Reviewed by:	jilles
Approved by:	cperciva
MFC after:	1 week
2012-04-11 14:08:09 +00:00
Jaakko Heinonen
fce74feae1 - Return EPERM from ufs_setattr() when an user without PRIV_VFS_SYSFLAGS
privilege attempts to toggle SF_SETTABLE flags.
- Use the '^' operator in the SF_SNAPSHOT anti-toggling check.

Flags are now stored to ip->i_flags in one place after all checks.

Submitted by:	bde
2012-04-10 15:59:37 +00:00
Joel Dahl
a53e0df1d7 mdoc: Ud takes no argument. 2012-03-29 16:20:20 +00:00
Joel Dahl
288eac5aed mandoc complains loudly when <TAB>s are misused in columnated lists. Fix
this syntax violation and while I'm here also convert <TAB> to Ta and adjust
quotation marks in order to prevent this problem in the future.
2012-03-29 16:02:40 +00:00
Eitan Adler
50d675f7a9 Remove trailing whitespace per mdoc lint warning
Disussed with:	gavin
No objection from:	doc
Approved by:	joel
MFC after:	3 days
2012-03-29 05:02:12 +00:00
Jim Harris
0f70b33419 Fix comment to specify correct struct name.
Reviewed by: gjb
Approved by: sbruno
2012-03-28 23:51:06 +00:00
Joel Dahl
12afe06c06 Make sure sections are sorted into conventional order. 2012-03-25 16:00:56 +00:00
Joel Dahl
41949a1ed5 Remove superfluous paragraph macro. 2012-03-25 12:13:24 +00:00
Benjamin Kaduk
87696ecec7 Remove trailing whitespace.
Approved by:	hrs (mentor)
2012-03-19 05:08:09 +00:00
Benjamin Kaduk
8882458c89 Expound a bit more about the system maximum number of FIBs,
how it may be set, and current limitations on the value.

Approved by:	hrs (mentor)
PR:		docs/157453
MFC after:	1 week
2012-03-19 04:46:11 +00:00
Konstantin Belousov
1d73bec4c1 Do not claim that msync(2) is obsoleted [1].
Document EIO from msync(2).

Inspired by PR:	 docs/165929 [1]
Reviewed by:	 jilles
MFC after:	 2 weeks
2012-03-17 23:55:18 +00:00
Ed Schouten
e06ea46468 Extend the description for ESRCH a bit.
This errno can also be returned if the passed process identifier doesn't
correspond with a process group.

Discussed on:	arch@
MFC after:	1 week
2012-03-15 12:12:39 +00:00
Ed Schouten
c0fd04a922 Remove impossible error condition from the man page.
On FreeBSD, all processes have a process group, so it is impossible for
kill(2) to fail this way.  POSIX also doesn't mention this error
condition.

Discussed on:	arch@
MFC after:	3 weeks
2012-03-15 11:49:26 +00:00
Edward Tomasz Napierala
6975edd9d0 Cross-reference sigqueue(2) and kill(2). 2012-03-10 10:54:52 +00:00
Pawel Jakub Dawidek
cc83460ceb Link EV_SET(3) to kqueue(2).
MFC after:	3 days
2012-03-05 20:59:34 +00:00
Konstantin Belousov
1d2ea43149 Document SO_PROTOCOL socket option.
Discussed with:	bz
Reviewed by:	glebius
MFC after:	2 weeks
2012-02-26 13:57:24 +00:00
Glen Barber
9d496f5ab6 Whitespace cleanup:
o Wrap sentences on to new lines
 o Cleanup trailing whitespace

Found with:	textproc/igor
MFC after:	1 week
X-MFC-With:	r232157
2012-02-25 15:21:43 +00:00
Glen Barber
3102cfe2e2 Fix various typos in manual pages.
Submitted by:	amdmi3
PR:		165431
MFC after:	1 week
2012-02-25 14:31:25 +00:00
Konstantin Belousov
3e2f30f6dd Document PL_FLAG_CHILD.
MFC after:	3 days
2012-02-18 22:26:32 +00:00
Xin LI
601e0f587d Bump .Dd date for previous revision. 2012-02-15 18:34:57 +00:00
David Xu
03a67b59f8 Add notes about sigev_notify_kevent_flags introduced in revision 230857
which enables thread-friendly polling on same fd for AIO events.

Reviewed by:	delphij
2012-02-15 02:59:17 +00:00
Ed Schouten
6b99842ada Globally replace u_int*_t from (non-contributed) man pages.
The reasoning behind this, is that if we are consistent in our
documentation about the uint*_t stuff, people will be less tempted to
write new code that uses the non-standard types.

I am not going to bump the man page dates, as these changes can be
considered style nits. The meaning of the man pages is unaffected.

MFC after:	1 month
2012-02-12 18:29:56 +00:00
Jamie Gritton
e9d3a32ffd Acknowledge that jail_attach and jail_remove can return EPERM.
MFC after:	1 week
2012-02-08 23:34:47 +00:00
Tijl Coosemans
2cb08f8d7d Move descriptions of file caching commands out of the file locking section.
Approved by:	kib (mentor)
2012-01-28 18:35:10 +00:00
Sergey Kandaurov
a12406889a Remove a left-over reference to make.conf(5) which was used as a place to
store the VM_STACK compile option to enable MAP_STACK support in its
earliest stage of development.

Found by:	mux
2012-01-27 13:26:19 +00:00
Konstantin Belousov
7ff824218a Clarify the implementation-defined behaviour in case of close(2)
returning error.

MFC after:	1 week
2012-01-22 11:58:17 +00:00
Pawel Jakub Dawidek
e0c980d95d The sys/uio.h header is needed only for readv(2), preadv(2), writev(2) and
pwritev(2). Document it more precisely.

Reviewed by:	jilles
MFC after:	3 days
2012-01-22 11:15:48 +00:00
Eitan Adler
4e529b6d78 Make man page wording more clear:
PR:		docs/164078
Submitted by:	Taras <ds@ukrhub.net>
Approved by:	bcr
MFC after:	3 days
2012-01-15 20:14:52 +00:00
Xin LI
6fe5169cff Document the fact that chroot(2) is no longer part of POSIX since SUSv3
and add a SECURITY CONSIDERATIONS section for recommended practices.
2012-01-04 02:04:20 +00:00
Sergey Kandaurov
5f09751fe3 Fix manual section for acl_get(3) and mac_get(3) family functions.
Reviewed by:	rwatson
MFC after:	1 week
2011-12-29 21:12:22 +00:00
Xin LI
b0266169d4 Update rtprio(2) manual page to reflect the latest changes in -CURRENT as
well as provide documentation for rtprio_thread(2) system call.

MFC after:	1 month
X-MFC-after:	r228470
2011-12-27 10:34:00 +00:00
Ruslan Ermilov
20df026c9a The NOTE_COPY should have been named NOTE_FFCOPY from the very
beginning.

Submitted by:	Igor Sysoev
2011-12-07 11:06:18 +00:00
Robert Watson
251944df31 Cross-reference capsicum.4 from cap_enter.2 and cap_new.2.
MFC after:	3 days
Sponsored by:	Google, Inc.
2011-11-27 19:45:41 +00:00