Commit Graph

1617 Commits

Author SHA1 Message Date
Mark Johnston
93c9677b94 fork(2): Add a note to the effect that kqueue descriptors, unlike other
descriptor types, are not inherited from the parent process.

Reported by:	kmacy
MFC after:	1 week
2015-05-02 00:29:27 +00:00
Baptiste Daroussin
18c5321d06 Escape "Ed" 2015-04-26 10:52:37 +00:00
John Baldwin
179fa75e6e Reassign copyright statements on several files from Advanced
Computing Technologies LLC to Hudson River Trading LLC.

Approved by:	Hudson River Trading LLC (who owns ACT LLC)
MFC after:	1 week
2015-04-23 14:22:20 +00:00
Konstantin Belousov
0538aafc41 The lseek(2), mmap(2), truncate(2), ftruncate(2), pread(2), and
pwrite(2) syscalls are wrapped to provide compatibility with pre-7.x
kernels which required padding before the off_t parameter.  The
fcntl(2) contains compatibility code to handle kernels before the
struct flock was changed during the 8.x CURRENT development.  The
shims were reasonable to allow easier revert to the older kernel at
that time.

Now, two or three major releases later, shims do not serve any
purpose.  Such old kernels cannot handle current libc, so revert the
compatibility code.

Make padded syscalls support conditional under the COMPAT6 config
option.  For COMPAT32, the syscalls were under COMPAT6 already.

Remove WITHOUT_SYSCALL_COMPAT build option, which only purpose was to
(partially) disable the removed shims.

Reviewed by:	jhb, imp (previous versions)
Discussed with:	peter
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-04-18 21:50:13 +00:00
Konstantin Belousov
3d0045bb2b Make wait6(2), waitid(3) and ppoll(2) cancellation points. The
waitid() function is required to be cancellable by the standard.  The
wait6() and ppoll() follow the other syscalls in their groups.

Reviewed by:	jhb, jilles (previous versions)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-04-18 21:35:41 +00:00
Sergey Kandaurov
d359191f7e Remove obsolete bits about maximum number of file systems.
NMOUNT has gone together with static mount table in 4.3BSD-Reno.

MFC after:	1 week
2015-04-12 21:14:58 +00:00
John Baldwin
b871bfa1a2 vfork() first appeared in 3BSD which pre-dates 2.9BSD. Verified via the
copy of 3BSD on disc 1 of "The CSRG Archives".

PR:		198612
MFC after:	1 week
2015-04-06 20:40:01 +00:00
Ed Maste
541236cf60 libc: Eliminate duplicate copies of __vdso_gettc.c
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D2152
2015-04-02 21:18:11 +00:00
Edward Tomasz Napierala
522196b5ed Update open(2) to make it more obvious that O_NOCTTY and O_TTY_INIT
are ignored.

MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2015-04-02 11:41:04 +00:00
Konstantin Belousov
1849df3006 Correctly handle __fcntl_compat symbol for the !SYSCALL_COMPAT case.
Both .weak and .alias assembler directives only work when assembling
the file which defines the symbol.

Reported and tested by:	andrew
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-04-01 16:55:30 +00:00
Konstantin Belousov
b072e86d09 Make kevent(2) a cancellation point.
Note that to cancel blocked kevent(2) call, changelist must be empty,
since we cannot cancel a call which already made changes to the
process state.  And in reverse, call which only makes changes to the
kqueue state, without waiting for an event, is not cancellable.  This
makes a natural usage model to migrate kqueue loop to support
cancellation, where existing single kevent(2) call must be split into
two: first uncancellable update of kqueue, then cancellable wait for
events.

Note that this is ABI-incompatible change, but it is believed that
there is no cancel-safe code that relies on kevent(2) not being a
cancellation point.  Option to preserve the ABI would be to keep
kevent(2) as is, but add new call with flags to specify cancellation
behaviour, which only value seems to add complications.

Suggested and reviewed by:	jilles
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-03-29 19:14:41 +00:00
John-Mark Gurney
32d52c275d forgot to bump date, and replace contraction (igor)... 2015-03-07 03:48:32 +00:00
John-Mark Gurney
4a46673183 make things a bit more clear.. we worked together on language..
Submitted by:	Justin Cormack
2015-03-06 23:17:18 +00:00
John-Mark Gurney
b759b8aa44 fix spelling, add comma and remove BUGS section.. it provided no useful
information, and is not really bugs, but limitations for other reasons...
2015-02-19 01:51:17 +00:00
Marius Strobl
aed116911d Unbreak sparc64 after r276630 by calling __sparc_sigtramp_setup signal
trampoline as part of the MD __sys_sigaction again.

Submitted by:	kib (initial versions)
MFC after:	3 days
2015-02-16 22:13:03 +00:00
Konstantin Belousov
45468c5356 Properly interpose libc spinlocks, was missed in r276630. In
particular, stdio locking was affected.

Reported and tested by:	"Matthew D. Fuller" <fullermd@over-yonder.net>
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2015-02-14 11:47:40 +00:00
Edward Tomasz Napierala
6c316535e2 Remove useless comment.
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2015-02-07 13:11:45 +00:00
Jilles Tjoelker
2205e0d1bd Add futimens and utimensat system calls.
The core kernel part is patch file utimes.2008.4.diff from
pluknet@FreeBSD.org. I updated the code for API changes, added the manual
page and added compatibility code for old kernels. There is also audit and
Capsicum support.

A new UTIME_* constant might allow setting birthtimes in future.

Differential Revision:	https://reviews.freebsd.org/D1426
Submitted by:	pluknet (partially)
Reviewed by:	delphij, pluknet, rwatson
Relnotes:	yes
2015-01-23 21:07:08 +00:00
Konstantin Belousov
677258f7e7 Add procctl(2) PROC_TRACE_CTL command to enable or disable debugger
attachment to the process.  Note that the command is not intended to
be a security measure, rather it is an obfuscation feature,
implemented for parity with other operating systems.

Discussed with:	jilles, rwatson
Man page fixes by:	rwatson
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-01-18 15:13:11 +00:00
Konstantin Belousov
397d851d66 Reduce the size of the interposing table and amount of
cancellation-handling code in the libthr.  Translate some syscalls
into their more generic counterpart, and remove translated syscalls
from the table.

List of the affected syscalls:
creat, open -> openat
raise -> thr_kill
sleep, usleep -> nanosleep
pause -> sigsuspend
wait, wait3, waitpid -> wait4

Suggested and reviewed by:	jilles (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-01-11 22:16:31 +00:00
John Baldwin
e275993995 Document CPU_WHICH_DOMAIN and bump Dd for cpuset.1.
Missed in:	r276829
2015-01-08 18:53:11 +00:00
Konstantin Belousov
1a744fefc2 Avoid calling internal libc function through PLT or accessing data
though GOT, by staticizing and hiding.  Add setter for
__error_selector to hide it as well.

Suggested and reviewed by:	jilles
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-01-05 01:06:54 +00:00
Konstantin Belousov
8495e8b1e9 Fix known issues which blow up the process after dlopen("libthr.so")
(or loading a dso linked to libthr.so into process which was not
linked against threading library).

- Remove libthr interposers of the libc functions, including
  __error(). Instead, functions calls are indirected through the
  interposing table, similar to how pthread stubs in libc are already
  done.  Libc by default points either to syscall trampolines or to
  existing libc implementations.  On libthr load, libthr rewrites the
  pointers to the cancellable implementations already in libthr.  The
  interposition table is separate from pthreads stubs indirection
  table to not pull pthreads stubs into static binaries.

- Postpone the malloc(3) internal mutexes initialization until libthr
  is loaded.  This avoids recursion between calloc(3) and static
  pthread_mutex_t initialization.

- Reinstall signal handlers with wrapper on libthr load.  The
  _rtld_is_dlopened(3) is used to avoid useless calls to sigaction(2)
  when libthr is statically referenced from the main binary.

In the process, fix openat(2), swapcontext(2) and setcontext(2)
interposing.  The libc symbols were exported at different versions
than libthr interposers.  Export both libc and libthr versions from
libc now, with default set to the higher version from libthr.

Remove unused and disconnected swapcontext(3) userspace implementation
from libc/gen.

No objections from:	deischen
Tested by:	pho, antoine (exp-run) (previous versions)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-01-03 18:38:46 +00:00
Christian Brueffer
0aee91e1fb Various mdoc fixes and a few EOL whitespace removals.
Found with:	mandoc -Tlint
2014-12-21 12:36:36 +00:00
Bryan Drewery
4bb90cbe18 Bump Dd for r275846
MFC after:	3 weeks
2014-12-17 01:36:00 +00:00
Kirk McKusick
27ae6f4af7 Add some additional clarification and fix a few gammer nits.
Reviewed by: kib
MFC after:   3 weeks
2014-12-17 01:32:27 +00:00
Konstantin Belousov
19eaed5353 Markup fixes for kqueue(2), no content changes.
Reviewed by:	brueffer (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2014-12-15 14:58:10 +00:00
Konstantin Belousov
237623b028 Add a facility for non-init process to declare itself the reaper of
the orphaned descendants.  Base of the API is modelled after the same
feature from the DragonFlyBSD.

Requested by:	bapt
Reviewed by:	jilles (previous version)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
2014-12-15 12:01:42 +00:00
Ed Maste
294246bb7d Revert r274772: it is not valid on MIPS
Reported by:	sbruno
2014-11-25 03:50:31 +00:00
Baptiste Daroussin
2c900f1907 Ta is only allowed with Bl -column not in Bl -item 2014-11-23 23:35:16 +00:00
Joel Dahl
d4d112e34a Misc mdoc fixes:
- Remove superfluous paragraph macros.
- Remove/fix empty or incorrect macros.
- Sort sections into conventional order.
- Terminate quoted strings properly.
- Remove EOL whitespace.
2014-11-23 21:00:00 +00:00
Ed Maste
688fd61ae8 Use canonical __PIC__ flag
It is automatically set when -fPIC is passed to the compiler.

Reviewed by:	dim, kib
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D1179
2014-11-21 02:05:48 +00:00
Dmitry Chagin
186d9c3473 Add the ppoll() system call.
Export kern_poll() needed by an upcoming Linuxulator change.

Differential Revision:	https://reviews.freebsd.org/D1133
Reviewed by:	kib, wblock
MFC after:	1 month
2014-11-13 05:26:14 +00:00
Dag-Erling Smørgrav
4b52e0d84f <sys/param.h> is a superset of <sys/types.h> and must always come
first.  Coincidentally, today is the 11th anniversary of this man
page's (and this bug's) first appearance in FreeBSD.

MFC after:	3 days
2014-11-01 09:10:21 +00:00
Gavin Atkinson
ff8d5270bc Slightly improve grammar in EAGAIN description.
PR:		176806
Submitted by:	Jeremy Chadwick
MFC after:	3 days
2014-10-15 23:39:47 +00:00
Xin LI
b888b86e6f accept(2) may and can return EAGAIN, document it.
MFC after:	1 week
2014-10-10 03:05:55 +00:00
Bryan Drewery
c1efb88730 Document [EPERM] for UNIX sockets.
MFC after:	2 weeks
2014-09-30 00:06:53 +00:00
John Baldwin
7a9f047ba7 - Remove mention of MAP_INHERIT. It hasn't been implemented for thirteen
years.
- Remove mention of unimplemented MAP_SWAP.  There are no future plans to
  implement it.

Submitted by:	alc (2)
2014-09-17 19:45:34 +00:00
Enji Cooper
2bcdab32c3 Bump .Dd for the content change done to access(2) in r271655
PR: 181155
Sponsored by: EMC / Isilon Storage Division
2014-09-16 00:59:08 +00:00
Enji Cooper
257597a434 Validate the mode argument in access, eaccess, and faccessat for optional
POSIX compliance and to improve compatibility with Linux and NetBSD

The issue was identified with lib/libc/sys/t_access:access_inval from
NetBSD

Update the manpage accordingly

PR: 181155
Reviewed by: jilles (code), jmmv (code), wblock (manpage), wollman (code)
MFC after: 4 weeks
Phabric: D678 (code), D786 (manpage)
Sponsored by: EMC / Isilon Storage Division
2014-09-16 00:56:47 +00:00
John-Mark Gurney
e2cc4003e2 document mqueuefs is required for mq_open... 2014-09-15 22:32:35 +00:00
John Baldwin
5fd3f8b3b6 Add stricter checking of some mmap() arguments:
- Fail with EINVAL if an invalid protection mask is passed to mmap().
- Fail with EINVAL if an unknown flag is passed to mmap().
- Fail with EINVAL if both MAP_PRIVATE and MAP_SHARED are passed to mmap().
- Require one of either MAP_PRIVATE or MAP_SHARED for non-anonymous
  mappings.

Reviewed by:	alc, kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D698
2014-09-15 17:20:13 +00:00
Joel Dahl
cfaa96f327 Minor mdoc nit. 2014-09-09 14:34:54 +00:00
Baptiste Daroussin
42e62eca52 Extend kqueue's EVFILT_TIMER by adding precision unit flags support
Define the precision macros as bits sets to conform with XNU equivalent.
Test fflags passed for EVFILT_TIMER and return EINVAL in case an invalid flag
is passed.

Phabric:	https://phabric.freebsd.org/D421
Reviewed by:	kib
2014-07-18 14:27:04 +00:00
Kevin Lo
92511d108b Document that listen(2) can fail with EDESTADDRREQ. 2014-07-15 02:21:51 +00:00
Mark Johnston
d3fe75eb62 Fix a typo.
MFC after:	3 days
2014-07-09 01:33:35 +00:00
Konstantin Belousov
c22de76166 Note that most errors are possible for all syscalls from utimes(2)
family.  Minor wording corrections.

Based on the suggestions by bde.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-07-03 11:19:16 +00:00
Sergey Kandaurov
521aa90cb8 Document EINVAL as per POSIX.
This also follows r124335-r124336, r225827.

PR:		191382
MFC after:	1 week
Sponsored by:	Nginx, Inc.
2014-06-26 10:21:00 +00:00
Garrett Wollman
775a76844f Catch up with many years of changes:
o Document PF_LOCAL as being an alias for PF_UNIX
o Document POSIX standardization of this interface using AF_*
  constants rather than PF_* constants, and note the three particular
  families which POSIX standardizes.
o Note anticipated POSIX standardization of SOCK_CLOEXEC.
o Delete from listing protocol families that FreeBSD doesn't support
  (in some cases, like PF_PUP, has never supported).
o Add to listing some current protocol families that have been
  introduced in the last decade or so.
o Document the correspondence of PF_* and AF_* constants.

We should probably change the documentation to make the AF_* constants
primary, but this commit does not do so.

Reviewed by:	kevlo@
MFC after:	1 month
2014-06-24 20:23:18 +00:00
Joel Dahl
df2d82e003 mdoc: remove superfluous paragraph macros. 2014-06-23 18:40:21 +00:00
Baptiste Daroussin
8fbf3d50e3 use .Mt to mark up email addresses consistently (part4)
PR:		191174
Submitted by:	Franco Fichtner  <franco at lastsummer.de>
2014-06-23 08:25:03 +00:00
Konstantin Belousov
11c42bcc54 Add MAP_EXCL flag for mmap(2). It should be combined with MAP_FIXED,
and prevents the request from deleting existing mappings in the
region, failing instead.

Reviewed by:	alc
Discussed with:	jhb
Tested by:	markj, pho (previous version, as part of the bigger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-06-19 05:00:39 +00:00
Konstantin Belousov
8efb419829 The time come to remove the wrapper, most likely, but tidy up it code
instead for now.  Remove spurious blank line, use C89 definition, wrap
long line.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-06-19 04:55:00 +00:00
Benjamin Kaduk
245d93279b Minor mdoc fix
Submitted by:	hrs
Approved by:	hrs (mentor, implicit)
2014-05-30 02:16:28 +00:00
Benjamin Kaduk
6953d7db5c Correct documentation of the limit on how much memory can be mlock()ed
vm.max_wired is a system-wide limit, not per-process.  Reword the
section to make this more clear.

PR:		docs/189214
Submitted by:	Lawrence Chen (original text)
Approved by:	hrs (mentor)
2014-05-17 03:05:52 +00:00
Peter Holm
e103f5b1c0 msync(2) must return ENOMEM and not EINVAL when the address is outside the
allowed range or when one or more pages are not mapped. This according to
The Open Group Base Specifications Issue 7.

Discussed with:	 attilio, Bruce Evans
Reviewed by:	 alc, Garrett Cooper
Reported by:	 ATF
MFC after:	 2 weeks
Sponsored by:	EMC / Isilon storage division
2014-05-07 08:38:02 +00:00
Ed Schouten
f56cfe8d61 Fix table alignment. EVFILT_PROCDESC is longer than the existing filters. 2014-04-07 18:17:31 +00:00
Ed Schouten
38219d6acd Implement kqueue(2) for procdesc(4).
kqueue(2) already supports EVFILT_PROC. Add an EVFILT_PROCDESC that
behaves the same, but operates on a procdesc(4) instead. Only implement
NOTE_EXIT for now. The nice thing about NOTE_EXIT is that it also
returns the exit status of the process, meaning that we can now obtain
this value, even if pdwait4(2) is still unimplemented.

Notes:

- Simply reuse EVFILT_NETDEV for EVFILT_PROCDESC. As both of these will
  be used on totally different descriptor types, this should not clash.

- Let procdesc_kqops_event() reuse the same structure as filt_proc().
  The only difference is that procdesc_kqops_event() should also be able
  to deal with the case where the process was already terminated after
  registration. Simply test this when hint == 0.

- Fix some style(9) issues in filt_proc() to keep it consistent with the
  newly added procdesc_kqops_event().

- Save the exit status of the process in pd->pd_xstat, as we cannot pick
  up the proctree_lock from within procdesc_kqops_event().

Discussed on:	arch@
Reviewed by:	kib@
2014-04-07 18:10:49 +00:00
Warner Losh
a5fc5b6223 Convert from WITHOUT_SYSCALL_COMPAT to MK_SYSCALL_COMPAT. 2014-04-05 17:54:43 +00:00
Ed Schouten
2ad6bba714 Correct return type of pdfork(2).
The pdfork(2) man page states:

	"pdfork() returns a PID, 0 or -1, as fork(2) does."

As it returns a PID, the return type should obviously be pid_t. As int
and pid_t have the same size on all architectures, this change does not
affect the ABI in any way.
2014-04-04 19:53:45 +00:00
Eitan Adler
45ebf5d172 Use the correct variable name in the example code. 2014-03-30 04:40:41 +00:00
Robert Watson
cf321a51b1 Update system man pages for s/capability.h/capsicum.h/.
MFC after:	3 weeks
2014-03-27 21:43:00 +00:00
Marcel Moolenaar
8876613dc5 Replace use of ${.CURDIR} by ${LIBC_SRCTOP} and define ${LIBC_SRCTOP}
if not already defined. This allows building libc from outside of
lib/libc using a reach-over makefile.

A typical use-case is to build a standard ILP32 version and a COMPAT32
version in a single iteration by building the COMPAT32 version using a
reach-over makefile.

Obtained from:	Juniper Networks, Inc.
2014-03-04 02:19:39 +00:00
Benjamin Kaduk
af1e239814 syncer(4) is a kernel process, not a user process
Noticed by: Geoffrey Thomas <gthomas@mokafive.com>
Approved by:	hrs (mentor)
2014-02-27 04:06:34 +00:00
Christian Brueffer
9cba0f9670 Match the correct variable to the variable description.
PR:		121173
Submitted by:	Thomas Mueller <tmueller at sysgo.com>
MFC after:	1 week
2014-02-21 13:53:41 +00:00
John-Mark Gurney
ad6a53db5f document _JAIL as a possible option to set a cpuset for a jail..
MFC after:	3 days
2014-02-15 07:01:45 +00:00
Christian Brueffer
a578215eed Fix a typo.
MFC after:	1 week
2014-02-03 22:16:46 +00:00
Konstantin Belousov
49d39308ba The posix_madvise(3) and posix_fadvise(2) should return error on
failure, same as posix_fallocate(2).

Noted by:	Bob Bishop <rb@gid.co.uk>
Discussed with:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-01-30 18:04:39 +00:00
Ulrich Spörlein
d7d8b00bec mdoc: fix several uses of the Fx macro to point to actual releases.
Found by:  make manlint
2014-01-28 21:40:10 +00:00
Konstantin Belousov
2852de0489 The posix_fallocate(2) syscall should return error number on error,
without modifying errno.

Reported and tested by:	Gennady Proskurin <gpr@mail.ru>
Reviewed by:	mdf
PR:	standards/186028
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-01-23 17:24:26 +00:00
Sergey Kandaurov
dccae053f7 Update EINVAL description.
This matches current POSIX standards and actual FreeBSD behavior.

MFC after:	1 week
2014-01-23 09:37:03 +00:00
Jilles Tjoelker
b83686c8fe Add some missing .Nm for newer syscalls in existing man pages.
MFC after:	1 week
2014-01-11 22:00:16 +00:00
Sergey Kandaurov
d3178d7d27 - Fix EBADF description, in following the future POSIX tc and what FreeBSD
actually implements.
- Improve grammar: use more preferred "can", not "could".

Submitted by:	jilles
2013-12-27 16:57:38 +00:00
Sergey Kandaurov
4ca1cd1d63 Fix an apparent typo.
MFC after:	3 days
2013-12-26 19:18:43 +00:00
Sergey Kandaurov
e44aa9fde0 Provide the manual page for aio_fsync(2).
Reviewed by:	davidxu
MFC after:	1 week
2013-12-26 19:16:30 +00:00
Sergey Kandaurov
5acf8f8325 The compile time constant limit on number of swap devices was removed in 5.2.
As such, remove the EINVAL error saying so.  Currently the vm.nswapdev sysctl
just represents the number of added swap devices.

MFC after:	1 week
2013-12-25 16:01:29 +00:00
Ruslan Ermilov
0f987f1f08 shm_open(2): Fixed the history information.
While here, sort xrefs.

Reviewed by:	jhb
2013-12-18 12:18:17 +00:00
Joel Dahl
2727e97436 mdoc: remove EOL whitespace. 2013-12-06 21:22:33 +00:00
John Baldwin
d4e3c0a2d7 Various updates and tweaks to the wait(2) manpage.
PR:		docs/183904
Submitted by:	Michael Galassi <michaelgalassi@gmail.com>
Reviewed by:	kib, wblock (earlier version)
2013-12-03 21:00:13 +00:00
Jilles Tjoelker
b865f8ef40 chmod(2): Document S_ISVTX following SUSv3/SUSv4.
S_ISTXT is non-standard.

While here, also update fchmodat() standards entry to POSIX.1-2008.
2013-12-01 12:24:57 +00:00
Jilles Tjoelker
09466daf8c waitid(2): Do not tell userland programmers to include <sys/signal.h>.
Userland should get these definitions by including <signal.h>.
2013-12-01 11:59:37 +00:00
Pawel Jakub Dawidek
f2b525e6b9 Make process descriptors standard part of the kernel. rwhod(8) already
requires process descriptors to work and having PROCDESC in GENERIC
seems not enough, especially that we hope to have more and more consumers
in the base.

MFC after:	3 days
2013-11-30 15:08:35 +00:00
Sergey Kandaurov
dc211b3d40 Fix extattr(2) MLINKS.
MFC after:	1 week
2013-11-09 00:36:09 +00:00
Pawel Jakub Dawidek
6f62d278e8 - Add manual pages for capability rights (rights(4)), cap_rights_init(3)
family of functions and cap_rights_get(3) function.
- Update remaining Capsicum-related manual pages.

Reviewed by:	bdrewery
MFC after:	3 days
2013-11-04 14:10:22 +00:00
Jilles Tjoelker
1947c8a6d1 kqueue: Change error for kqueues rlimit from EMFILE to ENOMEM and document
this error condition in the kqueue(2) manual page.

Discussed with:	kib
2013-11-03 23:06:24 +00:00
Konstantin Belousov
85a0ddfd0b Add a resource limit for the total number of kqueues available to the
user.  Kqueue now saves the ucred of the allocating thread, to
correctly decrement the counter on close.

Under some specific and not real-world use scenario for kqueue, it is
possible for the kqueues to consume memory proportional to the square
of the number of the filedescriptors available to the process.  Limit
allows administrator to prevent the abuse.

This is kernel-mode side of the change, with the user-mode enabling
commit following.

Reported and tested by:	pho
Discussed with:	jmg
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2013-10-21 16:46:12 +00:00
Jilles Tjoelker
0f49c96cfc accept(2): Update portability note for accept4().
The accept(2) man page warns that O_NONBLOCK and other properties on the
new socket may vary across implementations. However, this issue only
applies to accept() and not to accept4(). On the other hand, accept4()
is not commonly available yet.

Reported by:	pluknet
Reviewed by:	bjk
Approved by:	re (kib)
2013-10-01 21:17:18 +00:00
Joel Dahl
828378a6d3 Minor mdoc improvements.
Approved by:	re (blanket)
2013-09-19 19:43:38 +00:00
John Baldwin
55648840de Extend the support for exempting processes from being killed when swap is
exhausted.
- Add a new protect(1) command that can be used to set or revoke protection
  from arbitrary processes.  Similar to ktrace it can apply a change to all
  existing descendants of a process as well as future descendants.
- Add a new procctl(2) system call that provides a generic interface for
  control operations on processes (as opposed to the debugger-specific
  operations provided by ptrace(2)).  procctl(2) uses a combination of
  idtype_t and an id to identify the set of processes on which to operate
  similar to wait6().
- Add a PROC_SPROTECT control operation to manage the protection status
  of a set of processes.  MADV_PROTECT still works for backwards
  compatability.
- Add a p_flag2 to struct proc (and a corresponding ki_flag2 to kinfo_proc)
  the first bit of which is used to track if P_PROTECT should be inherited
  by new child processes.

Reviewed by:	kib, jilles (earlier version)
Approved by:	re (delphij)
MFC after:	1 month
2013-09-19 18:53:42 +00:00
Bryan Drewery
c36029e6dc Consistently reference file descriptors as "fd". 55 other manpages
used "fd", while these used "d" and "filedes".

MFC after:	1 week
Approved by:	gjb
Approved by:	re (delphij)
2013-09-12 00:53:38 +00:00
John Baldwin
edb572a38c Add a mmap flag (MAP_32BIT) on 64-bit platforms to request that a mapping use
an address in the first 2GB of the process's address space.  This flag should
have the same semantics as the same flag on Linux.

To facilitate this, add a new parameter to vm_map_find() that specifies an
optional maximum virtual address.  While here, fix several callers of
vm_map_find() to use a VMFS_* constant for the findspace argument instead of
TRUE and FALSE.

Reviewed by:	alc
Approved by:	re (kib)
2013-09-09 18:11:59 +00:00
Jilles Tjoelker
550ac4a8e8 wait(2): Add some possible caveats to standards section. 2013-09-07 11:41:52 +00:00
Jilles Tjoelker
75b1cda430 Update some signal man pages for multithreading. 2013-09-06 09:08:40 +00:00
Pawel Jakub Dawidek
7008be5bd7 Change the cap_rights_t type from uint64_t to a structure that we can extend
in the future in a backward compatible (API and ABI) way.

The cap_rights_t represents capability rights. We used to use one bit to
represent one right, but we are running out of spare bits. Currently the new
structure provides place for 114 rights (so 50 more than the previous
cap_rights_t), but it is possible to grow the structure to hold at least 285
rights, although we can make it even larger if 285 rights won't be enough.

The structure definition looks like this:

	struct cap_rights {
		uint64_t	cr_rights[CAP_RIGHTS_VERSION + 2];
	};

The initial CAP_RIGHTS_VERSION is 0.

The top two bits in the first element of the cr_rights[] array contain total
number of elements in the array - 2. This means if those two bits are equal to
0, we have 2 array elements.

The top two bits in all remaining array elements should be 0.
The next five bits in all array elements contain array index. Only one bit is
used and bit position in this five-bits range defines array index. This means
there can be at most five array elements in the future.

To define new right the CAPRIGHT() macro must be used. The macro takes two
arguments - an array index and a bit to set, eg.

	#define	CAP_PDKILL	CAPRIGHT(1, 0x0000000000000800ULL)

We still support aliases that combine few rights, but the rights have to belong
to the same array element, eg:

	#define	CAP_LOOKUP	CAPRIGHT(0, 0x0000000000000400ULL)
	#define	CAP_FCHMOD	CAPRIGHT(0, 0x0000000000002000ULL)

	#define	CAP_FCHMODAT	(CAP_FCHMOD | CAP_LOOKUP)

There is new API to manage the new cap_rights_t structure:

	cap_rights_t *cap_rights_init(cap_rights_t *rights, ...);
	void cap_rights_set(cap_rights_t *rights, ...);
	void cap_rights_clear(cap_rights_t *rights, ...);
	bool cap_rights_is_set(const cap_rights_t *rights, ...);

	bool cap_rights_is_valid(const cap_rights_t *rights);
	void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src);
	void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src);
	bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);

Capability rights to the cap_rights_init(), cap_rights_set(),
cap_rights_clear() and cap_rights_is_set() functions are provided by
separating them with commas, eg:

	cap_rights_t rights;

	cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);

There is no need to terminate the list of rights, as those functions are
actually macros that take care of the termination, eg:

	#define	cap_rights_set(rights, ...)				\
		__cap_rights_set((rights), __VA_ARGS__, 0ULL)
	void __cap_rights_set(cap_rights_t *rights, ...);

Thanks to using one bit as an array index we can assert in those functions that
there are no two rights belonging to different array elements provided
together. For example this is illegal and will be detected, because CAP_LOOKUP
belongs to element 0 and CAP_PDKILL to element 1:

	cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);

Providing several rights that belongs to the same array's element this way is
correct, but is not advised. It should only be used for aliases definition.

This commit also breaks compatibility with some existing Capsicum system calls,
but I see no other way to do that. This should be fine as Capsicum is still
experimental and this change is not going to 9.x.

Sponsored by:	The FreeBSD Foundation
2013-09-05 00:09:56 +00:00
Robert Watson
7b223d2286 Xref capsicum(4) and procdesc(4) from pdfork(2).
Suggested by:	sbruno
MFC after:	3 days
2013-08-28 20:00:25 +00:00
Joel Dahl
4d5c7c633b Remove EOL whitespace. 2013-08-22 16:02:20 +00:00
Kenneth D. Merry
7da1a731c6 Expand the use of stat(2) flags to allow storing some Windows/DOS
and CIFS file attributes as BSD stat(2) flags.

This work is intended to be compatible with ZFS, the Solaris CIFS
server's interaction with ZFS, somewhat compatible with MacOS X,
and of course compatible with Windows.

The Windows attributes that are implemented were chosen based on
the attributes that ZFS already supports.

The summary of the flags is as follows:

UF_SYSTEM:	Command line name: "system" or "usystem"
		ZFS name: XAT_SYSTEM, ZFS_SYSTEM
		Windows: FILE_ATTRIBUTE_SYSTEM

		This flag means that the file is used by the
		operating system.  FreeBSD does not enforce any
		special handling when this flag is set.

UF_SPARSE:	Command line name: "sparse" or "usparse"
		ZFS name: XAT_SPARSE, ZFS_SPARSE
		Windows: FILE_ATTRIBUTE_SPARSE_FILE

		This flag means that the file is sparse.  Although
		ZFS may modify this in some situations, there is
		not generally any special handling for this flag.

UF_OFFLINE:	Command line name: "offline" or "uoffline"
		ZFS name: XAT_OFFLINE, ZFS_OFFLINE
		Windows: FILE_ATTRIBUTE_OFFLINE

		This flag means that the file has been moved to
		offline storage.  FreeBSD does not have any special
		handling for this flag.

UF_REPARSE:	Command line name: "reparse" or "ureparse"
		ZFS name: XAT_REPARSE, ZFS_REPARSE
		Windows: FILE_ATTRIBUTE_REPARSE_POINT

		This flag means that the file is a Windows reparse
		point.  ZFS has special handling code for reparse
		points, but we don't currently have the other
		supporting infrastructure for them.

UF_HIDDEN:	Command line name: "hidden" or "uhidden"
		ZFS name: XAT_HIDDEN, ZFS_HIDDEN
		Windows: FILE_ATTRIBUTE_HIDDEN

		This flag means that the file may be excluded from
		a directory listing if the application honors it.
		FreeBSD has no special handling for this flag.

		The name and bit definition for UF_HIDDEN are
		identical to the definition in MacOS X.

UF_READONLY:	Command line name: "urdonly", "rdonly", "readonly"
		ZFS name: XAT_READONLY, ZFS_READONLY
		Windows: FILE_ATTRIBUTE_READONLY

		This flag means that the file may not written or
		appended, but its attributes may be changed.

		ZFS currently enforces this flag, but Illumos
		developers have discussed disabling enforcement.

		The behavior of this flag is different than MacOS X.
		MacOS X uses UF_IMMUTABLE to represent the DOS
		readonly permission, but that flag has a stronger
		meaning than the semantics of DOS readonly permissions.

UF_ARCHIVE:	Command line name: "uarch", "uarchive"
		ZFS_NAME: XAT_ARCHIVE, ZFS_ARCHIVE
		Windows name: FILE_ATTRIBUTE_ARCHIVE

		The UF_ARCHIVED flag means that the file has changed and
		needs to be archived.  The meaning is same as
		the Windows FILE_ATTRIBUTE_ARCHIVE attribute, and
		the ZFS XAT_ARCHIVE and ZFS_ARCHIVE attribute.

		msdosfs and ZFS have special handling for this flag.
		i.e. they will set it when the file changes.

sys/param.h:		Bump __FreeBSD_version to 1000047 for the
			addition of new stat(2) flags.

chflags.1:		Document the new command line flag names
			(e.g. "system", "hidden") available to the
			user.

ls.1:			Reference chflags(1) for a list of file flags
			and their meanings.

strtofflags.c:		Implement the mapping between the new
			command line flag names and new stat(2)
			flags.

chflags.2:		Document all of the new stat(2) flags, and
			explain the intended behavior in a little
			more detail.  Explain how they map to
			Windows file attributes.

			Different filesystems behave differently
			with respect to flags, so warn the
			application developer to take care when
			using them.

zfs_vnops.c:		Add support for getting and setting the
			UF_ARCHIVE, UF_READONLY, UF_SYSTEM, UF_HIDDEN,
			UF_REPARSE, UF_OFFLINE, and UF_SPARSE flags.

			All of these flags are implemented using
			attributes that ZFS already supports, so
			the on-disk format has not changed.

			ZFS currently doesn't allow setting the
			UF_REPARSE flag, and we don't really have
			the other infrastructure to support reparse
			points.

msdosfs_denode.c,
msdosfs_vnops.c:	Add support for getting and setting
			UF_HIDDEN, UF_SYSTEM and UF_READONLY
			in MSDOSFS.

			It supported SF_ARCHIVED, but this has been
			changed to be UF_ARCHIVE, which has the same
			semantics as the DOS archive attribute instead
			of inverse semantics like SF_ARCHIVED.

			After discussion with Bruce Evans, change
			several things in the msdosfs behavior:

			Use UF_READONLY to indicate whether a file
			is writeable instead of file permissions, but
			don't actually enforce it.

			Refuse to change attributes on the root
			directory, because it is special in FAT
			filesystems, but allow most other attribute
			changes on directories.

			Don't set the archive attribute on a directory
			when its modification time is updated.
			Windows and DOS don't set the archive attribute
			in that scenario, so we are now bug-for-bug
			compatible.

smbfs_node.c,
smbfs_vnops.c:		Add support for UF_HIDDEN, UF_SYSTEM,
			UF_READONLY and UF_ARCHIVE in SMBFS.

			This is similar to changes that Apple has
			made in their version of SMBFS (as of
			smb-583.8, posted on opensource.apple.com),
			but not quite the same.

			We map SMB_FA_READONLY to UF_READONLY,
			because UF_READONLY is intended to match
			the semantics of the DOS readonly flag.
			The MacOS X code maps both UF_IMMUTABLE
			and SF_IMMUTABLE to SMB_FA_READONLY, but
			the immutable flags have stronger meaning
			than the DOS readonly bit.

stat.h:			Add definitions for UF_SYSTEM, UF_SPARSE,
			UF_OFFLINE, UF_REPARSE, UF_ARCHIVE, UF_READONLY
			and UF_HIDDEN.

			The definition of UF_HIDDEN is the same as
			the MacOS X definition.

			Add commented-out definitions of
			UF_COMPRESSED and UF_TRACKED.  They are
			defined in MacOS X (as of 10.8.2), but we
			do not implement them (yet).

ufs_vnops.c:		Add support for getting and setting
			UF_ARCHIVE, UF_HIDDEN, UF_OFFLINE, UF_READONLY,
			UF_REPARSE, UF_SPARSE, and UF_SYSTEM in UFS.
			Alphabetize the flags that are supported.

			These new flags are only stored, UFS does
			not take any action if the flag is set.

Sponsored by:	Spectra Logic
Reviewed by:	bde (earlier version)
2013-08-21 23:04:48 +00:00
Pawel Jakub Dawidek
fe0670cfb3 Correct function name and return value. 2013-08-17 14:55:31 +00:00
John Baldwin
5aa60b6f21 Add new mmap(2) flags to permit applications to request specific virtual
address alignment of mappings.
- MAP_ALIGNED(n) requests a mapping aligned on a boundary of (1 << n).
  Requests for n >= number of bits in a pointer or less than the size of
  a page fail with EINVAL.  This matches the API provided by NetBSD.
- MAP_ALIGNED_SUPER is a special case of MAP_ALIGNED.  It can be used
  to optimize the chances of using large pages.  By default it will align
  the mapping on a large page boundary (the system is free to choose any
  large page size to align to that seems best for the mapping request).
  However, if the object being mapped is already using large pages, then
  it will align the virtual mapping to match the existing large pages in
  the object instead.
- Internally, VMFS_ALIGNED_SPACE is now renamed to VMFS_SUPER_SPACE, and
  VMFS_ALIGNED_SPACE(n) is repurposed for specifying a specific alignment.
  MAP_ALIGNED(n) maps to using VMFS_ALIGNED_SPACE(n), while
  MAP_ALIGNED_SUPER maps to VMFS_SUPER_SPACE.
- mmap() of a device object now uses VMFS_OPTIMAL_SPACE rather than
  explicitly using VMFS_SUPER_SPACE.  All device objects are forced to
  use a specific color on creation, so VMFS_OPTIMAL_SPACE is effectively
  equivalent.

Reviewed by:	alc
MFC after:	1 month
2013-08-16 21:13:55 +00:00
Jilles Tjoelker
fdafa7840f pselect(2): Add xref to sigsuspend(2). 2013-08-16 14:06:29 +00:00