7186 Commits

Author SHA1 Message Date
jilles
b6f424e548 accept(2): Update portability note for accept4().
The accept(2) man page warns that O_NONBLOCK and other properties on the
new socket may vary across implementations. However, this issue only
applies to accept() and not to accept4(). On the other hand, accept4()
is not commonly available yet.

Reported by:	pluknet
Reviewed by:	bjk
Approved by:	re (kib)
2013-10-01 21:17:18 +00:00
joel
bd6ef8adfa Minor mdoc improvements.
Approved by:	re (blanket)
2013-09-19 19:43:38 +00:00
jhb
d3ef75b6c7 Extend the support for exempting processes from being killed when swap is
exhausted.
- Add a new protect(1) command that can be used to set or revoke protection
  from arbitrary processes.  Similar to ktrace it can apply a change to all
  existing descendants of a process as well as future descendants.
- Add a new procctl(2) system call that provides a generic interface for
  control operations on processes (as opposed to the debugger-specific
  operations provided by ptrace(2)).  procctl(2) uses a combination of
  idtype_t and an id to identify the set of processes on which to operate
  similar to wait6().
- Add a PROC_SPROTECT control operation to manage the protection status
  of a set of processes.  MADV_PROTECT still works for backwards
  compatability.
- Add a p_flag2 to struct proc (and a corresponding ki_flag2 to kinfo_proc)
  the first bit of which is used to track if P_PROTECT should be inherited
  by new child processes.

Reviewed by:	kib, jilles (earlier version)
Approved by:	re (delphij)
MFC after:	1 month
2013-09-19 18:53:42 +00:00
tuexen
0524de64dc Remove an unused variable and fix a memory leak in sctp_connectx().
Approved by:	re (gjb)
MFC after:	3 days
2013-09-19 06:19:24 +00:00
bdrewery
b3237a11f6 Consistently reference file descriptors as "fd". 55 other manpages
used "fd", while these used "d" and "filedes".

MFC after:	1 week
Approved by:	gjb
Approved by:	re (delphij)
2013-09-12 00:53:38 +00:00
jhb
04bb6e10cd Add a mmap flag (MAP_32BIT) on 64-bit platforms to request that a mapping use
an address in the first 2GB of the process's address space.  This flag should
have the same semantics as the same flag on Linux.

To facilitate this, add a new parameter to vm_map_find() that specifies an
optional maximum virtual address.  While here, fix several callers of
vm_map_find() to use a VMFS_* constant for the findspace argument instead of
TRUE and FALSE.

Reviewed by:	alc
Approved by:	re (kib)
2013-09-09 18:11:59 +00:00
andrew
59c30969f9 On ARM EABI double precision floating point values are stored in the
endian the CPU is in, i.e. little-endian on most ARM cores.

This allows ARMv4 and ARMv5 boards to boot with the ARM EABI.
2013-09-07 14:04:10 +00:00
jilles
eb5a66191b wait(2): Add some possible caveats to standards section. 2013-09-07 11:41:52 +00:00
jilles
979e7776c1 libc: Make resolver sockets close-on-exec (SOCK_CLOEXEC).
Although the resolver's sockets are exposed to applications via res_state,
I do not expect them to pass the sockets across execve().
2013-09-06 23:49:54 +00:00
jilles
a0c0abfff1 libc: Use SOCK_CLOEXEC for various internal file descriptors.
This change avoids undesirably passing some internal file descriptors to a
process created (fork+exec) by another thread.

Kernel support for SOCK_CLOEXEC was added in r248534, March 19, 2013.
2013-09-06 21:02:06 +00:00
jilles
68907dc598 libc/stdio: Allow fopen/freopen modes in any order (except initial r/w/a).
Austin Group issue #411 requires 'e' to be accepted before and after 'x',
and encourages accepting the characters in any order, except the initial
'r', 'w' or 'a'.

Given that glibc accepts the characters after r/w/a in any order and that
diagnosing this problem may be hard, change our libc to behave that way as
well.
2013-09-06 13:47:16 +00:00
theraven
c04dfb0b19 Fix the namespace pollution caused by iconv.h including stdbool.h
This broke any C89 ports that defined bool themselves, including things
like gcc, gtk, and so on.
2013-09-06 09:46:44 +00:00
jilles
178dd060a8 Update some signal man pages for multithreading. 2013-09-06 09:08:40 +00:00
pjd
029a6f5d92 Change the cap_rights_t type from uint64_t to a structure that we can extend
in the future in a backward compatible (API and ABI) way.

The cap_rights_t represents capability rights. We used to use one bit to
represent one right, but we are running out of spare bits. Currently the new
structure provides place for 114 rights (so 50 more than the previous
cap_rights_t), but it is possible to grow the structure to hold at least 285
rights, although we can make it even larger if 285 rights won't be enough.

The structure definition looks like this:

	struct cap_rights {
		uint64_t	cr_rights[CAP_RIGHTS_VERSION + 2];
	};

The initial CAP_RIGHTS_VERSION is 0.

The top two bits in the first element of the cr_rights[] array contain total
number of elements in the array - 2. This means if those two bits are equal to
0, we have 2 array elements.

The top two bits in all remaining array elements should be 0.
The next five bits in all array elements contain array index. Only one bit is
used and bit position in this five-bits range defines array index. This means
there can be at most five array elements in the future.

To define new right the CAPRIGHT() macro must be used. The macro takes two
arguments - an array index and a bit to set, eg.

	#define	CAP_PDKILL	CAPRIGHT(1, 0x0000000000000800ULL)

We still support aliases that combine few rights, but the rights have to belong
to the same array element, eg:

	#define	CAP_LOOKUP	CAPRIGHT(0, 0x0000000000000400ULL)
	#define	CAP_FCHMOD	CAPRIGHT(0, 0x0000000000002000ULL)

	#define	CAP_FCHMODAT	(CAP_FCHMOD | CAP_LOOKUP)

There is new API to manage the new cap_rights_t structure:

	cap_rights_t *cap_rights_init(cap_rights_t *rights, ...);
	void cap_rights_set(cap_rights_t *rights, ...);
	void cap_rights_clear(cap_rights_t *rights, ...);
	bool cap_rights_is_set(const cap_rights_t *rights, ...);

	bool cap_rights_is_valid(const cap_rights_t *rights);
	void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src);
	void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src);
	bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);

Capability rights to the cap_rights_init(), cap_rights_set(),
cap_rights_clear() and cap_rights_is_set() functions are provided by
separating them with commas, eg:

	cap_rights_t rights;

	cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);

There is no need to terminate the list of rights, as those functions are
actually macros that take care of the termination, eg:

	#define	cap_rights_set(rights, ...)				\
		__cap_rights_set((rights), __VA_ARGS__, 0ULL)
	void __cap_rights_set(cap_rights_t *rights, ...);

Thanks to using one bit as an array index we can assert in those functions that
there are no two rights belonging to different array elements provided
together. For example this is illegal and will be detected, because CAP_LOOKUP
belongs to element 0 and CAP_PDKILL to element 1:

	cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);

Providing several rights that belongs to the same array's element this way is
correct, but is not advised. It should only be used for aliases definition.

This commit also breaks compatibility with some existing Capsicum system calls,
but I see no other way to do that. This should be fine as Capsicum is still
experimental and this change is not going to 9.x.

Sponsored by:	The FreeBSD Foundation
2013-09-05 00:09:56 +00:00
rwatson
e6c5cc6ac3 Document SIGLIBRT in signal(3); take a stab at the signal description as
the original committer didn't provide one.

MFC after:	3 days
2013-09-03 08:19:06 +00:00
jilles
73eea4eee6 system(): Restore behaviour for SIGINT and SIGQUIT.
As mentioned in r16117 and the book "Advanced Programming in the Unix
Environment" by W. Richard Stevens, we should ignore SIGINT and SIGQUIT
before forking, since it is not guaranteed that the parent process starts
running soon enough.

To avoid calling sigaction() in the vforked child, instead block SIGINT and
SIGQUIT before vfork() and keep the sigaction() to ignore after vfork(). The
FreeBSD kernel discards ignored signals, even if they are blocked;
therefore, it is not necessary to unblock SIGINT and SIGQUIT earlier.
2013-09-01 19:59:54 +00:00
jilles
ee4b8e07a8 libc: Always use our own copy of sys_errlist and sys_nerr (.so only).
This ensures strerror() and friends continue to work correctly even if a
(non-PIE) executable linked against an older libc imports sys_errlist (which
causes sys_errlist to refer to the executable's copy with a size fixed when
that executable was linked).

The executable's use of sys_errlist remains broken because it uses the
current value of sys_nerr and may access past the bounds of the array.

Different from the message "Using sys_errlist from executables is not
ABI-stable" on freebsd-arch, this change does not affect the static library.
There seems no reason to prevent overriding the error messages in the static
library.
2013-08-31 22:32:42 +00:00
rwatson
1dbe7f4b10 Xref capsicum(4) and procdesc(4) from pdfork(2).
Suggested by:	sbruno
MFC after:	3 days
2013-08-28 20:00:25 +00:00
jilles
d4eb686387 wordexp(): Avoid leaking the pipe file descriptors to a parallel fork/exec.
This uses the new pipe2() system call added on May 1 (r250159).
2013-08-27 21:47:01 +00:00
jilles
ea95259d98 libc: Access some unexported variables more efficiently (related to stdio). 2013-08-23 14:23:54 +00:00
jilles
d2eb50cd0c libc: Make various internal file descriptors from fopen() close-on-exec. 2013-08-23 13:59:47 +00:00
joel
08d86e8646 Remove EOL whitespace. 2013-08-22 16:02:20 +00:00
ken
c7af094e18 Expand the use of stat(2) flags to allow storing some Windows/DOS
and CIFS file attributes as BSD stat(2) flags.

This work is intended to be compatible with ZFS, the Solaris CIFS
server's interaction with ZFS, somewhat compatible with MacOS X,
and of course compatible with Windows.

The Windows attributes that are implemented were chosen based on
the attributes that ZFS already supports.

The summary of the flags is as follows:

UF_SYSTEM:	Command line name: "system" or "usystem"
		ZFS name: XAT_SYSTEM, ZFS_SYSTEM
		Windows: FILE_ATTRIBUTE_SYSTEM

		This flag means that the file is used by the
		operating system.  FreeBSD does not enforce any
		special handling when this flag is set.

UF_SPARSE:	Command line name: "sparse" or "usparse"
		ZFS name: XAT_SPARSE, ZFS_SPARSE
		Windows: FILE_ATTRIBUTE_SPARSE_FILE

		This flag means that the file is sparse.  Although
		ZFS may modify this in some situations, there is
		not generally any special handling for this flag.

UF_OFFLINE:	Command line name: "offline" or "uoffline"
		ZFS name: XAT_OFFLINE, ZFS_OFFLINE
		Windows: FILE_ATTRIBUTE_OFFLINE

		This flag means that the file has been moved to
		offline storage.  FreeBSD does not have any special
		handling for this flag.

UF_REPARSE:	Command line name: "reparse" or "ureparse"
		ZFS name: XAT_REPARSE, ZFS_REPARSE
		Windows: FILE_ATTRIBUTE_REPARSE_POINT

		This flag means that the file is a Windows reparse
		point.  ZFS has special handling code for reparse
		points, but we don't currently have the other
		supporting infrastructure for them.

UF_HIDDEN:	Command line name: "hidden" or "uhidden"
		ZFS name: XAT_HIDDEN, ZFS_HIDDEN
		Windows: FILE_ATTRIBUTE_HIDDEN

		This flag means that the file may be excluded from
		a directory listing if the application honors it.
		FreeBSD has no special handling for this flag.

		The name and bit definition for UF_HIDDEN are
		identical to the definition in MacOS X.

UF_READONLY:	Command line name: "urdonly", "rdonly", "readonly"
		ZFS name: XAT_READONLY, ZFS_READONLY
		Windows: FILE_ATTRIBUTE_READONLY

		This flag means that the file may not written or
		appended, but its attributes may be changed.

		ZFS currently enforces this flag, but Illumos
		developers have discussed disabling enforcement.

		The behavior of this flag is different than MacOS X.
		MacOS X uses UF_IMMUTABLE to represent the DOS
		readonly permission, but that flag has a stronger
		meaning than the semantics of DOS readonly permissions.

UF_ARCHIVE:	Command line name: "uarch", "uarchive"
		ZFS_NAME: XAT_ARCHIVE, ZFS_ARCHIVE
		Windows name: FILE_ATTRIBUTE_ARCHIVE

		The UF_ARCHIVED flag means that the file has changed and
		needs to be archived.  The meaning is same as
		the Windows FILE_ATTRIBUTE_ARCHIVE attribute, and
		the ZFS XAT_ARCHIVE and ZFS_ARCHIVE attribute.

		msdosfs and ZFS have special handling for this flag.
		i.e. they will set it when the file changes.

sys/param.h:		Bump __FreeBSD_version to 1000047 for the
			addition of new stat(2) flags.

chflags.1:		Document the new command line flag names
			(e.g. "system", "hidden") available to the
			user.

ls.1:			Reference chflags(1) for a list of file flags
			and their meanings.

strtofflags.c:		Implement the mapping between the new
			command line flag names and new stat(2)
			flags.

chflags.2:		Document all of the new stat(2) flags, and
			explain the intended behavior in a little
			more detail.  Explain how they map to
			Windows file attributes.

			Different filesystems behave differently
			with respect to flags, so warn the
			application developer to take care when
			using them.

zfs_vnops.c:		Add support for getting and setting the
			UF_ARCHIVE, UF_READONLY, UF_SYSTEM, UF_HIDDEN,
			UF_REPARSE, UF_OFFLINE, and UF_SPARSE flags.

			All of these flags are implemented using
			attributes that ZFS already supports, so
			the on-disk format has not changed.

			ZFS currently doesn't allow setting the
			UF_REPARSE flag, and we don't really have
			the other infrastructure to support reparse
			points.

msdosfs_denode.c,
msdosfs_vnops.c:	Add support for getting and setting
			UF_HIDDEN, UF_SYSTEM and UF_READONLY
			in MSDOSFS.

			It supported SF_ARCHIVED, but this has been
			changed to be UF_ARCHIVE, which has the same
			semantics as the DOS archive attribute instead
			of inverse semantics like SF_ARCHIVED.

			After discussion with Bruce Evans, change
			several things in the msdosfs behavior:

			Use UF_READONLY to indicate whether a file
			is writeable instead of file permissions, but
			don't actually enforce it.

			Refuse to change attributes on the root
			directory, because it is special in FAT
			filesystems, but allow most other attribute
			changes on directories.

			Don't set the archive attribute on a directory
			when its modification time is updated.
			Windows and DOS don't set the archive attribute
			in that scenario, so we are now bug-for-bug
			compatible.

smbfs_node.c,
smbfs_vnops.c:		Add support for UF_HIDDEN, UF_SYSTEM,
			UF_READONLY and UF_ARCHIVE in SMBFS.

			This is similar to changes that Apple has
			made in their version of SMBFS (as of
			smb-583.8, posted on opensource.apple.com),
			but not quite the same.

			We map SMB_FA_READONLY to UF_READONLY,
			because UF_READONLY is intended to match
			the semantics of the DOS readonly flag.
			The MacOS X code maps both UF_IMMUTABLE
			and SF_IMMUTABLE to SMB_FA_READONLY, but
			the immutable flags have stronger meaning
			than the DOS readonly bit.

stat.h:			Add definitions for UF_SYSTEM, UF_SPARSE,
			UF_OFFLINE, UF_REPARSE, UF_ARCHIVE, UF_READONLY
			and UF_HIDDEN.

			The definition of UF_HIDDEN is the same as
			the MacOS X definition.

			Add commented-out definitions of
			UF_COMPRESSED and UF_TRACKED.  They are
			defined in MacOS X (as of 10.8.2), but we
			do not implement them (yet).

ufs_vnops.c:		Add support for getting and setting
			UF_ARCHIVE, UF_HIDDEN, UF_OFFLINE, UF_READONLY,
			UF_REPARSE, UF_SPARSE, and UF_SYSTEM in UFS.
			Alphabetize the flags that are supported.

			These new flags are only stored, UFS does
			not take any action if the flag is set.

Sponsored by:	Spectra Logic
Reviewed by:	bde (earlier version)
2013-08-21 23:04:48 +00:00
pjd
b717fb9f08 Implement fdclosedir(3) function, which is equivalent to the closedir(3)
function, but returns directory file descriptor instead of closing it.

Submitted by:	Mariusz Zaborski <oshogbo@FreeBSD.org>
Sponsored by:	Google Summer of Code 2013
2013-08-18 20:11:34 +00:00
pjd
ab20de7f07 Remove redundant space. 2013-08-18 20:06:35 +00:00
jilles
836cb97bd1 dup3(3): Replace copyright notice.
Although I copied dup(2) to create dup3(3), I removed almost all the
non-boilerplate, so dup3(3) is copyright me.

Reported by:	bjk
2013-08-18 13:25:18 +00:00
pjd
76babb36d9 Consistently use 'af' as an argument name for address family.
Now both gethostbyname2(3) and gethostbyaddr(3) use the same argument name.
The same argument name is also used in implementations of those functions.
2013-08-18 10:38:59 +00:00
pjd
6e8f2d8487 Make example more correct (errstr is a pointer, not boolean). 2013-08-18 10:33:46 +00:00
jilles
fd29e78a68 libc: Access _logname_valid more efficiently.
The variable _logname_valid is not exported via the version script;
therefore, change C and i386/amd64 assembler code to remove indirection
(which allowed interposition). This makes the code slightly smaller and
faster.

Also, remove #define PIC_GOT from i386/amd64 in !PIC mode. Without PIC,
there is no place containing the address of each variable, so there is no
possible definition for PIC_GOT.
2013-08-17 19:24:58 +00:00
pjd
ac8f6c2ee4 Correct function name and return value. 2013-08-17 14:55:31 +00:00
jhb
3bfcb89de4 Add new mmap(2) flags to permit applications to request specific virtual
address alignment of mappings.
- MAP_ALIGNED(n) requests a mapping aligned on a boundary of (1 << n).
  Requests for n >= number of bits in a pointer or less than the size of
  a page fail with EINVAL.  This matches the API provided by NetBSD.
- MAP_ALIGNED_SUPER is a special case of MAP_ALIGNED.  It can be used
  to optimize the chances of using large pages.  By default it will align
  the mapping on a large page boundary (the system is free to choose any
  large page size to align to that seems best for the mapping request).
  However, if the object being mapped is already using large pages, then
  it will align the virtual mapping to match the existing large pages in
  the object instead.
- Internally, VMFS_ALIGNED_SPACE is now renamed to VMFS_SUPER_SPACE, and
  VMFS_ALIGNED_SPACE(n) is repurposed for specifying a specific alignment.
  MAP_ALIGNED(n) maps to using VMFS_ALIGNED_SPACE(n), while
  MAP_ALIGNED_SUPER maps to VMFS_SUPER_SPACE.
- mmap() of a device object now uses VMFS_OPTIMAL_SPACE rather than
  explicitly using VMFS_SUPER_SPACE.  All device objects are forced to
  use a specific color on creation, so VMFS_OPTIMAL_SPACE is effectively
  equivalent.

Reviewed by:	alc
MFC after:	1 month
2013-08-16 21:13:55 +00:00
jilles
a43a1c528c pselect(2): Add xref to sigsuspend(2). 2013-08-16 14:06:29 +00:00
jilles
1c4bb0bb48 Add man page dup3(3). 2013-08-16 13:16:27 +00:00
jilles
020684e443 Add dup3(), based on F_DUP2FD and F_DUP2FD_CLOEXEC fcntls.
I removed functionality not proposed for POSIX in Austin group issue #411.
A man page (my own) and test cases will follow in later commits.

PR:		176233
Submitted by:	Jukka Ukkonen
2013-08-16 13:10:30 +00:00
jilles
bdb743702f sigsuspend(2): Add xrefs to pselect(2) and sigwait-alikes. 2013-08-15 22:33:27 +00:00
jilles
e3e0bd874e libc: Use O_CLOEXEC when writing gmon files (cc -pg). 2013-08-13 21:45:48 +00:00
peter
4fb136d770 vfork(2) was listed as deprecated in 1994 (r1573) and was the false
reports of its impending demise were removed in 2009 (r199257).

However, in 1996 (r16117) system(3) was switched from vfork(2) to
fork(2) based partly on this.  Switch back to vfork(2).  This has a
dramatic effect in cases of extreme mmap use - such as excessive
abuse (500+) of shared libraries.

popen(3) has used vfork(2) for a while.  vfork(2) isn't going anywhere.
2013-08-13 20:38:55 +00:00
jilles
9381206c83 db: Use O_CLOEXEC instead of separate fcntl() call. 2013-08-13 19:20:50 +00:00
peter
39612cec78 Expose _citrus_bcs_trunc_rws_len for libintl's use.
Submitted by:	Jan Beich <jbeich@tormail.org>
2013-08-13 18:14:53 +00:00
peter
995e1f0063 The iconv in libc did two things - implement the standard APIs, the GNU
extensions and also tried to be link time compatible with ports libiconv.
This splits that functionality and enables the parts that shouldn't
interfere with the port by default.

WITH_ICONV (now on by default) - adds iconv.h, iconv_open(3) etc.
WITH_LIBICONV_COMPAT (off by default) adds the libiconv_open etc API, linker
symbols and even a stub libiconv.so.3 that are good enough to be able
to 'pkg delete -f libiconv' on a running system and reasonably expect it
to work.

I have tortured many machines over the last few days to try and reduce
the possibilities of foot-shooting as much as I can.  I've successfully
recompiled to enable and disable the libiconv_compat modes, ports that use
libiconv alongside system iconv etc.  If you don't enable the
WITH_LIBICONV_COMPAT switch, they don't share symbol space.

This is an extension of behavior on other system.  iconv(3) is a standard
libc interface and libiconv port expects to be able to run alongside it on
systems that have it.

Bumped osreldate.
2013-08-13 07:15:01 +00:00
jilles
3bf0adb320 db/hash: Use O_CLOEXEC instead of separate fcntl() call.
In particular, a hash db is used by getpwnam() and getpwuid().

MFC after:	1 week
2013-08-11 15:38:48 +00:00
jilles
2895e1352c Add mkostemp() and mkostemps().
These are like mkstemp() and mkstemps() but allow passing open(2) flags like
O_CLOEXEC.
2013-08-09 17:24:23 +00:00
ache
0a3b2e376d According to POSIX \ in the fnmatch(3) pattern should escape
any character including '\0', but our version replace escaped '\0'
with '\\'.
I.e. fnmatch("\\", "\\", 0) should not match while fnmatch("\\", "", 0)
should (Linux and NetBSD does the same). Was vice versa.

PR:     181129
MFC after:      1 week
2013-08-08 09:04:02 +00:00
peter
4f61f84d69 Our libc iconv (unlike gnu iconv and the citrus code in NetBSD) has a
bypass mode when src == dst.  Unfortunately, there are tools in ports
that pass byte streams through iconv to determine if the encodings
are valid.  eg: gettext-0.18.3+.

Disable the optimization and behave like the other implementations.
2013-08-08 01:53:27 +00:00
avg
4e6c4b2a36 Revert r253748,253749
This WIP should not have been committed yet.

Pointyhat to:	avg
2013-07-28 18:44:17 +00:00
avg
c8737cbf1c remove needless inclusion of machine/cpu.h in userland
MFC after:	21 days
2013-07-28 18:35:43 +00:00
zont
d47da97be7 Remove define and documentation for vm_pageout_algorithm missed in r253587 2013-07-26 02:00:06 +00:00
jhb
2ee9cbbc0f Enhance the description of NOTE_TRACK:
- NOTE_TRACK has never triggered a NOTE_TRACK event from the parent pid.
  If NOTE_FORK is set, the listener will get a NOTE_FORK event from
  the parent pid, but not a separate NOTE_TRACK event.
- Explicitly note that the event added to monitor the child process
  preserves the fflags from the original event.
- Move the description of NOTE_TRACKERR under NOTE_TRACK as it is not a
  bit for the user to set (which is what this list pupports to be).
  Also, explicitly note that if an error occurs, the NOTE_CHILD event
  will not be generated.

MFC after:	1 week
2013-07-25 19:34:24 +00:00
jilles
bc9fec6137 wordexp(): Fix syntax validation for backslashes in single-quotes. 2013-07-23 21:09:26 +00:00
emaste
a3c7be9ea2 Document EINVAL error return from PT_LWPINFO 2013-07-22 18:18:21 +00:00