freebsd-nq

Author	SHA1	Message	Date
Michael Tuexen	db4493f7b6	sendfile() does currently not support SCTP sockets. Therefore, fail the call. Reviewed by: markj@ MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D24059	2020-03-13 18:38:28 +00:00
Kirk McKusick	95ca762da8	When mounting a UFS filesystem, return EINTEGRITY rather than EIO when a superblock check-hash error is detected. This change clarifies a mount that failed due to media hardware failures (EIO) from a mount that failed due to media errors (EINTEGRITY) that can be corrected by running fsck(8). Sponsored by: Netflix	2020-03-11 21:00:40 +00:00
Ed Maste	0a052459e6	umtx_op.2: correct typo PR: 244611 Submitted by: John F. Carr <jfc@mit.edu> MFC after: 3 days	2020-03-05 15:51:44 +00:00
Mateusz Piotrowski	a81c96922d	thr_self.2: Fix some typos in the thread identifier range Reported by: kaktus Approved by: bcr (mentor) Differential Revision: https://reviews.freebsd.org/D23936	2020-03-03 09:51:53 +00:00
Ed Maste	acb8858f05	Return ENOTSUP for mmap/mprotect if prot not subset of prot_max From POSIX, [ENOTSUP] The implementation does not support the combination of accesses requested in the prot argument. This fits the case that prot contains permissions which are not a subset of prot_max. Reviewed by: brooks, cem Relnotes: Yes Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D23843	2020-02-26 20:03:43 +00:00
Warner Losh	a5b6c2960d	Remove sparc64 specific parts of libc. Also update comments for which architectures use 128 bit long doubles, as appropriate. The softfloat specialization routines weren't updated since they appear to be from an upstream source which we may want to update in the future to get a more favorable license. Reviewed by: emaste@ Differential Revision: https://reviews.freebsd.org/D23658	2020-02-26 18:55:09 +00:00
Ed Maste	5c6d07fb9c	mprotect.2: sort errors alphabetically Reported by: brooks MFC after: 3 days	2020-02-26 18:46:41 +00:00
Eric van Gyzen	3ae8839afe	truncate(2): extending the file is required by POSIX 2008 Update the man page to mention that extending a file with truncate(2) is required by POSIX as of 2008. Reviewed by: bcr MFC after: 2 weeks Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D23354	2020-02-20 23:47:09 +00:00
Konstantin Belousov	146fc63fce	Add a way to manage thread signal mask using shared word, instead of syscall. A new syscall sigfastblock(2) is added which registers a uint32_t variable as containing the count of blocks for signal delivery. Its content is read by kernel on each syscall entry and on AST processing, non-zero count of blocks is interpreted same as the signal mask blocking all signals. The biggest downside of the feature that I see is that memory corruption that affects the registered fast sigblock location, would cause quite strange application misbehavior. For instance, the process would be immune to ^C (but killable by SIGKILL). With consumers (rtld and libthr added), benchmarks do not show a slow-down of the syscalls in micro-measurements, and macro benchmarks like buildworld do not demonstrate a difference. Part of the reason is that buildworld time is dominated by compiler, and clang already links to libthr. On the other hand, small utilities typically used by shell scripts have the total number of syscalls cut by half. The syscall is not exported from the stable libc version namespace on purpose. It is intended to be used only by our C runtime implementation internals. Tested by: pho Disscussed with: cem, emaste, jilles Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D12773	2020-02-09 11:53:12 +00:00
Kyle Evans	6a5abb1ee5	Provide O_SEARCH O_SEARCH is defined by POSIX [0] to open a directory for searching, skipping permissions checks on the directory itself after the initial open(). This is close to the semantics we've historically applied for O_EXEC on a directory, which is UB according to POSIX. Conveniently, O_SEARCH on a file is also explicitly undefined behavior according to POSIX, so O_EXEC would be a fine choice. The spec goes on to state that O_SEARCH and O_EXEC need not be distinct values, but they're not defined to be the same value. This was pointed out as an incompatibility with other systems that had made its way into libarchive, which had assumed that O_EXEC was an alias for O_SEARCH. This defines compatibility O_SEARCH/FSEARCH (equivalent to O_EXEC and FEXEC respectively) and expands our UB for O_EXEC on a directory. O_EXEC on a directory is checked in vn_open_vnode already, so for completeness we add a NOEXECCHECK when O_SEARCH has been specified on the top-level fd and do not re-check that when descending in namei. [0] https://pubs.opengroup.org/onlinepubs/9699919799/ Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D23247	2020-02-02 16:34:57 +00:00
Mateusz Guzik	d3cc535474	vfs: provide F_ISUNIONSTACK as a kludge for libc Prior to introduction of this op libc's readdir would call fstatfs(2), in effect unnecessarily copying kilobytes of data just to check fs name and a mount flag. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D23162	2020-01-17 14:42:25 +00:00
Conrad Meyer	86def3dcd6	getrandom(2): Add Linux GRND_INSECURE API flag Treat it as a synonym for GRND_NONBLOCK. The reasoning is this: We have two choices for handling Linux's GRND_INSECURE API flag. 1. We could ignore it completely (like GRND_RANDOM). However, this might produce the surprising result of GRND_INSECURE requests blocking, when the Linux API does not block. 2. Alternatively, we could treat GRND_INSECURE requests as requests for GRND_NONBLOCk. Here, the surprising result for Linux programs is that invocations with unseeded random(4) will produce EAGAIN, rather than garbage. Honoring the flag in the way Linux does seems fraught. If we actually use the output of a random(4) implementation prior to seeding, we leak some entropy (in an information theory and also practical sense) from what will be the initial seed to attackers (or allow attackers to arbitrary DoS initial seeding, if we don't leak). This seems unacceptable -- it defeats the purpose of blocking on initial seeding. Secondary to that concern, before seeding we may have arbitrarily little entropy collected; producing output from zero or a handful of entropy bits does not seem particularly useful to userspace. If userspace can accept garbage, insecure, non-random bytes, they can create their own insecure garbage with srandom(time(NULL)) or similar. Any program which would be satisfied with a 3-bit key CTR stream has no need for CSPRNG bytes. So asking the kernel to produce such an output from the secure getrandom(2) API seems inane. For now, we've elected to emulate GRND_INSECURE as an alternative spelling of GRND_NONBLOCK (2). Consider this API not-quite stable for now. We guarantee it will never block. But we will attempt to monitor actual port uptake of this bizarre API and may revise our plans for the unseeded behavior (prior stable/13 branching). Approved by: csprng(markm), manpages(bcr) See also: https://lwn.net/ml/linux-kernel/cover.1577088521.git.luto@kernel.org/ See also: https://lwn.net/ml/linux-kernel/20200107204400.GH3619@mit.edu/ Differential Revision: https://reviews.freebsd.org/D23130	2020-01-12 20:47:38 +00:00
Kyle Evans	2856d85ecb	posix_fallocate: push vnop implementation into the fileop layer This opens the door for other descriptor types to implement posix_fallocate(2) as needed. Reviewed by: kib, bcr (manpages) Differential Revision: https://reviews.freebsd.org/D23042	2020-01-08 19:05:32 +00:00
Konstantin Belousov	0cc9fb7551	Only return EPERM from kill(-pid) when no process was signalled. As mandated by POSIX. Also clarify the kill(2) manpage. While there, restructure the code in killpg1() to use helper which keeps overall state of the process list iteration in the killpg1_ctx structued, later used to infer the error returned. Reported by: amdmi3 Reviewed by: jilles Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D22621	2019-12-07 18:07:49 +00:00
Alan Somers	8d3443b1fc	clock_gettime(2): add a HISTORY section MFC after: 2 weeks	2019-12-07 16:45:12 +00:00
Alan Somers	fbf7102d14	lio_listio(2): add a HISTORY section MFC after: 2 weeks	2019-12-07 16:29:56 +00:00
Warner Losh	f86e60008b	Regularize my copyright notice o Remove All Rights Reserved from my notices o imp@FreeBSD.org everywhere o regularize punctiation, eliminate date ranges o Make sure that it's clear that I don't claim All Rights reserved by listing All Rights Reserved on same line as other copyright holders (but not me). Other such holders are also listed last where it's clear.	2019-12-04 16:56:11 +00:00
Mark Johnston	a6d05b9be7	Fix typos in the cpuset_{get,set}domain() man page. MFC after: 1 week Sponsored by: The FreeBSD Foundation	2019-11-22 16:25:00 +00:00
Rick Macklem	51e069ac10	Update the copy_file_range man page to reflect the semantic change done by r354574. This is a content change.	2019-11-10 01:13:41 +00:00
Rick Macklem	fef163e117	Update the copy_file_range.2 man page to reflect the semantic change implemented by r354564. This is a content change.	2019-11-08 23:49:27 +00:00
Kyle Evans	142c5c8c36	memfd_create(3): Don't actually force hugetlb size with MFD_HUGETLB The size flags are only required to select a size on systems that support multiple sizes. MFD_HUGETLB by itself is valid.	2019-09-29 17:30:10 +00:00
Warner Losh	ab311b7f12	Revert the mode_t -> int changes and add a warning in the BUGS section instead. While FreeBSD's implementation of these expect an int inside of libc, that's an implementation detail that we can hide from the user as it's the natural promotion of the current mode_t type and before it is used in the kernel, it's converted back to the narrower type that's the current definition of mode_t. As such, documenting int is at best confusing and at worst misleading. Instead add a note that these args are variadic and as such calling conventions may differ from non-variadic arguments.	2019-09-28 17:15:48 +00:00
Warner Losh	4470d73996	Document varadic args as int, since you can't have short varadic args (they are promoted to ints). - `mode_t` is `uint16_t` (`sys/sys/_types.h`) - `openat` takes variadic args - variadic args cannot be 16-bit, and indeed the code uses int - the manpage currently kinda implies the argument is 16-bit by saying `mode_t` Prompted by Rust things: https://github.com/tailhook/openat/issues/21 Submitted by: Greg V at unrelenting Differential Revision: https://reviews.freebsd.org/D21816	2019-09-27 16:11:47 +00:00
Kyle Evans	e12ff89136	Further normalize copyright notices - s/C/c/ where I've been inconsistent about it - +SPDX tags - Remove "All rights reserved" where possible Requested by: rgrimes (all rights reserved)	2019-09-26 16:19:22 +00:00
David Bright	d4f4430503	Correct mistake in MLINKS introduced in r352747 Messed up a merge conflict resolution and didn't catch that before commit. Sponsored by: Dell EMC Isilon	2019-09-26 16:13:17 +00:00
David Bright	9afb12bab4	Add an shm_rename syscall Add an atomic shm rename operation, similar in spirit to a file rename. Atomically unlink an shm from a source path and link it to a destination path. If an existing shm is linked at the destination path, unlink it as part of the same atomic operation. The caller needs the same permissions as shm_unlink to the shm being renamed, and the same permissions for the shm at the destination which is being unlinked, if it exists. If those fail, EACCES is returned, as with the other shm_* syscalls. truss support is included; audit support will come later. This commit includes only the implementation; the sysent-generated bits will come in a follow-on commit. Submitted by: Matthew Bryan <matthew.bryan@isilon.com> Reviewed by: jilles (earlier revision) Reviewed by: brueffer (manpages, earlier revision) Relnotes: yes Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D21423	2019-09-26 15:32:28 +00:00
Kyle Evans	a631497fca	Add SPDX tags to recently added files Reported by: Pawel Biernacki	2019-09-25 22:53:30 +00:00
Kyle Evans	c34a5f16fa	posix_spawn(3): handle potential signal issues with vfork Described in [1], signal handlers running in a vfork child have opportunities to corrupt the parent's state. Address this by adding a new rfork(2) flag, RFSPAWN, that has vfork(2) semantics but also resets signal handlers in the child during creation. x86 uses rfork_thread(3) instead of a direct rfork(2) because rfork with RFMEM/RFSPAWN cannot work when the return address is stored on the stack -- further information about this problem is described under RFMEM in the rfork(2) man page. Addressing this has been identified as a prerequisite to using posix_spawn in subprocess on FreeBSD [2]. [1] https://ewontfix.com/7/ [2] https://bugs.python.org/issue35823 Reviewed by: jilles, kib Differential Revision: https://reviews.freebsd.org/D19058	2019-09-25 19:22:03 +00:00
Kyle Evans	079c5b9ed8	rfork(2): add RFSPAWN flag When RFSPAWN is passed, rfork exhibits vfork(2) semantics but also resets signal handlers in the child during creation to avoid a point of corruption of parent state from the child. This flag will be used by posix_spawn(3) to handle potential signal issues. Reviewed by: jilles, kib Differential Revision: https://reviews.freebsd.org/D19058	2019-09-25 19:20:41 +00:00
Kyle Evans	a9ac5e1424	sysent: regenerate after r352705 This also implements it, fixes kdump, and removes no longer needed bits from lib/libc/sys/shm_open.c for the interim.	2019-09-25 18:09:19 +00:00
Kyle Evans	3e25d1fb61	Add linux-compatible memfd_create memfd_create is effectively a SHM_ANON shm_open(2) mapping with optional CLOEXEC and file sealing support. This is used by some mesa parts, some linux libs, and qemu can also take advantage of it and uses the sealing to prevent resizing the region. This reimplements shm_open in terms of shm_open2(2) at the same time. shm_open(2) will be moved to COMPAT12 shortly. Reviewed by: markj, kib Differential Revision: https://reviews.freebsd.org/D21393	2019-09-25 18:03:18 +00:00
Kyle Evans	f17221ee7a	Update fcntl(2) after r352695	2019-09-25 17:33:12 +00:00
Sean Eric Fagan	ba7a55d934	Add two options to allow mount to avoid covering up existing mount points. The two options are * nocover/cover: Prevent/allow mounting over an existing root mountpoint. E.g., "mount -t ufs -o nocover /dev/sd1a /usr/local" will fail if /usr/local is already a mountpoint. * emptydir/noemptydir: Prevent/allow mounting on a non-empty directory. E.g., "mount -t ufs -o emptydir /dev/sd1a /usr" will fail. Neither of these options is intended to be a default, for historical and compatibility reasons. Reviewed by: allanjude, kib Differential Revision: https://reviews.freebsd.org/D21458	2019-09-23 04:28:07 +00:00
Konstantin Belousov	55894117b1	Return EISDIR when directory is opened with O_CREAT without O_DIRECTORY. Reviewed by: bcr (man page), emaste (previous version) PR: 240452 Sponsored by: The FreeBSD Foundation MFC after: 1 week DIfferential revision: https://reviews.freebsd.org/D21634	2019-09-17 18:32:18 +00:00
Alan Somers	8d910a4282	getsockopt.2: clarify that SO_TIMESTAMP is not 100% reliable When SO_TIMESTAMP is set, the kernel will attempt to attach a timestamp as ancillary data to each IP datagram that is received on the socket. However, it may fail, for example due to insufficient memory. In that case the packet will still be received but not timestamp will be attached. Reviewed by: kib MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D21607	2019-09-11 19:48:32 +00:00
Mitchell Horne	d1bc2d79f2	Fix cpuwhich_t column width Not bumping .Dd since this is purely a format change. Approved by: markj (mentor)	2019-09-08 21:37:52 +00:00
Konstantin Belousov	fe69291ff4	Add procctl(PROC_STACKGAP_CTL) It allows a process to request that stack gap was not applied to its stacks, retroactively. Also it is possible to control the gaps in the process after exec. PR: 239894 Reviewed by: alc Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D21352	2019-09-03 18:56:25 +00:00
Mateusz Guzik	d05b53e0ba	Add sysctlbyname system call Previously userspace would issue one syscall to resolve the sysctl and then another one to actually use it. Do it all in one trip. Fallback is provided in case newer libc happens to be running on an older kernel. Submitted by: Pawel Biernacki Reported by: kib, brooks Differential Revision: https://reviews.freebsd.org/D17282	2019-09-03 04:16:30 +00:00
Ed Maste	3afdc7303c	Add @generated tag to libc syscall asm wrappers Although libc syscall wrappers do not get checked in this can aid in finding the source of generated files when spelunking in the objdir. Multiple tools use @generated to identify generated files (for example, in a review Phabricator will by default hide diffs in generated files). For consistency use the @generated tag in makesyscalls.sh as we've done for other generated files, even though these wrappers aren't checked in to the tree.	2019-08-16 14:14:57 +00:00
Konstantin Belousov	a60c863ced	wait(2): clarify reparenting of children of the exiting process. Point to the existence of reapers and mention that init is the default reaper. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2019-08-11 15:47:48 +00:00
Konstantin Belousov	cd6a6b772d	wait(2): split long line by using .Fo/.Fa instead of .Ft. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2019-08-11 15:44:36 +00:00
Benjamin Kaduk	1f0a85545e	Fix grammar nit in copy_file_range docs Bytes are countable, so we have fewer of them, not less of them.	2019-07-25 15:43:15 +00:00
Rick Macklem	78756b9e6f	Add libc support for the copy_file_range(2) syscall added by r350315. copy_file_range.2 is a new man page (content change). Reviewed by: kib, asomers Relnotes: yes Differential Revision: https://reviews.freebsd.org/D20584	2019-07-25 06:05:49 +00:00
John Baldwin	32451fb9fc	Add ptrace op PT_GET_SC_RET. This ptrace operation returns a structure containing the error and return values from the current system call. It is only valid when a thread is stopped during a system call exit (PL_FLAG_SCX is set). The sr_error member holds the error value from the system call. Note that this error value is the native FreeBSD error value that has _not_ been translated to an ABI-specific error value similar to the values logged to ktrace. If sr_error is zero, then the return values of the system call will be set in sr_retval[0] and sr_retval[1]. Reviewed by: kib MFC after: 1 month Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D20901	2019-07-15 21:48:02 +00:00
Konstantin Belousov	8c95181495	Document atomicity for read(2) and write(2). Take part of the text from POSIX 2018 edition and describe the atomicity requirements for read and write syscalls. See p1003.1-2018, Vol.2, 2.9.7 Threads interaction with Regular File Operations. Reviewed by: asomers Sponsored by: The FreeBSD Foundation MFC after: 3 days Differential revision: https://reviews.freebsd.org/D20867	2019-07-06 20:31:37 +00:00
Konstantin Belousov	5dc7e31a09	Control implicit PROT_MAX() using procctl(2) and the FreeBSD note feature bit. In particular, allocate the bit to opt-out the image from implicit PROTMAX enablement. Provide procctl(2) verbs to set and query implicit PROTMAX handling. The knobs mimic the same per-image flag and per-process controls for ASLR. Reviewed by: emaste, markj (previous version) Discussed with: brooks Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D20795	2019-07-02 19:07:17 +00:00
Konstantin Belousov	e0a126f6d2	Typo. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2019-06-28 16:42:44 +00:00
Brooks Davis	ee37749af6	Add PROT_MAX to the HISTORY section. In the case of mmap(), add a HISTORY section. Mention that mmap() and mprotect()'s documentation predates an implementation. The implementation first saw wide use in 4.3-Reno, but there seems to be no easy way to express that in mdoc so stick with 4.4BSD. Reviewed by: emaste Requested by: cem Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D20713	2019-06-20 21:52:30 +00:00
Brooks Davis	74a1b66cf4	Extend mmap/mprotect API to specify the max page protections. A new macro PROT_MAX() alters a protection value so it can be OR'd with a regular protection value to specify the maximum permissions. If present, these flags specify the maximum permissions. While these flags are non-portable, they can be used in portable code with simple ifdefs to expand PROT_MAX() to 0. This change allows (e.g.) a region that must be writable during run-time linking or JIT code generation to be made permanently read+execute after writes are complete. This complements W^X protections allowing more precise control by the programmer. This change alters mprotect argument checking and returns an error when unhandled protection flags are set. This differs from POSIX (in that POSIX only specifies an error), but is the documented behavior on Linux and more closely matches historical mmap behavior. In addition to explicit setting of the maximum permissions, an experimental sysctl vm.imply_prot_max causes mmap to assume that the initial permissions requested should be the maximum when the sysctl is set to 1. PROT_NONE mappings are excluded from this for compatibility with rtld and other consumers that use such mappings to reserve address space before mapping contents into part of the reservation. A final version this is expected to provide per-binary and per-process opt-in/out options and this sysctl will go away in its current form. As such it is undocumented. Reviewed by: emaste, kib (prior version), markj Additional suggestions from: alc Obtained from: CheriBSD Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D18880	2019-06-20 18:24:16 +00:00
Alan Somers	5993fa5582	open(2): fix the description of O_FSYNC The man page claims that with O_FSYNC (aka O_SYNC) the kernel will not cache written data. However, that's not true. Nor does POSIX require it. Perhaps it was true when that section of the man page was written in r69336 (I haven't checked). But it's not true now. Now the effect is simply that writes are sent to disk immediately and synchronously, but they're still cached. See also: https://pubs.opengroup.org/onlinepubs/9699919799/ See also: ffs_write in sys/ufs/ffs/ffs_vnops.c Reviewed by: cem MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D20641	2019-06-14 20:35:37 +00:00

1 2 3 4 5 ...

1900 Commits