in handling the cpuset sizes different from sizeof(cpuset_t).
For both cases, cpuset size shorter than sizeof(cpuset_t) results
in EINVAL on Linux.
For sched_getaffinity(), be more permissive and accept cpuset size
larger than our cpuset_t, by clipping the syscall argument and zeroing
the rest of the output buffer. For sched_setaffinity(), we should allow
shorter cpusets than current ABI size, again zeroing the rest of the bits.
With this change, python os.sched_get/setaffinity functions work.
Reported by: se
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
This is a re-application of commit
2d82b47a5b, which was reverted since it
broke with syslog daemons that don't adjust the /dev/log recv buffer
size. Now that the default is large enough to accomodate 8KB messages,
restore support for large messages.
PR: 260126
The introduction of <sched.h> improved compatibility with some 3rd
party software, but caused the configure scripts of some ports to
assume that they were run in a GLIBC compatible environment.
Parts of sched.h were made conditional on -D_WITH_CPU_SET_T being
added to ports, but there still were compatibility issues due to
invalid assumptions made in autoconfigure scripts.
The differences between the FreeBSD version of macros like CPU_AND,
CPU_OR, etc. and the GLIBC versions was in the number of arguments:
FreeBSD used a 2-address scheme (one source argument is also used as
the destination of the operation), while GLIBC uses a 3-adderess
scheme (2 source operands and a separately passed destination).
The GLIBC scheme provides a super-set of the functionality of the
FreeBSD macros, since it does not prevent passing the same variable
as source and destination arguments. In code that wanted to preserve
both source arguments, the FreeBSD macros required a temporary copy of
one of the source arguments.
This patch set allows to unconditionally provide functions and macros
expected by 3rd party software written for GLIBC based systems, but
breaks builds of externally maintained sources that use any of the
following macros: CPU_AND, CPU_ANDNOT, CPU_OR, CPU_XOR.
One contributed driver (contrib/ofed/libmlx5) has been patched to
support both the old and the new CPU_OR signatures. If this commit
is merged to -STABLE, the version test will have to be extended to
cover more ranges.
Ports that have added -D_WITH_CPU_SET_T to build on -CURRENT do
no longer require that option.
The FreeBSD version has been bumped to 1400046 to reflect this
incompatible change.
Reviewed by: kib
MFC after: 2 weeks
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D33451
The states macro is the type for engine.c to use, with states1 being a
local macro for regexec to use to determine whether it can use the small
matcher or not (by comparing nstates and 8*sizeof(states1)). However,
macro bodies are expanded in the context of their use, and so when
regexec uses states1 it uses the current value of states, which is left
over as char * from the large version (or, really, the multi-byte one,
but that reuses large's states). For all supported architectures in
FreeBSD, the two have the same size, and so this confusion is harmless.
However, for architectures like CHERI where that is not the case (or
Windows's LLP64 as discovered by LLVM and fixed in 2010 in 2e071faed8e2)
and sizeof(char *) is bigger than sizeof(long) regexec will erroneously
try to use the small matcher when nstates is between sizeof(long) and
sizeof(char *) (i.e. between 64 and 128 on CHERI, or 32 and 64 on LLP64)
and end up overflowing the number of bits in the underlying long if it
ever uses those high states. On weirder architectures where sizeof(long)
is greater than sizeof(char *) this also fixes it to not fall back on
the large matcher prematurely, but such architectures are likely limited
to the embedded space, if they exist at all.
Fix this by swapping round states and states1, so that states1 is
defined directly as being long and states is an alias for it for the
small matcher case.
Found by: CHERI
Checking there are still bytes left must be done before dereferencing
the pointer, not the other way round. This is harmless on traditional
architectures since the result will immediately be thrown away, and all
callers are in separate translation units so there is no potential for
optimising based on this out-of-bounds read. However, on CHERI, pointers
are bounded, and so this will trap if fed a string that does not have a
NUL within the first len bytes.
Found by: CHERI
Reviewed by: brooks
Add an idletime user group that allows non-root users to run processes
with idle scheduling priority. Privileges are granted by a MAC policy in
the mac_priority module. For this purpose, the kernel privilege
PRIV_SCHED_IDPRIO was added to sys/priv.h (kernel module ABI change).
Deprecate the system wide sysctl(8) knob
security.bsd.unprivileged_idprio which lets any user run idle priority
processes, regardless of context. While the knob is still working, it is
marked as deprecated in the description and in the man pages.
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D33338
Their uses have been replaced by _tcb_get() and _tcb_set() from
<machine/tls.h>.
Reviewed by: kib, jrtc27
Sponsored by: The University of Cambridge, Google Inc.
Differential Revision: https://reviews.freebsd.org/D33354
- Include <machine/tls.h> in MD rtld_machdep.h headers.
- Remove local definitions of TLS_* constants from rtld_machdep.h
headers and libc using the values from <machine/tls.h> instead.
- Use _tcb_set() instead of inlined versions in MD
allocate_initial_tls() routines in rtld. The one exception is amd64
whose _tcb_set() invokes the amd64_set_fsbase ifunc. rtld cannot
use ifuncs, so amd64 inlines the logic to optionally write to fsbase
directly.
- Use _tcb_set() instead of _set_tp() in libc.
- Use '&_tcb_get()->tcb_dtv' instead of _get_tp() in both rtld and libc.
This permits removing _get_tp.c from rtld.
- Use TLS_TCB_SIZE and TLS_TCB_ALIGN with allocate_tls() in MD
allocate_initial_tls() routines in rtld.
Reviewed by: kib, jrtc27 (earlier version)
Differential Revision: https://reviews.freebsd.org/D33353
- Use 16 byte alignment rather than 8 for aarch64, powerpc64, and RISC-V.
- Use 8 byte alignment rather than 4 for 32-bit arm, mips, and powerpc.
I suspect that mips64 should be using 16 byte alignment, but both libc
and rtld currently use 8 byte alignment.
Reviewed by: kib, jrtc27
Sponsored by: The University of Cambridge, Google Inc.
Differential Revision: https://reviews.freebsd.org/D33350
There's no point in a knob to avoid installing a half dozen manpages.
It's undocumented and unused in the tree. Online, the only metions
I've found are the FreeBSD source tree, a commit in DragonFly BSD
removing it, and some lists of build options for small systems where
it's inevitably redundant due to an accompanying NO_MAN.
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D33310
Otherwise the asm stub is used and libthr interposition does not work.
Reviewed by: kib
Fixes: 21f749da82 ("libthr: wrap pdfork(2), same as fork(2).")
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Also use the term operation consistently, over the command.
Reviewed by: emaste, jhb, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33277
that returns struct kinfo_file for the given file descriptor. Among
other data, it also returns kf_path, if file op was able to restore file
path.
Reviewed by: jhb, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33277
This is a MAC policy module that grants scheduling privileges based on
group membership. Users or processes in the group realtime (gid 47) are
allowed to run threads and processes with realtime scheduling priority.
For timing-sensitive, low-latency software like audio/jack, running with
realtime priority helps to avoid stutter and gaps.
PR: 239125
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D33191
This reverts commit 266f97b5e9, reversing
changes made to a10253cffe.
A mismerge of a merge to catch up to main resulted in files being
committed which should not have been.
Section 9.5 of RFC 6458 (SCTP Socket API) requires that
sctp_getladdrs() returns 0 in case the socket is unbound. This
is the cause of reporting 0 addresses. So don't indicate an
error, just report this case as required.
PR: 260117
MFC after: 1 week
When calling getsockopt() with SCTP_GET_LOCAL_ADDR_SIZE, use a
pointer to a 32-bit variable, since this is what the kernel
expects.
While there, do some cleanups.
MFC after: 1 week
This reverts commit 2886c93d1b.
The original commit has two problems:
* It sets SO_SNDBUF to be as large as MAXLINE. But for unix domain
sockets, the send buffer is bypassed. Packets go directly to the
peer's receive buffer, so setting and querying SO_SNDBUF is
ineffective. To ensure that the socket can accept messages of a
certain size, it would be necessary to add a SO_PEERRCVBUF socket
option that could query the connected peer's receive buffer size.
* It sets MAXLINE to 8 kB, which is larger than the default sockbuf size
of 4 kB. That's ok for the builtin syslogd, which sets its recvbuf
to 80 kB, but not ok for alternative sysloggers, like rsyslogd, which
use the default size.
As a consequence, writing messages of more than 4 kB with syslog() as a
non-root user while running rsyslogd would cause the logging application
to spin indefinitely within syslog().
PR: 260126
MFC: 2 weeks
Sponsored by: Axcient
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D33199
Namely posix_spawn_file_actions_addclosefrom_np, in the form it is
provided by glibc.
Reviewed by: kevans, ngie (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33143
to wrap too long lines with function prototypes.
Reviewed by: kevans, ngie (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33143
In _citrus_prop_read_TYPE_func_ generated functions, do not ignore parsed
'-' sign, negate the value as appropriate.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33146
It receives the malloc() result, and we do not want the malloc() call
to be optimized out, which is allowed for hosted compiler. Use dummy
for actual write though.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
The variables clang13 complains about take the results of var_arg() calls.
I decided to kept variables around, annotating their definitions with
__unused, to keep clear expected types of the varargs.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
These are the updated version of the older Cortex Strings Library we
previously used. The Arm Optimized Routines also support CPU features
that are currently in development on FreeBSD, e.g. Branch Target
Identification (BTI). Rather than add BTI support to the old code it's
easier to just use the maintained version.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32774
Instead of only hiding cpu_set_t compat typedef itself.
Too many software packages assume that sched_getaffinity() presence
implies full source compatibility with glibc. We can (and should)
handle missing CPU_* macros, but then there are incompatible BIT_* uses
which cannot be fixed in src/.
So hide everything under _WITH_CPU_SET_T, in particular, do not expose
sched_getcpu(), sched_get/setaffinity(), as well as CPU_* and BIT_*
macros. Consumers that want sched* functions must opt-in.
Reported by: portmgr (antoine)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
for compatibility with Linux.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D32901
for compatibility with Linux.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D32901
Remove code that is ifdefed out on USELOOPBACK, which uses historical
class. No functional change intended.
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D32712
Mark functions inet_netof(), inet_lnaof(), and inet_makeaddr() as
deprecated, as they assume Class A/B/C. inet_makeaddr() mostly works
when networks are a multiple of 8 bits, but warn for anything other
than historical classes. Reduce other mentions of network classes.
MFC after: 1 month
Reviewed by: bcr, #manpages
Differential Revision: https://reviews.freebsd.org/D32711
Currently after cleaning the variables the environment will be always
set to the intEnviron as documented in __rebuild_environ.
Reported by: lwhsu@, jenkins
The clearenv(3) function allows us to clear all environment
variable in one shot. This may be useful for security programs that
want to control the environment or what variables are passed to new
spawned programs.
Reviewed by: scf, markj (secteam), 0mp (manpages)
Differential Revision: https://reviews.freebsd.org/D28223
A BPF descriptor only has an associated interface descriptor once it is
attached to an interface, e.g., with BIOCSETIF. Avoid dereferencing a
NULL pointer in filt_bpfwrite() if the BPF descriptor is not attached.
Reviewed by: ae
Reported by: syzbot+ae45d5166afe15a5a21d@syzkaller.appspotmail.com
Fixes: ded77e0237 ("Allow the BPF to be select for write.")
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32561
The last two drivers that required sppp are cp(4) and ce(4).
These devices are still produced and can be purchased
at Cronyx <http://cronyx.ru/hardware/wan.html>.
Since Roman Kurakin <rik@FreeBSD.org> has quit them, they no
longer support FreeBSD officially. Later they have dropped
support for Linux drivers to. As of mid-2020 they don't even
have a developer to maintain their Windows driver. However,
their support verbally told me that they could provide aid to
a FreeBSD developer with documentaion in case if there appears
a new customer for their devices.
These drivers have a feature to not use sppp(4) and create an
interface, but instead expose the device as netgraph(4) node.
Then, you can attach ng_ppp(4) with help of ports/net/mpd5 on
top of the node and get your synchronous PPP. Alternatively
you can attach ng_frame_relay(4) or ng_cisco(4) for HDLC.
Actually, last time I used cp(4) back in 2004, using netgraph(4)
instead of sppp(4) was already the right way to do.
Thus, remove the sppp(4) related part of the drivers and enable
by default the negraph(4) part. Further maintenance of these
drivers in the tree shouldn't be a big deal.
While doing that, remove some cruft and enable cp(4) compilation
on amd64. The ce(4) for some unknown reason marks its internal
DDK functions with __attribute__ fastcall, which most likely is
safe to remove, but without hardware I'm not going to do that, so
ce(4) remains i386-only.
Reviewed by: emaste, imp, donner
Differential Revision: https://reviews.freebsd.org/D32590
See also: https://reviews.freebsd.org/D23928
for state control over TRACE, TRAPCAP, ASLR, PROTMAX, STACKGAP,
NO_NEWPRIVS, and WXMAP.
Reported by: emaste
Reviewed by: emaste, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D32513
These calls do operate on vnodes only, not file contents.
This is useful for e.g. the xdg-document-portal fuse filesystem.
Reviewed by: kib, markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D32438
When this flag is set, operations that update an existing kevent will
not change the udata field. This can be used to NOTE_TRIGGER or
EV_{EN,DIS}ABLE events without overwriting the stashed pointer.
Reviewed by: Domagoj Stolfa <domagoj.stolfa@gmail.com>
Obtained from: CheriBSD
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D30286
Each locale embeds a lazily initialized lconv which is populated by
localeconv(3) and localeconv_l(3). When setlocale(3) updates the global
locale, the lconv needs to be (lazily) reinitialized. To signal this,
we set flag variables in the locale structure. There are two problems:
- The flags are set before the locale is fully updated, so a concurrent
localeconv() call can observe partially initialized locale data.
- No barriers ensure that localeconv() observes a fully initialized
locale if a flag is set.
So, move the flag update appropriately, and use acq/rel barriers to
provide some synchronization. Note that this is inadequate in the face
of multiple concurrent calls to setlocale(3), but this is not expected
to work regardless.
Thanks to Henry Hu <henry.hu.sh@gmail.com> for providing a test case
demonstrating the race.
PR: 258360
MFC after: 3 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31899
It allows to override kern.elf{32,64}.allow_wx on per-process basis.
In particular, it makes it possible to run binaries without PT_GNU_STACK
and without elfctl note while allow_wx = 0.
Reviewed by: brooks, emaste, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31779
Implement optional timezone change detection for local time libc
functions. This is disabled by default; set WITH_DETECT_TZ_CHANGES
to build it.
Reviewed By: imp
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
X-NetApp-PR: #47
Differential Revision: https://reviews.freebsd.org/D30183
On case-insensitive file systems (most likely to be seen on macOS, where
it is the default), _Fork.o for the new POSIX _Fork function conflicts
with _fork.o for the PSEUDO file. This results in non-determinsitic
behaviour in terms of which ends up being present; if _Fork.o wins then
the build fails to link libc.so due to missing __sys_fork, and if
_fork.o wins then libc silently fails to include the implementation of
_Fork. A similar issue occurred in the past for C99's _Exit conflicting
with exit(2) and was fixed in cb1cb6a2a8, so this adds a fix based on
that.
As a longer-term solution it might be better to instead make the
generated files use a different prefix that's less likely to conflict
with other things (such as __sys_foo.o given they always contain that)
but that's a rather more invasive change.
Fixes: 49ad342cc1 ("Add _Fork()")
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31895
Unlike the other syscalls these two symbols were missing from the
version script. I noticed this while looking into the compiler-rt
runtime libraries for CHERI.
Reviewed by: brooks
Obtained from: https://github.com/CTSRD-CHERI/cheribsd/pull/1063
MFC after: 3 days
The new wording for standard flags is losely based on the POSIX
description.
Make it clearer that PROT_MAX() is a local extension.
Reviewed by: alc, mckusick, imp, kib, markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D31777
This text dates to the BSD 4.4 import and is misleading. The mprotect
syscall acts on page granularity and breaks up mappings as required to
do so.
Note that with the addition of non-transparent superpages (aka
largepages) the size of a page at a given address may vary. This
commit does not attempt to address the lack of documentation of this
feature.
Sponsored by: DARPA
Reviewed by: alc, mckusick, imp, kib, markj
Differential Revision: https://reviews.freebsd.org/D31776
Currently when an [EAGAIN] is encountered we return a partial result
that does not contain the delimeter. On the next (successful) read we
were returning the next part of the line without the preceding string
from the first failed call.
Fix this by using the same mechanism as ungetc(3) does. For the buffered
case we could simply set fp->_r and fp->_p back to their values before
sappend() is ran but for simplicity ungetc(3) is done in there as well.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D31687
Clang 13 produces the following warning for this function:
lib/libc/stdlib/merge.c:137:41: error: performing pointer subtraction with a null pointer has undefined behavior [-Werror,-Wnull-pointer-subtraction]
if (!(size % ISIZE) && !(((char *)base - (char *)0) % ISIZE))
^ ~~~~~~~~~
This is meant to check whether the size and base parameters are aligned
to the size of an int, so use our __is_aligned() macro instead.
Also remove the comment that indicated this "stupid subtraction" was
done to pacify some ancient and unknown Cray compiler, and which has
been there since the BSD 4.4 Lite Lib Sources were imported.
MFC after: 3 days
rmsr.r_offset now is set to rqsr.r_offset plus the number of bytes
zeroed before hitting the end-of-file. After this change rmsr.r_offset
no longer contains the EOF when the requested operation range is
completely beyond the end-of-file. Instead in such case rmsr.r_offset is
equal to rqsr.r_offset. Callers can obtain the number of bytes zeroed
by subtracting rqsr.r_offset from rmsr.r_offset.
Sponsored by: The FreeBSD Foundation
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D31677
The removed text claimed that memcpy is implemented using bcopy and thus
strings may overlap. Use of bcopy is an implementation detail that is
no longer true, even if the implementation (on some archs) does allow
overlap.
In any case behaviour is undefined per the C standard if memcpy is
called with overlapping objects, and this man page already claimed that
src and dst may not overlap.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31192
rmacklem@ spotted two things in the system call:
- Upon returning from a successful operation, vop_stddeallocate can
update rmsr.r_offset to a value greater than file size. This behavior,
although being harmless, can be confusing.
- The EINVAL return value for rqsr.r_offset + rqsr.r_len > OFF_MAX is
undocumented.
This commit has the following changes:
- vop_stddeallocate and shm_deallocate to bound the the affected area
further by the file size.
- The EINVAL case for rqsr.r_offset + rqsr.r_len > OFF_MAX is
documented.
- The fspacectl(2), vn_deallocate(9) and VOP_DEALLOCATE(9)'s return
len is explicitly documented the be the value 0, and the return offset
is restricted to be the smallest of off + len and current file size
suggested by kib@. This semantic allows callers to interact better
with potential file size growth after the call.
Sponsored by: The FreeBSD Foundation
Reviewed by: imp, kib
Differential Revision: https://reviews.freebsd.org/D31604
Add missing wrapper code to librt for these new functions so that
SIGEV_THREAD works. Without machinery to convert it to SIGEV_THREAD_ID,
you got EINVAL.
Reviewed by: asomers
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D31618
Allow multiple vector IOs to be started with one system call.
aio_readv() and aio_writev() already used these opcodes under the
covers. This commit makes them available to user space.
Being non-standard extensions, they're only visible if __BSD_VISIBLE is
defined, like the functions.
Reviewed by: asomers, kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D31627
Variant I architectures use off and Variant II ones use size + off.
Define TLS_VARIANT_I/TLS_VARIANT_II symbols similarly to how libc
handles it.
Reviewed by: kib
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31539
Differential revision: https://reviews.freebsd.org/D31541