Many operating systems also provide MAP_ANONYMOUS. It's not hard to
support this ourselves, we'd better add it to make it more likely for
applications to work out of the box.
Reviewed by: alc (mman.h)
well-known race condition, which elimination was the reason for the
function appearance in first place. If sigmask supplied as argument to
pselect() enables a signal, the signal might be delivered before thread
called select(2), causing lost wakeup. Reimplement pselect() in kernel,
making change of sigmask and sleep atomic.
Since signal shall be delivered to the usermode, but sigmask restored,
set TDP_OLDMASK and save old mask in td_oldsigmask. The TDP_OLDMASK
should be cleared by ast() in case signal was not gelivered during
syscall execution.
Reviewed by: davidxu
Tested by: pho
MFC after: 1 month
* retry various system calls on EINTR
* retry the rest after a short read (common if there is more than about 1K
of output)
* block SIGCHLD like system(3) does (note that this does not and cannot
work fully in threaded programs, they will need to be careful with wait
functions)
PR: 90580
MFC after: 1 month
one path. When the list is empty (contain only a NULL pointer), return
EINVAL instead of pretending to succeed, which will cause a NULL pointer
deference in a later fts_read() call.
Noticed by: Christoph Mallon (via rdivacky@)
MFC after: 2 weeks
- Tolerate applications that pass a NULL pointer for the buffer and
claim that the capacity of the buffer is nonzero.
- If an application passes in a non-NULL buffer pointer and claims the
buffer has zero capacity, we should free (well, realloc) it
anyway. It could have been obtained from malloc(0), so failing to
free it would be a small memory leak.
MFC After: 2 weeks
Reported by: naddy
PR: ports/138320
the type argument. This is known to fix some pthread_mutexattr_settype()
invocations, especially when it comes to pulseaudio.
Approved by: kib
deischen (threads)
MFC after: 3 days
- F_READAHEAD: specify the amount for sequential access. The amount is
specified in bytes and is rounded up to nearest block size.
- F_RDAHEAD: Darwin compatible version that use 128KB as the sequential
access size.
A third argument of zero disables the read-ahead behavior.
Please note that the read-ahead amount is also constrainted by sysctl
variable, vfs.read_max, which may need to be raised in order to better
utilize this feature.
Thanks Igor Sysoev for proposing the feature and submitting the original
version, and kib@ for his valuable comments.
Submitted by: Igor Sysoev <is rambler-co ru>
Reviewed by: kib@
MFC after: 1 month
a large page size that is greater than malloc(3)'s default chunk size but
less than or equal to 4 MB, then increase the chunk size to match the large
page size.
Most often, using a chunk size that is less than the large page size is not
a problem. However, consider a long-running application that allocates and
frees significant amounts of memory. In particular, it frees enough memory
at times that some of that memory is munmap()ed. Up until the first
munmap(), a 1MB chunk size is just fine; it's not a problem for the virtual
memory system. Two adjacent 1MB chunks that are aligned on a 2MB boundary
will be promoted automatically to a superpage even though they were
allocated at different times. The trouble begins with the munmap(),
releasing a 1MB chunk will trigger the demotion of the containing superpage,
leaving behind a half-used 2MB reservation. Now comes the real problem.
Unfortunately, when the application needs to allocate more memory, and it
recycles the previously munmap()ed address range, the implementation of
mmap() won't be able to reuse the reservation. Basically, the coalescing
rules in the virtual memory system don't allow this new range to combine
with its neighbor. The effect being that superpage promotion will not
reoccur for this range of addresses until both 1MB chunks are freed at some
point in the future.
Reviewed by: jasone
MFC after: 3 weeks
needed to satisfy static libraries that are compiled with -fpic
and linked into static binary afterwards. Several libraries in
gcc are examples of such static libs.
EV_RECEIPT is useful to disambiguating error conditions when multiple
events structures are passed to kevent(2). The error code is returned
in the data field and EV_ERROR is set.
Approved by: rwatson (co-mentor)
When the EV_DISPATCH flag is used the event source will be disabled
immediately after the delivery of an event. This is similar to the
EV_ONESHOT flag but it doesn't delete the event.
Approved by: rwatson (co-mentor)
Add user events support to kernel events which are not associated with any
kernel mechanism but are triggered by user level code. This is useful for
adding user level events to an event handler that may also be monitoring
kernel events.
Approved by: rwatson (co-mentor)
An additional one coming from http://www.research.att.com/~gsf/testregex/
was not added; at some point the entire AT&T regression test harness
should be imported here.
But that would also mean commitment to fix the uncovered errors.
PR: 130504
Submitted by: Chris Kuklewicz
When I wrote the pseudo-terminal driver for the MPSAFE TTY code, Robert
Watson and I agreed the best way to implement this, would be to let
posix_openpt() create a pseudo-terminal with proper permissions in place
and let grantpt() and unlockpt() be no-ops.
This isn't valid behaviour when looking at the spec. Because I thought
it was an elegant solution, I filed a bug report at the Austin Group
about this. In their last teleconference, they agreed on this subject.
This means that future revisions of POSIX may allow grantpt() and
unlockpt() to be no-ops if an open() on /dev/ptmx (if the implementation
has such a device) and posix_openpt() already do the right thing.
I'd rather put this in the manpage, because simply mentioning we don't
comply to any standard makes it look worse than it is. Right now we
don't, but at least we took care of it.
Approved by: re (kib)
MFC after: 3 days
other than the current system-wide size (32-bits) has been updated so
for now just cautiously turn the check off. While here fix the check
for IDs being too large which doesn't work due to type mis-matches.
Reviewed by: jhb (previous version)
Approved by: re (kib)
MFC after: 1 month (type mis-match fixes only)
compiled with stack protector.
Use libssp_nonshared library to pull __stack_chk_fail_local symbol into
each library that needs it instead of pulling it from libc. GCC
generates local calls to this function which result in absolute
relocations put into position-independent code segment, making dynamic
loader do extra work every time given shared library is being relocated
and making affected text pages non-shareable.
Reviewed by: kib
Approved by: re (kib)
behavior is mandated by POSIX.
- Do not fail requests that pass a length greater than SSIZE_MAX
(such as > 2GB on 32-bit platforms). The 'len' parameter is actually
an unsigned 'size_t' so negative values don't really make sense.
Submitted by: Alexander Best alexbestms at math.uni-muenster.de
Reviewed by: alc
Approved by: re (kib)
MFC after: 1 week
Right now nmemb is returned when size is 0. In newer versions of the
standards, it is explicitly required that fwrite() should return 0.
Submitted by: Christoph Mallon
Approved by: re (kib)
if the new file mode is the same as it was before; however, this
optimization must be disabled for filesystems that support NFSv4 ACLs.
Chmod uses pathconf(2) to determine whether this is the case - however,
pathconf(2) always follows symbolic links, while the 'chmod -h' doesn't.
This change adds lpathconf(3) to make it possible to solve that problem
in a clean way.
Reviewed by: rwatson (earlier version)
Approved by: re (kib)
Use libssp_nonshared library to pull __stack_chk_fail_local symbol into
each library that needs it instead of pulling it from libc. GCC generates
local calls to this function which result in absolute relocations put into
position-independent code segment, making dynamic loader do extra work everys
time given shared library is being relocated and making affected text pages
non-shareable.
Reviewed by: kib
Approved by: re (kensmith)
This adds the following functions to the acl(3) API: acl_add_flag_np,
acl_clear_flags_np, acl_create_entry_np, acl_delete_entry_np,
acl_delete_flag_np, acl_get_extended_np, acl_get_flag_np, acl_get_flagset_np,
acl_set_extended_np, acl_set_flagset_np, acl_to_text_np, acl_is_trivial_np,
acl_strip_np, acl_get_brand_np. Most of them are similar to what Darwin
does. There are no backward-incompatible changes.
Approved by: rwatson@
part of libc is still not thread safe but this would at least
reduce the problems we have.
PR: threads/118544
Submitted by: Changming Sun <snnn119 gmail com>
MFC after: 2 weeks
- The uid/cuid members of struct ipc_perm are now uid_t instead of unsigned
short.
- The gid/cgid members of struct ipc_perm are now gid_t instead of unsigned
short.
- The mode member of struct ipc_perm is now mode_t instead of unsigned short
(this is merely a style bug).
- The rather dubious padding fields for ABI compat with SV/I386 have been
removed from struct msqid_ds and struct semid_ds.
- The shm_segsz member of struct shmid_ds is now a size_t instead of an
int. This removes the need for the shm_bsegsz member in struct
shmid_kernel and should allow for complete support of SYSV SHM regions
>= 2GB.
- The shm_nattch member of struct shmid_ds is now an int instead of a
short.
- The shm_internal member of struct shmid_ds is now gone. The internal
VM object pointer for SHM regions has been moved into struct
shmid_kernel.
- The existing __semctl(), msgctl(), and shmctl() system call entries are
now marked COMPAT7 and new versions of those system calls which support
the new ABI are now present.
- The new system calls are assigned to the FBSD-1.1 version in libc. The
FBSD-1.0 symbols in libc now refer to the old COMPAT7 system calls.
- A simplistic framework for tagging system calls with compatibility
symbol versions has been added to libc. Version tags are added to
system calls by adding an appropriate __sym_compat() entry to
src/lib/libc/incldue/compat.h. [1]
PR: kern/16195 kern/113218 bin/129855
Reviewed by: arch@, rwatson
Discussed with: kan, kib [1]
- update for getrlimit(2) manpage;
- support for setting RLIMIT_SWAP in login class;
- addition to the limits(1) and sh and csh limit-setting builtins;
- tuning(7) documentation on the sysctls controlling overcommit.
In collaboration with: pho
Reviewed by: alc
Approved by: re (kensmith)
It's not necessary to add stdlib directories for each architecture, even
if the architecture doesn't implement any files of its own.
Submitted by: Christoph Mallon
main cycle only if the len passed is equal to 0. If end address
overflows use last possible address as the end address.
Based on: discussion on arm@
MFC after: 1 month
NGROUPS_MAX, eliminate ABI dependencies on them, and raise the to 1024
and 1023 respectively. (Previously they were equal, but under a close
reading of POSIX, NGROUPS_MAX was defined to be too large by 1 since it
is the number of supplemental groups, not total number of groups.)
The bulk of the change consists of converting the struct ucred member
cr_groups from a static array to a pointer. Do the equivalent in
kinfo_proc.
Introduce new interfaces crcopysafe() and crsetgroups() for duplicating
a process credential before modifying it and for setting group lists
respectively. Both interfaces take care for the details of allocating
groups array. crsetgroups() takes care of truncating the group list
to the current maximum (NGROUPS) if necessary. In the future,
crsetgroups() may be responsible for insuring invariants such as sorting
the supplemental groups to allow groupmember() to be implemented as a
binary search.
Because we can not change struct xucred without breaking application
ABIs, we leave it alone and introduce a new XU_NGROUPS value which is
always 16 and is to be used or NGRPS as appropriate for things such as
NFS which need to use no more than 16 groups. When feasible, truncate
the group list rather than generating an error.
Minor changes:
- Reduce the number of hand rolled versions of groupmember().
- Do not assign to both cr_gid and cr_groups[0].
- Modify ipfw to cache ucreds instead of part of their contents since
they are immutable once referenced by more than one entity.
Submitted by: Isilon Systems (initial implementation)
X-MFC after: never
PR: bin/113398 kern/133867
system callers of getgroups(), getgrouplist(), and setgroups() to
allocate buffers dynamically. Specifically, allocate a buffer of size
sysconf(_SC_NGROUPS_MAX)+1 (+2 in a few cases to allow for overflow).
This (or similar gymnastics) is required for the code to actually follow
the POSIX.1-2008 specification where {NGROUPS_MAX} may differ at runtime
and where getgroups may return {NGROUPS_MAX}+1 results on systems like
FreeBSD which include the primary group.
In id(1), don't pointlessly add the primary group to the list of all
groups, it is always the first result from getgroups(). In principle
the old code was more portable, but this was only done in one of the two
places where getgroups() was called to the overall effect was pointless.
Document the actual POSIX requirements in the getgroups(2) and
setgroups(2) manpages. We do not yet support a dynamic NGROUPS, but we
may in the future.
MFC after: 2 weeks
dace for UPDv4 sockets bound to INADDR_ANY. Move the code to set
IP_RECVDSTADDR/IP_SENDSRCADDR into svc_dg.c, so that both TLI and non-TLI
users will be using it.
Back out my previous commit to mountd. Turns out the problem was affecting
more than one binary so it needs to me addressed in generic rpc code in
libc in order to fix them all.
Reported by: lstewart
Tested by: lstewart