Currently sizeof(struct filedesc0) is 1096 bytes, which means allocations from
malloc use 2048 bytes.
There is no easy way to shrink the structure <= 1024 an it is likely to grow in
the future.
whether the shared request for already shared-locked lock could be
granted. Both problems result in the exclusive locker starvation.
The concurrent exclusive request is indicated by either
LK_EXCLUSIVE_WAITERS or LK_EXCLUSIVE_SPINNERS flags. The reverse
condition, i.e. no exclusive waiters, must check that both flags are
cleared.
Add a flag LK_NODDLKTREAT for shared lock request to indicate that
current thread guarantees that it does not own the lock in shared
mode. This turns back the exclusive lock starvation avoidance code;
see man page update for detailed description.
Use LK_NODDLKTREAT when doing lookup(9).
Reported and tested by: pho
No objections from: attilio
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
A new terminal_set_cursor() is added: it wraps the existing
teken_set_cursor() function.
In vtbuf_grow(), the cursor position is adjusted at the end of the
function. In vt_change_font(), we call terminal_set_cursor() just after
terminal_set_winsize_blank(), while the terminal is mute.
This fixes a bug where, after loading a kernel video driver which
increases the terminal window size, the cursor remains at its old
position, in other words, in the middle of the display content.
PR: 194421
MFC after: 1 week
of fuword(9) and suword(9). This makes the functions type-compatible
with volatile objects and does not require devolatile force, e.g. in
kern_umtx.c.
Requested by: bde
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
appropriately
Assert FILEDESC_XLOCK_ASSERT only for already used tables in fdgrowtable.
We don't have to call it with the lock held if we are just creating new
filedesc.
As a side note, strictly speaking processes can have fdtables with
fd_lastfile = -1, but then they cannot enter fdgrowtable. Very first file
descriptor they get will be 0 and the only syscall allowing to choose fd number
requires an active file descriptor. Should this ever change, we can add an 'init'
(or similar) parameter to fdgrowtable.
While here add 'fdused_init' which does not perform unnecessary work.
Drop FILEDESC_LOCK_ASSERT from fdisused and rely on callers to hold
it when appropriate. This function is only used with INVARIANTS.
No functional changes intended.
Test for file availability by fde_file != NULL instead of fdisused, this is
consistent with similar checks later.
Drop badfileops check. badfileops don't have DFLAG_PASSABLE set, so it was never
reached in practice.
fdiused is now only used in some KASSERTS, so ifdef it under INVARIANTS.
No functional changes.
This code has had an extensive rewrite and a good series of reviews, both by the author and other parties. This means a lot of code has been simplified. Pluggable structures for high-rate entropy generators are available, and it is most definitely not the case that /dev/random can be driven by only a hardware souce any more. This has been designed out of the device. Hardware sources are stirred into the CSPRNG (Yarrow, Fortuna) like any other entropy source. Pluggable modules may be written by third parties for additional sources.
The harvesting structures and consequently the locking have been simplified. Entropy harvesting is done in a more general way (the documentation for this will follow). There is some GREAT entropy to be had in the UMA allocator, but it is disabled for now as messing with that is likely to annoy many people.
The venerable (but effective) Yarrow algorithm, which is no longer supported by its authors now has an alternative, Fortuna. For now, Yarrow is retained as the default algorithm, but this may be changed using a kernel option. It is intended to make Fortuna the default algorithm for 11.0. Interested parties are encouraged to read ISBN 978-0-470-47424-2 "Cryptography Engineering" By Ferguson, Schneier and Kohno for Fortuna's gory details. Heck, read it anyway.
Many thanks to Arthur Mesh who did early grunt work, and who got caught in the crossfire rather more than he deserved to.
My thanks also to folks who helped me thresh this out on whiteboards and in the odd "Hallway track", or otherwise.
My Nomex pants are on. Let the feedback commence!
Reviewed by: trasz,des(partial),imp(partial?),rwatson(partial?)
Approved by: so(des)
without restarting whole lookup
Restart is only needed when fp was closed by current process, which is a much
rarer event than ref/deref by some other thread.
A read barrier was necessary because fd table pointer and table size were
updated separately, opening a window where fget_unlocked could read new size
and old pointer.
This patch puts both these fields into one dedicated structure, pointer to which
is later atomically updated. As such, fget_unlocked only needs data a dependency
barrier which is a noop on all supported architectures.
Reviewed by: kib (previous version)
MFC after: 2 weeks
- Move the existing code to x86/x86/identcpu.c since it is x86-specific.
- If the CPUID2_HV flag is set, assume a hypervisor is present and query
the 0x40000000 leaf to determine the hypervisor vendor ID. Export the
vendor ID and the highest supported hypervisor CPUID leaf via
hv_vendor[] and hv_high variables, respectively. The hv_vendor[]
array is also exported via the hw.hv_vendor sysctl.
- Merge the VMWare detection code from tsc.c into the new probe in
identcpu.c. Add a VM_GUEST_VMWARE to identify vmware and use that in
the TSC code to identify VMWare.
Differential Revision: https://reviews.freebsd.org/D1010
Reviewed by: delphij, jkim, neel
Also fix some mishandling of suword(9) errors as errno, which resulted
in spurious ERESTART.
Sponsored by: The FreeBSD Foundation
Tested by: pho
MFC after: 3 weeks
and casuword(9), but do not mix value read and indication of fault.
I know (or remember) enough assembly to handle x86 and powerpc. For
arm, mips and sparc64, implement fueword() and casueword() as wrappers
around fuword() and casuword(), which means that the functions cannot
distinguish between -1 and fault.
On architectures where fueword() and casueword() are native, implement
fuword() and casuword() using fueword() and casuword(), to reduce
assembly code duplication.
Sponsored by: The FreeBSD Foundation
Tested by: pho
MFC after: 2 weeks (ia64 needs treating)
directly accessed. Although this will work on some platforms, it can
throw an exception if the pointer is invalid and then panic the kernel.
Add a missing SYSCTL_IN() of "SCTP_BASE_STATS" structure.
MFC after: 3 days
Sponsored by: Mellanox Technologies
The kernel tracks syscall users so that modules can safely unregister them.
But if the module is not unloadable or was compiled into the kernel, there is
no need to do this.
Achieve this by adding SY_THR_STATIC_KLD macro which expands to SY_THR_STATIC
during kernel build and 0 otherwise.
Reviewed by: kib (previous version)
MFC after: 2 weeks
- 'groups' initialization to NULL is always ovewrwriten before use, so plug it
- get rid of 'goto out'
- kern_setgroups's callers already validate ngrp, so only assert the condition
- ngrp is an u_int, so 'ngrp < 1' is more readable as 'ngrp == 0'
No functional changes.
and the following r273143 commit, supposed to workaround introduced issue by
quite innocent-looking change.
While there is no clear understanding why, but r273143 is accused in data
corruption in some environments with high I/O load. I personally don't see
any problem in that commit, and possibly it is just a trigger to some other
bug somewhere, but better safe then sorry for now.
Requested by: scottl@
MFC after: 3 days
in a separate word from the _count. This does not permit both items to
be updated atomically in a portable manner. As a result, sem_post()
must always perform a system call to safely clear _has_waiters.
This change removes the _has_waiters field and instead uses the high bit
of _count as the _has_waiters flag. A new umtx object type (_usem2) and
two new umtx operations are added (SEM_WAIT2 and SEM_WAKE2) to implement
these semantics. The older operations are still supported under the
COMPAT_FREEBSD9/10 options. The POSIX semaphore API in libc has
been updated to use the new implementation. Note that the new
implementation is not compatible with the previous implementation.
However, this only affects static binaries (which cannot be helped by
symbol versioning). Binaries using a dynamic libc will continue to work
fine. SEM_MAGIC has been bumped so that mismatched binaries will error
rather than corrupting a shared semaphore. In addition, a padding field
has been added to sem_t so that it remains the same size.
Differential Revision: https://reviews.freebsd.org/D961
Reported by: adrian
Reviewed by: kib, jilles (earlier version)
Sponsored by: Norse
indiscriminately to printf() and freeenv() is incorrect. Add a NULL
check before freeenv(); as for printf(), we could use req.newptr
instead, but we'd have to select the correct format string based on
the type, and that's too much work for an error message, so just
remove it.
initial static environment to a dynamic one, zero the static environment
buffer, and zero individual values when kern_unsetenv and freeenv are
called.
Tested by: kmoore (VM memory dump + grep)
Tested by: cperciva (kernel panic dump + grep)
Rename it to fdsetugidsafety for consistency with other functions.
There is no need to take filedesc lock if not closing any files.
The loop has to verify each file and we are guaranteed fdtable has space
for at least 20 fds. As such there is no need to check fd_lastfile.
While here tidy up is_unsafe.