witness_proc_has_locks(), as they are unused, which results in a compiler
error. This problem was introduced with the implementation of "show
alllocks".
Spotted by: Artem Kuchin <matrix at itlegion dot ru>
designed to help detect tamper-after-free scenarios, a problem that
is becoming more common and more likely with multithreaded kernels,
where race conditions are more prevalent.
Currently MemGuard can only take over malloc()/realloc()/free() for
a particular malloc type (or types), and the code brought in with this
change manually instruments it to take over M_SUBPROC allocations
as an example. If you are planning to use it, for now you must:
1) Put "options DEBUG_MEMGUARD" in your kernel config.
2) Edit src/sys/kern/kern_malloc.c manually, look for
"XXX CHANGEME" and replace the M_SUBPROC comparison with
the appropriate malloc type (this might require additional
but small/simple code modification if, say, the malloc type
is declared out of scope); see the sketch after these steps.
3) Build and install your kernel. Tune the vm.memguard_divisor
boot-time tunable, which scales how much of kmem_map you want to
allot for MemGuard's use. The default is 10, i.e. kmem_size/10.
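For illustration, the hook that step 2 refers to has roughly this shape
in malloc() (simplified; M_MYTYPE is a made-up stand-in for whatever
malloc type you want guarded), with a matching memguard_free() hook in
free():

    #ifdef DEBUG_MEMGUARD
            /* XXX CHANGEME: guard M_MYTYPE instead of M_SUBPROC. */
            if (type == M_MYTYPE)
                    return (memguard_alloc(size, flags));
    #endif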
ToDo:
1) Bring in a memguard(9) man page.
2) Better instrumentation (e.g., boot-time) of MemGuard taking
over malloc types.
3) Teach UMA about MemGuard to allow MemGuard to override zone
allocations too.
4) Improve MemGuard if necessary.
This work is partly based on some old patches from Ian Dowse.
and always has been, but the system call itself returns
errno in a register so the problem is really a function of
libc, not the system call.
Discussed with: Matthew Dillon <dillon@apollo.backplane.com>
they both happen before pipe backing allocation occurs. Previously,
a pipe memory shortage would cause a panic due to a KNOTE call
on an uninitialized si_note.
Reported by: Peter Holm
MFC after: 1 week
unhappiness lately.
As far as I can tell, no files that have made it safely to disk
have been endangered, but stuff in transit has been in peril.
Pointy hat: phk
and KASSERT coverage.
After this check there is only one "nasty" cast in this code but there
is a KASSERT to protect against the wrong argument structure behind
that cast.
Un-inlining the meat of VOP_FOO() saves 35kB of text segment on a typical
kernel with no change in performance.
We also now run the checking and tracing on VOP's which have been layered
by nullfs, umapfs, deadfs or unionfs.
Add new (non-inline) VOP_FOO_AP() functions which take a "struct
foo_args" argument and do everything the VOP_FOO() macros
used to do with checks and debugging code.
Add a KASSERT to VOP_FOO_AP() to check that the argument type is
correct.
Slim down the VOP_FOO() inline functions to just stuff arguments
into the struct foo_args and call VOP_FOO_AP() (see the sketch after
this list).
Put a function pointer to VOP_FOO_AP() into the vop_foo_desc structure
and make VCALL() use it instead of the current offsetof() hack.
Retire vcall(), which implemented the offsetof() hack.
Make deadfs and unionfs use VOP_FOO_AP() calls instead of
VCALL(); we already know which specific call we want.
Remove unneeded arguments to VCALL() in nullfs and umapfs bypass
functions.
Remove unused vdesc_offset and VOFFSET().
Generally improve style/readability of the generated code.
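Schematically, for a made-up VOP_FOO(vp, flags), the new arrangement
looks something like the following; this is a hand-written illustration
only, the real code is generated by vnode_if.awk and carries more
assertions and tracing:

    struct vop_foo_args {
            struct vop_generic_args a_gen;  /* carries a_desc */
            struct vnode *a_vp;
            int a_flags;
    };

    /* Non-inline worker: the checks and debugging code live here now. */
    int
    VOP_FOO_AP(struct vop_foo_args *a)
    {

            KASSERT(a->a_gen.a_desc == &vop_foo_desc,
                ("VOP_FOO_AP: wrong argument structure"));
            /* locking assertions and tracing elided */
            return (a->a_vp->v_op->vop_foo(a));
    }

    /* Slimmed-down inline wrapper: just package the arguments. */
    static __inline int
    VOP_FOO(struct vnode *vp, int flags)
    {
            struct vop_foo_args a;

            a.a_gen.a_desc = &vop_foo_desc;
            a.a_vp = vp;
            a.a_flags = flags;
            return (VOP_FOO_AP(&a));
    }

VCALL() then becomes a call through the new function pointer stored in
vop_foo_desc rather than an offset computation into the vnode's op
vector.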
up its pending error state, which may be set in some rare conditions,
resulting in the connect() syscall returning that bogus error and making
the application believe that the attempt to change the association has
failed, when in fact it has not.
There is a sockets/reconnect regression test which exercises this bug.
MFC after: 2 weeks
errno can potentially be tampered with by a nested signal handler.
Now all error codes are returned as negative values; positive values are
reserved for future expansion.
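A hypothetical user-level wrapper following that convention (the
function names here are made up, not the actual interface):

    int
    my_call(void *arg)
    {
            int ret;

            ret = _raw_call(arg);   /* returns 0/positive or -error */
            if (ret < 0) {
                    /* errno is written only here, after any signal
                     * handlers that ran during the call are done. */
                    errno = -ret;
                    return (-1);
            }
            return (ret);
    }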
TAILQ_FOREACH_SAFE().
Lose the error pointer argument and return any errors the normal way.
Return EAGAIN for the case where more work needs to be done.
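For reference, TAILQ_FOREACH_SAFE() is the queue(3) idiom for walking a
list while possibly removing the current element; a generic sketch (the
element type and predicate are made up):

    struct item {
            TAILQ_ENTRY(item) link;
    };
    TAILQ_HEAD(, item) head = TAILQ_HEAD_INITIALIZER(head);
    struct item *it, *tmp;

    /* The _SAFE variant saves the next element in 'tmp' first, so the
     * current element may be unlinked (and freed) during the walk. */
    TAILQ_FOREACH_SAFE(it, &head, link, tmp) {
            if (done_with(it)) {    /* made-up predicate */
                    TAILQ_REMOVE(&head, it, link);
                    free(it, M_TEMP);
            }
    }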
I'm not sure why a credential was added to these in the first place, it is
not used anywhere and it doesn't make much sense:
The credentials for syncing a file (ability to write to the
file) should be checked at the system call level.
Credentials for syncing one or more filesystems ("none")
should be checked at the system call level as well.
If the filesystem implementation needs a particular credential
to carry out the syncing, it would logically have to be the
cached mount credential, or a credential cached along with
any delayed-write data.
Discussed with: rwatson
before deciding to do more expensive locking to account for process
exit. This acceptable minor race avoids two mutex operations in the
highly common case of accounting not being enabled.
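Schematically the check looks like this (names are illustrative and do
not match the real accounting code):

    static int
    acct_record_exit(struct proc *p)
    {
            int error = 0;

            /* Unlocked peek: if accounting is off (by far the common
             * case), skip both mutex operations.  The race is benign. */
            if (acct_vp == NULL)
                    return (0);
            mtx_lock(&acct_mtx);
            if (acct_vp != NULL)            /* re-check under the lock */
                    error = acct_write_record(p);   /* made-up helper */
            mtx_unlock(&acct_mtx);
            return (error);
    }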
MFC after: 2 weeks
turn it back on. Specifically, the actual changes are now less intrusive
in that the _get_spin_lock() and _rel_spin_lock() macros now have their
contents changed for UP vs SMP kernels which centralizes the changes.
Also, UP kernels do not use _mtx_lock_spin() and no longer include it. The
UP versions of the spin lock functions do not use any atomic operations,
but simple compares and stores which allow mtx_owned() to still work for
spin locks while removing the overhead of atomic operations.
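The UP variants end up roughly like the following (simplified; the real
macros also handle recursion, file/line tracking and assertions):

    #define _get_spin_lock(mp, tid, opts, file, line) do {            \
            critical_enter();                                          \
            /* plain store, no atomics; mtx_owned() still works */     \
            (mp)->mtx_lock = (uintptr_t)(tid);                         \
    } while (0)

    #define _rel_spin_lock(mp) do {                                    \
            (mp)->mtx_lock = MTX_UNOWNED;                              \
            critical_exit();                                           \
    } while (0)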
Tested on: i386, alpha
unloaded, clean up, or return EBUSY if that's inconvenient.' The
default module handler for newbus will now call this when we get a
MOD_QUIESCE event, but in the future may call this at other times.
This shouldn't change any actual behavior until drivers start to use it.
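A driver that wants to veto an unload while it still has outstanding
users might handle the new event like this (foo_refs and the handler
name are made up for the example):

    static int foo_refs;    /* outstanding users of the driver */

    static int
    foo_modevent(module_t mod, int type, void *data)
    {

            switch (type) {
            case MOD_LOAD:
                    return (0);
            case MOD_QUIESCE:
                    /* "Please prepare to be unloaded": clean up, or
                     * return EBUSY if that's inconvenient right now. */
                    return (foo_refs != 0 ? EBUSY : 0);
            case MOD_UNLOAD:
                    return (0);
            default:
                    return (EOPNOTSUPP);
            }
    }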
schedulers a bit to ensure more correct handling of priorities and fewer
priority inversions:
- Add two functions to the sched(9) API to handle priority lending:
sched_lend_prio() and sched_unlend_prio(). The turnstile code uses these
functions to ask the scheduler to lend a thread a set priority and to
tell the scheduler when it thinks it is ok for a thread to stop borrowing
priority. The unlend case is slightly complex in that the turnstile code
tells the scheduler what the minimum priority of the thread needs to be
to satisfy the requirements of any other threads blocked on locks owned
by the thread in question. The scheduler then decides whether the thread
can go back to normal mode (if its normal priority is high enough to
satisfy the pending lock requests) or if it should continue to use the
priority specified in the sched_unlend_prio() call. This involves adding
a new per-thread flag TDF_BORROWING that replaces the ULE-only kse flag
for priority elevation (see the sketch after this list).
- Schedulers now refuse to lower the priority of a thread that is currently
borrowing another thread's priority.
- If a scheduler changes the priority of a thread that is currently sitting
on a turnstile, it will call a new function turnstile_adjust() to inform
the turnstile code of the change. This function re-sorts the thread on
the priority list of the turnstile if needed, and if the thread ends up
at the head of the list (due to having the highest priority) and its
priority was raised, then it will propagate that new priority to the
owner of the lock it is blocked on.
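To make the lend/unlend split concrete, the unlend path in the 4BSD
scheduler is approximately the following (simplified):

    void
    sched_unlend_prio(struct thread *td, u_char prio)
    {
            u_char base_pri;

            /* 'prio' is the minimum priority still required by other
             * threads blocked on locks this thread owns. */
            if (td->td_base_pri >= PRI_MIN_TIMESHARE &&
                td->td_base_pri <= PRI_MAX_TIMESHARE)
                    base_pri = td->td_ksegrp->kg_user_pri;
            else
                    base_pri = td->td_base_pri;
            if (prio >= base_pri) {
                    /* Nobody needs the boost any more. */
                    td->td_flags &= ~TDF_BORROWING;
                    sched_prio(td, base_pri);
            } else
                    sched_lend_prio(td, prio);
    }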
Some additional fixes specific to the 4BSD scheduler include:
- Common code for updating the priority of a thread when the user priority
of its associated kse group changes has been consolidated into a new static
function resetpriority_thread() (see the sketch after this list). One
change to this function is that
it will now only adjust the priority of a thread if it already has a
time sharing priority, thus preserving any boosts from a tsleep() until
the thread returns to userland. Also, resetpriority() no longer calls
maybe_resched() on each thread in the group. Instead, the code calling
resetpriority() is responsible for calling resetpriority_thread() on
any threads that need to be updated.
- schedcpu() now uses resetpriority_thread() instead of just calling
sched_prio() directly after it updates a kse group's user priority.
- sched_clock() now uses resetpriority_thread() rather than writing
directly to td_priority.
- sched_nice() now updates all the priorities of the threads after the
group priority has been adjusted.
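The new 4BSD helper mentioned above is roughly (simplified):

    static void
    resetpriority_thread(struct thread *td, struct ksegrp *kg)
    {

            /* Only touch threads that currently have a time-sharing
             * priority, so a boost from tsleep() survives until the
             * thread returns to userland. */
            if (td->td_priority < PRI_MIN_TIMESHARE ||
                td->td_priority > PRI_MAX_TIMESHARE)
                    return;
            maybe_resched(td);
            sched_prio(td, kg->kg_user_pri);
    }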
Discussed with: bde
Reviewed by: ups, jeffr
Tested on: 4bsd, ule
Tested on: i386, alpha, sparc64