remove previous entropy harvesting mutex names as they are no longer
present. The commit to this file was omitted when randomdev_soft.c:1.5
was made.
Feet shot: Robert Huff <roberthuff at rcn dot com>
Sockets in the listen queues have reference counts of 0, so if the
protocol decides to disconnect the pcb and try to free the socket, this
triggered a race with accept() wherein accept() would bump the reference
count before sofree() had removed the socket from the listen queues,
resulting in a panic in sofree() when it discovered it was freeing a
referenced socket. This might happen if a RST came in prior to accept()
on a TCP connection.
The fix is two-fold: to expand the coverage of the accept mutex earlier
in sofree() to prevent accept() from grabbing the socket after the "is it
really safe to free" tests, and to expand the logic of the "is it really
safe to free" tests to check that the refcount is still 0 (i.e., we
didn't race).
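In rough terms, the expanded test looks like this (illustrative sketch only;
the locking macros and field names are from memory, not the committed diff):

    ACCEPT_LOCK();
    SOCK_LOCK(so);
    if (so->so_count != 0) {
            /* accept() won the race and holds a reference; don't free. */
            SOCK_UNLOCK(so);
            ACCEPT_UNLOCK();
            return;
    }
    /* Still unreferenced: safe to unlink from the listen queue and free. */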
RELENG_5 candidate.
Much discussion with and work by: green
Reported by: Marc UBM Bocklet <ubm at u-boot-man dot de>
Reported by: Vlad <marchenko at gmail dot com>
and make it visible (same way as in OpenBSD). Describe usage in manpage.
This change is useful for creating custom free methods, which
call default free method at their end.
While here, make malloc declaration for mbuf tags more informative.
Approved by: julian (mentor), sam
MFC after: 1 month
when the spin lock in question isn't -- it's the critical_enter() that
KDB set. No more panic in DDB for console -> syscons -> tty -> knote
operations.
After some discussion the best option seems to be to signal the thread's
death from within the kernel. This requires that thr_exit() take an
argument.
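The resulting userland interface looks roughly like this (sketch; the actual
libthr usage is not shown):

    #include <sys/thr.h>

    static long exiting = 0;        /* a joining thread can sleep on this word */

    void
    thread_func(void *arg)
    {
            /* ... thread work ... */
            thr_exit(&exiting);     /* kernel signals our death through 'exiting' */
    }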
Discussed with: davidxu, deischen, marcel
MFC after: 3 days
Discussion: this panic (or warning) only occurs when the kernel is
compiled with INVARIANTS. Otherwise the problem (which means that
the vp->v_data field isn't NULL, and represents a coding error and
possibly a memory leak) is silently ignored by setting it to NULL
later on.
Panicking here isn't very helpful: by this time, we can only find
the symptoms. The panic occurs long after the reason for "not
cleaning" has been forgotten; in the case in point, it was the
result of severe file system corruption which left the v_type field
set to VBAD. That issue will be addressed by a separate commit.
all other threads to exit; the problem is that execve() can fail, and a
failed execve() would turn a threaded process into an unthreaded one, which
is an unexpected side effect.
The new code introduces a new single-threading mode, SINGLE_BOUNDARY. In
this mode, all threads except the singler suspend themselves at the user
boundary. We cannot use SINGLE_NO_EXIT because we want to start from a
clean state if execve() is successful: suspending the other threads at an
unknown point, later resuming them from there, and forcing them to exit at
the user boundary could leave the process starting from a dirty state. If
execve() succeeds, the current thread upgrades to SINGLE_EXIT mode and
forces the other threads to exit at the user boundary; otherwise the other
threads are resumed and their interrupted syscalls are restarted.
Reviewed by: julian
the raw values, including for child process statistics, and only compute the
system and user timevals on demand.
- Fix the various kern_wait() syscall wrappers to only pass in a rusage
pointer if they are going to use the result.
- Add a kern_getrusage() function for the ABI syscalls to use so that they
don't have to play stackgap games to call getrusage() (a usage sketch follows
this list).
- Fix the svr4_sys_times() syscall to just call calcru() to calculate the
times it needs rather than calling getrusage() twice with associated
stackgap, etc.
- Add a new rusage_ext structure to store raw time stats such as tick counts
for user, system, and interrupt time as well as a bintime of the total
runtime. A new p_rux field in struct proc replaces the same inline fields
from struct proc (i.e. p_[isu]ticks, p_[isu]u, and p_runtime). A new p_crux
field in struct proc contains the "raw" child time usage statistics.
ruadd() has been changed to handle adding the associated rusage_ext
structures as well as the values in rusage. Effectively, the values in
rusage_ext replace the ru_utime and ru_stime values in struct rusage. These
two fields in struct rusage are no longer used in the kernel.
- calcru() has been split into a static worker function calcru1() that
calculates appropriate timevals for user and system time as well as updating
the rux_[isu]u fields of a passed in rusage_ext structure. calcru() uses a
copy of the process' p_rux structure to compute the timevals after updating
the runtime appropriately if any of the threads in that process are
currently executing. It also now only locks sched_lock internally while
doing the rux_runtime fixup. calcru() now only requires the caller to
hold the proc lock and calcru1() only requires the proc lock internally.
calcru() also no longer allows callers to ask for an interrupt timeval
since none of them actually did.
- calcru() now correctly handles threads executing on other CPUs.
- A new calccru() function computes the child system and user timevals by
calling calcru1() on p_crux. Note that this means that any code that wants
child times must now call this function rather than reading from p_cru
directly. This function also requires the proc lock.
- This finishes the locking for rusage and friends so some of the Giant locks
in exit1() and kern_wait() are now gone.
- The locking in ttyinfo() has been tweaked so that a shared lock of the
proctree lock is used to protect the process group rather than the process
group lock. By holding this lock until the end of the function we now
ensure that the process/thread that we pick to dump info about will no
longer vanish while we are trying to output its info to the console.
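For example, an emulator's getrusage wrapper can now be written along these
lines (sketch; the foo_* names are hypothetical, not an existing ABI layer):

    int
    foo_getrusage(struct thread *td, struct foo_getrusage_args *uap)
    {
            struct rusage ru;
            int error;

            error = kern_getrusage(td, uap->who, &ru);
            if (error == 0)
                    error = copyout(&ru, uap->rusage, sizeof(ru));
            return (error);
    }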
Submitted by: bde (mostly)
MFC after: 1 month
turnstile chain lock until after making all the awakened threads
runnable. First, this fixes a priority inversion race. Second, this
attempts to finish waking up all of the threads waiting on a turnstile
before doing a preemption.
Reviewed by: Stephan Uphoff (who found the priority inversion race)
so would cause the kernel to produce an unkillable process in some cases;
in particular, when P_STOPPED_SINGLE is set there is a single-threading
thread, and turning off the bit would corrupt that state.
the new subr_unit.c code.
For now assert Giant in ttycreate() and ttyfree(). It is not obvious that
it will ever pay off to lock these with anything else.
Allocation is always lowest free unit number.
A mixed range/bitmap strategy for maximum memory efficiency. In
the typical case where no unit numbers are freed, total memory usage
is 56 bytes on i386.
malloc is called with M_WAITOK but no locking is provided (yet). A bit of
experience will be necessary to determine the best strategy. Hopefully
a "caller provides locking" strategy can be maintained, but that may
require use of M_NOWAIT allocation and failure handling.
A userland test driver is included.
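A consumer would use it roughly as follows (hedged sketch; the exact
prototypes in subr_unit.c may differ, e.g. whether new_unrhdr() takes a
mutex argument):

    struct unrhdr *uh;
    int unit;

    uh = new_unrhdr(0, 999);        /* manage unit numbers 0..999 */
    unit = alloc_unr(uh);           /* always the lowest free unit */
    /* ... use the unit ... */
    free_unr(uh, unit);
    delete_unrhdr(uh);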
generic way. This code will allow a similar amount of code to be
removed from most if not all serial port drivers.
Add generic cdevsw for tty devices.
Add generic slave cdevsw for init/lock devices.
Add ttypurge function which wakes up all known generic sleep
points in the tty code, and calls into the hw-driver if it
provides a method.
Add ttycreate function which creates tty device and optionally
cua device. In both cases .init/.lock devices are created
as well.
Change ttygone() slightly to also call the hw driver provided
purge routine.
Add ttyfree() which will purge and destroy the cdevs.
Add ttyconsole mode for setting console friendly termios
on a port.
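A driver's attach/detach path would then look roughly like this (hypothetical
sketch: the prototypes, flag values, and foo_* names are placeholders, not
the actual interface):

    tp = ttymalloc(NULL);                   /* allocate the struct tty */
    tp->t_oproc = foo_ttstart;              /* driver output-start routine */
    error = ttycreate(tp, NULL, unit, 0, "F%d", unit);  /* tty + init/lock devs */
    /* ... on detach: */
    ttygone(tp);                            /* wake sleepers, call driver purge */
    ttyfree(tp);                            /* purge and destroy the cdevs */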
is one, detect mbuf loops and stop, add an extra arg so you can only print
the first x bytes of the data per mbuf (print all if arg is -1), print
flags using %b (bitmask)...
No code in the tree appears to use m_print, and it's just a matter of adding
-1 as an additional arg to m_print to restore the original behavior.
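So a caller wanting the old full-dump behavior would simply do:

    m_print(m, -1);     /* -1: print all data bytes, as before */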
MFC after: 4 days
the trapframe via kdb_frame, but kdb_frame was not initialized until
after the call to kdb_cpu_trap(). Ergo: kdb_cpu_trap() was moved too
far up.
Pointy hat: marcel
dev_refthread() will return the cdevsw pointer or NULL. If the
return value is non-NULL a threadcount is held which must be released
with dev_relthread(). If the returned cdevsw is NULL no threadcount
is held on the device.
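A typical caller pattern (sketch):

    struct cdevsw *dsw;

    dsw = dev_refthread(dev);
    if (dsw == NULL)
            return (ENXIO);                 /* device has gone away */
    error = dsw->d_read(dev, uio, ioflag);  /* any cdevsw method; d_read is just an example */
    dev_relthread(dev);                     /* release the threadcount */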
It can be used to delay mounting the root partition, to give GEOM providers
a chance to show up.
Now, when the needed provider is not present, the vfs_rootmount() function
will look for it every second, and if it cannot be found within the defined
time it will ask for the root device name (before this change it asked
immediately).
This allows booting from a gmirror device in degraded mode.
of the number of threads which are inside whatever is behind the
cdevsw for this particular cdev.
Make the device mutex visible through dev_lock() and dev_unlock().
We may want finer granularity later.
Replace spechash_mtx use with dev_lock()/dev_unlock().
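For example (sketch; si_usecount is merely illustrative of a field guarded
this way):

    dev_lock();
    dev->si_usecount++;
    dev_unlock();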
Better to kill all other threads than to panic the system if 2 threads call
execve() at the same time. A better fix will be committed later.
Note that this only affects the case where the execve fails.
Ask uma_zcreate() to align mbufs to MSIZE bytes (otherwise dtom() breaks)
As it happens, uma_zalloc_arg() always returned mbufs aligned to MSIZE
anyway, but that was an implementation side-effect....
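Sketch of the idea (the ctor/dtor names are placeholders and the CTASSERT
shown is one plausible compile-time check, not necessarily the one committed):

    CTASSERT((MSIZE & (MSIZE - 1)) == 0);   /* dtom() masks with (MSIZE - 1) */

    zone_mbuf = uma_zcreate("mbuf", MSIZE, mb_ctor_mbuf, mb_dtor_mbuf,
        NULL, NULL, MSIZE - 1, 0);          /* align mask gives MSIZE alignment */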
KASSERT -> CTASSERT suggested by: dd@
Approved by: silence on -net
UMA_ZONE_NOFREE to guarantee type stability, so proc_fini() should
never be called. Move an assertion from proc_fini() to proc_dtor()
and garbage-collect the rest of the unreachable code. I have retained
vm_proc_dispose(), since I consider its disuse a bug.
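The zone is created roughly as follows in procinit() (sketch from memory):

    proc_zone = uma_zcreate("PROC", sched_sizeof_proc(),
        proc_ctor, proc_dtor, proc_init, proc_fini,
        UMA_ALIGN_PTR, UMA_ZONE_NOFREE);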
most if not all of our tty drivers in the future.
Centralizing this stuff enables us to remove about 100 lines of
almost but not quite perfectly copy&paste code from each tty driver.
and the previously malloc'ed snapshot lock.
Malloc struct snapdata instead of just the lock.
Replace snapshot fields in cdev with pointer to snapdata (saves 16 bytes).
While here, give the private readblock() function a vnode argument
in preparation for moving UFS to access GEOM directly.
preparation for integration of p4::phk_bufwork. In the future,
local filesystems will talk to GEOM directly and they will consequently
be able to issue BIO_DELETE directly. Since the removal of the fla
driver, BIO_DELETE has effectively been a no-op anyway.
fully initialized when the pmap layer tries to call sched_pin() early in the
boot, which results in a quick panic. Use ke_pinned instead, as was originally
done with Tor's patch.
Approved by: julian