Commit Graph

8021 Commits

Author SHA1 Message Date
phk
f2581a224f Since we do not support forceful unmount of DEVFS we can do away with
the partially implemented vnode-readoption code in vgonechrl().
2005-01-04 08:49:14 +00:00
marcel
1a8a332194 Regen. 2005-01-03 00:47:23 +00:00
marcel
413099920e uuidgen(2) is MP safe. 2005-01-03 00:45:57 +00:00
imp
58871563b4 Implement device_quiesce. This method means 'you are about to be
unloaded, cleanup, or return ebusy of that's inconvenient.'  The
default module hanlder for newbus will now call this when we get a
MOD_QUIESCE event, but in the future may call this at other times.

This shouldn't change any actual behavior until drivers start to use it.
2004-12-31 20:47:51 +00:00
pjd
09d675e003 Be consistent and always use form 'return (value);' instead of 'return value;'.
We had (before this change) 84 lines where it was style(9)-clean and 15 lines
where it was not.
2004-12-31 14:52:53 +00:00
jhb
ad4d00347b Fix a typo and two whitespace nits. 2004-12-30 22:17:00 +00:00
jhb
3f307e93e3 Rework the interface between priority propagation (lending) and the
schedulers a bit to ensure more correct handling of priorities and fewer
priority inversions:
- Add two functions to the sched(9) API to handle priority lending:
  sched_lend_prio() and sched_unlend_prio().  The turnstile code uses these
  functions to ask the scheduler to lend a thread a set priority and to
  tell the scheduler when it thinks it is ok for a thread to stop borrowing
  priority.  The unlend case is slightly complex in that the turnstile code
  tells the scheduler what the minimum priority of the thread needs to be
  to satisfy the requirements of any other threads blocked on locks owned
  by the thread in question.  The scheduler then decides where the thread
  can go back to normal mode (if it's normal priority is high enough to
  satisfy the pending lock requests) or it it should continue to use the
  priority specified to the sched_unlend_prio() call.  This involves adding
  a new per-thread flag TDF_BORROWING that replaces the ULE-only kse flag
  for priority elevation.
- Schedulers now refuse to lower the priority of a thread that is currently
  borrowing another therad's priority.
- If a scheduler changes the priority of a thread that is currently sitting
  on a turnstile, it will call a new function turnstile_adjust() to inform
  the turnstile code of the change.  This function resorts the thread on
  the priority list of the turnstile if needed, and if the thread ends up
  at the head of the list (due to having the highest priority) and its
  priority was raised, then it will propagate that new priority to the
  owner of the lock it is blocked on.

Some additional fixes specific to the 4BSD scheduler include:
- Common code for updating the priority of a thread when the user priority
  of its associated kse group has been consolidated in a new static
  function resetpriority_thread().  One change to this function is that
  it will now only adjust the priority of a thread if it already has a
  time sharing priority, thus preserving any boosts from a tsleep() until
  the thread returns to userland.  Also, resetpriority() no longer calls
  maybe_resched() on each thread in the group. Instead, the code calling
  resetpriority() is responsible for calling resetpriority_thread() on
  any threads that need to be updated.
- schedcpu() now uses resetpriority_thread() instead of just calling
  sched_prio() directly after it updates a kse group's user priority.
- sched_clock() now uses resetpriority_thread() rather than writing
  directly to td_priority.
- sched_nice() now updates all the priorities of the threads after the
  group priority has been adjusted.

Discussed with:	bde
Reviewed by:	ups, jeffr
Tested on:	4bsd, ule
Tested on:	i386, alpha, sparc64
2004-12-30 20:52:44 +00:00
jhb
e3adf38617 Whitespace fix. 2004-12-30 20:30:58 +00:00
jhb
7b611b0cb2 Stop explicitly touching td_base_pri outside of the scheduler and simply
set a thread's priority via sched_prio() when that is the desired action.
The schedulers will start managing td_base_pri internally shortly.
2004-12-30 20:29:58 +00:00
jhb
a610dc93b7 Call tty_close() at the very end of ttyclose() since otherwise NULL
deferences can occur since tty_close() may end up freeing the tty structure
if it drops the last reference to it.

Glanced at by:	phk
2004-12-30 19:24:49 +00:00
rwatson
44386c7719 Make the sysctls kern.ipc.msgmnb and kern.ipc.msgtql into tunables as
is the case for most other sysctls in the System V IPC message queue
implementation.

PR:		75541
Submitted by:	Sergiy Vyshnevetskiy <serg at vostok dot net>
MFC after:	2 weeks
2004-12-30 13:56:34 +00:00
davidxu
6724b46563 Make umtx_wait and umtx_wake more like linux futex does, it is
more general than previous. It also lets me implement cancelable point
in thread library. Also in theory, umtx_lock and umtx_unlock can
be implemented by using umtx_wait and umtx_wake, all atomic operations
can be done in userland without kernel's casuptr() function.
2004-12-30 02:56:17 +00:00
alc
f5297a787e Eliminate (now) unnecessary acquisition and release of the global page
queues lock.
2004-12-29 04:49:10 +00:00
jhb
38cd373d81 - Up the WITNESS_COUNT macro from 200 to 1024 to support the growing number
of lock types in the kernel.  This results in an increase of witness
  data usage from ~145k to ~280k on i386 for kernels with
  'options WITNESS'.
- Remove the unused witness malloc bucket.

Submitted by:	Michal Mertl mime at traveller dot cz (1)
2004-12-28 21:21:27 +00:00
rwatson
c2459d7b3f Attempt to slightly refine the print out from "show alllocks" -- list
the process and thread numbers/names on the same line rather than on
separate lines, and print the thread pointer not just the tid.
2004-12-27 10:47:08 +00:00
kan
afd7d6f06b Do not vput(9) unlocked vnode and do not VREF it with the sole purpose
of vputting it back immediately.

Complained by:	DEBUG_VFS_LOCKS
2004-12-27 05:17:11 +00:00
jeff
94a75d08c3 - Unintentionally checked in a debugging panic. Remove that. 2004-12-26 23:21:48 +00:00
jeff
862fb71e5e - Remove a 4BSD specific hack since this will work on ULE too. 2004-12-26 22:56:51 +00:00
jeff
ceca9b8f9e - Fix a long standing problem where an ithread would not honor sched_pin().
- Remove the sched_add wrapper that used sched_add_internal() as a backend.
   Its only purpose was to interpret one flag and turn it into an int.  Do
   the right thing and interpret the flag in sched_add() instead.
 - Pass the flag argument to sched_add() to kseq_runq_add() so that we can
   get the SRQ_PREEMPT optimization too.
 - Add a KEF_INTERNAL flag.  If KEF_INTERNAL is set we don't adjust the SLOT
   counts, otherwise the slot counts are adjusted as soon as we enter
   sched_add() or sched_rem() rather than when the thread is actually placed
   on the run queue.  This greatly simplifies the handling of slots.
 - Remove the explicit prevention of migration for ithreads on non-x86
   platforms.  This was never shown to have any real benefit.
 - Remove the unused class argument to KSE_CAN_MIGRATE().
 - Add ktr points for thread migration events.
 - Fix a long standing bug on platforms which don't initialize the cpu
   topology.  The ksg_maxid variable was never correctly set on these
   platforms which caused the long term load balancer to never inspect
   more than the first group or processor.
 - Fix another bug which prevented the long term load balancer from working
   properly.  If stathz != hz we can't expect sched_clock() to be called
   on the exact tick count that we're anticipating.
 - Rearrange sched_switch() a bit to reduce indentation levels.
2004-12-26 22:56:08 +00:00
rwatson
b864ac486c Add "show alllocks" command to DDB, which dumps a list of processes
and threads currently holding sleep mutexes (and spin mutexes for
curthread).  This can be quite useful in looking for a lock condition
summary for a system, as it avoids manually iterating through threads
and processes to find all the interesting locks.

NB: "alllocks" is up there with "lockedvnods" for a bad argument for
show.

MFC after:	2 weeks
2004-12-26 22:52:24 +00:00
jeff
4739ea6908 - Run sched_userret() after thread_userret(). Before, sched_userret() would
lower the priority of the returning thread to a user priority before
   calling into thread_userret() which would call wakeup() which in turn would
   cause the returning thread to eventually context switch rather than
   completing its slice.  Allowing this thread to complete its slice first
   yields a 15% performance improvement in super-smack on my dual opteron with
   4BSD.
2004-12-26 07:30:35 +00:00
jeff
a5c50b5ce9 - Wrap the thread count adjustment in sched_load_add() and sched_load_rem()
so that we may place some ktr entries nearby.
 - Define other KTR_SCHED tracepoints so that we may graph the operation
   of the scheduler.
2004-12-26 00:16:24 +00:00
jeff
d378b46f4e - Remove earlier KTR_ULE tracepoints.
- Define new KTR_SCHED points so that we can graph the operation of the
   scheduler.
2004-12-26 00:15:33 +00:00
jeff
c2b9649e7a - Define KTR points for KTR_SCHED. 2004-12-26 00:14:21 +00:00
davidxu
9476ffbed8 Make _umtx_op() as more general interface, the final parameter needn't be
timespec pointer, every parameter will be interpreted by its opcode.
2004-12-25 13:02:50 +00:00
davidxu
7b03c7ecc4 1. introduce umtx_owner to get an owner of a umtx.
2. add const qualifier to umtx_timedlock and umtx_timedwait.
3. add missing blackets in umtx do_unlock_and_wait.
2004-12-25 12:49:35 +00:00
davidxu
da86559777 Add umtxq_lock/unlock around umtx_signal, fix debug kernel compiling,
let umtx_lock returns EINTR when it returns ERESTART, this lets
userland have chance to back off mtx lock code when needed.
2004-12-24 11:59:20 +00:00
davidxu
85fe4cfe89 1. Fix race condition between umtx lock and unlock, heavy testing
on SMP can explore the bug.
2. Let umtx_wake returns number of threads have been woken.
2004-12-24 11:30:55 +00:00
rwatson
649bb26a69 Assert the sem lock in sem_ref() and sem_rel(), as it is required to
safely manipulate the reference count.
2004-12-23 02:22:47 +00:00
rwatson
e1ce7eb9ce Remove temporary debugging printf that was used to detect the presence
of a race that had previously caused a panic in order to determine if
the fix was for the right problem.  It was.

MFC after:	2 weeks
2004-12-23 01:19:27 +00:00
rwatson
f1732152a7 In sonewconn(), the s/if/while/ change to wait for room at the tail of
the accept queue is a feature, not a bug/issue, so remove the XXXRW
from the comment.
2004-12-23 01:16:21 +00:00
rwatson
5728709bda Remove an XXXRW indicating atomic operations might be used as a
substitute for a global mutex protecting the socket count and
generation number.

The observation that soreceive_rcvoob() can't return an mbuf
chain is a property, not a bug, so remove the XXXRW.

In sorflush, s/existing/previous/ for code when describing prior
behavior.

For SO_LINGER socket option retrieval, remove an XXXRW about why
we hold the mutex: this is correct and not dubious.

MFC after:	2 weeks
2004-12-23 01:07:12 +00:00
rwatson
fd297e3939 In soalloc(), simplify the mac_init_socket() handling to remove
unnecessary use of a global variable and simplify the return case.
While here, use ()'s around return values.

In sodealloc(), remove a comment about why we bump the gencnt and
decrement the socket count separately.  It doesn't add
substantially to the reading, and clutters the function.

MFC after:	2 weeks
2004-12-23 00:59:43 +00:00
alc
04b2362b0f Add send buffer locking to uipc_send(). Without this locking a race can
occur between a reader and a writer that results in a panic upon close,
e.g.,
	"panic: sbflush_locked: cc 4 || mb 0xffffff0052afa400 || mbcnt 0"

Reviewed by: rwatson@
MFC after: 2 weeks
2004-12-22 20:28:46 +00:00
phk
fbe7293f5a Include uio.h
Check O_NONBLOCK instead if IO_NDELAY
Don't include vnode.h
2004-12-22 17:37:14 +00:00
phk
93beb5568e Hide/remove various printfs, now that root mounting doesn't seem to explode
on people.
2004-12-20 21:59:25 +00:00
phk
1a55c4023a fix a misleading sleep identifier. 2004-12-20 21:38:13 +00:00
phk
0faeb292ed We can only ever get to vgonechrl() from a devfs vnode, so we do not
need to reassign the vp->v_op to devfs_specops, we know that is the
value already.

Make devfs_specops private to devfs.
2004-12-20 21:34:29 +00:00
davidxu
acc63fde9c 1. msleep returns EWOULDBLOCK not ETIMEDOUT, use EWOULDBLOCK instead.
2. Eliminate a possible lock leak in timed wait loop.
2004-12-18 13:43:16 +00:00
davidxu
395ea4c2e2 1. make umtx sharable between processes, the way is two or more processes
call mmap() to create a shared space, and then initialize umtx on it,
   after that, each thread in different processes can use the umtx same
   as threads in same process.
2. introduce a new syscall _umtx_op to support timed lock and condition
   variable semantics. also, orignal umtx_lock and umtx_unlock inline
   functions now are reimplemented by using _umtx_op, the _umtx_op can
   use arbitrary id not just a thread id.
2004-12-18 12:52:44 +00:00
sam
8c71ba93c4 fix m_append for case where additional mbufs are required 2004-12-15 19:04:07 +00:00
phk
eab55e1589 Fix a deadlock I introduced this morning.
Mostly from:	tegge
2004-12-14 20:48:40 +00:00
jeff
c94fadce10 - Garbage collect several unused members of struct kse and struce ksegrp.
As best as I can tell, some of these were never used.
2004-12-14 10:53:55 +00:00
jeff
422a07b8e1 - In kseq_choose(), don't recalculate slice values for processes with a
nice of 0.  Doing so can cause an infinite loop because they should be
   running, but a nice -20 process could prevent them from doing so.
 - Add a new flag KEF_PRIOELEV to flag a thread that has had its priority
   elevated due to priority propagation.  If a thread has had its priority
   elevated, we assume that it must go on the current queue and it must
   get a slice.
 - In sched_userret() if our priority was elevated and we shouldn't have
   a timeslice, yield here until we should.

Found/Tested by:	glebius
2004-12-14 10:34:27 +00:00
phk
8a94ce5ea1 Add a new kind of reference count (fd_holdcnt) to struct filedesc
which holds on to just the data structure and the mutex.  (The
existing refcount (fd_refcnt) holds onto the open files in the
descriptor.)

The fd_holdcnt is protected by fdesc_mtx, fd_refcnt by FILEDESC_LOCK.

Add fdhold(struct proc *) which gets a hold on the filedescriptors of
the specified proc..

Add fddrop(struct filedesc *) which drops the fd_holdcnt and if zero
destroys the mutex and frees the memory.

Initialize the fd_holdcnt to one in fdinit().  Normal operations on
the filedesc structure will not change it.

In fdfree() use fddrop() to dispose of the mutex and structure.  Hold
the FILEDESC_LOCK() until we have cleaned out the contents and carefully
set the fields to null values during cleanup.

Use fdhold()/fddrop() in mountcheckdirs() and sysctl_kern_file().
2004-12-14 09:09:51 +00:00
phk
8986ab0ecf Make fdesc_mtx private to kern_descrip.c now that the flock has come home. 2004-12-14 08:44:51 +00:00
phk
b124f6c510 Move the checkdirs() function from vfs_mount.c to kern_descrip.c and
call it mountcheckdirs().
2004-12-14 08:23:18 +00:00
phk
e622b9bcba Add new function fdunshare() which encapsulates the necessary light magic
for ensuring that a process' filedesc is not shared with anybody.

Use it in the two places which previously had private implmentations.

This collects all fd_refcnt handling in kern_descrip.c
2004-12-14 07:20:03 +00:00
jeff
6ec1292f83 - If delivering a signal will result in killing a process that has a
nice value above 0, set it to 0 so that it may proceed with haste.
   This is especially important on ULE, where adjusting the priority
   does not guarantee that a thread will be granted a greater time slice.
2004-12-13 16:45:57 +00:00
jeff
00f132eec6 - Take up a 'slot' while we're on the assigned queue, waiting to be
posted to another processor.  Otherwise, kern_switch() gets confused
   and tries to sched_add(NULL).
2004-12-13 13:09:33 +00:00