Commit Graph

6580 Commits

Author SHA1 Message Date
phk
6689b404af Crude but efficient:
#ifdef DIAGNOSTIC hold a mutex while calling callouts so that we hear
about it if they sleep.
2003-06-20 08:07:15 +00:00
phk
b64d71d8c8 Don't (re)initialize f_gcflag to zero.
Move initialization of DTYPE_VNODE specific field f_seqcount into
the DTYPE_VNODE specific code.
2003-06-20 08:02:30 +00:00
davidxu
88ed270c3d When a STOP signal is being sent to a process, it is possible all
threads in the process have already masked the signal, so job control
is delayed. But later a thread unmasking the STOP signal should enable
job control, so in issignal(), scanning all threads in process to see
if we can direct suspend some of them, not just suspend current thread.
2003-06-20 03:36:45 +00:00
davidxu
c0a849442b Fix typo. td should be td0. 2003-06-20 01:56:28 +00:00
alfred
c618ac8338 Unlock the struct file lock before acquiring Giant, otherwise
we can deadlock because of lock order reversals.  This was not
caught because Witness ignores pool mutexes right now.

Diagnosis and help: truckman
Noticed by: pho
2003-06-19 18:13:07 +00:00
silby
79bbff7ee2 Add a ratelimited message of the form
"maxproc limit exceeded by uid %i, please see tuning(7) and login.conf(5)."

Which will be triggered whenever a user hits his/her maxproc limit or
the systemwide maxproc limit is reached.

MFC after:	1 week
2003-06-19 05:57:25 +00:00
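The rate limiting mentioned in the commit above can be pictured with a small sketch. This is only an illustration of the general idea, not the code from the commit: the struct `ratelimit_sketch` and the function `ratelimit_ok()` are hypothetical names, and the one-second window is an assumption.

```c
/*
 * Hypothetical sketch of a per-second message rate limiter.
 * None of these names come from the FreeBSD tree.
 */
struct ratelimit_sketch {
	long last_sec;	/* second in which the counter was last reset */
	int count;	/* messages already emitted in that second */
};

/* Return 1 if a message may be printed now, 0 if it must be dropped. */
int
ratelimit_ok(struct ratelimit_sketch *rl, long now_sec, int max_per_sec)
{
	if (rl->last_sec != now_sec) {
		/* A new second has started: reset the window. */
		rl->last_sec = now_sec;
		rl->count = 0;
	}
	if (rl->count >= max_per_sec)
		return (0);
	rl->count++;
	return (1);
}
```

A caller would guard the message with `if (ratelimit_ok(&rl, now, 1))`, so a user who hits the limit in a tight loop cannot flood the log.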
truckman
84188f1f4f FILE_LOCK() uses a pool mutex, as does the vnode v_vnlock. Since pool
mutexes are supposed to only be used as leaf mutexes, and what appear
to be separate pool mutexes could be aliased together, it is a bad idea
for a thread to attempt to hold two pool mutexes at the same time.

Slightly rearrange the code in kern_open() so that FILE_UNLOCK() is
called before calling VOP_GETVOBJECT(), which will grab the v_vnlock
mutex.
2003-06-19 04:10:56 +00:00
silby
82d03c66d5 Add a rate limited message reporting when kern.maxfiles is exceeded,
reporting who did it.

Also, fix a style bug introduced in the previous change.

MFC after:	1 week
2003-06-19 04:07:12 +00:00
truckman
b30ab68043 VOP_GETVOBJECT() wants to be called with the vnode lock held. 2003-06-19 03:55:01 +00:00
phk
a81d7fdac7 Introduce a new flag on a file descriptor: DFLAG_SEEKABLE and use that
rather than assume that only DTYPE_VNODE is seekable.
2003-06-18 19:53:59 +00:00
silby
0d0a45a41b Reserve the last 5% of file descriptors for root use. This should allow
systems to fail more gracefully when a file descriptor exhaustion situation
occurs.

Original patch by:	David G. Andersen <dga@lcs.mit.edu>
PR:			45353
MFC after:		1 week
2003-06-18 18:57:58 +00:00
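A minimal sketch of the 5% reservation described above, under stated assumptions: the function name `fd_alloc_allowed()` and its parameters are hypothetical, and the real check in the kernel may be structured differently.

```c
#include <stdbool.h>

/*
 * Hypothetical sketch: decide whether one more descriptor may be
 * allocated.  The last 5% of the table is held back for root (uid 0).
 */
bool
fd_alloc_allowed(int nfiles, int maxfiles, unsigned int uid)
{
	int reserve = maxfiles / 20;	/* 5% of the table */

	if (nfiles >= maxfiles)
		return (false);		/* table is completely full */
	if (uid != 0 && nfiles >= maxfiles - reserve)
		return (false);		/* only the root reserve is left */
	return (true);
}
```

With `maxfiles` set to 100, an unprivileged user is refused once 95 descriptors are in use, while root can still open files until all 100 are taken.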
phk
591f399cfe Initialize struct fileops with C99 sparse initialization. 2003-06-18 18:16:40 +00:00
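For readers unfamiliar with the term, "C99 sparse initialization" refers to designated initializers: only the named members are set and every other member is zero-initialized. The sketch below uses a made-up `struct fileops_sketch` with three members; the real struct fileops has a different member list.

```c
#include <stddef.h>

/* Hypothetical, cut-down operations table for illustration only. */
struct fileops_sketch {
	int (*fo_read)(void *fp, void *buf, size_t len);
	int (*fo_write)(void *fp, const void *buf, size_t len);
	int (*fo_close)(void *fp);
};

static int
null_close(void *fp)
{
	(void)fp;	/* nothing to release in this sketch */
	return (0);
}

/*
 * Only .fo_close is named; fo_read and fo_write are implicitly NULL.
 * Reordering or adding members to the struct does not break this
 * initializer, which is the main appeal over positional initialization.
 */
const struct fileops_sketch closeonly_ops = {
	.fo_close = null_close,
};
```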
jeff
263ba3bebb - Use a more robust mechanism for determining whether or not a kse is on a
kseq.
2003-06-17 19:49:18 +00:00
scottl
060172ae90 Drop the proc lock around SYSCTL_OUT in the no-threads case.
Submitted by:	truckman
2003-06-17 19:14:00 +00:00
jeff
85db173ae6 - Temporarily patch a problem where the interact score could be negative
because the run time exceeds the largest value a signed int can hold.
   The real solution involves calculating how far we are over the limit.
   To quickly solve this problem we loop removing 1/5th of the current value
   until it falls below the limit.  The common case requires no passes.
2003-06-17 10:21:34 +00:00
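The temporary workaround described above (repeatedly removing 1/5th of the value until it falls below the limit) can be sketched as follows. The function name `scale_below()` is hypothetical, and the guard against a zero-sized step is an addition for the sketch rather than something stated in the commit message.

```c
/*
 * Hypothetical sketch: shrink value by one fifth per pass until it is
 * below limit.  In the common case (value already below limit) the
 * loop body never runs.
 */
long
scale_below(long value, long limit)
{
	while (value > 0 && value >= limit) {
		long step = value / 5;

		/* Always make progress, even for very small values. */
		value -= (step > 0) ? step : 1;
	}
	return (value);
}
```

For example, `scale_below(1000, 600)` takes three passes (1000, 800, 640, 512) and returns 512.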
jeff
a9649bd4c0 - Add a new function "sched_interact_update()" that scales back the sleep
and run time.
 - Scale the sleep and run time back via sched_interact_update() in more
   places.  This is to keep the statistic more accurate.
 - Charge a parent one tick for forking a child.
 - Add only the run time and not the sleep time to the parents kg when a
   thread exits.  This allows us to give a penalty for having an expensive
   thread exit but does not give a bonus for having an interactive thread
   exit.
 - Change the SLP_RUN_THROTTLE to limit us to 4/5th and not 1/2.
 - Change the SLP_RUN_MAX to two seconds.  This keeps bursty interactive
   applications like mozilla and openoffice in the interactive range even
   through expensive tasks.
 - Recalculate the slice after every sleep.  This ensures that once a task
   has been marked interactive it only has a slice of 1 at the risk of
   giving tasks that sleep for a very brief period a longer time slice.
2003-06-17 06:39:51 +00:00
silby
2896b67c14 Hide the m_defrag* statistics under MBUF_STRESS_TEST, there seems
to be no need to see them in the general case (and they aren't
smp-safe anyway.)

Suggested by:	hmp
MFC after:	1 week
2003-06-17 02:34:40 +00:00
davidxu
91eb81dd0c Forgot to commit code to disable creating a bound thread in same
group again except first kse_create syscall.

Noticed by: julian
2003-06-16 23:46:41 +00:00
davidxu
1c3c8e4e60 Reset ncpus to 1 for bound thread group since there is only one
thread in such group.
Change message text from kse_rel to kserel, it is better displayed
in top.
2003-06-16 13:14:52 +00:00
phk
3b3b9689c1 Get rid of the b_spc specialty field in struct buf by using an already
available caller private field.
2003-06-16 07:18:39 +00:00
phk
ad04f29757 I have not had any reports of trouble for a long time, so remove the
gentle versions of the vop_strategy()/vop_specstrategy() mismatch methods
and use vop_panic() instead.
2003-06-15 19:49:14 +00:00
rwatson
51aa556e27 Various cr*() calls believed to be MPSAFE, since the uidinfo
code is locked down.
2003-06-15 15:57:42 +00:00
davidxu
1d77a8e0f6 1. Add code to support bound thread. When blocked, a bound thread never
schedules an upcall. Signal delivering to a bound thread is same as
   non-threaded process. This is intended to be used by libpthread to
   implement PTHREAD_SCOPE_SYSTEM thread.
2. Simplify kse_release() a bit, remove sleep loop.
2003-06-15 12:51:26 +00:00
iedowse
6fb682520c Don't overwrite the static panicstr buffer for secondary and further
panics. Before revision 1.38, we used to just point panicstr at the
format string if panicstr was NULL, but since we now use a static
buffer for the formatted panic message, we have to be careful to
only write to it during the first panic.

Pointed out by:	bde
2003-06-15 11:43:00 +00:00
jeff
840862a5c2 - Increase the ksegrp's cpu time history buffer to 250ms.
- Decrease the history buffer divisor to 2 so that we remember more of the
   old behavior.
2003-06-15 04:14:25 +00:00
davidxu
90eed4d53b 1. Migrate TDF_UPCALLING from td_flags to td_pflags.
2. Add a flag TDF_SA, it will be used to distinguish SA
   based thread from bound thread.
2003-06-15 03:18:58 +00:00
jeff
fcc153102c - Cap the growth of sleep and run time in sched_exit_kse(). 2003-06-15 02:52:29 +00:00
jeff
1b22cc79dc - Fix the maximum slice value. I accidentally checked in a value of '2'
which meant no process would run for longer than 20ms.
 - Slightly redo the interactivity scorer.  It follows the same algorithm but
   in a slightly more correct way.  Previously values above half were
   incorrect.
 - Lower the interactivity threshold to 20.  It seems that in testing non-
   interactive tasks are hardly ever near there and expensive interactive
   tasks can sometimes surpass it.  This area needs more testing.
 - Remove an unnecessary KTR.
 - Fix a case where an idle thread that had an elevated priority due to
   priority prop. would be placed back on the idle queue.
 - Delay setting NEEDRESCHED until userret() for threads that had their
   priority elevated while in kernel.  This gives us the same context switch
   optimization as SCHED_4BSD.
 - Limit the child's slice to 1 in sched_fork_kse() so we detect its behavior
   more quickly.
 - Inherit some of the run/slp time from the child in sched_exit_ksegrp().
 - Redo some of the priority comparisons so they are more clear.
 - Throttle the frequency of sched_pctcpu_update() so that rounding errors
   do not make it invalid.
2003-06-15 02:18:29 +00:00
davidxu
abb4420bbe Rename P_THREADED to P_SA. P_SA means a process is using scheduler
activations.
2003-06-15 00:31:24 +00:00
alc
83f108b04d Migrate the thread stack management functions from the machine-dependent
to the machine-independent parts of the VM.  At the same time, this
introduces vm object locking for the non-i386 platforms.

Two details:

1. KSTACK_GUARD has been removed in favor of KSTACK_GUARD_PAGES.  The
different machine-dependent implementations used various combinations
of KSTACK_GUARD and KSTACK_GUARD_PAGES.  To disable guard page, set
KSTACK_GUARD_PAGES to 0.

2. Remove the (unnecessary) clearing of PG_ZERO in vm_thread_new.  In
5.x, (but not 4.x,) PG_ZERO can only be set if VM_ALLOC_ZERO is passed
to vm_page_alloc() or vm_page_grab().
2003-06-14 23:23:55 +00:00
alc
d20c30720b Move the *_new_altkstack() and *_dispose_altkstack() functions out of the
various pmap implementations into the machine-independent vm.  They were
all identical.
2003-06-14 06:20:25 +00:00
mux
c4ee7613fb Style(9). 2003-06-13 19:39:21 +00:00
des
6e6f4e8270 Make the VFS cache use zones instead of malloc(9). This results in a
small but noticeable increase in performance for name lookup operations.

The code uses two zones, one for short names (less than 32 characters)
and one for long names (up to NAME_MAX).  Since most file names are
fairly short, this saves a considerable amount of space that would
otherwise be wasted if we always allocated NAME_MAX bytes.  The cutoff
value of 32 characters was picked arbitrarily and may benefit from some
tweaking; it could also be made into a tunable.

Submitted by:	hmp
2003-06-13 08:46:13 +00:00
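The two-zone split described above comes down to a size test at allocation time. The sketch below only illustrates that decision: the enum, the function `pick_name_zone()`, and the constant `SKETCH_NAME_MAX` (assumed to be 255) are hypothetical, and no actual zone allocator is involved.

```c
#include <stddef.h>

#define SKETCH_SHORT_CUTOFF	32	/* cutoff quoted in the commit message */
#define SKETCH_NAME_MAX		255	/* assumed value of NAME_MAX */

enum name_zone {
	ZONE_SHORT,	/* names shorter than the cutoff */
	ZONE_LONG,	/* names up to SKETCH_NAME_MAX */
	ZONE_NONE	/* too long to cache at all */
};

/* Pick the zone a cache entry for a name of this length would come from. */
enum name_zone
pick_name_zone(size_t namelen)
{
	if (namelen < SKETCH_SHORT_CUTOFF)
		return (ZONE_SHORT);
	if (namelen <= SKETCH_NAME_MAX)
		return (ZONE_LONG);
	return (ZONE_NONE);
}
```

Because most names fall into the short zone, each such entry reserves a few dozen bytes for the name rather than room for the longest possible name.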
alc
d66a37a0f2 Add vm object locking to various pagers' "get pages" methods, i386 stack
management functions, and a u area management function.
2003-06-13 03:02:28 +00:00
phk
fd139fd7d0 Initialize struct vfsops C99-sparsely.
Submitted by:   hmp
Reviewed by:	phk
2003-06-12 20:48:38 +00:00
des
c4e22440ca Document some sysctl variables.
Submitted by:	hmp
2003-06-12 19:46:51 +00:00
scottl
7d369c191d Add support to sysctl_kern_proc to return all threads in a proc, not just the
first one.  The old behaviour can be switched by specifying KERN_PROC_PROC.

Submitted by: julian, tweaks and added functionality by myself
2003-06-12 16:41:50 +00:00
alc
e8221b068f Finish the vm object locking in sendfile(2). More generally,
the vm locking in sendfile(2) is complete.
2003-06-12 05:52:09 +00:00
alc
4451de3f80 Lock the vm object when removing a page. 2003-06-11 21:23:04 +00:00
alc
df7799dd77 Lock the vm object when removing a page. 2003-06-11 16:37:33 +00:00
des
f27cbd8cfb Whitespace cleanup. 2003-06-11 07:35:56 +00:00
alc
958ca4b214 Add vm object locking. 2003-06-11 06:43:48 +00:00
obrien
7d804031bd Use __FBSDID(). 2003-06-11 06:34:30 +00:00
ps
3fbe5ead23 Don't overflow when calculating vm_kmem_size. This fixes kmem_map
too small panics on PAE machines which have odd > 4GB sizes (4.5 gig
would render 20MB of KVA for kmem_map instead of 200MB).

Submitted by:	John Cagle <john.cagle@hp.com>, jeff
Reviewed by:	jeff, peter, scottl, lots of USENIX folks
2003-06-11 05:18:59 +00:00
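The class of bug fixed above can be shown with generic arithmetic. This sketch is not the expression used for vm_kmem_size; it only demonstrates how a byte count for more than 4GB of memory wraps when computed in 32 bits. The function names and the 4096-byte page size are assumptions.

```c
#include <stdint.h>

/* Byte count computed in 32 bits: wraps modulo 2^32. */
uint32_t
mem_bytes32(uint32_t npages, uint32_t pagesize)
{
	return (npages * pagesize);
}

/* Byte count computed in 64 bits: widen before multiplying. */
uint64_t
mem_bytes64(uint32_t npages, uint32_t pagesize)
{
	return ((uint64_t)npages * pagesize);
}
```

With 1179648 pages of 4096 bytes (4.5GB), the 32-bit version yields 536870912 (512MB) while the 64-bit version yields 4831838208. Anything derived from the truncated figure ends up roughly an order of magnitude too small.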
davidxu
52433d15d4 Fix error in my last commit. Correctly maintain p_maxthrwaits and unlock
sched_lock.
2003-06-11 01:08:33 +00:00
obrien
3b8fff9e4c Use __FBSDID(). 2003-06-11 00:56:59 +00:00
davidxu
c30bd0e029 If there are signals delivered to current thread, breaks out of loop,
userret() will be called again by ast() and thread_userret() will be
called again by userret().

Reported by: tegge
2003-06-10 02:21:32 +00:00
mux
57a3d130f6 style(9). 2003-06-09 21:57:48 +00:00
jhb
dc1245470e Wait for the real interval timer callout handler to finish executing if it
is currently executing when we try to remove it in exit1().  Without this,
it was possible for the callout to bogusly rearm itself and eventually
refire after the process had been free'd resulting in a panic.

PR:		kern/51964
Reported by:	Jilles Tjoelker <jilles@stack.nl>
Reviewed by:	tegge, bde
2003-06-09 21:46:22 +00:00
jhb
734fc8d52e The issetugid() function is MPSAFE. 2003-06-09 21:34:19 +00:00
alc
99e4660fa8 Update the vm object and page locking in exec_map_first_page(). Mark the
one still anticipated change with XXX.  Otherwise, this function is done.
2003-06-09 19:37:14 +00:00
alc
00063a54b2 - Add vm object locking to vm_pgmoveco().
- Add a comment to vm_pgmoveco() describing what remains to be done
   for vm locking.
2003-06-09 19:23:03 +00:00
jmallett
2f59062691 Attempt to fix Alpha build by renaming ident[] to kern_ident[]. 2003-06-09 18:19:33 +00:00
jhb
ae45522340 - Add a td_pflags field to struct thread for private flags accessed only by
curthread.  Unlike td_flags, this field does not need any locking.
- Replace the td_inktr and td_inktrace variables with equivalent private
  thread flags.
- Move TDF_OLDMASK over to the private flags field so it no longer requires
  sched_lock.
2003-06-09 17:38:32 +00:00
jmallett
05e817b7af Expose kern.ident by way of OID_AUTO.
Requested by:	phk
2003-06-09 10:54:23 +00:00
jeff
64dd44ce61 - Add a simple CPU load balancing algorithm. This works by executing once a
second and equalizing the load between the two most imbalanced CPUs.  This
   is intended to clear up long term load imbalances that would not be handled
   by the 'pull' method in sched_choose().
 - Pull out some bits of sched_choose() into a kseq_move() function that moves
   an arbitrary thread from one kseq to another.
2003-06-09 00:39:09 +00:00
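One reading of the balancing step described above is: find the busiest and the least busy CPU and move half of the difference between them. The sketch below implements that reading on a plain array of load counts. The function `balance_once()` is hypothetical, and treating "the two most imbalanced" as "highest and lowest load" is an assumption.

```c
/*
 * Hypothetical sketch of one balancing pass over per-CPU load counts.
 * Moves half of the difference from the busiest CPU to the idlest one.
 */
void
balance_once(int *load, int ncpu)
{
	int hi = 0, lo = 0, i, move;

	if (ncpu <= 1)
		return;
	for (i = 1; i < ncpu; i++) {
		if (load[i] > load[hi])
			hi = i;
		if (load[i] < load[lo])
			lo = i;
	}
	move = (load[hi] - load[lo]) / 2;
	load[hi] -= move;
	load[lo] += move;
}
```

Given loads of {8, 2, 5, 3}, one pass moves three threads from CPU 0 to CPU 1 and leaves {5, 5, 5, 3}. Running the pass once a second, as the commit describes, gradually evens out longer-lived imbalances.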
alc
cc48080643 Lock the vm object when performing vm_page_grab(). 2003-06-08 07:14:30 +00:00
jeff
e1588d6299 - When a new thread is added to a kseq the load is incremented prior to
adding it to the nice tables.  Therefore, in kseq_add_nice, we should
   keep in mind that the load will be 1 if we are the only thread, and not
   0.
 - Assert that the sched lock is held in all the appropriate places.
 - Increase the scope of the sched lock in sched_pctcpu_update().
 - Hold the sched lock in sched_runnable().  It is not held by the caller.
2003-06-08 00:47:33 +00:00
phk
8d105bca1c Improve the root-dev prompt facility for printing devices which could
possibly be a root filesystem.
2003-06-07 15:46:53 +00:00
davidxu
9a8a455a6a thread_signal_add now is called with ps_mtx held, unlock it before
calling copyin.
2003-06-06 02:17:38 +00:00
rwatson
6b8a71ea4a If a system call comes in requesting to retrieve an attribute named
"", temporarily map it to a call to extattr_list_vp() to provide
compatibility for older applications using the "" API to retrieve
EA lists.

Use VOP_LISTEXTATTR() to support extattr_list_vp() rather than
VOP_GETEXTATTR(..., "", ...).

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-06-05 05:55:34 +00:00
rwatson
9c43f2f46e Add vop_listextattr(), similar to vop_getextattr() but without a
specific attribute name.  It will have the same semantics as the
older vop_getextattr() "retrieve the names" hack, returning
a buffer with ASCII nul-separated names.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-06-05 05:53:35 +00:00
marcel
482a35058c Change the second (and last) argument of cpu_set_upcall(). Previously
we were passing in a void* representing the PCB of the parent thread.
Now we pass a pointer to the parent thread itself.
The prime reason for this change is to allow cpu_set_upcall() to copy
(parts of) the trapframe instead of having it done in MI code in each
caller of cpu_set_upcall(). Copying the trapframe cannot always be
done with a simple bcopy() or may not always be optimal that way. On
ia64 specifically the trapframe contains information that is specific
to an entry into the kernel and can only be used by the corresponding
exit from the kernel. A trapframe copied verbatim from another frame
is in most cases useless without some additional normalization.

Note that this change removes the assignment to td->td_frame in some
implementations of cpu_set_upcall(). The assignment is redundant.
A previous call to cpu_thread_setup() already did the exact same
assignment. An added benefit of removing the redundant assignment is
that we can now change td_pcb without nasty side-effects.

This change officially marks the ability on ia64 for 1:1 threading.

Not tested on: amd64, powerpc
Compile & boot tested on: alpha, sparc64
Functionally tested on: i386, ia64
2003-06-04 21:13:21 +00:00
phk
a9b8284f73 Add instrumentation which tells us how much work softclock() does
per invocation.
2003-06-04 05:25:58 +00:00
rwatson
df2db1a4c2 Implementations of extattr_list_fd(), extattr_list_file(), and
extattr_list_link() system calls, which return a list of extended
attributes defined for a vnode referenced by a file descriptor
or path name.  Currently, we just invoke VOP_GETEXTATTR() since
it will convert a request for an empty name into a query for a
name list, which was the old (more hackish) API.  At some point
in the near future, we'll push the distinction between get and
list down to the vnode operation layer, but this provides access
to the new API for applications in the short term.

Pointed out by:	Dominic Giampaolo <dbg@apple.com>
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-06-04 03:57:28 +00:00
rwatson
bf8d163e59 Regen from syscalls.master:1.149, addition of extended attribute
list system calls for fd, file, link.
2003-06-04 03:50:20 +00:00
rwatson
f603fe84cb Add system calls to explicitly list extended attributes on a
file/directory/link, rather than using a less explicit hack on
the extattr retrieval API:

  extattr_list_fd()
  extattr_list_file()
  extattr_list_link()

The existing API was counter-intuitive, and poorly documented.
The prototypes for these system calls are identical to
extattr_get_*(), but without a specific attribute name to
leave NULL.

Pointed out by:	Dominic Giampaolo <dbg@apple.com>
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-06-04 03:49:31 +00:00
rwatson
16f34ab413 Assert the vnode lock when returning successfully from vn_open_cred(). 2003-06-04 00:54:27 +00:00
julian
e31c5c959f Remove un-needed code.
Don't copyin() data we are about to overwrite.
Add a flag to tell userland that KSE is officially "DONE" with the
mailbox and has gone away.

Obtained from:	davidxu@
2003-06-04 00:12:57 +00:00
bmilekic
81e7f6caa9 Fix a potential bucket leak where when freeing to an empty bucket
we failed to put the bucket back into the general cache/container.

Also, fix a bad assumption.  There was a KASSERT() that aimed to
guarantee that whenever the pcpu container's mc_starved was > 0,
that whatever the bucket we were freeing to was an empty bucket,
assuming it belonged to the pcpu container cache. However, there
is at least one case where this is not true anymore; consider:
1) All containers empty, next thread to try to alloc will touch
   a pcpu container, notice it's empty, and increment the pcpu
   container's mc_starved.
2) Some other thread frees an mbuf belonging to a bucket in
   the general cache/container.  Then it frees another mbuf
   belonging to the same bucket (still in gen container).
3) Some third thread tries to allocate an mbuf from the pcpu
   container and, since empty, grabs one mbuf now available
   in the general cache and moves the non-empty bucket from
   which it took 1 mbuf and to which the thread in (2) freed
   to, and moves it to the pcpu container.
4) A final thread tries to free an mbuf belonging to the
   NON-EMPTY bucket mentioned in (2) and (3) and, since
   the pcpu container's mc_starved is > 0, but the bucket
   is obviously non-empty, it trips on the KASSERT.
This meant that one could potentially get a panic in some
cases when out of mbufs and clusters.  The problem could
be mitigated by commenting out some cv_signal() calls,
but I'm assuming that was pure coincidence and this is
the correct fix.
2003-06-03 19:19:13 +00:00
jeff
27ff96520c - Remove the blocked pointer from the umtx structure.
- Use a hash of umtx queues to queue blocked threads.  We hash on pid and the
   virtual address of the umtx structure.  This eliminates cases where we
   previously held a lock across a casuptr call.

Reviewed by:	jhb (quickly)
2003-06-03 05:24:46 +00:00
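Hashing on the pid and the virtual address of the umtx, as described above, could look roughly like this. It is a sketch only: the function name, the queue count of 128, and the multiplicative mixing constant are all assumptions and are not taken from the commit.

```c
#include <stddef.h>
#include <stdint.h>

#define UMTX_SKETCH_BITS	7			/* assumed: 2^7 queues */
#define UMTX_SKETCH_QUEUES	(1u << UMTX_SKETCH_BITS)

/*
 * Hypothetical sketch: map (pid, user address) to a queue index.
 * The same umtx in the same process always maps to the same queue,
 * so blocked threads can be found without a pointer in the umtx itself.
 */
size_t
umtx_sketch_hash(int32_t pid, uintptr_t uaddr)
{
	uint64_t key = (uint64_t)uaddr + (uint64_t)(uint32_t)pid;

	/* Multiplicative hashing: keep the top UMTX_SKETCH_BITS bits. */
	key *= 0x9E3779B97F4A7C15ULL;
	return ((size_t)(key >> (64 - UMTX_SKETCH_BITS)));
}
```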
tegge
e41badac0a Add tracking of process leaders sharing a file descriptor table and
allow a file descriptor table to be shared between multiple process
leaders.

PR:		50923
2003-06-02 16:05:32 +00:00
marcel
2d3c5aba3d Remove the ia64 hackery in threadinit() that was needed to work around
the lameness of the kstack code. The EPC overhaul de-lame-ified the
kstack code by removing the need for contigmalloc(). We can now
allocate stacks using malloc(). We probably want to make the stacks
swappable as well so that we can make it MI. But that's another story.
2003-06-01 05:57:58 +00:00
rwatson
3a7cb1d1fd Attempt to further comment and clarify System V IPC logic: document
why certain exceptions are made, note an inconsistency between
FreeBSD and some other implementations regarding IPC_M, and let
suser() generate our EPERM rather than forcing it ourselves.
Remove a carriage return that crept in in the last commit.

Reviewed by:	gordon
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-05-31 23:31:51 +00:00
rwatson
55281b2df1 Attempt to marginally de-obfuscate sections of the System V IPC access
control logic.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-05-31 23:17:30 +00:00
phk
11a69d36a0 Add "" around mutex name to make message less confusing. 2003-05-31 21:11:01 +00:00
phk
2048912526 Remove unused variable(s).
Found by:       FlexeLint
2003-05-31 20:29:34 +00:00
phk
bc9e7b689e Remove return after panic.
Found by:       FlexeLint
2003-05-31 20:18:23 +00:00
phk
11faebdb1a Remove needless return
Found by:       FlexeLint
2003-05-31 20:16:44 +00:00
phk
108e10f649 Add a couple of XXX comments where the intent is not clear.
Found by:       FlexeLint
2003-05-31 20:13:58 +00:00
phk
145891d899 Remove unused variable(s).
Remove break after goto

Found by:       FlexeLint
2003-05-31 20:11:33 +00:00
phk
d876d5ad8d Remove return after panic.
Found by:       FlexeLint
2003-05-31 20:09:42 +00:00
phk
ddccb1d287 Remove unused variable and now unbalanced call to splbio();
Found by:       FlexeLint
2003-05-31 20:09:01 +00:00
marcel
9bba923f2e Fix ia32 compat on ia64. Recent ia64 MD changes caused the garbage on
the stack to be changed in a way incompatible with elf32_map_insert()
where we used data_buf without initializing it for when the partial
mapping resulting in a misaligned image (typical when the page size
implied by the image is not the same as the page size in use by the
kernel). Since data_buf is passed by reference to vm_map_find(), the
compiler cannot warn about it.

While here, move all local variables to the top of the function.
2003-05-31 19:55:05 +00:00
phk
3953441927 "break" rather than fall through to a break in the default clause.
Found by:       FlexeLint
2003-05-31 16:53:16 +00:00
phk
383c23d209 Introduce {be,le}_uuid_{enc,dec}() functions for explicitly encoding
and decoding UUID's in big endian and little endian binary format.
2003-05-31 16:47:07 +00:00
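To illustrate what explicit big endian and little endian UUID encoding means, the sketch below serializes a UUID into 16 bytes both ways. The struct `uuid_sketch` follows the usual DCE field layout but is a stand-in, and the `sketch_*` functions are hypothetical; they do not claim to match the signatures of the functions added by the commit.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in UUID structure using the conventional DCE field layout. */
struct uuid_sketch {
	uint32_t time_low;
	uint16_t time_mid;
	uint16_t time_hi_and_version;
	uint8_t clock_seq_hi_and_reserved;
	uint8_t clock_seq_low;
	uint8_t node[6];
};

/* Big endian: most significant byte of each integer field first. */
void
sketch_be_uuid_enc(uint8_t out[16], const struct uuid_sketch *u)
{
	out[0] = (uint8_t)(u->time_low >> 24);
	out[1] = (uint8_t)(u->time_low >> 16);
	out[2] = (uint8_t)(u->time_low >> 8);
	out[3] = (uint8_t)u->time_low;
	out[4] = (uint8_t)(u->time_mid >> 8);
	out[5] = (uint8_t)u->time_mid;
	out[6] = (uint8_t)(u->time_hi_and_version >> 8);
	out[7] = (uint8_t)u->time_hi_and_version;
	out[8] = u->clock_seq_hi_and_reserved;
	out[9] = u->clock_seq_low;
	memcpy(&out[10], u->node, sizeof(u->node));
}

/* Little endian: only the three multi-byte integers change order. */
void
sketch_le_uuid_enc(uint8_t out[16], const struct uuid_sketch *u)
{
	out[0] = (uint8_t)u->time_low;
	out[1] = (uint8_t)(u->time_low >> 8);
	out[2] = (uint8_t)(u->time_low >> 16);
	out[3] = (uint8_t)(u->time_low >> 24);
	out[4] = (uint8_t)u->time_mid;
	out[5] = (uint8_t)(u->time_mid >> 8);
	out[6] = (uint8_t)u->time_hi_and_version;
	out[7] = (uint8_t)(u->time_hi_and_version >> 8);
	out[8] = u->clock_seq_hi_and_reserved;
	out[9] = u->clock_seq_low;
	memcpy(&out[10], u->node, sizeof(u->node));
}
```

Encoding field by field makes the result independent of the host byte order and of any padding the compiler might place in the struct.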
phk
0129a20107 The IO_NOWDRAIN and B_NOWDRAIN hacks are no longer needed to prevent
deadlocks with vnode backed md(4) devices because md now uses a
kthread to run the bio requests instead of doing it directly from
the bio down path.
2003-05-31 16:42:45 +00:00
peter
6c537c22b4 Add __amd64__ to the ifdefs that introduce the "pcicfg" spinlock to
witness.

Approved by:  re (safe amd64 support)
2003-05-31 06:42:37 +00:00
mux
2618a96d5e When loading a module that contains a sysctl which is already compiled
in the kernel, the sysctl_register() call would fail, as expected.
However, when unloading this module again, the kernel would then panic
in sysctl_unregister().  Print an error message instead.

Submitted by:	Nicolai Petri <nicolai@catpipe.net>
Reviewed by:	imp
Approved by:	re@ (jhb)
2003-05-29 21:19:18 +00:00
dwmalone
2220e22eed Add an INVARIANTS-only check to make sure Giant is held if mbuf
allocation is attempted with M_TRYWAIT.

Reviewed by:	bmilekic
Approved by:	re (scottl)
2003-05-29 18:38:24 +00:00
dwmalone
a02706ac93 Grab giant in sendit rather than kern_sendit because sockargs may
allocate mbufs with M_TRYWAIT, which may require Giant.

Reviewed by:	bmilekic
Approved by:	re (scottl)
2003-05-29 18:36:26 +00:00
iedowse
0253303b1a In cluster_wbuild(), initialise b_iocmd to BIO_WRITE before calling
buf_start() to avoid triggering a panic in softdep_disk_io_initiation()
if b_iocmd happened to be BIO_READ. The later initialisation of
b_iocmd in cluster_wbuild() could probably be moved to before the
buf_start() call, but this patch keeps the change as simple as
possible.

This is reported to fix occasional "softdep_disk_io_initiation: read"
panics, especially on NFS servers.

Reported by:	Nick Hilliard <nick@netability.ie>
Tested by:	Nick Hilliard <nick@netability.ie>
Approved by:	re (rwatson)
2003-05-28 13:22:10 +00:00
peter
54155a49a5 Copy the va_list in sbuf_vprintf() before passing it to vsnprintf(),
because we could fail due to a small buffer and loop and rerun.  If this
happens, then the vsnprintf() will have already taken the arguments off
the va_list.  For i386 and others, this doesn't matter because the
va_list type is passed as a copy.  But on powerpc and amd64, this is
fatal because the va_list is a reference to an external structure that
keeps the vararg state due to the more complicated argument passing system.
On amd64, arguments can be passed as follows:
First 6 int/pointer type arguments go in registers, the rest go on
  the memory stack.
Float and double are similar, except using SSE registers.
long double (80 bit precision) are similar except using the x87 stack.
Where the 'next argument' comes from depends on how many have been
processed so far and what type it is.  For amd64, gcc keeps this state
somewhere that is referenced by the va_list.

I found a description that showed the va_copy was required here:
http://mirrors.ccs.neu.edu/cgi-bin/unixhelp/man-cgi?va_end+9
The single unix spec doesn't mention va_copy() at all.

Anyway, the problem was that the sysctl kern.geom.conf* nodes would panic
due to walking off the end of the va_arg lists in vsnprintf.  A better fix
would be to have sbuf_vprintf() use a single pass and call kvprintf()
with a callback function that stored the results and grew the buffer
as needed.

Approved by:	re (scottl)
2003-05-25 19:03:08 +00:00
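The retry problem described above can be reproduced in userland. The sketch below is not sbuf_vprintf(); it is a hypothetical `format_retry()` that grows a heap buffer and calls vsnprintf() again when the first attempt is too small. The point it shows is the va_copy() before each attempt, so that a retry starts again from the first argument.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical sketch: format into a heap buffer, growing and retrying
 * when it is too small.  Each attempt consumes a private copy of ap.
 */
char *
format_retry(size_t initial, const char *fmt, va_list ap)
{
	size_t size = (initial > 0) ? initial : 1;
	char *buf = NULL;

	for (;;) {
		char *nbuf = realloc(buf, size);
		va_list ap_copy;
		int len;

		if (nbuf == NULL) {
			free(buf);
			return (NULL);
		}
		buf = nbuf;

		/* Without this copy, a second pass would reuse a spent list. */
		va_copy(ap_copy, ap);
		len = vsnprintf(buf, size, fmt, ap_copy);
		va_end(ap_copy);

		if (len < 0) {
			free(buf);
			return (NULL);
		}
		if ((size_t)len < size)
			return (buf);		/* it fit, including the NUL */
		size = (size_t)len + 1;		/* retry with the exact size */
	}
}

/* Variadic front end; the caller frees the returned string. */
char *
format_alloc(size_t initial, const char *fmt, ...)
{
	va_list ap;
	char *s;

	va_start(ap, fmt);
	s = format_retry(initial, fmt, ap);
	va_end(ap);
	return (s);
}
```

Calling `format_alloc(4, "%s-%d-%s", "alpha", 42, "omega")` forces one retry. On ABIs where va_list refers to external state, dropping the va_copy() would make the second vsnprintf() read past the real arguments.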
jeff
4c8aa154ff - Create a new lock, umtx_lock, for use instead of the proc lock for
protecting the umtx queues.  We can't use the proc lock because we need
   to hold the lock across calls to casuptr, which can fault.

Approved by:	re
2003-05-25 18:18:32 +00:00
jeff
a4b79b551b - Reset the free ent to NULL if we have consumed the last free entry. This
fixes a problem where we would overwrite old data if we ran out of free
   entries.

Submitted by:	sam
Approved by:	re (scottl)
2003-05-25 08:48:42 +00:00
alc
53638c7027 Make the maximum number of vnodes a function of both the physical memory
size and the kernel's heap size, specifically, vm_kmem_size.  This
function allows a maximum of 40% of the vm_kmem_size to be used for
vnodes and vm objects.  This is a conservative bound based upon recent
problem reports.  (In other words, a slight increase in this percentage
may be safe.)

Finally, machines with less than ~3GB of RAM should be unaffected
by this change, i.e., the maximum number of vnodes should remain
the same.  If necessary, machines with 3GB or more of RAM can increase
the maximum number of vnodes by increasing vm_kmem_size.

Desired by:	scottl
Tested by:	jake
Approved by:	re (rwatson,scottl)
2003-05-23 19:54:02 +00:00
julian
117dadd4fc When we are spilling threads out of the run queue during panic, make sure we
keep the thread state variable consistent with its real state.
i.e. Don't say it's on the run queue when it isn't.

Also clarify the associated comment.

Turns a double panic back to a single panic :-/

Approved by:	re@ (jhb)
2003-05-21 18:53:25 +00:00
marcel
5d3af2c5ab Revamp of the syscall path, exception and context handling. The
prime objectives are:
o  Implement a syscall path based on the epc instruction (see
   sys/ia64/ia64/syscall.s).
o  Revisit the places where we need to save and restore registers
   and define those contexts in terms of the register sets (see
   sys/ia64/include/_regset.h).

Secondary objectives:
o  Remove the requirement to use contigmalloc for kernel stacks.
o  Better handling of the high FP registers for SMP systems.
o  Switch to the new cpu_switch() and cpu_throw() semantics.
o  Add a good unwinder to reconstruct contexts for the rare
   cases we need to (see sys/contrib/ia64/libuwx)

Many files are affected by this change. Functionally it boils
down to:
o  The EPC syscall doesn't preserve registers it does not need
   to preserve and places the arguments differently on the stack.
   This affects libc and truss.
o  The address of the kernel page directory (kptdir) had to
   be unstaticized for use by the nested TLB fault handler.
   The name has been changed to ia64_kptdir to avoid conflicts.
   The renaming affects libkvm.
o  The trapframe only contains the special registers and the
   scratch registers. For syscalls using the EPC syscall path
   no scratch registers are saved. This affects all places where
   the trapframe is accessed. Most notably the unaligned access
   handler, the signal delivery code and the debugger.
o  Context switching only partly saves the special registers
   and the preserved registers. This affects cpu_switch() and
   triggered the move to the new semantics, which additionally
   affects cpu_throw().
o  The high FP registers are either in the PCB or on some
   CPU. Context switching for them is done lazily. This affects
   trap().
o  The mcontext has room for all registers, but not all of them
   have to be defined in all cases. This mostly affects signal
   delivery code now. The *context syscalls are as of yet still
   unimplemented.

Many details went into the removal of the requirement to use
contigmalloc for kernel stacks. The details are mostly CPU
specific and limited to exception_save() and exception_restore().
The few places where we create, destroy or switch stacks were
mostly simplified by not having to construct physical addresses
and additionally saving the virtual addresses for later use.

Besides more efficient context saving and restoring, which of
course yields a noticeable speedup, this also fixes the dreaded
SMP bootup problem as a side-effect. The details of which are
still not fully understood.

This change includes all the necessary backward compatibility
code to have it handle older userland binaries that use the
break instruction for syscalls. Support for break-based syscalls
has been pessimized in favor of a clean implementation. Due to
the overall better performance of the kernel, this will still
be noticed as an improvement if it's noticed at all.

Approved by: re@ (jhb)
2003-05-16 21:26:42 +00:00
truckman
80040f21a3 Detect that a vnode has been reclaimed while vflush() was waiting to lock
the vnode and restart the loop.  Vflush() is vulnerable since it does not
hold a reference to the vnode and it holds no other locks while waiting
for the vnode lock.  The vnode will no longer be on the list when the
loop is restarted.

Approved by:	re (rwatson)
2003-05-16 19:46:51 +00:00
obrien
384dc4a2a3 Fix long standing bug that prevents the PT_CONTINUE, PT_KILL and
PT_DETACH ptrace(2) requests from functioning as advertised in the
manual page.  As described in kern/35175, the PT_DETACH request will,
under certain circumstances, pass an unwanted signal on to the traced
process upon detaching from it.  The PT_CONTINUE request will
sometimes fail if you make it pass a signal that has "properties" that
differ from the properties of the signal that originally caused the
traced process to be stopped.  Since PT_KILL is nothing more than
PT_CONTINUE with SIGKILL, it is broken too.  In the PT_KILL case, this
leads to an unkillable process.

PR:		44011
Submitted by:	Mark Kettenis <kettenis@chello.nl>
Approved by:	re(jhb)
2003-05-16 01:34:23 +00:00
rwatson
1db54a2d45 VOP_PATHCONF() requires a vnode lock; this patch adds locking to
fpathconf(). The lock is held for direct calls to VOP_PATHCONF() in
pathconf() already.

Approved by:	re (jhb)
Pointed out by:	DEBUG_VFS_LOCKS
2003-05-15 21:13:08 +00:00
bmilekic
f48bcc48de Make the mb_alloc low-watermark sysctl-tunable read-only and make
netstat(1) not display it for now because its effects are not yet
completely implemented and we're about to cut 5.1-RELEASE.
This is temporary.

Approved by: re (scottl, rwatson)
2003-05-15 19:05:28 +00:00
ps
00084d3dc9 p_sigignore moved into struct sigacts. move one which was missed.
Approved by:	re (scottl)
2003-05-14 00:03:55 +00:00
jhb
89a4eb17de - Merge struct procsig with struct sigacts.
- Move struct sigacts out of the u-area and malloc() it using the
  M_SUBPROC malloc bucket.
- Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(),
  sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared().
- Remove the p_sigignore, p_sigacts, and p_sigcatch macros.
- Add a mutex to struct sigacts that protects all the members of the struct.
- Add sigacts locking.
- Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now
  that sigacts is locked.
- Several in-kernel functions such as psignal(), tdsignal(), trapsignal(),
  and thread_stopped() are now MP safe.

Reviewed by:	arch@
Approved by:	re (rwatson)
2003-05-13 20:36:02 +00:00
jhb
824931292d In setitimer(2), if the it_value of the new itimer value is clear, then
don't add the current time to it, but leave it as clear so that when the
timer is disabled, the it_value is always clear.

Reviewed by:	bde
Approved by:	re (rwatson)
2003-05-13 19:21:46 +00:00
alc
0422418ef4 Optimize the use of splay in gbincore(). During a "make buildworld" the
desired buffer is found at one of the roots more than 60% of the time.
Thus, checking both roots before performing either splay eliminates
unnecessary splays on the first tree splayed.

Approved by:	re (jhb)
2003-05-13 04:36:02 +00:00
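The root check described in the gbincore() commit above can be sketched roughly as follows. This is an illustration only: the structure names, `tree_search()`, and the `slow_path` out-parameter are hypothetical, and the slow path here is a plain binary-tree walk standing in for the kernel's real splay operation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's buffer and vnode structures. */
struct buf_sketch {
	long lblkno;
	struct buf_sketch *left, *right;
};

struct vnode_sketch {
	struct buf_sketch *clean_root;	/* root of the clean-buffer tree */
	struct buf_sketch *dirty_root;	/* root of the dirty-buffer tree */
};

/* Placeholder slow path: a plain BST walk, not a real splay. */
static struct buf_sketch *
tree_search(struct buf_sketch *root, long lblkno)
{
	while (root != NULL && root->lblkno != lblkno)
		root = (lblkno < root->lblkno) ? root->left : root->right;
	return (root);
}

/* Check both roots first; fall back to the tree search only on a miss. */
static struct buf_sketch *
gbincore_sketch(struct vnode_sketch *vp, long lblkno, int *slow_path)
{
	struct buf_sketch *bp;

	assert(vp != NULL && slow_path != NULL);
	*slow_path = 0;
	if ((bp = vp->clean_root) != NULL && bp->lblkno == lblkno)
		return (bp);
	if ((bp = vp->dirty_root) != NULL && bp->lblkno == lblkno)
		return (bp);
	*slow_path = 1;
	if ((bp = tree_search(vp->clean_root, lblkno)) != NULL)
		return (bp);
	return (tree_search(vp->dirty_root, lblkno));
}
```

A hit at either root returns without touching the trees, which is the case the commit message reports as the common one.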
phk
d38d01f72d Bail out if there were not two loadable sections. Add XXX comment about
one other issue.

Approved by:	re/rwatson.
2003-05-12 15:08:10 +00:00
rwatson
c21d149f29 Remove bogus locking from DDB's "show lockedvnods" command: using
synchronization primitives from inside DDB is generally a bad idea,
and in this case it frequently results in panics due to DDB commands
being executed from the sio fast interrupt context on a serial
console.  Replace the locking with a note that a lack of locking
means that DDB may see inconsistent views of the mount and vnode
lists, which could also result in a panic.  In practice, though,
this avoids a panic more often than it causes one.

Discussed with ages ago:	bde
Approved by:			re (scottl)
2003-05-12 14:37:47 +00:00
phk
587d476cf9 Don't pass NULL pointer to memset if we are compiled with DIAGNOSTIC
Approved by:	re/rwatson
2003-05-12 05:09:56 +00:00
bmilekic
988354fc06 Make m_freem() just use m_free() instead of duplicating the code. The
reason for the duplication was that m_freem() was meant to eventually
be optimized to hold the lock of the cache being freed to for as long as
possible across frees, but the difficulty of implementing said
optimization right now is too high, given that in some cases (see MAC
and non-cluster external buffers), we need to call into other subsystems,
something not permissible when the cache lock is held.

This change minimizes code duplication while keeping at least the
atomic mbuf+cluster free optimization.

Suggested by: luigi
2003-05-10 18:08:23 +00:00
jhb
9efb8e111e Remove Giant from kern_sigsuspend() and osigsuspend() as these should now
be MP safe.

Approved by:	re (scottl)
2003-05-09 19:11:32 +00:00
rwatson
1a81aef457 Rename MAC_MAX_POLICIES to MAC_MAX_SLOTS, since the variables and
constants in question refer to the number of label slots, not the
maximum number of policies that may be loaded.  This should reduce
confusion regarding an element in the MAC sysctl MIB, as well as
make it more clear what the effect of changing the compile-time
constants is.

Approved by:	re (jhb)
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-05-08 19:49:42 +00:00
rwatson
cdd2c23d20 Clean up locking for the MAC Framework:
(1) Accept that we're now going to use mutexes, so don't attempt
    to avoid treating them as mutexes.  This cleans up locking
    accessor function names some.

(2) Rename variables to _mtx, _cv, _count, simplifying the naming.

(3) Add a new form of the _busy() primitive that conditionally
    makes the list busy: if there are entries on the list, bump
    the busy count.  If there are no entries, don't bump the busy
    count.  Return a boolean indicating whether or not the busy
    count was bumped.

(4) Break mac_policy_list into two lists: one with the same name
    holding dynamic policies, and a new list, mac_static_policy_list,
    which holds policies loaded before mac_late and without the
    unload flag set.  The static list may be accessed without
    holding the busy count, since it can't change at run-time.

(5) In general, prefer making the list busy conditionally, meaning
    we pay only one mutex lock per entry point if all modules are
    on the static list, rather than two (since we don't have to
    lower the busy count when we're done with the framework).  For
    systems running just Biba or MLS, this will halve the mutex
    accesses in the network stack, and may offer substantial
    performance benefits.

(6) Lay the groundwork for a dynamic-free kernel option which
    eliminates all locking associated with dynamically loaded or
    unloaded policies, for pre-configured systems requiring
    maximum performance but less run-time flexibility.

These changes have been running for a few weeks on MAC development
branch systems.

Approved by:	re (jhb)
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-05-07 17:49:24 +00:00
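Item (3) of the MAC locking commit above, the conditional busy primitive, might look roughly like the sketch below. All names are hypothetical, and the `lock_ops` counter merely stands in for mutex acquisitions so the cost difference from item (5) is visible; real kernel code would take and drop a mutex at the marked points.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the dynamic policy list and its busy count. */
struct policy_list_sketch {
	int nentries;	/* dynamic policies currently loaded */
	int busy_count;	/* callers currently walking the dynamic list */
	int lock_ops;	/* stands in for mutex acquisitions */
};

/*
 * Bump the busy count only if the list has entries.
 * Returns true if the count was bumped, so the caller knows
 * whether a matching unbusy call is needed.
 */
static bool
list_busy_conditional(struct policy_list_sketch *pl)
{
	bool bumped = false;

	pl->lock_ops++;		/* a real kernel would lock the mutex here */
	if (pl->nentries > 0) {
		pl->busy_count++;
		bumped = true;
	}
	/* ...and unlock it here */
	return (bumped);
}

static void
list_unbusy(struct policy_list_sketch *pl)
{
	pl->lock_ops++;		/* second lock operation, only when bumped */
	assert(pl->busy_count > 0);
	pl->busy_count--;
}
```

With no dynamic policies loaded, an entry point pays for one lock operation instead of two, because it never has to lower the busy count afterwards.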
alc
d18bfec38d Lock the vm_object when performing vm_pager_deallocate(). 2003-05-06 02:45:28 +00:00
jhb
ad3e75f51e Tweak the clearing of TDF_DEADLKTREAT so that we only bother grabbing the
lock and clearing the flag if it was clear when uiomove() was called.
2003-05-05 21:27:29 +00:00
jhb
65572963c9 Mostly sort the includes. 2003-05-05 21:26:25 +00:00
jhb
099389efb0 Lock the proc lock around calls to tdsignal() in the sigwait() family of
syscalls.
2003-05-05 21:18:10 +00:00
jhb
755cc1e549 Make issignal() private to kern_sig.c since it is only called from cursig()
and cursig() is now a function rather than a macro.
2003-05-05 21:16:28 +00:00
jhb
828797f029 Remove TD_ON_RUNQ() from a check to make sure Giant is not held when
calling mi_switch().  The kernel would panic on an earlier KASSERT() in
mi_switch() if TD_ON_RUNQ() was true.
2003-05-05 21:12:36 +00:00
dwmalone
86e87d5e2d Split sendit into two parts. The first part, still called sendit, that
does the copyin stuff and then calls the second part kern_sendit to do
the hard work. Don't bother holding Giant during the copyin phase.

The intent of this is to allow the Linux emulator to implement send*
syscalls without using the stackgap.
2003-05-05 20:33:38 +00:00
mbr
98d5255d63 Change the semantics of sysv shm emulation to take an additional
argument to the functions shm{at,ctl}1 and shm_find_segment_by_shmid{x}.
The BSD semantics didn't allow the use of a shared segment after
it had been marked for removal through IPC_RMID.

The patch involves the following functions:
  - shmat
  - shmctl
  - shm_find_segment_by_shmid
  - shm_find_segment_by_shmidx
  - linux_shmat
  - linux_shmctl

Submitted by:	Orlando Bassotto <orlando.bassotto@ieo-research.it>
Reviewed by:	marcel
2003-05-05 09:22:58 +00:00
phk
091a67c527 Add two KASSERTS which trigger if free(9) would drag the "memuse" statistic
for a malloc bucket under zero.  This typically happens if you malloc(9)
from one bucket and free to another.
2003-05-05 08:32:53 +00:00
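The memuse check from the commit above can be illustrated with a small userland sketch. The structure and function names are hypothetical, and `assert()` stands in for the kernel's KASSERT().

```c
#include <assert.h>

/* Hypothetical per-bucket statistics, loosely modelled on a malloc type. */
struct malloc_type_sketch {
	const char *name;
	unsigned long memuse;	/* bytes currently charged to this bucket */
};

/* True if freeing 'size' bytes would drag memuse below zero. */
static int
free_would_underflow(const struct malloc_type_sketch *mt, unsigned long size)
{
	return (size > mt->memuse);
}

static void
account_malloc(struct malloc_type_sketch *mt, unsigned long size)
{
	mt->memuse += size;
}

static void
account_free(struct malloc_type_sketch *mt, unsigned long size)
{
	/* Fires when memory allocated from one bucket is freed to another. */
	assert(!free_would_underflow(mt, size));
	mt->memuse -= size;
}
```

Allocating 64 bytes from one bucket and freeing them to a different, empty bucket trips the assertion immediately instead of silently wrapping the counter.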
phk
d8ee9fd2c6 Use le32dec() instead of le32toh() because we are not guaranteed to have
a word aligned input.
2003-05-05 07:22:35 +00:00
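The reason for preferring a byte-wise decode on unaligned input, as in the le32dec() commit above, can be shown with a userland sketch. `le32dec_sketch()` is a hypothetical stand-in written for this illustration, not the kernel's own definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode a little-endian 32-bit value one byte at a time.
 * Because it never dereferences a uint32_t pointer, the input
 * may sit at any address, aligned or not.
 */
static uint32_t
le32dec_sketch(const void *pp)
{
	const unsigned char *p = pp;

	assert(p != NULL);
	return (((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
	    ((uint32_t)p[1] << 8) | (uint32_t)p[0]);
}
```

A byte-swapping macro such as le32toh() operates on a value that has already been loaded as a 32-bit word, and that load can fault on strict-alignment machines when the source pointer is not word aligned.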
alc
410b675ed9 - Revert kern/vfs_subr.c revision 1.444. The vm_object's size isn't
trustworthy for vnode-backed objects.
 - Restore the old behavior of vm_object_page_remove() when the end
   of the given range is zero.  Add a comment to vm_object_page_remove()
   regarding this behavior.

Reported by:	iedowse
2003-05-03 08:09:24 +00:00
alc
d282523893 Lock access to the vm_object's flags in vop_stdcreatevobject(). 2003-05-02 19:33:21 +00:00
julian
910b47d3e7 Fix typo in last commit 2003-05-02 06:18:55 +00:00
silby
f449396167 Add the M_FREELIST flag, which is used to detect whenever a
double free of a mbuf occurs and cause an immediate panic, rather
than allowing free list corruption to occur.

This code is trapped under INVARIANTS, so it should not cause any
change in default performance.

Reviewed by:	a bunch of people on -net
MFC after:	1 week
2003-05-02 03:43:40 +00:00
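A rough sketch of the double-free detection described in the M_FREELIST commit above follows. The flag value, structure, and function names are hypothetical, and the sketch returns an error where the kernel would panic.

```c
#include <assert.h>
#include <stddef.h>

#define	M_FREELIST_SKETCH	0x4000	/* hypothetical flag bit */

struct mbuf_sketch {
	int m_flags;
	struct mbuf_sketch *m_next;
};

/*
 * Put an mbuf on the free list.  Returns 0 on success and -1 if the
 * mbuf is already marked as free (the kernel would panic instead).
 */
static int
mbuf_free_sketch(struct mbuf_sketch **freelist, struct mbuf_sketch *m)
{
	assert(freelist != NULL && m != NULL);
	if (m->m_flags & M_FREELIST_SKETCH)
		return (-1);		/* double free detected */
	m->m_flags |= M_FREELIST_SKETCH;
	m->m_next = *freelist;
	*freelist = m;
	return (0);
}

/* Take an mbuf off the free list and clear the marker. */
static struct mbuf_sketch *
mbuf_alloc_sketch(struct mbuf_sketch **freelist)
{
	struct mbuf_sketch *m = *freelist;

	if (m != NULL) {
		*freelist = m->m_next;
		m->m_flags &= ~M_FREELIST_SKETCH;
		m->m_next = NULL;
	}
	return (m);
}
```

Catching the second free at the point it happens keeps the list from being corrupted, which would otherwise surface later and far from the actual bug.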
julian
6ab69ab6e0 remove old and inaccurate XXX comment. 2003-05-02 01:02:20 +00:00
julian
a954908b35 Move the flag that indicates an idle thread from the KSE to the thread.
It was always referenced via the thread anyhow.

Reviewed by:	jhb (a LOOOOONG time ago)
2003-05-02 00:33:12 +00:00
jhb
53849d5075 Remove Giant from the setuid(), seteuid(), setgid(), setegid(),
setgroups(), setreuid(), setregid(), setresuid(), and setresgid() syscalls
as well as the cred_update_thread() function.
2003-05-01 21:21:42 +00:00
jhb
9e17fca425 Initialize and destroy the struct proc mutex in the proc zone's init and
fini routines instead of in fork() and wait().  This has the nice side
benefit that the proc lock of any process on the allproc list is always
valid and sched_lock doesn't have to be used to test against PRS_NEW
anymore.
2003-05-01 21:16:38 +00:00
jhb
2b455d2859 Garbage collect unused TDF_INMSLEEP flag. 2003-05-01 17:05:24 +00:00
des
8ed712ead1 Instead of recording the Unix time in a process when it starts, record the
uptime.  Where necessary, convert it back to Unix time by adding boottime
to it.  This fixes a potential problem in the accounting code, which would
compute the elapsed time incorrectly if the Unix time was stepped during
the lifetime of the process.
2003-05-01 16:59:23 +00:00
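The conversion described in the commit above, from a start time recorded as uptime back to Unix time, amounts to adding boottime. A minimal sketch, using a hypothetical `timeval_sketch` type rather than the kernel's struct timeval:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical seconds/microseconds pair. */
struct timeval_sketch {
	int64_t sec;
	int64_t usec;
};

/*
 * Convert a process start time recorded as uptime into wall-clock
 * time by adding the boot time, carrying microseconds into seconds.
 */
static struct timeval_sketch
start_wallclock(struct timeval_sketch start_uptime,
    struct timeval_sketch boottime)
{
	struct timeval_sketch tv;

	assert(start_uptime.usec >= 0 && start_uptime.usec < 1000000);
	assert(boottime.usec >= 0 && boottime.usec < 1000000);
	tv.sec = boottime.sec + start_uptime.sec;
	tv.usec = boottime.usec + start_uptime.usec;
	if (tv.usec >= 1000000) {
		tv.sec++;
		tv.usec -= 1000000;
	}
	return (tv);
}
```

Elapsed time is then computed as current uptime minus start uptime, so stepping the wall clock during the process lifetime no longer affects it.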
davidxu
f63c44bf1c Fix a compile problem: p_tracee exists only in my local repository for
threaded process debugging and is not ready to be committed yet.
2003-05-01 12:16:06 +00:00
davidxu
a8c00fe70b Drop the Giant lock before being suspended and pick it up again after being resumed.
thread_suspend_check() is used in exit1() which still needs
Giant lock.
2003-05-01 07:29:25 +00:00
alc
2eb952c37f Lock an update to a vm_object's ref_count. 2003-05-01 03:51:05 +00:00
alc
e9c4374a87 Lock accesses to the vm_object's ref_count and resident_page_count. 2003-05-01 03:10:38 +00:00
peter
d6b6ab622f AMD64 uses the new-style cpu_switch()/cpu_throw() calling conventions. 2003-04-30 21:45:03 +00:00
jhb
65f917c9f1 Forgot to remove Giant around call to kern_sigaction() in
freebsd4_sigaction() in revision 1.232.
2003-04-30 19:45:13 +00:00
jhb
0bca1844ff Axe a stale comment. 2003-04-30 19:41:04 +00:00
markm
6cc289554b Fix some easy, global, lint warnings. In most cases, this means
making some local variables static. In a couple of cases, this means
removing an unused variable.
2003-04-30 12:57:40 +00:00
davidxu
766ca101f3 Increase some default values. 2003-04-30 01:18:29 +00:00
kan
9468fdaf14 Deprecate machine/limits.h in favor of new sys/limits.h.
Change all in-tree consumers to include <sys/limits.h>

Discussed on:	standards@
Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>
2003-04-29 13:36:06 +00:00
mike
3e80873bba style(9) 2003-04-28 18:32:19 +00:00
alc
afa5d8b791 Finish the vm_object locking for this file, including holding the vm_object
lock when accessing the vm_object's flags or calling vm_page_lookup().
2003-04-28 05:40:45 +00:00
davidxu
450b9799ce unlock sched_lock at right time. 2003-04-27 04:32:40 +00:00
alc
d5ac0bc453 Various changes to vm_object_page_remove():
- Eliminate an odd, special-case feature:
   if start == end == 0 then all pages are removed.  Only one caller
   used this feature and that caller can trivially pass the object's
   size.
 - Assert that the vm_object is locked on entry; don't bother testing
   for a NULL vm_object.
 - Style: Fix lines that are longer than 80 characters.
2003-04-26 23:41:30 +00:00
alc
4f5c780f99 - Lock the vm_object on entry to vm_object_terminate(). 2003-04-26 19:36:19 +00:00
alc
373b18b5c3 - Convert vm_object_pip_wait() from using tsleep() to msleep().
- Make vm_object_pip_sleep() static.
 - Lock the vm_object when performing vm_object_pip_wait().
2003-04-26 18:33:18 +00:00
alc
3db24e0c46 - Lock the vm_object when performing vm_page_alloc() in allocbuf(). 2003-04-26 07:42:24 +00:00
phk
773e071682 Update the "last malloc failure timestamp" also for simulated
malloc errors.
2003-04-25 21:49:24 +00:00
jhb
942b3c3db7 Remove Giant from getpgid() and getsid() and tweak the logic to more
closely match that of 4.x.
2003-04-25 20:09:31 +00:00
jhb
db5f78d397 Push down Giant around calls to proc_rwmem() in kern_ptrace. kern_ptrace()
should now be MP safe.
2003-04-25 20:02:16 +00:00
jhb
57c0e7ab21 Push Giant down into kern_sigaction() instead of locking it around calls
to kern_sigaction() in the various callers of the function.
2003-04-25 20:01:19 +00:00
jhb
b3c19f6ec9 - Push down Giant around vnode operations in ktrace().
- Mark the ktrace() and utrace() syscalls as being MP safe.
- Validate the facs argument to ktrace() prior to doing any vnode
  operations or acquiring any locks.
- Share lock the proctree lock over the entire section that calls
  ktrsetchildren() and ktrops().  We already did this for process groups.
  Doing it for the process case closes a small race where a process might
  go away after we look it up.  As a result of this, ktrsetchildren() now
  just asserts that the proctree lock is locked rather than acquiring the
  lock itself.
- Add some missing comments to #else and #endif.
2003-04-25 19:59:35 +00:00
deischen
3d51b3a280 Add an argument to get_mcontext() which specifies whether the
syscall return values should be cleared.  The system calls
getcontext() and swapcontext() want to return 0 on success
but these contexts can be switched to at a later time so
the return values need to be cleared in the saved register
sets.  Other callers of get_mcontext() would normally want
the context without clearing the return values.

Remove the i386-specific context saving from the KSE code.
get_mcontext() is not i386-specific any more.

Fix a bad pointer in the alpha get_mcontext() code.  The
context was being bcopy()'d from &td->tf_frame, but tf_frame
is itself a pointer, so the thread was being copied instead.
Spotted by jake.

Glanced at by:  jake
Reviewed by:    bde (months ago)
2003-04-25 01:50:30 +00:00
tjr
5464273ad7 Include altkstack pages in the RSS regardless of whether the process
is swapped out. Pointed out by jhb.
2003-04-25 00:20:40 +00:00
des
4e35cc9041 It seems that 1 was not a magic value as I thought, but a coincidence.
Instead of applying the adjustment to processes with a start time of 1,
apply it to all processes with a start time of less than 3600.

None of this would be necessary if the start times were recorded in ticks
instead of seconds and microseconds.
2003-04-24 12:12:06 +00:00
tjr
bf8bb3cbd9 Do a better job of calculating the RSS for swapped-out processes:
don't include the kernel stacks of swapped-out threads in the page count,
but do include the alternate kernel stack. jhb provided some helpful
comments on this.

PR:		49102
2003-04-24 11:03:04 +00:00
tjr
2b308e25a0 Free mount credentials (mnt_cred) when freeing the mount struct
in failure cases to avoid leaking struct ucreds, and ultimately
leaking struct uidinfo references.
2003-04-24 08:16:06 +00:00
alc
87da2c3cf3 - Acquire the vm_object's lock when performing vm_object_page_clean().
- Add a parameter to vm_pageout_flush() that tells vm_pageout_flush()
   whether its caller has locked the vm_object.  (This is a temporary
   measure to bootstrap vm_object locking.)
2003-04-24 04:31:25 +00:00
des
51f01cb3f7 When filling out a kinfo_proc structure, if we come across a process
whose p_stats->p_start has the magic value 1, replace it with boottime.
Some users were apparently confused by the fact that ps(1) reported a
start time in early 1970 for system processes.
2003-04-24 03:37:59 +00:00
jhb
9b55ca02a0 Remove Giant from osigblock(), osigsetmask(), and kern_sigaltstack(). 2003-04-23 19:49:18 +00:00
jhb
2c416d197d The signotify() sanity check in userret() doesn't need Giant anymore. 2003-04-23 18:51:55 +00:00
jhb
26097c18e1 Add lock assertions for various proc/thread/kse/ksegroup fields to the
scheduler functions.
2003-04-23 18:51:05 +00:00
jhb
89c52cff2e - Reorganize osigstack() to do the copyin first, grab the proc lock once,
do all the various sigstack dances, unlock the proc lock, and finally do
  the copyout.  This more closely resembles the behavior of
  kern_sigaltstack() and closes a small race.
- Remove Giant from osigstack as it is no longer needed.
2003-04-23 18:50:25 +00:00
jhb
2958cf621b Remove Giant from [gs]etpriority(). 2003-04-23 18:48:55 +00:00
jhb
a0bf3a3e6f - Protect p_numthreads with the sched_lock.
- Protect p_singlethread with both the sched_lock and the proc lock.
- Protect p_suspcount with the proc lock.
2003-04-23 18:46:51 +00:00
obrien
590e10e4ac Add /dev to the Alpha manual mount root example. 2003-04-23 05:02:40 +00:00
jhb
128ae3c8d8 - Move PS_PROFIL and its new cousin PS_STOPPROF back over to p_flag and
rename them appropriately.  Protect both flags with both the proc lock
  and the sched_lock.
- Protect p_profthreads with the proc lock.
- Remove Giant from profil(2).
2003-04-22 20:54:04 +00:00
jhb
41837c0a14 - Assert that the proc lock and sched_lock are held in sched_nice().
- For the 4BSD scheduler, this means that all callers of the static
  function resetpriority() now always hold sched_lock, so don't lock
  sched_lock explicitly in that function.
2003-04-22 20:50:38 +00:00
jhb
ced60d737a Lock both the proc lock and sched_lock when calling sched_nice since
kg_nice is now protected by both.  Being protected by both means that
other places in the kernel that want to read kg_nice only need one of the
two locks.
2003-04-22 20:45:38 +00:00
jhb
d5cf4c5275 Prefer the proc lock to sched_lock when testing PS_INMEM now that it is
safe to do so.
2003-04-22 20:01:56 +00:00
jhb
8c172a3498 Protect p_swtime with the sched_lock. 2003-04-22 19:48:25 +00:00
jhb
cfedd4c7d6 - Mark the kse_purge_group() and kse_purge() definitions static to match
their prototypes.
- Remove sched_lock locking from kse_purge() as all callers already lock
  the sched_lock before calling it.
- Hold the proc lock slightly longer to protect P_SHOULDSTOP().
2003-04-22 19:47:55 +00:00
imp
fb873e6edf Create a new function, device_is_attached(), that is like
device_is_alive() but tells us whether the device has successfully
attached.  device_is_alive() just tells us that the device has
successfully probed.
2003-04-21 18:19:08 +00:00
davidxu
d5ff3e991d Fix lock order reversal problem. 2003-04-21 14:42:04 +00:00
davidxu
7e0ecb5345 Introduce two flags to control upcall behaviour:
o KMF_NOUPCALL
	Ask kse_release not to return to the userland upcall entry, but instead
	to return directly to userland using the current thread's stack and the
	return address on the stack. This flag is intended to be used by the UTS
	in a critical region to wait for another UTS thread to leave the critical
	region, using kse_release with this flag to avoid spinning and burning
	CPU. This flag can also be used by the UTS to poll for completed contexts
	when there is nothing to do in userland and it needn't restart from its
	entry like a normal upcall.

o KMF_NOCOMPLETED
	Ask the kernel not to bring completed thread contexts back to userland
	when doing an upcall. This flag is intended to be used with the above
	flag when an upcall thread is in a critical region and cannot process
	completed contexts at that time.

Tested by: deischen
2003-04-21 07:27:59 +00:00
imp
cf3cf85267 Fix /dev/devctl's implementation of poll. We should only be setting
the poll bits when there's actually something in the queue.
Otherwise, select always returned '2' when there were no items to be
read, and '3' when there were.  This would preclude being able to read
in a threaded (libc_r) program, as well as checking to see if there
were pending events or not.
2003-04-21 05:58:51 +00:00
alc
e7c8e4e470 - Lock the vm_object when performing vm_object_pip_add(). 2003-04-20 07:29:50 +00:00
alc
c9e51c9b11 Lock the vm_object in vfs_busy_pages(). 2003-04-20 00:17:05 +00:00
alc
dc48d3db81 - Lock the vm_object when performing vm_object_pip_subtract().
- Assert that the vm_object lock is held in vm_object_pip_subtract().
2003-04-19 22:11:41 +00:00
alc
ef4e8a19cf - Lock the vm_object when performing vm_object_pip_wakeupn().
- Assert that the vm_object lock is held in vm_object_pip_wakeupn().
 - Add a new macro VM_OBJECT_LOCK_ASSERT().
2003-04-19 21:15:44 +00:00
alc
d558a7a53b Lock the jumbo_vm_object when performing vm_page_alloc(). 2003-04-19 19:13:25 +00:00
davidxu
a10a41ca38 Test next upcall time correctly. 2003-04-19 06:16:04 +00:00
davidxu
28038e92fe Unbreak the sigaltstack syscall. sigonstack is now a function and
wants the proc lock to be held.
2003-04-19 05:04:06 +00:00
davidxu
8ef415ed06 Use correct thread pointer. 2003-04-19 04:39:10 +00:00
jhb
801acfe1d4 - Make sigonstack() a regular function instead of an inline and add a proc
lock assertion to it.
- SIGPENDING() no longer needs sched_lock, so only grab sched_lock to set
  the TDF_NEEDSIGCHK and TDF_ASTPENDING flags in signotify().
- Add a proc lock assertion to tdsigwakeup().
- Since we always set TDF_OLDMASK while holding the proc lock, the proc
  lock is sufficient protection to check its state in postsig() and we only
  need sched_lock when clearing the actual flag.
2003-04-18 20:59:05 +00:00
jhb
8b7a3b47d1 Use the proc lock to protect p_singlethread and a P_WEXIT test. This
fixes a couple of potential KSE panics on non-i386 arch's that weren't
holding the proc lock when calling thread_exit().
2003-04-18 20:20:00 +00:00
jhb
fa6200c9ec Rename do_sigprocmask() to kern_sigprocmask() and make it a global symbol
so that it can be used by binary emulators.
2003-04-18 20:18:44 +00:00
jhb
de4c9711d0 Add a couple of sched_lock asserts. 2003-04-18 20:17:47 +00:00
jhb
f043193969 - Add a static function pgadjustjobc() to adjust the job control count for
a process group.
- Call pgadjustjobc() twice in fixjobc() to avoid code duplication and
  improve readability.
- Use the proc lock to protect P_SHOULDSTOP() instead of sched_lock.
- Check to see if a process is PRS_NEW with sched_lock before trying to
  lock its proc lock since the lock may not be constructed yet.
2003-04-18 20:17:05 +00:00
rwatson
9abedb6965 Update NAI copyright to 2003, missed in earlier commits and merges. 2003-04-18 19:57:37 +00:00
alc
83fe46be18 Update locking around vm_object_page_remove() to use the new macros. 2003-04-18 16:39:03 +00:00
jeff
556bb64555 - Set the ke_cpu field in sched_add() for interrupt and realtime threads
since they are going on the current cpu and not their previously assigned
   cpu.
 - sched_runnable() should only return true in the SMP case if the other
   processor has more than one thread that is runnable.  We can not steal
   curthread.
 - Change kseq_print() to accept the cpuid instead of a kseq pointer.  This
   makes use of this function in ddb much easier.
2003-04-18 05:24:10 +00:00
julian
0e096a3dd1 Add a thread_unlink() and use it.
It could also be used twice in kern_thr.c but that's owned by jeff
so I'll let him change it when he's next there.
2003-04-18 00:16:13 +00:00
jhb
e1dd224437 - kthread's don't have p_textvp set to anything, so replace code that
dealt with that possibility with a KASSERT().
- No need to set P_SYSTEM, kthread_create() does that for us.
2003-04-17 22:37:48 +00:00
jhb
05864a7334 - Use a local struct proc variable to improve readability.
- Use a local variable to close a minor race when determining if the wmesg
    printed out needs a prefix such as when a thread is blocked on a lock.
2003-04-17 22:36:40 +00:00
jhb
bffa90cc0a Tweak locking in the PS_XCPU handler to hold the sched_lock while reading
p_runtime.
2003-04-17 22:33:04 +00:00
jhb
5023bfe74a The sched_lock is not needed while clearing two of the P_STOPPED bits in
p_flag.  Also, the proc lock can't be recursed, so simplify an older proc
lock assertion.
2003-04-17 22:31:54 +00:00