Commit Graph

10008 Commits

Author SHA1 Message Date
Robert Watson
b4be6ef22f Only require privilege to set the current time adjustment, not in order to
query it.
2007-06-14 18:37:58 +00:00
Robert Watson
3805385e3d Spell statistics more correctly in comments. 2007-06-14 03:02:33 +00:00
John Baldwin
34a9edafbc Improve the ktrace locking somewhat to reduce overhead:
- Depessimize userret() in kernels where KTRACE is enabled by doing an
  unlocked check of the per-process queue of pending events before
  acquiring any locks.  Previously ktr_userret() unconditionally acquired
  the global ktrace_sx lock on every return to userland for every thread,
  even if ktrace wasn't enabled for the thread.
- Optimize the locking in exit() to first perform an unlocked read of
  p_traceflag to see if ktrace is enabled and only acquire locks and
  teardown ktrace if the test succeeds.  Also, explicitly disable tracing
  before draining any pending events so the pending events actually get
  written out.  The unlocked read is safe because proc lock is acquired
  earlier after single-threading so p_traceflag can't change between then
  and this check (well, it can currently due to a bug in ktrace I will fix
  next, but that race existed prior to this change as well).

Reviewed by:	rwatson
2007-06-13 20:01:42 +00:00
John Baldwin
ce0be64687 Conditionally acquire Giant when dropping a reference on the ktrace vnode
during execve() when turning off tracing due to executing a setuid binary
as non-root.  Previously this could fail to acquire Giant and fail an
assertion if the ktrace file was on a non-MPSAFE filesystem and the
executable was on an MPSAFE filesystem.

MFC after:	3 days
Reported by:	kris
2007-06-13 19:41:47 +00:00
Jeff Roberson
3036ab79e3 - Include opt_sched.h for SCHED_STATS. 2007-06-12 23:27:31 +00:00
Jeff Roberson
671f2709ae - Garbage collect unused concurrency functions. 2007-06-12 19:50:31 +00:00
Jeff Roberson
e7c8d2e9fe - Garbage collect unused concurrency functions.
- Remove unused kse fields from struct proc.
 - Group remaining fields and #ifdef KSE them.
 - Move some kern_kse.c only prototypes out of proc and into kern_kse.

Discussed with:	Julian
2007-06-12 19:49:39 +00:00
Jeff Roberson
fe54587ffa - Move some common code out of sched_fork_exit() and back into fork_exit(). 2007-06-12 07:47:09 +00:00
Jeff Roberson
ff8fbcffcb Solve a complex exit race introduced with thread_lock:
- Add a count of exiting threads, p_exitthreads, to struct proc.
 - Increment p_exithreads when we set the deadthread in thread_exit().
 - When we thread_stash() a deadthread use an atomic to drop the count.
 - Spin until the p_exithreads count reaches 0 in thread_wait().
 - Lock the last exiting thread momentarily to be certain that it has
   exited cpu_throw().
 - Restructure thread_wait().  It does not need a loop as there will only
   ever be one thread.

Tested by:	moose@opera.com
Reported by:	kris, moose@opera.com
2007-06-12 07:24:46 +00:00
Robert Watson
32f9753cfb Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in
some cases, move to priv_check() if it was an operation on a thread and
no other flags were present.

Eliminate caller-side jail exception checking (also now-unused); jail
privilege exception code now goes solely in kern_jail.c.

We can't yet eliminate suser() due to some cases in the KAME code where
a privilege check is performed and then used in many different deferred
paths.  Do, however, move those prototypes to priv.h.

Reviewed by:	csjp
Obtained from:	TrustedBSD Project
2007-06-12 00:12:01 +00:00
Jeff Roberson
efe641b939 - Add a missing PROC_SUNLOCK() in tdsignal() 2007-06-11 23:27:03 +00:00
Olivier Houchard
e411ce026a Re-acquire the PROC_SLOCK before calling calcru(), and release it after,
since calcru() expects it to be locked.

Reviewed by:	attilio
2007-06-11 21:05:41 +00:00
Sam Leffler
68e8e04e93 Update 802.11 wireless support:
o major overhaul of the way channels are handled: channels are now
  fully enumerated and uniquely identify the operating characteristics;
  these changes are visible to user applications which require changes
o make scanning support independent of the state machine to enable
  background scanning and roaming
o move scanning support into loadable modules based on the operating
  mode to enable different policies and reduce the memory footprint
  on systems w/ constrained resources
o add background scanning in station mode (no support for adhoc/ibss
  mode yet)
o significantly speedup sta mode scanning with a variety of techniques
o add roaming support when background scanning is supported; for now
  we use a simple algorithm to trigger a roam: we threshold the rssi
  and tx rate, if either drops too low we try to roam to a new ap
o add tx fragmentation support
o add first cut at 802.11n support: this code works with forthcoming
  drivers but is incomplete; it's included now to establish a baseline
  for other drivers to be developed and for user applications
o adjust max_linkhdr et. al. to reflect 802.11 requirements; this eliminates
  prepending mbufs for traffic generated locally
o add support for Atheros protocol extensions; mainly the fast frames
  encapsulation (note this can be used with any card that can tx+rx
  large frames correctly)
o add sta support for ap's that beacon both WPA1+2 support
o change all data types from bsd-style to posix-style
o propagate noise floor data from drivers to net80211 and on to user apps
o correct various issues in the sta mode state machine related to handling
  authentication and association failures
o enable the addition of sta mode power save support for drivers that need
  net80211 support (not in this commit)
o remove old WI compatibility ioctls (wicontrol is officially dead)
o change the data structures returned for get sta info and get scan
  results so future additions will not break user apps
o fixed tx rate is now maintained internally as an ieee rate and not an
  index into the rate set; this needs to be extended to deal with
  multi-mode operation
o add extended channel specifications to radiotap to enable 11n sniffing

Drivers:
o ath: add support for bg scanning, tx fragmentation, fast frames,
       dynamic turbo (lightly tested), 11n (sniffing only and needs
       new hal)
o awi: compile tested only
o ndis: lightly tested
o ipw: lightly tested
o iwi: add support for bg scanning (well tested but may have some
       rough edges)
o ral, ural, rum: add suppoort for bg scanning, calibrate rssi data
o wi: lightly tested

This work is based on contributions by Atheros, kmacy, sephe, thompsa,
mlaier, kevlo, and others.  Much of the scanning work was supported by
Atheros.  The 11n work was supported by Marvell.
2007-06-11 03:36:55 +00:00
Attilio Rao
393a081d42 Optimize vmmeter locking.
In particular:
- Add an explicative table for locking of struct vmmeter members
- Apply new rules for some of those members
- Remove some unuseful comments

Heavily reviewed by: alc, bde, jeff
Approved by: jeff (mentor)
2007-06-10 21:59:14 +00:00
Matt Jacob
a659386c7e Remove unused variable. 2007-06-10 01:50:05 +00:00
Matt Jacob
26756b7a58 The new compiler can't quite follow the logic of has_stime and
complains about using uninitialized tags in stime.
2007-06-10 01:49:17 +00:00
Matt Jacob
9b73d2396a Initialized ets to zero. This is arguably a gcc bug in that ets is always
set to rts when timeout is non-NULL and then timevalid is set and ets is
only checked later when timervalid is set.
2007-06-10 01:43:11 +00:00
Attilio Rao
bdf08be439 Fix a bug caming from the committing a pre-merge version of the patch
instead than a post-merge version (respect to another rusage fix).

Reported by: marcel
Approved by: jeff(mentor)
2007-06-10 00:28:41 +00:00
Marcel Moolenaar
55b5660de4 Work around an integer overflow in expression `3 * maxbufspace / 4',
when maxbufspace is larger than INT_MAX / 3. The overflow causes a
hard hang on ia64 when physical memory is sufficiently large (8GB).
2007-06-09 23:41:14 +00:00
Attilio Rao
a1fe14bc33 rufetch and calcru sometimes should be called atomically together.
This patch fixes places where they should be called atomically changing
their locking requirements (both assume per-proc spinlock held) and
introducing rufetchcalc which wrappers both calls to be performed in
atomic way.

Reviewed by: jeff
Approved by: jeff (mentor)
2007-06-09 21:48:44 +00:00
Attilio Rao
86a49dea5b Since locking in kern/subr_prof.c is changed a bit, we need nomore of
time_lock spinlock exported.

Approved by: jeff (mentor)
2007-06-09 19:41:14 +00:00
Attilio Rao
a140976eb4 The current rusage code show peculiar problems:
- Unsafeness on ruadd() in thread_exit()
- Unatomicity of thread_exiit() in the exit1() operations

This patch addresses these problems allocating p_fd as part of the
process and modifying the way it is accessed.

A small chunk of this patch, resolves a race about p_state in kern_wait(),
since we have to be sure about the zombif-ing process.

Submitted by: jeff
Approved by: jeff (mentor)
2007-06-09 18:56:11 +00:00
Matt Jacob
65d32cd8fb Propagate volatile qualifier to make gcc4.2 happy. 2007-06-09 18:09:37 +00:00
Attilio Rao
e682569165 Remove the MUTEX_WAKE_ALL option and make it the default behaviour for our
mutexes.
Currently we alredy force MUTEX_WAKE_ALL beacause of some problems with the
!MUTEX_WAKE_ALL case (unavioidable priority inversion).
2007-06-08 21:36:52 +00:00
Poul-Henning Kamp
7acfb0af82 Double the WITNESS and DIAGNOSTIC benchmark warnings right before we
go into userland to improve the chances of people noticing them.
2007-06-08 11:47:36 +00:00
Xin LI
7b8c8b858c In getblk(), before gbincore(), use BO_LOCK directly when locking
the bufobj, rather than using VI_LOCK, like what was done with
revision 1.453.
2007-06-08 07:05:08 +00:00
Robert Watson
faef53711b Move per-process audit state from a pointer in the proc structure to
embedded storage in struct ucred.  This allows audit state to be cached
with the thread, avoiding locking operations with each system call, and
makes it available in asynchronous execution contexts, such as deep in
the network stack or VFS.

Reviewed by:	csjp
Approved by:	re (kensmith)
Obtained from:	TrustedBSD Project
2007-06-07 22:27:15 +00:00
John Baldwin
a66fde8d35 - Remove unused variable from create_thread().
- Move kern_thr_*() prototype to <sys/syscallsubr.h> where all the other
  kern_*() prototypes live.
2007-06-07 19:45:19 +00:00
David Xu
42ce445fed Backout experimental adaptive-spin umtx code. 2007-06-06 07:35:08 +00:00
Jeff Roberson
710eacdc5f - Placing the 'volatile' on the right side of the * in the td_lock
declaration removes the need for __DEVOLATILE().

Pointed out by:	tegge
2007-06-06 03:40:47 +00:00
Attilio Rao
d301eb10c7 Fix a problem with not-preemptive kernels caming from mis-merging of
existing code with the new thread_lock patch.
This also cleans up a bit unlock operation for mutexes.

Approved by: jhb, jeff(mentor)
2007-06-05 18:57:09 +00:00
Konstantin Belousov
b95b98b0bd Restore non-SMP build.
Reviewed by:	attilio
2007-06-05 14:20:13 +00:00
Jeff Roberson
95e3a0bca3 - Better fix for previous error; use DEVOLATILE on the td_lock pointer
it can actually sometimes be something other than sched_lock even on
   schedulers which rely on a global scheduler lock.

Tested by:	kan
2007-06-05 04:12:46 +00:00
Jeff Roberson
c219b097af - Pass &sched_lock as the third argument to cpu_switch() as this will
always be the correct lock and we don't get volatile warnings this
   way.

Pointed out by:	kan
2007-06-05 03:46:54 +00:00
Jeff Roberson
36b369163b - Define TDQ_ID() for the !SMP case.
- Default pick_pri to off.  It is not faster in most cases.
2007-06-05 02:53:51 +00:00
Jeff Roberson
8e0185f604 - Remove sched_core.c. The maintainer has lost interest in pursuing this
and it has been neglected in the recent ksegrp removal as well as
   the thread_lock() changes.

Discussed with:	davidxu
2007-06-05 00:12:37 +00:00
Jeff Roberson
982d11f836 Commit 14/14 of sched_lock decomposition.
- Use thread_lock() rather than sched_lock for per-thread scheduling
   sychronization.
 - Use the per-process spinlock rather than the sched_lock for per-process
   scheduling synchronization.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-05 00:00:57 +00:00
Jeff Roberson
bd43e47156 Commit 10/14 of sched_lock decomposition.
- Add new spinlocks to support thread_lock() and adjust ordering.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:55:45 +00:00
Jeff Roberson
07a61420ff Commit 9/14 of sched_lock decomposition.
- Attempt to return the ttyinfo() selection algorithm to something sane
   as it has been broken and disabled for some time.  Adapt this algorithm
   in such a way that it does not conflict with per-cpu scheduler locking.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:55:32 +00:00
Jeff Roberson
3c2e44364e Commit 8/14 of sched_lock decomposition.
- Use a global umtx spinlock to protect the sleep queues now that there
   is no global scheduler lock.
 - Use thread_lock() to protect thread state.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:50 +00:00
Jeff Roberson
765b2891e8 Commit 7/14 of sched_lock decomposition.
- Use thread_lock() rather than sched_lock for per-thread scheduling
   sychronization.
 - Use the per-process spinlock rather than the sched_lock for per-process
   scheduling synchronization.
 - Use a global kse spinlock to protect upcall and thread assignment.  The
   per-process spinlock can not be used because this lock must be acquired
   via mi_switch() where we already hold a thread lock.  The kse spinlock
   is a leaf lock ordered after the process and thread spinlocks.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:27 +00:00
Jeff Roberson
11bda9b8d5 Commit 6/14 of sched_lock decomposition.
- Use thread_lock() rather than sched_lock for per-thread scheduling
   sychronization.
 - Use the per-process spinlock rather than the sched_lock for per-process
   scheduling synchronization.
 - Replace the tail-end of fork_exit() with a scheduler specific routine
   which can do the appropriate lock manipulations.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:53:34 +00:00
Jeff Roberson
40acdeabab Commit 5/14 of sched_lock decomposition.
- Protect the cp_time tick counts with atomics instead of a global lock.
   There will only be one atomic per tick and this allows all processors
   to execute softclock concurrently.
 - In softclock, protect access to rusage and td_*tick data with the
   thread_lock(), expanding the scope of the thread lock over the whole
   function.
 - Do some creative re-arranging in hardclock() to avoid excess locking.
 - Protect the p_timer fields with the per-process spinlock.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:53:06 +00:00
Jeff Roberson
a54e85fdbf Commit 4/14 of sched_lock decomposition.
- Use thread_lock() rather than sched_lock for per-thread scheduling
   sychronization.
 - Use the per-process spinlock rather than the sched_lock for per-process
   scheduling synchronization.
 - Move some common code into thread_suspend_switch() to handle the
   mechanics of suspending a thread.  The locking here is incredibly
   convoluted and should be simplified.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:52:24 +00:00
Jeff Roberson
2502c107ba Commit 3/14 of sched_lock decomposition.
- Add a per-turnstile spinlock to solve potential priority propagation
   deadlocks that are possible with thread_lock().
 - The turnstile lock order is defined as the exact opposite of the
   lock order used with the sleep locks they represent.  This allows us
   to walk in reverse order in priority_propagate and this is the only
   place we wish to multiply acquire turnstile locks.
 - Use the turnstile_chain lock to protect assigning mutexes to turnstiles.
 - Change the turnstile interface to pass back turnstile pointers to the
   consumers.  This allows us to reduce some locking and makes it easier
   to cancel turnstile assignment while the turnstile chain lock is held.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:51:44 +00:00
Jeff Roberson
d72e80f09a Commit 2/14 of sched_lock decomposition.
- Adapt sleepqueues to the new thread_lock() mechanism.
 - Delay assigning the sleep queue spinlock as the thread lock until after
   we've checked for signals.  It is illegal for a thread to return in
   mi_switch() with any lock assigned to td_lock other than the scheduler
   locks.
 - Change sleepq_catch_signals() to do the switch if necessary to simplify
   the callers.
 - Simplify timeout handling now that locking a sleeping thread has the
   side-effect of locking the sleepqueue.  Some previous races are no
   longer possible.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:50:56 +00:00
Jeff Roberson
7b20fb19fb Commit 1/14 of sched_lock decomposition.
- Move all scheduler locking into the schedulers utilizing a technique
   similar to solaris's container locking.
 - A per-process spinlock is now used to protect the queue of threads,
   thread count, suspension count, p_sflags, and other process
   related scheduling fields.
 - The new thread lock is actually a pointer to a spinlock for the
   container that the thread is currently owned by.  The container may
   be a turnstile, sleepqueue, or run queue.
 - thread_lock() is now used to protect access to thread related scheduling
   fields.  thread_unlock() unlocks the lock and thread_set_lock()
   implements the transition from one lock to another.
 - A new "blocked_lock" is used in cases where it is not safe to hold the
   actual thread's lock yet we must prevent access to the thread.
 - sched_throw() and sched_fork_exit() are introduced to allow the
   schedulers to fix-up locking at these points.
 - Add some minor infrastructure for optionally exporting scheduler
   statistics that were invaluable in solving performance problems with
   this patch.  Generally these statistics allow you to differentiate
   between different causes of context switches.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:50:30 +00:00
Attilio Rao
b4b7081961 Do proper "locking" for missing vmmeters part.
Now, we assume no more sched_lock protection for some of them and use the
distribuited loads method for vmmeter (distribuited through CPUs).

Reviewed by: alc, bde
Approved by: jeff (mentor)
2007-06-04 21:45:18 +00:00
Attilio Rao
6759608248 Rework the PCPU_* (MD) interface:
- Rename PCPU_LAZY_INC into PCPU_INC
- Add the PCPU_ADD interface which just does an add on the pcpu member
  given a specific value.

Note that for most architectures PCPU_INC and PCPU_ADD are not safe.
This is a point that needs some discussions/work in the next days.

Reviewed by: alc, bde
Approved by: jeff (mentor)
2007-06-04 21:38:48 +00:00
David Malone
041b706b2f Despite several examples in the kernel, the third argument of
sysctl_handle_int is not sizeof the int type you want to export.
The type must always be an int or an unsigned int.

Remove the instances where a sizeof(variable) is passed to stop
people accidently cut and pasting these examples.

In a few places this was sysctl_handle_int was being used on 64 bit
types, which would truncate the value to be exported.  In these
cases use sysctl_handle_quad to export them and change the format
to Q so that sysctl(1) can still print them.
2007-06-04 18:25:08 +00:00