7140 Commits

Author SHA1 Message Date
Colin Percival
ec513ff759 Fix filt_timer* races: Finish initializing a knote before we pass it to
a callout, and use the new callout_drain API to make sure that a callout
has finished before we deallocate memory it is using.

PR:		kern/64121
Discussed with:	gallatin
2004-04-07 05:59:57 +00:00
Colin Percival
2c1bb20746 Introduce a callout_drain() function. This acts in the same manner as
callout_stop(), except that if the callout being stopped is currently
in progress, it blocks attempts to reset the callout and waits until the
callout is completed before it returns.

This makes it possible to clean up callout-using code safely, e.g.,
without potentially freeing memory which is still being used by a callout.

Reviewed by:	mux, gallatin, rwatson, jhb
2004-04-06 23:08:49 +00:00
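The stop-and-wait behaviour described for callout_drain() in 2c1bb20746 can be sketched in userland with pthreads. This is only an analogue under stated assumptions, not the kernel implementation: struct sk_callout, sk_callout_fire(), sk_callout_drain() and sk_bump() are hypothetical names, and the sketch refuses all later runs once drained, which is a simplification.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userland stand-in for a callout; not the kernel structure. */
struct sk_callout {
	pthread_mutex_t	mtx;
	pthread_cond_t	done;
	bool		running;	/* handler is currently executing */
	bool		draining;	/* further runs are refused */
};

void
sk_callout_init(struct sk_callout *c)
{
	pthread_mutex_init(&c->mtx, NULL);
	pthread_cond_init(&c->done, NULL);
	c->running = false;
	c->draining = false;
}

/* Called by the timer thread when the timeout fires. Returns true if fn ran. */
bool
sk_callout_fire(struct sk_callout *c, void (*fn)(void *), void *arg)
{
	pthread_mutex_lock(&c->mtx);
	if (c->draining) {
		pthread_mutex_unlock(&c->mtx);
		return (false);
	}
	c->running = true;
	pthread_mutex_unlock(&c->mtx);

	fn(arg);		/* run the handler without holding the mutex */

	pthread_mutex_lock(&c->mtx);
	c->running = false;
	pthread_cond_broadcast(&c->done);
	pthread_mutex_unlock(&c->mtx);
	return (true);
}

/* Stop the callout and wait for an in-progress handler to finish. */
void
sk_callout_drain(struct sk_callout *c)
{
	pthread_mutex_lock(&c->mtx);
	c->draining = true;
	while (c->running)
		pthread_cond_wait(&c->done, &c->mtx);
	pthread_mutex_unlock(&c->mtx);
}

/* Example handler: increments the int that arg points to. */
void
sk_bump(void *arg)
{
	(*(int *)arg)++;
}
```

Once sk_callout_drain() returns, the handler is neither running nor able to start, so memory it uses can be freed safely. That is the property the commit message describes.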
John Baldwin
9000d57d57 Associate a simple count of waiters with each condition variable. The
count is protected by the mutex that protects the condition, so the count
does not require any extra locking or atomic operations.  It serves as an
optimization to avoid calling into the sleepqueue code at all if there are
no waiters.

Note that the count can get temporarily out of sync when threads sleeping
on a condition variable time out or are aborted.  However, it doesn't hurt
to call the sleepqueue code for either a signal or a broadcast when there
are no waiters, and the count is never out of sync in the opposite
direction unless we have more than INT_MAX sleeping threads.
2004-04-06 19:17:46 +00:00
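The waiter count from 9000d57d57 can be illustrated with a small pthreads wrapper. This is a sketch, assuming the caller always holds the mutex that protects the condition. The names (struct counted_cv, ccv_*) are hypothetical, and the cc_wakeups field exists only to make the skipped calls visible.

```c
#include <pthread.h>

/* Hypothetical userland analogue of a condition variable with a waiter count. */
struct counted_cv {
	pthread_cond_t	cc_cond;
	int		cc_waiters;	/* protected by the caller's mutex */
	int		cc_wakeups;	/* wakeup calls actually issued (illustration only) */
};

void
ccv_init(struct counted_cv *cc)
{
	pthread_cond_init(&cc->cc_cond, NULL);
	cc->cc_waiters = 0;
	cc->cc_wakeups = 0;
}

/* Caller must hold mtx. */
void
ccv_wait(struct counted_cv *cc, pthread_mutex_t *mtx)
{
	cc->cc_waiters++;
	pthread_cond_wait(&cc->cc_cond, mtx);
}

/* Caller must hold the mutex that protects the condition. */
void
ccv_signal(struct counted_cv *cc)
{
	if (cc->cc_waiters > 0) {	/* skip the wakeup path when nobody waits */
		cc->cc_waiters--;
		cc->cc_wakeups++;
		pthread_cond_signal(&cc->cc_cond);
	}
}

/* Caller must hold the mutex that protects the condition. */
void
ccv_broadcast(struct counted_cv *cc)
{
	if (cc->cc_waiters > 0) {
		cc->cc_waiters = 0;
		cc->cc_wakeups++;
		pthread_cond_broadcast(&cc->cc_cond);
	}
}
```

A waiter that wakes without being signalled leaves the count too high, which only costs an unnecessary wakeup call. This mirrors the harmless out-of-sync direction the commit message mentions.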
John Baldwin
535eb30962 Add a new kernel option MUTEX_WAKE_ALL that changes the mutex unlock code
to awaken all waiters when a contested mutex is released instead of just
the highest priority waiter.  If the various threads are awakened in
sequence then each thread may acquire and release the lock in question
without contention resulting in fewer expensive unlock and lock
operations.  The old behavior of waking just the highest priority waiter is
still used if this option is not specified.  Making the algorithm conditional
on a kernel option will allow us to benchmark both cases later and
determine which one should be used by default.

Requested by:	tanimura-san
2004-04-06 19:12:24 +00:00
John Baldwin
ef2c0ba7e4 Rename turnstile_wakeup() to turnstile_broadcast() to make the naming
more consistent with other APIs. sleepq and cv's use signal/broadcast, and
msleep uses wakeup_one/wakeup.  Prior to this turnstiles were using a
signal/wakeup mixture.
2004-04-06 19:07:21 +00:00
Bruce Evans
295ed75297 Removed some less than useful comments:
- don't say what a small subset of the options includes are for.
- don't mark up functions which use all their args with /* ARGSUSED */.
  The markup should have been removed when the unused retval parameter
  was removed.
- don't comment on what routine suser() checks do.  Removed nearby
  excessive vertical whitespace.
2004-04-06 10:05:02 +00:00
Warner Losh
7f8a436ff2 Remove advertising clause from University of California Regent's license,
per letter dated July 22, 1999.

Approved by: core
2004-04-05 21:03:37 +00:00
Doug Rabson
7d5ea13fcd Try not to crash instantly when signalling a libthr program to death. 2004-04-05 15:06:01 +00:00
Doug Rabson
e2c8a799c1 Regen. 2004-04-05 10:17:23 +00:00
Doug Rabson
0b0a60fb43 Add lgetfh(2) which is like getfh(2) but doesn't follow symlinks. 2004-04-05 10:15:53 +00:00
Robert Watson
051bbf603a Detatch incorrect spellings of detach. 2004-04-04 19:15:45 +00:00
Jeff Roberson
37a35e4a60 - Use the proper constant in sched_interact_update(). Previously,
   SCHED_INTERACT_MAX was used where SCHED_SLP_RUN_MAX was needed.  This was
   causing the interactivity scaler to lose history at a more dramatic rate
   than intended.
2004-04-04 19:12:56 +00:00
Marcel Moolenaar
8c9b7b2c84 Create NT_PRSTATUS and NT_FPREGSET notes for each and every thread
in the process. This is required for proper debugging of corefiles
created by 1:1 or M:N threaded processes. Add an XXX comment where
we should actually call a function that dumps MD specific notes.
An example of a MD specific note is the NT_PRXFPREG note for SSE
registers.

Since BFD creates non-annotated pseudo-sections for the first PRSTATUS
and FPREGSET notes (non-annotated in the sense that the name of the
section does not contain the pid/tid), make sure those sections describe
the initial thread of the process (i.e. the thread whose tid equals the
pid). This is not strictly necessary, but makes sure that tools that use
the non-annotated section names will not change behaviour due to this
change.

The practical upshot of this all is that one can see the threads in
the debugger when looking at a corefile. For 1:1 threading this means
that *all* threads are visible.
2004-04-03 20:25:41 +00:00
Marcel Moolenaar
fdcac92868 Assign thread IDs to kernel threads. The purpose of the thread ID (tid)
is twofold:
1. When a 1:1 or M:N threaded process dumps core, we need to put the
   register state of each of its kernel threads in the core file.
   This can only be done by differentiating the pid field in the
   respective note. For this we need the tid.
2. When thread support is present for remote debugging the kernel
   with gdb(1), threads need to be identified by an integer due to
   limitations in the remote protocol. This requires having a tid.

To minimize the impact of having thread IDs, threads that are created
as part of a fork (i.e. the initial thread in a process) will inherit
the process ID (i.e. tid=pid). Subsequent threads will have IDs larger
than PID_MAX to avoid interference with the pid allocation algorithm.
The assignment of tids is handled by thread_new_tid().

The thread ID allocation algorithm has been written with 3 assumptions
in mind:
1. IDs need to be created as fast as possible,
2. Reuse of IDs may happen instantaneously,
3. Someone else will write a better algorithm.
2004-04-03 15:59:13 +00:00
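The tid assignment rule in fdcac92868 (initial thread inherits the pid, later threads get IDs above PID_MAX) can be sketched as follows. This is not the algorithm behind thread_new_tid(): the monotonic counter, the sk_* names and the value 99999 for SK_PID_MAX are assumptions made for illustration. The sketch never reuses IDs and does no locking.

```c
#define	SK_PID_MAX	99999	/* assumed to mirror the kernel's PID_MAX */

/* Next ID to hand out to a non-initial thread; always above SK_PID_MAX. */
static long sk_next_tid = SK_PID_MAX + 1;

/* The initial thread of a forked process inherits the process ID. */
long
sk_initial_tid(long pid)
{
	return (pid);
}

/* Subsequent threads draw from a range that cannot collide with pids. */
long
sk_new_tid(void)
{
	return (sk_next_tid++);
}
```

Because the two ranges are disjoint, a tid above SK_PID_MAX can never be mistaken for a pid, which is why the pid allocator needs no changes.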
Alan Cox
121230a40d In some cases, sf_buf_alloc() should sleep with pri PCATCH; in others, it
should not.  Add a new parameter so that the caller can specify which is
the case.

Reported by:	dillon
2004-04-03 09:16:27 +00:00
Kris Kennaway
c5af600675 Add missing comment terminator. 2004-04-02 04:57:40 +00:00
Julian Elischer
4f73277a35 The comment complained about not having a thread_unlink()
and did the work itself, but thread_unlink() has existed for a while... use it.
2004-04-02 01:01:34 +00:00
John Baldwin
e43257aa7d Finish fixing up Alpha to work with an MP safe ptrace():
- ptrace_single_step() is no longer called with the proc lock held, so
  don't try to unlock it and then relock it.
- Push Giant down into proc_rwmem() instead of forcing all the consumers
  (including Alpha breakpoint support) to explicitly wrap calls to
  proc_rwmem() with Giant.

Tested by:	kensmith
2004-04-01 20:56:44 +00:00
Scott Long
cd587b1397 Don't print out 'GIANT-LOCKED' for INTR_FAST drivers. 2004-04-01 07:18:42 +00:00
Pawel Jakub Dawidek
2fc0588da2 Remove sysctl kern.ps_argsopen. It is not very useful; one should use
security.bsd.see_other_uids instead.

Discussed with:	phk, rwatson
2004-04-01 00:10:45 +00:00
Pawel Jakub Dawidek
5e2c0c0b0e Remove ps_argsopen check. It was bogus in the past and my earlier
correction was incomplete: if kern.ps_argsopen was set to 0, users weren't
permitted to see the arguments of even their own processes.
But kern.ps_argsopen is going away, so just remove this check and leave
the security checks to the p_cansee() function.
2004-04-01 00:08:20 +00:00
Julian Elischer
4ccbe07e84 Remove unused variable. 2004-03-31 08:20:44 +00:00
Robert Watson
8e44a7ec13 In sofree(), avoid nested declaration and initialization in
declaration.  Observe that initialization in declaration is
frequently incompatible with locking, not just a bad idea
due to style(9).

Submitted by:	bde
2004-03-31 03:48:35 +00:00
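The point in 8e44a7ec13 about initialization in declaration and locking can be shown with a small userland example. It is a sketch with hypothetical names (struct sk_sock, sk_head_qlen()) and a pthread mutex standing in for whatever lock protects the field. It is not the sofree() code.

```c
#include <pthread.h>
#include <stddef.h>

struct sk_sock {
	pthread_mutex_t	 lock;
	struct sk_sock	*head;	/* socket this one is queued on, if any */
	int		 qlen;
};

int
sk_head_qlen(struct sk_sock *so)
{
	/*
	 * Writing "struct sk_sock *head = so->head;" here would read the
	 * field before the lock is held, so declare first and assign later.
	 */
	struct sk_sock *head;
	int qlen;

	pthread_mutex_lock(&so->lock);
	head = so->head;
	qlen = (head != NULL) ? head->qlen : 0;
	pthread_mutex_unlock(&so->lock);
	return (qlen);
}
```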
Robert Watson
db48c0d254 Export uipc_connect2() from uipc_usrreq.c instead of unp_connect2(),
and consume that interface in portalfs and fifofs instead.  In the
new world order, unp_connect2() assumes that the unpcb mutex is
held, whereas uipc_connect2() validates that the passed sockets are
UNIX domain sockets, then grabs the mutex.

NB: the portalfs and fifofs code gets down and dirty with UNIX domain
sockets.  Maybe this is a bad thing.
2004-03-31 01:41:30 +00:00
Alan Cox
1dc10fceaa White space and wording changes to init_param3().
Mostly submitted by:	bde
2004-03-30 08:00:11 +00:00
Robert Watson
fc3fcacf52 Prefer NULL to 0 when testing and assigning pointer values. 2004-03-30 02:16:25 +00:00
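As a small illustration of the convention in fc3fcacf52, the list walk below compares and assigns pointers with NULL rather than 0. The struct and function names are hypothetical.

```c
#include <stddef.h>

struct sk_node {
	struct sk_node *next;
};

/* Count the nodes in a NULL-terminated singly linked list. */
size_t
sk_list_length(const struct sk_node *n)
{
	size_t len = 0;

	while (n != NULL) {	/* rather than "while (n != 0)" or "while (n)" */
		len++;
		n = n->next;
	}
	return (len);
}
```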
Peter Wemm
9a6a4cb50d Shorten some XXXKSE commentary 2004-03-29 22:46:54 +00:00
Peter Wemm
39d3505a30 Kill some XXXKSE's. vnlru/syncer are single threaded. 2004-03-29 22:45:33 +00:00
Peter Wemm
b21126c6b3 Clean up the stub fake vnode locking implementations. The main reason this
stuff was here (NFS) was fixed by Alfred in November.  The only remaining
consumer of the stub functions was umapfs, which is horribly horribly
broken.  It has missed out on about the last 5 years worth of maintenance
that was done on nullfs (from which umapfs is derived).  It needs major
work to bring it up to date with the vnode locking protocol.  umapfs really
needs to find a caretaker to bring it into the 21st century.

Functions GC'ed:
vop_noislocked, vop_nolock, vop_nounlock, vop_sharedlock.
2004-03-29 22:41:21 +00:00
Robert Watson
181e65db5b Use a common return path for filt_soread() and filt_sowrite() to
simplify the impact of locking on these functions.

Submitted by:	sam
Sponsored by:	FreeBSD Foundation
2004-03-29 18:06:15 +00:00
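The common return path mentioned in 181e65db5b matters once a lock is held across the function: every exit has to release it. The sketch below uses hypothetical names (struct sk_buf, sk_filt_read()) and a pthread mutex. It is not the filt_soread() code.

```c
#include <pthread.h>
#include <stdbool.h>

struct sk_buf {
	pthread_mutex_t	lock;
	bool		closed;	/* peer has shut down */
	int		avail;	/* bytes available to read */
	int		lowat;	/* low-water mark */
};

/* Returns nonzero when a read event should be reported. */
int
sk_filt_read(struct sk_buf *sb)
{
	int result;

	pthread_mutex_lock(&sb->lock);
	if (sb->closed) {
		result = 1;	/* report EOF as an event */
		goto out;
	}
	result = (sb->avail >= sb->lowat);
out:
	/* Single exit: the unlock cannot be forgotten on an early return. */
	pthread_mutex_unlock(&sb->lock);
	return (result);
}
```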
Robert Watson
71c90a2944 In sofree(), move caching of 'head' from 'so->so_head' to later in
the function once it has been determined to be non-NULL to simplify
locking on an earlier return.
2004-03-29 17:57:43 +00:00
Robert Watson
5a35e5f9af If debug.mpsafenet, initialize UNIX domain socket timeouts as MPSAFE;
otherwise, assert Giant in the callouts.
2004-03-29 17:00:05 +00:00
Robert Watson
627e4a9973 Conditionally acquire Giant when entering the sockets layer via the
socket-specific system calls based on debug.mpsafenet, rather than
acquiring Giant unconditionally.
2004-03-29 02:21:56 +00:00
Robert Watson
32903c86e7 Conditionally acquire Giant when entering the socket layer via file
descriptor operations based on debug.mpsafenet, rather than acquiring
Giant unconditionally.
2004-03-29 01:55:32 +00:00
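The conditional acquisition described in 627e4a9973 and 32903c86e7 can be sketched in userland as below. All names are hypothetical: sk_giant stands in for Giant, sk_mpsafenet for debug.mpsafenet, and the acquisition counter exists only for illustration. The lock helper reports whether it took the lock so that the unlock always matches.

```c
#include <pthread.h>
#include <stdbool.h>

pthread_mutex_t sk_giant = PTHREAD_MUTEX_INITIALIZER;	/* stand-in for Giant */
bool sk_mpsafenet = false;		/* stand-in for debug.mpsafenet */
int sk_giant_acquisitions;		/* illustration only */

/* Take the big lock unless the network stack is marked MP-safe. */
bool
sk_net_lock_giant(void)
{
	if (sk_mpsafenet)
		return (false);
	pthread_mutex_lock(&sk_giant);
	sk_giant_acquisitions++;
	return (true);
}

/* Release the big lock only if sk_net_lock_giant() took it. */
void
sk_net_unlock_giant(bool taken)
{
	if (taken)
		pthread_mutex_unlock(&sk_giant);
}

/* Shape of a socket entry point under this scheme. */
int
sk_socket_syscall(void)
{
	bool taken = sk_net_lock_giant();
	int error = 0;

	/* ... socket layer work would go here ... */

	sk_net_unlock_giant(taken);
	return (error);
}
```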
Robert Watson
74041f5a10 When validating the length sum in recvit(), we fail to release
Giant on an error.  Add the missing Giant release.

Reviewed by:	sam, bms
2004-03-29 01:37:06 +00:00
Robert Watson
a1288c786e Conditionally assert Giant in fputsock() based on the value of
debug.mpsafenet.
2004-03-29 00:33:02 +00:00
Alan Cox
e3b19536fb Revise the direct or optimized case to use uiomove_fromphys() by the reader
instead of ephemeral mappings using pmap_qenter() by the writer.  The
writer is still, however, responsible for wiring the pages, just not
mapping them.  Consequently, the allocation of KVA for the direct case is
unnecessary.  Remove it and the sysctls limiting it, i.e.,
kern.ipc.maxpipekvawired and kern.ipc.amountpipekvawired.  The number
of temporarily wired pages is still, however, limited by
kern.ipc.maxpipekva.

Note: On platforms lacking a direct virtual-to-physical mapping,
uiomove_fromphys() uses sf_bufs to cache ephemeral mappings.  Thus,
the number of available sf_bufs can influence the performance of pipes
on platforms such as i386.  Surprisingly, I saw the greatest gain from this
change on such a machine: lmbench's pipe bandwidth result increased from
~1050MB/s to ~1850MB/s on my 2.4GHz, 400MHz FSB P4 Xeon.
2004-03-27 19:50:23 +00:00
Marcel Moolenaar
b2ae7ed72c Change the type of the various CPU masks to cpumask_t. Note that as
long as there are still explicit uses of int, whether in types or
in function names (such as atomic_set_int() in sched_ule.c), we can
not change cpumask_t to be anything other than u_int. See also the
commit log for sys/sys/types.h, revision 1.84.
2004-03-27 18:21:24 +00:00
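A CPU mask type along the lines of b2ae7ed72c might be used as in the sketch below. The sk_ prefixed names are hypothetical. The typedef to unsigned int follows the commit message's note that the type cannot yet be anything other than u_int, and the helpers assume cpu is less than the number of bits in the mask.

```c
/* Hypothetical mirror of cpumask_t; one bit per CPU. */
typedef unsigned int sk_cpumask_t;

/* Mask with only the bit for the given CPU set. */
sk_cpumask_t
sk_cpu_bit(unsigned int cpu)
{
	return ((sk_cpumask_t)1 << cpu);
}

/* Mark a CPU as present in the mask. */
void
sk_cpu_set(sk_cpumask_t *mask, unsigned int cpu)
{
	*mask |= sk_cpu_bit(cpu);
}

/* Nonzero if the CPU is present in the mask. */
int
sk_cpu_isset(sk_cpumask_t mask, unsigned int cpu)
{
	return ((mask & sk_cpu_bit(cpu)) != 0);
}
```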
Mike Makonnen
a73027fee9 Regen for libthr thread synchronization syscalls. 2004-03-27 14:34:17 +00:00
Mike Makonnen
0af67a2ef9 Use the proc lock to sleep on a libthr umtx. 2004-03-27 14:32:03 +00:00
Mike Makonnen
1713a51661 Separate thread synchronization from signals in libthr. Instead
use msleep() and wakeup_one().

Discussed with: jhb, peter, tjr
2004-03-27 14:30:43 +00:00
Pawel Jakub Dawidek
0b68054f9d - Add a description for vfs.usermount sysctl.
- Add the vfs_equalopts() function for mount options comparison.
  Now it looks much clearer.
- Style fixed.

In co-operation with:	bde
2004-03-27 08:39:28 +00:00
Pawel Jakub Dawidek
6c8cc8ec4b - Loudly disallow the MNT_SUIDDIR mount flag for mounts by unprivileged users.
- Style fixed.

Submitted by:	bde
2004-03-27 08:09:00 +00:00
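A check in the spirit of 6c8cc8ec4b might look like the sketch below. The flag value, the function name and the choice of EPERM are assumptions for illustration. The commit message only says that the flag is refused loudly for unprivileged mounts.

```c
#include <errno.h>
#include <stdbool.h>

#define	SK_MNT_SUIDDIR	0x1u	/* hypothetical value, not the kernel's */

/* Reject the flag outright instead of silently ignoring it. */
int
sk_check_mount_flags(bool privileged, unsigned int flags)
{
	if (!privileged && (flags & SK_MNT_SUIDDIR))
		return (EPERM);
	return (0);
}
```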
Pawel Jakub Dawidek
2c6040bbb7 We probably shouldn't allow users to mount file systems with MNT_SUIDDIR.
There should be no shell access when SUIDDIR is compiled in, but
better be sure.

Reviewed by:	rwatson
2004-03-26 21:12:14 +00:00
Alan Cox
2b63e7f397 Use uiomove_fromphys() instead of pmap_qenter() and pmap_qremove() in
proc_rwmem().
2004-03-24 23:35:04 +00:00
Warner Losh
9fc0327792 Conform to local file style and prefer (a && (b & flag)). 2004-03-24 16:49:37 +00:00
David E. O'Brien
0d50bcb36b Change the !MPSAFE boot string to something that doesn't potentially
scare users that the kernel won't run on MP systems.
2004-03-23 01:58:09 +00:00
Alfred Perlstein
12e9993f65 Emit a traceback when witness_trace is set and witness_warn() is
called and triggers (typically caused by sleeping with a non-sleepable
lock).

Reviewed by: jhb
2004-03-23 00:32:27 +00:00
David E. O'Brien
f1c8692d0a Rather than display which interrupts are MPSAFE, display those that aren't.
This way we can take stock of the work to be done.  boot -v will note those
interrupts that are MPSAFE.
2004-03-22 22:36:11 +00:00
Paul Saab
2eada6bc8e Remove some NetBSD debug code that crept into rev 1.116 2004-03-22 10:17:40 +00:00