Commit Graph

4591 Commits

Author SHA1 Message Date
Alfred Perlstein
85f190e4d1 Fixes to make select/poll mpsafe.
Problem:
  selwakeup required calling pfind which would cause lock order
  reversals with the allproc_lock and the per-process filedesc lock.
Solution:
  Instead of recording the pid of the select()'ing process into the
  selinfo structure, actually record a pointer to the thread.  To
  avoid dereferencing a bad address all the selinfo structures that
  are in use by a thread are kept in a list hung off the thread
  (protected by sellock).  When a selwakeup occurs the selinfo is
  removed from that thread's list; it is also removed on the way out
  of select or poll where the thread will traverse its list removing
  all the selinfos from its own list.

Problem:
  Previously the PROC_LOCK was used to provide the mutual exclusion
  needed to ensure proper locking, this couldn't work because there
  was a single condvar used for select and poll and condvars can
  only be used with a single mutex.
Solution:
  Introduce a global mutex 'sellock' which is used to provide mutual
  exclusion when recording events to wait on as well as performing
  notification when an event occurs.

Interesting note:
  schedlock is required to manipulate the per-thread TDF_SELECT
  flag, however if given its own field it would not need schedlock,
  also because TDF_SELECT is only manipulated under sellock one
  doesn't actually use schedlock for synchronization, only to protect
  against corruption.

Proc locks are no longer used in select/poll.

Portions contributed by: davidc
2002-03-14 01:32:30 +00:00
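The bookkeeping described in this commit can be sketched in userspace C. This is a minimal illustration, not the kernel code: every `sketch_` name is hypothetical, a pthread mutex stands in for sellock, a plain flag stands in for the condvar wakeup, and the case of several threads selecting on the same selinfo is ignored.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct sketch_thread;

struct sketch_selinfo {
	struct sketch_thread *si_thread;	/* waiting thread, or NULL */
	struct sketch_selinfo *si_next;		/* link in that thread's list */
};

struct sketch_thread {
	struct sketch_selinfo *td_selq;		/* selinfos this thread waits on */
	int td_woken;				/* stand-in for the condvar signal */
};

/* Hypothetical global lock playing the role of sellock. */
static pthread_mutex_t sketch_sellock = PTHREAD_MUTEX_INITIALIZER;

/* Unlink si from its thread's list; the caller holds sketch_sellock. */
static void
sketch_unlink(struct sketch_selinfo *si)
{
	struct sketch_selinfo **pp;

	for (pp = &si->si_thread->td_selq; *pp != NULL; pp = &(*pp)->si_next) {
		if (*pp == si) {
			*pp = si->si_next;
			break;
		}
	}
	si->si_thread = NULL;
	si->si_next = NULL;
}

/* Record a thread pointer (not a pid) in the selinfo. */
void
sketch_selrecord(struct sketch_thread *td, struct sketch_selinfo *si)
{
	assert(td != NULL && si != NULL);
	pthread_mutex_lock(&sketch_sellock);
	if (si->si_thread == NULL) {
		si->si_thread = td;
		si->si_next = td->td_selq;
		td->td_selq = si;
	}
	pthread_mutex_unlock(&sketch_sellock);
}

/* Event side: notify the thread and drop the selinfo from its list. */
void
sketch_selwakeup(struct sketch_selinfo *si)
{
	pthread_mutex_lock(&sketch_sellock);
	if (si->si_thread != NULL) {
		si->si_thread->td_woken = 1;
		sketch_unlink(si);
	}
	pthread_mutex_unlock(&sketch_sellock);
}

/* On the way out of select/poll: the thread empties its own list. */
void
sketch_seldone(struct sketch_thread *td)
{
	pthread_mutex_lock(&sketch_sellock);
	while (td->td_selq != NULL)
		sketch_unlink(td->td_selq);
	pthread_mutex_unlock(&sketch_sellock);
}
```

Because every selinfo that points at a thread is also on that thread's list, no stale thread pointer survives once the thread has called the cleanup routine.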
Brian Feldman
0e0af8ecda Rename SI_SUB_MUTEX to SI_SUB_MTX_POOL to make the name at all accurate.
While doing this, move it earlier in the sysinit boot process so that the
VM system can use it.

After that, the system is now able to use sx locks instead of lockmgr
locks in the VM system.  To accomplish this, some of the more
questionable uses of the locks (such as testing whether they are
owned or not, as well as allowing shared+exclusive recursion) are
removed, and simpler logic throughout is used so locks should also be
easier to understand.

This has been tested on my laptop for months, and has not shown any
problems on SMP systems, either, so appears quite safe.  One more
user of lockmgr down, many more to go :)
2002-03-13 23:48:08 +00:00
Archie Cobbs
44a8ff315e Add realloc() and reallocf(), and make free(NULL, ...) acceptable.
Reviewed by:	alfred
2002-03-13 01:42:33 +00:00
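A userspace sketch of the reallocf() contract, assuming the usual BSD semantics in which the original buffer is released when the reallocation fails. The names `sketch_reallocf` and `sketch_grow` are hypothetical, and the malloc type and flags arguments that the kernel routines take are omitted.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace analogue; not the kernel implementation. */
void *
sketch_reallocf(void *ptr, size_t size)
{
	void *newptr = realloc(ptr, size);

	/*
	 * When realloc() fails it leaves ptr allocated.  Release it so a
	 * caller that overwrites its only pointer cannot leak the buffer.
	 */
	if (newptr == NULL && ptr != NULL && size != 0)
		free(ptr);
	return (newptr);
}

/* Usage: grow a buffer in place; returns 0 on success, -1 on failure. */
int
sketch_grow(char **bufp, size_t newsize)
{
	assert(bufp != NULL);
	*bufp = sketch_reallocf(*bufp, newsize);
	return (*bufp != NULL ? 0 : -1);
}
```

Accepting a NULL pointer in free() serves the same goal: error paths can release a buffer unconditionally without first testing whether it was ever allocated.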
Jeff Roberson
8de00f4a87 This patch adds the "LOCKSHARED" option to namei which causes it to only acquire shared locks on leafs.
The stat() and open() calls have been changed to make use of this new functionality.  Using shared locks in
these cases is sufficient and can significantly reduce their latency if IO is pending to these vnodes.  Also,
this reduces the number of exclusive locks that are floating around in the system, which helps reduce the
number of deadlocks that occur.

A new kernel option "LOOKUP_SHARED" has been added.  It defaults to off so this patch can be turned on for
testing, and should eventually go away once it is proven to be stable.  I have personally been running this
patch for over a year now, so it is believed to be fully stable.

Reviewed by:	jake, obrien
Approved by:	jake
2002-03-12 04:00:11 +00:00
Poul-Henning Kamp
417fb7f6fa Make the disk_clone() routine more robust for abuse.
Sneak in a trivial bit of the GEOM stuff while we're here anyway.
2002-03-11 08:08:02 +00:00
Seigo Tanimura
183ccde6c6 Stop abusing the pgrpsess_lock. 2002-03-11 07:53:13 +00:00
Seigo Tanimura
aa3bf85c54 Do not lock the pgrpsess_lock exclusively across ttywait().
Spotted by:		David Wolfskill <david@catwhisker.org>
Investigated by:	rwatson
2002-03-11 07:51:08 +00:00
David Malone
6c75a65a00 Don't assign strcmp to a variable called err and then compare it
with zero, just compare strcmp with zero. This fixes the same bug
which Maxim just fixed and fixes some odd style too.

PR:		35712
Reviewed by:	arr
2002-03-10 23:12:43 +00:00
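The style point can be illustrated with a small lookup helper. This is only a sketch: `sketch_find_name` and the table layout are hypothetical and are not taken from the file this commit touches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: index of name in a table, or -1 if absent. */
int
sketch_find_name(const char *const *names, size_t count, const char *name)
{
	assert(names != NULL && name != NULL);
	for (size_t i = 0; i < count; i++) {
		/*
		 * strcmp() returns 0 on a match.  Comparing the call
		 * directly avoids parking the result in a variable named
		 * "err", where a later "if (err)" reads as an error test.
		 */
		if (strcmp(names[i], name) == 0)
			return ((int)i);
	}
	return (-1);
}
```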
Maxim Sobolev
832af2d5ed Fix a breakage introduced in rev.1.75 (supposedly style cleanup), which results
in "missing dependencies" error when loading some kld modules. It is sad to
see how often these days style cleanups break things that aren't broken. Perhaps
people should recall the good old principle: "don't fix it if it isn't broken".
2002-03-10 19:20:01 +00:00
Poul-Henning Kamp
01de1b13b8 Make the proposed name arg to dev_stdclone() const. 2002-03-10 10:50:05 +00:00
Alfred Perlstein
bbbb04ce62 Remove __P 2002-03-09 22:44:37 +00:00
Alfred Perlstein
be4af4b723 Don't deref NULL mutex pointer when pipeclose()'ing a pipe that is not
fully instantiated.

Revert the logic in pipeclose so that we don't have the entire function
pretty much under a single if() statement, instead invert the test and
just return if it fails.

Submitted (in different form) by: bde

Don't use pool mutexes for pipes.  We can not use pool mutexes
because we will need to grab the select lock while holding a pipe
lock which is not allowed because you may not acquire additional
mutexes when holding a pool mutex.

Instead malloc(9) space for the mutex that is shared between the
pipes.
2002-03-09 22:06:31 +00:00
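A rough userspace sketch of the two ideas in this commit: a mutex allocated with malloc and shared by both ends of a pipe, and a close routine that returns early when the pipe was never fully set up. All `sketch_` names are hypothetical, pthread mutexes stand in for kernel mutexes, and the "last closer frees the mutex" rule is an assumption made for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct sketch_pipe {
	pthread_mutex_t *mtxp;		/* shared with the peer; NULL until set up */
	struct sketch_pipe *peer;	/* other end, or NULL once it has closed */
};

/* Allocate one mutex and share it between both ends; 0 on success. */
int
sketch_pipe_pair_init(struct sketch_pipe *rd, struct sketch_pipe *wr)
{
	pthread_mutex_t *m;

	assert(rd != NULL && wr != NULL);
	m = malloc(sizeof(*m));
	if (m == NULL)
		return (-1);
	if (pthread_mutex_init(m, NULL) != 0) {
		free(m);
		return (-1);
	}
	rd->mtxp = wr->mtxp = m;
	rd->peer = wr;
	wr->peer = rd;
	return (0);
}

void
sketch_pipeclose(struct sketch_pipe *p)
{
	pthread_mutex_t *m;
	struct sketch_pipe *peer;

	/* Inverted test: return early instead of nesting the whole body. */
	if (p == NULL || p->mtxp == NULL)
		return;

	m = p->mtxp;
	pthread_mutex_lock(m);
	peer = p->peer;
	if (peer != NULL)
		peer->peer = NULL;	/* detach; the peer keeps the mutex */
	p->peer = NULL;
	p->mtxp = NULL;
	pthread_mutex_unlock(m);

	if (peer == NULL) {		/* last end closed: release the mutex */
		pthread_mutex_destroy(m);
		free(m);
	}
}
```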
Poul-Henning Kamp
1c1676edca Delete "notyet" code before it becomes "ohh no" code. 2002-03-09 20:11:25 +00:00
Luigi Rizzo
2dbd9d5bc3 Make the DEVICE_POLLING code compile with -Werror and in LINT 2002-03-09 08:02:52 +00:00
John Baldwin
60e269643d - Use a MI critical section in witness_sleep() and witness_list() as they
  simply need to prevent switching from another CPU and do not need
  interrupts disabled.
- Add a comment to witness_list() about why displaying spin locks for
  threads on other CPU's really is just a bad idea and probably shouldn't
  be done.
2002-03-08 18:57:57 +00:00
John Baldwin
c29824db05 Read KTR_CPU into a temporary variable so that we use a consistent value
for both the cpumask check and the cpu entry field w/o needing to use
a critical section.
2002-03-08 18:55:59 +00:00
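The "read once into a temporary" idea can be sketched as follows. The names are hypothetical: `sketch_curcpu` stands in for KTR_CPU, and the entry structure is reduced to two fields.

```c
#include <assert.h>
#include <stdatomic.h>

#define SKETCH_MAXCPU 32

struct sketch_ktr_entry {
	int cpu;
	const char *desc;
};

/* Stand-in for KTR_CPU: it may change whenever the thread migrates. */
static _Atomic int sketch_curcpu;

void
sketch_set_curcpu(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_MAXCPU);
	atomic_store(&sketch_curcpu, cpu);
}

/* Returns 1 if an entry was recorded, 0 if the CPU is masked out. */
int
sketch_trace(unsigned int cpumask, struct sketch_ktr_entry *entry,
    const char *desc)
{
	/* Read once; both uses below then see the same CPU number. */
	int cpu = atomic_load(&sketch_curcpu);

	if ((cpumask & (1u << cpu)) == 0)
		return (0);
	entry->cpu = cpu;
	entry->desc = desc;
	return (1);
}
```

Reading the value twice could let the mask check and the stored field disagree if the thread moved between the two reads; the temporary removes that window without blocking preemption.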
Poul-Henning Kamp
fb92273bdc Move the mount of the root filesystem to happen in the init process before
the exec of /sbin/init.

This allows the scheduler to get started and kthreads a chance to run
before we start filesystem operations.
2002-03-08 10:33:11 +00:00
Mike Silbersack
77a7d074e4 Unconditionally limit maxproc so that it is not possible
to exhaust all kmaps.  The only reward for setting maxproc
to a value which will cause kmap exhaustion is a panic
during a forkbomb attack.

MFC after:	3 days
2002-03-07 04:50:36 +00:00
Jake Burkholder
752dff3d9c Add needed includes of machine/smp.h, remove nested include in sys/smp.h
so that inlines in machine/smp.h can use variables declared in sys/smp.h.
2002-03-07 04:43:51 +00:00
Dag-Erling Smørgrav
e97c3e3d5c Rename runq_find() to runq_findproc(), and hide it behind #ifdef DIAGNOSTIC,
as it can have a severe impact on performance under high load, and the bug
it was meant to catch was fixed ages ago.
2002-03-06 15:34:07 +00:00
Maxim Konovalov
cf11f48256 Fix a typo, unbreak the world.
Thanks to:	mux
Approved by:	ru
2002-03-06 12:28:51 +00:00
Bruce Evans
3006e31679 Don't (blindly) truncate the unit number to 4 digits when formatting the
string returned by device_get_nameunit().
2002-03-06 11:34:02 +00:00
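One way to avoid a fixed digit budget is to let snprintf() report the length it needs. The sketch below is a userspace illustration under that assumption; `sketch_nameunit` is a hypothetical helper and not the code changed by this commit.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns a malloc()ed "<name><unit>" string, or NULL on failure. */
char *
sketch_nameunit(const char *name, int unit)
{
	char *buf;
	int len;

	assert(name != NULL);
	/* First pass: measure, so no unit number is ever truncated. */
	len = snprintf(NULL, 0, "%s%d", name, unit);
	if (len < 0)
		return (NULL);
	buf = malloc((size_t)len + 1);
	if (buf == NULL)
		return (NULL);
	/* Second pass: format into a buffer of exactly the right size. */
	snprintf(buf, (size_t)len + 1, "%s%d", name, unit);
	return (buf);
}
```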
Maxim Konovalov
9dfd307b10 Maximum semid is seminfo.semmni not seminfo.semmsl.
PR:		kern/34979
Submitted by:	James Gritton <jamie@gritton.org>
Reviewed by:	alfred, ru
Approved by:	ru
MFC after:	1 week
2002-03-06 10:52:49 +00:00
Robert Watson
89e1164ee2 Three p_ucred -> td_ucred's missed in jhb's earlier pass; all appear to
be safe.
2002-03-05 19:45:45 +00:00
Robert Watson
b0ad6e203a The change from td->td_proc->p_ucred to td->td_ucred has shortened some
lines: more aggressively line wrap under those circumstances.
2002-03-05 19:31:25 +00:00
John Baldwin
c6f55f33ea - Use td_ucred for jail checks.
- Move jail checks and some other checks involving constants and stack
  variables out from under Giant.  This isn't perfectly safe atm because
  jail_sysvipc_allowed is read w/o a lock meaning that its value could be
  stale.  This global variable will soon become a per-jail flag, however,
  at which time it will either not need a lock or will use the prison lock.
2002-03-05 18:57:36 +00:00
Eivind Eklund
f52bd684f3 * Move bswlist declaration and initialization from kern/vfs_bio.c to
vm/vm_pager.c, which is the only place it is used.
* Make the QUEUE_* definitions and bufqueues local to vfs_bio.c.
* constify buf_wmesg.
2002-03-05 18:20:58 +00:00
Eivind Eklund
04858e7ee4 Change wmesg to const char * instead of char * 2002-03-05 17:45:12 +00:00
Robert Watson
ba51c2659d Part II: update various mechanically generated files to allow for new
system call number allocations.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-03-05 16:13:01 +00:00
Robert Watson
11ffd032ff Reserve system call numbers for the MAC framework. This will prevent
people working on the MAC tree from getting toasted whenever system call
numbers are allocated in the main tree (for example, for KSE :-).
Calls allocated: __mac_{get,set}_proc, __mac_{get,set}_{fd,file}().

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-03-05 16:11:11 +00:00
Eivind Eklund
eb8e6d5276 Document all functions, global and static variables, and sysctls.
Includes some minor whitespace changes, and re-ordering to be able to document
properly (e.g, grouping of variables and the SYSCTL macro calls for them, where
the documentation has been added.)

Reviewed by:	phk (but all errors are mine)
2002-03-05 15:38:49 +00:00
Robert Drehmel
6f60771b6d Fix a warning. 2002-03-05 15:19:33 +00:00
Jeff Roberson
88c99cfbc8 Add a new variable mp_maxid. This is used so that per cpu datastructures may
be allocated as arrays indexed by the cpu id.  Previously the only reliable
way to know the max cpu id was through MAXCPU. mp_ncpus isn't useful here
because cpu ids may be sparsely mapped, although x86 and alpha do not do this.

Also, call cpu_mp_probe much earlier so the max cpu id is known before the VM
starts up.  This is intended to help support per cpu queues for the new
allocator, but may be useful elsewhere.

Reviewed by:	jake
Approved by:	jake
2002-03-05 10:01:46 +00:00
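A small sketch of why the highest CPU id, rather than the CPU count, has to size a per-CPU array when ids are sparse. The structure and function names are hypothetical; only the sizing rule (`maxid + 1` slots) reflects the commit message.

```c
#include <assert.h>
#include <stdlib.h>

struct sketch_pcpu_counters {
	unsigned long *slots;	/* one slot per possible cpu id */
	int maxid;		/* highest valid cpu id (like mp_maxid) */
};

/* Allocate maxid + 1 zeroed slots; returns 0 on success. */
int
sketch_counters_init(struct sketch_pcpu_counters *c, int maxid)
{
	assert(c != NULL && maxid >= 0);
	c->slots = calloc((size_t)maxid + 1, sizeof(*c->slots));
	if (c->slots == NULL)
		return (-1);
	c->maxid = maxid;
	return (0);
}

void
sketch_counters_bump(struct sketch_pcpu_counters *c, int cpuid)
{
	/* Indexing by cpu id is safe because the array covers 0..maxid. */
	assert(cpuid >= 0 && cpuid <= c->maxid);
	c->slots[cpuid]++;
}

unsigned long
sketch_counters_total(const struct sketch_pcpu_counters *c)
{
	unsigned long total = 0;

	for (int id = 0; id <= c->maxid; id++)
		total += c->slots[id];
	return (total);
}
```

With three CPUs numbered 0, 2 and 6, an array of three slots would be overrun by CPU 6, while an array of `maxid + 1 = 7` slots is not.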
Seigo Tanimura
996abba928 Track the number of wired pages to avoid unwiring unwired pages.
Reviewed by:	alfred
2002-03-05 00:51:03 +00:00
Mitsuru IWASAKI
899ccf541a Add generalized power profile code.
This allows other power-management systems (APM for now) to generate
power profile change events (i.e. AC-line status changes), and allows
other kernel components, not only the ACPI components, to be notified
of the events.

 - move subroutines in acpi_powerprofile.c (removed) to kern/subr_power.c
 - call power_profile_set_state() also from APM driver when AC-line
   status changes
 - add a callback function for Crusoe LongRun control on power
   profile changes as an example
2002-03-04 18:46:13 +00:00
Bosko Milekic
5a4f147089 Fix bug in mb_alloc that made systems configured with
PAGE_SIZE / MCLBYTES == 1 crash. Fix them by changing the
appropriate "allocate new page and bucket" code in mb_alloc to use
the macro for properly grabbing an allocated object from a bucket,
the one that checks whether the bucket is empty.
This should allow ken to continue testing zero-copy stuff on -CURRENT.

Noticed and provided debug info: ken
2002-03-03 22:10:04 +00:00
Dima Dorfman
e74d483140 Check the version of ex_anon (a `struct xucred') before using it to
fill out netc_anon (a `struct ucred'), and add an XXX around the
entire operation since it isn't clear whether it's doing the right
thing with things like cr_uidinfo and cr_prison.
2002-03-03 06:07:57 +00:00
Seigo Tanimura
92c914f936 Fix lock leakage and late unlock.
Submitted by:	bde
2002-03-02 12:42:24 +00:00
Ian Dowse
167b8d0334 In sosend(), enforce the socket buffer limits regardless of whether
the data was supplied as a uio or an mbuf. Previously the limit was
ignored for mbuf data, and NFS could run the kernel out of mbufs
when an ipfw rule blocked retransmissions.
2002-02-28 11:22:40 +00:00
Warner Losh
0cf3c909d8 Remove now unused struct proc *p.
Approved by: jhb
2002-02-27 20:57:57 +00:00
John Baldwin
bdd67d483c - Change namei() to use td_ucred instead of p_ucred.
- Change the hack in access() that uses a temporary credential to set
  td_ucred to the temp cred instead of p_ucred.
2002-02-27 19:15:29 +00:00
John Baldwin
6f105b3444 - Change unp_listen() to accept a thread rather than a proc as its second
argument.
- Use td_ucred in unp_listen() instead of p_ucred.
2002-02-27 19:14:01 +00:00
John Baldwin
4a7d6cd251 Fix Giant leakage in several error cases in __semctl(). 2002-02-27 19:12:14 +00:00
John Baldwin
6bd7ad69a1 Add a comment about an unlocked access to p_ucred that will go away in
the near future.
2002-02-27 19:10:50 +00:00
Alfred Perlstein
9f01374de5 kill __P. 2002-02-27 18:51:53 +00:00
Alfred Perlstein
566c1313a3 add assertions in the places where giant is required to catch when
the pipe is locked and shouldn't be.

initialize pipe->pipe_mtxp to NULL when creating pipes in order not
to trip the above assertions.

swap pipe lock with giant around calls to pipe_destroy_write_buffer()

pipe_destroy_write_buffer issue noticed by: jhb
2002-02-27 18:49:58 +00:00
John Baldwin
a854ed9893 Simple p_ucred -> td_ucred changes to start using the per-thread ucred
reference.
2002-02-27 18:32:23 +00:00
John Baldwin
65e3406d28 Temporarily lock Giant while we update td_ucred. The proc lock doesn't
fully protect p_ucred yet so Giant is needed until all the p_ucred
locking is done.  This is the original reason td_ucred was not used
immediately after its addition.  Unfortunately, not using td_ucred is
not enough to avoid problems.  Since p_ucred could be stale, we could
actually be dereferencing a stale pointer to dink with the refcount, so
we really need Giant to avoid foot-shooting.  This allows td_ucred to
be safely used as well.
2002-02-27 18:30:01 +00:00
Alfred Perlstein
21dbcfd500 Fix a NULL deref panic in pipe_write, we can't blindly lock
pipe->pipe_peer->pipe_mtxp because it may be NULL, so lock the
passed in pipe's mutex instead.
2002-02-27 17:23:16 +00:00
Robert Drehmel
ad1ff0997e Make getcredhostname() take a buffer and the buffer's size
as arguments.  The correct hostname is copied into the buffer
while having the prison's lock acquired in a jailed process'
case.

Reviewed by:	jhb, rwatson
2002-02-27 16:43:20 +00:00
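The interface change can be sketched in userspace: the caller supplies a buffer and its size, and the copy happens while the prison's lock is held, so no pointer into locked data escapes. All names here are hypothetical, a pthread mutex stands in for the prison lock, and the fallback hostname is a placeholder.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

struct sketch_prison {
	pthread_mutex_t lock;
	char host[64];
};

/* Placeholder for the system-wide hostname used by unjailed processes. */
static const char sketch_system_hostname[] = "host.example.org";

/* Copy the applicable hostname into buf, always NUL-terminated. */
void
sketch_getcredhostname(struct sketch_prison *pr, char *buf, size_t size)
{
	assert(buf != NULL && size > 0);
	if (pr == NULL) {
		/* Not jailed: use the system hostname. */
		snprintf(buf, size, "%s", sketch_system_hostname);
		return;
	}
	/* Jailed: copy while the lock is held so the name cannot change
	 * halfway through the copy. */
	pthread_mutex_lock(&pr->lock);
	snprintf(buf, size, "%s", pr->host);
	pthread_mutex_unlock(&pr->lock);
}
```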