Commit Graph

4994 Commits

Author SHA1 Message Date
jhb
366bb5db9c Whitespace bogon. 2002-04-27 04:48:36 +00:00
marcel
37e2e2ecca Insert a semi-colon between label 'skip:' and the closing brace
of the FOREACH loop to silence GCC 3.
2002-04-27 02:58:18 +00:00
mike
99e543a853 Move the new byte order function prototypes from <sys/param.h> to
<sys/endian.h>.  This puts us in line with NetBSD and OpenBSD.
2002-04-26 22:48:23 +00:00
phk
4c421c0b9a Now that the private parts of timecounters are no longer being fingered
by other bits of code, split struct timecounter into two.

struct timecounter contains just the bits which pertains to the hardware
counter and the reading of it.

struct timehands (as in "the hands on a clock") contains all the ugly bit
fidling stuff.  Statically compile ten timehands.

This commit is the functional part.  A later cosmetic patch will rename
various variables and fieldnames.
2002-04-26 21:51:08 +00:00
phk
d1d55e6cb9 Hide the private parts of timecounter from a couple of places that don't
really need to know the gory details.
2002-04-26 21:31:44 +00:00
phk
0054f0f74b Simplify the RFC2783 and PPS_SYNC timestamp collection API. 2002-04-26 20:24:28 +00:00
phk
04257819a4 Move the winding of timecounters out of hardclock and into a normal
timeout loop.

Limit the rate at which we wind the timecounters to approx 1000 Hz.

This limits the precision of the get{bin,nano,micro}[up]time(9)
functions to roughly a millisecond.
2002-04-26 12:37:36 +00:00
phk
91f1d49b73 Various cleanup and sorting of clock reading functions. Add the two
functions missing in the complete 12 function complement.
2002-04-26 10:19:29 +00:00
phk
76a2a4c2cf Rename tco_setscales() and tco_delta() to use the same tc_ prefix as
the rest of this file.
2002-04-26 10:11:02 +00:00
phk
f227fb83e6 Remove the tc_update() function. Any frequency change to the
timecounter will be used starting at the next second, which is
good enough for sysctl purposes.  If better adjustment is needed
the NTP PLL should be used.
2002-04-26 10:06:26 +00:00
brian
895107253f Test if rootvnode is NULL rather than if rootdev is NODEV when determining
if there's a filesystem present.

rootdev can be NODEV in the NFS-mounted root scenario.

Discussed with: Harti Brandt <brandt@fokus.gmd.de>, iedowse
2002-04-26 09:52:54 +00:00
silby
dd3cd5fed6 Make sure that sockets undergoing accept filtering are aborted in a
LRU fashion when the listen queue fills up.  Previously, there was
no mechanism to kick out old sockets, leading to an easy DoS of
daemons using accept filtering.

Reviewed by:	alfred
MFC after:	3 days
2002-04-26 02:07:46 +00:00
des
b3648bf706 Add the mutex profiling lock to the witness list. This hopefully unbreaks
the MUTEX_PROFILING + WITNESS + !WITNESS_SKIPSPIN case.

Submitted by:	Hiten Pandya <hiten@uk.FreeBSD.org>
2002-04-25 22:48:40 +00:00
bde
e1e6cfc088 Fixed some longstanding bugs in _getenv_static():
- malformed environment strings (ones without an '=') were not rejected.
  There shouldn't be any of these, but when the static environment is
  empty it always begins with one of these; this one should be considered
  as the terminator after the end of the environment, but it isn't.
- the comparison of the name being looked up with the name in the
  environment was fuzzy -- only the characters up to the length of the
  latter were compared, so _getenv_static("foobar") matched "foo=..."
  in the environment and everything matched "" in the empty environment.

MFC after:	3 days
2002-04-25 20:25:15 +00:00
bde
c7cc23aacf Break the following implementation of panic(3):
#!bin/sh

	# Original version of this by Michael Reifenberger
	# <root@nihil.plaut.de>.

	mdconfig -d -u 11 >/dev/null 2>&1
	dd if=/dev/zero of=zz bs=1m count=1

	while :
	do
		mdconfig -a -t vnode -f zz -u 11
		fdisk -f - -iv /dev/md11 <<EOF1
		g c1 h64 s32
		p 1 165 0 2048
		a 1
	EOF1
		mdconfig -d -u 11
	done

Garbage pointers in __si_u were not cleared by destroy_dev().  Not
clearing si_disk made the above fatal because the disk layer uses
si_disk as a flag to indicate that the dev_t has been completely
initialized.  disk_destroy() clears si_disk for the parent dev_t
but doesn't get called for children.

Not fixed:
- setting the undocumented sysctl debug.free_devt should cause more
  complete destruction of the dev_t including clearing of __si_u, but
  actually causes the above to panic a little earlier.
- the loop leaks 10 memory allocations per iteration (4 DEVFS, 2 devbuf
  and 4 dev_t).

Reviewed by:	timeout by MAINTAINER after 3 months
2002-04-25 13:17:33 +00:00
marcel
56d625090e Don't use the symbol name to lookup the symbol value when we can use
the symbol index defined by the relocation. The elf_lookup() support
function is to be used by elf_reloc() when symbol lookups need to be
done. The elf_lookup() function operates on the symbol index and
will do a symbol name based lookup when such is required, otherwise
it uses the symbol index directly. This solves the problem seen on
ia64 where the symbol hash table does not contain local symbols and
a symbol name based lookup would fail for those symbols.

Don't pass the symbol name to elf_reloc(), as it isn't used any more.
2002-04-25 01:22:16 +00:00
tanimura
1616fbed42 Free(9) should be Giant-free.
Suggested by:	jhb
2002-04-24 09:59:18 +00:00
silby
b4055530fc Remove sodropablereq - this function hasn't been used since the
syncache went in.

MFC after:	3 days
2002-04-24 04:11:08 +00:00
hsu
7bef5a6e99 The cold and panicstr variables do not need to be protected by sched_lock.
Submitted by:	Jennifer Yang (yangjihui@yahoo.com)
Reviewed by:	jake & jhb in principle
2002-04-23 19:50:22 +00:00
phk
834fdde07a Add a basic sanity check on pointers passed to free(9).
Should be improved by:	jeff
2002-04-23 18:50:25 +00:00
phk
bf5ba9f42b Don't call malloc(9) to allocate zero bytes softc data for devices. 2002-04-23 15:48:23 +00:00
rwatson
780f32f693 Slightly restructure extattr_get_vp() so that there's only one entry point
to VOP_GETEXTATTR().  This simplifies code flow when inserting MAC hooks.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-04-23 01:27:38 +00:00
alfred
d4c507ea29 Don't FILEDESC_LOCK around calls to falloc(). 2002-04-22 20:09:11 +00:00
des
4d6b787d2d Usage style sweep: spell "usage" with a small 'u'.
Also change one case of blatant __progname abuse (several more remain)
This commit does not touch anything in src/{contrib,crypto,gnu}/.
2002-04-22 13:44:47 +00:00
phk
68aee74f02 Comment out Kirks io-request priority hack until we can do this in a
civilized way which doesn't cause grief.

The problem is that it is not generally safe to cast a "struct bio
*" to a "struct buf *".  Things like ccd, vinum, ata-raid and GEOM
constructs bio's which are not entrails of a struct buf.

Also, curthread may or may not have anything to do with the I/O request
at hand.

The correct solution can either be to tag struct bio's with a
priority derived from the requesting threads nice and have disksort
act on this field, this wouldn't address the "silly-seek syndrome"
where two equal processes bang the diskheads from one edge to the
other of the disk repeatedly.

Alternatively, and probably better: a sleep should be introduced
either at the time the I/O is requested or at the time it is completed
where we can be sure to sleep in the right thread.

The sleep also needs to be in constant timeunits, 1/hz can be practicaly
any sub-second size, at high HZ the current code practically doesn't
do anything.
2002-04-22 06:53:20 +00:00
marcel
84ecc1bfc1 Add function link_elf_get_gp(), specific to ia64 for now, to get
the DT_PLTGOT value. On ia64 this is the value of GP. We need this
to construct function descriptors, but the elf file structure is
not exported to MD code.

Note that the name of the function is based on the meaning that
DT_PLTGOT has on ia64. This may differ on other architectures. As
such, link_elf_get_gp() has a high level of MD to it. Renaming the
function to describe what DT_* value is returned makes it generic,
but also makes the MD code less clear and if we only need this on
ia64, then a general name for a specific function doesn't help.

In short: I don't know what is "right" at this time, so I'll go
with what I have.
2002-04-21 21:08:30 +00:00
markm
b0c0526342 Use protected names (_foo) to cutdown on boatloads of lint warnings. 2002-04-21 11:16:10 +00:00
marcel
5de2c9fb38 GCC 3.x WARNS: Add a break to the default case. 2002-04-20 21:56:42 +00:00
tanimura
e2acd5cecf Push down Giant for setpgid(), setsid() and aio_daemon(). Giant protects only
malloc(9) and free(9).
2002-04-20 12:02:52 +00:00
rwatson
30744d9c56 Improve style consistency of vfs_syscalls.c by converting the style used
in various extattr_*() calls to match the rest of the file.  Originally,
these bits at the end looked more like style(9).  This patch was submitted
by green by way of the TrustedBSD MAC tree, and I fixed a few problems
with it on the way through.  Someone with more time on their hands should
convert the entire file to style(9); this commit is for diff reduction
purposes.

Submitted by:	green
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-04-20 01:37:08 +00:00
rwatson
4d39491e7e In sendfile(), use the vn_rdwr() helper function, rather than manually
constructing a struct aio and invoking VOP_READ() directly.  This cleans
up the code a little, but also has the advantage of making sure almost
all vnode read/write access in the kernel goes through the helper
function, meaning that instrumentation of that helper function can impact
almost all relevant read/write operations.  In this case, it permits us
to put MAC hooks into vn_rdwr() and not modify uipc_syscalls.c (yet).

In general, if helper vn_*() functions exist, they should be used in
preference to direct VOP's in system call service code.

Submitted by:	green
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-04-19 13:46:24 +00:00
rwatson
63ab78794e Divorce proc0 and proc1 credentials earlier; while this isn't technically
needed in the current code, in the MAC tree, create_init() relies on the
ability to modify the credentials present for initproc, and should not
perform that modification on a shared credential.  Pro-active diff
reduction against MAC changes that are in the queue; also facilitates
other work, including the capabilities implementation.

Submitted by:	green
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-04-19 13:35:53 +00:00
phk
f4a2041f29 suser is Giant safe, so optimize a pointless case. 2002-04-19 09:20:13 +00:00
suz
553226e8e1 just merged cosmetic changes from KAME to ease sync between KAME and FreeBSD.
(based on freebsd4-snap-20020128)

Reviewed by:	ume
MFC after:	1 week
2002-04-19 04:46:24 +00:00
nectar
fcc5ad0935 When exec'ing a set[ug]id program, make sure that the stdio file descriptors
(0, 1, 2) are allocated by opening /dev/null for any which are not already
open.

Reviewed by:	alfred, phk
MFC after:	2 days
2002-04-19 00:45:29 +00:00
mux
6961e47900 Avoid calling malloc() or free() while holding the
kenv lock.

Reviewed by:	jake
2002-04-17 17:51:10 +00:00
mux
a207e41bef Rework the kernel environment subsystem. We now convert the static
environment needed at boot time to a dynamic subsystem when VM is
up.  The dynamic kernel environment is protected by an sx lock.

This adds some new functions to manipulate the kernel environment :
freeenv(), setenv(), unsetenv() and testenv().  freeenv() has to be
called after every getenv() when you have finished using the string.
testenv() only tests if an environment variable is present, and
doesn't require a freeenv() call. setenv() and unsetenv() are self
explanatory.

The kenv(2) syscall exports these new functionalities to userland,
mainly for kenv(1).

Reviewed by:	peter
2002-04-17 13:06:36 +00:00
mux
c79270302c Add an entry for the kenv(2) syscall (code to follow).
Reviewed by: peter
2002-04-17 13:05:13 +00:00
iedowse
64322dabea The recent NFS forced unmount improvements introduced a side-effect
where some client operations might be unexpectedly cancelled during
an unsuccessful non-forced unmount attempt. This causes problems
for amd(8), because it periodically attempts a non-forced unmount
to check if the filesystem is still in use.

Fix this by adding a new mountpoint flag MNTK_UNMOUNTF that is set
only during the operation of a forced unmount. Use this instead of
MNTK_UNMOUNT to trigger the cancellation of hung NFS operations.

Also correct a problem where dounmount() might inadvertently clear
the MNTK_UNMOUNT flag.

Reported by:	simokawa
MFC after:	1 week
2002-04-17 01:07:29 +00:00
jhb
dba04cd736 Lock proctree_lock instead of pgrpsess_lock. 2002-04-16 17:11:34 +00:00
jhb
6cbba0bb03 - Lock proctree_lock instead of pgrpsess_lock.
- Use temporary variables to hold a pointer to a pgrp while we dink with it
  while not holding either the associated proc lock or proctree_lock.  It
  is in theory possible that p->p_pgrp could change out from under us.
2002-04-16 17:09:22 +00:00
jhb
d9a4c30c37 - Lock proctree_lock instead of pgrpsess_lock.
- Simplify return logic of setsid() and setpgid().
2002-04-16 17:06:11 +00:00
jhb
2ebbf84d61 - Lock proctree_lock instead of pgrpsess_lock.
- Exclusively lock proctree_lock while calling leavepgrp().
2002-04-16 17:04:21 +00:00
jhb
7202da4491 - Merge the pgrpsess_lock and proctree_lock sx locks into one proctree_lock
sx lock.  Trying to get the lock order between these locks was getting
  too complicated as the locking in wait1() was being fixed.
- leavepgrp() now requires an exclusive lock of proctree_lock to be held
  when it is called.
- fixjobc() no longer gets a shared lock of proctree_lock now that it
  requires an xlock be held by the caller.
- Locking notes in sys/proc.h are adjusted to note that everything that
  used to be protected by the pgrpsess_lock is now protected by the
  proctree_lock.
2002-04-16 17:03:05 +00:00
phk
2edc95ffee Remove two debug printfs which should never have been committed. 2002-04-15 21:08:51 +00:00
jhb
f656d44b0b You have to cast int64_t's to long long if you printf them with %lld.
This now compiles on alpha without a warning.

Pointy-hat to:	phk
2002-04-15 21:04:32 +00:00
phk
b6bf4c07cf Improve the implementation of adjtime(2).
Apply the change as a continuous slew rather than as a series of
discrete steps and make it possible to adjust arbitraryly huge
amounts of time in either direction.

In practice this is done by hooking into the same once-per-second
loop as the NTP PLL and setting a suitable frequency offset deducting
the amount slewed from the remainder.  If the remaining delta is
larger than 1 second we slew at 5000PPM (5msec/sec), for a delta
less than a second we slew at 500PPM (500usec/sec) and for the last
one second period we will slew at whatever rate (less than 500PPM)
it takes to eliminate the delta entirely.

The old implementation stepped the clock a number of microseconds
every HZ to acheive the same effect, using the same rates of change.

Eliminate the global variables tickadj, tickdelta and timedelta and
their various use and initializations.

This removes the most significant obstacle to running timecounter and
NTP housekeeping from a timeout rather than hardclock.
2002-04-15 12:23:11 +00:00
phk
ed0cd9a251 Take the "tickadj" element out of struct clockinfo. Our adjtime(2)
implementation is being changed and the very concept of tickadj will
no longer be meaningful.
2002-04-15 12:11:06 +00:00
phk
af54e26ee0 In the ntp_adjtime(2) syscall, return our actual estimate of unapplied
offset correction instead of the most recent offset applied.
2002-04-15 08:58:24 +00:00
jeff
6cb876e7dd Finish adding support code for sysctl kern.mprof. This dumps some malloc
information related to bucket size effeciency.  Three things are printed on
each row:

Size is the size the user actually asked for rounded to 16 bytes.
Requests is the number of times this size was asked for.
Real Size is the size we actually handed out.

At the end the total memory used and total waste is displayed.  Currently my
system displays about 33% wasted memory.

The intent of this code is to gather statistics for tuning the malloc bucket
sizes.  It is not intended to be run with INVARIANTS and it is not entirely
mp safe.  It can be enabled via 'options MALLOC_PROFILE' which was commited
earlier.
2002-04-15 05:24:01 +00:00
jeff
da6660250e Remove malloc_type's ks_limit.
Updated the kmemzones logic such that the ks_size bitmap can be used as an
index into it to report the size of the zone used.

Create the kern.malloc sysctl which replaces the kvm mechanism to report
similar data.  This will provide an easy place for statistics aggregation if
malloc_type statistics become per cpu data.

Add some code ifdef'd under MALLOC_PROFILING to facilitate a tool for sizing
the malloc buckets.
2002-04-15 04:05:53 +00:00
alfred
0925885691 Don't allow one to trace an ancestor when already traced.
PR: kern/29741
Submitted by: Dave Zarzycki <zarzycki@FreeBSD.org>
Fix from: Tim J. Robbins <tim@robbins.dropbear.id.au>
MFC After: 2 weeks
2002-04-14 17:12:55 +00:00
jeff
9089f1baf8 Use VOP_GETVOBJECT instead of accessing the member directly. This fixed
an issue with nullfs and NAMEI shared.

Submitted by:	Alexander Kabaev
2002-04-14 10:18:48 +00:00
alc
3ad9fd7f0b Regen 2002-04-14 05:33:58 +00:00
alc
a34b48c478 Remove the requirement that Giant be held around sigreturn(). 2002-04-14 05:31:47 +00:00
alc
7e3107d0af o Use aiocblist::fd_file in the AIO threads rather than recomputing
the file * from the calling process's descriptor table.
 o Eliminate sharing of the calling process's descriptor table
   with the AIO threads.
2002-04-14 03:04:19 +00:00
jhb
e93a8a367d - Change killpg1()'s first argument to be a thread instead of a process so
we can use td_ucred.
- In killpg1(), the proc lock is sufficient to check if p_stat is SZOMB
  or not.  We don't need sched_lock.
- Close some races in psignal().  In psignal() there is a big switch
  statement based on p_stat.  All the different cases are assuming that
  the process (or thread) isn't going to change state out from under it.
  To ensure this is true, just lock sched_lock for the entire switch.  We
  practically held it the entire time already anyways.  This also
  simplifies the locking somewhat and actually results in fewer lock
  operations.
- Allow signotify() to be called with the sched_lock held since psignal()
  now does that.
- Use td_ucred in a couple of places.
2002-04-13 23:33:36 +00:00
jhb
300593a2cc - Change donice() to take a thread as the first argument instead of a
process so it can use td_ucred.
- Require the target process of donice() to be locked when donice() is
  called.
- Use td_ucred.
- Lock the target process of p_cansee() and while reading the credentials
  of a process.
- Change the logic of rtprio() slightly so it does it's copyin() if needed
  prior to locking the target process.
- rtprio() no longer needs Giant.  In theory with full KSE it would still
  need Giant to protect p_ucred of curproc for the p_canfoo() functions
  but p_canfoo() will be changing to using td_ucred of curthread before
  full KSE hits the tree.
2002-04-13 23:28:23 +00:00
jhb
95ee443e6c - Change the algorithms of the syscalls to modify process credentials to
allocate a blank cred first, lock the process, perform checks on the
  old process credential, copy the old process credential into the new
  blank credential, modify the new credential, update the process
  credential pointer, unlock the process, and cleanup rather than trying
  to allocate a new credential after performing the checks on the old
  credential.
- Cleanup _setugid() a little bit.
- setlogin() doesn't need Giant thanks to pgrp/session locking and
  td_ucred.
2002-04-13 23:07:05 +00:00
jhb
418e247b74 - Change the first argument of ktrcanset(), ktrsetchildren(), and ktrops()
to a thread pointer so that ktrcanset() can use td_ucred.
- Add some proc locking to partially protect p_tracep and p_traceflag.
2002-04-13 22:54:18 +00:00
tmm
a0622efd75 Use pmap_extract() instead of pmap_kextract() to retrieve the physical
address associated with a user virtual address in
pipe_build_write_buffer().

Reviewed by:	alc
2002-04-13 20:09:06 +00:00
asmodai
4d94ee39e6 Use the correct macros for F_SETFD/F_GETFD instead of magic numbers.
Reflect that fact in the manual page.

PR:		12723
Submitted by:	Peter Jeremy <peter.jeremy@alcatel.com.au>
Approved by:	bde
MFC after:	2 weeks
2002-04-13 10:16:53 +00:00
tmm
86be827a6a Back out the last revision - it does not work correctly when one of
the pages in question is not in the top-level vm object, but in
one of the shadow ones.

Pointed out by: alc
Pointy hat to:	tmm
2002-04-13 00:03:07 +00:00
jhb
6629f872ca Rework ptrace(2) to be more locking friendly. We do any needed copyin()'s
and acquire the proctree_lock if needed first.  Then we lock the process
if necessary and fiddle with it as appropriate.  Finally we drop locks and
do any needed copyout's.  This greatly simplifies the locking.
2002-04-12 21:17:37 +00:00
tmm
1720bac84c Do not use pmap_kextract() to find out the physical address of a user
belong to a user virtual address; while this happens to work on some
architectures, it can't on sparc64, since user and kernel virtual
address spaces overlap there (the distinction between them is done via
separate address space identifiers).

Instead, look up the page in the vm_map of the process in question.

Reviewed by:	jake
2002-04-12 19:38:41 +00:00
hsu
74de2695a0 Fix corner case where m_len was not being initialized.
Submitted by:	Maksim Yevmenkin <myevmenk@digisle.net>
MFC after:	1 week
2002-04-12 00:01:50 +00:00
jhb
9522d33ac9 - Set the base priority of an ithread that has no handlers when we set its
normal priority.
- Lock sched_lock while we dink with the priorities.
- Remove a few extra blank lines.
2002-04-11 21:03:35 +00:00
alc
200626256b Regen 2002-04-11 17:35:53 +00:00
alc
8a20a702cf Remove the requirement that Giant be held around osigreturn(). All platform-
specific implementations are MPSAFE.
2002-04-11 17:34:38 +00:00
jhb
ce939dfab8 - Change settime() to take a thread as its first argument instead of a proc
so it can use td_ucred.
- Push Giant down into the end of settime() where we actually set the time
  on the timecounter and time of day clock.
- Remove Giant from clock_settime().
- Push Giant down in settimeofday() to just protect the 'tz' global
  variable.
2002-04-10 04:09:07 +00:00
jhb
78e19df6f6 Display the recursion count in the lock_instance in the show locks
output.

Indirectly requested by:	peter
2002-04-10 01:25:11 +00:00
jhb
ad726578d6 Cosmetic fixup in output of lock types in show locks output. 2002-04-10 01:19:53 +00:00
brian
cd534ec28e In linker_load_module(), check that rootdev != NODEV before calling
linker_search_module().

Without this, modules loaded from loader.conf that then try to load
in additional modules (such as digi.ko loading a card's BIOS) die
badly in the vn_open() called from linker_search_module().

It may be worth checking (KASSERTing?) that rootdev != NODEV in
vn_open() too.
2002-04-10 01:14:45 +00:00
brian
8ad55476a0 Change linker_reference_module() so that it's passed a struct
mod_depend * (which may be NULL).  The only consumer of this
function at the moment is digi_loadmoduledata(), and that passes
a NULL mod_depend *.

In linker_reference_module(), check to see if we've already got
the required module loaded.  If we have, bump the reference count
and return that, otherwise continue the module search as normal.
2002-04-10 01:13:57 +00:00
jhb
97bce5a40f - Change fill_kinfo_proc() to require that the process is locked when it
is called.
- Change sysctl_out_proc() to require that the process is locked when it
  is called and to drop the lock before it returns.  If this proves too
  complex we can change sysctl_out_proc() to simply acquire the lock at
  the very end and have the calling code drop the lock right after it
  returns.
- Lock the process we are going to export before the p_cansee() in the
  loop in sysctl_kern_proc() and hold the lock until we call
  sysctl_out_proc().
- Don't call p_cansee() on the process about to be exported twice in
  the aforementioned loop.
2002-04-09 20:10:46 +00:00
jhb
026e9455de Whitespace changes to wrap long lines. 2002-04-09 20:01:16 +00:00
jhb
1fbf4e9848 We don't need Giant to read the pgrp ID since the proc lock has protected
p_pgrp since the pgrp locking went in.  We also don't need it to check for
invalid values in the options argument to wait1(), so push Giant down
slightly.
2002-04-09 20:00:40 +00:00
jhb
fc492a338c - Remove an early KSE diagnostic panic. The thread pointer here is always
curthread.
- We don't need Giant to do suser() checks now, so don't lock Giant until
  after the check.
2002-04-09 19:58:38 +00:00
jhb
f7c4d57b64 Don't lock the ithread lock in ithread_create(). The ithread isn't on any
lists or in any tables yet so there are no other references to it, thus
we don't need to lock it.
2002-04-09 16:26:37 +00:00
phk
a90e28ebbb Implement DIOCGFRONTSTUFF ioctl which reports how many bytes from the start
of the device magic stuff might occupy.

Sponsored by: DARPA & NAI Labs.
2002-04-09 15:43:32 +00:00
phk
5b960672bf Rename DIOCGKERNELDUMP to DIOCSKERNELDUMP as it strictly speaking
is a "set" not a "get" operation.

Sponsored by:	DARPA & NAI Labs.
2002-04-09 10:04:09 +00:00
jeff
0b5e15cef7 Turn #ifdef LOOKUP_SHARED into #ifndef LOOKUP_EXCLUSIVE to enable this
behavior by default.  Also, change the options line to reflect this.

If there are no problems reported this will become the only behavior and the
knob will be removed in a month or so.

Demanded by:	obrien
2002-04-09 05:14:17 +00:00
mux
bf7d877bcd The fourth parameter to copystr() is a size_t, not an int.
Approved by:	peter
2002-04-08 21:14:19 +00:00
phk
33405073ec Move generic disk ioctls from <sys/disklabel.h> to <sys/disk.h>.
Sponsored by:	DARPA & NAI Labs
2002-04-08 09:20:07 +00:00
phk
e1803d493e Put back dumppcb, but this time we put a comment to tell what it is for.
Brucifixion by:	bde
2002-04-08 06:59:13 +00:00
alc
548fecffbc Restructure aio_return() to eliminate duplicated code and facilitate Giant
push down.
2002-04-08 04:57:56 +00:00
hsu
d9992faaf2 There's only one socket zone so we don't need to remember it
in every socket structure.
2002-04-08 03:04:22 +00:00
mux
00b6b34450 o Change kernel_vmount() interface to be more convenient : pass two
separate strings instead of passing "foo=bar".
o Don't forget to clear the VMOUNT flag on the vnode when vfs_nmount()
  fails because the fs doesn't implement VFS_NMOUNT (and in vfs_mount()
  when the fs doesn't implement VFS_MOUNT) ; also decrement the vfs
  refcount in the !MNT_UPDATE case.
2002-04-07 13:22:47 +00:00
dwmalone
d7ca365130 Remove a comment which relates to the old name cache code, which
was replaced in 1997.

Approved by:	phk
2002-04-07 08:58:31 +00:00
alc
bfb320784e Reduce the duplication of code for error handling in _aio_aqueue(). 2002-04-07 07:17:59 +00:00
alc
2f8880db13 Change jobref and *ijoblist from int to long in order to avoid
a catastrophe after the 2^32nd AIO operation on 64-bit architectures.
2002-04-07 01:28:34 +00:00
jake
7f897ef089 Remove a stale comment. 2002-04-06 08:44:04 +00:00
jake
553fb6e233 Include machine/ktr.h for sparc64 so we pick up KTR_CPU. 2002-04-06 08:43:17 +00:00
jake
3fadd16f0d Use CTASSERT rather than a runtime check to detect kinfo_proc size changes.
Remove the ugly yuck code to busy wait for 20 seconds.
2002-04-06 08:13:52 +00:00
nyan
e4475cba04 Added the new kernel dumping support for pc98. 2002-04-06 06:41:54 +00:00
bde
572176e33d Updated a doubly stale comment about signotify(). Fixed a nearby long line. 2002-04-05 10:00:37 +00:00
peter
293815b90a Increase the size of the register stack storage on ia64 from 32K to 2MB so
that we can compile gcc.  This is a hack because it adds a fixed 2MB to
each process's VSIZE regardless of how much is really being used since
there is no grow-up stack support.  At least it isn't physical memory.
Sigh.

Add a sysctl to enable tweaking it for new processes.
2002-04-05 01:57:45 +00:00
tmm
91f571835a Add a generic implementation of inittodr() and resettodr(), as well as
a set of helper routines to deal with real-time clocks. The generic
functions access the clock diver using a kobj interface. This is intended
to reduce code reduplication and make it easy to support more than one
clock model on a single architecture.

This code is currently only used on sparc64, but it is planned to convert
the code of the other architectures to it later.
2002-04-04 23:39:10 +00:00
jhb
db9aa81e23 Change callers of mtx_init() to pass in an appropriate lock type name. In
most cases NULL is passed, but in some cases such as network driver locks
(which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used.

Tested on:	i386, alpha, sparc64
2002-04-04 21:03:38 +00:00
jhb
ec0e08c944 Change mtx_init() to now take an extra argument. The third argument is
the generic lock type for use with witness.  If this argument is NULL then
the lock name is used as the lock type.  Add a macro for a lock type name
for network driver locks.
2002-04-04 20:52:27 +00:00
jhb
883d8a5526 Set the lock type equal to the lock name for now as all of the current
sx locks don't use very specific lock names.
2002-04-04 20:49:35 +00:00
jhb
8143d2b80e Add a new char * pointer lo_type to struct lock_object that is used to
point to a more generic name for a lock that is more suitable for use by
witness when grouping locks.  For example, although network driver locks
use the interface name for the name of each lock, they should all use the
same witness and be treated the same as witness.  Another example is that
all UMA zone locks should be treated the same.  The witness code has also
been updated to print out the lock type in addition to the lock name in a
few places where it is relevant.
2002-04-04 20:45:21 +00:00
phk
38f498fe43 Delete the bogus d_boot[01] fields from struct disklabel.
This shrinks the size 4 bytes on alpha, down to the same 276 bytes
as all other platforms.

Construct a hack to make old ioctls work on new kernels.

Once world is recompiled only the new and correct sysctls will be
used.

This hack will become annoying around 1st of may to make people
rebuild their worlds and it will be gone before 5.0.
2002-04-04 20:34:48 +00:00
bde
14ae95f735 Moved signal handling and rescheduling from userret() to ast() so that
they aren't in the usual path of execution for syscalls and traps.
The main complication for this is that we have to set flags to control
ast() everywhere that changes the signal mask.

Avoid locking in userret() in most of the remaining cases.

Submitted by:	luoqi (first part only, long ago, reorganized by me)
Reminded by:	dillon
2002-04-04 17:49:48 +00:00
bde
3b8182ff40 Optimized the check for unmasked pending signals in CURSIG() using a new
inline function sigsetmasked() and a new macro SIGPENDING().  CURSIG()
will soon be moved out of the normal path of execution for syscalls and
traps.  Then its efficiency will be less important but the new interfaces
will be useful for checking for unmasked pending signals in more places.

Submitted by:		luoqi (long ago, in a slightly different form)

Assert that sched_lock is not held in CURSIG().
2002-04-04 15:19:41 +00:00
alc
db11618136 o aio_process needn't fhold()/fdrop() the fp now that _aio_aqueue() and
aio_free_entry() do this.
 o Remove two unnecessary/unused variables from aio_process() and one field
   from aiocblist.
2002-04-04 02:13:20 +00:00
alfred
6dc270c501 Avoid a lock order reversal by dropping the eventhandler_mutex earlier.
We get enough protection from the lock on the individual lists that we
aquire later.

Noticed/Tested by: Steven G. Kargl <kargl@troutmask.apl.washington.edu>
Submitted by: Jonathan Mini <mini@haikugeek.com>
2002-04-04 00:52:03 +00:00
jhb
9fa365d7d6 - Axe a stale comment. We haven't allowed the ucred pointer passed to
securelevel_*() to be NULL for a while now.
- Use KASSERT() instead of if (foo) panic(); to optimize the
  !INVARIANTS case.

Submitted by:	Martin Faxer <gmh003532@brfmasthugget.se>
2002-04-03 18:35:25 +00:00
mux
9effffd331 Add two forgotten vfs_unbusy() calls, in vfs_mount() and vfs_nmount().
Reviewed by:	phk
2002-04-03 12:19:03 +00:00
ru
d8ffece3c4 Dike out a highly insecure UCONSOLE option.
TIOCCONS must be able to VOP_ACCESS() /dev/console to succeed.

Obtained from:	OpenBSD
2002-04-03 10:56:59 +00:00
dillon
9a85737b15 brelse() was improperly clearing B_DELWRI in the B_DELWRI|B_INVAL case
without removing the buffer from the vnode's dirty buffer list, which
can result in a panic in NFS.  Replaced the code with a call to bundirty()
which deals with it properly.

PR:		kern/36108, kern/36174
Submitted by:	various people
Special mention: to Danny Schales <dan@coes.LaTech.edu> for providing a core dump that helped me track this down.
MFC after:	1 day
2002-04-03 00:17:36 +00:00
des
2a92b78602 Revert to open hashing. It makes the code simpler, and works farily well
even when the number of records approaches the size of the hash table.
Besides, the previous implementation (using linear probing) was broken :)

Also, use the newly introduced MTX_SYSINIT.
2002-04-02 23:26:32 +00:00
jhb
9d3d63fcbc - Move the MI mutexes sched_lock and Giant from being declared in the
various machdep.c's to being declared in kern_mutex.c.
- Add a new function mutex_init() used to perform early initialization
  needed for mutexes such as setting up thread0's contested lock list
  and initializing MI mutexes.  Change the various MD startup routines
  to call this function instead of duplicating all the code themselves.

Tested on:	alpha, i386
2002-04-02 22:19:16 +00:00
jhb
9153749ef0 Spelling police. 2002-04-02 20:44:30 +00:00
jhb
2c4739409a Enforce an implicit lock order of sleepable locks before non-sleepable
locks.
2002-04-02 19:27:21 +00:00
arr
e3fb0536de - Add a mutex to lock the global securelevel value.
- Make use of MTX_SYSINIT() as the means to initialize our mutex lock.
2002-04-02 17:43:17 +00:00
tanimura
448edc64b4 Fix leakage of p_pgrp lock. 2002-04-02 17:12:06 +00:00
jhb
77dc513737 Explicitly document how we implicitly enforce the lock order of sleep
locks before spin locks.
2002-04-02 16:51:20 +00:00
arr
6ae00dcc9f - Add MTX_SYSINIT and SX_SYSINIT as macro glue for allowing sx and mtx
locks to be able to setup a SYSINIT call.  This helps in places where
  a lock is needed to protect some data, but the data is not truly
  associated with a subsystem that can properly initialize it's lock.
  The macros use the mtx_sysinit() and sx_sysinit() functions,
  respectively, as the handler argument to SYSINIT().

Reviewed by: alfred, jhb, smp@
2002-04-02 16:05:43 +00:00
des
cbcf839df4 Instead of get_cyclecount(9), use nanotime(9) to record acquisition and
release times.  Measurements are made and stored in nanoseconds but
presented in microseconds, which should be sufficient for the locks for
which we actually want this (those that are held long and / or often).
Also, rename some variables and structure members to unit-agnostic names.
2002-04-02 14:42:01 +00:00
phk
4d586060a3 Retire the bogus ioctl DIOCGPART in toto.
Once again we can notice that badly thought out hacks ferment and infect
far more code than initially expected.

Sponsored by:	DARPA and NAI Labs.
2002-04-02 11:52:13 +00:00
marcel
5dc73db814 Don't compile the dummy dumpsys for ia64. 2002-04-02 10:55:40 +00:00
rwatson
219837a0a6 Update comment regarding the locking of the sysctl tree.
Rename memlock to sysctllock, and MEMLOCK()/MEMUNLOCK() to SYSCTL_LOCK()/
SYSCTL_UNLOCK() and related changes to make the lock names make more
sense.

Submitted by:	Jonathan Mini <mini@haikugeek.com>
2002-04-02 05:50:07 +00:00
alfred
4a63b9d69c Use sx locks instead of flags+tsleep locks.
Submitted by: Jonathan Mini <mini@haikugeek.com>
2002-04-02 04:20:38 +00:00
alfred
cb408d85e7 Use sx locks rather than lockmgr locks for eventhandlers.
Submitted by: Jonathan Mini <mini@haikugeek.com>
2002-04-02 04:18:54 +00:00
des
f6a3790f10 Mutex profiling code, conditional on the MUTEX_PROFILING option. Adds the
following sysctl variables:

  debug.mutex.prof.enable	    enable / disable profiling
  debug.mutex.prof.acquisitions	    number of mutex acquisitions recorded
  debug.mutex.prof.records	    number of acquisition points recorded
  debug.mutex.prof.maxrecords	    max number of acquisition points
  debug.mutex.prof.rejected	    number of rejections (due to full table)
  debug.mutex.prof.hashsize	    hash size
  debug.mutex.prof.collisions	    number of hash collisions
  debug.mutex.prof.stats	    profiling statistics

The code records four numbers for each acquisition point (identified by
source file name and line number): longest time held, total time held,
number of non-recursive acquisitions, average time held.  The measurements
are in clock cycles (as returned by get_cyclecount(9)); this may cause
measurements on some SMP systems to be unreliable.  This can probably be
worked around by replacing get_cyclecount(9) by some incarnation of
nanotime(9).

This work was derived from initial patches by eivind.
2002-04-02 00:01:49 +00:00
dillon
3ad295d416 Stage-2 commit of the critical*() code. This re-inlines cpu_critical_enter()
and cpu_critical_exit() and moves associated critical prototypes into their
own header file, <arch>/<arch>/critical.h, which is only included by the
three MI source files that need it.

Backout and re-apply improperly comitted syntactical cleanups made to files
that were still under active development.  Backout improperly comitted program
structure changes that moved localized declarations to the top of two
procedures.  Partially re-apply one of the program structure changes to
move 'mask' into an intermediate block rather then in three separate
sub-blocks to make the code more readable.  Re-integrate bug fixes that Jake
made to the sparc64 code.

Note: In general, developers should not gratuitously move declarations out
of sub-blocks.  They are where they are for reasons of structure, grouping,
readability, compiler-localizability, and to avoid developer-introduced bugs
similar to several found in recent years in the VFS and VM code.

Reviewed by:	jake
2002-04-01 23:51:23 +00:00
jhb
dc2e474f79 Change the suser() API to take advantage of td_ucred as well as do a
general cleanup of the API.  The entire API now consists of two functions
similar to the pre-KSE API.  The suser() function takes a thread pointer
as its only argument.  The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0.  The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.

Discussed on:	smp@
2002-04-01 21:31:13 +00:00
jhb
81ff87afc2 Whitespace only change: use ANSI function declarations instead of K&R. 2002-04-01 20:13:31 +00:00
phk
07b4c10b28 Extend a hack to also hack around PC98's definition of __i386__ 2002-04-01 20:13:03 +00:00
jhb
7205e92665 Fix style bug in previous commit. 2002-04-01 17:53:42 +00:00
jake
f9f52274db ktr changes to improve performance and make writing a userland utility to
dump the trace buffer feasible.
- Remove KTR_EXTEND.  This changes the format of the trace entries when
  activated, making writing a userland tool which is not tied to a specific
  kernel configuration difficult.
- Use get_cyclecount() for timestamps.  nanotime() is much too heavy weight
  and requires recursion protection due to ktr traces occuring as a result
  of ktr traces.  KTR_VERBOSE may still require recursion protection, which
  is now conditional on it.
- Allow KTR_CPU to be overridden by MD code.  This is so that it is possible
  to trace early in startup before pcpu and/or curthread are setup.
- Add a version number for the ktr interface.  A userland tool can check this
  to detect mismatches.
- Use an array for the parameters to make decoding in userland easier.
- Add file and line recording to the non-extended traces now that the extended
  version is no more.

These changes will break gdb macros to decode the extended version of the
trace buffer which are floating around.  Users of these macros should either
use the show ktr command in ddb, or use the userland utility which can be run
on a core dump.

Approved by:	jhb
Tested on:	i386, sparc64
2002-04-01 05:35:26 +00:00
phk
ef82a51634 Here follows the new kernel dumping infrastructure.
Caveats:

The new savecore program is not complete in the sense that it emulates
enough of the old savecores features to do the job, but implements none
of the options yet.

I would appreciate if a userland hacker could help me out getting savecore
to do what we want it to do from a users point of view, compression,
email-notification, space reservation etc etc.  (send me email if
you are interested).

Currently, savecore will scan all devices marked as "swap" or "dump" in
/etc/fstab _or_ any devices specified on the command-line.

All architectures but i386 lack an implementation of dumpsys(), but
looking at the i386 version it should be trivial for anybody familiar
with the platform(s) to provide this function.

Documentation is quite sparse at this time, more to come.

Details:

ATA and SCSI drivers should work as the dump formatting code has been
removed.  The IDA, TWE and AAC have not yet been converted.

Dumpon now opens the device and uses ioctl(DIOCGKERNELDUMP) to set
the device as dumpdev.  To implement the "off" argument, /dev/null
is used as the device.

Savecore will fail if handed any options since they are not (yet)
implemented.  All devices marked "dump" or "swap" in /etc/fstab
will be scanned and dumps found will be saved to diskfiles
named from the MD5 hash of the header record.  The header record
is dumped in readable format in the .info file.  The kernel
is not saved.  Only complete dumps will be saved.

All maintainer rights for this code are disclaimed: feel free to
improve and extend.

Sponsored by:   DARPA, NAI Labs
2002-03-31 22:37:00 +00:00
phk
ac4142e564 Implement the two "GEOM" ioctls DIOCGSECTORSIZE and DIOCGMEDIASIZE for
the non-GEOM code as well.  This simplifies the the kernel-dumping
and disk-management tools as less compatibility cruft will be needed.

Sponsored by:	DARPA and NAI Labs.
2002-03-31 21:17:12 +00:00
alc
6985f3b63c Keep the reference to the file acquired in _aio_aqueue() until the operation
completes.  The reference is released in aio_free_entry().

Submitted by:	tegge
2002-03-31 20:17:56 +00:00
alfred
abeff55bde Close some holes with p->p_args by NULL'ing out the p->p_args pointer
while holding the proc lock, and by holding the pargs structure when
accessing it from outside of the owner.

Submitted by: Jonathan Mini <mini@haikugeek.com>
2002-03-31 10:33:12 +00:00
phk
87273d930a Centralize the "bootdev" and "dumpdev" variables. They are still pretty
bogus all things considered, but at least now they don't camouflage as
being MD variables.
2002-03-31 07:15:28 +00:00
alc
bd314bbdd7 Add a local proc *p in exec_new_vmspace() to avoid repeated dereferencing
to obtain it.
2002-03-31 00:05:30 +00:00
bde
2f7ae9b739 Fixed handling of short reads in readdisklabel() and writedisklabel().
These functions use DEV_STRATEGY() which can easily return a short
count (with no error) for reads near EOF.  EOF happens for "disks" too
small to contain a label sector (mainly for empty slices).  The functions
didn't understand this at all, and looked for labels in the garbage
in the buffer beyond what DEV_STRATEGY() returned.  The recent UMA
changes combined with my local changes and configuration resulted in
the garbage often containing a valid but garbage label left over from
a previous call.

Bugs in EOF handling in -current limited the problem to "disks" with
size precisely LABELSECTOR sectors.  LABELSECTOR happens to be a very
unusual "disk" size since it is only 0 for non-i386 arches that don't
usually have disks with DOS MBRs.
2002-03-30 16:02:43 +00:00
dan
ade94cf622 Nuke CV_DEBUG in favour of INVARIANTS.
Approved by: jhb
2002-03-30 03:52:52 +00:00
jake
855079d5b7 Style fixes purposefully left out of last commit. I checked the kse tree
and didn't see any changes that this conflicts with.
2002-03-29 16:45:03 +00:00
jake
8f9ce8398d Remove abuse of intr_disable/restore in MI code by moving the loop in ast()
back into the calling MD code.  The MD code must ensure no races between
checking the astpening flag and returning to usermode.

Submitted by:	peter (ia64 bits)
Tested on:	alpha (peter, jeff), i386, ia64 (peter), sparc64
2002-03-29 16:35:26 +00:00
tanimura
9ae6d1242c The description of fd_mtx is "filedesc structure." 2002-03-29 11:26:05 +00:00
mdodd
9e60cd20eb Add resource_list_add_next() which returns the RID for the resource added. 2002-03-29 06:42:54 +00:00
alfred
1118775014 To remove nested include of sys/lock.h and sys/mutex.h from sys/proc.h
make the pargs_* functions into non-inlines in kern/kern_proc.c.

Requested by: bde
2002-03-28 18:12:27 +00:00
phk
8834f75902 Get the magnitude of the NTP adjustment right. 2002-03-28 16:02:44 +00:00
mux
ac4c018837 - Properly sync vfs_nmount() with changes that have be already done
in vfs_mount(), in particular revisions 1.215, 1.227 and 1.240.
- flag2 is a low quality variable name, change it to kern_flag.
- strncpy NUL-terminates f_fstypename and f_mntonname since the strings
  have length <= <buffer length> - 1, so the explicit NUL-termination is
  bogus.
- M_ZERO'ing space for fstype and fspath is stupid since we never use the
  space beyond the end of the string.
- Do various style(9) cleanups in both functions.

Submitted by:	bde
Reviewed by:	phk
2002-03-28 13:47:32 +00:00
alc
b35f60b7fd Allow resursion on the pipe mutex because filt_piperead() and filt_pipewrite()
can be called both with and without the pipe mutex held.  (For example,
if called by pipeselwakeup(), it is held.  Whereas, if called by kqueue_scan(),
it is not.)

Reviewed by:	alfred
2002-03-27 21:47:50 +00:00
alfred
c513408927 Make the reference counting of 'struct pargs' SMP safe.
There is still some locations where the PROC lock should be held
in order to prevent inconsistent views from outside (like the
proc->p_fd fix for kern/vfs_syscalls.c:checkdirs()) that can be
fixed later.

Submitted by: Jonathan Mini <mini@haikugeek.com>
2002-03-27 21:36:18 +00:00
jeff
dff418f166 Add a new mtx_init option "MTX_DUPOK" which allows duplicate acquires of locks
with this flag.  Remove the dup_list and dup_ok code from subr_witness.  Now
we just check for the flag instead of doing string compares.

Also, switch the process lock, process group lock, and uma per cpu locks over
to this interface.  The original mechanism did not work well for uma because
per cpu lock names are unique to each zone.

Approved by:	jhb
2002-03-27 09:23:41 +00:00
dillon
7fa55182e2 oops, forgot to commit this. td->td_savecrit = 0 replaced by API
call cpu_thread_link().
2002-03-27 08:26:37 +00:00
jake
c214910390 Make this compile.
Pointy hat to:	dillon
2002-03-27 06:44:32 +00:00
dillon
dc5aafeb94 Compromise for critical*()/cpu_critical*() recommit. Cleanup the interrupt
disablement assumptions in kern_fork.c by adding another API call,
cpu_critical_fork_exit().  Cleanup the td_savecrit field by moving it
from MI to MD.  Temporarily move cpu_critical*() from <arch>/include/cpufunc.h
to <arch>/<arch>/critical.c (stage-2 will clean this up).

Implement interrupt deferral for i386 that allows interrupts to remain
enabled inside critical sections.  This also fixes an IPI interlock bug,
and requires uses of icu_lock to be enclosed in a true interrupt disablement.

This is the stage-1 commit.  Stage-2 will occur after stage-1 has stabilized,
and will move cpu_critical*() into its own header file(s) + other things.
This commit may break non-i386 architectures in trivial ways.  This should
be temporary.

Reviewed by:	core
Approved by:	core
2002-03-27 05:39:23 +00:00
bde
cd522c5374 "Fixed" -Wshadow warnings by changing the name of some function parameters
from `index' to `indx'.  The correct fix would be to not support or use
index().
2002-03-27 04:04:17 +00:00
alc
0afabfc8b7 Remove an unnecessary and inconsistently used variable from exec_new_vmspace(). 2002-03-26 19:20:04 +00:00
arr
da9c75ac68 - Fixup a few style nits:
- return error -> return (error);
  - move a declaration to the top of the function.
  - become bug for bug compatible with if (error) lines.

Submitted by: bde
2002-03-26 18:07:10 +00:00
mux
124c6d3a26 As discussed in -arch, add the new nmount(2) system call and the
new vfs_getopt()/vfs_copyopt() API.  This is intended to be used
later, when there will be filesystems implementing the VFS_NMOUNT
operation.  The mount(2) system call will disappear when all
filesystems will be converted to the new API.  Documentation will
be committed in a while.

Reviewed by:	phk
2002-03-26 15:33:44 +00:00
bde
4941686e50 Added used include of <sys/sx.h>. Don't depend on namespace pollution in
<sys/file.h>.
2002-03-26 01:09:51 +00:00
bde
05400f476f Added used include of <sys/sx.h>. Don't depend on namespace pollution in
<sys/file.h> or <sys/socketvar.h>.
2002-03-25 21:52:04 +00:00
obrien
1e153b6d04 Commit work-around for panics when mounting FS's that are auto-loaded as
modules (ie. procfs.ko).

When the kernel loads dynamic filesystem module, it looks for any of the
VOP operations specified by the new filesystem that have not been registered
already by the currently known filesystems.  If any of such operations exist,
vfs_add_vnops function calls vfs_opv_recalc function, which rebuilds vop_t
vectors for each filesystem and sets all global pointers like ufs_vnops_p,
devfs_specop_p, etc to the new values and then frees the old pointers.  This
behavior is bad because there might be already active vnodes whose v_op fields
will be left pointing to the random garbage, leading to inevitable crash soon.

Submitted by:	Alexander Kabaev <ak03@gte.com>
2002-03-25 21:30:50 +00:00
arr
db4f882c76 - Recommit the securelevel_gt() calls removed by commits rev. 1.84 of
kern_linker.c and rev. 1.237 of vfs_syscalls.c since these are not the
  source of the recent panics occuring around kldloading file system
  support modules.

Requested by: rwatson
2002-03-25 18:26:34 +00:00
phk
811d04c86c Modernize my email address. 2002-03-25 13:52:45 +00:00
bde
90f30ee936 Fixed some style bugs in the removal of __P(()). The main ones were
not removing tabs before "__P((", and not outdenting continuation lines
to preserve non-KNF lining up of code with parentheses.  Switch to KNF
formatting and/or rewrap the whole prototype in some cases.
2002-03-24 05:09:11 +00:00
jhb
f89014c6f6 Use td_ucred in several trivial syscalls and remove Giant locking as
appropriate.
2002-03-22 22:32:04 +00:00
jhb
59d20d5aab Use explicit Giant locks and unlocks for rather than instrumented ones for
code that is still not safe.  suser() reads p_ucred so it still needs
Giant for the time being.  This should allow kern.giant.proc to be set
to 0 for the time being.
2002-03-22 21:02:02 +00:00
rwatson
afe2b1f929 Merge from TrustedBSD MAC branch:
Move the network code from using cr_cansee() to check whether a
    socket is visible to a requesting credential to using a new
    function, cr_canseesocket(), which accepts a subject credential
    and object socket.  Implement cr_canseesocket() so that it does a
    prison check, a uid check, and add a comment where shortly a MAC
    hook will go.  This will allow MAC policies to seperately
    instrument the visibility of sockets from the visibility of
    processes.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-03-22 19:57:41 +00:00
alfred
054cce2c17 When "cloning" a pipe's buffer bcopy the data after dropping the pipe's
lock as the data may be paged out and cause a fault.
2002-03-22 16:09:22 +00:00
rwatson
a58b691f90 In sysctl, req->td is believed always to be non-NULL, so there's no need
to test req->td for NULL values and then do somewhat more bizarre things
relating to securelevel special-casing and suser checks.  Remove the
testing and conditional security checks based on req->td!=NULL, and insert
a KASSERT that td != NULL.  Callers to sysctl must always specify the
thread (be it kernel or otherwise) requesting the operation, or a
number of current sysctls will fail due to assumptions that the thread
exists.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
Discussed with:	bde
2002-03-22 14:58:27 +00:00
rwatson
ef91e0f942 Since cred never appears to be passed into the securelevel calls as
NULL, turn warning printf's into panic's, since this call has been
restructured such that a NULL cred would result in a page fault anyway.

There appears to be one case where NULL is explicitly passed in in the
sysctl code, and this is believed to be in error, so will be modified.
Securelevels now always require a credential context so that per-jail
securelevels are properly implemented.

Obtained from:	TrustedBSD Project
Sponsored by:	NAI Labs
Discussed with:	bde
2002-03-22 14:49:12 +00:00
arr
fc49faf982 - Back out the commit to make the linker_load_file() securelevel check
made aware in jail environments.  Supposedly something is broken, so
  this should be backed out until further investigation proves otherwise,
  or a proper fix can be provided.
2002-03-22 04:56:09 +00:00
rwatson
d8370f667d Break out the "see_other_uids" policy check from the various
method-based inter-process security checks.  To do this, introduce
a new cr_seeotheruids(u1, u2) function, which encapsulates the
"see_other_uids" logic.  Call out to this policy following the
jail security check for all of {debug,sched,see,signal} inter-process
checks.  This more consistently enforces the check, and makes the
check easy to modify.  Eventually, it may be that this check should
become a MAC policy, loaded via a module.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-03-22 02:28:26 +00:00
arr
68e226a99e - Fix a logic error in checking the securelevel that was introduced in the
previous commit.

Pointy hats to: arr, rwatson
2002-03-21 15:27:39 +00:00
imp
969e82886e Remove last two abuses of cpu_critical_{enter,exit} in the MI code.
Reviewed by: jake, jhb, rwatson
2002-03-21 06:11:09 +00:00
benno
d30ab95478 Add a change mirroring that made to kern/subr_trap.c and others.
This makes kernel builds with DIAGNOSTIC work again.

Apparently forgotten by:	jhb
Might want to be checked by:	jhb
2002-03-21 02:47:51 +00:00
jeff
f350069589 UMA permited us to utilize the 'waitok' flag to soalloc. 2002-03-20 21:23:26 +00:00
jhb
2e425ee2fc Change the way we ensure td_ucred is NULL if DIAGNOSTIC is defined.
Instead of caching the ucred reference, just go ahead and eat the
decerement and increment of the refcount.  Now that Giant is pushed down
into crfree(), we no longer have to get Giant in the common case.  In the
case when we are actually free'ing the ucred, we would normally free it on
the next kernel entry, so the cost there is not new, just in a different
place.  This also removse td_cache_ucred from struct thread.  This is
still only done #ifdef DIAGNOSTIC.

[ missed this file in the previous commit ]

Tested on:	i386, alpha
2002-03-20 21:12:04 +00:00
jhb
64bf9fe9fa - Push down Giant into crfree() in the case that we actually free a ucred.
- Add a cred_free_thread() function (conditional on DIAGNOSTICS) that drops
  a per-thread ucred reference to be used in debugging code when leaving
  the kernel.
2002-03-20 21:00:50 +00:00
arr
fc9167c193 - Change a check of securelevel to securelevel_gt() call in order to help
against users within a jail attempting to load kernel modules.
- Add a check of securelevel_gt() to vfs_mount() in order to chop some
  low hanging fruit for the repair of securelevel checking of linking and
  unlinking files from within jails.  There is more to be done here.

Reviewed by: rwatson
2002-03-20 16:03:42 +00:00
arr
3780b11057 - Remove a semi-colon from after SYSINIT that was introduced in rev. 1.163. 2002-03-20 14:46:38 +00:00
jeff
dcd2af7655 Add calls to uma_zone_set_max() to restore previously enforced limits. 2002-03-20 05:30:58 +00:00
jeff
803cb2a2ba Backout part of my previous commit; I was wrong about vm_zone's handling of
limits on zones w/o objects.
2002-03-20 04:39:32 +00:00
jeff
35c1a72689 Remove references to vm_zone.h and switch over to the new uma API. 2002-03-20 04:11:52 +00:00
jeff
318cbeeecf Remove references to vm_zone.h and switch over to the new uma API.
Also, remove maxsockets.  If you look carefully you'll notice that the old
zone allocator never honored this anyway.
2002-03-20 04:09:59 +00:00
alfred
357e37e023 Remove __P. 2002-03-19 21:25:46 +00:00
alfred
5a84f98839 don't generate files with __P. 2002-03-19 20:48:32 +00:00
arr
ae315cb919 - Change a malloc / bzero pair to make use of the M_ZERO malloc(9) flag. 2002-03-19 15:41:21 +00:00
peter
83444279ce Fix a gcc-3.1+ warning.
warning: deprecated use of label at end of compound statement

ie: you cannot do this anymore:
switch(foo) {
....

default:
}
2002-03-19 11:02:06 +00:00
peter
a0b32e92d6 Pacify gcc-3.1+, initialize two variables to avoid -Wuninitialized
warnings.
2002-03-19 10:57:40 +00:00
peter
4319b6e738 Fix warnings on gcc-3.1+ where __func__ is a const char * instead of a
string.
2002-03-19 10:56:46 +00:00
jeff
2923687da3 This is the first part of the new kernel memory allocator. This replaces
malloc(9) and vm_zone with a slab like allocator.

Reviewed by:	arch@
2002-03-19 09:11:49 +00:00
alfred
21fc25cfdf Close a race when vfs_syscalls.c:checkdirs() runs.
To do this protect the filedesc pointer in the proc with PROC_LOCK
in both checkdirs() and kern_descrip.c:fdfree().
2002-03-19 04:30:04 +00:00
bde
df8144b98e Fixed some printf format errors (hopefully all of the remaining daddr64_t
ones for GENERIC, and all others on the same line as those).  Reformat
the printfs if necessary to avoid new long lones or old format printf
errors.
2002-03-19 04:09:21 +00:00
arr
25a6daa828 - Lock down the ``module'' structure by adding an SX lock that is used by
all the global bits of ``module'' data.  This commit adds a few generic
  macros, MOD_SLOCK, MOD_XLOCK, etc., that are meant to be used as ways
  of accessing the SX lock.  It is also the first step in helping to lock
  down the kernel linker and module systems.

Reviewed by: jhb, jake, smp@
2002-03-18 07:45:30 +00:00
mckusick
14dd08fd15 Add a flags parameter to VFS_VGET to pass through the desired
locking flags when acquiring a vnode. The immediate purpose is
to allow polling lock requests (LK_NOWAIT) needed by soft updates
to avoid deadlock when enlisting other processes to help with
the background cleanup. For the future it will allow the use of
shared locks for read access to vnodes. This change touches a
lot of files as it affects most filesystems within the system.
It has been well tested on FFS, loopback, and CD-ROM filesystems.
only lightly on the others, so if you find a problem there, please
let me (mckusick@mckusick.com) know.
2002-03-17 01:25:47 +00:00
jake
34dcf8975d Convert all pmap_kenter/pmap_kremove pairs in MI code to use pmap_qenter/
pmap_qremove.  pmap_kenter is not safe to use in MI code because it is not
guaranteed to flush the mapping from the tlb on all cpus.  If the process
in question is preempted and migrates cpus between the call to pmap_kenter
and pmap_kremove, the original cpu will be left with stale mappings in its
tlb.  This is currently not a problem for i386 because we do not use PG_G on
SMP, and thus all mappings are flushed from the tlb on context switches, not
just user mappings.  This is not the case on all architectures, and if PG_G
is to be used with SMP on i386 it will be a problem.  This was committed by
peter earlier as part of his fine grained tlb shootdown work for i386, which
was backed out for other reasons.

Reviewed by:	peter
2002-03-17 00:56:41 +00:00
des
cba4e41433 Implement PT_IO (read / write arbitrary amounts of data or text).
Submitted by:	Artur Grabowski <art@{blahonga,openbsd}.org>
Obtained from:	OpenBSD
2002-03-16 02:40:02 +00:00
des
85d610d6a1 PT_[GS]ET{,DB,FP}REGS isn't really optional any more, since we have dummy
backend functions for those archs that don't support them.  I meant to do
this ages ago, but never got around to it.

Inspired by:	OpenBSD
2002-03-15 20:17:12 +00:00
mckusick
e929f2e4f0 Introduce the new 64-bit size disk block, daddr64_t. Change
the bio and buffer structures to have daddr64_t bio_pblkno,
b_blkno, and b_lblkno fields which allows access to disks
larger than a Terabyte in size. This change also requires
that the VOP_BMAP vnode operation accept and return daddr64_t
blocks. This delta should not affect system operation in
any way. It merely sets up the necessary interfaces to allow
the development of disk drivers that work with these larger
disk block addresses. It also allows for the development of
UFS2 which will use 64-bit block addresses.
2002-03-15 18:49:47 +00:00
alfred
b0fd50345a Giant pushdown for read/write/pread/pwrite syscalls.
kern/kern_descrip.c:
Aquire Giant in fdrop_locked when file refcount hits zero, this removes
the requirement for the caller to own Giant for the most part.

kern/kern_ktrace.c:
Aquire Giant in ktrgenio, simplifies locking in upper read/write syscalls.

kern/vfs_bio.c:
Aquire Giant in bwillwrite if needed.

kern/sys_generic.c
Giant pushdown, remove Giant for:
   read, pread, write and pwrite.
readv and writev aren't done yet because of the possible malloc calls
for iov to uio processing.

kern/sys_socket.c
Grab giant in the socket fo_read/write functions.

kern/vfs_vnops.c
Grab giant in the vnode fo_read/write functions.
2002-03-15 08:03:46 +00:00
alfred
2261bd0e24 Bug fixes:
Missed a place where the pipe sleep lock was needed in order to safely grab
Giant, fix it and add an assertion to make sure this doesn't happen again.

Fix typos in the PIPE_GET_GIANT/PIPE_DROP_GIANT that could cause the
wrong mutex to get passed to PIPE_LOCK/PIPE_UNLOCK.

Fix a location where the wrong pipe was being passed to
PIPE_GET_GIANT/PIPE_DROP_GIANT.
2002-03-15 07:18:09 +00:00
alfred
2c16fbdd2a Fixes to make select/poll mpsafe.
Problem:
  selwakeup required calling pfind which would cause lock order
  reversals with the allproc_lock and the per-process filedesc lock.
Solution:
  Instead of recording the pid of the select()'ing process into the
  selinfo structure, actually record a pointer to the thread.  To
  avoid dereferencing a bad address all the selinfo structures that
  are in use by a thread are kept in a list hung off the thread
  (protected by sellock).  When a selwakeup occurs the selinfo is
  removed from that threads list, it is also removed on the way out
  of select or poll where the thread will traverse its list removing
  all the selinfos from its own list.

Problem:
  Previously the PROC_LOCK was used to provide the mutual exclusion
  needed to ensure proper locking, this couldn't work because there
  was a single condvar used for select and poll and condvars can
  only be used with a single mutex.
Solution:
  Introduce a global mutex 'sellock' which is used to provide mutual
  exclusion when recording events to wait on as well as performing
  notification when an event occurs.

Interesting note:
  schedlock is required to manipulate the per-thread TDF_SELECT
  flag, however if given its own field it would not need schedlock,
  also because TDF_SELECT is only manipulated under sellock one
  doesn't actually use schedlock for syncronization, only to protect
  against corruption.

Proc locks are no longer used in select/poll.

Portions contributed by: davidc
2002-03-14 01:32:30 +00:00
green
9a5e1dcf21 Rename SI_SUB_MUTEX to SI_SUB_MTX_POOL to make the name at all accurate.
While doing this, move it earlier in the sysinit boot process so that the
VM system can use it.

After that, the system is now able to use sx locks instead of lockmgr
locks in the VM system.  To accomplish this, some of the more
questionable uses of the locks (such as testing whether they are
owned or not, as well as allowing shared+exclusive recursion) are
removed, and simpler logic throughout is used so locks should also be
easier to understand.

This has been tested on my laptop for months, and has not shown any
problems on SMP systems, either, so appears quite safe.  One more
user of lockmgr down, many more to go :)
2002-03-13 23:48:08 +00:00
archie
4ff8306186 Add realloc() and reallocf(), and make free(NULL, ...) acceptable.
Reviewed by:	alfred
2002-03-13 01:42:33 +00:00
jeff
e6d26e8880 This patch adds the "LOCKSHARED" option to namei which causes it to only acquire shared locks on leafs.
The stat() and open() calls have been changed to make use of this new functionality.  Using shared locks in
these cases is sufficient and can significantly reduce their latency if IO is pending to these vnodes.  Also,
this reduces the number of exclusive locks that are floating around in the system, which helps reduce the
number of deadlocks that occur.

A new kernel option "LOOKUP_SHARED" has been added.  It defaults to off so this patch can be turned on for
testing, and should eventually go away once it is proven to be stable.  I have personally been running this
patch for over a year now, so it is believed to be fully stable.

Reviewed by:	jake, obrien
Approved by:	jake
2002-03-12 04:00:11 +00:00
phk
ad998ff108 Make the disk_clone() routine more robust for abuse.
Sneak in a trivial bit of the GEOM stuff while we're here anyway.
2002-03-11 08:08:02 +00:00
tanimura
22c75bf1c9 Stop abusing the pgrpsess_lock. 2002-03-11 07:53:13 +00:00
tanimura
b9e49bfcc9 Do not lock the pgrpsess_lock exclusively across ttywait().
Spotted by:		David Wolfskill <david@catwhisker.org>
Investigated by:	rwatson
2002-03-11 07:51:08 +00:00
dwmalone
532dc5e009 Don't assign strcmp to a variable called err and then compare it
with zero, just compare strcmp with zero. This fixes the same bug
which Maxim just fixed and fixes some odd style too.

PR:		35712
Reviewed by:	arr
2002-03-10 23:12:43 +00:00
sobomax
c497ba78d2 Fix a breakage introduced in rev.1.75 (supposedly style cleanup), which results
in "missing dependencies" error when loading some kld modules. It is sad to
see how often these days style cleanus break doesn't broken things. Perhaps
people should recall good old principle: "don't fix it if it isn't broken".
2002-03-10 19:20:01 +00:00
phk
e28f5bfbe1 Make the proposed name arg to dev_stdclone() const. 2002-03-10 10:50:05 +00:00
alfred
00d9ca2b85 Remove __P 2002-03-09 22:44:37 +00:00
alfred
32b4a85d20 Don't deref NULL mutex pointer when pipeclose()'ing a pipe that is not
fully instaniated.

Revert the logic in pipeclose so that we don't have the entire function
pretty much under a single if() statement, instead invert the test and
just return if it fails.

Submitted (in different form) by: bde

Don't use pool mutexes for pipes.  We can not use pool mutexes
because we will need to grab the select lock while holding a pipe
lock which is not allowed because you may not aquire additional
mutexes when holding a pool mutex.

Instead malloc(9) space for the mutex that is shared between the
pipes.
2002-03-09 22:06:31 +00:00
phk
0c402314d9 Delete "notyet" code before it becomes "ohh no" code. 2002-03-09 20:11:25 +00:00
luigi
282d2b3dfb Make the DEVICE_POLLING code compile with -Werror and in LINT 2002-03-09 08:02:52 +00:00
jhb
93b2e6d421 - Use a MI critical section in witness_sleep() and witness_list() as they
simply need to prevent switching from another CPU and do not need
  interrupts disabled.
- Add a comment to witness_list() about why displaying spin locks for
  threads on other CPU's really is just a bad idea and probably shouldn't
  be done.
2002-03-08 18:57:57 +00:00
jhb
d07be68dba Read KTR_CPU into a temporary variable so that we use a consistent value
for both the cpumask check and the cpu entry field w/o needing to use
a critical section.
2002-03-08 18:55:59 +00:00
phk
0a13c0f2f6 Move the mount of the root filesystem to happen in the init process before
the exec if /sbin/init.

This allows the scheduler to get started and kthreads a chance to run
before we start filesystem operations.
2002-03-08 10:33:11 +00:00
silby
e3a68020c7 Unconditionally limit maxproc so that it is not possible
to exhaust all kmaps.  The only reward for setting maxproc
to a value which will cause kmap exhaustion is a panic
during a forkbomb attack.

MFC after:	3 days
2002-03-07 04:50:36 +00:00
jake
96a2ab54a0 Add needed includes of machine/smp.h, remove nested include in sys/smp.h
so that inlines in machine/smp.h can use variables declared in sys/smp.h.
2002-03-07 04:43:51 +00:00
des
8096f018ff Rename runq_find() to runq_findproc(), and hide it behind #ifdef DIAGNOSTIC,
as it can have a severe impact on performance under high load, and the bug
it was meant to catch was fixed ages ago.
2002-03-06 15:34:07 +00:00
maxim
62ffd8f69e Fix a typo, unbreak the world.
Thanks to:	mux
Approved by:	ru
2002-03-06 12:28:51 +00:00
bde
11add29df0 Don't (blindly) truncate the unit number to 4 digits when formatting the
string returned by device_get_nameunit().
2002-03-06 11:34:02 +00:00
maxim
1896fa66ee Maximum semid is seminfo.semmni not seminfo.semmsl.
PR:		kern/34979
Submitted by:	James Gritton <jamie@gritton.org>
Reviewed by:	alfred, ru
Approved by:	ru
MFC after:	1 week
2002-03-06 10:52:49 +00:00
rwatson
8d5b7b21f3 Three p_ucred -> td_ucred's missed in jhb's earlier pass; all appear to
be safe.
2002-03-05 19:45:45 +00:00
rwatson
191712b565 The change from td->td_proc->p_ucred to td->td_ucred has shortened some
lines: more agressively line wrap under those circumstances.
2002-03-05 19:31:25 +00:00
jhb
3a1d17e45b - Use td_ucred for jail checks.
- Move jail checks and some other checks involving constants and stack
  variables out from under Giant.  This isn't perfectly safe atm because
  jail_sysvipc_allowed is read w/o a lock meaning that its value could be
  stale.  This global variable will soon become a per-jail flag, however,
  at which time it will either not need a lock or will use the prison lock.
2002-03-05 18:57:36 +00:00
eivind
e97f2e798c * Move bswlist declaration and initialization from kern/vfs_bio.c to
vm/vm_pager.c, which is the only place it is used.
* Make the QUEUE_* definitions and bufqueues local to vfs_bio.c.
* constify buf_wmesg.
2002-03-05 18:20:58 +00:00
eivind
df86de23e6 Change wmesg to const char * instead of char * 2002-03-05 17:45:12 +00:00
rwatson
1557809682 Part II: update various mechanically generated files to allow for new
system call number allocations.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-03-05 16:13:01 +00:00
rwatson
dc9611ec67 Reserve system call numbers for the MAC framework. This will prevent
people working on the MAC tree from getting toasted whenever system call
numbers are allocated in the main tree (for example, for KSE :-).
Calls allocated: __mac_{get,set}_proc, __mac_{get,set}_{fd,file}().

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-03-05 16:11:11 +00:00
eivind
05c20bbea2 Document all functions, global and static variables, and sysctls.
Includes some minor whitespace changes, and re-ordering to be able to document
properly (e.g, grouping of variables and the SYSCTL macro calls for them, where
the documentation has been added.)

Reviewed by:	phk (but all errors are mine)
2002-03-05 15:38:49 +00:00
robert
9df5bc6dbf Fix a warning. 2002-03-05 15:19:33 +00:00
jeff
f25b3f9f5b Add a new variable mp_maxid. This is used so that per cpu datastructures may
be allocated as arrays indexed by the cpu id.  Previously the only reliable
way to know the max cpu id was through MAXCPU. mp_ncpus isn't useful here
because cpu ids may be sparsely mapped, although x86 and alpha do not do this.

Also, call cpu_mp_probe much earlier so the max cpu id is known before the VM
starts up.  This is intended to help support per cpu queues for the new
allocator, but may be useful elsewhere.

Reviewed by:	jake
Approved by:	jake
2002-03-05 10:01:46 +00:00
tanimura
66d03d29ff Track the number of wired pages to avoid unwiring unwired pages.
Reviewed by:	alfred
2002-03-05 00:51:03 +00:00
iwasaki
3f245d8dd1 Add generalized power profile code.
This makes other power-management system (APM for now) to be able to
generate power profile change events (ie. AC-line status changes), and
other kernel components, not only the ACPI components, can be notified
the events.

 - move subroutines in acpi_powerprofile.c (removed) to kern/subr_power.c
 - call power_profile_set_state() also from APM driver when AC-line
   status changes
 - add call-back function for Crusoe LongRun controlling on power
   profile changes for a example
2002-03-04 18:46:13 +00:00
bmilekic
4d41cd259b Fix bug in mb_alloc that made systems configured with
PAGE_SIZE / MCLBYTES == 1 crash. Fix them by changing the
appropriate "allocate new page and bucket" code in mb_alloc to use
the macro for properly grabbing an allocated object from a bucket,
the one that checks whether the bucket is empty.
This should allow ken to continue testing zero-copy stuff on -CURRENT.

Noticed and provided debug info: ken
2002-03-03 22:10:04 +00:00
dd
a9f1d1dab4 Check the version of ex_anon (a `struct xucred') before using it to
fill out netc_anon (a `struct ucred'), and add an XXX around the
entire operation since it isn't clear whether it's doing the right
thing with things like cr_uidinfo and cr_prison.
2002-03-03 06:07:57 +00:00
tanimura
af34f116bc Fix lock leakage and late unlock.
Submitted by:	bde
2002-03-02 12:42:24 +00:00
iedowse
5c807e00f8 In sosend(), enforce the socket buffer limits regardless of whether
the data was supplied as a uio or an mbuf. Previously the limit was
ignored for mbuf data, and NFS could run the kernel out of mbufs
when an ipfw rule blocked retransmissions.
2002-02-28 11:22:40 +00:00
imp
4edacd7b0f Remove now unused struct proc *p.
Approved by: jhb
2002-02-27 20:57:57 +00:00
jhb
56094fdb72 - Change namei() to use td_ucred instead of p_ucred.
- Change the hack in access() that uses a temporary credential to set
  td_ucred to the temp cred instead of p_ucred.
2002-02-27 19:15:29 +00:00
jhb
8205940b14 - Change unp_listen() to accept a thread rather than a proc as its second
argument.
- Use td_ucred in unp_listen() instead of p_ucred.
2002-02-27 19:14:01 +00:00
jhb
9fbc6e93de Fix Giant leakage in several error cases in __semctl(). 2002-02-27 19:12:14 +00:00
jhb
9b6ed8b080 Add a comment about an unlocked access to p_ucred that will go away in
the near future.
2002-02-27 19:10:50 +00:00
alfred
3970e888bc kill __P. 2002-02-27 18:51:53 +00:00
alfred
fc10081bae add assertions in the places where giant is required to catch when
the pipe is locked and shouldn't be.

initialize pipe->pipe_mtxp to NULL when creating pipes in order not
to trip the above assertions.

swap pipe lock with giant around calls to pipe_destroy_write_buffer()

pipe_destroy_write_buffer issue noticed by: jhb
2002-02-27 18:49:58 +00:00
jhb
3706cd3509 Simple p_ucred -> td_ucred changes to start using the per-thread ucred
reference.
2002-02-27 18:32:23 +00:00
jhb
ec01b5bdbc Temporarily lock Giant while we update td_ucred. The proc lock doesn't
fully protect p_ucred yet so Giant is needed until all the p_ucred
locking is done.  This is the original reason td_ucred was not used
immediately after its addition.  Unfortunately, not using td_ucred is
not enough to avoid problems.  Since p_ucred could be stale, we could
actually be dereferencing a stale pointer to dink with the refcount, so
we really need Giant to avoid foot-shooting.  This allows td_ucred to
be safely used as well.
2002-02-27 18:30:01 +00:00
alfred
3a862cdbbd Fix a NULL deref panic in pipe_write, we can't blindly lock
pipe->pipe_peer->pipe_mtxp because it may be NULL, so lock the
passed in pipe's mutex instead.
2002-02-27 17:23:16 +00:00
robert
a441b857b6 Make getcredhostname() take a buffer and the buffer's size
as arguments.  The correct hostname is copied into the buffer
while having the prison's lock acquired in a jailed process'
case.

Reviewed by:	jhb, rwatson
2002-02-27 16:43:20 +00:00