Commit Graph

4784 Commits

Author SHA1 Message Date
brian
895107253f Test if rootvnode is NULL rather than if rootdev is NODEV when determining
if there's a filesystem present.

rootdev can be NODEV in the NFS-mounted root scenario.

Discussed with: Harti Brandt <brandt@fokus.gmd.de>, iedowse
2002-04-26 09:52:54 +00:00
silby
dd3cd5fed6 Make sure that sockets undergoing accept filtering are aborted in a
LRU fashion when the listen queue fills up.  Previously, there was
no mechanism to kick out old sockets, leading to an easy DoS of
daemons using accept filtering.

Reviewed by:	alfred
MFC after:	3 days
2002-04-26 02:07:46 +00:00
des
b3648bf706 Add the mutex profiling lock to the witness list. This hopefully unbreaks
the MUTEX_PROFILING + WITNESS + !WITNESS_SKIPSPIN case.

Submitted by:	Hiten Pandya <hiten@uk.FreeBSD.org>
2002-04-25 22:48:40 +00:00
bde
e1e6cfc088 Fixed some longstanding bugs in _getenv_static():
- malformed environment strings (ones without an '=') were not rejected.
  There shouldn't be any of these, but when the static environment is
  empty it always begins with one of these; this one should be considered
  as the terminator after the end of the environment, but it isn't.
- the comparison of the name being looked up with the name in the
  environment was fuzzy -- only the characters up to the length of the
  latter were compared, so _getenv_static("foobar") matched "foo=..."
  in the environment and everything matched "" in the empty environment.

MFC after:	3 days
2002-04-25 20:25:15 +00:00
bde
c7cc23aacf Break the following implementation of panic(3):
#!bin/sh

	# Original version of this by Michael Reifenberger
	# <root@nihil.plaut.de>.

	mdconfig -d -u 11 >/dev/null 2>&1
	dd if=/dev/zero of=zz bs=1m count=1

	while :
	do
		mdconfig -a -t vnode -f zz -u 11
		fdisk -f - -iv /dev/md11 <<EOF1
		g c1 h64 s32
		p 1 165 0 2048
		a 1
	EOF1
		mdconfig -d -u 11
	done

Garbage pointers in __si_u were not cleared by destroy_dev().  Not
clearing si_disk made the above fatal because the disk layer uses
si_disk as a flag to indicate that the dev_t has been completely
initialized.  disk_destroy() clears si_disk for the parent dev_t
but doesn't get called for children.

Not fixed:
- setting the undocumented sysctl debug.free_devt should cause more
  complete destruction of the dev_t including clearing of __si_u, but
  actually causes the above to panic a little earlier.
- the loop leaks 10 memory allocations per iteration (4 DEVFS, 2 devbuf
  and 4 dev_t).

Reviewed by:	timeout by MAINTAINER after 3 months
2002-04-25 13:17:33 +00:00
marcel
56d625090e Don't use the symbol name to lookup the symbol value when we can use
the symbol index defined by the relocation. The elf_lookup() support
function is to be used by elf_reloc() when symbol lookups need to be
done. The elf_lookup() function operates on the symbol index and
will do a symbol name based lookup when such is required, otherwise
it uses the symbol index directly. This solves the problem seen on
ia64 where the symbol hash table does not contain local symbols and
a symbol name based lookup would fail for those symbols.

Don't pass the symbol name to elf_reloc(), as it isn't used any more.
2002-04-25 01:22:16 +00:00
tanimura
1616fbed42 Free(9) should be Giant-free.
Suggested by:	jhb
2002-04-24 09:59:18 +00:00
silby
b4055530fc Remove sodropablereq - this function hasn't been used since the
syncache went in.

MFC after:	3 days
2002-04-24 04:11:08 +00:00
hsu
7bef5a6e99 The cold and panicstr variables do not need to be protected by sched_lock.
Submitted by:	Jennifer Yang (yangjihui@yahoo.com)
Reviewed by:	jake & jhb in principle
2002-04-23 19:50:22 +00:00
phk
834fdde07a Add a basic sanity check on pointers passed to free(9).
Should be improved by:	jeff
2002-04-23 18:50:25 +00:00
phk
bf5ba9f42b Don't call malloc(9) to allocate zero bytes softc data for devices. 2002-04-23 15:48:23 +00:00
rwatson
780f32f693 Slightly restructure extattr_get_vp() so that there's only one entry point
to VOP_GETEXTATTR().  This simplifies code flow when inserting MAC hooks.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-04-23 01:27:38 +00:00
alfred
d4c507ea29 Don't FILEDESC_LOCK around calls to falloc(). 2002-04-22 20:09:11 +00:00
des
4d6b787d2d Usage style sweep: spell "usage" with a small 'u'.
Also change one case of blatant __progname abuse (several more remain)
This commit does not touch anything in src/{contrib,crypto,gnu}/.
2002-04-22 13:44:47 +00:00
phk
68aee74f02 Comment out Kirks io-request priority hack until we can do this in a
civilized way which doesn't cause grief.

The problem is that it is not generally safe to cast a "struct bio
*" to a "struct buf *".  Things like ccd, vinum, ata-raid and GEOM
constructs bio's which are not entrails of a struct buf.

Also, curthread may or may not have anything to do with the I/O request
at hand.

The correct solution can either be to tag struct bio's with a
priority derived from the requesting threads nice and have disksort
act on this field, this wouldn't address the "silly-seek syndrome"
where two equal processes bang the diskheads from one edge to the
other of the disk repeatedly.

Alternatively, and probably better: a sleep should be introduced
either at the time the I/O is requested or at the time it is completed
where we can be sure to sleep in the right thread.

The sleep also needs to be in constant timeunits, 1/hz can be practicaly
any sub-second size, at high HZ the current code practically doesn't
do anything.
2002-04-22 06:53:20 +00:00
marcel
84ecc1bfc1 Add function link_elf_get_gp(), specific to ia64 for now, to get
the DT_PLTGOT value. On ia64 this is the value of GP. We need this
to construct function descriptors, but the elf file structure is
not exported to MD code.

Note that the name of the function is based on the meaning that
DT_PLTGOT has on ia64. This may differ on other architectures. As
such, link_elf_get_gp() has a high level of MD to it. Renaming the
function to describe what DT_* value is returned makes it generic,
but also makes the MD code less clear and if we only need this on
ia64, then a general name for a specific function doesn't help.

In short: I don't know what is "right" at this time, so I'll go
with what I have.
2002-04-21 21:08:30 +00:00
markm
b0c0526342 Use protected names (_foo) to cutdown on boatloads of lint warnings. 2002-04-21 11:16:10 +00:00
marcel
5de2c9fb38 GCC 3.x WARNS: Add a break to the default case. 2002-04-20 21:56:42 +00:00
tanimura
e2acd5cecf Push down Giant for setpgid(), setsid() and aio_daemon(). Giant protects only
malloc(9) and free(9).
2002-04-20 12:02:52 +00:00
rwatson
30744d9c56 Improve style consistency of vfs_syscalls.c by converting the style used
in various extattr_*() calls to match the rest of the file.  Originally,
these bits at the end looked more like style(9).  This patch was submitted
by green by way of the TrustedBSD MAC tree, and I fixed a few problems
with it on the way through.  Someone with more time on their hands should
convert the entire file to style(9); this commit is for diff reduction
purposes.

Submitted by:	green
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-04-20 01:37:08 +00:00
rwatson
4d39491e7e In sendfile(), use the vn_rdwr() helper function, rather than manually
constructing a struct aio and invoking VOP_READ() directly.  This cleans
up the code a little, but also has the advantage of making sure almost
all vnode read/write access in the kernel goes through the helper
function, meaning that instrumentation of that helper function can impact
almost all relevant read/write operations.  In this case, it permits us
to put MAC hooks into vn_rdwr() and not modify uipc_syscalls.c (yet).

In general, if helper vn_*() functions exist, they should be used in
preference to direct VOP's in system call service code.

Submitted by:	green
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-04-19 13:46:24 +00:00
rwatson
63ab78794e Divorce proc0 and proc1 credentials earlier; while this isn't technically
needed in the current code, in the MAC tree, create_init() relies on the
ability to modify the credentials present for initproc, and should not
perform that modification on a shared credential.  Pro-active diff
reduction against MAC changes that are in the queue; also facilitates
other work, including the capabilities implementation.

Submitted by:	green
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-04-19 13:35:53 +00:00
phk
f4a2041f29 suser is Giant safe, so optimize a pointless case. 2002-04-19 09:20:13 +00:00
suz
553226e8e1 just merged cosmetic changes from KAME to ease sync between KAME and FreeBSD.
(based on freebsd4-snap-20020128)

Reviewed by:	ume
MFC after:	1 week
2002-04-19 04:46:24 +00:00
nectar
fcc5ad0935 When exec'ing a set[ug]id program, make sure that the stdio file descriptors
(0, 1, 2) are allocated by opening /dev/null for any which are not already
open.

Reviewed by:	alfred, phk
MFC after:	2 days
2002-04-19 00:45:29 +00:00
mux
6961e47900 Avoid calling malloc() or free() while holding the
kenv lock.

Reviewed by:	jake
2002-04-17 17:51:10 +00:00
mux
a207e41bef Rework the kernel environment subsystem. We now convert the static
environment needed at boot time to a dynamic subsystem when VM is
up.  The dynamic kernel environment is protected by an sx lock.

This adds some new functions to manipulate the kernel environment :
freeenv(), setenv(), unsetenv() and testenv().  freeenv() has to be
called after every getenv() when you have finished using the string.
testenv() only tests if an environment variable is present, and
doesn't require a freeenv() call. setenv() and unsetenv() are self
explanatory.

The kenv(2) syscall exports these new functionalities to userland,
mainly for kenv(1).

Reviewed by:	peter
2002-04-17 13:06:36 +00:00
mux
c79270302c Add an entry for the kenv(2) syscall (code to follow).
Reviewed by: peter
2002-04-17 13:05:13 +00:00
iedowse
64322dabea The recent NFS forced unmount improvements introduced a side-effect
where some client operations might be unexpectedly cancelled during
an unsuccessful non-forced unmount attempt. This causes problems
for amd(8), because it periodically attempts a non-forced unmount
to check if the filesystem is still in use.

Fix this by adding a new mountpoint flag MNTK_UNMOUNTF that is set
only during the operation of a forced unmount. Use this instead of
MNTK_UNMOUNT to trigger the cancellation of hung NFS operations.

Also correct a problem where dounmount() might inadvertently clear
the MNTK_UNMOUNT flag.

Reported by:	simokawa
MFC after:	1 week
2002-04-17 01:07:29 +00:00
jhb
dba04cd736 Lock proctree_lock instead of pgrpsess_lock. 2002-04-16 17:11:34 +00:00
jhb
6cbba0bb03 - Lock proctree_lock instead of pgrpsess_lock.
- Use temporary variables to hold a pointer to a pgrp while we dink with it
  while not holding either the associated proc lock or proctree_lock.  It
  is in theory possible that p->p_pgrp could change out from under us.
2002-04-16 17:09:22 +00:00
jhb
d9a4c30c37 - Lock proctree_lock instead of pgrpsess_lock.
- Simplify return logic of setsid() and setpgid().
2002-04-16 17:06:11 +00:00
jhb
2ebbf84d61 - Lock proctree_lock instead of pgrpsess_lock.
- Exclusively lock proctree_lock while calling leavepgrp().
2002-04-16 17:04:21 +00:00
jhb
7202da4491 - Merge the pgrpsess_lock and proctree_lock sx locks into one proctree_lock
sx lock.  Trying to get the lock order between these locks was getting
  too complicated as the locking in wait1() was being fixed.
- leavepgrp() now requires an exclusive lock of proctree_lock to be held
  when it is called.
- fixjobc() no longer gets a shared lock of proctree_lock now that it
  requires an xlock be held by the caller.
- Locking notes in sys/proc.h are adjusted to note that everything that
  used to be protected by the pgrpsess_lock is now protected by the
  proctree_lock.
2002-04-16 17:03:05 +00:00
phk
2edc95ffee Remove two debug printfs which should never have been committed. 2002-04-15 21:08:51 +00:00
jhb
f656d44b0b You have to cast int64_t's to long long if you printf them with %lld.
This now compiles on alpha without a warning.

Pointy-hat to:	phk
2002-04-15 21:04:32 +00:00
phk
b6bf4c07cf Improve the implementation of adjtime(2).
Apply the change as a continuous slew rather than as a series of
discrete steps and make it possible to adjust arbitraryly huge
amounts of time in either direction.

In practice this is done by hooking into the same once-per-second
loop as the NTP PLL and setting a suitable frequency offset deducting
the amount slewed from the remainder.  If the remaining delta is
larger than 1 second we slew at 5000PPM (5msec/sec), for a delta
less than a second we slew at 500PPM (500usec/sec) and for the last
one second period we will slew at whatever rate (less than 500PPM)
it takes to eliminate the delta entirely.

The old implementation stepped the clock a number of microseconds
every HZ to acheive the same effect, using the same rates of change.

Eliminate the global variables tickadj, tickdelta and timedelta and
their various use and initializations.

This removes the most significant obstacle to running timecounter and
NTP housekeeping from a timeout rather than hardclock.
2002-04-15 12:23:11 +00:00
phk
ed0cd9a251 Take the "tickadj" element out of struct clockinfo. Our adjtime(2)
implementation is being changed and the very concept of tickadj will
no longer be meaningful.
2002-04-15 12:11:06 +00:00
phk
af54e26ee0 In the ntp_adjtime(2) syscall, return our actual estimate of unapplied
offset correction instead of the most recent offset applied.
2002-04-15 08:58:24 +00:00
jeff
6cb876e7dd Finish adding support code for sysctl kern.mprof. This dumps some malloc
information related to bucket size effeciency.  Three things are printed on
each row:

Size is the size the user actually asked for rounded to 16 bytes.
Requests is the number of times this size was asked for.
Real Size is the size we actually handed out.

At the end the total memory used and total waste is displayed.  Currently my
system displays about 33% wasted memory.

The intent of this code is to gather statistics for tuning the malloc bucket
sizes.  It is not intended to be run with INVARIANTS and it is not entirely
mp safe.  It can be enabled via 'options MALLOC_PROFILE' which was commited
earlier.
2002-04-15 05:24:01 +00:00
jeff
da6660250e Remove malloc_type's ks_limit.
Updated the kmemzones logic such that the ks_size bitmap can be used as an
index into it to report the size of the zone used.

Create the kern.malloc sysctl which replaces the kvm mechanism to report
similar data.  This will provide an easy place for statistics aggregation if
malloc_type statistics become per cpu data.

Add some code ifdef'd under MALLOC_PROFILING to facilitate a tool for sizing
the malloc buckets.
2002-04-15 04:05:53 +00:00
alfred
0925885691 Don't allow one to trace an ancestor when already traced.
PR: kern/29741
Submitted by: Dave Zarzycki <zarzycki@FreeBSD.org>
Fix from: Tim J. Robbins <tim@robbins.dropbear.id.au>
MFC After: 2 weeks
2002-04-14 17:12:55 +00:00
jeff
9089f1baf8 Use VOP_GETVOBJECT instead of accessing the member directly. This fixed
an issue with nullfs and NAMEI shared.

Submitted by:	Alexander Kabaev
2002-04-14 10:18:48 +00:00
alc
3ad9fd7f0b Regen 2002-04-14 05:33:58 +00:00
alc
a34b48c478 Remove the requirement that Giant be held around sigreturn(). 2002-04-14 05:31:47 +00:00
alc
7e3107d0af o Use aiocblist::fd_file in the AIO threads rather than recomputing
the file * from the calling process's descriptor table.
 o Eliminate sharing of the calling process's descriptor table
   with the AIO threads.
2002-04-14 03:04:19 +00:00
jhb
e93a8a367d - Change killpg1()'s first argument to be a thread instead of a process so
we can use td_ucred.
- In killpg1(), the proc lock is sufficient to check if p_stat is SZOMB
  or not.  We don't need sched_lock.
- Close some races in psignal().  In psignal() there is a big switch
  statement based on p_stat.  All the different cases are assuming that
  the process (or thread) isn't going to change state out from under it.
  To ensure this is true, just lock sched_lock for the entire switch.  We
  practically held it the entire time already anyways.  This also
  simplifies the locking somewhat and actually results in fewer lock
  operations.
- Allow signotify() to be called with the sched_lock held since psignal()
  now does that.
- Use td_ucred in a couple of places.
2002-04-13 23:33:36 +00:00
jhb
300593a2cc - Change donice() to take a thread as the first argument instead of a
process so it can use td_ucred.
- Require the target process of donice() to be locked when donice() is
  called.
- Use td_ucred.
- Lock the target process of p_cansee() and while reading the credentials
  of a process.
- Change the logic of rtprio() slightly so it does it's copyin() if needed
  prior to locking the target process.
- rtprio() no longer needs Giant.  In theory with full KSE it would still
  need Giant to protect p_ucred of curproc for the p_canfoo() functions
  but p_canfoo() will be changing to using td_ucred of curthread before
  full KSE hits the tree.
2002-04-13 23:28:23 +00:00
jhb
95ee443e6c - Change the algorithms of the syscalls to modify process credentials to
allocate a blank cred first, lock the process, perform checks on the
  old process credential, copy the old process credential into the new
  blank credential, modify the new credential, update the process
  credential pointer, unlock the process, and cleanup rather than trying
  to allocate a new credential after performing the checks on the old
  credential.
- Cleanup _setugid() a little bit.
- setlogin() doesn't need Giant thanks to pgrp/session locking and
  td_ucred.
2002-04-13 23:07:05 +00:00
jhb
418e247b74 - Change the first argument of ktrcanset(), ktrsetchildren(), and ktrops()
to a thread pointer so that ktrcanset() can use td_ucred.
- Add some proc locking to partially protect p_tracep and p_traceflag.
2002-04-13 22:54:18 +00:00