Commit Graph

4945 Commits

Author SHA1 Message Date
Jeff Roberson
f0d73b3e5f Switch from just holding the interlock to holding the standard lock throughout
getnewvnode().  This is safer.  In the future, we should investigate requiring
only the interlock to get the vnode object.
2002-05-07 02:44:06 +00:00
Alfred Perlstein
e649887b1e Make funsetown() take a 'struct sigio **' so that the locking can
be done internally.

Ensure that no one can fsetown() to a dying process/pgrp.  We need
to check the process for P_WEXIT to see if it's exiting.  Process
groups are already safe because there is no such thing as a pgrp
zombie, therefore the proctree lock completely protects the pgrp
from having sigio structures associated with it after it runs
funsetownlst.

Add sigio lock to witness list under proctree and allproc, but over
proc and pgrp.

Seigo Tanimura helped with this.
2002-05-06 19:31:28 +00:00
John Baldwin
e746d950ab When checking to see if the init process calls exit1(), compare p to the
initproc proc pointer instead of checking to see if the pid is 1.

Submitted by:	bde
2002-05-06 17:07:10 +00:00
John Baldwin
276c516984 Style fixes in local variable declarations.
Submitted by:	bde
2002-05-06 17:04:29 +00:00
John Baldwin
7a6b989bfa - Style fixes in some comments.
- Whitespace nit.
- Sort some includes.

Submitted by:	bde (mostly)
2002-05-06 15:46:29 +00:00
Jeff Roberson
6953f5da1a Hold the currently selected vnode's lock across the call to VOP_GETVOBJECT.
Don't try to create a vm object before the file system has a chance to finish
initializing it.  This is incorrect for a number of reasons.  Firstly, that
VOP requires a lock which the file system may not have initialized yet. Also,
open and others will create a vm object if it is necessary later.
2002-05-06 04:47:43 +00:00
Maxime Henrion
9d997d8be8 Add the lchflags(2) syscall.
Reviewed by:	rwatson
2002-05-05 23:47:41 +00:00
Maxime Henrion
8d9b781fb5 Add an entry for the lchflags(2) syscall. It's useful to prevent
a symlink deletion.

Reviewed by:	rwatson
2002-05-05 23:37:44 +00:00
Jeff Roberson
576365ba36 Move a KASSERT() in open() prior to unlocking the vnode. It's not safe to
call VOP_GETVOBJECT without a lock.
2002-05-05 23:17:13 +00:00
Alan Cox
c50fe92b8d o Condition the compilation of uiomoveco() and vm_uiomove()
on ENABLE_VFS_IOOPT.
 o Add a comment to the effect that this code is experimental
   support for zero-copy I/O.
2002-05-05 22:42:40 +00:00
Poul-Henning Kamp
81e017430a Expand the one-line function pbreassignbuf() the only place it is or could
be used.
2002-05-05 20:37:08 +00:00
Bruce Evans
f5216b9a19 Return the correct error code (ENOSYS, not EINVAL) from nosys(). Getting
killed by SIGSYS for unimlemented syscalls is bad enough.

Obtained from:	Lite2 branch

The Lite2 branch has some other interesting unmerged (?) bits in this
file.  They are well hidden among cosmetic regressions.
2002-05-05 04:50:47 +00:00
Bruce Evans
a9a0f15a69 Fixed breakage of binary compatibility of the kern.clockrate sysctl in
sys/time.h rev.1.53, etc.  Zero out the entire struct clkinfo and not
just the new spare part of it so that there is no possibility of leaking
kernel stack context to userland.
2002-05-05 04:33:09 +00:00
Maxime Henrion
afd458b0fa Fix a typo.
Submitted by:	dwmalone
2002-05-04 19:50:09 +00:00
Poul-Henning Kamp
e31c615c60 Remove a six year old undocumented #ifdef : NO_B_MALLOC. 2002-05-04 19:24:55 +00:00
Matthew Dillon
9f9435545b Remove obsolete code (that was already #if 0'd out).
Requested by: Hiten Pandya <hitmaster2k@yahoo.com>
2002-05-04 17:10:15 +00:00
Alfred Perlstein
698f85d3e3 style(9): 'if' and 'while' need a space after them. 2002-05-04 07:40:49 +00:00
Poul-Henning Kamp
48e5da550a Initialize time_second to 1 instead of zero to pacify slightly bogus arp code.
Various minor style fixes from BDE.
2002-05-03 08:46:03 +00:00
Seigo Tanimura
6041fa0a60 As malloc(9) and free(9) are now Giant-free, remove the Giant lock
across malloc(9) and free(9) of a pgrp or a session.
2002-05-03 07:46:59 +00:00
Seigo Tanimura
c8d8a686e4 Fix the lock order reversal between the sigio lock and a process/pgrp lock in
funsetownlst() by locking the sigio lock across funsetownlst().
2002-05-03 05:32:25 +00:00
Peter Wemm
85f79d52e9 Retire makeobjops.pl - replaced by ../tools/makeobjops.awk. 2002-05-02 22:21:59 +00:00
Poul-Henning Kamp
0b5d880d39 As promised make the hack for sizeof(struct disklabel) on alpha annoying.
Run make world (or recompile whatever program whines) to get rid of warning.

Compat bits will be removed entirely in about two weeks.
2002-05-02 21:53:39 +00:00
Maxime Henrion
6dbde1fe23 Convert devfs to nmount.
Reviewed by:	phk
2002-05-02 20:27:42 +00:00
John Baldwin
3fc755c118 - Protect randompid and nprocs with the allproc_lock.
- Reorder fork1() to do malloc() and other blocking operations prior to
  acquiring the needed process locks.
- The new process inherit's the credentials of curthread, not the
  credentials of the old process.
- Document a really weird race that will come up with KSE allows multiple
  kernel threads per process.
2002-05-02 15:13:45 +00:00
John Baldwin
d7aadbf9ce - Reorder a few things so that when we lock the process at the end of
exit1() we don't have to release it until we acquire schd_lock to
  call cpu_throw().
- Since we can switch at any time due to preemption or a lock release
  prior to acquiring sched_lock, don't update switchtime and switchticks
  until the very end of exit1() after we have acquired sched_lock.
- Interlock the proctree_lock and proc lock in wait1() and exit1() to
  avoid lost wakeups when a parent blocks waiting for a child to exit at
  the bottom of wait1().  In exit1() the proc lock interlocked with
  proctree_lock (and released after acquiring sched_lock) is that of
  the parent process.
- In wait1() use an exclusive lock of proctree lock while we are
  looking for a process to harvest.  This allows us to completely
  remove all references to the process once we've found one (i.e.,
  disconnect it from pgrp's, session's, zombproc list, and it's parent's
  children list) "atomically" without needing to worry about a lock
  upgrade.
- We don't need sched_lock to test if p_stat is SZOMB or SSTOP when holding
  the proc lock since the proc lock is always held with p_stat is set to
  SZOMB or SSTOP.
- Protect nprocs with an xlock of the allproc_lock.
2002-05-02 15:09:58 +00:00
John Baldwin
9b3b1c5fdf - Reorder execve() so that it performs blocking operations before it
locks the process.
- Defer other blocking operations such as vrele()'s until after we
  release locks.
- execsigs() now requires the proc lock to be held when it is called
  rather than locking the process internally.
2002-05-02 15:00:14 +00:00
Jeff Roberson
8f70816cf2 Hide a pointer to the malloc_type bucket at the end of the freed memory. If
this memory is modified after it has been freed we can now report it's
previous owner.
2002-05-02 09:07:04 +00:00
Jeff Roberson
5a34a9f089 malloc/free(9) no longer require Giant. Use the malloc_mtx to protect the
mallochash.  Mallochash is going to go away as soon as I introduce the
kfree/kmalloc api and partially overhaul the malloc wrapper.  This can't happen
until all users of the malloc api that expect memory to be aligned on the size
of the allocation are fixed.
2002-05-02 07:22:19 +00:00
Jeff Roberson
639c9550fb Remove the temporary alignment check in free().
Implement the following checks on freed memory in the bucket path:
	- Slab membership
	- Alignment
	- Duplicate free

This previously was only done if we skipped the buckets.  This code will slow
down INVARIANTS a bit, but it is smp safe.  The checks were moved out of the
normal path and into hooks supplied in uma_dbg.
2002-05-02 02:08:48 +00:00
Alfred Perlstein
f132072368 Redo the sigio locking.
Turn the sigio sx into a mutex.

Sigio lock is really only needed to protect interrupts from dereferencing
the sigio pointer in an object when the sigio itself is being destroyed.

In order to do this in the most unintrusive manner change pgsigio's
sigio * argument into a **, that way we can lock internally to the
function.
2002-05-01 20:44:46 +00:00
Peter Wemm
6692ac6644 Cosmetic tweaks. Try and keep the style more consistent, catch some stray
whitespace and update a comment.
2002-05-01 02:51:50 +00:00
Peter Wemm
aed0556447 kern_tc.c doesn't use <machine/psl.h>, and having this #include breaks
other platforms.
2002-05-01 01:31:26 +00:00
David E. O'Brien
2244cf6bba Remove this Perl script. There have been zero bug reports against
vnode_if.awk.
2002-05-01 00:40:44 +00:00
Jeff Roberson
289f207c81 Convert longs to u_longs in stats. This will hold off wrap arounds for a
while longer.
2002-04-30 22:39:32 +00:00
Alan Cox
ea0f50bcf0 o Convert the vm_page buckets mutex to a spin lock. (This resolves
an issue on the Alpha platform found by jeff@.)
 o Simplify vm_page_lookup().

Reviewed by:	jhb
2002-04-30 21:24:47 +00:00
Poul-Henning Kamp
39acc78a1e Brucifixion ? Yes, out that door, row on the left, one patch each.
Many thanks to:	bde
2002-04-30 20:42:06 +00:00
Matthew Dillon
e6728403d4 These are Alexander Kabaev's VFSops fixes (see the thread 'Found: module
loading breakage').  The patch fixes serious issues with the VFS
operations vector array which results in a crash when a filesystem module
adding a new VOP is loaded into the kernel.  Basically what was happening
before was that the old operations vector was being freed and a new one
allocated.  The original MALLOC code tended to reuse the same address
for the case and so the bug did not rear its ugly head until the new memory
subsystem was emplaced.

This patch replaces the temporary workaround Dave O'Brien comitted in 1.58.

The patch is clean enough that I intend to MFC it to stable at some point.

Submitted by:	Alexander Kabaev <ak03@gte.com>
MFC after:	1 week
2002-04-30 18:44:32 +00:00
Jeff Roberson
8efc4eff00 Add a new UMA debugging facility. This will overwrite freed memory with
0xdeadc0de and then check for it just before memory is handed off as part
of a new request.  This will catch any post free/pre alloc modification of
memory, as well as introduce errors for anything that tries to dereference
it as a pointer.

This code takes the form of special init, fini, ctor and dtor routines that
are specificly used by malloc.  It is in a seperate file because additional
debugging aids will want to live here as well.
2002-04-30 07:54:25 +00:00
Jeff Roberson
2cc35ff9c6 Move the implementation of M_ZERO into UMA so that it can be passed to
uma_zalloc and friends.  Remove this functionality from the malloc wrapper.

Document this change in uma.h and adjust variable names in uma_core.
2002-04-30 04:26:34 +00:00
Seigo Tanimura
960ed29c4b Revert the change of #includes in sys/filedesc.h and sys/socketvar.h.
Requested by:	bde

Since locking sigio_lock is usually followed by calling pgsigio(),
move the declaration of sigio_lock and the definitions of SIGIO_*() to
sys/signalvar.h.

While I am here, sort include files alphabetically, where possible.
2002-04-30 01:54:54 +00:00
Robert Watson
43a7c4e919 Re-add the 16384 bucket also.
Submitted by:	green
2002-04-29 17:53:23 +00:00
Robert Watson
bd796eb25f Revert a portion of kern_malloc.c:1.99, which (in addition to adding
malloc profiling) also modified the set of pre-defined buckets for the
memory allocator.  For reasons unknown to me, this resulted in extensive
memory corruption in the kernel, in particular on SMP boxes, so I'm
committing this work-around until Jeff gets a chance to debug it
properly.  David Wolfskill pointed me at this commit as the one that
might be a problem; I've been running this code on two dual-processor
burn-in boxes for about 12 hours now, and the rate of panics due to
memory corruption has dropped to zero (from one every five minutes).

Hopefully not treading on the toes of:	jeff
2002-04-29 17:12:02 +00:00
David Malone
dbe620d321 Add a sysctl which disables the logging of console output.
Approved by:	phk
MFC after:	2 weeks
2002-04-29 09:15:38 +00:00
Jeroen Ruigrok van der Werven
1cf1a725ff Fix indention which I did wrong in a previous commit.
Submitted by:	bde
2002-04-29 08:18:06 +00:00
Poul-Henning Kamp
6b00cf46ec Stylistic sweep through the timecounter code.
Renovate comments.
2002-04-28 18:24:21 +00:00
Poul-Henning Kamp
d25917e856 Don't screw up our uptime with historical dates. 2002-04-28 16:51:36 +00:00
Ian Dowse
ba1551ca81 Avoid the user-visible effect of setting SA_NOCLDWAIT when the
SIGCHLD handler is SIG_IGN. This is a reimplementation of the
problematic revision 1.131 of kern_exit.c. To avoid accessing process
UPAGES, we set a new procsig flag when the SIGCHLD handler is SIG_IGN
and use that instead.
2002-04-27 22:41:41 +00:00
Peter Wemm
4f033348f4 Finish fixing hints. Remember the use_kenv state for the next run.
Otherwise we fall back to using the static hints the next time around.
We still have the leftover fallback code there which meant that we skipped
the use_hints checking on the second and subsequent calls.  Also, be a bit
more careful about walking off the end of the envp array.

I've extracted this from a larger diff.  I hope I didn't miss anything...
2002-04-27 22:32:57 +00:00
Peter Wemm
fc1218bb71 Partial fix for hints
Obtained from:  mux
2002-04-27 22:25:13 +00:00
Ian Dowse
3eee035c5b Remove a stale comment saying that the vnode lock must be the first
element in the structure pointed to by vp->v_data; the vnode lock
is now within the vnode structure itself.
2002-04-27 22:20:33 +00:00
Seigo Tanimura
acbbcc5f1d Fix the code fragment clobbered in my last commit. 2002-04-27 09:33:49 +00:00
Seigo Tanimura
d48d4b2501 Add a global sx sigio_lock to protect the pointer to the sigio object
of a socket.  This avoids lock order reversal caused by locking a
process in pgsigio().

sowakeup() and the callers of it (sowwakeup, soisconnected, etc.) now
require sigio_lock to be locked.  Provide sowwakeup_locked(),
soisconnected_locked(), and so on in case where we have to modify a
socket and wake up a process atomically.
2002-04-27 08:24:29 +00:00
Poul-Henning Kamp
f5d157fb51 Explain magic number.
Add magic date no explanation.

Add a delta which was lost in transit yesterday which prevented
other timecounters from actually being used.
2002-04-27 07:28:54 +00:00
Poul-Henning Kamp
f175569ac2 Make the dummy timecounter actually tick or we will never get anyhere. 2002-04-27 07:06:52 +00:00
John Baldwin
e64b74e35b Whitespace bogon. 2002-04-27 04:48:36 +00:00
Marcel Moolenaar
9ae9d0ff86 Insert a semi-colon between label 'skip:' and the closing brace
of the FOREACH loop to silence GCC 3.
2002-04-27 02:58:18 +00:00
Mike Barcroft
a30d4b3270 Move the new byte order function prototypes from <sys/param.h> to
<sys/endian.h>.  This puts us in line with NetBSD and OpenBSD.
2002-04-26 22:48:23 +00:00
Poul-Henning Kamp
62efba6a0c Now that the private parts of timecounters are no longer being fingered
by other bits of code, split struct timecounter into two.

struct timecounter contains just the bits which pertains to the hardware
counter and the reading of it.

struct timehands (as in "the hands on a clock") contains all the ugly bit
fidling stuff.  Statically compile ten timehands.

This commit is the functional part.  A later cosmetic patch will rename
various variables and fieldnames.
2002-04-26 21:51:08 +00:00
Poul-Henning Kamp
b4a1d0deb1 Hide the private parts of timecounter from a couple of places that don't
really need to know the gory details.
2002-04-26 21:31:44 +00:00
Poul-Henning Kamp
7bf758bff0 Simplify the RFC2783 and PPS_SYNC timestamp collection API. 2002-04-26 20:24:28 +00:00
Poul-Henning Kamp
9e1b5510c3 Move the winding of timecounters out of hardclock and into a normal
timeout loop.

Limit the rate at which we wind the timecounters to approx 1000 Hz.

This limits the precision of the get{bin,nano,micro}[up]time(9)
functions to roughly a millisecond.
2002-04-26 12:37:36 +00:00
Poul-Henning Kamp
056abcabb7 Various cleanup and sorting of clock reading functions. Add the two
functions missing in the complete 12 function complement.
2002-04-26 10:19:29 +00:00
Poul-Henning Kamp
656d3e04d1 Rename tco_setscales() and tco_delta() to use the same tc_ prefix as
the rest of this file.
2002-04-26 10:11:02 +00:00
Poul-Henning Kamp
7e2d76ff05 Remove the tc_update() function. Any frequency change to the
timecounter will be used starting at the next second, which is
good enough for sysctl purposes.  If better adjustment is needed
the NTP PLL should be used.
2002-04-26 10:06:26 +00:00
Brian Somers
b94c4e9a93 Test if rootvnode is NULL rather than if rootdev is NODEV when determining
if there's a filesystem present.

rootdev can be NODEV in the NFS-mounted root scenario.

Discussed with: Harti Brandt <brandt@fokus.gmd.de>, iedowse
2002-04-26 09:52:54 +00:00
Mike Silbersack
e1f1827f98 Make sure that sockets undergoing accept filtering are aborted in a
LRU fashion when the listen queue fills up.  Previously, there was
no mechanism to kick out old sockets, leading to an easy DoS of
daemons using accept filtering.

Reviewed by:	alfred
MFC after:	3 days
2002-04-26 02:07:46 +00:00
Dag-Erling Smørgrav
521eb014c8 Add the mutex profiling lock to the witness list. This hopefully unbreaks
the MUTEX_PROFILING + WITNESS + !WITNESS_SKIPSPIN case.

Submitted by:	Hiten Pandya <hiten@uk.FreeBSD.org>
2002-04-25 22:48:40 +00:00
Bruce Evans
2c900f6451 Fixed some longstanding bugs in _getenv_static():
- malformed environment strings (ones without an '=') were not rejected.
  There shouldn't be any of these, but when the static environment is
  empty it always begins with one of these; this one should be considered
  as the terminator after the end of the environment, but it isn't.
- the comparison of the name being looked up with the name in the
  environment was fuzzy -- only the characters up to the length of the
  latter were compared, so _getenv_static("foobar") matched "foo=..."
  in the environment and everything matched "" in the empty environment.

MFC after:	3 days
2002-04-25 20:25:15 +00:00
Bruce Evans
ff557fa1a9 Break the following implementation of panic(3):
#!bin/sh

	# Original version of this by Michael Reifenberger
	# <root@nihil.plaut.de>.

	mdconfig -d -u 11 >/dev/null 2>&1
	dd if=/dev/zero of=zz bs=1m count=1

	while :
	do
		mdconfig -a -t vnode -f zz -u 11
		fdisk -f - -iv /dev/md11 <<EOF1
		g c1 h64 s32
		p 1 165 0 2048
		a 1
	EOF1
		mdconfig -d -u 11
	done

Garbage pointers in __si_u were not cleared by destroy_dev().  Not
clearing si_disk made the above fatal because the disk layer uses
si_disk as a flag to indicate that the dev_t has been completely
initialized.  disk_destroy() clears si_disk for the parent dev_t
but doesn't get called for children.

Not fixed:
- setting the undocumented sysctl debug.free_devt should cause more
  complete destruction of the dev_t including clearing of __si_u, but
  actually causes the above to panic a little earlier.
- the loop leaks 10 memory allocations per iteration (4 DEVFS, 2 devbuf
  and 4 dev_t).

Reviewed by:	timeout by MAINTAINER after 3 months
2002-04-25 13:17:33 +00:00
Marcel Moolenaar
d297ad160e Don't use the symbol name to lookup the symbol value when we can use
the symbol index defined by the relocation. The elf_lookup() support
function is to be used by elf_reloc() when symbol lookups need to be
done. The elf_lookup() function operates on the symbol index and
will do a symbol name based lookup when such is required, otherwise
it uses the symbol index directly. This solves the problem seen on
ia64 where the symbol hash table does not contain local symbols and
a symbol name based lookup would fail for those symbols.

Don't pass the symbol name to elf_reloc(), as it isn't used any more.
2002-04-25 01:22:16 +00:00
Seigo Tanimura
ce00aebe22 Free(9) should be Giant-free.
Suggested by:	jhb
2002-04-24 09:59:18 +00:00
Mike Silbersack
c473d3e406 Remove sodropablereq - this function hasn't been used since the
syncache went in.

MFC after:	3 days
2002-04-24 04:11:08 +00:00
Jeffrey Hsu
4bc37205bc The cold and panicstr variables do not need to be protected by sched_lock.
Submitted by:	Jennifer Yang (yangjihui@yahoo.com)
Reviewed by:	jake & jhb in principle
2002-04-23 19:50:22 +00:00
Poul-Henning Kamp
708da94ef2 Add a basic sanity check on pointers passed to free(9).
Should be improved by:	jeff
2002-04-23 18:50:25 +00:00
Poul-Henning Kamp
00d70dec4e Don't call malloc(9) to allocate zero bytes softc data for devices. 2002-04-23 15:48:23 +00:00
Robert Watson
7a0776e477 Slightly restructure extattr_get_vp() so that there's only one entry point
to VOP_GETEXTATTR().  This simplifies code flow when inserting MAC hooks.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-04-23 01:27:38 +00:00
Alfred Perlstein
ea5b39d029 Don't FILEDESC_LOCK around calls to falloc(). 2002-04-22 20:09:11 +00:00
Dag-Erling Smørgrav
d397408818 Usage style sweep: spell "usage" with a small 'u'.
Also change one case of blatant __progname abuse (several more remain)
This commit does not touch anything in src/{contrib,crypto,gnu}/.
2002-04-22 13:44:47 +00:00
Poul-Henning Kamp
29f88f470e Comment out Kirks io-request priority hack until we can do this in a
civilized way which doesn't cause grief.

The problem is that it is not generally safe to cast a "struct bio
*" to a "struct buf *".  Things like ccd, vinum, ata-raid and GEOM
constructs bio's which are not entrails of a struct buf.

Also, curthread may or may not have anything to do with the I/O request
at hand.

The correct solution can either be to tag struct bio's with a
priority derived from the requesting threads nice and have disksort
act on this field, this wouldn't address the "silly-seek syndrome"
where two equal processes bang the diskheads from one edge to the
other of the disk repeatedly.

Alternatively, and probably better: a sleep should be introduced
either at the time the I/O is requested or at the time it is completed
where we can be sure to sleep in the right thread.

The sleep also needs to be in constant timeunits, 1/hz can be practicaly
any sub-second size, at high HZ the current code practically doesn't
do anything.
2002-04-22 06:53:20 +00:00
Marcel Moolenaar
8420105927 Add function link_elf_get_gp(), specific to ia64 for now, to get
the DT_PLTGOT value. On ia64 this is the value of GP. We need this
to construct function descriptors, but the elf file structure is
not exported to MD code.

Note that the name of the function is based on the meaning that
DT_PLTGOT has on ia64. This may differ on other architectures. As
such, link_elf_get_gp() has a high level of MD to it. Renaming the
function to describe what DT_* value is returned makes it generic,
but also makes the MD code less clear and if we only need this on
ia64, then a general name for a specific function doesn't help.

In short: I don't know what is "right" at this time, so I'll go
with what I have.
2002-04-21 21:08:30 +00:00
Mark Murray
bd41864183 Use protected names (_foo) to cutdown on boatloads of lint warnings. 2002-04-21 11:16:10 +00:00
Marcel Moolenaar
9daa5b147a GCC 3.x WARNS: Add a break to the default case. 2002-04-20 21:56:42 +00:00
Seigo Tanimura
1c2451c24d Push down Giant for setpgid(), setsid() and aio_daemon(). Giant protects only
malloc(9) and free(9).
2002-04-20 12:02:52 +00:00
Robert Watson
0510317039 Improve style consistency of vfs_syscalls.c by converting the style used
in various extattr_*() calls to match the rest of the file.  Originally,
these bits at the end looked more like style(9).  This patch was submitted
by green by way of the TrustedBSD MAC tree, and I fixed a few problems
with it on the way through.  Someone with more time on their hands should
convert the entire file to style(9); this commit is for diff reduction
purposes.

Submitted by:	green
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-04-20 01:37:08 +00:00
Robert Watson
89e9e6e7c5 In sendfile(), use the vn_rdwr() helper function, rather than manually
constructing a struct aio and invoking VOP_READ() directly.  This cleans
up the code a little, but also has the advantage of making sure almost
all vnode read/write access in the kernel goes through the helper
function, meaning that instrumentation of that helper function can impact
almost all relevant read/write operations.  In this case, it permits us
to put MAC hooks into vn_rdwr() and not modify uipc_syscalls.c (yet).

In general, if helper vn_*() functions exist, they should be used in
preference to direct VOP's in system call service code.

Submitted by:	green
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-04-19 13:46:24 +00:00
Robert Watson
5a06cb0ca6 Divorce proc0 and proc1 credentials earlier; while this isn't technically
needed in the current code, in the MAC tree, create_init() relies on the
ability to modify the credentials present for initproc, and should not
perform that modification on a shared credential.  Pro-active diff
reduction against MAC changes that are in the queue; also facilitates
other work, including the capabilities implementation.

Submitted by:	green
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-04-19 13:35:53 +00:00
Poul-Henning Kamp
3bdd2d061a suser is Giant safe, so optimize a pointless case. 2002-04-19 09:20:13 +00:00
SUZUKI Shinsuke
88ff5695c1 just merged cosmetic changes from KAME to ease sync between KAME and FreeBSD.
(based on freebsd4-snap-20020128)

Reviewed by:	ume
MFC after:	1 week
2002-04-19 04:46:24 +00:00
Jacques Vidrine
e983a3762b When exec'ing a set[ug]id program, make sure that the stdio file descriptors
(0, 1, 2) are allocated by opening /dev/null for any which are not already
open.

Reviewed by:	alfred, phk
MFC after:	2 days
2002-04-19 00:45:29 +00:00
Maxime Henrion
b48a4280fc Avoid calling malloc() or free() while holding the
kenv lock.

Reviewed by:	jake
2002-04-17 17:51:10 +00:00
Maxime Henrion
d786139c76 Rework the kernel environment subsystem. We now convert the static
environment needed at boot time to a dynamic subsystem when VM is
up.  The dynamic kernel environment is protected by an sx lock.

This adds some new functions to manipulate the kernel environment :
freeenv(), setenv(), unsetenv() and testenv().  freeenv() has to be
called after every getenv() when you have finished using the string.
testenv() only tests if an environment variable is present, and
doesn't require a freeenv() call. setenv() and unsetenv() are self
explanatory.

The kenv(2) syscall exports these new functionalities to userland,
mainly for kenv(1).

Reviewed by:	peter
2002-04-17 13:06:36 +00:00
Maxime Henrion
fd448168b7 Add an entry for the kenv(2) syscall (code to follow).
Reviewed by: peter
2002-04-17 13:05:13 +00:00
Ian Dowse
df99ca52f1 The recent NFS forced unmount improvements introduced a side-effect
where some client operations might be unexpectedly cancelled during
an unsuccessful non-forced unmount attempt. This causes problems
for amd(8), because it periodically attempts a non-forced unmount
to check if the filesystem is still in use.

Fix this by adding a new mountpoint flag MNTK_UNMOUNTF that is set
only during the operation of a forced unmount. Use this instead of
MNTK_UNMOUNT to trigger the cancellation of hung NFS operations.

Also correct a problem where dounmount() might inadvertently clear
the MNTK_UNMOUNT flag.

Reported by:	simokawa
MFC after:	1 week
2002-04-17 01:07:29 +00:00
John Baldwin
ba626c1db2 Lock proctree_lock instead of pgrpsess_lock. 2002-04-16 17:11:34 +00:00
John Baldwin
596325f154 - Lock proctree_lock instead of pgrpsess_lock.
- Use temporary variables to hold a pointer to a pgrp while we dink with it
  while not holding either the associated proc lock or proctree_lock.  It
  is in theory possible that p->p_pgrp could change out from under us.
2002-04-16 17:09:22 +00:00
John Baldwin
c8b1829d8e - Lock proctree_lock instead of pgrpsess_lock.
- Simplify return logic of setsid() and setpgid().
2002-04-16 17:06:11 +00:00
John Baldwin
ea97757a54 - Lock proctree_lock instead of pgrpsess_lock.
- Exclusively lock proctree_lock while calling leavepgrp().
2002-04-16 17:04:21 +00:00
John Baldwin
f089b57070 - Merge the pgrpsess_lock and proctree_lock sx locks into one proctree_lock
sx lock.  Trying to get the lock order between these locks was getting
  too complicated as the locking in wait1() was being fixed.
- leavepgrp() now requires an exclusive lock of proctree_lock to be held
  when it is called.
- fixjobc() no longer gets a shared lock of proctree_lock now that it
  requires an xlock be held by the caller.
- Locking notes in sys/proc.h are adjusted to note that everything that
  used to be protected by the pgrpsess_lock is now protected by the
  proctree_lock.
2002-04-16 17:03:05 +00:00
Poul-Henning Kamp
fe4dc7a6ee Remove two debug printfs which should never have been committed. 2002-04-15 21:08:51 +00:00
John Baldwin
38e0823392 You have to cast int64_t's to long long if you printf them with %lld.
This now compiles on alpha without a warning.

Pointy-hat to:	phk
2002-04-15 21:04:32 +00:00
Poul-Henning Kamp
e1d970f181 Improve the implementation of adjtime(2).
Apply the change as a continuous slew rather than as a series of
discrete steps and make it possible to adjust arbitraryly huge
amounts of time in either direction.

In practice this is done by hooking into the same once-per-second
loop as the NTP PLL and setting a suitable frequency offset deducting
the amount slewed from the remainder.  If the remaining delta is
larger than 1 second we slew at 5000PPM (5msec/sec), for a delta
less than a second we slew at 500PPM (500usec/sec) and for the last
one second period we will slew at whatever rate (less than 500PPM)
it takes to eliminate the delta entirely.

The old implementation stepped the clock a number of microseconds
every HZ to acheive the same effect, using the same rates of change.

Eliminate the global variables tickadj, tickdelta and timedelta and
their various use and initializations.

This removes the most significant obstacle to running timecounter and
NTP housekeeping from a timeout rather than hardclock.
2002-04-15 12:23:11 +00:00
Poul-Henning Kamp
b35c8f287d Take the "tickadj" element out of struct clockinfo. Our adjtime(2)
implementation is being changed and the very concept of tickadj will
no longer be meaningful.
2002-04-15 12:11:06 +00:00
Poul-Henning Kamp
b9c6e8bdbd In the ntp_adjtime(2) syscall, return our actual estimate of unapplied
offset correction instead of the most recent offset applied.
2002-04-15 08:58:24 +00:00
Jeff Roberson
5e914b96b9 Finish adding support code for sysctl kern.mprof. This dumps some malloc
information related to bucket size effeciency.  Three things are printed on
each row:

Size is the size the user actually asked for rounded to 16 bytes.
Requests is the number of times this size was asked for.
Real Size is the size we actually handed out.

At the end the total memory used and total waste is displayed.  Currently my
system displays about 33% wasted memory.

The intent of this code is to gather statistics for tuning the malloc bucket
sizes.  It is not intended to be run with INVARIANTS and it is not entirely
mp safe.  It can be enabled via 'options MALLOC_PROFILE' which was commited
earlier.
2002-04-15 05:24:01 +00:00
Jeff Roberson
6f2671750e Remove malloc_type's ks_limit.
Updated the kmemzones logic such that the ks_size bitmap can be used as an
index into it to report the size of the zone used.

Create the kern.malloc sysctl which replaces the kvm mechanism to report
similar data.  This will provide an easy place for statistics aggregation if
malloc_type statistics become per cpu data.

Add some code ifdef'd under MALLOC_PROFILING to facilitate a tool for sizing
the malloc buckets.
2002-04-15 04:05:53 +00:00
Alfred Perlstein
46e12b42fe Don't allow one to trace an ancestor when already traced.
PR: kern/29741
Submitted by: Dave Zarzycki <zarzycki@FreeBSD.org>
Fix from: Tim J. Robbins <tim@robbins.dropbear.id.au>
MFC After: 2 weeks
2002-04-14 17:12:55 +00:00
Jeff Roberson
79a3e97054 Use VOP_GETVOBJECT instead of accessing the member directly. This fixed
an issue with nullfs and NAMEI shared.

Submitted by:	Alexander Kabaev
2002-04-14 10:18:48 +00:00
Alan Cox
24ab015f79 Regen 2002-04-14 05:33:58 +00:00
Alan Cox
b0d97980f6 Remove the requirement that Giant be held around sigreturn(). 2002-04-14 05:31:47 +00:00
Alan Cox
00e731601d o Use aiocblist::fd_file in the AIO threads rather than recomputing
the file * from the calling process's descriptor table.
 o Eliminate sharing of the calling process's descriptor table
   with the AIO threads.
2002-04-14 03:04:19 +00:00
John Baldwin
9c1ab3e04a - Change killpg1()'s first argument to be a thread instead of a process so
we can use td_ucred.
- In killpg1(), the proc lock is sufficient to check if p_stat is SZOMB
  or not.  We don't need sched_lock.
- Close some races in psignal().  In psignal() there is a big switch
  statement based on p_stat.  All the different cases are assuming that
  the process (or thread) isn't going to change state out from under it.
  To ensure this is true, just lock sched_lock for the entire switch.  We
  practically held it the entire time already anyways.  This also
  simplifies the locking somewhat and actually results in fewer lock
  operations.
- Allow signotify() to be called with the sched_lock held since psignal()
  now does that.
- Use td_ucred in a couple of places.
2002-04-13 23:33:36 +00:00
John Baldwin
bad56603ba - Change donice() to take a thread as the first argument instead of a
process so it can use td_ucred.
- Require the target process of donice() to be locked when donice() is
  called.
- Use td_ucred.
- Lock the target process of p_cansee() and while reading the credentials
  of a process.
- Change the logic of rtprio() slightly so it does it's copyin() if needed
  prior to locking the target process.
- rtprio() no longer needs Giant.  In theory with full KSE it would still
  need Giant to protect p_ucred of curproc for the p_canfoo() functions
  but p_canfoo() will be changing to using td_ucred of curthread before
  full KSE hits the tree.
2002-04-13 23:28:23 +00:00
John Baldwin
07f3485d5e - Change the algorithms of the syscalls to modify process credentials to
allocate a blank cred first, lock the process, perform checks on the
  old process credential, copy the old process credential into the new
  blank credential, modify the new credential, update the process
  credential pointer, unlock the process, and cleanup rather than trying
  to allocate a new credential after performing the checks on the old
  credential.
- Cleanup _setugid() a little bit.
- setlogin() doesn't need Giant thanks to pgrp/session locking and
  td_ucred.
2002-04-13 23:07:05 +00:00
John Baldwin
a7ff744350 - Change the first argument of ktrcanset(), ktrsetchildren(), and ktrops()
to a thread pointer so that ktrcanset() can use td_ucred.
- Add some proc locking to partially protect p_tracep and p_traceflag.
2002-04-13 22:54:18 +00:00
Thomas Moestl
8db523989f Use pmap_extract() instead of pmap_kextract() to retrieve the physical
address associated with a user virtual address in
pipe_build_write_buffer().

Reviewed by:	alc
2002-04-13 20:09:06 +00:00
Jeroen Ruigrok van der Werven
bcbf4411d6 Use the correct macros for F_SETFD/F_GETFD instead of magic numbers.
Reflect that fact in the manual page.

PR:		12723
Submitted by:	Peter Jeremy <peter.jeremy@alcatel.com.au>
Approved by:	bde
MFC after:	2 weeks
2002-04-13 10:16:53 +00:00
Thomas Moestl
de67a4bd91 Back out the last revision - it does not work correctly when one of
the pages in question is not in the top-level vm object, but in
one of the shadow ones.

Pointed out by: alc
Pointy hat to:	tmm
2002-04-13 00:03:07 +00:00
John Baldwin
6871a6c89e Rework ptrace(2) to be more locking friendly. We do any needed copyin()'s
and acquire the proctree_lock if needed first.  Then we lock the process
if necessary and fiddle with it as appropriate.  Finally we drop locks and
do any needed copyout's.  This greatly simplifies the locking.
2002-04-12 21:17:37 +00:00
Thomas Moestl
60f2606a7d Do not use pmap_kextract() to find out the physical address of a user
belong to a user virtual address; while this happens to work on some
architectures, it can't on sparc64, since user and kernel virtual
address spaces overlap there (the distinction between them is done via
separate address space identifiers).

Instead, look up the page in the vm_map of the process in question.

Reviewed by:	jake
2002-04-12 19:38:41 +00:00
Jeffrey Hsu
4037698769 Fix corner case where m_len was not being initialized.
Submitted by:	Maksim Yevmenkin <myevmenk@digisle.net>
MFC after:	1 week
2002-04-12 00:01:50 +00:00
John Baldwin
b106d2f56a - Set the base priority of an ithread that has no handlers when we set its
normal priority.
- Lock sched_lock while we dink with the priorities.
- Remove a few extra blank lines.
2002-04-11 21:03:35 +00:00
Alan Cox
ab9ab5702e Regen 2002-04-11 17:35:53 +00:00
Alan Cox
a0805f6f7a Remove the requirement that Giant be held around osigreturn(). All platform-
specific implementations are MPSAFE.
2002-04-11 17:34:38 +00:00
John Baldwin
7edfb592df - Change settime() to take a thread as its first argument instead of a proc
so it can use td_ucred.
- Push Giant down into the end of settime() where we actually set the time
  on the timecounter and time of day clock.
- Remove Giant from clock_settime().
- Push Giant down in settimeofday() to just protect the 'tz' global
  variable.
2002-04-10 04:09:07 +00:00
John Baldwin
9522390c28 Display the recursion count in the lock_instance in the show locks
output.

Indirectly requested by:	peter
2002-04-10 01:25:11 +00:00
John Baldwin
9351347a17 Cosmetic fixup in output of lock types in show locks output. 2002-04-10 01:19:53 +00:00
Brian Somers
f1e4a6e941 In linker_load_module(), check that rootdev != NODEV before calling
linker_search_module().

Without this, modules loaded from loader.conf that then try to load
in additional modules (such as digi.ko loading a card's BIOS) die
badly in the vn_open() called from linker_search_module().

It may be worth checking (KASSERTing?) that rootdev != NODEV in
vn_open() too.
2002-04-10 01:14:45 +00:00
Brian Somers
96987c74d6 Change linker_reference_module() so that it's passed a struct
mod_depend * (which may be NULL).  The only consumer of this
function at the moment is digi_loadmoduledata(), and that passes
a NULL mod_depend *.

In linker_reference_module(), check to see if we've already got
the required module loaded.  If we have, bump the reference count
and return that, otherwise continue the module search as normal.
2002-04-10 01:13:57 +00:00
John Baldwin
65c9b4303b - Change fill_kinfo_proc() to require that the process is locked when it
is called.
- Change sysctl_out_proc() to require that the process is locked when it
  is called and to drop the lock before it returns.  If this proves too
  complex we can change sysctl_out_proc() to simply acquire the lock at
  the very end and have the calling code drop the lock right after it
  returns.
- Lock the process we are going to export before the p_cansee() in the
  loop in sysctl_kern_proc() and hold the lock until we call
  sysctl_out_proc().
- Don't call p_cansee() on the process about to be exported twice in
  the aforementioned loop.
2002-04-09 20:10:46 +00:00
John Baldwin
9b28af9165 Whitespace changes to wrap long lines. 2002-04-09 20:01:16 +00:00
John Baldwin
6dc958b9ff We don't need Giant to read the pgrp ID since the proc lock has protected
p_pgrp since the pgrp locking went in.  We also don't need it to check for
invalid values in the options argument to wait1(), so push Giant down
slightly.
2002-04-09 20:00:40 +00:00
John Baldwin
16e7bc7b90 - Remove an early KSE diagnostic panic. The thread pointer here is always
curthread.
- We don't need Giant to do suser() checks now, so don't lock Giant until
  after the check.
2002-04-09 19:58:38 +00:00
John Baldwin
2b60cfc5ce Don't lock the ithread lock in ithread_create(). The ithread isn't on any
lists or in any tables yet so there are no other references to it, thus
we don't need to lock it.
2002-04-09 16:26:37 +00:00
Poul-Henning Kamp
1bdb20a68e Implement DIOCGFRONTSTUFF ioctl which reports how many bytes from the start
of the device magic stuff might occupy.

Sponsored by: DARPA & NAI Labs.
2002-04-09 15:43:32 +00:00
Poul-Henning Kamp
7f086a0852 Rename DIOCGKERNELDUMP to DIOCSKERNELDUMP as it strictly speaking
is a "set" not a "get" operation.

Sponsored by:	DARPA & NAI Labs.
2002-04-09 10:04:09 +00:00
Jeff Roberson
a59f8b9e6c Turn #ifdef LOOKUP_SHARED into #ifndef LOOKUP_EXCLUSIVE to enable this
behavior by default.  Also, change the options line to reflect this.

If there are no problems reported this will become the only behavior and the
knob will be removed in a month or so.

Demanded by:	obrien
2002-04-09 05:14:17 +00:00
Maxime Henrion
a48ca36999 The fourth parameter to copystr() is a size_t, not an int.
Approved by:	peter
2002-04-08 21:14:19 +00:00
Poul-Henning Kamp
2dd527b3ac Move generic disk ioctls from <sys/disklabel.h> to <sys/disk.h>.
Sponsored by:	DARPA & NAI Labs
2002-04-08 09:20:07 +00:00
Poul-Henning Kamp
d39e457bba Put back dumppcb, but this time we put a comment to tell what it is for.
Brucifixion by:	bde
2002-04-08 06:59:13 +00:00
Alan Cox
c0bf5caa74 Restructure aio_return() to eliminate duplicated code and facilitate Giant
push down.
2002-04-08 04:57:56 +00:00
Jeffrey Hsu
20504246d8 There's only one socket zone so we don't need to remember it
in every socket structure.
2002-04-08 03:04:22 +00:00
Maxime Henrion
9d8353732e o Change kernel_vmount() interface to be more convenient : pass two
separate strings instead of passing "foo=bar".
o Don't forget to clear the VMOUNT flag on the vnode when vfs_nmount()
  fails because the fs doesn't implement VFS_NMOUNT (and in vfs_mount()
  when the fs doesn't implement VFS_MOUNT) ; also decrement the vfs
  refcount in the !MNT_UPDATE case.
2002-04-07 13:22:47 +00:00
David Malone
cf4ce70bb3 Remove a comment which relates to the old name cache code, which
was replaced in 1997.

Approved by:	phk
2002-04-07 08:58:31 +00:00
Alan Cox
ae124fc4bd Reduce the duplication of code for error handling in _aio_aqueue(). 2002-04-07 07:17:59 +00:00
Alan Cox
63a4964eec Change jobref and *ijoblist from int to long in order to avoid
a catastrophe after the 2^32nd AIO operation on 64-bit architectures.
2002-04-07 01:28:34 +00:00
Jake Burkholder
98281c99fc Remove a stale comment. 2002-04-06 08:44:04 +00:00
Jake Burkholder
a9f5d33875 Include machine/ktr.h for sparc64 so we pick up KTR_CPU. 2002-04-06 08:43:17 +00:00
Jake Burkholder
a30d7c60f6 Use CTASSERT rather than a runtime check to detect kinfo_proc size changes.
Remove the ugly yuck code to busy wait for 20 seconds.
2002-04-06 08:13:52 +00:00
Yoshihiro Takahashi
d7ef6277af Added the new kernel dumping support for pc98. 2002-04-06 06:41:54 +00:00
Bruce Evans
c78f394575 Updated a doubly stale comment about signotify(). Fixed a nearby long line. 2002-04-05 10:00:37 +00:00