4086 Commits

Author SHA1 Message Date
mp
02116da56f Reduce stack allocation (stack-fast?).
    elf_load_file()   =>  352 to 52 bytes
    exec_elf_imgact() => 1072 to 48 bytes
    elf_corehdr()     =>  396 to  8 bytes

Reviewed by:	julian
2001-08-16 16:14:26 +00:00
peter
fbb375df47 Use the backwards compatibility mechanisms so that ps/top etc don't have
unnecessary breakage.

While here, use explicit sizes for the string fields so that we don't
have unintentional changes again in the future when key tunables change.

This still is not quite right, but a June userland is happy with
a -current kernel with these tweaks.
2001-08-16 08:41:15 +00:00
peter
b47c38449f Use explicit sizes for the prpsinfo command length string so that
we don't have any more unexpected changes in core dumps.  This gets us
back to the original core dump layout from a few days ago.
2001-08-16 08:35:51 +00:00
bde
a3c257601f Don't dump on the label sector or below. This avoids clobbering the
label if the dump device overlaps the label (which is a slight
misconfiguration).  Dump routines don't use dscheck(), so the normal
write protection of the label doesn't help.

Reduced some nearby overflow bugs.  In disk_dumpcheck(), there was
(fatal but fail-safe) overflow on i386's with 4GB of memory, at least
if Maxmem was the top page (can this happen?).  The fix assumes that
the sector size divides PAGE_SIZE (dump routines already assume this).
In setdumpdev(), the corresponding overflow occurred with only about
2GB of memory on all machines with 32-bit ints.  This allowed setdumpdev()
to succeed when it shouldn't have, but then disk_dumpcheck() failed
safe later.  Except in old versions of FreeBSD like RELENG_3 where
there is no disk_dumpcheck().

PR:		28164 (label clobbering part)
MFC after:	1 week
2001-08-15 11:35:45 +00:00
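The overflow described in a3c257601f can be illustrated with a small sketch. It assumes 4096-byte pages and 512-byte sectors, and it uses unsigned 32-bit arithmetic so that the wraparound is well defined (it therefore wraps at 4GB, where a signed int would already misbehave near 2GB). The function names are hypothetical and are not the kernel's own.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u   /* assumed page size */
#define DEV_BSIZE  512u   /* assumed sector size; must divide PAGE_SIZE */

/* Overflow-prone: the byte count is formed in 32 bits before dividing. */
uint32_t dump_sectors_naive(uint32_t pages)
{
    return (pages * PAGE_SIZE) / DEV_BSIZE;
}

/* Scales by sectors per page instead, so no intermediate byte count exists. */
uint32_t dump_sectors_fixed(uint32_t pages)
{
    return pages * (PAGE_SIZE / DEV_BSIZE);
}
```

With 1048576 pages (4GB) the naive form wraps to zero sectors, which is how a size check can pass when it should not; the second form relies on the sector size dividing PAGE_SIZE, as the commit message notes.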
jasone
0133f95fe7 Implement kernel semaphores.
Reviewed by:	jhb
2001-08-14 22:13:14 +00:00
jasone
74c140cf82 Add sx_try_upgrade() and sx_downgrade().
Submitted by:	Alexander Kabaev <ak03@gte.com>
2001-08-13 21:25:30 +00:00
jhb
e1cf3a4743 If we've panic'd already, then just bail in lockmgr rather than blocking or
possibly panic'ing again.
2001-08-10 23:29:15 +00:00
wpaul
69603fe5d4 Fix some of the GDB linkage setup. The l_name member of the gdb linkage
structure is always free()ed yet only sometimes malloc()ed. In particular,
it was simply set to point to l_filename from a linker_file_t in
link_elf_link_preload_finish(). The l_filename had been malloc()ed inside
the kern_linker.c module and was being free()ed twice: once by
link_elf_unload_file() and again by linker_file_unload(), leading to
a panic.

How to duplicate the problem:

- Pre-load a kernel module from the loader, i.e. if_sis.ko
- Boot system
- Attempt to unload module with kldunload if_sis
- Bewm

The problem here is that the case where the module was loaded with kldload
after system boot would work correctly, so this bug went unnoticed until
I stubbed my toe on it just now. (Also, you can only trip this bug if
you compile a kernel with options DDB, but that's the default now.)

Fix: remember to malloc() a separate copy of the module name for the
l_name member of the gdb linkage structure in three places where the
linkage structure can be initialized.
2001-08-10 23:15:13 +00:00
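The ownership rule behind the fix in 69603fe5d4 can be sketched in userland C. The struct and function names below are hypothetical stand-ins, not the linker's real definitions; the point is only that a field which is always free()d must always hold its own allocation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the gdb linkage structure. */
struct gdb_link {
    char *l_name;   /* always free()d on teardown, so it must own its storage */
};

/* Store a private copy of filename; returns 0 on success, -1 on allocation failure. */
int gdb_link_set_name(struct gdb_link *l, const char *filename)
{
    size_t len = strlen(filename) + 1;
    char *copy = malloc(len);

    if (copy == NULL)
        return -1;
    memcpy(copy, filename, len);
    l->l_name = copy;
    return 0;
}

/* Release the private copy; the caller's original string is untouched. */
void gdb_link_clear(struct gdb_link *l)
{
    free(l->l_name);
    l->l_name = NULL;
}
```

Because l_name never aliases the caller's string, freeing both the linkage structure's copy and the original filename is no longer a double free.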
jhb
4a89454dcd - Close races with signals and other AST's being triggered while we are in
the process of exiting the kernel.  The ast() function now loops as long
  as the PS_ASTPENDING or PS_NEEDRESCHED flags are set.  It returns with
  preemption disabled so that any further AST's that arrive via an
  interrupt will be delayed until the low-level MD code returns to user
  mode.
- Use u_int's to store the tick counts for profiling purposes so that we
  do not need sched_lock just to read p_sticks.  This also closes a
  problem where the call to addupc_task() could screw up the arithmetic
  due to non-atomic reads of p_sticks.
- Axe need_proftick(), aston(), astoff(), astpending(), need_resched(),
  clear_resched(), and resched_wanted() in favor of direct bit operations
  on p_sflag.
- Fix up locking with sched_lock some.  In addupc_intr(), use sched_lock
  to ensure pr_addr and pr_ticks are updated atomically with setting
  PS_OWEUPC.  In ast() we clear pr_ticks atomically with clearing
  PS_OWEUPC.  We also do not grab the lock just to test a flag.
- Simplify the handling of Giant in ast() slightly.

Reviewed by:	bde (mostly)
2001-08-10 22:53:32 +00:00
jhb
63014c2530 Make witness compile w/o DDB.
Reported by:	wpaul
2001-08-10 22:33:59 +00:00
iedowse
b3168db212 Arbitrarily limit to 64k the number of bytes that can be read at
a time using the ogetdirentries() compatibility syscall. This is a
hack to ensure that ridiculous values don't get passed to MALLOC().

Reviewed by:	kris
2001-08-10 22:14:18 +00:00
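A minimal sketch of the 64k cap from b3168db212, assuming the requested size is clamped rather than rejected (the commit message does not say which). The macro and function names are hypothetical.

```c
#include <stddef.h>

#define DIRENT_READ_MAX (64u * 1024u)   /* hypothetical name for the 64k limit */

/* Bound a caller-supplied byte count before it is used as an allocation size. */
size_t clamp_read_size(size_t requested)
{
    return requested > DIRENT_READ_MAX ? DIRENT_READ_MAX : requested;
}
```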
jhb
54a0ded8d4 Work around a race between msleep() and endtsleep() where it was possible
for endtsleep() to be executing when msleep() resumed, for endtsleep()
to spin on sched_lock long enough for the other process to loop on
msleep() and sleep again resulting in endtsleep() waking up the "wrong"
msleep.

Obtained from:	BSD/OS
2001-08-10 21:08:56 +00:00
jhb
22915435df Change callout_stop() to return an integer. If callout_stop() succeeds in
removing the callout entry, return 1.  If callout_stop() fails to remove
the callout entry because it is currently executing or has already been
executed, then the function returns 0.  The idea was obtained from BSD/OS,
however, BSD/OS changed untimeout(), and I've just changed callout_stop()
to be more conservative.

Obtained from:	BSD/OS
2001-08-10 21:06:59 +00:00
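The return contract described in 22915435df can be modeled with a userland mock. Nothing below is the kernel implementation: struct mock_callout and both functions are hypothetical, and only the 1/0 meaning is taken from the commit message.

```c
/* Hypothetical userland stand-in for a kernel callout entry. */
struct mock_callout {
    int pending;   /* nonzero while the entry is queued and has not run */
};

/* Mirrors the described contract: 1 if the entry was removed, 0 otherwise. */
int mock_callout_stop(struct mock_callout *c)
{
    if (c->pending) {
        c->pending = 0;
        return 1;
    }
    return 0;
}

/*
 * Returns 1 only when the stop removed a queued entry, so the handler is
 * known not to run. A 0 result means the handler is running or already
 * ran, and the caller needs some other synchronization before releasing
 * the handler's argument.
 */
int safe_to_release_now(struct mock_callout *c)
{
    return mock_callout_stop(c) == 1;
}
```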
jhb
18c8b84e2f Style nit: convert a couple of if (p_wchan) tests to if (p_wchan != NULL). 2001-08-10 20:56:25 +00:00
jhb
6c77154134 - Remove asleep(), await(), and M_ASLEEP.
- Callers of asleep() and await() have been converted to calling tsleep().
  The only caller outside of M_ASLEEP was the ata driver, which called both
  asleep() and await() with spl-raised, so there was no need for the
  asleep() and await() pair.  M_ASLEEP was unused.

Reviewed by:	jasone, peter
2001-08-10 06:45:43 +00:00
jhb
2ff1c253cd - Remove asleep(), await(), and M_ASLEEP.
- Callers of asleep() and await() have been converted to calling tsleep().
  The only caller outside of M_ASLEEP was the ata driver, which called both
  asleep() and await() with spl-raised, so there was no need for the
  asleep() and await() pair.  M_ASLEEP was unused.

Reviewed by:	jasone, peter
2001-08-10 06:37:05 +00:00
jhb
89b722b37f Axe spl's obsoleted by the callout mutex. 2001-08-10 01:36:25 +00:00
peter
0c05c098a9 *** empty log message *** 2001-08-09 01:21:58 +00:00
peter
bb5c43c4b8 Zap 'ptrace(PT_READ_U, ...)' and 'ptrace(PT_WRITE_U, ...)' since they
are a really nasty interface that should have been killed long ago
when 'ptrace(PT_[SG]ETREGS' etc came along.  The entity that they
operate on (struct user) will not be around much longer since it
is part-per-process and part-per-thread in a post-KSE world.

gdb does not actually use this except for the obscure 'info udot'
command which does a hexdump of as much of the child's 'struct user'
as it can get.  It carries its own #defines so it doesn't break
compiles.
2001-08-08 05:25:15 +00:00
green
175c3f1d2d Previously, the ELF linker would always just store the pointer to a
filename passed in via the module loader functions in the GDB
"sharedlibrary" support structures.  This isn't good, since the pointer
would become stale in almost every case (not the pre-loaded case, of
course).

Change this to malloc()ed copy of the string and finally fix the reason
that gdb -k's "sharedlibrary" command stopped working.

Obtained from:	LOMAC/FreeBSD (cf. NAI Labs)
2001-08-06 14:21:57 +00:00
chris
81b95242db Remove the fildesc_clone() function and its associated unnecessary code.
It didn't implement the proper /dev/fd functionality (which would be to
include in the directory listing /dev/fd/n if the process has fd n open)
anyway.

Anything needing access to /dev/fd/n where n > 2 can use the optional
fdescfs module, which implements this properly and does not cause any
trouble with devfs.

Discussed with:	phk
2001-08-06 05:56:33 +00:00
tmm
fb501835eb Export the tk_nin and tk_nout variables (number of tty input/output
characters) as sysctls (kern.tty_nin and kern.tty_nout).
2001-08-04 18:09:24 +00:00
tmm
77704f41cd Export the head structure for the device statistics STAILQ in
sys/devicestat.h, so that the queue can be walked in crashdumps using
libkvm.
2001-08-04 18:02:47 +00:00
jhb
3713e597cb Add KTR_INTR tracepoints for when clock interrupts are triggered. 2001-08-03 20:54:41 +00:00
rwatson
8c4571a0e7 Anton kindly pointed out (and fixed) a bug in the Jail handling of the
bind() call on IPv4 sockets:

  Currently, if one tries to bind a socket using INADDR_LOOPBACK inside a
  jail, it will fail because prison_ip() does not take this possibility
  into account.  On the other hand, when one tries to connect(), for
  example, to localhost, prison_remote_ip() will silently convert
  INADDR_LOOPBACK to the jail's IP address.  Therefore, it is desirable to
  make bind() to do this implicit conversion as well.

  Apart from this, the patch also replaces 0x7f000001 in
  prison_remote_ip() to a more correct INADDR_LOOPBACK.

This is a 4.4-RELEASE "during the freeze, thanks" MFC candidate.

Submitted by:	Anton Berezin <tobez@FreeBSD.org>
Discussed with at some point:	phk
MFC after:	3 days
2001-08-03 18:21:06 +00:00
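The implicit conversion that 8c4571a0e7 adds for bind() can be sketched as follows. This is a simplified illustration with a hypothetical function name; it works on host-byte-order addresses and shows only the loopback mapping, not the other checks prison_ip() performs.

```c
#include <stdint.h>
#include <netinet/in.h>   /* INADDR_LOOPBACK, in host byte order */

/* Map a loopback bind request to the jail's own address; pass others through. */
uint32_t jail_bind_addr(uint32_t requested, uint32_t jail_ip)
{
    if (requested == INADDR_LOOPBACK)
        return jail_ip;
    return requested;
}
```

Using the INADDR_LOOPBACK constant instead of a literal 0x7f000001 is the same substitution the patch makes in prison_remote_ip().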
bmilekic
4dfee5e935 Rename mb_init() mbuf subsystem initialization routine to mbuf_init(), in
order to avoid namespace collision with subr_mchain.c's mb_init(). This
wasn't "fatal" as the mbuf initialization routine mb_init() was local to
subr_mbuf.c which in turn didn't pull in subr_mchain.c's mb_init()
declaration, but it should definitely be changed now before it creates
headache.
2001-08-03 05:05:32 +00:00
jake
258022abac Remove some code that appears to have endian problems with INVARIANTS.
This is #if BIG_ENDIAN, but is only necessary if malloc types are shorts,
not struct malloc_type * like they are now.
2001-08-03 03:31:45 +00:00
jhb
e712875281 Use 'p' instead of the potentially more expensive 'curproc' inside of
mi_switch().
2001-08-02 22:15:31 +00:00
imp
982fd84b01 Make the fmt arguments to make_dev and make_dev_alias const char *.
Approved on IRC as long as it didn't cause a large number of warnings by: phk

MFC After: 700 hours
2001-08-02 20:35:35 +00:00
peter
9f38eeae58 Temporarily back out kern_sig.c rev 1.125 and kern_exit.c rev 1.131.
This panicked one of my machines one time too many :-( and there is
no sign of a solution in the pipeline.  The deltas are still easily
available in cvs.  The problem is that if the parent has been swapped
out, the child process cannot grope around in the parent's UPAGES to
see the sigact[] array or it will fault.  This probably is a showstopper
for this implementation anyway.
2001-08-01 20:35:24 +00:00
bmilekic
cceec8d181 Move CPU_ABSENT() macro to smp.h, where it belongs anyway. It will be
defined to 0 in the non-SMP case, which very much makes sense as it
permits its usage in per-CPU initialization loops (for an example, check
out subr_mbuf.c).
  Further, on a UP system, make mb_alloc always use the first per-CPU
container, regardless of cpuid (i.e. remove reliance on cpuid in the
UP case).

Requested by: alfred
2001-08-01 00:54:00 +00:00
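Why a CPU_ABSENT() that is 0 in the non-SMP case is convenient can be shown with one loop that compiles unchanged in both configurations. This is a sketch under assumptions: the all_cpus mask, its value, and the function name are hypothetical, and the SMP definition is only a plausible shape for the macro.

```c
#ifdef SMP
static unsigned int all_cpus = 0x5;   /* hypothetical mask: CPUs 0 and 2 present */
#define CPU_ABSENT(id) ((all_cpus & (1u << (id))) == 0)
#else
#define CPU_ABSENT(id) (0)            /* UP: the single CPU is never absent */
#endif

/* Count the per-CPU containers that would be set up for ids 0..nslots-1. */
int count_percpu_containers(int nslots)
{
    int id, n = 0;

    for (id = 0; id < nslots; id++) {
        if (CPU_ABSENT(id))
            continue;   /* sparse cpuid set: skip holes */
        n++;
    }
    return n;
}
```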
jhb
6394826cde Apply the cluebat to myself and undo the await() -> mawait() rename. The
asleep() and await() functions split the functionality of msleep() up into
two halves.  Only the asleep() half (which is what puts the process on the
sleep queue) actually needs the lock usually passed to msleep() held to
prevent lost wakeups.  await() does not need the lock held, so the lock
can be released prior to calling await() and does not need to be passed in
to the await() function.  Typical usage of these functions would be as
follows:

        mtx_lock(&foo_mtx);
        ... do stuff ...
        asleep(&foo_cond, PRIxx, "foowt", hz);
        ...
        mtx_unlock(&foo_mtx);
        ...
        await(-1, -1);

Inspired by:	dillon on the couch at Usenix
2001-07-31 22:06:56 +00:00
jhb
a5cd152fc8 Add a safety belt to mawait() for the (cold || panicstr) case identical to
the one in msleep() such that we return immediately rather than blocking.

Submitted by:	peter
Prodded by:	sheldonh
2001-07-31 20:57:57 +00:00
jhb
02db8471a3 If we have already panic'd then don't bother enforcing mutex asserts as
things are pretty much shot already and all panic'ing does is hurt our
chances of getting a dump.

Inspired by:	sheldonh
2001-07-31 17:45:50 +00:00
jhb
a0a2e280bd - Fix panicstr checks to explicitly check against NULL.
- Add a few more panicstr checks so that we don't panic recursively.

Requested by:	sheldonh (2)
2001-07-31 17:44:57 +00:00
rwatson
c1e081808c o Modify p_candebug() such that there is no longer automatic acceptance
  of debugging the current process when that is in conflict with other
  restrictions (such as jail, unprivileged_procdebug_permitted, etc).
o This corrects anomalies in the behavior of
  kern.security.unprivileged_procdebug_permitted when using truss and
  ktrace.  The theory goes that this is now safe to use.

Obtained from:	TrustedBSD Project
2001-07-31 17:25:12 +00:00
rwatson
09d5fb71b9 o Introduce new kern.security sysctl tree for kernel security policy
MIB entries.
o Relocate kern.suser_permitted to kern.security.suser_permitted.
o Introduce new kern.security.unprivileged_procdebug_permitted, which
  (when set to 0) prevents processes without privilege from performing
  a variety of inter-process debugging activities.  The default is 1,
  to provide current behavior.

  This feature allows "hardened" systems to disable access to debugging
  facilities, which have been associated with a number of past security
  vulnerabilities.  Previously, while procfs could be unmounted, other
  in-kernel facilities (such as ptrace()) were still available.  This
  setting should not be modified on normal development systems, as it
  will result in frustration.  Some utilities respond poorly to
  failing to get the debugging access they require, and error response
  by these utilities may be improved in the future in the name of
  beautification.

  Note that there are currently some odd interactions with some
  facilities, which will need to be resolved before this should be used
  in production, including odd interactions with truss and ktrace.
  Note also that currently, tracing is permitted on the current process
  regardless of this flag, for compatibility with previous
  authorization code in various facilities, but that will probably
  change (and resolve the odd interactions).

Obtained from:	TrustedBSD Project
2001-07-31 15:48:21 +00:00
jake
7abfb73d23 Don't try to find an eventhandler list if the list of lists hasn't
been initialized yet.
2001-07-31 03:52:16 +00:00
jake
fe4d4f7ee3 Don't try to print a field that doesn't exist; in usually commented
out debugging code.
2001-07-31 03:51:07 +00:00
jake
21b80f4133 Use a machine dependent type, Elf_Hashelt, for the elements of the elf
dynamic symbol table buckets and chains.  The sparc64 toolchain uses 32
bit .hash entries, unlike other 64 bits architectures (alpha), which use
64 bit entries.

Discussed with: dfr, jdp
2001-07-31 03:46:39 +00:00
asmodai
baad636e13 Fix obsolete code.
FreeBSD _does_ define ENOMSG, so no need for checking if we support it.

Inspired by PR:		22470
Which was submitted by:	Bjorn Tornqvist <bjorn@west.se>
MFC after:	1 week
2001-07-30 19:28:02 +00:00
peter
6ca5d5c5c5 Revert previous accidental commit. FWIW, it was part of enabling
VM caching of disks through mmap() and stopping syncing of open files
that had their last reference in the fs removed (ie: their unsync'ed
pages get discarded on close already, so I made it stop syncing too).
2001-07-27 15:57:17 +00:00
peter
18bc463cb6 Fix cut/paste blunder. Serves me right for doing a last minute tweak
to what I had for some time.

Submitted by:	bde
2001-07-27 15:52:49 +00:00
peter
94613ac7da Use the tunable maxusers rather than the compile-time one. Evaluate and
initialize in the right order to make derivative settings work right.
eg: at compile time, nmbufs was double nmbclusters.  For POLA this should
work the same at runtime.
2001-07-26 23:08:31 +00:00
peter
df2f882214 Move param.c out of the conf directory and make it fully dynamic.
Tunables are now derived at boot time from maxusers.  ie: change maxusers
via a tunable and all the derivative settings change.  You can change
the other tunables individually as well.  Even hz etc is tunable.
2001-07-26 23:04:03 +00:00
bmilekic
0caeab3ccd - Do not handle the per-CPU containers in mbuf code as though the cpuids
  were indices in a dense array. The cpuids are a sparse set, so treat
  them as such, setting up containers only for CPUs activated during
  mb_init().

- Fix netstat(1) and systat(1) to treat the per-CPU stats area as a sparse
  map, in accordance with the above.

This allows us to properly boot with certain CPUs deactivated. However, if
we later decide to re-activate said CPUs, we will barf until we decide to
implement CPU spinon/spinoff callback hooks to allow for said CPUs' per-CPU
containers to get configured on their activation.

Reported by: mjacob
Partially (sys/ diffs) Submitted by: mjacob
2001-07-26 18:47:46 +00:00
fenner
8efe98d859 Don't bother passing p to rtioctl just so it can fail to pass it to mrt_ioctl 2001-07-25 20:15:28 +00:00
roam
a100af4fa8 Make dynamic sysctl entries start at 0x100, not decimal 100 - there are
static entries with oid's over 100, and defining enough dynamic entries
causes an overlap.

Move the "magic" value 0x100 into <sys/sysctl.h> where it belongs.

PR:		29131
Submitted by:	"Alexander N. Kabaev" <kabaev@mail.ru>
Reviewed by:	-arch, -audit
MFC after:	2 weeks
2001-07-25 17:21:18 +00:00
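The overlap fixed in a100af4fa8 comes down to range arithmetic. The sketch below assumes dynamic OIDs are handed out consecutively from the start value; the macro and function names, and the static OID of 120 used in the usage note, are hypothetical.

```c
#define OID_AUTO_START_OLD 100     /* decimal 100: the old first dynamic OID */
#define OID_AUTO_START_NEW 0x100   /* 256: the new first dynamic OID */

/*
 * Returns 1 if handing out ndynamic consecutive OIDs beginning at start
 * would reuse static_oid, 0 otherwise.
 */
int dynamic_range_hits(int start, int ndynamic, int static_oid)
{
    return static_oid >= start && static_oid < start + ndynamic;
}
```

For a static entry numbered 120, thirty dynamic entries starting at decimal 100 collide with it, while the same thirty starting at 0x100 do not.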
roam
3e1e624da7 Style(9): function names on a separate line, max line length 80 chars.
Reviewed by:	-arch, -audit
MFC after:	2 weeks
2001-07-25 17:13:58 +00:00
dd
833e06c1f6 sys/kern/tty_snoop.c is now sys/dev/snp/snp.c.
Repo-copy by:	jdp
2001-07-25 12:06:36 +00:00
assar
20223509f9 correct description of `vpp' for mknod/symlink: they are actually
returned locked
2001-07-24 16:16:00 +00:00
dillon
5064dfdc7c As per further discussions on hackers redo the SIGCHLD patch to not generate
an unexpected user-visible side effect with the sigaction flags.  Also cleanup
a minor union issue.

Submitted by: Rudolf Cejka <cejkar@dcse.fee.vutbr.cz>
MFC addendum: MFC will be combined w/ original commit
MFC after: 3 days
2001-07-22 18:47:31 +00:00
assar
0b09f5187c revert previous commit (bad style and not needed)
Noticed:	bde
2001-07-22 10:24:31 +00:00
assar
cc40e9cd2e add prototype for dosetrlimit 2001-07-22 00:21:19 +00:00
assar
f49d464ff7 add <sys/cdefs.h> (for __unused and such) 2001-07-21 17:12:44 +00:00
jhb
b84bdc8767 Add a missing ~ so that the LO_INITIALIZED flag actually gets turned off
in witness_destroy().
2001-07-20 23:29:25 +00:00
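The effect of the missing ~ in b84bdc8767 is easy to see in isolation. This sketch assumes the bug was a plain AND with the flag where an AND with its complement was intended; the bit value chosen for LO_INITIALIZED is hypothetical.

```c
#define LO_INITIALIZED 0x00010000u   /* hypothetical bit value */

/* Without the complement: keeps only LO_INITIALIZED and drops everything else. */
unsigned int clear_flag_buggy(unsigned int flags)
{
    return flags & LO_INITIALIZED;
}

/* With the complement: turns LO_INITIALIZED off and preserves the other bits. */
unsigned int clear_flag_fixed(unsigned int flags)
{
    return flags & ~LO_INITIALIZED;
}
```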
jlemon
6279f096ec Introduce EVFILT_TIMER, which allows a process to establish an
arbitrary number of timers, both oneshot and periodic.

Repeatedly reminded to commit by: jayanth
Reviewed by: peter (a while back)
2001-07-19 18:34:40 +00:00
kris
7e154a8df4 Don't use kp->arg0 as a format string, grr.
MFC after:	1 week
2001-07-19 02:18:54 +00:00
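The class of bug fixed in 7e154a8df4 is passing caller-controlled text as the format argument. A minimal userland sketch, with a hypothetical function name, of the safe pattern:

```c
#include <stdio.h>

/*
 * Copy an untrusted string through a fixed "%s" format so that any
 * conversion specifiers inside it are treated as plain text.
 * Returns what snprintf() returns: the length the output would have.
 */
int format_untrusted(char *buf, size_t len, const char *arg0)
{
    return snprintf(buf, len, "%s", arg0);
}
```

Calling snprintf(buf, len, arg0) instead would let a string such as "%d%n" be interpreted as directives.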
dd
f582e5317a Keep track of all "struct snoop"'s so that snp_modevent can fail with
EBUSY if there's a device still open.
2001-07-18 13:39:43 +00:00
obrien
610c4dc6f4 Increase NMBCLUSTERS by 4x.
This takes a GENERIC kernel (MAXUSERS=32) from 1536 to 3072.
2001-07-17 15:51:12 +00:00
peter
7c0cabdf7f Move the hints gunk to a separate file. It isn't really part of the
newbus structure (no more than subr_rman.c is anyway).
2001-07-14 08:25:18 +00:00
peter
a67c526396 Go back to having either static OR dynamic hints, with fallback
support.  Trying to fix the merged set where dynamic overrode
static was getting more and more complicated by the day.

This should fix the duplicate atkbd, psm, fd* etc in GENERIC.  (which
panicked the alpha, but not the i386)
2001-07-14 00:23:10 +00:00
dd
9a7a96328c Correct spelling in a comment and remove trailing newline from a
panic() call (panic() adds it itself).
2001-07-11 02:04:43 +00:00
des
1b82a02868 Constify the fstype argument to vfs_mount(). This eliminates at least one
"call discards qualifier" warning (in sys/compat/linux/linux_file.c).
2001-07-09 19:11:51 +00:00
guido
1e615d275f Don't share sig handlers after an exec
Reviewed by:	Alfred Perlstein
2001-07-09 19:01:42 +00:00
guido
e2d79c6113 Get rid of useless bcopy (the next statement was equivalent) 2001-07-09 19:00:08 +00:00
jake
0227d4f3f6 Backout mwakeup, etc. 2001-07-06 01:16:43 +00:00
rwatson
da1a848c61 o Replace calls to p_can(..., P_CAN_xxx) with calls to p_canxxx().
The p_can(...) construct was a premature (and, it turns out,
  awkward) abstraction.  The individual calls to p_canxxx() better
  reflect differences between the inter-process authorization checks,
  such as differing checks based on the type of signal.  This has
  a side effect of improving code readability.
o Replace direct credential authorization checks in ktrace() with
  invocation of p_candebug(), while maintaining the special case
  check of KTR_ROOT.  This allows ktrace() to "play more nicely"
  with new mandatory access control schemes, as well as making its
  authorization checks consistent with other "debugging class"
  checks.
o Eliminate "privused" construct for p_can*() calls which allowed the
  caller to determine if privilege was required for successful
  evaluation of the access control check.  This primitive is currently
  unused, and as such, serves only to complicate the API.

Approved by:	({procfs,linprocfs} changes) des
Obtained from:	TrustedBSD Project
2001-07-05 17:10:46 +00:00
jhb
27372749e2 Spelling fix in a KASSERT: runq_chose -> runq_choose. 2001-07-04 20:00:48 +00:00
dillon
8ff7790b1e cleanup: GIANT macros, rename DEPRECIATE to DEPRECATE
Move p_giant_optional to proc zero'd section
Remove (old) XXX zfree comment in pipe code
2001-07-04 17:11:03 +00:00
dillon
e028603b7e With Alfred's permission, remove vm_mtx in favor of a fine-grained approach
(this commit is just the first stage).  Also add various GIANT_ macros to
formalize the removal of Giant, making it easy to test in a more piecemeal
fashion. These macros will allow us to test fine-grained locks to a degree
before removing Giant, and also after, and to remove Giant in a piecemeal
fashion via sysctl's on those subsystems which the authors believe can
operate without Giant.
2001-07-04 16:20:28 +00:00
dillon
52f62a303c postsig() currently requires Giant to be held. Giant is held properly at
the first postsig() call, but not always held at the second place,
resulting in an occasional panic.
2001-07-04 15:36:30 +00:00
jake
33e85623fa Implement mwakeup, mwakeup_one, cv_signal_drop and cv_broadcast_drop.
These take an additional mutex argument, which is dropped before any
processes are made runnable.  This can avoid contention on the mutex
if the processes would immediately acquire it, and is done in such a
way that wakeups will not be lost.

Reviewed by:	jhb
2001-07-04 00:32:50 +00:00
des
d96592aced Constify the format string.
Submitted by:	Mike Barcroft <mike@q9media.com>
2001-07-03 21:46:43 +00:00
tmm
6dd375961b Make the code to read the kernel message buffer via sysctl machine-
independent and rename the corresponding sysctls from machdep.msgbuf and
machdep.msgbuf_clear (i386 only) to kern.msgbuf and kern.msgbuf_clear.
2001-07-03 19:44:07 +00:00
jhb
69df74a645 Remove spl's in uio_yield() that are covered by the sched_lock. 2001-07-03 15:58:37 +00:00
jhb
74c2e58245 Remove commented-out garbage that skipped updating schedcpu() stats for
ithreads in SWAIT.
2001-07-03 08:03:56 +00:00
jhb
141f7800c8 Just check p_oncpu when determining if a process is executing or not.
We already did this in the SMP case, and it is now maintained in the UP
case as well, and makes the code slightly more readable.  Note that
curproc is always executing, thus the p != curproc test does not need to
be performed if the p_oncpu check is made.
2001-07-03 08:00:57 +00:00
jhb
f8917af0a6 Axe spl's that are covered by the sched_lock (and have been for quite
some time.)
2001-07-03 07:53:35 +00:00
jhb
d5b88d1293 Include the wait message and channel for msleep() in the KTR tracepoint. 2001-07-03 07:39:06 +00:00
jhb
774b040e8a Remove bogus need_resched() of the current CPU in roundrobin().
We don't actually need to force a context switch of the current process.
The act of firing the event triggers a context switch to softclock() and
then switching back out again which is equivalent to a preemption, thus
no further work is needed on the local CPU.
2001-07-03 05:33:09 +00:00
jhb
7bb1f29898 Grab Giant around postsig() since sendsig() can call into the vm to
grow the stack and we already needed Giant for KTRACE.
2001-07-03 05:27:53 +00:00
rwatson
b83ccf3fae o Unfold p31b_proc() into the individual posix4 system calls so as to
allow call-specific authorization.
o Modify the authorization model so that p_can() is used to check
  scheduling get/set events, using P_CAN_SEE for gets, and P_CAN_SCHED
  for sets.  This brings the checks in line with get/setpriority().

Obtained from:	TrustedBSD Project
2001-06-30 07:55:19 +00:00
jhb
cbf193046b Remove the p_spinlocks spin lock count that was obsoleted by the
per-CPU spinlocks list.
2001-06-30 03:35:22 +00:00
rwatson
c35de5d85d Replace some use of 'p' with 'targetp' so as to not scarily overload the
passed 'p' argument.  No functional change.

Obtained from:	USENIX Emporium, Cheap Tricks Department
2001-06-30 03:13:36 +00:00
jhb
dcedf89d63 Make the schedlock saved critical section state a per-thread property. 2001-06-30 03:11:26 +00:00
jhb
cbc88996c6 Move ast() and userret() to sys/kern/subr_trap.c now that they are MI. 2001-06-29 19:51:37 +00:00
jhb
d82893e676 Add a new MI pointer to the process' trapframe p_frame instead of using
various differently named pointers buried under p_md.

Reviewed by:	jake (in principle)
2001-06-29 11:10:41 +00:00
jhb
cc8833dfe9 Grab Giant around trap_pfault() for now. 2001-06-29 04:18:10 +00:00
jlemon
ac6b9aa8ec Fix up indentation. 2001-06-29 04:01:38 +00:00
rwatson
7c1e143aa8 Remove a fascinating but confusing construct involving chaining
conditional clauses in the following way:

	(0 || a || b);

No functional change.
2001-06-28 23:02:09 +00:00
rwatson
fc39072773 Add error checking for copyin() operations in posix4 scheduling code. 2001-06-28 22:53:42 +00:00
jhb
e77dfdc28f Don't check witness assertions if the lock doesn't use witness or witness
is dead.
2001-06-28 22:22:20 +00:00
jhb
34fab2d86c - Fix a mntvnode and vnode interlock reversal.
- Protect the mnt_vnode list with the mntvnode lock.
2001-06-28 04:05:54 +00:00
jhb
97220dbd1e - Add trylock variants of shared and exclusive locks.
- The sx assertions don't actually need the internal sx mutex lock, so
  don't bother doing so.
- Add a new assertion SX_ASSERT_LOCKED() that asserts that either a
  shared or exclusive lock should be held.  This assertion should be used
  instead of SX_ASSERT_SLOCKED() in almost all cases.
- Adjust some KASSERT()'s to include file and line information.
- Use the new witness_assert() function in the WITNESS case for sx slock
  asserts to verify that the current thread actually owns a slock.
2001-06-27 06:39:37 +00:00
jhb
e58c0d25fb - Add a new witness_assert() to perform arbitrary locking assertions.
- Clean up the KTR tracepoints to be slightly more consistent and useful.
- Fix a bug in WITNESS where we would recurse indefinitely and blow the
  stack when acquiring Giant after sleeping with a sleepable lock held.

Reported by:	tanimura (3)
2001-06-27 06:27:29 +00:00
jhb
f4029cf62b - Always use the proc lock of the task leader to protect the peers list of
processes.
- Don't construct fake call args and then call kill().  psignal is not
  any more complicated and is quicker and not prone to locking problems.
  Calling psignal() avoids having to do a pfind() since we already have a
  proc pointer and also allows us to keep the task leader locked while we
  kill all the peer processes so the list is kept coherent.
- When a kthread exits, do a wakeup() on its proc pointers.  This can be
  used by kernel modules that have kthreads and want to ensure they have
  safely exited before completing the MOD_UNLOAD event.

Connectivity provided by:	Usenix wireless
2001-06-27 06:15:44 +00:00
jhb
b9fab7d0d4 - Move the 'clk' spinlock below other spin locks since KTR trace events
may need the clock lock for nanotime().
- Add KTR trace events for lock list manipulations and other witness
  operations.
- Use a temporary variable instead of setting the lock list head directly
  and then setting up the links to add a new lock list entry to the lock
  list.  This small race could result in witness "forgetting" about all
  the locks held by this process temporarily during an interrupt.
- Close a more fatal race condition when removing a lock from a list.
  Removing a lock from the list entails both decrementing the count of
  items in this bucket as well as shuffling items in the current bucket up
  a notch to replace the gap left by the removed item.  Wrap these
  operations in a critical section.
2001-06-25 23:17:52 +00:00
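The temporary-variable change in b9fab7d0d4 follows a general pattern: finish linking a new entry before making it the list head. The struct and function below are hypothetical simplifications, and real kernel code would also need whatever ordering guarantees the platform requires, which this sketch omits.

```c
#include <stddef.h>

/* Hypothetical, simplified lock list entry. */
struct lock_list_entry {
    struct lock_list_entry *ll_next;
    int ll_count;
};

/*
 * Link the new entry through a temporary and publish it as the head only
 * as the last step, so code that interrupts this sequence never sees a
 * head whose ll_next has not been set yet.
 */
void lock_list_push(struct lock_list_entry **head, struct lock_list_entry *fresh)
{
    struct lock_list_entry *lle = fresh;

    lle->ll_next = *head;   /* chain to the existing entries first */
    lle->ll_count = 0;
    *head = lle;            /* publish last */
}
```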
jhb
46a0597e74 - Replace the unused KTR_IDLELOOP trace class with a new KTR_WITNESS trace
class to trace witness events.
- Make the ktr_cpu field of ktr_entry be a standard field rather than one
  present only in the KTR_EXTEND case.
- Move the default definition of KTR_ENTRIES from sys/ktr.h to
  kern/kern_ktr.c.  It has not been needed in the header file since KTR
  was un-inlined.
- Minor include cleanup in kern/kern_ktr.c.
- Fiddle with the ktr_cpumask in ktr_tracepoint() to disable KTR events
  on the current CPU while we are processing an event.
- Set the current CPU inside of the critical section to ensure we don't
  migrate CPU's after the critical section but before we set the CPU.
2001-06-25 23:09:31 +00:00
jhb
832e922fdf - Sort includes.
- Count the context switches during shutdown when we give ithreads a chance
  to run as voluntary context switches.

Submitted by:	bde (2)
2001-06-25 18:30:42 +00:00
jhb
2660b97507 Count the context switch when blocking on a mutex as a voluntary context
switch.  Count the context switch when preempting the current thread to let
a higher priority thread blocked on a mutex we just released run as an
involuntary context switch.

Reported by:	bde
2001-06-25 18:29:32 +00:00
jhb
fdfd5d01a7 Count the switch when an ithread goes idle as a voluntary context switch.
Submitted by:	bde
2001-06-25 18:27:33 +00:00
dwmalone
79a843a087 Don't dereference a NULL pointer if we fail to get a sendfilebuf. 2001-06-24 12:27:30 +00:00
dillon
f8016646a9 After exhaustive discussions and some meandering and confusion, enough
people are on track with the cause and effect of this, and although
fixing this severely degenerate case appears to violate the letter of
POSIX.1-200x, Bruce and I (and enough others) agree that it should be
committed.

So, this patch generates an ENOENT error for any attempt to do a path lookup
through an empty symlink (e.g. open(), stat()).

Submitted by: "Andrey A. Chernov" <ache@nagual.pp.ru>
Reviewed by: bde
Discussed exhaustively on: freebsd-current
Previously committed to: NetBSD 4 years ago
2001-06-24 05:24:41 +00:00
jhb
dfa68807f4 - Lock CURSIG() with the proc lock to close the signal race with psignal.
- Grab Giant around ktrace points.
- Clean up KTR_PROC tracepoints to not display the value of
  sched_lock.mtx_lock as it isn't really needed anymore and just obfuscates
  the messages.
- Add a few if conditions to replace gotos.
- Ensure that every msleep KTR event ends up with a matching msleep resume
  KTR event (this was broken when we didn't do a mi_switch()).
- Only note via ktrace that we resumed from a switch once rather than twice
  in several places in msleep().
- Remove spl's from asleep and await as the proc lock and sched_lock provide
  all the needed locking.
- In mawait() add in a needed ktrace point for noting that we are about to
  switch out.
2001-06-22 23:11:26 +00:00
jhb
ae99243f0b - Lock CURSIG with the proc lock and don't release the proc lock until
  after grabbing the sched lock to close a race.
- Lock ktrace points with Giant.
2001-06-22 23:06:38 +00:00
jhb
e5e16e09ad - Grab the proc lock around CURSIG and postsig(). Don't release the proc
  lock until after grabbing the sched_lock to avoid CURSIG racing with
  psignal.
- Don't grab Giant for addupc_task() as it isn't needed.

Reported by:	tegge (signal race), bde (addupc_task a while back)
2001-06-22 23:05:11 +00:00
jhb
8210b8d106 - Change CURSIG() and postsig() to require that the proc lock is held
  rather than grabbing it and releasing it themselves.  This allows callers
  of these functions to get the lock to close race conditions.
- Grab Giant around ktrace in postsig.
- Count the switches performed on SIGSTOP's as involuntary context switches
  in the resource usage stats.

Reported by:	tegge (signal race), bde (missing csw stats)
2001-06-22 23:02:37 +00:00
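The voluntary and involuntary context switch counts that these commits adjust are the ones exposed to userland in the resource usage stats. As a minimal sketch, assuming a POSIX system whose struct rusage provides ru_nvcsw and ru_nivcsw, a process can read its own counters as follows (read_csw() is a hypothetical helper name, not something from the tree):

```c
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: fetch this process's voluntary and involuntary
 * context switch counts via getrusage(2).
 * Returns 0 on success, -1 if getrusage() fails.
 */
static int
read_csw(long *vol, long *invol)
{
	struct rusage ru;

	if (getrusage(RUSAGE_SELF, &ru) != 0)
		return (-1);
	*vol = ru.ru_nvcsw;	/* switches where the process gave up the CPU */
	*invol = ru.ru_nivcsw;	/* switches forced on the process */
	return (0);
}
```

Which events land in which counter is decided by the kernel; the helper only reports the totals.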
mjacob
95a162e88f int -> size_t fix
2001-06-22 19:54:38 +00:00
mjacob
4127a25756 Temporary fix, at least: define NCPU_PRESENT, which will be mp_ncpus for
SMP kernels, one (1) for non-SMP.
2001-06-22 16:03:23 +00:00
pirzyk
773adf0e44 changed hostid from long to unsigned long to be able to store values > 2GB
on i386 platforms.  Also changed SYSCTL type from INT to ULONG and removed
comment about it.

PR:		kern/21132
MFC after:	1 month
2001-06-22 16:03:14 +00:00
bmilekic
5d710b296b Introduce numerous SMP friendly changes to the mbuf allocator. Namely,
introduce a modified allocation mechanism for mbufs and mbuf clusters; one
which can scale under SMP and which offers the possibility of resource
reclamation to be implemented in the future. Notable advantages:

 o Reduce contention for SMP by offering per-CPU pools and locks.
 o Better use of data cache due to per-CPU pools.
 o Much less code cache pollution due to excessively large allocation macros.
 o Framework for `grouping' objects from same page together so as to be able
   to possibly free wired-down pages back to the system if they are no longer
   needed by the network stacks.

 Additional things changed with this addition:

  - Moved some mbuf specific declarations and initializations from
    sys/conf/param.c into mbuf-specific code where they belong.
  - m_getclr() has been renamed to m_get_clrd() because the old name is really
    confusing. m_getclr() HAS been preserved though and is defined to the new
    name. No tree sweep has been done "to change the interface," as the old
    name will continue to be supported and is not deprecated. The change was
    merely done because m_getclr() sounds too much like "m_get a cluster."
  - TEMPORARILY disabled mbtypes statistics displaying in netstat(1) and
    systat(1) (see TODO below).
  - Fixed systat(1) to display number of "free mbufs" based on new per-CPU
    stat structures.
  - Fixed netstat(1) to display new per-CPU stats based on sysctl-exported
    per-CPU stat structures. All information is fetched via sysctl.

 TODO (in order of priority):

  - Re-enable mbtypes statistics in both netstat(1) and systat(1) after
    introducing an SMP friendly way to collect the mbtypes stats under the
    already introduced per-CPU locks (i.e. hopefully don't use atomic() - it
    seems too costly for a mere stat update, especially when other locks are
    already present).
  - Optionally have systat(1) display not only "total free mbufs" but also
    "total free mbufs per CPU pool."
  - Fix minor length-fetching issues in netstat(1) related to recently
    re-enabled option to read mbuf stats from a core file.
  - Move reference counters at least for mbuf clusters into an unused portion
    of the cluster itself, to save space and avoid the need to allocate a counter.
  - Look into introducing resource freeing possibly from a kproc.

Reviewed by (in parts): jlemon, jake, silby, terry
Tested by: jlemon (Intel & Alpha), mjacob (Intel & Alpha)
Preliminary performance measurements: jlemon (and me, obviously)
URL: http://people.freebsd.org/~bmilekic/mb_alloc/
2001-06-22 06:35:32 +00:00
jhb
092b28a542 Fix some lock order reversals where we called free() while holding a proc
lock.  We now use temporary variables to save the process argument pointer
and just update the pointer while holding the lock.  We then perform the
free on the cached pointer after releasing the lock.
2001-06-20 23:10:06 +00:00
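The pattern described in the commit above (swap the pointer under the lock, free the old buffer only after unlocking) can be sketched in userland C. This is an illustration under stated assumptions, not the kernel code: struct toy_proc and toy_proc_set_args() are hypothetical names, and a pthread mutex stands in for the proc lock.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userland stand-in for a process with lock-protected args. */
struct toy_proc {
	pthread_mutex_t	 lock;
	char		*args;
};

/*
 * Replace p->args with newargs.  Only the pointer update happens while
 * the lock is held; the old buffer is freed after the lock is released,
 * so free() is never called with the lock held.
 * Returns 1 if an old buffer was freed, 0 if there was none.
 */
static int
toy_proc_set_args(struct toy_proc *p, char *newargs)
{
	char *old;

	pthread_mutex_lock(&p->lock);
	old = p->args;		/* cache the old pointer */
	p->args = newargs;	/* just update the pointer under the lock */
	pthread_mutex_unlock(&p->lock);

	free(old);		/* free(NULL) is a no-op */
	return (old != NULL);
}
```

Keeping the allocator call outside the locked region means the allocator's own locks are never acquired while the object lock is held, which is what avoids the reversal.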
bmilekic
70d52016a3 Change m_devget()'s outdated and unused `offset' argument to actually mean
something: offset into the first mbuf of the target chain before copying
the source data over.

Make drivers that call m_devget() with a first argument of "data - ETHER_ALIGN"
use the offset argument to pass ETHER_ALIGN in instead. The way it was previously
done is potentially dangerous if the source data was at the top of a page
and the offset caused the previous page to be copied (if the
previous page has not yet been appropriately mapped).

The old `offset' argument in m_devget() is not used anywhere (it's always
0) and dates back to ~1995 (and earlier?) when support for ethernet trailers
existed. With that support gone, it was merely collecting dust.

Tested on alpha by: jlemon
Partially submitted by: jlemon
Reviewed by: jlemon
MFC after: 3 weeks
2001-06-20 19:48:35 +00:00
jhb
f466ba0f8e Preemption by an interrupt thread is an involuntary switch, not a voluntary
one.

Pointy-hat to:	me
2001-06-20 18:26:41 +00:00
des
8a223ad3ce Constify (silence warnings introduced by last commit to sys/module.h)
2001-06-20 16:08:45 +00:00
wollman
204b9a8a22 After one too many PRs on the subject, bite the bullet and define IOV_MAX
and its associated constants.  Implement _SC_IOV_MAX in the usual way.
Be a bit sloppy about the namespace question; this should get cleared up
in time for 5.0.

MFC after:	1 month
2001-06-18 20:24:54 +00:00
jhb
1852c17085 Lock Giant in postsig() for the KTRACE case as ktrpsig() needs Giant when
it writes out to the trace file.

Reported by:	peter, gallatin, and others
2001-06-18 19:23:43 +00:00
brian
a08eb62a0d Add linker_reference_module().
This function loads a module if required, otherwise bumps the reference
count -- the opposite of linker_file_unload().
2001-06-18 15:09:33 +00:00
brian
59c2ccba3b Don't remove the SI_CHEAPCLONE for unsupported minors
2001-06-18 09:22:30 +00:00
peter
e05ff7e2d6 Move setugid() a little sooner to before we release tracing in case
crdup() or change_e*id() block on malloc() or a mutex.
2001-06-16 23:34:23 +00:00
peter
38ecd59e07 Add INTR_TYPE_AV so that we can get to the PI_AV priority in the ithread
handlers.  This is beneficial since it means that pcm's MPSAFE handler
can get run before things that will block on Giant in the shared irq
case.
2001-06-16 22:42:19 +00:00
jlemon
d115ce425b Fix warnings:
112: warning: cast to pointer from integer of different size
125: warning: cast to pointer from integer of different size
2001-06-16 07:02:47 +00:00
jlemon
0dbb10c226 Correctly hook up the write kqfilter to pipes.
Submitted by:  Niels Provos <provos@citi.umich.edu>
2001-06-15 20:45:01 +00:00
peter
51d35ea75c Fix some warnings in kern_environment.c. Make the getenv*() family
take a const 'name', since they don't modify anything.
159: warning: passing arg 1 of `getenv_int' discards qualifiers...
167: warning: passing arg 1 of `getenv' discards qualifiers from pointer..
2001-06-15 07:29:17 +00:00
peter
17e3eb1d7f As per comments in sys/linker_set.h:
BANG! BANG! BANG! BANG! BANG! BANG! CLICK! CLICK! CLICK! CLICK! CLICK!
<reload>
BANG! BANG! BANG! BANG! BANG! BANG! CLICK! CLICK! CLICK! CLICK! CLICK!
2001-06-14 01:28:56 +00:00
peter
f10fa038c1 With this commit, I hereby pronounce gensetdefs past its use-by date.
Replace the a.out emulation of 'struct linker_set' with something
a little more flexible.  <sys/linker_set.h> now provides macros for
accessing elements and completely hides the implementation.

The linker_set.h macros have been on the back burner in various
forms since 1998 and include ideas and code from Mike Smith (SET_FOREACH()),
John Polstra (ELF clue) and myself (cleaned up API and the conversion
of the rest of the kernel to use it).

The macros declare a strongly typed set.  They return elements with the
type that you declare the set with, rather than a generic void *.

For ELF, we use the magic ld symbols (__start_<setname> and
__stop_<setname>).  Thanks to Richard Henderson <rth@redhat.com> for the
trick about how to force ld to provide them for kld's.

For a.out, we use the old linker_set struct.

NOTE: the item lists are no longer null terminated.  This is why
the code impact is high in certain areas.

The runtime linker has a new method to find the linker set
boundaries depending on which backend format is in use.

Linker sets are still module/kld unfriendly and should never be used
for anything that may be modular one day.

Reviewed by:	eivind
2001-06-13 10:58:39 +00:00
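The ELF mechanism mentioned in the commit above, where ld provides __start_<setname> and __stop_<setname>, can be sketched in userland C. This is not the <sys/linker_set.h> API: TOY_SET_ADD(), struct toy_handler and the "toy_set" section name are hypothetical, and the sketch assumes GCC or Clang with a GNU-compatible linker on an ELF target.

```c
#include <stddef.h>

/* Hypothetical element type for the set; not from the FreeBSD tree. */
struct toy_handler {
	const char	*name;
	int		 prio;
};

/*
 * Place a pointer to `sym' in the ELF section "toy_set".  The linker
 * synthesizes __start_toy_set and __stop_toy_set for a section whose
 * name is a valid C identifier (assumption: GNU-compatible ld).
 */
#define	TOY_SET_ADD(sym)					\
	static struct toy_handler *toy_set_ptr_##sym		\
	    __attribute__((section("toy_set"), used)) = &sym

extern struct toy_handler *__start_toy_set[];
extern struct toy_handler *__stop_toy_set[];

static struct toy_handler h_a = { "a", 1 };
static struct toy_handler h_b = { "b", 2 };
TOY_SET_ADD(h_a);
TOY_SET_ADD(h_b);

/* The set is bounded by two symbols; it is not NULL-terminated. */
static size_t
toy_set_count(void)
{
	return ((size_t)(__stop_toy_set - __start_toy_set));
}

/* Walk the set with typed pointers instead of a generic void *. */
static int
toy_set_prio_sum(void)
{
	struct toy_handler **pp;
	int sum = 0;

	for (pp = __start_toy_set; pp < __stop_toy_set; pp++)
		sum += (*pp)->prio;
	return (sum);
}
```

Because iteration stops at the stop symbol rather than at a NULL entry, code that used to scan for a terminator has to change, which matches the NOTE in the commit message.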
peter
a97b956712 Patch up a blunder I made a few days ago. nmbcnt was being initialized
too late.

Noted by:      bmilekic
Pointy-hat to: peter
2001-06-13 00:36:41 +00:00
peter
bbbe8875f0 Hints overhaul:
- Replace some very poorly thought out API hacks that should have been
  fixed a long while ago.
- Provide some much more flexible search functions (resource_find_*())
- Use strings for storage instead of an outgrowth of the rather
  inconvenient temporary ioconf table from config().  We already had a
  fallback to using strings before malloc/vm was running anyway.
2001-06-12 09:40:04 +00:00
des
3463e6d056 Rename nextpid to lastpid and externalize it.
2001-06-11 21:54:19 +00:00
des
7da6da146f Blah, I cut out a tad too much in the previous commit. (thanks again, Jake!)
2001-06-11 18:43:32 +00:00
des
b21baf0f69 copyin(9) doesn't return ENAMETOOLONG. (thanks, Jake!)
2001-06-11 18:36:18 +00:00
des
86b7e548ab Add sbuf_copyin(). Also add 'b' variants of sbuf_{cat,copyin,cpy}() which
ignore NUL bytes in the source string.
2001-06-11 17:05:52 +00:00
ume
832f8d2249 Sync with recent KAME.
This work is based on kame-20010528-freebsd43-snap.tgz, plus fixes for
some critical problems found after the snap was released.
There are many, many changes since the last KAME merge.

TODO:
  - The definitions of SADB_* in sys/net/pfkeyv2.h are still different
    from the RFC2407/IANA assignment because of binary compatibility
    issues.  This should be fixed under 5-CURRENT.
  - The ip6po_m member of struct ip6_pktopts is no longer used, but it
    is still there because of binary compatibility issues.  It should
    be removed under 5-CURRENT.

Reviewed by:	itojun
Obtained from:	KAME
MFC after:	3 weeks
2001-06-11 12:39:29 +00:00
dwmalone
46ac202c04 Try to make the setting of the SIGCHLD handler the same as setting of
the NOCLDWAIT flag. SUSv2 seems to require this.

Submitted by:	Cejka Rudolf <cejkar@dcse.fee.vutbr.cz>
Reviewed by:	dillon
2001-06-11 09:15:41 +00:00
des
23c38e4e7c sbuf_new(9) now returns a struct sbuf * instead of an int. If the caller
does not provide a struct sbuf, sbuf_new(9) will allocate one and return
a pointer to it.
2001-06-10 15:48:04 +00:00
peter
4b91e2ecf0 "Fix" the previous initial attempt at fixing TUNABLE_INT(). This time
around, use a common function for looking up and extracting the tunables
from the kernel environment.  This saves duplicating the same function
over and over again.  This way typically has an overhead of 8 bytes + the
path string, versus about 26 bytes + the path string.
2001-06-08 05:24:21 +00:00
peter
c1df44ae51 Back out part of my previous commit. This was a last minute change
and I botched testing.  This is a perfect example of how NOT to do
this sort of thing. :-(
2001-06-07 03:17:26 +00:00
tmm
0e4c21f7c1 Fix an instance of NDINIT in the extattrctl syscall: LOCKLEAF was or'ed
to the operation parameter, not to the flags as it should be.

Reviewed by:	rwatson
2001-06-06 23:34:38 +00:00
peter
0732738ec4 Make the TUNABLE_*() macros look and behave more consistently like the
SYSCTL_*() macros.  TUNABLE_INT_DECL() was an odd name because it didn't
actually declare the int, which is what the name suggests it would do.
2001-06-06 22:17:08 +00:00
jhb
df7d2486bc We don't need to hold a lock just to test a flag.
2001-06-06 22:05:48 +00:00
ru
d698c5d44e Unbreak setregid(2).
Spotted by:	Alexander Leidinger <Alexander@Leidinger.net>
2001-06-06 13:58:03 +00:00
jhb
7a4f835060 Don't hold sched_lock across addupc_task().
Reported by:	David Taylor <davidt@yadt.co.uk>
Submitted by:	bde
2001-06-06 00:57:24 +00:00
dd
1c7d10ac21 Add a line discipline close routine which restores some functionality
I accidentally nuked in rev. 1.54.  Also rework the error handling in
snplwrite a little.
2001-06-05 05:07:53 +00:00
dd
c35e39a5cb Style and cosmetic cleanups. This driver is now reasonably style(9)
compliant.  All the variable definitions and function names are
reasonably consistent, and the functions which should be static (i.e.,
all of them) are.  Other assorted fixes were made.  The majority of
the delta is indentation fixes.

Partially reviewed by:	bde
2001-06-05 05:00:17 +00:00
dd
b646de742c Use the l_nullioctl exported from tty_conf.c rather than rolling our own.
2001-06-04 23:31:21 +00:00
dd
c6d2a1e6f9 Unstaticize l_nullioctl; it is needed elsewhere (like in tty_snoop.c).
Suggested by:	bde
2001-06-04 23:30:47 +00:00
dillon
7f9e532290 The pipe_write() code was locking the pipe without busying it first in
certain cases, and a close() by another process could potentially rip the
pipe out from under the (blocked) locking operation.

Reported-by: Alexander Viro <viro@math.psu.edu>
2001-06-04 04:04:45 +00:00
dd
eaa7d6fe18 Remove unused includes, use *min() inline functions rather than a
home-grown macro, rewrite a confusing conditional in snpdevtotty(),
and change ibuf to 512 bytes instead of 1024 bytes in dsnwrite().

Reviewed by:	bde
2001-06-03 05:17:39 +00:00
dd
e9e92e57f1 When trying to find out if this is a request for a write in
kernel_sysctl and userland_sysctl, check for whether new is NULL, not
whether newlen is 0.  This allows one to set a string sysctl to "".
2001-06-03 04:58:51 +00:00
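The difference between the two write checks in the commit above can be shown with a small sketch. The function names here are hypothetical and this is not the kernel_sysctl()/userland_sysctl() code; it only contrasts a length-based test with a pointer-based one for a request that sets a string to "".

```c
#include <stddef.h>

/* Old-style test: treat the request as a write only if newlen is nonzero. */
static int
is_write_by_len(const void *newp, size_t newlen)
{
	(void)newp;
	return (newlen != 0);
}

/* New-style test: treat the request as a write if a new value is supplied. */
static int
is_write_by_ptr(const void *newp, size_t newlen)
{
	(void)newlen;
	return (newp != NULL);
}
```

With a non-NULL pointer and a length of 0, only the pointer-based test classifies the request as a write, which is the case of setting a string sysctl to the empty string.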