3060 Commits

Author SHA1 Message Date
peter
09f2cc343d Regen. (Fix SYS_exit) 2000-07-29 10:07:38 +00:00
peter
2acd9c62a7 Sigh. Fix SYS_exit problems. I misunderstood the significance of these
trailing options.
2000-07-29 10:05:25 +00:00
ps
7588cd1d2a Remove this file incase of further confusion. 2000-07-29 04:09:07 +00:00
peter
564c126846 Regenerate with makesyscalls.sh 2000-07-29 00:21:50 +00:00
peter
b273253c9e Change the 'exit()' system call to 'sys_exit()'. This avoids overlapping
gcc's internal exit() prototypes and the (futile) hackery that we did to
try and avoid warnings.  main() was renamed for similar reasons.
Remove an exit related hack from makesyscalls.sh.
2000-07-29 00:16:28 +00:00
peter
87a4a5f84e Fix the #ifdef VFS_AIO to not compile a whole bunch of unused stuff in the
!VFS_AIO case.  Lots of things have hooks into here (kqueue, exit(),
 sockets, etc), I elected to keep the external interfaces the same
 rather than spread more #ifdefs around the kernel.
2000-07-28 23:10:10 +00:00
peter
32619ababb Fix a const related warning. 2000-07-28 22:41:56 +00:00
peter
059e198f75 Fix some style nits.
Fix(?) some compile warnings regarding const handling.
2000-07-28 22:40:04 +00:00
peter
203d398e48 Fix warnings - make kevent args in comment match those in syscalls.master.
Deal with consts.
2000-07-28 22:32:25 +00:00
peter
40fc2e8bd3 Fix a warning that has been annoying me for some time:
"kern/sys_generic.c:358: warning: cast discards qualifiers from pointer
   target type"
The idea for using the uintptr_t intermediate cast for de-constifying
a pointer was hinted at by bde some time ago.
2000-07-28 22:17:42 +00:00
rwatson
bf8742e138 o Modify extattr_{set,get}() syscalls so that partial reads and writes
with an error condition such as EINTR, EWOULDBLOCK, and ERESTART,
  are reported to the application, not silently conceal.  This
  behavior was copied from the {read,write}v() syscalls, and is
  appropriate there but not here.
o Correct a bug in extattr_delete() wherein the LOCKLEAF flag is
  passed to the wrong argument in namei(), resulting in some
  unexpected errors during name resolution, and passing in an unlocked
  vnode.

Obtained from:	TrustedBSD Project
2000-07-28 19:52:38 +00:00
jlemon
a794883a96 Have kevent() automatically restart if interrupted by a signal. If this
is not desired, then the user can register an EV_SIGNAL filter to
explicitly catch a signal event.

Change requested by: jayanth, ps, peter
		     "Why is kevent non-restartable after a signal?"
2000-07-27 23:06:14 +00:00
green
340f659647 Distinguish between whether ktraceing was enabled before an IO
operation or after it.  If the ktrace operation was enabled while the
process was blocked doing IO, the race would allow it to pass down
invalid (uninitialized) data and panic later down the call stack.
2000-07-27 03:45:18 +00:00
rwatson
97118d11db o Lock vnode before calling extattr_* VOP's, and modify vnode spec to
allow for that.
o Remember to call NDFREE() if exiting as a result of a failed
  vn_start_write() when snapshotting.

Reviewed by:	mckusick
Obtained from:	TrustedBSD Project
2000-07-26 20:29:20 +00:00
mckusick
b4eb939d51 Now that buffer locks can be recursive, we need to delete the panics
that complain about them.

Obtained from:	Brian Fundakowski Feldman <green@FreeBSD.org>
2000-07-25 18:28:46 +00:00
mckusick
d439206673 Do not need vrele(nd.ni_vp) as that is done by NDFREE(&nd, 0);
Submitted by:	Peter Holm <pho@freebsd.org>
2000-07-25 05:38:54 +00:00
rwatson
ea88a1acb4 o Add missing function return types from capability syscall call stubs,
fix compiler warning.

Submitted by:	jake
2000-07-25 03:37:36 +00:00
mckusick
acc66855bf This patch corrects the first round of panics and hangs reported
with the new snapshot code.

Update addaliasu to correctly implement the semantics of the old
checkalias function. When a device vnode first comes into existence,
check to see if an anonymous vnode for the same device was created
at boot time by bdevvp(). If so, adopt the bdevvp vnode rather than
creating a new vnode for the device. This corrects a problem which
caused the kernel to panic when taking a snapshot of the root
filesystem.

Change the calling convention of vn_write_suspend_wait() to be the
same as vn_start_write().

Split out softdep_flushworklist() from softdep_flushfiles() so that
it can be used to clear the work queue when suspending filesystem
operations.

Access to buffers becomes recursive so that snapshots can recursively
traverse their indirect blocks using ffs_copyonwrite() when checking
for the need for copy on write when flushing one of their own indirect
blocks. This eliminates a deadlock between the syncer daemon and a
process taking a snapshot.

Ensure that softdep_process_worklist() can never block because of a
snapshot being taken. This eliminates a problem with buffer starvation.

Cleanup change in ffs_sync() which did not synchronously wait when
MNT_WAIT was specified. The result was an unclean filesystem panic
when doing forcible unmount with heavy filesystem I/O in progress.

Return a zero'ed block when reading a block that was not in use at
the time that a snapshot was taken. Normally, these blocks should
never be read. However, the readahead code will occationally read
them which can cause unexpected behavior.

Clean up the debugging code that ensures that no blocks be written
on a filesystem while it is suspended. Snapshots must explicitly
label the blocks that they are writing during the suspension so that
they do not cause a `write on suspended filesystem' panic.

Reorganize ffs_copyonwrite() to eliminate a deadlock and also to
prevent a race condition that would permit the same block to be
copied twice. This change eliminates an unexpected soft updates
inconsistency in fsck caused by the double allocation.

Use bqrelse rather than brelse for buffers that will be needed
soon again by the snapshot code. This improves snapshot performance.
2000-07-24 05:28:33 +00:00
green
3dd3bc9aca Using an atomic operation here won't help if nobody else uses them (for
this).  Use the simple_lock() on v_interlock like elsewhere.
2000-07-23 22:19:49 +00:00
green
1297ef2f5c Solve the problem where it is possible to get the kernel stuck in
a loop down in pmap_init_pt().  A subtraction causes the number of
pages to become negative, that was assigned to an unsigned variable,
and there is a lot of iteration.  The bug is due to the ELF image
activator not properly checking for its files being the correct size
as specified by the ELF header.

The solution is to check that the header doesn't ask for part of a
file when that part of the file doesn't exist.  Make sure to set
VEXEC at the proper times to make the executables immutable (remove
race conditions).  Also, the ELF format specifiies header entries
that allow embedding of other executables (hence how ld-elf.so.1
gets loaded, but not the same as loading shared libraries), so those
executables need to be set VEXEC, too, so they're immutable.

Reviewed by:	peter
2000-07-23 06:49:46 +00:00
alfred
a4c6219207 only allow accept filter modifications on listening sockets
Submitted by: ps
2000-07-20 12:17:17 +00:00
alfred
fb96eed996 disallow unload until we do proper refcounting 2000-07-20 12:12:41 +00:00
jlemon
f793ebc479 Fix a bug which would cause some knotes to get lost when two kqueues
were being used in a process at the same time.

Test case provided by:  Chris Peiffer <peifferc@CS.Stanford.EDU>
2000-07-18 21:41:47 +00:00
jlemon
7171047204 Simplify kqueue API slightly.
Discussed on:	-arch
2000-07-18 19:31:52 +00:00
peter
bf473790e8 Patch up some bogons in the resource_find() vs resource_find_hard()
interfaces.  The original resource_find() returned a pointer to an internal
resource table entry.  resource_find_hard() dereferences the actual
passed in value (oops!) - effectively trashing random memory due to
the pointer being passed in with a random initial value.

Submitted by:  bde
2000-07-18 06:08:27 +00:00
abial
c7bf2569fa These patches implement dynamic sysctls. It's possible now to add
and remove sysctl oids at will during runtime - they don't rely on
linker sets. Also, the node oids can be referenced by more than
one kernel user, which means that it's possible to create partially
overlapping trees.

Add sysctl contexts to help programmers manage multiple dynamic
oids in convenient way.

Please see the manpages for detailed discussion, and example module
for typical use.

This work is based on ideas and code snippets coming from many
people, among them:  Arun Sharma, Jonathan Lemon, Doug Rabson,
Brian Feldman, Kelly Yancey, Poul-Henning Kamp and others. I'd like
to specially thank Brian Feldman for detailed review and style
fixes.

PR:		kern/16928
Reviewed by:	dfr, green, phk
2000-07-15 10:26:04 +00:00
alfred
19218742c7 Make mbstat.m_mtypes seperate and viewable via sysctl, also
expand the size from short to ulong

Submitted by: Ian Dowse <iedowse@maths.tcd.ie>
PR: kern/19809
2000-07-15 06:02:48 +00:00
ps
b436e36344 Change the way NMI's are handled. Before, if DDB was enabled and
a NMI occured, you could type continue in DDB and the kernel would
not attempt to detect what type of NMI was recieved.  Now we check
for the type of NMI first and then go to DDB if it is enabled.

This will solve the problem with having DDB enabled and getting an
NMI due to some possibly bad error and being able to continue the
operation of the kernel when you really want to panic and know
what happened.

Submitted by:	jhb
2000-07-14 11:49:44 +00:00
rwatson
1a2da9c569 o Commit two of two, introducing __cap_{get,set}_{fd,file} syscalls to
modify capability sets on files.

Obtained from:	TrustedBSD Project
2000-07-13 20:38:52 +00:00
rwatson
d8fbcbd787 o Introduce syscall prototypes, stubs for __cap_{get,set}_{fd,file},
syscalls to manage capability sets on files.  First of two commits.

Obtained from:	TrustedBSD Project
2000-07-13 20:31:24 +00:00
jhb
7b33546c33 For infinite timeouts, set both the tv_sec and tv_usec fields to zero in
poll() and select().

Noticed by:	Wesley Morgan <morganw@chemicals.tacorp.com>
2000-07-13 02:12:25 +00:00
jhb
a378d2d97e Fix a very obscure bug in select() and poll() where the timeout would
never expire if poll() or select() was called before the system had been
in multiuser for 1 second.  This was caused by only checking to see if
tv_sec was zero rather than checking both tv_sec and tv_usec.
2000-07-12 22:46:40 +00:00
itojun
a0611dac11 remove m_pulldown statistics, which is highly experimental and does not
belong to *bsd-merged tree
2000-07-12 16:39:13 +00:00
mckusick
a3d0c189ea Add snapshots to the fast filesystem. Most of the changes support
the gating of system calls that cause modifications to the underlying
filesystem. The gating can be enabled by any filesystem that needs
to consistently suspend operations by adding the vop_stdgetwritemount
to their set of vnops. Once gating is enabled, the function
vfs_write_suspend stops all new write operations to a filesystem,
allows any filesystem modifying system calls already in progress
to complete, then sync's the filesystem to disk and returns. The
function vfs_write_resume allows the suspended write operations to
begin again. Gating is not added by default for all filesystems as
for SMP systems it adds two extra locks to such critical kernel
paths as the write system call. Thus, gating should only be added
as needed.

Details on the use and current status of snapshots in FFS can be
found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness
is not included here. Unless and until you create a snapshot file,
these changes should have no effect on your system (famous last words).
2000-07-11 22:07:57 +00:00
bp
b25a59d797 Correct SYSINIT execution order in the case when KLD contains more
than one SYSINIT with the same 'subsystem' id and different 'order' id.

Reviewed by:	peter
2000-07-09 23:58:56 +00:00
green
efb639324d Remove two micro-pessimizations I made. Bruce is teaching me well :)
KTRPOINT(p, KTR_GENIO) is more uncommon than error == 0, so it should
be first in the && statement.
2000-07-07 22:11:37 +00:00
green
f3eed69a97 Change that &@!$# UIO_READ to be UIO_WRITE. I tested the ktrace stuff,
but somehow... pass the pointy hat, again!
2000-07-07 21:52:15 +00:00
bp
8a86977869 Fix support for more than 256 simultaneous mounts. Theoretical limit
is 2^16 mounts per fs type.

Reported by:	Troy Arie Cobb <tcobb@staff.circle.net> via phk
Reviewed by:	bde
2000-07-07 14:01:08 +00:00
jhb
b6e74b58eb Support for unsigned integer and long sysctl variables. Update the
SYSCTL_LONG macro to be consistent with other integer sysctl variables
and require an initial value instead of assuming 0.  Update several
sysctl variables to use the unsigned types.

PR:		15251
Submitted by:	Kelly Yancey <kbyanc@posi.net>
2000-07-05 07:46:41 +00:00
imp
a04753a272 End two weeks of on and off debugging. Fix the crash on the Nth
insertion of a CF card, for random values of N > 1.  With these fixes,
I've been able to do 100 insert/remove of the cards w/o a crash with
lots of system activity going on that in the past would help trigger
the crash.

The problem:

FreeBSD creates dev_t's on the fly as they are needed and never
destroys them.  These dev_t's point to a struct disk that is used for
housekeeping on the disk.  When a device goes away, the struct disk
pointer becomes a dangling pointer.  Sometimes when the device comes
back, the pointer will point to the new struct disk (in which case the
insertion will work).  Other times it won't (especially if any length
of time has passed, since it is dependent on memory returned from
malloc).

The Fix:

There is one of these dev_t's that is always correct.  The
device for the WHOLE_DISK_SLICE is always right.  It gets set at
create_disk() time.  So, the fix is to spend a little CPU time and
lookup the WHOLE_DISK_SLICE dev_t and use the si_disk from that in
preference to the one that's in the device asking to do the I/O.  In
addition, we change the test of si_disk == NULL meaning that the dev
needed to inherit properties from the pdev to dev->si_disk !=
pdev->si_disk.  This test is a little stronger than the previous test,
but can sometimes be fooled into not inheriting.  However, the results
of this fooling are that the old values will be used, which will
generally always be the same as before.  si_drv[12] are the only
values that are copied that might pose a problem.  They tend to change
as the si_disk field would change, so it is a hole, but it is a small
hole.

One could correctly argue that one should replace much of this code
with something much much better.  I would be on the pro side of that
argument.

Reviewed by: phk (who also ported the original patch to current)
Sponsored by: Timing Solutions
2000-07-05 06:01:33 +00:00
itojun
5f4e854de1 sync with kame tree as of july00. tons of bug fixes/improvements.
API changes:
- additional IPv6 ioctls
- IPsec PF_KEY API was changed, it is mandatory to upgrade setkey(8).
  (also syntax change)
2000-07-04 16:35:15 +00:00
phk
e5de271d47 Previous commit changing SYSCTL_HANDLER_ARGS violated KNF.
Pointed out by:	bde
2000-07-04 11:25:35 +00:00
mckusick
2f0e9591fa Simplify and rationalise the management of the vnode free list
(preparing the code to add snapshots).
2000-07-04 04:32:40 +00:00
mckusick
040e64cd97 Move the truncation code out of vn_open and into the open system call
after the acquisition of any advisory locks. This fix corrects a case
in which a process tries to open a file with a non-blocking exclusive
lock. Even if it fails to get the lock it would still truncate the
file even though its open failed. With this change, the truncation
is done only after the lock is successfully acquired.

Obtained from:	 BSD/OS
2000-07-04 03:34:11 +00:00
mckusick
806786489f If a buffer flush fails when trying to reclaim a vnode, it is too
late to save the vnode, so just toss any remaining unwritten buffers
rather than leaving them lying around to make trouble in the future.
2000-07-04 03:23:29 +00:00
mckusick
c3aca60a5b Update tags directive to reflect the new location of soft updates
and the reorganization of the eisa directory.
2000-07-04 00:18:43 +00:00
phk
2a91a9dd04 Make the two calls from kern/* into softupdates #ifdef SOFTUPDATES,
that is way cleaner than using the softupdates_stub stunt, which
should be killed when convenient.

Discussed with:	mckusick
2000-07-03 13:26:54 +00:00
phk
32127fca4e Add device_set_softc() which does the obvious.
Not objected to by:	dfr
2000-07-03 13:06:29 +00:00
phk
61ff05be25 Style police catches up with rev 1.26 of src/sys/sys/sysctl.h:
Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grog our
sources:

        -sysctl_vm_zone SYSCTL_HANDLER_ARGS
        +sysctl_vm_zone (SYSCTL_HANDLER_ARGS)
2000-07-03 09:35:31 +00:00
chris
a51c1232f7 Instead of just blindly setting -rw-rw-rw-:
o Set access mode to -r--r--r-- if SS_CANTRCVMORE is set and the receive
  buffer is empty.

o Set access mode to --w--w--w- is SS_CANTSENDMORE is set.

Discussed with:	alfred
2000-07-02 23:56:45 +00:00