Commit Graph

3074 Commits

Author SHA1 Message Date
tegge
17a384d63d Don't skip IOAPIC id conflict detection when only one pci bus is present.
PR:		20312
Reviewed by:	Steve Roome <steve@sse0691.bri.hp.com>
2000-08-10 17:33:24 +00:00
tegge
48216e937c Don't set flags on the mount structure before all permission checks have
been done.

Don't allow multiple mount operations with MNT_UPDATE at the same
time on the same mount point.  When the first mount operation
completed, MNT_UPDATE was cleared in the mount structure, causing
the second to complete as if it was a no-update mount operation
with the following bad side effects:

        - mount structure inserted multiple times onto the mountlist
        - vp->v_mountedhere incorrectly set, causing next namei
          operation walking into the mountpoint to crash with
          a locking against myself panic.

Plug a vnode leak in case vinvalbuf fails.
2000-08-09 01:57:11 +00:00
rwatson
5796a4e746 o Introduce vn_extattr_{get,set}, wrapper routines for VOP_GETEXTATTR
and VOP_SETEXTATTR to simplify calling from in-kernel consumers,
  such as capability code.  Both accept a vnode (optionally locked,
  with ioflg to indicate that), attribute name, and a buffer + buffer
  length in UIO_SYSSPACE.  Both authorize the call as a kernel request,
  with cred set to NULL for the actual VOP_ calls.

Obtained from:	TrustedBSD Project
2000-08-08 17:15:32 +00:00
jlemon
54a240077c Make the kqueue socket read filter honor the SO_RCVLOWAT value.
Spotted by:  "Steve M." <stevem@redlinenetworks.com>
2000-08-07 17:52:08 +00:00
jlemon
7adbb50a65 Fix bug with timeout; previously, when attempting to poll the kqueue by
passing a zero-valued timeout, the code would always sleep for one tick.
Change code to avoid calling tsleep if we have no intention of sleeping.

Bring in bugfix from sys_select.c, r1.60 which also applies here.

Modify error handling slightly; passing in an invalid fd will now result
in EBADF returned in the eventlist, while an attempt to change a knote
which does not exist will result in ENOENT being returned.  Previously
such attempts would fail silently without notification.

Pointed out by: nicolas.leonard@animaths.com
	        Rick Reed (rr@yahoo-inc.com)
2000-08-07 16:45:42 +00:00
ps
da80423007 Change the behavior of isa_nmi to log an error message instead of
panicing and return a status so that we can decide whether to drop
into DDB or panic.  If the status from isa_nmi is true, panic the
kernel based on machdep.panic_on_nmi, otherwise if DDB is
enabled, drop to DDB based on machdep.ddb_on_nmi.

Reviewed by:	peter, phk
2000-08-06 14:17:21 +00:00
tegge
6482b4b4ae Be more verbose when changing APIC ID on an IO APIC.
Don't allow cpu entries in the MP table to contain APIC IDs out of range.

Don't write outside array boundaries if an IO APIC entry in the MP table
contains an APIC ID out of range.

Assign APIC IDs for all IO APICs according to section 3.6.6 in the
Intel MP spec:

  - If the current APIC ID on an IO APIC doesn't conflict with other
    IO APICs or CPUs, that APIC ID should be used.  The copy of the MP
    table must be updated if the corresponding APIC ID in the MP table
    is different.

  - If the current APIC ID was in conflict with other units, the
    corresponding APIC ID specified in the MP table is checked for conflict.

  - If a conflict is still found then fall back to using a new unique ID.
    The copy of the MP table must be updated.

  - IDs out of range is considered to be in conflict.

During these operations, the IO_TO_ID array cannot be used, since any
conflict would have caused information loss.  The array is then corrected,
since all APIC ID conflicts should have been resolved.

PR:	20312, 18919
2000-08-06 00:04:03 +00:00
hsu
fcc89d6b53 Modify to use fixed STAILQ_LAST().
Reviewed by:	dfr
2000-08-03 16:37:46 +00:00
peter
96310cd256 Fix self referential dependencies. eg: uhub was packaged along with
usb, all in usb.ko.  uhub depends on usb.  The bug was that the preload
processing only adds a module to the list once it's internal dependencies
are resolved... Since it was not "seeing" the internal usb module it
believed that uhub had a missing dependency.
2000-08-02 21:08:53 +00:00
peter
6150943fa2 Fix the SYSINIT() bubble sort. This was fixed in kern_linker.c already. 2000-08-02 21:05:21 +00:00
jlemon
150675e8e0 Back out rev 1.12; its not clear that this is the right thing to do,
and in any event, it wasn't done correctly in the first place.
2000-08-01 04:27:50 +00:00
luoqi
d581367e0d Handle write page faults (both write only or read-modify-write) as MI vm
write-only faults.  This would allow write-only mmapped regions to function
correctly.
2000-07-31 14:47:14 +00:00
alfred
f67a37429c mbstat should be a read-only sysctl.
Submitted by: Bosko Milekic <bmilekic@dsuper.net>
2000-07-31 09:24:32 +00:00
ps
7b8483cbd4 Remove unnecessary call to splnet when setting an accept filter
since we are already at splnet.
2000-07-31 08:23:43 +00:00
peter
1c0a93ea3f Regen. (Fix SYS_exit) 2000-07-29 10:07:38 +00:00
peter
b1f887e12b Sigh. Fix SYS_exit problems. I misunderstood the significance of these
trailing options.
2000-07-29 10:05:25 +00:00
ps
95bcc23c70 Remove this file incase of further confusion. 2000-07-29 04:09:07 +00:00
peter
02daa405ff Regenerate with makesyscalls.sh 2000-07-29 00:21:50 +00:00
peter
2a565e00c7 Change the 'exit()' system call to 'sys_exit()'. This avoids overlapping
gcc's internal exit() prototypes and the (futile) hackery that we did to
try and avoid warnings.  main() was renamed for similar reasons.
Remove an exit related hack from makesyscalls.sh.
2000-07-29 00:16:28 +00:00
peter
faf406d83f Fix the #ifdef VFS_AIO to not compile a whole bunch of unused stuff in the
!VFS_AIO case.  Lots of things have hooks into here (kqueue, exit(),
 sockets, etc), I elected to keep the external interfaces the same
 rather than spread more #ifdefs around the kernel.
2000-07-28 23:10:10 +00:00
peter
46eae76d70 Fix a const related warning. 2000-07-28 22:41:56 +00:00
peter
e8abcf8745 Fix some style nits.
Fix(?) some compile warnings regarding const handling.
2000-07-28 22:40:04 +00:00
peter
47a2c27651 Fix warnings - make kevent args in comment match those in syscalls.master.
Deal with consts.
2000-07-28 22:32:25 +00:00
peter
1cfef28233 Fix a warning that has been annoying me for some time:
"kern/sys_generic.c:358: warning: cast discards qualifiers from pointer
   target type"
The idea for using the uintptr_t intermediate cast for de-constifying
a pointer was hinted at by bde some time ago.
2000-07-28 22:17:42 +00:00
rwatson
9655b72041 o Modify extattr_{set,get}() syscalls so that partial reads and writes
with an error condition such as EINTR, EWOULDBLOCK, and ERESTART,
  are reported to the application, not silently conceal.  This
  behavior was copied from the {read,write}v() syscalls, and is
  appropriate there but not here.
o Correct a bug in extattr_delete() wherein the LOCKLEAF flag is
  passed to the wrong argument in namei(), resulting in some
  unexpected errors during name resolution, and passing in an unlocked
  vnode.

Obtained from:	TrustedBSD Project
2000-07-28 19:52:38 +00:00
jlemon
fa6c46e45d Have kevent() automatically restart if interrupted by a signal. If this
is not desired, then the user can register an EV_SIGNAL filter to
explicitly catch a signal event.

Change requested by: jayanth, ps, peter
		     "Why is kevent non-restartable after a signal?"
2000-07-27 23:06:14 +00:00
green
259b9bfa7c Distinguish between whether ktraceing was enabled before an IO
operation or after it.  If the ktrace operation was enabled while the
process was blocked doing IO, the race would allow it to pass down
invalid (uninitialized) data and panic later down the call stack.
2000-07-27 03:45:18 +00:00
rwatson
538da8db3b o Lock vnode before calling extattr_* VOP's, and modify vnode spec to
allow for that.
o Remember to call NDFREE() if exiting as a result of a failed
  vn_start_write() when snapshotting.

Reviewed by:	mckusick
Obtained from:	TrustedBSD Project
2000-07-26 20:29:20 +00:00
mckusick
e7f2e618d2 Now that buffer locks can be recursive, we need to delete the panics
that complain about them.

Obtained from:	Brian Fundakowski Feldman <green@FreeBSD.org>
2000-07-25 18:28:46 +00:00
mckusick
3574b288ff Do not need vrele(nd.ni_vp) as that is done by NDFREE(&nd, 0);
Submitted by:	Peter Holm <pho@freebsd.org>
2000-07-25 05:38:54 +00:00
rwatson
cec293382a o Add missing function return types from capability syscall call stubs,
fix compiler warning.

Submitted by:	jake
2000-07-25 03:37:36 +00:00
mckusick
29497f482d This patch corrects the first round of panics and hangs reported
with the new snapshot code.

Update addaliasu to correctly implement the semantics of the old
checkalias function. When a device vnode first comes into existence,
check to see if an anonymous vnode for the same device was created
at boot time by bdevvp(). If so, adopt the bdevvp vnode rather than
creating a new vnode for the device. This corrects a problem which
caused the kernel to panic when taking a snapshot of the root
filesystem.

Change the calling convention of vn_write_suspend_wait() to be the
same as vn_start_write().

Split out softdep_flushworklist() from softdep_flushfiles() so that
it can be used to clear the work queue when suspending filesystem
operations.

Access to buffers becomes recursive so that snapshots can recursively
traverse their indirect blocks using ffs_copyonwrite() when checking
for the need for copy on write when flushing one of their own indirect
blocks. This eliminates a deadlock between the syncer daemon and a
process taking a snapshot.

Ensure that softdep_process_worklist() can never block because of a
snapshot being taken. This eliminates a problem with buffer starvation.

Cleanup change in ffs_sync() which did not synchronously wait when
MNT_WAIT was specified. The result was an unclean filesystem panic
when doing forcible unmount with heavy filesystem I/O in progress.

Return a zero'ed block when reading a block that was not in use at
the time that a snapshot was taken. Normally, these blocks should
never be read. However, the readahead code will occationally read
them which can cause unexpected behavior.

Clean up the debugging code that ensures that no blocks be written
on a filesystem while it is suspended. Snapshots must explicitly
label the blocks that they are writing during the suspension so that
they do not cause a `write on suspended filesystem' panic.

Reorganize ffs_copyonwrite() to eliminate a deadlock and also to
prevent a race condition that would permit the same block to be
copied twice. This change eliminates an unexpected soft updates
inconsistency in fsck caused by the double allocation.

Use bqrelse rather than brelse for buffers that will be needed
soon again by the snapshot code. This improves snapshot performance.
2000-07-24 05:28:33 +00:00
green
737dc36706 Using an atomic operation here won't help if nobody else uses them (for
this).  Use the simple_lock() on v_interlock like elsewhere.
2000-07-23 22:19:49 +00:00
green
18d8b90030 Solve the problem where it is possible to get the kernel stuck in
a loop down in pmap_init_pt().  A subtraction causes the number of
pages to become negative, that was assigned to an unsigned variable,
and there is a lot of iteration.  The bug is due to the ELF image
activator not properly checking for its files being the correct size
as specified by the ELF header.

The solution is to check that the header doesn't ask for part of a
file when that part of the file doesn't exist.  Make sure to set
VEXEC at the proper times to make the executables immutable (remove
race conditions).  Also, the ELF format specifiies header entries
that allow embedding of other executables (hence how ld-elf.so.1
gets loaded, but not the same as loading shared libraries), so those
executables need to be set VEXEC, too, so they're immutable.

Reviewed by:	peter
2000-07-23 06:49:46 +00:00
alfred
7040933f3d only allow accept filter modifications on listening sockets
Submitted by: ps
2000-07-20 12:17:17 +00:00
alfred
391d3d72d3 disallow unload until we do proper refcounting 2000-07-20 12:12:41 +00:00
jlemon
c1f32b071a Fix a bug which would cause some knotes to get lost when two kqueues
were being used in a process at the same time.

Test case provided by:  Chris Peiffer <peifferc@CS.Stanford.EDU>
2000-07-18 21:41:47 +00:00
jlemon
f48355f12d Simplify kqueue API slightly.
Discussed on:	-arch
2000-07-18 19:31:52 +00:00
peter
6d3418a541 Patch up some bogons in the resource_find() vs resource_find_hard()
interfaces.  The original resource_find() returned a pointer to an internal
resource table entry.  resource_find_hard() dereferences the actual
passed in value (oops!) - effectively trashing random memory due to
the pointer being passed in with a random initial value.

Submitted by:  bde
2000-07-18 06:08:27 +00:00
abial
df5f740ffd These patches implement dynamic sysctls. It's possible now to add
and remove sysctl oids at will during runtime - they don't rely on
linker sets. Also, the node oids can be referenced by more than
one kernel user, which means that it's possible to create partially
overlapping trees.

Add sysctl contexts to help programmers manage multiple dynamic
oids in convenient way.

Please see the manpages for detailed discussion, and example module
for typical use.

This work is based on ideas and code snippets coming from many
people, among them:  Arun Sharma, Jonathan Lemon, Doug Rabson,
Brian Feldman, Kelly Yancey, Poul-Henning Kamp and others. I'd like
to specially thank Brian Feldman for detailed review and style
fixes.

PR:		kern/16928
Reviewed by:	dfr, green, phk
2000-07-15 10:26:04 +00:00
alfred
88d0a1dcfe Make mbstat.m_mtypes seperate and viewable via sysctl, also
expand the size from short to ulong

Submitted by: Ian Dowse <iedowse@maths.tcd.ie>
PR: kern/19809
2000-07-15 06:02:48 +00:00
ps
108d097180 Change the way NMI's are handled. Before, if DDB was enabled and
a NMI occured, you could type continue in DDB and the kernel would
not attempt to detect what type of NMI was recieved.  Now we check
for the type of NMI first and then go to DDB if it is enabled.

This will solve the problem with having DDB enabled and getting an
NMI due to some possibly bad error and being able to continue the
operation of the kernel when you really want to panic and know
what happened.

Submitted by:	jhb
2000-07-14 11:49:44 +00:00
rwatson
6785b9fc0e o Commit two of two, introducing __cap_{get,set}_{fd,file} syscalls to
modify capability sets on files.

Obtained from:	TrustedBSD Project
2000-07-13 20:38:52 +00:00
rwatson
a22c3d2c40 o Introduce syscall prototypes, stubs for __cap_{get,set}_{fd,file},
syscalls to manage capability sets on files.  First of two commits.

Obtained from:	TrustedBSD Project
2000-07-13 20:31:24 +00:00
jhb
03c6be8070 For infinite timeouts, set both the tv_sec and tv_usec fields to zero in
poll() and select().

Noticed by:	Wesley Morgan <morganw@chemicals.tacorp.com>
2000-07-13 02:12:25 +00:00
jhb
817d166391 Fix a very obscure bug in select() and poll() where the timeout would
never expire if poll() or select() was called before the system had been
in multiuser for 1 second.  This was caused by only checking to see if
tv_sec was zero rather than checking both tv_sec and tv_usec.
2000-07-12 22:46:40 +00:00
itojun
11dc2aeee2 remove m_pulldown statistics, which is highly experimental and does not
belong to *bsd-merged tree
2000-07-12 16:39:13 +00:00
mckusick
78cc524a14 Add snapshots to the fast filesystem. Most of the changes support
the gating of system calls that cause modifications to the underlying
filesystem. The gating can be enabled by any filesystem that needs
to consistently suspend operations by adding the vop_stdgetwritemount
to their set of vnops. Once gating is enabled, the function
vfs_write_suspend stops all new write operations to a filesystem,
allows any filesystem modifying system calls already in progress
to complete, then sync's the filesystem to disk and returns. The
function vfs_write_resume allows the suspended write operations to
begin again. Gating is not added by default for all filesystems as
for SMP systems it adds two extra locks to such critical kernel
paths as the write system call. Thus, gating should only be added
as needed.

Details on the use and current status of snapshots in FFS can be
found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness
is not included here. Unless and until you create a snapshot file,
these changes should have no effect on your system (famous last words).
2000-07-11 22:07:57 +00:00
bp
a29f0566db Correct SYSINIT execution order in the case when KLD contains more
than one SYSINIT with the same 'subsystem' id and different 'order' id.

Reviewed by:	peter
2000-07-09 23:58:56 +00:00
green
a665cf0e11 Remove two micro-pessimizations I made. Bruce is teaching me well :)
KTRPOINT(p, KTR_GENIO) is more uncommon than error == 0, so it should
be first in the && statement.
2000-07-07 22:11:37 +00:00