Commit Graph

23095 Commits

Author SHA1 Message Date
dfr
0766263be1 Convert various calls to splhigh() to disable_intr() since splhigh() is
now a no-op.
2000-11-19 12:28:42 +00:00
dfr
be0be4286a We don't need <stddef.h> for offsetof() any more. 2000-11-19 12:26:14 +00:00
jake
2fe8cb6795 - Protect the callout wheel with a separate spin mutex, callout_lock.
- Use the mutex in hardclock to ensure no races between it and
  softclock.
- Make softclock be INTR_MPSAFE and provide a flag,
  CALLOUT_MPSAFE, which specifies that a callout handler does not
  need giant.  There is still no way to set this flag when
  regstering a callout.

Reviewed by:	-smp@, jlemon
2000-11-19 06:02:32 +00:00
dillon
54d2fd2d6a Implement a low-memory deadlock solution.
Removed most of the hacks that were trying to deal with low-memory
    situations prior to now.

    The new code is based on the concept that I/O must be able to function in
    a low memory situation.  All major modules related to I/O (except
    networking) have been adjusted to allow allocation out of the system
    reserve memory pool.  These modules now detect a low memory situation but
    rather then block they instead continue to operate, then return resources
    to the memory pool instead of cache them or leave them wired.

    Code has been added to stall in a low-memory situation prior to a vnode
    being locked.

    Thus situations where a process blocks in a low-memory condition while
    holding a locked vnode have been reduced to near nothing.  Not only will
    I/O continue to operate, but many prior deadlock conditions simply no
    longer exist.

Implement a number of VFS/BIO fixes

	(found by Ian): in biodone(), bogus-page replacement code, the loop
        was not properly incrementing loop variables prior to a continue
        statement.  We do not believe this code can be hit anyway but we
        aren't taking any chances.  We'll turn the whole section into a
        panic (as it already is in brelse()) after the release is rolled.

	In biodone(), the foff calculation was incorrectly
        clamped to the iosize, causing the wrong foff to be calculated
        for pages in the case of an I/O error or biodone() called without
        initiating I/O.  The problem always caused a panic before.  Now it
        doesn't.  The problem is mainly an issue with NFS.

	Fixed casts for ~PAGE_MASK.  This code worked properly before only
        because the calculations use signed arithmatic.  Better to properly
        extend PAGE_MASK first before inverting it for the 64 bit masking
        op.

	In brelse(), the bogus_page fixup code was improperly throwing
        away the original contents of 'm' when it did the j-loop to
        fix the bogus pages.  The result was that it would potentially
        invalidate parts of the *WRONG* page(!), leading to corruption.

	There may still be cases where a background bitmap write is
        being duplicated, causing potential corruption.  We have identified
        a potentially serious bug related to this but the fix is still TBD.
        So instead this patch contains a KASSERT to detect the problem
  	and panic the machine rather then continue to corrupt the filesystem.
	The problem does not occur very often..  it is very hard to
	reproduce, and it may or may not be the cause of the corruption
	people have reported.

Review by: (VFS/BIO: mckusick, Ian Dowse <iedowse@maths.tcd.ie>)
Testing by: (VM/Deadlock) Paul Saab <ps@yahoo-inc.com>
2000-11-18 23:06:26 +00:00
dillon
7f80666dcb Add the splvm()'s suggested in PR 20609 to protect vm_pager_page_unswapped().
The remainder of the PR is still open.

PR: kern/20609 (partial fix)
2000-11-18 21:11:23 +00:00
dillon
1f9b705709 This patchset fixes a large number of file descriptor race conditions.
Pre-rfork code assumed inherent locking of a process's file descriptor
    array.  However, with the advent of rfork() the file descriptor table
    could be shared between processes.  This patch closes over a dozen
    serious race conditions related to one thread manipulating the table
    (e.g. closing or dup()ing a descriptor) while another is blocked in
    an open(), close(), fcntl(), read(), write(), etc...

PR: kern/11629
Discussed with: Alexander Viro <viro@math.psu.edu>
2000-11-18 21:01:04 +00:00
dwmalone
edc447e3da Further use of M_ZERO.
Submitted by:	josh@zipperup.org
Submitted by:	Robert Drehmel <robd@gmx.net>
Approved by:	msmith
2000-11-18 15:21:22 +00:00
dwmalone
88173fb55b Add the use of M_ZERO to netgraph.
Submitted by:	josh@zipperup.org
Submitted by:	Robert Drehmel <robd@gmx.net>
Submitted by:	archie
Approved by:	archie
2000-11-18 15:17:43 +00:00
sos
269c27ccd5 Fix a braino .. 2000-11-18 12:14:35 +00:00
cg
aea0875b7b do not blindly assume 8khz is supported on open(). try for 8khz but respect
minspeed/maxspeed specified by the hw driver.

Submitted by:	Andrew Gordon <arg@arg1.demon.co.uk>
2000-11-18 03:43:04 +00:00
bp
72a7d1a37c Use vop_defaultop() instead of ntfs_bypass().
PR:		kern/22756
2000-11-18 02:47:12 +00:00
jhb
16af7af349 Release sched_lock very briefly to give interrupts a chance to fire if we
are in softclock() for a long time.  The old code already did an
splx()/slphigh() pair here, I just missed adding in the equivalent mutex
operations on sched_lock earlier.
2000-11-18 00:21:00 +00:00
tegge
fb67b72282 Don't attempt to cluster write buffers where the VMIO flag isn't set. 2000-11-17 23:40:08 +00:00
des
0f986d8077 Make sure we don't cross stripe boundaries when reviving striped plexes.
This makes crash recovery work for stripe sizes that are not multiples of
DEFAULT_REVIVE_BLOCKSIZE (currently 64 kB).
While we're here, fix a few cosmetic nits.

Reviewed by:	grog
Sponsored by:	Enitel ASA (http://www.enitel.no/)
2000-11-17 23:40:01 +00:00
obrien
67b277d61a Fix the `make -jX' (X>1) breakage.
Based on patch submitted by:	Makoto MATSUSHITA <matusita@jp.freebsd.org>
Reviewed by:	marcel, bde
2000-11-17 21:25:15 +00:00
jake
87763ac61f - Split the run queue and sleep queue linkage, so that a process
may block on a mutex while on the sleep queue without corrupting
it.
- Move dropping of Giant to after the acquire of sched_lock.

Tested by:	John Hay <jhay@icomtek.csir.co.za>
		jhb
2000-11-17 18:09:18 +00:00
jhb
883a12555a - Change extra sanity checks in cpu_switch() to be conditional on INVARIANTS
instead of DIAGNOSTIC.
- Remove the p_wchan check as it no longer applies since a process may be
  switched out during CURSIG() within msleep() or mawait().
- Remove an extra sanity check only needed during the early SMPng work.
2000-11-17 17:37:43 +00:00
ru
77559ae015 mdoc(7) police: use certified section headers wherever possible. 2000-11-17 11:44:16 +00:00
msmith
5b5c1d2cff The default kernel filename is "kernel" again, not "kernel.ko".
Submitted by:	mckusick
2000-11-17 04:43:56 +00:00
msmith
aada6e0c0d Add the 'gdt' and 'gdtd' devices for the ICP Vortex RAID controller family. 2000-11-17 01:36:34 +00:00
brian
e4b0b74d23 Go back to using data_len in struct ngpppoe_init_data after discussions
with Julian and Archie.

Implement a new ``sizedstring'' parse type for dealing with field pairs
consisting of a uint16_t followed by a data field of that size, and use
this to deal with the data_len and data fields.

Written by:		Archie with some input by me
Agreed in principle by:	julian
2000-11-16 23:14:53 +00:00
jhb
50cf76d922 The recent changes to msleep() and mawait() resulted in timeout() and
untimeout() not being called with Giant in those functions.  For now,
use the sched_lock to protect the callout wheel in softclock() and in
the various timeout and callout functions.

Noticed by:	tegge
2000-11-16 21:20:52 +00:00
wpaul
b8340b08bf When checking the device code in the probe routine, leave the chip in
16-bit mode. Technically, pcn_probe() is destructive because once the
chip goes into 32-bit mode, the only way to get it out again is a
hardware reset. And once the device is in 32-bit mode, the lnc driver
won't be able to talk to it. So if pcn_probe() is called before the
lnc probe routine, and pcn_probe() rejects the chip as one it doesn't
support, the lnc driver will be SOL.

I don't like this. I think it's a design flaw that you can't switch
the chip out of 32-bit mode once it's selected. The only 'right'
solution is for the pcn driver to support all of the PCI devices
in 32-bit mode, however I don't have samples of all the PCnet series
cards for testing.
2000-11-16 19:56:09 +00:00
archie
01e77282a8 Add kernel option NETGRAPH_ONE2MANY. 2000-11-16 16:59:26 +00:00
imp
bbf0481274 vx is now optional rather than taking a count. Reflect that in the
files.  Also a minor white space nit.

Submitted by: bde
2000-11-16 15:16:41 +00:00
sos
3cf70fd719 Put the probe verboseness behind bootverbose 2000-11-16 10:52:00 +00:00
archie
449bf4e5ca New netgraph node type ng_one2many(4). 2000-11-16 05:58:33 +00:00
jhb
e414c361ec Don't release and acquire Giant in mi_switch(). Instead, release and
acquire Giant as needed in functions that call mi_switch().  The releases
need to be done outside of the sched_lock to avoid potential deadlocks
from trying to acquire Giant while interrupts are disabled.

Submitted by:	witness
2000-11-16 02:16:44 +00:00
gallatin
332bb46095 remove redundant declaration of bsd_to_linux_sigset()
reviewed by: marcel
2000-11-16 02:08:40 +00:00
gallatin
7a1d229a0b fix glaring bugs in rt signals -- copyout the right signal mask in
linux_rt_sendsig() and restore the same signal mask linux does
in rt_sigreturn().  This gets us saving/restoring all 64-bits of the
linux sigset_t in rt signals.

Reviewed by: marcel
2000-11-16 02:07:05 +00:00
jhb
3c1c213868 Argh, add in a missing release of the sched_lock. 2000-11-16 01:16:54 +00:00
jhb
534ed28202 CURSIG() calls functions that acquire sleep mutexes, so it is not a good
idea to be holding the sched_lock while we are calling it.  As such,
release sched_lock before calling CURSIG() in msleep() and mawait() and
reacquire it after CURSIG() returns.

Submitted by:	witness
2000-11-16 01:07:19 +00:00
gallatin
ae7541c2af Use the linux_connect() on alpha rather than passing directly through
to our native connect().  This is required to deal with the differences
in the way linux handles connects on non-blocking sockets.

This gets the private beta of the Compaq Linux/alpha JDK working
on FreeBSD/alpha

Approved by: marcel
2000-11-16 01:05:53 +00:00
gallatin
5a6d9ce7f2 make the fcntl() flags match what the linux/alpha port uses, not
what linux/i386 uses
2000-11-16 00:58:07 +00:00
jhb
e8689cafdb - Rename await() to mawait(). mawait() is to await() as msleep() is to
tsleep().  Namely, mawait() takes an extra argument which is a mutex
  to drop when going to sleep.  Just as with msleep(), if the priority
  argument includes the PDROP flag, then the mutex will be dropped and will
  not be reacquired when the process wakes up.
- Add in a backwards compatible macro await() that passes in NULL as the
  mutex argument to mawait().
2000-11-15 22:39:35 +00:00
jhb
b73beb3c1c - Replace a KASSERT() that knew too much about mutex internals with a
mtx_assert() that ensures the mutex we release during msleep() is both
  not recursed and owned by the current process.
2000-11-15 22:30:48 +00:00
jhb
84b861923d - Convert references from tsleep() -> msleep()
- Fix a buglet in a comment above await()
2000-11-15 22:27:38 +00:00
jhb
cd70252fab - Add a new macro DROP_GIANT_NOSWITCH() that is similar to DROP_GIANT()
except that it uses the MTX_NOSWITCH flag while it releases Giant via
  mtx_exit().
- Add a mtx_recursed() primitive.  This primitive should only be used on
  a mutex owned by the current process.  It will return non-zero if the
  mutex is recursively owned, or zero otherwise.
- Add two new flags MA_RECURSED and MA_NOTRECURSED that can be used in
  conjuction with MA_OWNED to control the assertion checked by mtx_assert().
- Fix some of the KTR tracepoint strings to use %p when displaying the lock
  field of a mutex, which is a uintptr_t.
2000-11-15 22:12:33 +00:00
jhb
46bedae524 Include the right headers to get the DDB #define and the db_active variable. 2000-11-15 22:08:16 +00:00
jhb
e7a07f6938 - Replace some instances of sched_ithd with sched_swi in KTR tracepoints.
- Assert that Giant is not owned during the main loop of sithd_loop().
2000-11-15 22:05:23 +00:00
jhb
4521d2db26 Assert that Giant is not owned during the main loop of ithd_loop(). 2000-11-15 22:03:26 +00:00
jhb
dd822f99af Declare the 'witness_spin_check' properly as a per-CPU variable in the
non-SMP case.
2000-11-15 22:02:05 +00:00
jhb
eeb4b81a8f Don't perform witness checks in witness_enter() during a panic. 2000-11-15 22:00:31 +00:00
jhb
d9686a3c43 Add the 'witness_spin_check' per-CPU variable. 2000-11-15 21:58:02 +00:00
jhb
829c374088 - Don't acquire/release Giant during an interrupt context for machine
checks, clock interrupts, and device interrupts.
- Assert that Giant is not owned during the main loop of ithd_loop().
2000-11-15 21:56:50 +00:00
jhb
1ad20a84e9 Make ktr_verbose a bit more useful:
- On SMP systems display the cpu number with each message
- If ktr_verbose > 1, then include the filename and line number with each
  trace message
2000-11-15 21:51:53 +00:00
mckusick
88da189228 Bug fix for revision 1.14 on the replacement of CIRCLEQ with TAILQ.
Submitted by:	Warner Losh <imp@village.org>
2000-11-15 20:07:16 +00:00
jhb
48f3724161 Fix all the interrupt enabled/disabled assertions which were backwards. 2000-11-15 19:45:10 +00:00
jhb
81c335bfdd Don't perform an mi_switch() when we release Giant during cpu_exit(). We
are about to call cpu_switch() anyways.

Found by:	witness
2000-11-15 19:44:38 +00:00
mckusick
e8e8186149 In preparation for deprecating CIRCLEQ macros in favor of TAILQ
macros which provide the same functionality and are a bit more
efficient, convert use of CIRCLEQ's in netgraph PPP code to TAILQ's.

Reviewed by:	Archie Cobbs <archie@dellroad.org>
2000-11-15 19:40:34 +00:00