Commit Graph

2585 Commits

Author SHA1 Message Date
Gary Palmer
0625ba2fc3 Add Id strings 1999-06-17 23:42:45 +00:00
Nick Hibma
6979450036 Update the comments on values than can be returned by DEVICE_PROBE.
DEVICE_PROBE can return priorities.

Reviewed by:	Doug Rabson <dfr@nlsystems.com>
1999-06-17 19:22:12 +00:00
Bruce Evans
212dfe6fc7 Fixed a missing userland dev_t to kernel dev_t conversion. 1999-06-17 07:07:55 +00:00
Julian Elischer
4e1b754078 Reformat comment to match indentation of code around it. 1999-06-17 01:25:25 +00:00
Kirk McKusick
f9c8cab591 Add a vnode argument to VOP_BWRITE to get rid of the last vnode
operator special case. Delete special case code from vnode_if.sh,
vnode_if.src, umap_vnops.c, and null_vnops.c.
1999-06-16 23:27:55 +00:00
Dmitrij Tejblum
71ddfdbbd5 Make sure syscall arguments properly aligned in ktrace records.
Make syscall return value a register_t.

Based on a patch from Hidetoshi Shimokawa.
Mostly reviewed by:	Hidetoshi Shimokawa and Bruce Evans.
1999-06-16 18:37:01 +00:00
David Greenman
cd3fe8d008 Changed trypbuf to a getpbuf to work around a problem where redundant writes
would occur when clustering them - caused by running out of buffers
and taking a degenerate code path as a result. It appears that waiting
instead for buffers to become available is okay.

Submitted by:	Matthew Dillon <dillon@apollo.backplane.com>
Discovered by: Craig A Soules <soules+@andrew.cmu.edu>
1999-06-16 15:54:30 +00:00
Tor Egge
01cf8ad024 If we still haven't got a sufficient number of free buffers after the
call to flushdirtybuffers() then sleep in waitfreebuffers().
PR:		11697
Reviewed by:	David Greenman, Matt Dillon
1999-06-16 03:19:04 +00:00
Kirk McKusick
e4ab40bcb6 Get rid of the global variable rushjob and replace it with a function in
kern/vfs_subr.c named speedup_syncer() which handles the speedup request.
Change the various clients of rushjob to use the new function.
1999-06-15 23:37:29 +00:00
Mike Smith
79fc0bf4a0 From the submitter:
- this causes POSIX locking to use the thread group leader
   (p->p_leader) as the locking thread for all advisory locks.
   In non-kernel-threaded code p->p_leader == p, so this will have
   no effect.

   This results in (more) correct POSIX threaded flock-ing semantics.

   It also prevents the leader from exiting before any of the children.
   (so that p->p_leader will never be stale) in exit1().

   We have been running this patch for over a month now in our lab
   under load and at customer sites.

Submitted by:	John Plevyak <jplevyak@inktomi.com>
1999-06-07 20:37:29 +00:00
Archie Cobbs
05292ba234 ksprintn() may be called with base=2, so redefine MAXNBUF accordingly.
Other brucification tweaks.

Obtained from:	bde@freebsd.org
1999-06-07 18:26:26 +00:00
Archie Cobbs
ad4f8dbd23 The function ksprintn(), which is used to convert numbers to ASCII, is not
reentrant because it returns a static buffer. This results in a race condition
when/if an interrupt handler calls log(), printf() etc. Fix this.
1999-06-06 02:41:55 +00:00
Alan Cox
3d41489171 Restructure pipe_read in order to eliminate several race conditions.
Submitted by:	Matthew Dillon <dillon@apollo.backplane.com> and myself
1999-06-05 03:53:57 +00:00
Peter Wemm
9c9906e912 Plug a mbuf leak in tcp_usr_send(). pru_send() routines are expected
to either enqueue or free their mbuf chains, but tcp_usr_send() was
dropping them on the floor if the tcpcb/inpcb has been torn down in the
middle of a send/write attempt.  This has been responsible for a wide
variety of mbuf leak patterns, ranging from slow gradual leakage to rather
rapid exhaustion.  This has been a problem since before 2.2 was branched
and appears to have been fixed in rev 1.16 and lost in 1.23/1.28.

Thanks to Jayanth Vijayaraghavan <jayanth@yahoo-inc.com> for checking
(extensively) into this on a live production 2.2.x system and that it
was the actual cause of the leak and looks like it fixes it.  The machine
in question was loosing (from memory) about 150 mbufs per hour under
load and a change similar to this stopped it.  (Don't blame Jayanth
for this patch though)

An alternative approach to this would be to recheck SS_CANTSENDMORE etc
inside the splnet() right before calling pru_send() after all the potential
sleeps, interrupts and delays have happened.  However, this would mean
exposing knowledge of the tcp stack's reset handling and removal of the
pcb to the generic code.  There are other things that call pru_send()
directly though.

Problem originally noted by:  John Plevyak <jplevyak@inktomi.com>
1999-06-04 02:27:06 +00:00
Dmitrij Tejblum
4ea5ad99d5 || vs && confusion in cdevsw_add(). 1999-06-01 20:41:26 +00:00
Poul-Henning Kamp
6fcd8a7c93 Introduce the makebdev() function, it does the same as the makedev()
function for now, but that will change.
1999-06-01 18:56:26 +00:00
Jonathan Lemon
eb9d435ae7 Unifdef VM86.
Reviewed by:	silence on on -current
1999-06-01 18:20:36 +00:00
Poul-Henning Kamp
2447bec829 Simplify cdevsw registration.
The cdevsw_add() function now finds the major number(s) in the
struct cdevsw passed to it.  cdevsw_add_generic() is no longer
needed, cdevsw_add() does the same thing.

cdevsw_add() will print an message if the d_maj field looks bogus.

Remove nblkdev and nchrdev variables.  Most places they were used
bogusly.  Instead check a dev_t for validity by seeing if devsw()
or bdevsw() returns NULL.

Move bdevsw() and devsw() functions to kern/kern_conf.c

Bump __FreeBSD_version to 400006

This commit removes:
        72 bogus makedev() calls
        26 bogus SYSINIT functions

if_xe.c bogusly accessed cdevsw[], author/maintainer please fix.

I4b and vinum not changed.  Patches emailed to authors.  LINT
probably broken until they catch up.
1999-05-31 11:29:30 +00:00
Poul-Henning Kamp
4e2f199e0c This commit should be a extensive NO-OP:
Reformat and initialize correctly all "struct cdevsw".

        Initialize the d_maj and d_bmaj fields.

        The d_reset field was not removed, although it is never used.

I used a program to do most of this, so all the files now use the
same consistent format.  Please keep it that way.

Vinum and i4b not modified, patches emailed to respective authors.
1999-05-30 16:53:49 +00:00
Doug Rabson
20b62c5ac9 * Add a function devclass_create() which looks up the named devclass and
creates it if it doesn't exist.
* Rename resource_list_remove() to resource_list_delete() for consistency.
1999-05-30 10:27:11 +00:00
Doug Rabson
bea6af4d31 * Change device_add_child_after() to device_add_child_ordered() which is
easier to use and more flexible.
* Change BUS_ADD_CHILD to take an order argument instead of a place.
* Define a partial ordering for isa devices so that sensitive devices are
  probed before non-sensitive ones.
1999-05-28 09:25:16 +00:00
Doug Rabson
8be70a6eac Fix an embarrasing typo in device_add_child_after(). I can't understand
how this hasn't caused problems before.

Submitted by: Kazutaka YOKOTA <yokota@zodiac.mech.utsunomiya-u.ac.jp>
1999-05-27 07:18:41 +00:00
John Birrell
350185153d Back out my previous change (phk didn't like it) in favour of setting
rootdev in the mfs initialisation code iff MFS_ROOT (which Bruce doesn't
like). Damned if I do - damned if I don't.
1999-05-24 00:37:26 +00:00
John Birrell
02013ff886 Remove the test for bdevsw(dev) == NULL from bdevvp() because it fails
if there is no character device associated with the block device. In this
case that doesn't matter because bdevvp() doesn't use the character
device structure.

I can use the pointy bit of the axe too.
1999-05-24 00:34:10 +00:00
John Birrell
d47066829a Make MFS_ROOT work again. MFS_ROOT means that rootdev is not set.
Broken by: phk
Problem ignored by: phk
1999-05-23 10:51:33 +00:00
Dmitrij Tejblum
9d3a442583 Don't call calcru() on a swapped-out process. calcru() access p_stats, which
is in U-area.
1999-05-22 20:10:31 +00:00
Doug Rabson
7e082b48a2 Add some helper functions to make it easier to write a driver for a bus
which needs to manage resources for its children.
1999-05-22 14:57:15 +00:00
Peter Wemm
13b758eb46 Add seatbelt like in previous function.. 1999-05-22 09:52:21 +00:00
Andrey A. Chernov
925fa5c3f5 Realy fix overflow on SO_*TIMEO
Submitted by: bde
1999-05-21 15:54:40 +00:00
Doug Rabson
0053cc2cfe Silently return NULL from devclass_get_device if dc == NULL. The caller
should be handling NULL returns already.

Submitted by: Andrew Gallatin <gallatin@cs.duke.edu>
1999-05-21 08:23:58 +00:00
Peter Wemm
1d23cba9b7 Oops, set module->file..
PR: 1179
Submitted-by: lha@stacken.kth.se
1999-05-20 00:00:58 +00:00
Luoqi Chen
15a33d7c1d TIOCEXT is also inapproriate before the slave is open, return EAGAIN when
these ioctls are attempted. Move a misplaced comment.

Pointed out by:	Bruce
1999-05-18 14:53:52 +00:00
Luoqi Chen
519566d2e3 Avoid negative numbers in dev_t manipulation. This should fix recent MFS
related crashes.
1999-05-18 13:14:43 +00:00
Poul-Henning Kamp
6a0ce00218 Use NOUDEV for udev_t's 1999-05-17 13:50:24 +00:00
Doug Rabson
f40ddd55b0 Change the definition of e_tdev in struct kinfo_proc from dev_t to udev_t
Reviewed by: Poul-Henning Kamp <phk@critter.freebsd.dk>
1999-05-17 13:28:35 +00:00
Alan Cox
e972780a11 Add the options MAP_PREFAULT and MAP_PREFAULT_PARTIAL to vm_map_find/insert,
eliminating the need for the pmap_object_init_pt calls in imgact_* and
mmap.

Reviewed by:	David Greenman <dg@root.com>
1999-05-17 00:53:56 +00:00
Eivind Eklund
1d8290f3c3 Add enough include files to make this actually compile on an a.out system. 1999-05-15 23:18:32 +00:00
Alan Cox
e5f13bdd09 Simplify vm_map_find/insert's interface: remove the MAP_COPY_NEEDED option.
It never makes sense to specify MAP_COPY_NEEDED without also specifying
MAP_COPY_ON_WRITE, and vice versa.  Thus, MAP_COPY_ON_WRITE suffices.

Reviewed by:	David Greenman <dg@root.com>
1999-05-14 23:09:34 +00:00
Luoqi Chen
7af0acae17 Ignore some ioctls on the master until the slave is open. 1999-05-14 20:44:20 +00:00
Luoqi Chen
0ce54cbb0c Legally acquire a major number for mfs. 1999-05-14 20:40:23 +00:00
Doug Rabson
6c2e3dde8c * Define a new static method DEVICE_IDENTIFY which is called to add device
instances to a parent bus.
* Define a new method BUS_ADD_CHILD which can be called from DEVICE_IDENTIFY
  to add new instances.
* Add a generic implementation of DEVICE_PROBE which calls DEVICE_IDENTIFY
  for each driver attached to the parent's devclass.
* Move the hint-based isa probe from the isa driver to a new isahint driver
  which can be shared between i386 and alpha.
1999-05-14 11:22:47 +00:00
Doug Rabson
454a0546ba Adjust method dispatch to ensure that default methods are called properly. 1999-05-14 09:13:43 +00:00
Kirk McKusick
eaea7a9e9f Previously directories were sync'ed every 10 seconds while bitmaps &
inodes were synced every 15 seconds. This is now reversed as during
directory create, we cannot commit the directory entry until its
inode has been written. With this switch, the inodes will be more
likely to be written by the time that the directory is written thus
reducing the number of directory rollbacks that are needed.
1999-05-14 01:29:21 +00:00
Bruce Evans
7d9509f96c Added ../sys/syscall.mk to targets. Back it up like all the other
targets.
1999-05-13 09:19:14 +00:00
Bruce Evans
853cbeeb35 Regenerated. 1999-05-13 09:12:57 +00:00
Bruce Evans
f664346fbe Fixed nonsense arg type `const caddr_t' in the prototype() for utrace().
Changed to `const void *'.  utrace() is undocumented, so nothing should
notice.

Fixed missing consts for utrace() and ktrace() in syscalls.master.

sys/ktrace.h is missing some Lite2 changes of shorts to ints.
1999-05-13 09:09:37 +00:00
Peter Wemm
ccb84588dd Try an fix a couple of dev_t/major/minor etc nits. 1999-05-12 22:30:50 +00:00
Luoqi Chen
0f0fe5a4c5 Unbreak VESA on SMP. 1999-05-12 21:39:07 +00:00
Peter Wemm
cc5881cff5 Fix (?) SPECHASH dev_t/major/minor/etc args 1999-05-12 19:06:40 +00:00
Poul-Henning Kamp
e519e78b42 braino. 1999-05-12 13:06:34 +00:00
Bruce Evans
ce45b512b3 Fixed corruption of the kmemstatistcs list. The first malloc()
with malloc type at the tail of the list changed the list from
linear to circular.  This seemed to cause surprisingly few problems,
but it now causes weird output from `vmstat -m', probably because
a more important malloc type is now at the tail of the list.

Fix it by abusing ks_limit instead of ks_next as a flag for being
on the list.  Don't forget to clear the flag when a malloc type is
uninit'ed.  Uninit'ing is still fundamentally broken -- it loses
history.
1999-05-12 11:11:27 +00:00
Poul-Henning Kamp
adfea48f2b Produce compiler warning if dev_t and udev_t is confused. 1999-05-12 11:06:56 +00:00
Poul-Henning Kamp
8bee45c44e Don't peek into dev_t 1999-05-12 11:06:07 +00:00
Poul-Henning Kamp
bfbb9ce670 Divorce "dev_t" from the "major|minor" bitmap, which is now called
udev_t in the kernel but still called dev_t in userland.

Provide functions to manipulate both types:
        major()         umajor()
        minor()         uminor()
        makedev()       umakedev()
        dev2udev()      udev2dev()

For now they're functions, they will become in-line functions
after one of the next two steps in this process.

Return major/minor/makedev to macro-hood for userland.

Register a name in cdevsw[] for the "filedescriptor" driver.

In the kernel the udev_t appears in places where we have the
major/minor number combination, (ie: a potential device: we
may not have the driver nor the device), like in inodes, vattr,
cdevsw registration and so on, whereas the dev_t appears where
we carry around a reference to a actual device.

In the future the cdevsw and the aliased-from vnode will be hung
directly from the dev_t, along with up to two softc pointers for
the device driver and a few houskeeping bits.  This will essentially
replace the current "alias" check code (same buck, bigger bang).

A little stunt has been provided to try to catch places where the
wrong type is being used (dev_t vs udev_t), if you see something
not working, #undef DEVT_FASCIST in kern/kern_conf.c and see if
it makes a difference.  If it does, please try to track it down
(many hands make light work) or at least try to reproduce it
as simply as possible, and describe how to do that.

Without DEVT_FASCIST I belive this patch is a no-op.

Stylistic/posixoid comments about the userland view of the <sys/*.h>
files welcome now, from userland they now contain the end result.

Next planned step: make all dev_t's refer to the same devsw[] which
means convert BLK's to CHR's at the perimeter of the vnodes and
other places where they enter the game (bootdev, mknod, sysctl).
1999-05-11 19:55:07 +00:00
Peter Wemm
ac48316f68 Send subr_rlist.c off to the big Attic in the sky. It's been #if 0'ed
for quite some time now and can be revived in a moment's notice if needed.
(It was replaced by subr_blist.c for VM/swap)
1999-05-11 14:29:59 +00:00
John Birrell
e40822840b Use colons instead of semi-colons to behave like UNIX instead of DOS.
Suggested by: bde
1999-05-11 10:08:10 +00:00
Peter Wemm
dc97381e37 Update one set of comments.. s/so_q0/so_incomp/ and s/so_q/so_comp/ (that's
incomplete and complete connections I think)
1999-05-10 18:15:40 +00:00
Poul-Henning Kamp
784fc12f7b Use NODEV instead of -1 1999-05-10 18:10:08 +00:00
Don Lewis
bd508d391b Fix descriptor leak provoked by KKIS.05051999.003b exploit code.
unp_internalize() takes a reference to the descriptor.  If the send
fails after unp_internalize(), the control mbuf would be freed ophaning
the reference.

Tested in -CURRENT by: Pierre Beyssac <beyssac@enst.fr>
1999-05-10 18:09:39 +00:00
Nick Hibma
b57947c947 Remove hack to accept French spelling of METHOD (METHODE) 1999-05-10 17:45:49 +00:00
Doug Rabson
8b2970bbe6 * Augment the interface language to allow arbitrary C code to be 'passed
through' to the C compiler.
* Allow the interface to specify a default implementation for methods.
* Allow 'static' methods which are not device specific.
* Add a simple scheme for probe routines to return a priority value. To
  make life simple, priority values are negative numbers (positive numbers
  are standard errno codes) with zero being the highest priority. The
  driver which returns the highest priority will be chosen for the device.
1999-05-10 17:06:14 +00:00
Doug Rabson
276794a4a4 Superceded by makedevops.pl 1999-05-10 16:45:19 +00:00
Peter Wemm
7067d561cc Lites2 seems to have pretty much disappeared from the radar, and I suspect
far more than this hack would be needed now..
1999-05-09 20:42:45 +00:00
Peter Wemm
18c3dd1745 s/main/mi_startup/ for the kernel entry point so that egcs doesn't get
upset about it (and generate things like __main() calls that are reserved
for main()).  Renaming was phk's suggestion, but I'd already thought about
it too.  (phk liked my suggested name tada() but I decided against it :-)

Reviewed by:	phk
1999-05-09 19:01:49 +00:00
Peter Wemm
e37622b251 Fix a couple of warnings and some bitrot in comments. 1999-05-09 16:04:14 +00:00
Poul-Henning Kamp
0a346dab99 major(something) can never become NODEV. 1999-05-09 13:13:52 +00:00
Poul-Henning Kamp
52400704e9 Unconfuse DEV_MODULE() and DEV_DRIVER_MODULE() about the difference between
a major number for a dev_t.
1999-05-09 13:00:50 +00:00
Doug Rabson
c586a73949 Hack the diskslice stuff so that it allows the alpha sysinstall to
manipulate the disklabel. This is almost certainly not the right way
to do it but I'm desperate.
1999-05-09 11:27:41 +00:00
Poul-Henning Kamp
8f0024a54a Peter beat me to half this patch, but didn't do the other half:
set d_bmaj

	don't cast a dev_t to int before comparing to NODEV
1999-05-09 08:18:12 +00:00
Peter Wemm
0da14f00bf Comment advising ordering of cdevsw_add and bdevsw_add is obsolete (no
bdevsw_add any more).
1999-05-09 08:10:17 +00:00
Dmitrij Tejblum
db72e05829 Fix a freelist trashing under following confitions:
- first program lock a region in a file,
- second program wait on the lock,
- first program extend the region,
- second program interrupted by a signal.
1999-05-08 22:46:46 +00:00
Doug Rabson
566643e39e Move the declaration of the interrupt type from the driver structure
to the BUS_SETUP_INTR call.
1999-05-08 21:59:43 +00:00
Peter Wemm
73d35bdd96 Change resource_set_*() to be more useful. BTW; resource_find() is a
bit odd, it looks like the wildcard stuff isn't right.
1999-05-08 18:08:59 +00:00
Peter Wemm
9c06a38614 Make sure the mem_range_AP_init() prototype is seen where it's needed, and
#ifdef SMP around it for fun.
1999-05-08 17:48:22 +00:00
Peter Wemm
4173e42044 Use KERNBASE for the load address of the kernel rather than magic constants
as it seems to work..  (at least on i386/elf).
1999-05-08 13:03:49 +00:00
Peter Wemm
b5b15c3ff0 First stages of a module dependency cleanup. This part fixes a
particularly annoying hack, namely having the linker bash the moduledata
to set the container pointer, preventing it being const.  In the process,
a stack of warnings were fixed and will probably allow a revisit of the
const C_SYSINIT() changes.  This explicitly registers modules in files or
preload areas with the module system first, and let them initialize via
SYSINIT/DECLARE_MODULE later in their SI_ORDER_xxx order.  The kludge of
finding the containing file is no longer needed since the registration
of modules onto the modules list is done in the context of initializing
the linker file.
1999-05-08 13:01:59 +00:00
Poul-Henning Kamp
1637aa4b1c Fix some of the places where too much inside knowledge about major/minor
layout and dev_t structure is being (ab)used.
1999-05-08 07:02:41 +00:00
Poul-Henning Kamp
4be2eb8c49 I got tired of seeing all the cdevsw[major(foo)] all over the place.
Made a new (inline) function devsw(dev_t dev) and substituted it.

Changed to the BDEV variant to this format as well: bdevsw(dev_t dev)

DEVFS will eventually benefit from this change too.
1999-05-08 06:40:31 +00:00
Dag-Erling Smørgrav
b83308b00b Nit fix. 1999-05-07 17:37:08 +00:00
Poul-Henning Kamp
46eede0058 Continue where Julian left off in July 1998:
Virtualize bdevsw[] from cdevsw.  bdevsw() is now an (inline)
        function.

        Join CDEV_MODULE and BDEV_MODULE to DEV_MODULE (please pay attention
        to the order of the cmaj/bmaj arguments!)

        Join CDEV_DRIVER_MODULE and BDEV_DRIVER_MODULE to DEV_DRIVER_MODULE
        (ditto!)

(Next step will be to convert all bdev dev_t's to cdev dev_t's
before they get to do any damage^H^H^H^H^H^Hwork in the kernel.)
1999-05-07 10:11:40 +00:00
Poul-Henning Kamp
e994c55884 Fix a goof in the #ifdef DEVFS case which was found by inspection,
it may have made things very difficult for people if they tried to
used DEVFS.
1999-05-07 09:10:10 +00:00
Poul-Henning Kamp
c48d17750f Introduce two functions: physread() and physwrite() and use these directly
in *devsw[] rather than the 46 local copies of the same functions.

(grog will do the same for vinum when he has time)
1999-05-07 07:03:47 +00:00
Poul-Henning Kamp
b0eeea2042 remove b_proc from struct buf, it's (now) unused.
Reviewed by:	dillon, bde
1999-05-06 20:00:34 +00:00
Peter Wemm
d5558c001a Fix up a few easy 'assignment used as truth value' and 'suggest parens
around && within ||' type warnings.  I'm pretty sure I have not masked
any problems here, I've committed real problem fixes seperately.
1999-05-06 18:44:42 +00:00
Peter Wemm
dfd5dee1b0 Add sufficient braces to keep egcs happy about potentially ambiguous
if/else nesting.
1999-05-06 18:13:11 +00:00
Poul-Henning Kamp
84c55b38e4 Remove unused fields from struct buf:
b_savekva
	b_validoff
	b_validend

Reviewed by:	dillon, bde
1999-05-06 17:06:41 +00:00
Bruce Evans
ea2b3e3d1b Fixed profiling of elf kernels. Made high resolution profiling compile
for elf kernels (it is broken for all kernels due to lack of egcs support).

Renaming of many assembler labels is avoided by declaring by declaring
the labels that need to be visible to gprof as having type "function"
and depending on the elf version of gprof being zealous about discarding
the others.  A few type declarations are still missing, mainly for SMP.

PR:		9413
Submitted by:	Assar Westerlund <assar@sics.se> (initial parts)
1999-05-06 09:44:57 +00:00
John Birrell
67481196cc Allow the init_path to be customised in an embedded system using the
INIT_PATH config option.

Also fix two bugs which caused an infinite loop in none of the programs
in the init_path were found. That code was obviously not tested!
1999-05-05 12:20:23 +00:00
Bill Fumerola
3d177f465a Add sysctl descriptions to many SYSCTL_XXXs
PR:		kern/11197
Submitted by:	Adrian Chadd <adrian@FreeBSD.org>
Reviewed by:	billf(spelling/style/minor nits)
Looked at by:	bde(style)
1999-05-03 23:57:32 +00:00
Alan Cox
4221e284a3 The VFS/BIO subsystem contained a number of hacks in order to optimize
piecemeal, middle-of-file writes for NFS.  These hacks have caused no
end of trouble, especially when combined with mmap().  I've removed
them.  Instead, NFS will issue a read-before-write to fully
instantiate the struct buf containing the write.  NFS does, however,
optimize piecemeal appends to files.  For most common file operations,
you will not notice the difference.  The sole remaining fragment in
the VFS/BIO system is b_dirtyoff/end, which NFS uses to avoid cache
coherency issues with read-merge-write style operations.  NFS also
optimizes the write-covers-entire-buffer case by avoiding the
read-before-write.  There is quite a bit of room for further
optimization in these areas.

The VM system marks pages fully-valid (AKA vm_page_t->valid =
VM_PAGE_BITS_ALL) in several places, most noteably in vm_fault.  This
is not correct operation.  The vm_pager_get_pages() code is now
responsible for marking VM pages all-valid.  A number of VM helper
routines have been added to aid in zeroing-out the invalid portions of
a VM page prior to the page being marked all-valid.  This operation is
necessary to properly support mmap().  The zeroing occurs most often
when dealing with file-EOF situations.  Several bugs have been fixed
in the NFS subsystem, including bits handling file and directory EOF
situations and buf->b_flags consistancy issues relating to clearing
B_ERROR & B_INVAL, and handling B_DONE.

getblk() and allocbuf() have been rewritten.  B_CACHE operation is now
formally defined in comments and more straightforward in
implementation.  B_CACHE for VMIO buffers is based on the validity of
the backing store.  B_CACHE for non-VMIO buffers is based simply on
whether the buffer is B_INVAL or not (B_CACHE set if B_INVAL clear,
and vise-versa).  biodone() is now responsible for setting B_CACHE
when a successful read completes.  B_CACHE is also set when a bdwrite()
is initiated and when a bwrite() is initiated.  VFS VOP_BWRITE
routines (there are only two - nfs_bwrite() and bwrite()) are now
expected to set B_CACHE.  This means that bowrite() and bawrite() also
set B_CACHE indirectly.

There are a number of places in the code which were previously using
buf->b_bufsize (which is DEV_BSIZE aligned) when they should have
been using buf->b_bcount.  These have been fixed.  getblk() now clears
B_DONE on return because the rest of the system is so bad about
dealing with B_DONE.

Major fixes to NFS/TCP have been made.  A server-side bug could cause
requests to be lost by the server due to nfs_realign() overwriting
other rpc's in the same TCP mbuf chain.  The server's kernel must be
recompiled to get the benefit of the fixes.

Submitted by:	Matthew Dillon <dillon@apollo.backplane.com>
1999-05-02 23:57:16 +00:00
Mark Murray
a8af2bd86b This routine was "use"ing File::Basename. This commit removes that
"use" and replaces it with equivalent inline code. The reason is that
Perl has some very nasty circular dependancies, and I am trying to
get the System Perl upgraded by one maintenance level.

The basic rule, until I can find a way to solve this, is that
the build tools MAY NOT use any library code; it must all be inline.
1999-05-02 08:55:27 +00:00
Mike Smith
4a034f21cd Add a hook that can be called to initialise a slave processor's memory
range attributes after they have been extracted from the master.

Hook up the i686 MP code to do this for each AP.

Be more careful about printing the default memory type for the i686.

Suggestions from: luoqi
1999-04-30 22:09:45 +00:00
Poul-Henning Kamp
07901f227b Add beer-ware license and $Id$
Noticed by:	dillon
1999-04-30 06:51:51 +00:00
Poul-Henning Kamp
430210c00b Make BOOTP to work again.
Submitted by:	dillon
Reviewed by:	phk
1999-04-30 06:30:15 +00:00
Dmitrij Tejblum
188554bba1 Set curproc at the end of proc0_init().
This patch also moves the bogus comment (the comment is still not quite
right) and (as a side effect) removes some verbose initialisations (we
depend on static initialisation to 0 for almost everything in proc0).

The alpha kernels are bootable again. The change  won't affect i386's
until machdep.c is changed.

Submitted by:	bde
1999-04-29 22:51:59 +00:00
Alan Cox
0043b4376a Address a performance problem in getnewbuf:
In heavy-writing situations, QUEUE_LRU can contain a large number
	of DELWRI buffers at its head.  These buffers must be moved
	to the tail if they cannot be written async in order to reduce
	the scanning time required to skip past these buffers in later
	getnewbuf() calls.

Submitted by:	Matthew Dillon <dillon@apollo.backplane.com>
1999-04-29 18:15:25 +00:00
Poul-Henning Kamp
75c1354190 This Implements the mumbled about "Jail" feature.
This is a seriously beefed up chroot kind of thing.  The process
is jailed along the same lines as a chroot does it, but with
additional tough restrictions imposed on what the superuser can do.

For all I know, it is safe to hand over the root bit inside a
prison to the customer living in that prison, this is what
it was developed for in fact:  "real virtual servers".

Each prison has an ip number associated with it, which all IP
communications will be coerced to use and each prison has its own
hostname.

Needless to say, you need more RAM this way, but the advantage is
that each customer can run their own particular version of apache
and not stomp on the toes of their neighbors.

It generally does what one would expect, but setting up a jail
still takes a little knowledge.

A few notes:

   I have no scripts for setting up a jail, don't ask me for them.

   The IP number should be an alias on one of the interfaces.

   mount a /proc in each jail, it will make ps more useable.

   /proc/<pid>/status tells the hostname of the prison for
   jailed processes.

   Quotas are only sensible if you have a mountpoint per prison.

   There are no privisions for stopping resource-hogging.

   Some "#ifdef INET" and similar may be missing (send patches!)

If somebody wants to take it from here and develop it into
more of a "virtual machine" they should be most welcome!

Tools, comments, patches & documentation most welcome.

Have fun...

Sponsored by:   http://www.rndassociates.com/
Run for almost a year by:       http://www.servetheweb.com/
1999-04-28 11:38:52 +00:00
Poul-Henning Kamp
02daf150a4 Add the jail system call. 1999-04-28 11:28:49 +00:00
Dmitrij Tejblum
604359cf9b s/static foo_devsw_installed = 0;/static int foo_devsw_installed;/.
(Edited automatically)
1999-04-28 10:54:24 +00:00
Luoqi Chen
5206bca10a Enable vmspace sharing on SMP. Major changes are,
- %fs register is added to trapframe and saved/restored upon kernel entry/exit.
- Per-cpu pages are no longer mapped at the same virtual address.
- Each cpu now has a separate gdt selector table. A new segment selector
  is added to point to per-cpu pages, per-cpu global variables are now
  accessed through this new selector (%fs). The selectors in gdt table are
  rearranged for cache line optimization.
- fask_vfork is now on as default for both UP and SMP.
- Some aio code cleanup.

Reviewed by:	Alan Cox	<alc@cs.rice.edu>
		John Dyson	<dyson@iquest.net>
		Julian Elischer	<julian@whistel.com>
		Bruce Evans	<bde@zeta.org.au>
		David Greenman	<dg@root.com>
1999-04-28 01:04:33 +00:00
Poul-Henning Kamp
1c308b817a Change suser_xxx() to suser() where it applies. 1999-04-27 12:21:16 +00:00
Poul-Henning Kamp
f711d546d2 Suser() simplification:
1:
  s/suser/suser_xxx/

2:
  Add new function: suser(struct proc *), prototyped in <sys/proc.h>.

3:
  s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/

The remaining suser_xxx() calls will be scrutinized and dealt with
later.

There may be some unneeded #include <sys/cred.h>, but they are left
as an exercise for Bruce.

More changes to the suser() API will come along with the "jail" code.
1999-04-27 11:18:52 +00:00
Peter Wemm
b6ad3506f3 Register the local (unix domain) sockets ourselves. 1999-04-26 08:56:53 +00:00
Peter Wemm
5b23857d22 Redo domain registration to use SYSINITS rather than linker sets.
Get rid of the spl wrapper kludge, it doesn't seem to be needed between
init calls since all that's running is the domain/protocol timers and they
are safe since domain list modifications are splnet() protected (which
blocks the timers)
1999-04-26 08:56:09 +00:00
Peter Wemm
fc51d58e62 Fix a very long standing bug in run_interrupt_driven_config_hooks(). It
was fetching the next pointer from memory that could have been free()'d.
1999-04-25 22:13:34 +00:00
Poul-Henning Kamp
0bb2226a4d Make the machdep.i8254_freq and machdep.tsc_freq sysctls modify the
timecounter as well

Asked for by:	bde, jhay
1999-04-25 09:00:00 +00:00
Dmitrij Tejblum
ba41a07d04 Fixed printf format errors on alpha. 1999-04-24 18:50:48 +00:00
Andrey A. Chernov
02a3d5261d Lite2 bugfixes merge:
so_linger is in seconds, not in 1/HZ
range checking in SO_*TIMEO was wrong

PR: 11252
1999-04-24 18:22:34 +00:00
Poul-Henning Kamp
22f054e258 Fix a braino in the v_id wraparound code. Give more (current) details
in comment.

PR:		11307
Spotted by:	Ville-Pertti Keinonen <will@iki.fi>
1999-04-24 17:58:14 +00:00
Dmitrij Tejblum
0dd9741eb4 Use pointer arithmetic to do pointer arithmetic. 1999-04-24 11:25:01 +00:00
SADA Kenji
565592bd9c The function msgrcv() could copy larger data than it should do
under some circumstances.
PR:		kern/10765
Submitted by:	Yasuhito FUTATSUKI <futatuki@fureai.or.jp>
1999-04-21 13:30:01 +00:00
Peter Wemm
54a8c69347 Stage 1 of a cleanup of the i386 interrupt registration mechanism.
Interrupts under the new scheme are managed by the i386 nexus with the
awareness of the resource manager.  There is further room for optimizing
the interfaces still.  All the users of register_intr()/intr_create()
should be gone, with the exception of pcic and i386/isa/clock.c.
1999-04-21 07:26:30 +00:00
Alan Cox
f78fd73fa6 Address several problems in vn_read and vn_write:
1. Make read-ahead work for pread and aio_read.

2. Fix one place where a comparison of uio_offset with -1
   wasn't updated to use FOF_OFFSET.

3. Honor O_APPEND in the FOF_OFFSET case.

In addition, use the variable name "ioflag" in both vn_read and
vn_write to avoid possible confusion between the variable "flag"
and the parameter "flags".

Submitted by:	Bruce Evans <bde@zeta.org.au> and me
1999-04-21 05:56:45 +00:00
Dag-Erling Smørgrav
5f967b24fc Make the location of init(8) tunable at boot time. 1999-04-20 21:15:13 +00:00
Peter Wemm
4d823d728f GC some stray debugging printf()s... 1999-04-19 19:39:08 +00:00
Peter Wemm
d95939af7a Zap LKM option and support. Farewell old friend. 1999-04-19 14:19:52 +00:00
Peter Wemm
db42d90829 unifdef -DVM_STACK - it's been on for a while for x86 and was checked
and appeared to be working for the Alpha some time ago.
1999-04-19 14:14:14 +00:00
Peter Wemm
2072df97aa GC some unused code. 1999-04-17 09:12:35 +00:00
Peter Wemm
e91896117b Well folks, this is it - The second stage of the removal for build support
for LKM's..
1999-04-17 08:36:07 +00:00
Peter Wemm
6182fdbda8 Bring the 'new-bus' to the i386. This extensively changes the way the
i386 platform boots, it is no longer ISA-centric, and is fully dynamic.
Most old drivers compile and run without modification via 'compatability
shims' to enable a smoother transition.  eisa, isapnp and pccard* are
not yet using the new resource manager.  Once fully converted, all drivers
will be loadable, including PCI and ISA.

(Some other changes appear to have snuck in, including a port of Soren's
 ATA driver to the Alpha.  Soren, back this out if you need to.)

This is a checkpoint of work-in-progress, but is quite functional.

The bulk of the work was done over the last few years by Doug Rabson and
Garrett Wollman.

Approved by:	core
1999-04-16 21:22:55 +00:00
Dmitrij Tejblum
35871a15c5 getnewbuf(): check return value from tsleep(). Interruptible NFS may pass
PCATCH to slpflag.
1999-04-14 18:51:52 +00:00
Tor Egge
87c737bc83 Backout early start of APs since it caused some machines to hang. 1999-04-13 03:24:47 +00:00
Eivind Eklund
e9e9477aac More consistent with surrounding style. (Hey - it looked great in the
diff...)

Prodded by:	bde
1999-04-12 14:34:52 +00:00
Dag-Erling Smørgrav
eca2ddda6f Typo in comment. 1999-04-12 10:07:15 +00:00
Eivind Eklund
2a96b3faf9 Staticize. 1999-04-11 02:27:06 +00:00
Eivind Eklund
632a035f84 Staticize. 1999-04-11 02:17:47 +00:00
Tor Egge
44c57e7121 Add prototype for wait_ap(). 1999-04-11 00:43:43 +00:00
Tor Egge
90c26b0d2d Let BSP wait until all APs are initialized. 1999-04-10 22:58:29 +00:00
Dag-Erling Smørgrav
5a00f36414 Allow setting MAXFILES in the kernel config. 1999-04-09 16:28:11 +00:00
Nick Sayer
c0bd94a75d More secure clock management. Allow positive steps only once per second
for as much as one second, but no more. Allows a miscreant to
double-time march the clock, but no worse.

XXX Unlike putting negative deltas in a while(1), performing small
positive steps inside of a while(1) will return EPERM for the
unpermitted ones. Repeated negative deltas are clamped without
error (but the kernel does log a notice).
1999-04-07 19:48:09 +00:00
Matt Jacob
3f92429a24 Fix last delta so file would compile again- I think I got it
right. Add a clarifying (to me at least) comment. Some formatting
fixes.
1999-04-07 17:32:21 +00:00
Peter Wemm
bfda1e3ff7 Disable the mtrr copy calls, it doesn't work with the i686_mem.c stuff.
This should make it compile/link again.
1999-04-07 17:08:40 +00:00
Nick Sayer
fcae3aa61f If securelevel>1, allow the clock to be adjusted negatively only up to
1 second prior to the highest the clock has run so far. This allows
time adjusters like xntpd to do their work, but the worst a miscreant
can do is "freeze" the clock, not go back in time.

We still need to decide on an algorithm to clamp positive adjustments.
As it stands, it is possible to achieve arbitrary negative adjustments
by "wrapping" time around.

PR:		10361
1999-04-07 16:36:56 +00:00
Alan Cox
b2e2337ba1 Fix a performance problem with the new getnewbuf() code: in an outofspace
condition ( bufspace > hibufspace ), an inappropriate scan of the empty
queue was performed looking for buffer space to free up.

Submitted by:	Matthew Dillon <dillon@apollo.backplane.com>
1999-04-07 02:41:54 +00:00
Peter Wemm
57dc594832 Use the reference counted PHOLD()/PRELE() rather than P_PHYSIO. 1999-04-06 03:04:47 +00:00
Peter Wemm
af8ad83e5c Use the reference-counted PHOLD()/PRELE() rather than P_NOSWAP. 1999-04-06 03:03:34 +00:00
Peter Wemm
88b4f4ee55 LK_RETRY is a vn_lock() flag, not one for lockmgr(). 1999-04-06 03:02:11 +00:00
Julian Elischer
8d17e69460 Catch a case spotted by Tor where files mmapped could leave garbage in the
unallocated parts of the last page when the file ended on a frag
but not a page boundary.
Delimitted by tags PRE_MATT_MMAP_EOF and POST_MATT_MMAP_EOF,
in files alpha/alpha/pmap.c i386/i386/pmap.c nfs/nfs_bio.c vm/pmap.h
    vm/vm_page.c vm/vm_page.h vm/vnode_pager.c miscfs/specfs/spec_vnops.c
    ufs/ufs/ufs_readwrite.c kern/vfs_bio.c

Submitted by: Matt Dillon <dillon@freebsd.org>
Reviewed by: Alan Cox <alc@freebsd.org>
1999-04-05 19:38:30 +00:00
Dmitrij Tejblum
5cc4ab5323 Regenerate (padding for pread and pwrite). 1999-04-04 21:43:36 +00:00
Dmitrij Tejblum
8fe387ab84 Add standard padding argument to pread and pwrite syscall. That should make them
NetBSD compatible.

Add parameter to fo_read and fo_write. (The only flag FOF_OFFSET mean that
the offset is set in the struct uio).

Factor out some common code from read/pread/write/pwrite syscalls.
1999-04-04 21:41:28 +00:00
Poul-Henning Kamp
a508801763 Fix a division which I had made a multiplication.
Fix return value from ntp_adjtime().

Submitted by:	jhay
1999-04-04 19:56:04 +00:00
Poul-Henning Kamp
34cffbe3f6 Dang, lost some LL's there. 1999-04-04 10:53:59 +00:00
Poul-Henning Kamp
f425c1f631 Update to latest version from Dave Mills. Mostly textual. 1999-04-04 10:28:42 +00:00
John Polstra
4fe88fe637 Restore support for executing BSD/OS binaries on the i386 by passing
the address of the ps_strings structure to the process via %ebx.
For other kinds of binaries, %ebx is still zeroed as before.

Submitted by:	Thomas Stephens <tas@stephens.org>
Reviewed by:	jdp
1999-04-03 22:20:03 +00:00
Poul-Henning Kamp
c4a6db710a Don't open window for race condition.
Detected by:	Reg Clemens <reg@dwf.com>
1999-04-02 13:57:21 +00:00
Poul-Henning Kamp
6a5d592ae8 Purging lint from the Bruce filter. 1999-03-30 09:00:45 +00:00
Doug Rabson
6350e58a8a Add some useful functions to the device framework:
* bus_setup_intr() as a wrapper for BUS_SETUP_INTR
* bus_teardown_intr() as a wrapper for BUS_TEARDOWN_INTR
* device_get_nameunit() which returns e.g. "foo0" for name "foo" and unit 0.
* device_set_desc_copy() malloc a copy of the description string.
* device_quiet(), device_is_quiet(), device_verbose() suppress probe message.

Add one method to the BUS interface, BUS_CHILD_DETACHED() which is called
after the child has been detached to allow the bus to clean up any memory
which it allocated on behalf of the child.

I also fixed a bug which corrupted the list of drivers in a devclass if
a driver was added to more than one devclass.
1999-03-29 08:54:20 +00:00
Doug Rabson
ecc6e7d5ef Fix a bug which prevented more than two clients from sharing a resource. 1999-03-29 08:30:17 +00:00
Doug Rabson
67e7cb89d9 Call ptrace_u_check with the right size. 1999-03-29 08:29:22 +00:00
Nick Hibma
2bee57be2f Fixed line counting error. 1999-03-27 22:41:40 +00:00
Alan Cox
4160ccd978 Added pread and pwrite. These functions are defined by the X/Open
Threads Extension.  (Note: We use the same syscall numbers as NetBSD.)

Submitted by:	John Plevyak <jplevyak@inktomi.com>
1999-03-27 21:16:58 +00:00
Eivind Eklund
361d0ec590 Remove incorrect lock specs for vop_whiteout (introduced by Lite/2).
The lock specs are for vnodes only.

Add (hopefully correct) lock specs for vop_strategy, vop_getpages and
vop_putpages.
1999-03-27 03:08:07 +00:00
Alan Cox
cde9bc877b Changed vn_read/write such that fp->f_offset isn't touched
if uio->uio_offset != -1.  This fixes a problem with aio_read/write
and permits a straightforward implementation of pread/pwrite.

PR:		kern/8669
Submitted by:	John Plevyak <jplevyak@inktomi.com>
Reviewed by:	Matthew Dillon <dillon@apollo.backplane.com>
1999-03-26 20:25:21 +00:00
Doug Rabson
6ca34d85be Call the module's unload handler before removing the device from the
cdevsw list.  This allows a handler to veto the load without losing its
place in the list.

PR:	kern/10653
1999-03-23 21:11:47 +00:00
Poul-Henning Kamp
cc7532aaf0 Add a sysctl variable which can help stop chroot(2) escapes.
kern.chroot_allow_open_directories = 0
	chroot(2) fails if there are open directories.

kern.chroot_allow_open_directories = 1 (default)
	chroot(2) fails if there are open directories and the process
	is subject of a previous chroot(2).

kern.chroot_allow_open_directories = anything else
	filedescriptors are not checked.  (old behaviour).

I'm very interested in reports about software which breaks when
running with the default setting.
1999-03-23 14:26:40 +00:00
Poul-Henning Kamp
7f4173cc09 Fix some nasty hangs if garbage were passed.
Noticed by:	Emmanuel DELOGET <pixel@DotCom.FR>
Remembered by:	msmith
1999-03-23 14:23:15 +00:00
Poul-Henning Kamp
3a25914cfd Make the same size rounding error both ways. 1999-03-22 14:01:58 +00:00
Bruce Evans
96ebc5810b Fixed a serious bug in rev.1.202. getnewbuf() sometimes didn't
initialise bp->b_data.  This tended to cause panics for file
systems whose block size is smaller than one page.
1999-03-19 10:17:44 +00:00
Poul-Henning Kamp
884ab557d9 Don't run FLL fodder through the median-filter.
Reduce max integration time to 128sec and use 50% exponential decay rather
than 256sec/25%.
1999-03-16 08:39:37 +00:00
Poul-Henning Kamp
fafbe352c0 Allow !suser() R/O access to ntp_adjtime()
Noticed by: Reg Clemens <reg@dwf.com>
1999-03-15 08:35:40 +00:00
Julian Elischer
50d3b68c81 fix breakage for alphas.
Submitted by: Andrew Gallatin <gallatin@cs.duke.edu>
1999-03-15 05:11:27 +00:00
Bruce Evans
6fc8f347cb Enforce monotonicity of apparent process user, system and interrupt times.
PR:		975, 10402
1999-03-13 19:46:13 +00:00
Poul-Henning Kamp
30f27235cb Fix an old cut&paste bogon.
Noticed by: bde
1999-03-12 21:58:54 +00:00
Poul-Henning Kamp
37d39f0a50 Remove duplicate include.
Noticed by: bde
1999-03-12 11:09:50 +00:00
Julian Elischer
beef8a367c This solves a deadlock that can occur when read()ing into a file-mmap()
space. When doing this, it is possible to for another process to attempt
to get an exclusive lock on the vnode and deadlock the mmap/read
combination when the uiomove() call tries to obtain a second
shared lock on the vnode. There is still a potential deadlock
situation with write()/mmap().
Submitted by: Matt Dillon <dillon@freebsd.org>
Reviewed by: Luoqi Chen <luoqi@freebsd.org>
Delimmitted by tag PRE_MATT_MMAP_LOCK and POST_MATT_MMAP_LOCK
in kern/kern_lock.c kern/kern_subr.c
1999-03-12 03:09:29 +00:00
Julian Elischer
4ef2094e45 Reviewed by: Many at differnt times in differnt parts,
including alan, john, me, luoqi, and kirk
Submitted by:	Matt Dillon <dillon@frebsd.org>

This change implements a relatively sophisticated fix to getnewbuf().
There were two problems with getnewbuf(). First, the writerecursion
can lead to a system stack overflow when you have NFS and/or VN
devices in the system. Second, the free/dirty buffer accounting was
completely broken. Not only did the nfs routines blow it trying to
manually account for the buffer state, but the accounting that was
done did not work well with the purpose of their existance: figuring
out when getnewbuf() needs to sleep.

The meat of the change is to kern/vfs_bio.c. The remaining diffs are
all minor except for NFS, which includes both the fixes for bp
interaction AND fixes for a 'biodone(): buffer already done' lockup.
Sys/buf.h also contains a chaining structure which is not used by
this patchset but is used by other patches that are coming soon.
This patch deliniated by tags PRE_MAT_GETBUF and POST_MAT_GETBUF.
(sorry for the missing T matt)
1999-03-12 02:24:58 +00:00
Bruce Evans
56ce1a8dc4 Fixed runtime accounting. The time since the previous context switch
was discarded on every call to calcru().  Hacking on the `switchtime'
global for a related fix in rev.1.38 of kern_resource.c was too fragile
and broke when p_switchtime went away.

PR:		10402
1999-03-11 21:53:12 +00:00
Poul-Henning Kamp
32c203577a Make even more of the PPSAPI implementations generic.
FLL support in hardpps()

Various magic shuffles and improved comments

Style fixes from Bruce.
1999-03-11 15:09:51 +00:00
Alan Cox
0a91231d3b For clarity, use the "map" variable introduced by the last commit
throughout exec_aout_imgact.
1999-03-10 07:07:42 +00:00
Poul-Henning Kamp
a2210fe12b Make TIMER_FREQ a normal, undocumented option. Raise confusion to
a higher level with example in LINT.

Clarify comment about PPS_SYNC.  Ignore for now that it doesn't
work in FLL mode, it will in a few days.
1999-03-09 20:20:09 +00:00
Poul-Henning Kamp
c68996e271 Integrate the new "nanokernel" PLL from Dave Mills.
This code is backwards compatible with the older "microkernel" PLL, but
allows ntpd v4 to use nanosecond resolution.  Many other improvements.

PPS_SYNC and hardpps() are NOT supported yet.
1999-03-08 12:36:14 +00:00
Doug Rabson
a199ed3cc3 * Register sysctl nodes before running sysinits when loading files and
unregister them after sysuninits when unloading.
* Add code to vfs_register() to set the oid number of vfs sysctls to
  the type number of the filesystem.

Reviewed by: bde
1999-03-07 16:06:41 +00:00
Garrett Wollman
7347e1c6ab Fix callout_init(). This didn't have any practical effect since it
was only used to initialize the static timeouts, which unconditionally
clears the only bits which could have caused problems.
1999-03-06 22:27:02 +00:00
Garrett Wollman
acc8326d0c Expose a slightly-lower-level interface to timeouts which allows callers
to manage their own memory.  Tested on my machine (make buildworld).
I've made analogous changes on the alpha, but don't have a machine
to test.

Not-objected-to by:	dg, gibbs
1999-03-06 04:46:20 +00:00
Bruce Evans
efc96764e0 The magic "no-cpu" cpu number is 0xff. Don't misrepresent cpu
numbers as chars or use bogus casts in an attempt to unmisrepresnt
them.  In top, don't assume that 0xff is the only negative cpu
number when cpu numbers are (mis)represented.
1999-03-05 16:38:13 +00:00
Alan Cox
2f33b2c03e exec_aout_imgact should lock the vm_map before calling vm_map_insert.
Reviewed by:	Matthew Dillon <dillon@apollo.backplane.com>,
		"John S. Dyson" <dyson@iquest.net>, and
		David Greenman <dg@root.com>
1999-03-04 18:04:40 +00:00
Julian Elischer
90b4d77467 The tunable parameter for the scheduler quantum was inverted.
Higher numbers led to smaller quanta.
In discussion with BDE, change this parameter to be in uSecs
to make it machine independent,
and limit it to non zero multiples of 'tick' (rounding down).
Also make the variabel globally available so that the present function that
returns its value (used for posix scheduling I believe) can go away.

Submitted by: Bruce Evans <bde@freebsd.org>
1999-03-03 18:15:29 +00:00
Julian Elischer
cb11191c01 Slight cleanup of code resurected for union mounts..
Submitted by: Tony Finch <dot@dotat.at>
1999-03-03 02:35:51 +00:00
Julian Elischer
850c9afd03 Make comment match code. 1999-03-02 21:23:38 +00:00
Julian Elischer
8b3bd42341 Remove inapropriate use of VOP_ISLOCKED()
This produced races resulting in panics and filesystem corruptions
under some circumstances.

Reviewed by: luoqi chen <luoqi@freebsd.org>
Reviewed by: Kirk McKusick <mckusick@mckusick.com>
Submitted by: Matt Dillon <dillon@freebsd.org>
1999-03-02 20:26:39 +00:00
Julian Elischer
4ac9ae7083 Fix thread/process tracking and differentiation for Linux threads emulation.
Submitted by:	Richard Seaman, Jr." <dick@tar.com>

Also clean some compiler warnings in surrounding code.
1999-03-02 00:28:09 +00:00
Kirk McKusick
0ee81fe5f5 Update to know about current kernel directory layout.
Add ability to build links as well as tags.
1999-02-28 22:14:16 +00:00
Bruce Evans
4adbda97dc Declare static __inline functions as __inline in their forward
declaration.

Fixed some comments.

Fixed a staticization botch.
1999-02-28 11:30:00 +00:00
Bruce Evans
e7ba67f274 Removed all traces of `p_switchtime'. The relevant timestamp is per-cpu,
not per-process.  Keep it in `switchtime' consistently.

It is now clear that the timestamp is always valid in fork_trampoline()
except when the child is running on a previously idle cpu, which
can only happen if there are multiple cpus, so don't check or set
the timestamp in fork_trampoline except in the (i386) SMP case.
Just remove the alpha code for setting it unconditionally, since
there is no SMP case for alpha and the code had rotted.

Parts reviewed by:	dfr, phk
1999-02-28 10:53:29 +00:00
Julian Elischer
1871f6cdd2 Fix code for union mounts
Accidentally deleted by peter when he extracted the unionfs stuff in 1.109

Submitted by: Tony Finch <dot@dotat.at>
1999-02-27 07:06:05 +00:00
Tor Egge
79a7a64b85 Don't call assign_apic_irq with a value for irq that is out of range. 1999-02-26 03:42:50 +00:00
Bruce Evans
a5c9bce777 Added a used #include (don't depend on "vnode_if.h" including <sys/buf.h>). 1999-02-25 15:54:06 +00:00
Bruce Evans
1b0b259ed2 Don't forget to update `switchticks' in corner cases (except for
the alpha fork_trampoline(), forget it because it I believe it is
only necessary for the unsupported SMP case).
1999-02-25 11:03:08 +00:00
Matthew Dillon
155f87daf2 Reviewed by: Julian Elischer <julian@whistle.com>
Add d_parms() to {c,b}devsw[].  If non-NULL this function points to
    a device routine that will properly fill in the specinfo structure.
    vfs_subr.c's checkalias() supplies appropriate defaults.  This change
    should be fully backwards compatible with existing devices.
1999-02-25 05:22:30 +00:00
Bruce Evans
3965f8d8bb The previous commit also fixed a possibly-wrong (too high) priority
for the rescheduled process.
1999-02-22 18:39:49 +00:00
Bruce Evans
554dedb3c9 Improved scheduling in uiomove(), etc. resched_wanted() is true too
often for it to be a good criterion for switching kernel cpu hogs --
it is true after most wakeups.  Use the criterion "has been running
for >= 2 quanta" instead.
1999-02-22 16:57:48 +00:00
John Polstra
c33fe77954 If you merge this into -stable, please increment __FreeBSD_version
in "src/sys/sys/param.h".

Fix the ELF image activator so that it can handle dynamic linkers
which are executables linked at a fixed address.  This improves
compliance with the ABI spec, and it opens the door to possibly
better dynamic linker performance in the future.  I've experimented
a bit with a fixed-address dynamic linker, and it works fine.  But
I don't have any measurements yet to determine whether it's
worthwhile.

Also, remove a few calculations that were never used for anything.

I will increment __FreeBSD_version, since this adds a new capability
to the kernel that the dynamic linker might some day rely upon.
1999-02-20 23:52:34 +00:00
Doug Rabson
75e08a5e7e A correction to the code which attempts to prevent the same module
being loaded twice.  It used rindex() to strip the pathname but failed
to account for the fact that rindex() will return a pointer to the '/',
not the first character of the filename.

Submitted by: Nick Hibma <hibma@skylink.it>
1999-02-20 21:22:00 +00:00
Luoqi Chen
1c6d46f93c Introduce machine-dependent macro pgtok() to convert page count to number
of kilobytes. Its definition for each architecture could be optimized to
avoid potential numerical overflows.
1999-02-19 19:34:49 +00:00
Matthew Dillon
42e26d47bd Protect vn worklist and vn->v_{clean,dirty}blkhd at splbio().
Get rid of extra LIST_REMOVE()

Reviewed by:	hsu@FreeBSD.ORG (Jeffrey Hsu), mckusick@McKusick.COM
Submitted by:	hsu@FreeBSD.ORG (Jeffrey Hsu), dillon@backplane.com ( Matthew Dillon )
1999-02-19 17:36:58 +00:00
Luoqi Chen
b1028ad122 Hide access to vmspace:vm_pmap with inline function vmspace_pmap(). This
is the preparation step for moving pmap storage out of vmspace proper.

Reviewed by:	Alan Cox	<alc@cs.rice.edu>
		Matthew Dillion	<dillon@apollo.backplane.com>
1999-02-19 14:25:37 +00:00
Luoqi Chen
75ffaf5939 Initialize procsig0.ps_refcnt to 1 (instead of 2), this would silence
complaints about ps_refcnt greater than two when we try to fork() a
kthread from proc0 with RFSIGSHARE flag set.

Noticed by:	Tor Egge <tegge@fast.no>
Reviewed by:	Richard Seaman, Jr. <dick@tar.com>
1999-02-17 21:03:14 +00:00
Doug Rabson
ce02431ffa * Change sysctl from using linker_set to construct its tree using SLISTs.
This makes it possible to change the sysctl tree at runtime.

* Change KLD to find and register any sysctl nodes contained in the loaded
  file and to unregister them when the file is unloaded.

Reviewed by: Archie Cobbs <archie@whistle.com>,
	Peter Wemm <peter@netplex.com.au> (well they looked at it anyway)
1999-02-16 10:49:55 +00:00
Matthew Dillon
ef528b292a Only needed to cast array index from char to unsigned char, did not
also have to cast it to int.  (int)(unsigned char)char_exp ->
    (unsigned char)char_exp.
1999-02-14 20:58:21 +00:00
Kenneth D. Merry
2a888f938e Add a prioritization field to the devstat_add_entry() call so that
peripheral drivers can determine where in the devstat(9) list they are
inserted.

This requires recompilation of libdevstat, systat, vmstat, rpc.rstatd, and
any ports that depend on the devstat code, since the size of the devstat
structure has changed.  The devstat version number has been incremented as
well to reflect the change.

This sorts devices in the devstat list in "more interesting" to "less
interesting" order.  So, for instance, da devices are now more important
than floppy drives, and so will appear before floppy drives in the default
output from systat, iostat, vmstat, etc.

The order of devices is, for now, kept in a central table in devicestat.h.
If individual drivers were able to make a meaningful decision on what
priority they should be at attach time, we could consider splitting the
priority information out into the various drivers.  For now, though, they
have no way of knowing that, so it's easier to put them in an easy to find
table.

Also, move the checkversion() call in vmstat(8) to a more logical place.

Thanks to Bruce and David O'Brien for suggestions, for reviewing this, and
for putting up with the long time it has taken me to commit it.  Bruce did
object somewhat to the central priority table (he would rather the
priorities be distributed in each driver), so his objection is duly noted
here.

Reviewed by:	bde, obrien
1999-02-10 00:04:13 +00:00
John Polstra
47633640aa Change the load address of the ELF dynamic linker from "2L*MAXDSIZ"
to an architecture-specific value defined in <machine/elf.h>.  This
solves problems on large-memory systems that have a high value for
MAXDSIZ.

The load address is controlled by a new macro ELF_RTLD_ADDR(vmspace).
On the i386 it is hard-wired to 0x08000000, which is the standard
SVR4 location for the dynamic linker.

On the Alpha, the dynamic linker is loaded MAXDSIZ bytes beyond
the start of the program's data segment.  This is the same place
a userland mmap(0, ...) call would put it, so it ends up just below
all the shared libraries.  The rationale behind the calculation is
that it allows room for the data segment to grow to its maximum
possible size.

These changes have been tested on the i386 for several months
without problems.  They have been tested on the Alpha as well,
though not for nearly as long.  I would like to merge the changes
into 3.1 within a week if no problems have surfaced as a result of
them.
1999-02-07 23:49:56 +00:00
Matthew Dillon
9fdfe602fc Remove MAP_ENTRY_IS_A_MAP 'share' maps. These maps were once used to
attempt to optimize forks but were essentially given-up on due to
    problems and replaced with an explicit dup of the vm_map_entry structure.
    Prior to the removal, they were entirely unused.
1999-02-07 21:48:23 +00:00
John Polstra
6f8126face Correct an "&" operator which should have been "&&".
Submitted by:	mjacob
1999-02-05 22:24:26 +00:00
Mark Newton
3b351cc1d3 Additional note on last rev: The rationale for this is to allow you
to run Solaris executables (or executables from any other ELF system)
directly off the CD-ROM without having to waste megabytes of disk
by copying them to another filesystem just to brand them.
1999-02-05 03:47:47 +00:00
Mark Newton
f8b3601e08 Created sysctl kern.fallback_elf_brand. Defaults to "none", which will
give the same behaviour produced before today.  If sysadmin sets it
to a valid ELF brand, ELF image activator will attempt to run unbranded
ELF exectutables as if they were branded with that value.

Suggested by: Dima Ruban <dima@best.net>
1999-02-05 03:43:18 +00:00
Matthew Dillon
e198079da2 Fix race in pipe read code whereby a blocked lock can allow another
process to sneak in and write to or close the pipe.  The read code
    enters a 'piperd' state after doing the lock operation without
    checking to see if the state changed, which can cause the process
    to wait forever.

    The code has also been documented more.
1999-02-04 23:50:49 +00:00
Matthew Dillon
82b23b5384 vp->v_object must be valid after normal flow of vfs_object_create()
completes, change if() to KASSERT().  This is not a bug, we are
    simplify clarifying and optimizing the code.

    In if/else in vfs_object_create(), the failure of both conditionals
    will lead to a NULL object.  Exit gracefully if this case occurs.
    ( this case does not normally occur, but needed to be handled ).

Obtained from: Eivind Eklund <eivind@FreeBSD.org>
1999-02-04 18:25:39 +00:00
Mark Newton
096977fae7 Provide elf_brand_inuse() as a method an emulator can use to find out
whether it is currently in use (which is kinda useful when it's about
to unload itself:  Lockups are never very much fun, are they?).
1999-02-04 12:42:39 +00:00
Bruce Evans
8b6ca0359b Switch context before doing some i/o operations that might block if
context would be switched on return to user mode.  This fixes some
denial of service problems.
1999-02-02 12:11:01 +00:00
Bill Fenner
8f70ac3e02 Fix the port of the NetBSD 19990120-accept fix. I misread a piece of
code when examining their fix, which caused my code (in rev 1.52) to:
- panic("soaccept: !NOFDREF")
- fatal trap 12, with tracebacks going thru soclose and soaccept
1999-02-02 07:23:28 +00:00
Mark Newton
9cbac9cee4 Moved prototypes for soo_{read,write,close} into socketvar.h where they
belong.

Suggested by: bde
1999-02-01 21:16:31 +00:00
Mark Newton
c4ca2670b8 Fix bogus line breaks in declarations for soo_read() and soo_write()
Suggested by: Pedant Central :-)
1999-02-01 13:24:39 +00:00
Mark Newton
ba198b1c45 Added comments about non-staticization so it doesn't get un-done next
time someone goes on a staticization binge.

Suggested by: eivind
1999-01-31 03:15:13 +00:00
Mike Smith
e25810a699 Remove unused "kern.shutdown_timeout" sysctl node. 1999-01-30 19:36:02 +00:00
Mike Smith
5c40906f11 An error in the last commit; the changes were submitted by, not reviewed by,
"D. Rock" <rock@cs.uni-sb.de>
1999-01-30 19:29:10 +00:00
Mike Smith
db82a982a7 Add a new sysctl node kern.shutdown, off which shutdown-related things
can be hung.

Add a tunable delay at the beginning of the SHUTDOWN_FINAL at_shutdown
queue, allowing time to settle before we launch into the list of things
that are expected to turn the system off.

Fix a bug in at_shutdown_pri() where the second insertion always put
the item in second position in the queue.

Reviewed by:	"D. Rock" <rock@cs.uni-sb.de>
1999-01-30 19:28:30 +00:00
Poul-Henning Kamp
4e48a6bfe0 Use suser() to determine super-user-ness.
Collapse some duplicated checks.

Reviewed by:	bde
1999-01-30 12:27:00 +00:00
Poul-Henning Kamp
57c90d6fcd Use suser() to determine super-user-ness, don't examine cr_uid directly. 1999-01-30 12:21:49 +00:00
Poul-Henning Kamp
4e2d2aa1cd Use suser() to check for super user rather than examining cr_uid directly.
Use TTYDEF_SPEED rather than 9600 a couple of places.

Reviewed by:	bde, with a few grumbles.
1999-01-30 12:17:38 +00:00
Mark Newton
69a6f20bc8 Unstaticized routines which are needed by the svr4 KLD and the streams
garbage needed to support SysVR4 networking.
1999-01-30 06:25:00 +00:00
Matthew Dillon
bc81493155 More const fixes for -Wall, -Wcast-qual 1999-01-29 23:18:50 +00:00
Matthew Dillon
820ca326e1 *_execsw static structures cannot be const due to the way they interact
with EXEC_SET, DECLARE_MODULE, and module_register.  Specifically,
    module_register.  We may eventually be able to make these const, but
    not now.
1999-01-29 22:59:43 +00:00
Bruce Evans
cacd1f6aa9 Cast to `const char *' instead of to c_caddr_t. This is part of
terminating c_caddr_t with extreme prejudice.  Here we depended
on the "opaque" type c_caddr_t being precisely `const char *'
to do unportable pointer arithmetic.
1999-01-29 09:04:27 +00:00
Matthew Dillon
3cfc69e6c2 More -Wall / -Wcast-qual cleanup. Also, EXEC_SET can't use
C_DECLARE_MODULE due to the linker_file_sysinit() function
    making modifications to the data.
1999-01-29 08:36:45 +00:00
Bruce Evans
9e26dd2a54 Removed bogus casts to c_caddr_t. This is part of terminating
c_caddr_t with extreme prejudice.  Here the original casts to
caddr_t were to support K&R compilers (or missing prototypes),
but the relevant source files require an ANSI compiler.
1999-01-29 08:29:05 +00:00
Bruce Evans
425c50cf51 Removed a bogus cast to c_caddr_t. This is part of terminating
c_caddr_t with extreme prejudice.  Here the point of the original
cast to caddr_t was to break the warning about the const mismatch
between write(2)'s `const void *buf' and `struct uio's `char
*iov_base' (previous bitrot gave a gratuitous dependency on caddr_t
being char *).  Compiling with -Wcast-qual made the cast a full
no-op.

This change has no effect on the warning for discarding `const'
on assignment to iov_base.  The warning should not be fixed by
splitting `struct iovec' into a non-const version for read()
and a const version for write(), since correct const poisoning
would affect all pointers to i/o addresses.  Const'ness should
probably be forgotten by not declaring it in syscalls.master.
1999-01-29 08:10:35 +00:00
Matthew Dillon
f01a9dd520 cleanup warnings by propogating const char pointers properly. 1999-01-29 08:09:32 +00:00
Matthew Dillon
697457a133 Fix warnings related to -Wall -Wcast-qual 1999-01-28 17:32:05 +00:00
Matthew Dillon
0a5e03dda5 Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile
1999-01-28 01:59:53 +00:00
Matthew Dillon
8aef171243 Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile
1999-01-28 00:57:57 +00:00
Matthew Dillon
fe08c21a53 Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile.

    This commit includes significant work to proper handle const arguments
    for the DDB symbol routines.
1999-01-27 23:45:44 +00:00
Matthew Dillon
d254af07a1 Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile
1999-01-27 21:50:00 +00:00
Matthew Dillon
598217c491 Fix array index of signed char to cast to unsigned char and then to int.
Also general const cleanup.
1999-01-27 21:36:14 +00:00
Matthew Dillon
a06791bc90 Fix getenv() comparison against '=' ... was *cp = '=' instead of
*cp == '='.
1999-01-27 21:24:50 +00:00
Bruce Evans
fea579e94e Don't forget to count context switches in yield(). 1999-01-27 10:14:05 +00:00
Bruce Evans
dc34e67676 Include <sys/select.h> -- don't depend on pollution in <sys/proc.h>. 1999-01-27 10:10:03 +00:00
Matthew Dillon
56319e3a58 Ok, people didn't like kern.conf_dir. Poof, backed out. 1999-01-26 07:37:11 +00:00
Julian Elischer
88c5ea4574 Enable Linux threads support by default.
This takes the conditionals out of the code that has been tested by
various people for a while.
ps and friends (libkvm) will need a recompile as some proc structure
changes are made.

Submitted by:	"Richard Seaman, Jr." <dick@tar.com>
1999-01-26 02:38:12 +00:00
Matthew Dillon
b1cba377b0 Add kern.conf_dir sysctl. This is a R+W string used to specify the
directory containing rc.conf.local and rc.local, and possibly other
    things in the future.

    This sysctl is used by the diskless startup code and new rc.conf.  If
    it cannot be found or is empty, the system should revert to using /etc.
1999-01-25 18:26:09 +00:00
Bill Fenner
527b7a14a5 Port NetBSD's 19990120-accept bug fix. This works around the race condition
where select(2) can return that a listening socket has a connected socket
queued, the connection is broken, and the user calls accept(2), which then
blocks because there are no connections queued.

Reviewed by:	wollman
Obtained from:	NetBSD
(ftp://ftp.NetBSD.ORG/pub/NetBSD/misc/security/patches/19990120-accept)
1999-01-25 16:58:56 +00:00
Bill Fenner
ec42cbfc24 Don't free the socket address if soaccept() / pru_accept() doesn't
return one.
1999-01-25 16:53:53 +00:00
Doug Rabson
149a155c3b Don't try to call SYSUNINIT functions if there was a link error.
Reviewed by: Peter Wemm <peter@netplex.com.au>
1999-01-25 08:42:24 +00:00
Bruce Evans
73a6265d68 Go back to only supporting revoke() for bdevs and cdevs. It is very
buggy for fifos, and no one seems to have investigated its behaviour
on other types of files.  It has been broken since the Lite2 merge
in rev.1.54.

Nagged about by:	Brian Feldman (green@unixhelp.org)
1999-01-24 06:28:37 +00:00
Matthew Dillon
257aefa704 Addendum: The original code that the last commit 'fixed' actually did
not have a bug in it, but the last commit did make it more readable so
    we are keeping it.
1999-01-24 03:49:58 +00:00
Matthew Dillon
89600e8663 There was a situation where sendfile() might attempt to initiate I/O
on a PG_BUSY page, due to a bug in its sequencing of a conditional.
1999-01-24 01:15:58 +00:00
Matthew Dillon
377f9b28a6 Don't try to calculate B_CACHE for an NFS related bp that has a
> 0 b_validend.  This will screw up small-writes, causing lots of
    little writes out the network.

    We will assume that NFS handles B_CACHE properly.
1999-01-24 00:51:11 +00:00
Matthew Dillon
fae1f2e045 Fix an expression parenthesization typo in a conditional. It should not
have any operational effects other then to make the code in question
     a little faster.  Also added a more involved comment.
1999-01-23 06:36:15 +00:00
Peter Wemm
461b36ab54 Update userref handling after discussion with submitter of previous
patch.  lf can't be dereferenced after the unload attempt, in case it
was freed.  Instead, decrement first and back it out if the unload failed.
This should be relatively immune to races caused by the user since the
userref count will be zero for the duration of the actual unloading and
will stop further kldunload attempts.

Submitted by:   Ustimenko Semen <semen@iclub.nsu.ru>
1999-01-23 03:45:22 +00:00
David Greenman
33ce4218c6 Don't throw away the buffer contents on a fatal write error; just mark
the buffer as still being dirty. This isn't a perfect solution, but
throwing away the buffer contents will often result in filesystem
corruption and this solution will at least correctly deal with transient
errors.
Submitted by:	Kirk McKusick <mckusick@mckusick.com>
1999-01-22 08:59:05 +00:00
Mike Smith
8de6e8e102 Allow VM_KMEM_SIZE to be tuned from the kernel environment. This tuning
value *completely* overrides any value precalculated by the kernel.
1999-01-21 21:54:32 +00:00