Commit Graph

219 Commits

Author SHA1 Message Date
John Baldwin
ba626c1db2 Lock proctree_lock instead of pgrpsess_lock. 2002-04-16 17:11:34 +00:00
Alfred Perlstein
11caded34f Remove __P. 2002-03-19 22:20:14 +00:00
Poul-Henning Kamp
26facaeb4d If in strategy we find that we have no devsw on the device anymore we
are probably talking about some disk-device which wente away, so
return ENXIO instead of panicing.
2002-03-05 13:25:57 +00:00
John Baldwin
a854ed9893 Simple p_ucred -> td_ucred changes to start using the per-thread ucred
reference.
2002-02-27 18:32:23 +00:00
Seigo Tanimura
f591779bb5 Lock struct pgrp, session and sigio.
New locks are:

- pgrpsess_lock which locks the whole pgrps and sessions,
- pg_mtx which protects the pgrp members, and
- s_mtx which protects the session members.

Please refer to sys/proc.h for the coverage of these locks.

Changes on the pgrp/session interface:

- pgfind() needs the pgrpsess_lock held.

- The caller of enterpgrp() is responsible to allocate a new pgrp and
  session.

- Call enterthispgrp() in order to enter an existing pgrp.

- pgsignal() requires a pgrp lock held.

Reviewed by:	jhb, alfred
Tested on:	cvsup.jp.FreeBSD.org
		(which is a quad-CPU machine running -current)
2002-02-23 11:12:57 +00:00
Poul-Henning Kamp
40f7b5a9cc Various nit-picking, mostly of style(9) character.
Obtained from:	~bde/sys.dif.gz
2002-02-10 22:00:20 +00:00
John Baldwin
bd78cece5d Change the kernel's ucred API as follows:
- crhold() returns a reference to the ucred whose refcount it bumps.
- crcopy() now simply copies the credentials from one credential to
  another and has no return value.
- a new crshared() primitive is added which returns true if a ucred's
  refcount is > 1 and false (0) otherwise.
2001-10-11 23:38:17 +00:00
Robert Watson
f86cf763ef o Modify generic specfs device open access control checks to use
securelevel_ge() instead of direct securelevel variable checks.

Obtained from:	TrustedBSD Project
2001-09-26 20:18:26 +00:00
Julian Elischer
b40ce4165d KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
Matthew Dillon
0cddd8f023 With Alfred's permission, remove vm_mtx in favor of a fine-grained approach
(this commit is just the first stage).  Also add various GIANT_ macros to
formalize the removal of Giant, making it easy to test in a more piecemeal
fashion. These macros will allow us to test fine-grained locks to a degree
before removing Giant, and also after, and to remove Giant in a piecemeal
fashion via sysctl's on those subsystems which the authors believe can
operate without Giant.
2001-07-04 16:20:28 +00:00
John Baldwin
c7f52620e0 Don't acquire/release Giant around some of the places that need it in
spec_getpages().  Instead, assert that Giant is held by the caller.
2001-05-23 22:20:29 +00:00
Alfred Perlstein
2395531439 Introduce a global lock for the vm subsystem (vm_mtx).
vm_mtx does not recurse and is required for most low level
vm operations.

faults can not be taken without holding Giant.

Memory subsystems can now call the base page allocators safely.

Almost all atomic ops were removed as they are covered under the
vm mutex.

Alpha and ia64 now need to catch up to i386's trap handlers.

FFS and NFS have been tested, other filesystems will need minor
changes (grabbing the vm lock when twiddling page properties).

Reviewed (partially) by: jake, jhb
2001-05-19 01:28:09 +00:00
Bruce Evans
438abdb9c6 Backed out previous commit. It cause massive filesystem corruption,
not to mention a compile-time warning about the critical function
becoming unused, by replacing spec_bmap() with vop_stdbmap().

ntfs seems to have the same bug.

The factor for converting specfs block numbers to physical block
numbers is 1, but vop_stdbmap() uses the bogus factor
btodb(ap->a_vp->v_mount->mnt_stat.f_iosize), which is 16 for ffs with
the default block size of 8K.  This factor is bogus even for vop_stdbmap()
-- the correct factor is related to the filesystem blocksize which is not
necessarily the same to the optimal i/o size.  vop_stdbmap() was apparently
cloned from nfs where these sizes happen to be the same.

There may also be a problem with a_vp->v_mount being null.  spec_bmap()
still checks for this, but I think the checks in specfs are dead code
which used to support block devices.
2001-04-30 14:35:35 +00:00
Poul-Henning Kamp
b7ebffbc08 Add a vop_stdbmap(), and make it part of the default vop vector.
Make 7 filesystems which don't really know about VOP_BMAP rely
on the default vector, rather than more or less complete local
vop_nopbmap() implementations.
2001-04-29 11:48:41 +00:00
Greg Lehey
60fb0ce365 Revert consequences of changes to mount.h, part 2.
Requested by:	bde
2001-04-29 02:45:39 +00:00
Greg Lehey
d98dc34f52 Correct #includes to work with fixed sys/mount.h. 2001-04-23 09:05:15 +00:00
Kirk McKusick
589c7af992 Fixes to track snapshot copy-on-write checking in the specinfo
structure rather than assuming that the device vnode would reside
in the FFS filesystem (which is obviously a broken assumption with
the device filesystem).
2001-03-07 07:09:55 +00:00
Jonathan Lemon
608a3ce62a Extend kqueue down to the device layer.
Backwards compatible approach suggested by: peter
2001-02-15 16:34:11 +00:00
Poul-Henning Kamp
37d4006626 Another round of the <sys/queue.h> FOREACH transmogriffer.
Created with:   sed(1)
Reviewed by:    md5(1)
2001-02-04 16:08:18 +00:00
Poul-Henning Kamp
4997ad7c1f Add a BUF_KERNPROC() in the BIO_DELETE path.
This seems to fix the problem which md(4) backed filesystems exposed.
2001-01-30 10:06:08 +00:00
Matthew Dillon
2a9737202a This patch reestablishes the spec_fsync() guarentee that synchronous
fsyncs, which typically occur during unmounting, will drain all dirty
buffers even if it takes multiple passes to do so.  The guarentee was
mangled by the last patch which solved a problem due to -current disabling
interrupts while holding giant (which caused an infinite spin loop waiting for
I/O to complete).  -stable does not have either patch, but has a similar
bug in the original spec_fsync() code which is triggered by a bug in the
softupdates umount code, a fix for which will be committed to -current
as soon as Kirk stamps it.  Then both solutions will be MFC'd to -stable.

-stable currently suffers from a combination of the softupdates bug and
a small window of opportunity in the original spec_fsync() code, and -stable
also suffers from the spin-loop bug but since interrupts are enabled the
spin resolves itself in a few milliseconds.
2001-01-29 08:19:28 +00:00
Matthew Dillon
08c0a67b2e Fix a lockup problem that occurs with 'cvs update'. specfs's fsync can
get into the same sort of infinite loop that ffs's fsync used to get
into, probably due to background bitmap writes.  The solution is
the same.
2000-12-30 23:32:24 +00:00
Matthew Dillon
2b6b0df712 This implements a better launder limiting solution. There was a solution
in 4.2-REL which I ripped out in -stable and -current when implementing the
low-memory handling solution.  However, maxlaunder turns out to be the saving
grace in certain very heavily loaded systems (e.g. newsreader box).  The new
algorithm limits the number of pages laundered in the first pageout daemon
pass.  If that is not sufficient then suceessive will be run without any
limit.

Write I/O is now pipelined using two sysctls, vfs.lorunningspace and
vfs.hirunningspace.  This prevents excessive buffered writes in the
disk queues which cause long (multi-second) delays for reads.  It leads
to more stable (less jerky) and generally faster I/O streaming to disk
by allowing required read ops (e.g. for indirect blocks and such) to occur
without interrupting the write stream, amoung other things.

NOTE: eventually, filesystem write I/O pipelining needs to be done on a
per-device basis.  At the moment it is globalized.
2000-12-26 19:41:38 +00:00
Poul-Henning Kamp
1d7e3e42e7 Take VBLK devices further out of their missery.
This should fix the panic I introduced in my previous commit on this topic.
2000-11-02 21:14:13 +00:00
Eivind Eklund
7eb9fca557 Blow away the v_specmountpoint define, replacing it with what it was
defined as (rdev->si_mountpoint)
2000-10-09 17:31:39 +00:00
Poul-Henning Kamp
a481b90b82 Fix panic when removing open device (found by bp@)
Implement subdirs.
 Build the full "devicename" for cloning functions.
 Fix panic when deleted device goes away.
 Collaps devfs_dir and devfs_dirent structures.
 Add proper cloning to the /dev/fd* "device-"driver.
 Fix a bug in make_dev_alias() handling which made aliases appear
  multiple times.
 Use devfs_clone to implement getdiskbyname()
 Make specfs maintain the stat(2) timestamps per dev_t
2000-08-24 15:36:55 +00:00
Poul-Henning Kamp
39f70682ae Introduce vop_stdinactive() and make it the default if no vop_inactive
is declared.

Sort and prune a few vop_op[].
2000-08-18 10:01:02 +00:00
Kirk McKusick
9b97113391 This patch corrects the first round of panics and hangs reported
with the new snapshot code.

Update addaliasu to correctly implement the semantics of the old
checkalias function. When a device vnode first comes into existence,
check to see if an anonymous vnode for the same device was created
at boot time by bdevvp(). If so, adopt the bdevvp vnode rather than
creating a new vnode for the device. This corrects a problem which
caused the kernel to panic when taking a snapshot of the root
filesystem.

Change the calling convention of vn_write_suspend_wait() to be the
same as vn_start_write().

Split out softdep_flushworklist() from softdep_flushfiles() so that
it can be used to clear the work queue when suspending filesystem
operations.

Access to buffers becomes recursive so that snapshots can recursively
traverse their indirect blocks using ffs_copyonwrite() when checking
for the need for copy on write when flushing one of their own indirect
blocks. This eliminates a deadlock between the syncer daemon and a
process taking a snapshot.

Ensure that softdep_process_worklist() can never block because of a
snapshot being taken. This eliminates a problem with buffer starvation.

Cleanup change in ffs_sync() which did not synchronously wait when
MNT_WAIT was specified. The result was an unclean filesystem panic
when doing forcible unmount with heavy filesystem I/O in progress.

Return a zero'ed block when reading a block that was not in use at
the time that a snapshot was taken. Normally, these blocks should
never be read. However, the readahead code will occationally read
them which can cause unexpected behavior.

Clean up the debugging code that ensures that no blocks be written
on a filesystem while it is suspended. Snapshots must explicitly
label the blocks that they are writing during the suspension so that
they do not cause a `write on suspended filesystem' panic.

Reorganize ffs_copyonwrite() to eliminate a deadlock and also to
prevent a race condition that would permit the same block to be
copied twice. This change eliminates an unexpected soft updates
inconsistency in fsck caused by the double allocation.

Use bqrelse rather than brelse for buffers that will be needed
soon again by the snapshot code. This improves snapshot performance.
2000-07-24 05:28:33 +00:00
Kirk McKusick
f2a2857bb3 Add snapshots to the fast filesystem. Most of the changes support
the gating of system calls that cause modifications to the underlying
filesystem. The gating can be enabled by any filesystem that needs
to consistently suspend operations by adding the vop_stdgetwritemount
to their set of vnops. Once gating is enabled, the function
vfs_write_suspend stops all new write operations to a filesystem,
allows any filesystem modifying system calls already in progress
to complete, then sync's the filesystem to disk and returns. The
function vfs_write_resume allows the suspended write operations to
begin again. Gating is not added by default for all filesystems as
for SMP systems it adds two extra locks to such critical kernel
paths as the write system call. Thus, gating should only be added
as needed.

Details on the use and current status of snapshots in FFS can be
found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness
is not included here. Unless and until you create a snapshot file,
these changes should have no effect on your system (famous last words).
2000-07-11 22:07:57 +00:00
Poul-Henning Kamp
b7ffb34243 Pull the rug under block mode devices. they return ENXIO on open(2) now. 2000-07-03 13:48:37 +00:00
Poul-Henning Kamp
a2e7a027a7 Virtualizes & untangles the bioops operations vector.
Ref: Message-ID: <18317.961014572@critter.freebsd.dk> To: current@
2000-06-16 08:48:51 +00:00
Jonathan M. Bresler
caea99a368 before this commit, specfs reported disk partitions
using decimal major and minor numbers.  "ls -l" reports
	disk partitions using decimal major numbers and hex
	minor numbers.

	make specfs use decimal major numbers and hex minor numbers,
	just like "ls -l"
2000-06-12 10:20:18 +00:00
Poul-Henning Kamp
192c06ea1b Change the "bdev-whiner" to whine when open is attempted and extend
the deadline a month.
2000-05-09 18:53:57 +00:00
Poul-Henning Kamp
9626b608de Separate the struct bio related stuff out of <sys/buf.h> into
<sys/bio.h>.

<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall
not be made a nested include according to bdes teachings on the
subject of nested includes.

Diskdrivers and similar stuff below specfs::strategy() should no
longer need to include <sys/buf.> unless they need caching of data.

Still a few bogus uses of struct buf to track down.

Repocopy by:    peter
2000-05-05 09:59:14 +00:00
Poul-Henning Kamp
c244d2de43 Move B_ERROR flag to b_ioflags and call it BIO_ERROR.
(Much of this done by script)

Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED.

Move b_pblkno and b_iodone_chain to struct bio while we transition, they
will be obsoleted once bio structs chain/stack.

Add bio_queue field for struct bio aware disksort.

Address a lot of stylistic issues brought up by bde.
2000-04-02 15:24:56 +00:00
Poul-Henning Kamp
b99c307a21 Rename the existing BUF_STRATEGY() to DEV_STRATEGY()
substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo)

substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo)

This patch is machine generated except for the ccd.c and buf.h parts.
2000-03-20 11:29:10 +00:00
Poul-Henning Kamp
21144e3bf1 Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new
field in struct buf: b_iocmd.  The b_iocmd is enforced to have
exactly one bit set.

B_WRITE was bogusly defined as zero giving rise to obvious coding
mistakes.

Also eliminate the redundant struct buf flag B_CALL, it can just
as efficiently be done by comparing b_iodone to NULL.

Should you get a panic or drop into the debugger, complaining about
"b_iocmd", don't continue.  It is likely to write on your disk
where it should have been reading.

This change is a step in the direction towards a stackable BIO capability.

A lot of this patch were machine generated (Thanks to style(9) compliance!)

Vinum users:  Greg has not had time to test this yet, be careful.
2000-03-20 10:44:49 +00:00
Poul-Henning Kamp
db5f635acc Eliminate the undocumented, experimental, non-delivering and highly
dangerous MAX_PERF option.
2000-03-16 08:51:55 +00:00
Poul-Henning Kamp
ba4ad1fcea Give vn_isdisk() a second argument where it can return a suitable errno.
Suggested by:	bde
2000-01-10 12:04:27 +00:00
Poul-Henning Kamp
5ecdb702b0 Remove unused #includes.
Obtained from:	http://bogon.freebsd.dk/include
1999-12-08 08:59:40 +00:00
Kirk McKusick
e9cc475851 Collect read and write counts for filesystems. This new code
drops the counting in bwrite and puts it all in spec_strategy.
I did some tests and verified that the counts collected for writes
in spec_strategy is identical to the counts that we previously
collected in bwrite. We now also get read counts (async reads
come from requests for read-ahead blocks). Note that you need
to compile a new version of mount to get the read counts printed
out. The old mount binary is completely compatible, the only
reason to install a new mount is to get the read counts printed.

Submitted by:	Craig A Soules <soules+@andrew.cmu.edu>
Reviewed by:	Kirk McKusick <mckusick@mckusick.com>
1999-12-01 02:09:30 +00:00
Poul-Henning Kamp
698f9cf828 Next step in the device cleanup process.
Correctly lock vnodes when calling VOP_OPEN() from filesystem mount code.

Unify spec_open() for bdev and cdev cases.

Remove the disabled bdev specific read/write code.
1999-11-09 14:15:33 +00:00
Poul-Henning Kamp
b454088b82 Oops, a bit too hasty there. 1999-11-08 13:08:02 +00:00
Poul-Henning Kamp
0ed43ec68c Various cleanups. 1999-11-08 09:59:34 +00:00
Poul-Henning Kamp
f7ee7bbb21 Use vop_panic() instead of spec_badop(). 1999-11-07 15:09:59 +00:00
Poul-Henning Kamp
be8479a836 Remove the iskmemdev() function. Make it the responsibility of the mem.c
drivers to enforce the securelevel checks.
1999-11-07 12:01:32 +00:00
Poul-Henning Kamp
dc0f93d45d Remove specfs::vop_lookup() There is no code path which can call it. 1999-11-01 02:53:38 +00:00
Poul-Henning Kamp
923502ff91 useracc() the prequel:
Merge the contents (less some trivial bordering the silly comments)
of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>.  This puts
the #defines for the vm_inherit_t and vm_prot_t types next to their
typedefs.

This paves the road for the commit to follow shortly: change
useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE}
as argument.
1999-10-29 18:09:36 +00:00
Matthew Dillon
dfbdd7d2b3 A tentative agreement has been reached in regards to a procedure
to remove 'b'lock devices.  The agreement is, essentially, that
    block devices will be collapsed into character devices as a first
    step (though I don't particularly agree), and raw device names 'rxxx'
    will become simply 'xxx' in devfs in the second step (i.e. no 'rxxx'
    names will exist).  The renaming will not effect the original /dev
    and the expectation is that devfs will eventually (but not immediately)
    become the standard way to access devices in the system.

    If it is determined that a reimplementation of block device access
    characteristics is beneficial, a number of alternatives will
    be possible that do not involve resurrecting the 'b'lock device class.
    For example, an ioctl() that might be made on an open character device
    descriptor or a generic buffered overlay device.

    This commit removes the blockdev disablement sysctl which does not
    apply to the solution that was reached.
1999-10-20 06:31:49 +00:00
Poul-Henning Kamp
8342096f29 Change the default for the vfs.bdev_buffered sysctl to zero.
This means that access to block devices nodes will act the
same as char device nodes for disk-like devices.

If you encounter problems after this, where programs accessing
disks directly fail to operate, please use the following command
to revert to previous behaviour:

        sysctl -w vfs.bdev_buffered=1

And verify that this was indeed the cause of your trouble.

See the mail-archives of the arch@FreeBSD.org list for background.
1999-10-18 16:59:50 +00:00
Poul-Henning Kamp
1201869007 Add a couple of strategic KASSERTs 1999-10-08 19:07:23 +00:00
Poul-Henning Kamp
856de19089 Add back sysctl vfs.enable_userblk_io 1999-10-08 18:25:19 +00:00
Poul-Henning Kamp
adab70d67a Warn once per driver about dev_t's not registered with make_dev(). 1999-10-04 12:33:05 +00:00
Poul-Henning Kamp
aa4f4b695e Move the buffered read/write code out of spec_{read|write} and into
two new functions spec_buf{read|write}.

Add sysctl vfs.bdev_buffered which defaults to 1 == true.  This
sysctl can be used to experimentally turn buffered behaviour for
bdevs off.  I should not be changed while any blockdevices are
open.  Remove the misplaced sysctl vfs.enable_userblk_io.

No other changes in behaviour.
1999-10-04 11:23:10 +00:00
Poul-Henning Kamp
1b5464ef9d Remove v_maxio from struct vnode.
Replace it with mnt_iosize_max in struct mount.

Nits from:	bde
1999-09-29 20:05:33 +00:00
Poul-Henning Kamp
9ab8e01991 Remove a warning check which was too general. 1999-09-25 18:52:03 +00:00
Poul-Henning Kamp
d6a0e38a1b Remove five now unused fields from struct cdevsw. They should never
have been there in the first place.  A GENERIC kernel shrinks almost 1k.

Add a slightly different safetybelt under nostop for tty drivers.

Add some missing FreeBSD tags
1999-09-25 18:24:47 +00:00
Poul-Henning Kamp
ae8e1d08d7 This patch clears the way for removing a number of tty related
fields in struct cdevsw:

        d_stop          moved to struct tty.
        d_reset         already unused.
        d_devtotty      linkage now provided by dev_t->si_tty.

These fields will be removed from struct cdevsw together with
d_params and d_maxio Real Soon Now.

The changes in this patch consist of:

        initialize dev->si_tty in *_open()
        initialize tty->t_stop
        remove devtotty functions
        rename ttpoll to ttypoll
        a few adjustments to these changes in the generic code
        a bump of __FreeBSD_version
        add a couple of FreeBSD tags
1999-09-25 16:21:39 +00:00
Poul-Henning Kamp
c428d4c048 Kill the cdevsw->d_maxio field.
d_maxio is replaced by the dev->si_iosize_max field which the driver
should be set in all calls to cdevsw->d_open if it has a better
idea than the system wide default.

The field is a generic dev_t field (ie: not disk specific) so that
tapes and other devices can use physio as well.
1999-09-22 19:56:14 +00:00
Matthew Dillon
8411bf657d Fix handling of a device EOF that occurs in the middle of a block. The
transfer size calculation was incorrect resulting in the last read being
    potentially larger then the actual extent of the device.

    EOF and write handling has not yet been fixed.

Reviewed by:	Tor.Egge@fast.no
1999-09-20 23:17:47 +00:00
Poul-Henning Kamp
fae03f66d1 Step one of replacing devsw->d_maxio with si_bsize_max.
Rename dev->si_bsize_max to si_iosize_max and set it in spec_open
if the device didn't.

Set vp->v_maxio from dev->si_bsize_max in spec_open rather than
in ufs_bmap.c
1999-09-20 19:57:28 +00:00
Matthew Dillon
bb01f28e97 Add vfs.enable_userblk_io sysctl to control whether user reads and writes
to buffered block devices are allowed.  The default is to be backwards
    compatible, i.e. reads and writes are allowed.

    The idea is for a larger crowd to start running with this disabled and
    see what problems, if any, crop up, and then to change the default to
    off and see if any problems crop up in the next 6 months prior to
    potentially removing support entirely.  There are still a few people,
    Julian and myself included, who believe the buffered block device
    access from usermode to be useful.

    Remove use of vnode->v_lastr from buffered block device I/O in
    preparation for removal of vnode->v_lastr field, replacing it with
    the already existing seqcount metric to detect sequential operation.

Reviewed by:	Alan Cox <alc@cs.rice.edu>, David Greenman <dg@root.com>
1999-09-17 06:10:27 +00:00
Julian Elischer
85a219d201 Changes to centralise the default blocksize behaviour.
More likely to follow.

Submitted by: phk@freebsd.org
1999-09-09 19:08:44 +00:00
Julian Elischer
1f7e07329d Print out the device name when there is an uninitialised IO size or IO error
in spec_getpages().

Submitted by:	phk suggested the idea.
1999-09-03 09:14:36 +00:00
Julian Elischer
c5d6480694 Add a catchall to set default blocksize values for disk like devices.
Submitted by:	phk@freebsd.org
1999-09-03 08:26:46 +00:00
Julian Elischer
7012bab988 Revert a bunch of contraversial changes by PHK. After
a quick think and discussion among various people some form of some of
these changes will probably be recommitted.

The reversion requested was requested by dg while discussions proceed.
PHK has indicated that he can live with this, and it has been agreed
that some form of some of these changes may return shortly after further
discussion.
1999-09-03 05:16:59 +00:00
Poul-Henning Kamp
3b7df19ba4 Set the buffersize for non BSDFFS labeled partitions to
max(dev->si_bsize_phys, BLKDEV_IOSIZE).

Requested by:   davidg
1999-08-31 21:46:42 +00:00
Poul-Henning Kamp
586e1b7b46 Make buffered acces to bdevs from userland controllable with
a sysctl vfs.bdev_access.
1999-08-31 21:01:57 +00:00
Poul-Henning Kamp
02e1576966 Make bdev userland access work like cdev userland access unless
the highly non-recommended option ALLOW_BDEV_ACCESS is used.

(bdev access is evil because you don't get write errors reported.)

Kill si_bsize_best before it kills Matt :-)

Use the specfs routines rather having cloned copies in devfs.
1999-08-30 07:56:23 +00:00
Peter Wemm
c3aac50f28 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
Poul-Henning Kamp
dbafb3660f Simplify the handling of VCHR and VBLK vnodes using the new dev_t:
Make the alias list a SLIST.

        Drop the "fast recycling" optimization of vnodes (including
        the returning of a prexisting but stale vnode from checkalias).
        It doesn't buy us anything now that we don't hardlimit
        vnodes anymore.

        Rename checkalias2() and checkalias() to addalias() and
        addaliasu() - which takes dev_t and udev_t arg respectively.

        Make the revoke syscalls use vcount() instead of VALIASED.

        Remove VALIASED flag, we don't need it now and it is faster
        to traverse the much shorter lists than to maintain the
        flag.

        vfs_mountedon() can check the dev_t directly, all the vnodes
        point to the same one.

Print the devicename in specfs/vprint().

Remove a couple of stale LFS vnode flags.

Remove unimplemented/unused LK_DRAINED;
1999-08-26 14:53:31 +00:00
Julian Elischer
bb70475a92 Fix comment to match reality..
vop_strategy gets a vnode argument these days.
1999-08-25 00:26:34 +00:00
Alan Cox
2c28a10540 Add the (inline) function vm_page_undirty for clearing the dirty bitmask
of a vm_page.

Use it.

Submitted by:	dillon
1999-08-17 04:02:34 +00:00
Poul-Henning Kamp
49ff4debd3 Spring cleaning around strategy and disklabels/slices:
Introduce BUF_STRATEGY(struct buf *, int flag) macro, and use it throughout.
please see comment in sys/conf.h about the flag argument.

Remove strategy argument from all the diskslice/label/bad144
implementations, it should be found from the dev_t.

Remove bogus and unused strategy1 routines.

Remove open/close arguments from dssize().  Pick them up from dev_t.

Remove unused and unfinished setgeom support from diskslice/label/bad144 code.
1999-08-14 11:40:51 +00:00
Poul-Henning Kamp
2820b2e762 Add support for device drivers which want to track all open/close
operations.  This allows a device driver better insight into
what is going on that the current:

        proc1:  open /dev/foo R/O
                        devsw->open( R/O, proc1 )
        proc2:  open /dev/foo R/W
                        devsw->open( R/W, proc2 )
        proc2:  close
                        /* nothing, but device is
                           really only R/O open */
        proc1:  close
                        devsw->close( R/O, proc1 )
1999-08-13 16:29:27 +00:00
Poul-Henning Kamp
608bb3ffdf Remove spec_getattr(), which as far as I can tell can never be called from the current code-paths, and if it were, would panic on any unmounted bdev. 1999-08-13 10:53:58 +00:00
Poul-Henning Kamp
7dc5cd047f The bdevsw() and cdevsw() are now identical, so kill the former. 1999-08-13 10:29:38 +00:00
Poul-Henning Kamp
4d4f932326 s/v_specinfo/v_rdev/ 1999-08-13 10:10:12 +00:00
Poul-Henning Kamp
0ef1c82630 Decommision miscfs/specfs/specdev.h. Most of it goes into <sys/conf.h>,
a few lines into <sys/vnode.h>.

Add a few fields to struct specinfo, paving the way for the fun part.
1999-08-08 18:43:05 +00:00
Poul-Henning Kamp
698bfad7f2 Now a dev_t is a pointer to struct specinfo which is shared by all specdev
vnodes referencing this device.

Details:
        cdevsw->d_parms has been removed, the specinfo is available
        now (== dev_t) and the driver should modify it directly
        when applicable, and the only driver doing so, does so:
        vn.c.  I am not sure the logic in checking for "<" was right
        before, and it looks even less so now.

        An intial pool of 50 struct specinfo are depleted during
        early boot, after that malloc had better work.  It is
        likely that fewer than 50 would do.

        Hashing is done from udev_t to dev_t with a prime number
        remainder hash, experiments show no better hash available
        for decent cost (MD5 is only marginally better)  The prime
        number used should not be close to a power of two, we use
        83 for now.

        Add new checkalias2() to get around the loss of info from
        dev2udev() in bdevvp();

        The aliased vnodes are hung on a list straight of the dev_t,
        and speclisth[SPECSZ] is unused.  The sharing of struct
        specinfo means that the v_specnext moves into the vnode
        which grows by 4 bytes.

        Don't use a VBLK dev_t which doesn't make sense in MFS, now
        we hang a dummy cdevsw on B/Cmaj 253 so that things look sane.

	Storage overhead from all of this is O(50k).

        Bump __FreeBSD_version to 400009

The next step will add the stuff needed so device-drivers can start to
hang things from struct specinfo
1999-07-20 09:47:55 +00:00
Kirk McKusick
67812eacd7 Convert buffer locking from using the B_BUSY and B_WANTED flags to using
lockmgr locks. This commit should be functionally equivalent to the old
semantics. That is, all buffer locking is done with LK_EXCLUSIVE
requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will
be done in future commits.
1999-06-26 02:47:16 +00:00
Dmitrij Tejblum
e2754b751d Remove an unused variable. 1999-06-01 20:29:58 +00:00
Poul-Henning Kamp
2447bec829 Simplify cdevsw registration.
The cdevsw_add() function now finds the major number(s) in the
struct cdevsw passed to it.  cdevsw_add_generic() is no longer
needed, cdevsw_add() does the same thing.

cdevsw_add() will print an message if the d_maj field looks bogus.

Remove nblkdev and nchrdev variables.  Most places they were used
bogusly.  Instead check a dev_t for validity by seeing if devsw()
or bdevsw() returns NULL.

Move bdevsw() and devsw() functions to kern/kern_conf.c

Bump __FreeBSD_version to 400006

This commit removes:
        72 bogus makedev() calls
        26 bogus SYSINIT functions

if_xe.c bogusly accessed cdevsw[], author/maintainer please fix.

I4b and vinum not changed.  Patches emailed to authors.  LINT
probably broken until they catch up.
1999-05-31 11:29:30 +00:00
Poul-Henning Kamp
bfbb9ce670 Divorce "dev_t" from the "major|minor" bitmap, which is now called
udev_t in the kernel but still called dev_t in userland.

Provide functions to manipulate both types:
        major()         umajor()
        minor()         uminor()
        makedev()       umakedev()
        dev2udev()      udev2dev()

For now they're functions, they will become in-line functions
after one of the next two steps in this process.

Return major/minor/makedev to macro-hood for userland.

Register a name in cdevsw[] for the "filedescriptor" driver.

In the kernel the udev_t appears in places where we have the
major/minor number combination, (ie: a potential device: we
may not have the driver nor the device), like in inodes, vattr,
cdevsw registration and so on, whereas the dev_t appears where
we carry around a reference to a actual device.

In the future the cdevsw and the aliased-from vnode will be hung
directly from the dev_t, along with up to two softc pointers for
the device driver and a few houskeeping bits.  This will essentially
replace the current "alias" check code (same buck, bigger bang).

A little stunt has been provided to try to catch places where the
wrong type is being used (dev_t vs udev_t), if you see something
not working, #undef DEVT_FASCIST in kern/kern_conf.c and see if
it makes a difference.  If it does, please try to track it down
(many hands make light work) or at least try to reproduce it
as simply as possible, and describe how to do that.

Without DEVT_FASCIST I belive this patch is a no-op.

Stylistic/posixoid comments about the userland view of the <sys/*.h>
files welcome now, from userland they now contain the end result.

Next planned step: make all dev_t's refer to the same devsw[] which
means convert BLK's to CHR's at the perimeter of the vnodes and
other places where they enter the game (bootdev, mknod, sysctl).
1999-05-11 19:55:07 +00:00
Poul-Henning Kamp
4be2eb8c49 I got tired of seeing all the cdevsw[major(foo)] all over the place.
Made a new (inline) function devsw(dev_t dev) and substituted it.

Changed to the BDEV variant to this format as well: bdevsw(dev_t dev)

DEVFS will eventually benefit from this change too.
1999-05-08 06:40:31 +00:00
Poul-Henning Kamp
46eede0058 Continue where Julian left off in July 1998:
Virtualize bdevsw[] from cdevsw.  bdevsw() is now an (inline)
        function.

        Join CDEV_MODULE and BDEV_MODULE to DEV_MODULE (please pay attention
        to the order of the cmaj/bmaj arguments!)

        Join CDEV_DRIVER_MODULE and BDEV_DRIVER_MODULE to DEV_DRIVER_MODULE
        (ditto!)

(Next step will be to convert all bdev dev_t's to cdev dev_t's
before they get to do any damage^H^H^H^H^H^Hwork in the kernel.)
1999-05-07 10:11:40 +00:00
Poul-Henning Kamp
b0eeea2042 remove b_proc from struct buf, it's (now) unused.
Reviewed by:	dillon, bde
1999-05-06 20:00:34 +00:00
Julian Elischer
8d17e69460 Catch a case spotted by Tor where files mmapped could leave garbage in the
unallocated parts of the last page when the file ended on a frag
but not a page boundary.
Delimitted by tags PRE_MATT_MMAP_EOF and POST_MATT_MMAP_EOF,
in files alpha/alpha/pmap.c i386/i386/pmap.c nfs/nfs_bio.c vm/pmap.h
    vm/vm_page.c vm/vm_page.h vm/vnode_pager.c miscfs/specfs/spec_vnops.c
    ufs/ufs/ufs_readwrite.c kern/vfs_bio.c

Submitted by: Matt Dillon <dillon@freebsd.org>
Reviewed by: Alan Cox <alc@freebsd.org>
1999-04-05 19:38:30 +00:00
Matthew Dillon
155f87daf2 Reviewed by: Julian Elischer <julian@whistle.com>
Add d_parms() to {c,b}devsw[].  If non-NULL this function points to
    a device routine that will properly fill in the specinfo structure.
    vfs_subr.c's checkalias() supplies appropriate defaults.  This change
    should be fully backwards compatible with existing devices.
1999-02-25 05:22:30 +00:00
Matthew Dillon
831a80b0d5 Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile
1999-01-27 22:42:27 +00:00
Matthew Dillon
1c7c3c6a86 This is a rather large commit that encompasses the new swapper,
changes to the VM system to support the new swapper, VM bug
    fixes, several VM optimizations, and some additional revamping of the
    VM code.  The specific bug fixes will be documented with additional
    forced commits.  This commit is somewhat rough in regards to code
    cleanup issues.

Reviewed by:	"John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
1999-01-21 08:29:12 +00:00
Eivind Eklund
e910d98670 Fix possible NULL-pointer deref in error case (same as DEVFS). 1998-12-16 00:10:51 +00:00
Archie Cobbs
f1d19042b0 The "easy" fixes for compiling the kernel -Wunused: remove unreferenced static
and local variables, goto labels, and functions declared but not defined.
1998-12-07 21:58:50 +00:00
Peter Wemm
40c8cfe552 Use TAILQ macros for clean/dirty block list processing. Set b_xflags
rather than abusing the list next pointer with a magic number.
1998-10-31 15:31:29 +00:00
Bruce Evans
569555b969 Removed redundant bitrotted checks for major numbers instead of updating
them.
1998-10-26 08:53:13 +00:00
Poul-Henning Kamp
649c00db71 various nits that didn't make it through the brucefilter. 1998-09-12 20:21:54 +00:00
Poul-Henning Kamp
0375c9f2b8 Add a new vnode op, VOP_FREEBLKS(), which filesystems can use to inform
device drivers about sectors no longer in use.

Device-drivers receive the call through d_strategy, if they have
D_CANFREE in d_flags.

This allows flash based devices to erase the sectors and avoid
pointlessly carrying them around in compactions.

Reviewed by:	Kirk Mckusick, bde
Sponsored by:	M-Systems (www.m-sys.com)
1998-09-05 14:13:12 +00:00
Doug Rabson
e69763a315 Cosmetic changes to the PAGE_XXX macros to make them consistent with
the other objects in vm.
1998-09-04 08:06:57 +00:00
Poul-Henning Kamp
a9ea5c0c51 sort the prototypes 1998-08-25 17:48:54 +00:00
Poul-Henning Kamp
07fe032454 Last commit managed to get mangled somehow. 1998-08-24 18:23:18 +00:00