Commit Graph

272 Commits

Author SHA1 Message Date
phk
500e41bd71 I got tired of seeing all the cdevsw[major(foo)] all over the place.
Made a new (inline) function devsw(dev_t dev) and substituted it.

Changed to the BDEV variant to this format as well: bdevsw(dev_t dev)

DEVFS will eventually benefit from this change too.
1999-05-08 06:40:31 +00:00
phk
693dd58bb3 Continue where Julian left off in July 1998:
Virtualize bdevsw[] from cdevsw.  bdevsw() is now an (inline)
        function.

        Join CDEV_MODULE and BDEV_MODULE to DEV_MODULE (please pay attention
        to the order of the cmaj/bmaj arguments!)

        Join CDEV_DRIVER_MODULE and BDEV_DRIVER_MODULE to DEV_DRIVER_MODULE
        (ditto!)

(Next step will be to convert all bdev dev_t's to cdev dev_t's
before they get to do any damage^H^H^H^H^H^Hwork in the kernel.)
1999-05-07 10:11:40 +00:00
alc
5cb08a2652 The VFS/BIO subsystem contained a number of hacks in order to optimize
piecemeal, middle-of-file writes for NFS.  These hacks have caused no
end of trouble, especially when combined with mmap().  I've removed
them.  Instead, NFS will issue a read-before-write to fully
instantiate the struct buf containing the write.  NFS does, however,
optimize piecemeal appends to files.  For most common file operations,
you will not notice the difference.  The sole remaining fragment in
the VFS/BIO system is b_dirtyoff/end, which NFS uses to avoid cache
coherency issues with read-merge-write style operations.  NFS also
optimizes the write-covers-entire-buffer case by avoiding the
read-before-write.  There is quite a bit of room for further
optimization in these areas.

The VM system marks pages fully-valid (AKA vm_page_t->valid =
VM_PAGE_BITS_ALL) in several places, most noteably in vm_fault.  This
is not correct operation.  The vm_pager_get_pages() code is now
responsible for marking VM pages all-valid.  A number of VM helper
routines have been added to aid in zeroing-out the invalid portions of
a VM page prior to the page being marked all-valid.  This operation is
necessary to properly support mmap().  The zeroing occurs most often
when dealing with file-EOF situations.  Several bugs have been fixed
in the NFS subsystem, including bits handling file and directory EOF
situations and buf->b_flags consistancy issues relating to clearing
B_ERROR & B_INVAL, and handling B_DONE.

getblk() and allocbuf() have been rewritten.  B_CACHE operation is now
formally defined in comments and more straightforward in
implementation.  B_CACHE for VMIO buffers is based on the validity of
the backing store.  B_CACHE for non-VMIO buffers is based simply on
whether the buffer is B_INVAL or not (B_CACHE set if B_INVAL clear,
and vise-versa).  biodone() is now responsible for setting B_CACHE
when a successful read completes.  B_CACHE is also set when a bdwrite()
is initiated and when a bwrite() is initiated.  VFS VOP_BWRITE
routines (there are only two - nfs_bwrite() and bwrite()) are now
expected to set B_CACHE.  This means that bowrite() and bawrite() also
set B_CACHE indirectly.

There are a number of places in the code which were previously using
buf->b_bufsize (which is DEV_BSIZE aligned) when they should have
been using buf->b_bcount.  These have been fixed.  getblk() now clears
B_DONE on return because the rest of the system is so bad about
dealing with B_DONE.

Major fixes to NFS/TCP have been made.  A server-side bug could cause
requests to be lost by the server due to nfs_realign() overwriting
other rpc's in the same TCP mbuf chain.  The server's kernel must be
recompiled to get the benefit of the fixes.

Submitted by:	Matthew Dillon <dillon@apollo.backplane.com>
1999-05-02 23:57:16 +00:00
phk
ca21a25f17 This Implements the mumbled about "Jail" feature.
This is a seriously beefed up chroot kind of thing.  The process
is jailed along the same lines as a chroot does it, but with
additional tough restrictions imposed on what the superuser can do.

For all I know, it is safe to hand over the root bit inside a
prison to the customer living in that prison, this is what
it was developed for in fact:  "real virtual servers".

Each prison has an ip number associated with it, which all IP
communications will be coerced to use and each prison has its own
hostname.

Needless to say, you need more RAM this way, but the advantage is
that each customer can run their own particular version of apache
and not stomp on the toes of their neighbors.

It generally does what one would expect, but setting up a jail
still takes a little knowledge.

A few notes:

   I have no scripts for setting up a jail, don't ask me for them.

   The IP number should be an alias on one of the interfaces.

   mount a /proc in each jail, it will make ps more useable.

   /proc/<pid>/status tells the hostname of the prison for
   jailed processes.

   Quotas are only sensible if you have a mountpoint per prison.

   There are no privisions for stopping resource-hogging.

   Some "#ifdef INET" and similar may be missing (send patches!)

If somebody wants to take it from here and develop it into
more of a "virtual machine" they should be most welcome!

Tools, comments, patches & documentation most welcome.

Have fun...

Sponsored by:   http://www.rndassociates.com/
Run for almost a year by:       http://www.servetheweb.com/
1999-04-28 11:38:52 +00:00
phk
16e3fbd2c1 Suser() simplification:
1:
  s/suser/suser_xxx/

2:
  Add new function: suser(struct proc *), prototyped in <sys/proc.h>.

3:
  s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/

The remaining suser_xxx() calls will be scrutinized and dealt with
later.

There may be some unneeded #include <sys/cred.h>, but they are left
as an exercise for Bruce.

More changes to the suser() API will come along with the "jail" code.
1999-04-27 11:18:52 +00:00
bde
7435c5f5ec Don't depend on <ufs/ufs/quota.h> or another (old) prerequisite including
<sys/queue.h>.  This fixes my recent breakage of biosboot by unpolluting
<ufs/ufs/quota.h> in the !KERNEL case.
1999-03-06 05:21:09 +00:00
imp
f86ebee5a7 Merge patch to ufs_vnops.c's ufs_rename to the copy of ufs_rename that
lives in ext2_vnops.c for ext2fs.  Also remove cast from comparision.
Bruce pointed out that it was bogus since we'd force a signed
comparision when we really wanted an unsigned comparison.
1999-03-02 05:31:47 +00:00
bde
ae4dfde7bc Added a used #include (don't depend on "vnode_if.h" including <sys/buf.h>). 1999-02-25 15:54:06 +00:00
bde
259b16911f Fixed parenthesization botch in previous commit. Async update of inodes
was broken.
1999-01-29 15:36:05 +00:00
dillon
975fba8a24 Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile
1999-01-28 00:57:57 +00:00
dillon
f9a4729a9b Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile.

    This commit includes significant work to proper handle const arguments
    for the DDB symbol routines.
1999-01-27 23:45:44 +00:00
dillon
a40e0249d4 Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile
1999-01-27 21:50:00 +00:00
eivind
8369d98994 Avoid warning for unused variable. 1999-01-11 23:32:35 +00:00
bde
2facf6978a Don't pass unused unused timestamp args to UFS_UPDATE() or waste
time initializing them.  This almost finishes centralizing (in-core)
timestamp updates in ufs_itimes().
1999-01-07 16:14:19 +00:00
bde
2a8b656860 UFS_UPDATE() takes a boolean `waitfor' arg, so don't pass it the value
MNT_WAIT when we mean boolean `true' or check for that value not being
passed.  There was no problem in practice because MNT_WAIT had the
magic value of 1.
1999-01-06 18:18:06 +00:00
bde
734d13314e Ifdefed conditionally used simplock variables. 1999-01-02 11:34:57 +00:00
archie
60d13c7a9d The "easy" fixes for compiling the kernel -Wunused: remove unreferenced static
and local variables, goto labels, and functions declared but not defined.
1998-12-07 21:58:50 +00:00
bde
d3f53ca3db Fixed a misspelling of boolean true as MNT_WAIT. 1998-11-15 15:46:33 +00:00
peter
73192d8050 add #include <sys/kernel.h> where it's needed by MALLOC_DEFINE() 1998-11-10 09:16:29 +00:00
peter
8ef35acf90 Use TAILQ macros for clean/dirty block list processing. Set b_xflags
rather than abusing the list next pointer with a magic number.
1998-10-31 15:31:29 +00:00
peter
0c010ce2d1 error return assignment was less than ideal. Fix the part that caused
warnings to be the same as the ffs code.  Previously, any error from
the UFS_UPDATE() call was lost (I think).
1998-10-29 09:44:12 +00:00
peter
c002ba1c81 Use vtruncbuf() to clean out cached blocks on a file shorten rather than
the more expensive vinvalbuf(), based on the FFS version of the same
routine.  I don't have any ext2fs filesystems to test this on.
1998-10-29 09:30:52 +00:00
bde
bd7a76a938 Oops, the redundant tests for major numbers weren't redundant here.
They checked for the magic major number for the "device" behind mfs
mount points.  Use a more obvious check for this device.

Debugged by:		Andrew Gallatin <gallatin@cs.duke.edu>
1998-10-27 11:47:08 +00:00
bde
873d7be484 Removed redundant bitrotted checks for major numbers instead of updating
them.
1998-10-26 08:53:13 +00:00
bde
558766fa94 Don't follow null bdevsw pointers. The `major(dev) < nblkdev' test rotted
when bdevsw[] became sparse.  We still depend on magic to avoid having to
check that (v_rdev) device numbers in vnodes are not NODEV.
1998-10-25 19:26:18 +00:00
bde
8ccf93af58 Fixed bloatage of `struct inode'. We used 5 "spare" fields for ext2fs,
but when i_effnlink was added to support soft updates, there was only
room for 4 spares.  The number of spares was not reduced, so the inode
size became 260 (on i386's), or 512 after rounding up by malloc().
Use one spare field in `struct dinode' instead of the 5th spare field
in the inode and reduced to 4 spares in the inode so that the size is
256 again.

Changed the types of the spares in the inode from int to u_int32_t
so that the inode size has more chance of being <= 256 under other
arches, and downdated ext2fs to match (it was broken to use ints
before rev.1.1).
1998-10-13 15:45:43 +00:00
bde
e7b28f3337 Quick fix for not being able to sync all the buffers in boot() if
an ext2fs file system is mounted.  The soft update changes added
a check for B_DELWRI buffers.  This exposed the complete brokenness
of the previous quick fix for failing syncs (PR 3571, committed on
1997/08/04).  Use a new buffer flag B_DIRTY and don't abuse B_DELWRI.
B_DIRTY buffers are still written too late, as broken in the previous
fix.  This is fairly harmless, because B_DIRTY is only used for
bitmap buffers and fsck.ext2 can fix up the bitmaps perfectly.

Fixed a race in ULCK_BUF() (bremfree() was outside of the splbio()
section).
1998-10-03 16:19:28 +00:00
bde
fd94b4c920 Fixed initialization of new inodes. ext2fs doesn't clear inodes when
they are deleted, so inodes must be cleared when they are reused, but
we didn't clear the indirect blocks.  This caused serious filesystem
corruption.
1998-09-29 08:07:32 +00:00
bde
30c497ff89 Updated ext2_reload() and ext2_sync(). Locking was broken, and MNT_LAZY
syncs weren't optimized properly (they probably still aren't, but are bug
for bug compatible with ffs).  These fixes are mostly academic, since
ext2fs is too broken to mount read-write (it apparently doesn't clear
indirect blocks).

Obtained from:	mostly from Lite2
1998-09-26 12:42:17 +00:00
bde
3833ff5964 Fixed missing newlines in messages in ext2_check_descriptors().
Fixed vnode and memory leaks after an unlikely (?) error in
ext2_mountfs().

Fixed an unconditional memory leak in ext2_unmount().
1998-09-26 07:16:41 +00:00
bde
4ef2ffab16 Fixed clean flag handling:
Fixes for bugs not shared with ffs:
- don't mount unclean filesystems rw unless forced to.
- accept EXT2_ERROR_FS (treat it like !EXT2_VALID_FS).  We still don't set
  this or honour the maximal mount count.
- don't attempt to print the name of the mount point when mounting an
  unclean file system, since the name of the previous mount point is
  unknown and the name of the current mount point is still "".

Fixes for bugs shared with ffs until recently:
- don't set the clean flag on unmount of an initially-unclean filesystem
  that was (forcibly) mounted rw.
- set the clean flag on rw -> ro update of a mounted initially-clean
  filesystem.
- fixed some style bugs (mostly long lines).

The fixes are slightly simpler than for ffs, because the relevant on-disk
state is not a simple boolean variable, and the superblock has a core-only
extension.

Obtained from:	parts from ffs_vfsops.c, parts from NetBSD
1998-09-26 06:18:59 +00:00
bde
6dbde5515d Fixed the usual missing permissions checks in mount(). As for cd9660,
the damage was limited by the default of 0 for vfs.usermount.

Obtained from:	Lite2 via the -current ffs_vfsops.c
1998-09-09 20:21:18 +00:00
bde
d2c322d7f6 Don't forget to initialize the inode lock. This bug caused
surprisingly few problems.  Most fields were initialized to the
correct values by bzero(), but lk_prio was 0 instead of PINOD (=8),
the lk_wmsg was NULL instead of "ext2in", and lk_lockholder was 0
instead of -1.

Obtained from:	Lite2 via the -current ffs_vfsops.c
1998-09-09 13:09:24 +00:00
bde
9493c9f45d Support compiling with `gcc -pedantic' (don't use hard newlines in
(asm) string constants).
1998-09-09 12:22:17 +00:00
bde
e170b2ba75 Removed statically configured mount type numbers (MOUNT_*) and all
references to them.

The change a couple of days ago to ignore these numbers in statically
configured vfsconf structs was slightly premature because the cd9660,
cfs, devfs, ext2fs, nfs vfs's still used MOUNT_* instead of the number
in their vfsconf struct.
1998-09-07 13:17:06 +00:00
bde
605f4cb33b Quick fix for breakage of read clustering on non-IDE drives. Read
clustering is obsolescent technology so hardly anyone noticed.  On
a DORS 32160 SCSI drive with 4 tags, read clustering makes very
little difference even for huge sequential reads.  However, on a
ZIP SCSI drive with 0 tags, the minimum overhead per block is about
40 msec, so very large clusters must be used to get anywhere near
the maximum transfer rate.  Using clusters consisting of 1 8K block
reduces the transfer rate to about 250K/sec.  Under msdosfs, missing
read clustering is normal and a cluster size of 1 512 byte block
reduces the transfer rate to about 25K/sec.

Broken in:	rev.1.18
1998-08-18 03:54:39 +00:00
msmith
64b624ba3e "The releaseing of the reference and lock is not temporary and belongs
where it is.  The reference and lock(s) are acquired just above the
 code in VREF() and relookup()."

Submitted by:	Michael Hancock <michaelh@cet.co.jp>
1998-08-12 21:42:54 +00:00
bde
d00cccd49f Fixed printf format errors. 1998-07-30 17:12:39 +00:00
julian
c9a6a4d566 add anti-panic workaround from chris radek (cradek@in221.inetnebr.com)
Not sure why this is needed but but does stop crashes.
1998-07-30 03:22:52 +00:00
bde
f0b863f4b5 Fixed printf format errors. 1998-07-11 07:46:16 +00:00
julian
78a155ddaf Catch a few corner cases where FreeBSD differs enough from BSD 4.4 to
confuse Soft updates..
Should solve several "dangling deps" panics.
1998-07-08 01:04:33 +00:00
julian
4363221ba2 VOP_STRATEGY grows an (struct vnode *) argument
as the value in b_vp is often not really what you want.
(and needs to be frobbed). more cleanups will follow this.
Reviewed by: Bruce Evans <bde@freebsd.org>
1998-07-04 20:45:42 +00:00
bde
660e6408e6 Sync timestamp changes for inodes of special files to disk as late
as possible (when the inode is reclaimed).  Temporarily only do
this if option UFS_LAZYMOD configured and softupdates aren't enabled.
UFS_LAZYMOD is intentionally left out of /sys/conf/options.

This is mainly to avoid almost useless disk i/o on battery powered
machines.  It's silly to write to disk (on the next sync or when the
inode becomes inactive) just because someone hit a key or something
wrote to the screen or /dev/null.

PR:		5577
Previous version reviewed by:	phk
1998-07-03 22:17:03 +00:00
bde
dfd9848c30 Centralized in-core inode update. Update the in-core inode directly
in ufs_setattr() so that there is no need to pass timestamps to
UFS_UPDATE() (everything else just needs the current time).  Ignore
the passed-in timestamps in UFS_UPDATE() and always call ufs_itimes()
(was: itimes()) to do the update.  The timestamps are still passed
so that all the callers don't need to be changed yet.
1998-07-03 18:46:52 +00:00
bde
c84e4b4759 Fixed (?) races in mark_buffer_dirty(). We abuse the buffer cache
by hacking on locked buffers without getblk()ing them, and we didn't
even use splbio() to prevent biodone() changing the buffer underneath
use when a write completes.  I think there was no problem in practice
on i386's because the operations on b_flags and numdirtybufs happen to
be atomic.  We still depend on biodone()'s operations on b_flags not
interfering with ours.  I think there is only interference for B_ERROR,
and this is harmless because errors for async writes are ignored anyway.

Don't use mark_buffer_dirty() except for superblock-related metadata.
It was used in just one case where ordinary BSD buffering is more
natural.
1998-06-21 21:06:04 +00:00
bde
b4385f0cee Removed unused function ll_w_block(). It has always had races due
to not using splbio(), and has rotted a little.  The races were
probably harmless in practice because this function was only used
for superblock updates, and separate superblock updates are probably
prevented from running into each other by doing part of the update
synchronously.
1998-06-21 19:56:31 +00:00
bde
9e868cbb1a Removed unused includes. 1998-06-21 18:02:50 +00:00
bde
403bdcb97b Removed unused includes. 1998-06-21 14:53:44 +00:00
bde
a5331e458c Added a missing options include. 1998-06-21 12:36:12 +00:00
bde
23eca1d888 Don't use "ffs" in an ext2fs sleep message string.
Don't forget to clear the inode hash lock before returning from ext2_vget()
after getnewvnode() fails.  Obtained from: rev.1.24 of ffs_vfsops.c (the
original patch for the getnewvnode() race).  Forgotten in: rev.1.4 here.

Removed a duplicate comment.  Duplicated in: rev.1.4 here.

Fixed the MALLOC() vs getnewvnode() race in ext2_vget().  Obtained from:
rev.1.39 of ffs_vfsops.c.
1998-05-16 17:47:44 +00:00
bde
de251f69b3 Abbreviate "ext2fs_fsync" as "e2fsyn" instead of as "extfsn" in a sleep
message string.
1998-05-16 16:52:20 +00:00
msmith
964ce778b1 In the words of the submitter:
---------
Make callers of namei() responsible for releasing references or locks
instead of having the underlying filesystems do it.  This eliminates
redundancy in all terminal filesystems and makes it possible for stacked
transport layers such as umapfs or nullfs to operate correctly.

Quality testing was done with testvn, and lat_fs from the lmbench suite.

Some NFS client testing courtesy of Patrik Kudo.

vop_mknod and vop_symlink still release the returned vpp.  vop_rename
still releases 4 vnode arguments before it returns.  These remaining cases
will be corrected in the next set of patches.
---------

Submitted by:	Michael Hancock <michaelh@cet.co.jp>
1998-05-07 04:58:58 +00:00
msmith
c645da3999 As described by the submitter:
Reverse the VFS_VRELE patch.  Reference counting of vnodes does not need
to be done per-fs.  I noticed this while fixing vfs layering violations.
Doing reference counting in generic code is also the preference cited by
John Heidemann in recent discussions with him.

The implementation of alternative vnode management per-fs is still a valid
requirement for some filesystems but will be revisited sometime later,
most likely using a different framework.

Submitted by:	Michael Hancock <michaelh@cet.co.jp>
1998-05-06 05:29:41 +00:00
des
396b114475 Seventy-odd "its" / "it's" typos in comments fixed as per kern/6108. 1998-04-17 22:37:19 +00:00
bde
b598f559b2 Support compiling with `gcc -ansi'. 1998-04-15 17:47:40 +00:00
phk
9b703b1455 Eradicate the variable "time" from the kernel, using various measures.
"time" wasn't a atomic variable, so splfoo() protection were needed
around any access to it, unless you just wanted the seconds part.

Most uses of time.tv_sec now uses the new variable time_second instead.

gettime() changed to getmicrotime(0.

Remove a couple of unneeded splfoo() protections, the new getmicrotime()
is atomic, (until Bruce sets a breakpoint in it).

A couple of places needed random data, so use read_random() instead
of mucking about with time which isn't random.

Add a new nfs_curusec() function.

Mark a couple of bogosities involving the now disappeard time variable.

Update ffs_update() to avoid the weird "== &time" checks, by fixing the
one remaining call that passwd &time as args.

Change profiling in ncr.c to use ticks instead of time.  Resolution is
the same.

Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call
hzto() which subtracts time" sequences.

Reviewed by:	bde
1998-03-30 09:56:58 +00:00
bde
cd450d6714 Moved some #includes from <sys/param.h> nearer to where they are actually
used.
1998-03-28 10:33:27 +00:00
phk
00475b662a Add two new functions, get{micro|nano}time.
They are atomic, but return in essence what is in the "time" variable.
gettime() is now a macro front for getmicrotime().

Various patches to use the two new functions instead of the various
hacks used in their absence.

Some puntuation and grammer patches from Bruce.

A couple of XXX comments.
1998-03-26 20:54:05 +00:00
eivind
94b4481b74 Make this compile after soft updates integration.
LINTing forgotten by:	julian
1998-03-09 14:46:57 +00:00
julian
10c5ccc30a Reviewed by: dyson@freebsd.org (john Dyson), dg@root.com (david greenman)
Submitted by:	Kirk McKusick (mcKusick@mckusick.com)
Obtained from:  WHistle development tree
1998-03-08 09:59:44 +00:00
msmith
950d32131b The intent is to get rid of WILLRELE in vnode_if.src by making
a complement to all ops that return a vpp, VFS_VRELE.  This is
initially only for file systems that implement the following ops
that do a WILLRELE:

	vop_create, vop_whiteout, vop_mknod, vop_remove, vop_link,
	vop_rename, vop_mkdir, vop_rmdir, vop_symlink

This is initial DNA that doesn't do anything yet.  VFS_VRELE is
implemented but not called.

A default vfs_vrele was created for fs implementations that use the
standard vnode management routines.

VFS_VRELE implementations were made for the following file systems:

Standard (vfs_vrele)
	ffs mfs nfs msdosfs devfs ext2fs

Custom
	union umapfs

Just EOPNOTSUPP
	fdesc procfs kernfs portal cd9660

These implementations may change as VOP changes are implemented.

In the next phase, in the vop implementations calls to vrele and the vrele
part of vput will be moved to the top layer vfs_vnops and made visible
to all layers.  vput will be replaced by unlock in these cases.  Unlocking
will still be done in the per fs layer but the refcount decrement will be
triggered at the top because it doesn't hurt to hold a vnode reference a
little longer.  This will have minimal impact on the structure of the
existing code.

This will only be done for vnode arguments that are released by the various
fs vop implementations.

Wider use of VFS_VRELE will likely require restructuring of the code.

Reviewed by:	phk, dyson, terry et. al.
Submitted by:	Michael Hancock <michaelh@cet.co.jp>
1998-03-01 22:46:53 +00:00
msmith
8c8a4b1058 Style nits and staticism with the previous commit.
Submitted by:	bde
1998-03-01 01:37:38 +00:00
msmith
c6d6c0fbd7 Add local stup putpages/getpages routines.
Submitted by:	Terry Lambert <terry@freebsd.org>
1998-03-01 00:51:43 +00:00
bde
b28763a6b1 Fixed configuration and linkage of ext2_checkoverlap(). 1998-02-13 00:28:40 +00:00
eivind
d7a6ab2803 Staticize. 1998-02-09 06:11:36 +00:00
eivind
4547a09753 Back out DIAGNOSTIC changes. 1998-02-06 12:14:30 +00:00
eivind
c552a9a1c3 Turn DIAGNOSTIC into a new-style option. 1998-02-04 22:34:03 +00:00
eivind
099975fa27 Make LINT at least compile. This faithfully duplicate the changes
done to ufs/ufs/ufs_vnops.c for the same problem, but I don't know if
that will actually make SUIDDIR work for ext2fs.
1998-02-04 01:16:03 +00:00
phk
bb6f7d8184 Retire LFS.
If you want to play with it, you can find the final version of the
code in the repository the tag LFS_RETIREMENT.

If somebody makes LFS work again, adding it back is certainly
desireable, but as it is now nobody seems to care much about it,
and it has suffered considerable bitrot since its somewhat haphazard
integration.

R.I.P
1998-01-30 11:34:06 +00:00
dyson
8726294764 Add better support for larger I/O clusters, including larger physical
I/O.  The support is not mature yet, and some of the underlying implementation
needs help.  However, support does exist for IDE devices now.
1998-01-24 02:01:46 +00:00
bde
3c1b6940fc Unspammed nested include of <vm/vm_zone.h>. 1997-12-27 02:56:39 +00:00
eivind
d9973ba95b Convert SUIDDIR fully to a new-style option.
Forgotten by: julian
1997-12-15 21:51:45 +00:00
bde
efd51d84cf Don't include <sys/lock.h> in headers when only `struct simplelock' is
required.  Fixed everything that depended on the pollution.
1997-12-05 19:55:52 +00:00
jkh
b3ecbbca51 Needs to include <sys/lock.h> if we're using struct lock. 1997-12-05 13:43:47 +00:00
bde
4060d58634 Fixed corruption of the per-group used directories count. It wasn't
decremented when directories were removed because rev.1.12 broke the
fixup of the i_mode of the inode being removed.
1997-12-03 16:46:21 +00:00
phk
f86993e3c3 Fix the copyright and attribution on this file. I forgot this
when the file was cloned.
1997-12-02 21:20:06 +00:00
bde
c7448527e1 Use the same algorithm as ffs for generation numbers. 1997-12-02 11:42:28 +00:00
bde
588c9db8ef Removed __FreeBSD__ ifdefs. 1997-12-02 10:39:42 +00:00
bde
5549d2f6ef Fixed missing #include of "opt_quota.h".
Sorted the functions into the same order as in ufs_vnops.c so that this
can be compared with the latter without getting 2627 lines of diffs.
Now we get only 1920 lines of diffs.
1997-11-24 19:25:24 +00:00
bde
f48794b9f5 Fixed overflow in ufs_getblns(). For ufs on systems with 32-bit ints,
triple indirect blocks only worked for block sizes of 4K, since
MNINDIR(ump)**3 overflows for larger block sizes (e.g.,
(8192/4)**3 = 2**33 > INT_MAX).  This fix is not the obvious one of
changing some types to 64 bits.  It rearranges the code to avoid some
unnecessary 64-bit calculations.

Reviewed by:	Kirk McKusick <mckusick@McKusick.COM>
1997-11-24 16:33:03 +00:00
bde
2001000d90 Use consistent description strings for M_EXT2NODE. This also fixes a
spelling error in the unused string.
1997-11-20 16:56:25 +00:00
phk
9edb1e0f35 Give ext2fs it's own VOP_REMOVE, VOP_LINK, VOP_RENAME, VOP_MKDIR, VOP_RMDIR,
VOP_CREATE, VOP_MKNOD, VOP_SYMLINK and ext2_makeinode().
1997-11-18 14:19:44 +00:00
julian
ae22df605c Reviewed by: various.
Ever since I first say the way the mount flags were used I've hated the
fact that modes, and events, internal and exported, and short-term
and long term flags are all thrown together. Finally it's annoyed me enough..
This patch to the entire FreeBSD tree adds a second mount flag word
to the mount struct. it is not exported to userspace. I have moved
some of the non exported flags over to this word. this means that we now
have 8 free bits in the mount flags. There are another two that might
well move over, but which I'm not sure about.
The only user visible change would have been in pstat -v, except
that davidg has disabled it anyhow.
I'd still like to move the state flags and the 'command' flags
apart from each other.. e.g. MNT_FORCE really doesn't have the
same semantics as MNT_RDONLY, but that's left  for another day.
1997-11-12 05:42:33 +00:00
bde
e38eb2dd99 Removed unused #includes. The need for most of them went away with
recent changes (docluster* and vfs improvements).
1997-10-27 13:33:47 +00:00
phk
956a3f58e3 I guess nobody uses ext2fs in current ?
vop_lookup is back now, don't know whan I lost it.
1997-10-26 21:05:40 +00:00
phk
14aa7b01ea Make a set of VOP standard lock, unlock & islocked VOP operators, which
depend on the lock being located at vp->v_data.  Saves 3x3 identical
vop procs, more as the other filesystems becomes lock aware.
1997-10-17 12:36:19 +00:00
phk
373a865574 Another VFS cleanup "kilo commit"
1.  Remove VOP_UPDATE, it is (also) an UFS/{FFS,LFS,EXT2FS,MFS}
    intereface function, and now lives in the ufsmount structure.

2.  Remove VOP_SEEK, it was unused.

3.  Add mode default vops:

    VOP_ADVLOCK          vop_einval
    VOP_CLOSE            vop_null
    VOP_FSYNC            vop_null
    VOP_IOCTL            vop_enotty
    VOP_MMAP             vop_einval
    VOP_OPEN             vop_null
    VOP_PATHCONF         vop_einval
    VOP_READLINK         vop_einval
    VOP_REALLOCBLKS      vop_eopnotsupp

    And remove identical functionality from filesystems

4.   Add vop_stdpathconf, which returns the canonical stuff.  Use
     it in the filesystems.  (XXX: It's probably wrong that specfs
     and fifofs sets this vop, shouldn't it come from the "host"
     filesystem, for instance ufs or cd9660 ?)

5.   Try to make system wide VOP functions have vop_* names.

6.   Initialize the um_* vectors in LFS.

(Recompile your LKMS!!!)
1997-10-16 20:32:40 +00:00
phk
d166441755 VFS mega cleanup commit (x/N)
1.  Add new file "sys/kern/vfs_default.c" where default actions for
    VOPs go. Implement proper defaults for ABORTOP, BWRITE, LEASE,
    POLL, REVOKE and STRATEGY.  Various stuff spread over the entire
    tree belongs here.

2.  Change VOP_BLKATOFF to a normal function in cd9660.

3.  Kill VOP_BLKATOFF, VOP_TRUNCATE, VOP_VFREE, VOP_VALLOC.  These
    are private interface functions between UFS and the underlying
    storage manager layer (FFS/LFS/MFS/EXT2FS).  The functions now
    live in struct ufsmount instead.

4.  Remove a kludge of VOP_ functions in all filesystems, that did
    nothing but obscure the simplicity and break the expandability.
    If a filesystem doesn't implement VOP_FOO, it shouldn't have an
    entry for it in its vnops table.  The system will try to DTRT
    if it is not implemented.  There are still some cruft left, but
    the bulk of it is done.

5.  Fix another VCALL in vfs_cache.c (thanks Bruce!)
1997-10-16 10:50:27 +00:00
julian
a901089b08 Two more places where root filesystems were mounted, put them at the head of
the mount list  in case there is already  DEVFS present.
1997-10-16 08:16:34 +00:00
phk
f7aabc3ac9 vnops megacommit
1.  Use the default function to access all the specfs operations.
2.  Use the default function to access all the fifofs operations.
3.  Use the default function to access all the ufs operations.
4.  Fix VCALL usage in vfs_cache.c
5.  Use VOCALL to access specfs functions in devfs_vnops.c
6.  Staticize most of the spec and fifofs vnops functions.
7.  Make UFS panic if it lacks bits of the underlying storage handling.
1997-10-15 13:24:07 +00:00
phk
92eeb70dc6 Hmm, realign the vnops into two columns. 1997-10-15 10:05:29 +00:00
phk
26130e0b77 Stylistic overhaul of vnops tables.
1. Remove comment stating the blatantly obvious.
        2. Align in two columns.
        3. Sort all but the default element alphabetically.
        4. Remove XXX comments pointing out entries not needed.
1997-10-15 09:22:02 +00:00
bde
a272224fba IN_HASHED goes in the in-core flags ip->i_flag, not in the on-disk flags
ip->i_flags.

Rev.1.18 completely broke ufs.  My root directory went away about 10
seconds after booting.  I think file system damage was null, since
IN_HASHED = 0x80 is not used in the disk flags (it would probably
be UF_SOMETHING if it were used).
1997-10-15 07:32:45 +00:00
phk
03bfa835e8 Reset the flag right away, could catch a bogon someday. 1997-10-14 18:51:07 +00:00
phk
15a4cff98f I think my previous change may have opened a race conditio.
This patch does the same thing, with no change in semantics.
1997-10-14 18:46:48 +00:00
phk
2b4a6ad696 ufs_ihashrem() should not be called from the UFS layer, but from the
lower layer (LFS/FFS/?) like the rest of the ihash functions.
Otherwise it is impossible to make a lower layer that doesn't use the
ihash facility.
1997-10-14 14:22:31 +00:00
phk
36e7a51ea1 Last major round (Unless Bruce thinks of somthing :-) of malloc changes.
Distribute all but the most fundamental malloc types.  This time I also
remembered the trick to making things static:  Put "static" in front of
them.

A couple of finer points by:	bde
1997-10-12 20:26:33 +00:00
phk
645e7b2ab6 Distribute and statizice a lot of the malloc M_* types.
Substantial input from:	bde
1997-10-11 18:31:40 +00:00
phk
c0089963c1 Make ufs_reclaim free the underlying inode. 1997-10-10 18:18:13 +00:00
phk
0a78fdbf4c Mega commit to cleanup the "remaining nits" after my malloc change.
Introduce a M_EXT2NODE for ext2fs vnodes.
Use generic ufs_reclaim instead of hijacking ffs_reclaim.
1997-10-10 18:13:06 +00:00
bde
3d4ee35e7c `numdirtybuffers' was not maintained properly. This caused excessive
flushing of buffers in an attempt to reduce numdirtybuffers, and
perhaps other problems.
1997-10-07 11:10:18 +00:00
kato
70a6b57455 Oops, include <sys/conf.h>.
Reminded-by:	Simon Shapiro <Shimon@i-Connect.Net>
1997-09-28 02:23:10 +00:00
kato
fe9b86cf0b Clustered read and write are switched at mount-option level.
1. Clustered I/O is switched by the MNT_NOCLUSTERR and MNT_NOCLUSTERW
   bits of the mnt_flag.  The sysctl variables, vfs.foo.doclusterread
   and vfs.foo.doclusterwrite are deleted.  Only mount option can
   control clustered I/O from userland.
2. When foofs_mount mounts block device, foofs_mount checks D_CLUSTERR
   and D_CLUSTERW bits of the d_flags member in the block device switch
   table.  If D_NOCLUSTERR / D_NOCLUSTERW are set, MNT_NOCLUSTERR /
   MNT_NOCLUSTERW bits will be set.  In this case, MNT_NOCLUSTERR and
   MNT_NOCLUSTERW cannot be cleared from userland.
3. Vnode driver disables both clustered read and write.
4. Union filesystem disables clutered write.

Reviewed by:	bde
1997-09-27 13:40:20 +00:00
joerg
091bf8d373 Make MFS a supported option, finally. 1997-09-22 21:24:03 +00:00
peter
ce7feabb13 Convert select -> poll.
Delete 'always succeed' select/poll handlers, replaced with generic call.
Flag missing vnode op table entries.
1997-09-14 02:58:12 +00:00
phk
a5f03d2116 Remove some stuff from lookup which is now handled centrally. 1997-09-10 19:39:03 +00:00
bde
bcade9a903 Removed yet more vestiges of config-time swap configuration and/or
cleaned up nearby cruft.
1997-09-07 16:21:11 +00:00
bde
6ffb8bf9af Removed unused #includes. 1997-09-02 20:06:59 +00:00
phk
fddfc9d5bb Uncut&paste cache_lookup().
This unifies several times in theory indentical 50 lines of code.

The filesystems have a new method: vop_cachedlookup, which is the
meat of the lookup, and use vfs_cache_lookup() for their vop_lookup
method.  vfs_cache_lookup() will check the namecache and pass on
to the vop_cachedlookup method in case of a miss.

It's still the task of the individual filesystems to populate the
namecache with cache_enter().

Filesystems that do not use the namecache will just provide the
vop_lookup method as usual.
1997-08-26 07:32:51 +00:00
kato
7057ee806e Code cleanup. Removed !FreeBSD code arround sysctl stuff. Renamed
doclusterread/doclusterwrite into ext2_doclusterread and
ext2_doclusterwrite, which are unique names.  Moved #include of
<sys/sysctl.h> to the top of the file.

Pointed out by:		Bruce Evans <bde@zeta.org.au>
1997-08-24 11:23:17 +00:00
kato
c2d70f7400 Added sysctl args vfs.ext2fs.doclusterread and
vfs.ext2fs.doclusterwrite which control cluster read/write operation
on ext2fs filesystem.
1997-08-23 07:41:02 +00:00
wollman
4542c1cf5d Fix all areas of the system (or at least all those in LINT) to avoid storing
socket addresses in mbufs.  (Socket buffers are the one exception.)  A number
of kernel APIs needed to get fixed in order to make this happen.  Also,
fix three protocol families which kept PCBs in mbufs to not malloc them
instead.  Delete some old compatibility cruft while we're at it, and add
some new routines in the in_cksum family.
1997-08-16 19:16:27 +00:00
dyson
ec0474c458 Fix a problem with ext2fs so that filesystems mounted at reboot don't
keep ahold of buffers, and therefore leave filesystems dirty.  I haven't
been able to test, but the code compiles.  Those who run -current, please
test and report back!!!  (Sorry :-)).

PR:		kern/3571
Submitted by:	Dirk Keunecke <dk@panda.rhein-main.de>
1997-08-04 05:10:31 +00:00
bde
c21548a94c Fixed comment about i_spare. 1997-07-13 15:40:31 +00:00
dyson
eb8e1f5e4e Fix a problem with the VN device. Specifically, the VN device can
cause a problem of spiraling death due to buffer resource limitations.
The vfs_bio code in general had little ability to handle buffer resource
management, and now it does.  Also, there are a lot more knobs for tuning the
vfs_bio code now.  The knobs came free because of the need that there
always be some immediately available buffers (non-delayed or locked) for
use.  Note that the buffer cache code is much less likely to get bogged
down with lots of delayed writes, even more so than before.
1997-06-15 17:56:53 +00:00
bde
80b2960a8c Removed unused #includes. 1997-06-14 14:17:07 +00:00
phk
a10b0af918 Shrink struct inode by 20 bytes, so that malloc wastes less space.
Pointed out by:	bde
1997-05-22 07:30:55 +00:00
dfr
f5e07dc7a9 Support NFS cookies in VOP_READDIR, allowing ext2fs filesystems to be
exported via NFS.

2.2 candidate.
1997-04-05 12:23:44 +00:00
bde
33314720e2 Fixed gratuitous ANSIisms.
Removed trailing newline from panic messages.
1997-04-01 15:22:59 +00:00
bde
7468722294 Use __i386__ instead of i386 in ifdefs.
Don't compile unused (debugging?) functions.
1997-04-01 15:10:38 +00:00
bde
d6083d03a2 Removed nested include of <ufs/ufs/dir.h>. Use the pre-Lite2 hack of
defining doff_t both here and in <ufs/ufs/dir.h> so that this file
is independent of <ufs/ufs/dir.h>.  It still has old prerequisites
<sys/param.h> and <ufs/ufs/quota.h>, and a new Lite2 prerequisite of
<sys/lock.h>, sigh.

This might fix lsof, which was broken by namespace pollution giving
conflicting definitions of DIRBLKSIZ.
1997-04-01 08:02:00 +00:00
bde
117209856b Don't include <sys/ioctl.h> in the kernel. Stage 1: don't include
it when it is not used.  In most cases, the reasons for including it
went away when the special ioctl headers became self-sufficient.
1997-03-24 11:25:10 +00:00
bde
0d3591bdbd Don't #include <sys/fcntl.h> in <sys/file.h> if KERNEL is defined.
Fixed everything that depended on getting fcntl.h stuff from the wrong
place.  Most things don't depend on file.h stuff at all.
1997-03-23 03:37:54 +00:00
bde
0bc1781701 Fixed some invalid (non-atomic) accesses to `time', mostly ones of the
form `tv = time'.  Use a new function gettime().  The current version
just forces atomicicity without fixing precision or efficiency bugs.
Simplified some related valid accesses by using the central function.
1997-03-22 06:53:45 +00:00
mpp
97b9102cae Update a number of routines to reflect the actual name
of the routine that caused the panic.
1997-03-09 06:10:36 +00:00
bde
f0abbb2b2d Removed unused flag IN_RECURSE and unused struct member i_lockcount. 1997-03-03 16:25:46 +00:00
bde
064d53aea0 Removed useless setting of IN_RECURSE. The (anti) locking for this needs
to be done in a different way, if at all.
1997-03-03 16:23:15 +00:00
dyson
96d17830f3 Correct the port of ext2fs to Lite/2. I incorrectly used ufs_reclaim
instead of ffs_reclaim.
1997-02-26 05:08:18 +00:00
peter
94b6d72794 Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.
1997-02-22 09:48:43 +00:00
bde
be66779edf Fixed type mismatches. i_spare[N] in ufs/inode.h changed from long to
int.  Change ext2fs to match.  We probably already assume that ints have
>= 32 bits.
1997-02-12 15:35:18 +00:00
mpp
50f9d7b978 Add function prototypes for most of the new Lite2 functions.
Also made a few of the miscfs routines static to be
consistent.  Some modules simply required some additional
#includes to remove -Wall warnings.
1997-02-12 06:52:51 +00:00
dyson
10f666af84 This is the kernel Lite/2 commit. There are some requisite userland
changes, so don't expect to be able to run the kernel as-is (very well)
without the appropriate Lite/2 userland changes.

The system boots and can mount UFS filesystems.

Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files.
		Mount_std mounts will not work until the getfsent
		library routine is changed.

Reviewed by:	various people
Submitted by:	Jeffery Hsu <hsu@freebsd.org>
1997-02-10 02:22:35 +00:00
jkh
808a36ef65 Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore.  This update would have been
insane otherwise.
1997-01-14 07:20:47 +00:00
dyson
baa03b956c This commit is the embodiment of some VFS read clustering improvements.
Firstly, now our read-ahead clustering is on a file descriptor basis and not
on a per-vnode basis.  This will allow multiple processes reading the
same file to take advantage of read-ahead clustering.  Secondly, there
previously was a problem with large reads still using the ramp-up
algorithm.  Of course, that was bogus, and now we read the entire
"chunk" off of the disk in one operation.   The read-ahead clustering
algorithm should use less CPU than the previous also (I hope :-)).
1996-12-29 02:44:37 +00:00
bde
45a0793694 Fixed lookup of ".." in checkpath. It always failed, so renames of
directories to a different parent directory always failed.  This bug
was caused by 4.4Lite2 changing the directory format and ext2fs not
keeping up.

Should be in 2.2.
1996-11-09 10:25:04 +00:00
bde
7cae5544c3 Fixed spacefree calculation in ext2_direnter(). This bug sometimes caused
panics.

This should be in 2.2, of course.

Submitted by:	davidg
Obtained from:	bouyer@antioche.ibp.fr (Manuel BOUYER) (fix for NetBSD)
1996-11-08 19:06:34 +00:00
bde
5580ec0a03 Removed gratuitous differences between ext2_readwrite.c and ufs_readwrite.c.
This fixes several bugs and one missing feature:
- cluster_read() was needlessly used for reading files of size exactly 1
  block.
- EFAULT errors for read didn't terminate the loop.  This was probably
  harmless.
- IO_VMIO handling was missing near line 275.  I don't know what this does.
- B_CLUSTEROK was only set if (doclusterwrite) nead line 293.  This was
  harmless, if only because another bug prevents doclusterwrite from being
  0.
- MNT_NOATIME wasn't implemented.

This should be in 2.2, of course.

Reviewed by:	davidg
1996-11-08 18:50:09 +00:00
nate
94c51e38c1 Whoops, I should've used the LINT config file. More ts -> tv changes
for timespec structure.
1996-09-20 05:51:12 +00:00
nate
45c85d421d In sys/time.h, struct timespec is defined as:
/*
         * Structure defined by POSIX.4 to be like a timeval.
         */
        struct timespec {
                time_t  ts_sec;         /* seconds */
                long    ts_nsec;        /* and nanoseconds */
        };

        The correct names of the fields are tv_sec and tv_nsec.

Reminded by:	James Drobina <jdrobina@infinet.com>
1996-09-19 18:21:32 +00:00
bde
25556c3b93 Updated #includes to 4.4Lite style. 1996-09-10 08:32:01 +00:00
gpalmer
57c3ebc617 Clean up -Wunused warnings.
Reviewed by:		bde
1996-06-12 05:11:41 +00:00
bde
82c1afdaa1 Removed bogus _BEGIN_DECLS/_END_DECLS.
Removed unused struct tag declarations in cloned code.

Added or cleaned up idempotency ifdefs.
1996-05-01 02:16:17 +00:00
bde
5183c3b725 Removed vestigial support for the obsolete FIFO option. In ext2fs
it caused null pointer panics for all fifo operations unless FIFO
was defined.
1996-02-25 20:12:36 +00:00
mpp
f3dd75a38d Fix a bunch of spelling errors in the comment fields of
a bunch of system include files.
1996-01-30 23:02:38 +00:00
dyson
8fc8a772af Eliminated many redundant vm_map_lookup operations for vm_mmap.
Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish
	overhead for merged cache.
Efficiency improvement for vfs_cluster.  It used to do alot of redundant
	calls to cluster_rbuild.
Correct the ordering for vrele of .text and release of credentials.
Use the selective tlb update for 486/586/P6.
Numerous fixes to the size of objects allocated for files.  Additionally,
	fixes in the various pagers.
Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs.
Fixes in the swap pager for exhausted resources.  The pageout code
	will not as readily thrash.
Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into
	page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE),
	thereby improving efficiency of several routines.
Eliminate even more unnecessary vm_page_protect operations.
Significantly speed up process forks.
Make vm_object_page_clean more efficient, thereby eliminating the pause
	that happens every 30seconds.
Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the
	case of filesystems mounted async.
Fix a panic with busy pages when write clustering is done for non-VMIO
	buffers.
1996-01-19 04:00:31 +00:00
wollman
26b6c4cd73 Convert QUOTA to new-style option. 1996-01-05 18:31:58 +00:00
phk
2a5a36a028 Staticize. 1995-12-17 21:14:36 +00:00
bde
3e882cae21 Restored variables that are used iff QUOTA is defined.
ext2fs still uses #if in many cases where the rest of the kernel uses
#ifdef (for QUOTA...).
1995-12-10 21:38:45 +00:00
bde
31bbf49f89 Restored used includes of <vm/vm_extern.h>. 1995-12-10 14:52:10 +00:00
dyson
8a10737211 Correct some serious porting errors. The worst one was that the
vnode was being placed upon the mount point twice!!!
1995-11-19 20:24:15 +00:00