103 Commits

Author SHA1 Message Date
imp
cf874b345d Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
jeff
87e306ad71 - Cleanup unlocked accesses to buf flags by introducing a new b_vflag member
that is protected by the vnode lock.
 - Move B_SCANNED into b_vflags and call it BV_SCANNED.
 - Create a vop_stdfsync() modeled after spec's sync.
 - Replace spec_fsync, msdos_fsync, and hpfs_fsync with the stdfsync and some
   fs specific processing.  This gives all of these filesystems proper
   behavior wrt MNT_WAIT/NOWAIT and the use of the B_SCANNED flag.
 - Annotate the locking in buf.h
2003-02-09 11:28:35 +00:00
alfred
bf8e8a6e8f Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
dillon
d155b8f135 Fix a file-rewrite performance case for UFS[2]. When rewriting portions
of a file in chunks that are less then the filesystem block size, if the
data is not already cached the system will perform a read-before-write.
The problem is that it does this on a block-by-block basis, breaking up the
I/Os and making clustering impossible for the writes.  Programs such
as INN using cyclic file buffers suffer greatly.  This problem is only going
to get worse as we use larger and larger filesystem block sizes.

The solution is to extend the sequential heuristic so UFS[2] can perform
a far larger read and readahead when dealing with this case.

(note: maximum disk write bandwidth is 27MB/sec thru filesystem)
(note: filesystem blocksize in test is 8K (1K frag))
dd if=/dev/zero of=test.dat bs=1k count=2m conv=notrunc

Before:  (note half of these are reads)
      tty             da0              da1             acd0             cpu
 tin tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   0   76 14.21 598  8.30   0.00   0  0.00   0.00   0  0.00   0  0  7  1 92
   0   76 14.09 813 11.19   0.00   0  0.00   0.00   0  0.00   0  0  9  5 86
   0   76 14.28 821 11.45   0.00   0  0.00   0.00   0  0.00   0  0  8  1 91

After:	(note half of these are reads)
      tty             da0              da1             acd0             cpu
 tin tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   0   76 63.62 434 26.99   0.00   0  0.00   0.00   0  0.00   0  0 18  1 80
   0   76 63.58 424 26.30   0.00   0  0.00   0.00   0  0.00   0  0 17  2 82
   0   76 63.82 438 27.32   0.00   0  0.00   0.00   0  0.00   1  0 19  2 79

Reviewed by:	mckusick
Approved by:	re
X-MFC after:	immediately (was heavily tested in -stable for 4 months)
2002-10-18 22:52:41 +00:00
mckusick
750c7540cc When reading or writing the extended attributes of a special device
or fifo in UFS2, the normal ufs_strategy routine needs to be used
rather than the spec_strategy or fifo_strategy routine. Thus the
ffsext_strategy routine is interposed in the ffs_vnops vectors for
special devices and fifo's to pick off this special case. Otherwise
it simply falls through to the usual spec_strategy or fifo_strategy
routine.

Submitted by:	Robert Watson <rwatson@FreeBSD.org>
Sponsored by:	DARPA & NAI Labs.
2002-10-14 23:18:09 +00:00
dd
814e414600 size_t is not a struct (fix mislabelling in a comment). 2002-10-02 05:15:34 +00:00
phk
1dfc2c167f Be consistent about "static" functions: if the function is marked
static in its prototype, mark it static at the definition too.

Inspired by:    FlexeLint warning #512
2002-09-28 17:15:38 +00:00
phk
5f5d8287e5 Use our mount-credential if we get a NOCRED when we try to write out EA
space back to disk.

This is wrong in many ways, but not as wrong as a panic.

Pancied on:	rwatson & jmallet
Sponsored by:	DARPA & NAI Labs.
2002-09-27 20:00:03 +00:00
jeff
8bebc5fdba - Convert locks to use standard macros.
- Lock access to the buflists.
 - Document broken locking.
 - Use vrefcnt().
2002-09-25 02:49:48 +00:00
phk
87f5667c5a Implement the VOP_OPENEXTATTR() and VOP_CLOSEEXTATTR() methods.
Use extattr_check_cred() to check access to EAs.

This is still a WIP.

Sponsored by:   DARPA & NAI Labs.
2002-09-05 20:59:42 +00:00
bde
fcab7b8389 Include <sys/malloc.h> instead of depending on namespace pollution 2
layers deep in <sys/proc.h> or <sys/vnode.h>.

Include <sys/vmmeter.h> instead of depending on namespace pollution in
<sys/pcpu.h>.

Sorted includes as much as possible.
2002-09-05 09:43:24 +00:00
phk
49cb8194a9 Correctly handle setting, getting and deleting EA's with zero length content.
Sponsored by:	DARPA & NAI Labs.
2002-08-30 08:57:09 +00:00
alc
cdcc7b3446 o Retire vm_page_zero_fill() and vm_page_zero_fill_area(). Ever since
pmap_zero_page() and pmap_zero_page_area() were modified to accept
   a struct vm_page * instead of a physical address, vm_page_zero_fill()
   and vm_page_zero_fill_area() have served no purpose.
2002-08-25 00:22:31 +00:00
phk
a1e3931514 Implement list of EA return functionality.
Correctly delete EA's when the content length is set to zero.

Sponsored by:	DARPA & NAI Labs.
2002-08-20 11:34:58 +00:00
phk
d1ede7ccd1 First snapshot of UFS2 EA support.
Sponsored by: DARPA & NAI Labs.
2002-08-19 07:01:55 +00:00
phk
40f2376812 Expand the arguments to ffs_ext{read,write}() to their component
parts rather than use vop_{read,write}_args.  Access to these
functions will ultimately not be available through the
"vop_{read,write}+IO_EXT" API but this functionality is retained
for debugging purposes for now.

Sponsored by: DARPA & NAI Labs.
2002-08-13 11:33:01 +00:00
phk
088460429a Unravel the UFS_EXTATTR incest between FFS and UFS: UFS_EXTATTR is an
UFS only thing, and FFS should in principle not know if it is enabled
or not.

This commit cleans ffs_vnops.c for such knowledge, but not ffs_vfsops.c

Sponsored by: DARPA and NAI Labs.
2002-08-13 10:33:57 +00:00
phk
58bc3221a4 Stop pretending that the FFS file ufs_readwrite.c is a UFS file.
Instead of #including it, pull it into ffs_vnops.c and name things
correctly.

Sponsored by:	DARPA & NAI Labs.
2002-08-12 10:32:56 +00:00
jeff
02517b6731 - Replace v_flag with v_iflag and v_vflag
- v_vflag is protected by the vnode lock and is used when synchronization
   with VOP calls is needed.
 - v_iflag is protected by interlock and is used for dealing with vnode
   management issues.  These flags include X/O LOCK, FREE, DOOMED, etc.
 - All accesses to v_iflag and v_vflag have either been locked or marked with
   mp_fixme's.
 - Many ASSERT_VOP_LOCKED calls have been added where the locking was not
   clear.
 - Many functions in vfs_subr.c were restructured to provide for stronger
   locking.

Idea stolen from:	BSD/OS
2002-08-04 10:29:36 +00:00
mckusick
88d85c15ef This commit adds basic support for the UFS2 filesystem. The UFS2
filesystem expands the inode to 256 bytes to make space for 64-bit
block pointers. It also adds a file-creation time field, an ability
to use jumbo blocks per inode to allow extent like pointer density,
and space for extended attributes (up to twice the filesystem block
size worth of attributes, e.g., on a 16K filesystem, there is space
for 32K of attributes). UFS2 fully supports and runs existing UFS1
filesystems. New filesystems built using newfs can be built in either
UFS1 or UFS2 format using the -O option. In this commit UFS1 is
the default format, so if you want to build UFS2 format filesystems,
you must specify -O 2. This default will be changed to UFS2 when
UFS2 proves itself to be stable. In this commit the boot code for
reading UFS2 filesystems is not compiled (see /sys/boot/common/ufsread.c)
as there is insufficient space in the boot block. Once the size of the
boot block is increased, this code can be defined.

Things to note: the definition of SBSIZE has changed to SBLOCKSIZE.
The header file <ufs/ufs/dinode.h> must be included before
<ufs/ffs/fs.h> so as to get the definitions of ufs2_daddr_t and
ufs_lbn_t.

Still TODO:
Verify that the first level bootstraps work for all the architectures.
Convert the utility ffsinfo to understand UFS2 and test growfs.
Add support for the extended attribute storage. Update soft updates
to ensure integrity of extended attribute storage. Switch the
current extended attribute interfaces to use the extended attribute
storage. Add the extent like functionality (framework is there,
but is currently never used).

Sponsored by: DARPA & NAI Labs.
Reviewed by:	Poul-Henning Kamp <phk@freebsd.org>
2002-06-21 06:18:05 +00:00
phk
a1f99d3509 Name ufs_vop_[gs]etextattr() consistently with the rest of our VOPs and
put then in the ufs_vnops where they belong, rather than in the ffs_vnops.

Ok'ed by:	rwatson
Sponsored by:	DARPA & NAI Labs.
2002-05-03 08:40:33 +00:00
alfred
79061a9306 Remove __P. 2002-03-19 22:40:48 +00:00
mckusick
670fad8a9e This corrects the first of two known deadlock conditions that
come from the presence of a snapshot file.
2002-03-14 01:21:13 +00:00
julian
5596676e6c KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
phk
5948c9ed5b Implement vop_std{get|put}pages() and add them to the default vop[].
Un-copy&paste all the VOP_{GET|PUT}PAGES() functions which do nothing but
the default.
2001-05-01 08:34:45 +00:00
phk
8e3fa89968 VOP_BALLOC was never really a VOP in the first place, so convert it
to UFS_BALLOC like the other "between UFS and FFS function interfaces".
2001-04-29 12:36:52 +00:00
grog
4b9d9cbaac Revert consequences of changes to mount.h, part 2.
Requested by:	bde
2001-04-29 02:45:39 +00:00
grog
1f5de30718 Correct #includes to work with fixed sys/mount.h. 2001-04-23 09:05:15 +00:00
rwatson
8a937bbc3a o Change options FFS_EXTATTR and options FFS_EXTATTR_AUTOSTART to
options UFS_EXTATTR and UFS_EXTATTR_AUTOSTART respectively.  This change
  reflects the fact that our EA support is implemented entirely at the
  UFS layer (modulo FFS start/stop/autostart hooks for mount and unmount
  events).  This also better reflects the fact that [shortly] MFS will also
  support EAs, as well as possibly IFS.

o Consumers of the EA support in FFS are reminded that as a result, they
  must change kernel config files to reflect the new option names.

Obtained from:	TrustedBSD Project
2001-03-19 04:35:40 +00:00
mckusick
61db3f4296 Fixes to track snapshot copy-on-write checking in the specinfo
structure rather than assuming that the device vnode would reside
in the FFS filesystem (which is obviously a broken assumption with
the device filesystem).
2001-03-07 07:09:55 +00:00
phk
709379c1ae Another round of the <sys/queue.h> FOREACH transmogriffer.
Created with:   sed(1)
Reviewed by:    md5(1)
2001-02-04 16:08:18 +00:00
adrian
0458054c4e Initial commit of IFS - a inode-namespaced FFS. Here is a short
description:

How it works:
--

Basically ifs is a copy of ffs, overriding some vfs/vnops. (Yes, hack.)
I didn't see the need in duplicating all of sys/ufs/ffs to get this
off the ground.

File creation is done through a special file - 'newfile' . When newfile
is called, the system allocates and returns an inode. Note that newfile
is done in a cloning fashion:

fd = open("newfile", O_CREAT|O_RDWR, 0644);
fstat(fd, &st);

printf("new file is %d\n", (int)st.st_ino);

Once you have created a file, you can open() and unlink() it by its returned
inode number retrieved from the stat call, ie:

fd = open("5", O_RDWR);

The creation permissions depend entirely if you have write access to the
root directory of the filesystem.

To get the list of currently allocated inodes, VOP_READDIR has been added
which returns a directory listing of those currently allocated.

--

What this entails:

* patching conf/files and conf/options to include IFS as a new compile
  option (and since ifs depends upon FFS, include the FFS routines)

* An entry in i386/conf/NOTES indicating IFS exists and where to go for
  an explanation

* Unstaticize a couple of routines in src/sys/ufs/ffs/ which the IFS
  routines require (ffs_mount() and ffs_reload())

* a new bunch of routines in src/sys/ufs/ifs/ which implement the IFS
  routines. IFS replaces some of the vfsops, and a handful of vnops -
  most notably are VFS_VGET(), VOP_LOOKUP(), VOP_UNLINK() and VOP_READDIR().
  Any other directory operation is marked as invalid.

What this results in:

* an IFS partition's create permissions are controlled by the perm/ownership of
  the root mount point, just like a normal directory

* Each inode has perm and ownership too

* IFS does *NOT* mean an FFS partition can be opened per inode. This is a
  completely seperate filesystem here

* Softupdates doesn't work with IFS, and really I don't think it needs it.
  Besides, fsck's are FAST. (Try it :-)

* Inodes 0 and 1 aren't allocatable because they are special (dump/swap IIRC).
  Inode 2 isn't allocatable since UFS/FFS locks all inodes in the system against
  this particular inode, and unravelling THAT code isn't trivial. Therefore,
  useful inodes start at 3.

Enjoy, and feedback is definitely appreciated!
2000-10-14 03:02:30 +00:00
eivind
4a39f454a0 Blow away the v_specmountpoint define, replacing it with what it was
defined as (rdev->si_mountpoint)
2000-10-09 17:31:39 +00:00
rwatson
40aced8438 o Permit UFS Extended Attributes to be associated with special devices
and FIFOs.

Obtained from:	TrustedBSD Project
2000-09-21 19:06:02 +00:00
mckusick
a3d0c189ea Add snapshots to the fast filesystem. Most of the changes support
the gating of system calls that cause modifications to the underlying
filesystem. The gating can be enabled by any filesystem that needs
to consistently suspend operations by adding the vop_stdgetwritemount
to their set of vnops. Once gating is enabled, the function
vfs_write_suspend stops all new write operations to a filesystem,
allows any filesystem modifying system calls already in progress
to complete, then sync's the filesystem to disk and returns. The
function vfs_write_resume allows the suspended write operations to
begin again. Gating is not added by default for all filesystems as
for SMP systems it adds two extra locks to such critical kernel
paths as the write system call. Thus, gating should only be added
as needed.

Details on the use and current status of snapshots in FFS can be
found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness
is not included here. Unless and until you create a snapshot file,
these changes should have no effect on your system (famous last words).
2000-07-11 22:07:57 +00:00
phk
cb90cb2b60 Revert part of my bioops change which implemented panic(8). 2000-06-16 14:32:13 +00:00
phk
4ec91666fa Virtualizes & untangles the bioops operations vector.
Ref: Message-ID: <18317.961014572@critter.freebsd.dk> To: current@
2000-06-16 08:48:51 +00:00
phk
36c3965ff9 Separate the struct bio related stuff out of <sys/buf.h> into
<sys/bio.h>.

<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall
not be made a nested include according to bdes teachings on the
subject of nested includes.

Diskdrivers and similar stuff below specfs::strategy() should no
longer need to include <sys/buf.> unless they need caching of data.

Still a few bogus uses of struct buf to track down.

Repocopy by:    peter
2000-05-05 09:59:14 +00:00
rwatson
a0dd5ab0fd Introduce extended attribute support for FFS, allowing arbitrary
(name, value) pairs to be associated with inodes.  This support is
used for ACLs, MAC labels, and Capabilities in the TrustedBSD
security extensions, which are currently under development.

In this implementation, attributes are backed to data vnodes in the
style of the quota support in FFS.  Support for FFS extended
attributes may be enabled using the FFS_EXTATTR kernel option
(disabled by default).  Userland utilities and man pages will be
committed in the next batch.  VFS interfaces and man pages have
been in the repo since 4.0-RELEASE and are unchanged.

o ufs/ufs/extattr.h: UFS-specific extattr defines
o ufs/ufs/ufs_extattr.c: bulk of support routines
o ufs/{ufs,ffs,mfs}/*.[ch]: hooks and extattr.h includes
o contrib/softupdates/ffs_softdep.c: extattr.h includes
o conf/options, conf/files, i386/conf/LINT: added FFS_EXTATTR

o coda/coda_vfsops.c: XXX required extattr.h due to ufsmount.h
(This should not be the case, and will be fixed in a future commit)

Currently attributes are not supported in MFS.  This will be fixed.

Reviewed by:	adrian, bp, freebsd-fs, other unthanked souls
Obtained from:	TrustedBSD Project
2000-04-15 03:34:27 +00:00
phk
ae0c1ec8f7 Give vn_isdisk() a second argument where it can return a suitable errno.
Suggested by:	bde
2000-01-10 12:04:27 +00:00
mckusick
d4409da210 Several performance improvements for soft updates have been added:
1) Fastpath deletions. When a file is being deleted, check to see if it
   was so recently created that its inode has not yet been written to
   disk. If so, the delete can proceed to immediately free the inode.
2) Background writes: No file or block allocations can be done while the
   bitmap is being written to disk. To avoid these stalls, the bitmap is
   copied to another buffer which is written thus leaving the original
   available for futher allocations.
3) Link count tracking. Constantly track the difference in i_effnlink and
   i_nlink so that inodes that have had no change other than i_effnlink
   need not be written.
4) Identify buffers with rollback dependencies so that the buffer flushing
   daemon can choose to skip over them.
2000-01-10 00:24:24 +00:00
phk
1848d96439 Convert various pieces of code to use vn_isdisk() rather than checking
for vp->v_type == VBLK.

In ccd: we don't need to call VOP_GETATTR to find the type of a vnode.

Reviewed by:    sos
1999-11-22 10:33:55 +00:00
phk
8e3c3eafed useracc() the prequel:
Merge the contents (less some trivial bordering the silly comments)
of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>.  This puts
the #defines for the vm_inherit_t and vm_prot_t types next to their
typedefs.

This paves the road for the commit to follow shortly: change
useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE}
as argument.
1999-10-29 18:09:36 +00:00
peter
3b842d34e8 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
phk
e938d317d5 Decommision miscfs/specfs/specdev.h. Most of it goes into <sys/conf.h>,
a few lines into <sys/vnode.h>.

Add a few fields to struct specinfo, paving the way for the fun part.
1999-08-08 18:43:05 +00:00
mckusick
5b58f2f951 Convert buffer locking from using the B_BUSY and B_WANTED flags to using
lockmgr locks. This commit should be functionally equivalent to the old
semantics. That is, all buffer locking is done with LK_EXCLUSIVE
requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will
be done in future commits.
1999-06-26 02:47:16 +00:00
mckusick
3050d8dd0b On our final pass through ffs_fsync, do all I/O synchronously so that
we can find out if our flush is failing because of write errors. This
change avoids a "flush failed" panic during unrecoverable disk errors.
1999-06-18 05:49:46 +00:00
mckusick
365073a062 Add a hook to ffs_fsync to allow soft updates to get first chance at doing
a sync on the block device for the filesystem. That allows it to push the
bitmap blocks before the inode blocks which greatly reduces the number of
inode rollbacks that need to be done.
1999-05-14 01:26:46 +00:00
mckusick
421edf71f1 When fsync'ing a file on a filesystem using soft updates, we first try
to write all the dirty blocks. If some of those blocks have dependencies,
they will be remarked dirty when the I/O completes. On systems with
really fast I/O systems, it is possible to get in an infinite loop trying
to flush the buffers, because the I/O finishes before we can get all the
dirty buffers off the v_dirtyblkhd list and into the I/O queue. (The
previous algorithm looped over the v_dirtyblkhd list writing out buffers
until the list emptied.) So, now we mark each buffer that we try to
write so that we can distinguish the ones that are being remarked dirty
from those that we have not yet tried to flush. Once we have tried to
push every buffer once, we then push any associated metadata that is
causing the remaining buffers to be redirtied.

Submitted by:	Matthew Dillon <dillon@apollo.backplane.com>
1999-03-02 04:04:31 +00:00
bde
2facf6978a Don't pass unused unused timestamp args to UFS_UPDATE() or waste
time initializing them.  This almost finishes centralizing (in-core)
timestamp updates in ufs_itimes().
1999-01-07 16:14:19 +00:00