Commit Graph

1177 Commits

Author SHA1 Message Date
rwatson
563547c6bc Implement ffs_listextattr() by breaking out that logic and special-cased
attribute name of "" from ffs_getextattr().  Invoking VOP_GETETATTR()
with an empty name is now no longer supported; user application
compatibility is provided by a system call level compatibility
wrapper.  We make sure to explicitly reject attempts to set an EA
with the name "".

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-06-05 05:57:39 +00:00
rwatson
5fd55dbf85 Don't special-case handling of the empty string in the UFS1
extended attribute retrieval code: it's no longer special-cased,
and is caught by the normal UFS1 EA validity checks (and, in
fact, returns the same error, EINVAL).

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-06-05 04:58:58 +00:00
rwatson
1963dd0ba4 Return EOPNOTSUPP for attempted EA operations on VCHR vnodes in UFS2;
if we permit them to occur, the kernel panics due to our performing
EA operations using VOP_STRATEGY on the vnode.  This went unnoticed
previously because there are very for users of device nodes on UFS2
due to the introduction of devfs.  However, this can come up with
the Linux compat directories and its hard-coded dev nodes (which will
need to go away as we move away from hard-coded device numbers).
This can come up if you use EA-intensive features such as ACLs and
MAC.

The proper fix is pretty complicated, but this band-aid would be
an excellent MFC candidate for the release.
2003-06-01 02:42:18 +00:00
phk
34f931b00b Remove unused variable.
Found by:       FlexeLint
2003-05-31 19:56:09 +00:00
phk
9bb8512e8f Remove unused local variables.
Found by:       FlexeLint
2003-05-31 18:17:32 +00:00
phk
0129a20107 The IO_NOWDRAIN and B_NOWDRAIN hacks are no longer needed to prevent
deadlocks with vnode backed md(4) devices because md now uses a
kthread to run the bio requests instead of doing it directly from
the bio down path.
2003-05-31 16:42:45 +00:00
alc
57b29f0e87 Lock the vm object when performing vm_object_page_clean().
Approved by:	re (rwatson)
2003-05-18 22:02:51 +00:00
rwatson
7e2cfac5e0 Jeff added locking assertions that the VV_ flags on vnodes were modified
only while holding appropriate vnode locks.  This patch slides the lock
release for ufs_extattr_enable() to continue to hold the active vnode lock
on a backing file until after the flag change; it also acquires a vnode
lock when disabling an attribute and hence clearing a flag on the backing
vnode.  This permits VFS_DEBUG_LOCKS to run UFS1 extended attributes
without panicking, as well as preventing a potential race and vnode flag
problem.

Approved by:	re (jhb)
Pointed out by:	DEBUG_VFS_LOCKS
2003-05-15 21:07:33 +00:00
alc
f9966ce9e8 Lock the vm_object on entry to vm_object_vndeallocate(). 2003-05-03 20:28:26 +00:00
tjr
854348219c Do not attempt to free NULL dinodes (i_din1 or i_din2) in ffs_ifree().
These fields can be left as NULL if ffs_vget() allocates an inode but
fails before the dinode memory has been allocated. There are two cases
when this can occur: when we lose a race and another process has added
the inode to the hash, and when reading the inode off disk fails.

The bug was observed by Kris on one of the package-building machines.
See http://marc.theaimsgroup.com/?l=freebsd-current&m=105172731013411&w=2
In Kris's case, it was the bread() that failed because of a disk error.

The alternative to this patch is to ensure that ffs_vget() does not call
vput() when the inode that hasn't been properly initialised.
2003-05-01 06:41:59 +00:00
tjr
0b639b63af Free i_din2 instead of i_din1 in ffs_ifree() on UFS2 filesystems.
This is purely a cosmetic change because these members are in a
union together.
2003-05-01 06:38:27 +00:00
markm
6cc289554b Fix some easy, global, lint warnings. In most cases, this means
making some local variables static. In a couple of cases, this means
removing an unused variable.
2003-04-30 12:57:40 +00:00
kan
9468fdaf14 Deprecate machine/limits.h in favor of new sys/limits.h.
Change all in-tree consumers to include <sys/limits.h>

Discussed on:	standards@
Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>
2003-04-29 13:36:06 +00:00
jhb
ced60d737a Lock both the proc lock and sched_lock when calling sched_nice since
kg_nice is now protected by both.  Being protected by both means that
other places in the kernel that want to read kg_nice only need one of the
two locks.
2003-04-22 20:45:38 +00:00
jeff
886a932d7b - Use the sched_nice() api instead of setting the nice value directly.
Tested by:	Steve Kargl <sgk@troutmask.apl.washington.edu>
2003-04-12 01:05:19 +00:00
alc
ac3de07d6f Sufficient access checks are performed by vmapbuf() that calling useracc()
is pointless.  Remove the call to useracc().

Don't reinitialize fields that are already initialized by getpbuf().

Reviewed by:	tegge
2003-04-06 19:26:30 +00:00
tegge
23e9ae3483 Check return value from vmapbuf instead of the function address. 2003-03-27 20:48:34 +00:00
tegge
d1a3d87bf5 Eliminate a buffer sleep/wakeup race. 2003-03-27 19:28:11 +00:00
tegge
ede5ebede7 Add support for reading directly from file to userland buffer when the
O_DIRECT descriptor status flag is set and both offset and length is a
multiple of the physical media sector size.
2003-03-26 23:40:42 +00:00
jhb
b8b062b09b Use td->td_ucred instead of td->td_proc->p_ucred. 2003-03-20 21:17:40 +00:00
jhb
15ebade0f4 Minor fixes to ffs_fserr():
- Assume that curthread is not NULL.  It never is in -current.
- Use td_ucred instead of p_ucred.
2003-03-20 21:15:54 +00:00
phk
e059b79437 Including <sys/stdint.h> is (almost?) universally only to be able to use
%j in printfs, so put a newsted include in <sys/systm.h> where the printf
prototype lives and save everybody else the trouble.
2003-03-18 08:45:25 +00:00
jeff
ae3c8799da - Remove a race between fsync like functions and flushbufqueues() by
requiring locked bufs in vfs_bio_awrite().  Previously the buf could
   have been written out by fsync before we acquired the buf lock if it
   weren't for giant.  The cluster_wbuild() handles this race properly but
   the single write at the end of vfs_bio_awrite() would not.
 - Modify flushbufqueues() so there is only one copy of the loop.  Pass a
   parameter in that says whether or not we should sync bufs with deps.
 - Call flushbufqueues() a second time and then break if we couldn't find
   any bufs without deps.
2003-03-13 07:19:23 +00:00
mckusick
9c654a63e6 Use the appropriate size when zeroing out the unused portion
of a snapshot's copy of a superblock. This patch fixes a panic
when taking a snapshot of a 4096/512 filesystem.

Reported by:	Ian Freislich <ianf@za.uu.net>
Sponsored by:   DARPA & NAI Labs.
2003-03-07 23:49:16 +00:00
alc
c50367da67 Remove ENABLE_VFS_IOOPT. It is a long unfinished work-in-progress.
Discussed on:	arch@
2003-03-06 03:41:02 +00:00
jeff
4de0ae322c - Add a new 'flags' parameter to getblk().
- Define one flag GB_LOCK_NOWAIT that tells getblk() to pass the LK_NOWAIT
   flag to the initial BUF_LOCK().  This will eventually be used in cases
   were we want to use a buffer only if it is not currently in use.
 - Convert all consumers of the getblk() api to use this extra parameter.

Reviwed by:	arch
Not objected to by:	mckusick
2003-03-04 00:04:44 +00:00
njl
5a225ad933 Finish cleanup of vprint() which was begun with changing v_tag to a string.
Remove extraneous uses of vop_null, instead defering to the default op.
Rename vnode type "vfs" to the more descriptive "syncer".
Fix formatting for various filesystems that use vop_print.
2003-03-03 19:15:40 +00:00
des
2756b6c964 More low-hanging fruit: kill caddr_t in calls to wakeup(9) / [mt]sleep(9). 2003-03-02 16:54:40 +00:00
mckusick
259fedfc3c Change the field used to test whether the superblock has been updated
from the filesystem size field to the filesystem maximum blocksize
field. The problem is that older versions of growfs updated only the
new size field and not the old size field. This resulted in the old
(smaller) size field being copied up to the new size field which
caused the filesystem to appear to fsck to be badly trashed.

This also adds a sanity check to ensure that the superblock is not
being updated when the filesystem is mounted read-only. Obviously
such an update should never happen.

Reported by:	Nate Lawson <nate@root.org>
Sponsored by:   DARPA & NAI Labs.
2003-02-25 23:21:08 +00:00
jeff
9e4c9a6ce9 - Add an interlock argument to BUF_LOCK and BUF_TIMELOCK.
- Remove the buftimelock mutex and acquire the buf's interlock to protect
   these fields instead.
 - Hold the vnode interlock while locking bufs on the clean/dirty queues.
   This reduces some cases from one BUF_LOCK with a LK_NOWAIT and another
   BUF_LOCK with a LK_TIMEFAIL to a single lock.

Reviewed by:	arch, mckusick
2003-02-25 03:37:48 +00:00
das
1cc0669761 Expand the reference count on struct dquot to 32 bits.
This fixes a panic on large systems where a single user
may have more than 64K active or inactive vnodes.

PR:		48234
Reviewed by:	mike (mentor)
2003-02-24 08:49:59 +00:00
mckusick
46e9534a11 When removing the last item from a non-empty worklist, the worklist
tail pointer must be updated.

Reported by:	Kris Kennaway <kris@obsecurity.org>
Sponsored by:   DARPA & NAI Labs.
2003-02-24 07:28:41 +00:00
mckusick
9ed5a11f37 This patch fixes a deadlock between the bufdaemon and a process taking
a snapshot. As part of taking a snapshot of a filesystem, the kernel
builds up a list of the filesystem metadata (such as the cylinder
group bitmaps) that are contained in the snapshot. When doing a
copy-on-write check, the list is first consulted. If the block being
written is found on the list, then the full snapshot lookup can be
avoided. Besides providing an important performance speedup this
check also avoids a potential deadlock between the code creating
the snapshot and the bufdaemon trying to cleanup snapshot related
buffers. This fix creates a temporary list containing the key
metadata blocks that can cause the deadlock. This temporary list
is used between the time that the snapshot is first enabled and the
time that the fully complete list is built.

Reported by:	Attila Nagy <bra@fsn.hu>
Sponsored by:   DARPA & NAI Labs.
2003-02-22 00:59:34 +00:00
mckusick
5340f28b15 This patch fixes a bug on an active filesystem on which a snapshot
is being taken from panicing with either "freeing free block" or
"freeing free inode". The problem arises when the snapshot code
is scanning the filesystem looking for inodes with a reference
count of zero (e.g., unlinked but still open) so that it can
expunge them from its view. If it encounters a reclaimed vnode
and has to restart its scan, then it will panic if it encounters
and tries to free an inode that it has already processed. The fix
is to check each candidate inode to see if it has already been
processed before trying to delete it from the snapshot image.

Sponsored by:   DARPA & NAI Labs.
2003-02-22 00:29:51 +00:00
mckusick
e6aeffdeb5 This patch fixes a bug in the logical block calculation macros so
that they convert to 64-bit values before shifting rather than
afterwards. Once fixed, they can be used rather than inline expanded.

Sponsored by:   DARPA & NAI Labs.
2003-02-22 00:19:26 +00:00
imp
cf874b345d Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
mckusick
d9ebbec084 Replace use of random() with arc4random() to provide less guessable
values for the initial inode generation numbers in newfs and for
newly allocated inode generation numbers in the kernel.

Submitted by:	Theo de Raadt <deraadt@cvs.openbsd.org>
Sponsored by:   DARPA & NAI Labs.
2003-02-14 21:31:58 +00:00
mckusick
d8fb26b1c6 Correct lines incorrectly added to the copyright message.
Submitted by:	Frank van der Linden <fvdl@wasabisystems.com>
Sponsored by:   DARPA & NAI Labs.
2003-02-14 00:31:06 +00:00
jeff
87e306ad71 - Cleanup unlocked accesses to buf flags by introducing a new b_vflag member
that is protected by the vnode lock.
 - Move B_SCANNED into b_vflags and call it BV_SCANNED.
 - Create a vop_stdfsync() modeled after spec's sync.
 - Replace spec_fsync, msdos_fsync, and hpfs_fsync with the stdfsync and some
   fs specific processing.  This gives all of these filesystems proper
   behavior wrt MNT_WAIT/NOWAIT and the use of the B_SCANNED flag.
 - Annotate the locking in buf.h
2003-02-09 11:28:35 +00:00
alfred
86daf0cca6 Catch more uses of MIN(). 2003-02-02 13:30:00 +00:00
alfred
bf8e8a6e8f Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
dillon
ccd5574cc6 Bow to the whining masses and change a union back into void *. Retain
removal of unnecessary casts and throw in some minor cleanups to see if
anyone complains, just for the hell of it.
2003-01-13 00:33:17 +00:00
dillon
ddf9ef103e Change struct file f_data to un_data, a union of the correct struct
pointer types, and remove a huge number of casts from code using it.

Change struct xfile xf_data to xun_data (ABI is still compatible).

If we need to add a #define for f_data and xf_data we can, but I don't
think it will be necessary.  There are no operational changes in this
commit.
2003-01-12 01:37:13 +00:00
marcel
111c003344 o Improve wording of the comment that accompanies fs_pad. The
padding is not specific to non-i386 architectures. It is
   caused by non-i386 specific alignment requirements of
   fs_swuid,
o  Add a CTASSERT to catch a change in the size of struct fs
   at compile-time rather than run-time.

Ok'd: gordon
Tested on: i386 ia64
2003-01-10 06:59:34 +00:00
gordon
2af32f18dc Fix superblock alignment problems on non-i386 platforms. Also change fs_uuid
to fs_swuid, making it more descriptive.

Submitted by:	marcel
Reviewed by:	peter
Pointy hat to:	gordon
2003-01-09 23:53:30 +00:00
gordon
f1018f664a Steal some space from fs_fsmnt to create fs_volname and fs_uuid. The volname
will be used to support volume names with the help of a GEOM module (to be
committed). uuid will be used to deal with conflicting volume names (which
doesn't work just yet).

Approved by:	mckusick@
2003-01-08 22:53:54 +00:00
mckusick
db74e87c2d This patch fixes a problem caused by applications that rapidly and
repeatedly truncate the same file. Each time the file is truncated,
a buffer is grabbed to store the indirect block numbers that need
to be freed. Those blocks cannot be freed until the inode claiming
them is written to disk. Thus, the number of buffers being held by
soft updates explodes and in extreme cases can run the kernel out
of buffers. The problem can be avoided by doing an fsync on the
file every debug.maxindirdep truncates (currently defaulted to 50).
The fsync causes the inode to be written so that the held buffers
can be freed. The check for excessive buffers is checked as part
of the existing hook for excessive dependencies (softdep_slowdown)
in the truncate code.

Reported by:	David Schultz <dschultz@uclink.Berkeley.EDU>
Sponsored by:   DARPA & NAI Labs.
MFC after:	3 weeks
2003-01-07 18:23:50 +00:00
phk
131885aa2f Temporarily introduce a new VOP_SPECSTRATEGY operation while I try
to sort out disk-io from file-io in the vm/buffer/filesystem space.

The intent is to sort VOP_STRATEGY calls into those which operate
on "real" vnodes and those which operate on VCHR vnodes.  For
the latter kind, the call will be changed to VOP_SPECSTRATEGY,
possibly conditionally for those places where dual-use happens.

Add a default VOP_SPECSTRATEGY method which will call the normal
VOP_STRATEGY.  First time it is called it will print debugging
information.  This will only happen if a normal vnode is passed
to VOP_SPECSTRATEGY by mistake.

Add a real VOP_SPECSTRATEGY in specfs, which does what VOP_STRATEGY
does on a VCHR vnode today.

Add a new VOP_STRATEGY method in specfs to catch instances where
the conversion to VOP_SPECSTRATEGY has not yet happened.  Handle
the request just like we always did, but first time called print
debugging information.

Apart up to two instances of console messages per boot, this amounts
to a glorified no-op commit.

If you get any of the messages on your console I would very much
like a copy of them mailed to phk@freebsd.org
2003-01-04 22:10:36 +00:00
phk
157437ec08 Since Jeffr made the std* functions the default in rev 1.63 of
kern/vfs_defaults.c it is wrong for the individual filesystems to use
the std* functions as that prevents override of the default.

Found by:       src/tools/tools/vop_table
2003-01-04 08:47:19 +00:00
phk
daf6948653 Convert calls to BUF_STRATEGY to VOP_STRATEGY calls. This is a no-op since
all BUF_STRATEGY did in the first place was call VOP_STRATEGY.
2003-01-03 06:32:15 +00:00
schweikh
d3367c5f5d Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup,
especially in troff files.
2003-01-01 18:49:04 +00:00
alfred
927595101c When compiling the kernel do not implicitly include filedesc.h from proc.h,
this was causing filedesc work to be very painful.
In order to make this work split out sigio definitions to thier own header
(sigio.h) which is included from proc.h for the time being.
2003-01-01 01:56:19 +00:00
phk
72aba11981 Use three UMA zones for FFS/UFS inodes instead of malloc space.
Since inodes are currently 144 bytes, this will save 112 bytes per
inode.  This can amount to up to 10MByte on large systems.
2002-12-27 11:05:05 +00:00
phk
3d9be4e20a Move the allocation of the inode contents into ffs_vfsops.c rather than
passing malloc types around.
2002-12-27 10:23:03 +00:00
phk
afd8ad09d7 Make ffs_mountfs() static.
Remove the malloctype from the ufs mount structure, instead add a callback
to the storage method for freeing inodes: UFS_IFREE().

Add vfs_ifree() method function which frees an inode.

Unvariablelize the malloc type used for allocating inodes.
2002-12-27 10:06:37 +00:00
mckusick
91195ac766 Fix corruption introduced in previous delta.
Reported by:	Aurelien Nephtali <aurelien.nephtali@wanadoo.fr>
Sponsored by:   DARPA & NAI Labs.
2002-12-18 19:50:28 +00:00
mckusick
f86b91ebe0 Keep comments consistent with the code. Minor optimization.
Sponsored by:   DARPA & NAI Labs.
2002-12-18 07:19:41 +00:00
mckusick
345fe7ca1f Cosmetic cleanup of unsigned buglets.
Submitted by:	Bruce Evans <bde@zeta.org.au>
Sponsored by:   DARPA & NAI Labs.
2002-12-18 00:53:45 +00:00
phk
723e3b218d Remove unused lockcnt variable.
Approved by:	mckusick
2002-12-17 20:23:51 +00:00
mckusick
668d95fdc4 Update to previous change (1.54) to use an approperly wide inode field
so as to work correctly on 64-bit platforms.

Reported-by:	Jake Burkholder <jake@locore.ca>
Sponsored by:   DARPA & NAI Labs.
Approved by:	Ian Dowse <iedowse@maths.tcd.ie>
2002-12-15 19:25:59 +00:00
iedowse
3bf3f7056c Undo the adjustment of the total memory used by dirhash in the case
where allocating the dirhash structure fails. Fix a few typos in
comments and update copyright.

MFC after:	1 week
2002-12-14 17:16:16 +00:00
mckusick
069ffa44b5 Only the most recent snapshot contains the complete list of blocks
that were copied in all of the earlier snapshots, thus its precomputed
list must be used in the copyonwrite test. Using incomplete lists may
lead to deadlock. Also do not include the blocks used for the indirect
pointers in the indirect pointers as this may lead to inconsistent
snapshots.

Sponsored by:   DARPA & NAI Labs.
Approved by:	re
2002-12-14 01:36:59 +00:00
trhodes
b45b4e9e9e Remove the comment about dump(8) not working properly with snapshots.
Discussed with:	mckusick
Approved by:	re (rwatson)
2002-12-12 00:31:45 +00:00
mckusick
461d1f1c94 More tightly verify the preference returned for the new inode.
Submitted by:	Kris Kennaway <kris@obsecurity.org>
Sponsored by:   DARPA & NAI Labs.
Approved by:	re
2002-12-06 02:08:46 +00:00
mckusick
6822be5fe2 Have to use bread() rather than UFS_BALLOC() when obtaining a
previously allocated block as the previous use of the block may
have fallen out of the cache. Failure to reread its contents cause
zeroed results to be written instead of the proper contents.
Conversely, when the block is going to be entirely filled in, it
is not necessary reread the old contents.

Sponsored by:   DARPA & NAI Labs.
Approved by:	re
2002-12-03 18:19:27 +00:00
mckusick
69c6a0dc68 Add a check to disable the previous patch so that future filesystems
that choose to place their superblocks in non-standard locations will
not get them smashed.

Sponsored by:   DARPA & NAI Labs.
2002-11-30 19:04:57 +00:00
mckusick
c1f52553a2 Remove a race condition / deadlock from snapshots. When
converting from individual vnode locks to the snapshot
lock, be sure to pass any waiting processes along to the
new lock as well. This transfer is done by a new function
in the lock manager, transferlockers(from_lock, to_lock);
Thanks to Lamont Granquist <lamont@scriptkiddie.org> for
his help in pounding on snapshots beyond all reason and
finding this deadlock.

Sponsored by:   DARPA & NAI Labs.
2002-11-30 19:00:51 +00:00
mckusick
d4a32db4ae Fix two deadlocks in snapshots:
1) Release the snapshot file lock while suspending the system. Otherwise
   a process trying to read the lock may block on its containing directory
   preventing the suspension from completing. Thanks to Sean Kelly
   <smkelly@zombie.org> for finding this deadlock.

2) Replace some bdwrite's with bawrite's so as not to fill all the
   buffers with dirty data. The buffers could not be cleaned as the
   snapshot vnode was locked hence the system could deadlock when
   making snapshots of really massive filesystems. Thanks to
   Hidetoshi Shimokawa <simokawa@sat.t.u-tokyo.ac.jp> for figuring
   this out.

Sponsored by:   DARPA & NAI Labs.
2002-11-30 07:27:12 +00:00
mckusick
78d15991e4 Check to make sure that the fs_sblockloc field was properly updated
before using it to write the superblock. This is to guard against
accidentally trashing the disklabel if the superblock format missed
being upgraded by the new kernel.

Reported by:	Sam Leffler <sam@errno.com>
Sponsored by:   DARPA & NAI Labs.
Approved by:	Murray Stokely <murray@FreeBSD.org>
2002-11-29 19:20:15 +00:00
mckusick
9251693096 Create a new 32-bit fs_flags word in the superblock. Add code to move
the old 8-bit fs_old_flags to the new location the first time that the
filesystem is mounted by a new kernel. One of the unused flags in
fs_old_flags is used to indicate that the flags have been moved.
Leave the fs_old_flags word intact so that it will work properly if
used on an old kernel.

Change the fs_sblockloc superblock location field to be in units
of bytes instead of in units of filesystem fragments. The old units
did not work properly when the fragment size exceeeded the superblock
size (8192). Update old fs_sblockloc values at the same time that
the flags are moved.

Suggested by:	BOUWSMA Barry <freebsd-misuser@netscum.dyndns.dk>
Sponsored by:   DARPA & NAI Labs.
2002-11-27 02:18:58 +00:00
mckusick
a51ebc9dc0 The target for the maximum number of dependencies has been cut
in half because of reports that under heavy load the kernel could
exhaust its memory pool. The limit is now (desiredvnodes * 4)
rather than (desiredvnodes * 8), so it will still scale with
larger systems, just not as quickly.

Sponsored by:   DARPA & NAI Labs.
2002-11-20 05:16:11 +00:00
mckusick
637af64f54 If an error occurs while writing a buffer, then the data will
not have hit the disk and the dependencies cannot be unrolled.
In this case, the system will mark the buffer as dirty again so
that the write can be retried in the future. When the write
succeeds or the system gives up on the buffer and marks it as
invalid (B_INVAL), the dependencies will be cleared.

Sponsored by:   DARPA & NAI Labs.
2002-11-20 05:14:16 +00:00
peter
c841be9bcb Do not assume that time_t is an int.
Approved by:	re (jhb)
2002-11-15 22:36:57 +00:00
jhb
b8daeabf59 Print daddr_t's with %j and intmax_t. 2002-11-08 22:28:35 +00:00
rwatson
3ad18c8074 Update licenses and wording: NAI has authorized the removal of clause three
of their BSD-style license; also, carry out the NAI Labs -> Network
Associates Laboratories renaming in these files.
2002-11-04 02:35:46 +00:00
wollman
ce3867deda Implement the new 1003.1-2001 pathconf() keys, including the Advisory
Information option.  Other filesystem implementations should do something
similar.

With advice from:	mckusick, phk
2002-10-27 18:09:49 +00:00
rwatson
312cab0dee Slightly change the semantics of vnode labels for MAC: rather than
"refreshing" the label on the vnode before use, just get the label
right from inception.  For single-label file systems, set the label
in the generic VFS getnewvnode() code; for multi-label file systems,
leave the labeling up to the file system.  With UFS1/2, this means
reading the extended attribute during vfs_vget() as the inode is
pulled off disk, rather than hitting the extended attributes
frequently during operations later, improving performance.  This
also corrects sematics for shared vnode locks, which were not
previously present in the system.  This chances the cache
coherrency properties WRT out-of-band access to label data, but in
an acceptable form.  With UFS1, there is a small race condition
during automatic extended attribute start -- this is not present
with UFS2, and occurs because EAs aren't available at vnode
inception.  We'll introduce a work around for this shortly.

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-26 14:38:24 +00:00
mckusick
6b1611bd94 Within ufs, the ffs_sync and ffs_fsync functions did not always
check for and/or report I/O errors. The result is that a VFS_SYNC
or VOP_FSYNC called with MNT_WAIT could loop infinitely on ufs in
the presence of a hard error writing a disk sector or in a filesystem
full condition. This patch ensures that I/O errors will always be
checked and returned.  This patch also ensures that every call to
VFS_SYNC or VOP_FSYNC with MNT_WAIT set checks for and takes
appropriate action when an error is returned.

Sponsored by:   DARPA & NAI Labs.
2002-10-25 00:20:37 +00:00
mckusick
0337df10b7 We must be careful to avoid recursive copy-on-write faults when
trying to clean up during disk-full senarios.

Sponsored by:	DARPA & NAI Labs.
2002-10-23 21:47:02 +00:00
mckusick
3819d46020 Missplaced FREE_LOCK causes a panic when hit while taking a snapshot.
Sponsored by:	DARPA & NAI Labs.
2002-10-23 05:14:06 +00:00
mckusick
04450228c6 This update further fine tunes the locking of snapshot vnodes in
the ffs_copyonwrite routine to avoid a deadlock between the syncer
daemon trying to sync out a snapshot vnode and the bufdaemon
trying to write out a buffer containing the snapshot inode.
With any luck this will be the last snapshot race condition.

Sponsored by:	DARPA & NAI Labs.
2002-10-22 01:23:00 +00:00
mckusick
a515fcf789 This update is a performance improvement when allocating blocks on
a full filesystem. Previously, if the allocation failed, we had to
fsync the file before rolling back any partial allocation of indirect
blocks. Most block allocation requests only need to allocate a single
data block and if that allocation fails, there is nothing to unroll.
So, before doing the fsync, we check to see if any rollback will
really be necessary. If none is necessary, then we simply return.
This update eliminates the flurry of disk activity that got triggered
whenever a filesystem would run out of space.

Sponsored by:	DARPA & NAI Labs.
2002-10-22 01:14:25 +00:00
mckusick
305e5868f3 This checkin reimplements the io-request priority hack in a way
that works in the new threaded kernel. It was commented out of
the disksort routine earlier this year for the reasons given in
kern/subr_disklabel.c (which is where this code used to reside
before it moved to kern/subr_disk.c):

----------------------------
revision 1.65
date: 2002/04/22 06:53:20;  author: phk;  state: Exp;  lines: +5 -0
Comment out Kirks io-request priority hack until we can do this in a
civilized way which doesn't cause grief.

The problem is that it is not generally safe to cast a "struct bio
*" to a "struct buf *".  Things like ccd, vinum, ata-raid and GEOM
constructs bio's which are not entrails of a struct buf.

Also, curthread may or may not have anything to do with the I/O request
at hand.

The correct solution can either be to tag struct bio's with a
priority derived from the requesting threads nice and have disksort
act on this field, this wouldn't address the "silly-seek syndrome"
where two equal processes bang the diskheads from one edge to the
other of the disk repeatedly.

Alternatively, and probably better: a sleep should be introduced
either at the time the I/O is requested or at the time it is completed
where we can be sure to sleep in the right thread.

The sleep also needs to be in constant timeunits, 1/hz can be practicaly
any sub-second size, at high HZ the current code practically doesn't
do anything.
----------------------------

As suggested in this comment, it is no longer located in the disk sort
routine, but rather now resides in spec_strategy where the disk operations
are being queued by the thread that is associated with the process that
is really requesting the I/O. At that point, the disk queues are not
visible, so the I/O for positively niced processes is always slowed
down whether or not there is other activity on the disk.

On the issue of scaling HZ, I believe that the current scheme is
better than using a fixed quantum of time. As machines and I/O
subsystems get faster, the resolution on the clock also rises.
So, ten years from now we will be slowing things down for shorter
periods of time, but the proportional effect on the system will
be about the same as it is today. So, I view this as a feature
rather than a drawback. Hence this patch sticks with using HZ.

Sponsored by:	DARPA & NAI Labs.
Reviewed by:	Poul-Henning Kamp <phk@critter.freebsd.dk>
2002-10-22 00:59:49 +00:00
rwatson
d862ecfee8 Rename _POSIX_FOO_PRESENT and friends from POSIX.1e to _PC_FOO_PRESENT
and related friends.  This would have been corrected had POSIX.1e
progressed to a standard.

Pointed out by:	wollman
2002-10-20 22:11:13 +00:00
rwatson
438835cabb Implement _POSIX_ACL_PATH_MAX, which returns the maximum number of ACL
entries for a file system node using pathconf().

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-20 22:08:26 +00:00
rwatson
9d17032f64 Teach UFS to respond to pathconf() tests for _POSIX_ACL_EXTENDED and
_POSIX_MAC_PRESENT based on available mount flags, if the services are
available.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-20 21:49:41 +00:00
rwatson
a2eb2e3662 Clarify that the UFS1 extended attribute configuration steps do not apply
to UFS2 file systems.

Submitted by:	jedgar
Obtained from:	TrustedBSD Project
2002-10-19 16:09:16 +00:00
dillon
d155b8f135 Fix a file-rewrite performance case for UFS[2]. When rewriting portions
of a file in chunks that are less then the filesystem block size, if the
data is not already cached the system will perform a read-before-write.
The problem is that it does this on a block-by-block basis, breaking up the
I/Os and making clustering impossible for the writes.  Programs such
as INN using cyclic file buffers suffer greatly.  This problem is only going
to get worse as we use larger and larger filesystem block sizes.

The solution is to extend the sequential heuristic so UFS[2] can perform
a far larger read and readahead when dealing with this case.

(note: maximum disk write bandwidth is 27MB/sec thru filesystem)
(note: filesystem blocksize in test is 8K (1K frag))
dd if=/dev/zero of=test.dat bs=1k count=2m conv=notrunc

Before:  (note half of these are reads)
      tty             da0              da1             acd0             cpu
 tin tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   0   76 14.21 598  8.30   0.00   0  0.00   0.00   0  0.00   0  0  7  1 92
   0   76 14.09 813 11.19   0.00   0  0.00   0.00   0  0.00   0  0  9  5 86
   0   76 14.28 821 11.45   0.00   0  0.00   0.00   0  0.00   0  0  8  1 91

After:	(note half of these are reads)
      tty             da0              da1             acd0             cpu
 tin tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   0   76 63.62 434 26.99   0.00   0  0.00   0.00   0  0.00   0  0 18  1 80
   0   76 63.58 424 26.30   0.00   0  0.00   0.00   0  0.00   0  0 17  2 82
   0   76 63.82 438 27.32   0.00   0  0.00   0.00   0  0.00   1  0 19  2 79

Reviewed by:	mckusick
Approved by:	re
X-MFC after:	immediately (was heavily tested in -stable for 4 months)
2002-10-18 22:52:41 +00:00
rwatson
ab9568ccbf Update extended attribute readme file to note that no special configuration
is required to use EAs with UFS2, and that UFS2 is recommend for EA use
for a variety of reasons.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-18 21:11:36 +00:00
rwatson
10e2a00a6a Update instructions for ACLs given recent tunefs, mount changes. Also
note that UFS2 doesn't require explicit extended attribute configuration,
and is recommends for this and other reasons if you plan to use ACLs.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-18 21:09:57 +00:00
rwatson
749729a702 Use 'size_t' instead of 'int' for the result of sizeof(). 2002-10-18 21:03:30 +00:00
mckusick
0af0d22682 With the revised single-lock method used in snapshots, the
BA_NOWAIT flag is no longer needed.

Sponsored by:	DARPA & NAI Labs.
2002-10-18 01:17:28 +00:00
mckusick
733bfbdd78 Change locking so that all snapshots on a particular filesystem share
a common lock. This change avoids a deadlock between snapshots when
separate requests cause them to deadlock checking each other for a
need to copy blocks that are close enough together that they fall
into the same indirect block. Although I had anticipated a slowdown
from contention for the single lock, my filesystem benchmarks show
no measurable change in throughput on a uniprocessor system with
three active snapshots. I conjecture that this result is because
every copy-on-write fault must check all the active snapshots, so
the process was inherently serial already. This change removes the
last of the deadlocks of which I am aware in snapshots.

Sponsored by:	DARPA & NAI Labs.
2002-10-16 00:19:23 +00:00
rwatson
174c4a0034 Push most UFS ACL behavior behind a check for MNT_ACLS, permitting ACLs
to be administratively disabled as needed on UFS/UFS2 file systems.  This
also has the effect of preventing the slightly more expensive ACL code
from running on non-ACL file systems, avoiding storage allocation for
ACLs that may be read from disk.  MNT_ACLS may be set at mount-time
using mount -o acls, or implicitly by setting the FS_ACLS flag using
tunefs.  On UFS1, you may also have to configure ACL store.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-15 21:28:24 +00:00
rwatson
e112f21cae If the FS_MULTILABEL flag is set in a UFS or UFS2 superblock,
automatically set MNT_MULTILABEL in the mount flags.

If FS_ACLS is set in a UFS or UFS2 superblock, automatically
set MNT_ACLS in the mount flags.

If either of these flags is set, but the appropriate kernel option
to support the features associated with the flag isn't available,
then print a warning at mount-time.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-15 20:00:06 +00:00
mckusick
750c7540cc When reading or writing the extended attributes of a special device
or fifo in UFS2, the normal ufs_strategy routine needs to be used
rather than the spec_strategy or fifo_strategy routine. Thus the
ffsext_strategy routine is interposed in the ffs_vnops vectors for
special devices and fifo's to pick off this special case. Otherwise
it simply falls through to the usual spec_strategy or fifo_strategy
routine.

Submitted by:	Robert Watson <rwatson@FreeBSD.org>
Sponsored by:	DARPA & NAI Labs.
2002-10-14 23:18:09 +00:00
rwatson
67d568c288 Fix two memory leaks in error conditions involving the UFS ACL code:
if failures occur, make sure that we release both the default ACL
and access ACL storage during new object creation.

Spotted by:	phk and his pet flexelint
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-14 19:55:49 +00:00
rwatson
9c6b9f51d1 Define two new superblock file system flags:
FS_ACLS		Administrative enable/disable of extended ACL support
FS_MULTILABEL	Administrative flag to indicate to the MAC Framework
		that objects in the file system are individually
		labeled using extended attributes.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
Reviewed by:	(in principal) mckusick, phk
2002-10-14 17:07:11 +00:00
mckusick
25230d4c6a Regularize the vop_stdlock'ing protocol across all the filesystems
that use it. Specifically, vop_stdlock uses the lock pointed to by
vp->v_vnlock. By default, getnewvnode sets up vp->v_vnlock to
reference vp->v_lock. Filesystems that wish to use the default
do not need to allocate a lock at the front of their node structure
(as some still did) or do a lockinit. They can simply start using
vn_lock/VOP_UNLOCK. Filesystems that wish to manage their own locks,
but still use the vop_stdlock functions (such as nullfs) can simply
replace vp->v_vnlock with a pointer to the lock that they wish to
have used for the vnode. Such filesystems are responsible for
setting the vp->v_vnlock back to the default in their vop_reclaim
routine (e.g., vp->v_vnlock = &vp->v_lock).

In theory, this set of changes cleans up the existing filesystem
lock interface and should have no function change to the existing
locking scheme.

Sponsored by:	DARPA & NAI Labs.
2002-10-14 03:20:36 +00:00
mike
8630abe45f Change iov_base's type from char *' to the standard void *'. All
uses of iov_base which assume its type is `char *' (in order to do
pointer arithmetic) have been updated to cast iov_base to `char *'.
2002-10-11 14:58:34 +00:00
mux
ddefc9d7d6 Fix build of 64 bit platforms. 2002-10-09 12:19:36 +00:00
mckusick
e13d3e9276 When creating a snapshot, create a list of initially allocated blocks.
Whenever doing a copy-on-write check, first look in the list of
initially allocated blocks to see if it is there. If so, no further
check is needed. If not, fall through and do the full check. This
change eliminates one of two known deadlocks caused by snapshots.
Handling the second deadlock will be the subject of another check-in.
This change also reduces the cost of the copy-on-write check by
speeding up the verification of frequently checked blocks.

Sponsored by:	DARPA & NAI Labs.
2002-10-09 07:28:35 +00:00
mckusick
8cecf7b3e5 When creating a snapshot, create a list of initially allocated blocks.
Whenever doing a copy-on-write check, first look in the list of
initially allocated blocks to see if it is there. If so, no further
check is needed. If not, fall through and do the full check. This
change eliminates one of two known deadlocks caused by snapshots.
Handling the second deadlock will be the subject of another check-in.
This change also reduces the cost of the copy-on-write check by
speeding up the verification of frequently checked blocks.

Sponsored by:	DARPA & NAI Labs.
2002-10-09 06:13:48 +00:00
mckusick
d01b53c175 The appropriate units for disk block addresses are always DEV_BSIZE,
even when the underlying device has a larger sector size. Therefore,
the filesystem code should not (and with this patch does not) try to
use the underlying sector size when doing disk block address calculations.

This patch fixes problems in -current when using the swap-based
memory-disk device (mdconfig -a -t swap ...). This bugfix is not
relevant to -stable as -stable does not have the memory-disk device.

Sponsored by:	DARPA & NAI Labs.
2002-10-09 04:01:23 +00:00
jeff
29006c0306 - Remove LK_INTERLOCK from the vn_lock() in ffs_snapshot().
Pointy hat to:	me
Found by:	green
2002-10-08 21:00:52 +00:00
phk
f98c8d3a06 Mark two places where an unsigned number is checked "if (foo < 0)" with
an XXX comment.

Somebody[TM] should look at this in some detail.

Spotted by:	FlexeLint
2002-10-02 09:11:18 +00:00
dd
814e414600 size_t is not a struct (fix mislabelling in a comment). 2002-10-02 05:15:34 +00:00
phk
b55fa4540e Fix some harmless mis-indents.
Spotted by:	FlexeLint
2002-10-01 15:48:31 +00:00
jmallett
a17c9b6f5c When spamming me with a printf(9), under DIAGNOSTIC, at least be nice enough
to include a newline.

MFC after:	4 days
Sponsored by:	Bright Path Solutions
2002-09-28 19:04:49 +00:00
phk
1dfc2c167f Be consistent about "static" functions: if the function is marked
static in its prototype, mark it static at the definition too.

Inspired by:    FlexeLint warning #512
2002-09-28 17:15:38 +00:00
phk
1a999cb3f9 Make it a tad easier to deal with struct inode in userland programs which
fondle /dev/kmem by using "struct cdev *" instead of "dev_t".

Requsted by:	jake
2002-09-27 20:03:05 +00:00
phk
5f5d8287e5 Use our mount-credential if we get a NOCRED when we try to write out EA
space back to disk.

This is wrong in many ways, but not as wrong as a panic.

Pancied on:	rwatson & jmallet
Sponsored by:	DARPA & NAI Labs.
2002-09-27 20:00:03 +00:00
jeff
8bebc5fdba - Convert locks to use standard macros.
- Lock access to the buflists.
 - Document broken locking.
 - Use vrefcnt().
2002-09-25 02:49:48 +00:00
jeff
90e87c8eb5 - Document broken locking.
- Use vrefcnt().
2002-09-25 02:47:49 +00:00
jeff
41b9d1ca5d - Lock accesses to v_usecount.
- Convert interlock locks to use standard macros.
2002-09-25 02:45:50 +00:00
jeff
263f8202f6 - Don't use the interlock to protect v_writecount. 2002-09-25 02:44:55 +00:00
phk
71d1473801 We don't need to #include <sys/disklabel.h>.
We don't need to #include <sys/disklabel.h> second time either.

Sponsored by:	DARPA & NAI Labs.
2002-09-20 16:42:33 +00:00
truckman
f280782003 VOP_FSYNC() requires that it's vnode argument be locked, which nfs_link()
wasn't doing.  Rather than just lock and unlock the vnode around the call
to VOP_FSYNC(), implement rwatson's suggestion to lock the file vnode
in kern_link() before calling VOP_LINK(), since the other filesystems
also locked the file vnode right away in their link methods.  Remove the
locking and and unlocking from the leaf filesystem link methods.

Reviewed by:	rwatson, bde  (except for the unionfs_link() changes)
2002-09-19 13:32:45 +00:00
obrien
a8deaee84b intmax_t is printed with %jd, not %lld. 2002-09-19 03:55:30 +00:00
njl
00c79f5c92 Remove any VOP_PRINT that redundantly prints the tag.
Move lockmgr_printinfo() into vprint() for everyone's benefit.

Suggested by: bde
2002-09-18 20:42:04 +00:00
njl
0590c43070 Remove all use of vnode->v_tag, replacing with appropriate substitutes.
v_tag is now const char * and should only be used for debugging.

Additionally:
1. All users of VT_NTS now check vfsconf->vf_type VFCF_NETWORK
2. The user of VT_PROCFS now checks for the new flag VV_PROCDEP, which
is propagated by pseudofs to all child vnodes if the fs sets PFS_PROCDEP.

Suggested by:   phk
Reviewed by:    bde, rwatson (earlier version)
2002-09-14 09:02:28 +00:00
bde
8aa3df4eb2 vfs_syscalls.c:
Changed rename(2) to follow the letter of the POSIX spec.  POSIX
requires rename() to have no effect if its args "resolve to the same
existing file".  I think "file" can only reasonably be read as referring
to the inode, although the rationale and "resolve" seem to say that
sameness is at the level of (resolved) directory entries.

ext2fs_vnops.c, ufs_vnops.c:
Replaced code that gave the historical BSD behaviour of removing one
link name by checks that this code is now unreachable.  This fixes
some races.  All vnodes needed to be unlocked for the removal, and
locking at another level using something like IN_RENAME was not even
attempted, so it was possible for rename(x, y) to return with both x
and y removed even without any unlink(2) syscalls (one process can
remove x using rename(x, y) and another process can remove y using
rename(y, x)).

Prodded by:	alfred
MFC after:	8 weeks
PR:		42617
2002-09-10 11:09:13 +00:00
phk
87f5667c5a Implement the VOP_OPENEXTATTR() and VOP_CLOSEEXTATTR() methods.
Use extattr_check_cred() to check access to EAs.

This is still a WIP.

Sponsored by:   DARPA & NAI Labs.
2002-09-05 20:59:42 +00:00
phk
db06a743d8 Use canonical extattr_check_cred() instead of private implementation of the
same policy.

Sponsored by:	DARPA & NAI Labs.
2002-09-05 20:39:36 +00:00
phk
80704cd6a8 Fix credentials check: do not leak ENOATTR until we know if they're
supposed to know.

Sponsored by:	DARPA & NAI Labs.
2002-09-05 20:28:24 +00:00
bde
fcab7b8389 Include <sys/malloc.h> instead of depending on namespace pollution 2
layers deep in <sys/proc.h> or <sys/vnode.h>.

Include <sys/vmmeter.h> instead of depending on namespace pollution in
<sys/pcpu.h>.

Sorted includes as much as possible.
2002-09-05 09:43:24 +00:00
rwatson
9c25e3c24e Since we have vp and td cached in local variables, use those instead
of derefencing the VOP arguments again when calling the UFS code.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-09-01 16:06:40 +00:00
phk
49cb8194a9 Correctly handle setting, getting and deleting EA's with zero length content.
Sponsored by:	DARPA & NAI Labs.
2002-08-30 08:57:09 +00:00
charnier
7dd9d47059 Replace various spelling with FALLTHROUGH which is lint()able 2002-08-25 13:23:09 +00:00
alc
cdcc7b3446 o Retire vm_page_zero_fill() and vm_page_zero_fill_area(). Ever since
pmap_zero_page() and pmap_zero_page_area() were modified to accept
   a struct vm_page * instead of a physical address, vm_page_zero_fill()
   and vm_page_zero_fill_area() have served no purpose.
2002-08-25 00:22:31 +00:00
phk
a1e3931514 Implement list of EA return functionality.
Correctly delete EA's when the content length is set to zero.

Sponsored by:	DARPA & NAI Labs.
2002-08-20 11:34:58 +00:00
phk
d1ede7ccd1 First snapshot of UFS2 EA support.
Sponsored by: DARPA & NAI Labs.
2002-08-19 07:01:55 +00:00
rwatson
44404e4547 In order to better support flexible and extensible access control,
make a series of modifications to the credential arguments relating
to file read and write operations to cliarfy which credential is
used for what:

- Change fo_read() and fo_write() to accept "active_cred" instead of
  "cred", and change the semantics of consumers of fo_read() and
  fo_write() to pass the active credential of the thread requesting
  an operation rather than the cached file cred.  The cached file
  cred is still available in fo_read() and fo_write() consumers
  via fp->f_cred.  These changes largely in sys_generic.c.

For each implementation of fo_read() and fo_write(), update cred
usage to reflect this change and maintain current semantics:

- badfo_readwrite() unchanged
- kqueue_read/write() unchanged
  pipe_read/write() now authorize MAC using active_cred rather
  than td->td_ucred
- soo_read/write() unchanged
- vn_read/write() now authorize MAC using active_cred but
  VOP_READ/WRITE() with fp->f_cred

Modify vn_rdwr() to accept two credential arguments instead of a
single credential: active_cred and file_cred.  Use active_cred
for MAC authorization, and select a credential for use in
VOP_READ/WRITE() based on whether file_cred is NULL or not.  If
file_cred is provided, authorize the VOP using that cred,
otherwise the active credential, matching current semantics.

Modify current vn_rdwr() consumers to pass a file_cred if used
in the context of a struct file, and to always pass active_cred.
When vn_rdwr() is used without a file_cred, pass NOCRED.

These changes should maintain current semantics for read/write,
but avoid a redundant passing of fp->f_cred, as well as making
it more clear what the origin of each credential is in file
descriptor read/write operations.

Follow-up commits will make similar changes to other file descriptor
operations, and modify the MAC framework to pass both credentials
to MAC policy modules so they can implement either semantic for
revocation.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-08-15 20:55:08 +00:00
phk
40f2376812 Expand the arguments to ffs_ext{read,write}() to their component
parts rather than use vop_{read,write}_args.  Access to these
functions will ultimately not be available through the
"vop_{read,write}+IO_EXT" API but this functionality is retained
for debugging purposes for now.

Sponsored by: DARPA & NAI Labs.
2002-08-13 11:33:01 +00:00
phk
088460429a Unravel the UFS_EXTATTR incest between FFS and UFS: UFS_EXTATTR is an
UFS only thing, and FFS should in principle not know if it is enabled
or not.

This commit cleans ffs_vnops.c for such knowledge, but not ffs_vfsops.c

Sponsored by: DARPA and NAI Labs.
2002-08-13 10:33:57 +00:00
phk
e4f487f25e Introduce typedefs for the member functions of struct vfsops and employ
these in the main filesystems.  This does not change the resulting code
but makes the source a little bit more grepable.

Sponsored by:	DARPA and NAI Labs.
2002-08-13 10:05:50 +00:00
rwatson
b0388fc24a Pass IO_NOMACCHECK to vn_rdwr() in the following checks to prevent
enforcement of MAC policy on the read or write operations:

- In ext2fs, don't enforce MAC on loop-back reads and writes supporting
  directory read operations in lookup(), directory modifications in
  rename(), directory write operations in mkdir(), symlink write
  operations in symlink().

- In the NFS client locking code, perform vn_rdwr() on the NFS locking
  socket without enforcing MAC, since the write is done on behalf of
  the kernel NFS implementation rather than the user process.

- In UFS, don't enforce MAC on loop-back reads and writes supporting
  directory read operations in lookup(), and symlink write operations
  in symlink().

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-08-12 16:43:04 +00:00
phk
58bc3221a4 Stop pretending that the FFS file ufs_readwrite.c is a UFS file.
Instead of #including it, pull it into ffs_vnops.c and name things
correctly.

Sponsored by:	DARPA & NAI Labs.
2002-08-12 10:32:56 +00:00
phk
d5c53dc7ef Fix a comment. 2002-08-12 09:22:11 +00:00
iedowse
504ae196a9 Don't call softdep_slowdown() if soft updates are not active on the
filesystem. This causes a panic for kernels compiled without
softupdates.

Reported by:	luigi
2002-08-05 17:59:20 +00:00
jeff
02517b6731 - Replace v_flag with v_iflag and v_vflag
- v_vflag is protected by the vnode lock and is used when synchronization
   with VOP calls is needed.
 - v_iflag is protected by interlock and is used for dealing with vnode
   management issues.  These flags include X/O LOCK, FREE, DOOMED, etc.
 - All accesses to v_iflag and v_vflag have either been locked or marked with
   mp_fixme's.
 - Many ASSERT_VOP_LOCKED calls have been added where the locking was not
   clear.
 - Many functions in vfs_subr.c were restructured to provide for stronger
   locking.

Idea stolen from:	BSD/OS
2002-08-04 10:29:36 +00:00
rwatson
85e0975519 Introduce support for Mandatory Access Control and extensible
kernel access control.

Instrument UFS to support per-inode MAC labels.  In particular,
invoke MAC framework entry points for generically supporting the
backing of MAC labels into extended attributes.  This ends up
introducing new vnode operation vector entries point at the MAC
framework entry points, as well as some explicit entry point
invocations for file and directory creation events so that the
MAC framework can push labels to disk before the directory names
become persistent (this will work better once EAs in UFS2 are
hooked into soft updates).  The generic EA MAC entry points
support executing with the file system in either single label
or multilabel operation, and will fall back to the mount label
if multilabel is not specified at mount-time.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-07-31 16:05:30 +00:00
phk
c1cd9e269d I forgot this bit of uglyness in the fsck_ffs cleanup. 2002-07-31 07:01:18 +00:00
phk
aba97c33f3 Fix braino in last commit. 2002-07-30 12:02:41 +00:00
phk
99c4adc497 Move ffs_isfreeblock() to ffs_alloc.c and make it static.
Sponsored by: DARPA & NAI Labs.
2002-07-30 11:54:48 +00:00
alc
27bb8a7310 Lock page queue accesses by vm_page_free(). 2002-07-28 08:01:48 +00:00
benno
7409589c7a Add a missing argument to the stub for softdep_setup_freeblocks.
Forgotten by:	mckusick
2002-07-20 04:07:15 +00:00
peter
c458732bcf Fix a warning:
ffs_softdep.c:1630: warning: int format, different type arg (arg 2)
2002-07-20 01:09:35 +00:00
mckusick
b44cb5787c Add support to UFS2 to provide storage for extended attributes.
As this code is not actually used by any of the existing
interfaces, it seems unlikely to break anything (famous
last words).

The internal kernel interface to manipulate these attributes
is invoked using two new IO_ flags: IO_NORMAL and IO_EXT.
These flags may be specified in the ioflags word of VOP_READ,
VOP_WRITE, and VOP_TRUNCATE. Specifying IO_NORMAL means that
you want to do I/O to the normal data part of the file and
IO_EXT means that you want to do I/O to the extended attributes
part of the file. IO_NORMAL and IO_EXT are mutually exclusive
for VOP_READ and VOP_WRITE, but may be specified individually
or together in the case of VOP_TRUNCATE. For example, when
removing a file, VOP_TRUNCATE is called with both IO_NORMAL
and IO_EXT set. For backward compatibility, if neither IO_NORMAL
nor IO_EXT is set, then IO_NORMAL is assumed.

Note that the BA_ and IO_ flags have been `merged' so that they
may both be used in the same flags word. This merger is possible
by assigning the IO_ flags to the low sixteen bits and the BA_
flags the high sixteen bits. This works because the high sixteen
bits of the IO_ word is reserved for read-ahead and help with
write clustering so will never be used for flags. This merge
lets us get away from code of the form:

        if (ioflags & IO_SYNC)
                flags |= BA_SYNC;

For the future, I have considered adding a new field to the
vattr structure, va_extsize. This addition could then be
exported through the stat structure to allow applications to
find out the size of the extended attribute storage and also
would provide a more standard interface for truncating them
(via VOP_SETATTR rather than VOP_TRUNCATE).

I am also contemplating adding a pathconf parameter (for
concreteness, lets call it _PC_MAX_EXTSIZE) which would
let an application determine the maximum size of the extended
atribute storage.

Sponsored by:	DARPA & NAI Labs.
2002-07-19 07:29:39 +00:00
mckusick
3abb526f86 Change utimes to set the file creation time (for filesystems that
support creation times such as UFS2) to the value of the
modification time if the value of the modification time is older
than the current creation time. See utimes(2) for further details.

Sponsored by:	DARPA & NAI Labs.
2002-07-17 02:03:19 +00:00