freebsd-skq

Author	SHA1	Message	Date
rwatson	3ad18c8074	Update licenses and wording: NAI has authorized the removal of clause three of their BSD-style license; also, carry out the NAI Labs -> Network Associates Laboratories renaming in these files.	2002-11-04 02:35:46 +00:00
wollman	ce3867deda	Implement the new 1003.1-2001 pathconf() keys, including the Advisory Information option. Other filesystem implementations should do something similar. With advice from: mckusick, phk	2002-10-27 18:09:49 +00:00
rwatson	312cab0dee	Slightly change the semantics of vnode labels for MAC: rather than "refreshing" the label on the vnode before use, just get the label right from inception. For single-label file systems, set the label in the generic VFS getnewvnode() code; for multi-label file systems, leave the labeling up to the file system. With UFS1/2, this means reading the extended attribute during vfs_vget() as the inode is pulled off disk, rather than hitting the extended attributes frequently during operations later, improving performance. This also corrects semantics for shared vnode locks, which were not previously present in the system. This changes the cache coherency properties WRT out-of-band access to label data, but in an acceptable form. With UFS1, there is a small race condition during automatic extended attribute start -- this is not present with UFS2, and occurs because EAs aren't available at vnode inception. We'll introduce a workaround for this shortly. Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-26 14:38:24 +00:00
mckusick	6b1611bd94	Within ufs, the ffs_sync and ffs_fsync functions did not always check for and/or report I/O errors. The result is that a VFS_SYNC or VOP_FSYNC called with MNT_WAIT could loop infinitely on ufs in the presence of a hard error writing a disk sector or in a filesystem full condition. This patch ensures that I/O errors will always be checked and returned. This patch also ensures that every call to VFS_SYNC or VOP_FSYNC with MNT_WAIT set checks for and takes appropriate action when an error is returned. Sponsored by: DARPA & NAI Labs.	2002-10-25 00:20:37 +00:00
mckusick	0337df10b7	We must be careful to avoid recursive copy-on-write faults when trying to clean up during disk-full scenarios. Sponsored by: DARPA & NAI Labs.	2002-10-23 21:47:02 +00:00
mckusick	3819d46020	Misplaced FREE_LOCK causes a panic when hit while taking a snapshot. Sponsored by: DARPA & NAI Labs.	2002-10-23 05:14:06 +00:00
mckusick	04450228c6	This update further fine tunes the locking of snapshot vnodes in the ffs_copyonwrite routine to avoid a deadlock between the syncer daemon trying to sync out a snapshot vnode and the bufdaemon trying to write out a buffer containing the snapshot inode. With any luck this will be the last snapshot race condition. Sponsored by: DARPA & NAI Labs.	2002-10-22 01:23:00 +00:00
mckusick	a515fcf789	This update is a performance improvement when allocating blocks on a full filesystem. Previously, if the allocation failed, we had to fsync the file before rolling back any partial allocation of indirect blocks. Most block allocation requests only need to allocate a single data block and if that allocation fails, there is nothing to unroll. So, before doing the fsync, we check to see if any rollback will really be necessary. If none is necessary, then we simply return. This update eliminates the flurry of disk activity that got triggered whenever a filesystem would run out of space. Sponsored by: DARPA & NAI Labs.	2002-10-22 01:14:25 +00:00
mckusick	305e5868f3	This checkin reimplements the io-request priority hack in a way that works in the new threaded kernel. It was commented out of the disksort routine earlier this year for the reasons given in kern/subr_disklabel.c (which is where this code used to reside before it moved to kern/subr_disk.c): ---------------------------- revision 1.65 date: 2002/04/22 06:53:20; author: phk; state: Exp; lines: +5 -0 Comment out Kirks io-request priority hack until we can do this in a civilized way which doesn't cause grief. The problem is that it is not generally safe to cast a "struct bio *" to a "struct buf *". Things like ccd, vinum, ata-raid and GEOM constructs bio's which are not entrails of a struct buf. Also, curthread may or may not have anything to do with the I/O request at hand. The correct solution can either be to tag struct bio's with a priority derived from the requesting threads nice and have disksort act on this field, this wouldn't address the "silly-seek syndrome" where two equal processes bang the diskheads from one edge to the other of the disk repeatedly. Alternatively, and probably better: a sleep should be introduced either at the time the I/O is requested or at the time it is completed where we can be sure to sleep in the right thread. The sleep also needs to be in constant timeunits, 1/hz can be practically any sub-second size, at high HZ the current code practically doesn't do anything. ---------------------------- As suggested in this comment, it is no longer located in the disk sort routine, but rather now resides in spec_strategy where the disk operations are being queued by the thread that is associated with the process that is really requesting the I/O. At that point, the disk queues are not visible, so the I/O for positively niced processes is always slowed down whether or not there is other activity on the disk. On the issue of scaling HZ, I believe that the current scheme is better than using a fixed quantum of time. As machines and I/O subsystems get faster, the resolution on the clock also rises. So, ten years from now we will be slowing things down for shorter periods of time, but the proportional effect on the system will be about the same as it is today. So, I view this as a feature rather than a drawback. Hence this patch sticks with using HZ. Sponsored by: DARPA & NAI Labs. Reviewed by: Poul-Henning Kamp <phk@critter.freebsd.dk>	2002-10-22 00:59:49 +00:00
rwatson	d862ecfee8	Rename _POSIX_FOO_PRESENT and friends from POSIX.1e to _PC_FOO_PRESENT and related friends. This would have been corrected had POSIX.1e progressed to a standard. Pointed out by: wollman	2002-10-20 22:11:13 +00:00
rwatson	438835cabb	Implement _POSIX_ACL_PATH_MAX, which returns the maximum number of ACL entries for a file system node using pathconf(). Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-20 22:08:26 +00:00
rwatson	9d17032f64	Teach UFS to respond to pathconf() tests for _POSIX_ACL_EXTENDED and _POSIX_MAC_PRESENT based on available mount flags, if the services are available. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-20 21:49:41 +00:00
rwatson	a2eb2e3662	Clarify that the UFS1 extended attribute configuration steps do not apply to UFS2 file systems. Submitted by: jedgar Obtained from: TrustedBSD Project	2002-10-19 16:09:16 +00:00
dillon	d155b8f135	Fix a file-rewrite performance case for UFS[2]. When rewriting portions of a file in chunks that are less than the filesystem block size, if the data is not already cached the system will perform a read-before-write. The problem is that it does this on a block-by-block basis, breaking up the I/Os and making clustering impossible for the writes. Programs such as INN using cyclic file buffers suffer greatly. This problem is only going to get worse as we use larger and larger filesystem block sizes. The solution is to extend the sequential heuristic so UFS[2] can perform a far larger read and readahead when dealing with this case. (note: maximum disk write bandwidth is 27MB/sec thru filesystem) (note: filesystem blocksize in test is 8K (1K frag)) dd if=/dev/zero of=test.dat bs=1k count=2m conv=notrunc Before: (note half of these are reads) tty da0 da1 acd0 cpu tin tout KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s us ni sy in id 0 76 14.21 598 8.30 0.00 0 0.00 0.00 0 0.00 0 0 7 1 92 0 76 14.09 813 11.19 0.00 0 0.00 0.00 0 0.00 0 0 9 5 86 0 76 14.28 821 11.45 0.00 0 0.00 0.00 0 0.00 0 0 8 1 91 After: (note half of these are reads) tty da0 da1 acd0 cpu tin tout KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s us ni sy in id 0 76 63.62 434 26.99 0.00 0 0.00 0.00 0 0.00 0 0 18 1 80 0 76 63.58 424 26.30 0.00 0 0.00 0.00 0 0.00 0 0 17 2 82 0 76 63.82 438 27.32 0.00 0 0.00 0.00 0 0.00 1 0 19 2 79 Reviewed by: mckusick Approved by: re X-MFC after: immediately (was heavily tested in -stable for 4 months)	2002-10-18 22:52:41 +00:00
rwatson	ab9568ccbf	Update extended attribute readme file to note that no special configuration is required to use EAs with UFS2, and that UFS2 is recommended for EA use for a variety of reasons. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-18 21:11:36 +00:00
rwatson	10e2a00a6a	Update instructions for ACLs given recent tunefs, mount changes. Also note that UFS2 doesn't require explicit extended attribute configuration, and is recommended for this and other reasons if you plan to use ACLs. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-18 21:09:57 +00:00
rwatson	749729a702	Use 'size_t' instead of 'int' for the result of sizeof().	2002-10-18 21:03:30 +00:00
mckusick	0af0d22682	With the revised single-lock method used in snapshots, the BA_NOWAIT flag is no longer needed. Sponsored by: DARPA & NAI Labs.	2002-10-18 01:17:28 +00:00
mckusick	733bfbdd78	Change locking so that all snapshots on a particular filesystem share a common lock. This change avoids a deadlock between snapshots when separate requests cause them to deadlock checking each other for a need to copy blocks that are close enough together that they fall into the same indirect block. Although I had anticipated a slowdown from contention for the single lock, my filesystem benchmarks show no measurable change in throughput on a uniprocessor system with three active snapshots. I conjecture that this result is because every copy-on-write fault must check all the active snapshots, so the process was inherently serial already. This change removes the last of the deadlocks of which I am aware in snapshots. Sponsored by: DARPA & NAI Labs.	2002-10-16 00:19:23 +00:00
rwatson	174c4a0034	Push most UFS ACL behavior behind a check for MNT_ACLS, permitting ACLs to be administratively disabled as needed on UFS/UFS2 file systems. This also has the effect of preventing the slightly more expensive ACL code from running on non-ACL file systems, avoiding storage allocation for ACLs that may be read from disk. MNT_ACLS may be set at mount-time using mount -o acls, or implicitly by setting the FS_ACLS flag using tunefs. On UFS1, you may also have to configure ACL store. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-15 21:28:24 +00:00
rwatson	e112f21cae	If the FS_MULTILABEL flag is set in a UFS or UFS2 superblock, automatically set MNT_MULTILABEL in the mount flags. If FS_ACLS is set in a UFS or UFS2 superblock, automatically set MNT_ACLS in the mount flags. If either of these flags is set, but the appropriate kernel option to support the features associated with the flag isn't available, then print a warning at mount-time. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-15 20:00:06 +00:00
mckusick	750c7540cc	When reading or writing the extended attributes of a special device or fifo in UFS2, the normal ufs_strategy routine needs to be used rather than the spec_strategy or fifo_strategy routine. Thus the ffsext_strategy routine is interposed in the ffs_vnops vectors for special devices and fifo's to pick off this special case. Otherwise it simply falls through to the usual spec_strategy or fifo_strategy routine. Submitted by: Robert Watson <rwatson@FreeBSD.org> Sponsored by: DARPA & NAI Labs.	2002-10-14 23:18:09 +00:00
rwatson	67d568c288	Fix two memory leaks in error conditions involving the UFS ACL code: if failures occur, make sure that we release both the default ACL and access ACL storage during new object creation. Spotted by: phk and his pet flexelint Sponsored by: DARPA, Network Associates Laboratories	2002-10-14 19:55:49 +00:00
rwatson	9c6b9f51d1	Define two new superblock file system flags: FS_ACLS Administrative enable/disable of extended ACL support FS_MULTILABEL Administrative flag to indicate to the MAC Framework that objects in the file system are individually labeled using extended attributes. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories Reviewed by: (in principle) mckusick, phk	2002-10-14 17:07:11 +00:00
mckusick	25230d4c6a	Regularize the vop_stdlock'ing protocol across all the filesystems that use it. Specifically, vop_stdlock uses the lock pointed to by vp->v_vnlock. By default, getnewvnode sets up vp->v_vnlock to reference vp->v_lock. Filesystems that wish to use the default do not need to allocate a lock at the front of their node structure (as some still did) or do a lockinit. They can simply start using vn_lock/VOP_UNLOCK. Filesystems that wish to manage their own locks, but still use the vop_stdlock functions (such as nullfs) can simply replace vp->v_vnlock with a pointer to the lock that they wish to have used for the vnode. Such filesystems are responsible for setting the vp->v_vnlock back to the default in their vop_reclaim routine (e.g., vp->v_vnlock = &vp->v_lock). In theory, this set of changes cleans up the existing filesystem lock interface and should have no function change to the existing locking scheme. Sponsored by: DARPA & NAI Labs.	2002-10-14 03:20:36 +00:00
mike	8630abe45f	Change iov_base's type from `char *' to the standard `void *'. All uses of iov_base which assume its type is `char *' (in order to do pointer arithmetic) have been updated to cast iov_base to `char *'.	2002-10-11 14:58:34 +00:00
mux	ddefc9d7d6	Fix build of 64 bit platforms.	2002-10-09 12:19:36 +00:00
mckusick	e13d3e9276	When creating a snapshot, create a list of initially allocated blocks. Whenever doing a copy-on-write check, first look in the list of initially allocated blocks to see if it is there. If so, no further check is needed. If not, fall through and do the full check. This change eliminates one of two known deadlocks caused by snapshots. Handling the second deadlock will be the subject of another check-in. This change also reduces the cost of the copy-on-write check by speeding up the verification of frequently checked blocks. Sponsored by: DARPA & NAI Labs.	2002-10-09 07:28:35 +00:00
mckusick	8cecf7b3e5	When creating a snapshot, create a list of initially allocated blocks. Whenever doing a copy-on-write check, first look in the list of initially allocated blocks to see if it is there. If so, no further check is needed. If not, fall through and do the full check. This change eliminates one of two known deadlocks caused by snapshots. Handling the second deadlock will be the subject of another check-in. This change also reduces the cost of the copy-on-write check by speeding up the verification of frequently checked blocks. Sponsored by: DARPA & NAI Labs.	2002-10-09 06:13:48 +00:00
mckusick	d01b53c175	The appropriate units for disk block addresses are always DEV_BSIZE, even when the underlying device has a larger sector size. Therefore, the filesystem code should not (and with this patch does not) try to use the underlying sector size when doing disk block address calculations. This patch fixes problems in -current when using the swap-based memory-disk device (mdconfig -a -t swap ...). This bugfix is not relevant to -stable as -stable does not have the memory-disk device. Sponsored by: DARPA & NAI Labs.	2002-10-09 04:01:23 +00:00
jeff	29006c0306	- Remove LK_INTERLOCK from the vn_lock() in ffs_snapshot(). Pointy hat to: me Found by: green	2002-10-08 21:00:52 +00:00
phk	f98c8d3a06	Mark two places where an unsigned number is checked "if (foo < 0)" with an XXX comment. Somebody[TM] should look at this in some detail. Spotted by: FlexeLint	2002-10-02 09:11:18 +00:00
dd	814e414600	size_t is not a struct (fix mislabelling in a comment).	2002-10-02 05:15:34 +00:00
phk	b55fa4540e	Fix some harmless mis-indents. Spotted by: FlexeLint	2002-10-01 15:48:31 +00:00
jmallett	a17c9b6f5c	When spamming me with a printf(9), under DIAGNOSTIC, at least be nice enough to include a newline. MFC after: 4 days Sponsored by: Bright Path Solutions	2002-09-28 19:04:49 +00:00
phk	1dfc2c167f	Be consistent about "static" functions: if the function is marked static in its prototype, mark it static at the definition too. Inspired by: FlexeLint warning #512	2002-09-28 17:15:38 +00:00
phk	1a999cb3f9	Make it a tad easier to deal with struct inode in userland programs which fondle /dev/kmem by using "struct cdev *" instead of "dev_t". Requested by: jake	2002-09-27 20:03:05 +00:00
phk	5f5d8287e5	Use our mount-credential if we get a NOCRED when we try to write out EA space back to disk. This is wrong in many ways, but not as wrong as a panic. Panicked on: rwatson & jmallet Sponsored by: DARPA & NAI Labs.	2002-09-27 20:00:03 +00:00
jeff	8bebc5fdba	- Convert locks to use standard macros. - Lock access to the buflists. - Document broken locking. - Use vrefcnt().	2002-09-25 02:49:48 +00:00
jeff	90e87c8eb5	- Document broken locking. - Use vrefcnt().	2002-09-25 02:47:49 +00:00
jeff	41b9d1ca5d	- Lock accesses to v_usecount. - Convert interlock locks to use standard macros.	2002-09-25 02:45:50 +00:00
jeff	263f8202f6	- Don't use the interlock to protect v_writecount.	2002-09-25 02:44:55 +00:00
phk	71d1473801	We don't need to #include <sys/disklabel.h>. We don't need to #include <sys/disklabel.h> a second time either. Sponsored by: DARPA & NAI Labs.	2002-09-20 16:42:33 +00:00
truckman	f280782003	VOP_FSYNC() requires that its vnode argument be locked, which nfs_link() wasn't doing. Rather than just lock and unlock the vnode around the call to VOP_FSYNC(), implement rwatson's suggestion to lock the file vnode in kern_link() before calling VOP_LINK(), since the other filesystems also locked the file vnode right away in their link methods. Remove the locking and unlocking from the leaf filesystem link methods. Reviewed by: rwatson, bde (except for the unionfs_link() changes)	2002-09-19 13:32:45 +00:00
obrien	a8deaee84b	intmax_t is printed with %jd, not %lld.	2002-09-19 03:55:30 +00:00
njl	00c79f5c92	Remove any VOP_PRINT that redundantly prints the tag. Move lockmgr_printinfo() into vprint() for everyone's benefit. Suggested by: bde	2002-09-18 20:42:04 +00:00
njl	0590c43070	Remove all use of vnode->v_tag, replacing with appropriate substitutes. v_tag is now const char * and should only be used for debugging. Additionally: 1. All users of VT_NTS now check vfsconf->vf_type VFCF_NETWORK 2. The user of VT_PROCFS now checks for the new flag VV_PROCDEP, which is propagated by pseudofs to all child vnodes if the fs sets PFS_PROCDEP. Suggested by: phk Reviewed by: bde, rwatson (earlier version)	2002-09-14 09:02:28 +00:00
bde	8aa3df4eb2	vfs_syscalls.c: Changed rename(2) to follow the letter of the POSIX spec. POSIX requires rename() to have no effect if its args "resolve to the same existing file". I think "file" can only reasonably be read as referring to the inode, although the rationale and "resolve" seem to say that sameness is at the level of (resolved) directory entries. ext2fs_vnops.c, ufs_vnops.c: Replaced code that gave the historical BSD behaviour of removing one link name by checks that this code is now unreachable. This fixes some races. All vnodes needed to be unlocked for the removal, and locking at another level using something like IN_RENAME was not even attempted, so it was possible for rename(x, y) to return with both x and y removed even without any unlink(2) syscalls (one process can remove x using rename(x, y) and another process can remove y using rename(y, x)). Prodded by: alfred MFC after: 8 weeks PR: 42617	2002-09-10 11:09:13 +00:00
phk	87f5667c5a	Implement the VOP_OPENEXTATTR() and VOP_CLOSEEXTATTR() methods. Use extattr_check_cred() to check access to EAs. This is still a WIP. Sponsored by: DARPA & NAI Labs.	2002-09-05 20:59:42 +00:00
phk	db06a743d8	Use canonical extattr_check_cred() instead of private implementation of the same policy. Sponsored by: DARPA & NAI Labs.	2002-09-05 20:39:36 +00:00

1003 Commits