freebsd-dev

Author	SHA1	Message	Date
Tor Egge	95e7a3c3ac	Reduce probability of unmount failing after having unmounted snapshots.	2006-03-19 21:09:19 +00:00
Tor Egge	ca2fa80767	Block secondary writes while expunging active unlinked files. Fix detection of active unlinked files by checking VI_OWEINACT and VI_DOINGINACT in addition to v_usecount. Defer inactive handling for unlinked files if the file system is mostly suspended (secondary writes being blocked). Perform deferred inactive handling after the file system is resumed.	2006-03-11 01:08:37 +00:00
Tor Egge	1e70cd7fc7	Remove unneeded (and broken) usage of MNT_REF()/MNT_REL().	2006-03-10 02:31:12 +00:00
Tor Egge	791dd2fade	Use vn_start_secondary_write() and vn_finished_secondary_write() as a replacement for vn_write_suspend_wait() to better account for secondary write processing. Close race where secondary writes could be started after ffs_sync() returned but before the file system was marked as suspended. Detect if secondary writes or softdep processing occurred during vnode sync loop in ffs_sync() and retry the loop if needed.	2006-03-08 23:43:39 +00:00
Tor Egge	82be0a5a24	Add marker vnodes to ensure that all vnodes associated with the mount point are iterated over when using MNT_VNODE_FOREACH. Reviewed by: truckman	2006-01-09 20:42:19 +00:00
Craig Rodrigues	b6bd025c35	Fix parsing of atime, clusterr, clusterw, exec, suid, symfollow mount options. Noticed by: Amir Shalem < amir at boom dot org dot il>	2005-11-24 15:06:40 +00:00
Craig Rodrigues	cea903627f	If export mount flag is not passed in, set default parameters for export structure and pass that to vfs_export(). Currently in userland mount(8), an export structure is unconditionally passed in, only for UFS. This is an attempt to move that UFS-specific behavior out of mount(8) and into the UFS filesystem code.	2005-11-20 17:04:50 +00:00
Craig Rodrigues	359d438885	Add more options to ffs_opts, so that vfs_filteropts() will not complain when we pass these options to a UFS filesystem as strings via nmount(): noexec, nosuid, nosymfollow, sync, suiddir	2005-11-19 23:28:19 +00:00
Craig Rodrigues	26f59b6455	- Add parsing for the following existing UFS/FFS mount options in the nmount() callpath via vfs_getopt(), and set the appropriate MNT_* flag: -> acls, async, force, multilabel, noasync, noatime, -> noclusterr, noclusterw, snapshot, update - Allow errmsg as a valid mount option via vfs_getopt(), so we can later add a hook to propagate mount errors back to userspace via vfs_mount_error().	2005-11-18 06:06:10 +00:00
Nate Lawson	8680d6985f	Adjust maxfilesize for UFS1 and old 4.4 FFS. For UFS1, increase the limit to (max block - 1) * bsize. For DEV_BSIZE, this doubles the limit from 0.5 TB to 1 TB. For the old 4.4 FFS case, decrease the limit from 0.5 TB to 2 GB - 1. Older systems had a 32 bit off_t so they couldn't access the larger files anyway. Collaboration with: bde	2005-10-21 01:54:00 +00:00
Tor Egge	9248a8271c	Don't pretend that a failed sync write was succesful.	2005-10-09 20:49:01 +00:00
Suleiman Souhlal	fdedad764a	ffs_mountfs() needs devvp to be locked, so lock it. Glanced at by: phk Tested by: pjd MFC after: 3 days	2005-09-02 13:52:55 +00:00
Suleiman Souhlal	93373c422f	Set the mountpoint path in the superblock (fs_fsmnt) at mount-time so that it appears in the various messages (not cleanly unmounted, filesystem full, etc). This has been broken since rev 1.261.	2005-08-21 22:06:41 +00:00
Alan Cox	ec9c9e7363	Eliminate inconsistency in the setting of the B_DONE flag. Specifically, make the b_iodone callback responsible for setting it if it is needed. Previously, it was set unconditionally by bufdone() without holding whichever lock is shared by the b_iodone callback and the corresponding top-half function. Consequently, in a race, the top-half function could conclude that operation was done before the b_iodone callback finished. See, for example, aio_physwakeup() and aio_fphysio(). Note: I don't believe that the other, more widely-used b_iodone callbacks are affected. Discussed with: jeff Reviewed by: phk MFC after: 2 weeks	2005-07-20 19:06:06 +00:00
Jeff Roberson	204ec66d38	- Don't set our bio op to be a READ when we've just completed a write. There are subtle differences in the read and write completion path. Instead, grab an extra write ref so the write path can drop it when we recursively call bufdone(). I believe this may be the source of the wrong bufobj panics. Reported by: pho, kkenn	2005-05-30 07:04:15 +00:00
Jeff Roberson	41d4783d49	- In ffs_sync we need to pass LK_SLEEPFAIL in when we lock the vnode because it may change identities while we're sleeping on the lock. Otherwise we may bail out of ffs_sync() early due to an error from deadfs. - Collapse a VOP_UNLOCK, vrele into a single vput().	2005-04-03 10:38:18 +00:00
Jeff Roberson	153910e0f5	- Move the contents of softdep_disk_prewrite into ffs_geom_strategy to fix two bugs. - ffs_disk_prewrite was pulling the vp from the buf and checking for COPYONWRITE, when really it wanted the vp from the bufobj that we're writing to, which is the devvp. This lead to us skipping the copy on write to all file data, which significantly broke snapshots for the last few months. - When the SOFTUPDATES option was not included in the kernel config we would also skip the copy on write check, which would effectively disable snapshots. - Remove an invalid mp_fixme(). Debugging tips from: mckusick Reported by: iedowse, others Discussed with: phk	2005-04-03 10:29:55 +00:00
Jeff Roberson	aa7ba42796	- FFS supports shared locks, clear LK_NOSHARE from our vnode locks. Sponsored by: Isilon Systems, Inc.	2005-03-31 05:23:20 +00:00
Jeff Roberson	d6919865fa	- Upgrade a shared lock request to exclusive in ffs_vget() if we have to create the vnode. Sponsored by: Isilon Systems, Inc.	2005-03-29 10:10:51 +00:00
Poul-Henning Kamp	51f5ce0c8c	Add two arguments to the vfs_hash() KPI so that filesystems which do not have unique hashes (NFS) can also use it.	2005-03-16 11:20:51 +00:00
Poul-Henning Kamp	de68347b1b	Don't hold a reference on the disk vnode for each inode.	2005-03-15 20:50:58 +00:00
Poul-Henning Kamp	45c26fa2b6	Improve the vfs_hash() API: vput() the unneeded vnode centrally to avoid replicating the vput in all the filesystems.	2005-03-15 20:00:03 +00:00
Poul-Henning Kamp	e82ef95c11	Simplify the vfs_hash calling convention.	2005-03-15 08:07:07 +00:00
Poul-Henning Kamp	14bc0685ac	Use vfs_hash instead of home-rolled.	2005-03-14 10:21:16 +00:00
Jeff Roberson	fe68abe291	- The VI_DOOMED flag now signals the end of a vnode's relationship with the filesystem. Check that rather than VI_XLOCK. - Shorten ffs_reload by one step. The old check for an inactive vnode was slightly racey, and the code which deals with still active vnodes is not much more expensive. Sponsored by: Isilon Systems, Inc.	2005-03-13 12:03:14 +00:00
Poul-Henning Kamp	dfd4be14bd	Try to unbreak the vnode locking around vop_reclaim() (based mostly on patch from kan@). Pull bufobj_invalbuf() out of vinvalbuf() and make g_vfs call it on close. This is not yet a generally safe function, but for this very specific use it is safe. This solves the problem with buffers not being flushed by unmount or after failed mount attempts.	2005-02-19 11:44:57 +00:00
Poul-Henning Kamp	adf4157738	Make a some SYSCTL_NODEs and some of FFS's VFS_ methods static.	2005-02-10 12:20:08 +00:00
Poul-Henning Kamp	02f2c6a9d8	Split the vop_vector for ffs1 and ffs2, this is mostly for the different EXTATTR support.	2005-02-08 21:03:52 +00:00
Poul-Henning Kamp	dd19a799b8	Background writes are entirely an FFS/Softupdates thing. Give FFS vnodes a specific bufwrite method which contains all the background write stuff and then calls into the default bufwrite() for the rest of the job. Remove all the background write related stuff from the normal bufwrite. This drags the softdep_move_dependencies() back into FFS. Long term, it is worth looking at simply copying the data into allocated memory and issuing the bio directly and not create the "shadow buf" in the first place (just like copy-on-write is done in snapshots for instance). I don't think we really gain anything but complexity from doing this with a buf.	2005-02-08 20:29:10 +00:00
Poul-Henning Kamp	efd6d9808c	Don't use the UFS_* and VFS_* functions where a direct call is possble. The UFS_ functions are for UFS to call back into VFS. The VFS functions are external entry points into the filesystem.	2005-02-08 17:40:01 +00:00
Poul-Henning Kamp	40854ff546	For snapshots we need all VOP_LOCKs to be exclusive. The "business class upgrade" was implemented in UFS's VOP_LOCK implementation ufs_lock() which is the wrong layer, so move it to ffs_lock(). Also, as long as we have not abandonned advanced vfs-stacking we should not preclude it from happening: instead of implementing a copy locally, use the VOP_LOCK_APV(&ufs) to correctly arrive at vop_stdlock() at the bottom.	2005-02-08 16:25:50 +00:00
Poul-Henning Kamp	84a6975215	Introduce and use g_vfs_close().	2005-01-25 15:52:04 +00:00
Poul-Henning Kamp	ce12d37e7b	Don't create vnode_pager objects for the disk device. geom_vfs will do that.	2005-01-24 22:41:59 +00:00
Jeff Roberson	3ba649d792	- Initialize and destroy the per-filesystem ufs lock where appropriate. - Use the buffer lock on the superblock buf to serialize calls to sbupdate. - Set the MNTK_MPSAFE flag when QUOTA is not defined in the kernel. Sponsored By: Isilon Systems, Inc.	2005-01-24 10:12:28 +00:00
Pawel Jakub Dawidek	39cfb23935	Fix ACLs handling for the root file system. Without this fix, when ACLs are set via tunefs(8) on the root file system, they are removed on boot when 'mount -a' is called, because mount(8) called for the root file system always add MNT_UPDATE flag and MNT_UPDATE flag isn't perfect. Now, one cannot remove ACLs stored in superblock (configured with tunefs(8)) via 'mount -a' nor 'mount -u -o noacls <file system>', but it is still possible to mount file system which doesn't have ACLs in superblock via 'mount -o acls <file system>' or /etc/fstab's 'acls' option. Reported by: Lech Lorens/pl.comp.os.bsd Discussed with: phk, rwatson Reviewed by: rwatson MFC after: 2 weeks	2005-01-15 17:09:53 +00:00
Poul-Henning Kamp	7c0745eeae	Eliminate unused and unnecessary "cred" argument from vinvalbuf()	2005-01-14 07:33:51 +00:00
Poul-Henning Kamp	e39db32ab0	Ditch vfs_object_create() and make the callers call VOP_CREATEVOBJECT() directly.	2005-01-13 12:25:19 +00:00
Poul-Henning Kamp	6ef8480a88	Add BO_SYNC() and add a default which uses the secret vnode pointer and VOP_FSYNC() for now.	2005-01-11 10:43:08 +00:00
Poul-Henning Kamp	8df6bac4c7	Remove the unused credential argument from VOP_FSYNC() and VFS_SYNC(). I'm not sure why a credential was added to these in the first place, it is not used anywhere and it doesn't make much sense: The credentials for syncing a file (ability to write to the file) should be checked at the system call level. Credentials for syncing one or more filesystems ("none") should be checked at the system call level as well. If the filesystem implementation needs a particular credential to carry out the syncing it would logically have to the cached mount credential, or a credential cached along with any delayed write data. Discussed with: rwatson	2005-01-11 07:36:22 +00:00
Warner Losh	60727d8b86	/* -> /*- for license, minor formatting changes	2005-01-07 02:29:27 +00:00
Poul-Henning Kamp	4a18054d7b	With the introduction of UFS2 we started looking for superblocks in four different locations on a prospective filesystem. If we found none, we forgot to invalidate the four buffers, thus the following sequence would fails: (md0 = blank disk) mount /dev/md0 /mnt (fails, no superblocks) newfs /dev/md0 (writes using physio which does not go through buffercache). mount /dev/md0 /mnt (still fails, the four cached buffers still contain no superblocks) Found by: ru	2004-12-12 14:19:11 +00:00
Poul-Henning Kamp	f21cc2cafc	Fix nfs exports (for now). The real fix is to teach mountd about nmount.	2004-12-07 15:09:30 +00:00
Poul-Henning Kamp	20a92a18f1	The remaining part of nmount/omount/rootfs mount changes. I cannot sensibly split the conversion of the remaining three filesystems out from the root mounting changes, so in one go: cd9660: Convert to nmount. Add omount compat shims. Remove dedicated rootfs mounting code. Use vfs_mountedfrom() Rely on vfs_mount.c calling VFS_STATFS() nfs(client): Convert to nmount (the simple way, mount_nfs(8) is still necessary). Add omount compat shims. Drop COMPAT_PRELITE2 mount arg compatibility. ffs: Convert to nmount. Add omount compat shims. Remove dedicated rootfs mounting code. Use vfs_mountedfrom() Rely on vfs_mount.c calling VFS_STATFS() Remove vfs_omount() method, all filesystems are now converted. Remove MNTK_WANTRDWR, handling RO/RW conversions is a filesystem task, and they all do it now. Change rootmounting to use DEVFS trampoline: vfs_mount.c: Mount devfs on /. Devfs needs no 'from' so this is clean. symlink /dev to /. This makes it possible to lookup /dev/foo. Mount "real" root filesystem on /. Surgically move the devfs mountpoint from under the real root filesystem onto /dev in the real root filesystem. Remove now unnecessary getdiskbyname(). kern_init.c: Don't do devfs mounting and rootvnode assignment here, it was already handled by vfs_mount.c. Remove now unused bdevvp(), addaliasu() and addalias(). Put the few necessary lines in devfs where they belong. This eliminates the second-last source of bogo vnodes, leaving only the lemming-syncer. Remove rootdev variable, it doesn't give meaning in a global context and was not trustworth anyway. Correct information is provided by statfs(/).	2004-12-07 08:15:41 +00:00
Poul-Henning Kamp	743312367a	VFS_STATFS(mp, ...) is mostly called with &mp->mnt_stat, but a few cases doesn't. Most of the implementations have grown weeds for this so they copy some fields from mnt_stat if the passed argument isn't that. Fix this the cleaner way: Always call the implementation on mnt_stat and copy that in toto to the VFS_STATFS argument if different.	2004-12-05 22:41:02 +00:00
Poul-Henning Kamp	93e0b506e3	typo in comment.	2004-12-03 20:36:55 +00:00
Poul-Henning Kamp	aec0fb7b40	Back when VOP_* was introduced, we did not have new-style struct initializations but we did have lofty goals and big ideals. Adjust to more contemporary circumstances and gain type checking. Replace the entire vop_t frobbing thing with properly typed structures. The only casualty is that we can not add a new VOP_ method with a loadable module. History has not given us reason to belive this would ever be feasible in the the first place. Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc. Give coda correct prototypes and function definitions for all vop_()s. Generate a bit more data from the vnode_if.src file: a struct vop_vector and protype typedefs for all vop methods. Add a new vop_bypass() and make vop_default be a pointer to another struct vop_vector. Remove a lot of vfs_init since vop_vector is ready to use from the compiler. Cast various vop_mumble() to void * with uppercase name, for instance VOP_PANIC, VOP_NULL etc. Implement VCALL() by making vdesc_offset the offsetof() the relevant function pointer in vop_vector. This is disgusting but since the code is generated by a script comparatively safe. The alternative for nullfs etc. would be much worse. Fix up all vnode method vectors to remove casts so they become typesafe. (The bulk of this is generated by scripts)	2004-12-01 23:16:38 +00:00
Poul-Henning Kamp	964ebefd8d	Use system wide no-op vfs_start function.	2004-11-25 09:11:27 +00:00
Poul-Henning Kamp	51ac12ab28	Be prepared to accept NULL mountargs as part of root-mounting.	2004-11-13 13:04:31 +00:00
Poul-Henning Kamp	cf5e414960	Put back the vfs_object_create() calls, they do make a difference when my test-setup does what I want it to instead of what I ask it to. Pointed out by: tegge	2004-11-12 10:27:14 +00:00
Poul-Henning Kamp	40ce27cb57	fix some comments	2004-11-10 06:53:31 +00:00
Poul-Henning Kamp	2e6649198a	Use mount flags instead of NULL path to detect root filesystem mount.	2004-11-09 23:38:10 +00:00
Poul-Henning Kamp	5e2ccaff7a	Stop pretending to have a vm_object backing the underlying disk vnode: it isn't used for anything anywhere and the vnode_pager would explode if we attempted to.	2004-11-09 23:12:45 +00:00
Poul-Henning Kamp	40c340aa5d	Don't grab the exclusive bit on a root filesystem until we are willing to mount it. Doing so prevented fsck to be run after a refused mount.	2004-11-04 09:11:22 +00:00
Poul-Henning Kamp	4392001125	Move UFS from DEVFS backing to GEOM backing. This eliminates a bunch of vnode overhead (approx 1-2 % speed improvement) and gives us more control over the access to the storage device. Access counts on the underlying device are not correctly tracked and therefore it is possible to read-only mount the same disk device multiple times: syv# mount -p /dev/md0 /var ufs rw 2 2 /dev/ad0 /mnt ufs ro 1 1 /dev/ad0 /mnt2 ufs ro 1 1 /dev/ad0 /mnt3 ufs ro 1 1 Since UFS/FFS is not a synchrousely consistent filesystem (ie: it caches things in RAM) this is not possible with read-write mounts, and the system will correctly reject this. Details: Add a geom consumer and a bufobj pointer to ufsmount. Eliminate the vnode argument from softdep_disk_prewrite(). Pick the vnode out of bp->b_vp for now. Eventually we should find it through bp->b_bufobj->b_private. In the mountcode, use g_vfs_open() once we have used VOP_ACCESS() to check permissions. When upgrading and downgrading between r/o and r/w do the right thing with GEOM access counts. Remove all the workarounds for not being able to do this with VOP_OPEN(). If we are the root mount, drop the exclusive access count until we upgrade to r/w. This allows fsck of the root filesystem and the MNT_RELOAD to work correctly. Set bo_private to the GEOM consumer on the device bufobj. Change the ffs_ops->strategy function to call g_vfs_strategy() In ufs_strategy() directly call the strategy on the disk bufobj. Same in rawread. In ffs_fsync() we will no longer see VCHR device nodes, so remove code which synced the filesystem mounted on it, in case we came there. I'm not sure this code made sense in the first place since we would have taken the specfs route on such a vnode. Redo the highly bogus readblock() function in the snapshot code to something slightly less bogus: Constructing an uio and using physio was really quite a detour. Instead just fill in a bio and ship it down.	2004-10-29 10:15:56 +00:00
Poul-Henning Kamp	570a7ddaa3	We only support backing UFS/FFS with disks.	2004-10-28 06:19:28 +00:00
Poul-Henning Kamp	8dd5650594	White space changes. Add missing static.	2004-10-26 20:13:21 +00:00
Poul-Henning Kamp	6e77a04170	The island council met and voted buf_prewrite() home. Give ffs it's own bufobj->bo_ops vector and create a private strategy routine, (currently misnamed for forwards compatibility), which is just a copy of the generic bufstrategy routine except we call softdep_disk_prewrite() directly instead of through the buf_prewrite() indirection. Teach UFS about the need for softdep_disk_prewrite() and call the function directly in FFS. Remove buf_prewrite() from the default bufstrategy() and from the global bio_ops method vector.	2004-10-26 10:44:10 +00:00
Poul-Henning Kamp	5d9d81e7ea	Put the I/O block size in bufobj->bo_bsize. We keep si_bsize_phys around for now as that is the simplest way to pull the number out of disk device drivers in devfs_open(). The correct solution would be to do an ioctl(DIOCGSECTORSIZE), but the point is probably mooth when filesystems sit on GEOM, so don't bother for now.	2004-10-26 07:39:12 +00:00
Poul-Henning Kamp	156cb26583	Loose the v_dirty* and v_clean* alias macros. Check the count field where we just want to know the full/empty state, rather than using TAILQ_EMPTY() or TAILQ_FIRST().	2004-10-25 09:14:03 +00:00
Poul-Henning Kamp	ee1d0eb330	Remove vnode->v_bsize. This was a dead-end.	2004-10-25 07:50:59 +00:00
Pawel Jakub Dawidek	8d02a378aa	Back out changes which were introduced to delay mounting root file system. Those changes were made on gmirror needs, but now gmirror handles this by itself.	2004-10-05 11:26:43 +00:00
Poul-Henning Kamp	4f116178ba	Remove support for accessing device nodes in UFS/FFS. Device nodes can still be created and exported with NFS.	2004-09-28 13:30:58 +00:00
Pawel Jakub Dawidek	5a19f8b0c4	Introduce new /boot/loader.conf variable: root_mount_delay. It can be used to delay mounting root partition to give a chance to GEOM providers to show up. Now, when there is no needed provider, vfs_rootmount() function will look for it every second and if it can't be find in defined time, it'll ask for root device name (before this change it was done immediately). This will allow to boot from gmirror device in degraded mode.	2004-09-23 10:13:18 +00:00
Poul-Henning Kamp	5e8c582ac2	Put a version element in the VFS filesystem configuration structure and refuse initializing filesystems with a wrong version. This will aid maintenance activites on the 5-stable branch. s/vfs_mount/vfs_omount/ s/vfs_nmount/vfs_mount/ Name our filesystems mount function consistently. Eliminate the namiedata argument to both vfs_mount and vfs_omount. It was originally there to save stack space. A few places abused it to get hold of some credentials to pass around. Effectively it is unused. Reorganize the root filesystem selection code.	2004-07-30 22:08:52 +00:00
Poul-Henning Kamp	d634f69316	Remove global variable rootdevs and rootvp, they are unused as such. Add local rootvp variables as needed. Remove checks for miniroot's in the swappartition. We never did that and most of the filesystems could never be used for that, but it had still been copy&pasted all over the place.	2004-07-28 20:21:04 +00:00
Alexander Kabaev	b403319b8d	Avoid using casts as lvalues. Introduce DIP_SET macro which sets proper inode field based on UFS version. Use DIP ro read values and DIP_SET to modify them throughout FFS code base.	2004-07-28 06:41:27 +00:00
Poul-Henning Kamp	d8d3d4158b	Make sure to update the mnt_stats before UFS1 extattr tried to do I/O on the device. Otherwise the blocksize is undefined in the buffer cache.	2004-07-14 14:19:32 +00:00
Alfred Perlstein	f257b7a54b	Make VFS_ROOT() and vflush() take a thread argument. This is to allow filesystems to decide based on the passed thread which vnode to return. Several filesystems used curthread, they now use the passed thread.	2004-07-12 08:14:09 +00:00
Poul-Henning Kamp	c94cd5fc8c	Explicity initialize vp->v_bsize.	2004-07-07 20:04:06 +00:00
Poul-Henning Kamp	e3c5a7a4dd	When we traverse the vnodes on a mountpoint we need to look out for our cached 'next vnode' being removed from this mountpoint. If we find that it was recycled, we restart our traversal from the start of the list. Code to do that is in all local disk filesystems (and a few other places) and looks roughly like this: MNT_ILOCK(mp); loop: for (vp = TAILQ_FIRST(&mp...); (vp = nvp) != NULL; nvp = TAILQ_NEXT(vp,...)) { if (vp->v_mount != mp) goto loop; MNT_IUNLOCK(mp); ... MNT_ILOCK(mp); } MNT_IUNLOCK(mp); The code which takes vnodes off a mountpoint looks like this: MNT_ILOCK(vp->v_mount); ... TAILQ_REMOVE(&vp->v_mount->mnt_nvnodelist, vp, v_nmntvnodes); ... MNT_IUNLOCK(vp->v_mount); ... vp->v_mount = something; (Take a moment and try to spot the locking error before you read on.) On a SMP system, one CPU could have removed nvp from our mountlist but not yet gotten to assign a new value to vp->v_mount while another CPU simultaneously get to the top of the traversal loop where it finds that (vp->v_mount != mp) is not true despite the fact that the vnode has indeed been removed from our mountpoint. Fix: Introduce the macro MNT_VNODE_FOREACH() to traverse the list of vnodes on a mountpoint while taking into account that vnodes may be removed from the list as we go. This saves approx 65 lines of duplicated code. Split the insmntque() which potentially moves a vnode from one mount point to another into delmntque() and insmntque() which does just what the names say. Fix delmntque() to set vp->v_mount to NULL while holding the mountpoint lock.	2004-07-04 08:52:35 +00:00
Poul-Henning Kamp	89c9c53da0	Do the dreaded s/dev_t/struct cdev */ Bump __FreeBSD_version accordingly.	2004-06-16 09:47:26 +00:00
Bosko Milekic	451079d4ab	Revert previous change to this file because it breaks some things which compare /etc/fstab entries to results from getfsstat(). The real way to fix this is to make 'ufs2' a recognized filesystem (for real, no beating around the bush). This should fix things like 'umount -a -t ufs' now. Appologies for the previous breakage.	2004-04-29 15:10:42 +00:00
Bosko Milekic	2aebb586db	The previous change to mount(8) to report ufs or ufs2 used libufs, which only works for Charlie root. This change reverts the introduction of libufs and moves the check into the kernel. Since the f_fstypename is the same for both ufs and ufs2, we check fs_magic for presence of ufs2 and copy "ufs2" explicitly instead. Submitted by: Christian S.J. Peron <maneo@bsdpro.com>	2004-04-26 15:13:46 +00:00
Warner Losh	012d41340a	Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and irc message from Robert Watson saying that clause 3 can be removed from those files with an NAI copyright that also have only a University of California copyrights. Approved by: core, rwatson	2004-04-07 03:47:21 +00:00
Bruce Evans	e9827c6d93	Fixed some style bugs: - don't unlock the vnode after vinvalbuf() only to have to relock it almost immediately. - don't refer to devices classified by vn_isdisk() as block devices.	2004-02-14 04:41:13 +00:00
Bruce Evans	0efb13948d	MFextfs: backed out secondary changes in rev.1.40 that had become just style bugs (a variable that is used only once, and misformattings).	2004-02-13 03:05:12 +00:00
Bruce Evans	8adff5fc12	Fixed some minor style bugs (English usage and formatting of binary operators) in and near revs.1.169-1.170 (open mode bandaid). This (or better a proper fix) should have been done before cloning the bandaid to many other file systems.	2004-02-12 16:52:24 +00:00
Don Lewis	31c81e4bed	Set fs_ronly to the correct value in ffs_reload() when reloading the file system super block after fsck has repaired the file system. The value of fs_ronly was getting overwritten, which caused ffs_update() to attempt to update inode timestamps even though the file system was still mounted read-only. This fixes the "giving up on N buffers" error that is triggered by running fsck on the root file system and then rebooting without mounting the file system read-write.	2003-12-07 05:16:52 +00:00
Kirk McKusick	fde81c7d8e	Update the statfs structure with 64-bit fields to allow accurate reporting of multi-terabyte filesystem sizes. You should build and boot a new kernel BEFORE doing a `make world' as the new kernel will know about binaries using the old statfs structure, but an old kernel will not know about the new system calls that support the new statfs structure. Running an old kernel after a `make world' will cause programs such as `df' that do a statfs system call to fail with a bad system call. Reviewed by: Bruce Evans <bde@zeta.org.au> Reviewed by: Tim Robbins <tjr@freebsd.org> Reviewed by: Julian Elischer <julian@elischer.org> Reviewed by: the hoards of <arch@freebsd.org> Sponsored by: DARPA & NAI Labs.	2003-11-12 08:01:40 +00:00
Alexander Kabaev	ca430f2e92	Remove mntvnode_mtx and replace it with per-mountpoint mutex. Introduce two new macros MNT_ILOCK(mp)/MNT_IUNLOCK(mp) to operate on this mutex transparently. Eventually new mutex will be protecting more fields in struct mount, not only vnode list. Discussed with: jeff	2003-11-05 04:30:08 +00:00
Alexander Kabaev	45d45c6cde	Use VOP_UNLOCK/vrele instead of vput. td was erecived as a parameter and one cannot be sure it is equal to curthread.	2003-11-03 04:46:19 +00:00
Alexander Kabaev	cb9ddc80ae	Take care not to call vput if thread used in corresponding vget wasn't curthread, i.e. when we receive a thread pointer to use as a function argument. Use VOP_UNLOCK/vrele in these cases. The only case there td != curthread known at the moment is boot() calling sync with thread0 pointer. This fixes the panic on shutdown people have reported.	2003-11-02 04:52:53 +00:00
Alexander Kabaev	492c1e68fb	Temporarily undo parts of the stuct mount locking commit by jeff. It is unsafe to hold a mutex across vput/vrele calls. This will be redone when a better locking strategy is agreed upon. Discussed with: jeff	2003-11-01 05:51:54 +00:00
Jeff Roberson	69b609a85d	- The VCHR case in ffs_sync() is an unneccsary optimization especially considering how infrequently we access devices via ffs now that we have devfs. Collapse this case with the other case. Obtained from: bde	2003-10-05 22:56:33 +00:00
Jeff Roberson	ab1f917b53	- Further simplify ffs_sync(). The vnode lock is required for UFS_UPDATE() so make the code slightly more uniform. The vnode lock is acquired in all cases and now the only difference between VCHR and other is we call UFS_UPDATE instead of VOP_FSYNC().	2003-10-05 09:42:24 +00:00
Jeff Roberson	2f05568aa8	- Check the XLOCK before inspecting v_data. - Slightly rewrite the fsync loop to be more lock friendly. We must acquire the vnode interlock before dropping the mnt lock. We must also check XLOCK to prevent vclean() races. - Use LK_INTERLOCK in the vget() in ffs_sync to further prevent vclean() races. - Use a local variable to store the results of the nvp == TAILQ_NEXT test so that we do not access the vp after we've vrele()d it. - Add an XXX comment about UFS_UPDATE() not being protected by any lock here. I suspect that it should need the VOP lock.	2003-10-05 07:16:45 +00:00
Jeff Roberson	04a17687ea	- Increase the scope of the interlock in ffs_reload(). Acquire it before we release the mntvnode_mtx. - Call vgonel() directly instead of going through vrecycle() since we own the interlock now. - Remove a few cases where we locked the interlock just so that we could call VOP_UNLOCK with interlock held.	2003-10-04 14:27:49 +00:00
Poul-Henning Kamp	5c24d6ee26	Eliminate the i_devvp field from the incore UFS inodes, we can get the same value from ip->i_ump->um_devvp. This saves a pointer in the memory copies of inodes, which can easily run into several hundred kilobytes. The extra indirection is unmeasurable in benchmarks. Approved by: mckusick	2003-08-15 20:03:19 +00:00
Poul-Henning Kamp	a8d43c90af	Add a "int fd" argument to VOP_OPEN() which in the future will contain the filedescriptor number on opens from userland. The index is used rather than a "struct file " since it conveys a bit more information, which may be useful to in particular fdescfs and /dev/fd/ For now pass -1 all over the place.	2003-07-26 07:32:23 +00:00
Poul-Henning Kamp	7652131bee	Initialize struct vfsops C99-sparsely. Submitted by: hmp Reviewed by: phk	2003-06-12 20:48:38 +00:00
David E. O'Brien	f4636c5959	Use __FBSDID().	2003-06-11 06:34:30 +00:00
Poul-Henning Kamp	6280ed26af	Remove unused local variables. Found by: FlexeLint	2003-05-31 18:17:32 +00:00
Tim J. Robbins	3632928957	Do not attempt to free NULL dinodes (i_din1 or i_din2) in ffs_ifree(). These fields can be left as NULL if ffs_vget() allocates an inode but fails before the dinode memory has been allocated. There are two cases when this can occur: when we lose a race and another process has added the inode to the hash, and when reading the inode off disk fails. The bug was observed by Kris on one of the package-building machines. See http://marc.theaimsgroup.com/?l=freebsd-current&m=105172731013411&w=2 In Kris's case, it was the bread() that failed because of a disk error. The alternative to this patch is to ensure that ffs_vget() does not call vput() when the inode that hasn't been properly initialised.	2003-05-01 06:41:59 +00:00
Tim J. Robbins	8d721e877d	Free i_din2 instead of i_din1 in ffs_ifree() on UFS2 filesystems. This is purely a cosmetic change because these members are in a union together.	2003-05-01 06:38:27 +00:00
John Baldwin	31566c96f4	Use td->td_ucred instead of td->td_proc->p_ucred.	2003-03-20 21:17:40 +00:00
Poul-Henning Kamp	b4b138c27f	Including <sys/stdint.h> is (almost?) universally only to be able to use %j in printfs, so put a newsted include in <sys/systm.h> where the printf prototype lives and save everybody else the trouble.	2003-03-18 08:45:25 +00:00
Jeff Roberson	7261f5f68e	- Add a new 'flags' parameter to getblk(). - Define one flag GB_LOCK_NOWAIT that tells getblk() to pass the LK_NOWAIT flag to the initial BUF_LOCK(). This will eventually be used in cases were we want to use a buffer only if it is not currently in use. - Convert all consumers of the getblk() api to use this extra parameter. Reviwed by: arch Not objected to by: mckusick	2003-03-04 00:04:44 +00:00
Kirk McKusick	74f3809a19	Change the field used to test whether the superblock has been updated from the filesystem size field to the filesystem maximum blocksize field. The problem is that older versions of growfs updated only the new size field and not the old size field. This resulted in the old (smaller) size field being copied up to the new size field which caused the filesystem to appear to fsck to be badly trashed. This also adds a sanity check to ensure that the superblock is not being updated when the filesystem is mounted read-only. Obviously such an update should never happen. Reported by: Nate Lawson <nate@root.org> Sponsored by: DARPA & NAI Labs.	2003-02-25 23:21:08 +00:00
Warner Losh	a163d034fa	Back out M_* changes, per decision of the TRB. Approved by: trb	2003-02-19 05:47:46 +00:00
Kirk McKusick	aca3e4974f	Replace use of random() with arc4random() to provide less guessable values for the initial inode generation numbers in newfs and for newly allocated inode generation numbers in the kernel. Submitted by: Theo de Raadt <deraadt@cvs.openbsd.org> Sponsored by: DARPA & NAI Labs.	2003-02-14 21:31:58 +00:00
Alfred Perlstein	44956c9863	Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.	2003-01-21 08:56:16 +00:00
Poul-Henning Kamp	aa4d7a8a4b	Use three UMA zones for FFS/UFS inodes instead of malloc space. Since inodes are currently 144 bytes, this will save 112 bytes per inode. This can amount to up to 10MByte on large systems.	2002-12-27 11:05:05 +00:00
Poul-Henning Kamp	de6ba7c016	Move the allocation of the inode contents into ffs_vfsops.c rather than passing malloc types around.	2002-12-27 10:23:03 +00:00
Poul-Henning Kamp	975512a907	Make ffs_mountfs() static. Remove the malloctype from the ufs mount structure, instead add a callback to the storage method for freeing inodes: UFS_IFREE(). Add vfs_ifree() method function which frees an inode. Unvariablelize the malloc type used for allocating inodes.	2002-12-27 10:06:37 +00:00
Kirk McKusick	31574422a3	Add a check to disable the previous patch so that future filesystems that choose to place their superblocks in non-standard locations will not get them smashed. Sponsored by: DARPA & NAI Labs.	2002-11-30 19:04:57 +00:00
Kirk McKusick	fa5d33e242	Check to make sure that the fs_sblockloc field was properly updated before using it to write the superblock. This is to guard against accidentally trashing the disklabel if the superblock format missed being upgraded by the new kernel. Reported by: Sam Leffler <sam@errno.com> Sponsored by: DARPA & NAI Labs. Approved by: Murray Stokely <murray@FreeBSD.org>	2002-11-29 19:20:15 +00:00
Kirk McKusick	ada981b228	Create a new 32-bit fs_flags word in the superblock. Add code to move the old 8-bit fs_old_flags to the new location the first time that the filesystem is mounted by a new kernel. One of the unused flags in fs_old_flags is used to indicate that the flags have been moved. Leave the fs_old_flags word intact so that it will work properly if used on an old kernel. Change the fs_sblockloc superblock location field to be in units of bytes instead of in units of filesystem fragments. The old units did not work properly when the fragment size exceeeded the superblock size (8192). Update old fs_sblockloc values at the same time that the flags are moved. Suggested by: BOUWSMA Barry <freebsd-misuser@netscum.dyndns.dk> Sponsored by: DARPA & NAI Labs.	2002-11-27 02:18:58 +00:00
Robert Watson	763bbd2f4f	Slightly change the semantics of vnode labels for MAC: rather than "refreshing" the label on the vnode before use, just get the label right from inception. For single-label file systems, set the label in the generic VFS getnewvnode() code; for multi-label file systems, leave the labeling up to the file system. With UFS1/2, this means reading the extended attribute during vfs_vget() as the inode is pulled off disk, rather than hitting the extended attributes frequently during operations later, improving performance. This also corrects sematics for shared vnode locks, which were not previously present in the system. This chances the cache coherrency properties WRT out-of-band access to label data, but in an acceptable form. With UFS1, there is a small race condition during automatic extended attribute start -- this is not present with UFS2, and occurs because EAs aren't available at vnode inception. We'll introduce a work around for this shortly. Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-26 14:38:24 +00:00
Kirk McKusick	9ab73fd11a	Within ufs, the ffs_sync and ffs_fsync functions did not always check for and/or report I/O errors. The result is that a VFS_SYNC or VOP_FSYNC called with MNT_WAIT could loop infinitely on ufs in the presence of a hard error writing a disk sector or in a filesystem full condition. This patch ensures that I/O errors will always be checked and returned. This patch also ensures that every call to VFS_SYNC or VOP_FSYNC with MNT_WAIT set checks for and takes appropriate action when an error is returned. Sponsored by: DARPA & NAI Labs.	2002-10-25 00:20:37 +00:00
Robert Watson	80830407c6	If the FS_MULTILABEL flag is set in a UFS or UFS2 superblock, automatically set MNT_MULTILABEL in the mount flags. If FS_ACLS is set in a UFS or UFS2 superblock, automatically set MNT_ACLS in the mount flags. If either of these flags is set, but the appropriate kernel option to support the features associated with the flag isn't available, then print a warning at mount-time. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-15 20:00:06 +00:00
Kirk McKusick	a5b65058d5	Regularize the vop_stdlock'ing protocol across all the filesystems that use it. Specifically, vop_stdlock uses the lock pointed to by vp->v_vnlock. By default, getnewvnode sets up vp->v_vnlock to reference vp->v_lock. Filesystems that wish to use the default do not need to allocate a lock at the front of their node structure (as some still did) or do a lockinit. They can simply start using vn_lock/VOP_UNLOCK. Filesystems that wish to manage their own locks, but still use the vop_stdlock functions (such as nullfs) can simply replace vp->v_vnlock with a pointer to the lock that they wish to have used for the vnode. Such filesystems are responsible for setting the vp->v_vnlock back to the default in their vop_reclaim routine (e.g., vp->v_vnlock = &vp->v_lock). In theory, this set of changes cleans up the existing filesystem lock interface and should have no function change to the existing locking scheme. Sponsored by: DARPA & NAI Labs.	2002-10-14 03:20:36 +00:00
Kirk McKusick	b6cef5648d	The appropriate units for disk block addresses are always DEV_BSIZE, even when the underlying device has a larger sector size. Therefore, the filesystem code should not (and with this patch does not) try to use the underlying sector size when doing disk block address calculations. This patch fixes problems in -current when using the swap-based memory-disk device (mdconfig -a -t swap ...). This bugfix is not relevant to -stable as -stable does not have the memory-disk device. Sponsored by: DARPA & NAI Labs.	2002-10-09 04:01:23 +00:00
Jeff Roberson	2ee5711e84	- Convert locks to use standard macros. - Lock access to the buflists. - Document broken locking. - Use vrefcnt().	2002-09-25 02:49:48 +00:00
Nate Lawson	06be2aaa83	Remove all use of vnode->v_tag, replacing with appropriate substitutes. v_tag is now const char * and should only be used for debugging. Additionally: 1. All users of VT_NTS now check vfsconf->vf_type VFCF_NETWORK 2. The user of VT_PROCFS now checks for the new flag VV_PROCDEP, which is propagated by pseudofs to all child vnodes if the fs sets PFS_PROCDEP. Suggested by: phk Reviewed by: bde, rwatson (earlier version)	2002-09-14 09:02:28 +00:00
Poul-Henning Kamp	d6fe88e475	Unravel the UFS_EXTATTR incest between FFS and UFS: UFS_EXTATTR is an UFS only thing, and FFS should in principle not know if it is enabled or not. This commit cleans ffs_vnops.c for such knowledge, but not ffs_vfsops.c Sponsored by: DARPA and NAI Labs.	2002-08-13 10:33:57 +00:00
Poul-Henning Kamp	9bf1a75697	Introduce typedefs for the member functions of struct vfsops and employ these in the main filesystems. This does not change the resulting code but makes the source a little bit more grepable. Sponsored by: DARPA and NAI Labs.	2002-08-13 10:05:50 +00:00
Jeff Roberson	e6e370a7fe	- Replace v_flag with v_iflag and v_vflag - v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed. - v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc. - All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's. - Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear. - Many functions in vfs_subr.c were restructured to provide for stronger locking. Idea stolen from: BSD/OS	2002-08-04 10:29:36 +00:00
Ian Dowse	5346934fe7	Add the ffs bits necessary to support unloading of the ufs kernel module. This adds an ffs_uninit() function that calls ufs_uninit() and also calls a new softdep_uninitialize() function. Add a stub for softdep_uninitialize() to cover the non-SOFTUPDATES case. Reviewed by: mckusick	2002-07-01 11:00:47 +00:00
Ian Dowse	8f42fb8fc9	Remove the kernel file-size limit for UFS2, so that only the limit imposed by the filesystem structure itself remains. With 16k blocks, the maximum file size is now just over 128TB. For now, the UFS1 file size limit is left unchanged so as to remain consistent with RELENG_4, but it too could be removed in the future. Reviewed by: mckusick	2002-06-26 18:34:51 +00:00
Maxime Henrion	cfbf0a4678	Warning fixes for 64 bits platforms. This eliminates all the warnings I have had in the FFS code on sparc64. Reviewed by: mckusick	2002-06-23 18:17:27 +00:00
Kirk McKusick	1c85e6a35d	This commit adds basic support for the UFS2 filesystem. The UFS2 filesystem expands the inode to 256 bytes to make space for 64-bit block pointers. It also adds a file-creation time field, an ability to use jumbo blocks per inode to allow extent like pointer density, and space for extended attributes (up to twice the filesystem block size worth of attributes, e.g., on a 16K filesystem, there is space for 32K of attributes). UFS2 fully supports and runs existing UFS1 filesystems. New filesystems built using newfs can be built in either UFS1 or UFS2 format using the -O option. In this commit UFS1 is the default format, so if you want to build UFS2 format filesystems, you must specify -O 2. This default will be changed to UFS2 when UFS2 proves itself to be stable. In this commit the boot code for reading UFS2 filesystems is not compiled (see /sys/boot/common/ufsread.c) as there is insufficient space in the boot block. Once the size of the boot block is increased, this code can be defined. Things to note: the definition of SBSIZE has changed to SBLOCKSIZE. The header file <ufs/ufs/dinode.h> must be included before <ufs/ffs/fs.h> so as to get the definitions of ufs2_daddr_t and ufs_lbn_t. Still TODO: Verify that the first level bootstraps work for all the architectures. Convert the utility ffsinfo to understand UFS2 and test growfs. Add support for the extended attribute storage. Update soft updates to ensure integrity of extended attribute storage. Switch the current extended attribute interfaces to use the extended attribute storage. Add the extent like functionality (framework is there, but is currently never used). Sponsored by: DARPA & NAI Labs. Reviewed by: Poul-Henning Kamp <phk@freebsd.org>	2002-06-21 06:18:05 +00:00
Semen Ustimenko	13866b3fd2	Fix a typo in my recently added comment: s/beleived/believed/ Submitted by: keramida	2002-06-06 20:43:03 +00:00
Semen Ustimenko	f576a00d1b	Remove lock from ffs_vget introduced by v1.24. Instead of locking the vnode creation globaly, we allow processes to create vnodes concurently. In case of concurent creation of vnode for the one ino, we allow processes to race and then check who wins. Assuming that concurent creation of vnode for same ino is really rare case, this is belived to be an improvement, as it just allows concurent creation of vnodes. Idea by: bp Reviewed by: dillon MFC after: 1 month	2002-05-30 22:04:17 +00:00
Ian Dowse	00b162d018	Remove um_i_effnlink_valid, i_spare[] and the ufsmount_u and inode_u unions, since these were only necessary when ext2fs used ufs code. Reviewed by: mckusick	2002-05-18 18:51:14 +00:00
Tom Rhodes	d394511de3	More s/file system/filesystem/g	2002-05-16 21:28:32 +00:00
Poul-Henning Kamp	05f4ff5da1	Remove register keyword. Sponsored by: DARPA & NAI Labs. Submitted by: mckusick	2002-05-13 09:22:31 +00:00
Poul-Henning Kamp	2dd527b3ac	Move generic disk ioctls from <sys/disklabel.h> to <sys/disk.h>. Sponsored by: DARPA & NAI Labs	2002-04-08 09:20:07 +00:00
John Baldwin	6008862bc2	Change callers of mtx_init() to pass in an appropriate lock type name. In most cases NULL is passed, but in some cases such as network driver locks (which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used. Tested on: i386, alpha, sparc64	2002-04-04 21:03:38 +00:00
Poul-Henning Kamp	46a67eaced	Use DIOCGSECTORSIZE instead of the bogus DIOCGPART ioctl.	2002-04-02 11:23:14 +00:00
John Baldwin	44731cab3b	Change the suser() API to take advantage of td_ucred as well as do a general cleanup of the API. The entire API now consists of two functions similar to the pre-KSE API. The suser() function takes a thread pointer as its only argument. The td_ucred member of this thread must be valid so the only valid thread pointers are curthread and a few kernel threads such as thread0. The suser_cred() function takes a pointer to a struct ucred as its first argument and an integer flag as its second argument. The flag is currently only used for the PRISON_ROOT flag. Discussed on: smp@	2002-04-01 21:31:13 +00:00
Bruce Evans	0508986cce	In ffs_mountffs(), set mnt_iosize_max to si_iosize_max unconditionally provided the latter is nonzero. At this point, the former is a fairly arbitrary default value (DFTPHYS), so changing it to any reasonable value specified by the device driver is safe. Using the maximum of these limits broke ffs clustered i/o for devices whose si_iosize_max is < DFLTPHYS. Using the minimum would break device drivers' ability to increase the active limit from DFTLPHYS up to MAXPHYS. Copied the code for this and the associated (unnecessary?) fixup of mp_iosize_max to all other filesystems that use clustering (ext2fs and msdosfs). It was completely missing. PR: 36309 MFC-after: 1 week	2002-03-30 15:12:57 +00:00
Alfred Perlstein	6f1e855112	Remove __P.	2002-03-19 22:40:48 +00:00
Kirk McKusick	a0595d0249	Add a flags parameter to VFS_VGET to pass through the desired locking flags when acquiring a vnode. The immediate purpose is to allow polling lock requests (LK_NOWAIT) needed by soft updates to avoid deadlock when enlisting other processes to help with the background cleanup. For the future it will allow the use of shared locks for read access to vnodes. This change touches a lot of files as it affects most filesystems within the system. It has been well tested on FFS, loopback, and CD-ROM filesystems. only lightly on the others, so if you find a problem there, please let me (mckusick@mckusick.com) know.	2002-03-17 01:25:47 +00:00
Poul-Henning Kamp	063f776327	I missed one VOP_CLOSE in the previous commit. Pointed out by: bde	2002-03-11 16:27:04 +00:00
Poul-Henning Kamp	3dbceccb78	As a XXX bandaid open the mounted device READ/WRITE even if we only mount read-only. The trouble here is that we don't reopen the device in read/write mode when we remount in read/write mode resulting in a filesystem sending write requests to a device which was only opened read/only. I'm not quite sure how such a reopen would best be done and defer the problem to more agile hackers.	2002-03-11 13:53:00 +00:00
John Baldwin	a854ed9893	Simple p_ucred -> td_ucred changes to start using the per-thread ucred reference.	2002-02-27 18:32:23 +00:00
Kirk McKusick	cd6005961f	When downgrading a filesystem from read-write to read-only, operations involving file removal or file update were not always being fully committed to disk. The result was lost files or corrupted file data. This change ensures that the filesystem is properly synced to disk before the filesystem is down-graded. This delta also fixes a long standing bug in which a file open for reading has been unlinked. When the last open reference to the file is closed, the inode is reclaimed by the filesystem. Previously, if the filesystem had been down-graded to read-only, the inode could not be reclaimed, and thus was lost and had to be later recovered by fsck. With this change, such files are found at the time of the down-grade. Normally they will result in the filesystem down-grade failing with `device busy'. If a forcible down-grade is done, then the affected files will be revoked causing the inode to be released and the open file descriptors to begin failing on attempts to read. Submitted by: "Sam Leffler" <sam@errno.com>	2002-01-15 07:17:12 +00:00
Matthew Dillon	23b590188f	Fix a BUF_TIMELOCK race against BUF_LOCK and fix a deadlock in vget() against VM_WAIT in the pageout code. Both fixes involve adjusting the lockmgr's timeout capability so locks obtained with timeouts do not interfere with locks obtained without a timeout. Hopefully MFC: before the 4.5 release	2001-12-20 22:42:27 +00:00
Ian Dowse	143a5346c9	Make sure we ignore the value of `fs_active' when reloading the superblock, and move the initialisation of it to beside where other pointer fields are initialised.	2001-12-16 18:54:09 +00:00
Kirk McKusick	cc5a92334f	Minimize the time necessary to suspend operations on a filesystem when taking a snapshot. The two time consuming operations are scanning all the filesystem bitmaps to determine which blocks are in use and scanning all the other snapshots so as to be able to expunge their blocks from the view of the current snapshot. The bitmap scanning is broken into two passes. Before suspending the filesystem all bitmaps are scanned. After the suspension, those bitmaps that changed after being scanned the first time are rescanned. Typically there are few bitmaps that need to be rescanned. The expunging of other snapshots is now done after the suspension is released by observing that we can easily identify any blocks that were allocated to them after the suspension (they will be maked as `not needing to be copied' in the just created snapshot). For all the gory details, see the ``Running fsck in the Background'' paper in the Usenix BSDCon 2002 Conference Proceedings, pages 55-64.	2001-12-14 00:15:06 +00:00
Matthew Dillon	245df27cee	Implement kern.maxvnodes. adjusting kern.maxvnodes now actually has a real effect. Optimize vfs_msync(). Avoid having to continually drop and re-obtain mutexes when scanning the vnode list. Improves looping case by 500%. Optimize ffs_sync(). Avoid having to continually drop and re-obtain mutexes when scanning the vnode list. This makes a couple of assumptions, which I believe are ok, in regards to vnode stability when the mount list mutex is held. Improves looping case by 500%. (more optimization work is needed on top of these fixes) MFC after: 1 week	2001-10-26 00:08:05 +00:00
Matthew Dillon	c72ccd014d	Change the vnode list under the mount point from a LIST to a TAILQ in preparation for an implementation of limiting code for kern.maxvnodes. MFC after: 3 days	2001-10-23 01:21:29 +00:00
Robert Watson	b73d2870cd	o Replace two direct uid!=0 comparisons with suser_td() calls. Obtained from: TrustedBSD Project	2001-10-02 14:34:22 +00:00
Julian Elischer	b40ce4165d	KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
Ian Dowse	4691e9ead0	The "dirpref" directory layout preference improvements make use of an array "fs_contigdirs[]" to avoid too many directories getting created in each cylinder group. The memory required for this and two other arrays (fs_csp[] and fs_maxcluster[]) is allocated with a single malloc() call, and divided up afterwards. However, the 'space' pointer is not advanced correctly, so fs_contigdirs and fs_maxcluster end up pointing to the same address. Add the missing code to advance the 'space' pointer, and remove an unnecessary update of the pointer that follows. This is likely to fix the "ffs_clusteralloc: map mismatch" panics that have been reported recently. Submitted by: Luke Mewburn <lukem@wasabisystems.com>	2001-09-09 23:48:28 +00:00
Robert Watson	7df97b6117	o At some point, unmounting a non-EA file system with EA's compiled in got a bit broken, when ufs_extattr_stop() was called and failed, ufs_extattr_destroy() would panic. This makes the call to destroy() conditional on the success of stop(). Submitted by: Christian Carstensen <cc@devcon.net> Obtained from: TrustedBSD Project	2001-09-01 20:11:05 +00:00
John Baldwin	ed87274d16	Fix more mntvnode and vnode interlock order reversals.	2001-06-28 22:21:33 +00:00
John Baldwin	49d2d9f4a4	- Fix a mntvnode and vnode interlock reversal. - Protect the mnt_vnode list with the mntvnode lock. - Use queue(9) macros.	2001-06-28 04:12:56 +00:00
Poul-Henning Kamp	c7a3e2379c	Remove last vestiges of MFS.	2001-05-29 21:21:53 +00:00
Ian Dowse	0864ef1e8a	Change the second argument of vflush() to an integer that specifies the number of references on the filesystem root vnode to be both expected and released. Many filesystems hold an extra reference on the filesystem root vnode, which must be accounted for when determining if the filesystem is busy and then released if it isn't busy. The old `skipvp' approach required individual filesystem xxx_unmount functions to re-implement much of vflush()'s logic to deal with the root vnode. All 9 filesystems that hold an extra reference on the root vnode got the logic wrong in the case of forced unmounts, so `umount -f' would always fail if there were any extra root vnode references. Fix this issue centrally in vflush(), now that we can. This commit also fixes a vnode reference leak in devfs, which could result in idle devfs filesystems that refuse to unmount. Reviewed by: phk, bp	2001-05-16 18:04:37 +00:00

1 2 3 4 5 ...

403 Commits