freebsd-nq

Author	SHA1	Message	Date
Kirk McKusick	9b97113391	This patch corrects the first round of panics and hangs reported with the new snapshot code. Update addaliasu to correctly implement the semantics of the old checkalias function. When a device vnode first comes into existence, check to see if an anonymous vnode for the same device was created at boot time by bdevvp(). If so, adopt the bdevvp vnode rather than creating a new vnode for the device. This corrects a problem which caused the kernel to panic when taking a snapshot of the root filesystem. Change the calling convention of vn_write_suspend_wait() to be the same as vn_start_write(). Split out softdep_flushworklist() from softdep_flushfiles() so that it can be used to clear the work queue when suspending filesystem operations. Access to buffers becomes recursive so that snapshots can recursively traverse their indirect blocks using ffs_copyonwrite() when checking for the need for copy on write when flushing one of their own indirect blocks. This eliminates a deadlock between the syncer daemon and a process taking a snapshot. Ensure that softdep_process_worklist() can never block because of a snapshot being taken. This eliminates a problem with buffer starvation. Cleanup change in ffs_sync() which did not synchronously wait when MNT_WAIT was specified. The result was an unclean filesystem panic when doing forcible unmount with heavy filesystem I/O in progress. Return a zero'ed block when reading a block that was not in use at the time that a snapshot was taken. Normally, these blocks should never be read. However, the readahead code will occasionally read them which can cause unexpected behavior. Clean up the debugging code that ensures that no blocks be written on a filesystem while it is suspended. Snapshots must explicitly label the blocks that they are writing during the suspension so that they do not cause a `write on suspended filesystem' panic.
Reorganize ffs_copyonwrite() to eliminate a deadlock and also to prevent a race condition that would permit the same block to be copied twice. This change eliminates an unexpected soft updates inconsistency in fsck caused by the double allocation. Use bqrelse rather than brelse for buffers that will be needed soon again by the snapshot code. This improves snapshot performance.	2000-07-24 05:28:33 +00:00
Boris Popov	3fbd97427e	Prevent possible dereference of NULL pointer. Submitted by: Marius Bendiksen <mbendiks@eunet.no>	2000-07-13 02:17:14 +00:00
Kirk McKusick	d303f71fdc	Brain fault, forgot to update ffs_snapshot.c with the new calling convention for vn_start_write.	2000-07-12 00:27:27 +00:00
Kirk McKusick	f2a2857bb3	Add snapshots to the fast filesystem. Most of the changes support the gating of system calls that cause modifications to the underlying filesystem. The gating can be enabled by any filesystem that needs to consistently suspend operations by adding the vop_stdgetwritemount to their set of vnops. Once gating is enabled, the function vfs_write_suspend stops all new write operations to a filesystem, allows any filesystem modifying system calls already in progress to complete, then sync's the filesystem to disk and returns. The function vfs_write_resume allows the suspended write operations to begin again. Gating is not added by default for all filesystems as for SMP systems it adds two extra locks to such critical kernel paths as the write system call. Thus, gating should only be added as needed. Details on the use and current status of snapshots in FFS can be found in /sys/ufs/ffs/README.snapshot so, for brevity and timeliness, they are not included here. Unless and until you create a snapshot file, these changes should have no effect on your system (famous last words).	2000-07-11 22:07:57 +00:00
Kirk McKusick	d4c1816924	Clean up warning about undeclared function by declaring softdep_fsync in mount.h instead of ffs_extern.h. The correct solution is to use an indirect function pointer so that the kernel does not have to be built with options FFS, but that will be left for another day.	2000-07-11 19:28:26 +00:00
Kirk McKusick	cc3962a9cd	Delete README as it is now obsolete. Relevant information is in README.softupdates.	2000-07-08 02:32:49 +00:00
Kirk McKusick	876578906d	Update to reflect current status.	2000-07-08 02:31:21 +00:00
Kirk McKusick	22e5a6234e	Get userland visible flags added for snapshots to give a few days advance preparation for them to get migrated into place so that subsequent changes in utilities will not fail to compile for lack of up-to-date header files in /usr/include.	2000-07-04 04:58:34 +00:00
Poul-Henning Kamp	3275cf7379	Make the two calls from kern/* into softupdates #ifdef SOFTUPDATES, that is way cleaner than using the softupdates_stub stunt, which should be killed when convenient. Discussed with: mckusick	2000-07-03 13:26:54 +00:00
Andrey A. Chernov	2d90744fd8	Remove obsoleted info about linking from contrib	2000-06-24 13:29:25 +00:00
Kirk McKusick	858c16fab8	Update to new copyright.	2000-06-22 00:29:53 +00:00
Kirk McKusick	6019e6208f	When running with quotas enabled on a filesystem using soft updates, the system would panic when a user's inode quota was exceeded (see PR 18959 for details). This fixes that problem. PR: 18959 Submitted by: Jason Godsey <jason@unixguy.fidalgo.net>	2000-06-18 22:14:28 +00:00
Kirk McKusick	d3abb52714	Some additional performance improvements. When freeing an inode check to see if it has been committed to disk. If it has never been written, it can be freed immediately. For short lived files this change allows the same inode to be reused repeatedly. Similarly, when upgrading a fragment to a larger size, if it has never been claimed by an inode on disk, it too can be freed immediately making it available for reuse often in the next slowly growing block of the same file.	2000-06-18 22:05:57 +00:00
Poul-Henning Kamp	7c50d77218	Revert part of my bioops change which implemented panic(8).	2000-06-16 14:32:13 +00:00
Poul-Henning Kamp	7523681895	ARGH! I have too many source trees :-( Fix prototype errors in last commit.	2000-06-16 13:00:33 +00:00
Poul-Henning Kamp	a2e7a027a7	Virtualizes & untangles the bioops operations vector. Ref: Message-ID: <18317.961014572@critter.freebsd.dk> To: current@	2000-06-16 08:48:51 +00:00
Poul-Henning Kamp	6ea6805f8c	Remove a comment which should never have made it in.	2000-06-14 21:48:19 +00:00
Robert Watson	b2b0497ab5	o If FFS_EXTATTR is defined, don't print out an error message on unmount if an FFS partition returns EOPNOTSUPP, as it just means extended attributes weren't enabled on that partition. Prevents spurious warning per-partition at shutdown.	2000-06-04 04:50:36 +00:00
Jake Burkholder	e39756439c	Back out the previous change to the queue(3) interface. It was not discussed and should probably not happen. Requested by: msmith and others	2000-05-26 02:09:24 +00:00
Jake Burkholder	740a1973a6	Change the way that the queue(3) structures are declared; don't assume that the type argument to _HEAD and _ENTRY is a struct. Suggested by: phk Reviewed by: phk Approved by: mdodd	2000-05-23 20:41:01 +00:00
Robert Watson	f3706a0361	s/ffs_unmonut/ffs_unmount/ in a gratuitous ufs_extattr printf. Reported by: knu	2000-05-07 17:21:08 +00:00
Poul-Henning Kamp	9626b608de	Separate the struct bio related stuff out of <sys/buf.h> into <sys/bio.h>. <sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bde's teachings on the subject of nested includes. Disk drivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.h> unless they need caching of data. Still a few bogus uses of struct buf to track down. Repocopy by: peter	2000-05-05 09:59:14 +00:00
Poul-Henning Kamp	2c9b67a8df	Remove unneeded #include <vm/vm_zone.h> Generated by: src/tools/tools/kerninclude	2000-04-30 18:52:11 +00:00
Poul-Henning Kamp	87150cb06d	s/biowait/bufwait/g Prodded by: several.	2000-04-29 16:25:22 +00:00
Poul-Henning Kamp	eb95c536ad	Remove unneeded #include <sys/kernel.h>	2000-04-29 15:36:14 +00:00
Poul-Henning Kamp	3389ae9350	Remove ~25 unneeded #include <sys/conf.h> Remove ~60 unneeded #include <sys/malloc.h>	2000-04-19 14:58:28 +00:00
Robert Watson	a64ed08955	Introduce extended attribute support for FFS, allowing arbitrary (name, value) pairs to be associated with inodes. This support is used for ACLs, MAC labels, and Capabilities in the TrustedBSD security extensions, which are currently under development. In this implementation, attributes are backed to data vnodes in the style of the quota support in FFS. Support for FFS extended attributes may be enabled using the FFS_EXTATTR kernel option (disabled by default). Userland utilities and man pages will be committed in the next batch. VFS interfaces and man pages have been in the repo since 4.0-RELEASE and are unchanged. o ufs/ufs/extattr.h: UFS-specific extattr defines o ufs/ufs/ufs_extattr.c: bulk of support routines o ufs/{ufs,ffs,mfs}/*.[ch]: hooks and extattr.h includes o contrib/softupdates/ffs_softdep.c: extattr.h includes o conf/options, conf/files, i386/conf/LINT: added FFS_EXTATTR o coda/coda_vfsops.c: XXX required extattr.h due to ufsmount.h (This should not be the case, and will be fixed in a future commit) Currently attributes are not supported in MFS. This will be fixed. Reviewed by: adrian, bp, freebsd-fs, other unthanked souls Obtained from: TrustedBSD Project	2000-04-15 03:34:27 +00:00
Poul-Henning Kamp	c244d2de43	Move B_ERROR flag to b_ioflags and call it BIO_ERROR. (Much of this done by script) Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED. Move b_pblkno and b_iodone_chain to struct bio while we transition, they will be obsoleted once bio structs chain/stack. Add bio_queue field for struct bio aware disksort. Address a lot of stylistic issues brought up by bde.	2000-04-02 15:24:56 +00:00
Poul-Henning Kamp	b99c307a21	Rename the existing BUF_STRATEGY() to DEV_STRATEGY() substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo) substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo) This patch is machine generated except for the ccd.c and buf.h parts.	2000-03-20 11:29:10 +00:00
Poul-Henning Kamp	21144e3bf1	Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new field in struct buf: b_iocmd. The b_iocmd is enforced to have exactly one bit set. B_WRITE was bogusly defined as zero giving rise to obvious coding mistakes. Also eliminate the redundant struct buf flag B_CALL, it can just as efficiently be done by comparing b_iodone to NULL. Should you get a panic or drop into the debugger, complaining about "b_iocmd", don't continue. It is likely to write on your disk where it should have been reading. This change is a step in the direction towards a stackable BIO capability. A lot of this patch was machine generated (Thanks to style(9) compliance!) Vinum users: Greg has not had time to test this yet, be careful.	2000-03-20 10:44:49 +00:00
Kirk McKusick	584508a741	Use 64-bit math to calculate if we have hit our freespace limit. Necessary for coherent results on filesystems bigger than 0.5Tb.	2000-03-17 03:44:47 +00:00
Kirk McKusick	9f043878d0	Use 64-bit math to decide if optimization needs to be changed. Necessary for coherent results on filesystems bigger than 0.5Tb. Submitted by: Paul Saab <ps@yahoo-inc.com>	2000-03-15 07:08:36 +00:00
Matthew Dillon	f8fa53397f	Fix a 'freeing free block' panic in UFS. The problem occurs when the filesystem fills up. If the first indirect block exists and FFS is able to allocate deeper indirect blocks, but is not able to allocate the data block, FFS improperly unwinds the indirect blocks and leaves a block pointer hanging to a freed block. This will cause a panic later when the file is removed. The solution is to properly account for the first block-pointer-to-an-indirect-block we had to create in a balloc operation and then unwind it if a failure occurs. Detective work by: Ian Dowse <iedowse@maths.tcd.ie> Reviewed by: mckusick, Ian Dowse <iedowse@maths.tcd.ie> Approved by: jkh	2000-02-24 20:43:20 +00:00
Kirk McKusick	4434ff1d38	When writing out bitmap buffers, need to skip over ones that already have a write in progress. Otherwise one can get in an infinite loop trying to get them all flushed. Submitted by: Matthew Dillon <dillon@apollo.backplane.com>	2000-01-30 20:32:59 +00:00
Kirk McKusick	57a91f6fb0	During fastpath processing for removal of a short-lived inode, the set of restrictions for cancelling an inode dependency (inodedep) is somewhat stronger than originally coded. Since this check appears in two places, we codify it into the function check_inode_unwritten which we then call from the two sites, one freeing blocks and the other freeing directory entries. Submitted by: Steinar Haug via Matthew Dillon	2000-01-18 01:33:05 +00:00
Kirk McKusick	4c6adb0622	Need to reorganize the flushing of directory entry (pagedep) dependencies so that they never try to lock an inode corresponding to ".." as this can lead to deadlock. We observe that any inode with an updated link count is always pushed into its buffer at the time of the link count change, so we do not need to do a VOP_UPDATE, but merely find its buffer and write it. The only time we need to get the inode itself is from the result of a mkdir whose name will never be ".." and hence locking such an inode will never request a lock above us in the filesystem tree. Thanks to Brian Fundakowski Feldman for providing the test program that tickled soft updates into hanging in "inode" sleep. Submitted by: Brian Fundakowski Feldman <green@FreeBSD.org>	2000-01-18 01:30:03 +00:00
Kirk McKusick	105ef72c55	Better bounding on softdep_flushfiles; other minor tweaks to checks.	2000-01-17 06:35:11 +00:00
Kirk McKusick	107d5039ef	Must track multiple uncommitted renames until one ultimately gets committed to disk or is removed.	2000-01-17 06:28:18 +00:00
Matthew Dillon	173cce7c8e	Non-operational change, fix compiler warning. Reviewed by: mckusick	2000-01-14 04:39:28 +00:00
Kirk McKusick	d7127837a2	Confirming Peter's fix (locking 101: release the lock before you go to sleep). Locking 101, part 2: do not look at buffer contents after you have been asleep. There is no telling what wondrous changes may have occurred.	2000-01-13 20:03:22 +00:00
Peter Wemm	7f473504e6	Free the global softupdates lock prior to tsleep() in getdirtybuf(). This seems to be responsible for a bunch of panics where the process sleeps and something else finds softupdates "locked" when it shouldn't be. This commit is unreviewed, but has been a big help here. Previously my boxes would panic pretty much on the first fsync() that wrote something to disk.	2000-01-13 18:48:12 +00:00
Kirk McKusick	1c2ceb2880	Because cylinder group blocks are now written in background, it is no longer sufficient to get a lock on a buffer to know that its write has been completed. We have to first get the lock on the buffer, then check to see if it is doing a background write. If it is doing background write, we have to wait for the background write to finish, then check to see if that fulfilled our dependency, and if not to start another write. Luckily the explanation is longer than the fix.	2000-01-13 07:20:01 +00:00
Kirk McKusick	94313add1f	A panic occurs during an fsync when a dirty block associated with a vnode has not been written (which would clear certain of its dependencies). The problem arises because fsync with MNT_NOWAIT no longer pushes all the dirty blocks associated with a vnode. It skips those that require rollbacks, since they will just get instantly dirty again. Such skipped blocks are marked so that they will not be skipped a second time (otherwise circular dependencies would never clear). So, we fsync twice to ensure that everything will be written at least once.	2000-01-13 07:17:39 +00:00
Kirk McKusick	4ed62fbd7f	The only known cause of this panic is running out of disk space. The problem occurs when an indirect block and a data block are being allocated at the same time. For example when the 13th block of the file is written, the filesystem needs to allocate the first indirect block and a data block. If the indirect block allocation succeeds, but the data block allocation fails, the error code deallocates the indirect block as it has nothing at which to point. Unfortunately, it does not deallocate the indirect block's associated dependencies which then fail when they find the block unexpectedly gone (ptr == 0 instead of its expected value). The fix is to fsync the file before doing the block rollback, as the fsync will flush out all of the dependencies. Once the rollback is done the file must be fsync'ed again so that the soft updates code does not find unexpected changes. This approach is much slower than writing the code to back out the extraneous dependencies, but running out of disk space is not expected to be a common occurrence, so just getting it right is the main criterion. PR: kern/15063 Submitted by: Assar Westerlund <assar@stacken.kth.se>	2000-01-11 08:27:00 +00:00
Kirk McKusick	10767f840b	We cannot proceed to free the blocks of the file until the dependencies have been cleaned up by deallocate_dependencies(). Once that is done, it is safe to post the request to free the blocks. A similar change is also needed for the freefile case.	2000-01-11 06:52:35 +00:00
Poul-Henning Kamp	ba4ad1fcea	Give vn_isdisk() a second argument where it can return a suitable errno. Suggested by: bde	2000-01-10 12:04:27 +00:00
Kirk McKusick	26e5527c86	Missing FREE_LOCK call before handle_workitem_freeblocks. Submitted by: "Kenneth D. Merry" <ken@kdm.org>	2000-01-10 08:39:03 +00:00
Kirk McKusick	cf60e8e4bf	Several performance improvements for soft updates have been added: 1) Fastpath deletions. When a file is being deleted, check to see if it was so recently created that its inode has not yet been written to disk. If so, the delete can proceed to immediately free the inode. 2) Background writes: No file or block allocations can be done while the bitmap is being written to disk. To avoid these stalls, the bitmap is copied to another buffer which is written thus leaving the original available for further allocations. 3) Link count tracking. Constantly track the difference in i_effnlink and i_nlink so that inodes that have had no change other than i_effnlink need not be written. 4) Identify buffers with rollback dependencies so that the buffer flushing daemon can choose to skip over them.	2000-01-10 00:24:24 +00:00
Kirk McKusick	f0f7d38386	Keep tighter control of removal dependencies by limiting the number of dirrem structures rather than the collaterally created freeblks and freefile structures. Limit the rate of buffer dirtying by the syncer process during periods of intense file removal.	2000-01-09 23:35:38 +00:00
Kirk McKusick	3f5b28bc07	Reorganize softdep_fsync so that it only does the inode-is-flushed check before the inode is unlocked while grabbing its parent directory. Once it is unlocked, other operations may slip in that could make the inode-is-flushed check fail. Allowing other writes to the inode before returning from fsync does not break the semantics of fsync since we have flushed everything that was dirty at the time of the fsync call.	2000-01-09 23:14:57 +00:00
Kirk McKusick	e2dc60835d	Get rid of unreferenced function.	2000-01-09 22:42:42 +00:00
Kirk McKusick	83aaf63ab2	Make static non-exported functions from soft updates.	2000-01-09 22:40:09 +00:00
Peter Wemm	c447342094	Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL" is an application space macro and the applications are supposed to be free to use it as they please (but cannot). This is consistent with the other BSDs who made this change quite some time ago. More commits to come.	1999-12-29 05:07:58 +00:00
Bruce Evans	7e58bfacbe	Update the unclean flag for mount -u. I forgot to handle this case when I made the absence of the clean flag sticky in rev.1.88. This was a problem mainly for "mount /". There is no way to mount "/" for writing without using mount -u (normally implicitly), so after "mount -f /" of an unclean filesystem, the absence of the clean flag was sticky forever.	1999-12-23 15:42:14 +00:00
Eivind Eklund	369dc8ceb8	Change incorrect NULLs to 0s	1999-12-21 11:14:12 +00:00
Robert Watson	91f37dcba1	Second pass commit to introduce new ACL and Extended Attribute system calls, vnops, vfsops, both in /kern, and to individual file systems that require a vfsop_ array entry. Reviewed by: eivind	1999-12-19 06:08:07 +00:00
Kirk McKusick	6a4152243f	The function request_cleanup() had a tsleep() with PCATCH. It is quite dangerous, since the process may hold locks at that point, and if it is stopped in that tsleep the machine may hang. Because the sleep is so short, the PCATCH is not required here, so it has been removed. For the future, the FreeBSD team needs to decide whether it is still reasonable to stop a process in tsleep, as that may affect any other code that uses PCATCH while holding kernel locks. Submitted by: Dmitrij Tejblum <tejblum@arc.hq.cti.ru> Reviewed by: Kirk McKusick <mckusick@mckusick.com>	1999-12-16 22:02:09 +00:00
Eivind Eklund	762e6b856c	Introduce NDFREE (and remove VOP_ABORTOP)	1999-12-15 23:02:35 +00:00
Eivind Eklund	6bdfe06ad9	Lock reporting and assertion changes. * lockstatus() and VOP_ISLOCKED() get a new process argument and a new return value: LK_EXCLOTHER, when the lock is held exclusively by another process. * The ASSERT_VOP_(UN)LOCKED family is extended to use what this gives them * Extend the vnode_if.src format to allow more exact specification than locked/unlocked. This commit should not make any semantic changes unless you are using DEBUG_VFS_LOCKS. Discussed with: grog, mch, peter, phk Reviewed by: peter	1999-12-11 16:13:02 +00:00
Bill Fumerola	43cd4e8815	Remove the 'alpha, use at your own risk' death-statement. Reviewed by: mckusick (verbally at FreeBSDcon)	1999-12-03 00:40:31 +00:00
Bill Fumerola	cfa5001489	Fix typo, add $FreeBSD$	1999-12-03 00:34:26 +00:00
Kirk McKusick	9f54c05286	Preferentially allocate the first indirect block in the same cylinder group as the inode. This makes a 15% difference in read speed for files in the 96K to 500K size range.	1999-12-01 19:33:12 +00:00
Poul-Henning Kamp	38224dcd59	Convert various pieces of code to use vn_isdisk() rather than checking for vp->v_type == VBLK. In ccd: we don't need to call VOP_GETATTR to find the type of a vnode. Reviewed by: sos	1999-11-22 10:33:55 +00:00
Eivind Eklund	b2f2b704d0	We do not have ffs_checkexp, so remove the prototype	1999-11-20 16:44:44 +00:00
Poul-Henning Kamp	0429e37ade	struct mountlist and struct mount.mnt_list have no business being a CIRCLEQ. Change them to TAILQ_HEAD and TAILQ_ENTRY respectively. This removes ugly mp != (void*)&mountlist comparisons. Requested by: phk Submitted by: Jake Burkholder jake@checker.org PR: 14967	1999-11-20 10:00:46 +00:00
Poul-Henning Kamp	698f9cf828	Next step in the device cleanup process. Correctly lock vnodes when calling VOP_OPEN() from filesystem mount code. Unify spec_open() for bdev and cdev cases. Remove the disabled bdev specific read/write code.	1999-11-09 14:15:33 +00:00
Bruce Evans	5bd5c8b9e5	Quick fix for breakage of ext2fs link counts as reported by stat(2) by the soft updates changes: only report the link count to be i_effnlink in ufs_getattr() for file systems that maintain i_effnlink. Tested by: Mike Dracopoulos <mdraco@math.uoa.gr>	1999-11-03 12:05:39 +00:00
Mike Smith	6d14782861	Newline-terminate the complaint message about not being able to find the root vnode pointer.	1999-11-01 23:57:28 +00:00
Poul-Henning Kamp	923502ff91	useracc() the prequel: Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs. This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.	1999-10-29 18:09:36 +00:00
Poul-Henning Kamp	b89392e703	Remove the D_NOCLUSTER[RW] options which were added because vn had problems. Now that Matt has fixed vn, this can go. The vn driver should have used d_maxio (now si_iosize_max) anyway.	1999-09-30 07:11:30 +00:00
Poul-Henning Kamp	1b5464ef9d	Remove v_maxio from struct vnode. Replace it with mnt_iosize_max in struct mount. Nits from: bde	1999-09-29 20:05:33 +00:00
Alfred Perlstein	c24fda81c9	Separate the export check in VFS_FHTOVP, exports are now checked via VFS_CHECKEXP. Add fh(open|stat|statfs) syscalls to allow userland to query filesystems based on (network) filehandle. Obtained from: NetBSD	1999-09-11 00:46:08 +00:00
Peter Wemm	280652828b	$Id$ -> $FreeBSD$	1999-08-28 02:16:32 +00:00
Peter Wemm	c3aac50f28	$Id$ -> $FreeBSD$	1999-08-28 01:08:13 +00:00
Poul-Henning Kamp	41d2e3e09e	Introduce vn_isdisk(struct vnode *vp) function, and use it to test for diskness.	1999-08-25 12:24:39 +00:00
Sheldon Hearn	740e3a15f7	Fix bug introduced in rev 1.28, which causes kernel build to break for the case where DEBUG is defined but not DIAGNOSTIC. ffs_checkblk is declared conditionally on DIAGNOSTIC, not DEBUG. PR: 13314 Reviewed by: bde	1999-08-24 08:39:41 +00:00
Bruce Evans	d918320517	Use devtoname() to print dev_t's instead of casting them to long or u_long for misprinting in %lx format.	1999-08-23 20:35:21 +00:00
Poul-Henning Kamp	7dc5cd047f	The bdevsw() and cdevsw() are now identical, so kill the former.	1999-08-13 10:29:38 +00:00
Poul-Henning Kamp	0ef1c82630	Decommission miscfs/specfs/specdev.h. Most of it goes into <sys/conf.h>, a few lines into <sys/vnode.h>. Add a few fields to struct specinfo, paving the way for the fun part.	1999-08-08 18:43:05 +00:00
Kirk McKusick	4dc0c8f521	Create the macro DOINGASYNC to check whether the MNT_ASYNC flag has been set for a mount point. Insert missing checks to ensure that all write operations are done asynchronously when the MNT_ASYNC option has been requested. Submitted by: Craig A Soules <soules+@andrew.cmu.edu> Reviewed by: Kirk McKusick <mckusick@mckusick.com>	1999-07-13 18:20:13 +00:00
Poul-Henning Kamp	68de329e34	Use the fsid from the superblock, unless it looks bogus or has already been taken by some other filesystem.	1999-07-11 19:16:50 +00:00
Ollivier Robert	7fe29b0aef	Add $Id$ Approved by: kirk	1999-07-07 07:51:04 +00:00
John Polstra	24755bdc25	Update pathnames for new location of soft-updates sources.	1999-07-03 21:34:05 +00:00
Kirk McKusick	48703fedf1	No longer need to set B_ASYNC flag since BUF_KERNPROC now unconditionally sets the identity of the buffer.	1999-06-29 15:57:40 +00:00
Peter Wemm	a6451da76b	Keep the inlines for <sys/buf.h> happy..	1999-06-27 13:26:23 +00:00
Kirk McKusick	67812eacd7	Convert buffer locking from using the B_BUSY and B_WANTED flags to using lockmgr locks. This commit should be functionally equivalent to the old semantics. That is, all buffer locking is done with LK_EXCLUSIVE requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will be done in future commits.	1999-06-26 02:47:16 +00:00
Kirk McKusick	7481264c1e	On our final pass through ffs_fsync, do all I/O synchronously so that we can find out if our flush is failing because of write errors. This change avoids a "flush failed" panic during unrecoverable disk errors.	1999-06-18 05:49:46 +00:00
Kirk McKusick	f9c8cab591	Add a vnode argument to VOP_BWRITE to get rid of the last vnode operator special case. Delete special case code from vnode_if.sh, vnode_if.src, umap_vnops.c, and null_vnops.c.	1999-06-16 23:27:55 +00:00
Kirk McKusick	e4ab40bcb6	Get rid of the global variable rushjob and replace it with a function in kern/vfs_subr.c named speedup_syncer() which handles the speedup request. Change the various clients of rushjob to use the new function.	1999-06-15 23:37:29 +00:00
Poul-Henning Kamp	2447bec829	Simplify cdevsw registration. The cdevsw_add() function now finds the major number(s) in the struct cdevsw passed to it. cdevsw_add_generic() is no longer needed, cdevsw_add() does the same thing. cdevsw_add() will print a message if the d_maj field looks bogus. Remove nblkdev and nchrdev variables. Most places they were used bogusly. Instead check a dev_t for validity by seeing if devsw() or bdevsw() returns NULL. Move bdevsw() and devsw() functions to kern/kern_conf.c Bump __FreeBSD_version to 400006 This commit removes: 72 bogus makedev() calls 26 bogus SYSINIT functions if_xe.c bogusly accessed cdevsw[], author/maintainer please fix. I4b and vinum not changed. Patches emailed to authors. LINT probably broken until they catch up.	1999-05-31 11:29:30 +00:00
Julian Elischer	2e897e94b6	Cosmetic changes to make it compile without errors in gcc -Wall	1999-05-22 04:43:04 +00:00
Kirk McKusick	c2606ec5c6	Add a hook to ffs_fsync to allow soft updates to get first chance at doing a sync on the block device for the filesystem. That allows it to push the bitmap blocks before the inode blocks which greatly reduces the number of inode rollbacks that need to be done.	1999-05-14 01:26:46 +00:00
Peter Wemm	51b5226683	Try and fix a dev_t/major/minor etc nit.	1999-05-12 22:32:07 +00:00
Kirk McKusick	71a0942aca	Put back changes that might be causing trouble on Alpha.	1999-05-09 19:39:54 +00:00
Poul-Henning Kamp	4be2eb8c49	I got tired of seeing all the cdevsw[major(foo)] all over the place. Made a new (inline) function devsw(dev_t dev) and substituted it. Changed the BDEV variant to this format as well: bdevsw(dev_t dev) DEVFS will eventually benefit from this change too.	1999-05-08 06:40:31 +00:00
Poul-Henning Kamp	46eede0058	Continue where Julian left off in July 1998: Virtualize bdevsw[] from cdevsw. bdevsw() is now an (inline) function. Join CDEV_MODULE and BDEV_MODULE to DEV_MODULE (please pay attention to the order of the cmaj/bmaj arguments!) Join CDEV_DRIVER_MODULE and BDEV_DRIVER_MODULE to DEV_DRIVER_MODULE (ditto!) (Next step will be to convert all bdev dev_t's to cdev dev_t's before they get to do any damage^H^H^H^H^H^Hwork in the kernel.)	1999-05-07 10:11:40 +00:00
Kirk McKusick	36cfb417de	Whitespace cleanup.	1999-05-07 05:21:16 +00:00
Kirk McKusick	7957996abd	Get rid of random debugging cruft; sync up with latest version.	1999-05-07 05:11:31 +00:00
Kirk McKusick	224a6aa241	Severe slowdowns have been reported when creating or removing many files at once on a filesystem running soft updates. The root of the problem is that soft updates limits the amount of memory that may be allocated to dependency structures so as to avoid hogging kernel memory. The original algorithm just waited for the disk I/O to catch up and reduce the number of dependencies. This new code takes a much more aggressive approach. Basically there are two resources that routinely hit the limit. Inode dependencies during periods with a high file creation rate and file and block removal dependencies during periods with a high file removal rate. I have attacked these problems from two fronts. When the inode dependency limits are reached, I pick a random inode dependency, UFS_UPDATE it together with all the other dirty inodes contained within its disk block and then write that disk block. This trick usually clears 5-50 inode dependencies in a single disk I/O. For block and file removal dependencies, I pick a random directory page that has at least one remove pending and VOP_FSYNC its directory. That releases all its removal dependencies to the work queue. To further hasten things along, I also immediately start the work queue process rather than waiting for its next one second scheduled run.	1999-05-07 02:26:47 +00:00
Peter Wemm	dfd5dee1b0	Add sufficient braces to keep egcs happy about potentially ambiguous if/else nesting.	1999-05-06 18:13:11 +00:00
Alan Cox	4221e284a3	The VFS/BIO subsystem contained a number of hacks in order to optimize piecemeal, middle-of-file writes for NFS. These hacks have caused no end of trouble, especially when combined with mmap(). I've removed them. Instead, NFS will issue a read-before-write to fully instantiate the struct buf containing the write. NFS does, however, optimize piecemeal appends to files. For most common file operations, you will not notice the difference. The sole remaining fragment in the VFS/BIO system is b_dirtyoff/end, which NFS uses to avoid cache coherency issues with read-merge-write style operations. NFS also optimizes the write-covers-entire-buffer case by avoiding the read-before-write. There is quite a bit of room for further optimization in these areas. The VM system marks pages fully-valid (AKA vm_page_t->valid = VM_PAGE_BITS_ALL) in several places, most notably in vm_fault. This is not correct operation. The vm_pager_get_pages() code is now responsible for marking VM pages all-valid. A number of VM helper routines have been added to aid in zeroing-out the invalid portions of a VM page prior to the page being marked all-valid. This operation is necessary to properly support mmap(). The zeroing occurs most often when dealing with file-EOF situations. Several bugs have been fixed in the NFS subsystem, including bits handling file and directory EOF situations and buf->b_flags consistency issues relating to clearing B_ERROR & B_INVAL, and handling B_DONE. getblk() and allocbuf() have been rewritten. B_CACHE operation is now formally defined in comments and more straightforward in implementation. B_CACHE for VMIO buffers is based on the validity of the backing store. B_CACHE for non-VMIO buffers is based simply on whether the buffer is B_INVAL or not (B_CACHE set if B_INVAL clear, and vice versa). biodone() is now responsible for setting B_CACHE when a successful read completes.
B_CACHE is also set when a bdwrite() is initiated and when a bwrite() is initiated. VFS VOP_BWRITE routines (there are only two - nfs_bwrite() and bwrite()) are now expected to set B_CACHE. This means that bowrite() and bawrite() also set B_CACHE indirectly. There are a number of places in the code which were previously using buf->b_bufsize (which is DEV_BSIZE aligned) when they should have been using buf->b_bcount. These have been fixed. getblk() now clears B_DONE on return because the rest of the system is so bad about dealing with B_DONE. Major fixes to NFS/TCP have been made. A server-side bug could cause requests to be lost by the server due to nfs_realign() overwriting other rpc's in the same TCP mbuf chain. The server's kernel must be recompiled to get the benefit of the fixes. Submitted by: Matthew Dillon <dillon@apollo.backplane.com>	1999-05-02 23:57:16 +00:00
Mike Smith	f4711b2df4	Simplify the tunefs example, since tunefs uses getfsfile(). Lots of people complain about working out what device their filesystems are mounted on.	1999-04-27 21:11:19 +00:00
Kirk McKusick	38e28fd66b	Reorganize locking to avoid holding the lock during calls to bdwrite and brelse (which may sleep in some systems). Obtained from: Matthew Dillon <dillon@apollo.backplane.com>	1999-03-02 06:38:07 +00:00
Kirk McKusick	eef33ce9bd	When fsync'ing a file on a filesystem using soft updates, we first try to write all the dirty blocks. If some of those blocks have dependencies, they will be remarked dirty when the I/O completes. On systems with really fast I/O systems, it is possible to get in an infinite loop trying to flush the buffers, because the I/O finishes before we can get all the dirty buffers off the v_dirtyblkhd list and into the I/O queue. (The previous algorithm looped over the v_dirtyblkhd list writing out buffers until the list emptied.) So, now we mark each buffer that we try to write so that we can distinguish the ones that are being remarked dirty from those that we have not yet tried to flush. Once we have tried to push every buffer once, we then push any associated metadata that is causing the remaining buffers to be redirtied. Submitted by: Matthew Dillon <dillon@apollo.backplane.com>	1999-03-02 04:04:31 +00:00
Kirk McKusick	4cbb89d95d	Ensure that softdep_sync_metadata can handle bmsafemap and mkdir entries if they ever arise (which should not happen as softdep_sync_metadata is currently used).	1999-03-02 00:19:47 +00:00
Kirk McKusick	133ff2619a	fix double LIST_REMOVE; other cosmetic changes to match version 9.32. Obtained from: Jeffrey Hsu <hsu@FreeBSD.ORG>	1999-02-17 20:01:20 +00:00
Matthew Dillon	8aef171243	Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile	1999-01-28 00:57:57 +00:00
David Greenman	8ab2fa0073	Gutted softdep_deallocate_dependencies and replaced it with a panic. It turns out to not be useful to unwind the dependencies and continue in the face of a fatal error. Also changed the log() to a printf() in softdep_error() so that it will be output in the case of an impending panic. Submitted by: Kirk McKusick <mckusick@mckusick.com>	1999-01-22 09:07:32 +00:00
Eivind Eklund	5b1b6c5859	Silence warning about unused debug function. (I'll turn this function into a DDB command in my next staticization sweep).	1999-01-12 11:42:41 +00:00
Eivind Eklund	a862221fa0	Add a warning about the copyright restraints.	1999-01-08 16:03:12 +00:00
Bruce Evans	de5d1ba57c	Don't pass unused timestamp args to UFS_UPDATE() or waste time initializing them. This almost finishes centralizing (in-core) timestamp updates in ufs_itimes().	1999-01-07 16:14:19 +00:00
Bruce Evans	4591d9bb7e	UFS_UPDATE() takes a boolean `waitfor' arg, so don't pass it the value MNT_WAIT when we mean boolean `true' or check for that value not being passed. There was no problem in practice because MNT_WAIT had the magic value of 1.	1999-01-06 18:18:06 +00:00
Bruce Evans	d64dbc8719	Ifdefed the conditionally used variable `prtrealloc'. Declare it as volatile so that there is no chance that the code that it controls is optimised away.	1999-01-06 17:04:33 +00:00
Bruce Evans	5991fd0370	Backed out rev.1.47. It just broke my optimisations for lazy syncing of timestamps in rev.1.45. The soft updates bug was elsewhere. Forgotten by: luoqi	1999-01-06 16:52:38 +00:00
Eivind Eklund	fb1167777a	Remove the 'waslocked' parameter to vfs_object_create().	1999-01-05 18:50:03 +00:00
Eivind Eklund	a777e82019	Remove the last clients of vfs_object_create(..., waslocked=1); waslocked will go away shortly. Reviewed by: dg	1999-01-02 01:32:36 +00:00
Julian Elischer	1f35e8c8da	Remove some compiler warnings.	1998-12-10 20:11:47 +00:00
Bruce Evans	672be20b9f	Don't use the strange null pointer constant `(ufs_daddr_t)0' in a call to VOP_BMAP(). Don't use uncast NULLs in the same call.	1998-11-29 03:12:06 +00:00
David Greenman	1c680b45a2	Restored the "reallocblks" code to its former glory. This basically performs an on-the-fly defragmentation of the FFS filesystem, changing file block allocations to make them contiguous. Thanks to Kirk McKusick for providing hints on what needed to be done to get this working.	1998-11-13 01:01:44 +00:00
Peter Wemm	2ec07c6614	Change dirty block list handling to use TAILQ macros.	1998-10-31 15:33:32 +00:00
Peter Wemm	40c8cfe552	Use TAILQ macros for clean/dirty block list processing. Set b_xflags rather than abusing the list next pointer with a magic number.	1998-10-31 15:31:29 +00:00
Jordan K. Hubbard	2dcc2f0693	Clarify a rather ambiguous debugging message.	1998-10-28 10:37:54 +00:00
Bruce Evans	b5ee16407f	Oops, the redundant tests for major numbers weren't redundant here. They checked for the magic major number for the "device" behind mfs mount points. Use a more obvious check for this device. Debugged by: Andrew Gallatin <gallatin@cs.duke.edu>	1998-10-27 11:47:08 +00:00
Bruce Evans	9c0619dace	Don't follow null bdevsw pointers. The `major(dev) < nblkdev' test rotted when bdevsw[] became sparse. We still depend on magic to avoid having to check that (v_rdev) device numbers in vnodes are not NODEV. Removed redundant `major(dev) < nblkdev' tests instead of updating them.	1998-10-25 19:02:48 +00:00
Poul-Henning Kamp	f5ef029e92	Nitpicking and dusting performed on a train. Removes trivial warnings about unused variables, labels and other lint.	1998-10-25 17:44:59 +00:00
Nate Williams	ed8d80c2de	Fix 'noatime' bug that was unrelated to use of noatime. The problem is caused when a directory block is compacted. When this occurs, softdep_change_directoryentry_offset() is called to relocate each directory entry and adjust its matching diradd structure, if any, to match the new location of the entry. The bug is that while softdep_change_directoryentry_offset() correctly adjusts the offsets of the diradd structures on the pd_diraddhd[] lists (which are not yet ready to be committed to disk), it fails to adjust the offsets of the diradd structures on the pd_pendinghd list (which are ready to be committed to disk). This causes the dependency structures to be inconsistent with the buf contents. Now, if the compaction has moved a directory entry to the same offset as one of the diradd structures on the pd_pendinghd list and a syscall is done that tries to remove this directory entry before this directory block has been written to disk (which would empty pd_pendinghd), a sanity check in newdirrem() will call panic() when it notices that the inode number in the entry that is to be removed doesn't match the inode number in the diradd structure with the offset of that entry. Reviewed by: Kirk McKusick <mckusick@McKusick.COM> Submitted by: Don Lewis <Don.Lewis@tsc.tdk.com>	1998-10-03 19:17:11 +00:00
Bruce Evans	0922cce61c	Fixed clean flag handling: - don't set the clean flag on unmount of an unclean filesystem that was (forcibly) mounted rw. - set the clean flag on rw -> ro update of a mounted initially-clean filesystem. - fixed some style bugs (mostly long lines). This uses the fs_flags field and FS_UNCLEAN state bit which were introduced in the softdep changes. NetBSD uses extra state bits in fs_clean. Reviewed by: luoqui	1998-09-26 04:59:42 +00:00
Luoqi Chen	e266594c25	Eliminate a race in VOP_FSYNC() when softupdates is enabled. Submitted by: Kirk McKusick <mckusick@McKusick.COM> Two minor changes are also included: 1. Remove gratuitous checks for error return from vn_lock with LK_RETRY set; vn_lock should always succeed in these cases. 2. Back out change rev. 1.36->1.37, which unnecessarily makes async mount a little more unstable. It also keeps us in sync with other BSDs. Suggested by: Bruce Evans <bde@zeta.org.au>	1998-09-24 15:02:46 +00:00
Luoqi Chen	f9e84c2fee	Restore pre-v1.44 behavior: always copy modified in-core inode to disk buffer. Otherwise some in-core inode changes might be lost, including important meta data (e.g. size) if softupdates is enabled.	1998-09-15 14:45:28 +00:00
Søren Schmidt	d024c95599	Remove the SLICE code. This clearly needs a lot more thought, and we don't need this to hunt us down in 3.0-RELEASE.	1998-09-14 19:56:42 +00:00
Bruce Evans	9164000766	Don't dereference an uninitialized pointer in dead code. The dead code gets executed if it is compiled without optimization.	1998-09-12 14:46:15 +00:00
Bruce Evans	8994ca3ce9	Removed statically configured mount type numbers (MOUNT_) and all references to them. The change a couple of days ago to ignore these numbers in statically configured vfsconf structs was slightly premature because the cd9660, cfs, devfs, ext2fs, nfs vfs's still used MOUNT_ instead of the number in their vfsconf struct.	1998-09-07 13:17:06 +00:00
Bruce Evans	ff261f16f6	Put the zombie ffs sysctl node in "notyet" state together with its few remaining children. Prepare it for MOUNT_UFS going away.	1998-09-07 11:50:19 +00:00
Poul-Henning Kamp	0375c9f2b8	Add a new vnode op, VOP_FREEBLKS(), which filesystems can use to inform device drivers about sectors no longer in use. Device-drivers receive the call through d_strategy, if they have D_CANFREE in d_flags. This allows flash based devices to erase the sectors and avoid pointlessly carrying them around in compactions. Reviewed by: Kirk Mckusick, bde Sponsored by: M-Systems (www.m-sys.com)	1998-09-05 14:13:12 +00:00
Bruce Evans	0492d857d1	Removed unused includes.	1998-08-17 19:09:36 +00:00
Julian Elischer	55d80b2df1	Handle the case of moving a directory onto the top of a sibling's child of the same name. Submitted by: Kirk Mckusick with fixes from luoqi Chen Obtained from: Whistle test tree.	1998-08-12 20:46:47 +00:00
Bruce Evans	ac1e407b32	Fixed printf format errors.	1998-07-11 07:46:16 +00:00
Julian Elischer	bcbd6c6fdd	Don't update superblock if mounted readonly, also fixes some problems with softupdates on root. More cleanups are needed here.. Submitted by: Luoqi Chen <luoqi@watermarkgroup.com>	1998-07-08 23:52:27 +00:00
Julian Elischer	fd5d1124e2	VOP_STRATEGY grows an (struct vnode *) argument as the value in b_vp is often not really what you want. (and needs to be frobbed). more cleanups will follow this. Reviewed by: Bruce Evans <bde@freebsd.org>	1998-07-04 20:45:42 +00:00
Bruce Evans	3055187290	Sync timestamp changes for inodes of special files to disk as late as possible (when the inode is reclaimed). Temporarily only do this if option UFS_LAZYMOD configured and softupdates aren't enabled. UFS_LAZYMOD is intentionally left out of /sys/conf/options. This is mainly to avoid almost useless disk i/o on battery powered machines. It's silly to write to disk (on the next sync or when the inode becomes inactive) just because someone hit a key or something wrote to the screen or /dev/null. PR: 5577 Previous version reviewed by: phk	1998-07-03 22:17:03 +00:00
Bruce Evans	33cc029eab	Centralized in-core inode update. Update the in-core inode directly in ufs_setattr() so that there is no need to pass timestamps to UFS_UPDATE() (everything else just needs the current time). Ignore the passed-in timestamps in UFS_UPDATE() and always call ufs_itimes() (was: itimes()) to do the update. The timestamps are still passed so that all the callers don't need to be changed yet.	1998-07-03 18:46:52 +00:00
Jordan K. Hubbard	d94ce17be4	Flesh this document out just a little in response to some user questions and also recommend linking over copying since, at this stage, a stale copy is a real concern.	1998-06-26 10:35:55 +00:00
Julian Elischer	c619155f0e	Slight change to directory cleanup Makes soft updates a bit cleaner. Eliminates some warnings about 'corrupted directories' from fsck.	1998-06-14 19:31:28 +00:00
Julian Elischer	28ed032673	Note which version of Kirk's sources this corresponds to.	1998-06-12 21:21:26 +00:00
Julian Elischer	aa75cb86b4	Fix the case when renaming to a file that you've just created and deleted, that had an inode that has not yet been written to disk, when the inode of the new file is also not yet written to disk, and your old directory entry is not yet on disk but you need to remove it and the new name exists in memory but has been deleted but the transaction to write the deleted name to disk exists and has not yet been cancelled by the request to delete the nonexistent name. I don't know how Kirk could have missed such a glaring problem for so long. :-) Especially since the inconsistency survived on the disk for a whole 4 seconds on average before being fixed by other code. This was not a crashing bug but just led to filesystem inconsistencies if you crashed. Submitted by: Kirk McKusick (mckusick@mckusick.com)	1998-06-12 20:48:30 +00:00
Julian Elischer	6d0ba44288	Add B_NOCACHE to several cases where BSD4.4 only required a B_INVAL. Change worked out by John and Kirk in concert.	1998-06-11 17:44:32 +00:00
Julian Elischer	8c221701c3	Fix for "live inode" panic. Submitted by: Kirk McKusick <mckusick@McKusick.COM> Reviewed by: yeah right...	1998-06-10 20:45:46 +00:00
Julian Elischer	4af0bb0f9e	Remove buggy debugging code.	1998-06-10 20:03:16 +00:00
Julian Elischer	939001af5c	Back out John's changes 1.45 -> 1.46 Kirk confirms that the original semantic was what he wanted... (well, a very slight difference) May fix "dangling deps" panic with soft updates.	1998-06-10 19:27:56 +00:00
Doug Rabson	8435e0aef5	Use size_t instead of u_int for sizes.	1998-06-04 17:21:39 +00:00

431 Commits