freebsd-dev

Author	SHA1	Message	Date
David E. O'Brien	6e551fb628	Update to C99, s/__FUNCTION__/__func__/, also don't use ANSI string concatenation.	2001-12-10 08:09:49 +00:00
Matthew Dillon	6b8bd2efc1	Add mnt_reservedvnlist so we can MFC to 4.x, in order to make all mount structure changes now rather than piecemeal later on. mnt_nvnodelist currently holds all the vnodes under the mount point. This will eventually be split into a 'dirty' and 'clean' list. This way we only break kld's once rather than twice. nvnodelist will eventually turn into the dirty list and should remain compatible with the klds.	2001-11-04 18:55:42 +00:00
Matthew Dillon	c72ccd014d	Change the vnode list under the mount point from a LIST to a TAILQ in preparation for an implementation of limiting code for kern.maxvnodes. MFC after: 3 days	2001-10-23 01:21:29 +00:00
Ian Dowse	5d76690a7f	The addition of i_dirhash to struct inode pushed RELENG_4's sizeof(struct inode) into a new malloc bucket on the i386. This didn't happen in -current due to the removal of i_lock, but it does no harm to apply the workaround to -current first. Reduce the size of the i_spare[] array in struct inode from 4 to 3 entries, and change ext2fs to use i_din.di_spare[1] so that it does not need i_spare[3]. Reviewed by: bde MFC after: 3 days	2001-09-24 18:29:20 +00:00
Julian Elischer	b40ce4165d	KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doozy!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
Ian Dowse	9b5ad47fb7	Bring in dirhash, a simple hash-based lookup optimisation for large directories. When enabled via "options UFS_DIRHASH", in-core hash arrays are maintained for large directories. These allow all directory operations to take place quickly instead of requiring long linear searches. For now anyway, dirhash is not enabled by default. The in-core hash arrays have a memory requirement that is approximately half the size of the on-disk directory file. A number of new sysctl variables allow control over which directories get hashed and over the maximum amount of memory that dirhash will use: vfs.ufs.dirhash_minsize The minimum on-disk directory size for which hashing should be used. The default is 2560 (2.5k). vfs.ufs.dirhash_maxmem The system-wide maximum total memory to be used by dirhash data structures. The default is 2097152 (2MB). The current amount of memory being used by dirhash is visible through the read-only sysctl variable vfs.ufs.dirhash_mem. Finally, some extra sanity checks that are enabled by default, but which may have an impact on performance, can be disabled by setting vfs.ufs.dirhash_docheck to 0. Discussed on: -fs, -hackers	2001-07-10 21:21:29 +00:00
John Baldwin	ed87274d16	Fix more mntvnode and vnode interlock order reversals.	2001-06-28 22:21:33 +00:00
John Baldwin	797c3dba25	Fix a mntvnode and vnode interlock reversal.	2001-06-28 03:52:04 +00:00
Poul-Henning Kamp	c7a3e2379c	Remove last vestiges of MFS.	2001-05-29 21:21:53 +00:00
Ian Dowse	0864ef1e8a	Change the second argument of vflush() to an integer that specifies the number of references on the filesystem root vnode to be both expected and released. Many filesystems hold an extra reference on the filesystem root vnode, which must be accounted for when determining if the filesystem is busy and then released if it isn't busy. The old `skipvp' approach required individual filesystem xxx_unmount functions to re-implement much of vflush()'s logic to deal with the root vnode. All 9 filesystems that hold an extra reference on the root vnode got the logic wrong in the case of forced unmounts, so `umount -f' would always fail if there were any extra root vnode references. Fix this issue centrally in vflush(), now that we can. This commit also fixes a vnode reference leak in devfs, which could result in idle devfs filesystems that refuse to unmount. Reviewed by: phk, bp	2001-05-16 18:04:37 +00:00
Kirk McKusick	9ccb939ef0	When running with soft updates, track the number of blocks and files that are committed to being freed and reflect these blocks in the counts returned by statfs (and thus also by the `df' command). This change allows programs such as those that do news expiration to know when to stop if they are trying to create a certain percentage of free space. Note that this change does not solve the much harder problem of making this to-be-freed space available to applications that want it (thus on a nearly full filesystem, you may still encounter out-of-space conditions even though the free space will show up eventually). Hopefully this harder problem will be the subject of a future enhancement.	2001-05-08 07:42:20 +00:00
Poul-Henning Kamp	cf94807d03	Remove blatantly pointless call to VOP_BMAP().	2001-05-01 09:12:05 +00:00
Poul-Henning Kamp	a62615e59b	Implement vop_std{get|put}pages() and add them to the default vop[]. Un-copy&paste all the VOP_{GET|PUT}PAGES() functions which do nothing but the default.	2001-05-01 08:34:45 +00:00
Mark Murray	fb919e4d5a	Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h from kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)	2001-05-01 08:13:21 +00:00
Poul-Henning Kamp	855aa097af	VOP_BALLOC was never really a VOP in the first place, so convert it to UFS_BALLOC like the other "between UFS and FFS function interfaces".	2001-04-29 12:36:52 +00:00
Poul-Henning Kamp	bdb8855550	Make a panic less misleading.	2001-04-29 11:45:15 +00:00
Poul-Henning Kamp	954a0e256e	Remove two unused arguments from ufs_bmaparray().	2001-04-29 10:24:58 +00:00
Greg Lehey	60fb0ce365	Revert consequences of changes to mount.h, part 2. Requested by: bde	2001-04-29 02:45:39 +00:00
Bruce Evans	e8a28f87d8	MFffs ffs_balloc.c 1.5. Long ago, bread() set b_blkno to the disk block number as a side effect of doing physical i/o (or it just retained the setting from when the i/o was done). The setting is lost when buffers go away and then are reconstituted from VM. bread() originally compensated by doing a VOP_BMAP() to recover b_blkno, but this was no good since it sometimes caused extra i/o or even deadlock for bread()ing metadata to do the bmap. This was fixed in vfs_bio.c 1.33 (1995/03/03) and ffs_balloc.c 1.5, etc., by removing the VOP_BMAP() from bread() and breadn(), and changing all (?) places that used b_blkno to set it if necessary. ext2fs was not imported until later in 1995 and was still depending on the old behaviour of bread() in at least ext2_balloc(). This caused filesystem and file corruption by clobbering direct block numbers in inodes.	2001-04-25 10:33:09 +00:00
Poul-Henning Kamp	a13234bb35	Move the netexport structure from the fs-specific mount structure to struct mount. This makes the "struct netexport *" parameter to the vfs_export and vfs_checkexport interface unneeded. Consequently, all non-stacking filesystems can use vfs_stdcheckexp(). At the same time, make it a pointer to a struct netexport in struct mount, so that we can remove the bogus AF_MAX and #include <net/radix.h> from <sys/mount.h>	2001-04-25 07:07:52 +00:00
Greg Lehey	d98dc34f52	Correct #includes to work with fixed sys/mount.h.	2001-04-23 09:05:15 +00:00
Kirk McKusick	589c7af992	Fixes to track snapshot copy-on-write checking in the specinfo structure rather than assuming that the device vnode would reside in the FFS filesystem (which is obviously a broken assumption with the device filesystem).	2001-03-07 07:09:55 +00:00
John Baldwin	19eb87d22a	Grab the process lock while calling psignal and before calling psignal.	2001-03-07 03:37:06 +00:00
Adrian Chadd	f3a90da995	Reviewed by: jlemon An initial tidyup of the mount() syscall and VFS mount code. This code replaces the earlier work done by jlemon in an attempt to make linux_mount() work. * the guts of the mount work has been moved into vfs_mount(). * move `type', `path' and `flags' from being userland variables into being kernel variables in vfs_mount(). `data' remains a pointer into userspace. * Attempt to verify the `type' and `path' strings passed to vfs_mount() aren't too long. * rework mount() and linux_mount() to take the userland parameters (besides data, as mentioned) and pass kernel variables to vfs_mount(). (linux_mount() already did this, I've just tidied it up a little more.) * remove the copyin() stuff for `path'. `data' still requires copyin() since it's a pointer into userland. * set `mount->mnt_statf_mntonname' in vfs_mount() rather than in each filesystem. This variable is generally initialised with `path', and each filesystem can override it if they want to. * NOTE: f_mntonname is initialised with "/" in the case of a root mount.	2001-03-01 21:00:17 +00:00
Jeroen Ruigrok van der Werven	7c63796828	Preceed/preceeding are not english words. Use precede or preceding.	2001-02-18 10:25:42 +00:00
Bosko Milekic	9ed346bab0	Change and clean the mutex lock interface. mtx_enter(lock, type) becomes: mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized) similarly, for releasing a lock, we now have: mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument. The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind. Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two: MTX_QUIET and MTX_NOSWITCH The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers: mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively. Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case. Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled. 
Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those. Finally, caught up to the interface changes in all sys code. Contributors: jake, jhb, jasone (in no particular order)	2001-02-09 06:11:45 +00:00
Poul-Henning Kamp	fc2ffbe604	Mechanical change to use <sys/queue.h> macro API instead of fondling implementation details. Created with: sed(1) Reviewed by: md5(1)	2001-02-04 13:13:25 +00:00
John Baldwin	ba88dfc733	Back out proc locking to protect p_ucred for obtaining additional references along with the actual obtaining of additional references.	2001-01-27 00:01:31 +00:00
Jason Evans	1b367556b5	Convert all simplelocks to mutexes and remove the simplelock implementations.	2001-01-24 12:35:55 +00:00
John Baldwin	157403fff0	Proc locking, mostly protecting p_ucred while obtaining additional references.	2001-01-23 22:41:15 +00:00
Matthew Dillon	6ddaf0f45e	Avoid a data-consistency race between write() and mmap() by ensuring that newly allocated blocks are zeroed. The race can occur even in the case where the write covers the entire block. Reported by: Sven Berkvens <sven@berkvens.net>, Marc Olzheim <zlo@zlo.nu>	2000-12-17 23:57:05 +00:00
Matt Jacob	80e8f27bbc	Put the bits in place for Alpha support for ext2. Not tested.	2000-12-09 22:32:49 +00:00
Matt Jacob	2c8380ba4a	Correct to a common %ld the 5th argument to a printf.	2000-12-09 22:32:01 +00:00
Matt Jacob	f5a5fd9ed1	Use a pointer to a size_t for the 4th argument to copyinstr- not a pointer to a u_int.	2000-12-09 22:31:34 +00:00
Bruce Evans	03b67a395f	Backed out previous commit. Don't depend on namespace pollution in <sys/buf.h>.	2000-12-02 12:03:58 +00:00
Alfred Perlstein	82625cf321	remove unneeded sys/ucred.h includes	2000-11-30 18:52:32 +00:00
Bruce Evans	3a715f43c6	Quick fix for not writing group descriptor group, inode bitmaps or block bitmaps before unmount() completes. They were written using bdwrite(), so they were normally written less than 32 seconds after unmount(), but this is too late if the media is removed or the system is rebooted soon after unmount(). sync()ing before unmount() didn't help, because ext2fs uses buggy private caching for these blocks -- it doesn't even bdwrite() them until they are uncached or the filesystem is unmounted. sync()ing after unmount() didn't help, because sync() only applies to (vnodes for) mounted filesystems. PR: 22726	2000-11-10 14:54:15 +00:00
Bruce Evans	1c1752872f	Fixed breakage of mknod() in rev.1.48 of ext2_vnops.c and rev.1.126 of ufs_vnops.c: 1) i_ino was confused with i_number, so the inode number passed to VFS_VGET() was usually wrong (usually 0U). 2) ip was dereferenced after vgone() freed it, so the inode number passed to VFS_VGET() was sometimes not even wrong. Bug (1) was usually fatal in ext2_mknod(), since ext2fs doesn't have space for inode 0 on the disk; ino_to_fsba() subtracts 1 from the inode number, so inode number 0U gives a way out of bounds array index. Bug(1) was usually harmless in ufs_mknod(); ino_to_fsba() doesn't subtract 1, and VFS_VGET() reads suitable garbage (all 0's?) from the disk for the invalid inode number 0U; ufs_mknod() returns a wrong vnode, but most callers just vput() it; the correct vnode is eventually obtained by an implicit VFS_VGET() just like it used to be. Bug (2) usually doesn't happen.	2000-11-04 08:10:56 +00:00
Bruce Evans	e6410301f0	Support filesystems with the not-so-new "sparse_superblocks" feature. When this feature is enabled, mke2fs doesn't necessarily allocate a super block and its associated descriptor blocks for every group. The (non-)allocations are reflected in the block bitmap. Since the filesystem code doesn't write to these blocks except for the first superblock, all it has to do to support them is to not count them in ext2_statfs() and not attempt to check them at mount time in ext2_check_blocks_bitmap() (the check has never been enabled in FreeBSD anyway).	2000-11-03 16:41:48 +00:00
Poul-Henning Kamp	9f69a4578a	Weaken a bogus dependency on <sys/proc.h> in <sys/buf.h> by #ifdef'ing the offending inline function (BUF_KERNPROC) on it being #included already. I'm not sure BUF_KERNPROC() is even the right thing to do or in the right place or implemented the right way (inline vs normal function). Remove consequently unneeded #includes of <sys/proc.h>	2000-10-29 14:54:55 +00:00
Poul-Henning Kamp	46aa3347cb	Convert all users of fldoff() to offsetof(). fldoff() is bad because it only takes a struct tag which makes it impossible to use unions, typedefs etc. Define __offsetof() in <machine/ansi.h> Define offsetof() in terms of __offsetof() in <stddef.h> and <sys/types.h> Remove myriad of local offsetof() definitions. Remove includes of <stddef.h> in kernel code. NB: Kernel code should never include from /usr/include ! Make <sys/queue.h> include <machine/ansi.h> to avoid polluting the API. Deprecate <struct.h> with a warning. The warning turns into an error on 01-12-2000 and the file gets removed entirely on 01-01-2001. Partial reviews by: various. Significant brucifications by: bde	2000-10-27 11:45:49 +00:00
Eivind Eklund	7eb9fca557	Blow away the v_specmountpoint define, replacing it with what it was defined as (rdev->si_mountpoint)	2000-10-09 17:31:39 +00:00
Jason Evans	a18b1f1d4d	Convert lockmgr locks from using simple locks to using mutexes. Add lockdestroy() and appropriate invocations, which corresponds to lockinit() and must be called to clean up after a lockmgr lock is no longer needed.	2000-10-04 01:29:17 +00:00
Boris Popov	e500f61fde	ext2fs depends on ufs code, so update it to properly handle v_lock field. Noticed by: bde	2000-09-26 01:31:46 +00:00
Boris Popov	67e871664b	Add a lock structure to vnode structure. Previously it was either allocated separately (nfs, cd9660 etc) or kept as a first element of structure referenced by v_data pointer(ffs). Such organization leads to known problems with stacked filesystems. From this point vop_nolock() functions maintain only interlock lock. vop_stdlock() functions maintain built-in v_lock structure using lockmgr(). vop_sharedlock() is compatible with vop_stdunlock(), but maintains a shared lock on vnode. If filesystem wishes to export lockmgr compatible lock, it can put an address of this lock to v_vnlock field. This indicates that the upper filesystem can take advantage of it and use single lock structure for entire (or part) of stack of vnodes. This field shouldn't be examined or modified by VFS code except for initialization purposes. Reviewed in general by: mckusick	2000-09-25 15:24:04 +00:00
Bruce Evans	96cae770d3	Fixed some serious bugs in ext2_readdir(): The cookie buffer was usually overrun by a large amount whenever cookies were used. Cookies are used by nfs and the Linuxulator, so this bug usually caused panics whenever an ext2fs filesystem was nfs mounted or a Linux utility that calls readdir() was run on an ext2fs filesystem. The directory buffer was sometimes overrun by a small amount. This sometimes caused panics and wrong results even for FreeBSD utilities, but it was usually harmless because FreeBSD utilities use a large enough buffer size (4K). Linux utilities usually triggered the bug since they use a too-small buffer size (512 bytes), at least with the old RedHat utilities that I tested with. PR: 19407 (this fix is incomplete or for a slightly different bug)	2000-09-12 17:10:39 +00:00
Kirk McKusick	9b97113391	This patch corrects the first round of panics and hangs reported with the new snapshot code. Update addaliasu to correctly implement the semantics of the old checkalias function. When a device vnode first comes into existence, check to see if an anonymous vnode for the same device was created at boot time by bdevvp(). If so, adopt the bdevvp vnode rather than creating a new vnode for the device. This corrects a problem which caused the kernel to panic when taking a snapshot of the root filesystem. Change the calling convention of vn_write_suspend_wait() to be the same as vn_start_write(). Split out softdep_flushworklist() from softdep_flushfiles() so that it can be used to clear the work queue when suspending filesystem operations. Access to buffers becomes recursive so that snapshots can recursively traverse their indirect blocks using ffs_copyonwrite() when checking for the need for copy on write when flushing one of their own indirect blocks. This eliminates a deadlock between the syncer daemon and a process taking a snapshot. Ensure that softdep_process_worklist() can never block because of a snapshot being taken. This eliminates a problem with buffer starvation. Cleanup change in ffs_sync() which did not synchronously wait when MNT_WAIT was specified. The result was an unclean filesystem panic when doing forcible unmount with heavy filesystem I/O in progress. Return a zero'ed block when reading a block that was not in use at the time that a snapshot was taken. Normally, these blocks should never be read. However, the readahead code will occasionally read them which can cause unexpected behavior. Clean up the debugging code that ensures that no blocks be written on a filesystem while it is suspended. Snapshots must explicitly label the blocks that they are writing during the suspension so that they do not cause a `write on suspended filesystem' panic. 
Reorganize ffs_copyonwrite() to eliminate a deadlock and also to prevent a race condition that would permit the same block to be copied twice. This change eliminates an unexpected soft updates inconsistency in fsck caused by the double allocation. Use bqrelse rather than brelse for buffers that will be needed soon again by the snapshot code. This improves snapshot performance.	2000-07-24 05:28:33 +00:00
Kirk McKusick	f2a2857bb3	Add snapshots to the fast filesystem. Most of the changes support the gating of system calls that cause modifications to the underlying filesystem. The gating can be enabled by any filesystem that needs to consistently suspend operations by adding the vop_stdgetwritemount to their set of vnops. Once gating is enabled, the function vfs_write_suspend stops all new write operations to a filesystem, allows any filesystem modifying system calls already in progress to complete, then sync's the filesystem to disk and returns. The function vfs_write_resume allows the suspended write operations to begin again. Gating is not added by default for all filesystems as for SMP systems it adds two extra locks to such critical kernel paths as the write system call. Thus, gating should only be added as needed. Details on the use and current status of snapshots in FFS can be found in /sys/ufs/ffs/README.snapshot so for brevity and timeliness are not included here. Unless and until you create a snapshot file, these changes should have no effect on your system (famous last words).	2000-07-11 22:07:57 +00:00
Alexander Langer	0cca1cc078	Fix typo (accessable --> accessible). PR: 18588 Submitted by: Anatoly Vorobey <mellon@pobox.com> Reviewed by: asmodai	2000-06-14 17:53:40 +00:00
Jake Burkholder	e39756439c	Back out the previous change to the queue(3) interface. It was not discussed and should probably not happen. Requested by: msmith and others	2000-05-26 02:09:24 +00:00
Jake Burkholder	740a1973a6	Change the way that the queue(3) structures are declared; don't assume that the type argument to _HEAD and _ENTRY is a struct. Suggested by: phk Reviewed by: phk Approved by: mdodd	2000-05-23 20:41:01 +00:00
Poul-Henning Kamp	9626b608de	Separate the struct bio related stuff out of <sys/buf.h> into <sys/bio.h>. <sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bde's teachings on the subject of nested includes. Diskdrivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.h> unless they need caching of data. Still a few bogus uses of struct buf to track down. Repocopy by: peter	2000-05-05 09:59:14 +00:00
Poul-Henning Kamp	2c9b67a8df	Remove unneeded #include <vm/vm_zone.h> Generated by: src/tools/tools/kerninclude	2000-04-30 18:52:11 +00:00
Poul-Henning Kamp	87150cb06d	s/biowait/bufwait/g Prodded by: several.	2000-04-29 16:25:22 +00:00
Poul-Henning Kamp	3389ae9350	Remove ~25 unneeded #include <sys/conf.h> Remove ~60 unneeded #include <sys/malloc.h>	2000-04-19 14:58:28 +00:00
Robert Watson	7ef47eec4c	ext2fs relies on UFS support code, and as a result also requires extattr.h to be included. This fixes the broken ext2fs build as of the import of extattr code. Also added $FreeBSD: $ to a couple of files that didn't have them, without which I couldn't commit this fix. Reported by: "George W. Dinolt" <gdinolt@pacbell.net>	2000-04-15 17:14:22 +00:00
Robert Watson	a64ed08955	Introduce extended attribute support for FFS, allowing arbitrary (name, value) pairs to be associated with inodes. This support is used for ACLs, MAC labels, and Capabilities in the TrustedBSD security extensions, which are currently under development. In this implementation, attributes are backed to data vnodes in the style of the quota support in FFS. Support for FFS extended attributes may be enabled using the FFS_EXTATTR kernel option (disabled by default). Userland utilities and man pages will be committed in the next batch. VFS interfaces and man pages have been in the repo since 4.0-RELEASE and are unchanged. o ufs/ufs/extattr.h: UFS-specific extattr defines o ufs/ufs/ufs_extattr.c: bulk of support routines o ufs/{ufs,ffs,mfs}/*.[ch]: hooks and extattr.h includes o contrib/softupdates/ffs_softdep.c: extattr.h includes o conf/options, conf/files, i386/conf/LINT: added FFS_EXTATTR o coda/coda_vfsops.c: XXX required extattr.h due to ufsmount.h (This should not be the case, and will be fixed in a future commit) Currently attributes are not supported in MFS. This will be fixed. Reviewed by: adrian, bp, freebsd-fs, other unthanked souls Obtained from: TrustedBSD Project	2000-04-15 03:34:27 +00:00
Poul-Henning Kamp	c244d2de43	Move B_ERROR flag to b_ioflags and call it BIO_ERROR. (Much of this done by script) Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED. Move b_pblkno and b_iodone_chain to struct bio while we transition, they will be obsoleted once bio structs chain/stack. Add bio_queue field for struct bio aware disksort. Address a lot of stylistic issues brought up by bde.	2000-04-02 15:24:56 +00:00
Matthew Dillon	e4649cfac3	Change the write-behind code to take more care when starting async I/O's. The sequential read heuristic has been extended to cover writes as well. We continue to call cluster_write() normally, thus blocks in the file will still be reallocated for large (but still random) I/O's, but I/O will only be initiated for truly sequential writes. This solves a number of annoying situations, especially with DBM (hash method) writes, and also has the side effect of fixing a number of (stupid) benchmarks. Reviewed-by: mckusick	2000-04-02 00:55:28 +00:00
Poul-Henning Kamp	b99c307a21	Rename the existing BUF_STRATEGY() to DEV_STRATEGY() substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo) substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo) This patch is machine generated except for the ccd.c and buf.h parts.	2000-03-20 11:29:10 +00:00
Poul-Henning Kamp	21144e3bf1	Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new field in struct buf: b_iocmd. The b_iocmd is enforced to have exactly one bit set. B_WRITE was bogusly defined as zero giving rise to obvious coding mistakes. Also eliminate the redundant struct buf flag B_CALL, it can just as efficiently be done by comparing b_iodone to NULL. Should you get a panic or drop into the debugger, complaining about "b_iocmd", don't continue. It is likely to write on your disk where it should have been reading. This change is a step in the direction towards a stackable BIO capability. A lot of this patch were machine generated (Thanks to style(9) compliance!) Vinum users: Greg has not had time to test this yet, be careful.	2000-03-20 10:44:49 +00:00
Kirk McKusick	15e549f668	Bug fixes for currently harmless bugs that could rise to bite the unwary if the code were called in slightly different ways. 1) In ufs_bmaparray() the code for calculating 'runb' will stop one block short of the first entry in an indirect block. i.e. if an indirect block contains N block numbers b[0]..b[N-1] then the code will never check if b[0] and b[1] are sequential. For reference, compare with the equivalent code that deals with direct blocks. 2) In ufs_lookup() there is an off-by-one error in the test that checks if dp->i_diroff is outside the range of the current directory size. This is completely harmless, since the following while-loop condition 'dp->i_offset < endsearch' is never met, so the code immediately does a second pass starting at dp->i_offset = 0. 3) Again in ufs_lookup(), the condition in a sanity check is wrong for directories that are longer than one block. This bug means that the sanity check is only effective for small directories. Submitted by: Ian Dowse <iedowse@maths.tcd.ie>	2000-03-15 07:18:15 +00:00
Bruce Evans	d1a417f17e	Don't forget to check for unsupported features when updating. It was possible to defeat the check for rw incompatibility by mounting ro and updating to rw. Approved by: jkh	2000-03-09 05:21:10 +00:00
Bruce Evans	fc47f29c64	MFS (ext2_lookup.c 1.17.2.2, ext2_vnops.c 1.42.2.2: fix "filetype" support). Approved by: jkh	2000-03-03 08:00:27 +00:00
Poul-Henning Kamp	ba4ad1fcea	Give vn_isdisk() a second argument where it can return a suitable errno. Suggested by: bde	2000-01-10 12:04:27 +00:00
Bruce Evans	b9b652d2f6	Support filesystems with the not-so-new "filetype" feature. This feature gives the d_type field for struct dirent. We used to panic in ext2_readdir() for filesystems with this feature.	2000-01-05 19:31:26 +00:00
Bruce Evans	6291b96b03	Don't allow mounting (or mounting R/W) of filesystems with unsupported features (except for file types in directory entries, which will be supported soon). Centralized the magic number and compatibility checking. Dropped support for ancient (pre-0.2b) filesystems, as in the Linux version. Our "support" consisted of printing more details in the error message before failing at mount time.	2000-01-02 17:40:02 +00:00
Bruce Evans	d68084cd89	Merged changes in ext2_fs.h between Linux 1.2.2 and Linux 2.3.35. The main changes are: - many things are more dynamic; e.g., the inode size is a new parameter in the superblock instead of a constant. - extensions are controlled by new flags in the superblock. - directory entries may have a file type field. These changes are not used yet, except for a spelling change which affects ext2_cnv.c	2000-01-01 17:39:21 +00:00
Bruce Evans	c9fbb5bc2c	Merged cosmetic changes from the initial import on the vendor branch (mainly things that were lost or misformatted in a different way by moving them to ext2_fs_i.h and back, and ifdefs for user mode that were excessively edited).	2000-01-01 16:26:43 +00:00
Bruce Evans	8e19715af3	Use an ifdef in ext2_fs.h instead of a bogus separate file (ext2_fs_i.h) to avoid the namespace problems caused by <ufs/ufs/inode.h> #defining i_mode, etc. ext2_fs_i.h had nothing to do with the Linux version. It was a small part of the Linux version of ext2_fs.h (the part that declares extra in-core fields for an inode). We don't need it because we use the ufs in-core inode for the extra fields.	2000-01-01 14:43:20 +00:00
Bruce Evans	5c8b462df8	Updated/corrected the list of GPL'ed files.	2000-01-01 11:27:50 +00:00
Peter Wemm	c447342094	Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL" is an application space macro and the applications are supposed to be free to use it as they please (but cannot). This is consistent with the other BSD's who made this change quite some time ago. More commits to come.	1999-12-29 05:07:58 +00:00
Robert Watson	91f37dcba1	Second pass commit to introduce new ACL and Extended Attribute system calls, vnops, vfsops, both in /kern, and to individual file systems that require a vfsop_ array entry. Reviewed by: eivind	1999-12-19 06:08:07 +00:00
Eivind Eklund	762e6b856c	Introduce NDFREE (and remove VOP_ABORTOP)	1999-12-15 23:02:35 +00:00
Poul-Henning Kamp	0429e37ade	struct mountlist and struct mount.mnt_list have no business being a CIRCLEQ. Change them to TAILQ_HEAD and TAILQ_ENTRY respectively. This removes ugly mp != (void*)&mountlist comparisons. Requested by: phk Submitted by: Jake Burkholder jake@checker.org PR: 14967	1999-11-20 10:00:46 +00:00
David E. O'Brien	594017f90d	Fix __asm__ clobber list abuse. Submitted by: bde	1999-11-15 23:16:06 +00:00
Eivind Eklund	dd8c04f4c7	Remove WILLRELE from VOP_SYMLINK Note: Previous commit to these files (except coda_vnops and devfs_vnops) that claimed to remove WILLRELE from VOP_RENAME actually removed it from VOP_MKNOD.	1999-11-13 20:58:17 +00:00
Eivind Eklund	edfe736df9	Remove WILLRELE from VOP_RENAME	1999-11-12 03:34:28 +00:00
Poul-Henning Kamp	698f9cf828	Next step in the device cleanup process. Correctly lock vnodes when calling VOP_OPEN() from filesystem mount code. Unify spec_open() for bdev and cdev cases. Remove the disabled bdev specific read/write code.	1999-11-09 14:15:33 +00:00
Bruce Evans	5bd5c8b9e5	Quick fix for breakage of ext2fs link counts as reported by stat(2) by the soft updates changes: only report the link count to be i_effnlink in ufs_getattr() for file systems that maintain i_effnlink. Tested by: Mike Dracopoulos <mdraco@math.uoa.gr>	1999-11-03 12:05:39 +00:00
Mike Smith	6d14782861	Newline-terminate the complaint message about not being able to find the root vnode pointer.	1999-11-01 23:57:28 +00:00
Poul-Henning Kamp	b89392e703	Remove the D_NOCLUSTER[RW] options which were added because vn had problems. Now that Matt has fixed vn, this can go. The vn driver should have used d_maxio (now si_iosize_max) anyway.	1999-09-30 07:11:30 +00:00
Poul-Henning Kamp	1b5464ef9d	Remove v_maxio from struct vnode. Replace it with mnt_iosize_max in struct mount. Nits from: bde	1999-09-29 20:05:33 +00:00
Matthew Dillon	67ddfcaf69	More removals of vnode->v_lastr, replaced by preexisting seqcount heuristic to detect sequential operation. VM-related forced clustering code removed from ufs in preparation for a commit to vm/vm_fault.c that does it more generally. Reviewed by: David Greenman <dg@root.com>, Alan Cox <alc@cs.rice.edu>	1999-09-20 23:27:58 +00:00
Poul-Henning Kamp	faad302913	Fix a harmless bug I introduced, simplify a bit more while here.	1999-09-20 21:14:43 +00:00
Poul-Henning Kamp	fae03f66d1	Step one of replacing devsw->d_maxio with si_bsize_max. Rename dev->si_bsize_max to si_iosize_max and set it in spec_open if the device didn't. Set vp->v_maxio from dev->si_bsize_max in spec_open rather than in ufs_bmap.c	1999-09-20 19:57:28 +00:00
Alfred Perlstein	c24fda81c9	Separate the export check in VFS_FHTOVP, exports are now checked via VFS_CHECKEXP. Add fh(open|stat|statfs) syscalls to allow userland to query filesystems based on (network) filehandle. Obtained from: NetBSD	1999-09-11 00:46:08 +00:00
Peter Wemm	c3aac50f28	$Id$ -> $FreeBSD$	1999-08-28 01:08:13 +00:00
Poul-Henning Kamp	41d2e3e09e	Introduce vn_isdisk(struct vnode *vp) function, and use it to test for diskness.	1999-08-25 12:24:39 +00:00
Bruce Evans	feb54dc506	Oops, the previous commit was missing a new include.	1999-08-23 22:05:49 +00:00
Bruce Evans	939cb7521a	Initialise fsids with (user) device numbers again. Bitrot when dev_t's were changed to pointers was obscured by casting dev_t's to longs. fsids haven't even been comprised of longs since the Lite2 merge.	1999-08-23 21:07:13 +00:00
Bruce Evans	d918320517	Use devtoname() to print dev_t's instead of casting them to long or u_long for misprinting in %lx format.	1999-08-23 20:35:21 +00:00
Poul-Henning Kamp	7dc5cd047f	The bdevsw() and cdevsw() are now identical, so kill the former.	1999-08-13 10:29:38 +00:00
Poul-Henning Kamp	0ef1c82630	Decommission miscfs/specfs/specdev.h. Most of it goes into <sys/conf.h>, a few lines into <sys/vnode.h>. Add a few fields to struct specinfo, paving the way for the fun part.	1999-08-08 18:43:05 +00:00
Bruce Evans	2ac6e74655	Don't set IN_ACCESS for requests to read 0 bytes or for unsuccessful reads. Translated from: similar fixes in ufs_readwrite.c rev.1.61. Things are simpler (but annoyingly different) here because there are no vm optimisations.	1999-07-25 02:56:17 +00:00
Kirk McKusick	4dc0c8f521	Create the macro DOINGASYNC to check whether the MNT_ASYNC flag has been set for a mount point. Insert missing checks to ensure that all write operations are done asynchronously when the MNT_ASYNC option has been requested. Submitted by: Craig A Soules <soules+@andrew.cmu.edu> Reviewed by: Kirk McKusick <mckusick@mckusick.com>	1999-07-13 18:20:13 +00:00
Kirk McKusick	67812eacd7	Convert buffer locking from using the B_BUSY and B_WANTED flags to using lockmgr locks. This commit should be functionally equivalent to the old semantics. That is, all buffer locking is done with LK_EXCLUSIVE requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will be done in future commits.	1999-06-26 02:47:16 +00:00
Kirk McKusick	f9c8cab591	Add a vnode argument to VOP_BWRITE to get rid of the last vnode operator special case. Delete special case code from vnode_if.sh, vnode_if.src, umap_vnops.c, and null_vnops.c.	1999-06-16 23:27:55 +00:00
Poul-Henning Kamp	2447bec829	Simplify cdevsw registration. The cdevsw_add() function now finds the major number(s) in the struct cdevsw passed to it. cdevsw_add_generic() is no longer needed, cdevsw_add() does the same thing. cdevsw_add() will print a message if the d_maj field looks bogus. Remove nblkdev and nchrdev variables. Most places they were used bogusly. Instead check a dev_t for validity by seeing if devsw() or bdevsw() returns NULL. Move bdevsw() and devsw() functions to kern/kern_conf.c Bump __FreeBSD_version to 400006 This commit removes: 72 bogus makedev() calls 26 bogus SYSINIT functions if_xe.c bogusly accessed cdevsw[], author/maintainer please fix. I4b and vinum not changed. Patches emailed to authors. LINT probably broken until they catch up.	1999-05-31 11:29:30 +00:00
Bruce Evans	22b6b1cd1e	Fixed printing of a dev_t in a panic message. Fixed the function name in this message.	1999-05-13 06:27:51 +00:00
Poul-Henning Kamp	4be2eb8c49	I got tired of seeing all the cdevsw[major(foo)] all over the place. Made a new (inline) function devsw(dev_t dev) and substituted it. Changed to the BDEV variant to this format as well: bdevsw(dev_t dev) DEVFS will eventually benefit from this change too.	1999-05-08 06:40:31 +00:00
Poul-Henning Kamp	46eede0058	Continue where Julian left off in July 1998: Virtualize bdevsw[] from cdevsw. bdevsw() is now an (inline) function. Join CDEV_MODULE and BDEV_MODULE to DEV_MODULE (please pay attention to the order of the cmaj/bmaj arguments!) Join CDEV_DRIVER_MODULE and BDEV_DRIVER_MODULE to DEV_DRIVER_MODULE (ditto!) (Next step will be to convert all bdev dev_t's to cdev dev_t's before they get to do any damage^H^H^H^H^H^Hwork in the kernel.)	1999-05-07 10:11:40 +00:00
Alan Cox	4221e284a3	The VFS/BIO subsystem contained a number of hacks in order to optimize piecemeal, middle-of-file writes for NFS. These hacks have caused no end of trouble, especially when combined with mmap(). I've removed them. Instead, NFS will issue a read-before-write to fully instantiate the struct buf containing the write. NFS does, however, optimize piecemeal appends to files. For most common file operations, you will not notice the difference. The sole remaining fragment in the VFS/BIO system is b_dirtyoff/end, which NFS uses to avoid cache coherency issues with read-merge-write style operations. NFS also optimizes the write-covers-entire-buffer case by avoiding the read-before-write. There is quite a bit of room for further optimization in these areas. The VM system marks pages fully-valid (AKA vm_page_t->valid = VM_PAGE_BITS_ALL) in several places, most notably in vm_fault. This is not correct operation. The vm_pager_get_pages() code is now responsible for marking VM pages all-valid. A number of VM helper routines have been added to aid in zeroing-out the invalid portions of a VM page prior to the page being marked all-valid. This operation is necessary to properly support mmap(). The zeroing occurs most often when dealing with file-EOF situations. Several bugs have been fixed in the NFS subsystem, including bits handling file and directory EOF situations and buf->b_flags consistency issues relating to clearing B_ERROR & B_INVAL, and handling B_DONE. getblk() and allocbuf() have been rewritten. B_CACHE operation is now formally defined in comments and more straightforward in implementation. B_CACHE for VMIO buffers is based on the validity of the backing store. B_CACHE for non-VMIO buffers is based simply on whether the buffer is B_INVAL or not (B_CACHE set if B_INVAL clear, and vice versa). biodone() is now responsible for setting B_CACHE when a successful read completes.
B_CACHE is also set when a bdwrite() is initiated and when a bwrite() is initiated. VFS VOP_BWRITE routines (there are only two - nfs_bwrite() and bwrite()) are now expected to set B_CACHE. This means that bowrite() and bawrite() also set B_CACHE indirectly. There are a number of places in the code which were previously using buf->b_bufsize (which is DEV_BSIZE aligned) when they should have been using buf->b_bcount. These have been fixed. getblk() now clears B_DONE on return because the rest of the system is so bad about dealing with B_DONE. Major fixes to NFS/TCP have been made. A server-side bug could cause requests to be lost by the server due to nfs_realign() overwriting other rpc's in the same TCP mbuf chain. The server's kernel must be recompiled to get the benefit of the fixes. Submitted by: Matthew Dillon <dillon@apollo.backplane.com>	1999-05-02 23:57:16 +00:00
Poul-Henning Kamp	75c1354190	This implements the mumbled about "Jail" feature. This is a seriously beefed up chroot kind of thing. The process is jailed along the same lines as a chroot does it, but with additional tough restrictions imposed on what the superuser can do. For all I know, it is safe to hand over the root bit inside a prison to the customer living in that prison, this is what it was developed for in fact: "real virtual servers". Each prison has an ip number associated with it, which all IP communications will be coerced to use and each prison has its own hostname. Needless to say, you need more RAM this way, but the advantage is that each customer can run their own particular version of apache and not stomp on the toes of their neighbors. It generally does what one would expect, but setting up a jail still takes a little knowledge. A few notes: I have no scripts for setting up a jail, don't ask me for them. The IP number should be an alias on one of the interfaces. mount a /proc in each jail, it will make ps more usable. /proc/<pid>/status tells the hostname of the prison for jailed processes. Quotas are only sensible if you have a mountpoint per prison. There are no provisions for stopping resource-hogging. Some "#ifdef INET" and similar may be missing (send patches!) If somebody wants to take it from here and develop it into more of a "virtual machine" they should be most welcome! Tools, comments, patches & documentation most welcome. Have fun... Sponsored by: http://www.rndassociates.com/ Run for almost a year by: http://www.servetheweb.com/	1999-04-28 11:38:52 +00:00
Poul-Henning Kamp	f711d546d2	Suser() simplification: 1: s/suser/suser_xxx/ 2: Add new function: suser(struct proc *), prototyped in <sys/proc.h>. 3: s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/ The remaining suser_xxx() calls will be scrutinized and dealt with later. There may be some unneeded #include <sys/cred.h>, but they are left as an exercise for Bruce. More changes to the suser() API will come along with the "jail" code.	1999-04-27 11:18:52 +00:00
Bruce Evans	44f332052d	Don't depend on <ufs/ufs/quota.h> or another (old) prerequisite including <sys/queue.h>. This fixes my recent breakage of biosboot by unpolluting <ufs/ufs/quota.h> in the !KERNEL case.	1999-03-06 05:21:09 +00:00
Warner Losh	5369eb85ca	Merge patch to ufs_vnops.c's ufs_rename to the copy of ufs_rename that lives in ext2_vnops.c for ext2fs. Also remove cast from comparison. Bruce pointed out that it was bogus since we'd force a signed comparison when we really wanted an unsigned comparison.	1999-03-02 05:31:47 +00:00
Bruce Evans	a5c9bce777	Added a used #include (don't depend on "vnode_if.h" including <sys/buf.h>).	1999-02-25 15:54:06 +00:00
Bruce Evans	ae4d334421	Fixed parenthesization botch in previous commit. Async update of inodes was broken.	1999-01-29 15:36:05 +00:00
Matthew Dillon	8aef171243	Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile	1999-01-28 00:57:57 +00:00
Matthew Dillon	fe08c21a53	Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile. This commit includes significant work to properly handle const arguments for the DDB symbol routines.	1999-01-27 23:45:44 +00:00
Matthew Dillon	d254af07a1	Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile	1999-01-27 21:50:00 +00:00
Eivind Eklund	65c0c7b08e	Avoid warning for unused variable.	1999-01-11 23:32:35 +00:00
Bruce Evans	de5d1ba57c	Don't pass unused timestamp args to UFS_UPDATE() or waste time initializing them. This almost finishes centralizing (in-core) timestamp updates in ufs_itimes().	1999-01-07 16:14:19 +00:00
Bruce Evans	4591d9bb7e	UFS_UPDATE() takes a boolean `waitfor' arg, so don't pass it the value MNT_WAIT when we mean boolean `true' or check for that value not being passed. There was no problem in practice because MNT_WAIT had the magic value of 1.	1999-01-06 18:18:06 +00:00
Archie Cobbs	f1d19042b0	The "easy" fixes for compiling the kernel -Wunused: remove unreferenced static and local variables, goto labels, and functions declared but not defined.	1998-12-07 21:58:50 +00:00
Bruce Evans	b54e74eb87	Fixed a misspelling of boolean true as MNT_WAIT.	1998-11-15 15:46:33 +00:00
Peter Wemm	40c8cfe552	Use TAILQ macros for clean/dirty block list processing. Set b_xflags rather than abusing the list next pointer with a magic number.	1998-10-31 15:31:29 +00:00
Peter Wemm	91ecc00e71	error return assignment was less than ideal. Fix the part that caused warnings to be the same as the ffs code. Previously, any error from the UFS_UPDATE() call was lost (I think).	1998-10-29 09:44:12 +00:00
Peter Wemm	f6020599aa	Use vtruncbuf() to clean out cached blocks on a file shorten rather than the more expensive vinvalbuf(), based on the FFS version of the same routine. I don't have any ext2fs filesystems to test this on.	1998-10-29 09:30:52 +00:00
Bruce Evans	b5ee16407f	Oops, the redundant tests for major numbers weren't redundant here. They checked for the magic major number for the "device" behind mfs mount points. Use a more obvious check for this device. Debugged by: Andrew Gallatin <gallatin@cs.duke.edu>	1998-10-27 11:47:08 +00:00
Bruce Evans	569555b969	Removed redundant bitrotted checks for major numbers instead of updating them.	1998-10-26 08:53:13 +00:00
Bruce Evans	65baf8f06b	Don't follow null bdevsw pointers. The `major(dev) < nblkdev' test rotted when bdevsw[] became sparse. We still depend on magic to avoid having to check that (v_rdev) device numbers in vnodes are not NODEV.	1998-10-25 19:26:18 +00:00
Bruce Evans	d2165c2f7d	Fixed bloatage of `struct inode'. We used 5 "spare" fields for ext2fs, but when i_effnlink was added to support soft updates, there was only room for 4 spares. The number of spares was not reduced, so the inode size became 260 (on i386's), or 512 after rounding up by malloc(). Use one spare field in `struct dinode' instead of the 5th spare field in the inode and reduced to 4 spares in the inode so that the size is 256 again. Changed the types of the spares in the inode from int to u_int32_t so that the inode size has more chance of being <= 256 under other arches, and downdated ext2fs to match (it was broken to use ints before rev.1.1).	1998-10-13 15:45:43 +00:00
Bruce Evans	8f359bc68c	Quick fix for not being able to sync all the buffers in boot() if an ext2fs file system is mounted. The soft update changes added a check for B_DELWRI buffers. This exposed the complete brokenness of the previous quick fix for failing syncs (PR 3571, committed on 1997/08/04). Use a new buffer flag B_DIRTY and don't abuse B_DELWRI. B_DIRTY buffers are still written too late, as broken in the previous fix. This is fairly harmless, because B_DIRTY is only used for bitmap buffers and fsck.ext2 can fix up the bitmaps perfectly. Fixed a race in ULCK_BUF() (bremfree() was outside of the splbio() section).	1998-10-03 16:19:28 +00:00
Bruce Evans	9702cd0422	Fixed initialization of new inodes. ext2fs doesn't clear inodes when they are deleted, so inodes must be cleared when they are reused, but we didn't clear the indirect blocks. This caused serious filesystem corruption.	1998-09-29 08:07:32 +00:00
Bruce Evans	6674be30f6	Updated ext2_reload() and ext2_sync(). Locking was broken, and MNT_LAZY syncs weren't optimized properly (they probably still aren't, but are bug for bug compatible with ffs). These fixes are mostly academic, since ext2fs is too broken to mount read-write (it apparently doesn't clear indirect blocks). Obtained from: mostly from Lite2	1998-09-26 12:42:17 +00:00
Bruce Evans	7cff8977ca	Fixed missing newlines in messages in ext2_check_descriptors(). Fixed vnode and memory leaks after an unlikely (?) error in ext2_mountfs(). Fixed an unconditional memory leak in ext2_unmount().	1998-09-26 07:16:41 +00:00
Bruce Evans	a094db128f	Fixed clean flag handling: Fixes for bugs not shared with ffs: - don't mount unclean filesystems rw unless forced to. - accept EXT2_ERROR_FS (treat it like !EXT2_VALID_FS). We still don't set this or honour the maximal mount count. - don't attempt to print the name of the mount point when mounting an unclean file system, since the name of the previous mount point is unknown and the name of the current mount point is still "". Fixes for bugs shared with ffs until recently: - don't set the clean flag on unmount of an initially-unclean filesystem that was (forcibly) mounted rw. - set the clean flag on rw -> ro update of a mounted initially-clean filesystem. - fixed some style bugs (mostly long lines). The fixes are slightly simpler than for ffs, because the relevant on-disk state is not a simple boolean variable, and the superblock has a core-only extension. Obtained from: parts from ffs_vfsops.c, parts from NetBSD	1998-09-26 06:18:59 +00:00
Bruce Evans	5d207357db	Fixed the usual missing permissions checks in mount(). As for cd9660, the damage was limited by the default of 0 for vfs.usermount. Obtained from: Lite2 via the -current ffs_vfsops.c	1998-09-09 20:21:18 +00:00
Bruce Evans	05d46b3cd6	Don't forget to initialize the inode lock. This bug caused surprisingly few problems. Most fields were initialized to the correct values by bzero(), but lk_prio was 0 instead of PINOD (=8), the lk_wmsg was NULL instead of "ext2in", and lk_lockholder was 0 instead of -1. Obtained from: Lite2 via the -current ffs_vfsops.c	1998-09-09 13:09:24 +00:00
Bruce Evans	d6c54caabe	Support compiling with `gcc -pedantic' (don't use hard newlines in (asm) string constants).	1998-09-09 12:22:17 +00:00
Bruce Evans	8994ca3ce9	Removed statically configured mount type numbers (MOUNT_*) and all references to them. The change a couple of days ago to ignore these numbers in statically configured vfsconf structs was slightly premature because the cd9660, cfs, devfs, ext2fs, nfs vfs's still used MOUNT_* instead of the number in their vfsconf struct.	1998-09-07 13:17:06 +00:00
Bruce Evans	1874ef935c	Quick fix for breakage of read clustering on non-IDE drives. Read clustering is obsolescent technology so hardly anyone noticed. On a DORS 32160 SCSI drive with 4 tags, read clustering makes very little difference even for huge sequential reads. However, on a ZIP SCSI drive with 0 tags, the minimum overhead per block is about 40 msec, so very large clusters must be used to get anywhere near the maximum transfer rate. Using clusters consisting of 1 8K block reduces the transfer rate to about 250K/sec. Under msdosfs, missing read clustering is normal and a cluster size of 1 512 byte block reduces the transfer rate to about 25K/sec. Broken in: rev.1.18	1998-08-18 03:54:39 +00:00
Mike Smith	f01beb610a	"The releasing of the reference and lock is not temporary and belongs where it is. The reference and lock(s) are acquired just above the code in VREF() and relookup()." Submitted by: Michael Hancock <michaelh@cet.co.jp>	1998-08-12 21:42:54 +00:00
Bruce Evans	85badd7eba	Fixed printf format errors.	1998-07-30 17:12:39 +00:00
Julian Elischer	49cc016a39	add anti-panic workaround from chris radek (cradek@in221.inetnebr.com) Not sure why this is needed but it does stop crashes.	1998-07-30 03:22:52 +00:00
Bruce Evans	ac1e407b32	Fixed printf format errors.	1998-07-11 07:46:16 +00:00
Julian Elischer	6deaf84b1f	Catch a few corner cases where FreeBSD differs enough from BSD 4.4 to confuse Soft updates.. Should solve several "dangling deps" panics.	1998-07-08 01:04:33 +00:00
Julian Elischer	fd5d1124e2	VOP_STRATEGY grows an (struct vnode *) argument as the value in b_vp is often not really what you want. (and needs to be frobbed). more cleanups will follow this. Reviewed by: Bruce Evans <bde@freebsd.org>	1998-07-04 20:45:42 +00:00
Bruce Evans	3055187290	Sync timestamp changes for inodes of special files to disk as late as possible (when the inode is reclaimed). Temporarily only do this if option UFS_LAZYMOD configured and softupdates aren't enabled. UFS_LAZYMOD is intentionally left out of /sys/conf/options. This is mainly to avoid almost useless disk i/o on battery powered machines. It's silly to write to disk (on the next sync or when the inode becomes inactive) just because someone hit a key or something wrote to the screen or /dev/null. PR: 5577 Previous version reviewed by: phk	1998-07-03 22:17:03 +00:00
Bruce Evans	33cc029eab	Centralized in-core inode update. Update the in-core inode directly in ufs_setattr() so that there is no need to pass timestamps to UFS_UPDATE() (everything else just needs the current time). Ignore the passed-in timestamps in UFS_UPDATE() and always call ufs_itimes() (was: itimes()) to do the update. The timestamps are still passed so that all the callers don't need to be changed yet.	1998-07-03 18:46:52 +00:00
Bruce Evans	add4ae9324	Fixed (?) races in mark_buffer_dirty(). We abuse the buffer cache by hacking on locked buffers without getblk()ing them, and we didn't even use splbio() to prevent biodone() changing the buffer underneath us when a write completes. I think there was no problem in practice on i386's because the operations on b_flags and numdirtybufs happen to be atomic. We still depend on biodone()'s operations on b_flags not interfering with ours. I think there is only interference for B_ERROR, and this is harmless because errors for async writes are ignored anyway. Don't use mark_buffer_dirty() except for superblock-related metadata. It was used in just one case where ordinary BSD buffering is more natural.	1998-06-21 21:06:04 +00:00
Bruce Evans	9b7a8fb7d8	Removed unused function ll_w_block(). It has always had races due to not using splbio(), and has rotted a little. The races were probably harmless in practice because this function was only used for superblock updates, and separate superblock updates are probably prevented from running into each other by doing part of the update synchronously.	1998-06-21 19:56:31 +00:00
Bruce Evans	be160d60ab	Removed unused includes.	1998-06-21 18:02:50 +00:00
Bruce Evans	e5b19842ef	Removed unused includes.	1998-06-21 14:53:44 +00:00
Bruce Evans	4344f492c4	Added a missing options include.	1998-06-21 12:36:12 +00:00
Bruce Evans	dae50f6c50	Don't use "ffs" in an ext2fs sleep message string. Don't forget to clear the inode hash lock before returning from ext2_vget() after getnewvnode() fails. Obtained from: rev.1.24 of ffs_vfsops.c (the original patch for the getnewvnode() race). Forgotten in: rev.1.4 here. Removed a duplicate comment. Duplicated in: rev.1.4 here. Fixed the MALLOC() vs getnewvnode() race in ext2_vget(). Obtained from: rev.1.39 of ffs_vfsops.c.	1998-05-16 17:47:44 +00:00
Bruce Evans	2c8838fec4	Abbreviate "ext2fs_fsync" as "e2fsyn" instead of as "extfsn" in a sleep message string.	1998-05-16 16:52:20 +00:00
Mike Smith	7be2d30077	In the words of the submitter: --------- Make callers of namei() responsible for releasing references or locks instead of having the underlying filesystems do it. This eliminates redundancy in all terminal filesystems and makes it possible for stacked transport layers such as umapfs or nullfs to operate correctly. Quality testing was done with testvn, and lat_fs from the lmbench suite. Some NFS client testing courtesy of Patrik Kudo. vop_mknod and vop_symlink still release the returned vpp. vop_rename still releases 4 vnode arguments before it returns. These remaining cases will be corrected in the next set of patches. --------- Submitted by: Michael Hancock <michaelh@cet.co.jp>	1998-05-07 04:58:58 +00:00
Mike Smith	79cc756d8b	As described by the submitter: Reverse the VFS_VRELE patch. Reference counting of vnodes does not need to be done per-fs. I noticed this while fixing vfs layering violations. Doing reference counting in generic code is also the preference cited by John Heidemann in recent discussions with him. The implementation of alternative vnode management per-fs is still a valid requirement for some filesystems but will be revisited sometime later, most likely using a different framework. Submitted by: Michael Hancock <michaelh@cet.co.jp>	1998-05-06 05:29:41 +00:00
Dag-Erling Smørgrav	dc73342347	Seventy-odd "its" / "it's" typos in comments fixed as per kern/6108.	1998-04-17 22:37:19 +00:00
Bruce Evans	c1087c1324	Support compiling with `gcc -ansi'.	1998-04-15 17:47:40 +00:00
Poul-Henning Kamp	227ee8a188	Eradicate the variable "time" from the kernel, using various measures. "time" wasn't an atomic variable, so splfoo() protection was needed around any access to it, unless you just wanted the seconds part. Most uses of time.tv_sec now use the new variable time_second instead. gettime() changed to getmicrotime(). Remove a couple of unneeded splfoo() protections, the new getmicrotime() is atomic, (until Bruce sets a breakpoint in it). A couple of places needed random data, so use read_random() instead of mucking about with time which isn't random. Add a new nfs_curusec() function. Mark a couple of bogosities involving the now disappeared time variable. Update ffs_update() to avoid the weird "== &time" checks, by fixing the one remaining call that passed &time as args. Change profiling in ncr.c to use ticks instead of time. Resolution is the same. Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call hzto() which subtracts time" sequences. Reviewed by: bde	1998-03-30 09:56:58 +00:00
Bruce Evans	08637435f2	Moved some #includes from <sys/param.h> nearer to where they are actually used.	1998-03-28 10:33:27 +00:00
Poul-Henning Kamp	a0502b19d4	Add two new functions, get{micro|nano}time. They are atomic, but return in essence what is in the "time" variable. gettime() is now a macro front for getmicrotime(). Various patches to use the two new functions instead of the various hacks used in their absence. Some punctuation and grammar patches from Bruce. A couple of XXX comments.	1998-03-26 20:54:05 +00:00
Eivind Eklund	3bfd185367	Make this compile after soft updates integration. LINTing forgotten by: julian	1998-03-09 14:46:57 +00:00
Julian Elischer	b1897c197c	Reviewed by: dyson@freebsd.org (john Dyson), dg@root.com (david greenman) Submitted by: Kirk McKusick (mcKusick@mckusick.com) Obtained from: WHistle development tree	1998-03-08 09:59:44 +00:00
Mike Smith	34bdbbd0de	The intent is to get rid of WILLRELE in vnode_if.src by making a complement to all ops that return a vpp, VFS_VRELE. This is initially only for file systems that implement the following ops that do a WILLRELE: vop_create, vop_whiteout, vop_mknod, vop_remove, vop_link, vop_rename, vop_mkdir, vop_rmdir, vop_symlink This is initial DNA that doesn't do anything yet. VFS_VRELE is implemented but not called. A default vfs_vrele was created for fs implementations that use the standard vnode management routines. VFS_VRELE implementations were made for the following file systems: Standard (vfs_vrele) ffs mfs nfs msdosfs devfs ext2fs Custom union umapfs Just EOPNOTSUPP fdesc procfs kernfs portal cd9660 These implementations may change as VOP changes are implemented. In the next phase, in the vop implementations calls to vrele and the vrele part of vput will be moved to the top layer vfs_vnops and made visible to all layers. vput will be replaced by unlock in these cases. Unlocking will still be done in the per fs layer but the refcount decrement will be triggered at the top because it doesn't hurt to hold a vnode reference a little longer. This will have minimal impact on the structure of the existing code. This will only be done for vnode arguments that are released by the various fs vop implementations. Wider use of VFS_VRELE will likely require restructuring of the code. Reviewed by: phk, dyson, terry et. al. Submitted by: Michael Hancock <michaelh@cet.co.jp>	1998-03-01 22:46:53 +00:00
Mike Smith	1ee98f0885	Style nits and staticism with the previous commit. Submitted by: bde	1998-03-01 01:37:38 +00:00
Mike Smith	b1f04c95e1	Add local stub putpages/getpages routines. Submitted by: Terry Lambert <terry@freebsd.org>	1998-03-01 00:51:43 +00:00
Bruce Evans	5858ada877	Fixed configuration and linkage of ext2_checkoverlap().	1998-02-13 00:28:40 +00:00
Eivind Eklund	0b08f5f737	Back out DIAGNOSTIC changes.	1998-02-06 12:14:30 +00:00
Eivind Eklund	47cfdb166d	Turn DIAGNOSTIC into a new-style option.	1998-02-04 22:34:03 +00:00
Eivind Eklund	a30e742145	Make LINT at least compile. This faithfully duplicate the changes done to ufs/ufs/ufs_vnops.c for the same problem, but I don't know if that will actually make SUIDDIR work for ext2fs.	1998-02-04 01:16:03 +00:00
Poul-Henning Kamp	c5b193bfba	Retire LFS. If you want to play with it, you can find the final version of the code in the repository under the tag LFS_RETIREMENT. If somebody makes LFS work again, adding it back is certainly desirable, but as it is now nobody seems to care much about it, and it has suffered considerable bitrot since its somewhat haphazard integration. R.I.P	1998-01-30 11:34:06 +00:00
John Dyson	50ce7ff499	Add better support for larger I/O clusters, including larger physical I/O. The support is not mature yet, and some of the underlying implementation needs help. However, support does exist for IDE devices now.	1998-01-24 02:01:46 +00:00
Bruce Evans	675ea6f083	Unspammed nested include of <vm/vm_zone.h>.	1997-12-27 02:56:39 +00:00
Eivind Eklund	8c13c35718	Convert SUIDDIR fully to a new-style option. Forgotten by: julian	1997-12-15 21:51:45 +00:00
Bruce Evans	1cd52ec333	Don't include <sys/lock.h> in headers when only `struct simplelock' is required. Fixed everything that depended on the pollution.	1997-12-05 19:55:52 +00:00
Jordan K. Hubbard	8cf27db018	Needs to include <sys/lock.h> if we're using struct lock.	1997-12-05 13:43:47 +00:00
Bruce Evans	0f1dddfb0c	Fixed corruption of the per-group used directories count. It wasn't decremented when directories were removed because rev.1.12 broke the fixup of the i_mode of the inode being removed.	1997-12-03 16:46:21 +00:00
Poul-Henning Kamp	70387fe11d	Fix the copyright and attribution on this file. I forgot this when the file was cloned.	1997-12-02 21:20:06 +00:00
Bruce Evans	1dd78fb7ef	Use the same algorithm as ffs for generation numbers.	1997-12-02 11:42:28 +00:00
Bruce Evans	2f169e4b76	Removed __FreeBSD__ ifdefs.	1997-12-02 10:39:42 +00:00
Bruce Evans	93146306a2	Fixed missing #include of "opt_quota.h". Sorted the functions into the same order as in ufs_vnops.c so that this can be compared with the latter without getting 2627 lines of diffs. Now we get only 1920 lines of diffs.	1997-11-24 19:25:24 +00:00
Bruce Evans	5b76055a53	Fixed overflow in ufs_getlbns(). For ufs on systems with 32-bit ints, triple indirect blocks only worked for block sizes of 4K, since MNINDIR(ump)**3 overflows for larger block sizes (e.g., (8192/4)**3 = 2**33 > INT_MAX). This fix is not the obvious one of changing some types to 64 bits. It rearranges the code to avoid some unnecessary 64-bit calculations. Reviewed by: Kirk McKusick <mckusick@McKusick.COM>	1997-11-24 16:33:03 +00:00
Bruce Evans	ff0618391a	Use consistent description strings for M_EXT2NODE. This also fixes a spelling error in the unused string.	1997-11-20 16:56:25 +00:00
Poul-Henning Kamp	0930eb3012	Give ext2fs its own VOP_REMOVE, VOP_LINK, VOP_RENAME, VOP_MKDIR, VOP_RMDIR, VOP_CREATE, VOP_MKNOD, VOP_SYMLINK and ext2_makeinode().	1997-11-18 14:19:44 +00:00
Julian Elischer	b1f4a44b03	Reviewed by: various. Ever since I first saw the way the mount flags were used I've hated the fact that modes, and events, internal and exported, and short-term and long term flags are all thrown together. Finally it's annoyed me enough.. This patch to the entire FreeBSD tree adds a second mount flag word to the mount struct. it is not exported to userspace. I have moved some of the non exported flags over to this word. this means that we now have 8 free bits in the mount flags. There are another two that might well move over, but which I'm not sure about. The only user visible change would have been in pstat -v, except that davidg has disabled it anyhow. I'd still like to move the state flags and the 'command' flags apart from each other.. e.g. MNT_FORCE really doesn't have the same semantics as MNT_RDONLY, but that's left for another day.	1997-11-12 05:42:33 +00:00
Bruce Evans	ef91bd5734	Removed unused #includes. The need for most of them went away with recent changes (docluster* and vfs improvements).	1997-10-27 13:33:47 +00:00
Poul-Henning Kamp	82c5d0395d	I guess nobody uses ext2fs in current? vop_lookup is back now, don't know when I lost it.	1997-10-26 21:05:40 +00:00
Poul-Henning Kamp	d54d34b533	Make a set of VOP standard lock, unlock & islocked VOP operators, which depend on the lock being located at vp->v_data. Saves 3x3 identical vop procs, more as the other filesystems becomes lock aware.	1997-10-17 12:36:19 +00:00
Poul-Henning Kamp	987f569678	Another VFS cleanup "kilo commit" 1. Remove VOP_UPDATE, it is (also) an UFS/{FFS,LFS,EXT2FS,MFS} interface function, and now lives in the ufsmount structure. 2. Remove VOP_SEEK, it was unused. 3. Add mode default vops: VOP_ADVLOCK vop_einval VOP_CLOSE vop_null VOP_FSYNC vop_null VOP_IOCTL vop_enotty VOP_MMAP vop_einval VOP_OPEN vop_null VOP_PATHCONF vop_einval VOP_READLINK vop_einval VOP_REALLOCBLKS vop_eopnotsupp And remove identical functionality from filesystems 4. Add vop_stdpathconf, which returns the canonical stuff. Use it in the filesystems. (XXX: It's probably wrong that specfs and fifofs sets this vop, shouldn't it come from the "host" filesystem, for instance ufs or cd9660 ?) 5. Try to make system wide VOP functions have vop_* names. 6. Initialize the um_* vectors in LFS. (Recompile your LKMS!!!)	1997-10-16 20:32:40 +00:00
Poul-Henning Kamp	cec0f20ce7	VFS mega cleanup commit (x/N) 1. Add new file "sys/kern/vfs_default.c" where default actions for VOPs go. Implement proper defaults for ABORTOP, BWRITE, LEASE, POLL, REVOKE and STRATEGY. Various stuff spread over the entire tree belongs here. 2. Change VOP_BLKATOFF to a normal function in cd9660. 3. Kill VOP_BLKATOFF, VOP_TRUNCATE, VOP_VFREE, VOP_VALLOC. These are private interface functions between UFS and the underlying storage manager layer (FFS/LFS/MFS/EXT2FS). The functions now live in struct ufsmount instead. 4. Remove a kludge of VOP_ functions in all filesystems, that did nothing but obscure the simplicity and break the expandability. If a filesystem doesn't implement VOP_FOO, it shouldn't have an entry for it in its vnops table. The system will try to DTRT if it is not implemented. There are still some cruft left, but the bulk of it is done. 5. Fix another VCALL in vfs_cache.c (thanks Bruce!)	1997-10-16 10:50:27 +00:00
Julian Elischer	7d1f0a2825	Two more places where root filesystems were mounted, put them at the head of the mount list in case there is already DEVFS present.	1997-10-16 08:16:34 +00:00
Poul-Henning Kamp	138ec1f71a	vnops megacommit 1. Use the default function to access all the specfs operations. 2. Use the default function to access all the fifofs operations. 3. Use the default function to access all the ufs operations. 4. Fix VCALL usage in vfs_cache.c 5. Use VOCALL to access specfs functions in devfs_vnops.c 6. Staticize most of the spec and fifofs vnops functions. 7. Make UFS panic if it lacks bits of the underlying storage handling.	1997-10-15 13:24:07 +00:00
Poul-Henning Kamp	6a525123aa	Hmm, realign the vnops into two columns.	1997-10-15 10:05:29 +00:00
Poul-Henning Kamp	539ef70c2d	Stylistic overhaul of vnops tables. 1. Remove comment stating the blatantly obvious. 2. Align in two columns. 3. Sort all but the default element alphabetically. 4. Remove XXX comments pointing out entries not needed.	1997-10-15 09:22:02 +00:00
Poul-Henning Kamp	40715905a7	I think my previous change may have opened a race condition. This patch does the same thing, with no change in semantics.	1997-10-14 18:46:48 +00:00
Poul-Henning Kamp	34a6a33036	ufs_ihashrem() should not be called from the UFS layer, but from the lower layer (LFS/FFS/?) like the rest of the ihash functions. Otherwise it is impossible to make a lower layer that doesn't use the ihash facility.	1997-10-14 14:22:31 +00:00
Poul-Henning Kamp	a1c995b626	Last major round (Unless Bruce thinks of something :-) of malloc changes. Distribute all but the most fundamental malloc types. This time I also remembered the trick to making things static: Put "static" in front of them. A couple of finer points by: bde	1997-10-12 20:26:33 +00:00
Poul-Henning Kamp	55166637cd	Distribute and staticize a lot of the malloc M_* types. Substantial input from: bde	1997-10-11 18:31:40 +00:00
Poul-Henning Kamp	2cfc47fbc8	Make ufs_reclaim free the underlying inode.	1997-10-10 18:18:13 +00:00
Poul-Henning Kamp	631821df68	Mega commit to cleanup the "remaining nits" after my malloc change. Introduce a M_EXT2NODE for ext2fs vnodes. Use generic ufs_reclaim instead of hijacking ffs_reclaim.	1997-10-10 18:13:06 +00:00
Bruce Evans	dab8d6e4e7	`numdirtybuffers' was not maintained properly. This caused excessive flushing of buffers in an attempt to reduce numdirtybuffers, and perhaps other problems.	1997-10-07 11:10:18 +00:00
KATO Takenori	7825620c11	Oops, include <sys/conf.h>. Reminded-by: Simon Shapiro <Shimon@i-Connect.Net>	1997-09-28 02:23:10 +00:00
KATO Takenori	81bca6ddae	Clustered read and write are switched at mount-option level. 1. Clustered I/O is switched by the MNT_NOCLUSTERR and MNT_NOCLUSTERW bits of the mnt_flag. The sysctl variables, vfs.foo.doclusterread and vfs.foo.doclusterwrite are deleted. Only mount option can control clustered I/O from userland. 2. When foofs_mount mounts block device, foofs_mount checks D_CLUSTERR and D_CLUSTERW bits of the d_flags member in the block device switch table. If D_NOCLUSTERR / D_NOCLUSTERW are set, MNT_NOCLUSTERR / MNT_NOCLUSTERW bits will be set. In this case, MNT_NOCLUSTERR and MNT_NOCLUSTERW cannot be cleared from userland. 3. Vnode driver disables both clustered read and write. 4. Union filesystem disables clustered write. Reviewed by: bde	1997-09-27 13:40:20 +00:00
Joerg Wunsch	6cce995019	Make MFS a supported option, finally.	1997-09-22 21:24:03 +00:00
Peter Wemm	a6aeade2c4	Convert select -> poll. Delete 'always succeed' select/poll handlers, replaced with generic call. Flag missing vnode op table entries.	1997-09-14 02:58:12 +00:00

415 Commits