freebsd-skq

Author	SHA1	Message	Date
jhb	4806d88677	Change the kernel's ucred API as follows: - crhold() returns a reference to the ucred whose refcount it bumps. - crcopy() now simply copies the credentials from one credential to another and has no return value. - a new crshared() primitive is added which returns true if a ucred's refcount is > 1 and false (0) otherwise.	2001-10-11 23:38:17 +00:00
rwatson	c9c82b43c3	o Modify generic specfs device open access control checks to use securelevel_ge() instead of direct securelevel variable checks. Obtained from: TrustedBSD Project	2001-09-26 20:18:26 +00:00
julian	5596676e6c	KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
dillon	e028603b7e	With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.	2001-07-04 16:20:28 +00:00
jhb	67d46f1c41	Don't acquire/release Giant around some of the places that need it in spec_getpages(). Instead, assert that Giant is held by the caller.	2001-05-23 22:20:29 +00:00
alfred	a3f0842419	Introduce a global lock for the vm subsystem (vm_mtx). vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb	2001-05-19 01:28:09 +00:00
bde	e7467b0a7b	Backed out previous commit. It cause massive filesystem corruption, not to mention a compile-time warning about the critical function becoming unused, by replacing spec_bmap() with vop_stdbmap(). ntfs seems to have the same bug. The factor for converting specfs block numbers to physical block numbers is 1, but vop_stdbmap() uses the bogus factor btodb(ap->a_vp->v_mount->mnt_stat.f_iosize), which is 16 for ffs with the default block size of 8K. This factor is bogus even for vop_stdbmap() -- the correct factor is related to the filesystem blocksize which is not necessarily the same to the optimal i/o size. vop_stdbmap() was apparently cloned from nfs where these sizes happen to be the same. There may also be a problem with a_vp->v_mount being null. spec_bmap() still checks for this, but I think the checks in specfs are dead code which used to support block devices.	2001-04-30 14:35:35 +00:00
phk	608c1caf3b	Add a vop_stdbmap(), and make it part of the default vop vector. Make 7 filesystems which don't really know about VOP_BMAP rely on the default vector, rather than more or less complete local vop_nopbmap() implementations.	2001-04-29 11:48:41 +00:00
grog	4b9d9cbaac	Revert consequences of changes to mount.h, part 2. Requested by: bde	2001-04-29 02:45:39 +00:00
grog	1f5de30718	Correct #includes to work with fixed sys/mount.h.	2001-04-23 09:05:15 +00:00
mckusick	61db3f4296	Fixes to track snapshot copy-on-write checking in the specinfo structure rather than assuming that the device vnode would reside in the FFS filesystem (which is obviously a broken assumption with the device filesystem).	2001-03-07 07:09:55 +00:00
jlemon	11781a7431	Extend kqueue down to the device layer. Backwards compatible approach suggested by: peter	2001-02-15 16:34:11 +00:00
phk	709379c1ae	Another round of the <sys/queue.h> FOREACH transmogriffer. Created with: sed(1) Reviewed by: md5(1)	2001-02-04 16:08:18 +00:00
phk	3ed24cd17e	Add a BUF_KERNPROC() in the BIO_DELETE path. This seems to fix the problem which md(4) backed filesystems exposed.	2001-01-30 10:06:08 +00:00
dillon	11fb1bf637	This patch reestablishes the spec_fsync() guarentee that synchronous fsyncs, which typically occur during unmounting, will drain all dirty buffers even if it takes multiple passes to do so. The guarentee was mangled by the last patch which solved a problem due to -current disabling interrupts while holding giant (which caused an infinite spin loop waiting for I/O to complete). -stable does not have either patch, but has a similar bug in the original spec_fsync() code which is triggered by a bug in the softupdates umount code, a fix for which will be committed to -current as soon as Kirk stamps it. Then both solutions will be MFC'd to -stable. -stable currently suffers from a combination of the softupdates bug and a small window of opportunity in the original spec_fsync() code, and -stable also suffers from the spin-loop bug but since interrupts are enabled the spin resolves itself in a few milliseconds.	2001-01-29 08:19:28 +00:00
dillon	41fd6873a8	Fix a lockup problem that occurs with 'cvs update'. specfs's fsync can get into the same sort of infinite loop that ffs's fsync used to get into, probably due to background bitmap writes. The solution is the same.	2000-12-30 23:32:24 +00:00
dillon	fd223545d4	This implements a better launder limiting solution. There was a solution in 4.2-REL which I ripped out in -stable and -current when implementing the low-memory handling solution. However, maxlaunder turns out to be the saving grace in certain very heavily loaded systems (e.g. newsreader box). The new algorithm limits the number of pages laundered in the first pageout daemon pass. If that is not sufficient then suceessive will be run without any limit. Write I/O is now pipelined using two sysctls, vfs.lorunningspace and vfs.hirunningspace. This prevents excessive buffered writes in the disk queues which cause long (multi-second) delays for reads. It leads to more stable (less jerky) and generally faster I/O streaming to disk by allowing required read ops (e.g. for indirect blocks and such) to occur without interrupting the write stream, amoung other things. NOTE: eventually, filesystem write I/O pipelining needs to be done on a per-device basis. At the moment it is globalized.	2000-12-26 19:41:38 +00:00
phk	4e063f5534	Take VBLK devices further out of their missery. This should fix the panic I introduced in my previous commit on this topic.	2000-11-02 21:14:13 +00:00
eivind	4a39f454a0	Blow away the v_specmountpoint define, replacing it with what it was defined as (rdev->si_mountpoint)	2000-10-09 17:31:39 +00:00
phk	ec761116e2	Fix panic when removing open device (found by bp@) Implement subdirs. Build the full "devicename" for cloning functions. Fix panic when deleted device goes away. Collaps devfs_dir and devfs_dirent structures. Add proper cloning to the /dev/fd* "device-"driver. Fix a bug in make_dev_alias() handling which made aliases appear multiple times. Use devfs_clone to implement getdiskbyname() Make specfs maintain the stat(2) timestamps per dev_t	2000-08-24 15:36:55 +00:00
phk	6dde24da5e	Introduce vop_stdinactive() and make it the default if no vop_inactive is declared. Sort and prune a few vop_op[].	2000-08-18 10:01:02 +00:00
mckusick	acc66855bf	This patch corrects the first round of panics and hangs reported with the new snapshot code. Update addaliasu to correctly implement the semantics of the old checkalias function. When a device vnode first comes into existence, check to see if an anonymous vnode for the same device was created at boot time by bdevvp(). If so, adopt the bdevvp vnode rather than creating a new vnode for the device. This corrects a problem which caused the kernel to panic when taking a snapshot of the root filesystem. Change the calling convention of vn_write_suspend_wait() to be the same as vn_start_write(). Split out softdep_flushworklist() from softdep_flushfiles() so that it can be used to clear the work queue when suspending filesystem operations. Access to buffers becomes recursive so that snapshots can recursively traverse their indirect blocks using ffs_copyonwrite() when checking for the need for copy on write when flushing one of their own indirect blocks. This eliminates a deadlock between the syncer daemon and a process taking a snapshot. Ensure that softdep_process_worklist() can never block because of a snapshot being taken. This eliminates a problem with buffer starvation. Cleanup change in ffs_sync() which did not synchronously wait when MNT_WAIT was specified. The result was an unclean filesystem panic when doing forcible unmount with heavy filesystem I/O in progress. Return a zero'ed block when reading a block that was not in use at the time that a snapshot was taken. Normally, these blocks should never be read. However, the readahead code will occationally read them which can cause unexpected behavior. Clean up the debugging code that ensures that no blocks be written on a filesystem while it is suspended. Snapshots must explicitly label the blocks that they are writing during the suspension so that they do not cause a `write on suspended filesystem' panic. Reorganize ffs_copyonwrite() to eliminate a deadlock and also to prevent a race condition that would permit the same block to be copied twice. This change eliminates an unexpected soft updates inconsistency in fsck caused by the double allocation. Use bqrelse rather than brelse for buffers that will be needed soon again by the snapshot code. This improves snapshot performance.	2000-07-24 05:28:33 +00:00
mckusick	a3d0c189ea	Add snapshots to the fast filesystem. Most of the changes support the gating of system calls that cause modifications to the underlying filesystem. The gating can be enabled by any filesystem that needs to consistently suspend operations by adding the vop_stdgetwritemount to their set of vnops. Once gating is enabled, the function vfs_write_suspend stops all new write operations to a filesystem, allows any filesystem modifying system calls already in progress to complete, then sync's the filesystem to disk and returns. The function vfs_write_resume allows the suspended write operations to begin again. Gating is not added by default for all filesystems as for SMP systems it adds two extra locks to such critical kernel paths as the write system call. Thus, gating should only be added as needed. Details on the use and current status of snapshots in FFS can be found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness is not included here. Unless and until you create a snapshot file, these changes should have no effect on your system (famous last words).	2000-07-11 22:07:57 +00:00
phk	f101401a90	Pull the rug under block mode devices. they return ENXIO on open(2) now.	2000-07-03 13:48:37 +00:00
phk	4ec91666fa	Virtualizes & untangles the bioops operations vector. Ref: Message-ID: <18317.961014572@critter.freebsd.dk> To: current@	2000-06-16 08:48:51 +00:00
jmb	777866439c	before this commit, specfs reported disk partitions using decimal major and minor numbers. "ls -l" reports disk partitions using decimal major numbers and hex minor numbers. make specfs use decimal major numbers and hex minor numbers, just like "ls -l"	2000-06-12 10:20:18 +00:00
phk	bddf428952	Change the "bdev-whiner" to whine when open is attempted and extend the deadline a month.	2000-05-09 18:53:57 +00:00
phk	36c3965ff9	Separate the struct bio related stuff out of <sys/buf.h> into <sys/bio.h>. <sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bdes teachings on the subject of nested includes. Diskdrivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.> unless they need caching of data. Still a few bogus uses of struct buf to track down. Repocopy by: peter	2000-05-05 09:59:14 +00:00
phk	8ee11d587f	Move B_ERROR flag to b_ioflags and call it BIO_ERROR. (Much of this done by script) Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED. Move b_pblkno and b_iodone_chain to struct bio while we transition, they will be obsoleted once bio structs chain/stack. Add bio_queue field for struct bio aware disksort. Address a lot of stylistic issues brought up by bde.	2000-04-02 15:24:56 +00:00
phk	5df766a0f8	Rename the existing BUF_STRATEGY() to DEV_STRATEGY() substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo) substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo) This patch is machine generated except for the ccd.c and buf.h parts.	2000-03-20 11:29:10 +00:00
phk	a246e10f55	Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new field in struct buf: b_iocmd. The b_iocmd is enforced to have exactly one bit set. B_WRITE was bogusly defined as zero giving rise to obvious coding mistakes. Also eliminate the redundant struct buf flag B_CALL, it can just as efficiently be done by comparing b_iodone to NULL. Should you get a panic or drop into the debugger, complaining about "b_iocmd", don't continue. It is likely to write on your disk where it should have been reading. This change is a step in the direction towards a stackable BIO capability. A lot of this patch were machine generated (Thanks to style(9) compliance!) Vinum users: Greg has not had time to test this yet, be careful.	2000-03-20 10:44:49 +00:00
phk	6b3385b773	Eliminate the undocumented, experimental, non-delivering and highly dangerous MAX_PERF option.	2000-03-16 08:51:55 +00:00
phk	ae0c1ec8f7	Give vn_isdisk() a second argument where it can return a suitable errno. Suggested by: bde	2000-01-10 12:04:27 +00:00
phk	989c4095ed	Remove unused #includes. Obtained from: http://bogon.freebsd.dk/include	1999-12-08 08:59:40 +00:00
mckusick	a7a8ed1423	Collect read and write counts for filesystems. This new code drops the counting in bwrite and puts it all in spec_strategy. I did some tests and verified that the counts collected for writes in spec_strategy is identical to the counts that we previously collected in bwrite. We now also get read counts (async reads come from requests for read-ahead blocks). Note that you need to compile a new version of mount to get the read counts printed out. The old mount binary is completely compatible, the only reason to install a new mount is to get the read counts printed. Submitted by: Craig A Soules <soules+@andrew.cmu.edu> Reviewed by: Kirk McKusick <mckusick@mckusick.com>	1999-12-01 02:09:30 +00:00
phk	8c9bc6b146	Next step in the device cleanup process. Correctly lock vnodes when calling VOP_OPEN() from filesystem mount code. Unify spec_open() for bdev and cdev cases. Remove the disabled bdev specific read/write code.	1999-11-09 14:15:33 +00:00
phk	3e649437d2	Oops, a bit too hasty there.	1999-11-08 13:08:02 +00:00
phk	e6b1d22771	Various cleanups.	1999-11-08 09:59:34 +00:00
phk	63959e2797	Use vop_panic() instead of spec_badop().	1999-11-07 15:09:59 +00:00
phk	a7f67fc819	Remove the iskmemdev() function. Make it the responsibility of the mem.c drivers to enforce the securelevel checks.	1999-11-07 12:01:32 +00:00
phk	52c0213f3b	Remove specfs::vop_lookup() There is no code path which can call it.	1999-11-01 02:53:38 +00:00
phk	8e3c3eafed	useracc() the prequel: Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs. This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ\|WRITE} rather than B_{READ\|WRITE} as argument.	1999-10-29 18:09:36 +00:00
dillon	a7d2b0d180	A tentative agreement has been reached in regards to a procedure to remove 'b'lock devices. The agreement is, essentially, that block devices will be collapsed into character devices as a first step (though I don't particularly agree), and raw device names 'rxxx' will become simply 'xxx' in devfs in the second step (i.e. no 'rxxx' names will exist). The renaming will not effect the original /dev and the expectation is that devfs will eventually (but not immediately) become the standard way to access devices in the system. If it is determined that a reimplementation of block device access characteristics is beneficial, a number of alternatives will be possible that do not involve resurrecting the 'b'lock device class. For example, an ioctl() that might be made on an open character device descriptor or a generic buffered overlay device. This commit removes the blockdev disablement sysctl which does not apply to the solution that was reached.	1999-10-20 06:31:49 +00:00
phk	ed12aa381a	Change the default for the vfs.bdev_buffered sysctl to zero. This means that access to block devices nodes will act the same as char device nodes for disk-like devices. If you encounter problems after this, where programs accessing disks directly fail to operate, please use the following command to revert to previous behaviour: sysctl -w vfs.bdev_buffered=1 And verify that this was indeed the cause of your trouble. See the mail-archives of the arch@FreeBSD.org list for background.	1999-10-18 16:59:50 +00:00
phk	c3bc2a7bec	Add a couple of strategic KASSERTs	1999-10-08 19:07:23 +00:00
phk	9e2a2cf3ab	Add back sysctl vfs.enable_userblk_io	1999-10-08 18:25:19 +00:00
phk	a8e22c41f5	Warn once per driver about dev_t's not registered with make_dev().	1999-10-04 12:33:05 +00:00
phk	8b06d6a2fb	Move the buffered read/write code out of spec_{read\|write} and into two new functions spec_buf{read\|write}. Add sysctl vfs.bdev_buffered which defaults to 1 == true. This sysctl can be used to experimentally turn buffered behaviour for bdevs off. I should not be changed while any blockdevices are open. Remove the misplaced sysctl vfs.enable_userblk_io. No other changes in behaviour.	1999-10-04 11:23:10 +00:00
phk	073b941095	Remove v_maxio from struct vnode. Replace it with mnt_iosize_max in struct mount. Nits from: bde	1999-09-29 20:05:33 +00:00
phk	e9e4ee2c1f	Remove a warning check which was too general.	1999-09-25 18:52:03 +00:00

1 2 3 4

163 Commits