freebsd-skq

Author	SHA1	Message	Date
jasone	4e290e67b7	Convert lockmgr locks from using simple locks to using mutexes. Add lockdestroy() and appropriate invocations, which corresponds to lockinit() and must be called to clean up after a lockmgr lock is no longer needed.	2000-10-04 01:29:17 +00:00
bp	6110b03d24	Add a lock structure to vnode structure. Previously it was either allocated separately (nfs, cd9660 etc) or keept as a first element of structure referenced by v_data pointer(ffs). Such organization leads to known problems with stacked filesystems. From this point vop_nolock() functions maintain only interlock lock. vop_stdlock() functions maintain built-in v_lock structure using lockmgr(). vop_sharedlock() is compatible with vop_stdunlock(), but maintains a shared lock on vnode. If filesystem wishes to export lockmgr compatible lock, it can put an address of this lock to v_vnlock field. This indicates that the upper filesystem can take advantage of it and use single lock structure for entire (or part) of stack of vnodes. This field shouldn't be examined or modified by VFS code except for initialization purposes. Reviewed in general by: mckusick	2000-09-25 15:24:04 +00:00
msmith	7de3089b6a	Don't scan for the "right" network interface by shooting in the dark. Assume that the nfs_diskless structure is correctly set up; the provider ought to be getting it right.	2000-09-05 22:29:36 +00:00
mckusick	acc66855bf	This patch corrects the first round of panics and hangs reported with the new snapshot code. Update addaliasu to correctly implement the semantics of the old checkalias function. When a device vnode first comes into existence, check to see if an anonymous vnode for the same device was created at boot time by bdevvp(). If so, adopt the bdevvp vnode rather than creating a new vnode for the device. This corrects a problem which caused the kernel to panic when taking a snapshot of the root filesystem. Change the calling convention of vn_write_suspend_wait() to be the same as vn_start_write(). Split out softdep_flushworklist() from softdep_flushfiles() so that it can be used to clear the work queue when suspending filesystem operations. Access to buffers becomes recursive so that snapshots can recursively traverse their indirect blocks using ffs_copyonwrite() when checking for the need for copy on write when flushing one of their own indirect blocks. This eliminates a deadlock between the syncer daemon and a process taking a snapshot. Ensure that softdep_process_worklist() can never block because of a snapshot being taken. This eliminates a problem with buffer starvation. Cleanup change in ffs_sync() which did not synchronously wait when MNT_WAIT was specified. The result was an unclean filesystem panic when doing forcible unmount with heavy filesystem I/O in progress. Return a zero'ed block when reading a block that was not in use at the time that a snapshot was taken. Normally, these blocks should never be read. However, the readahead code will occationally read them which can cause unexpected behavior. Clean up the debugging code that ensures that no blocks be written on a filesystem while it is suspended. Snapshots must explicitly label the blocks that they are writing during the suspension so that they do not cause a `write on suspended filesystem' panic. Reorganize ffs_copyonwrite() to eliminate a deadlock and also to prevent a race condition that would permit the same block to be copied twice. This change eliminates an unexpected soft updates inconsistency in fsck caused by the double allocation. Use bqrelse rather than brelse for buffers that will be needed soon again by the snapshot code. This improves snapshot performance.	2000-07-24 05:28:33 +00:00
ps	34013c4020	Correctly set the Maximum DHCP Message Size. bootpd now works again as well as ISC dhcpd.	2000-06-13 09:32:09 +00:00
jake	961b97d434	Back out the previous change to the queue(3) interface. It was not discussed and should probably not happen. Requested by: msmith and others	2000-05-26 02:09:24 +00:00
jake	d93fbc9916	Change the way that the queue(3) structures are declared; don't assume that the type argument to _HEAD and _ENTRY is a struct. Suggested by: phk Reviewed by: phk Approved by: mdodd	2000-05-23 20:41:01 +00:00
phk	204a9e7fba	Include a RFC 1533 "Maximum DHCP Message Size" option in our request. ISC DHCP will limit the reply length to 64 bytes for bootp replies unless we explicitly tell it we can do more. We tell it that we can do 1200 bytes.	2000-05-07 14:29:19 +00:00
phk	36c3965ff9	Separate the struct bio related stuff out of <sys/buf.h> into <sys/bio.h>. <sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bdes teachings on the subject of nested includes. Diskdrivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.> unless they need caching of data. Still a few bogus uses of struct buf to track down. Repocopy by: peter	2000-05-05 09:59:14 +00:00
phk	10914aa708	Remove unneeded #include <vm/vm_zone.h> Generated by: src/tools/tools/kerninclude	2000-04-30 18:52:11 +00:00
phk	1931990da0	s/biowait/bufwait/g Prodded by: several.	2000-04-29 16:25:22 +00:00
phk	6be1308ad1	Remove ~25 unneeded #include <sys/conf.h> Remove ~60 unneeded #include <sys/malloc.h>	2000-04-19 14:58:28 +00:00
phk	aaaef0b54e	Complete the bio/buf divorce for all code below devfs::strategy Exceptions: Vinum untouched. This means that it cannot be compiled. Greg Lehey is on the case. CCD not converted yet, casts to struct buf (still safe) atapi-cd casts to struct buf to examine B_PHYS	2000-04-15 05:54:02 +00:00
phk	8ee11d587f	Move B_ERROR flag to b_ioflags and call it BIO_ERROR. (Much of this done by script) Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED. Move b_pblkno and b_iodone_chain to struct bio while we transition, they will be obsoleted once bio structs chain/stack. Add bio_queue field for struct bio aware disksort. Address a lot of stylistic issues brought up by bde.	2000-04-02 15:24:56 +00:00
dillon	d7295a1a39	Add a sysctl to specify the amount of UDP receive space NFS should reserve, in maximal NFS packets. Originally only 2 packets worth of space was reserved. The default is now 4, which appears to greatly improve performance for slow to mid-speed machines on gigabit networks. Add documentation and correct some prior documentation. Problem Researched by: Andrew Gallatin <gallatin@cs.duke.edu> Approved by: jkh	2000-03-27 21:38:35 +00:00
phk	5df766a0f8	Rename the existing BUF_STRATEGY() to DEV_STRATEGY() substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo) substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo) This patch is machine generated except for the ccd.c and buf.h parts.	2000-03-20 11:29:10 +00:00
phk	a246e10f55	Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new field in struct buf: b_iocmd. The b_iocmd is enforced to have exactly one bit set. B_WRITE was bogusly defined as zero giving rise to obvious coding mistakes. Also eliminate the redundant struct buf flag B_CALL, it can just as efficiently be done by comparing b_iodone to NULL. Should you get a panic or drop into the debugger, complaining about "b_iocmd", don't continue. It is likely to write on your disk where it should have been reading. This change is a step in the direction towards a stackable BIO capability. A lot of this patch were machine generated (Thanks to style(9) compliance!) Vinum users: Greg has not had time to test this yet, be careful.	2000-03-20 10:44:49 +00:00
peter	a5441090de	Clean up some loose ends in the network code, including the X.25 and ISO #ifdefs. Clean out unused netisr's and leftover netisr linker set gunk. Tested on x86 and alpha, including world. Approved by: jkh	2000-02-13 03:32:07 +00:00
dillon	9bba72b2a3	The alpha build cuases the 'nfsuid bloated' warning to occur. Well, there is nothing we can do about it. In fact, after further review there simply are not very many instances of the two structures NFS checks for 'bloat' so I've decided to simply rip the checks out entirely. Submitted by: Andrew Gallatin <gallatin@cs.duke.edu>	2000-01-13 20:18:25 +00:00
shin	3bdc213839	tcp updates to support IPv6. also a small patch to sys/nfs/nfs_socket.c, as max_hdr size change. Reviewed by: freebsd-arch, cvs-committers Obtained from: KAME project	2000-01-09 19:17:30 +00:00
dillon	c6689c797d	Enhance reassignbuf(). When a buffer cannot be time-optimally inserted into vnode dirtyblkhd we append it to the list instead of prepend it to the list in order to maintain a 'forward' locality of reference, which is arguably better then 'reverse'. The original algorithm did things this way to but at a huge time cost. Enhance the append interlock for NFS writes to handle intr/soft mounts better. Fix the hysteresis for NFS async daemon I/O requests to reduce the number of unnecessary context switches. Modify handling of NFS mount options. Any given user option that is too high now defaults to the kernel maximum for that option rather then the kernel default for that option. Reviewed by: Alfred Perlstein <bright@wintelcom.net>	2000-01-05 05:11:37 +00:00
dillon	42026f70c2	Fix at least one source of the continued 'NFS append race'. close() was calling nfs_flush() and then clearing the NMODIFIED bit. This is not legal since there might still be dirty buffers after the nfs_flush (for example, pending commits). The clearing of this bit in turn prevented a necessary vinvalbuf() from occuring leaving left over dirty buffers even after truncating the file in a new operation. The fix is to simply not clear NMODIFIED. Also added a sysctl vfs.nfs.nfsv3_commit_on_close which, if set to 1, will cause close() to do a stage 1 write AND a stage 2 commit synchronously. By default only the stage 1 write is done synchronously. Reviewed by: Alfred Perlstein <bright@wintelcom.net>	2000-01-05 00:32:18 +00:00
peter	d53e4c1d80	Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL" is an application space macro and the applications are supposed to be free to use it as they please (but cannot). This is consistant with the other BSD's who made this change quite some time ago. More commits to come.	1999-12-29 05:07:58 +00:00
alfred	4f1be4b332	make getfh a standard syscall instead of dependant on having NFSSERVER defined, useful for userland fileservers that want to use a filehandle type interface to the filesystem. Submitted by: Assar Westerlund assar@stacken.kth.se PR: kern/15452	1999-12-21 20:21:12 +00:00
rwatson	4b6baecfc7	Second pass commit to introduce new ACL and Extended Attribute system calls, vnops, vfsops, both in /kern, and to individual file systems that require a vfsop_ array entry. Reviewed by: eivind	1999-12-19 06:08:07 +00:00
green	99a0254a52	M_PREPEND-related cleanups (unregisterifying struct mbuf *s).	1999-12-19 01:55:37 +00:00
eivind	87724eb673	Introduce NDFREE (and remove VOP_ABORTOP)	1999-12-15 23:02:35 +00:00
dillon	3968ced3f8	Fix two problems: First, fix the append seek position race that can occur due to np->n_size potentially changing if nfs_getcacheblk() blocks in nfs_write(). Second, under -current we must supply the proper bufsize when obtaining buffers that straddle the EOF, but due to the fact that np->n_size can change out from under us it is possible that we may specify the wrong buffer size and wind up truncating dirty data written by another process. Both problems are solved by implementing nfs_rslock(), which allows us to lock around sensitive buffer cache operations such as those that occur when appending to a file. It is believed that this race is responsible for causing dirtyoff/dirtyend and (in stable) validoff/validend to exceed the buffer size. Therefore we have now added a warning printf for the dirtyoff/end case in current. However, we have introduced a new problem which we need to fix at some point, and that is that soft or intr NFS mounts may become uninterruptable from the point of view of process A which is stuck waiting on rslock while process B is stuck doing the rpc. To unstick process A, process B would have to be interrupted first. Reviewed by: Alfred Perlstein <bright@wintelcom.net>	1999-12-14 19:07:54 +00:00
dillon	275dfe6756	Fix a timeout deadlock that can occur when the process holding the receive lock hasn't yet managed to send its own request. PR: kern/15055 Submitted by: Ian Dowse iedowse@maths.tcd.ie	1999-12-13 04:24:55 +00:00
dillon	08e8d78b50	Fix a number of server-side issues related to aborting badly formed NFS packets, mainly initializing structure pointers to NULL which are conditionally freed prior to return. PR: kern/15249 Submitted by: Ian Dowse <iedowse@maths.tcd.ie>	1999-12-12 07:06:39 +00:00
dillon	1da2183061	Synopsis of problem being fixed: Dan Nelson originally reported that blocks of zeros could wind up in a file written to over NFS by a client. The problem only occurs a few times per several gigabytes of data. This problem turned out to be bug #3 below. bug #1: B_CLUSTEROK must be cleared when an NFS buffer is reverted from stage 2 (ready for commit rpc) to stage 1 (ready for write). Reversions can occur when a dirty NFS buffer is redirtied with new data. Otherwise the VFS/BIO system may end up thinking that a stage 1 NFS buffer is clusterable. Stage 1 NFS buffers are not clusterable. bug #2: B_CLUSTEROK was inappropriately set for a 'short' NFS buffer (short buffers only occur near the EOF of the file). Change to only set when the buffer is a full biosize (usually 8K). This bug has no effect but should be fixed in -current anyway. It need not be backported. bug #3: B_NEEDCOMMIT was inappropriately set in nfs_flush() (which is typically only called by the update daemon). nfs_flush() does a multi-pass loop but due to the lack of vnode locking it is possible for new buffers to be added to the dirtyblkhd list while a flush operation is going on. This may result in nfs_flush() setting B_NEEDCOMMIT on a buffer which has NOT yet gone through its stage 1 write, causing only the commit rpc to be made and thus causing the contents of the buffer to be thrown away (never sent to the server). The patch also contains some cleanup, which only applies to the commit into -current. Reviewed by: dg, julian Originally Reported by: Dan Nelson <dnelson@emsphone.com>	1999-12-12 06:09:57 +00:00
eivind	287836faea	Lock reporting and assertion changes. * lockstatus() and VOP_ISLOCKED() gets a new process argument and a new return value: LK_EXCLOTHER, when the lock is held exclusively by another process. * The ASSERT_VOP_(UN)LOCKED family is extended to use what this gives them * Extend the vnode_if.src format to allow more exact specification than locked/unlocked. This commit should not do any semantic changes unless you are using DEBUG_VFS_LOCKS. Discussed with: grog, mch, peter, phk Reviewed by: peter	1999-12-11 16:13:02 +00:00
dillon	679b553dfa	The symlink implementation could improperly return a NULL vp along with a 0 error code. The problem occured with NFSv2 mounts and also with any NFSv3 mount returning an EEXIST error (which is translated to 0 prior to return). The reply to the rpc only contains the file handle for the no-error case under NFSv3. The error case under NFSv3 and all cases under NFSv2 do not return the file handle. The fix is to do a secondary lookup to obtain the file handle and thus be able to generate a return vnode for the situations where the rpc reply does not contain the required information. The bug was originally introduced when VOP_SYMLINK semantics were changed for -CURRENT. The NFS symlink implementation was not properly modified to go along with the change despite the fact that three people reviewed the code. It took four attempts to get the current fix correct with five people. Is NFS obfuscated? Ha! Reviewed by: Alfred Perlstein <bright@wintelcom.net> Testing and Discussion: "Viren R.Shah" <viren@rstcorp.com>, Eivind Eklund <eivind@FreeBSD.ORG>, Ian Dowse <iedowse@maths.tcd.ie>	1999-11-30 06:56:15 +00:00
eivind	6e2e0f5931	Remap the error EEXISTS => 0 before using error to determine if we should return a vp.	1999-11-27 18:14:41 +00:00
dillon	913c1ce734	nm_srtt and nm_sdrtt are arrays[4]. Remove explicit initialization of element [4] in both, which goes beyond the end of the array, leaving [0], [1], [2], and [3]. This bug did not cause any problems since the overrun fields are initialized after the bogus array init but needs to be fixed anyway. Submitted by: Ian Dowse <iedowse@maths.tcd.ie>	1999-11-22 04:50:09 +00:00
eivind	9ba3c6c857	Fix VOP_MKNOD for loss of WILLRELE. I don't know how I could have missed this in the first place :-( Noticed by: bde	1999-11-20 16:09:10 +00:00
eivind	4ce73d7096	Remove WILLRELE from VOP_SYMLINK Note: Previous commit to these files (except coda_vnops and devfs_vnops) that claimed to remove WILLRELE from VOP_RENAME actually removed it from VOP_MKNOD.	1999-11-13 20:58:17 +00:00
dillon	ac372bbf0e	Remove special case socket sharing code in order to allow nfsd to bind IP addresses to udp/cltp sockets separately. PR: kern/13049 Reviewed by: David Malone <dwmalone@maths.tcd.ie>, freebsd-current	1999-11-11 17:24:02 +00:00
dillon	2fb940a103	Fix nfssvc_addsock() to not attempt to free a NULL socket structure when returning an error. Bug fix was extracted from the PR. The PR is not yet entirely resolved by this commit. PR: kern/13049 Reviewed by: Matt Dillon <dillon@freebsd.org> Submitted by: Ian Dowse <iedowse@maths.tcd.ie>	1999-11-08 19:10:16 +00:00
msmith	2ba600f767	Call bootpc_init before we try to mount an NFS root, if we're configured to use BOOTP for NFS root discovery. The entire interface setup inside nfs_mountroot is evil, and should die.	1999-11-01 23:55:38 +00:00
phk	8e3c3eafed	useracc() the prequel: Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs. This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ\|WRITE} rather than B_{READ\|WRITE} as argument.	1999-10-29 18:09:36 +00:00
dillon	47a282162e	Move NFS access cache hits/misses into nfsstats structure so /usr/bin/nfsstat can get to it easily.	1999-10-25 19:22:33 +00:00
phk	322edeeaa9	Before we start to mess with the VFS name-cache clean things up a little bit: Isolate the namecache in its own file, and give it a dedicated malloc type.	1999-10-03 12:18:29 +00:00
marcel	a77102446a	Careless use of struct proc *p caused major problems. 'p' is allowed to be NULL in this function (nfs_sigintr). Reorder the statements and guard them all with a single if (p != NULL). reported, reviewed and tested by: jdp	1999-09-29 20:12:39 +00:00
marcel	d5e8d714b9	sigset_t change (part 2 of 5) ----------------------------- The core of the signalling code has been rewritten to operate on the new sigset_t. No methodological changes have been made. Most references to a sigset_t object are through macros (see signalvar.h) to create a level of abstraction and to provide a basis for further improvements. The NSIG constant has not been changed to reflect the maximum number of signals possible. The reason is that it breaks programs (especially shells) which assume that all signals have a non-null name in sys_signame. See src/bin/sh/trap.c for an example. Instead _SIG_MAXSIG has been introduced to hold the maximum signal possible with the new sigset_t. struct sigprop has been moved from signalvar.h to kern_sig.c because a) it is only used there, and b) access must be done though function sigprop(). The latter because the table doesn't holds properties for all signals, but only for the first NSIG signals. signal.h has been reorganized to make reading easier and to add the new and/or modified structures. The "old" structures are moved to signalvar.h to prevent namespace polution. Especially the coda filesystem suffers from the change, because it contained lines like (p->p_sigmask == SIGIO), which is easy to do for integral types, but not for compound types. NOTE: kdump (and port linux_kdump) must be recompiled. Thanks to Garrett Wollman and Daniel Eischen for pressing the importance of changing sigreturn as well.	1999-09-29 15:03:48 +00:00
dillon	00b895590e	Add comment to clarify a commit rpc optimization already being performed.	1999-09-20 19:10:28 +00:00
dillon	581716d4df	Asynchronized client-side nfs_commit. NFS commit operations were previously issued synchronously even if async daemons (nfsiod's) were available. The commit has been moved from the strategy code to the doio code in order to asynchronize it. Removed use of lastr in preparation for removal of vnode->v_lastr. It has been replaced with seqcount, which is already supported by the system and, in fact, gives us a better heuristic for sequential detection then lastr ever did. Made major performance improvements to the server side commit. The server previously fsync'd the entire file for each commit rpc. The server now bawrite()s only those buffers related to the offset/size specified in the commit rpc. Note that we do not commit the meta-data yet. This works still needs to be done. Note that a further optimization can be done (and has not yet been done) on the client: we can merge multiple potential commit rpc's into a single rpc with a greater file offset/size range and greatly reduce rpc traffic. Reviewed by: Alan Cox <alc@cs.rice.edu>, David Greenman <dg@root.com>	1999-09-17 05:57:57 +00:00
alfred	b9136a6115	Seperate the export check in VFS_FHTOVP, exports are now checked via VFS_CHECKEXP. Add fh(open\|stat\|stafs) syscalls to allow userland to query filesystems based on (network) filehandle. Obtained from: NetBSD	1999-09-11 00:46:08 +00:00
alfred	e16a3900a7	All unimplemented VFS ops now have entries in kern/vfs_default.c that return reasonable defaults. This avoids confusing and ugly casting to eopnotsupp or making dummy functions. Bogus casting of filesystem sysctls to eopnotsupp() have been removed. This should make *_vfsops.c more readable and reduce bloat. Reviewed by: msmith, eivind Approved by: phk Tested by: Jeroen Ruigrok/Asmodai <asmodai@wxs.nl>	1999-09-07 22:42:38 +00:00
phk	d311a0563b	remove unused variables.	1999-08-28 19:21:03 +00:00

1 2 3 4 5 ...

396 Commits