freebsd-skq

Author	SHA1	Message	Date
Robert Watson	c1787d3b75	o Note an additional potential problem here: LOCKD_MSG directly exports struct ucred to userland. In 5.0-CURRENT, it is desirable to instead export struct xucred, as ucred contains mutexes, pointers, and other kernel evil. I'll add it to my work queue.	2001-10-24 02:48:38 +00:00
Robert Watson	b5c05ddcb8	o Add two comments identifying problems with the current nfs_lock.c implementation, so that the information doesn't get lost. (1) /var/run/lock is looked up relative to the current thread's root directory, but it's not clear that's desirable. (2) A race condition associated with live credential modification on a shared credential is present when privilege is granted for the purposes of talking to /var/run/lock.	2001-10-23 19:11:31 +00:00
Matthew Dillon	c72ccd014d	Change the vnode list under the mount point from a LIST to a TAILQ in preparation for an implementation of limiting code for kern.maxvnodes. MFC after: 3 days	2001-10-23 01:21:29 +00:00
John Baldwin	bd78cece5d	Change the kernel's ucred API as follows: - crhold() returns a reference to the ucred whose refcount it bumps. - crcopy() now simply copies the credentials from one credential to another and has no return value. - a new crshared() primitive is added which returns true if a ucred's refcount is > 1 and false (0) otherwise.	2001-10-11 23:38:17 +00:00
John Baldwin	5162c5cc1e	Use crhold() instead of crdup() since we aren't modifying the cred but just need to ensure it remains immutable.	2001-10-09 16:48:57 +00:00
Peter Wemm	caf4b18ba9	Make this compile after last commit. It should be: "td ? td->td_proc : NULL", not "td ? td->td_proc, NULL"	2001-10-09 02:40:45 +00:00
Julian Elischer	7e49874f08	Don't dereference td if it's NULL. Submitted by: Alexander N. Kabaev <ak03@gte.com>	2001-10-08 23:47:44 +00:00
Peter Wemm	b9b0e19206	Unwind some more macros. NFSMADV() was kinda silly since it was right next to equivalent m_len adjustments. Move the nfsm_subs.h macros into groups depending on which phase they are used in, since that affects the error recovery requirements. Collect some of the common error checking into a single macro as preparation for unwinding some more. Have nfs_rephead return a value instead of secretly modifying args. Remove some unused function arguments that were being passed around. Clarify nfsm_reply()'s error handling (I hope).	2001-09-28 04:37:08 +00:00
Peter Wemm	1290984b33	Make nfsm_dissect() have an obvious return value.	2001-09-27 22:40:38 +00:00
Peter Wemm	ea7fe289fe	Tidy up nfsm_build usage. This is only partially finished.	2001-09-27 02:33:36 +00:00
Ian Dowse	1782e17d6f	Add a missing dereference level. This caused nfsm_postop_attr_xx() to try and extract node attributes from an RPC reply even if none were present. Reviewed by: peter	2001-09-25 00:00:33 +00:00
Peter Wemm	d55d47aded	Add the magic marker so that loader and kldload(2) can find this in module form automagically.	2001-09-20 04:57:34 +00:00
Peter Wemm	247c65c27f	Oops. Fix a missing indirection level. gcc didn't complain about it on x86, but did complain about it on alpha (since int and pointer are different sizes)	2001-09-20 03:45:51 +00:00
Peter Wemm	891a092764	Sigh, Last minute pre-merge typo. (missing quotes)	2001-09-18 23:49:33 +00:00
Peter Wemm	eb25edbda3	Cleanup and split of nfs client and server code. This builds on the top of several repo-copies.	2001-09-18 23:32:09 +00:00
Warner Losh	976a26437e	nfs_strategy calls nfs_asyncio with td as NULL. So add a bandaid that will pass NULL as the struct proc when td is NULL. This has stopped crashing on my machine. Note: The passing of NULL may be bogus, but I'll let others fix that problem. Reviewed by: jhb	2001-09-18 18:37:52 +00:00
Peter Wemm	38f48395d6	Sync some differences that were different between the copies of the files that were in nfs/nfs.h and nfsserver/nfs.h in the p4 tree.	2001-09-15 04:41:56 +00:00
Julian Elischer	b40ce4165d	KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doozy!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
Kris Kennaway	bf61e26696	Fix some signed/unsigned integer confusion, and add bounds checking of arguments to some functions. Obtained from: NetBSD Reviewed by: peter MFC after: 2 weeks	2001-09-10 11:28:07 +00:00
Matthew Dillon	4e174404a3	Pushdown Giant for nfs syscalls (nfssvc())	2001-08-31 22:39:36 +00:00
Andrey A. Chernov	f6bf1abc1b	Stupid error from my side in prev. commit: || -> &&	2001-08-23 18:02:29 +00:00
Andrey A. Chernov	e02faad5ca	Implement l_len<0 per POSIX check. Check for valid l_whence too.	2001-08-23 16:13:59 +00:00
Andrey A. Chernov	6c3f4fef64	Even better move: suppose that server is able to handle SEEK_END, so check arguments for all but not SEEK_END case, leaving SEEK_END handling for server	2001-08-23 14:21:26 +00:00
Andrey A. Chernov	e018907ed4	Apparently SEEK_END locking not supported by NFS. Previous variant returns EINVAL in that case, change it to EOPNOTSUPP.	2001-08-23 14:09:16 +00:00
Andrey A. Chernov	fb2f187058	Move <machine/> after <sys/> Pointed by: bde	2001-08-23 13:27:58 +00:00
Andrey A. Chernov	e9d095afdc	adv. lock: detect off_t overflow _before_ it occurs and return EOVERFLOW instead of EINVAL	2001-08-23 08:20:21 +00:00
Ian Dowse	02b31a0ee9	Fix a client-side memory leak in nfs_flush(). The code allocates a temporary array to store struct buf pointers if the list doesn't fit in a local array. Usually it frees the array when finished, but if it jumps to the 'again' label and the new list does fit in the local array then it can forget to free a previously malloc'd M_TEMP memory. Move the free() up a line so that it frees any previously allocated memory whether or not it needs to malloc a new array. Reviewed by: dillon	2001-08-01 10:25:13 +00:00
Peter Wemm	7b141d5db3	Check the filehandle size when mounting. Obtained from: Constantine Sapuntzakis <csapuntz@openbsd.org>	2001-07-30 20:01:59 +00:00
John Baldwin	617e358cdf	- Sort includes. - Update vmmeter statistics for vnode pagein/pageouts in getpages/putpages.	2001-07-04 20:14:59 +00:00
Matthew Dillon	0cddd8f023	With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.	2001-07-04 16:20:28 +00:00
John Baldwin	bc2327c310	- Protect the mnt_vnode list with the mntvnode lock. - Use queue(9) macros.	2001-06-28 04:10:07 +00:00
Jake Burkholder	d389ead74f	Unlock the process returned from pfind() if it does not return NULL. This fixes a witness lock violation for nfssvc returning with locks held. Submitted by: Jean-Luc Richier <Jean-Luc.Richier@imag.fr> PR: kern/27776	2001-06-01 01:30:51 +00:00
Robert Watson	b1fc0ec1a7	o Merge contents of struct pcred into struct ucred. Specifically, add the real uid, saved uid, real gid, and saved gid to ucred, as well as the pcred->pc_uidinfo, which was associated with the real uid, only rename it to cr_ruidinfo so as not to conflict with cr_uidinfo, which corresponds to the effective uid. o Remove p_cred from struct proc; add p_ucred to struct proc, replacing the original macro that pointed p->p_ucred to p->p_cred->pc_ucred. o Universally update code so that it makes use of ucred instead of pcred, p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo, cr_{r,sv}{u,g}id instead of p_*, etc. o Remove pcred0 and its initialization from init_main.c; initialize cr_ruidinfo there. o Restructure many credential modification chunks to always crdup while we figure out locking and optimizations; generally speaking, this means moving to a structure like this: newcred = crdup(oldcred); ... p->p_ucred = newcred; crfree(oldcred); It's not race-free, but better than nothing. There are also races in sys_process.c, all inter-process authorization, fork, exec, and exit. o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid; remove comments indicating that the old arrangement was a problem. o Restructure exec1() a little to use newcred/oldcred arrangement, and use improved uid management primitives. o Clean up exit1() so as to do less work in credential cleanup due to pcred removal. o Clean up fork1() so as to do less work in credential cleanup and allocation. o Clean up ktrcanset() to take into account changes, and move to using suser_xxx() instead of performing a direct uid==0 comparison. o Improve commenting in various kern_prot.c credential modification calls to better document current behavior. In a couple of places, current behavior is a little questionable and we need to check POSIX.1 to make sure it's "right". More commenting work still remains to be done. o Update credential management calls, such as crfree(), to take into account new ruidinfo reference. o Modify or add the following uid and gid helper routines: change_euid() change_egid() change_ruid() change_rgid() change_svuid() change_svgid() In each case, the call now acts on a credential not a process, and as such no longer requires more complicated process locking/etc. They now assume the caller will do any necessary allocation of an exclusive credential reference. Each is commented to document its reference requirements. o CANSIGIO() is simplified to require only credentials, not processes and pcreds. o Remove lots of (p_pcred==NULL) checks. o Add an XXX to authorization code in nfs_lock.c, since it's questionable, and needs to be considered carefully. o Simplify posix4 authorization code to require only credentials, not processes and pcreds. Note that this authorization, as well as CANSIGIO(), needs to be updated to use the p_cansignal() and p_cansched() centralized authorization routines, as they currently do not take into account some desirable restrictions that are handled by the centralized routines, as well as being inconsistent with other similar authorization instances. o Update libkvm to take these changes into account. Obtained from: TrustedBSD Project Reviewed by: green, bde, jhb, freebsd-arch, freebsd-audit	2001-05-25 16:59:11 +00:00
John Baldwin	ce70e0a964	Assert Giant is held by the caller rather than getting it and releasing it in getpages/putpages.	2001-05-23 22:26:05 +00:00
Ruslan Ermilov	99d300a1ec	- FDESC, FIFO, NULL, PORTAL, PROC, UMAP and UNION file systems were repo-copied from sys/miscfs to sys/fs. - Renamed the following file systems and their modules: fdesc -> fdescfs, portal -> portalfs, union -> unionfs. - Renamed corresponding kernel options: FDESC -> FDESCFS, PORTAL -> PORTALFS, UNION -> UNIONFS. - Install header files for the above file systems. - Removed bogus -I${.CURDIR}/../../sys CFLAGS from userland Makefiles.	2001-05-23 09:42:29 +00:00
Alfred Perlstein	2395531439	Introduce a global lock for the vm subsystem (vm_mtx). vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb	2001-05-19 01:28:09 +00:00
Ian Dowse	0864ef1e8a	Change the second argument of vflush() to an integer that specifies the number of references on the filesystem root vnode to be both expected and released. Many filesystems hold an extra reference on the filesystem root vnode, which must be accounted for when determining if the filesystem is busy and then released if it isn't busy. The old `skipvp' approach required individual filesystem xxx_unmount functions to re-implement much of vflush()'s logic to deal with the root vnode. All 9 filesystems that hold an extra reference on the root vnode got the logic wrong in the case of forced unmounts, so `umount -f' would always fail if there were any extra root vnode references. Fix this issue centrally in vflush(), now that we can. This commit also fixes a vnode reference leak in devfs, which could result in idle devfs filesystems that refuse to unmount. Reviewed by: phk, bp	2001-05-16 18:04:37 +00:00
Mark Murray	fb919e4d5a	Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h from kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)	2001-05-01 08:13:21 +00:00
Poul-Henning Kamp	b7ebffbc08	Add a vop_stdbmap(), and make it part of the default vop vector. Make 7 filesystems which don't really know about VOP_BMAP rely on the default vector, rather than more or less complete local vop_nopbmap() implementations.	2001-04-29 11:48:41 +00:00
Alfred Perlstein	f411fba5d3	Remove incorrect comment. Submitted by: quinot@inf.enst.fr <quinot@inf.enst.fr> PR: kern/26893	2001-04-29 03:10:24 +00:00
Greg Lehey	60fb0ce365	Revert consequences of changes to mount.h, part 2. Requested by: bde	2001-04-29 02:45:39 +00:00
Greg Lehey	d98dc34f52	Correct #includes to work with fixed sys/mount.h.	2001-04-23 09:05:15 +00:00
Alfred Perlstein	d8d5fa8805	vnode_pager_freepage() is really vm_page_free() in disguise, nuke vnode_pager_freepage() and replace all calls to it with vm_page_free()	2001-04-19 06:18:23 +00:00
Alfred Perlstein	603c86672c	Implement client side NFS locks. Obtained from: BSD/os Import Ok'd by: mckusick, jkh, motd on builder.freebsd.org	2001-04-17 20:45:23 +00:00
Poul-Henning Kamp	f84e29a06c	This patch removes the VOP_BWRITE() vector. VOP_BWRITE() was a hack which made it possible for NFS client side to use struct buf with non-bio backing. This patch takes a more general approach and adds a bp->b_op vector where more methods can be added. The success of this patch depends on bp->b_op being initialized all relevant places for some value of "relevant" which is not easy to determine. For now the buffers have grown a b_magic element which will make such issues a tiny bit easier to debug.	2001-04-17 08:56:39 +00:00
Peter Wemm	9d10eb0c0c	Create debug.hashstat.[raw]nchash and debug.hashstat.[raw]nfsnode to enable easy access to the hash chain stats. The raw prefixed versions dump an integer array to userland with the chain lengths. This cheats and calls it an array of 'struct int' rather than 'int' or sysctl -a faithfully dumps out the 128K array on an average machine. The non-raw versions return 4 integers: count, number of chains used, maximum chain length, and percentage utilization (fixed point, multiplied by 100). The raw forms are more useful for analyzing the hash distribution, while the other form can be read easily by humans and stats loggers.	2001-04-11 00:39:20 +00:00
Robert Watson	2955f0b360	o Rather than arbitrarily construct a credential in the nfs_statfs() VFS operation, make use of the calling process's credential. This solution may not be ideal (there are a number of other possible proposals, including making use of the proc0 credential, adding a credential argument to the VFSOP, and switching from a hard-coded ucred to a hard-coded nfscred), it is simple and appears to work. The arguments against using simply crget() are fairly strong: it is the only place in the code (other than a nearly identical invocation in ncp) where crget() is invoked, other than in the process credential creation code; as ucred becomes extensible, this use of crget() without appropriate context results in less and less meaningful credential data. The implementation here will probably be tweaked as a result of experimentation and further exploration of the requirements. In the mean-time, it allows progress to be made in ucred expansion for new security models without causing a crash every time df is used on an NFS mounted file system. This code has been interop tested against FreeBSD and Solaris NFS servers. While using the process credentials should not introduce interop problems, please let me know if any turn out to exist. Reviewed by: freebsd-arch	2001-04-05 06:12:38 +00:00
Peter Wemm	439fea92c2	Use the same API as the example code. Allow the initial hash value to be passed in, as the examples do. Incrementally hash in the dvp->v_id (using the official api) rather than add it. This seems to help power-of-two predictable filename trees where the filenames repeat on a power-of-two cycle and the directory trees have power-of-two components in it. The simple add then mask was causing things like 12000+ entry collision chains while most other entries have between 0 and 3 entries each. This way seems to improve things.	2001-03-20 02:10:18 +00:00
Peter Wemm	6eb39ac8fc	Use a generic implementation of the Fowler/Noll/Vo hash (FNV hash). Make the name cache hash as well as the nfsnode hash use it. As a special tweak, create an unsigned version of register_t. This allows us to use a special tweak for the 64 bit versions that significantly speeds up the i386 version (ie: int64 XOR int64 is slower than int64 XOR int32). The code layout is a little strange for the string function, but I was able to get between 5 to 10% improvement over the original version I started with. The layout affects gcc code generation choices and this way was fastest on x86 and alpha. Note that 'CPUTYPE=p3' etc makes a fair difference to this. It is around 45% faster with -march=pentiumpro on a p6 cpu.	2001-03-17 09:31:06 +00:00
Peter Wemm	be1d4058eb	Dramatically improve the lame nfs_hash(). This is based on the Fowler / Noll / Vo Hash (http://www.isthe.com/chongo/tech/comp/fnv/). This improves hash coverage a massive amount. We were seeing one set of machines that were using 0.84% of their 131072 entry nfsnode hash buckets with maximum chain lengths of up to ~500 entries. The machine was spending nearly 100% of its time in 'system'. A test with this has pushed the coverage from a few percent up to 91% utilization with a max chain length of 11. Submitted by: David Filo	2001-03-17 05:43:01 +00:00
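
The hash commits listed above (be1d4058eb, 6eb39ac8fc, 439fea92c2) refer to the Fowler/Noll/Vo hash and to an API that lets the caller pass in the initial hash value, so that several fields can be hashed incrementally. The following is a minimal sketch of a 32-bit FNV-1 routine in that style, not the code from those commits: the names `fnv32_buf`, `FNV32_OFFSET_BASIS`, and `FNV32_PRIME` are hypothetical and chosen for illustration, while the two constants are the published 32-bit FNV parameters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Published 32-bit FNV parameters; the macro names are illustrative. */
#define FNV32_OFFSET_BASIS 2166136261u
#define FNV32_PRIME        16777619u

/*
 * Hypothetical helper: fold `len` bytes of `buf` into the running hash
 * `hval` using FNV-1 (multiply by the prime, then XOR in the byte).
 * Pass FNV32_OFFSET_BASIS to start a new hash, or a previous return
 * value to continue hashing additional data.
 */
static uint32_t
fnv32_buf(const void *buf, size_t len, uint32_t hval)
{
	const unsigned char *p = buf;

	assert(buf != NULL || len == 0);
	while (len-- > 0) {
		hval *= FNV32_PRIME;	/* wraps modulo 2^32 */
		hval ^= *p++;
	}
	return (hval);
}
```

Because the running value is an argument, a caller could hash a file name and then fold in a second key (for example a directory identifier) by feeding the first result back in as `hval`, rather than adding the two values together; a bucket index would then be taken by masking or reducing the final value to the table size.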

466 Commits