freebsd-skq

Author	SHA1	Message	Date
John Baldwin	87eca70e0c	Fix some LORs between vnode locks and filedescriptor table locks. - Don't grab the filedesc lock just to read fd_cmask. - Drop vnode locks earlier when mounting the root filesystem and before sanitizing stdin/out/err file descriptors during execve(). Submitted by: kib Approved by: re (rwatson) MFC after: 1 week	2009-07-31 13:40:06 +00:00
Brooks Davis	838d985825	Rework the credential code to support larger values of NGROUPS and NGROUPS_MAX, eliminate ABI dependencies on them, and raise the to 1024 and 1023 respectively. (Previously they were equal, but under a close reading of POSIX, NGROUPS_MAX was defined to be too large by 1 since it is the number of supplemental groups, not total number of groups.) The bulk of the change consists of converting the struct ucred member cr_groups from a static array to a pointer. Do the equivalent in kinfo_proc. Introduce new interfaces crcopysafe() and crsetgroups() for duplicating a process credential before modifying it and for setting group lists respectively. Both interfaces take care for the details of allocating groups array. crsetgroups() takes care of truncating the group list to the current maximum (NGROUPS) if necessary. In the future, crsetgroups() may be responsible for insuring invariants such as sorting the supplemental groups to allow groupmember() to be implemented as a binary search. Because we can not change struct xucred without breaking application ABIs, we leave it alone and introduce a new XU_NGROUPS value which is always 16 and is to be used or NGRPS as appropriate for things such as NFS which need to use no more than 16 groups. When feasible, truncate the group list rather than generating an error. Minor changes: - Reduce the number of hand rolled versions of groupmember(). - Do not assign to both cr_gid and cr_groups[0]. - Modify ipfw to cache ucreds instead of part of their contents since they are immutable once referenced by more than one entity. Submitted by: Isilon Systems (initial implementation) X-MFC after: never PR: bin/113398 kern/133867	2009-06-19 17:10:35 +00:00
Robert Watson	bcf11e8d00	Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include. Discussed with: pjd	2009-06-05 14:55:22 +00:00
Attilio Rao	dfd233edd5	Remove the thread argument from the FSD (File-System Dependent) parts of the VFS. Now all the VFS_* functions and relating parts don't want the context as long as it always refers to curthread. In some points, in particular when dealing with VOPs and functions living in the same namespace (eg. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP. While here fix a bug: now, UFS_EXTATTR can be compiled alone without the UFS_EXTATTR_AUTOSTART option. VFS KPI is heavilly changed by this commit so thirdy parts modules needs to be recompiled. Bump __FreeBSD_version in order to signal such situation.	2009-05-11 15:33:26 +00:00
Robert Watson	885868cd8f	Remove VOP_LEASE and supporting functions. This hasn't been used since the removal of NQNFS, but was left in in case it was required for NFSv4. Since our new NFSv4 client and server can't use it for their requirements, GC the old mechanism, as well as other unused lease- related code and interfaces. Due to its impact on kernel programming and binary interfaces, this change should not be MFC'd. Proposed by: jeff Reviewed by: jeff Discussed with: rmacklem, zach loafman @ isilon	2009-04-10 10:52:19 +00:00
Daichi GOTO	16385727ce	Simplify mode_t check treatment (suggested by trasz). By semantical view, trasz's code is better than prior one. Submitted by: trasz Reviewed by: Masanori OZAWA <ozawa@ongs.co.jp>	2008-11-25 03:49:41 +00:00
Daichi GOTO	1e5da15a63	Fixes Unionfs socket issue reported as kern/118346. PR: 118346 Submitted by: Masanori OZAWA <ozawa@ongs.co.jp> Discussed at: devsummit Strassburg, EuroBSDCon2008 Discussed with: rwatson, gnn, hrs MFC after: 2 week	2008-11-25 03:18:35 +00:00
John Baldwin	7265164f53	Don't pass WANTPARENT to the pathname lookup of the mount point for a unionfs mount just so we can immediately drop the reference on the parent directory vnode without using it.	2008-11-04 18:54:44 +00:00
Doug Rabson	a9148abd9d	Implement support for RPCSEC_GSS authentication to both the NFS client and server. This replaces the RPC implementation of the NFS client and server with the newer RPC implementation originally developed (actually ported from the userland sunrpc code) to support the NFS Lock Manager. I have tested this code extensively and I believe it is stable and that performance is at least equal to the legacy RPC implementation. The NFS code currently contains support for both the new RPC implementation and the older legacy implementation inherited from the original NFS codebase. The default is to use the new implementation - add the NFS_LEGACYRPC option to fall back to the old code. When I merge this support back to RELENG_7, I will probably change this so that users have to 'opt in' to get the new code. To use RPCSEC_GSS on either client or server, you must build a kernel which includes the KGSSAPI option and the crypto device. On the userland side, you must build at least a new libc, mountd, mount_nfs and gssd. You must install new versions of /etc/rc.d/gssd and /etc/rc.d/nfsd and add 'gssd_enable=YES' to /etc/rc.conf. As long as gssd is running, you should be able to mount an NFS filesystem from a server that requires RPCSEC_GSS authentication. The mount itself can happen without any kerberos credentials but all access to the filesystem will be denied unless the accessing user has a valid ticket file in the standard place (/tmp/krb5cc_<uid>). There is currently no support for situations where the ticket file is in a different place, such as when the user logged in via SSH and has delegated credentials from that login. This restriction is also present in Solaris and Linux. In theory, we could improve this in future, possibly using Brooks Davis' implementation of variant symlinks. Supporting RPCSEC_GSS on a server is nearly as simple. You must create service creds for the server in the form 'nfs/<fqdn>@<REALM>' and install them in /etc/krb5.keytab. The standard heimdal utility ktutil makes this fairly easy. After the service creds have been created, you can add a '-sec=krb5' option to /etc/exports and restart both mountd and nfsd. The only other difference an administrator should notice is that nfsd doesn't fork to create service threads any more. In normal operation, there will be two nfsd processes, one in userland waiting for TCP connections and one in the kernel handling requests. The latter process will create as many kthreads as required - these should be visible via 'top -H'. The code has some support for varying the number of service threads according to load but initially at least, nfsd uses a fixed number of threads according to the value supplied to its '-n' option. Sponsored by: Isilon Systems MFC after: 1 month	2008-11-03 10:38:00 +00:00
Edward Tomasz Napierala	15bc6b2bd8	Introduce accmode_t. This is required for NFSv4 ACLs - it will be neccessary to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which is 16 bit. Approved by: rwatson (mentor)	2008-10-28 13:44:11 +00:00
Dag-Erling Smørgrav	e11e3f187d	Fix a number of style issues in the MALLOC / FREE commit. I've tried to be careful not to fix anything that was already broken; the NFSv4 code is particularly bad in this respect.	2008-10-23 20:26:15 +00:00
Dag-Erling Smørgrav	1ede983cc9	Retire the MALLOC and FREE macros. They are an abomination unto style(9). MFC after: 3 months	2008-10-23 15:53:51 +00:00
Attilio Rao	0359a12ead	Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread was always curthread and totally unuseful. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2008-08-28 15:23:18 +00:00
Daichi GOTO	3af387c9d2	- change function name from _vdir to _vnode because VSOCK has been added as cache target. Now they process not only VDIR but also VSOCK. - fixed panic issue caused by cache incorrect free process by "umount -f" Submitted by: Masanori OZAWA <ozawa@ongs.co.jp> MFC after: 1 week	2008-05-07 05:32:55 +00:00
Daichi GOTO	fe5f08cda3	o Fixed multi thread access issue reported by Alexander V. Chernikov (admin@su29.net) fixed: kern/109950 PR: kern/109950 Submitted by: Alexander V. Chernikov (admin@su29.net) Reviewed by: Masanori OZAWA (ozawa@ongs.co.jp) MFC after: 1 week	2008-04-25 11:37:20 +00:00
Daichi GOTO	938161d61a	o Improved unix socket connection issue fixed: kern/118346 PR: kern/118346 Submitted by: Masanori OZAWA (ozawa@ongs.co.jp) MFC after: 1 week	2008-04-25 09:53:52 +00:00
Daichi GOTO	5307411cbe	o Fixed rename panic issue Submitted by: Masanori OZAWA (ozawa@ongs.co.jp) MFC after: 1 week	2008-04-25 09:44:47 +00:00
Daichi GOTO	a9b794ff5e	o Fixed inaccessible issue especially including devfs on unionfs case. fixed also: kern/117829 PR: kern/117829 Submitted by: Masanori OZAWA (ozawa@ongs.co.jp) MFC after: 1 week	2008-04-25 09:38:48 +00:00
Daichi GOTO	a68ae31c71	o Added system hang-up process when VOP_READDIR of unionfs_nodeget() returns not end of the file status on debug mode (DIAGNOSTIC defined) kernel. Submitted by: Masanori OZAWA (ozawa@ongs.co.jp) MFC after: 1 week	2008-04-25 07:58:19 +00:00
Attilio Rao	047dd67e96	Optimize lockmgr in order to get rid of the pool mutex interlock, of the state transitioning flags and of msleep(9) callings. Use, instead, an algorithm very similar to what sx(9) and rwlock(9) alredy do and direct accesses to the sleepqueue(9) primitive. In order to avoid writer starvation a mechanism very similar to what rwlock(9) uses now is implemented, with the correspective per-thread shared lockmgrs counter. This patch also adds 2 new functions to lockmgr KPI: lockmgr_rw() and lockmgr_args_rw(). These two are like the 2 "normal" versions, but they both accept a rwlock as interlock. In order to realize this, the general lockmgr manager function "__lockmgr_args()" has been implemented through the generic lock layer. It supports all the blocking primitives, but currently only these 2 mappers live. The patch drops the support for WITNESS atm, but it will be probabilly added soon. Also, there is a little race in the draining code which is also present in the current CVS stock implementation: if some sharers, once they wakeup, are in the runqueue they can contend the lock with the exclusive drainer. This is hard to be fixed but the now committed code mitigate this issue a lot better than the (past) CVS version. In addition assertive KA_HELD and KA_UNHELD have been made mute assertions because they are dangerous and they will be nomore supported soon. In order to avoid namespace pollution, stack.h is splitted into two parts: one which includes only the "struct stack" definition (_stack.h) and one defining the KPI. In this way, newly added _lockmgr.h can just include _stack.h. Kernel ABI results heavilly changed by this commit (the now committed version of "struct lock" is a lot smaller than the previous one) and KPI results broken by lockmgr_rw() / lockmgr_args_rw() introduction, so manpages and __FreeBSD_version will be updated accordingly. Tested by: kris, pho, jeff, danger Reviewed by: jeff Sponsored by: Google, Summer of Code program 2007	2008-04-06 20:08:51 +00:00
Konstantin Belousov	57b4252e45	Add the support for the AT_FDCWD and fd-relative name lookups to the namei(9). Based on the submission by rdivacky, sponsored by Google Summer of Code 2007 Reviewed by: rwatson, rdivacky Tested by: pho	2008-03-31 12:01:21 +00:00
Attilio Rao	81c794f998	Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is always curthread. As KPI gets broken by this patch, manpages and __FreeBSD_version will be updated by further commits. Tested by: Andrea Barberio <insomniac at slackware dot it>	2008-02-25 18:45:57 +00:00
Attilio Rao	0e9eb108f0	Cleanup lockmgr interface and exported KPI: - Remove the "thread" argument from the lockmgr() function as it is always curthread now - Axe lockcount() function as it is no longer used - Axe LOCKMGR_ASSERT() as it is bogus really and no currently used. Hopefully this will be soonly replaced by something suitable for it. - Remove the prototype for dumplockinfo() as the function is no longer present Addictionally: - Introduce a KASSERT() in lockstatus() in order to let it accept only curthread or NULL as they should only be passed - Do a little bit of style(9) cleanup on lockmgr.h KPI results heavilly broken by this change, so manpages and FreeBSD_version will be modified accordingly by further commits. Tested by: matteo	2008-01-24 12:34:30 +00:00
Attilio Rao	22db15c06f	VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjuction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary. KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>	2008-01-13 14:44:15 +00:00
Attilio Rao	cb05b60a89	vn_lock() is currently only used with the 'curthread' passed as argument. Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modify makes the code cleaner and in particular remove an annoying dependence helping next lockmgr() cleanup. KPI results, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>	2008-01-10 01:10:58 +00:00
Attilio Rao	100f241571	Trimm out now unused option LK_EXCLUPGRADE from the lockmgr namespace. This option just adds complexity and the new implementation no longer will support it, so axing it now that it is unused is probabilly the better idea. FreeBSD version is bumped in order to reflect the KPI breakage introduced by this patch. In the ports tree, kris found that only old OSKit code uses it, but as it is thought to work only on 2.x kernels serie, version bumping will solve any problem.	2007-12-28 00:38:13 +00:00
Robert Watson	3de213cc00	Add a new 'why' argument to kdb_enter(), and a set of constants to use for that argument. This will allow DDB to detect the broad category of reason why the debugger has been entered, which it can use for the purposes of deciding which DDB script to run. Assign approximate why values to all current consumers of the kdb_enter() interface.	2007-12-25 17:52:02 +00:00
Robert Watson	30d239bc4c	Merge first in a series of TrustedBSD MAC Framework KPI changes from Mac OS X Leopard--rationalize naming for entry points to the following general forms: mac_<object>_<method/action> mac_<object>_check_<method/action> The previous naming scheme was inconsistent and mostly reversed from the new scheme. Also, make object types more consistent and remove spaces from object types that contain multiple parts ("posix_sem" -> "posixsem") to make mechanical parsing easier. Introduce a new "netinet" object type for certain IPv4/IPv6-related methods. Also simplify, slightly, some entry point names. All MAC policy modules will need to be recompiled, and modules not updates as part of this commit will need to be modified to conform to the new KPI. Sponsored by: SPARTA (original patches against Mac OS X) Obtained from: TrustedBSD Project, Apple Computer	2007-10-24 19:04:04 +00:00
Alfred Perlstein	77465d9390	Get rid of qaddr_t. Requested by: bde	2007-10-16 10:54:55 +00:00
Daichi GOTO	20885def58	Added whiteout behavior option. ``-o whiteout=always'' is default mode (it is established practice) and ``-o whiteout=whenneeded'' is less disk-space using mode especially for resource restricted environments like embedded environments. (Contributed by Ed Schouten. Thanks) Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:55:38 +00:00
Daichi GOTO	524f3f285d	Default copy mode has been changed from traditional-mode to transparent-mode. Some folks who have reported some issues have solved with transparent mode. We guess it is time to change the default copy mode. The transparent-mode is the best in most situations. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:53:38 +00:00
Daichi GOTO	7d72c5e67d	Fixed un-vrele issue of upper layer root vnode of unionfs. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:52:01 +00:00
Daichi GOTO	6c98d0e9db	Added NULL check code pointed out by Coverity. (via Stanislav Sedov. Thanks) Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:50:58 +00:00
Daichi GOTO	57821163d3	- It has been become MPSAFE. - Fixed lock panic issue under MPSAFE. - Fixed panic issue whenever it locks vnode with reclaim. - Fixed lock implementations not conforming to vnode_if.src style. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:49:30 +00:00
Daichi GOTO	7e0c899579	Fixed vnode unlock/vrele untreated issues whenever errors have occurred during some treatments. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:47:44 +00:00
Daichi GOTO	dc2dd18518	- Added support for vfs_cache on unionfs. As a result, you can use applications that use procfs on unionfs. - Removed unionfs internal cache mechanism because it has vfs_cache support instead. As a result, it just simplified code of unionfs. - Fixed kern/111262 issue. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:46:11 +00:00
Daichi GOTO	5adc408078	Added treatments to prevent readdir infinity loop using with Linux binary compatibility feature. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:44:06 +00:00
Daichi GOTO	b2b0db08c5	Changed it frees unneeded memory ASAP. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:42:05 +00:00
Daichi GOTO	3282e2c406	Log: Improved access permission check treatments. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:37:52 +00:00
Konstantin Belousov	9e223287c0	Revert UF_OPENING workaround for CURRENT. Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file. Proposed and reviewed by: jhb Reviewed by: daichi (unionfs) Approved by: re (kensmith)	2007-05-31 11:51:53 +00:00
Konstantin Belousov	d413d21071	Since renaming of vop_lock to _vop_lock, pre- and post-condition function calls are no more generated for vop_lock. Rename _vop_lock to vop_lock1 to satisfy tools/vnode_if.awk assumption about vop naming conventions. This restores pre/post-condition calls.	2007-05-18 13:02:13 +00:00
Robert Watson	5e3f7694b1	Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are imported for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead. - Generally eliminate the distinction between "fast" and regular acquisisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks. - Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively. - Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb). - Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date. In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio). Tested by: kris Discussed with: jhb, kris, attilio, jeff	2007-04-04 09:11:34 +00:00
Tor Egge	61b9d89ff0	Make insmntque() externally visibile and allow it to fail (e.g. during late stages of unmount). On failure, the vnode is recycled. Add insmntque1(), to allow for file system specific cleanup when recycling vnode on failure. Change getnewvnode() to no longer call insmntque(). Previously, embryonic vnodes were put onto the list of vnode belonging to a file system, which is unsafe for a file system marked MPSAFE. Change vfs_hash_insert() to no longer lock the vnode. The caller now has that responsibility. Change most file systems to lock the vnode and call insmntque() or insmntque1() after a new vnode has been sufficiently setup. Handle failed insmntque*() calls by propagating errors to callers, possibly after some file system specific cleanup. Approved by: re (kensmith) Reviewed by: kib In collaboration with: kib	2007-03-13 01:50:27 +00:00
Pawel Jakub Dawidek	10bcafe9ab	Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method. This way we may support multiple structures in v_data vnode field within one file system without using black magic. Vnode-to-file-handle should be VOP in the first place, but was made VFS operation to keep interface as compatible as possible with SUN's VFS. BTW. Now Solaris also implements vnode-to-file-handle as VOP operation. VFS_VPTOFH() was left for API backward compatibility, but is marked for removal before 8.0-RELEASE. Approved by: mckusick Discussed with: many (on IRC) Tested with: ufs, msdosfs, cd9660, nullfs and zfs	2007-02-15 22:08:35 +00:00
Craig Rodrigues	dda4f444de	Simplify code in union_hashins() and union_hashget() functions. These functions now more closely resemble similar functions in nullfs. This also eliminates some errors. Submitted by: daichi, Masanori OZAWA <ozawa ongs co jp>	2007-01-05 14:06:42 +00:00
Craig Rodrigues	98155f1f51	Eliminate ASSERT_VOP_ELOCKED panics when doing mkdir or symlink when sysctl vfs.lookup_shared=1. Submitted by: daichi, Masanori OZAWA <ozawa ongs co jp>	2007-01-05 02:25:44 +00:00
Craig Rodrigues	b05872f29b	Remove unused variable in unionfs_root(). Submitted by: daichi, Masanori OZAWA	2006-12-09 17:24:18 +00:00
Craig Rodrigues	1e370dbbdc	Use vfs_mount_error() in a few places to give more descriptive mount error messages.	2006-12-09 17:21:25 +00:00
Craig Rodrigues	30d471e654	Add locking around calls to unionfs_get_node_status() in unionfs_ioctl() and unionfs_poll(). Submitted by: daichi, Masanori OZAWA <ozawa@ongs.co.jp> Prompted by: kris	2006-12-09 16:51:09 +00:00
Craig Rodrigues	b16f4eec16	In unionfs_readdir(), prevent a possible NULL dereference. CID: 1667 Found by: Coverity Prevent (tm)	2006-12-09 16:34:37 +00:00

1 2 3 4 5 ...

275 Commits