freebsd-skq

Author	SHA1	Message	Date
pjd	ef2355e38d	Fix usecount leak in mknod(2) on file system exported over NFS. While I'm here, correct typo in comment. Reviewed by: kan, kib MFC after: 3 days	2009-09-09 12:56:05 +00:00
dfr	5d248bb05f	Remove the old kernel RPC implementation and the NFS_LEGACYRPC option. Approved by: re	2009-06-30 19:03:27 +00:00
jhb	ccb2608d06	Fix build with NFS_LEGACYRPC enabled after the socket upcall locking changes. Approved by: re (kensmith)	2009-06-30 03:18:51 +00:00
brooks	f53c1c309d	Rework the credential code to support larger values of NGROUPS and NGROUPS_MAX, eliminate ABI dependencies on them, and raise them to 1024 and 1023 respectively. (Previously they were equal, but under a close reading of POSIX, NGROUPS_MAX was defined to be too large by 1 since it is the number of supplemental groups, not total number of groups.) The bulk of the change consists of converting the struct ucred member cr_groups from a static array to a pointer. Do the equivalent in kinfo_proc. Introduce new interfaces crcopysafe() and crsetgroups() for duplicating a process credential before modifying it and for setting group lists respectively. Both interfaces take care of the details of allocating the groups array. crsetgroups() takes care of truncating the group list to the current maximum (NGROUPS) if necessary. In the future, crsetgroups() may be responsible for ensuring invariants such as sorting the supplemental groups to allow groupmember() to be implemented as a binary search. Because we can not change struct xucred without breaking application ABIs, we leave it alone and introduce a new XU_NGROUPS value which is always 16 and is to be used (or NGRPS, as appropriate) for things such as NFS which need to use no more than 16 groups. When feasible, truncate the group list rather than generating an error. Minor changes: - Reduce the number of hand rolled versions of groupmember(). - Do not assign to both cr_gid and cr_groups[0]. - Modify ipfw to cache ucreds instead of part of their contents since they are immutable once referenced by more than one entity. Submitted by: Isilon Systems (initial implementation) X-MFC after: never PR: bin/113398 kern/133867	2009-06-19 17:10:35 +00:00
rmacklem	d88296a89f	Since svc_[dg|vc|tli|tp]_create() did not hold a reference count on the SVCXPRT structure returned by them, it was possible for the structure to be freed before svc_reg() had been completed using the structure. This patch acquires a reference count on the newly created structure that is returned by svc_[dg|vc|tli|tp]_create(). It also adds the appropriate SVC_RELEASE() calls to the callers, except the experimental nfs subsystem. The latter will be committed separately. Submitted by: dfr Tested by: pho Approved by: kib (mentor)	2009-06-17 22:50:26 +00:00
rmacklem	9d4e4b79d9	Add a #include <sys/jail.h> so that it builds when options KGSSAPI is specified in the kernel configuration. Approved by: kib (mentor)	2009-06-12 20:18:08 +00:00
rwatson	f4934662e5	Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include. Discussed with: pjd	2009-06-05 14:55:22 +00:00
jhb	a1af9ecca4	Rework socket upcalls to close some races with setup/teardown of upcalls. - Each socket upcall is now invoked with the appropriate socket buffer locked. It is not permissible to call soisconnected() with this lock held, however, so socket upcalls now return an integer value. The two possible values are SU_OK and SU_ISCONNECTED. If an upcall returns SU_ISCONNECTED, then soisconnected() will be invoked on the socket after the socket buffer lock is dropped. - A new API is provided for setting and clearing socket upcalls. The API consists of soupcall_set() and soupcall_clear(). - To simplify locking, each socket buffer now has a separate upcall. - When a socket upcall returns SU_ISCONNECTED, the upcall is cleared from the receive socket buffer automatically. Note that a SO_SND upcall should never return SU_ISCONNECTED. - All this means that accept filters should now return SU_ISCONNECTED instead of calling soisconnected() directly. They also no longer need to explicitly clear the upcall on the new socket. - The HTTP accept filter still uses soupcall_set() to manage its internal state machine, but other accept filters no longer have any explicit knowledge of socket upcall internals aside from their return value. - The various RPC client upcalls currently drop the socket buffer lock while invoking soreceive() as a temporary band-aid. The plan for the future is to add a new flag to allow soreceive() to be called with the socket buffer locked. - The AIO callback for socket I/O is now also invoked with the socket buffer locked. Previously sowakeup() would drop the socket buffer lock only to call aio_swake() which immediately re-acquired the socket buffer lock for the duration of the function call. Discussed with: rwatson, rmacklem	2009-06-01 21:17:03 +00:00
jamie	572db1408a	Place hostnames and similar information fully under the prison system. The system hostname is now stored in prison0, and the global variable "hostname" has been removed, as has the hostname_mtx mutex. Jails may have their own host information, or they may inherit it from the parent/system. The proper way to read the hostname is via getcredhostname(), which will copy either the hostname associated with the passed cred, or the system hostname if you pass NULL. The system hostname can still be accessed directly (and without locking) at prison0.pr_host, but that should be avoided where possible. The "similar information" referred to is domainname, hostid, and hostuuid, which have also become prison parameters and had their associated global variables removed. Approved by: bz (mentor)	2009-05-29 21:27:12 +00:00
jamie	a013e0afcb	Add hierarchical jails. A jail may further virtualize its environment by creating a child jail, which is visible to that jail and to any parent jails. Child jails may be restricted more than their parents, but never less. Jail names reflect this hierarchy, being MIB-style dot-separated strings. Every thread now points to a jail, the default being prison0, which contains information about the physical system. Prison0's root directory is the same as rootvnode; its hostname is the same as the global hostname, and its securelevel replaces the global securelevel. Note that the variable "securelevel" has actually gone away, which should not cause any problems for code that properly uses securelevel_gt() and securelevel_ge(). Some jail-related permissions that were kept in global variables and set via sysctls are now per-jail settings. The sysctls still exist for backward compatibility, used only by the now-deprecated jail(2) system call. Approved by: bz (mentor)	2009-05-27 14:11:23 +00:00
dfr	4bf244123e	Fix build of KGSSAPI bits post-vimage.	2009-05-24 11:10:27 +00:00
attilio	1dcb84131b	Remove the thread argument from the FSD (File-System Dependent) parts of the VFS. Now all the VFS_* functions and related parts no longer take the context, since it always refers to curthread. In some places, in particular when dealing with VOPs and functions living in the same namespace (e.g. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP. While here fix a bug: now, UFS_EXTATTR can be compiled alone without the UFS_EXTATTR_AUTOSTART option. The VFS KPI is heavily changed by this commit, so third-party modules need to be recompiled. Bump __FreeBSD_version in order to signal this situation.	2009-05-11 15:33:26 +00:00
kan	7b57a857b7	Do not embed struct ucred into larger netcred parent structures. Credential might need to hang around longer than its parent and be used outside of mnt_explock scope controlling netcred lifetime. Use separate reference-counted ucred allocated separately instead. While there, extend mnt_explock coverage in vfs_stdexpcheck and clean-up some unused declarations in new NFS code. Reported by: John Hickey PR: kern/133439 Reviewed by: dfr, kib	2009-05-09 18:09:17 +00:00
rmacklem	bb19ddd3ae	Change nfsserver so that it uses the nfssvc() system call provided in sys/nfs/nfs_nfssvc.c by registering with it using the nfsd_call_nfsserver function pointer. Also, add the build glue for nfs_nfssvc.c optionally based on "nfsserver" and also as a loadable module. Submitted by: rmacklem Reviewed by: kib Approved by: kib (mentor)	2009-04-12 19:04:27 +00:00
dfr	41382eee31	Fix an mbuf leak in the error path. Submitted by: Rick Macklem <rick at snowhite dot cis dot uoguelph dot ca>	2009-03-19 14:13:18 +00:00
rwatson	ceb08f5693	Include audit.h so that the system call path protected by NFS_LEGACYRPC can audit its arguments. Submitted by: Jaakko Heinonen <jh at saunalahti.fi> MFC after: 1 week X-MFC-note: MFC with r188311	2009-02-23 23:04:15 +00:00
jhb	26e338d6fc	Use shared vnode locks when invoking VOP_READDIR(). MFC after: 1 month	2009-02-13 18:18:14 +00:00
rwatson	4349e4002b	Audit the flag argument to the nfssvc(2) system call. Obtained from: TrustedBSD Project Sponsored by: Apple, Inc.	2009-02-08 14:04:08 +00:00
ed	a964306db9	Last step of splitting up minor and unit numbers: remove minor(). Inside the kernel, the minor() function was responsible for obtaining the device minor number of a character device. Because we made device numbers dynamically allocated and independent of the unit number passed to make_dev() a long time ago, it was actually a misnomer. If you really want to obtain the device number, you should use dev2udev(). We already converted all the drivers to use dev2unit() to obtain the device unit number, which is still used by a lot of drivers. I've noticed not a single driver passes NULL to dev2unit(). Even if they did, its behaviour would make little sense. This is why I've removed the NULL check. This commit removes minor(), minor2unit() and unit2minor() from the kernel. Because there was a naming collision with uminor(), we can rename umajor() and uminor() back to major() and minor(). This means that the makedev(3) manual page also applies to kernel space code now. I suspect umajor() and uminor() aren't used that often in external code, but to make it easier for other parties to port their code, I've increased __FreeBSD_version to 800062.	2009-01-28 17:57:16 +00:00
kensmith	7d088b7fca	Handle VFS_VGET() failing with an error other than EOPNOTSUPP in addition to failing with that error. PR: 125149 Submitted by: Jaakko Heinonen (jh <at> saunalahti <dot> fi) Reviewed by: mohans, kan MFC after: 3 days	2008-12-16 04:34:09 +00:00
dfr	75eeed4f1f	We need to pass a structure with enough space for an NFSv2 filehandle to nfs_srvmtofh_xx otherwise bad things happen when an NFSv2 client tries to make a request.	2008-12-10 14:49:54 +00:00
kan	c7b0520697	Change nfsserver slightly so that it does not trip over the timestamp validation code on ZFS. Problem: when opening a file with O_CREAT|O_EXCL, NFS has to jump through extra hoops to ensure O_EXCL semantics. Namely, the client supplies 8 bytes (NFSX_V3CREATEVERF) of verification data to uniquely identify this create request. The server then creates a new file with access mode 0, copies the received 8 bytes into the va_atime member of struct vattr and attempts to set the atime on the file using VOP_SETATTR. If that succeeds, it fetches the file attributes with VOP_GETATTR and verifies that the atime timestamps match. If the timestamps do not match, the NFS server concludes it has probably lost the race to another process creating a file with the same name and bails with EEXIST. This scheme works OK when the exported FS is FFS, but if the underlying filesystem is ZFS _and_ the server is running a 64bit kernel, it breaks down due to sanity checking in the zfs_setattr function, which refuses to accept any timestamps which have a tv_sec that cannot be represented as a 32bit int. Since struct timespec fields are 64 bit integers on 64bit platforms and the server just copies NFSX_V3CREATEVERF bytes into va_atime, all eight bytes supplied by the client end up in va_atime.tv_sec, forcing it out of the valid 32bit range. The solution this change implements is simple: it treats NFSX_V3CREATEVERF as two 32bit integers and unpacks them separately into va_atime.tv_sec and va_atime.tv_nsec respectively, thus guaranteeing that tv_sec remains in 32 bit range and ZFS remains happy. Reviewed by: kib	2008-12-03 17:54:09 +00:00
kib	bf74bb2e16	In the nfsrv_fhtovp(), after the vfs_getvfs() function found the pointer to the fs, but before a vnode on the fs is locked, unmount may free fs structures, causing access to destroyed data and freed memory. Introduce a vfs_busymp() function that looks up and busies found fs while mountlist_mtx is held. Use it in nfsrv_fhtovp() and in the implementation of the handle syscalls. Two other uses of the vfs_getvfs() in the vfs_subr.c, namely in sysctl_vfs_ctl and vfs_getnewfsid seems to be ok. In particular, sysctl_vfs_ctl is protected by Giant by being a non-sleeping sysctl handler, that prevents Giant-locked unmount code to interfere with it. Noted by: tegge Reviewed by: dfr Tested by: pho MFC after: 1 month	2008-11-29 13:34:59 +00:00
dfr	367ae23f20	Switch the default rpc implementation for NFS back to the new code. I believe I have fixed the reported problems - if you still have trouble with it, please contact me with as much detail as possible so that I can track down any other issues as quickly as possible.	2008-11-14 11:27:53 +00:00
dfr	3b3a59e0c7	Use the remote address for access control, not the local address. This fixes the nfsd problems that some people have with the new code. Add support for the vfs.nfsrv.nfs_privport sysctl which denies access unless the client is using a port number less than 1024. Not really sure if this is particularly useful since it doesn't add any real security.	2008-11-13 14:36:52 +00:00
dfr	f2543b22e0	Temporarily switch NFS back to the old RPC code while I try to diagnose and fix the problems a few people have noticed with the new code. People who want to continue testing the new code or who need RPCSEC_GSS support should use the new option NFS_NEWRPC to select it.	2008-11-13 11:35:18 +00:00
dfr	743ba239e7	Turn (NFSERR_AUTHERR|code) status values into svcerr_auth(rqst, code) replies instead of returning a success with a bogus NFS error code.	2008-11-12 09:38:18 +00:00
dfr	dd7b0d73e4	Allow v3 GETATTR requests even when weakly authenticated. Change the error return for weakly authenticated requests from REJECTEDCRED to WEAKAUTH for consistency with Solaris.	2008-11-12 09:36:35 +00:00
dfr	cc925302a0	Range-check NFSv2 procedure numbers before converting to NFSv3. Submitted by: csjp	2008-11-07 10:43:01 +00:00
dfr	ee8eea89ff	Don't depend on krpc.ko in the NFS_LEGACYRPC case.	2008-11-06 11:43:49 +00:00
des	4cc1c2c89f	Unbreak NFS. Pointy hat to: dfr	2008-11-06 10:53:35 +00:00
dfr	d070e7ad67	If mountd doesn't specify a secflavor list for the mount, assume that -sec=sys is what was wanted.	2008-11-05 16:25:26 +00:00
dfr	aa144d95db	Include <sys/eventhandler.h>.	2008-11-04 16:43:02 +00:00
dfr	2fb03513fc	Implement support for RPCSEC_GSS authentication to both the NFS client and server. This replaces the RPC implementation of the NFS client and server with the newer RPC implementation originally developed (actually ported from the userland sunrpc code) to support the NFS Lock Manager. I have tested this code extensively and I believe it is stable and that performance is at least equal to the legacy RPC implementation. The NFS code currently contains support for both the new RPC implementation and the older legacy implementation inherited from the original NFS codebase. The default is to use the new implementation - add the NFS_LEGACYRPC option to fall back to the old code. When I merge this support back to RELENG_7, I will probably change this so that users have to 'opt in' to get the new code. To use RPCSEC_GSS on either client or server, you must build a kernel which includes the KGSSAPI option and the crypto device. On the userland side, you must build at least a new libc, mountd, mount_nfs and gssd. You must install new versions of /etc/rc.d/gssd and /etc/rc.d/nfsd and add 'gssd_enable=YES' to /etc/rc.conf. As long as gssd is running, you should be able to mount an NFS filesystem from a server that requires RPCSEC_GSS authentication. The mount itself can happen without any kerberos credentials but all access to the filesystem will be denied unless the accessing user has a valid ticket file in the standard place (/tmp/krb5cc_<uid>). There is currently no support for situations where the ticket file is in a different place, such as when the user logged in via SSH and has delegated credentials from that login. This restriction is also present in Solaris and Linux. In theory, we could improve this in future, possibly using Brooks Davis' implementation of variant symlinks. Supporting RPCSEC_GSS on a server is nearly as simple. You must create service creds for the server in the form 'nfs/<fqdn>@<REALM>' and install them in /etc/krb5.keytab. The standard heimdal utility ktutil makes this fairly easy. After the service creds have been created, you can add a '-sec=krb5' option to /etc/exports and restart both mountd and nfsd. The only other difference an administrator should notice is that nfsd doesn't fork to create service threads any more. In normal operation, there will be two nfsd processes, one in userland waiting for TCP connections and one in the kernel handling requests. The latter process will create as many kthreads as required - these should be visible via 'top -H'. The code has some support for varying the number of service threads according to load but initially at least, nfsd uses a fixed number of threads according to the value supplied to its '-n' option. Sponsored by: Isilon Systems MFC after: 1 month	2008-11-03 10:38:00 +00:00
trhodes	3c9c77e154	Document a few sysctls in the NFS client and server code. Minor style(9) where applicable. Approved by: alfred (slightly older version)	2008-11-02 17:00:23 +00:00
trasz	0ad8692247	Introduce accmode_t. This is required for NFSv4 ACLs - it will be necessary to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which is 16 bit. Approved by: rwatson (mentor)	2008-10-28 13:44:11 +00:00
rwatson	a2129bd144	Rename three MAC entry points from _proc_ to _cred_ to reflect the fact that they operate directly on credentials: mac_proc_create_swapper(), mac_proc_create_init(), and mac_proc_associate_nfsd(). Update policies. Obtained from: TrustedBSD Project	2008-10-28 11:33:06 +00:00
des	66f807ed8b	Retire the MALLOC and FREE macros. They are an abomination unto style(9). MFC after: 3 months	2008-10-23 15:53:51 +00:00
rwatson	25ed07a2fe	Turn XXX's for unlocked writes of NFS server statistics to simple notes, as we consider it a feature to exchange performance for consistency. MFC after: 3 days	2008-10-12 20:06:59 +00:00
attilio	23ff3dbeb8	Remove the suser(9) interface from the kernel. It was replaced years ago by the priv_check(9) interface and only a few uses are left. Note that compatibility stubs for older FreeBSD versions (all above the 8 limit though) are left in order to reduce diffs against old versions. It is the responsibility of the maintainers of any module, if they think it is the case, to axe out such cases. This patch breaks the KPI so __FreeBSD_version will be bumped in a later commit. This patch needs to be credited 50-50 with rwatson@ as he found time to explain to me how priv_check() works in detail and to review patches. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com> Reviewed by: rwatson	2008-09-17 15:49:44 +00:00
attilio	a9873f87a6	Decontext-alize the nfsserver module. Now, only some few places still require thread passing (mostly the ones which access to VOP_* functions) and will be fixed once the primitive also will be. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2008-09-16 21:57:39 +00:00
attilio	dbf35e279f	Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread was always curthread and totally unuseful. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2008-08-28 15:23:18 +00:00
rwatson	8805d22e34	Remove spls from NFS server setup call; expand receive socket buffer locking to cover full setup of socket upcalls; remove XXX about locking. MFC after: 3 weeks	2008-06-30 20:43:06 +00:00
kib	f4b9bd396a	Change the fix in the rev. 1.179 to use nfsrv_lockedpair_nd(). Tested by: pho MFC after: 3 days	2008-05-28 16:23:17 +00:00
kib	2e00a34c1d	Initialize vfslocked prior to calling nfsm_srvmtofh where it was forgotten. Reported by: Andrew Edwards <aedwards sandvine com> Tested by: pho MFC after: 3 days	2008-05-28 16:21:32 +00:00
ru	3b1bf8c2e9	Replaced the misleading uses of a historical artefact M_TRYWAIT with M_WAIT. Removed dead code that assumed that M_TRYWAIT can return NULL; it's not true since the advent of MBUMA. Reviewed by: arch There are ongoing disputes as to whether we want to switch to directly using UMA flags M_WAITOK/M_NOWAIT for mbuf(9) allocation.	2008-03-25 09:39:02 +00:00
jeff	a9d123c3ab	- Complete part of the unfinished bufobj work by consistently using BO_LOCK/UNLOCK/MTX when manipulating the bufobj. - Create a new lock in the bufobj to lock bufobj fields independently. This leaves the vnode interlock as an 'identity' lock while the bufobj is an io lock. The bufobj lock is ordered before the vnode interlock and also before the mnt ilock. - Exploit this new lock order to simplify softdep_check_suspend(). - A few sync related functions are marked with a new XXX to note that we may not properly interlock against a non-zero bv_cnt when attempting to sync all vnodes on a mountlist. I do not believe this race is important. If I'm wrong this will make these locations easier to find. Reviewed by: kib (earlier diff) Tested by: kris, pho (earlier diff)	2008-03-22 09:15:16 +00:00
dfr	cba668f51c	Fix a regression from the last revision - don't edit the ns_rec list while not holding the lock.	2008-03-19 12:33:25 +00:00
dfr	f46620ae37	Don't call nfs_realign while holding locks. Reviewed by: kib	2008-03-18 18:42:59 +00:00
kib	02dada141b	Fix the Giant leak in the nfsrv_remove(). Reported by: pluknet <pluknet gmail com> MFC after: 1 week	2008-03-04 11:05:03 +00:00
remko	c050b3d1bc	Use nfsrv_destroycache() only once, else it crashes the server. PR: kern/118152 Submitted by: Bjoern Groenvall <bg at sics dot se> Approved by: imp (mentor, a while ago already), jhb MFC After: 3 days	2008-01-18 17:03:36 +00:00
attilio	71b7824213	VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjunction with 'thread' argument passing which is always curthread. Remove the useless extra argument and pass curthread explicitly to lower layer functions, when necessary. The KPI is broken by this change, which should affect several ports, so a version bump and manpage update will be committed later. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>	2008-01-13 14:44:15 +00:00
attilio	18d0a0dd51	vn_lock() is currently only used with 'curthread' passed as argument. Remove this argument and pass curthread directly to the underlying VOP_LOCK1() VFS method. This modification makes the code cleaner and in particular removes an annoying dependency, helping the next lockmgr() cleanup. The KPI is, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, it is worth saying that the next commits will address a similar cleanup of VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>	2008-01-10 01:10:58 +00:00
rwatson	35dc7f2858	Garbage collect now-unused nfsrv_setcred() -- it's not only unused, but also a purveyor of unfortunate (and now unsupported) direct frobbing of struct ucred. MFC after: 3 days	2007-11-04 19:20:33 +00:00
rwatson	8756317538	Rename mac_associate_nfsd_label() to mac_proc_associate_nfsd(), and move from mac_vfs.c to mac_process.c to join other functions that set up process labels for specific purposes. Unlike the two proc create calls, this call is intended to run after creation when a process registers as the NFS daemon, so remains an _associate_ call. Obtained from: TrustedBSD Project	2007-10-25 12:34:14 +00:00
jhb	1f6b3a5f2c	Add a -z flag to nfsstat which zeros the NFS statistics after displaying them. MFC after: 1 week Requested by: ps Submitted by: ps (6 years ago)	2007-10-18 16:38:07 +00:00
mohans	11057bb00f	Set the NFS server sockbuf high watermarks to the system defaults (up from 32KB). The low high watermark setting caused UDP fullsock request drops, throttling thruput greatly. Reported by: Kris Kennaway Approved by: re@ (Ken Smith)	2007-10-12 03:56:27 +00:00
rwatson	23574c8673	Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, which previously conditionally acquired Giant based on debug.mpsafenet. As that has now been removed, they are no longer required. Removing them significantly simplifies error-handling in the socket layer, eliminating quite a bit of unwinding of locking in error cases. While here clean up the now unneeded opt_net.h, which previously was used for the NET_WITH_GIANT kernel option. Clean up some related gotos for consistency. Reviewed by: bz, csjp Tested by: kris Approved by: re (kensmith)	2007-08-06 14:26:03 +00:00
rwatson	c29e74320b	First in a series of changes to remove the now-unused Giant compatibility framework for non-MPSAFE network protocols: - Remove debug_mpsafenet variable, sysctl, and tunable. - Remove NET_NEEDS_GIANT() and associated SYSINITs used by it to force debug.mpsafenet=0 if non-MPSAFE protocols are compiled into the kernel. - Remove logic to automatically flag interrupt handlers as non-MPSAFE if debug.mpsafenet is set for an INTR_TYPE_NET handler. - Remove logic to automatically flag netisr handlers as non-MPSAFE if debug.mpsafenet is set. - Remove references in a few subsystems, including NFS and Cronyx drivers, which keyed off debug_mpsafenet to determine various aspects of their own locking behavior. - Convert NET_LOCK_GIANT(), NET_UNLOCK_GIANT(), and NET_ASSERT_GIANT into no-op's, as their entire behavior was determined by the value in debug_mpsafenet. - Alias NET_CALLOUT_MPSAFE to CALLOUT_MPSAFE. Many remaining references to NET_.*_GIANT() and NET_CALLOUT_MPSAFE are still present in subsystems, and will be removed in followup commits. Reviewed by: bz, jhb Approved by: re (kensmith)	2007-07-27 11:59:57 +00:00
rwatson	f938c62f4e	Include priv.h to pick up suser(9) definitions, missed in an earlier commit. Warnings spotted by: kris	2007-06-13 22:42:43 +00:00
mjacob	24a416aad5	Init timespec to zero to quiesce warnings.	2007-06-10 04:42:20 +00:00
rwatson	d1196975a0	Remove MAC Framework access control check entry points made redundant with the introduction of priv(9) and MAC Framework entry points for privilege checking/granting. These entry points exactly aligned with privileges and provided no additional security context: - mac_check_sysarch_ioperm() - mac_check_kld_unload() - mac_check_settime() - mac_check_system_nfsd() Add mpo_priv_check() implementations to Biba and LOMAC policies, which, for each privilege, determine if they can be granted to processes considered unprivileged by those two policies. These mostly, but not entirely, align with the set of privileges granted in jails. Obtained from: TrustedBSD Project	2007-04-22 15:31:22 +00:00
rwatson	32f12b60cc	Attempt to rationalize NFS privileges: - Replace PRIV_NFSD with PRIV_NFS_DAEMON, add PRIV_NFS_LOCKD. - Use PRIV_NFS_DAEMON in the NFS server. - In the NFS client, move the privilege check from nfslockdans(), which occurs every time a write is performed on /dev/nfslock, and instead do it in nfslock_open() just once. This allows us to avoid checking the saved uid for root, and just use the effective on open. Use PRIV_NFS_LOCKD.	2007-04-21 18:11:19 +00:00
rwatson	35b5232a25	In nfsrv_rcv(), don't reacquire the nfs server lock until after nfs_realign() has been called, as it may sleep waiting on memory allocation. Reported by: simon	2007-04-15 15:50:50 +00:00
jhb	1ad11dc4f1	- Split out the part of SYSCALL_MODULE_HELPER() that builds a 'struct sysent' for a new system call into a new MAKE_SYSENT() macro. - Use MAKE_SYSENT() to build a full sysent for the nfssvc system call in the NFS server and use syscall_register() and syscall_deregister() to manage the nfssvc system call entry instead of manually frobbing the sysent[] array.	2007-04-02 13:53:26 +00:00
jhb	c2c01f044f	Initialize vfslocked to 0 before nfsm_srvmtofh() so that the variable is not used uninitialized in 'nfsmout' if nfsm_srvmtofh() gets an internal error. CID: 1766 Found by: Coverity Prevent (tm)	2007-03-26 15:14:58 +00:00
jeff	d43d58ff45	- Turn all explicit giant acquires into conditional VFS_LOCK_GIANTs. Only ops which used namei still remained. - Implement a scheme for reducing the overhead of tracking which vops require giant by constantly reducing the number of recursive giant acquires to one, leaving us with only one vfslocked variable. - Remove all NFSD lock acquisition and release from the individual nfs ops. Careful examination has shown that they are not required. This greatly simplifies the code. Sponsored by: Isilon Systems, Inc. Discussed with: rwatson Tested by: kkenn Approved by: re	2007-03-17 18:18:08 +00:00
wkoszek	d9c0510dba	Change these descriptions of memory types used in malloc(9), as their current, rather long strings make output from vmstat -m look unpleasant. Approved by: cognet (mentor)	2007-03-05 00:21:40 +00:00
rwatson	300d4098cf	Remove 'MPSAFE' annotations from the comments above most system calls: all system calls now enter without Giant held, and then in some cases, acquire Giant explicitly. Remove a number of other MPSAFE annotations in the credential code and tweak one or two other adjacent comments.	2007-03-04 22:36:48 +00:00
pjd	cb2d7c85a8	Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method. This way we may support multiple structures in v_data vnode field within one file system without using black magic. Vnode-to-file-handle should be VOP in the first place, but was made VFS operation to keep interface as compatible as possible with SUN's VFS. BTW. Now Solaris also implements vnode-to-file-handle as VOP operation. VFS_VPTOFH() was left for API backward compatibility, but is marked for removal before 8.0-RELEASE. Approved by: mckusick Discussed with: many (on IRC) Tested with: ufs, msdosfs, cd9660, nullfs and zfs	2007-02-15 22:08:35 +00:00
mpp	f66eda706d	Get the vfs giant lock before calling nfs_access. Reviewed by: mohan	2007-02-13 03:27:45 +00:00
hrs	7c35092b08	The nfsm_srvpathsiz() macro in nfsrv_symlink() in nfs_serv.c should check length of the pathname in the range 0<=n<=NFS_MAXPATHLEN, not 0<n<=NFS_MAXPATHLEN. This fixes a minor interoperability problem that the FreeBSD NFS server did not allow a symlink pointing the empty pathname. MFC after: 1 week	2007-01-02 20:42:08 +00:00
bz	297206ec2a	MFp4: 92972, 98913 + one more change In ip6_sprintf no longer use and return one of eight static buffers for printing/logging ipv6 addresses. The caller now has to hand in a sufficiently large buffer as first argument.	2006-12-12 12:17:58 +00:00
rwatson	65d3526a64	Push Giant a bit further off the NFS server in a number of straight forward cases by converting from unconditional acquisition of Giant around vnode operations to conditional acquisition: - Remove nfsrv_access_withgiant(), and cause nfsrv_access() to now assert that Giant will be held if it is required for the vnode. - Add nfsrv_fhtovp_locked(), which will drop the NFS server lock if required, and modify nfsrv_fhtovp() to conditionally acquire Giant if required. - In the VOP's not dealing with more than one vnode at a time (i.e., not involving a lookup), conditionally acquire Giant. This removes Giant use for MPSAFE file systems for a number of quite important RPCs, including getattr, read, write. It leaves unconditional Giant acquisitions in vnode operations that interact with the name space or more than one vnode at a time as these require further work. Tested by: kris Reviewed by: kib	2006-11-24 11:53:16 +00:00
pjd	62a0bc913e	Protect nfsm_srvpathsiz() call with the nfsd_mtx lock. Reviewed by: mohans	2006-11-20 07:32:52 +00:00
rwatson	10d0d9cf47	Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>	2006-11-06 13:42:10 +00:00
kib	bf3aa367e2	Fix leak in NAMEI zone caused by nfs server when VOP_RENAME fails. Submitted by: Padma Bhooma <pbhooma at panasas com> Reviewed by: bde Approved by: pjd (mentor) MFC after: 1 week	2006-10-26 12:41:53 +00:00
rwatson	7beaaf5cd2	Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead. This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd. Obtained from: TrustedBSD Project Sponsored by: SPARTA	2006-10-22 11:52:19 +00:00
jhb	4dc640e56e	- Add a new function nfsrv_destroycache() to tear down the server request cache when unloading the nfsserver module. This fixes a memory leak and a stale pointer. - Use callout_drain() rather than callout_stop() when unloading the nfsserver module. MFC after: 3 days	2006-08-01 16:27:14 +00:00
jhb	dcdaa35dc6	Use TAILQ_FOREACH_SAFE() in a couple of places.	2006-08-01 15:32:25 +00:00
jhb	c62c38439f	Now that all system calls are MPSAFE, retire the SYF_MPSAFE flag used to mark system calls as being MPSAFE: - Stop conditionally acquiring Giant around system call invocations. - Remove all of the 'M' prefixes from the master system call files. - Remove support for the 'M' prefix from the script that generates the syscall-related files from the master system call files. - Don't explicitly set SYF_MPSAFE when registering nfssvc.	2006-07-28 19:05:28 +00:00
rwatson	40868fda8a	soreceive_generic(), and sopoll_generic(). Add new functions sosend(), soreceive(), and sopoll(), which are wrappers for pru_sosend, pru_soreceive, and pru_sopoll, and are now used universally by socket consumers rather than either directly invoking the old so*() functions or directly invoking the protocol switch method (about an even split prior to this commit). This completes an architectural change that was begun in 1996 to permit protocols to provide substitute implementations, as now used by UDP. Consumers now uniformly invoke sosend(), soreceive(), and sopoll() to perform these operations on sockets -- in particular, distributed file systems and socket system calls. Architectural head nod: sam, gnn, wollman	2006-07-24 15:20:08 +00:00
mohans	798a5b356c	Size the NFS server dupreq cache on the basis of nmbclusters. On servers with low nmbclusters, we tie up too many mbclusters in the NFS duplicate request cache. This change limits the size of the dupreq cache to 1/2 the nmbclusters (and floats in a range of [64, 2048]). MFC after 2 weeks. Reported by: Steve Kargl, David O'Brien Tested by: Steve Kargl	2006-06-23 00:42:26 +00:00
kib	a5b858d3fd	Temporary workaround to prevent leak of Giant from nfsd when calling lookup(). Reviewed by: tegge Tested by: "Arno J. Klaassen" <arno at heho snv jussieu fr>, "Rong-en Fan" <grafan at gmail com>, Dmitriy Kirhlarov <dimma at higis ru>, Dmitry Pryanishnikov <dmitry at atlantis dp ua> MFC after: 1 week Approved by: kan, pjd (mentors)	2006-06-05 14:48:02 +00:00
mohans	38b8fecaba	Bump up the NFS server dupreq cache limit to 2K (from 64). With a small duplicate request cache, under heavy load a lot of non-idempotent requests were getting served again, resulting in errors. Found by: Kris Kennaway.	2006-04-25 00:21:56 +00:00
csjp	be495bef58	Introduce a new MAC entry point for label initialization of the NFS daemon's credential: mac_associate_nfsd_label() This entry point can be utilized by various Mandatory Access Control policies so they can properly initialize the label of files which get created as a result of an NFS operation. This work will be useful for fixing kernel panics associated with accessing un-initialized or invalid vnode labels. The implementation of these entry points will come shortly. Obtained from: TrustedBSD Requested by: mdodd MFC after: 3 weeks	2006-04-06 23:33:11 +00:00
cel	08249d49bf	rick says: The following bug was just identified in OpenBSD and it looks like the same bug exists in the other BSDen NFS servers. A Linux client (don't know which version, but you can look at http://bugzilla.kernel.org/show_bug.cgi?id=6256) does a Setattr of mtime to the server's time, where the file is mode 0664 and the client user has group access (ie. caller is not the file owner). The BSD servers fail the Setattr with EPERM, since the VA_UTIMES_NULL flag isn't set before doing the VOP_SETATTR. It seems to me that this should be allowed, since it is allowed for a local utimes(2). If so, the fix is to set VA_UTIMES_NULL for the "set-time-to-server-time" cases of setting atime and/or mtime. Submitted by: rick@snowhite.cis.uoguelph.ca Reviewed by: cel Approved by: silby MFC after: 1 week	2006-04-02 04:24:57 +00:00
jeff	32b1878006	- Release the references acquired by VOP_GETWRITEMOUNT and vfs_getvfs(). Discussed with: tegge Tested by: kris Sponsored by: Isilon Systems, Inc.	2006-03-31 03:54:20 +00:00
jeff	52c1783c83	- Reorder vrele calls after vput calls to prevent lock order reversals between leaf and directory locks. Found by: kris Sponsored by: Isilon Systems, Inc.	2006-03-12 04:59:04 +00:00
simon	edc000b320	When parsing an RPC request in nfsrv_dorec(), KASSERT that there actually is an mbuf to process. This catches the missing mbuf before it would otherwise cause a NULL pointer dereference, which could be triggered by a 0 length RPC record before the check for such records was added in rev 1.97. Approved by: cperciva (mentor)	2006-03-08 20:21:15 +00:00
simon	1b31e5fc10	Correct a remote kernel panic when processing zero-length RPC records via TCP. [06:10] Security: FreeBSD-SA-06:10.nfs Approved by: cperciva	2006-03-01 14:17:32 +00:00
jeff	30a231055b	- Reorder calls to vrele() after calls to vput() when the vrele is a directory. vrele() may lock the passed vnode, which in these cases would give an invalid lock order of child -> parent. These situations are deadlock prone although do not typically deadlock because the vrele is typically not releasing the last reference to the vnode. Users of vrele must consider it as a call to vn_lock() and order it appropriately. MFC After: 1 week Sponsored by: Isilon Systems, Inc. Tested by: kkenn	2006-02-01 00:25:26 +00:00
csjp	34b8c6a440	Manage the ucred for the NFS server using the crget/crfree API defined in kern_prot.c. This API handles reference counting among many other things. Notably, if MAC is compiled into the kernel, it will properly initialize the MAC labels when the ucred is allocated. This work is in preparation for a new MAC entry point which will be responsible for properly initializing policy specific labels for the NFS server credential. Utilization of the crfree/crget APIs reduces the complexity associated with this label's management. Submitted by: green (with changes) [1] Obtained from: TrustedBSD Project Discussed with: rwatson, alfred [1] I moved the ucred allocation outside the scope of the NFS server lock to prevent M_WAITOK allocations from occurring with non-sleep-able locks held. Additionally, to reduce complexity, the ucred persists as long as the NFS server descriptor.	2006-01-28 19:24:40 +00:00
trhodes	80610803f5	Revert my previous commit. Proved I'm not that bright at times: jhb	2006-01-23 21:06:22 +00:00
trhodes	f927a72593	Fix indentation. Prodded by: stefanf, ru, njl (in that order)	2006-01-23 17:41:43 +00:00
trhodes	f9cd8b5d9f	Remove some dead code. Found with: Coverity Prevent(tm)	2006-01-21 12:10:33 +00:00
rwatson	be4f357149	Normalize a significant number of kernel malloc type names: - Prefer '_' to ' ', as it results in more easily parsed results in memory monitoring tools such as vmstat. - Remove punctuation that is incompatible with using memory type names as file names, such as '/' characters. - Disambiguate some collisions by adding subsystem prefixes to some memory types. - Generally prefer lower case to upper case. - If the same type is defined in multiple architecture directories, attempt to use the same name in additional cases. Not all instances were caught in this change, so more work is required to finish this conversion. Similar changes are required for UMA zone names.	2005-10-31 15:41:29 +00:00
glebius	c593d62fd6	Keep locks consistent before goto. Reported by: pho Reviewed by: mohans	2005-10-27 19:02:34 +00:00
jhb	0d152100b2	Use the refcount API to manage the reference count for user credentials rather than using pool mutexes. Tested on: i386, alpha, sparc64	2005-09-27 18:09:42 +00:00
rwatson	21db4509f1	NFS write gathering defers execution of NFS server write requests to wait to see if additional write requests will arrive that can be coalesced and clustered with earlier ones. When doing so, it must determine whether the two requests are made by credentials with the same access rights, so as not to coalesce improperly. NFSW_SAMECRED() implements a test of two credentials using a binary compare. Replace NFSW_SAMECRED() macro with nfsrv_samecred() function, which is aware of the contents and layout of a struct ucred, rather than a simple binary compare. While the binary compare works when ucred is simply a zero'd and embedded 'struct ucred' in the NFS descriptor, it will work less well when the ucred associated with an NFS descriptor is "real", so has defined and populated reference count, mutex, etc. MFC after: 1 week Obtained from: TrustedBSD Project	2005-04-17 16:25:36 +00:00

506 Commits