freebsd-skq

Author	SHA1	Message	Date
rwatson	765a83fd79	Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are imported for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead. - Generally eliminate the distinction between "fast" and regular acquisisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks. - Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively. - Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb). - Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date. In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio). Tested by: kris Discussed with: jhb, kris, attilio, jeff	2007-04-04 09:11:34 +00:00
kris	2ddeaf9203	Annotate that this giant acqusition is dependent on tty locking.	2007-03-26 21:56:46 +00:00
tegge	214bc5723c	Make insmntque() externally visibile and allow it to fail (e.g. during late stages of unmount). On failure, the vnode is recycled. Add insmntque1(), to allow for file system specific cleanup when recycling vnode on failure. Change getnewvnode() to no longer call insmntque(). Previously, embryonic vnodes were put onto the list of vnode belonging to a file system, which is unsafe for a file system marked MPSAFE. Change vfs_hash_insert() to no longer lock the vnode. The caller now has that responsibility. Change most file systems to lock the vnode and call insmntque() or insmntque1() after a new vnode has been sufficiently setup. Handle failed insmntque*() calls by propagating errors to callers, possibly after some file system specific cleanup. Approved by: re (kensmith) Reviewed by: kib In collaboration with: kib	2007-03-13 01:50:27 +00:00
rwatson	10d0d9cf47	Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>	2006-11-06 13:42:10 +00:00
rwatson	7beaaf5cd2	Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead. This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd. Obtained from: TrustedBSD Project Sponsored by: SPARTA	2006-10-22 11:52:19 +00:00
kib	8a69d7f5b4	Update the access and modification times for dev while still holding thread reference on it. Reviewed by: tegge Approved by: pjd (mentor)	2006-10-20 08:03:42 +00:00
kib	5f5bf9dadc	Fix the race between devfs_fp_check and devfs_reclaim. Derefence the vnode' v_rdev and increment the dev threadcount , as well as clear it (in devfs_reclaim) under the dev_lock(). Reviewed by: tegge Approved by: pjd (mentor)	2006-10-20 07:59:50 +00:00
kib	afa2e43fb6	Properly lock the vnode around vgone() calls. Unlock the vnode in devfs_close() while calling into the driver d_close() routine. devfs_revoke() changes by: ups Reviewed and bugfixes by: tegge Tested by: mbr, Peter Holm Approved by: pjd (mentor) MFC after: 1 week	2006-10-18 11:17:14 +00:00
tegge	83154f853d	Use mount interlock to protect all changes to mnt_flag and mnt_kern_flag. This eliminates a race where MNT_UPDATE flag could be lost when nmount() raced against sync(), sync_fsync() or quotactl().	2006-09-26 04:12:49 +00:00
kib	11200e2de3	Fix the bug in rev. 1.134. In devfs_allocv_drop_refs(), when not_found == 2 and drop_dm_lock is true, no unlocking shall be attempted. The lock is already dropped and memory is freed. Found with: Coverity Prevent(tm) CID: 1536 Approved by: pjd (mentor)	2006-09-19 14:03:02 +00:00
kib	ecf34f4504	Resolve the devfs deadlock caused by LOR between devfs_mount->dm_lock and vnode lock in devfs_allocv. Do this by temporary dropping dm_lock around vnode locking. For safe operation, add hold counters for both devfs_mount and devfs_dirent, and DE_DOOMED flag for devfs_dirent. The facilities allow to continue after dropping of the dm_lock, by making sure that referenced memory does not disappear. Reviewed by: tegge Tested by: kris Approved by: kan (mentor) PR: kern/102335	2006-09-18 13:23:08 +00:00
phk	0f924547b0	Remove the NDEVFSINO and NDEVFSOVERFLOW options which no longer exists in DEVFS. Remove the opt_devfs.h file now that it is empty.	2006-07-17 09:07:02 +00:00
ups	b5dc376dfa	Add vnode interlocking to devfs. This prevents race conditions that can cause pagefaults or devfs to use arbitrary vnodes. MFC after: 1 week	2006-07-12 20:25:35 +00:00
rwatson	c9e5505f09	Remove now unneeded opt_mac.h and mac.h includes. MFC after: 3 days	2006-07-06 13:24:22 +00:00
rwatson	9d9a014802	Use #include "", not #include <> for opt_foo.h. MFC after: 3 days	2006-07-06 13:22:08 +00:00
pjd	b8538a9381	Remove unused prototypes.	2006-04-12 12:17:29 +00:00
jeff	158187fcb0	- Add a bogus vhold/vdrop around vgone() in devfs_revoke. Without this the vnode is never recycled. It is bogus because the reference really should be associated with the devfs dirent.	2006-03-31 23:37:29 +00:00
jeff	4c3ad6634a	- We must hold a reference to a vnode before calling vgone() otherwise it may not be removed from the freelist. MFC After: 1 week Found by: kris	2006-02-22 09:05:40 +00:00
jeff	af5f248494	- Remove a stale comment. This function was rewritten to be SMP safe some time ago. Sponsored by: Isilon Systems, Inc.	2006-01-30 08:24:14 +00:00
rwatson	428f554873	When returning EIO from DEVFSIO_RADD ioctl, drop the exclusive rule lock. Otherwise the system comes to a rather sudden and grinding halt. MFC after: 1 week	2006-01-03 09:49:10 +00:00
dwhite	0bcdf7c033	This is a workaround for a complicated issue involving VFS cookies and devfs. The PR and patch have the details. The ultimate fix requires architectural changes and clarifications to the VFS API, but this will prevent the system from panicking when someone does "ls /dev" while running in a shell under the linuxulator. This issue affects HEAD and RELENG_6 only. PR: 88249 Submitted by: "Devon H. O'Dell" <dodell@ixsystems.com> MFC after: 3 days	2005-11-09 22:03:50 +00:00
phk	3ed5c9efd0	Use correct cirteria for determining which directory entries we can purge right away and which we merely can hide. Beaten into my skull by: kris	2005-10-18 20:21:25 +00:00
phk	a951e327e6	Make rule zero really magical, that way we don't have to do anything when we mount and get zero cost if no rules are used in a mountpoint. Add code to deref rules on unmount. Switch from SLIST to TAILQ. Drop SYSINIT, use SX_SYSINIT and static initializer of TAILQ instead. Drop goto, a break will do. Reduce double pointers to single pointers. Combine reaping and destroying rulesets. Avoid memory leaks in a some error cases.	2005-09-24 07:03:09 +00:00
phk	6a408cbd71	Rewamp DEVFS internals pretty severely [1]. Give DEVFS a proper inode called struct cdev_priv. It is important to keep in mind that this "inode" is shared between all DEVFS mountpoints, therefore it is protected by the global device mutex. Link the cdev_priv's into a list, protected by the global device mutex. Keep track of each cdev_priv's state with a flag bit and of references from mountpoints with a dedicated usecount. Reap the benefits of much improved kernel memory allocator and the generally better defined device driver APIs to get rid of the tables of pointers + serial numbers, their overflow tables, the atomics to muck about in them and all the trouble that resulted in. This makes RAM the only limit on how many devices we can have. The cdev_priv is actually a super struct containing the normal cdev as the "public" part, and therefore allocation and freeing has moved to devfs_devs.c from kern_conf.c. The overall responsibility is (to be) split such that kern/kern_conf.c is the stuff that deals with drivers and struct cdev and fs/devfs handles filesystems and struct cdev_priv and their private liason exposed only in devfs_int.h. Move the inode number from cdev to cdev_priv and allocate inode numbers properly with unr. Local dirents in the mountpoints (directories, symlinks) allocate inodes from the same pool to guarantee against overlaps. Various other fields are going to migrate from cdev to cdev_priv in the future in order to hide them. A few fields may migrate from devfs_dirent to cdev_priv as well. Protect the DEVFS mountpoint with an sx lock instead of lockmgr, this lock also protects the directory tree of the mountpoint. Give each mountpoint a unique integer index, allocated with unr. Use it into an array of devfs_dirent pointers in each cdev_priv. Initially the array points to a single element also inside cdev_priv, but as more devfs instances are mounted, the array is extended with malloc(9) as necessary when the filesystem populates its directory tree. Retire the cdev alias lists, the cdev_priv now know about all the relevant devfs_dirents (and their vnodes) and devfs_revoke() will pick them up from there. We still spelunk into other mountpoints and fondle their data without 100% good locking. It may make better sense to vector the revoke event into the tty code and there do a destroy_dev/make_dev on the tty's devices, but that's for further study. Lots of shuffling of stuff and churn of bits for no good reason[2]. XXX: There is still nothing preventing the dev_clone EVENTHANDLER from being invoked at the same time in two devfs mountpoints. It is not obvious what the best course of action is here. XXX: comment out an if statement that lost its body, until I can find out what should go there so it doesn't do damage in the meantime. XXX: Leave in a few extra malloc types and KASSERTS to help track down any remaining issues. Much testing provided by: Kris Much confusion caused by (races in): md(4) [1] You are not supposed to understand anything past this point. [2] This line should simplify life for the peanut gallery.	2005-09-19 19:56:48 +00:00
phk	2d4ad1cc44	Don't attempt to recurse lockmgr, it doesn't like it.	2005-09-15 21:16:43 +00:00
phk	3c4b94c1fe	Various minor polishing.	2005-09-15 10:28:19 +00:00
phk	eafa84f647	Protect the devfs rule internal global lists with a sx lock, the per mount locks are not enough. Finer granularity (x)locking could be implemented, but I prefer to keep it simple for now.	2005-09-15 08:50:16 +00:00
phk	1a7de63bbb	Absolve devfs_rule.c from locking responsibility and call it with all necessary locking held.	2005-09-15 08:36:37 +00:00
phk	a543c5b761	Close a race which could result in unwarranted "ruleset %d already running" panics. Previously, recursion through the "include" feature was prevented by marking each ruleset as "running" when applied. This doesn't work for the case where two DEVFS instances try to apply the same ruleset at the same time. Instead introduce the sysctl vfs.devfs.rule_depth (default == 1) which limits how many levels of "include" we will traverse. Be aware that traversal of "include" is recursive and kernel stack size is limited. MFC: after 3 days	2005-09-15 06:57:28 +00:00
phk	aca041ee53	Clean up prototypes.	2005-09-12 08:03:15 +00:00
phk	a469be1ef3	Add a missing dev_relthread() call. Remove unused variable. Spotted by: Hans Petter Selasky <hselasky@c2i.net>	2005-08-29 11:14:18 +00:00
phk	fcf6768753	Handle device drivers with D_NEEDGIANT in a way which does not penalize the 'good' drivers: Allocate a shadow cdevsw and populate it with wrapper functions which grab Giant	2005-08-17 08:19:52 +00:00
phk	8a3fe94804	Collect the devfs related sysctls in one place	2005-08-16 19:25:02 +00:00
phk	e89ebd4119	Create a new internal .h file to communicate very private stuff from kern_conf.c to devfs. For now just two prototypes, more to come.	2005-08-16 19:08:01 +00:00
phk	4edc625526	Eliminate effectively unused dm_basedir field from devfs_mount.	2005-08-15 19:40:53 +00:00
rwatson	daa1c89f45	Merge the dev_clone and dev_clone_cred event handlers into a single event handler, dev_clone, which accepts a credential argument. Implementors of the event can ignore it if they're not interested, and most do. This avoids having multiple event handler types and fall-back/precedence logic in devfs. This changes the kernel API for /dev cloning, and may affect third party packages containg cloning kernel modules. Requested by: phk MFC after: 3 days	2005-08-08 19:55:32 +00:00
kris	509594693e	devfs is not yet fully MPSAFE - for example, multiple concurrent devfs(8) processes can cause a panic when operating on rulesets. Approved by: phk	2005-07-29 23:00:56 +00:00
simon	dd09386bed	Correct devfs ruleset bypass. Submitted by: csjp Reviewed by: phk Security: FreeBSD-SA-05:17.devfs Approved by: cperciva	2005-07-20 13:34:16 +00:00
rwatson	79690d711b	When devfs cloning takes place, provide access to the credential of the process that caused the clone event to take place for the device driver creating the device. This allows cloned device drivers to adapt the device node based on security aspects of the process, such as the uid, gid, and MAC label. - Add a cred reference to struct cdev, so that when a device node is instantiated as a vnode, the cloning credential can be exposed to MAC. - Add make_dev_cred(), a version of make_dev() that additionally accepts the credential to stick in the struct cdev. Implement it and make_dev() in terms of a back-end make_dev_credv(). - Add a new event handler, dev_clone_cred, which can be registered to receive the credential instead of dev_clone, if desired. - Modify the MAC entry point mac_create_devfs_device() to accept an optional credential pointer (may be NULL), so that MAC policies can inspect and act on the label or other elements of the credential when initializing the skeleton device protections. - Modify tty_pty.c to register clone_dev_cred and invoke make_dev_cred(), so that the pty clone credential is exposed to the MAC Framework. While currently primarily focussed on MAC policies, this change is also a prerequisite for changes to allow ptys to be instantiated with the UID of the process looking up the pty. This requires further changes to the pty driver -- in particular, to immediately recycle pty nodes on last close so that the credential-related state can be recreated on next lookup. Submitted by: Andrew Reisse <andrew.reisse@sparta.com> Obtained from: TrustedBSD Project Sponsored by: SPAWAR, SPARTA MFC after: 1 week MFC note: Merge to 6.x, but not 5.x for ABI reasons	2005-07-14 10:22:09 +00:00
rodrigc	dba46a3ce7	Do not declare a struct as extern, and then implement it as static in the same file. This is not legal C, and GCC 4.0 will issue an error. Reviewed by: phk Approved by: das (mentor)	2005-05-31 14:50:49 +00:00
jeff	6c4a330d28	- In devfs_open() and devfs_close() grab Giant if the driver sets NEEDGIANT. We still have to DROP_GIANT and PICKUP_GIANT when NEEDGIANT is not set because vfs is still sometime entered with Giant held.	2005-05-01 00:56:34 +00:00
jeff	53caed435d	- Mark devfs as MNTK_MPSAFE as I belive it does not require Giant. Sponsored by: Isilon Systems, Inc. Agreed in principle by: phk	2005-04-30 11:24:17 +00:00
jeff	afab3762a0	- Change all filesystems and vfs_cache to relock the dvp once the child is locked in the ISDOTDOT case. Se vfs_lookup.c r1.79 for details. Sponsored by: Isilon Systems, Inc.	2005-04-13 10:59:09 +00:00
phk	7af1e31761	Explicitly hold a reference to the cdev we have just cloned. This closes the race where the cdev was reclaimed before it ever made it back to devfs lookup.	2005-03-31 12:19:44 +00:00
phk	2379f61770	cdev (still) needs per instance uid/gid/mode Add unlocked version of dev_ref() Clean up various stuff in sys/conf.h	2005-03-31 10:29:57 +00:00
phk	b83adaf8e5	Rename dev_ref() to dev_refl()	2005-03-31 06:51:54 +00:00
jeff	902bc24bce	- LK_NOPAUSE is a nop now. Sponsored by: Isilon Systems, Inc.	2005-03-31 04:27:49 +00:00
jeff	b136fd4eee	- We no longer have to bother with PDIRUNLOCK, lookup() handles it for us. Sponsored by: Isilon Systems, Inc.	2005-03-28 09:34:36 +00:00
jeff	226bf6ead4	- Update vfs_root implementations to match the new prototype. None of these filesystems will support shared locks until they are explicitly modified to do so. Careful review must be done to ensure that this is safe for each individual filesystem. Sponsored by: Isilon Systems, Inc.	2005-03-24 07:36:16 +00:00
phk	cfa6bb09ea	Prepare for the final onslaught on devices: Move uid/gid/mode from cdev to cdevsw. Add kind field to use for devd(8) later. Bump both D_VERSION and __FreeBSD_version	2005-03-17 12:07:00 +00:00
jeff	5bd51ec6e6	- The VI_DOOMED flag now signals the end of a vnode's relationship with the filesystem. Check that rather than VI_XLOCK. Sponsored by: Isilon Systems, Inc.	2005-03-13 12:14:56 +00:00
phk	7a56697451	One more bit of the major/minor patch to make ttyname happy as well.	2005-03-10 18:49:17 +00:00
phk	75bcf4d381	Try to fix the mess I made of devname, with the minimal subset of the larger minor/major patch which was posted for testing.	2005-03-10 18:21:34 +00:00
phk	5493756155	Remove kernelside support for devfs rules filtering on major numbers.	2005-03-08 19:51:27 +00:00
phk	f24285cfac	We may not have an actual cdev at this point.	2005-02-22 18:17:31 +00:00
phk	f1d058e032	Reap more benefits from DEVFS: List devfs_dirents rather than vnodes off their shared struct cdev, this saves a pointer field in the vnode at the expense of a field in the devfs_dirent. There are often 100 times more vnodes so this is bargain. In addition it makes it harder for people to try to do stypid things like "finding the vnode from cdev". Since DEVFS handles all VCHR nodes now, we can do the vnode related cleanup in devfs_reclaim() instead of in dev_rel() and vgonel(). Similarly, we can do the struct cdev related cleanup in dev_rel() instead of devfs_reclaim(). rename idestroy_dev() to destroy_devl() for consistency. Add LIST_ENTRY de_alias to struct devfs_dirent. Remove v_specnext from struct vnode. Change si_hlist to si_alist in struct cdev. String new devfs vnodes' devfs_dirent on si_alist when we create them and take them off in devfs_reclaim(). Fix devfs_revoke() accordingly. Also don't clear fields devfs_reclaim() will clear when called from vgone(); Let devfs_reclaim() call dev_rel() instead of vgonel(). Move the usecount tracking from dev_rel() to devfs_reclaim(), and let dev_rel() take a struct cdev argument instead of vnode. Destroy SI_CHEAPCLONE devices in dev_rel() (instead of devfs_reclaim()) when they are no longer used. (This should maybe happen in devfs_close() instead.)	2005-02-22 15:51:07 +00:00
phk	d77b9fb94f	Make dev_ref() require the dev_lock() to be held and use it from devfs instead of directly frobbing the si_refcount.	2005-02-22 14:41:04 +00:00
phk	af1fa2025c	Introduce vx_wait{l}() and use it instead of home-rolled versions.	2005-02-17 10:49:51 +00:00
phk	9fbd4a503d	Make a SYSCTL_NODE static	2005-02-10 12:23:29 +00:00
phk	4fc19472ad	Statize devfs_ops_f	2005-02-10 12:04:26 +00:00
phk	1b21636022	Make filesystems get rid of their own vnodes vnode_pager object in VOP_RECLAIM().	2005-01-28 14:42:17 +00:00
phk	7617afd15d	Whitespace in vop_vector{} initializations.	2005-01-13 18:59:48 +00:00
phk	ea3239f358	Silently ignore forced argument to unmount.	2005-01-11 12:02:26 +00:00
imp	fb0a393e99	/* -> /*- for copyright notices, minor format tweaks as necessary	2005-01-06 18:10:42 +00:00
phk	9799def1bc	Unsupport forceful unmounts of DEVFS. After disscussing things I have decided to take the easy and consistent 90% solution instead of aiming for the very involved 99% solution. If we allow forceful unmounts of DEVFS we need to decide how to handle the devices which are in use through this filesystem at the time. We cannot just readopt the open devices in the main /dev instance since that would open us to security issues. For the majority of the devices, this is relatively straightforward as we can just pretend they got revoke(2)'ed. Some devices get tricky: /dev/console and /dev/tty for instance does a sort of recursive open of the real console device. Other devices may be mmap'ed (kill the processes ?). And then there are disk devices which are mounted. The correct thing here would be to recursively unmount the filesystems mounte from devices from our DEVFS instance (forcefully) and if this succeeds, complete the forcefully unmount of DEVFS. But if one of the forceful unmounts fail we cannot complete the forceful unmount of DEVFS, but we are likely to already have severed a lot of stuff in the process of trying. Event attempting this would be a lot of code for a very far out corner-case which most people would never see or get in touch with. It's just not worth it.	2005-01-04 07:52:26 +00:00
phk	c13e70487f	Be consistent about flag values passed to device drivers read/write methods: Read can see O_NONBLOCK and O_DIRECT. Write can see O_NONBLOCK, O_DIRECT and O_FSYNC. In addition O_DIRECT is shadowed as IO_DIRECT for now for backwards compatibility.	2004-12-22 17:05:44 +00:00
phk	4e6a1d00d2	Shuffle numeric values of the IO_* flags to match the O_* flags from fcntl.h. This is in preparation for making the flags passed to device drivers be consistently from fcntl.h for all entrypoints. Today open, close and ioctl uses fcntl.h flags, while read and write uses vnode.h flags.	2004-12-22 16:25:50 +00:00
phk	0faeb292ed	We can only ever get to vgonechrl() from a devfs vnode, so we do not need to reassign the vp->v_op to devfs_specops, we know that is the value already. Make devfs_specops private to devfs.	2004-12-20 21:34:29 +00:00
phk	4483c94662	Add a couple of KASSERTS to try to diagnose a problem reported.	2004-12-20 21:12:11 +00:00
phk	ccf97ed87f	Be a bit more assertive about vnode bypass.	2004-12-14 09:32:18 +00:00
phk	4e2d779c7a	Another FNONBLOCK -> O_NONBLOCK. Don't unconditionally set IO_UNIT to device drivers in write: nobody checks it, and since it was always set it did not carry information anyway.	2004-12-13 07:41:19 +00:00
phk	381851ecfe	Use O_NONBLOCK instead of FNONBLOCK alias.	2004-12-13 07:37:29 +00:00
phk	f4810ece42	Explicit panic in vop_read/vop_write for devices	2004-12-13 07:13:21 +00:00
phk	4a639d6164	The remaining part of nmount/omount/rootfs mount changes. I cannot sensibly split the conversion of the remaining three filesystems out from the root mounting changes, so in one go: cd9660: Convert to nmount. Add omount compat shims. Remove dedicated rootfs mounting code. Use vfs_mountedfrom() Rely on vfs_mount.c calling VFS_STATFS() nfs(client): Convert to nmount (the simple way, mount_nfs(8) is still necessary). Add omount compat shims. Drop COMPAT_PRELITE2 mount arg compatibility. ffs: Convert to nmount. Add omount compat shims. Remove dedicated rootfs mounting code. Use vfs_mountedfrom() Rely on vfs_mount.c calling VFS_STATFS() Remove vfs_omount() method, all filesystems are now converted. Remove MNTK_WANTRDWR, handling RO/RW conversions is a filesystem task, and they all do it now. Change rootmounting to use DEVFS trampoline: vfs_mount.c: Mount devfs on /. Devfs needs no 'from' so this is clean. symlink /dev to /. This makes it possible to lookup /dev/foo. Mount "real" root filesystem on /. Surgically move the devfs mountpoint from under the real root filesystem onto /dev in the real root filesystem. Remove now unnecessary getdiskbyname(). kern_init.c: Don't do devfs mounting and rootvnode assignment here, it was already handled by vfs_mount.c. Remove now unused bdevvp(), addaliasu() and addalias(). Put the few necessary lines in devfs where they belong. This eliminates the second-last source of bogo vnodes, leaving only the lemming-syncer. Remove rootdev variable, it doesn't give meaning in a global context and was not trustworth anyway. Correct information is provided by statfs(/).	2004-12-07 08:15:41 +00:00
phk	463e576c38	Use vfs_mountedfrom() and rely on vfs_mount.c to call VFS_STATFS()	2004-12-06 19:54:31 +00:00
phk	6c14f71ef7	VFS_STATFS(mp, ...) is mostly called with &mp->mnt_stat, but a few cases doesn't. Most of the implementations have grown weeds for this so they copy some fields from mnt_stat if the passed argument isn't that. Fix this the cleaner way: Always call the implementation on mnt_stat and copy that in toto to the VFS_STATFS argument if different.	2004-12-05 22:41:02 +00:00
phk	59f305606c	Back when VOP_* was introduced, we did not have new-style struct initializations but we did have lofty goals and big ideals. Adjust to more contemporary circumstances and gain type checking. Replace the entire vop_t frobbing thing with properly typed structures. The only casualty is that we can not add a new VOP_ method with a loadable module. History has not given us reason to belive this would ever be feasible in the the first place. Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc. Give coda correct prototypes and function definitions for all vop_()s. Generate a bit more data from the vnode_if.src file: a struct vop_vector and protype typedefs for all vop methods. Add a new vop_bypass() and make vop_default be a pointer to another struct vop_vector. Remove a lot of vfs_init since vop_vector is ready to use from the compiler. Cast various vop_mumble() to void * with uppercase name, for instance VOP_PANIC, VOP_NULL etc. Implement VCALL() by making vdesc_offset the offsetof() the relevant function pointer in vop_vector. This is disgusting but since the code is generated by a script comparatively safe. The alternative for nullfs etc. would be much worse. Fix up all vnode method vectors to remove casts so they become typesafe. (The bulk of this is generated by scripts)	2004-12-01 23:16:38 +00:00
phk	05b9cb2a46	Mechanically change prototypes for vnode operations to use the new typedefs.	2004-12-01 12:24:41 +00:00
phk	820a982ffa	Ignore MNT_NODEV, it is implicit in choice of filesystem these days.	2004-11-26 07:37:42 +00:00
phk	07ea2b1e7d	Make vnode bypass for devices mandatory.	2004-11-17 07:18:49 +00:00
phk	1329297d10	Make vnode bypass the default for devices. Can be disabled in case of problems with vfs.devfs.fops=0 in loader.conf	2004-11-15 22:11:09 +00:00
phk	aa4f69ad30	Integrate most of vop_revoke() into devfs_revoke() where it belongs.	2004-11-13 23:37:29 +00:00
phk	437fa95897	Add the devfs_fp_check() function which helps us get from a struct file to a cdev and a devsw, doing all the relevant checks along the way. Add the check to see if fp->f_vnode->v_rdev differs from our cached fp->f_data copy of our cdev. If it does the device was revoked and we return ENXIO.	2004-11-13 23:21:54 +00:00
phk	921930c585	Refuse attemps to mount root filesystem	2004-11-09 22:14:57 +00:00
phk	63cd9549c7	Add optional device vnode bypass to DEVFS. The tunable vfs.devfs.fops controls this feature and defaults to off. When enabled (vfs.devfs.fops=1 in loader), device vnodes opened through a filedescriptor gets a special fops vector which instead of the detour through the vnode layer goes directly to DEVFS. Amongst other things this allows us to run Giant free read/write to device drivers which have been weaned off D_NEEDGIANT. Currently this means /dev/null, /dev/zero, disks, (and maybe the random stuff ?) On a 700MHz K7 machine this doubles the speed of dd if=/dev/zero of=/dev/null bs=1 count=1000000 This roughly translates to shaving 2usec of each read/write syscall. The poll/kqfilter paths need more work before they are giant free, this work is ongoing in p4::phk_bufwork Please test this and report any problems, LORs etc.	2004-11-08 10:46:47 +00:00
phk	723cc1105c	Properly implement a default version of VOP_GETWRITEMOUNT. Remove improper access to vop_stdgetwritemount() which should and will instead rely on the VOP default path.	2004-11-06 11:41:22 +00:00
phk	31149e65e2	Add back securelevel check for disks. XXX: This should live in geom_dev.c but we don't have access to the cred there. XXX: XXX: This may not matter anymore since filesystems use geom_vfs.	2004-11-04 09:17:55 +00:00
phk	8fa3fd0acf	Don't give disks special treatment, they don't come this way anymore.	2004-10-29 11:10:55 +00:00
phk	861b10d6de	Remove VOP_SPECSTRATEGY() from the system.	2004-10-29 10:59:28 +00:00
phk	86cc21c765	Give dev_strategy() an explict cdev argument in preparation for removing buf->b-dev. Put a bio between the buf passed to dev_strategy() and the device driver strategy routine in order to not clobber fields in the buf. Assert copyright on vfs_bio.c and update copyright message to canonical text. There is no legal difference between John Dysons two-clause abbreviated BSD license and the canonical text.	2004-10-29 07:16:37 +00:00
phk	6b2d7f7134	What can I say: don't allow people to mount DEVFS with option "nodev".	2004-10-28 06:03:25 +00:00
phk	c66aa10c8e	Put the I/O block size in bufobj->bo_bsize. We keep si_bsize_phys around for now as that is the simplest way to pull the number out of disk device drivers in devfs_open(). The correct solution would be to do an ioctl(DIOCGSECTORSIZE), but the point is probably mooth when filesystems sit on GEOM, so don't bother for now.	2004-10-26 07:39:12 +00:00
phk	2c3e47b668	Alas, poor SPECFS! -- I knew him, Horatio; A filesystem of infinite jest, of most excellent fancy: he hath taught me lessons a thousand times; and now, how abhorred in my imagination it is! my gorge rises at it. Here were those hacks that I have curs'd I know not how oft. Where be your kludges now? your workarounds? your layering violations, that were wont to set the table on a roar? Move the skeleton of specfs into devfs where it now belongs and bury the rest.	2004-10-22 09:59:37 +00:00
phk	8d623dca9a	XXX mark two places where we do not hold a threadcount on the dev when frobbing the cdevsw. In both cases we examine only the cdevsw and it is a good question if we weren't better off copying those properties into the cdev in the first place. This question will be revisited.	2004-09-24 08:32:36 +00:00
phk	2d868d02cf	Put a version element in the VFS filesystem configuration structure and refuse initializing filesystems with a wrong version. This will aid maintenance activites on the 5-stable branch. s/vfs_mount/vfs_omount/ s/vfs_nmount/vfs_mount/ Name our filesystems mount function consistently. Eliminate the namiedata argument to both vfs_mount and vfs_omount. It was originally there to save stack space. A few places abused it to get hold of some credentials to pass around. Effectively it is unused. Reorganize the root filesystem selection code.	2004-07-30 22:08:52 +00:00
cperciva	d9fecc83c8	Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This is somewhat clearer, but more importantly allows for a consistent naming scheme for suser_cred flags. The old name is still defined, but will be removed in a few days (unless I hear any complaints...) Discussed with: rwatson, scottl Requested by: jhb	2004-07-26 07:24:04 +00:00
rwatson	01be595ab3	In devfs_allocv(), rather than assigning 'td = curthread', assert that the caller passes in a td that is curthread, and consistently pass 'td' into vget(). Remove some bogus logic that passed in td or curthread conditional on td being non-NULL, which seems redundant in the face of the earlier assignment of td to curthread if td is NULL. In devfs_symlink(), cache the passed thread in 'td' so we don't have to keep retrieving it from the 'ap' structure, and assert that td is curthread (since we dereference it to get thread-local td_ucred). Use 'td' in preference to curthread for later lockmgr calls, since they are equal.	2004-07-22 17:03:14 +00:00
alfred	8a1713aada	Make VFS_ROOT() and vflush() take a thread argument. This is to allow filesystems to decide based on the passed thread which vnode to return. Several filesystems used curthread, they now use the passed thread.	2004-07-12 08:14:09 +00:00
phk	607546ee37	Reduce a fair bit of the atomics because we are now called with a lock from kern_conf.c and cdev's act a lot more like real objects these days.	2004-06-18 08:08:47 +00:00
phk	40dd98a3bd	Second half of the dev_t cleanup. The big lines are: NODEV -> NULL NOUDEV -> NODEV udev_t -> dev_t udev2dev() -> findcdev() Various minor adjustments including handling of userland access to kernel space struct cdev etc.	2004-06-17 17:16:53 +00:00
phk	dfd1f7fd50	Do the dreaded s/dev_t/struct cdev */ Bump __FreeBSD_version accordingly.	2004-06-16 09:47:26 +00:00
phk	df260dfa17	Report the correct length for symlink entries.	2004-02-19 19:09:52 +00:00
phk	758f11d127	White-space align a struct definition. Move a SYSINIT to the file where it belongs.	2004-02-15 21:43:08 +00:00
cperciva	b9ee343f9e	Fix style(9) of my previous commit. Noticed by: nate Approved by: nate, rwatson (mentor)	2004-01-21 18:03:54 +00:00
cperciva	f2970382f1	Allow devfs path rules to work on directories. Without this fix, devfs rule add path fd unhide is a no-op, while it should unhide the fd subdirectory. Approved by: phk, rwatson (mentor) PR: kern/60897	2004-01-21 16:43:29 +00:00
phk	5b996ad186	Improve on POLA by populating DEVFS before doing devfs(8) rule ioctls. PR: 60687 Spotted by: Colin Percival <cperciva@daemonology.net>	2004-01-02 19:02:28 +00:00
rwatson	77ed6e2d1c	Modify the MAC Framework so that instead of embedding a (struct label) in various kernel objects to represent security data, we embed a (struct label *) pointer, which now references labels allocated using a UMA zone (mac_label.c). This allows the size and shape of struct label to be varied without changing the size and shape of these kernel objects, which become part of the frozen ABI with 5-STABLE. This opens the door for boot-time selection of the number of label slots, and hence changes to the bound on the number of simultaneous labeled policies at boot-time instead of compile-time. This also makes it easier to embed label references in new objects as required for locking/caching with fine-grained network stack locking, such as inpcb structures. This change also moves us further in the direction of hiding the structure of kernel objects from MAC policy modules, not to mention dramatically reducing the number of '&' symbols appearing in both the MAC Framework and MAC policy modules, and improving readability. While this results in minimal performance change with MAC enabled, it will observably shrink the size of a number of critical kernel data structures for the !MAC case, and should have a small (but measurable) performance benefit (i.e., struct vnode, struct socket) do to memory conservation and reduced cost of zeroing memory. NOTE: Users of MAC must recompile their kernel and all MAC modules as a result of this change. Because this is an API change, third party MAC modules will also need to be updated to make less use of the '&' symbol. Suggestions from: bmilekic Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-11-12 03:14:31 +00:00
phk	6b3ae2c6aa	Remember to check the DE_WHITEOUT flag in the case where a cloned device is hidden by a devfs(8) rule. Spotted by: Adam Nowacki <ptnowak@bsk.vectranet.pl>	2003-10-20 15:08:10 +00:00
phk	c096f8aa94	When a driver successfully created a device on demand, we can directly pick up the DEVFS inode number from the dev_t and find our directory entry from that, we don't need to scan the directory to find it. This also solves an issue with on-demand devices in subdirectories. Submitted by: cognet	2003-10-20 07:04:09 +00:00
phk	fd139fd7d0	Initialize struct vfsops C99-sparsely. Submitted by: hmp Reviewed by: phk	2003-06-12 20:48:38 +00:00
phk	557d80921b	Remove unused variable. Found by: FlexeLint	2003-05-31 19:34:52 +00:00
kan	378cd3b05d	Rename vfs_stdsync function to vfs_stdnosync which matches more closely what function is really doing. Update all existing consumers to use the new name. Introduce a new vfs_stdsync function, which iterates over mount point's vnodes and call FSYNC on each one of them in turn. Make nwfs and smbfs use this new function instead of rolling their own identical sync implementations. Reviewed by: jeff	2003-03-11 22:15:10 +00:00
njl	5a225ad933	Finish cleanup of vprint() which was begun with changing v_tag to a string. Remove extraneous uses of vop_null, instead defering to the default op. Rename vnode type "vfs" to the more descriptive "syncer". Fix formatting for various filesystems that use vop_print.	2003-03-03 19:15:40 +00:00
des	7b016a11e6	Clean up whitespace, s/register //, refrain from strong urge to ANSIfy.	2003-03-02 15:56:49 +00:00
des	765ebc59b4	uiomove-related caddr_t -> void * (just the low-hanging fruit)	2003-03-02 15:50:23 +00:00
phk	4ad4dab84a	NODEVFS cleanup: Replace devfs_{create,destroy} hooks with direct function calls.	2003-03-02 13:35:30 +00:00
imp	cf874b345d	Back out M_* changes, per decision of the TRB. Approved by: trb	2003-02-19 05:47:46 +00:00
phk	98a90e953d	NODEVFS cleanup: remove #ifdefs.	2003-01-29 22:36:45 +00:00
alfred	bf8e8a6e8f	Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.	2003-01-21 08:56:16 +00:00
phk	24596ddb76	Originally when DEVFS was added, a global variable "devfs_present" was used to control code which were conditional on DEVFS' precense since this avoided the need for large-scale source pollution with #include "opt_geom.h" Now that we approach making DEVFS standard, replace these tests with an #ifdef to facilitate mechanical removal once DEVFS becomes non-optional. No functional change by this commit.	2003-01-19 11:03:07 +00:00
phk	7558240f56	Even if the permissions deny it, a process should be allowed to access its controlling terminal. In essense, history dictates that any process is allowed to open /dev/tty for RW, irrespective of credential, because by definition it is it's own controlling terminal. Before DEVFS we relied on a hacky half-device thing (kern/tty_tty.c) which did the magic deep down at device level, which at best was disgusting from an architectural point of view. My first shot at this was to use the cloning mechanism to simply give people the right tty when they ask for /dev/tty, that's why you get this, slightly counter intuitive result: syv# ls -l /dev/tty `tty` crw--w---- 1 u1 tty 5, 0 Jan 13 22:14 /dev/tty crw--w---- 1 u1 tty 5, 0 Jan 13 22:14 /dev/ttyp0 Trouble is, when user u1 su(1)'s to user u2, he cannot open /dev/ttyp0 anymore because he doesn't have permission to do so. The above fix allows him to do that. The interesting side effect is that one was previously only able to access the controlling tty by indirection: date > /dev/tty but not by name: date > `tty` This is now possible, and that feels a lot more like DTRT. PR: 46635 MFC candidate: could be.	2003-01-13 22:20:36 +00:00
dd	7c8a733a05	Add symlink support to devfs_rule_matchpath(). This allows the user to unhide symlinks as well as hide them.	2003-01-11 02:36:20 +00:00
phk	157437ec08	Since Jeffr made the std* functions the default in rev 1.63 of kern/vfs_defaults.c it is wrong for the individual filesystems to use the std* functions as that prevents override of the default. Found by: src/tools/tools/vop_table	2003-01-04 08:47:19 +00:00
rwatson	b6609bcea8	Trim left-over and unused vop_refreshlabel() bits from devfs. Reported by: bde	2002-12-28 05:39:25 +00:00
rwatson	c5caffe9c4	Remove dm_root entry from struct devfs_mount. It's never set, and is unused. Replace it with a dm_mount back-pointer to the struct mount that the devfs_mount is associated with. Export that pointer to MAC Framework entry points, where all current policies don't use the pointer. This permits the SEBSD port of SELinux's FLASK/TE to compile out-of-the-box on 5.0-CURRENT with full file system labeling support. Approved by: re (murray) Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-12-09 03:44:28 +00:00
rwatson	312cab0dee	Slightly change the semantics of vnode labels for MAC: rather than "refreshing" the label on the vnode before use, just get the label right from inception. For single-label file systems, set the label in the generic VFS getnewvnode() code; for multi-label file systems, leave the labeling up to the file system. With UFS1/2, this means reading the extended attribute during vfs_vget() as the inode is pulled off disk, rather than hitting the extended attributes frequently during operations later, improving performance. This also corrects sematics for shared vnode locks, which were not previously present in the system. This chances the cache coherrency properties WRT out-of-band access to label data, but in an acceptable form. With UFS1, there is a small race condition during automatic extended attribute start -- this is not present with UFS2, and occurs because EAs aren't available at vnode inception. We'll introduce a work around for this shortly. Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-26 14:38:24 +00:00
rwatson	58072098f1	Missed a case of _POSIX_MAC_PRESENT -> _PC_MAC_PRESENT rename. Pointed out by: phk	2002-10-20 22:50:43 +00:00
phk	f01369965f	Fix comments and one resulting code confusion about the type of the "command" argument to VOP_IOCTL. Spotted by: FlexeLint.	2002-10-16 08:04:11 +00:00
phk	04dca80dba	A better solution to avoiding variable sized structs in DEVFS.	2002-10-16 07:51:18 +00:00
phk	bb72fa916d	#include "opt_devfs.h" to protect against variable sized structures. Spotted by: FlexeLint	2002-10-16 07:16:47 +00:00
mike	8630abe45f	Change iov_base's type from `char ' to the standard` void '. All uses of iov_base which assume its type is `char ' (in order to do pointer arithmetic) have been updated to cast iov_base to `char '.	2002-10-11 14:58:34 +00:00
dd	eff660789c	Treat the pathptrn field as a real pattern with the aid of fnmatch().	2002-10-08 04:21:54 +00:00
rwatson	7b150b70c2	Integrate a devfs/MAC fix from the MAC tree: avoid a race condition during devfs VOP symlink creation by introducing a new entry point to determine the label of the devfs_dirent prior to allocation of a vnode for the symlink. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-05 18:40:10 +00:00
phk	7491f6314f	Move the vop-vector declaration into devfs_vnops.c where it belongs.	2002-10-01 10:08:08 +00:00
phk	983b8bb673	s/struct dev_t /dev_t /	2002-09-28 21:21:01 +00:00
phk	d1681a7d3d	Fix mis-indent.	2002-09-28 17:37:55 +00:00
njl	00c79f5c92	Remove any VOP_PRINT that redundantly prints the tag. Move lockmgr_printinfo() into vprint() for everyone's benefit. Suggested by: bde	2002-09-18 20:42:04 +00:00
njl	0590c43070	Remove all use of vnode->v_tag, replacing with appropriate substitutes. v_tag is now const char * and should only be used for debugging. Additionally: 1. All users of VT_NTS now check vfsconf->vf_type VFCF_NETWORK 2. The user of VT_PROCFS now checks for the new flag VV_PROCDEP, which is propagated by pseudofs to all child vnodes if the fs sets PFS_PROCDEP. Suggested by: phk Reviewed by: bde, rwatson (earlier version)	2002-09-14 09:02:28 +00:00
phk	e4f487f25e	Introduce typedefs for the member functions of struct vfsops and employ these in the main filesystems. This does not change the resulting code but makes the source a little bit more grepable. Sponsored by: DARPA and NAI Labs.	2002-08-13 10:05:50 +00:00
jeff	02517b6731	- Replace v_flag with v_iflag and v_vflag - v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed. - v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc. - All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's. - Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear. - Many functions in vfs_subr.c were restructured to provide for stronger locking. Idea stolen from: BSD/OS	2002-08-04 10:29:36 +00:00
rwatson	1fa5d0d927	Introduce support for Mandatory Access Control and extensible kernel access control. Teach devfs how to respond to pathconf() _POSIX_MAC_PRESENT queries, allowing it to indicate to user processes that individual vnode labels are available. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-02 03:12:40 +00:00
rwatson	6c6053d961	Hook up devfs_pathconf() for specfs devfs nodes, not just regular devfs nodes. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-01 22:27:57 +00:00
rwatson	751f2d0c51	Introduce support for Mandatory Access Control and extensible kernel access control. Instrument devfs to support per-dirent MAC labels. In particular, invoke MAC framework when devfs directory entries are instantiated due to make_dev() and related calls, and invoke the MAC framework when vnodes are instantiated from these directory entries. Implement vop_setlabel() for devfs, which pushes the label update into the devfs directory entry for semi-persistant store. This permits the MAC framework to assign labels to devices and directories as they are instantiated, and export access control information via devfs vnodes. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-31 15:45:16 +00:00
rwatson	25ab0054a1	Introduce support for Mandatory Access Control and extensible kernel access control. Label devfs directory entries, permitting labels to be maintained on device nodes in devfs instances persistently despite vnode recycling. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-30 23:12:37 +00:00
dd	cca47ca03d	Correct misindentation of DRA_UID.	2002-07-28 06:57:57 +00:00
dd	a22e9df072	Unimplement panic(8) by making sure that we don't recurse into a ruleset. If we do, that means there's a ruleset loop (10 includes 20 include 30 includes 10), which will quickly cause a double fault due to stack overflow (since "include" is implemented by recursion). (Previously, we only checked that X didn't include X.)	2002-07-28 03:52:44 +00:00
dd	9498a983a9	Introduce the DEVFS "rule" subsystem. DEVFS rules permit the administrator to define certain properties of new devfs nodes before they become visible to the userland. Both static (e.g., /dev/speaker) and dynamic (e.g., /dev/bpf*, some removable devices) nodes are supported. Each DEVFS mount may have a different ruleset assigned to it, permitting different policies to be implemented for things like jails. Approved by: phk	2002-07-17 01:46:48 +00:00
semenu	c7fb877f7f	Make devfs to give honour to PDIRUNLOCK flag. Reviewed by: jeff MFC after: 1 week	2002-06-01 09:17:43 +00:00
mux	5bb8b3f421	Fix several bugs in devfs_lookupx(). When we check the nameiop to make sure it's a correct operation for devfs, do it only in the ISLASTCN case. If we don't, we are assuming that the final file will be in devfs, which is not true if another partition is mounted on top of devfs or with special filenames (like /dev/net/../../foo). Reviewed by: phk	2002-05-10 15:41:14 +00:00
mux	85b0c22bf2	Convert devfs to nmount. Reviewed by: phk	2002-05-02 20:27:42 +00:00
rwatson	5ca05f1642	Use vnode locking with devfs; permit VFS locking assertions to make sense for devfs vnodes, and reduce/remove potential races in the devfs code. Submitted by: iadowse Approved by: phk	2002-04-29 20:00:39 +00:00
bde	b4c173f82f	Don't attempt to decvlare M_DEVFS whern MALLOC_DECLARE is not defined. This fixes warnings that should be errors in fstat. Reminded by: alpha tinderbox Fixed some style bugs (ones near BOF and EOF; there are many more).	2002-04-21 15:47:03 +00:00
bde	b20f428bf3	Fixed assorted bugs in setting of timestamps in devfs_setattr(). Setting of timestamps on devices had no effect visible to userland because timestamps for devices were set in places that are never used. This broke: - update of file change time after a change of an attribute - setting of file access and modification times. The VA_UTIMES_NULL case did not work. Revs 1.31-1.32 were supposed to fix this by copying correct bits from ufs, but had little or no effect because the old checks were not removed.	2002-04-05 15:16:08 +00:00
jhb	dc2e474f79	Change the suser() API to take advantage of td_ucred as well as do a general cleanup of the API. The entire API now consists of two functions similar to the pre-KSE API. The suser() function takes a thread pointer as its only argument. The td_ucred member of this thread must be valid so the only valid thread pointers are curthread and a few kernel threads such as thread0. The suser_cred() function takes a pointer to a struct ucred as its first argument and an integer flag as its second argument. The flag is currently only used for the PRISON_ROOT flag. Discussed on: smp@	2002-04-01 21:31:13 +00:00
alfred	1446d09429	Remove __P.	2002-03-19 22:20:14 +00:00
maxim	92c24ff925	Be consistent with UFS in a way how devfs_setattr() checks credentials for chmod(2), chown(2) and utimes(2) with respect to jail(2). Reviewed by: rwatson, ru Not objected by: phk Approved by: ru	2002-03-14 11:18:42 +00:00
msmith	c2656ac96b	Add a new sysinit SI_SUB_DEVFS. Devfs hooks into the kernel at SI_ORDER_FIRST, and devices can be created anytime after that. Print a warning if an atttempt is made to create a device too early.	2002-01-09 04:58:49 +00:00
msmith	299ff2a776	Use a sysinit to initialise the devfs hooks in kern_conf.c rather than common variables. Reviewed by: phk (in principle)	2002-01-09 01:00:20 +00:00
dd	97c62fdc11	Address two minor issues: implement the _PC_NAME_MAX and _PC_PATH_MAX pathconf() variables for directories, and set st_size and st_blocks (of struct stat) for directories as appropriate. Note that st_size is always set to DEV_BSIZE, since the size of the directories is not currently kept. Reviewed by: phk, bde	2001-11-25 21:00:38 +00:00
phk	3dd31a7df9	Fix "echo > /dev/null" for non-root users which broke in previous commit.	2001-11-04 19:12:59 +00:00
phk	b6f787939f	Use vfs_timestamp() instead of getnanotime(). Add magic stuff copied from ufs_setattr(). Instructed by: bde	2001-11-03 17:00:02 +00:00
phk	b86df2fcac	Use vfs_timestamp() instead of getnanotime() directly. Fix some modes on directories and symlinks. Instructed by: bde	2001-11-03 16:53:24 +00:00
bde	235dffbc18	Backed out vestiges of the quick fixes for the transient breakage of <sys/mount.h> in rev.1.106 of the latter (don't include <sys/socket.h> just to work around bugs in <sys/mount.h>).	2001-10-13 06:41:41 +00:00
phk	4d3230f1e7	The behaviour of whiteout'ing symlinks were too confusing, instead remove them when asked to.	2001-09-30 08:43:33 +00:00
julian	5596676e6c	KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
phk	14112e9d98	linux ls fails on DEVFS /dev because linux_getdents fails because linux_getdents uses VOP_READDIR( ..., &ncookies, &cookies ) instead of VOP_READDIR( ..., NULL, NULL ) because it seems to need the offsets for linux_dirent and sizeof(dirent) != sizeof(linux_dirent)... PR: 29467 Submitted by: Michael Reifenberger <root@nihil.plaut.de> Reviewed by: phk	2001-08-14 06:42:32 +00:00
brian	18d829816a	Support /dev/tun cloning. Ansify if_tun.c while I'm there. Only tun0 -> tun32767 may now be opened as struct ifnet's if_unit is a short. It's now possible to open /dev/tun and get a handle back for an available tun device (use devname to find out what you got). The implementation uses rman by popular demand (and against my judgement) to track opened devices and uses the new dev_depends() to ensure that all make_dev()d devices go away before the module is unloaded. Reviewed by: phk	2001-06-01 15:51:10 +00:00
phk	89034502d1	Don't copy the trailing zero in readlink, it confuses namei(). PR: 27656	2001-05-26 20:07:57 +00:00
phk	2072a71f0e	Create a general facility for making dev_t's depend on another dev_t. The dev_depends(dev_t, dev_t) function is for tying them to each other. When destroy_dev() is called on a dev_t, all dev_t's depending on it will also be destroyed (depth first order). Rewrite the make_dev_alias() to use this dependency facility. kern/subr_disk.c: Make the disk mini-layer use dependencies to make sure all relevant dev_t's are removed when the disk disappears. Make the disk mini-layer precreate some magic sub devices which the disk/slice/label code expects to be there. kern/subr_disklabel.c: Remove some now unneeded variables. kern/subr_diskmbr.c: Remove some ancient, commented out code. kern/subr_diskslice.c: Minor cleanup. Use name from dev_t instead of dsname()	2001-05-26 08:27:58 +00:00
phk	c8d4d78546	Change the way deletes are managed in DEVFS. This fixes a number of warnings relating to removed cloned devices. It also makes it possible to recreate deleted devices with mknod(2). The major/minor arguments are ignored.	2001-05-23 17:48:20 +00:00
iedowse	dafd513732	Change the second argument of vflush() to an integer that specifies the number of references on the filesystem root vnode to be both expected and released. Many filesystems hold an extra reference on the filesystem root vnode, which must be accounted for when determining if the filesystem is busy and then released if it isn't busy. The old `skipvp' approach required individual filesystem xxx_unmount functions to re-implement much of vflush()'s logic to deal with the root vnode. All 9 filesystems that hold an extra reference on the root vnode got the logic wrong in the case of forced unmounts, so `umount -f' would always fail if there were any extra root vnode references. Fix this issue centrally in vflush(), now that we can. This commit also fixes a vnode reference leak in devfs, which could result in idle devfs filesystems that refuse to unmount. Reviewed by: phk, bp	2001-05-16 18:04:37 +00:00
phk	4fe46b461d	After a successfull poll of the cloning functions, match on the returned dev_t rather than the original name. This allows cloning from one name to another which is useful for /dev/tty and later for the pty's.	2001-05-14 08:20:46 +00:00
phk	0e2026a179	Convert DEVFS from an "opt-in" to an "opt-out" option. If for some reason DEVFS is undesired, the "NODEVFS" option is needed now. Pending any significant issues, DEVFS will be made mandatory in -current on july 1st so that we can start reaping the full benefits of having it.	2001-05-13 20:52:40 +00:00
phk	11b5378677	Remove unneeded devfs_badop() Noticed by: rwatson	2001-05-06 17:40:34 +00:00
markm	bcca5847d5	Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h form kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)	2001-05-01 08:13:21 +00:00
phk	608c1caf3b	Add a vop_stdbmap(), and make it part of the default vop vector. Make 7 filesystems which don't really know about VOP_BMAP rely on the default vector, rather than more or less complete local vop_nopbmap() implementations.	2001-04-29 11:48:41 +00:00
mjacob	053acf45ec	add this ridiculous include foo so it will compile again	2001-04-23 18:14:41 +00:00
adrian	4018955334	Reviewed by: jlemon An initial tidyup of the mount() syscall and VFS mount code. This code replaces the earlier work done by jlemon in an attempt to make linux_mount() work. * the guts of the mount work has been moved into vfs_mount(). * move `type', `path' and `flags' from being userland variables into being kernel variables in vfs_mount(). `data' remains a pointer into userspace. * Attempt to verify the `type' and `path' strings passed to vfs_mount() aren't too long. * rework mount() and linux_mount() to take the userland parameters (besides data, as mentioned) and pass kernel variables to vfs_mount(). (linux_mount() already did this, I've just tidied it up a little more.) * remove the copyin() stuff for `path'. `data' still requires copyin() since its a pointer into userland. * set `mount->mnt_statf_mntonname' in vfs_mount() rather than in each filesystem. This variable is generally initialised with `path', and each filesystem can override it if they want to. * NOTE: f_mntonname is intiailised with "/" in the case of a root mount.	2001-03-01 21:00:17 +00:00
phk	15fc7bce14	Remove a debug printf.	2001-02-18 09:16:49 +00:00
phk	99d7a44ee7	At the point in time where most devices are created, we don't know what time it is because boottime is not yet initialized. Finagle the relevant fields when we get the chance.	2001-02-02 22:54:41 +00:00
phk	766147079e	Only superuser can create symlinks. Give symlinks mode 755 by default to avoid triggering alert eyes. (the mode isn't use on symlinks)	2001-02-02 18:35:29 +00:00
phk	006cf45cd7	Fix two minor nits. Existences revealed, but no details offered by: bp	2001-01-30 08:39:52 +00:00
dwmalone	dd75d1d73b	Convert more malloc+bzero to malloc+M_ZERO. Submitted by: josh@zipperup.org Submitted by: Robert Drehmel <robd@gmx.net>	2000-12-08 21:51:06 +00:00
phk	e0196ec99c	staticize.	2000-12-08 15:07:24 +00:00
phk	ff5cdfae2d	Move suser() and suser_xxx() prototypes and a related #define from <sys/proc.h> to <sys/systm.h>. Correctly document the #includes needed in the manpage. Add one now needed #include of <sys/systm.h>. Remove the consequent 48 unused #includes of <sys/proc.h>.	2000-10-29 16:06:56 +00:00
phk	94a5006c9a	Remove unneeded #include <sys/proc.h> lines.	2000-10-29 13:57:19 +00:00
phk	25e67656df	Don't hold an extra reference to vnodes. Devfs vnodes are sufficiently cheap to setup that it doesn't really matter that we recycle device vnodes at kleenex speed. Implement first cut try at killing cloned devices when they are not needed anymore. For now only the bpf driver is involved in this experiment. Cloned devices can set the SI_CHEAPCLONE flag which allows us to destroy_dev() it when the vcount() drops to zero and the vnode is reclaimed. For now it's a requirement that the driver doesn't keep persistent state from close to (re)open. Some whitespace changes.	2000-10-09 14:18:07 +00:00
jasone	4e290e67b7	Convert lockmgr locks from using simple locks to using mutexes. Add lockdestroy() and appropriate invocations, which corresponds to lockinit() and must be called to clean up after a lockmgr lock is no longer needed.	2000-10-04 01:29:17 +00:00
phk	56aecf1ece	Ignore attempts to set flags to zero. This quenches a syslog warning from login(1).	2000-09-18 09:40:01 +00:00
phk	d927c81a82	Add canonical checks to devfs_setattr().	2000-09-16 12:06:58 +00:00
jhb	f94cd225a3	Use size_t instead of u_int for 4th argument to copyinstr().	2000-09-12 22:39:34 +00:00
phk	c9cb5c289d	Add refcounts to the "global" DEVFS inode slots, this allows us to recycle inodes after a destroy_dev() but not until all mounts have picked up the change. Add support for an overflow table for DEVFS inodes. The static table defaults to 1024 inodes, if that fills, an overflow table of 32k inodes is allocated. Both numbers can be changed at compile time, the size of the overflow table also with the sysctl vfs.devfs.noverflow. Use atomic instructions to barrier between make_dev()/destroy_dev() and the mounts. Add lockmgr() locking of directories for operations accessing or modifying the directory TAILQs. Various nitpicking here and there.	2000-09-06 11:26:43 +00:00
phk	06c7160c02	Off by one error. Submitted by: des	2000-09-04 18:24:30 +00:00
phk	e47f61e183	Avoid the modules madness I inadvertently introduced by making the cloning infrastructure standard in kern_conf. Modules are now the same with or without devfs support. If you need to detect if devfs is present, in modules or elsewhere, check the integer variable "devfs_present". This happily removes an ugly hack from kern/vfs_conf.c. This forces a rename of the eventhandler and the standard clone helper function. Include <sys/eventhandler.h> in <sys/conf.h>: it's a helper #include like <sys/queue.h> Remove all #includes of opt_devfs.h they no longer matter.	2000-09-02 19:17:34 +00:00
rwatson	e54ea574fa	o Restructure vaccess() so as to check for DAC permission to modify the object before falling back on privilege. Make vaccess() accept an additional optional argument, privused, to determine whether privilege was required for vaccess() to return 0. Add commented out capability checks for reference. Rename some variables to make it more clear which modes/uids/etc are associated with the object, and which with the access mode. o Update file system use of vaccess() to pass NULL as the optional privused argument. Once additional patches are applied, suser() will no longer set ASU, so privused will permit passing of privilege information up the stack to the caller. Reviewed by: bde, green, phk, -security, others Obtained from: TrustedBSD Project	2000-08-29 14:45:49 +00:00
phk	1109c83215	Reorder vop's alphabetically. Smarter use of devfs_allocv() (from bp@) Introduce devfs_find() ".." fixes to devfs_lookup (from bp@)	2000-08-27 14:46:36 +00:00
phk	c1421f6ef5	Minor cleanups tp devfs_readdir(); Add devfs_read() for directories. (inspired by bp@)	2000-08-26 16:20:57 +00:00
phk	ec761116e2	Fix panic when removing open device (found by bp@) Implement subdirs. Build the full "devicename" for cloning functions. Fix panic when deleted device goes away. Collaps devfs_dir and devfs_dirent structures. Add proper cloning to the /dev/fd* "device-"driver. Fix a bug in make_dev_alias() handling which made aliases appear multiple times. Use devfs_clone to implement getdiskbyname() Make specfs maintain the stat(2) timestamps per dev_t	2000-08-24 15:36:55 +00:00
phk	323f259948	Fix devfs_access() bug on directories. Remove unused #includes. Bug spotted by: markm	2000-08-21 14:45:19 +00:00
phk	b648921acc	Remove all traces of Julians DEVFS (incl from kern/subr_diskslice.c) Remove old DEVFS support fields from dev_t. Make uid, gid & mode members of dev_t and set them in make_dev(). Use correct uid, gid & mode in make_dev in disk minilayer. Add support for registering alias names for a dev_t using the new function make_dev_alias(). These will show up as symlinks in DEVFS. Use makedev() rather than make_dev() for MFSs magic devices to prevent DEVFS from noticing this abuse. Add a field for DEVFS inode number in dev_t. Add new DEVFS in fs/devfs. Add devfs cloning to: disk minilayer (ie: ad(4), sd(4), cd(4) etc etc) md(4), tun(4), bpf(4), fd(4) If DEVFS add -d flag to /sbin/inits args to make it mount devfs. Add commented out DEVFS to GENERIC	2000-08-20 21:34:39 +00:00

... 2 3 4 5 6 ...

350 Commits