Commit Graph

243 Commits

Author SHA1 Message Date
Konstantin Belousov
828d6d12da Properly lock the vnode around vgone() calls.
Unlock the vnode in devfs_close() while calling into the driver d_close()
routine.

devfs_revoke() changes by:	ups
Reviewed and bugfixes by:	tegge
Tested by:	mbr, Peter Holm
Approved by:	pjd (mentor)
MFC after:	1 week
2006-10-18 11:17:14 +00:00
Tor Egge
5da56ddb21 Use mount interlock to protect all changes to mnt_flag and mnt_kern_flag.
This eliminates a race where MNT_UPDATE flag could be lost when nmount()
raced against sync(), sync_fsync() or quotactl().
2006-09-26 04:12:49 +00:00
Konstantin Belousov
af72db7175 Fix the bug in rev. 1.134. In devfs_allocv_drop_refs(), when not_found == 2
and drop_dm_lock is true, no unlocking shall be attempted. The lock is
already dropped and memory is freed.

Found with:	Coverity Prevent(tm)
CID:	1536
Approved by:	pjd (mentor)
2006-09-19 14:03:02 +00:00
Konstantin Belousov
e7f9b74438 Resolve the devfs deadlock caused by LOR between devfs_mount->dm_lock and
vnode lock in devfs_allocv. Do this by temporary dropping dm_lock around
vnode locking.

For safe operation, add hold counters for both devfs_mount and devfs_dirent,
and DE_DOOMED flag for devfs_dirent. The facilities allow to continue after
dropping of the dm_lock, by making sure that referenced memory does not
disappear.

Reviewed by:	tegge
Tested by:	kris
Approved by:	kan (mentor)
PR:		kern/102335
2006-09-18 13:23:08 +00:00
Poul-Henning Kamp
9c499ad92f Remove the NDEVFSINO and NDEVFSOVERFLOW options which no longer exists in
DEVFS.

Remove the opt_devfs.h file now that it is empty.
2006-07-17 09:07:02 +00:00
Stephan Uphoff
56eeb277cb Add vnode interlocking to devfs.
This prevents race conditions that can cause pagefaults or devfs
to use arbitrary vnodes.

MFC after:	1 week
2006-07-12 20:25:35 +00:00
Robert Watson
2551d4f66e Remove now unneeded opt_mac.h and mac.h includes.
MFC after:	3 days
2006-07-06 13:24:22 +00:00
Robert Watson
83ff52a7f3 Use #include "", not #include <> for opt_foo.h.
MFC after:	3 days
2006-07-06 13:22:08 +00:00
Pawel Jakub Dawidek
00a480ac5c Remove unused prototypes. 2006-04-12 12:17:29 +00:00
Jeff Roberson
23b77994f2 - Add a bogus vhold/vdrop around vgone() in devfs_revoke. Without this
the vnode is never recycled.  It is bogus because the reference really
   should be associated with the devfs dirent.
2006-03-31 23:37:29 +00:00
Jeff Roberson
f50b03bfd6 - We must hold a reference to a vnode before calling vgone() otherwise
it may not be removed from the freelist.

MFC After:	1 week
Found by:	kris
2006-02-22 09:05:40 +00:00
Jeff Roberson
3b77d80cdd - Remove a stale comment. This function was rewritten to be SMP safe some
time ago.

Sponsored by:	Isilon Systems, Inc.
2006-01-30 08:24:14 +00:00
Robert Watson
8f0d99d790 When returning EIO from DEVFSIO_RADD ioctl, drop the exclusive rule
lock.  Otherwise the system comes to a rather sudden and grinding
halt.

MFC after:	1 week
2006-01-03 09:49:10 +00:00
Doug White
16e35dcc39 This is a workaround for a complicated issue involving VFS cookies and devfs.
The PR and patch have the details. The ultimate fix requires architectural
changes and clarifications to the VFS API, but this will prevent the system
from panicking when someone does "ls /dev" while running in a shell under the
linuxulator.

This issue affects HEAD and RELENG_6 only.

PR:		88249
Submitted by:	"Devon H. O'Dell" <dodell@ixsystems.com>
MFC after:	3 days
2005-11-09 22:03:50 +00:00
Poul-Henning Kamp
3b72f38b5e Use correct cirteria for determining which directory entries we can
purge right away and which we merely can hide.

Beaten into my skull by:	kris
2005-10-18 20:21:25 +00:00
Poul-Henning Kamp
e515ee7832 Make rule zero really magical, that way we don't have to do anything
when we mount and get zero cost if no rules are used in a mountpoint.

Add code to deref rules on unmount.

Switch from SLIST to TAILQ.

Drop SYSINIT, use SX_SYSINIT and static initializer of TAILQ instead.

Drop goto, a break will do.

Reduce double pointers to single pointers.

Combine reaping and destroying rulesets.

Avoid memory leaks in a some error cases.
2005-09-24 07:03:09 +00:00
Poul-Henning Kamp
e606a3c63e Rewamp DEVFS internals pretty severely [1].
Give DEVFS a proper inode called struct cdev_priv.  It is important
to keep in mind that this "inode" is shared between all DEVFS
mountpoints, therefore it is protected by the global device mutex.

Link the cdev_priv's into a list, protected by the global device
mutex.  Keep track of each cdev_priv's state with a flag bit and
of references from mountpoints with a dedicated usecount.

Reap the benefits of much improved kernel memory allocator and the
generally better defined device driver APIs to get rid of the tables
of pointers + serial numbers, their overflow tables,  the atomics
to muck about in them and all the trouble that resulted in.

This makes RAM the only limit on how many devices we can have.

The cdev_priv is actually a super struct containing the normal cdev
as the "public" part, and therefore allocation and freeing has moved
to devfs_devs.c from kern_conf.c.

The overall responsibility is (to be) split such that kern/kern_conf.c
is the stuff that deals with drivers and struct cdev and fs/devfs
handles filesystems and struct cdev_priv and their private liason
exposed only in devfs_int.h.

Move the inode number from cdev to cdev_priv and allocate inode
numbers properly with unr.  Local dirents in the mountpoints
(directories, symlinks) allocate inodes from the same pool to
guarantee against overlaps.

Various other fields are going to migrate from cdev to cdev_priv
in the future in order to hide them.  A few fields may migrate
from devfs_dirent to cdev_priv as well.

Protect the DEVFS mountpoint with an sx lock instead of lockmgr,
this lock also protects the directory tree of the mountpoint.

Give each mountpoint a unique integer index, allocated with unr.
Use it into an array of devfs_dirent pointers in each cdev_priv.
Initially the array points to a single element also inside cdev_priv,
but as more devfs instances are mounted, the array is extended with
malloc(9) as necessary when the filesystem populates its directory
tree.

Retire the cdev alias lists, the cdev_priv now know about all the
relevant devfs_dirents (and their vnodes) and devfs_revoke() will
pick them up from there.  We still spelunk into other mountpoints
and fondle their data without 100% good locking.  It may make better
sense to vector the revoke event into the tty code and there do a
destroy_dev/make_dev on the tty's devices, but that's for further
study.

Lots of shuffling of stuff and churn of bits for no good reason[2].

XXX: There is still nothing preventing the dev_clone EVENTHANDLER
from being invoked at the same time in two devfs mountpoints.  It
is not obvious what the best course of action is here.

XXX: comment out an if statement that lost its body, until I can
find out what should go there so it doesn't do damage in the meantime.

XXX: Leave in a few extra malloc types and KASSERTS to help track
down any remaining issues.

Much testing provided by:		Kris
Much confusion caused by (races in):	md(4)

[1] You are not supposed to understand anything past this point.

[2] This line should simplify life for the peanut gallery.
2005-09-19 19:56:48 +00:00
Poul-Henning Kamp
59307b0dfe Don't attempt to recurse lockmgr, it doesn't like it. 2005-09-15 21:16:43 +00:00
Poul-Henning Kamp
214c8ff0e4 Various minor polishing. 2005-09-15 10:28:19 +00:00
Poul-Henning Kamp
6556102dcb Protect the devfs rule internal global lists with a sx lock, the per
mount locks are not enough.  Finer granularity (x)locking could be
implemented, but I prefer to keep it simple for now.
2005-09-15 08:50:16 +00:00
Poul-Henning Kamp
ab32e95296 Absolve devfs_rule.c from locking responsibility and call it with
all necessary locking held.
2005-09-15 08:36:37 +00:00
Poul-Henning Kamp
5e080af41f Close a race which could result in unwarranted "ruleset %d already
running" panics.

Previously, recursion through the "include" feature was prevented by
marking each ruleset as "running" when applied.  This doesn't work for
the case where two DEVFS instances try to apply the same ruleset at
the same time.

Instead introduce the sysctl vfs.devfs.rule_depth (default == 1) which
limits how many levels of "include" we will traverse.

Be aware that traversal of "include" is recursive and kernel stack
size is limited.

MFC:	after 3 days
2005-09-15 06:57:28 +00:00
Poul-Henning Kamp
21806f30bc Clean up prototypes. 2005-09-12 08:03:15 +00:00
Poul-Henning Kamp
80447bf701 Add a missing dev_relthread() call.
Remove unused variable.

Spotted by:	Hans Petter Selasky <hselasky@c2i.net>
2005-08-29 11:14:18 +00:00
Poul-Henning Kamp
516ad423b1 Handle device drivers with D_NEEDGIANT in a way which does not
penalize the 'good' drivers:  Allocate a shadow cdevsw and populate
it with wrapper functions which grab Giant
2005-08-17 08:19:52 +00:00
Poul-Henning Kamp
31cc57cdbd Collect the devfs related sysctls in one place 2005-08-16 19:25:02 +00:00
Poul-Henning Kamp
9c0af1310c Create a new internal .h file to communicate very private stuff
from kern_conf.c to devfs.

For now just two prototypes, more to come.
2005-08-16 19:08:01 +00:00
Poul-Henning Kamp
d785dfefa4 Eliminate effectively unused dm_basedir field from devfs_mount. 2005-08-15 19:40:53 +00:00
Robert Watson
6a113b3de7 Merge the dev_clone and dev_clone_cred event handlers into a single
event handler, dev_clone, which accepts a credential argument.
Implementors of the event can ignore it if they're not interested,
and most do.  This avoids having multiple event handler types and
fall-back/precedence logic in devfs.

This changes the kernel API for /dev cloning, and may affect third
party packages containg cloning kernel modules.

Requested by:	phk
MFC after:	3 days
2005-08-08 19:55:32 +00:00
Kris Kennaway
e29c976a58 devfs is not yet fully MPSAFE - for example, multiple concurrent devfs(8)
processes can cause a panic when operating on rulesets.

Approved by:	phk
2005-07-29 23:00:56 +00:00
Simon L. B. Nielsen
02a4be3f74 Correct devfs ruleset bypass.
Submitted by:	csjp
Reviewed by:	phk
Security:	FreeBSD-SA-05:17.devfs
Approved by:	cperciva
2005-07-20 13:34:16 +00:00
Robert Watson
d26dd2d99e When devfs cloning takes place, provide access to the credential of the
process that caused the clone event to take place for the device driver
creating the device.  This allows cloned device drivers to adapt the
device node based on security aspects of the process, such as the uid,
gid, and MAC label.

- Add a cred reference to struct cdev, so that when a device node is
  instantiated as a vnode, the cloning credential can be exposed to
  MAC.

- Add make_dev_cred(), a version of make_dev() that additionally
  accepts the credential to stick in the struct cdev.  Implement it and
  make_dev() in terms of a back-end make_dev_credv().

- Add a new event handler, dev_clone_cred, which can be registered to
  receive the credential instead of dev_clone, if desired.

- Modify the MAC entry point mac_create_devfs_device() to accept an
  optional credential pointer (may be NULL), so that MAC policies can
  inspect and act on the label or other elements of the credential
  when initializing the skeleton device protections.

- Modify tty_pty.c to register clone_dev_cred and invoke make_dev_cred(),
  so that the pty clone credential is exposed to the MAC Framework.

While currently primarily focussed on MAC policies, this change is also
a prerequisite for changes to allow ptys to be instantiated with the UID
of the process looking up the pty.  This requires further changes to the
pty driver -- in particular, to immediately recycle pty nodes on last
close so that the credential-related state can be recreated on next
lookup.

Submitted by:	Andrew Reisse <andrew.reisse@sparta.com>
Obtained from:	TrustedBSD Project
Sponsored by:	SPAWAR, SPARTA
MFC after:	1 week
MFC note:	Merge to 6.x, but not 5.x for ABI reasons
2005-07-14 10:22:09 +00:00
Craig Rodrigues
fd225fe4a3 Do not declare a struct as extern, and then implement
it as static in the same file.  This is not legal C,
and GCC 4.0 will issue an error.

Reviewed by:	phk
Approved by:	das (mentor)
2005-05-31 14:50:49 +00:00
Jeff Roberson
7b6b7657d2 - In devfs_open() and devfs_close() grab Giant if the driver sets NEEDGIANT.
We still have to DROP_GIANT and PICKUP_GIANT when NEEDGIANT is not set
   because vfs is still sometime entered with Giant held.
2005-05-01 00:56:34 +00:00
Jeff Roberson
cd360e947b - Mark devfs as MNTK_MPSAFE as I belive it does not require Giant.
Sponsored by:	Isilon Systems, Inc.
Agreed in principle by:		phk
2005-04-30 11:24:17 +00:00
Jeff Roberson
4585e3ac5a - Change all filesystems and vfs_cache to relock the dvp once the child is
locked in the ISDOTDOT case.  Se vfs_lookup.c r1.79 for details.

Sponsored by:	Isilon Systems, Inc.
2005-04-13 10:59:09 +00:00
Poul-Henning Kamp
f4f6abcb4e Explicitly hold a reference to the cdev we have just cloned. This
closes the race where the cdev was reclaimed before it ever made it
back to devfs lookup.
2005-03-31 12:19:44 +00:00
Poul-Henning Kamp
9477d73e32 cdev (still) needs per instance uid/gid/mode
Add unlocked version of dev_ref()

Clean up various stuff in sys/conf.h
2005-03-31 10:29:57 +00:00
Poul-Henning Kamp
eb151cb989 Rename dev_ref() to dev_refl() 2005-03-31 06:51:54 +00:00
Jeff Roberson
ea124bf597 - LK_NOPAUSE is a nop now.
Sponsored by:	Isilon Systems, Inc.
2005-03-31 04:27:49 +00:00
Jeff Roberson
eddcb03d02 - We no longer have to bother with PDIRUNLOCK, lookup() handles it for us.
Sponsored by:   Isilon Systems, Inc.
2005-03-28 09:34:36 +00:00
Jeff Roberson
d9b2d9f7a2 - Update vfs_root implementations to match the new prototype. None of
these filesystems will support shared locks until they are explicitly
   modified to do so.  Careful review must be done to ensure that this
   is safe for each individual filesystem.

Sponsored by:	Isilon Systems, Inc.
2005-03-24 07:36:16 +00:00
Poul-Henning Kamp
800b42bde0 Prepare for the final onslaught on devices:
Move uid/gid/mode from cdev to cdevsw.

Add kind field to use for devd(8) later.

Bump both D_VERSION and __FreeBSD_version
2005-03-17 12:07:00 +00:00
Jeff Roberson
c0f681c21d - The VI_DOOMED flag now signals the end of a vnode's relationship with
the filesystem.  Check that rather than VI_XLOCK.

Sponsored by:	Isilon Systems, Inc.
2005-03-13 12:14:56 +00:00
Poul-Henning Kamp
2647407860 One more bit of the major/minor patch to make ttyname happy as well. 2005-03-10 18:49:17 +00:00
Poul-Henning Kamp
b43ab0e378 Try to fix the mess I made of devname, with the minimal subset of the
larger minor/major patch which was posted for testing.
2005-03-10 18:21:34 +00:00
Poul-Henning Kamp
f5af7353c0 Remove kernelside support for devfs rules filtering on major numbers. 2005-03-08 19:51:27 +00:00
Poul-Henning Kamp
0454a53d65 We may not have an actual cdev at this point. 2005-02-22 18:17:31 +00:00
Poul-Henning Kamp
aa2f6ddc3f Reap more benefits from DEVFS:
List devfs_dirents rather than vnodes off their shared struct cdev, this
saves a pointer field in the vnode at the expense of a field in the
devfs_dirent.  There are often 100 times more vnodes so this is bargain.
In addition it makes it harder for people to try to do stypid things like
"finding the vnode from cdev".

Since DEVFS handles all VCHR nodes now, we can do the vnode related
cleanup in devfs_reclaim() instead of in dev_rel() and vgonel().
Similarly, we can do the struct cdev related cleanup in dev_rel()
instead of devfs_reclaim().

	rename idestroy_dev() to destroy_devl() for consistency.

	Add LIST_ENTRY de_alias to struct devfs_dirent.
	Remove v_specnext from struct vnode.
	Change si_hlist to si_alist in struct cdev.
	String new devfs vnodes' devfs_dirent on si_alist when
	we create them and take them off in devfs_reclaim().

	Fix devfs_revoke() accordingly.  Also don't clear fields
	devfs_reclaim() will clear when called from vgone();

	Let devfs_reclaim() call dev_rel() instead of vgonel().

	Move the usecount tracking from dev_rel() to devfs_reclaim(),
	and let dev_rel() take a struct cdev argument instead of vnode.

	Destroy SI_CHEAPCLONE devices in dev_rel() (instead of
	devfs_reclaim()) when they are no longer used.   (This
	should maybe happen in devfs_close() instead.)
2005-02-22 15:51:07 +00:00
Poul-Henning Kamp
1a1457d427 Make dev_ref() require the dev_lock() to be held and use it from
devfs instead of directly frobbing the si_refcount.
2005-02-22 14:41:04 +00:00
Poul-Henning Kamp
4d8ac58b05 Introduce vx_wait{l}() and use it instead of home-rolled versions. 2005-02-17 10:49:51 +00:00
Poul-Henning Kamp
5ece08f57a Make a SYSCTL_NODE static 2005-02-10 12:23:29 +00:00
Poul-Henning Kamp
df32e67c73 Statize devfs_ops_f 2005-02-10 12:04:26 +00:00
Poul-Henning Kamp
a369f34d76 Make filesystems get rid of their own vnodes vnode_pager object in
VOP_RECLAIM().
2005-01-28 14:42:17 +00:00
Poul-Henning Kamp
83c6439714 Whitespace in vop_vector{} initializations. 2005-01-13 18:59:48 +00:00
Poul-Henning Kamp
7164e8f291 Silently ignore forced argument to unmount. 2005-01-11 12:02:26 +00:00
Warner Losh
d167cf6f3a /* -> /*- for copyright notices, minor format tweaks as necessary 2005-01-06 18:10:42 +00:00
Poul-Henning Kamp
59f69ba49f Unsupport forceful unmounts of DEVFS.
After disscussing things I have decided to take the easy and
consistent 90% solution instead of aiming for the very involved 99%
solution.

If we allow forceful unmounts of DEVFS we need to decide how to handle
the devices which are in use through this filesystem at the time.

We cannot just readopt the open devices in the main /dev instance since
that would open us to security issues.

For the majority of the devices, this is relatively straightforward
as we can just pretend they got revoke(2)'ed.

Some devices get tricky:  /dev/console and /dev/tty for instance
does a sort of recursive open of the real console device.   Other devices
may be mmap'ed (kill the processes ?).

And then there are disk devices which are mounted.

The correct thing here would be to recursively unmount the filesystems
mounte from devices from our DEVFS instance (forcefully) and if
this succeeds, complete the forcefully unmount of DEVFS.  But if
one of the forceful unmounts fail we cannot complete the forceful
unmount of DEVFS, but we are likely to already have severed a lot
of stuff in the process of trying.

Event attempting this would be a lot of code for a very far out
corner-case which most people would never see or get in touch with.

It's just not worth it.
2005-01-04 07:52:26 +00:00
Poul-Henning Kamp
50a36c111f Be consistent about flag values passed to device drivers read/write
methods:

Read can see O_NONBLOCK and O_DIRECT.

Write can see O_NONBLOCK, O_DIRECT and O_FSYNC.

In addition O_DIRECT is shadowed as IO_DIRECT for now for backwards
compatibility.
2004-12-22 17:05:44 +00:00
Poul-Henning Kamp
10eee285f7 Shuffle numeric values of the IO_* flags to match the O_* flags from
fcntl.h.

This is in preparation for making the flags passed to device drivers be
consistently from fcntl.h for all entrypoints.

Today open, close and ioctl uses fcntl.h flags, while read and write
uses vnode.h flags.
2004-12-22 16:25:50 +00:00
Poul-Henning Kamp
e87047b437 We can only ever get to vgonechrl() from a devfs vnode, so we do not
need to reassign the vp->v_op to devfs_specops, we know that is the
value already.

Make devfs_specops private to devfs.
2004-12-20 21:34:29 +00:00
Poul-Henning Kamp
2c0220129d Add a couple of KASSERTS to try to diagnose a problem reported. 2004-12-20 21:12:11 +00:00
Poul-Henning Kamp
2a9e0c3216 Be a bit more assertive about vnode bypass. 2004-12-14 09:32:18 +00:00
Poul-Henning Kamp
1dc4727ea3 Another FNONBLOCK -> O_NONBLOCK.
Don't unconditionally set IO_UNIT to device drivers in write:  nobody
checks it, and since it was always set it did not carry information anyway.
2004-12-13 07:41:19 +00:00
Poul-Henning Kamp
ab9caf9d67 Use O_NONBLOCK instead of FNONBLOCK alias. 2004-12-13 07:37:29 +00:00
Poul-Henning Kamp
f0d5cba935 Explicit panic in vop_read/vop_write for devices 2004-12-13 07:13:21 +00:00
Poul-Henning Kamp
20a92a18f1 The remaining part of nmount/omount/rootfs mount changes. I cannot sensibly
split the conversion of the remaining three filesystems out from the root
mounting changes, so in one go:

cd9660:
	Convert to nmount.
	Add omount compat shims.
	Remove dedicated rootfs mounting code.
	Use vfs_mountedfrom()
	Rely on vfs_mount.c calling VFS_STATFS()

nfs(client):
	Convert to nmount (the simple way, mount_nfs(8) is still necessary).
	Add omount compat shims.
	Drop COMPAT_PRELITE2 mount arg compatibility.

ffs:
	Convert to nmount.
	Add omount compat shims.
	Remove dedicated rootfs mounting code.
	Use vfs_mountedfrom()
	Rely on vfs_mount.c calling VFS_STATFS()

Remove vfs_omount() method, all filesystems are now converted.

Remove MNTK_WANTRDWR, handling RO/RW conversions is a filesystem
task, and they all do it now.

Change rootmounting to use DEVFS trampoline:

vfs_mount.c:
	Mount devfs on /.  Devfs needs no 'from' so this is clean.
	symlink /dev to /.  This makes it possible to lookup /dev/foo.
	Mount "real" root filesystem on /.
	Surgically move the devfs mountpoint from under the real root
	filesystem onto /dev in the real root filesystem.

Remove now unnecessary getdiskbyname().

kern_init.c:
	Don't do devfs mounting and rootvnode assignment here, it was
	already handled by vfs_mount.c.

Remove now unused bdevvp(), addaliasu() and addalias().  Put the
few necessary lines in devfs where they belong.  This eliminates the
second-last source of bogo vnodes, leaving only the lemming-syncer.

Remove rootdev variable, it doesn't give meaning in a global context and
was not trustworth anyway.  Correct information is provided by
statfs(/).
2004-12-07 08:15:41 +00:00
Poul-Henning Kamp
a1f5fe1538 Use vfs_mountedfrom() and rely on vfs_mount.c to call VFS_STATFS() 2004-12-06 19:54:31 +00:00
Poul-Henning Kamp
743312367a VFS_STATFS(mp, ...) is mostly called with &mp->mnt_stat, but a few cases
doesn't.  Most of the implementations have grown weeds for this so they
copy some fields from mnt_stat if the passed argument isn't that.

Fix this the cleaner way:  Always call the implementation on mnt_stat
and copy that in toto to the VFS_STATFS argument if different.
2004-12-05 22:41:02 +00:00
Poul-Henning Kamp
aec0fb7b40 Back when VOP_* was introduced, we did not have new-style struct
initializations but we did have lofty goals and big ideals.

Adjust to more contemporary circumstances and gain type checking.

	Replace the entire vop_t frobbing thing with properly typed
	structures.  The only casualty is that we can not add a new
	VOP_ method with a loadable module.  History has not given
	us reason to belive this would ever be feasible in the the
	first place.

	Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc.

	Give coda correct prototypes and function definitions for
	all vop_()s.

	Generate a bit more data from the vnode_if.src file:  a
	struct vop_vector and protype typedefs for all vop methods.

	Add a new vop_bypass() and make vop_default be a pointer
	to another struct vop_vector.

	Remove a lot of vfs_init since vop_vector is ready to use
	from the compiler.

	Cast various vop_mumble() to void * with uppercase name,
	for instance VOP_PANIC, VOP_NULL etc.

	Implement VCALL() by making vdesc_offset the offsetof() the
	relevant function pointer in vop_vector.  This is disgusting
	but since the code is generated by a script comparatively
	safe.  The alternative for nullfs etc. would be much worse.

	Fix up all vnode method vectors to remove casts so they
	become typesafe.  (The bulk of this is generated by scripts)
2004-12-01 23:16:38 +00:00
Poul-Henning Kamp
6fde64c778 Mechanically change prototypes for vnode operations to use the new typedefs. 2004-12-01 12:24:41 +00:00
Poul-Henning Kamp
ce59d2149d Ignore MNT_NODEV, it is implicit in choice of filesystem these days. 2004-11-26 07:37:42 +00:00
Poul-Henning Kamp
ea566ae2a5 Make vnode bypass for devices mandatory. 2004-11-17 07:18:49 +00:00
Poul-Henning Kamp
8352b1925d Make vnode bypass the default for devices.
Can be disabled in case of problems with
	vfs.devfs.fops=0
in loader.conf
2004-11-15 22:11:09 +00:00
Poul-Henning Kamp
49b7607eba Integrate most of vop_revoke() into devfs_revoke() where it belongs. 2004-11-13 23:37:29 +00:00
Poul-Henning Kamp
aac5167c38 Add the devfs_fp_check() function which helps us get from a struct file
to a cdev and a devsw, doing all the relevant checks along the way.

Add the check to see if fp->f_vnode->v_rdev differs from our cached
fp->f_data copy of our cdev.  If it does the device was revoked and
we return ENXIO.
2004-11-13 23:21:54 +00:00
Poul-Henning Kamp
b0aed5267e Refuse attemps to mount root filesystem 2004-11-09 22:14:57 +00:00
Poul-Henning Kamp
56dd3a6182 Add optional device vnode bypass to DEVFS.
The tunable vfs.devfs.fops controls this feature and defaults to off.

When enabled (vfs.devfs.fops=1 in loader), device vnodes opened
through a filedescriptor gets a special fops vector which instead
of the detour through the vnode layer goes directly to DEVFS.

Amongst other things this allows us to run Giant free read/write to
device drivers which have been weaned off D_NEEDGIANT.

Currently this means /dev/null, /dev/zero, disks, (and maybe the
random stuff ?)

On a 700MHz K7 machine this doubles the speed of
	dd if=/dev/zero of=/dev/null bs=1 count=1000000

This roughly translates to shaving 2usec of each read/write syscall.

The poll/kqfilter paths need more work before they are giant free,
this work is ongoing in p4::phk_bufwork

Please test this and report any problems, LORs etc.
2004-11-08 10:46:47 +00:00
Poul-Henning Kamp
5349c79d75 Properly implement a default version of VOP_GETWRITEMOUNT.
Remove improper access to vop_stdgetwritemount() which should and
will instead rely on the VOP default path.
2004-11-06 11:41:22 +00:00
Poul-Henning Kamp
ecc14aae12 Add back securelevel check for disks.
XXX: This should live in geom_dev.c but we don't have access to the
cred there.
XXX: XXX:  This may not matter anymore since filesystems use geom_vfs.
2004-11-04 09:17:55 +00:00
Poul-Henning Kamp
4cea3289da Don't give disks special treatment, they don't come this way anymore. 2004-10-29 11:10:55 +00:00
Poul-Henning Kamp
c108bb741c Remove VOP_SPECSTRATEGY() from the system. 2004-10-29 10:59:28 +00:00
Poul-Henning Kamp
6afb3b1c37 Give dev_strategy() an explict cdev argument in preparation for removing
buf->b-dev.

Put a bio between the buf passed to dev_strategy() and the device driver
strategy routine in order to not clobber fields in the buf.

Assert copyright on vfs_bio.c and update copyright message to canonical
text.  There is no legal difference between John Dysons two-clause
abbreviated BSD license and the canonical text.
2004-10-29 07:16:37 +00:00
Poul-Henning Kamp
45628dd373 What can I say: don't allow people to mount DEVFS with option "nodev". 2004-10-28 06:03:25 +00:00
Poul-Henning Kamp
5d9d81e7ea Put the I/O block size in bufobj->bo_bsize.
We keep si_bsize_phys around for now as that is the simplest way to pull
the number out of disk device drivers in devfs_open().  The correct solution
would be to do an ioctl(DIOCGSECTORSIZE), but the point is probably mooth
when filesystems sit on GEOM, so don't bother for now.
2004-10-26 07:39:12 +00:00
Poul-Henning Kamp
ff7c5a4880 Alas, poor SPECFS! -- I knew him, Horatio; A filesystem of infinite
jest, of most excellent fancy: he hath taught me lessons a thousand
times; and now, how abhorred in my imagination it is! my gorge rises
at it.  Here were those hacks that I have curs'd I know not how
oft.  Where be your kludges now? your workarounds? your layering
violations, that were wont to set the table on a roar?

Move the skeleton of specfs into devfs where it now belongs and
bury the rest.
2004-10-22 09:59:37 +00:00
Poul-Henning Kamp
891822a853 XXX mark two places where we do not hold a threadcount on the dev when
frobbing the cdevsw.

In both cases we examine only the cdevsw and it is a good question if we
weren't better off copying those properties into the cdev in the first
place.  This question will be revisited.
2004-09-24 08:32:36 +00:00
Poul-Henning Kamp
5e8c582ac2 Put a version element in the VFS filesystem configuration structure
and refuse initializing filesystems with a wrong version.  This will
aid maintenance activites on the 5-stable branch.

s/vfs_mount/vfs_omount/

s/vfs_nmount/vfs_mount/

Name our filesystems mount function consistently.

Eliminate the namiedata argument to both vfs_mount and vfs_omount.
It was originally there to save stack space.  A few places abused
it to get hold of some credentials to pass around.  Effectively
it is unused.

Reorganize the root filesystem selection code.
2004-07-30 22:08:52 +00:00
Colin Percival
56f21b9d74 Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This is
somewhat clearer, but more importantly allows for a consistent naming
scheme for suser_cred flags.

The old name is still defined, but will be removed in a few days (unless I
hear any complaints...)

Discussed with:	rwatson, scottl
Requested by:	jhb
2004-07-26 07:24:04 +00:00
Robert Watson
de592112e1 In devfs_allocv(), rather than assigning 'td = curthread', assert that
the caller passes in a td that is curthread, and consistently pass 'td'
into vget().  Remove some bogus logic that passed in td or curthread
conditional on td being non-NULL, which seems redundant in the face of
the earlier assignment of td to curthread if td is NULL.

In devfs_symlink(), cache the passed thread in 'td' so we don't have
to keep retrieving it from the 'ap' structure, and assert that td is
curthread (since we dereference it to get thread-local td_ucred).  Use
'td' in preference to curthread for later lockmgr calls, since they are
equal.
2004-07-22 17:03:14 +00:00
Alfred Perlstein
f257b7a54b Make VFS_ROOT() and vflush() take a thread argument.
This is to allow filesystems to decide based on the passed thread
which vnode to return.
Several filesystems used curthread, they now use the passed thread.
2004-07-12 08:14:09 +00:00
Poul-Henning Kamp
9d96090725 Reduce a fair bit of the atomics because we are now called with a
lock from kern_conf.c and cdev's act a lot more like real objects
these days.
2004-06-18 08:08:47 +00:00
Poul-Henning Kamp
f3732fd15b Second half of the dev_t cleanup.
The big lines are:
	NODEV -> NULL
	NOUDEV -> NODEV
	udev_t -> dev_t
	udev2dev() -> findcdev()

Various minor adjustments including handling of userland access to kernel
space struct cdev etc.
2004-06-17 17:16:53 +00:00
Poul-Henning Kamp
89c9c53da0 Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.
2004-06-16 09:47:26 +00:00
Poul-Henning Kamp
bc55355956 Report the correct length for symlink entries. 2004-02-19 19:09:52 +00:00
Poul-Henning Kamp
f82dfde7e5 White-space align a struct definition.
Move a SYSINIT to the file where it belongs.
2004-02-15 21:43:08 +00:00
Colin Percival
9d0be84912 Fix style(9) of my previous commit.
Noticed by: nate
Approved by: nate, rwatson (mentor)
2004-01-21 18:03:54 +00:00
Colin Percival
9f8ef8b8d1 Allow devfs path rules to work on directories. Without this fix,
devfs rule add path fd unhide
is a no-op, while it should unhide the fd subdirectory.

Approved by: phk, rwatson (mentor)
PR: kern/60897
2004-01-21 16:43:29 +00:00
Poul-Henning Kamp
49e9fc0a0d Improve on POLA by populating DEVFS before doing devfs(8) rule ioctls.
PR:	60687
Spotted by:	Colin Percival <cperciva@daemonology.net>
2004-01-02 19:02:28 +00:00
Robert Watson
eca8a663d4 Modify the MAC Framework so that instead of embedding a (struct label)
in various kernel objects to represent security data, we embed a
(struct label *) pointer, which now references labels allocated using
a UMA zone (mac_label.c).  This allows the size and shape of struct
label to be varied without changing the size and shape of these kernel
objects, which become part of the frozen ABI with 5-STABLE.  This opens
the door for boot-time selection of the number of label slots, and hence
changes to the bound on the number of simultaneous labeled policies
at boot-time instead of compile-time.  This also makes it easier to
embed label references in new objects as required for locking/caching
with fine-grained network stack locking, such as inpcb structures.

This change also moves us further in the direction of hiding the
structure of kernel objects from MAC policy modules, not to mention
dramatically reducing the number of '&' symbols appearing in both the
MAC Framework and MAC policy modules, and improving readability.

While this results in minimal performance change with MAC enabled, it
will observably shrink the size of a number of critical kernel data
structures for the !MAC case, and should have a small (but measurable)
performance benefit (i.e., struct vnode, struct socket) do to memory
conservation and reduced cost of zeroing memory.

NOTE: Users of MAC must recompile their kernel and all MAC modules as a
result of this change.  Because this is an API change, third party
MAC modules will also need to be updated to make less use of the '&'
symbol.

Suggestions from:	bmilekic
Obtained from:		TrustedBSD Project
Sponsored by:		DARPA, Network Associates Laboratories
2003-11-12 03:14:31 +00:00