1995-10-02 09:24:44 +00:00
|
|
|
/*-
|
2002-09-20 22:26:27 +00:00
|
|
|
* Copyright (c) 1999-2002 Poul-Henning Kamp
|
1995-10-02 09:24:44 +00:00
|
|
|
* All rights reserved.
|
|
|
|
*
|
|
|
|
* Redistribution and use in source and binary forms, with or without
|
|
|
|
* modification, are permitted provided that the following conditions
|
|
|
|
* are met:
|
|
|
|
* 1. Redistributions of source code must retain the above copyright
|
|
|
|
* notice, this list of conditions and the following disclaimer.
|
|
|
|
* 2. Redistributions in binary form must reproduce the above copyright
|
|
|
|
* notice, this list of conditions and the following disclaimer in the
|
|
|
|
* documentation and/or other materials provided with the distribution.
|
|
|
|
*
|
2002-09-20 22:26:27 +00:00
|
|
|
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
|
|
|
|
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
1995-10-02 09:24:44 +00:00
|
|
|
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
2002-09-20 22:26:27 +00:00
|
|
|
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
|
1995-10-02 09:24:44 +00:00
|
|
|
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
|
|
|
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
|
|
|
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
|
|
|
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
|
|
|
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
|
|
|
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
|
|
|
* SUCH DAMAGE.
|
|
|
|
*/
|
|
|
|
|
2003-06-11 00:56:59 +00:00
|
|
|
#include <sys/cdefs.h>
|
|
|
|
__FBSDID("$FreeBSD$");
|
|
|
|
|
1995-10-02 09:24:44 +00:00
|
|
|
#include <sys/param.h>
|
1999-07-20 09:47:55 +00:00
|
|
|
#include <sys/kernel.h>
|
1995-10-02 10:15:40 +00:00
|
|
|
#include <sys/systm.h>
|
2008-05-14 14:29:54 +00:00
|
|
|
#include <sys/bus.h>
|
2003-02-20 15:35:54 +00:00
|
|
|
#include <sys/bio.h>
|
2002-02-16 17:44:43 +00:00
|
|
|
#include <sys/lock.h>
|
|
|
|
#include <sys/mutex.h>
|
1998-06-07 17:13:14 +00:00
|
|
|
#include <sys/module.h>
|
1999-07-20 09:47:55 +00:00
|
|
|
#include <sys/malloc.h>
|
1995-10-02 09:24:44 +00:00
|
|
|
#include <sys/conf.h>
|
1995-12-21 20:09:46 +00:00
|
|
|
#include <sys/vnode.h>
|
1999-07-20 09:47:55 +00:00
|
|
|
#include <sys/queue.h>
|
2003-09-27 12:53:33 +00:00
|
|
|
#include <sys/poll.h>
|
2007-07-03 17:42:37 +00:00
|
|
|
#include <sys/sx.h>
|
2000-09-02 19:17:34 +00:00
|
|
|
#include <sys/ctype.h>
|
When devfs cloning takes place, give the device driver creating the device
access to the credential of the process that caused the clone event. This
allows cloned device drivers to adapt the
device node based on security aspects of the process, such as the uid,
gid, and MAC label.
- Add a cred reference to struct cdev, so that when a device node is
instantiated as a vnode, the cloning credential can be exposed to
MAC.
- Add make_dev_cred(), a version of make_dev() that additionally
accepts the credential to stick in the struct cdev. Implement it and
make_dev() in terms of a back-end make_dev_credv().
- Add a new event handler, dev_clone_cred, which can be registered to
receive the credential instead of dev_clone, if desired.
- Modify the MAC entry point mac_create_devfs_device() to accept an
optional credential pointer (may be NULL), so that MAC policies can
inspect and act on the label or other elements of the credential
when initializing the skeleton device protections.
- Modify tty_pty.c to register dev_clone_cred and invoke make_dev_cred(),
so that the pty clone credential is exposed to the MAC Framework.
While currently primarily focussed on MAC policies, this change is also
a prerequisite for changes to allow ptys to be instantiated with the UID
of the process looking up the pty. This requires further changes to the
pty driver -- in particular, to immediately recycle pty nodes on last
close so that the credential-related state can be recreated on next
lookup.
Submitted by: Andrew Reisse <andrew.reisse@sparta.com>
Obtained from: TrustedBSD Project
Sponsored by: SPAWAR, SPARTA
MFC after: 1 week
MFC note: Merge to 6.x, but not 5.x for ABI reasons
2005-07-14 10:22:09 +00:00
|
|
|
#include <sys/ucred.h>
|
2007-07-03 17:42:37 +00:00
|
|
|
#include <sys/taskqueue.h>
|
1999-08-08 18:43:05 +00:00
|
|
|
#include <machine/stdarg.h>
|
1995-12-21 20:09:46 +00:00
|
|
|
|
2005-08-16 19:08:01 +00:00
|
|
|
#include <fs/devfs/devfs_int.h>
|
|
|
|
|
2004-07-11 19:26:43 +00:00
|
|
|
static MALLOC_DEFINE(M_DEVT, "cdev", "cdev storage");
|
1999-07-20 09:47:55 +00:00
|
|
|
|
Revamp DEVFS internals pretty severely [1].
Give DEVFS a proper inode called struct cdev_priv. It is important
to keep in mind that this "inode" is shared between all DEVFS
mountpoints, therefore it is protected by the global device mutex.
Link the cdev_priv's into a list, protected by the global device
mutex. Keep track of each cdev_priv's state with a flag bit and
of references from mountpoints with a dedicated usecount.
Reap the benefits of much improved kernel memory allocator and the
generally better defined device driver APIs to get rid of the tables
of pointers + serial numbers, their overflow tables, the atomics
to muck about in them and all the trouble that resulted in.
This makes RAM the only limit on how many devices we can have.
The cdev_priv is actually a super struct containing the normal cdev
as the "public" part, and therefore allocation and freeing has moved
to devfs_devs.c from kern_conf.c.
The overall responsibility is (to be) split such that kern/kern_conf.c
is the stuff that deals with drivers and struct cdev and fs/devfs
handles filesystems and struct cdev_priv and their private liaison
exposed only in devfs_int.h.
Move the inode number from cdev to cdev_priv and allocate inode
numbers properly with unr. Local dirents in the mountpoints
(directories, symlinks) allocate inodes from the same pool to
guarantee against overlaps.
Various other fields are going to migrate from cdev to cdev_priv
in the future in order to hide them. A few fields may migrate
from devfs_dirent to cdev_priv as well.
Protect the DEVFS mountpoint with an sx lock instead of lockmgr,
this lock also protects the directory tree of the mountpoint.
Give each mountpoint a unique integer index, allocated with unr.
Use it to index into an array of devfs_dirent pointers in each cdev_priv.
Initially the array points to a single element also inside cdev_priv,
but as more devfs instances are mounted, the array is extended with
malloc(9) as necessary when the filesystem populates its directory
tree.
Retire the cdev alias lists; the cdev_priv now knows about all the
relevant devfs_dirents (and their vnodes) and devfs_revoke() will
pick them up from there. We still spelunk into other mountpoints
and fondle their data without 100% good locking. It may make better
sense to vector the revoke event into the tty code and there do a
destroy_dev/make_dev on the tty's devices, but that's for further
study.
Lots of shuffling of stuff and churn of bits for no good reason[2].
XXX: There is still nothing preventing the dev_clone EVENTHANDLER
from being invoked at the same time in two devfs mountpoints. It
is not obvious what the best course of action is here.
XXX: comment out an if statement that lost its body, until I can
find out what should go there so it doesn't do damage in the meantime.
XXX: Leave in a few extra malloc types and KASSERTS to help track
down any remaining issues.
Much testing provided by: Kris
Much confusion caused by (races in): md(4)
[1] You are not supposed to understand anything past this point.
[2] This line should simplify life for the peanut gallery.
2005-09-19 19:56:48 +00:00
|
|
|
struct mtx devmtx;
|
2005-02-22 15:51:07 +00:00
|
|
|
static void destroy_devl(struct cdev *dev);
|
2007-07-03 18:18:30 +00:00
|
|
|
static int destroy_dev_sched_cbl(struct cdev *dev,
|
|
|
|
void (*cb)(void *), void *arg);
|
2007-07-03 17:42:37 +00:00
|
|
|
static struct cdev *make_dev_credv(int flags,
|
2008-09-26 14:31:24 +00:00
|
|
|
struct cdevsw *devsw, int unit,
|
2007-07-03 17:42:37 +00:00
|
|
|
struct ucred *cr, uid_t uid, gid_t gid, int mode, const char *fmt,
|
|
|
|
va_list ap);
|
2004-06-17 17:16:53 +00:00
|
|
|
|
2007-06-19 13:19:23 +00:00
|
|
|
static struct cdev_priv_list cdevp_free_list =
|
|
|
|
TAILQ_HEAD_INITIALIZER(cdevp_free_list);
|
2008-03-17 13:17:10 +00:00
|
|
|
static SLIST_HEAD(free_cdevsw, cdevsw) cdevsw_gt_post_list =
|
|
|
|
SLIST_HEAD_INITIALIZER(cdevsw_gt_post_list);
|
2007-06-19 13:19:23 +00:00
|
|
|
|
2004-09-23 07:17:41 +00:00
|
|
|
void
|
|
|
|
dev_lock(void)
|
2004-02-21 21:57:26 +00:00
|
|
|
{
|
2005-10-18 18:27:44 +00:00
|
|
|
|
2004-02-21 21:57:26 +00:00
|
|
|
mtx_lock(&devmtx);
|
|
|
|
}
|
|
|
|
|
2008-03-17 13:17:10 +00:00
|
|
|
/*
|
|
|
|
* Free all the memory collected while the cdev mutex was
|
|
|
|
* locked. Since devmtx is ordered after the system map mutex, free() cannot
|
|
|
|
* be called immediately and is postponed until cdev mutex can be
|
|
|
|
* dropped.
|
|
|
|
*/
|
2007-06-19 13:19:23 +00:00
|
|
|
static void
|
|
|
|
dev_unlock_and_free(void)
|
|
|
|
{
|
2008-03-17 13:17:10 +00:00
|
|
|
struct cdev_priv_list cdp_free;
|
|
|
|
struct free_cdevsw csw_free;
|
2007-06-19 13:19:23 +00:00
|
|
|
struct cdev_priv *cdp;
|
2008-03-17 13:17:10 +00:00
|
|
|
struct cdevsw *csw;
|
2007-06-19 13:19:23 +00:00
|
|
|
|
|
|
|
mtx_assert(&devmtx, MA_OWNED);
|
2008-03-17 13:17:10 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Make a local copy of the list heads while devmtx is
|
|
|
|
* held. Free the entries after the mutex is dropped.
|
|
|
|
*/
|
|
|
|
TAILQ_INIT(&cdp_free);
|
|
|
|
TAILQ_CONCAT(&cdp_free, &cdevp_free_list, cdp_list);
|
|
|
|
csw_free = cdevsw_gt_post_list;
|
|
|
|
SLIST_INIT(&cdevsw_gt_post_list);
|
|
|
|
|
|
|
|
mtx_unlock(&devmtx);
|
|
|
|
|
|
|
|
while ((cdp = TAILQ_FIRST(&cdp_free)) != NULL) {
|
|
|
|
TAILQ_REMOVE(&cdp_free, cdp, cdp_list);
|
2007-06-19 13:19:23 +00:00
|
|
|
devfs_free(&cdp->cdp_c);
|
|
|
|
}
|
2008-03-17 13:17:10 +00:00
|
|
|
while ((csw = SLIST_FIRST(&csw_free)) != NULL) {
|
|
|
|
SLIST_REMOVE_HEAD(&csw_free, d_postfree_list);
|
|
|
|
free(csw, M_DEVT);
|
|
|
|
}
|
2007-06-19 13:19:23 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
|
|
|
dev_free_devlocked(struct cdev *cdev)
|
|
|
|
{
|
|
|
|
struct cdev_priv *cdp;
|
|
|
|
|
|
|
|
mtx_assert(&devmtx, MA_OWNED);
|
2008-06-16 17:34:59 +00:00
|
|
|
cdp = cdev2priv(cdev);
|
2007-06-19 13:19:23 +00:00
|
|
|
TAILQ_INSERT_HEAD(&cdevp_free_list, cdp, cdp_list);
|
|
|
|
}
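dev_free_devlocked() depends on cdev2priv() to step from the public struct cdev to the private structure that contains it. The real macro lives in devfs_int.h and is not shown here, so the following is only a userland sketch of the embedding idea. The names pub_dev, priv_dev and pub2priv are hypothetical stand-ins, not kernel identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct cdev and struct cdev_priv. */
struct pub_dev {
	int refcount;
};

struct priv_dev {
	unsigned flags;		/* private bookkeeping */
	struct pub_dev pub;	/* the "public" part handed to drivers */
};

/* Recover the enclosing private structure from its public part. */
static struct priv_dev *
pub2priv(struct pub_dev *p)
{
	assert(p != NULL);
	return ((struct priv_dev *)((char *)p - offsetof(struct priv_dev, pub)));
}
```

Given a struct priv_dev d, pub2priv(&d.pub) yields &d again, which is what lets a function that only receives the public pointer reach the private list linkage.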
|
|
|
|
|
2008-03-17 13:17:10 +00:00
|
|
|
static void
|
|
|
|
cdevsw_free_devlocked(struct cdevsw *csw)
|
|
|
|
{
|
|
|
|
|
|
|
|
mtx_assert(&devmtx, MA_OWNED);
|
|
|
|
SLIST_INSERT_HEAD(&cdevsw_gt_post_list, csw, d_postfree_list);
|
|
|
|
}
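The three functions above split freeing into two phases: objects are only queued while devmtx is held, and the actual free() happens after the mutex is dropped. A minimal userland sketch of that pattern follows, assuming a pthread mutex in place of devmtx and a hand-rolled singly linked list in place of the queue(3) macros. All names in it (node, big_lock, pending_free, defer_free_locked, unlock_and_free) are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical node type; the kernel code queues struct cdev_priv. */
struct node {
	struct node *next;
};

static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static struct node *pending_free;	/* protected by big_lock */

/* Caller holds big_lock: only queue the node, never call free() here. */
static void
defer_free_locked(struct node *n)
{
	assert(n != NULL);
	n->next = pending_free;
	pending_free = n;
}

/*
 * Caller holds big_lock. Detach the pending list, drop the lock, then
 * free the detached nodes. Returns the number of nodes freed.
 */
static int
unlock_and_free(void)
{
	struct node *local, *n;
	int freed = 0;

	local = pending_free;
	pending_free = NULL;
	pthread_mutex_unlock(&big_lock);

	while ((n = local) != NULL) {
		local = n->next;
		free(n);
		freed++;
	}
	return (freed);
}
```

Detaching the list before unlocking means the free loop touches only thread-local state, so another thread may start queueing new nodes immediately.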
|
|
|
|
|
2004-09-23 07:17:41 +00:00
|
|
|
void
|
|
|
|
dev_unlock(void)
|
2004-02-21 21:57:26 +00:00
|
|
|
{
|
2004-09-24 05:54:32 +00:00
|
|
|
|
2004-02-21 21:57:26 +00:00
|
|
|
mtx_unlock(&devmtx);
|
|
|
|
}
|
|
|
|
|
2005-03-31 10:29:57 +00:00
|
|
|
void
|
|
|
|
dev_ref(struct cdev *dev)
|
|
|
|
{
|
|
|
|
|
|
|
|
mtx_assert(&devmtx, MA_NOTOWNED);
|
|
|
|
mtx_lock(&devmtx);
|
|
|
|
dev->si_refcount++;
|
|
|
|
mtx_unlock(&devmtx);
|
|
|
|
}
|
|
|
|
|
2004-02-21 21:57:26 +00:00
|
|
|
void
|
2005-03-31 06:51:54 +00:00
|
|
|
dev_refl(struct cdev *dev)
|
2004-02-21 21:57:26 +00:00
|
|
|
{
|
2004-09-24 05:54:32 +00:00
|
|
|
|
2005-02-22 14:41:04 +00:00
|
|
|
mtx_assert(&devmtx, MA_OWNED);
|
2004-02-21 21:57:26 +00:00
|
|
|
dev->si_refcount++;
|
|
|
|
}
|
|
|
|
|
|
|
|
void
|
2005-02-22 15:51:07 +00:00
|
|
|
dev_rel(struct cdev *dev)
|
2004-02-21 21:57:26 +00:00
|
|
|
{
|
2005-02-22 15:51:07 +00:00
|
|
|
int flag = 0;
|
2004-09-23 07:17:41 +00:00
|
|
|
|
2004-10-01 06:33:39 +00:00
|
|
|
mtx_assert(&devmtx, MA_NOTOWNED);
|
|
|
|
dev_lock();
|
2004-02-21 21:57:26 +00:00
|
|
|
dev->si_refcount--;
|
|
|
|
KASSERT(dev->si_refcount >= 0,
|
|
|
|
("dev_rel(%s) gave negative count", devtoname(dev)));
|
2005-09-19 19:56:48 +00:00
|
|
|
#if 0
|
2005-02-22 15:51:07 +00:00
|
|
|
if (dev->si_usecount == 0 &&
|
|
|
|
(dev->si_flags & SI_CHEAPCLONE) && (dev->si_flags & SI_NAMED))
|
2005-09-19 19:56:48 +00:00
|
|
|
;
|
|
|
|
else
|
|
|
|
#endif
|
2006-01-04 17:40:54 +00:00
|
|
|
if (dev->si_devsw == NULL && dev->si_refcount == 0) {
|
2004-02-21 21:57:26 +00:00
|
|
|
LIST_REMOVE(dev, si_list);
|
2004-10-01 06:33:39 +00:00
|
|
|
flag = 1;
|
2004-02-21 21:57:26 +00:00
|
|
|
}
|
2004-10-01 06:33:39 +00:00
|
|
|
dev_unlock();
|
|
|
|
if (flag)
|
2005-09-19 19:56:48 +00:00
|
|
|
devfs_free(dev);
|
2004-02-21 21:57:26 +00:00
|
|
|
}
|
2004-10-01 06:33:39 +00:00
|
|
|
|
2004-09-24 05:54:32 +00:00
|
|
|
struct cdevsw *
|
|
|
|
dev_refthread(struct cdev *dev)
|
|
|
|
{
|
|
|
|
struct cdevsw *csw;
|
2007-07-03 17:42:37 +00:00
|
|
|
struct cdev_priv *cdp;
|
2004-09-24 05:54:32 +00:00
|
|
|
|
|
|
|
mtx_assert(&devmtx, MA_NOTOWNED);
|
|
|
|
dev_lock();
|
|
|
|
csw = dev->si_devsw;
|
2007-07-03 17:42:37 +00:00
|
|
|
if (csw != NULL) {
|
2008-06-16 17:34:59 +00:00
|
|
|
cdp = cdev2priv(dev);
|
2007-07-03 17:42:37 +00:00
|
|
|
if ((cdp->cdp_flags & CDP_SCHED_DTR) == 0)
|
|
|
|
dev->si_threadcount++;
|
|
|
|
else
|
|
|
|
csw = NULL;
|
|
|
|
}
|
2004-09-24 05:54:32 +00:00
|
|
|
dev_unlock();
|
|
|
|
return (csw);
|
|
|
|
}
|
|
|
|
|
2006-10-20 07:59:50 +00:00
|
|
|
struct cdevsw *
|
|
|
|
devvn_refthread(struct vnode *vp, struct cdev **devp)
|
|
|
|
{
|
|
|
|
struct cdevsw *csw;
|
2007-07-03 17:42:37 +00:00
|
|
|
struct cdev_priv *cdp;
|
2006-10-20 07:59:50 +00:00
|
|
|
|
|
|
|
mtx_assert(&devmtx, MA_NOTOWNED);
|
|
|
|
csw = NULL;
|
|
|
|
dev_lock();
|
|
|
|
*devp = vp->v_rdev;
|
|
|
|
if (*devp != NULL) {
|
2008-06-16 17:34:59 +00:00
|
|
|
cdp = cdev2priv(*devp);
|
2007-07-03 17:42:37 +00:00
|
|
|
if ((cdp->cdp_flags & CDP_SCHED_DTR) == 0) {
|
|
|
|
csw = (*devp)->si_devsw;
|
|
|
|
if (csw != NULL)
|
|
|
|
(*devp)->si_threadcount++;
|
|
|
|
}
|
2006-10-20 07:59:50 +00:00
|
|
|
}
|
|
|
|
dev_unlock();
|
|
|
|
return (csw);
|
|
|
|
}
|
|
|
|
|
2004-09-24 05:54:32 +00:00
|
|
|
void
|
|
|
|
dev_relthread(struct cdev *dev)
|
|
|
|
{
|
|
|
|
|
|
|
|
mtx_assert(&devmtx, MA_NOTOWNED);
|
|
|
|
dev_lock();
|
2008-05-23 16:38:38 +00:00
|
|
|
KASSERT(dev->si_threadcount > 0,
|
|
|
|
("%s threadcount is wrong", dev->si_name));
|
2004-09-24 05:54:32 +00:00
|
|
|
dev->si_threadcount--;
|
|
|
|
dev_unlock();
|
|
|
|
}
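dev_refthread() and dev_relthread() bracket every call into a driver: the first hands out the method table only if the device still has one and destruction has not been scheduled, and the second undoes the count. The sketch below models that in userland, assuming a pthread mutex instead of devmtx and a plain boolean instead of the CDP_SCHED_DTR flag. The names ops, dev, dev_mutex, dev_enter and dev_leave are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in: "ops" plays the role of si_devsw. */
struct ops {
	const char *name;
};

struct dev {
	const struct ops *ops;	/* NULL once the device is gone */
	bool dying;		/* destruction has been scheduled */
	int threadcount;	/* threads currently inside the driver */
};

static pthread_mutex_t dev_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Returns the ops table and counts the caller in, or NULL if unusable. */
static const struct ops *
dev_enter(struct dev *d)
{
	const struct ops *o;

	pthread_mutex_lock(&dev_mutex);
	o = d->ops;
	if (o != NULL) {
		if (!d->dying)
			d->threadcount++;
		else
			o = NULL;
	}
	pthread_mutex_unlock(&dev_mutex);
	return (o);
}

/* Must pair with a successful dev_enter(). */
static void
dev_leave(struct dev *d)
{
	pthread_mutex_lock(&dev_mutex);
	assert(d->threadcount > 0);
	d->threadcount--;
	pthread_mutex_unlock(&dev_mutex);
}
```

A caller that gets NULL back must not call dev_leave(), since no count was taken on its behalf.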
|
2004-02-21 21:57:26 +00:00
|
|
|
|
2003-09-27 12:53:33 +00:00
|
|
|
int
|
|
|
|
nullop(void)
|
|
|
|
{
|
|
|
|
|
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
|
|
|
|
int
|
|
|
|
eopnotsupp(void)
|
|
|
|
{
|
|
|
|
|
|
|
|
return (EOPNOTSUPP);
|
|
|
|
}
|
2003-02-20 15:35:54 +00:00
|
|
|
|
|
|
|
static int
|
|
|
|
enxio(void)
|
|
|
|
{
|
|
|
|
return (ENXIO);
|
|
|
|
}
|
|
|
|
|
2003-09-27 12:53:33 +00:00
|
|
|
static int
|
|
|
|
enodev(void)
|
|
|
|
{
|
|
|
|
return (ENODEV);
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Define a dead_cdevsw for use when devices leave unexpectedly. */
|
|
|
|
|
2003-02-20 15:35:54 +00:00
|
|
|
#define dead_open (d_open_t *)enxio
|
|
|
|
#define dead_close (d_close_t *)enxio
|
|
|
|
#define dead_read (d_read_t *)enxio
|
|
|
|
#define dead_write (d_write_t *)enxio
|
|
|
|
#define dead_ioctl (d_ioctl_t *)enxio
|
2003-09-27 12:53:33 +00:00
|
|
|
#define dead_poll (d_poll_t *)enodev
|
|
|
|
#define dead_mmap (d_mmap_t *)enodev
|
2003-02-20 15:35:54 +00:00
|
|
|
|
|
|
|
static void
|
|
|
|
dead_strategy(struct bio *bp)
|
|
|
|
{
|
|
|
|
|
|
|
|
biofinish(bp, NULL, ENXIO);
|
|
|
|
}
|
|
|
|
|
2003-02-21 19:00:48 +00:00
|
|
|
#define dead_dump (dumper_t *)enxio
|
2003-02-20 15:35:54 +00:00
|
|
|
#define dead_kqfilter (d_kqfilter_t *)enxio
|
|
|
|
|
|
|
|
static struct cdevsw dead_cdevsw = {
|
2004-02-21 21:10:55 +00:00
|
|
|
.d_version = D_VERSION,
|
|
|
|
.d_flags = D_NEEDGIANT, /* XXX: does dead_strategy need this ? */
|
2003-03-03 12:15:54 +00:00
|
|
|
.d_open = dead_open,
|
|
|
|
.d_close = dead_close,
|
|
|
|
.d_read = dead_read,
|
|
|
|
.d_write = dead_write,
|
|
|
|
.d_ioctl = dead_ioctl,
|
|
|
|
.d_poll = dead_poll,
|
|
|
|
.d_mmap = dead_mmap,
|
|
|
|
.d_strategy = dead_strategy,
|
|
|
|
.d_name = "dead",
|
|
|
|
.d_dump = dead_dump,
|
|
|
|
.d_kqfilter = dead_kqfilter
|
2003-02-20 15:35:54 +00:00
|
|
|
};
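dead_cdevsw is a method table whose entries all fail, so a device that vanishes can be pointed at it instead of at freed driver code. The sketch below shows the same idea in userland with a much reduced, hypothetical ops structure. Note one deliberate difference: the source casts a single enxio() function to each method type, while this sketch uses separately typed stubs so that it stays within strictly conforming C. The names ops, dead_open_stub, dead_read_stub, dead_ops and detach are assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, much reduced operations table. */
struct ops {
	const char *name;
	int (*open)(void *dev, int flags);
	int (*read)(void *dev, char *buf, size_t len);
};

/* Typed stubs: every entry point fails the same way. */
static int
dead_open_stub(void *dev, int flags)
{
	(void)dev;
	(void)flags;
	return (ENXIO);
}

static int
dead_read_stub(void *dev, char *buf, size_t len)
{
	(void)dev;
	(void)buf;
	(void)len;
	return (ENXIO);
}

static const struct ops dead_ops = {
	.name = "dead",
	.open = dead_open_stub,
	.read = dead_read_stub,
};

/* Swap in the dead table when the hardware goes away. */
static void
detach(const struct ops **slot)
{
	assert(slot != NULL);
	*slot = &dead_ops;
}
```

After detach(), callers that still hold the device keep working through the same pointer and simply see ENXIO from every method.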
|
|
|
|
|
2003-09-27 12:53:33 +00:00
|
|
|
/* Default methods if driver does not specify method */
|
|
|
|
|
|
|
|
#define null_open (d_open_t *)nullop
|
|
|
|
#define null_close (d_close_t *)nullop
|
|
|
|
#define no_read (d_read_t *)enodev
|
|
|
|
#define no_write (d_write_t *)enodev
|
|
|
|
#define no_ioctl (d_ioctl_t *)enodev
|
|
|
|
#define no_mmap (d_mmap_t *)enodev
|
2004-08-15 06:24:42 +00:00
|
|
|
#define no_kqfilter (d_kqfilter_t *)enodev
|
2003-09-27 12:53:33 +00:00
|
|
|
|
|
|
|
static void
|
|
|
|
no_strategy(struct bio *bp)
|
|
|
|
{
|
|
|
|
|
|
|
|
biofinish(bp, NULL, ENODEV);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
2004-06-16 09:47:26 +00:00
|
|
|
no_poll(struct cdev *dev __unused, int events, struct thread *td __unused)
|
2003-09-27 12:53:33 +00:00
|
|
|
{
|
|
|
|
|
2009-03-06 15:35:37 +00:00
|
|
|
return (poll_no_poll(events));
|
2003-09-27 12:53:33 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
#define no_dump (dumper_t *)enodev
|
2001-10-27 17:44:21 +00:00
|
|
|
|
2005-08-17 08:19:52 +00:00
|
|
|
static int
|
|
|
|
giant_open(struct cdev *dev, int oflags, int devtype, struct thread *td)
|
|
|
|
{
|
2008-03-17 13:17:10 +00:00
|
|
|
struct cdevsw *dsw;
|
2005-08-17 08:19:52 +00:00
|
|
|
int retval;
|
|
|
|
|
2008-03-17 13:17:10 +00:00
|
|
|
dsw = dev_refthread(dev);
|
|
|
|
if (dsw == NULL)
|
|
|
|
return (ENXIO);
|
2005-08-17 08:19:52 +00:00
|
|
|
mtx_lock(&Giant);
|
2008-03-17 13:17:10 +00:00
|
|
|
retval = dsw->d_gianttrick->d_open(dev, oflags, devtype, td);
|
2005-08-17 08:19:52 +00:00
|
|
|
mtx_unlock(&Giant);
|
2008-03-17 13:17:10 +00:00
|
|
|
dev_relthread(dev);
|
2005-08-17 08:19:52 +00:00
|
|
|
return (retval);
|
|
|
|
}
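giant_open() and the wrappers that follow it all share one shape: check that the device is still usable, take Giant, forward to the driver's original method through d_gianttrick, release Giant, and return the result. A reduced userland sketch of that shape is given below. It assumes a pthread mutex in place of Giant and leaves out the thread counting done by dev_refthread() and dev_relthread(). The names ops, real, dev, giant, locked_open, drv_open, drv_ops and wrapped_ops are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

struct dev;

/* Hypothetical reduced ops table; "real" mirrors d_gianttrick. */
struct ops {
	int (*open)(struct dev *d, int flags);
	const struct ops *real;	/* the driver's original table */
};

struct dev {
	const struct ops *ops;	/* NULL when the device is gone */
	int opens;
};

static pthread_mutex_t giant = PTHREAD_MUTEX_INITIALIZER;

/* Wrapper installed in place of the driver's open method. */
static int
locked_open(struct dev *d, int flags)
{
	const struct ops *o = d->ops;
	int error;

	if (o == NULL)
		return (ENXIO);
	assert(o->real != NULL);
	pthread_mutex_lock(&giant);
	error = o->real->open(d, flags);
	pthread_mutex_unlock(&giant);
	return (error);
}

/* Example driver method that assumes the big lock is held. */
static int
drv_open(struct dev *d, int flags)
{
	(void)flags;
	d->opens++;
	return (0);
}

static const struct ops drv_ops = { .open = drv_open, .real = NULL };
static const struct ops wrapped_ops = { .open = locked_open, .real = &drv_ops };
```

Keeping the original table reachable through one pointer lets a single generic wrapper per method serve every driver that still needs the big lock.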
|
|
|
|
|
|
|
|
static int
|
2007-05-31 11:51:53 +00:00
|
|
|
giant_fdopen(struct cdev *dev, int oflags, struct thread *td, struct file *fp)
|
2005-08-17 08:19:52 +00:00
|
|
|
{
|
2008-03-17 13:17:10 +00:00
|
|
|
struct cdevsw *dsw;
|
2005-08-17 08:19:52 +00:00
|
|
|
int retval;
|
|
|
|
|
2008-03-17 13:17:10 +00:00
|
|
|
dsw = dev_refthread(dev);
|
|
|
|
if (dsw == NULL)
|
|
|
|
return (ENXIO);
|
2005-08-17 08:19:52 +00:00
|
|
|
mtx_lock(&Giant);
|
2008-03-17 13:17:10 +00:00
|
|
|
retval = dsw->d_gianttrick->d_fdopen(dev, oflags, td, fp);
|
2005-08-17 08:19:52 +00:00
|
|
|
mtx_unlock(&Giant);
|
2008-03-17 13:17:10 +00:00
|
|
|
dev_relthread(dev);
|
2005-08-17 08:19:52 +00:00
|
|
|
return (retval);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
|
|
|
giant_close(struct cdev *dev, int fflag, int devtype, struct thread *td)
|
|
|
|
{
|
2008-03-17 13:17:10 +00:00
|
|
|
struct cdevsw *dsw;
|
2005-08-17 08:19:52 +00:00
|
|
|
int retval;
|
|
|
|
|
2008-03-17 13:17:10 +00:00
|
|
|
dsw = dev_refthread(dev);
|
|
|
|
if (dsw == NULL)
|
|
|
|
return (ENXIO);
|
2005-08-17 08:19:52 +00:00
|
|
|
mtx_lock(&Giant);
|
2008-03-17 13:17:10 +00:00
|
|
|
retval = dsw->d_gianttrick->d_close(dev, fflag, devtype, td);
|
2005-08-17 08:19:52 +00:00
|
|
|
mtx_unlock(&Giant);
|
2008-03-17 13:17:10 +00:00
|
|
|
dev_relthread(dev);
|
2005-08-17 08:19:52 +00:00
|
|
|
return (retval);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
|
|
|
giant_strategy(struct bio *bp)
|
|
|
|
{
|
2008-03-17 13:17:10 +00:00
|
|
|
struct cdevsw *dsw;
|
|
|
|
struct cdev *dev;
|
2005-08-17 08:19:52 +00:00
|
|
|
|
2008-03-17 13:17:10 +00:00
|
|
|
dev = bp->bio_dev;
|
|
|
|
dsw = dev_refthread(dev);
|
|
|
|
if (dsw == NULL) {
|
|
|
|
biofinish(bp, NULL, ENXIO);
|
|
|
|
return;
|
|
|
|
}
|
2005-08-17 08:19:52 +00:00
|
|
|
mtx_lock(&Giant);
|
2008-03-17 13:17:10 +00:00
|
|
|
dsw->d_gianttrick->d_strategy(bp);
|
2005-08-17 08:19:52 +00:00
|
|
|
mtx_unlock(&Giant);
|
2008-03-17 13:17:10 +00:00
|
|
|
dev_relthread(dev);
|
2005-08-17 08:19:52 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
|
|
|
giant_ioctl(struct cdev *dev, u_long cmd, caddr_t data, int fflag, struct thread *td)
|
|
|
|
{
|
2008-03-17 13:17:10 +00:00
|
|
|
struct cdevsw *dsw;
|
2005-08-17 08:19:52 +00:00
|
|
|
int retval;
|
|
|
|
|
2008-03-17 13:17:10 +00:00
|
|
|
dsw = dev_refthread(dev);
|
|
|
|
if (dsw == NULL)
|
|
|
|
return (ENXIO);
|
2005-08-17 08:19:52 +00:00
|
|
|
mtx_lock(&Giant);
|
2008-04-02 11:11:58 +00:00
|
|
|
retval = dsw->d_gianttrick->d_ioctl(dev, cmd, data, fflag, td);
|
2005-08-17 08:19:52 +00:00
|
|
|
mtx_unlock(&Giant);
|
2008-03-17 13:17:10 +00:00
|
|
|
dev_relthread(dev);
|
2005-08-17 08:19:52 +00:00
|
|
|
return (retval);
|
|
|
|
}

static int
giant_read(struct cdev *dev, struct uio *uio, int ioflag)
{
	struct cdevsw *dsw;
	int retval;

	dsw = dev_refthread(dev);
	if (dsw == NULL)
		return (ENXIO);
	mtx_lock(&Giant);
	retval = dsw->d_gianttrick->d_read(dev, uio, ioflag);
	mtx_unlock(&Giant);
	dev_relthread(dev);
	return (retval);
}

static int
giant_write(struct cdev *dev, struct uio *uio, int ioflag)
{
	struct cdevsw *dsw;
	int retval;

	dsw = dev_refthread(dev);
	if (dsw == NULL)
		return (ENXIO);
	mtx_lock(&Giant);
	retval = dsw->d_gianttrick->d_write(dev, uio, ioflag);
	mtx_unlock(&Giant);
	dev_relthread(dev);
	return (retval);
}

static int
giant_poll(struct cdev *dev, int events, struct thread *td)
{
	struct cdevsw *dsw;
	int retval;

	dsw = dev_refthread(dev);
	if (dsw == NULL)
		return (ENXIO);
	mtx_lock(&Giant);
	retval = dsw->d_gianttrick->d_poll(dev, events, td);
	mtx_unlock(&Giant);
	dev_relthread(dev);
	return (retval);
}

static int
giant_kqfilter(struct cdev *dev, struct knote *kn)
{
	struct cdevsw *dsw;
	int retval;

	dsw = dev_refthread(dev);
	if (dsw == NULL)
		return (ENXIO);
	mtx_lock(&Giant);
	retval = dsw->d_gianttrick->d_kqfilter(dev, kn);
	mtx_unlock(&Giant);
	dev_relthread(dev);
	return (retval);
}

static int
giant_mmap(struct cdev *dev, vm_offset_t offset, vm_paddr_t *paddr, int nprot)
{
	struct cdevsw *dsw;
	int retval;

	dsw = dev_refthread(dev);
	if (dsw == NULL)
		return (ENXIO);
	mtx_lock(&Giant);
	retval = dsw->d_gianttrick->d_mmap(dev, offset, paddr, nprot);
	mtx_unlock(&Giant);
	dev_relthread(dev);
	return (retval);
}
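
The giant_*() functions above all follow one pattern: a wrapper with the same signature as the driver method looks up the saved original method table, takes a global lock, calls the original, and drops the lock. A minimal userspace sketch of that pattern follows. It is an illustration only, not kernel code: struct ops, struct device, big_lock, locked_read() and driver_read() are hypothetical names, and a pthread mutex stands in for Giant.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

struct device;

/* Hypothetical method table, standing in for struct cdevsw. */
struct ops {
	int (*read)(struct device *dev, int arg);
	struct ops *saved;	/* driver's original methods, like d_gianttrick */
};

/* Hypothetical device; ops is NULL once the device is gone. */
struct device {
	struct ops *ops;
};

/* Stand-in for Giant. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

/* Same signature as ops->read, so it can be installed in its place. */
static int
locked_read(struct device *dev, int arg)
{
	struct ops *o;
	int retval;

	o = dev->ops;
	if (o == NULL)
		return (ENXIO);
	pthread_mutex_lock(&big_lock);
	retval = o->saved->read(dev, arg);
	pthread_mutex_unlock(&big_lock);
	return (retval);
}

/* Example driver method that knows nothing about locking. */
static int
driver_read(struct device *dev, int arg)
{
	(void)dev;
	return (arg * 2);
}
```

A caller would install locked_read in the visible table and keep driver_read in the saved copy, so that every call through the table is serialized without the driver being modified.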

/*
 * Send a devctl notification of the form "cdev=<name>" for the given
 * event.  Nothing is sent while the system is still booting (cold).
 */
static void
notify(struct cdev *dev, const char *ev)
{
	static const char prefix[] = "cdev=";
	char *data;
	int namelen;

	if (cold)
		return;
	namelen = strlen(dev->si_name);
	/* sizeof(prefix) counts the terminating NUL, so there is room for it. */
	data = malloc(namelen + sizeof(prefix), M_TEMP, M_WAITOK);
	memcpy(data, prefix, sizeof(prefix) - 1);
	memcpy(data + sizeof(prefix) - 1, dev->si_name, namelen + 1);
	devctl_notify("DEVFS", "CDEV", ev, data);
	free(data, M_TEMP);
}
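
The buffer arithmetic in notify() is easy to misread, so here is a small userspace sketch of the same string construction. It is an illustration under assumptions: make_event_data() is a hypothetical helper, and the standard malloc(3) is used, which can fail, unlike the kernel's malloc(9) with M_WAITOK.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a malloc'd "cdev=<name>" string, or NULL
 * if the allocation fails.  The caller frees the result.
 */
static char *
make_event_data(const char *name)
{
	static const char prefix[] = "cdev=";
	size_t namelen;
	char *data;

	namelen = strlen(name);
	/* sizeof(prefix) is 6: five characters plus the NUL. */
	data = malloc(namelen + sizeof(prefix));
	if (data == NULL)
		return (NULL);
	/* Copy the prefix without its NUL ... */
	memcpy(data, prefix, sizeof(prefix) - 1);
	/* ... then the name together with its NUL. */
	memcpy(data + sizeof(prefix) - 1, name, namelen + 1);
	return (data);
}
```

Because the prefix is copied without its terminator and the name with it, the allocation of namelen + sizeof(prefix) bytes is exactly the size of the finished string.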

static void
notify_create(struct cdev *dev)
{

	notify(dev, "CREATE");
}

static void
notify_destroy(struct cdev *dev)
{

	notify(dev, "DESTROY");
}

static struct cdev *
newdev(struct cdevsw *csw, int y, struct cdev *si)
Divorce "dev_t" from the "major|minor" bitmap, which is now called
udev_t in the kernel but still called dev_t in userland.

Provide functions to manipulate both types:
	major()		umajor()
	minor()		uminor()
	makedev()	umakedev()
	dev2udev()	udev2dev()

For now they're functions, they will become in-line functions
after one of the next two steps in this process.

Return major/minor/makedev to macro-hood for userland.

Register a name in cdevsw[] for the "filedescriptor" driver.

In the kernel the udev_t appears in places where we have the
major/minor number combination, (ie: a potential device: we
may not have the driver nor the device), like in inodes, vattr,
cdevsw registration and so on, whereas the dev_t appears where
we carry around a reference to an actual device.

In the future the cdevsw and the aliased-from vnode will be hung
directly from the dev_t, along with up to two softc pointers for
the device driver and a few housekeeping bits. This will essentially
replace the current "alias" check code (same buck, bigger bang).

A little stunt has been provided to try to catch places where the
wrong type is being used (dev_t vs udev_t), if you see something
not working, #undef DEVT_FASCIST in kern/kern_conf.c and see if
it makes a difference. If it does, please try to track it down
(many hands make light work) or at least try to reproduce it
as simply as possible, and describe how to do that.

Without DEVT_FASCIST I believe this patch is a no-op.

Stylistic/posixoid comments about the userland view of the <sys/*.h>
files welcome now, from userland they now contain the end result.

Next planned step: make all dev_t's refer to the same devsw[] which
means convert BLK's to CHR's at the perimeter of the vnodes and
other places where they enter the game (bootdev, mknod, sysctl).
1999-05-11 19:55:07 +00:00
{
	struct cdev *si2;
	dev_t udev;

	mtx_assert(&devmtx, MA_OWNED);
	udev = y;
	if (csw->d_flags & D_NEEDMINOR) {
		/* We may want to return an existing device */
		LIST_FOREACH(si2, &csw->d_devs, si_list) {
			if (si2->si_drv0 == udev) {
				dev_free_devlocked(si);
				return (si2);
			}
		}
	}
	si->si_drv0 = udev;
Revamp DEVFS internals pretty severely [1].

Give DEVFS a proper inode called struct cdev_priv. It is important
to keep in mind that this "inode" is shared between all DEVFS
mountpoints, therefore it is protected by the global device mutex.

Link the cdev_priv's into a list, protected by the global device
mutex. Keep track of each cdev_priv's state with a flag bit and
of references from mountpoints with a dedicated usecount.

Reap the benefits of much improved kernel memory allocator and the
generally better defined device driver APIs to get rid of the tables
of pointers + serial numbers, their overflow tables, the atomics
to muck about in them and all the trouble that resulted in.

This makes RAM the only limit on how many devices we can have.

The cdev_priv is actually a super struct containing the normal cdev
as the "public" part, and therefore allocation and freeing has moved
to devfs_devs.c from kern_conf.c.

The overall responsibility is (to be) split such that kern/kern_conf.c
is the stuff that deals with drivers and struct cdev and fs/devfs
handles filesystems and struct cdev_priv and their private liaison
exposed only in devfs_int.h.

Move the inode number from cdev to cdev_priv and allocate inode
numbers properly with unr. Local dirents in the mountpoints
(directories, symlinks) allocate inodes from the same pool to
guarantee against overlaps.

Various other fields are going to migrate from cdev to cdev_priv
in the future in order to hide them. A few fields may migrate
from devfs_dirent to cdev_priv as well.

Protect the DEVFS mountpoint with an sx lock instead of lockmgr,
this lock also protects the directory tree of the mountpoint.

Give each mountpoint a unique integer index, allocated with unr.
Use it into an array of devfs_dirent pointers in each cdev_priv.
Initially the array points to a single element also inside cdev_priv,
but as more devfs instances are mounted, the array is extended with
malloc(9) as necessary when the filesystem populates its directory
tree.

Retire the cdev alias lists, the cdev_priv now knows about all the
relevant devfs_dirents (and their vnodes) and devfs_revoke() will
pick them up from there. We still spelunk into other mountpoints
and fondle their data without 100% good locking. It may make better
sense to vector the revoke event into the tty code and there do a
destroy_dev/make_dev on the tty's devices, but that's for further
study.

Lots of shuffling of stuff and churn of bits for no good reason[2].

XXX: There is still nothing preventing the dev_clone EVENTHANDLER
from being invoked at the same time in two devfs mountpoints. It
is not obvious what the best course of action is here.

XXX: comment out an if statement that lost its body, until I can
find out what should go there so it doesn't do damage in the meantime.

XXX: Leave in a few extra malloc types and KASSERTS to help track
down any remaining issues.

Much testing provided by: Kris
Much confusion caused by (races in): md(4)

[1] You are not supposed to understand anything past this point.
[2] This line should simplify life for the peanut gallery.
2005-09-19 19:56:48 +00:00
	si->si_devsw = csw;
	LIST_INSERT_HEAD(&csw->d_devs, si, si_list);
	return (si);
}

/*
 * Undo prep_cdevsw(): restore the driver's original methods from the
 * saved copy (if any) and mark the cdevsw as uninitialized.
 */
static void
fini_cdevsw(struct cdevsw *devsw)
{
	struct cdevsw *gt;

	if (devsw->d_gianttrick != NULL) {
		gt = devsw->d_gianttrick;
		memcpy(devsw, gt, sizeof *devsw);
		cdevsw_free_devlocked(gt);
		devsw->d_gianttrick = NULL;
	}
	devsw->d_flags &= ~D_INIT;
}

static void
prep_cdevsw(struct cdevsw *devsw)
{
	struct cdevsw *dsw2;

	mtx_assert(&devmtx, MA_OWNED);
	if (devsw->d_flags & D_INIT)
		return;
	if (devsw->d_flags & D_NEEDGIANT) {
		/*
		 * malloc(M_WAITOK) may sleep, so drop devmtx around it and
		 * re-check D_INIT below in case another thread got here first.
		 */
		dev_unlock();
		dsw2 = malloc(sizeof *dsw2, M_DEVT, M_WAITOK);
		dev_lock();
	} else
		dsw2 = NULL;
	if (devsw->d_flags & D_INIT) {
		if (dsw2 != NULL)
			cdevsw_free_devlocked(dsw2);
		return;
	}

	if (devsw->d_version != D_VERSION_01) {
		printf(
		    "WARNING: Device driver \"%s\" has wrong version %s\n",
		    devsw->d_name == NULL ? "???" : devsw->d_name,
		    "and is disabled. Recompile KLD module.");
		devsw->d_open = dead_open;
		devsw->d_close = dead_close;
		devsw->d_read = dead_read;
		devsw->d_write = dead_write;
		devsw->d_ioctl = dead_ioctl;
		devsw->d_poll = dead_poll;
		devsw->d_mmap = dead_mmap;
		devsw->d_strategy = dead_strategy;
		devsw->d_dump = dead_dump;
		devsw->d_kqfilter = dead_kqfilter;
	}

	/* Save the driver's own methods before they are redirected below. */
	if (devsw->d_flags & D_NEEDGIANT) {
		if (devsw->d_gianttrick == NULL) {
			memcpy(dsw2, devsw, sizeof *dsw2);
			devsw->d_gianttrick = dsw2;
			dsw2 = NULL;
		}
	}

	/*
	 * Give unset methods a default; route methods the driver did set
	 * through the Giant wrappers when D_NEEDGIANT is set.
	 */
#define FIXUP(member, noop, giant)				\
	do {							\
		if (devsw->member == NULL) {			\
			devsw->member = noop;			\
		} else if (devsw->d_flags & D_NEEDGIANT)	\
			devsw->member = giant;			\
		}						\
	while (0)

	FIXUP(d_open, null_open, giant_open);
	FIXUP(d_fdopen, NULL, giant_fdopen);
	FIXUP(d_close, null_close, giant_close);
	FIXUP(d_read, no_read, giant_read);
	FIXUP(d_write, no_write, giant_write);
	FIXUP(d_ioctl, no_ioctl, giant_ioctl);
	FIXUP(d_poll, no_poll, giant_poll);
	FIXUP(d_mmap, no_mmap, giant_mmap);
	FIXUP(d_strategy, no_strategy, giant_strategy);
	FIXUP(d_kqfilter, no_kqfilter, giant_kqfilter);

	if (devsw->d_dump == NULL)
		devsw->d_dump = no_dump;

	LIST_INIT(&devsw->d_devs);

	devsw->d_flags |= D_INIT;

	if (dsw2 != NULL)
		cdevsw_free_devlocked(dsw2);
}
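
The FIXUP() macro in prep_cdevsw() applies one rule to every method slot: an unset slot gets a default, and a set slot is redirected to a locking wrapper when the driver asks for it. The following userspace sketch shows that rule on a two-slot table. All names here (struct ops, needs_lock, prep_ops() and the stub functions) are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-method table, standing in for struct cdevsw. */
struct ops {
	int (*open)(void);
	int (*read)(void);
	int needs_lock;		/* plays the role of D_NEEDGIANT */
};

/* Stub methods; the return values only make them distinguishable. */
static int default_open(void) { return (0); }
static int default_read(void) { return (-1); }
static int locked_open(void) { return (100); }
static int locked_read(void) { return (101); }
static int driver_read(void) { return (7); }

static void
prep_ops(struct ops *o)
{
	/* Unset slot: install the default.  Set slot: wrap it if asked to. */
#define FIXUP(member, noop, locked)				\
	do {							\
		if (o->member == NULL)				\
			o->member = noop;			\
		else if (o->needs_lock)				\
			o->member = locked;			\
	} while (0)

	FIXUP(open, default_open, locked_open);
	FIXUP(read, default_read, locked_read);
#undef FIXUP
}
```

Note that an unset slot receives the plain default even when needs_lock is set, which mirrors the order of the tests in the kernel macro.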

struct cdev *
make_dev_credv(int flags, struct cdevsw *devsw, int unit,
    struct ucred *cr, uid_t uid,
When devfs cloning takes place, provide access to the credential of the
process that caused the clone event to take place for the device driver
creating the device. This allows cloned device drivers to adapt the
device node based on security aspects of the process, such as the uid,
gid, and MAC label.

- Add a cred reference to struct cdev, so that when a device node is
  instantiated as a vnode, the cloning credential can be exposed to
  MAC.
- Add make_dev_cred(), a version of make_dev() that additionally
  accepts the credential to stick in the struct cdev. Implement it and
  make_dev() in terms of a back-end make_dev_credv().
- Add a new event handler, dev_clone_cred, which can be registered to
  receive the credential instead of dev_clone, if desired.
- Modify the MAC entry point mac_create_devfs_device() to accept an
  optional credential pointer (may be NULL), so that MAC policies can
  inspect and act on the label or other elements of the credential
  when initializing the skeleton device protections.
- Modify tty_pty.c to register dev_clone_cred and invoke make_dev_cred(),
  so that the pty clone credential is exposed to the MAC Framework.

While currently primarily focussed on MAC policies, this change is also
a prerequisite for changes to allow ptys to be instantiated with the UID
of the process looking up the pty. This requires further changes to the
pty driver -- in particular, to immediately recycle pty nodes on last
close so that the credential-related state can be recreated on next
lookup.

Submitted by: Andrew Reisse <andrew.reisse@sparta.com>
Obtained from: TrustedBSD Project
Sponsored by: SPAWAR, SPARTA
MFC after: 1 week
MFC note: Merge to 6.x, but not 5.x for ABI reasons
2005-07-14 10:22:09 +00:00
    gid_t gid, int mode, const char *fmt, va_list ap)
{
	struct cdev *dev;
	int i;

	dev = devfs_alloc();
	dev_lock();
	prep_cdevsw(devsw);
	dev = newdev(devsw, unit, dev);
	if (flags & MAKEDEV_REF)
		dev_refl(dev);
	if (dev->si_flags & SI_CHEAPCLONE &&
	    dev->si_flags & SI_NAMED) {
		/*
		 * This is allowed as it removes races and generally
		 * simplifies cloning devices.
		 * XXX: still ??
		 */
		dev_unlock_and_free();
		return (dev);
	}
	KASSERT(!(dev->si_flags & SI_NAMED),
	    ("make_dev() by driver %s on pre-existing device (min=%x, name=%s)",
	    devsw->d_name, dev2unit(dev), devtoname(dev)));

	i = vsnrprintf(dev->__si_namebuf, sizeof dev->__si_namebuf, 32, fmt, ap);
	if (i > (sizeof dev->__si_namebuf - 1)) {
		printf("WARNING: Device name truncated! (%s)\n",
		    dev->__si_namebuf);
	}

	dev->si_flags |= SI_NAMED;
	if (cr != NULL)
		dev->si_cred = crhold(cr);
	else
		dev->si_cred = NULL;
	dev->si_uid = uid;
	dev->si_gid = gid;
	dev->si_mode = mode;

	devfs_create(dev);
	clean_unrhdrl(devfs_inos);
	dev_unlock_and_free();

	notify_create(dev);

	return (dev);
}

struct cdev *
make_dev(struct cdevsw *devsw, int unit, uid_t uid, gid_t gid, int mode,
    const char *fmt, ...)
{
	struct cdev *dev;
	va_list ap;

	va_start(ap, fmt);
	dev = make_dev_credv(0, devsw, unit, NULL, uid, gid, mode, fmt, ap);
	va_end(ap);
	return (dev);
}
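
make_dev() is a thin variadic front end: it captures its arguments in a va_list and hands them to the make_dev_credv() back end, which does the formatting. A userspace sketch of that split follows. It is an illustration only: set_name() and set_namev() are hypothetical, and standard vsnprintf(3) is used in place of the kernel's vsnrprintf().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Back end: formats into buf from a va_list.
 * Returns 1 if the name did not fit (or formatting failed), else 0.
 */
static int
set_namev(char *buf, size_t size, const char *fmt, va_list ap)
{
	int n;

	n = vsnprintf(buf, size, fmt, ap);
	return (n < 0 || (size_t)n > size - 1);
}

/* Front end: collects the variable arguments and calls the back end. */
static int
set_name(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int truncated;

	va_start(ap, fmt);
	truncated = set_namev(buf, size, fmt, ap);
	va_end(ap);
	return (truncated);
}
```

Keeping the real work in the va_list version lets several front ends with different fixed parameters share it, which is how make_dev(), make_dev_cred() and make_dev_credf() relate to make_dev_credv() in this file.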

struct cdev *
make_dev_cred(struct cdevsw *devsw, int unit, struct ucred *cr, uid_t uid,
|
|
|
gid_t gid, int mode, const char *fmt, ...)
|
|
|
|
{
|
|
|
|
struct cdev *dev;
|
|
|
|
va_list ap;
|
|
|
|
|
|
|
|
va_start(ap, fmt);
|
2008-09-26 14:31:24 +00:00
|
|
|
dev = make_dev_credv(0, devsw, unit, cr, uid, gid, mode, fmt, ap);
|
2007-07-03 17:42:37 +00:00
|
|
|
va_end(ap);
|
|
|
|
|
|
|
|
return (dev);
|
|
|
|
}
|
|
|
|
|
|
|
|
struct cdev *
|
2008-09-26 14:31:24 +00:00
|
|
|
make_dev_credf(int flags, struct cdevsw *devsw, int unit,
|
2007-07-03 17:42:37 +00:00
|
|
|
struct ucred *cr, uid_t uid,
|
|
|
|
gid_t gid, int mode, const char *fmt, ...)
|
|
|
|
{
|
|
|
|
struct cdev *dev;
|
|
|
|
va_list ap;
|
|
|
|
|
|
|
|
va_start(ap, fmt);
|
2008-09-26 14:31:24 +00:00
|
|
|
dev = make_dev_credv(flags, devsw, unit, cr, uid, gid, mode,
|
2007-07-03 17:42:37 +00:00
|
|
|
fmt, ap);
|
2005-07-14 10:22:09 +00:00
|
|
|
va_end(ap);
|
|
|
|
|
|
|
|
return (dev);
|
|
|
|
}
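make_dev_cred() and make_dev_credf() above differ only in the flags value they pass to make_dev_credv(). The following is a minimal userland sketch of that pattern, in which variadic front ends forward to one va_list back end. The struct demo_dev type and the demo_* names are hypothetical stand-ins for illustration, not the kernel's types or API.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a device node; not the kernel's struct cdev. */
struct demo_dev {
	int	flags;
	int	unit;
	char	name[32];
};

/* Back end: takes a va_list so that every front end can share it. */
static int
demo_make_devv(struct demo_dev *dev, int flags, int unit, const char *fmt,
    va_list ap)
{
	int n;

	assert(dev != NULL && fmt != NULL);
	memset(dev, 0, sizeof(*dev));
	dev->flags = flags;
	dev->unit = unit;
	n = vsnprintf(dev->name, sizeof(dev->name), fmt, ap);
	/* Report a truncated or failed format instead of accepting it. */
	return (n >= 0 && (size_t)n < sizeof(dev->name) ? 0 : -1);
}

/* Front end without flags: always passes 0 to the back end. */
int
demo_make_dev(struct demo_dev *dev, int unit, const char *fmt, ...)
{
	va_list ap;
	int error;

	va_start(ap, fmt);
	error = demo_make_devv(dev, 0, unit, fmt, ap);
	va_end(ap);
	return (error);
}

/* Front end with flags: same body, but the caller chooses the flags. */
int
demo_make_devf(struct demo_dev *dev, int flags, int unit, const char *fmt, ...)
{
	va_list ap;
	int error;

	va_start(ap, fmt);
	error = demo_make_devv(dev, flags, unit, fmt, ap);
	va_end(ap);
	return (error);
}
```

Keeping the formatting logic in the va_list back end means a new front end only needs the va_start()/va_end() pair and one call.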
|
|
|
|
|
Revamp DEVFS internals pretty severely [1].
Give DEVFS a proper inode called struct cdev_priv. It is important
to keep in mind that this "inode" is shared between all DEVFS
mountpoints, therefore it is protected by the global device mutex.
Link the cdev_priv's into a list, protected by the global device
mutex. Keep track of each cdev_priv's state with a flag bit and
of references from mountpoints with a dedicated usecount.
Reap the benefits of much improved kernel memory allocator and the
generally better defined device driver APIs to get rid of the tables
of pointers + serial numbers, their overflow tables, the atomics
to muck about in them and all the trouble that resulted in.
This makes RAM the only limit on how many devices we can have.
The cdev_priv is actually a super struct containing the normal cdev
as the "public" part, and therefore allocation and freeing has moved
to devfs_devs.c from kern_conf.c.
The overall responsibility is (to be) split such that kern/kern_conf.c
is the stuff that deals with drivers and struct cdev and fs/devfs
handles filesystems and struct cdev_priv and their private liaison
exposed only in devfs_int.h.
Move the inode number from cdev to cdev_priv and allocate inode
numbers properly with unr. Local dirents in the mountpoints
(directories, symlinks) allocate inodes from the same pool to
guarantee against overlaps.
Various other fields are going to migrate from cdev to cdev_priv
in the future in order to hide them. A few fields may migrate
from devfs_dirent to cdev_priv as well.
Protect the DEVFS mountpoint with an sx lock instead of lockmgr,
this lock also protects the directory tree of the mountpoint.
Give each mountpoint a unique integer index, allocated with unr.
Use it into an array of devfs_dirent pointers in each cdev_priv.
Initially the array points to a single element also inside cdev_priv,
but as more devfs instances are mounted, the array is extended with
malloc(9) as necessary when the filesystem populates its directory
tree.
Retire the cdev alias lists, the cdev_priv now know about all the
relevant devfs_dirents (and their vnodes) and devfs_revoke() will
pick them up from there. We still spelunk into other mountpoints
and fondle their data without 100% good locking. It may make better
sense to vector the revoke event into the tty code and there do a
destroy_dev/make_dev on the tty's devices, but that's for further
study.
Lots of shuffling of stuff and churn of bits for no good reason[2].
XXX: There is still nothing preventing the dev_clone EVENTHANDLER
from being invoked at the same time in two devfs mountpoints. It
is not obvious what the best course of action is here.
XXX: comment out an if statement that lost its body, until I can
find out what should go there so it doesn't do damage in the meantime.
XXX: Leave in a few extra malloc types and KASSERTS to help track
down any remaining issues.
Much testing provided by: Kris
Much confusion caused by (races in): md(4)
[1] You are not supposed to understand anything past this point.
[2] This line should simplify life for the peanut gallery.
2005-09-19 19:56:48 +00:00
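The commit message above describes cdev_priv as a super struct that contains the normal cdev as its "public" part. A minimal userland sketch of that layout follows, assuming a container-of style conversion; the demo_* names and the fields are hypothetical and do not reproduce the real struct cdev_priv or cdev2priv().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Public part, the only thing drivers get to see (hypothetical fields). */
struct demo_cdev {
	int	unit;
};

/* Private wrapper: the public struct is embedded, not pointed to. */
struct demo_cdev_priv {
	unsigned int	 inode;	/* private to the filesystem side */
	unsigned int	 flags;
	struct demo_cdev pub;	/* the part handed out to drivers */
};

/* Allocate the whole wrapper but return only the public part. */
struct demo_cdev *
demo_alloc(unsigned int inode)
{
	struct demo_cdev_priv *cdp;

	cdp = calloc(1, sizeof(*cdp));
	if (cdp == NULL)
		return (NULL);
	cdp->inode = inode;
	return (&cdp->pub);
}

/* Recover the wrapper from a public pointer (container-of idiom). */
struct demo_cdev_priv *
demo_cdev2priv(struct demo_cdev *dev)
{

	assert(dev != NULL);
	return ((struct demo_cdev_priv *)(void *)((char *)dev -
	    offsetof(struct demo_cdev_priv, pub)));
}

/* Free through the wrapper, since that is what was allocated. */
void
demo_free(struct demo_cdev *dev)
{

	if (dev != NULL)
		free(demo_cdev2priv(dev));
}
```

Because the public struct is embedded, one allocation serves both views. This is also why allocation and freeing have to live with whoever knows the wrapper type.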
|
|
|
static void
|
|
|
|
dev_dependsl(struct cdev *pdev, struct cdev *cdev)
|
2001-05-26 08:27:58 +00:00
|
|
|
{
|
|
|
|
|
|
|
|
cdev->si_parent = pdev;
|
|
|
|
cdev->si_flags |= SI_CHILD;
|
|
|
|
LIST_INSERT_HEAD(&pdev->si_children, cdev, si_siblings);
|
2005-09-19 19:56:48 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
void
|
|
|
|
dev_depends(struct cdev *pdev, struct cdev *cdev)
|
|
|
|
{
|
|
|
|
|
|
|
|
dev_lock();
|
|
|
|
dev_dependsl(pdev, cdev);
|
2004-09-23 07:17:41 +00:00
|
|
|
dev_unlock();
|
2001-05-26 08:27:58 +00:00
|
|
|
}
|
|
|
|
|
2004-06-16 09:47:26 +00:00
|
|
|
struct cdev *
|
|
|
|
make_dev_alias(struct cdev *pdev, const char *fmt, ...)
|
2000-08-20 21:34:39 +00:00
|
|
|
{
|
2004-06-16 09:47:26 +00:00
|
|
|
struct cdev *dev;
|
2000-08-20 21:34:39 +00:00
|
|
|
va_list ap;
|
|
|
|
int i;
|
|
|
|
|
2008-07-11 11:22:19 +00:00
|
|
|
KASSERT(pdev != NULL, ("NULL pdev"));
|
2005-09-19 19:56:48 +00:00
|
|
|
dev = devfs_alloc();
|
2004-09-23 07:17:41 +00:00
|
|
|
dev_lock();
|
2000-08-20 21:34:39 +00:00
|
|
|
dev->si_flags |= SI_ALIAS;
|
2000-09-11 17:15:33 +00:00
|
|
|
dev->si_flags |= SI_NAMED;
|
2000-08-20 21:34:39 +00:00
|
|
|
va_start(ap, fmt);
|
2003-02-04 11:04:26 +00:00
|
|
|
i = vsnrprintf(dev->__si_namebuf, sizeof dev->__si_namebuf, 32, fmt, ap);
|
|
|
|
if (i > (sizeof dev->__si_namebuf - 1)) {
|
2004-08-30 01:10:20 +00:00
|
|
|
printf("WARNING: Device name truncated! (%s)\n",
|
2003-02-04 11:04:26 +00:00
|
|
|
dev->__si_namebuf);
|
|
|
|
}
|
2000-08-20 21:34:39 +00:00
|
|
|
va_end(ap);
|
1999-08-20 20:25:00 +00:00
|
|
|
|
2003-03-02 13:35:30 +00:00
|
|
|
devfs_create(dev);
|
2008-07-11 11:22:19 +00:00
|
|
|
dev_dependsl(pdev, dev);
|
2007-07-04 06:56:58 +00:00
|
|
|
clean_unrhdrl(devfs_inos);
|
2004-09-23 07:17:41 +00:00
|
|
|
dev_unlock();
|
2008-05-14 14:29:54 +00:00
|
|
|
|
|
|
|
notify_create(dev);
|
|
|
|
|
1999-08-08 18:43:05 +00:00
|
|
|
return (dev);
|
|
|
|
}
|
|
|
|
|
2004-02-21 21:57:26 +00:00
|
|
|
static void
|
2005-02-22 15:51:07 +00:00
|
|
|
destroy_devl(struct cdev *dev)
|
1999-08-29 09:09:12 +00:00
|
|
|
{
|
2004-09-27 06:18:25 +00:00
|
|
|
struct cdevsw *csw;
|
2008-05-21 09:31:44 +00:00
|
|
|
struct cdev_privdata *p, *p1;
|
2004-09-27 06:18:25 +00:00
|
|
|
|
2005-02-22 15:51:07 +00:00
|
|
|
mtx_assert(&devmtx, MA_OWNED);
|
2004-09-27 06:18:25 +00:00
|
|
|
KASSERT(dev->si_flags & SI_NAMED,
|
2008-09-27 08:51:18 +00:00
|
|
|
("WARNING: Driver mistake: destroy_dev on %d\n", dev2unit(dev)));
|
2006-01-04 17:40:54 +00:00
|
|
|
|
2003-03-02 13:35:30 +00:00
|
|
|
devfs_destroy(dev);
|
2004-02-21 21:57:26 +00:00
|
|
|
|
|
|
|
/* Remove name marking */
|
2004-02-21 20:29:52 +00:00
|
|
|
dev->si_flags &= ~SI_NAMED;
|
|
|
|
|
2004-02-21 21:57:26 +00:00
|
|
|
/* If we are a child, remove us from the parent's list */
|
2001-05-26 08:27:58 +00:00
|
|
|
if (dev->si_flags & SI_CHILD) {
|
|
|
|
LIST_REMOVE(dev, si_siblings);
|
|
|
|
dev->si_flags &= ~SI_CHILD;
|
|
|
|
}
|
2004-02-21 21:57:26 +00:00
|
|
|
|
|
|
|
/* Kill our children */
|
2001-05-26 08:27:58 +00:00
|
|
|
while (!LIST_EMPTY(&dev->si_children))
|
2005-02-22 15:51:07 +00:00
|
|
|
destroy_devl(LIST_FIRST(&dev->si_children));
|
2004-02-21 21:57:26 +00:00
|
|
|
|
|
|
|
/* Remove from clone list */
|
2004-02-21 20:29:52 +00:00
|
|
|
if (dev->si_flags & SI_CLONELIST) {
|
|
|
|
LIST_REMOVE(dev, si_clone);
|
|
|
|
dev->si_flags &= ~SI_CLONELIST;
|
|
|
|
}
|
2004-02-21 21:57:26 +00:00
|
|
|
|
2006-10-13 20:49:24 +00:00
|
|
|
dev->si_refcount++; /* Avoid race with dev_rel() */
|
2004-09-27 06:18:25 +00:00
|
|
|
csw = dev->si_devsw;
|
2004-09-29 16:38:38 +00:00
|
|
|
dev->si_devsw = NULL; /* already NULL for SI_ALIAS */
|
|
|
|
while (csw != NULL && csw->d_purge != NULL && dev->si_threadcount) {
|
2004-09-27 06:18:25 +00:00
|
|
|
csw->d_purge(dev);
|
|
|
|
msleep(csw, &devmtx, PRIBIO, "devprg", hz/10);
|
2006-05-17 06:37:14 +00:00
|
|
|
if (dev->si_threadcount)
|
|
|
|
printf("Still %lu threads in %s\n",
|
|
|
|
dev->si_threadcount, devtoname(dev));
|
2004-09-27 06:18:25 +00:00
|
|
|
}
|
2006-10-13 20:49:24 +00:00
|
|
|
while (dev->si_threadcount != 0) {
|
|
|
|
/* Use unique dummy wait ident */
|
|
|
|
msleep(&csw, &devmtx, PRIBIO, "devdrn", hz / 10);
|
|
|
|
}
|
2004-09-27 06:18:25 +00:00
|
|
|
|
2008-05-21 09:31:44 +00:00
|
|
|
dev_unlock();
|
2008-05-14 14:29:54 +00:00
|
|
|
notify_destroy(dev);
|
2008-05-21 09:31:44 +00:00
|
|
|
mtx_lock(&cdevpriv_mtx);
|
2008-06-16 17:34:59 +00:00
|
|
|
LIST_FOREACH_SAFE(p, &cdev2priv(dev)->cdp_fdpriv, cdpd_list, p1) {
|
2008-05-21 09:31:44 +00:00
|
|
|
devfs_destroy_cdevpriv(p);
|
|
|
|
mtx_lock(&cdevpriv_mtx);
|
|
|
|
}
|
|
|
|
mtx_unlock(&cdevpriv_mtx);
|
|
|
|
dev_lock();
|
2008-05-14 14:29:54 +00:00
|
|
|
|
2004-09-27 06:18:25 +00:00
|
|
|
dev->si_drv1 = 0;
|
|
|
|
dev->si_drv2 = 0;
|
|
|
|
bzero(&dev->__si_u, sizeof(dev->__si_u));
|
|
|
|
|
2004-02-21 21:57:26 +00:00
|
|
|
if (!(dev->si_flags & SI_ALIAS)) {
|
|
|
|
/* Remove from cdevsw list */
|
|
|
|
LIST_REMOVE(dev, si_list);
|
|
|
|
|
2005-09-19 19:56:48 +00:00
|
|
|
/* If cdevsw has no more struct cdev *'s, clean it */
|
2007-07-03 17:42:37 +00:00
|
|
|
if (LIST_EMPTY(&csw->d_devs)) {
|
2004-09-27 06:34:30 +00:00
|
|
|
fini_cdevsw(csw);
|
2007-07-03 17:42:37 +00:00
|
|
|
wakeup(&csw->d_devs);
|
|
|
|
}
|
2004-02-21 21:57:26 +00:00
|
|
|
}
|
2000-09-11 17:15:33 +00:00
|
|
|
dev->si_flags &= ~SI_ALIAS;
|
2006-10-13 20:49:24 +00:00
|
|
|
dev->si_refcount--; /* Avoid race with dev_rel() */
|
2004-09-27 06:18:25 +00:00
|
|
|
|
2004-02-21 21:57:26 +00:00
|
|
|
if (dev->si_refcount > 0) {
|
|
|
|
LIST_INSERT_HEAD(&dead_cdevsw.d_devs, dev, si_list);
|
|
|
|
} else {
|
2007-06-19 13:19:23 +00:00
|
|
|
dev_free_devlocked(dev);
|
2004-02-21 21:57:26 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
void
|
2004-06-16 09:47:26 +00:00
|
|
|
destroy_dev(struct cdev *dev)
|
2004-02-21 21:57:26 +00:00
|
|
|
{
|
|
|
|
|
2008-11-27 16:47:25 +00:00
|
|
|
WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, NULL, "destroy_dev");
|
2004-09-23 07:17:41 +00:00
|
|
|
dev_lock();
|
2007-07-05 13:04:59 +00:00
|
|
|
destroy_devl(dev);
|
|
|
|
dev_unlock_and_free();
|
1999-08-29 09:09:12 +00:00
|
|
|
}
|
|
|
|
|
1999-09-13 12:29:32 +00:00
|
|
|
const char *
|
2004-06-16 09:47:26 +00:00
|
|
|
devtoname(struct cdev *dev)
|
1999-08-17 20:25:50 +00:00
|
|
|
{
|
1999-08-29 09:09:12 +00:00
|
|
|
char *p;
|
2004-09-24 06:29:23 +00:00
|
|
|
struct cdevsw *csw;
|
1999-09-13 12:29:32 +00:00
|
|
|
int mynor;
|
1999-08-29 09:09:12 +00:00
|
|
|
|
|
|
|
if (dev->si_name[0] == '#' || dev->si_name[0] == '\0') {
|
|
|
|
p = dev->si_name;
|
2004-09-24 06:29:23 +00:00
|
|
|
csw = dev_refthread(dev);
|
|
|
|
if (csw != NULL) {
|
|
|
|
sprintf(p, "(%s)", csw->d_name);
|
|
|
|
dev_relthread(dev);
|
|
|
|
}
|
1999-08-29 09:09:12 +00:00
|
|
|
p += strlen(p);
|
2008-09-27 08:51:18 +00:00
|
|
|
mynor = dev2unit(dev);
|
1999-09-13 12:29:32 +00:00
|
|
|
if (mynor < 0 || mynor > 255)
|
2004-09-24 06:29:23 +00:00
|
|
|
sprintf(p, "/%#x", (u_int)mynor);
|
1999-09-13 12:29:32 +00:00
|
|
|
else
|
2004-09-24 06:29:23 +00:00
|
|
|
sprintf(p, "/%d", mynor);
|
1999-08-29 09:09:12 +00:00
|
|
|
}
|
1999-08-17 20:25:50 +00:00
|
|
|
return (dev->si_name);
|
|
|
|
}
|
2000-09-02 19:17:34 +00:00
|
|
|
|
|
|
|
int
|
2002-03-10 10:50:05 +00:00
|
|
|
dev_stdclone(char *name, char **namep, const char *stem, int *unit)
|
2000-09-02 19:17:34 +00:00
|
|
|
{
|
|
|
|
int u, i;
|
|
|
|
|
|
|
|
i = strlen(stem);
|
2001-04-14 21:33:58 +00:00
|
|
|
if (bcmp(stem, name, i) != 0)
|
|
|
|
return (0);
|
2000-09-02 19:17:34 +00:00
|
|
|
if (!isdigit(name[i]))
|
|
|
|
return (0);
|
|
|
|
u = 0;
|
2001-11-16 17:05:07 +00:00
|
|
|
if (name[i] == '0' && isdigit(name[i+1]))
|
|
|
|
return (0);
|
2000-09-02 19:17:34 +00:00
|
|
|
while (isdigit(name[i])) {
|
|
|
|
u *= 10;
|
|
|
|
u += name[i++] - '0';
|
|
|
|
}
|
2002-10-05 17:10:28 +00:00
|
|
|
if (u > 0xffffff)
|
|
|
|
return (0);
|
2000-09-02 19:17:34 +00:00
|
|
|
*unit = u;
|
|
|
|
if (namep)
|
|
|
|
*namep = &name[i];
|
|
|
|
if (name[i])
|
|
|
|
return (2);
|
|
|
|
return (1);
|
|
|
|
}
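The return values of dev_stdclone() above are easy to misread: 0 means no match, 1 means the name is exactly stem plus unit, and 2 means extra characters follow the unit. The following is a userland sketch of the same parsing rules for experimentation. The name demo_stdclone is hypothetical, and it deviates from the kernel code in two labeled ways: it uses strncmp() instead of bcmp(), and it checks the range inside the loop.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Userland sketch of the parsing rules: returns 0 for no match, 1 when
 * the name is exactly stem + unit, and 2 when characters follow the unit.
 */
int
demo_stdclone(const char *name, const char **namep, const char *stem,
    int *unit)
{
	size_t i;
	int u;

	assert(name != NULL && stem != NULL && unit != NULL);
	i = strlen(stem);
	/* Assumption: strncmp() so a short name is never read past its NUL. */
	if (strncmp(stem, name, i) != 0)
		return (0);
	if (!isdigit((unsigned char)name[i]))
		return (0);
	/* Reject leading zeros such as "tap01". */
	if (name[i] == '0' && isdigit((unsigned char)name[i + 1]))
		return (0);
	u = 0;
	while (isdigit((unsigned char)name[i])) {
		u = u * 10 + (name[i++] - '0');
		/* Assumption: check here so that u can never overflow. */
		if (u > 0xffffff)
			return (0);
	}
	*unit = u;
	if (namep != NULL)
		*namep = &name[i];
	return (name[i] != '\0' ? 2 : 1);
}
```

A caller would typically treat 1 as "create this unit" and use the namep result to inspect any suffix when 2 is returned.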
|
2000-09-09 11:39:59 +00:00
|
|
|
|
2004-02-21 20:29:52 +00:00
|
|
|
/*
|
|
|
|
* Helper functions for cloning device drivers.
|
|
|
|
*
|
|
|
|
* The objective here is to make it unnecessary for the device drivers to
|
|
|
|
* use rman or similar to manage their unit number space. Due to the way
|
|
|
|
* we do "on-demand" devices, using rman or other "private" methods
|
|
|
|
* will be very tricky to lock down properly once we lock down this file.
|
|
|
|
*
|
2004-06-22 20:22:24 +00:00
|
|
|
* Instead we give the drivers these routines which put the struct cdev *'s
|
|
|
|
* that are to be managed on their own list, and gives the driver the ability
|
2004-02-21 20:29:52 +00:00
|
|
|
* to ask for the first free unit number or a given specified unit number.
|
|
|
|
*
|
|
|
|
* In addition these routines support paired devices (pty, nmdm and similar)
|
|
|
|
* by respecting a number of "flag" bits in the minor number.
|
|
|
|
*
|
|
|
|
*/
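The comment above says the helpers let a driver ask for the first free unit number. A minimal sketch of that search follows, assuming the units in use are kept sorted in ascending order. It uses a plain array and the hypothetical name demo_lowest_free_unit; the real code walks a list of struct cdev and also handles the "extra" flag bits, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the lowest unit number not present in units[], which must be
 * sorted in ascending order without duplicates.
 */
int
demo_lowest_free_unit(const int *units, size_t n)
{
	size_t i;
	int low;

	assert(units != NULL || n == 0);
	low = 0;
	for (i = 0; i < n; i++) {
		if (units[i] == low)
			low++;		/* taken, try the next number */
		else if (units[i] > low)
			break;		/* gap found before this entry */
	}
	return (low);
}
```

Keeping the collection sorted is what lets a single pass find both the free number and the position where the new entry belongs.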
|
|
|
|
|
|
|
|
struct clonedevs {
|
|
|
|
LIST_HEAD(,cdev) head;
|
|
|
|
};
|
|
|
|
|
2004-03-11 12:58:55 +00:00
|
|
|
void
|
|
|
|
clone_setup(struct clonedevs **cdp)
|
|
|
|
{
|
|
|
|
|
|
|
|
*cdp = malloc(sizeof **cdp, M_DEVBUF, M_WAITOK | M_ZERO);
|
|
|
|
LIST_INIT(&(*cdp)->head);
|
|
|
|
}
|
|
|
|
|
2004-02-21 20:29:52 +00:00
|
|
|
int
|
2007-02-02 22:27:45 +00:00
|
|
|
clone_create(struct clonedevs **cdp, struct cdevsw *csw, int *up, struct cdev **dp, int extra)
|
2004-02-21 20:29:52 +00:00
|
|
|
{
|
|
|
|
struct clonedevs *cd;
|
2005-01-24 12:44:56 +00:00
|
|
|
struct cdev *dev, *ndev, *dl, *de;
|
2004-02-21 20:29:52 +00:00
|
|
|
int unit, low, u;
|
|
|
|
|
2004-03-11 12:58:55 +00:00
|
|
|
KASSERT(*cdp != NULL,
|
|
|
|
("clone_setup() not called in driver \"%s\"", csw->d_name));
|
2004-02-21 20:29:52 +00:00
|
|
|
KASSERT(!(extra & CLONE_UNITMASK),
|
2004-03-11 12:58:55 +00:00
|
|
|
("Illegal extra bits (0x%x) in clone_create", extra));
|
2004-02-21 20:29:52 +00:00
|
|
|
KASSERT(*up <= CLONE_UNITMASK,
|
2004-03-11 12:58:55 +00:00
|
|
|
("Too high unit (0x%x) in clone_create", *up));
|
2008-06-11 18:55:19 +00:00
|
|
|
KASSERT(csw->d_flags & D_NEEDMINOR,
|
|
|
|
("clone_create() on cdevsw without minor numbers"));
|
2004-02-21 20:29:52 +00:00
|
|
|
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Search the list for a lot of things in one go:
|
|
|
|
* A preexisting match is returned immediately.
|
|
|
|
* The lowest free unit number if we are passed -1, and the place
|
|
|
|
* in the list where we should insert that new element.
|
|
|
|
* The place to insert a specified unit number, if applicable
|
|
|
|
* the end of the list.
|
|
|
|
*/
|
|
|
|
unit = *up;
|
2005-09-19 19:56:48 +00:00
|
|
|
ndev = devfs_alloc();
|
2005-01-24 12:44:56 +00:00
|
|
|
dev_lock();
|
2008-03-17 13:17:10 +00:00
|
|
|
prep_cdevsw(csw);
|
2004-03-11 14:11:02 +00:00
|
|
|
low = extra;
|
2004-02-21 20:29:52 +00:00
|
|
|
de = dl = NULL;
|
2004-03-11 12:58:55 +00:00
|
|
|
cd = *cdp;
|
2004-02-21 20:29:52 +00:00
|
|
|
LIST_FOREACH(dev, &cd->head, si_clone) {
|
2005-01-24 12:44:56 +00:00
|
|
|
KASSERT(dev->si_flags & SI_CLONELIST,
|
|
|
|
("Dev %p(%s) should be on clonelist", dev, dev->si_name));
|
2004-02-21 20:29:52 +00:00
|
|
|
u = dev2unit(dev);
|
|
|
|
if (u == (unit | extra)) {
|
|
|
|
*dp = dev;
|
2005-01-24 12:44:56 +00:00
|
|
|
dev_unlock();
|
2007-06-19 13:19:23 +00:00
|
|
|
devfs_free(ndev);
|
2004-02-21 20:29:52 +00:00
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
if (unit == -1 && u == low) {
|
|
|
|
low++;
|
|
|
|
de = dev;
|
|
|
|
continue;
|
2005-10-01 19:21:03 +00:00
|
|
|
} else if (u < (unit | extra)) {
|
|
|
|
de = dev;
|
|
|
|
continue;
|
|
|
|
} else if (u > (unit | extra)) {
|
2004-02-21 20:29:52 +00:00
|
|
|
dl = dev;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
if (unit == -1)
|
2004-03-11 14:11:02 +00:00
|
|
|
unit = low & CLONE_UNITMASK;
|
2008-09-26 14:19:52 +00:00
|
|
|
dev = newdev(csw, unit | extra, ndev);
|
2005-01-24 12:44:56 +00:00
|
|
|
if (dev->si_flags & SI_CLONELIST) {
|
|
|
|
printf("dev %p (%s) is on clonelist\n", dev, dev->si_name);
|
2005-10-01 19:21:03 +00:00
|
|
|
printf("unit=%d, low=%d, extra=0x%x\n", unit, low, extra);
|
2005-01-24 12:44:56 +00:00
|
|
|
LIST_FOREACH(dev, &cd->head, si_clone) {
|
|
|
|
printf("\t%p %s\n", dev, dev->si_name);
|
|
|
|
}
|
|
|
|
panic("foo");
|
|
|
|
}
|
2004-02-21 20:29:52 +00:00
|
|
|
KASSERT(!(dev->si_flags & SI_CLONELIST),
|
2005-01-24 12:44:56 +00:00
|
|
|
("Dev %p(%s) should not be on clonelist", dev, dev->si_name));
|
2004-02-21 20:29:52 +00:00
|
|
|
if (dl != NULL)
|
|
|
|
LIST_INSERT_BEFORE(dl, dev, si_clone);
|
|
|
|
else if (de != NULL)
|
|
|
|
LIST_INSERT_AFTER(de, dev, si_clone);
|
|
|
|
else
|
|
|
|
LIST_INSERT_HEAD(&cd->head, dev, si_clone);
|
|
|
|
dev->si_flags |= SI_CLONELIST;
|
|
|
|
*up = unit;
|
2007-06-19 13:19:23 +00:00
|
|
|
dev_unlock_and_free();
|
2004-02-21 20:29:52 +00:00
|
|
|
return (1);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Kill everything still on the list. The driver should already have
|
2004-06-16 09:47:26 +00:00
|
|
|
* disposed of any softc hung off the struct cdev *'s at this time.
|
2004-02-21 20:29:52 +00:00
|
|
|
*/
|
|
|
|
void
|
|
|
|
clone_cleanup(struct clonedevs **cdp)
|
|
|
|
{
|
2007-07-03 17:42:37 +00:00
|
|
|
struct cdev *dev;
|
|
|
|
struct cdev_priv *cp;
|
2004-02-21 20:29:52 +00:00
|
|
|
struct clonedevs *cd;
|
|
|
|
|
|
|
|
cd = *cdp;
|
|
|
|
if (cd == NULL)
|
|
|
|
return;
|
2005-01-24 12:44:56 +00:00
|
|
|
dev_lock();
|
2007-07-03 17:42:37 +00:00
|
|
|
while (!LIST_EMPTY(&cd->head)) {
|
|
|
|
dev = LIST_FIRST(&cd->head);
|
|
|
|
LIST_REMOVE(dev, si_clone);
|
2005-01-24 12:44:56 +00:00
|
|
|
KASSERT(dev->si_flags & SI_CLONELIST,
|
|
|
|
("Dev %p(%s) should be on clonelist", dev, dev->si_name));
|
2007-07-03 17:42:37 +00:00
|
|
|
dev->si_flags &= ~SI_CLONELIST;
|
2008-06-16 17:34:59 +00:00
|
|
|
cp = cdev2priv(dev);
|
2007-07-03 17:42:37 +00:00
|
|
|
if (!(cp->cdp_flags & CDP_SCHED_DTR)) {
|
|
|
|
cp->cdp_flags |= CDP_SCHED_DTR;
|
|
|
|
KASSERT(dev->si_flags & SI_NAMED,
|
|
|
|
("Driver has goofed in cloning underway, udev %x", dev->si_drv0));
|
|
|
|
destroy_devl(dev);
|
|
|
|
}
|
2004-02-21 20:29:52 +00:00
|
|
|
}
|
2008-03-17 13:17:10 +00:00
|
|
|
dev_unlock_and_free();
|
2004-02-21 20:29:52 +00:00
|
|
|
free(cd, M_DEVBUF);
|
|
|
|
*cdp = NULL;
|
|
|
|
}
|
2007-07-03 17:42:37 +00:00
|
|
|
|
|
|
|
static TAILQ_HEAD(, cdev_priv) dev_ddtr =
|
|
|
|
TAILQ_HEAD_INITIALIZER(dev_ddtr);
|
|
|
|
static struct task dev_dtr_task;
|
|
|
|
|
|
|
|
static void
|
|
|
|
destroy_dev_tq(void *ctx, int pending)
|
|
|
|
{
|
|
|
|
struct cdev_priv *cp;
|
|
|
|
struct cdev *dev;
|
|
|
|
void (*cb)(void *);
|
|
|
|
void *cb_arg;
|
|
|
|
|
|
|
|
dev_lock();
|
|
|
|
while (!TAILQ_EMPTY(&dev_ddtr)) {
|
|
|
|
cp = TAILQ_FIRST(&dev_ddtr);
|
|
|
|
dev = &cp->cdp_c;
|
|
|
|
KASSERT(cp->cdp_flags & CDP_SCHED_DTR,
|
|
|
|
("cdev %p in destroy_dev_tq without CDP_SCHED_DTR", cp));
|
|
|
|
TAILQ_REMOVE(&dev_ddtr, cp, cdp_dtr_list);
|
|
|
|
cb = cp->cdp_dtr_cb;
|
|
|
|
cb_arg = cp->cdp_dtr_cb_arg;
|
|
|
|
destroy_devl(dev);
|
2008-03-17 13:17:10 +00:00
|
|
|
dev_unlock_and_free();
|
2007-07-03 17:42:37 +00:00
|
|
|
dev_rel(dev);
|
|
|
|
if (cb != NULL)
|
|
|
|
cb(cb_arg);
|
|
|
|
dev_lock();
|
|
|
|
}
|
|
|
|
dev_unlock();
|
|
|
|
}
|
|
|
|
|
2007-07-03 18:18:30 +00:00
|
|
|
/*
|
|
|
|
* devmtx must be held on entry. The function drops devmtx before it
|
|
|
|
* returns.
|
|
|
|
*/
|
|
|
|
static int
|
|
|
|
destroy_dev_sched_cbl(struct cdev *dev, void (*cb)(void *), void *arg)
|
2007-07-03 17:42:37 +00:00
|
|
|
{
|
|
|
|
struct cdev_priv *cp;
|
2007-07-03 18:18:30 +00:00
|
|
|
|
|
|
|
mtx_assert(&devmtx, MA_OWNED);
|
2008-06-16 17:34:59 +00:00
|
|
|
cp = cdev2priv(dev);
|
2007-07-03 17:42:37 +00:00
|
|
|
if (cp->cdp_flags & CDP_SCHED_DTR) {
|
|
|
|
dev_unlock();
|
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
dev_refl(dev);
|
|
|
|
cp->cdp_flags |= CDP_SCHED_DTR;
|
|
|
|
cp->cdp_dtr_cb = cb;
|
|
|
|
cp->cdp_dtr_cb_arg = arg;
|
|
|
|
TAILQ_INSERT_TAIL(&dev_ddtr, cp, cdp_dtr_list);
|
|
|
|
dev_unlock();
|
|
|
|
taskqueue_enqueue(taskqueue_swi_giant, &dev_dtr_task);
|
|
|
|
return (1);
|
|
|
|
}
|
|
|
|
|
2007-07-03 18:18:30 +00:00
|
|
|
int
|
|
|
|
destroy_dev_sched_cb(struct cdev *dev, void (*cb)(void *), void *arg)
|
|
|
|
{
|
|
|
|
dev_lock();
|
|
|
|
return (destroy_dev_sched_cbl(dev, cb, arg));
|
|
|
|
}
|
|
|
|
|
2007-07-03 17:42:37 +00:00
|
|
|
int
|
|
|
|
destroy_dev_sched(struct cdev *dev)
|
|
|
|
{
|
|
|
|
return (destroy_dev_sched_cb(dev, NULL, NULL));
|
|
|
|
}
|
|
|
|
|
|
|
|
void
|
|
|
|
destroy_dev_drain(struct cdevsw *csw)
|
|
|
|
{
|
|
|
|
|
|
|
|
dev_lock();
|
|
|
|
while (!LIST_EMPTY(&csw->d_devs)) {
|
|
|
|
msleep(&csw->d_devs, &devmtx, PRIBIO, "devscd", hz/10);
|
|
|
|
}
|
|
|
|
dev_unlock();
|
|
|
|
}
|
|
|
|
|
|
|
|
void
|
|
|
|
drain_dev_clone_events(void)
|
|
|
|
{
|
|
|
|
|
|
|
|
/* Taking the exclusive lock waits for all current holders to drop it. */
sx_xlock(&clone_drain_lock);
|
|
|
|
sx_xunlock(&clone_drain_lock);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
|
|
|
devdtr_init(void *dummy __unused)
|
|
|
|
{
|
|
|
|
|
|
|
|
TASK_INIT(&dev_dtr_task, 0, destroy_dev_tq, NULL);
|
|
|
|
}
|
|
|
|
|
|
|
|
SYSINIT(devdtr, SI_SUB_DEVFS, SI_ORDER_SECOND, devdtr_init, NULL);
|