8285 Commits

Author SHA1 Message Date
Poul-Henning Kamp
800b42bde0 Prepare for the final onslaught on devices:
Move uid/gid/mode from cdev to cdevsw.

Add kind field to use for devd(8) later.

Bump both D_VERSION and __FreeBSD_version
2005-03-17 12:07:00 +00:00
Poul-Henning Kamp
572b4402d1 In strange circumstances we may end up being the last reference to a
session in tprintf().  SESSRELE() needs to properly dispose of the
session's mutex.

Add sessrele() which does the proper cleanup and have SESSRELE() call it.

Use SESSRELE also in pgdelete().

Found by:	Coverity (ID:526)
2005-03-17 08:44:41 +00:00
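
A minimal userland sketch of the release pattern described in 572b4402d1, assuming a plain reference count. The names `session_sketch` and `sessrele_sketch` are hypothetical, and a boolean stands in for the session mutex; this is not the kernel code.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a session; not the kernel's struct session. */
struct session_sketch {
	int	s_count;	/* reference count */
	bool	s_mtx_valid;	/* stands in for the session mutex */
};

/*
 * Drop one reference.  When the last reference goes away, tear down
 * the (stand-in) mutex before freeing the structure.
 * Returns true if the session was freed.
 */
static bool
sessrele_sketch(struct session_sketch *s)
{
	if (--s->s_count > 0)
		return (false);
	s->s_mtx_valid = false;	/* stands in for destroying the mutex */
	free(s);
	return (true);
}
```

Putting the cleanup in one function means every caller that might drop the last reference gets the same teardown.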
Poul-Henning Kamp
51f5ce0c8c Add two arguments to the vfs_hash() KPI so that filesystems which do
not have unique hashes (NFS) can also use it.
2005-03-16 11:20:51 +00:00
Poul-Henning Kamp
9068e77689 Fix a memory leak in case of failed root filesystem mount.
Spotted by:     Coverity via sam
2005-03-16 11:06:49 +00:00
John-Mark Gurney
2a77000b75 MFp4: print a more useful error when we don't have a /dev to mount devfs
on.
2005-03-16 08:04:39 +00:00
Poul-Henning Kamp
78bb3c21ed Add mnt_hashseed to struct mount and initialize it with PRNG bits, use
it to get better hashing in vfs_hash.

In case of an insert collision in vfs_hash_insert(), put the losing vnode
on a special list so that vfs_hash_remove() can just assume that it is on
a list.

Drop the VI_HASHED flag.
2005-03-16 07:35:06 +00:00
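
The effect of a per-mount seed described in 78bb3c21ed can be sketched as follows. The mixing step (addition followed by a mask), the table size, and the name `bucket_for` are assumptions for illustration only; the commit message does not say how the seed is combined with the hash.

```c
#include <stdint.h>

#define SKETCH_BUCKETS	1024u	/* assumed power-of-two table size */

/*
 * Hypothetical: combine a filesystem-supplied hash with a per-mount
 * seed and reduce it to a bucket index.
 */
static uint32_t
bucket_for(uint32_t fs_hash, uint32_t mnt_hashseed)
{
	return ((fs_hash + mnt_hashseed) & (SKETCH_BUCKETS - 1));
}
```

With distinct seeds, the same inode number on two mounts tends to land in different buckets of a shared table instead of piling up in one chain.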
Warner Losh
358fef538f Sometimes, when asked to return region A..C, we'd return A+N..C+N
instead of failing.

When looking for a region to allocate, we used to check to see if the
start address was < end.  In the case where A..B is allocated already,
and one wants to allocate A..C (B < C), then this test would
improperly fail (which means we'd examine that region as a possible
one), and we'd return the region B+1..C+(B-A+1) rather than NULL.
Since C+(B-A+1) is necessarily larger than C (end argument), this is
incorrect behavior for rman_reserve_resource_bound().

The fix is to exclude those regions where r->r_start + count - 1 > end
rather than r->r_start > end.  This bug has been in this code for a
very long time.  I believe that all other tests against end are
correctly done.

This is why sio0 generated a message about interrupts not being
enabled properly for the device.  When fdc had a bug that allocated
from 0x3f7 to 0x3fb, sio0 was then given 0x3fc-0x404 rather than the
0x3f8-0x3ff that it wanted.  Now when fdc has the same bug, sio0 fails
to allocate its ports, which is the proper behavior.  Since the probe
failed, we never saw the messed up resources reported.

I suspect that there are other places in the tree that have weird
looping or other odd workarounds to try to cope with the observed
weirdness this bug can introduce.  These workarounds should be located
and eliminated.

Also made a minor debug write fix to match the above test.

'nice' by: mdodd
Sponsored by: timing solutions (http://www.timing.com/)
2005-03-15 20:28:51 +00:00
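
The two range tests contrasted in 358fef538f can be compared in a small sketch. Only the comparisons themselves come from the commit message; the helper names `old_skip` and `new_skip` are hypothetical and do not appear in the rman code.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: the old test skipped a candidate region only
 * when its start was already past the requested end.
 */
static bool
old_skip(unsigned long r_start, unsigned long count, unsigned long end)
{
	(void)count;	/* the old test ignored the requested size */
	return (r_start > end);
}

/*
 * Hypothetical helper: the fixed test also skips a candidate whose
 * last unit (r_start + count - 1) would fall past the requested end.
 */
static bool
new_skip(unsigned long r_start, unsigned long count, unsigned long end)
{
	return (r_start + count - 1 > end);
}
```

For the sio0 case in the message (8 ports wanted, end 0x3ff, first free candidate starting at 0x3fc), `old_skip(0x3fc, 8, 0x3ff)` is false, so the candidate was considered, while `new_skip(0x3fc, 8, 0x3ff)` is true, so the allocation fails as intended.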
Warner Losh
a33ab77447 Fix a debugging printf. The order of start/end was inconsistent with
all the other start/end debugs, causing momentary confusion when the
output was examined.
2005-03-15 20:15:15 +00:00
Poul-Henning Kamp
45c26fa2b6 Improve the vfs_hash() API: vput() the unneeded vnode centrally to
avoid replicating the vput in all the filesystems.
2005-03-15 20:00:03 +00:00
Jeff Roberson
b172f6c5f9 - Now that there are no external users of vfree() make it static.
- Move VSHOULDBUSY, VSHOULDFREE, and VTRYRECYCLE into vfs_subr.c so
   no one else attempts to grow a dependency on them.
 - Now that objects with pages hold the vnode we don't have to do unlocked
   checks for the page count in the vm object in VSHOULDFREE.  These three
   macros could simply check for holdcnt state transitions to determine
   whether the vnode is on the free list already, but the extra safety
   the flag affords us is probably worth the minimal cost.
 - The leafonly sysctl and code have been dead for several years now,
   remove the sysctl and the code that employed it from vtryrecycle().
 - vtryrecycle() also no longer has to check the object's page count as
   the object holds the vnode until it reaches 0.

Sponsored by:	Isilon Systems, Inc.
2005-03-15 14:38:16 +00:00
Poul-Henning Kamp
7933351a28 Fix a debug message to print a usable device name rather than a useless
major+minor tuple.
2005-03-15 14:08:10 +00:00
Jeff Roberson
c178628d6e - Expose vholdl() so it may be used outside of vfs_subr.c
2005-03-15 13:43:10 +00:00
Poul-Henning Kamp
4ba679d6d0 Remove findcdev().
2005-03-15 12:58:08 +00:00
Poul-Henning Kamp
0a2e49f1f8 Rename cdev->si_udev to cdev->si_drv0 to reflect the new nature of
the field.
2005-03-15 11:33:28 +00:00
Jeff Roberson
f5f0da0a0e - transferlockers() requires the interlock to be SMP safe.
Sponsored by:	Isilon Systems, Inc.
2005-03-15 09:27:45 +00:00
Poul-Henning Kamp
e82ef95c11 Simplify the vfs_hash calling convention.
2005-03-15 08:07:07 +00:00
Poul-Henning Kamp
ee148e2606 Clean up accidentally included #if 0 section.
2005-03-14 10:25:09 +00:00
Poul-Henning Kamp
6c325a2a21 Currently (almost) all filesystems maintain a local inode hash table
to get from (mount + inode) to vnode.  These tables are mostly
copy&pasted from UFS, sized based on desiredvnodes and therefore
quite large (128K-512K).  Several filesystems are buggy enough that
they allocate the hash table even before they know if they will
ever be used or not.

Add "vfs_hash", a system wide hash table, which will replace all
the per-filesystem hash-tables.

The fields we add to struct vnode will more or less be saved in
the respective filesystems' inodes.

Having one central implementation will save code and will allow us
to justify the complexity of code to dynamically (re)size the hash
at a later point.
2005-03-14 10:01:29 +00:00
Jeff Roberson
8045557f2b - Increment the holdcnt once for each usecount reference. This allows us
to use only the holdcnt to determine whether a vnode may be recycled,
   simplifying the V* macros as well as vtryrecycle(), etc.

Sponsored by:	Isilon Systems, Inc.
2005-03-14 09:25:19 +00:00
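
The counting rule from 8045557f2b (each use reference also takes a hold reference) can be sketched with two counters. The structure and function names below are hypothetical simplifications, with no locking; they are not the vnode code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two vnode counters. */
struct vnode_sketch {
	int	usecount;	/* active users */
	int	holdcnt;	/* references that prevent recycling */
};

/* Taking a use reference also takes a hold reference. */
static void
sketch_ref(struct vnode_sketch *vp)
{
	vp->usecount++;
	vp->holdcnt++;
}

/* Dropping a use reference drops the matching hold reference. */
static void
sketch_rele(struct vnode_sketch *vp)
{
	vp->usecount--;
	vp->holdcnt--;
}

/* A bare hold, with no use reference. */
static void
sketch_hold(struct vnode_sketch *vp)
{
	vp->holdcnt++;
}

static void
sketch_drop(struct vnode_sketch *vp)
{
	vp->holdcnt--;
}

/* Because holdcnt >= usecount always holds, one test is enough. */
static bool
sketch_may_recycle(const struct vnode_sketch *vp)
{
	return (vp->holdcnt == 0);
}
```

Since every use reference is mirrored in the hold count, a recycle check under this scheme only needs to look at one counter.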
Jeff Roberson
159b454819 - We do not have to check the object's ref_count in VSHOULDFREE or
vtryrecycle().  All obj refs also ref the vnode.
 - Consistently use v_incr_usecount() to increment the usecount.  This will
   be more important later.

Sponsored by:	Isilon Systems, Inc.
2005-03-14 08:30:31 +00:00
Jeff Roberson
8f13a540ed - Slightly rearrange vrele() to move the common case in one indentation
level.

Sponsored by:	Isilon Systems, Inc.
2005-03-14 07:16:55 +00:00
Jeff Roberson
6fc16a838c - Rework vget() so we drop the usecount in two failure cases that were
missed by my last commit.

Sponsored by:	Isilon Systems, Inc.
2005-03-14 07:11:19 +00:00
Poul-Henning Kamp
93f6c81e25 Remove debugging printfs.
2005-03-14 06:51:29 +00:00
Jeff Roberson
0463dc9ef1 - Do a vn_start_write in vn_close, we may write if this is the last ref
on an unlinked file.  We can't know if this is the case until after we
   have the lock.
 - Lock the vnode in vn_close, many filesystems had code which was unsafe
   without the lock held, and holding it greatly simplifies vgone().
 - Adjust vn_lock() to check for the VI_DOOMED flag where appropriate.

Sponsored by:	Isilon Systems, Inc.
2005-03-13 11:56:28 +00:00
Jeff Roberson
6703c30bb5 - Remove vx_lock, vx_unlock, vx_wait, etc.
- Add a vn_start_write/vn_finished_write around vlrureclaim so we don't do
   writing ops without suspending.  This could suspend the vlruproc which
   should not be a problem under normal circumstances.
 - Manually implement VMIGHTFREE in vlrureclaim as this was the only instance
   where it was used.
 - Acquire a lock before calling vgone() as it now requires it.
 - Move the acquisition of the vnode interlock from vtryrecycle() to
   getnewvnode() so that if it fails we don't drop and reacquire the
   vnode_free_list_mtx.
 - Check for a usecount or holdcount at the end of vtryrecycle() in case
   someone grabbed a ref while we were recycling.  Abort the recycle, and
   on the final ref drop this vnode will be placed on the head of the free
   list.
 - Move the redundant VOP_INACTIVE protection code into the local
   vinactive() routine to avoid code bloat.
 - Keep the vnode lock held across calls to vgone() in several places.
 - vgonel() no longer uses XLOCK, instead callers must hold an exclusive
   vnode lock.  The VI_DOOMED flag is set to allow other threads to detect
   a vnode which is no longer valid.  This flag is set until the last
   reference is gone, and there are no chances for a new ref.  vgonel()
   holds this lock across the entire function, which greatly simplifies
   logic.
 - Only vfree() in one place in vgone() not three.
 - Adjust vget() to check the VI_DOOMED flag prior to waiting on the lock
   in the LK_NOWAIT case.  In other cases, check after we have slept and
   acquired an exclusive lock.  This will simulate the old vx_wait()
   behavior.

Sponsored by:	Isilon Systems, Inc.
2005-03-13 11:54:28 +00:00
Jeff Roberson
2b3183a8b7 - A lock is required before calling VOP_REVOKE. Our reference protects us
from accessing another vnode so a naked VOP_LOCK is sufficient.

Sponsored by:	Isilon Systems, Inc.
2005-03-13 11:47:04 +00:00
Jeff Roberson
9331fd135b - Don't VOP_UNLOCK prior to VOP_REVOKE. The lock is required now.
Sponsored by:	Isilon Systems, Inc.
2005-03-13 11:45:51 +00:00
Jeff Roberson
23f2513a4e - Don't drop the lock in the default inactive handler anymore, VOP_NULL
will do for vop_stdinactive now.

Sponsored by:	Isilon Systems, Inc.
2005-03-13 11:45:01 +00:00
Jeff Roberson
4e6746965e - CLOSE, REVOKE, INACTIVE, and RECLAIM are now L L L, that's a locked vnode
on enter, exit, error.  This allows for the removal of the XLOCK.

Sponsored by:	Isilon Systems, Inc.
2005-03-13 11:42:16 +00:00
Pawel Jakub Dawidek
cefcecbefd Function jailed() looks into ucred structure, so be sure ucred is not NULL.
Reviewed by:	rwatson
MFC after:	1 week
2005-03-12 14:31:04 +00:00
Pawel Jakub Dawidek
d079d0a0d2 Clean up a bit.
Reviewed by:	rwatson
MFC after:	1 week
2005-03-12 14:28:34 +00:00
Robert Watson
59f21d5ab1 Extend the coverage of the accept and socket mutexes in soisconnected()
so that the socket lock is held over the test-and-set removal of the
accept filter option during connect, and the two socket mutex regions
(transition to connected, perform accept filter) are combined.
2005-03-12 13:39:39 +00:00
Robert Watson
a59f81d263 Move the logic implementing retrieval of the SO_ACCEPTFILTER socket option
from uipc_socket.c to uipc_accf.c in do_getopt_accept_filter(), so that it
now matches do_setopt_accept_filter().  Slightly reformulate the logic to
match the optimistic allocation of storage for the argument in advance,
and slightly expand the coverage of the socket lock.
2005-03-12 12:57:18 +00:00
Robert Watson
92081a8344 Part two of post-SMPng cleanup of accept filter registration: perform all
allocation up front before grabbing the socket mutex and doing the
registration work.  The result is a lot cleaner.
2005-03-12 12:27:47 +00:00
Peter Wemm
f71692e9be Replace my previous change for 32 bit systems with hz > 169 with Bruce's
simpler one.
2005-03-12 00:13:45 +00:00
Peter Wemm
2afec87508 Make the tty vmin/vtime timeouts work for hz > 169 on 32 bit machines.
2005-03-12 00:10:23 +00:00
Robert Watson
64c238075f First step in simplifying accept filter socket option logic in the
post-SMPng world order.  Centralize handling of the socket option
clear case in do_setopt_accept_filter().
2005-03-11 21:37:45 +00:00
Robert Watson
56856fbfb4 Remove an additional commented out reference to a possible future sx
lock.
2005-03-11 19:16:02 +00:00
Robert Watson
2b37548a71 When setting up a socket in socreate(), there's no need to lock the
socket lock around knlist_init(), so don't.

Hard code the setting of the socket reference count to 1 rather than
using soref() to avoid asserting the socket lock, since we've not yet
exposed the socket to other threads.

This removes two mutex operations from each socket allocation.
2005-03-11 16:30:02 +00:00
Robert Watson
5fab68b19e Remove suggestive sx_init() comment in soalloc(). We will have something
like this at some point, but for now it clutters the source.
2005-03-11 16:26:33 +00:00
Robert Watson
35a196154f The SO_NOSIGPIPE socket option allows a user process to mark a socket
so that the socket does not generate SIGPIPE, only EPIPE, when a write
is attempted after socket shutdown.  When the option was introduced in
2002, this required the logic for determining whether SIGPIPE was
generated to be pushed down from dofilewrite() to the socket layer so
that the socket options could be considered.  However, the change in
2002 omitted modification to soo_write() required to add that logic,
resulting in SIGPIPE not being generated even without SO_NOSIGPIPE when
the socket was written to using write() or related generic system calls.

This change adds the EPIPE logic to soo_write(), generating a SIGPIPE
signal to the process associated with the passed uio in the event that
the SO_NOSIGPIPE option is not set.

Notes:

- There are upsides and downsides to placing this logic in the socket
  layer as opposed to the file descriptor layer.  This is really fd
  layer logic, but because we need so_options, we have a choice of
  layering violations and pick this one.

- SIGPIPE possibly should be delivered to the thread performing the
  write, not the process performing the write.

- uio->uio_td and the td argument to soo_write() might potentially
  differ; we use the thread in the uio argument.

- The "sigpipe" regression test in src/tools/regression/sockets/sigpipe
  tests for the bug.

Submitted by:		Mikko Tyolajarvi <mbsd at pacbell dot net>
Talked with:		glebius, alfred
PR:			78478
MFC after:		1 week
2005-03-11 15:06:16 +00:00
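
The decision described in 35a196154f (raise SIGPIPE on EPIPE unless SO_NOSIGPIPE is set) reduces to a small predicate. This is a sketch only: the function name is hypothetical and `SKETCH_SO_NOSIGPIPE` is a placeholder bit, not the real option value.

```c
#include <errno.h>
#include <stdbool.h>

/* Placeholder option bit for illustration; not the real SO_NOSIGPIPE value. */
#define SKETCH_SO_NOSIGPIPE	0x0800

/*
 * Hypothetical predicate: after a write returns `error`, should the
 * writer also be sent SIGPIPE?  Only when the error is EPIPE and the
 * socket has not opted out.
 */
static bool
should_raise_sigpipe(int error, int so_options)
{
	return (error == EPIPE &&
	    (so_options & SKETCH_SO_NOSIGPIPE) == 0);
}
```

In both cases the write still fails with EPIPE; the option only controls whether the signal accompanies it.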
John-Mark Gurney
74e620476c fix spelling of match in comment...
MFC after:	3 days
2005-03-10 21:23:06 +00:00
Poul-Henning Kamp
b43ab0e378 Try to fix the mess I made of devname, with the minimal subset of the
larger minor/major patch which was posted for testing.
2005-03-10 18:21:34 +00:00
Robert Watson
53358cc907 Document, via WITNESS, that the NFS server mutex falls ahead of the socket
buffer mutexes.
2005-03-09 21:38:53 +00:00
Dag-Erling Smørgrav
628b83cd08 My addled brains didn't realize that since vtp points into value, we
can't freeenv(value) before we're done inspecting vtp[0].

Tested by:	Anish Mistry <mistry.7@osu.edu>
2005-03-09 12:16:45 +00:00
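
The ordering problem fixed in 628b83cd08 is the general "pointer into a buffer outlives the buffer" hazard. A userland sketch, assuming malloc()/free() in place of the kernel environment routines; the function name and the `skip` parameter are hypothetical:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Copy `env` to the heap, point vtp somewhere inside the copy, and
 * return the character vtp refers to.  The read of vtp[0] must happen
 * before the buffer is released, because vtp aliases its storage.
 */
static char
first_char_at(const char *env, size_t skip)
{
	size_t len = strlen(env);
	char *value = malloc(len + 1);
	char *vtp;
	char c;

	if (value == NULL)
		return ('\0');
	memcpy(value, env, len + 1);
	vtp = value + (skip < len ? skip : len);	/* points into value */
	c = vtp[0];		/* inspect while the buffer is still live */
	free(value);		/* only now is it safe to release */
	return (c);
}
```

Swapping the last two statements before the return would read freed memory, which is the class of bug the commit describes.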
Stefan Farfeleder
b26244446b Fix typo in comment.
2005-03-09 11:50:55 +00:00
Sam Leffler
a4e714295a allow the destination of m_move_pkthdr to have external
storage (e.g. a cluster)

Glanced at by:	rwatson, silby
2005-03-08 17:52:01 +00:00
Giorgos Keramidas
0a11e99990 Remove redundant initialization that is repeated in the for() loop
right below it.

Approved by:	jhb
2005-03-08 16:57:20 +00:00
Maxim Sobolev
8d6e40c3f1 Add kernel-only flag MSG_NOSIGNAL to be used in emulation layers to suppress
SIGPIPE signal for the duration of the sendto-family syscalls. Use it to
replace previously added hack in Linux layer based on temporarily setting
SO_NOSIGPIPE flag.

Suggested by:	alfred
2005-03-08 16:11:41 +00:00
Poul-Henning Kamp
d9a54d5c23 Reengineer subr_unit
Add support for passing in a mutex.  If NULL is passed a global
	subr_unit mutex is used.

	Add alloc_unrl() which expects the mutex to be held.

	Allocating a unit will never sleep as it does not need to allocate
	memory.

	Cut possible range in half so we can use -1 to mean "out of numbers".

	Collapse first and last runs into the head by means of counters.
	This saves memory in the common case(s).
2005-03-08 10:40:48 +00:00
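
The "-1 means out of numbers" convention from d9a54d5c23 can be shown with a deliberately simple allocator. This sketch uses a fixed bitmap rather than the run-based structure the commit describes, and all names (`unit_pool`, `pool_alloc`, `pool_free`, `SKETCH_UNITS`) are hypothetical.

```c
#include <stdbool.h>

#define SKETCH_UNITS	4	/* tiny pool so exhaustion is easy to reach */

/* Hypothetical pool handing out units low .. low + SKETCH_UNITS - 1. */
struct unit_pool {
	int	low;
	bool	used[SKETCH_UNITS];
};

/*
 * Return the lowest free unit, or -1 when none is left.  No memory is
 * allocated here, so the call never needs to sleep.
 */
static int
pool_alloc(struct unit_pool *p)
{
	for (int i = 0; i < SKETCH_UNITS; i++) {
		if (!p->used[i]) {
			p->used[i] = true;
			return (p->low + i);
		}
	}
	return (-1);
}

/* Return a unit to the pool; the caller must pass a unit it owns. */
static void
pool_free(struct unit_pool *p, int unit)
{
	p->used[unit - p->low] = false;
}
```

Returning a signed int leaves only the non-negative half of the range for valid unit numbers, which is presumably the trade-off the message refers to.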