Commit Graph

8379 Commits

Author SHA1 Message Date
Jeff Roberson
ea9aa09dd1 - Remove an unused variable from relookup().
- Assert that REMOVE, CREATE, and RENAME callers have WANTPARENT
   or LOCKPARENT set.  You can't complete any of these operations without
   at least a reference to the parent.  Many filesystems check for this case
   even though it isn't possible in the current system.
2005-03-28 13:56:56 +00:00
Jeff Roberson
f7b404d88f - Remove an unused variable.
Sponsored by:	Isilon Systems, Inc.
2005-03-28 13:29:48 +00:00
Jeff Roberson
9d65cdf6ff - Rev 1.83 of kern_lock.c fixes the td_locks assert, reenable it here.
Sponsored by:	Isilon Systems, Inc.
2005-03-28 12:52:46 +00:00
Jeff Roberson
bf5c2a1940 - Don't bump the count twice in the LK_DRAIN case.
Sponsored by:	Isilon Systems, Inc.
2005-03-28 12:52:10 +00:00
Jeff Roberson
9dcc5da318 - Move code that should probably be an assert above the main body of
vrele so that we can decrease the indentation of the real work and
   make things slightly more clear.

Sponsored by:	Isilon Systems, Inc.
2005-03-28 11:18:47 +00:00
Jeff Roberson
ee5a0a2d7c - We no longer have to bother with PDIRUNLOCK, lookup() handles it for us.
Sponsored by:	Isilon Systems, Inc.
2005-03-28 09:26:17 +00:00
Jeff Roberson
d36f0a4ff8 - Adjust asserts in vop_lookup_post() to match the new post PDIRUNLOCK
vfs.

Sponsored by:	Isilon Systems, Inc.
2005-03-28 09:25:25 +00:00
Jeff Roberson
1e38e08e76 - Get rid of PDIRUNLOCK, instead, we fixup the lock state immediately after
calling VOP_LOOKUP().  Rather than having each filesystem check the
   LOCKPARENT flag, we simply check it once here and unlock as required.
   The only unusual case is ISDOTDOT, where we require an unlocked vnode
   on return.  Relocking this vnode with the child locked is allowed since
   the child is actually its parent.
 - Add a few asserts for some unusual conditions that I do not believe can
   happen.  These will later go away and turn into implementations for these
   conditions.

Sponsored by:	Isilon Systems, Inc.
2005-03-28 09:24:50 +00:00
Poul-Henning Kamp
3b73a3c079 Remove another ';' after if().
Also spotted by:	bz
2005-03-27 07:53:13 +00:00
Poul-Henning Kamp
2d8dfb2836 Remove extra ; at end of if().
Found by:	bz
2005-03-27 07:52:12 +00:00
Poul-Henning Kamp
4a650cc291 Make (some) serial ports implement the PPS-API again. This change
appearantly fell out during the tty code cleanup.
2005-03-26 20:12:39 +00:00
Poul-Henning Kamp
f83856243d s/ENOTTY/ENOIOCTL/ 2005-03-26 20:04:28 +00:00
Jeff Roberson
eb8d0e01c0 - The td_locks check is currently broken with snapshots and possibly
some case in unmount.  Disable the KASSERT until these problems can
   be diagnosed.

Sponsored by:	Isilon Systems, Inc.
2005-03-25 09:56:56 +00:00
Jeff Roberson
228ea9d212 - Don't recycle vnodes anymore. Free them once they are dead. getnewvnode
now always allocates a new vnode.
 - Define a new function, vnlru_free, which frees vnodes from the free list.
   It takes as a parameter the number of vnodes to free, which is
   wantfreevnodes - freevnodes when called from vnlru_proc or 1 when
   called from getnewvnode().  For now, getnewvnode() still tries to reclaim
   a free vnode before creating a new one when we are near the limit.
 - Define a function, vdestroy, which handles the actual release of memory
   and teardown of locks, etc.  This could become a uma_dtor() routine.
 - Get rid of minvnodes.  Now wantfreevnodes is 1/4th the max vnodes.  This
   keeps more unreferenced vnodes around so that files which have only
   been stat'd are less likely to be kicked out of the system before we
   have a chance to read them, etc.  These vnodes may still be freed via
   the normal vnlru_proc() routines which may some day become a real lru.
2005-03-25 05:34:39 +00:00
Marcel Moolenaar
379ba85322 Fix inittodr() invocation. Now that devfs is mounted before the
actual root file system is mounted, the first entry on the mountlist
is not the root file system and the timestamp for that entry is
typically 0. Passing that to inittodr() caused annoying errors on
alpha and ia64.
So, call inittodr() for all file systems on mountlist, but only when
the timestamp (mnt_time) is non-zero.
2005-03-25 01:56:12 +00:00
Jeff Roberson
6c759f3558 - Add information about the buf lock to db_show_buffer.
- Add a 'show lockedbufs' command that is similar to show lockedvnods.

Sponsored by:	Isilon Systems, Inc.
2005-03-25 00:20:37 +00:00
Jeff Roberson
f158df07ab - Restore COUNT() in all of its original glory. Don't make it dependent
on DEBUG as ufs will soon grow a dependency on this count.

Discussed with:	bde
Sponsored by:	Isilon Systems, Inc.
2005-03-25 00:00:44 +00:00
John Baldwin
85c36f3d25 Don't set ret_namelen and ret_resnamelen in res_find() unless both the
corresponding pointer to the buffer (ret_name and ret_resname) is non-NULL
to avoid possible NULL pointer derefs.

Reported by:	Coverity via sam
2005-03-24 21:20:25 +00:00
Poul-Henning Kamp
eb0d6cde00 Move implementation of hw.bus.rman sysctl to subr_rman.c so that
subr_bus.c doesn't need to peek inside struct resource.

OK from:	imp
2005-03-24 18:13:11 +00:00
Jeff Roberson
61ef09d118 - Fail an assert if we attempt to return with any lockmgr locks held in
userret().

Sponsored by:	Isilon Systems, Inc.
2005-03-24 09:35:38 +00:00
Jeff Roberson
92e251caf7 - Complete the implementation of td_locks. Track the number of outstanding
lockmgr locks that this thread owns.  This is complicated due to
   LK_KERNPROC and because lockmgr tolerates unlocking an unlocked lock.

Sponsored by:	Isilon Systes, Inc.
2005-03-24 09:35:06 +00:00
Jeff Roberson
d830f82824 - Pass LK_EXCLUSIVE to VFS_ROOT() to satisfy the new flags argument. For
now, all calls to VFS_ROOT() should still acquire exclusive locks.

Sponsored by:	Isilon Systems, Inc.
2005-03-24 07:31:38 +00:00
Jeff Roberson
aabb175391 - Fixup the default vfs_root function to match the new prototype.
Sponsored by:	Isilon Systems, Inc.
2005-03-24 07:30:00 +00:00
Jeff Roberson
5d14d29912 - Grab the lock type that the caller requests in vfs_hash_insert().
Sponsored by:	Isilon Systems, Inc.
2005-03-24 06:16:27 +00:00
Jeff Roberson
c167961e27 - If vput() is called with a shared lock it must upgrade to an exclusive
before it can call VOP_INACTIVE().  This must use the EXCLUPGRADE path
   because we may violate some lock order with another locked vnode if
   we drop and reacquire the lock.  If EXCLUPGRADE fails, we mark the
   vnode with VI_OWEINACT.  This case should be very rare.
 - Clear VI_OWEINACT in vinactive() and vbusy().
 - If VI_OWEINACT is set in vgone() do the VOP_INACTIVE call here as well.

Sponsored by:	Isilon Systems, Inc.
2005-03-24 06:08:58 +00:00
Jeff Roberson
3e6bcad375 - Remove some long dead LOOKUP_SHARED code that tracked the lock state.
- Always pass LOCKSHARED and rely on namei() to ignore it when
   LOOKUP_SHARED is not set.

Sponsored by:	Isilon Systems, Inc.
2005-03-24 06:04:35 +00:00
Jeff Roberson
ae88db8a72 - Remove the #ifdef LOOKUP_SHARED from some calls to NDINIT. The
LOCKSHARED flag is simply ignored in namei() if LOOKUP_SHARED is not
   enabled.

Sponsored by:	Isilon Systems, Inc.
2005-03-24 06:03:31 +00:00
Jeff Roberson
ad09e57f41 - Clear LOCKSHARED if LOOKUP_SHARED is not enabled. This is not strictly
necessary since we disable the shared locks in vfs_cache, but it is
   prefered that the option not leak out into filesystems when it is
   disabled.

Sponsored by:	Isilon Systems, Inc.
2005-03-24 06:02:37 +00:00
Jeff Roberson
fdd6a3ff3c - All of the bugs which lead to the complication of the LOOKUP_SHARED
config option have now been fixed.  All filesystems are properly locked
   and checked via DEBUG_VFS_LOCKS.  Remove the workaround code.

Sponsored by:	Isilon Systems, Inc.
2005-03-24 06:00:45 +00:00
Julian Elischer
b75b03116f Fix code freeing wrong cred pointer.
Submitted by:	das
Noticed by: Coverity tool
MFC after:	3 days

Note: usually the two pointers point to the same
thing but it was still a bug.
2005-03-21 22:55:38 +00:00
Robert Watson
6220dcba84 Add a read-only kern.sched.preemption sysctl so that user space can tell
if "options PREEMPTION" is compiled into the kernel.
2005-03-20 17:05:12 +00:00
Pawel Jakub Dawidek
c78941e69e Add ki_jid field to the kinfo_proc structure and store jail ID there.
Reviewed by:	gad
MFC after:	3 days
2005-03-20 10:35:23 +00:00
Poul-Henning Kamp
773eff9d97 Sleeping is not allowed in uma->fini 2005-03-19 08:22:13 +00:00
Sam Leffler
b53d6ac575 check copyin return value
Noticed by:	Coverity Prevent analysis tool
2005-03-19 04:34:23 +00:00
David Schultz
f7fdcd45f0 Add missing cases for PT_SYSCALL.
Found by:	Coverity Prevent analysis tool
2005-03-18 21:22:28 +00:00
Maxim Sobolev
2322a0a77d Impose the upper limit on signals that are allowed between kernel threads
in set[ug]id program for compatibility with Linux. Linuxthreads uses
4 signals from SIGRTMIN to SIGRTMIN+3.

Pointed out by:		rwatson
2005-03-18 13:33:18 +00:00
Poul-Henning Kamp
1ea7a6f806 Use subr_unit to allocate thread ID's with.
Tested by:	davidxu
2005-03-18 12:34:14 +00:00
Maxim Sobolev
f9cd63d436 Linuxthreads uses not only signal 32 but several signals >= 32.
PR:		kern/72922
Submitted by:	Andriy Gapon <avg@icyb.net.ua>
2005-03-18 11:08:55 +00:00
Poul-Henning Kamp
a1e1d551d8 Fix a bad copy&paste mistake I made.
Spotted by:	truckman
2005-03-18 06:01:21 +00:00
Warner Losh
36fed96550 Use STAILQ in preference to SLIST for the resources. Insert new resources
last in the list rather than first.

This makes the resouces print in the 4.x order rather than the 5.x order
(eg fdc0 at 0x3f0-0x3f5,0x3f7 is 4.x, but 0x3f7,0x3f0-0x3f5 is 5.x).  This
also means that the pci code will once again print the resources in BAR
ascending order.
2005-03-18 05:19:50 +00:00
John-Mark Gurney
c4c44d2935 fix aio+kq... I've been running ambrisko's test program for much longer
w/o problems than I was before...  This simply brings back the knote_delete
as knlist_delete which will also drop the knote's, instead of just clearing
the list and seeing _ONESHOT...

Fix a race where if a note was _INFLUX and _DETACHED, it could end up being
modified... whoopse..

MFC after:	1 week
Prodded by:	ambrisko and dwhite
2005-03-18 01:11:39 +00:00
John-Mark Gurney
7ac139a904 add m_copyup function.. This can be used to help make our ip stack less
alignment restrictive, and help performance on some ethernet cards which
currently copy the entire packet a couple bytes to get the packet aligned
properly...

Wordsmithing by:	dwhite
Obtained from:	NetBSD (code only)
I'll clean it up later:	rwatson
2005-03-17 19:34:57 +00:00
Robert Watson
bc60830675 A further step on the journey of meaking panics and debugging more reliable:
in the window between the beginning of panic() and entering the debugger,
it's possible to receive interrupts.  If we receive an interrupt, don't
preempt if panicstr != NULL, as the system is in the process of failing, and
the preempting thread is likely to stumble over the failure.  The typical
scenario is during the printf() in panic() prior to entering the debugger,
but when running with a slower console type such as serial console.

It could be that the panic string should be passed to the debugger to print,
so that it can run from the debugger's environment rather than a regular
kernel printf.

Glanced at by:	jhb
2005-03-17 15:18:01 +00:00
Poul-Henning Kamp
bde1a9c98b Kill MAJOR_AUTO 2005-03-17 13:37:28 +00:00
Poul-Henning Kamp
800b42bde0 Prepare for the final onslaught on devices:
Move uid/gid/mode from cdev to cdevsw.

Add kind field to use for devd(8) later.

Bump both D_VERSION and __FreeBSD_version
2005-03-17 12:07:00 +00:00
Poul-Henning Kamp
572b4402d1 In stange circumstances we may end up being the last reference to a
session in tprintf().   SESSRELE() needs to properly dispose of the
sessions mutex.

Add sessrele() which does the proper cleanup and have SESSRELE() call it.

Use SESSRELE also in pgdelete().

Found by:	Coverity (ID:526)
2005-03-17 08:44:41 +00:00
Poul-Henning Kamp
51f5ce0c8c Add two arguments to the vfs_hash() KPI so that filesystems which do
not have unique hashes (NFS) can also use it.
2005-03-16 11:20:51 +00:00
Poul-Henning Kamp
9068e77689 Fix a memoryleak in case of failed root filesystem mount.
Spotted by:     Coverity via sam
2005-03-16 11:06:49 +00:00
John-Mark Gurney
2a77000b75 MFp4: print a more useful error when we don't have a /dev to mount devfs
on..
2005-03-16 08:04:39 +00:00
Poul-Henning Kamp
78bb3c21ed Add mnt_hashseed to struct mount and initialize it witn PRNG bits, use
it to get better hashing in vfs_hash.

In case of an insert collision in vfs_hash_insert(), put the loosing vnode
on a special list so that vfs_hash_remove() can just assume that it is on
a list.

Drop the VI_HASHED flag.
2005-03-16 07:35:06 +00:00
Warner Losh
358fef538f Sometimes, when asked to return region A..C, we'd return A+N..C+N
instead of failing.

When looking for a region to allocate, we used to check to see if the
start address was < end.  In the case where A..B is allocated already,
and one wants to allocate A..C (B < C), then this test would
improperly fail (which means we'd examine that region as a possible
one), and we'd return the region B+1..C+(B-A+1) rather than NULL.
Since C+(B-A+1) is necessarily larger than C (end argument), this is
incorrect behavior for rman_reserve_resource_bound().

The fix is to exclude those regions where r->r_start + count - 1 > end
rather than r->r_start > end.  This bug has been in this code for a
very long time.  I believe that all other tests against end are
correctly done.

This is why sio0 generated a message about interrupts not being
enabled properly for the device.  When fdc had a bug that allocated
from 0x3f7 to 0x3fb, sio0 was then given 0x3fc-0x404 rather than the
0x3f8-0x3ff that it wanted.  Now when fdc has the same bug, sio0 fails
to allocate its ports, which is the proper behavior.  Since the probe
failed, we never saw the messed up resources reported.

I suspect that there are other places in the tree that have weird
looping or other odd work arounds to try to cope with the observed
weirdness this bug can introduce.  These workarounds should be located
and eliminated.

Minor debug write fix to match the above test done as well.

'nice' by: mdodd
Sponsored by: timing solutions (http://www.timing.com/)
2005-03-15 20:28:51 +00:00
Warner Losh
a33ab77447 Fix a debugging printf. The order of start/end was inconsistant with
all the other start/end debugs, causing momentary confusion when the
output was examined.
2005-03-15 20:15:15 +00:00
Poul-Henning Kamp
45c26fa2b6 Improve the vfs_hash() API: vput() the unneeded vnode centrally to
avoid replicating the vput in all the filesystems.
2005-03-15 20:00:03 +00:00
Jeff Roberson
b172f6c5f9 - Now that there are no external users of vfree() make it static.
- Move VSHOULDBUSY, VSHOULDFREE, and VTRYRECYCLE into vfs_subr.c so
   no one else attempts to grow a dependency on them.
 - Now that objects with pages hold the vnode we don't have to do unlocked
   checks for the page count in the vm object in VSHOULDFREE.  These three
   macros could simply check for holdcnt state transitions to determine
   whether the vnode is on the free list already, but the extra safety
   the flag affords us is probably worth the minimal cost.
 - The leafonly sysctl and code have been dead for several years now,
   remove the sysctl and the code that employed it from vtryrecycle().
 - vtryrecycle() also no longer has to check the object's page count as
   the object holds the vnode until it reaches 0.

Sponsored by:	Isilon Systems, Inc.
2005-03-15 14:38:16 +00:00
Poul-Henning Kamp
7933351a28 Fix a debug message to print a usable device name rather than useless
major+minor tupple.
2005-03-15 14:08:10 +00:00
Jeff Roberson
c178628d6e - Expose vholdl() so it may be used outside of vfs_subr.c 2005-03-15 13:43:10 +00:00
Poul-Henning Kamp
4ba679d6d0 Remove findcdev(). 2005-03-15 12:58:08 +00:00
Poul-Henning Kamp
0a2e49f1f8 Rename cdev->si_udev to cdev->si_drv0 to reflect the new nature of
the field.
2005-03-15 11:33:28 +00:00
Jeff Roberson
f5f0da0a0e - transferlockers() requires the interlock to be SMP safe.
Sponsored by:	Isilon Systems, Inc.
2005-03-15 09:27:45 +00:00
Poul-Henning Kamp
e82ef95c11 Simplify the vfs_hash calling convention. 2005-03-15 08:07:07 +00:00
Poul-Henning Kamp
ee148e2606 Cleanup accidentally include #if 0 section. 2005-03-14 10:25:09 +00:00
Poul-Henning Kamp
6c325a2a21 Currently (almost) all filesystems maintain a local inode hash table
to get from (mount + inode) to vnode.  These tables are mostly
copy&pasted from UFS, sized based on desiredvnodes and therefore
quite large (128K-512K).  Several filesystems are buggy enough that
they allocate the hash table even before they know if they will
ever be used or not.

Add "vfs_hash", a system wide hash table, which will replace all
the per-filesystem hash-tables.

The fields we add to struct vnode will more or less be saved in
the respective filesystems inodes.

Having one central implementation will save code and will allow us
to justify the complexity of code to dynamically (re)size the hash
at a later point.
2005-03-14 10:01:29 +00:00
Jeff Roberson
8045557f2b - Increment the holdcnt once for each usecount reference. This allows us
to use only the holdcnt to determine whether a vnode may be recycled,
   simplifying the V* macros as well as vtryrecycle(), etc.

Sponsored by:	Isilon Systems, Inc.
2005-03-14 09:25:19 +00:00
Jeff Roberson
159b454819 - We do not have to check the object's ref_count in VSHOULDFREE or
vtryrecycle().  All obj refs also ref the vnode.
 - Consistently use v_incr_usecount() to increment the usecount.  This will
   be more important later.

Sponsored by:	Isilon Systems, Inc.
2005-03-14 08:30:31 +00:00
Jeff Roberson
8f13a540ed - Slightly rearrange vrele() to move the common case in one indentation
level.

Sponsored by:	Isilon Systems, Inc.
2005-03-14 07:16:55 +00:00
Jeff Roberson
6fc16a838c - Rework vget() so we drop the usecount in two failure cases that were
missed by my last commit.

Sponsored by:	Isilon Systems, Inc.
2005-03-14 07:11:19 +00:00
Poul-Henning Kamp
93f6c81e25 Remove debugging printfs. 2005-03-14 06:51:29 +00:00
Jeff Roberson
0463dc9ef1 - Do a vn_start_write in vn_close, we may write if this is the last ref
on an unlinked file.  We can't know if this is the case until after we
   have the lock.
 - Lock the vnode in vn_close, many filesystems had code which was unsafe
   without the lock held, and holding it greatly simplifies vgone().
 - Adjust vn_lock() to check for the VI_DOOMED flag where appropriate.

Sponsored by:	Isilon Systems, Inc.
2005-03-13 11:56:28 +00:00
Jeff Roberson
6703c30bb5 - Remove vx_lock, vx_unlock, vx_wait, etc.
- Add a vn_start_write/vn_finished_write around vlrureclaim so we don't do
   writing ops without suspending.  This could suspend the vlruproc which
   should not be a problem under normal circumstances.
 - Manually implement VMIGHTFREE in vlrureclaim as this was the only instance
   where it was used.
 - Acquire a lock before calling vgone() as it now requires it.
 - Move the acquisition of the vnode interlock from vtryrecycle() to
   getnewvnode() so that if it fails we don't drop and reacquire the
   vnode_free_list_mtx.
 - Check for a usecount or holdcount at the end of vtryrecycle() in case
   someone grabbed a ref while we were recycling.  Abort the recycle, and
   on the final ref drop this vnode will be placed on the head of the free
   list.
 - Move the redundant VOP_INACTIVE protection code into the local
   vinactive() routine to avoid code bloat.
 - Keep the vnode lock held across calls to vgone() in several places.
 - vgonel() no longer uses XLOCK, instead callers must hold an exclusive
   vnode lock.  The VI_DOOMED flag is set to allow other threads to detect
   a vnode which is no longer valid.  This flag is set until the last
   reference is gone, and there are no chances for a new ref.  vgonel()
   holds this lock across the entire function, which greatly simplifies
   logic.
 _ Only vfree() in one place in vgone() not three.
 - Adjust vget() to check the VI_DOOMED flag prior to waiting on the lock
   in the LK_NOWAIT case.  In other cases, check after we have slept and
   acquired an exlusive lock.  This will simulate the old vx_wait()
   behavior.

Sponsored by:	Isilon Systems, Inc.
2005-03-13 11:54:28 +00:00
Jeff Roberson
2b3183a8b7 - A lock is required before calling VOP_REVOKE. Our reference protects us
from accessing another vnode so a naked VOP_LOCK is sufficient.

Sponsored by:	Isilon Systems, Inc.
2005-03-13 11:47:04 +00:00
Jeff Roberson
9331fd135b - Don't VOP_UNLOCK prior to VOP_REVOKE. The lock is required now.
Sponsored by:	Isilon Systems, Inc.
2005-03-13 11:45:51 +00:00
Jeff Roberson
23f2513a4e - Don't drop the lock in the default inactive handler anymore, VOP_NULL
will do for vop_stdinactive now.

Sponsored by:	Isilon Systems, Inc.
2005-03-13 11:45:01 +00:00
Jeff Roberson
4e6746965e - CLOSE, REVOKE, INACTIVE, and RECLAIM are not L L L, that's a locked vnode
on enter, exit, error.  This allows for the removal of the XLOCK.

Sponsored by:	Isilon Systems, Inc.
2005-03-13 11:42:16 +00:00
Pawel Jakub Dawidek
cefcecbefd Function jailed() looks into ucred strcture, so be sure ucred is not NULL.
Reviewed by:	rwatson
MFC after:	1 week
2005-03-12 14:31:04 +00:00
Pawel Jakub Dawidek
d079d0a0d2 Clean up a bit.
Reviewed by:	rwatson
MFC after:	1 week
2005-03-12 14:28:34 +00:00
Robert Watson
59f21d5ab1 Extend the coverage of the accept and socket mutexes in soisconnected()
so that the socket lock is held over the test-and-set removal of the
accept filter option during connect, and the two socket mutex regions
(transition to connected, perform accept filter) are combined.
2005-03-12 13:39:39 +00:00
Robert Watson
a59f81d263 Move the logic implementing retrieval of the SO_ACCEPTFILTER socket option
from uipc_socket.c to uipc_accf.c in do_getopt_accept_filter(), so that it
now matches do_setopt_accept_filter().  Slightly reformulate the logic to
match the optimistic allocation of storage for the argument in advance,
and slightly expand the coverage of the socket lock.
2005-03-12 12:57:18 +00:00
Robert Watson
92081a8344 Part two of post-SMPng cleanup of accept filter registration: perform all
allocation up front before grabbing the socket mutex and doing the
registration work.  The result is a lot cleaner.
2005-03-12 12:27:47 +00:00
Peter Wemm
f71692e9be Replace my previous change for 32 bit systems with hz > 169 with Bruce's
simpler one.
2005-03-12 00:13:45 +00:00
Peter Wemm
2afec87508 Make the tty vmin/vtime timeouts work for hz > 169 on 32 bit machines. 2005-03-12 00:10:23 +00:00
Robert Watson
64c238075f First step in simplifying accept filter socket option logic in the
post-SMPng world order.  Centralize handling of the socket option
clear case in do_setopt_accept_filter().
2005-03-11 21:37:45 +00:00
Robert Watson
56856fbfb4 Remove an additional commented out reference to a possible future sx
lock.
2005-03-11 19:16:02 +00:00
Robert Watson
2b37548a71 When setting up a socket in socreate(), there's no need to lock the
socket lock around knlist_init(), so don't.

Hard code the setting of the socket reference count to 1 rather than
using soref() to avoid asserting the socket lock, since we've not yet
exposed the socket to other threads.

This removes two mutex operations from each socket allocation.
2005-03-11 16:30:02 +00:00
Robert Watson
5fab68b19e Remove suggestive sx_init() comment in soalloc(). We will have something
like this at some point, but for now it clutters the source.
2005-03-11 16:26:33 +00:00
Robert Watson
35a196154f The SO_NOSIGPIPE socket option allows a user process to mark a socket
so that the socket does not generate SIGPIPE, only EPIPE, when a write
is attempted after socket shutdown.  When the option was introduced in
2002, this required the logic for determining whether SIGPIPE was
generated to be pushed down from dofilewrite() to the socket layer so
that the socket options could be considered.  However, the change in
2002 omitted modification to soo_write() required to add that logic,
resulting in SIGPIPE not being generated even without SO_NOSIGPIPE when
the socket was written to using write() or related generic system calls.

This change adds the EPIPE logic to soo_write(), generating a SIGPIPE
signal to the process associated with the passed uio in the event that
the SO_NOSIGPIPE option is not set.

Notes:

- The are upsides and downsides to placing this logic in the socket
  layer as opposed to the file descriptor layer.  This is really fd
  layer logic, but because we need so_options, we have a choice of
  layering violations and pick this one.

- SIGPIPE possibly should be delivered to the thread performing the
  write, not the process performing the write.

- uio->uio_td and the td argument to soo_write() might potentially
  differ; we use the thread in the uio argument.

- The "sigpipe" regression test in src/tools/regression/sockets/sigpipe
  tests for the bug.

Submitted by:		Mikko Tyolajarvi <mbsd at pacbell dot net>
Talked with:		glebius, alfred
PR:			78478
MFC after:		1 week
2005-03-11 15:06:16 +00:00
John-Mark Gurney
74e620476c fix spelling of match in comment...
MFC after:	3 days
2005-03-10 21:23:06 +00:00
Poul-Henning Kamp
b43ab0e378 Try to fix the mess I made of devname, with the minimal subset of the
larger minor/major patch which was posted for testing.
2005-03-10 18:21:34 +00:00
Robert Watson
53358cc907 Document, via WITNESS, that the NFS server mutex falls ahead of the socket
buffer mutexes.
2005-03-09 21:38:53 +00:00
Dag-Erling Smørgrav
628b83cd08 My addled brains didn't realize that since vtp points into value, we
can't freeenv(value) before we're done inspecting vtp[0].

Tested by:	Anish Mistry <mistry.7@osu.edu>
2005-03-09 12:16:45 +00:00
Stefan Farfeleder
b26244446b Fix typo in comment. 2005-03-09 11:50:55 +00:00
Sam Leffler
a4e714295a allow the destination of m_move_pkthdr to have external
storage (e.g. a cluster)

Glanced at by:	rwatson, silby
2005-03-08 17:52:01 +00:00
Giorgos Keramidas
0a11e99990 Remove redundant initialization that is repeated in the for() loop
right below it.

Approved by:	jhb
2005-03-08 16:57:20 +00:00
Maxim Sobolev
8d6e40c3f1 Add kernel-only flag MSG_NOSIGNAL to be used in emulation layers to surpress
SIGPIPE signal for the duration of the sento-family syscalls. Use it to
replace previously added hack in Linux layer based on temporarily setting
SO_NOSIGPIPE flag.

Suggested by:	alfred
2005-03-08 16:11:41 +00:00
Poul-Henning Kamp
d9a54d5c23 Reengineer subr_unit
Add support for passing in a mutex.  If NULL is passed a global
	subr_unit mutex is used.

	Add alloc_unrl() which expects the mutex to be held.

	Allocating a unit will never sleep as it does not need to allocate
	memory.

	Cut possible range in half so we can use -1 to mean "out of number".

	Collapse first and last runs into the head by means of counters.
	This saves memory in the common case(s).
2005-03-08 10:40:48 +00:00
Poul-Henning Kamp
3238ec33e1 Fix signedness of minor2unit(). 2005-03-08 10:40:03 +00:00
Jeff Roberson
ec346d1040 - Lock access to the buffer_map with the vm_map lock. In 4.x this was
done with splbio, in 5.x this was done with Giant.

Discussed with:		alc
Reported by:		julian, pho
2005-03-08 09:34:54 +00:00
Giorgos Keramidas
46da8bf8fb Typo & grammar fixes in comments. 2005-03-08 00:58:50 +00:00
Robert Watson
9bfb7389bc When upcalling from a socket in soisconnected() for an accept filter,
call with flag M_DONTWAIT rather than M_TRYWAIT, as we don't want to
do blocking memory allocation (etc) in the netisr.

MFC after:	3 days
2005-03-07 13:50:16 +00:00
Poul-Henning Kamp
3b3f38ed7d Add placeholder mutex argument to new_unrhdr(). 2005-03-07 11:05:47 +00:00
Bill Paul
58a6edd121 When you call MiniportInitialize() for an 802.11 driver, it will
at some point result in a status event being triggered (it should
be a link down event: the Microsoft driver design guide says you
should generate one when the NIC is initialized). Some drivers
generate the event during MiniportInitialize(), such that by the
time MiniportInitialize() completes, the NIC is ready to go. But
some drivers, in particular the ones for Atheros wireless NICs,
don't generate the event until after a device interrupt occurs
at some point after MiniportInitialize() has completed.

The gotcha is that you have to wait until the link status event
occurs one way or the other before you try to fiddle with any
settings (ssid, channel, etc...). For the drivers that set the
event sycnhronously this isn't a problem, but for the others
we have to pause after calling ndis_init_nic() and wait for the event
to arrive before continuing. Failing to wait can cause big trouble:
on my SMP system, calling ndis_setstate_80211() after ndis_init_nic()
completes, but _before_ the link event arrives, will lock up or
reset the system.

What we do now is check to see if a link event arrived while
ndis_init_nic() was running, and if it didn't we msleep() until
it does.

Along the way, I discovered a few other problems:

- Defered procedure calls run at PASSIVE_LEVEL, not DISPATCH_LEVEL.
  ntoskrnl_run_dpc() has been fixed accordingly. (I read the documentation
  wrong.)

- Similarly, the NDIS interrupt handler, which is essentially a
  DPC, also doesn't need to run at DISPATCH_LEVEL. ndis_intrtask()
  has been fixed accordingly.

- MiniportQueryInformation() and MiniportSetInformation() run at
  DISPATCH_LEVEL, and each request must complete before another
  can be submitted. ndis_get_info() and ndis_set_info() have been
  fixed accordingly.

- Turned the sleep lock that guards the NDIS thread job list into
  a spin lock. We never do anything with this lock held except manage
  the job list (no other locks are held), so it's safe to do this,
  and it's possible that ndis_sched() and ndis_unsched() can be
  called from DISPATCH_LEVEL, so using a sleep lock here is
  semantically incorrect. Also updated subr_witness.c to add the
  lock to the order list.
2005-03-07 03:05:31 +00:00