9959 Commits

Author SHA1 Message Date
kmacy
f66541917b style fixes and make sure that the lock is treated as released in the sharers == 0 case
not that this is somewhat racy because a new sharer can come in while we're updating stats
2007-04-04 00:01:05 +00:00
kmacy
836059f8b9 Fixes to sx for newsx - fix recursed case and move out of inline
Submitted by: Attilio Rao <attilio@freebsd.org>
2007-04-03 22:58:21 +00:00
kmacy
58561e082b move lock_profile calls out of the macros and into kern_mutex.c
add check for mtx_recurse == 0 when releasing sleep lock
2007-04-03 22:52:31 +00:00
kmacy
8913ddf202 skip call to _lock_profile_obtain_lock_success entirely if acquisition time is non-zero
(i.e. recursing or adding sharers)
2007-04-03 18:36:27 +00:00
pjd
aa77196921 Add root_mount_wait() function which can be used to wait until the root
file system is mounted. This is useful for kernel modules loaded from
/boot/loader.conf, that have to access file system.
2007-04-03 11:45:28 +00:00
jhb
b9a3a5afc7 Fix a fd leak in socketpair():
- Close the new file objects created during socketpair() if the copyout of
  the new file descriptors fails.
- Add a test to the socketpair regression test for this edge case.
2007-04-02 19:15:47 +00:00
jhb
6d9ee961ef Don't go to a whole lot of extra work to handle the race where the new
file descriptor is closed out from under us in kern_open().  This race
is already handled and the file will be closed when kern_open() does an
fdrop just before returning.
2007-04-02 13:40:38 +00:00
wkoszek
aae26bbf9f ng_node and ng_worklist locks both migrated from being spinning locks to
adaptive mutexes. Let witness(4) calm down and bring proper types of those
locks to the lock order database.

Glanced at by:	rwatson
2007-04-01 15:48:10 +00:00
pjd
b078229dad I think the code I'm removing here is completely bogus.
vfs_flags field is used for VFCF_* flags which are given at file system
driver creation time (via VFS_SET(9)) macro.

What this code did was bascially this:

If file system registers itself with VFCF_UNICODE flag (stores file names
as Unicode), it will gain MNT_SOFTDEP flag (UFS soft-updates).

If file system registers itself with VFCF_LOOPBACK flag (aliases some other
mounted FS), it will gain MNT_SUIDDIR flag (special handling of SUID on
dirs).

The latter will be quite dangerous, but those flags are reset later in
vfs_domount().

MFC after:	1 month
2007-04-01 13:08:05 +00:00
pjd
c20a93a345 Now that the vdropl() function is public, assert that the vnode interlock
is held.
2007-04-01 10:45:32 +00:00
des
b0b258dcad Make vdropl() public; zfs needs it. There is also plenty of existing
file system code (mostly *_reclaim()) which look like this:

    VOP_LOCK(vp);
    /* examine vp */
    VOP_UNLOCK(vp);
    vdrop(vp);

This can now be rewritten to:

    VOP_LOCK(vp);
    /* examine vp */
    vdropl(vp); /* will unlock vp */

MFC after:	1 week
2007-03-31 23:57:17 +00:00
jhb
b0b93a3c55 Optimize sx locks to use simple atomic operations for the common cases of
obtaining and releasing shared and exclusive locks.  The algorithms for
manipulating the lock cookie are very similar to that rwlocks.  This patch
also adds support for exclusive locks using the same algorithm as mutexes.

A new sx_init_flags() function has been added so that optional flags can be
specified to alter a given locks behavior.  The flags include SX_DUPOK,
SX_NOWITNESS, SX_NOPROFILE, and SX_QUITE which are all identical in nature
to the similar flags for mutexes.

Adaptive spinning on select locks may be enabled by enabling the
ADAPTIVE_SX kernel option.  Only locks initialized with the SX_ADAPTIVESPIN
flag via sx_init_flags() will adaptively spin.

The common cases for sx_slock(), sx_sunlock(), sx_xlock(), and sx_xunlock()
are now performed inline in non-debug kernels.  As a result, <sys/sx.h> now
requires <sys/lock.h> to be included prior to <sys/sx.h>.

The new kernel option SX_NOINLINE can be used to disable the aforementioned
inlining in non-debug kernels.

The size of struct sx has changed, so the kernel ABI is probably greatly
disturbed.

MFC after:	1 month
Submitted by:	attilio
Tested by:	kris, pjd
2007-03-31 23:23:42 +00:00
pjd
969ba66299 Make vfs_mount_destroy() and vfs_freeopts() non-static, I'd like to use them. 2007-03-31 22:44:45 +00:00
rwatson
76104e0492 Rather than ignoring any error return from getnewvnode() in nameiinit(),
explicitly test and panic.  This should not ever happen, but if it does,
this is a preferred failure mode to a NULL pointer dereference in kernel.

Coverity CID:	1716
Found with:	Coverity Prevent(tm)
2007-03-31 16:08:50 +00:00
jhb
602dd12fbc - Drop memory barriers in rw_try_upgrade(). We don't need an 'acq' memory
barrier here as the earlier rw_rlock() already contained one.
- Comment fix.
2007-03-30 18:08:55 +00:00
jhb
0d7ad36cd0 - Use lock_init/lock_destroy() to setup the lock_object inside of lockmgr.
We can now use LOCK_CLASS() as a stronger check in lockmgr_chain() as a
  result.  This required putting back lk_flags as lockmgr's use of flags
  conflicted with other flags in lo_flags otherwise.
- Tweak 'show lock' output for lockmgr to match sx, rw, and mtx.
2007-03-30 18:07:24 +00:00
wkoszek
78bc9b6aa1 vm_map_delete should be used only internally, by the VM subsystem. Replace
it with vm_map_remove, which not only embeds additional check, but also
takes care of locking.

Reviewed by:	alc
Approved by:	alc, cognet (mentor)
2007-03-29 13:26:13 +00:00
jhb
38635fcc55 Align 'struct thread' on 16 byte boundaries so that the lower 4 bits are
always 0.  Previously we aligned threads on a minimum of 8-byte boundaries.

Note: This changes the uma zone to no longer cache align threads.  We
really want the uma zone to do align threads to MAX(16, cache line size)
but there currently isn't a good way to express that to uma.

Submitted by:	attilio
2007-03-27 16:51:34 +00:00
marcel
3436aa6504 PowerPC is the only architecture with mpsafe_vfs=0. This is now
broken. Rudimentary tests show that PowerPC can run with
mpsafe_vfs=1. Make it so...
2007-03-27 05:29:41 +00:00
njl
4933ca0aa0 Add an interface for drivers to be notified of changes to CPU frequency.
cpufreq_pre_change is called before the change, giving each driver a chance
to revoke the change.  cpufreq_post_change provides the results of the
change (success or failure).  cpufreq_levels_changed gives the unit number
of the cpufreq device whose number of available levels has changed.  Hook
in all the drivers I could find that needed it.

* TSC: update TSC frequency value.  When the available levels change, take the
highest possible level and notify the timecounter set_cputicker() of that
freq.  This gets rid of the "calcru: runtime went backwards" messages.
* identcpu: updates the sysctl hw.clockrate value
* Profiling: if profiling is active when the clock changes, let the user
know the results may be inaccurate.

Reviewed by:	bde, phk
MFC after:	1 month
2007-03-26 18:03:29 +00:00
emaste
3cb53690e0 Avoid manipulating semu_list outside of the scope of SEMUNDO_LOCK(). This
would lead to an occasional hang with a cycle in semu_list.

X-Discussed-On: hackers@
2007-03-26 17:41:14 +00:00
rwatson
b6cc16f9a0 Following movement of functions from uipc_socket2.c to uipc_socket.c and
uipc_sockbuf.c, clean up and update comments.
2007-03-26 17:05:09 +00:00
rwatson
81eac4c7f0 Complete removal of uipc_socket2.c by moving the last few functions to
other C files:

- Move sbcreatecontrol() and sbtoxsockbuf() to uipc_sockbuf.c.  While
  sbcreatecontrol() is really an mbuf allocation routine, it does its work
  with awareness of the layout of socket buffer memory.

- Move pru_*() protocol switch stubs to uipc_socket.c where the non-stub
  versions of several of these functions live.  Likewise, move socket state
  transition calls (soisconnecting(), etc) to uipc_socket.c.  Moveo
  sodupsockaddr() and sotoxsocket().
2007-03-26 08:59:03 +00:00
kris
56c63dc234 Correct a comment typo 2007-03-25 10:07:23 +00:00
kris
7cac458d7a Update a comment: we usually call exec_vmspace_new with Giant not held,
but sometimes it is.
2007-03-25 10:05:44 +00:00
emaste
1639f74854 Stop setting ki_ocomm (thread name) to the proc name by default, as nothing
in the base system relies on this any longer.
2007-03-23 04:01:08 +00:00
jhb
ec6d5dd10f - Simplify the #ifdef's for adaptive mutexes and rwlocks by conditionally
defining a macro earlier in the file.
- Add NO_ADAPTIVE_RWLOCKS option to disable adaptive spinning for rwlocks.
2007-03-22 16:09:23 +00:00
glebius
88fc902c13 Move the dom_dispose and pru_detach calls in sofree() earlier. Only after
calling pru_detach we can be absolutely sure, that we don't have any
references to the socket in the stack.

This closes race between lockless sbdestroy() and data arriving on socket.

Reviewed by:	rwatson
2007-03-22 13:21:24 +00:00
jhb
29a0e4d380 Rename the cv_*wait*() functions to _cv_*wait*() and change their second
argument from a mutex to a lock_object.  Add cv_*wait*() wrapper macros
that accept either a mutex, rwlock, or sx lock as the second argument and
convert it to a lock_object and then call _cv_*wait*().  Basically, the
visible difference is that you can now use rwlocks and sx locks with
condition variables using the same API as with mutexes.
2007-03-21 22:22:13 +00:00
jhb
a84f74bb36 Rename the 'mtx_object', 'rw_object', and 'sx_object' members of mutexes,
rwlocks, and sx locks to 'lock_object'.
2007-03-21 21:20:51 +00:00
jhb
4b65fa3cdb Don't use cv_wait_unlock() to implement cv_wait(). Instead, implement
cv_wait() fully and add missing KTRACE context switch traces.
2007-03-21 20:46:26 +00:00
jhb
2d89c91492 If vn_open() fails during kern_open(), don't fdrop() the new file object
until after the call to fdclose().  This closes an obscure race that
could result in the later call to fdclose() actually closing a different
file descriptor if another thread close()'s the file descriptor being
opened before fdrop() is called, so the fdrop() in kern_open() frees the
file object, then the second thread (or a third) creates a new file
descriptor which reuses both the same index and the same file pointer
thus tricking fdclose() in the first thread into thinking that the
original file was still open.

MFC after:	1 week
2007-03-21 19:32:08 +00:00
jhb
08da44e275 Handle the case when a thread is blocked on a lockmgr lock with LK_DRAIN
in DDB's 'show sleepchain'.

MFC after:	3 days
2007-03-21 19:28:20 +00:00
andre
515a501549 Maintain a pointer and offset pair into the socket buffer mbuf chain to
avoid traversal of the entire socket buffer for larger offsets on stream
sockets.

Adjust tcp_output() make use of it.

Tested by:	gallatin
2007-03-19 18:35:13 +00:00
pjd
d128126120 Don't deny unmounting file systems for jailed processes immediately, allow
prison_priv_check() to decide what to do.

This change is suppose not to change current (security) behaviour
in any way.

This change is simlar to the change of PRIV_VFS_MOUNT in previous revision.
2007-03-18 02:39:19 +00:00
jeff
07c8782e65 - Handle the case where slptime == runtime.
Submitted by:	Atoine Brodin
2007-03-17 23:32:48 +00:00
jeff
82143f95ad - Cast the intermediate value in priority computtion back down to
unsigned char.  Weirdly, casting the 1 constant to u_char still produces
   a signed integer result that is then used in the % computation.  This
   avoids that mess all together and causes a 0 pri to turn into 255 % 64
   as we expect.

Reported by:	kkenn (about 4 times, thanks)
2007-03-17 18:13:32 +00:00
jhb
cbfbccd221 Just use 'fdrop()' instead of 'FILE_LOCK(); fdrop_locked()' in
dupfdopen().  While I'm at it, move the second fdrop() out from under the
filedesc lock.
2007-03-15 21:19:21 +00:00
pjd
b8cb05ecd8 Don't deny mounting for jailed processes immediately, allow
prison_priv_check() to decide what to do.

This change is suppose not to change current (security) behaviour
in any way.

Reviewed by:	rwatson
2007-03-14 13:09:59 +00:00
pjd
aa74bb6d55 White space nits. 2007-03-14 12:54:10 +00:00
kib
1a473e3ed5 Busy filesystem around call of VFS_QUOTACTL() vfs op.
Tested by:	Peter Holm
Reviewed by:	tegge
Approved by:	re (kensmith)
2007-03-14 08:45:55 +00:00
jhb
dccb53bbac Print readers count as unsigned in ddb 'show lock'.
Submitted by:	attilio
2007-03-13 16:51:27 +00:00
tegge
214bc5723c Make insmntque() externally visibile and allow it to fail (e.g. during
late stages of unmount).  On failure, the vnode is recycled.

Add insmntque1(), to allow for file system specific cleanup when
recycling vnode on failure.

Change getnewvnode() to no longer call insmntque().  Previously,
embryonic vnodes were put onto the list of vnode belonging to a file
system, which is unsafe for a file system marked MPSAFE.

Change vfs_hash_insert() to no longer lock the vnode.  The caller now
has that responsibility.

Change most file systems to lock the vnode and call insmntque() or
insmntque1() after a new vnode has been sufficiently setup.  Handle
failed insmntque*() calls by propagating errors to callers, possibly
after some file system specific cleanup.

Approved by:	re (kensmith)
Reviewed by:	kib
In collaboration with:	kib
2007-03-13 01:50:27 +00:00
jhb
33c77f5168 Fix a typo. 2007-03-12 20:10:29 +00:00
jhb
d8d45f52d7 - Use m_gethdr(), m_get(), and m_clget() instead of the macros in
sosend_copyin().
- Use M_WAITOK instead of M_TRYWAIT in sosend_copyin().
- Don't check for NULL from M_WAITOK and return ENOBUFS.
  M_WAITOK/M_TRYWAIT allocations don't fail with NULL.

Reviewed by:	andre
Requested by:	andre (2)
2007-03-12 19:27:36 +00:00
rwatson
5ce4db6432 In uipc_close(), we no longer always free the unpcb, as the last reference
may be dropped later.  In this case, always unlock the unpcb so as not to
leak the lock.

Found by:	kris (BugMagnet)
2007-03-12 14:52:00 +00:00
jhb
e454d575d9 Use sx_sleep() in the main loop of the accounting kthread. 2007-03-09 23:29:31 +00:00
jhb
f5e3969340 Allow threads to atomically release rw and sx locks while waiting for an
event.  Locking primitives that support this (mtx, rw, and sx) now each
include their own foo_sleep() routine.
- Rename msleep() to _sleep() and change it's 'struct mtx' object to a
  'struct lock_object' pointer.  _sleep() uses the recently added
  lc_unlock() and lc_lock() function pointers for the lock class of the
  specified lock to release the lock while the thread is suspended.
- Add wrappers around _sleep() for mutexes (mtx_sleep()), rw locks
  (rw_sleep()), and sx locks (sx_sleep()).  msleep() still exists and
  is now identical to mtx_sleep(), but it is deprecated.
- Rename SLEEPQ_MSLEEP to SLEEPQ_SLEEP.
- Rewrite much of sleep.9 to not be msleep(9) centric.
- Flesh out the 'RETURN VALUES' section in sleep.9 and add an 'ERRORS'
  section.
- Add __nonnull(1) to _sleep() and msleep_spin() so that the compiler will
  warn if you try to pass a NULL wait channel.  The functions already have
  a KASSERT to that effect.
2007-03-09 22:41:01 +00:00
jhb
60ad130f46 Add two new function pointers 'lc_lock' and 'lc_unlock' to lock classes.
These functions are intended to be used to drop a lock and then reacquire
it when doing an sleep such as msleep(9).  Both functions accept a
'struct lock_object *' as their first parameter.  The 'lc_unlock' function
returns an integer that is then passed as the second paramter to the
subsequent 'lc_lock' function.  This can be used to communicate state.
For example, sx locks and rwlocks use this to indicate if the lock was
share/read locked vs exclusive/write locked.

Currently, spin mutexes and lockmgr locks do not provide working lc_lock
and lc_unlock functions.
2007-03-09 16:27:11 +00:00
jhb
f5f98a764d Use C99-style struct member initialization for lock classes. 2007-03-09 16:19:34 +00:00