Commit Graph

8659 Commits

Author SHA1 Message Date
pjd
a520307ce1 Add 'depth' argument to CTRSTACK() macro, which allows to reduce number
of ktr slots used. If 'depth' is equal to 0, the whole stack will be
logged, just like before.
2005-08-29 11:34:08 +00:00
ssouhlal
3041058fad Fix a typo in vop_rename_pre() where we ended up using vholdl()
instead of vhold(), even though the vnode interlock is unlocked.

MFC after:	3 days
2005-08-28 23:00:11 +00:00
alc
5346dda812 Handle vm_map_wire()'s failure. 2005-08-28 05:38:40 +00:00
alc
dfddcf2af4 Correctly handle vm_map_wire()'s failure. (See also revisions 1.81 and
1.82.)

Reviewed by:	tegge
2005-08-28 04:50:11 +00:00
alc
ecb497dd87 Eliminate an unneeded reference on a vm object. If, in fact, the nearby
vm_map_find() fails, then the excess reference causes the vm object to be
leaked.

Reviewed by:	tegge
2005-08-28 00:24:58 +00:00
alc
597f2facce Revert the previous change for two reasons: (1) If vm_map_find() succeeds
but vm_map_wire() fails, then a vm object, vm map entries, and kernel_map
free space is leaked and (2) unwiring is handled automatically by
vm_map_remove().

Suggested by:   tegge
2005-08-28 00:19:54 +00:00
des
6222a26a3b Two minor optimizations of fdalloc():
- if minfd < fd_freefile (as is most often the case, since minfd is
   usually 0), set it to fd_freefile.

 - remove a call to fd_first_free() which duplicates work already done
   by fdused().

This change results in a small but measurable speedup for processes
with large numbers (several thousands) of open files.

PR:		kern/85176
Submitted by:	Divacky Roman <xdivac02@stud.fit.vutbr.cz>
MFC after:	3 weeks
2005-08-26 11:16:39 +00:00
truckman
c6df9cec44 Track all lock relationships instead of pruning direct relationships
if an indirect relationship exists (keep both A->B->C and A->C).
This allows witness_checkorder() to use isitmychild() instead of
the much more expensive isitmydescendant() to check for valid lock
ordering.

Don't do an expensive tree walk to update the w_level values when
the tree is updated.  Only update the w_level values when using the
debugger to display the tree.

Nuke the experimental "witness_watch > 1" mode that only compared
w_level for the two locks.  This information is no longer maintained
at run time, and the use of isitmychild() in witness_checkorder
should bring performance close enough to the acceptable level that
this hack is not needed.

Report witness data structure allocation statistics under the
debug.witness sysctl.

Reviewed by:	jhb
MFC after:	30 days
2005-08-25 03:47:37 +00:00
truckman
aa31faa377 Back out the removal of LK_NOWAIT from the VOP_LOCK() call in
vlrureclaim() in vfs_subr.c 1.636  because waiting for the vnode
lock aggravates an existing race condition.  It is also undesirable
according to the commit log for 1.631.

Fix the tiny race condition that remains by rechecking the vnode
state after grabbing the vnode lock and grabbing the vnode interlock.

Fix the problem of other threads being starved (which 1.636 attempted
to fix by removing LK_NOWAIT) by calling uio_yield() periodically
in vlrureclaim().  This should be more deterministic than hoping
that VOP_LOCK() without LK_NOWAIT will block, which may not happen
in this loop.

Reviewed by:	kan
MFC after:	5 days
2005-08-23 03:44:06 +00:00
pjd
7bd8127f8f mp_ncpus is always (properly) initialized, even on UP kernels, so just use it. 2005-08-21 18:03:31 +00:00
rwatson
867f71548b Silence "busy" warnings when unmounting devfs at system shutdown. This
is a workaround for non-symetric teardown of the file systems at
shutdown with respect to the mount order at boot.  The proper long term
fix is to properly detach devfs from the root mount before unmounting
each, and should be implemented, but since the problem is non-harmful,
this temporary band-aid will prevent false positive bug reports and
unnecessary error output for 6.0-RELEASE.

MFC after:	3 days
Tested by:	pav, pjd
2005-08-20 17:12:47 +00:00
phk
df3343c5ae Properly un-giant-trick the cdevsw in fini_cdevsw()
Tripped over by:	Huang wen hui <huang@gddsn.org.cn>
2005-08-20 12:13:51 +00:00
davidxu
72d79d8357 Add missing brackets.
Noticed by: stefanf@
2005-08-19 22:30:13 +00:00
davidxu
bce3f5771d Fix a LOR between sched_lock and sleep queue lock. 2005-08-19 13:35:34 +00:00
davidxu
fae38f5558 Move up code for testing KEF_HOLD to avoid ke_cpu being changed unexpectly
for PRI_ITHD and PRI_REALTIME threads.
2005-08-19 11:51:41 +00:00
ume
d72dbf2615 - don't forget to save freqency when priority is raised.
- nuke redundant variable initialization.
2005-08-18 16:41:25 +00:00
ume
57562e0618 don't forget to update curr_priority. even when frequency is
not changed, priority may be changed.
2005-08-18 16:08:56 +00:00
phk
fcf6768753 Handle device drivers with D_NEEDGIANT in a way which does not
penalize the 'good' drivers:  Allocate a shadow cdevsw and populate
it with wrapper functions which grab Giant
2005-08-17 08:19:52 +00:00
phk
2352e07b6f In vop_stdpathconf(ap) also default for _PC_NAME_MAX and _PC_PATH_MAX. 2005-08-17 06:59:23 +00:00
ume
69a7276f8f Save cpu level only when priority is greater than PRIO_USER
to make CPUFREQ_SET(NULL, prio) work.
TODO: implement saved_level as stack.

Reviewed by:	njl
2005-08-16 20:03:08 +00:00
phk
d6e36cd748 Remove stale comment. 2005-08-16 19:47:42 +00:00
phk
8a3fe94804 Collect the devfs related sysctls in one place 2005-08-16 19:25:02 +00:00
phk
e89ebd4119 Create a new internal .h file to communicate very private stuff
from kern_conf.c to devfs.

For now just two prototypes, more to come.
2005-08-16 19:08:01 +00:00
kan
5f5eca170a Do not keep parent directory locked while calling VFS_ROOT to traverse mount
points in lookup(). The lock can be dropped safely around VFS_ROOT because
LOCKPARENT semantics with child and perent vnodes coming from different FSes
does not really have any meaningful use. On the other hard, this prevents
easily triggered deadlock on systems using automounter daemon.
2005-08-14 18:10:04 +00:00
kan
51355225d4 Do not use vm_pager_init() to initialize vnode_pbuf_freecnt variable.
vm_pager_init() is run before required nswbuf variable has been set
to correct value. This caused system to run with single pbuf available
for vnode_pager. Handle both cluster_pbuf_freecnt and vnode_pbuf_freecnt
variable in the same way.

Reported by:	ade
Obtained from:	alc
MFC after:	2 days
2005-08-13 20:21:33 +00:00
marcel
f94807eceb Make mpsafe_vfs=1 the default on ia64. 2005-08-13 20:07:50 +00:00
njl
1198d2f2a8 The "lowest" sysctl setting makes more sense as the lowest one to use, so
discard all levels less than this setting, not less than/equal to.

MFC after:	1 day
2005-08-11 18:40:58 +00:00
kan
9590889861 Do not drop the vnode interlock if vdropl is called on already doomed vnode.
vdropl callers expect it to return with interlock still being held.

MFC after:	2 days
2005-08-10 11:46:03 +00:00
rwatson
9025d9a2b8 Add an order between UDP inpcb locks and the IPv4 multicast address
list lock, as there has been a report that an alternative lock order
is getting introduced.  This should help ferret it out.

Reported by:	Ed Maste <emaste at phaedrus dot sandvine dot ca>
2005-08-09 13:27:50 +00:00
rwatson
5d770a09e8 Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags.  Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags.  This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by:	pjd, bz
MFC after:	7 days
2005-08-09 10:20:02 +00:00
csjp
cbebbd7704 Drop in a WITNESS_WARN into SYSCTL_IN to make sure that we are
not holding any non-sleep-able-locks locks when copyin is called.
This gets executed un-conditionally since we have no function
to wire the buffer in this direction.

Pointed out by:	truckman
MFC after:	1 week
2005-08-08 21:06:42 +00:00
rwatson
daa1c89f45 Merge the dev_clone and dev_clone_cred event handlers into a single
event handler, dev_clone, which accepts a credential argument.
Implementors of the event can ignore it if they're not interested,
and most do.  This avoids having multiple event handler types and
fall-back/precedence logic in devfs.

This changes the kernel API for /dev cloning, and may affect third
party packages containg cloning kernel modules.

Requested by:	phk
MFC after:	3 days
2005-08-08 19:55:32 +00:00
csjp
b42a694c8b Check to see if we wired the user-supplied buffers in SYSCTL_OUT, if
the buffer has not been wired and we are holding any non-sleep-able locks,
drop a witness warning. If the buffer has not been wired, it is possible
that the writing of the data can sleep, especially if the page is not in
memory. This can result in a number of different locking issues, including
dead locks.

MFC after:	1 week
Discussed with:	rwatson
Reviewed by:	jhb
2005-08-08 18:54:35 +00:00
davidxu
fee0e762f4 Try best to keep a preempted thread at front of run queue, this seems
improved performance a bit for some workloads, but still seeing interactive
lagging unless cpu idling race is fixed.
2005-08-08 14:20:10 +00:00
grehan
449d4474b4 Export a routine, kobj_machdep_init(), that allows platforms
to use the kobj subsystem as soon at mutex_init() has been called
instead of having to wait for the SI_SUB_LOCK sysinit.

Reviewed by:	dfr
2005-08-07 02:20:35 +00:00
csjp
8d2abe7f00 Change the data type of the upper shared memory limits from a signed
integer to an unsigned long. This lifts variables like the maximum
number of pages available for shared memory from 2^31 to 2^32 on 32
bit architectures, and from 2^31 to 2^64 on 64 bit architectures.

It should be noted that this changes breaks ABI on 64 bit architectures
because the size of the shmmax, shmmin, shmmni, shmseg and shmall members
of the shminfo structure has changed.

Silence on:	current@
2005-08-06 07:20:18 +00:00
ssouhlal
1f4d3e95ef Holding a vnode doesn't prevent v_mount from disappearing (when the
vnode is inactivated), possibly leading to a NULL dereference when
checking if the mount wants knotes to be activated in the VOP hooks.
So, we add a new vnode flag VV_NOKNOTE that is only set in getnewvnode(),
if necessary, and check it when activating knotes.
Since the flags are not erased when a vnode is being held, we can safely
read them.

Reviewed by:	kris@
MFC after:	3 days
2005-08-06 01:42:04 +00:00
rwatson
7504160c1e Introduce in_multi_mtx, which will protect IPv4-layer multicast address
lists, as well as accessor macros.  For now, this is a recursive mutex
due code sequences where IPv4 multicast calls into IGMP calls into
ip_output(), which then tests for a multicast forwarding case.

For support macros in in_var.h to check multicast address lists, assert
that in_multi_mtx is held.

Acquire in_multi_mtx around iteration over the IPv4 multicast address
lists, such as in ip_input() and ip_output().

Acquire in_multi_mtx when manipulating the IPv4 layer multicast addresses,
as well as over the manipulation of ifnet multicast address lists in order
to keep the two layers in sync.

Lock down accesses to IPv4 multicast addresses in IGMP, or assert the
lock when performing IGMP join/leave events.

Eliminate spl's associated with IPv4 multicast addresses, portions of
IGMP that weren't previously expunged by IGMP locking.

Add in_multi_mtx, igmp_mtx, and if_addr_mtx lock order to hard-coded
lock order in WITNESS, in that order.

Problem reported by:	Ed Maste <emaste at phaedrus dot sandvine dot ca>
MFC after:		10 days
2005-08-03 19:29:47 +00:00
jeff
df3babd63b - Unlock before we call mac_destroy_vnode to prevent a lock order reversal.
Found by:	trhodes
2005-08-03 05:36:50 +00:00
jeff
8076188a59 - Use lockmgr_printinfo rather than rolling our own. This introduces a
slight problem by using printf instead of db_printf however
   'show lockedvnods' does the same so I believe it is ok for now.
2005-08-03 05:02:08 +00:00
jeff
a2347b7b51 - Fix a problem that slipped through review; the stack member of the lockmgr
structure should have the lk_ prefix.
 - Add stack_print(lkp->lk_stack) to the information printed with
   lockmgr_printinfo().
2005-08-03 04:59:07 +00:00
jeff
08fb635b55 - Replace the series of DEBUG_LOCKS hacks which tried to save the vn_lock
caller by saving the stack of the last locker/unlocker in lockmgr.  We
   also put the stack in KTR at the moment.

Contributed by:		Antoine Brodin <antoine.brodin@laposte.net>
2005-08-03 04:48:22 +00:00
jeff
4a761caec7 - Add support for saving stack traces and displaying them via printf(9)
and KTR.

Contributed by:		Antoine Brodin <antoine.brodin@laposte.net>
Concept code from:	Neal Fachan <neal@isilon.com>
2005-08-03 04:27:40 +00:00
davidxu
69017b27e9 In adjustrunqueue(), add code to handle thread migrating case for
ULE scheduler. In original code, local run queue of threaded ksegrp
is corrupted if adjustrunqueue() is called while thread is migrating.
2005-08-03 01:23:45 +00:00
ru
c7ff05fb06 Long overdue, keep up with mbuf.h,v 1.148. 2005-08-02 20:03:23 +00:00
kbyanc
7cdc25bbb9 Make getsockopt(..., SOL_SOCKET, SO_ACCEPTCONN, ...) work per IEEE Std
1003.1 (POSIX).
2005-08-01 21:15:09 +00:00
davidxu
811c17378d If a thread was removed from system run queue, kse_assign shouldn't
add it again.
2005-07-31 15:11:21 +00:00
netchild
63a34efe1a The resource_xxx routines in subr_hints.c are called before and after the
kenv environment in kern_environment.c switches to dynamic kenv. The prior
call sets the static variable hintp to the static hints in subr_hints.c
(hintmode==0).

However, changes to the environment are not detected by the resource_xxx
lookups after the change to dynamic kernel environment, so the lookup
routines only report the old stuff of hintmode==0, even after the change to
the dynamic kenv. This causes kenv users to see a different environment than
the kernel routines.

This is a problem in the mixer.c code that looks up initial mixer volume
settings from the hints: If the hints are dynamic and not from the
device.hints file, mixer.c doesn't see them, but kenv does.

The patch from the PR (modified to comply to the style of the function)
solves this.

PR:		83686
Submitted by:	Harry Coin <harrycoin@qconline.com>
2005-07-31 10:46:55 +00:00
netchild
5d0bbfe29c Add bounds checking to the setenv part of the kernel environment.
This has no security implications since only root is allowed to use
kenv(1) (and corrupt the kernel memory after adding too much variables
previous to this commit).

This is based upon the PR [1] mentioned below, but extended to check both
bounds (in case of an overflow of the counting variable) and to comply
to the style of the function. An overflow of the counting variable
shouldn't happen after adding the check for the upper bound, but better
safe than sorry (in case some other function in the kernel overwrites
random memory).

An interested soul may want to add a printf to notify root in case the
bounds are hit.

Also allocate KENV_SIZE+1 entries (the array is NULL-terminated), since
the comment for KENV_SIZE says it's the maximum number of environment
strings. [2]

PR:		83687 [1]
Submitted by:	Harry Coin <harrycoin@qconline.com> [1]
Submitted by:	Ariff Abdullah <skywizard@MyBSD.org.my> [2]
2005-07-31 10:28:35 +00:00
jkoshy
05d42812ac Fail the module loading process if the currently executing kernel
was not compiled with 'options HWPMC_HOOKS' or if the compiled-in
version numbers of the kernel and module are out of sync.

Reported by:	cracauer
MFC after:	3 days
2005-07-30 09:02:42 +00:00