Commit Graph

8681 Commits

Author SHA1 Message Date
truckman
d2af5326c9 Add a new struct buf flag bit, B_PERSISTENT, and use it to tag
struct bufs that are persistently held by ext2fs.  Ignore any buffers
with this flag in the code in boot() that counts "busy" and dirty
buffers and attempts to sync the dirty buffers, which is done before
attempting to unmount all the file systems during shutdown.

This fixes the problem caused by any ext2fs file systems that are
mounted at system shutdown time, which caused boot() to give up on
a non-zero number of buffers and skip the call to vfs_unmountall().
This left all the mounted file systems in a dirty state and caused
them to all require cleanup by fsck on reboot.

Move the two separate copies of the "busy" buffer test in boot()
to a separate function.

Nuke the useless spl() stuff in the ext2fs ULCK_BUF() macro.

Bring the PRINT_BUF_FLAGS definition in sys/buf.h up to date with
this and previous flag changes.

PR:		kern/56675, kern/85163
Tested by:	"Matthias Andree" matthias.andree at gmx.de
Reviewed by:	bde
MFC after:	3 days
2005-09-08 06:30:05 +00:00
obrien
b888392910 Forward declaring static variables as extern is invalid ISO-C. Now that
GCC can properly handle forward static declarations, do this properly.
2005-09-07 10:06:14 +00:00
glebius
a166323868 In soreceive(), when a first mbuf is removed from socket buffer use
sockbuf_pushsync(). Previous manipulation could lead to an inconsistent
mbuf.

Reviewed by:	rwatson
2005-09-06 17:05:11 +00:00
glebius
832f7b2c0d Document flags of a pollrec. 2005-09-06 11:09:18 +00:00
csjp
33e564c762 Convert the primary ACL allocator from malloc(9) to using a UMA zone instead.
Also introduce an aclinit function which will be used to create the UMA zone
for use by file systems at system start up.

MFC after:	1 month
Discussed with:	rwatson
2005-09-06 00:06:30 +00:00
glebius
a43b8a5135 Remove Giant mutex from polling(4) and use a separate poll_mtx(4)
instead. Detailed changelist:

o Add flags field to struct pollrec, to indicate that
  are particular entry is being worked on.
o Define a macro PR_VALID() to check that a pollrec
  is valid and pollable.
o Mark ISRs as mpsafe.

o ether_poll()
  - Acquire poll_mtx while traversing pollrec array.
  - Skip pollrecs, that are being worked on.
  - Conditionally acquire Giant when entering handler.

o netisr_pollmore()
  - Conditionally assert Giant.
  - Acquire poll_mtx while working with statistics.

o netisr_poll()
  - Conditionally assert Giant.
  - Acquire poll_mtx while working with statistics
    and traversing pollrec array.

o ether_poll_register(), ether_poll_deregister()
  - Conditionally assert Giant.
  - Acquire poll_mtx while working with pollrec array.

o poll_idle()
  - Remove all strange manipulations with Giant.

In collaboration with:	ru, pjd
In collaboration with:	Oleg Bulyzhin <oleg rinet.ru>
In collaboration with:	dima <_pppp mail.ru>
2005-09-05 16:02:11 +00:00
delphij
5f683ee68d When padding with zero, do pad after prefixes rather than padding
before prefixes.

Use cases:
	printf("%05d", -42);   -->   "00-42"   (should be "-0042")
	printf("%#05x", 12);   -->   "000xc"   (should be "0x00c")

Submitted by:	Oliver Fromme
PR:		kern/85520
MFC After:	1 week
2005-09-04 18:03:45 +00:00
phk
40bead9126 If we ignore an unknown % sequence, we must stop interpreting the
remaining % arguments because the varargs are now out of sync and
there is a risk that we might for instance dereference an integer
in a %s argument.

Sponsored by: Napatech.com
2005-09-03 10:28:08 +00:00
jhb
107f288b6f - Add some comments to some of the static lock orders. Don't explicitly
link proctree and allproc to Giant since that order is already implicitly
  enforced.
- Use a goto to handle the case where we want to enforce a reversal before
  calling isitmydescendant() in witness_checkorder() so that the logic is
  easier to follow and so that it is easier to add more forced-reversal
  cases in the future.

MFC after:	 3 days
2005-09-02 20:23:49 +00:00
jhb
033d70b179 - Add an assertion to panic if one tries to call mtx_trylock() on a spin
mutex.
- Don't panic if a spin lock is held too long inside _mtx_lock_spin() if
  panicstr is set (meaning that we are already in a panic).  Just keep
  spinning forever instead.
2005-09-02 20:21:49 +00:00
jhb
736b03e795 Add witness warnings to panic if a thread tries to exit while holding any
locks.

Requested by:	jeff
MFC after:	3 days
2005-09-02 20:20:01 +00:00
njl
b0c4bd3081 Break out the checks for duplicates and absolute settings being too high
instead of trying to do them all at once.  This should fix the level sorting
problems from the previous revision.

Testing help:	ume
2005-09-02 16:32:43 +00:00
ssouhlal
6dfa271942 Print out a warning and a backtrace if we try to unlock a lockmgr that
we do not hold.

Glanced at by:	phk
MFC after:	3 days
2005-09-02 15:56:01 +00:00
ssouhlal
dd616181ff Don't unbusy the devfs mount in vfs_mountroot_try() as it gets accessed
and unbusied in devfs_fixup(), which assumes that the devfs mount is
still locked.

Granced at by:	phk
MFC after:	3 days
2005-09-02 13:37:54 +00:00
pjd
226a90d03e In case of mac_check_vnode_rename_from() or vn_start_write() failure,
vn_finished_write() should not be called.

Reviewed by:	ssouhlal
MFC after:	3 days
2005-09-01 21:46:33 +00:00
andre
056dde3309 Changes and cleanups to m_sanity():
o for() instead of while() looping  over mbuf chain
o paren's around all flag checks
o more verbose function and purpose description
o some more style changes

Based on feedback from:	sam
2005-08-30 21:31:42 +00:00
andre
6418b7b141 Unbreak m_demote() and put back the 'all' flag. Without it we cannot
correctly test for m_nextpkt in an mbuf chain.
2005-08-30 21:14:30 +00:00
andre
51994fffc4 o Remove the 'all' flag from m_demote(). Users can simply call it with
m_demote(m->m_next) if they wish to start at the second mbuf in chain.
o Test m_type with == instead of &.
o Check m_nextpkt against NULL instead of implicit 0.

Based on feedback from:	sam
2005-08-30 20:07:49 +00:00
njl
63a52296e3 Eliminate cpufreq levels for two cases that are less than optimal:
1. Walk the absolute list in reverse to prefer duplicated levels that have
a lower absolute setting, i.e. 800 Mhz/50% is better than 1600 Mhz/25% even
though both have the same actual frequency.  This also removes the need to
check for already-modified levels since by definition, those will be added
later in the sorted list.

2. Compare the absolute settings for derived levels and don't use the new
level if it's higher.  For example, a level of 800 Mhz/75% is preferable to
1600 Mhz/25% even though the latter has a lower total frequency.

This work is based on a patch from the submitter but reworked by myself.

Submitted by:	Tijl Coosemans (tijl/ulyssis.org)
2005-08-30 04:45:32 +00:00
andre
41519e2afc Add m_copymdata(struct mbuf *m, struct mbuf *n, int off, int len,
int prep, int how).

Copies the data portion of mbuf (chain) n starting from offset off
for length len to mbuf (chain) m.  Depending on prep the copied
data will be appended or prepended.  The function ensures that the
mbuf (chain) m will be fully writeable by making real (not refcnt)
copies of mbuf clusters.  For the prepending the function returns
a pointer to the new start of mbuf chain m and leaves as much
leading space as possible in the new first mbuf.

Reviewed by:	glebius
2005-08-29 20:15:33 +00:00
andre
139f31aa37 Add m_sanity(struct mbuf *m, int sanitize) to do some heavy sanity
checking on mbuf's and mbuf chains.  Set sanitize to 1 to garble
illegal things and have them blow up later when used/accessed.

m_sanity()'s main purpose is for KASSERT()'s and debugging of non-
kosher mbuf manipulation (of which we have a number of).

Reviewed by:	glebius
2005-08-29 19:58:56 +00:00
andre
71f036e379 Add m_demote(struct mbuf *m, int all) to clean up mbuf (chain) from
any tags and packet headers.  If "all" is set then the first mbuf
in the chain will be cleaned too.

This function is used before an mbuf, that arrived as packet with
m->flags & M_PKTHDR, is appended to an mbuf chain using m->m_next
(not m->m_nextpkt).

Reviewed by:	glebius
2005-08-29 19:45:39 +00:00
pjd
a520307ce1 Add 'depth' argument to CTRSTACK() macro, which allows to reduce number
of ktr slots used. If 'depth' is equal to 0, the whole stack will be
logged, just like before.
2005-08-29 11:34:08 +00:00
ssouhlal
3041058fad Fix a typo in vop_rename_pre() where we ended up using vholdl()
instead of vhold(), even though the vnode interlock is unlocked.

MFC after:	3 days
2005-08-28 23:00:11 +00:00
alc
5346dda812 Handle vm_map_wire()'s failure. 2005-08-28 05:38:40 +00:00
alc
dfddcf2af4 Correctly handle vm_map_wire()'s failure. (See also revisions 1.81 and
1.82.)

Reviewed by:	tegge
2005-08-28 04:50:11 +00:00
alc
ecb497dd87 Eliminate an unneeded reference on a vm object. If, in fact, the nearby
vm_map_find() fails, then the excess reference causes the vm object to be
leaked.

Reviewed by:	tegge
2005-08-28 00:24:58 +00:00
alc
597f2facce Revert the previous change for two reasons: (1) If vm_map_find() succeeds
but vm_map_wire() fails, then a vm object, vm map entries, and kernel_map
free space is leaked and (2) unwiring is handled automatically by
vm_map_remove().

Suggested by:   tegge
2005-08-28 00:19:54 +00:00
des
6222a26a3b Two minor optimizations of fdalloc():
- if minfd < fd_freefile (as is most often the case, since minfd is
   usually 0), set it to fd_freefile.

 - remove a call to fd_first_free() which duplicates work already done
   by fdused().

This change results in a small but measurable speedup for processes
with large numbers (several thousands) of open files.

PR:		kern/85176
Submitted by:	Divacky Roman <xdivac02@stud.fit.vutbr.cz>
MFC after:	3 weeks
2005-08-26 11:16:39 +00:00
truckman
c6df9cec44 Track all lock relationships instead of pruning direct relationships
if an indirect relationship exists (keep both A->B->C and A->C).
This allows witness_checkorder() to use isitmychild() instead of
the much more expensive isitmydescendant() to check for valid lock
ordering.

Don't do an expensive tree walk to update the w_level values when
the tree is updated.  Only update the w_level values when using the
debugger to display the tree.

Nuke the experimental "witness_watch > 1" mode that only compared
w_level for the two locks.  This information is no longer maintained
at run time, and the use of isitmychild() in witness_checkorder
should bring performance close enough to the acceptable level that
this hack is not needed.

Report witness data structure allocation statistics under the
debug.witness sysctl.

Reviewed by:	jhb
MFC after:	30 days
2005-08-25 03:47:37 +00:00
truckman
aa31faa377 Back out the removal of LK_NOWAIT from the VOP_LOCK() call in
vlrureclaim() in vfs_subr.c 1.636  because waiting for the vnode
lock aggravates an existing race condition.  It is also undesirable
according to the commit log for 1.631.

Fix the tiny race condition that remains by rechecking the vnode
state after grabbing the vnode lock and grabbing the vnode interlock.

Fix the problem of other threads being starved (which 1.636 attempted
to fix by removing LK_NOWAIT) by calling uio_yield() periodically
in vlrureclaim().  This should be more deterministic than hoping
that VOP_LOCK() without LK_NOWAIT will block, which may not happen
in this loop.

Reviewed by:	kan
MFC after:	5 days
2005-08-23 03:44:06 +00:00
pjd
7bd8127f8f mp_ncpus is always (properly) initialized, even on UP kernels, so just use it. 2005-08-21 18:03:31 +00:00
rwatson
867f71548b Silence "busy" warnings when unmounting devfs at system shutdown. This
is a workaround for non-symetric teardown of the file systems at
shutdown with respect to the mount order at boot.  The proper long term
fix is to properly detach devfs from the root mount before unmounting
each, and should be implemented, but since the problem is non-harmful,
this temporary band-aid will prevent false positive bug reports and
unnecessary error output for 6.0-RELEASE.

MFC after:	3 days
Tested by:	pav, pjd
2005-08-20 17:12:47 +00:00
phk
df3343c5ae Properly un-giant-trick the cdevsw in fini_cdevsw()
Tripped over by:	Huang wen hui <huang@gddsn.org.cn>
2005-08-20 12:13:51 +00:00
davidxu
72d79d8357 Add missing brackets.
Noticed by: stefanf@
2005-08-19 22:30:13 +00:00
davidxu
bce3f5771d Fix a LOR between sched_lock and sleep queue lock. 2005-08-19 13:35:34 +00:00
davidxu
fae38f5558 Move up code for testing KEF_HOLD to avoid ke_cpu being changed unexpectly
for PRI_ITHD and PRI_REALTIME threads.
2005-08-19 11:51:41 +00:00
ume
d72dbf2615 - don't forget to save freqency when priority is raised.
- nuke redundant variable initialization.
2005-08-18 16:41:25 +00:00
ume
57562e0618 don't forget to update curr_priority. even when frequency is
not changed, priority may be changed.
2005-08-18 16:08:56 +00:00
phk
fcf6768753 Handle device drivers with D_NEEDGIANT in a way which does not
penalize the 'good' drivers:  Allocate a shadow cdevsw and populate
it with wrapper functions which grab Giant
2005-08-17 08:19:52 +00:00
phk
2352e07b6f In vop_stdpathconf(ap) also default for _PC_NAME_MAX and _PC_PATH_MAX. 2005-08-17 06:59:23 +00:00
ume
69a7276f8f Save cpu level only when priority is greater than PRIO_USER
to make CPUFREQ_SET(NULL, prio) work.
TODO: implement saved_level as stack.

Reviewed by:	njl
2005-08-16 20:03:08 +00:00
phk
d6e36cd748 Remove stale comment. 2005-08-16 19:47:42 +00:00
phk
8a3fe94804 Collect the devfs related sysctls in one place 2005-08-16 19:25:02 +00:00
phk
e89ebd4119 Create a new internal .h file to communicate very private stuff
from kern_conf.c to devfs.

For now just two prototypes, more to come.
2005-08-16 19:08:01 +00:00
kan
5f5eca170a Do not keep parent directory locked while calling VFS_ROOT to traverse mount
points in lookup(). The lock can be dropped safely around VFS_ROOT because
LOCKPARENT semantics with child and perent vnodes coming from different FSes
does not really have any meaningful use. On the other hard, this prevents
easily triggered deadlock on systems using automounter daemon.
2005-08-14 18:10:04 +00:00
kan
51355225d4 Do not use vm_pager_init() to initialize vnode_pbuf_freecnt variable.
vm_pager_init() is run before required nswbuf variable has been set
to correct value. This caused system to run with single pbuf available
for vnode_pager. Handle both cluster_pbuf_freecnt and vnode_pbuf_freecnt
variable in the same way.

Reported by:	ade
Obtained from:	alc
MFC after:	2 days
2005-08-13 20:21:33 +00:00
marcel
f94807eceb Make mpsafe_vfs=1 the default on ia64. 2005-08-13 20:07:50 +00:00
njl
1198d2f2a8 The "lowest" sysctl setting makes more sense as the lowest one to use, so
discard all levels less than this setting, not less than/equal to.

MFC after:	1 day
2005-08-11 18:40:58 +00:00
kan
9590889861 Do not drop the vnode interlock if vdropl is called on already doomed vnode.
vdropl callers expect it to return with interlock still being held.

MFC after:	2 days
2005-08-10 11:46:03 +00:00