Commit Graph

8402 Commits

Author SHA1 Message Date
Robert Watson
e551d45211 Modify the alq(9) alq_open() API to accept a file creation mode, rather
than defaulting the cmode argument to vn_open() to 0.  Supply a default
argument of ALQ_DEFAULT_CMODE (0600) in current callers.

Discussed with/pointed out by:	hmp
Reveiwed by:	jeff, hmp
MFC after:	3 days
2005-04-16 12:12:27 +00:00
Maxim Konovalov
f305048664 Fix a typo in the comment.
Noticed by:	Samy Al Bahra
2005-04-15 14:01:43 +00:00
John Baldwin
95b66e9e53 Close a race between sleepq_broadcast() and sleepq_catch_signals().
Specifically, sleepq_broadcast() uses td_slpq for its private pending
queue of threads that it is going to wake up after it takes them off the
sleep queue.  The problem is that if one of the threads is actually not
asleep yet, then we can end up with td_slpq being corrupted and/or the
thread being made runnable at the wrong time resulting in the td_sleepqueue
== NULL assertion failures occasionally reported under heavy load.

The fix is to stop being so fancy and ditch the whole pending queue bit.
Instead, sleepq_remove_thread() and sleepq_resume_thread() were merged
into one function that requires the caller to hold sched_lock.  This
fixes several places that unlocked sched_lock only to call a function
that then locked sched_lock, so even though sched_lock is now held
slightly longer, removing the extra lock acquires (1 pair instead of 3
in some cases) probably makes it an overall win if you don't include the
fact that it closes a race.  This is definitely a 5.4 candidate.

PR:		kern/79693
Submitted by:	Steven Sears stevenjsears at yahoo dot com
MFC after:	4 days
2005-04-14 06:30:32 +00:00
Jeff Roberson
74a5123246 - Remove a debugging printf that slipped in.
Spotted by:	Peter Wemm
2005-04-13 23:36:28 +00:00
Tai-hwa Liang
2d4420789d According to the comment in struct tty, t_modem is optional; hence we should
guard against NULL t_modem entry. Otherwise, driver doesn't have t_modem
callback implemented(such like sys/dev/usb/ucycom.c) would panic when
someone opens the driver's associated tty device.

Reviewed by:	phk, sam (mentor)
2005-04-13 13:56:17 +00:00
Jeff Roberson
4585e3ac5a - Change all filesystems and vfs_cache to relock the dvp once the child is
locked in the ISDOTDOT case.  Se vfs_lookup.c r1.79 for details.

Sponsored by:	Isilon Systems, Inc.
2005-04-13 10:59:09 +00:00
Jeff Roberson
374df05fd3 - Change vop_lookup_post assertions to reflect recent vfs_lookup changes.
Sponsored by:	Isilon Systems, Inc.
2005-04-13 10:57:53 +00:00
Jeff Roberson
18ef8344b4 - Further simplify lookup; Force all filesystems to relock in the DOTDOT
case.  There are bugs in some which didn't unlock in the ISDOTDOT case
   to begin with that need to be addressed seperately.  This simplifies
   things anyway.
 - Fix relookup() to prevent it from vrele()'ing the dvp while the vp
   is locked.  Catch up to other lookup changes.

Sponsored by:	Isilon Systems, Inc.
Reported by:	Peter Wemm
2005-04-13 10:57:13 +00:00
Matthew N. Dodd
6a2989fd54 Implement unix(4) socket options LOCAL_CREDS and LOCAL_CONNWAIT.
- Add unp_addsockcred() (for LOCAL_CREDS).
- Add an argument to unp_connect2() to differentiate between
  PRU_CONNECT and PRU_CONNECT2. (for LOCAL_CONNWAIT)

Obtained from:	 NetBSD (with some changes)
2005-04-13 00:01:46 +00:00
Robert Watson
87efd4d58a Consistently style function declarations in kern_malloc.c.
MFC after:	3 days
2005-04-12 23:54:34 +00:00
John Baldwin
aa9aa68d2f Use PCPU_LAZY_INC() for cnt.v_{intr,trap,syscalls} rather than atomic
operations in some places and simple non-per CPU math in others.
2005-04-12 23:18:54 +00:00
Vinod Kashyap
f0c1dee27f The latest release of the FreeBSD driver (twa) for
3ware's 9xxx series controllers.  This corresponds to
the 9.2 release (for FreeBSD 5.2.1) on the 3ware website.

Highlights of this release are:

1. The driver has been re-architected to use a "Common Layer"
    (all tw_cl* files), which is a consolidation of all OS-independent
    parts of the driver.  The FreeBSD OS specific portions of the
    driver go into an "OS Layer" (all tw_osl* files).
    This re-architecture is to achieve better maintainability, consistency
    of behavior across OS's, and better portability to new OS's (drivers
    for new OS's can be written by just adding an OS Layer that's specific
    to the OS, by complying to a "Common Layer Programming Interface" API.

2. The driver takes advantage of multiple processors.

3. The driver has a new firmware image bundled, the new features of which
   include Online Capacity Expansion and multi-lun support, among others.
   More details about 3ware's 9.2 release can be found here:
   http://www.3ware.com/download/Escalade9000Series/9.2/9.2_Release_Notes_Web.pdf

Since the Common Layer is used across OS's, the FreeBSD specific include
path for header files (/sys/dev/twa) is not part of the #include pre-processor
directive in any of the source files.  For being able to integrate twa into
the kernel despite this, Makefile.<arch> has been changed to add the include
path to CFLAGS.

Reviewed by: scottl
2005-04-12 22:07:11 +00:00
Warner Losh
2bd5d8147a resource_list_purge: release the resources in this list, and purge the
elements of this list (eg, reset it).

Man page to follow
2005-04-12 15:20:36 +00:00
Warner Losh
f351862a17 rman_set_device() seems to have been omitted by mistake. Implement it. 2005-04-12 06:21:59 +00:00
Jeff Roberson
0b581232df - Remove unused include. 2005-04-12 05:45:58 +00:00
Jeff Roberson
436901a86b - Differentiate two UPGRADE panics so I have a better idea of what's going
on here.
2005-04-12 05:43:03 +00:00
Warner Losh
cdf7c848cf Return the resource created/found in resource_list_add to avoid an extra
resouce_list_find in some places.

Suggested by: sam
Found by: Coventry Analysis tool.
2005-04-12 04:22:17 +00:00
Jeff Roberson
17c916e321 - Mark the VOPs that require exclusive locks. Those that aren't marked
with E may be called with a shared lock held.  This list really could
   be made per filesystem if we had any filesystems which differed from
   ffs in locking guarantees.  VFS itself is not sensitive to this except
   where vgone() etc. are concerned.

Sponsored by:	Isilon Systems, Inc.
2005-04-11 15:19:29 +00:00
Jeff Roberson
539de9eda0 - Enable ASSERT_VOP_ELOCKED and assert_vop_elocked() now that vnode_if.awk
uses it.

Sponsored by:	Isilon Systems, Inc.
2005-04-11 15:17:06 +00:00
Jeff Roberson
070898b1b3 - Change the VOP_LOCK UPGRADE in vput() to do a LK_NOWAIT to avoid a
potential lock order reversal.  Also, don't unlock the vnode if this
   fails, lockmgr has already unlocked it for us.
 - Restructure vget() now that vn_lock() does all of VI_DOOMED checking
   for us and also handles the case where there is no real lock type.
 - If VI_OWEINACT is set, we need to upgrade the lock request to EXCLUSIVE
   so that we can call inactive.  It's not legal to vget a vnode that hasn't
   had INACTIVE called yet.

Sponsored by:	Isilon Systems, Inc.
2005-04-11 09:28:32 +00:00
Jeff Roberson
1b19c74d73 - Assert that we're no longer doing recursive vn_locks in inactive/reclaim
as I'd like to get rid of the vxthread.
 - Handle lock requests which don't actually want a lock as this is a
   much more convenient place to handle this condition than in vget().
   These requests simply want to know that VI_DOOMED isn't set.
 - Correct a test at the end of vn_lock, if error !=0 should be
   if error == 0, this has been broken since I comitted the VI_DOOMED
   changes, but no one ran into it because vget() duplicated this
   functionality.

Sponsored by:	Isilon Systems, Inc.
2005-04-11 09:23:56 +00:00
Jeff Roberson
836c5b4149 - vput(tvp) before vrele(tdvp) in kern_rename() to avoid lock order issues. 2005-04-11 09:19:08 +00:00
Nate Lawson
8d9134815e Add debugging prints to all the methods in case there are problems with
managing levels.  This can be enabled with the debug.cpufreq.verbose
tunable and sysctl.
2005-04-10 19:11:23 +00:00
David Schultz
f97c3df18d Suspend all other threads in the process while generating a core dump.
The main reason for doing this is that the ELF dump handler expects
the thread list to be fixed while the dump header is generated, so an
upcall that occurs at the wrong time can lead to buffer overruns and
other Bad Things.

Another solution would be to grab sched_lock in the ELF dump handler,
but we might as well single-thread, since the process is about to die.
Furthermore, I think this should ensure that the register sets in the
core file are sequentially consistent.
2005-04-10 02:31:24 +00:00
Pawel Jakub Dawidek
c19618dd7d CDEV lock should be before 'system map' lock.
Hardcode this order to help track down reported LOR.

LOR reported by:	Thierry Herbelot <thierry@herbelot.com>
LOR info:		http://sources.zabbadoz.net/freebsd/lor.html#080
2005-04-09 13:32:01 +00:00
Jeff Roberson
5ef9827cea - Remove the namei NOOBJ flag. It is meaningless now.
Sponsored by:	Isilon Systems, Inc.
2005-04-09 12:04:36 +00:00
Jeff Roberson
d3b78f7337 - If we vrele() a dvp while the child is locked we can potentially deadlock
when vrele() acquires the directory lock in the wrong order.  Fix this
   via the following changes:
 - Keep the directory locked after VOP_LOOKUP() until we've determined
   what we're going to do with the child.  This allows us to remove the
   complicated post LOOKUP code which determins whether we should lock or
   unlock the parent.  This means we may have to vput() in the appropriate
   cases later, rather than doing an unsafe vrele.
 - in NDFREE() keep two flags to indicate whether we need to unlock vp or
   dvp.  This allows us to vput rather than vrele in the appropriate
   cases without rechecking the flags.  Move the code to handle dvp after
   we handle vp.
 - Remove some dead code from namei() that was the result of changes to
   VFS_LOCK_GIANT().

Sponsored by:	Isilon Systems, Inc.
2005-04-09 11:53:16 +00:00
Pawel Jakub Dawidek
cd104dd3c1 Add a missing terminator.
Confirmed by:	rwatson
2005-04-09 11:31:31 +00:00
Gleb Smirnoff
4f20185860 Add additional newline to debug.mutex.prof.stats header, so that
column names are printed exactly above the columns.
2005-04-08 14:14:09 +00:00
Stephan Uphoff
779186434a Sprinkle some volatile magic and rearrange things a bit to avoid race
conditions in critical_exit now that it no longer blocks interrupts.

Reviewed by:	jhb
2005-04-08 03:37:53 +00:00
Poul-Henning Kamp
2e0b9b22f0 Fix bug in vfs_hash_rehash(): use correct bucket. This only affected
msdosfs which is broken in other ways too.
2005-04-07 07:54:08 +00:00
Poul-Henning Kamp
30a1695b11 Constify hexdump() harder. 2005-04-06 10:14:13 +00:00
Jeff Roberson
a96ab77002 - Remove dead code. 2005-04-06 10:11:14 +00:00
Jeff Roberson
d78e0ee9fd - Assert that the bufobj matches in flushbuflists. I still haven't gotten
to root cause on exactly how this happens.
 - If the assert is disabled, we presently try to handle this case, but the
   BUF_UNLOCK was missing.  Thus, if this condition ever hit we would leak
   a buf lock.

Many thanks to Peter Holm for all his help in finding this bug.  He really
put more effort into it than I did.
2005-04-06 06:49:46 +00:00
Jeff Roberson
2bbd6c9818 - Move NDFREE() from vfs_subr to vfs_lookup where namei() is. 2005-04-05 08:58:49 +00:00
Jeff Roberson
22fdc83f93 - Use taskqueue_thread rather than taskqueue_swi since our task is going
to vrele, which may vop lock.  This is not safe in a software interrupt
   context.
2005-04-05 08:51:45 +00:00
Christian S.J. Peron
f3e89267c0 Assert that the vnode is locked. This is meant to catch bugs or
mis-use of the vnode API in conditions where IO_NODELOCKED has been
used without the vnode actually being locked.
2005-04-05 01:11:43 +00:00
John Baldwin
c6a37e8413 Divorce critical sections from spinlocks. Critical sections as denoted by
critical_enter() and critical_exit() are now solely a mechanism for
deferring kernel preemptions.  They no longer have any affect on
interrupts.  This means that standalone critical sections are now very
cheap as they are simply unlocked integer increments and decrements for the
common case.

Spin mutexes now use a separate KPI implemented in MD code: spinlock_enter()
and spinlock_exit().  This KPI is responsible for providing whatever MD
guarantees are needed to ensure that a thread holding a spin lock won't
be preempted by any other code that will try to lock the same lock.  For
now all archs continue to block interrupts in a "spinlock section" as they
did formerly in all critical sections.  Note that I've also taken this
opportunity to push a few things into MD code rather than MI.  For example,
critical_fork_exit() no longer exists.  Instead, MD code ensures that new
threads have the correct state when they are created.  Also, we no longer
try to fixup the idlethreads for APs in MI code.  Instead, each arch sets
the initial curthread and adjusts the state of the idle thread it borrows
in order to perform the initial context switch.

This change is largely a big NOP, but the cleaner separation it provides
will allow for more efficient alternative locking schemes in other parts
of the kernel (bare critical sections rather than per-CPU spin mutexes
for per-CPU data for example).

Reviewed by:	grehan, cognet, arch@, others
Tested on:	i386, alpha, sparc64, powerpc, arm, possibly more
2005-04-04 21:53:56 +00:00
Nate Lawson
4abfd70c87 Document that devclass_get_maxunit(9) returns one greater than the current
highest unit.

Reviewed by:	dfr
MFC after:	2 weeks
2005-04-04 15:37:59 +00:00
Nate Lawson
fada20b989 Add devclass_get_drivers(9) which provides an array of pointers to driver
instances in a given devclass.  This is useful for systems that want to
call code in driver static methods, similar to device_identify().

Reviewed by:	dfr
MFC after:	2 weeks
2005-04-04 15:26:51 +00:00
Jeff Roberson
d1cc6041e6 - Add a missing unlock of the vnode_free_list_mtx.
Spotted by:	Antoine Brodin
2005-04-04 12:07:16 +00:00
Jeff Roberson
92b8231d4f - Instead of waiting forever to get a vnode in getnewvnode() wait for
one to become available for one second and then return ENFILE.  We
   can run out of vnodes, and there must be a hard limit because without
   one we can quickly run out of KVA on x86.  Presently the system can
   deadlock if there are maxvnodes directories in the namecache.  The
   original 4.x BSD behavior was to return ENFILE if we reached the max,
   but 4.x BSD did not have the vnlru proc so it was less profitable to
   wait.
2005-04-04 11:43:44 +00:00
Jeff Roberson
9a6bb8ad8f - Include opt_vfs.h for LOOKUP_SHARED.
- Control the behavior of shared lookups with the lookup_shared sysctl
   which has its default behavior set via the LOOKUP_SHARED option.
2005-04-03 23:50:20 +00:00
Nate Lawson
a44732323f maxunit is actually one higher than the greatest currently-allocated unit
in a devclass.  All the other uses of maxunit are correct and this one was
safe since it checks the return value of devclass_get_device(), which would
always say that the highest unit device doesn't exist.

Reviewed by:	dfr
MFC after:	3 days
2005-04-03 22:23:18 +00:00
Jeff Roberson
20728d8f63 - Slightly restructure acquire() so I can add more ktr information and
an assert to help find two strange bugs.
 - Remove some nearby spls.
2005-04-03 11:49:02 +00:00
Jeff Roberson
f0ddc75ed0 - Now that writes to character devices supporting softupdates can
generate dirty bufs even with a locked vnode, 100 retries is not that
   many.  This should probably change from a retry count to an abort when
   we are no longer cleaning any buffers.
 - Don't call vprint() while we still hold the vnode locked.  Move the call
   to later in the function.
 - Clean up a comment.
2005-04-03 10:24:03 +00:00
Alan Cox
9f65fb13aa Remove GIANT_REQUIRED from elfN_load_section(). 2005-04-03 07:57:47 +00:00
John Baldwin
98df9218da - Change the vm_mmap() function to accept an objtype_t parameter specifying
the type of object represented by the handle argument.
- Allow vm_mmap() to map device memory via cdev objects in addition to
  vnodes and anonymous memory.  Note that mmaping a cdev directly does not
  currently perform any MAC checks like mapping a vnode does.
- Unbreak the DRM getbufs ioctl by having it call vm_mmap() directly on the
  cdev the ioctl is acting on rather than trying to find a suitable vnode
  to map from.

Reviewed by:	alc, arch@
2005-04-01 20:00:11 +00:00
John Baldwin
fe24ab5fc5 Actually commit the code for kern_sched_get_rr_interval(). 2005-03-31 22:54:48 +00:00
John Baldwin
b88ec951e1 Implement kern_adjtime(), kern_readv(), kern_sched_rr_get_interval(),
kern_settimeofday(), and kern_writev() to allow for further stackgap
reduction in the compat ABIs.
2005-03-31 22:51:18 +00:00