Commit Graph

8427 Commits

Author SHA1 Message Date
David Xu
c4bd610f58 Add new syscall thr_new to create thread in atomic, it will
inherit signal mask from parent thread, setup TLS and stack, and
user entry address.
Also support POSIX thread's PTHREAD_SCOPE_PROCESS and PTHREAD_SCOPE_SYSTEM,
sysctl is also provided to control the scheduler scope.
2005-04-23 02:36:07 +00:00
David Xu
21fc316430 Change cpu_set_kse_upcall to more generic style, so we can reuse it
in other codes. Add cpu_set_user_tls, use it to tweak user register
and setup user TLS. I ever wanted to merge it into cpu_set_kse_upcall,
but since cpu_set_kse_upcall is also used by M:N threads which may
not need this feature, so I wrote a separated cpu_set_user_tls.
2005-04-23 02:32:32 +00:00
Jeff Roberson
17314e6286 - Define the real lock order with cdev and a few vm/vfs related locks. This
can be removed once cdev no longer calls free() with the cdev lock held.
2005-04-22 22:43:31 +00:00
Jeff Roberson
57f66be038 - Check LO_DUPOK as well as LOP_DUPOK when determining whether we should
warn about duplicate acquires.

Sponsored by:	Isilon Systems, Inc.
2005-04-22 22:39:46 +00:00
Tom Rhodes
498693053c Get the directory structure correct in a comment.
Submitted by:	Samy Al Bahra
2005-04-22 19:09:12 +00:00
Jeff Roberson
7d60dc524b - Disable code which allows getnewvnode() to fail. Many ffs_vget() callers
do not correctly deal with failures.  This presently risks deadlock
   problems if dependency processing is held up by failures to allocate
   a vnode, however, this is better than the situation with the failures.

Sponsored by:	Isilon Systems, Inc.
2005-04-22 00:57:05 +00:00
Jeff Roberson
0d12524bbf - Add two KASSERTs to prevent us from recycling a buf that is still on a
bufobj list.

Sponsored by:	Isilon Systems, Inc.
2005-04-22 00:53:20 +00:00
Marcel Moolenaar
1020bb9756 Do not conditionally compile the contents of this file upon whether
HWPMC_HOOKS is defined. The pmc_cpu_is_*() functions in this file
are referenced unconditionally by hwpmc(4).

This is mostly a stop-gap. The pmc_cpu_is*() function should
probably be declared inline in <sys/pmc.h> or <sys/pmckern.h> and
the function pointers with corresponding SX lock should probably
be moved to another file and compiled conditionally upon HWPMC_HOOKS.

Ok'd by: jkoshy@
2005-04-20 20:30:59 +00:00
David Xu
3d5c30f7c2 Inherit signal mask for child process in fork1(), RELENG_4 and other
*BSD have this behaviour, also it is required by POSIX.

PR: kern/80130
Submitted by: Kostik Belousov konstantin.belousov at zoral dot com dot ua
2005-04-20 13:14:52 +00:00
Matthew N. Dodd
96a041b533 Check sopt_level in uipc_ctloutput() and return early if it is non-zero.
This prevents unintended consequnces when an application calls things like
setsockopt(x, SOL_SOCKET, SO_REUSEADDR, ...) on a Unix domain socket.
2005-04-20 02:57:56 +00:00
Pawel Jakub Dawidek
f163441e7e Call g_waitidle() before every check the list of holds is empty.
Suggested by:	phk
2005-04-19 21:44:44 +00:00
David Xu
902c0d8297 Clear P_STATCHILD earlier to avoid unnecessary retrying. 2005-04-19 12:31:15 +00:00
David Xu
407948a530 Oops, forgot to update this file.
Fix a race condition between kern_wait() and thread_stopped().
Problem is in kern_wait(), parent process steps through children list,
once a child process is skipped, and later even if the child is stopped,
parent process still sleeps in msleep(), the race happens if parent
masked SIGCHLD.

Submitted by : Peter Edwards peadar.edwards at gmail dot com
MFC after    : 4 days
2005-04-19 08:11:28 +00:00
David Xu
95992d56f5 Fix a race condition between kern_wait() and thread_stopped().
Problem is in kern_wait(), parent process steps through children list,
once a child process is skipped, and later even if the child is stopped,
parent process still sleeps in msleep(), the race happens if parent
masked SIGCHLD.

Submitted by : Peter Edwards peadar.edwards at gmail dot com
MFC after    : 4 days
2005-04-19 08:07:28 +00:00
Poul-Henning Kamp
d1c712ede2 Call g_waitidle() instead of GEOM using the root_mount_hold() KPI.
GEOM could (and will) get events as a result of drivers coming in
late so a one-shot method is not good enough for GEOM.
2005-04-19 06:23:59 +00:00
Joseph Koshy
ebccf1e3a6 Bring a working snapshot of hwpmc(4), its associated libraries, userland utilities
and documentation into -CURRENT.

Bump FreeBSD_version.

Reviewed by:	alc, jhb (kernel changes)
2005-04-19 04:01:25 +00:00
Poul-Henning Kamp
73fbaa74e5 Add a named reference-count KPI to hold off mounting of the root filesystem.
While we wait for holds to be released, print a list of who holds us
back once per second.

Use the new KPI from GEOM instead of vfs_mount.c calling g_waitidle().

Use the new KPI also from ata.

With ATAmkIII's newbusification, ata could narrowly miss the window
and ad0 would not exist when we tried to mount root.
2005-04-18 21:21:26 +00:00
Poul-Henning Kamp
bdb3564638 Initialize mountlist_mtx with an MTX_SYSINIT(), we need it to be ready
earlier.
2005-04-18 21:11:47 +00:00
Robert Watson
babe9a2bb3 Introduce p_canwait() and MAC Framework and MAC Policy entry points
mac_check_proc_wait(), which control the ability to wait4() specific
processes.  This permits MAC policies to limit information flow from
children that have changed label, although has to be handled carefully
due to common programming expectations regarding the behavior of
wait4().  The cr_seeotheruids() check in p_canwait() is #if 0'd for
this reason.

The mac_stub and mac_test policies are updated to reflect these new
entry points.

Sponsored by:	SPAWAR, SPARTA
Obtained from:	TrustedBSD Project
2005-04-18 13:36:57 +00:00
Robert Watson
8e37dd2bb9 Remove end-of-line tabs.
MFC after:	3 days
2005-04-18 11:51:10 +00:00
David Schultz
fe769cdd95 Add a sysctl that returns the full path of a process' text file.
This information is needed by things like `gdb -p' and Sun's javac,
and previously it could only be obtained via procfs
2005-04-18 02:10:37 +00:00
Robert Watson
7f53207b92 Introduce three additional MAC Framework and MAC Policy entry points to
control socket poll() (select()), fstat(), and accept() operations,
required for some policies:

        poll()          mac_check_socket_poll()
        fstat()         mac_check_socket_stat()
        accept()        mac_check_socket_accept()

Update mac_stub and mac_test policies to be aware of these entry points.
While here, add missing entry point implementations for:

        mac_stub.c      stub_check_socket_receive()
        mac_stub.c      stub_check_socket_send()
        mac_test.c      mac_test_check_socket_send()
        mac_test.c      mac_test_check_socket_visible()

Obtained from:	TrustedBSD Project
Sponsored by:	SPAWAR, SPARTA
2005-04-16 18:46:29 +00:00
Robert Watson
f0c2044bd9 In mac_get_fd(), remove unconditional acquisition of Giant around copying
of the socket label to thread-local storage, and replace it with
conditional acquisition based on debug.mpsafenet.  Acquire the socket
lock around the copy operation.

In mac_set_fd(), replace the unconditional acquisition of Giant with
the conditional acquisition of Giant based on debug.mpsafenet.  The socket
lock is acquired in mac_socket_label_set() so doesn't have to be
acquired here.

Obtained from:	TrustedBSD Project
Sponsored by:	SPAWAR, SPARTA
2005-04-16 18:33:13 +00:00
Marius Strobl
ea35b592d4 Increase default HZ for sparc64 to 1000. 2005-04-16 15:07:41 +00:00
Robert Watson
030a28b3b5 Introduce new MAC Framework and MAC Policy entry points to control the use
of system calls to manipulate elements of the process credential,
including:

        setuid()                mac_check_proc_setuid()
        seteuid()               mac_check_proc_seteuid()
        setgid()                mac_check_proc_setgid()
        setegid()               mac_check_proc_setegid()
        setgroups()             mac_check_proc_setgroups()
        setreuid()              mac_check_proc_setreuid()
        setregid()              mac_check_proc_setregid()
        setresuid()             mac_check_proc_setresuid()
        setresgid()             mac_check_rpoc_setresgid()

MAC checks are performed before other existing security checks; both
current credential and intended modifications are passed as arguments
to the entry points.  The mac_test and mac_stub policies are updated.

Submitted by:	Samy Al Bahra <samy@kerneled.org>
Obtained from:	TrustedBSD Project
2005-04-16 13:29:15 +00:00
Robert Watson
e551d45211 Modify the alq(9) alq_open() API to accept a file creation mode, rather
than defaulting the cmode argument to vn_open() to 0.  Supply a default
argument of ALQ_DEFAULT_CMODE (0600) in current callers.

Discussed with/pointed out by:	hmp
Reveiwed by:	jeff, hmp
MFC after:	3 days
2005-04-16 12:12:27 +00:00
Maxim Konovalov
f305048664 Fix a typo in the comment.
Noticed by:	Samy Al Bahra
2005-04-15 14:01:43 +00:00
John Baldwin
95b66e9e53 Close a race between sleepq_broadcast() and sleepq_catch_signals().
Specifically, sleepq_broadcast() uses td_slpq for its private pending
queue of threads that it is going to wake up after it takes them off the
sleep queue.  The problem is that if one of the threads is actually not
asleep yet, then we can end up with td_slpq being corrupted and/or the
thread being made runnable at the wrong time resulting in the td_sleepqueue
== NULL assertion failures occasionally reported under heavy load.

The fix is to stop being so fancy and ditch the whole pending queue bit.
Instead, sleepq_remove_thread() and sleepq_resume_thread() were merged
into one function that requires the caller to hold sched_lock.  This
fixes several places that unlocked sched_lock only to call a function
that then locked sched_lock, so even though sched_lock is now held
slightly longer, removing the extra lock acquires (1 pair instead of 3
in some cases) probably makes it an overall win if you don't include the
fact that it closes a race.  This is definitely a 5.4 candidate.

PR:		kern/79693
Submitted by:	Steven Sears stevenjsears at yahoo dot com
MFC after:	4 days
2005-04-14 06:30:32 +00:00
Jeff Roberson
74a5123246 - Remove a debugging printf that slipped in.
Spotted by:	Peter Wemm
2005-04-13 23:36:28 +00:00
Tai-hwa Liang
2d4420789d According to the comment in struct tty, t_modem is optional; hence we should
guard against NULL t_modem entry. Otherwise, driver doesn't have t_modem
callback implemented(such like sys/dev/usb/ucycom.c) would panic when
someone opens the driver's associated tty device.

Reviewed by:	phk, sam (mentor)
2005-04-13 13:56:17 +00:00
Jeff Roberson
4585e3ac5a - Change all filesystems and vfs_cache to relock the dvp once the child is
locked in the ISDOTDOT case.  Se vfs_lookup.c r1.79 for details.

Sponsored by:	Isilon Systems, Inc.
2005-04-13 10:59:09 +00:00
Jeff Roberson
374df05fd3 - Change vop_lookup_post assertions to reflect recent vfs_lookup changes.
Sponsored by:	Isilon Systems, Inc.
2005-04-13 10:57:53 +00:00
Jeff Roberson
18ef8344b4 - Further simplify lookup; Force all filesystems to relock in the DOTDOT
case.  There are bugs in some which didn't unlock in the ISDOTDOT case
   to begin with that need to be addressed seperately.  This simplifies
   things anyway.
 - Fix relookup() to prevent it from vrele()'ing the dvp while the vp
   is locked.  Catch up to other lookup changes.

Sponsored by:	Isilon Systems, Inc.
Reported by:	Peter Wemm
2005-04-13 10:57:13 +00:00
Matthew N. Dodd
6a2989fd54 Implement unix(4) socket options LOCAL_CREDS and LOCAL_CONNWAIT.
- Add unp_addsockcred() (for LOCAL_CREDS).
- Add an argument to unp_connect2() to differentiate between
  PRU_CONNECT and PRU_CONNECT2. (for LOCAL_CONNWAIT)

Obtained from:	 NetBSD (with some changes)
2005-04-13 00:01:46 +00:00
Robert Watson
87efd4d58a Consistently style function declarations in kern_malloc.c.
MFC after:	3 days
2005-04-12 23:54:34 +00:00
John Baldwin
aa9aa68d2f Use PCPU_LAZY_INC() for cnt.v_{intr,trap,syscalls} rather than atomic
operations in some places and simple non-per CPU math in others.
2005-04-12 23:18:54 +00:00
Vinod Kashyap
f0c1dee27f The latest release of the FreeBSD driver (twa) for
3ware's 9xxx series controllers.  This corresponds to
the 9.2 release (for FreeBSD 5.2.1) on the 3ware website.

Highlights of this release are:

1. The driver has been re-architected to use a "Common Layer"
    (all tw_cl* files), which is a consolidation of all OS-independent
    parts of the driver.  The FreeBSD OS specific portions of the
    driver go into an "OS Layer" (all tw_osl* files).
    This re-architecture is to achieve better maintainability, consistency
    of behavior across OS's, and better portability to new OS's (drivers
    for new OS's can be written by just adding an OS Layer that's specific
    to the OS, by complying to a "Common Layer Programming Interface" API.

2. The driver takes advantage of multiple processors.

3. The driver has a new firmware image bundled, the new features of which
   include Online Capacity Expansion and multi-lun support, among others.
   More details about 3ware's 9.2 release can be found here:
   http://www.3ware.com/download/Escalade9000Series/9.2/9.2_Release_Notes_Web.pdf

Since the Common Layer is used across OS's, the FreeBSD specific include
path for header files (/sys/dev/twa) is not part of the #include pre-processor
directive in any of the source files.  For being able to integrate twa into
the kernel despite this, Makefile.<arch> has been changed to add the include
path to CFLAGS.

Reviewed by: scottl
2005-04-12 22:07:11 +00:00
Warner Losh
2bd5d8147a resource_list_purge: release the resources in this list, and purge the
elements of this list (eg, reset it).

Man page to follow
2005-04-12 15:20:36 +00:00
Warner Losh
f351862a17 rman_set_device() seems to have been omitted by mistake. Implement it. 2005-04-12 06:21:59 +00:00
Jeff Roberson
0b581232df - Remove unused include. 2005-04-12 05:45:58 +00:00
Jeff Roberson
436901a86b - Differentiate two UPGRADE panics so I have a better idea of what's going
on here.
2005-04-12 05:43:03 +00:00
Warner Losh
cdf7c848cf Return the resource created/found in resource_list_add to avoid an extra
resouce_list_find in some places.

Suggested by: sam
Found by: Coventry Analysis tool.
2005-04-12 04:22:17 +00:00
Jeff Roberson
17c916e321 - Mark the VOPs that require exclusive locks. Those that aren't marked
with E may be called with a shared lock held.  This list really could
   be made per filesystem if we had any filesystems which differed from
   ffs in locking guarantees.  VFS itself is not sensitive to this except
   where vgone() etc. are concerned.

Sponsored by:	Isilon Systems, Inc.
2005-04-11 15:19:29 +00:00
Jeff Roberson
539de9eda0 - Enable ASSERT_VOP_ELOCKED and assert_vop_elocked() now that vnode_if.awk
uses it.

Sponsored by:	Isilon Systems, Inc.
2005-04-11 15:17:06 +00:00
Jeff Roberson
070898b1b3 - Change the VOP_LOCK UPGRADE in vput() to do a LK_NOWAIT to avoid a
potential lock order reversal.  Also, don't unlock the vnode if this
   fails, lockmgr has already unlocked it for us.
 - Restructure vget() now that vn_lock() does all of VI_DOOMED checking
   for us and also handles the case where there is no real lock type.
 - If VI_OWEINACT is set, we need to upgrade the lock request to EXCLUSIVE
   so that we can call inactive.  It's not legal to vget a vnode that hasn't
   had INACTIVE called yet.

Sponsored by:	Isilon Systems, Inc.
2005-04-11 09:28:32 +00:00
Jeff Roberson
1b19c74d73 - Assert that we're no longer doing recursive vn_locks in inactive/reclaim
as I'd like to get rid of the vxthread.
 - Handle lock requests which don't actually want a lock as this is a
   much more convenient place to handle this condition than in vget().
   These requests simply want to know that VI_DOOMED isn't set.
 - Correct a test at the end of vn_lock, if error !=0 should be
   if error == 0, this has been broken since I comitted the VI_DOOMED
   changes, but no one ran into it because vget() duplicated this
   functionality.

Sponsored by:	Isilon Systems, Inc.
2005-04-11 09:23:56 +00:00
Jeff Roberson
836c5b4149 - vput(tvp) before vrele(tdvp) in kern_rename() to avoid lock order issues. 2005-04-11 09:19:08 +00:00
Nate Lawson
8d9134815e Add debugging prints to all the methods in case there are problems with
managing levels.  This can be enabled with the debug.cpufreq.verbose
tunable and sysctl.
2005-04-10 19:11:23 +00:00
David Schultz
f97c3df18d Suspend all other threads in the process while generating a core dump.
The main reason for doing this is that the ELF dump handler expects
the thread list to be fixed while the dump header is generated, so an
upcall that occurs at the wrong time can lead to buffer overruns and
other Bad Things.

Another solution would be to grab sched_lock in the ELF dump handler,
but we might as well single-thread, since the process is about to die.
Furthermore, I think this should ensure that the register sets in the
core file are sequentially consistent.
2005-04-10 02:31:24 +00:00
Pawel Jakub Dawidek
c19618dd7d CDEV lock should be before 'system map' lock.
Hardcode this order to help track down reported LOR.

LOR reported by:	Thierry Herbelot <thierry@herbelot.com>
LOR info:		http://sources.zabbadoz.net/freebsd/lor.html#080
2005-04-09 13:32:01 +00:00