9530 Commits

Author SHA1 Message Date
davidxu
44261d5f28 Add umtx support for 32bit process on AMD64 machine. 2006-09-22 00:52:54 +00:00
mbr
cdbb778538 Back out rev. 1.258. The real race cause has been fixed
in rev. 1.241 of kern_proc.c.

Requested by:	jhb
2006-09-21 14:09:26 +00:00
rrs
9221681b1a atomic_fetchadd_int is used by mb_free_ext(), but it
returns the previous value that the "add" affected (in
this case we are adding -1), after which we compare it
to '0' to see if we free the mbuf; we should
be comparing it to '1'. Note that this only has an effect
when there is contention, since there is a first part
to the comparison that checks to see if it's '1'. So
this bug would only crop up if two CPUs are trying
to free the same mbuf refcount at the same time. This
will happen in SCTP, but I doubt it can happen in TCP or
UDP.
PR:		N/A
Submitted by:	rrs
Reviewed by:	gnn,sam
Approved by:	gnn,sam
2006-09-21 09:55:43 +00:00
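
The comparison issue described in commit 9221681b1a can be illustrated with a minimal userland sketch. It uses C11 `atomic_fetch_add` as a stand-in for the kernel's atomic_fetchadd_int; the `ext_ref` type and `ext_ref_release()` function are hypothetical names for illustration and are not the mbuf code itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an mbuf external-storage reference count. */
typedef struct {
    atomic_int refcnt;
} ext_ref;

/*
 * Drop one reference and report whether the caller must free the storage.
 * atomic_fetch_add() returns the value held *before* the addition, so the
 * caller that observes 1 is the one that moved the count to 0. Comparing
 * the returned value against 0 would miss that last release.
 */
static bool ext_ref_release(ext_ref *r)
{
    int prev = atomic_fetch_add(&r->refcnt, -1);

    assert(prev >= 1);  /* releasing with no reference held is a caller bug */
    return prev == 1;
}
```

With two references held, the first release returns false and the second returns true, regardless of which CPU performs which release.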
davidxu
92bd1e76b1 Regenerate. 2006-09-21 04:19:48 +00:00
davidxu
bac7c2b79d Replace the system calls thr_getscheduler, thr_setscheduler and
thr_setschedparam with rtprio_thread. While the rtprio system call is for
processes only, the new system call rtprio_thread is responsible for LWPs.
2006-09-21 04:18:46 +00:00
rwatson
41fdf2d481 Remove MAC_DEBUG + MPRINTF debugging from System V IPC. This no longer
appears to be serving a useful purpose, as it was used during initial
development of MAC support for System V IPC.

MFC after:	1 month
Obtained from:	TrustedBSD Project
Suggested by:	Christopher dot Vance at SPARTA dot com
2006-09-20 13:40:00 +00:00
rwatson
5cab05d889 Remove MAC_DEBUG label counters, which were used to debug leaks and
other problems while labels were first being added to various kernel
objects.  They have outlived their usefulness.

MFC after:	1 month
Suggested by:	Christopher dot Vance at SPARTA dot com
Obtained from:	TrustedBSD Project
2006-09-20 13:33:41 +00:00
pjd
32e23e90df There is no need to set 'sp' to NULL anymore. 2006-09-20 07:27:05 +00:00
tegge
f5b67318ae Copy stat information from mount structure before it can change identity. 2006-09-20 00:32:07 +00:00
tegge
61b02921e7 Don't try to obtain a reference to a nonexisting (NULL) mount structure in
default VOP_GETWRITEMOUNT().
2006-09-20 00:27:02 +00:00
mbr
fc7fc7fa1c Fix races between tty.c and sessrele() / doenterpgrp() / leavepgrp(). The tty
code is still under giant lock, but the session/pgrp release code just used
proctree_locks. This explains why moving the proctree_lock in sys/kern/tty.c
rev. 1.258 did fix the panics in our SMP systems.

This should also fix some race panics with revoked ttys.

Reviewed by:	jhb
MFC after:	1 week
2006-09-19 19:25:11 +00:00
kib
cf9722d790 Fix the bug in rev. 1.232. If vfs_suser returned false, coveredvp shall be
unlocked only if it really exists.

Found with:	Coverity Prevent(tm)
CID:	1535
Approved by:	pjd (mentor)
2006-09-19 14:04:12 +00:00
kib
e98552d0d9 Fix the race while waiting for coveredvp lock during unmount. The vnode may
be recycled during the sleep, so wrap the vn_lock with vhold/vdrop.
Check that coveredvp still points to the same mp after sleep (needed
because sleep dropped Giant).
Move check for user rights for unmount after coveredvp lock is obtained.

Tested by:	Peter Holm
Reviewed by:	tegge
Approved by:	kan (mentor)
MFC after:	2 weeks
2006-09-18 15:35:22 +00:00
rwatson
8b3f7ca1ce Declare security and security.bsd sysctl hierarchies in sysctl.h along
with other commonly used sysctl name spaces, rather than declaring them
all over the place.

MFC after:	1 month
Sponsored by:	nCircle Network Security, Inc.
2006-09-17 20:00:36 +00:00
andre
710a642f7a Remove VLAN mtag UMA zones and initialize ether_vtag and tso_segsz packet
header fields to zero on mbuf allocation.

Sponsored by:	TCP/IP Optimization Fundraise 2005
2006-09-17 13:44:32 +00:00
rwatson
9f40438221 Regenerate. 2006-09-17 13:29:36 +00:00
rwatson
f50a5f19fb AUE_SIGALTSTACK instead of AUE_SIGPENDING for sigaltstack().
Obtained from:	TrustedBSD Project
MFC after:	3 days
2006-09-17 13:28:11 +00:00
rwatson
cc2c7c1920 Expose kern.acct_configured, a sysctl that reflects the configured/
unconfigured state of the kernel accounting system.  This is used by
the accounting privilege regression test to determine whether
accounting is in use and will be disrupted by the regression test.

Sponsored by:	nCircle Network Security, Inc.
Obtained from:	TrustedBSD Project
MFC after:	1 month
2006-09-17 11:00:36 +00:00
mohans
a569229408 Fix for a potential bug caught by Coverity. Pointed out to me by Kris Kennaway. 2006-09-14 17:57:02 +00:00
mohans
21daa650a9 Fixes up the handling of shared vnode lock lookups in the NFS client,
adds a FS type specific flag indicating that the FS supports shared
vnode lock lookups, adds some logic in vfs_lookup.c to test this flag
and set lock flags appropriately.

- amd on 6.x is a non-starter (without this change). Using amd under
  heavy load results in a deadlock (with cascading vnode locks all the
  way to the root) very quickly.
- This change should also fix the more general problem of cascading
  vnode deadlocks when an NFS server goes down.

Ideally, we wouldn't need these changes, as enabling shared vnode lock
lookups globally would work. Unfortunately, UFS, for example, isn't
ready for shared vnode lock lookups and crashes pretty quickly.

This change is the result of discussions with Stephan Uphoff (ups@).

Reviewed by:	ups@
2006-09-13 18:39:09 +00:00
scottl
ae1ca6fd73 Introduce a spinlock for synchronizing access to the video output hardware
in syscons.  This replaces a simple access semaphore that was assumed to be
protected by Giant but often was not.  If two threads that were otherwise
SMP-safe called printf at the same time, there was a high likelihood that
the semaphore would get corrupted and result in a permanently frozen video
console.  This is similar to what is already done in the serial console
drivers.
2006-09-13 15:48:15 +00:00
csjp
3927aa4474 Back out one of the Giant removals from revision 1.272. Giant was not here to
protect the vnode, it was present to synchronize access to TTY session
information between exit(2) and the TTY code. While we are here, note that
Giant is required for TTY protection.

Clue from:	bde
Discussed with:	jhb
MFC after:	1 week
2006-09-13 15:47:53 +00:00
pjd
166e57c29a Fix a lock leak in an error case.
Reported by:	netchild
Reviewed by:	rwatson
2006-09-13 06:58:40 +00:00
jhb
c74e70f7a8 - Revert making bus_generic_add_child() the default for BUS_ADD_CHILD().
  Instead, we want busses to explicitly specify an add_child routine if they
  want to support identify routines, but by default disallow having outside
  drivers add devices.
- Give smbus(4) an explicit bus_add_child() method.

Requested by:	imp
2006-09-11 22:20:37 +00:00
jhb
7be67f8a93 Add a default method for BUS_ADD_CHILD() that just calls
device_add_child_ordered().  Previously, a device driver that wanted to
add a new child device in its identify routine had to know if the parent
driver had a custom bus_add_child method and use BUS_ADD_CHILD() in that
case, otherwise use device_add_child().  Getting it wrong in either
direction would result in panics or failure to add the child device.  Now,
BUS_ADD_CHILD() always works, isolating child drivers from having to know
intimate details about the parent driver.

Discussed with:	imp
MFC after:	1 week
2006-09-11 19:41:31 +00:00
jhb
896ceebabd - Fix rman_manage_region() to be a lot more intelligent. It now checks
  for overlaps, but more importantly, it collapses adjacent free regions.
  This is needed to cope with BIOSen that split up ports for system devices
  (like IPMI controllers) across multiple system resource entries.
- Now that rman_manage_region() is not so dumb, remove extra logic in the
  x86 nexus drivers to populate the IRQ rman that manually coalesced the
  regions.

MFC after:	1 week
2006-09-11 19:31:52 +00:00
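
The overlap check and the collapsing of adjacent regions described in commit 896ceebabd can be sketched as follows. This is only an illustration over a sorted array, not the rman(9) implementation; `sk_region` and `sk_coalesce()` are hypothetical names, and the inclusive start/end representation is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical inclusive resource range, e.g. a span of I/O ports. */
typedef struct {
    unsigned long start;
    unsigned long end;
} sk_region;

/*
 * Collapse a start-sorted array of regions in place.
 * Regions that touch (prev.end + 1 == next.start) are merged into one.
 * Returns the new region count, or -1 if two regions overlap.
 */
static int sk_coalesce(sk_region *r, size_t n)
{
    size_t out = 0;

    if (n == 0)
        return 0;
    for (size_t i = 1; i < n; i++) {
        assert(r[i].start <= r[i].end);
        if (r[i].start <= r[out].end)
            return -1;              /* overlap: reject */
        if (r[i].start == r[out].end + 1)
            r[out].end = r[i].end;  /* adjacent: extend previous region */
        else
            r[++out] = r[i];        /* gap: keep as a separate region */
    }
    return (int)(out + 1);
}
```

For instance, a device whose ports are reported as 0x60, 0x61-0x63 and 0x64 in separate entries ends up as the single region 0x60-0x64.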
andre
7478fd14ea New sockets created by incoming connections into listen sockets should
inherit all settings and options except listen specific options.

Add the missing send/receive timeouts and low watermarks.
Remove inheritance of the field so_timeo which is unused.

Noticed by:	phk
Reviewed by:	rwatson
Sponsored by:	TCP/IP Optimization Fundraise 2005
MFC after:	3 days
2006-09-10 17:08:06 +00:00
mbr
eecf512f8f Fix locking race in ttymodem(). The locking of the proctree happens too late
and opens a small race window before tp->t_session->s_leader is accessed. In case
tp->t_session has just been set to NULL elsewhere, we get a panic().

This fix is a bandaid until someone else fixes the whole locking in the tty subsystem.
Definitely more work needs to be done.

MFC after:	1 week
Reviewed by:	mlaier
PR:		kern/103101
2006-09-10 16:51:56 +00:00
rwatson
5eee50ca36 Remove slightly oddly placed suser() call from the KTR/ALQ setup sysctl:
it was present only in the enable path, not the disable path, which one
presumes would be equally of interest.  Either way, it was not needed,
as the sysctl framework already calls suser() if the operation is a
write operation, which configuration requests are.

Sponsored by:	nCircle Network Security, Inc.
2006-09-09 16:09:01 +00:00
jhb
27f742341b Use sysctl_handle_long() instead of duplicating its logic for
kern.ipc.maxsockbuf so that this sysctl works for 32-bit binaries running
on amd64 via compat/freebsd32.

MFC after:	3 days
2006-09-06 21:59:36 +00:00
mp
f2b0030374 Remove call to fdfree() for the AIO daemons to prevent kernel panics
with linprocfs. This call is not needed since file descriptor sharing
was removed in v1.125.

Reviewed by:	alc, davidxu, ambrisko
MFC after:	3 days
2006-09-06 15:11:20 +00:00
davidxu
3001f9c21f Merge all code of do_lock_normal, do_lock_pi and do_lock_pp into
function do_lock_umutex.
2006-09-05 12:01:09 +00:00
pjd
12baf6e1ec Add 'show vnode <addr>' DDB command. 2006-09-04 22:15:44 +00:00
rwatson
58bc728eee Regenerate for updated audit event identifiers. 2006-09-03 15:11:13 +00:00
rwatson
211375c235 Assign proper audit event identifiers to a number of system calls not
covered in previous passes:

- sysarch, rtprio
- clock_settime
- preadv/pwritev
- __getcwd
- kqueue
- fhstatfs
- kldunloadf

Obtained from:	TrustedBSD Project
2006-09-03 15:10:40 +00:00
rwatson
c5e1a3ed4e Regenerate. 2006-09-03 13:48:48 +00:00
rwatson
3387bd0cd3 Use AUE_NTP_ADJTIME for ntp_adjtime() instead of AUE_ADJTIME.
Obtained from:	TrustedBSD Project
2006-09-03 13:44:21 +00:00
jmg
c25fb06d92 Add a newbus method for obtaining the bus's bus_dma_tag_t... This is
required by arches like sparc64 (not yet implemented) and sun4v where there
are separate IOMMUs for each PCI bus...  For all other arches, it will
end up returning NULL, which makes it a no-op...

Convert a few drivers (the ones we've been working w/ on sun4v) to the
new convention...  Eventually all drivers will need to replace the parent
tag of NULL w/ bus_get_dma_tag(dev), though dev is usually different for
each driver, and will require hand inspection...

Reviewed by:	scottl (earlier version)
2006-09-03 00:27:42 +00:00
davidxu
b73a4c5234 Check if it is root user in do_unlock_pp. 2006-09-03 00:07:37 +00:00
davidxu
f41d0bcd97 Make sure we get the new m_owner value if we cannot unlock it in
the uncontested case. Reorder statements in do_unlock_umutex.
2006-09-02 02:41:33 +00:00
wsalamon
c62317c442 Audit the argv and env vectors passed in on exec:
  Add the argument auditing functions for argv and env.
  Add kernel-specific versions of the tokenizer functions for the
  arg and env represented as a char array.
  Implement the AUDIT_ARGV and AUDIT_ARGE audit policy commands to
  enable/disable argv/env auditing.
  Call the argument auditing from the exec system calls.

Obtained from: TrustedBSD Project
Approved by: rwatson (mentor)
2006-09-01 11:45:40 +00:00
davidxu
14a4be51a4 Reorder some statements. Fix typo and remove stale comments. 2006-08-30 23:59:45 +00:00
davidxu
c4a566770e Update comments about interrupted mutex locking. 2006-08-28 07:09:27 +00:00
davidxu
3cac7b5ddb Regenerate. 2006-08-28 04:28:25 +00:00
davidxu
5a12667fcf This is the initial version of POSIX priority mutex support. A new
userland mutex structure is added as follows:
struct umutex {
        __lwpid_t       m_owner;
        uint32_t        m_flags;
        uint32_t        m_ceilings[2];
        uint32_t        m_spare[4];
};
The m_owner field represents the owner thread; it is a thread id. In the
non-contested case, userland can simply use atomic_cmpset_int to lock the
mutex. If the mutex is contested, the high-order bit will be set, and userland
should do locking and unlocking via a kernel syscall. Flag UMUTEX_PRIO_INHERIT
represents pthread's PTHREAD_PRIO_INHERIT mutex; when contention happens, the
kernel should do priority propagation. Flag UMUTEX_PRIO_PROTECT indicates
pthread's PTHREAD_PRIO_PROTECT mutex. Userland should initialize m_owner
to the contested state UMUTEX_CONTESTED, so that atomic_cmpset_int fails
and the kernel syscall is invoked to do the locking. This is because,
for such a mutex, the kernel should always boost the thread's priority before
it can lock the mutex. m_ceilings is used by the PTHREAD_PRIO_PROTECT mutex:
the first element is used to boost the thread's priority when it locks the
mutex, and the second element is used when the mutex is unlocked. The
PTHREAD_PRIO_PROTECT mutex's linked list is kept in userland, and m_ceilings[1]
is managed by the thread library so the kernel needn't allocate memory to keep
the linked list. When such a mutex is unlocked, the kernel resets m_owner to
UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicates whether the synchronization object is
process shared; if the flag is not set, it saves a vm_map_lookup() call.

The umtx chain is still used as a sleep queue. When a thread is blocked on a
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagation; it is dynamically allocated and reference counted.
It is not optimized but works well in my tests. While the umtx chain has
its own locking protocol, the priority propagation protocol is entirely
protected by sched_lock, because the priority propagation function is called
with sched_lock held from the scheduler.

No visible performance degradation is found with these changes. Some parameter
names in the _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
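
The uncontested fast path described in commit 5a12667fcf can be sketched in userland C. This is a hedged illustration, not the libthr or kernel code: C11 `atomic_compare_exchange_strong` stands in for atomic_cmpset_int, the `sk_` names are hypothetical, and the numeric values of the unowned and contested states are assumptions. The fallback into the kernel syscall is left to the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_UNOWNED   0u           /* assumed value for "no owner" */
#define SK_CONTESTED 0x80000000u  /* assumed value: high-order bit set */

/* Reduced stand-in for struct umutex: only the owner word is modelled. */
typedef struct {
    _Atomic uint32_t m_owner;
} sk_umutex;

/*
 * Try to take the mutex without entering the kernel.
 * Succeeds only if the owner word is exactly "unowned"; a contested or
 * owned word makes the compare-and-set fail, and the caller would then
 * fall back to the syscall (not shown).
 */
static bool sk_trylock_fast(sk_umutex *m, uint32_t tid)
{
    uint32_t expected = SK_UNOWNED;

    assert((tid & SK_CONTESTED) == 0);  /* thread ids must not use the flag bit */
    return atomic_compare_exchange_strong(&m->m_owner, &expected, tid);
}

/* Release without the kernel; fails if we are not the sole, uncontested owner. */
static bool sk_unlock_fast(sk_umutex *m, uint32_t tid)
{
    uint32_t expected = tid;

    return atomic_compare_exchange_strong(&m->m_owner, &expected, SK_UNOWNED);
}
```

Initializing the owner word to the contested state, as the message describes for PTHREAD_PRIO_PROTECT mutexes, makes `sk_trylock_fast()` fail every time, which forces each lock attempt through the kernel.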
marius
e78225039a Fix another bug introduced with rev. 1.204; in vfs_donmount() if
the 'vfs_getopt(optlist, "errmsg", (void **)&errmsg, &errmsg_len)'
call fails, 'errmsg' is left uninitialized, making the later tests
against NULL meaningless, and the uses bogus. Thus initialize
'errmsg' to NULL beforehand. [1]
While at it, remove the superfluous assignment of 0 to 'errmsg_len'
if the above mentioned call fails as it's already initialized to 0.

Submitted by:	Michael Plass [1]
2006-08-26 16:28:19 +00:00
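
The initialization fix in commit e78225039a follows a common pattern, sketched here in plain C. `sk_getopt()` is a hypothetical lookup that only mimics the behaviour the message attributes to vfs_getopt() (on failure the output pointer is left untouched); it is not the kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical option lookup: on success stores a buffer and its length
 * and returns 0; on failure returns nonzero and writes nothing.
 */
static int sk_getopt(const char *name, void **buf, int *len)
{
    static char storage[64];

    if (strcmp(name, "errmsg") != 0)
        return 2;  /* not found: *buf and *len are left as they were */
    *buf = storage;
    *len = (int)sizeof(storage);
    return 0;
}

/* Returns 1 if an error-message buffer is available, 0 otherwise. */
static int sk_has_errmsg(const char *name)
{
    void *errmsg = NULL;  /* initialised first, so the NULL test below is meaningful */
    int errmsg_len = 0;

    (void)sk_getopt(name, &errmsg, &errmsg_len);
    assert(errmsg != NULL || errmsg_len == 0);
    return errmsg != NULL;
}
```

Without the `NULL` initializer, a failed lookup would leave `errmsg` holding an indeterminate value and the later test against `NULL` would be meaningless.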
ssouhlal
4fdc09f2f9 The "taskqueue_fast" spinlocks were renamed to "fast_taskqueue" in
subr_taskqueue.c:r1.32

Reported by:	rdivacky
2006-08-26 11:21:25 +00:00
pjd
a2a865527b Fix comment. 2006-08-25 15:13:49 +00:00
davidxu
fa0e6a0558 Same as previous change, the user provided priority should be reversed
too.
2006-08-25 10:05:30 +00:00
davidxu
e2f7e9cc95 Initialize kg_base_user_pri. 2006-08-25 06:29:16 +00:00