9713 Commits

Author SHA1 Message Date
rwatson
cc2c7c1920 Expore kern.acct_configured, a sysctl that reflects the configured/
unconfigured state of the kernel accounting system.  This is used by
the accounting privilege regression test to determine whether
accounting is in use and will be disrupted by the regression test.

Sponsored by:	nCircle Network Security, Inc.
Obtained from:	TrustedBSD Project
MFC after:	1 month
2006-09-17 11:00:36 +00:00
mohans
a569229408 Fix for a potential bug caught by Coverity. Pointed out to me by Kris Kennaway. 2006-09-14 17:57:02 +00:00
mohans
21daa650a9 Fixes up the handling of shared vnode lock lookups in the NFS client,
adds a FS type specific flag indicating that the FS supports shared
vnode lock lookups, adds some logic in vfs_lookup.c to test this flag
and set lock flags appropriately.

- amd on 6.x is a non-starter (without this change). Using amd under
  heavy load results in a deadlock (with cascading vnode locks all the
  way to the root) very quickly.
- This change should also fix the more general problem of cascading
  vnode deadlocks when an NFS server goes down.

Ideally, we wouldn't need these changes, as enabling shared vnode lock
lookups globally would work. Unfortunately, UFS, for example isn't
ready for shared vnode lock lookups, crashing pretty quickly.

This change is the result of discussions with Stephan Uphoff (ups@).

Reviewed by:	ups@
2006-09-13 18:39:09 +00:00
scottl
ae1ca6fd73 Introduce a spinlock for synchronizing access to the video output hardware
in syscons.  This replaces a simple access semaphore that was assumed to be
protected by Giant but often was not.  If two threads that were otherwise
SMP-safe called printf at the same time, there was a high likelyhood that
the semaphore would get corrupted and result in a permanently frozen video
console.  This is similar to what is already done in the serial console
drivers.
2006-09-13 15:48:15 +00:00
csjp
3927aa4474 Back out one of the Giant removals from revision 1.272. Giant was not here to
protect the vnode, it was present to synchronize access to TTY session
information between exit(2) and the TTY code. While we are here, note that
Giant is required for TTY protection.

Clue from:	bde
Discussed with:	jhb
MFC after:	1 week
2006-09-13 15:47:53 +00:00
pjd
166e57c29a Fix a lock leak in an error case.
Reported by:	netchild
Reviewed by:	rwatson
2006-09-13 06:58:40 +00:00
jhb
c74e70f7a8 - Revert making bus_generic_add_child() the default for BUS_ADD_CHILD().
Instead, we want busses to explicitly specify an add_child routine if they
  want to support identify routines, but by default disallow having outside
  drivers add devices.
- Give smbus(4) an explicit bus_add_child() method.

Requested by:	imp
2006-09-11 22:20:37 +00:00
jhb
7be67f8a93 Add a default method for BUS_ADD_CHILD() that just calls
device_add_child_ordered().  Previously, a device driver that wanted to
add a new child device in its identify routine had to know if the parent
driver had a custom bus_add_child method and use BUS_ADD_CHILD() in that
case, otherwise use device_add_child().  Getting it wrong in either
direction would result in panics or failure to add the child device.  Now,
BUS_ADD_CHILD() always works isolating child drivers from having to know
intimate details about the parent driver.

Discussed with:	imp
MFC after:	1 week
2006-09-11 19:41:31 +00:00
jhb
896ceebabd - Fix rman_manage_region() to be a lot more intelligent. It now checks
for overlaps, but more importantly, it collapses adjacent free regions.
  This is needed to cope with BIOSen that split up ports for system devices
  (like IPMI controllers) across multiple system resource entries.
- Now that rman_manage_region() is not so dumb, remove extra logic in the
  x86 nexus drivers to populate the IRQ rman that manually coalesced the
  regions.

MFC after:	1 week
2006-09-11 19:31:52 +00:00
andre
7478fd14ea New sockets created by incoming connections into listen sockets should
inherit all settings and options except listen specific options.

Add the missing send/receive timeouts and low watermarks.
Remove inheritance of the field so_timeo which is unused.

Noticed by:	phk
Reviewed by:	rwatson
Sponsored by:	TCP/IP Optimization Fundraise 2005
MFC after:	3 days
2006-09-10 17:08:06 +00:00
mbr
eecf512f8f Fix locking race in ttymodem(). The locking of the proctree happens too late
and opens a small race window before tp->t_session->s_leader is accessed. In case
tp->t_session has just been set to NULL elsewhere, we get a panic().

This fix is a bandaid until someone else fixes the whole locking in the tty subsystem.
Definitly more work needs to be done.

MFC after:	1 week
Reviewed by:	mlaier
PR:		kern/103101
2006-09-10 16:51:56 +00:00
rwatson
5eee50ca36 Remove slightly oddly placed suser() call from the KTR/ALQ setup sysctl:
it was present only in the enable path, not the disable path, which one
presumes would be equally of interest.  Either way, it was not needed,
as the sysctl framework already calls suser() if the operation is a
write operation, which configuration requests are.

Sponsored by:	nCircle Network Security, Inc.
2006-09-09 16:09:01 +00:00
jhb
27f742341b Use sysctl_handle_long() instead of duplicating it's logic for
kern.ipc.maxsockbuf so that this sysctl works for 32-bit binaries running
on amd64 via compat/freebsd32.

MFC after:	3 days
2006-09-06 21:59:36 +00:00
mp
f2b0030374 Remove call to fdfree() for the AIO daemons to prevent kernel panics
with linprocfs. This call is not needed since file descriptor sharing
was removed in v1.125.

Reviewed by:	alc, davidxu, ambrisko
MFC after:	3 days
2006-09-06 15:11:20 +00:00
davidxu
3001f9c21f Merge all code of do_lock_normal, do_lock_pi and do_lock_pp into
function do_lock_umutex.
2006-09-05 12:01:09 +00:00
pjd
12baf6e1ec Add 'show vnode <addr>' DDB command. 2006-09-04 22:15:44 +00:00
rwatson
58bc728eee Regenerate for updated audit event identifiers. 2006-09-03 15:11:13 +00:00
rwatson
211375c235 Assign proper audit event identifiers to a number of system calls not
covered in previous passes:

- sysarch, rtprio
- clock_settime
- preadv/pwritev
- __getcwd
- kqueue
- fhstatfs
- kldunloadf

Obtained from:	TrustedBSD Project
2006-09-03 15:10:40 +00:00
rwatson
c5e1a3ed4e Regenerate. 2006-09-03 13:48:48 +00:00
rwatson
3387bd0cd3 Use AUE_NTP_ADJTIME for ntp_adjtime() instead of AUE_ADJTIME.
Obtained from:	TrustedBSD Project
2006-09-03 13:44:21 +00:00
jmg
c25fb06d92 add a newbus method for obtaining the bus's bus_dma_tag_t... This is
required by arches like sparc64 (not yet implemented) and sun4v where there
are seperate IOMMU's for each PCI bus...  For all other arches, it will
end up returning NULL, which makes it a no-op...

Convert a few drivers (the ones we've been working w/ on sun4v) to the
new convection...  Eventually all drivers will need to replace the parent
tag of NULL, w/ bus_get_dma_tag(dev), though dev is usually different for
each driver, and will require hand inspection...

Reviewed by:	scottl (earlier version)
2006-09-03 00:27:42 +00:00
davidxu
b73a4c5234 Check if it is root user in do_unlock_pp. 2006-09-03 00:07:37 +00:00
davidxu
f41d0bcd97 Make sure we get new m_owner value if we can not unlock it in
uncontested case. Reorder statements in do_unlock_umutex.
2006-09-02 02:41:33 +00:00
wsalamon
c62317c442 Audit the argv and env vectors passed in on exec:
Add the argument auditing functions for argv and env.
  Add kernel-specific versions of the tokenizer functions for the
  arg and env represented as a char array.
  Implement the AUDIT_ARGV and AUDIT_ARGE audit policy commands to
  enable/disable argv/env auditing.
  Call the argument auditing from the exec system calls.

Obtained from: TrustedBSD Project
Approved by: rwatson (mentor)
2006-09-01 11:45:40 +00:00
davidxu
14a4be51a4 Reorder some statments. Fix typo and remove stale comments. 2006-08-30 23:59:45 +00:00
davidxu
c4a566770e Update comments about interrupted mutex locking. 2006-08-28 07:09:27 +00:00
davidxu
3cac7b5ddb Regenerate. 2006-08-28 04:28:25 +00:00
davidxu
5a12667fcf This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
        __lwpid_t       m_owner;
        uint32_t        m_flags;
        uint32_t        m_ceilings[2];
        uint32_t        m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.

The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.

No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
marius
e78225039a Fix another bug introduced with rev. 1.204; in vfs_donmount() if
the 'vfs_getopt(optlist, "errmsg", (void **)&errmsg, &errmsg_len)'
call fails, 'errmsg' is left uninitialized, making the later tests
against NULL meaningless, and the uses bogus. Thus initialize
'errmsg' to NULL beforehand. [1]
While at it, remove the superfluous assignment of 0 to 'errmsg_len'
if the above mentioned call fails as it's already initialized to 0.

Submitted by:	Michael Plass [1]
2006-08-26 16:28:19 +00:00
ssouhlal
4fdc09f2f9 The "taskqueue_fast" spinlocks were renamed to "fast_taskqueue" in
subr_taskqueue.c:r1.32

Reported by:	rdivacky
2006-08-26 11:21:25 +00:00
pjd
a2a865527b Fix comment. 2006-08-25 15:13:49 +00:00
davidxu
fa0e6a0558 Same as previous change, the user provided priority should be reversed
too.
2006-08-25 10:05:30 +00:00
davidxu
e2f7e9cc95 Initialize kg_base_user_pri. 2006-08-25 06:29:16 +00:00
davidxu
8e3f035b3f Add user priority loaning code to support priority propagation for
1:1 threading's POSIX priority mutexes, the code is no-op unless
priority-aware umtx code is committed.
2006-08-25 06:12:53 +00:00
marius
604b84193c Fix a bug introduced with rev. 1.204; in vfs_donmount() use
copyout(9) instead of copystr(9) for copying the errmsg from
kernel- to user-space. This fixes a panic on sparc64 when
using the nmount(2)-converted mountd(8).
While at it, use bcopy(3) instead of strncpy(3) in the kernel-
to kernel-space case for consistency with vfs_buildopts() and
between kernel- to user-space and kernel- to kernel-space case.
2006-08-24 18:52:28 +00:00
davidxu
97c67e6558 POSIX requires that higher numerical values for the priority represent
higher priorities, so we should reverse the passed value here.
2006-08-23 07:22:25 +00:00
cperciva
4ec3e871a6 Fix a signedness bug.
MFC after:	3 days
Security:	Local DoS
2006-08-20 10:29:08 +00:00
gnn
b22d0cad1e Fix a kernel panic based on receiving an ICMPv6 Packet too Big message.
PR:		99779
Submitted by:	Jinmei Tatuya
Reviewed by:	clement, rwatson
MFC after:	1 week
2006-08-18 14:05:13 +00:00
peter
b34fae62b6 Grab two syscall numbers. One is used to emulate functionality that linux
has in its procfs (do a readlink of /proc/self/fd/<nn> to find the pathname
that corresponds to a given file descriptor).  Valgrind-3.x needs this
functionality.  This is a placeholder only at this time.
2006-08-16 22:32:50 +00:00
cperciva
9525a1c43e Swap the names "sem_exithook" and "sem_exechook" in the previous commit to
match up with reality and the prototype definitions.

Register the sem_exechook as the "process_exec" event handler, not
sem_exithook.

Submitted by:	rdivacky
Sponsored by:	SoC 2006
2006-08-16 08:25:40 +00:00
jhb
4e96206d8a Add a new 'show sleepchain' ddb command similar to 'show lockchain' except
that it operates on lockmgr and sx locks.  This can be useful for tracking
down vnode deadlocks in VFS for example.  Note that this command is a bit
more fragile than 'show lockchain' as we have to poke around at the
wait channel of a thread to see if it points to either a struct lock or
a condition variable inside of a struct sx.  If td_wchan points to
something unmapped, then this command will terminate early due to a fault,
but no harm will be done.
2006-08-15 18:29:01 +00:00
jhb
e092be0121 - When spinning on a spin lock, if the debugger is active or we are in a
panic, go ahead and do the longer DELAY(1) spin wait.
- If we panic due to spinning too long, print out a few more details
  including the pointer to the mutex in question and the tid of the owning
  thread.
2006-08-15 18:26:12 +00:00
jhb
d900df3c77 Regen to propogate <prefix>_AUE_<mumble> changes as well as the earlier
systrace changes.
2006-08-15 17:37:01 +00:00
jhb
3cbb1cf256 Add a new set of macros <prefix>_AUE_<syscallname> to sysproto.h that
map to the audit event associated with a specific system call.  For
example, SYS_AUE___semctl would be set to AUE_SEMCTL in sys/sysproto.h.
2006-08-15 17:09:32 +00:00
jhb
0d834c1eff - Use NOSTD rather than NOIMPL for nfssvc() to match other syscalls
provided via klds.
- Correct audit identifier for nfssvc().
2006-08-15 16:45:41 +00:00
jhb
c8c91ce0a9 Rename 'show lockchain' to 'show locktree' and 'show threadchain' to
'show lockchain'.  The churn is because I'm about to add a new
'show sleepchain' similar to 'show lockchain' for sleep locks (lockmgr and
sx) and 'show threadchain' was a bit ambiguous as both commands show
a chain of thread dependencies, 'lockchain' is for non-sleepable locks
(mtx and rw) and 'sleepchain' is for sleepable locks.
2006-08-15 16:44:18 +00:00
jhb
9810a59f6d Add a 'show lockmgr' command that dumps the relevant details of a lockmgr
lock.
2006-08-15 16:42:16 +00:00
netchild
b2a39f267a - Change process_exec function handlers prototype to include struct
image_params arg.
- Change struct image_params to include struct sysentvec pointer and
  initialize it.
- Change all consumers of process_exit/process_exec eventhandlers to
  new prototypes (includes splitting up into distinct exec/exit functions).
- Add eventhandler to userret.

Sponsored by:		Google SoC 2006
Submitted by:		rdivacky
Parts suggested by:	jhb (on hackers@)
2006-08-15 12:10:57 +00:00
rwatson
ff74569fb4 Minor white space tweaks. 2006-08-13 23:16:59 +00:00
alc
b3c3e1dc95 Reduce the scope of the page queues lock in vm_pgmoveco() now that
vm_page_sleep_if_busy() no longer requires the page queue lock to be held.

Correctly spell "TRUE".
2006-08-12 19:47:49 +00:00