9714 Commits

Author SHA1 Message Date
delphij
2e20bff54b Use FOREACH_PROC_IN_SYSTEM instead of using its unrolled form. 2007-01-17 14:58:53 +00:00
ssouhlal
d4434aa6e9 Remove hptlock from the static witness table, now that it's a regular sleep
mutex.
2007-01-16 22:56:28 +00:00
rrs
e614960c33 Removes useless (flags | ) KASSERT. The ^ one that actually
does what we want.

Submitted by:	Li Xin delphij@delphij.net
Reviewed by:	rrs
Approved by:	gnn
2007-01-16 11:40:55 +00:00
kmacy
4da320a732 Fix warning by adding extra parentheses 2007-01-16 00:09:58 +00:00
rrs
af870dbd2e Reviewed by: rwatson
Approved by:	gnn

Add a new function hashinit_flags() which allows NOT-waiting
for memory (or waiting). The old hashinit() function now
calls hashinit_flags(..., HASH_WAITOK);
2007-01-15 15:06:28 +00:00
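
The wrapper relationship described in this commit can be sketched in user space. This is only an illustration under stated assumptions: the `sketch_` names and flag values are hypothetical, the table sizing rule (largest power of two not exceeding `elements`) is assumed, and `calloc()` stands in for the kernel allocator, so the "wait for memory" case is modelled by aborting rather than sleeping.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical flag values; the kernel's HASH_* constants may differ. */
enum sketch_hash_flags { SKETCH_HASH_NOWAIT = 1, SKETCH_HASH_WAITOK = 2 };

/*
 * Allocate a bucket table and report its index mask through *hashmask.
 * Returns NULL only when the caller asked not to wait and allocation failed.
 */
static void **
sketch_hashinit_flags(int elements, unsigned long *hashmask, int flags)
{
	unsigned long size;
	void **table;

	assert(elements > 0);
	/* Exactly one of the two flags must be given. */
	assert(flags == SKETCH_HASH_NOWAIT || flags == SKETCH_HASH_WAITOK);

	/* Assumed sizing rule: largest power of two <= elements. */
	for (size = 1; size <= (unsigned long)elements; size <<= 1)
		continue;
	size >>= 1;

	table = calloc(size, sizeof(*table));
	if (table == NULL) {
		/* User space cannot sleep for memory, so WAITOK aborts here. */
		if (flags == SKETCH_HASH_WAITOK)
			abort();
		return (NULL);
	}
	*hashmask = size - 1;
	return (table);
}

/* The old entry point becomes a thin wrapper that always waits. */
static void **
sketch_hashinit(int elements, unsigned long *hashmask)
{
	return (sketch_hashinit_flags(elements, hashmask, SKETCH_HASH_WAITOK));
}
```

Existing callers keep calling the old name unchanged, while new callers that cannot sleep pass the no-wait flag and check for NULL.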
rwatson
3dd2666ab7 Re-wrap comments to wider margins now that they have been relocated from
within functions.
2007-01-12 22:01:03 +00:00
imp
faa5da2dc1 When ntp_gettime() was converted from a sysctl + wrapper to a system
call, its semantics were unintentionally changed.  It went from
returning the time state to returning 0 or -1.  Since 0 means time
normal, and non-zero effectively only shows up around leap seconds,
this went unnoticed until now.  At least unnoticed until someone was
trying to run a binary they didn't have source for and it was
misbehaving...

Submitted by: Judah Levine
MFC After: 2 weeks
2007-01-12 07:40:30 +00:00
jhb
496f904eab Wrap propagate_priority() in a critical section to prevent unwanted
preemptions when adjusting the priority of a thread that is on a run
queue.  This was only observed when FULL_PREEMPTION was enabled.

Reported by:	kris
Diagnosed by:	ups
MFC after:	1 week
2007-01-11 19:13:27 +00:00
rwatson
37fe9cfef4 Sort copyrights together.
MFC after:	3 days
2007-01-08 20:37:02 +00:00
rwatson
0ef16b090a Re-sort copyrights and licenses in kern_acct.c: per UCB letter,
the UCB license now excludes the advertising clause.  I'm not
interested in it either, so move my copyright.  This leaves
only a CGD copyright with the advertising clause.

MFC after:      3 days
2007-01-08 20:35:13 +00:00
rwatson
fd7dad9d6e Canonicalize copyrights in some files I hold copyrights on:
- Sort by date in license blocks, oldest copyright first.
- All rights reserved after all copyrights, not just the first.
- Use (c) to be consistent with other entries.

MFC after:	3 days
2007-01-08 17:49:59 +00:00
jeff
0f9511e94e - Don't let SCHED_TICK_TOTAL() return less than hz. This can cause integer
divide faults in roundup() later if it is able to return 0.  For some
   reason this bug only shows up on my laptop and not my testboxes.
2007-01-06 12:33:43 +00:00
jeff
b19ed2c7b0 - Fix the sched_priority() invalid priority bugs. Use roundup() instead
of max() when computing the divisor in SCHED_TICK_PRI().  This prevents
   cases where rounding down would allow the quotient to exceed
   SCHED_PRI_RANGE.
 - Garbage collect some unused flags and fields.
 - Replace TDF_HOLD with sched_pin_td()/sched_unpin_td() since it simply
   duplicated this functionality.
 - Re-enable the rebalancer by default and fix the sysctl so it can be
   modified.
2007-01-06 08:44:13 +00:00
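
The arithmetic behind the two divisor fixes (clamping the tick window to at least hz, and using roundup() rather than max() for the divisor) can be checked with a small user-space sketch. All names and constants here are hypothetical stand-ins rather than the kernel's macros, and the max()-based form is only an assumed reconstruction of the earlier computation.

```c
#include <assert.h>

/* Hypothetical stand-ins for hz and SCHED_PRI_RANGE. */
enum { SKETCH_HZ = 1000, SKETCH_PRI_RANGE = 64 };

static int
sketch_max(int a, int b)
{
	return (a > b ? a : b);
}

/* Round x up to the next multiple of y. */
static int
sketch_roundup(int x, int y)
{
	return (((x + y - 1) / y) * y);
}

/* Tick window length, clamped so it can never be 0 (or below hz). */
static int
sketch_tick_total(int first_tick, int last_tick)
{
	return (sketch_max(last_tick - first_tick, SKETCH_HZ));
}

/* Assumed older form: max() plus truncating division for the divisor. */
static int
sketch_pri_max(int ticks_run, int total)
{
	return (ticks_run /
	    (sketch_max(total, SKETCH_PRI_RANGE) / SKETCH_PRI_RANGE));
}

/* Newer form: roundup() keeps the quotient within the priority range. */
static int
sketch_pri_roundup(int ticks_run, int total)
{
	int divisor;

	divisor = sketch_roundup(total, SKETCH_PRI_RANGE) / SKETCH_PRI_RANGE;
	assert(divisor > 0);	/* holds because total >= SKETCH_HZ > 0 */
	return (ticks_run / divisor);
}
```

With a window of 1000 ticks, the truncating divisor is 15 and a fully busy thread scores 66, which is outside a range of 64; the rounded-up divisor is 16 and the same thread scores 62.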
jeff
80a97a8d5e - Don't IPI unless we're going to interrupt something exiting in the kernel.
Otherwise we can afford the latency.  This makes a significant performance
   improvement.
2007-01-06 02:34:23 +00:00
jeff
08aa1698e7 - Fix a comparison in sched_choose() that caused cpus to be constantly
marked idle, thus breaking cpu load balancing.
 - Change sched_interact_update() to fix cases where the stored history
   has expanded significantly rather than handling them in the callers.  This
   fixes a case where sched_priority() could compute a bad value.
 - Add a sysctl to disable the global load balancer for experimentation.
2007-01-05 23:45:38 +00:00
jhb
256d3cdbaf - Close a race between enumerating UNIX domain socket pcb structures via
sysctl and socket teardown by adding a reference count to the UNIX domain
  pcb object and fixing the sysctl that enumerates unpcbs to grab a
  reference on each unpcb while it builds the list to copy out to userland.
- Close a race between UNIX domain pcb garbage collection (unp_gc()) and
  file descriptor teardown (fdrop()) by adding a new garbage collection
  flag FWAIT.  unp_gc() sets FWAIT while it walks the message buffers
  in a UNIX domain socket looking for nested file descriptor references
  and clears the flag when it is finished.  fdrop() checks to see if the
  flag is set on a file descriptor whose refcount just dropped to 0 and
  waits for unp_gc() to clear the flag before completely destroying the
  file descriptor.

MFC after:	1 week
Reviewed by:	rwatson
Submitted by:	ups
Hopefully makes the panics go away:	mx1
2007-01-05 19:59:46 +00:00
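
The reference-counting idea in the first item can be sketched as follows. This is a single-threaded illustration only: the struct and function names are hypothetical, a `freed` flag stands in for releasing memory, and the locking that the real code needs around the counter is omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a pcb object protected by a reference count. */
struct sketch_pcb {
	int	refcount;	/* one reference is held by the owning socket */
	bool	freed;		/* models the memory having been released */
};

/* Take an extra reference, e.g. while an enumerator builds its list. */
static void
sketch_ref(struct sketch_pcb *p)
{
	assert(!p->freed);
	p->refcount++;
}

/* Drop a reference; returns true if this was the last one. */
static bool
sketch_rele(struct sketch_pcb *p)
{
	assert(p->refcount > 0);
	if (--p->refcount == 0) {
		p->freed = true;
		return (true);
	}
	return (false);
}
```

If teardown drops the socket's reference while an enumerator still holds its own, the object survives until the enumerator releases it.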
jeff
47d8080afa - ftick was initialized to -1 for init and any of its children. Fix this by
setting ftick = ltick = ticks in schedinit().
 - Update the priority when we are pulled off of the run queue and when we
   are inserted onto the run queue so that it more accurately reflects our
   present status.  This is important for efficient priority propagation
   functioning.
 - Move the frequency test into sched_pctcpu_update() so we don't repeat it
   each time we'd like to call it.
 - Put some temporary work-around code in sched_priority() in case the tick
   mechanism produces a bad priority.  Eventually this should revert to an
   assert again.
2007-01-05 08:50:38 +00:00
jeff
ae96850d62 - Only allow the tdq_idx to increase by one each tick rather than up to
the most recently chosen index.  This significantly improves nice
   behavior.  This allows a lower priority thread to run some multiple of
   times before the higher priority thread makes it to the front of
   the queue.  A nice +20 cpu hog now only gets ~5% of the cpu when running
   with a nice 0 cpu hog and about 1.5% with a nice -20 hog.  A nice
   difference of 1 makes a 4% difference in cpu usage between two hogs.
 - Track a separate insert and removal index.  When the removal index is
   empty it is updated to point at the current insert index.
 - Don't remove and re-add a thread to the runq when it is being adjusted
   down in priority.
 - Pull some conditional code out of sched_tick().  It's looking a bit
   large now.
2007-01-04 12:16:19 +00:00
jeff
60a15a9f22 - Don't pass a pointer into runq_choose_from(). The caller can adjust the
index if it chooses to.
2007-01-04 12:10:58 +00:00
jeff
2c3282f28a ULE 2.0:
- Remove the double queue mechanism for timeshare threads.  It was slow
   due to excess cache lines in play, caused suboptimal scheduling behavior
   with niced and other non-interactive processes, complicated priority
   lending, etc.
 - Use a circular queue with a floating starting index for timeshare threads.
   Enforces fairness by moving the insertion point closer to threads with
   worse priorities over time.
 - Give interactive timeshare threads real-time user-space priorities and
   place them on the realtime/ithd queue.
 - Select non-interactive timeshare thread priorities based on their cpu
   utilization over the last 10 seconds combined with the nice value.  This
   gives us more sane priorities and behavior in a loaded system as
   compared to the old method of using the interactivity score.  The
   interactive score quickly hit a ceiling if threads were non-interactive
   and penalized new hog threads.
 - Use one slice size for all threads.  The slice is not currently
   dynamically set to adjust scheduling behavior of different threads.
 - Add some new sysctls for scheduling parameters.

Bug fixes/Clean up:
 - Fix zeroing of td_sched after initialization in sched_fork_thread() caused
   by recent ksegrp removal.
 - Fix KSE interactivity issues related to frequent forking and exiting of
   kse threads.  We simply disable the penalty for thread creation and exit
   for kse threads.
 - Cleanup the cpu estimator by using tickincr here as well.  Keep ticks and
   ltick/ftick in the same frequency.  Previously ticks were stathz and
   others were hz.
 - Lots of new and updated comments.
 - Many many others.

Tested on:	up x86/amd64, 8way amd64.
2007-01-04 08:56:25 +00:00
jeff
78c3275ce1 - Add three new functions to support circular run queues.
- runq_add_pri allows the caller to position the thread at any rqindex
   regardless of priority.
 - runq_choose_from() chooses the lowest priority thread starting from a given
   index.  The index is updated with the rqindex of the chosen thread.  This
   routine is used to pick the lowest priority relative to a given index.
 - runq_remove_idx() updates the index if the run queue that held the removed
   thread is now empty.
2007-01-04 08:39:58 +00:00
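
A circular run queue of this kind can be modelled in a few lines. The sketch below is an assumption-laden simplification: the `sketch_` names and the queue count are hypothetical, each slot only counts threads instead of holding a list, and the chooser returns the index rather than updating it through a pointer.

```c
#include <assert.h>

enum { SKETCH_NQS = 8 };	/* hypothetical number of queue slots */

struct sketch_runq {
	int	depth[SKETCH_NQS];	/* threads parked at each index */
};

/* Place a thread at an arbitrary index, regardless of its priority. */
static void
sketch_add_pri(struct sketch_runq *rq, int idx)
{
	assert(idx >= 0 && idx < SKETCH_NQS);
	rq->depth[idx]++;
}

/* Scan circularly from 'start'; return the first non-empty index or -1. */
static int
sketch_choose_from(const struct sketch_runq *rq, int start)
{
	int i, idx;

	for (i = 0; i < SKETCH_NQS; i++) {
		idx = (start + i) % SKETCH_NQS;
		if (rq->depth[idx] > 0)
			return (idx);
	}
	return (-1);
}

/* Remove a thread; advance the cursor if its slot just became empty. */
static void
sketch_remove_idx(struct sketch_runq *rq, int idx, int *cursor)
{
	assert(rq->depth[idx] > 0);
	rq->depth[idx]--;
	if (rq->depth[idx] == 0 && *cursor == idx)
		*cursor = (idx + 1) % SKETCH_NQS;
}
```

Because the scan wraps, a thread inserted just behind the cursor is reached only after every slot ahead of it has been visited.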
jeff
b5c5ce5407 - Fix schedgraph output with KSE threads. Call thread_switchout() after
calling CTR() so we don't confuse a new kse thread with a real preemption.
2007-01-03 02:38:41 +00:00
davidxu
eafca3f075 Fix compiling. 2007-01-02 04:14:01 +00:00
rwatson
2a3330b81a Prefer a more traditional spelling of inhibited in comments and panic
messages.
2006-12-31 15:56:04 +00:00
jeff
9c815f4892 - More search and replace prettying. 2006-12-29 12:55:32 +00:00
jeff
e74edb3876 - Clean up a bit after the most recent KSE restructuring. 2006-12-29 10:37:07 +00:00
rwatson
4a9f23955f Break contents of kern_mac.c out into two files following a repo-copy:
mac_framework.c   Contains basic MAC Framework functions, policy
                  registration, sysinits, etc.

mac_syscalls.c    Contains implementations of various MAC system calls,
                  including ENOSYS stubs when compiling without options
                  MAC.

Obtained from:	TrustedBSD Project
2006-12-28 20:52:02 +00:00
rwatson
e7f843dc94 Update MAC Framework general comments, referencing various interfaces it
consumes and implements, as well as the location of the framework and
policy modules.

Refactor MAC Framework versioning a bit so that the current ABI version can
be exported via a read-only sysctl.

Further update comments relating to locking/synchronization.

Update copyright to take into account these and other recent changes.

Obtained from:	TrustedBSD Project
2006-12-28 17:25:57 +00:00
davidxu
70875d94ab break loop early if we know that there are at least two signals. 2006-12-25 03:00:15 +00:00
davidxu
f51b738f57 Fix typo, p_slptime should be td_slptime. 2006-12-24 01:52:27 +00:00
bms
1a77168e4f Drop all received data mbufs from a socket's queue if the MT_SONAME
mbuf is dropped, to preserve the invariant in the PR_ADDR case.

Add a regression test to detect this condition, but do not hook it
up to the build for now.

PR:             kern/38495
Submitted by:   James Juran
Reviewed by:    sam, rwatson
Obtained from:  NetBSD
MFC after:      2 weeks
2006-12-23 21:07:07 +00:00
rwatson
a1911a8513 Update comments to reflect changes in the extattrctl() code.
Clean up comment formatting.

Obtained from:	TrustedBSD Project
2006-12-23 00:30:03 +00:00
rwatson
520b5875d2 Following a repo-copy of vfs_syscalls.c to vfs_extattr.c, remove
non-extattr functions from vfs_extattr.c, and extattr functions from
vfs_syscalls.c.

Change copyright/license on vfs_extattr.c to my copyright/license on
the extended attribute implementation (from extattr.h).

Clean up includes a bit.

Obtained from:	TrustedBSD Project
2006-12-23 00:10:36 +00:00
rwatson
ae9ef07995 Move src/sys/sys/mac_policy.h, the kernel interface between the MAC
Framework and security modules, to src/sys/security/mac/mac_policy.h,
completing the removal of kernel-only MAC Framework include files from
src/sys/sys.  Update the MAC Framework and MAC policy modules.  Delete
the old mac_policy.h.

Third party policy modules will need similar updating.

Obtained from:	TrustedBSD Project
2006-12-22 23:34:47 +00:00
rrs
c427816562 The prepend function did not handle non-pkthdr's correctly.
It always called MH_ALIGN for small lengths being
prepended (less than MHLEN). This meant that if you did
a prepend on a non M_PKTHDR the system would panic with
the KASSERT in MH_ALIGN. Instead we are now aware of
this and do a MH_ALIGN or M_ALIGN as appropriate.

Reviewed by:	andre
Approved by:	gnn
2006-12-21 19:58:04 +00:00
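
The choice between the two alignment paths can be illustrated with a toy model. Everything here is hypothetical: the struct, the capacities, and the 8-byte alignment are invented for the sketch and are not the kernel's mbuf layout or macro definitions. It shows only that the usable capacity, and therefore the aligned data offset, depends on whether the buffer carries a packet header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capacities and flag; real mbuf sizes differ by platform. */
enum {
	SKETCH_MLEN = 224,	/* data bytes without a packet header */
	SKETCH_MHLEN = 200,	/* data bytes with a packet header */
	SKETCH_PKTHDR = 0x1,
	SKETCH_ALIGNMENT = 8
};

struct sketch_mbuf {
	int	flags;
	size_t	data_off;	/* offset of the data start in the buffer */
};

/* Place 'len' bytes at the aligned end of whichever area applies. */
static void
sketch_align(struct sketch_mbuf *m, size_t len)
{
	size_t cap;

	cap = (m->flags & SKETCH_PKTHDR) ? SKETCH_MHLEN : SKETCH_MLEN;
	assert(len <= cap);
	m->data_off = (cap - len) & ~((size_t)SKETCH_ALIGNMENT - 1);
}
```

Applying the packet-header capacity to a buffer that has no packet header is the kind of mismatch the assertion in the real macro guards against.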
rwatson
6fa1425be4 Remove mac_enforce_subsystem debugging sysctls. Enforcement on
subsystems will be a property of policy modules, which may require
access control check entry points to be invoked even when not actively
enforcing (i.e., to track information flow without providing
protection).

Obtained from:	TrustedBSD Project
Suggested by:	Christopher dot Vance at sparta dot com
2006-12-21 09:51:34 +00:00
rwatson
5749ecccba Expand commenting on label slots, justification for the MAC Framework locking
model, interactions between locking and policy init/destroy methods.

Rewrap some comments to 77 character line wrap.

Obtained from:	TrustedBSD Project
2006-12-20 20:38:44 +00:00
jkim
0099defbac MFP4: (part of) 110058
copyin()/copyout() for message type is separated from msgsnd()/msgrcv() and
it is done from its wrapper functions to support 32-bit emulations.  After I
implemented this, I briefly checked NetBSD and Darwin.  NetBSD passes
copyin()/copyout() function pointers from wrappers.  Darwin passes size of
message type as an argument, which is actually similar to my first
implementation (P4 109706).  We may revisit these implementations later.
2006-12-20 19:26:30 +00:00
kib
9311fcbc5d In rev. 1.514, iodone on async buffer may happen before code checks the
vnode v_flag. For cluster buffers this would result in dereferencing NULL
b_vp. To prevent the panic, cache relevant vnode flag before calling
bstrategy.

Reported by:	Peter Holm, kris
Tested by:	Peter Holm
Reviewed by: tegge
Pointy hat to:	kib
2006-12-20 09:22:31 +00:00
davidxu
5a984630fa Add a lwpid field to the per-cpu structure; the lwpid represents the
currently running thread's id on each cpu. This allows us to add in-kernel
adaptive spinning for user-level mutexes. While spinning in user space is
possible, it can hardly be implemented efficiently, without wasting cpu
cycles, unless the kernel exports the correct thread running state.
Exporting the thread running state is unlikely to be implemented soon,
because its interfaces have to be designed and stabilized. This
implementation is transparent to user space and can be disabled dynamically.
With this change, the performance of a mutex ping-pong program improves
massively on SMP machines. Performance of the mysql super-smack select
benchmark increases by about 7% on an Intel dual dual-core2 Xeon machine,
which indicates that on systems with several cpus and low system-call
overhead (athlon64, opteron, and core-2 are known to be fast), the adaptive
spin does help performance.

Added sysctls:
    kern.threads.umtx_dflt_spins
        if the sysctl value is non-zero, a zero umutex.m_spincount will
        cause the sysctl value to be used as the spin cycle count.
    kern.threads.umtx_max_spins
        the sysctl sets the upper limit of the spin cycle count.

Tested on: Athlon64 X2 3800+, Dual Xeon 5130
2006-12-20 04:40:39 +00:00
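
How the two sysctls could combine with a per-mutex spin count, followed by a bounded spin before blocking, is sketched below with C11 atomics. The function names are hypothetical, and one important part of the real design is deliberately left out: this sketch spins unconditionally, whereas the commit describes spinning only while the lock owner is known to be running on a cpu.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Pick the effective spin count: a zero per-mutex count falls back to the
 * default when one is configured, and the result is capped at the maximum.
 */
static unsigned
sketch_spin_count(unsigned m_spincount, unsigned dflt_spins,
    unsigned max_spins)
{
	unsigned n = m_spincount;

	if (n == 0 && dflt_spins != 0)
		n = dflt_spins;
	if (n > max_spins)
		n = max_spins;
	return (n);
}

/*
 * Try to take the lock, retrying up to 'spins' extra times.  A false return
 * means the caller should fall back to sleeping.  0 means "unowned".
 */
static bool
sketch_spin_acquire(atomic_int *owner, int self, unsigned spins)
{
	unsigned i;
	int expected;

	assert(self != 0);
	for (i = 0; i <= spins; i++) {
		expected = 0;
		if (atomic_compare_exchange_strong(owner, &expected, self))
			return (true);
	}
	return (false);
}
```

A default of zero leaves spinning off for mutexes that do not request it, which matches the note that the feature can be disabled dynamically.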
mbr
a2c03bf6cb Back out rev. 1.266. The real cause for the recent panics has been fixed
in rev. 1.267 and there is no need to keep this test.
2006-12-20 02:49:59 +00:00
mbr
37965664cc Giant might have been temporarily dropped while waiting for proctree_lock, allowing for an
intervening tty_close() that cleared tp->t_session.

Submitted by:	tegge
MFC:		1 day
2006-12-19 22:34:32 +00:00
mbr
ccabbc6486 Add the tp->t_refcnt validity check back. There are still some race
conditions where tp->t_refcnt can go to zero.
2006-12-19 16:46:13 +00:00
davidxu
b0b74f9bd3 Remove unused sysctls. 2006-12-19 13:06:01 +00:00
pjd
6cc6a8d100 Use pipe_direct_write() optimization only if the data is in process' memory.
This fixes sending data through pipe from the kernel.

Fix suggested by:	rwatson
2006-12-19 12:52:22 +00:00
kmacy
af645e118f ktrace_cv is no longer used - remove
Submitted by: Attilio Rao
2006-12-17 00:16:09 +00:00
kmacy
bb69932355 Cleaner fix for handling declaration of loop variable under INVARIANTS
- in trying to avoid nested brackets and #ifdef INVARIANTS around i at the
  top, I broke booting for INVARIANTS altogether :-(
- the cleanest fix is to simply assign to sq twice if INVARIANTS is enabled
- tested both with and without INVARIANTS :-/
2006-12-17 00:14:20 +00:00
ache
aebc61a22f Don't intermix assignments and variable declarations in prev. commit 2006-12-16 21:17:27 +00:00
ache
84d03f55f7 Fix NULL pointer reference for INVARIANTS case
Submitted by:   Yuriy Tsibizov <Yuriy.Tsibizov@gfk.ru>
2006-12-16 20:33:26 +00:00
rodrigc
10e4664552 In vfs_export(), if we specify MNT_DELEXPORT in the struct export_args,
after we perform the operations to delete the export,
call vfs_deleteopt() to delete the "export" mount option from
the linked list of mount options associated with that mount point.

This fixes one scenario:
- put a filesystem in /etc/exports to export it
- remove the filesystem from /etc/exports to delete the export and restart
  mountd
- try to do a "mount -u -o ro" or "mount -u -o rw" on that filesystem
  now that it is no longer exported.
2006-12-16 15:50:36 +00:00
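
Deleting a named option from a linked list of mount options can be sketched as below. This is a user-space model under stated assumptions: the `sketch_` struct and functions are hypothetical and use a plain singly linked list, not the kernel's option list types or the actual vfs_deleteopt() signature.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mount option node. */
struct sketch_opt {
	char			*name;
	struct sketch_opt	*next;
};

/* Prepend a copy of 'name'; returns 0 on success, -1 on allocation failure. */
static int
sketch_addopt(struct sketch_opt **head, const char *name)
{
	struct sketch_opt *opt;

	assert(name != NULL);
	opt = malloc(sizeof(*opt));
	if (opt == NULL)
		return (-1);
	opt->name = malloc(strlen(name) + 1);
	if (opt->name == NULL) {
		free(opt);
		return (-1);
	}
	strcpy(opt->name, name);
	opt->next = *head;
	*head = opt;
	return (0);
}

/* Unlink and free every option called 'name'; returns how many were removed. */
static int
sketch_deleteopt(struct sketch_opt **head, const char *name)
{
	struct sketch_opt *opt;
	int removed = 0;

	while ((opt = *head) != NULL) {
		if (strcmp(opt->name, name) == 0) {
			*head = opt->next;
			free(opt->name);
			free(opt);
			removed++;
		} else
			head = &opt->next;
	}
	return (removed);
}

/* Report whether an option called 'name' is still on the list. */
static int
sketch_hasopt(const struct sketch_opt *head, const char *name)
{
	for (; head != NULL; head = head->next)
		if (strcmp(head->name, name) == 0)
			return (1);
	return (0);
}
```

Once the stale "export" entry is gone, a later update mount sees only the options that still apply to the mount point.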