Commit Graph

10045 Commits

Author SHA1 Message Date
Jeff Roberson
fb62eea266 - Use ruxagg() in calcru() to make sure we have current tick information
from all threads.

Discussed with:	bde, attilio
Approved by:	re
2007-07-17 01:08:09 +00:00
Craig Rodrigues
d7f81adbd4 Revert previous commits which I committed by mistake.
Approved by:	re (implicit)
Pointy hat to:	me
2007-07-14 21:23:31 +00:00
Craig Rodrigues
d678780e60 The last entry in the ext2_opts array must be NULL,
otherwise the kernel with crash in vfs_filteropt() if an invalid
mount option is passed to ext2fs.

Approved by:	re (kensmith)
2007-07-14 21:18:19 +00:00
John Baldwin
59d8f3ff08 Fix a couple of issues with the stack limit for 32-bit processes on 64-bit
kernels exposed by the recent fixes to resource limits for 32-bit processes
on 64-bit kernels:
- Let ABIs expose their maximum stack size via a new pointer in sysentvec
  and use that in preference to maxssiz during exec() rather than always
  using maxssiz for all processses.
- Apply the ABI's limit fixup to the previous stack size when adjusting
  RLIMIT_STACK to determine if the existing mapping for the stack needs to
  be grown or shrunk (as well as how much it should be grown or shrunk).

Approved by:	re (kensmith)
2007-07-12 18:01:31 +00:00
Attilio Rao
c1a6d9fa42 Fix some problems with lock_profiling in sx locks:
- Adjust lock_profiling stubs semantic in the hard functions in order to be
  more accurate and trustable
- Disable shared paths for lock_profiling.  Actually, lock_profiling has a
  subtle race which makes results caming from shared paths not completely
  trustable. A macro stub (LOCK_PROFILING_SHARED) can be actually used for
  re-enabling this paths, but is currently intended for developing use only.
- Use homogeneous names for automatic variables in hard functions regarding
  lock_profiling
- Style fixes
- Add a CTASSERT for some flags building

Discussed with: kmacy, kris
Approved by: jeff (mentor)
Approved by: re
2007-07-06 13:20:44 +00:00
Konstantin Belousov
196a7385ac Revert destroy_dev() to the state before destroy_dev_sched() was introduced.
Attempt to spawn destroy_dev_sched() from it causes inadmissible races.

Requested by:	tegge
Approved by:	re (kensmith)
2007-07-05 13:04:59 +00:00
Bjoern A. Zeeb
f43455fd89 Remove netkey directory from cscope/TAGs generation and replace
it with netipsec now that KAME IPsec is gone.
While here add missing netinet6 directories.

Add comments about the ports needed to be able to run those targets.

Reviewed by:	philip
Approved by:	re (rwatson)
2007-07-05 08:55:14 +00:00
Peter Wemm
22af4cab91 Fix bad function type passed to destroy_dev_sched_cb().
Approved by:  re (rwatson)
2007-07-05 05:54:47 +00:00
Peter Wemm
c2815ad564 Add freebsd6_ wrappers for mmap/lseek/pread/pwrite/truncate/ftruncate
Approved by: re (kensmith)
2007-07-04 22:57:21 +00:00
Peter Wemm
552fbe752f Regenerate after mmap/lseek/etc syscall changes.
Approved by:  re (kensmith)
2007-07-04 22:49:55 +00:00
Peter Wemm
51504d9ac4 Create new syscalls for mmap(), lseek(), pread(), pwrite(), truncate() and
ftruncate(), but without the pad arg.

There are several reasons for this.  Consider 'mmap()'.  On AMD64, the
function call (and syscall) ABI allow for 6 register arguments.  Additional
arguments go on the stack.  mmap(2) has 6 arguments.  However, the syscall
definition has an extra 'int pad' argument.  This pushes it to 7 arguments,
which means one must spill into the memory stack.  Since the kernel API
doesn't match userland API, we have a hack in libc - libc/sys/mmap.c.
This implements the userland API by calling __syscall() with an extra
argument and the pad argument, for a total of 8 args.  This is all
unnecessary and inconvenient for several things, including the kernel's
syscall handler code which now has to handle merging stack arguments with
register arguments.  It is a big deal for certain 3rd party code.

I'm adding libc glue to make the transition totally painless.  I had
intended to mark the old syscalls as COMPAT6, but the potential to shoot
your feet by building a new kernel without COMPAT_FREEBSD6 but with a
slighly older userland was too great.  For now, they have manual
"freebsd6_" prefixes rather than being COMPAT6.  They will go back to
being marked 'COMPAT6' after 7-stable starts.

Approved by: re (kensmith)
2007-07-04 22:47:37 +00:00
Peter Wemm
9f0482e515 Add support for COMPAT6 syscalls.
Also, change the visibility of compat syscalls a slightly.  Compat
syscalls were missing from 'syscalls.h' entirely.  This additionally adds
them with their compat prefix.  eg: SYS_freebsd6_mmap.

Also, the syscalls.c names strings have different prefixes to differentiate
syscalls. Instead of several "old.mmap" strings, there will now be a
"compat.mmap" and "compat6.mmap" etc.  Before, both would have had the
same "old.mmap" label.

Approved by:  re
2007-07-04 22:38:28 +00:00
Konstantin Belousov
09828ba947 Since cdev mutex is after system map mutex in global lock order, free()
shall not be called while holding cdev mutex. devfs_inos unrhdr has cdev as
mutex, thus creating this LOR situation.

Postpone calling free() in kern/subr_unit.c:alloc_unr() and nested functions
until the unrhdr mutex is dropped. Save the freed items on the ppfree list
instead, and provide the clean_unrhdrl() and clean_unrhdr() functions to
clean the list.
Call clean_unrhdrl() after devfs_create() calls immediately before
dropping cdev mutex. devfs_create() is the only user of the alloc_unrl()
in the tree.

Reviewed by:	phk
Tested by:	Peter Holm
LOR:	80
Approved by:	re (kensmith)
2007-07-04 06:56:58 +00:00
Jeff Roberson
f6c1ecca50 - Use explicit locking in the various fcntl case statements so that we
can acquire shared filedescriptor locks in the appropriate cases.
 - Remove Giant from calls that issue ioctls.  The ioctl path has been
   mpsafe for some time now.
 - Only acquire giant for VOP_ADVLOCK when the filesystem requires giant.
   advlock is now mpsafe.

Reviewed by:	rwatson
Approved by:	re
2007-07-03 21:26:06 +00:00
Jeff Roberson
bc02f1d98d - Remove explicit Giant protection from lockf. Use the vnode interlock
to protect this datastructure instead.
 - Preallocate an extra lockf structure in case we want to split a lock
   on insert or delete.
 - msleep() on the vnode interlock when blocking on a lock.

Reviewed by:	rwatson
Approved by:	re
2007-07-03 21:22:58 +00:00
John Baldwin
fb1faf2082 Tweak the low-level MI SMP code some:
- Use cpu_spinwait() in the spin loops in stop_cpus(), restart_cpus(), and
  smp_rendezvous_action().
- Remove unneeded acq memory barriers in stop_cpus(), restart_cpus(), and
  smp_rendezvous_action().
- Add an additional synch point in smp_rendezvous() to ensure that all the
  CPUs will always see an up-to-date value of smp_rv_setup_func.

Reviewed by:	attilio
Approved by:	re (kensmith)
Tested on:	alpha, amd64, i386, sparc64 SMP (for several years)
2007-07-03 18:37:06 +00:00
Konstantin Belousov
9d53363bc8 Rev. 1.204 and 1.205 got an erronous version of destroy_dev() that
calls destroy_dev_sched() with cdev mutex locked. Commit the code
that was actually tested.

Pointy hat to:	kib
Approved by:	re (implicit)
2007-07-03 18:18:30 +00:00
Konstantin Belousov
f5baf8d66b Lock Giant and proctree lock around dereferencing p_session->s_ttyvp->v_rdev.
Lock cdev mutex too to close the race with tty being freed.
Relock clone_drain_lock to prevent the LOR with proctree lock, thus
add #include <fs/devfs/devfs_int.h>.

Suggested by:	tegge
Debugging help and testing by:	Peter Holm
Approved by:	re (kensmith)
2007-07-03 17:46:37 +00:00
Konstantin Belousov
8a5d7ef25c Use make_dev_credf(MAKEDEV_REF) instead of make_dev() from pty clone handler.
Debugging help and testing by:	Peter Holm
Approved by:	re (kensmith)
2007-07-03 17:45:52 +00:00
Konstantin Belousov
0a9c2b6db8 Use make_dev_credf(MAKEDEV_REF) instead of make_dev() from the clone handler.
Lock Giant in the clone handler.
Use destroy_dev_sched() explicitely from pty_maybecleanup() and postpone
pty_release() until both master and slave cdevs are destroyed by setting
it as callback for destroy_dev_sched().

Debugging help and testing by:	Peter Holm
Approved by:	re (kensmith)
2007-07-03 17:44:59 +00:00
Konstantin Belousov
6f0281937b Automatically detect deadlock condition in destroy_dev(), that is, if
destroy_dev() is called from csw method, and no d_purge driver method is
provided. Transform the direct call to destroy_dev() into destroy_dev_sched().

Reviewed by:	njl (programming interface)
Debugging help and testing by:	Peter Holm
Approved by:	re (kensmith)
2007-07-03 17:43:20 +00:00
Konstantin Belousov
de10ffa527 Since rev. 1.199 of sys/kern/kern_conf.c, the thread that calls
destroy_dev() from d_close() cdev method would self-deadlock.
devfs_close() bump device thread reference counter, and destroy_dev()
sleeps, waiting for si_threadcount to reach zero for cdev without
d_purge method.

destroy_dev_sched() could be used instead from d_close(), to
schedule execution of destroy_dev() in another context. The
destroy_dev_sched_drain() function can be used to drain the scheduled
calls to destroy_dev_sched(). Similarly, drain_dev_clone_events() drains
the events clone to make sure no lingering devices are left after
dev_clone event handler deregistered.

make_dev_credf(MAKEDEV_REF) function should be used from dev_clone
event handlers instead of make_dev()/make_dev_cred() to ensure that created
device has reference counter bumped before cdev mutex is dropped inside
make_dev().

Reviewed by:	tegge (early versions), njl (programming interface)
Debugging help and testing by:	Peter Holm
Approved by:	re (kensmith)
2007-07-03 17:42:37 +00:00
Konstantin Belousov
7aee5992a5 Relock the sema_mtxp unconditionally after copyin() for SETALL case in
kern_semctl. Otherwise, later mtx_unlock() can operate on unlocked mutex.

Submitted by:	rdivacky
MFC after:	3 days
Approved by:	re (kensmith)
2007-07-03 15:58:47 +00:00
Robert Watson
bc6eca2432 Continue kernel privilege cleanup for 7.0: unstaticize suser_enabled and
stop declaring it in systm.h -- it's used only in kern_priv.c and is not
required elsewhere.

Approved by:	re (kensmith)
2007-07-02 14:03:29 +00:00
Randall Stewart
b8709d23c5 - Add some needed error checking on bad fd passing in the sctp
syscalls.
Approved by:	re@freebsd.org (Ken Smith)
Obtained from:	Weongyo Jeong (weongyo.jeong@gmail.com)
2007-07-02 12:50:53 +00:00
Jeff Roberson
03d03260b2 - Use rufetchcalc() rather than calcru() in ttyinfo so that we get
correct system and user time stats.

Approved by:	re
Reported by:	kris
Discussed with:	Attilio
2007-07-01 00:17:59 +00:00
Robert Watson
dc2e1e3fae Use vm_offset_t for kmembase and kmemlimit rather than char *, avoiding
unnecessary casts, and making it possible to compile kern_malloc.c with
strict aliasing.

Submitted by:	rdivacky
Approved by:	re (kensmith)
2007-06-27 13:39:38 +00:00
Attilio Rao
6a0ce57d10 Fix an old standing LOR between callout_lock and sleepqueues chain (which
could lead to a deadlock).
- sleepq_set_timeout acquires callout_lock (via callout_reset()) only
  with sleepq chain lock held
- msleep_spin in _callout_stop_safe lock the sleepqueue chain with
  callout_lock held

In order to solve this don't use msleep_spin in _callout_stop_safe() but
use directly sleepqueues as inline msleep_spin code. Rearrange the
wakeup path in order to have it consistent too.

Reported by: kris (via stress2 test suite)
Tested by: Timothy Redaelli <drizzt@gufi.org>
Reviewed by: jhb
Approved by: jeff (mentor)
Approved by: re
2007-06-26 21:42:01 +00:00
Attilio Rao
f08945a7d2 Introduce a new rwlocks initialization function: rw_init_flags.
This is very similar to sx_init_flags: it initializes the rwlock using
special flags passed as third argument (RW_DUPOK, RW_NOPROFILE,
RW_NOWITNESS, RW_QUIET, RW_RECURSE).
Among these, the most important new feature is probabilly that rwlocks
can be acquired recursively now (for both shared and exclusive paths).

Because of the recursion counter, the ABI is changed.

Tested by: Timothy Redaelli <drizzt@gufi.org>
Reviewed by: jhb
Approved by: jeff (mentor)
Approved by: re
2007-06-26 21:31:56 +00:00
Rong-En Fan
534046e301 - Remove UMAP filesystem. It was disconnected from build three years ago,
and it is seriously broken.

Discussed on:   freebsd-arch@
Approved by:	re (mux)
2007-06-25 05:06:57 +00:00
Konstantin Belousov
9bc911d4a2 devfs_free() calls free_unr(), that may sleep.
Postpone call to devfs_free() after cdev mutex is dropped. Reuse
cdp_list link for queuing devices awaiting deletion in the
cdevp_free_list.

Reported by:	Hans Petter Selasky <hselasky c2i net>
Tested by:	Peter Holm
Approved by:	re (kensmith)
MFC after:	2 weeks
2007-06-19 13:19:23 +00:00
Konstantin Belousov
7550e3eac4 Add the witness warning for free_unr. Function could sleep, thus callers
shall not have any non-sleepable locks held.

Submitted by:	Hans Petter Selasky <hselasky c2i net>
Approved by:	re (kensmith)
2007-06-19 13:13:17 +00:00
Pawel Jakub Dawidek
dfe97ff4a5 We only flush entries related to the given file system. Currently there are
no 'invalid' cache entires - file system is responsible for keeping it that
way. The comment should have been updated in rev.1.25.
2007-06-18 09:28:24 +00:00
Robert Watson
7251b7863c Rather than passing SUSER_RUID into priv_check_cred() to specify when
a privilege is checked against the real uid rather than the effective
uid, instead decide which uid to use in priv_check_cred() based on the
privilege passed in.  We use the real uid for PRIV_MAXFILES,
PRIV_MAXPROC, and PRIV_PROC_LIMIT.  Remove the definition of
SUSER_RUID; there are now no flags defined for priv_check_cred().

Obtained from:	TrustedBSD Project
2007-06-16 23:41:43 +00:00
Marius Strobl
79be8b5082 - Remove zstty spin lock for no longer existing zs(4).
- Move the rtc_mtx spin lock out from under #ifdef SMP as it's just
  not SMP-specific.
- Add a new spin lock pcib_mtx for locking "fast" interrupt handlers
  of host-to-PCI bridge drivers on sparc64.
2007-06-16 23:30:57 +00:00
Jeff Roberson
dda713dfb8 - Fix an off by one error in sched_pri_range.
- In tdq_choose() only assert that a thread does not have too high a
   priority (low value) for the queue we removed it from.  This will catch
   bugs in priority elevation.  It's not a serious error for the thread
   to have too low a priority as we don't change queues in this case as
   an optimization.

Reported by:	kris
2007-06-15 19:33:58 +00:00
Robert Watson
7e273744a6 Remove the restriction that rtprio(2) cannot be used to set the realtime
or idle priority of another process owned by the same user.  This means
that privilege in rtprio(2) (and rtprio_thread(2)) is required indirectly
via p_cansched(9) or directly to set realtime/idle privilege, rather than
directly affecting target process authorization.
2007-06-14 23:31:52 +00:00
Robert Watson
b4be6ef22f Only require privilege to set the current time adjustment, not in order to
query it.
2007-06-14 18:37:58 +00:00
Robert Watson
3805385e3d Spell statistics more correctly in comments. 2007-06-14 03:02:33 +00:00
John Baldwin
34a9edafbc Improve the ktrace locking somewhat to reduce overhead:
- Depessimize userret() in kernels where KTRACE is enabled by doing an
  unlocked check of the per-process queue of pending events before
  acquiring any locks.  Previously ktr_userret() unconditionally acquired
  the global ktrace_sx lock on every return to userland for every thread,
  even if ktrace wasn't enabled for the thread.
- Optimize the locking in exit() to first perform an unlocked read of
  p_traceflag to see if ktrace is enabled and only acquire locks and
  teardown ktrace if the test succeeds.  Also, explicitly disable tracing
  before draining any pending events so the pending events actually get
  written out.  The unlocked read is safe because proc lock is acquired
  earlier after single-threading so p_traceflag can't change between then
  and this check (well, it can currently due to a bug in ktrace I will fix
  next, but that race existed prior to this change as well).

Reviewed by:	rwatson
2007-06-13 20:01:42 +00:00
John Baldwin
ce0be64687 Conditionally acquire Giant when dropping a reference on the ktrace vnode
during execve() when turning off tracing due to executing a setuid binary
as non-root.  Previously this could fail to acquire Giant and fail an
assertion if the ktrace file was on a non-MPSAFE filesystem and the
executable was on an MPSAFE filesystem.

MFC after:	3 days
Reported by:	kris
2007-06-13 19:41:47 +00:00
Jeff Roberson
3036ab79e3 - Include opt_sched.h for SCHED_STATS. 2007-06-12 23:27:31 +00:00
Jeff Roberson
671f2709ae - Garbage collect unused concurrency functions. 2007-06-12 19:50:31 +00:00
Jeff Roberson
e7c8d2e9fe - Garbage collect unused concurrency functions.
- Remove unused kse fields from struct proc.
 - Group remaining fields and #ifdef KSE them.
 - Move some kern_kse.c only prototypes out of proc and into kern_kse.

Discussed with:	Julian
2007-06-12 19:49:39 +00:00
Jeff Roberson
fe54587ffa - Move some common code out of sched_fork_exit() and back into fork_exit(). 2007-06-12 07:47:09 +00:00
Jeff Roberson
ff8fbcffcb Solve a complex exit race introduced with thread_lock:
- Add a count of exiting threads, p_exitthreads, to struct proc.
 - Increment p_exithreads when we set the deadthread in thread_exit().
 - When we thread_stash() a deadthread use an atomic to drop the count.
 - Spin until the p_exithreads count reaches 0 in thread_wait().
 - Lock the last exiting thread momentarily to be certain that it has
   exited cpu_throw().
 - Restructure thread_wait().  It does not need a loop as there will only
   ever be one thread.

Tested by:	moose@opera.com
Reported by:	kris, moose@opera.com
2007-06-12 07:24:46 +00:00
Robert Watson
32f9753cfb Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in
some cases, move to priv_check() if it was an operation on a thread and
no other flags were present.

Eliminate caller-side jail exception checking (also now-unused); jail
privilege exception code now goes solely in kern_jail.c.

We can't yet eliminate suser() due to some cases in the KAME code where
a privilege check is performed and then used in many different deferred
paths.  Do, however, move those prototypes to priv.h.

Reviewed by:	csjp
Obtained from:	TrustedBSD Project
2007-06-12 00:12:01 +00:00
Jeff Roberson
efe641b939 - Add a missing PROC_SUNLOCK() in tdsignal() 2007-06-11 23:27:03 +00:00
Olivier Houchard
e411ce026a Re-acquire the PROC_SLOCK before calling calcru(), and release it after,
since calcru() expects it to be locked.

Reviewed by:	attilio
2007-06-11 21:05:41 +00:00
Sam Leffler
68e8e04e93 Update 802.11 wireless support:
o major overhaul of the way channels are handled: channels are now
  fully enumerated and uniquely identify the operating characteristics;
  these changes are visible to user applications which require changes
o make scanning support independent of the state machine to enable
  background scanning and roaming
o move scanning support into loadable modules based on the operating
  mode to enable different policies and reduce the memory footprint
  on systems w/ constrained resources
o add background scanning in station mode (no support for adhoc/ibss
  mode yet)
o significantly speedup sta mode scanning with a variety of techniques
o add roaming support when background scanning is supported; for now
  we use a simple algorithm to trigger a roam: we threshold the rssi
  and tx rate, if either drops too low we try to roam to a new ap
o add tx fragmentation support
o add first cut at 802.11n support: this code works with forthcoming
  drivers but is incomplete; it's included now to establish a baseline
  for other drivers to be developed and for user applications
o adjust max_linkhdr et. al. to reflect 802.11 requirements; this eliminates
  prepending mbufs for traffic generated locally
o add support for Atheros protocol extensions; mainly the fast frames
  encapsulation (note this can be used with any card that can tx+rx
  large frames correctly)
o add sta support for ap's that beacon both WPA1+2 support
o change all data types from bsd-style to posix-style
o propagate noise floor data from drivers to net80211 and on to user apps
o correct various issues in the sta mode state machine related to handling
  authentication and association failures
o enable the addition of sta mode power save support for drivers that need
  net80211 support (not in this commit)
o remove old WI compatibility ioctls (wicontrol is officially dead)
o change the data structures returned for get sta info and get scan
  results so future additions will not break user apps
o fixed tx rate is now maintained internally as an ieee rate and not an
  index into the rate set; this needs to be extended to deal with
  multi-mode operation
o add extended channel specifications to radiotap to enable 11n sniffing

Drivers:
o ath: add support for bg scanning, tx fragmentation, fast frames,
       dynamic turbo (lightly tested), 11n (sniffing only and needs
       new hal)
o awi: compile tested only
o ndis: lightly tested
o ipw: lightly tested
o iwi: add support for bg scanning (well tested but may have some
       rough edges)
o ral, ural, rum: add suppoort for bg scanning, calibrate rssi data
o wi: lightly tested

This work is based on contributions by Atheros, kmacy, sephe, thompsa,
mlaier, kevlo, and others.  Much of the scanning work was supported by
Atheros.  The 11n work was supported by Marvell.
2007-06-11 03:36:55 +00:00