10645 Commits

Author SHA1 Message Date
John Baldwin
8c68f75a7c - Reduce scope of #ifdef's in uma_zcreate() call in init_turnstile0().
- Set UMA_ZONE_NOFREE so that the per-turnstile spin locks are type stable
  to avoid a race where one thread might dereference a lock in a free'd
  turnstile that was previously used by another thread.

Theorized by:	tegge (2)
MFC after:	1 week
2008-09-08 21:40:15 +00:00
John Baldwin
aa4c44b58b Close a race in sleepq_broadcast() where the sleepq could be reused after
it had been assigned to the last sleeping thread.  That thread might have
started running on another CPU and have reused that sleep queue.  Fix it
by just walking the thread queue using TAILQ_FOREACH_SAFE() rather than
a while loop.

PR:		amd64/124200
Discovered by:	tegge
Tested by:	benjsc
MFC after:	1 week
2008-09-08 19:44:57 +00:00
Bjoern A. Zeeb
6f4745d575 Catch a possible NULL pointer deref in case the offsets got mangled
somehow.
As a consequence we may now get an unexpected result(*).
Catch that error cases with a well defined panic giving appropriate
pointers to ease debugging.

(*) While the concensus was that the case should never happen unless
    there was a bug, noone was definitively sure.

Discussed with:		kmacy (about 8 months back)
Reviewed by:		silby (as part of a larger patch in March)
MFC after:		2 months
2008-09-07 13:09:04 +00:00
Ed Schouten
3c8574bc8a Make TIOCCONS use priv_check() instead of checking /dev/console permissions.
As discussed with Robert on IRC, checking the permissions on
/dev/console to see if we can call TIOCCONS could be unreliable. When we
run a chroot() without a devfs instance mounted inside, it won't
actually check the permissions on the device node inside the devfs
instance.

Using the already existing PRIV_TTY_CONSOLE for this seems like a better
idea.

Approved by:	rwatson
2008-09-06 14:43:32 +00:00
Ed Schouten
c27991e819 Fix a small typo in a comment in calcru1().
The word "happene" should read "happened".

Submitted by:	Jille Timmermans <jille quis cx>
2008-09-05 15:55:06 +00:00
David Xu
fbc48e974e Fix LOR between vnode lock and internal mqueue locks. 2008-09-05 07:32:57 +00:00
Andrew Thompson
9128ec21f3 Remove the alignment of the align parameter. This is up to the caller to pass
in and it breaks tap(4) on strict alignment machines as m_uiotombuf is called
with ETHER_ALIGN.

Found by:	Jared Go
Reviewed by:	emax
MFC after:	3 days
2008-09-05 04:05:31 +00:00
David Xu
b042e9760c Fix lock name conflict.
PR:	kern/127040
2008-09-05 02:07:25 +00:00
Ed Schouten
64308260f6 Implement pts(4) packet mode.
As reported by several users on the mailing lists, applications like
screen(1) fail to properly handle ^S and ^Q characters. This was because
MPSAFE TTY didn't implement packet mode (TIOCPKT) yet. Add basic packet
mode support to make these applications work again.

Obtained from:	//depot/projects/mpsafetty/...
2008-09-04 16:39:02 +00:00
Ed Schouten
2bda9238e5 Fix an awful bug inside our COMPAT_43TTY code.
When I migrated tty_compat.c to MPSAFE TTY, I just hooked it up to the
build and fixed it until it compiled and somewhat worked. It turns out
this was not the smartest thing, because the old TTY layer also had a
field called t_flags, which contained a set of sgtty flags.

This means our current COMPAT_43TTY code overwrites the TTY flags,
causing all strange problems to occur. Fix this code to use a new struct
member called t_compatflags. This commit may cause kern/127054 to be
fixed, but this still has to be tested/confirmed by the originator. It
has to be fixed anyway.

PR:		kern/127054
2008-09-04 16:30:53 +00:00
Kevin Lo
f308bddd3f If the process id specified is invalid, the system call returns ESRCH 2008-09-04 10:44:33 +00:00
Simon L. B. Nielsen
59ca51adba - Fix amd64 local privilege escalation. [08:07]
- Fix nmount(2) local privilege escalation. [08:08]
- Fix IPv6 remote kernel panics. [08:09]

Fix for [08:07] is merge of r181823.

Submitted by:	kib [08:07], csjp [08:08], bz [08:09]
Reviewed by:	peter [08:07], jhb [08:07]
Reviewed by:	jinmei [08:09], rwatson [08:09]
Approved by:	re (SA blanket)
Approved by:	so (simon)
Security:	FreeBSD-SA-08:07.amd64
Security:	FreeBSD-SA-08:08.nmount
Security:	FreeBSD-SA-08:09.icmp6
2008-09-03 19:09:47 +00:00
Ed Schouten
ffffa83b60 Use size_t to store the return value of ttydisc_getc().
The ttydisc_getc() routine obtains a read length from ttyoutq_read().
For no valid reason, the current code stores this value in an int, and
returns a size_t. There is no need to perform this useless conversion.

Obtained from:	//depot/projects/mpsafetty/...
2008-09-02 17:13:11 +00:00
Robert Watson
26ec197d15 Remove XXXRW in soreceive_dgram that proves unnecessary.
Remove unused orig_resid variable in soreceive_dgram.

Submitted by:	alfred
X-MFC with:	soreceive_dgram (r180198, r180211)
2008-09-02 16:55:21 +00:00
Pawel Jakub Dawidek
2765482b7f When setting error to EINVAL in 'fvp == tdvp' case, jump to out label,
because if not, the error will be later overwritten by
mac_vnode_check_rename_to() call.

Reviewed by:	rwatson
2008-09-01 10:11:39 +00:00
Attilio Rao
59d4932531 Decontextualize vfs_busy(), vfs_unbusy() and vfs_mount_alloc() functions.
Manpages are updated accordingly.

Tested by:	Diego Sardina <siarodx at gmail dot com>
2008-08-31 14:26:08 +00:00
Attilio Rao
988d28340a - Improve some witness_watch operability in code which does perform both
lock tracking and checks, doing just the former ones.
- Fix a bug where sysctl utility was printing crazy values when setting a
  new value for debug.witness.watch [0]

[0] Reported by:	yongari
2008-08-30 13:20:35 +00:00
Ed Schouten
74bb9e3ad5 Fix some edge cases in the TTY queues:
- In the current design, when a TTY decreases its baud rate, it tries to
  shrink the queues. This may not always be possible, because it will
  not free any blocks that are still filled with data.

  Change the TTY queues to store a `quota' value as well, which means it
  will not free any blocks when changing the baud rate, but when placing
  blocks back into the queue. When the amount of blocks exceeds the
  quota, they get freed.

  It also fixes some edge cases, where TIOCSETA during read()/
  write()-calls could actually make the queue a tiny bit bigger than in
  normal cases.

- Don't leak blocks of memory when calling TIOCSETA when the device
  driver abandons the TTY while allocating memory.

- Create ttyoutq_init() and ttyinq_init() to initialize the queues,
  instead of initializing them by hand. The new TTY snoop driver also
  creates an outq, so it's good to have a proper interface to do this.

Obtained from:	//depot/projects/mpsafetty/...
2008-08-30 09:18:27 +00:00
Attilio Rao
df3310e04a - Make witness_watch a 3 state value.
1 means that witness is up and running.
  0 means that witness is disabled but that it can be established later
    again in effective way.
  -1 means that witness is disabled permanently
- Fix a bug causing kernel to panic on witness disabling through
  witness_watch.  lock lists queues were still full of entries and this was
  causing throubles with debugging stubs (like witness_thread_exit()).

Reported by:	kris, yongari
Sponsored by:	Nokia
2008-08-29 15:47:53 +00:00
Ed Schouten
a15ec0a5e4 Backport two small fixes from the MPSAFE TTY branch in Perforce:
- Implement IMAXBEL. It turned out the IMAXBEL termios switch was marked
  as supported, while it had not been implemented.

- Don't go into the high watermark when in canonical mode, no data has
  been canonicalized and the input buffer is full. This caused the
  terminal to lock up. This prevented users from pressing
  backspace/^U/etc in such cases.

  This could easily be simulated by pasting a very big amount of data in
  a shell with sh(1) in canonical mode.

Obtained from:	//depot/projects/mpsafetty/...
2008-08-29 15:02:50 +00:00
David Xu
3eb8b8bbeb Don't remove queued SIGCHLD if options contain WNOWAIT, so other
threads still can be notified by the signal.
2008-08-29 01:34:05 +00:00
Tom Rhodes
1e018d99f2 Fix a typo in r180291
"NAme of the current YP/NIS domain" -> "Name of the current YP/NIS domain"
2008-08-28 23:52:34 +00:00
Ed Schouten
a05cae5186 Make ureadc() warn when holding any locks, just like uiomove().
A couple of months ago I was quite impressed, because when I was writing
code, I discovered that uiomove() would not allow any locks to be held,
while ureadc() did, mainly because ureadc() is implemented using the
same building blocks as uiomove().

Let's see if this triggers any aditional witness warnings on our source
tree.

Reviewed by:	atillio
2008-08-28 19:34:58 +00:00
Attilio Rao
0359a12ead Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread
was always curthread and totally unuseful.

Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
2008-08-28 15:23:18 +00:00
Konstantin Belousov
a888d54d39 Introduce the VV_FORCEINSMQ vnode flag. It instructs the insmnque() function
to ignore the unmounting and forces insertion of the vnode into the mount
vnode list.

Change insmntque() to fail when forced unmount is in progress and
VV_FORCEINSMQ is not specified.

Add an assertion to the insmntque(), requiring the vnode to be
exclusively locked for mp-safe filesystems.

Use the VV_FORCEINSMQ for the creation of the syncvnode.

Tested by:	pho
Reviewed by:	tegge
MFC after:	1 month
2008-08-28 09:08:15 +00:00
Ed Schouten
ceef66c0e3 Properly unlock the init/lock-state devices when invoking TIOCSETA.
For some reason a return-statement crept into this code, where it
shouldn't belong. This means we didn't properly unlock the TTY before
returning to userspace.

Submitted by:	Tor Egge <tor egge cvsup no freebsd org>
2008-08-27 19:37:21 +00:00
John Baldwin
9c2bf0cce2 - Only count the number of CPUs in the rendezvous map once rather than
doing it on every CPU.
- Use CPU_ABSENT() rather than pcpu_find() to determine if a CPU is not
  present.
- Count up to mp_maxid rather than MAXCPU when iterating over CPUs to
  match the rest of the code in the kernel.

MFC after:	1 week
2008-08-27 18:23:55 +00:00
Konstantin Belousov
cbc158449b Implement WNOWAIT flag for wait4(2). It specifies that process whose status
is returned shall be kept in the waitable state.
Add WSTOPPED as an alias for WUNTRACED.

Submitted by:	Jukka Ukkonen <jau at iki fi>
PR:	standards/116221
MFC after:	2 weeks
2008-08-26 12:37:16 +00:00
Konstantin Belousov
eaad109973 When calculating arguments to the interpreter for the shebang script
executed by fexecve(2), imgp->args->fname is NULL. Moreover, there is
no way to recover the path to the script being executed.
Do what some other U*ixes do unconditionally, namely supply /dev/fd/n
as the script path when called from fexecve(). Document requirement of
having fdescfs mounted as caveat.
2008-08-26 10:53:32 +00:00
John Baldwin
cf22c63dd5 Resort a few accessor routines so that they are consistently grouped
with 'set_foo/get_foo' adjacent to each other.
2008-08-25 16:16:57 +00:00
Robert Watson
3f3978840e More fully audit fexecve(2) and its arguments.
Obtained from:	TrustedBSD Project
Sponsored by:	Google, Inc.
2008-08-25 13:50:01 +00:00
Robert Watson
5ae504055a Regenerate following r182123. 2008-08-24 21:23:08 +00:00
Robert Watson
e484af13ed When MPSAFE ttys were merged, a new BSM audit event identifier was
allocated for posix_openpt(2).  Unfortunately, that identifier
conflicts with other events already allocated to other systems in
OpenBSM.  Assign a new globally unique identifier and conform
better to the AUE_ event naming scheme.

This is a stopgap until a new OpenBSM import is done with the
correct identifier, so we'll maintain this as a local diff in svn
until then.

Discussed with:	ed
Obtained from:	TrustedBSD Project
2008-08-24 21:20:35 +00:00
Christian S.J. Peron
e451733718 Remove worrying printf warning on bootup when processing vnodes which
have NULL mount-points.  This is the case for special vnodes, such as the
one used in nameiinit() which is used for crossing mount points in lookup()
to avoid  lock ordering issues.

MFC after:	2 weeks
Discussed with:	rwatson, kib
2008-08-24 20:16:44 +00:00
Ed Schouten
1a643b0f02 Allow the user to suppress the rate-limited pty(4) warning.
The pty(4) driver raises up to warnings when an old BSD-style PTY is
created. The reason why I added this warning, was to make it easier to
spot applications that allocate BSD-style PTY's, while they should just
use openpty() or posix_openpt().

Add a sysctl, which allows you to override the number of remaining
messages, making it possible to suppress the warnings.

Requested by:	kib
Reviewed by:	kib
2008-08-23 16:03:00 +00:00
Robert Watson
6356dba0b4 Introduce two related changes to the TrustedBSD MAC Framework:
(1) Abstract interpreter vnode labeling in execve(2) and mac_execve(2)
    so that the general exec code isn't aware of the details of
    allocating, copying, and freeing labels, rather, simply passes in
    a void pointer to start and stop functions that will be used by
    the framework.  This change will be MFC'd.

(2) Introduce a new flags field to the MAC_POLICY_SET(9) interface
    allowing policies to declare which types of objects require label
    allocation, initialization, and destruction, and define a set of
    flags covering various supported object types (MPC_OBJECT_PROC,
    MPC_OBJECT_VNODE, MPC_OBJECT_INPCB, ...).  This change reduces the
    overhead of compiling the MAC Framework into the kernel if policies
    aren't loaded, or if policies require labels on only a small number
    or even no object types.  Each time a policy is loaded or unloaded,
    we recalculate a mask of labeled object types across all policies
    present in the system.  Eliminate MAC_ALWAYS_LABEL_MBUF option as it
    is no longer required.

MFC after:	1 week ((1) only)
Reviewed by:	csjp
Obtained from:	TrustedBSD Project
Sponsored by:	Apple, Inc.
2008-08-23 15:26:36 +00:00
John Baldwin
969bf150df Fix a race condition with concurrent LOOKUP namecache operations for a vnode
not in the namecache when shared lookups are enabled (vfs.lookup_shared=1,
it is currently off by default) and the filesystem supports shared lookups
(e.g. NFS client).  Specifically, if multiple concurrent LOOKUPs both miss
in the name cache in parallel, each of the lookups may each end up adding an
entry to the namecache resulting in duplicate entries in the namecache
for the same pathname.  A subsequent removal of the mapping of that
pathname to that vnode (via remove or rename) would only evict one of the
entries from the name cache.  As a result, subseqent lookups for that
pathname would still return the old vnode.

This race was observed with shared lookups over NFS where a file was updated
by writing a new file out to a temporary file name and then renaming that
temporary file to the "real" file to effect atomic updates of a file.  Other
processes on the same client that were periodically reading the file would
occasionally receive an ESTALE error from open(2) because the VOP_GETATTR()
in nfs_open() would receive that error when given the stale vnode.

The fix here is to check for duplicates in cache_enter() and just return
if an entry for this same directory and leaf file name for this vnode is
already in the cache.  The check for duplicates is done by walking the
per-vnode list of name cache entries.  It is expected that this list should
be very small in the common case (usually 0 or 1 entries during a
cache_enter() since most files only have 1 "leaf" name).

Reviewed by:	ups, scottl
MFC after:	2 months
2008-08-23 15:13:39 +00:00
Ed Schouten
ce570f82cc Remove unused tty_gone() checks inside ttyoutq_read_uio().
When my earlier MPSAFE TTY prototypes still implemented line
disciplines, we needed a mechanism to abort read()'s on PTY master
devices when inside the line discipline. Because this is no longer the
case, these checks have become unneeded.
2008-08-23 13:32:21 +00:00
Craig Rodrigues
d5bdb2f68d In nmount(), when we see the "force" option,
set the MNT_FORCE flag, but do not persist "force"
in the options list, since it is a command, not a persistent property
of a mount.

Similarly, when we see "reload", set MNT_RELOAD,
but delete "reload" from the options list.

MFC after:	1 week
2008-08-23 01:16:09 +00:00
Kip Macy
6205924afd Submit a band-aid for interrupt set up race.
MFC after:	1 month
2008-08-22 23:24:53 +00:00
Ed Schouten
0f0a7c27c5 Fix two small bugs in tcsetattr().
- According to POSIX, tcsetattr() must not fail when any of the bits in
  the structure are unsupported, but it must leave the unsupported flags
  alone.

- The CIGNORE flag (set by TCSASOFT, extension) was not cleared from
  c_cflag, which means using it would cause it to be applied during its
  entire lifespan. Eventually make sure we clear the flag.

I don't really like CIGNORE, but I think we must keep it alive right
now. With our new TTY layer, we don't actually need this mechanism,
because if you leave c_cflag, c_ispeed and c_ospeed alone, we won't make
a call into the device driver anyway.

Reported by:	naddy
Tested by:	naddy
2008-08-22 21:27:37 +00:00
John Baldwin
7847a9daec A suspended thread can, in fact, be swapped out. Thus,
thread_unsuspend_one() needs to optionally wakeup the swapper.  Since we
hold the thread lock for that entire function, however, we have to push
that requirement up into the caller.

Found by:	rwatson
2008-08-22 16:15:58 +00:00
John Baldwin
814f26da8a Use |= rather than += when aggregrating requests to wakeup the swapper.
What we really want is an inclusive or of all the requests, and += can
in theory roll over to 0.
2008-08-22 16:14:23 +00:00
Ed Schouten
6137be4386 Fix pts(4) error codes when slave device is closed.
Unlike pre-MPSAFE TTY, the pts(4) driver always returned ENXIO when a
read() or write() was performed on a pseudo-terminal master device when
the slave device was not opened. The old implementation had different
semantics:

- When the slave device had not been opened yet, read() and write() just
  blocked.
- When the slave device had been closed, a read() call would return 0
  bytes length.
- When the slave device had been closed, a write() call would return
  EIO.

Change the new implementation to return 0 and EIO as well. We don't
implement the first rule, but I suspect this is not needed, because
routines like openpty() also open the slave device node. posix_openpt()
users also do similar things.

Reported by:	rink
Tested by:	rink
2008-08-22 10:40:21 +00:00
Ed Schouten
7dc843ca92 Prevent VSTART flooding when turning on software flow control.
It turned out we transmitted VSTART after each successful read on a TTY
when software flow control was turned on. This was because of a very
evil bug where we tested the TF_HIWAT_IN flag the other way around.

Reported by:	Christian Weisgerber <naddy mips inka de>
2008-08-22 05:15:52 +00:00
David E. O'Brien
35c316caaf Add comments on NOARGS, NODEF, and NOPROTO. 2008-08-21 22:57:31 +00:00
Ed Schouten
40572ab385 Properly lock proctree_lock before locking the process while accounting.
During the import of the MPSAFE TTY layer (r181905), I changed
acct_process() to lock proctree_lock instead of SESS_LOCK, because
s_ttyp is now locked using proctree_lock. One of the things I forgot,
was to lock it before we PROC_LOCK.

Commit this patch, written by kib@. To ensure we hold proctree_lock as
short as possible, obtaining `ac_tty' has now been made the first step
of filling `acct'.

Reported by:	Kevin <kevinxlinuz 163 com>
Solved by:	kib
2008-08-21 15:02:17 +00:00
Ed Schouten
040b1db930 Remove the now unused `lbolt' variable from the kernel.
We used to have a single wait channel inside the kernel which could be
used by threads that just wanted to sleep for some time (the next
second). The old TTY layer was the only piece of code that still used
lbolt, because I already removed the use of lbolt from the NFS clients
and the VFS syncer.

Approved by:	philip
2008-08-20 12:20:22 +00:00
Kip Macy
fc3a86f6e9 remove scheduler_running as xenbus no longer needs it
MFC after:	1 month
2008-08-20 09:21:24 +00:00
Ed Schouten
18cf135421 Update system call tables.
The previous commit also included changes to all the system call lists,
but it is a tradition to update these lists in a second commit, so rerun
make sysent to update the $FreeBSD$ tags inside these files to refer to
the latest version of syscalls.master.

Requested by:	rwatson
2008-08-20 08:39:10 +00:00