Commit Graph

139717 Commits

Author SHA1 Message Date
John Baldwin
6bc1e9cd84 Rework the lifetime management of the kernel implementation of POSIX
semaphores.  Specifically, semaphores are now represented as a new file
descriptor type that is set to close on exec.  This removes the need for
all of the manual process reference counting (and fork, exec, and exit
event handlers) as the normal file descriptor operations handle all of
that for us nicely.  It is also suggested as one possible implementation
in the spec and at least one other OS (OS X) uses this approach.

Some bugs that were fixed as a result include:
- References to a named semaphore whose name is removed still work after
  the sem_unlink() operation.  Prior to this patch, if a semaphore's name
  was removed, valid handles from sem_open() would get EINVAL errors from
  sem_getvalue(), sem_post(), etc.  This fixes that.
- Unnamed semaphores created with sem_init() were not cleaned up when a
  process exited or exec'd.  They were only cleaned up if the process
  did an explicit sem_destroy().  This could result in a leak of semaphore
  objects that could never be cleaned up.
- On the other hand, if another process guessed the id (kernel pointer to
  'struct ksem') of an unnamed semaphore (created via sem_init()) and had
  write access to the semaphore based on UID/GID checks, then that other
  process could manipulate the semaphore via sem_destroy(), sem_post(),
  sem_wait(), etc.
- As part of the permission check (UID/GID), the umask of the process
  creating the semaphore was not honored.  Thus if your umask denied group
  read/write access but the explicit mode in the sem_init() call allowed
  it, the semaphore would be readable/writable by other users in the
  same group, for example.  This includes access via the previous bug.
- If the module refused to unload because there were active semaphores,
  then it might have deregistered one or more of the semaphore system
  calls before it noticed that there was a problem.  I'm not sure if
  this actually happened as the order that modules are discovered by the
  kernel linker depends on how the actual .ko file is linked.  One can
  make the order deterministic by using a single module with a mod_event
  handler that explicitly registers syscalls (and deregisters during
  unload after any checks).  This also fixes a race where even if the
  sem_module unloaded first it would have destroyed locks that the
  syscalls might be trying to access if they are still executing when
  they are unloaded.

  XXX: By the way, deregistering system calls doesn't do any blocking
  to drain any threads from the calls.
- Some minor fixes to errno values on error.  For example, sem_init()
  isn't documented to return ENFILE or EMFILE if we run out of semaphores
  the way that sem_open() can.  Instead, it should return ENOSPC in that
  case.

Other changes:
- Kernel semaphores now use a hash table to manage the namespace of
  named semaphores in a fashion similar to the POSIX shared memory
  object file descriptors.  Kernel semaphores can now also have names
  longer than 14 chars (up to MAXPATHLEN) and can include subdirectories
  in their pathname.
- The UID/GID permission checks for access to a named semaphore are now
  done via vaccess() rather than a home-rolled set of checks.
- Now that kernel semaphores have an associated file object, the various
  MAC checks for POSIX semaphores accept both a file credential and an
  active credential.  There is also a new posixsem_check_stat() since it
  is possible to fstat() a semaphore file descriptor.
- A small set of regression tests (using the ksem API directly) is present
  in src/tools/regression/posixsem.

Reported by:	kris (1)
Tested by:	kris
Reviewed by:	rwatson (lightly)
MFC after:	1 month
2008-06-27 05:39:04 +00:00
Robert Watson
02f4879d3a Introduce locking around use of ifindex_table, whose use was previously
unsynchronized.  While races were extremely rare, we've now had a
couple of reports of panics in environments involving large numbers of
IPSEC tunnels being added very quickly on an active system.

- Add accessor functions ifnet_byindex(), ifaddr_byindex(),
  ifdev_byindex() to replace existing accessor macros.  These functions
  now acquire the ifnet lock before dereferencing the table.
- Add IFNET_WLOCK_ASSERT().
- Add static accessor functions ifnet_setbyindex(), ifdev_setbyindex(),
  which set values in the table either asserting or acquiring the ifnet
  lock.
- Use accessor functions throughout if.c to modify and read
  ifindex_table.
- Rework ifnet attach/detach to lock around ifindex_table modification.

Note that these changes simply close races around use of ifindex_table,
and make no attempt to solve the problem of disappearing ifnets.  Further
refinement of this work, including with respect to ifindex_table
resizing, is still required.

In a future change, the ifnet lock should be converted from a mutex to an
rwlock in order to reduce contention.

Reviewed and tested by:	brooks
2008-06-26 23:05:28 +00:00
Julian Elischer
a54eadd8c4 change a variable name to stop it from colliding with other names in
some situations. (i.e. in vimage)

MFC after:	1 week
2008-06-26 22:59:49 +00:00
Julian Elischer
9dcc73ed79 Someone cut and pasted a bunch of stuff here so lots of
indents were spaces when they should have been tabs,
screwing up diffs and patches..

Whitespace commit as my first SVN commit. (yay)

MFC after:	1 week
2008-06-26 22:45:04 +00:00
John Baldwin
2137b017d7 Tweak the output of event log messages from the controller:
- Each log entry contains a text description in the "description" field of
  the entry.  The existing decode logic always ended up duplicating
  information that was already in the description string.  This made the
  logs overly verbose.  Now we just print out the description string.
- Add some simple parsing of the timestamp and event classes.

Reviewed by:	ambrisko, scottl
MFC after:	2 weeks
2008-06-26 22:36:38 +00:00
John Baldwin
c1ed06a84b Adjust the handling of pending log events during boot:
- Fetch events from the controller in batches of 15 rather than a single
  event at a time.
- When fetching events from the controller, honor the event class and
  locale settings (via hw.mfi tunables).  This also allows the firmware to
  skip over unwanted log entries resulting in fewer requests to the
  controller if there are many unwanted log entries since the last clean
  shutdown.
- Don't drop the driver mutex while decoding an event.
- If we get an error other than MFI_STAT_NOT_FOUND (basically EOF for
  hitting the end of the event log) then emit a warning and bail on
  processing further log entries.

Reviewed by:	ambrisko, scottl
MFC after:	2 weeks
2008-06-26 22:33:24 +00:00
John Baldwin
62344da1e6 Fix compile on 64-bit platforms. 2008-06-26 21:26:34 +00:00
Andrew Thompson
39978059cc Remove the non-existent rt2860 subdir. Note, the ralfw module is not used in
the build yet.

PR:		kern/125015
Submitted by:	Dan Cojocar
2008-06-26 18:58:01 +00:00
Tim Kientzle
6986afe53e As reported by Alexey Shuvaev, -dumpl overwrote files after
linking them, with predictably bad results.
2008-06-26 15:46:01 +00:00
John Baldwin
f4c1db8901 Change SEM_VALUE_MAX (maximum value of a POSIX semaphore) from UINT_MAX
to INT_MAX.  Otherwise, a process could create a semaphore (or increase
its value via ksem_post()) beyond INT_MAX and sem_getvalue() would return
a negative value.  sem_getvalue() is only supposed to return a negative
value if that is the number of waiters for that semaphore.

MFC after:	2 weeks
2008-06-26 13:51:25 +00:00
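
The reasoning behind the SEM_VALUE_MAX change in f4c1db8901 can be illustrated with a small sketch. This is not the kernel code: `struct demo_sem`, `DEMO_SEM_VALUE_MAX`, `demo_sem_post()` and `demo_sem_getvalue()` are hypothetical names invented here. The sketch assumes the count is stored unsigned and only shows why capping it at INT_MAX keeps the value reported through an `int` non-negative.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a semaphore; not the kernel's struct ksem. */
struct demo_sem {
    unsigned int value;
};

/* Assumed cap, mirroring the commit's choice of INT_MAX over UINT_MAX. */
#define DEMO_SEM_VALUE_MAX ((unsigned int)INT_MAX)

/* Increment the count only if the result still fits in an int. */
static bool
demo_sem_post(struct demo_sem *s)
{
    assert(s != NULL);
    if (s->value >= DEMO_SEM_VALUE_MAX)
        return (false);     /* would overflow the int range */
    s->value++;
    return (true);
}

/* Report the count as an int; the cap guarantees it is never negative. */
static int
demo_sem_getvalue(const struct demo_sem *s)
{
    assert(s != NULL);
    return ((int)s->value);
}
```

With a UINT_MAX cap the cast in `demo_sem_getvalue()` could receive a value above INT_MAX, which is how a caller could end up seeing a negative result.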
John Baldwin
127cc7673d Add missing counter increments for posix shm checks. 2008-06-26 13:49:32 +00:00
Daniel Gerzo
aa2a33b4fa - add description of the MLINK error
PR:		docs/123019
MFC after:	3 days
2008-06-26 12:15:38 +00:00
Dag-Erling Smørgrav
c7dd6fa2c9 Some tests won't build at WARNS level 6 due to aliasing violations.
Add missing -I. so the tests will build when ${.OBJDIR} != ${.CURDIR}.
${.OBJDIR} does not need to be spelled out.
2008-06-26 11:58:26 +00:00
Dag-Erling Smørgrav
f9145f3547 Add regression test for CRC32 check. The test file has been modified to
include an invalid checksum for file2.

Approved by:	kientzle
2008-06-26 11:50:11 +00:00
Dag-Erling Smørgrav
c7d703c46a Implement CRC32 verification. Note that you have to read until EOF to
trigger the check.

Requested by:	ache
Approved by:	kientzle
2008-06-26 11:48:19 +00:00
Dag-Erling Smørgrav
e2157b51de Allow the tests to build without libdmalloc. 2008-06-26 10:53:05 +00:00
Doug Rabson
c675522fc4 Re-implement the client side of rpc.lockd in the kernel. This implementation
provides the correct semantics for flock(2) style locks which are used by the
lockf(1) command line tool and the pidfile(3) library. It also implements
recovery from server restarts and ensures that dirty cache blocks are written
to the server before obtaining locks (allowing multiple clients to use file
locking to safely share data).

Sponsored by:	Isilon Systems
PR:		94256
MFC after:	2 weeks
2008-06-26 10:21:54 +00:00
Daniel Gerzo
91bc389e54 Mark the section describing return values with an appropriate section flag.
PR:		docs/122818
MFC after:	3 days
2008-06-26 08:24:59 +00:00
Ruslan Ermilov
cae17430bf Fix a fallout from SSP commit, and make this compile again.
Bonus: including kern.mk just to pick kernel warning flags
was an extremely bad idea anyway, because it also picked
up CFLAGS (it probably wasn't the case at the time of CVS
rev. 1.1, I haven't checked).  Remove duplicate CWARNFLAGS
from CFLAGS.
2008-06-26 07:56:16 +00:00
Ruslan Ermilov
d03c587ffa Fix a chicken-and-egg problem: this file implements SSP support,
so we cannot compile it with -fstack-protector[-all] flags (or
it will self-recurse); this is ensured in sys/conf/files.  This
OTOH means that checking for defines __SSP__ and __SSP_ALL__ to
determine if we should be compiling the support is impossible
(which it was trying, resulting in an empty object file).  Fix
this by always compiling the symbols in this file.  It's good
because it allows us to always have SSP support, and then compile
with SSP selectively.

Reported by:	tinderbox
2008-06-26 07:52:45 +00:00
Mike Makonnen
34a087543a Gcc barfs in glob.c when run with -O3. To fix this make g_strchr() work on
and return (const Char *) pointers instead of just (Char *) and get rid of
all the type casting.

PR:		kern/124334
2008-06-26 07:12:35 +00:00
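
The const-correct approach described in 34a087543a can be sketched as follows. This is an illustration only: `demo_strchr()` is a hypothetical name, and it operates on plain `char` rather than the `Char` type used in glob.c. The point is that a search function taking and returning `const` pointers needs no casts internally.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Find the first occurrence of ch in s, including the terminating NUL.
 * Both the argument and the result are const, so no casts are needed.
 */
static const char *
demo_strchr(const char *s, int ch)
{
    assert(s != NULL);
    do {
        if (*s == (char)ch)
            return (s);
    } while (*s++ != '\0');
    return (NULL);
}
```

A caller that only reads the string can keep the result in a `const char *` and never needs to cast away constness.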
Mike Makonnen
186f2eea49 The signature for a pthread function requires that it
return a pointer to a void. The send_thread() and disk_thread()
functions, however, do not have a return value because they run for
the duration of the daemon's lifetime. This causes gcc to barf when
running with -O3. Make these functions return a null pointer to quiet it.

PR:	bin/124342
Submitted by:	Garrett Cooper <gcooper@FreeBSD.org> (minus his comments)
MFC after:	1 week
2008-06-26 07:05:35 +00:00
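
As a sketch of the fix described in 186f2eea49: a function with the shape of a pthread start routine, `void *(*)(void *)`, should return a value on every path, even when its loop normally runs for the life of the program. The names `struct demo_worker` and `demo_thread()` are hypothetical, and the loop here is bounded only so the example terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical worker state used only for this illustration. */
struct demo_worker {
    int remaining;  /* iterations left before the loop ends */
    int processed;  /* iterations completed so far */
};

/* Same signature as a pthread start routine. */
static void *
demo_thread(void *arg)
{
    struct demo_worker *w = arg;

    assert(w != NULL);
    while (w->remaining > 0) {
        w->remaining--;
        w->processed++;
    }
    /* Explicit return so the non-void function never falls off the end. */
    return (NULL);
}
```

The same function could be passed to pthread_create() unchanged; it is called directly here to keep the sketch free of threading setup.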
Maxim Sobolev
cb45b78eae Fix 6-year old cut&paste error. The # could be escaped with '\', not
with '\\'.

MFC after:	2 weeks
2008-06-26 07:02:47 +00:00
Tim Kientzle
8b88e9591a Split out the reference zip file for ease of maintenance. 2008-06-26 04:48:42 +00:00
Ruslan Ermilov
5c1eb5ea14 Regen properly. 2008-06-25 21:42:23 +00:00
Ruslan Ermilov
5a9bc08994 Regen. 2008-06-25 21:36:25 +00:00
Ruslan Ermilov
042df2e2da Enable GCC stack protection (aka Propolice) for userland:
- It is opt-out for now so as to give it maximum testing, but it may be
  turned opt-in for stable branches depending on the consensus.  You
  can turn it off with WITHOUT_SSP.
- WITHOUT_SSP was previously used to disable the build of GNU libssp.
  It is harmless to steal the knob as SSP symbols have been provided
  by libc for a long time and GNU libssp should not have been much used.
- SSP is disabled in a few corners such as system bootstrap programs
  (sys/boot), process bootstrap code (rtld, csu) and SSP symbols themselves.
- It should be safe to use -fstack-protector-all to build world, however
  libc will be automatically downgraded to -fstack-protector because it
  breaks rtld otherwise.
- This option is unavailable on ia64.

Enable GCC stack protection (aka Propolice) for kernel:
- It is opt-out for now so as to give it maximum testing.
- Do not compile your kernel with -fstack-protector-all, it won't work.

Submitted by:	Jeremie Le Hen <jeremie@le-hen.org>
2008-06-25 21:33:28 +00:00
Marius Strobl
0d9e99b6ca Use "__asm __volatile" rather than "__asm" for instruction sequences
that modify condition codes (the carry bit, in this case). Without
"__volatile", the compiler might add the inline assembler instructions
between unrelated code which also uses condition codes, modifying the
latter.
This prevents the TCP pseudo header checksum calculation done in
tcp_output() from having effects on other conditions when compiled
with GCC 4.2.1 at "-O2" and "options INET6" left out. [1]

Reported & tested by:	Boris Kochergin [1]
MFC after:		3 days
2008-06-25 21:04:59 +00:00
Marius Strobl
1239136645 Given that sun4u uses sparc64/sparc64/in_cksum.c, use the sparc64
<machine/in_cksum.h> here also.

MFC after:	3 days
2008-06-25 21:03:26 +00:00
Ruslan Ermilov
896eafd957 src/compat/ has been gone since March.
Reported by:	Mars G Miro
2008-06-25 20:29:22 +00:00
Bjoern A. Zeeb
9a8398173d Document spindown constraints as given in the original commit
message[1] and later clarification provided by phk.

[1] http://docs.freebsd.org/cgi/mid.cgi?200803171033.m2HAXOeN055116

Reviewed by:	brueffer, phk, ed
2008-06-25 18:11:22 +00:00
Ed Schouten
9d7a57e916 Remove the unused M_MEMDEV from the kernel.
The M_MEMDEV memory allocation pool does not seem to be used. We can
live without it.

Approved by:	philip (mentor)
2008-06-25 07:52:10 +00:00
Ed Schouten
721351876c Remove the unused major/minor numbers from iodev and memdev.
Now that st_rdev is being automatically generated by the kernel, there
is no need to define static major/minor numbers for the iodev and
memdev. We still need the minor numbers for the memdev, however, to
distinguish between /dev/mem and /dev/kmem.

Approved by:	philip (mentor)
2008-06-25 07:45:31 +00:00
Alex Dupre
172b9da045 Fix links to online gcc docs.
Reported by:	Andre Guibert de Bruet <andy@siliconlandmark.com>
MFC after:	1 day
2008-06-25 06:07:03 +00:00
Tim Kientzle
634d062e6a Pass the entry down into the core write loop, so we
can include the filename when reporting errors.

Thanks to: Dan Nelson
2008-06-25 05:01:02 +00:00
Garrett Wollman
603609c7f4 Months in English are capitalized (even when abbreviated). 2008-06-25 04:56:08 +00:00
Mike Makonnen
522b9831bd Quiet rc.d/syscons unless it has something to say. 2008-06-24 21:01:56 +00:00
Jung-uk Kim
1427b09672 Emit opcodes closer to GNU as(1) generated codes and micro-optimize. 2008-06-24 20:12:44 +00:00
Jung-uk Kim
b86977a5ab Emit opcodes closer to GNU as(1) generated codes and micro-optimize. 2008-06-24 20:12:12 +00:00
George V. Neville-Neil
a13c239b91 Make it simpler to build netgraph modules outside of the kernel source
tree.  This change follows similar ones in the device tree.

MFC after:	2 weeks
2008-06-24 18:49:49 +00:00
Tim Kientzle
e6c78aec4f In -p mode, don't guard against '..' in paths. We continue to
check in -i mode unless --insecure is specified.

PR: bin/124924
2008-06-24 15:18:40 +00:00
Oleksandr Tymoshenko
cf77b84879 In case of interface initialization failure remove struct in_ifaddr* from
in_ifaddrhashtbl in in_ifinit because error handler in in_control removes
entries only for AF_INET addresses. If in_ifinit is called for the cloned
interface that has just been created its address family is not AF_INET and
therefore LIST_REMOVE is not called for respective LIST_INSERT_HEAD and
freed entries remain in in_ifaddrhashtbl and lead to memory corruption.

PR:	kern/124384
2008-06-24 13:58:28 +00:00
David Xu
7de1ecef2d Add two commands to the _umtx_op system call to allow a simple mutex to be
locked and unlocked completely in userland.  Locking and unlocking the mutex
in userland reduces the total time a thread holds it.  In some application
code a mutex only protects a small piece of code whose execution time is
less than that of a simple system call.  In the current implementation, if
lock contention happens, the lock holder has to extend its locking time and
enter the kernel to unlock the mutex.  This change avoids that disadvantage:
it first sets the mutex to the free state and then enters the kernel to wake
one waiter up.  This improves performance dramatically in some sysbench
mutex tests.

Tested by: kris
Sounds great: jeff
2008-06-24 07:32:12 +00:00
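
The unlock ordering described in 7de1ecef2d (mark the mutex free in userland first, then wake one waiter) can be sketched with C11 atomics. This is not the _umtx_op interface: every `demo_` name is hypothetical, the three-state encoding is an assumption, and `demo_wake_one()` is a counter standing in for the kernel call that would wake a waiter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed state encoding for this sketch only. */
enum { DEMO_FREE = 0, DEMO_LOCKED = 1, DEMO_CONTESTED = 2 };

struct demo_mutex {
    atomic_int state;
    int wakeups;    /* counts calls to the stand-in wake hook */
};

static void
demo_init(struct demo_mutex *m)
{
    assert(m != NULL);
    atomic_init(&m->state, DEMO_FREE);
    m->wakeups = 0;
}

/* Stand-in for entering the kernel to wake one waiter. */
static void
demo_wake_one(struct demo_mutex *m)
{
    m->wakeups++;
}

/* Uncontended lock: a single compare-and-swap in userland. */
static bool
demo_trylock(struct demo_mutex *m)
{
    int expected = DEMO_FREE;

    return (atomic_compare_exchange_strong(&m->state, &expected,
        DEMO_LOCKED));
}

/* A waiter records that the lock is contested before it would sleep. */
static bool
demo_mark_contested(struct demo_mutex *m)
{
    int expected = DEMO_LOCKED;

    return (atomic_compare_exchange_strong(&m->state, &expected,
        DEMO_CONTESTED));
}

/* Release first, then wake: the lock is already free during the wakeup. */
static void
demo_unlock(struct demo_mutex *m)
{
    int old = atomic_exchange(&m->state, DEMO_FREE);

    if (old == DEMO_CONTESTED)
        demo_wake_one(m);
}
```

Because the state is set to free before the wake hook runs, another thread can acquire the mutex while the previous holder is still busy with the wakeup, which is the hold-time reduction the commit message describes.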
Ed Maste
ef0b687ced Fix test for waiting AIFs in aac_poll(). This seems to solve the
problem where Adaptec's arcconf monitoring tool hangs after producing
its expected output.

Submitted by:	Adaptec, via driver ver 15317
MFC after:	1 week
2008-06-24 03:26:41 +00:00
Jung-uk Kim
6a9748abc8 Rehash and clean up BPF JIT compiler macros to match AT&T notations. 2008-06-23 23:10:11 +00:00
Jung-uk Kim
292f013c88 Rehash and clean up BPF JIT compiler macros to match AT&T notations. 2008-06-23 23:09:52 +00:00
Mike Makonnen
45a5dc937d Add a -q flag to swapon(8) to suppress informational messages. Use it in
rc.d.
Note: errors are not affected by this flag.
2008-06-23 22:17:08 +00:00
Mike Makonnen
d9fcd86c3a The sysctl(8) program exits on some errors and only emits warnings on
others. In the case where it displayed warnings it would still return
successfully. Modify it so that it returns the number of sysctls that
it was not able to set.

Make use of this in rc.d to display only *unsuccessful* attempts to
set sysctls.
2008-06-23 22:06:28 +00:00
John Baldwin
c4f3a35a54 Remove the posixsem_check_destroy() MAC check. It is semantically identical
to doing a MAC check for close(), but no other types of close() (including
close(2) and ksem_close(2)) have MAC checks.

Discussed with:	rwatson
2008-06-23 21:37:53 +00:00
Mike Makonnen
2794059010 Run savecore(8) only if there is a core dump to save. If there is
no core dump hide the message to that effect behind $rc_quiet.
2008-06-23 20:54:32 +00:00