12234 Commits

Author SHA1 Message Date
mdf
d825d95c9f Add an option to have a fail point term only execute when run by a
specified pid.  This is helpful for automated testing involving a global
knob that would otherwise be executed by many other threads.

MFC after: 1 week
2011-07-08 20:41:12 +00:00
mdf
def1e657f3 style(9) and cleanup fixes.
MFC after: 1 week
2011-07-08 20:41:07 +00:00
jonathan
7c2c726167 Fix the "passability" test in fdcopy().
Rather than checking to see if a descriptor is a kqueue, check to see if
its fileops flags include DFLAG_PASSABLE.

At the moment, these two tests are equivalent, but this will change with
the addition of capabilities that wrap kqueues but are themselves of type
DTYPE_CAPABILITY. We already have the DFLAG_PASSABLE abstraction, so let's
use it.

This change has been tested with [the newly improved] tools/regression/kqueue.

Approved by: mentor (rwatson), re (Capsicum blanket)
Sponsored by: Google Inc
2011-07-08 12:19:25 +00:00
andre
41fc3214df In the experimental soreceive_stream():
o Move the non-blocking socket test below the SBS_CANTRCVMORE so that EOF
   is correctly returned on a remote connection close.
 o In the non-blocking socket test compare SS_NBIO against the so->so_state
   field instead of the incorrect sb->sb_state field.
 o Simplify the ENOTCONN test by removing cases that can't occur.

Submitted by:	trociny (with some further tweaks by committer)
Tested by:	trociny
2011-07-08 10:50:13 +00:00
trasz
b468f23f53 Style fix - macros are supposed to be uppercase. 2011-07-07 17:44:42 +00:00
andre
19bc1179a6 Remove the TCP_SORECEIVE_STREAM compile time option. The use of
soreceive_stream() for TCP still has to be enabled with the loader
tuneable net.inet.tcp.soreceive_stream.

Suggested by:	trociny and others
2011-07-07 10:37:14 +00:00
trasz
4a17b24427 All the racct_*() calls need to happen with the proc locked. Fixing this
won't happen before 9.0.  This commit adds "#ifdef RACCT" around all the
"PROC_LOCK(p); racct_whatever(p, ...); PROC_UNLOCK(p)" instances, in order
to avoid useless locking/unlocking in kernels built without "options RACCT".
2011-07-06 20:06:44 +00:00
marius
97f9011cd8 Call pmap_qremove() before freeing or unwiring the pages, otherwise
there's a window during which a page can be re-used before its previous
mapping is removed.

Reviewed by:	alc
MFC after:	1 week
2011-07-05 18:40:37 +00:00
jonathan
6abbb93d5f Rework _fget to accept capability parameters.
This new version of _fget() requires new parameters:
- cap_rights_t needrights
    the rights that we expect the capability's rights mask to include
    (e.g. CAP_READ if we are going to read from the file)

- cap_rights_t *haverights
    used to return the capability's rights mask (ignored if NULL)

- u_char *maxprotp
    the maximum mmap() rights (e.g. VM_PROT_READ) that can be permitted
    (only used if we are going to mmap the file; ignored if NULL)

- int fget_flags
    FGET_GETCAP if we want to return the capability itself, rather than
    the underlying object which it wraps

Approved by: mentor (rwatson), re (Capsicum blanket)
Sponsored by: Google Inc
2011-07-05 13:45:10 +00:00
jonathan
bf3c575ea1 Add kernel functions to unwrap capabilities.
cap_funwrap() and cap_funwrap_mmap() unwrap capabilities, exposing the
underlying object. Attempting to unwrap a capability with an inadequate
rights mask (e.g. calling cap_funwrap(fp, CAP_WRITE | CAP_MMAP, &result)
on a capability whose rights mask is CAP_READ | CAP_MMAP) will result in
ENOTCAPABLE.

Unwrapping a non-capability is effectively a no-op.

These functions will be used by Capsicum-aware versions of _fget(), etc.

Approved by: mentor (rwatson), re (Capsicum blanket)
Sponsored by: Google Inc
2011-07-04 14:40:32 +00:00
attilio
364d0522f7 With retirement of cpumask_t and usage of cpuset_t for representing a
mask of CPUs, pc_other_cpus and pc_cpumask become highly inefficient.

Remove them and replace their usage with custom pc_cpuid magic (as,
atm, pc_cpumask can be easilly represented by (1 << pc_cpuid) and
pc_other_cpus by (all_cpus & ~(1 << pc_cpuid))).

This change is not targeted for MFC because of struct pcpu members
removal and dependency by cpumask_t retirement.

MD review by:	marcel, marius, alc
Tested by:	pluknet
MD testing by:	marcel, marius, gonzo, andreast
2011-07-04 12:04:52 +00:00
bz
9cad5bfef3 Add infrastructure to allow all frames/packets received on an interface
to be assigned to a non-default FIB instance.

You may need to recompile world or ports due to the change of struct ifnet.

Submitted by:	cjsp
Submitted by:	Alexander V. Chernikov (melifaro ipfw.ru)
		(original versions)
Reviewed by:	julian
Reviewed by:	Alexander V. Chernikov (melifaro ipfw.ru)
MFC after:	2 weeks
X-MFC:		use spare in struct ifnet
2011-07-03 12:22:02 +00:00
ed
49a6dac46f Reintroduce the cioctl() hook in the TTY layer for digi(4).
The cioctl() hook can be used by drivers to add ioctls to the *.init and
*.lock devices. This commit breaks the ttydevsw ABI, since this
structure didn't provide any padding. To prevent ABI breakage in the
future, add a tsw_spare.

Submitted by:	Peter Jeremy <peter jeremy alcatel lucent com>
Obtained from:	kern/152254 (slightly modified)
2011-07-02 13:54:20 +00:00
jonathan
4d4c5b3285 When Capsicum starts creating capabilities to wrap existing file
descriptors, we will want to allocate a new descriptor without installing
it in the FD array.

Split falloc() into falloc_noinstall() and finstall(), and rewrite
falloc() to call them with appropriate atomicity.

Approved by: mentor (rwatson), re (bz)
2011-06-30 15:22:49 +00:00
jonathan
8c932faae4 Add some checks to ensure that Capsicum is behaving correctly, and add some
more explicit comments about what's going on and what future maintainers
need to do when e.g. adding a new operation to a sys_machdep.c.

Approved by: mentor(rwatson), re(bz)
2011-06-30 10:56:02 +00:00
alc
21902be08c Add a new option, OBJPR_NOTMAPPED, to vm_object_page_remove(). Passing this
option to vm_object_page_remove() asserts that the specified range of pages
is not mapped, or more precisely that none of these pages have any managed
mappings.  Thus, vm_object_page_remove() need not call pmap_remove_all() on
the pages.

This change not only saves time by eliminating pointless calls to
pmap_remove_all(), but it also eliminates an inconsistency in the use of
pmap_remove_all() versus related functions, like pmap_remove_write().  It
eliminates harmless but pointless calls to pmap_remove_all() that were being
performed on PG_UNMANAGED pages.

Update all of the existing assertions on pmap_remove_all() to reflect this
change.

Reviewed by:	kib
2011-06-29 16:40:41 +00:00
jonathan
624e733467 We may split today's CAPABILITIES into CAPABILITY_MODE (which has
to do with global namespaces) and CAPABILITIES (which has to do with
constraining file descriptors). Just in case, and because it's a better
name anyway, let's move CAPABILITIES out of the way.

Also, change opt_capabilities.h to opt_capsicum.h; for now, this will
only hold CAPABILITY_MODE, but it will probably also hold the new
CAPABILITIES (implying constrained file descriptors) in the future.

Approved by: rwatson
Sponsored by: Google UK Ltd
2011-06-29 13:03:05 +00:00
ed
4ef034d1ea Fix whitespace inconsistencies in the TTY layer and its drivers owned by me. 2011-06-26 18:26:20 +00:00
jonathan
7571a34069 Remove redundant Capsicum sysctl.
Since we're now declaring FEATURE(security_capabilities), there's no need for an explicit SYSCTL_NODE.

Approved by: rwatson
2011-06-25 12:37:06 +00:00
avg
74e5eef4ea unconditionally stop other cpus when entering kdb in smp system
... and thus retire debug.kdb.stop_cpus tunable/sysctl.
The knob was to work around CPU stopping issues, which since have been
either fixed or greatly reduced.  kdb should really operate in a special
environment with scheduler stopped and interrupts disabled to provide
deterministic debugging.

Discussed with:	attilio, rwatson
X-MFC after:	2 months or never
2011-06-25 10:28:16 +00:00
avg
57d68644db generic_stop_cpus: pull timeout logic from under DIAGNOSTIC
... and also increase the timeout.
It's better to try to proceed somehow despite stuck CPUs than to hang
indefinitely.  Especially so during shutdown and when entering kdb or panic.

Timeout value is still an aribitrary value.
Timeout diagnostic is just a printf; the work on something more
debuggable is planned by attilio.  Need to be careful here as
stop_cpus_hard is called very early while enetering kdb and soon(-ish)
it may become called very early when entering panic.

Reviewed by:	attilio
MFC after:	2 months
2011-06-25 10:01:43 +00:00
jonathan
d77535edb6 Tidy up a capabilities-related comment.
This comment refers to an #ifdef that hasn't been merged [yet?]; remove it.

Approved by: rwatson
2011-06-24 14:40:22 +00:00
jkim
6da60ac39e Set negative quality to TSC timecounter when C3 state is enabled for Intel
processors unless the invariant TSC bit of CPUID is set.  Intel processors
may stop incrementing TSC when DPSLP# pin is asserted, according to Intel
processor manuals, i. e., TSC timecounter is useless if the processor can
enter deep sleep state (C3/C4).  This problem was accidentally uncovered by
r222869, which increased timecounter quality of P-state invariant TSC, e.g.,
for Core2 Duo T5870 (Family 6, Model f) and Atom N270 (Family 6, Model 1c).

Reported by:	Fabian Keil (freebsd-listen at fabiankeil dot de)
		Ian FREISLICH (ianf at clue dot co dot za)
Tested by:	Fabian Keil (freebsd-listen at fabiankeil dot de)
		- Core2 Duo T5870 (C3 state available/enabled)
		jkim - Xeon X5150 (C3 state unavailable)
2011-06-22 16:40:45 +00:00
obrien
d17521adb2 Add comment from CSRG rev 7.27 (1992/06/23 19:56:55; author: mckusick) 2011-06-17 21:44:13 +00:00
kib
6e0462eab2 Do not trash the argv[0] pointer for an a.out process on amd64.
Found with the binary provided by joerg.
2011-06-16 22:00:59 +00:00
kib
f13e1b44c6 Fix silly typo that resulted in the a.out process stack to end at
~200MB instead of 3GB on amd64.
2011-06-16 21:59:16 +00:00
marcel
c7e47a0b81 Even if the loaded module has no symbols, we still need to notify
MD code about it and update the link map for GDB's use.
2011-06-16 17:41:21 +00:00
gibbs
6c5518eb45 sys/kern/subr_kdb.c:
Modify the "alternate break sequence" detecting state
	machine so that only a contiguous invocation of the
	break sequence is accepted.  The old implementation
	did not reset the state machine when detecting an
	unexpected character.

	While here, use an enum for the states of the machine
	instead of magic numbers.bmitted by:

Sponsored by:	Spectra Logic Corporation
2011-06-14 21:37:25 +00:00
obrien
f797e31a8d We should not return ECHILD when debugging a child and the parent does a
"wait4(-1, ..., WNOHANG, ...)".  Instead wait(2) should behave as if the
child does not wish to report status at this time.

Reviewed by:	jhb
2011-06-14 17:09:30 +00:00
gibbs
9d45c190c8 sys/sys/conf.h:
sys/kern/kern_conf.c:
	Add make_dev_physpath_alias().  This interface takes
	the parent cdev of the alias, an old alias cdev (if any)
	to replace with the newly created alias, and the physical
	path string.  The alias is visiable as a symlink to the
	parent, with the same name as the parent, rooted at
	physpath in devfs.

	Note: make_dev_physpath_alias() has hard coded knowledge of the
	      Solaris style prefix convention for physical path data,
	      "id1,".  In the future, I expect the convention to change
	      to allow "physical path quality" to be reported in the
	      prefix.  For example, a physical path based on NewBus
	      topology would be of "lower quality" than a physical path
	      reported by a device enclosure.

Sponsored by:	Spectra Logic Corporation
2011-06-14 16:29:43 +00:00
ken
b8dcfe0228 Instead of using an atomic operation to determine whether the devstat(9)
device node has been created, pass MAKEDEV_CHECKNAME in so that the devfs
code will do the check.

Use a regular static variable as before, that's good enough to keep us from
calling into devfs most of the time.

Suggested by:	kib
MFC after:	1 week
Sponsored by:	Spectra Logic Corporation
2011-06-13 22:08:24 +00:00
gibbs
f3f56007b6 Fix a couple of race conditions in devstat(9) initialization.
In devstat_new_entry(), there is no need to initialize the queue
and the mutex in this function.  There are ways to do static
initialization on both, so use STAILQ_HEAD_INITIALIZER and
MTX_SYSINIT to initialize the queue and the mutex.

In devstat_alloc(), use an atomic test and set routine to guard
making our entry in /dev.  Using just a plain static variable
creates a race condition on multiprocessor machines.  If you
attempt to create a second entry in devfs, the kernel will panic.

Submitted by:	kdm
Reviewed by:	gibbs
Sponsored by:	Spectra Logic Corporation
MFC after:	1 week.
2011-06-13 21:21:02 +00:00
jeff
02186557c4 - When printing bufs with show buf the lblkno is often more useful than
the blkno.  Print them both.
2011-06-10 22:15:36 +00:00
attilio
2514230a6b In the current code, a double panic condition may lead to dumps
interleaving.
Signal dumping to happen only for the first panic which should be the
most important.

Sponsored by:	Sandvine Incorporated
Submitted by:	Nima Misaghian (nmisaghian AT sandvine DOT com)
MFC after:	2 weeks
2011-06-08 19:28:59 +00:00
jhb
b5ffa4ca36 Log the socket address passed as the destination to sendto() and sendmsg()
via ktrace.

MFC after:	1 week
2011-06-07 17:40:33 +00:00
attilio
6ed3ca2c5b MFC 2011-06-07 08:24:29 +00:00
ken
048adb69c7 Set pca.p_bufr to NULL when we haven't allocated a buffer.
Otherwise, p_bufr is set to garbage on the stack, and if that garbage
happens to be non-NULL, and the TOLOG or TOCONS flag is set, putbuf()
will get called and attempt to fill the non-existent buffer.

This is really only relevant for tprintf() (and only when the priority is
not -1), but set it in uprintf() and ttyprintf() for completeness.

The next step, to avoid log buffer scrambling, would be to add the
PRINTF_BUFR_SIZE code to tprintf(), but this should prevent panics.

Submitted by:	rmacklem
Found by:	pho
2011-06-07 05:04:37 +00:00
davidxu
fc6d16c51a Use p4prio_to_tsprio to calculate TS priority instead of using
p4prio_to_rtpprio which is for RT priority.

PR:	kern/157657
Submitted by:	krivenok.dmitry at gmail dot com
MFC after:	3 days
2011-06-07 02:50:14 +00:00
marcel
36b8c5d486 Fix making kernel dumps from the debugger by creating a command
for it. Do not not expect a developer to call doadump(). Calling
doadump does not necessarily work when it's declared static. Nor
does it necessarily do what was intended in the context of text
dumps. The dump command always creates a core dump.

Move printing of error messages from doadump to the dump command,
now that we don't have to worry about being called from DDB.
2011-06-07 01:28:12 +00:00
attilio
fcefe479fe MFC 2011-06-06 21:38:39 +00:00
jhb
aa8ddae280 Clear the device_t pointer in 'struct resource' when releasing a device
as otherwise the sysctl to export rman info can dereference a stale
pointer.

PR:		kern/115371
Submitted by:	Arthur Hartwig
MFC after:	1 week
2011-06-06 13:12:56 +00:00
attilio
9f19c1c64d MFC 2011-06-01 16:54:33 +00:00
ken
9237f32b34 Fix a bug introduced in revision 222537.
In msgbuf_reinit() and msgbuf_init(), we weren't initializing the mutex.
Depending on the contents of memory, the LO_INITIALIZED flag might be
set on the mutex (either due to a warm reboot, and the message buffer
remaining in place, or due to garbage in memory) and in that case, with
INVARIANTS turned on, we would trigger an assertion that the mutex had
already been initialized.

Fix this by bzeroing the message buffer mutex for the _init() and _reinit()
paths.

Reported by:	mdf
2011-05-31 22:39:32 +00:00
attilio
bc4d32e80b MFC 2011-05-31 21:22:44 +00:00
attilio
a924571ff7 Fix KTR_CPUMASK in order to accept a string representing a cpuset_t.
This introduce all the underlying support for making this possible (via
the function cpusetobj_strscan() and keeps ktr_cpumask exported.  sparc64
implements its own assembly primitives for tracing events and needs to
properly check it.  Anyway the sparc64 logic is not implemented yet due
to lack of knowledge (by me) and time (by marius), but it is just a
matter of using ktr_cpumask when possible.

Tested and fixed by:	pluknet
Reviewed by:		marius
2011-05-31 20:48:58 +00:00
attilio
066c7ac96c Revert a change that crept in during MFC. 2011-05-31 20:23:33 +00:00
ken
0febb6df5e Fix apparent garbage in the message buffer.
While we have had a fix in place (options PRINTF_BUFR_SIZE=128) to fix
scrambled console output, the message buffer and syslog were still getting
log messages one character at a time.  While all of the characters still
made it into the log (courtesy of atomic operations), they were often
interleaved when there were multiple threads writing to the buffer at the
same time.

This fixes message buffer accesses to use buffering logic as well, so that
strings that are less than PRINTF_BUFR_SIZE will be put into the message
buffer atomically.  So now dmesg output should look the same as console
output.

subr_msgbuf.c:		Convert most message buffer calls to use a new spin
			lock instead of atomic variables in some places.

			Add a new routine, msgbuf_addstr(), that adds a
			NUL-terminated string to a message buffer.  This
			takes a priority argument, which allows us to
			eliminate some races (at least in the the string
			at a time case) that are present in the
			implementation of msglogchar().  (dangling and
			lastpri are static variables, and are subject to
			races when multiple callers are present.)

			msgbuf_addstr() also allows the caller to request
			that carriage returns be stripped out of the
			string.  This matches the behavior of msglogchar(),
			but in testing so far it doesn't appear that any
			newlines are being stripped out.  So the carriage
			return removal functionality may be a candidate
			for removal later on if further analysis shows
			that it isn't necessary.

subr_prf.c:		Add a new msglogstr() routine that calls
			msgbuf_logstr().

			Rename putcons() to putbuf().  This now handles
			buffered output to the message log as well as
			the console.  Also, remove the logic in putcons()
			(now putbuf()) that added a carriage return before
			a newline.  The console path was the only path that
			needed it, and cnputc() (called by cnputs())
			already adds a carriage return.  So this
			duplication resulted in kernel-generated console
			output lines ending in '\r''\r''\n'.

			Refactor putchar() to handle the new buffering
			scheme.

			Add buffering to log().

			Change log_console() to use msglogstr() instead of
			msglogchar().  Don't add extra newlines by default
			in log_console().  Hide that behavior behind a
			tunable/sysctl (kern.log_console_add_linefeed) for
			those who would like the old behavior.  The old
			behavior led to the insertion of extra newlines
			for log output for programs that print out a
			string, and then a trailing newline on a separate
			write.  (This is visible with dmesg -a.)

msgbuf.h:		Add a prototype for msgbuf_addstr().

			Add three new fields to struct msgbuf, msg_needsnl,
			msg_lastpri and msg_lock.  The first two are needed
			for log message functionality previously handled
			by msglogchar().  (Which is still active if
			buffering isn't enabled.)

			Include sys/lock.h and sys/mutex.h for the new
			mutex.

Reviewed by:	gibbs
2011-05-31 17:29:58 +00:00
nwhitehorn
a69e106b2f On multi-core, multi-threaded PPC systems, it is important that the threads
be brought up in the order they are enumerated in the device tree (in
particular, that thread 0 on each core be brought up first). The SLIST
through which we loop to start the CPUs has all of its entries added with
SLIST_INSERT_HEAD(), which means it is in reverse order of enumeration
and so AP startup would always fail in such situations (causing a machine
check or RTAS failure). Fix this by changing the SLIST into an STAILQ,
and inserting new CPUs at the end.

Reviewed by:	jhb
2011-05-31 15:11:43 +00:00
attilio
b1bf71d3c5 MFC 2011-05-31 14:18:10 +00:00
attilio
8dd6262cd3 MFC 2011-05-29 18:33:13 +00:00