12369 Commits

Author SHA1 Message Date
kib
b49a656854 Consistently use process spin lock for protection of the
p->p_boundary_count. Race could cause the execve(2) from the threaded
process to hung since thread boundary counter was incorrect and
single-threading never finished.

Reported by:	pluknet, pho
Tested by:	pho
MFC after:	1 week
2011-11-18 09:12:26 +00:00
kevlo
1a26b28a9b Add unicode support to msdosfs and smbfs; original pathes from imura,
bug fixes by Kuan-Chung Chiu <buganini at gmail dot com>.

Tested by me in production for several days at work.
2011-11-18 03:05:20 +00:00
pjd
a3e664d830 Constify arguments for locking KPIs where possible.
This enables locking consumers to pass their own structures around as const and
be able to assert locks embedded into those structures.

Reviewed by:	ed, kib, jhb
2011-11-16 21:51:17 +00:00
pjd
f01187d1ea Constify stack argument for functions that don't modify it.
Reviewed by:	ed, kib, jhb
2011-11-16 19:06:55 +00:00
marius
4b6458192f As it turns out, r186347 actually is insufficient to avoid the use of the
curthread-accessing part of mtx_{,un}lock(9) when using a r210623-style
curthread implementation on sparc64, crashing the kernel in its early
cycles as PCPU isn't set up, yet (and can't be set up as OFW is one of the
things we need for that, which leads to a chicken-and-egg problem). What
happens is that due to the fact that the idea of r210623 actually is to
allow the compiler to cache invocations of curthread, it factors out
obtaining curthread needed for both mtx_lock(9) and mtx_unlock(9) to
before the branch based on kobj_mutex_inited when compiling the kernel
without the debugging options. So change kobj_class_compile_static(9)
to just never acquire kobj_mtx, effectively restricting it to its
documented use, and add a kobj_init_static(9) for initializing objects
using a class compiled with the former and that also avoids using mutex(9)
(and malloc(9)). Also assert in both of these functions that they are
used in their intended way only.
While at it, inline kobj_register_method() and kobj_unregister_method()
as there wasn't much point for factoring them out in the first place
and so that a reader of the code has to figure out the locking for
fewer functions missing a KOBJ_ASSERT.
Tested on powerpc{,64} by andreast.

Reviewed by:	nwhitehorn (earlier version), jhb
MFC after:	3 days
2011-11-15 20:11:03 +00:00
obrien
def2613c2b Reformat comment to be more readable in standard Xterm.
(while I'm here, wrap other long lines)
2011-11-15 01:48:53 +00:00
rmh
ff5c11fefd Remove a few bits of FreeBSD 2.x compatibility code.
Approved by:	kib (mentor)
2011-11-14 18:21:27 +00:00
jhb
9315d9d42c - Split out a kern_posix_fadvise() from the posix_fadvise() system call so
it can be used by in-kernel consumers.
- Make kern_posix_fallocate() public.
- Use kern_posix_fadvise() and kern_posix_fallocate() to implement the
  freebsd32 wrappers for the two system calls.
2011-11-14 18:00:15 +00:00
alfred
fe73343e27 Constify args to copyiniov and copyinuio. 2011-11-14 07:12:10 +00:00
kib
ca682488f3 To limit amount of the kernel memory allocated, and to optimize the
iteration over the fdsets, kern_select() limits the length of the
fdsets copied in by the last valid file descriptor index. If any bit
is set in a mask above the limit, current implementation ignores the
filedescriptor, instead of returning EBADF.

Fix the issue by scanning the tails of fdset before entering the
select loop and returning EBADF if any bit above last valid
filedescriptor index is set. The performance impact of the additional
check is only imposed on the (somewhat) buggy applications that pass
bad file descriptors to select(2) or pselect(2).

PR:	kern/155606, kern/162379
Discussed with:	cognet, glebius
Tested by:	andreast (powerpc, all 64/32bit ABI combinations, big-endian),
       marius (sparc64, big-endian)
MFC after:    2 weeks
2011-11-13 10:28:01 +00:00
kib
a8b2cc359c Style.
MFC after:	1 week
2011-11-11 04:13:47 +00:00
kib
f3c70bd299 Guard against the unlikely case of the alias path containing the '%' symbols.
Reported by:	arundel
MFC after:	1 week
2011-11-11 04:12:58 +00:00
rstone
f6710005c7 Correct the types of the arguments to return probes of the syscall
provider.  Previously we were erroneously supplying the argument types of
the corresponding entry probe.

Reviewed by:	rpaulo
MFC after:	1 week
2011-11-11 03:49:42 +00:00
ed
3d017846c9 Simplify the code emitted by makeobjops.awk slightly.
Just place the default kobj_method inside the kobjop_desc structure.
There's no need to give these kobj_methods their own symbol. This shaves
off 10 KB of a GENERIC kernel binary.
2011-11-09 11:00:29 +00:00
ed
85414e79bd Make kobj_methods constant.
These structures hold no information that is modified during runtime. By
marking this constant, we see approximately 600 symbols become
read-only (amd64 GENERIC). While there, also mark the kobj_method
structures generated by makeobjops.awk static. They are only referenced
by the kobjop_desc structures within the same file.

Before:

	$ ls -l kernel
	-rwxr-xr-x  1 ed  wheel  15937309 Nov  8 16:29 kernel*
	$ size kernel
	    text    data     bss      dec    hex filename
	12260854 1358468 2848832 16468154 fb48ba kernel
	$ nm kernel | fgrep -c ' r '
	8240

After:

	$ ls -l kernel
	-rwxr-xr-x  1 ed  wheel  15922469 Nov  8 16:25 kernel*
	$ size kernel
	    text    data     bss      dec    hex filename
	12302869 1302660 2848704 16454233 fb1259 kernel
	$ nm kernel | fgrep -c ' r '
	8838
2011-11-08 15:38:21 +00:00
rstone
ae7b6414d5 The in-kernel CTF parser caches the result of its first attempt to parse
CTF data from a module.  On subsequent attempts to retrieve CTF data for
a module, return an error if there no CTF data.

This fixes a panic if you try to enable fbt probes on a module with CTF
data twice.

Submitted by:	Paul Ambrose (ambrosehua AT gmail DOT com)
MFC after:	3 days
2011-11-08 15:17:54 +00:00
attilio
8e918ec439 Introduce the option VFS_ALLOW_NONMPSAFE and turn it on by default on
all the architectures.
The option allows to mount non-MPSAFE filesystem. Without it, the
kernel will refuse to mount a non-MPSAFE filesytem.

This patch is part of the effort of killing non-MPSAFE filesystems
from the tree.

No MFC is expected for this patch.

Tested by:	gianni
Reviewed by:	kib
2011-11-08 10:18:07 +00:00
trociny
9cd1f9add2 Add KVME_FLAG_SUPER and use it in sysctl_kern_proc_vmmap for marking
entries with superpages.

Submitted by:	Mel Flynn <mel.flynn+fbsd.hackers@mailing.thruhere.net>
Reviewed by:	alc, rwatson
2011-11-07 21:13:19 +00:00
trociny
f25b803473 In lim_fork() assert that processes locks are held.
Suggested by:	kib
2011-11-07 21:09:04 +00:00
ed
0c56cf839d Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.
The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.
2011-11-07 15:43:11 +00:00
ed
e97eae1577 Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs.
This means that their use is restricted to a single C file.
2011-11-07 06:44:47 +00:00
fjoe
afccaaff1c Add KLD_DEBUG option. 2011-11-06 08:10:41 +00:00
jhb
767a02dc14 Regen. 2011-11-04 04:06:31 +00:00
jhb
78c075174e Add the posix_fadvise(2) system call. It is somewhat similar to
madvise(2) except that it operates on a file descriptor instead of a
memory region.  It is currently only supported on regular files.

Just as with madvise(2), the advice given to posix_fadvise(2) can be
divided into two types.  The first type provide hints about data access
patterns and are used in the file read and write routines to modify the
I/O flags passed down to VOP_READ() and VOP_WRITE().  These modes are
thus filesystem independent.  Note that to ease implementation (and
since this API is only advisory anyway), only a single non-normal
range is allowed per file descriptor.

The second type of hints are used to hint to the OS that data will or
will not be used.  These hints are implemented via a new VOP_ADVISE().
A default implementation is provided which does nothing for the WILLNEED
request and attempts to move any clean pages to the cache page queue for
the DONTNEED request.  This latter case required two other changes.
First, a new V_CLEANONLY flag was added to vinvalbuf().  This requests
vinvalbuf() to only flush clean buffers for the vnode from the buffer
cache and to not remove any backing pages from the vnode.  This is
used to ensure clean pages are not wired into the buffer cache before
attempting to move them to the cache page queue.  The second change adds
a new vm_object_page_cache() method.  This method is somewhat similar to
vm_object_page_remove() except that instead of freeing each page in the
specified range, it attempts to move clean pages to the cache queue if
possible.

To preserve the ABI of struct file, the f_cdevpriv pointer is now reused
in a union to point to the currently active advice region if one is
present for regular files.

Reviewed by:	jilles, kib, arch@
Approved by:	re (kib)
MFC after:	1 month
2011-11-04 04:02:50 +00:00
jhb
1e2d8c9d67 Move the cleanup of f_cdevpriv when the reference count of a devfs
file descriptor drops to zero out of _fdrop() and into devfs_close_f()
as it is only relevant for devfs file descriptors.

Reviewed by:	kib
MFC after:	1 week
2011-11-04 03:39:31 +00:00
attilio
90682e14db Disable interrupt and preemption for smp_rendezvous() also in the
UP/!SMP case.
The callbacks may be relying on this feature and having 2 different
ways to deal with them is not correct.

Reported by:	rstone
Reviewed by:	jhb
MFC after:	2 weeks
2011-11-03 14:36:56 +00:00
marcel
11d8234b97 Revert rev. 226893: subr_syscall.c is being included from C files and
on amd64 with FREEBSD32 enabled, this means that systrace_probe_func
gets defined twice.
2011-10-30 02:19:39 +00:00
marcel
ac13f9cbdb Define systrace_probe_func in subr_syscall.c where it's used, instead
of defining it in MD code. This eliminates porting to other architectures.
2011-10-29 01:26:36 +00:00
pluknet
f9aa1bdb23 Fix arguments list for proc:::signal-discard DTrace probe.
Reported by:	Anton Yuzhaninov <citrin citrin ru>
MFC after:	1 week
2011-10-28 15:22:51 +00:00
jhb
b4486349bd Whitespace fix. 2011-10-27 17:43:36 +00:00
alc
841afea2d9 Eliminate vestiges of page coloring in VM_ALLOC_NOOBJ calls to
vm_page_alloc().  While I'm here, for the sake of consistency, always
specify the allocation class, such as VM_ALLOC_NORMAL, as the first of
the flags.
2011-10-27 16:39:17 +00:00
pluknet
41ec9d2d93 Remove the long reprecated ``/stand/sysinstall'' from the init_path.
It can be put back using the INIT_PATH config option or init_path
loader variable, if still needed (which I doubt).

MFC after:	1 week
2011-10-27 10:25:11 +00:00
alc
0a7d6450d6 contigmalloc(9) and contigfree(9) are now implemented in terms of other
more general VM system interfaces.  So, their implementation can now
reside in kern_malloc.c alongside the other functions that are declared
in malloc.h.
2011-10-27 02:52:24 +00:00
jhb
1c4fa2aabc - Fixup filenames in a few more places where they are used.
- Some whitespace fixes.
2011-10-26 15:17:42 +00:00
pjd
b49c6b382f The v_data field is a pointer, so set it to NULL, not 0.
MFC after:	3 days
2011-10-25 14:01:17 +00:00
marcel
26b45aef71 Don't terminate the interactive root mount prompt on mount failure.
This restores the previous behaviour. While here, match '?' and '.'
inputs exactly and improve the error message.

Requested by: avg@
Derived from a patch by: Arnaud Lacombe <lacombar@gmail.com>
2011-10-23 20:03:33 +00:00
des
1b405df8ba Revisit the capability failure trace points. The initial implementation
only logged instances where an operation on a file descriptor required
capabilities which the file descriptor did not have.  By adding a type enum
to struct ktr_cap_fail, we can catch other types of capability failures as
well, such as disallowed system calls or attempts to wrap a file descriptor
with more capabilities than it had to begin with.
2011-10-18 07:28:58 +00:00
marcel
494f14b933 Fix double vision syndrome (read: double output) when in the
debugger without a panic.
2011-10-16 14:16:46 +00:00
kib
8e118d38cf Control the execution permission of the readable segments for
i386 binaries on the amd64 and ia64 with the sysctl, instead of
unconditionally enabling it.

Reviewed by:	marcel
2011-10-15 12:35:18 +00:00
marcel
92e552423d In elf32_trans_prot() and when compiling for amd64 or ia64, add
PROT_EXECUTE when PROT_READ is needed. By default i386 allows
execution when reading is allowed and JDK 1.4.x depends on that.
2011-10-13 16:16:46 +00:00
glebius
2522c42334 Make memguard(9) capable to guard uma(9) allocations. 2011-10-12 18:08:28 +00:00
rwatson
0a6da59b61 Correct a bug in export of capability-related information from the sysctls
supporting procstat -f: properly provide capability rights information to
userspace.  The bug resulted from a merge-o during upstreaming (or rather,
a failure to properly merge FreeBSD-side changed downstream).

Spotted by:     des, kibab
MFC after:      3 days
2011-10-12 12:08:03 +00:00
adrian
7c88ef5a97 Don't call fixup_filename() on each witness lock call.
This has been irking me for a while. This causes significant
CPU use on bottlenecked CPUs (eg my older EEEPC w/ an earlier
Celeron CPU and my MIPS24k boards) when they're passing
a lot of traffic.

Since the file/line values are only used for printing, this
should only affect display. It should have no operational
change on the code, besides reducing CPU use.
2011-10-12 09:21:02 +00:00
des
9b8d9b3ed1 Add a new trace point, KTRFAC_CAPFAIL, which traces capability check
failures.  It is included in the default set for ktrace(1) and kdump(1).
2011-10-11 20:37:10 +00:00
mckusick
b6c294da1b When unmounting a filesystem always wait for the vfs_busy lock to clear
so that if no vnodes in the filesystem are actively in use the unmount
will succeed rather than failing with EBUSY.

Reported by: Garrett Cooper
Reviewed by: Attilio Rao and Kostik Belousov
Tested by:   Garrett Cooper
PR:          kern/161016
MFC after:   3 weeks
2011-10-11 18:46:41 +00:00
marius
22104b1021 In device_get_children() avoid malloc(0) in order to increase portability
to other operating systems.

PR:     154287
2011-10-09 21:21:37 +00:00
alc
e9db5a5a50 Fix the handling of an empty kmem map by sysctl_kmem_map_free(). In
the unlikely event that sysctl_kmem_map_free() was performed on an
empty kmem map, it would incorrectly report the free space as zero.

Discussed with:	avg
MFC after:	1 week
2011-10-08 18:29:30 +00:00
jonathan
9c782bcdf6 Change one printf() to log().
As noted in kern/159780, printf() is not very jail-friendly, since it can't be easily monitored by jail management tools. This patch reports an error via log() instead, which, if nobody is watching the log file, still prints to the console.

Approved by: mentor (rwatson)
Submitted by: Eugene Grosbein <eugen@eg.sd.rdtc.ru>
MFC after: 5 days
2011-10-07 09:51:12 +00:00
obrien
4b04845b06 Disallow various debug.kdb sysctl's when securelevel is raised.
PR:	161350
2011-10-07 05:47:30 +00:00
delphij
3774d99430 Return proper errno when we hit error when doing sanity check.
This fixes dtrace crashes when module is not compiled with CTF
data.

Submitted by:	Paul Ambrose ambrosehua at gmail.com
MFC after:	1 week
2011-10-07 01:37:58 +00:00