12130 Commits

Author SHA1 Message Date
Alan Cox
5b2d228c44 Correct the order of the arguments to vm_fault_quick_hold_pages(). 2010-12-26 01:42:52 +00:00
Alan Cox
82de724fe1 Introduce and use a new VM interface for temporarily pinning pages. This
new interface replaces the combined use of vm_fault_quick() and
pmap_extract_and_hold() throughout the kernel.

In collaboration with:	kib@
2010-12-25 21:26:56 +00:00
David Xu
1c45127bd3 Enlarge hash table for new condition variable. 2010-12-23 03:12:03 +00:00
David Xu
d1078b0b03 MFp4:
- Add flags CVWAIT_ABSTIME and CVWAIT_CLOCKID for umtx kernel based
  condition variable, this should eliminate an extra system call to get
  current time.

- Add sub-function UMTX_OP_NWAKE_PRIVATE to wake up N channels in single
  system call. Create userland sleep queue for condition variable, in most
  cases, thread will wait in the queue, the pthread_cond_signal will defer
  thread wakeup until the mutex is unlocked, it tries to avoid an extra
  system call and a extra context switch in time window of pthread_cond_signal
  and pthread_mutex_unlock.

The changes are part of process-shared mutex project.
2010-12-22 05:01:52 +00:00
Matthew D Fleming
4b7c684420 Initialize fp_location for explicitly managed fail points, and push
the parentheses around the location for simple fail points into the
location string.  This makes the print on fail point set more
consistent between the two versions.

Also fix up fail.h a little for style(9): only use one of sys/param.h
and sys/types.h, and use the existing __XSTRING() macro instead of
rolling our own.  Also fix up a few tabs on changed and nearby lines.

Lastly, since KFAIL_POINT_{BEGIN,END} are not meant for use outside
this file, just eliminate the macros entirely.

MFC after: 1 week
2010-12-21 18:23:03 +00:00
Matthew D Fleming
d26b794591 Move the fail_point_entry definition from fail.h to kern_fail.c, which
allows putting the enumeration constants of fail point types with the
text string that matches them.

MFC after:	1 week
2010-12-21 16:29:58 +00:00
Lawrence Stewart
a8d61afdc2 - Introduce the Hhook (Helper Hook) KPI. The KPI is closely modelled on pfil(9),
and in many respects can be thought of as a more generic superset of pfil.
  Hhook provides a way for kernel subsystems to export hook points that Khelp
  modules can hook to provide enhanced or new functionality to the kernel. The
  KPI has been designed to ensure hook points pose no noticeable overhead when
  no hook functions are registered.

- Introduce the Khelp (Kernel Helpers) KPI. Khelp provides a framework for
  managing Khelp modules, which indirectly use the Hhook KPI to register their
  hook functions with hook points of interest within the kernel. Khelp modules
  aim to provide a structured way to dynamically extend the kernel at runtime in
  an ABI preserving manner. Depending on the subsystem providing hook points, a
  Khelp module may be able to associate per-object data for maintaining relevant
  state between hook calls.

- pjd's Object Specific Data (OSD) KPI is used to manage the per-object data
  allocated to Khelp modules. Create a new "OSD_KHELP" OSD type for use by the
  Khelp framework.

- Bump __FreeBSD_version to 900028 to mark the introduction of the new KPIs.

In collaboration with:	David Hayes <dahayes at swin edu au> and
			Grenville Armitage <garmitage at swin edu au>
Sponsored by:	FreeBSD Foundation
Reviewed by:	bz, others along the way
MFC after:	3 months
2010-12-21 13:45:29 +00:00
Alan Cox
acd11c7499 Introduce vm_fault_hold() and use it to (1) eliminate a long-standing race
condition in proc_rwmem() and to (2) simplify the implementation of the
cxgb driver's vm_fault_hold_user_pages().  Specifically, in proc_rwmem()
the requested read or write could fail because the targeted page could be
reclaimed between the calls to vm_fault() and vm_page_hold().

In collaboration with:	kib@
MFC after:	6 weeks
2010-12-20 22:49:31 +00:00
Alan Cox
8c22654d7e Implement and use a single optimized function for unholding a set of pages.
Reviewed by:	kib@
2010-12-17 22:41:22 +00:00
John Baldwin
36b4cd243e Add back a bounds check on valid idle priorities that was lost in an
earlier commit.  While here, move the thread lock down in rtp_to_pri().
It is not needed for all of the priority value checks and the computation
of newpri.

Reported by:	swell.k @ gmail
MFC after:	3 days
2010-12-17 16:29:06 +00:00
Matthew D Fleming
e0f389c8d3 One of the compat32 functions was copying in a raw timespec, instead of
a 32-bit one.  This can cause weird timeout issues, as the copying reads
garbage from the user.

Code by:     Deepak Veliath <deepak dot veliath at isilon dot com>
MFC after:   1 week
2010-12-15 19:30:44 +00:00
Pawel Jakub Dawidek
b452cf6317 Just pass M_ZERO to malloc(9) instead of clearing allocated memory separately. 2010-12-14 06:19:13 +00:00
Edward Tomasz Napierala
4c7bba9985 Adapt filesystem-independent NFSv4 ACL code (used by UFS, but not by ZFS)
to PSARC/2010/029.  In short, the semantics is simplified - "weird stuff"
no longer happens after chmod, entries don't get duplicated during
inheritance, and trivial ACLs no longer contain three "DENY" entries,
which is also more friendly to MS Windows.

By default, UFS keeps using old semantics.  To change it, set sysctl
vfs.acl_nfs4_old_semantics to 0.  I'll flip the switch when ZFSv28
hits the tree, to keep these two in sync - ZFS v28 uses PSARC semantics,
and ZFS v15 uses the old one.
2010-12-13 18:56:04 +00:00
Hans Petter Selasky
0bad52e1d8 Fix race in devfs by using LIST_FIRST() instead of
LIST_FOREACH_SAFE() when freeing the devfs private
data entries.

Reviewed by:	kib
MFC after:	3 days
Approved by:	thompsa (mentor)
2010-12-11 08:44:10 +00:00
Edward Tomasz Napierala
afd01097a0 Refactor fork1() to make it easier to follow. No functional changes.
Reviewed by:	kib (earlier version)
Tested by:	pho
2010-12-10 08:33:56 +00:00
Bjoern A. Zeeb
4befa84f9c Don't tie ct_debug to bootverbose. Provide a sysctl to turn it on or off.
Switch the default to always off.

Reviewed by:	kib
2010-12-09 22:02:48 +00:00
David Xu
ec6ea5e86d MFp4:
The unit number allocator reuses ID too fast, this may hide bugs in
other code, add a ring buffer to delay freeing a thread ID.
2010-12-09 05:16:20 +00:00
David Xu
acbe332a58 MFp4:
It is possible a lower priority thread lending priority to higher priority
thread, in old code, it is ignored, however the lending should always be
recorded, add field td_lend_user_pri to fix the problem, if a thread does
not have borrowed priority, its value is PRI_MAX.

MFC after: 1 week
2010-12-09 02:42:02 +00:00
Edward Tomasz Napierala
087bfb0e6b Add a KASSERT to make it obvious when fork_norfproc() is to be called,
and set *procp to NULL in all cases.  Previously, it was not being set
in the ERESTART case.  This is effectively no-op, since its value is
ignored by callers in the error case.

Reviewed by:	kib@
2010-12-06 19:15:38 +00:00
Edward Tomasz Napierala
f68c74bbd3 Fix style bug introduced by previous commit. 2010-12-06 16:45:36 +00:00
Edward Tomasz Napierala
1d845e8638 Improve readability by factoring out the !RFPROC case. While here,
turn K&R function definitions into ANSI.  No functional changes.

Reviewed by:	kib@
2010-12-06 16:39:18 +00:00
Konstantin Belousov
9f4ba450f2 Trim whitespaces at the end of lines. Use the commit to record
proper log message for r216150.

MFC after:	1 week

If unix socket has a unix socket attached as the rights that has a
unix socket attached as the rights that has a unix socket attached as
the rights ... Kernel may overflow the stack on attempt to close such
socket.

Only close the rights file in the context of the current close if the
file is not unix domain socket. Otherwise, postpone the work to
taskqueue, preventing unlimited recursion.

The pass of the unix domain sockets over the SCM_RIGHTS message
control is not widely used, and more, the close of the socket with
still attached rights is mostly an application failure. The change
should not affect the performance of typical users of SCM_RIGHTS.

Reviewed by:	jeff, rwatson
2010-12-03 20:39:06 +00:00
Konstantin Belousov
0cb64678bc Reviewed by: jeff, rwatson
MFC after:	1 week
2010-12-03 16:15:44 +00:00
Edward Tomasz Napierala
ef694c1ac4 Replace pointer to "struct uidinfo" with pointer to "struct ucred"
in "struct vm_object".  This is required to make it possible to account
for per-jail swap usage.

Reviewed by:	kib@
Tested by:	pho@
Sponsored by:	FreeBSD Foundation
2010-12-02 17:37:16 +00:00
Warner Losh
704c91294b removed tag is '-', not '+'.
remove extra return.
2010-12-02 04:28:01 +00:00
Edward Tomasz Napierala
26778a6c82 Remove useless NULL checks for M_WAITOK mallocs. 2010-12-02 01:14:45 +00:00
Warner Losh
5cb51b647c Remove redundant (and bogus) insertion of pnp info when announcing new
and retiring devices.  That's already inserted elsewhere.

Submitted by:	n_hibma
MFC after:	3 days
2010-11-30 05:54:21 +00:00
Matthew D Fleming
dd6312a7c1 Fix uninitialized variable warning that shows on Tinderbox but not my
setup. (??)

Submitted by:	Michael Butler <imb at protected-networks dot net>
2010-11-29 21:53:21 +00:00
Matthew D Fleming
ccecef29d1 Do not hold the sysctl lock across a call to the handler. This fixes a
general LOR issue where the sysctl lock had no good place in the
hierarchy.  One specific instance is #284 on
http://sources.zabbadoz.net/freebsd/lor.html .

Reviewed by:	jhb
MFC after:	1 month
X-MFC-note:	split oid_refcnt field for oid_running to preserve KBI
2010-11-29 18:18:07 +00:00
Matthew D Fleming
d0bb6f258b Slightly modify the logic in sysctl_find_oid to reduce the indentation.
There should be no functional change.

MFC after:	3 days
2010-11-29 18:18:00 +00:00
Matthew D Fleming
5127ecb89c Use the SYSCTL_CHILDREN macro in kern_sysctl.c to help de-obfuscate the
code.

MFC after:	3 days
2010-11-29 18:17:53 +00:00
Konstantin Belousov
eea7f71c81 Account i/o done on cdevs.
Reported and tested by:	Adam Vande More <amvandemore gmail com>
MFC after:	1 week
2010-11-25 20:05:11 +00:00
Konstantin Belousov
f5eb95b1fc Allow shared-locked vnode to be passed to vunref(9).
When shared-locked vnode is supplied as an argument to vunref(9) and
resulting usecount is 0, set VI_OWEINACT and do not try to upgrade vnode
lock. The later could cause vnode unlock, allowing the vnode to be
reclaimed meantime.

Tested by:	pho
MFC after:	1 week
2010-11-24 12:30:41 +00:00
Andriy Gapon
706b0d31bb taskqueue: drop unused tq_name field
tq_name was used write-only and besides it was just a pointer, so it
could point to some garbage in a temporary buffer that's gone.
This change shouldn't change KPI/KBI as struct taskqueue is private to
subr_taskqueue.c.
If we find a need for tq_name it can be resurrected at any moment.
taskqueue_create() interface is preserved for this purpose.

Suggested by:	jhb
MFC after:	10 days
2010-11-23 14:30:22 +00:00
Sergey Kandaurov
f03749ca2d Update MNT_ROOTFS comments after changes in the root mount logic.
Reported by:	arundel
Suggested by:	marcel (phrasing)
Approved by:	kib (mentor)
2010-11-23 13:49:15 +00:00
Colin Percival
772d1e42a2 Add parentheses for clarity. The parentheses around the two terms of the &&
are unnecessary but I'm leaving them in for the sake of avoiding confusion
(I confuse easily).

Submitted by:	bde
2010-11-23 04:50:01 +00:00
Dimitry Andric
3e288e6238 After some off-list discussion, revert a number of changes to the
DPCPU_DEFINE and VNET_DEFINE macros, as these cause problems for various
people working on the affected files.  A better long-term solution is
still being considered.  This reversal may give some modules empty
set_pcpu or set_vnet sections, but these are harmless.

Changes reverted:

------------------------------------------------------------------------
r215318 | dim | 2010-11-14 21:40:55 +0100 (Sun, 14 Nov 2010) | 4 lines

Instead of unconditionally emitting .globl's for the __start_set_xxx and
__stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu
sections are actually defined.

------------------------------------------------------------------------
r215317 | dim | 2010-11-14 21:38:11 +0100 (Sun, 14 Nov 2010) | 3 lines

Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout
the tree.

------------------------------------------------------------------------
r215316 | dim | 2010-11-14 21:23:02 +0100 (Sun, 14 Nov 2010) | 2 lines

Add macros to define static instances of VNET_DEFINE and DPCPU_DEFINE.
2010-11-22 19:32:54 +00:00
Attilio Rao
57c153804a Style fix.
Sponsored by:	Sandvine Incorporated
Requested by:	jhb
Reviewed by:	jhb
MFC after:	1 week
X-MFC:		215544
2010-11-22 15:28:54 +00:00
Attilio Rao
7f08176ee8 Add the ability for GDB to printout the thread name along with other
thread specific informations.

In order to do that, and in order to avoid KBI breakage with existing
infrastructure the following semantic is implemented:
- For live programs, a new member to the PT_LWPINFO is added (pl_tdname)
- For cores, a new ELF note is added (NT_THRMISC) that can be used for
  storing thread specific, miscellaneous, informations. Right now it is
  just popluated with a thread name.

GDB, then, retrieves the correct informations from the corefile via the
BFD interface, as it groks the ELF notes and create appropriate
pseudo-sections.

Sponsored by:	Sandvine Incorporated
Tested by:	gianni
Discussed with:	dim, kan, kib
MFC after:	2 weeks
2010-11-22 14:42:13 +00:00
Colin Percival
aa519c0a64 In tc_windup, handle the case where the previous call to tc_windup was
more than 1s earlier.  Prior to this commit, the computation of
th_scale * delta (which produces a 64-bit value equal to the time since
the last tc_windup call in units of 2^(-64) seconds) would overflow and
any complete seconds would be lost.

We fix this by repeatedly converting tc_frequency units of timecounter
to one seconds; this is not exactly correct, since it loses the NTP
adjustment, but if we find ourselves going more than 1s at a time between
clock interrupts, losing a few seconds worth of NTP adjustments is the
least of our problems...
2010-11-22 09:13:25 +00:00
Alexander Leidinger
bb63fdde6d By using the 32-bit Linux version of Sun's Java Development Kit 1.6
on FreeBSD (amd64), invocations of "javac" (or "java") eventually
end with the output of "Killed" and exit code 137.

This is caused by:
1. After calling exec() in multithreaded linux program threads are not
   destroyed and continue running. They get killed after program being
   executed finishes.

2. linux_exit_group doesn't return correct exit code when called not
   from group leader. Which happens regularly using sun jvm.

The submitters fix this in a similar way to how NetBSD handles this.

I took the PRs away from dchagin, who seems to be out of touch of
this since a while (no response from him).

The patches committed here are from [2], with some little modifications
from me to the style.

PR:		141439 [1], 144194 [2]
Submitted by:	Stefan Schmidt <stefan.schmidt@stadtbuch.de>, gk
Reviewed by:	rdivacky (in april 2010)
MFC after:	5 days
2010-11-22 09:06:59 +00:00
David Xu
b169d0efa1 Use atomic instruction to set _has_writer, otherwise there is a race
causes userland to not wake up a thread sleeping in kernel.

MFC after: 3 days
2010-11-22 02:42:02 +00:00
Konstantin Belousov
730b63b0c2 Remove prtactive variable and related printf()s in the vop_inactive
and vop_reclaim() methods. They seems to be unused, and the reported
situation is normal for the forced unmount.

MFC after:   1 week
X-MFC-note:  keep prtactive symbol in vfs_subr.c
2010-11-19 21:17:34 +00:00
Attilio Rao
772753491b Scan the list in reverse order for the shutdown handlers of loaded modules.
This way, when there is a dependency between two modules, the handler of the
latter probed runs first.

This is a similar approach as the modules are unloaded in the same
linkerfile.

Sponsored by:	Sandvine Incorporated
Submitted by:	Nima Misaghian <nmisaghian at sandvine dot com>
MFC after:	1 week
2010-11-19 19:43:56 +00:00
John Baldwin
2e7758a8f6 Set the POSIX semaphore capability when the semaphore module is enabled.
This is ignored in HEAD where semaphores are marked as always enabled in
<unistd.h>.

MFC after:	1 week
2010-11-19 17:57:50 +00:00
John Baldwin
34c1c5992f Set various POSIX capability sysctls to the version of the API that is
supported rather than 1.  They are supposed to return a suitable value
for sysconf(3).  While here, make the fsync sysctl match <unistd.h>.

MFC after:	1 week
2010-11-19 17:56:16 +00:00
John Baldwin
144df3a28f Add a resource_list_reserved() method that returns true if a resource
list entry contains a reserved resource.
2010-11-17 22:28:04 +00:00
Olivier Houchard
0c87b5e243 No need to include sys/systm.h twice. 2010-11-16 14:08:21 +00:00
David Xu
32c63db519 Only unlock process if a thread is found. 2010-11-15 07:33:54 +00:00
Dimitry Andric
31c6a0037e Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout
the tree.
2010-11-14 20:38:11 +00:00