freebsd-skq

Author	SHA1	Message	Date
trasz	f3e31127df	Finishing touches to fork1() - ANSIfy missed function definition, style(9) fixes, removal of few comments that didn't really make sense and addition of fork_findpid() locking requirements.	2011-01-02 12:16:57 +00:00
bz	246fcf4d6f	Mfp4 CH177924: Add and export constants of array sizes of jail parameters as compiled into the kernel. This is the least intrusive way to allow kvm to read the (sparse) arrays independent of the options the kernel was compiled with. Reviewed by: jhb (originally) MFC after: 1 week Sponsored by: The FreeBSD Foundation Sponsored by: CK Software GmbH	2010-12-31 22:49:13 +00:00
kib	f112887cab	Remove OBJ_CLEANING flag. The vfs_setdirty_locked_object() is the only consumer of the flag, and it used the flag because OBJ_MIGHTBEDIRTY was cleared early in vm_object_page_clean, before the cleaning pass was done. This is no longer true after r216799. Moreover, since OBJ_CLEANING is a flag, and not the counter, it could be reset too prematurely when parallel vm_object_page_clean() are performed. Reviewed by: alc (as a part of the bigger patch) MFC after: 1 month (after r216799 is merged)	2010-12-29 22:26:49 +00:00
attilio	ff557d3a61	Fix several callout migration races: - Problem1: Hypothesis: thread1 is doing a callout_reset_on(), within its callout handler, willing to implicitly or explicitly migrate the callout. thread2 is draining the callout. Thesis: * thread1 calls callout_lock() and locks the old callout cpu * thread1 performs the checks in the first path of the callout_reset_on() * thread1 hits this piece of code: /* * If the lock must migrate we have to check the state again as * we can't hold both the new and old locks simultaneously. */ if (c->c_cpu != cpu) { c->c_cpu = cpu; CC_UNLOCK(cc); goto retry; } which means it will drop the lock and 'retry'. thread2 will call callout_lock() and lock the new callout cpu. thread1 spins on the new lock and will not keep going for the moment. * thread2 checks that the callout is not pending (as callout is currently running) and that it is not on cc->cc_curr (because cc now refers to the new callout cpu and the callout is running on the old callout cpu) thus it thinks it is done and returns. * thread1 will now acquire the lock and then add the callout to the new callout cpu queue. That seems an obvious race as callout_stop() falsely reports the callout stopped or worse, callout_drain() falsely returns while the callout is still in use. - Solution1: Fixing this problem would require, in general, locking both callout cpus at once while switching the c_cpu field and avoiding cyclic deadlocks between callout cpu locks. The concept of CPUBLOCK is then introduced (working more or less like the blocked_lock for the thread_lock() function) meaning: "in callout_lock(), spin until c->c_cpu is different from CPUBLOCK". That way the "original" callout cpu, referred to in the above mentioned code snippet, will remain blocked until the lock handover is over, and the critical path will remain covered.
- Problem2: Having the callout currently executing on a specific callout cpu and simultaneously pending on another callout cpu (as can happen with the current code) breaks, at least, the assumption that callout_drain() returns only once the callout cannot be referenced anymore. - Solution2: Callout migration is deferred if the current callout is already under execution. The best place to do that is in softclock(), and new members are added to the callout cpu structure in order to specify that a pending migration is requested. That is necessary because the callout cannot be trusted (not freed) 100% of the time after the execution of the callout handler. CPUBLOCK will prevent, in the "deferred migration" case, the callout from being freed, stopping any possible callout_stop() and callout_drain() activity until the migration is actually performed. - Problem3: There is a further race in callout_drain(). In order to avoid a race between the sleepqueue lock and the callout cpu spinlock, in _callout_stop_safe(), the callout cpu lock is dropped, the sleepqueue lock is acquired and a new callout cpu lookup is performed. Note that the channel used for locking the sleepqueue is obtained from the "current" callout cpu (&cc->cc_waiting). If the callout migrated in the meanwhile, callout_drain() will end up using the wrong wchan for the sleepqueue (the locked one will be the older, while the new one will not really be locked) leading to a lock leak and a racy access to the sleepqueue. - Solution3: It is enough to check if a migration happened between the operation of acquiring the sleepqueue lock and the new callout cpu lock and, if so, unwind all those and try again. These problems can lead to deadly races on moderate (4-way) SMP environments, leading to easy panics or deadlocks. The reporter's 24-way machine could easily panic, under a completely normal workload, almost daily.
gianni@ kindly wrote the following proof-of-concept which can panic a FreeBSD machine in less than one hour, in smaller SMP: http://www.freebsd.org/~attilio/callout/test.c Reported by: Nicholas Esborn <nick at desert dot net>, DesertNet In collaboration with: gianni, pho, Nicholas Esborn Reviewed by: jhb MFC after: 1 week (*) Usually, I would aim for a larger MFC timeout, but I really want this in before 8.2-RELEASE, thus re@ accepted a shorter timeout as a special case for this patch	2010-12-29 18:17:36 +00:00
davidxu	3daac37e3c	- Follow r216313, the sched_unlend_user_prio is no longer needed, always use sched_lend_user_prio to set lent priority. - Improve pthread priority-inherit mutex, when a contender's priority is lowered, repropagate priorities, this may cause mutex owner's priority to be lowered, in old code, mutex owner's priority is rise-only.	2010-12-29 09:26:46 +00:00
kib	747e187ac4	Teach ddb "show mount" about MNTK_SUJ flag.	2010-12-27 12:06:38 +00:00
alc	a2053caad0	Correct the order of the arguments to vm_fault_quick_hold_pages().	2010-12-26 01:42:52 +00:00
alc	971b02b7bc	Introduce and use a new VM interface for temporarily pinning pages. This new interface replaces the combined use of vm_fault_quick() and pmap_extract_and_hold() throughout the kernel. In collaboration with: kib@	2010-12-25 21:26:56 +00:00
davidxu	63146a5952	Enlarge hash table for new condition variable.	2010-12-23 03:12:03 +00:00
davidxu	437ad27f9c	MFp4: - Add flags CVWAIT_ABSTIME and CVWAIT_CLOCKID for umtx kernel based condition variable, this should eliminate an extra system call to get current time. - Add sub-function UMTX_OP_NWAKE_PRIVATE to wake up N channels in single system call. Create userland sleep queue for condition variable, in most cases, thread will wait in the queue, the pthread_cond_signal will defer thread wakeup until the mutex is unlocked, it tries to avoid an extra system call and an extra context switch in time window of pthread_cond_signal and pthread_mutex_unlock. The changes are part of process-shared mutex project.	2010-12-22 05:01:52 +00:00
mdf	9e845cb332	Initialize fp_location for explicitly managed fail points, and push the parentheses around the location for simple fail points into the location string. This makes the print on fail point set more consistent between the two versions. Also fix up fail.h a little for style(9): only use one of sys/param.h and sys/types.h, and use the existing __XSTRING() macro instead of rolling our own. Also fix up a few tabs on changed and nearby lines. Lastly, since KFAIL_POINT_{BEGIN,END} are not meant for use outside this file, just eliminate the macros entirely. MFC after: 1 week	2010-12-21 18:23:03 +00:00
mdf	9d7bd11478	Move the fail_point_entry definition from fail.h to kern_fail.c, which allows putting the enumeration constants of fail point types with the text string that matches them. MFC after: 1 week	2010-12-21 16:29:58 +00:00
lstewart	dedc1118c9	- Introduce the Hhook (Helper Hook) KPI. The KPI is closely modelled on pfil(9), and in many respects can be thought of as a more generic superset of pfil. Hhook provides a way for kernel subsystems to export hook points that Khelp modules can hook to provide enhanced or new functionality to the kernel. The KPI has been designed to ensure hook points pose no noticeable overhead when no hook functions are registered. - Introduce the Khelp (Kernel Helpers) KPI. Khelp provides a framework for managing Khelp modules, which indirectly use the Hhook KPI to register their hook functions with hook points of interest within the kernel. Khelp modules aim to provide a structured way to dynamically extend the kernel at runtime in an ABI preserving manner. Depending on the subsystem providing hook points, a Khelp module may be able to associate per-object data for maintaining relevant state between hook calls. - pjd's Object Specific Data (OSD) KPI is used to manage the per-object data allocated to Khelp modules. Create a new "OSD_KHELP" OSD type for use by the Khelp framework. - Bump __FreeBSD_version to 900028 to mark the introduction of the new KPIs. In collaboration with: David Hayes <dahayes at swin edu au> and Grenville Armitage <garmitage at swin edu au> Sponsored by: FreeBSD Foundation Reviewed by: bz, others along the way MFC after: 3 months	2010-12-21 13:45:29 +00:00
alc	be5201b0d1	Introduce vm_fault_hold() and use it to (1) eliminate a long-standing race condition in proc_rwmem() and to (2) simplify the implementation of the cxgb driver's vm_fault_hold_user_pages(). Specifically, in proc_rwmem() the requested read or write could fail because the targeted page could be reclaimed between the calls to vm_fault() and vm_page_hold(). In collaboration with: kib@ MFC after: 6 weeks	2010-12-20 22:49:31 +00:00
alc	303f816df2	Implement and use a single optimized function for unholding a set of pages. Reviewed by: kib@	2010-12-17 22:41:22 +00:00
jhb	1b88c87408	Add back a bounds check on valid idle priorities that was lost in an earlier commit. While here, move the thread lock down in rtp_to_pri(). It is not needed for all of the priority value checks and the computation of newpri. Reported by: swell.k @ gmail MFC after: 3 days	2010-12-17 16:29:06 +00:00
mdf	60b768f654	One of the compat32 functions was copying in a raw timespec, instead of a 32-bit one. This can cause weird timeout issues, as the copying reads garbage from the user. Code by: Deepak Veliath <deepak dot veliath at isilon dot com> MFC after: 1 week	2010-12-15 19:30:44 +00:00
pjd	627e6dcc72	Just pass M_ZERO to malloc(9) instead of clearing allocated memory separately.	2010-12-14 06:19:13 +00:00
trasz	6be0018c93	Adapt filesystem-independent NFSv4 ACL code (used by UFS, but not by ZFS) to PSARC/2010/029. In short, the semantics is simplified - "weird stuff" no longer happens after chmod, entries don't get duplicated during inheritance, and trivial ACLs no longer contain three "DENY" entries, which is also more friendly to MS Windows. By default, UFS keeps using old semantics. To change it, set sysctl vfs.acl_nfs4_old_semantics to 0. I'll flip the switch when ZFSv28 hits the tree, to keep these two in sync - ZFS v28 uses PSARC semantics, and ZFS v15 uses the old one.	2010-12-13 18:56:04 +00:00
hselasky	2b010fb38d	Fix race in devfs by using LIST_FIRST() instead of LIST_FOREACH_SAFE() when freeing the devfs private data entries. Reviewed by: kib MFC after: 3 days Approved by: thompsa (mentor)	2010-12-11 08:44:10 +00:00
trasz	1fff03b62c	Refactor fork1() to make it easier to follow. No functional changes. Reviewed by: kib (earlier version) Tested by: pho	2010-12-10 08:33:56 +00:00
bz	be6b3b47d5	Don't tie ct_debug to bootverbose. Provide a sysctl to turn it on or off. Switch the default to always off. Reviewed by: kib	2010-12-09 22:02:48 +00:00
davidxu	f88dac0410	MFp4: The unit number allocator reuses ID too fast, this may hide bugs in other code, add a ring buffer to delay freeing a thread ID.	2010-12-09 05:16:20 +00:00
davidxu	171976dba2	MFp4: It is possible a lower priority thread lending priority to higher priority thread, in old code, it is ignored, however the lending should always be recorded, add field td_lend_user_pri to fix the problem, if a thread does not have borrowed priority, its value is PRI_MAX. MFC after: 1 week	2010-12-09 02:42:02 +00:00
trasz	1d758da820	Add a KASSERT to make it obvious when fork_norfproc() is to be called, and set *procp to NULL in all cases. Previously, it was not being set in the ERESTART case. This is effectively no-op, since its value is ignored by callers in the error case. Reviewed by: kib@	2010-12-06 19:15:38 +00:00
trasz	0a2fe19d79	Fix style bug introduced by previous commit.	2010-12-06 16:45:36 +00:00
trasz	690f1210e9	Improve readability by factoring out the !RFPROC case. While here, turn K&R function definitions into ANSI. No functional changes. Reviewed by: kib@	2010-12-06 16:39:18 +00:00
kib	1ffd755b88	Trim whitespaces at the end of lines. Use the commit to record proper log message for r216150. MFC after: 1 week If unix socket has a unix socket attached as the rights that has a unix socket attached as the rights that has a unix socket attached as the rights ... Kernel may overflow the stack on attempt to close such socket. Only close the rights file in the context of the current close if the file is not unix domain socket. Otherwise, postpone the work to taskqueue, preventing unlimited recursion. The pass of the unix domain sockets over the SCM_RIGHTS message control is not widely used, and more, the close of the socket with still attached rights is mostly an application failure. The change should not affect the performance of typical users of SCM_RIGHTS. Reviewed by: jeff, rwatson	2010-12-03 20:39:06 +00:00
kib	44fb3ef253	Reviewed by: jeff, rwatson MFC after: 1 week	2010-12-03 16:15:44 +00:00
trasz	e5fb69509c	Replace pointer to "struct uidinfo" with pointer to "struct ucred" in "struct vm_object". This is required to make it possible to account for per-jail swap usage. Reviewed by: kib@ Tested by: pho@ Sponsored by: FreeBSD Foundation	2010-12-02 17:37:16 +00:00
imp	716b18c157	removed tag is '-', not '+'. remove extra return.	2010-12-02 04:28:01 +00:00
trasz	234ab3f035	Remove useless NULL checks for M_WAITOK mallocs.	2010-12-02 01:14:45 +00:00
imp	edeca9302c	Remove redundant (and bogus) insertion of pnp info when announcing new and retiring devices. That's already inserted elsewhere. Submitted by: n_hibma MFC after: 3 days	2010-11-30 05:54:21 +00:00
mdf	94ee7fc25d	Fix uninitialized variable warning that shows on Tinderbox but not my setup. (??) Submitted by: Michael Butler <imb at protected-networks dot net>	2010-11-29 21:53:21 +00:00
mdf	0ea34870f8	Do not hold the sysctl lock across a call to the handler. This fixes a general LOR issue where the sysctl lock had no good place in the hierarchy. One specific instance is #284 on http://sources.zabbadoz.net/freebsd/lor.html . Reviewed by: jhb MFC after: 1 month X-MFC-note: split oid_refcnt field for oid_running to preserve KBI	2010-11-29 18:18:07 +00:00
mdf	41bb73b7ef	Slightly modify the logic in sysctl_find_oid to reduce the indentation. There should be no functional change. MFC after: 3 days	2010-11-29 18:18:00 +00:00
mdf	fdb72b2e1f	Use the SYSCTL_CHILDREN macro in kern_sysctl.c to help de-obfuscate the code. MFC after: 3 days	2010-11-29 18:17:53 +00:00
kib	8633830499	Account i/o done on cdevs. Reported and tested by: Adam Vande More <amvandemore gmail com> MFC after: 1 week	2010-11-25 20:05:11 +00:00
kib	87501c4bfe	Allow shared-locked vnode to be passed to vunref(9). When shared-locked vnode is supplied as an argument to vunref(9) and resulting usecount is 0, set VI_OWEINACT and do not try to upgrade vnode lock. The latter could cause vnode unlock, allowing the vnode to be reclaimed meantime. Tested by: pho MFC after: 1 week	2010-11-24 12:30:41 +00:00
avg	01b9b54af6	taskqueue: drop unused tq_name field tq_name was used write-only and besides it was just a pointer, so it could point to some garbage in a temporary buffer that's gone. This change shouldn't change KPI/KBI as struct taskqueue is private to subr_taskqueue.c. If we find a need for tq_name it can be resurrected at any moment. taskqueue_create() interface is preserved for this purpose. Suggested by: jhb MFC after: 10 days	2010-11-23 14:30:22 +00:00
pluknet	64dd3dbf39	Update MNT_ROOTFS comments after changes in the root mount logic. Reported by: arundel Suggested by: marcel (phrasing) Approved by: kib (mentor)	2010-11-23 13:49:15 +00:00
cperciva	ba6c9ebbca	Add parentheses for clarity. The parentheses around the two terms of the && are unnecessary but I'm leaving them in for the sake of avoiding confusion (I confuse easily). Submitted by: bde	2010-11-23 04:50:01 +00:00
dim	fb307d7d1d	After some off-list discussion, revert a number of changes to the DPCPU_DEFINE and VNET_DEFINE macros, as these cause problems for various people working on the affected files. A better long-term solution is still being considered. This reversal may give some modules empty set_pcpu or set_vnet sections, but these are harmless. Changes reverted: ------------------------------------------------------------------------ r215318 \| dim \| 2010-11-14 21:40:55 +0100 (Sun, 14 Nov 2010) \| 4 lines Instead of unconditionally emitting .globl's for the __start_set_xxx and __stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu sections are actually defined. ------------------------------------------------------------------------ r215317 \| dim \| 2010-11-14 21:38:11 +0100 (Sun, 14 Nov 2010) \| 3 lines Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout the tree. ------------------------------------------------------------------------ r215316 \| dim \| 2010-11-14 21:23:02 +0100 (Sun, 14 Nov 2010) \| 2 lines Add macros to define static instances of VNET_DEFINE and DPCPU_DEFINE.	2010-11-22 19:32:54 +00:00
attilio	af24e6ed9f	Style fix. Sponsored by: Sandvine Incorporated Requested by: jhb Reviewed by: jhb MFC after: 1 week X-MFC: 215544	2010-11-22 15:28:54 +00:00
attilio	7718cbcbf4	Add the ability for GDB to print out the thread name along with other thread-specific information. In order to do that, and in order to avoid KBI breakage with existing infrastructure, the following semantics are implemented: - For live programs, a new member is added to PT_LWPINFO (pl_tdname) - For cores, a new ELF note is added (NT_THRMISC) that can be used for storing thread-specific, miscellaneous information. Right now it is just populated with a thread name. GDB, then, retrieves the correct information from the corefile via the BFD interface, as it groks the ELF notes and creates appropriate pseudo-sections. Sponsored by: Sandvine Incorporated Tested by: gianni Discussed with: dim, kan, kib MFC after: 2 weeks	2010-11-22 14:42:13 +00:00
cperciva	6cdc82f907	In tc_windup, handle the case where the previous call to tc_windup was more than 1s earlier. Prior to this commit, the computation of th_scale * delta (which produces a 64-bit value equal to the time since the last tc_windup call in units of 2^(-64) seconds) would overflow and any complete seconds would be lost. We fix this by repeatedly converting tc_frequency units of timecounter to one second; this is not exactly correct, since it loses the NTP adjustment, but if we find ourselves going more than 1s at a time between clock interrupts, losing a few seconds worth of NTP adjustments is the least of our problems...	2010-11-22 09:13:25 +00:00
netchild	46e50a7603	By using the 32-bit Linux version of Sun's Java Development Kit 1.6 on FreeBSD (amd64), invocations of "javac" (or "java") eventually end with the output of "Killed" and exit code 137. This is caused by: 1. After calling exec() in multithreaded linux program threads are not destroyed and continue running. They get killed after program being executed finishes. 2. linux_exit_group doesn't return correct exit code when called not from group leader. Which happens regularly using sun jvm. The submitters fix this in a similar way to how NetBSD handles this. I took the PRs away from dchagin, who seems to be out of touch of this since a while (no response from him). The patches committed here are from [2], with some little modifications from me to the style. PR: 141439 [1], 144194 [2] Submitted by: Stefan Schmidt <stefan.schmidt@stadtbuch.de>, gk Reviewed by: rdivacky (in april 2010) MFC after: 5 days	2010-11-22 09:06:59 +00:00
davidxu	b05094dc42	Use atomic instruction to set _has_writer, otherwise there is a race that causes userland to not wake up a thread sleeping in kernel. MFC after: 3 days	2010-11-22 02:42:02 +00:00
kib	7980fb6d3a	Remove prtactive variable and related printf()s in the vop_inactive() and vop_reclaim() methods. They seem to be unused, and the reported situation is normal for the forced unmount. MFC after: 1 week X-MFC-note: keep prtactive symbol in vfs_subr.c	2010-11-19 21:17:34 +00:00
attilio	4902b4f43a	Scan the list in reverse order for the shutdown handlers of loaded modules. This way, when there is a dependency between two modules, the handler of the latter probed runs first. This is a similar approach as the modules are unloaded in the same linkerfile. Sponsored by: Sandvine Incorporated Submitted by: Nima Misaghian <nmisaghian at sandvine dot com> MFC after: 1 week	2010-11-19 19:43:56 +00:00

11994 Commits