Commit Graph

280 Commits

Author SHA1 Message Date
pjd
bbe899b96e Update ZFS from version 6 to 13 and bring in some FreeBSD-specific changes.
This brings a huge number of changes; I'll enumerate only the user-visible ones:

- Delegated Administration

	Allows regular users to perform ZFS operations, like file system
	creation, snapshot creation, etc.

- L2ARC

	Level 2 cache for ZFS - allows additional disks to be used as cache.
	Huge performance improvements, mostly for random reads of mostly
	static content.

- slog

	Allows additional disks to be used for the ZFS Intent Log to speed up
	operations like fsync(2).

- vfs.zfs.super_owner

	Allows regular users to perform privileged operations on files stored
	on ZFS file systems they own. Be very careful with this one.

- chflags(2)

	Not all the flags are supported. This still needs work.

- ZFSBoot

	Support for booting off a ZFS pool. Not finished, AFAIK.

	Submitted by:	dfr

- Snapshot properties

- New failure modes

	Previously, if a write request failed, the system panicked. Now one
	can select from one of three failure modes:
	- panic - panic on write error
	- wait - wait for the disk to reappear
	- continue - serve read requests if possible, block write requests

- Refquota, refreservation properties

	Like the quota and reservation properties, but they don't count space
	consumed by child file systems, clones, and snapshots.

- Sparse volumes

	ZVOLs that don't reserve space in the pool.

- Extended attributes

	Compatible with extattr(2).

- NFSv4-ACLs

	Not sure about the status, might not be complete yet.

	Submitted by:	trasz

- Creation-time properties

- Regression tests for the zpool(8) command.

Obtained from:	OpenSolaris
2008-11-17 20:49:29 +00:00
davidxu
1ebf3ee9a3 Revert rev 184216 and 184199; due to the way thread_lock works,
they may cause a lockup.

Noticed by: peter, jhb
2008-11-05 03:01:23 +00:00
davidxu
2062caca24 Actually, for signal and thread suspension, the extra process spin lock is
unnecessary; the normal process lock and thread lock are enough. The
spin lock is still needed for process and thread exit to mimic the
single sched_lock.
2008-10-23 07:55:38 +00:00
davidxu
3f5ab59cf2 Restore code wrongly removed in SVN revision 173004; its removal caused
threaded processes to get stuck in execv().

Noticed by: delphij
2008-10-16 04:17:17 +00:00
davidxu
5068f6dcf0 Move the per-thread userland debugging flags into a separate field.
This eliminates some locking problems, e.g., cases where a thread lock
is needed but cannot be used at that time. Only the process lock is
needed now for the new field.
2008-10-15 06:31:37 +00:00
jhb
b054f3f992 A suspended thread can, in fact, be swapped out. Thus,
thread_unsuspend_one() needs to optionally wake up the swapper.  Since we
hold the thread lock for that entire function, however, we have to push
that requirement up into the caller.

Found by:	rwatson
2008-08-22 16:15:58 +00:00
attilio
ff459eb3cf Introduce some WITNESS improvements:
- Speed up the lock-ordering lookup by changing the witness graph from a
  linked tree to a matrix. A lookup table caches the lock orderings to
  provide O(1) access to them. Every witness object has a unique
  index within this lookup cache table.
- Reduce the lock contention on w_mtx by acquiring it only when a LOR
  actually happens and not in the sane case. In order to do this, don't totally
  flush lock lists (the per-CPU spinlock list and per-thread sleeplock list)
  but check ll_count whenever we need to verify allocation sanity.
- Introduce the function witness_thread_exit() in the witness namespace, which
  verifies that a thread doesn't hold any witness occurrences while exiting.
- Rename the sysctl debug.witness.graphs to debug.witness.fullgraph and
  add debug.witness.badstacks, which prints out stacks for revealed LORs.
  This is implemented using the stack(9) support, which makes WITNESS
  depend on the STACK option or on the DDB option (which includes STACK).
- Fix style(9) for src/sys/kern/subr_witness.c

The hash table approach was developed by Ilya Maykov on behalf of
Isilon Systems, which kindly released the patch.
Jeff Roberson ported the patch to -CURRENT and fixed the w_mtx contention on
behalf of Nokia.

Submitted by:	Ilya Maykov <ivmaykov at gmail dot com> (Isilon Systems), jeff
Sponsored by:	Nokia
2008-08-13 18:24:22 +00:00
jhb
8af56fb687 If a thread that is swapped out is made runnable, then the setrunnable()
routine wakes up proc0 so that proc0 can swap the thread back in.
Historically, this has been done by waking up proc0 directly from
setrunnable() itself via a wakeup().  When waking up a sleeping thread
that was swapped out (the usual case when waking proc0 since only sleeping
threads are eligible to be swapped out), this resulted in a bit of
recursion (e.g. wakeup() -> setrunnable() -> wakeup()).

With sleep queues having separate locks in 6.x and later, this caused a
spin lock LOR (sleepq lock -> sched_lock/thread lock -> sleepq lock).
An attempt was made to fix this in 7.0 by making the proc0 wakeup use
the ithread mechanism for doing the wakeup.  However, this required
grabbing proc0's thread lock to perform the wakeup.  If proc0 was asleep
elsewhere in the kernel (e.g. waiting for disk I/O), then this degenerated
into the same LOR since the thread lock would be some other sleepq lock.

Fix this by deferring the wakeup of the swapper until after the sleepq
lock held by the upper layer has been locked.  The setrunnable() routine
now returns a boolean value to indicate whether or not proc0 needs to be
woken up.  The end result is that consumers of the sleepq API such as
*sleep/wakeup, condition variables, sx locks, and lockmgr have to wake up
proc0 if they get a non-zero return value from sleepq_abort(),
sleepq_broadcast(), or sleepq_signal().

Discussed with:	jeff
Glanced at by:	sam
Tested by:	Jurgen Weber  jurgen - ish com au
MFC after:	2 weeks
2008-08-05 20:02:31 +00:00
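A minimal sketch of the consumer-side pattern described above, assuming the
post-change sleepq API (a non-zero return means the swapper must be woken)
and a kick_proc0() helper for the actual proc0 wakeup (an assumption here):

    int wakeup_swapper;

    sleepq_lock(wchan);
    wakeup_swapper = sleepq_broadcast(wchan, SLEEPQ_SLEEP, 0, 0);
    sleepq_release(wchan);
    /* Wake the swapper only after the sleepq lock has been dropped,
     * avoiding the sleepq lock -> thread lock -> sleepq lock LOR. */
    if (wakeup_swapper)
            kick_proc0();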
jeff
9d30d1d7a4 - Make SCHED_STATS more generic by adding a wrapper to create the
variables and sysctl nodes.
 - In the reset handler, walk the children of kern_sched_stats and reset the
   counters via the oid_arg1 pointer.  This allows us to add arbitrary
   counters to the tree and still reset them properly.
 - Define a set of switch types to be passed with flags to mi_switch().
   These types are named SWT_*.  These types correspond to SCHED_STATS
   counters and are automatically handled in this way.
 - Make the new SWT_ types more specific than the older switch stats.
   There are now stats for idle switches, remote idle wakeups, remote
   preemption, ithreads idling, etc.
 - Add switch statistics for ULE's pickcpu algorithm.  These stats include
   how much migration there is, how often affinity was successful, how
   often threads were migrated to the local cpu on wakeup, etc.

Sponsored by:	Nokia
2008-04-17 04:20:10 +00:00
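A short sketch of how a caller might tag a voluntary switch under this
scheme, assuming the SW_VOL flag and the SWT_RELINQUISH switch type; the
matching SCHED_STATS counter is then bumped inside mi_switch():

    /* Voluntarily yield the CPU, recording the switch as a relinquish. */
    thread_lock(td);
    mi_switch(SW_VOL | SWT_RELINQUISH, NULL);
    thread_unlock(td);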
jeff
ba540b27d6 - Add a new td flag TDF_NEEDSUSPCHK that is set whenever a thread needs
to enter thread_suspend_check().
 - Set TDF_ASTPENDING along with TDF_NEEDSUSPCHK so we can move the
   thread_suspend_check() to ast() rather than userret().
 - Check TDF_NEEDSUSPCHK in the sleepq_catch_signals() optimization so
   that we don't miss a suspend request.  If this is set use the
   expensive signal path.
 - Set TDF_NEEDSUSPCHK when creating a new thread in thr in case the
   creating thread is due to be suspended as well but has not yet been.

Reviewed by:	davidxu (Authored original patch)
2008-03-21 08:23:25 +00:00
jeff
898428987b - There is no sense in calling sched_newthread() at thread_init() and
thread_fini().  The schedulers initialize themselves properly during
   sched_fork_thread() anyhow.  fini is only called when we're returning
   the memory to the allocator which surely doesn't care what state the
   memory is in.
2008-03-20 03:07:57 +00:00
jeff
4350e599a3 - Restore the NULL check for td_cpuset. This can happen if a partially
constructed thread was torn down as is the case when we fail to allocate
   a kernel stack.
2008-03-19 06:20:21 +00:00
jeff
46f09d5bc3 - Relax requirements for p_numthreads, p_threads, p_swtick, and p_nice from
requiring the per-process spinlock to only requiring the process lock.
 - Reflect these changes in the proc.h documentation and consumers throughout
   the kernel.  This is a substantial reduction in locking cost for these
   fields and was made possible by recent changes to threading support.
2008-03-19 06:19:01 +00:00
jeff
acb93d599c Remove kernel support for M:N threading.
While the KSE project was quite successful in bringing threading to
FreeBSD, the M:N approach taken by the kse library was never developed
to its full potential.  Backwards compatibility will be provided via
libmap.conf for dynamically linked binaries; static binaries will
be broken.
2008-03-12 10:12:01 +00:00
jeff
3b1acbdce2 - Pass the priority argument from *sleep() into sleepq and down into
sched_sleep().  This removes extra thread_lock() acquisition and
   allows the scheduler to decide what to do with the static boost.
 - Change the priority arguments to cv_* to match sleepq/msleep/etc.
   where 0 means no priority change.  Catch -1 in cv_broadcastpri() and
   convert it to 0 for now.
 - Set a flag when sleeping in a way that is compatible with swapping
   since direct priority comparisons are meaningless now.
 - Add a sysctl to ULE, kern.sched.static_boost, defaulting to on, which
   controls the boost behavior.  Turning it off gives better performance
   in some workloads but needs more investigation.
 - While we're modifying sleepq, change signal and broadcast to both
   return with the lock held as the lock was held on enter.

Reviewed by:	jhb, peter
2008-03-12 06:31:06 +00:00
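A sketch of the convention after this change, using a hypothetical softc;
the priority argument requests a boost on wakeup and 0 means "no change":

    /* Sleep, asking to be boosted to PRIBIO when woken. */
    msleep(&sc->sc_ready, &sc->sc_mtx, PRIBIO, "scwait", 0);

    /* Wake all waiters, boosting them to PRIBIO (0 = no change). */
    cv_broadcastpri(&sc->sc_cv, PRIBIO);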
jeff
e30139dff5 - KSE may free a thread that was never actually forked. This will leave
td_cpuset NULL.  Check for this condition before dereferencing the
   cpuset.

Reported by:	david@catwhisker.org, miwi@freebsd.org
Sponsored by:	Nokia
2008-03-12 05:01:14 +00:00
jeff
694203dedd Add cpuset, an api for thread to cpu binding and cpu resource grouping
and assignment.
 - Add a reference to a struct cpuset in each thread that is inherited from
   the thread that created it.
 - Release the reference when the thread is destroyed.
 - Add prototypes for syscalls and macros for manipulating cpusets in
   sys/cpuset.h
 - Add syscalls to create, get, and set new numbered cpusets:
   cpuset(), cpuset_{get,set}id()
 - Add syscalls for getting and setting affinity masks for cpusets or
   individual threads: cpuset_{get,set}affinity()
 - Add types for the 'level' and 'which' parameters for the cpuset.  This
   will permit expansion of the api to cover cpu masks for other objects
   identifiable with an id_t integer.  For example, IRQs and Jails may be
   coming soon.
 - The root set 0 contains all valid cpus.  All threads initially belong to
   cpuset 1.  This permits migrating all threads off of certain cpus to
   reserve them for special applications.

Sponsored by:	Nokia
Discussed with:	arch, rwatson, brooks, davidxu, deischen
Reviewed by:	antoine
2008-03-02 07:39:22 +00:00
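A minimal userland sketch of the affinity half of this API, pinning the
calling thread to CPU 0; the CPU_LEVEL_WHICH/CPU_WHICH_TID constants and
CPU_* macros are assumed here from the 'level' and 'which' parameters
described above:

    #include <sys/param.h>
    #include <sys/cpuset.h>
    #include <err.h>

    int
    main(void)
    {
            cpuset_t mask;

            CPU_ZERO(&mask);
            CPU_SET(0, &mask);
            /* An id of -1 with CPU_WHICH_TID means "the current thread". */
            if (cpuset_setaffinity(CPU_LEVEL_WHICH, CPU_WHICH_TID, -1,
                sizeof(mask), &mask) != 0)
                    err(1, "cpuset_setaffinity");
            return (0);
    }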
julian
265714a11e Give thread0 the tid 100000 and bump the others to start at 100001.
MFC after:	1 week
2007-12-22 04:56:48 +00:00
jeff
4ec9caf00c Refactor select to reduce contention and hide internal implementation
details from consumers.

 - Track individual selectors on a per-descriptor basis such that there
   are no longer collisions and, after sleeping for events, only those
   descriptors which triggered events must be rescanned.
 - Protect the selinfo (per descriptor) structure with a mtx pool mutex.
   mtx pool mutexes were chosen to preserve api compatibility with
   existing code which does nothing but bzero() to setup selinfo
   structures.
 - Use a per-thread wait channel rather than a global wait channel.
 - Hide select implementation details in a seltd structure which is
   opaque to the rest of the kernel.
 - Provide a 'selsocket' interface for those kernel consumers who wish to
   select on a socket when they have no fd so they no longer have to
   be aware of select implementation details.

Tested by:	kris
Reviewed on:	arch
2007-12-16 06:21:20 +00:00
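A sketch of the 'selsocket' usage this enables for fd-less kernel consumers;
the signature is assumed from the description above:

    struct timeval tv = { 5, 0 };
    int error;

    /* Wait up to 5 seconds for the socket to become readable,
     * without ever materializing a file descriptor. */
    error = selsocket(so, POLLIN, &tv, curthread);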
jeff
12adc443d6 - Re-implement lock profiling in such a way that it no longer breaks
the ABI when enabled.  There is no longer an embedded lock_profile_object
   in each lock.  Instead a list of lock_profile_objects is kept per-thread
   for each lock it may own.  The cnt_hold statistic is now always 0 to
   facilitate this.
 - Support shared locking by tracking individual lock instances and
   statistics in the per-thread per-instance lock_profile_object.
 - Make the lock profiling hash table a per-cpu singly linked list with a
   per-cpu static lock_prof allocator.  This removes the need for an array
   of spinlocks and reduces cache contention between cores.
 - Use a separate hash for spinlocks and other locks so that only a
   critical_enter() is required and not a spinlock_enter() to modify the
   per-cpu tables.
 - Count time spent spinning in the lock statistics.
 - Remove the LOCK_PROFILE_SHARED option as it is always supported now.
 - Specifically drop and release the scheduler locks in both schedulers
   since we track owners now.

In collaboration with:	Kip Macy
Sponsored by:	Nokia
2007-12-15 23:13:31 +00:00
rrs
303afeb279 - Adds event handlers for process_ctor, process_dtor, process_init,
process_fini, thread_ctor, thread_dtor, thread_init, and thread_fini. This
  will allow us to dynamically extend areas in proc/thread for DTrace ;-)
Reviewed by:    rwatson
2007-11-15 14:20:07 +00:00
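A hedged sketch of what a consumer of one of these handlers might look like,
assuming the thread_ctor callback receives the new thread as its argument
(the callback shape is an assumption):

    static void
    my_thread_ctor(void *arg, struct thread *td)
    {
            /* Attach per-thread DTrace scratch state here. */
    }

    EVENTHANDLER_REGISTER(thread_ctor, my_thread_ctor, NULL,
        EVENTHANDLER_PRI_ANY);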
julian
303014d009 This time REALLY copy the name from the proc to the thread as a default. 2007-11-15 06:35:26 +00:00
marcel
1e7c4f0a3f o  Rename cpu_thread_setup() to cpu_thread_alloc() to better
communicate that it relates to (is called by) thread_alloc().
o  Add cpu_thread_free() which is called from thread_free()
   to counteract cpu_thread_alloc().

i386:	Have cpu_thread_free() call cpu_thread_clean() to
	preserve behaviour.
ia64:	Have cpu_thread_free() call mtx_destroy() for the
	mutex initialized in cpu_thread_alloc().

PR: ia64/118024
2007-11-14 20:21:54 +00:00
julian
7ee6259be7 A bunch more files that should probably print out a thread name
instead of a process name.
2007-11-14 06:51:33 +00:00
julian
b248158d8d Make sure there is a good default thread name for all threads. 2007-11-14 06:04:57 +00:00
kib
9ae733819b Fix for the panic("vm_thread_new: kstack allocation failed") and a
silent NULL pointer dereference in the i386 and sparc64 pmap_pinit()
when kmem_alloc_nofault() failed to allocate address space. Both
functions now return an error instead of panicking or dereferencing NULL.

As a consequence, vmspace_exec() and vmspace_unshare() return the errno
int. A struct vmspace arg was added to vm_forkproc() to avoid dealing
with a failed allocation when most of the fork1() job is already done.

The kernel stack for the thread is now set up in thread_alloc(), which
itself may return NULL. Also, allocation of the first process thread is
performed in fork1() to properly deal with stack allocation failure.
proc_linkup() is separated into proc_linkup(), called from fork1(), and
proc_linkup0(), which is used to set up the kernel process (formerly
known as the swapper).

In collaboration with:	Peter Holm
Reviewed by:	jhb
2007-11-05 11:36:16 +00:00
julian
11e1aa0d18 Introduce a way to make pure kernel threads.
kthread_add() takes the same parameters as the old kthread_create()
plus a pointer to a process structure, and adds a kernel thread
to that process.

kproc_kthread_add() takes the parameters for kthread_add,
plus a process name and a pointer to a pointer to a process instead of just
a pointer, and if the proc * is NULL, it creates the process to the
specifications required, before adding the thread to it.

All other old kthread_xxx() calls remain, but act on (struct thread *)
instead of (struct proc *). One reason to change the name is so that
any old kernel modules that are lying around and expect kthread_create()
to make a process will not just accidentally link.

Fix top to show kernel threads by their thread name in -SH mode.
Add a tdnam formatting option to ps to show thread names.

Make all idle threads actual kthreads and put them into their own idled process.
Make all interrupt threads kthreads and put them in an interd process
(mainly for aesthetic and accounting reasons).
Rename proc 0 to 'kernel'; its swapper thread is now 'swapper'.

man page fixes to follow.
2007-10-26 08:00:41 +00:00
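A minimal sketch of the new API; the worker body and names are hypothetical,
and the assumption here is that a NULL proc pointer attaches the thread to
the kernel process:

    static void
    my_worker(void *arg)
    {
            for (;;) {
                    /* ... do periodic work ... */
                    pause("mywork", hz);
            }
    }

    static void
    start_my_worker(void)
    {
            struct thread *td;
            int error;

            error = kthread_add(my_worker, NULL, NULL, &td, 0, 0,
                "my_worker");
            if (error != 0)
                    printf("kthread_add failed: %d\n", error);
    }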
jeff
13d3160ef5 - Call sched_sleep() before we suspend threads. sched_wakeup() is already
called via setrunnable().  This allows time slept while suspended to
   be accounted for in swap decisions.

Approved by:	re
2007-09-21 04:04:22 +00:00
attilio
e25b203061 Fix some entries in witness's static table of locks.
In particular:
- smp_tlb_mtx is no longer used, so it is axed.
- The smp rendezvous lock isn't really a leaf spin-mutex. Its bad placement
  in the table, however, has been the source of false-positive LOR reports
  involving the dt_lock.  However, the smp rendezvous lock would have had
  sched_lock ahead of it as an older lock, so it still wasn't a leaf lock.
- allpmaps is only used on the ia32 architecture, so it is inserted in the
  appropriate stub.

Additionally:
- kse_zombie_lock is no longer present, so its definition is axed out.
- zombie_lock doesn't need an exported symbol, so just let it be
  declared static.

Tested by: kris
Approved by: jeff (mentor)
Approved by: re
2007-09-20 20:38:43 +00:00
jeff
3fc0f8b973 - Move all of the PS_ flags into either p_flag or td_flags.
- p_sflag was mostly protected by PROC_LOCK rather than the PROC_SLOCK or
   previously the sched_lock.  These bugs have existed for some time.
 - Allow swapout to try each thread in a process individually and then
   swapin the whole process if any of these fail.  This allows us to move
   most scheduler related swap flags into td_flags.
 - Keep ki_sflag for backwards compat but change all in-tree tools to
   use the new and more correct location of P_INMEM.

Reported by:	pho
Reviewed by:	attilio, kib
Approved by:	re (kensmith)
2007-09-17 05:31:39 +00:00
attilio
c2dedaa0a9 Actually, upcalls cannot be freed while destroying the thread because we
would have to call uma_zfree() with various spinlocks held.  Rearranging the
code would not help here because we cannot break atomicity with respect to
the process spinlock, so the only choice we have is to defer the operation.
In order to do this, use a global queue synchronized through the kse_lock
spinlock; it is emptied at any thread_alloc() / thread_wait() through a
call to thread_reap().
Note that this approach is not ideal, as we would want a per-process
list of zombie upcalls, but it follows the initial guidelines of the KSE
authors.

Tested by: jkim, pav
Approved by: jeff, julian
Approved by: re
2007-07-27 09:21:18 +00:00
attilio
ad75d346f7 Actually, the locking of the KSE kernel bits is broken and can likely
lead to dangerous races.
Fix these problems by adding correct locking for the members of 'struct
kse_upcall' and other struct proc/struct thread related members.
For the moment, just leave ku_mflag and ku_flags "lazily" locked.
While here, clean up the code, removing the function kse_GC() (unused)
and merging upcall_link(), upcall_unlink(), and upcall_stash() into their
respective callers (static functions, very short, and only called in one
place).

Reported by: pav
Tested by: pav (on some pointyhat cluster nodes)
Approved by: jeff
Approved by: re
Sponsored by: NGX Italy (http://www.ngx.it)
2007-07-23 14:52:22 +00:00
jeff
26422aea29 - Garbage collect unused concurrency functions.
- Remove unused kse fields from struct proc.
 - Group remaining fields and #ifdef KSE them.
 - Move some kern_kse.c-only prototypes out of proc.h and into kern_kse.c.

Discussed with:	Julian
2007-06-12 19:49:39 +00:00
jeff
49712c9a60 Solve a complex exit race introduced with thread_lock:
- Add a count of exiting threads, p_exitthreads, to struct proc.
 - Increment p_exitthreads when we set the deadthread in thread_exit().
 - When we thread_stash() a deadthread, use an atomic to drop the count.
 - Spin until the p_exitthreads count reaches 0 in thread_wait().
 - Lock the last exiting thread momentarily to be certain that it has
   exited cpu_throw().
 - Restructure thread_wait().  It does not need a loop as there will only
   ever be one thread.

Tested by:	moose@opera.com
Reported by:	kris, moose@opera.com
2007-06-12 07:24:46 +00:00
attilio
c105658c88 The current rusage code shows peculiar problems:
- Unsafeness of ruadd() in thread_exit()
- Unatomicity of thread_exit() in the exit1() operations

This patch addresses these problems by allocating p_ru as part of the
process and modifying the way it is accessed.

A small chunk of this patch resolves a race on p_state in kern_wait(),
since we have to be sure about the zombifying of the process.

Submitted by: jeff
Approved by: jeff (mentor)
2007-06-09 18:56:11 +00:00
jeff
8931208c40 Commit 4/14 of sched_lock decomposition.
- Use thread_lock() rather than sched_lock for per-thread scheduling
   synchronization.
 - Use the per-process spinlock rather than the sched_lock for per-process
   scheduling synchronization.
 - Move some common code into thread_suspend_switch() to handle the
   mechanics of suspending a thread.  The locking here is incredibly
   convoluted and should be simplified.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:52:24 +00:00
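A sketch of the per-thread locking idiom this decomposition introduces;
TDF_NEEDRESCHED is used here only as a familiar example of state that is
now protected by the thread's own lock:

    /* Per-thread scheduling state is protected by the thread's own
     * lock rather than the global sched_lock. */
    thread_lock(td);
    td->td_flags |= TDF_NEEDRESCHED;
    thread_unlock(td);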
attilio
9bd4fdf7ce Do proper "locking" for the missing vmmeter parts.
We no longer assume sched_lock protection for some of them and use the
distributed-loads method for vmmeter (distributed across CPUs).

Reviewed by: alc, bde
Approved by: jeff (mentor)
2007-06-04 21:45:18 +00:00
jeff
a7a8bac81f - Move rusage from being per-process in struct pstats to per-thread in
td_ru.  This removes the requirement for per-process synchronization in
   statclock() and mi_switch().  This was previously supported by
   sched_lock which is going away.  All modifications to rusage are now
   done in the context of the owning thread.  Reads proceed without locks.
 - Aggregate exiting threads rusage in thread_exit() such that the exiting
   thread's rusage is not lost.
 - Provide a new routine, rufetch() to fetch an aggregate of all rusage
   structures from all threads in a process.  This routine must be used
   in any place requiring a rusage from a process prior to its exit.  The
   exited process's rusage is still available via p_ru.
 - Aggregate tick statistics only on demand via rufetch() or when a thread
   exits.  Tick statistics are kept in the thread and protected by sched_lock
   until it exits.

Initial patch by:	attilio
Reviewed by:		attilio, bde (some objections), arch (mostly silent)
2007-06-01 01:12:45 +00:00
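A sketch of the new aggregation call; any consumer wanting a live process's
rusage now goes through it rather than reading per-process stats directly:

    struct rusage ru;

    /* Sum the per-thread rusage structures (plus already-exited
     * threads) into one snapshot for the process. */
    rufetch(p, &ru);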
attilio
7dd8ed88a9 Revert the VMCNT_* operations introduction.
Probably, a general approach is not the best solution here, so we should
solve the sched_lock protection problems separately.

Requested by: alc
Approved by: jeff (mentor)
2007-05-31 22:52:15 +00:00
jeff
e1996cb960 - Define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating
vmcnts.  This can be used to abstract away pcpu details, but it also
   changes all counters to use atomics now.  This means sched_lock is no
   longer responsible for protecting counts in the switch routines.

Contributed by:		Attilio Rao <attilio@FreeBSD.org>
2007-05-18 07:10:50 +00:00
jhb
38635fcc55 Align 'struct thread' on 16 byte boundaries so that the lower 4 bits are
always 0.  Previously we aligned threads on a minimum of 8-byte boundaries.

Note: This changes the uma zone to no longer cache-align threads.  We
really want the uma zone to align threads to MAX(16, cache line size),
but there currently isn't a good way to express that to uma.

Submitted by:	attilio
2007-03-27 16:51:34 +00:00
mohans
a332cb00d5 Over NFS, an open() call could result in multiple over-the-wire
GETATTRs being generated - one from lookup()/namei() and the other
from nfs_open() (for close-to-open consistency). This change eliminates
the GETATTR in nfs_open() if an over-the-wire GETATTR was done from the
namei() path. Instead of extending the vop interface, we timestamp each
attr load, and use this to detect whether a GETATTR was done from namei()
for this syscall. This introduces a thread-local variable that counts the
syscalls made by the thread and uses <pid, tid, thread syscalls> as
the attrload timestamp. Thanks to jhb@ and peter@ for a discussion on
thread state that could be used as the timestamp with minimal overhead.
2007-03-09 04:02:38 +00:00
rwatson
2a3330b81a Prefer a more traditional spelling of inhibited in comments and panic
messages.
2006-12-31 15:56:04 +00:00
davidxu
b0b74f9bd3 Remove unused sysctls. 2006-12-19 13:06:01 +00:00
julian
396ed947f6 Threading cleanup.. part 2 of several.
Make part of John Birrell's KSE patch permanent.
Specifically, remove:
Any reference to the ksegrp structure. This feature was
never fully utilised and made things overly complicated.
All code in the scheduler that tried to make threaded programs
fair to unthreaded programs.  Libpthread processes will already
do this to some extent and libthr processes already disable it.

Also:
Since this makes such a big change to the scheduler(s), take the opportunity
to rename some structures and elements that had to be moved anyhow.
This makes the code a lot more readable.

The ULE scheduler compiles again but I have no idea if it works.

The 4bsd scheduler still requires a little cleaning and some functions that
now do ALMOST nothing will go away, but I thought I'd do that as a separate
commit.

Tested by David Xu and Dan Eischen using libthr and libpthread.
2006-12-06 06:34:57 +00:00
davidxu
ce90069697 Remove member p_procscopegrp which is no longer used by libthr. 2006-10-27 05:45:44 +00:00
jb
f82c799735 Make KSE a kernel option, turned on by default in all GENERIC
kernel configs except sun4v (which doesn't process signals properly
with KSE).

Reviewed by:	davidxu@
2006-10-26 21:42:22 +00:00
davidxu
5a12667fcf This is the initial version of POSIX priority mutex support; a new
userland mutex structure is added as follows:
struct umutex {
        __lwpid_t       m_owner;
        uint32_t        m_flags;
        uint32_t        m_ceilings[2];
        uint32_t        m_spare[4];
};
m_owner represents the owner thread; it is a thread id. In the non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex. If the
mutex is contested, the high-order bit will be set, and userland should do
locking and unlocking via kernel syscalls. Flag UMUTEX_PRIO_INHERIT
represents pthread's PTHREAD_PRIO_INHERIT mutex; when contention happens,
the kernel should do priority propagation. Flag UMUTEX_PRIO_PROTECT
indicates pthread's PTHREAD_PRIO_PROTECT mutex; userland should initialize
m_owner to the contested state UMUTEX_CONTESTED, so atomic_cmpset_int will
fail and the kernel syscall will be invoked to do the locking. This is
because, for such a mutex, the kernel should always boost the thread's
priority before it can lock the mutex. m_ceilings is used by
PTHREAD_PRIO_PROTECT mutexes: the first element is used to boost the
thread's priority when it has locked the mutex, and the second element is
used when the mutex is unlocked. The PTHREAD_PRIO_PROTECT mutex's linked
list is kept in userland; m_ceilings[1] is managed by the thread library,
so the kernel needn't allocate memory to keep the list. When such a mutex
is unlocked, the kernel resets m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicates whether the synchronization object is
process shared; if the flag is not set, it saves a vm_map_lookup() call.

The umtx chain is still used as a sleep queue. When a thread is blocked on a
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagation; it is dynamically allocated and reference counted. It is not
optimized but works well in my tests. While the umtx chain has its own
locking protocol, the priority propagation paths are all protected by
sched_lock, because the priority propagation function is called with
sched_lock held from the scheduler.

No visible performance degradation was found with these changes. Some
parameter names in the _umtx_op syscall have been renamed.
2006-08-28 04:24:51 +00:00
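A userland fast-path sketch matching the description above; error handling
is omitted, the constants are assumed to be those from sys/umtx.h, and the
cast reflects that m_owner is a __lwpid_t:

    long tid;

    thr_self(&tid);
    /* Uncontested case: a single compare-and-set takes the lock. */
    if (atomic_cmpset_acq_32((volatile uint32_t *)&m->m_owner,
        UMUTEX_UNOWNED, (uint32_t)tid) == 0)
            /* Contested (or PRIO_PROTECT): let the kernel do the work. */
            _umtx_op(m, UMTX_OP_MUTEX_LOCK, 0, NULL, NULL);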
maxim
d96c84ab9e o Fix typo in the comment.
PR:		kern/99632
Submitted by:	clsung
2006-06-30 08:10:55 +00:00
davidxu
d7a4692118 Rethink it a bit: if there is a STOP flag, don't bother to resume other
threads.
2006-03-21 10:05:15 +00:00