freebsd-dev

Author	SHA1	Message	Date
Eric van Gyzen	9dbdf2a169	When the RTC is adjusted, reevaluate absolute sleep times based on the RTC POSIX 2008 says this about clock_settime(2): If the value of the CLOCK_REALTIME clock is set via clock_settime(), the new value of the clock shall be used to determine the time of expiration for absolute time services based upon the CLOCK_REALTIME clock. This applies to the time at which armed absolute timers expire. If the absolute time requested at the invocation of such a time service is before the new value of the clock, the time service shall expire immediately as if the clock had reached the requested time normally. Setting the value of the CLOCK_REALTIME clock via clock_settime() shall have no effect on threads that are blocked waiting for a relative time service based upon this clock, including the nanosleep() function; nor on the expiration of relative timers based upon this clock. Consequently, these time services shall expire when the requested relative interval elapses, independently of the new or old value of the clock. When the real-time clock is adjusted, such as by clock_settime(3), wake any threads sleeping until an absolute real-clock time. Such a sleep is indicated by a non-zero td_rtcgen. The sleep functions will set that field to zero and return zero to tell the caller to reevaluate its sleep duration based on the new value of the clock. At present, this affects the following functions: pthread_cond_timedwait(3) pthread_mutex_timedlock(3) pthread_rwlock_timedrdlock(3) pthread_rwlock_timedwrlock(3) sem_timedwait(3) sem_clockwait_np(3) I'm working on adding clock_nanosleep(2), which will also be affected. Reported by: Sebastian Huber <sebastian.huber@embedded-brains.de> Reviewed by: jhb, kib MFC after: 2 weeks Relnotes: yes Sponsored by: Dell EMC Differential Revision: https://reviews.freebsd.org/D9791	2017-03-14 19:06:44 +00:00
Eric van Gyzen	b215ceaaec	Add sem_clockwait_np() This function allows the caller to specify the reference clock and choose between absolute and relative mode. In relative mode, the remaining time can be returned. The API is similar to clock_nanosleep(3). Thanks to Ed Schouten for that suggestion. While I'm here, reduce the sleep time in the semaphore "child" test to greatly reduce its runtime. Also add a reasonable timeout. Reviewed by: ed (userland) MFC after: 2 weeks Relnotes: yes Sponsored by: Dell EMC Differential Revision: https://reviews.freebsd.org/D9656	2017-02-23 19:36:38 +00:00
Adrian Chadd	0046bef85a	[mips] make UMTX_CHAINS configurable at compile time. The default (512) wastes quite a bit of space which doesn't really buy us much on highly embedded systems which don't take a lot of locks in parallel. This makes it at least build time configurable so people can experiment.	2016-11-15 01:34:38 +00:00
Konstantin Belousov	0f2d97838d	In both do_rw_wrlock() and do_rw_rdlock() after r304808, do not obliterate possible error from sleep with errors from umtxq_check_susp(), when looping to clear URWLOCK_{READ,WRITE}_WAITERS. Noted and reviewed by: vangyzen Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-08-25 19:15:02 +00:00
Konstantin Belousov	28e21133f3	Prevent leak of URWLOCK_READ_WAITERS flag for urwlocks. If there was some error, e.g. the sleep was interrupted, as in the referenced PR, do_rw_rdlock() did not cleared URWLOCK_READ_WAITERS. Since unlock only wakes up write waiters when there is no read waiters, for URWLOCK_PREFER_READER kind of locks, the result was missed wakeups for writers. In particular, the most visible victims are ld-elf.so locks in processes which loaded libthr, because rtld locks are urwlocks in prefer-reader mode. Normal rwlocks fall into prefer-reader mode only if thread already owns rw lock in read mode, which is not typical and correspondingly less visible. In the PR, unowned rtld bind lock was waited for in the process where only one thread was left alive. Note that do_rw_wrlock() correctly clears URWLOCK_WRITE_WAITERS in case of errors. Reported and tested by: longwitz@incore.de PR: 211947 Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-08-25 16:35:42 +00:00
Eric Badger	b0f2185bbe	sem_post(): wake up the sleeper only after adjusting has_waiters If the caller of sem_post() wakes up a thread sleeping via sem_wait() before it clears the has_waiters flag, the caller of sem_wait() has no way of knowing when it is safe to destroy the semaphore and reuse the memory. This is because the caller of sem_post() may be interrupted between the wake step and the clearing of has_waiters. It will then write into the has_waiters flag in userspace after being preempted for some unknown amount of time. Reviewed by: jhb, kib, vangyzen Approved by: kib (mentor), vangyzen (mentor) MFC after: 2 weeks Sponsored by: Dell Inc. Differential Revision: https://reviews.freebsd.org/D7505	2016-08-15 20:09:09 +00:00
Konstantin Belousov	2a339d9e3d	Add implementation of robust mutexes, hopefully close enough to the intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013. A robust mutex is guaranteed to be cleared by the system upon either thread or process owner termination while the mutex is held. The next mutex locker is then notified about inconsistent mutex state and can execute (or abandon) corrective actions. The patch mostly consists of small changes here and there, adding neccessary checks for the inconsistent and abandoned conditions into existing paths. Additionally, the thread exit handler was extended to iterate over the userspace-maintained list of owned robust mutexes, unlocking and marking as terminated each of them. The list of owned robust mutexes cannot be maintained atomically synchronous with the mutex lock state (it is possible in kernel, but is too expensive). Instead, for the duration of lock or unlock operation, the current mutex is remembered in a special slot that is also checked by the kernel at thread termination. Kernel must be aware about the per-thread location of the heads of robust mutex lists and the current active mutex slot. When a thread touches a robust mutex for the first time, a new umtx op syscall is issued which informs about location of lists heads. The umtx sleep queues for PP and PI mutexes are split between non-robust and robust. Somewhat unrelated changes in the patch: 1. Style. 2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared pi mutexes. 3. Removal of the userspace struct pthread_mutex m_owner field. 4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls the lifetime of the shared mutex associated with a vnode' page. Reviewed by: jilles (previous version, supposedly the objection was fixed) Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects) Tested by: pho Sponsored by: The FreeBSD Foundation	2016-05-17 09:56:22 +00:00
Konstantin Belousov	2cfddaa6ff	Fix umtx lock/trylock for compat32. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2016-04-19 11:37:43 +00:00
Konstantin Belousov	1bdbd70599	Implement process-shared locks support for libthr.so.3, without breaking the ABI. Special value is stored in the lock pointer to indicate shared lock, and offline page in the shared memory is allocated to store the actual lock. Reviewed by: vangyzen (previous version) Discussed with: deischen, emaste, jhb, rwatson, Martin Simmons <martin@lispworks.com> Tested by: pho Sponsored by: The FreeBSD Foundation	2016-02-28 17:52:33 +00:00
Konstantin Belousov	50657fd342	Minor (and incomplete) style cleanup. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-10-30 20:47:42 +00:00
Konstantin Belousov	2936e0013c	Also mark compat32 umtx op table as constant. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-10-30 19:32:30 +00:00
Konstantin Belousov	c539e87014	Use C99 array initialization, which also makes the code self-documented, and eases addition of new ops. For the similar reasons, eliminate UMTX_OP_MAX. nitems() handles the only use of the symbol. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-10-30 19:20:40 +00:00
Ed Schouten	dc4b532479	Fix bad arithmetic in umtx_key_get() to compute object offset. It looks like umtx_key_get() has the addition and subtraction the wrong way around, meaning that it fails to match in certain cases. This causes the cloudlibc unit tests to deadlock in certain cases. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D3287	2015-08-04 06:01:13 +00:00
Ed Schouten	52942c1eae	Add missing const keyword to function parameter. The umtx_key_get() function does not dereference the address off the userspace object. The pointer can safely be const.	2015-08-03 21:11:33 +00:00
Eric van Gyzen	e858b027db	Clean up some cosmetic nits in kern_umtx.c, found during recent work in this area and by the Clang static analyzer. Remove some dead assignments. Fix a typo in a panic string. Use umtx_pi_disown() instead of duplicate code. Use an existing variable instead of curthread. Approved by: kib (mentor) MFC after: 3 days Sponsored by: Dell Inc	2015-03-28 21:21:40 +00:00
Konstantin Belousov	13dad10871	The umtx_lock mutex is used by top-half of the kernel, but is currently a spin lock. Apparently, the only reason for this is that umtx_thread_exit() is called under the process spinlock, which put the requirement on the umtx_lock. Note that the witness static order list is wrong for the umtx_lock, umtx_lock is explicitely before any thread lock, so it is also before sleepq locks. Change umtx_lock to be the sleepable mutex. For the reason above, the calls to umtx_thread_exit() are moved from thread_exit() earlier in each caller, when the process spin lock is not yet taken. Discussed with: jhb Tested by: pho (previous version) Sponsored by: The FreeBSD Foundation MFC after: 3 weeks	2015-02-28 04:19:02 +00:00
Konstantin Belousov	84b736b268	When failing to claim ownership of a umtx_pi, restore the umutex owner to its previous, unowned state. This avoids compounding an existing problem of inconsistent ownership. Submitted by: Eric van Gyzen <eric_van_gyzen@dell.com> Obtained from: Dell Inc. PR: 198914 MFC after: 1 week	2015-02-25 16:17:16 +00:00
Konstantin Belousov	cc876d2c5c	When unlocking a contested PI pthread mutex, if the queue of waiters is empty, look up the umtx_pi and disown it if the current thread owns it. This can happen if a signal or timeout removed the last waiter from the queue, but there is still a thread in do_lock_pi() holding a reference on the umtx_pi. The unlocking thread might not own the umtx_pi in this case, but if it does, it must disown it to keep the ownership consistent between the umtx_pi and the umutex. Submitted by: Eric van Gyzen <eric_van_gyzen@dell.com> with advice from: Elliott Rabe and Jim Muchow, also at Dell Inc. Obtained from: Dell Inc. PR: 198914	2015-02-25 16:12:56 +00:00
Konstantin Belousov	6690381ef1	The dependency chain for priority-inheritance mutexes could be subverted by userspace into cycle. Both umtx_propagate_priority() and umtx_repropagate_priority() would then loop infinitely, owning the spinlock. Check for the cycle using standard Floyd' algorithm before doing the pass in the affected functions. Add simple check for condition of tricking the thread into a wait for itself, which could be easily simulated by usermode without race. Found by: Eric van Gyzen <eric@vangyzen.net> In collaboration with: Eric van Gyzen <eric@vangyzen.net> Tested by: pho MFC after: 1 week	2015-01-31 12:27:40 +00:00
Konstantin Belousov	416be7a1c6	Fix assertion, &uc->uc_busy is never zero, the intent is to test the uc_busy value, and not its address [1]. Remove the single use of the macro, write KASSERT() explicitely in the code of umtxq_sleep_pi(). Submitted by: Eric van Gyzen <eric@vangyzen.net> [1] MFC after: 1 week	2014-11-13 18:51:09 +00:00
Konstantin Belousov	2361c6d135	Add type qualifier volatile to the base (userspace) address argument of fuword(9) and suword(9). This makes the functions type-compatible with volatile objects and does not require devolatile force, e.g. in kern_umtx.c. Requested by: bde Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 3 weeks	2014-10-31 17:43:21 +00:00
Konstantin Belousov	f7e91c288a	Convert kern_umtx.c to use fueword() and casueword(). Also fix some mishandling of suword(9) errors as errno, which resulted in spurious ERESTART. Sponsored by: The FreeBSD Foundation Tested by: pho MFC after: 3 weeks	2014-10-28 15:30:33 +00:00
John Baldwin	1bc9ea1caa	Use correct type in __DEVOLATILE().	2014-10-25 20:42:47 +00:00
Xin LI	6362e06b42	Fix build.	2014-10-25 00:16:36 +00:00
John Baldwin	53e1ffbbce	The current POSIX semaphore implementation stores the _has_waiters flag in a separate word from the _count. This does not permit both items to be updated atomically in a portable manner. As a result, sem_post() must always perform a system call to safely clear _has_waiters. This change removes the _has_waiters field and instead uses the high bit of _count as the _has_waiters flag. A new umtx object type (_usem2) and two new umtx operations are added (SEM_WAIT2 and SEM_WAKE2) to implement these semantics. The older operations are still supported under the COMPAT_FREEBSD9/10 options. The POSIX semaphore API in libc has been updated to use the new implementation. Note that the new implementation is not compatible with the previous implementation. However, this only affects static binaries (which cannot be helped by symbol versioning). Binaries using a dynamic libc will continue to work fine. SEM_MAGIC has been bumped so that mismatched binaries will error rather than corrupting a shared semaphore. In addition, a padding field has been added to sem_t so that it remains the same size. Differential Revision: https://reviews.freebsd.org/D961 Reported by: adrian Reviewed by: kib, jilles (earlier version) Sponsored by: Norse	2014-10-24 20:02:44 +00:00
Konstantin Belousov	fbb6eca60f	In do_lock_pi(), do not override error from umtxq_sleep_pi() when doing suspend check. This restores the pre-r251684 behaviour, to retry once after the signal is detected. PR: kern/192918 Submitted by: Elliott Rabe, Dell Inc., Eric van Gyzen <eric@vangyzen.net> Obtained from: Dell Inc. MFC after: 1 week	2014-08-22 18:42:14 +00:00
Attilio Rao	3198603edd	Fix comments. Sponsored by: EMC / Isilon Storage Division	2014-03-19 12:45:40 +00:00
Attilio Rao	ce42e79310	Remove dead code from umtx support: - Retire long time unused (basically always unused) sys__umtx_lock() and sys__umtx_unlock() syscalls - struct umtx and their supporting definitions - UMUTEX_ERROR_CHECK flag - Retire UMTX_OP_LOCK/UMTX_OP_UNLOCK from _umtx_op() syscall __FreeBSD_version is not bumped yet because it is expected that further breakages to the umtx interface will follow up in the next days. However there will be a final bump when necessary. Sponsored by: EMC / Isilon storage division Reviewed by: jhb	2014-03-18 21:32:03 +00:00
Konstantin Belousov	1d7466bca4	Fix two issues with the spin loops in the umtx(2) implementation. - When looping, check for the pending suspension. Otherwise, other usermode thread which races with the looping one, could try to prevent the process from stopping or exiting. - Add missed checks for the faults from casuword*(). The code is structured in a way which makes the loops exit if the specified address is invalid, since both fuword() and casuword() return -1 on the fault. But if the address is mapped readonly, the typical value read by fuword() is different from -1, while casuword() returns -1. Absent the checks for casuword() faults, this is interpreted as the race with other thread and causes non-interruptible spinning in the kernel. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2013-06-13 09:33:22 +00:00
Jilles Tjoelker	1e367efa8b	sem: Restart the POSIX sem_* calls after signals with SA_RESTART set. Programs often do not expect an [EINTR] return from sem_wait() and POSIX only allows it if the signal was installed without SA_RESTART. The timeout in sem_timedwait() is absolute so it can be restarted normally. The umtx call can be invoked with a relative timeout and in that case [ERESTART] must be changed to [EINTR]. However, libc does not do this. The old POSIX semaphore implementation did this correctly (before r249566), unlike the new umtx one. It may be desirable to avoid [EINTR] completely, which matches the pthread functions and is explicitly permitted by POSIX. However, the kernel must return [EINTR] at least for signals with SA_RESTART clear, otherwise pthread cancellation will not abort a semaphore wait. In this commit, only restore the 8.x behaviour which is also permitted by POSIX. Discussed with: jhb MFC after: 1 week	2013-04-19 10:16:00 +00:00
Attilio Rao	d52d7aa871	Fix a bug in UMTX_PROFILING: UMTX_PROFILING should really analyze the distribution of locks as they index entries in the umtxq_chains hash-table. However, the current implementation does add/dec the length counters for every thread insert/removal, measuring at all really userland contention and not the hash distribution. Fix this by correctly add/dec the length counters in the points where it is really needed. Please note that this bug brought us questioning in the past the quality of the umtx hash table distribution. To date with all the benchmarks I could try I was not able to reproduce any issue about the hash distribution on umtx. Sponsored by: EMC / Isilon storage division Reviewed by: jeff, davide MFC after: 2 weeks	2013-03-21 19:58:25 +00:00
Attilio Rao	1fc8c346d5	Improve UMTX_PROFILING: - Use u_int values for length and max_length values - Add a way to reset the max_length heuristic in order to have the possibility to reuse the mechanism consecutively without rebooting the machine - Add a way to quick display top5 contented buckets in the system for the max_length value. This should give a quick overview on the quality of the hash table distribution. Sponsored by: EMC / Isilon storage division Reviewed by: jeff, davide	2013-03-09 15:31:19 +00:00
Davide Italiano	ba4be2110a	The fields of struct timespec32 should be int32_t and not uint32_t. Make this change. Reviewed by: bde, davidxu Tested by: pho MFC after: 1 week	2012-10-27 23:42:41 +00:00
David Xu	d7f97db7bd	Some style fixes inspired by @bde.	2012-08-11 23:48:39 +00:00
David Xu	e8afbca2bc	tvtohz will print out an error message if a negative value is given to it, avoid this problem by detecting timeout earlier. Reported by: pho	2012-08-11 00:06:56 +00:00
Davide Italiano	331805a5d3	Fix some style bugs introduced in a previous commit (r233045) Reported by: glebius, jmallet Reviewed by: jmallet Approved by: gnn (mentor) MFC after: 2 days	2012-04-14 23:53:31 +00:00
David Xu	8931e524bf	In sem_post, the field _has_waiters is no longer used, because some application destroys semaphore after sem_wait returns. Just enter kernel to wake up sleeping threads, only update _has_waiters if it is safe. While here, check if the value exceed SEM_VALUE_MAX and return EOVERFLOW if this is true.	2012-04-05 03:05:02 +00:00
David Xu	17ce606321	umtx operation UMTX_OP_MUTEX_WAKE has a side-effect that it accesses a mutex after a thread has unlocked it, it event writes data to the mutex memory to clear contention bit, there is a race that other threads can lock it and unlock it, then destroy it, so it should not write data to the mutex memory if there isn't any waiter. The new operation UMTX_OP_MUTEX_WAKE2 try to fix the problem. It requires thread library to clear the lock word entirely, then call the WAKE2 operation to check if there is any waiter in kernel, and try to wake up a thread, if necessary, the contention bit is set again by the operation. This also mitgates the chance that other threads find the contention bit and try to enter kernel to compete with each other to wake up sleeping thread, this is unnecessary. With this change, the mutex owner is no longer holding the mutex until it reaches a point where kernel umtx queue is locked, it releases the mutex as soon as possible. Performance is improved when the mutex is contensted heavily. On Intel i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds to 2.39 seconds, it even is better than UMTX_OP_MUTEX_WAKE which is deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c	2012-04-05 02:24:08 +00:00
David Xu	8b1eafa723	Remove stale comments.	2012-03-31 06:48:41 +00:00
David Xu	b29d7d9b60	Remove trailing semicolon, it is a typo.	2012-03-30 12:57:14 +00:00
David Xu	0cf573e989	Fix COMPAT_FREEBSD32 build. Submitted by: Andreas Tobler < andreast at fgznet dot ch >	2012-03-30 09:03:53 +00:00
David Xu	4ed8858df0	Remove trailing space.	2012-03-30 05:49:32 +00:00
David Xu	e05171d939	Merge umtxq_sleep and umtxq_nanosleep into a single function by using an abs_timeout structure which describes timeout info.	2012-03-30 05:40:26 +00:00
David Xu	d31f470d15	Reduce code size by creating common timed sleeping function.	2012-03-29 02:46:43 +00:00
Davide Italiano	c6111de55d	Add rudimentary profiling of the hash table used in the in the umtx code to hold active lock queues. Reviewed by: attilio Approved by: davidxu, gnn (mentor) MFC after: 3 weeks	2012-03-16 20:32:11 +00:00
David Xu	c9b01ed581	initialize clock ID and flags only when copying timespec, a _umtx_time copy already contains these fields.	2012-02-29 02:01:48 +00:00
David Xu	24c209494a	Follow changes made in revision 232144, pass absolute timeout to kernel, this eliminates a clock_gettime() syscall.	2012-02-27 13:38:52 +00:00
David Xu	df1f1bae9e	In revision 231989, we pass a 16-bit clock ID into kernel, however according to POSIX document, the clock ID may be dynamically allocated, it unlikely will be in 64K forever. To make it future compatible, we pack all timeout information into a new structure called _umtx_time, and use fourth argument as a size indication, a zero means it is old code using timespec as timeout value, but the new structure also includes flags and a clock ID, so the size argument is different than before, and it is non-zero. With this change, it is possible that a thread can sleep on any supported clock, though current kernel code does not have such a POSIX clock driver system.	2012-02-25 02:12:17 +00:00
David Xu	f911d9fa4d	Fix typo.	2012-02-22 07:34:23 +00:00
David Xu	b13a8fa78f	Use unused fourth argument of umtx_op to pass flags to kernel for operation UMTX_OP_WAIT. Upper 16bits is enough to hold a clock id, and lower 16bits is used to pass flags. The change saves a clock_gettime() syscall from libthr.	2012-02-22 03:22:49 +00:00

1 2 3

150 Commits