freebsd-nq

Author	SHA1	Message	Date
Kyle Evans	bd4bcd14e3	Fix !COMPAT_FREEBSD32 kernel build One of the last shifts inadvertently moved these static assertions out of a COMPAT_FREEBSD32 block, which the relevant definitions are limited to. Fix it. Pointy hat: kevans	2020-11-17 04:22:10 +00:00
Kyle Evans	63ecb272a0	umtx_op: reduce redundancy required for compat32 All of the compat32 variants are substantially the same, save for copyin/copyout (mostly). Apply the same kind of technique used with kevent here by having the syscall routines supply a umtx_copyops describing the operations needed. umtx_copyops carries the bare minimum needed- size of timespec and _umtx_time are used for determining if copyout is needed in the sem2_wait case. Reviewed by: kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D27222	2020-11-17 03:36:58 +00:00
Kyle Evans	4be0a1b587	_umtx_op: fix a compat32 bug in UMTX_OP_NWAKE_PRIVATE Specifically, if we're waking up some value n > BATCH_SIZE, then the copyin(9) is wrong on the second iteration due to upp being the wrong type. upp is currently a uint32_t*, so upp + pos advances it by twice as many elements as it should (host pointer size vs. compat32 pointer size). Fix it by just making upp a uint32_t; it's still technically a double pointer, but the distinction doesn't matter all that much here since we're just doing arithmetic on it. Add a test case that demonstrates the problem, placed with the libthr tests since one messing with _umtx_op should be running these tests. Running under compat32, the new test case will hang as threads after the first 128 get missed in the wake. it's not immediately clear how to hit it in practice, since pthread_cond_broadcast() uses a smaller (sleepq batch?) size observed to be around ~50 -- I did not spend much time digging into it. The uintptr_t change makes no functional difference, but i've tossed it in since it's more accurate (semantically). Reported by: Andrew Gierth (andrew_tao173.riddles.org.uk, inspection) Reviewed by: kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D27231	2020-11-17 03:34:01 +00:00
Kyle Evans	38033780a3	umtx: drop incorrect timespec32 definition This works for amd64, but none others -- drop it, because we already have a proper definition in sys/compat/freebsd32/freebsd32.h that correctly uses time32_t. MFC after: 1 week	2020-11-11 22:35:23 +00:00
Konstantin Belousov	d301b3580f	Support for userspace non-transparent superpages (largepages). Created with shm_open2(SHM_LARGEPAGE) and then configured with FIOSSHMLPGCNF ioctl, largepages posix shared memory objects guarantee that all userspace mappings of it are served by superpage non-managed mappings. Only amd64 for now, both 2M and 1G superpages can be requested, the later requires CPU feature. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D24652	2020-09-09 22:12:51 +00:00
Pawel Biernacki	b05ca4290c	sys/: Document few more sysctls. Submitted by: Antranig Vartanian <antranigv@freebsd.am> Reviewed by: kaktus Commented by: jhb Approved by: kib (mentor) Sponsored by: illuria security Differential Revision: https://reviews.freebsd.org/D23759	2020-03-02 15:30:52 +00:00
Pawel Biernacki	7029da5c36	Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718	2020-02-26 14:26:36 +00:00
Mateusz Guzik	3ff65f71cb	Remove duplicated empty lines from kern/*.c No functional changes.	2020-01-30 20:05:05 +00:00
Konstantin Belousov	478ca4b004	Rename umtxq_check_susp() to thread_check_susp() and make it usable outside of kern_umtx.c. To be used in several future changes. Discussed with: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week	2020-01-02 22:13:59 +00:00
Konstantin Belousov	8f4d74eb1e	Style: remove trailing spaces/tabs. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2020-01-02 22:07:03 +00:00
Konstantin Belousov	5cbdd18fd4	Make umtxq_check_susp() to correctly handle thread exit requests. The check for P_SINGLE_EXIT was shadowed by the (P_SHOULDSTOP \|\| traced) check. Reported by: bdrewery (might be) Reviewed by: markj Tested by: pho MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D21124	2019-08-01 14:34:27 +00:00
Konstantin Belousov	fd336e2ac0	Fix handling of transient casueword(9) failures in do_sem_wait(). In particular, restart should be only done when the failure is transient. For this, recheck the count1 value after the operation. Note that do_sem_wait() is older usem interface. Reported and tested by: bdrewery Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-07-31 19:16:49 +00:00
Konstantin Belousov	9f7b7da5bf	In do_sem2_wait(), balance umtx_key_get() with umtx_key_release() on retry. Reported by: ler Bisected and reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 12 days	2019-07-15 19:18:25 +00:00
Konstantin Belousov	ad038195bd	In do_lock_pi(), do not return prematurely. If umtxq_check_susp() indicates an exit, we should clean the resources before returning. Do it by breaking out of the loop and relying on post-loop cleanup. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 12 days Differential revision: https://reviews.freebsd.org/D20949	2019-07-15 08:39:52 +00:00
Konstantin Belousov	40bd868ba7	Correctly check for casueword(9) success in do_set_ceiling(). After r349951, the return code must be checked instead of old == new comparision. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 12 days Differential revision: https://reviews.freebsd.org/D20949	2019-07-15 08:38:01 +00:00
Konstantin Belousov	30b3018d48	Provide protection against starvation of the ll/sc loops when accessing userpace. Casueword(9) on ll/sc architectures must be prepared for userspace constantly modifying the same cache line as containing the CAS word, and not loop infinitely. Otherwise, rogue userspace livelocks the kernel. To fix the issue, change casueword(9) interface to return new value 1 indicating that either comparision or store failed, instead of relying on the oldval == *oldvalp comparison. The primitive no longer retries the operation if it failed spuriously. Modify callers of casueword(9), all in kern_umtx.c, to handle retries, and react to stops and requests to terminate between retries. On x86, despite cmpxchg should not return spurious failures, we can take advantage of the new interface and just return PSL.ZF. Reviewed by: andrew (arm64, previous version), markj Tested by: pho Reported by: https://xenbits.xen.org/xsa/advisory-295.txt Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D20772	2019-07-12 18:43:24 +00:00
Konstantin Belousov	7fde3c6b28	More style. Re-wrap long lines, reformat comments, remove excessive blank line. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2019-07-02 21:03:06 +00:00
Konstantin Belousov	4b8b28e130	Style. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2019-07-02 19:32:48 +00:00
Konstantin Belousov	2d7a555294	Style. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2019-06-28 20:40:54 +00:00
Mateusz Guzik	ac97da9ad8	Reduce umtx-related work on exec and exit - there is no need to take the process lock to iterate the thread list after single-threading is enforced - typically there are no mutexes to clean up (testable without taking the global umtx lock) - typically there is no need to adjust the priority (testable without taking thread lock) Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D20160	2019-05-08 16:30:38 +00:00
Mateusz Guzik	6017827676	umtx: avoid umtxshm locking on object termination if possible Sample build world result on tmpfs: kern.ipc.umtx_terminate_notempty: 0 kern.ipc.umtx_terminate_empty: 2891815 Sponsored by: The FreeBSD Foundation	2018-12-08 14:04:57 +00:00
Brooks Davis	b34f4419fb	Make freebsd32_umtx_op follow the freebsd32_foo convention. Sponsored by: DARPA, AFRL	2018-11-09 00:46:10 +00:00
Alan Somers	6040822c4e	Make timespecadd(3) and friends public The timespecadd(3) family of macros were imported from NetBSD back in r35029. However, they were initially guarded by #ifdef _KERNEL. In the meantime, we have grown at least 28 syscalls that use timespecs in some way, leading many programs both inside and outside of the base system to redefine those macros. It's better just to make the definitions public. Our kernel currently defines two-argument versions of timespecadd and timespecsub. NetBSD, OpenBSD, and FreeDesktop.org's libbsd, however, define three-argument versions. Solaris also defines a three-argument version, but only in its kernel. This revision changes our definition to match the common three-argument version. Bump _FreeBSD_version due to the breaking KPI change. Discussed with: cem, jilles, ian, bde Differential Revision: https://reviews.freebsd.org/D14725	2018-07-30 15:46:40 +00:00
Matt Macy	e1a92f058f	umtx: don't call umtxq_getchain unless the value is needed	2018-05-19 05:09:10 +00:00
Brooks Davis	6469bdcdb6	Move most of the contents of opt_compat.h to opt_global.h. opt_compat.h is mentioned in nearly 180 files. In-progress network driver compabibility improvements may add over 100 more so this is closer to "just about everywhere" than "only some files" per the guidance in sys/conf/options. Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of sys/compat/linux/*.c. A fake _COMPAT_LINUX option ensure opt_compat.h is created on all architectures. Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the set of compiled files. Reviewed by: kib, cem, jhb, jtl Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14941	2018-04-06 17:35:35 +00:00
Brooks Davis	91a743004c	Use umtx_copyin_umtx_time32() in __umtx_op_lock_umutex_compat32(). Non-NULL timeouts where copied in improperly and could produce failures due to incompatible data structures. Reviewed by: kib MFC after: 3 days Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14587	2018-03-06 01:52:04 +00:00
Pedro F. Giffuni	8a36da99de	sys/kern: adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts.	2017-11-27 15:20:12 +00:00
Konstantin Belousov	30c438723d	Convert explicit panic() call to assert. Based on github pull request: #113 Submitted by: pmarillo@github MFC after: 1 week	2017-11-04 10:49:34 +00:00
Eric van Gyzen	9dbdf2a169	When the RTC is adjusted, reevaluate absolute sleep times based on the RTC POSIX 2008 says this about clock_settime(2): If the value of the CLOCK_REALTIME clock is set via clock_settime(), the new value of the clock shall be used to determine the time of expiration for absolute time services based upon the CLOCK_REALTIME clock. This applies to the time at which armed absolute timers expire. If the absolute time requested at the invocation of such a time service is before the new value of the clock, the time service shall expire immediately as if the clock had reached the requested time normally. Setting the value of the CLOCK_REALTIME clock via clock_settime() shall have no effect on threads that are blocked waiting for a relative time service based upon this clock, including the nanosleep() function; nor on the expiration of relative timers based upon this clock. Consequently, these time services shall expire when the requested relative interval elapses, independently of the new or old value of the clock. When the real-time clock is adjusted, such as by clock_settime(3), wake any threads sleeping until an absolute real-clock time. Such a sleep is indicated by a non-zero td_rtcgen. The sleep functions will set that field to zero and return zero to tell the caller to reevaluate its sleep duration based on the new value of the clock. At present, this affects the following functions: pthread_cond_timedwait(3) pthread_mutex_timedlock(3) pthread_rwlock_timedrdlock(3) pthread_rwlock_timedwrlock(3) sem_timedwait(3) sem_clockwait_np(3) I'm working on adding clock_nanosleep(2), which will also be affected. Reported by: Sebastian Huber <sebastian.huber@embedded-brains.de> Reviewed by: jhb, kib MFC after: 2 weeks Relnotes: yes Sponsored by: Dell EMC Differential Revision: https://reviews.freebsd.org/D9791	2017-03-14 19:06:44 +00:00
Eric van Gyzen	b215ceaaec	Add sem_clockwait_np() This function allows the caller to specify the reference clock and choose between absolute and relative mode. In relative mode, the remaining time can be returned. The API is similar to clock_nanosleep(3). Thanks to Ed Schouten for that suggestion. While I'm here, reduce the sleep time in the semaphore "child" test to greatly reduce its runtime. Also add a reasonable timeout. Reviewed by: ed (userland) MFC after: 2 weeks Relnotes: yes Sponsored by: Dell EMC Differential Revision: https://reviews.freebsd.org/D9656	2017-02-23 19:36:38 +00:00
Adrian Chadd	0046bef85a	[mips] make UMTX_CHAINS configurable at compile time. The default (512) wastes quite a bit of space which doesn't really buy us much on highly embedded systems which don't take a lot of locks in parallel. This makes it at least build time configurable so people can experiment.	2016-11-15 01:34:38 +00:00
Konstantin Belousov	0f2d97838d	In both do_rw_wrlock() and do_rw_rdlock() after r304808, do not obliterate possible error from sleep with errors from umtxq_check_susp(), when looping to clear URWLOCK_{READ,WRITE}_WAITERS. Noted and reviewed by: vangyzen Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-08-25 19:15:02 +00:00
Konstantin Belousov	28e21133f3	Prevent leak of URWLOCK_READ_WAITERS flag for urwlocks. If there was some error, e.g. the sleep was interrupted, as in the referenced PR, do_rw_rdlock() did not cleared URWLOCK_READ_WAITERS. Since unlock only wakes up write waiters when there is no read waiters, for URWLOCK_PREFER_READER kind of locks, the result was missed wakeups for writers. In particular, the most visible victims are ld-elf.so locks in processes which loaded libthr, because rtld locks are urwlocks in prefer-reader mode. Normal rwlocks fall into prefer-reader mode only if thread already owns rw lock in read mode, which is not typical and correspondingly less visible. In the PR, unowned rtld bind lock was waited for in the process where only one thread was left alive. Note that do_rw_wrlock() correctly clears URWLOCK_WRITE_WAITERS in case of errors. Reported and tested by: longwitz@incore.de PR: 211947 Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-08-25 16:35:42 +00:00
Eric Badger	b0f2185bbe	sem_post(): wake up the sleeper only after adjusting has_waiters If the caller of sem_post() wakes up a thread sleeping via sem_wait() before it clears the has_waiters flag, the caller of sem_wait() has no way of knowing when it is safe to destroy the semaphore and reuse the memory. This is because the caller of sem_post() may be interrupted between the wake step and the clearing of has_waiters. It will then write into the has_waiters flag in userspace after being preempted for some unknown amount of time. Reviewed by: jhb, kib, vangyzen Approved by: kib (mentor), vangyzen (mentor) MFC after: 2 weeks Sponsored by: Dell Inc. Differential Revision: https://reviews.freebsd.org/D7505	2016-08-15 20:09:09 +00:00
Konstantin Belousov	2a339d9e3d	Add implementation of robust mutexes, hopefully close enough to the intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013. A robust mutex is guaranteed to be cleared by the system upon either thread or process owner termination while the mutex is held. The next mutex locker is then notified about inconsistent mutex state and can execute (or abandon) corrective actions. The patch mostly consists of small changes here and there, adding neccessary checks for the inconsistent and abandoned conditions into existing paths. Additionally, the thread exit handler was extended to iterate over the userspace-maintained list of owned robust mutexes, unlocking and marking as terminated each of them. The list of owned robust mutexes cannot be maintained atomically synchronous with the mutex lock state (it is possible in kernel, but is too expensive). Instead, for the duration of lock or unlock operation, the current mutex is remembered in a special slot that is also checked by the kernel at thread termination. Kernel must be aware about the per-thread location of the heads of robust mutex lists and the current active mutex slot. When a thread touches a robust mutex for the first time, a new umtx op syscall is issued which informs about location of lists heads. The umtx sleep queues for PP and PI mutexes are split between non-robust and robust. Somewhat unrelated changes in the patch: 1. Style. 2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared pi mutexes. 3. Removal of the userspace struct pthread_mutex m_owner field. 4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls the lifetime of the shared mutex associated with a vnode' page. Reviewed by: jilles (previous version, supposedly the objection was fixed) Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects) Tested by: pho Sponsored by: The FreeBSD Foundation	2016-05-17 09:56:22 +00:00
Konstantin Belousov	2cfddaa6ff	Fix umtx lock/trylock for compat32. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2016-04-19 11:37:43 +00:00
Konstantin Belousov	1bdbd70599	Implement process-shared locks support for libthr.so.3, without breaking the ABI. Special value is stored in the lock pointer to indicate shared lock, and offline page in the shared memory is allocated to store the actual lock. Reviewed by: vangyzen (previous version) Discussed with: deischen, emaste, jhb, rwatson, Martin Simmons <martin@lispworks.com> Tested by: pho Sponsored by: The FreeBSD Foundation	2016-02-28 17:52:33 +00:00
Konstantin Belousov	50657fd342	Minor (and incomplete) style cleanup. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-10-30 20:47:42 +00:00
Konstantin Belousov	2936e0013c	Also mark compat32 umtx op table as constant. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-10-30 19:32:30 +00:00
Konstantin Belousov	c539e87014	Use C99 array initialization, which also makes the code self-documented, and eases addition of new ops. For the similar reasons, eliminate UMTX_OP_MAX. nitems() handles the only use of the symbol. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-10-30 19:20:40 +00:00
Ed Schouten	dc4b532479	Fix bad arithmetic in umtx_key_get() to compute object offset. It looks like umtx_key_get() has the addition and subtraction the wrong way around, meaning that it fails to match in certain cases. This causes the cloudlibc unit tests to deadlock in certain cases. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D3287	2015-08-04 06:01:13 +00:00
Ed Schouten	52942c1eae	Add missing const keyword to function parameter. The umtx_key_get() function does not dereference the address off the userspace object. The pointer can safely be const.	2015-08-03 21:11:33 +00:00
Eric van Gyzen	e858b027db	Clean up some cosmetic nits in kern_umtx.c, found during recent work in this area and by the Clang static analyzer. Remove some dead assignments. Fix a typo in a panic string. Use umtx_pi_disown() instead of duplicate code. Use an existing variable instead of curthread. Approved by: kib (mentor) MFC after: 3 days Sponsored by: Dell Inc	2015-03-28 21:21:40 +00:00
Konstantin Belousov	13dad10871	The umtx_lock mutex is used by top-half of the kernel, but is currently a spin lock. Apparently, the only reason for this is that umtx_thread_exit() is called under the process spinlock, which put the requirement on the umtx_lock. Note that the witness static order list is wrong for the umtx_lock, umtx_lock is explicitely before any thread lock, so it is also before sleepq locks. Change umtx_lock to be the sleepable mutex. For the reason above, the calls to umtx_thread_exit() are moved from thread_exit() earlier in each caller, when the process spin lock is not yet taken. Discussed with: jhb Tested by: pho (previous version) Sponsored by: The FreeBSD Foundation MFC after: 3 weeks	2015-02-28 04:19:02 +00:00
Konstantin Belousov	84b736b268	When failing to claim ownership of a umtx_pi, restore the umutex owner to its previous, unowned state. This avoids compounding an existing problem of inconsistent ownership. Submitted by: Eric van Gyzen <eric_van_gyzen@dell.com> Obtained from: Dell Inc. PR: 198914 MFC after: 1 week	2015-02-25 16:17:16 +00:00
Konstantin Belousov	cc876d2c5c	When unlocking a contested PI pthread mutex, if the queue of waiters is empty, look up the umtx_pi and disown it if the current thread owns it. This can happen if a signal or timeout removed the last waiter from the queue, but there is still a thread in do_lock_pi() holding a reference on the umtx_pi. The unlocking thread might not own the umtx_pi in this case, but if it does, it must disown it to keep the ownership consistent between the umtx_pi and the umutex. Submitted by: Eric van Gyzen <eric_van_gyzen@dell.com> with advice from: Elliott Rabe and Jim Muchow, also at Dell Inc. Obtained from: Dell Inc. PR: 198914	2015-02-25 16:12:56 +00:00
Konstantin Belousov	6690381ef1	The dependency chain for priority-inheritance mutexes could be subverted by userspace into cycle. Both umtx_propagate_priority() and umtx_repropagate_priority() would then loop infinitely, owning the spinlock. Check for the cycle using standard Floyd' algorithm before doing the pass in the affected functions. Add simple check for condition of tricking the thread into a wait for itself, which could be easily simulated by usermode without race. Found by: Eric van Gyzen <eric@vangyzen.net> In collaboration with: Eric van Gyzen <eric@vangyzen.net> Tested by: pho MFC after: 1 week	2015-01-31 12:27:40 +00:00
Konstantin Belousov	416be7a1c6	Fix assertion, &uc->uc_busy is never zero, the intent is to test the uc_busy value, and not its address [1]. Remove the single use of the macro, write KASSERT() explicitely in the code of umtxq_sleep_pi(). Submitted by: Eric van Gyzen <eric@vangyzen.net> [1] MFC after: 1 week	2014-11-13 18:51:09 +00:00
Konstantin Belousov	2361c6d135	Add type qualifier volatile to the base (userspace) address argument of fuword(9) and suword(9). This makes the functions type-compatible with volatile objects and does not require devolatile force, e.g. in kern_umtx.c. Requested by: bde Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 3 weeks	2014-10-31 17:43:21 +00:00
Konstantin Belousov	f7e91c288a	Convert kern_umtx.c to use fueword() and casueword(). Also fix some mishandling of suword(9) errors as errno, which resulted in spurious ERESTART. Sponsored by: The FreeBSD Foundation Tested by: pho MFC after: 3 weeks	2014-10-28 15:30:33 +00:00

1 2 3 4

178 Commits