15888 Commits

Author SHA1 Message Date
Brooks Davis
91a743004c Use umtx_copyin_umtx_time32() in __umtx_op_lock_umutex_compat32().
Non-NULL timeouts were copied in improperly and could produce failures
due to incompatible data structures.

Reviewed by:	kib
MFC after:	3 days
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14587
2018-03-06 01:52:04 +00:00
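
A minimal sketch of why the dedicated 32-bit copyin matters: the compat
layout uses 32-bit time fields, so the kernel has to widen them field by
field instead of copying raw bytes into the native struct (the types and
names below are illustrative, not the actual umtx code):

	struct timespec32 {
		int32_t tv_sec;
		int32_t tv_nsec;
	};
	struct timespec32 ts32;
	struct timespec ts;
	int error;

	/* uaddr: the user-space pointer argument */
	error = copyin(uaddr, &ts32, sizeof(ts32));
	if (error != 0)
		return (error);
	ts.tv_sec = ts32.tv_sec;	/* widen 32-bit seconds to time_t */
	ts.tv_nsec = ts32.tv_nsec;
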
Brooks Davis
aec37bad99 Regen after r330517. 2018-03-05 17:02:50 +00:00
Brooks Davis
1c1b4c66b6 Remove remnants of 1990s efforts to let us run Net/OpenBSD binaries.
No functional change (comments change in some generated files.)

Reviewed by:	kib
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14571
2018-03-05 17:02:16 +00:00
Mateusz Guzik
0ad122a966 lockmgr: save on sleepq when cmpset fails 2018-03-05 00:30:07 +00:00
Mateusz Guzik
93d41967da lockmgr: whack unused lockmgr_note_exclusive_upgrade 2018-03-04 22:14:20 +00:00
Mateusz Guzik
9f4e008d4a mtx: tidy up recursion handling in thread lock
Normally, after grabbing the lock, it has to be verified that we got the
right one to begin with. However, if we are recursing, the lock cannot have
changed, so the check can be avoided. In particular, this avoids a re-read
of the lock in the non-recursing case when the lock turns out to have changed.

While here, avoid an irq trip if this happens.

Tested by:	pho (previous version)
2018-03-04 22:01:23 +00:00
Mateusz Guzik
a8e747c5e7 sx: don't do an atomic op in upgrade if it cannot succeed
The code already pays the cost of reading the lock to obtain the waiters
flag. Checking whether there is more than one reader is not a problem and
avoids dirtying the line.

This also fixes a small corner case: if waiters were to show up between
reading the flag and upgrading the lock, the operation would fail even
though it should not. No correctness change here though.
2018-03-04 21:41:05 +00:00
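
A generic sketch of the pattern: the lock word is read anyway for the
waiters flag, so only attempt the upgrade cmpset when a single reader
holds the lock (all names below are illustrative, not the sx internals):

	uintptr_t v;
	int success;

	v = lk->lk_word;		/* read already paid for */
	if (READERS(v) == 1)		/* sole reader: upgrade can succeed */
		success = atomic_cmpset_acq_ptr(&lk->lk_word, v, v | WRITER);
	else
		success = 0;		/* skip the futile cmpset and keep
					   the cache line clean */
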
Mateusz Guzik
d94df98c5c locks: fix a corner case in r327399
If there were exactly rowner_retries/asx_retries (by default: 10) transitions
between read and write state and the waiters still did not get the lock, the
next owner -> reader transition would result in the code correctly falling
back to turnstile/sleepq where it would incorrectly think it was waiting
for a writer and decide to leave turnstile/sleepq to loop back. From this
point it would take ts/sq trips until the lock gets released.

The bug sometimes manifested itself in stalls during -j 128 package builds.

Refactor the code to fix the bug; while here, remove some of the gratuitous
differences between rw and sx locks.
2018-03-04 21:38:30 +00:00
Mateusz Guzik
1c6987ebc5 lockmgr: start decomposing the main routine
The main routine takes 8 args, 3 of which are almost the same for most uses.
This in particular pushes it above the limit of 6 arguments passable through
registers on amd64 making it impossible to tail call.

This is a prerequisite for further cleanups.

Tested by:	pho
2018-03-04 19:12:54 +00:00
Hans Petter Selasky
2077229b56 Allow pause_sbt() to catch signals during sleep by passing C_CATCH flag.
Define the pause_sig() helper function macro similarly to other kernel
functions that catch signals. Update the outdated function description.

Discussed with:	kib@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-03 18:36:38 +00:00
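
A hedged usage sketch: pause_sig() behaves like pause(9) but wakes up
early when a signal arrives (the wmesg and timeout are illustrative):

	int error;

	error = pause_sig("zzz", hz);	/* sleep for up to one second */
	if (error == EINTR || error == ERESTART)
		;	/* interrupted by a signal */
	else if (error == EWOULDBLOCK)
		;	/* the full timeout elapsed */
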
Hans Petter Selasky
54fc03834a Correct the return code from pause() during cold startup from zero to
EWOULDBLOCK. This also matches the description in pause(9).

Discussed with:	kib@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-03 18:12:21 +00:00
Brooks Davis
93e48a303a Rename kernel-only members of semid_ds and msgid_ds.
This deliberately breaks the API in preparation for future syscall
revisions which will remove these nonstandard members.

In an exp-run a single port (devel/qemu-user-static) was found to
use them, which it did because it emulates system calls.  This has
been fixed in the ports tree.

PR:		224443 (exp-run)
Reviewed by:	kib, jhb (previous version)
Exp-run by:	antoine
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14490
2018-03-02 22:10:48 +00:00
Mateusz Guzik
c505b59961 sx: fix adaptive spinning broken in r327397
The condition was flipped.

In particular heavy multithreaded kernel builds on zfs started suffering
due to nested sx locks.

For instance make -s -j 128 buildkernel:

before: 3326.67s user 1269.62s system 6981% cpu 1:05.84 total
after: 3365.55s user 911.27s system 6871% cpu 1:02.24 total

ps.
      .-'---`-.			      .-'---`-.
    ,'          `.		    ,'          `.
    |             \		    |             \
    |              \		    |              \
    \           _  \		    \           _  \
    ,\  _    ,'-,/-)\		    ,\  _    ,'-,/-)\
    ( * \ \,' ,' ,'-)		    ( * \ \,' ,' ,'-)
     `._,)     -',-')		     `._,)     -',-')
       \/         ''/		       \/         ''/
        )        / /		        )        / /
       /       ,'-'		       /       ,'-'
2018-03-02 21:26:27 +00:00
Mateusz Guzik
9d4e369ae8 Don't generate data in sysctl_out_proc unless we intend to copy out.
The first call is used to gauge how much space is needed. Just computing
the size instead of generating the output allows us to avoid taking the
proctree lock.
2018-02-25 15:16:58 +00:00
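
A sketch of the two-pass sysctl idiom this builds on: when req->oldptr is
NULL the caller only wants a size estimate, so the handler can account for
the length without generating the data or taking locks (struct item,
item_count(), and fill_item() are hypothetical):

	static int
	sysctl_example(SYSCTL_HANDLER_ARGS)
	{
		struct item it;

		if (req->oldptr == NULL) {
			/* Sizing pass: report the space needed without
			   generating output (or taking the big lock). */
			return (SYSCTL_OUT(req, NULL,
			    item_count() * sizeof(it)));
		}
		/* Output pass: generate the data and copy it out. */
		fill_item(&it);
		return (SYSCTL_OUT(req, &it, sizeof(it)));
	}
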
Jeff Roberson
1c2529ab32 Fix issues with sparse cpu allocation. Consistently use mp_maxid + 1.
Reported by:	pho
Reviewed by:	markj
Sponsored by:	Netflix, Dell/EMC Isilon
2018-02-25 00:35:21 +00:00
Conrad Meyer
63901c0171 kern/sys_generic.c: style(9) return(foo) -> return (foo)
No functional change.

Sponsored by:	Dell EMC Isilon
2018-02-24 01:15:33 +00:00
Jeff Roberson
5f8cd1c0bf Add a generic Proportional Integral Derivative (PID) controller algorithm and
use it to regulate page daemon output.

This provides much smoother and more responsive page daemon output, anticipating
demand and avoiding pageout stalls by increasing the number of pages to match
the workload.  This is a reimplementation of work done by myself and mlaier at
Isilon.

Reviewed by:	bsdimp
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D14402
2018-02-23 22:51:51 +00:00
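
A minimal, self-contained sketch of a PID step as it might regulate
pageout targets; this is not the committed kernel interface, and the
struct layout, divisor-style gains, and names are assumptions:

	struct pidctl {
		int pc_setpoint;	/* e.g. the free page target */
		int pc_integral;	/* accumulated error */
		int pc_olderr;		/* previous error, for the derivative */
		int pc_pd, pc_id, pc_dd;	/* gains as integer divisors */
	};

	static int
	pidctl_step(struct pidctl *pc, int input)
	{
		int error, deriv, output;

		error = pc->pc_setpoint - input;
		pc->pc_integral += error;
		deriv = error - pc->pc_olderr;
		pc->pc_olderr = error;
		/* P + I + D terms, fixed point via integer divisors. */
		output = error / pc->pc_pd + pc->pc_integral / pc->pc_id +
		    deriv / pc->pc_dd;
		return (output > 0 ? output : 0);	/* pages to scan */
	}
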
Kirk McKusick
16680b6af5 Include the error number in the "fsync: giving up on dirty" message
(in case it ever starts happening again in spite of r328444).

Submitted by: Andreas Longwitz <longwitz at incore.de>
2018-02-23 21:57:10 +00:00
Konstantin Belousov
4c8a8cfcde Restore UP build.
Reviewed by:	truckman
Sponsored by:	The FreeBSD Foundation
2018-02-23 18:26:31 +00:00
Ed Maste
315fbaeca2 Correct misspellings of "pseudo" in sys/ comments
contrib code and a #define in intel_ata.h are unchanged.
2018-02-23 18:15:50 +00:00
Don Lewis
97e9382d56 Decrease latency by not wrapping the idle loop's potentially lengthy
search for a thread to steal inside a critical section.  Since this
allows the search to be preempted, restart the search if preemption
happens since the search results found earlier may no longer be
valid.

Decrease the latency of starting a thread that may be assigned to
this CPU during the search by polling for incoming threads during
the search and switching to that thread instead of continuing the
search.

Test for stale search results and restart the search before going
through the expense of calling tdq_lock_pair().  Retry some tests
after grabbing the locks since things may have changed while waiting
to get both locks.

Eliminate special case handling for stealing from an SMT peer that
uses 1 as the steal threshold.  This can only succeed if a thread
has been assigned but our SMT peer has not yet started executing
it.  This is quite rare and when it happens the other SMT thread
is generally waiting for the same tdq lock that we hold.  Basically
both SMT threads are racing to grab the same spin lock.

Add the kern.sched.always_steal knob from a ULE patch by jeff@.

Incorporate another idea from Jeff's ULE patch.  If sched_switch()
detects that the CPU is about to go idle, try to steal a thread
before switching to the idle thread.  Since the search for a thread
to steal has to be done inside a critical section in this context,
limit the impact on latency by adding the knob kern.sched.trysteal_limit
to limit the topological distance of the search and don't restart
the search if we detect stale results.  If this search can't find
a stealable thread, the idle loop can do a more complete search.
Also poll for threads being assigned to this CPU during the search
and switch to them instead of continuing the search.  This change
is responsible for the majority of the improvement in parallel
buildworld times.

In sched_balance_group(), change the minimum threshold for stealing
a thread from 1 to 2.  Poaching a newly assigned thread from a CPU
that is waking up but hasn't yet switched to that thread from idle
is likely very rare and is likely to have the same lock race as is
seen when stealing threads in the idle loop.  Also use tdq_notify()
to kick the destination CPU instead of always sending an IPI.
Update a stale comment; the number of transferable threads is not
calculated.

Reviewed by:	kib (earlier version)
Comments by:	avg, jeff, mav
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D12130
2018-02-23 00:12:51 +00:00
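
A rough sketch of the restart-on-stale-results loop described above;
every helper name here is hypothetical, standing in for the tdq
bookkeeping the real code uses:

	struct tdq *victim;
	u_int gen;

	for (;;) {
		gen = my_switch_count();	/* staleness token */
		victim = search_for_loaded_cpu(steal_limit); /* preemptible */
		if (my_tdq_load() > 0)
			break;		/* a thread arrived for us; run it */
		if (victim == NULL)
			break;		/* nothing stealable; go idle */
		if (gen != my_switch_count())
			continue;	/* preempted: results may be stale */
		if (try_steal_from(victim))
			break;		/* got one */
	}
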
Mateusz Guzik
a0c722bdbf Fix up sysctl vfs.buffercache broken in r329612
Sample problem:
top: sysctl(vfs.bufspace...) expected 8, got 4

Reported by:	O. Hartmann <ohartmann walstatt.org>
2018-02-22 20:39:25 +00:00
Eric van Gyzen
0127914caa sched_ule: update a comment to reflect reality
MFC after:	3 days
Sponsored by:	Dell EMC
2018-02-22 17:09:26 +00:00
Jeff Roberson
683ca3a432 Fix the broken subqueue assignment for the cleanq.
Reported by:	pho
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
2018-02-20 21:27:17 +00:00
Mateusz Guzik
500ca73d43 mtx: add debug assertions to mtx_spin_wait_unlocked 2018-02-20 20:39:34 +00:00
Mateusz Guzik
862db53fb5 Fix reaping on process fd close broken after r329449
The only consumer of proc_reap() other than proc_to_reap() was not
updated to stop taking PROC_SLOCK.

Reported by:    Juan Ramon Molina Menor <listjm club.fr>
2018-02-20 20:19:38 +00:00
Brooks Davis
b81e88d296 Reduce duplication in dynamic syscall registration code.
Remove the unused syscall_(de)register() functions in favor of the
better documented and easier to use syscall_helper_(un)register(9)
functions.

The default and freebsd32 versions differed in which array of struct
sysents they used, and the 32-bit code was missing a few of the updates
made to the main code as features were added.

Reviewed by:	cem
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14337
2018-02-20 18:08:57 +00:00
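
A hedged example of the surviving interface, syscall_helper_register(9),
for a hypothetical syscall named mycall (module glue elided):

	static struct syscall_helper_data mycall_syscalls[] = {
		SYSCALL_INIT_HELPER(mycall),	/* hypothetical syscall */
		SYSCALL_INIT_LAST
	};

	static int
	mycall_load(void)
	{
		return (syscall_helper_register(mycall_syscalls, 0));
	}

	static void
	mycall_unload(void)
	{
		syscall_helper_unregister(mycall_syscalls);
	}
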
Mateusz Guzik
681a1b752c Make killpg1 perform process validity checks without proc lock held. 2018-02-20 10:52:07 +00:00
Mateusz Guzik
81d68271d7 Reduce contention on the proctree lock during heavy package build.
There is a proctree -> allproc ordering established.

Most of the time it is either xlock -> xlock or slock -> slock.

On fork, however, there is a slock -> xlock pair which results in
pathological wait times due to threads keeping proctree held for
reading while all waiting on allproc. Switch this to xlock -> xlock.
A longer-term fix would get rid of proctree in this place to begin with.
Right now it is necessary to walk the session/process group lists to
determine which id is free. The walk can be avoided e.g. with bitmaps.

The exit path used to have one place which dealt with allproc and
then with proctree. Move the allproc acquire into the section protected
by proctree. This reduces contention against threads waiting on proctree
in the fork codepath - the fork proctree holder does not have to wait
for allproc as often.

Finally, move tidhash manipulation outside of the area protected by
either of these locks. The removal from the hash was already unprotected.
There is no legitimate reason to look up thread ids for a process still
under construction.

This results in about 50% wait time reduction during -j 128 package build.
2018-02-20 02:18:30 +00:00
Jeff Roberson
06220fa737 Further parallelize the buffer cache.
Provide multiple clean queues partitioned into 'domains'.  Each domain manages
its own bufspace and has its own bufspace daemon.  Each domain has a set of
subqueues indexed by the current cpuid to reduce lock contention on the cleanq.

Refine the sleep/wakeup around the bufspace daemon to use atomics as much as
possible.

Add a B_REUSE flag that is used to requeue bufs during the scan to approximate
LRU rather than locking the queue on every use of a frequently accessed buf.

Implement bufspace_reserve with only atomic_fetchadd to avoid loop restarts.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D14274
2018-02-20 00:06:07 +00:00
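
A sketch of the loop-free reservation idea using only atomic_fetchadd
(struct, field, and limit names illustrative, not the committed
bufspace_reserve()):

	struct bufdomain_sketch {
		u_long	bd_space;	/* bytes currently reserved */
		u_long	bd_maxspace;	/* hard cap */
	};

	static int
	bufspace_reserve_sketch(struct bufdomain_sketch *bd, u_long size)
	{
		u_long prev;

		/* Optimistically reserve; back out on overshoot. */
		prev = atomic_fetchadd_long(&bd->bd_space, size);
		if (prev + size > bd->bd_maxspace) {
			atomic_subtract_long(&bd->bd_space, size);
			return (ENOSPC);
		}
		return (0);
	}
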
Mateusz Guzik
2ca66c1ef5 Fix process exit vs reap race introduced in r329449
The race manifested itself mostly in terms of crashes with "spin lock
held too long".

Relevant parts of respective code paths:

exit:				reap:
PROC_LOCK(p);
PROC_SLOCK(p);
p->p_state == PRS_ZOMBIE
PROC_UNLOCK(p);
				PROC_LOCK(p);
/* exit work */
				if (p->p_state == PRS_ZOMBIE) /* true */
					proc_reap()
					free proc
/* more exit work */
PROC_SUNLOCK(p);

Thus a still exiting process is reaped.

Prior to the change, the zombie check was followed by a slock/sunlock
trip which prevented the problem.

Even the code prior to this commit has a bug: the proc is still accessed
for statistics collection purposes. However, the severity is rather small
and the bug may be fixed in a future commit.

Reported by:	many
Tested by:	allanjude
2018-02-19 00:54:08 +00:00
Mateusz Guzik
d257698833 mtx: add mtx_spin_wait_unlocked
The primitive can be used to wait for the lock to be released. The
intended usage is for locks in structures which are about to be freed.

The benefit is the avoided interrupt enable/disable trip plus the atomic
op needed to grab the lock, and a shorter wait if the lock is held (since
there is no worry that someone will contend on the lock, re-reads can be
more aggressive).

Briefly discussed with:	 kib
2018-02-19 00:38:14 +00:00
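
Illustrative semantics of the primitive, a sketch rather than the
committed body: spin until the lock is observed unowned, never acquiring
it and never touching the interrupt state:

	while (atomic_load_acq_ptr(&m->mtx_lock) != MTX_UNOWNED)
		cpu_spinwait();		/* no atomic op, no irq trip */
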
Mateusz Guzik
7beb60820f exit: get rid of PROC_SLOCK when checking a process to report, take #2
The suspension counter needs synchronisation through slock, but we don't
need it to check if inspecting the counter is necessary to begin with.
In the common case it is not, thus avoid the lock if possible.

Reviewed by:	kib
Tested by:	pho
2018-02-18 21:07:15 +00:00
Mariusz Zaborski
965cd21173 Fix broken assertion in r329520.
Reported by:	pho@ lwhsu@
2018-02-18 20:04:39 +00:00
Brooks Davis
7a095112b2 Correct/improve the descriptions of kern.ipc.(shmsegs,sema,msqids).
The description of kern.ipc.shmsegs had been wrong since 2005.  I updated the
others (which were more correct) to match.

PR:		225933
Reviewed by:	cem
MFC after:	3 days
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14391
2018-02-18 19:19:36 +00:00
Mariusz Zaborski
20641651ec Use the fdeget_locked() function instead of fget_locked() in
sys_capability.c.

Reviewed by:	pjd@ (earlier version)
Discussed with:	mjg@
2018-02-18 15:27:24 +00:00
Mateusz Guzik
8bf6ff2226 Revert r329448.
Turns out it is actually racy, reproducible with stress2/misc/truss.sh.

Requested by:	kib
2018-02-17 17:23:43 +00:00
Mateusz Guzik
e4ccf57fdc Undo LOCK_PROFILING pessimisation after r313454 and r313455
With the option used to compile the kernel, both sx and rw shared ops would
always go to the slow path, which added avoidable overhead even when the
facility is disabled.

Furthermore the increased time spent doing uncontested shared lock acquire
would be bogusly added to total wait time, somewhat skewing the results.

Restore old behaviour of going there only when profiling is enabled.

This change is a no-op for kernels without LOCK_PROFILING (which is the
default).
2018-02-17 12:07:09 +00:00
Mateusz Guzik
ad58e5e86c exit: stop doing PROC_SLOCK just to call proc_reap
It immediately does PROC_SUNLOCK anyway and the lock plays no role.
2018-02-17 09:03:11 +00:00
Mateusz Guzik
9c0e785c58 exit: get rid of PROC_SLOCK when checking a process to report
All accessed fields are protected with already held process lock.
2018-02-17 08:48:45 +00:00
Mateusz Guzik
015cd8dc93 On process exit signal the parent after dropping the proctree lock. 2018-02-17 00:24:50 +00:00
Mateusz Guzik
7e588b9219 Unref the prison after proctree is dropped. 2018-02-17 00:23:56 +00:00
Mateusz Guzik
65f29b9caa Postpone sx_sunlock(&proctree_lock) on fork until after allproc is dropped.
There is a significant contention on the lock during -j 128 package build.
This change drops total wait time on this lock by 60%.
2018-02-17 00:23:28 +00:00
Mateusz Guzik
6776bfeb8f Tidy up kern_wait6
- don't relock curproc in msleep
- don't relock proctree if P_STATCHILD is spotted
- reformat the proc_to_reap call in the main loop
2018-02-17 00:21:50 +00:00
Brooks Davis
aff4f2d315 Reduce duplication in __acl_*_(file|link).
Add const to new kern_ functions and push down as required.

Reviewed by:	rwatson
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14174
2018-02-15 21:24:43 +00:00
Mark Johnston
05f0f0e9ea Fix the test for SET_FOREACH termination.
Unlike the queue(3) _FOREACH macros, the iterator for a SET_FOREACH is
not NULL after the end of the set is reached.
2018-02-15 17:35:40 +00:00
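
A sketch of the corrected idiom: since SET_FOREACH() leaves its iterator
non-NULL at the end, track the result explicitly (set and struct names
illustrative):

	SET_DECLARE(handler_set, struct handler);
	struct handler **hpp, *found;

	found = NULL;
	SET_FOREACH(hpp, handler_set) {
		if ((*hpp)->h_id == want) {
			found = *hpp;
			break;
		}
	}
	if (found == NULL)	/* don't test hpp: it is never NULL here */
		return (ENOENT);
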
Mateusz Guzik
f795032b47 rwlock: diff-reduction of runlock compared to sx sunlock 2018-02-14 20:37:33 +00:00
Bryan Drewery
70c144dc78 nanosleep(2): Fix bogus incrementing of rmtp by tc_tick_sbt on [EINTR].
sbt is the time in the future at which the tsleep_sbt() is expected to
complete.  sbtt is the current time.  Depending on the precision allowed by
the sysctl kern.timecounter.alloweddeviation, the start time may be
incremented by tc_tick_sbt.  The same increment is needed for the current
time sbtt before calculating the difference.  The impact of missing this
increment is that rmtp may increase by one tc_tick_sbt on every early
[EINTR] return.  If the same struct is passed in for rqtp as rmtp, this can
result in rqtp effectively incrementing by tc_tick_sbt and sleeping longer
than originally intended.

This problem was introduced in r247797.

Reviewed by:	kib, markj, vangyzen (all on an older version of the test)
MFC after:	2 weeks
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D14362
2018-02-14 18:43:50 +00:00
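
The arithmetic of the fix as a sketch (names follow the description
above; this is illustrative, not the committed diff): if the deadline
sbt was bumped by tc_tick_sbt when the sleep started, apply the same
bias to the current time before computing the remainder:

	if (rounded_up)			/* sbt got +tc_tick_sbt at start */
		sbtt += tc_tick_sbt;	/* bias "now" identically */
	if (sbtt >= sbt)
		rmt = 0;		/* slept long enough; no remainder */
	else
		rmt = sbt - sbtt;	/* remainder reported via rmtp */
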
Mark Johnston
6026dcd7ca Add support for zstd-compressed user and kernel core dumps.
This works similarly to the existing gzip compression support, but
zstd is typically faster and gives better compression ratios.

Support for this functionality must be configured by adding ZSTDIO to
one's kernel configuration file. dumpon(8)'s new -Z option is used to
configure zstd compression for kernel dumps. savecore(8) now recognizes
and saves zstd-compressed kernel dumps with a .zst extension.

Submitted by:	cem (original version)
Relnotes:	yes
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13101,
			https://reviews.freebsd.org/D13633
2018-02-13 19:28:02 +00:00
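
A hedged configuration example using the knobs named above (the dump
device path is illustrative):

	# kernel configuration: build in zstd compression support
	options 	ZSTDIO

	# enable zstd-compressed kernel crash dumps
	dumpon -Z /dev/ada0p3
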
Ian Lepore
157f3d7649 Fix bad indentation. Whitespace only, no functional changes.
Reported by:	bde@
2018-02-13 17:38:08 +00:00