Non-NULL timeouts were copied in improperly and could produce failures
due to incompatible data structures.
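A hedged illustration of the kind of conversion involved (the struct layout
and helper name below are hypothetical, not the committed code; the real
kernel path uses its own copyin/compat machinery):

    #include <stdint.h>
    #include <time.h>

    /* Hypothetical 32-bit userland timeout layout. */
    struct timespec32 {
        int32_t tv_sec;
        int32_t tv_nsec;
    };

    /* Convert field by field instead of copying the incompatible
     * layout in directly. */
    static void
    timespec32_to_timespec(const struct timespec32 *ts32, struct timespec *ts)
    {
        ts->tv_sec = ts32->tv_sec;
        ts->tv_nsec = ts32->tv_nsec;
    }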
Reviewed by: kib
MFC after: 3 days
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14587
Normally, after grabbing the lock, it has to be verified that we got the
right one to begin with. However, if we are recursing, the lock cannot have
changed, so the check can be avoided. In particular this avoids an extra
lock read in the non-recursing case when the check finds the lock has changed.
While here, avoid an irq trip if this happens.
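A minimal sketch of the pattern (hypothetical names and userspace stand-ins,
not the kernel code): the re-check after acquisition is only needed when the
lock pointer may have been switched, which cannot happen while recursing.

    struct lockobj { int owned; };
    struct obj { struct lockobj *lock; };

    static void acquire(struct lockobj *l) { l->owned = 1; }  /* stand-in */
    static void release(struct lockobj *l) { l->owned = 0; }  /* stand-in */

    static void
    lock_obj(struct obj *o, int recursing)
    {
        for (;;) {
            struct lockobj *l = o->lock;

            acquire(l);
            /* When recursing, skip the re-read entirely. */
            if (recursing || o->lock == l)
                return;
            release(l);   /* the lock was switched underneath us; retry */
        }
    }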
Tested by: pho (previous version)
The code already pays the cost of reading the lock to obtain the waiters
flag. Checking whether there is more than one reader is not a problem and
avoids dirtying the line.
This also fixes a small corner case: if waiters were to show up between
reading the flag and upgrading the lock, the operation would fail even
though it should not. No correctness change here though.
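A hedged sketch of the single-reader check with C11 atomics (the lock-word
encoding and names here are made up, not the kernel's rwlock): since the lock
word is read anyway, the upgrade CAS is only attempted when we are the sole
reader, so a doomed CAS does not dirty the cache line.

    #include <stdatomic.h>
    #include <stdbool.h>

    #define RW_WRITER   0x1u
    #define RW_READER   0x10u   /* exactly one reader, no waiter bits */

    struct rwsketch {
        _Atomic unsigned word;
    };

    static bool
    try_upgrade(struct rwsketch *rw)
    {
        unsigned v = atomic_load_explicit(&rw->word, memory_order_relaxed);

        /* More than one reader: the CAS below cannot succeed, so
         * return without writing to the line. */
        if (v != RW_READER)
            return (false);
        return (atomic_compare_exchange_strong_explicit(&rw->word, &v,
            RW_WRITER, memory_order_acquire, memory_order_relaxed));
    }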
If there were exactly rowner_retries/asx_retries (by default: 10) transitions
between read and write state and the waiters still did not get the lock, the
next owner -> reader transition would result in the code correctly falling
back to turnstile/sleepq where it would incorrectly think it was waiting
for a writer and decide to leave the turnstile/sleepq to loop back. From this
point it would keep taking turnstile/sleepq round trips until the lock got
released.
The bug sometimes manifested itself in stalls during -j 128 package builds.
Refactor the code to fix the bug, while here remove some of the gratuitous
differences between rw and sx locks.
The main routine takes 8 args, 3 of which are almost the same for most uses.
This in particular pushes it above the limit of 6 arguments passable through
registers on amd64, making it impossible to tail call.
This is a prerequisite for further cleanups.
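A hedged illustration (hypothetical names, not the actual lock routines):
folding the mostly-constant arguments into a struct keeps the forwarding call
within the six integer argument registers on amd64, so the compiler can emit
a tail call instead of setting up a new frame.

    struct lock_ctx {           /* hypothetical argument bundle */
        void *lock;
        const char *file;
        int line;
    };

    static void
    hard_path(struct lock_ctx *ctx, unsigned long v, int flags)
    {
        (void)ctx; (void)v; (void)flags;   /* placeholder body */
    }

    void
    lock_wrapper(struct lock_ctx *ctx, unsigned long v, int flags)
    {
        /* Three register arguments forwarded unchanged: eligible for a
         * jmp-style tail call. */
        hard_path(ctx, v, flags);
    }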
Tested by: pho
This deliberately breaks the API in preparation for future syscall
revisions which will remove these nonstandard members.
In an exp-run a single port (devel/qemu-user-static) was found to
use them, which it did because it emulates system calls. This has
been fixed in the ports tree.
PR: 224443 (exp-run)
Reviewed by: kib, jhb (previous version)
Exp-run by: antoine
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14490
The first call is used to gauge how much space is needed. Just computing
the size instead of generating the output makes it possible to avoid taking
the proctree lock.
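The pattern, as a self-contained userspace sketch (hypothetical helper, not
the kernel code): a NULL buffer means "just measure", so the sizing pass can
run without the lock that the output pass would otherwise need.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static size_t
    emit_strings(char *buf, size_t bufsz, const char *const *strv)
    {
        size_t len = 0;

        for (; *strv != NULL; strv++) {
            size_t n = strlen(*strv) + 1;

            if (buf != NULL && len + n <= bufsz)
                memcpy(buf + len, *strv, n);
            len += n;
        }
        return (len);
    }

    int
    main(void)
    {
        const char *strv[] = { "cc", "-O2", "main.c", NULL };
        size_t need = emit_strings(NULL, 0, strv);  /* sizing call */
        char *buf = malloc(need);

        emit_strings(buf, need, strv);              /* output call */
        printf("%zu bytes needed\n", need);
        free(buf);
        return (0);
    }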
Use it to regulate page daemon output.
This provides much smoother and more responsive page daemon output, anticipating
demand and avoiding pageout stalls by increasing the number of pages to match
the workload. This is a reimplementation of work done by myself and mlaier at
Isilon.
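Purely as an illustration of regulating output to track demand, here is a toy
proportional-integral feedback sketch; this is an assumption about the shape
of the mechanism, not the committed algorithm, and all names are hypothetical.

    #include <stdio.h>

    struct regulator {
        double setpoint;    /* desired free pages */
        double integral;
        double kp, ki;      /* gains */
    };

    static double
    regulator_step(struct regulator *r, double measured)
    {
        double error = r->setpoint - measured;

        r->integral += error;
        return (r->kp * error + r->ki * r->integral);
    }

    int
    main(void)
    {
        struct regulator r = { .setpoint = 1000, .kp = 0.5, .ki = 0.1 };
        double free_pages = 400;

        for (int i = 0; i < 5; i++) {
            double target = regulator_step(&r, free_pages);

            printf("scan target: %.0f pages\n", target);
            free_pages += target * 0.5;   /* pretend some are reclaimed */
        }
        return (0);
    }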
Reviewed by: bsdimp
Tested by: pho
Sponsored by: Netflix, Dell/EMC Isilon
Differential Revision: https://reviews.freebsd.org/D14402
search for a thread to steal inside a critical section. Since this
allows the search to be preempted, restart the search if preemption
happens since the search results found earlier may no longer be
valid.
Decrease the latency of starting a thread that may be assigned to
this CPU during the search by polling for incoming threads during
the search and switching to that thread instead of continuing the
search.
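A rough sketch of the control flow (hypothetical fields, no real scheduler
structures): the search polls our own queue and bails out to run incoming
work, and restarts from scratch if a preemption made the partial results
stale.

    struct cpuq {           /* hypothetical per-CPU queue state */
        int load;           /* runnable threads queued here */
        int switchcnt;      /* bumped on every context switch */
    };

    static int
    steal_search(struct cpuq *self, struct cpuq *peers, int npeers)
    {
        int seen;

    restart:
        seen = self->switchcnt;
        for (int i = 0; i < npeers; i++) {
            if (self->load > 0)
                return (-1);    /* work arrived here; go run it */
            if (self->switchcnt != seen)
                goto restart;   /* we were preempted; results stale */
            if (peers[i].load > 1)
                return (i);     /* candidate CPU worth locking */
        }
        return (-2);            /* nothing to steal */
    }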
Test for stale search results and restart the search before going
through the expense of calling tdq_lock_pair(). Retry some tests
after grabbing the locks since things may have changed while waiting
to get both locks.
Eliminate special case handling for stealing from an SMT peer that
uses 1 as the steal threshold. This can only succeed if a thread
has been assigned but our SMT peer has not yet started executing
it. This is quite rare and when it happens the other SMT thread
is generally waiting for the same tdq lock that we hold. Basically
both SMT threads are racing to grab the same spin lock.
Add the kern.sched.always_steal knob from a ULE patch by jeff@.
Incorporate another idea from Jeff's ULE patch. If sched_switch()
detects that the CPU is about to go idle, try to steal a thread
before switching to the idle thread. Since the search for a thread
to steal has to be done inside a critical section in this context,
limit the impact on latency by adding the knob kern.sched.trysteal_limit
to limit the topological distance of the search and don't restart
the search if we detect stale results. If this search can't find
a stealable thread, the idle loop can do a more complete search.
Also poll for threads being assigned to this CPU during the search
and switch to them instead of continuing the search. This change
is responsible for the majority of the improvement in parallel
buildworld times.
In sched_balance_group() change the minimum threshold for stealing
a thread from 1 to 2. Poaching a newly assigned thread from a CPU
that is waking up but hasn't yet switched to that thread from idle is
likely very rare and is likely to have the same lock race as is
seen when stealing threads in the idle loop. Also use tdq_notify()
to kick the destination CPU instead of always sending an IPI.
Update a stale comment; the number of transferable threads is not
calculated.
Reviewed by: kib (earlier version)
Comments by: avg, jeff, mav
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D12130
Remove the unused syscall_(de)register() functions in favor of the
better documented and easier to use syscall_helper_(un)register(9)
functions.
The default and freebsd32 versions differed in which array of struct
sysent they used and in a few missing updates to the 32-bit code as
features were added to the main code.
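A hedged sketch of the replacement interface in use; the headers, macros and
flag below are recalled from syscall_helper_register(9) and should be treated
as assumptions rather than a verified example, and the module is hypothetical.

    #include <sys/types.h>
    #include <sys/sysent.h>     /* assumed location of the helper API */

    static struct syscall_helper_data mymod_syscalls[] = {
        SYSCALL_INIT_HELPER(mymod_syscall),   /* hypothetical syscall */
        SYSCALL_INIT_LAST
    };

    static int
    mymod_modinit(void)
    {
        /* One table registers both the native and compat entries. */
        return (syscall_helper_register(mymod_syscalls, SY_THR_STATIC_KLD));
    }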
Reviewed by: cem
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14337
There is a proctree -> allproc ordering established.
Most of the time it is either xlock -> xlock or slock -> slock.
On fork however there is a slock -> xlock pair which results in
pathological wait times due to threads keeping proctree held for
reading and all waiting on allproc. Switch this to xlock -> xlock.
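A userspace illustration of the ordering change (pthread rwlocks standing in
for the proctree and allproc sx locks; the function name is hypothetical):

    #include <pthread.h>

    static pthread_rwlock_t proctree_lk = PTHREAD_RWLOCK_INITIALIZER;
    static pthread_rwlock_t allproc_lk = PTHREAD_RWLOCK_INITIALIZER;

    static void
    fork_lock_section(void)
    {
        pthread_rwlock_wrlock(&proctree_lk);  /* was rdlock: slock -> xlock */
        pthread_rwlock_wrlock(&allproc_lk);   /* proctree -> allproc order kept */
        /* ... link the new process ... */
        pthread_rwlock_unlock(&allproc_lk);
        pthread_rwlock_unlock(&proctree_lk);
    }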
A longer term fix would be to get rid of proctree in this place to begin with.
Right now it is necessary to walk the session/process group lists to
determine which id is free. The walk can be avoided e.g. with bitmaps.
The exit path used to have one place which dealt with allproc and
then with proctree. Move the allproc acquire into the section protected
by proctree. This reduces contention against threads waiting on proctree
in the fork codepath - the fork proctree holder does not have to wait
for allproc as often.
Finally, move tidhash manipulation outside of the area protected by
either of these locks. The removal from the hash was already unprotected.
There is no legitimate reason to look up thread ids for a process still
under construction.
This results in about 50% wait time reduction during -j 128 package build.
Provide multiple clean queues partitioned into 'domains'. Each domain manages
its own bufspace and has its own bufspace daemon. Each domain has a set of
subqueues indexed by the current cpuid to reduce lock contention on the cleanq.
Refine the sleep/wakeup around the bufspace daemon to use atomics as much as
possible.
Add a B_REUSE flag that is used to requeue bufs during the scan to approximate
LRU rather than locking the queue on every use of a frequently accessed buf.
Implement bufspace_reserve with only atomic_fetchadd to avoid loop restarts.
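A minimal sketch of the reservation idea with C11 atomics (hypothetical names
and limit, not the committed code): a single fetch-add claims the space and is
backed out on overshoot, instead of a compare-and-swap loop that restarts
under contention.

    #include <stdatomic.h>
    #include <stdbool.h>

    static _Atomic long bufspace;               /* bytes currently reserved */
    static const long bufspace_hi = 1L << 20;   /* hypothetical limit */

    static bool
    bufspace_reserve_sketch(long size)
    {
        long prev = atomic_fetch_add(&bufspace, size);

        if (prev + size > bufspace_hi) {
            atomic_fetch_sub(&bufspace, size);  /* undo; caller must wait */
            return (false);
        }
        return (true);
    }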
Reviewed by: markj
Tested by: pho
Sponsored by: Netflix, Dell/EMC Isilon
Differential Revision: https://reviews.freebsd.org/D14274
The race manifested itself mostly in terms of crashes with "spin lock
held too long".
Relevant parts of respective code paths:
exit:                                   reap:
PROC_LOCK(p);
PROC_SLOCK(p);
p->p_state == PRS_ZOMBIE
PROC_UNLOCK(p);
                                        PROC_LOCK(p);
/* exit work */
                                        if (p->p_state == PRS_ZOMBIE) /* true */
                                                proc_reap()
                                                        free proc
/* more exit work */
PROC_SUNLOCK(p);
Thus a still exiting process is reaped.
Prior to the change the zombie check was followed by a slock/sunlock trip
which prevented the problem.
Even the code prior to this commit has a bug: the proc is still accessed for
statistics collection purposes. However, the severity is rather low and
the bug may be fixed in a future commit.
Reported by: many
Tested by: allanjude
The primitive can be used to wait for the lock to be released. Intended
usage is for locks in structures which are about to be freed.
The benefit is avoiding the interrupt enable/disable trip and the atomic op
needed to grab the lock, plus a shorter wait if the lock is held (since there
is no worry that someone will contend on the lock, re-reads can be more
aggressive).
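A minimal C11 sketch of such a primitive (illustrative only, not the kernel
implementation): spin until the lock word reads as unowned, never writing to
it and never touching the interrupt state.

    #include <stdatomic.h>

    static void
    spin_wait_unlocked(_Atomic unsigned long *lockword, unsigned long unowned)
    {
        /* Pure reads: no atomic read-modify-write, nothing to contend on,
         * so the re-read loop can be as aggressive as we like. */
        while (atomic_load_explicit(lockword, memory_order_acquire) != unowned)
            ;
    }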
Briefly discussed with: kib
The suspension counter needs synchronisation through slock, but we do not
need the lock just to check whether inspecting the counter is necessary to
begin with. In the common case it is not, so avoid taking the lock if possible.
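The shape of the check, as a hedged sketch with pthread stand-ins (field and
lock names hypothetical; the unlocked read is only a shape illustration, the
kernel's equivalent read is safe): read the counter unlocked first and take
slock only when there is something to synchronise against.

    #include <pthread.h>

    struct procsk {             /* hypothetical */
        int suspcount;
        pthread_mutex_t slock;
    };

    static int
    has_suspended_threads(struct procsk *p)
    {
        int rv;

        if (p->suspcount == 0)  /* common case: no lock needed */
            return (0);
        pthread_mutex_lock(&p->slock);
        rv = (p->suspcount != 0);
        pthread_mutex_unlock(&p->slock);
        return (rv);
    }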
Reviewed by: kib
Tested by: pho
The description of kern.ipc.shmsegs was wrong since 2005. I updated the
others (which were more correct) to match.
PR: 225933
Reviewed by: cem
MFC after: 3 days
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14391
With the option compiled into the kernel, both sx and rw shared ops would
always go to the slow path, which added avoidable overhead even when the
facility is disabled.
Furthermore, the increased time spent doing uncontested shared lock acquires
would be bogusly added to the total wait time, somewhat skewing the results.
Restore the old behaviour of going to the slow path only when profiling is
enabled.
This change is a no-op for kernels without LOCK_PROFILING (which is the
default).
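A hedged outline of the restored behaviour (hypothetical names and C11
atomics, not the kernel's sx/rw code): the inline fast path is attempted
unless profiling is actually enabled, and only failures fall through to the
slow path.

    #include <stdatomic.h>

    struct sxsketch {
        _Atomic unsigned state;     /* 0 == unlocked */
    };

    static int lock_prof_enable;    /* runtime knob, off by default */

    static void slock_hard(struct sxsketch *sx);

    static inline void
    slock(struct sxsketch *sx)
    {
        unsigned v = 0;

        if (!lock_prof_enable &&
            atomic_compare_exchange_strong(&sx->state, &v, 1))
            return;         /* uncontested fast path */
        slock_hard(sx);     /* contention, or profiling enabled */
    }

    static void
    slock_hard(struct sxsketch *sx)
    {
        unsigned v;

        do {
            v = 0;          /* spin until the lock is free again */
        } while (!atomic_compare_exchange_weak(&sx->state, &v, 1));
    }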
Add const to new kern_ functions and push down as required.
Reviewed by: rwatson
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14174
sbt is the time in the future at which tsleep_sbt() is expected to complete.
sbtt is the current time. Depending on the precision set with the sysctl
kern.timecounter.alloweddeviation, the start time may be incremented by
tc_tick_sbt. The same increment is needed for the current time sbtt before
calculating the difference. The impact of missing this increment is that rmtp
may increase by one tc_tick_sbt on every early [EINTR] return. If the same
struct is passed in for rqtp as for rmtp, this can result in rqtp effectively
incrementing by tc_tick_sbt and sleeping longer than originally intended.
This problem was introduced in r247797.
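A hedged sketch of the corrected arithmetic (plain int64 standing in for
sbintime_t; the function and flag are hypothetical, the variable names follow
the description above): whatever tc_tick_sbt adjustment was applied to the
deadline sbt must also be applied to the current time sbtt before taking the
difference that feeds rmtp.

    #include <stdint.h>

    typedef int64_t sbt_t;      /* stand-in for sbintime_t */

    static sbt_t
    remaining_sleep(sbt_t sbt, sbt_t sbtt, sbt_t tc_tick_sbt, int sbt_adjusted)
    {
        if (sbt_adjusted)
            sbtt += tc_tick_sbt;    /* mirror the adjustment made to sbt */
        return (sbt > sbtt ? sbt - sbtt : 0);
    }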
Reviewed by: kib, markj, vangyzen (all on an older version of the test)
MFC after: 2 weeks
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D14362
This works similarly to the existing gzip compression support, but
zstd is typically faster and gives better compression ratios.
Support for this functionality must be configured by adding ZSTDIO to
one's kernel configuration file. dumpon(8)'s new -Z option is used to
configure zstd compression for kernel dumps. savecore(8) now recognizes
and saves zstd-compressed kernel dumps with a .zst extension.
Submitted by: cem (original version)
Relnotes: yes
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D13101,
https://reviews.freebsd.org/D13633