freebsd-skq

Author	SHA1	Message	Date
mmacy	a029bbc5e5	umtx: don't call umtxq_getchain unless the value is needed	2018-05-19 05:09:10 +00:00
mmacy	5eda5d6711	cpuset: revert and annotate instead	2018-05-19 05:07:31 +00:00
mmacy	861014e26c	conf: revert last change and annotate unused var instead	2018-05-19 05:07:03 +00:00
mmacy	56410d49b1	kevent: annotate unused stack local	2018-05-19 05:06:18 +00:00
mmacy	c78ee89370	lockf: annotate LOCKF_DEBUG only var	2018-05-19 05:04:38 +00:00
mmacy	aac0792101	capsicum: annotate variable only used by debug	2018-05-19 05:02:40 +00:00
mmacy	63ce31f3b1	turnstile / sleepqueue: annotate variables only used by debug builds	2018-05-19 05:00:16 +00:00
mmacy	4eacc08586	vfs: annotate variables only used by debug builds as __unused	2018-05-19 04:59:39 +00:00
mmacy	214cb70c1a	tty: use __unused annotation instead to silence warnings	2018-05-19 04:48:26 +00:00
mmacy	366f674ab5	malloc: avoid possibly returning stack garbage if MALLOC_DEBUG is defined	2018-05-19 04:43:49 +00:00
mmacy	a42e239a05	cpuset_thread0: avoid unused assignment on non debug build	2018-05-19 04:14:00 +00:00
mmacy	07e1c3def4	make_dev: avoid unused assignments on non debug builds	2018-05-19 04:13:20 +00:00
mmacy	77a3ff4fe0	mqueue: avoid unused variables	2018-05-19 04:10:53 +00:00
mmacy	71ee67e907	physio: avoid uninitialized variables	2018-05-19 04:09:58 +00:00
mmacy	0cad561f8c	cache_lookup remove unused variable and initialize used	2018-05-19 04:08:11 +00:00
mmacy	e1c41612d0	filt_timerdetach: only assign to old if we're going to check it in a KASSERT	2018-05-19 04:07:00 +00:00
mmacy	ae3e2f4e32	getnextevent: put variable only used by KTR under ifdef KTR	2018-05-19 04:05:36 +00:00
mmacy	1edae3372d	simplify control flow so that gcc knows we never pass save to curthread_pflags_restore without initializing	2018-05-19 04:04:44 +00:00
mmacy	3b202e89e0	tty: conditionally assign to ret value only used by MPASS statement	2018-05-19 04:02:29 +00:00
mmacy	130d8ea8de	remove unused locked variable in lockmgr_unlock_fast_path	2018-05-19 03:58:40 +00:00
mmacy	ac52916f52	signotify: don't create a stack local that isn't used on non-debug builds	2018-05-19 03:57:41 +00:00
mmacy	1a9d954f7e	sysv_msg initialize saved_msgsz	2018-05-19 03:56:39 +00:00
mmacy	9adfff2b18	remove unused variable	2018-05-19 03:55:42 +00:00
mmacy	0c8764214e	fix uninitialized variable warning in reader locks	2018-05-19 03:52:55 +00:00
mmacy	350f657795	fix uninitialized variable warning	2018-05-19 03:49:36 +00:00
mmacy	dcf6fa9422	sys_process.c fix set but not used warning	2018-05-19 03:48:35 +00:00
mmacy	edbaf8fc62	subr_epoch.c fix unused variable warnings	2018-05-19 03:47:37 +00:00
mmacy	79dec8cde3	pidctrl Actually use the variables that we assign to as seatbelts to prevent divide by zero Reviewed by: jeffr	2018-05-19 02:17:18 +00:00
mmacy	092fac4e4a	fix gcc8 unused variable and set but not used variable in unix sockets add copyright from lock rewrite while here	2018-05-19 02:15:40 +00:00
mjg	9507177d38	lockmgr: avoid atomic on unlock in the slow path The code is pretty much guaranteed not to be able to unlock. This is a minor nit. The code still performs way too many reads. The altered exclusive-locked condition is supposed to be always true as well, to be cleaned up at a later date.	2018-05-18 22:57:52 +00:00
mmacy	7aeac9ef18	ifnet: Replace if_addr_lock rwlock with epoch + mutex Run on LLNW canaries and tested by pho@ gallatin: Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5 based ConnectX 4-LX NIC, I see an almost 12% improvement in received packet rate, and a larger improvement in bytes delivered all the way to userspace. When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1, I see, using nstat -I mce0 1 before the patch: InMpps OMpps InGbs OGbs err TCP Est %CPU syscalls csw irq GBfree 4.98 0.00 4.42 0.00 4235592 33 83.80 4720653 2149771 1235 247.32 4.73 0.00 4.20 0.00 4025260 33 82.99 4724900 2139833 1204 247.32 4.72 0.00 4.20 0.00 4035252 33 82.14 4719162 2132023 1264 247.32 4.71 0.00 4.21 0.00 4073206 33 83.68 4744973 2123317 1347 247.32 4.72 0.00 4.21 0.00 4061118 33 80.82 4713615 2188091 1490 247.32 4.72 0.00 4.21 0.00 4051675 33 85.29 4727399 2109011 1205 247.32 4.73 0.00 4.21 0.00 4039056 33 84.65 4724735 2102603 1053 247.32 After the patch InMpps OMpps InGbs OGbs err TCP Est %CPU syscalls csw irq GBfree 5.43 0.00 4.20 0.00 3313143 33 84.96 5434214 1900162 2656 245.51 5.43 0.00 4.20 0.00 3308527 33 85.24 5439695 1809382 2521 245.51 5.42 0.00 4.19 0.00 3316778 33 87.54 5416028 1805835 2256 245.51 5.42 0.00 4.19 0.00 3317673 33 90.44 5426044 1763056 2332 245.51 5.42 0.00 4.19 0.00 3314839 33 88.11 5435732 1792218 2499 245.52 5.44 0.00 4.19 0.00 3293228 33 91.84 5426301 1668597 2121 245.52 Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch Reviewed by: gallatin Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D15366	2018-05-18 20:13:34 +00:00
mmacy	49ea7f0046	epoch(9): assert that epoch is allocated post-configure	2018-05-18 18:27:17 +00:00
emaste	f0cc1a044c	Use NULL for SYSINIT's last arg, which is a pointer type Sponsored by: The FreeBSD Foundation	2018-05-18 17:58:09 +00:00
mmacy	a48d80f193	epoch(9): Make epochs non-preemptible by default There are risks associated with waiting on a preemptible epoch section. Change the name to make them not be the default and document the issue under CAVEATS. Reported by: markj	2018-05-18 17:29:43 +00:00
mmacy	3e6748b997	epoch: actually allocate the counters we've assigned sysctls to Approved by: sbruno	2018-05-18 02:57:39 +00:00
mmacy	aac2a8081e	epoch: add non-preemptible "critical" variant adds: - epoch_enter_critical() - can be called inside a different epoch, starts a section that will not acquire any MTX_DEF mutexes or do anything that might sleep. - epoch_exit_critical() - corresponding exit call - epoch_wait_critical() - wait variant that is guaranteed that any threads in a section are running. - epoch_global_critical - an epoch_wait_critical safe epoch instance Requested by: markj Approved by: sbruno	2018-05-18 01:52:51 +00:00
brooks	1625a51062	Use strsep() to parse init_path in start_init(). This simplifies the use of the path variable by making it NUL terminated. This is a prerequisite for further cleanups. Reviewed by: imp Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D15467	2018-05-17 23:07:51 +00:00
mmacy	4a56969793	epoch: skip poll function call in hardclock unless there are callbacks pending Reported by: mjg Approved by: sbruno	2018-05-17 21:39:15 +00:00
mmacy	cdcdb3cf1f	epoch(9): schedule pcpu callback task in hardclock if there are callbacks pending Approved by: sbruno	2018-05-17 19:57:07 +00:00
mmacy	f35c237f2d	epoch(9): eliminate the need to wait when polling for callbacks to run by using ck's own callback handling mechanism we can simply check which callbacks have had a grace period elapse Approved by: sbruno	2018-05-17 19:50:55 +00:00
mmacy	32052b0186	epoch(9): fix potential deadlock Don't acquire a waiting thread's lock while holding our own Approved by: sbruno	2018-05-17 19:41:58 +00:00
mmacy	d683846951	epoch(9): restore thread priority on exit if it was changed by a waiter Reported by: markj Approved by: sbruno	2018-05-17 19:08:28 +00:00
mmacy	7c5c49366c	AF_UNIX: make unix socket locking finer grained This change moves to using a reference count across lock drop / reacquire to guarantee liveness. Currently sends on unix sockets contend heavily on read locking the list lock. unix1_processes in will-it-scale peaks at 6 processes and then declines. With this change I get a substantial improvement in number of operations per second with 96 processes: x before + after N Min Max Median Avg Stddev x 11 1688420 1696389 1693578 1692766.3 2971.1702 + 10 63417955 71030114 70662504 69576423 2374684.6 Difference at 95.0% confidence 6.78837e+07 +/- 1.49463e+06 4010.22% +/- 88.4246% (Student's t, pooled s = 1.63437e+06) And even for 2 processes shows a ~18% improvement. "Small" iron changes (1, 2, and 4 processes): x before1 + after1.2 [ministat distribution plot] N Min Max Median Avg Stddev x 10 1131648 1197750 1197138.5 1190369.3 20651.839 + 10 1203840 1205056 1204919 1204827.9 353.27404 Difference at 95.0% confidence 14458.6 +/- 13723 1.21463% +/- 1.16683% (Student's t, pooled s = 14605.2) x before2 + after2.2 [ministat distribution plot] N Min Max Median Avg Stddev x 10 1972843 2045866 2038186.5 2030443.8 21367.694 + 10 2400853 2402196 2401043.5 2401172.7 385.40024 Difference at 95.0% confidence 370729 +/- 14198.9 18.2585% +/- 0.826943% (Student's t, pooled s = 15111.7) x before4 + after4.2 N Min Max Median Avg Stddev x 10 3986994 3991728 3990137.5 3989985.2 1300.0164 + 10 4799990 4806664 4806116.5 4805194 1990.6625 Difference at 95.0% confidence 815209 +/- 1579.64 20.4314% +/- 0.0421713% (Student's t, pooled s = 1681.19) Tested by: pho Reported by: mjg Approved by: sbruno Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D15430	2018-05-17 17:59:35 +00:00
mmacy	8520e87bb7	epoch(9): make recursion lighter weight There isn't any real work to do except bump td_epochnest when recursing. Skip the additional work in this case. Approved by: sbruno	2018-05-17 01:13:40 +00:00
mmacy	c6869bc0ff	epoch(9): Guarantee forward progress on busy sections Add epoch section to struct thread. We can use this to enable the epoch counter to advance even if a section is perpetually occupied by a thread. Approved by: sbruno	2018-05-17 00:45:35 +00:00
mmacy	b4ad383689	hwpmc: Implement per-thread counters for PMC sampling This implements per-thread counters for PMC sampling. The thread descriptors are stored in a list attached to the process descriptor. These thread descriptors can store any per-thread information necessary for current or future features. For the moment, they just store the counters for sampling. The thread descriptors are created when the process descriptor is created. Additionally, thread descriptors are created or freed when threads are started or stopped. Because the thread exit function is called in a critical section, we can't directly free the thread descriptors. Hence, they are freed to a cache, which is also used as a source of allocations when needed for new threads. Approved by: sbruno Obtained from: jtl Sponsored by: Juniper Networks, Limelight Networks Differential Revision: https://reviews.freebsd.org/D15335	2018-05-16 22:29:20 +00:00
dumbbell	b9337da075	teken, vt(4): New callbacks to lock the terminal once ... to process input, instead of inside each smaller operations such as appending a character or moving the cursor forward. In other words, before we were doing (oversimplified): teken_input() <for each input character> vtterm_putchar() VTBUF_LOCK() VTBUF_UNLOCK() vtterm_cursor_position() VTBUF_LOCK() VTBUF_UNLOCK() Now, we are doing: vtterm_pre_input() VTBUF_LOCK() teken_input() <for each input character> vtterm_putchar() vtterm_cursor_position() vtterm_post_input() VTBUF_UNLOCK() The situation was even worse when the vtterm_copy() and vtterm_fill() callbacks were involved. The new callbacks are: * struct terminal_class->tc_pre_input() * struct terminal_class->tc_post_input() They are called in teken_input(), surrounding the while() loop. The goal is to improve input processing speed of vt(4). As a benchmark, here is the time taken to write a text file of 360 000 lines (26 MiB) on `ttyv0`: * vt(4), unmodified: 1500 ms * vt(4), with this patch: 1200 ms * syscons(4): 700 ms This is on a Haswell laptop with a GENERIC-NODEBUG kernel. At the same time, the locking is changed in the vt_flush() function which is responsible to draw the text on screen. So instead of (indirectly) using VTBUF_LOCK() just to read and reset the dirty area of the internal buffer, the lock is held for about the entire function, including the drawing part. The change is mostly visible while content is scrolling fast: before, lines could appear garbled while scrolling because the internal buffer was accessed without locks (once the scrolling was finished, the output was correct). Now, the scrolling appears correct. In the end, the locking model is closer to what syscons(4) does. Differential Revision: https://reviews.freebsd.org/D15302	2018-05-16 09:01:02 +00:00
emaste	0e06aa13ba	subr_pidctrl: use standard 2-Clause FreeBSD license and disclaimer Approved by: jeff	2018-05-15 00:50:09 +00:00
mmacy	9888701947	hwpmc: fix load/unload race and vm map LOR - fix load/unload race by allocating the per-domain list structure at boot - fix long extant vm map LOR by replacing pmc_sx sx_slock with global_epoch to protect the liveness of elements of the pmc_ss_owners list Reported by: pho Approved by: sbruno	2018-05-14 00:21:04 +00:00
mmacy	641892da2a	epoch(9): allow sx locks to be held across epoch_wait() The INVARIANTS checks in epoch_wait() were intended to prevent the block handler from returning with locks held. What they in fact did was prevent anything except Giant from being held across it. Check that the number of locks held has not changed instead. Approved by: sbruno@	2018-05-14 00:14:00 +00:00

16077 Commits