freebsd-dev

Author	SHA1	Message	Date
Mateusz Guzik	27dcd3d90b	cam: clean up empty lines in .c and .h files	2020-09-01 22:13:48 +00:00
Pawel Biernacki	7029da5c36	Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many). r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is a non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT. Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718	2020-02-26 14:26:36 +00:00
Warner Losh	ece56614c8	Revert r355833. While it works on nda, it fails on ada and/or da for at least zfs with a modify-after-free issue on a trim BIO. Revert while I rework it to fix those devices.	2019-12-17 21:53:22 +00:00
Warner Losh	0d83f8dc1f	Implement bio_speedup. React to the BIO_SPEEDUP command in the cam io scheduler by completing as successful BIO_DELETE commands that are pending, up to the length passed down in the BIO_SPEEDUP command. The length passed down is a hint for how much space on the drive needs to be recovered. By completing the BIO_DELETE commands, this allows the upper layers to allocate and write to the blocks that were about to be trimmed. Since FreeBSD implements TRIMs as advisory, we can eliminate them and go directly to writing. The biggest benefit from TRIMs comes from the drive being able to optimize its free block pool in the long run. There's little to no benefit in the short term, especially when the trim is followed by a write. Speedup lets us make this tradeoff. Reviewed by: kirk, kib Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D18351	2019-12-17 00:13:45 +00:00
Warner Losh	7918ea40a5	Eliminate the TRIM_ACTIVE flag. Rather than a trim active flag, have a counter that can be used to have an absolute limit on the number of trims in flight independent of any I/O limiting factors. Sponsored by: Netflix	2019-12-17 00:13:30 +00:00
Warner Losh	3aba1d47c8	Tweak the ddb show cam iosched command a bit. For each of the different queue types, list the name of the queue. While it can be worked out from context, this makes it more useful and clearer. Sponsored by: Netflix	2019-12-17 00:13:26 +00:00
Warner Losh	c6171b4440	Add rate limiters to TRIM. Add rate limiters to trims. Trims are a bit different than reads or writes in that they can be combined, so some care needs to be taken where we rate limit them. Additional work will be needed to push the working rate limit below the I/O quanta rate for things like IOPS. Sponsored by: Netflix	2019-12-17 00:13:21 +00:00
Warner Losh	d900ade516	NVME trim clocking. Add the ability to set two goals for trims in the I/O scheduler. The first goal is the number of BIO_DELETEs to accumulate (kern.cam.XX.U.trim_goal). When non-zero, this many trims will be accumulated before we start to transfer them to lower layers. This is useful for devices that like to get lots of trims all at once in one transaction (not all devices are like this, and some vary by workload). The second is a number of ticks to defer trims. If you've set a trim goal, then kern.cam.XX.U.trim_ticks controls how long the system will defer those trims before timing out and sending them anyway. It has no effect when trim_goal is 0. In any event, a BIO_FLUSH will cause all the TRIMs to be released to the periph drivers. This may be a minor overloading of what BIO_FLUSH is supposed to mean, but it's useful to preserve other ordering semantics that users of BIO_FLUSH rely on. Sponsored by: Netflix, Inc	2018-11-27 00:36:35 +00:00
Warner Losh	1759fd7798	Minor tweaks to the formatting. Tweak the format of the trim + read bias code. Add similar debug to the read + writes case. Sponsored by: Netflix	2018-11-26 22:50:30 +00:00
Warner Losh	e5436ab5af	Add cam_iosched_set_latfcn to set a latency callback for high latency. It's often useful to have a callback when an I/O takes more than a threshold amount of time. This adds the infrastructure for periph devices to register one. One use-case is as a debugging aid when you need a semi-realtime indication of an I/O outlier so you can trigger bus capture gear for vendor analysis. Sponsored by: Netflix, Inc	2018-11-15 16:02:45 +00:00
Warner Losh	74cc33ce57	Flesh out a comment about what we're doing with read bias and trims. Sponsored by: Netflix	2018-08-15 00:15:40 +00:00
Warner Losh	62c94a0551	For the dynamic I/O scheduler, make the TRIM stuff also count against read bias so we do reads in preference to TRIMs. This helps a lot when many trims are delivered at once from the upper layers as they tend to delay READs due to priority inversion in the code today. The non iosched case will be fixed when the trim combining changes needed for nvme come in later this year. Sponsored by: Netflix	2018-07-26 22:55:51 +00:00
Warner Losh	041f49aece	Remove the 'All Rights Reserved' clause from some of the stuff I've done for Netflix, since I'm in the neighborhood.	2018-05-09 20:32:23 +00:00
Warner Losh	157cb465c4	Fix inverted logic that counted all completions as errors, except when they were actual errors. Sponsored by: Netflix	2018-03-14 16:44:57 +00:00
Warner Losh	8a3de7bc34	Allow NULL ccb to cam_iosched_bio_complete When the ccb is NULL to cam_iosched_bio_complete, just update the other statistics, but not the time. If many operations are collapsed together, this is needed to keep stats properly for the grouped bp. This should fix trim accounting. Sponsored by: Netflix	2018-03-14 16:44:16 +00:00
Warner Losh	2d87718fda	Use bool instead of int for predicate functions relating to work available.	2018-02-23 16:06:54 +00:00
Warner Losh	07e5967a22	Revert r329814 as well. It should have been in r329819.	2018-02-22 11:51:50 +00:00
Warner Losh	0028abe633	Backout r329818, r329816 and r329815. These aren't the commits I thought I was testing prior to commit. Revert until I can sort out what happened and fix it.	2018-02-22 11:18:33 +00:00
Warner Losh	91acaad987	Fix typo in last commit after last rebase before commit...	2018-02-22 10:55:23 +00:00
Warner Losh	c5fe3ae9b8	Introduce capacity flags for periphs. Introduce flags word to describe the capacities of the peripheral. First bit will describe if the periph driver allows multiple outstanding TRIMs to be active in a device. Modify the I/O scheduler so that the nda driver can queue trims for a while after the first one arrives. We'll queue until we see an I/O scheduler tick, then we'll schedule as many TRIMs as allowed by other factors (currently this is slots in the NVMe controller). This marginally helps the read latency issues we see with reads, but sets the stage for the nda driver to do TRIM collapsing like the da and ada drivers do today. Sponsored by: Netflix	2018-02-22 05:43:55 +00:00
Warner Losh	c9878d6d63	Note when we tick. To help implement a 'queue all trims until next I/O sched tick' policy that coalesces them, note when we tick so we can do something special on the first call after the tick to get more work. Sponsored by: Netflix	2018-02-22 05:43:50 +00:00
Warner Losh	f2b9885036	Wrap an extra long line. This debugging line is too big for even my largest xterm. Wrap it at about 80 columns. Sponsored by: Netflix	2018-02-22 05:43:45 +00:00
Warner Losh	97f8aa050e	Don't sort TRIMs. While the code for ada and da both assume that the trim list is ordered when coalescing the TRIMs, it turns out that creating the sorted list uses more resources than are saved by having slightly fewer trims sent to the device. Sponsored by: Netflix	2018-02-22 05:43:20 +00:00
Warner Losh	c4b72d8b37	Keep a counter for number of requests completed with an error. Sponsored by: Netflix	2018-02-06 23:21:08 +00:00
Pedro F. Giffuni	f24882eca5	SPDX: finish tagging sys/cam.	2018-01-16 23:19:57 +00:00
Warner Losh	6ca2fb6623	Treat a 'current' value of 0 as unlimited as a failsafe. When limiting I/O, a value of 0 makes no sense as a limit. No progress can be made. Trade the possibility that someone might be doing something clever to achieve ultra-low I/O limits vs the damage of not ever making progress on an I/O in favor of making progress. Now the machine won't be useless if this accidentally gets requested. Sponsored by: Netflix	2017-10-24 02:25:42 +00:00
Warner Losh	78ed811e6c	cam iosched: Better account IOPS for smoother performance. Prevent cam_iosched_iops_tick() from discarding 'unspent' ios unless it's a new accounting interval. Previously ios that weren't used between ticks were lost; as a result the iops limiter could enforce a limit below the configured maximum. Obtained from: ElectroBSD Submitted by: Fabian Keil PR: 221974	2017-09-22 02:36:36 +00:00
Warner Losh	f777123b83	cam iosched: Enforce iop limits below the quanta value Previously the iops limiter would always allow at least quanta ios per second as cam_iosched_iops_tick() never set ios->l_value1 below 1. Submitted by: Fabian Keil <fk@fabiankeil.de> Obtained from: ElectroBSD PR: 221974	2017-09-22 02:36:32 +00:00
Warner Losh	89d26636f3	cam iosched: Call cam_iosched_limiter_init() after ios->current is set to the default Previously ios->current was set to 0 until the first cam_iosched_cl_maybe_steer() call. PR: 221954 Obtained from: ElectroBSD Submitted by: Fabian Keil Differential Revision: https://reviews.freebsd.org/D12349	2017-09-20 21:26:01 +00:00
Warner Losh	3028dd8dd5	cam iosched: Schedule cam_iosched_ticker() quanta times per second Previously callout_reset() was called with a "ticks" value that was off by one. As a result cam_iosched_ticker() was called a bit too frequently: On systems with hz=1000 a quanta value of 200 resulted in ~250 calls and a value of 100 in ~111 calls. For the "queue_depth" and "bandwidth" limiters the difference doesn't matter but the "iops" limiter depends on the scheduling to enforce the correct maximum. PR: 221956 Obtained from: ElectroBSD Submitted by: Fabian Keil Differential Revision: https://reviews.freebsd.org/D12350	2017-09-20 21:25:56 +00:00
Warner Losh	2d22619adc	cam iosched: Add a handler for the quanta sysctl to enforce valid values. Invalid values can result in division-by-zero panics or other undefined behaviour so let's not allow them. PR: 221957 Obtained from: ElectroBSD Submitted by: Fabian Keil Differential Revision: https://reviews.freebsd.org/D12351	2017-09-20 21:19:53 +00:00
Warner Losh	84c12dcdd0	cam iosched: Use the write queue for BIO_ZONE commands Use the write queue for BIO_ZONE commands so they can't get executed ahead of writes that were sent after them. More generally, since they introduce strong ordering into the list, they need to go to the write queue (which is the only queue that BIO_ORDERED is honored for at the moment). In fact, fix mismatch between queueing and dequeueing code by changing this to queue all non-reads (and non-trims) to the write queue. As a side effect this prevents the kernel message: kernel: Found bio_cmd = 0x9 which cam_iosched_next_bio() emits when finding commands other than BIO_READ in the read queue. PR: 221973 Obtained from: ElectroBSD Submitted by: Fabian Keil Differential Revision: https://reviews.freebsd.org/D12353	2017-09-20 21:13:20 +00:00
Warner Losh	55c770b40a	Update comments on what the CAM_IOSCHED_FLAG_TRIM_ACTIVE means. It's intended only for those situations where the periph driver wants to limit the number of trims active to one and only one. Also update comments on associated functions. Sponsored by: Netflix	2017-09-15 20:15:55 +00:00
Warner Losh	d7fa1ab02d	cam iosched: Limit the quanta default to hz if it's below 200 The cam_iosched_ticker() can't be scheduled more than once per tick. Some limiters depend on quanta matching the number of calls per second to enforce the proper limits. Limit the quanta to no faster than 1 per clock tick. This fixes some features when running in VMs where the default HZ is 100. PR: 221953 Obtained from: ElectroBSD Differential Revision: https://reviews.freebsd.org/D12337 Submitted by: Fabian Keil	2017-09-12 23:46:33 +00:00
Warner Losh	08fc2f23b3	Expand the latency tracking array from 1.024s to 8.192s to help track extreme outliers from dodgy drives. Adjust comments to reflect this, and make sure that the number of latency buckets match in the two places where it matters.	2017-08-24 22:11:10 +00:00
Warner Losh	e4c9cba71f	Fix 32-bit overflow on latency measurements o Allow I/O scheduler to gather times on 32-bit systems. We do this by shifting the sbintime_t over by 8 bits and truncating to 32-bits. This gives us 8.24 time. This is sufficient both in range (256 seconds is about 128x what current users need) and precision (60ns easily meets the 1ms smallest bucket size measurements). 64-bit systems are unchanged. Centralize all the time math so it's easy to tweak the range / precision tradeoffs in the future. o While I'm here, the I/O scheduler should be using periph_data rather than sim_data since it is operating on behalf of the periph. Differential Review: https://reviews.freebsd.org/D12119	2017-08-24 22:10:58 +00:00
Ed Maste	0b4060b073	cam iosched: fix typos in comments PR: 220947 Submitted by: Fabian Keil Obtained from: ElectroBSD	2017-08-18 16:38:33 +00:00
Warner Losh	79d80af216	Implement moving SD. From the paper "Incremental calculation of weighted mean and variance" by Tony Finch, February 2009, retrieved from http://people.ds.cam.ac.uk/fanf2/hermes/doc/antiforgery/stats.pdf converted to use shifting.	2017-03-22 19:18:47 +00:00
Warner Losh	e831795b8a	Remove nested #ifdef that can't possibly be false.	2017-01-27 08:30:43 +00:00
Warner Losh	b20c0a07fb	Preening pass to fix up trailing white space and other minor style(9) nits (though I'm sure others remain). MFC After: 3 days	2017-01-25 02:05:08 +00:00
Warner Losh	cf3ec15167	Compute two new metrics. Disk load, the average number of transactions we have queued up normalized to the queue size. Also compute buckets of latency to help compute, in userland, estimates of Median, P90, P95 and P99 values. Sponsored by: Netflix, Inc	2016-09-30 17:49:04 +00:00
Warner Losh	035ec48e55	Tidy up loose ends from Netflix I/O sched rename to dynamic I/O sched. Rename kern.cam.do_netflix_iosched sysctl to kern.cam.do_dynamic_iosched. Approved by: re (kib@)	2016-07-07 20:31:35 +00:00
Warner Losh	df2362478e	Rename CAM_NETFLIX_IOSCHED to CAM_IOSCHED_DYNAMIC to better reflect its nature. Approved by: re Reviewed By: jhb Differential Revision: https://reviews.freebsd.org/D6811	2016-06-23 23:20:58 +00:00
Xin LI	b97b6d27f2	Fix tinderbox LINT build.	2016-04-18 08:24:13 +00:00
Warner Losh	2b5c19f196	Do the intmax_t dance for debug so CAM_NETFLIX_IOSCHED builds on i386. Sponsored by: Netflix, Inc	2016-04-17 21:29:44 +00:00
Warner Losh	5ede5b8cb5	Put function only used by CAM_NETFLIX_IOSCHED under that ifdef.	2016-04-15 05:10:32 +00:00
Warner Losh	ba6c22ce93	Add in missing files from r298002.	2016-04-14 22:13:44 +00:00

47 Commits