freebsd-nq/sys
Josh Paetzel f2be81e92c MFV 312436
6569 large file delete can starve out write ops

  illumos/illumos-gate@ff5177ee8b

  https://www.illumos.org/issues/6569
    The core issue I've found is that there is no throttle for how many
    deletes get assigned to one TXG. As a result, when deleting large files
    we end up filling consecutive TXGs with deletes/frees, which then
    write-throttles other (more important) ops.

    There is an easy test case for this problem. Try deleting several
    large files (at least 1/2 TB) while you do write ops on the same
    pool. What we've seen is performance of these write ops (let's
    call it sideload I/O) would drop to zero.

    More specifically the problem is that dmu_free_long_range_impl()
    can/will fill up all of the dirty data in the pool "instantly",
    before many of the sideload ops can get in. So sideload
    performance will be impacted until all the files are freed.

    The solution we have tested at Nexenta (with positive results)
    creates a relatively simple throttle for how many "free" ops we let
    into one TXG.
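
The throttle idea described above can be sketched in user-space C. This is a minimal illustration under stated assumptions, not the code from the illumos change: the type `txg_free_budget_t`, the functions `budget_init()`, `budget_admit()` and `free_long_range()`, and the idea of a fixed per-TXG cap counted in "free ops" are all hypothetical names and simplifications. It only shows how capping the frees admitted to each TXG spreads one large delete across several TXGs, leaving room in every TXG for other writes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-TXG accounting; these are NOT the real ZFS structures. */
typedef struct txg_free_budget {
	uint64_t txg;          /* transaction group this budget belongs to */
	uint64_t frees_queued; /* free ops admitted to this TXG so far */
	uint64_t frees_limit;  /* assumed cap on free ops per TXG */
} txg_free_budget_t;

/* Start a fresh budget for one TXG. */
static void
budget_init(txg_free_budget_t *b, uint64_t txg, uint64_t limit)
{
	assert(limit > 0); /* a zero cap would never make progress */
	b->txg = txg;
	b->frees_queued = 0;
	b->frees_limit = limit;
}

/*
 * Admit up to 'want' free ops into this TXG.
 * Returns how many were actually admitted (0 once the cap is reached).
 */
static uint64_t
budget_admit(txg_free_budget_t *b, uint64_t want)
{
	uint64_t room = b->frees_limit - b->frees_queued;
	uint64_t take = (want < room) ? want : room;

	b->frees_queued += take;
	return (take);
}

/*
 * Free 'nblocks' blocks starting at 'first_txg', never putting more than
 * 'limit' frees into a single TXG. Returns the number of TXGs consumed.
 */
static uint64_t
free_long_range(uint64_t first_txg, uint64_t nblocks, uint64_t limit)
{
	uint64_t txg = first_txg;
	uint64_t txgs_used = 0;

	while (nblocks > 0) {
		txg_free_budget_t b;

		budget_init(&b, txg, limit);
		nblocks -= budget_admit(&b, nblocks);
		txgs_used++;
		txg++; /* the remainder waits for the next TXG */
	}
	return (txgs_used);
}
```

With an assumed cap of 100 frees per TXG, calling `free_long_range(100, 250, 100)` spreads the 250 frees over three consecutive TXGs rather than placing them all in the first one.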

    However this solution exposes other problems that should also be
    addressed. If we are to slow down freeing of data, that means one
    has to wait even longer (assuming a vnode ref count of 1) to get the
    shell back after an rm, or for an NFS thread to finish the freeing op.
    To avoid this the proposed solution is to call zfs_inactive() async
    for "large" files. Async freeing then begs for the reclaimed space
    to be accounted for in the zpool's "freeing" prop.

    The other issue with having a longer delete is the inability to
    export/unmount for a longer period of time. The proposed solution
    is to interrupt freeing of blocks when a fs is unmounted.

  Author: Alek Pinchuk <alek@nexenta.com>
  Reviewed by: Matt Ahrens <mahrens@delphix.com>
  Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
  Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
  Approved by: Dan McDonald <danmcd@omniti.com>

Reviewed by:	avg
Differential Revision:	D9008
2017-01-20 15:01:04 +00:00
amd64 vmm_dev: work around a bogus error with gcc 6.3.0 2017-01-20 13:21:27 +00:00
arm Handle the set capabilities ioctl, letting the hardware checksum be 2017-01-19 14:58:55 +00:00
arm64 Catch up with changes to structure member names. 2017-01-17 22:05:52 +00:00
boot Remove empty ranges property so beri_simplebus can be attached again. 2017-01-18 14:41:59 +00:00
bsm
cam Remove writing 'residual' field of struct ctl_scsiio. 2017-01-17 18:32:47 +00:00
cddl MFV 312436 2017-01-20 15:01:04 +00:00
compat Catch up with changes to structure member names. 2017-01-17 22:05:52 +00:00
conf mppc - Finish pluging NETGRAPH_MPPC_COMPRESSION. 2017-01-20 00:02:11 +00:00
contrib Merge ACPICA 20170119. 2017-01-19 22:07:21 +00:00
crypto libmd: add noexec stack annotation in skein_block_asm.s 2017-01-07 19:26:25 +00:00
ddb Revert r311952. 2017-01-14 22:06:25 +00:00
dev Make draining a sendqueue more robust. 2017-01-20 12:02:40 +00:00
fs Remove mistakenly merged field. 2017-01-19 20:03:26 +00:00
gdb
geom Report disk addition errors on add or create subcommand. 2017-01-20 13:49:04 +00:00
gnu Add Ingenic X1000 DTS files (unofficial). 2016-11-19 15:03:49 +00:00
i386 Catch up with changes to structure member names. 2017-01-17 22:05:52 +00:00
isa
kern ANSYfy kern_ktrace.c and remove archaic register keyword 2017-01-20 14:59:56 +00:00
kgssapi
libkern libkern: Remove obsolete 'register' keyword 2017-01-12 17:02:29 +00:00
mips [ar71xx] add EARLY_PRINTF support for the rest of the non-AR933x SoCs. 2017-01-15 06:35:00 +00:00
modules Use SRCTOP-relative paths to other directories instead of .CURDIR-relative ones 2017-01-20 05:45:07 +00:00
net Implement kernel support for hardware rate limited sockets. 2017-01-18 13:31:17 +00:00
net80211 [net80211] allow for MCS16-23 to be statically configured. 2017-01-20 07:43:40 +00:00
netgraph mppc - Finish pluging NETGRAPH_MPPC_COMPRESSION. 2017-01-20 00:02:11 +00:00
netinet Implement kernel support for hardware rate limited sockets. 2017-01-18 13:31:17 +00:00
netinet6 Implement kernel support for hardware rate limited sockets. 2017-01-18 13:31:17 +00:00
netipsec Add direction argument to ipsec_setspidx_inpcb() function. 2017-01-08 12:40:07 +00:00
netnatm
netpfil Initialize IPFW static rules rmlock with RM_RECURSE flag. 2017-01-17 10:50:28 +00:00
netsmb
nfs
nfsclient
nfsserver
nlm
ofed Move the ConnectX-3 and ConnectX-2 driver from sys/ofed into sys/dev/mlx4 2016-09-30 08:23:06 +00:00
opencrypto Add support for the fpu_kern(9) KPI on arm64. It hooks into the existing 2016-10-20 09:22:10 +00:00
pc98 Add a COMPAT_FREEBSD11 kernel option. 2016-12-09 18:54:12 +00:00
powerpc Use the explicit expanded form of cmp. 2017-01-18 03:42:21 +00:00
riscv Disable superpages reservations as we don't have implemented them yet. 2016-11-21 12:00:31 +00:00
rpc
security Audit 'fd' and 'cmd' arguments to fcntl(2), and when generating BSM, 2016-11-22 00:41:24 +00:00
sparc64 Trim a few comments on platforms that did not implement mmap of /dev/kmem. 2017-01-13 21:52:53 +00:00
sys Adjust gtaskqueue startup again so that we catch the !SMP case and 2017-01-19 19:58:08 +00:00
teken
tests
tools Replace using of objdump with elfdump 2017-01-10 18:46:40 +00:00
ufs ffs_vnops: Simplify extattr access 2017-01-19 16:46:05 +00:00
vm Avoid unnecessary page lookups in vm_object_madvise(). 2017-01-15 03:50:08 +00:00
x86 "Buses" is the preferred plural of "bus" 2017-01-15 17:54:01 +00:00
xdr
xen "Buses" is the preferred plural of "bus" 2017-01-15 17:54:01 +00:00
Makefile