MFV r271515:

Add a new tunable/sysctl, vfs.zfs.free_max_blocks, which can be used to
limit how many blocks can be free'ed before a new transaction group is
created.  The default is no limit (infinite), but we should probably have
a lower default, e.g. 100,000.

With this limit, we can guard against the case where ZFS could run out of
memory when destroying large numbers of blocks in a single transaction
group, as the entire DDT needs to be brought into memory.

Illumos issue:
    5138 add tunable for maximum number of blocks freed in one txg

MFC after:	2 weeks
This commit is contained in:
Xin LI 2014-09-13 17:24:56 +00:00
commit be1b14a063

View File

@ -90,6 +90,11 @@ SYSCTL_INT(_vfs_zfs, OID_AUTO, no_scrub_prefetch, CTLFLAG_RWTUN,
&zfs_no_scrub_prefetch, 0, "Disable scrub prefetching");
enum ddt_class zfs_scrub_ddt_class_max = DDT_CLASS_DUPLICATE;
/* max number of blocks to free in a single TXG */
uint64_t zfs_free_max_blocks = UINT64_MAX;
SYSCTL_UQUAD(_vfs_zfs, OID_AUTO, free_max_blocks, CTLFLAG_RWTUN,
&zfs_free_max_blocks, 0, "Maximum number of blocks to free in one TXG");
#define DSL_SCAN_IS_SCRUB_RESILVER(scn) \
((scn)->scn_phys.scn_func == POOL_SCAN_SCRUB || \
@ -1341,6 +1346,9 @@ dsl_scan_free_should_pause(dsl_scan_t *scn)
if (zfs_recover)
return (B_FALSE);
if (scn->scn_visited_this_txg >= zfs_free_max_blocks)
return (B_TRUE);
elapsed_nanosecs = gethrtime() - scn->scn_sync_start_time;
return (elapsed_nanosecs / NANOSEC > zfs_txg_timeout ||
(NSEC2MSEC(elapsed_nanosecs) > zfs_free_min_time_ms &&