illumos/illumos-gate@0f7643c737
https://github.com/illumos/illumos-gate/commit/0f7643c7376dd69a08acbfc9d1d7d548b10c846a
https://www.illumos.org/issues/7090
When write I/Os are issued, they are issued in block order but the ZIO pipeline
will drive them asynchronously through the allocation stage which can result in
blocks being allocated out-of-order. It would be nice to preserve as much of
the logical order as possible.
In addition, the allocations are equally scattered across all top-level VDEVs
but not all top-level VDEVs are created equally. The pipeline should be able to
detect devices that are more capable of handling allocations and should
allocate more blocks to those devices. This allows for dynamic allocation
distribution when devices are imbalanced as fuller devices will tend to be
slower than empty devices.
The change includes a new pool-wide allocation queue which would throttle and
order allocations in the ZIO pipeline. The queue would be ordered by issue
time and offset and would provide an initial amount of allocation work to
each top-level vdev. The allocation logic utilizes a reservation system to
reserve allocations that will be performed by the allocator. Once an
allocation is successfully completed, it is scheduled on a given top-level
vdev. Each top-level vdev maintains a maximum number of allocations that it
can handle (mg_alloc_queue_depth). The pool-wide reserved allocations
(top-levels * mg_alloc_queue_depth) are distributed across the top-level
vdevs' metaslab groups, assigned round-robin across all eligible metaslab
groups to distribute the work. As top-levels complete their work, they receive
additional work from the pool-wide allocation queue until the allocation
queue is emptied.
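
The throttling idea described above can be illustrated with a small sketch.
This is not the ZFS implementation: the class and method names are
hypothetical, the queue is a plain FIFO rather than one ordered by issue time
and offset, and the reservation step is reduced to a per-group in-flight
counter capped by a value standing in for mg_alloc_queue_depth.

```python
from collections import deque


class MetaslabGroup:
    """Hypothetical stand-in for a top-level vdev's metaslab group."""

    def __init__(self, name, max_depth):
        self.name = name
        self.max_depth = max_depth  # plays the role of mg_alloc_queue_depth
        self.in_flight = 0          # allocations currently reserved/issued

    def has_capacity(self):
        return self.in_flight < self.max_depth


class AllocThrottle:
    """Pool-wide queue that hands work to groups round robin (sketch only)."""

    def __init__(self, groups):
        self.groups = groups
        self.queue = deque()  # pending allocations, kept in issue order
        self.rotor = 0        # round-robin cursor over the groups

    def submit(self, io_id):
        self.queue.append(io_id)

    def _next_eligible(self):
        # Walk the groups once, starting at the rotor, and pick the first
        # one that still has a free slot.
        n = len(self.groups)
        for step in range(n):
            group = self.groups[(self.rotor + step) % n]
            if group.has_capacity():
                self.rotor = (self.rotor + step + 1) % n
                return group
        return None

    def dispatch(self):
        """Issue queued allocations until every group is at its limit."""
        issued = []
        while self.queue:
            group = self._next_eligible()
            if group is None:
                break  # all groups are full: the rest stays queued
            group.in_flight += 1
            issued.append((self.queue.popleft(), group.name))
        return issued

    def complete(self, group_name):
        """Release one slot on the named group so it can take more work."""
        for group in self.groups:
            if group.name == group_name:
                group.in_flight -= 1
                return
        raise KeyError(group_name)
```

With two groups of depth 2 and six queued allocations, the first dispatch
issues four allocations alternating between the groups; the remaining two
wait until a completion frees a slot.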
Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: George Wilson <george.wilson@delphix.com>
illumos/illumos-gate@f9eb9fdf19
https://github.com/illumos/illumos-gate/commit/f9eb9fdf196b6ed476e4ffc69cecd8b0da3cb7e7
https://www.illumos.org/issues/6451
Sometimes ztest fails because zdb detects checksum errors. e.g.:
Traversing all blocks to verify checksums and verify nothing leaked ...
zdb_blkptr_cb: Got error 50 reading <71, 47, 0, 8000160>
    DVA0=<0:1cc2000:180000> [L0 other uint64[]] sha256 uncompressed LE
    contiguous unique single size=100000L/100000P birth=271L/271P fill=1
    cksum=c5a3e27d1ed0f894:843bca3a5473c4bf:f76a19b6830a2e4:91292591613a12bf
    -- skipping
zdb_blkptr_cb: Got error 50 reading <71, 47, 0, 800000180>
    DVA0=<0:ce16800:180000> [L0 other uint64[]] sha256 uncompressed LE
    contiguous unique single size=100000L/100000P birth=840L/840P fill=1
    cksum=5d018f3d061e17f3:6d1584784587bf63:2805a74a0ce37369:ba68a214806c7e75
    -- skipping
zdb_blkptr_cb: Got error 50 reading <71, 47, 0, 1000000360>
    DVA0=<0:10d37400:180000> [L0 other uint64[]] sha256 uncompressed LE
    contiguous unique single size=100000L/100000P birth=904L/904P fill=1
    cksum=fa1e11d4138bd14b:86c9488c444473e3:f31e43c72e72e46b:e3446472d1174dba
    -- skipping
zdb_blkptr_cb: Got error 50 reading <71, 47, 0, 400000002c0>
    DVA0=<0:127ef400:180000> [L0 other uint64[]] sha256 uncompressed LE
    contiguous dedup single size=100000L/100000P birth=549L/549P fill=1
    cksum=30e14955ebf13522:66dc2ff8067e6810:4607e750abb9d3b3:6582b8af909fcb58
    -- skipping
zdb_blkptr_cb: Got error 50 reading <657, 5, 0, 1c0>
    DVA0=<0:1a180400:180000> [L0 other uint64[]] fletcher4 uncompressed LE
    contiguous unique single size=100000L/100000P birth=1091L/1091P fill=1
    cksum=a6cf1e50:29b3bd01c57e5:36779b914035db9a:db61cdcf6bec56f0
    -- skipping
The problem is that ztest_fault_inject() can inject multiple faults into the
same block. It is designed such that it can inject errors on all leafs of a
RAID-Z or mirror, but for a given range of offsets, it will only inject errors
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Jorgen Lundman <lundman@lundman.net>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Matthew Ahrens <mahrens@delphix.com>
7147 ztest: ztest_ddt_repair fails with ztest_pattern_match assertion
illumos/illumos-gate@aab8072633
https://github.com/illumos/illumos-gate/commit/aab80726335c76a7cae32c7300890248d73a51e3
https://www.illumos.org/issues/7147
Here's the dbuf we're currently reading:
966f200::dbuf
addr object lvl blkid holds os
966f200 4 0 0 1 ztest/ds_3
966f200::print dmu_buf_t db_data
db_data = 0x9ae0400
0x9ae0400/10J
0x9ae0400: c1c7ced932020d c1c7ced932020d c1c7ced932020d c1c7ced932020d
c1c7ced932020d c1c7ced932020d c1c7ced932020d c1c7ced932020d
c1c7ced932020d c1c7ced932020d
The pattern we're expecting is actually this: a34ae10b5f2db2. If we attempt to
read the block on disk, we find that it matches what ztest_ddt_repair()
would have written:
~c1c7ced932020d=J
ff3e383126cdfdf2
966f200::print dmu_buf_impl_t db_blkptr | ::blkptr
DVA0=<0:71d3c00:800>
[L0 UINT64_OTHER] SHA256 OFF LE contiguous dedup single
size=400L/400P birth=55L/55P fill=1
cksum=18486450d3ce8c6d:75a72f4bbf117b0f:2d3a226314eb5650:2eb0fd68648b1af0
# zdb -U /rpool/tmp/zpool.cache -R ztest 0:71d3c00:800 | head
Found vdev type: mirror
0:71d3c00:800
0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef
000000: ff3e383126cdfdf2 ff3e383126cdfdf2 ...&18>....&18>.
000010: ff3e383126cdfdf2 ff3e383126cdfdf2 ...&18>....&18>.
000020: ff3e383126cdfdf2 ff3e383126cdfdf2 ...&18>....&18>.
000030: ff3e383126cdfdf2 ff3e383126cdfdf2 ...&18>....&18>.
000040: ff3e383126cdfdf2 ff3e383126cdfdf2 ...&18>....&18>.
000050: ff3e383126cdfdf2 ff3e383126cdfdf2 ...&18>....&18>.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: George Wilson <george.wilson@delphix.com>
7086 ztest attempts dva_get_dsize_sync on an embedded blockpointer
illumos/illumos-gate@926549256b
https://github.com/illumos/illumos-gate/commit/926549256b71acd595f69b236779ff6b78fa08ef
https://www.illumos.org/issues/7086
In dbuf_dirty(), we need to grab the dn_struct_rwlock before looking at the
db_blkptr, to prevent it from being changed by syncing context.
Otherwise we may see that ztest got a segfault from this stack:
libzpool.so.1`dva_get_dsize_sync+0x98(872f000, b32b240, fed7811b, 0, b4cda20, 0)
libzpool.so.1`bp_get_dsize+0x60(872f000, b32b240, 0, 97cb780, 9d4c1a8, 0)
libzpool.so.1`dbuf_dirty+0x9b3(ce0a100, 97cb780, 9, fecd2530)
libzpool.so.1`dmu_buf_will_dirty+0xc3(ce0a100, 97cb780, ea293d6c, 1)
libzpool.so.1`zap_lockdir+0x1a0(8aaa3c0, 1, 0, 97cb780, 1, 1)
libzpool.so.1`zap_remove_norm+0x30(8aaa3c0, 1, 0, 8728b10, 0, 97cb780)
libzpool.so.1`zap_remove+0x29(8aaa3c0, 1, 0, 8728b10, 97cb780, a)
ztest_replay_remove+0x225(ea294588, 8728ae8, 0, 38010000, 0, 0)
ztest_remove+0x9f(ea294588, ea293f50, 4, 3)
ztest_object_init+0x78(ea294588, ea293f50, 4e0, 1)
ztest_dmu_object_alloc_free+0x71(ea294588, 13)
ztest_dmu_objset_create_destroy+0x224(80cef08, 13, 0, 805d36c, 9017ad44, 0)
ztest_execute+0x89(a, 807c720, 13, 0)
ztest_thread+0xea(13, 0, 0, 0)
libc.so.1`_thrp_setup+0x88(f0983240)
libc.so.1`_lwp_start(f0983240, 0, 0, 0, 0, 0)
Looking into it a bit, we see that this is an embedded blockpointer, so
BP_GET_NDVAS should have returned 0:
b32b240::blkptr
EMBEDDED [L0 ZAP_OTHER] et=0 LZ4 size=200L/4aP birth=80L
Instead, it looks like another thread is modifying this blockpointer:
b32b240::ugrep | ::whatis
f47a0e0c is in [ stack tid=0x19f ]
ebd6ec40 is in [ stack tid=0x226 ]
ea293bd0 is in [ stack tid=0x244 ]
ea293be4 is in [ stack tid=0x244 ]
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
7072 zfs fails to expand if lun added when os is in shutdown state
illumos/illumos-gate@c39a2aae1e
https://www.illumos.org/issues/7072
upstream:
38733 zfs fails to expand if lun added when os is in shutdown state
DLPX-36910 spares and caches should not display expandable space
DLPX-39262 vdev_disk_open spam zfs_dbgmsg buffer
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: George Wilson <george.wilson@delphix.com>
illumos/illumos-gate@dcbf3bd6a1
https://www.illumos.org/issues/6950
When reading compressed data from disk, the ARC should keep the compressed
block cached and only decompress it when consumers access the block. The
uncompressed data should be short-lived allowing the ARC to cache a much larger
amount of data. The DMU would also maintain a smaller cache of uncompressed
blocks to minimize the impact of decompressing frequently accessed blocks.
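
A rough sketch of the caching arrangement described above, under stated
assumptions: the class name and its methods are hypothetical, zlib stands in
for whatever compression the pool actually uses, and a small LRU dictionary
stands in for the DMU-level cache of uncompressed blocks. It only illustrates
"keep compressed, decompress on access", not the ARC itself.

```python
import zlib
from collections import OrderedDict


class CompressedCache:
    """Hypothetical two-level cache: compressed store plus a small hot set."""

    def __init__(self, hot_capacity):
        self.compressed = {}          # key -> compressed bytes (large cache)
        self.hot = OrderedDict()      # key -> uncompressed bytes (small LRU)
        self.hot_capacity = hot_capacity

    def insert(self, key, compressed_block):
        # Blocks arrive already compressed, as they would from disk.
        self.compressed[key] = compressed_block

    def read(self, key):
        # Frequently accessed blocks are served without decompressing again.
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key]
        # Otherwise decompress on demand and keep the result only briefly.
        data = zlib.decompress(self.compressed[key])
        self.hot[key] = data
        if len(self.hot) > self.hot_capacity:
            self.hot.popitem(last=False)  # evict least recently used
        return data
```

The compressed copy stays cached after the uncompressed copy is evicted, so a
later read costs a decompression rather than a disk access.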
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Don Brady <don.brady@intel.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: George Wilson <george.wilson@delphix.com>
dependent pmap_ts_referenced() so that it updates the page's dirty field
if a modified bit is found while counting reference bits. This
opportunistic update can be performed at low cost and can eliminate the
need for some future calls to pmap_is_modified() by the machine-
independent layer.
MFC after: 3 weeks
FreeBSD always delivers all signals sent with sigqueue, except when
dealing with low memory conditions according to kib (see
bug # 212173 comment # 5).
In collaboration with: kib
PR: 212173
Sponsored by: EMC / Isilon Storage Division
Although cryptotest itself has a -z mode to test all algorithms at a variety
of sizes, this script allows us to be more selective. Threads and buffer sizes
move in powers of two, starting from 1 for threads and 256 for buffer sizes.
e.g. cryptorun.sh aes 4 512
Test aes with 1, 2 and 4 processes, and at sizes of 256 and 512 bytes.
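
The sweep implied by the example can be enumerated with a short sketch. This
is not the script itself: the function names are hypothetical, and the
nesting order (threads outer, sizes inner) is an assumption.

```python
def powers_of_two(start, limit):
    """Return start, 2*start, 4*start, ... up to and including limit."""
    values = []
    value = start
    while value <= limit:
        values.append(value)
        value *= 2
    return values


def sweep(max_threads, max_size):
    """List the (threads, buffer size) pairs a run would cover."""
    return [
        (threads, size)
        for threads in powers_of_two(1, max_threads)   # threads start at 1
        for size in powers_of_two(256, max_size)       # sizes start at 256
    ]
```

For the arguments in the example, sweep(4, 512) yields the six combinations
of 1, 2 and 4 threads with 256 and 512 byte buffers.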
Sponsored by: Rubicon Communications, LLC (Netgate)
Make dhclient set interface MTU if it was provided.
This version implements MTU setting in dhclient itself before it runs
dhclient-script.
PR: 206721
Submitted by: novel@
Reported by: Jarrod Petz <jlpetz at gmail.com>
Reviewed by: cem, allanjude
Differential Revision: https://reviews.freebsd.org/D5675
this library. Sticking to 'libifconfig' (and 'ifconfig_' as function prefix)
should reduce chances of namespace collisions, make it more clear what the
library does, and be more in line with existing libraries.
Submitted by: Marie Helene Kvello-Aune <marieheleneka@gmail.com>
Differential Revision: https://reviews.freebsd.org/D7742
Reviewed by: cem, kp
will allow drivers that manage the clock frequency to communicate this with
the rest of the kernel.
Reported by: jmcneill
MFC after: 1 week
Sponsored by: ABT Systems Ltd
ISOCHRONOUS USB transfers. Make sure enough length and buffer pointers
are allocated when setting up the libusb transfer structure to support
the maximum number of frames the kernel can handle.
MFC after: 1 week
Previously cron had its own maximum username length limit, which was
smaller than the system's MAXLOGNAME. This could lead to crontab -u
updating the wrong user's crontab (if the name was truncated, and
matched another user).
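
The failure mode can be shown with a minimal sketch. The function, the user
names, and the length limits below are all hypothetical and chosen only for
illustration; they are not cron's code or the real system limits.

```python
def resolve_user(requested, known_users, max_len):
    """Look up a user after silently truncating the name to max_len."""
    candidate = requested[:max_len]  # truncation happens before the lookup
    return candidate if candidate in known_users else None


# Two distinct accounts where one name is a prefix of the other.
users = {"backupoperator", "backupoperator_staging"}

# With a limit shorter than the longer name, the lookup lands on the
# wrong account instead of failing.
wrong = resolve_user("backupoperator_staging", users, 14)

# With a limit that fits the full name, the intended account is found.
right = resolve_user("backupoperator_staging", users, 32)
```

Here wrong is "backupoperator" while right is "backupoperator_staging", which
is why a private limit smaller than the system's is dangerous.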
PR: 212305
Reported by: Andrii Kuzik
Reviewed by: allanjude, jilles
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D7747
for Chinese locales.
As mentioned in the commit message of r289041, nl_langinfo(ABMON_*) only
returned numbers when using a Chinese locale. This causes problems in
applications that put the short month name and the day of the month together.
Spotted by: Ting-Wei Lan <lantw44 gmail com>
transfers.
The Initiator and Target both perform zero copy receive for transfers
greater than or equal to this threshold.
Sponsored by: Chelsio Communications