7386 zfs get does not work properly with bookmarks
illumos/illumos-gate@edb901aab9edb901aab9https://www.illumos.org/issues/7386
The zfs get command does not work with the bookmark parameter while it works
properly with both filesystem and snapshot:
# zfs get -t all -r creation rpool/test
NAME PROPERTY VALUE SOURCE
rpool/test creation Fri Sep 16 15:00 2016 -
rpool/test@snap creation Fri Sep 16 15:00 2016 -
rpool/test#bkmark creation Fri Sep 16 15:00 2016 -
# zfs get -t all -r creation rpool/test@snap
NAME PROPERTY VALUE SOURCE
rpool/test@snap creation Fri Sep 16 15:00 2016 -
# zfs get -t all -r creation rpool/test#bkmark
cannot open 'rpool/test#bkmark': invalid dataset name
#
The zfs get command should be modified to work properly with bookmarks too.
Reviewed by: Simon Klinkert <simon.klinkert@gmail.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Marcel Telka <marcel@telka.sk>
illumos/illumos-gate@8363e80ae7https://github.com/illumos/illumos-gate/commit/8363e80ae72609660f6090766ca8c2c18https://www.illumos.org/issues/7303
This change introduces a new weighting algorithm to improve metaslab selection
.
The new weighting algorithm relies on the SPACEMAP_HISTOGRAM feature. As a res
ult,
the metaslab weight now encodes the type of weighting algorithm used
(size-based vs segment-based).
This also introduce a new allocation tracing facility and two new dcmds to hel
p
debug allocation problems. Each zio now contains a zio_alloc_list_t structure
that is populated as the zio goes through the allocations stage. Here's an
example of how to use the tracing facility:
> c5ec000::print zio_t io_alloc_list | ::walk list | ::metaslab_trace
MSID DVA ASIZE WEIGHT RESULT VDEV
- 0 400 0 NOT_ALLOCATABLE ztest.0a
- 0 400 0 NOT_ALLOCATABLE ztest.0a
- 0 400 0 ENOSPC ztest.0a
- 0 200 0 NOT_ALLOCATABLE ztest.0a
- 0 200 0 NOT_ALLOCATABLE ztest.0a
- 0 200 0 ENOSPC ztest.0a
1 0 400 1 x 8M 17b1a00 ztest.0a
> 1ff2400::print zio_t io_alloc_list | ::walk list | ::metaslab_trace
MSID DVA ASIZE WEIGHT RESULT VDEV
- 0 200 0 NOT_ALLOCATABLE mirror-2
- 0 200 0 NOT_ALLOCATABLE mirror-0
1 0 200 1 x 4M 112ae00 mirror-1
- 1 200 0 NOT_ALLOCATABLE mirror-2
- 1 200 0 NOT_ALLOCATABLE mirror-0
1 1 200 1 x 4M 112b000 mirror-1
- 2 200 0 NOT_ALLOCATABLE mirror-2
If the metaslab is using segment-based weighting then the WEIGHT column will
display the number of segments available in the bucket where the allocation
attempt was made.
Author: George Wilson <george.wilson@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Chris Siden <christopher.siden@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Don Brady <don.brady@intel.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
6428 set canmount=off on unmounted filesystem tries to unmount children
This is a cosmetic and bookkeeping change as the actual change is
already in FreeBSD.
See r297521, r304520, r308985.
cddl: normalize paths using SRCTOP-relative paths or :H when possible
This simplifies make logic/output
While here, remove bogus CFLAGS which look for headers in cddl/lib/libumem.
There aren't any source files there (just Makefiles)
r320165:
devd(8): Remove pidfile on shutdown
Sponsored by: Spectra Logic Corp
r320166:
Require devd to be running for its ATF tests to run
The ATF tests communicate with the system's running devd
PR: 220169
Reported by: gjb
Sponsored by: Spectra Logic Corp
r320167:
zfsd(8): Remove pidfile on shutdown
Sponsored by: Spectra Logic Corp
MFV 316855
7900 zdb shouldn't print the path of a znode at verbosity < 5
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Alan Somers <asomers@freebsd.org>
illumos/illumos-gate@e548d2fa41https://www.illumos.org/issues/7900
Sponsored by: Spectra Logic Corp
8046 Let calloc() do the multiplication in libzfs_fru_refresh
5697e03e6ehttps://www.illumos.org/issues/8046
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Pedro Giffuni <pfg@freebsd.org>
Fix DTrace TCP tracepoints to not use mtod() as it is both unnecessary and
dangerous. Those wanting data from an mbuf should use DTrace itself to get
the data.
Add an mbuf to ipinfo_t translator to finish cleanup of mbuf passing to TCP probes.
After some ZIL changes 6 years ago zil_slog_limit got partially broken
due to zl_itx_list_sz not updated when async itx'es upgraded to sync.
Actually because of other changes about that time zl_itx_list_sz is not
really required to implement the functionality, so this patch removes
some unneeded broken code and variables.
Original idea of zil_slog_limit was to reduce chance of SLOG abuse by
single heavy logger, that increased latency for other (more latency critical)
loggers, by pushing heavy log out into the main pool instead of SLOG. Beside
huge latency increase for heavy writers, this implementation caused double
write of all data, since the log records were explicitly prepared for SLOG.
Since we now have I/O scheduler, I've found it can be much more efficient
to reduce priority of heavy logger SLOG writes from ZIO_PRIORITY_SYNC_WRITE
to ZIO_PRIORITY_ASYNC_WRITE, while still leave them on SLOG.
Existing ZIL implementation had problem with space efficiency when it
has to write large chunks of data into log blocks of limited size. In some
cases efficiency stopped to almost as low as 50%. In case of ZIL stored on
spinning rust, that also reduced log write speed in half, since head had to
uselessly fly over allocated but not written areas. This change improves
the situation by offloading problematic operations from z*_log_write() to
zil_lwb_commit(), which knows real situation of log blocks allocation and
can split large requests into pieces much more efficiently. Also as side
effect it removes one of two data copy operations done by ZIL code WR_COPIED
case.
While there, untangle and unify code of z*_log_write() functions.
Also zfs_log_write() alike to zvol_log_write() can now handle writes crossing
block boundary, that may also improve efficiency if ZPL is made to do that.
Sponsored by: iXsystems, Inc.
Fix an unchecked return value in zfsd
It's pretty unlikely to actually hit this, but good to check it anyway
Reported by: Coverity
CID: 1362018
MFC after: 4 weeks
Sponsored by: Spectra Logic Corp
Convert ipv4_flags and ipv4_offset fields into host byte order.
Also save only high bits in the ipv4_flags, because it is defined
as uint8_t. So now it will show DF and MF flags as 0x40 and 0x20.
On FreeBSD there is a setsockopt option SO_USER_COOKIE which allows
setting a 32 bit value on each socket. This can be used by applications
and DTrace as a rendezvous point so that an applicaton's data can
more easily be captured at run time. Expose the user cookie via
DTrace by updating the translator in tcp.d and add a quick test
program, a TCP server, that sets the cookie on each connection
accepted.
Sponsored by: Limelight Networks
illumos/illumos-gate@f9eb9fdf19https://github.com/illumos/illumos-gate/commit/f9eb9fdf196b6ed476e4ffc69cecd8b0d
a3cb7e7
https://www.illumos.org/issues/6451
Sometimes ztest fails because zdb detects checksum errors. e.g.:
Traversing all blocks to verify checksums and verify nothing leaked ...
zdb_blkptr_cb: Got error 50 reading <71, 47, 0, 8000160> DVA0=<0:1cc2000:
180000> [L0 other uint64[]] sha256 uncompressed LE contiguou
s unique single size=100000L/100000P birth=271L/271P fill=1
cksum=c5a3e27d1ed0f894:843bca3a5473c4bf:f76a19b6830a2e4:91292591613a12bf --
skipping
zdb_blkptr_cb: Got error 50 reading <71, 47, 0, 800000180> DVA0=<0:ce16800:
180000> [L0 other uint64[]] sha256 uncompressed LE contigu
ous unique single size=100000L/100000P birth=840L/840P fill=1
cksum=5d018f3d061e17f3:6d1584784587bf63:2805a74a0ce37369:ba68a214806c7e75
-- skipping
zdb_blkptr_cb: Got error 50 reading <71, 47, 0, 1000000360> DVA0=<0:10d37400:
180000> [L0 other uint64[]] sha256 uncompressed LE conti
guous unique single size=100000L/100000P birth=904L/904P fill=1
cksum=fa1e11d4138bd14b:86c9488c444473e3:f31e43c72e72e46b:e3446472d1174d
ba -- skipping
zdb_blkptr_cb: Got error 50 reading <71, 47, 0, 400000002c0> DVA0=<0:127ef400:
180000> [L0 other uint64[]] sha256 uncompressed LE cont
iguous dedup single size=100000L/100000P birth=549L/549P fill=1
cksum=30e14955ebf13522:66dc2ff8067e6810:4607e750abb9d3b3:6582b8af909fcb
58 -- skipping
zdb_blkptr_cb: Got error 50 reading <657, 5, 0, 1c0> DVA0=<0:1a180400:180000>
[L0 other uint64[]] fletcher4 uncompressed LE contiguou
s unique single size=100000L/100000P birth=1091L/1091P fill=1 cksum=a6cf1e50:
29b3bd01c57e5:36779b914035db9a:db61cdcf6bec56f0 -- skippin
g
The problem is that ztest_fault_inject() can inject multiple faults into the
same block. It is designed such that it can inject errors on all leafs of a
RAID-Z or mirror, but for a given range of offsets, it will only inject errors
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Jorgen Lundman <lundman@lundman.net>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Matthew Ahrens <mahrens@delphix.com>
7147 ztest: ztest_ddt_repair fails with ztest_pattern_match assertion
illumos/illumos-gate@aab8072633https://github.com/illumos/illumos-gate/commit/aab80726335c76a7cae32c7300890248d
73a51e3
https://www.illumos.org/issues/7147
Here's the dbuf we're currently reading:
966f200::dbuf
addr object lvl blkid holds os
966f200 4 0 0 1 ztest/ds_3
966f200::print dmu_buf_t db_data
db_data = 0x9ae0400
0x9ae0400/10J
0x9ae0400: c1c7ced932020d c1c7ced932020d c1c7ced932020d c1c7ced932020d
c1c7ced932020d c1c7ced932020d c1c7ced932020d c1c7ced932020d
c1c7ced932020d c1c7ced932020d
The pattern we're expecting is actually this: a34ae10b5f2db2. If we attempt to
read the block on disk we find that it has matches what ztest_ddt_repair()
would have written:
~c1c7ced932020d=J
ff3e383126cdfdf2
966f200::print dmu_buf_impl_t db_blkptr | ::blkptr
DVA0=<0:71d3c00:800>
[L0 UINT64_OTHER] SHA256 OFF LE contiguous dedup single
size=400L/400P birth=55L/55P fill=1
cksum=18486450d3ce8c6d:75a72f4bbf117b0f:2d3a226314eb5650:2eb0fd68648b1af0
1. zdb -U /rpool/tmp/zpool.cache -R ztest 0:71d3c00:800 | head
Found vdev type: mirror
0:71d3c00:800
0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef
000000: ff3e383126cdfdf2 ff3e383126cdfdf2 ...&18>....&18>.
000010: ff3e383126cdfdf2 ff3e383126cdfdf2 ...&18>....&18>.
000020: ff3e383126cdfdf2 ff3e383126cdfdf2 ...&18>....&18>.
000030: ff3e383126cdfdf2 ff3e383126cdfdf2 ...&18>....&18>.
000040: ff3e383126cdfdf2 ff3e383126cdfdf2 ...&18>....&18>.
000050: ff3e383126cdfdf2 ff3e383126cdfdf2 ...&18>....&18>.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: George Wilson <george.wilson@delphix.com>
illumos/illumos-gate@dcbf3bd6a1dcbf3bd6a1https://www.illumos.org/issues/6950
When reading compressed data from disk, the ARC should keep the compressed
block cached and only decompress it when consumers access the block. The
uncompressed data should be short-lived allowing the ARC to cache a much larger
amount of data. The DMU would also maintain a smaller cache of uncompressed
blocks to minimize the impact of decompressing frequently accessed blocks.
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Don Brady <don.brady@intel.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: George Wilson <george.wilson@delphix.com>
illumos/illumos-gate@759e89be35https://github.com/illumos/illumos-gate/commit/759e89be359f2af635e4122d147df56bc
e948773
https://www.illumos.org/issues/6447
I got a patch from someone who uses nvpair code outside of illumos. It fixes a
couple of gcc warnings/bugs for him.
1. silence uninitialized use warnings
2. add parentheses around assignment used as truth value
3. fix printf format specifier (ll is for integers only)
4. strstr, strspn, strcspn, and strcmp are declared in string.h, not
strings.h.
5. avoid scanning integer into boolean variable
Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Reviewed by: Andy Stormont <astormont@racktopsystems.com>
Reviewed by: Garrett D'Amore <garrett@damore.org>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Steve Dougherty <sdougherty@barracuda.com>
illumos/illumos-gate@9adfa60d48https://github.com/illumos/illumos-gate/commit/9adfa60d484ce2435f5af77cc99dcd4e6
92b6660
https://www.illumos.org/issues/6314
Callers of dsl_dataset_name pass a buffer of size ZFS_MAXNAMELEN, but
dsl_dataset_name copies the datasets' name PLUS the snapshot name to it,
resulting in a max of 2 * ZFS_MAXNAMELEN + '@'.
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Matthew Ahrens <mahrens@delphix.com>
6872 zfs libraries should not allow uninitialized variables
illumos/illumos-gate@f83b46baf9https://github.com/illumos/illumos-gate/commit/f83b46baf98d276f5f84fa84c8b461f41
2ac1f5e
https://www.illumos.org/issues/6872
We compile the zfs libraries with -Wno-uninitialized. We should remove
this. Change makefiles, fix new warnings, fix pbchk errors.
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Paul Dagnelie <pcd@delphix.com>
4521 zfstest is trying to execute evil "zfs unmount -a"
illumos/illumos-gate@8808ac5daehttps://github.com/illumos/illumos-gate/commit/8808ac5dae118369991f158b6ab736cb2
691ecde
https://www.illumos.org/issues/4521
zfstest is trying to execute evil "zfs unmount -a", which fails (fortunately,
as it would otherwise leave me with my ~ missing):
03:44:11.86 cannot unmount '/export/home/yuri': Device busy cannot unmount '/
export/home': Device busy
03:44:11.86 ERROR: /usr/sbin/zfs unmount -a exited 1
This affects, at least, zfs_mount_009_neg and zfs_mount_all_001_pos, both
failing on that step. The pool containing the /export/home hierarchy is
included in KEEP variable, but it doesn't seem to affect anything here.
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Yuri Pankov <yuri.pankov@nexenta.com>
6879 incorrect endianness swap for drr_spill.drr_length in libzfs_sendrecv.c
illumos/illumos-gate@20fea7a474https://github.com/illumos/illumos-gate/commit/20fea7a47472aceb64d3ed48cc2a3ea26
8bc4795
https://www.illumos.org/issues/6879
In libzfs_sendrecv, there's a typo:
case DRR_SPILL:
if (byteswap) {
drr->drr_u.drr_write.drr_length =
BSWAP_64(drr->drr_u.drr_spill.drr_length);
}
Instead of drr_write.drr_length, we should be assigning the result of the
byteswap to drr_spill.drr_length.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Dan Kimmel <dan.kimmel@delphix.com>
6111 zfs send should ignore datasets created after the ending snapshot
illumos/illumos-gate@4a20c933b1https://github.com/illumos/illumos-gate/commit/4a20c933b148de8a1c1d3538391c64284
e636653
https://www.illumos.org/issues/6111
If you create a zfs child folder, zfs send returns an error when a recursive
incremental send is done between two snapshots made prior to the folder
creation.
The problem can be reproduced with the following steps.
root@zfs:/# zfs create pool/test
root@zfs:/# zfs snapshot pool/test@snap1
root@zfs:/# zfs snapshot pool/test@snap2
root@zfs:/# zfs create pool/test/child
root@zfs:/# zfs send -R -I pool/test@snap1 pool/test@snap2 > /dev/null
WARNING: could not send pool/test/child@snap2: does not exist
WARNING: could not send pool/test/child@snap2: does not exist
root@zfs:/# echo $?
1
root@zfs:/# zfs snapshot -r pool/test@snap3
root@zfs:/# zfs send -R -I pool/test@snap1 pool/test@snap3 > /dev/null
root@zfs:/# echo $?
0
root@zfs:/# zfs send -R -I pool/test@snap2 pool/test@snap3 > /dev/null
root@zfs:/# echo $?
0
Since pool/test/child was created after snap2, zfs send should not expect snap2
to be in pool/test/child when doing a recursive send. It should examine the
compare the creation time of the snapshot and each child folder to decide if
the folder will be sent. The next incremental send between snap2 and snap3
would properly create the child folder and snap3 which first appears in the
child folder.
The problem is identical if '-i' is used instead of '-I'.
Reviewed by: Alex Aizman alex.aizman@nexenta.com
Reviewed by: Alek Pinchuk alek.pinchuk@nexenta.com
Reviewed by: Roman Strashkin roman.strashkin@nexenta.com
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>
Author: Alex Deiter <alex.deiter@nexenta.com>
6876 Stack corruption after importing a pool with a too-long name
illumos/illumos-gate@c971037baac971037baahttps://www.illumos.org/issues/6876
Calling dsl_dataset_name on a dataset with a 256 byte buffer is asking for
trouble. We should check every dataset on import, using a 1024 byte buffer and
checking each time to see if the dataset's new name is longer than 256 bytes.
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Paul Dagnelie <pcd@delphix.com>
6902 speed up listing of snapshots if requesting name only and sorting by name
This was our change from the beginning, so just reduce the upstream diff.
r305018:
Use SRCTOP instead of a homegrown definition for it (SRCDIR)
r305019:
Remove unnecessary variable (SRCDIR) replaced by SRCTOP in Makefile.common
r305020:
Remove redundant declarations and simplify ../ in pathing
- TESTSBASE and LOCALBASE are already defined in bsd.tests.mk
- TESTSDIR is automatically divined as ${TESTSBASE}${RELDIR:H} after
r289158.
- Replace SRCDIR with SRCTOP
Unlike Solaris, in FreeBSD p_args can be 0 so check for that
instead of walking down to ar_args blindly.
Reported by: Amanda Strnad
Reviewed by: markj, jhb
Sponsored by: DARPA, AFRL
Add an empty virtual destructor to zfsd's Vdev class. This is needed
because the class has virtual functions, and the compiler-generated
default destructor is non-virtual.
Reviewed by: asomers
MFC r305016:
Fix the zfsd unittest:
* TESTSDIR is supposed to be under cddl/usr.sbin, not cddl/sbin
* DevdCtl::EventBuffer no longer exists, so remove its forward
declaration
Cast result from third parameter to int instead of promoting it to size_t
This resolves a -Wformat issue when the value is used as a format width
precision specifier, i.e. %*s
Highball memory requirement (4GB) with common/{raise,safety}
Both test suites require more memory than my amd64 VM using
GENERIC-NODEBUG can provide and reliably panic it with OOM issues in
dtrace(4).
Some of the testcases fail, but this at least bypasses the panic behavior
on platforms that don't have enough resources
Discussed with: markj