Commit Graph

1123 Commits

Author SHA1 Message Date
Alan Somers
d727d8b0b1 Optimize zfsd for the happy case
If there are no damaged pools, then ignore all GEOM events.  We only use
them to fix damaged pools.  However, still pay attention to ZFS events.

MFC after:	20 days
X-MFC-With:	329284
Sponsored by:	Spectra Logic Corp
2018-02-15 21:30:30 +00:00
Devin Teske
85ef133db3 Add the following errno definitions to /usr/lib/dtrace/errno.d
ENOTCAPABLE (93)
ECAPMODE (94)
ENOTRECOVERABLE (95)
EOWNERDEAD (96)
ERELOOKUP (-5)

and update ELAST (92 -> 96)

Reviewed by:	markj
Sponsored by:	Smule, Inc.
Differential Revision:	https://reviews.freebsd.org/D14340
2018-02-15 18:37:32 +00:00
Alan Somers
a07d59d1da zfsd: Allow zfsd to work on any type of GEOM provider
cddl/usr.sbin/zfsd/zfsd_event.cc
	Remove the check for da and ada devices.  This way zfsd can work on md,
	geli, glabel, gstripe, etc devices.  geli in particular is useful
	combined with ZFS.  gnop is also useful for simulating drive pulls in
	the ZFSD test suite.

	Also, eliminate the DevfsEvent class entirely.  Move its
	responsibilities into GeomEvent.  We can get everything we need to know
	just from listening to GEOM events.

lib/libdevdctl/event.cc
	Fix GeomEvent::DevName for CREATE events.  Oddly, the relevant field is
	named "cdev" for CREATE events but "devname" for disk events.

MFC after:	3 weeks
Relnotes:	Yes (probably worth mentioning the geli part)
Sponsored by:	Spectra Logic Corp
2018-02-14 23:52:39 +00:00
Devin Teske
980b68eaa2 Use tabs in io.d, fix alignment issues, remove extraneous newlines 2018-02-12 23:53:38 +00:00
Alan Somers
ad4bbe575b Fix "zpool add" crash when a replacing vdev has a spare child
Fix an assertion in zpool that causes a crash when running any "zpool add"
command on a spare that contains a replacing vdev with a spare child.

This likely affects Illumos, too.

PR:		225546
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
Differential Revision:	https://reviews.freebsd.org/D14138
2018-02-09 16:08:57 +00:00
Alan Somers
af63bbd0f0 zfsd: Don't spare a vdev that's being replaced
If a zfs pool contains a replacing vdev (either created manually by "zpool
replace" or by zfsd(8) via autoreplace by physical path) and then new spares
get added to the pool, zfsd shouldn't use one to replace the drive that is
already being replaced.  That's a waste of resources that just slows down
the rebuild.

PR:		225547
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
2018-01-30 21:25:43 +00:00
Mark Johnston
3ddcfe1e46 Remove uneeded parentheses.
MFC after:	1 week
2018-01-25 15:31:56 +00:00
Alexander Motin
0d6993dfd3 MFV r328255: 8972 zfs holds: In scripted mode, do not pad columns with spaces
illumos/illumos-gate@e9b7d6e7f7

https://www.illumos.org/issues/8972:
'zfs holds -H' does not properly output content in scripted mode. It uses a
tab instead of two spaces, but it still pads column widths with spaces when
it should not.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Allan Jude <allanjude@freebsd.org>
2018-01-22 06:00:45 +00:00
Alexander Motin
4eb2803697 MFV r328251: 8652 Tautological comparisons with ZPROP_INVAL
illumos/illumos-gate@4ae5f5f06c

https://www.illumos.org/issues/8652:
Clang and GCC prefer to use unsigned ints to store enums. With Clang, that
causes tautological comparison warnings when comparing a zfs_prop_t or
zpool_prop_t variable to the macro ZPROP_INVAL. It's likely that error
handling code is being silently removed as a result.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: Alan Somers <asomers@gmail.com>
2018-01-22 05:52:39 +00:00
Alexander Motin
cf52647c41 MFV r328249:
8641 "zpool clear" and "zinject" don't work on "spare" or "replacing" vdevs

illumos/illumos-gate@2ba5f978a4

https://www.illumos.org/issues/8641:
"zpool clear" and "zinject -d" can both operate on specific vdevs, either
leaf or interior. However, due to an oversight, neither works on a "spare"
or "replacing" vdev. For example:

sudo zpool create foo raidz1 c1t5000CCA000081D61d0 c1t5000CCA000186235d0 spare c
1t5000CCA000094115d0
sudo zpool replace foo c1t5000CCA000186235d0 c1t5000CCA000094115d0
$ zpool status foo pool: foo
state: ONLINE
scan: resilvered 81.5K in 0h0m with 0 errors on Fri Sep 8 10:53:03 2017
config:

NAME                         STATE     READ WRITE CKSUM
        foo                          ONLINE       0     0     0
          raidz1-0                   ONLINE       0     0     0
            c1t5000CCA000081D61d0    ONLINE       0     0     0
            spare-1                  ONLINE       0     0     0
              c1t5000CCA000186235d0  ONLINE       0     0     0
              c1t5000CCA000094115d0  ONLINE       0     0     0
        spares
          c1t5000CCA000094115d0      INUSE     currently in use
$ sudo zinject -d spare-1 -A degrade foo
cannot find device 'spare-1' in pool 'foo'
$ sudo zpool clear foo spare-1
cannot clear errors for spare-1: no such device in pool

Even though there was nothing to clear, those commands shouldn't have
reported an error. by contrast, trying to clear "raidz1-0" works just fine:
$ sudo zpool clear foo raidz1-0

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: Alan Somers <asomers@gmail.com>
2018-01-22 04:37:04 +00:00
Alexander Motin
d26c97e2fb MFV r328233:
8898 creating fs with checksum=skein on the boot pools fails ungracefully

illumos/illumos-gate@9fa2266d9a

https://www.illumos.org/issues/8898:
# zfs create -o checksum=skein rpool/test
internal error: Result too large
Abort (core dumped)

Not a big deal per se, but should be handled correctly.

Reviewed by: Toomas Soome <tsoome@me.com>
Reviewed by: Andy Stormont <astormont@racktopsystems.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Yuri Pankov <yuri.pankov@nexenta.com>

PR:		222199
2018-01-22 00:01:36 +00:00
Alexander Motin
9b347baa3a MFV r328231: 8897 zpool online -e fails assertion when run on non-leaf vdevs
illumos/illumos-gate@9a551dd645

https://www.illumos.org/issues/8897:
# zpool online -e test mirror-1
Assertion failed: nvlist_lookup_string(tgt, "path", &pathname) == 0, file ../common/libzfs_pool.c, line 2558, function zpool_vdev_online
Abort (core dumped)

Not a big deal per se, but should be handled gracefully, same way as 'offline' and 'online' without '-e'.

Also reported as: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221408

Reviewed by: Toomas Soome <tsoome@me.com>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Yuri Pankov <yuri.pankov@nexenta.com>
2018-01-21 23:53:56 +00:00
Alexander Motin
442814b7e7 MFV r328220: 8677 Open-Context Channel Programs
illumos/illumos-gate@a3b2868063

https://www.illumos.org/issues/8677
  We want to be able to run channel programs outside of synching context.
  This would greatly improve performance of channel program that just gather
  information, as we won't have to wait for synching context anymore.

  This feature should introduce the following:
  - A new command line flag in "zfs program" to specify our intention to
  run in open context.
  - A new flag/option within the channel program ioctl which selects the
  context.
  - Appropriate error handling whenever we try a channel program in
  open-context that contains zfs.sync* expressions.
  - Documentation for the new feature in the manual pages.

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Chris Williamson <chris.williamson@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Serapheim Dimitropoulos <serapheim@delphix.com>
2018-01-21 23:02:05 +00:00
Mark Johnston
d678ce4b6b Remove tst.zonename.d from the list of expected failures.
X-MFC with:	r327888
2018-01-14 17:56:19 +00:00
Mark Johnston
224e0c2f61 Add "jid" and "jailname" variables to DTrace.
These return the jail ID and jail name for the traced process,
respectively, and are analogous to "zonename" on Solaris/illumos.
"zonename" is now aliased to "jailname".

Also add some stress tests for the new variables.

Submitted by:	Domagoj Stolfa <domagoj.stolfa@gmail.com>
Reviewed by:	dteske (previous version)
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D13877
2018-01-12 19:59:46 +00:00
Mark Johnston
60eddb209b Add a regression test for r327794.
MFC after:	2 weeks
2018-01-10 21:40:36 +00:00
Mark Johnston
ff9ebfc4b6 Fix an off-by-one in dt_opt_setenv().
The bug would cause incorrect behaviour when attempting to override
an already set environment variable with -x setenv, as long as the
variable is not the last one in the array.

Reported by:	Samuel Lepetit <slepetit@apple.com>
MFC after:	2 weeks
2018-01-10 21:37:11 +00:00
Mark Johnston
7c3701904a Mark uctf/err.user64mode.ksh as EXFAIL for now.
MFC after:	1 week
2017-12-15 18:09:23 +00:00
Mark Johnston
97cb52fa9a Actually add the -x setenv test Makefile, missed in r326499.
X-MFC with:	r326499
2017-12-08 19:26:25 +00:00
Mark Johnston
04006780d9 Complete support for dtrace's -x setenv option.
This allows one to override the environment for processes created with
dtrace -c. By default, the environment is inherited.

This support was originally merged from illumos in r249367 but was lost
when the commit was later reverted and then brought back piecemeal.

Reported by:	Samuel Lepetit <slepetit@apple.com>
MFC after:	2 weeks
2017-12-03 16:57:28 +00:00
Mark Johnston
5577b8a709 Add an envp argument to proc_create().
This is needed to support dtrace's -x setenv option.

MFC after:	2 weeks
2017-12-03 16:50:16 +00:00
Alan Somers
013953eb5f Add basic tests for ctfconvert(1), fold(1) and rs(1)
Add basic command line parsing test coverage for these utilities.  The tests
were automatically generated based on their man pages.  These tests can be
expanded by hand for more thorough coverage.  The aim is to generate very
basic amount of test coverage for all the utilities in the base system.

Tests generated via: https://github.com/shivansh/smoketestsuite/

Submitted by:	shivansh
Reviewed by:	asomers
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D12424
2017-11-27 20:01:58 +00:00
Mark Johnston
f4b90f5a5c Revert r326181 for now.
We can't link an executable using -m32 until the lib32 phase of a
buildworld, though the build works fine when executing make from
cddl/usr.sbin/dtrace/tests. Some other solution will need to be found.
2017-11-27 17:54:17 +00:00
Andriy Gapon
718cb91ccc zdb: follow-up to r326150, check if malloc succeeded
Reported by:	rpokala
MFC after:	1 week
X-MFC with:	r326150
2017-11-25 09:47:31 +00:00
Mark Johnston
e96f62322b Compile one of the uctf test programs with -m32.
The err.user64mode.ksh test expects it to run as a 32-bit process.

MFC after:	1 week
2017-11-24 19:57:13 +00:00
Mark Johnston
eb381edab7 Fix the type signature for sx(9) DTrace subroutines.
MFC after:	1 week
2017-11-24 19:05:45 +00:00
Andriy Gapon
fd74a38251 zdb: use a heap allocation instead of a huge array on stack
SPA_MAXBLOCKSIZE is 16 MB and having such a large object on the stack is
not nice in general and it could cause some confusing failures in the
single-user mode where the default stack size of 8 MB is used.

I expect that the upstream would make the same change.

MFC after:	1 week
2017-11-24 10:45:33 +00:00
Mark Johnston
483f7100b4 Annotate pragma/err.invalidlibdep.ksh as EXFAIL.
The test creates a D library with a "depends_on library" pragma
referencing a non-existent library, and expects compilation to fail.
However, as far as I can tell, libdtrace is supposed simply abort
compilation of the library in this case, and continue. This behaviour
is desirable when adding libraries which depend on optional KLDs, for
example.

MFC after:	1 week
2017-11-22 15:54:52 +00:00
Mark Johnston
7c72b10912 Annotate usdt/tst.eliminate.ksh as EXFAIL.
It appears to depend on some behaviour specific to the Sun link editor.

MFC after:	1 week
2017-11-21 16:00:18 +00:00
Mark Johnston
fc81ca0045 Don't assume that we can resolve "main" in the ksh executable.
MFC after:	1 week
2017-11-21 15:03:38 +00:00
Ed Maste
4e4805ddf1 dt_modtext: return error on archs lacking an implementation
Reported by:	mmel
Reviewed by:	markj
MFC after:	1 week
MFC with:	r325042
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D13176
2017-11-21 03:15:32 +00:00
Mark Johnston
ad1633b34b Take r313504 into account when recomputing the string table length.
When we encounter a USDT probe in a weak symbol, we emit an alias for
the probe function symbol. Such aliases are named differently from the
aliases we emit for probes in local functions, so make sure to take that
difference into account when resizing the output object file's string
table. Otherwise, we underrun the string table buffer.

PR:		223680
2017-11-16 07:14:29 +00:00
Bryan Drewery
ea825d0274 DIRDEPS_BUILD: Update dependencies.
Sponsored by:	Dell EMC Isilon
2017-10-31 00:07:04 +00:00
Bryan Drewery
3806950135 DIRDEPS_BUILD: Connect new directories.
Sponsored by:	Dell EMC Isilon
2017-10-31 00:04:07 +00:00
Ed Maste
732f2c69a1 libdtrace: replace "DOODAD" with more descriptive string
Previously some unimplemented libdtrace routines printed the function,
file and line number, followed by "DOODAD." That is not particularly
informative, so replace it with a message reporting the actual issue.

Sponsored by:	The FreeBSD Foundation
2017-10-27 16:23:45 +00:00
Andriy Gapon
1c05a6ea6b MFV r325013,r325034: 640 number_to_scaled_string is duplicated in several commands
illumos/illumos-gate@0a0551200e
0a0551200e

https://www.illumos.org/issues/640
  du(1), df(1m), ls(1), and swap(1m) all include a copy (it appears literally
  copied) of the 'number_to_scaled_string' function in their source. This should
  be moved to a shared library and all 4 commands should use this instead.

FreeBSD note: of all libcmdutils functionality ZFS (and other illumos
contrib code) currently uses only nicenum() function (which is similar
to humanize_number but has some formatting differences).  For this
reason I decided to not port the whole library.  As a result, nicenum.c
from libcmdutils is compiled into libzfs and libzpool.  This is a bit
ugly, but works.  If one day we are forced to create libillumos, then
the file should be moved to that library.

Reviewed by: Sebastian Wiedenroth <wiedi@frubar.net>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Yuri Pankov <yuripv@gmx.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Jason King <jason.brian.king@gmail.com>

MFC after:	2 weeks
2017-10-27 12:37:22 +00:00
Alan Somers
12a88a3d63 zfsd should be able to online an L2ARC that disappears and returns
Previously, this didn't work because L2ARC devices' labels don't contain
pool GUIDs.  Modify zfsd so that the pool GUID won't be required:

lib/libdevdctl/guid.h
	Change INVALID_GUID from a uint64_t constant to a function that
	returns an invalid Guid object.  Remove the void constructor.
	Nothing uses it, and it violates RAII.

cddl/usr.sbin/zfsd/case_file.h
cddl/usr.sbin/zfsd/case_file.cc
	Allow CaseFile::Find to match a CaseFile based on Vdev GUID alone.
	In CaseFile::ReEvaluate, attempt to online devices even if the newly
	arrived device has no pool GUID.

cddl/usr.sbin/zfsd/vdev_iterator.cc
	Iterate through a pool's cache devices as well as its regular
	devices.

Reported by:	avg
Reviewed by:	avg
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
Differential Revision:	https://reviews.freebsd.org/D12791
2017-10-26 15:28:18 +00:00
Alan Somers
63f8025d6a Fix zpool_read_all_labels when vfs.aio.enable_unsafe=0
Previously, zpool_read_all_labels was trying to do 256KB reads, which are
greater than the default MAXPHYS and therefore must go through the slow,
unsafe AIO path.  Shrink these reads to 112KB so they can use the safe, fast
AIO path instead.

MFC after:	1 week
X-MFC-With:	324568
Sponsored by:	Spectra Logic Corp
2017-10-25 16:01:19 +00:00
Mark Johnston
7421ff0751 Address some miscellaneous issues in the CTF man page.
- Fix a number of typos.
- Replace some illumos-specific references.
- Note that a type definition of kind CTF_K_FUNCTION may be followed by
  a null type identifier in order to provide 4-byte alignment for the
  next type definition.

MFC after:	2 weeks
2017-10-22 19:17:25 +00:00
Mark Johnston
ef36b3f756 MFV r323105 (partial): 8300 fix man page issues found by mandoc 1.14.1
illumos/illumos-gate@72d3dbb9ab
72d3dbb9ab

https://www.illumos.org/issues/8300
  Prior to integrating the mdocml update to 1.14.1, fix issues found by
  new version, especially the "new sentence, new line" style rule.

FreeBSD note: this revision merges only the changes to the CTF manual
page. The changes to the ZFS pages cannot be applied directly.

Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Toomas Soome <tsoome@me.com>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: Yuri Pankov <yuri.pankov@nexenta.com>

Discussed with:	avg
MFC after:	2 weeks
2017-10-22 18:32:28 +00:00
Alan Somers
b49e9abcf4 Optimize zpool_read_all_labels with AIO
Read all labels in parallel instead of sequentially

MFC after:	3 weeks
X-MFC-With: 322854
Sponsored by:	Spectra Logic Corp
Differential Revision:	https://reviews.freebsd.org/D12495
2017-10-12 21:25:11 +00:00
Mark Johnston
11a7f121b8 Avoid adding an extra "0x" prefix before pointer formats.
MFC after:	1 week
2017-10-06 18:29:00 +00:00
Andriy Gapon
ac63ac6859 zdb.8: replace with the slighly modified upstream version
The upstream has converted their manual page to the same format as we
have, so we can use the upstream version with minimal modifications.

The current modifications are:
- different zpool.cache path
- different manual sections for zdb, zfs, zpool commands

igor reports a few minor issues, it would be nice to fix them both in
FreeBSD and in the upstream.

MFC after:	3 weeks
2017-10-06 08:28:35 +00:00
Andriy Gapon
e117882ba2 MFV r322235: 8067 zdb should be able to dump literal embedded block pointer
illumos/illumos-gate@4923c69fdd
4923c69fdd

FreeBSD note: the manual page is to be updated separately.

https://www.illumos.org/issues/8067
  Add an option to zdb to print a literal embedded block pointer supplied on the
  command line:
  zdb -E [-A] word0:word1:...:word15

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Yuri Pankov <yuri.pankov@gmail.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

MFC after:	3 weeks
2017-10-06 08:21:06 +00:00
Andriy Gapon
c40fbcbc18 MFV r316934: 7340 receive manual origin should override automatic origin
illumos/illumos-gate@ed4e7a6a5c
ed4e7a6a5c

https://www.illumos.org/issues/7340
  When -o origin=<snapshot> is specified as part of a ZFS receive, that origin
  should override the automatic detection in libzfs.

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Paul Dagnelie <pcd@delphix.com>

MFC after:	3 weeks
2017-10-06 08:17:12 +00:00
Andriy Gapon
3c7b32b890 MFV r316933: 5142 libzfs support raidz root pool (loader project)
illumos/illumos-gate@d5f26ad812
d5f26ad812

https://www.illumos.org/issues/5142
  the current libzfs only allows simple disk and mirror setup for boot pool, as
  loader does support booting from raidz, this feature will remove raidz
  restriction from boot pool setup.

FreeBSD note: we have long supported this feature, this commit only
removes a small difference in libzfs.

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Andrew Stormont <andyjstormont@gmail.com>
Reviewed by: Albert Lee <trisk@omniti.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Toomas Soome <tsoome@me.com>

MFC after:	3 weeks
2017-10-06 08:15:37 +00:00
Andriy Gapon
6a880de6e0 MFV r316931: 6268 zfs diff confused by moving a file to another directory
illumos/illumos-gate@aab04418a7
aab04418a7

https://www.illumos.org/issues/6268
  The zfs diff command presents a description of the changes that have occurred
  to files within a filesystem between two snapshots. If a file is renamed, the
  tool is capable of reporting this, e.g.:
  cd /some/zfs/dataset/subdir
  mv file0 file1
  Will result in a diff record like:
  R        /some/zfs/dataset/subdir/file0  ->  /some/zfs/dataset/subdir/file1
  Unfortunately, it seems that rename detection only uses the base filename to
  determine if a file has been renamed or simply modified. This leads to
  misreporting only the original filename, omitting the more relevant destination
  filename entirely. For example:
  cd /some/zfs/dataset/subdir
  mv file0 ../otherdir/file0
  Will result in a diff entry:
  M        /some/zfs/dataset/subdir/file0
  But it should really emit:
  R        /some/zfs/dataset/subdir/file0  ->  /some/zfs/dataset/otherdir/file0

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Justin Gibbs <gibbs@scsiguy.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Joshua M. Clulow <josh@sysmgr.org>

MFC after:	3 weeks
2017-10-06 08:12:13 +00:00
Andriy Gapon
f079bf7ce3 MFV r316877: 7571 non-present readonly numeric ZFS props do not have default value
illumos/illumos-gate@ad2760acbd
ad2760acbd

https://www.illumos.org/issues/7571
  ZFS displays the default value for non-present readonly numeric (and index)
  properties. However, these properties default values are not meaningful.
  Instead, we should display a "-", indicating that they are not present. For
  example, on a version-12 pool, the usedby* properties are not available, but
  they show up as the incorrect value "0":
     1. zfs get all test12
        ...
        test12 usedbysnapshots 0 -
        test12 usedbydataset 0 -
        test12 usedbychildren 0 -
        test12 usedbyrefreservation 0 -
  We will be introducing more sometimes-present numeric readonly properties, so
  it would be nice to fix this.

Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

MFC after:	3 weeks
2017-10-06 08:10:54 +00:00
Andriy Gapon
f51efc68f6 MFV r316864: 6392 zdb: introduce -V for verbatim import
illumos/illumos-gate@dfd5965f7e
dfd5965f7e

FreeBSD note: the manual page is to be updated separately.

https://www.illumos.org/issues/6392
  When given a pool name via -e, zdb would attempt an import. If it
  failed, then it would attempt a verbatim import. This behavior is
  not always desirable so a -V switch is added to zdb to control the
  behavior. When specified, a verbatim import is done. Otherwise,
  the behavior is as it was previously, except no verbatim import
  is done on failure.
  a5778ea242
  Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
  Reviewed by: Matthew Ahrens <mahrens@delphix.com>

Reviewed by: Yuri Pankov <yuri.pankov@gmail.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Richard Yao <ryao@gentoo.org>

MFC after:	3 weeks
2017-10-06 08:09:20 +00:00
Andriy Gapon
65512687a9 MFV r316862: 6410 teach zdb to perform object lookups by path
illumos/illumos-gate@ed61ec1da9
ed61ec1da9

FreeBSD note: this commit does not update the manual page.
The original change includes conversion of the manual page from *roff format
to mandoc format.  So, it is hard to extract the content change from
that.  I am going to replace our zdb manual page, which is an earlier
independent conversion, with a slighly modified version of the upstream page.

https://www.illumos.org/issues/6410
  This is primarily intended to ease debugging & testing ZFS when one is only
  interested in things like the on-disk location of a specific object's blocks,
  but doesn't know their object id. This allows doing things like the following
  (FreeBSD-based example):
          # zpool create -f foo da0
          # dd if=/dev/zero of=/foo/1 bs=1M count=4 >/dev/null 2>&1
          # zpool export foo
          # zdb -vvvvv -o "ZFS plain file" foo /1
          Object  lvl   iblk   dblk  dsize  lsize   %full  type
               8    2    16K   128K  3.99M     4M  100.00  ZFS plain file (K=inherit) (Z=inherit)
                                              168   bonus  System attributes
          dnode flags: USED_BYTES USERUSED_ACCOUNTED
          dnode maxblkid: 31
          path    /1
          uid     0
          gid     0
          atime   Thu Apr 23 22:45:32 2015
          mtime   Thu Apr 23 22:45:32 2015
          ctime   Thu Apr 23 22:45:32 2015
          crtime  Thu Apr 23 22:45:32 2015
          gen     7
          mode    100644
          size    4194304
          parent  4
          links   1
          pflags  40800000004
          Indirect blocks:
                0 L1  DVA[0]=<0:c19200:600> DVA[1]=<0:10800019200:600> [L1 ZFS
  plain file] fletcher4 lz4 LE contiguous unique double size=4000L/200P birth=7L/

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Will Andrews <will@freebsd.org>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Yuri Pankov <yuri.pankov@nexenta.com>

MFC after:	3 weeks
2017-10-06 07:52:25 +00:00
Alan Somers
d22f655459 MFV r319743: 8108 zdb -l fails to read labels 2 and 3
illumos/illumos-gate@22c8b9583d
22c8b9583d

https://www.illumos.org/issues/8108

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Toomas Soome <tsoome@me.com>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Reviewed by: Andrew Stormont <andyjstormont@gmail.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Yuri Pankov <yuri.pankov@nexenta.com>

MFC after:	3 weeks
2017-10-02 22:39:12 +00:00
Alan Somers
58dde1c585 MFV r316863: 3871 fix issues introduced by 3604
illumos/illumos-gate@de05b58863
de05b58863

https://www.illumos.org/issues/3871
  GCC 4.5.3 on Gentoo Linux did not like a few of the changes made in the issue
  3604 patch. It printed an error and a couple of warnings:
  ../../cmd/zdb/zdb.c: In function 'dump_bpobj':
  ../../cmd/zdb/zdb.c:1257:3: error: 'for' loop initial declarations are only
  allowed in C99 mode
  ../../cmd/zdb/zdb.c:1257:3: note: use option -std=c99 or -std=gnu99 to compile
  your code
  ../../cmd/zdb/zdb.c: In function 'dump_deadlist':
  ../../cmd/zdb/zdb.c:1323:8: warning: too many arguments for format
  ../../cmd/zdb/zdb.c:1323:8: warning: too many arguments for format

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Richard Yao <ryao@gentoo.org>

MFC after:	3 weeks
2017-10-02 22:35:35 +00:00
Alan Somers
2091d1c3b9 MFV r316861: 6866 zdb -l should return non-zero if it fails to find any label
illumos/illumos-gate@64723e3611
64723e3611

https://www.illumos.org/issues/6866
  Need this for #6865.
  To be generally more scripting-friendly, overload this issue with adding '-q'
  option which should skip printing any label information.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Yuri Pankov <yuri.pankov@nexenta.com>

MFC after:	3 weeks
2017-10-02 22:13:20 +00:00
Alan Somers
66eafdf052 MFC r316858 7280 Allow changing global libzpool variables in zdb
7280 Allow changing global libzpool variables in zdb and ztest through command line

illumos/illumos-gate@0e60744c98
0e60744c98

https://www.illumos.org/issues/7280
  zdb is very handy for diagnosing problems with a pool in a safe and
  quick way. When a pool is in a bad shape, we often want to disable some
  fail-safes, or adjust some tunables in order to open them. In the
  kernel, this is done by changing public variables in mdb. The goal of
  this feature is to add the same capability to zdb and ztest, so that
  they can change libzpool tuneables from the command line.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>

MFC after:	3 weeks
2017-10-02 22:02:04 +00:00
Andriy Gapon
620c2c801b MFV r323913: 8600 ZFS channel programs - snapshot
illumos/illumos-gate@2840dce1a0
2840dce1a0

https://www.illumos.org/issues/8600
  ZFS channel programs should be able to create snapshots.
  In addition to the base snapshot functionality, this will likely entail adding
  extra logic to handle edge cases which were formerly not possible, such as
  creating then destroying a snapshot in the same transaction sync.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Chris Williamson <chris.williamson@delphix.com>

MFC after:	5 weeks
X-MFC after:	r324163
2017-10-02 11:32:08 +00:00
Andriy Gapon
a1f65a15ce MFV r323912: 8592 ZFS channel programs - rollback
illumos/illumos-gate@000cce6b6f
000cce6b6f

https://www.illumos.org/issues/8592
  ZFS channel programs should be able to perform a rollback. This logic will
  probably look pretty similar to zfs.sync.destroy().

Reviewed by: Chris Williamson <chris.williamson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Brad Lewis <brad.lewis@delphix.com>

MFC after:	5 weeks
X-MFC after:	r324163
2017-10-02 11:23:31 +00:00
Andriy Gapon
3e52a05570 MFV r323794: 8605 zfs channel programs: zfs.exists undocumented and non-working
illumos/illumos-gate@5f39f884e2
5f39f884e2

https://www.illumos.org/issues/8605
  zfs.exists() in channel programs doesn't return any result, and should have a
  man page entry.

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Chris Williamson <chris.williamson@delphix.com>

MFC after:	5 weeks
X-MFC after:	r324163
2017-10-01 16:51:05 +00:00
Andriy Gapon
bda88d07d9 MFV r323530,r323533,r323534: 7431 ZFS Channel Programs, and followups
7431 ZFS Channel Programs

illumos/illumos-gate@dfc115332c
dfc115332c

https://www.illumos.org/issues/7431
  ZFS channel programs (ZCP) adds support for performing compound ZFS
  administrative actions via Lua scripts in a sandboxed environment (with time
  and memory limits).
  This initial commit includes both base support for running ZCP scripts, and a
  small initial library of API calls which support getting properties and
  listing, destroying, and promoting datasets.
  Testing: in addition to the included unit tests, channel programs have been in
  use at Delphix for several months for batch destroying filesystems. The
  dsl_destroy_snaps_nvl() call has also been replaced with

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>
Author: Chris Williamson <chris.williamson@delphix.com>

8552 ZFS LUA code uses floating point math

illumos/illumos-gate@916c8d8811
916c8d8811

https://www.illumos.org/issues/8552
  In the LUA interpreter used by "zfs program", the lua format() function
  accidentally includes support for '%f' and friends, which can cause compilation
  problems when building on platforms that don't support floating-point math in
  the kernel (e.g. sparc). Support for '%f' friends (%f %e %E %g %G) should be
  removed, since there's no way to supply a floating-point value anyway (all
  numbers in ZFS LUA are int64_t's).

Reviewed by: Yuri Pankov <yuripv@gmx.com>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

8590 memory leak in dsl_destroy_snapshots_nvl()

illumos/illumos-gate@e6ab4525d1
e6ab4525d1

https://www.illumos.org/issues/8590
  In dsl_destroy_snapshots_nvl(), "snaps_normalized" is not freed after it is
  added to "arg".

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

FreeBSD notes:
- zfs-program.8 manual page is taken almost as is from the vendor repository,
  no FreeBSD-ification done
- fixed multiple instances of NULL being used where an integer is expected
- replaced ETIME and ECHRNG with ETIMEDOUT and EDOM respectively

This commit adds a modified version of Lua 5.2.4 under
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lua, mirroring the
upstream.  See README.zfs in that directory for the description of Lua
customizations.
See zfs-program.8 on how to use the new feature.

MFC after:	5 weeks
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D12528
2017-10-01 16:11:07 +00:00
Andriy Gapon
c13f1d82c8 MFV r323535: 8585 improve batching done in zil_commit()
FreeBSD notes:
- this MFV reverts FreeBSD commit r314549 to make the merge easier
- at present our emulation of cv_timedwait_hires is rather poor,
  so I elected to use cv_timedwait_sbt directly
Please see the differential revision for details.
Unfortunately, I did not get any positive reviews, so there could be
bugs in the FreeBSD-specific piece of the merge.
Hence, the long MFC timeout.

illumos/illumos-gate@1271e4b10d
1271e4b10d

https://www.illumos.org/issues/8585
  The current implementation of zil_commit() can introduce significant
  latency, beyond what is inherent due to the latency of the underlying
  storage. The additional latency comes from two main problems:
  1. When there's outstanding ZIL blocks being written (i.e. there's
      already a "writer thread" in progress), then any new calls to
      zil_commit() will block waiting for the currently oustanding ZIL
      blocks to complete. The blocks written for each "writer thread" is
      coined a "batch", and there can only ever be a single "batch" being
      written at a time. When a batch is being written, any new ZIL
      transactions will have to wait for the next batch to be written,
      which won't occur until the current batch finishes.
  As a result, the underlying storage may not be used as efficiently
      as possible. While "new" threads enter zil_commit() and are blocked
      waiting for the next batch, it's possible that the underlying
      storage isn't fully utilized by the current batch of ZIL blocks. In
      that case, it'd be better to allow these new threads to generate
      (and issue) a new ZIL block, such that it could be serviced by the
      underlying storage concurrently with the other ZIL blocks that are
      being serviced.
  2. Any call to zil_commit() must wait for all ZIL blocks in its "batch"
      to complete, prior to zil_commit() returning. The size of any given
      batch is proportional to the number of ZIL transaction in the queue
      at the time that the batch starts processing the queue; which
      doesn't occur until the previous batch completes. Thus, if there's a
      lot of transactions in the queue, the batch could be composed of
      many ZIL blocks, and each call to zil_commit() will have to wait for
      all of these writes to complete (even if the thread calling
      zil_commit() only cared about one of the transactions in the batch).

Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Prakash Surya <prakash.surya@delphix.com>

MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D12355
2017-09-26 11:04:08 +00:00
Alan Somers
76f9ab7444 Close a memory leak when using zpool_read_all_labels
MFC after:	3 weeks
X-MFC-With:	322854
Sponsored by:	Spectra Logic Corp
2017-09-25 20:44:40 +00:00
Andriy Gapon
65b5f7c743 MFV r323790: 8567 Inconsistent return value in zpool_read_label
illumos/illumos-gate@c861bfbd77
c861bfbd77

https://www.illumos.org/issues/8567
  If fstat64 fails, pread64 fails, or the label is unintelligible,
  zpool_read_label will return 0. But if malloc fails, it will return -1. For
  consistency, it should always return -1 on failure or 0 on success.

Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Alan Somers <asomers@gmail.com>

MFC after:	2 weeks
2017-09-20 07:23:50 +00:00
Bryan Drewery
b3b6d7b406 Fix the raise tests.
- The exit probe was not appropriately filtered to only the known pid so it
  was firing on any random process that would exit rather the only the one
  we cared about.
- The dtest script executes the tst.raise*.exe in the background from
  POSIX sh without jobs control.  POSIX mandates that SIGINT be set to
  SIG_IGN in this case.  The test executable never actually tested that
  SIGINT could be caught despite trying to block and delay the signal.
  So the SIGINT sent from raise() is never actually received since it
  is ignored.  This could be fixed by calling 'trap - INT' from dtest
  before running the executable but I've opted to just use SIGUSR1
  instead in these specific tests rather than adding more logic to
  test that SIGINT is not ignored at startup.

These 2 issues meant that the tests would randomly work but only if a process
coincidentally exited during the test.

Reviewed by:	markj
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-09-15 19:48:48 +00:00
Andriy Gapon
8ac2314e00 MFV r323527: 5815 libzpool's panic function doesn't set global panicstr, ::status not as useful
illumos/illumos-gate@fae6347731
fae6347731

https://www.illumos.org/issues/5815
  When panic() is called from within ztest, the mdb ::status command isn't as
  useful as it could be since the global panicstr variable isn't updated. We
  should modify the function to make sure panicstr is set, so ::status can
  present the error message just like it does on a failed assertion.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: Gordon Ross <gordon.ross@nexenta.com>
Reviewed by: Rich Lowe <richlowe@richlowe.net>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Prakash Surya <prakash.surya@delphix.com>
MFC after:	4 weeks
2017-09-13 10:34:31 +00:00
Andriy Gapon
a345c0b23b MFV r323523: 8331 zfs_unshare returns wrong error code for smb unshare failure
illumos/illumos-gate@4f4378cc54
4f4378cc54

https://www.illumos.org/issues/8331
  zfs_unshare returns EZFS_UNSHARENFSFAILED on error for all share types.

Reviewed by: Marcel Telka <marcel@telka.sk>
Reviewed by: Toomas Soome <tsoome@me.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Andrew Stormont <astormont@racktopsystems.com>

MFC after:	4 weeks
2017-09-13 10:23:55 +00:00
Andriy Gapon
c9f7a4ab32 MFV r316932: 6280 libzfs: unshare_one() could fail with EZFS_SHARENFSFAILED
illumos/illumos-gate@d1672efb6f
d1672efb6f

https://www.illumos.org/issues/6280
  The unshare_one() in libzfs could fail with EZFS_SHARENFSFAILED at line 834
  here:
  831    /* make sure libshare initialized */
  832    if ((err = zfs_init_libshare(hdl, SA_INIT_SHARE_API)) != SA_OK) {
  833        free(mntpt);    /* don't need the copy anymore */
  834        return (zfs_error_fmt(hdl, EZFS_SHARENFSFAILED,
  835            dgettext(TEXT_DOMAIN, "cannot unshare '%s': %s"),
  836            name, _sa_errorstr(err)));
  837    }
  The correct error should be EZFS_UNSHARENFSFAILED instead.

Reviewed by: Toomas Soome <tsoome@me.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: Marcel Telka <marcel.telka@nexenta.com>

MFC after:	4 weeks
2017-09-13 10:22:09 +00:00
Li-Wen Hsu
da3086df18 Fix DTrace test tst_inet_ntop_d: remove definitions are already in libdtrace
We have D definitions for the named values in socket.h after r323253.  Remove
them in test script to prevent compiling failure.

Reviewed by:	markj, gnn
Differential Revision:	https://reviews.freebsd.org/D12334
2017-09-12 16:00:51 +00:00
Mark Johnston
b934b56429 Add a O_CLOEXEC use missed in r323166.
PR:		199810
Reported by:	Jukka A. Ukkonen <jau789@gmail.com>
MFC after:	3 days
2017-09-12 14:38:10 +00:00
Jonathan Anderson
bab31140a2 Remove redundant source and object files.
Reviewed by:	bdrewery, ngie
MFC after:	1 week
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D12208
2017-09-09 13:18:32 +00:00
Andriy Gapon
90354c3200 MFV r323107: 8414 Implemented zpool scrub pause/resume
illumos/illumos-gate@1702cce751
1702cce751

FreeBSD note:  rather than merging the zpool.8 update I copied the zpool
scrub section from the illumos zpool.1m to FreeBSD zpool.8 almost
verbatim.  Now that the illumos page uses the mdoc format, it was an
easier option.  Perhaps the change is not in perfect compliance with the
FreeBSD style, but I think that it is acceptible.

https://www.illumos.org/issues/8414
  This issue tracks the port of scrub pause from ZoL: https://github.com/zfsonlinux/zfs/pull/6167
  Currently, there is no way to pause a scrub. Pausing may be useful when
  the pool is busy with other I/O to preserve bandwidth.

  Description

  This patch adds the ability to pause and resume scrubbing.  This is achieved
  by maintaining a persistent on-disk scrub state.  While the state is 'paused'
  we do not scrub any more blocks.  We do however perform regular scan
  housekeeping such as freeing async destroyed and deadlist blocks while paused.

  Motivation and Context

  Scrub pausing can be an I/O intensive operation and people have been asking
  for the ability to pause a scrub for a while. This allows one to preserve scrub
  progress while freeing up bandwidth for other I/O.

Reviewed by: George Melikov <mail@gmelikov.ru>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Alek Pinchuk <apinchuk@datto.com>

MFC after:	2 weeks
2017-09-09 11:00:07 +00:00
George V. Neville-Neil
958f4928e6 Add D definitions for the named values in socket.h
Reviewed by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D12241
2017-09-07 03:05:16 +00:00
Alan Somers
0a7b460e70 Honor all options of "zfs mount -o"
The existing code in zmount incorrectly parses the comma-delimited option
string. The result is that nmount only honors the last option. AFAICT the
parsing has been broken ever since ZFS's initial import in change 168404.

PR:		222078
Reviewed by:	avg
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
Differential Revision:	https://reviews.freebsd.org/D12232
2017-09-05 19:28:35 +00:00
Mark Johnston
afd2f355fb Use O_CLOEXEC when opening persistent handles in libdtrace.
PR:		199810
Submitted by:	jau@iki.fi
MFC after:	1 week
2017-09-05 00:11:06 +00:00
Alan Somers
e6c8b3c98a zfsd(8): Close a race condition when onlining a disk paritition
When inserting a partitioned disk, devfs and geom will announce the whole
disk before they announce the partition. If the partition containing ZFS
extends to one of the disk's extents, then zfsd will see a ZFS label on the
whole disk and attempt to online it. ZFS is smart enough to activate the
partition instead of the whole disk, but only if GEOM has already created
the partition's provider.

cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h
cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c
	Add a zpool_read_all_labels method. It's similar to
	zpool_read_label, but it will return the number of labels found.

cddl/usr.sbin/zfsd/zfsd_event.cc
	When processing a DevFS CREATE event, only online a VDEV if we can
	read all four ZFS labels.

Reviewed by:	mav
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
Differential Revision:	https://reviews.freebsd.org/D11920
2017-08-24 19:48:41 +00:00
Mark Johnston
9b0ddce6bc Use an updated copy of the CDDL header boilerplate from illumos.
Reported by:	Yuri Pankov <yuripv@gmx.com>
X-MFC with:	r322774
2017-08-21 22:26:49 +00:00
Mark Johnston
a3a7b74e18 Add a regression test for r322773.
MFC after:	1 week
2017-08-21 21:58:42 +00:00
Mark Johnston
ce4da6fccc Fix an off-by-two in the llquantize() action parameter validation.
The aggregation created by llquantize() partitions values into buckets; the
lower bound of the bucket containing the largest values is b^{m+1}, where
b and m are the second and fourth parameters to the action, respectively.
Bucket bounds are stored in a 64-bit integer, and so the llquantize()
validation checks need to verify that b^{m+1} fits in 64 bits. However, it
was only verifying that b^{m-1} fits in 64 bits, so certain parameter
combinations could trigger assertion failures in libdtrace.

PR:		219451
MFC after:	1 week
2017-08-21 21:56:02 +00:00
Andriy Gapon
9c48e95dd9 MFV r322229: 7600 zfs rollback should pass target snapshot to kernel
illumos/illumos-gate@77b171372e
77b171372e

https://www.illumos.org/issues/7600
  At present, the kernel side code seems to blindly rollback to whatever happens
  to be the latest snapshot at the time when the rollback task is processed.
  The expected target's name should be passed to the kernel driver and the sync
  task should validate that the target exists and that it is the latest snapshot
  indeed.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>

MFC after:	3 weeks
2017-08-08 10:52:01 +00:00
Andriy Gapon
b4e4140d13 MFV r322223: 8378 crash due to bp in-memory modification of nopwrite block
illumos/illumos-gate@b7edcb9408
b7edcb9408

https://www.illumos.org/issues/8378
  The problem is that zfs_get_data() supplies a stale zgd_bp to dmu_sync(), which
  we then nopwrite against.
  zfs_get_data() doesn't hold any DMU-related locks, so after it copies db_blkptr
  to zgd_bp, dbuf_write_ready()
  could change db_blkptr, and dbuf_write_done() could remove the dirty record.
  dmu_sync() then sees the stale
  BP and that the dbuf it not dirty, so it is eligible for nop-writing.
  The fix is for dmu_sync() to copy db_blkptr to zgd_bp after acquiring the
  db_mtx. We could still see a stale
  db_blkptr, but if it is stale then the dirty record will still exist and thus
  we won't attempt to nopwrite.

Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

MFC after:	2 weeks
2017-08-08 10:46:51 +00:00
Andriy Gapon
8b30f189d1 MFV r322217: 8418 zfs_prop_get_table() call in zfs_validate_name() is a no-op
illumos/illumos-gate@e09ba01dcd
e09ba01dcd

https://www.illumos.org/issues/8418
  The following line in zfs_validate_name() is just a no-op and it
  should be removed:
      108    (void) zfs_prop_get_table();

Reviewed by: Vitaliy Gusev <gusev.vitaliy@icloud.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Marcel Telka <marcel@telka.sk>

MFC after:	2 weeks
2017-08-08 10:30:49 +00:00
Ruslan Bukin
ca20f8ec29 o Replace __riscv__ with __riscv
o Replace __riscv64 with (__riscv && __riscv_xlen == 64)

This is required to support new GCC 7.1 compiler.
This is compatible with current GCC 6.1 compiler.

RISC-V is extensible ISA and the idea here is to have built-in define
per each extension, so together with __riscv we will have some subset
of these as well (depending on -march string passed to compiler):

__riscv_compressed
__riscv_atomic
__riscv_mul
__riscv_div
__riscv_muldiv
__riscv_fdiv
__riscv_fsqrt
__riscv_float_abi_soft
__riscv_float_abi_single
__riscv_float_abi_double
__riscv_cmodel_medlow
__riscv_cmodel_medany
__riscv_cmodel_pic
__riscv_xlen

Reviewed by:	ngie
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D11901
2017-08-07 14:09:57 +00:00
Mark Johnston
22e406c80b Rework and simplify the ksyms(4) implementation.
- Store the symbol table contents in an anonymous swap-backed object. Have
  mmap(/dev/ksyms) map that object, and stop mapping the symbol table into
  the calling process in ksyms_open(). Previously we would cache a pointer
  to the pmap of the opening process, and mmap(/dev/ksyms) would create a
  mapping using the physical address found by a pmap lookup at the initial
  mapping address. However, this assumes that the cached pmap is valid,
  which may not be the case. [1]
- Remove the ksyms ioctl interface. It appears to have been added to work
  around a limitation in libelf that no longer exists; see r321842.
  Moreover, the interface is difficult to support and isn't present in
  illumos. Since ksyms was added specifically to support lockstat(1), it
  is expected that this removal won't have any real impact.
- Simplify ksyms_read() to avoid unnecessary copying.
- Don't call the device handle destructor if we fail to capture a snapshot
  of the kernel's symbol table. devfs will do that for us.

Reported by:	Ilja van Sprundel <ivansprundel@ioactive.com> [1]
Reviewed by:	kib (previous revision)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11789
2017-08-03 00:38:13 +00:00
Alexander Motin
f2d3f6918e Add compat shim part missed at r305197.
This fixes compatibility between old kernel and new ZFS tools.
It seems to be tradition to forget it. :(

PR:		221112
MFC after:	3 days
2017-08-02 10:33:47 +00:00
Enji Cooper
4b330699f8 Convert traditional ${MK_TESTS} conditional idiom for including test
directories to SUBDIR.${MK_TESTS} idiom

This is being done to pave the way for future work (and homogenity) in
^/projects/make-check-sandbox .

No functional change intended.

MFC after:	1 weeks
2017-08-02 08:35:51 +00:00
Mark Johnston
3c9308e82b Remove local variables missed in r321842.
X-MFC with:	r321842
2017-08-01 04:52:03 +00:00
Mark Johnston
90ec6fd4ba Let lockstat use ksyms(4)'s mmap interface.
The workaround described in the deleted comment is no longer needed.

MFC after:	1 week
2017-08-01 04:49:54 +00:00
Li-Wen Hsu
ad89d783a6 Add an auxiliary subroutine to generate some events for testing
This test is also timeout on a quiet system because there is nobody triggering
read probefunc while test execution.

Reviewed by:	gnn, markj, ngie
Differential Revision:	https://reviews.freebsd.org/D11731
2017-07-26 12:07:46 +00:00
Li-Wen Hsu
288ebd813a The test case common.funcs.t_dtrace_contrib.tst_basename_d generates a
verifying script which needs being run to complete the test.

While here, add missing shebang.

Reviewed by:	gnn, markj, ngie
Differential Revision:	https://reviews.freebsd.org/D11716
2017-07-25 13:18:28 +00:00
Li-Wen Hsu
637ba06516 Modify glob patterns and expected output to match FreeBSD's implementation.
Reviewed by:	gnn, markj, ngie
Differential Revision:	https://reviews.freebsd.org/D11713
2017-07-25 13:14:02 +00:00
Li-Wen Hsu
5604d0f997 Make this test case accepts basename() in D script returns "" or "."
In Solaris, basename(1) and basename(3) both return "." while being given an
empty string (""), while in BSD (and Linux) basename(1) returns "" and
basename(3) returns "."

While here, also change #!/usr/bin/ksh to #!/usr/bin/env ksh to find ksh in
$PATH

Reviewed by:	gnn, markj (earlier version), ngie (earlier version)
Differential Revision:	https://reviews.freebsd.org/D11707
2017-07-25 13:11:20 +00:00
Li-Wen Hsu
4ca0dfa6b0 Explicitly set dynamic variable buffer size.
We added too many variable assignments in BEGIN block, which will run out of
default auto-configured variable buffer space.  The test VM has 4G RAM which
should be enough for most cases so it's reasonable to increase limitation to
these case.

Reviewed by:	gnn
Differential Revision:	https://reviews.freebsd.org/D11676
2017-07-25 13:07:06 +00:00
Li-Wen Hsu
23833df483 Explicitly set dynamic variable buffer size.
We added too many variable assignments in BEGIN block, which will run out of
default auto-configured variable buffer space.  The test VM has 4G RAM which
should be enough for most cases so it's reasonable to increase limitation to
these case.

Reviewed by:	gnn, markj, ngie
Differential Revision:	https://reviews.freebsd.org/D11674
2017-07-25 13:04:24 +00:00
Li-Wen Hsu
070a148127 Add an auxiliary subroutine to generate read(2) event while testing.
Reviewed by:	gnn, ngie
Differential Revision:	https://reviews.freebsd.org/D11673
2017-07-25 13:01:10 +00:00
Li-Wen Hsu
d83c70758a Add a simple script which calls open(2) and others to generate events for
testing.

This test times-out on a quiet system because there is nobody triggers
syscall::open:entry or syscall::: probe while test execution.

Reviewed by:	gnn, markj (earlier version)
Differential Revision:	https://reviews.freebsd.org/D11671
2017-07-25 12:58:03 +00:00
Li-Wen Hsu
b9de3393dd Add a simple program which calls sigtimedwait(2) to generate events for testing
This test timeout on a quiet system because there is nobody triggers
'syscall::*wait*:entry' probe while test execution.

Reviewed by:	gnn, markj, ngie
Differential Revision:	https://reviews.freebsd.org/D11668
2017-07-25 12:52:32 +00:00
Enji Cooper
31ed01a2de Fix whitespace on a line in fix(..) accidentally missed in r321424
MFC after:	1 month
MFC with:	r321424
2017-07-24 17:29:56 +00:00
Enji Cooper
f3305cae02 Style cleanup: delete spurious trailing whitespace
MFC after:	1 month
2017-07-24 17:27:21 +00:00
Enji Cooper
aa52ad5489 Don't use incorrect hardcoded path to ksh -- use /usr/bin/env
to find ksh instead

MFC after:	1 month
2017-07-23 17:57:00 +00:00
Alan Somers
19d04786c7 zfsd(8): Remove pidfile on shutdown
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
2017-06-20 19:45:02 +00:00
Andriy Gapon
f9cdbaba8d MFV r318946: 8021 ARC buf data scatter-ization
illumos/illumos-gate@770499e185
770499e185

https://www.illumos.org/issues/8021
  The ARC buf data project (known simply as "ABD" since its genesis in the ZoL
  community) changes the way the ARC allocates `b_pdata` memory from using linear
  `void *` buffers to using scatter/gather lists of fixed-size 1KB chunks. This
  improves ZFS's performance by helping to defragment the address space occupied
  by the ARC, in particular for cases where compressed ARC is enabled. It could
  also ease future work to allocate pages directly from `segkpm` for minimal-
  overhead memory allocations, bypassing the `kmem` subsystem.
  This is essentially the same change as the one which recently landed in ZFS on
  Linux, although they made some platform-specific changes while adapting this
  work to their codebase:
  1. Implemented the equivalent of the `segkpm` suggestion for future work
  mentioned above to bypass issues that they've had with the Linux kernel memory
  allocator.
  2. Changed the internal representation of the ABD's scatter/gather list so it
  could be used to pass I/O directly into Linux block device drivers. (This
  feature is not available in the illumos block device interface yet.)

FreeBSD notes:
- the actual (default) chunk size is 4KB (despite the text above saying 1KB)
- we can try to reimplement ABDs, so that they are not permanently
  mapped into the KVA unless explicitly requested, especially on
  platforms with scarce KVA
- we can try to use unmapped I/O and avoid intermediate allocation of a
  linear, virtual memory mapped buffer
- we can try to avoid extra data copying by referring to chunks / pages
  in the original ABD

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Prashanth Sreenivasa <pks@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Chris Williamson <chris.williamson@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Dan Kimmel <dan.kimmel@delphix.com>

MFC after:	3 weeks
2017-06-20 17:39:24 +00:00
Bryan Drewery
c99b67a794 Utilize SYSROOT from r320119 in places where DESTDIR may be wanting WORLDTMP.
Since buildenv exports SYSROOT all of these uses will now look in
WORLDTMP by default.

sys/boot/efi/loader/Makefile
        A LIBSTAND hack is no longer required for buildenv.

MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-06-19 20:47:24 +00:00
Andriy Gapon
b8d341fe26 MFV r319945,r319946: 8264 want support for promoting datasets in libzfs_core
illumos/illumos-gate@a4b8c9aa65
a4b8c9aa65

https://www.illumos.org/issues/8264
  Oddly there is a lzc_clone function, but no lzc_promote function.

Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan McDonald <danmcd@kebe.com>
Approved by: Dan McDonald <danmcd@kebe.com>
Author: Andrew Stormont <astormont@racktopsystems.com>
MFC after:	1 week
2017-06-14 16:31:36 +00:00
Mark Johnston
c645060dbb Override the locale so that file lists get a consistent sort order.
Reported by:	avg
MFC after:	1 week
2017-06-10 14:47:01 +00:00
Andriy Gapon
6bd205a529 follow up to r319746: add the new test files to the make file
Reported by:	markj
MFC after:	2 days
X-MFC with:	r319746
2017-06-10 06:13:52 +00:00
Andriy Gapon
9260925dcd MFV r319740: 8168 NULL pointer dereference in zfs_create()
illumos/illumos-gate@690031d326
690031d326

https://www.illumos.org/issues/8168
  If we manage to export the pool on which we are creating a dataset (filesystem
  or zvol) between entering libzfs`zfs_create() and libzfs`zpool_open() call (for
  which we never check the return value) we end up dereferencing a NULL pointer
  in libzfs`zpool_close().
  This was discovered on ZFS on Linux. The same issue can be reproduced on
  Illumos running in parallel:
    while :; do zpool import -d /tmp testpool ; zpool export testpool ; done
    while :; do zfs create testpool/fs; zfs destroy testpool/fs ; done
  Eventually this will result in several core dumps like this one:
  [root@52-54-00-d3-7a-01 /cores]# mdb core.zfs.4244
  Loading modules: [ libumem.so.1 libc.so.1 libtopo.so.1 libavl.so.1
  libnvpair.so.1 ld.so.1 ]
  > ::stack
  libzfs.so.1`zpool_close+0x17(0, 0, 0, 8047450)
  libzfs.so.1`zfs_create+0x1bb(8090548, 8047e6f, 1, 808cba8)
  zfs_do_create+0x545(2, 8047d74, 80778a0, 801, 0, 3)
  main+0x22c(8047d2c, fef5c6e8, 8047d64, 8055a17, 3, 8047d70)
  _start+0x83(3, 8047e64, 8047e68, 8047e6f, 0, 8047e7b)
  >
  Fix and reproducer (systemtap): https://github.com/zfsonlinux/zfs/pull/6096

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: loli10K <ezomori.nozomu@gmail.com>
MFC after:	2 weeks
2017-06-09 15:30:41 +00:00
Andriy Gapon
ad2b1a296f MFV r319744,r319745: 8269 dtrace stddev aggregation is normalized incorrectly
illumos/illumos-gate@79809f9cf4
79809f9cf4

https://www.illumos.org/issues/8269
  It seems that currently normalization of stddev aggregation is done
  incorrectly.
  We divide both the sum of values and the sum of their squares by the
  normalization factor. But we should divide the sum of squares by the
  normalization factor squared to scale the original values properly.

FreeBSD note: the actual change was committed in r316853, this commit
adds the test files and record merge information.

Reviewed by: Bryan Cantrill <bryan@joyent.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>
MFC after:	1 week
Sponsored by:	Panzura
2017-06-09 15:16:39 +00:00
Allan Jude
39b0b876dc New sentences start on new lines, fix two violations
Reviewed by:	bcr
Sponsored by:	BSDCan Dev Summit
2017-06-08 01:39:17 +00:00
Allan Jude
dc379eca14 SHA-512 and Skein have been supported by the boot loader for some time.
Submitted by:	lifanov
Reviewed by:	bcr
Sponsored by:	BSDCan Dev Summit
2017-06-08 01:29:24 +00:00
Andriy Gapon
457711b02b MFV r316922: 5380 receive of a send -p stream doesn't need to try renaming snapshots
illumos/illumos-gate@471a88e499
471a88e499

https://www.illumos.org/issues/5380
  A stream created with zfs send -p -I contains properties of all snapshots of a
  given dataset as opposed to only properties of snapshots in a given range.
  Not only this is suboptimal but the receive code also does not filter
  properties by the range. So, properties of earlier snapshots would be updated
  even though the snapshots themselves are not in the stream (just their
  properties).
  Given that modifying the snapshot properties requires a TXG sync and that the
  snapshots are updated one by one the described behavior may lead to a sever
  performance penalty.

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Andriy Gapon <avg@FreeBSD.org>
MFC after:	3 weeks
2017-05-24 22:30:21 +00:00
Andriy Gapon
b6be31c7ca MFC r316908: 7541 zpool import/tryimport ioctl returns ENOMEM because provided buffer is too small for config
illumos/illumos-gate@8b65a70b76
8b65a70b76

https://www.illumos.org/issues/7541
  When calling zpool import, zpool does a few ioctls to ZFS.
  zpool allocates a buffer in userland and passes it to the kernel so that ZFS
  can copy info into it. ZFS will use it to put the nvlist that describes the
  pool configuration.
  If the allocated buffer is too small, ZFS will return ENOMEM and the call will
  have to be redone. This wastes CPU time and slows down the import process. This
  happens very often for the ZFS_IOC_POOL_TRYIMPORT call.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>
MFC after:	2 weeks
2017-05-24 21:32:35 +00:00
Andriy Gapon
5644318970 MFC r316904: 7729 libzfs_core`lzc_rollback() leaks result nvl
illumos/illumos-gate@ac428481f9
ac428481f9

https://www.illumos.org/issues/7729
  libzfs_core`lzc_rollback() doesn't free the result nvl after lzc_ioctl() call.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Yuri Pankov <yuri.pankov@nexenta.com>

MFC after:	2 weeks
2017-05-24 20:53:01 +00:00
Andriy Gapon
c65389d367 MFV r316860: 7545 zdb should disable reference tracking
illumos/illumos-gate@4dd77f9e38
4dd77f9e38

https://www.illumos.org/issues/7545
  When evicting from the ARC, we manipulate some refcount_t's, e.g. arcs_size.
  When using zdb to examine a large amount of data (e.g. zdb -bb on a large pool
  with small blocks), the ARC may have a large number of entries. If reference
  tracking is enabled, there will be ~1 reference for each block in the ARC. When
  evicting, we decrement the refcount and have to search all the references to
  find the one that we are removing, which is very slow.
  Since zdb is typically used to find problems with the on-disk format, and not
  with the code it is running, we should disable reference tracking in zdb.

Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

MFC after:	2 weeks
2017-05-24 20:41:26 +00:00
Konstantin Belousov
6992112349 Commit the 64-bit inode project.
Extend the ino_t, dev_t, nlink_t types to 64-bit ints.  Modify
struct dirent layout to add d_off, increase the size of d_fileno
to 64-bits, increase the size of d_namlen to 16-bits, and change
the required alignment.  Increase struct statfs f_mntfromname[] and
f_mntonname[] array length MNAMELEN to 1024.

ABI breakage is mitigated by providing compatibility using versioned
symbols, ingenious use of the existing padding in structures, and
by employing other tricks.  Unfortunately, not everything can be
fixed, especially outside the base system.  For instance, third-party
APIs which pass struct stat around are broken in backward and
forward incompatible ways.

Kinfo sysctl MIBs ABI is changed in backward-compatible way, but
there is no general mechanism to handle other sysctl MIBS which
return structures where the layout has changed. It was considered
that the breakage is either in the management interfaces, where we
usually allow ABI slip, or is not important.

Struct xvnode changed layout, no compat shims are provided.

For struct xtty, dev_t tty device member was reduced to uint32_t.
It was decided that keeping ABI compat in this case is more useful
than reporting 64-bit dev_t, for the sake of pstat.

Update note: strictly follow the instructions in UPDATING.  Build
and install the new kernel with COMPAT_FREEBSD11 option enabled,
then reboot, and only then install new world.

Credits: The 64-bit inode project, also known as ino64, started life
many years ago as a project by Gleb Kurtsou (gleb).  Kirk McKusick
(mckusick) then picked up and updated the patch, and acted as a
flag-waver.  Feedback, suggestions, and discussions were carried
by Ed Maste (emaste), John Baldwin (jhb), Jilles Tjoelker (jilles),
and Rick Macklem (rmacklem).  Kris Moore (kris) performed an initial
ports investigation followed by an exp-run by Antoine Brodin (antoine).
Essential and all-embracing testing was done by Peter Holm (pho).
The heavy lifting of coordinating all these efforts and bringing the
project to completion were done by Konstantin Belousov (kib).

Sponsored by:	The FreeBSD Foundation (emaste, kib)
Differential revision:	https://reviews.freebsd.org/D10439
2017-05-23 09:29:05 +00:00
Mark Johnston
b4a3f67bd6 Add a little helper program for tst.exitcore.ksh.
sleep(1) is capsicumized, which means that we cannot rely on it to dump
core as required by the test.

MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2017-05-22 20:34:51 +00:00
Alexander Motin
c9bcbf3800 Fix time handling in cv_timedwait_hires().
pthread_cond_timedwait() receives absolute time, not relative.  Passing
wrong time there caused two threads of zdb to spin in a tight loop.

MFC after:	1 week
2017-05-19 05:12:58 +00:00
Mark Johnston
9ebaf5f802 Remove the EXFAIL annotation for tests which pass as of r309596.
Reported by:	bdrewery
Sponsored by:	Dell EMC Isilon
2017-05-19 00:25:09 +00:00
Josh Paetzel
c78abb8b50 MFV 316894
7252 7628 compressed zfs send / receive

illumos/illumos-gate@5602294fda
5602294fda

https://www.illumos.org/issues/7252
  This feature includes code to allow a system with compressed ARC enabled to
  send data in its compressed form straight out of the ARC, and receive data in
  its compressed form directly into the ARC.

https://www.illumos.org/issues/7628
  We should have longer, more readable versions of the ZFS send / recv options.

7628 create long versions of ZFS send / receive options

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: David Quigley <dpquigl@davequigley.com>
Reviewed by: Thomas Caputi <tcaputi@datto.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Dan Kimmel <dan.kimmel@delphix.com>
2017-04-25 17:57:43 +00:00
Josh Paetzel
ef18459108 MFV 316891
7386 zfs get does not work properly with bookmarks

illumos/illumos-gate@edb901aab9
edb901aab9

https://www.illumos.org/issues/7386
  The zfs get command does not work with the bookmark parameter while it works
  properly with both filesystem and snapshot:
  # zfs get -t all -r creation rpool/test
  NAME               PROPERTY  VALUE                  SOURCE
  rpool/test         creation  Fri Sep 16 15:00 2016  -
  rpool/test@snap    creation  Fri Sep 16 15:00 2016  -
  rpool/test#bkmark  creation  Fri Sep 16 15:00 2016  -
  # zfs get -t all -r creation rpool/test@snap
  NAME             PROPERTY  VALUE                  SOURCE
  rpool/test@snap  creation  Fri Sep 16 15:00 2016  -
  # zfs get -t all -r creation rpool/test#bkmark
  cannot open 'rpool/test#bkmark': invalid dataset name
  #
  The zfs get command should be modified to work properly with bookmarks too.

Reviewed by: Simon Klinkert <simon.klinkert@gmail.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Marcel Telka <marcel@telka.sk>
2017-04-21 19:53:52 +00:00
Alan Somers
07bb15b440 MFV 316855
7900 zdb shouldn't print the path of a znode at verbosity < 5

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Alan Somers <asomers@freebsd.org>

illumos/illumos-gate@e548d2fa41
https://www.illumos.org/issues/7900

MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
2017-04-14 16:30:37 +00:00
Andriy Gapon
5ad79d9b20 dtrace: fix normalization of stddev aggregation
To be upstreamed.

Discussed with:	Bryan Cantrill <bryancantrill@gmail.com>
MFC after:	2 weeks
Sponsored by:	Panzura
2017-04-14 15:31:04 +00:00
Pedro F. Giffuni
9e98194646 MFV r316693:
8046 Let calloc() do the multiplication in libzfs_fru_refresh

5697e03e6e

https://www.illumos.org/issues/8046

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author:	Pedro Giffuni <pfg@freebsd.org>

MFC after:	3 days
2017-04-10 22:56:38 +00:00
Alexander Motin
3aef5b286a MFV r315290, r315291: 7303 dynamic metaslab selection
illumos/illumos-gate@8363e80ae7
https://github.com/illumos/illumos-gate/commit/8363e80ae72609660f6090766ca8c2c18

https://www.illumos.org/issues/7303

  This change introduces a new weighting algorithm to improve metaslab selection.
  The new weighting algorithm relies on the SPACEMAP_HISTOGRAM feature. As a result,
  the metaslab weight now encodes the type of weighting algorithm used
  (size-based vs segment-based).

  This also introduce a new allocation tracing facility and two new dcmds to help
  debug allocation problems. Each zio now contains a zio_alloc_list_t structure
  that is populated as the zio goes through the allocations stage. Here's an
  example of how to use the tracing facility:

> c5ec000::print zio_t io_alloc_list | ::walk list | ::metaslab_trace
  MSID    DVA    ASIZE      WEIGHT             RESULT               VDEV
     -      0      400           0    NOT_ALLOCATABLE           ztest.0a
     -      0      400           0    NOT_ALLOCATABLE           ztest.0a
     -      0      400           0             ENOSPC           ztest.0a
     -      0      200           0    NOT_ALLOCATABLE           ztest.0a
     -      0      200           0    NOT_ALLOCATABLE           ztest.0a
     -      0      200           0             ENOSPC           ztest.0a
     1      0      400      1 x 8M            17b1a00           ztest.0a

> 1ff2400::print zio_t io_alloc_list | ::walk list | ::metaslab_trace
  MSID    DVA    ASIZE      WEIGHT             RESULT               VDEV
     -      0      200           0    NOT_ALLOCATABLE           mirror-2
     -      0      200           0    NOT_ALLOCATABLE           mirror-0
     1      0      200      1 x 4M            112ae00           mirror-1
     -      1      200           0    NOT_ALLOCATABLE           mirror-2
     -      1      200           0    NOT_ALLOCATABLE           mirror-0
     1      1      200      1 x 4M            112b000           mirror-1
     -      2      200           0    NOT_ALLOCATABLE           mirror-2

  If the metaslab is using segment-based weighting then the WEIGHT column will
  display the number of segments available in the bucket where the allocation
  attempt was made.

Author: George Wilson <george.wilson@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Chris Siden <christopher.siden@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Don Brady <don.brady@intel.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
2017-03-24 09:37:00 +00:00
Enji Cooper
acc37ca1c1 cddl: normalize paths using SRCTOP-relative paths or :H when possible
This simplifies make logic/output

While here, remove bogus CFLAGS which look for headers in cddl/lib/libumem.
There aren't any source files there (just Makefiles)

MFC after:	1 month
Sponsored by:	Dell EMC Isilon
2017-03-04 11:30:04 +00:00
Mark Johnston
d935f34b8f Fix memory leaks in error cases in libdtrace.
Submitted by:	Tom Rix <trix@juniper.net>
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D9705
2017-02-23 17:56:24 +00:00
Mark Johnston
74d9553e43 Fix a memory leak in an error case in libctf.
Submitted by:	Tom Rix <trix@juniper.net>
MFC after:	1 week
2017-02-23 17:54:17 +00:00
Mark Johnston
281e4f2dd4 When patching USDT probes, use non-unique names for aliases of weak symbols.
Aliases are normally given names that include a key that's unique for each
input object file. This, for example, ensures that aliases for identically
named local symbols in different object files don't conflict. However, in
some cases the static linker will leave an undefined alias after merging
identical weak symbols, resulting in a link error. A non-unique name allows
the aliases to be merged as well.

PR:		216871
X-MFC With:	r313262
2017-02-10 02:01:32 +00:00
Mark Johnston
35bf9feb41 Search for _DTRACE_VERSION in sys/sdt.h rather than unistd.h.
MFC after:	1 week
2017-02-05 02:45:35 +00:00
Mark Johnston
55c2fd519f Avoid using Sun compiler-specific flags.
MFC after:	1 week
2017-02-05 02:44:48 +00:00
Mark Johnston
273efb05a2 Fix a double free of libelf data buffers in the USDT link code.
libdtrace needs to append to the input object files' string and symbol
tables. Currently it does so by allocating a larger buffer, copying the
existing sections into them, and swapping pointers in the libelf data
descriptors. However, it also frees those buffers when its processing is
complete, which leads to a double free since the elftoolchain libelf
owns them and also frees them in elf_end(3). Instead, free the buffers
originally allocated by libelf.

MFC after:	2 weeks
2017-02-05 02:44:08 +00:00
Mark Johnston
e801af6fba Use PC-relative relocations for USDT probe sites on i386 and amd64.
When recording probe site addresses in the output DOF file, dtrace -G
needs to emit relocations for the .SUNW_dof section in order to obtain
the addresses of functions containing probe sites. DTrace expects the
addresses to be relative to the base address of the final ELF file,
and the amd64 USDT implementation was relying on some unspecified and
incorrect behaviour in the base system GNU ld to achieve this.

This change reimplements the probe site relocation handling to allow
USDT to be used with lld and newer GNU binutils. Specifically, it
makes use of R_X86_64_PC64/R_386_PC32 relocations to obtain the
probe site address relative to the DOF file address, and adds and uses a
new DOF relocation type which computes the final probe site address using
these relative offsets.

Reported by and discussed with:	Rafael Espíndola
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D9374
2017-02-05 02:39:12 +00:00
George V. Neville-Neil
82988b50a1 Add an mbuf to ipinfo_t translator to finish cleanup of mbuf passing to TCP probes.
Reviewed by:	markj
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D9401
2017-02-01 19:33:00 +00:00
Alan Somers
9df99d6ed0 Fix an unchecked return value in zfsd
It's pretty unlikely to actually hit this, but good to check it anyway

Reported by:	Coverity
CID:		1362018
MFC after:	4 weeks
Sponsored by:	Spectra Logic Corp
2017-01-18 22:10:18 +00:00
Andrey V. Elsukov
4349a2145f Convert ipv4_flags and ipv4_offset fields into host byte order.
Also save only high bits in the ipv4_flags, because it is defined
as uint8_t. So now it will show DF and MF flags as 0x40 and 0x20.

Reviewed by:	markj@
MFC after:	1 week
2016-12-29 20:27:54 +00:00
Mark Johnston
3a3e3279e7 Avoid modifying the object string table when patching USDT probes.
dtrace converts pairs of consecutive underscores in a probe name to dashes.
When dtrace -G processes relocations corresponding to USDT probe sites, it
performs this conversion on the corresponding symbol names prior to looking
up the resulting probe names in the USDT provider definition. However, in
so doing it would actually modify the input object's string table, which
breaks the string suffix merging done by recent binutils. Because we don't
care about the symbol name once the probe site is recorded, just perform the
probe lookup using a temporary copy.

Reported by:	hrs, swills
MFC after:	3 weeks
2016-12-20 18:25:41 +00:00
Mark Johnston
a46000e8ff Consistently print D variable indices in decimal when disassembling.
MFC after:	1 week
2016-12-20 05:45:52 +00:00
Mark Johnston
90556839e9 Skip a ustack test that triggers an assertion on INVARIANTS kernels.
Reported by:	ngie
Sponsored by:	Dell EMC Isilon
X-MFC-With:	r309698
2016-12-14 19:01:43 +00:00
Mark Johnston
d78e7e3b58 err.D_PROC_CREATEFAIL.many.d passes, so remove the EFAIL annotation.
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2016-12-08 18:56:35 +00:00
Mark Johnston
3c606f671e Use the correct path to date(1).
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2016-12-07 23:38:18 +00:00
Mark Johnston
058f5a9a47 Use the native data model instead of forcing ILP32 in tst.provregex3.ksh.
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2016-12-07 23:37:51 +00:00
Mark Johnston
6b91eae9d2 Run DTrace test scripts with "tst" set to the test script file name.
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2016-12-07 23:36:15 +00:00
Mark Johnston
e0d70fc1dc libdtrace: Don't use a read-only handle for enumerating pid probes.
Enumeration of return probes involves disassembling subroutines in the
target process, and ptrace(2) is currently used to read from the target
process. libproc could read from the backing file instead to avoid this
problem, but in the common case libdtrace will have a writeable handle
on the process anyway. In particular, a writeable handle is needed to list
USDT probes, and libdtrace will cache such a handle for processes that it
controls via dtrace -c and -p.
2016-12-06 04:28:56 +00:00
Mark Johnston
b043b5dc6b libproc: Add support for some proc_attach() flags.
This change adds some handling for the equivalent of Solaris' PGRAB_*
flags. In particular, support for PGRAB_RDONLY is needed to avoid a
nasty deadlock: dtrace(1) may otherwise stop the master process for its
pseudo-terminal and end up blocking while writing to standard output.
2016-12-06 04:22:38 +00:00
Mark Johnston
d42df2a447 libproc: Match prefixes when looking up mapped object by name.
When looking up an object by name, allow prefix matches if no direct match
is found. This allows one to, for example, match libc entry probes with:

 # dtrace -n 'pid$target:libc.so::entry' -c ./foo

instead of requiring "libc.so.7" or a glob.

Also remove proc_obj2map() as it currently just duplicates the
functionality of proc_name2map(). It's supposed to take a Solaris
link-map ID as a paramter, but support for this isn't implemented and
isn't required to support DTrace's pid provider.
2016-12-06 04:20:32 +00:00
Pedro F. Giffuni
69718b786d Revert r253678, r253661:
Fix a segfault in ctfmerge(1) due to a bug in GCC.

The change was correct and the bug real, but upstream didn't adopt it
and we want to remain in sync. When/if upstream does something about it
we can bring their version.

The bug in question was fixed in GCC 4.9 which is now the default in
FreeBSD's ports. Our native gcc-4.2, which is still in use in some Tier-2
platforms also has a workaround so no end-user should be harmed by the
revert.
2016-12-03 17:44:43 +00:00
Andriy Gapon
61158a7ce8 MFV r308989: 6428 set canmount=off on unmounted filesystem tries to
unmount children

This is a cosmetic and bookkeeping change as the actual change is
already in FreeBSD.
See r297521, r304520, r308985.
2016-11-24 10:11:09 +00:00
Andriy Gapon
9170c18bb9 revert r304520, set canmount=on is not supposed to mount the filesystem
Not sure where I got the idea that it should.

See https://github.com/openzfs/openzfs/pull/218

Reported by:	mahrens
Pointyhat to:	avg
MFC after:	5 days
2016-11-22 11:44:30 +00:00
Alexander Motin
14b5719f6a After some ZIL changes 6 years ago zil_slog_limit got partially broken
due to zl_itx_list_sz not updated when async itx'es upgraded to sync.
Actually because of other changes about that time zl_itx_list_sz is not
really required to implement the functionality, so this patch removes
some unneeded broken code and variables.

Original idea of zil_slog_limit was to reduce chance of SLOG abuse by
single heavy logger, that increased latency for other (more latency critical)
loggers, by pushing heavy log out into the main pool instead of SLOG. Beside
huge latency increase for heavy writers, this implementation caused double
write of all data, since the log records were explicitly prepared for SLOG.
Since we now have I/O scheduler, I've found it can be much more efficient
to reduce priority of heavy logger SLOG writes from ZIO_PRIORITY_SYNC_WRITE
to ZIO_PRIORITY_ASYNC_WRITE, while still leave them on SLOG.

Existing ZIL implementation had problem with space efficiency when it
has to write large chunks of data into log blocks of limited size. In some
cases efficiency stopped to almost as low as 50%. In case of ZIL stored on
spinning rust, that also reduced log write speed in half, since head had to
uselessly fly over allocated but not written areas. This change improves
the situation by offloading problematic operations from z*_log_write() to
zil_lwb_commit(), which knows real situation of log blocks allocation and
can split large requests into pieces much more efficiently. Also as side
effect it removes one of two data copy operations done by ZIL code WR_COPIED
case.

While there, untangle and unify code of z*_log_write() functions.
Also zfs_log_write() alike to zvol_log_write() can now handle writes crossing
block boundary, that may also improve efficiency if ZPL is made to do that.

Sponsored by:	iXsystems, Inc.
2016-11-17 21:01:27 +00:00
Mark Johnston
e20727061f Define dependencies for some auto-generated source files in libdtrace.
Remove an unneeded beforedepend rule.

Reported by:	emaste, jhb
Reviewed by:	bdrewery
MFC after:	1 week
2016-11-17 18:13:42 +00:00
Mark Johnston
375c8b20dc Remove the DTrace printt and typeref actions.
These are FreeBSD-specific and were added in r178576 to provide the ability
to pretty-print instances of compound types. However, the print action has
long since been augmented to provide this functionality with a simpler
interface.

Discussed with:	gnn
Differential Revision:	https://reviews.freebsd.org/D8478
2016-11-12 19:26:12 +00:00
Andriy Gapon
d5315b02cd MFV r308222: 6051 lzc_receive: allow the caller to read the begin record
illumos/illumos-gate@620f322510
620f322510

https://www.illumos.org/issues/6051
  Currently lzc_receive() requires that its snapname argument is a snapshot name
  (contains '@').
  zfs receive allows to specify just a dataset name and would try to deduce the
  snapshot name from the stream.
  I propose to allow lzc_receive() to do the same.
  That seems to be quite easy to implement, it requires only a small amount of
  logic, it does not require any additional system calls or any additional data
  from the stream.
  The benefit is that the new behavior would allow to keep the snapshot names the
  same between the sender and receiver at zero cost, without a need to pass the
  names out of band.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Andriy Gapon <avg@icyb.net.ua>
MFC after:	2 weeks
2016-11-03 09:24:27 +00:00
Andriy Gapon
97371ba2a9 zfsbootcfg: a simple tool to set next boot (one time) options for zfsboot
(gpt)zfsboot will read one-time boot directives from a special ZFS pool
area.  The area was previously described as "Boot Block Header", but
currently it is know as Pad2, marked as reserved and is zeroed out on
pool creation.  The new code interprets data in this area, if any, using
the same format as boot.config.  The area is immediately wiped out.
Failure to parse the directives results in a reboot right after the
cleanup.  Otherwise the boot sequence proceeds as usual.

zfsbootcfg writes zfsboot arguments specified on its command line to the
Pad2 area of a disk identified by vfs.zfs.boot.primary_pool and
vfs.zfs.boot.primary_vdev kenv variables that are set by loader during
boot.  Please see the manual page for more.

Thanks to all who reviewed, contributed and made suggestions!  There are
many potential improvements to the feature, please see the review for
details.

Reviewed by:	wblock (docs)
Discussed with:	jhb, tsoome
MFC after:	3 weeks
Relnotes:	yes
Differential Revision: https://reviews.freebsd.org/D7612
2016-10-29 14:09:32 +00:00