Commit Graph

554 Commits

Author SHA1 Message Date
Andriy Gapon
2ccc5f6fd1 5379 modifying a mmap()-ed file does not update its timestamps
illumos/illumos-gate@80e10fd0d2
80e10fd0d2

https://www.illumos.org/issues/5379
  The following is based on a review of the illumos code and on a similar problem
  reported for FreeBSD where the relevant code is different.
  Looking at this block of code http://src.illumos.org/source/xref/illumos-gate/
  usr/src/uts/common/fs/zfs/zfs_vnops.c#4187 I see code to set up an
  sa_bulk_attr_t object, I see code to set up mtime and ctime values, but I do
  not see code to actually apply the attributes...
  I would expect there to be a call to sa_bulk_update(), there is such a call in
  zfs_write() for instance.
  mmap_write.c [Magnifier] - demo (1.42 KB) Andriy Gapon, 2015-11-11 01:53 PM

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prashanth Sreenivasa <pks@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Approved by: Gordon Ross <gordon.w.ross@gmail.com>
Author: Andriy Gapon <andriy.gapon@clusterhq.com>
2017-04-14 18:38:21 +00:00
Andriy Gapon
847b286425 7955 libshare needs to initialize only those datasets being modified by the consumer
illumos/illumos-gate@8a981c3356
8a981c3356

https://www.illumos.org/issues/7955
  Libshare currently initializes all available filesystems when doing any
  libshare operation. This requires iterating through all the filesystem
  multiple times, which is a huge performance problem for sharing and
  unsharing operations.

Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Yuri Pankov <yuri.pankov@gmail.com>
Approved by: Gordon Ross <gordon.w.ross@gmail.com>
Author: Daniel Hoffman <dj.hoffman@delphix.com>
2017-04-14 18:34:03 +00:00
Andriy Gapon
b7db959db3 6101 attempt to lzc_create() a filesystem under a volume results in a panic
illumos/illumos-gate@b127fe3c05
b127fe3c05

https://www.illumos.org/issues/6101
  lzc_create(), or more correctly, zfs_ioc_create() does not reject an attempt to
  create a filesystem as a child of a volume, instead it proceeds to a crash.
  A crash stack obtained on FreeBSD:
  page fault while in kernel mode

  zap_leaf_lookup()
  fzap_lookup()
  zap_lookup_norm()
  zap_lookup()
  zfs_get_zplprop()
  zfs_fill_zplprops_impl()
  zfs_ioc_create()
  zfsdev_ioctl()
  devfs_ioctl_f()
  kern_ioctl()
  sys_ioctl()
  This crash happened with a kernel without debugging assertions.
  The immediate cause of crash appears to an attempt to interpret a zvol object
  as a zap object.
  For filesystems:
  #define MASTER_NODE_OBJ 1
  For zvols:
  #define ZVOL_OBJ                1ULL
  #define ZVOL_ZAP_OBJ            2ULL
  So, I see two problems here:
     1. an attempt to create a filesystem under a zvol should be rejected as
        early as possible, maybe in zfs_fill_zplprops()
     2. maybe zap_lookup / zap_lockdir should reject objects that are not of one
        of the zap object types

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Andriy Gapon <avg@FreeBSD.org>
2017-04-14 18:33:20 +00:00
Andriy Gapon
d72c1a5bb1 8061 sa_find_idx_tab can be declared more type-safely
illumos/illumos-gate@7f0bdb4257
7f0bdb4257

https://www.illumos.org/issues/8061
  sa_find_idx_tab() is declared as taking and returning "void *" parameters.
  These can be declared to be the specific types.

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Chris Williamson <chris.williamson@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Matthew Ahrens <mahrens@delphix.com>
2017-04-14 18:32:38 +00:00
Andriy Gapon
0843b4c8ae 8026 retire zfs_throttle_delay and zfs_throttle_resolution
illumos/illumos-gate@6b03625981
6b03625981

https://www.illumos.org/issues/8026
  zfs_throttle_delay and zfs_throttle_resolution became disused since the new
  write throttling mechanism was introduced.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Andriy Gapon <avg@FreeBSD.org>
2017-04-14 18:32:12 +00:00
Andriy Gapon
c2745459d3 5380 receive of a send -p stream doesn't need to try renaming snapshots
illumos/illumos-gate@471a88e499
471a88e499

https://www.illumos.org/issues/5380
  A stream created with zfs send -p -I contains properties of all snapshots of a
  given dataset as opposed to only properties of snapshots in a given range.
  Not only this is suboptimal but the receive code also does not filter
  properties by the range. So, properties of earlier snapshots would be updated
  even though the snapshots themselves are not in the stream (just their
  properties).
  Given that modifying the snapshot properties requires a TXG sync and that the
  snapshots are updated one by one the described behavior may lead to a sever
  performance penalty.

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Andriy Gapon <avg@FreeBSD.org>
2017-04-14 18:30:22 +00:00
Andriy Gapon
45967327e2 8027 tighten up dsl_pool_dirty_delta
illumos/illumos-gate@313ae1e182
313ae1e182

https://www.illumos.org/issues/8027
  dsl_pool_dirty_delta() should not wake up waiters when dp->dp_dirty_total ==
  zfs_dirty_data_max, because they wait for dp_dirty_total to fall strictly below
  the threshold.
  It's probably very rare for that condition to occur, but it's better to have
  more accurate code.

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Andriy Gapon <avg@FreeBSD.org>
2017-04-14 18:29:35 +00:00
Andriy Gapon
bf644e9b6f 8023 Panic destroying a metaslab deferred range tree
illumos/illumos-gate@3991b535a8
3991b535a8

https://www.illumos.org/issues/8023
       $C
  ffffff0011bc0970 vpanic()
  ffffff0011bc0a00 strlog()
  ffffff0011bc0a30 range_tree_destroy+0x72(ffffff043769ad00)
  ffffff0011bc0a70 metaslab_fini+0xd5(ffffff0449acf380)
  ffffff0011bc0ab0 vdev_metaslab_fini+0x56(ffffff0462bae800)
  ffffff0011bc0af0 spa_unload+0x9b(ffffff03e3dac000)
  ffffff0011bc0b70 spa_export_common+0x115(ffffff047f4b4000, 2, 0, 0, 0)
  ffffff0011bc0b90 spa_destroy+0x1d(ffffff047f4b4000)
  ffffff0011bc0bd0 zfs_ioc_pool_destroy+0x20(ffffff047f4b4000)
  ffffff0011bc0c80 zfsdev_ioctl+0x4d7(11400000000, 5a01, 8040190, 100003,
  ffffff03e1956b10, ffffff0011bc0e68)
  ffffff0011bc0cc0 cdev_ioctl+0x39(11400000000, 5a01, 8040190, 100003,
  ffffff03e1956b10, ffffff0011bc0e68)
  ffffff0011bc0d10 spec_ioctl+0x60(ffffff03d9153b00, 5a01, 8040190, 100003,
  ffffff03e1956b10, ffffff0011bc0e68, 0)
  ffffff0011bc0da0 fop_ioctl+0x55(ffffff03d9153b00, 5a01, 8040190, 100003,
  ffffff03e1956b10, ffffff0011bc0e68, 0)
  ffffff0011bc0ec0 ioctl+0x9b(3, 5a01, 8040190)
  ffffff0011bc0f10 _sys_sysenter_post_swapgs+0x149()

Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: George Wilson <george.wilson@delphix.com>
2017-04-14 18:29:13 +00:00
Andriy Gapon
99988b024d 7885 zpool list can report 16.0e for expandsz
illumos/illumos-gate@c040c10cdd
c040c10cdd

https://www.illumos.org/issues/7885
  When a member of a RAIDZ has been replaced with a device smaller than the
  original, then the top level vdev can report its expand size as 16.0E.
  The reduced child asize causes the RAIDZ to have a vdev_asize lower than its
  vdev_max_asize which then results in an underflow during the calculation of the
  parents expand size.
  Also for RAIDZ vdevs the sum of their child vdev_min_asize could be smaller
  than the parents vdev_min_size.
  Fixed by: https://github.com/openzfs/openzfs/pull/296

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Gordon Ross <gordon.w.ross@gmail.com>
Author: Steven Hartland <steven.hartland@multiplay.co.uk>
2017-04-14 18:28:40 +00:00
Andriy Gapon
2d3fcc82a1 7990 libzfs: snapspec_cb() does not need to call zfs_strdup()
illumos/illumos-gate@d8584ba6fb
d8584ba6fb

https://www.illumos.org/issues/7990
  The snapspec_cb() callback function in libzfs does not need to call zfs_strdup().

Reviewed by: Yuri Pankov <yuri.pankov@gmail.com>
Reviewed by: Toomas Soome <tsoome@me.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Marcel Telka <marcel@telka.sk>
2017-04-14 18:27:12 +00:00
Andriy Gapon
318cd645a2 7968 multi-threaded spa_sync()
illumos/illumos-gate@94c2d0eb22
94c2d0eb22

https://www.illumos.org/issues/7968
  spa_sync() iterates over all the dirty dnodes and processes each of them by
  calling dnode_sync(). If there are many dirty dnodes (e.g. because we created
  or removed a lot of files), the single thread of spa_sync() calling dnode_sync
  () can become a bottleneck. Additionally, if many dnodes are dirtied
  concurrently in open context (e.g. due to concurrent file creation), the
  os_lock will experience lock contention via dnode_setdirty().
  The solution is to track dirty dnodes on a multilist_t, and for spa_sync() to
  use separate threads to process each of the sublists in the multilist.
  On the concurrent file creation microbenchmark, the performance improvement
  from dnode_setdirty() is up to 7%. Additionally, the wall clock time spent in
  spa_sync() is reduced to 15%-40% of the single-threaded case. In terms of cost/
  reward, once the other bottlenecks are addressed, fixing this bug will provide
  a medium-large performance gain and require a medium amount of effort to
  implement.

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Matthew Ahrens <mahrens@delphix.com>
2017-04-14 18:26:24 +00:00
Andriy Gapon
260aaa4c3d 7970 zfs_arc_num_sublists_per_state should be common to all multilists
illumos/illumos-gate@10fbdecb05
10fbdecb05

https://www.illumos.org/issues/7970
  The global tunable zfs_arc_num_sublists_per_state is used by the ARC and
  the dbuf cache, and other users are planned. We should change this
  tunable to be common to all multilists.

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Matthew Ahrens <mahrens@delphix.com>
2017-04-14 18:25:49 +00:00
Andriy Gapon
bc390f4947 7801 add more by-dnode routines (lint)
illumos/illumos-gate@411be58a6e
411be58a6e

https://www.illumos.org/issues/7801
  Add *_by_dnode() routines for accessing objects given their
  dnode_t *, this is more efficient than accessing the object by
  (objset_t *, uint64_t object). This change converts some but
  not all of the existing consumers. As performance-sensitive
  code paths are discovered they should be converted to use
  these routines.
  Ported from: https://github.com/zfsonlinux/zfs/commit/
  0eef1bde31d67091d3deed23fe2394f5a8bf2276

Author: Matthew Ahrens <mahrens@delphix.com>
2017-04-14 18:25:22 +00:00
Andriy Gapon
fd870ba29a 7801 add more by-dnode routines
illumos/illumos-gate@b0c42cd470
b0c42cd470

https://www.illumos.org/issues/7801
  Add *_by_dnode() routines for accessing objects given their
  dnode_t *, this is more efficient than accessing the object by
  (objset_t *, uint64_t object). This change converts some but
  not all of the existing consumers. As performance-sensitive
  code paths are discovered they should be converted to use
  these routines.
  Ported from: https://github.com/zfsonlinux/zfs/commit/
  0eef1bde31d67091d3deed23fe2394f5a8bf2276

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: bzzz77 <bzzz.tomas@gmail.com>
2017-04-14 18:25:02 +00:00
Andriy Gapon
c9327351e8 7869 panic in bpobj_space(): null pointer dereference
illumos/illumos-gate@a3905a4592
a3905a4592

https://www.illumos.org/issues/7869
  The issue fixed by this patch is a race condition in the deadlist code.
  A thread executing an administrative command that uses
  `dsl_deadlist_space_range()` holds the lock of the whole `deadlist_t` to
  protect the access of all its entries that the deadlist contains in an
  avl tree.
  Sync threads trying to insert a new entry in the deadlist
  (through `dsl_deadlist_insert()` -> `dle_enqueue()`) do not hold the
  deadlist lock at that moment. If the `dle_bpobj` is the empty bpobj (our
  sentinel value), we close and reopen it. Between these two operations,
  it is possible for the `dsl_deadlist_space_range()` thread to dereference
  that bpobj which is `NULL` during that window.
  Threads should hold the a deadlist's `dl_lock` when they manipulate its
  internal data so scenarios like the one above are avoided. In addition,
  threads should also hold the bpobj lock whenever they are allocating the
  subobj list of a bpobj, and not just when they actually insert the subobj
  to the list. This way we can avoid potential memory leaks.

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: George Melikov <mail@gmelikov.ru>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Serapheim Dimitropoulos <serapheim@delphix.com>
2017-04-14 18:24:38 +00:00
Andriy Gapon
1ea6ff796d 7793 ztest fails assertion in dmu_tx_willuse_space
illumos/illumos-gate@61e255ce72
61e255ce72

https://www.illumos.org/issues/7793
  Background information: This assertion about tx_space_* verifies that we
  are not dirtying more stuff than we thought we would. We “need” to know
  how much we will dirty so that we can check if we should fail this
  transaction with ENOSPC/EDQUOT, in dmu_tx_assign(). While the
  transaction is open (i.e. between dmu_tx_assign() and dmu_tx_commit() —
  typically less than a millisecond), we call dbuf_dirty() on the exact
  blocks that will be modified. Once this happens, the temporary
  accounting in tx_space_* is unnecessary, because we know exactly what
  blocks are newly dirtied; we call dnode_willuse_space() to track this
  more exact accounting.
  The fundamental problem causing this bug is that dmu_tx_hold_*() relies
  on the current state in the DMU (e.g. dn_nlevels) to predict how much
  will be dirtied by this transaction, but this state can change before we
  actually perform the transaction (i.e. call dbuf_dirty()).
  This bug will be fixed by removing the assertion that the tx_space_*
  accounting is perfectly accurate (i.e. we never dirty more than was
  predicted by dmu_tx_hold_*()). By removing the requirement that this
  accounting be perfectly accurate, we can also vastly simplify it, e.g.
  removing most of the logic in dmu_tx_count_*().
  The new tx space accounting will be very approximate, and may be more or
  less than what is actually dirtied. It will still be used to determine
  if this transaction will put us over quota. Transactions that are marked
  by dmu_tx_mark_netfree() will be excepted from this check. We won’t make
  an attempt to determine how much space will be freed by the transaction
  — this was rarely accurate enough to determine if a transaction should
  be permitted when we are over quota, which is why dmu_tx_mark_netfree()
  was introduced in 2014.
  We also won’t attempt to give “credit” when overwriting existing blocks,
  if those blocks may be freed. This allows us to remove the
  do_free_accounting logic in dbuf_dirty(), and associated routines. This

Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
2017-04-14 18:24:03 +00:00
Andriy Gapon
59dc4d6bbc 7816 remove static unused variable in zfs_vfsops.c
illumos/illumos-gate@2e972bf18f
2e972bf18f

https://www.illumos.org/issues/7816
  found by gcc6 build unused static variable:
  -static const fs_operation_def_t zfs_vfsops_eio_template[] = {
  -    VFSNAME_FREEVFS,    { .vfs_freevfs =  zfs_freevfs },
  -    NULL,            NULL
  -};

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Andy Stormont astormont@racktopsystems.com
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Igor Kozhukhov <igork@argotech.io>
2017-04-14 18:23:18 +00:00
Andriy Gapon
d3f85b54a4 7812 Remove gender specific language
illumos/illumos-gate@48bbca8168
48bbca8168

https://www.illumos.org/issues/7812
  This change removes all gendered language that did not refer specifically
  to an individual person or pet. The convention taken was to use
  variations on "they" when referring to users and/or human beings, while
  using "it" when referring to code, functions, and/or libraries.
  Additionally, we took the liberty to fix up any whitespace issues that
  were found in any files that were already being modified.

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: Chris Williamson <chris.williamson@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Daniel Hoffman <dj.hoffman@delphix.com>
2017-04-14 18:22:42 +00:00
Andriy Gapon
b6187af16b 7803 want devid_str_from_path(3devid)
illumos/illumos-gate@46d46cd4fa
46d46cd4fa

https://www.illumos.org/issues/7803
  Make get_devid() from libzfs a public function in libdevid, as its pretty
  usable in other places and duplicating all the logic required to get string
  encoded devid from path seems counter-productive.

Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Jason King <jason.brian.king@gmail.com>
Reviewed by: Marcel Telka <marcel@telka.sk>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Yuri Pankov <yuri.pankov@nexenta.com>
2017-04-14 18:21:58 +00:00
Andriy Gapon
e7a96b6403 7541 zpool import/tryimport ioctl returns ENOMEM because provided buffer is too small for config
illumos/illumos-gate@8b65a70b76
8b65a70b76

https://www.illumos.org/issues/7541
  When calling zpool import, zpool does a few ioctls to ZFS.
  zpool allocates a buffer in userland and passes it to the kernel so that ZFS
  can copy info into it. ZFS will use it to put the nvlist that describes the
  pool configuration.
  If the allocated buffer is too small, ZFS will return ENOMEM and the call will
  have to be redone. This wastes CPU time and slows down the import process. This
  happens very often for the ZFS_IOC_POOL_TRYIMPORT call.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>
2017-04-14 18:20:56 +00:00
Andriy Gapon
b71204d77d 1300 filename normalization doesn't work for removes
illumos/illumos-gate@1c17160ac5
1c17160ac5

https://www.illumos.org/issues/1300
  Problem:
  We can create invisible file in ZFS.
  How to reproduce:
  0. Prepare normalization formD ZFS.
      # cat cat /etc/release
      cat: cat: No such file or directory
                   OpenIndiana Development oi_151 X86 (powered by illumos)
              Copyright 2011 Oracle and/or its affiliates. All rights reserved.
                              Use is subject to license terms.
                                 Assembled 28 April 2011
      # mkfile 100M 100M
      # zpool create -O utf8only=on -O normalization=formD test_pool $( pwd )/
  100M
      # zpool upgrade
      This system is currently running ZFS pool version 28.

      All pools are formatted using this version.
      # zfs get normalization test_pool
      NAME       PROPERTY       VALUE          SOURCE
      test_pool  normalization  formD          -
      # chmod 777 /test_pool

  1. Create a NFD file.
      $ cd /test_pool/
      $ cp /etc/release $( echo "\x75\xcc\x88" )
      $ ls -la
      total 4
      drwxrwxrwx  2 root  root    3 2011-07-29 08:53 .
      drwxr-xr-x 25 root  root   26 2011-07-29 08:53 ..
      -r--r--r--  1 test1 staff 251 2011-07-29 08:53 u?

Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Kevin Crowe <kevin.crowe@nexenta.com>
2017-04-14 18:20:20 +00:00
Andriy Gapon
be49e7b29b 4242 file rename event fires before the rename happens
illumos/illumos-gate@54207fd2e1
54207fd2e1

https://www.illumos.org/issues/4242
  From Joyent's OS-2557:
  So we're basically just doing a check here that after we got a 'rename' event
  for our file, the file has actually been moved out of the way.
  What I've seen in bh1-stage2 is that this happens as you'd expect for all zones
  but many times over the last week it's failed because when we do the fs.exists
  () check here, the file that we got the rename event for, still exists.
  I've confirmed what's happening using the following dtrace script:
  #!/usr/sbin/dtrace -s

  #pragma D option quiet
  #pragma D option bufsize=256k

  syscall::open:entry,
  syscall::open64:entry
  /copyinstr(arg0) == "/var/svc/provisioning" || (strlen(copyinstr(arg0)) == 69
  && substr(copyinstr(arg0), 48) == "/var/svc/provisioning")/
  {
      this->watching_open = 1;
      printf("%d zone %s process %s(%d) [%s] open(%s)\\n",
          timestamp,
          zonename,
          execname,
          pid,
          curpsinfo->pr_psargs,
          copyinstr(arg0));
  }

  syscall::open:return,
  syscall::open64:return
  /this->watching_open == 1/

Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Marcel Telka <marcel@telka.sk>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Jerry Jelinek <jerry.jelinek@joyent.com>
2017-04-14 18:19:48 +00:00
Andriy Gapon
9dfe195883 7740 fix for 6513 only works in hole punching case, not truncation
illumos/illumos-gate@7de35a3ed0
7de35a3ed0

https://www.illumos.org/issues/7740
  The problem is that dbuf_findbp will return ENOENT if the block it's
  trying to find is beyond the end of the file. If that happens, we assume
  there is no birth time, and so we lose that information when we write
  out new blkptrs. We should teach dbuf_findbp to look for things that are
  beyond the current end, but not beyond the absolute end of the file.
  To verify, create a large file, truncate it to a short length, and then
  write beyond the end. Check with zdb to make sure that there are no
  holes with birth time zero (will appear as gaps).

Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Paul Dagnelie <pcd@delphix.com>
2017-04-14 18:15:47 +00:00
Andriy Gapon
1e876f2a64 7729 libzfs_core`lzc_rollback() leaks result nvl
illumos/illumos-gate@ac428481f9
ac428481f9

https://www.illumos.org/issues/7729
  libzfs_core`lzc_rollback() doesn't free the result nvl after lzc_ioctl() call.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Yuri Pankov <yuri.pankov@nexenta.com>
2017-04-14 18:15:13 +00:00
Andriy Gapon
16244ff154 7779 clean up unused definitions in zfs ctldir code
illumos/illumos-gate@b7f9f60c8e
b7f9f60c8e

https://www.illumos.org/issues/7779
  zfsctl_ops_shares_dir and ZFSCTL_INO_SHARES are essentially dead code.
  While there, fix the index range check in zfsctl_root_inode_cb.

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Andriy Gapon <andriy.gapon@clusterhq.com>
2017-04-14 18:14:41 +00:00
Andriy Gapon
43fd8e6d5e 7745 print error if lzc_* is called before libzfs_core_init
illumos/illumos-gate@7c13517fff
7c13517fff

https://www.illumos.org/issues/7745
  The problem is that consumers of `libZFS_Core` that forget to call
  `libzfs_core_init()` before calling any other function of the library
  are having a hard time realizing their mistake. The library's internal
  file descriptor is declared as global static, which is ok, but it is not
  initialized explicitly; therefore, it defaults to 0, which is a valid
  file descriptor. If `libzfs_core_init()`, which explicitly initializes
  the correct fd, is skipped, the ioctl functions return errors that do
  not have anything to do with `libZFS_Core`, where the problem is
  actually located.
  Even though assertions for that existed within `libZFS_Core` for debug
  builds, they were never enabled because the `-DDEBUG` flag was missing
  from the compiler flags.
  This patch applies the following changes:
  1. It adds `-DDEBUG` for debug builds of `libZFS_Core` and `libzfs`,
         to enable their assertions on debug builds.
  2. It corrects an assertion within `libzfs`, where a function had
         been spelled incorrectly (`zpool_prop_unsupported()`) and nobody
         knew because the `-DDEBUG` flag was missing, and the preprocessor
         was taking that part of the code away.
  3. The library's internal fd is initialized to `-1` and `VERIFY`
         assertions have been placed to check that the fd is not equal to
         `-1` before issuing any ioctl. It is important here to note, that
         the `VERIFY` assertions exist in both debug and non-debug builds.
  4. In `libzfs_core_fini` we make sure to never increment the
         refcount of our fd below 0, and also reset the fd to `-1` when no
         one refers to it. The reason for this, is for the rare case that
         the consumer closes all references but then calls one of the
         library's functions without using `libzfs_core_init()` first, and
         in the mean time, a previous call to `open()` decided to reuse
         our previous fd. This scenario would have passed our assertion in

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Serapheim Dimitropoulos <serapheim@delphix.com>
2017-04-14 18:14:02 +00:00
Andriy Gapon
e90b7ce5ca 7730 libzfs`add_config() leaks config nvl when reading spare/l2cache devices
illumos/illumos-gate@105686550e
105686550e

https://www.illumos.org/issues/7730
  antares:root:~# mdb /usr/sbin/zpool
  > ::sysbp _exit
  > ::run import
     pool: data
       id: 2093977168778024605
    state: ONLINE
   action: The pool can be imported using its name or numeric identifier.
   config:

          data        ONLINE
            c6t0d0    ONLINE
            c6t1d0    ONLINE
          cache
            c6t2d0
  mdb: stop on entry to _exit
  mdb: target stopped at:
  0xfee556ba:     nop
  mdb: You've got symbols!
  Loading modules: [ ld.so.1 libumem.so.1 libc.so.1 libtopo.so.1 libavl.so.1
  libnvpair.so.1 ]
  > ::findleaks -d
  BYTES             LEAKED VMEM_SEG CALLER
  4096                  10 fda7b000 MMAP
  8192                   1 fea8d000 MMAP
  8192                   1 fe76d000 MMAP
  8192                   1 fe66e000 MMAP
  4096                   1 fe570000 MMAP
  8192                   1 fe470000 MMAP
  4096                   1 fe372000 MMAP
  4096                   1 fe273000 MMAP

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Yuri Pankov <yuri.pankov@nexenta.com>
2017-04-14 18:13:33 +00:00
Andriy Gapon
768d3677f6 7743 per-vdev-zaps have no initialize path on upgrade
illumos/illumos-gate@555da5111b
555da5111b

https://www.illumos.org/issues/7743
  When loading a pool that had been created before the existance of
  per-vdev zaps, on a system that knows about per-vdev zaps, the
  per-vdev zaps will not be allocated and initialized.
  This appears to be because the logic that would have done so, in
  spa_sync_config_object(), is not reached under normal operation. It is
  only reached if spa_config_dirty_list is non-empty.
  The fix is to add another `AVZ_ACTION_` enum that will allow this code
  to be reached when we detect that we're loading an old pool, even when
  there are no dirty configs.

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Don Brady <don.brady@intel.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Paul Dagnelie <pcd@delphix.com>
2017-04-14 18:13:06 +00:00
Andriy Gapon
2bb93dd84d 7659 Missing thread_exit() in dmu_send.c
illumos/illumos-gate@f2c1e9bc48
f2c1e9bc48

https://www.illumos.org/issues/7659
  Two threads send_traverse_thread() and receive_writer_thread() should
  end with thread_exit();
  Mostly a cosmetic issue under IllumOS.
  https://github.com/openzfs/openzfs/pull/252

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Jorgen Lundman <lundman@lundman.net>
2017-04-14 18:11:53 +00:00
Andriy Gapon
c5c6a287f1 7613 ms_freetree[4] is only used in syncing context
illumos/illumos-gate@5f14577801
5f14577801

https://www.illumos.org/issues/7613
  metaslab_t:ms_freetree[TXG_SIZE] is only used in syncing context. We should
  replace it with two trees: the freeing tree (ranges that we are freeing this
  syncing txg) and the freed tree (ranges which have been freed this txg).

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Matthew Ahrens <mahrens@delphix.com>
2017-04-14 18:11:16 +00:00
Andriy Gapon
26af00ef5b 7586 remove #ifdef __lint hack from dmu.h
illumos/illumos-gate@4ba5b96163
4ba5b96163

https://www.illumos.org/issues/7586
  The #ifdef __lint in dmu.h is ugly, and it would be nice not to duplicate it if
  we add other inline functions into header files in ZFS, especially since it is
  difficult to make any other solution work across all compilation targets. We
  should switch to disabling the lint flags that are failing instead.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Dan Kimmel <dan.kimmel@delphix.com>
2017-04-14 18:10:23 +00:00
Andriy Gapon
2bfa921051 7580 ztest failure in dbuf_read_impl
illumos/illumos-gate@1a01181fdc
1a01181fdc

https://www.illumos.org/issues/7580
  We need to prevent any reader whenever we're about the zero out all the
  blkptrs. To do this we need to grab the dn_struct_rwlock as writer in
  dbuf_write_children_ready and free_children just prior to calling bzero.

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: George Wilson <george.wilson@delphix.com>
2017-04-14 18:09:32 +00:00
Andriy Gapon
cdeb4e5f08 7606 dmu_objset_find_dp() takes a long time while importing pool
illumos/illumos-gate@7588687e6b
7588687e6b

https://www.illumos.org/issues/7606
  When importing a pool with a large number of filesystems within the same
  parent filesystem, we see that dmu_objset_find_dp() takes a long time.
  It is called from 3 places: spa_check_logs(), spa_ld_claim_log_blocks(),
  and spa_load_verify().
  There are several ways to improve performance here:
  1. We don't really need to do spa_check_logs() or
         spa_ld_claim_log_blocks() if the pool was closed cleanly.
  2. spa_load_verify() uses dmu_objset_find_dp() to check that no
         datasets have too long of names.
  3. dmu_objset_find_dp() is slow because it's doing
         zap_value_search() (which is O(N sibling datasets)) to determine
         the name of each dsl_dir when it's opened. In this case we
         actually know the name when we are opening it, so we can provide
         it and avoid the lookup.
  This change implements fix #3 from the above list; i.e. make
  dmu_objset_find_dp() provide the name of the dataset so that we don't
  have to search for it.

Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prashanth Sreenivasa <prashksp@gmail.com>
Approved by: Gordon Ross <gordon.w.ross@gmail.com>
Author: Matthew Ahrens <mahrens@delphix.com>
2017-04-14 18:08:50 +00:00
Andriy Gapon
b3264caf7b 7252 7628 compressed zfs send / receive
illumos/illumos-gate@5602294fda
5602294fda

https://www.illumos.org/issues/7252
  This feature includes code to allow a system with compressed ARC enabled to
  send data in its compressed form straight out of the ARC, and receive data in
  its compressed form directly into the ARC.

https://www.illumos.org/issues/7628
  We should have longer, more readable versions of the ZFS send / recv options.

7628 create long versions of ZFS send / receive options

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: David Quigley <dpquigl@davequigley.com>
Reviewed by: Thomas Caputi <tcaputi@datto.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Dan Kimmel <dan.kimmel@delphix.com>
2017-04-14 18:07:43 +00:00
Andriy Gapon
9c03f5f793 7604 if volblocksize property is the default, it displays as "-" rather than 8K
illumos/illumos-gate@4d86c0eab2
4d86c0eab2

https://www.illumos.org/issues/7604
  If a zvol has the default setting for the "volblocksize" property, it is
  8KB. However, it is displayed as "-" (not present), rather than "8K".
  The problem was introduced by:
  commit 25228e830e86924a41243343b1de9daf2d7dd43a
      Author: Matthew Ahrens &lt;mahrens@delphix.com&gt;
      Date:   Thu Nov 17 14:37:24 2016 -0800
  7571 non-present readonly numeric ZFS props do not have default value
  which changed changed get_numeric_property() to indicate that readonly
  default properties are not present. However, zfs_prop_readonly() returns
  TRUE for both readonly and set-once properties (e.g. volblocksize).
  Amusingly, that commit essentially reverted:
  6900484 default volblocksize is no longer being reported correctly
  from November 2009. However, that change was not correct either; the
  correct solution is to only do this check for "truly readonly" (i.e. not
  setonce) properties.
  $ zfs list -t volume -o name,volblocksize
      NAME
  VOLBLOCK
      domain0/group-100/appdata_container-101/appdata_windows_timeflow-102/
  archive            -
      domain0/group-100/appdata_container-101/appdata_windows_timeflow-102/
  datafile           -
      domain0/group-100/appdata_container-101/appdata_windows_timeflow-102/
  external           -
      rpool/dump
  128K
      rpool/swap
  4K
      rpool/swap1
  ===============================================================================

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
2017-04-14 18:05:20 +00:00
Andriy Gapon
a3c68fd244 7602 minor issues with zfs manpage
illumos/illumos-gate@5c262fd009
5c262fd009

https://www.illumos.org/issues/7602
  The line volblocksize=blocksize should just read volblocksize in the same
  rendering as the other properties in the same section.
  The zfs.1m man page renders one variant of unallow as
  zfs unallow [-r] -s -@setname [perm|@setname[,perm|@setname]...]
               filesystem|volume
  There is an extra dash preceeding @setname that should not be there.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Daniel Hoffman <dj.hoffman@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Sara Hartse <sara.hartse@delphix.com>
2017-04-14 18:02:25 +00:00
Andriy Gapon
cfaa3c3d8f 7386 zfs get does not work properly with bookmarks
illumos/illumos-gate@edb901aab9
edb901aab9

https://www.illumos.org/issues/7386
  The zfs get command does not work with the bookmark parameter while it works
  properly with both filesystem and snapshot:
  # zfs get -t all -r creation rpool/test
  NAME               PROPERTY  VALUE                  SOURCE
  rpool/test         creation  Fri Sep 16 15:00 2016  -
  rpool/test@snap    creation  Fri Sep 16 15:00 2016  -
  rpool/test#bkmark  creation  Fri Sep 16 15:00 2016  -
  # zfs get -t all -r creation rpool/test@snap
  NAME             PROPERTY  VALUE                  SOURCE
  rpool/test@snap  creation  Fri Sep 16 15:00 2016  -
  # zfs get -t all -r creation rpool/test#bkmark
  cannot open 'rpool/test#bkmark': invalid dataset name
  #
  The zfs get command should be modified to work properly with bookmarks too.

Reviewed by: Simon Klinkert <simon.klinkert@gmail.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Marcel Telka <marcel@telka.sk>
2017-04-14 18:01:43 +00:00
Andriy Gapon
4d9a21a808 7276 zfs(1m) manpage could better describe space properties (remove extra line)
illumos/illumos-gate@079d299664
079d299664

https://www.illumos.org/issues/7276
  The "used" and "written" properties could be described better by the zfs.1m
  manpage.
  "written" could be better described as "The amount of space referenced by this
  dataset, that was written since the previous snapshot (i.e. that is not
  referenced by the previous snapshot)."
  The "used" section needs more work, but at a minimum it could say that the
  "used" space of a snapshot is the space unique to the snapshot (i.e. the space
  referenced only by this snapshot). The "used" space of a snapshot is a subset
  of the "written" space of the snapshot.

Author: Matthew Ahrens <mahrens@delphix.com>
2017-04-14 18:00:42 +00:00
Andriy Gapon
3af072ced1 7345 zfs(1m): Description for -d depth appeared in -r
illumos/illumos-gate@cb3c687bb9
cb3c687bb9

https://www.illumos.org/issues/7345
  The following part of the zfs(1m) man page:
         -d depth
             Recursively display any children of the dataset, limiting the
             recursion to

  ...

         -r  Recursively display any children of the dataset on the command
             line.  depth.  A depth of 1 will display only the dataset and its
             direct children.
  should be changed to:
         -d depth
             Recursively display any children of the dataset, limiting the
             recursion to depth.  A depth of 1 will display only the dataset and
  its
             direct children.

  ...

         -r  Recursively display any children of the dataset on the command
             line.

Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Marcel Telka <marcel@telka.sk>
2017-04-14 17:59:56 +00:00
Andriy Gapon
cce6c1c241 7276 zfs(1m) manpage could better describe space properties
illumos/illumos-gate@5749c35234
5749c35234

https://www.illumos.org/issues/7276
  The "used" and "written" properties could be described better by the zfs.1m
  manpage.
  "written" could be better described as "The amount of space referenced by this
  dataset, that was written since the previous snapshot (i.e. that is not
  referenced by the previous snapshot)."
  The "used" section needs more work, but at a minimum it could say that the
  "used" space of a snapshot is the space unique to the snapshot (i.e. the space
  referenced only by this snapshot). The "used" space of a snapshot is a subset
  of the "written" space of the snapshot.

Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>
2017-04-14 17:59:10 +00:00
Andriy Gapon
b689d45202 7257 zfs manpage user property length needs to be updated
illumos/illumos-gate@3bc7169503
3bc7169503

https://www.illumos.org/issues/7257
  The zfs.1m manpage says:
  User Properties
      ...
         Property values are limited to 1024 characters.
  Since zpool version 16, this limit is actually 8192 characters. Additionally,
  this limit is actually 8192 bytes, as it supports UTF-8.

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Eli Rosenthal <eli.rosenthal@delphix.com>
2017-04-14 17:58:31 +00:00
Andriy Gapon
fc7f6c1fe6 7041 Fix spelling mistakes in sections 1 and 1M
illumos/illumos-gate@df23d905b9
df23d905b9

https://www.illumos.org/issues/7041
  With the changes made in #7040, I found and fixed misspellings in sections 1
  and 1M of the manual pages.

Reviewed by: Marcel Telka <marcel@telka.sk>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Cody Peter Mello <cody.mello@joyent.com>
2017-04-14 17:58:00 +00:00
Andriy Gapon
3619e6d9ab missing zfs.1m changes from r299449 2017-04-14 17:57:10 +00:00
Andriy Gapon
685bc6fed7 missing zfs.1m changes from r296518 2017-04-14 17:55:06 +00:00
Andriy Gapon
0c465e198f missing zfs.1m changes from r294814 2017-04-14 17:52:47 +00:00
Andriy Gapon
1383302d2d 6346 zfs(1M) has spurious comma
illumos/illumos-gate@1058dba45e
1058dba45e

https://www.illumos.org/issues/6346
  The xref to gzip(1) in the SEE ALSO puts the comma inside the parens, because a
  space is missing in the source
  .Xr gzip 1,
  should be
  .Xr gzip 1 ,
  It'd be cool if the manual page checks in pbchk could catch this, too, but I'm
  not sure how easy that'd be.

Reviewed by: Garrett D'Amore <garrett@damore.org>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: Yuri Pankov <yuri.pankov@nexenta.com>
2017-04-14 17:49:37 +00:00
Andriy Gapon
99d91c7d5e missing zfs.1m changes from r289312 2017-04-14 17:48:45 +00:00
Andriy Gapon
cc685b610a missing zfs.1m changes from r289310 2017-04-14 17:46:40 +00:00
Andriy Gapon
7ecf4767d4 missing zfs.1m changes from r286704 2017-04-14 17:43:51 +00:00
Andriy Gapon
aad99e99a3 7571 non-present readonly numeric ZFS props do not have default value
illumos/illumos-gate@ad2760acbd
ad2760acbd

https://www.illumos.org/issues/7571
  ZFS displays the default value for non-present readonly numeric (and index)
  properties. However, these properties default values are not meaningful.
  Instead, we should display a "-", indicating that they are not present. For
  example, on a version-12 pool, the usedby* properties are not available, but
  they show up as the incorrect value "0":
     1. zfs get all test12
        ...
        test12 usedbysnapshots 0 -
        test12 usedbydataset 0 -
        test12 usedbychildren 0 -
        test12 usedbyrefreservation 0 -
  We will be introducing more sometimes-present numeric readonly properties, so
  it would be nice to fix this.

Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
2017-04-14 17:27:09 +00:00