Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Background information: This assertion about tx_space_* verifies that we
are not dirtying more stuff than we thought we would. We “need” to know
how much we will dirty so that we can check if we should fail this
transaction with ENOSPC/EDQUOT, in dmu_tx_assign(). While the
transaction is open (i.e. between dmu_tx_assign() and dmu_tx_commit() —
typically less than a millisecond), we call dbuf_dirty() on the exact
blocks that will be modified. Once this happens, the temporary
accounting in tx_space_* is unnecessary, because we know exactly what
blocks are newly dirtied; we call dnode_willuse_space() to track this
more exact accounting.
The fundamental problem causing this bug is that dmu_tx_hold_*() relies
on the current state in the DMU (e.g. dn_nlevels) to predict how much
will be dirtied by this transaction, but this state can change before we
actually perform the transaction (i.e. call dbuf_dirty()).
This bug will be fixed by removing the assertion that the tx_space_*
accounting is perfectly accurate (i.e. we never dirty more than was
predicted by dmu_tx_hold_*()). By removing the requirement that this
accounting be perfectly accurate, we can also vastly simplify it, e.g.
removing most of the logic in dmu_tx_count_*().
The new tx space accounting will be very approximate, and may be more or
less than what is actually dirtied. It will still be used to determine
if this transaction will put us over quota. Transactions that are marked
by dmu_tx_mark_netfree() will be excepted from this check. We won’t make
an attempt to determine how much space will be freed by the transaction
— this was rarely accurate enough to determine if a transaction should
be permitted when we are over quota, which is why dmu_tx_mark_netfree()
was introduced in 2014.
We also won’t attempt to give “credit” when overwriting existing blocks,
if those blocks may be freed. This allows us to remove the
do_free_accounting logic in dbuf_dirty(), and associated routines. This
logic attempted to predict what will be on disk when this txg syncs, to
know if the overwritten block will be freed (i.e. exists, and has no
snapshots).
OpenZFS-issue: https://www.illumos.org/issues/7793
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/3704e0a
Upstream bugs: DLPX-32883a
Closes#5804
Porting notes:
- DNODE_SIZE replaced with DNODE_MIN_SIZE in dmu_tx_count_dnode(),
Using the default dnode size would be slightly better.
- DEBUG_DMU_TX wrappers and configure option removed.
- Resolved _by_dnode() conflicts these changes have not yet been
applied to OpenZFS.
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/7843
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/4d519e7Closes#5868
Commit 933ec99 removes read and write from f_op because the vfs layer will
select iter_write or aio_write automatically. However, for Linux <= 4.0,
loop_set_fd will actually check f_op->write and set read-only if not exists.
This patch add them back and use the generic do_sync_{read,write} for
aio_{read,write} and new_sync_{read,write} for {read,write}_iter.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes#5776Closes#5855
Linux 4.11 introduces a new type, refcount_t, which conflicts with the
type of the same name defined within ZFS.
Rename the ZFS type zfs_refcount_t. Within the ZFS code, use a macro to
cause references to refcount_t to be changed to zfs_refcount_t at
compile time. This reduces conflicts when later landing OpenZFS
patches.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes#5823Closes#5842
0eef1bde31
introduced some changes which we slightly improved the style of when
porting to illumos.
There is also one minor error-handling fix, in zap_add() the "zap" may
become NULL in case of an error re-opening the ZAP.
Originally suggested at: https://github.com/openzfs/openzfs/pull/276
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes#5805
The commit a6255b7fce removed a few
assertions which help catch errors and improve code readability. It also
duplicated two conditionals, which was unnecessary and made the code
confusing to read. This patch cleans it up.
Reviewed-by: David Quigley <david.quigley@intel.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Isaac Huang <he.huang@intel.com>
Closes#5802
Authored by: Daniel Hoffman <dj.hoffman@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: Chris Williamson <chris.williamson@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
This change removes all gendered language that did not refer specifically
to an individual person or pet. The convention taken was to use
variations on "they" when referring to users and/or human beings, while
using "it" when referring to code, functions, and/or libraries.
Additionally, we took the liberty to fix up any whitespace issues that
were found in any files that were already being modified.
OpenZFS-issue: https://www.illumos.org/issues/7812
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/ad626dbCloses#5822
7200 no blocks must be born in a txg after a snaphot is created
Authored by: Andriy Gapon <andriy.gapon@clusterhq.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Approved by: Gordon Ross <gordon.w.ross@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7199
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/bfaed0bCloses#5817
After a hot spare replaces an OFFLINE vdev, the new
parent spare vdev state is set incorrectly to OFFLINE.
The correct state should be DEGRADED. The incorrect
OFFLINE state will prevent top-level vdev from reading
the spare vdev, thus causing unnecessary reconstruction.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Don Brady <don.brady@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Isaac Huang <he.huang@intel.com>
Closes#5766Closes#5770
The global tunable zfs_arc_num_sublists_per_state is used by the ARC and
the dbuf cache, and other users are planned. We should change this
tunable to be common to all multilists. This tuning may be overridden
on a per-multilist basis.
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes#5764
A likely cut/paste error caused the description to be applied to
zfs_arc_average_blocksize.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Chase <tim@chase2k.com>
Closes#5788
Doing the following command would return success with zfs creating an orphan
object.
touch $(for i in $(seq 256); do printf "n"; done)
The funny thing is that this will only work once for each directory, because
after upgraded to fzap, zfs_lookup would fail properly since it has additional
length check.
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#5768
Authored by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7104
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/4b5c8e9Closes#5679
This is a race condition in the deadlist code.
A thread executing an administrative command that uses
dsl_deadlist_space_range() holds the lock of the whole deadlist_t to
protect the access of all its entries that the deadlist contains in an
avl tree.
Sync threads trying to insert a new entry in the deadlist (through
dsl_deadlist_insert() -> dle_enqueue()) do not hold the deadlist lock at
that moment. If the dle_bpobj is the empty bpobj (our sentinel value),
we close and reopen it. Between these two operations, it is possible
for the dsl_deadlist_space_range() thread to dereference that bpobj
which is NULL during that window.
Threads should hold the a deadlist's dl_lock when they manipulate its
internal data so scenarios like the one above are avoided.
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes#5762
As explicitly stated in section 2 of the 'Programming rules'
comments at the top of zfs_vnops.c.
If you must call iput() within a tx then use zfs_iput_async().
Move iput() calls after dmu_tx_commit() / dmu_tx_abort when
possible. When not possible convert the iput() calls to
zfs_iput_async().
Reviewed-by: Don Brady <don.brady@intel.com>
Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#5758
Add the appropriate compiler flags to accept c99 code. This will help to
minimize differences with upstream, and aid porting changes. One change was
necessary in zvol.c because the DEFINE_IDA() macro does not work with the new
compiler flags.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes#5756
When the code was added this tunable was not exposed via module params. Also it
was not documented. This patch changes the type from a uint32 to a ulong as
done with other percentage tunables and also documents it in the
zfs-module-parameters man page.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: David Quigley <david.quigley@intel.com>
Closes#5750
Authored by: Hans Rosenfeld <hans.rosenfeld@nexenta.com>
Reviewed by: Dan Fields <dan.fields@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Reviewed-by: Don Brady <don.brady@intel.com>
Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7448
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/295438bCloses#5737
Authored by: Chris Williamson <chris.williamson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7247
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/2ad25b4Closes#5689
Porting notes:
- tests/zfs-tests/tests/functional/cli_root/zfs_receive/zfs_receive_013_pos.ksh
renamed as zfs_receive_015_pos.ksh, zfs_receive_013_pos.ksh is now
used for OpenZFS test.
- libzfs_sendrecv.c: SMALLEST_POSSIBLE_MAX_DDT_MB is always used
for all 32-bit builds.
For generic_write_checks with 2 args, we can exit when it returns zero because
it means count is zero. However this is not the case for generic_write_checks
with 4 args, where zero means no error.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Haakan T Johansson <f96hajo@chalmers.se>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes#5720Closes#5726
This is effectively dead code for the Linux implementation which can
be removed to improve readability. We want to linter to check the
real production/debug build as much as possible.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes#5722
Authored by: George Wilson <george.wilson@delphix.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7072
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/c39a2aaCloses#5694
Porting notes:
- vdev.c: 'vdev_get_stats' changes are moved to 'vdev_get_stats_ex'.
- vdev_disk.c: ignored, Linux specific code is different.
This patch adds the necessary infrastructure for ABD to make use
of the vectorized fletcher 4 routines.
- export ABD compatible interface from fletcher_4
- add ABD fletcher_4 tests for data and metadata ABD types.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Original-patch-by: Gvozden Neskovic <neskovic@gmail.com>
Signed-off-by: David Quigley <david.quigley@intel.com>
Closes#5589
Authored by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Porting Notes: Moved reference_tracking_enable and
reference_history outside of ZFS_DEBUG.
OpenZFS-issue: https://www.illumos.org/issues/7545
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/4dd77f9Closes#5701
The deadman in ZoL didn't behave quite as it did in upstream
OpenZFS. In addition to the 2 purposes for which OpenZFS used the
zfs_deadman_synctime_ms parameter, ZoL also used it to determine how
frequently the deadman would fire once it has been triggered.
This patch adds the zfs_deadman_checktime_ms parameter to control how
frequently the subsequent checks are performed.
The deadman is now disabled for suspended pools.
As had been the case, unlike upstream OpenZFS, ZoL will not panic when
a hung IO is detected.
The module parameter documentation has been upated to include the new
parameter and to better describe the operation of the deadmen.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Tim Chase <tim@chase2k.com>
Closes#5695
Authored by: Alex Wilson <alex.wilson@joyent.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7019
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/45b1747Closes#5709
Authored by: Alan Somers <asomers@gmail.com>
7115 6922 generates ESC_ZFS_VDEV_REMOVE_AUX a bit too often
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7136
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/b72b6bbCloses#5691
Porting notes:
- Functionally this patch behaves the same as the OpenZFS
version but it was adapted because because ZoL doesn't
have the same illumos sysevent_t infrastructure and functionality.
Authored by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7490
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/6cedfc3Closes#5693
Authored by: Alan Somers <asomers@gmail.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/6922
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/63364b0Closes#5690
Porting notes:
- 'zfs_dbgmsg_print()' reintroduced to userspace.
Authored by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7277
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/29bdd2fCloses#5684
After resuming a pool the godfather zio could have both the
ZIO_REEXECUTE_NOW and ZIO_REEXECUTE_SUSPEND bits set. This
can occur if some child zios set ZIO_REEXECUTE_NOW while
other set ZIO_REEXECUTE_SUSPEND. The godfather zio can
inherit both flags in zio_notify_parent().
The child zios which assigned the ZIO_REEXECUTE_SUSPEND flag
will be removed from the godfather's child list and added to
the spa->spa_suspend_zio_root child list. While child zios
with the ZIO_REEXECUTE_NOW bit set remain being monitored
by the godfather zio.
When the godfather zio executes zio_done() the presence of
the ZIO_REEXECUTE_SUSPEND bit results in all io_reexecute
being cleared. These child zios will then not be re-executed
and instead will be destroyed and lost.
The most straight forward way to address this situation is
to only clear the ZIO_REEXECUTE_SUSPEND bit and leave the
ZIO_REEXECUTE_NOW bit set.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: yuxiang <guo.yong33@zte.com.cn>
Authored by: George Wilson <george.wilson@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7580
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/3105d95Closes#5678
The .write/.read file operations callbacks can be retired since
support for .read_iter/.write_iter and .aio_read/.aio_write has
been added. The vfs_write()/vfs_read() entry functions will
select the correct interface for the kernel. This is desirable
because all VFS write/read operations now rely on common code.
This change also add the generic write checks to make sure that
ulimits are enforced correctly on write.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes#5587Closes#5673
metaslab_t:ms_freetree[TXG_SIZE] is only used in syncing context. We
should replace it with two trees: the freeing tree (ranges that we are
freeing this syncing txg) and the freed tree (ranges which have been
freed this txg).
Authored by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: Tim Chase <tim@chase2k.com>
OpenZFS-issue: https://www.illumos.org/issues/7613
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/a8698da2Closes#5598
Authored by: Stephen Blinick <stephen.blinick@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Gordon Ross <gordon.w.ross@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7500
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/653af1bCloses#5639
When importing a pool with a large number of filesystems within the same
parent filesystem, we see that dmu_objset_find_dp() takes a long time.
It is called from 3 places: spa_check_logs(), spa_ld_claim_log_blocks(),
and spa_load_verify().
There are several ways to improve performance here:
1. We don't really need to do spa_check_logs() or
spa_ld_claim_log_blocks() if the pool was closed cleanly.
2. spa_load_verify() uses dmu_objset_find_dp() to check that no
datasets have too long of names.
3. dmu_objset_find_dp() is slow because it's doing
zap_value_search() (which is O(N sibling datasets)) to determine
the name of each dsl_dir when it's opened. In this case we
actually know the name when we are opening it, so we can provide
it and avoid the lookup.
This change implements fix#3 from the above list; i.e. make
dmu_objset_find_dp() provide the name of the dataset so that we don't
have to search for it.
Authored by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prashanth Sreenivasa <prashksp@gmail.com>
Reviewed-by: David Quigley <david.quigley@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7606
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/cac6babCloses#5662
When doing recv and rollback, dsl_dataset_clone_swap_sync_impl will be
called to swap out the ds_objset and do dmu_objset_evict on the old one.
However, currently zv->zv_objset will not be swapped out accordingly, so
if anyone currently holds a fd on the zvol, we risk hitting a use-after-free.
We fix this by introducing the suspend and resume mechanism of zsb to
zv. Before recv or rollback, we use zvol_suspend to block all access to
zv_objset and shut it down. After the recv or rollback, we use zvol_resume
to swap in zv_objset with the new ds_objset and unblock the access.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes#4866Closes#5609
Porting notes:
- This issue was first fixed in ZoL by commit d862cb0d. That fix was
then modified and an equivalent version of the patch landed in the
upstream code base. For additional details see the discussion in
https://github.com/openzfs/openzfs/pull/24 .
This commit aligns ZoL with OpenZFS codebase.
Authored by: Andriy Gapon <avg@icyb.net.ua>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Ned Bass <bass6@llnl.gov>
Reviewed by: Tim Chase <tim@chase2k.com>
Approved by: Gordon Ross <gwr@nexenta.com>
Ported-by: George Melikov mail@gmelikov.ru
OpenZFS-issue: https://www.illumos.org/issues/6529
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/e7e978bCloses#5606
Two threads send_traverse_thread() and receive_writer_thread() should
end with thread_exit();
Mostly a cosmetic issue under IllumOS.
Authored by: Jorgen Lundman <lundman@lundman.net>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: George Melikov <mail@gmelikov.ru>
OpenZFS-issue: https://www.illumos.org/issues/7659
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/a569268Closes#5603