3740 Poor ZFS send / receive performance due to snapshot
hold / release processing
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Christopher Siden <christopher.siden@delphix.com>
References:
https://www.illumos.org/issues/3740illumos/illumos-gate@a7a845e4bf
Ported-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #1775
Porting notes:
1. 13fe019870 introduced a merge conflict
in dsl_dataset_user_release_tmp where some variables were moved
outside of the preprocessor directive.
2. dea9dfefdd747534b3846845629d2200f0616dad made the previous merge
conflict worse by switching KM_SLEEP to KM_PUSHPAGE. This is notable
because this commit refactors the code, adding a new KM_SLEEP
allocation. It is not clear to me whether this should be converted
to KM_PUSHPAGE.
3. We had a merge conflict in libzfs_sendrecv.c because of copyright
notices.
4. Several small C99 compatibility fixed were made.
3642 dsl_scan_active() should not issue I/O to determine if async
destroying is active
3643 txg_delay should not hold the tc_lock
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Adam Leventhal <ahl@delphix.com>
Approved by: Gordon Ross <gwr@nexenta.com>
References:
https://www.illumos.org/issues/3642https://www.illumos.org/issues/3643illumos/illumos-gate@4a92375985
Ported-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #1775
Porting Notes:
1. The alignment assumptions for the tx_cpu structure assume that
a kmutex_t is 8 bytes. This isn't true under Linux but tc_pad[]
was adjusted anyway for consistency since this structure was
never carefully aligned in ZoL. If careful alignment does impact
performance significantly this should be reworked to be portable.
3598 want to dtrace when errors are generated in zfs
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>
References:
https://www.illumos.org/issues/3598illumos/illumos-gate@be6fd75a69
Ported-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #1775
Porting notes:
1. include/sys/zfs_context.h has been modified to render some new
macros inert until dtrace is available on Linux.
2. Linux-specific changes have been adapted to use SET_ERROR().
3. I'm NOT happy about this change. It does nothing but ugly
up the code under Linux. Unfortunately we need to take it to
avoid more merge conflicts in the future. -Brian
dataset_remove_clones_key does recursion, so if the recursion goes
deep it can overrun the linux kernel stack size of 8KB. I have seen
this happen in the actual deployment, and subsequently confirmed it by
running a test workload on a custom-built kernel that uses 32KB stack.
See the following stack trace as an example of the case where it would
have run over the 8KB stack kernel:
Depth Size Location (42 entries)
----- ---- --------
0) 11192 72 __kmalloc+0x2e/0x240
1) 11120 144 kmem_alloc_debug+0x20e/0x500
2) 10976 72 dbuf_hold_impl+0x4a/0xa0
3) 10904 120 dbuf_prefetch+0xd3/0x280
4) 10784 80 dmu_zfetch_dofetch.isra.5+0x10f/0x180
5) 10704 240 dmu_zfetch+0x5f7/0x10e0
6) 10464 168 dbuf_read+0x71e/0x8f0
7) 10296 104 dnode_hold_impl+0x1ee/0x620
8) 10192 16 dnode_hold+0x19/0x20
9) 10176 88 dmu_buf_hold+0x42/0x1b0
10) 10088 144 zap_lockdir+0x48/0x730
11) 9944 128 zap_cursor_retrieve+0x1c4/0x2f0
12) 9816 392 dsl_dataset_remove_clones_key.isra.14+0xab/0x190
13) 9424 392 dsl_dataset_remove_clones_key.isra.14+0x10c/0x190
14) 9032 392 dsl_dataset_remove_clones_key.isra.14+0x10c/0x190
15) 8640 392 dsl_dataset_remove_clones_key.isra.14+0x10c/0x190
16) 8248 392 dsl_dataset_remove_clones_key.isra.14+0x10c/0x190
17) 7856 392 dsl_dataset_remove_clones_key.isra.14+0x10c/0x190
18) 7464 392 dsl_dataset_remove_clones_key.isra.14+0x10c/0x190
19) 7072 392 dsl_dataset_remove_clones_key.isra.14+0x10c/0x190
20) 6680 392 dsl_dataset_remove_clones_key.isra.14+0x10c/0x190
21) 6288 392 dsl_dataset_remove_clones_key.isra.14+0x10c/0x190
22) 5896 392 dsl_dataset_remove_clones_key.isra.14+0x10c/0x190
23) 5504 392 dsl_dataset_remove_clones_key.isra.14+0x10c/0x190
24) 5112 392 dsl_dataset_remove_clones_key.isra.14+0x10c/0x190
25) 4720 392 dsl_dataset_remove_clones_key.isra.14+0x10c/0x190
26) 4328 392 dsl_dataset_remove_clones_key.isra.14+0x10c/0x190
27) 3936 392 dsl_dataset_remove_clones_key.isra.14+0x10c/0x190
28) 3544 392 dsl_dataset_remove_clones_key.isra.14+0x10c/0x190
29) 3152 392 dsl_dataset_remove_clones_key.isra.14+0x10c/0x190
30) 2760 392 dsl_dataset_remove_clones_key.isra.14+0x10c/0x190
31) 2368 392 dsl_dataset_remove_clones_key.isra.14+0x10c/0x190
32) 1976 392 dsl_dataset_remove_clones_key.isra.14+0x10c/0x190
33) 1584 392 dsl_dataset_remove_clones_key.isra.14+0x10c/0x190
34) 1192 232 dsl_dataset_destroy_sync+0x311/0xf60
35) 960 72 dsl_sync_task_group_sync+0x12f/0x230
36) 888 168 dsl_pool_sync+0x48b/0x5c0
37) 720 184 spa_sync+0x417/0xb00
38) 536 184 txg_sync_thread+0x325/0x5b0
39) 352 48 thread_generic_wrapper+0x7a/0x90
40) 304 128 kthread+0xc0/0xd0
41) 176 176 ret_from_fork+0x7c/0xb0
This change reduces the stack usage in dsl_dataset_remove_clones_key
by allocating structures in heap, not in stack. This is not a fundamental
fix, as one can create an arbitrary large data set that runs over any
fixed size stack, but this will make the problem far less likely.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Kohsuke Kawaguchi <kk@kohsuke.org>
Closes#1726