freebsd-skq

Author	SHA1	Message	Date
mav	1c84b59fa5	Add vfs.zfs.zil_log_limit sysctl. It is at least partially broken now, but that is another question.	2016-10-16 18:49:15 +00:00
mav	81423ea480	Optimize ZIL itx memory allocation on FreeBSD. These allocations can reach up to 128KB, while FreeBSD kernel allocator can cache allocations only up to 64KB. To avoid expensive allocations for each large ZIL write use caching zio_buf_alloc() allocator instead. To make it possible de-inline few instances of zil_itx_destroy().	2016-10-16 10:43:12 +00:00
mav	b03829e518	MFV r307314: 6988 spa_sync() spends half its time in dmu_objset_do_userquota_updates Using a benchmark which creates 2 million files in one TXG, I observe that the thread running spa_sync() is on CPU almost the entire time we are syncing, and therefore can be a performance bottleneck. About 50% of the time in spa_sync() is in dmu_objset_do_userquota_updates(). The problem is that dmu_objset_do_userquota_updates() calls zap_increment_int(DMU_USERUSED_OBJECT) once for every file that was modified (or created). In this benchmark, all the files are owned by the same user/group, so all 2 million calls to zap_increment_int() are modifying the same entry in the zap. The same issue exists for the DMU_GROUPUSED_OBJECT. We should keep an in-memory map from user to space delta while we are syncing, and when we finish, iterate over the in-memory map and modify the ZAP once per entry. This reduces the number of calls to zap_increment_int() from "number of objects modified" to "number of owners/groups of modified files". This reduced the time spent in spa_sync() in the file create benchmark by ~33%, from 11 seconds to 7 seconds. Closes #107 Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Steve Gonczi <steve.gonczi@delphix.com> Reviewed by: Ned Bass <bass6@llnl.gov> Reviewed by: Jinshan Xiong <jinshan.xiong@intel.com> Author: Matthew Ahrens <mahrens@delphix.com> openzfs/openzfs@5fc46359c5	2016-10-14 12:03:04 +00:00
mav	48ba15912d	MFV r307313: 5120 zfs should allow large block/gzip/raidz boot pool (loader project) Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com> Reviewed by: Andrew Stormont <andyjstormont@gmail.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Toomas Soome <tsoome@me.com> openzfs/openzfs@c8811bd3e2 FreeBSD still does not support booting from gzip-compressed datasets, so keep one chunk of this commit out.	2016-10-14 12:01:33 +00:00
kib	af14bca641	Fix a race in vm_page_busy_sleep(9). Suppose that we have an exclusively busy page, and a thread which can accept shared-busy page. In this case, typical code waiting for the page xbusy state to pass is again: VM_OBJECT_WLOCK(object); ... if (vm_page_xbusied(m)) { vm_page_lock(m); VM_OBJECT_WUNLOCK(object); <---1 vm_page_busy_sleep(p, "vmopax"); goto again; } Suppose that the xbusy state owner locked the object, unbusied the page and unlocked the object after we are at the line [1], but before we executed the load of the busy_lock word in vm_page_busy_sleep(). If it happens that there is still no waiters recorded for the busy state, the xbusy owner did not acquired the page lock, so it proceeded. More, suppose that some other thread happen to share-busy the page after xbusy state was relinquished but before the m->busy_lock is read in vm_page_busy_sleep(). Again, that thread only needs vm_object lock to proceed. Then, vm_page_busy_sleep() reads busy_lock value equal to the VPB_SHARERS_WORD(1). In this case, all tests in vm_page_busy_sleep(9) pass and we are going to sleep, despite the page being share-busied. Update check for m->busy_lock == VPB_UNBUSIED in vm_page_busy_sleep(9) to also accept shared-busy state if we only wait for the xbusy state to pass. Merge sequential if()s with the same 'then' clause in vm_page_busy_sleep(). Note that the current code does not share-busy pages from parallel threads, the only way to have more that one sbusy owner is right now is to recurse. Reported and tested by: pho (previous version) Reviewed by: alc, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D8196	2016-10-13 14:41:05 +00:00
kib	0ce9d71278	Limit scope of the optimization in r306608 to dounmount() caller only. Other uses of cache_purgevfs() do rely on the cache purge for correct operations, when paths are invalidated without unmount. Reported and tested by: jkim Discussed with: mjg Sponsored by: The FreeBSD Foundation	2016-10-07 11:38:28 +00:00
avg	6b13484289	implement zfs_vptocnp() using z_parent property This should allow vn_fullpath() to work even when vfs name cache is disabled for zfs, which is the case when zfs properties like casesensitivity and normalization are set non-default values. The new code should be 100% reliable for directories and "mostly" reliable for files, that is, when hardlinks across directories are not used. Reported by: Frederic Chardon <chardon.frederic@gmail.com> Reviewed by: kib (vfs contract) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D8146	2016-10-07 06:29:24 +00:00
avg	3c45649fe3	zfs: fix a wrong assertion for extended attributes For the extended attributes the order between z_teardown_lock and the vnode lock is different. The bug was triggered only with DIAGNOSTIC turned on. This fix is developed in cooperation with avos. PR: 213112 Reported by: avos Tested by: avos MFC after: 1 week	2016-10-04 08:09:25 +00:00
markj	4c947dca49	Allow tracing of functions prefixed by "__". This restriction was inherited from upstream but is not relevant on FreeBSD. Furthermore, it hindered the tracing of locking primitive subroutines. MFC after: 1 week	2016-10-02 00:35:00 +00:00
mav	c1a80a5b9e	Add #ifdef _KERNEL around send_holes_without_birth_time sysctl. Reported by: avg@	2016-09-29 17:48:53 +00:00
mav	cf8ccc7757	MFV r306423: 7402 Create tunable to ignore hole_birth feature Until we can resolve the numerous hole_birth bugs that have cropped up recently, and come up with a way going forwards to protect users from corruption, we should disable the hole_birth feature. Using a tunable allows those who are confident that their data is correct to continue to take advantage of the feature. Closes #188 Reviewed by: Matthew Ahrens <mahrens@delphix.com> Author: Paul Dagnelie <pcd@delphix.com>	2016-09-29 00:00:37 +00:00
mav	daf458fb50	MFV r306422: 7254 ztest failed assertion in ztest_dataset_dirobj_verify: dirobjs + 1 == usedobjs dsl_dataset_space is looking at the ds_bp's fill count while dmu_objset_write_ready() is concurrently modifying it. This fix adds an rrwlock to protect the ds_bp. Closes #180 Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Steve Gonczi <steve.gonczi@delphix.com> Author: Paul Dagnelie <pcd@delphix.com>	2016-09-28 23:54:47 +00:00
markj	03291ec95d	Move implementations of uread() and uwrite() to the illumos compat layer. MFC after: 1 week	2016-09-24 21:40:14 +00:00
avg	b2581c5dd2	fix vnode lock assertion for extended attributes directory Background. In ZFS a file with extended attributes has a special directory associated with it where each extended attribute is a file. The attribute's name is a file name and its value is a file content. When the ownership of a file with extended attributes is changed, ZFS also changes ownership of the special directory. This is where the bug was hit. The bug was introduced in r209158. Nota bene. ZFS vnode locks are typically acquired before z_teardown_lock (i.e., before ZFS_ENTER). But this is not the case for the vnodes that represent the extended attribute directory and files. Those are always locked after ZFS_ENTER. This is confusing and fragile. PR: 212702 Reported by: Christian Fuss to FreeNAS Tested by: mav MFC after: 1 week	2016-09-24 08:13:15 +00:00
markj	c3a5f7973c	Re-check the systrace probe ID before calling dtrace_probe(). Otherwise there exists a narrow window during which a syscall probe can be disabled and cause a concurrently-running thread to call dtrace_probe() with an invalid probe ID. Reported by: ngie MFC after: 1 week Sponsored by: Dell EMC Isilon	2016-09-22 23:22:53 +00:00
allanjude	370d083f6e	MFV r268120: 4936 lz4 could theoretically overflow a pointer with a certain input illumos/illumos-gate@58d0718061 Reviewed by: delphij MFC after: 2 weeks Sponsored by: ScaleEngine Inc. Differential Revision: https://reviews.freebsd.org/D7850	2016-09-11 17:48:06 +00:00
mav	2a468fb489	Switch random_get_pseudo_bytes() shim to arc4rand(). Our shim for Solaris random_get_bytes() uses read_random(), that looks reasonable, since it guaranties reliably seeded random data. On the other side Solaris random_get_pseudo_bytes() does not provide this guarantie, and its original Solaris implementation is equivalent to our arc4rand(), using software crypto without stressing slower hardware RNG.	2016-09-10 09:37:41 +00:00
mav	1744c00ea7	MFV r305562: 7259 DS_FIELD_LARGE_BLOCKS is unused The DS_FIELD_LARGE_BLOCKS macro has been unused since the integration of this patch: commit ca0cc3918a1789fa839194af2a9245f801a06b1a Author: Matthew Ahrens <mahrens@delphix.com> Date: Fri Jul 24 09:53:55 2015 -0700 5959 clean up per-dataset feature count code Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: George Wilson <george@delphix.com> Reviewed by: Alex Reece <alex@delphix.com> Approved by: Richard Lowe <richlowe@richlowe.net> This patch simply removes this macro from dsl_dataset.h. Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Dan McDonald <danmcd@omniti.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Author: Matthew Ahrens <mahrens@delphix.com>	2016-09-07 20:09:24 +00:00
mav	285fcb30fd	MFV r305560: 7278 tuning zfs_arc_max does not impact arc_c_min When changing zfs_arc_max (e.g. as zdb does), it may be set to less than the default arc_c_min. arc_c_min should decrease to not be more than arc_c_max, but it doesn't; therefore tuning of arc_c_max is ineffective. Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Author: Matthew Ahrens <mahrens@delphix.com> openzfs/openzfs@608764bead	2016-09-07 20:05:10 +00:00
avg	20fcfc01b5	fix zfs pool creation accidentally broken by r305331 The upstream change introduced a new load state, SPA_LOAD_CREATE, and vdev_geom code needs to be aware of it. Tested by: cy MFC after: 1 week X-MFC with: r305331	2016-09-06 06:09:12 +00:00
mav	25849e5c65	Missed FreeBSD-specific piece of r305338.	2016-09-03 11:17:33 +00:00
mav	b54e25cf10	MFC r305337: 7004 dmu_tx_hold_zap() does dnode_hold() 7x on same object Using a benchmark which has 32 threads creating 2 million files in the same directory, on a machine with 16 CPU cores, I observed poor performance. I noticed that dmu_tx_hold_zap() was using about 30% of all CPU, and doing dnode_hold() 7 times on the same object (the ZAP object that is being held). dmu_tx_hold_zap() keeps a hold on the dnode_t the entire time it is running, in dmu_tx_hold_t:txh_dnode, so it would be nice to use the dnode_t that we already have in hand, rather than repeatedly calling dnode_hold(). To do this, we need to pass the dnode_t down through all the intermediate calls that dmu_tx_hold_zap() makes, making these routines take the dnode_t* rather than an objset_t* and a uint64_t object number. In particular, the following routines will need to have analogous *_by_dnode() variants created: dmu_buf_hold_noread() dmu_buf_hold() zap_lookup() zap_lookup_norm() zap_count_write() zap_lockdir() zap_count_write() This can improve performance on the benchmark described above by 100%, from 30,000 file creations per second to 60,000. (This improvement is on top of that provided by working around the object allocation issue. Peak performance of ~90,000 creations per second was observed with 8 CPUs; adding CPUs past that decreased performance due to lock contention.) The CPU used by dmu_tx_hold_zap() was reduced by 88%, from 340 CPU-seconds to 40 CPU-seconds. Sponsored by: Intel Corp. Closes #109 Reviewed by: Steve Gonczi <steve.gonczi@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Ned Bass <bass6@llnl.gov> Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Author: Matthew Ahrens <mahrens@delphix.com> openzfs/openzfs@d3e523d489	2016-09-03 11:00:29 +00:00
mav	c643a1ba03	MFV r305336: 7247 zfs receive of deduplicated stream fails This resolves two 'zfs recv' issues. First, when receiving into an existing filesystem, a snapshot created during the receive process is not added to the guid->dataset map for the stream, resulting in failed lookups for deduped streams when a WRITE_BYREF record refers to a snapshot received earlier in the stream. Second, the newly created snapshot was also not set properly, referencing the snapshot before the new receiving dataset rather than the existing filesystem. Closes #159 Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Author: Chris Williamson <chris.williamson@delphix.com> openzfs/openzfs@b09697c8c1	2016-09-03 10:59:05 +00:00
mav	4e3d6e73d3	MFV r305335: 7003 zap_lockdir() should tag hold zap_lockdir() / zap_unlockdir() should take a "void *tag" argument which tags the hold on the zap. This will help diagnose programming errors which misuse the hold on the ZAP. Sponsored by: Intel Corp. Closes #108 Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Steve Gonczi <steve.gonczi@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Author: Matthew Ahrens <mahrens@delphix.com> openzfs/openzfs@0780b3eab5	2016-09-03 10:58:14 +00:00
mav	ed2cd23bbf	MFV r304157: 7230 add assertions to dmu_send_impl() to verify that stream includes BEGIN and END records illumos/illumos-gate@12b90ee2d3 https://github.com/illumos/illumos-gate/commit/12b90ee2d3b10689fc45f4930d2392f5f e1d9cfa https://www.illumos.org/issues/7230 A test failure occurred where a send stream had only a BEGIN record. This should not be possible if the send returns without error. Prevented this from happening in the future by adding an assertion to dmu_send_impl() to verify that if the function returns 0 (success) both a BEGIN and END record are present. Did this by adding flags to dmu_sendarg_t (indicating whether BEGIN o r END records sent), having dump_record() set flags appropriately, adding VERIFY statement to dmu_send_impl(). Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Matt Krantz <matt.krantz@delphix.com>	2016-09-03 10:10:58 +00:00
mav	5dc4a605c8	MFV r304156: 7235 remove unused func dsl_dataset_set_blkptr illumos/illumos-gate@bd56f80007 https://github.com/illumos/illumos-gate/commit/bd56f80007857b960e0981ed0797ad8ec 844a96b https://www.illumos.org/issues/7235 The function dsl_dataset_set_blkptr() is unused. We should remove it. Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Alex Reece <alex@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Matthew Ahrens <mahrens@delphix.com>	2016-09-03 10:09:23 +00:00
mav	3bf1abe8c9	MFV r304155: 7090 zfs should improve allocation order and throttle allocations illumos/illumos-gate@0f7643c737 https://github.com/illumos/illumos-gate/commit/0f7643c7376dd69a08acbfc9d1d7d548b 10c846a https://www.illumos.org/issues/7090 When write I/Os are issued, they are issued in block order but the ZIO pipelin e will drive them asynchronously through the allocation stage which can result i n blocks being allocated out-of-order. It would be nice to preserve as much of the logical order as possible. In addition, the allocations are equally scattered across all top-level VDEVs but not all top-level VDEVs are created equally. The pipeline should be able t o detect devices that are more capable of handling allocations and should allocate more blocks to those devices. This allows for dynamic allocation distribution when devices are imbalanced as fuller devices will tend to be slower than empty devices. The change includes a new pool-wide allocation queue which would throttle and order allocations in the ZIO pipeline. The queue would be ordered by issued time and offset and would provide an initial amount of allocation of work to each top-level vdev. The allocation logic utilizes a reservation system to reserve allocations that will be performed by the allocator. Once an allocatio n is successfully completed it's scheduled on a given top-level vdev. Each top- level vdev maintains a maximum number of allocations that it can handle (mg_alloc_queue_depth). The pool-wide reserved allocations (top-levels * mg_alloc_queue_depth) are distributed across the top-level vdevs metaslab groups and round robin across all eligible metaslab groups to distribute the work. As top-levels complete their work, they receive additional work from the pool-wide allocation queue until the allocation queue is emptied. Reviewed by: Adam Leventhal <ahl@delphix.com> Reviewed by: Alex Reece <alex@delphix.com> Reviewed by: Christopher Siden <christopher.siden@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Sebastien Roy <sebastien.roy@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: George Wilson <george.wilson@delphix.com>	2016-09-03 10:04:37 +00:00
mav	b318ac9f53	MFV r303078: 7086 ztest attempts dva_get_dsize_sync on an embedded blockpointer illumos/illumos-gate@926549256b https://github.com/illumos/illumos-gate/commit/926549256b71acd595f69b236779ff6b7 8fa08ef https://www.illumos.org/issues/7086 In dbuf_dirty(), we need to grab the dn_struct_rwlock before looking at the db_blkptr, to prevent it from being changed by syncing context. Otherwise we may see that ztest got a segfault from this stack: libzpool.so.1`dva_get_dsize_sync+0x98(872f000, `b32b240`, fed7811b, 0, b4cda20, 0) libzpool.so.1`bp_get_dsize+0x60(872f000, `b32b240`, 0, 97cb780, 9d4c1a8, 0) libzpool.so.1`dbuf_dirty+0x9b3(ce0a100, 97cb780, 9, fecd2530) libzpool.so.1`dmu_buf_will_dirty+0xc3(ce0a100, 97cb780, ea293d6c, 1) libzpool.so.1`zap_lockdir+0x1a0(8aaa3c0, 1, 0, 97cb780, 1, 1) libzpool.so.1`zap_remove_norm+0x30(8aaa3c0, 1, 0, 8728b10, 0, 97cb780) libzpool.so.1`zap_remove+0x29(8aaa3c0, 1, 0, 8728b10, 97cb780, a) ztest_replay_remove+0x225(ea294588, 8728ae8, 0, 38010000, 0, 0) ztest_remove+0x9f(ea294588, ea293f50, 4, 3) ztest_object_init+0x78(ea294588, ea293f50, 4e0, 1) ztest_dmu_object_alloc_free+0x71(ea294588, 13) ztest_dmu_objset_create_destroy+0x224(80cef08, 13, 0, 805d36c, 9017ad44, 0) ztest_execute+0x89(a, 807c720, 13, 0) ztest_thread+0xea(13, 0, 0, 0) libc.so.1`_thrp_setup+0x88(f0983240) libc.so.1`_lwp_start(f0983240, 0, 0, 0, 0, 0) Looking into it a bit, we see that this is an embedded blockpointer, so BP_GET_NDVAS should have returned 0: b32b240::blkptr EMBEDDED [L0 ZAP_OTHER] et=0 LZ4 size=200L/4aP birth=80L Instead, it looks like another thread is modifying this blockpointer: b32b240::ugrep \| ::whatis f47a0e0c is in [ stack tid=0x19f ] ebd6ec40 is in [ stack tid=0x226 ] ea293bd0 is in [ stack tid=0x244 ] ea293be4 is in [ stack tid=0x244 ] Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Matthew Ahrens <mahrens@delphix.com>	2016-09-03 08:43:43 +00:00
mav	75991dd419	MFV r303077: 7072 zfs fails to expand if lun added when os is in shutdown state illumos/illumos-gate@c39a2aae1e `c39a2aae1e` https://www.illumos.org/issues/7072 upstream: 38733 zfs fails to expand if lun added when os is in shutdown state DLPX-36910 spares and caches should not display expandable space DLPX-39262 vdev_disk_open spam zfs_dbgmsg buffer Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Alex Reece <alex@delphix.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: George Wilson <george.wilson@delphix.com>	2016-09-03 08:42:12 +00:00
mav	e7f8bc59ab	MFV r302991: 6950 ARC should cache compressed data illumos/illumos-gate@dcbf3bd6a1 `dcbf3bd6a1` https://www.illumos.org/issues/6950 When reading compressed data from disk, the ARC should keep the compressed block cached and only decompress it when consumers access the block. The uncompressed data should be short-lived allowing the ARC to cache a much larger amount of data. The DMU would also maintain a smaller cache of uncompressed blocks to minimize the impact of decompressing frequently accessed blocks. Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Don Brady <don.brady@intel.com> Reviewed by: Richard Elling <Richard.Elling@RichardElling.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: George Wilson <george.wilson@delphix.com>	2016-09-03 08:30:51 +00:00
mav	54c8c37102	MFV r304158: 7136 ESC_VDEV_REMOVE_AUX ought to always include vdev information 7115 6922 generates ESC_ZFS_VDEV_REMOVE_AUX a bit too often illumos/illumos-gate@b72b6bb10a https://github.com/illumos/illumos-gate/commit/b72b6bb10ad55121a1b352c6f68ebdc8e 20c9086 https://www.illumos.org/issues/7136 6922 added ESC_ZFS_VDEV_REMOVE_AUX and ESC_ZFS_VDEV_REMOVE_DEV sysevents whenever an aux device gets removed from a pool. However, those sysevents will be created without the vdev_guid and vdev_path fields. It would be better to always populate those fields. https://www.illumos.org/issues/7115 The addition of spa_event_notify in vdev removal code (see #6922) causes event s to be generated even if the spare failed to be removed with EBUSY. Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net> Approved by: Robert Mustacchi <rm@joyent.com> Author: Alan Somers <asomers@gmail.com>	2016-09-01 18:37:11 +00:00
mav	f67e4fd9fa	MFV r302993: 7104 increase indirect block size illumos/illumos-gate@4b5c8e93ca https://github.com/illumos/illumos-gate/commit/4b5c8e93cab28d3c65ba9d407fd8f46e3 be1db1c https://www.illumos.org/issues/7104 The current default indirect block size is 16KB. We can improve performance by increasing it to 128KB. This is especially helpful for any workload that needs to read most of the metadata, e.g. scrub/resilver, file deletion, filesystem deletion, and zfs send. We also need to fix a few space estimation errors to make the tests pass. Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Dan McDonald <danmcd@omniti.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Matthew Ahrens <mahrens@delphix.com>	2016-09-01 18:33:39 +00:00
mav	7fc2ff327f	MFV r302992: 7071 lzc_snapshot does not fill in errlist on ENOENT illumos/illumos-gate@25f7d993ad https://github.com/illumos/illumos-gate/commit/25f7d993adbfb3452ac4625b379167074 6d35ae3 https://www.illumos.org/issues/7071 upstream DLPX-40482 lzc_snapshot does not fill in errlist on ENOENT Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Matthew Ahrens <mahrens@delphix.com>	2016-09-01 18:25:49 +00:00
mav	9903a75bd9	MFV r302662: 6447 handful of nvpair cleanups illumos/illumos-gate@759e89be35 https://github.com/illumos/illumos-gate/commit/759e89be359f2af635e4122d147df56bc e948773 https://www.illumos.org/issues/6447 I got a patch from someone who uses nvpair code outside of illumos. It fixes a couple of gcc warnings/bugs for him. 1. silence uninitialized use warnings 2. add parentheses around assignment used as truth value 3. fix printf format specifier (ll is for integers only) 4. strstr, strspn, strcspn, and strcmp are declared in string.h, not strings.h. 5. avoid scanning integer into boolean variable Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net> Reviewed by: Andy Stormont <astormont@racktopsystems.com> Reviewed by: Garrett D'Amore <garrett@damore.org> Approved by: Robert Mustacchi <rm@joyent.com> Author: Steve Dougherty <sdougherty@barracuda.com>	2016-09-01 15:17:39 +00:00
mav	e25a0bc1b9	MFV r302661: 7082 bptree_iterate() passes wrong args to zfs_dbgmsg() illumos/illumos-gate@10e67aa0db https://github.com/illumos/illumos-gate/commit/10e67aa0db0823d5464aafdd681f3c966 155c68e https://www.illumos.org/issues/7082 upstream DLPX-40542 bptree_iterate() passes wrong args to zfs_dbgmsg() Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Matthew Ahrens <mahrens@delphix.com>	2016-09-01 15:10:40 +00:00
mav	ae8d156b04	MFV r302660: 6314 buffer overflow in dsl_dataset_name illumos/illumos-gate@9adfa60d48 https://github.com/illumos/illumos-gate/commit/9adfa60d484ce2435f5af77cc99dcd4e6 92b6660 https://www.illumos.org/issues/6314 Callers of dsl_dataset_name pass a buffer of size ZFS_MAXNAMELEN, but dsl_dataset_name copies the datasets' name PLUS the snapshot name to it, resulting in a max of 2 * ZFS_MAXNAMELEN + '@'. Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Matthew Ahrens <mahrens@delphix.com>	2016-09-01 15:08:27 +00:00
mav	9765603b98	MFV r302659: 6931 lib/libzfs: cleanup gcc warnings illumos/illumos-gate@88f61dee20 `88f61dee20` https://www.illumos.org/issues/6931 need cleanup: CERRWARN += -_gcc=-Wno-switch CERRWARN += -_gcc=-Wno-parentheses CERRWARN += -_gcc=-Wno-unused-function Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Igor Kozhukhov <ikozhukhov@gmail.com>	2016-09-01 14:57:06 +00:00
mav	df47b938c1	MFV r302651: 7054 dmu_tx_hold_t should use refcount_t to track space illumos/illumos-gate@0c779ad424 https://github.com/illumos/illumos-gate/commit/0c779ad424a92a84d1e07d47cab7f8009 189202b https://www.illumos.org/issues/7054 upstream: ee0003de7d3e598499be7ac3fe6b61efcc47cb7f DLPX-40399 dmu_tx_hold_t should use refcount_t to track space Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Matthew Ahrens <mahrens@delphix.com>	2016-09-01 14:38:25 +00:00
mav	d6db686e3c	MFV r302648: 7019 zfsdev_ioctl skips secpolicy when FKIOCTL is set Note that the bulk of the upstream change is not applicable to FreeBSD and the affected files are not even in the vendor area. illumos/illumos-gate@45b1747515 `45b1747515` https://www.illumos.org/issues/7019 Currently zfsdev_ioctl, when confronted by a request with the FKIOCTL flag set, skips all processing of secpolicy functions. This means that ZFS is not doing any kind of verification of the credentials or access rights of the caller and assuming that (as it is an in-kernel client) all such checks have already been done. This turns out to be quite a dangerous assumption, especially with respect to sdev. In general I don't think it's particularly reasonable to offload this enforcement of access rights onto other kernel subsystems when ZFS has some particular local semantics in this area (delegated datasets etc) and does not provide any kind of API to allow other subsystems to avoid code duplication when doing it. ZFS should apply its normal access policy to requests from within the kernel, and callers should take care to give it the correct credentials and call it from the correct context in order to get the results they need. You can observe the currently unfortunate consequences of this bug in any non- global zone that has access to /dev/zvol or any subset of it via sdev profiles. In particular, a zone used to contain a KVM or similar which has a single zvol passed through to it using a <device match= block in its zone XML. Even though sdev makes something of an attempt to control for whether the caller should have access to nodes in /dev/zvol, it doesn't do this correctly, or really at all in the lookup call path. So, if we have a zone that's been given access to any part of /dev/zvol, it can simply look up the full path to any other zvol on the entire system, and the node will appear and be able to be used. Reviewed by: Robert Mustacchi <rm@joyent.com> Reviewed by: Richard Lowe <richlowe@richlowe.net> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Alex Wilson <alex.wilson@joyent.com>	2016-09-01 14:24:54 +00:00
mav	3cbf3d5ed6	MFV r302647: 6922 Emit ESC_ZFS_VDEV_REMOVE_AUX after removing an aux device illumos/illumos-gate@63364b0ee2 https://github.com/illumos/illumos-gate/commit/63364b0ee2604783e7a55f84258888677 68eafa4 https://www.illumos.org/issues/6922 ZFS does not do a config_sync after removing an aux (spare, log, or cache) device. AFAICT this isn't being done because it is slow and was deemed unnecessary. However, it should be such a rare operation that speed doesn't matter, and not doing it results in two problems: 1) It is theoretically possible to remove an aux device from one pool and attach it to another, then lose power. When power is restored, both pools woul d think that they own the aux device. 2) Removal of the aux device doesn't send any useful sysevents to userland. Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Alan Somers <asomers@gmail.com>	2016-09-01 14:17:30 +00:00
mav	9c74e51cc9	MFV r302646: 6980 6902 causes zfs send to break due to 32-bit/64-bit struct mismatch illumos/illumos-gate@ea4a67f462 https://github.com/illumos/illumos-gate/commit/ea4a67f462de0a39a9adea8197bcdef84 9de5371 https://www.illumos.org/issues/6980 doing zfs send -i snap1 snap2 >testfile results in internal error: Invalid argument Abort (core dumped) Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Matthew Ahrens <mahrens@delphix.com>	2016-09-01 14:06:30 +00:00
mav	13a03d4bba	MFV r302643: 6902 speed up listing of snapshots if requesting name only and sorting by name This was our change from the beginning, so just reduce the upstream diff.	2016-09-01 13:29:53 +00:00
mav	879fd44e10	MFV r302642: 6876 Stack corruption after importing a pool with a too-long name illumos/illumos-gate@c971037baa `c971037baa` https://www.illumos.org/issues/6876 Calling dsl_dataset_name on a dataset with a 256 byte buffer is asking for trouble. We should check every dataset on import, using a 1024 byte buffer and checking each time to see if the dataset's new name is longer than 256 bytes. Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Paul Dagnelie <pcd@delphix.com>	2016-09-01 13:04:36 +00:00
mav	7526e12121	Fix kernel panic when inheriting properties without default. There are two writable hidden properties "iscsioptions" and "stmf_sbd_lu", that have no default string value. Attempt to unset them or replicate caused kernel panic. This simple bandaid seems fixes the problem nicely. MFC after: 2 weeks	2016-08-31 11:55:31 +00:00
tsoome	61f41de13c	Bug 212114 - loader: zio_checksum_verify() must test spa for NULL pointer The issue was introduced with adding support for salted checksums, and was revealed by bhyve userboot.so. During pool discovery the loader is reading pool label from disks, and at that time the spa structure is not yet set up, so the NULL pointer is passed for spa. This condition must be checked to avoid the corruption of the memory and NULL pointer dereference. PR: 212114 Reported by: tsoome@freebsd.com Reviewed by: allanjude Approved by: allanjude (mentor) Differential Revision: https://reviews.freebsd.org/D7634	2016-08-24 16:30:15 +00:00
tsoome	05a5c963f2	Add SHA512, skein, large blocks support for loader zfs. Updated sha512 from illumos. Using skein from freebsd crypto tree. Since loader itself is using 64MB memory for heap, updated zfsboot to use same, and this also allows to support zfs large blocks. Note, adding additional features does increate zfsboot code, therefore this update does increase zfsboot code to 128k, also I have ported gptldr.S update to zfsldr.S to support 64k+ code. With this update, boot1.efi has almost reached the current limit of the size set for it, so one of the future patches for boot1.efi will need to increase the limit. Currently known missing zfs features in boot loader are edonr and gzip support. Reviewed by: delphij, imp Approved by: imp (mentor) Obtained from: sha256.c update and skein_zfs.c stub from illumos. Differential Revision: https://reviews.freebsd.org/D7418	2016-08-18 00:37:07 +00:00
markj	b1b1e507c6	MFV r301526: 7035 string-related subroutines should validate input earlier Reviewed by: Alex Wilson <alex.wilson@joyent.com> Reviewed by: Bryan Cantrill <bryan@joyent.com> Approved by: Matthew Ahrens <mahrens@delphix.com> Author: Patrick Mooney <pmooney@pfmooney.com> illumos/illumos-gate@771e39c3b1 MFC after: 2 weeks	2016-08-16 02:25:19 +00:00
markj	231aa84ab7	MFV r301525: 7033 ustack helper should fault on bad return values Reviewed by: Patrick Mooney <patrick.mooney@joyent.com> Reviewed by: Bryan Cantrill <bryan@joyent.com> Approved by: Matthew Ahrens <mahrens@delphix.com> Author: Alex Wilson <alex.wilson@joyent.com> illumos/illumos-gate@a2f72b65eb MFC after: 2 weeks	2016-08-16 02:20:02 +00:00
markj	33122f7379	MFV r301524: 7034 negative record sizes should be rejected Reviewed by: Patrick Mooney <patrick.mooney@joyent.com> Reviewed by: Bryan Cantrill <bryan@joyent.com> Approved by: Matthew Ahrens <mahrens@delphix.com> Author: Alex Wilson <alex.wilson@joyent.com> illumos/illumos-gate@0b8049bfb0 MFC after: 2 weeks	2016-08-16 02:18:34 +00:00
markj	81ae8f62f3	MFV r296989: 6734 dtrace_canstore_statvar() fails for some valid static variables Reviewed by: Dan McDonald <danmcd@omniti.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Bryan Cantrill <bryan@joyent.com> illumos/illumos-gate@d65f2bb4e5 MFC after: 2 weeks	2016-08-16 02:16:54 +00:00

1 2 3 4 5 ...

1616 Commits