freebsd-nq

Author	SHA1	Message	Date
Konstantin Belousov	5975e53d40	Fix a race in vm_page_busy_sleep(9). Suppose that we have an exclusively busy page, and a thread which can accept shared-busy page. In this case, typical code waiting for the page xbusy state to pass is again: VM_OBJECT_WLOCK(object); ... if (vm_page_xbusied(m)) { vm_page_lock(m); VM_OBJECT_WUNLOCK(object); <---1 vm_page_busy_sleep(p, "vmopax"); goto again; } Suppose that the xbusy state owner locked the object, unbusied the page and unlocked the object after we are at the line [1], but before we executed the load of the busy_lock word in vm_page_busy_sleep(). If it happens that there is still no waiters recorded for the busy state, the xbusy owner did not acquired the page lock, so it proceeded. More, suppose that some other thread happen to share-busy the page after xbusy state was relinquished but before the m->busy_lock is read in vm_page_busy_sleep(). Again, that thread only needs vm_object lock to proceed. Then, vm_page_busy_sleep() reads busy_lock value equal to the VPB_SHARERS_WORD(1). In this case, all tests in vm_page_busy_sleep(9) pass and we are going to sleep, despite the page being share-busied. Update check for m->busy_lock == VPB_UNBUSIED in vm_page_busy_sleep(9) to also accept shared-busy state if we only wait for the xbusy state to pass. Merge sequential if()s with the same 'then' clause in vm_page_busy_sleep(). Note that the current code does not share-busy pages from parallel threads, the only way to have more that one sbusy owner is right now is to recurse. Reported and tested by: pho (previous version) Reviewed by: alc, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D8196	2016-10-13 14:41:05 +00:00
Konstantin Belousov	f71d08566c	Limit scope of the optimization in r306608 to dounmount() caller only. Other uses of cache_purgevfs() do rely on the cache purge for correct operations, when paths are invalidated without unmount. Reported and tested by: jkim Discussed with: mjg Sponsored by: The FreeBSD Foundation	2016-10-07 11:38:28 +00:00
Andriy Gapon	6f98c83306	implement zfs_vptocnp() using z_parent property This should allow vn_fullpath() to work even when vfs name cache is disabled for zfs, which is the case when zfs properties like casesensitivity and normalization are set non-default values. The new code should be 100% reliable for directories and "mostly" reliable for files, that is, when hardlinks across directories are not used. Reported by: Frederic Chardon <chardon.frederic@gmail.com> Reviewed by: kib (vfs contract) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D8146	2016-10-07 06:29:24 +00:00
Andriy Gapon	9ba3abc30e	zfs: fix a wrong assertion for extended attributes For the extended attributes the order between z_teardown_lock and the vnode lock is different. The bug was triggered only with DIAGNOSTIC turned on. This fix is developed in cooperation with avos. PR: 213112 Reported by: avos Tested by: avos MFC after: 1 week	2016-10-04 08:09:25 +00:00
Mark Johnston	4538cee5bf	Allow tracing of functions prefixed by "__". This restriction was inherited from upstream but is not relevant on FreeBSD. Furthermore, it hindered the tracing of locking primitive subroutines. MFC after: 1 week	2016-10-02 00:35:00 +00:00
Alexander Motin	863ef2ca62	Add #ifdef _KERNEL around send_holes_without_birth_time sysctl. Reported by: avg@	2016-09-29 17:48:53 +00:00
Alexander Motin	226a11f81e	MFV r306423: 7402 Create tunable to ignore hole_birth feature Until we can resolve the numerous hole_birth bugs that have cropped up recently, and come up with a way going forwards to protect users from corruption, we should disable the hole_birth feature. Using a tunable allows those who are confident that their data is correct to continue to take advantage of the feature. Closes #188 Reviewed by: Matthew Ahrens <mahrens@delphix.com> Author: Paul Dagnelie <pcd@delphix.com>	2016-09-29 00:00:37 +00:00
Alexander Motin	bb97118138	MFV r306422: 7254 ztest failed assertion in ztest_dataset_dirobj_verify: dirobjs + 1 == usedobjs dsl_dataset_space is looking at the ds_bp's fill count while dmu_objset_write_ready() is concurrently modifying it. This fix adds an rrwlock to protect the ds_bp. Closes #180 Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Steve Gonczi <steve.gonczi@delphix.com> Author: Paul Dagnelie <pcd@delphix.com>	2016-09-28 23:54:47 +00:00
Mark Johnston	9e579a58c3	Move implementations of uread() and uwrite() to the illumos compat layer. MFC after: 1 week	2016-09-24 21:40:14 +00:00
Andriy Gapon	d26312a4e4	fix vnode lock assertion for extended attributes directory Background. In ZFS a file with extended attributes has a special directory associated with it where each extended attribute is a file. The attribute's name is a file name and its value is a file content. When the ownership of a file with extended attributes is changed, ZFS also changes ownership of the special directory. This is where the bug was hit. The bug was introduced in r209158. Nota bene. ZFS vnode locks are typically acquired before z_teardown_lock (i.e., before ZFS_ENTER). But this is not the case for the vnodes that represent the extended attribute directory and files. Those are always locked after ZFS_ENTER. This is confusing and fragile. PR: 212702 Reported by: Christian Fuss to FreeNAS Tested by: mav MFC after: 1 week	2016-09-24 08:13:15 +00:00
Mark Johnston	36f5d07745	Re-check the systrace probe ID before calling dtrace_probe(). Otherwise there exists a narrow window during which a syscall probe can be disabled and cause a concurrently-running thread to call dtrace_probe() with an invalid probe ID. Reported by: ngie MFC after: 1 week Sponsored by: Dell EMC Isilon	2016-09-22 23:22:53 +00:00
Allan Jude	c2b475d0ee	MFV r268120: 4936 lz4 could theoretically overflow a pointer with a certain input illumos/illumos-gate@58d0718061 Reviewed by: delphij MFC after: 2 weeks Sponsored by: ScaleEngine Inc. Differential Revision: https://reviews.freebsd.org/D7850	2016-09-11 17:48:06 +00:00
Alexander Motin	20e45e033c	Switch random_get_pseudo_bytes() shim to arc4rand(). Our shim for Solaris random_get_bytes() uses read_random(), that looks reasonable, since it guaranties reliably seeded random data. On the other side Solaris random_get_pseudo_bytes() does not provide this guarantie, and its original Solaris implementation is equivalent to our arc4rand(), using software crypto without stressing slower hardware RNG.	2016-09-10 09:37:41 +00:00
Alexander Motin	4605bf63c4	MFV r305562: 7259 DS_FIELD_LARGE_BLOCKS is unused The DS_FIELD_LARGE_BLOCKS macro has been unused since the integration of this patch: commit ca0cc3918a1789fa839194af2a9245f801a06b1a Author: Matthew Ahrens <mahrens@delphix.com> Date: Fri Jul 24 09:53:55 2015 -0700 5959 clean up per-dataset feature count code Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: George Wilson <george@delphix.com> Reviewed by: Alex Reece <alex@delphix.com> Approved by: Richard Lowe <richlowe@richlowe.net> This patch simply removes this macro from dsl_dataset.h. Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Dan McDonald <danmcd@omniti.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Author: Matthew Ahrens <mahrens@delphix.com>	2016-09-07 20:09:24 +00:00
Alexander Motin	de1fdddeda	MFV r305560: 7278 tuning zfs_arc_max does not impact arc_c_min When changing zfs_arc_max (e.g. as zdb does), it may be set to less than the default arc_c_min. arc_c_min should decrease to not be more than arc_c_max, but it doesn't; therefore tuning of arc_c_max is ineffective. Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Author: Matthew Ahrens <mahrens@delphix.com> openzfs/openzfs@608764bead	2016-09-07 20:05:10 +00:00
Andriy Gapon	1a82707cd7	fix zfs pool creation accidentally broken by r305331 The upstream change introduced a new load state, SPA_LOAD_CREATE, and vdev_geom code needs to be aware of it. Tested by: cy MFC after: 1 week X-MFC with: r305331	2016-09-06 06:09:12 +00:00
Alexander Motin	9b9258a12a	Missed FreeBSD-specific piece of r305338.	2016-09-03 11:17:33 +00:00
Alexander Motin	d7e781bda3	MFC r305337: 7004 dmu_tx_hold_zap() does dnode_hold() 7x on same object Using a benchmark which has 32 threads creating 2 million files in the same directory, on a machine with 16 CPU cores, I observed poor performance. I noticed that dmu_tx_hold_zap() was using about 30% of all CPU, and doing dnode_hold() 7 times on the same object (the ZAP object that is being held). dmu_tx_hold_zap() keeps a hold on the dnode_t the entire time it is running, in dmu_tx_hold_t:txh_dnode, so it would be nice to use the dnode_t that we already have in hand, rather than repeatedly calling dnode_hold(). To do this, we need to pass the dnode_t down through all the intermediate calls that dmu_tx_hold_zap() makes, making these routines take the dnode_t* rather than an objset_t* and a uint64_t object number. In particular, the following routines will need to have analogous *_by_dnode() variants created: dmu_buf_hold_noread() dmu_buf_hold() zap_lookup() zap_lookup_norm() zap_count_write() zap_lockdir() zap_count_write() This can improve performance on the benchmark described above by 100%, from 30,000 file creations per second to 60,000. (This improvement is on top of that provided by working around the object allocation issue. Peak performance of ~90,000 creations per second was observed with 8 CPUs; adding CPUs past that decreased performance due to lock contention.) The CPU used by dmu_tx_hold_zap() was reduced by 88%, from 340 CPU-seconds to 40 CPU-seconds. Sponsored by: Intel Corp. Closes #109 Reviewed by: Steve Gonczi <steve.gonczi@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Ned Bass <bass6@llnl.gov> Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Author: Matthew Ahrens <mahrens@delphix.com> openzfs/openzfs@d3e523d489	2016-09-03 11:00:29 +00:00
Alexander Motin	4ad4b70e77	MFV r305336: 7247 zfs receive of deduplicated stream fails This resolves two 'zfs recv' issues. First, when receiving into an existing filesystem, a snapshot created during the receive process is not added to the guid->dataset map for the stream, resulting in failed lookups for deduped streams when a WRITE_BYREF record refers to a snapshot received earlier in the stream. Second, the newly created snapshot was also not set properly, referencing the snapshot before the new receiving dataset rather than the existing filesystem. Closes #159 Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Author: Chris Williamson <chris.williamson@delphix.com> openzfs/openzfs@b09697c8c1	2016-09-03 10:59:05 +00:00
Alexander Motin	070da3f779	MFV r305335: 7003 zap_lockdir() should tag hold zap_lockdir() / zap_unlockdir() should take a "void *tag" argument which tags the hold on the zap. This will help diagnose programming errors which misuse the hold on the ZAP. Sponsored by: Intel Corp. Closes #108 Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Steve Gonczi <steve.gonczi@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Author: Matthew Ahrens <mahrens@delphix.com> openzfs/openzfs@0780b3eab5	2016-09-03 10:58:14 +00:00
Alexander Motin	d3ec2cdb4a	MFV r304157: 7230 add assertions to dmu_send_impl() to verify that stream includes BEGIN and END records illumos/illumos-gate@12b90ee2d3 https://github.com/illumos/illumos-gate/commit/12b90ee2d3b10689fc45f4930d2392f5f e1d9cfa https://www.illumos.org/issues/7230 A test failure occurred where a send stream had only a BEGIN record. This should not be possible if the send returns without error. Prevented this from happening in the future by adding an assertion to dmu_send_impl() to verify that if the function returns 0 (success) both a BEGIN and END record are present. Did this by adding flags to dmu_sendarg_t (indicating whether BEGIN o r END records sent), having dump_record() set flags appropriately, adding VERIFY statement to dmu_send_impl(). Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Matt Krantz <matt.krantz@delphix.com>	2016-09-03 10:10:58 +00:00
Alexander Motin	7aafc9d4c8	MFV r304156: 7235 remove unused func dsl_dataset_set_blkptr illumos/illumos-gate@bd56f80007 https://github.com/illumos/illumos-gate/commit/bd56f80007857b960e0981ed0797ad8ec 844a96b https://www.illumos.org/issues/7235 The function dsl_dataset_set_blkptr() is unused. We should remove it. Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Alex Reece <alex@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Matthew Ahrens <mahrens@delphix.com>	2016-09-03 10:09:23 +00:00
Alexander Motin	c9fa25c110	MFV r304155: 7090 zfs should improve allocation order and throttle allocations illumos/illumos-gate@0f7643c737 https://github.com/illumos/illumos-gate/commit/0f7643c7376dd69a08acbfc9d1d7d548b 10c846a https://www.illumos.org/issues/7090 When write I/Os are issued, they are issued in block order but the ZIO pipelin e will drive them asynchronously through the allocation stage which can result i n blocks being allocated out-of-order. It would be nice to preserve as much of the logical order as possible. In addition, the allocations are equally scattered across all top-level VDEVs but not all top-level VDEVs are created equally. The pipeline should be able t o detect devices that are more capable of handling allocations and should allocate more blocks to those devices. This allows for dynamic allocation distribution when devices are imbalanced as fuller devices will tend to be slower than empty devices. The change includes a new pool-wide allocation queue which would throttle and order allocations in the ZIO pipeline. The queue would be ordered by issued time and offset and would provide an initial amount of allocation of work to each top-level vdev. The allocation logic utilizes a reservation system to reserve allocations that will be performed by the allocator. Once an allocatio n is successfully completed it's scheduled on a given top-level vdev. Each top- level vdev maintains a maximum number of allocations that it can handle (mg_alloc_queue_depth). The pool-wide reserved allocations (top-levels * mg_alloc_queue_depth) are distributed across the top-level vdevs metaslab groups and round robin across all eligible metaslab groups to distribute the work. As top-levels complete their work, they receive additional work from the pool-wide allocation queue until the allocation queue is emptied. Reviewed by: Adam Leventhal <ahl@delphix.com> Reviewed by: Alex Reece <alex@delphix.com> Reviewed by: Christopher Siden <christopher.siden@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Sebastien Roy <sebastien.roy@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: George Wilson <george.wilson@delphix.com>	2016-09-03 10:04:37 +00:00
Alexander Motin	0b51a59fc7	MFV r303078: 7086 ztest attempts dva_get_dsize_sync on an embedded blockpointer illumos/illumos-gate@926549256b https://github.com/illumos/illumos-gate/commit/926549256b71acd595f69b236779ff6b7 8fa08ef https://www.illumos.org/issues/7086 In dbuf_dirty(), we need to grab the dn_struct_rwlock before looking at the db_blkptr, to prevent it from being changed by syncing context. Otherwise we may see that ztest got a segfault from this stack: libzpool.so.1`dva_get_dsize_sync+0x98(872f000, `b32b240`, fed7811b, 0, b4cda20, 0) libzpool.so.1`bp_get_dsize+0x60(872f000, `b32b240`, 0, 97cb780, 9d4c1a8, 0) libzpool.so.1`dbuf_dirty+0x9b3(ce0a100, 97cb780, 9, fecd2530) libzpool.so.1`dmu_buf_will_dirty+0xc3(ce0a100, 97cb780, ea293d6c, 1) libzpool.so.1`zap_lockdir+0x1a0(8aaa3c0, 1, 0, 97cb780, 1, 1) libzpool.so.1`zap_remove_norm+0x30(8aaa3c0, 1, 0, 8728b10, 0, 97cb780) libzpool.so.1`zap_remove+0x29(8aaa3c0, 1, 0, 8728b10, 97cb780, a) ztest_replay_remove+0x225(ea294588, 8728ae8, 0, 38010000, 0, 0) ztest_remove+0x9f(ea294588, ea293f50, 4, 3) ztest_object_init+0x78(ea294588, ea293f50, 4e0, 1) ztest_dmu_object_alloc_free+0x71(ea294588, 13) ztest_dmu_objset_create_destroy+0x224(80cef08, 13, 0, 805d36c, 9017ad44, 0) ztest_execute+0x89(a, 807c720, 13, 0) ztest_thread+0xea(13, 0, 0, 0) libc.so.1`_thrp_setup+0x88(f0983240) libc.so.1`_lwp_start(f0983240, 0, 0, 0, 0, 0) Looking into it a bit, we see that this is an embedded blockpointer, so BP_GET_NDVAS should have returned 0: b32b240::blkptr EMBEDDED [L0 ZAP_OTHER] et=0 LZ4 size=200L/4aP birth=80L Instead, it looks like another thread is modifying this blockpointer: b32b240::ugrep \| ::whatis f47a0e0c is in [ stack tid=0x19f ] ebd6ec40 is in [ stack tid=0x226 ] ea293bd0 is in [ stack tid=0x244 ] ea293be4 is in [ stack tid=0x244 ] Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Matthew Ahrens <mahrens@delphix.com>	2016-09-03 08:43:43 +00:00
Alexander Motin	84c3781ac9	MFV r303077: 7072 zfs fails to expand if lun added when os is in shutdown state illumos/illumos-gate@c39a2aae1e `c39a2aae1e` https://www.illumos.org/issues/7072 upstream: 38733 zfs fails to expand if lun added when os is in shutdown state DLPX-36910 spares and caches should not display expandable space DLPX-39262 vdev_disk_open spam zfs_dbgmsg buffer Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Alex Reece <alex@delphix.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: George Wilson <george.wilson@delphix.com>	2016-09-03 08:42:12 +00:00
Alexander Motin	efa0867fb0	MFV r302991: 6950 ARC should cache compressed data illumos/illumos-gate@dcbf3bd6a1 `dcbf3bd6a1` https://www.illumos.org/issues/6950 When reading compressed data from disk, the ARC should keep the compressed block cached and only decompress it when consumers access the block. The uncompressed data should be short-lived allowing the ARC to cache a much larger amount of data. The DMU would also maintain a smaller cache of uncompressed blocks to minimize the impact of decompressing frequently accessed blocks. Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Don Brady <don.brady@intel.com> Reviewed by: Richard Elling <Richard.Elling@RichardElling.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: George Wilson <george.wilson@delphix.com>	2016-09-03 08:30:51 +00:00
Alexander Motin	c543b519be	MFV r304158: 7136 ESC_VDEV_REMOVE_AUX ought to always include vdev information 7115 6922 generates ESC_ZFS_VDEV_REMOVE_AUX a bit too often illumos/illumos-gate@b72b6bb10a https://github.com/illumos/illumos-gate/commit/b72b6bb10ad55121a1b352c6f68ebdc8e 20c9086 https://www.illumos.org/issues/7136 6922 added ESC_ZFS_VDEV_REMOVE_AUX and ESC_ZFS_VDEV_REMOVE_DEV sysevents whenever an aux device gets removed from a pool. However, those sysevents will be created without the vdev_guid and vdev_path fields. It would be better to always populate those fields. https://www.illumos.org/issues/7115 The addition of spa_event_notify in vdev removal code (see #6922) causes event s to be generated even if the spare failed to be removed with EBUSY. Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net> Approved by: Robert Mustacchi <rm@joyent.com> Author: Alan Somers <asomers@gmail.com>	2016-09-01 18:37:11 +00:00
Alexander Motin	25584d12e7	MFV r302993: 7104 increase indirect block size illumos/illumos-gate@4b5c8e93ca https://github.com/illumos/illumos-gate/commit/4b5c8e93cab28d3c65ba9d407fd8f46e3 be1db1c https://www.illumos.org/issues/7104 The current default indirect block size is 16KB. We can improve performance by increasing it to 128KB. This is especially helpful for any workload that needs to read most of the metadata, e.g. scrub/resilver, file deletion, filesystem deletion, and zfs send. We also need to fix a few space estimation errors to make the tests pass. Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Dan McDonald <danmcd@omniti.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Matthew Ahrens <mahrens@delphix.com>	2016-09-01 18:33:39 +00:00
Alexander Motin	dd7f7cb7ac	MFV r302992: 7071 lzc_snapshot does not fill in errlist on ENOENT illumos/illumos-gate@25f7d993ad https://github.com/illumos/illumos-gate/commit/25f7d993adbfb3452ac4625b379167074 6d35ae3 https://www.illumos.org/issues/7071 upstream DLPX-40482 lzc_snapshot does not fill in errlist on ENOENT Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Matthew Ahrens <mahrens@delphix.com>	2016-09-01 18:25:49 +00:00
Alexander Motin	3d1e0e0830	MFV r302662: 6447 handful of nvpair cleanups illumos/illumos-gate@759e89be35 https://github.com/illumos/illumos-gate/commit/759e89be359f2af635e4122d147df56bc e948773 https://www.illumos.org/issues/6447 I got a patch from someone who uses nvpair code outside of illumos. It fixes a couple of gcc warnings/bugs for him. 1. silence uninitialized use warnings 2. add parentheses around assignment used as truth value 3. fix printf format specifier (ll is for integers only) 4. strstr, strspn, strcspn, and strcmp are declared in string.h, not strings.h. 5. avoid scanning integer into boolean variable Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net> Reviewed by: Andy Stormont <astormont@racktopsystems.com> Reviewed by: Garrett D'Amore <garrett@damore.org> Approved by: Robert Mustacchi <rm@joyent.com> Author: Steve Dougherty <sdougherty@barracuda.com>	2016-09-01 15:17:39 +00:00
Alexander Motin	3421688c2d	MFV r302661: 7082 bptree_iterate() passes wrong args to zfs_dbgmsg() illumos/illumos-gate@10e67aa0db https://github.com/illumos/illumos-gate/commit/10e67aa0db0823d5464aafdd681f3c966 155c68e https://www.illumos.org/issues/7082 upstream DLPX-40542 bptree_iterate() passes wrong args to zfs_dbgmsg() Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Matthew Ahrens <mahrens@delphix.com>	2016-09-01 15:10:40 +00:00
Alexander Motin	41b9077ef6	MFV r302660: 6314 buffer overflow in dsl_dataset_name illumos/illumos-gate@9adfa60d48 https://github.com/illumos/illumos-gate/commit/9adfa60d484ce2435f5af77cc99dcd4e6 92b6660 https://www.illumos.org/issues/6314 Callers of dsl_dataset_name pass a buffer of size ZFS_MAXNAMELEN, but dsl_dataset_name copies the datasets' name PLUS the snapshot name to it, resulting in a max of 2 * ZFS_MAXNAMELEN + '@'. Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Matthew Ahrens <mahrens@delphix.com>	2016-09-01 15:08:27 +00:00
Alexander Motin	e12a269749	MFV r302659: 6931 lib/libzfs: cleanup gcc warnings illumos/illumos-gate@88f61dee20 `88f61dee20` https://www.illumos.org/issues/6931 need cleanup: CERRWARN += -_gcc=-Wno-switch CERRWARN += -_gcc=-Wno-parentheses CERRWARN += -_gcc=-Wno-unused-function Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Igor Kozhukhov <ikozhukhov@gmail.com>	2016-09-01 14:57:06 +00:00
Alexander Motin	a95a9fe945	MFV r302651: 7054 dmu_tx_hold_t should use refcount_t to track space illumos/illumos-gate@0c779ad424 https://github.com/illumos/illumos-gate/commit/0c779ad424a92a84d1e07d47cab7f8009 189202b https://www.illumos.org/issues/7054 upstream: ee0003de7d3e598499be7ac3fe6b61efcc47cb7f DLPX-40399 dmu_tx_hold_t should use refcount_t to track space Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Matthew Ahrens <mahrens@delphix.com>	2016-09-01 14:38:25 +00:00
Alexander Motin	96bf48b8cb	MFV r302648: 7019 zfsdev_ioctl skips secpolicy when FKIOCTL is set Note that the bulk of the upstream change is not applicable to FreeBSD and the affected files are not even in the vendor area. illumos/illumos-gate@45b1747515 `45b1747515` https://www.illumos.org/issues/7019 Currently zfsdev_ioctl, when confronted by a request with the FKIOCTL flag set, skips all processing of secpolicy functions. This means that ZFS is not doing any kind of verification of the credentials or access rights of the caller and assuming that (as it is an in-kernel client) all such checks have already been done. This turns out to be quite a dangerous assumption, especially with respect to sdev. In general I don't think it's particularly reasonable to offload this enforcement of access rights onto other kernel subsystems when ZFS has some particular local semantics in this area (delegated datasets etc) and does not provide any kind of API to allow other subsystems to avoid code duplication when doing it. ZFS should apply its normal access policy to requests from within the kernel, and callers should take care to give it the correct credentials and call it from the correct context in order to get the results they need. You can observe the currently unfortunate consequences of this bug in any non- global zone that has access to /dev/zvol or any subset of it via sdev profiles. In particular, a zone used to contain a KVM or similar which has a single zvol passed through to it using a <device match= block in its zone XML. Even though sdev makes something of an attempt to control for whether the caller should have access to nodes in /dev/zvol, it doesn't do this correctly, or really at all in the lookup call path. So, if we have a zone that's been given access to any part of /dev/zvol, it can simply look up the full path to any other zvol on the entire system, and the node will appear and be able to be used. Reviewed by: Robert Mustacchi <rm@joyent.com> Reviewed by: Richard Lowe <richlowe@richlowe.net> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Alex Wilson <alex.wilson@joyent.com>	2016-09-01 14:24:54 +00:00
Alexander Motin	13876b47d7	MFV r302647: 6922 Emit ESC_ZFS_VDEV_REMOVE_AUX after removing an aux device illumos/illumos-gate@63364b0ee2 https://github.com/illumos/illumos-gate/commit/63364b0ee2604783e7a55f84258888677 68eafa4 https://www.illumos.org/issues/6922 ZFS does not do a config_sync after removing an aux (spare, log, or cache) device. AFAICT this isn't being done because it is slow and was deemed unnecessary. However, it should be such a rare operation that speed doesn't matter, and not doing it results in two problems: 1) It is theoretically possible to remove an aux device from one pool and attach it to another, then lose power. When power is restored, both pools woul d think that they own the aux device. 2) Removal of the aux device doesn't send any useful sysevents to userland. Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Alan Somers <asomers@gmail.com>	2016-09-01 14:17:30 +00:00
Alexander Motin	1c7d88abed	MFV r302646: 6980 6902 causes zfs send to break due to 32-bit/64-bit struct mismatch illumos/illumos-gate@ea4a67f462 https://github.com/illumos/illumos-gate/commit/ea4a67f462de0a39a9adea8197bcdef84 9de5371 https://www.illumos.org/issues/6980 doing zfs send -i snap1 snap2 >testfile results in internal error: Invalid argument Abort (core dumped) Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Matthew Ahrens <mahrens@delphix.com>	2016-09-01 14:06:30 +00:00
Alexander Motin	4536fd9bed	MFV r302643: 6902 speed up listing of snapshots if requesting name only and sorting by name This was our change from the beginning, so just reduce the upstream diff.	2016-09-01 13:29:53 +00:00
Alexander Motin	5fd28943d6	MFV r302642: 6876 Stack corruption after importing a pool with a too-long name illumos/illumos-gate@c971037baa `c971037baa` https://www.illumos.org/issues/6876 Calling dsl_dataset_name on a dataset with a 256 byte buffer is asking for trouble. We should check every dataset on import, using a 1024 byte buffer and checking each time to see if the dataset's new name is longer than 256 bytes. Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Paul Dagnelie <pcd@delphix.com>	2016-09-01 13:04:36 +00:00
Alexander Motin	9007a8679a	Fix kernel panic when inheriting properties without default. There are two writable hidden properties "iscsioptions" and "stmf_sbd_lu", that have no default string value. Attempt to unset them or replicate caused kernel panic. This simple bandaid seems fixes the problem nicely. MFC after: 2 weeks	2016-08-31 11:55:31 +00:00
Toomas Soome	f1624ed8c4	Bug 212114 - loader: zio_checksum_verify() must test spa for NULL pointer The issue was introduced with adding support for salted checksums, and was revealed by bhyve userboot.so. During pool discovery the loader is reading pool label from disks, and at that time the spa structure is not yet set up, so the NULL pointer is passed for spa. This condition must be checked to avoid the corruption of the memory and NULL pointer dereference. PR: 212114 Reported by: tsoome@freebsd.com Reviewed by: allanjude Approved by: allanjude (mentor) Differential Revision: https://reviews.freebsd.org/D7634	2016-08-24 16:30:15 +00:00
Toomas Soome	2c55d0903d	Add SHA512, skein, large blocks support for loader zfs. Updated sha512 from illumos. Using skein from freebsd crypto tree. Since loader itself is using 64MB memory for heap, updated zfsboot to use same, and this also allows to support zfs large blocks. Note, adding additional features does increate zfsboot code, therefore this update does increase zfsboot code to 128k, also I have ported gptldr.S update to zfsldr.S to support 64k+ code. With this update, boot1.efi has almost reached the current limit of the size set for it, so one of the future patches for boot1.efi will need to increase the limit. Currently known missing zfs features in boot loader are edonr and gzip support. Reviewed by: delphij, imp Approved by: imp (mentor) Obtained from: sha256.c update and skein_zfs.c stub from illumos. Differential Revision: https://reviews.freebsd.org/D7418	2016-08-18 00:37:07 +00:00
Mark Johnston	59ceeddecf	MFV r301526: 7035 string-related subroutines should validate input earlier Reviewed by: Alex Wilson <alex.wilson@joyent.com> Reviewed by: Bryan Cantrill <bryan@joyent.com> Approved by: Matthew Ahrens <mahrens@delphix.com> Author: Patrick Mooney <pmooney@pfmooney.com> illumos/illumos-gate@771e39c3b1 MFC after: 2 weeks	2016-08-16 02:25:19 +00:00
Mark Johnston	f66200ee22	MFV r301525: 7033 ustack helper should fault on bad return values Reviewed by: Patrick Mooney <patrick.mooney@joyent.com> Reviewed by: Bryan Cantrill <bryan@joyent.com> Approved by: Matthew Ahrens <mahrens@delphix.com> Author: Alex Wilson <alex.wilson@joyent.com> illumos/illumos-gate@a2f72b65eb MFC after: 2 weeks	2016-08-16 02:20:02 +00:00
Mark Johnston	4aea8f31b1	MFV r301524: 7034 negative record sizes should be rejected Reviewed by: Patrick Mooney <patrick.mooney@joyent.com> Reviewed by: Bryan Cantrill <bryan@joyent.com> Approved by: Matthew Ahrens <mahrens@delphix.com> Author: Alex Wilson <alex.wilson@joyent.com> illumos/illumos-gate@0b8049bfb0 MFC after: 2 weeks	2016-08-16 02:18:34 +00:00
Mark Johnston	b7125fa9cd	MFV r296989: 6734 dtrace_canstore_statvar() fails for some valid static variables Reviewed by: Dan McDonald <danmcd@omniti.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Bryan Cantrill <bryan@joyent.com> illumos/illumos-gate@d65f2bb4e5 MFC after: 2 weeks	2016-08-16 02:16:54 +00:00
Andriy Gapon	96762fe314	fix a zfs cross-device rename crash introduced in r303763 The problem was that 'zfsvfs' variable was not initialized if the error was detected, but in the exit path the variable was dereferenced before the error code was checked. Reported by: np MFC after: 3 days X-MFC with: r303763	2016-08-09 06:11:24 +00:00
Justin Hibbits	161c415133	Two fixups for dtrace * Use the right incantation to get the next stack pointer. Since powerpc uses special frames for traps, dereferencing the stack pointer straight up won't get us the next stack pointer in every case. * Clear EE using the correct instruction sequence. The PowerISA states that 'andi.' ANDs the register with 0\|\|<imm>, instead of sign extending or filling out the unavailable bits with 1. Even if it did sign extend, PSL_EE is 0x8000, so ~PSL_EE is 0x7fff, and the upper bits would be cleared. Use rlwinm in the 32-bit case, and a two-rotate sequence in the 64-bit case, the latter chosen to follow the output generated by gcc. MFC after: 1 week	2016-08-06 15:06:19 +00:00
Andriy Gapon	4fb51b52ef	fix .zfs-related cases in zfs_lookup that were broken by r303763 The problem is that the special .zfs nodes are not represented by znodes but by special gfs-based nodes. r303763 changed interface of zfs_dirlook such that started operating on znodes rather than on vnodes and, thus, the function became unsuitable for handling .zfs entities. The solution is to move the handling of the special cases to zfs_lookup, the only consumer of zfs_dirlook. I already had this solution applied in D7421, but for different reasons. Reported by: asomers MFC after: 3 days X-MFC with: r303763	2016-08-06 11:02:07 +00:00
Andriy Gapon	f79bc17233	zfs: honour and make use of vfs vnode locking protocol ZFS POSIX Layer is originally written for Solaris VFS which is very different from FreeBSD VFS. Most importantly many things that FreeBSD VFS manages on behalf of all filesystems are implemented in ZPL in a different way. Thus, ZPL contains code that is redundant on FreeBSD or duplicates VFS functionality or, in the worst cases, badly interacts / interferes with VFS. The most prominent problem is a deadlock caused by the lock order reversal of vnode locks that may happen with concurrent zfs_rename() and lookup(). The deadlock is a result of zfs_rename() not observing the vnode locking contract expected by VFS. This commit removes all ZPL internal locking that protects parent-child relationships of filesystem nodes. These relationships are protected by vnode locks and the code is changed to take advantage of that fact and to properly interact with VFS. Removal of the internal locking allowed all ZPL dmu_tx_assign calls to use TXG_WAIT mode. Another victim, disputable perhaps, is ZFS support for filesystems with mixed case sensitivity. That support is not provided by the OS anyway, so in ZFS it was a buch of dead code. To do: - replace ZFS_ENTER mechanism with VFS managed / visible mechanism - replace zfs_zget with zfs_vget[f] as much as possible - get rid of not really useful now zfs_freebsd_* adapters - more cleanups of unneeded / unused code - fix / replace .zfs support PR: 209158 Reported by: many Tested by: many (thank you all!) MFC after: 5 days Sponsored by: HybridCluster / ClusterHQ Differential Revision: https://reviews.freebsd.org/D6533	2016-08-05 06:23:06 +00:00

1 2 3 4 5 ...

1600 Commits