freebsd-dev

Author	SHA1	Message	Date
Andriy Gapon	912c3fe715	ZFS: add bookmark renaming The feature is implemented as an extension of the existing ZFS_IOC_RENAME ioctl. Both the userland and the DSL interfaces support renaming only a single bookmark at a time. As of now, there is no ZCP interface to the new functionality. I am going to add it once the DSL interface passes a test of time. This change picks up support for zfs_ioc_namecheck_t::ENTITY_NAME that was added to ZoL as part of Redacted Send/Receive feature by Paul Dagnelie <pcd@delphix.com>. This is needed to allow a bookmark name in zc_name. Discussed with: mahrens Reviewed by: bcr (man page) Sponsored by: CyberSecure Differential Revision: https://reviews.freebsd.org/D21795	2019-10-03 11:08:45 +00:00
Andriy Gapon	38a1def12f	MFZoL: Retire send space estimation via ZFS_IOC_SEND Add a small wrapper around libzfs_core's lzc_send_space() to libzfs so that every legacy ZFS_IOC_SEND consumer, along with their userland counterpart estimate_ioctl(), can leverage ZFS_IOC_SEND_SPACE to request send space estimation. The legacy functionality in zfs_ioc_send() is left untouched for compatibility purposes. Obtained from: ZoL Obtained from: zfsonlinux/zfs@cf7684bc8d Author: loli10K <ezomori.nozomu@gmail.com> MFC after: 2 weeks	2019-09-22 08:44:41 +00:00
Andriy Gapon	62dd1037a9	print summary line for space estimate of zfs send from bookmark Although there is always a single stream and the total size in the summary is always equal to the size reported for the stream, it's nice to follow the usual output format. MFC after: 3 days	2019-09-22 08:34:23 +00:00
Sean Eric Fagan	0c3d878d1c	Fix a regression introduced in r344601, and work properly with the -v and -n options. PR: 240640 Reported by: Andriy Gapon <avg@FreeBSD.org> Reviewed by: avg MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D21709	2019-09-21 17:54:42 +00:00
Andriy Gapon	dfb2b9a361	update zfs send usage help with r352447 MFC after: 3 days	2019-09-19 09:48:01 +00:00
Andriy Gapon	496ba62c36	MFZoL: Add -vnP support to 'zfs send' for bookmarks zfsonlinux/zfs@835db58592 We have long supported estimating a size of an incremental stream from a snapshot. We should do the same for bookmarks as well. Obtained from: ZoL Author: loli10K <ezomori.nozomu@gmail.com> MFC after: 3 days	2019-09-17 13:58:15 +00:00
Emmanuel Vadot	c2a7c3bedd	pkgbase: Force zfs(8) and zpool(8) to be in the runtime package Those commands are needed to repair a FreeBSD installation so add them to the runtime package Reviewed by: bapt, gjb Differential Revision: https://reviews.freebsd.org/D21498	2019-09-05 14:07:49 +00:00
Andriy Gapon	b539c9bfbd	ZFS: Always refuse receving non-resume stream when resume state exists This fixes a hole in the situation where the resume state is left from receiving a new dataset and, so, the state is set on the dataset itself (as opposed to %recv child). Additionally, distinguish incremental and resume streams in error messages. This was also committed to ZoL: zfsonlinux/zfs@ebeb6f23bf MFC after: 2 weeks Sponsored by: CyberSecure	2019-09-04 07:33:22 +00:00
Li-Wen Hsu	c4bf2f169a	Fix dtrace test case after r351423 due to ping6(8) options changed Failure test case: cddl.usr.sbin.dtrace.common.ip.t_dtrace_contrib.tst_ipv6localicmp_ksh Sponsored by: The FreeBSD Foundation	2019-08-31 15:10:27 +00:00
Li-Wen Hsu	8dc0ad14ad	Fix tests use /etc/motd after r350184 by using an always existing file Sponsored by: The FreeBSD Foundation	2019-08-31 14:41:58 +00:00
Alexander Motin	4d08d25153	MFV/ZoL: Fix wrong assertion in libzfs diff error handling In compare(), all error cases set the error code to EPIPE, so when an error is set, the correct assertion to make is that the error is EPIPE, not EINVAL. Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@freqlabs.com> Closes #8743 zfsonlinux/zfs@9dc41a769d Submitted by: Ryan Moeller <ryan@freqlabs.com> MFC after: 1 week Sponsored by: iXsystems, Inc. Differential Revision: https://reviews.freebsd.org/D20118	2019-08-28 17:39:46 +00:00
Mark Johnston	e1a29d8c67	Add hold events for lockmgr probes, missed in r351361. MFC with: r351361	2019-08-21 23:47:01 +00:00
Mark Johnston	5b699f1614	Add lockmgr(9) probes to the lockstat DTrace provider. They follow the conventions set by rw and sx lock probes. There is an additional lockstat:::lockmgr-disown probe. Update lockstat(1) to report on contention and hold events for lockmgr locks. Document the new probes in dtrace_lockstat.4, and deduplicate some of the existing probe descriptions. Reviewed by: mjg MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21355	2019-08-21 23:43:58 +00:00
Mark Johnston	6ad06a5e50	Fix inverted predicates for sx lock hold events in lockstat(1). This caused shared sx holds to be reported as exclusive, and vice versa. Reviewed by: mjg MFC after: 3 days Sponsored by: The FreeBSD Foundation	2019-08-21 23:13:00 +00:00
Mateusz Piotrowski	2379f91cf9	zpool-features.7: Fix a typo Reported by: pstef Reviewed by: pstef Approved by: src (pstef) Differential Revision: https://reviews.freebsd.org/D21290	2019-08-16 10:43:23 +00:00
Andriy Gapon	5a75f51a1f	Revert r351076 and r351074 because of atomic_swap_64 on 32-bit platforms Trying to sort it out.	2019-08-15 15:27:58 +00:00
Andriy Gapon	93132b76cd	MFV r350898: 8423 8199 7432 Implement large_dnode pool feature 8423 8199 7432 Implement large_dnode pool feature 8423 Implement large_dnode pool feature 8199 multi-threaded dmu_object_alloc() 7432 Large dnode pool feature llumos/illumos-gate@54811da5ac `54811da5ac` https://www.illumos.org/issues/8423 https://www.illumos.org/issues/8199 https://www.illumos.org/issues/7432 ZoL issues: Improved dnode allocation #6564 Clean up large dnode code #6262 Fix dnode_hold() freeing dnode behavior #8172 Fix dnode allocation race #6414, #6439 Partial: Raw sends must be able to decrease nlevels #6821, #6864 Remove unnecessary txg syncs from receive_object() Closes #7197 This updates FreeBSD large_dnode code (that was imported from ZoL) to a version that was committed to illumos. It has some cleanups, improvements and fixes comparing to what we have in FreeBSD now. I think that the most significant update is 8199 multi-threaded dmu_object_alloc(). Obtained from: illumos MFC after: 3 weeks	2019-08-15 14:57:27 +00:00
Andriy Gapon	4139761bb5	MFV r350896: 6585 sha512, skein, and edonr have an unenforced dependency on extensible dataset illumos/illumos-gate@892586e8a1 `892586e8a1` https://www.illumos.org/issues/6585 In any pool without the extensible dataset feature flag already enabled, creating a dataset with dedup set to use one of the new checksums would result in the following panic as soon as any data was added: panic[cpu0]/thread=ffffff0006761c40: feature_get_refcount(spa, feature, &refcount) != 48 (0x30 != 0x30), file: ../../common/fs/zfs/zfeature.c line 390 ffffff0006761830 fffffffffba8fbdd () ffffff0006761890 zfs:feature_do_action+11a () ffffff00067618c0 zfs:spa_feature_incr+1e () ffffff0006761920 zfs:dmu_object_zapify+b7 () ffffff00067619b0 zfs:dsl_dataset_activate_feature+97 () ffffff0006761a20 zfs:dsl_dataset_sync+ba () ffffff0006761ab0 zfs:dsl_pool_sync+153 () ffffff0006761b70 zfs:spa_sync+26e () ffffff0006761c20 zfs:txg_sync_thread+227 () ffffff0006761c30 unix:thread_start+8 () Inspection showed that feature->fi_feature was 7, which is the value of SPA_FEATURE_EXTENSIBLE_DATASET in the spa_feature enum. Testing shows that the panic can be prevented by explicitly setting extensible dataset as a dependency for the sha512, edonr, and skein feature flags. Alternatively, the new checksums code could possibly be changed to obviate the need for the dependency. Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Richard Laager <rlaager@wiktel.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: ilovezfs <ilovezfs@icloud.com> Note that FreeBSD does not support ednor yet. MFC after: 2 weeks	2019-08-12 11:42:16 +00:00
Andriy Gapon	f62615062e	Allow ZVOL bookmarks to be listed recursively Many thanks to cryx-freebsd@h3q.com for reporting the problem and submitting a fix. I have chosen to take an equivalent but textually different patch from ZoL just to avoid increasing divergence between OpenZFS flavours. ZoL commit: zfsonlinux/zfse33da554c5daf0103b093f44ab5b90ad6c064c3f Author: loli10K <ezomori.nozomu@gmail.com> Date: Wed Sep 7 19:34:20 2016 +0200 PR: 197821 Submitted by: cryx-freebsd@h3q.com (alternative version) Reported by: cryx-freebsd@h3q.com Obtained from: ZoL MFC after: 1 week	2019-08-12 10:00:32 +00:00
Toomas Soome	b1b9326846	loader: support com.delphix:removing We should support removing vdev from boot pool. Update loader zfs reader to support com.delphix:removing. Reviewed by: allanjude MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D18901	2019-08-08 18:08:13 +00:00
Wolfram Schneider	2b299dcf0e	add forgotten opening bracket "(" PR: 237514 Reviewed by: allanjude MFC after: soon for 11.3 and 12 series Differential Revision: https://reviews.freebsd.org/D21009	2019-07-31 21:21:34 +00:00
Baptiste Daroussin	f34d9e5d6b	Fix a bug introduced with parallel mounting of zfs Incorporate a fix from zol: `ab5036df1c` commit log from upstream: Fix race in parallel mount's thread dispatching algorithm Strategy of parallel mount is as follows. 1) Initial thread dispatching is to select sets of mount points that don't have dependencies on other sets, hence threads can/should run lock-less and shouldn't race with other threads for other sets. Each thread dispatched corresponds to top level directory which may or may not have datasets to be mounted on sub directories. 2) Subsequent recursive thread dispatching for each thread from 1) is to mount datasets for each set of mount points. The mount points within each set have dependencies (i.e. child directories), so child directories are processed only after parent directory completes. The problem is that the initial thread dispatching in zfs_foreach_mountpoint() can be multi-threaded when it needs to be single-threaded, and this puts threads under race condition. This race appeared as mount/unmount issues on ZoL for ZoL having different timing regarding mount(2) execution due to fork(2)/exec(2) of mount(8). `zfs unmount -a` which expects proper mount order can't unmount if the mounts were reordered by the race condition. There are currently two known patterns of input list `handles` in `zfs_foreach_mountpoint(..,handles,..)` which cause the race condition. 1) #8833 case where input is `/a /a /a/b` after sorting. The problem is that libzfs_path_contains() can't correctly handle an input list with two same top level directories. There is a race between two POSIX threads A and B, * ThreadA for "/a" for test1 and "/a/b" * ThreadB for "/a" for test0/a and in case of #8833, ThreadA won the race. Two threads were created because "/a" wasn't considered as `"/a" contains "/a"`. 2) #8450 case where input is `/ /var/data /var/data/test` after sorting. The problem is that libzfs_path_contains() can't correctly handle an input list containing "/". There is a race between two POSIX threads A and B, * ThreadA for "/" and "/var/data/test" * ThreadB for "/var/data" and in case of #8450, ThreadA won the race. Two threads were created because "/var/data" wasn't considered as `"/" contains "/var/data"`. In other words, if there is (at least one) "/" in the input list, the initial thread dispatching must be single-threaded since every directory is a child of "/", meaning they all directly or indirectly depend on "/". In both cases, the first non_descendant_idx() call fails to correctly determine "path1-contains-path2", and as a result the initial thread dispatching creates another thread when it needs to be single-threaded. Fix a conditional in libzfs_path_contains() to consider above two. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Sebastien Roy <sebastien.roy@delphix.com> Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com> PR: 237517, 237397, 239243 Submitted by: Matthew D. Fuller <fullermd@over-yonder.net> (by email) MFC after: 3 days	2019-07-26 13:12:33 +00:00
Mark Johnston	d6eb98610f	Reference stdint.h types in ctf.5. MFC after: 1 week	2019-07-17 16:31:50 +00:00
Mariusz Zaborski	39d51a9400	DTrace: add a top level makefile to the new test suit Pointed out by: markj	2019-06-09 22:45:07 +00:00
Allan Jude	92e0d7f840	zpool.8: the comment property is not read-only The comment property was listed in the man page twice, once under the list of read-only properties, and again (correctly), under the list of user editable properties. PR: 238355 Reported by: Michael Zuo <muh.muhten@gmail.com> Sponsored by: Klara Systems	2019-06-06 01:32:00 +00:00
Mariusz Zaborski	75ed05ef7d	DTrace: create an amd64 test suit Create two tests checking if we can read urgs registers and if the rax register returns a correct number. Reviewed by: markj Discussed with: lwhsu MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D20364	2019-06-05 22:32:26 +00:00
Alexander Motin	9b048dd219	MFV r348583: 9847 leaking dd_clones (DMU_OT_DSL_CLONES) objects illumos/illumos-gate@17fb938fd6 Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com> Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Matthew Ahrens <mahrens@delphix.com>	2019-06-03 20:49:20 +00:00
Alexander Motin	eb106113c2	MFV r348580: 9559 zfs diff handles files on delete queue in fromsnap poorly illumos/illumos-gate@20633e304b Reviewed by: Joshua M. Clulow <josh@sysmgr.org> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Paul Dagnelie <pcd@delphix.com>	2019-06-03 20:40:32 +00:00
Alexander Motin	07a5c938c9	MFV r348578: 9962 zil_commit should omit cache thrash illumos/illumos-gate@cab3a55e15 Reviewed by: Matt Ahrens <matt@delphix.com> Reviewed by: Brad Lewis <brad.lewis@delphix.com> Reviewed by: Patrick Mooney <patrick.mooney@joyent.com> Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com> Approved by: Joshua M. Clulow <josh@sysmgr.org> Author: Prakash Surya <prakash.surya@delphix.com>	2019-06-03 20:24:40 +00:00
Alexander Motin	ce88141b27	MFV r348568: 9466 add JSON output support to channel programs illumos/illumos-gate@5267591016 Reviewed by: Matt Ahrens <matt@delphix.com> Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com> Reviewed by: Sara Hartse <sara.hartse@delphix.com> Reviewed by: John Kennedy <john.kennedy@delphix.com> Approved by: Dan McDonald <danmcd@joyent.com> Author: Alek Pinchuk <apinchuk@datto.com>	2019-06-03 19:15:06 +00:00
Alexander Motin	677ef2563d	MFV r348553: 9681 ztest failure in spa_history_log_internal due to spa_rename() illumos/illumos-gate@6aee0ad769 Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Matthew Ahrens <mahrens@delphix.com>	2019-06-03 18:32:56 +00:00
Alexander Motin	74f7070445	MFV r348552: 9682 page fault in dsl_async_clone_destroy() while opening pool illumos/illumos-gate@ade2c82828 Reviewed by: Brad Lewis <brad.lewis@delphix.com> Reviewed by: Matt Ahrens <matt@delphix.com> Reviewed by: Sara Hartse <sara.hartse@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Serapheim Dimitropoulos <serapheim@delphix.com>	2019-06-03 17:56:44 +00:00
Alexander Motin	c4af53b8f1	MFV r348534: 9616 Bogus error when attempting to set property on read-only pool illumos/illumos-gate@f62db44dbc Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Matt Ahrens <matt@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Andrew Stormont <astormont@racktopsystems.com>	2019-06-03 17:19:05 +00:00
Mark Johnston	f0e2814d34	Hook up the existing i386 DTrace tests to the build. Now that it's relatively easy to do so, we might as well. MFC after: 1 week Event: Waterloo Hackathon 2019	2019-05-22 03:42:03 +00:00
Mark Johnston	2df0edc13c	Make it possible to generate makefiles for arch-dependent DTrace tests. MFC after: 1 week Event: Waterloo Hackathon 2019	2019-05-22 03:10:23 +00:00
Leandro Lupori	ff7449d6f5	[PowerPC64] stand: fix build using clang 8 as compiler This change fixes "stand" build issues when using clang 8 as compiler. Submitted by: alfredo.junior_eldorado.org.br Reviewed by: jhibbits Differential Revision: https://reviews.freebsd.org/D20026	2019-05-20 19:21:35 +00:00
Allan Jude	19a9d4fa28	MFV/ZoL: `zfs userspace` ignored all unresolved UIDs after the first zfsonlinux/zfs@88cfff1824 zfs_main: fix `zfs userspace` squashing unresolved entries The `zfs userspace` squashes all entries with unresolved numeric values into a single output entry due to the comparsion always made by the string name which is empty in case of unresolved IDs. Fix this by falling to a numerical comparison when either one of string values is not found. This then compares any numerical values after all with a name resolved. Signed-off-by: Pavel Boldin <boldin.pavel@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Reported by: clusteradm Obtained from: ZFS-on-Linux MFC after: 3 days	2019-05-18 12:27:22 +00:00
Alexander Motin	d044b69950	Fix dataset name comparison in zfs_compare(). The code never returned match comparing two datasets (not snapshots). As result, uu_avl_find(), called from zfs_callback(), never succeeded, allowing to add same dataset into the list multiple times, for example: # zfs get name pers pers pers@z pers@z NAME PROPERTY VALUE SOURCE pers name pers - pers name pers - pers@z name pers@z - With the patch: # zfs get name pers pers pers@z pers@z NAME PROPERTY VALUE SOURCE pers name pers - pers@z name pers@z - MFC after: 1 week Sponsored by: iXsystems, Inc.	2019-05-08 01:35:43 +00:00
Li-Wen Hsu	132af14fde	Add a trailing empty line to match the test code output This is added for letting these long failing test case pass, and for consistency. The test code should be fixed later to not output this extra empty line. Sponsored by: The FreeBSD Foundation	2019-04-29 03:50:21 +00:00
Michael Tuexen	6eb0062dd0	Some test scripts use ncat --sctp --listen port to run an SCTP discard server in the background. However, when running in the background, stdin is closed and ncat initiates a graceful shutdown of the SCTP association. This is not expected by the client. Therefore, the ncat-based discard server is replaced by a perl-based one. In addition, to remove the dependency from ncat, which needs to be installed via the nmap port, also the code testing for a free SCTP port is changed to use the perl-based client. Finally, remove some debug output from the report generated. Reviewed by: lwhsu@ Differential Revision: https://reviews.freebsd.org/D20086	2019-04-28 19:07:31 +00:00
Edward Tomasz Napierala	b9d89f5e2f	Drop -g from CFLAGS for zfsd(8). No idea why it was ever there. Reviewed by: kib, ngie, asomers MFC after: 2 weeks Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D19915	2019-04-16 12:25:15 +00:00
Edward Tomasz Napierala	93a07b2278	Make zfsd(8) build obey CFLAGS. Reviewed by: imp Obtained from: CheriBSD MFC after: 2 weeks Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D19865	2019-04-10 13:42:37 +00:00
Mark Johnston	5ee81b26a8	Ensure that we use a 64-bit value for the last mmap() argument. When using __syscall(2), the offset argument is passed on the stack on amd64. Previously only 32 bits were written, so the upper 32 bits were garbage and could cause the test to fail. MFC after: 3 days Sponsored by: The FreeBSD Foundation	2019-03-20 23:35:15 +00:00
Enji Cooper	a530b61063	Integrate cddl/usr.sbin/zfds/tests into the FreeBSD test suite This change integrates the unit tests for zfsd into the test suite using the integration method described in r345203. This change removes the `LOCALBASE` includes added for the port version of googlemock/googletest, as well as unnecessary `LIBADD`/`DPADD` and `CXXFLAGS` defines, which are included in the `GTEST_CXXFLAGS` variable, as part of r345203. Reviewed by: asomers Approved by: emaste (mentor) MFC after: 2 months MFC with: r345203 Differential Revision: https://reviews.freebsd.org/D19552	2019-03-15 21:49:19 +00:00
Baptiste Daroussin	e7be6f4ed4	Fix a regression introduced in r344569 Import a fix from illumos (thanks Toomas Soomas for pointing at it) See https://www.illumos.org/issues/10205 for more details Illumos commit: `247b7da039` Submitted by: jack@gandi.net Reported by: cy Reviewed by: tsoome, cy, bapt Obtained from: Illumos	2019-02-27 13:49:41 +00:00
Baptiste Daroussin	c7851c5b7c	Fix regression introduced in r344569 Reported by: cy Tested by: cy Submitted by: Fatih Acar <fatih@gandi.net>	2019-02-27 07:55:53 +00:00
Sean Eric Fagan	50792eb553	Set process title during zfs send. This adds a '-V' option to 'zfs send', which sets the process title once a second to the progress information. This code has been in FreeNAS for a long time now; this is just upstreaming it here. It was originially written by delphij. Reviewed by: mav Obtained from: iXsystems, Inc Sponsored by: iXsystems, Inc Differential Revision: https://reviews.freebsd.org/D19184	2019-02-26 19:23:22 +00:00
Baptiste Daroussin	0b858c82d8	Implement parallel mounting for ZFS filesystem It was first implemented on Illumos and then ported to ZoL. This patch is a port to FreeBSD of the ZoL version. This patch also includes a fix for a race condition that was amended With such patch Delphix has seen a huge decrease in latency of the mount phase (https://github.com/openzfs/openzfs/commit/a3f0e2b569 for details). With that current change Gandi has measured improvments that are on par with those reported by Delphix. Zol commits incorporated: `a10d50f999` `e63ac16d25` Reviewed by: avg, sef Approved by: avg, sef Obtained from: ZoL MFC after: 1 month Relnotes: yes Sponsored by: Gandi.net Differential Revision: https://reviews.freebsd.org/D19098	2019-02-26 08:18:34 +00:00
Leandro Lupori	559af1ec16	Increase ctfconvert buffer size Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D19353	2019-02-25 18:52:47 +00:00
Mark Johnston	9747bd8e02	MFV r344364: 9058 postmortem DTrace frequently broken under vmware illumos/illumos-gate@793bd7e361 MFC after: 1 week	2019-02-20 17:10:30 +00:00
Andriy Gapon	885b0f9e91	zpool.8: sort zpool status flags in the same order as in illumos manual Just in case, while I was here. MFC after: 1 week	2019-02-20 13:37:27 +00:00
Andriy Gapon	d97ff345cf	zpool.8: document -D flag for zpool status The description is taken from the illumos manual. Reported by: stilezy@gmail.com MFC after: 1 week	2019-02-20 13:34:16 +00:00
Andriy Gapon	30d6475b3f	fix userland illumos taskq code to pass relative timeout to cv_timedwait Unlike illumos, FreeBSD cv_timedwait requires a relative timeout. That applies both to the kernel illumos compatibility code and to the userland "fake kernel" code. MFC after: 2 weeks Sponsored by: Panzura	2019-02-20 13:19:08 +00:00
Kirk McKusick	88640c0e8b	Create new EINTEGRITY error with message "Integrity check failed". An integrity check such as a check-hash or a cross-correlation failed. The integrity error falls between EINVAL that identifies errors in parameters to a system call and EIO that identifies errors with the underlying storage media. EINTEGRITY is typically raised by intermediate kernel layers such as a filesystem or an in-kernel GEOM subsystem when they detect inconsistencies. Uses include allowing the mount(8) command to return a different exit value to automate the running of fsck(8) during a system boot. These changes make no use of the new error, they just add it. Later commits will be made for the use of the new error number and it will be added to additional manual pages as appropriate. Reviewed by: gnn, dim, brueffer, imp Discussed with: kib, cem, emaste, ed, jilles Differential Revision: https://reviews.freebsd.org/D18765	2019-01-17 06:35:45 +00:00
Andriy Gapon	4c325393f3	MFV r342532: 5882 Temporary pool names Note that this commit brings only formatting changes that were done during the final review of the illumos change, because FreeBSD got the main changes before illumos. illumos/illumos-gate@04e5635652 `04e5635652` https://www.illumos.org/issues/5882 This is an import of the temporary pool names functionality from ZoL: `e2282ef57e` `26b42f3f9d` `2f3ec90061` `00d2a8c92f` `83e9986f6e` `023bbe6f01` It is intended to assist the creation and management of virtual machines that have their rootfs on ZFS on hosts that also have their rootfs on ZFS. These situations cause SPA namespace collisions when the standard name rpool is used in both cases. The solution is either to give each guest pool a name unique to the host, which is not always desireable, or boot a VM environment containing an ISO image to install it, which is cumbersome. MFC after: 1 week Sponsored by: Panzura	2018-12-26 11:03:14 +00:00
Andriy Gapon	f050611e7f	MFV r342469: 9630 add lzc_rename and lzc_destroy to libzfs_core illumos/illumos-gate@049ba636fa `049ba636fa` https://www.illumos.org/issues/9630 Rename and destroy are very useful operations that deserve to be in libzfs_core. And they are not hard to implement too. MFC after: 2 weeks Relnotes: maybe	2018-12-26 10:37:41 +00:00
Yuri Pankov	65c9ed85e4	dtrace(1): remove reference to dtruss that was removed from base system in r300226. PR: 211618 Reviewed by: gnn, markj, 0mp Approved by: kib (mentor, implicit) Differential Revision: https://reviews.freebsd.org/D17762	2018-10-31 15:29:26 +00:00
Michael Tuexen	1e88cc8b59	Add support for send, receive and state-change DTrace providers for SCTP. They are based on what is specified in the Solaris DTrace manual for Solaris 11.4. Reviewed by: 0mp, dteske, markj Relnotes: yes Differential Revision: https://reviews.freebsd.org/D16839	2018-08-22 21:23:32 +00:00
Matt Macy	d12e91d584	Make dnode definition uniform on !x86 gcc4 requires -fms-extensions to accept anonymous union members	2018-08-21 03:45:09 +00:00
Kyle Evans	7920ad944b	libbe(3): Move build goop back out of cddl/ Some background: in the GSoC project, libbe/Makefile lived in lib/libbe. I created projects/bectl branch, maintained the above for all of five minutes before I misread Makefile.inc1 and decided that it couldn't possibly build outside of cddl/, so I kicked the Makefile out into the cddl/ build and all was good. The misreading was of the bit where .WAIT is added to SUBDIR after lib, libexec but prior to building bin and cddl only during the install targets, which is the critical part. Fast forward- buildworld was still broken in my branch unbeknownst to me because I didn't nuke my OBJDIR. Combing through Makefile.inc1 eventually revealed the necessary magic to make sure that libbe's dependencies are specified well enough, and it becomes clear what needs done to make a non-cddl/ build work. This is an interesting prospect, because the build split is kind of annoying to work with. IGNORE_PRAGMA is added to avoid dropping WARNS by one more. This was previously pulled in via cddl/Makefile.inc.	2018-08-18 03:20:59 +00:00
Kyle Evans	f25a4e58ec	libbe(3): Remove -v from LDFLAGS -v is clearly not needed for linking, and it adds extra verbose information that is not necessary.	2018-08-18 03:08:54 +00:00
Mark Johnston	f0af0b312f	Add partial documentation for dtrace(1)'s -x configuration options. Some options are still missing descriptions, but they can be filled in over time. Submitted by: raichoo <raichoo@googlemail.com> Reviewed by: 0mp (previous version) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D16671	2018-08-16 19:28:44 +00:00
Will Andrews	450e5a4378	zfs: add ztest to the kyua test suite. This program is currently failing, and has been for >6 months on HEAD. Ideally, this should be run 24x7 in CI, to discover hard-to-find bugs that only manifest with concurrent i/o. Requested by: lwhsu, mmacy	2018-08-15 13:05:04 +00:00
Kyle Evans	f2fdf2a1dc	libbe(3)/bectl(8): Remove now-redundant include paths These were previously necessary because the libnvpair and libzfs_core includes were not installed into the SYSROOT, being a part of the copies target in include/Makefile rather than being installed with the library. This was fixed in r337696 and the headers are now installed properly, so we may let go of the cruft.	2018-08-13 05:01:19 +00:00
Kyle Evans	ce33c57d6c	Use INCS for non-sys/ libnvpair and libzfs_core includes While nothing was wrong with libnvpair.h, libzfs_core.h was only guarded by MK_CDDL rather than MK_CDDL && MK_ZFS. Rather than ugl'if'ying include/Makefile to impose the extra restriction, just move the non-sys/ includes into INCS with the respect lib builds. This has the added bonus of allowing third party packagers to try and split these libs out of the FreeBSD-runtime package, if they are so inclined. The sys/ include was left alone- generally userland libraries shouldn't install kernel headers. MFC after: 1 week	2018-08-13 03:38:32 +00:00
Matt Macy	cc0fbbb92e	MFV/ZoL: Implement large_dnode pool feature commit `50c957f702` Author: Ned Bass <bass6@llnl.gov> Date: Wed Mar 16 18:25:34 2016 -0700 Implement large_dnode pool feature Justification ------------- This feature adds support for variable length dnodes. Our motivation is to eliminate the overhead associated with using spill blocks. Spill blocks are used to store system attribute data (i.e. file metadata) that does not fit in the dnode's bonus buffer. By allowing a larger bonus buffer area the use of a spill block can be avoided. Spill blocks potentially incur an additional read I/O for every dnode in a dnode block. As a worst case example, reading 32 dnodes from a 16k dnode block and all of the spill blocks could issue 33 separate reads. Now suppose those dnodes have size 1024 and therefore don't need spill blocks. Then the worst case number of blocks read is reduced to from 33 to two--one per dnode block. In practice spill blocks may tend to be co-located on disk with the dnode blocks so the reduction in I/O would not be this drastic. In a badly fragmented pool, however, the improvement could be significant. ZFS-on-Linux systems that make heavy use of extended attributes would benefit from this feature. In particular, ZFS-on-Linux supports the xattr=sa dataset property which allows file extended attribute data to be stored in the dnode bonus buffer as an alternative to the traditional directory-based format. Workloads such as SELinux and the Lustre distributed filesystem often store enough xattr data to force spill bocks when xattr=sa is in effect. Large dnodes may therefore provide a performance benefit to such systems. Other use cases that may benefit from this feature include files with large ACLs and symbolic links with long target names. Furthermore, this feature may be desirable on other platforms in case future applications or features are developed that could make use of a larger bonus buffer area. Implementation -------------- The size of a dnode may be a multiple of 512 bytes up to the size of a dnode block (currently 16384 bytes). A dn_extra_slots field was added to the current on-disk dnode_phys_t structure to describe the size of the physical dnode on disk. The 8 bits for this field were taken from the zero filled dn_pad2 field. The field represents how many "extra" dnode_phys_t slots a dnode consumes in its dnode block. This convention results in a value of 0 for 512 byte dnodes which preserves on-disk format compatibility with older software. Similarly, the in-memory dnode_t structure has a new dn_num_slots field to represent the total number of dnode_phys_t slots consumed on disk. Thus dn->dn_num_slots is 1 greater than the corresponding dnp->dn_extra_slots. This difference in convention was adopted because, unlike on-disk structures, backward compatibility is not a concern for in-memory objects, so we used a more natural way to represent size for a dnode_t. The default size for newly created dnodes is determined by the value of a new "dnodesize" dataset property. By default the property is set to "legacy" which is compatible with older software. Setting the property to "auto" will allow the filesystem to choose the most suitable dnode size. Currently this just sets the default dnode size to 1k, but future code improvements could dynamically choose a size based on observed workload patterns. Dnodes of varying sizes can coexist within the same dataset and even within the same dnode block. For example, to enable automatically-sized dnodes, run # zfs set dnodesize=auto tank/fish The user can also specify literal values for the dnodesize property. These are currently limited to powers of two from 1k to 16k. The power-of-2 limitation is only for simplicity of the user interface. Internally the implementation can handle any multiple of 512 up to 16k, and consumers of the DMU API can specify any legal dnode value. The size of a new dnode is determined at object allocation time and stored as a new field in the znode in-memory structure. New DMU interfaces are added to allow the consumer to specify the dnode size that a newly allocated object should use. Existing interfaces are unchanged to avoid having to update every call site and to preserve compatibility with external consumers such as Lustre. The new interfaces names are given below. The versions of these functions that don't take a dnodesize parameter now just call the _dnsize() versions with a dnodesize of 0, which means use the legacy dnode size. New DMU interfaces: dmu_object_alloc_dnsize() dmu_object_claim_dnsize() dmu_object_reclaim_dnsize() New ZAP interfaces: zap_create_dnsize() zap_create_norm_dnsize() zap_create_flags_dnsize() zap_create_claim_norm_dnsize() zap_create_link_dnsize() The constant DN_MAX_BONUSLEN is renamed to DN_OLD_MAX_BONUSLEN. The spa_maxdnodesize() function should be used to determine the maximum bonus length for a pool. These are a few noteworthy changes to key functions: * The prototype for dnode_hold_impl() now takes a "slots" parameter. When the DNODE_MUST_BE_FREE flag is set, this parameter is used to ensure the hole at the specified object offset is large enough to hold the dnode being created. The slots parameter is also used to ensure a dnode does not span multiple dnode blocks. In both of these cases, if a failure occurs, ENOSPC is returned. Keep in mind, these failure cases are only possible when using DNODE_MUST_BE_FREE. If the DNODE_MUST_BE_ALLOCATED flag is set, "slots" must be 0. dnode_hold_impl() will check if the requested dnode is already consumed as an extra dnode slot by an large dnode, in which case it returns ENOENT. * The function dmu_object_alloc() advances to the next dnode block if dnode_hold_impl() returns an error for a requested object. This is because the beginning of the next dnode block is the only location it can safely assume to either be a hole or a valid starting point for a dnode. * dnode_next_offset_level() and other functions that iterate through dnode blocks may no longer use a simple array indexing scheme. These now use the current dnode's dn_num_slots field to advance to the next dnode in the block. This is to ensure we properly skip the current dnode's bonus area and don't interpret it as a valid dnode. zdb --- The zdb command was updated to display a dnode's size under the "dnsize" column when the object is dumped. For ZIL create log records, zdb will now display the slot count for the object. ztest ----- Ztest chooses a random dnodesize for every newly created object. The random distribution is more heavily weighted toward small dnodes to better simulate real-world datasets. Unused bonus buffer space is filled with non-zero values computed from the object number, dataset id, offset, and generation number. This helps ensure that the dnode traversal code properly skips the interior regions of large dnodes, and that these interior regions are not overwritten by data belonging to other dnodes. A new test visits each object in a dataset. It verifies that the actual dnode size matches what was stored in the ztest block tag when it was created. It also verifies that the unused bonus buffer space is filled with the expected data patterns. ZFS Test Suite -------------- Added six new large dnode-specific tests, and integrated the dnodesize property into existing tests for zfs allow and send/recv. Send/Receive ------------ ZFS send streams for datasets containing large dnodes cannot be received on pools that don't support the large_dnode feature. A send stream with large dnodes sets a DMU_BACKUP_FEATURE_LARGE_DNODE flag which will be unrecognized by an incompatible receiving pool so that the zfs receive will fail gracefully. While not implemented here, it may be possible to generate a backward-compatible send stream from a dataset containing large dnodes. The implementation may be tricky, however, because the send object record for a large dnode would need to be resized to a 512 byte dnode, possibly kicking in a spill block in the process. This means we would need to construct a new SA layout and possibly register it in the SA layout object. The SA layout is normally just sent as an ordinary object record. But if we are constructing new layouts while generating the send stream we'd have to build the SA layout object dynamically and send it at the end of the stream. For sending and receiving between pools that do support large dnodes, the drr_object send record type is extended with a new field to store the dnode slot count. This field was repurposed from unused padding in the structure. ZIL Replay ---------- The dnode slot count is stored in the uppermost 8 bits of the lr_foid field. The bits were unused as the object id is currently capped at 48 bits. Resizing Dnodes --------------- It should be possible to resize a dnode when it is dirtied if the current dnodesize dataset property differs from the dnode's size, but this functionality is not currently implemented. Clearly a dnode can only grow if there are sufficient contiguous unused slots in the dnode block, but it should always be possible to shrink a dnode. Growing dnodes may be useful to reduce fragmentation in a pool with many spill blocks in use. Shrinking dnodes may be useful to allow sending a dataset to a pool that doesn't support the large_dnode feature. Feature Reference Counting -------------------------- The reference count for the large_dnode pool feature tracks the number of datasets that have ever contained a dnode of size larger than 512 bytes. The first time a large dnode is created in a dataset the dataset is converted to an extensible dataset. This is a one-way operation and the only way to decrement the feature count is to destroy the dataset, even if the dataset no longer contains any large dnodes. The complexity of reference counting on a per-dnode basis was too high, so we chose to track it on a per-dataset basis similarly to the large_block feature. Signed-off-by: Ned Bass <bass6@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3542	2018-08-12 00:45:53 +00:00
Kyle Evans	3f48dbd1cc	Merge libbe(3)/bectl(8) from projects/bectl into head bectl(8) is an administrative interface for working with ZFS boot environments, intended to provide a superset of the functionality provided by sysutils/beadm. libbe(3) is the back-end library that the required functionality has been pulled out into for later reuse. These were originally written for GSoC 2017 under the mentorship of allanjude@. bectl(8) has proven pretty stable in my testing, with the known bug documented in the man page. Relnotes: yes	2018-08-11 23:50:09 +00:00
Kyle Evans	35d2028fb8	libbe(3)/bectl(8): More SYSROOT/GCC build fixes - Missing include path - Fully specify libzfs's dependencies (except for deps pulled in by other deps) in Makefile.inc1 - Drop WARNS back down to 2 for libbe(3). I do this with much hesitation, but the libzfs headers are apparently a hot warning-filled mess as far as GCC 4.2 is concerned.	2018-08-11 22:45:39 +00:00
Alexander Leidinger	a079a34fd5	Extend the info about the limitations of datasets in jails. Reviewed by: allanjude Sponsored by: Essen Hackathon	2018-08-11 20:49:19 +00:00
Brad Davis	edb1df35b0	Fix the build by just installing systop since testing shows it works with: dwatch -X systop Reviewed by: kp Approved by: allanjude (mentor)	2018-08-11 16:06:32 +00:00
Devin Teske	37b0d996dc	dwatch(1): Add systop profile Provides a top-like view of syscall consumers. MFC after: 3 days X-MFC-to: stable/11 Sponsored by: Smule, Inc.	2018-08-11 06:32:31 +00:00
Devin Teske	2282756519	dwatch(1): Fix syntax error in vop_readdir profile Reported by: Arne Ehrlich <ehrlich@consider-it.de> MFC after: 3 days X-MFC-to: stable/11 Sponsored by: Smule, Inc.	2018-08-11 06:13:11 +00:00
Kyle Evans	14b841d4a8	MFH @ r337607, in preparation for boarding	2018-08-11 04:26:29 +00:00
Mark Johnston	0b56e7a8e9	Disable the D subroutines msgsize() and msgdsize(). They are specific to illumos and the corresponding DIF subroutines are already disabled on FreeBSD. Reported by: gnn	2018-08-10 19:23:20 +00:00
Matt Macy	648cfe57fd	Performance optimization of AVL tree comparator functions MFV: commit `ee36c709c3` Author: Gvozden Neskovic <neskovic@gmail.com> Date: Sat Aug 27 20:12:53 2016 +0200 perf: 2.75x faster ddt_entry_compare() First 256bits of ddt_key_t is a block checksum, which are expected to be close to random data. Hence, on average, comparison only needs to look at first few bytes of the keys. To reduce number of conditional jump instructions, the result is computed as: sign(memcmp(k1, k2)). Sign of an integer 'a' can be obtained as: `(0 < a) - (a < 0)` := {-1, 0, 1} , which is computed efficiently. Synthetic performance evaluation of original and new algorithm over 1G random keys on 2.6GHz Intel(R) Xeon(R) CPU E5-2660 v3: old 6.85789 s new 2.49089 s perf: 2.8x faster vdev_queue_offset_compare() and vdev_queue_timestamp_compare() Compute the result directly instead of using conditionals perf: zfs_range_compare() Speedup between 1.1x - 2.5x, depending on compiler version and optimization level. perf: spa_error_entry_compare() `bcmp()` is not suitable for comparator use. Use `memcmp()` instead. perf: 2.8x faster metaslab_compare() and metaslab_rangesize_compare() perf: 2.8x faster zil_bp_compare() perf: 2.8x faster mze_compare() perf: faster dbuf_compare() perf: faster compares in spa_misc perf: 2.8x faster layout_hash_compare() perf: 2.8x faster space_reftree_compare() perf: libzfs: faster avl tree comparators perf: guid_compare() perf: dsl_deadlist_compare() perf: perm_set_compare() perf: 2x faster range_tree_seg_compare() perf: faster unique_compare() perf: faster vdev_cache _compare() perf: faster vdev_uberblock_compare() perf: faster fuid _compare() perf: faster zfs_znode_hold_compare() Signed-off-by: Gvozden Neskovic <neskovic@gmail.com> Signed-off-by: Richard Elling <richard.elling@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #5033	2018-08-10 06:42:08 +00:00
Alexander Motin	07ddc55096	MFV r337223: 9580 Add a hash-table on top of nvlist to speed-up operations illumos/illumos-gate@2ec7644aab Reviewed by: Matt Ahrens <matt@delphix.com> Reviewed by: Sebastien Roy <sebastien.roy@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Serapheim Dimitropoulos <serapheim@delphix.com>	2018-08-03 01:52:25 +00:00
Alexander Motin	0285589b38	MFV 337214: 9621 Make createtxg and guid properties public illumos/illumos-gate@e8d4a73c86 Reviewed by: Andy Stormont <astormont@racktopsystems.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Matt Ahrens <matt@delphix.com> Reviewed by: Yuri Pankov <yuripv@yuripv.net> Approved by: Robert Mustacchi <rm@joyent.com> Author: Josh Paetzel <josh@tcbug.org>	2018-08-03 00:24:27 +00:00
Alexander Motin	d49e9be14f	MFV r337184: 9457 libzfs_import.c:add_config() has a memory leak A memory leak occurs on lines 209 and 213 because the config is not freed in the error case. The interface to add_config() seems less than ideal - it would be better if it copied any data necessary from the config and the caller freed it. illumos/illumos-gate@ddfe901b12 Reviewed by: Matt Ahrens <matt@delphix.com> Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: sara hartse <sara.hartse@delphix.com>	2018-08-02 21:25:32 +00:00
Alexander Motin	2bce9a5316	MFV r337182: 9330 stack overflow when creating a deeply nested dataset Datasets that are deeply nested (~100 levels) are impractical. We just put a limit of 50 levels to newly created datasets. Existing datasets should work without a problem. illumos/illumos-gate@5ac95da7d6 Reviewed by: John Kennedy <john.kennedy@delphix.com> Reviewed by: Matt Ahrens <matt@delphix.com> Approved by: Garrett D'Amore <garrett@damore.org> Author: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>	2018-08-02 21:19:35 +00:00
Alexander Motin	ac879e61ad	9523 Large alloc in zdb can cause trouble 16MB alloc in zdb_embedded_block() can cause cores in certain situations (clang, gcc55). OsX commit: `ced236a5da` FreeBSD commit: https://svnweb.freebsd.org/base?view=revision&revision=326150 illumos/illumos-gate@03a4c2f4bf Reviewed by: Igor Kozhukhov <igor@dilos.org> Reviewed by: Andriy Gapon <avg@FreeBSD.org> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Dan McDonald <danmcd@joyent.com> Author: Jorgen Lundman <lundman@lundman.net> This is an update for r326150 (by avg), where this change comes from.	2018-08-02 20:44:07 +00:00
Alexander Motin	c423a5e6b8	MFV r337161: 9512 zfs remap poolname@snapname coredumps Only filesystems and volumes are valid "zfs remap" parameters: when passed a snapshot name zfs_remap_indirects() does not handle the EINVAL returned from libzfs_core, which results in failing an assertion and consequently crashing. illumos/illumos-gate@0b2e825398 Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: John Wren Kennedy <john.kennedy@delphix.com> Reviewed by: Sara Hartse <sara.hartse@delphix.com> Approved by: Matt Ahrens <mahrens@delphix.com> Author: loli10K <ezomori.nozomu@gmail.com>	2018-08-02 19:13:45 +00:00
Alexander Motin	7fca1b93c4	Do not blindly include illumos kernel headers instead of user-space. It is not needed now, and I doubt it much helped at all, creating more confusions then good.	2018-08-02 18:55:55 +00:00
Alexander Motin	b59e9cd1c0	MFV r316926: 7955 libshare needs to initialize only those datasets being modified by the consumer illumos/illumos-gate@8a981c3356 `8a981c3356` https://www.illumos.org/issues/7955 Libshare currently initializes all available filesystems when doing any libshare operation. This requires iterating through all the filesystem multiple times, which is a huge performance problem for sharing and unsharing operations. Reviewed by: Steve Gonczi <steve.gonczi@delphix.com> Reviewed by: Sebastien Roy <sebastien.roy@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Yuri Pankov <yuri.pankov@gmail.com> Approved by: Gordon Ross <gordon.w.ross@gmail.com> Author: Daniel Hoffman <dj.hoffman@delphix.com> For FreeBSD this is practically a NOP, just a diff reduction.	2018-08-01 21:51:49 +00:00
Michael Tuexen	7bda966394	Add a dtrace provider for UDP-Lite. The dtrace provider for UDP-Lite is modeled after the UDP provider. This fixes the bug that UDP-Lite packets were triggering the UDP provider. Thanks to dteske@ for providing the dwatch module. Reviewed by: dteske@, markj@, rrs@ Relnotes: yes Differential Revision: https://reviews.freebsd.org/D16377	2018-07-31 22:56:03 +00:00
Alexander Motin	200c27a75d	MFV r337014: 9421 zdb should detect and print out the number of "leaked" objects 9422 zfs diff and zdb should explicitly mark objects that are on the deleted queue illumos/illumos-gate@20b5dafb42 Reviewed by: Matt Ahrens <matt@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Approved by: Matt Ahrens <mahrens@delphix.com> Author: Paul Dagnelie <pcd@delphix.com>	2018-07-31 22:50:50 +00:00
Alexander Motin	0021e1c10c	MFV r336991, r337001: 9102 zfs should be able to initialize storage devices The first access to a disk block can incur a performance penalty on some platforms (e.g. AWS's EBS, VMware VMDKs). Therefore it is recommended that volumes be "thick provisioned", where supported by the platform (VMware). Thick provisioning is time consuming and often is ignored. If the thick provision step is omitted, customers will see suboptimal performance until we have written to all parts of the LUN. ZFS should be able to initialize any unused storage to remove any first-write penalty that exists. illumos/illumos-gate@094e47e980 Reviewed by: John Wren Kennedy <john.kennedy@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: George Wilson <george.wilson@delphix.com>	2018-07-31 21:06:04 +00:00
Alexander Motin	d1cf4052d0	MFV r336955: 9236 nuke spa_dbgmsg We should use zfs_dbgmsg instead of spa_dbgmsg. Or at least, metaslab_condense() should call zfs_dbgmsg because it's important and rare enough to always log. It's possible that the message in zio_dva_allocate() would be too high-frequency for zfs_dbgmsg. illumos/illumos-gate@21f7c81cc1 Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Richard Elling <Richard.Elling@RichardElling.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Matthew Ahrens <mahrens@delphix.com>	2018-07-31 00:47:27 +00:00
Alexander Motin	194000fa21	MFV r336950: 9290 device removal reduces redundancy of mirrors Mirrors are supposed to provide redundancy in the face of whole-disk failure and silent damage (e.g. some data on disk is not right, but ZFS hasn't detected the whole device as being broken). However, the current device removal implementation bypasses some of the mirror's redundancy. illumos/illumos-gate@3a4b1be953 Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Prashanth Sreenivasa <pks@delphix.com> Reviewed by: Sara Hartse <sara.hartse@delphix.com> Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com> Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Tim Chase <tim@chase2k.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Matthew Ahrens <mahrens@delphix.com>	2018-07-31 00:25:39 +00:00
Alexander Motin	6413a6d31f	MFV r336946: 9238 ZFS Spacemap Encoding V2 The current space map encoding has the following disadvantages: [1] Assuming 512 sector size each entry can represent at most 16MB for a segment. This makes the encoding very inefficient for large regions of space. [2] As vdev-wide space maps have started to be used by new features (i.e. device removal, zpool checkpoint) we've started imposing limits in the vdevs that can be used with them based on the maximum addressable offset (currently 64PB for a top-level vdev). The new remains backwards compatible with the old one. The introduced two-word entry format, besides extending the limits imposed by the single-entry layout, also includes a vdev field and some extra padding after its prefix. The extra padding after the prefix should is reserved for future usage (e.g. new prefixes for future encodings or new fields for flags). The new vdev field not only makes the space maps more self-descriptive, but also opens the doors for pool-wide space maps. One final important note is that the number of bits used for vdevs is reduced to 24 bits for blkptrs. That was decided as we don't know of any setups that use more than 16M vdevs for the time being and we wanted to fit the vdev field in the space map. In addition that gives us some extra bits in dva_t. illumos/illumos-gate@17f11284b4 Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <gwilson@zfsmail.com> Approved by: Gordon Ross <gwr@nexenta.com> Author: Serapheim Dimitropoulos <serapheim@delphix.com>	2018-07-30 23:47:38 +00:00
Alexander Motin	1960706625	MFV r336944: 9286 want refreservation=auto When a ZFS volume is created with zfs create -V (but without -s), the refreservation property is set to a value that is volsize plus the maximum size of metadata. If refreservation is ever set to another value, it is impossible to set it back to the automatically determined value. There are other cases where refreservation may be wrong. These include receiving a volume that was sent without properties and zfs clone. We need: zfs set refreservation=auto <volume> zfs clone -o refreservation=auto <volume> Each one would use the same function used by zfs create -V to determine the proper value for refreservation. illumos/illumos-gate@1c10ae76c0 Reviewed by: Allan Jude <allanjude@freebsd.org> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: John Kennedy <john.kennedy@delphix.com> Reviewed by: Andy Stormont <astormont@racktopsystems.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Mike Gerdts <mike.gerdts@joyent.com>	2018-07-30 22:39:30 +00:00
Kyle Evans	b29bf2f84e	libbe(3)/be(8): Drop WARNS overrides, fix all fallout Based on the idea that we shouldn't have all-new library and utility going into base that need WARNS=1... - Decent amount of constification - Lots of parentheses - Minor other nits	2018-07-25 15:14:35 +00:00
Kyle Evans	268af06d3e	Normalize bectl(8)/libbe(3) Makefiles, remove Makefile copyright/license Approved by: hselaskey	2018-07-24 19:55:02 +00:00
Kyle Evans	70a11a8eea	libbe(3): Add to cddl build, adjust src.libnames.mk as needed	2018-07-24 15:42:23 +00:00
Michael Tuexen	53e0911116	Improve TCP related tests for dtrace. Ensure that the TCP connections are terminated gracefully as expected by the test. Use appropriate numbers for sent/received packets. In addition, enable tst.localtcpstate.ksh, which should pass, but doesn't until https://reviews.freebsd.org/D16369 is committed. Reviewed by: markj@ Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D16288	2018-07-22 10:50:59 +00:00
Michael Tuexen	be029a4979	Test that the dtrace UDP receive probe fires. This test ensures that the fix committed in https://svnweb.freebsd.org/changeset/base/336551 actually works. Reviewed by: dteske@, markj@, rrs@ Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D16046	2018-07-20 15:37:29 +00:00
Michael Tuexen	10b803a40d	Adjust comment to reality since r286171. Sponsored by: Netflix, Inc.	2018-07-15 20:42:47 +00:00
Michael Tuexen	e0f9b8233f	Don't require a local sshd for the local TCP state dtrace test This change is similar to the one done in r286171 for tst.ipv4localtcp.ksh. This not only reduces the requirements on the system used for testing but results also in a graceful teardown of the TCP connection. Reviewed by: gnn@ Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D16276	2018-07-15 20:41:16 +00:00
Michael Tuexen	dc9f20b3f3	Fix the UDP tests for dtrace. The code imported from opensolaris was depending on ping supporting UDP for sending probes. Since this is not supported by ping on FreeBSD use a perl script instead. The remote test requires the usage of ksh93, so state that in the sheband. Enable the local test, but keep the remote test disabled, since it requires a remote machine on the LAN. Reviewed by: markj@, gnn@ Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D16268	2018-07-15 20:34:22 +00:00
Michael Tuexen	60e4fb3a3f	Return the intended return code. This bug was spotted by markj@ in D16268 because I copied this code part and used it there. So fix it. Sponsored by: Netflix, Inc.	2018-07-14 19:53:41 +00:00
Michael Tuexen	49d7124b18	Fix shebangs and execute bit of test scripts. Since we don't have /usr/bin/ksh, use a generic way of specifying ksh. Some of the tests only run with ksh93, so use this shell for these tests. Two of the tests don't have the execute bit set, so fix this, too. Reviewed by: markj@ Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D16270	2018-07-14 19:49:14 +00:00

1 2 3 4 5 ...

1247 Commits