freebsd-skq

Author	SHA1	Message	Date
Mark Johnston	6c2806594b	Make the second argument of dtrace_invop() a trapframe pointer. Currently this argument is a pointer into the stack which is used by FBT to fetch the first five probe arguments. On all non-x86 architectures it's simply the trapframe address, so this change has no functional impact. On amd64 it's a pointer into the trapframe such that stack[1 .. 5] gives the first five argument registers, which are deliberately grouped together in the amd64 trapframe definition. A trapframe argument simplifies the invop handlers on !x86 and makes the x86 FBT invop handler easier to understand. Moreover, it allows for invop handlers that may want to modify the register set of the interrupted thread.	2016-04-17 23:08:47 +00:00
Andriy Gapon	e01dd79f9a	zfs_rezget: z_vnode can not be NULL if zp is valid MFC after: 3 weeks	2016-04-16 07:41:56 +00:00
Andriy Gapon	c2d36fc5cd	zfs: enable vn_io_fault support Note that now we have to account for possible partial writes in dmu_write_uio_dbuf(). It seems that on illumos either all or none of the data are expected to be written. But the partial writes are quite expected when vn_io_fault support is enabled. Reviewed by: kib MFC after: 7 weeks Differential Revision: https://reviews.freebsd.org/D2790	2016-04-16 07:35:53 +00:00
Alan Somers	739f4ae3b1	Don't corrupt ZFS label's physpath attribute when booting while a disk is missing Prior to this change, vdev_geom_open_by_path would call vdev_geom_attach prior to verifying the device's GUIDs. vdev_geom_attach calls vdev_geom_attrchange to set the physpath in the vdev object. The result is that if the disk could not be found, then the labels for other disks in the same TLD would overwrite the missing disk's physpath with the physpath of whichever disk currently has the same devname as the missing one used to have. MFC after: 4 weeks Sponsored by: Spectra Logic Corp	2016-04-15 16:36:17 +00:00
Alan Somers	c29088b5c7	Add more debugging statements in vdev_geom.c Log a debugging message whenever geom functions fail in vdev_geom_attach. Printing these messages is controlled by vfs.zfs.debug MFC after: 4 weeks Sponsored by: Spectra Logic Corp	2016-04-14 23:14:41 +00:00
Alan Somers	f0ac053088	Update a debugging message in vdev_geom_open_by_guids for consistency with similar messages elsewhere in the file. MFC after: 4 weeks Sponsored by: Spectra Logic Corp	2016-04-14 19:20:31 +00:00
Alan Somers	4e3ab010a2	Fix rare double free in vdev_geom_attrchanged sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c Don't drop the g_topology_lock before freeing old_physpath. That opens up a race where one thread can call vdev_geom_attrchanged, set old_physpath, drop the g_topology_lock, then block trying to acquire the SCL_STATE lock. Then another thread can come into vdev_geom_attrchanged, set old_physpath to the same value, and proceed to free it. When the first thread resumes, it will free the same location. It turns out that the SCL_STATE lock isn't needed. It was originally added by gibbs to protect vd->vdev_physpath while updating the same. However, the update process subsequently was switched to an atomic operation (a pointer swap). Now, there is no need for the SCL_STATE lock, and hence no need to drop the g_topology_lock. Reviewed by: delphij MFC after: 4 weeks Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D5413	2016-04-12 19:11:14 +00:00
Andriy Gapon	c3249989ef	l2arc: make sure that all writes honor ashift of a cache device Previously uncompressed buffers did not obey that rule. Type of b_asize is changed to uint64_t for consistency, given that this is a zeta-byte filesystem. l2arc_compress_buf is renamed to l2arc_transform_buf to better reflect its new utility. Now not only we ensure that a compressed buffer has a size aligned to ashift, but we also allocate a properly sized temporary buffer if the original buffer is not compressed and it has an odd size. This ensures that all I/O to the cache device is always ashift-aligned, in terms of both a request offset and a request size. If the aligned data is larger than the original data, then we have to use a temporary buffer when reading it as well. Also, enhance physical zio alignment checks using vdev_logical_ashift. On FreeBSD we have this information, so we can make stricter assertions. Reviewed by: smh, mav MFC after: 1 month Sponsored by: ClusterHQ Differential Revision: https://reviews.freebsd.org/D2789	2016-04-12 06:56:35 +00:00
Andriy Gapon	6a50036052	Revert r297396 Modify "4958 zdb trips assert on pools with ashift >= 0xe" A better fix is following.	2016-04-12 06:54:18 +00:00
Alexander Motin	78aec5c610	MFV r297831: 6322 ZFS indirect block predictive prefetch Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Author: Alexander Motin <mav@FreeBSD.org> Improve speculative prefetch of indirect blocks. Scalability of many operations on wide ZFS pool can be limited by requirement to prefetch indirect blocks first. Recently added asynchronous indirect block read partially helped, but did not solve the problem completely. This patch extends existing prefetcher functionality to explicitly work with indirect blocks. Before this change prefetcher issued reads for up to 8MB of data in advance. With this change it also issues indirect block reads for up to 64MB of data in advance, so that when it will be time to actually read those data, it can be done immediately. Alike effect can be achieved by just increasing maximal data prefetch distance, but at higher memory cost. Also this change introduces indirect block prefetch for rewrite operations, that was never done before. Previously ARC miss for Indirect blocks regularly blocked rewrites, converting perfectly aligned asynchronous operations into synchronous read-write pairs, significantly reducing maximal rewrite speed. While being there this issue was also fixed: - prefetch was done always, even if caching for the dataset was completely disabled. Testing on FreeBSD with zvol on top of 6x striped 2x mirrored pool of 12 assorted HDDs shown me such performance numbers: ------- BEFORE -------- Write 491363677 bytes/sec Read 312430631 bytes/sec Rewrite 97680464 bytes/sec -------- AFTER -------- Write 493524146 bytes/sec Read 438598079 bytes/sec Rewrite 277506044 bytes/sec Closes #65 Closes #80 openzfs/openzfs@792fd28ac0	2016-04-11 21:09:15 +00:00
Steven Hartland	2dcee04b3a	Only include sysctl in kernel build Only include sysctl in kernel builds fixing warning about implicit declaration of function 'sysctl_handle_int'. PR: 204140 MFC after: 1 week X-MFC-With: r297813 Sponsored by: Multiplay	2016-04-11 13:17:11 +00:00
Steven Hartland	7bc47b4ea3	Only include sysctl in kernel build Only include sysctl in kernel builds fixing warning about implicit declaration of function 'sysctl_handle_int'. Sponsored by: Multiplay	2016-04-11 08:57:54 +00:00
Andriy Gapon	1da2e1e353	zio: align use of "no dump" flag between use_uma and !use_uma cases At the moment no ZFS buffers are included into a crash dump unless ZFS_DEBUG (or INVARIANTS) kernel option is enabled. That's not very helpful for debugging of ZFS problems, because important information often resides in metadata buffers. This change switches the dumping behavior when UMA is used from the illumos behavior to a more useful behavior that we have on FreeBSD when ZFS buffers are allocated via malloc. Reviewed by: smh, mav MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D5892	2016-04-11 07:11:20 +00:00
Mark Johnston	b529028676	Implement support for boot-time DTrace. This allows one to enable DTrace probes relatively early during boot, during SI_SUB_DTRACE_ANON, before dtrace(1) can invoked. The desired enabling is created using dtrace -A, which writes a /boot/dtrace.dof file and uses nextboot(8) to ensure that DTrace kernel modules are loaded and that the DOF file describing the enabling is loaded by loader(8) during the subsequent boot. The trace output can then be fetched with dtrace -a. With this commit, boot-time DTrace is only functional on i386 and amd64: on other architectures, the high-resolution timer frequency is initialized during SI_SUB_CLOCKS and is thus not available when the anonymous tracing state is initialized. On x86, the TSC is used and is thus available earlier. MFC after: 1 month Relnotes: yes	2016-04-10 01:25:48 +00:00
Mark Johnston	33b454938a	Initialize SDT probes during SI_SUB_DTRACE_PROVIDER. This is consistent with all other DTrace providers and ensures that SDT probes are available for boot-time tracing. MFC after: 2 weeks	2016-04-10 01:24:27 +00:00
Mark Johnston	e1e33ff912	Initialize DTrace hrtimer frequency during SI_SUB_CPU on i386 and amd64. This allows the hrtimer to be used earlier during boot. This is required for boot-time DTrace: anonymous enablings are created during SI_SUB_DTRACE_ANON, which runs before APs are started. In particular, the DTrace deadman timer requires that the hrtimer be functional. MFC after: 2 weeks	2016-04-10 01:23:39 +00:00
Alexander Motin	eaee150e3f	MFV r297760: 6418 zpool should have a label clearing command Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Author: Will Andrews <will@firepipe.net> Closes #83 Closes #32 openzfs/openzfs@9663688425 FreeBSD already had `zpool labelclear` functionality, so this is mostly just a diff reduction. MFC after: 1 month	2016-04-09 20:30:50 +00:00
Andriy Gapon	c8ff459286	zio write issue threads should have lower (numerically greater) priority This is because they might do data compression which is quite CPU expensive. The original code is correct for illumos, because there a higher priority corresponds to a greater number. MFC after: 2 weeks	2016-04-08 11:58:24 +00:00
Alexander Motin	309b1c7ade	Alike to r293708 relax pool check in vdev_geom_open_by_path(). This made impossible spare disk open by known path, which kind of worked only because the same fix was applied to vdev_geom_attach_by_guids() in r293708. MFC after: 1 week	2016-04-07 12:54:44 +00:00
Edward Tomasz Napierala	ae34b6ff96	Add four new RCTL resources - readbps, readiops, writebps and writeiops, for limiting disk (actually filesystem) IO. Note that in some cases these limits are not quite precise. It's ok, as long as it's within some reasonable bounds. Testing - and review of the code, in particular the VFS and VM parts - is very welcome. MFC after: 1 month Relnotes: yes Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D5080	2016-04-07 04:23:25 +00:00
Wojciech Macek	1c7c13aa0e	Implement dtrace_getupcstack in ARM64 Allow using DTRACE for performance analysis of userspace applications - the function call stack can be captured. This is almost an exact copy of AMD64 solution. Obtained from: Semihalf Sponsored by: Cavium Reviewed by: emaste, gnn, jhibbits Differential Revision: https://reviews.freebsd.org/D5779	2016-04-06 05:13:36 +00:00
Andriy Gapon	e881d8757c	remove emulation of VFS_HOLD and VFS_RELE from opensolaris compat On FreeBSD VFS_HOLD/VN_RELE were mapped to MNT_REF/MNT_REL that manipulate mnt_ref. But the job of properly maintaining the reference count is already automatically performed by insmntque(9) and delmntque(9). So, in effect all ZFS vnodes referenced the corresponding mountpoint twice. That was completely harmless, but we want to be very explicit about what FreeBSD VFS APIs are used, because illumos VFS_HOLD and FreeBSD MNT_REF provide quite different guarantees with respect to the held vfs_t / mountpoint. On illumos VFS_HOLD is sufficient to guarantee that vfs_t.vfs_data stays valid. On the other hand, on FreeBSD MNT_REF does not provide the same guarantee about mnt_data. We have to use vfs_busy() to get that guarantee. Thus, the calls to VFS_HOLD/VFS_RELE on vnode init and fini are removed. VFS_HOLD calls are replaced with vfs_busy in the ioctl handlers. And because vfs_busy has a richer interface that can not be dumbed down in all cases it's better to explicitly use it rather than trying to mask it behind VFS_HOLD. This change fixes a panic that could result from a race between zfs_umount() and zfs_ioc_rollback(). We observed a case where zfsvfs_free() tried to destroy data that zfsvfs_teardown() was still using. That happened because there was nothing to prevent unmounting of a ZFS filesystem that was in between zfs_suspend_fs() and zfs_resume_fs(). Reviewed by: kib, smh MFC after: 3 weeks Sponsored by: ClusterHQ Differential Revision: https://reviews.freebsd.org/D2794	2016-04-02 16:25:46 +00:00
Alexander Motin	baf8ceac9e	MFV r297506: 6738 zfs send stream padding needs documentation Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Dan McDonald <danmcd@omniti.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Eli Rosenthal <eli.rosenthal@delphix.com> illumos/illumos-gate@c20404ff77	2016-04-02 08:36:24 +00:00
Alexander Motin	4cf6fde5e8	MFV r297504: 6681 zfs list burning lots of time in dodefault() via dsl_prop_* Reviewed by: Patrick Mooney <patrick.mooney@joyent.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Dan McDonald <danmcd@omniti.com> Approved by: Matthew Ahrens <mahrens@delphix.com> Author: Alex Wilson <alex.wilson@joyent.com> illumos/illumos-gate@d09e4475f6	2016-04-02 08:28:46 +00:00
Gleb Smirnoff	a8b2b39cce	Fix an error in r292373. Use proper count to update "pages in" counter. Noticed by: pfg via Coverity	2016-03-31 21:15:00 +00:00
Alexander Motin	30a0e024ee	Plug open count leak on zvol rename. MFC after: 2 weeks	2016-03-30 16:54:18 +00:00
Alexander Motin	b39dea9308	Switch from using make_dev_p() to make_dev_s() to close races.	2016-03-30 16:48:57 +00:00
Alexander Motin	86b0daa373	Modify "4958 zdb trips assert on pools with ashift >= 0xe". Unlike Illumos FreeBSD has concept of logical ashift, that specifies really minimal vdev block size that can be accessed. This knowledge allows properly pad physical I/O and correctly assert its alignment. This change fixes L2ARC write errors when device has logical sector size above 512 bytes. MFC after: 1 month	2016-03-29 19:18:34 +00:00
Alexander Motin	78b127f2fc	Pass through error code from make_dev_p(). ENAMETOOLONG is much more informative in logs then ENXIO. MFC after: 1 week	2016-03-28 08:12:29 +00:00
Alexander Motin	52c9b0b539	Unify ignoring EEXIST from zvol_create_minor(). This fixes creation of zvol devices for snapshots during zfs receive, that previously failed with "ZFS WARNING: Unable to create ZVOL" message. This solution is not perfect, but IMHO better then it was before. MFC after: 2 weeks	2016-03-24 10:10:41 +00:00
Mark Johnston	48cc2d5e22	Remove unused variables dtrace_in_probe and dtrace_in_probe_addr.	2016-03-17 18:55:54 +00:00
Alexander Motin	fa9de58388	Fix small memory leak on attempt to access deleted snapshot. MFC after: 3 days	2016-03-15 21:21:28 +00:00
Alexander Motin	5db0866658	Make ZFS ignore stripe sizes above SPA_MAXASHIFT (8KB). If device has stripe size bigger then maximal sector size supported by ZFS, there is nothing can be done to avoid read-modify-write cycles. Taking that stripe size into account will only reduce space efficiency and pointlessly bother user with warnings that can not be fixed. Discussed with: smh	2016-03-10 16:39:46 +00:00
Alexander Motin	eef192d85c	Make ZFS more picky to GEOM stripe sizes and offsets. Use of misaligned or non-power-of-2 stripes is not really useful for ZFS, since increased ashift won't help to avoid read-modify-write cycles, and only reduce pool space efficiency and compression rates.	2016-03-10 14:18:14 +00:00
Alexander Motin	a151f3a7ef	MFV r296609: 6370 ZFS send fails to transmit some holes Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Chris Williamson <chris.williamson@delphix.com> Reviewed by: Stefan Ring <stefanrin@gmail.com> Reviewed by: Steven Burgess <sburgess@datto.com> Reviewed by: Arne Jansen <sensille@gmx.net> Approved by: Robert Mustacchi <rm@joyent.com> Author: Paul Dagnelie <pcd@delphix.com> In certain circumstances, "zfs send -i" (incremental send) can produce a stream which will result in incorrect sparse file contents on the target. The problem manifests as regions of the received file that should be sparse (and read a zero-filled) actually contain data from a file that was deleted (and which happened to share this file's object ID). Note: this can happen only with filesystems (not zvols, because they do not free (and thus can not reuse) object IDs). Note: This can happen only if, since the incremental source (FromSnap), a file was deleted and then another file was created, and the new file is sparse (i.e. has areas that were never written to and should be implicitly zero-filled). We suspect that this was introduced by 4370 (applies only if hole_birth feature is enabled), and made worse by 5243 (applies if hole_birth feature is disabled, and we never send any holes). The bug is caused by the hole birth feature. When an object is deleted and replaced, all the holes in the object have birth time zero. However, zfs send cannot tell that the holes are new since the file was replaced, so it doesn't send them in an incremental. As a result, you can end up with invalid data when you receive incremental send streams. As a short-term fix, we can always send holes with birth time 0 (unless it's a zvol or a dataset where we can guarantee that no objects have been reused). Closes #37 openzfs/openzfs@adef853162	2016-03-10 09:01:19 +00:00
Alexander Motin	7370229e8d	Add new IOCTL compat shims for ABI breakage caused by r296510: MFV r296505: 6531 Provide mechanism to artificially limit disk performance	2016-03-09 11:16:15 +00:00
Alexander Motin	8d0e2eb06b	MFV r296529: 6672 arc_reclaim_thread() should use gethrtime() instead of ddi_get_lbolt() 6673 want a macro to convert seconds to nanoseconds and vice-versa Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net> Reviewed by: Robert Mustacchi <rm@joyent.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Eli Rosenthal <eli.rosenthal@delphix.com> illumos/illumos-gate@a8f6344fa0	2016-03-08 18:28:24 +00:00
Alexander Motin	468bca03ef	MFV r296527: 6659 nvlist_free(NULL) is a no-op Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: Marcel Telka <marcel@telka.sk> Approved by: Robert Mustacchi <rm@joyent.com> Author: Josef 'Jeff' Sipek <jeffpc@josefsipek.net> illumos/illumos-gate@aab83bb83b	2016-03-08 18:11:38 +00:00
Alexander Motin	26802705d3	MFV r296522: 6541 Pool feature-flag check defeated if "verify" is included in the dedup property value Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Richard Laager <rlaager@wiktel.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: ilovezfs <ilovezfs@icloud.com> illumos/illumos-gate@971640e6aa	2016-03-08 17:58:02 +00:00
Alexander Motin	178f2c2b8e	MFV r296520: 6562 Refquota on receive doesn't account for overage Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com> Reviewed by: Toomas Soome <tsoome@me.com> Approved by: Gordon Ross <gwr@nexenta.com> Author: Dan McDonald <danmcd@omniti.com> illumos/illumos-gate@5f7a8e6d75	2016-03-08 17:53:42 +00:00
Alexander Motin	7a90077752	MFV r296518: 5027 zfs large block support (add copyright) Author: Matthew Ahrens <matt@mahrens.org> illumos/illumos-gate@c3d26abc9e	2016-03-08 17:51:09 +00:00
Alexander Motin	c892984b84	MFV r296515: 6536 zfs send: want a way to disable setting of DRR_FLAG_FREERECORDS Reviewed by: Anil Vijarnia <avijarnia@racktopsystems.com> Reviewed by: Kim Shrier <kshrier@racktopsystems.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Andrew Stormont <astormont@racktopsystems.com> illumos/illumos-gate@880094b606	2016-03-08 17:43:21 +00:00
Alexander Motin	253159febf	MFV r296513: 6450 scrub/resilver unnecessarily traverses snapshots created after the scrub started Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Richard Elling <Richard.Elling@RichardElling.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Matthew Ahrens <mahrens@delphix.com> illumos/illumos-gate@38d6103674	2016-03-08 17:34:58 +00:00
Alexander Motin	4427252c14	MFV r296511: 6537 Panic on zpool scrub with DEBUG kernel Reviewed by: Steve Gonczi <gonczi@comcast.net> Reviewed by: Dan McDonald <danmcd@omniti.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Matthew Ahrens <mahrens@delphix.com> Author: Gary Mills <gary_mills@fastmail.fm> illumos/illumos-gate@8c04a1fa3f	2016-03-08 17:32:24 +00:00
Alexander Motin	1b63fd68f4	MFV r296505: 6531 Provide mechanism to artificially limit disk performance Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Prakash Surya <prakash.surya@delphix.com> illumos/illumos-gate@97e8130957	2016-03-08 17:27:13 +00:00
Mark Johnston	9610c89750	Fix a couple of silly mistakes in r291962. - Handle the case where no DOF helper is provided. This occurs with the currently-unused DTRACEHIOC_ADD ioctl. - Fix some checks that prevented the loading DOF in the (non-default) lazyload mode.	2016-03-08 00:46:03 +00:00
Mark Johnston	380344a7af	Fix fasttrap tracepoint locking. Upstream, tracepoints are protected by per-CPU mutexes. An unlinked tracepoint may be freed once all the tracepoint mutexes have been acquired and released - this is done in fasttrap_mod_barrier(). This mechanism was not properly ported: in some places, the proc lock is used in place of a tracepoint lock, and in others the locking is omitted entirely. This change implements tracepoint locking with an rmlock, where the read lock is used in fasttrap probe context. As a side effect, this fixes a recursion on the proc lock when the raise action is used from a userland probe. MFC after: 1 month	2016-03-08 00:43:03 +00:00
Mark Johnston	6b1bddce00	Remove the fasttrap implementation for sparc. Other machine-dependent code required for DTrace on sparc is not present in the tree, so there's no point to keeping the fasttrap code.	2016-03-08 00:18:46 +00:00
Mark Johnston	acaa855f6e	MFV r296306: 6604 harden DIF bounds checking Reviewed by: Alex Wilson <alex.wilson@joyent.com> Reviewed by: Patrick Mooney <patrick.mooney@joyent.com> Reviewed by: Dan McDonald <danmcd@omniti.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Bryan Cantrill <bryan@joyent.com> illumos/illumos-gate@1c0cef67db MFC after: 2 weeks	2016-03-08 00:14:14 +00:00
Steven Hartland	e283644b87	Removed unused label and fix mutex_exit order Remove unused done label from zfs_setacl fixing PVS-Studio V729. Fix mutex_exit order to mirror the mutex_enter order. MFC after: 1 week Sponsored by: Multiplay	2016-02-25 03:01:24 +00:00

1 2 3 4 5 ...

1470 Commits