freebsd-skq

Author	SHA1	Message	Date
avg	1e5ca04d7b	relax an assert in zfsctl_snapdir_lookup to match r323578 Since r323578 we may remove the last reference to a covered vnode with vrele() instead of vput(). So, v_usecount may be decremented before the vnode is locked and zfsctl_snapdir_lookup may "catch" the vnode with v_usecount of zero and v_holdcnt of one. PR: 225795 Reported by: asomers MFC after: 1 week	2018-02-19 08:55:22 +00:00
asomers	d09a1b3fc5	zfs: fix formatting in a log statement Submitted by: Dave Baukus <daveb@spectralogic.com> MFC after: 3 weeks Sponsored by: Spectra Logic Corp	2018-02-16 21:59:08 +00:00
asomers	4cf74ec317	Handle generic pathconf attributes in the .zfs ctldir MFC instructions: change the value of _PC_LINK_MAX to INT_MAX Reported by: jhb MFC after: 19 days X-MFC-With: 329265 Sponsored by: Spectra Logic Corp	2018-02-16 16:56:09 +00:00
avg	fdda25ea79	read-behind / read-ahead support for zfs_getpages() ZFS caches blocks it reads in its ARC, so in general the optional pages are not as useful as with filesystems that read the data directly into the target pages. But still the optional pages are useful to reduce the number of page faults and associated VM / VFS / ZFS calls. Another case that gets optimized (as a side effect) is paging in from a hole. ZFS DMU does not currently provide a convenient API to check for a hole. Instead it creates a temporary zero-filled block and allows accessing it as if it were a normal data block. Getting multiple pages one by one from a hole results in repeated creation and destruction of the temporary block (and an associated ARC header). Tested with fsx using various supported block sizes from 512 bytes to 128 KB and additionally 1 MB. Please note that in illumos and ZoL they do not do the range-locking in the page-in path. This is because ZFS has a double-caching problem between ARC and page cache and that requires zfs_read() and zfs_write() to consult pages in the page cache. So, in those functions they first lock a range and then lock pages corresponding to the range. While in the page-in (and maybe page-out) path they first lock the pages and then would lock the range. So, they would have a deadlock. I believe that FreeBSD does not have that problem, because the page-in deals only with invalid pages while zfs_read() and zfs_write() need to access only valid pages. They do not wait on a busy page unless it's already valid. Reviewed by: kib MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D14263	2018-02-16 06:59:35 +00:00
avg	f1d57f3528	MFV r329313: 8857 zio_remove_child() panic due to already destroyed parent zio illumos/illumos-gate@d6e1c446d7 https://www.illumos.org/issues/8857 I had an OS panic on one of our servers: ffffff01809128c0 vpanic() ffffff01809128e0 mutex_panic+0x58(fffffffffb94c904, ffffff597dde7f80) ffffff0180912950 mutex_vector_enter+0x347(ffffff597dde7f80) ffffff01809129b0 zio_remove_child+0x50(ffffff597dde7c58, ffffff32bd901ac0, ffffff3373370908) ffffff0180912a40 zio_done+0x390(ffffff32bd901ac0) ffffff0180912a70 zio_execute+0x78(ffffff32bd901ac0) ffffff0180912b30 taskq_thread+0x2d0(ffffff33bae44140) ffffff0180912b40 thread_start+8() It panicked here: http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/zfs/zio.c#430 pio->io_lock is DEAD, thus a panic. Further analysis shows the "pio" (parent zio of "cio") has already been destroyed. Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Andriy Gapon <avg@FreeBSD.org> Reviewed by: Youzhong Yang <youzhong@gmail.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: George Wilson <george.wilson@delphix.com> PR: 223803 Tested by: shiva.bhanujan@quorum.com MFC after: 2 weeks	2018-02-15 14:46:29 +00:00
asomers	19509cb430	Implement .vop_pathconf and .vop_getacl for the .zfs ctldir zfsctl_common_pathconf will report all the same variables that regular ZFS volumes report. zfsctl_common_getacl will report an ACL equivalent to 555, except that you can't read xattrs or edit attributes. Fixes a bug where "ls .zfs" will occasionally print something like: ls: .zfs/.: Operation not supported PR: 225793 Reviewed by: avg MFC after: 3 weeks Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D14365	2018-02-14 15:49:31 +00:00
mav	a7e83ccc30	Add sysctls for dnode block and indirect block shifts. MFC after: 2 weeks	2018-02-09 23:29:50 +00:00
avg	db8483ceb9	remove a duplicate assignment There should be no functional change. MFC after: 1 week	2018-02-08 13:22:40 +00:00
jeff	e67ec0d694	Use per-domain locks for vm page queue free. Move paging control from global to per-domain state. Protect reservations with the free lock from the domain that they belong to. Refactor to make vm domains more of a first class object. Reviewed by: markj, kib, gallatin Tested by: pho Sponsored by: Netflix, Dell/EMC Isilon Differential Revision: https://reviews.freebsd.org/D14000	2018-02-06 22:10:07 +00:00
avg	0e29bd7dee	zfs: move a utility function, ioflags, closer to its consumers No functional change. MFC after: 1 week	2018-02-05 14:19:36 +00:00
avg	af23a9cd14	ZFS ARC: restore illumos uses of 'needfree' that were removed in r325851 This is purely a cosmetic change to have a more complete copy of ifdef-ed out illumos code. MFC after: 1 week	2018-02-02 12:57:33 +00:00
avg	22ad2342b1	zfs_rezget: drop cached pages before doing anything else We did that in the case of success to prevent the use of stale cached data, but it makes even less sense to keep the cached data when we fail. Ideally, we should call vgone() on the vnode in the case of zfs_rezget failure, but the current lock order prevents us from doing that. The change also rearranges the order of unlinked check and the size change check. While there, add missing SET_ERROR in one of the error paths. MFC after: 2 weeks	2018-01-31 14:44:51 +00:00
mav	47ae44b999	MFV r328253: 8835 Speculative prefetch in ZFS not working for misaligned reads illumos/illumos-gate@5cb8d943bc https://www.illumos.org/issues/8835: Sequential reads not aligned to block size are not detected by ZFS prefetcher as sequential, killing prefetch and severely hurting performance. It is caused by dmu_zfetch() in case of misaligned sequential accesses being called with overlap of one block. Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Allan Jude <allanjude@freebsd.org> Approved by: Gordon Ross <gwr@nexenta.com> Author: Alexander Motin <mav@FreeBSD.org>	2018-01-22 05:57:14 +00:00
mav	2dd60f22d7	MFV r328251: 8652 Tautological comparisons with ZPROP_INVAL illumos/illumos-gate@4ae5f5f06c https://www.illumos.org/issues/8652: Clang and GCC prefer to use unsigned ints to store enums. With Clang, that causes tautological comparison warnings when comparing a zfs_prop_t or zpool_prop_t variable to the macro ZPROP_INVAL. It's likely that error handling code is being silently removed as a result. Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Igor Kozhukhov <igor@dilos.org> Approved by: Gordon Ross <gwr@nexenta.com> Author: Alan Somers <asomers@gmail.com>	2018-01-22 05:52:39 +00:00
mav	27fedeb8ad	MFV r328247: 8959 Add notifications when a scrub is paused or resumed illumos/illumos-gate@301fd1d6f2 Reviewed by: Alek Pinchuk <pinchuk.alek@gmail.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Gordon Ross <gwr@nexenta.com> Author: Sean Eric Fagan <sef@ixsystems.com>	2018-01-22 04:31:48 +00:00
mav	84b8a477fb	MFV r328245: 8856 arc_cksum_is_equal() doesn't take into account ABD-logic illumos/illumos-gate@01a059ee0c https://www.illumos.org/issues/8856: arc_cksum_is_equal() calls zio_push_transform() that requires abd_t* (second arg), but a void* is passed. Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Approved by: Gordon Ross <gwr@nexenta.com> Author: Roman Strashkin <roman.strashkin@nexenta.com>	2018-01-22 04:23:48 +00:00
mav	428df4ba9a	MFV r328229: 8930 zfs_zinactive: do not remove the node if the filesystem is readonly illumos/illumos-gate@93c618e0f4 https://www.illumos.org/issues/8930: We normally remove an unlinked node when its last user goes away and the node becomes inactive. However, we should not do that if the filesystem is mounted read-only including the case where it has its readonly property set. The node will remain on the unlinked queue, so it will not be leaked. One particular scenario is when we receive an incremental stream into a mounted read-only filesystem and that stream contains an unlinked file (still on the unlinked queue). If that file is opened before the receive and some time later after the receive it becomes inactive we would remove it and, thus, modify the read-only filesystem. As a result, the filesystem would diverge from its source and further incremental receives would not be possible (without forcing a rollback). Another related scenario, that may or may not be possible depending on an OS / VFS policy, is when an open file is unlinked, then the filesystem is remounted read-only, and then the file is closed. Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Gordon Ross <gwr@nexenta.com> Author: Andriy Gapon <avg@FreeBSD.org>	2018-01-21 23:49:17 +00:00
mav	c8d77253f9	MFV r328227: 8909 8585 can cause a use-after-free kernel panic illumos/illumos-gate@94ddd0900a https://www.illumos.org/issues/8909: There's a race condition that exists if `zil_free_lwb` races with either `zil_commit_waiter_timeout` and/or `zil_lwb_flush_vdevs_done`. Here's an example panic due to this bug: > ::status debugging crash dump vmcore.0 (64-bit) from ip-10-110-205-40 operating system: 5.11 dlpx-5.2.2.0_2017-12-04-17-28-32b6ba51fb (i86pc) image uuid: 4af0edfb-e58e-6ed8-cafc-d3e9167c7513 panic message: BAD TRAP: type=e (#pf Page fault) rp=ffffff0010555970 addr=60 occurred in module "zfs" due to a NULL pointer dereference dump content: kernel pages only > $c zio_shrink+0x12() zil_lwb_write_issue+0x30d(ffffff03dcd15cc0, ffffff03e0730e20) zil_commit_waiter_timeout+0xa2(ffffff03dcd15cc0, ffffff03d97ffcf8) zil_commit_waiter+0xf3(ffffff03dcd15cc0, ffffff03d97ffcf8) zil_commit+0x80(ffffff03dcd15cc0, 9a9) zfs_write+0xc34(ffffff03dc38b140, ffffff0010555e60, 40, ffffff03e00fb758, 0) fop_write+0x5b(ffffff03dc38b140, ffffff0010555e60, 40, ffffff03e00fb758, 0) write+0x250(42, fffffd7ff4832000, 2000) sys_syscall+0x177() If there's an outstanding lwb that's in `zil_commit_waiter_timeout` waiting to timeout, waiting on its waiter's CV, we must be sure not to call `zil_free_lwb`. If we end up calling `zil_free_lwb`, then that LWB may be freed and can result in a use-after-free situation where the stale lwb pointer stored in the `zil_commit_waiter_t` structure of the thread waiting on the waiter's CV is used. A similar situation can occur if an lwb is issued to disk, and thus in the `LWB_STATE_ISSUED` state, and `zil_free_lwb` is called while the disk is servicing that lwb. In this situation, the lwb will be freed by `zil_free_lwb`, which will result in a use-after-free situation when the lwb's zio completes, and `zil_lwb_flush_vdevs_done` is called.
This race condition is prevented in `zil_close` by calling `zil_commit` before `zil_free_lwb` is called, which will ensure all outstanding (i.e. all lwb's in the `LWB_STATE_OPEN` and/or `LWB_STATE_ISSUED` states) reach the `LWB_STATE_DONE` state before the lwb's are freed (`zil_commit` will not return until all the lwb's are `LWB_STATE_DONE`). Further, this race condition is prevented in `zil_sync` by only calling `zil_free_lwb` for lwb's that do not have their `lwb_buf` pointer set. All lwb's not in the `LWB_STATE_DONE` state will have a non-null value for this pointer; the pointer is only cleared in `zil_lwb_flush_vdevs_done`, at which point the lwb's state will be changed to `LWB_STATE_DONE`. This race is present in `zil_suspend`, leading to this bug. At first glance, it would appear as though this would not be true because `zil_suspend` will call `zil_commit`, just like `zil_close`, but the problem is that `zil_suspend` will set the zilog's `zl_suspend` field prior to calling `zil_commit`. Further, in `zil_commit`, if `zl_suspend` is set, `zil_commit` will take a special branch of logic and use `txg_wait_synced` instead of performing the normal `zil_commit` logic. This call to `txg_wait_synced` might be good enough for the data to reach disk safely before it returns, but it does not ensure that all outstanding lwb's reach the `LWB_STATE_DONE` state before it returns. This is because, if there's an lwb "stuck" in `zil_commit_waiter_timeout`, waiting for its lwb to timeout, it will maintain a non-null value for its `lwb_buf` field and thus `zil_sync` will not free that lwb. Thus, even though the lwb's data is already on disk, the lwb will be left lingering, waiting on the CV, and will eventually timeout and be issued to disk even though the write is unnecessary. So, after `zil_commit` is called from `zil_suspend`, we incorrectly assume that there are not outstanding lwb's, and proceed to free all lwb's found on the zilog's lwb list.
As a result, we free the lwb that will later be used by `zil_commit_waiter_timeout`. Reviewed by: John Kennedy <jwk404@gmail.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Brad Lewis <brad.lewis@delphix.com> Reviewed by: Igor Kozhukhov <igor@dilos.org> Approved by: Robert Mustacchi <rm@joyent.com> Author: Prakash Surya <prakash.surya@delphix.com>	2018-01-21 23:18:42 +00:00
mav	46f172e5a8	MFV r328225: 8603 rename zilog's "zl_writer_lock" to "zl_issuer_lock" illumos/illumos-gate@cf07d3da99 https://www.illumos.org/issues/8603: To help make the ZIL's code more understandable, it was suggested that the zilog_t's "zl_writer_lock" field should be renamed to "zl_issuer_lock". Reviewed by: C Fraire <cfraire@me.com> Approved by: Dan McDonald <danmcd@joyent.com> Author: Prakash Surya <prakash.surya@delphix.com>	2018-01-21 23:11:20 +00:00
mav	2700f9ece1	MFV r328220: 8677 Open-Context Channel Programs illumos/illumos-gate@a3b2868063 https://www.illumos.org/issues/8677 We want to be able to run channel programs outside of synching context. This would greatly improve performance of channel programs that just gather information, as we won't have to wait for synching context anymore. This feature should introduce the following: - A new command line flag in "zfs program" to specify our intention to run in open context. - A new flag/option within the channel program ioctl which selects the context. - Appropriate error handling whenever we try a channel program in open-context that contains zfs.sync* expressions. - Documentation for the new feature in the manual pages. Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: Chris Williamson <chris.williamson@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Serapheim Dimitropoulos <serapheim@delphix.com>	2018-01-21 23:02:05 +00:00
avg	ed9760cef2	zfs: no need to check that size of zfs_cmd_t is not greater than IOCPARM_MAX Nowadays we do not pass zfs_cmd_t directly through the ioctl interface. Instead a small zfs_iocparm_t object is passed and the command is explicitly copied in and out. So, the check has become irrelevant. MFC after: 3 weeks Sponsored by: Panzura	2018-01-21 11:19:18 +00:00
markj	f1eb0fc41a	Use the thread's ucred struct when fetching jid or jailname. Reported by: mjg X-MFC with: r327888	2018-01-14 17:55:40 +00:00
markj	1bfc3a6a76	Add "jid" and "jailname" variables to DTrace. These return the jail ID and jail name for the traced process, respectively, and are analogous to "zonename" on Solaris/illumos. "zonename" is now aliased to "jailname". Also add some stress tests for the new variables. Submitted by: Domagoj Stolfa <domagoj.stolfa@gmail.com> Reviewed by: dteske (previous version) MFC after: 2 weeks Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D13877	2018-01-12 19:59:46 +00:00
avg	ff3a1d7da2	zfs_mount: restore a bit of ifdef-out illumos code And correctly mark the end of the replacement FreeBSD code. MFC after: 1 week	2018-01-09 13:43:04 +00:00
jeff	c17fd15c00	Fix arc after r326347 broke various memory limit queries. Use UMA features rather than kmem arena size to determine available memory. Initialize the UMA limit to LONG_MAX to avoid spurious wakeups on boot before the real limit is set. PR: 224330 (partial), 224080 Reviewed by: markj, avg Sponsored by: Netflix / Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D13494	2018-01-02 04:35:56 +00:00
dim	e62792a7eb	Remove obsolete register keyword from opensolaris's sysmacros.h. When compiling zfsd with recent clang, it leads to a warning about the register storage class being incompatible with C++17. MFC after: 3 days	2017-12-24 19:17:15 +00:00
jhb	1b323bc2c5	Don't return early for non-failure for one of the EMLINK checks. r326987 enabled two #if 0'd-out EMLINK checks in zfs_link_create() for link overflow. However, one of the checks (when the vnode adding a link is a directory such as for mkdir) always returned even if the link did not overflow. Change this to only return early if it needs to report an EMLINK error. Reported by: db, shurd Sponsored by: Chelsio Communications	2017-12-19 23:54:44 +00:00
jhb	e09154bf75	Rework pathconf handling for FIFOs. On the one hand, FIFOs should respect other variables not supported by the fifofs vnode operation (such as _PC_NAME_MAX, _PC_LINK_MAX, etc.). These values are fs-specific and must come from a fs-specific method. On the other hand, filesystems that support FIFOs are required to support _PC_PIPE_BUF on directory vnodes that can contain FIFOs. Given this latter requirement, once the fs-specific VOP_PATHCONF method supports _PC_PIPE_BUF for directories, it is also suitable for FIFOs permitting a single VOP_PATHCONF method to be used for both FIFOs and non-FIFOs. To that end, retire all of the FIFO-specific pathconf methods from filesystems and change FIFO-specific vnode operation switches to use the existing fs-specific VOP_PATHCONF method. For fifofs, set its VOP_PATHCONF to VOP_PANIC since it should no longer be used. While here, move _PC_PIPE_BUF handling out of vop_stdpathconf() so that only filesystems supporting FIFOs will report a value. In addition, only report a valid _PC_PIPE_BUF for directories and FIFOs. Discussed with: bde Reviewed by: kib (part of a larger patch) MFC after: 1 month Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D12572	2017-12-19 22:39:05 +00:00
jhb	3efec8ad25	Move NAME_MAX, LINK_MAX, and CHOWN_RESTRICTED out of vop_stdpathconf(). Having all filesystems fall through to default values isn't always correct and these values can vary for different filesystem implementations. Most of these changes just use the existing default values with a few exceptions: - Don't report CHOWN_RESTRICTED for ZFS since it doesn't do the exact permissions check this claims for chown(). - Use NANDFS_NAME_LEN for NAME_MAX for nandfs. - Don't report a LINK_MAX of 0 on smbfs. Now fail with EINVAL to indicate hard links aren't supported. Requested by: bde (though perhaps not this exact implementation) Reviewed by: kib (earlier version) MFC after: 1 month Sponsored by: Chelsio Communications	2017-12-19 19:51:36 +00:00
jhb	4be4c74c89	Adjust ZFS' link count handling for ino64. - Define a ZFS_LINK_MAX as the ZFS version of LINK_MAX which is set to UINT64_MAX to match the on-disk format. - Enable the currently #if 0'd code to check for link overflows and return EMLINK. - Don't clamp the link count reported in stat() to LINK_MAX as that is still the 16-bit limit, but report the full link counts. Also, avoid possibly overflowing the reported link count to 0 when adjusting the link count to account for ".snapshot". - Update the LINK_MAX reported by pathconf() to report ZFS_LINK_MAX rather than LINK_MAX (but clamped to LONG_MAX for 32-bit systems). Reviewed by: avg (earlier version) Sponsored by: Chelsio Communications	2017-12-19 19:07:24 +00:00
markj	c4bc9a29b5	Avoid CPU migration in dtrace_gethrtime() on x86. dtrace_gethrtime() may be called outside of probe context, and in particular, from the DTRACEIOC_BUFSNAP handler. Disable interrupts rather than using sched_pin() to help ensure that we don't call any external functions when in probe context. PR: 218452 MFC after: 1 week	2017-12-18 17:26:24 +00:00
markj	96bef4e3d4	Unregister the ARC lowmem event handler earlier in arc_fini(). Otherwise a poorly timed lowmem event may attempt to acquire a destroyed lock. Unregister the handler before destroying the ARC reclaim thread. Reported by: gjb MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D13480	2017-12-17 18:21:40 +00:00
markj	ec23987918	MFV r326785: 8880 improve DTrace error checking illumos/illumos-gate@2cf374268f https://www.illumos.org/issues/8880 Reviewed by: Tim Kordas <tim.kordas@joyent.com> Reviewed by: Bryan Cantrill <bryan@joyent.com> Reviewed by: Richard Lowe <richlowe@richlowe.net> Approved by: Dan McDonald <danmcd@joyent.com> Author: Jerry Jelinek <jerry.jelinek@joyent.com> MFC after: 1 week	2017-12-12 22:08:34 +00:00
markj	46cef17a3e	Correct initialization of pc on powerpc. PR: 224293 Submitted by: Breno Leitao <breno.leitao@gmail.com> X-MFC with: r326774 Pointy hat: markj	2017-12-12 20:41:11 +00:00
markj	b0b9b4fcf4	Pass the trap frame to fasttrap hooks. The DTrace fasttrap entry points expect a struct reg containing the register values of the calling thread. Perform the conversion in fasttrap rather than in the trap handler: this reduces the number of ifdefs and avoids wasting stack space for traps that don't involve DTrace. MFC after: 2 weeks	2017-12-11 19:21:39 +00:00
imp	fb81bab70d	Mark two things as unused (since they are only sometimes used) and toss in a DECONST to remove a const in some tricky code that would require too extensive a change to unwind otherwise. Sponsored by: Netflix	2017-12-03 04:55:33 +00:00
imp	5e8ff9a4f1	Fix all warnings related to geli and ZFS support on x86. Default WARNS to 0 still, since there's still some warnings on other architectures. Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D13301	2017-12-02 00:07:37 +00:00
asomers	cdb41e3b44	Fix assertion when ZFS fails to open certain devices "panic: vdev_geom_close_locked: cp->private is NULL" This panic will result if ZFS fails to open a device due to either of the following reasons: 1) The device's sector size is greater than 8KB. 2) ZFS wants to open the device RW, but it can't be opened for writing. The solution is to change the initialization order to ensure that the assertion will be satisfied. PR: 221066 Reported by: David NewHamlet <wheelcomplex@gmail.com> Reviewed by: avg MFC after: 3 weeks Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D13278	2017-11-30 15:36:06 +00:00
asomers	37322ff109	Revert r326399 Accidentally committed wrong file Pointy hat to: asomers Sponsored by: Spectra Logic Corp	2017-11-30 15:34:55 +00:00
asomers	53f83d21f7	Fix assertion when ZFS fails to open certain devices "panic: vdev_geom_close_locked: cp->private is NULL" This panic will result if ZFS fails to open a device due to either of the following reasons: 1) The device's sector size is greater than 8KB. 2) ZFS wants to open the device RW, but it can't be opened for writing. The solution is to change the initialization order to ensure that the assertion will be satisfied. PR: 221066 Reported by: David NewHamlet <wheelcomplex@gmail.com> Reviewed by: avg MFC after: 3 weeks Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D13278	2017-11-30 15:28:29 +00:00
markj	507cb204bf	Don't use pcpu_find() to determine if a CPU ID is valid. This addresses assertion failures after r326218. MFC after: 1 week	2017-11-27 18:42:23 +00:00
markj	b7a8474133	Duplicate helpers after disabling inherited tracepoints during a fork. We may create probes in the nascent child process, so we need to ensure that any inherited tracepoints are removed first. Otherwise the probe sites will not be in the state expected by fasttrap, and it won't be able to enable the probes. MFC after: 2 weeks	2017-11-23 14:29:07 +00:00
jhibbits	ba5835b241	PowerPC has 12 artificial frames for the profiler It may need to be different between AIM and Book-E, this was tested only on Book-E (64- and 32-bit) MFC after: 3 weeks	2017-11-22 01:53:59 +00:00
avg	5c5e6af72c	zfs_write: fix problem with writes appearing to succeed when over quota The problem happens when the writes have offsets and sizes aligned with a filesystem's recordsize (maximum block size). In this scenario dmu_tx_assign() would fail because of being over the quota, but the uio would already be modified in the code path where we copy data from the uio into a borrowed ARC buffer. That makes an appearance of a partial write, so zfs_write() would return success and the uio would be modified consistently with writing a single block. That bug can result in a data loss because the writes over the quota would appear to succeed while the actual data is being discarded. This commit fixes the bug by ensuring that the uio is not changed until after all error checks are done. To achieve that the code now uses uiocopy() + uioskip() as in the original illumos design. We can do that now that uiocopy() has been updated in r326067 to use vn_io_fault_uiomove(). Reported by: mav Analyzed by: mav Reviewed by: mav Pointyhat to: avg (myself) MFC after: 1 week X-MFC after: r326067 X-Erratum: wanted	2017-11-21 18:28:14 +00:00
avg	0e4af54239	make illumos uiocopy use vn_io_fault_uiomove uiocopy() is currently unused; its purpose is to copy data from a uio without modifying the uio. It was in use before the vn_io_fault support was added to ZFS, at which point our code diverged from the illumos code a little bit. Because ZFS is the only (potential) user of the function we are free to modify it to better suit ZFS needs. The intention behind this change is to remove the differences introduced earlier in zfs_write(). While here, re-implement uioskip() using uiomove() with uio_segflg == UIO_NOCOPY. The story of uioskip is the same as with uiocopy. Reviewed by: mav MFC after: 1 week	2017-11-21 18:01:43 +00:00
markj	bd8385a990	Avoid holding the process in uread() and uwrite(). In general, higher-level code will atomically verify that the process is not exiting and hold the process. In one case, we were using uwrite() to copy a probed instruction to a per-thread scratch space block, but copyout() can be used for this purpose instead; this change effectively reverts r227291. MFC after: 1 week	2017-11-16 07:25:12 +00:00
bapt	0419f346b4	remove the poor emulation of the IllumOS needfree global variable to prevent the ARC reclaim thread running longer than needed. Update the arc::needfree dtrace probe triggered in arc_lowmem() to also report the value we may want to free. Submitted by: Nikita Kozlov <nikita.kozlov at blade-group.com> Reviewed by: avg Approved by: avg MFC after: 3 weeks Sponsored by: blade Differential Revision: https://reviews.freebsd.org/D12163	2017-11-15 12:48:36 +00:00
avg	48e6b8589f	MFV r325609: 7531 Assign correct flags to prefetched buffers illumos/illumos-gate@2729521654 https://www.illumos.org/issues/7531 I found that some buffers that could be L2ARC eligible are not flagged as such, leading to some performance impact. As a test I ran the same IO workload 10 times in a row. It is a metadata only workload (files listing). l2arc_noprefetch=0. Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Approved by: Dan McDonald <danmcd@joyent.com> Author: benrubson <ben.rubson@gmail.com> MFC after: 8 days	2017-11-09 18:22:42 +00:00
avg	7e0b4f7fa5	MFV r325607: 8607 zfs: variable set but not used illumos/illumos-gate@b852c2f543 https://www.illumos.org/issues/8607 Reviewed by: Yuri Pankov <yuripv@gmx.com> Reviewed by: Igor Kozhukhov <igor@dilos.org> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Toomas Soome <tsoome@me.com> MFC after: 1 week	2017-11-09 18:14:42 +00:00
avg	eb86daaed9	MFV r325605: 8713 Buffer overflow in dsl_dataset_name() illumos/illumos-gate@f37ae9a714 https://www.illumos.org/issues/8713 If we're creating a pool with version >= SPA_VERSION_DSL_SCRUB (v11) we need to account for additional space needed by the origin dataset which will also be snapshotted: "poolname"+"/"+"$ORIGIN"+"@"+"$ORIGIN". Enforce this limit in pool_namecheck(). Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Dan McDonald <danmcd@joyent.com> Author: loli10K <ezomori.nozomu@gmail.com> MFC after: 1 week	2017-11-09 18:12:21 +00:00
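
Note on eb86daaed9 (8713): the length limit described in that message can be worked through with a small sketch. This is not the committed code. The buffer size of 256 bytes, the literal "$ORIGIN", and every name prefixed with sketch_ or SKETCH_ are assumptions made for illustration; the real definitions live in the ZFS headers and may differ. The sketch only shows the arithmetic: a pool name must leave room for "/", "@", two copies of the origin name, and the terminating NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Assumed values for illustration only; the real constants are defined
 * in the ZFS headers and may differ.
 */
#define SKETCH_MAX_DATASET_NAME_LEN 256	/* buffer size, including the NUL */
#define SKETCH_ORIGIN_NAME "$ORIGIN"

/*
 * Return 1 if a pool name of pool_len characters leaves room for
 * "/" + "$ORIGIN" + "@" + "$ORIGIN" plus the terminating NUL, else 0.
 */
static int
sketch_pool_len_ok(size_t pool_len)
{
	/* One '/' and one '@', plus two copies of the origin name. */
	size_t overhead = 2 + 2 * strlen(SKETCH_ORIGIN_NAME);

	return (pool_len < SKETCH_MAX_DATASET_NAME_LEN - overhead);
}

/* Convenience wrapper that measures a NUL-terminated pool name. */
static int
sketch_pool_name_ok(const char *pool)
{
	assert(pool != NULL);
	return (sketch_pool_len_ok(strlen(pool)));
}
```

With these assumed values the overhead is 16 characters, so the longest accepted pool name is 239 characters: 239 + 16 = 255 characters plus the NUL fills the 256-byte buffer exactly.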

1881 Commits