freebsd-skq

Author	SHA1	Message	Date
Mariusz Zaborski	306a82f8f4	Rename zfs nvpair files to not collide with our nvlist. PR: 201356 Approved by: pjd (mentor)	2015-07-09 21:53:40 +00:00
Mateusz Guzik	f131759f54	fd: make 'rights' a mandatory argument to fget* functions	2015-07-05 19:05:16 +00:00
Konstantin Belousov	6fdfd88220	Use single instance of the identical INKERNEL() and PMC_IN_KERNEL() macros on amd64 and i386. Move the definition to machine/param.h. kgdb defines INKERNEL() too, the conflict is resolved by renaming kgdb version to PINKERNEL(). On i386, correct the lowest kernel address. After the shared page was introduced, USRSTACK no longer points to the last user address + 1 [] Submitted by: Oliver Pinter [] Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-07-02 14:37:21 +00:00
Andriy Gapon	74f75cb1bd	zfs_mount(MS_REMOUNT): protect zfs_(un)register_callbacks calls We now take z_teardown_lock as a writer to ensure that there is no I/O while the filesystem state is in a flux. Also, zfs_suspend_fs() -> zfsvfs_teardown() call zfs_unregister_callbacks() and zfs_resume_fs() -> zfsvfs_setup() call zfs_register_callbacks(). Previously there was no synchronization between those calls and the calls in the re-mounting case. That could lead to concurrent execution and a crash. PR: 180060 Differential Revision: https://reviews.freebsd.org/D2865 Suggested by: mahrens Reviewed by: delphij, pho, mahrens, will MFC after: 13 days Sponsored by: ClusterHQ	2015-07-02 08:32:02 +00:00
Ruslan Bukin	b78ee15e9f	First cut of DTrace for AArch64. Reviewed by: andrew, emaste Sponsored by: ARM Limited Differential Revision: https://reviews.freebsd.org/D2738	2015-07-01 15:51:11 +00:00
Ruslan Bukin	0ff41755cd	Add a central location for exclusion checks. We check here if function is excluded from FBT instrumentation. Reviewed by: andrew, emaste, markj Differential Revision: https://reviews.freebsd.org/D2899	2015-07-01 14:09:59 +00:00
Andriy Gapon	bc97daa07e	MFV r284412: 5911 ZFS "hangs" while deleting file Reviewed by: Bayard Bell <buffer.g.overflow@gmail.com> Reviewed by: Alek Pinchuk <alek@nexenta.com> Reviewed by: Simon Klinkert <simon.klinkert@gmail.com> Reviewed by: Dan McDonald <danmcd@omniti.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Matthew Ahrens <mahrens@delphix.com> illumos/illumos-gate@46e1baa6cf https://www.illumos.org/issues/5911 Sometimes ZFS appears to hang while deleting a file. It is actually making slow progress at the file deletion, but other operations (administrative and writes via the data path) "hang" until the file removal completes, which can take a long time if the file has many blocks. The deletion (or most of it) happens in a single txg, and the sync thread spends most of its time reading indirect blocks via this stack trace: swtch+0x141() cv_wait+0x70() zio_wait+0x5b() dbuf_read+0x2c0() free_children+0x50() free_children+0x12a() free_children+0x12a() free_children+0x12a() dnode_sync_free_range_impl+0xdf() dnode_sync_free_range+0x52() range_tree_vacate+0x65() dnode_sync+0x1d8() dmu_objset_sync_dnodes+0x77() dmu_objset_sync+0x19f() dsl_dataset_sync+0x51() dsl_pool_sync+0x9a() spa_sync+0x2ff() txg_sync_thread+0x21f() thread_start+8() One way to reproduce the problem is if we are over the arc_meta_limit, e.g. because lots of indirect blocks are pinned because we have L0 dbufs under them. It could be that most of the L1 indirects are cached, in which case when dmu_free_long_range_impl() calls dmu_tx_hold_free(), it will complete very quickly. This allows dmu_free_long_range_impl() to put many (perhaps all of its) transactions in the same TXG. However, dmu_free_long_range_impl() calls dnode_evict_dbufs (and dnode_free_range()), which removes the L0 dbufs, thus reducing the hold count on the L1 indirect blocks above it, allowing them to be evicted. Because we are over the arc_meta_limit(), these L1 blocks will be evicted ASAP. 
Thus when we get to syncing context, the L1 indirects are no longer cached and must be read in. Obtained from: illumos MFC after: 15 days	2015-06-19 06:58:05 +00:00
Andriy Gapon	ab50c99d40	illumos compat: use flsl/flsll for highbit/highbit64 Do that only when fast inline versions are available. At the moment that can be the case only in the kernel and not for all platforms. The original code uses the binary search and that's kept as a fallback. This is a micro optimization. Differential Revision: https://reviews.freebsd.org/D2839 Reviewed by: delphij, mahrens, mav MFC after: 17 days	2015-06-19 06:41:53 +00:00
Gleb Smirnoff	093ebe1d28	o Un-inline vm_pager_get_pages(), vm_pager_get_pages_async(). o Provide an extensive set of assertions for input array of pages. o Remove now duplicate assertions from different pagers. Sponsored by: Nginx, Inc. Sponsored by: Netflix	2015-06-17 22:44:27 +00:00
Andriy Gapon	783379a942	Revert r284511 because it caused build failures on many platforms The problem is that when inline versions of flsl and flsll are not available, then libkern.h must be included for their declarations in kernel sources. The fix would be trivial, but I would like to figure out first if it even makes sense to use the libkern provided implementations. Reported by: bz Pointyhat to: avg	2015-06-17 17:16:06 +00:00
Andriy Gapon	6470c31911	l2arc: pass correct size to trim requests b_size is a logical size of a buffer in memory, b_asize is its physical size that accounts for possible compression. Currently the latter is the best approximation for the allocated, on-disk size. L2ARC TRIM support was committed a few weeks before L2ARC compression was imported, so originally the code was correct, because b_size was the size. Further thoughts. Given that the cache device is being overwritten in a circular fashion it is not clear if a TRIM per each evicted L2ARC buffer has any benefits. Maybe it would be sufficient to issue a single trim request for the whole device when it is loaded, e.g. after a bootup, or when it is unloaded, e.g. before a shutdown. At least as long as L2ARC is not persistent across reboots. Discussed with: smh MFC after: 19 days	2015-06-17 12:28:13 +00:00
Andriy Gapon	1fa1d4a651	illumos compat: use flsl/flsll for highbit/highbit64 This is a micro optimization. The upstream code uses the binary search. Differential Revision: https://reviews.freebsd.org/D2839 Reviewed by: delphij, mav MFC after: 15 days	2015-06-17 12:05:04 +00:00
Andriy Gapon	bab89d0897	MFV r284036: 5961 Fix stack overflow in zfs_create_fs illumos/illumos-gate@c701fde691 Author: glebius MFC after: 11 days	2015-06-12 11:10:49 +00:00
Andriy Gapon	ff7e06fbf4	MFV r284030: 5818 zfs {ref}compressratio is incorrect with 4k sector size illumos/illumos-gate@81cd5c555f Author: Matthew Ahrens <mahrens@delphix.com> MFC after: 17 days	2015-06-12 10:57:05 +00:00
Andriy Gapon	8e9f0d5803	MFV r283534: 5515 dataset user hold doesn't reject empty tags illumos/illumos-gate@752fd8dabc Author: Josef 'Jeff' Sipek <josef.sipek@nexenta.com> MFC after: 10 days	2015-06-12 10:52:53 +00:00
Andriy Gapon	dde4126314	MFV r284040: check that datasets are snapshots 5946 zfs_ioc_space_snaps must check that firstsnap and lastsnap refer to snapshots 5945 zfs_ioc_send_space must ensure that fromsnap refers to a snapshot Reviewed by: Steven Hartland <killing@multiplay.co.uk> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Gordon Ross <gordon.ross@nexenta.com> illumos/illumos-gate@24218bebb4 Note that the upstream commit is modified during MFV: in the upstream the check is done by inspecting ds_is_snapshot field while in FreeBSD we call dsl_dataset_is_snapshot(). This is because illumos/illumos-gate@bc9014e6a8 (r277428 in vendor-sys/illumos) is not MFV-ed yet. MFC after: 10 days	2015-06-12 10:41:24 +00:00
Ruslan Bukin	8bd0e17595	Don't re-define LOCORE when dtrace is built-in to the kernel.	2015-06-10 09:59:26 +00:00
Andriy Gapon	de93769f1d	compat nvpair.h: make sure that the names are mangled only for kernel Currently there is no good reason to mangle the userland API. The change was introduced in `eac1d566b4`, r279437. Also see https://reviews.freebsd.org/D1881. I am still convinced that nv should not have introduced intentionally conflicting API. Discussed with: rstone X-MFC with: r279437 Sponsored by: ClusterHQ	2015-06-07 08:54:25 +00:00
Konstantin Belousov	63261dad32	Add missed {}. Noted by: Morten Rodal <morten@rodal.no> MFC after: 2 weeks	2015-05-27 19:28:14 +00:00
Konstantin Belousov	780dca1b1e	Right now, dounmount() is called with unreferenced mount point. Nothing stops a parallel unmount to succeed before the given call to dounmount() checks and locks the covered vnode. Prevent dounmount() from acting on the freed (although type-stable) memory by changing the interface to require the mount point to be referenced. dounmount() consumes the reference on return, regardless of the successful or erroneous result. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-05-27 09:22:50 +00:00
Andriy Gapon	4b040d9513	zfs: fixes for a full stream received into an existing dataset - this should fail early unless the force flag is set - if the force flag is set then any local modifications including snapshots should be undone See: https://www.illumos.org/issues/5912 See: https://reviews.csiden.org/r/220/ Reviewed by: mahrens, Paul Dagnelie <pcd@delphix.com> MFC after: 15 days Sponsored by: ClusterHQ	2015-05-25 11:56:57 +00:00
Andriy Gapon	e80d8b4b7c	dsl_dataset_promote_check: ensure that shared snaps do not become too long ... after they are transferred from the old origin to the new one. See: https://www.illumos.org/issues/5909 See: https://reviews.csiden.org/r/219/ Reviewed by: mahrens MFC after: 10 days Sponsored by: ClusterHQ	2015-05-25 11:48:15 +00:00
Konstantin Belousov	e61d4e626e	Remove excess Giant acquisition around the dounmount() call. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-05-25 09:08:19 +00:00
Mark Johnston	11027ebcbb	Remove unused references to calltrap. MFC after: 3 days	2015-05-25 01:22:56 +00:00
Jung-uk Kim	fd90e2ed54	CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten years for head. However, it is continuously misused as the mpsafe argument for callout_init(9). Deprecate the flag and clean up callout_init() calls to make them more consistent. Differential Revision: https://reviews.freebsd.org/D2613 Reviewed by: jhb MFC after: 2 weeks	2015-05-22 17:05:21 +00:00
Steven Hartland	c017a87e08	Add copyright info missing from r282205 Add the copyright info missing from ZoL origin version. MFC after: 2 days Sponsored by: Multiplay	2015-05-14 08:13:01 +00:00
Andriy Gapon	defce67748	zfs ioctls: use fget_write / fget_read instead of getf wrapper for fget This allows to ensure that we do not write to a file that was opened for reading only or vice versa. Also, use the correct capability in zfs_ioc_send_new(). Differential Revision: https://reviews.freebsd.org/D2382 Reviewed by: delphij MFC after: 17 days Sponsored by: ClusterHQ	2015-05-11 10:07:31 +00:00
Mark Johnston	5a9f9cb38e	Remove some commented-out upstream code for handling traps from usermode DTrace probes. This handling is already done in trap() on i386 and amd64.	2015-05-10 22:27:48 +00:00
Justin Hibbits	45bf6d59de	Fix a couple bugs in 64-bit powerpc fasttrap argument retrieval. Found by code inspection.	2015-05-10 04:33:01 +00:00
Andriy Gapon	96b60db0d7	MFV r282630: 5809 Blowaway full receive in v1 pool causes kernel panic MFC after: 5 days	2015-05-08 14:03:14 +00:00
Andriy Gapon	24dd1a8242	zfs: do not hold an extra reference on a root vnode while a filesystem is mounted At present zfs_domount() acquires a reference on the filesystem's root vnode and that reference is kept until zfs_umount. The latter calls vflush(rootrefs = 1) to dispose of the extra reference. There is no explanation of why that reference is kept - what problem it solves or what behavior it improves. Also, that logic is FreeBSD specific. There is one real problem with that reference, though. zfs recv -F may receive a full, non-incremental stream to a mounted filesystem. In that case the received root object is likely to have a different z_gen attribute value. Because of that, zfs_rezget will leave the previous root znode and vnode disassociated from the actual object (z_sa_hdl == NULL). Thus, future calls to VFS_ROOT() -> zfs_root() will produce a new vnode-znode pair, while the old one will be kept alive by the outstanding reference. So, the outstanding reference will not actually be for the new root vnode (or, more precisely, vnodes - because a root vnode may be recycled and a newer one can be created). As a result, when vflush(rootrefs = 1) is called there will be two problems: - a leaked reference on the old root vnode preventing a graceful unmount - insufficient references on the actual root vnode leading to a crash upon access to the vnode after it is destroyed by vgone() + vdrop() The second issue will actually override the first one. Differential Revision: https://reviews.freebsd.org/D2353 Reviewed by: delphij, kib, smh MFC after: 17 days	2015-05-05 11:01:06 +00:00
Andriy Gapon	ce0023d851	dmu_recv_end_check: don't leak hold if dsl_destroy_snapshot_check_impl fails The leak may happen if !drc_newfs && drc_force and there is an error iterating through snapshots or any of snapshot checks fails. See https://www.illumos.org/issues/5870 See https://reviews.csiden.org/r/206/ Reviewed by: mahrens (as mahrens@delphix.com) MFC after: 15 days Sponsored by: ClusterHQ	2015-05-05 10:56:16 +00:00
Steven Hartland	aeb9d4dad9	Fix misuse of input argument in traverse_visitbp In traverse_visitbp(), the input argument dnp is modified in the middle to point to a temporary buffer. Originally this doesn't matter, because no user of TRAVERSE_POST dereferences it. However, in fbeddd6 a piece of code is added dereferencing dnp after the modification, creating a possible bug. We fix this by creating a new local variable cdnp for the DMU_OT_DNODE case, so we don't modify the input argument. Also we introduce different local variables in the DMU_OT_OBJSET case to prevent confusion between the input argument. Obtained from: zfsonlinux (a585f2f844ed3d4270221fed88f5e494eb55d932) MFC after: 2 weeks Sponsored by: Multiplay	2015-04-28 22:46:58 +00:00
Andriy Gapon	9bc3222765	replace a comment about zfs recv -F corner case with a longer, more detailed one The old comment in zfs_rezget explains what situation the code handles, the new comment also describes how the situation can arise. Also, re-join a line that became sufficiently short some time ago. Differential Revision: https://reviews.freebsd.org/D2352 Reviewed by: delphij, smh MFC after: 12 days	2015-04-28 09:19:40 +00:00
Andriy Gapon	1af760ce1b	zfs_onexit_fd_hold: return EBADF even if devfs_get_cdevpriv gave ENOENT /dev/zfs always has per-open data, so when it is missing the file descriptor is for some other file. Returning ENOENT in this case is confusing as a variety of other conditions (like a missing dataset) may result in the same error. It's better to consistently return EBADF for any problems with the file descriptor. Note that zfs_onexit_fd_hold() is used with 'automatic cleanup fd' - when that fd is closed, typically because a process is terminated, some cleanup action is taken by ZFS driver. E.g. a temporary snapshot hold is released. Perhaps, it would even be worthwhile changing devfs_get_cdevpriv() to return EBADF if there is no associated data. Differential Revision: https://reviews.freebsd.org/D2370 Reviewed by: delphij, smh MFC after: 12 days	2015-04-28 09:11:47 +00:00
Andriy Gapon	37a9b4136e	dsl_dir_rename_check: return EXDEV on cross-pool rename attempt Obtained from: zfsonlinux/zfs@9063f65476 Obtained from: Boris Protopopov <boris.protopopov@actifio.com> MFC after: 10 days	2015-04-28 08:04:16 +00:00
Andriy Gapon	99d058c8a7	MFV r282123: 5610 zfs clone from different source and target pools produces coredump MFC after: 10 days	2015-04-28 07:42:28 +00:00
Andriy Gapon	28d15239af	MFV r282124: 5393 spurious failures from dsl_dataset_hold_obj() The actual bugfix was pro-actively committed in r275515. This MFV is cosmetic, it just aligns code style with the upstream. MFC after: 10 days	2015-04-28 07:37:38 +00:00
Andriy Gapon	39b6f1d6c1	nvpair_type_is_array: DATA_TYPE_INT8_ARRAY was not recognized To do: upstream (https://www.illumos.org/issues/5778) MFC after: 10 days	2015-04-28 06:34:55 +00:00
Robert Watson	a12df97ed2	Adjust PROF_ARTIFICIAL_FRAMES in the DTrace profile provider on ARM to skip 10, rather than 9, frames. This appears to work quite well in practice on the BeagleBone Black, so remove a comment about the value being bogus and replace it with a slightly less negative one. However, the number of frames to skip is quite sensitive to details of the timer and interrupt handling paths, so this is necessarily fragile -- but no more so than on x86. Sponsored by: DARPA, AFRL	2015-04-25 15:43:12 +00:00
Mark Johnston	8241ee3b2c	Fix DTrace's panic() action. It would previously call into some unfinished Solaris compatibility code and return without actually calling panic(9). The compatibility code is unneeded, however, so just remove it and have dtrace_panic() call vpanic(9) directly. Differential Revision: https://reviews.freebsd.org/D2349 Reviewed by: avg MFC after: 2 weeks Sponsored by: EMC / Isilon Storage Division	2015-04-24 03:19:30 +00:00
Xin LI	384f656a1a	Remove vfs.zfs.snapshot_list_prefetch, the corresponding code was gone in r248571 already. MFC after: 1 week	2015-04-17 21:21:11 +00:00
Mark Johnston	67cf27b70f	libdtrace: add support for lazyload mode. Passing "-x lazyload" to dtrace -G during compilation causes dtrace(1) to not link drti.o into the output object file, so the USDT probes are not created during process startup. Instead, dtrace(1) will automatically discover and create probes on the process' behalf when attaching. Differential Revision: https://reviews.freebsd.org/D2203 Reviewed by: rpaulo MFC after: 1 month	2015-04-08 02:36:37 +00:00
Alexander Motin	91b9f63738	Add DTrace probe to the new ARC reclaim cause added in r281026. MFC after: 1 month	2015-04-05 14:45:52 +00:00
Alexander Motin	2e9ccb32a1	Make ZFS ARC track both KVA usage and fragmentation. Even on Illumos, with its much larger KVA, ZFS ARC steps back if KVA usage reaches certain threshold (3/4 on i386 or 16/17 otherwise). FreeBSD has even less KVA, but had no such limit on archs with direct map as amd64. As result, on machines with a lot of RAM, during load with very small user-space memory pressure, such as `zfs send`, it was possible to reach state, when there is enough both physical RAM and KVA (I've seen up to 25-30%), but no continuous KVA range to allocate even single 128KB I/O request. Address this situation from two sides: - restore KVA usage limitations in a way the most close to Illumos; - introduce new requirement for KVA fragmentation, specifying that we should have at least one sequential KVA range of zfs_max_recordsize bytes. Experiments show that first limitation done alone is not sufficient. On machine with 64GB of RAM it is sometimes needed to drop up to half of ARC size to get at least one 1MB KVA chunk. Statically limiting ARC to half of KVA/RAM is too strict, so second limitation makes it to work in cycles: accumulate trash up to certain critical mass, do massive spring-cleaning, and then start littering again. :) MFC after: 1 month	2015-04-03 14:45:48 +00:00
Andrew Turner	7572a8c8f1	Add the arm64 defines for cddl code. Differential Revision: https://reviews.freebsd.org/D2186 Reviewed by: emaste Sponsored by: The FreeBSD Foundation	2015-04-01 08:31:56 +00:00
Mark Johnston	09a15aa38d	Import a missing piece of commit b8fac8e162eda7e98d from illumos-gate. This adds an upper bound, dtrace_ustackdepth_max, to the number of frames traversed when computing the userland stack depth. Some programs - notably firefox - are otherwise able to trigger an infinite loop in dtrace_getustack_common(), causing a panic. MFC after: 1 week	2015-03-30 03:55:51 +00:00
Alexander Motin	e5dcb72f45	Some cosmetic polishing. No functional change. MFC after: 1 week	2015-03-29 20:28:18 +00:00
Mark Johnston	97f2f66479	Remove unused upstream DTrace provider implementations that are duplicates of providers under sys/cddl/dev/. Also remove sdt_subr.c, which isn't used in FreeBSD's SDT implementation. Suggested by: rwatson	2015-03-16 01:15:08 +00:00
Robert Watson	9dcce6e267	Now that DTrace stack traces handle exception frames better, skip fewer stack frames for FBT 'entry' probes on ARM. MFC after: 3 days Sponsored by: DARPA, AFRL	2015-03-15 15:19:02 +00:00


1272 Commits