freebsd-dev

Author	SHA1	Message	Date
Justin Hibbits	45bf6d59de	Fix a couple bugs in 64-bit powerpc fasttrap argument retrieval. Found by code inspection.	2015-05-10 04:33:01 +00:00
Andriy Gapon	96b60db0d7	MFV r282630: 5809 Blowaway full receive in v1 pool causes kernel panic MFC after: 5 days	2015-05-08 14:03:14 +00:00
Andriy Gapon	24dd1a8242	zfs: do not hold an extra reference on a root vnode while a filesystem is mounted At present zfs_domount() acquires a reference on the filesystem's root vnode and that reference is kept until zfs_umount. The latter calls vflush(rootrefs = 1) to dispose of the extra reference. There is no explanation of why that reference is kept - what problem it solves or what behavior it improves. Also, that logic is FreeBSD specific. There is one real problem with that reference, though. zfs recv -F may receive a full, non-incremental stream to a mounted filesystem. In that case the received root object is likely to have a different z_gen attribute value. Because of that, zfs_rezget will leave the previous root znode and vnode disassociated from the actual object (z_sa_hdl == NULL). Thus, future calls to VFS_ROOT() -> zfs_root() will produce a new vnode-znode pair, while the old one will be kept alive by the outstanding reference. So, the outstanding reference will not actually be for the new root vnode (or, more precisely, vnodes - because a root vnode may be recycled and a newer one can be created). As a result, when vflush(rootrefs = 1) is called there will be two problems: - a leaked reference on the old root vnode preventing a graceful unmount - insufficient references on the actual root vnode leading to a crash upon access to the vnode after it is destroyed by vgone() + vdrop() The second issue will actually override the first one. Differential Revision: https://reviews.freebsd.org/D2353 Reviewed by: delphij, kib, smh MFC after: 17 days	2015-05-05 11:01:06 +00:00
Andriy Gapon	ce0023d851	dmu_recv_end_check: don't leak hold if dsl_destroy_snapshot_check_impl fails The leak may happen if !drc_newfs && drc_force and there is an error iterating through snapshots or any of snapshot checks fails. See https://www.illumos.org/issues/5870 See https://reviews.csiden.org/r/206/ Reviewed by: mahrens (as mahrens@delphix.com) MFC after: 15 days Sponsored by: ClusterHQ	2015-05-05 10:56:16 +00:00
Steven Hartland	aeb9d4dad9	Fix misuse of input argument in traverse_visitbp In traverse_visitbp(), the input argument dnp is modified in the middle to point to a temporary buffer. Originally this doesn't matter, because no user of TRAVERSE_POST dereferences it. However, in `fbeddd6` a piece of code is added dereferencing dnp after the modification, creating a possible bug. We fix this by creating a new local variable cdnp for the DMU_OT_DNODE case, so we don't modify the input argument. Also we introduce different local variables in the DMU_OT_OBJSET case to prevent confusion between the input argument. Obtained from: zfsonlinux (a585f2f844ed3d4270221fed88f5e494eb55d932) MFC after: 2 weeks Sponsored by: Multiplay	2015-04-28 22:46:58 +00:00
Andriy Gapon	9bc3222765	replace a comment about zfs recv -F corner case with a longer, more detailed one The old comment in zfs_rezget explains what situation the code handles, the new comment also describes how the situation can arise. Also, re-join a line that became sufficiently short some time ago. Differential Revision: https://reviews.freebsd.org/D2352 Reviewed by: delphij, smh MFC after: 12 days	2015-04-28 09:19:40 +00:00
Andriy Gapon	1af760ce1b	zfs_onexit_fd_hold: return EBADF even if devfs_get_cdevpriv gave ENOENT /dev/zfs always has per-open data, so when it is missing the file descriptor is for some other file. Returning ENOENT in this case is confusing as a variety of other conditions (like a missing dataset) may result in the same error. It's better to consistently return EBADF for any problems with the file descriptor. Note that zfs_onexit_fd_hold() is used with 'automatic cleanup fd' - when that fd is closed, typically because a process is terminated, some cleanup action is taken by ZFS driver. E.g. a temporary snapshot hold is released. Perhaps, it would even be worthwhile changing devfs_get_cdevpriv() to return EBADF if there is no associated data. Differential Revision: https://reviews.freebsd.org/D2370 Reviewed by: delphij, smh MFC after: 12 days	2015-04-28 09:11:47 +00:00
Andriy Gapon	37a9b4136e	dsl_dir_rename_check: return EXDEV on cross-pool rename attempt Obtained from: zfsonlinux/zfs@9063f65476 Obtained from: Boris Protopopov <boris.protopopov@actifio.com> MFC after: 10 days	2015-04-28 08:04:16 +00:00
Andriy Gapon	99d058c8a7	MFV r282123: 5610 zfs clone from different source and target pools produces coredump MFC after: 10 days	2015-04-28 07:42:28 +00:00
Andriy Gapon	28d15239af	MFV r282124: 5393 spurious failures from dsl_dataset_hold_obj() The actual bugfix was pro-actively committed in r275515. This MFV is cosmetic, it just aligns code style with the upstream. MFC after: 10 days	2015-04-28 07:37:38 +00:00
Andriy Gapon	39b6f1d6c1	nvpair_type_is_array: DATA_TYPE_INT8_ARRAY was not recognized To do: upstream (https://www.illumos.org/issues/5778) MFC after: 10 days	2015-04-28 06:34:55 +00:00
Mark Johnston	8241ee3b2c	Fix DTrace's panic() action. It would previously call into some unfinished Solaris compatibility code and return without actually calling panic(9). The compatibility code is unneeded, however, so just remove it and have dtrace_panic() call vpanic(9) directly. Differential Revision: https://reviews.freebsd.org/D2349 Reviewed by: avg MFC after: 2 weeks Sponsored by: EMC / Isilon Storage Division	2015-04-24 03:19:30 +00:00
Xin LI	384f656a1a	Remove vfs.zfs.snapshot_list_prefetch, the corresponding code was gone in r248571 already. MFC after: 1 week	2015-04-17 21:21:11 +00:00
Mark Johnston	67cf27b70f	libdtrace: add support for lazyload mode. Passing "-x lazyload" to dtrace -G during compilation causes dtrace(1) to not link drti.o into the output object file, so the USDT probes are not created during process startup. Instead, dtrace(1) will automatically discover and create probes on the process' behalf when attaching. Differential Revision: https://reviews.freebsd.org/D2203 Reviewed by: rpaulo MFC after: 1 month	2015-04-08 02:36:37 +00:00
Alexander Motin	91b9f63738	Add DTrace probe to the new ARC reclaim cause added in r281026. MFC after: 1 month	2015-04-05 14:45:52 +00:00
Alexander Motin	2e9ccb32a1	Make ZFS ARC track both KVA usage and fragmentation. Even on Illumos, with its much larger KVA, ZFS ARC steps back if KVA usage reaches certain threshold (3/4 on i386 or 16/17 otherwise). FreeBSD has even less KVA, but had no such limit on archs with direct map as amd64. As result, on machines with a lot of RAM, during load with very small user- space memory pressure, such as `zfs send`, it was possible to reach state, when there is enough both physical RAM and KVA (I've seen up to 25-30%), but no continuous KVA range to allocate even single 128KB I/O request. Address this situation from two sides: - restore KVA usage limitations in a way the most close to Illumos; - introduce new requirement for KVA fragmentation, specifying that we should have at least one sequential KVA range of zfs_max_recordsize bytes. Experiments show that first limitation done alone is not sufficient. On machine with 64GB of RAM it is sometimes needed to drop up to half of ARC size to get at least one 1MB KVA chunk. Statically limiting ARC to half of KVA/RAM is too strict, so second limitation makes it to work in cycles: accumulate trash up to certain critical mass, do massive spring-cleaning, and then start littering again. :) MFC after: 1 month	2015-04-03 14:45:48 +00:00
Andrew Turner	7572a8c8f1	Add the arm64 defines for cddl code. Differential Revision: https://reviews.freebsd.org/D2186 Reviewed by: emaste Sponsored by: The FreeBSD Foundation	2015-04-01 08:31:56 +00:00
Alexander Motin	e5dcb72f45	Some cosmetic polishing. No functional change. MFC after: 1 week	2015-03-29 20:28:18 +00:00
Mark Johnston	97f2f66479	Remove unused upstream DTrace provider implementations that are duplicates of providers under sys/cddl/dev/. Also remove sdt_subr.c, which isn't used in FreeBSD's SDT implementation. Suggested by: rwatson	2015-03-16 01:15:08 +00:00
Steven Hartland	208264283d	Allow zvol_geom_worker to process BIO_DELETE's If zvol_geom_start is called with a BIO_DELETE from a thread which can sleep it queues it for later processing by the zvol_geom_worker. The zvol_geom_worker didn't have a delete case so would simply lose the bio hence preventing the original caller from ever completing. In addition any other unknown types would suffer the same fate. Allow zvol_geom_worker to process BIO_DELETE's via zvol_strategy and return unsupported for all unknown bio types. MFC after: 2 weeks Sponsored by: Multiplay	2015-03-14 17:35:04 +00:00
Alexander Motin	0d45c37cb6	Make DIOCGATTR in device mode handle "GEOM::candelete". MFC after: 3 days	2015-03-12 16:19:18 +00:00
Andrew Turner	4a8169d97b	Add the MD parts of dtrace needed to use fbt on ARM. For this we need to emulate the instructions used in function entry and exit. For function entry ARM will use a push instruction to push up to 16 registers to the stack. While we don't expect all 16 to be used we need to handle any combination the compiler may generate, even if it doesn't make sense (e.g. pushing the program counter). On function return we will either have a pop or branch instruction. The former is similar to the push instruction, but with care to make sure we update the stack pointer and program counter correctly in the cases they are either in the list of registers or not. For branch we need to take the 24-bit offset, sign-extend it, and add that number of 4-byte words to the program counter. Care needs to be taken as, due to historical reasons, the address the branch is relative to is not the current instruction, but 8 bytes later. This allows us to use the following probes on ARM boards: dtrace -n 'fbt::malloc:entry { stack() }' and dtrace -n 'fbt::free:return { stack() }' Differential Revision: https://reviews.freebsd.org/D2007 Reviewed by: gnn, rpaulo Sponsored by: ABT Systems Ltd	2015-03-05 17:55:31 +00:00
George V. Neville-Neil	fcb5606706	Initial version of DTrace on ARM32. Submitted by: Howard Su based on work by Oleksandr Tymoshenko Reviewed by: ian, andrew, rpaulo, markj	2015-02-10 19:41:30 +00:00
Mark Johnston	3277b9a257	Fix a typo in r278137: make sure to free provider state. X-MFC-With: r278136	2015-02-08 03:55:12 +00:00
Pedro F. Giffuni	3ccccdc17d	MFV r266995: 4767 dtrace_probe() always has the timestamp Reference: https://illumos.org/issues/4767 Obtained from: Illumos MFC after: 2 weeks	2015-02-03 20:06:30 +00:00
Pedro F. Giffuni	eadcd0fadf	MFV r266993: 4469 DTrace helper tracing should be dynamic Reference: https://illumos.org/issues/4469 Obtained from: Illumos Phabric: D1551 Reviewed by: markj MFC after: 2 weeks	2015-02-03 19:39:53 +00:00
Mark Johnston	c36bd253fa	Continue to handle the case where state is NULL, though this currently cannot happen on FreeBSD. r278136 overlooked the fact that a destructor registered with devfs_set_cdevpriv(9) is invoked even in the case of an error. X-MFC-With: r278136	2015-02-03 06:04:16 +00:00
Mark Johnston	ac21b651bf	Diff reduction with illumos, in preparation for merging r266993 from the vendor branch. No functional change. MFC after: 1 week	2015-02-03 05:38:52 +00:00
Steven Hartland	370a13bfff	Prevent inlining txg_quiesce This allows dtrace to monitor the calls to txg_quiesce which can be really helpful. Also standardise __noinline order for arc_kmem_reap_now. Sponsored by: Multiplay	2015-02-02 00:17:36 +00:00
Mark Johnston	a70a59ea73	Don't attempt to disable enabled fasttrap probes in an exiting process. There's no need to do so, and we can't hold an exiting process, so this race can result in panics. MFC after: 1 week	2015-01-30 05:03:23 +00:00
Mark Johnston	1eb8ad64ea	In fasttrap_sigtrap(), use tdsendsignal() rather than tdksignal() to send SIGTRAP. The latter requires that its thread argument be non-NULL, but fasttrap_sigtrap() does not. PR: 193593 MFC after: 1 week Reported by: danilo	2015-01-30 04:51:59 +00:00
Xin LI	63cffd61d1	MFV r255258: Diff reduction with upstream. The actual change was merged in r272483 already. MFC after: 2 weeks	2015-01-28 08:56:48 +00:00
Will Andrews	b4e360d239	When creating or updating a node, use vfs_timestamp() for "now" instead of gethrestime(), to allow the administrator to decide the appropriate timestamp precision instead of always using nanosecond precision.	2015-01-24 00:43:02 +00:00
Will Andrews	bd3a7c08c4	Remove commented log messages.	2015-01-21 19:30:01 +00:00
Will Andrews	35b540bfb2	Ignore sync requests from the system syncher, i.e. VFS_SYNC(waitfor=MNT_LAZY). ZFS already commits outstanding data every zfs_txg_timeout seconds, so these syncs are unnecessarily intrusive. Submitted by: gibbs Sponsored by: Spectra Logic MFSpectraBSD: `1105759` on 2014/12/11	2015-01-21 19:25:57 +00:00
Will Andrews	2a2c1d424a	Eliminate an #ifdef illumos for zfs_ioc_rename(). Since allow_mounted is a FreeBSD-specific change, default to B_TRUE, then locally check for the magic bit. Unconditionally check allow_mounted below. Convert the setting of allow_mounted to an explicit boolean. MFC after: 1 week Sponsored by: Spectra Logic MFSpectraBSD: 672578 (in part) on 2013/07/19	2015-01-21 19:20:36 +00:00
Will Andrews	55ddf051d8	Add vfs.zfs.reference_tracking_enable sysctl/tunable. This is primarily for developer/debugging use; it enables built-in tagged tracking of refcounts inside ZFS. It can only be enabled from the loader, since it modifies how in-core state is managed. Default remains disabled. MFC after: 1 week Sponsored by: Spectra Logic	2015-01-21 17:03:11 +00:00
Will Andrews	798cbb7523	Fix arc__shrink DTrace probe's to_free argument. Remove the unnecessary #ifdef _KERNEL, which did not differ in the true or false cases. Actually set the value of to_free before using it. MFC after: 1 week Sponsored by: Spectra Logic	2015-01-20 22:39:10 +00:00
Will Andrews	fe20fb9fb0	Use the "zfs_gfs" tag for GFS vnodes to make them easier to identify. MFC after: 1 week Sponsored by: Spectra Logic	2015-01-20 22:31:26 +00:00
Alexander Motin	d6245e3d44	Allow skipping dmu_buf_will_dirty() call in dsl_dir_transfer_space(). dsl_dir_transfer_space() is mostly called after dsl_dir_diduse_space(), which already calls dmu_buf_will_dirty() for the same dbuf and tx, so its duplicate call in those cases will change nothing, only spend time. Skipping this call by four times reduces time spent in dbuf_write_done() and descendants, updating dataset statistics with several congested lock acquisitions. When rewriting 8K zvol blocks at 1GB/s rate, this reduces CPU time spent inside dbuf_write_done(), according to profiling, from 45% of 683K samples to 18% of 422K. MFC after: 2 weeks	2015-01-20 13:09:12 +00:00
Steven Hartland	5eab7e5406	Clean ZFS spa config before syncing A number of entries that can be present in the spa config shouldn't be saved to disk so add a method to ensure this is the case. Without this if the last caller to vdev_config_generate requested stats then we can end up in the cache file. Also only skip a non-writable pool in the cache file generation if it's active. This prevents unavailable pools incorrectly getting removed from cache file. Tested by: delphij MFC after: 2 weeks Sponsored by: Multiplay	2015-01-18 23:15:49 +00:00
Steven Hartland	bc96366c86	Mechanically convert cddl sun #ifdef's to illumos Since the upstream for cddl code is now illumos not sun, mechanically convert all sun #ifdef's to illumos #ifdef's which have been used in all newer code for some time. Also do a manual pass to correct the use of #ifdef comments as per style(9) as well as few uses of #if defined(__FreeBSD__) vs #ifndef illumos. MFC after: 1 month Sponsored by: Multiplay	2015-01-17 14:44:59 +00:00
Alexander Motin	38feff972b	Fix overflow bug from r248577, turning 30s TRIM timeout into ~4s. MFC after: 2 weeks	2015-01-14 16:22:00 +00:00
Alexander Motin	d4f46a775d	Reimplement TRIM throttling added in r248577. Previous throttling implementation approached problem from the wrong side. It significantly limited useful delaying of TRIM requests and aggregation potential, while not so much controlled TRIM burstiness under heavy load. With this change random 4K write benchmarks (probably the worst case for TRIM) show me IOPS increase by 20%, average latency reduction by 30%, peak TRIM bursts reduction by 3 times and same peak TRIM map size (memory usage). Also the new logic does not force map size down so heavily, really allowing to keep deleted data for 32 TXG or 30 seconds under moderate load. It was practically impossible with old throttling logic, which pushed map down to only 64 segments. Reviewed by: smh MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2015-01-14 09:39:57 +00:00
Alexander Motin	5b3a65823d	Skip extra bcopy() when scrubbing vdev without redundancy. According to profiler, this bcopy() can use about 10% of CPU time. MFC after: 2 weeks	2015-01-12 22:38:55 +00:00
Alexander Motin	f5b85f6551	When aggregating TRIM segments, move the new one to the list end. New segment at the list head may block all TRIM requests until txg of that segment can be processed. On my random I/O tests this change reduces peak TRIM list length from 650 to 450 segments. Hopefully it should reduce TRIM burstiness when list processing is unblocked. MFC after: 2 weeks	2015-01-11 16:36:39 +00:00
Alexander Motin	2de874ed23	Add LBA as secondary sort key for synchronous I/O requests. On FreeBSD gethrtime() is implemented via getnanouptime(), that has 1ms (1/hz) precision. It makes primary sort key (timestamp) collision very possible. In such situations sorting by secondary key of LBA is much more reasonable than by totally meaningless zio pointer value. With this change on multi-threaded synchronous ZVOL read I've measured 10% throughput increase and average latency reduction. MFC after: 2 weeks	2015-01-11 00:26:18 +00:00
Alexander Motin	13ea8106d9	Use new optimized dmu_read_uio_dbuf() for ZVOLs in device mode. This slightly reduces overhead by avoiding dnode_hold()/dnode_rele() calls. MFC after: 2 weeks	2015-01-10 18:28:58 +00:00
Steven Hartland	8de799ea3a	Correct zpool list displaying invalid EXPANDSZ for unavailable pool vdevs When pools are unavailable their vdevs are also unavailable which means that vdev_max_asize remains at the default zero. This default was being used to calculate vs_esize resulting in a negative number as vdev_asize > vdev_max_asize, which caused zpool list -v to display 16.0E for EXPANDSZ of these vdevs.	2014-12-31 04:54:48 +00:00
Steven Hartland	51f529b50b	Always sync the global ZFS config cache to reflect the new mosconfig This fixes out of date zpool.cache for root pools, which can cause issues such as confusion of zdb etc. MFC after: 1 month	2014-12-23 09:31:24 +00:00

1010 Commits