freebsd-nq

Author	SHA1	Message	Date
Justin Hibbits	e40a5cd3ec	Fix the stack tracing for dtrace/powerpc. Summary: Fix the stack tracing for dtrace/powerpc by using the trapexit/asttrapexit return address sentinels instead of checking within the kernel address space. As part of this, I had to add new inline functions. FBT traces the kernel, so we have to have special case handling for this, since a trap will create a full new trap frame, and there's no way to pass around the 'real' stack. I handle this by special-casing 'aframes == 0' with the trap frame. If aframes counts out to the trap frame, then assume we're looking for the full kernel trap frame, so switch to the real stack pointer. Test Plan: Tested on powerpc64 Reviewers: rpaulo, markj, nwhitehorn Reviewed By: markj, nwhitehorn Differential Revision: https://reviews.freebsd.org/D788 MFC after: 3 week Relnotes: Yes	2014-09-17 02:43:47 +00:00
Steven Hartland	a889b18c52	Added missing ZFS sysctls * vfs.zfs.vdev.async_write_active_min_dirty_percent * vfs.zfs.vdev.async_write_active_max_dirty_percent Added validation of min / max for ZFS sysctl * vfs.zfs.dirty_data_max_percent MFC after: 3 days	2014-09-14 12:23:00 +00:00
Xin LI	f9290bc2c9	MFV r271518: Correctly report hole at end of file. When asked to find a hole, the DMU sees that there are no holes in the object, and returns ESRCH. The ZPL interprets this as "no holes before the end of the file", and therefore inserts the "virtual hole" at the end of the file. Because DMU and ZPL have different ideas of where the end of an object/file is, we will end up returning the end of file, which is generally larger, instead of returning the end of object. The fix is to handle the "virtual hole" in the DMU. If no hole is found, the DMU will return a hole at the end of the file, rather than an error. Illumos issue: 5139 SEEK_HOLE failed to report a hole at end of file MFC after: 1 week	2014-09-13 17:48:44 +00:00
Xin LI	dc147754b7	MFV r271517: In zil_claim, don't issue warning if we get EBUSY (inconsistent) when opening an objset, instead, ignore it silently. Illumos issue: 5140 message about "%recv could not be opened" is printed when booting after crash MFC after: 1 week	2014-09-13 17:36:34 +00:00
Xin LI	be1b14a063	MFV r271515: Add a new tunable/sysctl, vfs.zfs.free_max_blocks, which can be used to limit how many blocks can be free'ed before a new transaction group is created. The default is no limit (infinite), but we should probably have a lower default, e.g. 100,000. With this limit, we can guard against the case where ZFS could run out of memory when destroying large numbers of blocks in a single transaction group, as the entire DDT needs to be brought into memory. Illumos issue: 5138 add tunable for maximum number of blocks freed in one txg MFC after: 2 weeks	2014-09-13 17:24:56 +00:00
Xin LI	ff0fc48bde	MFV r271512: Illumos issue: 5136 fix write throttle comment in dsl_pool.c MFC after: 2 weeks	2014-09-13 16:51:23 +00:00
Xin LI	263f396e2b	MFV r271510: Enforce 4K as smallest indirect block size (previously the smallest indirect block size was 1K but that was never used). This makes some space estimates more accurate and uses less memory for some data structures. Illumos issue: 5141 zfs minimum indirect block size is 4K MFC after: 2 weeks	2014-09-13 16:26:14 +00:00
Steven Hartland	3cdd9138c3	Persist vdev_resilver_txg changes to avoid panic caused by validation vs a vdev_resilver_txg value from a previous resilver. MFC after: 1 week	2014-09-11 16:21:51 +00:00
Gleb Smirnoff	27ad26d8c7	Remove unused arguments for VOP_GETPAGES(), VOP_PUTPAGES().	2014-09-10 12:36:41 +00:00
Alexander Motin	ee9534ed96	Make ZVOL writes in device mode support IO_SYNC flag. MFC after: 1 month	2014-09-09 11:29:55 +00:00
Xin LI	817d804595	MFV r271223: In dnode_sync(), do dnode_increase_indirection() before processing the dn_next_nblkptr. Illumos issue: 5117 space map reallocation can cause corruption MFC after: 3 days	2014-09-07 13:13:42 +00:00
Peter Wemm	d903c21a64	Move the restored #ifdef i386 test back inside the #ifdef _KERNEL block where it originally was.	2014-08-31 09:05:02 +00:00
Steven Hartland	92ac3eb59f	Ensure that ZFS ARC free memory checks include cached pages Also restore kmem_used() check for i386 as it has KVA limits that the raw page counts above don't consider PR: 187594 Reviewed by: peter X-MFC-With: r270759 Review: D700 Sponsored by: Multiplay	2014-08-30 21:44:32 +00:00
Mateusz Guzik	6662ce5aab	Add missing proctree locking to fill_kinfo_proc consumers. This fixes r270444. Pointy hat: mjg Reported by: many MFC after: 1 week	2014-08-30 03:10:55 +00:00
Steven Hartland	4d19f4ad1f	Refactor ZFS ARC reclaim logic to be more VM cooperative Prior to this change we triggered ARC reclaim when kmem usage passed 3/4 of the total available, as indicated by vmem_size(kmem_arena, VMEM_ALLOC). This could lead to large amounts of unused RAM, e.g. on a 192GB machine with ARC the only major RAM consumer, 40GB of RAM would remain unused. The old method has also been seen to result in extreme RAM usage under certain loads, causing poor performance and stalls. We now trigger ARC reclaim when the number of free pages drops below the value defined by the new sysctl vfs.zfs.arc_free_target, which defaults to the value of vm.v_free_target. Credit to Karl Denninger for the original patch on which this update was based. PR: 191510 and 187594 Tested by: dteske MFC after: 1 week Relnotes: yes Sponsored by: Multiplay	2014-08-28 19:50:08 +00:00
Mark Johnston	35127d3c0f	Restore the correct value when disabling probes. Otherwise the instrumented tracepoints would continue to generate traps, which would be ignored but could consume noticeable amounts of CPU if, say, all functions in the kernel were instrumented. X-MFC-With: r270067	2014-08-24 17:10:47 +00:00
Xin LI	ec1b564650	Instead of using timestamp in the AVL, use the memory address when comparing. Illumos issue: 5095 panic when adding a duplicate dbuf to dn_dbufs MFC after: 3 days	2014-08-22 23:13:53 +00:00
Xin LI	fa4484104c	MFV r270197: Illumos issue: 5066 remove support for non-ANSI compilation 5068 Remove SCCSID() macro from <macros.h> MFC after: 2 weeks	2014-08-22 22:13:36 +00:00
Xin LI	d291a3bd9c	Provide compatibility shim for atomic_dec_64_nv. X-MFC-with: r270247 MFC after: 13 days	2014-08-21 08:25:46 +00:00
Xin LI	7c1db36b28	MFV r270196: Illumos issue: 5047 don't use atomic_*_nv if you discard the return value MFC after: 2 weeks	2014-08-20 22:39:26 +00:00
Xin LI	249ddb42f6	MFC r270195: Illumos issue: 5045 use atomic_{inc,dec}_* instead of atomic_add_* MFC after: 2 weeks	2014-08-20 21:44:48 +00:00
Xin LI	2bcc37f99c	MFV r270193: Illumos issues: 5042 stop using deprecated atomic functions MFC after: 2 weeks	2014-08-20 18:29:18 +00:00
Mark Johnston	266b4a78c2	Factor out the common code for function boundary tracing instead of duplicating the entire implementation for both x86 and powerpc. This makes it easier to add support for other architectures and has no functional impact. Phabric: D613 Reviewed by: gnn, jhibbits, rpaulo Tested by: jhibbits (powerpc) MFC after: 2 weeks	2014-08-16 21:42:55 +00:00
Xin LI	60723bfe21	MFV r269542: In vdev_get_stats, check that the vdev is not a hole before computing the fragmentation. This fixes a panic when removing log device. Illumos issue: 5049 panic when removing log device Author: Alex Reece <alex@delphix.com> MFC after: 2 weeks	2014-08-05 00:07:21 +00:00
Mark Johnston	2661328745	Return 0 for the PPID of threads in process 0, as process 0 doesn't have a parent process. MFC after: 2 weeks	2014-08-04 19:02:30 +00:00
Xin LI	cd741a5e1d	Revert r269404 and use cpu_ticks() for dbuf allocation. Encode CPU's number by XOR'ing the CPU ID against the 64-bit cpu_ticks(). Reviewed by: mav, gibbs Differential Revision: https://phabric.freebsd.org/D521 MFC after: 2 weeks	2014-08-03 09:47:51 +00:00
Xin LI	1dcef10eac	MFV r269427: In dnode_children_t, use C99's "[]" idiom for declaring the variable sized array dnc_children at the end of the structure. This prevents the compiler from mistakenly optimizing away accesses beyond the array's defined size. Illumos issue: 5038 Remove "old-style" flexible array usage in ZFS. Author: Justin T. Gibbs <justing@spectralogic.com> MFC after: 2 weeks	2014-08-02 08:34:22 +00:00
Ian Lepore	c311f7078c	When arm 64-bit atomic ops are available, define ARM_HAVE_ATOMIC64. Use that symbol (which will be correct in both kernel and userland contexts) rather than just __arm__ to decide whether to use a local implementation.	2014-08-02 03:44:27 +00:00
Ian Lepore	814f4c5896	Use the 64-bit atomics now provided by arm machine/atomic.h instead of (conflicting) local versions.	2014-08-01 23:45:50 +00:00
Steven Hartland	6a369c018c	Don't return ZIO_PIPELINE_CONTINUE from vdev_op_io_start methods This prevents recursion of vdev_queue_io_done as per r265321 but using a different method as recommended on the openzfs list. We now use zio_interrupt(zio) and return ZIO_PIPELINE_STOP instead of returning ZIO_PIPELINE_CONTINUE from vdev_*_io_start methods. zio_vdev_io_start now ASSERTs that vdev_op_io_start returns ZIO_PIPELINE_STOP to ensure future changes don't reintroduce ZIO_PIPELINE_CONTINUE returns. Clean up flow in vdev_geom_io_start while I'm here. Also fix some cases not using SET_ERROR(..) MFC after: 2 weeks X-MFC-With: r265321	2014-08-01 23:16:48 +00:00
Xin LI	125f68e708	Split gethrtime() and gethrtime_waitfree() and make the former use nanouptime() instead of getnanouptime(). nanouptime(9) provides a more precise result at the expense of being slower. In r269223, gethrtime() is used as the creation time of a dbuf, which in turn acts as part of the lookup key to maintain the AVL invariant that there cannot be duplicate items. Before this change, gethrtime() preferred faster execution at the cost of precision, which may lead to a panic on busy systems with: panic: avl_find() succeeded inside avl_add() Reported by: allanjude, mav PR: kern/192284 MFC after: 11 days X-MFC-with: r269223	2014-08-01 22:33:23 +00:00
Rui Paulo	d18aa577d5	Copy strtolctype.h to sys/cddl/contrib/opensolaris/common/util to keep the kernel self-contained. Requested by: jhb	2014-07-31 08:07:23 +00:00
Xin LI	9b046b421f	MFV r269224: Increase default ARC buf_hash_table size. When typical block size is small, the hash table could be too small, which would lead to long hash chains and limit performance for cached reads. A new loader tunable, vfs.zfs.arc_average_blocksize, has been added, which allows users to override the default assumption of average (typical) block size. Old default was 65536 (64 KiB) and new default is 8192 (8 KiB). Illumos issue: 5034 ARC's buf_hash_table is too small MFC after: 2 weeks	2014-07-29 09:36:48 +00:00
Xin LI	a3cbca537e	MFV r269223: Change dn->dn_dbufs from linked list to AVL tree. Illumos issues: 4873 zvol unmap calls can take a very long time for larger datasets MFC after: 2 weeks	2014-07-29 08:42:22 +00:00
Xin LI	343c95a24e	Reschedule the 'deadman' callout after handling, this makes our code behave more like it is on Solaris. Reported by: avg Reviewed by: avg, mav (but bugs are mine) Differential Revision: https://phabric.freebsd.org/D457	2014-07-29 06:57:13 +00:00
Konstantin Belousov	fe0e9a63e0	Initialize zfs vnode v_hash when the vnode is allocated, instead of postponing it to zfs_vget(). zfs_root() returned vnode with the default value of v_hash, which caused inconsistent v_hash value when root vnode was obtained from zfs_vget(). Nullfs allocated two upper vnodes for the root zfs vnode due to different hashes, causing consistency problems. Reported and tested by: Harald Schmalzbauer <h.schmalzbauer@omnilan.de> Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-07-28 14:24:18 +00:00
Xin LI	50b74c6ef1	Add two sysctls for newly added tunables. MFC after: 2 weeks	2014-07-26 19:07:08 +00:00
Xin LI	7e37b1e609	MFV r269010: Import Illumos changes to address the following Illumos issues: 4976 zfs should only avoid writing to a failing non-redundant top-level vdev 4978 ztest fails in get_metaslab_refcount() 4979 extend free space histogram to device and pool 4980 metaslabs should have a fragmentation metric 4981 remove fragmented ops vector from block allocator 4982 space_map object should proactively upgrade when feature is enabled 4984 device selection should use fragmentation metric MFC after: 2 weeks	2014-07-26 10:20:48 +00:00
Alexander Motin	1bc04f6a8c	Make sysctls under vfs.zfs.zfetch writeable. I don't see any reason for them to be read-only, while tuning them without reboot is much more convenient for experiments. MFC after: 2 weeks	2014-07-26 09:09:14 +00:00
Xin LI	0aa4ce9b7d	Transform the I/O when vdev_physical_ashift is greater than SPA_MINBLOCKSHIFT. MFC after: 2 weeks	2014-07-25 18:41:56 +00:00
Xin LI	883d80c104	As of r268075, the responsibility of rounding up the buffer to the optimal size has been transferred from zio_compress_data to its caller. Therefore, passing the 'minblocksize' down will be a no-op. Eliminate the parameter to reduce diff against upstream. MFC after: 2 weeks	2014-07-25 06:53:20 +00:00
Xin LI	3d4d6b0883	Correct typo introduced with r268855. MFC after: 10 days X-MFC with: r268855	2014-07-22 08:37:01 +00:00
Mark Johnston	5a5f9d21dd	Use a C wrapper for trap() instead of checking and calling the DTrace trap hook in assembly. Suggested by: kib Reviewed by: kib (original version) X-MFC-With: r268600	2014-07-19 02:27:31 +00:00
Xin LI	b4bb49887b	Reduce lock contention on the z_teardown_lock under heavily cached read workload by splitting the single teardown rrw lock into RRM_NUM_LOCKS (17) of them. Read acquisitions are randomly distributed among these locks based on curthread pointer. Write acquisitions are going to all the locks, which for the usage of this type of lock should be rare. Illumos issue: 5008 lock contention (rrw_exit) while running a read only load MFC after: 2 weeks	2014-07-19 00:26:03 +00:00
Xin LI	82599d31fe	MFV r268851: When a sync task is waiting for a txg to complete, we should hurry it along by increasing the number of outstanding async writes (i.e. make vdev_queue_max_async_writes() return a larger number). Illumos issue: 4753 increase number of outstanding async writes when sync task is waiting MFC after: 2 weeks	2014-07-18 22:34:01 +00:00
Xin LI	f886b6e3bc	MFV r268850: Change the interaction between the DMU and ARC so that when the DMU is shutting down an objset, we do not evict the data from the ARC. Instead we simply coordinate the destruction of the DMU's data with the ARC. The only case where we actually need to explicitly evict from the ARC is when dbuf_rele_and_unlock() determines that the administrator has requested that it not be kept in memory, via the primarycache/secondarycache properties. In this case, we evict the data from the ARC by its blkptr_t, the same way as when a block is freed we explicitly evict it from the ARC. Illumos issue: 4631 zvol_get_stats triggering too many reads MFC after: 2 weeks	2014-07-18 22:04:21 +00:00
Xin LI	7882b61f60	MFV r268848: Instead of asserting all zio's be properly aligned, only assert on the logical ones. Cap uberblocks at 8k, otherwise with ashift=17, there would be only one uberblock. This fixes a problem where zdb would trip an assert on pools with ashift >= 0xe (8k). While there, also change the code so it only attempts to condense the space map if the uncondensed size consumes more than zfs_metaslab_condense_block_threshold blocks. Illumos issue: 4958 zdb trips assert on pools with ashift >= 0xe MFC after: 2 weeks	2014-07-18 20:41:40 +00:00
Xin LI	7079d5877c	MFV r268714: Improve extreme rewind import. When doing an "extreme rewind" import ("zpool import -XF"), we attempt to verify all data in the pool, essentially scrubbing the entire pool. The problem is that spa_load_verify_cb() issues an unbounded number of concurrent scrub i/os. This can lead to all of memory being used for these zio's, wedging the system. Like normal scrub, we need to put a cap on the number of outstanding i/os, and have the traverse thread block when we reach this cap. For this purpose the cap can be very large (10,000) to optimize the elevator algorithm. Three kernel tunables have been added: vfs.zfs.spa_load_verify_maxinflight vfs.zfs.spa_load_verify_metadata vfs.zfs.spa_load_verify_data The latter two tunables control whether metadata and/or user data are verified when doing an extreme rewind. Make 'zpool import -T' imply scrub. Make zpool import -T <txg> accept hexadecimal values for the txg when prefixed with 0x. Skip txg's for which there is no uberblock when doing extreme rewind. Skip reading all user data twice by skipping prefetches when doing extreme rewinds as we do not access via the ARC. Illumos issues: 4970 need controls on i/o issued by zpool import -XF 4971 zpool import -T should accept hex values 4972 zpool import -T implies extreme rewind, and thus a scrub 4973 spa_load_retry retries the same txg 4974 spa_load_verify() reads all data twice MFC after: 2 weeks	2014-07-15 22:44:04 +00:00
Xin LI	eb75155228	MFV r268702: Add missing *_destroy() calls in various places with ZFS. Illumos issue: 4975 missing mutex_destroy() calls in zfs MFC after: 2 weeks	2014-07-15 20:32:23 +00:00
Mark Johnston	291624fdf6	Invoke the DTrace trap handler before calling trap() on amd64. This matches the upstream implementation and helps ensure that a trap induced by tracing fbt::trap:entry is handled without recursively generating another trap. This makes it possible to run most (but not all) of the DTrace tests under common/safety/ without triggering a kernel panic. Submitted by: Anton Rang <anton.rang@isilon.com> (original version) Phabric: D95	2014-07-14 04:38:17 +00:00
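
Commit 9b046b421f above lowers the assumed average block size used to size the ARC buf_hash_table from 65536 to 8192 bytes. The sketch below illustrates why a smaller assumed block size gives a larger table. It is not the kernel code: the function name, the 4096-entry starting size, and the doubling rule are assumptions made for this example. Only the two default block sizes come from the commit message.

```python
# Defaults quoted in the commit message for vfs.zfs.arc_average_blocksize.
OLD_AVERAGE_BLOCKSIZE = 64 * 1024  # 65536 bytes (64 KiB)
NEW_AVERAGE_BLOCKSIZE = 8 * 1024   # 8192 bytes (8 KiB)


def hash_table_entries(physical_memory_bytes, average_blocksize,
                       minimum_entries=1 << 12):
    """Hypothetical sizing rule (an assumption, not taken from the source).

    Start from a small power of two and keep doubling until there is one
    hash bucket per average-sized block that could fit in physical memory.
    """
    if average_blocksize <= 0:
        raise ValueError("average_blocksize must be positive")
    entries = minimum_entries
    while entries * average_blocksize < physical_memory_bytes:
        entries <<= 1
    return entries


if __name__ == "__main__":
    ram = 16 * 1024 ** 3  # a machine with 16 GiB of RAM, chosen arbitrarily
    old = hash_table_entries(ram, OLD_AVERAGE_BLOCKSIZE)
    new = hash_table_entries(ram, NEW_AVERAGE_BLOCKSIZE)
    print("old default:", old, "entries")
    print("new default:", new, "entries")
    print("growth factor:", new // old)
```

Under this assumed rule, dividing the block size by eight multiplies the number of buckets by eight, which shortens hash chains when the cache holds many small blocks.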

1097 Commits