freebsd-dev

Author	SHA1	Message	Date
Dimitry Andric	02d4a225db	With clang 3.9.0, compiling uplcom results in the following warnings: sys/dev/usb/serial/uplcom.c:543:29: error: implicit conversion from 'int' to 'int8_t' (aka 'signed char') changes value from 192 to -64 [-Werror,-Wconstant-conversion] if (uplcom_pl2303_do(udev, UT_READ_VENDOR_DEVICE, UPLCOM_SET_REQUEST, 0x8484, 0, 1) ~~~~~~~~~~~~~~~~ ^~~~~~~~~~~~~~~~~~~~~ sys/dev/usb/usb.h:179:53: note: expanded from macro 'UT_READ_VENDOR_DEVICE' #define UT_READ_VENDOR_DEVICE (UT_READ | UT_VENDOR | UT_DEVICE) ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~ This is because UT_READ is 0x80, so the int8_t argument is wrapped to a negative value. Fix this by using uint8_t instead. Reviewed by: imp, hselasky MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D7776	2016-09-04 16:59:35 +00:00
Mateusz Guzik	591df14528	cache: defer freeing entries until after the global lock is dropped This also defers vdrop for held vnodes. Glanced at by: kib	2016-09-04 16:52:14 +00:00
Bruce Evans	f776d19f07	Oops, the previous i386 version of e_fmodf.S and e_fmodl.S was actually the amd64 version.	2016-09-04 15:08:14 +00:00
Bruce Evans	5432c3bec3	Disconnect the "optimized" asm variants of cos(), sin() and tan() from the build on i386. Leave them in the source tree for regression tests. The asm functions were always much less accurate (by a factor of more than 10**18 in the worst case). They were faster on old CPUs. But with each new generation of CPUs they get relatively slower. The double precision C version's average advantage is about a factor of 2 on Haswell. The asm functions were already intentionally avoided in float and long double precision on i386 and in all precisions on amd64. Float precision and amd64 give larger advantages to the C version. The long double precision C code and compilers' understanding of long double precision are not so good, so the i387 is still slightly faster for long double precision, except for the unimportant subcase of huge args where the sub-optimal C code now somehow beats the i387 by about a factor of 2.	2016-09-04 14:12:19 +00:00
Mateusz Guzik	1c1c35c74e	fd: fix up fdeget_file It was supposed to return NULL if a fp is not installed. Facepalm-by: mjg	2016-09-04 13:31:57 +00:00
Bruce Evans	83e449a402	Add asm versions of fmod(), fmodf() and fmodl() on amd64. Add asm versions of fmodf() and fmodl() on i387. fmod is similar to remainder, and the C versions are 3 to 9 times slower than the asm versions on x86 for both, but we had the strange mixture of all 6 variants of remainder in asm and only 1 of 6 variants of fmod in asm.	2016-09-04 12:22:14 +00:00
Dag-Erling Smørgrav	e2d1500434	Upgrade to Unbound 1.5.9.	2016-09-04 12:17:57 +00:00
Bruce Evans	a4c138885e	Fix missing fmodl() on arches with 53-bit long doubles. PR: 199422, 211965 MFC after: 1 week	2016-09-04 12:01:32 +00:00
Mateusz Guzik	31977b420a	cache: manage negative entry list with a dedicated lock Since negative entries are managed with a LRU list, a hit requires a modification. Currently the code tries to upgrade the global lock if needed and is forced to retry the lookup if it fails. Provide a dedicated lock for use when the cache is only shared-locked. Reviewed by: kib MFC after: 1 week	2016-09-04 08:58:35 +00:00
Mateusz Guzik	b9042ae1bf	cache: put all negative entry management code into dedicated functions Reviewed by: kib MFC after: 1 week	2016-09-04 08:55:15 +00:00
Landon J. Fuller	eb83f2e1ea	bhndb(4): Fix probing of bhndb-attached bhnd_nvram devices. This fixes bhnd(4) nvram handling on devices that map SPROM CSRs via PCI configuration space. The probe method previously required that a bhnd(4) device be attached to the parent bridge; now that the bhnd_nvram device is always attached first, this unnecessary sanity check always failed. Approved by: adrian (mentor, implicit)	2016-09-04 01:47:21 +00:00
Landon J. Fuller	63fb0e8236	bhndb(4): Skip disabled cores when performing bridge configuration probing. On BCM4321 chipsets, both PCI and PCIe cores are included, with one of the cores potentially left floating. Since the PCI core appears first in the device table, and the PCI profiles appear first in the resource configuration tables, this resulted in incorrectly matching and using the PCI/v1 resource configuration on PCIe devices, rather than the correct PCIe/v1 profile. Approved by: adrian (mentor, implicit)	2016-09-04 01:43:54 +00:00
Landon J. Fuller	f32befd1a2	siba(4): Add missing bhnd_device/bhnd_device_quirk table terminator entries. This resulted in an over-read on siba chipsets that failed to match the existing entries. Approved by: adrian (mentor, implicit)	2016-09-04 01:25:46 +00:00
Landon J. Fuller	d9189ae727	Remove empty directories left by r299241, r302190, r304870, and r301410 Approved by: adrian (mentor, implicit)	2016-09-04 01:17:16 +00:00
Landon J. Fuller	111d7cb2e3	Migrate bhndb(4) to the new bhnd_erom API. Adds support for probing and initializing bhndb(4) bridge state using the bhnd_erom API, ensuring that full bridge configuration is available prior to actually attaching and enumerating the bhnd(4) child device, allowing us to safely allocate bus-level agent/device resources during bhnd(4) bus enumeration. - Add a bhnd_erom_probe() method usable by bhndb(4). This is an analogue to the existing bhnd_erom_probe_static() method, and allows the bhndb bridge to discover the best available erom parser class prior to newbus probing of its children. - Add support for supplying identification hints when probing erom devices. This is required on early EXTIF-only chipsets, where chip identification registers are not available. - Migrate bhndb over to the new bhnd_erom API, using bhnd_core_info records rather than bridged bhnd(4) device_t references to determine the bridged chipsets' capability/bridge configuration. - The bhndb parent (e.g. if_bwn) is now required to supply a hardware priority table to the bridge. The default table is currently sufficient for our supported devices. - Drop the two-pass attach approach we used for compatibility with bhndb(4) in the bhnd(4) bus drivers, and instead perform bus enumeration immediately, and allocate bridged per-child bus-level resources during that enumeration. Approved by: adrian (mentor) Differential Revision: https://reviews.freebsd.org/D7768	2016-09-04 00:58:19 +00:00
Mark Johnston	3da0f3c9ae	Micro-optimize sleepq_signal(). Lift a comparison out of the loop that finds the highest-priority thread on the queue. MFC after: 1 week	2016-09-04 00:29:48 +00:00
Mark Johnston	dd9cb6da0b	Respect the caller's hints when performing swap readahead. The pager getpages interface allows the caller to bound the number of readahead and readbehind pages, and vm_fault_hold() makes use of this feature. These bounds were ignored after r305056, causing the swap pager to potentially page in more than the specified number of pages. Reported and reviewed by: alc X-MFC with: r305056	2016-09-04 00:25:49 +00:00
Landon J. Fuller	664a749708	Implement a generic bhnd(4) device enumeration table API. This defines a new bhnd_erom_if API, providing a common interface to device enumeration on siba(4) and bcma(4) devices, for use both in the bhndb bridge and SoC early boot contexts, and migrates mips/broadcom over to the new API. This also replaces the previous adhoc device enumeration support implemented for mips/broadcom. Migration of bhndb to the new API will be implemented in a follow-up commit. - Defined new bhnd_erom_if interface for bhnd(4) device enumeration, along with bcma(4) and siba(4)-specific implementations. - Fixed a minor bug in bhndb that logged an error when we attempted to map the full siba(4) bus space (18000000-17FFFFFF) in the siba EROM parser. - Reverted use of the resource's start address as the ChipCommon enum_addr in bhnd_read_chipid(). When called from bhndb, this address is found within the host address space, resulting in an invalid bridged enum_addr. - Added support for falling back on standard bus_activate_resource() in bhnd_bus_generic_activate_resource(), enabling allocation of the bhnd_erom's bhnd_resource directly from a nexus-attached bhnd(4) device. - Removed BHND_BUS_GET_CORE_TABLE(); it has been replaced by the erom API. - Added support for statically initializing bhnd_erom instances, for use prior to malloc availability. The statically allocated buffer size is verified both at runtime, and via a compile-time assertion (see BHND_EROM_STATIC_BYTES). - bhnd_erom classes are registered within a module via a linker set, allowing mips/broadcom to probe available EROM parser instances without creating a strong reference to bcma/siba-specific symbols. - Migrated mips/broadcom to bhnd_erom_if, replacing the previous MIPS-specific device enumeration implementation. Approved by: adrian (mentor) Differential Revision: https://reviews.freebsd.org/D7748	2016-09-03 23:57:17 +00:00
Andrey A. Chernov	cd3912b6be	The bug: $ echo x | awk '/[[:cntrl:]]/' x The NUL character in cntrl class truncates the pattern, and an empty pattern matches anything. The patch skips NUL as a quick fix. PR: 195792 Submitted by: kdrakehp@zoho.com Approved by: bwk@cs.princeton.edu (the author) MFC after: 3 days	2016-09-03 23:04:56 +00:00
Mark Johnston	d96700a6da	Remove redefinitions of some kernel types from mbuf.d. These override the kernel's definitions and do not match in some cases, which can break scripts that use these types. With r305055, dtrace is able to trace fields of struct mbuf's anonymous structs and unions, so there is no need to redefine types already defined in CTF. MFC after: 3 days	2016-09-03 20:43:59 +00:00
Mark Johnston	dbbaf04f1e	Remove support for idle page zeroing. Idle page zeroing has been disabled by default on all architectures since r170816 and has some bugs that make it seemingly unusable. Specifically, the idle-priority pagezero thread exacerbates contention for the free page lock, and yields the CPU without releasing it in non-preemptive kernels. The pagezero thread also does not behave correctly when superpage reservations are enabled: its target is a function of v_free_count, which includes reserved-but-free pages, but it is only able to zero pages belonging to the physical memory allocator. Reviewed by: alc, imp, kib Differential Revision: https://reviews.freebsd.org/D7714	2016-09-03 20:38:13 +00:00
Dimitry Andric	2db7b9f259	With clang 3.9.0, compiling cxgb results in the following warning: sys/dev/cxgb/cxgb_sge.c:2873:44: error: implicit conversion from 'int' to 'char' changes value from 128 to -128 [-Werror,-Wconstant-conversion] *mtod(m, char *) = CPL_ASYNC_NOTIF; ~ ^~~~~~~~~~~~~~~ This is because CPL_ASYNC_NOTIF is 0x80, so the plain char argument is wrapped to a negative value. Fix this by using uint8_t instead. Reviewed by: np MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D7772	2016-09-03 19:01:11 +00:00
Navdeep Parhar	5a63431265	Use correct CTR<n> variant.	2016-09-03 18:54:26 +00:00
Enji Cooper	74191470e8	Update contrib/netbsd-tests with new content from NetBSD This updates the snapshot from 09/30/2014 to 08/11/2016 This brings in a number of new testcases from upstream, most notably: - bin/cat - lib/libc - lib/msun - lib/libthr - usr.bin/sort lib/libc/tests/stdio/open_memstream_test.c was moved to lib/libc/tests/stdio/open_memstream2_test.c to accommodate the new open_memstream test from NetBSD. MFC after: 2 months Tested on: amd64 (VMware fusion VM; various bare metal platforms); i386 (VMware fusion VM); make tinderbox Sponsored by: EMC / Isilon Storage Division	2016-09-03 18:11:48 +00:00
Enji Cooper	3919472360	Skip testcases 9/10 if jail(8) isn't installed These testcases require jail support MFC after: 1 week Sponsored by: EMC / Isilon Storage Division	2016-09-03 17:59:46 +00:00
Enji Cooper	46f4fe1eb8	Add a missing "Bail out!" if zpool create fails This will make the exit info more meaningful if/when zpool create fails, and establishes parity with the other 2 zfs acl testcases (01, 03). MFC after: 3 days Sponsored by: EMC / Isilon Storage Division	2016-09-03 17:31:13 +00:00
Andrew Turner	6f0c70d446	Explicitly include all .rodata.* sections in the kernel .rodata. This helps link the kernel with lld as it will then put all these into a single .rodata section. MFC after: 1 week Sponsored by: ABT Systems Ltd	2016-09-03 17:23:24 +00:00
Jared McNeill	1403e695b7	Use the root key in the Security ID EFUSE (when valid) to generate a MAC address instead of creating a random one each boot.	2016-09-03 15:28:09 +00:00
Warner Losh	155d3e43ff	Don't use -N to set the OMAGIC with data and text writeable and data not page aligned. To do this, use the ld script gnu ld installs on my system. This is imperfect: LDFLAGS_BIN and LD_FLAGS_BIN describe different things. The loader script could be better named and take into account other architectures. And having two different mechanisms to do basically the same thing needs study. However, it's blocking forward progress on lld, so I'll work in parallel to sort these out. Differential Revision: https://reviews.freebsd.org/D7409 Reviewed by: emaste	2016-09-03 15:26:28 +00:00
Jared McNeill	d69d5ab04f	Add support for Allwinner A64 thermal sensors.	2016-09-03 15:26:00 +00:00
Jared McNeill	1738b325d0	Add cpu-supply xref to cpu@0	2016-09-03 15:24:30 +00:00
Jared McNeill	b18b1b0015	Add SID, THS, and CPU operating points.	2016-09-03 15:23:59 +00:00
Jared McNeill	0503b90dde	Add support for reading root key on A83T/A64.	2016-09-03 15:22:50 +00:00
Dag-Erling Smørgrav	a6533d8899	import unbound 1.5.9	2016-09-03 15:08:13 +00:00
Dimitry Andric	3128fa9a5a	With clang 3.9.0, compiling ppbus(4) results in the following warnings: sys/dev/ppbus/ppb_1284.c:296:46: error: implicit conversion from 'int' to 'char' changes value from 144 to -112 [-Werror,-Wconstant-conversion] if ((error = do_peripheral_wait(bus, SELECT | nBUSY, 0))) { ~~~~~~~~~~~~~~~~~~ ~~~~~~~^~~~~~~ sys/dev/ppbus/ppb_1284.c:785:48: error: implicit conversion from 'int' to 'char' changes value from 240 to -16 [-Werror,-Wconstant-conversion] if (do_1284_wait(bus, nACK | SELECT | PERROR | nBUSY, ~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~ sys/dev/ppbus/ppb_1284.c:786:29: error: implicit conversion from 'int' to 'char' changes value from 240 to -16 [-Werror,-Wconstant-conversion] nACK | SELECT | PERROR | nBUSY)) { ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~ This is because nBUSY is 0x80, so the plain char argument is wrapped to a negative value. Fix this in a minimal fashion, by using uint8_t in a few places. Reviewed by: emaste MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D7771	2016-09-03 13:48:44 +00:00
Dimitry Andric	402e32a8af	Define drmP.h's __OS_HAS_AGP and __OS_HAS_MTRR macros in a defined and portable way. Reviewed by: dumbbell MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D7770	2016-09-03 13:33:28 +00:00
Ed Maste	8aa5c6cfeb	remove CONSTRUCTORS from MIPS uboot linker script The linker script CONSTRUCTORS keyword is only meaningful "when linking object file formats which do not support arbitrary sections, such as ECOFF and XCOFF"[1] and is ignored for other object file formats. LLVM's lld does not yet accept (and ignore) CONSTRUCTORS, so just remove CONSTRUCTORS from the linker script as it has no effect. [1] https://sourceware.org/binutils/docs/ld/Output-Section-Keywords.html	2016-09-03 13:01:37 +00:00
Alexander Motin	9b9258a12a	Missed FreeBSD-specific piece of r305338.	2016-09-03 11:17:33 +00:00
Alexander Motin	d7e781bda3	MFC r305337: 7004 dmu_tx_hold_zap() does dnode_hold() 7x on same object Using a benchmark which has 32 threads creating 2 million files in the same directory, on a machine with 16 CPU cores, I observed poor performance. I noticed that dmu_tx_hold_zap() was using about 30% of all CPU, and doing dnode_hold() 7 times on the same object (the ZAP object that is being held). dmu_tx_hold_zap() keeps a hold on the dnode_t the entire time it is running, in dmu_tx_hold_t:txh_dnode, so it would be nice to use the dnode_t that we already have in hand, rather than repeatedly calling dnode_hold(). To do this, we need to pass the dnode_t down through all the intermediate calls that dmu_tx_hold_zap() makes, making these routines take the dnode_t* rather than an objset_t* and a uint64_t object number. In particular, the following routines will need to have analogous *_by_dnode() variants created: dmu_buf_hold_noread() dmu_buf_hold() zap_lookup() zap_lookup_norm() zap_count_write() zap_lockdir() zap_count_write() This can improve performance on the benchmark described above by 100%, from 30,000 file creations per second to 60,000. (This improvement is on top of that provided by working around the object allocation issue. Peak performance of ~90,000 creations per second was observed with 8 CPUs; adding CPUs past that decreased performance due to lock contention.) The CPU used by dmu_tx_hold_zap() was reduced by 88%, from 340 CPU-seconds to 40 CPU-seconds. Sponsored by: Intel Corp. Closes #109 Reviewed by: Steve Gonczi <steve.gonczi@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Ned Bass <bass6@llnl.gov> Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Author: Matthew Ahrens <mahrens@delphix.com> openzfs/openzfs@d3e523d489	2016-09-03 11:00:29 +00:00
Alexander Motin	4ad4b70e77	MFV r305336: 7247 zfs receive of deduplicated stream fails This resolves two 'zfs recv' issues. First, when receiving into an existing filesystem, a snapshot created during the receive process is not added to the guid->dataset map for the stream, resulting in failed lookups for deduped streams when a WRITE_BYREF record refers to a snapshot received earlier in the stream. Second, the newly created snapshot was also not set properly, referencing the snapshot before the new receiving dataset rather than the existing filesystem. Closes #159 Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Author: Chris Williamson <chris.williamson@delphix.com> openzfs/openzfs@b09697c8c1	2016-09-03 10:59:05 +00:00
Alexander Motin	070da3f779	MFV r305335: 7003 zap_lockdir() should tag hold zap_lockdir() / zap_unlockdir() should take a "void *tag" argument which tags the hold on the zap. This will help diagnose programming errors which misuse the hold on the ZAP. Sponsored by: Intel Corp. Closes #108 Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Steve Gonczi <steve.gonczi@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Author: Matthew Ahrens <mahrens@delphix.com> openzfs/openzfs@0780b3eab5	2016-09-03 10:58:14 +00:00
Alexander Motin	72a9a6ded9	7004 dmu_tx_hold_zap() does dnode_hold() 7x on same object Using a benchmark which has 32 threads creating 2 million files in the same directory, on a machine with 16 CPU cores, I observed poor performance. I noticed that dmu_tx_hold_zap() was using about 30% of all CPU, and doing dnode_hold() 7 times on the same object (the ZAP object that is being held). dmu_tx_hold_zap() keeps a hold on the dnode_t the entire time it is running, in dmu_tx_hold_t:txh_dnode, so it would be nice to use the dnode_t that we already have in hand, rather than repeatedly calling dnode_hold(). To do this, we need to pass the dnode_t down through all the intermediate calls that dmu_tx_hold_zap() makes, making these routines take the dnode_t* rather than an objset_t* and a uint64_t object number. In particular, the following routines will need to have analogous *_by_dnode() variants created: dmu_buf_hold_noread() dmu_buf_hold() zap_lookup() zap_lookup_norm() zap_count_write() zap_lockdir() zap_count_write() This can improve performance on the benchmark described above by 100%, from 30,000 file creations per second to 60,000. (This improvement is on top of that provided by working around the object allocation issue. Peak performance of ~90,000 creations per second was observed with 8 CPUs; adding CPUs past that decreased performance due to lock contention.) The CPU used by dmu_tx_hold_zap() was reduced by 88%, from 340 CPU-seconds to 40 CPU-seconds. Sponsored by: Intel Corp. Closes #109 Reviewed by: Steve Gonczi <steve.gonczi@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Ned Bass <bass6@llnl.gov> Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Author: Matthew Ahrens <mahrens@delphix.com> openzfs/openzfs@d3e523d489	2016-09-03 10:54:56 +00:00
Alexander Motin	6ee1596f02	7247 zfs receive of deduplicated stream fails This resolves two 'zfs recv' issues. First, when receiving into an existing filesystem, a snapshot created during the receive process is not added to the guid->dataset map for the stream, resulting in failed lookups for deduped streams when a WRITE_BYREF record refers to a snapshot received earlier in the stream. Second, the newly created snapshot was also not set properly, referencing the snapshot before the new receiving dataset rather than the existing filesystem. Closes #159 Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Author: Chris Williamson <chris.williamson@delphix.com> openzfs/openzfs@b09697c8c1	2016-09-03 10:50:43 +00:00
Alexander Motin	7b84e6dc6f	7003 zap_lockdir() should tag hold zap_lockdir() / zap_unlockdir() should take a "void *tag" argument which tags the hold on the zap. This will help diagnose programming errors which misuse the hold on the ZAP. Sponsored by: Intel Corp. Closes #108 Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Steve Gonczi <steve.gonczi@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Author: Matthew Ahrens <mahrens@delphix.com> openzfs/openzfs@0780b3eab5	2016-09-03 10:48:48 +00:00
Alexander Motin	d3ec2cdb4a	MFV r304157: 7230 add assertions to dmu_send_impl() to verify that stream includes BEGIN and END records illumos/illumos-gate@12b90ee2d3 https://github.com/illumos/illumos-gate/commit/12b90ee2d3b10689fc45f4930d2392f5fe1d9cfa https://www.illumos.org/issues/7230 A test failure occurred where a send stream had only a BEGIN record. This should not be possible if the send returns without error. Prevented this from happening in the future by adding an assertion to dmu_send_impl() to verify that if the function returns 0 (success) both a BEGIN and END record are present. Did this by adding flags to dmu_sendarg_t (indicating whether BEGIN or END records sent), having dump_record() set flags appropriately, adding VERIFY statement to dmu_send_impl(). Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Matt Krantz <matt.krantz@delphix.com>	2016-09-03 10:10:58 +00:00
Alexander Motin	7aafc9d4c8	MFV r304156: 7235 remove unused func dsl_dataset_set_blkptr illumos/illumos-gate@bd56f80007 https://github.com/illumos/illumos-gate/commit/bd56f80007857b960e0981ed0797ad8ec844a96b https://www.illumos.org/issues/7235 The function dsl_dataset_set_blkptr() is unused. We should remove it. Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Alex Reece <alex@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Matthew Ahrens <mahrens@delphix.com>	2016-09-03 10:09:23 +00:00
Alexander Motin	929d0128f7	MFV r304159: 7277 zdb should be able to print zfs_dbgmsg's illumos/illumos-gate@29bdd2f916 https://github.com/illumos/illumos-gate/commit/29bdd2f916366ece37c4748bca6b3d61f57a223b https://www.illumos.org/issues/7277 ztest always prints the debug messages (zfs_dbgmsg()) by calling zfs_dbgmsg_print(). We should add a flag to zdb to make it do this as well before exiting. Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Pavel Zakharov <pavel.zakharov@delphix.com>	2016-09-03 10:07:46 +00:00
Alexander Motin	c9fa25c110	MFV r304155: 7090 zfs should improve allocation order and throttle allocations illumos/illumos-gate@0f7643c737 https://github.com/illumos/illumos-gate/commit/0f7643c7376dd69a08acbfc9d1d7d548b10c846a https://www.illumos.org/issues/7090 When write I/Os are issued, they are issued in block order but the ZIO pipeline will drive them asynchronously through the allocation stage which can result in blocks being allocated out-of-order. It would be nice to preserve as much of the logical order as possible. In addition, the allocations are equally scattered across all top-level VDEVs but not all top-level VDEVs are created equally. The pipeline should be able to detect devices that are more capable of handling allocations and should allocate more blocks to those devices. This allows for dynamic allocation distribution when devices are imbalanced as fuller devices will tend to be slower than empty devices. The change includes a new pool-wide allocation queue which would throttle and order allocations in the ZIO pipeline. The queue would be ordered by issued time and offset and would provide an initial amount of allocation of work to each top-level vdev. The allocation logic utilizes a reservation system to reserve allocations that will be performed by the allocator. Once an allocation is successfully completed it's scheduled on a given top-level vdev. Each top-level vdev maintains a maximum number of allocations that it can handle (mg_alloc_queue_depth). The pool-wide reserved allocations (top-levels * mg_alloc_queue_depth) are distributed across the top-level vdevs metaslab groups and round robin across all eligible metaslab groups to distribute the work. As top-levels complete their work, they receive additional work from the pool-wide allocation queue until the allocation queue is emptied. Reviewed by: Adam Leventhal <ahl@delphix.com> Reviewed by: Alex Reece <alex@delphix.com> Reviewed by: Christopher Siden <christopher.siden@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Sebastien Roy <sebastien.roy@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: George Wilson <george.wilson@delphix.com>	2016-09-03 10:04:37 +00:00
Alexander Motin	5535f02daf	MFV r303081: 7163 ztest failures due to excess error injection illumos/illumos-gate@f34284d835 https://github.com/illumos/illumos-gate/commit/f34284d835bc555f987c1310df46c034c3101155 https://www.illumos.org/issues/7163 Running zloop from zfs-precommit hit this assertion: *panicstr/s 0xfffffd7fd7419370: assertion failed for thread 0xfffffd7fe29ed240, thread-id 577: parent != NULL, file ../../../uts/common/fs/zfs/dbuf.c, line 1827 $c libc.so.1`_lwp_kill+0xa() libc.so.1`_assfail+0x182(fffffd7ffb1c29fa, fffffd7ffb1cc028, 723) libc.so.1`assfail+0x19(fffffd7ffb1c29fa, fffffd7ffb1cc028, 723) libzpool.so.1`dbuf_dirty+0xc69(10e3bc10, 3601700) libzpool.so.1`dbuf_dirty+0x61e(10c73640, 3601700) libzpool.so.1`dbuf_dirty+0x61e(10e28280, 3601700) libzpool.so.1`dmu_buf_will_fill+0x64(10e28280, 3601700) libzpool.so.1`dmu_write+0x1b6(2c7e640, d, 400000002e000000, 200, 3717b40, 3601700) ztest_replay_write+0x568(4950d0, 3717a80, 0) ztest_write+0x125(4950d0, d, 400000002e000000, 200, 413f000) ztest_io+0x1bb(4950d0, d, 400000002e000000) ztest_dmu_write_parallel+0xaa(4950d0, 6) ztest_execute+0x83(1, 420c98, 6) ztest_thread+0xf4(6) libc.so.1`_thrp_setup+0x8a(fffffd7fe29ed240) libc.so.1`_lwp_start() This is another manifestation of ECKSUM in ztest: The lowest level ancestor that’s in memory is the L8 (topmost). The L7 ancestor is blkid 0x10: ::dbufs -O 0x2c7e640 -o d -l 7 |::dbuf addr object lvl blkid holds os 600be50 d 7 4 1 ztest/ds_6 719d880 d 7 0 4 ztest/ds_6 Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Matthew Ahrens <mahrens@delphix.com>	2016-09-03 08:48:51 +00:00
Alexander Motin	1f0bf00253	MFV r303080: 6451 ztest fails due to checksum errors illumos/illumos-gate@f9eb9fdf19 https://github.com/illumos/illumos-gate/commit/f9eb9fdf196b6ed476e4ffc69cecd8b0da3cb7e7 https://www.illumos.org/issues/6451 Sometimes ztest fails because zdb detects checksum errors. e.g.: Traversing all blocks to verify checksums and verify nothing leaked ... zdb_blkptr_cb: Got error 50 reading <71, 47, 0, 8000160> DVA0=<0:1cc2000:180000> [L0 other uint64[]] sha256 uncompressed LE contiguous unique single size=100000L/100000P birth=271L/271P fill=1 cksum=c5a3e27d1ed0f894:843bca3a5473c4bf:f76a19b6830a2e4:91292591613a12bf -- skipping zdb_blkptr_cb: Got error 50 reading <71, 47, 0, 800000180> DVA0=<0:ce16800:180000> [L0 other uint64[]] sha256 uncompressed LE contiguous unique single size=100000L/100000P birth=840L/840P fill=1 cksum=5d018f3d061e17f3:6d1584784587bf63:2805a74a0ce37369:ba68a214806c7e75 -- skipping zdb_blkptr_cb: Got error 50 reading <71, 47, 0, 1000000360> DVA0=<0:10d37400:180000> [L0 other uint64[]] sha256 uncompressed LE contiguous unique single size=100000L/100000P birth=904L/904P fill=1 cksum=fa1e11d4138bd14b:86c9488c444473e3:f31e43c72e72e46b:e3446472d1174dba -- skipping zdb_blkptr_cb: Got error 50 reading <71, 47, 0, 400000002c0> DVA0=<0:127ef400:180000> [L0 other uint64[]] sha256 uncompressed LE contiguous dedup single size=100000L/100000P birth=549L/549P fill=1 cksum=30e14955ebf13522:66dc2ff8067e6810:4607e750abb9d3b3:6582b8af909fcb58 -- skipping zdb_blkptr_cb: Got error 50 reading <657, 5, 0, 1c0> DVA0=<0:1a180400:180000> [L0 other uint64[]] fletcher4 uncompressed LE contiguous unique single size=100000L/100000P birth=1091L/1091P fill=1 cksum=a6cf1e50:29b3bd01c57e5:36779b914035db9a:db61cdcf6bec56f0 -- skipping The problem is that ztest_fault_inject() can inject multiple faults into the same block. It is designed such that it can inject errors on all leafs of a RAID-Z or mirror, but for a given range of offsets, it will only inject errors Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Jorgen Lundman <lundman@lundman.net> Approved by: Dan McDonald <danmcd@omniti.com> Author: Matthew Ahrens <mahrens@delphix.com>	2016-09-03 08:47:46 +00:00
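
Three of the clang 3.9.0 fixes listed above (uplcom, cxgb, ppbus) share one cause: a constant with bit 0x80 set is passed through a signed 8-bit type. The following is a minimal sketch of that narrowing, not code from the tree. `DEMO_UT_READ`, `DEMO_UT_VENDOR`, `narrow_signed()` and `narrow_unsigned()` are hypothetical names, and 0x40 is an assumed value chosen so that the OR gives the 192 quoted in the uplcom warning.

```c
#include <stdint.h>

/* 0x80 is the UT_READ value quoted in the uplcom commit message. */
#define DEMO_UT_READ	0x80
/* Assumed value, picked so that the OR below gives 192 as in the warning. */
#define DEMO_UT_VENDOR	0x40

/*
 * Store the value in a signed 8-bit variable, as the old code did.
 * Converting an out-of-range int to int8_t is implementation-defined
 * before C23; on the usual two's-complement targets 192 becomes -64.
 */
int
narrow_signed(int value)
{
	int8_t req = (int8_t)value;

	return (req);
}

/* Store the value in an unsigned 8-bit variable, as the fixes do. */
int
narrow_unsigned(int value)
{
	uint8_t req = (uint8_t)value;

	return (req);
}
```

Called with `DEMO_UT_READ | DEMO_UT_VENDOR`, `narrow_unsigned()` keeps 192, while `narrow_signed()` typically returns -64, which is the change in value that -Wconstant-conversion reports.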

215612 Commits
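
The awk fix above (cd3912b6be) skips NUL when handling the `[:cntrl:]` class. The following sketches that failure mode under the assumption that the class is expanded into a NUL-terminated C string. `expand_cntrl()` is a hypothetical helper written for illustration, not the function in awk's source.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Expand the control characters in 0..127 into buf as a C string and
 * return the length that strlen() sees. When skip_nul is zero, the NUL
 * byte (itself a control character) is stored first and ends the
 * string early, leaving an empty pattern.
 */
size_t
expand_cntrl(char *buf, size_t bufsize, int skip_nul)
{
	size_t n = 0;
	int c;

	if (bufsize == 0)
		return (0);
	for (c = 0; c < 128 && n + 1 < bufsize; c++) {
		if (!iscntrl(c))
			continue;
		if (c == 0 && skip_nul)
			continue;	/* the quick fix: leave NUL out */
		buf[n++] = (char)c;
	}
	buf[n] = '\0';
	return (strlen(buf));
}
```

In the default C locale with a 64-byte buffer, the call without the skip reports length 0 (the empty pattern that matches anything), while the call with the skip reports a non-empty string.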