freebsd-dev

Author	SHA1	Message	Date
Andriy Gapon	de2cb430ad	another rework of getzfsvfs / getzfsvfs_impl code This change is designed to account for yet another difference between illumos and FreeBSD VFS. In FreeBSD a filesystem driver is supposed to clean up mnt_data in its VFS_UNMOUNT method because it's the last call into the driver before a struct mount object is destroyed. The VFS drains all references to the object before destroying it, but for the driver it's already as good as gone. In contrast, illumos VFS provides another method, VFS_FREEVFS, that is called when all references are drained. So, the driver can keep its data after VFS_UNMOUNT and clean it up in VFS_FREEVFS after all references are gone. This is what ZFS does on illumos. So there a reference to a filesystem is sufficient to guarantee that the ZFS specific data, aka zfsvfs_t, stays around (even if the filesystem gets unmounted). In FreeBSD we need to vfs_busy the filesystem to get the same guarantee. vfs_ref guarantees only that the struct mount is kept. The following rules should be observed in getzfsvfs / getzfsvfs_impl on FreeBSD: - if we need access to zfsvfs_t then we must use vfs_busy - if only we need to access struct mount (aka vfs_t), then vfs_ref is enough - when illumos code actually needs only the vfs_t, they still can pass the zfsvfs_t and get the vfs_t from it; that can work in FreeBSD if the filesystem is busied, but when it's just referenced then we have to pass the vfs_t explicitly - we cannot call vfs_busy while holding a dataset because that creates a LOR with dp_config_rwlock As a result: - getzfsvfs_impl now only references the filesystem, same as in illumos, but unlike illumos it has to return the vfs_t - the consumers are updated to account for the change - getzfsvfs busies the filesystem (and drops the reference from getzfsvfs_impl) Also, zfs_unmount_snap() now gets a busied a filesystem, references it and then unbusies it essentially reverting actions done in getzfsvfs. This is needed because the code may perform some checks that require the zfsvfs_t. So, those are done before the unbusying. MFC after: 2 weeks	2018-02-22 13:06:27 +00:00
Warner Losh	07e5967a22	Revert r329814 as well. It should have been in r329819.	2018-02-22 11:51:50 +00:00
Andriy Gapon	8d69fe5cc8	followup to r329556, completely remove the covered vnode assert vrele() acquires the vnode lock only if the hold count drops to zero. In other scenarios it needs only the interlock. So, zfsctl_snapdir_lookup() can race with vfs_mount_destroy() -> vrele() such that the lookup adds a new reference and then vrele() drops the mountpoint's reference and only then we check the reference count. It would be just one in this case. In fact, the assert should have been removed in r323483 when the code learned how to deal with the uncovered vnode. PR: 225795 MFC after: 4 days X-MFC with: r329556	2018-02-22 11:41:00 +00:00
Warner Losh	0028abe633	Backout r329818, r329816 and r329815. These aren't the commits I thought I was testing prior to commit. Revert until I can sort out what happened and fix it.	2018-02-22 11:18:33 +00:00
Warner Losh	91acaad987	Fix typo in last commit after last rebase before commit...	2018-02-22 10:55:23 +00:00
Warner Losh	4d87e27125	Combine BIO_DELETE requests for nda devices Now that we're queueing BIO_DELETE requests in the CAM I/O scheduler, it make sense to try to combine as many as possible into a single request to send down to hardware. Hopefully, lots of larger requests like this are better than lots of individual transactions. Note for future: need to limit based on total size of the trim request. Should also collapse adjacent ranges where possible to increase the size of the max payload. Sponsored by: Netflix	2018-02-22 05:44:00 +00:00
Warner Losh	c5fe3ae9b8	Introduce capacity flags for periphs Introduce flags word to describe the capacities of the peripheral. First bit will describe if the periph driver allows multiple outstanding TRIMS to be active in a device. Modify the I/O scheduler so that the nda driver can queue trims for a while after the first one arrives. We'll queue until we see a I/O scheduler tick, then we'll schedule as many TRIMs as allowed by other factors (currently this is slocts in the NVMe controller). This mariginally helps the read latency issues we see with reads, but sets the stage for the nda driver to do TRIM collapsing like the da and ada drivers do today. Sponsored by: Netflix	2018-02-22 05:43:55 +00:00
Warner Losh	c9878d6d63	Note when we tick. To help implement a policy of 'queue all trims until next I/O sched tick' policy to help coalesce them, note when we tick so we can do something special on the first call after the tick to get more work. Sponsored by: Netflix	2018-02-22 05:43:50 +00:00
Warner Losh	f2b9885036	Wrap an extra long line This debugging line is too big for even my largest xterm. wrap it at about 80 columns. Sponsored by: Netflix	2018-02-22 05:43:45 +00:00
Warner Losh	97f8aa050e	Don't sort TRIMs. While the code for ada and da both assume that the trim list is ordered when doing the coaleascing the TRIMs, it turns out that creating the sorted list uses more resources than are saved by having slightly fewer trims sent to the device. Sponsored by: Netflix	2018-02-22 05:43:20 +00:00
Alexander Motin	dd9ceab333	MFV r329803: 9080 recursive enter of vdev_indirect_rwlock from vdev_indirect_remap() illumos/illumos-gate@bdfded42e6 A scenario came up where a callback executed by vdev_indirect_remap() on a vdev, calls vdev_indirect_remap() on the same vdev and tries to reacquire vdev_indirect_rwlock that was already acquired from the first call to vdev_indirect_remap(). The specific scenario, is that we want to remap a block pointer that is snapshoted but its dataset's remap_deadlist is not cached. So in order to add it we issue a read through a vdev_indirect_remap() on the same vdev, which brings up the aforementioned issue. Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Approved by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org> Author: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>	2018-02-22 03:54:59 +00:00
Alexander Motin	064827be34	MFV r329799, r329800: 9079 race condition in starting and ending condesing thread for indirect vdevs illumos/illumos-gate@667ec66f1b The timeline of the race condition is the following: [1] Thread A is about to finish condesing the first vdev in spa_condense_indirect_thread(), so it calls the spa_condense_indirect_complete_sync() sync task which sets the spa_condensing_indirect field to NULL. Waiting for the sync task to finish, thread A sleeps until the txg is done. When this happens, thread A will acquire spa_async_lock and set spa_condense_thread to NULL. [2] While thread A waits for the txg to finish, thread B which is running spa_sync() checks whether it should condense the second vdev in vdev_indirect_should_condense() by checking the spa_condensing_indirect field which was set to NULL by spa_condense_indirect_thread() from thread A. So it goes on and tries to spawn a new condensing thread in spa_condense_indirect_start_sync() and the aforementioned assertions fails because thread A has not set spa_condense_thread to NULL (which is basically the last thing it does before returning). The main issue here is that we rely on both spa_condensing_indirect and spa_condense_thread to signify whether a condensing thread is running. Ideally we would only use one throughout the codebase. In addition, for managing spa_condense_thread we currently use spa_async_lock which basically tights condensing to scrubing when it comes to pausing and resuming those actions during spa export. Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Approved by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org> Author: Serapheim Dimitropoulos <serapheim@delphix.com>	2018-02-22 03:49:06 +00:00
Ed Maste	a0409b6f36	Remove accidental vim droppings Reported by: cy	2018-02-22 03:37:01 +00:00
Alexander Motin	1ea10a60f9	MFV r329793, r329795: 9075 Improve ZFS pool import/load process and corrupted pool recovery illumos/illumos-gate@6f7938128a Some work has been done lately to improve the debugability of the ZFS pool load (and import) process. This includes: https://www.illumos.org/issues/7638: Refactor spa_load_impl into several functions https://www.illumos.org/issues/8961: SPA load/import should tell us why it failed https://www.illumos.org/issues/7277: zdb should be able to print zfs_dbgmsg's To iterate on top of that, there's a few changes that were made to make the import process more resilient and crash free. One of the first tasks during the pool load process is to parse a config provided from userland that describes what devices the pool is composed of. A vdev tree is generated from that config, and then all the vdevs are opened. The Meta Object Set (MOS) of the pool is accessed, and several metadata objects that are necessary to load the pool are read. The exact configuration of the pool is also stored inside the MOS. Since the configuration provided from userland is external and might not accurately describe the vdev tree of the pool at the txg that is being loaded, it cannot be relied upon to safely operate the pool. For that reason, the configuration in the MOS is read early on. In the past, the two configurations were compared together and if there was a mismatch then the load process was aborted and an error was returned. The latter was a good way to ensure a pool does not get corrupted, however it made the pool load process needlessly fragile in cases where the vdev configuration changed or the userland configuration was outdated. Since the MOS is stored in 3 copies, the configuration provided by userland doesn't have to be perfect in order to read its contents. Hence, a new approach has been adopted: The pool is first opened with the untrusted userland configuration just so that the real configuration can be read from the MOS. The trusted MOS configuration is then used to generate a new vdev tree and the pool is re-opened. When the pool is opened with an untrusted configuration, writes are disabled to avoid accidentally damaging it. During reads, some sanity checks are performed on block pointers to see if each DVA points to a known vdev; when the configuration is untrusted, instead of panicking the system if those checks fail we simply avoid issuing reads to the invalid DVAs. This new two-step pool load process now allows rewinding pools accross vdev tree changes such as device replacement, addition, etc. Loading a pool from an external config file in a clustering environment also becomes much safer now since the pool will import even if the config is outdated and didn't, for instance, register a recent device addition. With this code in place, it became relatively easy to implement a long-sought-after feature: the ability to import a pool with missing top level (i.e. non-redundant) devices. Note that since this almost guarantees some loss Of data, this feature is for now restricted to a read-only import. Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Andrew Stormont <andyjstormont@gmail.com> Approved by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org> Author: Pavel Zakharov <pavel.zakharov@delphix.com>	2018-02-22 03:15:35 +00:00
John Baldwin	642ffab5fc	Avoid grabbing locks when grabbing the vt(4) console for DDB. Trying to grab locks during cngrab() when entering the debugger is deadlock prone as all other CPUs are already halted (and thus unable to release locks) when cngrab() is invoked. One could instead use try-locks. However, the case that the try-lock fails still has to be handled. In addition, if the try-lock works it doesn't provide any greater ordering guarantees than is already provided by entering and exiting DDB. It is simpler to define a simpler path for the case that the try-lock would fail and always use that when entering DDB. Messing with timers, etc. when entering DDB is dubious even if the try-lock succeeds. This patch attempts to use the smallest possible set of operations to grab the vt(4) console when entering DDB without using any locks. Reviewed by: emaste Tested by: Matthew Macy MFC after: 1 week	2018-02-22 02:26:29 +00:00
Ed Maste	eae594f7d5	Correct proper nouns in the Linuxulator - Capitalize Linux - Spell FreeBSD out in full - Address some style(9) on changed lines Sponsored by: Turing Robotic Industries Inc.	2018-02-22 02:24:17 +00:00
John Baldwin	6619d9fb70	Bring in additional constants and message fields for TLS-related messages. Sponsored by: Chelsio Communications	2018-02-22 02:02:31 +00:00
Ed Maste	581bf7cbda	Use 'const int *' for sysentvec errno translation table This allows an sv_errtbl to be read-only .rodata. Sponsored by: Turing Robotic Industries Inc.	2018-02-22 01:59:59 +00:00
John Baldwin	125d42fe81	Move DDP PCB state into a helper structure. This consolidates all of the DDP state in one place. Also, the code has now been fixed to ensure that DDP state is only accessed for DDP connections. This should not be a functional change but makes it cleaner and easier to add state for other TOE socket modes in the future. MFC after: 1 month Sponsored by: Chelsio Communications	2018-02-22 01:50:30 +00:00
Alexander Motin	613b0d87da	8942 zfs promote .../%recv should be an error illumos/illumos-gate@add927f8c8 Reported on the ZFSonLinux https://github.com/zfsonlinux/zfs/issues/4843, fixed by https://github.com/zfsonlinux/zfs/pull/6339: If we are in the middle of an incremental zfs receive, the child .../%recv will exist. If you concurrently run zfs promote .../%recv, it will "work", but then zfs gets confused. For example, there's no obvious way to destroy the containing filesystem (because it is now a clone of its invisible child). Attempting to do this promote should be an error. We could fix this by having zfs_ioc_promote() check if zc_name contains a %, similar to zfs_ioc_rename(). Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Dan McDonald <danmcd@joyent.com> Author: loli10K <ezomori.nozomu@gmail.com>	2018-02-22 01:42:13 +00:00
Alexander Motin	a33ba3dbde	MFV r329776: 8477 Assertion failed in vdev_state_dirty(): spa_writeable(spa) illumos/illumos-gate@f4c1745bd6 Illumos 4080 allows "zpool clear" to work on readonly pools: i don't think this is the intended behaviour, we shouldn't be allowed to clear readonly pools. Probably. A fix is already in the ZFS on Linux repository to addess this issue: `92e43c1718` Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Dan McDonald <danmcd@joyent.com> Author: loli10K <ezomori.nozomu@gmail.com>	2018-02-22 01:00:46 +00:00
Alexander Motin	eea9be67e6	MFV r329774: 8408 dsl_props_set_sync_impl() does not handle nested nvlists correctly illumos/illumos-gate@85723e5eec When iterating over the input nvlist in dsl_props_set_sync_impl() when we don't preserve the nvpair name before looking up ZPROP_VALUE, so when we later go to process it nvpair_name() is always "value" instead of the actual property name. This results in a couple of bugs in the recv code: - received properties are not restored correctly when failing to receive an incremental send stream - received properties are not completely replaced by the new ones when successfully receiving an incremental send stream This was discovered on ZFS on Linux (fixed in `5f1346c299`) Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Dan McDonald <danmcd@joyent.com> Author: loli10K <ezomori.nozomu@gmail.com>	2018-02-22 00:55:25 +00:00
Alexander Motin	756595f675	MFV r329770: 9035 zfs: this statement may fall through illumos/illumos-gate@46ac8fdfc5 Reviewed by: Yuri Pankov <yuripv@yuripv.net> Reviewed by: Andy Fiddaman <omnios@citrus-it.co.uk> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Dan McDonald <danmcd@joyent.com> Author: Toomas Soome <tsoome@me.com>	2018-02-22 00:47:38 +00:00
Alexander Motin	502d18a8f1	MFV r329766: 8962 zdb should work on non-idle pools illumos/illumos-gate@e144c4e6c9 Currently `zdb` consistently fails to examine non-idle pools as it fails during the `spa_load()` process. The main problem seems to be that `spa_load_verify()` fails as can be seen below: $ sudo zdb -d -G dcenter zdb: can't open 'dcenter': I/O error ZFS_DBGMSG(zdb): spa_open_common: opening dcenter spa_load(dcenter): LOADING disk vdev '/dev/dsk/c4t11d0s0': best uberblock found for spa dcenter. txg 40824950 spa_load(dcenter): using uberblock with txg=40824950 spa_load(dcenter): UNLOADING spa_load(dcenter): RELOADING spa_load(dcenter): LOADING disk vdev '/dev/dsk/c3t10d0s0': best uberblock found for spa dcenter. txg 40824952 spa_load(dcenter): using uberblock with txg=40824952 spa_load(dcenter): FAILED: spa_load_verify failed [error=5] spa_load(dcenter): UNLOADING This change makes `spa_load_verify()` a dryrun when ran from `zdb`. This is done by creating a global flag in zfs and then setting it in `zdb`. Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Andy Stormont <astormont@racktopsystems.com> Approved by: Dan McDonald <danmcd@joyent.com> Author: Pavel Zakharov <pavel.zakharov@delphix.com>	2018-02-22 00:42:12 +00:00
John Baldwin	4f8666989a	Add two new ioctls to bhyve for batch register fetch/store operations. These are a convenience for bhyve's debug server to use a single ioctl for 'g' and 'G' rather than a loop of individual get/set ioctl requests. Reviewed by: grehan MFC after: 2 months Differential Revision: https://reviews.freebsd.org/D14074	2018-02-22 00:39:25 +00:00
Alexander Motin	fa607d017d	MFV r329762: 8961 SPA load/import should tell us why it failed illumos/illumos-gate@3ee8c80c74 When we fail to open or import a storage pool, we typically don't get any additional diagnostic information, just "no pool found" or "can not import". While there may be no additional user-consumable information, we should at least make this situation easier to debug/diagnose for developers and support. For example, we could start by using `zfs_dbgmsg()` to log each thing that we try when importing, and which things failed. E.g. "tried uberblock of txg X from label Y of device Z". Also, we could log each of the stages that we go through in `spa_load_impl()`. Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Andrew Stormont <andyjstormont@gmail.com> Approved by: Dan McDonald <danmcd@joyent.com> Author: Pavel Zakharov <pavel.zakharov@delphix.com>	2018-02-22 00:03:14 +00:00
Warner Losh	6ffe523f18	Minor formatting nits.	2018-02-21 23:49:35 +00:00
Alexander Motin	df331510ab	MFV r329760: 7638 Refactor spa_load_impl into several functions illumos/illumos-gate@1fd3785ff6 spa_load_impl has grown out of proportions. It is currently over 700 lines long and makes it very hard to follow or debug the import process even for experienced ZFS developers. The objective is to split it up in a series of well commented functions. Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Andrew Stormont <andyjstormont@gmail.com> Approved by: Dan McDonald <danmcd@joyent.com> Author: Pavel Zakharov <pavel.zakharov@delphix.com>	2018-02-21 23:38:30 +00:00
Alexander Motin	b17bfcde3d	9018 Replace kmem_cache_reap_now() with kmem_cache_reap_soon() illumos/illumos-gate@36a64e6284 To prevent kmem_cache reaping from blocking other system resources, turn kmem_cache_reap_now() (which blocks) into kmem_cache_reap_soon(). Callers to kmem_cache_reap_soon() should use kmem_cache_reap_active(), which exploits #9017's new taskq_empty(). Reviewed by: Bryan Cantrill <bryan@joyent.com> Reviewed by: Dan McDonald <danmcd@joyent.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Yuri Pankov <yuripv@yuripv.net> Author: Tim Kordas <tim.kordas@joyent.com> FreeBSD does not use taskqueue for kmem caches reaping, so this change is less dramatic then it is on Illumos, just limiting reaping to 1 time per second. It may possibly be improved later, if needed.	2018-02-21 23:15:06 +00:00
Alexander Motin	d208c07cf3	MFV r329753: 8809 libzpool should leverage work done in libfakekernel illumos/illumos-gate@f06dce2c1f Reviewed by: Sebastien Roy <sebastien.roy@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Gordon Ross <gordon.w.ross@gmail.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Andrew Stormont <astormont@racktopsystems.com>	2018-02-21 21:18:04 +00:00
Kirk McKusick	f686b1710a	Refactor fix in r329600 to do its check once in readsuper() rather than in the two places that call readsuper(). No semantic change intended. Reviewed by: kib	2018-02-21 19:56:19 +00:00
Ryan Stone	b3b6ff23e7	Allow route change requests to not specify the gateway. Only require a gateway to be specified on a route add request. On a route change request that does not specify the gateway, the gateway will remain the same. This allows changing other route parameters without having to re-specifying the gateway, like in "route change 10.0.0.0/8 -mtu 9000". Update the route(8) manpage to explicitly call out this usage as being supported. MFC after: 2 weeks Sponsored by: Dell EMC Isilon Reviewed By: eugen (rtsock.c change), rgrimes Differential Revision: https://reviews.freebsd.org/D14291	2018-02-21 19:13:23 +00:00
Stephen Hurd	81ad57b1b8	IFLIB: Make isc_magic unsigned The IFLIB_MAGIC macro is > INT_MAX, so isc_magic should be able to contain it. Reported by: jeb Sponsored by: Limelight Networks	2018-02-21 18:57:00 +00:00
Alexander Motin	832deba1e5	MFV r329736: 8969 Cannot boot from RAIDZ with parity > 1 illumos/illumos-gate@0fb055e81f At present it is possible to boot from a root pool that is on RAIDZ but not one that is on RAIDZ2 or RAIDZ3. This is because, at the time the pool version is checked to ensure support for dual/triple parity, the uberblock has not yet been loaded into the SPA and therefore the code determines that the pool version is too old and returns ENOTSUP. Reviewed by: Igor Kozhukhov <igor@dilos.org> Reviewed by: Andriy Gapon <avg@FreeBSD.org> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Andy Stormont <astormont@racktopsystems.com> Reviewed by: Toomas Soome <tsoome@me.com> Approved by: Gordon Ross <gwr@nexenta.com> Author: Andy Fiddaman <omnios@citrus-it.co.uk> FreeBSD already had this fixed, so this is just a diff reduction.	2018-02-21 18:12:19 +00:00
Alexander Motin	24433f00ea	MFV r329502: 7614 zfs device evacuation/removal illumos/illumos-gate@5cabbc6b49 https://www.illumos.org/issues/7614: This project allows top-level vdevs to be removed from the storage pool with “zpool remove”, reducing the total amount of storage in the pool. This operation copies all allocated regions of the device to be removed onto other devices, recording the mapping from old to new location. After the removal is complete, read and free operations to the removed (now “indirect”) vdev must be remapped and performed at the new location on disk. The indirect mapping table is kept in memory whenever the pool is loaded, so there is minimal performance overhead when doing operations on the indirect vdev. The size of the in-memory mapping table will be reduced when its entries become “obsolete” because they are no longer used by any block pointers in the pool. An entry becomes obsolete when all the blocks that use it are freed. An entry can also become obsolete when all the snapshots that reference it are deleted, and the block pointers that reference it have been “remapped” in all filesystems/zvols (and clones). Whenever an indirect block is written, all the block pointers in it will be “remapped” to their new (concrete) locations if possible. This process can be accelerated by using the “zfs remap” command to proactively rewrite all indirect blocks that reference indirect (removed) vdevs. Note that when a device is removed, we do not verify the checksum of the data that is copied. This makes the process much faster, but if it were used on redundant vdevs (i.e. mirror or raidz vdevs), it would be possible to copy the wrong data, when we have the correct data on e.g. the other side of the mirror. Therefore, mirror and raidz devices can not be removed. Reviewed by: Alex Reece <alex@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: John Kennedy <john.kennedy@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Richard Laager <rlaager@wiktel.com> Reviewed by: Tim Chase <tim@chase2k.com> Approved by: Garrett D'Amore <garrett@damore.org> Author: Prashanth Sreenivasa <pks@delphix.com>	2018-02-21 16:51:02 +00:00
Ian Lepore	94d7be6551	Add required header files. Reported by: andreast@	2018-02-21 16:36:44 +00:00
Ian Lepore	3982006ed5	Remove some files that snuck in via cut and paste. Having these compiled into the module causes the kobj method descriptors to be resolved incorrectly (by the compile-time linker instead of the kernel linker), which then leads to hours of frustrating debugging.	2018-02-21 16:34:04 +00:00
Nathan Whitehorn	d9dbc2104f	Add definition for the PowerPC A2.	2018-02-21 15:15:58 +00:00
Nathan Whitehorn	dddf28585d	Add definitions for the new Radix MMU mode on POWER9+ CPUs.	2018-02-21 15:15:31 +00:00
Andriy Gapon	cfb675a138	MFV r329718: 8520 7198 lzc_rollback_to should support rolling back to origin illumos/illumos-gate@95643f75d2 `95643f75d2` https://www.illumos.org/issues/8520 lzc_rollback_to() should support rolling back to a clone's origin. The current checks in zfs_ioc_rollback() would not allow that because the origin snapshot belongs to a different filesystem. The overly restrictive check was introduced in 7600, but it was not a regression as none of the existing tools provided a way to rollback to the origin. https://www.illumos.org/issues/7198 EINVAL is returned when a dataset does not have any snapshots, so there is nothing to roll back to. Although the code in zfs_do_rollback checks for that condition in advance, it's still possible that the snapshot(s) gets removed after the check and before the rollback sync task is executed. At the moment zfs command would crash when that happens. Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Dan McDonald <danmcd@joyent.com> Author: Andriy Gapon <avg@FreeBSD.org> MFC after: 2 weeks	2018-02-21 15:12:14 +00:00
Andriy Gapon	754d27df02	MFV r329715: 8997 ztest assertion failure in zil_lwb_write_issue illumos/illumos-gate@f864f99efe `f864f99efe` https://www.illumos.org/issues/8997 When dmu_tx_assign is called from zil_lwb_write_issue, it's possible for either ERESTART or EIO to be returned. If ERESTART is returned, this will cause an assertion to fail directly in zil_lwb_write_issue, where the code assumes the return value is EIO if dmu_tx_assign returns a non-zero value. This can occur if the SPA is suspended when dmu_tx_assign is called, and most often occurs when running zloop. If EIO is returned, this can cause assertions to fail elsewhere in the ZIL code. For example, zil_commit_waiter_timeout contains the following logic: lwb_t *nlwb = zil_lwb_write_issue(zilog, lwb); ASSERT3S(lwb->lwb_state, !=, LWB_STATE_OPENED); In this case, if dmu_tx_assign returned EIO from within zil_lwb_write_issue, the lwb variable passed in will not be issued to disk. Thus, it's lwb_state field will remain LWB_STATE_OPENED and this assertion will fail. zil_commit_waiter_timeout assumes that after it calls zil_lwb_write_issue, the lwb will be issued to disk, and doesn't handle the case where this is not true; i.e. it doesn't handle the case where dmu_tx_assign returns EIO. Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: Andriy Gapon <avg@FreeBSD.org> Approved by: Robert Mustacchi <rm@joyent.com> Author: Prakash Surya <prakash.surya@delphix.com> MFC after: 3 weeks	2018-02-21 15:07:49 +00:00
Andriy Gapon	9d6810819c	MFV r329713: 8731 ASSERT3U(nui64s, <=, UINT16_MAX) fails for large blocks illumos/illumos-gate@a6c1eb3c08 `a6c1eb3c08` https://www.illumos.org/issues/8731 annotate_ecksum() asserts that nui64s, calculated as nui64s = size / sizeof (uint64_t), is not greater than UINT16_MAX. This restriction is needed because histograms of incorrectly set and cleared bits have 16 bit counters and if the buffer consists of too many 64-bit words, then a counter can potentially overflow producing an incorrect result. When the largest buffer size was 128KB the greatest value of nui64s was 16K, well within the limit. But now we have support for large buffers and for buffer sizes of 512KB and above the restriction is violated. Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Dan McDonald <danmcd@joyent.com> Author: Andriy Gapon <avg@FreeBSD.org> MFC after: 2 weeks	2018-02-21 14:31:48 +00:00
Wojciech Macek	6d13fd638c	PowerNV: Put processor to power-save state in idle thread When processor enters power-save state it releases resources shared with other cpu threads which makes other cores working much faster. This patch also implements saving and restoring registers that might get corrupted in power-save state. Submitted by: Patryk Duda <pdk@semihalf.com> Obtained from: Semihalf Reviewed by: jhibbits, nwhitehorn, wma Sponsored by: IBM, QCM Technologies Differential revision: https://reviews.freebsd.org/D14330	2018-02-21 14:28:40 +00:00
Andriy Gapon	70639f9a5e	MFV r329710: 8966 Source file zfs_acl.c, function zfs_aclset_common contains a use after end of the lifetime of a local variable illumos/illumos-gate@82693e09cc `82693e09cc` https://www.illumos.org/issues/8966 Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: Andriy Gapon <avg@FreeBSD.org> Approved by: Richard Lowe <richlowe@richlowe.net> Author: WHR <msl0000023508@gmail.com> PR: 225162 Submitted by: WHR <msl0000023508@gmail.com> Reported by: WHR <msl0000023508@gmail.com> MFC after: 1 week	2018-02-21 14:17:07 +00:00
Edward Tomasz Napierala	8404ad78db	Use proper buffer length (the announce_buf char pointer used to be anarray), broken in r317143. This fixes those weird "cd0: Attempt" messages at boot. PR: 222103 Reviewed by: scottl@ MFC after: 2 weeks Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14369	2018-02-21 14:05:13 +00:00
Hans Petter Selasky	07c757ec25	Allow LinuxKPI character devices to receive mmap() calls from the Linux binary mode user-space emulation layer. This is a regression issue after r328436, when LinuxKPI character devices started to use DTYPE_DEV in the "f_type" field of the associated file structure(s). MFC after: 3 days Found by: Johannes Lundberg <johalun0@gmail.com> Sponsored by: Mellanox Technologies	2018-02-21 10:13:17 +00:00
Wojciech Macek	eb96cc1364	PowerNV: add missing RTC_WRITE support Add function which can store RTC values to OPAL. Submitted by: Wojciech Macek <wma@semihalf.org> Obtained from: Semihalf Sponsored by: IBM, QCM Technologies	2018-02-21 08:13:17 +00:00
Wojciech Macek	19a5b68236	CXGBE: implement prefetch on non-Intel architectures Submitted by: Michal Stanek <mst@semihalf.com> Obtained from: Semihalf Reviewed by: np, pdk@semihalf.com Sponsored by: IBM, QCM Technologies Differential revision: https://reviews.freebsd.org/D14452	2018-02-21 08:05:56 +00:00
Justin Hibbits	fcc491a3fe	Split printtrap() into generic and CPU-specific components Summary: This compartmentalizes the CPU-specific trap components into its own function, rather than littering the general printtrap() with various checks. This will let us replace a series of #ifdef's with a runtime conditional check in the future. Reviewed By: nwhitehorn Differential Revision: https://reviews.freebsd.org/D14416	2018-02-21 03:34:33 +00:00
Alexander Motin	d8e89539c8	MFV r324198: 8081 Compiler warnings in zdb illumos/illumos-gate@3f7978d02b `3f7978d02b` https://www.illumos.org/issues/8081 zdb(8) is full of minor problems that generate compiler warnings. On FreeBSD, which uses -WError, the only way to build it is to disable all compiler warnings. This makes it much harder to detect newly introduced bugs. We should cleanup all the warnings. Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Alan Somers <asomers@gmail.com>	2018-02-21 03:08:47 +00:00

1 2 3 4 5 ...

120737 Commits