988 Commits

Author SHA1 Message Date
delphij
6ab407aa01 All callers of static method load_nvlist() in spa.c handle the error case,
so there is no reason to assert that we won't hit an error.  Instead,
just return that error to caller and have the upper layer handle it.

Obtained from:	FreeNAS
Reported by:	rodrigc
Reviewed by:	Matthew Ahrens
MFC after:	2 weeks
2014-03-02 02:41:33 +00:00
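The load_nvlist() change above follows a common pattern: a helper that can fail returns its error instead of asserting success, and the caller, which already checks the result, passes it up. A minimal userland sketch of that pattern follows. The names `load_blob` and `open_config` are hypothetical stand-ins and this is not the spa.c code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a loader that can fail; not the spa.c code. */
static int
load_blob(const char *src, const char **out)
{
	if (src == NULL)
		return (EIO);	/* report the failure instead of asserting */
	*out = src;
	return (0);
}

/* The caller already handles the error case, so it just propagates it. */
static int
open_config(const char *src)
{
	const char *cfg;
	int error;

	error = load_blob(src, &cfg);
	if (error != 0)
		return (error);
	/* On success the output pointer is guaranteed to be set. */
	assert(cfg != NULL);
	return (0);
}
```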
markj
9caf62197b Expose a few DTrace parameters as sysctls under kern.dtrace and add
descriptions for several existing sysctls.

PR:		187027
Submitted by:	Fedor Indutny <fedor@indutny.com> (original version)
MFC after:	2 weeks
2014-03-01 19:06:43 +00:00
markj
f6a884d05d Fix emulation of call and jmp instructions on i386 and for 32-bit processes
on amd64.

Submitted by:	Prashanth Kumar <pra_udupi@yahoo.co.in>
MFC after:	2 weeks
2014-03-01 17:55:20 +00:00
markj
a7aada62d5 4478 dtrace_dof_maxsize is far too small
illumos/illumos-gate@d339a29bb4

PR:		187027
MFC after:	1 week
2014-02-28 02:04:41 +00:00
markj
b76098bbce Fix the struct reg mappings for i386 and amd64, which differ between illumos
and FreeBSD.

Submitted by:	Prashanth Kumar <pra_udupi@yahoo.co.in>
MFC after:	2 weeks
2014-02-27 01:24:47 +00:00
markj
451c3aecb6 Move some files that are identical on i386 and amd64 to an x86 subdirectory
rather than keeping duplicate copies.

Discussed with:	avg
MFC after:	1 week
2014-02-27 01:04:35 +00:00
markj
9b658401ea Revert r262466, as it does not compile on PowerPC.
Reported by:	jhibbits
2014-02-26 01:00:00 +00:00
markj
b3fc1da12e Make all 8 syscall arguments available to syscall probes in the same way
that this is done for SDT probes. This fixes the syscall/tst.args.d test,
which was failing because mmap(2)'s sixth argument wasn't available to the
probe.

MFC after:	2 weeks
2014-02-25 02:58:11 +00:00
markj
428d834b5b 1452 DTrace buffer autoscaling should be less violent
illumos/illumos-gate@6fb4854bed

This fixes the tst.resize1.d and tst.resize2.d DTrace tests, which have
been failing since r261122 since they were causing dtrace(1) to attempt to
allocate and use large amounts of memory, and get killed by the OOM killer
as a result.

MFC after:	1 month
2014-02-22 05:18:55 +00:00
markj
95ac3c80e1 Define the KM_NORMALPRI flag for kmem_alloc(), as it is used in some
upstream DTrace code. It indicates that the kernel memory allocator need not
attempt to satisfy non-blocking allocations in low-memory conditions. This
has no direct equivalent in the malloc(9) flags, so it is just defined to 0
for now.
2014-02-22 05:13:35 +00:00
delphij
1dfe96a066 MFV r261619:
4574 get_clones_stat does not call zap_count in non-debug kernel

zap_count(...) is never called in non-DEBUG kernel.
As a result, the "count" variable is always 0, and "goto fail" is always
reached.  This means get_clones_stat function never makes up list
of clones for "clones" properties.

MFC after:	2 weeks
2014-02-08 05:35:36 +00:00
delphij
4b064bf9ac MFV r260834:
Fix memory leak of compressed buffers in l2arc_write_done (Illumos
#3995).
2014-01-18 01:45:39 +00:00
avg
6b143ee35a traverse_visitbp: visit DMU_GROUPUSED_OBJECT before DMU_USERUSED_OBJECT
This is done to ensure that visited object IDs are always increasing.
Also, pass correct object ID to prefetch_dnode_metadata for
os_groupused_dnode.

Without this change we would hit an assert if traversal was paused on
a GROUPUSED object, which is unlikely but possible.

Apparently the same change was independently developed by Delphix.

Reviewed by:	Matthew Ahrens <mahrens@delphix.com>
MFC after:	10 days
Sponsored by:	HybridCluster
2014-01-17 10:23:46 +00:00
avg
d107399017 fix a build problem with INVARIANTS enabled introduced in r260704
Reported by:	glebius
MFC after:	5 days
X-MFC with:	r260704
2014-01-16 13:44:37 +00:00
avg
e186f564bc fix a bug in ZFS mirror code for handling multiple DVAs
The bug was introduced in r256956 "Improve ZFS N-way mirror read
performance".
The code in vdev_mirror_dva_select erroneously considers already
tried DVAs for the next attempt.  Thus, it is possible that a failing DVA
would be retried forever.
As a secondary effect, if the attempts fail with checksum error, then
checksum error reports are accumulated until the original request
ultimately fails or succeeds.  But because retrying is going on indefinitely
the accumulation of checksum reports will effectively be a memory leak.

Reviewed by:	gibbs
MFC after:	13 days
Sponsored by:	HybridCluster
2014-01-16 13:24:10 +00:00
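The mirror fix above amounts to excluding already-tried candidates from the next selection, so the retry loop is guaranteed to terminate. The sketch below illustrates that idea only. `struct child`, its fields, and `select_untried` are hypothetical and much simpler than the state used by vdev_mirror_dva_select.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-child state; the real mirror code tracks more. */
struct child {
	bool	tried;	/* already attempted for this request */
	int	load;	/* lower is preferred */
};

/*
 * Return the index of the least-loaded child that has not been tried,
 * or -1 when every child has been tried and the request must fail.
 */
static int
select_untried(const struct child *c, int n)
{
	int best = -1;

	assert(n > 0);
	for (int i = 0; i < n; i++) {
		if (c[i].tried)
			continue;	/* never reconsider a tried child */
		if (best == -1 || c[i].load < c[best].load)
			best = i;
	}
	return (best);
}
```

Because each attempt marks one child as tried, at most n attempts are made before the function returns -1, so checksum reports cannot pile up indefinitely.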
avg
d1329f5a22 Revert r260705: wrong patch committed by accident
An earlier, less efficient version was committed by accident.
2014-01-16 13:20:20 +00:00
avg
113f9a4f53 zfs_deleteextattr: name buffer from namei is needed by zfs_rename
If we prematurely free the name buffer and it gets quickly recycled,
then zfs_rename may see data from another lookup or even unmapped memory
via cn_nameptr.

MFC after:	6 days
Sponsored by:	HybridCluster
2014-01-16 12:31:27 +00:00
avg
31b7f68d80 fix a bug in ZFS mirror code for handling multiple DVAs
The bug was introduced in r256956 "Improve ZFS N-way mirror read
performance".
The code in vdev_mirror_dva_select erroneously considers already
tried DVAs for the next attempt.  Thus, it is possible that a failing DVA
would be retried forever.
As a secondary effect, if the attempts fail with checksum error, then
checksum error reports are accumulated until the original request
ultimately fails or succeeds.  But because retrying is going on indefinitely
the accumulation of checksum reports will effectively be a memory leak.

Reviewed by:	gibbs
MFC after:	13 days
Sponsored by:	HybridCluster
2014-01-16 12:26:54 +00:00
avg
97986ccb0b zfs: getnewvnode_reserve must be called outside of a zfs transaction
Otherwise we could run into the following deadlock.
A thread has a transaction open and assigned to a transaction group.
That would prevent the transaction group from being quiesced and synced.
The thread is blocked in getnewvnode_reserve waiting for a vnode to
be reclaimed.  The vnlru thread is blocked trying to enter a ZFS VOP because
a filesystem is suspended by an ongoing rollback or receive operation.
In its turn the operation is waiting for the current transaction group
to be synced.

zfs_zget is always used outside of active transactions, but zfs_mknode
is always used in a transaction context.  Thus, we hoist
getnewvnode_reserve from zfs_mknode to its callers.

While there, assert that ZFS always calls getnewvnode while having
a vnode reserved.

Reported by:	adrian
Tested by:	adrian
MFC after:	17 days
Sponsored by:	HybridCluster
2014-01-16 12:22:46 +00:00
marcel
5e2984b1f1 In atomic_or_8_nv() load 1 and not 8 bytes from the address
given. Note that atomic_or_8_nv() is not used at this time.
2014-01-06 05:00:58 +00:00
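For context on the atomic_or_8_nv() fix above: the operation ORs bits into a single byte and returns the new value, so it must read and write exactly one byte. The following is a userland analogue written with C11 atomics, not the kernel assembly that the commit changed. `or_8_nv` is a hypothetical name.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userland analogue using C11 atomics, not the kernel asm. */
static uint8_t
or_8_nv(_Atomic uint8_t *target, uint8_t bits)
{
	assert(target != NULL);
	/* atomic_fetch_or returns the old value; OR again for the new one. */
	return (atomic_fetch_or(target, bits) | bits);
}
```

With a one-byte atomic type, neighbouring bytes are left untouched, which is the property the original 8-byte load violated.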
mav
0b0d3d9762 Fix build after r260234 by converting ddi_get_lbolt64() from inline into
a macro.  Otherwise the compiler complains that the hz variable used there is
either undefined or defined twice, thanks to header mess caused by compat shims.
2014-01-05 19:07:42 +00:00
mav
8e1d7be31e In dmu_zfetch_stream_reclaim() replace division with multiplication and
move it out of the loop and lock.
2014-01-03 18:44:37 +00:00
mav
7402ef4897 Remove extra conversion to nanoseconds from ddi_get_lbolt64().
As a result, this uses one multiplication and shifts instead of one division
and two multiplications.
2014-01-03 18:08:31 +00:00
delphij
37fa7d5554 MFV r260155:
When we encounter an I/O error on a piece of metadata while deleting
a file system or zvol, we don't update the bptree_entry_phys_t's
bookmark.  This would lead to double free of bp's which will lead to
space map corruption.

Instead of tolerating and allowing the corruption, panic immediately.

See Illumos #4390 for more details.

4391 panic system rather than corrupting pool if we hit bug 4390

Illumos/illumos-gate@8b36997aa2

MFC after:	2 weeks
2014-01-02 08:10:35 +00:00
delphij
5137277761 MFV r260154 + 260182:
4369 implement zfs bookmarks
4368 zfs send filesystems from readonly pools

Illumos/illumos-gate@78f1710053

MFC after:	2 weeks
2014-01-02 07:34:36 +00:00
delphij
05d7d24d7e Fix build on platforms where atomic_swap_64 is not available.
2014-01-02 03:24:44 +00:00
delphij
843c1c95ad MFV r260153:
4121 vdev_label_init should treat request as succeeded when pool
     is read only

Illumos/illumos-gate@973c78e94b

MFC after:	2 weeks
2014-01-01 01:26:39 +00:00
delphij
82c2441b7d MFV r259170:
4370 avoid transmitting holes during zfs send

4371 DMU code clean up

illumos/illumos-gate@43466aae47

NOTE: Make sure the boot code is updated if a zpool upgrade is
done on boot zpool.

MFC after:	2 weeks
2014-01-01 00:45:28 +00:00
delphij
302e136d55 MFV r258385:
(Note: this change is not applicable to FreeBSD and the file
is not included in build.  It's integrated for completeness).

4128 disks in zpools never go away when pulled

illumos/illumos-gate@39cddb10a3

MFC after:	2 weeks
2013-12-31 21:24:00 +00:00
delphij
ab4bdae837 MFV r242733:
3306 zdb should be able to issue reads in parallel
3321 'zpool reopen' command should be documented in the man page
and help message

illumos/illumos-gate@31d7e8fa33

FreeBSD porting notes: the kernel part of this changeset depends
on Solaris buf(9S) interfaces and is not really applicable for
our use.  vdev_disk.c is patched as-is to reduce divergence from
upstream, but vdev_file.c is left intact.

MFC after:	2 weeks
2013-12-31 19:39:15 +00:00
markj
31d017a7a5 Allocate the probe ID unrhdr before the DTrace kld_* event handlers are
registered. Otherwise there is a small window during which probe IDs may be
allocated before the unrhdr is allocated.

MFC after:	2 weeks
2013-12-31 15:41:16 +00:00
markj
0d764663c2 Revert r260091. The vmem calls seem to be slower than the *_unr() calls that
they replaced, which is important considering that probe IDs are allocated
during process startup for USDT probes.
2013-12-31 15:37:51 +00:00
markj
27bf80971b Now that vmem(9) is available, use vmem arenas to allocate probe and
aggregation IDs, as is done in the upstream illumos code. This still
requires some FreeBSD-specific code, as our vmem API is not identical to the
one in illumos.

Submitted by:	Mike Ma <mikemandarine@gmail.com>
2013-12-30 17:37:32 +00:00
delphij
66a5ee8e14 MFV r258374:
4171 clean up spa_feature_*() interfaces

4172 implement extensible_dataset feature for use by other zpool
features

illumos/illumos-gate@2acef22db7

MFC after:	2 weeks
2013-12-24 07:14:25 +00:00
delphij
b045f69bf5 MFV r258373:
4168 ztest assertion failure in dbuf_undirty

4169 verbatim import causes zdb to segfa
4170 zhack leaves pool in ACTIVE state

illumos/illumos-gate@7fdd916c47

MFC after:	2 weeks
2013-12-24 06:56:17 +00:00
jhibbits
8aa99174db Fix a brain-o. I had misread the limit as a size, but it's a pointer.
Submitted by:	Howard Su
MFC after:	2 weeks
X-MFC-with:	r259668
2013-12-21 00:37:32 +00:00
jhibbits
fde816803b Fix a couple of bugs in FBT PowerPC. Clamp the size to an 'instruction size' not
'byte size', and fix a typo.

MFC after:	2 weeks
2013-12-20 23:18:14 +00:00
pjd
f052ba0c91 MFV r258923: 4188 assertion failed in dmu_tx_hold_free(): dn_datablkshift != 0
illumos/illumos-gate@bb411a08b0

MFC after:	3 days
2013-12-18 21:45:46 +00:00
markj
fa6de9117d The fasttrap fork handler is responsible for removing tracepoints in the
child process that were inherited from its parent. However, this should
not be done in the case of a vfork, since the fork handler ends up removing
the tracepoints from the shared vm space, and userland DTrace probes in the
parent will no longer fire as a result.

Now the child of a vfork may trigger userland DTrace probes enabled in its
parent, so modify the fasttrap probe handler to handle this case and handle
the child process in the same way that it would handle the traced process.
In particular, if one traces function foo() in a process that vforks, and
the child calls foo(), fasttrap will treat this call as having come from the
parent. This is the behaviour of the upstream code.

While here, add #ifdef guards to some code that isn't present upstream.

MFC after:	1 month
2013-12-18 01:41:52 +00:00
asomers
03e85c2d9b sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
When a da or ada device disappears, outstanding IOs fail with
	ENXIO, not EIO.  The check for EIO was probably copied from Illumos,
	where that is indeed the correct errno.

	Without this change, pulling a busy drive from a zpool would usually
	turn it into UNAVAIL, even though pulling an idle drive would turn
	it into REMOVED.  With this change, it is REMOVED every time.

	Also, vdev_geom_io_intr shouldn't do zfs_post_remove, because that
	results in devd getting two resource.fs.zfs.removed events.  The
	comment said that the event had to be sent directly instead of
	through the async removal thread because "the DE engine is using
	this information to discard previous I/O errors".  However, the fact
	that vdev_geom_io_intr was never actually sending the events until
	now, and that vdev_geom_orphan never sent them at all, and that
	vdev_geom_orphan usually gets called about 2 seconds after the
	actual removal, means that FreeBSD's userland can cope with a late
	event just fine.

Approved by:	ken (mentor)
Sponsored by:	Spectra Logic Corporation
MFC after:	4 weeks
2013-12-12 00:27:22 +00:00
markj
f8785b45de Correct the check for errors from proc_rwmem().
MFC after:	2 weeks
2013-12-11 04:31:40 +00:00
mav
057ae4aad3 Don't even try to read vdev labels from devices smaller than SPA_MINDEVSIZE
(64MB).  Even if we would find one somehow, ZFS kernel code rejects such
devices.  It is funny to look at attempts to read four 256K vdev labels from
a 1.44MB floppy, though it is not very practical and quite slow.
2013-12-10 12:36:44 +00:00
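The size check described above can be pictured as a simple predicate applied before any label read. This is a sketch under stated assumptions: `MINDEVSIZE` mirrors the 64MB value the commit quotes for SPA_MINDEVSIZE, and `worth_probing` is a hypothetical helper, not a function from the ZFS sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 64MB, per the commit message; the name mirrors SPA_MINDEVSIZE. */
#define	MINDEVSIZE	(64ULL << 20)

static_assert(MINDEVSIZE == 67108864ULL, "64MB expressed in bytes");

/* Hypothetical predicate: is the device big enough to bother probing? */
static bool
worth_probing(uint64_t mediasize)
{
	return (mediasize >= MINDEVSIZE);
}
```

A 1.44MB floppy (1474560 bytes) fails the check, so none of its four label locations would be read.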
delphij
375701af53 Expose spa_asize_inflation.
X-MFC-With:	r258632
2013-12-06 23:49:16 +00:00
avg
430acb4217 zfs: add zfs_freebsd_putpages
this should be more optimal than writing pages one-by-one via zfs_write ->
update_pages in the case of multi-page putpages call

MFC after:	16 days
2013-11-29 15:39:39 +00:00
avg
16f88ac15b zfs: add dmu_write_pages variant for freebsd
The freebsd variant of dmu_write_pages is hidden under _KERNEL
to avoid needlessly pulling in vm_page_t declaration.
Besides, this function seems to be useless for ZFS userland counterpart.

MFC after:	15 days
2013-11-29 15:34:43 +00:00
avg
7a0711c338 zfs: make zfs_map_page / zfs_unmap_page public
MFC after:	15 days
2013-11-29 15:33:40 +00:00
avg
63dbff5d06 drop ZUT_OBJ, zfs unit testing driver never materialized in freebsd
MFC after:	5 days
2013-11-29 15:32:53 +00:00
avg
89468053e3 zfs mappedread_sf: assert that a page is never partially valid
ZFS never partially validates or invalidates a page.
The higher level VM should not do that either.
Correct operation of mappedread_sf depends on a page being either fully
valid or invalid.

MFC after:	7 days
2013-11-29 12:19:52 +00:00
avg
47f145913e MFV r258665: 4347 ZPL can use dmu_tx_assign(TXG_WAIT)
illumos/illumos-gate@e722410c49

MFC after:	9 days
X-MFC after:	r258632
2013-11-28 19:44:36 +00:00
avg
9932b97e88 MFV r258371,r258372: 4101 metaslab_debug should allow for fine-grained control
4101 metaslab_debug should allow for fine-grained control
4102 space_maps should store more information about themselves
4103 space map object blocksize should be increased
4104 ::spa_space no longer works
4105 removing a mirrored log device results in a leaked object
4106 asynchronously load metaslab

illumos/illumos-gate@0713e232b7

Note that some tunables have been removed and some new tunables have
been added.  Of particular note, FreeBSD-only knob
vfs.zfs.space_map_last_hope is removed as it was a nop for some time now
(after one of the previous merges from upstream).

MFC after:	11 days
Sponsored by:	HybridCluster [merge]
2013-11-28 19:37:22 +00:00