Commit Graph

1273 Commits

Author SHA1 Message Date
avg
1574191a79 illumos compat: use flsl/flsll for highbit/highbit64
This is a micro optimization.
The upstream code uses the binary search.

Differential Revision:	https://reviews.freebsd.org/D2839
Reviewed by:	delphij, mav
MFC after:	15 days
2015-06-17 12:05:04 +00:00
avg
635065f17b MFV r284036: 5961 Fix stack overflow in zfs_create_fs
illumos/illumos-gate@c701fde691

Author:		glebius
MFC after:	11 days
2015-06-12 11:10:49 +00:00
avg
35511df052 MFV r284030: 5818 zfs {ref}compressratio is incorrect with 4k sector size
illumos/illumos-gate@81cd5c555f

Author:	Matthew Ahrens <mahrens@delphix.com>
MFC after:	17 days
2015-06-12 10:57:05 +00:00
avg
34d03f1f2c MFV r283534: 5515 dataset user hold doesn't reject empty tags
illumos/illumos-gate@752fd8dabc

Author:	Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
MFC after:	10 days
2015-06-12 10:52:53 +00:00
avg
aced9fcb65 MFV r284040: check that datasets are snapshots
5946 zfs_ioc_space_snaps must check that firstsnap and lastsnap refer to snapshots
5945 zfs_ioc_send_space must ensure that fromsnap refers to a snapshot
Reviewed by: Steven Hartland <killing@multiplay.co.uk>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Gordon Ross <gordon.ross@nexenta.com>

illumos/illumos-gate@24218bebb4

Note that the upstream commit is modified during MFV: in the upstream
the check is done by inspecting ds_is_snapshot field while in FreeBSD
we call dsl_dataset_is_snapshot().
This is because illumos/illumos-gate@bc9014e6a8
(r277428 in vendor-sys/illumos) is not MFV-ed yet.

MFC after:	10 days
2015-06-12 10:41:24 +00:00
br
ab3ad78145 Don't re-define LOCORE when dtrace is built-in to the kernel. 2015-06-10 09:59:26 +00:00
avg
630db52ab7 compat nvpair.h: make sure that the names are mangled only for kernel
Currently there is no good reason to mangle the userland API.
The change was introduced in eac1d566b4,
r279437.  Also see https://reviews.freebsd.org/D1881.

I am still convinced that nv should not have introduced intentionally
conflicting API.

Discussed with:	rstone
X-MFC with:	r279437
Sponsored by:	ClusterHQ
2015-06-07 08:54:25 +00:00
kib
ae73c05fd5 Add missed {}.
Noted by:	Morten Rodal <morten@rodal.no>
MFC after:	2 weeks
2015-05-27 19:28:14 +00:00
kib
d77dbf3761 Right now, dounmount() is called with unreferenced mount point.
Nothing stops a parallel unmount to suceed before the given call to
dounmount() checks and locks the covered vnode.  Prevent dounmount()
from acting on the freed (although type-stable) memory by changing the
interface to require the mount point to be referenced.  dounmount()
consumes the reference on return, regardless of the sucessfull or
erronous result.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-05-27 09:22:50 +00:00
avg
2c52296aaa zfs: fixes for a full stream received into an existing dataset
- this should fail early unless the force flag is set
- if the force flag is set then any local modifications including
  snapshots should be undone

See:	https://www.illumos.org/issues/5912
See:	https://reviews.csiden.org/r/220/

Reviewed by:	mahrens, Paul Dagnelie <pcd@delphix.com>
MFC after:	15 days
Sponsored by:	ClusterHQ
2015-05-25 11:56:57 +00:00
avg
511007870d dsl_dataset_promote_check: ensure that shared snaps do not become too long
... after they are transfered from the old origin to the new one.

See:	https://www.illumos.org/issues/5909
See:	https://reviews.csiden.org/r/219/

Reviewed by:	mahrens
MFC after:	10 days
Sponsored by:	ClusterHQ
2015-05-25 11:48:15 +00:00
kib
e1f68e3cfb Remove excess Giant acquisition around the dounmount() call.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-05-25 09:08:19 +00:00
markj
6928f32018 Remove unused references to calltrap.
MFC after:	3 days
2015-05-25 01:22:56 +00:00
jkim
318c4f97e6 CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten
years for head.  However, it is continuously misused as the mpsafe argument
for callout_init(9).  Deprecate the flag and clean up callout_init() calls
to make them more consistent.

Differential Revision:	https://reviews.freebsd.org/D2613
Reviewed by:	jhb
MFC after:	2 weeks
2015-05-22 17:05:21 +00:00
smh
6394c7af86 Add copyright info missing from r282205
Add the copyright info missing from ZoL origin version.

MFC after:	2 days
Sponsored by:	Multiplay
2015-05-14 08:13:01 +00:00
avg
9f5ffddd44 zfs ioctls: use fget_write / fget_read instead of getf wrapper for fget
This allows to ensure that we do not write to a file that was opened
for reading only or vice versa.

Also, use the correct capability in in zfs_ioc_send_new().

Differential Revision:	https://reviews.freebsd.org/D2382
Reviewed by:	delphij
MFC after:	17 days
Sponsored by:	ClusterHQ
2015-05-11 10:07:31 +00:00
markj
6ede9cbbef Remove some commented-out upstream code for handling traps from usermode
DTrace probes. This handling is already done in trap() on i386 and amd64.
2015-05-10 22:27:48 +00:00
jhibbits
c50204f333 Fix a couple bugs in 64-bit powerpc fasttrap argument retrieval.
Found by code inspection.
2015-05-10 04:33:01 +00:00
avg
b15170de41 MFV r282630: 5809 Blowaway full receive in v1 pool causes kernel panic
MFC after:	5 days
2015-05-08 14:03:14 +00:00
avg
897e19b7f1 zfs: do not hold an extra reference on a root vnode while a filesystem is mounted
At present zfs_domount() acquires a reference on the filesystem's root vnode
and that reference is kept until zfs_umount.
The latter calls vflush(rootrefs = 1) to dispose of the extra reference.

There is no explanation of why that reference is kept - what problem it
solves or what behavior it improves.
Also, that logic is FreeBSD specific.

There is one real problem with that reference, though.
zfs recv -F may receive a full, non-incremental stream to a mounted filesystem.
In that case the received root object is likely to have a different z_gen
attribute value. Because of that, zfs_rezget will leave the previous root znode
and vnode disassociated from the actual object (z_sa_hdl == NULL).
Thus, future calls to VFS_ROOT() -> zfs_root() will produce a new vnode-znode
pair, while the old one will be kept alive by the outstanding reference.
So, the outstanding reference will not actually be for the new root vnode
(or, more precisely, vnodes - because a root vnode may be recycled and a newer
one can be created).
As a result, when vflush(rootrefs = 1) s called there will be two problems:

- a leaked reference on the old root vnode preventing a graceful unmount
- insufficient references on the actual root vnode leading to a crash upon
  access to the vnode after it is destroyed by vgone() + vdrop()

The second issue will actually override the first one.

Differential Revision:	https://reviews.freebsd.org/D2353
Reviewed by:		delphij, kib, smh
MFC after:	17 days
2015-05-05 11:01:06 +00:00
avg
509c82b2c2 dmu_recv_end_check: don't leak hold if dsl_destroy_snapshot_check_impl fails
The leak may happen if !drc_newfs && drc_force and there is an error
iterating through snapshots or any of snapshot checks fails.

See https://www.illumos.org/issues/5870
See https://reviews.csiden.org/r/206/

Reviewed by:	mahrens (as mahrens@delphix.com)
MFC after:	15 days
Sponsored by:	ClusterHQ
2015-05-05 10:56:16 +00:00
smh
a5d3a9fc06 Fix misuse of input argument in traverse_visitbp
In traverse_visitbp(), the input argument dnp is modified in the middle
to point to a temporary buffer. Originally this doesn't matter, because
no user of TRAVERSE_POST dereferences it. However, in fbeddd6 a piece of
code is added dereferencing dnp after the modification, creating a possible
bug.

We fix this by creating a new local variable cdnp for the DMU_OT_DNODE case,
so we don't modify the input argument. Also we introduce different local
variables in the DMU_OT_OBJSET case to prevent confusion between the input
argument.

Obtained from:	zfsonlinux (a585f2f844ed3d4270221fed88f5e494eb55d932)
MFC after:	2 weeks
Sponsored by:	Multiplay
2015-04-28 22:46:58 +00:00
avg
70befe4134 replace a comment about zfs recv -F corner case with a longer, more detailed one
The old comment in zfs_rezget explains what situation the code handles,
the new comment also describes how the situation can arise.

Also, re-join a line that became sufficiently shorti some time ago.

Differential Revision:	https://reviews.freebsd.org/D2352
Reviewed by:	delphij, smh
MFC after:	12 days
2015-04-28 09:19:40 +00:00
avg
879cb055bd zfs_onexit_fd_hold: return EBADF even if devfs_get_cdevpriv gave ENOENT
/dev/zfs always has per-open data, so when it is missing the file
descriptor is for some other file.  Returning ENOENT in this case
is confusing as a variety of other conditions (like a missing dataset)
may result in the same error.  It's better to consistently return
EBADF for any problems with the file descriptor.

Note that zfs_onexit_fd_hold() is used with 'automatic cleanup fd'
- when that fd is closed, typically because a process is terminated,
some cleanup action is taken by ZFS driver.  E.g. a temporary
snapshot hold is released.

Perhaps, it would even be worthwhile changing devfs_get_cdevpriv()
to return EBADF if there is no associated data.

Differential Revision:	https://reviews.freebsd.org/D2370
Reviewed by:	delphij, smh
MFC after:	12 days
2015-04-28 09:11:47 +00:00
avg
65773effb9 dsl_dir_rename_check: return EXDEV on cross-pool rename attempt
Obtained from:	zfsonlinux/zfs@9063f65476
Obtained from:	Boris Protopopov <boris.protopopov@actifio.com>
MFC after:	10 days
2015-04-28 08:04:16 +00:00
avg
293d1ac7ce MFV r282123: 5610 zfs clone from different source and target pools produces coredump
MFC after:	10 days
2015-04-28 07:42:28 +00:00
avg
7d7084af66 MFV r282124: 5393 spurious failures from dsl_dataset_hold_obj()
The actual bugfix was pro-actively committed in r275515.
This MFV is cosmetic, it just aligns code style with the upstream.

MFC after:	10 days
2015-04-28 07:37:38 +00:00
avg
7112bf7b61 nvpair_type_is_array: DATA_TYPE_INT8_ARRAY was not recognized
To do:	upstream (https://www.illumos.org/issues/5778)
MFC after:	10 days
2015-04-28 06:34:55 +00:00
rwatson
ed8c3c1f90 Adjust PROF_ARTIFICIAL_FRAMES in the DTrace profile provider on ARM to
skip 10, rather than 9, frames.  This appears to work quite well in
practice on the BeagleBone Black, so remove a comment about the value
being bogus and replace it with a slightly less negative one.  However,
the number of frames to skip is quite sensitive to details of the timer
and interrupt handling paths, so this is necessarily fragile -- but no
more so than on x86.

Sponsored by:	DARPA, AFRL
2015-04-25 15:43:12 +00:00
markj
9385b9197d Fix DTrace's panic() action.
It would previously call into some unfinished Solaris compatibility code and
return without actually calling panic(9). The compatibility code is
unneeded, however, so just remove it and have dtrace_panic() call vpanic(9)
directly.

Differential Revision:	https://reviews.freebsd.org/D2349
Reviewed by:	avg
MFC after:	2 weeks
Sponsored by:	EMC / Isilon Storage Division
2015-04-24 03:19:30 +00:00
delphij
9420bc6c71 Remove vfs.zfs.snapshot_list_prefetch, the corresponding code was
gone in r248571 already.

MFC after:	1 week
2015-04-17 21:21:11 +00:00
markj
1d6ffde4f4 libdtrace: add support for lazyload mode.
Passing "-x lazyload" to dtrace -G during compilation causes dtrace(1) to
not link drti.o into the output object file, so the USDT probes are not created
during process startup. Instead, dtrace(1) will automatically discover and
create probes on the process' behalf when attaching.

Differential Revision:	https://reviews.freebsd.org/D2203
Reviewed by:		rpaulo
MFC after:		1 month
2015-04-08 02:36:37 +00:00
mav
e5f58186a5 Add DTrace probe to the new ARC reclaim cause added in r281026.
MFC after:	1 month
2015-04-05 14:45:52 +00:00
mav
326429ebdd Make ZFS ARC track both KVA usage and fragmentation.
Even on Illumos, with its much larger KVA, ZFS ARC steps back if KVA usage
reaches certain threshold (3/4 on i386 or 16/17 otherwise).  FreeBSD has
even less KVA, but had no such limit on archs with direct map as amd64.
As result, on machines with a lot of RAM, during load with very small user-
space memory pressure, such as `zfs send`, it was possible to reach state,
when there is enough both physical RAM and KVA (I've seen up to 25-30%),
but no continuous KVA range to allocate even single 128KB I/O request.

Address this situation from two sides:
 - restore KVA usage limitations in a way the most close to Illumos;
 - introduce new requirement for KVA fragmentation, specifying that we
should have at least one sequential KVA range of zfs_max_recordsize bytes.

Experiments show that first limitation done alone is not sufficient.  On
machine with 64GB of RAM it is sometimes needed to drop up to half of ARC
size to get at leats one 1MB KVA chunk.  Statically limiting ARC to half
of KVA/RAM is too strict, so second limitation makes it to work in cycles:
accumulate trash up to certain critical mass, do massive spring-cleaning,
and then start littering again. :)

MFC after:	1 month
2015-04-03 14:45:48 +00:00
andrew
3046fbea8a Add the arm64 defines for cddl code.
Differential Revision:	https://reviews.freebsd.org/D2186
Reviewed by:	emaste
Sponsored by:	The FreeBSD Foundation
2015-04-01 08:31:56 +00:00
markj
acaed6413c Import a missing piece of commit b8fac8e162eda7e98d from illumos-gate.
This adds an upper bound, dtrace_ustackdepth_max, to the number of frames
traversed when computing the userland stack depth. Some programs - notably
firefox - are otherwise able to trigger an infinite loop in
dtrace_getustack_common(), causing a panic.

MFC after:	1 week
2015-03-30 03:55:51 +00:00
mav
4a52d77cdd Some cosmetic polishing. No functional change.
MFC after:	1 week
2015-03-29 20:28:18 +00:00
markj
d854427d60 Remove unused upstream DTrace provider implementations that are duplicates
of providers under sys/cddl/dev/. Also remove sdt_subr.c, which isn't used
in FreeBSD's SDT implementation.

Suggested by:	rwatson
2015-03-16 01:15:08 +00:00
rwatson
9f48720b49 Now that DTrace stack traces handle exception frames better, skip fewer
stack frames for FBT 'entry' probes on ARM.

MFC after:	3 days
Sponsored by:	DARPA, AFRL
2015-03-15 15:19:02 +00:00
rwatson
26e3e9bb99 On ARM, unlike some other architectures, saved $pc values from in-kernel
traps do appear in the regular call stack, rather than only in a special
trap frame, so we don't need to inject the trap-frame $pc into a returned
stack trace in DTrace.

MFC after:	3 days
Sponsored by:	DARPA, AFRL
2015-03-15 15:17:34 +00:00
rwatson
ec1dafc898 Replace the completely arbitrary '3' with '9' for the number of frames to
skip using the DTrace 'profile' provider on ARM.  This causes stack traces
to skip various driver-and callout-related things as they do on x86, where
the likewise arbitrary values are '6' (32-bit) and '10' (64-bit) for
similar sorts of reasons.

MFC after:	3 days
Sponsored by:	DARPA, AFRL
2015-03-15 14:12:40 +00:00
smh
bc1c82b63e Allow zvol_geom_worker to process BIO_DELETE's
If zvol_geom_start is called with a BIO_DELETE from a thread which can
sleep it queues it for later processing by the zvol_geom_worker. The
zvol_geom_worker didn't have a delete case so would simply loose the bio
hence preventing the original caller from every completing. In addition
an other unknown types would suffer the same fate.

Allow zvol_geom_worker to process BIO_DELETE's via zvol_strategy and
return unsupported for all unknown bio types.

MFC after:	2 weeks
Sponsored by:	Multiplay
2015-03-14 17:35:04 +00:00
mav
397e76aa8d Make DIOCGATTR in device mode handle "GEOM::candelete".
MFC after:	3 days
2015-03-12 16:19:18 +00:00
gnn
c531c598ba Add support for walltimestamp to DTrace on ARM. 2015-03-07 04:38:25 +00:00
andrew
43cb9d5cd8 dtrace_cas32 and dtrace_casptr should retrn the data loaded from target
not the new value.

Sponsored by:	ABT Systems Ltd
2015-03-05 18:03:42 +00:00
andrew
ff8a1038b7 Add the MD parts of dtrace needed to use fbt on ARM. For this we need to
emulate the instructions used in function entry and exit.

For function entry ARM will use a push instruction to push up to 16
registers to the stack. While we don't expect all 16 to be used we need to
handle any combination the compiler may generate, even if it doesn't make
sense (e.g. pushing the program counter).

On function return we will either have a pop or branch instruction. The
former is similar to the push instruction, but with care to make sure we
update the stack pointer and program counter correctly in the cases they
are either in the list of registers or not. For branch we need to take the
24-bit offset, sign-extend it, and add that number of 4-byte words to the
program counter. Care needs to be taken as, due to historical reasons, the
address the branch is relative to is not the current instruction, but 8
bytes later.

This allows us to use the following probes on ARM boards:
  dtrace -n 'fbt::malloc:entry { stack() }'
and
  dtrace -n 'fbt:🆓return { stack() }'

Differential Revision:	https://reviews.freebsd.org/D2007
Reviewed by:	gnn, rpaulo
Sponsored by:	ABT Systems Ltd
2015-03-05 17:55:31 +00:00
nwhitehorn
048b34a391 Fix build after unifying DAR/DEAR storage in trap frame. 2015-03-05 17:02:22 +00:00
rwatson
d81f712ba9 Don't all DTrace's FBT on ARM to instrument undefinedinstruction(), as
this would lead to DTrace reentrance.

Sponsored by:	DARPA, AFRL
2015-03-05 07:40:41 +00:00
andrew
f4b588d2fc Fix the dtrace ARM atomic compare-and-set functions. These functions are
expected to return the data in the memory location pointed at by target
after the operation. The FreeBSD atomic functions previously used return
either 0 or 1 to indicate if the comparison succeeded or not respectively.

With this change these functions only support ARMv6 and later are supported
by these functions.

Sponsored by:	ABT Systems Ltd
2015-03-01 10:04:14 +00:00
rstone
eac1d566b4 Allow Illumos code to co-exist with nv(9)
Differential Revision:		https://reviews.freebsd.org/D1881
Reviewed by:			jfv, will
Suggested by:			pjd
MFC after:			1 month
Sponsored by:			Sandvine Inc
2015-03-01 00:22:45 +00:00