Commit Graph

557 Commits

Author SHA1 Message Date
delphij
3344ccfa37 MFV r264666:
4374 dn_free_ranges should use range_tree_t

illumos/illumos-gate@bf16b11e8d

MFC after:	2 weeks
2014-04-18 21:15:12 +00:00
markj
2004e1b1f2 Replace a few Solarisisms with their corresponding FreeBSDisms to make a few
printf tests pass.
2014-04-15 02:32:00 +00:00
markj
c45d8f95f1 Use the correct format specifiers for wide characters and strings of wide
characters.

MFC after:	1 week
2014-04-15 02:28:08 +00:00
delphij
78e547c540 Take into account when zpool history block grows exceeding 128KB in zpool(8)
and zdb(8) by growing the buffer on demand with a cap of 1GB (specified in
spa_history_create_obj()).

PR:		bin/186574
Submitted by:	Andrew Childs <lorne cons org nz> (with changes)
MFC after:	2 weeks
2014-04-14 18:38:14 +00:00
imp
c39e6fc2c9 NO_MAN= has been deprecated in favor of MAN= for some time, go ahead
and finish the job. ncurses is now the only Makefile in the tree that
uses it since it wasn't a simple mechanical change, and will be
addressed in a future commit.
2014-04-13 05:21:56 +00:00
mav
dd28e6bbc0 Add property and sysctl to control how ZVOLs are exposed to OS.
New ZFS property volmode and sysctl vfs.zfs.vol.mode allow switching ZVOL
between three modes:
 geom -- existing fully functional behavior (default);
 dev -- exposing volumes only as raw disk device file in devfs;
 none -- not exposing volumes outside ZFS.

The "dev" mode is less functional (can't be partitioned, mounted, etc),
but it is faster, and in some scenarios with untrusted consumers safer.
It can be useful for NAS, VM block storages, etc.
The "none" mode may be convenient for backup servers, etc. that don't
need direct data access.

Due to the way ZVOL is integrated with main ZFS code, those property
and sysctl are checked only during pool import and volume creation.

MFC after:	1 month
Sponsored by:	iXsystems, Inc.
2014-04-05 13:01:44 +00:00
pfg
3c7cca5de7 MFV r258379;
4248 dtrace(1M) should never create DOF with empty probes section
4249 Only probes from the first DTrace object file will be included

Illumos Revision:	4a20ab41aadcb81c53e72fc65886e964e9add59

Reference:
https://www.illumos.org/issues/4248
https://www.illumos.org/issues/4249

Obtained from:	Illumos
MFC after:	1 month
2014-04-02 15:32:44 +00:00
delphij
b26390b635 MFV r263887:
3993 zpool(1M) and zfs(1M) should support -p for "list" and "get"
4700 "zpool get" doesn't support -H or -o options

MFC after:	2 weeks
2014-03-28 23:12:00 +00:00
delphij
3779db6b3f MFV 263436-263438:
3947 zpool(1M) references nonexistent zfs-features(5)
  4540 zpool(1M) man page doesn't describe "readonly" property
  3948 zfs sync=default is not accepted
  4611 zfs(1M) still mentions 'send -r' in synopsis
  4415 zpool(1M) man page missing "import -m" description
  4570 Document dedupditto pool property
  4572 Dedup-related documentation additions for zpool and zdb.
  1371 Add -D option description to zpool(1M) manpage
  4571 Add documentation for -T and interval to "zpool list"

MFC after:	2 weeks
2014-03-21 01:32:25 +00:00
delphij
f68728a475 Remove unused option -r from zpool.
Submitted by:	Richard Yao <ryao gentoo org>
MFC after:	2 weeks
2014-03-19 23:04:52 +00:00
asomers
2c64d4aa1f cddl/contrib/opensolaris/lib/libuutil/common/uu_avl.c
Fix a memory leak in uu_avl_pool_create: pthread_mutex_init without
	a corresponding pthread_mutex_destroy.  It shows up, among other
	places, when doing "zfs list".

MFC after:	3 weeks
Sponsored by:	Spectra Logic Corporation
2014-03-07 23:01:35 +00:00
jmg
731420f10e mark that libctf depends upon libz so that if you dlopen libctf, you
don't get:
Undefined symbol "zError"
2014-03-05 23:37:25 +00:00
markj
7e4d144851 When our linker merges .SUNW_dof sections from multiple files, it simply
concatenates the DOF tables into one section. Previously, the USDT init
code in drti.o would only look at the first table in the DOF section; with
this change, it iterates over all the tables, passing each DOF table to
the kernel.

PR:		186821
Submitted by:	Fedor Indutny <fedor@indutny.com>
MFC after:	1 month
2014-03-01 23:09:07 +00:00
markj
a7aada62d5 4478 dtrace_dof_maxsize is far too small
illumos/illumos-gate@d339a29bb4

PR:		187027
MFC after:	1 week
2014-02-28 02:04:41 +00:00
delphij
8d9721842c MFV r262570:
4626 libzfs memleak in zpool_in_use()

illumos/illumos-gate@fb13f48f1d

MFC after:	2 weeks
2014-02-27 21:50:46 +00:00
markj
451c3aecb6 Move some files that are identical on i386 and amd64 to an x86 subdirectory
rather than keeping duplicate copies.

Discussed with:	avg
MFC after:	1 week
2014-02-27 01:04:35 +00:00
markj
428d834b5b 1452 DTrace buffer autoscaling should be less violent
illumos/illumos-gate@6fb4854bed

This fixes the tst.resize1.d and tst.resize2.d DTrace tests, which have
been failing since r261122 since they were causing dtrace(1) to attempt to
allocate and use large amounts of memory, and get killed by the OOM killer
as a result.

MFC after:	1 month
2014-02-22 05:18:55 +00:00
feld
69dcdaba95 Fix formatting.
"Manpages should start a new sentence on a new line. This makes it easier
for translators to track changes." -jhb

Approved by:	jhb
MFC after:	3 days
Sponsored by:	SupraNet Communications, Inc
2014-02-17 13:23:49 +00:00
avg
3cb7f1978f zfs.8: fix garbled options in a sample zfs send -R command line
MFC after:	5 days
2014-02-14 15:21:21 +00:00
avg
282231056f zpool.8: fix typo in option description of labelclear command
MFC after:	5 days
2014-02-14 15:20:49 +00:00
feld
3f843d7a02 Add caveat to zpool manpage indicating that we do not automatically activate
hot spares. This should be MFC'd to all STABLE branches.

Upon the availability of zfsd, the zpool manpage on relevant branches should
be updated to remove this caveat and document hot spare's reliance on zfsd.

Approved by:	avg
MFC after:	1 week
Sponsored by:	SupraNet Communications
2014-02-11 15:38:29 +00:00
kaiw
f8038f60cd Only declare `bysz' variable under little endian archs. 2014-01-29 09:58:05 +00:00
kaiw
25e9e96645 MFH@261151. 2014-01-25 14:02:02 +00:00
kaiw
82f9b341c7 Simplify DWARF version check.
Submitted by:	emaste
2014-01-25 11:27:09 +00:00
avg
206718cf52 dtrace: remove unexplained 16MB limitation from dt_alloc/dt_zalloc
The limitation was introduced in r178556 without any note or comment.
It seems pretty artificial and now it leads to problems like the following:
  $ dtrace -x bufsize=17m -n ...
  dtrace: processing aborted: Memory allocation failure
OpenSolaris and illumos never had this limitation.

Sponsored by:	HybridCluster
2014-01-24 15:04:02 +00:00
kaiw
aae1bb2a7d Let ctfconvert accept DWARF version 3 and 4. 2014-01-22 13:43:54 +00:00
kaiw
e9c152dbc2 MFH@260917. 2014-01-20 19:38:44 +00:00
kaiw
cb3a8568bd Clang 3.4 will sometimes emit DIE for struct/union member before
emitting the DIE for the type of that member. ctfconvert can not
handle this properly and will calculate a wrong member bit offset.
Same struct/union type from different .o file will be treated as
different types when their member bit offsets are different, and
gets added/merged multiple times. This will in turn cause many other
structs/pointers/typedefs that refer to the duplicated struct/union
gets added/merged multiple times and eventually causes numerous
duplicated CTF types in the kernel.debug file.

The simple workaround here is to make use of DW_AT_byte_size attribute
of the member DIE to calculate the bits occupied by the member's type,
without actually resolving the type.
2014-01-20 01:35:14 +00:00
kaiw
9ccc946cdb * Make die_mem_offset() be able to handle DW_AT_data_member_location
attributes generated by Clang 3.4.
* Document how different compilers generate DW_AT_data_member_location
  attributes differently.
* Document the quirks about DW_FORM_data[48].
2014-01-19 13:48:02 +00:00
avg
e1e8b09fde zdb -R: do not treat numeric parameters to a flag as more flags
Reviewed by:	Matthew Ahrens <mahrens@delphix.com>
MFC after:	1 week
2014-01-17 10:18:45 +00:00
kaiw
844d940ffd We should not set the unnamed DIE's name to "__anon__" since that will
bring back a known issue with DTrace regarding type name
comparison. Instead, we can set the name to an empty string.

Pointed out by:	     avg
2014-01-17 08:44:12 +00:00
kaiw
403645c54c If function die_name() finds a DIE without a name, set its name to
"__anon__". This hack is used to workaround a issue that compilers
like GCC could generate DW_TAG_base_type DIE without a name.

Note that we didn't need this before because the old libdwarf
internally set all the unnamed DIE's name to "__anon__".
2014-01-16 22:28:33 +00:00
kaiw
ac8de91b29 Convert ctfconvert to use the new libdwarf API. 2014-01-16 21:56:05 +00:00
avg
f9bc6dc89c zinject must use ioctl(2) compatibility wrapper
MFC after:	8 days
Sponsored by:	HybridCluster
2014-01-16 12:21:21 +00:00
delphij
5137277761 MFV r260154 + 260182:
4369 implement zfs bookmarks
4368 zfs send filesystems from readonly pools

Illumos/illumos-gate@78f1710053

MFC after:	2 weeks
2014-01-02 07:34:36 +00:00
delphij
bc4b7f4186 MFV r260152:
4208 Typo in zfs_main.c: "posxiuser"

illumos/illumos-gate@f38cb554a5

Note: this is a stripped down version of Illumos change.

MFC after:	2 weeks
2014-01-01 01:23:40 +00:00
delphij
82c2441b7d MFV r259170:
4370 avoid transmitting holes during zfs send

4371 DMU code clean up

illumos/illumos-gate@43466aae47

NOTE: Make sure the boot code is updated if a zpool upgrade is
done on boot zpool.

MFC after:	2 weeks
2014-01-01 00:45:28 +00:00
delphij
fbdc3a9423 MFV r258972:
4373 add block contents print to zstreamdump

illumos/illumos-gate@994fb6b8a9

MFC after:	2 weeks
2013-12-31 21:37:24 +00:00
delphij
ab4bdae837 MFV r242733:
3306 zdb should be able to issue reads in parallel
3321 'zpool reopen' command should be documented in the man page
and help message

illumos/illumos-gate@31d7e8fa33

FreeBSD porting notes: the kernel part of this changeset depends
on Solaris buf(9S) interfaces and are not really applicable for
our use.  vdev_disk.c is patched as-is to reduce diverge from
upstream, but vdev_file.c is left intact.

MFC after:	2 weeks
2013-12-31 19:39:15 +00:00
markj
b05e082ecf When clearing relocations to __dtrace* symbols, handle both SHT_REL and
SHT_RELA sections properly instead of assuming that the relocation section
is of type SHT_REL.

Submitted by:	Prashanth Kumar <pra_udupi@yahoo.co.in> (original version)
MFC after:	1 month
2013-12-29 19:27:32 +00:00
delphij
ea2640845c MFV r258384:
2583 Add -p (parsable) option to zfs list

illumos/illumos-gate@43d68d68c1

MFC after:	2 weeks
2013-12-25 00:39:04 +00:00
delphij
5bdae47300 Fix incorrect markup introduced in r259813.
Pointy hat to:	delphij
X-MFC-after:	r259813
2013-12-24 07:27:55 +00:00
delphij
66a5ee8e14 MFV r258374:
4171 clean up spa_feature_*() interfaces

4172 implement extensible_dataset feature for use by other zpool
features

illumos/illumos-gate@2acef22db7

MFC after:	2 weeks
2013-12-24 07:14:25 +00:00
delphij
b045f69bf5 MFV r258373:
4168 ztest assertion failure in dbuf_undirty

4169 verbatim import causes zdb to segfa
4170 zhack leaves pool in ACTIVE state

illumos/illumos-gate@7fdd916c47

MFC after:	2 weeks
2013-12-24 06:56:17 +00:00
mav
057ae4aad3 Don't even try to read vdev labels from devices smaller then SPA_MINDEVSIZE
(64MB).  Even if we would find one somehow, ZFS kernel code rejects such
devices.  It is funny to look on attempts to read 4 256K vdev labels from
1.44MB floppy, though it is not very practical and quite slow.
2013-12-10 12:36:44 +00:00
delphij
aee9275df5 Don't panic when we get ZPOOL_STATUS_NON_NATIVE_ASHIFT
while listing importable pools.

MFC after:	3 days
2013-12-09 18:52:21 +00:00
joel
49a4972ff7 mdoc: remove EOL whitespace. 2013-12-06 21:22:33 +00:00
markj
231dd1c8cb Enable some previously-disabled DTrace tests for umod, ufunc and usym. They
expect the installed ksh binary to be named "ksh", which is not the case
when it's installed on FreeBSD via the shells/ksh93 port. Allow for it to be
"ksh93" as well so that the tests can actually pass.
2013-12-04 01:40:39 +00:00
markj
11fb004776 The uaddr, ufunc, umod and usym functions all seem to work as expected on
FreeBSD, so stop hiding them behind a "#if defined(sun)".

Reported by:	Prashanth Kumar <pra_udupi@yahoo.co.in>
2013-12-04 01:35:04 +00:00
markj
80b61d514a Use mkstemp(3) to create the temporary file used in the FreeBSD-specific
portions of dtrace_program_link().
2013-12-03 03:40:47 +00:00
avg
9932b97e88 MFV r258371,r258372: 4101 metaslab_debug should allow for fine-grained control
4101 metaslab_debug should allow for fine-grained control
4102 space_maps should store more information about themselves
4103 space map object blocksize should be increased
4104 ::spa_space no longer works
4105 removing a mirrored log device results in a leaked object
4106 asynchronously load metaslab

illumos/illumos-gate@0713e232b7

Note that some tunables have been removed and some new tunables have
been added.  Of particular note, FreeBSD-only knob
vfs.zfs.space_map_last_hope is removed as it was a nop for some time now
(after one of the previous merges from upstream).

MFC after:	11 days
Sponsored by:	HybridCluster [merge]
2013-11-28 19:37:22 +00:00
avg
51e896cc78 MFV r255255: 4045 zfs write throttle & i/o scheduler performance work
illumos/illumos-gate@69962b5647

Please note the following changes:
- zio_ioctl has lost its priority parameter and now TRIM is executed
  with 'now' priority
- some knobs are gone and some new knobs are added; not all of them are
  exposed as tunables / sysctls yet

MFC after:	10 days
Sponsored by:	HybridCluster [merge]
2013-11-26 09:57:14 +00:00
avg
d1f2a86401 734 taskq_dispatch_prealloc() desired
943 zio_interrupt ends up calling taskq_dispatch with TQ_SLEEP
illumos/illumos-gate@5aeb94743e

Essentially FreeBSD taskqueues already operate in a mode that
was added to Illumos with taskq_dispatch_ent change.
We even exposed the superior FreeBSD interface as taskq_dispatch_safe.
Now we just rename taskq_dispatch_safe to taskq_dispatch_ent and
struct struct ostask to taskq_ent_t, so that code differences will be
minimal.

After this change sys/cddl/compat/opensolaris/sys/taskq.h header is no
longer needed.

Note that this commit is not an MFV because the upstream change was not
individually committed to the vendor area.

MFC after:	8 days
2013-11-26 09:26:18 +00:00
jhibbits
eda0b2bf25 Use 'int' to store the return value of getopt(), rather than char.
On some architectures (powerpc), char is unsigned by default, which means
comparisons against -1 always fail, so the programs get stuck in an
infinite loop.

MFC after:	1 week
2013-11-20 01:42:29 +00:00
markj
6f96d8451e Don't try to use the 32-bit drti.o unless the data model is explicitly set
to ILP32. Otherwise dtrace -G will attempt to use it on amd64 if it can't
determine which data model to use, which happens when -64 is omitted and
no object files are provided, e.g. with

# dtrace -G -n BEGIN

This would result in a linker error, but now works properly.

Also remove an unnecessary #ifdef.

MFC after:	2 weeks
2013-11-09 04:38:16 +00:00
sbruno
3d7ba62a0b Quiesce warning assigning to void * from const ctf_header_t * by explicity casting
to void * before assignment.

Submitted as Illumos issue 4287
2013-11-04 21:32:07 +00:00
sbruno
029584a9a1 spelling in comments fixup
Submitted by:		Joerg Sonnenberger <joerg@britannica.bec.de>
2013-11-04 19:32:35 +00:00
sbruno
189f0aae34 Quiesce warning regarding %llf which has no effect.
Submitted as illumos issue #4284

Reviewed by:	delphij
2013-11-04 16:15:43 +00:00
sbruno
8fc848171e This library uses macros to define fprintf behvavior for several object types
The compiler will see the non-string literal arguments to the fprintf calls and
omit warnings for them. Quiese these warnings in contrib code:

cddl/contrib/opensolaris/lib/libnvpair/libnvpair.c:743:12: warning: format
  string is not a string literal (potentially insecure) [-Wformat-security]
  ARENDER(pctl, nvlist_array, nvl, name, val, nelem);
2013-11-03 21:05:44 +00:00
markj
a5afef3ae8 If the initial attempt to open /dev/ksyms fails, kldload the ksyms module
and retry.
2013-10-27 16:18:48 +00:00
markj
9d398866fd Convert the lockstat(1) man page to mdoc and make sure that it gets
installed. Additionally, remove Solaris-specific sections and references,
and replace example outputs with output from lockstat on FreeBSD, since
lockstat's output contains stack traces.

This change also removes some examples that don't seem to work properly on
FreeBSD. The examples should be re-added when lockstat is fixed.

Reported by:	avg
MFC after:	1 week
2013-10-27 16:01:11 +00:00
smh
a5552b83b2 Added support for the 'zfs list -t snap' and 'zfs snap' aliases which are
available under Oracle Solaris 11.

This includes an update to the ZFS(8) man page to reflect all the
available alias (snap, umount, and recv).

Initial changes obtained from ZFS On Linux + fixes for man page and cmd
help:
10b75496bb
cf81b00a73

Obtained from:	https://github.com/zfsonlinux/zfs
MFC after:	2 weeks
Sponsored by:	Multiplay
2013-10-23 18:22:27 +00:00
markj
3ecc6f1298 Add a function, memstr, which can be used to convert a buffer of
null-separated strings to a single string. This can be used to print the
full arguments of a process using execsnoop (from the DTrace toolkit) or
with the following one-liner:

dtrace -n 'syscall::execve:return {trace(curpsinfo->pr_psargs);}'

Note that this relies on the process arguments being cached via the struct
proc, which means that it will not work for argvs longer than
kern.ps_arg_cache_limit. However, the following rather non-portable
script can be used to extract any argv at exec time:

fbt::kern_execve:entry
{
    printf("%s", memstr(args[1]->begin_argv, ' ',
        args[1]->begin_envv - args[1]->begin_argv));
}

The debug.dtrace.memstr_max sysctl limits the maximum argument size to
memstr(). Thanks to Brendan Gregg for helpful comments on freebsd-dtrace.

Tested by:	Fabian Keil (earlier version)
MFC after:	2 weeks
2013-10-16 01:39:26 +00:00
jhibbits
0b9629ab6a Add fasttrap for PowerPC. This is the last piece of the dtrace/ppc puzzle.
It's incomplete, it doesn't contain full instruction emulation, but it should be
sufficient for most cases.

MFC after:	1 month
2013-10-15 15:00:29 +00:00
markj
0b5457bf66 Convert the dtrace(1) man page to mdoc and fix up some aspects of it that
don't make sense on FreeBSD. In particular,

- remove the ATTRIBUTES section,
- remove references to the Solaris Dynamic Tracing Guide, except in the
  SEE ALSO section,
- update the description of the -A option for FreeBSD's implementation,
- remove references to Solaris-specific programs and configuration files,
  and replace them with FreeBSD equivalents where possible.

The content has not changed aside from this.

Approved by:	re (joel)
MFC after:	1 week
2013-10-10 03:50:23 +00:00
rmh
774f6b00bf Fix implicit declaration of jail_getid()
Approved by:	re
2013-10-07 14:22:19 +00:00
markj
8ff2d52009 Add a separate translator for headers passed to the TCP probes in the
input path. These probes get some of the fields in host order, whereas the
output probes get them in network order, so a single translator isn't
enough. This workaround ensures that the problem is essentially invisble
to users: none of the probe arguments or their fields have changed.

Approved by:	re (hrs)
2013-10-02 17:14:12 +00:00
delphij
d04a7f0144 MFV r254750:
Add support of Illumos dumps on zvol over RAID-Z.

Note that this only adds the features.  FreeBSD would
still need more work to support dumping on zvols.

Illumos ZFS issues:
  2932 support crash dumps to raidz, etc. pools

MFC after:	1 month
Approved by:	re (ZFS blanket)
2013-09-21 00:17:26 +00:00
markj
37ef3ce41e Use the address of the inpcb rather than the tcpcb to identify TCP
connections. This keeps the tcp provider consistent with the other network
providers.

Approved by:	re (delphij)
2013-09-15 21:38:46 +00:00
delphij
a23043347c MFV r247844 (illumos-gate 13975:ef6409bc370f)
Illumos ZFS issues:
  3582 zfs_delay() should support a variable resolution
  3584 DTrace sdt probes for ZFS txg states

Provide a compatibility shim for Solaris's cv_timedwait_hires
to help aid future porting.

Approved by:	re (ZFS blanket)
2013-09-10 01:46:47 +00:00
will
1b508b8cc8 Build all ZFS testing & debugging tools with -g.
These programs and everything using libzpool rely on the embedded asserts to
verify the correctness of operations.  Given that, the core dumps would be
useless without debug symbols.
2013-08-27 04:01:31 +00:00
pfg
615f223c07 Merge various CTF fixes from illumos
2942 CTF tools need to handle files which legitimately lack data
2978 ctfconvert still needs to ignore legitimately dataless files on SPARC

Illumos Revisions:	13745:6b3106b4250f
			13754:7231b684c18b

Reference:

https://www.illumos.org/issues/2942
https://www.illumos.org/issues/2978

MFC after:	3 weeks
2013-08-26 22:29:42 +00:00
markj
29e4661920 Implement the ip, tcp, and udp DTrace providers. The probe definitions use
dynamic translation so that their arguments match the definitions for
these providers in Solaris and illumos. Thus, existing scripts for these
providers should work unmodified on FreeBSD.

Tested by:	gnn, hiren
MFC after:	1 month
2013-08-25 21:54:41 +00:00
delphij
a4cf8ab508 MFV r254751:
Don't treat the parameter as a number (pool GUID) when there is
error converting it from string, instead, treat it as the pool
name.

Illumos ZFS issues:
  1765 assert triggered in libzfs_import.c trying to import pool
       name beginning with a number
2013-08-24 00:54:47 +00:00
delphij
40685a92a9 MFV r254748:
Fix memory leak in libzfs's iter_dependents_cb().

Illumos ZFS issues:
  4061 libzfs: memory leak in iter_dependents_cb()
2013-08-24 00:29:34 +00:00
delphij
677bfa2265 MFV r254746:
To quote original Illumos ticket:

libctf thinks that any ELF file containing more than 65536 sections is
corrupt, because it doesn't understand the SHN_XINDEX magic.

Illumos DTrace issues:
  4005 libctf can't deal with extended sections
2013-08-23 23:58:56 +00:00
delphij
5017a032d2 MFV r254422:
Illumos DTrace issues:
  3089 want ::typedef
  3094 libctf should support removing a dynamic type
  3095 libctf does not validate arrays correctly
  3096 libctf does not validate function types correctly
2013-08-23 23:21:24 +00:00
gibbs
bd47afb289 Enhance the ZFS vdev layer to maintain both a logical and a physical
minimum allocation size for devices.  Use this information to
automatically increase ZFS's minimum allocation size for new top-level
vdevs to a value that more closely matches the optimum device
allocation size.

Use GEOM's stripesize attribute, if set, as the physical sector
size of the GEOM.

Calculate the minimum blocksize of each metaslab class.  Use the
calculated value instead of SPA_MINBLOCKSIZE (512b) when determining
the likelyhood of compression yeilding a reduction in physical space
usage.

Report devices with sub-optimal block size configuration in "zpool
status".  Also properly fail attempts to attach devices with a
logical block size greater than 8kB, since this will cause corruption
to ZFS's label area.

Sponsored by:	Spectra Logic Corporaion
MFC after:	2 weeks

Background
==========
Many modern devices use physical allocation units that are much
larger than the minimum logical allocation size accessible by
external commands.  Two prevalent examples of this are 512e disk
drives (512b logical sector, 4K physical sector) and flash devices
(512b logical sector, 4K or larger allocation block size, and 128k
or larger erase block size).  Operations that modify less than the
physical sector size result in a costly read-modify-write or garbage
collection sequence on these devices.

Simply exporting the true physical sector of the device to ZFS would
yield optimal performance, but has two serious drawbacks:

1) Existing pools created with devices that have different logical
   and physical block sizes, but were configured to use the logical
   block size (e.g. because the OS version used for pool construction
   reported the logical block size instead of the physical block
   size) will suddenly find that the vdev allocation size has
   increased.  This can be easily tolerated for active members of
   the array, but ZFS would prevent replacement of a vdev with
   another identical device because it now appears that the smaller
   allocation size required by the pool is not supported by the new
   device.

2) The device's physical block size may be too large to be supported
   by ZFS.  The optimal allocation size for the vdev may be quite
   large.  For example, a RAID controller may export a vdev that
   requires read-modify-write cycles unless accessed using 64k
   aligned/sized requests.  ZFS currently has an 8k minimum block
   size limit.

Reporting both the logical and physical allocation sizes for vdevs
solves these problems.  A device may be used so long as the logical
block size is compatible with the configuration.  By comparing the
logical and physical block sizes, new configurations can be optimized
and administrators can be notified of any existing pools that are
sub-optimal.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h:
	Add the SPA_ASHIFT constant.  ZFS currently has a hard upper
	limit of 13 (8k) for ashift and this constant is used to
	both document and enforce this limit.

sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h:
	Add the VDEV_AUX_ASHIFT_TOO_BIG error code.

	Add fields for exporting the configured, logical, and
	physical ashift to the vdev_stat_t structure.

	Add VDEV_STAT_VALID() macro which can be used to verify the
	presence of required vdev_stat_t fields in nvlist data.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:
	Provide a SYSCTL_PROC handler for "max_auto_ashift".  Since
	the limit is only referenced long after boot when a create
	operation occurs, there's no compelling need for it to be
	a boot time configurable tunable.  This also allows the
	validation code for the max_auto_ashift value to be contained
	within the sysctl handler.

	Populate the new fields in the vdev_stat_t structure.

	Fail vdev opens if the vdev reports an ashift larger than
	SPA_MAXASHIFT.

	Propogate vdev_logical_ashift and vdev_physical_ashift between
	child and parent vdevs as is done for vdev_ashift.

	In vdev_open(), restore code that fails opens for devices
	where vdev_ashift grows.  This can only happen now if the
	device's logical ashift grows, which means it really isn't
	safe to use the device.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_missing.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_root.c:
	Update the vdev_open() API so that both logical (what was
	just ashift before) and physical ashift are reported.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h:
	Add two new fields, vdev_physical_ashift and vdev_logical_ashift,
	to vdev_t.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:
	Add vdev_ashift_optimize().  Call it anytime a new top-level
	vdev is allocated.

cddl/contrib/opensolaris/cmd/zpool/zpool_main.c:
	Add text for the VDEV_AUX_ASHIFT_TOO_BIG error.

	For each sub-optimally configured leaf vdev, report configured
	and native block sizes.

cddl/contrib/opensolaris/cmd/zpool/zpool_main.c:
cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h:
cddl/contrib/opensolaris/lib/libzfs/common/libzfs_status.c:
	Introduce a new zpool status: ZPOOL_STATUS_NON_NATIVE_ASHIFT.
	This status is reported on healthy pools containing vdevs
	configured to use a block size smaller than their reported
	physical block size.

cddl/contrib/opensolaris/lib/libzfs/common/libzfs_status.c:
	Update find_vdev_problem() and supporting functions to
	provide the full vdev_stat_t structure to problem checking
	routines, and to allow decent into replacing vdevs.

	Add a vdev_non_native_ashift() validator which is used on
	the full vdev tree to check for ZPOOL_STATUS_NON_NATIVE_ASHIFT.

cddl/contrib/opensolaris/lib/libzpool/common/kernel.c:
cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h:
	Enhance sysctl userland stubs now that a SYSCTL_PROC handler
	is used in vdev.c.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab_impl.h:
	When the group membership of a metaslab class changes (i.e.
	when a vdev is added or removed from a pool), walk the group
	list to determine the smallest block size currently available
	and record this in the metaslab class.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c:
	Add the metaslab_class_get_minblocksize() accessor.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio_compress.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio_compress.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:
	In zio_compress_data(), take the minimum blocksize as an
	input parameter instead of assuming SPA_MINBLOCKSIZE.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:
	In l2arc_compress_buf(), pass SPA_MINBLOCKSIZE as the minimum
	blocksize of the device.  The l2arc code performs has it's own
	code for deciding if compression is worth while, so this
	effectively disables zio_compress_data() from second guessing
	the original decision.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:
	In zio_write_bp_init(), use the minimum blocksize of the
	normal metaslab class when compressing data.
2013-08-21 04:10:24 +00:00
delphij
1b0e7b9e07 MFV r254421:
Illumos ZFS issues:
  3996 want a libzfs_core API to rollback to latest snapshot
2013-08-21 00:04:31 +00:00
rpaulo
10427f9082 Load the dtraceall module if /dev/dtrace/dtrace doesn't exist.
MFC after:	3 days
2013-08-10 23:17:09 +00:00
delphij
ccc3c4970e MFV r254079:
Illumos ZFS issues:
  3957 ztest should update the cachefile before killing itself
  3958 multiple scans can lead to partial resilvering
  3959 ddt entries are not always resilvered
  3960 dsl_scan can skip over dedup-ed blocks if
       physical birth != logical birth
  3961 freed gang blocks are not resilvered and can cause pool to suspend
  3962 ztest should print out zfs debug buffer before exiting
2013-08-08 23:38:31 +00:00
delphij
7291294314 MFV r254071:
Fix a regression introduced by fix for Illumos bug #3834.  Quote from
Matthew Ahrens on the Illumos issue:

ztest fails this assertion because ztest_dmu_read_write() does
        dmu_tx_hold_free(tx, bigobj, bigoff, bigsize);
and then
    dmu_object_set_checksum(os, bigobj,
        (enum zio_checksum)ztest_random_dsl_prop(ZFS_PROP_CHECKSUM), tx);

If the region to free is past the end of the file, the DMU assumes that there
will be nothing to do for this object.  However, ztest does set_checksum(),
which must modify the dnode.  The fix is for ztest to also call

    dmu_tx_hold_bonus(tx, bigobj);

so we can account for the dirty data associated with setting the checksum

Illumos ZFS issues:
  3955 ztest failure: assertion refcount_count(&tx->tx_space_written)
         + delta <= tx->tx_space_towrite
2013-08-07 22:21:00 +00:00
delphij
0171695909 MFV r254070:
Merge vendor bugfix for ZFS test suite that triggers false positives.

Illumos ZFS issues:
  3949 ztest fault injection should avoid resilvering devices
  3950 ztest: deadman fires when we're doing a scan
  3951 ztest hang when running dedup test
  3952 ztest: ztest_reguid test and ztest_fault_inject don't place nice together
2013-08-07 21:16:14 +00:00
rmh
5933e25518 Fix implicit declaration of warnx(). 2013-08-04 16:25:46 +00:00
delphij
c5affee6a3 MFV r253781 + r253871:
Illumos ZFS issues:
  3894 zfs should not allow snapshot of inconsistent dataset

MFC after:	2 weeks
2013-07-30 21:02:09 +00:00
smh
ef92cf9910 MFV r253784:
Fix zfs send -D hang after processing requiring a CTRL+C to interrupt due to
pthread_join prior to fd close.

This was introduced by r251646 (MFV r251644)

Illumos ZFS issue:
  3909 "zfs send -D" does not work

MFC after:	1 day
2013-07-30 20:45:27 +00:00
pfg
03e98d6f13 DTrace: re-apply r249426 now that the underlying issues have been solved.
Merge change from illumos:

3519 DTrace fails to resolve const types from fbt
3520 dtrace internal error -- token type 316 is not a valid D
     compilation token
3521 clean up dtrace unit tests

Illumos Revision:	e98f46c

Reference:
https://www.illumos.org/issues/3519
https://www.illumos.org/issues/3520
https://www.illumos.org/issues/3521

Tested by:	Fabian Keil
Obtained from:	Illumos
MFC after:	1 month
2013-07-28 01:02:17 +00:00
pfg
46097436dc DTrace: re-merge remainder of r249367 (original from Illumos).
Bring back some important fixes from Illumos:

3022 DTrace: keys should not affect the sort order when sorting by value
3023 it should be possible to dereference dynamic variables
3024 D integer narrowing needs some work

We particularly avoid the LD_NOLAZYLOAD changes that Illumos made
as those don't apply to FreeBSD and were causing problems in
interactive mode.

Illumos Revision:	13758:23432da34147

Reference:

https://www.illumos.org/issues/3022
https://www.illumos.org/issues/3023
https://www.illumos.org/issues/3024

MFC after:	1 month
Tested by:	markj
2013-07-28 00:45:20 +00:00
markj
401af7020e Use kern_ioctl() rather than ioctl() for testing the FBT provider, since the
latter doesn't exist in FreeBSD. All the tests under fbtprovider pass now.
2013-07-27 21:31:48 +00:00
pfg
923a920777 Style issue in r253661.
Pointed out by:	avg
MFC after:	1 month
2013-07-26 14:37:23 +00:00
pfg
34504eb8a9 Fix a segfault in ctfmerge due to a bug in gcc.
GCC can generate bogus dwarf attributes with DW_AT_byte_size
set to 0xFFFFFFFF.
The issue was originaly detected in NetBSD but it has been
adapted for portability and to avoid compiler warnings.

Reference:
https://www.illumos.org/issues/3776

Obtained from:	NetBSD
MFC after:	1 month
2013-07-26 00:28:19 +00:00
delphij
5a5f4c42fb Manually merge part of vendor import r238583 from Illumos.
Illumos changeset: 13680:2bd022a765e2
Illumos ZFS issue:

    2671 zpool import should not fail if vdev ashift has increased

MFC after:	3 days
2013-07-18 00:22:42 +00:00
mm
ddbb7a25f4 Fix misleading or remove irrelevant illumos messages and manpage references
in the zfs command.

PR:		bin/178996
Submitted by:	Peter Schaefer <peter.schaefer@wilhelmheinrichs.de>
MFC after:	3 days
2013-07-04 22:26:38 +00:00
delphij
4dfc3c75a2 MFV r252215:
Restore a previous behavior before r251646, where when destructing
ZFS snapshot, the ioctl would return ENOENT when it hit any of
them in the errlist (the new behavior was only return ENOENT when
all returns error).

Illumos ZFS issues:
  3829 fix for 3740 changed behavior of zfs destroy/hold/release ioctl

MFC after:	1 week
2013-06-25 22:14:32 +00:00
delphij
5240466227 Diff reduction against Illumos, no real change to code itself.
This marks vendor branch revision 252213 as merged, the actual code was
committed in r245479.

MFC after:	1 week
2013-06-25 21:51:52 +00:00
smh
08b5f9540c Fixed ZFS zpool freeze (debug command) not processing due to invalid ioctl call syntax.
MFC after:	1 week
2013-06-21 15:30:46 +00:00
delphij
ab3dbcb998 MFV r251644:
Poor ZFS send / receive performance due to snapshot
hold / release processing (by smh@)

Illumos ZFS issues:
  3740 Poor ZFS send / receive performance due to snapshot
       hold / release processing

MFC after:      2 weeks
2013-06-12 07:07:06 +00:00
delphij
421d823726 MFV r251624:
txg commit callbacks don't work

Illumos ZFS issues:
  3747 txg commit callbacks don't work

MFC after:      2 weeks
2013-06-11 19:29:31 +00:00
delphij
8fe6a06d58 MFV r251623:
zpool create should treat -O mountpoint and -m the same

Illumos ZFS issues:
  3745 zpool create should treat -O mountpoint and -m the same

MFC after:      2 weeks
2013-06-11 19:25:49 +00:00
delphij
9d0815fcd1 MFV r251619:
ZFS needs better comments.

Illumos ZFS issues:
  3741 zfs needs better comments

MFC after:      2 weeks
2013-06-11 19:02:36 +00:00