477 Commits

Author SHA1 Message Date
delphij
7cb0f49ed2 MFC r264835 (MFV r264829):
3897 zfs filesystem and snapshot limits
2014-05-09 07:12:31 +00:00
delphij
0fce5e81cc MFC r264669: MFV r264666:
4374 dn_free_ranges should use range_tree_t

illumos/illumos-gate@bf16b11e8d
2014-05-09 06:56:26 +00:00
mav
44963562b0 MFC r264145:
Add property and sysctl to control how ZVOLs are exposed to OS.

New ZFS property volmode and sysctl vfs.zfs.vol.mode allow switching ZVOL
between three modes:
 geom -- existing fully functional behavior (default);
 dev -- exposing volumes only as raw disk device file in devfs;
 none -- not exposing volumes outside ZFS.

The "dev" mode is less functional (can't be partitioned, mounted, etc),
but it is faster, and in some scenarios with untrusted consumers safer.
It can be useful for NAS, VM block storages, etc.
The "none" mode may be convenient for backup servers, etc. that don't
need direct data access.

Due to the way ZVOL is integrated with main ZFS code, those property
and sysctl are checked only during pool import and volume creation.
2014-05-08 13:12:24 +00:00
smh
c84c4e2d80 MFC r264851
Eliminated optarg global being used outside of the function which called getopt

Sponsored by:	Multiplay
2014-05-08 08:17:12 +00:00
markj
607e8b47f9 MFC r262542:
Move some files that are identical on i386 and amd64 to an x86 subdirectory
rather than keeping duplicate copies.
2014-05-03 16:08:52 +00:00
pfg
7c82f25917 MFC r264040:
4248 dtrace(1M) should never create DOF with empty probes section
4249 Only probes from the first DTrace object file will be included

Illumos Revision:	4a20ab41aadcb81c53e72fc65886e964e9add59

Reference:
https://www.illumos.org/issues/4248
https://www.illumos.org/issues/4249

Obtained from:	Illumos
2014-05-02 20:12:31 +00:00
delphij
e852cd6938 MFC r264467:
Take into account when zpool history block grows exceeding 128KB in zpool(8)
and zdb(8) by growing the buffer on demand with a cap of 1GB (specified in
spa_history_create_obj()).

PR:		bin/186574
Submitted by:	Andrew Childs <lorne cons org nz> (with changes)
2014-04-28 06:11:03 +00:00
jmmv
797209d767 MFC r264741: Add placeholder Kyuafiles for various top-level hierarchies.
This is "make tinderbox" clean.
2014-04-28 04:20:14 +00:00
markj
ec059ac886 MFC r262596:
4478 dtrace_dof_maxsize is far too small

illumos/illumos-gate@d339a29bb4
2014-04-23 03:26:29 +00:00
delphij
d336d68dfb MFC r263889 (MFV r263887):
3993 zpool(1M) and zfs(1M) should support -p for "list" and "get"
4700 "zpool get" doesn't support -H or -o options
2014-04-11 01:27:33 +00:00
delphij
f4b8b0fe7c MFC r263459: MFV 263436-263438:
3947 zpool(1M) references nonexistent zfs-features(5)
4540 zpool(1M) man page doesn't describe "readonly" property
3948 zfs sync=default is not accepted
4611 zfs(1M) still mentions 'send -r' in synopsis
4415 zpool(1M) man page missing "import -m" description
4570 Document dedupditto pool property
4572 Dedup-related documentation additions for zpool and zdb.
1371 Add -D option description to zpool(1M) manpage
4571 Add documentation for -T and interval to "zpool list"
2014-04-11 01:23:46 +00:00
delphij
60e600a585 MFC r263385:
Remove unused option -r from zpool.

Submitted by:	Richard Yao <ryao gentoo org>
2014-04-03 01:00:51 +00:00
dim
ebd9ec5dee MFC r260880 (by kaiw, from projects/elftoolchain):
* Make die_mem_offset() be able to handle DW_AT_data_member_location
    attributes generated by Clang 3.4.
  * Document how different compilers generate DW_AT_data_member_location
    attributes differently.
  * Document the quirks about DW_FORM_data[48].

This is a slightly modified version, adapted to work with the old
libdwarf in stable/9 and stable/10.  It should fix DTrace on these
branches, when the kernel is compiled with clang 3.4.

Note that you have to build *and* install the CTF tools first, before
building the kernel.  Otherwise you can possibly still get error
messages similar to "failed to copy type of 'pr_uid': Type information
is in parent and unavailable", when attempting to run dtrace(1).

Submitted by:	kaiw
2014-03-29 17:18:23 +00:00
asomers
31717f7c2c MFC r262912
cddl/contrib/opensolaris/lib/libuutil/common/uu_avl.c
	Fix a memory leak in uu_avl_pool_create: pthread_mutex_init without
	a corresponding pthread_mutex_destroy.  It shows up, among other
	places, when doing "zfs list".
2014-03-28 15:09:35 +00:00
delphij
ec5d67aa46 MFC r260183: MFV r260154 + 260182:
4369 implement zfs bookmarks
4368 zfs send filesystems from readonly pools

Illumos/illumos-gate@78f1710053
2014-03-20 00:28:53 +00:00
delphij
11b0e21c75 MFC r259850: MFV r258384:
2583 Add -p (parsable) option to zfs list

illumos/illumos-gate@43d68d68c1
2014-03-20 00:24:35 +00:00
delphij
18a0059ca2 MFC r256999 (smh):
Added support for the 'zfs list -t snap' and 'zfs snap' aliases which are
available under Oracle Solaris 11.

This includes an update to the ZFS(8) man page to reflect all the
available alias (snap, umount, and recv).

Initial changes obtained from ZFS On Linux + fixes for man page and cmd
help:
10b75496bb
cf81b00a73

Obtained from:  https://github.com/zfsonlinux/zfs
2014-03-20 00:10:58 +00:00
delphij
ede49d4106 MFC r260150: MFV r259170:
4370 avoid transmitting holes during zfs send

4371 DMU code clean up

illumos/illumos-gate@43466aae47

NOTE: Make sure the boot code is updated if a zpool upgrade is
done on boot zpool.
2014-03-19 23:55:03 +00:00
delphij
874f949f14 MFC r260138: MFV r242733:
3306 zdb should be able to issue reads in parallel
3321 'zpool reopen' command should be documented in the man page
and help message

illumos/illumos-gate@31d7e8fa33

FreeBSD porting notes: the kernel part of this changeset depends
on Solaris buf(9S) interfaces and are not really applicable for
our use.  vdev_disk.c is patched as-is to reduce diverge from
upstream, but vdev_file.c is left intact.
2014-03-19 23:44:03 +00:00
delphij
5b9b9ced50 MFC r259813 + r259813: MFV r258374:
4171 clean up spa_feature_*() interfaces

4172 implement extensible_dataset feature for use by other zpool
features

illumos/illumos-gate@2acef22db7
2014-03-19 23:36:12 +00:00
delphij
dda91b34ac MFC r262577: MFV r262570:
4626 libzfs memleak in zpool_in_use()
2014-03-14 01:09:42 +00:00
eadler
1f967fac12 MFC r261774 by feld:
Add caveat to zpool manpage indicating that we do not automatically activate
	hot spares. This should be MFC'd to all STABLE branches.

	Upon the availability of zfsd, the zpool manpage on relevant branches should
	be updated to remove this caveat and document hot spare's reliance on zfsd.

Requested by:	feld
2014-02-24 17:01:27 +00:00
avg
444af87821 MFC r260811: zdb -R: do not treat numeric parameters to a flag as more flags 2014-02-17 17:48:38 +00:00
avg
e67db98e94 MFC r260703: zinject must use ioctl(2) compatibility wrapper 2014-02-17 17:46:15 +00:00
avg
51714698c3 MFC r258717: MFV r258371,r258372: 4101 metaslab_debug should allow for
fine-grained control
2014-02-17 17:11:38 +00:00
avg
6f834c6747 MFC r261122: dtrace: remove unexplained 16MB limitation from dt_alloc/dt_zalloc 2014-02-17 15:35:11 +00:00
markj
9a6dcea530 MFC r260051:
When clearing relocations to __dtrace* symbols, handle both SHT_REL and
SHT_RELA sections properly instead of assuming that the relocation section
is of type SHT_REL.
2014-02-17 14:51:02 +00:00
avg
26096ba436 MFC r258632,258704: MFV r255255: 4045 zfs write throttle & i/o scheduler
performance work

Sponsored by:	HybridCluster [merge]
2014-01-16 15:57:39 +00:00
avg
a4ee4e9fb0 MFC r258630: 734 taskq_dispatch_prealloc() desired 2014-01-16 14:34:53 +00:00
jhibbits
04f97f6231 MFC r258362
Use 'int' to store the return value of getopt(), rather than char.

On some architectures (powerpc), char is unsigned by default, which means
comparisons against -1 always fail, so the programs get stuck in an
infinite loop.
2014-01-15 05:30:05 +00:00
jhibbits
b1391adbae MFC r256543,r259245,r259421,r259668,r259674
r256543:

Add fasttrap for PowerPC.  This is the last piece of the DTrace/ppc puzzle.
It's incomplete, it doesn't contain full instruction emulation, but it should be
sufficient for most cases.

r259245,r259421: (FBT)

FBT now does work fully on PowerPC.

Save r3 before using it for the trap check, else we end up saving the new r3,
containing the trap instruction encoding (0x7c810808), and restoring it back
with the frame on return.  This caused it to panic on my ppc32 machine.

r259668,r259674:
Fix a typo in the FBT code.
2014-01-15 05:19:37 +00:00
delphij
139f0a076d MFC r259811:
MFV r258373:

4168 ztest assertion failure in dbuf_undirty

4169 verbatim import causes zdb to segfa
4170 zhack leaves pool in ACTIVE state

illumos/illumos-gate@7fdd916c47
2014-01-14 01:28:08 +00:00
mav
cebac06b67 MFC r259168:
Don't even try to read vdev labels from devices smaller then SPA_MINDEVSIZE
(64MB).  Even if we would find one somehow, ZFS kernel code rejects such
devices.  It is funny to look on attempts to read 4 256K vdev labels from
1.44MB floppy, though it is not very practical and quite slow.
2014-01-05 22:14:12 +00:00
delphij
907f6b0d31 MFC r259131:
Don't panic when we get ZPOOL_STATUS_NON_NATIVE_ASHIFT
while listing importable pools.
2013-12-23 20:16:54 +00:00
markj
0b5457bf66 Convert the dtrace(1) man page to mdoc and fix up some aspects of it that
don't make sense on FreeBSD. In particular,

- remove the ATTRIBUTES section,
- remove references to the Solaris Dynamic Tracing Guide, except in the
  SEE ALSO section,
- update the description of the -A option for FreeBSD's implementation,
- remove references to Solaris-specific programs and configuration files,
  and replace them with FreeBSD equivalents where possible.

The content has not changed aside from this.

Approved by:	re (joel)
MFC after:	1 week
2013-10-10 03:50:23 +00:00
rmh
774f6b00bf Fix implicit declaration of jail_getid()
Approved by:	re
2013-10-07 14:22:19 +00:00
markj
8ff2d52009 Add a separate translator for headers passed to the TCP probes in the
input path. These probes get some of the fields in host order, whereas the
output probes get them in network order, so a single translator isn't
enough. This workaround ensures that the problem is essentially invisble
to users: none of the probe arguments or their fields have changed.

Approved by:	re (hrs)
2013-10-02 17:14:12 +00:00
delphij
d04a7f0144 MFV r254750:
Add support of Illumos dumps on zvol over RAID-Z.

Note that this only adds the features.  FreeBSD would
still need more work to support dumping on zvols.

Illumos ZFS issues:
  2932 support crash dumps to raidz, etc. pools

MFC after:	1 month
Approved by:	re (ZFS blanket)
2013-09-21 00:17:26 +00:00
markj
37ef3ce41e Use the address of the inpcb rather than the tcpcb to identify TCP
connections. This keeps the tcp provider consistent with the other network
providers.

Approved by:	re (delphij)
2013-09-15 21:38:46 +00:00
delphij
a23043347c MFV r247844 (illumos-gate 13975:ef6409bc370f)
Illumos ZFS issues:
  3582 zfs_delay() should support a variable resolution
  3584 DTrace sdt probes for ZFS txg states

Provide a compatibility shim for Solaris's cv_timedwait_hires
to help aid future porting.

Approved by:	re (ZFS blanket)
2013-09-10 01:46:47 +00:00
will
1b508b8cc8 Build all ZFS testing & debugging tools with -g.
These programs and everything using libzpool rely on the embedded asserts to
verify the correctness of operations.  Given that, the core dumps would be
useless without debug symbols.
2013-08-27 04:01:31 +00:00
pfg
615f223c07 Merge various CTF fixes from illumos
2942 CTF tools need to handle files which legitimately lack data
2978 ctfconvert still needs to ignore legitimately dataless files on SPARC

Illumos Revisions:	13745:6b3106b4250f
			13754:7231b684c18b

Reference:

https://www.illumos.org/issues/2942
https://www.illumos.org/issues/2978

MFC after:	3 weeks
2013-08-26 22:29:42 +00:00
markj
29e4661920 Implement the ip, tcp, and udp DTrace providers. The probe definitions use
dynamic translation so that their arguments match the definitions for
these providers in Solaris and illumos. Thus, existing scripts for these
providers should work unmodified on FreeBSD.

Tested by:	gnn, hiren
MFC after:	1 month
2013-08-25 21:54:41 +00:00
delphij
a4cf8ab508 MFV r254751:
Don't treat the parameter as a number (pool GUID) when there is
error converting it from string, instead, treat it as the pool
name.

Illumos ZFS issues:
  1765 assert triggered in libzfs_import.c trying to import pool
       name beginning with a number
2013-08-24 00:54:47 +00:00
delphij
40685a92a9 MFV r254748:
Fix memory leak in libzfs's iter_dependents_cb().

Illumos ZFS issues:
  4061 libzfs: memory leak in iter_dependents_cb()
2013-08-24 00:29:34 +00:00
delphij
677bfa2265 MFV r254746:
To quote original Illumos ticket:

libctf thinks that any ELF file containing more than 65536 sections is
corrupt, because it doesn't understand the SHN_XINDEX magic.

Illumos DTrace issues:
  4005 libctf can't deal with extended sections
2013-08-23 23:58:56 +00:00
delphij
5017a032d2 MFV r254422:
Illumos DTrace issues:
  3089 want ::typedef
  3094 libctf should support removing a dynamic type
  3095 libctf does not validate arrays correctly
  3096 libctf does not validate function types correctly
2013-08-23 23:21:24 +00:00
gibbs
bd47afb289 Enhance the ZFS vdev layer to maintain both a logical and a physical
minimum allocation size for devices.  Use this information to
automatically increase ZFS's minimum allocation size for new top-level
vdevs to a value that more closely matches the optimum device
allocation size.

Use GEOM's stripesize attribute, if set, as the physical sector
size of the GEOM.

Calculate the minimum blocksize of each metaslab class.  Use the
calculated value instead of SPA_MINBLOCKSIZE (512b) when determining
the likelyhood of compression yeilding a reduction in physical space
usage.

Report devices with sub-optimal block size configuration in "zpool
status".  Also properly fail attempts to attach devices with a
logical block size greater than 8kB, since this will cause corruption
to ZFS's label area.

Sponsored by:	Spectra Logic Corporaion
MFC after:	2 weeks

Background
==========
Many modern devices use physical allocation units that are much
larger than the minimum logical allocation size accessible by
external commands.  Two prevalent examples of this are 512e disk
drives (512b logical sector, 4K physical sector) and flash devices
(512b logical sector, 4K or larger allocation block size, and 128k
or larger erase block size).  Operations that modify less than the
physical sector size result in a costly read-modify-write or garbage
collection sequence on these devices.

Simply exporting the true physical sector of the device to ZFS would
yield optimal performance, but has two serious drawbacks:

1) Existing pools created with devices that have different logical
   and physical block sizes, but were configured to use the logical
   block size (e.g. because the OS version used for pool construction
   reported the logical block size instead of the physical block
   size) will suddenly find that the vdev allocation size has
   increased.  This can be easily tolerated for active members of
   the array, but ZFS would prevent replacement of a vdev with
   another identical device because it now appears that the smaller
   allocation size required by the pool is not supported by the new
   device.

2) The device's physical block size may be too large to be supported
   by ZFS.  The optimal allocation size for the vdev may be quite
   large.  For example, a RAID controller may export a vdev that
   requires read-modify-write cycles unless accessed using 64k
   aligned/sized requests.  ZFS currently has an 8k minimum block
   size limit.

Reporting both the logical and physical allocation sizes for vdevs
solves these problems.  A device may be used so long as the logical
block size is compatible with the configuration.  By comparing the
logical and physical block sizes, new configurations can be optimized
and administrators can be notified of any existing pools that are
sub-optimal.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h:
	Add the SPA_ASHIFT constant.  ZFS currently has a hard upper
	limit of 13 (8k) for ashift and this constant is used to
	both document and enforce this limit.

sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h:
	Add the VDEV_AUX_ASHIFT_TOO_BIG error code.

	Add fields for exporting the configured, logical, and
	physical ashift to the vdev_stat_t structure.

	Add VDEV_STAT_VALID() macro which can be used to verify the
	presence of required vdev_stat_t fields in nvlist data.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:
	Provide a SYSCTL_PROC handler for "max_auto_ashift".  Since
	the limit is only referenced long after boot when a create
	operation occurs, there's no compelling need for it to be
	a boot time configurable tunable.  This also allows the
	validation code for the max_auto_ashift value to be contained
	within the sysctl handler.

	Populate the new fields in the vdev_stat_t structure.

	Fail vdev opens if the vdev reports an ashift larger than
	SPA_MAXASHIFT.

	Propogate vdev_logical_ashift and vdev_physical_ashift between
	child and parent vdevs as is done for vdev_ashift.

	In vdev_open(), restore code that fails opens for devices
	where vdev_ashift grows.  This can only happen now if the
	device's logical ashift grows, which means it really isn't
	safe to use the device.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_missing.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_root.c:
	Update the vdev_open() API so that both logical (what was
	just ashift before) and physical ashift are reported.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h:
	Add two new fields, vdev_physical_ashift and vdev_logical_ashift,
	to vdev_t.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:
	Add vdev_ashift_optimize().  Call it anytime a new top-level
	vdev is allocated.

cddl/contrib/opensolaris/cmd/zpool/zpool_main.c:
	Add text for the VDEV_AUX_ASHIFT_TOO_BIG error.

	For each sub-optimally configured leaf vdev, report configured
	and native block sizes.

cddl/contrib/opensolaris/cmd/zpool/zpool_main.c:
cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h:
cddl/contrib/opensolaris/lib/libzfs/common/libzfs_status.c:
	Introduce a new zpool status: ZPOOL_STATUS_NON_NATIVE_ASHIFT.
	This status is reported on healthy pools containing vdevs
	configured to use a block size smaller than their reported
	physical block size.

cddl/contrib/opensolaris/lib/libzfs/common/libzfs_status.c:
	Update find_vdev_problem() and supporting functions to
	provide the full vdev_stat_t structure to problem checking
	routines, and to allow decent into replacing vdevs.

	Add a vdev_non_native_ashift() validator which is used on
	the full vdev tree to check for ZPOOL_STATUS_NON_NATIVE_ASHIFT.

cddl/contrib/opensolaris/lib/libzpool/common/kernel.c:
cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h:
	Enhance sysctl userland stubs now that a SYSCTL_PROC handler
	is used in vdev.c.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab_impl.h:
	When the group membership of a metaslab class changes (i.e.
	when a vdev is added or removed from a pool), walk the group
	list to determine the smallest block size currently available
	and record this in the metaslab class.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c:
	Add the metaslab_class_get_minblocksize() accessor.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio_compress.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio_compress.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:
	In zio_compress_data(), take the minimum blocksize as an
	input parameter instead of assuming SPA_MINBLOCKSIZE.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:
	In l2arc_compress_buf(), pass SPA_MINBLOCKSIZE as the minimum
	blocksize of the device.  The l2arc code performs has it's own
	code for deciding if compression is worth while, so this
	effectively disables zio_compress_data() from second guessing
	the original decision.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:
	In zio_write_bp_init(), use the minimum blocksize of the
	normal metaslab class when compressing data.
2013-08-21 04:10:24 +00:00
delphij
1b0e7b9e07 MFV r254421:
Illumos ZFS issues:
  3996 want a libzfs_core API to rollback to latest snapshot
2013-08-21 00:04:31 +00:00
rpaulo
10427f9082 Load the dtraceall module if /dev/dtrace/dtrace doesn't exist.
MFC after:	3 days
2013-08-10 23:17:09 +00:00