Commit Graph

547 Commits

Author SHA1 Message Date
delphij
9db99dff71 MFV r268848:
Instead of asserting all zio's be properly aligned, only assert
on the logical ones.

Cap uberblocks at 8k, otherwise with ashift=17, there would be
only one uberblock.

This fixes a problem that zdb would trip assert on pools with
ashift >= 0xe (8k).

While there, also change the code so it only attempt to condense
space map unless the uncondensed size consumes greater than
zfs_metaslab_condense_block_threshold blocks.

Illumos issue:
  4958 zdb trips assert on pools with ashift >= 0xe

MFC after:	2 weeks
2014-07-18 20:41:40 +00:00
delphij
cd85118a8b MFV r268714:
Improve extreme rewind import.

When doing an "extreme rewind" import ("zpool import -XF"), we attempt
to verify all data in the pool, essentially scrubbing the entire pool.
The problem is that spa_load_verify_cb() issues an unbounded number of
concurrent scrub i/os.  This can lead to all of memory being used for
these zio's, wedging the system. Like normal scrub, we need to put a
cap on the number of outstanding i/os, and have the traverse thread
block when we reach this cap.

For this purpose the cap can be very large (10,000) to optimize the
elevator algorithm.  Three kernel tunables have been added:

	vfs.zfs.spa_load_verify_maxinflight
	vfs.zfs.spa_load_verify_metadata
	vfs.zfs.spa_load_verify_data

The latter two tunables controls whether metadata and/or user data
when doing extreme rewind.

Make 'zpool import -T' imply scrub.

Make zpool import -T <txg> accept hexadecimal values for the txg when
prefixed with 0x.

Skip txg's for which there is no uberblock when doing extreme rewind.

Skip reading all user data twice by skipping prefetches when doing
extreme rewinds as we do not access via the ARC.

Illumos issues:
  4970 need controls on i/o issued by zpool import -XF
  4971 zpool import -T should accept hex values
  4972 zpool import -T implies extreme rewind, and thus a scrub
  4973 spa_load_retry retries the same txg
  4974 spa_load_verify() reads all data twice

MFC after:	2 weeks
2014-07-15 22:44:04 +00:00
delphij
ba32c2f8ac Bump mdoc date after r268621.
X-MFC-With:	r268621
2014-07-14 17:54:36 +00:00
smh
5563475471 Don't report non-native block-size pools under zpool status -x
zpool status -x is used to identify pools that are exhibiting
errors or are otherwise unavailable, therefore non-native
block-size pools shouldn't be reported.

Also update man page to clarify other additional conditions
which won't cause a pool to be displayed under zpool status -x.

Sponsored by:	Multiplay
2014-07-14 14:33:03 +00:00
delphij
7c6811e675 MFV r268455:
Use reserved space for ZFS administrative commands.

We reserve 1/2^spa_slop_shift = 1/32 or 3.125% of pool space (or 32MB at
least) for system use.  Most ZPL operations, e.g. write(2), creat(2), will
fail with ENOSPC if we fall below this.

Certain operations, e.g. file removal and most administrative actions,
still permitted until half of the slop space is used.  This would allow
users to use these operations to free up space in the pool when pool is
close to full but half of slop space is still free.

A very restricted set of operations that frees up space or change quota
are always permitted, regardless of the amount of free space.

MFC after:	 2 weeks
2014-07-09 23:14:59 +00:00
delphij
9b45140041 MFV r268454:
Refresh zpool list for each interval in order to produce fresh
output.

Illumos issue: 4966 zpool list iterator does not update output

MFC after:	 2 weeks
2014-07-09 21:07:20 +00:00
delphij
cb7e625917 MFV r268453:
Diff reduction against Illumos.

MFC after:	 2 weeks
2014-07-09 20:57:42 +00:00
marcel
9f28abd980 Remove ia64.
This includes:
o   All directories named *ia64*
o   All files named *ia64*
o   All ia64-specific code guarded by __ia64__
o   All ia64-specific makefile logic
o   Mention of ia64 in comments and documentation

This excludes:
o   Everything under contrib/
o   Everything under crypto/
o   sys/xen/interface
o   sys/sys/elf_common.h

Discussed at: BSDcan
2014-07-07 00:27:09 +00:00
delphij
3218d43e56 MFV r268121:
4924 LZ4 Compression for metadata

illumos/illumos-gate@b8289d24d8

MFC after:	2 weeks
2014-07-01 22:31:09 +00:00
delphij
6012f0ab3a MFV r268119:
4914 zfs on-disk bookmark structure should be named *_phys_t

illumos/illumos-gate@7802d7bf98

MFC after:	2 weeks
2014-07-01 21:51:30 +00:00
delphij
be1d315a04 - Fix handling of "new" style of ioctl in compatiblity mode [1];
- Reorganize code and reduce diff from upstream;
 - Improve forward compatibility shims for previous kernel;

Reported by:	sbruno [1]
X-MFC-With:	r268075
2014-07-01 20:57:39 +00:00
delphij
0b2372b195 MFV r267570:
4756 metaslab_group_preload() could deadlock

illumos/illumos-gate@30beaff42d

MFC after:	2 weeks
2014-07-01 08:36:56 +00:00
delphij
5f023415af MFV r267568:
4891 want zdb option to dump all metadata

illumos/illumos-gate@df15e419cb

MFC after:	2 weeks
2014-07-01 08:20:34 +00:00
delphij
0d7d731791 MFV r267566:
4390 i/o errors when deleting filesystem/zvol can lead to space map corruption

MFC after:	2 weeks
2014-07-01 07:29:42 +00:00
delphij
f95fd16f8d MFV r267565:
4757 ZFS embedded-data block pointers ("zero block compression")
4913 zfs release should not be subject to space checks

MFC after:	2 weeks
2014-07-01 06:43:15 +00:00
rpaulo
da67e0c76e MFV illumos
4471 DTrace count() with histogram
4472 DTrace full width distribution histograms
4473 DTrace frequency trails

MFC after:	2 weeks
2014-06-26 23:24:59 +00:00
rpaulo
8c0e49065f MFV illumos
4474 DTrace Userland CTF Support
4475 DTrace userland Keyword
4476 DTrace tests should be better citizens
4479 pid provider types
4480 dof emulation is missing checks

MFC after:	2 weeks
2014-06-26 23:21:11 +00:00
rpaulo
b14d31acfe Add stubs for CTF functions which are not yet implemented.
MFC after:	2 weeks
2014-06-26 22:38:06 +00:00
rpaulo
c6d679e670 MFV illumos
4477 DTrace should speak JSON

MFC after:	2 weeks
2014-06-26 21:45:49 +00:00
rpaulo
3191dbe25d MFV illumos r266986:
2915 DTrace in a zone should see "cpu", "curpsinfo", et al
2916 DTrace in a zone should be able to access fds[]
2917 DTrace in a zone should have limited provider access

MFC after:	2 weeks
2014-06-26 19:38:16 +00:00
rpaulo
37b311bee5 Revert r267898. 2014-06-26 17:34:42 +00:00
rpaulo
aedbcf436f Bring the following change from the illumos-joyent repository:
commit 78e24ab6803bbe11ba37642624e1498ede5b239d
Author: Bryan Cantrill <bryan@joyent.com>
Date:   Thu Oct 31 01:20:54 2013

    OS-1688 DTrace count() with histogram
    OS-2360 DTrace full width distribution histograms
    OS-2361 DTrace frequency trails

MFC after:	2 weeks
2014-06-26 07:06:43 +00:00
pfg
cf8bf4883d MFV r258381:
4251 libdtrace leaks open file handles

Illumos commit:		93ed8d0d4b068b95d0bb50d57bb854df462a8485
			(partial)
Reference:
https://www.illumos.org/issues/4251

Discussed with:	Robert Mustacchi
Obtained from:	Illumos
MFC after:	1 week
2014-06-25 17:27:15 +00:00
joel
d94b51f5b9 mdoc: remove superfluous paragraph macros. 2014-06-23 18:40:21 +00:00
delphij
eb4f9057bf MFV r249332 (illumos-gate 14005:55fc53126003)
Illumos ZFS issues:
  3654 zdb should print number of ganged blocks

MFC after:	2 weeks
2014-06-17 08:11:45 +00:00
pfg
d3c94db217 MFV r266988:
Merge from r258379 missed the tests.

4248 dtrace(1M) should never create DOF with empty probes section
4249 Only probes from the first DTrace object file will be included

Illumos Revision:	54a20ab41aadcb81c53e72fc65886e964e9add59

MFC after:	5 days
2014-06-15 16:54:26 +00:00
delphij
78ba176089 MFV r266766:
Add a new zfs property, "redundant_metadata" which can have values "all" or
"most".  The default will be "all", which is the current behavior.  When set
to all, ZFS stores an extra copy of all metadata.  If a single on-disk block
is corrupt, at worst a single block of user data (which is recordsize bytes
long) can be lost.

Setting to "most" will cause us to only store 1 copy of level-1 indirect
blocks of user data files.  This can improve performance of random writes,
because less metadata has to be written.  In practice,  at worst about
100 blocks (of recordsize bytes each) of user data can be lost if a single
on-disk block is corrupt.

The exact behavior of which metadata blocks are stored redundantly may change
in future releases.

Illumos issue: 3835 zfs need not store 2 copies of all metadata

MFC after:	2 weeks
2014-05-27 19:46:11 +00:00
delphij
9e29b5f650 Explicitly link libzfs against libavl as it is done in OpenSolaris
(4543:12bb2876a62e).  Without this, some third party applications
may break because the lack of AVL related symbols.

FreeBSD base system are not affected because the FreeBSD ZFS command
line tools were all linked against libavl and thus hide the underlying
issue.

PR:		java/183081
Tested by:	jkim
MFC after:	3 days
2014-05-22 00:01:31 +00:00
markj
5433f26d40 Fix tst.ZeroModuleProbes.d.ksh, which was incorrectly modified in r178534.
Since "BEGIN" is not the name of a module, the test would just hang.

MFC after:	3 days
2014-05-19 20:11:55 +00:00
markj
79fc9c4be0 Bind ip/tcp/udp provider translators and symbols to the same versions as in
illumos, rather than using "1.0" everywhere.

Some of the translators use D functions that are not present in version
1.0 (e.g. inet_ntoa()) which can result in libdtrace crashing when running
scripts that restrict themselves to version 1.0
(e.g. with "-x version=1.0").

MFC after:	1 week
2014-05-14 19:02:00 +00:00
mav
cfea4862aa Comment out some pointless device open/close around reading device IDs.
FreeBSD ZFS port unlike OpenSolaris does not use device IDs, and does not
implement respective devid_*() fuctions.  It is pointless to open devices
just to close them back immediately.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2014-05-10 15:21:37 +00:00
mav
0f14c360e4 Import adapted OpenSolaris' thread pool API implementation.
The thread pool is used by libzfs to implement parallel disk scanning.
Without this change our dummy wrapper made `zpool import ZZZ` command to
scan all disks sequentially from the single thread when searching for pools.
This change makes it use two threads per CPU, same as in OpenSolaris.

On system with 200 HDDs this change reduces ZFS pool import time from 35
to 22 seconds.
2014-05-08 16:59:36 +00:00
markj
b2936ab3df Re-apply r248644. This fixes an annoying problem which caused dtrace -c to
fail to attach to stripped binaries. With the _r_debug_postinit symbol,
dtrace(1) can now set a breakpoint in the victim process after it has
registered its DOF table(s) with the kernel. r_debug_state cannot be used
for this purpose since it is called before DOF is made available, in which
case dtrace(1) cannot create USDT probes before the program begins
execution.

MFC after:	2 weeks
2014-05-08 03:43:18 +00:00
imp
2118f42afd Use src.opts.mk in preference to bsd.own.mk except where we need stuff
from the latter.
2014-05-06 04:22:01 +00:00
markj
b79625ea0c Remove a duplicate definition.
MFC after:	3 days
2014-05-04 03:37:39 +00:00
imp
29752a1c14 Spell NO_PROFILE= as MK_PROFILE=no. 2014-04-25 19:25:26 +00:00
smh
841db4d052 Silence compiler warning due to missing return in idmap_id_to_numeric_domain_rid 2014-04-24 01:20:43 +00:00
smh
5c3c99130b Eliminated optarg global being used outside of the function which called getopt
MFC after:	2 weeks
2014-04-24 01:12:52 +00:00
delphij
4a1b97c2c0 MFV r264829:
3897 zfs filesystem and snapshot limits

MFC after:	2 weeks
2014-04-23 20:29:46 +00:00
jmmv
64b466d8f8 Add placeholder Kyuafiles for various top-level hierarchies.
This change adds tests/ directories in the source tree to create various
subdirectories in /usr/tests/ and to install placeholder Kyuafiles for
them.

the relevant hierarchies are: cddl, etc, games, gnu and secure.

The reason for this is to simplify the addition of new test programs for
utilities or libraries under any of these directories.  Doing so on a
case by case basis is unnecessary and is quite an obscure process.
2014-04-21 21:39:25 +00:00
delphij
3344ccfa37 MFV r264666:
4374 dn_free_ranges should use range_tree_t

illumos/illumos-gate@bf16b11e8d

MFC after:	2 weeks
2014-04-18 21:15:12 +00:00
markj
2004e1b1f2 Replace a few Solarisisms with their corresponding FreeBSDisms to make a few
printf tests pass.
2014-04-15 02:32:00 +00:00
markj
c45d8f95f1 Use the correct format specifiers for wide characters and strings of wide
characters.

MFC after:	1 week
2014-04-15 02:28:08 +00:00
delphij
78e547c540 Take into account when zpool history block grows exceeding 128KB in zpool(8)
and zdb(8) by growing the buffer on demand with a cap of 1GB (specified in
spa_history_create_obj()).

PR:		bin/186574
Submitted by:	Andrew Childs <lorne cons org nz> (with changes)
MFC after:	2 weeks
2014-04-14 18:38:14 +00:00
imp
c39e6fc2c9 NO_MAN= has been deprecated in favor of MAN= for some time, go ahead
and finish the job. ncurses is now the only Makefile in the tree that
uses it since it wasn't a simple mechanical change, and will be
addressed in a future commit.
2014-04-13 05:21:56 +00:00
mav
dd28e6bbc0 Add property and sysctl to control how ZVOLs are exposed to OS.
New ZFS property volmode and sysctl vfs.zfs.vol.mode allow switching ZVOL
between three modes:
 geom -- existing fully functional behavior (default);
 dev -- exposing volumes only as raw disk device file in devfs;
 none -- not exposing volumes outside ZFS.

The "dev" mode is less functional (can't be partitioned, mounted, etc),
but it is faster, and in some scenarios with untrusted consumers safer.
It can be useful for NAS, VM block storages, etc.
The "none" mode may be convenient for backup servers, etc. that don't
need direct data access.

Due to the way ZVOL is integrated with main ZFS code, those property
and sysctl are checked only during pool import and volume creation.

MFC after:	1 month
Sponsored by:	iXsystems, Inc.
2014-04-05 13:01:44 +00:00
pfg
3c7cca5de7 MFV r258379;
4248 dtrace(1M) should never create DOF with empty probes section
4249 Only probes from the first DTrace object file will be included

Illumos Revision:	4a20ab41aadcb81c53e72fc65886e964e9add59

Reference:
https://www.illumos.org/issues/4248
https://www.illumos.org/issues/4249

Obtained from:	Illumos
MFC after:	1 month
2014-04-02 15:32:44 +00:00
delphij
b26390b635 MFV r263887:
3993 zpool(1M) and zfs(1M) should support -p for "list" and "get"
4700 "zpool get" doesn't support -H or -o options

MFC after:	2 weeks
2014-03-28 23:12:00 +00:00
delphij
3779db6b3f MFV 263436-263438:
3947 zpool(1M) references nonexistent zfs-features(5)
  4540 zpool(1M) man page doesn't describe "readonly" property
  3948 zfs sync=default is not accepted
  4611 zfs(1M) still mentions 'send -r' in synopsis
  4415 zpool(1M) man page missing "import -m" description
  4570 Document dedupditto pool property
  4572 Dedup-related documentation additions for zpool and zdb.
  1371 Add -D option description to zpool(1M) manpage
  4571 Add documentation for -T and interval to "zpool list"

MFC after:	2 weeks
2014-03-21 01:32:25 +00:00
delphij
f68728a475 Remove unused option -r from zpool.
Submitted by:	Richard Yao <ryao gentoo org>
MFC after:	2 weeks
2014-03-19 23:04:52 +00:00