452 Commits

Author SHA1 Message Date
markj
dd6a87838b MFC r271695:
Fix some incorrect endianness checks.
2014-12-05 18:55:31 +00:00
delphij
a26baf65aa MFC r274276: MFV r274271:
Improve zdb -b performance:

 - Reduce gethrtime() calls to one per 100 blkptrs;
 - Skip manipulating the size-ordered tree;
 - Issue more (10, previously 3) async reads;
 - Use lighter weight testing in traverse_visitbp();

Illumos issue:
    5243 zdb -b could be much faster
2014-12-04 23:20:44 +00:00
delphij
0c96bfd67d MFC r274303:
Apply upstream 13597:3eac1e8e0f4c (git: illumos-gate@aa846ad9):

Initialize tqent_flags in the userland taskq implementation.  Without
this the assertion of tq->tq_freelist != NULL may fail in taskq_destroy.

The problem is that tqent_flags is never initialized in the userland
implementation while the kernel one does initialize it.  Without proper
initialization, the flag may have its lowest bit set, making it treated
as TQENT_FLAG_PREALLOC and never removing taskq_ent_t from tq_freelist.
2014-12-04 23:17:35 +00:00
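The failure mode described in 0c96bfd67d can be sketched in C as follows. This is a minimal illustration, not the taskq code itself: `struct ent`, `ENT_FLAG_PREALLOC`, `ent_alloc()` and `ent_stays_on_freelist()` are hypothetical stand-ins, and stale heap contents are simulated with an explicit fill pattern so that the behaviour is deterministic.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; these are not the real taskq definitions. */
#define ENT_FLAG_PREALLOC 0x1u

struct ent {
	unsigned flags;
};

/* Allocate an entry; `init` selects whether the flags field is cleared. */
static struct ent *
ent_alloc(bool init)
{
	struct ent *e = malloc(sizeof (*e));

	if (e == NULL)
		return (NULL);
	/* Simulate stale heap contents in a deterministic way. */
	memset(e, 0xff, sizeof (*e));
	if (init)
		e->flags = 0;
	return (e);
}

/* An entry that looks preallocated is kept on the free list, not released. */
static bool
ent_stays_on_freelist(const struct ent *e)
{
	return ((e->flags & ENT_FLAG_PREALLOC) != 0);
}
```

With the flags left at the fill pattern the lowest bit is set, so the entry is mistaken for a preallocated one; clearing the field at allocation time avoids that.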
smh
009b631caf MFC r264852
Silence compiler warning

Sponsored by:	Multiplay
2014-11-13 16:45:25 +00:00
markj
372e35ccad MFC r258902:
The uaddr, ufunc, umod and usym functions all seem to work as expected on
FreeBSD, so stop hiding them behind a "#if defined(sun)".
2014-10-24 17:24:29 +00:00
delphij
5295b89f11 MFC r272806: MFV r272802:
 - Limit ARC for zdb at 256MB.  zdb does not typically revisit data
   in the ARC.
 - Increase default max_inflight from 200 to 1000 (can be overridden
   by -I) so we can queue more I/Os when doing scrubbing.
 - Print status while loading metaslabs for leak detection.

Illumos issues:

    5169 zdb should limit its ARC size
    5170 zdb -c should create more scrub i/os by default
    5171 zdb should print status while loading metaslabs for leak detection
2014-10-23 01:36:43 +00:00
delphij
2f84271419 MFC r272599: MFV r272588:
Handle old format deadlist.

Illumos issue:
    5178 zdb -vvvvv on old-format pool fails in dump_deadlist()
2014-10-20 22:18:21 +00:00
delphij
e02ec23db8 MFC r272598: MFV r272585:
Split the godfather zio into one zio per CPU to reduce lock
contention.

Illumos issue:
    5176 lock contention on godfather zio
2014-10-20 22:13:50 +00:00
delphij
56f7d0497e MFC r272502: MFV r272493:
Show individual disk capacity when doing zpool list -v.

Illumos issue:
    5147 zpool list -v should show individual disk capacity
2014-10-13 18:53:56 +00:00
avg
e7a471ff5e MFC r261893: zfs.8: fix garbled options in a sample zfs send -R command line
2014-10-07 13:23:52 +00:00
avg
7d7d42cd16 MFC r261892: zpool.8: fix typo in option description of labelclear command
2014-10-07 13:20:04 +00:00
delphij
81242229b8 MFC r271527: MFV r271511:
Use fnvlist_* to make code more readable.

Illumos issue:
    5135 zpool_find_import_cached() can use fnvlist_*

Approved by:	re (gjb)
2014-10-02 22:16:00 +00:00
delphij
fae0398507 MFC r271227: MFV r271225:
Iterate through all the children instead of returning error when we hit
the first error.  This makes the error message give more information
rather than just the first device that causes a problem.

Illumos issue:
    5118 When verifying or creating a storage pool, error messages only
       show one device

Approved by:	re (gjb)
2014-09-25 21:45:07 +00:00
smh
971865d1c7 MFC r271934:
Output boot code warning when zpool upgrade -a is used to add features.

PR:		188328
Approved by:	re (marius)
Sponsored by:	Multiplay
2014-09-24 09:59:48 +00:00
delphij
81bb44f925 MFC r271222:
Fix typo.

Submitted by:	Dmitry Morozovsky <marck rinet ru>
Approved by:	re (gjb)
2014-09-10 13:13:30 +00:00
markj
c044c8f131 MFC r269524:
Preserve the errno value of an ioctl before calling free(3). Previously,
errno was very occasionally being clobbered, resulting in a bogus error from
dt_consume() and thus an error from dtrace(1).
2014-08-20 14:57:55 +00:00
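The pattern behind c044c8f131 can be sketched in C as below. This is an assumption-laden illustration rather than the libdtrace change: `fail_and_release()` is a hypothetical helper, and assigning to `errno` stands in for a failed ioctl(2).

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: report failure with `err` while releasing `buf`.
 * The errno value is captured before free(3) and restored afterwards,
 * so the caller sees the original error even if free(3) changes errno.
 */
static int
fail_and_release(void *buf, int err)
{
	int saved;

	errno = err;		/* stands in for a failed ioctl(2) */
	saved = errno;		/* capture before any call that may clobber it */
	free(buf);
	errno = saved;		/* restore for the caller */
	return (-1);
}
```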
delphij
f0e0389097 MFC r269430: MFV r269426:
Double test device size for ztest(1).

Illumos issue:
    5039 ztest should default to larger device sizes
    Author: Matthew Ahrens <mahrens@delphix.com>
2014-08-18 05:13:46 +00:00
pfg
5b8c39f44b MFC r267875:
4251 libdtrace leaks open file handles

Illumos commit:		93ed8d0d4b068b95d0bb50d57bb854df462a8485
			(partial)
Reference:
https://www.illumos.org/issues/4251

Discussed with:	Robert Mustacchi
Obtained from:	Illumos
2014-08-16 00:52:13 +00:00
markj
599cdb2d59 MFC r257877:
Don't try to use the 32-bit drti.o unless the data model is explicitly set
to ILP32. Otherwise dtrace -G will attempt to use it on amd64 if it can't
determine which data model to use, which happens when -64 is omitted and
no object files are provided, e.g. with

# dtrace -G -n BEGIN

This would result in a linker error, but now works properly.

Also remove an unnecessary #ifdef.
2014-08-14 16:45:01 +00:00
rpaulo
bd71cae4c3 MFC r269776
Remove the BROKEN_LIBELF section.
2014-08-13 06:41:06 +00:00
delphij
63a479a6fb MFC r269229,269404,269466: MFV r269223:
Change dn->dn_dbufs from linked list to AVL tree.

Illumos issues:
  4873 zvol unmap calls can take a very long time for larger datasets
2014-08-12 00:53:03 +00:00
delphij
c14fd95fbc MFC r269118: MFV r269010:
Import Illumos changes to address the following Illumos issues:
  4976 zfs should only avoid writing to a failing non-redundant
       top-level vdev
  4978 ztest fails in get_metaslab_refcount()
  4979 extend free space histogram to device and pool
  4980 metaslabs should have a fragmentation metric
  4981 remove fragmented ops vector from block allocator
  4982 space_map object should proactively upgrade when feature
       is enabled
  4984 device selection should use fragmentation metric
2014-08-10 05:58:41 +00:00
delphij
2e7e16d0c1 MFC r269100:
Diff reduction against Illumos.
2014-08-08 19:14:49 +00:00
delphij
3247b8806e MFC r268621 (smh) + r268625:
Don't report non-native block-size pools under zpool status -x

zpool status -x is used to identify pools that are exhibiting
errors or are otherwise unavailable, therefore non-native
block-size pools shouldn't be reported.

Also update man page to clarify other additional conditions
which won't cause a pool to be displayed under zpool status -x.

Sponsored by:   Multiplay
2014-08-08 19:11:23 +00:00
delphij
ccb7d1d4f5 MFC the cddl/contrib/opensolaris/cmd/zpool portion of r267803 (joel):
mdoc: remove superfluous paragraph macros.
2014-08-08 19:06:24 +00:00
markj
c189d6694c MFC r265631:
Re-apply r248644. This fixes an annoying problem which caused dtrace -c to
fail to attach to stripped binaries. With the _r_debug_postinit symbol,
dtrace(1) can now set a breakpoint in the victim process after it has
registered its DOF table(s) with the kernel. r_debug_state cannot be used
for this purpose since it is called before DOF is made available, in which
case dtrace(1) cannot create USDT probes before the program begins
execution.
2014-08-08 15:21:43 +00:00
markj
2fd28e2373 MFC r256571:
Add a function, memstr, which can be used to convert a buffer of
null-separated strings to a single string. This can be used to print the
full arguments of a process using execsnoop (from the DTrace toolkit) or
with the following one-liner:

dtrace -n 'syscall::execve:return {trace(curpsinfo->pr_psargs);}'

Note that this relies on the process arguments being cached via the struct
proc, which means that it will not work for argvs longer than
kern.ps_arg_cache_limit. However, the following rather non-portable
script can be used to extract any argv at exec time:

fbt::kern_execve:entry
{
    printf("%s", memstr(args[1]->begin_argv, ' ',
        args[1]->begin_envv - args[1]->begin_argv));
}

The debug.dtrace.memstr_max sysctl limits the maximum argument size to
memstr().
2014-08-04 15:36:22 +00:00
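What memstr does for 2fd28e2373, as described above, can be approximated in plain C. This is only a sketch of the idea, not the DTrace implementation: `join_nul_separated()` is a hypothetical name, and dropping a trailing separator is a choice made here for readability, not documented memstr behaviour.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: copy `len` bytes of a NUL-separated buffer into a
 * newly allocated C string, replacing each embedded NUL with `sep`.
 * A NUL in the last position is dropped instead of becoming a separator.
 * The caller frees the result.
 */
static char *
join_nul_separated(const char *buf, size_t len, char sep)
{
	char *out = malloc(len + 1);
	size_t i, n = len;

	if (out == NULL)
		return (NULL);
	for (i = 0; i < len; i++)
		out[i] = (buf[i] == '\0') ? sep : buf[i];
	if (n > 0 && buf[n - 1] == '\0')
		n--;
	out[n] = '\0';
	return (out);
}
```

For an argv-style buffer such as "ls", "-l", "/tmp" stored back to back, this yields one space-separated string when called with ' ' as the separator.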
delphij
6a949e106d MFC r268855: MFV r268848:
Instead of asserting all zio's be properly aligned, only assert
on the logical ones.

Cap uberblocks at 8k, otherwise with ashift=17, there would be
only one uberblock.

This fixes a problem where zdb would trip an assert on pools with
ashift >= 0xe (8k).

While there, also change the code so that it only attempts to condense
a space map when the uncondensed size consumes more than
zfs_metaslab_condense_block_threshold blocks.

Illumos issue:
  4958 zdb trips assert on pools with ashift >= 0xe
2014-08-02 03:56:06 +00:00
markj
c7af226b37 MFC r264486:
Use the correct format specifiers for wide characters and strings of wide
characters.
2014-07-29 21:21:16 +00:00
markj
2bf1f15393 MFC r262669:
When our linker merges .SUNW_dof sections from multiple files, it simply
concatenates the DOF tables into one section. Previously, the USDT init
code in drti.o would only look at the first table in the DOF section; with
this change, it iterates over all the tables, passing each DOF table to
the kernel.

PR:	186821
2014-07-29 18:31:27 +00:00
delphij
2c88e211d5 MFC r268720: MFV r268714:
Improve extreme rewind import.

When doing an "extreme rewind" import ("zpool import -XF"), we attempt
to verify all data in the pool, essentially scrubbing the entire pool.
The problem is that spa_load_verify_cb() issues an unbounded number of
concurrent scrub i/os.  This can lead to all of memory being used for
these zio's, wedging the system. Like normal scrub, we need to put a
cap on the number of outstanding i/os, and have the traverse thread
block when we reach this cap.

For this purpose the cap can be very large (10,000) to optimize the
elevator algorithm.  Three kernel tunables have been added:

	vfs.zfs.spa_load_verify_maxinflight
	vfs.zfs.spa_load_verify_metadata
	vfs.zfs.spa_load_verify_data

The latter two tunables control whether metadata and/or user data are
verified when doing an extreme rewind.

Make 'zpool import -T' imply scrub.

Make zpool import -T <txg> accept hexadecimal values for the txg when
prefixed with 0x.

Skip txg's for which there is no uberblock when doing extreme rewind.

Skip reading all user data twice by skipping prefetches when doing
extreme rewinds as we do not access via the ARC.

Illumos issues:
  4970 need controls on i/o issued by zpool import -XF
  4971 zpool import -T should accept hex values
  4972 zpool import -T implies extreme rewind, and thus a scrub
  4973 spa_load_retry retries the same txg
  4974 spa_load_verify() reads all data twice
2014-07-29 05:49:16 +00:00
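The hexadecimal txg handling mentioned in 2c88e211d5 (issue 4971) can be sketched in C with strtoull(3) and base 0, which accepts a 0x prefix. This is a sketch under assumptions, not the zpool source: `parse_txg()` is a hypothetical name, and base 0 also treats a leading 0 as octal, which may or may not match the real option parser.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a txg given in decimal or, with a 0x prefix,
 * in hexadecimal.  Returns 0 on success and -1 on malformed input.
 */
static int
parse_txg(const char *s, uint64_t *out)
{
	char *end;
	unsigned long long v;

	errno = 0;
	v = strtoull(s, &end, 0);	/* base 0: 0x means hex, else decimal */
	if (errno != 0 || end == s || *end != '\0')
		return (-1);
	*out = (uint64_t)v;
	return (0);
}
```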
delphij
70f7d126e3 MFC r268473: MFV r268455:
Use reserved space for ZFS administrative commands.
2014-07-23 00:49:35 +00:00
delphij
776d6cbee6 MFC r260156: MFV r260152:
4208 Typo in zfs_main.c: "posxiuser"
2014-07-23 00:46:56 +00:00
delphij
126b8ac4b4 MFC r268470: MFV r268454:
Refresh zpool list for each interval in order to produce fresh
output.

Illumos issue: 4966 zpool list iterator does not update output
2014-07-23 00:41:11 +00:00
delphij
3750365a9c MFC r268469: MFV r268453:
Diff reduction against Illumos.
2014-07-23 00:38:23 +00:00
delphij
adc65d02d1 MFC r268116:
 - Fix handling of "new" style of ioctl in compatibility mode [1];
 - Reorganize code and reduce diff from upstream;
 - Improve forward compatibility shims for previous kernel;

Reported by:    sbruno [1]
2014-07-17 05:20:18 +00:00
delphij
4af6f088fb MFC r268126: MFV r268121:
4924 LZ4 Compression for metadata
2014-07-15 05:42:09 +00:00
delphij
5cce10db7b MFC r268123: MFV r268119:
4914 zfs on-disk bookmark structure should be named *_phys_t
2014-07-15 05:39:22 +00:00
delphij
4d09e20b95 MFC r268086: MFV r267570:
4756 metaslab_group_preload() could deadlock
2014-07-15 05:36:26 +00:00
delphij
bfdd43f2b5 MFC r268084: MFV r267568:
4891 want zdb option to dump all metadata
2014-07-15 05:28:58 +00:00
delphij
91643324a9 MFC r268079: MFV r267566:
4390 i/o errors when deleting filesystem/zvol can lead to space map
     corruption
2014-07-15 05:00:46 +00:00
delphij
9d1dc5bcc9 MFC r268075: MFV r267565:
4757 ZFS embedded-data block pointers ("zero block compression")
4913 zfs release should not be subject to space checks
2014-07-15 04:53:34 +00:00
delphij
5ed76404fe MFC r260142: MFV r258972:
4373 add block contents print to zstreamdump
2014-07-15 04:44:06 +00:00
delphij
38554d6064 MFC r266771: MFV r266766:
Add a new zfs property, "redundant_metadata" which can have values "all" or
"most".  The default will be "all", which is the current behavior.  When set
to all, ZFS stores an extra copy of all metadata.  If a single on-disk block
is corrupt, at worst a single block of user data (which is recordsize bytes
long) can be lost.

Setting to "most" will cause us to only store 1 copy of level-1 indirect
blocks of user data files.  This can improve performance of random writes,
because less metadata has to be written.  In practice,  at worst about
100 blocks (of recordsize bytes each) of user data can be lost if a single
on-disk block is corrupt.

The exact behavior of which metadata blocks are stored redundantly may change
in future releases.

Illumos issue: 3835 zfs need not store 2 copies of all metadata
2014-07-15 04:39:55 +00:00
delphij
7afd1db032 MFC r267572: MFV r249332 (illumos-gate 14005:55fc53126003)
Illumos ZFS issues:
  3654 zdb should print number of ganged blocks
2014-07-15 04:33:11 +00:00
rpaulo
0cff4b03dd MFC 267929, 267937, 267939, 267940, 267941, 267942, 267987, 268006:
2915 DTrace in a zone should see "cpu", "curpsinfo", et al
 2916 DTrace in a zone should be able to access fds[]
 2917 DTrace in a zone should have limited provider access
 4477 DTrace should speak JSON
 Add stubs for CTF functions which are not yet implemented.
 4474 DTrace Userland CTF Support
 4475 DTrace userland Keyword
 4476 DTrace tests should be better citizens
 4479 pid provider types
 4480 dof emulation is missing checks
 4471 DTrace count() with histogram
 4472 DTrace full width distribution histograms
 4473 DTrace frequency trails
2014-07-12 22:56:41 +00:00
pfg
93d41081e3 MFC r267513:
Merge from r258379 missed the tests.

4248 dtrace(1M) should never create DOF with empty probes section
4249 Only probes from the first DTrace object file will be included

Illumos Revision:	54a20ab41aadcb81c53e72fc65886e964e9add59
2014-06-20 15:40:13 +00:00
markj
644a04942b MFC r262329:
Define the KM_NORMALPRI flag for kmem_alloc(), as it is used in some
upstream DTrace code.

MFC r262330:
1452 DTrace buffer autoscaling should be less violent

illumos/illumos-gate@6fb4854bed
2014-05-25 18:19:57 +00:00
mav
54ed85cbfe MFC r265821:
Comment out some pointless device open/close around reading device IDs.

The FreeBSD ZFS port, unlike OpenSolaris, does not use device IDs and does
not implement the respective devid_*() functions.  It is pointless to open
devices just to close them again immediately.
2014-05-24 10:41:37 +00:00
delphij
7cb0f49ed2 MFC r264835 (MFV r264829):
3897 zfs filesystem and snapshot limits
2014-05-09 07:12:31 +00:00