Since the upstream for cddl code is now illumos not sun, mechanically
convert all sun #ifdef's to illumos #ifdef's which have been used in all
newer code for some time.
Also do a manual pass to correct the use if #ifdef comments as per style(9)
as well as few uses of #if defined(__FreeBSD__) vs #ifndef illumos.
MFC after: 1 month
Sponsored by: Multiplay
This special case prevented locating vdevs which start with c[0-9] e.g.
gptid/c6cde092-504b-11e4-ba52-c45444453598 hence it was impossible to
online a vdev via its path.
Submitted by: Peter Xu <xzpeter@gmail.com>
MFC after: 2 weeks
Sponsored by: Multiplay
This corrects inconsitencies between zpool list and zpool status which are
both described as displaying the pool <state> however zpool list would use
this hardcoded FAULTED instead of the correct UNAVAIL.
MFC after: 1 month
Summary:
PowerPC64 uses function descriptors in a section .opd, exporting the descriptors
to the symbol table. This adds support for these into dt_link.c so that dtrace
USDT probes can be compiled.
Test Plan:
Tested only on powerpc64. No regression testing has been performed, so I want
someone with x86 hardware to regression test this.
Tested on amd64 by markj
Reviewers: #dtrace, markj
Reviewed By: #dtrace, markj
Subscribers: markj
Differential Revision: https://reviews.freebsd.org/D1267
MFC after: 3 weeks
so take this into account when iterating over DOF tables.
PR: 195555
Submitted by: Fedor Indutny <fedor@indutny.com> (original version)
MFC after: 1 week
Introduce a seperate phase to list all unavailable pools when listing
pools to upgrade. This avoids confusing output when displaying older
and disabled feature pools. These existing phases now silently skip
unavailable pools.
Introduce cb_unavail to upgrade_cbdata_t which enables the final
output for zpool list to correctly detail if all pools or only all
available pools where up-to-date on version / features.
Correct the type of upgrade_cbdata_t.cb_first from int -> boolean_t.
Change the pool iteration when upgrading named pools to include
unavailable pools and update upgrade_one so it doesn't try to upgrade
unavailable pools but warns about them. This allows the correct error
to be displayed as well as upgrades with available and unavailable
pools intermixed to partially complete.
Also correct some missing trailing \n's from output in upgrade_one.
MFC after: 1 month
X-MFC-With: r276194
Prior to this fix "zpool upgrade" and "zpool upgrade -a" would fail due to
an assert when operating on unavailable pools.
We now print a warning to stderr but allow the processing of other pools
to procesed.
MFC after: 1 month
dlinfo() is a weak reference that may not be initialized at the time of
execution. The default implementation (in lib/libc/gen/dlfcn.c) neither
modifies the address pointed to by the third argument nor returns an error.
Differential Revision: https://reviews.freebsd.org/D1326
Reviewed by: markj
MFC after: 1 week
Plug a memory leak in libzfs. In zfs_iter_bookmarks, an nvlist is allocated
before calling lzc_get_bookmarks, which allocates the nvlist again (and
overwrites the pointer to previously allocated list).
Illumos issue:
5427 memory leak in libzfs when doing rollback
MFC after: 2 weeks
Convert ARC flags to use enum. Previously, public flags are defined in
arc.h and private flags are defined in arc.c which can lead to confusion
and programming errors.
Consistently use 'hdr' (when referencing arc_buf_hdr_t) instead of 'buf'
or 'ab' because arc_buf_t are often named 'buf' as well.
Illumos issue:
5369 arc flags should be an enum
5370 consistent arc_buf_hdr_t naming scheme
MFC after: 2 weeks
Remove "dbuf phys" db->db_data pointer aliases.
Use function accessors that cast db->db_data to the appropriate
"phys" type, removing the need for clients of the dmu buf user
API to keep properly typed pointer aliases to db->db_data in order
to conveniently access their data.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_leaf.c:
In zap_leaf() and zap_leaf_byteswap, now that the pointer alias
field l_phys has been removed, use the db_data field in an on
stack dmu_buf_t to point to the leaf's phys data.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:
Remove the db_user_data_ptr_ptr field from dbuf and all logic
to maintain it.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dbuf.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:
Modify the DMU buf user API to remove the ability to specify
a db_data aliasing pointer (db_user_data_ptr_ptr).
cddl/contrib/opensolaris/cmd/zdb/zdb.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_bookmark.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deadlist.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deleg.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_destroy.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_prop.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_synctask.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_userhold.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_history.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_leaf.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dataset.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dir.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_impl.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_leaf.h:
Create and use the new "phys data" accessor functions
dsl_dir_phys(), dsl_dataset_phys(), zap_m_phys(),
zap_f_phys(), and zap_leaf_phys().
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dataset.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dir.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_impl.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_leaf.h:
Remove now unused "phys pointer" aliases to db->db_data
from clients of the DMU buf user API.
Illumos issue:
5314 Remove "dbuf phys" db->db_data pointer aliases in ZFS
MFC after: 2 weeks
Port Illumos 'zfs allow' examples update. While I'm there also fix
a typo.
Illumos issue:
4181 zfs(1m): 'zfs allow' examples in the man page are outdated
MFC after: 2 weeks
Additionally fix a misparenthesization in the same check, noticed while
fixing the first bug. This bug only appears to cause problems if the same
USDT probe appears twice within a static function.
X-MFC-With: r274637
fields of dt_module_t. Previously, this was only done on architectures where
kernel modules have type ET_REL; this change fixes that. As a result, symbol
name resolution in the stack() action now works properly for kernel modules
on i386.
Reported by: Shrikanth Kamath <shrikanth07@gmail.com>
Tested by: Shrikanth Kamath
Discussed with: avg
MFC after: 2 weeks
a probe name. When dtrace -G builds up a DOF section for the specified
provider(s), the probe function names are truncated to fit in this limit.
The DOF is later used to build the symbol table for the generated object
file, so the table can end up with truncated references, causing link
errors.
Instead of potentially truncating symbol table entries, write the full
function name to the DOF string table and allow the kernel to enforce the
128-byte function name limit when a process attempts to load its DOF.
PR: 194757
Differential Revision: https://reviews.freebsd.org/D1175
Reviewed by: rpaulo
MFC after: 2 weeks
union must be checked when determine whether two types are equivalent. This
bug could cause ctfmerge(1) to incorrectly merge distinct types.
Reviewed by: Robert Mustacchi <rm@joyent.com>
MFC after: 2 weeks
Sponsored by: EMC / Isilon Storage Division
These would cause ctfconvert(1) to return an error when attempting to
resolve valid C types.
Reviewed by: Robert Mustacchi <rm@joyent.com>
MFC after: 2 weeks
Sponsored by: EMC / Isilon Storage Division
ZFS large block support.
Please note that booting from datasets that have recordsize greater
than 128KB is not supported (but it's Okay to enable the feature on
the pool). This *may* remain unchanged because of memory constraint.
Limited safety belt is provided for mounted root filesystem but use
caution is advised.
Illumos issue:
5027 zfs large block support
MFC after: 1 month
Initialize tqent_flags in the userland taskq implementation. Without
this the assertion of tq->tq_freelist != NULL may fail in taskq_destroy.
The problem is that tqent_flags is never initialized in the userland
implementation while the kernel one does initialize it. Without proper
initialization, the flag may have its lowest bit set, making it treated
as TQENT_FLAG_PREALLOC and never removing taskq_ent_t from tq_freelist.
MFC after: 2 weeks
- Limit ARC for zdb at 256MB. zdb do not typically revisit data
in the ARC.
- Increase default max_inflight from 200 to 1000 (can be overriden
by -I) so we can queue more I/Os when doing scrubbing.
- Print status while loading meataslabs for leak detection.
Illumos issues:
5169 zdb should limit its ARC size
5170 zdb -c should create more scrub i/os by default
5171 zdb should print status while loading metaslabs for leak detection
MFC after: 2 weeks
one to, for example, access the "provider" field of a struct g_consumer,
even though "provider" is a D keyword.
PR: 169657
MFC after: 2 months
Discussed with: Bryan Cantrill
Sponsored by: EMC / Isilon Storage Division
modifications to libproc to support fetching the CTF info for a given file.
With this change, dtrace(1) is able to resolve type info for function and
USDT probe arguments, and function return values. In particular, the args[n]
syntax should now work for referencing arguments of userland probes,
provided that the requisite CTF info is available.
The uctf tests pass if the test programs are compiled with CTF info. The
current infrastructure around the DTrace test suite doesn't support this
yet.
Differential Revision: https://reviews.freebsd.org/D891
MFC after: 1 month
Relnotes: yes
Sponsored by: EMC / Isilon Storage Division
install signal handlers when running in list mode (-l), and acknowledge
interrupts by cleaning up and exiting. This ensures that a command like
$ dtrace -l -P 'pid$target' -p <target PID> | less
won't cause the ptrace(2)d target process to be killed if less(1) exits
before all dtrace output is consumed.
Reported by: Anton Yuzhaninov <citrin+bsd@citrin.ru>
Differential Revision: https://reviews.freebsd.org/D880
Reviewed by: rpaulo
MFC after: 1 month
Sponsored by: EMC / Isilon Storage Division
In the case where new features where enabled by a zpool upgrade -a the
boot code warning wasn't output.
Submitted by: Jan Kokemueller
MFC after: 3 days
This errno value is emitted by dsl_props_set_check() in
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_prop.c, and
is used to mean that the property value is too long. For the record,
the maximum length is ZAP_MAXVALUELEN, which is 8*1024 bytes.
Instead of claiming an unknown error (and abort()ing), provide
something more specific to the scenario involved. As far as I
can tell, E2BIG is not emitted for any other scenario.
MFC after: 1 week
Sponsored by: Spectra Logic
Affects: All ZFS versions starting 27 Feb 2009 (illumos ccba0801)
This change modified the value returned by
dsl_props_set_check(), so that it can distinguish between
a name that's too long and a value that's too long, but
libzfs was not updated accordingly.
MFSpectraBSD: r1051499 on 2014/03/28 11:07:59
This can occur if a spare is being spared, which would yield three
children: the original pool drive, the previous spare, and the spare
that is replacing it.
MFC after: 1 week
Sponsored by: Spectra Logic
Affects: All ZFS versions starting 7 Jun 2006 (illumos 94de1d4c)
MFSpectraBSD: r668345 on 2013/06/04 17:10:43
It seems that if a pragma is used to define a weak alias for a local
function, the pragma must appear after the function is defined.
PR: 193056
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
seems that they would only pass by chance on illumos; on FreeBSD, they still
fail since userland CTF is not yet supported.
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
Enable debug printf's when ZFS_DEBUG or debug= is set.
Illumos issue:
5134 if ZFS_DEBUG or debug= is set, libzpool should enable debug prints
MFC after: 2 weeks
doing them in drti during startup. This fixes a number of problems with
using USDT probes in stripped executables and shared libraries, and with
USDT probes in static functions.
Reviewed by: rpaulo
MFC after: 1 month
Sponsored by: EMC / Isilon Storage Division
Phabric: D751
Iterate through all the children instead of returning error when we hit
the first error. This makes the error message give more information
rather than just the first device that causes problem.
Illumos issue:
5118 When verifying or creating a storage pool, error messages only
show one device
MFC after: 2 weeks
Double test device size for ztest(1).
Illumos issue:
5039 ztest should default to larger device sizes
Author: Matthew Ahrens <mahrens@delphix.com>
MFC after: 2 weeks
Change dn->dn_dbufs from linked list to AVL tree.
Illumos issues:
4873 zvol unmap calls can take a very long time for larger datasets
MFC after: 2 weeks
Import Illumos changes to address the following Illumos issues:
4976 zfs should only avoid writing to a failing non-redundant
top-level vdev
4978 ztest fails in get_metaslab_refcount()
4979 extend free space histogram to device and pool
4980 metaslabs should have a fragmentation metric
4981 remove fragmented ops vector from block allocator
4982 space_map object should proactively upgrade when feature
is enabled
4984 device selection should use fragmentation metric
MFC after: 2 weeks
Instead of asserting all zio's be properly aligned, only assert
on the logical ones.
Cap uberblocks at 8k, otherwise with ashift=17, there would be
only one uberblock.
This fixes a problem that zdb would trip assert on pools with
ashift >= 0xe (8k).
While there, also change the code so it only attempt to condense
space map unless the uncondensed size consumes greater than
zfs_metaslab_condense_block_threshold blocks.
Illumos issue:
4958 zdb trips assert on pools with ashift >= 0xe
MFC after: 2 weeks
Improve extreme rewind import.
When doing an "extreme rewind" import ("zpool import -XF"), we attempt
to verify all data in the pool, essentially scrubbing the entire pool.
The problem is that spa_load_verify_cb() issues an unbounded number of
concurrent scrub i/os. This can lead to all of memory being used for
these zio's, wedging the system. Like normal scrub, we need to put a
cap on the number of outstanding i/os, and have the traverse thread
block when we reach this cap.
For this purpose the cap can be very large (10,000) to optimize the
elevator algorithm. Three kernel tunables have been added:
vfs.zfs.spa_load_verify_maxinflight
vfs.zfs.spa_load_verify_metadata
vfs.zfs.spa_load_verify_data
The latter two tunables controls whether metadata and/or user data
when doing extreme rewind.
Make 'zpool import -T' imply scrub.
Make zpool import -T <txg> accept hexadecimal values for the txg when
prefixed with 0x.
Skip txg's for which there is no uberblock when doing extreme rewind.
Skip reading all user data twice by skipping prefetches when doing
extreme rewinds as we do not access via the ARC.
Illumos issues:
4970 need controls on i/o issued by zpool import -XF
4971 zpool import -T should accept hex values
4972 zpool import -T implies extreme rewind, and thus a scrub
4973 spa_load_retry retries the same txg
4974 spa_load_verify() reads all data twice
MFC after: 2 weeks
zpool status -x is used to identify pools that are exhibiting
errors or are otherwise unavailable, therefore non-native
block-size pools shouldn't be reported.
Also update man page to clarify other additional conditions
which won't cause a pool to be displayed under zpool status -x.
Sponsored by: Multiplay
Use reserved space for ZFS administrative commands.
We reserve 1/2^spa_slop_shift = 1/32 or 3.125% of pool space (or 32MB at
least) for system use. Most ZPL operations, e.g. write(2), creat(2), will
fail with ENOSPC if we fall below this.
Certain operations, e.g. file removal and most administrative actions,
still permitted until half of the slop space is used. This would allow
users to use these operations to free up space in the pool when pool is
close to full but half of slop space is still free.
A very restricted set of operations that frees up space or change quota
are always permitted, regardless of the amount of free space.
MFC after: 2 weeks
Refresh zpool list for each interval in order to produce fresh
output.
Illumos issue: 4966 zpool list iterator does not update output
MFC after: 2 weeks
This includes:
o All directories named *ia64*
o All files named *ia64*
o All ia64-specific code guarded by __ia64__
o All ia64-specific makefile logic
o Mention of ia64 in comments and documentation
This excludes:
o Everything under contrib/
o Everything under crypto/
o sys/xen/interface
o sys/sys/elf_common.h
Discussed at: BSDcan
2915 DTrace in a zone should see "cpu", "curpsinfo", et al
2916 DTrace in a zone should be able to access fds[]
2917 DTrace in a zone should have limited provider access
MFC after: 2 weeks
Merge from r258379 missed the tests.
4248 dtrace(1M) should never create DOF with empty probes section
4249 Only probes from the first DTrace object file will be included
Illumos Revision: 54a20ab41aadcb81c53e72fc65886e964e9add59
MFC after: 5 days
Add a new zfs property, "redundant_metadata" which can have values "all" or
"most". The default will be "all", which is the current behavior. When set
to all, ZFS stores an extra copy of all metadata. If a single on-disk block
is corrupt, at worst a single block of user data (which is recordsize bytes
long) can be lost.
Setting to "most" will cause us to only store 1 copy of level-1 indirect
blocks of user data files. This can improve performance of random writes,
because less metadata has to be written. In practice, at worst about
100 blocks (of recordsize bytes each) of user data can be lost if a single
on-disk block is corrupt.
The exact behavior of which metadata blocks are stored redundantly may change
in future releases.
Illumos issue: 3835 zfs need not store 2 copies of all metadata
MFC after: 2 weeks
(4543:12bb2876a62e). Without this, some third party applications
may break because the lack of AVL related symbols.
FreeBSD base system are not affected because the FreeBSD ZFS command
line tools were all linked against libavl and thus hide the underlying
issue.
PR: java/183081
Tested by: jkim
MFC after: 3 days
illumos, rather than using "1.0" everywhere.
Some of the translators use D functions that are not present in version
1.0 (e.g. inet_ntoa()) which can result in libdtrace crashing when running
scripts that restrict themselves to version 1.0
(e.g. with "-x version=1.0").
MFC after: 1 week
FreeBSD ZFS port unlike OpenSolaris does not use device IDs, and does not
implement respective devid_*() fuctions. It is pointless to open devices
just to close them back immediately.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
The thread pool is used by libzfs to implement parallel disk scanning.
Without this change our dummy wrapper made `zpool import ZZZ` command to
scan all disks sequentially from the single thread when searching for pools.
This change makes it use two threads per CPU, same as in OpenSolaris.
On system with 200 HDDs this change reduces ZFS pool import time from 35
to 22 seconds.
fail to attach to stripped binaries. With the _r_debug_postinit symbol,
dtrace(1) can now set a breakpoint in the victim process after it has
registered its DOF table(s) with the kernel. r_debug_state cannot be used
for this purpose since it is called before DOF is made available, in which
case dtrace(1) cannot create USDT probes before the program begins
execution.
MFC after: 2 weeks
This change adds tests/ directories in the source tree to create various
subdirectories in /usr/tests/ and to install placeholder Kyuafiles for
them.
the relevant hierarchies are: cddl, etc, games, gnu and secure.
The reason for this is to simplify the addition of new test programs for
utilities or libraries under any of these directories. Doing so on a
case by case basis is unnecessary and is quite an obscure process.
and zdb(8) by growing the buffer on demand with a cap of 1GB (specified in
spa_history_create_obj()).
PR: bin/186574
Submitted by: Andrew Childs <lorne cons org nz> (with changes)
MFC after: 2 weeks
and finish the job. ncurses is now the only Makefile in the tree that
uses it since it wasn't a simple mechanical change, and will be
addressed in a future commit.
New ZFS property volmode and sysctl vfs.zfs.vol.mode allow switching ZVOL
between three modes:
geom -- existing fully functional behavior (default);
dev -- exposing volumes only as raw disk device file in devfs;
none -- not exposing volumes outside ZFS.
The "dev" mode is less functional (can't be partitioned, mounted, etc),
but it is faster, and in some scenarios with untrusted consumers safer.
It can be useful for NAS, VM block storages, etc.
The "none" mode may be convenient for backup servers, etc. that don't
need direct data access.
Due to the way ZVOL is integrated with main ZFS code, those property
and sysctl are checked only during pool import and volume creation.
MFC after: 1 month
Sponsored by: iXsystems, Inc.
4248 dtrace(1M) should never create DOF with empty probes section
4249 Only probes from the first DTrace object file will be included
Illumos Revision: 4a20ab41aadcb81c53e72fc65886e964e9add59
Reference:
https://www.illumos.org/issues/4248https://www.illumos.org/issues/4249
Obtained from: Illumos
MFC after: 1 month
Fix a memory leak in uu_avl_pool_create: pthread_mutex_init without
a corresponding pthread_mutex_destroy. It shows up, among other
places, when doing "zfs list".
MFC after: 3 weeks
Sponsored by: Spectra Logic Corporation
concatenates the DOF tables into one section. Previously, the USDT init
code in drti.o would only look at the first table in the DOF section; with
this change, it iterates over all the tables, passing each DOF table to
the kernel.
PR: 186821
Submitted by: Fedor Indutny <fedor@indutny.com>
MFC after: 1 month
illumos/illumos-gate@6fb4854bed
This fixes the tst.resize1.d and tst.resize2.d DTrace tests, which have
been failing since r261122 since they were causing dtrace(1) to attempt to
allocate and use large amounts of memory, and get killed by the OOM killer
as a result.
MFC after: 1 month
"Manpages should start a new sentence on a new line. This makes it easier
for translators to track changes." -jhb
Approved by: jhb
MFC after: 3 days
Sponsored by: SupraNet Communications, Inc
hot spares. This should be MFC'd to all STABLE branches.
Upon the availability of zfsd, the zpool manpage on relevant branches should
be updated to remove this caveat and document hot spare's reliance on zfsd.
Approved by: avg
MFC after: 1 week
Sponsored by: SupraNet Communications