When our linker merges .SUNW_dof sections from multiple files, it simply
concatenates the DOF tables into one section. Previously, the USDT init
code in drti.o would only look at the first table in the DOF section; with
this change, it iterates over all the tables, passing each DOF table to
the kernel.
PR: 186821
Improve extreme rewind import.
When doing an "extreme rewind" import ("zpool import -XF"), we attempt
to verify all data in the pool, essentially scrubbing the entire pool.
The problem is that spa_load_verify_cb() issues an unbounded number of
concurrent scrub i/os. This can lead to all of memory being used for
these zio's, wedging the system. Like normal scrub, we need to put a
cap on the number of outstanding i/os, and have the traverse thread
block when we reach this cap.
For this purpose the cap can be very large (10,000) to optimize the
elevator algorithm. Three kernel tunables have been added:
vfs.zfs.spa_load_verify_maxinflight
vfs.zfs.spa_load_verify_metadata
vfs.zfs.spa_load_verify_data
The latter two tunables controls whether metadata and/or user data
when doing extreme rewind.
Make 'zpool import -T' imply scrub.
Make zpool import -T <txg> accept hexadecimal values for the txg when
prefixed with 0x.
Skip txg's for which there is no uberblock when doing extreme rewind.
Skip reading all user data twice by skipping prefetches when doing
extreme rewinds as we do not access via the ARC.
Illumos issues:
4970 need controls on i/o issued by zpool import -XF
4971 zpool import -T should accept hex values
4972 zpool import -T implies extreme rewind, and thus a scrub
4973 spa_load_retry retries the same txg
4974 spa_load_verify() reads all data twice
Add a new zfs property, "redundant_metadata" which can have values "all" or
"most". The default will be "all", which is the current behavior. When set
to all, ZFS stores an extra copy of all metadata. If a single on-disk block
is corrupt, at worst a single block of user data (which is recordsize bytes
long) can be lost.
Setting to "most" will cause us to only store 1 copy of level-1 indirect
blocks of user data files. This can improve performance of random writes,
because less metadata has to be written. In practice, at worst about
100 blocks (of recordsize bytes each) of user data can be lost if a single
on-disk block is corrupt.
The exact behavior of which metadata blocks are stored redundantly may change
in future releases.
Illumos issue: 3835 zfs need not store 2 copies of all metadata
2915 DTrace in a zone should see "cpu", "curpsinfo", et al
2916 DTrace in a zone should be able to access fds[]
2917 DTrace in a zone should have limited provider access
4477 DTrace should speak JSON
Add stubs for CTF functions which are not yet implemented.
4474 DTrace Userland CTF Support
4475 DTrace userland Keyword
4476 DTrace tests should be better citizens
4479 pid provider types
4480 dof emulation is missing checks
4471 DTrace count() with histogram
4472 DTrace full width distribution histograms
4473 DTrace frequency trails
Merge from r258379 missed the tests.
4248 dtrace(1M) should never create DOF with empty probes section
4249 Only probes from the first DTrace object file will be included
Illumos Revision: 54a20ab41aadcb81c53e72fc65886e964e9add59
Define the KM_NORMALPRI flag for kmem_alloc(), as it is used in some
upstream DTrace code.
MFC r262330:
1452 DTrace buffer autoscaling should be less violent
illumos/illumos-gate@6fb4854bed
Comment out some pointless device open/close around reading device IDs.
FreeBSD ZFS port unlike OpenSolaris does not use device IDs, and does not
implement respective devid_*() fuctions. It is pointless to open devices
just to close them back immediately.
Add property and sysctl to control how ZVOLs are exposed to OS.
New ZFS property volmode and sysctl vfs.zfs.vol.mode allow switching ZVOL
between three modes:
geom -- existing fully functional behavior (default);
dev -- exposing volumes only as raw disk device file in devfs;
none -- not exposing volumes outside ZFS.
The "dev" mode is less functional (can't be partitioned, mounted, etc),
but it is faster, and in some scenarios with untrusted consumers safer.
It can be useful for NAS, VM block storages, etc.
The "none" mode may be convenient for backup servers, etc. that don't
need direct data access.
Due to the way ZVOL is integrated with main ZFS code, those property
and sysctl are checked only during pool import and volume creation.
4248 dtrace(1M) should never create DOF with empty probes section
4249 Only probes from the first DTrace object file will be included
Illumos Revision: 4a20ab41aadcb81c53e72fc65886e964e9add59
Reference:
https://www.illumos.org/issues/4248https://www.illumos.org/issues/4249
Obtained from: Illumos
Take into account when zpool history block grows exceeding 128KB in zpool(8)
and zdb(8) by growing the buffer on demand with a cap of 1GB (specified in
spa_history_create_obj()).
PR: bin/186574
Submitted by: Andrew Childs <lorne cons org nz> (with changes)
* Make die_mem_offset() be able to handle DW_AT_data_member_location
attributes generated by Clang 3.4.
* Document how different compilers generate DW_AT_data_member_location
attributes differently.
* Document the quirks about DW_FORM_data[48].
This is a slightly modified version, adapted to work with the old
libdwarf in stable/9 and stable/10. It should fix DTrace on these
branches, when the kernel is compiled with clang 3.4.
Note that you have to build *and* install the CTF tools first, before
building the kernel. Otherwise you can possibly still get error
messages similar to "failed to copy type of 'pr_uid': Type information
is in parent and unavailable", when attempting to run dtrace(1).
Submitted by: kaiw
cddl/contrib/opensolaris/lib/libuutil/common/uu_avl.c
Fix a memory leak in uu_avl_pool_create: pthread_mutex_init without
a corresponding pthread_mutex_destroy. It shows up, among other
places, when doing "zfs list".
Added support for the 'zfs list -t snap' and 'zfs snap' aliases which are
available under Oracle Solaris 11.
This includes an update to the ZFS(8) man page to reflect all the
available alias (snap, umount, and recv).
Initial changes obtained from ZFS On Linux + fixes for man page and cmd
help:
10b75496bbcf81b00a73
Obtained from: https://github.com/zfsonlinux/zfs
4370 avoid transmitting holes during zfs send
4371 DMU code clean up
illumos/illumos-gate@43466aae47
NOTE: Make sure the boot code is updated if a zpool upgrade is
done on boot zpool.
3306 zdb should be able to issue reads in parallel
3321 'zpool reopen' command should be documented in the man page
and help message
illumos/illumos-gate@31d7e8fa33
FreeBSD porting notes: the kernel part of this changeset depends
on Solaris buf(9S) interfaces and are not really applicable for
our use. vdev_disk.c is patched as-is to reduce diverge from
upstream, but vdev_file.c is left intact.
Add caveat to zpool manpage indicating that we do not automatically activate
hot spares. This should be MFC'd to all STABLE branches.
Upon the availability of zfsd, the zpool manpage on relevant branches should
be updated to remove this caveat and document hot spare's reliance on zfsd.
Requested by: feld
When clearing relocations to __dtrace* symbols, handle both SHT_REL and
SHT_RELA sections properly instead of assuming that the relocation section
is of type SHT_REL.
Use 'int' to store the return value of getopt(), rather than char.
On some architectures (powerpc), char is unsigned by default, which means
comparisons against -1 always fail, so the programs get stuck in an
infinite loop.
r256543:
Add fasttrap for PowerPC. This is the last piece of the DTrace/ppc puzzle.
It's incomplete, it doesn't contain full instruction emulation, but it should be
sufficient for most cases.
r259245,r259421: (FBT)
FBT now does work fully on PowerPC.
Save r3 before using it for the trap check, else we end up saving the new r3,
containing the trap instruction encoding (0x7c810808), and restoring it back
with the frame on return. This caused it to panic on my ppc32 machine.
r259668,r259674:
Fix a typo in the FBT code.
MFV r258373:
4168 ztest assertion failure in dbuf_undirty
4169 verbatim import causes zdb to segfa
4170 zhack leaves pool in ACTIVE state
illumos/illumos-gate@7fdd916c47