Merge changes from illumos:
3021 option for time-ordered output from dtrace(1M)
3022 DTrace: keys should not affect the sort order when sorting by value
3023 it should be possible to dereference dynamic variables
3024 D integer narrowing needs some work
3025 register leak in D code generation
3026 libdtrace should set LD_NOLAZYLOAD=1 to help the pid provider
This brings yet another feature implemented in upstream DTrace.
A complete description is available here:
http://dtrace.org/blogs/ahl/2012/07/28/my-new-dtrace-favorite/
This change bumps the DT_VERS_* number to 1.9.1 in
accordance to what is done in illumos.
This change was somewhat complicated because upstream is mixed many
changes in an individual commit and some of the tests don't really
apply to us.
There are also appear to be differences in timestamping with Solaris
so we had to workaround some assertions making sure no regression
happened.
Special thanks to Fabian Keil for changes and testing.
Illumos Revisions: 13758:23432da34147
Reference:
https://www.illumos.org/issues/3021https://www.illumos.org/issues/3022https://www.illumos.org/issues/3023https://www.illumos.org/issues/3024https://www.illumos.org/issues/3025https://www.illumos.org/issues/1694
Tested by: Fabian Keil
Obtained from: Illumos
MFC after: 1 months
release a non-existing snapshot of a existing dataset. In recursive case
error is reported if no snapshots with the requested name have been found.
Problem and proposed solution reported to illumos:
3699 zfs hold or release of a non-existent snapshot does not output error
MFC after: 8 days
doesn't copyout in this case.
To solve this issue a new struct zfs_iocparm_t is introduced consisting of:
- zfs_ioctl_version (future backwards compatibility purposes)
- user space pointer to zfs_cmd_t (copyin and copyout)
- size of zfs_cmd_t (verification purposes)
The copyin and copyout of zfs_cmd_t is now done the illumos (vendor) way
what makes porting of new changes easier and ensures correct behavior if
returning an error.
MFC after: 10 days
Merge change from vendor to reduce diff only.
ZFS dtrace probes are not supported on FreeBSD yet.
Illumos ZFS issues:
3598 want to dtrace when errors are generated in zfs
MFC after: 3 weeks
only other case where STT_FILE symbols are used, in symit_next() in
cddl/contrib/opensolaris/tools/ctf/cvt/input.c, save the basename of the
symbol, instead of the full pathname.
Reported by: avg
Tested by: avg, jimharris
MFC after: 1 week
Merge change from illumos:
1368 enablings on defunct providers prevent providers from unregistering
We try to address some underlying differences between the Solaris
and FreeBSD implementations: dtrace_attach() / dtrace_detach() are
currently unimplemented in FreeBSD but the new code from illumos
makes use of taskq so some adaptations were made to dtrace_open()
and dtrace_close() to handle them appropriately.
Illumos Revision: r13430:8e6add739e38
Reference:
https://www.illumos.org/issues/1368
Reviewed by: gnn
Tested by: Fabian Keil
Obtained from: Illumos
MFC after: 3 weeks
for and loading dependent modules. This addresses a bug seen with
io.d where it was being doubly included.
PR: 171678
Submitted by: Mark Johnston
MFC after: 2 weeks
Merge change from illumos:
1694 Add type-aware print() action
This is a very nice feature implemented in upstream Dtrace.
A complete description is available here:
http://dtrace.org/blogs/eschrock/2011/10/26/your-mdb-fell-into-my-dtrace/
This change bumps the DT_VERS_* number to 1.9.0 in
accordance to what is done in illumos.
While here also include some minor cleanups to ease further merging
and appease clang with a fix by Fabian Keil.
Illumos Revisions: 13501:c3a7090dbc16
13483:f413e6c5d297
Reference:
https://www.illumos.org/issues/1560https://www.illumos.org/issues/1694
Tested by: Fabian Keil
Obtained from: Illumos
MFC after: 1 month
Merge changes from illumos:
1451 DTrace needs toupper()/tolower() subroutines
1457 lltostr() D subroutine should take an optional base
This change bumps the DT_VERS_* number to 1.8.1 in
accordance to what is done in illumos.
The test suite we currently include is outdated and
doesnt support some updates in tst.subr.d which had to
be left out for now.
Illumos Revisions: r13458 5e394d8db762
r13459 c3454574dd1a
Reference:
https://www.illumos.org/issues/1451https://www.illumos.org/issues/1457
Tested by: Fabian Keil
Obtained from: Illumos
MFC after: 1 month
Merge change from illumos:
1455 DTrace tracemem() should take an optional size argument
Our local enhancements to dt_print_bytes were equivalent to
those in illumos but we made it match the illumos version
to ease further code merges.
For now leave out tst.smallsize.d and tst.smallsize.d.out
since those don't seem to work cleanly on FreeBSD.
This change bumps the DT_VERS_* number to 1.7.1 in accordance
to what is done in illumos.
Illumos Revision: 13457:571b0355c2e3
Reference:
https://www.illumos.org/issues/1455
Tested by: Fabian Keil
Obtained from: Illumos
MFC after: 1 month
It is not guaranteed that a program has a symbol table entry for main
and thus that it would be possible to set a breakpoint on it.
Reviewed by: rpaulo
Discussed with: rpaulo
MFC after: 13 days
includes MFV 238590, 238592, 247580
MFV 238590, 238592:
In the first zfs ioctl restructuring phase, the libzfs_core library was
introduced. It is a new thin library that wraps around kernel ioctl's.
The idea is to provide a forward-compatible way of dealing with new
features. Arguments are passed in nvlists and not random zfs_cmd fields,
new-style ioctls are logged to pool history using a new method of
history logging.
http://blog.delphix.com/matt/2012/01/17/the-future-of-libzfs/
MFV 247580 [1]:
To address issues of several deadlocks and race conditions the locking
code around dsl_dataset was rewritten and the interface to synctasks
was changed.
User-Visible Changes:
"zfs snapshot" can create more arbitrary snapshots at once (atomically)
"zfs destroy" destroys multiple snapshots at once
"zfs recv" has improved performance
Backward Compatibility:
I have extended the compatibility layer to support full backward
compatibility by remapping or rewriting the responsible ioctl arguments.
Old utilities are fully supported by the new kernel module.
Forward Compatibility:
New utilities work with old kernels with the following restrictions:
- creating, destroying, holding and releasing of multiple snapshots
at once is not supported, this includes recursive (-r) commands
Illumos ZFS issues:
2882 implement libzfs_core
2900 "zfs snapshot" should be able to create multiple,
arbitrary snapshots at once
3464 zfs synctask code needs restructuring
References:
https://www.illumos.org/issues/2882https://www.illumos.org/issues/2900https://www.illumos.org/issues/3464 [1]
MFC after: 1 month
Sponsored by: Hybrid Logic Inc. [1]
- provide complete backwards compatibility (old utility, new kernel)
- add zfs_cmd_t compatibility mapping in both directions
- determine ioctl address in zfs_ioctl_compat.c
Import minor ZFS changes from vendor
Illumos ZFS issues:
3604 zdb should print bpobjs more verbosely (fix zdb hang)
3606 zpool status -x shouldn't warn about old on-disk format
MFC after: 3 days
puts the full original source filename in the STT_FILE entry of the ELF
symbol table, while gcc saves only the basename.
Since the DWARF DW_AT_name attribute contains the full source filename,
both for clang and gcc, ctfconvert takes just the basename of it, for
matching with the STT_FILE entry. So when attempting to match with such
an entry, use its basename, if necessary.
Reported by: avg
MFC after: 1 week
Merge new read-only zfs properties from vendor (illumos)
Illumos ZFS issues:
3588 provide zfs properties for logical (uncompressed) space used and
referenced
References:
https://www.illumos.org/issues/3588
MFC after: 2 weeks
in r247265 (ZFS deadman thread). Both new utilities now support the old
kernel and new kernel properly detects old utilities.
For future backwards compatibility, the vfs.zfs.version.ioctl read-only
sysctl has been introduced. With this sysctl zfs utilities will be able
to detect the ioctl interface version of the currently loaded zfs module.
As a side effect, the zfs utilities between r247265 and this revision don't
support the old kernel module. If you are using HEAD newer or equal than
r247265, install the new kernel module (or whole kernel) first.
MFC after: 10 days
Merge the ZFS I/O deadman thread from vendor (illumos).
This feature panics the system on hanging ZFS I/O, helps debugging
and resumes failed service.
The panic behavior can be controlled with the loader-only tunables:
vfs.zfs.deadman_enabled (enable or disable panic on stalled ZFS I/O)
vfs.zfs.deadman_synctime (expiration time for stalled ZFS I/O)
By default, ZFS I/O deadman is enabled by default on amd64 and i386
excluding virtual guest machines.
Illumos ZFS issues:
3246 ZFS I/O deadman thread
References:
https://www.illumos.org/issues/3246
MFC after: 2 weeks
kernel. Properly restore, continue, and detach from processes
being DTraced when DTrace exits with an error so the program
being inspected is not terminated.
cddl/contrib/opensolaris/cmd/dtrace/dtrace.c:
In fatal(), the generic error handler, close the DTrace
handle as is done in the "probe/script" error handler
dfatal(). fatal() can be invoked after DTrace attaches
to processes (e.g. a script specified by command line
argument can't be found) and closing the handle will
release them.
Submitted by: Spectra Logic Corporation
Reviewed by: rpaulo, gnn
* Illumos zfs issue #3035 [1] LZ4 compression support in ZFS.
LZ4 is a new high-speed BSD-licensed compression algorithm created
by Yann Collet that delivers very high compression and decompression
performance compared to lzjb (>50% faster on compression, >80% faster
on decompression and around 3x faster on compression of incompressible
data), while giving better compression ratio [1].
This version of LZ4 corresponds to upstream's [2] revision 85.
Please note that for obvious reasons this is not backward read
compatible. This means once a pool have LZ4 compressed data, these
data can no longer be read by older ZFS implementations.
Local changes:
- On-stack hash table disabled and using kernel slab allocator
instead, at this time. This requires larger kernel thread stack
for zio workers. This may change in the future should we adjusted
the zio workers' thread stack size.
- likely and unlikely will be undefined if they are already defined,
this is required for i386 XEN build.
- Removed De Bruijn sequence based __builtin_ctz family of builtins
in favor of the latter. Both GCC and clang supports these builtins.
- Changed the way the LZ4 code detects endianness.
- Manual pages modifications to mention the feature based on Illumos
counterpart.
- Boot loader changes to make it support LZ4 decompression.
[1] https://www.illumos.org/issues/3035
[2] http://code.google.com/p/lz4/source/list
Obtained from: Illumos (13921:9d721847e469)
Tested on: FreeBSD/amd64
MFC after: 1 month
of files other than the actual libraries.
Use LIBRARIES_ONLY to supress the inclusion of files in the lib32
distribution that are duplicates of files in base.
Sponsored by: DARPA, AFRL
Reviewed by: emaste
the underlying zap_count() to return no errors. However, it is possible
that the pool reaches to such a state where zap_count would return error,
leading to panics when a pool is imported.
This commit changes the ddt_zap_count to return error returned from
zap_count and handle the error appropriately. With this change, it's now
possible to let zpool rollback damaged transaction groups and import the
pool.
Obtained from: ZFS on Linux github (e8fd45a0f9)
MFC after: 1 month
random order instead of creation order.
Eliminates needless filesystem renames caused by removed parent snapshots
which subsequently causes many more errors.
PR: kern/172259
Submitted by: Steven Hartland
Reviewed by: pjd (mentor)
Approved by: pjd (mentor)
MFC after: 2 weeks
Introduce a new dataset aclmode setting "restricted" to protect ACL's
being destroyed or corrupted by a drive-by chmod.
illumos-gate 13889:a67716f16746
3254 add support in zfs for aclmode=restricted
References:
https://www.illumos.org/issues/3254
MFC after: 2 weeks
Import the zio nop-write improvement from Illumos. To reduce I/O,
nop-write omits overwriting data if the checksum (cryptographically
secure) of new data matches the checksum of existing data.
It also saves space if snapshots are in use.
It currently works only on datasets with enabled compression, disabled
deduplication and sha256 checksums.
IllumOS 13887:196932ec9e6a and 13888:7204b3392a58
3236 zio nop-write
References:
https://www.illumos.org/issues/3236
MFC after: 2 weeks
Illumos 13886:e3261d03efbf
3349 zpool upgrade -V bumps the on disk version number, but leaves
the in core version
References:
https://www.illumos.org/issues/3349
MFC after: 1 week
Illumos r13840:97fd5cdf328a:
3145 single-copy arc
3212 ztest: race condition between vdev_online() and spa_vdev_remove()
Illumos r13849:3468a95b27cd:
3258 ztest's use of file descriptors is unstable
There is one known issue: Some probes will display an error message along the
lines of: "Invalid address (0)"
I tested this with both a simple dtrace probe and dtruss on a few different
binaries on 32-bit. I only compiled 64-bit, did not run it, but I don't expect
problems without the modules loaded. Volunteers are welcome.
MFC after: 1 month
In particular, do not lock Giant conditionally when calling into the
filesystem module, remove the VFS_LOCK_GIANT() and related
macros. Stop handling buffers belonging to non-mpsafe filesystems.
The VFS_VERSION is bumped to indicate the interface change which does
not result in the interface signatures changes.
Conducted and reviewed by: attilio
Tested by: pho
doesn't exist on a dataset we are starting from. For example if we
have the following configuration:
tank
tank/foo
tank/foo@snap
tank/bar
tank/bar@snap
We can execute:
# zfs destroy -t tank@snap
eventhough tank@snap doesn't exit.
Unfortunately it is not possible to do the same with recursive rename:
# zfs rename -r tank@snap tank@pans
cannot open 'tank@snap': dataset does not exist
...until now. This change allows to recursively rename snapshots even if
snapshot doesn't exist on the starting dataset.
Sponsored by: rsync.net
MFC after: 2 weeks
The code builds a map of regions that were freed. On every write the
code consults the map and eventually removes ranges that were freed
before, but are now overwritten.
Freed blocks are not TRIMed immediately. There is a tunable that defines
how many txg we should wait with TRIMming freed blocks (64 by default).
There is a low priority thread that TRIMs ranges when the time comes.
During TRIM we keep in-flight ranges on a list to detect colliding
writes - we have to delay writes that collide with in-flight TRIMs in
case something will be reordered and write will reached the disk before
the TRIM. We don't have to do the same for in-flight writes, as
colliding writes just remove ranges to TRIM.
Sponsored by: multiplay.co.uk
This work includes some important fixes and some improvements obtained
from the zfsonlinux project, including TRIMming entire vdevs on pool
create/add/attach and on pool import for spare and cache vdevs.
Obtained from: zfsonlinux
Submitted by: Etienne Dechamps <etienne.dechamps@ovh.net>
impatient if we sent more than 2 INT signals. This fixes a bug where
we wouldn't see aggregations print on the command line if we Ctrl-C'd
a dtrace script or command line invocation.
MFC after: 2 weeks
Merge changes from illumos
906 dtrace depends_on pragma should search all library paths, not just the
current one
949 dtrace should only include the first instance of a library found on
its library path
Illumos Revisions: 13353:936a1e45726c
13354:2b2c36a81512
Reference:
https://www.illumos.org/issues/906https://www.illumos.org/issues/949
Tested by: Fabian Keil
Obtained from: Illumos
MFC after: 3 weeks
These probes are most useful when looking into the structures
they provide, which are listed in io.d. For example:
dtrace -n 'io:genunix::start { printf("%d\n", args[0]->bio_bcount); }'
Note that the I/O systems in FreeBSD and Solaris/Illumos are sufficiently
different that there is not a 1:1 mapping from scripts that work
with one to the other.
MFC after: 1 month
Bryan Cantrill implemented the equivalent of semi-log graph
paper for Dtrace so llquantize will use one logarithmic and
one linear scale.
Special thanks to Mark Peek for providing fix to an
assertion and to Fabian Keill for testing the port.
Illumos Revision: 13355:15b74a2a9a9d
Reference:
https://www.illumos/issues/905
Obtained from: Illumos
Tested by: Fabian Keill, mp
MFC after: 4 days
1948 zpool list should show more detailed pool information
Display per-vdev information with "zpool list -v".
The added expandsize property has currently no value on FreeBSD.
This changeset allows adding expansion support to individual vdevs
in the future.
References:
https://www.illumos.org/issues/1948
Obtained from: illumos (issue #1948)
MFC after: 2 weeks
2703 add mechanism to report ZFS send progress
If the zfs send command is used with the -v flag, the amount of bytes
transmitted is reported in per second updates.
References:
https://www.illumos.org/issues/2703
Obtained from: illumos (issue #2703)
MFC after: 2 weeks
in the system-wide version of <stdlib.h> by wrapping the #include_next
<stdlib.h> within the scope of its own header protection. On FreeBSD this has
no effect, since both header protections are equivalent. However the GNU version
of <stdlib.h> implements a special header protection mechanism which allows it
to be included multiple times (in different modes).
Simply by moving the #include_next off the header protection, we allow
system-wide <stdlib.h> to implement its own protection policy, whichever that
may be.
to follow the example of OpenSolaris and its descendants, which implemented
cpu as an inline that took a value out of curthread. At certain points in
the FreeBSD scheduler curthread->td_oncpu will no longer be valid (in
particukar, just before the thread gets descheduled) so instead I have
implemented this as its own built-in variable.
Sponsored by: Sandvine Inc.
MFC after: 1 week
CTF format is not cross-platform by design, e.g. it is not guaranteed
that data generated by ctfconvert/ctfmerge on one architecture will
be successfuly read on another. CTF structures are saved/restored
using naive approach. Roughly it looks like:
write(fd, &ctf_struct, sizeof(ctf_struct))
read(fd, &ctf_struct, sizeof(ctf_struct))
By sheer luck memory layout of all type-related CTF structures is the same
on amd64/i386/mips32/mips64. It's different on ARM though. sparc, ia64,
powerpc, and powerpc64 were not tested. So in order to get file compatible
with dtrace on ARM it should be compiled on ARM. Alternative solution would
be to have "signatures" for every platform and ctfmerge should convert host's
reperesentation of CTF structure to target's one using "signature" as template.
This patch checks byte order of ELF files used for generating CTF record
and makes sure that byte order of data written to resulting files is the same
as target's byte order.
allow.mount.zfs:
allow mounting the zfs filesystem inside a jail
This way the permssions for mounting all current VFCF_JAIL filesystems
inside a jail are controlled wia allow.mount.* jail parameters.
Update sysctl descriptions.
Update jail(8) and zfs(8) manpages.
TODO: document the connection of allow.mount.* and VFCF_JAIL for kernel
developers
MFC after: 10 days
add support for "-t <datatype>" argument to zfs get
References:
https://www.illumos.org/issues/1936
Update zfs(8) manpage in respect of [1].
Fix typo in zfs(8) manpage.
Obtained from: illumos (issue #1936)
MFC after: 1 week
happen to have a .exe extension.
While here fix the shebang of a shell script that
was looking for /bin/bash.
Reviewed by: gnn
Approved by: jhb (mentor)
MFC after: 2 weeks
names and wants to sort them by name, ie. when executes:
# zfs list -t snapshot -o name -s name
Because only name is needed we don't have to read all snapshot properties.
Below you can find how long does it take to list 34509 snapshots from a single
disk pool before and after this change with cold and warm cache:
before:
# time zfs list -t snapshot -o name -s name > /dev/null
cold cache: 525s
warm cache: 218s
after:
# time zfs list -t snapshot -o name -s name > /dev/null
cold cache: 1.7s
warm cache: 1.1s
MFC after: 1 week
uint64_t values are snprintf'd using %llx. On amd64, uint64_t is
typedef'd as unsigned long, so cast the values to u_longlong_t, as is
done similarly in the rest of the file.
MFC after: 1 week
uint64_t values are snprintf'd using %llx. On amd64, uint64_t is
typedef'd as unsigned long, so cast the values to u_longlong_t, as is
done similarly in the rest of the file.
MFC after: 1 week
dt_popc() function assumes that either _ILP32 or _LP64 is defined,
otherwise it has no suitable implementation.
However, the _ILP32 and _LP64 macros come from isa_defs.h, which is not
included in this file. Add the include now, to get the macros defined.
MFC after: 1 week