OpenZFS merge main-gf11b09

- add dRAID support
- fix duplicate close handling
- fix memory leak in prefetch
- fix problem with SIMD benchmarking on FreeBSD boot
...
This commit is contained in:
Matt Macy 2021-01-07 15:27:17 -08:00
parent 84089de83e
commit 7877fdebee
471 changed files with 34205 additions and 6800 deletions

View File

@@ -126,8 +126,8 @@ feature needed? What problem does it solve?
 #### General
-* All pull requests must be based on the current master branch and apply
-  without conflicts.
+* All pull requests, except backports and releases, must be based on the current master branch
+  and should apply without conflicts.
 * Please attempt to limit pull requests to a single commit which resolves
   one specific issue.
 * Make sure your commit messages are in the correct format. See the
@@ -230,70 +230,6 @@ attempting to solve.
 Signed-off-by: Contributor <contributor@email.com>
 ```
-
-#### OpenZFS Patch Ports
-If you are porting OpenZFS patches, the commit message must meet
-the following guidelines:
-* The first line must be the summary line from the most important OpenZFS commit being ported.
-  It must begin with `OpenZFS dddd, dddd - ` where `dddd` are OpenZFS issue numbers.
-* Provides a `Authored by:` line to attribute each patch for each original author.
-* Provides the `Reviewed by:` and `Approved by:` lines from each original
-  OpenZFS commit.
-* Provides a `Ported-by:` line with the developer's name followed by
-  their email for each OpenZFS commit.
-* Provides a `OpenZFS-issue:` line with link for each original illumos
-  issue.
-* Provides a `OpenZFS-commit:` line with link for each original OpenZFS commit.
-* If necessary, provide some porting notes to describe any deviations from
-  the original OpenZFS commits.
-
-An example OpenZFS patch port commit message for a single patch is provided
-below.
-```
-OpenZFS 1234 - Summary from the original OpenZFS commit
-
-Authored by: Original Author <original@email.com>
-Reviewed by: Reviewer One <reviewer1@email.com>
-Reviewed by: Reviewer Two <reviewer2@email.com>
-Approved by: Approver One <approver1@email.com>
-Ported-by: ZFS Contributor <contributor@email.com>
-
-Provide some porting notes here if necessary.
-
-OpenZFS-issue: https://www.illumos.org/issues/1234
-OpenZFS-commit: https://github.com/openzfs/openzfs/commit/abcd1234
-```
-
-If necessary, multiple OpenZFS patches can be combined in a single port.
-This is useful when you are porting a new patch and its subsequent bug
-fixes. An example commit message is provided below.
-```
-OpenZFS 1234, 5678 - Summary of most important OpenZFS commit
-
-1234 Summary from original OpenZFS commit for 1234
-
-Authored by: Original Author <original@email.com>
-Reviewed by: Reviewer Two <reviewer2@email.com>
-Approved by: Approver One <approver1@email.com>
-Ported-by: ZFS Contributor <contributor@email.com>
-
-Provide some porting notes here for 1234 if necessary.
-
-OpenZFS-issue: https://www.illumos.org/issues/1234
-OpenZFS-commit: https://github.com/openzfs/openzfs/commit/abcd1234
-
-5678 Summary from original OpenZFS commit for 5678
-
-Authored by: Original Author2 <original2@email.com>
-Reviewed by: Reviewer One <reviewer1@email.com>
-Approved by: Approver Two <approver2@email.com>
-Ported-by: ZFS Contributor <contributor@email.com>
-
-Provide some porting notes here for 5678 if necessary.
-
-OpenZFS-issue: https://www.illumos.org/issues/5678
-OpenZFS-commit: https://github.com/openzfs/openzfs/commit/efgh5678
-```
-
 #### Coverity Defect Fixes
 If you are submitting a fix to a
 [Coverity defect](https://scan.coverity.com/projects/zfsonlinux-zfs),

View File

@@ -0,0 +1,53 @@
---
name: Bug report
about: Create a report to help us improve OpenZFS
title: ''
labels: 'Type: Defect, Status: Triage Needed'
assignees: ''
---

<!-- Please fill out the following template, which will help other contributors address your issue. -->

<!--
Thank you for reporting an issue.

*IMPORTANT* - Please check our issue tracker before opening a new issue.
Additional valuable information can be found in the OpenZFS documentation
and mailing list archives.

Please fill in as much of the template as possible.
-->

### System information
<!-- add version after "|" character -->
Type | Version/Name
--- | ---
Distribution Name |
Distribution Version |
Linux Kernel |
Architecture |
ZFS Version |
SPL Version |

<!--
Commands to find ZFS/SPL versions:
modinfo zfs | grep -iw version
modinfo spl | grep -iw version
-->

### Describe the problem you're observing

### Describe how to reproduce the problem

### Include any warning/errors/backtraces from the system logs

<!--
*IMPORTANT* - Please mark logs and text output from terminal commands
or else Github will not display them correctly.
An example is provided below.

Example:
```
this is an example how log text should be marked (wrap it with ```)
```
-->

View File

@@ -0,0 +1,14 @@
blank_issues_enabled: false
contact_links:
  - name: OpenZFS Questions
    url: https://github.com/openzfs/zfs/discussions/new
    about: Ask the community for help
  - name: OpenZFS Community Support Mailing list (Linux)
    url: https://zfsonlinux.topicbox.com/groups/zfs-discuss
    about: Get community support for OpenZFS on Linux
  - name: FreeBSD Community Support Mailing list
    url: https://lists.freebsd.org/mailman/listinfo/freebsd-fs
    about: Get community support for OpenZFS on FreeBSD
  - name: OpenZFS on IRC
    url: https://webchat.freenode.net/#openzfs
    about: Use IRC to get community support for OpenZFS

View File

@@ -0,0 +1,33 @@
---
name: Feature request
about: Suggest a feature for OpenZFS
title: ''
labels: 'Type: Feature'
assignees: ''
---

<!--
Thank you for suggesting a feature.

Please check our issue tracker before opening a new feature request.
Filling out the following template will help other contributors better understand your proposed feature.
-->

### Describe the feature you would like to see added to OpenZFS

<!--
Provide a clear and concise description of the feature.
-->

### How will this feature improve OpenZFS?

<!--
What problem does this feature solve?
-->

### Additional context

<!--
Any additional information you can add about the proposal?
-->

sys/contrib/openzfs/.github/codecov.yml
View File

@@ -0,0 +1,25 @@
codecov:
  notify:
    require_ci_to_pass: false  # always post
    after_n_builds: 2          # user and kernel

coverage:
  precision: 0        # 0 decimals of precision
  round: nearest      # Round to nearest precision point
  range: "50...90"    # red -> yellow -> green

  status:
    project:
      default:
        threshold: 1%  # allow 1% coverage variance

    patch:
      default:
        threshold: 1%  # allow 1% coverage variance

comment:
  layout: "reach, diff, flags, footer"
  behavior: once         # update if exists; post new; skip if deleted
  require_changes: yes   # only post when coverage changes

# ignore: Please place any ignores in config/ax_code_coverage.m4 instead

View File

@ -0,0 +1,13 @@
# Configuration for probot-no-response - https://github.com/probot/no-response
# Number of days of inactivity before an Issue is closed for lack of response
daysUntilClose: 31
# Label requiring a response
responseRequiredLabel: "Status: Feedback requested"
# Comment to post when closing an Issue for lack of response. Set to `false` to disable
closeComment: >
This issue has been automatically closed because there has been no response
to our request for more information from the original author. With only the
information that is currently in the issue, we don't have enough information
to take action. Please reach out if you have or find the answers we need so
that we can investigate further.

sys/contrib/openzfs/.github/stale.yml
View File

@@ -0,0 +1,26 @@
# Number of days of inactivity before an issue becomes stale
daysUntilStale: 365
# Number of days of inactivity before a stale issue is closed
daysUntilClose: 90
# Limit to only `issues` or `pulls`
only: issues
# Issues with these labels will never be considered stale
exemptLabels:
  - "Type: Feature"
  - "Bot: Not Stale"
  - "Status: Work in Progress"
# Set to true to ignore issues in a project (defaults to false)
exemptProjects: true
# Set to true to ignore issues in a milestone (defaults to false)
exemptMilestones: true
# Set to true to ignore issues with an assignee (defaults to false)
exemptAssignees: true
# Label to use when marking an issue as stale
staleLabel: "Status: Stale"
# Comment to post when marking an issue as stale. Set to `false` to disable
markComment: >
  This issue has been automatically marked as "stale" because it has not had
  any activity for a while. It will be closed in 90 days if no further activity occurs.
  Thank you for your contributions.
# Limit the number of actions per hour, from 1-30. Default is 30
limitPerRun: 6

View File

@ -0,0 +1,36 @@
name: checkstyle
on:
push:
pull_request:
jobs:
checkstyle:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Install dependencies
run: |
sudo apt-get update
sudo apt-get install --yes -qq build-essential autoconf libtool gawk alien fakeroot linux-headers-$(uname -r)
sudo apt-get install --yes -qq zlib1g-dev uuid-dev libattr1-dev libblkid-dev libselinux-dev libudev-dev libssl-dev python-dev python-setuptools python-cffi python3 python3-dev python3-setuptools python3-cffi
# packages for tests
sudo apt-get install --yes -qq parted lsscsi ksh attr acl nfs-kernel-server fio
sudo apt-get install --yes -qq mandoc cppcheck pax-utils devscripts abigail-tools
sudo -E pip --quiet install flake8
- name: Prepare
run: |
sh ./autogen.sh
./configure
make -j$(nproc)
- name: Checkstyle
run: |
make checkstyle
- name: Lint
run: |
make lint
- name: CheckABI
run: |
make checkabi

View File

@ -0,0 +1,58 @@
name: zfs-tests-sanity
on:
push:
pull_request:
jobs:
tests:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Install dependencies
run: |
sudo apt-get update
sudo apt-get install --yes -qq build-essential autoconf libtool gdb lcov \
git alien fakeroot wget curl bc fio acl \
sysstat mdadm lsscsi parted gdebi attr dbench watchdog ksh \
nfs-kernel-server samba rng-tools xz-utils \
zlib1g-dev uuid-dev libblkid-dev libselinux-dev \
xfslibs-dev libattr1-dev libacl1-dev libudev-dev libdevmapper-dev \
libssl-dev libffi-dev libaio-dev libelf-dev libmount-dev \
libpam0g-dev pamtester python-dev python-setuptools python-cffi \
python3 python3-dev python3-setuptools python3-cffi
- name: Autogen.sh
run: |
sh autogen.sh
- name: Configure
run: |
./configure --enable-debug --enable-debuginfo
- name: Make
run: |
make --no-print-directory -s pkg-utils pkg-kmod
- name: Install
run: |
sudo dpkg -i *.deb
# Update order of directories to search for modules, otherwise
# Ubuntu will load kernel-shipped ones.
sudo sed -i.bak 's/updates/extra updates/' /etc/depmod.d/ubuntu.conf
sudo depmod
sudo modprobe zfs
- name: Tests
run: |
/usr/share/zfs/zfs-tests.sh -v -s 3G -r sanity
- name: Prepare artifacts
if: failure()
run: |
RESULTS_PATH=$(readlink -f /var/tmp/test_results/current)
sudo dmesg > $RESULTS_PATH/dmesg
sudo cp /var/log/syslog $RESULTS_PATH/
sudo chmod +r $RESULTS_PATH/*
- uses: actions/upload-artifact@v2
if: failure()
with:
name: Test logs
path: /var/tmp/test_results/20*/
if-no-files-found: ignore

View File

@ -0,0 +1,67 @@
name: zloop
on:
push:
pull_request:
jobs:
tests:
runs-on: ubuntu-latest
env:
TEST_DIR: /var/tmp/zloop
steps:
- uses: actions/checkout@v2
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Install dependencies
run: |
sudo apt-get update
sudo apt-get install --yes -qq build-essential autoconf libtool gdb \
git alien fakeroot \
zlib1g-dev uuid-dev libblkid-dev libselinux-dev \
xfslibs-dev libattr1-dev libacl1-dev libudev-dev libdevmapper-dev \
libssl-dev libffi-dev libaio-dev libelf-dev libmount-dev \
libpam0g-dev \
python-dev python-setuptools python-cffi \
python3 python3-dev python3-setuptools python3-cffi
- name: Autogen.sh
run: |
sh autogen.sh
- name: Configure
run: |
./configure --enable-debug --enable-debuginfo
- name: Make
run: |
make --no-print-directory -s pkg-utils pkg-kmod
- name: Install
run: |
sudo dpkg -i *.deb
# Update order of directories to search for modules, otherwise
# Ubuntu will load kernel-shipped ones.
sudo sed -i.bak 's/updates/extra updates/' /etc/depmod.d/ubuntu.conf
sudo depmod
sudo modprobe zfs
- name: Tests
run: |
sudo mkdir -p $TEST_DIR
# run for 20 minutes to have a total runner time of 30 minutes
sudo /usr/share/zfs/zloop.sh -t 1200 -l -m1
- name: Prepare artifacts
if: failure()
run: |
sudo chmod +r -R $TEST_DIR/
- uses: actions/upload-artifact@v2
if: failure()
with:
name: Logs
path: |
/var/tmp/zloop/*/
!/var/tmp/zloop/*/vdev/
if-no-files-found: ignore
- uses: actions/upload-artifact@v2
if: failure()
with:
name: Pool files
path: |
/var/tmp/zloop/*/vdev/
if-no-files-found: ignore

View File

@@ -2,9 +2,9 @@ Meta: 1
 Name: zfs
 Branch: 1.0
 Version: 2.0.0
-Release: rc3
+Release: rc1
 Release-Tags: relext
 License: CDDL
 Author: OpenZFS
-Linux-Maximum: 5.9
+Linux-Maximum: 5.10
 Linux-Minimum: 3.10

View File

@@ -136,6 +136,13 @@ shellcheck:
 	    echo "skipping shellcheck because shellcheck is not installed"; \
 	fi

+PHONY += checkabi storeabi
+checkabi: lib
+	$(MAKE) -C lib checkabi
+
+storeabi: lib
+	$(MAKE) -C lib storeabi
+
 PHONY += checkbashisms
 checkbashisms:
 	@if type checkbashisms > /dev/null 2>&1; then \
@@ -152,9 +159,10 @@ checkbashisms:
 	    -o -name 'smart' -prune \
 	    -o -name 'paxcheck.sh' -prune \
 	    -o -name 'make_gitrev.sh' -prune \
+	    -o -name '90zfs' -prune \
 	    -o -type f ! -name 'config*' \
 	    ! -name 'libtool' \
-	    -exec bash -c 'awk "NR==1 && /\#\!.*bin\/sh.*/ {print FILENAME;}" "{}"' \;); \
+	    -exec sh -c 'awk "NR==1 && /\#\!.*bin\/sh.*/ {print FILENAME;}" "{}"' \;); \
 	else \
 	    echo "skipping checkbashisms because checkbashisms is not installed"; \
 	fi

View File

@@ -1,5 +1,6 @@
 SUBDIRS = zfs zpool zdb zhack zinject zstream zstreamdump ztest
 SUBDIRS += fsck_zfs vdev_id raidz_test zfs_ids_to_path
+SUBDIRS += zpool_influxdb

 if USING_PYTHON
 SUBDIRS += arcstat arc_summary dbufstat

View File

@@ -59,14 +59,20 @@ if sys.platform.startswith('freebsd'):
     # Requires py27-sysctl on FreeBSD
     import sysctl

+    def is_value(ctl):
+        return ctl.type != sysctl.CTLTYPE_NODE
+
     def load_kstats(namespace):
         """Collect information on a specific subsystem of the ARC"""
         base = 'kstat.zfs.misc.%s.' % namespace
-        return [(kstat.name, D(kstat.value)) for kstat in sysctl.filter(base)]
+        fmt = lambda kstat: (kstat.name, D(kstat.value))
+        kstats = sysctl.filter(base)
+        return [fmt(kstat) for kstat in kstats if is_value(kstat)]

     def load_tunables():
-        return dict((ctl.name, ctl.value) for ctl in sysctl.filter('vfs.zfs'))
+        ctls = sysctl.filter('vfs.zfs')
+        return dict((ctl.name, ctl.value) for ctl in ctls if is_value(ctl))

 elif sys.platform.startswith('linux'):
@@ -219,12 +225,30 @@ def get_arc_summary(Kstat):
     deleted = Kstat["kstat.zfs.misc.arcstats.deleted"]
     mutex_miss = Kstat["kstat.zfs.misc.arcstats.mutex_miss"]
     evict_skip = Kstat["kstat.zfs.misc.arcstats.evict_skip"]
+    evict_l2_cached = Kstat["kstat.zfs.misc.arcstats.evict_l2_cached"]
+    evict_l2_eligible = Kstat["kstat.zfs.misc.arcstats.evict_l2_eligible"]
+    evict_l2_eligible_mfu = Kstat["kstat.zfs.misc.arcstats.evict_l2_eligible_mfu"]
+    evict_l2_eligible_mru = Kstat["kstat.zfs.misc.arcstats.evict_l2_eligible_mru"]
+    evict_l2_ineligible = Kstat["kstat.zfs.misc.arcstats.evict_l2_ineligible"]
+    evict_l2_skip = Kstat["kstat.zfs.misc.arcstats.evict_l2_skip"]

     # ARC Misc.
     output["arc_misc"] = {}
     output["arc_misc"]["deleted"] = fHits(deleted)
-    output["arc_misc"]['mutex_miss'] = fHits(mutex_miss)
-    output["arc_misc"]['evict_skips'] = fHits(evict_skip)
+    output["arc_misc"]["mutex_miss"] = fHits(mutex_miss)
+    output["arc_misc"]["evict_skips"] = fHits(evict_skip)
+    output["arc_misc"]["evict_l2_skip"] = fHits(evict_l2_skip)
+    output["arc_misc"]["evict_l2_cached"] = fBytes(evict_l2_cached)
+    output["arc_misc"]["evict_l2_eligible"] = fBytes(evict_l2_eligible)
+    output["arc_misc"]["evict_l2_eligible_mfu"] = {
+        'per': fPerc(evict_l2_eligible_mfu, evict_l2_eligible),
+        'num': fBytes(evict_l2_eligible_mfu),
+    }
+    output["arc_misc"]["evict_l2_eligible_mru"] = {
+        'per': fPerc(evict_l2_eligible_mru, evict_l2_eligible),
+        'num': fBytes(evict_l2_eligible_mru),
+    }
+    output["arc_misc"]["evict_l2_ineligible"] = fBytes(evict_l2_ineligible)

     # ARC Sizing
     arc_size = Kstat["kstat.zfs.misc.arcstats.size"]
@@ -340,8 +364,26 @@ def _arc_summary(Kstat):
     sys.stdout.write("\tDeleted:\t\t\t\t%s\n" % arc['arc_misc']['deleted'])
     sys.stdout.write("\tMutex Misses:\t\t\t\t%s\n" %
                      arc['arc_misc']['mutex_miss'])
-    sys.stdout.write("\tEvict Skips:\t\t\t\t%s\n" %
+    sys.stdout.write("\tEviction Skips:\t\t\t\t%s\n" %
                      arc['arc_misc']['evict_skips'])
+    sys.stdout.write("\tEviction Skips Due to L2 Writes:\t%s\n" %
+                     arc['arc_misc']['evict_l2_skip'])
+    sys.stdout.write("\tL2 Cached Evictions:\t\t\t%s\n" %
+                     arc['arc_misc']['evict_l2_cached'])
+    sys.stdout.write("\tL2 Eligible Evictions:\t\t\t%s\n" %
+                     arc['arc_misc']['evict_l2_eligible'])
+    sys.stdout.write("\tL2 Eligible MFU Evictions:\t%s\t%s\n" % (
+        arc['arc_misc']['evict_l2_eligible_mfu']['per'],
+        arc['arc_misc']['evict_l2_eligible_mfu']['num'],
+        )
+    )
+    sys.stdout.write("\tL2 Eligible MRU Evictions:\t%s\t%s\n" % (
+        arc['arc_misc']['evict_l2_eligible_mru']['per'],
+        arc['arc_misc']['evict_l2_eligible_mru']['num'],
+        )
+    )
+    sys.stdout.write("\tL2 Ineligible Evictions:\t\t%s\n" %
+                     arc['arc_misc']['evict_l2_ineligible'])
     sys.stdout.write("\n")

     # ARC Sizing
@@ -677,6 +719,11 @@ def get_l2arc_summary(Kstat):
     l2_writes_done = Kstat["kstat.zfs.misc.arcstats.l2_writes_done"]
     l2_writes_error = Kstat["kstat.zfs.misc.arcstats.l2_writes_error"]
     l2_writes_sent = Kstat["kstat.zfs.misc.arcstats.l2_writes_sent"]
+    l2_mfu_asize = Kstat["kstat.zfs.misc.arcstats.l2_mfu_asize"]
+    l2_mru_asize = Kstat["kstat.zfs.misc.arcstats.l2_mru_asize"]
+    l2_prefetch_asize = Kstat["kstat.zfs.misc.arcstats.l2_prefetch_asize"]
+    l2_bufc_data_asize = Kstat["kstat.zfs.misc.arcstats.l2_bufc_data_asize"]
+    l2_bufc_metadata_asize = Kstat["kstat.zfs.misc.arcstats.l2_bufc_metadata_asize"]

     l2_access_total = (l2_hits + l2_misses)
     output['l2_health_count'] = (l2_writes_error + l2_cksum_bad + l2_io_error)
@@ -699,7 +746,7 @@ def get_l2arc_summary(Kstat):
     output["io_errors"] = fHits(l2_io_error)

     output["l2_arc_size"] = {}
-    output["l2_arc_size"]["adative"] = fBytes(l2_size)
+    output["l2_arc_size"]["adaptive"] = fBytes(l2_size)
     output["l2_arc_size"]["actual"] = {
         'per': fPerc(l2_asize, l2_size),
         'num': fBytes(l2_asize)
@@ -708,6 +755,26 @@ def get_l2arc_summary(Kstat):
         'per': fPerc(l2_hdr_size, l2_size),
         'num': fBytes(l2_hdr_size),
     }
+    output["l2_arc_size"]["mfu_asize"] = {
+        'per': fPerc(l2_mfu_asize, l2_asize),
+        'num': fBytes(l2_mfu_asize),
+    }
+    output["l2_arc_size"]["mru_asize"] = {
+        'per': fPerc(l2_mru_asize, l2_asize),
+        'num': fBytes(l2_mru_asize),
+    }
+    output["l2_arc_size"]["prefetch_asize"] = {
+        'per': fPerc(l2_prefetch_asize, l2_asize),
+        'num': fBytes(l2_prefetch_asize),
+    }
+    output["l2_arc_size"]["bufc_data_asize"] = {
+        'per': fPerc(l2_bufc_data_asize, l2_asize),
+        'num': fBytes(l2_bufc_data_asize),
+    }
+    output["l2_arc_size"]["bufc_metadata_asize"] = {
+        'per': fPerc(l2_bufc_metadata_asize, l2_asize),
+        'num': fBytes(l2_bufc_metadata_asize),
+    }

     output["l2_arc_evicts"] = {}
     output["l2_arc_evicts"]['lock_retries'] = fHits(l2_evict_lock_retry)
@@ -772,7 +839,7 @@ def _l2arc_summary(Kstat):
     sys.stdout.write("\n")
     sys.stdout.write("L2 ARC Size: (Adaptive)\t\t\t\t%s\n" %
-                     arc["l2_arc_size"]["adative"])
+                     arc["l2_arc_size"]["adaptive"])
     sys.stdout.write("\tCompressed:\t\t\t%s\t%s\n" % (
         arc["l2_arc_size"]["actual"]["per"],
         arc["l2_arc_size"]["actual"]["num"],
@@ -783,11 +850,36 @@ def _l2arc_summary(Kstat):
         arc["l2_arc_size"]["head_size"]["num"],
         )
     )
+    sys.stdout.write("\tMFU Alloc. Size:\t\t%s\t%s\n" % (
+        arc["l2_arc_size"]["mfu_asize"]["per"],
+        arc["l2_arc_size"]["mfu_asize"]["num"],
+        )
+    )
+    sys.stdout.write("\tMRU Alloc. Size:\t\t%s\t%s\n" % (
+        arc["l2_arc_size"]["mru_asize"]["per"],
+        arc["l2_arc_size"]["mru_asize"]["num"],
+        )
+    )
+    sys.stdout.write("\tPrefetch Alloc. Size:\t\t%s\t%s\n" % (
+        arc["l2_arc_size"]["prefetch_asize"]["per"],
+        arc["l2_arc_size"]["prefetch_asize"]["num"],
+        )
+    )
+    sys.stdout.write("\tData (buf content) Alloc. Size:\t%s\t%s\n" % (
+        arc["l2_arc_size"]["bufc_data_asize"]["per"],
+        arc["l2_arc_size"]["bufc_data_asize"]["num"],
+        )
+    )
+    sys.stdout.write("\tMetadata (buf content) Size:\t%s\t%s\n" % (
+        arc["l2_arc_size"]["bufc_metadata_asize"]["per"],
+        arc["l2_arc_size"]["bufc_metadata_asize"]["num"],
+        )
+    )
     sys.stdout.write("\n")

     if arc["l2_arc_evicts"]['lock_retries'] != '0' or \
        arc["l2_arc_evicts"]["reading"] != '0':
-        sys.stdout.write("L2 ARC Evicts:\n")
+        sys.stdout.write("L2 ARC Evictions:\n")
         sys.stdout.write("\tLock Retries:\t\t\t\t%s\n" %
                          arc["l2_arc_evicts"]['lock_retries'])
         sys.stdout.write("\tUpon Reading:\t\t\t\t%s\n" %

View File

@@ -58,7 +58,6 @@ SECTION_PATHS = {'arc': 'arcstats',
                  'dmu': 'dmu_tx',
                  'l2arc': 'arcstats',  # L2ARC stuff lives in arcstats
                  'vdev': 'vdev_cache_stats',
-                 'xuio': 'xuio_stats',
                  'zfetch': 'zfetchstats',
                  'zil': 'zil'}
@@ -86,16 +85,24 @@ if sys.platform.startswith('freebsd'):
     VDEV_CACHE_SIZE = 'vdev.cache_size'

+    def is_value(ctl):
+        return ctl.type != sysctl.CTLTYPE_NODE
+
+    def namefmt(ctl, base='vfs.zfs.'):
+        # base is removed from the name
+        cut = len(base)
+        return ctl.name[cut:]
+
     def load_kstats(section):
         base = 'kstat.zfs.misc.{section}.'.format(section=section)
-        # base is removed from the name
-        fmt = lambda kstat: '{name} : {value}'.format(name=kstat.name[len(base):],
+        fmt = lambda kstat: '{name} : {value}'.format(name=namefmt(kstat, base),
                                                       value=kstat.value)
-        return [fmt(kstat) for kstat in sysctl.filter(base)]
+        kstats = sysctl.filter(base)
+        return [fmt(kstat) for kstat in kstats if is_value(kstat)]

     def get_params(base):
-        cut = 8  # = len('vfs.zfs.')
-        return {ctl.name[cut:]: str(ctl.value) for ctl in sysctl.filter(base)}
+        ctls = sysctl.filter(base)
+        return {namefmt(ctl): str(ctl.value) for ctl in ctls if is_value(ctl)}

     def get_tunable_params():
         return get_params('vfs.zfs')
@@ -112,25 +119,8 @@ if sys.platform.startswith('freebsd'):
         return '{} version {}'.format(name, version)

     def get_descriptions(_request):
-        # py-sysctl doesn't give descriptions, so we have to shell out.
-        command = ['sysctl', '-d', 'vfs.zfs']
-
-        # The recommended way to do this is with subprocess.run(). However,
-        # some installed versions of Python are < 3.5, so we offer them
-        # the option of doing it the old way (for now)
-        if 'run' in dir(subprocess):
-            info = subprocess.run(command, stdout=subprocess.PIPE,
-                                  universal_newlines=True)
-            lines = info.stdout.split('\n')
-        else:
-            info = subprocess.check_output(command, universal_newlines=True)
-            lines = info.split('\n')
-
-        def fmt(line):
-            name, desc = line.split(':', 1)
-            return (name.strip(), desc.strip())
-
-        return dict([fmt(line) for line in lines if len(line) > 0])
+        ctls = sysctl.filter('vfs.zfs')
+        return {namefmt(ctl): ctl.description for ctl in ctls if is_value(ctl)}

 elif sys.platform.startswith('linux'):
@@ -397,8 +387,12 @@ def format_raw_line(name, value):
     if ARGS.alt:
         result = '{0}{1}={2}'.format(INDENT, name, value)
     else:
-        spc = LINE_LENGTH-(len(INDENT)+len(value))
-        result = '{0}{1:<{spc}}{2}'.format(INDENT, name, value, spc=spc)
+        # Right-align the value within the line length if it fits,
+        # otherwise just separate it from the name by a single space.
+        fit = LINE_LENGTH - len(INDENT) - len(name)
+        overflow = len(value) + 1
+        w = max(fit, overflow)
+        result = '{0}{1}{2:>{w}}'.format(INDENT, name, value, w=w)

     return result
@@ -598,6 +592,20 @@ def section_arc(kstats_dict):
     prt_i1('Deleted:', f_hits(arc_stats['deleted']))
     prt_i1('Mutex misses:', f_hits(arc_stats['mutex_miss']))
     prt_i1('Eviction skips:', f_hits(arc_stats['evict_skip']))
+    prt_i1('Eviction skips due to L2 writes:',
+           f_hits(arc_stats['evict_l2_skip']))
+    prt_i1('L2 cached evictions:', f_bytes(arc_stats['evict_l2_cached']))
+    prt_i1('L2 eligible evictions:', f_bytes(arc_stats['evict_l2_eligible']))
+    prt_i2('L2 eligible MFU evictions:',
+           f_perc(arc_stats['evict_l2_eligible_mfu'],
+           arc_stats['evict_l2_eligible']),
+           f_bytes(arc_stats['evict_l2_eligible_mfu']))
+    prt_i2('L2 eligible MRU evictions:',
+           f_perc(arc_stats['evict_l2_eligible_mru'],
+           arc_stats['evict_l2_eligible']),
+           f_bytes(arc_stats['evict_l2_eligible_mru']))
+    prt_i1('L2 ineligible evictions:',
+           f_bytes(arc_stats['evict_l2_ineligible']))

     print()
@@ -736,6 +744,21 @@ def section_l2arc(kstats_dict):
     prt_i2('Header size:',
            f_perc(arc_stats['l2_hdr_size'], arc_stats['l2_size']),
            f_bytes(arc_stats['l2_hdr_size']))
+    prt_i2('MFU allocated size:',
+           f_perc(arc_stats['l2_mfu_asize'], arc_stats['l2_asize']),
+           f_bytes(arc_stats['l2_mfu_asize']))
+    prt_i2('MRU allocated size:',
+           f_perc(arc_stats['l2_mru_asize'], arc_stats['l2_asize']),
+           f_bytes(arc_stats['l2_mru_asize']))
+    prt_i2('Prefetch allocated size:',
+           f_perc(arc_stats['l2_prefetch_asize'], arc_stats['l2_asize']),
+           f_bytes(arc_stats['l2_prefetch_asize']))
+    prt_i2('Data (buffer content) allocated size:',
+           f_perc(arc_stats['l2_bufc_data_asize'], arc_stats['l2_asize']),
+           f_bytes(arc_stats['l2_bufc_data_asize']))
+    prt_i2('Metadata (buffer content) allocated size:',
+           f_perc(arc_stats['l2_bufc_metadata_asize'], arc_stats['l2_asize']),
+           f_bytes(arc_stats['l2_bufc_metadata_asize']))

     print()
     prt_1('L2ARC breakdown:', f_hits(l2_access_total))
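The format_raw_line() rewrite above fixes raw output when a value is wider than the layout allows: instead of computing a field width that can go negative, it right-aligns the value into whatever room the name leaves, with a floor of one separating space. A standalone sketch of that arithmetic (the LINE_LENGTH and INDENT values here are illustrative, not the script's actual globals):

```python
# Sketch of the right-alignment math from format_raw_line(); LINE_LENGTH
# and INDENT are illustrative stand-ins for arc_summary's globals.
LINE_LENGTH = 72
INDENT = '\t'


def format_raw_line(name, value):
    fit = LINE_LENGTH - len(INDENT) - len(name)  # room left for the value
    overflow = len(value) + 1                    # never closer than one space
    w = max(fit, overflow)
    return '{0}{1}{2:>{w}}'.format(INDENT, name, value, w=w)


print(format_raw_line('arc_size', '123456'))  # right-aligned to the line end
print(format_raw_line('x' * 80, '123456'))    # overflow: single-space gap
```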

View File

@@ -88,6 +88,12 @@ cols = {
     "mfug": [4, 1000, "MFU ghost list hits per second"],
     "mrug": [4, 1000, "MRU ghost list hits per second"],
     "eskip": [5, 1000, "evict_skip per second"],
+    "el2skip": [7, 1000, "evict skip, due to l2 writes, per second"],
+    "el2cach": [7, 1024, "Size of L2 cached evictions per second"],
+    "el2el": [5, 1024, "Size of L2 eligible evictions per second"],
+    "el2mfu": [6, 1024, "Size of L2 eligible MFU evictions per second"],
+    "el2mru": [6, 1024, "Size of L2 eligible MRU evictions per second"],
+    "el2inel": [7, 1024, "Size of L2 ineligible evictions per second"],
     "mtxmis": [6, 1000, "mutex_miss per second"],
     "dread": [5, 1000, "Demand accesses per second"],
     "pread": [5, 1000, "Prefetch accesses per second"],
@@ -96,6 +102,16 @@ cols = {
     "l2read": [6, 1000, "Total L2ARC accesses per second"],
     "l2hit%": [6, 100, "L2ARC access hit percentage"],
     "l2miss%": [7, 100, "L2ARC access miss percentage"],
+    "l2pref": [6, 1024, "L2ARC prefetch allocated size"],
+    "l2mfu": [5, 1024, "L2ARC MFU allocated size"],
+    "l2mru": [5, 1024, "L2ARC MRU allocated size"],
+    "l2data": [6, 1024, "L2ARC data allocated size"],
+    "l2meta": [6, 1024, "L2ARC metadata allocated size"],
+    "l2pref%": [7, 100, "L2ARC prefetch percentage"],
+    "l2mfu%": [6, 100, "L2ARC MFU percentage"],
+    "l2mru%": [6, 100, "L2ARC MRU percentage"],
+    "l2data%": [7, 100, "L2ARC data percentage"],
+    "l2meta%": [7, 100, "L2ARC metadata percentage"],
     "l2asize": [7, 1024, "Actual (compressed) size of the L2ARC"],
     "l2size": [6, 1024, "Size of the L2ARC"],
     "l2bytes": [7, 1024, "Bytes read per second from the L2ARC"],
@@ -118,22 +134,24 @@ opfile = None
 sep = "  "              # Default separator is 2 spaces
 version = "0.4"
 l2exist = False
-cmd = ("Usage: arcstat [-hvx] [-f fields] [-o file] [-s string] [interval "
+cmd = ("Usage: arcstat [-havxp] [-f fields] [-o file] [-s string] [interval "
        "[count]]\n")
 cur = {}
 d = {}
 out = None
 kstat = None
+pretty_print = True

 if sys.platform.startswith('freebsd'):
-    # Requires py27-sysctl on FreeBSD
+    # Requires py-sysctl on FreeBSD
    import sysctl

     def kstat_update():
         global kstat

-        k = sysctl.filter('kstat.zfs.misc.arcstats')
+        k = [ctl for ctl in sysctl.filter('kstat.zfs.misc.arcstats')
+             if ctl.type != sysctl.CTLTYPE_NODE]

         if not k:
             sys.exit(1)
@@ -181,6 +199,7 @@ def detailed_usage():
 def usage():
     sys.stderr.write("%s\n" % cmd)
     sys.stderr.write("\t -h : Print this help message\n")
+    sys.stderr.write("\t -a : Print all possible stats\n")
     sys.stderr.write("\t -v : List all possible field headers and definitions"
                      "\n")
     sys.stderr.write("\t -x : Print extended stats\n")
@@ -188,6 +207,7 @@ def usage():
     sys.stderr.write("\t -o : Redirect output to the specified file\n")
     sys.stderr.write("\t -s : Override default field separator with custom "
                      "character or string\n")
+    sys.stderr.write("\t -p : Disable auto-scaling of numerical fields\n")
     sys.stderr.write("\nExamples:\n")
     sys.stderr.write("\tarcstat -o /tmp/a.log 2 10\n")
     sys.stderr.write("\tarcstat -s \",\" -o /tmp/a.log 2 10\n")
@@ -246,10 +266,14 @@ def print_values():
     global hdr
     global sep
     global v
+    global pretty_print

-    sys.stdout.write(sep.join(
-        prettynum(cols[col][0], cols[col][1], v[col]) for col in hdr))
+    if pretty_print:
+        fmt = lambda col: prettynum(cols[col][0], cols[col][1], v[col])
+    else:
+        fmt = lambda col: v[col]
+
+    sys.stdout.write(sep.join(fmt(col) for col in hdr))
     sys.stdout.write("\n")
     sys.stdout.flush()
@@ -257,9 +281,14 @@ def print_values():
 def print_header():
     global hdr
     global sep
+    global pretty_print

-    sys.stdout.write(sep.join("%*s" % (cols[col][0], col) for col in hdr))
+    if pretty_print:
+        fmt = lambda col: "%*s" % (cols[col][0], col)
+    else:
+        fmt = lambda col: col
+
+    sys.stdout.write(sep.join(fmt(col) for col in hdr))
     sys.stdout.write("\n")
@@ -296,8 +325,10 @@ def init():
     global sep
     global out
     global l2exist
+    global pretty_print

     desired_cols = None
+    aflag = False
     xflag = False
     hflag = False
     vflag = False
@@ -306,14 +337,16 @@ def init():
     try:
         opts, args = getopt.getopt(
             sys.argv[1:],
-            "xo:hvs:f:",
+            "axo:hvs:f:p",
             [
+                "all",
                 "extended",
                 "outfile",
                 "help",
                 "verbose",
                 "separator",
-                "columns"
+                "columns",
+                "parsable"
             ]
         )
     except getopt.error as msg:
@@ -322,6 +355,8 @@ def init():
         opts = None

     for opt, arg in opts:
+        if opt in ('-a', '--all'):
+            aflag = True
         if opt in ('-x', '--extended'):
             xflag = True
         if opt in ('-o', '--outfile'):
@@ -337,6 +372,8 @@ def init():
         if opt in ('-f', '--columns'):
             desired_cols = arg
             i += 1
+        if opt in ('-p', '--parsable'):
+            pretty_print = False
         i += 1

     argv = sys.argv[i:]
@@ -381,6 +418,12 @@ def init():
                          incompat)
         usage()

+    if aflag:
+        if l2exist:
+            hdr = cols.keys()
+        else:
+            hdr = [col for col in cols.keys() if not col.startswith("l2")]
+
     if opfile:
         try:
             out = open(opfile, "w")
@@ -436,6 +479,12 @@ def calculate():
     v["mrug"] = d["mru_ghost_hits"] / sint
     v["mfug"] = d["mfu_ghost_hits"] / sint
     v["eskip"] = d["evict_skip"] / sint
+    v["el2skip"] = d["evict_l2_skip"] / sint
+    v["el2cach"] = d["evict_l2_cached"] / sint
+    v["el2el"] = d["evict_l2_eligible"] / sint
+    v["el2mfu"] = d["evict_l2_eligible_mfu"] / sint
+    v["el2mru"] = d["evict_l2_eligible_mru"] / sint
+    v["el2inel"] = d["evict_l2_ineligible"] / sint
     v["mtxmis"] = d["mutex_miss"] / sint

     if l2exist:
@@ -449,6 +498,17 @@ def calculate():
         v["l2size"] = cur["l2_size"]
         v["l2bytes"] = d["l2_read_bytes"] / sint

+        v["l2pref"] = cur["l2_prefetch_asize"]
+        v["l2mfu"] = cur["l2_mfu_asize"]
+        v["l2mru"] = cur["l2_mru_asize"]
+        v["l2data"] = cur["l2_bufc_data_asize"]
+        v["l2meta"] = cur["l2_bufc_metadata_asize"]
+        v["l2pref%"] = 100 * v["l2pref"] / v["l2asize"]
+        v["l2mfu%"] = 100 * v["l2mfu"] / v["l2asize"]
+        v["l2mru%"] = 100 * v["l2mru"] / v["l2asize"]
+        v["l2data%"] = 100 * v["l2data"] / v["l2asize"]
+        v["l2meta%"] = 100 * v["l2meta"] / v["l2asize"]
+
     v["grow"] = 0 if cur["arc_no_grow"] else 1
     v["need"] = cur["arc_need_free"]
     v["free"] = cur["memory_free_bytes"]

View File

@@ -131,7 +131,7 @@ elif sys.platform.startswith("linux"):
 def print_incompat_helper(incompat):
     cnt = 0
     for key in sorted(incompat):
-        if cnt is 0:
+        if cnt == 0:
             sys.stderr.write("\t")
         elif cnt > 8:
             sys.stderr.write(",\n\t")
@@ -662,7 +662,7 @@ def main():
     if not ifile:
         ifile = default_ifile()

-    if ifile is not "-":
+    if ifile != "-":
         try:
             tmp = open(ifile, "r")
             sys.stdin = tmp
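The two dbufstat hunks are correctness fixes rather than style: `is` compares object identity, not value, so `cnt is 0` and `ifile is not "-"` only worked because CPython happens to cache small integers and short strings, and CPython 3.8+ warns about such literal comparisons. A quick demonstration of the difference:

```python
# `is` checks identity, `==` checks equality; that distinction is what
# the dbufstat fixes above are about.
a = "infile.txt"
b = "".join(["in", "file", ".txt"])  # equal value, distinct object

print(a == b)   # True  -> value comparison, what the code wants
print(a is b)   # False -> identity comparison, the bug being fixed
```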

View File

@@ -43,67 +43,30 @@
 libzfs_handle_t *g_zfs;

 /*
- * Return the pool/dataset to mount given the name passed to mount. This
- * is expected to be of the form pool/dataset, however may also refer to
- * a block device if that device contains a valid zfs label.
+ * Opportunistically convert a target string into a pool name. If the
+ * string does not represent a block device with a valid zfs label
+ * then it is passed through without modification.
  */
-static char *
-parse_dataset(char *dataset)
+static void
+parse_dataset(const char *target, char **dataset)
 {
-        char cwd[PATH_MAX];
-        struct stat64 statbuf;
-        int error;
-        int len;
+        /* Assume pool/dataset is more likely */
+        strlcpy(*dataset, target, PATH_MAX);

-        /*
-         * We expect a pool/dataset to be provided, however if we're
-         * given a device which is a member of a zpool we attempt to
-         * extract the pool name stored in the label. Given the pool
-         * name we can mount the root dataset.
-         */
-        error = stat64(dataset, &statbuf);
-        if (error == 0) {
-                nvlist_t *config;
-                char *name;
-                int fd;
+        int fd = open(target, O_RDONLY | O_CLOEXEC);
+        if (fd < 0)
+                return;

-                fd = open(dataset, O_RDONLY);
-                if (fd < 0)
-                        goto out;
+        nvlist_t *cfg = NULL;
+        if (zpool_read_label(fd, &cfg, NULL) == 0) {
+                char *nm = NULL;
+                if (!nvlist_lookup_string(cfg, ZPOOL_CONFIG_POOL_NAME, &nm))
+                        strlcpy(*dataset, nm, PATH_MAX);
+                nvlist_free(cfg);
+        }

-                error = zpool_read_label(fd, &config, NULL);
-                (void) close(fd);
-                if (error)
-                        goto out;
-
-                error = nvlist_lookup_string(config,
-                    ZPOOL_CONFIG_POOL_NAME, &name);
-                if (error) {
-                        nvlist_free(config);
-                } else {
-                        dataset = strdup(name);
-                        nvlist_free(config);
-                        return (dataset);
-                }
-        }
-out:
-        /*
-         * If a file or directory in your current working directory is
-         * named 'dataset' then mount(8) will prepend your current working
-         * directory to the dataset. There is no way to prevent this
-         * behavior so we simply check for it and strip the prepended
-         * patch when it is added.
-         */
-        if (getcwd(cwd, PATH_MAX) == NULL)
-                return (dataset);
-
-        len = strlen(cwd);
-
-        /* Do not add one when cwd already ends in a trailing '/' */
-        if (strncmp(cwd, dataset, len) == 0)
-                return (dataset + len + (cwd[len-1] != '/'));
-
-        return (dataset);
+        if (close(fd))
+                perror("close");
 }

 /*
@@ -147,8 +110,8 @@ mtab_update(char *dataset, char *mntpoint, char *type, char *mntopts)
         if (!fp) {
                 (void) fprintf(stderr, gettext(
                     "filesystem '%s' was mounted, but /etc/mtab "
-                    "could not be opened due to error %d\n"),
-                    dataset, errno);
+                    "could not be opened due to error: %s\n"),
+                    dataset, strerror(errno));
                 return (MOUNT_FILEIO);
         }
@@ -156,8 +119,8 @@ mtab_update(char *dataset, char *mntpoint, char *type, char *mntopts)
         if (error) {
                 (void) fprintf(stderr, gettext(
                     "filesystem '%s' was mounted, but /etc/mtab "
-                    "could not be updated due to error %d\n"),
-                    dataset, errno);
+                    "could not be updated due to error: %s\n"),
+                    dataset, strerror(errno));
                 return (MOUNT_FILEIO);
         }
@@ -176,7 +139,7 @@ main(int argc, char **argv)
         char badopt[MNT_LINE_MAX] = { '\0' };
         char mtabopt[MNT_LINE_MAX] = { '\0' };
         char mntpoint[PATH_MAX];
-        char *dataset;
+        char dataset[PATH_MAX], *pdataset = dataset;
         unsigned long mntflags = 0, zfsflags = 0, remount = 0;
         int sloppy = 0, fake = 0, verbose = 0, nomtab = 0, zfsutil = 0;
         int error, c;
@@ -232,13 +195,13 @@ main(int argc, char **argv)
                 return (MOUNT_USAGE);
         }

-        dataset = parse_dataset(argv[0]);
+        parse_dataset(argv[0], &pdataset);

         /* canonicalize the mount point */
         if (realpath(argv[1], mntpoint) == NULL) {
                 (void) fprintf(stderr, gettext("filesystem '%s' cannot be "
-                    "mounted at '%s' due to canonicalization error %d.\n"),
-                    dataset, argv[1], errno);
+                    "mounted at '%s' due to canonicalization error: %s\n"),
+                    dataset, argv[1], strerror(errno));
                 return (MOUNT_SYSERR);
         }
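Read the rewritten parse_dataset() as a fallback chain: copy the target through unchanged, then overwrite it with a pool name only if the target turns out to be an openable device carrying a valid ZFS label. A Python rendering of just that control flow (read_pool_name_from_label is a hypothetical stand-in for zpool_read_label() plus the ZPOOL_CONFIG_POOL_NAME lookup):

```python
# Control-flow sketch of the new parse_dataset(); read_pool_name_from_label
# is hypothetical and stands in for zpool_read_label() + nvlist lookup.
import os


def read_pool_name_from_label(fd):
    return None  # stand-in: would return the pool name on a valid label


def parse_dataset(target):
    dataset = target                # assume pool/dataset is more likely
    try:
        fd = os.open(target, os.O_RDONLY)
    except OSError:
        return dataset              # not an openable device: pass through
    try:
        name = read_pool_name_from_label(fd)
        if name is not None:
            dataset = name          # labeled device: mount its pool instead
    finally:
        os.close(fd)
    return dataset


print(parse_dataset("rpool/ROOT/default"))  # passes through unchanged
```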

View File

@@ -83,8 +83,17 @@ run_gen_bench_impl(const char *impl)
                         /* create suitable raidz_map */
                         ncols = rto_opts.rto_dcols + fn + 1;
                         zio_bench.io_size = 1ULL << ds;
-                        rm_bench = vdev_raidz_map_alloc(&zio_bench,
-                            BENCH_ASHIFT, ncols, fn+1);
+
+                        if (rto_opts.rto_expand) {
+                                rm_bench = vdev_raidz_map_alloc_expanded(
+                                    zio_bench.io_abd,
+                                    zio_bench.io_size, zio_bench.io_offset,
+                                    rto_opts.rto_ashift, ncols+1, ncols,
+                                    fn+1, rto_opts.rto_expand_offset);
+                        } else {
+                                rm_bench = vdev_raidz_map_alloc(&zio_bench,
+                                    BENCH_ASHIFT, ncols, fn+1);
+                        }

                         /* estimate iteration count */
                         iter_cnt = GEN_BENCH_MEMORY;
@@ -163,8 +172,16 @@ run_rec_bench_impl(const char *impl)
                             (1ULL << BENCH_ASHIFT))
                                 continue;

-                        rm_bench = vdev_raidz_map_alloc(&zio_bench,
-                            BENCH_ASHIFT, ncols, PARITY_PQR);
+                        if (rto_opts.rto_expand) {
+                                rm_bench = vdev_raidz_map_alloc_expanded(
+                                    zio_bench.io_abd,
+                                    zio_bench.io_size, zio_bench.io_offset,
+                                    BENCH_ASHIFT, ncols+1, ncols,
+                                    PARITY_PQR, rto_opts.rto_expand_offset);
+                        } else {
+                                rm_bench = vdev_raidz_map_alloc(&zio_bench,
+                                    BENCH_ASHIFT, ncols, PARITY_PQR);
+                        }

                         /* estimate iteration count */
                         iter_cnt = (REC_BENCH_MEMORY);

View File

@@ -77,16 +77,20 @@ static void print_opts(raidz_test_opts_t *opts, boolean_t force)
                 (void) fprintf(stdout, DBLSEP "Running with options:\n"
                     "  (-a) zio ashift                   : %zu\n"
                     "  (-o) zio offset                   : 1 << %zu\n"
+                    "  (-e) expanded map                 : %s\n"
+                    "  (-r) reflow offset                : %llx\n"
                     "  (-d) number of raidz data columns : %zu\n"
                     "  (-s) size of DATA                 : 1 << %zu\n"
                     "  (-S) sweep parameters             : %s \n"
                     "  (-v) verbose                      : %s \n\n",
                     opts->rto_ashift,                           /* -a */
                     ilog2(opts->rto_offset),                    /* -o */
+                    opts->rto_expand ? "yes" : "no",            /* -e */
+                    (u_longlong_t)opts->rto_expand_offset,      /* -r */
                     opts->rto_dcols,                            /* -d */
                     ilog2(opts->rto_dsize),                     /* -s */
                     opts->rto_sweep ? "yes" : "no",             /* -S */
                     verbose);                                   /* -v */
         }
 }
@@ -104,6 +108,8 @@ static void usage(boolean_t requested)
             "\t[-S parameter sweep (default: %s)]\n"
             "\t[-t timeout for parameter sweep test]\n"
             "\t[-B benchmark all raidz implementations]\n"
+            "\t[-e use expanded raidz map (default: %s)]\n"
+            "\t[-r expanded raidz map reflow offset (default: %llx)]\n"
             "\t[-v increase verbosity (default: %zu)]\n"
             "\t[-h (print help)]\n"
             "\t[-T test the test, see if failure would be detected]\n"
@@ -114,6 +120,8 @@ static void usage(boolean_t requested)
             o->rto_dcols,                               /* -d */
             ilog2(o->rto_dsize),                        /* -s */
             rto_opts.rto_sweep ? "yes" : "no",          /* -S */
+            rto_opts.rto_expand ? "yes" : "no",         /* -e */
+            (u_longlong_t)o->rto_expand_offset,         /* -r */
             o->rto_v);                                  /* -d */

         exit(requested ? 0 : 1);
@@ -128,7 +136,7 @@ static void process_options(int argc, char **argv)
         bcopy(&rto_opts_defaults, o, sizeof (*o));

-        while ((opt = getopt(argc, argv, "TDBSvha:o:d:s:t:")) != -1) {
+        while ((opt = getopt(argc, argv, "TDBSvha:er:o:d:s:t:")) != -1) {
                 value = 0;

                 switch (opt) {
@@ -136,6 +144,12 @@ static void process_options(int argc, char **argv)
                         value = strtoull(optarg, NULL, 0);
                         o->rto_ashift = MIN(13, MAX(9, value));
                         break;
+                case 'e':
+                        o->rto_expand = 1;
+                        break;
+                case 'r':
+                        o->rto_expand_offset = strtoull(optarg, NULL, 0);
+                        break;
                 case 'o':
                         value = strtoull(optarg, NULL, 0);
                         o->rto_offset = ((1ULL << MIN(12, value)) >> 9) << 9;
@@ -179,25 +193,34 @@ static void process_options(int argc, char **argv)
         }
 }

-#define DATA_COL(rm, i) ((rm)->rm_col[raidz_parity(rm) + (i)].rc_abd)
-#define DATA_COL_SIZE(rm, i) ((rm)->rm_col[raidz_parity(rm) + (i)].rc_size)
+#define DATA_COL(rr, i) ((rr)->rr_col[rr->rr_firstdatacol + (i)].rc_abd)
+#define DATA_COL_SIZE(rr, i) ((rr)->rr_col[rr->rr_firstdatacol + (i)].rc_size)

-#define CODE_COL(rm, i) ((rm)->rm_col[(i)].rc_abd)
-#define CODE_COL_SIZE(rm, i) ((rm)->rm_col[(i)].rc_size)
+#define CODE_COL(rr, i) ((rr)->rr_col[(i)].rc_abd)
+#define CODE_COL_SIZE(rr, i) ((rr)->rr_col[(i)].rc_size)

 static int
 cmp_code(raidz_test_opts_t *opts, const raidz_map_t *rm, const int parity)
 {
-        int i, ret = 0;
+        int r, i, ret = 0;

         VERIFY(parity >= 1 && parity <= 3);

-        for (i = 0; i < parity; i++) {
-                if (abd_cmp(CODE_COL(rm, i), CODE_COL(opts->rm_golden, i))
-                    != 0) {
-                        ret++;
-                        LOG_OPT(D_DEBUG, opts,
-                            "\nParity block [%d] different!\n", i);
+        for (r = 0; r < rm->rm_nrows; r++) {
+                raidz_row_t * const rr = rm->rm_row[r];
+                raidz_row_t * const rrg = opts->rm_golden->rm_row[r];
+                for (i = 0; i < parity; i++) {
+                        if (CODE_COL_SIZE(rrg, i) == 0) {
+                                VERIFY0(CODE_COL_SIZE(rr, i));
+                                continue;
+                        }
+
+                        if (abd_cmp(CODE_COL(rr, i),
+                            CODE_COL(rrg, i)) != 0) {
+                                ret++;
+                                LOG_OPT(D_DEBUG, opts,
+                                    "\nParity block [%d] different!\n", i);
+                        }
                 }
         }
         return (ret);
@@ -206,16 +229,26 @@ cmp_code(raidz_test_opts_t *opts, const raidz_map_t *rm, const int parity)
 static int
 cmp_data(raidz_test_opts_t *opts, raidz_map_t *rm)
 {
-        int i, ret = 0;
-        int dcols = opts->rm_golden->rm_cols - raidz_parity(opts->rm_golden);
+        int r, i, dcols, ret = 0;

-        for (i = 0; i < dcols; i++) {
-                if (abd_cmp(DATA_COL(opts->rm_golden, i), DATA_COL(rm, i))
-                    != 0) {
-                        ret++;
+        for (r = 0; r < rm->rm_nrows; r++) {
+                raidz_row_t *rr = rm->rm_row[r];
+                raidz_row_t *rrg = opts->rm_golden->rm_row[r];
+                dcols = opts->rm_golden->rm_row[0]->rr_cols -
+                    raidz_parity(opts->rm_golden);
+                for (i = 0; i < dcols; i++) {
+                        if (DATA_COL_SIZE(rrg, i) == 0) {
+                                VERIFY0(DATA_COL_SIZE(rr, i));
+                                continue;
+                        }

-                        LOG_OPT(D_DEBUG, opts,
-                            "\nData block [%d] different!\n", i);
+                        if (abd_cmp(DATA_COL(rrg, i),
+                            DATA_COL(rr, i)) != 0) {
+                                ret++;
+
+                                LOG_OPT(D_DEBUG, opts,
+                                    "\nData block [%d] different!\n", i);
+                        }
                 }
         }
         return (ret);
@@ -236,12 +269,13 @@ init_rand(void *data, size_t size, void *private)
 static void
 corrupt_colums(raidz_map_t *rm, const int *tgts, const int cnt)
 {
-        int i;
-        raidz_col_t *col;
-
-        for (i = 0; i < cnt; i++) {
-                col = &rm->rm_col[tgts[i]];
-                abd_iterate_func(col->rc_abd, 0, col->rc_size, init_rand, NULL);
+        for (int r = 0; r < rm->rm_nrows; r++) {
+                raidz_row_t *rr = rm->rm_row[r];
+                for (int i = 0; i < cnt; i++) {
+                        raidz_col_t *col = &rr->rr_col[tgts[i]];
+                        abd_iterate_func(col->rc_abd, 0, col->rc_size,
+                            init_rand, NULL);
+                }
         }
 }
@@ -288,10 +322,22 @@ init_raidz_golden_map(raidz_test_opts_t *opts, const int parity)
         VERIFY0(vdev_raidz_impl_set("original"));

-        opts->rm_golden = vdev_raidz_map_alloc(opts->zio_golden,
-            opts->rto_ashift, total_ncols, parity);
-        rm_test = vdev_raidz_map_alloc(zio_test,
-            opts->rto_ashift, total_ncols, parity);
+        if (opts->rto_expand) {
+                opts->rm_golden =
+                    vdev_raidz_map_alloc_expanded(opts->zio_golden->io_abd,
+                    opts->zio_golden->io_size, opts->zio_golden->io_offset,
+                    opts->rto_ashift, total_ncols+1, total_ncols,
+                    parity, opts->rto_expand_offset);
+                rm_test = vdev_raidz_map_alloc_expanded(zio_test->io_abd,
+                    zio_test->io_size, zio_test->io_offset,
+                    opts->rto_ashift, total_ncols+1, total_ncols,
+                    parity, opts->rto_expand_offset);
+        } else {
+                opts->rm_golden = vdev_raidz_map_alloc(opts->zio_golden,
+                    opts->rto_ashift, total_ncols, parity);
+                rm_test = vdev_raidz_map_alloc(zio_test,
+                    opts->rto_ashift, total_ncols, parity);
+        }

         VERIFY(opts->zio_golden);
         VERIFY(opts->rm_golden);
@@ -312,6 +358,188 @@ init_raidz_golden_map(raidz_test_opts_t *opts, const int parity)
         return (err);
 }

+/*
+ * If reflow is not in progress, reflow_offset should be UINT64_MAX.
+ * For each row, if the row is entirely before reflow_offset, it will
+ * come from the new location. Otherwise this row will come from the
+ * old location. Therefore, rows that straddle the reflow_offset will
+ * come from the old location.
+ *
+ * NOTE: Until raidz expansion is implemented this function is only
+ * needed by raidz_test.c to test the multi-row raid_map_t functionality.
+ */
+raidz_map_t *
+vdev_raidz_map_alloc_expanded(abd_t *abd, uint64_t size, uint64_t offset,
+    uint64_t ashift, uint64_t physical_cols, uint64_t logical_cols,
+    uint64_t nparity, uint64_t reflow_offset)
+{
+        /* The zio's size in units of the vdev's minimum sector size. */
+        uint64_t s = size >> ashift;
+        uint64_t q, r, bc, devidx, asize = 0, tot;
+
+        /*
+         * "Quotient": The number of data sectors for this stripe on all but
+         * the "big column" child vdevs that also contain "remainder" data.
+         * AKA "full rows"
+         */
+        q = s / (logical_cols - nparity);
+
+        /*
+         * "Remainder": The number of partial stripe data sectors in this I/O.
+         * This will add a sector to some, but not all, child vdevs.
+         */
+        r = s - q * (logical_cols - nparity);
+
+        /* The number of "big columns" - those which contain remainder data. */
+        bc = (r == 0 ? 0 : r + nparity);
+
+        /*
+         * The total number of data and parity sectors associated with
+         * this I/O.
+         */
+        tot = s + nparity * (q + (r == 0 ? 0 : 1));
+
+        /* How many rows contain data (not skip) */
+        uint64_t rows = howmany(tot, logical_cols);
+        int cols = MIN(tot, logical_cols);
+
+        raidz_map_t *rm = kmem_zalloc(offsetof(raidz_map_t, rm_row[rows]),
+            KM_SLEEP);
+        rm->rm_nrows = rows;
+
+        for (uint64_t row = 0; row < rows; row++) {
+                raidz_row_t *rr = kmem_alloc(offsetof(raidz_row_t,
+                    rr_col[cols]), KM_SLEEP);
+                rm->rm_row[row] = rr;
+
+                /* The starting RAIDZ (parent) vdev sector of the row. */
+                uint64_t b = (offset >> ashift) + row * logical_cols;
+
+                /*
+                 * If we are in the middle of a reflow, and any part of this
+                 * row has not been copied, then use the old location of
+                 * this row.
+                 */
+                int row_phys_cols = physical_cols;
+                if (b + (logical_cols - nparity) > reflow_offset >> ashift)
+                        row_phys_cols--;
+
+                /* starting child of this row */
+                uint64_t child_id = b % row_phys_cols;
+                /* The starting byte offset on each child vdev. */
+                uint64_t child_offset = (b / row_phys_cols) << ashift;
+
+                /*
+                 * We set cols to the entire width of the block, even
+                 * if this row is shorter. This is needed because parity
+                 * generation (for Q and R) needs to know the entire width,
+                 * because it treats the short row as though it was
+                 * full-width (and the "phantom" sectors were zero-filled).
+                 *
+                 * Another approach to this would be to set cols shorter
+                 * (to just the number of columns that we might do i/o to)
+                 * and have another mechanism to tell the parity generation
+                 * about the "entire width". Reconstruction (at least
+                 * vdev_raidz_reconstruct_general()) would also need to
+                 * know about the "entire width".
+                 */
+                rr->rr_cols = cols;
+                rr->rr_bigcols = bc;
+                rr->rr_missingdata = 0;
+                rr->rr_missingparity = 0;
+                rr->rr_firstdatacol = nparity;
+                rr->rr_abd_copy = NULL;
+                rr->rr_abd_empty = NULL;
+                rr->rr_nempty = 0;
+
+                for (int c = 0; c < rr->rr_cols; c++, child_id++) {
+                        if (child_id >= row_phys_cols) {
+                                child_id -= row_phys_cols;
+                                child_offset += 1ULL << ashift;
+                        }
+                        rr->rr_col[c].rc_devidx = child_id;
+                        rr->rr_col[c].rc_offset = child_offset;
+                        rr->rr_col[c].rc_gdata = NULL;
+                        rr->rr_col[c].rc_orig_data = NULL;
+                        rr->rr_col[c].rc_error = 0;
+                        rr->rr_col[c].rc_tried = 0;
+                        rr->rr_col[c].rc_skipped = 0;
+                        rr->rr_col[c].rc_need_orig_restore = B_FALSE;
+
+                        uint64_t dc = c - rr->rr_firstdatacol;
+                        if (c < rr->rr_firstdatacol) {
+                                rr->rr_col[c].rc_size = 1ULL << ashift;
+                                rr->rr_col[c].rc_abd =
+                                    abd_alloc_linear(rr->rr_col[c].rc_size,
+                                    B_TRUE);
+                        } else if (row == rows - 1 && bc != 0 && c >= bc) {
+                                /*
+                                 * Past the end, this is for parity generation.
+                                 */
+                                rr->rr_col[c].rc_size = 0;
+                                rr->rr_col[c].rc_abd = NULL;
+                        } else {
+                                /*
+                                 * "data column" (col excluding parity)
+                                 * Add an ASCII art diagram here
+                                 */
+                                uint64_t off;
+
+                                if (c < bc || r == 0) {
+                                        off = dc * rows + row;
+                                } else {
+                                        off = r * rows +
+                                            (dc - r) * (rows - 1) + row;
+                                }
+                                rr->rr_col[c].rc_size = 1ULL << ashift;
+                                rr->rr_col[c].rc_abd =
+                                    abd_get_offset(abd, off << ashift);
+                        }
+
+                        asize += rr->rr_col[c].rc_size;
+                }
+                /*
+                 * If all data stored spans all columns, there's a danger that
+                 * parity will always be on the same device and, since parity
+                 * isn't read during normal operation, that that device's I/O
+                 * bandwidth won't be used effectively. We therefore switch
+                 * the parity every 1MB.
+                 *
+                 * ...at least that was, ostensibly, the theory. As a practical
+                 * matter unless we juggle the parity between all devices
+                 * evenly, we won't see any benefit. Further, occasional writes
+                 * that aren't a multiple of the LCM of the number of children
+                 * and the minimum stripe width are sufficient to avoid pessimal
+                 * behavior. Unfortunately, this decision created an implicit
+                 * on-disk format requirement that we need to support for all
+                 * eternity, but only for single-parity RAID-Z.
+                 *
+                 * If we intend to skip a sector in the zeroth column for
+                 * padding we must make sure to note this swap. We will never
+                 * intend to skip the first column since at least one data and
+                 * one parity column must appear in each row.
+                 */
+                if (rr->rr_firstdatacol == 1 && rr->rr_cols > 1 &&
+                    (offset & (1ULL << 20))) {
+                        ASSERT(rr->rr_cols >= 2);
+                        ASSERT(rr->rr_col[0].rc_size == rr->rr_col[1].rc_size);
+
+                        devidx = rr->rr_col[0].rc_devidx;
+                        uint64_t o = rr->rr_col[0].rc_offset;
+                        rr->rr_col[0].rc_devidx = rr->rr_col[1].rc_devidx;
+                        rr->rr_col[0].rc_offset = rr->rr_col[1].rc_offset;
+                        rr->rr_col[1].rc_devidx = devidx;
+                        rr->rr_col[1].rc_offset = o;
+                }
+        }
+        ASSERT3U(asize, ==, tot << ashift);
+
+        /* init RAIDZ parity ops */
+        rm->rm_ops = vdev_raidz_math_get_ops();
+
+        return (rm);
+}
+
 static raidz_map_t *
 init_raidz_map(raidz_test_opts_t *opts, zio_t **zio, const int parity)
 {
@@ -330,8 +558,15 @@ init_raidz_map(raidz_test_opts_t *opts, zio_t **zio, const int parity)
         (*zio)->io_abd = raidz_alloc(alloc_dsize);
         init_zio_abd(*zio);

-        rm = vdev_raidz_map_alloc(*zio, opts->rto_ashift,
-            total_ncols, parity);
+        if (opts->rto_expand) {
+                rm = vdev_raidz_map_alloc_expanded((*zio)->io_abd,
+                    (*zio)->io_size, (*zio)->io_offset,
+                    opts->rto_ashift, total_ncols+1, total_ncols,
+                    parity, opts->rto_expand_offset);
+        } else {
+                rm = vdev_raidz_map_alloc(*zio, opts->rto_ashift,
+                    total_ncols, parity);
+        }
         VERIFY(rm);

         /* Make sure code columns are destroyed */
@@ -420,7 +655,7 @@ run_rec_check_impl(raidz_test_opts_t *opts, raidz_map_t *rm, const int fn)
         if (fn < RAIDZ_REC_PQ) {
                 /* can reconstruct 1 failed data disk */
                 for (x0 = 0; x0 < opts->rto_dcols; x0++) {
-                        if (x0 >= rm->rm_cols - raidz_parity(rm))
+                        if (x0 >= rm->rm_row[0]->rr_cols - raidz_parity(rm))
                                 continue;

                         /* Check if should stop */
@@ -445,10 +680,11 @@ run_rec_check_impl(raidz_test_opts_t *opts, raidz_map_t *rm, const int fn)
         } else if (fn < RAIDZ_REC_PQR) {
                 /* can reconstruct 2 failed data disk */
                 for (x0 = 0; x0 < opts->rto_dcols; x0++) {
-                        if (x0 >= rm->rm_cols - raidz_parity(rm))
+                        if (x0 >= rm->rm_row[0]->rr_cols - raidz_parity(rm))
                                 continue;
                         for (x1 = x0 + 1; x1 < opts->rto_dcols; x1++) {
-                                if (x1 >= rm->rm_cols - raidz_parity(rm))
+                                if (x1 >= rm->rm_row[0]->rr_cols -
+                                    raidz_parity(rm))
                                         continue;

                                 /* Check if should stop */
@@ -475,14 +711,15 @@ run_rec_check_impl(raidz_test_opts_t *opts, raidz_map_t *rm, const int fn)
         } else {
                 /* can reconstruct 3 failed data disk */
                 for (x0 = 0; x0 < opts->rto_dcols; x0++) {
-                        if (x0 >= rm->rm_cols - raidz_parity(rm))
+                        if (x0 >= rm->rm_row[0]->rr_cols - raidz_parity(rm))
                                 continue;
                         for (x1 = x0 + 1; x1 < opts->rto_dcols; x1++) {
-                                if (x1 >= rm->rm_cols - raidz_parity(rm))
+                                if (x1 >= rm->rm_row[0]->rr_cols -
+                                    raidz_parity(rm))
                                         continue;
                                 for (x2 = x1 + 1; x2 < opts->rto_dcols; x2++) {
-                                        if (x2 >=
-                                            rm->rm_cols - raidz_parity(rm))
+                                        if (x2 >= rm->rm_row[0]->rr_cols -
+                                            raidz_parity(rm))
                                                 continue;

                                         /* Check if should stop */
@@ -700,6 +937,8 @@ run_sweep(void)
                         opts->rto_dcols = dcols_v[d];
                         opts->rto_offset = (1 << ashift_v[a]) * rand();
                         opts->rto_dsize = size_v[s];
+                        opts->rto_expand = rto_opts.rto_expand;
+                        opts->rto_expand_offset = rto_opts.rto_expand_offset;
                         opts->rto_v = 0; /* be quiet */

                         VERIFY3P(thread_create(NULL, 0, sweep_thread, (void *) opts,
@@ -732,6 +971,7 @@ run_sweep(void)
         return (sweep_state == SWEEP_ERROR ? SWEEP_ERROR : 0);
 }
+
 int
 main(int argc, char **argv)
 {
@@ -44,13 +44,15 @@ static const char *raidz_impl_names[] = {
typedef struct raidz_test_opts {
    size_t rto_ashift;
-   size_t rto_offset;
+   uint64_t rto_offset;
    size_t rto_dcols;
    size_t rto_dsize;
    size_t rto_v;
    size_t rto_sweep;
    size_t rto_sweep_timeout;
    size_t rto_benchmark;
+   size_t rto_expand;
+   uint64_t rto_expand_offset;
    size_t rto_sanity;
    size_t rto_gdb;
@@ -69,6 +71,8 @@ static const raidz_test_opts_t rto_opts_defaults = {
    .rto_v = 0,
    .rto_sweep = 0,
    .rto_benchmark = 0,
+   .rto_expand = 0,
+   .rto_expand_offset = -1ULL,
    .rto_sanity = 0,
    .rto_gdb = 0,
    .rto_should_stop = B_FALSE
@@ -113,4 +117,7 @@ void init_zio_abd(zio_t *zio);
void run_raidz_benchmark(void);

+struct raidz_map *vdev_raidz_map_alloc_expanded(abd_t *, uint64_t, uint64_t,
+    uint64_t, uint64_t, uint64_t, uint64_t, uint64_t);
+
#endif /* RAIDZ_TEST_H */
@@ -1642,7 +1642,11 @@ dump_metaslab(metaslab_t *msp)
        SPACE_MAP_HISTOGRAM_SIZE, sm->sm_shift);
    }

-   ASSERT(msp->ms_size == (1ULL << vd->vdev_ms_shift));
+   if (vd->vdev_ops == &vdev_draid_ops)
+       ASSERT3U(msp->ms_size, <=, 1ULL << vd->vdev_ms_shift);
+   else
+       ASSERT3U(msp->ms_size, ==, 1ULL << vd->vdev_ms_shift);
+
    dump_spacemap(spa->spa_meta_objset, msp->ms_sm);

    if (spa_feature_is_active(spa, SPA_FEATURE_LOG_SPACEMAP)) {
@@ -4202,6 +4206,8 @@ dump_l2arc_log_entries(uint64_t log_entries,
        (u_longlong_t)L2BLK_GET_PREFETCH((&le[j])->le_prop));
    (void) printf("|\t\t\t\taddress: %llu\n",
        (u_longlong_t)le[j].le_daddr);
+   (void) printf("|\t\t\t\tARC state: %llu\n",
+       (u_longlong_t)L2BLK_GET_STATE((&le[j])->le_prop));
    (void) printf("|\n");
    }
    (void) printf("\n");
@@ -5201,8 +5207,6 @@ zdb_blkptr_done(zio_t *zio)
    zdb_cb_t *zcb = zio->io_private;
    zbookmark_phys_t *zb = &zio->io_bookmark;

-   abd_free(zio->io_abd);
-
    mutex_enter(&spa->spa_scrub_lock);
    spa->spa_load_verify_bytes -= BP_GET_PSIZE(bp);
    cv_broadcast(&spa->spa_scrub_io_cv);
@@ -5229,6 +5233,8 @@ zdb_blkptr_done(zio_t *zio)
        blkbuf);
    }
    mutex_exit(&spa->spa_scrub_lock);
+
+   abd_free(zio->io_abd);
}

static int
@@ -6316,7 +6322,7 @@ dump_block_stats(spa_t *spa)
    (void) printf("\t%-16s %14llu used: %5.2f%%\n", "Normal class:",
        (u_longlong_t)norm_alloc, 100.0 * norm_alloc / norm_space);

-   if (spa_special_class(spa)->mc_rotor != NULL) {
+   if (spa_special_class(spa)->mc_allocator[0].mca_rotor != NULL) {
        uint64_t alloc = metaslab_class_get_alloc(
            spa_special_class(spa));
        uint64_t space = metaslab_class_get_space(
@@ -6327,7 +6333,7 @@ dump_block_stats(spa_t *spa)
        100.0 * alloc / space);
    }

-   if (spa_dedup_class(spa)->mc_rotor != NULL) {
+   if (spa_dedup_class(spa)->mc_allocator[0].mca_rotor != NULL) {
        uint64_t alloc = metaslab_class_get_alloc(
            spa_dedup_class(spa));
        uint64_t space = metaslab_class_get_space(
@@ -6756,6 +6762,7 @@ import_checkpointed_state(char *target, nvlist_t *cfg, char **new_path)
{
    int error = 0;
    char *poolname, *bogus_name = NULL;
+   boolean_t freecfg = B_FALSE;

    /* If the target is not a pool, then extract the pool name */
    char *path_start = strchr(target, '/');
@@ -6774,6 +6781,7 @@ import_checkpointed_state(char *target, nvlist_t *cfg, char **new_path)
            "spa_get_stats() failed with error %d\n",
            poolname, error);
        }
+       freecfg = B_TRUE;
    }

    if (asprintf(&bogus_name, "%s%s", poolname, BOGUS_SUFFIX) == -1)
@@ -6783,6 +6791,8 @@ import_checkpointed_state(char *target, nvlist_t *cfg, char **new_path)
    error = spa_import(bogus_name, cfg, NULL,
        ZFS_IMPORT_MISSING_LOG | ZFS_IMPORT_CHECKPOINT |
        ZFS_IMPORT_SKIP_MMP);
+   if (freecfg)
+       nvlist_free(cfg);
    if (error != 0) {
        fatal("Tried to import pool \"%s\" but spa_import() failed "
            "with error %d\n", bogus_name, error);
@@ -7011,7 +7021,6 @@ verify_checkpoint_blocks(spa_t *spa)
    spa_t *checkpoint_spa;
    char *checkpoint_pool;
-   nvlist_t *config = NULL;
    int error = 0;

    /*
@@ -7019,7 +7028,7 @@ verify_checkpoint_blocks(spa_t *spa)
     * name) so we can do verification on it against the current state
     * of the pool.
     */
-   checkpoint_pool = import_checkpointed_state(spa->spa_name, config,
+   checkpoint_pool = import_checkpointed_state(spa->spa_name, NULL,
        NULL);
    ASSERT(strcmp(spa->spa_name, checkpoint_pool) != 0);
@@ -8429,6 +8438,11 @@ main(int argc, char **argv)
        }
    }

+   if (searchdirs != NULL) {
+       umem_free(searchdirs, nsearch * sizeof (char *));
+       searchdirs = NULL;
+   }
+
    /*
     * import_checkpointed_state makes the assumption that the
     * target pool that we pass it is already part of the spa
@@ -8447,6 +8461,11 @@ main(int argc, char **argv)
        target = checkpoint_target;
    }

+   if (cfg != NULL) {
+       nvlist_free(cfg);
+       cfg = NULL;
+   }
+
    if (target_pool != target)
        free(target_pool);
@@ -181,6 +181,8 @@ zfs_agent_post_event(const char *class, const char *subclass, nvlist_t *nvl)
     * from the vdev_disk layer after a hot unplug. Fortunately we do
     * get an EC_DEV_REMOVE from our disk monitor and it is a suitable
     * proxy so we remap it here for the benefit of the diagnosis engine.
+    * Starting in OpenZFS 2.0, we do get FM_RESOURCE_REMOVED from the spa
+    * layer. Processing multiple FM_RESOURCE_REMOVED events is not harmful.
     */
    if ((strcmp(class, EC_DEV_REMOVE) == 0) &&
        (strcmp(subclass, ESC_DISK) == 0) &&
@@ -435,7 +435,15 @@ zfs_process_add(zpool_handle_t *zhp, nvlist_t *vdev, boolean_t labeled)
        return;
    }

-   ret = zpool_vdev_attach(zhp, fullpath, path, nvroot, B_TRUE, B_FALSE);
+   /*
+    * Prefer sequential resilvering when supported (mirrors and dRAID),
+    * otherwise fall back to a traditional healing resilver.
+    */
+   ret = zpool_vdev_attach(zhp, fullpath, path, nvroot, B_TRUE, B_TRUE);
+   if (ret != 0) {
+       ret = zpool_vdev_attach(zhp, fullpath, path, nvroot,
+           B_TRUE, B_FALSE);
+   }

    zed_log_msg(LOG_INFO, "  zpool_vdev_replace: %s with %s (%s)",
        fullpath, path, (ret == 0) ? "no errors" :
@@ -219,12 +219,18 @@ replace_with_spare(fmd_hdl_t *hdl, zpool_handle_t *zhp, nvlist_t *vdev)
     * replace it.
     */
    for (s = 0; s < nspares; s++) {
-       char *spare_name;
+       boolean_t rebuild = B_FALSE;
+       char *spare_name, *type;

        if (nvlist_lookup_string(spares[s], ZPOOL_CONFIG_PATH,
            &spare_name) != 0)
            continue;

+       /* prefer sequential resilvering for distributed spares */
+       if ((nvlist_lookup_string(spares[s], ZPOOL_CONFIG_TYPE,
+           &type) == 0) && strcmp(type, VDEV_TYPE_DRAID_SPARE) == 0)
+           rebuild = B_TRUE;
+
        /* if set, add the "ashift" pool property to the spare nvlist */
        if (source != ZPROP_SRC_DEFAULT)
            (void) nvlist_add_uint64(spares[s],
@@ -237,7 +243,7 @@ replace_with_spare(fmd_hdl_t *hdl, zpool_handle_t *zhp, nvlist_t *vdev)
            dev_name, basename(spare_name));

        if (zpool_vdev_attach(zhp, dev_name, spare_name,
-           replacement, B_TRUE, B_FALSE) == 0) {
+           replacement, B_TRUE, rebuild) == 0) {
            free(dev_name);
            nvlist_free(replacement);
            return (B_TRUE);
@@ -499,6 +505,7 @@ zfs_retire_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl,
         * Attempt to substitute a hot spare.
         */
        (void) replace_with_spare(hdl, zhp, vdev);
+
        zpool_close(zhp);
    }
@@ -1,14 +1,50 @@
#!/bin/sh
+#
+# Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
+# Copyright (c) 2020 by Delphix. All rights reserved.
+#
#
# Log the zevent via syslog.
+#

[ -f "${ZED_ZEDLET_DIR}/zed.rc" ] && . "${ZED_ZEDLET_DIR}/zed.rc"
. "${ZED_ZEDLET_DIR}/zed-functions.sh"

zed_exit_if_ignoring_this_event

-zed_log_msg "eid=${ZEVENT_EID}" "class=${ZEVENT_SUBCLASS}" \
-    "${ZEVENT_POOL_GUID:+"pool_guid=${ZEVENT_POOL_GUID}"}" \
-    "${ZEVENT_VDEV_PATH:+"vdev_path=${ZEVENT_VDEV_PATH}"}" \
-    "${ZEVENT_VDEV_STATE_STR:+"vdev_state=${ZEVENT_VDEV_STATE_STR}"}"
+# build a string of name=value pairs for this event
+msg="eid=${ZEVENT_EID} class=${ZEVENT_SUBCLASS}"
+
+if [ "${ZED_SYSLOG_DISPLAY_GUIDS}" = "1" ]; then
+    [ -n "${ZEVENT_POOL_GUID}" ] && msg="${msg} pool_guid=${ZEVENT_POOL_GUID}"
+    [ -n "${ZEVENT_VDEV_GUID}" ] && msg="${msg} vdev_guid=${ZEVENT_VDEV_GUID}"
+else
+    [ -n "${ZEVENT_POOL}" ] && msg="${msg} pool='${ZEVENT_POOL}'"
+    [ -n "${ZEVENT_VDEV_PATH}" ] && msg="${msg} vdev=$(basename "${ZEVENT_VDEV_PATH}")"
+fi
+
+# log pool state if state is anything other than 'ACTIVE'
+[ -n "${ZEVENT_POOL_STATE_STR}" ] && [ "$ZEVENT_POOL_STATE" -ne 0 ] && \
+    msg="${msg} pool_state=${ZEVENT_POOL_STATE_STR}"
+
+# Log the following payload nvpairs if they are present
+[ -n "${ZEVENT_VDEV_STATE_STR}" ] && msg="${msg} vdev_state=${ZEVENT_VDEV_STATE_STR}"
+[ -n "${ZEVENT_CKSUM_ALGORITHM}" ] && msg="${msg} algorithm=${ZEVENT_CKSUM_ALGORITHM}"
+[ -n "${ZEVENT_ZIO_SIZE}" ] && msg="${msg} size=${ZEVENT_ZIO_SIZE}"
+[ -n "${ZEVENT_ZIO_OFFSET}" ] && msg="${msg} offset=${ZEVENT_ZIO_OFFSET}"
+[ -n "${ZEVENT_ZIO_PRIORITY}" ] && msg="${msg} priority=${ZEVENT_ZIO_PRIORITY}"
+[ -n "${ZEVENT_ZIO_ERR}" ] && msg="${msg} err=${ZEVENT_ZIO_ERR}"
+[ -n "${ZEVENT_ZIO_FLAGS}" ] && msg="${msg} flags=$(printf '0x%x' "${ZEVENT_ZIO_FLAGS}")"
+
+# log delays that are >= 10 milliseconds
+[ -n "${ZEVENT_ZIO_DELAY}" ] && [ "$ZEVENT_ZIO_DELAY" -gt 10000000 ] && \
+    msg="${msg} delay=$((ZEVENT_ZIO_DELAY / 1000000))ms"
+
+# list the bookmark data together
+[ -n "${ZEVENT_ZIO_OBJSET}" ] && \
+    msg="${msg} bookmark=${ZEVENT_ZIO_OBJSET}:${ZEVENT_ZIO_OBJECT}:${ZEVENT_ZIO_LEVEL}:${ZEVENT_ZIO_BLKID}"
+
+zed_log_msg "${msg}"
+
exit 0
@@ -13,7 +13,7 @@ FSLIST="${FSLIST_DIR}/${ZEVENT_POOL}"
[ -f "${ZED_ZEDLET_DIR}/zed.rc" ] && . "${ZED_ZEDLET_DIR}/zed.rc"
. "${ZED_ZEDLET_DIR}/zed-functions.sh"

-zed_exit_if_ignoring_this_event
+[ "$ZEVENT_SUBCLASS" != "history_event" ] && exit 0

zed_check_cmd "${ZFS}" sort diff grep

# If we are acting on a snapshot, we have nothing to do
@@ -118,5 +118,10 @@ ZED_USE_ENCLOSURE_LEDS=1
# Otherwise, if ZED_SYSLOG_SUBCLASS_EXCLUDE is set, the
# matching subclasses are excluded from logging.
#ZED_SYSLOG_SUBCLASS_INCLUDE="checksum|scrub_*|vdev.*"
-#ZED_SYSLOG_SUBCLASS_EXCLUDE="statechange|config_*|history_event"
+ZED_SYSLOG_SUBCLASS_EXCLUDE="history_event"
+
+##
+# Use GUIDs instead of names when logging pool and vdevs
+# Disabled by default, 1 to enable and 0 to disable.
+#ZED_SYSLOG_DISPLAY_GUIDS=1
@@ -270,7 +270,7 @@ get_usage(zfs_help_t idx)
        return (gettext("\tclone [-p] [-o property=value] ... "
            "<snapshot> <filesystem|volume>\n"));
    case HELP_CREATE:
-       return (gettext("\tcreate [-Pnpv] [-o property=value] ... "
+       return (gettext("\tcreate [-Pnpuv] [-o property=value] ... "
            "<filesystem>\n"
            "\tcreate [-Pnpsv] [-b blocksize] [-o property=value] ... "
            "-V <size> <volume>\n"));
@@ -892,6 +892,107 @@ zfs_do_clone(int argc, char **argv)
    return (-1);
}
/*
* Return a default volblocksize for the pool which always uses more than
* half of the data sectors. This primarily applies to dRAID which always
* writes full stripe widths.
*/
static uint64_t
default_volblocksize(zpool_handle_t *zhp, nvlist_t *props)
{
uint64_t volblocksize, asize = SPA_MINBLOCKSIZE;
nvlist_t *tree, **vdevs;
uint_t nvdevs;
nvlist_t *config = zpool_get_config(zhp, NULL);
if (nvlist_lookup_nvlist(config, ZPOOL_CONFIG_VDEV_TREE, &tree) != 0 ||
nvlist_lookup_nvlist_array(tree, ZPOOL_CONFIG_CHILDREN,
&vdevs, &nvdevs) != 0) {
return (ZVOL_DEFAULT_BLOCKSIZE);
}
for (int i = 0; i < nvdevs; i++) {
nvlist_t *nv = vdevs[i];
uint64_t ashift, ndata, nparity;
if (nvlist_lookup_uint64(nv, ZPOOL_CONFIG_ASHIFT, &ashift) != 0)
continue;
if (nvlist_lookup_uint64(nv, ZPOOL_CONFIG_DRAID_NDATA,
&ndata) == 0) {
/* dRAID minimum allocation width */
asize = MAX(asize, ndata * (1ULL << ashift));
} else if (nvlist_lookup_uint64(nv, ZPOOL_CONFIG_NPARITY,
&nparity) == 0) {
/* raidz minimum allocation width */
if (nparity == 1)
asize = MAX(asize, 2 * (1ULL << ashift));
else
asize = MAX(asize, 4 * (1ULL << ashift));
} else {
/* mirror or (non-redundant) leaf vdev */
asize = MAX(asize, 1ULL << ashift);
}
}
/*
* Calculate the target volblocksize such that more than half
* of the asize is used. The following table is for 4k sectors.
*
* n asize blksz used | n asize blksz used
* -------------------------+---------------------------------
* 1 4,096 8,192 100% | 9 36,864 32,768 88%
* 2 8,192 8,192 100% | 10 40,960 32,768 80%
* 3 12,288 8,192 66% | 11 45,056 32,768 72%
* 4 16,384 16,384 100% | 12 49,152 32,768 66%
* 5 20,480 16,384 80% | 13 53,248 32,768 61%
* 6 24,576 16,384 66% | 14 57,344 32,768 57%
* 7 28,672 16,384 57% | 15 61,440 32,768 53%
 *   8  32,768  32,768  100% |  16  65,536  65,536  100%
*
* This is primarily a concern for dRAID which always allocates
* a full stripe width. For dRAID the default stripe width is
* n=8 in which case the volblocksize is set to 32k. Ignoring
* compression there are no unused sectors. This same reasoning
* applies to raidz[2,3] so target 4 sectors to minimize waste.
*/
uint64_t tgt_volblocksize = ZVOL_DEFAULT_BLOCKSIZE;
while (tgt_volblocksize * 2 <= asize)
tgt_volblocksize *= 2;
const char *prop = zfs_prop_to_name(ZFS_PROP_VOLBLOCKSIZE);
if (nvlist_lookup_uint64(props, prop, &volblocksize) == 0) {
/* Issue a warning when a non-optimal size is requested. */
if (volblocksize < ZVOL_DEFAULT_BLOCKSIZE) {
(void) fprintf(stderr, gettext("Warning: "
"volblocksize (%llu) is less than the default "
"minimum block size (%llu).\nTo reduce wasted "
"space a volblocksize of %llu is recommended.\n"),
(u_longlong_t)volblocksize,
(u_longlong_t)ZVOL_DEFAULT_BLOCKSIZE,
(u_longlong_t)tgt_volblocksize);
} else if (volblocksize < tgt_volblocksize) {
(void) fprintf(stderr, gettext("Warning: "
"volblocksize (%llu) is much less than the "
"minimum allocation\nunit (%llu), which wastes "
"at least %llu%% of space. To reduce wasted "
"space,\nuse a larger volblocksize (%llu is "
"recommended), fewer dRAID data disks\n"
"per group, or smaller sector size (ashift).\n"),
(u_longlong_t)volblocksize, (u_longlong_t)asize,
(u_longlong_t)((100 * (asize - volblocksize)) /
asize), (u_longlong_t)tgt_volblocksize);
}
} else {
volblocksize = tgt_volblocksize;
fnvlist_add_uint64(props, prop, volblocksize);
}
return (volblocksize);
}
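
The sizing loop above is easy to verify by hand. A standalone sketch of the rule (hypothetical helper, assuming the 8k ZVOL_DEFAULT_BLOCKSIZE and the 4k sectors used in the table):

    #include <stdint.h>

    /*
     * Sketch of the volblocksize target used above: start at the 8k
     * default and double while the doubled block still fits within the
     * minimum allocation size. The result is the largest power of two
     * not exceeding asize, so a block always fills more than half of
     * the sectors it allocates.
     */
    static uint64_t
    target_volblocksize(uint64_t asize)
    {
        uint64_t tgt = 8192;    /* ZVOL_DEFAULT_BLOCKSIZE */

        while (tgt * 2 <= asize)
            tgt *= 2;
        return (tgt);
    }

For the default dRAID group of n=8 data disks, asize = 8 * 4096 = 32768 and the sketch returns 32768, matching the n=8 table row; for n=3 (asize 12288) it stays at 8192, the 66% row.
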
/*
 * zfs create [-Pnpv] [-o prop=value] ... fs
 * zfs create [-Pnpsv] [-b blocksize] [-o prop=value] ... -V vol size
@@ -911,6 +1012,8 @@ zfs_do_clone(int argc, char **argv)
 * check of arguments and properties, but does not check for permissions,
 * available space, etc.
 *
+ * The '-u' flag prevents the newly created file system from being mounted.
+ *
 * The '-v' flag is for verbose output.
 *
 * The '-P' flag is used for parseable output.  It implies '-v'.
@@ -927,17 +1030,19 @@ zfs_do_create(int argc, char **argv)
    boolean_t bflag = B_FALSE;
    boolean_t parents = B_FALSE;
    boolean_t dryrun = B_FALSE;
+   boolean_t nomount = B_FALSE;
    boolean_t verbose = B_FALSE;
    boolean_t parseable = B_FALSE;
    int ret = 1;
    nvlist_t *props;
    uint64_t intval;
+   char *strval;

    if (nvlist_alloc(&props, NV_UNIQUE_NAME, 0) != 0)
        nomem();

    /* check options */
-   while ((c = getopt(argc, argv, ":PV:b:nso:pv")) != -1) {
+   while ((c = getopt(argc, argv, ":PV:b:nso:puv")) != -1) {
        switch (c) {
        case 'V':
            type = ZFS_TYPE_VOLUME;
@@ -984,6 +1089,9 @@ zfs_do_create(int argc, char **argv)
        case 's':
            noreserve = B_TRUE;
            break;
+       case 'u':
+           nomount = B_TRUE;
+           break;
        case 'v':
            verbose = B_TRUE;
            break;
@@ -1003,6 +1111,11 @@ zfs_do_create(int argc, char **argv)
            "used when creating a volume\n"));
        goto badusage;
    }
+   if (nomount && type != ZFS_TYPE_FILESYSTEM) {
+       (void) fprintf(stderr, gettext("'-u' can only be "
+           "used when creating a filesystem\n"));
+       goto badusage;
+   }

    argc -= optind;
    argv += optind;
@@ -1018,7 +1131,7 @@ zfs_do_create(int argc, char **argv)
        goto badusage;
    }

-   if (dryrun || (type == ZFS_TYPE_VOLUME && !noreserve)) {
+   if (dryrun || type == ZFS_TYPE_VOLUME) {
        char msg[ZFS_MAX_DATASET_NAME_LEN * 2];
        char *p;
@@ -1040,18 +1153,24 @@ zfs_do_create(int argc, char **argv)
        }
    }

-   /*
-    * if volsize is not a multiple of volblocksize, round it up to the
-    * nearest multiple of the volblocksize
-    */
    if (type == ZFS_TYPE_VOLUME) {
-       uint64_t volblocksize;
+       const char *prop = zfs_prop_to_name(ZFS_PROP_VOLBLOCKSIZE);
+       uint64_t volblocksize = default_volblocksize(zpool_handle,
+           real_props);

-       if (nvlist_lookup_uint64(props,
-           zfs_prop_to_name(ZFS_PROP_VOLBLOCKSIZE),
-           &volblocksize) != 0)
-           volblocksize = ZVOL_DEFAULT_BLOCKSIZE;
+       if (volblocksize != ZVOL_DEFAULT_BLOCKSIZE &&
+           nvlist_lookup_string(props, prop, &strval) != 0) {
+           if (asprintf(&strval, "%llu",
+               (u_longlong_t)volblocksize) == -1)
+               nomem();
+           nvlist_add_string(props, prop, strval);
+           free(strval);
+       }

+       /*
+        * If volsize is not a multiple of volblocksize, round it
+        * up to the nearest multiple of the volblocksize.
+        */
        if (volsize % volblocksize) {
            volsize = P2ROUNDUP_TYPED(volsize, volblocksize,
                uint64_t);
@@ -1064,11 +1183,9 @@ zfs_do_create(int argc, char **argv)
        }
    }

    if (type == ZFS_TYPE_VOLUME && !noreserve) {
        uint64_t spa_version;
        zfs_prop_t resv_prop;
-       char *strval;

        spa_version = zpool_get_prop_int(zpool_handle,
            ZPOOL_PROP_VERSION, NULL);
@@ -1159,6 +1276,11 @@ zfs_do_create(int argc, char **argv)
        log_history = B_FALSE;
    }

+   if (nomount) {
+       ret = 0;
+       goto error;
+   }
+
    ret = zfs_mount_and_share(g_zfs, argv[0], ZFS_TYPE_DATASET);
error:
    nvlist_free(props);
@@ -6596,9 +6718,9 @@ share_mount_one(zfs_handle_t *zhp, int op, int flags, char *protocol,
            (void) fprintf(stderr, gettext("cannot share '%s': "
                "legacy share\n"), zfs_get_name(zhp));
-           (void) fprintf(stderr, gettext("use share(1M) to "
-               "share this filesystem, or set "
-               "sharenfs property on\n"));
+           (void) fprintf(stderr, gettext("use exports(5) or "
+               "smb.conf(5) to share this filesystem, or set "
+               "the sharenfs or sharesmb property\n"));
            return (1);
        }
@@ -6613,7 +6735,7 @@ share_mount_one(zfs_handle_t *zhp, int op, int flags, char *protocol,
        (void) fprintf(stderr, gettext("cannot %s '%s': "
            "legacy mountpoint\n"), cmdname, zfs_get_name(zhp));
-       (void) fprintf(stderr, gettext("use %s(1M) to "
+       (void) fprintf(stderr, gettext("use %s(8) to "
            "%s this filesystem\n"), cmdname, cmdname);
        return (1);
    }
@@ -7416,8 +7538,8 @@ unshare_unmount(int op, int argc, char **argv)
                    "unshare '%s': legacy share\n"),
                    zfs_get_name(zhp));
                (void) fprintf(stderr, gettext("use "
-                   "unshare(1M) to unshare this "
-                   "filesystem\n"));
+                   "exports(5) or smb.conf(5) to unshare "
+                   "this filesystem\n"));
                ret = 1;
            } else if (!zfs_is_shared(zhp)) {
                (void) fprintf(stderr, gettext("cannot "
@@ -7435,7 +7557,7 @@ unshare_unmount(int op, int argc, char **argv)
                    "unmount '%s': legacy "
                    "mountpoint\n"), zfs_get_name(zhp));
                (void) fprintf(stderr, gettext("use "
-                   "umount(1M) to unmount this "
+                   "umount(8) to unmount this "
                    "filesystem\n"));
                ret = 1;
            } else if (!zfs_is_mounted(zhp, NULL)) {
@@ -8370,7 +8492,7 @@ zfs_do_wait(int argc, char **argv)
{
    boolean_t enabled[ZFS_WAIT_NUM_ACTIVITIES];
    int error, i;
-   char c;
+   int c;

    /* By default, wait for all types of activity. */
    for (i = 0; i < ZFS_WAIT_NUM_ACTIVITIES; i++)
@@ -44,7 +44,7 @@ int
main(int argc, char **argv)
{
    boolean_t verbose = B_FALSE;
-   char c;
+   int c;

    while ((c = getopt(argc, argv, "v")) != -1) {
        switch (c) {
        case 'v':
@@ -47,10 +47,10 @@ usage(void)
        "  -h\t\t print this usage and exit\n"
        "  -o <filename>\t write hostid to this file\n\n"
        "If hostid file is not present, store a hostid in it.\n"
-       "The optional value must be an 8-digit hex number between"
-       "1 and 2^32-1.\n"
-       "If no value is provided, a random one will"
-       "be generated.\n"
+       "The optional value should be an 8-digit hex number between"
+       " 1 and 2^32-1.\n"
+       "If the value is 0 or no value is provided, a random one"
+       " will be generated.\n"
        "The value must be unique among your systems.\n");
    exit(EXIT_FAILURE);
    /* NOTREACHED */
@@ -108,7 +108,7 @@ main(int argc, char **argv)
        exit(EXIT_FAILURE);
    }

-   if (input_i < 0x1 || input_i > UINT32_MAX) {
+   if (input_i > UINT32_MAX) {
        fprintf(stderr, "%s\n", strerror(ERANGE));
        usage();
    }
@@ -150,6 +150,7 @@ zhack_import(char *target, boolean_t readonly)
    zfeature_checks_disable = B_TRUE;
    error = spa_import(target, config, props,
        (readonly ? ZFS_IMPORT_SKIP_MMP : ZFS_IMPORT_NORMAL));
+   fnvlist_free(config);
    zfeature_checks_disable = B_FALSE;
    if (error == EEXIST)
        error = 0;
@@ -56,6 +56,7 @@ typedef struct zpool_node {

struct zpool_list {
    boolean_t   zl_findall;
+   boolean_t   zl_literal;
    uu_avl_t    *zl_avl;
    uu_avl_pool_t   *zl_pool;
    zprop_list_t    **zl_proplist;
@@ -88,7 +89,9 @@ add_pool(zpool_handle_t *zhp, void *data)
    uu_avl_node_init(node, &node->zn_avlnode, zlp->zl_pool);
    if (uu_avl_find(zlp->zl_avl, node, NULL, &idx) == NULL) {
        if (zlp->zl_proplist &&
-           zpool_expand_proplist(zhp, zlp->zl_proplist) != 0) {
+           zpool_expand_proplist(zhp, zlp->zl_proplist,
+           zlp->zl_literal)
+           != 0) {
            zpool_close(zhp);
            free(node);
            return (-1);
@@ -110,7 +113,8 @@ add_pool(zpool_handle_t *zhp, void *data)
 * line.
 */
zpool_list_t *
-pool_list_get(int argc, char **argv, zprop_list_t **proplist, int *err)
+pool_list_get(int argc, char **argv, zprop_list_t **proplist,
+    boolean_t literal, int *err)
{
    zpool_list_t *zlp;
@@ -128,6 +132,8 @@ pool_list_get(int argc, char **argv, zprop_list_t **proplist, int *err)
    zlp->zl_proplist = proplist;

+   zlp->zl_literal = literal;
+
    if (argc == 0) {
        (void) zpool_iter(g_zfs, add_pool, zlp);
        zlp->zl_findall = B_TRUE;
@@ -242,12 +248,12 @@ pool_list_count(zpool_list_t *zlp)
 */
int
for_each_pool(int argc, char **argv, boolean_t unavail,
-    zprop_list_t **proplist, zpool_iter_f func, void *data)
+    zprop_list_t **proplist, boolean_t literal, zpool_iter_f func, void *data)
{
    zpool_list_t *list;
    int ret = 0;

-   if ((list = pool_list_get(argc, argv, proplist, &ret)) == NULL)
+   if ((list = pool_list_get(argc, argv, proplist, literal, &ret)) == NULL)
        return (1);

    if (pool_list_iter(list, unavail, func, data) != 0)
@@ -711,7 +717,7 @@ all_pools_for_each_vdev_run(int argc, char **argv, char *cmd,
    vcdl->g_zfs = g_zfs;

    /* Gather our list of all vdevs in all pools */
-   for_each_pool(argc, argv, B_TRUE, NULL,
+   for_each_pool(argc, argv, B_TRUE, NULL, B_FALSE,
        all_pools_for_each_vdev_gather_cb, vcdl);

    /* Run command on all vdevs in all pools */
@@ -669,9 +669,16 @@ print_vdev_tree(zpool_handle_t *zhp, const char *name, nvlist_t *nv, int indent,
    }

    for (c = 0; c < children; c++) {
-       uint64_t is_log = B_FALSE;
+       uint64_t is_log = B_FALSE, is_hole = B_FALSE;
        char *class = "";

+       (void) nvlist_lookup_uint64(child[c], ZPOOL_CONFIG_IS_HOLE,
+           &is_hole);
+
+       if (is_hole == B_TRUE) {
+           continue;
+       }
+
        (void) nvlist_lookup_uint64(child[c], ZPOOL_CONFIG_IS_LOG,
            &is_log);
        if (is_log)
@@ -692,6 +699,54 @@ print_vdev_tree(zpool_handle_t *zhp, const char *name, nvlist_t *nv, int indent,
    }
}
/*
* Print the list of l2cache devices for dry runs.
*/
static void
print_cache_list(nvlist_t *nv, int indent)
{
nvlist_t **child;
uint_t c, children;
if (nvlist_lookup_nvlist_array(nv, ZPOOL_CONFIG_L2CACHE,
&child, &children) == 0 && children > 0) {
(void) printf("\t%*s%s\n", indent, "", "cache");
} else {
return;
}
for (c = 0; c < children; c++) {
char *vname;
vname = zpool_vdev_name(g_zfs, NULL, child[c], 0);
(void) printf("\t%*s%s\n", indent + 2, "", vname);
free(vname);
}
}
/*
* Print the list of spares for dry runs.
*/
static void
print_spare_list(nvlist_t *nv, int indent)
{
nvlist_t **child;
uint_t c, children;
if (nvlist_lookup_nvlist_array(nv, ZPOOL_CONFIG_SPARES,
&child, &children) == 0 && children > 0) {
(void) printf("\t%*s%s\n", indent, "", "spares");
} else {
return;
}
for (c = 0; c < children; c++) {
char *vname;
vname = zpool_vdev_name(g_zfs, NULL, child[c], 0);
(void) printf("\t%*s%s\n", indent + 2, "", vname);
free(vname);
}
}
static boolean_t
prop_list_contains_feature(nvlist_t *proplist)
{
@@ -921,16 +976,16 @@ zpool_do_add(int argc, char **argv)

    if (dryrun) {
        nvlist_t *poolnvroot;
-       nvlist_t **l2child;
-       uint_t l2children, c;
+       nvlist_t **l2child, **sparechild;
+       uint_t l2children, sparechildren, c;
        char *vname;
-       boolean_t hadcache = B_FALSE;
+       boolean_t hadcache = B_FALSE, hadspare = B_FALSE;

        verify(nvlist_lookup_nvlist(config, ZPOOL_CONFIG_VDEV_TREE,
            &poolnvroot) == 0);

        (void) printf(gettext("would update '%s' to the following "
-           "configuration:\n"), zpool_get_name(zhp));
+           "configuration:\n\n"), zpool_get_name(zhp));

        /* print original main pool and new tree */
        print_vdev_tree(zhp, poolname, poolnvroot, 0, "",
@@ -991,6 +1046,29 @@ zpool_do_add(int argc, char **argv)
                free(vname);
            }
        }
		/* And finally the spares */
if (nvlist_lookup_nvlist_array(poolnvroot, ZPOOL_CONFIG_SPARES,
&sparechild, &sparechildren) == 0 && sparechildren > 0) {
hadspare = B_TRUE;
(void) printf(gettext("\tspares\n"));
for (c = 0; c < sparechildren; c++) {
vname = zpool_vdev_name(g_zfs, NULL,
sparechild[c], name_flags);
(void) printf("\t %s\n", vname);
free(vname);
}
}
if (nvlist_lookup_nvlist_array(nvroot, ZPOOL_CONFIG_SPARES,
&sparechild, &sparechildren) == 0 && sparechildren > 0) {
if (!hadspare)
(void) printf(gettext("\tspares\n"));
for (c = 0; c < sparechildren; c++) {
vname = zpool_vdev_name(g_zfs, NULL,
sparechild[c], name_flags);
(void) printf("\t %s\n", vname);
free(vname);
}
}
        ret = 0;
    } else {
@@ -1548,6 +1626,8 @@ zpool_do_create(int argc, char **argv)
                VDEV_ALLOC_BIAS_SPECIAL, 0);
            print_vdev_tree(NULL, "logs", nvroot, 0,
                VDEV_ALLOC_BIAS_LOG, 0);
+           print_cache_list(nvroot, 0);
+           print_spare_list(nvroot, 0);

            ret = 0;
        } else {
@@ -1762,7 +1842,7 @@ zpool_do_export(int argc, char **argv)
        }

        return (for_each_pool(argc, argv, B_TRUE, NULL,
-           zpool_export_one, &cb));
+           B_FALSE, zpool_export_one, &cb));
    }

    /* check arguments */
@@ -1771,7 +1851,8 @@ zpool_do_export(int argc, char **argv)
        usage(B_FALSE);
    }

-   ret = for_each_pool(argc, argv, B_TRUE, NULL, zpool_export_one, &cb);
+   ret = for_each_pool(argc, argv, B_TRUE, NULL, B_FALSE, zpool_export_one,
+       &cb);

    return (ret);
}
@@ -2294,7 +2375,7 @@ print_status_config(zpool_handle_t *zhp, status_cbdata_t *cb, const char *name,
        }
    }

-   /* Display vdev initialization and trim status for leaves */
+   /* Display vdev initialization and trim status for leaves. */
    if (children == 0) {
        print_status_initialize(vs, cb->cb_print_vdev_init);
        print_status_trim(vs, cb->cb_print_vdev_trim);
@@ -3613,7 +3694,8 @@ zpool_do_sync(int argc, char **argv)
    argv += optind;

    /* if argc == 0 we will execute zpool_sync_one on all pools */
-   ret = for_each_pool(argc, argv, B_FALSE, NULL, zpool_sync_one, &force);
+   ret = for_each_pool(argc, argv, B_FALSE, NULL, B_FALSE, zpool_sync_one,
+       &force);

    return (ret);
}
@@ -4958,7 +5040,7 @@ are_vdevs_in_pool(int argc, char **argv, char *pool_name,
        /* Is this name a vdev in our pools? */
        ret = for_each_pool(pool_count, &pool_name, B_TRUE, NULL,
-           is_vdev, cb);
+           B_FALSE, is_vdev, cb);
        if (!ret) {
            /* No match */
            break;
@@ -4986,7 +5068,8 @@ is_pool_cb(zpool_handle_t *zhp, void *data)
static int
is_pool(char *name)
{
-   return (for_each_pool(0, NULL, B_TRUE, NULL, is_pool_cb, name));
+   return (for_each_pool(0, NULL, B_TRUE, NULL, B_FALSE, is_pool_cb,
+       name));
}

/* Are all our argv[] strings pool names?  If so return 1, 0 otherwise. */
@@ -5438,7 +5521,7 @@ zpool_do_iostat(int argc, char **argv)
     * Construct the list of all interesting pools.
     */
    ret = 0;
-   if ((list = pool_list_get(argc, argv, NULL, &ret)) == NULL)
+   if ((list = pool_list_get(argc, argv, NULL, parsable, &ret)) == NULL)
        return (1);

    if (pool_list_count(list) == 0 && argc != 0) {
@@ -6112,7 +6195,7 @@ zpool_do_list(int argc, char **argv)
    for (;;) {
        if ((list = pool_list_get(argc, argv, &cb.cb_proplist,
-           &ret)) == NULL)
+           cb.cb_literal, &ret)) == NULL)
            return (1);

        if (pool_list_count(list) == 0)
@@ -6512,6 +6595,10 @@ zpool_do_split(int argc, char **argv)
            "following layout:\n\n"), newpool);
        print_vdev_tree(NULL, newpool, config, 0, "",
            flags.name_flags);
+       print_vdev_tree(NULL, "dedup", config, 0,
+           VDEV_ALLOC_BIAS_DEDUP, 0);
+       print_vdev_tree(NULL, "special", config, 0,
+           VDEV_ALLOC_BIAS_SPECIAL, 0);
    }
}
@@ -6864,7 +6951,7 @@ zpool_do_reopen(int argc, char **argv)
    argv += optind;

    /* if argc == 0 we will execute zpool_reopen_one on all pools */
-   ret = for_each_pool(argc, argv, B_TRUE, NULL, zpool_reopen_one,
+   ret = for_each_pool(argc, argv, B_TRUE, NULL, B_FALSE, zpool_reopen_one,
        &scrub_restart);

    return (ret);
@@ -6994,12 +7081,13 @@ zpool_do_scrub(int argc, char **argv)
        usage(B_FALSE);
    }

-   error = for_each_pool(argc, argv, B_TRUE, NULL, scrub_callback, &cb);
+   error = for_each_pool(argc, argv, B_TRUE, NULL, B_FALSE,
+       scrub_callback, &cb);

    if (wait && !error) {
        zpool_wait_activity_t act = ZPOOL_WAIT_SCRUB;
-       error = for_each_pool(argc, argv, B_TRUE, NULL, wait_callback,
-           &act);
+       error = for_each_pool(argc, argv, B_TRUE, NULL, B_FALSE,
+           wait_callback, &act);
    }

    return (error);
@@ -7037,7 +7125,8 @@ zpool_do_resilver(int argc, char **argv)
        usage(B_FALSE);
    }

-   return (for_each_pool(argc, argv, B_TRUE, NULL, scrub_callback, &cb));
+   return (for_each_pool(argc, argv, B_TRUE, NULL, B_FALSE,
+       scrub_callback, &cb));
}

/*
@@ -7590,7 +7679,7 @@ print_removal_status(zpool_handle_t *zhp, pool_removal_stat_t *prs)
    vdev_name = zpool_vdev_name(g_zfs, zhp,
        child[prs->prs_removing_vdev], B_TRUE);

-   (void) printf(gettext("remove: "));
+   printf_color(ANSI_BOLD, gettext("remove: "));

    start = prs->prs_start_time;
    end = prs->prs_end_time;
@@ -8431,7 +8520,7 @@ zpool_do_status(int argc, char **argv)
        cb.vcdl = all_pools_for_each_vdev_run(argc, argv, cmd,
            NULL, NULL, 0, 0);

-   ret = for_each_pool(argc, argv, B_TRUE, NULL,
+   ret = for_each_pool(argc, argv, B_TRUE, NULL, cb.cb_literal,
        status_callback, &cb);

    if (cb.vcdl != NULL)
@@ -8950,7 +9039,7 @@ zpool_do_upgrade(int argc, char **argv)
            (void) printf(gettext("\n"));
        }
    } else {
-       ret = for_each_pool(argc, argv, B_FALSE, NULL,
+       ret = for_each_pool(argc, argv, B_FALSE, NULL, B_FALSE,
            upgrade_one, &cb);
    }
@@ -9036,6 +9125,12 @@ print_history_records(nvlist_t *nvhis, hist_cbdata_t *cb)
            dump_nvlist(fnvlist_lookup_nvlist(rec,
                ZPOOL_HIST_OUTPUT_NVL), 8);
        }
+       if (nvlist_exists(rec, ZPOOL_HIST_OUTPUT_SIZE)) {
+           (void) printf("    output nvlist omitted; "
+               "original size: %lldKB\n",
+               (longlong_t)fnvlist_lookup_int64(rec,
+               ZPOOL_HIST_OUTPUT_SIZE) / 1024);
+       }
        if (nvlist_exists(rec, ZPOOL_HIST_ERRNO)) {
            (void) printf("    errno: %lld\n",
                (longlong_t)fnvlist_lookup_int64(rec,
@@ -9133,7 +9228,7 @@ zpool_do_history(int argc, char **argv)
    argc -= optind;
    argv += optind;

-   ret = for_each_pool(argc, argv, B_FALSE, NULL, get_history_one,
+   ret = for_each_pool(argc, argv, B_FALSE, NULL, B_FALSE, get_history_one,
        &cbdata);

    if (argc == 0 && cbdata.first == B_TRUE) {
@@ -9696,7 +9791,7 @@ zpool_do_get(int argc, char **argv)
        cb.cb_proplist = &fake_name;
    }

-   ret = for_each_pool(argc, argv, B_TRUE, &cb.cb_proplist,
+   ret = for_each_pool(argc, argv, B_TRUE, &cb.cb_proplist, cb.cb_literal,
        get_callback, &cb);

    if (cb.cb_proplist == &fake_name)
@@ -9766,7 +9861,7 @@ zpool_do_set(int argc, char **argv)
    *(cb.cb_value) = '\0';
    cb.cb_value++;

-   error = for_each_pool(argc - 2, argv + 2, B_TRUE, NULL,
+   error = for_each_pool(argc - 2, argv + 2, B_TRUE, NULL, B_FALSE,
        set_callback, &cb);

    return (error);
@@ -9849,7 +9944,8 @@ vdev_any_spare_replacing(nvlist_t *nv)
    (void) nvlist_lookup_string(nv, ZPOOL_CONFIG_TYPE, &vdev_type);

    if (strcmp(vdev_type, VDEV_TYPE_REPLACING) == 0 ||
-       strcmp(vdev_type, VDEV_TYPE_SPARE) == 0) {
+       strcmp(vdev_type, VDEV_TYPE_SPARE) == 0 ||
+       strcmp(vdev_type, VDEV_TYPE_DRAID_SPARE) == 0) {
        return (B_TRUE);
    }
@@ -10051,7 +10147,7 @@ int
zpool_do_wait(int argc, char **argv)
{
    boolean_t verbose = B_FALSE;
-   char c;
+   int c;
    char *value;
    int i;
    unsigned long count;
@@ -64,7 +64,7 @@ nvlist_t *split_mirror_vdev(zpool_handle_t *zhp, char *newname,
 * Pool list functions
 */
int for_each_pool(int, char **, boolean_t unavail, zprop_list_t **,
-    zpool_iter_f, void *);
+    boolean_t, zpool_iter_f, void *);

/* Vdev list functions */
typedef int (*pool_vdev_iter_f)(zpool_handle_t *, nvlist_t *, void *);
@@ -72,7 +72,7 @@ int for_each_vdev(zpool_handle_t *zhp, pool_vdev_iter_f func, void *data);

typedef struct zpool_list zpool_list_t;

-zpool_list_t *pool_list_get(int, char **, zprop_list_t **, int *);
+zpool_list_t *pool_list_get(int, char **, zprop_list_t **, boolean_t, int *);
void pool_list_update(zpool_list_t *);
int pool_list_iter(zpool_list_t *, int unavail, zpool_iter_f, void *);
void pool_list_free(zpool_list_t *);
@@ -86,9 +86,6 @@
boolean_t error_seen;
boolean_t is_force;

/*PRINTFLIKE1*/
void
vdev_error(const char *fmt, ...)
@@ -222,6 +219,9 @@ is_spare(nvlist_t *config, const char *path)
    uint_t i, nspares;
    boolean_t inuse;

+   if (zpool_is_draid_spare(path))
+       return (B_TRUE);
+
    if ((fd = open(path, O_RDONLY|O_DIRECT)) < 0)
        return (B_FALSE);
@@ -267,9 +267,10 @@ is_spare(nvlist_t *config, const char *path)
 *  /dev/xxx    Complete disk path
 *  /xxx        Full path to file
 *  xxx         Shorthand for <zfs_vdev_paths>/xxx
+ *  draid*      Virtual dRAID spare
 */
static nvlist_t *
-make_leaf_vdev(nvlist_t *props, const char *arg, uint64_t is_log)
+make_leaf_vdev(nvlist_t *props, const char *arg, boolean_t is_primary)
{
    char path[MAXPATHLEN];
    struct stat64 statbuf;
@@ -309,6 +310,17 @@ make_leaf_vdev(nvlist_t *props, const char *arg, uint64_t is_log)
        /* After whole disk check restore original passed path */
        strlcpy(path, arg, sizeof (path));
+   } else if (zpool_is_draid_spare(arg)) {
+       if (!is_primary) {
+           (void) fprintf(stderr,
+               gettext("cannot open '%s': dRAID spares can only "
+               "be used to replace primary vdevs\n"), arg);
+           return (NULL);
+       }
+       wholedisk = B_TRUE;
+       strlcpy(path, arg, sizeof (path));
+       type = VDEV_TYPE_DRAID_SPARE;
    } else {
        err = is_shorthand_path(arg, path, sizeof (path),
            &statbuf, &wholedisk);
@@ -337,17 +349,19 @@ make_leaf_vdev(nvlist_t *props, const char *arg, uint64_t is_log)
        }
    }

-   /*
-    * Determine whether this is a device or a file.
-    */
-   if (wholedisk || S_ISBLK(statbuf.st_mode)) {
-       type = VDEV_TYPE_DISK;
-   } else if (S_ISREG(statbuf.st_mode)) {
-       type = VDEV_TYPE_FILE;
-   } else {
-       (void) fprintf(stderr, gettext("cannot use '%s': must be a "
-           "block device or regular file\n"), path);
-       return (NULL);
+   if (type == NULL) {
+       /*
+        * Determine whether this is a device or a file.
+        */
+       if (wholedisk || S_ISBLK(statbuf.st_mode)) {
+           type = VDEV_TYPE_DISK;
+       } else if (S_ISREG(statbuf.st_mode)) {
+           type = VDEV_TYPE_FILE;
+       } else {
+           fprintf(stderr, gettext("cannot use '%s': must "
+               "be a block device or regular file\n"), path);
+           return (NULL);
+       }
    }

    /*
@@ -358,10 +372,7 @@ make_leaf_vdev(nvlist_t *props, const char *arg, uint64_t is_log)
    verify(nvlist_alloc(&vdev, NV_UNIQUE_NAME, 0) == 0);
    verify(nvlist_add_string(vdev, ZPOOL_CONFIG_PATH, path) == 0);
    verify(nvlist_add_string(vdev, ZPOOL_CONFIG_TYPE, type) == 0);
-   verify(nvlist_add_uint64(vdev, ZPOOL_CONFIG_IS_LOG, is_log) == 0);
-   if (is_log)
-       verify(nvlist_add_string(vdev, ZPOOL_CONFIG_ALLOCATION_BIAS,
-           VDEV_ALLOC_BIAS_LOG) == 0);
+
    if (strcmp(type, VDEV_TYPE_DISK) == 0)
        verify(nvlist_add_uint64(vdev, ZPOOL_CONFIG_WHOLE_DISK,
            (uint64_t)wholedisk) == 0);
@@ -432,11 +443,16 @@ typedef struct replication_level {

#define ZPOOL_FUZZ  (16 * 1024 * 1024)

+/*
+ * N.B. For the purposes of comparing replication levels dRAID can be
+ * considered functionally equivalent to raidz.
+ */
static boolean_t
is_raidz_mirror(replication_level_t *a, replication_level_t *b,
    replication_level_t **raidz, replication_level_t **mirror)
{
-   if (strcmp(a->zprl_type, "raidz") == 0 &&
+   if ((strcmp(a->zprl_type, "raidz") == 0 ||
+       strcmp(a->zprl_type, "draid") == 0) &&
        strcmp(b->zprl_type, "mirror") == 0) {
        *raidz = a;
        *mirror = b;
@@ -445,6 +461,22 @@ is_raidz_mirror(replication_level_t *a, replication_level_t *b,
    return (B_FALSE);
}
/*
 * Comparison for determining if dRAID and raidz were passed in either order.
*/
static boolean_t
is_raidz_draid(replication_level_t *a, replication_level_t *b)
{
if ((strcmp(a->zprl_type, "raidz") == 0 ||
strcmp(a->zprl_type, "draid") == 0) &&
(strcmp(b->zprl_type, "raidz") == 0 ||
strcmp(b->zprl_type, "draid") == 0)) {
return (B_TRUE);
}
return (B_FALSE);
}
/*
 * Given a list of toplevel vdevs, return the current replication level.  If
 * the config is inconsistent, then NULL is returned.  If 'fatal' is set, then
@@ -511,7 +543,8 @@ get_replication(nvlist_t *nvroot, boolean_t fatal)
            rep.zprl_type = type;
            rep.zprl_children = 0;

-           if (strcmp(type, VDEV_TYPE_RAIDZ) == 0) {
+           if (strcmp(type, VDEV_TYPE_RAIDZ) == 0 ||
+               strcmp(type, VDEV_TYPE_DRAID) == 0) {
                verify(nvlist_lookup_uint64(nv,
                    ZPOOL_CONFIG_NPARITY,
                    &rep.zprl_parity) == 0);
@@ -677,6 +710,29 @@ get_replication(nvlist_t *nvroot, boolean_t fatal)
                else
                    return (NULL);
            }
} else if (is_raidz_draid(&lastrep, &rep)) {
/*
				 * Accept raidz and draid when they can
* handle the same number of disk failures.
*/
if (lastrep.zprl_parity != rep.zprl_parity) {
if (ret != NULL)
free(ret);
ret = NULL;
if (fatal)
vdev_error(gettext(
"mismatched replication "
"level: %s and %s vdevs "
"with different "
"redundancy, %llu vs. "
"%llu are present\n"),
lastrep.zprl_type,
rep.zprl_type,
lastrep.zprl_parity,
rep.zprl_parity);
else
return (NULL);
}
            } else if (strcmp(lastrep.zprl_type, rep.zprl_type) !=
                0) {
                if (ret != NULL)
@@ -1103,31 +1159,87 @@ is_device_in_use(nvlist_t *config, nvlist_t *nv, boolean_t force,
    return (anyinuse);
}
/*
* Returns the parity level extracted from a raidz or draid type.
* If the parity cannot be determined zero is returned.
*/
static int
get_parity(const char *type)
{
long parity = 0;
const char *p;
if (strncmp(type, VDEV_TYPE_RAIDZ, strlen(VDEV_TYPE_RAIDZ)) == 0) {
p = type + strlen(VDEV_TYPE_RAIDZ);
if (*p == '\0') {
/* when unspecified default to single parity */
return (1);
} else if (*p == '0') {
/* no zero prefixes allowed */
return (0);
} else {
/* 0-3, no suffixes allowed */
char *end;
errno = 0;
parity = strtol(p, &end, 10);
if (errno != 0 || *end != '\0' ||
parity < 1 || parity > VDEV_RAIDZ_MAXPARITY) {
return (0);
}
}
} else if (strncmp(type, VDEV_TYPE_DRAID,
strlen(VDEV_TYPE_DRAID)) == 0) {
p = type + strlen(VDEV_TYPE_DRAID);
if (*p == '\0' || *p == ':') {
/* when unspecified default to single parity */
return (1);
} else if (*p == '0') {
/* no zero prefixes allowed */
return (0);
} else {
/* 0-3, allowed suffixes: '\0' or ':' */
char *end;
errno = 0;
parity = strtol(p, &end, 10);
if (errno != 0 ||
parity < 1 || parity > VDEV_DRAID_MAXPARITY ||
(*end != '\0' && *end != ':')) {
return (0);
}
}
}
return ((int)parity);
}
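
A few spot checks make the grammar accepted by get_parity() concrete (a sketch only; these asserts are not part of the change and assume they are compiled alongside the function):

    #include <assert.h>

    /*
     * Spot checks for get_parity(): bare prefixes default to single
     * parity, zero prefixes are rejected, and only dRAID may carry a
     * ':' suffix after the parity digit.
     */
    static void
    get_parity_examples(void)
    {
        assert(get_parity("raidz") == 1);        /* default parity */
        assert(get_parity("raidz3") == 3);
        assert(get_parity("raidz03") == 0);      /* zero prefix rejected */
        assert(get_parity("raidz2:4d") == 0);    /* ':' invalid for raidz */
        assert(get_parity("draid") == 1);        /* default parity */
        assert(get_parity("draid2:8d:51c:2s") == 2);
    }
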
+/*
+ * Assign the minimum and maximum number of devices allowed for
+ * the specified type.  On error NULL is returned, otherwise the
+ * type prefix is returned (raidz, mirror, etc).
+ */
static const char *
is_grouping(const char *type, int *mindev, int *maxdev)
{
-   if (strncmp(type, "raidz", 5) == 0) {
-       const char *p = type + 5;
-       char *end;
-       long nparity;
-
-       if (*p == '\0') {
-           nparity = 1;
-       } else if (*p == '0') {
-           return (NULL); /* no zero prefixes allowed */
-       } else {
-           errno = 0;
-           nparity = strtol(p, &end, 10);
-           if (errno != 0 || nparity < 1 || nparity >= 255 ||
-               *end != '\0')
-               return (NULL);
-       }
+   int nparity;

+   if (strncmp(type, VDEV_TYPE_RAIDZ, strlen(VDEV_TYPE_RAIDZ)) == 0 ||
+       strncmp(type, VDEV_TYPE_DRAID, strlen(VDEV_TYPE_DRAID)) == 0) {
+       nparity = get_parity(type);
+       if (nparity == 0)
+           return (NULL);
        if (mindev != NULL)
            *mindev = nparity + 1;
        if (maxdev != NULL)
            *maxdev = 255;
-       return (VDEV_TYPE_RAIDZ);
+
+       if (strncmp(type, VDEV_TYPE_RAIDZ,
+           strlen(VDEV_TYPE_RAIDZ)) == 0) {
+           return (VDEV_TYPE_RAIDZ);
+       } else {
+           return (VDEV_TYPE_DRAID);
+       }
    }

    if (maxdev != NULL)
@@ -1167,6 +1279,163 @@ is_grouping(const char *type, int *mindev, int *maxdev)
    return (NULL);
}
/*
* Extract the configuration parameters encoded in the dRAID type and
* use them to generate a dRAID configuration. The expected format is:
*
* draid[<parity>][:<data><d|D>][:<children><c|C>][:<spares><s|S>]
*
* The intent is to be able to generate a good configuration when no
* additional information is provided. The only mandatory component
* of the 'type' is the 'draid' prefix. If a value is not provided
* then reasonable defaults are used. The optional components may
* appear in any order but the d/s/c suffix is required.
*
* Valid inputs:
* - data: number of data devices per group (1-255)
* - parity: number of parity blocks per group (1-3)
 *	- spares: number of distributed spares (0-100)
* - children: total number of devices (1-255)
*
* Examples:
* - zpool create tank draid <devices...>
* - zpool create tank draid2:8d:51c:2s <devices...>
*/
static int
draid_config_by_type(nvlist_t *nv, const char *type, uint64_t children)
{
uint64_t nparity = 1;
uint64_t nspares = 0;
uint64_t ndata = UINT64_MAX;
uint64_t ngroups = 1;
long value;
if (strncmp(type, VDEV_TYPE_DRAID, strlen(VDEV_TYPE_DRAID)) != 0)
return (EINVAL);
nparity = (uint64_t)get_parity(type);
if (nparity == 0)
return (EINVAL);
char *p = (char *)type;
while ((p = strchr(p, ':')) != NULL) {
char *end;
p = p + 1;
errno = 0;
if (!isdigit(p[0])) {
(void) fprintf(stderr, gettext("invalid dRAID "
"syntax; expected [:<number><c|d|s>] not '%s'\n"),
type);
return (EINVAL);
}
/* Expected non-zero value with c/d/s suffix */
value = strtol(p, &end, 10);
char suffix = tolower(*end);
if (errno != 0 ||
(suffix != 'c' && suffix != 'd' && suffix != 's')) {
(void) fprintf(stderr, gettext("invalid dRAID "
"syntax; expected [:<number><c|d|s>] not '%s'\n"),
type);
return (EINVAL);
}
if (suffix == 'c') {
if ((uint64_t)value != children) {
fprintf(stderr,
gettext("invalid number of dRAID children; "
"%llu required but %llu provided\n"),
(u_longlong_t)value,
(u_longlong_t)children);
return (EINVAL);
}
} else if (suffix == 'd') {
ndata = (uint64_t)value;
} else if (suffix == 's') {
nspares = (uint64_t)value;
} else {
verify(0); /* Unreachable */
}
}
/*
* When a specific number of data disks is not provided limit a
* redundancy group to 8 data disks. This value was selected to
* provide a reasonable tradeoff between capacity and performance.
*/
if (ndata == UINT64_MAX) {
if (children > nspares + nparity) {
ndata = MIN(children - nspares - nparity, 8);
} else {
fprintf(stderr, gettext("request number of "
"distributed spares %llu and parity level %llu\n"
"leaves no disks available for data\n"),
(u_longlong_t)nspares, (u_longlong_t)nparity);
return (EINVAL);
}
}
/* Verify the maximum allowed group size is never exceeded. */
if (ndata == 0 || (ndata + nparity > children - nspares)) {
fprintf(stderr, gettext("requested number of dRAID data "
"disks per group %llu is too high,\nat most %llu disks "
"are available for data\n"), (u_longlong_t)ndata,
(u_longlong_t)(children - nspares - nparity));
return (EINVAL);
}
if (nparity == 0 || nparity > VDEV_DRAID_MAXPARITY) {
fprintf(stderr,
gettext("invalid dRAID parity level %llu; must be "
"between 1 and %d\n"), (u_longlong_t)nparity,
VDEV_DRAID_MAXPARITY);
return (EINVAL);
}
/*
* Verify the requested number of spares can be satisfied.
* An arbitrary limit of 100 distributed spares is applied.
*/
if (nspares > 100 || nspares > (children - (ndata + nparity))) {
fprintf(stderr,
gettext("invalid number of dRAID spares %llu; additional "
"disks would be required\n"), (u_longlong_t)nspares);
return (EINVAL);
}
/* Verify the requested number of children is sufficient. */
if (children < (ndata + nparity + nspares)) {
fprintf(stderr, gettext("%llu disks were provided, but at "
"least %llu disks are required for this config\n"),
(u_longlong_t)children,
(u_longlong_t)(ndata + nparity + nspares));
return (EINVAL);
}
if (children > VDEV_DRAID_MAX_CHILDREN) {
fprintf(stderr, gettext("%llu disks were provided, but "
"dRAID only supports up to %u disks\n"),
(u_longlong_t)children, VDEV_DRAID_MAX_CHILDREN);
return (EINVAL);
}
/*
* Calculate the minimum number of groups required to fill a slice.
* This is the LCM of the stripe width (ndata + nparity) and the
* number of data drives (children - nspares).
*/
while (ngroups * (ndata + nparity) % (children - nspares) != 0)
ngroups++;
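/*
* For example (illustrative): draid2:4d:11c:2s has a stripe width
* of 6 (4d + 2p) and 9 data drives (11c - 2s); the smallest ngroups
* with ngroups * 6 divisible by 9 is 3, since lcm(6, 9) = 18.
*/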
/* Store the basic dRAID configuration. */
fnvlist_add_uint64(nv, ZPOOL_CONFIG_NPARITY, nparity);
fnvlist_add_uint64(nv, ZPOOL_CONFIG_DRAID_NDATA, ndata);
fnvlist_add_uint64(nv, ZPOOL_CONFIG_DRAID_NSPARES, nspares);
fnvlist_add_uint64(nv, ZPOOL_CONFIG_DRAID_NGROUPS, ngroups);
return (0);
}
/* /*
* Construct a syntactically valid vdev specification, * Construct a syntactically valid vdev specification,
* and ensure that all devices and files exist and can be opened. * and ensure that all devices and files exist and can be opened.
@ -1178,8 +1447,8 @@ construct_spec(nvlist_t *props, int argc, char **argv)
{ {
nvlist_t *nvroot, *nv, **top, **spares, **l2cache; nvlist_t *nvroot, *nv, **top, **spares, **l2cache;
int t, toplevels, mindev, maxdev, nspares, nlogs, nl2cache; int t, toplevels, mindev, maxdev, nspares, nlogs, nl2cache;
const char *type; const char *type, *fulltype;
uint64_t is_log, is_special, is_dedup; boolean_t is_log, is_special, is_dedup, is_spare;
boolean_t seen_logs; boolean_t seen_logs;
top = NULL; top = NULL;
@ -1189,18 +1458,20 @@ construct_spec(nvlist_t *props, int argc, char **argv)
nspares = 0; nspares = 0;
nlogs = 0; nlogs = 0;
nl2cache = 0; nl2cache = 0;
is_log = is_special = is_dedup = B_FALSE; is_log = is_special = is_dedup = is_spare = B_FALSE;
seen_logs = B_FALSE; seen_logs = B_FALSE;
nvroot = NULL; nvroot = NULL;
while (argc > 0) { while (argc > 0) {
fulltype = argv[0];
nv = NULL; nv = NULL;
/* /*
* If it's a mirror or raidz, the subsequent arguments are * If it's a mirror, raidz, or draid the subsequent arguments
* its leaves -- until we encounter the next mirror or raidz. * are its leaves -- until we encounter the next mirror,
* raidz or draid.
*/ */
if ((type = is_grouping(argv[0], &mindev, &maxdev)) != NULL) { if ((type = is_grouping(fulltype, &mindev, &maxdev)) != NULL) {
nvlist_t **child = NULL; nvlist_t **child = NULL;
int c, children = 0; int c, children = 0;
@ -1212,6 +1483,7 @@ construct_spec(nvlist_t *props, int argc, char **argv)
"specified only once\n")); "specified only once\n"));
goto spec_out; goto spec_out;
} }
is_spare = B_TRUE;
is_log = is_special = is_dedup = B_FALSE; is_log = is_special = is_dedup = B_FALSE;
} }
@ -1225,8 +1497,7 @@ construct_spec(nvlist_t *props, int argc, char **argv)
} }
seen_logs = B_TRUE; seen_logs = B_TRUE;
is_log = B_TRUE; is_log = B_TRUE;
is_special = B_FALSE; is_special = is_dedup = is_spare = B_FALSE;
is_dedup = B_FALSE;
argc--; argc--;
argv++; argv++;
/* /*
@ -1238,8 +1509,7 @@ construct_spec(nvlist_t *props, int argc, char **argv)
if (strcmp(type, VDEV_ALLOC_BIAS_SPECIAL) == 0) { if (strcmp(type, VDEV_ALLOC_BIAS_SPECIAL) == 0) {
is_special = B_TRUE; is_special = B_TRUE;
is_log = B_FALSE; is_log = is_dedup = is_spare = B_FALSE;
is_dedup = B_FALSE;
argc--; argc--;
argv++; argv++;
continue; continue;
@ -1247,8 +1517,7 @@ construct_spec(nvlist_t *props, int argc, char **argv)
if (strcmp(type, VDEV_ALLOC_BIAS_DEDUP) == 0) { if (strcmp(type, VDEV_ALLOC_BIAS_DEDUP) == 0) {
is_dedup = B_TRUE; is_dedup = B_TRUE;
is_log = B_FALSE; is_log = is_special = is_spare = B_FALSE;
is_special = B_FALSE;
argc--; argc--;
argv++; argv++;
continue; continue;
@ -1262,7 +1531,8 @@ construct_spec(nvlist_t *props, int argc, char **argv)
"specified only once\n")); "specified only once\n"));
goto spec_out; goto spec_out;
} }
is_log = is_special = is_dedup = B_FALSE; is_log = is_special = B_FALSE;
is_dedup = is_spare = B_FALSE;
} }
if (is_log || is_special || is_dedup) { if (is_log || is_special || is_dedup) {
@ -1280,13 +1550,15 @@ construct_spec(nvlist_t *props, int argc, char **argv)
for (c = 1; c < argc; c++) { for (c = 1; c < argc; c++) {
if (is_grouping(argv[c], NULL, NULL) != NULL) if (is_grouping(argv[c], NULL, NULL) != NULL)
break; break;
children++; children++;
child = realloc(child, child = realloc(child,
children * sizeof (nvlist_t *)); children * sizeof (nvlist_t *));
if (child == NULL) if (child == NULL)
zpool_no_memory(); zpool_no_memory();
if ((nv = make_leaf_vdev(props, argv[c], if ((nv = make_leaf_vdev(props, argv[c],
B_FALSE)) == NULL) { !(is_log || is_special || is_dedup ||
is_spare))) == NULL) {
for (c = 0; c < children - 1; c++) for (c = 0; c < children - 1; c++)
nvlist_free(child[c]); nvlist_free(child[c]);
free(child); free(child);
@ -1335,10 +1607,11 @@ construct_spec(nvlist_t *props, int argc, char **argv)
type) == 0); type) == 0);
verify(nvlist_add_uint64(nv, verify(nvlist_add_uint64(nv,
ZPOOL_CONFIG_IS_LOG, is_log) == 0); ZPOOL_CONFIG_IS_LOG, is_log) == 0);
if (is_log) if (is_log) {
verify(nvlist_add_string(nv, verify(nvlist_add_string(nv,
ZPOOL_CONFIG_ALLOCATION_BIAS, ZPOOL_CONFIG_ALLOCATION_BIAS,
VDEV_ALLOC_BIAS_LOG) == 0); VDEV_ALLOC_BIAS_LOG) == 0);
}
if (is_special) { if (is_special) {
verify(nvlist_add_string(nv, verify(nvlist_add_string(nv,
ZPOOL_CONFIG_ALLOCATION_BIAS, ZPOOL_CONFIG_ALLOCATION_BIAS,
@ -1354,6 +1627,15 @@ construct_spec(nvlist_t *props, int argc, char **argv)
ZPOOL_CONFIG_NPARITY, ZPOOL_CONFIG_NPARITY,
mindev - 1) == 0); mindev - 1) == 0);
} }
if (strcmp(type, VDEV_TYPE_DRAID) == 0) {
if (draid_config_by_type(nv,
fulltype, children) != 0) {
for (c = 0; c < children; c++)
nvlist_free(child[c]);
free(child);
goto spec_out;
}
}
verify(nvlist_add_nvlist_array(nv, verify(nvlist_add_nvlist_array(nv,
ZPOOL_CONFIG_CHILDREN, child, ZPOOL_CONFIG_CHILDREN, child,
children) == 0); children) == 0);
@ -1367,12 +1649,19 @@ construct_spec(nvlist_t *props, int argc, char **argv)
* We have a device. Pass off to make_leaf_vdev() to * We have a device. Pass off to make_leaf_vdev() to
* construct the appropriate nvlist describing the vdev. * construct the appropriate nvlist describing the vdev.
*/ */
if ((nv = make_leaf_vdev(props, argv[0], if ((nv = make_leaf_vdev(props, argv[0], !(is_log ||
is_log)) == NULL) is_special || is_dedup || is_spare))) == NULL)
goto spec_out; goto spec_out;
if (is_log) verify(nvlist_add_uint64(nv,
ZPOOL_CONFIG_IS_LOG, is_log) == 0);
if (is_log) {
verify(nvlist_add_string(nv,
ZPOOL_CONFIG_ALLOCATION_BIAS,
VDEV_ALLOC_BIAS_LOG) == 0);
nlogs++; nlogs++;
}
if (is_special) { if (is_special) {
verify(nvlist_add_string(nv, verify(nvlist_add_string(nv,
ZPOOL_CONFIG_ALLOCATION_BIAS, ZPOOL_CONFIG_ALLOCATION_BIAS,

View File

@ -0,0 +1 @@
/zpool_influxdb

View File

@ -0,0 +1,11 @@
include $(top_srcdir)/config/Rules.am
zfsexec_PROGRAMS = zpool_influxdb
zpool_influxdb_SOURCES = \
zpool_influxdb.c
zpool_influxdb_LDADD = \
$(top_builddir)/lib/libspl/libspl.la \
$(top_builddir)/lib/libnvpair/libnvpair.la \
$(top_builddir)/lib/libzfs/libzfs.la

View File

@ -0,0 +1,294 @@
# Influxdb Metrics for ZFS Pools
The _zpool_influxdb_ program produces
[influxdb](https://github.com/influxdata/influxdb) line protocol
compatible metrics from zpools. In the UNIX tradition, _zpool_influxdb_
does one thing: read statistics from a pool and print them to
stdout. In many ways, this is a metrics-friendly output of
statistics normally observed via the `zpool` command.
## Usage
When run without arguments, _zpool_influxdb_ runs once, reading data
from all imported pools, and prints to stdout.
```shell
zpool_influxdb [options] [poolname]
```
If no poolname is specified, then all pools are sampled.
| option | short option | description |
|---|---|---|
| --execd | -e | For use with telegraf's `execd` plugin. When [enter] is pressed, the pools are sampled. To exit, use [ctrl+D] |
| --no-histogram | -n | Do not print histogram information |
| --signed-int | -i | Use signed integer data type (default=unsigned) |
| --sum-histogram-buckets | -s | Sum histogram bucket values |
| --tags key=value[,key=value...] | -t | Add tags to data points. No tag sanity checking is performed. |
| --help | -h | Print a short usage message |
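As a sketch of the output, a single hypothetical and abbreviated `zpool_stats` data point in influxdb line protocol looks like:
```
zpool_stats,name=tank,state=ONLINE,vdev=root alloc=10737418240u,free=96636764160u,size=107374182400u 1609459200000000000
```
Tags come before the space, fields after it, and the final value is the nanosecond timestamp.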
#### Histogram Bucket Values
The histogram data collected by ZFS is stored as independent bucket values.
This works well out-of-the-box with an influxdb data source and grafana's
heatmap visualization. The influxdb query for a grafana heatmap
visualization looks like:
```
field(disk_read) last() non_negative_derivative(1s)
```
Another method for storing histogram data sums the values for lower-value
buckets. For example, a latency bucket tagged "le=10" includes the values
in the bucket "le=1".
This method is often used for prometheus histograms.
The `zpool_influxdb --sum-histogram-buckets` option presents the data from ZFS
as summed values.
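For example, with illustrative counts:
| bucket | independent value | summed value |
|---|---|---|
| le=1 | 5 | 5 |
| le=10 | 3 | 8 |
| le=+Inf | 2 | 10 |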
## Measurements
The following measurements are collected:
| measurement | description | zpool equivalent |
|---|---|---|
| zpool_stats | general size and data | zpool list |
| zpool_scan_stats | scrub, rebuild, and resilver statistics (omitted if no scan has been requested) | zpool status |
| zpool_vdev_stats | per-vdev statistics | zpool iostat -q |
| zpool_io_size | per-vdev I/O size histogram | zpool iostat -r |
| zpool_latency | per-vdev I/O latency histogram | zpool iostat -w |
| zpool_vdev_queue | per-vdev instantaneous queue depth | zpool iostat -q |
### zpool_stats Description
zpool_stats contains top-level summary statistics for the pool.
Performance counters measure the I/Os to the pool's devices.
#### zpool_stats Tags
| label | description |
|---|---|
| name | pool name |
| path | for leaf vdevs, the pathname |
| state | pool state, as shown by _zpool status_ |
| vdev | vdev name (root = entire pool) |
#### zpool_stats Fields
| field | units | description |
|---|---|---|
| alloc | bytes | allocated space |
| free | bytes | unallocated space |
| size | bytes | total pool size |
| read_bytes | bytes | bytes read since pool import |
| read_errors | count | number of read errors |
| read_ops | count | number of read operations |
| write_bytes | bytes | bytes written since pool import |
| write_errors | count | number of write errors |
| write_ops | count | number of write operations |
### zpool_scan_stats Description
Once a pool has been scrubbed, resilvered, or rebuilt, the zpool_scan_stats
contain information about the status and performance of the operation.
Otherwise, the zpool_scan_stats do not exist in the kernel, and therefore
cannot be reported by this collector.
#### zpool_scan_stats Tags
| label | description |
|---|---|
| name | pool name |
| function | name of the scan function running or recently completed |
| state | scan state, as shown by _zpool status_ |
#### zpool_scan_stats Fields
| field | units | description |
|---|---|---|
| errors | count | number of errors encountered by scan |
| examined | bytes | total data examined during scan |
| to_examine | bytes | prediction of total bytes to be scanned |
| pass_examined | bytes | data examined during current scan pass |
| issued | bytes | size of I/Os issued to disks |
| pass_issued | bytes | size of I/Os issued to disks for current pass |
| processed | bytes | data reconstructed during scan |
| to_process | bytes | total bytes to be repaired |
| rate | bytes/sec | examination rate |
| start_ts | epoch timestamp | start timestamp for scan |
| pause_ts | epoch timestamp | timestamp for a scan pause request |
| end_ts | epoch timestamp | completion timestamp for scan |
| paused_t | seconds | elapsed time while paused |
| remaining_t | seconds | estimate of time remaining for scan |
### zpool_vdev_stats Description
The ZFS I/O (ZIO) scheduler uses five queues to schedule I/Os to each vdev.
These queues are further divided into active and pending states.
An I/O is pending prior to being issued to the vdev. An active
I/O has been issued to the vdev. The scheduler and its tunable
parameters are described at the
[ZFS documentation for ZIO Scheduler](https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/ZIO%20Scheduler.html).
The ZIO scheduler reports the queue depths as gauges where the value
represents an instantaneous snapshot of the queue depth at
the sample time. Therefore, it is not unusual to see all zeroes
for an idle pool.
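For example, an idle pool might emit a point like this hypothetical, abbreviated sample, with every gauge at zero:
```
zpool_vdev_stats,name=tank,vdev=root sync_r_active_queue=0u,sync_w_active_queue=0u,async_r_active_queue=0u 1609459200000000000
```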
#### zpool_vdev_stats Tags
| label | description |
|---|---|
| name | pool name |
| vdev | vdev name (root = entire pool) |
#### zpool_vdev_stats Fields
| field | units | description |
|---|---|---|
| sync_r_active_queue | entries | synchronous read active queue depth |
| sync_w_active_queue | entries | synchronous write active queue depth |
| async_r_active_queue | entries | asynchronous read active queue depth |
| async_w_active_queue | entries | asynchronous write active queue depth |
| async_scrub_active_queue | entries | asynchronous scrub active queue depth |
| sync_r_pend_queue | entries | synchronous read pending queue depth |
| sync_w_pend_queue | entries | synchronous write pending queue depth |
| async_r_pend_queue | entries | asynchronous read pending queue depth |
| async_w_pend_queue | entries | asynchronous write pending queue depth |
| async_scrub_pend_queue | entries | asynchronous scrub pending queue depth |
### zpool_latency Histogram
ZFS tracks the latency of each I/O in the ZIO pipeline. This latency can
be useful for observing latency-related issues that are not easily observed
using the averaged latency statistics.
The histogram fields show cumulative values from lowest to highest.
The largest bucket is tagged "le=+Inf", representing the total count
of I/Os by type and vdev.
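A hypothetical, abbreviated cumulative point for the 262144 ns bucket (the `le` tag is expressed in seconds) might look like:
```
zpool_latency,le=0.000262,name=tank,vdev=root total_read=1200u,total_write=3400u 1609459200000000000
```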
#### zpool_latency Histogram Tags
| label | description |
|---|---|
| le | bucket for histogram, latency is less than or equal to bucket value in seconds |
| name | pool name |
| path | for leaf vdevs, the device path name, otherwise omitted |
| vdev | vdev name (root = entire pool) |
#### zpool_latency Histogram Fields
| field | units | description |
|---|---|---|
| total_read | operations | read operations of all types |
| total_write | operations | write operations of all types |
| disk_read | operations | disk read operations |
| disk_write | operations | disk write operations |
| sync_read | operations | ZIO sync reads |
| sync_write | operations | ZIO sync writes |
| async_read | operations | ZIO async reads|
| async_write | operations | ZIO async writes |
| scrub | operations | ZIO scrub/scan reads |
| trim | operations | ZIO trim (aka unmap) writes |
### zpool_io_size Histogram
ZFS tracks I/O throughout the ZIO pipeline. The size of each I/O is used
to create a histogram of the size by I/O type and vdev. For example, a
4KiB write to mirrored pool will show a 4KiB write to the top-level vdev
(root) and a 4KiB write to each of the mirror leaf vdevs.
The ZIO pipeline can aggregate I/O operations. For example, a contiguous
series of writes can be aggregated into a single, larger I/O to the leaf
vdev. The independent I/O operations reflect the logical operations and
the aggregated I/O operations reflect the physical operations.
The histogram fields show cumulative values from lowest to highest.
The largest bucket is tagged "le=+Inf", representing the total count
of I/Os by type and vdev.
Note: trim I/Os can be larger than 16MiB, but the larger sizes are
counted in the 16MiB bucket.
#### zpool_io_size Histogram Tags
| label | description |
|---|---|
| le | bucket for histogram, I/O size is less than or equal to bucket value in bytes |
| name | pool name |
| path | for leaf vdevs, the device path name, otherwise omitted |
| vdev | vdev name (root = entire pool) |
#### zpool_io_size Histogram Fields
| field | units | description |
|---|---|---|
| sync_read_ind | blocks | independent sync reads |
| sync_write_ind | blocks | independent sync writes |
| async_read_ind | blocks | independent async reads |
| async_write_ind | blocks | independent async writes |
| scrub_read_ind | blocks | independent scrub/scan reads |
| trim_write_ind | blocks | independent trim (aka unmap) writes |
| sync_read_agg | blocks | aggregated sync reads |
| sync_write_agg | blocks | aggregated sync writes |
| async_read_agg | blocks | aggregated async reads |
| async_write_agg | blocks | aggregated async writes |
| scrub_read_agg | blocks | aggregated scrub/scan reads |
| trim_write_agg | blocks | aggregated trim (aka unmap) writes |
#### About unsigned integers
Telegraf v1.6.2 and later support unsigned 64-bit integers, which more
closely match the uint64_t values used by ZFS. By default, zpool_influxdb
uses ZFS' uint64_t values and the influxdb line protocol unsigned integer
type, so fields carry a `u` suffix (for example `size=1024u`). If you are
using an older telegraf or influxdb where unsigned integers are not
available, use the `--signed-int` option to emit signed integer fields
(for example `size=1024i`) instead.
## Using _zpool_influxdb_
The simplest method is to use the execd input agent in telegraf. For older
versions of telegraf which lack execd, the exec input agent can be used.
For convenience, one of the sample config files below can be placed in the
telegraf config-directory (often /etc/telegraf/telegraf.d). Telegraf can
be restarted to read the config-directory files.
### Example telegraf execd configuration
```toml
# # Read metrics from zpool_influxdb
[[inputs.execd]]
# ## default installation location for zpool_influxdb command
command = ["/usr/libexec/zfs/zpool_influxdb", "--execd"]
## Define how the process is signaled on each collection interval.
## Valid values are:
## "none" : Do not signal anything. (Recommended for service inputs)
## The process must output metrics by itself.
## "STDIN" : Send a newline on STDIN. (Recommended for gather inputs)
## "SIGHUP" : Send a HUP signal. Not available on Windows. (not recommended)
## "SIGUSR1" : Send a USR1 signal. Not available on Windows.
## "SIGUSR2" : Send a USR2 signal. Not available on Windows.
signal = "STDIN"
## Delay before the process is restarted after an unexpected termination
restart_delay = "10s"
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "influx"
```
### Example telegraf exec configuration
```toml
# # Read metrics from zpool_influxdb
[[inputs.exec]]
# ## default installation location for zpool_influxdb command
commands = ["/usr/libexec/zfs/zpool_influxdb"]
data_format = "influx"
```
## Caveat Emptor
* Like the _zpool_ command, _zpool_influxdb_ takes a reader
lock on spa_config for each imported pool. If this lock blocks,
then the command will also block indefinitely and might be
unkillable. This is not a normal condition, but can occur if
there are bugs in the kernel modules.
For this reason, care should be taken:
  * avoid spawning many of these commands hoping that one might
    finish
  * avoid frequent updates or short sample time intervals,
    because the locks can interfere with the performance of
    other instances of _zpool_ or _zpool_influxdb_
## Other collectors
There are a few other collectors for zpool statistics roaming around
the Internet. Many attempt to screen-scrape `zpool` output in various
ways. The screen-scrape method works poorly for `zpool` output because
of its human-friendly nature. Also, they suffer from the same caveats
as this implementation. This implementation is optimized for directly
collecting the metrics and is much more efficient than the screen-scrapers.
## Feedback Encouraged
Pull requests and issues are greatly appreciated at
https://github.com/openzfs/zfs

View File

@ -0,0 +1,3 @@
### Dashboards for zpool_influxdb
This directory contains a collection of dashboards related to ZFS with data
collected from the zpool_influxdb collector.

View File

@ -0,0 +1,7 @@
This directory contains sample telegraf configurations for
adding `zpool_influxdb` as an input plugin. Depending on your
telegraf configuration, the installation can be as simple as
copying one of these to the `/etc/telegraf/telegraf.d` directory
and restarting telegraf: `systemctl restart telegraf`.
See the telegraf docs for more information on input plugins.

View File

@ -0,0 +1,15 @@
# # Read metrics from zpool_influxdb
[[inputs.exec]]
# ## default installation location for zpool_influxdb command
commands = ["/usr/local/libexec/zfs/zpool_influxdb"]
# ## Timeout for each command to complete.
# timeout = "5s"
#
# ## measurement name suffix (for separating different commands)
# name_suffix = "_mycollector"
#
# ## Data format to consume.
# ## Each data format has its own unique set of configuration options, read
# ## more about them here:
# ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "influx"

View File

@ -0,0 +1,23 @@
# # Read metrics from zpool_influxdb
[[inputs.execd]]
# ## default installation location for zpool_influxdb command
command = ["/usr/local/libexec/zfs/zpool_influxdb", "--execd"]
## Define how the process is signaled on each collection interval.
## Valid values are:
## "none" : Do not signal anything. (Recommended for service inputs)
## The process must output metrics by itself.
## "STDIN" : Send a newline on STDIN. (Recommended for gather inputs)
## "SIGHUP" : Send a HUP signal. Not available on Windows. (not recommended)
## "SIGUSR1" : Send a USR1 signal. Not available on Windows.
## "SIGUSR2" : Send a USR2 signal. Not available on Windows.
signal = "STDIN"
## Delay before the process is restarted after an unexpected termination
restart_delay = "10s"
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "influx"

View File

@ -0,0 +1,843 @@
/*
* Gather top-level ZFS pool and resilver/scan statistics and print using
* influxdb line protocol
* usage: [options] [pool_name]
* where options are:
* --execd, -e run in telegraf execd input plugin mode, [CR] on
* stdin causes a sample to be printed and wait for
* the next [CR]
* --no-histograms, -n don't print histogram data (reduces cardinality
* if you don't care about histograms)
* --sum-histogram-buckets, -s sum histogram bucket values
*
* To integrate into telegraf use one of:
* 1. the `inputs.execd` plugin with the `--execd` option
* 2. the `inputs.exec` plugin to simply run with no options
*
* NOTE: libzfs is an unstable interface. YMMV.
*
* The design goals of this software include:
* + be as lightweight as possible
* + reduce the number of external dependencies as far as possible, hence
* there is no dependency on a client library for managing the metric
* collection -- info is printed, KISS
* + broken pools or kernel bugs can cause this process to hang in an
* unkillable state. For this reason, it is best to keep the damage limited
* to a small process like zpool_influxdb rather than a larger collector.
*
* Copyright 2018-2020 Richard Elling
*
* This software is dual-licensed MIT and CDDL.
*
* The MIT License (MIT)
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*
* CDDL HEADER START
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License (the "License").
* You may not use this file except in compliance with the License.
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License Version 1.0 (CDDL-1.0).
* You can obtain a copy of the license from the top-level file
* "OPENSOLARIS.LICENSE" or at <http://opensource.org/licenses/CDDL-1.0>.
* You may not use this file except in compliance with the license.
*
* See the License for the specific language governing permissions
* and limitations under the License.
*
* CDDL HEADER END
*/
#include <string.h>
#include <getopt.h>
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <libzfs_impl.h>
#define POOL_MEASUREMENT "zpool_stats"
#define SCAN_MEASUREMENT "zpool_scan_stats"
#define VDEV_MEASUREMENT "zpool_vdev_stats"
#define POOL_LATENCY_MEASUREMENT "zpool_latency"
#define POOL_QUEUE_MEASUREMENT "zpool_vdev_queue"
#define MIN_LAT_INDEX 10 /* minimum latency index 10 = 1024ns */
#define POOL_IO_SIZE_MEASUREMENT "zpool_io_size"
#define MIN_SIZE_INDEX 9 /* minimum size index 9 = 512 bytes */
/* global options */
int execd_mode = 0;
int no_histograms = 0;
int sum_histogram_buckets = 0;
char metric_data_type = 'u';
uint64_t metric_value_mask = UINT64_MAX;
uint64_t timestamp = 0;
int complained_about_sync = 0;
char *tags = "";
typedef int (*stat_printer_f)(nvlist_t *, const char *, const char *);
/*
* influxdb line protocol rules for escaping are important because the
* zpool name can include characters that need to be escaped
*
* caller is responsible for freeing result
*/
static char *
escape_string(char *s)
{
char *c, *d;
char *t = (char *)malloc(ZFS_MAX_DATASET_NAME_LEN * 2);
if (t == NULL) {
fprintf(stderr, "error: cannot allocate memory\n");
exit(1);
}
for (c = s, d = t; *c != '\0'; c++, d++) {
switch (*c) {
case ' ':
case ',':
case '=':
case '\\':
*d++ = '\\';
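/* FALLTHROUGH: emit the escape, then copy the character itself */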
default:
*d = *c;
}
}
*d = '\0';
return (t);
}
/*
* print key=value where value is a uint64_t
*/
static void
print_kv(char *key, uint64_t value)
{
printf("%s=%llu%c", key,
(u_longlong_t)value & metric_value_mask, metric_data_type);
}
/*
* print_scan_status() prints the details as often seen in the "zpool status"
* output. However, unlike the zpool command, which is intended for humans,
* this output is suitable for long-term tracking in influxdb.
* TODO: update to include issued scan data
*/
static int
print_scan_status(nvlist_t *nvroot, const char *pool_name)
{
uint_t c;
int64_t elapsed;
uint64_t examined, pass_exam, paused_time, paused_ts, rate;
uint64_t remaining_time;
pool_scan_stat_t *ps = NULL;
double pct_done;
char *state[DSS_NUM_STATES] = {
"none", "scanning", "finished", "canceled"};
char *func;
(void) nvlist_lookup_uint64_array(nvroot,
ZPOOL_CONFIG_SCAN_STATS,
(uint64_t **)&ps, &c);
/*
* ignore if there are no stats
*/
if (ps == NULL)
return (0);
/*
* return error if state is bogus
*/
if (ps->pss_state >= DSS_NUM_STATES ||
ps->pss_func >= POOL_SCAN_FUNCS) {
if (complained_about_sync % 1000 == 0) {
fprintf(stderr, "error: cannot decode scan stats: "
"ZFS is out of sync with compiled zpool_influxdb");
complained_about_sync++;
}
return (1);
}
switch (ps->pss_func) {
case POOL_SCAN_NONE:
func = "none_requested";
break;
case POOL_SCAN_SCRUB:
func = "scrub";
break;
case POOL_SCAN_RESILVER:
func = "resilver";
break;
#ifdef POOL_SCAN_REBUILD
case POOL_SCAN_REBUILD:
func = "rebuild";
break;
#endif
default:
func = "scan";
}
/* overall progress */
examined = ps->pss_examined ? ps->pss_examined : 1;
pct_done = 0.0;
if (ps->pss_to_examine > 0)
pct_done = 100.0 * examined / ps->pss_to_examine;
#ifdef EZFS_SCRUB_PAUSED
paused_ts = ps->pss_pass_scrub_pause;
paused_time = ps->pss_pass_scrub_spent_paused;
#else
paused_ts = 0;
paused_time = 0;
#endif
/* calculations for this pass */
if (ps->pss_state == DSS_SCANNING) {
elapsed = (int64_t)time(NULL) - (int64_t)ps->pss_pass_start -
(int64_t)paused_time;
elapsed = (elapsed > 0) ? elapsed : 1;
pass_exam = ps->pss_pass_exam ? ps->pss_pass_exam : 1;
rate = pass_exam / elapsed;
rate = (rate > 0) ? rate : 1;
remaining_time = (ps->pss_to_examine - examined) / rate;
} else {
elapsed =
(int64_t)ps->pss_end_time - (int64_t)ps->pss_pass_start -
(int64_t)paused_time;
elapsed = (elapsed > 0) ? elapsed : 1;
pass_exam = ps->pss_pass_exam ? ps->pss_pass_exam : 1;
rate = pass_exam / elapsed;
remaining_time = 0;
}
rate = rate ? rate : 1;
/* influxdb line protocol format: "tags metrics timestamp" */
printf("%s%s,function=%s,name=%s,state=%s ",
SCAN_MEASUREMENT, tags, func, pool_name, state[ps->pss_state]);
print_kv("end_ts", ps->pss_end_time);
print_kv(",errors", ps->pss_errors);
print_kv(",examined", examined);
print_kv(",issued", ps->pss_issued);
print_kv(",pass_examined", pass_exam);
print_kv(",pass_issued", ps->pss_pass_issued);
print_kv(",paused_ts", paused_ts);
print_kv(",paused_t", paused_time);
printf(",pct_done=%.2f", pct_done);
print_kv(",processed", ps->pss_processed);
print_kv(",rate", rate);
print_kv(",remaining_t", remaining_time);
print_kv(",start_ts", ps->pss_start_time);
print_kv(",to_examine", ps->pss_to_examine);
print_kv(",to_process", ps->pss_to_process);
printf(" %llu\n", (u_longlong_t)timestamp);
return (0);
}
/*
* get a vdev name that corresponds to the top-level vdev names
* printed by `zpool status`
*/
static char *
get_vdev_name(nvlist_t *nvroot, const char *parent_name)
{
static char vdev_name[256];
char *vdev_type = NULL;
uint64_t vdev_id = 0;
if (nvlist_lookup_string(nvroot, ZPOOL_CONFIG_TYPE,
&vdev_type) != 0) {
vdev_type = "unknown";
}
if (nvlist_lookup_uint64(
nvroot, ZPOOL_CONFIG_ID, &vdev_id) != 0) {
vdev_id = UINT64_MAX;
}
if (parent_name == NULL) {
(void) snprintf(vdev_name, sizeof (vdev_name), "%s",
vdev_type);
} else {
(void) snprintf(vdev_name, sizeof (vdev_name),
"%s/%s-%llu",
parent_name, vdev_type, (u_longlong_t)vdev_id);
}
return (vdev_name);
}
/*
* get a string suitable for an influxdb tag that describes this vdev
*
* By default only the vdev hierarchical name is shown, separated by '/'
* If the vdev has an associated path, which is typical of leaf vdevs,
* then the path is added.
* It would be nice to have the devid instead of the path, but under
* Linux we cannot be sure a devid will exist and we'd rather have
* something than nothing, so we'll use path instead.
*/
static char *
get_vdev_desc(nvlist_t *nvroot, const char *parent_name)
{
static char vdev_desc[2 * MAXPATHLEN];
char *vdev_type = NULL;
uint64_t vdev_id = 0;
char vdev_value[MAXPATHLEN];
char *vdev_path = NULL;
char *s, *t;
if (nvlist_lookup_string(nvroot, ZPOOL_CONFIG_TYPE, &vdev_type) != 0) {
vdev_type = "unknown";
}
if (nvlist_lookup_uint64(nvroot, ZPOOL_CONFIG_ID, &vdev_id) != 0) {
vdev_id = UINT64_MAX;
}
if (nvlist_lookup_string(
nvroot, ZPOOL_CONFIG_PATH, &vdev_path) != 0) {
vdev_path = NULL;
}
if (parent_name == NULL) {
s = escape_string(vdev_type);
(void) snprintf(vdev_value, sizeof (vdev_value), "vdev=%s", s);
free(s);
} else {
s = escape_string((char *)parent_name);
t = escape_string(vdev_type);
(void) snprintf(vdev_value, sizeof (vdev_value),
"vdev=%s/%s-%llu", s, t, (u_longlong_t)vdev_id);
free(s);
free(t);
}
if (vdev_path == NULL) {
(void) snprintf(vdev_desc, sizeof (vdev_desc), "%s",
vdev_value);
} else {
s = escape_string(vdev_path);
(void) snprintf(vdev_desc, sizeof (vdev_desc), "path=%s,%s",
s, vdev_value);
free(s);
}
return (vdev_desc);
}
/*
* vdev summary stats are a combination of the data shown by
* `zpool status` and `zpool list -v`
*/
static int
print_summary_stats(nvlist_t *nvroot, const char *pool_name,
const char *parent_name)
{
uint_t c;
vdev_stat_t *vs;
char *vdev_desc = NULL;
vdev_desc = get_vdev_desc(nvroot, parent_name);
if (nvlist_lookup_uint64_array(nvroot, ZPOOL_CONFIG_VDEV_STATS,
(uint64_t **)&vs, &c) != 0) {
return (1);
}
printf("%s%s,name=%s,state=%s,%s ", POOL_MEASUREMENT, tags,
pool_name, zpool_state_to_name((vdev_state_t)vs->vs_state,
(vdev_aux_t)vs->vs_aux), vdev_desc);
print_kv("alloc", vs->vs_alloc);
print_kv(",free", vs->vs_space - vs->vs_alloc);
print_kv(",size", vs->vs_space);
print_kv(",read_bytes", vs->vs_bytes[ZIO_TYPE_READ]);
print_kv(",read_errors", vs->vs_read_errors);
print_kv(",read_ops", vs->vs_ops[ZIO_TYPE_READ]);
print_kv(",write_bytes", vs->vs_bytes[ZIO_TYPE_WRITE]);
print_kv(",write_errors", vs->vs_write_errors);
print_kv(",write_ops", vs->vs_ops[ZIO_TYPE_WRITE]);
print_kv(",checksum_errors", vs->vs_checksum_errors);
print_kv(",fragmentation", vs->vs_fragmentation);
printf(" %llu\n", (u_longlong_t)timestamp);
return (0);
}
/*
* vdev latency stats are histograms stored as nvlist arrays of uint64.
* Latency stats include the ZIO scheduler classes plus lower-level
* vdev latencies.
*
* In many cases, the top-level "root" view obscures the underlying
* top-level vdev operations. For example, if a pool has a log, special,
* or cache device, then each can behave very differently. It is useful
* to see how each is responding.
*/
static int
print_vdev_latency_stats(nvlist_t *nvroot, const char *pool_name,
const char *parent_name)
{
uint_t c, end = 0;
nvlist_t *nv_ex;
char *vdev_desc = NULL;
/* short_names become part of the metric name and are influxdb-ready */
struct lat_lookup {
char *name;
char *short_name;
uint64_t sum;
uint64_t *array;
};
struct lat_lookup lat_type[] = {
{ZPOOL_CONFIG_VDEV_TOT_R_LAT_HISTO, "total_read", 0},
{ZPOOL_CONFIG_VDEV_TOT_W_LAT_HISTO, "total_write", 0},
{ZPOOL_CONFIG_VDEV_DISK_R_LAT_HISTO, "disk_read", 0},
{ZPOOL_CONFIG_VDEV_DISK_W_LAT_HISTO, "disk_write", 0},
{ZPOOL_CONFIG_VDEV_SYNC_R_LAT_HISTO, "sync_read", 0},
{ZPOOL_CONFIG_VDEV_SYNC_W_LAT_HISTO, "sync_write", 0},
{ZPOOL_CONFIG_VDEV_ASYNC_R_LAT_HISTO, "async_read", 0},
{ZPOOL_CONFIG_VDEV_ASYNC_W_LAT_HISTO, "async_write", 0},
{ZPOOL_CONFIG_VDEV_SCRUB_LAT_HISTO, "scrub", 0},
#ifdef ZPOOL_CONFIG_VDEV_TRIM_LAT_HISTO
{ZPOOL_CONFIG_VDEV_TRIM_LAT_HISTO, "trim", 0},
#endif
{NULL, NULL}
};
if (nvlist_lookup_nvlist(nvroot,
ZPOOL_CONFIG_VDEV_STATS_EX, &nv_ex) != 0) {
return (6);
}
vdev_desc = get_vdev_desc(nvroot, parent_name);
for (int i = 0; lat_type[i].name; i++) {
if (nvlist_lookup_uint64_array(nv_ex,
lat_type[i].name, &lat_type[i].array, &c) != 0) {
fprintf(stderr, "error: can't get %s\n",
lat_type[i].name);
return (3);
}
/* end count; all of the arrays are the same size */
end = c - 1;
}
for (int bucket = 0; bucket <= end; bucket++) {
if (bucket < MIN_LAT_INDEX) {
/* don't print, but collect the sum */
for (int i = 0; lat_type[i].name; i++) {
lat_type[i].sum += lat_type[i].array[bucket];
}
continue;
}
if (bucket < end) {
printf("%s%s,le=%0.6f,name=%s,%s ",
POOL_LATENCY_MEASUREMENT, tags,
(float)(1ULL << bucket) * 1e-9,
pool_name, vdev_desc);
} else {
printf("%s%s,le=+Inf,name=%s,%s ",
POOL_LATENCY_MEASUREMENT, tags, pool_name,
vdev_desc);
}
for (int i = 0; lat_type[i].name; i++) {
if (bucket <= MIN_LAT_INDEX || sum_histogram_buckets) {
lat_type[i].sum += lat_type[i].array[bucket];
} else {
lat_type[i].sum = lat_type[i].array[bucket];
}
print_kv(lat_type[i].short_name, lat_type[i].sum);
if (lat_type[i + 1].name != NULL) {
printf(",");
}
}
printf(" %llu\n", (u_longlong_t)timestamp);
}
return (0);
}
/*
* vdev request size stats are histograms stored as nvlist arrays of uint64.
* Request size stats include the ZIO scheduler classes plus lower-level
* vdev sizes. Both independent (ind) and aggregated (agg) sizes are reported.
*
* In many cases, the top-level "root" view obscures the underlying
* top-level vdev operations. For example, if a pool has a log, special,
* or cache device, then each can behave very differently. It is useful
* to see how each is responding.
*/
static int
print_vdev_size_stats(nvlist_t *nvroot, const char *pool_name,
const char *parent_name)
{
uint_t c, end = 0;
nvlist_t *nv_ex;
char *vdev_desc = NULL;
/* short_names become the field name */
struct size_lookup {
char *name;
char *short_name;
uint64_t sum;
uint64_t *array;
};
struct size_lookup size_type[] = {
{ZPOOL_CONFIG_VDEV_SYNC_IND_R_HISTO, "sync_read_ind"},
{ZPOOL_CONFIG_VDEV_SYNC_IND_W_HISTO, "sync_write_ind"},
{ZPOOL_CONFIG_VDEV_ASYNC_IND_R_HISTO, "async_read_ind"},
{ZPOOL_CONFIG_VDEV_ASYNC_IND_W_HISTO, "async_write_ind"},
{ZPOOL_CONFIG_VDEV_IND_SCRUB_HISTO, "scrub_read_ind"},
{ZPOOL_CONFIG_VDEV_SYNC_AGG_R_HISTO, "sync_read_agg"},
{ZPOOL_CONFIG_VDEV_SYNC_AGG_W_HISTO, "sync_write_agg"},
{ZPOOL_CONFIG_VDEV_ASYNC_AGG_R_HISTO, "async_read_agg"},
{ZPOOL_CONFIG_VDEV_ASYNC_AGG_W_HISTO, "async_write_agg"},
{ZPOOL_CONFIG_VDEV_AGG_SCRUB_HISTO, "scrub_read_agg"},
#ifdef ZPOOL_CONFIG_VDEV_IND_TRIM_HISTO
{ZPOOL_CONFIG_VDEV_IND_TRIM_HISTO, "trim_write_ind"},
{ZPOOL_CONFIG_VDEV_AGG_TRIM_HISTO, "trim_write_agg"},
#endif
{NULL, NULL}
};
if (nvlist_lookup_nvlist(nvroot,
ZPOOL_CONFIG_VDEV_STATS_EX, &nv_ex) != 0) {
return (6);
}
vdev_desc = get_vdev_desc(nvroot, parent_name);
for (int i = 0; size_type[i].name; i++) {
if (nvlist_lookup_uint64_array(nv_ex, size_type[i].name,
&size_type[i].array, &c) != 0) {
fprintf(stderr, "error: can't get %s\n",
size_type[i].name);
return (3);
}
/* end count; all of the arrays are the same size */
end = c - 1;
}
for (int bucket = 0; bucket <= end; bucket++) {
if (bucket < MIN_SIZE_INDEX) {
/* don't print, but collect the sum */
for (int i = 0; size_type[i].name; i++) {
size_type[i].sum += size_type[i].array[bucket];
}
continue;
}
if (bucket < end) {
printf("%s%s,le=%llu,name=%s,%s ",
POOL_IO_SIZE_MEASUREMENT, tags, 1ULL << bucket,
pool_name, vdev_desc);
} else {
printf("%s%s,le=+Inf,name=%s,%s ",
POOL_IO_SIZE_MEASUREMENT, tags, pool_name,
vdev_desc);
}
for (int i = 0; size_type[i].name; i++) {
if (bucket <= MIN_SIZE_INDEX || sum_histogram_buckets) {
size_type[i].sum += size_type[i].array[bucket];
} else {
size_type[i].sum = size_type[i].array[bucket];
}
print_kv(size_type[i].short_name, size_type[i].sum);
if (size_type[i + 1].name != NULL) {
printf(",");
}
}
printf(" %llu\n", (u_longlong_t)timestamp);
}
return (0);
}
/*
* ZIO scheduler queue stats are stored as gauges. This is unfortunate
* because the values can change very rapidly and any point-in-time
* value will quickly be obsoleted. It is also not easy to downsample.
* Thus only the top-level queue stats might be beneficial... maybe.
*/
static int
print_queue_stats(nvlist_t *nvroot, const char *pool_name,
const char *parent_name)
{
nvlist_t *nv_ex;
uint64_t value;
/* short_names are used for the field name */
struct queue_lookup {
char *name;
char *short_name;
};
struct queue_lookup queue_type[] = {
{ZPOOL_CONFIG_VDEV_SYNC_R_ACTIVE_QUEUE, "sync_r_active"},
{ZPOOL_CONFIG_VDEV_SYNC_W_ACTIVE_QUEUE, "sync_w_active"},
{ZPOOL_CONFIG_VDEV_ASYNC_R_ACTIVE_QUEUE, "async_r_active"},
{ZPOOL_CONFIG_VDEV_ASYNC_W_ACTIVE_QUEUE, "async_w_active"},
{ZPOOL_CONFIG_VDEV_SCRUB_ACTIVE_QUEUE, "async_scrub_active"},
{ZPOOL_CONFIG_VDEV_SYNC_R_PEND_QUEUE, "sync_r_pend"},
{ZPOOL_CONFIG_VDEV_SYNC_W_PEND_QUEUE, "sync_w_pend"},
{ZPOOL_CONFIG_VDEV_ASYNC_R_PEND_QUEUE, "async_r_pend"},
{ZPOOL_CONFIG_VDEV_ASYNC_W_PEND_QUEUE, "async_w_pend"},
{ZPOOL_CONFIG_VDEV_SCRUB_PEND_QUEUE, "async_scrub_pend"},
{NULL, NULL}
};
if (nvlist_lookup_nvlist(nvroot,
ZPOOL_CONFIG_VDEV_STATS_EX, &nv_ex) != 0) {
return (6);
}
printf("%s%s,name=%s,%s ", POOL_QUEUE_MEASUREMENT, tags, pool_name,
get_vdev_desc(nvroot, parent_name));
for (int i = 0; queue_type[i].name; i++) {
if (nvlist_lookup_uint64(nv_ex,
queue_type[i].name, &value) != 0) {
fprintf(stderr, "error: can't get %s\n",
queue_type[i].name);
return (3);
}
print_kv(queue_type[i].short_name, value);
if (queue_type[i + 1].name != NULL) {
printf(",");
}
}
printf(" %llu\n", (u_longlong_t)timestamp);
return (0);
}
/*
* top-level vdev stats are at the pool level
*/
static int
print_top_level_vdev_stats(nvlist_t *nvroot, const char *pool_name)
{
nvlist_t *nv_ex;
uint64_t value;
/* short_names become part of the metric name */
struct queue_lookup {
char *name;
char *short_name;
};
struct queue_lookup queue_type[] = {
{ZPOOL_CONFIG_VDEV_SYNC_R_ACTIVE_QUEUE, "sync_r_active_queue"},
{ZPOOL_CONFIG_VDEV_SYNC_W_ACTIVE_QUEUE, "sync_w_active_queue"},
{ZPOOL_CONFIG_VDEV_ASYNC_R_ACTIVE_QUEUE, "async_r_active_queue"},
{ZPOOL_CONFIG_VDEV_ASYNC_W_ACTIVE_QUEUE, "async_w_active_queue"},
{ZPOOL_CONFIG_VDEV_SCRUB_ACTIVE_QUEUE, "async_scrub_active_queue"},
{ZPOOL_CONFIG_VDEV_SYNC_R_PEND_QUEUE, "sync_r_pend_queue"},
{ZPOOL_CONFIG_VDEV_SYNC_W_PEND_QUEUE, "sync_w_pend_queue"},
{ZPOOL_CONFIG_VDEV_ASYNC_R_PEND_QUEUE, "async_r_pend_queue"},
{ZPOOL_CONFIG_VDEV_ASYNC_W_PEND_QUEUE, "async_w_pend_queue"},
{ZPOOL_CONFIG_VDEV_SCRUB_PEND_QUEUE, "async_scrub_pend_queue"},
{NULL, NULL}
};
if (nvlist_lookup_nvlist(nvroot,
ZPOOL_CONFIG_VDEV_STATS_EX, &nv_ex) != 0) {
return (6);
}
printf("%s%s,name=%s,vdev=root ", VDEV_MEASUREMENT, tags,
pool_name);
for (int i = 0; queue_type[i].name; i++) {
if (nvlist_lookup_uint64(nv_ex,
queue_type[i].name, &value) != 0) {
fprintf(stderr, "error: can't get %s\n",
queue_type[i].name);
return (3);
}
if (i > 0)
printf(",");
print_kv(queue_type[i].short_name, value);
}
printf(" %llu\n", (u_longlong_t)timestamp);
return (0);
}
/*
* recursive stats printer
*/
static int
print_recursive_stats(stat_printer_f func, nvlist_t *nvroot,
const char *pool_name, const char *parent_name, int descend)
{
uint_t c, children;
nvlist_t **child;
char vdev_name[256];
int err;
err = func(nvroot, pool_name, parent_name);
if (err)
return (err);
if (descend && nvlist_lookup_nvlist_array(nvroot, ZPOOL_CONFIG_CHILDREN,
&child, &children) == 0) {
(void) strncpy(vdev_name, get_vdev_name(nvroot, parent_name),
sizeof (vdev_name));
vdev_name[sizeof (vdev_name) - 1] = '\0';
for (c = 0; c < children; c++) {
print_recursive_stats(func, child[c], pool_name,
vdev_name, descend);
}
}
return (0);
}
/*
* call-back to print the stats from the pool config
*
* Note: if the pool is broken, this can hang indefinitely and perhaps in an
* unkillable state.
*/
static int
print_stats(zpool_handle_t *zhp, void *data)
{
uint_t c;
int err;
boolean_t missing;
nvlist_t *config, *nvroot;
vdev_stat_t *vs;
struct timespec tv;
char *pool_name;
/* if not this pool return quickly */
if (data &&
strncmp(data, zhp->zpool_name, ZFS_MAX_DATASET_NAME_LEN) != 0) {
zpool_close(zhp);
return (0);
}
if (zpool_refresh_stats(zhp, &missing) != 0) {
zpool_close(zhp);
return (1);
}
config = zpool_get_config(zhp, NULL);
if (clock_gettime(CLOCK_REALTIME, &tv) != 0)
timestamp = (uint64_t)time(NULL) * 1000000000;
else
timestamp =
((uint64_t)tv.tv_sec * 1000000000) + (uint64_t)tv.tv_nsec;
if (nvlist_lookup_nvlist(
config, ZPOOL_CONFIG_VDEV_TREE, &nvroot) != 0) {
zpool_close(zhp);
return (2);
}
if (nvlist_lookup_uint64_array(nvroot, ZPOOL_CONFIG_VDEV_STATS,
(uint64_t **)&vs, &c) != 0) {
zpool_close(zhp);
return (3);
}
pool_name = escape_string(zhp->zpool_name);
err = print_recursive_stats(print_summary_stats, nvroot,
pool_name, NULL, 1);
/* if any of these return an error, skip the rest */
if (err == 0)
err = print_top_level_vdev_stats(nvroot, pool_name);
if (no_histograms == 0) {
if (err == 0)
err = print_recursive_stats(print_vdev_latency_stats, nvroot,
pool_name, NULL, 1);
if (err == 0)
err = print_recursive_stats(print_vdev_size_stats, nvroot,
pool_name, NULL, 1);
if (err == 0)
err = print_recursive_stats(print_queue_stats, nvroot,
pool_name, NULL, 0);
}
if (err == 0)
err = print_scan_status(nvroot, pool_name);
free(pool_name);
zpool_close(zhp);
return (err);
}
static void
usage(char *name)
{
fprintf(stderr, "usage: %s [--execd][--no-histograms]"
"[--sum-histogram-buckets] [--signed-int] [poolname]\n", name);
exit(EXIT_FAILURE);
}
int
main(int argc, char *argv[])
{
int opt;
int ret = 8;
char *line = NULL;
size_t len = 0, tagslen = 0;
struct option long_options[] = {
{"execd", no_argument, NULL, 'e'},
{"help", no_argument, NULL, 'h'},
{"no-histograms", no_argument, NULL, 'n'},
{"signed-int", no_argument, NULL, 'i'},
{"sum-histogram-buckets", no_argument, NULL, 's'},
{"tags", required_argument, NULL, 't'},
{0, 0, 0, 0}
};
while ((opt = getopt_long(
argc, argv, "ehinst:", long_options, NULL)) != -1) {
switch (opt) {
case 'e':
execd_mode = 1;
break;
case 'i':
metric_data_type = 'i';
metric_value_mask = INT64_MAX;
break;
case 'n':
no_histograms = 1;
break;
case 's':
sum_histogram_buckets = 1;
break;
case 't':
tagslen = strlen(optarg) + 2;
tags = calloc(tagslen, 1);
if (tags == NULL) {
fprintf(stderr,
"error: cannot allocate memory "
"for tags\n");
exit(1);
}
(void) snprintf(tags, tagslen, ",%s", optarg);
break;
default:
usage(argv[0]);
}
}
libzfs_handle_t *g_zfs;
if ((g_zfs = libzfs_init()) == NULL) {
fprintf(stderr,
"error: cannot initialize libzfs. "
"Is the zfs module loaded or zrepl running?\n");
exit(EXIT_FAILURE);
}
if (execd_mode == 0) {
ret = zpool_iter(g_zfs, print_stats, argv[optind]);
return (ret);
}
while (getline(&line, &len, stdin) != -1) {
ret = zpool_iter(g_zfs, print_stats, argv[optind]);
fflush(stdout);
}
return (ret);
}

View File

@ -421,7 +421,7 @@ int
zstream_do_redup(int argc, char *argv[]) zstream_do_redup(int argc, char *argv[])
{ {
boolean_t verbose = B_FALSE; boolean_t verbose = B_FALSE;
char c; int c;
while ((c = getopt(argc, argv, "v")) != -1) { while ((c = getopt(argc, argv, "v")) != -1) {
switch (c) { switch (c) {

View File

@ -104,6 +104,7 @@
#include <sys/zio.h> #include <sys/zio.h>
#include <sys/zil.h> #include <sys/zil.h>
#include <sys/zil_impl.h> #include <sys/zil_impl.h>
#include <sys/vdev_draid.h>
#include <sys/vdev_impl.h> #include <sys/vdev_impl.h>
#include <sys/vdev_file.h> #include <sys/vdev_file.h>
#include <sys/vdev_initialize.h> #include <sys/vdev_initialize.h>
@ -167,8 +168,11 @@ typedef struct ztest_shared_opts {
size_t zo_vdev_size; size_t zo_vdev_size;
int zo_ashift; int zo_ashift;
int zo_mirrors; int zo_mirrors;
int zo_raidz; int zo_raid_children;
int zo_raidz_parity; int zo_raid_parity;
char zo_raid_type[8];
int zo_draid_data;
int zo_draid_spares;
int zo_datasets; int zo_datasets;
int zo_threads; int zo_threads;
uint64_t zo_passtime; uint64_t zo_passtime;
@ -191,9 +195,12 @@ static const ztest_shared_opts_t ztest_opts_defaults = {
.zo_vdevs = 5, .zo_vdevs = 5,
.zo_ashift = SPA_MINBLOCKSHIFT, .zo_ashift = SPA_MINBLOCKSHIFT,
.zo_mirrors = 2, .zo_mirrors = 2,
.zo_raidz = 4, .zo_raid_children = 4,
.zo_raidz_parity = 1, .zo_raid_parity = 1,
.zo_raid_type = VDEV_TYPE_RAIDZ,
.zo_vdev_size = SPA_MINDEVSIZE * 4, /* 256m default size */ .zo_vdev_size = SPA_MINDEVSIZE * 4, /* 256m default size */
.zo_draid_data = 4, /* data drives */
.zo_draid_spares = 1, /* distributed spares */
.zo_datasets = 7, .zo_datasets = 7,
.zo_threads = 23, .zo_threads = 23,
.zo_passtime = 60, /* 60 seconds */ .zo_passtime = 60, /* 60 seconds */
@ -232,7 +239,7 @@ static ztest_shared_ds_t *ztest_shared_ds;
#define BT_MAGIC 0x123456789abcdefULL #define BT_MAGIC 0x123456789abcdefULL
#define MAXFAULTS(zs) \ #define MAXFAULTS(zs) \
(MAX((zs)->zs_mirrors, 1) * (ztest_opts.zo_raidz_parity + 1) - 1) (MAX((zs)->zs_mirrors, 1) * (ztest_opts.zo_raid_parity + 1) - 1)
enum ztest_io_type { enum ztest_io_type {
ZTEST_IO_WRITE_TAG, ZTEST_IO_WRITE_TAG,
@ -689,8 +696,11 @@ usage(boolean_t requested)
"\t[-s size_of_each_vdev (default: %s)]\n" "\t[-s size_of_each_vdev (default: %s)]\n"
"\t[-a alignment_shift (default: %d)] use 0 for random\n" "\t[-a alignment_shift (default: %d)] use 0 for random\n"
"\t[-m mirror_copies (default: %d)]\n" "\t[-m mirror_copies (default: %d)]\n"
"\t[-r raidz_disks (default: %d)]\n" "\t[-r raidz_disks / draid_disks (default: %d)]\n"
"\t[-R raidz_parity (default: %d)]\n" "\t[-R raid_parity (default: %d)]\n"
"\t[-K raid_kind (default: random)] raidz|draid|random\n"
"\t[-D draid_data (default: %d)] in config\n"
"\t[-S draid_spares (default: %d)]\n"
"\t[-d datasets (default: %d)]\n" "\t[-d datasets (default: %d)]\n"
"\t[-t threads (default: %d)]\n" "\t[-t threads (default: %d)]\n"
"\t[-g gang_block_threshold (default: %s)]\n" "\t[-g gang_block_threshold (default: %s)]\n"
@ -716,8 +726,10 @@ usage(boolean_t requested)
nice_vdev_size, /* -s */ nice_vdev_size, /* -s */
zo->zo_ashift, /* -a */ zo->zo_ashift, /* -a */
zo->zo_mirrors, /* -m */ zo->zo_mirrors, /* -m */
zo->zo_raidz, /* -r */ zo->zo_raid_children, /* -r */
zo->zo_raidz_parity, /* -R */ zo->zo_raid_parity, /* -R */
zo->zo_draid_data, /* -D */
zo->zo_draid_spares, /* -S */
zo->zo_datasets, /* -d */ zo->zo_datasets, /* -d */
zo->zo_threads, /* -t */ zo->zo_threads, /* -t */
nice_force_ganging, /* -g */ nice_force_ganging, /* -g */
@ -731,6 +743,21 @@ usage(boolean_t requested)
exit(requested ? 0 : 1); exit(requested ? 0 : 1);
} }
static uint64_t
ztest_random(uint64_t range)
{
uint64_t r;
ASSERT3S(ztest_fd_rand, >=, 0);
if (range == 0)
return (0);
if (read(ztest_fd_rand, &r, sizeof (r)) != sizeof (r))
fatal(1, "short read from /dev/urandom");
return (r % range);
}
static void static void
ztest_parse_name_value(const char *input, ztest_shared_opts_t *zo) ztest_parse_name_value(const char *input, ztest_shared_opts_t *zo)
@ -780,11 +807,12 @@ process_options(int argc, char **argv)
int opt; int opt;
uint64_t value; uint64_t value;
char altdir[MAXNAMELEN] = { 0 }; char altdir[MAXNAMELEN] = { 0 };
char raid_kind[8] = { "random" };
bcopy(&ztest_opts_defaults, zo, sizeof (*zo)); bcopy(&ztest_opts_defaults, zo, sizeof (*zo));
while ((opt = getopt(argc, argv, while ((opt = getopt(argc, argv,
"v:s:a:m:r:R:d:t:g:i:k:p:f:MVET:P:hF:B:C:o:G")) != EOF) { "v:s:a:m:r:R:K:D:S:d:t:g:i:k:p:f:MVET:P:hF:B:C:o:G")) != EOF) {
value = 0; value = 0;
switch (opt) { switch (opt) {
case 'v': case 'v':
@ -793,6 +821,8 @@ process_options(int argc, char **argv)
case 'm': case 'm':
case 'r': case 'r':
case 'R': case 'R':
case 'D':
case 'S':
case 'd': case 'd':
case 't': case 't':
case 'g': case 'g':
@ -817,10 +847,19 @@ process_options(int argc, char **argv)
zo->zo_mirrors = value; zo->zo_mirrors = value;
break; break;
case 'r': case 'r':
zo->zo_raidz = MAX(1, value); zo->zo_raid_children = MAX(1, value);
break; break;
case 'R': case 'R':
zo->zo_raidz_parity = MIN(MAX(value, 1), 3); zo->zo_raid_parity = MIN(MAX(value, 1), 3);
break;
case 'K':
(void) strlcpy(raid_kind, optarg, sizeof (raid_kind));
break;
case 'D':
zo->zo_draid_data = MAX(1, value);
break;
case 'S':
zo->zo_draid_spares = MAX(1, value);
break; break;
case 'd': case 'd':
zo->zo_datasets = MAX(1, value); zo->zo_datasets = MAX(1, value);
@ -895,7 +934,54 @@ process_options(int argc, char **argv)
} }
} }
zo->zo_raidz_parity = MIN(zo->zo_raidz_parity, zo->zo_raidz - 1); /* When raid choice is 'random' add a draid pool 50% of the time */
if (strcmp(raid_kind, "random") == 0) {
(void) strlcpy(raid_kind, (ztest_random(2) == 0) ?
"draid" : "raidz", sizeof (raid_kind));
if (ztest_opts.zo_verbose >= 3)
(void) printf("choosing RAID type '%s'\n", raid_kind);
}
if (strcmp(raid_kind, "draid") == 0) {
uint64_t min_devsize;
/* With fewer disks use 256M, otherwise 128M is OK */
min_devsize = (ztest_opts.zo_raid_children < 16) ?
(256ULL << 20) : (128ULL << 20);
/* No top-level mirrors with dRAID for now */
zo->zo_mirrors = 0;
/* Use more appropriate defaults for dRAID */
if (zo->zo_vdevs == ztest_opts_defaults.zo_vdevs)
zo->zo_vdevs = 1;
if (zo->zo_raid_children ==
ztest_opts_defaults.zo_raid_children)
zo->zo_raid_children = 16;
if (zo->zo_ashift < 12)
zo->zo_ashift = 12;
if (zo->zo_vdev_size < min_devsize)
zo->zo_vdev_size = min_devsize;
if (zo->zo_draid_data + zo->zo_raid_parity >
zo->zo_raid_children - zo->zo_draid_spares) {
(void) fprintf(stderr, "error: too few draid "
"children (%d) for stripe width (%d)\n",
zo->zo_raid_children,
zo->zo_draid_data + zo->zo_raid_parity);
usage(B_FALSE);
}
(void) strlcpy(zo->zo_raid_type, VDEV_TYPE_DRAID,
sizeof (zo->zo_raid_type));
} else /* using raidz */ {
ASSERT0(strcmp(raid_kind, "raidz"));
zo->zo_raid_parity = MIN(zo->zo_raid_parity,
zo->zo_raid_children - 1);
}
zo->zo_vdevtime = zo->zo_vdevtime =
(zo->zo_vdevs > 0 ? zo->zo_time * NANOSEC / zo->zo_vdevs : (zo->zo_vdevs > 0 ? zo->zo_time * NANOSEC / zo->zo_vdevs :
@ -966,22 +1052,6 @@ ztest_kill(ztest_shared_t *zs)
(void) kill(getpid(), SIGKILL); (void) kill(getpid(), SIGKILL);
} }
static uint64_t
ztest_random(uint64_t range)
{
uint64_t r;
ASSERT3S(ztest_fd_rand, >=, 0);
if (range == 0)
return (0);
if (read(ztest_fd_rand, &r, sizeof (r)) != sizeof (r))
fatal(1, "short read from /dev/urandom");
return (r % range);
}
/* ARGSUSED */ /* ARGSUSED */
static void static void
ztest_record_enospc(const char *s) ztest_record_enospc(const char *s)
@ -997,12 +1067,27 @@ ztest_get_ashift(void)
return (ztest_opts.zo_ashift); return (ztest_opts.zo_ashift);
} }
static boolean_t
ztest_is_draid_spare(const char *name)
{
uint64_t spare_id = 0, parity = 0, vdev_id = 0;
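/*
* Distributed spare names have the form draid<parity>-<vdev>-<spare>,
* e.g. "draid1-2-3" is spare 3 of the single-parity dRAID vdev 2.
*/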
if (sscanf(name, VDEV_TYPE_DRAID "%llu-%llu-%llu",
(u_longlong_t *)&parity, (u_longlong_t *)&vdev_id,
(u_longlong_t *)&spare_id) == 3) {
return (B_TRUE);
}
return (B_FALSE);
}
static nvlist_t * static nvlist_t *
make_vdev_file(char *path, char *aux, char *pool, size_t size, uint64_t ashift) make_vdev_file(char *path, char *aux, char *pool, size_t size, uint64_t ashift)
{ {
char *pathbuf; char *pathbuf;
uint64_t vdev; uint64_t vdev;
nvlist_t *file; nvlist_t *file;
boolean_t draid_spare = B_FALSE;
pathbuf = umem_alloc(MAXPATHLEN, UMEM_NOFAIL); pathbuf = umem_alloc(MAXPATHLEN, UMEM_NOFAIL);
@ -1024,9 +1109,11 @@ make_vdev_file(char *path, char *aux, char *pool, size_t size, uint64_t ashift)
ztest_dev_template, ztest_opts.zo_dir, ztest_dev_template, ztest_opts.zo_dir,
pool == NULL ? ztest_opts.zo_pool : pool, vdev); pool == NULL ? ztest_opts.zo_pool : pool, vdev);
} }
} else {
draid_spare = ztest_is_draid_spare(path);
} }
if (size != 0) { if (size != 0 && !draid_spare) {
int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0666); int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0666);
if (fd == -1) if (fd == -1)
fatal(1, "can't open %s", path); fatal(1, "can't open %s", path);
@ -1035,20 +1122,21 @@ make_vdev_file(char *path, char *aux, char *pool, size_t size, uint64_t ashift)
(void) close(fd); (void) close(fd);
} }
VERIFY(nvlist_alloc(&file, NV_UNIQUE_NAME, 0) == 0); VERIFY0(nvlist_alloc(&file, NV_UNIQUE_NAME, 0));
VERIFY(nvlist_add_string(file, ZPOOL_CONFIG_TYPE, VDEV_TYPE_FILE) == 0); VERIFY0(nvlist_add_string(file, ZPOOL_CONFIG_TYPE,
VERIFY(nvlist_add_string(file, ZPOOL_CONFIG_PATH, path) == 0); draid_spare ? VDEV_TYPE_DRAID_SPARE : VDEV_TYPE_FILE));
VERIFY(nvlist_add_uint64(file, ZPOOL_CONFIG_ASHIFT, ashift) == 0); VERIFY0(nvlist_add_string(file, ZPOOL_CONFIG_PATH, path));
VERIFY0(nvlist_add_uint64(file, ZPOOL_CONFIG_ASHIFT, ashift));
umem_free(pathbuf, MAXPATHLEN); umem_free(pathbuf, MAXPATHLEN);
return (file); return (file);
} }
static nvlist_t * static nvlist_t *
make_vdev_raidz(char *path, char *aux, char *pool, size_t size, make_vdev_raid(char *path, char *aux, char *pool, size_t size,
uint64_t ashift, int r) uint64_t ashift, int r)
{ {
nvlist_t *raidz, **child; nvlist_t *raid, **child;
int c; int c;
if (r < 2) if (r < 2)
@ -1058,20 +1146,41 @@ make_vdev_raidz(char *path, char *aux, char *pool, size_t size,
for (c = 0; c < r; c++) for (c = 0; c < r; c++)
child[c] = make_vdev_file(path, aux, pool, size, ashift); child[c] = make_vdev_file(path, aux, pool, size, ashift);
VERIFY(nvlist_alloc(&raidz, NV_UNIQUE_NAME, 0) == 0); VERIFY0(nvlist_alloc(&raid, NV_UNIQUE_NAME, 0));
VERIFY(nvlist_add_string(raidz, ZPOOL_CONFIG_TYPE, VERIFY0(nvlist_add_string(raid, ZPOOL_CONFIG_TYPE,
VDEV_TYPE_RAIDZ) == 0); ztest_opts.zo_raid_type));
VERIFY(nvlist_add_uint64(raidz, ZPOOL_CONFIG_NPARITY, VERIFY0(nvlist_add_uint64(raid, ZPOOL_CONFIG_NPARITY,
ztest_opts.zo_raidz_parity) == 0); ztest_opts.zo_raid_parity));
VERIFY(nvlist_add_nvlist_array(raidz, ZPOOL_CONFIG_CHILDREN, VERIFY0(nvlist_add_nvlist_array(raid, ZPOOL_CONFIG_CHILDREN,
child, r) == 0); child, r));
if (strcmp(ztest_opts.zo_raid_type, VDEV_TYPE_DRAID) == 0) {
uint64_t ndata = ztest_opts.zo_draid_data;
uint64_t nparity = ztest_opts.zo_raid_parity;
uint64_t nspares = ztest_opts.zo_draid_spares;
uint64_t children = ztest_opts.zo_raid_children;
uint64_t ngroups = 1;
/*
* Calculate the minimum number of groups required to fill a
* slice. This is the LCM of the stripe width (data + parity)
* and the number of data drives (children - spares).
*/
while (ngroups * (ndata + nparity) % (children - nspares) != 0)
ngroups++;
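/*
* For example, with the dRAID defaults above (4 data, 1 parity,
* 1 spare, 16 children): stripe width 5 and 15 data drives give
* ngroups = 3, since 3 * 5 = 15 = lcm(5, 15).
*/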
/* Store the basic dRAID configuration. */
fnvlist_add_uint64(raid, ZPOOL_CONFIG_DRAID_NDATA, ndata);
fnvlist_add_uint64(raid, ZPOOL_CONFIG_DRAID_NSPARES, nspares);
fnvlist_add_uint64(raid, ZPOOL_CONFIG_DRAID_NGROUPS, ngroups);
}
    for (c = 0; c < r; c++)
        nvlist_free(child[c]);

    umem_free(child, r * sizeof (nvlist_t *));

-   return (raidz);
+   return (raid);
}
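The ngroups loop above is worth pinning down: it searches for the smallest ngroups such that ngroups * (ndata + nparity) is a multiple of (children - nspares), i.e. it stops at the LCM of the stripe width and the data-drive count. A self-contained sketch with hypothetical option values (a draid1 layout with 4 data disks, 1 parity, 8 children, 1 spare) illustrates the arithmetic:

```
#include <stdio.h>
#include <stdint.h>

/*
 * Standalone recreation of the ngroups calculation from
 * make_vdev_raid() above. The option values are hypothetical.
 */
int
main(void)
{
    uint64_t ndata = 4, nparity = 1, nspares = 1, children = 8;
    uint64_t ngroups = 1;

    /* Stripe width is 5, data drives are 7; LCM(5, 7) = 35. */
    while (ngroups * (ndata + nparity) % (children - nspares) != 0)
        ngroups++;

    /* Prints "ngroups = 7": 7 groups of width 5 exactly fill 5 rows of 7 drives. */
    (void) printf("ngroups = %llu\n", (unsigned long long)ngroups);
    return (0);
}
```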
static nvlist_t *

@@ -1082,12 +1191,12 @@ make_vdev_mirror(char *path, char *aux, char *pool, size_t size,
    int c;

    if (m < 1)
-       return (make_vdev_raidz(path, aux, pool, size, ashift, r));
+       return (make_vdev_raid(path, aux, pool, size, ashift, r));

    child = umem_alloc(m * sizeof (nvlist_t *), UMEM_NOFAIL);
    for (c = 0; c < m; c++)
-       child[c] = make_vdev_raidz(path, aux, pool, size, ashift, r);
+       child[c] = make_vdev_raid(path, aux, pool, size, ashift, r);

    VERIFY(nvlist_alloc(&mirror, NV_UNIQUE_NAME, 0) == 0);
    VERIFY(nvlist_add_string(mirror, ZPOOL_CONFIG_TYPE,

@@ -1332,7 +1441,11 @@ ztest_dmu_objset_own(const char *name, dmu_objset_type_t type,
    VERIFY0(dsl_crypto_params_create_nvlist(DCP_CMD_NONE, NULL,
        crypto_args, &dcp));
    err = spa_keystore_load_wkey(ddname, dcp, B_FALSE);
-   dsl_crypto_params_free(dcp, B_FALSE);
+   /*
+    * Note: if there was an error loading, the wkey was not
+    * consumed, and needs to be freed.
+    */
+   dsl_crypto_params_free(dcp, (err != 0));
    fnvlist_free(crypto_args);

    if (err == EINVAL) {

@@ -2809,6 +2922,10 @@ ztest_spa_upgrade(ztest_ds_t *zd, uint64_t id)
    if (ztest_opts.zo_mmp_test)
        return;

+   /* dRAID added after feature flags, skip upgrade test. */
+   if (strcmp(ztest_opts.zo_raid_type, VDEV_TYPE_DRAID) == 0)
+       return;
+
    mutex_enter(&ztest_vdev_lock);
    name = kmem_asprintf("%s_upgrade", ztest_opts.zo_pool);

@@ -2818,13 +2935,13 @@ ztest_spa_upgrade(ztest_ds_t *zd, uint64_t id)
    (void) spa_destroy(name);

    nvroot = make_vdev_root(NULL, NULL, name, ztest_opts.zo_vdev_size, 0,
-       NULL, ztest_opts.zo_raidz, ztest_opts.zo_mirrors, 1);
+       NULL, ztest_opts.zo_raid_children, ztest_opts.zo_mirrors, 1);

    /*
     * If we're configuring a RAIDZ device then make sure that the
     * initial version is capable of supporting that feature.
     */
-   switch (ztest_opts.zo_raidz_parity) {
+   switch (ztest_opts.zo_raid_parity) {
    case 0:
    case 1:
        initial_version = SPA_VERSION_INITIAL;

@@ -2970,7 +3087,8 @@ ztest_vdev_add_remove(ztest_ds_t *zd, uint64_t id)
        return;

    mutex_enter(&ztest_vdev_lock);
-   leaves = MAX(zs->zs_mirrors + zs->zs_splits, 1) * ztest_opts.zo_raidz;
+   leaves = MAX(zs->zs_mirrors + zs->zs_splits, 1) *
+       ztest_opts.zo_raid_children;

    spa_config_enter(spa, SCL_VDEV, FTAG, RW_READER);

@@ -2985,7 +3103,7 @@ ztest_vdev_add_remove(ztest_ds_t *zd, uint64_t id)
        /*
         * find the first real slog in log allocation class
         */
-       mg = spa_log_class(spa)->mc_rotor;
+       mg = spa_log_class(spa)->mc_allocator[0].mca_rotor;
        while (!mg->mg_vd->vdev_islog)
            mg = mg->mg_next;

@@ -3024,7 +3142,8 @@ ztest_vdev_add_remove(ztest_ds_t *zd, uint64_t id)
        */
        nvroot = make_vdev_root(NULL, NULL, NULL,
            ztest_opts.zo_vdev_size, 0, (ztest_random(4) == 0) ?
-           "log" : NULL, ztest_opts.zo_raidz, zs->zs_mirrors, 1);
+           "log" : NULL, ztest_opts.zo_raid_children, zs->zs_mirrors,
+           1);

        error = spa_vdev_add(spa, nvroot);
        nvlist_free(nvroot);

@@ -3078,14 +3197,15 @@ ztest_vdev_class_add(ztest_ds_t *zd, uint64_t id)
        return;
    }

-   leaves = MAX(zs->zs_mirrors + zs->zs_splits, 1) * ztest_opts.zo_raidz;
+   leaves = MAX(zs->zs_mirrors + zs->zs_splits, 1) *
+       ztest_opts.zo_raid_children;

    spa_config_enter(spa, SCL_VDEV, FTAG, RW_READER);
    ztest_shared->zs_vdev_next_leaf = spa_num_top_vdevs(spa) * leaves;
    spa_config_exit(spa, SCL_VDEV, FTAG);

    nvroot = make_vdev_root(NULL, NULL, NULL, ztest_opts.zo_vdev_size, 0,
-       class, ztest_opts.zo_raidz, zs->zs_mirrors, 1);
+       class, ztest_opts.zo_raid_children, zs->zs_mirrors, 1);

    error = spa_vdev_add(spa, nvroot);
    nvlist_free(nvroot);

@@ -3134,7 +3254,7 @@ ztest_vdev_aux_add_remove(ztest_ds_t *zd, uint64_t id)
    char *aux;
    char *path;
    uint64_t guid = 0;
-   int error;
+   int error, ignore_err = 0;

    if (ztest_opts.zo_mmp_test)
        return;

@@ -3157,7 +3277,13 @@ ztest_vdev_aux_add_remove(ztest_ds_t *zd, uint64_t id)
        /*
         * Pick a random device to remove.
         */
-       guid = sav->sav_vdevs[ztest_random(sav->sav_count)]->vdev_guid;
+       vdev_t *svd = sav->sav_vdevs[ztest_random(sav->sav_count)];
+
+       /* dRAID spares cannot be removed; try anyways to see ENOTSUP */
+       if (strstr(svd->vdev_path, VDEV_TYPE_DRAID) != NULL)
+           ignore_err = ENOTSUP;
+
+       guid = svd->vdev_guid;
    } else {
        /*
         * Find an unused device we can add.

@@ -3214,7 +3340,9 @@ ztest_vdev_aux_add_remove(ztest_ds_t *zd, uint64_t id)
        case ZFS_ERR_DISCARDING_CHECKPOINT:
            break;
        default:
-           fatal(0, "spa_vdev_remove(%llu) = %d", guid, error);
+           if (error != ignore_err)
+               fatal(0, "spa_vdev_remove(%llu) = %d", guid,
+                   error);
        }
    }

@@ -3243,7 +3371,7 @@ ztest_split_pool(ztest_ds_t *zd, uint64_t id)
    mutex_enter(&ztest_vdev_lock);

    /* ensure we have a usable config; mirrors of raidz aren't supported */
-   if (zs->zs_mirrors < 3 || ztest_opts.zo_raidz > 1) {
+   if (zs->zs_mirrors < 3 || ztest_opts.zo_raid_children > 1) {
        mutex_exit(&ztest_vdev_lock);
        return;
    }

@@ -3343,6 +3471,7 @@ ztest_vdev_attach_detach(ztest_ds_t *zd, uint64_t id)
    int replacing;
    int oldvd_has_siblings = B_FALSE;
    int newvd_is_spare = B_FALSE;
+   int newvd_is_dspare = B_FALSE;
    int oldvd_is_log;
    int error, expected_error;

@@ -3353,7 +3482,7 @@ ztest_vdev_attach_detach(ztest_ds_t *zd, uint64_t id)
    newpath = umem_alloc(MAXPATHLEN, UMEM_NOFAIL);

    mutex_enter(&ztest_vdev_lock);
-   leaves = MAX(zs->zs_mirrors, 1) * ztest_opts.zo_raidz;
+   leaves = MAX(zs->zs_mirrors, 1) * ztest_opts.zo_raid_children;

    spa_config_enter(spa, SCL_ALL, FTAG, RW_WRITER);

@@ -3365,8 +3494,7 @@ ztest_vdev_attach_detach(ztest_ds_t *zd, uint64_t id)
     */
    if (ztest_device_removal_active) {
        spa_config_exit(spa, SCL_ALL, FTAG);
-       mutex_exit(&ztest_vdev_lock);
-       return;
+       goto out;
    }

    /*

@@ -3393,14 +3521,17 @@ ztest_vdev_attach_detach(ztest_ds_t *zd, uint64_t id)
    if (zs->zs_mirrors >= 1) {
        ASSERT(oldvd->vdev_ops == &vdev_mirror_ops);
        ASSERT(oldvd->vdev_children >= zs->zs_mirrors);
-       oldvd = oldvd->vdev_child[leaf / ztest_opts.zo_raidz];
+       oldvd = oldvd->vdev_child[leaf / ztest_opts.zo_raid_children];
    }

    /* pick a child out of the raidz group */
-   if (ztest_opts.zo_raidz > 1) {
-       ASSERT(oldvd->vdev_ops == &vdev_raidz_ops);
-       ASSERT(oldvd->vdev_children == ztest_opts.zo_raidz);
-       oldvd = oldvd->vdev_child[leaf % ztest_opts.zo_raidz];
+   if (ztest_opts.zo_raid_children > 1) {
+       if (strcmp(oldvd->vdev_ops->vdev_op_type, "raidz") == 0)
+           ASSERT(oldvd->vdev_ops == &vdev_raidz_ops);
+       else
+           ASSERT(oldvd->vdev_ops == &vdev_draid_ops);
+       ASSERT(oldvd->vdev_children == ztest_opts.zo_raid_children);
+       oldvd = oldvd->vdev_child[leaf % ztest_opts.zo_raid_children];
    }

    /*

@@ -3447,6 +3578,10 @@ ztest_vdev_attach_detach(ztest_ds_t *zd, uint64_t id)
    if (sav->sav_count != 0 && ztest_random(3) == 0) {
        newvd = sav->sav_vdevs[ztest_random(sav->sav_count)];
        newvd_is_spare = B_TRUE;
+
+       if (newvd->vdev_ops == &vdev_draid_spare_ops)
+           newvd_is_dspare = B_TRUE;
+
        (void) strcpy(newpath, newvd->vdev_path);
    } else {
        (void) snprintf(newpath, MAXPATHLEN, ztest_dev_template,

@@ -3480,6 +3615,9 @@ ztest_vdev_attach_detach(ztest_ds_t *zd, uint64_t id)
     * If newvd is already part of the pool, it should fail with EBUSY.
     *
     * If newvd is too small, it should fail with EOVERFLOW.
+    *
+    * If newvd is a distributed spare and it's being attached to a
+    * dRAID which is not its parent it should fail with EINVAL.
     */
    if (pvd->vdev_ops != &vdev_mirror_ops &&
        pvd->vdev_ops != &vdev_root_ops && (!replacing ||

@@ -3492,10 +3630,12 @@ ztest_vdev_attach_detach(ztest_ds_t *zd, uint64_t id)
        expected_error = replacing ? 0 : EBUSY;
    else if (vdev_lookup_by_path(rvd, newpath) != NULL)
        expected_error = EBUSY;
-   else if (newsize < oldsize)
+   else if (!newvd_is_dspare && newsize < oldsize)
        expected_error = EOVERFLOW;
    else if (ashift > oldvd->vdev_top->vdev_ashift)
        expected_error = EDOM;
+   else if (newvd_is_dspare && pvd != vdev_draid_spare_get_parent(newvd))
+       expected_error = ENOTSUP;
    else
        expected_error = 0;

@@ -4880,13 +5020,13 @@ ztest_dmu_read_write_zcopy(ztest_ds_t *zd, uint64_t id)
            void *packcheck = umem_alloc(packsize, UMEM_NOFAIL);
            void *bigcheck = umem_alloc(bigsize, UMEM_NOFAIL);

-           VERIFY(0 == dmu_read(os, packobj, packoff,
+           VERIFY0(dmu_read(os, packobj, packoff,
                packsize, packcheck, DMU_READ_PREFETCH));
-           VERIFY(0 == dmu_read(os, bigobj, bigoff,
+           VERIFY0(dmu_read(os, bigobj, bigoff,
                bigsize, bigcheck, DMU_READ_PREFETCH));

-           ASSERT(bcmp(packbuf, packcheck, packsize) == 0);
-           ASSERT(bcmp(bigbuf, bigcheck, bigsize) == 0);
+           ASSERT0(bcmp(packbuf, packcheck, packsize));
+           ASSERT0(bcmp(bigbuf, bigcheck, bigsize));

            umem_free(packcheck, packsize);
            umem_free(bigcheck, bigsize);

@@ -5761,7 +5901,7 @@ ztest_fault_inject(ztest_ds_t *zd, uint64_t id)
    }

    maxfaults = MAXFAULTS(zs);
-   leaves = MAX(zs->zs_mirrors, 1) * ztest_opts.zo_raidz;
+   leaves = MAX(zs->zs_mirrors, 1) * ztest_opts.zo_raid_children;
    mirror_save = zs->zs_mirrors;
    mutex_exit(&ztest_vdev_lock);

@@ -6011,7 +6151,7 @@ ztest_fault_inject(ztest_ds_t *zd, uint64_t id)
/*
 * By design ztest will never inject uncorrectable damage in to the pool.
 * Issue a scrub, wait for it to complete, and verify there is never any
- * any persistent damage.
+ * persistent damage.
 *
 * Only after a full scrub has been completed is it safe to start injecting
 * data corruption. See the comment in zfs_fault_inject().

@@ -7016,6 +7156,7 @@ ztest_import_impl(ztest_shared_t *zs)
    VERIFY0(zpool_find_config(NULL, ztest_opts.zo_pool, &cfg, &args,
        &libzpool_config_ops));
    VERIFY0(spa_import(ztest_opts.zo_pool, cfg, NULL, flags));
+   fnvlist_free(cfg);
}

/*

@@ -7347,7 +7488,7 @@ ztest_init(ztest_shared_t *zs)
    zs->zs_splits = 0;
    zs->zs_mirrors = ztest_opts.zo_mirrors;
    nvroot = make_vdev_root(NULL, NULL, NULL, ztest_opts.zo_vdev_size, 0,
-       NULL, ztest_opts.zo_raidz, zs->zs_mirrors, 1);
+       NULL, ztest_opts.zo_raid_children, zs->zs_mirrors, 1);
    props = make_random_props();

    /*

@@ -7683,10 +7824,12 @@ main(int argc, char **argv)
    if (ztest_opts.zo_verbose >= 1) {
        (void) printf("%llu vdevs, %d datasets, %d threads,"
-           " %llu seconds...\n",
+           "%d %s disks, %llu seconds...\n\n",
            (u_longlong_t)ztest_opts.zo_vdevs,
            ztest_opts.zo_datasets,
            ztest_opts.zo_threads,
+           ztest_opts.zo_raid_children,
+           ztest_opts.zo_raid_type,
            (u_longlong_t)ztest_opts.zo_time);
    }


@@ -0,0 +1,29 @@
#
# When performing an ABI check the following options are applied:
#
# --no-unreferenced-symbols: Exclude symbols which are not referenced by
# any debug information. Without this _init() and _fini() are incorrectly
# reported on CentOS7 for libuutil.so.
#
# --headers-dir1: Limit ABI checks to public OpenZFS headers, otherwise
# changes in public system headers are also reported.
#
# --suppressions: Honor a suppressions file for each library to provide
# a mechanism for suppressing harmless warnings.
#
PHONY += checkabi storeabi
checkabi:
for lib in $(lib_LTLIBRARIES) ; do \
abidiff --no-unreferenced-symbols \
--headers-dir1 ../../include \
--suppressions $${lib%.la}.suppr \
$${lib%.la}.abi .libs/$${lib%.la}.so ; \
done
storeabi:
cd .libs ; \
for lib in $(lib_LTLIBRARIES) ; do \
abidw $${lib%.la}.so > ../$${lib%.la}.abi ; \
done


@@ -7,7 +7,7 @@ dnl # set the PYTHON environment variable accordingly.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_PYTHON], [
    AC_ARG_WITH([python],
-       AC_HELP_STRING([--with-python[=VERSION]],
+       AS_HELP_STRING([--with-python[=VERSION]],
        [default system python version @<:@default=check@:>@]),
        [with_python=$withval],
        [with_python=check])


@@ -22,7 +22,7 @@ dnl # Determines if pyzfs can be built, requires Python 2.7 or later.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_PYZFS], [
    AC_ARG_ENABLE([pyzfs],
-       AC_HELP_STRING([--enable-pyzfs],
+       AS_HELP_STRING([--enable-pyzfs],
        [install libzfs_core python bindings @<:@default=check@:>@]),
        [enable_pyzfs=$enableval],
        [enable_pyzfs=check])


@@ -4,7 +4,7 @@ dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_SED], [
    AC_REQUIRE([AC_PROG_SED])dnl
    AC_CACHE_CHECK([for sed --in-place], [ac_cv_inplace], [
-       tmpfile=$(mktemp conftest.XXX)
+       tmpfile=$(mktemp conftest.XXXXXX)
        echo foo >$tmpfile
        AS_IF([$SED --in-place 's#foo#bar#' $tmpfile 2>/dev/null],
            [ac_cv_inplace="--in-place"],


@@ -41,11 +41,11 @@ deb-utils: deb-local rpm-utils-initramfs
	arch=`$(RPM) -qp $${name}-$${version}.src.rpm --qf %{arch} | tail -1`; \
	debarch=`$(DPKG) --print-architecture`; \
	pkg1=$${name}-$${version}.$${arch}.rpm; \
-	pkg2=libnvpair1-$${version}.$${arch}.rpm; \
-	pkg3=libuutil1-$${version}.$${arch}.rpm; \
-	pkg4=libzfs2-$${version}.$${arch}.rpm; \
-	pkg5=libzpool2-$${version}.$${arch}.rpm; \
-	pkg6=libzfs2-devel-$${version}.$${arch}.rpm; \
+	pkg2=libnvpair3-$${version}.$${arch}.rpm; \
+	pkg3=libuutil3-$${version}.$${arch}.rpm; \
+	pkg4=libzfs4-$${version}.$${arch}.rpm; \
+	pkg5=libzpool4-$${version}.$${arch}.rpm; \
+	pkg6=libzfs4-devel-$${version}.$${arch}.rpm; \
	pkg7=$${name}-test-$${version}.$${arch}.rpm; \
	pkg8=$${name}-dracut-$${version}.noarch.rpm; \
	pkg9=$${name}-initramfs-$${version}.$${arch}.rpm; \

@@ -53,10 +53,10 @@ deb-utils: deb-local rpm-utils-initramfs
## Arguments need to be passed to dh_shlibdeps. Alien provides no mechanism
## to do this, so we install a shim onto the path which calls the real
## dh_shlibdeps with the required arguments.
-	path_prepend=`mktemp -d /tmp/intercept.XXX`; \
+	path_prepend=`mktemp -d /tmp/intercept.XXXXXX`; \
	echo "#$(SHELL)" > $${path_prepend}/dh_shlibdeps; \
	echo "`which dh_shlibdeps` -- \
-	    -xlibuutil1linux -xlibnvpair1linux -xlibzfs2linux -xlibzpool2linux" \
+	    -xlibuutil3linux -xlibnvpair3linux -xlibzfs4linux -xlibzpool4linux" \
	>> $${path_prepend}/dh_shlibdeps; \
## These -x arguments are passed to dpkg-shlibdeps, which exclude the
## Debianized packages from the auto-generated dependencies of the new debs,


@@ -11,7 +11,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_POSIX_ACL_RELEASE], [
    ], [
        struct posix_acl *tmp = posix_acl_alloc(1, 0);
        posix_acl_release(tmp);
-   ], [], [$ZFS_META_LICENSE])
+   ], [], [ZFS_META_LICENSE])
])

AC_DEFUN([ZFS_AC_KERNEL_POSIX_ACL_RELEASE], [

@@ -50,7 +50,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_SET_CACHED_ACL_USABLE], [
        struct posix_acl *acl = posix_acl_alloc(1, 0);
        set_cached_acl(ip, ACL_TYPE_ACCESS, acl);
        forget_cached_acl(ip, ACL_TYPE_ACCESS);
-   ], [], [$ZFS_META_LICENSE])
+   ], [], [ZFS_META_LICENSE])
])

AC_DEFUN([ZFS_AC_KERNEL_SET_CACHED_ACL_USABLE], [


@@ -188,7 +188,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BIO_SET_DEV], [
        struct block_device *bdev = NULL;
        struct bio *bio = NULL;
        bio_set_dev(bio, bdev);
-   ], [], [$ZFS_META_LICENSE])
+   ], [], [ZFS_META_LICENSE])
])

AC_DEFUN([ZFS_AC_KERNEL_BIO_SET_DEV], [

@@ -347,7 +347,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKG_TRYGET], [
        struct blkcg_gq blkg __attribute__ ((unused)) = {};
        bool rc __attribute__ ((unused));
        rc = blkg_tryget(&blkg);
-   ], [], [$ZFS_META_LICENSE])
+   ], [], [ZFS_META_LICENSE])
])

AC_DEFUN([ZFS_AC_KERNEL_BLKG_TRYGET], [


@@ -179,7 +179,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_FLUSH], [
    ], [
        struct request_queue *q = NULL;
        (void) blk_queue_flush(q, REQ_FLUSH);
-   ], [$NO_UNUSED_BUT_SET_VARIABLE], [$ZFS_META_LICENSE])
+   ], [$NO_UNUSED_BUT_SET_VARIABLE], [ZFS_META_LICENSE])

    ZFS_LINUX_TEST_SRC([blk_queue_write_cache], [
        #include <linux/kernel.h>

@@ -187,7 +187,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_FLUSH], [
    ], [
        struct request_queue *q = NULL;
        blk_queue_write_cache(q, true, true);
-   ], [$NO_UNUSED_BUT_SET_VARIABLE], [$ZFS_META_LICENSE])
+   ], [$NO_UNUSED_BUT_SET_VARIABLE], [ZFS_META_LICENSE])
])

AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_FLUSH], [


@@ -77,6 +77,59 @@ AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_REREAD_PART], [
    ])
])
dnl #
dnl # check_disk_change() was removed in 5.10
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV_CHECK_DISK_CHANGE], [
ZFS_LINUX_TEST_SRC([check_disk_change], [
#include <linux/fs.h>
#include <linux/blkdev.h>
], [
struct block_device *bdev = NULL;
bool error;
error = check_disk_change(bdev);
])
])
AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_CHECK_DISK_CHANGE], [
AC_MSG_CHECKING([whether check_disk_change() exists])
ZFS_LINUX_TEST_RESULT([check_disk_change], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_CHECK_DISK_CHANGE, 1,
[check_disk_change() exists])
], [
AC_MSG_RESULT(no)
])
])
dnl #
dnl # 5.10 API, check_disk_change() is removed, in favor of
dnl # bdev_check_media_change(), which doesn't force revalidation
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV_BDEV_CHECK_MEDIA_CHANGE], [
ZFS_LINUX_TEST_SRC([bdev_check_media_change], [
#include <linux/fs.h>
#include <linux/blkdev.h>
], [
struct block_device *bdev = NULL;
int error;
error = bdev_check_media_change(bdev);
])
])
AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_BDEV_CHECK_MEDIA_CHANGE], [
AC_MSG_CHECKING([whether bdev_check_media_change() exists])
ZFS_LINUX_TEST_RESULT([bdev_check_media_change], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BDEV_CHECK_MEDIA_CHANGE, 1,
[bdev_check_media_change() exists])
], [
AC_MSG_RESULT(no)
])
])
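These two probes typically feed a small source-side shim. A minimal sketch, assuming the HAVE_* results above; the wrapper name here is illustrative, not taken from this diff:

```
/*
 * Hypothetical compat wrapper: report whether the media changed,
 * using whichever interface the configure checks detected.
 */
static inline int
zfs_check_media_change(struct block_device *bdev)
{
#if defined(HAVE_BDEV_CHECK_MEDIA_CHANGE)
    /* 5.10+: detects the change without forcing revalidation */
    return (bdev_check_media_change(bdev));
#elif defined(HAVE_CHECK_DISK_CHANGE)
    /* Older kernels: check_disk_change() also revalidates */
    return (check_disk_change(bdev));
#else
    return (0);
#endif
}
```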
dnl #
dnl # 2.6.22 API change
dnl # Single argument invalidate_bdev()
dnl #

@@ -101,42 +154,69 @@ AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_INVALIDATE_BDEV], [
])

dnl #
-dnl # 2.6.27, lookup_bdev() was exported.
-dnl # 4.4.0-6.21 - lookup_bdev() takes 2 arguments.
+dnl # 5.11 API, lookup_bdev() takes dev_t argument.
+dnl # 2.6.27 API, lookup_bdev() was first exported.
+dnl # 4.4.0-6.21 API, lookup_bdev() on Ubuntu takes mode argument.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV_LOOKUP_BDEV], [
+   ZFS_LINUX_TEST_SRC([lookup_bdev_devt], [
+       #include <linux/blkdev.h>
+   ], [
+       int error __attribute__ ((unused));
+       const char path[] = "/example/path";
+       dev_t dev;
+
+       error = lookup_bdev(path, &dev);
+   ])
+
    ZFS_LINUX_TEST_SRC([lookup_bdev_1arg], [
        #include <linux/fs.h>
        #include <linux/blkdev.h>
    ], [
-       lookup_bdev(NULL);
+       struct block_device *bdev __attribute__ ((unused));
+       const char path[] = "/example/path";
+
+       bdev = lookup_bdev(path);
    ])

-   ZFS_LINUX_TEST_SRC([lookup_bdev_2args], [
+   ZFS_LINUX_TEST_SRC([lookup_bdev_mode], [
        #include <linux/fs.h>
    ], [
-       lookup_bdev(NULL, FMODE_READ);
+       struct block_device *bdev __attribute__ ((unused));
+       const char path[] = "/example/path";
+
+       bdev = lookup_bdev(path, FMODE_READ);
    ])
])

AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_LOOKUP_BDEV], [
-   AC_MSG_CHECKING([whether lookup_bdev() wants 1 arg])
-   ZFS_LINUX_TEST_RESULT_SYMBOL([lookup_bdev_1arg],
+   AC_MSG_CHECKING([whether lookup_bdev() wants dev_t arg])
+   ZFS_LINUX_TEST_RESULT_SYMBOL([lookup_bdev_devt],
        [lookup_bdev], [fs/block_dev.c], [
        AC_MSG_RESULT(yes)
-       AC_DEFINE(HAVE_1ARG_LOOKUP_BDEV, 1,
-           [lookup_bdev() wants 1 arg])
+       AC_DEFINE(HAVE_DEVT_LOOKUP_BDEV, 1,
+           [lookup_bdev() wants dev_t arg])
    ], [
        AC_MSG_RESULT(no)
-       AC_MSG_CHECKING([whether lookup_bdev() wants 2 args])
-       ZFS_LINUX_TEST_RESULT_SYMBOL([lookup_bdev_2args],
+       AC_MSG_CHECKING([whether lookup_bdev() wants 1 arg])
+       ZFS_LINUX_TEST_RESULT_SYMBOL([lookup_bdev_1arg],
            [lookup_bdev], [fs/block_dev.c], [
            AC_MSG_RESULT(yes)
-           AC_DEFINE(HAVE_2ARGS_LOOKUP_BDEV, 1,
-               [lookup_bdev() wants 2 args])
+           AC_DEFINE(HAVE_1ARG_LOOKUP_BDEV, 1,
+               [lookup_bdev() wants 1 arg])
        ], [
-           ZFS_LINUX_TEST_ERROR([lookup_bdev()])
+           AC_MSG_RESULT(no)
+           AC_MSG_CHECKING([whether lookup_bdev() wants mode arg])
+           ZFS_LINUX_TEST_RESULT_SYMBOL([lookup_bdev_mode],
+               [lookup_bdev], [fs/block_dev.c], [
+               AC_MSG_RESULT(yes)
+               AC_DEFINE(HAVE_MODE_LOOKUP_BDEV, 1,
+                   [lookup_bdev() wants mode arg])
+           ], [
+               ZFS_LINUX_TEST_ERROR([lookup_bdev()])
+           ])
        ])
    ])
])
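A caller then selects among the three detected variants. A sketch of how the consumer side might look, assuming the defines above; the wrapper name is hypothetical:

```
/*
 * Illustrative wrapper: resolve a path to its dev_t across the
 * three lookup_bdev() generations probed above.
 */
static int
example_lookup_bdev(const char *path, dev_t *dev)
{
#if defined(HAVE_DEVT_LOOKUP_BDEV)
    /* 5.11+: returns an errno and stores the dev_t directly */
    return (lookup_bdev(path, dev));
#elif defined(HAVE_1ARG_LOOKUP_BDEV)
    struct block_device *bdev = lookup_bdev(path);

    if (IS_ERR(bdev))
        return (PTR_ERR(bdev));
    *dev = bdev->bd_dev;
    bdput(bdev);
    return (0);
#elif defined(HAVE_MODE_LOOKUP_BDEV)
    struct block_device *bdev = lookup_bdev(path, FMODE_READ);

    if (IS_ERR(bdev))
        return (PTR_ERR(bdev));
    *dev = bdev->bd_dev;
    bdput(bdev);
    return (0);
#else
#error "Unsupported kernel"
#endif
}
```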
@@ -191,6 +271,29 @@ AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_BDEV_LOGICAL_BLOCK_SIZE], [
    ])
])
dnl #
dnl # 5.11 API change
dnl # Added bdev_whole() helper.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV_BDEV_WHOLE], [
ZFS_LINUX_TEST_SRC([bdev_whole], [
#include <linux/blkdev.h>
],[
struct block_device *bdev = NULL;
bdev = bdev_whole(bdev);
])
])
AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_BDEV_WHOLE], [
AC_MSG_CHECKING([whether bdev_whole() is available])
ZFS_LINUX_TEST_RESULT([bdev_whole], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BDEV_WHOLE, 1, [bdev_whole() is available])
],[
AC_MSG_RESULT(no)
])
])
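On kernels predating the 5.11 helper, the whole disk was historically reachable through the bd_contains field, so the fallback can be a one-line macro. A minimal sketch under that assumption:

```
/* Fallback sketch: emulate bdev_whole() on pre-5.11 kernels. */
#ifndef HAVE_BDEV_WHOLE
#define bdev_whole(bdev)    ((bdev)->bd_contains)
#endif
```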
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV], [
    ZFS_AC_KERNEL_SRC_BLKDEV_GET_BY_PATH
    ZFS_AC_KERNEL_SRC_BLKDEV_PUT

@@ -199,6 +302,9 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV], [
    ZFS_AC_KERNEL_SRC_BLKDEV_LOOKUP_BDEV
    ZFS_AC_KERNEL_SRC_BLKDEV_BDEV_LOGICAL_BLOCK_SIZE
    ZFS_AC_KERNEL_SRC_BLKDEV_BDEV_PHYSICAL_BLOCK_SIZE
+   ZFS_AC_KERNEL_SRC_BLKDEV_CHECK_DISK_CHANGE
+   ZFS_AC_KERNEL_SRC_BLKDEV_BDEV_CHECK_MEDIA_CHANGE
+   ZFS_AC_KERNEL_SRC_BLKDEV_BDEV_WHOLE
])

AC_DEFUN([ZFS_AC_KERNEL_BLKDEV], [

@@ -209,4 +315,7 @@ AC_DEFUN([ZFS_AC_KERNEL_BLKDEV], [
    ZFS_AC_KERNEL_BLKDEV_LOOKUP_BDEV
    ZFS_AC_KERNEL_BLKDEV_BDEV_LOGICAL_BLOCK_SIZE
    ZFS_AC_KERNEL_BLKDEV_BDEV_PHYSICAL_BLOCK_SIZE
+   ZFS_AC_KERNEL_BLKDEV_CHECK_DISK_CHANGE
+   ZFS_AC_KERNEL_BLKDEV_BDEV_CHECK_MEDIA_CHANGE
+   ZFS_AC_KERNEL_BLKDEV_BDEV_WHOLE
])


@@ -86,7 +86,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_CONFIG_DEBUG_LOCK_ALLOC], [
        mutex_init(&lock);
        mutex_lock(&lock);
        mutex_unlock(&lock);
-   ], [], [$ZFS_META_LICENSE])
+   ], [], [ZFS_META_LICENSE])
])

AC_DEFUN([ZFS_AC_KERNEL_CONFIG_DEBUG_LOCK_ALLOC], [


@@ -42,7 +42,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_FPU], [
    ], [
        kernel_fpu_begin();
        kernel_fpu_end();
-   ], [], [$ZFS_META_LICENSE])
+   ], [], [ZFS_META_LICENSE])

    ZFS_LINUX_TEST_SRC([__kernel_fpu], [
        #include <linux/types.h>

@@ -55,7 +55,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_FPU], [
    ], [
        __kernel_fpu_begin();
        __kernel_fpu_end();
-   ], [], [$ZFS_META_LICENSE])
+   ], [], [ZFS_META_LICENSE])

    ZFS_LINUX_TEST_SRC([fpu_internal], [
        #if defined(__x86_64) || defined(__x86_64__) || \


@@ -2,6 +2,16 @@ dnl #
dnl # Check for generic io accounting interface.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_GENERIC_IO_ACCT], [
+   ZFS_LINUX_TEST_SRC([bio_io_acct], [
+       #include <linux/blkdev.h>
+   ], [
+       struct bio *bio = NULL;
+       unsigned long start_time;
+
+       start_time = bio_start_io_acct(bio);
+       bio_end_io_acct(bio, start_time);
+   ])
+
    ZFS_LINUX_TEST_SRC([generic_acct_3args], [
        #include <linux/bio.h>

@@ -29,36 +39,49 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_GENERIC_IO_ACCT], [
AC_DEFUN([ZFS_AC_KERNEL_GENERIC_IO_ACCT], [
    dnl #
-   dnl # 3.19 API addition
+   dnl # 5.7 API,
    dnl #
-   dnl # torvalds/linux@394ffa50 allows us to increment iostat
-   dnl # counters without generic_make_request().
+   dnl # Added bio_start_io_acct() and bio_end_io_acct() helpers.
    dnl #
-   AC_MSG_CHECKING([whether generic IO accounting wants 3 args])
-   ZFS_LINUX_TEST_RESULT_SYMBOL([generic_acct_3args],
-       [generic_start_io_acct], [block/bio.c], [
+   AC_MSG_CHECKING([whether generic bio_*_io_acct() are available])
+   ZFS_LINUX_TEST_RESULT([bio_io_acct], [
        AC_MSG_RESULT(yes)
-       AC_DEFINE(HAVE_GENERIC_IO_ACCT_3ARG, 1,
-           [generic_start_io_acct()/generic_end_io_acct() available])
+       AC_DEFINE(HAVE_BIO_IO_ACCT, 1, [bio_*_io_acct() available])
    ], [
        AC_MSG_RESULT(no)

        dnl #
-       dnl # Linux 4.14 API,
+       dnl # 4.14 API,
        dnl #
        dnl # generic_start_io_acct/generic_end_io_acct now require
        dnl # request_queue to be provided. No functional changes,
        dnl # but preparation for inflight accounting.
        dnl #
-       AC_MSG_CHECKING([whether generic IO accounting wants 4 args])
+       AC_MSG_CHECKING([whether generic_*_io_acct wants 4 args])
        ZFS_LINUX_TEST_RESULT_SYMBOL([generic_acct_4args],
            [generic_start_io_acct], [block/bio.c], [
            AC_MSG_RESULT(yes)
            AC_DEFINE(HAVE_GENERIC_IO_ACCT_4ARG, 1,
-               [generic_start_io_acct()/generic_end_io_acct() ]
-               [4 arg available])
+               [generic_*_io_acct() 4 arg available])
        ], [
            AC_MSG_RESULT(no)
+
+           dnl #
+           dnl # 3.19 API addition
+           dnl #
+           dnl # torvalds/linux@394ffa50 allows us to increment
+           dnl # iostat counters without generic_make_request().
+           dnl #
+           AC_MSG_CHECKING(
+               [whether generic_*_io_acct wants 3 args])
+           ZFS_LINUX_TEST_RESULT_SYMBOL([generic_acct_3args],
+               [generic_start_io_acct], [block/bio.c], [
+               AC_MSG_RESULT(yes)
+               AC_DEFINE(HAVE_GENERIC_IO_ACCT_3ARG, 1,
+                   [generic_*_io_acct() 3 arg available])
+           ], [
+               AC_MSG_RESULT(no)
+           ])
        ])
    ])
])
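The three outcomes are mutually exclusive, so a consumer can collapse them behind one helper. A sketch under the assumption that exactly one HAVE_* define is set; the wrapper name and the jiffies fallback are illustrative:

```
/*
 * Hypothetical wrapper: start I/O accounting for a bio using
 * whichever interface the configure checks detected.
 */
static inline unsigned long
example_start_io_acct(struct request_queue *q, struct gendisk *disk,
    int rw, struct bio *bio)
{
#if defined(HAVE_BIO_IO_ACCT)
    return (bio_start_io_acct(bio));                /* 5.7+ */
#elif defined(HAVE_GENERIC_IO_ACCT_4ARG)
    generic_start_io_acct(q, rw, bio_sectors(bio), &disk->part0);
    return (jiffies);                               /* 4.14+ */
#elif defined(HAVE_GENERIC_IO_ACCT_3ARG)
    generic_start_io_acct(rw, bio_sectors(bio), &disk->part0);
    return (jiffies);                               /* 3.19+ */
#else
    return (0);                                     /* no accounting */
#endif
}
```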


@@ -1,24 +0,0 @@
-dnl #
-dnl # 4.16 API change
-dnl # Verify if get_disk_and_module() symbol is available.
-dnl #
-AC_DEFUN([ZFS_AC_KERNEL_SRC_GET_DISK_AND_MODULE], [
-   ZFS_LINUX_TEST_SRC([get_disk_and_module], [
-       #include <linux/genhd.h>
-   ], [
-       struct gendisk *disk = NULL;
-       (void) get_disk_and_module(disk);
-   ])
-])
-
-AC_DEFUN([ZFS_AC_KERNEL_GET_DISK_AND_MODULE], [
-   AC_MSG_CHECKING([whether get_disk_and_module() is available])
-   ZFS_LINUX_TEST_RESULT_SYMBOL([get_disk_and_module],
-       [get_disk_and_module], [block/genhd.c], [
-       AC_MSG_RESULT(yes)
-       AC_DEFINE(HAVE_GET_DISK_AND_MODULE,
-           1, [get_disk_and_module() is available])
-   ], [
-       AC_MSG_RESULT(no)
-   ])
-])


@@ -0,0 +1,26 @@
dnl #
dnl # 4.6 API change
dnl # Added CPU hotplug APIs
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_CPU_HOTPLUG], [
ZFS_LINUX_TEST_SRC([cpu_hotplug], [
#include <linux/cpuhotplug.h>
],[
enum cpuhp_state state = CPUHP_ONLINE;
int (*fp)(unsigned int, struct hlist_node *) = NULL;
cpuhp_state_add_instance_nocalls(0, (struct hlist_node *)NULL);
cpuhp_state_remove_instance_nocalls(0, (struct hlist_node *)NULL);
cpuhp_setup_state_multi(state, "", fp, fp);
cpuhp_remove_multi_state(0);
])
])
AC_DEFUN([ZFS_AC_KERNEL_CPU_HOTPLUG], [
AC_MSG_CHECKING([whether CPU hotplug APIs exist])
ZFS_LINUX_TEST_RESULT([cpu_hotplug], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_CPU_HOTPLUG, 1, [yes])
],[
AC_MSG_RESULT(no)
])
])
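When HAVE_CPU_HOTPLUG is defined, a module can register a multi-instance hotplug state once and then attach per-object instances to it. A minimal sketch, assuming the define above; all names here are illustrative:

```
/* Hypothetical consumer of the HAVE_CPU_HOTPLUG result. */
#ifdef HAVE_CPU_HOTPLUG
#include <linux/cpuhotplug.h>

static enum cpuhp_state example_cpuhp_state;

static int
example_cpu_online(unsigned int cpu, struct hlist_node *node)
{
    /* Per-CPU bring-up work for the object embedding 'node'. */
    return (0);
}

static int
example_cpu_offline(unsigned int cpu, struct hlist_node *node)
{
    /* Per-CPU tear-down work. */
    return (0);
}

static int
example_cpuhp_init(void)
{
    int ret;

    /* Register once; add instances with cpuhp_state_add_instance_nocalls(). */
    ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
        "example:online", example_cpu_online, example_cpu_offline);
    if (ret < 0)
        return (ret);
    example_cpuhp_state = ret;
    return (0);
}
#endif
```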


@@ -27,6 +27,15 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_MAKE_REQUEST_FN], [
        q = blk_alloc_queue(make_request, NUMA_NO_NODE);
    ])

+   ZFS_LINUX_TEST_SRC([blk_alloc_queue_request_fn_rh], [
+       #include <linux/blkdev.h>
+       blk_qc_t make_request(struct request_queue *q,
+           struct bio *bio) { return (BLK_QC_T_NONE); }
+   ],[
+       struct request_queue *q __attribute__ ((unused));
+       q = blk_alloc_queue_rh(make_request, NUMA_NO_NODE);
+   ])
+
    ZFS_LINUX_TEST_SRC([block_device_operations_submit_bio], [
        #include <linux/blkdev.h>
    ],[

@@ -47,7 +56,9 @@ AC_DEFUN([ZFS_AC_KERNEL_MAKE_REQUEST_FN], [
        AC_DEFINE(HAVE_SUBMIT_BIO_IN_BLOCK_DEVICE_OPERATIONS, 1,
            [submit_bio is member of struct block_device_operations])
    ],[
+       AC_MSG_RESULT(no)
+
        dnl # Checked as part of the blk_alloc_queue_request_fn test
        dnl #
        dnl # Linux 5.7 API Change

@@ -55,6 +66,9 @@ AC_DEFUN([ZFS_AC_KERNEL_MAKE_REQUEST_FN], [
        dnl #
        AC_MSG_CHECKING([whether blk_alloc_queue() expects request function])
        ZFS_LINUX_TEST_RESULT([blk_alloc_queue_request_fn], [
+           AC_MSG_RESULT(yes)
+
+           dnl # This is currently always the case.
            AC_MSG_CHECKING([whether make_request_fn() returns blk_qc_t])
            AC_MSG_RESULT(yes)

@@ -66,34 +80,59 @@ AC_DEFUN([ZFS_AC_KERNEL_MAKE_REQUEST_FN], [
                [Noting that make_request_fn() returns blk_qc_t])
        ],[
            dnl #
-           dnl # Linux 3.2 API Change
-           dnl # make_request_fn returns void.
+           dnl # CentOS Stream 4.18.0-257 API Change
+           dnl # The Linux 5.7 blk_alloc_queue() change was back-
+           dnl # ported and the symbol renamed blk_alloc_queue_rh().
+           dnl # As of this kernel version they're not providing
+           dnl # any compatibility code in the kernel for this.
            dnl #
-           AC_MSG_CHECKING([whether make_request_fn() returns void])
-           ZFS_LINUX_TEST_RESULT([make_request_fn_void], [
+           ZFS_LINUX_TEST_RESULT([blk_alloc_queue_request_fn_rh], [
                AC_MSG_RESULT(yes)
-               AC_DEFINE(MAKE_REQUEST_FN_RET, void,
-                   [make_request_fn() return type])
-               AC_DEFINE(HAVE_MAKE_REQUEST_FN_RET_VOID, 1,
-                   [Noting that make_request_fn() returns void])
+
+               dnl # This is currently always the case.
+               AC_MSG_CHECKING([whether make_request_fn_rh() returns blk_qc_t])
+               AC_MSG_RESULT(yes)
+
+               AC_DEFINE(HAVE_BLK_ALLOC_QUEUE_REQUEST_FN_RH, 1,
+                   [blk_alloc_queue_rh() expects request function])
+               AC_DEFINE(MAKE_REQUEST_FN_RET, blk_qc_t,
+                   [make_request_fn() return type])
+               AC_DEFINE(HAVE_MAKE_REQUEST_FN_RET_QC, 1,
+                   [Noting that make_request_fn() returns blk_qc_t])
            ],[
                AC_MSG_RESULT(no)

                dnl #
-               dnl # Linux 4.4 API Change
-               dnl # make_request_fn returns blk_qc_t.
+               dnl # Linux 3.2 API Change
+               dnl # make_request_fn returns void.
                dnl #
                AC_MSG_CHECKING(
-                   [whether make_request_fn() returns blk_qc_t])
-               ZFS_LINUX_TEST_RESULT([make_request_fn_blk_qc_t], [
+                   [whether make_request_fn() returns void])
+               ZFS_LINUX_TEST_RESULT([make_request_fn_void], [
                    AC_MSG_RESULT(yes)
-                   AC_DEFINE(MAKE_REQUEST_FN_RET, blk_qc_t,
+                   AC_DEFINE(MAKE_REQUEST_FN_RET, void,
                        [make_request_fn() return type])
-                   AC_DEFINE(HAVE_MAKE_REQUEST_FN_RET_QC, 1,
-                       [Noting that make_request_fn() ]
-                       [returns blk_qc_t])
+                   AC_DEFINE(HAVE_MAKE_REQUEST_FN_RET_VOID, 1,
+                       [Noting that make_request_fn() returns void])
                ],[
-                   ZFS_LINUX_TEST_ERROR([make_request_fn])
+                   AC_MSG_RESULT(no)
+
+                   dnl #
+                   dnl # Linux 4.4 API Change
+                   dnl # make_request_fn returns blk_qc_t.
+                   dnl #
+                   AC_MSG_CHECKING(
+                       [whether make_request_fn() returns blk_qc_t])
+                   ZFS_LINUX_TEST_RESULT([make_request_fn_blk_qc_t], [
+                       AC_MSG_RESULT(yes)
+                       AC_DEFINE(MAKE_REQUEST_FN_RET, blk_qc_t,
+                           [make_request_fn() return type])
+                       AC_DEFINE(HAVE_MAKE_REQUEST_FN_RET_QC, 1,
+                           [Noting that make_request_fn() ]
+                           [returns blk_qc_t])
+                   ],[
+                       ZFS_LINUX_TEST_ERROR([make_request_fn])
+                   ])
                ])
            ])
        ])
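The MAKE_REQUEST_FN_RET define lets one request-function definition compile against every detected kernel generation. A minimal sketch of the consumer side, assuming the defines above; the function name is illustrative:

```
/*
 * Hypothetical driver request function: its return type and return
 * statement are selected by the configure results above.
 */
static MAKE_REQUEST_FN_RET
example_request(struct request_queue *q, struct bio *bio)
{
    /* ... dispatch the bio here ... */

#ifdef HAVE_MAKE_REQUEST_FN_RET_QC
    return (BLK_QC_T_NONE);     /* 4.4+ and the 5.7/RH variants */
#endif
    /* On HAVE_MAKE_REQUEST_FN_RET_VOID kernels, fall off the end. */
}
```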


@@ -1,3 +1,24 @@
dnl #
dnl # Detect objtool functionality.
dnl #
dnl #
dnl # Kernel 5.10: linux/frame.h was renamed linux/objtool.h
dnl #
AC_DEFUN([ZFS_AC_KERNEL_OBJTOOL_HEADER], [
AC_MSG_CHECKING([whether objtool header is available])
ZFS_LINUX_TRY_COMPILE([
#include <linux/objtool.h>
],[
],[
AC_DEFINE(HAVE_KERNEL_OBJTOOL_HEADER, 1,
[kernel has linux/objtool.h])
AC_MSG_RESULT(linux/objtool.h)
],[
AC_MSG_RESULT(linux/frame.h)
])
])
dnl #
dnl # Check for objtool support.
dnl #

@@ -16,7 +37,11 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_OBJTOOL], [

    dnl # 4.6 API added STACK_FRAME_NON_STANDARD macro
    ZFS_LINUX_TEST_SRC([stack_frame_non_standard], [
+       #ifdef HAVE_KERNEL_OBJTOOL_HEADER
+       #include <linux/objtool.h>
+       #else
        #include <linux/frame.h>
+       #endif
    ],[
        #if !defined(STACK_FRAME_NON_STANDARD)
        #error "STACK_FRAME_NON_STANDARD is not defined."
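Source files that annotate assembly-heavy functions then pick the header the same way. A sketch of the pattern, assuming the HAVE_KERNEL_OBJTOOL_HEADER define above; the function name is illustrative:

```
/* Select the objtool header on 5.10+, linux/frame.h before that. */
#ifdef HAVE_KERNEL_OBJTOOL_HEADER
#include <linux/objtool.h>
#else
#include <linux/frame.h>
#endif

static void
example_asm_heavy_func(void)
{
    /* Body elided; imagine hand-rolled SIMD or asm here. */
}
#ifdef STACK_FRAME_NON_STANDARD
/* Tell objtool not to validate this function's stack frames. */
STACK_FRAME_NON_STANDARD(example_asm_heavy_func);
#endif
```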


@@ -25,10 +25,36 @@ AC_DEFUN([ZFS_AC_KERNEL_PERCPU_COUNTER_INIT], [
    ])
])
dnl #
dnl # 5.10 API change,
dnl # The "count" was moved into ref->data, from ref
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_PERCPU_REF_COUNT_IN_DATA], [
ZFS_LINUX_TEST_SRC([percpu_ref_count_in_data], [
#include <linux/percpu-refcount.h>
],[
struct percpu_ref_data d;
atomic_long_set(&d.count, 1L);
])
])
AC_DEFUN([ZFS_AC_KERNEL_PERCPU_REF_COUNT_IN_DATA], [
AC_MSG_CHECKING([whether percpu_ref count is inside percpu_ref.data])
ZFS_LINUX_TEST_RESULT([percpu_ref_count_in_data], [
AC_MSG_RESULT(yes)
AC_DEFINE(ZFS_PERCPU_REF_COUNT_IN_DATA, 1,
[count is located in percpu_ref.data])
],[
AC_MSG_RESULT(no)
])
])
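Code touching the reference count directly can hide the 5.10 layout change behind one accessor. A minimal sketch, assuming the ZFS_PERCPU_REF_COUNT_IN_DATA define above; the helper name is illustrative:

```
#include <linux/percpu-refcount.h>

/* Illustrative accessor for the percpu_ref count location. */
static inline atomic_long_t *
example_percpu_ref_count_ptr(struct percpu_ref *ref)
{
#ifdef ZFS_PERCPU_REF_COUNT_IN_DATA
    return (&ref->data->count);     /* 5.10+: moved into ref->data */
#else
    return (&ref->count);           /* older kernels */
#endif
}
```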
AC_DEFUN([ZFS_AC_KERNEL_SRC_PERCPU], [
    ZFS_AC_KERNEL_SRC_PERCPU_COUNTER_INIT
+   ZFS_AC_KERNEL_SRC_PERCPU_REF_COUNT_IN_DATA
])

AC_DEFUN([ZFS_AC_KERNEL_PERCPU], [
    ZFS_AC_KERNEL_PERCPU_COUNTER_INIT
+   ZFS_AC_KERNEL_PERCPU_REF_COUNT_IN_DATA
])


@@ -0,0 +1,46 @@
dnl #
dnl # 5.11 API change
dnl # revalidate_disk_size() has been removed entirely.
dnl #
dnl # 5.10 API change
dnl # revalidate_disk() was replaced by revalidate_disk_size()
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_REVALIDATE_DISK], [
ZFS_LINUX_TEST_SRC([revalidate_disk_size], [
#include <linux/genhd.h>
], [
struct gendisk *disk = NULL;
(void) revalidate_disk_size(disk, false);
])
ZFS_LINUX_TEST_SRC([revalidate_disk], [
#include <linux/genhd.h>
], [
struct gendisk *disk = NULL;
(void) revalidate_disk(disk);
])
])
AC_DEFUN([ZFS_AC_KERNEL_REVALIDATE_DISK], [
AC_MSG_CHECKING([whether revalidate_disk_size() is available])
ZFS_LINUX_TEST_RESULT_SYMBOL([revalidate_disk_size],
[revalidate_disk_size], [block/genhd.c], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_REVALIDATE_DISK_SIZE, 1,
[revalidate_disk_size() is available])
], [
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether revalidate_disk() is available])
ZFS_LINUX_TEST_RESULT_SYMBOL([revalidate_disk],
[revalidate_disk], [block/genhd.c], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_REVALIDATE_DISK, 1,
[revalidate_disk() is available])
], [
AC_MSG_RESULT(no)
])
])
])
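Since 5.11 removes both symbols, the shim degenerates to a no-op there and the kernel picks up capacity changes on its own. A sketch of a size-revalidation wrapper, assuming the two defines above; the wrapper name is hypothetical:

```
/* Illustrative compat shim over the probed revalidation symbols. */
static inline void
example_revalidate_disk_size(struct gendisk *disk)
{
#if defined(HAVE_REVALIDATE_DISK_SIZE)
    revalidate_disk_size(disk, false);  /* 5.10 */
#elif defined(HAVE_REVALIDATE_DISK)
    (void) revalidate_disk(disk);       /* pre-5.10 */
#else
    /* 5.11+: nothing to do, capacity changes propagate automatically */
#endif
}
```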


@@ -1,29 +1,3 @@
-dnl #
-dnl # 3.1 API Change
-dnl #
-dnl # The rw_semaphore.wait_lock member was changed from spinlock_t to
-dnl # raw_spinlock_t at commit ddb6c9b58a19edcfac93ac670b066c836ff729f1.
-dnl #
-AC_DEFUN([ZFS_AC_KERNEL_SRC_RWSEM_SPINLOCK_IS_RAW], [
-   ZFS_LINUX_TEST_SRC([rwsem_spinlock_is_raw], [
-       #include <linux/rwsem.h>
-   ],[
-       struct rw_semaphore dummy_semaphore __attribute__ ((unused));
-       raw_spinlock_t dummy_lock __attribute__ ((unused)) =
-           __RAW_SPIN_LOCK_INITIALIZER(dummy_lock);
-       dummy_semaphore.wait_lock = dummy_lock;
-   ])
-])
-
-AC_DEFUN([ZFS_AC_KERNEL_RWSEM_SPINLOCK_IS_RAW], [
-   AC_MSG_CHECKING([whether struct rw_semaphore member wait_lock is raw])
-   ZFS_LINUX_TEST_RESULT([rwsem_spinlock_is_raw], [
-       AC_MSG_RESULT(yes)
-   ],[
-       ZFS_LINUX_TEST_ERROR([rwsem_spinlock_is_raw])
-   ])
-])
dnl #
dnl # 3.16 API Change
dnl #

@@ -76,13 +50,11 @@ AC_DEFUN([ZFS_AC_KERNEL_RWSEM_ATOMIC_LONG_COUNT], [
])

AC_DEFUN([ZFS_AC_KERNEL_SRC_RWSEM], [
-   ZFS_AC_KERNEL_SRC_RWSEM_SPINLOCK_IS_RAW
    ZFS_AC_KERNEL_SRC_RWSEM_ACTIVITY
    ZFS_AC_KERNEL_SRC_RWSEM_ATOMIC_LONG_COUNT
])

AC_DEFUN([ZFS_AC_KERNEL_RWSEM], [
-   ZFS_AC_KERNEL_RWSEM_SPINLOCK_IS_RAW
    ZFS_AC_KERNEL_RWSEM_ACTIVITY
    ZFS_AC_KERNEL_RWSEM_ATOMIC_LONG_COUNT
])


@@ -0,0 +1,206 @@
dnl #
dnl # Check for available iov_iter functionality.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_VFS_IOV_ITER], [
ZFS_LINUX_TEST_SRC([iov_iter_types], [
#include <linux/fs.h>
#include <linux/uio.h>
],[
int type __attribute__ ((unused)) =
ITER_IOVEC | ITER_KVEC | ITER_BVEC | ITER_PIPE;
])
ZFS_LINUX_TEST_SRC([iov_iter_init], [
#include <linux/fs.h>
#include <linux/uio.h>
],[
struct iov_iter iter = { 0 };
struct iovec iov;
unsigned long nr_segs = 1;
size_t count = 1024;
iov_iter_init(&iter, WRITE, &iov, nr_segs, count);
])
ZFS_LINUX_TEST_SRC([iov_iter_init_legacy], [
#include <linux/fs.h>
#include <linux/uio.h>
],[
struct iov_iter iter = { 0 };
struct iovec iov;
unsigned long nr_segs = 1;
size_t count = 1024;
size_t written = 0;
iov_iter_init(&iter, &iov, nr_segs, count, written);
])
ZFS_LINUX_TEST_SRC([iov_iter_advance], [
#include <linux/fs.h>
#include <linux/uio.h>
],[
struct iov_iter iter = { 0 };
size_t advance = 512;
iov_iter_advance(&iter, advance);
])
ZFS_LINUX_TEST_SRC([iov_iter_revert], [
#include <linux/fs.h>
#include <linux/uio.h>
],[
struct iov_iter iter = { 0 };
size_t revert = 512;
iov_iter_revert(&iter, revert);
])
ZFS_LINUX_TEST_SRC([iov_iter_fault_in_readable], [
#include <linux/fs.h>
#include <linux/uio.h>
],[
struct iov_iter iter = { 0 };
size_t size = 512;
int error __attribute__ ((unused));
error = iov_iter_fault_in_readable(&iter, size);
])
ZFS_LINUX_TEST_SRC([iov_iter_count], [
#include <linux/fs.h>
#include <linux/uio.h>
],[
struct iov_iter iter = { 0 };
size_t bytes __attribute__ ((unused));
bytes = iov_iter_count(&iter);
])
ZFS_LINUX_TEST_SRC([copy_to_iter], [
#include <linux/fs.h>
#include <linux/uio.h>
],[
struct iov_iter iter = { 0 };
char buf[512] = { 0 };
size_t size = 512;
size_t bytes __attribute__ ((unused));
bytes = copy_to_iter((const void *)&buf, size, &iter);
])
ZFS_LINUX_TEST_SRC([copy_from_iter], [
#include <linux/fs.h>
#include <linux/uio.h>
],[
struct iov_iter iter = { 0 };
char buf[512] = { 0 };
size_t size = 512;
size_t bytes __attribute__ ((unused));
bytes = copy_from_iter((void *)&buf, size, &iter);
])
])
AC_DEFUN([ZFS_AC_KERNEL_VFS_IOV_ITER], [
enable_vfs_iov_iter="yes"
AC_MSG_CHECKING([whether iov_iter types are available])
ZFS_LINUX_TEST_RESULT([iov_iter_types], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_IOV_ITER_TYPES, 1,
[iov_iter types are available])
],[
AC_MSG_RESULT(no)
enable_vfs_iov_iter="no"
])
dnl #
dnl # 'iov_iter_init' available in Linux 3.16 and newer.
dnl # 'iov_iter_init_legacy' available in Linux 3.15 and older.
dnl #
AC_MSG_CHECKING([whether iov_iter_init() is available])
ZFS_LINUX_TEST_RESULT([iov_iter_init], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_IOV_ITER_INIT, 1,
[iov_iter_init() is available])
],[
ZFS_LINUX_TEST_RESULT([iov_iter_init_legacy], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_IOV_ITER_INIT_LEGACY, 1,
[iov_iter_init() is available])
],[
ZFS_LINUX_TEST_ERROR([iov_iter_init()])
])
])
AC_MSG_CHECKING([whether iov_iter_advance() is available])
ZFS_LINUX_TEST_RESULT([iov_iter_advance], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_IOV_ITER_ADVANCE, 1,
[iov_iter_advance() is available])
],[
AC_MSG_RESULT(no)
enable_vfs_iov_iter="no"
])
AC_MSG_CHECKING([whether iov_iter_revert() is available])
ZFS_LINUX_TEST_RESULT([iov_iter_revert], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_IOV_ITER_REVERT, 1,
[iov_iter_revert() is available])
],[
AC_MSG_RESULT(no)
enable_vfs_iov_iter="no"
])
AC_MSG_CHECKING([whether iov_iter_fault_in_readable() is available])
ZFS_LINUX_TEST_RESULT([iov_iter_fault_in_readable], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_IOV_ITER_FAULT_IN_READABLE, 1,
[iov_iter_fault_in_readable() is available])
],[
AC_MSG_RESULT(no)
enable_vfs_iov_iter="no"
])
AC_MSG_CHECKING([whether iov_iter_count() is available])
ZFS_LINUX_TEST_RESULT([iov_iter_count], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_IOV_ITER_COUNT, 1,
[iov_iter_count() is available])
],[
AC_MSG_RESULT(no)
enable_vfs_iov_iter="no"
])
AC_MSG_CHECKING([whether copy_to_iter() is available])
ZFS_LINUX_TEST_RESULT([copy_to_iter], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_COPY_TO_ITER, 1,
[copy_to_iter() is available])
],[
AC_MSG_RESULT(no)
enable_vfs_iov_iter="no"
])
AC_MSG_CHECKING([whether copy_from_iter() is available])
ZFS_LINUX_TEST_RESULT([copy_from_iter], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_COPY_FROM_ITER, 1,
[copy_from_iter() is available])
],[
AC_MSG_RESULT(no)
enable_vfs_iov_iter="no"
])
dnl #
dnl # As of the 4.9 kernel support is provided for iovecs, kvecs,
dnl # bvecs and pipes in the iov_iter structure. As long as the
dnl # other support interfaces are all available the iov_iter can
dnl # be correctly used in the uio structure.
dnl #
AS_IF([test "x$enable_vfs_iov_iter" = "xyes"], [
AC_DEFINE(HAVE_VFS_IOV_ITER, 1,
[All required iov_iter interfaces are available])
])
])
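When all of these probes succeed, HAVE_VFS_IOV_ITER means a uio can carry the kernel's iov_iter directly instead of copying out the underlying iovecs. A minimal sketch of the kind of code this enables, assuming the define above; the function is illustrative, not the exact uio implementation:

```
#include <linux/uio.h>

#ifdef HAVE_VFS_IOV_ITER
/*
 * Illustrative read path: fault the user pages in first, then do
 * one bulk copy out of the iterator.
 */
static ssize_t
example_copy_in(void *buf, size_t len, struct iov_iter *iter)
{
    if (iov_iter_fault_in_readable(iter, len))
        return (-EFAULT);
    /* Returns the number of bytes actually copied. */
    return (copy_from_iter(buf, len, iter));
}
#endif
```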


@@ -13,6 +13,7 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
    dnl # Sequential ZFS_LINUX_TRY_COMPILE tests
    ZFS_AC_KERNEL_FPU_HEADER
+   ZFS_AC_KERNEL_OBJTOOL_HEADER
    ZFS_AC_KERNEL_WAIT_QUEUE_ENTRY_T
    ZFS_AC_KERNEL_MISC_MINOR
    ZFS_AC_KERNEL_DECLARE_EVENT_CLASS

@@ -60,7 +61,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
    ZFS_AC_KERNEL_SRC_BIO
    ZFS_AC_KERNEL_SRC_BLKDEV
    ZFS_AC_KERNEL_SRC_BLK_QUEUE
-   ZFS_AC_KERNEL_SRC_GET_DISK_AND_MODULE
+   ZFS_AC_KERNEL_SRC_REVALIDATE_DISK
    ZFS_AC_KERNEL_SRC_GET_DISK_RO
    ZFS_AC_KERNEL_SRC_GENERIC_READLINK_GLOBAL
    ZFS_AC_KERNEL_SRC_DISCARD_GRANULARITY

@@ -104,6 +105,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
    ZFS_AC_KERNEL_SRC_VFS_DIRECT_IO
    ZFS_AC_KERNEL_SRC_VFS_RW_ITERATE
    ZFS_AC_KERNEL_SRC_VFS_GENERIC_WRITE_CHECKS
+   ZFS_AC_KERNEL_SRC_VFS_IOV_ITER
    ZFS_AC_KERNEL_SRC_KMAP_ATOMIC_ARGS
    ZFS_AC_KERNEL_SRC_FOLLOW_DOWN_ONE
    ZFS_AC_KERNEL_SRC_MAKE_REQUEST_FN

@@ -122,6 +124,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
    ZFS_AC_KERNEL_SRC_TOTALHIGH_PAGES
    ZFS_AC_KERNEL_SRC_KSTRTOUL
    ZFS_AC_KERNEL_SRC_PERCPU
+   ZFS_AC_KERNEL_SRC_CPU_HOTPLUG

    AC_MSG_CHECKING([for available kernel interfaces])
    ZFS_LINUX_TEST_COMPILE_ALL([kabi])

@@ -156,7 +159,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
    ZFS_AC_KERNEL_BIO
    ZFS_AC_KERNEL_BLKDEV
    ZFS_AC_KERNEL_BLK_QUEUE
-   ZFS_AC_KERNEL_GET_DISK_AND_MODULE
+   ZFS_AC_KERNEL_REVALIDATE_DISK
    ZFS_AC_KERNEL_GET_DISK_RO
    ZFS_AC_KERNEL_GENERIC_READLINK_GLOBAL
    ZFS_AC_KERNEL_DISCARD_GRANULARITY

@@ -200,6 +203,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
    ZFS_AC_KERNEL_VFS_DIRECT_IO
    ZFS_AC_KERNEL_VFS_RW_ITERATE
    ZFS_AC_KERNEL_VFS_GENERIC_WRITE_CHECKS
+   ZFS_AC_KERNEL_VFS_IOV_ITER
    ZFS_AC_KERNEL_KMAP_ATOMIC_ARGS
    ZFS_AC_KERNEL_FOLLOW_DOWN_ONE
    ZFS_AC_KERNEL_MAKE_REQUEST_FN

@@ -218,6 +222,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
    ZFS_AC_KERNEL_TOTALHIGH_PAGES
    ZFS_AC_KERNEL_KSTRTOUL
    ZFS_AC_KERNEL_PERCPU
+   ZFS_AC_KERNEL_CPU_HOTPLUG
])

dnl #

@@ -317,19 +322,15 @@ AC_DEFUN([ZFS_AC_KERNEL], [
    utsrelease2=$kernelbuild/include/linux/utsrelease.h
    utsrelease3=$kernelbuild/include/generated/utsrelease.h
    AS_IF([test -r $utsrelease1 && fgrep -q UTS_RELEASE $utsrelease1], [
-       utsrelease=linux/version.h
+       utsrelease=$utsrelease1
    ], [test -r $utsrelease2 && fgrep -q UTS_RELEASE $utsrelease2], [
-       utsrelease=linux/utsrelease.h
+       utsrelease=$utsrelease2
    ], [test -r $utsrelease3 && fgrep -q UTS_RELEASE $utsrelease3], [
-       utsrelease=generated/utsrelease.h
+       utsrelease=$utsrelease3
    ])

-   AS_IF([test "$utsrelease"], [
-       kernsrcver=`(echo "#include <$utsrelease>";
-           echo "kernsrcver=UTS_RELEASE") |
-           ${CPP} -I $kernelbuild/include - |
-           grep "^kernsrcver=" | cut -d \" -f 2`
+   AS_IF([test -n "$utsrelease"], [
+       kernsrcver=$($AWK '/UTS_RELEASE/ { gsub(/"/, "", $[3]); print $[3] }' $utsrelease)
        AS_IF([test -z "$kernsrcver"], [
            AC_MSG_RESULT([Not found])
            AC_MSG_ERROR([

@@ -536,7 +537,9 @@ dnl #
dnl # ZFS_LINUX_TEST_PROGRAM(C)([PROLOGUE], [BODY])
dnl #
m4_define([ZFS_LINUX_TEST_PROGRAM], [
+#include <linux/module.h>
$1
int
main (void)
{

@@ -544,6 +547,11 @@ $2
;
return 0;
}
+
+MODULE_DESCRIPTION("conftest");
+MODULE_AUTHOR(ZFS_META_AUTHOR);
+MODULE_VERSION(ZFS_META_VERSION "-" ZFS_META_RELEASE);
+MODULE_LICENSE($3);
])
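For reference, each generated conftest module now has roughly this shape (a sketch; PROLOGUE and BODY stand in for the m4 arguments, and the license string shown is the new third parameter):

```
#include <linux/module.h>
/* PROLOGUE ($1) goes here */

int
main(void)
{
    /* BODY ($2) goes here */
    ;
    return (0);
}

MODULE_DESCRIPTION("conftest");
MODULE_AUTHOR(ZFS_META_AUTHOR);
MODULE_VERSION(ZFS_META_VERSION "-" ZFS_META_RELEASE);
MODULE_LICENSE("Dual BSD/GPL");     /* $3, e.g. ZFS_META_LICENSE */
```

Embedding the MODULE_* boilerplate in the template is what lets the license-compatibility variant below pass the license as an argument instead of prepending its own prologue.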
dnl #

@@ -683,19 +691,21 @@ dnl # $3 - source
dnl # $4 - extra cflags
dnl # $5 - check license-compatibility
dnl #
+dnl # Check if the test source is buildable at all and then if it is
+dnl # license compatible.
+dnl #
dnl # N.B because all of the test cases are compiled in parallel they
dnl # must never depend on the results of previous tests. Each test
dnl # needs to be entirely independent.
dnl #
AC_DEFUN([ZFS_LINUX_TEST_SRC], [
-   ZFS_LINUX_CONFTEST_C([ZFS_LINUX_TEST_PROGRAM([[$2]], [[$3]])], [$1])
+   ZFS_LINUX_CONFTEST_C([ZFS_LINUX_TEST_PROGRAM([[$2]], [[$3]],
+       [["Dual BSD/GPL"]])], [$1])
    ZFS_LINUX_CONFTEST_MAKEFILE([$1], [yes], [$4])

    AS_IF([ test -n "$5" ], [
-       ZFS_LINUX_CONFTEST_C([ZFS_LINUX_TEST_PROGRAM([[
-           #include <linux/module.h>
-           MODULE_LICENSE("$5");
-           $2]], [[$3]])], [$1_license])
+       ZFS_LINUX_CONFTEST_C([ZFS_LINUX_TEST_PROGRAM(
+           [[$2]], [[$3]], [[$5]])], [$1_license])
        ZFS_LINUX_CONFTEST_MAKEFILE([$1_license], [yes], [$4])
    ])
])

@@ -785,11 +795,13 @@ dnl #
AC_DEFUN([ZFS_LINUX_TRY_COMPILE], [
    AS_IF([test "x$enable_linux_builtin" = "xyes"], [
        ZFS_LINUX_COMPILE_IFELSE(
-           [ZFS_LINUX_TEST_PROGRAM([[$1]], [[$2]])],
+           [ZFS_LINUX_TEST_PROGRAM([[$1]], [[$2]],
+           [[ZFS_META_LICENSE]])],
            [test -f build/conftest/conftest.o], [$3], [$4])
    ], [
        ZFS_LINUX_COMPILE_IFELSE(
-           [ZFS_LINUX_TEST_PROGRAM([[$1]], [[$2]])],
+           [ZFS_LINUX_TEST_PROGRAM([[$1]], [[$2]],
+           [[ZFS_META_LICENSE]])],
            [test -f build/conftest/conftest.ko], [$3], [$4])
    ])
])

@@ -855,7 +867,7 @@ dnl # provided via the fifth parameter
dnl #
AC_DEFUN([ZFS_LINUX_TRY_COMPILE_HEADER], [
    ZFS_LINUX_COMPILE_IFELSE(
-       [ZFS_LINUX_TEST_PROGRAM([[$1]], [[$2]])],
+       [ZFS_LINUX_TEST_PROGRAM([[$1]], [[$2]], [[ZFS_META_LICENSE]])],
        [test -f build/conftest/conftest.ko],
        [$3], [$4], [$5])
])


@@ -1,6 +1,6 @@
AC_DEFUN([ZFS_AC_CONFIG_USER_MOUNT_HELPER], [
    AC_ARG_WITH(mounthelperdir,
-       AC_HELP_STRING([--with-mounthelperdir=DIR],
+       AS_HELP_STRING([--with-mounthelperdir=DIR],
        [install mount.zfs in dir [[/sbin]]]),
        mounthelperdir=$withval,mounthelperdir=/sbin)


@@ -1,7 +1,7 @@
AC_DEFUN([ZFS_AC_CONFIG_USER_DRACUT], [
    AC_MSG_CHECKING(for dracut directory)
    AC_ARG_WITH([dracutdir],
-       AC_HELP_STRING([--with-dracutdir=DIR],
+       AS_HELP_STRING([--with-dracutdir=DIR],
        [install dracut helpers @<:@default=check@:>@]),
        [dracutdir=$withval],
        [dracutdir=check])


@@ -1,6 +1,6 @@
AC_DEFUN([ZFS_AC_CONFIG_USER_ZFSEXEC], [
    AC_ARG_WITH(zfsexecdir,
-       AC_HELP_STRING([--with-zfsexecdir=DIR],
+       AS_HELP_STRING([--with-zfsexecdir=DIR],
        [install scripts [[@<:@libexecdir@:>@/zfs]]]),
        [zfsexecdir=$withval],
        [zfsexecdir="${libexecdir}/zfs"])


@@ -3,13 +3,12 @@ dnl # glibc 2.25
dnl #
AC_DEFUN([ZFS_AC_CONFIG_USER_MAKEDEV_IN_SYSMACROS], [
    AC_MSG_CHECKING([makedev() is declared in sys/sysmacros.h])
-   AC_TRY_COMPILE(
-   [
+   AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[
        #include <sys/sysmacros.h>
-   ],[
+   ]], [[
        int k;
        k = makedev(0,0);
-   ],[
+   ]])],[
        AC_MSG_RESULT(yes)
        AC_DEFINE(HAVE_MAKEDEV_IN_SYSMACROS, 1,
            [makedev() is declared in sys/sysmacros.h])

@@ -23,13 +22,12 @@ dnl #
dnl #
AC_DEFUN([ZFS_AC_CONFIG_USER_MAKEDEV_IN_MKDEV], [
    AC_MSG_CHECKING([makedev() is declared in sys/mkdev.h])
-   AC_TRY_COMPILE(
-   [
+   AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[
        #include <sys/mkdev.h>
-   ],[
+   ]], [[
        int k;
        k = makedev(0,0);
-   ],[
+   ]])],[
        AC_MSG_RESULT(yes)
        AC_DEFINE(HAVE_MAKEDEV_IN_MKDEV, 1,
            [makedev() is declared in sys/mkdev.h])


@@ -1,27 +1,27 @@
 AC_DEFUN([ZFS_AC_CONFIG_USER_SYSTEMD], [
 	AC_ARG_ENABLE(systemd,
-		AC_HELP_STRING([--enable-systemd],
+		AS_HELP_STRING([--enable-systemd],
 		[install systemd unit/preset files [[default: yes]]]),
 		[enable_systemd=$enableval],
 		[enable_systemd=check])
 
 	AC_ARG_WITH(systemdunitdir,
-		AC_HELP_STRING([--with-systemdunitdir=DIR],
+		AS_HELP_STRING([--with-systemdunitdir=DIR],
 		[install systemd unit files in dir [[/usr/lib/systemd/system]]]),
 		systemdunitdir=$withval,systemdunitdir=/usr/lib/systemd/system)
 
 	AC_ARG_WITH(systemdpresetdir,
-		AC_HELP_STRING([--with-systemdpresetdir=DIR],
+		AS_HELP_STRING([--with-systemdpresetdir=DIR],
 		[install systemd preset files in dir [[/usr/lib/systemd/system-preset]]]),
 		systemdpresetdir=$withval,systemdpresetdir=/usr/lib/systemd/system-preset)
 
 	AC_ARG_WITH(systemdmodulesloaddir,
-		AC_HELP_STRING([--with-systemdmodulesloaddir=DIR],
+		AS_HELP_STRING([--with-systemdmodulesloaddir=DIR],
 		[install systemd module load files into dir [[/usr/lib/modules-load.d]]]),
 		systemdmodulesloaddir=$withval,systemdmodulesloaddir=/usr/lib/modules-load.d)
 
 	AC_ARG_WITH(systemdgeneratordir,
-		AC_HELP_STRING([--with-systemdgeneratordir=DIR],
+		AS_HELP_STRING([--with-systemdgeneratordir=DIR],
 		[install systemd generators in dir [[/usr/lib/systemd/system-generators]]]),
 		systemdgeneratordir=$withval,systemdgeneratordir=/usr/lib/systemd/system-generators)
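
The AC_HELP_STRING to AS_HELP_STRING renames across these macros only modernize how the ./configure --help text is formatted; option names and defaults are untouched. A typical invocation against the defaults above (the paths are the documented defaults, shown purely for illustration):

```
./configure \
	--enable-systemd \
	--with-systemdunitdir=/usr/lib/systemd/system \
	--with-systemdpresetdir=/usr/lib/systemd/system-preset
```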

@@ -1,6 +1,6 @@
 AC_DEFUN([ZFS_AC_CONFIG_USER_SYSVINIT], [
 	AC_ARG_ENABLE(sysvinit,
-		AC_HELP_STRING([--enable-sysvinit],
+		AS_HELP_STRING([--enable-sysvinit],
 		[install SysV init scripts [default: yes]]),
 		[],enable_sysvinit=yes)

@@ -1,7 +1,7 @@
 AC_DEFUN([ZFS_AC_CONFIG_USER_UDEV], [
 	AC_MSG_CHECKING(for udev directories)
 	AC_ARG_WITH(udevdir,
-		AC_HELP_STRING([--with-udevdir=DIR],
+		AS_HELP_STRING([--with-udevdir=DIR],
 		[install udev helpers @<:@default=check@:>@]),
 		[udevdir=$withval],
 		[udevdir=check])
@@ -18,7 +18,7 @@ AC_DEFUN([ZFS_AC_CONFIG_USER_UDEV], [
 	])
 
 	AC_ARG_WITH(udevruledir,
-		AC_HELP_STRING([--with-udevruledir=DIR],
+		AS_HELP_STRING([--with-udevruledir=DIR],
 		[install udev rules [[UDEVDIR/rules.d]]]),
 		[udevruledir=$withval],
 		[udevruledir="${udevdir}/rules.d"])

@@ -180,7 +180,7 @@ AC_DEFUN([ZFS_AC_CONFIG], [
 		[Config file 'kernel|user|all|srpm']),
 		[ZFS_CONFIG="$withval"])
 	AC_ARG_ENABLE([linux-builtin],
-		[AC_HELP_STRING([--enable-linux-builtin],
+		[AS_HELP_STRING([--enable-linux-builtin],
 		[Configure for builtin in-tree kernel modules @<:@default=no@:>@])],
 		[],
 		[enable_linux_builtin=no])

@@ -36,7 +36,7 @@ AC_LANG(C)
 ZFS_AC_META
 AC_CONFIG_AUX_DIR([config])
 AC_CONFIG_MACRO_DIR([config])
-AC_CANONICAL_SYSTEM
+AC_CANONICAL_TARGET
 AM_MAINTAINER_MODE
 m4_ifdef([AM_SILENT_RULES], [AM_SILENT_RULES([yes])])
 AM_INIT_AUTOMAKE([subdir-objects])
@@ -45,9 +45,9 @@ AC_CONFIG_HEADERS([zfs_config.h], [
 	awk -f ${ac_srcdir}/config/config.awk zfs_config.h.tmp >zfs_config.h &&
 	rm zfs_config.h.tmp) || exit 1])
 
+LT_INIT
 AC_PROG_INSTALL
 AC_PROG_CC
-AC_PROG_LIBTOOL
 PKG_PROG_PKG_CONFIG
 AM_PROG_AS
 AM_PROG_CC_C_O
@@ -86,6 +86,7 @@ AC_CONFIG_FILES([
 	cmd/ztest/Makefile
 	cmd/zvol_id/Makefile
 	cmd/zvol_wait/Makefile
+	cmd/zpool_influxdb/Makefile
 	contrib/Makefile
 	contrib/bash_completion.d/Makefile
 	contrib/bpftrace/Makefile
@@ -208,6 +209,7 @@ AC_CONFIG_FILES([
 	tests/zfs-tests/cmd/btree_test/Makefile
 	tests/zfs-tests/cmd/chg_usr_exec/Makefile
 	tests/zfs-tests/cmd/devname2devid/Makefile
+	tests/zfs-tests/cmd/draid/Makefile
 	tests/zfs-tests/cmd/dir_rd_update/Makefile
 	tests/zfs-tests/cmd/file_check/Makefile
 	tests/zfs-tests/cmd/file_trunc/Makefile
@@ -342,6 +344,7 @@ AC_CONFIG_FILES([
 	tests/zfs-tests/tests/functional/inheritance/Makefile
 	tests/zfs-tests/tests/functional/inuse/Makefile
 	tests/zfs-tests/tests/functional/io/Makefile
+	tests/zfs-tests/tests/functional/l2arc/Makefile
 	tests/zfs-tests/tests/functional/large_files/Makefile
 	tests/zfs-tests/tests/functional/largest_pool/Makefile
 	tests/zfs-tests/tests/functional/libzfs/Makefile
@@ -358,7 +361,6 @@ AC_CONFIG_FILES([
 	tests/zfs-tests/tests/functional/nopwrite/Makefile
 	tests/zfs-tests/tests/functional/online_offline/Makefile
 	tests/zfs-tests/tests/functional/pam/Makefile
-	tests/zfs-tests/tests/functional/persist_l2arc/Makefile
 	tests/zfs-tests/tests/functional/pool_checkpoint/Makefile
 	tests/zfs-tests/tests/functional/pool_names/Makefile
 	tests/zfs-tests/tests/functional/poolversion/Makefile
@@ -394,6 +396,7 @@ AC_CONFIG_FILES([
 	tests/zfs-tests/tests/functional/vdev_zaps/Makefile
 	tests/zfs-tests/tests/functional/write_dirs/Makefile
 	tests/zfs-tests/tests/functional/xattr/Makefile
+	tests/zfs-tests/tests/functional/zpool_influxdb/Makefile
 	tests/zfs-tests/tests/functional/zvol/Makefile
 	tests/zfs-tests/tests/functional/zvol/zvol_ENOSPC/Makefile
 	tests/zfs-tests/tests/functional/zvol/zvol_cli/Makefile

@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/bin/sh
 
 . /lib/dracut-zfs-lib.sh
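
This and the following dracut scripts drop bash for plain #!/bin/sh, so from here on they must stay POSIX-clean for dash and busybox sh. The prefix-strip idiom used throughout the scripts is the portable stand-in for a bash pattern match; a minimal sketch:

```
# POSIX replacement for the bash-only test [[ "$root" == zfs:* ]]
root="zfs:rpool/ROOT/default"
if [ "${root##zfs:}" != "${root}" ]; then
	echo "root names a ZFS dataset"
fi
```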

@@ -85,7 +85,13 @@ install() {
 	fi
 
 	# Synchronize initramfs and system hostid
-	zgenhostid -o "${initdir}/etc/hostid" "$(hostid)"
+	if [ -f @sysconfdir@/hostid ]; then
+		inst @sysconfdir@/hostid
+		type mark_hostonly >/dev/null 2>&1 && mark_hostonly @sysconfdir@/hostid
+	elif HOSTID="$(hostid 2>/dev/null)" && [ "${HOSTID}" != "00000000" ]; then
+		zgenhostid -o "${initdir}@sysconfdir@/hostid" "${HOSTID}"
+		type mark_hostonly >/dev/null 2>&1 && mark_hostonly @sysconfdir@/hostid
+	fi
 
 	if dracut_module_included "systemd"; then
 		mkdir -p "${initdir}/$systemdsystemunitdir/zfs-import.target.wants"
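
A hostid of 00000000 means "unset", which is why the elif branch refuses to bake that value into the initramfs and prefers an existing on-disk hostid file. What zgenhostid writes can be inspected directly; the /tmp path below is only for illustration:

```
hostid                           # e.g. 0f7cca22; 00000000 means unset
zgenhostid -o /tmp/hostid "$(hostid)"
od -An -tx1 /tmp/hostid          # the four bytes the kernel reads back
```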

@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/bin/sh
 
 . /lib/dracut-zfs-lib.sh
@@ -58,7 +58,7 @@ ZFS_POOL="${ZFS_DATASET%%/*}"
 
 if import_pool "${ZFS_POOL}" ; then
 	# Load keys if we can or if we need to
-	if [ $(zpool list -H -o feature@encryption $(echo "${ZFS_POOL}" | awk -F\/ '{print $1}')) = 'active' ]; then
+	if [ "$(zpool list -H -o feature@encryption "$(echo "${ZFS_POOL}" | awk -F/ '{print $1}')")" = 'active' ]; then
 		# if the root dataset has encryption enabled
 		ENCRYPTIONROOT="$(zfs get -H -o value encryptionroot "${ZFS_DATASET}")"
 		if ! [ "${ENCRYPTIONROOT}" = "-" ]; then
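
The added quoting matters because an unquoted command substitution is field-split by the shell: an empty result collapses the test to [ = 'active' ], which is a syntax error at runtime. A sketch of the fixed pattern, reusing the ZFS_DATASET variable from this script (the %%/* expansion is an alternative to the awk pipeline for extracting the pool name):

```
pool="${ZFS_DATASET%%/*}"
if [ "$(zpool list -H -o feature@encryption "$pool")" = "active" ]; then
	echo "encryption feature active on $pool"
fi
```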

@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/bin/sh
 
 . /lib/dracut-lib.sh

@@ -1,4 +1,4 @@
-#!/usr/bin/env bash
+#!/bin/sh
 
 echo "zfs-generator: starting" >> /dev/kmsg
@@ -11,7 +11,7 @@ GENERATOR_DIR="$1"
 [ -f /lib/dracut-lib.sh ] && dracutlib=/lib/dracut-lib.sh
 [ -f /usr/lib/dracut/modules.d/99base/dracut-lib.sh ] && dracutlib=/usr/lib/dracut/modules.d/99base/dracut-lib.sh
 
-type getarg >/dev/null 2>&1 || {
+command -v getarg >/dev/null 2>&1 || {
     echo "zfs-generator: loading Dracut library from $dracutlib" >> /dev/kmsg
     . "$dracutlib"
 }
@@ -22,16 +22,17 @@
 # If root is not ZFS= or zfs: or rootfstype is not zfs
 # then we are not supposed to handle it.
-[ "${root##zfs:}" = "${root}" -a "${root##ZFS=}" = "${root}" -a "$rootfstype" != "zfs" ] && exit 0
+[ "${root##zfs:}" = "${root}" ] &&
+    [ "${root##ZFS=}" = "${root}" ] &&
+    [ "$rootfstype" != "zfs" ] &&
+    exit 0
 
 rootfstype=zfs
-if echo "${rootflags}" | grep -Eq '^zfsutil$|^zfsutil,|,zfsutil$|,zfsutil,' ; then
-    true
-elif test -n "${rootflags}" ; then
-    rootflags="zfsutil,${rootflags}"
-else
-    rootflags=zfsutil
-fi
+case ",${rootflags}," in
+    *,zfsutil,*) ;;
+    ,,) rootflags=zfsutil ;;
+    *) rootflags="zfsutil,${rootflags}" ;;
+esac
 
 echo "zfs-generator: writing extension for sysroot.mount to $GENERATOR_DIR"/sysroot.mount.d/zfs-enhancement.conf >> /dev/kmsg
@@ -58,4 +59,4 @@
 [ -d "$GENERATOR_DIR"/initrd-root-fs.target.requires ] || mkdir -p "$GENERATOR_DIR"/initrd-root-fs.target.requires
 ln -s ../sysroot.mount "$GENERATOR_DIR"/initrd-root-fs.target.requires/sysroot.mount
 
 echo "zfs-generator: finished" >> /dev/kmsg

@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/bin/sh
 
 command -v getarg >/dev/null || . /lib/dracut-lib.sh
 command -v getargbool >/dev/null || {
@@ -144,7 +144,7 @@ ask_for_password() {
 	{ flock -s 9;
 		# Prompt for password with plymouth, if installed and running.
-		if type plymouth >/dev/null 2>&1 && plymouth --ping 2>/dev/null; then
+		if plymouth --ping 2>/dev/null; then
 			plymouth ask-for-password \
 			    --prompt "$ply_prompt" --number-of-tries="$ply_tries" \
 			    --command="$ply_cmd"
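
Dropping the type plymouth guard is safe because invoking a missing command simply fails the if (the shell reports status 127), which lands in the same branch the guard selected. Demonstrated in isolation:

```
if plymouth --ping 2>/dev/null; then
	echo "plymouth is running"
else
	echo "no splash: plymouth absent (status 127) or not running"
fi
```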

@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/bin/sh
 
 # only run this on systemd systems, we handle the decrypt in mount-zfs.sh in the mount hook otherwise
 [ -e /bin/systemctl ] || return 0
@@ -17,10 +17,8 @@
 [ "${root##zfs:}" = "${root}" ] && [ "${root##ZFS=}" = "${root}" ] && [ "$rootfstype" != "zfs" ] && exit 0
 
 # There is a race between the zpool import and the pre-mount hooks, so we wait for a pool to be imported
-while true; do
-    zpool list -H | grep -q -v '^$' && break
-    [ "$(systemctl is-failed zfs-import-cache.service)" = 'failed' ] && exit 1
-    [ "$(systemctl is-failed zfs-import-scan.service)" = 'failed' ] && exit 1
+while [ "$(zpool list -H)" = "" ]; do
+    systemctl is-failed --quiet zfs-import-cache.service zfs-import-scan.service && exit 1
     sleep 0.1s
 done
@@ -34,11 +32,11 @@ else
 fi
 
 # if pool encryption is active and the zfs command understands '-o encryption'
-if [ "$(zpool list -H -o feature@encryption $(echo "${BOOTFS}" | awk -F\/ '{print $1}'))" = 'active' ]; then
+if [ "$(zpool list -H -o feature@encryption "$(echo "${BOOTFS}" | awk -F/ '{print $1}')")" = 'active' ]; then
 	# if the root dataset has encryption enabled
-	ENCRYPTIONROOT=$(zfs get -H -o value encryptionroot "${BOOTFS}")
+	ENCRYPTIONROOT="$(zfs get -H -o value encryptionroot "${BOOTFS}")"
 	# where the key is stored (in a file or loaded via prompt)
-	KEYLOCATION=$(zfs get -H -o value keylocation "${ENCRYPTIONROOT}")
+	KEYLOCATION="$(zfs get -H -o value keylocation "${ENCRYPTIONROOT}")"
 	if ! [ "${ENCRYPTIONROOT}" = "-" ]; then
 		KEYSTATUS="$(zfs get -H -o value keystatus "${ENCRYPTIONROOT}")"
 		# continue only if the key needs to be loaded
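
systemctl is-failed accepts multiple units and exits 0 when at least one of them is in the failed state, so the two per-service tests collapse into a single call; --quiet suppresses the per-unit state output. Probing the same condition by hand:

```
if systemctl is-failed --quiet zfs-import-cache.service zfs-import-scan.service; then
	echo "a ZFS import service failed"
else
	echo "no import failure recorded"
fi
```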

@@ -1,6 +1,6 @@
-#!/bin/bash
+#!/bin/sh
 
-type getarg >/dev/null 2>&1 || . /lib/dracut-lib.sh
+command -v getarg >/dev/null 2>&1 || . /lib/dracut-lib.sh
 
 if zpool list 2>&1 | grep -q 'no pools available' ; then
 	info "ZFS: No active pools, no need to export anything."

@@ -15,4 +15,4 @@ esac
 
 . /usr/share/initramfs-tools/hook-functions
 
-copy_exec /usr/share/initramfs-tools/zfsunlock /usr/bin
+copy_exec /usr/share/initramfs-tools/zfsunlock /usr/bin/zfsunlock

@@ -386,6 +386,8 @@ unmount_unload(pam_handle_t *pamh, const char *ds_name)
 typedef struct {
 	char *homes_prefix;
 	char *runstatedir;
+	char *homedir;
+	char *dsname;
 	uid_t uid;
 	const char *username;
 	int unmount_and_unload;
@@ -423,6 +425,8 @@ zfs_key_config_load(pam_handle_t *pamh, zfs_key_config_t *config,
 	config->uid = entry->pw_uid;
 	config->username = name;
 	config->unmount_and_unload = 1;
+	config->dsname = NULL;
+	config->homedir = NULL;
 	for (int c = 0; c < argc; c++) {
 		if (strncmp(argv[c], "homes=", 6) == 0) {
 			free(config->homes_prefix);
@@ -432,6 +436,8 @@ zfs_key_config_load(pam_handle_t *pamh, zfs_key_config_t *config,
 			config->runstatedir = strdup(argv[c] + 12);
 		} else if (strcmp(argv[c], "nounmount") == 0) {
 			config->unmount_and_unload = 0;
+		} else if (strcmp(argv[c], "prop_mountpoint") == 0) {
+			config->homedir = strdup(entry->pw_dir);
 		}
 	}
 	return (0);
@@ -441,11 +447,59 @@ static void
 zfs_key_config_free(zfs_key_config_t *config)
 {
 	free(config->homes_prefix);
+	free(config->runstatedir);
+	free(config->homedir);
+	free(config->dsname);
+}
+
+static int
+find_dsname_by_prop_value(zfs_handle_t *zhp, void *data)
+{
+	zfs_type_t type = zfs_get_type(zhp);
+	zfs_key_config_t *target = data;
+	char mountpoint[ZFS_MAXPROPLEN];
+
+	/* Skip any datasets whose type does not match */
+	if ((type & ZFS_TYPE_FILESYSTEM) == 0) {
+		zfs_close(zhp);
+		return (0);
+	}
+
+	/* Skip any datasets whose mountpoint does not match */
+	(void) zfs_prop_get(zhp, ZFS_PROP_MOUNTPOINT, mountpoint,
+	    sizeof (mountpoint), NULL, NULL, 0, B_FALSE);
+	if (strcmp(target->homedir, mountpoint) != 0) {
+		zfs_close(zhp);
+		return (0);
+	}
+
+	target->dsname = strdup(zfs_get_name(zhp));
+	zfs_close(zhp);
+	return (1);
 }
 
 static char *
 zfs_key_config_get_dataset(zfs_key_config_t *config)
 {
+	if (config->homedir != NULL &&
+	    config->homes_prefix != NULL) {
+		zfs_handle_t *zhp = zfs_open(g_zfs, config->homes_prefix,
+		    ZFS_TYPE_FILESYSTEM);
+		if (zhp == NULL) {
+			pam_syslog(NULL, LOG_ERR, "dataset %s not found",
+			    config->homes_prefix);
+			zfs_close(zhp);
+			return (NULL);
+		}
+
+		(void) zfs_iter_filesystems(zhp, find_dsname_by_prop_value,
+		    config);
+		zfs_close(zhp);
+		char *dsname = config->dsname;
+		config->dsname = NULL;
+		return (dsname);
+	}
+
 	size_t len = ZFS_MAX_DATASET_NAME_LEN;
 	size_t total_len = strlen(config->homes_prefix) + 1
 	    + strlen(config->username);
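
With the new prop_mountpoint option, pam_zfs_key stops deriving the dataset name from the username and instead scans the children of the homes= prefix for one whose mountpoint equals the user's home directory. Roughly the same lookup expressed with the CLI (the pool name and home path are made-up examples):

```
homes_prefix="rpool/home"        # value of the homes= module option
homedir="/home/alice"            # the passwd entry's pw_dir
zfs list -H -t filesystem -d 1 -o name,mountpoint "$homes_prefix" |
	awk -v d="$homedir" '$2 == d { print $1; exit }'
```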

@@ -8,6 +8,7 @@ Wants=zfs-mount.service
 After=zfs-mount.service
 PartOf=nfs-server.service nfs-kernel-server.service
 PartOf=smb.service
+ConditionPathIsDirectory=/sys/module/zfs
 
 [Service]
 Type=oneshot

@@ -3,6 +3,7 @@ Description=Wait for ZFS Volume (zvol) links in /dev
 DefaultDependencies=no
 After=systemd-udev-settle.service
 After=zfs-import.target
+ConditionPathIsDirectory=/sys/module/zfs
 
 [Service]
 Type=oneshot

@@ -1,6 +1,7 @@
 [Unit]
 Description=ZFS Event Daemon (zed)
 Documentation=man:zed(8)
+ConditionPathIsDirectory=/sys/module/zfs
 
 [Service]
 ExecStart=@sbindir@/zed -F
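
The ConditionPathIsDirectory line added to these three units keeps them from starting, and then failing, on machines where the zfs module has never been loaded: /sys/module/zfs exists only while the module is present. The equivalent shell test:

```
if [ -d /sys/module/zfs ]; then
	echo "zfs kernel module loaded"
else
	echo "zfs module not loaded; the ZFS units stay inactive"
fi
```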

@@ -88,8 +88,8 @@ typedef enum zfs_error {
 	EZFS_ZONED,		/* used improperly in local zone */
 	EZFS_MOUNTFAILED,	/* failed to mount dataset */
 	EZFS_UMOUNTFAILED,	/* failed to unmount dataset */
-	EZFS_UNSHARENFSFAILED,	/* unshare(1M) failed */
-	EZFS_SHARENFSFAILED,	/* share(1M) failed */
+	EZFS_UNSHARENFSFAILED,	/* failed to unshare over nfs */
+	EZFS_SHARENFSFAILED,	/* failed to share over nfs */
 	EZFS_PERM,		/* permission denied */
 	EZFS_NOSPC,		/* out of space */
 	EZFS_FAULT,		/* bad address */
@@ -455,6 +455,7 @@ extern void zpool_explain_recover(libzfs_handle_t *, const char *, int,
     nvlist_t *);
 extern int zpool_checkpoint(zpool_handle_t *);
 extern int zpool_discard_checkpoint(zpool_handle_t *);
+extern boolean_t zpool_is_draid_spare(const char *);
 
 /*
  * Basic handle manipulations.  These functions do not create or destroy the
@@ -556,7 +557,7 @@ extern void zfs_prune_proplist(zfs_handle_t *, uint8_t *);
 /*
  * zpool property management
  */
-extern int zpool_expand_proplist(zpool_handle_t *, zprop_list_t **);
+extern int zpool_expand_proplist(zpool_handle_t *, zprop_list_t **, boolean_t);
 extern int zpool_prop_get_feature(zpool_handle_t *, const char *, char *,
     size_t);
 extern const char *zpool_prop_default_string(zpool_prop_t);

@@ -30,6 +30,7 @@
 #define	_OPENSOLARIS_SYS_MISC_H_
 
 #include <sys/limits.h>
+#include <sys/filio.h>
 
 #define	MAXUID	UID_MAX
@@ -40,8 +41,8 @@
 #define	_FIOGDIO	(INT_MIN+1)
 #define	_FIOSDIO	(INT_MIN+2)
 
-#define	_FIO_SEEK_DATA	FIOSEEKDATA
-#define	_FIO_SEEK_HOLE	FIOSEEKHOLE
+#define	F_SEEK_DATA	FIOSEEKDATA
+#define	F_SEEK_HOLE	FIOSEEKHOLE
 
 struct opensolaris_utsname {
 	char	*sysname;
@@ -53,4 +54,7 @@ struct opensolaris_utsname {
 
 extern char hw_serial[11];
 
+#define	task_io_account_read(n)
+#define	task_io_account_write(n)
+
 #endif	/* _OPENSOLARIS_SYS_MISC_H_ */

@@ -57,6 +57,8 @@
 #define	ZFS_MODULE_PARAM_CALL(scope_prefix, name_prefix, name, func, _, perm, desc) \
     ZFS_MODULE_PARAM_CALL_IMPL(_vfs_ ## scope_prefix, name, perm, func ## _args(name_prefix ## name), desc)
 
+#define	ZFS_MODULE_VIRTUAL_PARAM_CALL ZFS_MODULE_PARAM_CALL
+
 #define	param_set_arc_long_args(var) \
     CTLTYPE_ULONG, &var, 0, param_set_arc_long, "LU"
@@ -84,6 +86,9 @@
 #define	param_set_max_auto_ashift_args(var) \
     CTLTYPE_U64, &var, 0, param_set_max_auto_ashift, "QU"
 
+#define	fletcher_4_param_set_args(var) \
+    CTLTYPE_STRING, NULL, 0, fletcher_4_param, "A"
+
 #include <sys/kernel.h>
 #define	module_init(fn) \
 static void \
@@ -93,6 +98,13 @@ wrap_ ## fn(void *dummy __unused) \
 } \
 SYSINIT(zfs_ ## fn, SI_SUB_LAST, SI_ORDER_FIRST, wrap_ ## fn, NULL)
 
+#define	module_init_early(fn) \
+static void \
+wrap_ ## fn(void *dummy __unused) \
+{ \
+	fn(); \
+} \
+SYSINIT(zfs_ ## fn, SI_SUB_INT_CONFIG_HOOKS, SI_ORDER_FIRST, wrap_ ## fn, NULL)
+
 #define	module_exit(fn) \
 static void \

@@ -34,6 +34,7 @@
 #include <sys/vnode.h>
 
 struct mount;
 struct vattr;
+struct znode;
 
 int secpolicy_nfs(cred_t *cr);
 int secpolicy_zfs(cred_t *crd);
@@ -57,7 +58,7 @@ int secpolicy_vnode_setattr(cred_t *cr, vnode_t *vp, struct vattr *vap,
     int unlocked_access(void *, int, cred_t *), void *node);
 int secpolicy_vnode_create_gid(cred_t *cr);
 int secpolicy_vnode_setids_setgids(vnode_t *vp, cred_t *cr, gid_t gid);
-int secpolicy_vnode_setid_retain(vnode_t *vp, cred_t *cr,
+int secpolicy_vnode_setid_retain(struct znode *zp, cred_t *cr,
     boolean_t issuidroot);
 void secpolicy_setid_clear(struct vattr *vap, vnode_t *vp, cred_t *cr);
 int secpolicy_setid_setsticky_clear(vnode_t *vp, struct vattr *vap,

@@ -80,6 +80,7 @@ extern "C" {
 #define	kpreempt_disable()	critical_enter()
 #define	kpreempt_enable()	critical_exit()
 #define	CPU_SEQID		curcpu
+#define	CPU_SEQID_UNSTABLE	curcpu
 #define	is_system_labeled()	0
 
 /*
  * Convert a single byte to/from binary-coded decimal (BCD).

@@ -64,7 +64,7 @@ typedef u_int		uint_t;
 typedef u_char		uchar_t;
 typedef u_short		ushort_t;
 typedef u_long		ulong_t;
-typedef u_int		minor_t;
+typedef int		minor_t;
 /* END CSTYLED */
 #ifndef _OFF64_T_DECLARED
 #define	_OFF64_T_DECLARED

@@ -43,27 +43,6 @@ typedef struct uio uio_t;
 typedef struct iovec iovec_t;
 typedef enum uio_seg uio_seg_t;
 
-typedef enum xuio_type {
-	UIOTYPE_ASYNCIO,
-	UIOTYPE_ZEROCOPY
-} xuio_type_t;
-
-typedef struct xuio {
-	uio_t xu_uio;
-
-	/* Extended uio fields */
-	enum xuio_type xu_type;	/* What kind of uio structure? */
-	union {
-		struct {
-			int xu_zc_rw;
-			void *xu_zc_priv;
-		} xu_zc;
-	} xu_ext;
-} xuio_t;
-
-#define	XUIO_XUZC_PRIV(xuio)	xuio->xu_ext.xu_zc.xu_zc_priv
-#define	XUIO_XUZC_RW(xuio)	xuio->xu_ext.xu_zc.xu_zc_rw
-
 static __inline int
 zfs_uiomove(void *cp, size_t n, enum uio_rw dir, uio_t *uio)
 {
@@ -82,6 +61,8 @@ void	uioskip(uio_t *uiop, size_t n);
 #define	uio_iovcnt(uio)		(uio)->uio_iovcnt
 #define	uio_iovlen(uio, idx)	(uio)->uio_iov[(idx)].iov_len
 #define	uio_iovbase(uio, idx)	(uio)->uio_iov[(idx)].iov_base
+#define	uio_fault_disable(uio, set)
+#define	uio_prefaultpages(size, uio)	(0)
 
 static inline void
 uio_iov_at_index(uio_t *uio, uint_t idx, void **base, uint64_t *len)

@@ -8,7 +8,7 @@ KERNEL_H = \
 	zfs_dir.h \
 	zfs_ioctl_compat.h \
 	zfs_vfsops_os.h \
-	zfs_vnops.h \
+	zfs_vnops_os.h \
 	zfs_znode_impl.h \
 	zpl.h

@@ -56,7 +56,6 @@
 #define	tsd_set(key, value)	osd_thread_set(curthread, (key), (value))
 
 #define	fm_panic	panic
-#define	cond_resched()	kern_yield(PRI_USER)
 
 extern int zfs_debug_level;
 extern struct mtx zfs_debug_mtx;
 #define	ZFS_LOG(lvl, ...) do { \

@@ -26,8 +26,9 @@
  * $FreeBSD$
  */
 
-#ifndef _SYS_ZFS_VNOPS_H_
-#define _SYS_ZFS_VNOPS_H_
+#ifndef _SYS_FS_ZFS_VNOPS_OS_H
+#define _SYS_FS_ZFS_VNOPS_OS_H
 
 int dmu_write_pages(objset_t *os, uint64_t object, uint64_t offset,
     uint64_t size, struct vm_page **ppa, dmu_tx_t *tx);
 int dmu_read_pages(objset_t *os, uint64_t object, vm_page_t *ma, int count,

Some files were not shown because too many files have changed in this diff.