Reporting SCSI errors to console is often useless, pollutes logs and may
affect performance. For debugging there is kern.cam.ctl.debug sysctl
MFC after: 1 week
At this point IMMED flag is translated to MNT_NOWAIT flag of VOP_FSYNC(),
hoping that file system implements that (ZFS seems doesn't).
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Before this change SYNCHRONIZE CACHE commands were executed exclusively,
as if they had ORDERED tag. But looking through SCSI specs I've found
no any reason to be so strict. For reads this ordering seems pointless.
For writes it looks less obvious, so I left ordering against preceeding
write commands, while following ones are no longer required to wait.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
During vMotion and Clone VMware by default runs multiple sequential 4MB
XCOPY requests same time. If CTL issues reads sequentially in 1MB chunks
for each XCOPY command, reads from different commands are not detected
as sequential by serseq option code and allowed to execute simultaneously.
Such read pattern confused ZFS prefetcher, causing suboptimal disk access.
Issuing all reads same time make serseq code work properly, serializing
reads both within each XCOPY command and between them.
My tests with ZFS pool of 14 disks in RAID10 shows prefetcher efficiency
improved from 37% to 99.7%, copying speed improved by 10-60%, average
read latency reduced twice on HDD layer and by five times on zvol layer.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
- Use pointer assignment rather than a combination of pointers and
flags to switch buffers between unmapped and mapped. This eliminates
multiple flags and generally simplifies the logic.
- Eliminate b_saveaddr since it is only used with pager bufs which have
their b_data re-initialized on each allocation.
- Gather up some convenience routines in the buffer cache for
manipulating buf space and buf malloc space.
- Add an inline, buf_mapped(), to standardize checks around unmapped
buffers.
In collaboration with: mlaier
Reviewed by: kib
Tested by: pho (many small revisions ago)
Sponsored by: EMC / Isilon Storage Division
Previously several places were doing it on its own, partially
incorrectly (e.g. without the filedesc locked) or even actively harmful
by populating jdir or assigning rootvnode without vrefing it.
Reviewed by: kib
To avoid conflicts between target and initiator devices in CAM, make
CTL use target ID reported by HBA as its initiator_id in XPT_PATH_INQ.
That target ID is known to never be used for initiator role, so it won't
conflict. For Fibre Channel and FireWire HBAs this specific ID choice
is irrelevant since all target IDs there are virtual. Same time for SPI
HBAs it seems could be even requirement to use same target ID for both
initiator and target roles.
While there are some more things to polish in isp(4) driver, first tests
of using both roles same time on the same port appeared successfull:
# camcontrol devlist -v
scbus0 on isp0 bus 0:
<FREEBSD CTLDISK 0001> at scbus0 target 1 lun 0 (da20,pass21)
<> at scbus0 target 256 lun 0 (ctl0)
<> at scbus0 target -1 lun ffffffff (ctl1)
- remove last remnants of never implemented multiple targets support;
- implement missing support for LUN mapping in this area.
Due to existing locking constraints LUN mapping code is practically
unlocked at this point. Hopefully it is not racy enough to live until
somebody get idea how to call sleeping fronend methods under lock also
taken by the same frontend in non-sleepable context. :(
At this point CTL has no known use case for device queue freezes.
Same time existing (considered to be broken) code was found to cause
modify-after-free issues.
Discussed with: ken
MFC after: 1 week
MAM is Medium Auxiliary Memory and is most commonly found as flash
chips on tapes.
This includes support for reading attributes and decoding most
known attributes, but does not yet include support for writing
attributes or reporting attributes in XML format.
libsbuf/Makefile:
Add subr_prf.c for the new sbuf_hexdump() function. This
function is essentially the same function.
libsbuf/Symbol.map:
Add a new shared library minor version, and include the
sbuf_hexdump() function.
libsbuf/Version.def:
Add version 1.4 of the libsbuf library.
libutil/hexdump.3:
Document sbuf_hexdump() alongside hexdump(3), since it is
essentially the same function.
camcontrol/Makefile:
Add attrib.c.
camcontrol/attrib.c:
Implementation of READ ATTRIBUTE support for camcontrol(8).
camcontrol/camcontrol.8:
Document the new 'camcontrol attrib' subcommand.
camcontrol/camcontrol.c:
Add the new 'camcontrol attrib' subcommand.
camcontrol/camcontrol.h:
Add a function prototype for scsiattrib().
share/man/man9/sbuf.9:
Document the existence of sbuf_hexdump() and point users to
the hexdump(3) man page for more details.
sys/cam/scsi/scsi_all.c:
Add a table of known attributes, text descriptions and
handler functions.
Add a new scsi_attrib_sbuf() function along with a number
of other related functions that help decode attributes.
scsi_attrib_ascii_sbuf() decodes ASCII format attributes.
scsi_attrib_int_sbuf() decodes binary format attributes, and
will pass them off to scsi_attrib_hexdump_sbuf() if they're
bigger than 8 bytes.
scsi_attrib_vendser_sbuf() decodes the vendor and drive
serial number attribute.
scsi_attrib_volcoh_sbuf() decodes the Volume Coherency
Information attribute that LTFS writes out.
sys/cam/scsi/scsi_all.h:
Add a number of attribute-related structure definitions and
other defines.
Add function prototypes for all of the functions added in
scsi_all.c.
sys/kern/subr_prf.c:
Add a new function, sbuf_hexdump(). This is the same as
the existing hexdump(9) function, except that it puts the
result in an sbuf.
This also changes subr_prf.c so that it can be compiled in
userland for includsion in libsbuf.
We should work to change this so that the kernel hexdump
implementation is a wrapper around sbuf_hexdump() with a
statically allocated sbuf with a drain. That will require
a drain function that goes to the kernel printf() buffer
that can take a non-NUL terminated string as input.
That is because an sbuf isn't NUL-terminated until it is
finished, and we don't want to finish it while we're still
using it.
We should also work to consolidate the userland hexdump and
kernel hexdump implemenatations, which are currently
separate. This would also mean making applications that
currently link in libutil link in libsbuf.
sys/sys/sbuf.h:
Add the prototype for sbuf_hexdump(), and add another copy
of the hexdump flag values if they aren't already defined.
Ideally the flags should be defined in one place but the
implemenation makes it difficult to do properly. (See
above.)
Sponsored by: Spectra Logic Corporation
MFC after: 1 week
referenced. I think that there does exist an unlikely edge case for a
memory leak, but only if a driver is incorrectly written and specifies no
valid range of targets to scan. That can be fixed in a follow-up commit.
Obtained from: Netflix, Inc.
AC_ADVINFO_CHANGED.
Without this change, newly inserted hard disks won't always have their
physical path device nodes created. The problem reproduces most readily
when attaching a large number of disks at once.
Differential Revision: https://reviews.freebsd.org/D2290
Reviewed by: mav, imp
MFC after: 2 weeks
Sponsored by: Spectra Logic
There are four places, all in cam_xpt.c, where ccbs are malloc'ed. Two of
these use M_ZERO, two don't. The two that don't meant that allocated ccbs
had trash in them making it hard to debug errors where they showed up. Due
to this, use M_ZERO all the time when allocating ccbs.
Submitted by: Scott Ferris <scott.ferris@isilon.com>
Sponsored by: EMC/Isilon Storage Division
Reviewed by: scottl, imp
Differential: https://reviews.freebsd.org/D2120
The only drives I have discovered so far that support medium type
reports are newer HP LTO (LTO-5 and LTO-6) drives. IBM drives
only support the density reports.
sys/cam/scsi/scsi_sa.h:
The number of possible density codes in the medium type
report is 9, not 8. This caused problems parsing all of
the medium type report after this point in the structure.
usr.bin/mt/mt.c:
Run the density codes returned in the medium type report
through denstostring(), just like the primary and secondary
density codes in the density report. This will print the
density code in hex, and give a text description if it
is available.
Thanks to Rudolf Cejka for doing extensive testing with HP LTO drives
and Bacula and discovering these problems.
Tested by: Rudolf Cejka <cejkar at fit.vutbr.cz>
Sponsored by: Spectra Logic
MFC after: 4 days
SCSI-2 devices.
Some older tape devices claim to be SCSI-2, but actually do support
long position information. (Long position information includes
the current file mark.) For example, the COMPAQ SuperDLT1.
So we now only disable the check on SCSI-1 and older devices.
sys/cam/scsi/scsi_sa.c:
In saregister(), only disable fetching long position
information on SCSI-1 and older drives. Update the
comment to explain why.
Confirmed by: dvl
Sponsored by: Spectra Logic
MFC after: 3 weeks
tracking.
It is important to subtract the residual from the requested
transfer size to see how much data was actually transferred. With
tape drives in particular, it is common to request more data than is
returned.
Also, add I/O latency tracking for CAM requests issued by
cam_periph_runccb().
If the caller supplies a struct devstat, and the I/O is a SCSI or
ATA I/O, we will track the elapsed time to provide I/O latency
statistics for the request.
sys/cam/scsi/cam_periph.c:
In cam_periph_runccb(), subtract the residual when reporting I/O
totals to devstat(9) for SCSI and ATA passthrough requests.
In cam_periph_runccb(), grab the I/O start time and supply
the start time to devstat_end_transaction() so that it can
calculate the elapsed I/O time.
Sponsored by: Spectra Logic
MFC after: 1 week
The primary focus of these changes is to modernize FreeBSD's
tape infrastructure so that we can take advantage of some of the
features of modern tape drives and allow support for LTFS.
Significant changes and new features include:
o sa(4) driver status and parameter information is now exported via an
XML structure. This will allow for changes and improvements later
on that will not break userland applications. The old MTIOCGET
status ioctl remains, so applications using the existing interface
will not break.
o 'mt status' now reports drive-reported tape position information
as well as the previously available calculated tape position
information. These numbers will be different at times, because
the drive-reported block numbers are relative to BOP (Beginning
of Partition), but the block numbers calculated previously via
sa(4) (and still provided) are relative to the last filemark.
Both numbers are now provided. 'mt status' now also shows the
drive INQUIRY information, serial number and any position flags
(BOP, EOT, etc.) provided with the tape position information.
'mt status -v' adds information on the maximum possible I/O size,
and the underlying values used to calculate it.
o The extra sa(4) /dev entries (/dev/saN.[0-3]) have been removed.
The extra devices were originally added as place holders for
density-specific device nodes. Some OSes (NetBSD, NetApp's OnTap
and Solaris) have had device nodes that, when you write to them,
will automatically select a given density for particular tape drives.
This is a convenient way of switching densities, but it was never
implemented in FreeBSD. Only the device nodes were there, and that
sometimes confused users.
For modern tape devices, the density is generally not selectable
(e.g. with LTO) or defaults to the highest availble density when
the tape is rewritten from BOT (e.g. TS11X0). So, for most users,
density selection won't be necessary. If they do need to select
the density, it is easy enough to use 'mt density' to change it.
o Protection information is now supported. This is either a
Reed-Solomon CRC or CRC32 that is included at the end of each block
read and written. On write, the tape drive verifies the CRC, and
on read, the tape drive provides a CRC for the userland application
to verify.
o New, extensible tape driver parameter get/set interface.
o Density reporting information. For drives that support it,
'mt getdensity' will show detailed information on what formats the
tape drive supports, and what formats the tape drive supports.
o Some mt(1) functionality moved into a new mt(3) library so that
external applications can reuse the code.
o The new mt(3) library includes helper routines to aid in parsing
the XML output of the sa(4) driver, and build a tree of driver
metadata.
o Support for the MTLOAD (load a tape in the drive) and MTWEOFI
(write filemark immediate) ioctls needed by IBM's LTFS
implementation.
o Improve device departure behavior for the sa(4) driver. The previous
implementation led to hangs when the device was open.
o This has been tested on the following types of drives:
IBM TS1150
IBM TS1140
IBM LTO-6
IBM LTO-5
HP LTO-2
Seagate DDS-4
Quantum DLT-4000
Exabyte 8505
Sony DDS-2
contrib/groff/tmac/doc-syms,
share/mk/bsd.libnames.mk,
lib/Makefile,
Add libmt.
lib/libmt/Makefile,
lib/libmt/mt.3,
lib/libmt/mtlib.c,
lib/libmt/mtlib.h,
New mt(3) library that contains functions moved from mt(1) and
new functions needed to interact with the updated sa(4) driver.
This includes XML parser helper functions that application writers
can use when writing code to query tape parameters.
rescue/rescue/Makefile:
Add -lmt to CRUNCH_LIBS.
src/share/man/man4/mtio.4
Clarify this man page a bit, and since it contains what is
essentially the mtio.h header file, add new ioctls and structure
definitions from mtio.h.
src/share/man/man4/sa.4
Update BUGS and maintainer section.
sys/cam/scsi/scsi_all.c,
sys/cam/scsi/scsi_all.h:
Add SCSI SECURITY PROTOCOL IN/OUT CDB definitions and CDB building
functions.
sys/cam/scsi/scsi_sa.c
sys/cam/scsi/scsi_sa.h
Many tape driver changes, largely outlined above.
Increase the sa(4) driver read/write timeout from 4 to 32
minutes. This is based on the recommended values for IBM LTO
5/6 drives. This may also avoid timeouts for other tape
hardware that can take a long time to do retries and error
recovery. Longer term, a better way to handle this is to ask
the drive for recommended timeout values using the REPORT
SUPPORTED OPCODES command. Modern IBM and Oracle tape drives
at least support that command, and it would allow for more
accurate timeout values.
Add XML status generation. This is done with a series of
macros to eliminate as much duplicate code as possible. The
new XML-based status values are reported through the new
MTIOCEXTGET ioctl.
Add XML driver parameter reporting, using the new MTIOCPARAMGET
ioctl.
Add a new driver parameter setting interface, using the new
MTIOCPARAMSET and MTIOCSETLIST ioctls.
Add a new MTIOCRBLIM ioctl to get block limits information.
Add CCB/CDB building routines scsi_locate_16, scsi_locate_10,
and scsi_read_position_10().
scsi_locate_10 implements the LOCATE command, as does the
existing scsi_set_position() command. It just supports
additional arguments and features. If/when we figure out a
good way to provide backward compatibility for older
applications using the old function API, we can just revamp
scsi_set_position(). The same goes for
scsi_read_position_10() and the existing scsi_read_position()
function.
Revamp sasetpos() to take the new mtlocate structure as an
argument. It now will use either scsi_locate_10() or
scsi_locate_16(), depending upon the arguments the user
supplies. As before, once we change position we don't have a
clear idea of what the current logical position of the tape
drive is.
For tape drives that support long form position data, we
read the current position and store that for later reporting
after changing the position. This should help applications
like Bacula speed tape access under FreeBSD once they are
modified to support the new ioctls.
Add a new quirk, SA_QUIRK_NO_LONG_POS, that is set for all
drives that report SCSI-2 or older, as well as drives that
report an Illegal Request type error for READ POSITION with
the long format. So we should automatically detect drives
that don't support the long form and stop asking for it after
an initial try.
Add a partition number to the sa(4) softc.
Improve device departure handling. The previous implementation
led to hangs when the device was open.
If an application had the sa(4) driver open, and attempted to
close it after it went away, the cam_periph_release() call in
saclose() would cause the periph to get destroyed because that
was the last reference to it. Because destroy_dev() was
called from the sa(4) driver's cleanup routine (sacleanup()),
and would block waiting for the close to happen, a deadlock
would result.
So instead of calling destroy_dev() from the cleanup routine,
call destroy_dev_sched_cb() from saoninvalidate() and wait for
the callback.
Acquire a reference for devfs in saregister(), and release it
in the new sadevgonecb() routine when all devfs devices for
the particular sa(4) driver instance are gone.
Add a new function, sasetupdev(), to centralize setting
per-instance devfs device parameters instead of repeating the
code in saregister().
Add an open count to the softc, so we know how many
peripheral driver references are a result of open
sessions.
Add the D_TRACKCLOSE flag to the cdevsw flags so
that we get a 1:1 mapping of open to close calls
instead of a N:1 mapping.
This should be a no-op for everything except the
control device, since we don't allow more than one
open on non-control devices.
However, since we do allow multiple opens on the
control device, the combination of the open count
and the D_TRACKCLOSE flag should result in an
accurate peripheral driver reference count, and an
accurate open count.
The accurate open count allows us to release all
peripheral driver references that are the result
of open contexts once we get the callback from devfs.
sys/sys/mtio.h:
Add a number of new mt(4) ioctls and the requisite data
structures. None of the existing interfaces been removed
or changed.
This includes definitions for the following new ioctls:
MTIOCRBLIM /* get block limits */
MTIOCEXTLOCATE /* seek to position */
MTIOCEXTGET /* get tape status */
MTIOCPARAMGET /* get tape params */
MTIOCPARAMSET /* set tape params */
MTIOCSETLIST /* set N params */
usr.bin/mt/Makefile:
mt(1) now depends on libmt, libsbuf and libbsdxml.
usr.bin/mt/mt.1:
Document new mt(1) features and subcommands.
usr.bin/mt/mt.c:
Implement support for mt(1) subcommands that need to
use getopt(3) for their arguments.
Implement a new 'mt status' command to replace the old
'mt status' command. The old status command has been
renamed 'ostatus'.
The new status function uses the MTIOCEXTGET ioctl, and
therefore parses the XML data to determine drive status.
The -x argument to 'mt status' allows the user to dump out
the raw XML reported by the kernel.
The new status display is mostly the same as the old status
display, except that it doesn't print the redundant density
mode information, and it does print the current partition
number and position flags.
Add a new command, 'mt locate', that will supersede the
old 'mt setspos' and 'mt sethpos' commands. 'mt locate'
implements all of the functionality of the MTIOCEXTLOCATE
ioctl, and allows the user to change the logical position
of the tape drive in a number of ways. (Partition,
block number, file number, set mark number, end of data.)
The immediate bit and the explicit address bits are
implemented, but not documented in the man page.
Add a new 'mt weofi' command to use the new MTWEOFI ioctl.
This allows the user to ask the drive to write a filemark
without waiting around for the operation to complete.
Add a new 'mt getdensity' command that gets the XML-based
tape drive density report from the sa(4) driver and displays
it. This uses the SCSI REPORT DENSITY SUPPORT command
to get comprehensive information from the tape drive about
what formats it is able to read and write.
Add a new 'mt protect' command that allows getting and setting
tape drive protection information. The protection information
is a CRC tacked on to the end of every read/write from and to
the tape drive.
Sponsored by: Spectra Logic
MFC after: 1 month
properly.
If there is garbage in the flags field, it can sometimes include a
set CDAI_FLAG_STORE flag, which may cause either an error or
perhaps result in overwriting the field that was intended to be
read.
sys/cam/cam_ccb.h:
Add a new flag to the XPT_DEV_ADVINFO CCB, CDAI_FLAG_NONE,
that callers can use to set the flags field when no store
is desired.
sys/cam/scsi/scsi_enc_ses.c:
In ses_setphyspath_callback(), explicitly set the
XPT_DEV_ADVINFO flags to CDAI_FLAG_NONE when fetching the
physical path information. Instead of ORing in the
CDAI_FLAG_STORE flag when storing the physical path, set
the flags field to CDAI_FLAG_STORE.
sys/cam/scsi/scsi_sa.c:
Set the XPT_DEV_ADVINFO flags field to CDAI_FLAG_NONE when
fetching extended inquiry information.
sys/cam/scsi/scsi_da.c:
When storing extended READ CAPACITY information, set the
XPT_DEV_ADVINFO flags field to CDAI_FLAG_STORE instead of
ORing it into a field that isn't initialized.
sys/dev/mpr/mpr_sas.c,
sys/dev/mps/mps_sas.c:
When fetching extended READ CAPACITY information, set the
XPT_DEV_ADVINFO flags field to CDAI_FLAG_NONE instead of
setting it to 0.
sbin/camcontrol/camcontrol.c:
When fetching a device ID, set the XPT_DEV_ADVINFO flags
field to CDAI_FLAG_NONE instead of 0.
sys/sys/param.h:
Bump __FreeBSD_version to 1100061 for the new XPT_DEV_ADVINFO
CCB flag, CDAI_FLAG_NONE.
Sponsored by: Spectra Logic
MFC after: 1 week
This change by 2-3 times improves performance of misaligned XCOPY and WUT
commands by avoiding unneeded read-modify-write cycles inside ZFS.
MFC after: 1 week
This change by 2-3 times improves performance of misaligned WRITE SAME
commands by avoiding unneeded read-modify-write cycles inside ZFS.
MFC after: 1 week
target iSCSI offload. Add mechanism to query maximum receive data segment
size supported by chosen hardware offload module, and use it in ctld(8)
to determine the value to advertise to the other side.
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
This VPD page is effectively an extension of the standard Inquiry
data page, and includes lots of additional bits.
This commit includes support for probing the page in the SCSI probe code,
and an additional request type for the XPT_DEV_ADVINFO CCB. CTL already
supports the Extended Inquiry page.
Support for querying this page in the sa(4) driver will come later.
sys/cam/scsi/scsi_xpt.c:
Probe the Extended Inquiry page, if the device supports it, and
return it in response to a XPT_DEV_ADVINFO CCB if it is requested.
sys/cam/scsi/cam_ccb.h:
Define a new advanced information CCB data type, CDAI_TYPE_EXT_INQ.
sys/cam/cam_xpt.c:
Free the extended inquiry data in a device when the device goes
away.
sys/cam/cam_xpt_internal.h:
Add an extended inquiry data pointer and length to struct cam_ed.
sys/sys/param.h
Bump __FreeBSD_version for the addition of the new
CDAI_TYPE_EXT_INQ advanced information type.
Sponsored by: Spectra Logic
MFC after: 1 week
While ctld(8) still does not allow multiple portal groups per target
to be configured, kernel should now be able to handle it.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.