reduce lock congestion and improve SMP scalability of the SCSI/ATA stack,
preparing the ground for the coming next GEOM direct dispatch support.
Replace big per-SIM locks with bunch of smaller ones:
- per-LUN locks to protect device and peripheral drivers state;
- per-target locks to protect list of LUNs on target;
- per-bus locks to protect reference counting;
- per-send queue locks to protect queue of CCBs to be sent;
- per-done queue locks to protect queue of completed CCBs;
- remaining per-SIM locks now protect only HBA driver internals.
While holding LUN lock it is allowed (while not recommended for performance
reasons) to take SIM lock. The opposite acquisition order is forbidden.
All the other locks are leaf locks, that can be taken anywhere, but should
not be cascaded. Many functions, such as: xpt_action(), xpt_done(),
xpt_async(), xpt_create_path(), etc. are no longer require (but allow) SIM
lock to be held.
To keep compatibility and solve cases where SIM lock can't be dropped, all
xpt_async() calls in addition to xpt_done() calls are queued to completion
threads for async processing in clean environment without SIM lock held.
Instead of single CAM SWI thread, used for commands completion processing
before, use multiple (depending on number of CPUs) threads. Load balanced
between them using "hash" of the device B:T:L address.
HBA drivers that can drop SIM lock during completion processing and have
sufficient number of completion threads to efficiently scale to multiple
CPUs can use new function xpt_done_direct() to avoid extra context switch.
Make ahci(4) driver to use this mechanism depending on hardware setup.
Sponsored by: iXsystems, Inc.
MFC after: 2 months
registered simultaneously. Due to topology unlock between the ID allocation
and the bus registration there is a chance that two buses may get the
same IDs. That is supposed reason of lock assertion panic in CAM during
initial bus scanning after new iscsid initiates two sessions same time.
Reported by: trasz
Approved by: re (glebus, marius)
MFC after: 2 weeks
CAM_EXTLUN_VALID is not erroneously set. Also add an XPORT_SRP
identifier to the known SCSI transports for the SCSI RDMA protocol, as
used, for example with Infiniband storage.
Reviewed by: scottl
Approved by: re (marius)
original, this hides the contents of cam_compat.h from ktrace/kdump/truss,
avoiding problems there. There are no user-servicable parts in there, so
no need for those tools to be groping around in there.
Approved by: re
- Remove the timeout_ch field. It's been deprecated since FreeBSD 7.0;
MPSAFE drivers should be managing their own timeout storage. The
remaining non-MPSAFE drivers have been modified to also manage their own
storage, and should be considered for updating to MPSAFE (or removal)
during the FreeBSD 10.x lifecycle.
- Add fields related to soft timeouts and quality of service, to be used
in upcoming work.
- Add room for more flags in the CCB header and path_inq structures.
- Begin support for extended 64-bit LUNs.
- Bump the CAM version number to 0x18, but add compat shims. Tested with
camcontrol and smartctl.
Reviewed by: nathanw, ken, kib
Approved by: re
Obtained from: Netflix
functional state. While CTL is much more superior target from all points,
there is no reason why this code should not work.
Tested with ahc(4) as target side HBA.
MFC after: 2 weeks
to 15 minutes, and 5 minutes for things like READ ELEMENT STATUS.
This is needed to account for the worst case scenarios on at least
some Spectra Logic tape libraries.
Sponsored by: Spectra Logic
MFC after: 3 days
notify (enable spinup) required", instead of doing the normal
retries, poll for a change in status.
We will poll every half second for a minute for the status to
change.
Hitachi drives (and likely other SAS drives) return that ASC/ASCQ
when they are waiting to spin up. What it means is that they are
waiting for the SAS expander to send them the SAS
NOTIFY (ENABLE SPINUP) primitive.
That primitive is the mechanism expanders/enclosures use to
sequence drive spinup to avoid overloading power supplies.
Sponsored by: Spectra Logic
MFC after: 3 days
configure sa(4) to request no I/O splitting by default.
For tape devices, the user needs to be able to clearly understand
what blocksize is actually being used when writing to a tape
device. The previous behavior of physio(9) was that it would split
up any I/O that was too large for the device, or too large to fit
into MAXPHYS. This means that if, for instance, the user wrote a
1MB block to a tape device, and MAXPHYS was 128KB, the 1MB write
would be split into 8 128K chunks. This would be done without
informing the user.
This has suboptimal effects, especially when trying to communicate
status to the user. In the event of an error writing to a tape
(e.g. physical end of tape) in the middle of a 1MB block that has
been split into 8 pieces, the user could have the first two 128K
pieces written successfully, the third returned with an error, and
the last 5 returned with 0 bytes written. If the user is using
a standard write(2) system call, all he will see is the ENOSPC
error. He won't have a clue how much actually got written. (With
a writev(2) system call, he should be able to determine how much
got written in addition to the error.)
The solution is to prevent physio(9) from splitting the I/O. The
new cdev flag, SI_NOSPLIT, tells physio that the driver does not
want I/O to be split beforehand.
Although the sa(4) driver now enables SI_NOSPLIT by default,
that can be disabled by two loader tunables for now. It will not
be configurable starting in FreeBSD 11.0. kern.cam.sa.allow_io_split
allows the user to configure I/O splitting for all sa(4) driver
instances. kern.cam.sa.%d.allow_io_split allows the user to
configure I/O splitting for a specific sa(4) instance.
There are also now three sa(4) driver sysctl variables that let the
users see some sa(4) driver values. kern.cam.sa.%d.allow_io_split
shows whether I/O splitting is turned on. kern.cam.sa.%d.maxio shows
the maximum I/O size allowed by kernel configuration parameters
(e.g. MAXPHYS, DFLTPHYS) and the capabilities of the controller.
kern.cam.sa.%d.cpi_maxio shows the maximum I/O size supported by
the controller.
Note that a better long term solution would be to implement support
for chaining buffers, so that that MAXPHYS is no longer a limiting
factor for I/O size to tape and disk devices. At that point, the
controller and the tape drive would become the limiting factors.
sys/conf.h: Add a new cdev flag, SI_NOSPLIT, that allows a
driver to tell physio not to split up I/O.
sys/param.h: Bump __FreeBSD_version to 1000049 for the addition
of the SI_NOSPLIT cdev flag.
kern_physio.c: If the SI_NOSPLIT flag is set on the cdev, return
any I/O that is larger than si_iosize_max or
MAXPHYS, has more than one segment, or would have
to be split because of misalignment with EFBIG.
(File too large).
In the event of an error, print a console message to
give the user a clue about what happened.
scsi_sa.c: Set the SI_NOSPLIT cdev flag on the devices created
for the sa(4) driver by default.
Add tunables to control whether we allow I/O splitting
in physio(9).
Explain in the comments that allowing I/O splitting
will be deprecated for the sa(4) driver in FreeBSD
11.0.
Add sysctl variables to display the maximum I/O
size we can do (which could be further limited by
read block limits) and the maximum I/O size that
the controller can do.
Limit our maximum I/O size (recorded in the cdev's
si_iosize_max) by MAXPHYS. This isn't strictly
necessary, because physio(9) will limit it to
MAXPHYS, but it will provide some clarity for the
application.
Record the controller's maximum I/O size reported
in the Path Inquiry CCB.
sa.4: Document the block size behavior, and explain that
the option of allowing physio(9) to split the I/O
will disappear in FreeBSD 11.0.
Sponsored by: Spectra Logic
We now pay attention to the maxio field in the XPT_PATH_INQ CCB,
and if it is set, propagate it up to physio via the si_iosize_max
field in the cdev structure.
We also now pay attention to the PIM_UNMAPPED capability bit in the
XPT_PATH_INQ CCB, and set the new SI_UNMAPPED cdev flag when the
underlying SIM supports unmapped I/O.
scsi_sa.c: Add unmapped I/O support and propagate the SIM's
maximum I/O size up.
Adjust scsi_tape_read_write() in the same way that
scsi_read_write() was changed to support unmapped
I/O. We overload the readop parameter with bits
that tell us whether it's an unmapped I/O, and we
need to set the CAM_DATA_BIO CCB flag. This change
should be backwards compatible in source and
binary forms.
MFC after: 1 week
Sponsored by: Spectra Logic
Change CCB queue resize logic to be able safely handle overallocations:
- (re)allocate queue space in power of 2 chunks with 64 elements minimum
and never shrink it; with only 4/8 bytes per element size is insignificant.
- automatically reallocate the queue to double size if it is overflowed.
- if queue reallocation failed, store extra CCBs in unsorted TAILQ,
fetching them back as soon as some queue element is freed.
To free space in CCB for TAILQ linking, change highpowerq from keeping
high-power CCBs to keeping devices frozen due to high-power CCBs.
This encloses all pieces of queue resize logic inside of cam_queue.[ch],
removing some not obvious duties from xpt_release_ccb().
While these operations are not really needed otherwise, at least for SCSI
they may cause extra errors if some other initiator holds write exclusive
reservation on the LUN (SYNCHRONIZE CACHE handled as "write" operation).
Add a PIM_NOSCAN flag to the CAM path inquiry CCB. This tells CAM
not to perform a rescan on a bus when it is registered.
We now use this flag in the mps(4) driver. Since it knows what
devices it has attached, it is more efficient for it to just issue
a target rescan on the targets that are attached.
Also, remove the private rescan thread from the mps(4) driver in
favor of the rescan thread already built into CAM. Without this
change, but with the change above, the MPS scanner could run before
or during CAM's initial setup, which would cause duplicate device
reprobes and announcements.
sys/param.h:
Bump __FreeBSD_version to 1000039 for the inclusion of the
PIM_RESCAN CAM path inquiry flag.
sys/cam/cam_ccb.h:
sys/cam/cam_xpt.c:
Added a PIM_NOSCAN flag. If a SIM sets this in the path
inquiry ccb, then CAM won't rescan the bus in
xpt_bus_regsister.
sys/dev/mps/mps_sas.c
For versions of FreeBSD that have the PIM_NOSCAN path
inquiry flag, don't freeze the sim queue during scanning,
because CAM won't be scanning this bus. Instead, hold
up the boot. Don't call mpssas_rescan_target in
mpssas_startup_decrement; it's redundant and I don't
know why it was in there.
Set PIM_NOSCAN in path inquiry CCBs.
Remove methods related to the internal rescan daemon.
Always use async events to trigger a probe for EEDP support.
In older versions of FreeBSD where AC_ADVINFO_CHANGED is
not available, use AC_FOUND_DEVICE and issue the
necessary READ CAPACITY manually.
Provide a path to xpt_register_async() so that we only
receive events for our own SCSI domain.
Improve error reporting in cases where setup for EEDP
detection fails.
sys/dev/mps/mps_sas.h:
Remove softc flags and data related to the scanner thread.
sys/dev/mps/mps_sas_lsi.c:
Unconditionally rescan the target whenever a device is added.
Sponsored by: Spectra Logic
MFC after: 1 week
"Logical unit not supported" errors. First initiates specific target rescan,
second -- destroys specific LUN. That allows to automatically detect changes
in list of device LUNs. This mechanism doesn't work when target is completely
idle, but probably that is all what can be done without active polling.
Reviewed by: ken
Sponsored by: iXsystems, Inc.
changers that don't support the DVCID and CURDATA bits that were
introduced in the SMC spec.
These changers will return an Illegal Request type error if the
bits are set. This causes "chio status" to fail.
The fix is two-fold. First, for changers that claim to be SCSI-2
or older, don't set the DVCID and CURDATA bits for READ ELEMENT
STATUS. For newer changers (SCSI-3 and newer), we default to
setting the new bits, but back off and try the READ ELEMENT STATUS
without the bits if we get an Illegal Request type error.
This has been tested on a Qualstar TLS-8211, which is a SCSI-2
changer that does not support the new bits, and a Spectra T-380,
which is a SCSI-3 changer that does support the new bits. In the
absence of a SCSI-3 changer that does not support the bits, I
tested that with some error injection code. (The SMC spec says
that support for CURDATA is mandatory, and DVCID is optional.)
scsi_ch.c: Add a new quirk, CH_Q_NO_DVCID that gets set for
SCSI-2 and older libraries, or newer libraries that
report errors when the DVCID/CURDATA bits are set.
In chgetelemstatus(), use the new quirk to
determine whether or not to set DVCID and CURDATA.
If we get an error with the bits set, back off and
try without the bits. Set the quirk flag if the
read element status succeeds without the bits set.
Increase the READ ELEMENT STATUS timeout to 60
seconds after testing with a Spectra T-380. The
previous value was 10 seconds, and too short for
the T-380. This may be decreased later after
some additional testing and investigation.
Tested by: Andre Albsmeier <Andre.Albsmeier@siemens.com>
Sponsored by: Spectra Logic
MFC after: 3 days
Ensure that d_delmaxsize is always set, removing init to 0 which could cause
future issues if use cases change.
Allow kern.cam.da.X.delete_max (which maps to d_delmaxsize) to be increased
up to the calculated max after being reduced.
MFC after: 1 day
X-MFC-With: r249940
needed for the last 10 years. Far too much of the internal API is
exposed, and every small adjustment causes applications to stop working.
To kick this off, bump the API version to 0x17 as should have been done
with r246713, but add shims to compensate. Thanks to the shims, there
should be no visible change in application behavior.
I have plans to do a significant overhaul of the API to harnen it for
the future, but until then, I welcome others to add shims for older
versions of the API.
Obtained from: Netflix
SPC-4 specification states that serial number may be property of device,
but not a specific logical unit. People reported about FC storages using
serial number in that way, making it unusable for purposes of LUN multipath
detection. SPC-4 states that designators associated with logical unit from
the VPD page 83h "Device Identification" should be used for that purpose.
Report first of them in the new attribute in such preference order: NAA,
EUI-64, T10 and SCSI name string.
While there, make GEOM DISK properly report GEOM::ident in XML output also
using d_getattr() method, if available. This fixes serial numbers reporting
for SCSI disks in `geom disk list` output and confxml.
Discussed with: gibbs, ken
Sponsored by: iXsystems, Inc.
MFC after: 2 weeks
While GEOM in general has provider opened while sending BIO_GETATTR,
GEOM DISK does not really need to open disk to read medium-unrelated
attributes for own use.
Proposed by: ken
Re-ordered SSD quirks alphabetically so they are easier to maintain.
Removed my email and PR reference from comments on each quirk.
Added quirks for more SSDs:
* Crucial M4
* Corsair Force GT
* Intel 520 Series
* Kingston E100 Series
* Samsung 830 Series
Reviewed by: pjd (mentor)
Approved by: pjd (mentor)
MFC after: 1 week
This prevents users from selecting a delete method which may cause
corruption e.g. MPS WS16 on pre P14 firmware.
Reviewed by: pjd (mentor)
Approved by: pjd (mentor)
MFC after: 2 days
With "cached read" HDD testing and multiple ports busy on a SATA
host controller, 3726/3826 PMP will very rarely drop a deferred
R_OK that was intended for the host. Symptom will be all 5 drives
under test will timeout, get reset, and recover.
Submitted by: Rich Futyma <rich.futyma@sanmina.com>
MFC after: 2 weeks
- remove DA_FLAG_SAW_MEDIA flag, almost opposite to DA_FLAG_PACK_INVALID,
using the last instead.
- allow opening device with no media present, reporting zero media size
and non-zero sector size, as geom/notes suggests. That allow to read
device attributes and potentially do other things, not related to media.
to query ATA functionality via ATA Pass-Through (16) as this page is defined
as "must" for SATL devices, hence indicating that the device is at least
likely to support Pass-Through (16).
This eliminates errors produced by CTL when ATA Pass-Through (16) fails.
Switch ATA probe daerror call to SF_NO_PRINT to avoid errors printing out
for devices which return invalid errors.
Output details about supported and choosen delete method when verbose booted.
Reviewed by: mav
Approved by: pjd (mentor)
MFC after: 1 week
Ensure that delete_available is reset so re-probes after a media change,
to one with different delete characteristics, will result in the correct
methods being flagged as available.
Make all ccb state changes use a consistent flow:
* free()
* xpt_release_ccb()
* softc->state = <new state>
* xpt_schedule()
Reviewed by: mav
Approved by: pjd (mentor)
MFC after: 1 week
Remove ADA_FLAG_PACK_INVALID flag. Since ATA disks have no concept of media
change it only duplicates CAM_PERIPH_INVALID flag, so we can use last one.
Slightly cleanup DA_FLAG_PACK_INVALID use.
Give periph validity flag own periph reference. That slightly simplifies
the release logic and covers hypothetical case if lock is dropped inside
the periph_oninval() method.
requests.
sys/geom/geom_disk.h:
- Added d_delmaxsize which represents the maximum size of individual
device delete requests in bytes. This can be used by devices to
inform geom of their size limitations regarding delete operations
which are generally different from the read / write limits as data
is not usually transferred from the host to physical device.
sys/geom/geom_disk.c:
- Use new d_delmaxsize to calculate the size of chunks passed through to
the underlying strategy during deletes instead of using read / write
optimised values. This defaults to d_maxsize if unset (0).
- Moved d_maxsize default up so it can be used to default d_delmaxsize
sys/cam/ata/ata_da.c:
- Added d_delmaxsize calculations for TRIM and CFA
sys/cam/scsi/scsi_da.c:
- Added re-calculation of d_delmaxsize whenever delete_method is set.
- Added kern.cam.da.X.delete_max sysctl which allows the max size for
delete requests to be limited. This is useful in preventing timeouts
on devices who's delete methods are slow. It should be noted that
this limit is reset then the device delete method is changed and
that it can only be lowered not increased from the device max.
Reviewed by: mav
Approved by: pjd (mentor)
maximum sizes for said methods, which are used when processing BIO_DELETE
requests. This includes updating UNMAP support discovery to be based on
SBC-3 T10/1799-D Revision 31 specification.
Added ATA TRIM support to cam scsi devices via ATA Pass-Through(16)
sys/cam/scsi/scsi_da.c:
- Added ATA Data Set Management TRIM support via ATA Pass-Through(16)
as a delete_method
- Added four new probe states used to identity available methods and their
limits for the processing of BIO_DELETE commands via both UNMAP and the
new ATA TRIM commands.
- Renamed Probe states to better indicate their use
- Added delete method descriptions used when informing user of issues.
- Added automatic calculation of the optimum delete mode based on which
method presents the largest maximum request size as this is most likely
to result in the best performance.
- Added WRITE SAME max block limits
- Updated UNMAP range generation to mirror that used by ATA TRIM, this
optimises the generation of ranges and fixes a potential overflow
issue in the count when combining multiple BIO_DELETE requests
- Added output of warnings about short deletes. This should only ever
be triggered on devices that fail to correctly advertise their supported
delete modes / max sizes.
- Fixed WS16 requests being incorrectly limited to 65535 in length.
Reviewed by: mav
Approved by: pjd (mentor)
MFC after: 2 weeks
so its available for use in generic scsi code.
This is a pre-requirement for using VPD queries to determine available SCSI
delete methods within scsi_da.
Reviewed by: mav
Approved by: pjd (mentor)
MFC after: 2 weeks
commands to an ATA device attached via a SCSI control.
sys/cam/scsi/scsi_all.c:
- Added scsi_ata_identify, scsi_ata_trim
Which use ATA Pass-Through to send commands to the attached disk.
sys/cam/scsi/scsi_all.h:
- Added defines for all missing ATA Pass-Through commands values.
- Added scsi_ata_identify, scsi_ata_trim methods used in ATA TRIM
support.
- Added scsi_vpd_logical_block_prov structure used when querying for
the supported sizes UNMAP commands.
- Added scsi_vpd_block_limits structure used when querying for the
supported sizes of the UNMAP command.
Reviewed by: mav
Approved by: pjd (mentor)
MFC after: 2 weeks
This allows users who boot without loader to adjust their environments
around slightly buggy or slow hardware.
PR: kern/161809
Submitted by: rozhuk.im@gmail.com
MFC after: 2 weeks
This allows mapping a tape drive in a changer (as reported by
'chio status') to a sa(4) driver instance by comparing the
serial numbers.
The designators can be ASCII (which is printed out directly), binary
(which is printed in hex format) or UTF-8, which is printed in either
native UTF-8 format if the terminal can support it, or in %XX notation
for non-ASCII characters. Thanks to Hiroki Sato <hrs@> for the
explaining UTF-8 printing and example UTF-8 printing code.
chio.h: Modify the changer_element_status structure to add new
fields and definitions from the SMC3r16 spec.
Rename the original CHIOGSTATUS ioctl to OCHIOGTATUS and
define a new CHIOGSTATUS ioctl.
Clean up some tab/space issues.
chio.c: For the 'status' subcommand, print the designator field
if it is supplied by a device.
scsi_ch.h: Add new flags for DVCID and CURDATA to the READ
ELEMENT STATUS command structure.
Add a read_element_status_device_id structure
for the data fields in the new standard. Add new
unions, dt_or_obsolete and voltage_devid, to hold
and address data from either SCSI-2 or newer devices.
scsi_ch.c: Implement support for fetching device IDs with READ
ELEMENT STATUS data.
Add new arguments to scsi_read_element_status() to
allow the user to request the DVCID and CURDATA bits.
This isn't compiled into libcam (it's only an internal
kernel interface), so we don't need any special
handling for the API change.
If the user issues the new CHIOGSTATUS ioctl, copy all of
the available element status data out. If he issues the
OCHIOGSTATUS ioctl, we don't copy the new fields in the
structure.
Fix a bug in chopen() that would result in the peripheral
never getting unheld if chgetparams() failed.
Sponsored by: Spectra Logic
Submitted by: Po-Li Soong
MFC After: 1 week
Stop abusing xpt_periph in random plases that really have no periph related
to CCB, for example, bus scanning. NULL value is fine in such cases and it
is correctly logged in debug messages as "noperiph". If at some point we
need some real XPT periphs (alike to pmpX now), quite likely they will be
per-bus, and not a single global instance as xpt_periph now.
r248917, r248918, r248978, r249001, r249014, r249030:
Remove multilevel freezing mechanism, implemented to handle specifics of
the ATA/SATA error recovery, when post-reset recovery commands should be
allocated when queues are already full of payload requests. Instead of
removing frozen CCBs with specified range of priorities from the queue
to provide free openings, use simple hack, allowing explicit CCBs over-
allocation for requests with priority higher (numerically lower) then
CAM_PRIORITY_OOB threshold.
Simplify CCB allocation logic by removing SIM-level allocation queue.
After that SIM-level queue manages only CCBs execution, while allocation
logic is localized within each single device.
Suggested by: gibbs
and kern.cam.ctl.disable tunable; those were introduced as a workaround
to make it possible to boot GENERIC on low memory machines.
With ctl(4) being built as a module and automatically loaded by ctladm(8),
this makes CTL work out of the box.
Reviewed by: ken
Sponsored by: FreeBSD Foundation
Some failing disks tend to return vendor-specific ASC/ASCQ codes with
NOT READY sense key. It caused extremely long recovery attempts, repeating
these 120 TURs (it takes at least 1 minute) for every I/O request.
Instead of that use default error handling, doing just few retries.
Reviewed by: ken, gibbs
MFC after: 1 month
references to it.
This is the functional equivalent to change r237518, which added this
functionality to the cd(4) and da(4) drivers.
This fix prevents a panic caused by GEOM calling adaopen() while the device
is going away. We now keep the device around until GEOM has finished
cleaning up its state.
ata_da.c: In adaregister(), add a d_gone callback to the GEOM disk
structure registered for the ada driver. Increment the
peripheral reference count for GEOM.
Add a new callback, adadiskgonecb(), that GEOM calls when
it is done with its resources. This callback releases the
reference acquired in adaregister().
Submitted by: Po-Li Soong
Sponsored by: Spectra Logic
MFC After: 5 days
the LUN was never freed.
ctl.c: Adjust ctl_alloc_lun() to make sure we don't clear the
CTL_LUN_MALLOCED flag.
Reported by: Sreenivasa Honnur <shonnur@chelsio.com>
Sponsored by: Spectra Logic
MFC after: 3 days
option left but actually consumed by ada(4), so move it to opt_ada.h
and get rid of opt_ata.h.
- Fix stand-alone build of atacore(4) by adding opt_cam.h.
- Use __FBSDID.
- Use DEVMETHOD_END.
- Use NULL instead of 0 for pointers.
r249017:
Some cosmetic things:
- Unify device to target insertion inside xpt_alloc_device() instead of
duplicating it three times.
- Remove extra checks for empty lists of devices and targets on release
since zero refcount check also implies it.
- Reformat code to reduce indentation.
r249103:
- Add lock assertions to every point where reference counters are modified.
- When reference counters are reaching zero, add assertions that there are
no children items left.
- Add a bit more locking to the xptpdperiphtraverse().
Move CAM_DEBUG_CDB messages from the point of queuing to the point of
sending to SIM. That allows to inspect real requests execution order,
respecting priorities, freezing, etc.
MFC after: 2 weeks
most kernels before FreeBSD 9.0. Remove such modules and respective kernel
options: atadisk, ataraid, atapicd, atapifd, atapist, atapicam. Remove the
atacontrol utility and some man pages. Remove useless now options ATA_CAM.
No objections: current@, stable@
MFC after: never
copied in from userspace. This fixes instant panic when creating CTL LUN
on sparc64. Not a security problem, since the API is root-only.
Reviewed by: ken
Sponsored by: FreeBSD Foundation
sys/cam/scsi/scsi_all.c:
- Added scsi_ata_pass_16 method
Which use ATA Pass-Through to send commands to the attached disk.
sys/cam/scsi/scsi_all.h:
- Added defines for all missing ATA Pass-Through commands values.
- Added scsi_ata_pass_16 method.
- Fixed a comment typo while I'm here
Reviewed by: mav
Approved by: pjd (mentor)
MFC after: 2 weeks
CAM. This can significantly improve performance particularly for SSDs
which don't suffer from seek latencies.
The sysctl / tunable kern.cam.sort_io_queues provides the systems default
setting where:-
0 = queued BIOs are NOT sorted
1 = queued BIOs are sorted (default)
Each device gets its own sysctl kern.cam.<type>.<id>.sort_io_queue
Valid values are:-
-1 = use system default (default)
0 = queued BIOs are NOT sorted
1 = queued BIOs are sorted
Note: Additional patch will look to add automatic use of none sorted queues
for none rotating media e.g. SSD's
Reviewed by: scottl
Approved by: pjd (mentor)
MFC after: 2 weeks
but execute the commands in regular way. There is no any reason to cook CPU
while the system is still fully operational. After this change polling in
CAM is used only for kernel dumping.
driver's periphs, acquiring and releaseing periph references while doing it.
Use it to iterate over the lists of ada and da periphs when flushing caches
and putting devices to sleep on shutdown and suspend. Previous code could
panic in theory if some device disappear in the middle of the process.
Before this change they were just leaked. Fortunately USB sticks now use
only one CCB, and so leak was only 2KB per detach, while other bigger SIMs
with much more allocated CCBs are rarely detached.
MFC after: 2 weeks
for the r248519:
For the cam-attached HBAs, allow the driver to specify that it accepts
the unmapped bio by the PIM_UNMAPPED flag. The CAM passes the
CAM_DATA_BIO data transfer type request for the unmapped bio, and the
driver could use the bus_dmamap_load_ccb() as a helper to
transparently handle the ccb.
Sponsored by: The FreeBSD Foundation
Reviewed by: scottl
Tested by: pho, scottl
The vnode-backed md(4) has to map the unmapped bio because VOP_READ()
and VOP_WRITE() interfaces do not allow to pass unmapped requests to
the filesystem. Vnode-backed md(4) uses pbufs instead of relying on
the bio_transient_map, to avoid usual md deadlock.
Sponsored by: The FreeBSD Foundation
Tested by: pho, scottl
tunable by default.
This will allow GENERIC configurations to boot on small memory boxes, but
not require end users who want to use CTL to recompile their kernel. They
can simply set kern.cam.ctl.disable=0 in loader.conf.
The eventual solution to the memory usage problem is to change the way
CTL allocates memory to be more configurable, but this should fix things
for small memory situations in the mean time.
UPDATING: Explain the change in the CTL configuration, and
how users can enable CTL if they would like to use
it.
sys/conf/options: Add a new option, CTL_DISABLE, that prevents CTL
from initializing.
ctl.c: If CTL_DISABLE is turned on, don't initialize.
i386/conf/GENERIC,
amd64/conf/GENERIC: Re-enable device ctl, and add the CTL_DISABLE
option.
PREVENT ALLOW MEDIUM REMOVAL commands return errors on these devices
without returning sense data. In some cases unrelated following commands
start to return errors too, that makes device to be dropped by CAM.
every architecture's busdma_machdep.c. It is done by unifying the
bus_dmamap_load_buffer() routines so that they may be called from MI
code. The MD busdma is then given a chance to do any final processing
in the complete() callback.
The cam changes unify the bus_dmamap_load* handling in cam drivers.
The arm and mips implementations are updated to track virtual
addresses for sync(). Previously this was done in a type specific
way. Now it is done in a generic way by recording the list of
virtuals in the map.
Submitted by: jeff (sponsored by EMC/Isilon)
Reviewed by: kan (previous version), scottl,
mjacob (isp(4), no objections for target mode changes)
Discussed with: ian (arm changes)
Tested by: marius (sparc64), mips (jmallet), isci(4) on x86 (jharris),
amd64 (Fabian Keil <freebsd-listen@fabiankeil.de>)
Make umass return an error code if SCSI sense retrieval request
has failed. Make sure scsi_error_action honors SF_NO_RETRY and
SF_NO_RECOVERY in all cases, even if it cannot parse sense bytes.
Reviewed by: hselasky (umass), scottl (cam)
to avoid sending extra READ CAPACITY requests by dastart(). Schedule periph
again on reprobe completion, or otherwise it may stuck indefinitely long.
This should fix USB explore thread hanging on device unplug, waiting for
periph destruction.
Reported by: hselasky
and da_default_timeout where their current hardcoded values matched the current
default value for said tunables.
PR: kern/169976
Reviewed by: pjd (mentor)
Approved by: mav
DISKFLAG_CANDELETE. While this change makes this layer consistent
other layers such as UFS and ZFS BIO_DELETE support may not notice
any change made manually via these device sysctls until the device
is reopened via a mount.
Also corrected var order in dadeletemethodsysctl
PR: kern/169801
Reviewed by: pjd (mentor)
Approved by: mav
MFC after: 2 weeks
Previously CTL would leave individual LUNs enabled in the target
driver, whether or not the port as a whole was enabled. It would
also leave the wildcard LUN enabled indefinitely.
This change means that CTL will enable and disable any active LUNs,
as well as the wildcard LUN, when enabling and disabling a port.
Also, fix a bug that could crop up due to an uninitialized CCB
type.
ctl.c: Before calling ctl_frontend_online(), run through
the LUN list and enable all active LUNs.
After calling ctl_frontend_offline(), run through
the LUN list and disble all active LUNs.
scsi_ctl.c: Before bringing a port online, allocate the
wildcard peripheral for that bus. And after taking
a port offline, invalidate the wildcard peripheral
for that bus.
Make sure that we hold the SIM lock around all
calls to xpt_action() and other transport layer
interfaces that require it.
Use CAM_SIM_{LOCK|UNLOCK} consistently to acquire
and release the SIM lock.
Update a number of outdated comments. Some of
these should have been fixed long ago.
Actually do LUN disbables now. The newer drivers
in the tree work correctly for this as far as I
know.
Initialize the CCB type to CTLFE_CCB_DEFAULT to
avoid a panic due to uninitialized memory.
Submitted by: Chuck Tuffli (partially)
MFC after: 1 week
ctl_frontend_cam_sim.c: Coalesce cfcs_online() and cfcs_offline()
into a single function since these were
identical except for one line.
Make sure we hold the SIM lock around path
creation, and calling xpt_rescan().
scsi_ctl.c: In ctlfe_onoffline(), make sure we hold the
SIM lock around path creation and free
calls, as well as xpt_action().
In ctlfe_lun_enable(), hold the SIM lock
around path and peripheral operations that
require it.
Sponsored by: Spectra Logic Corporation
MFC after: 1 week
while doing a copyout. That can cause a panic, because copyout
can trigger VM faults, and we can't handle VM faults while holding
a mutex.
The solution here is to malloc a separate buffer to hold the OOA
queue entries, so that we don't risk a VM fault while filling up
the buffer and we don't have to drop the lock. The other solution
would be to wire the user's memory while filling their buffer with
copyout, but that would have been a little more complex.
Also fix a debugging parenthesis issue in ctl_abort_task() pointed
out by Chuck Tuffli.
Sponsored by: Spectra Logic Corporation
MFC after: 1 week
drivers.
The bug occurrs when a userland process has the driver instance
open and the underlying device goes away. We get the devfs
callback that the device node has been destroyed, but not all of
the closes necessary to fully decrement the reference count on the
CAM peripheral.
The reason is that once devfs calls back and says the device has
been destroyed, it is moved off to deadfs, and devfs guarantees
that there will be no more open or close calls. So the solution
is to keep track of how many outstanding open calls there are on
the device, and just release that many references when we get the
callback from devfs.
scsi_pass.c,
scsi_enc.c,
scsi_enc_internal.h: Add an open count to the softc in these
drivers. Increment it on open and
decrement it on close.
When we get a devfs callback to say that
the device node has gone away, decrement
the peripheral reference count by the
number of still outstanding opens.
Make sure we don't access the peripheral
with cam_periph_unlock() after what might
be the final call to
cam_periph_release_locked(). The
peripheral might have been freed, and we
will be dereferencing freed memory.
scsi_ch.c,
scsi_sg.c: For the ch(4) and sg(4) drivers, add the
same changes described above, and in
addition, fix another bug that was
previously fixed in the pass(4) and enc(4)
drivers.
These drivers were calling destroy_dev()
from their cleanup routine, but that could
cause a deadlock because the cleanup
routine could be indirectly called from
the driver's close routine. This would
cause a deadlock, because the device node
is being held open by the active close
call, and can't be destroyed.
Sponsored by: Spectra Logic Corporation
MFC after: 1 week
The problem was a race condition between the EDT traversal used by
things like 'camcontrol devlist', and CAM peripheral driver
removal.
The EDT traversal code holds the CAM topology lock, and wants
to show devices that have been invalidated. It acquires a
reference to the peripheral to make sure the peripheral it is
examining doesn't go away.
However, because the peripheral removal code in camperiphfree()
drops the CAM topology lock to call the peripheral's destructor
routine, we can run into a situation where the EDT traversal
increments the peripheral reference count after free process is
already in progress. At that point, the reference count is
ignored, because it was 0 when we started the process.
Fix this race by setting a flag, CAM_PERIPH_FREE, that I previously
added and checked in xptperiphtraverse() and xptpdperiphtravsere(),
but failed to use. If the EDT traversal code sees that flag,
it will know that the peripheral free process has already started,
and that it should not access that peripheral.
Also, fix an inconsistency in the locking between
xptpdperiphtraverse() and xptperiphtraverse(). They now both
hold the CAM topology lock while calling the peripheral traversal
function.
cam_xpt.c: Change xptperiphtraverse() to hold the CAM topology
lock across calls to the traversal function.
Take out the comment in xptpdperiphtraverse() that
referenced the locking inconsistency.
cam_periph.c: Set the CAM_PERIPH_FREE flag when we are in the
process of freeing a peripheral driver.
Sponsored by: Spectra Logic Corporation
MFC after: 1 week
The device reports support for SATA Asynchronous Notification in its
IDENTIFY data, but returns error on attempt to enable that feature.
Make SATA XPT of CAM only report these errors, but not fail the device.
MFC after: 1 week
Element Descriptor page if it is not supported. This removes one error
message from verbose logs during boot on systems with some enclosures.
Sponsored by: iXsystems, Inc.
safe in some cases to reduce CCB priority after it was scheduled with high
priority. This fixes reproducible deadlock when command sent through the
pass interface while ATA XPT recovers from command timeout.
Instead of that enforce priority at passioctl(). libcam provides no obvious
interface to specify CCB priority and so much (all?) code specifies zero
(highest) priority. This change limits pass CCBs priority to NORMAL run
level, allowing XPT to complete bus and device recovery after reset before
running any payload.
In particular, do not lock Giant conditionally when calling into the
filesystem module, remove the VFS_LOCK_GIANT() and related
macros. Stop handling buffers belonging to non-mpsafe filesystems.
The VFS_VERSION is bumped to indicate the interface change which does
not result in the interface signatures changes.
Conducted and reviewed by: attilio
Tested by: pho
System time is set later on boot process then initial bus scan by CAM.
Until that moment microtime() is equal to microuptime(), and if system
boots quickly, the value can be close to zero. That causes settle time
waiting even for buses that don't use reset during probe.
On my test system this reduces boot time by 1 second if USB enabled, or
by 4 seconds if USB disabled. CAM waited for ctl2cam0 bus "settle".
- Extend the lock to cover xpt_path_release() for the new path.
- While xpt_action() is called while holding right SIM lock for the new
bus, the old path release may require different SIM lock. So we have
to temporary drop the new lock and get the old one.
without holding SIM lock. It really doesn't need that lock, but adding it
removes that specific exception, allowing to assert locking there later.
Submitted by: ken@ (earlier version)
It is required to store extra recovery requests in case of bus resets.
On ATA/SATA this fixes assertion panics on HEAD with INVARIANTS enabled or
possible memory corruptions otherwise if timeout/reset happens when device
CCB queue is already full.
Reported by: gibbs@
MFC after: 1 week
returns zero while request status is not CAM_REQ_CMP. That could cause
partial device attach or other unexpected results.
Found by: Clang Static Analyzer
drivers:
- Remove scsi_low_pisa.*, they were unused.
- Remove <compat/netbsd/physio_proc.h> and calls to the stubs in that
header. They were empty nops.
- Retire sl_xname and use device_get_nameunit() and device_printf() with
the underlying device_t instead.
- Remove unused {ct,ncv,nsp,stg}print() functions.
- Remove empty SOFT_INTR_REQUIRED() macro and the unused sl_irq member.
NetBSD/pc98 was never merged into the main NetBSD tree and is no longer
developed. Adding locking to these drivers would have made the compat
shims hard to impossible to maintain, so remove the shims to ease
future changes.
These changes were verified by md5. Some additional shims can be removed
that do affect the compiled results that I will probably do in another
round.
Approved by: nyan (tentatively)