Stop abusing xpt_periph in random plases that really have no periph related
to CCB, for example, bus scanning. NULL value is fine in such cases and it
is correctly logged in debug messages as "noperiph". If at some point we
need some real XPT periphs (alike to pmpX now), quite likely they will be
per-bus, and not a single global instance as xpt_periph now.
r248917, r248918, r248978, r249001, r249014, r249030:
Remove multilevel freezing mechanism, implemented to handle specifics of
the ATA/SATA error recovery, when post-reset recovery commands should be
allocated when queues are already full of payload requests. Instead of
removing frozen CCBs with specified range of priorities from the queue
to provide free openings, use simple hack, allowing explicit CCBs over-
allocation for requests with priority higher (numerically lower) then
CAM_PRIORITY_OOB threshold.
Simplify CCB allocation logic by removing SIM-level allocation queue.
After that SIM-level queue manages only CCBs execution, while allocation
logic is localized within each single device.
Suggested by: gibbs
and kern.cam.ctl.disable tunable; those were introduced as a workaround
to make it possible to boot GENERIC on low memory machines.
With ctl(4) being built as a module and automatically loaded by ctladm(8),
this makes CTL work out of the box.
Reviewed by: ken
Sponsored by: FreeBSD Foundation
Some failing disks tend to return vendor-specific ASC/ASCQ codes with
NOT READY sense key. It caused extremely long recovery attempts, repeating
these 120 TURs (it takes at least 1 minute) for every I/O request.
Instead of that use default error handling, doing just few retries.
Reviewed by: ken, gibbs
MFC after: 1 month
references to it.
This is the functional equivalent to change r237518, which added this
functionality to the cd(4) and da(4) drivers.
This fix prevents a panic caused by GEOM calling adaopen() while the device
is going away. We now keep the device around until GEOM has finished
cleaning up its state.
ata_da.c: In adaregister(), add a d_gone callback to the GEOM disk
structure registered for the ada driver. Increment the
peripheral reference count for GEOM.
Add a new callback, adadiskgonecb(), that GEOM calls when
it is done with its resources. This callback releases the
reference acquired in adaregister().
Submitted by: Po-Li Soong
Sponsored by: Spectra Logic
MFC After: 5 days
the LUN was never freed.
ctl.c: Adjust ctl_alloc_lun() to make sure we don't clear the
CTL_LUN_MALLOCED flag.
Reported by: Sreenivasa Honnur <shonnur@chelsio.com>
Sponsored by: Spectra Logic
MFC after: 3 days
option left but actually consumed by ada(4), so move it to opt_ada.h
and get rid of opt_ata.h.
- Fix stand-alone build of atacore(4) by adding opt_cam.h.
- Use __FBSDID.
- Use DEVMETHOD_END.
- Use NULL instead of 0 for pointers.
r249017:
Some cosmetic things:
- Unify device to target insertion inside xpt_alloc_device() instead of
duplicating it three times.
- Remove extra checks for empty lists of devices and targets on release
since zero refcount check also implies it.
- Reformat code to reduce indentation.
r249103:
- Add lock assertions to every point where reference counters are modified.
- When reference counters are reaching zero, add assertions that there are
no children items left.
- Add a bit more locking to the xptpdperiphtraverse().
Move CAM_DEBUG_CDB messages from the point of queuing to the point of
sending to SIM. That allows to inspect real requests execution order,
respecting priorities, freezing, etc.
MFC after: 2 weeks
most kernels before FreeBSD 9.0. Remove such modules and respective kernel
options: atadisk, ataraid, atapicd, atapifd, atapist, atapicam. Remove the
atacontrol utility and some man pages. Remove useless now options ATA_CAM.
No objections: current@, stable@
MFC after: never
copied in from userspace. This fixes instant panic when creating CTL LUN
on sparc64. Not a security problem, since the API is root-only.
Reviewed by: ken
Sponsored by: FreeBSD Foundation
sys/cam/scsi/scsi_all.c:
- Added scsi_ata_pass_16 method
Which use ATA Pass-Through to send commands to the attached disk.
sys/cam/scsi/scsi_all.h:
- Added defines for all missing ATA Pass-Through commands values.
- Added scsi_ata_pass_16 method.
- Fixed a comment typo while I'm here
Reviewed by: mav
Approved by: pjd (mentor)
MFC after: 2 weeks
CAM. This can significantly improve performance particularly for SSDs
which don't suffer from seek latencies.
The sysctl / tunable kern.cam.sort_io_queues provides the systems default
setting where:-
0 = queued BIOs are NOT sorted
1 = queued BIOs are sorted (default)
Each device gets its own sysctl kern.cam.<type>.<id>.sort_io_queue
Valid values are:-
-1 = use system default (default)
0 = queued BIOs are NOT sorted
1 = queued BIOs are sorted
Note: Additional patch will look to add automatic use of none sorted queues
for none rotating media e.g. SSD's
Reviewed by: scottl
Approved by: pjd (mentor)
MFC after: 2 weeks
but execute the commands in regular way. There is no any reason to cook CPU
while the system is still fully operational. After this change polling in
CAM is used only for kernel dumping.
driver's periphs, acquiring and releaseing periph references while doing it.
Use it to iterate over the lists of ada and da periphs when flushing caches
and putting devices to sleep on shutdown and suspend. Previous code could
panic in theory if some device disappear in the middle of the process.
Before this change they were just leaked. Fortunately USB sticks now use
only one CCB, and so leak was only 2KB per detach, while other bigger SIMs
with much more allocated CCBs are rarely detached.
MFC after: 2 weeks
for the r248519:
For the cam-attached HBAs, allow the driver to specify that it accepts
the unmapped bio by the PIM_UNMAPPED flag. The CAM passes the
CAM_DATA_BIO data transfer type request for the unmapped bio, and the
driver could use the bus_dmamap_load_ccb() as a helper to
transparently handle the ccb.
Sponsored by: The FreeBSD Foundation
Reviewed by: scottl
Tested by: pho, scottl
The vnode-backed md(4) has to map the unmapped bio because VOP_READ()
and VOP_WRITE() interfaces do not allow to pass unmapped requests to
the filesystem. Vnode-backed md(4) uses pbufs instead of relying on
the bio_transient_map, to avoid usual md deadlock.
Sponsored by: The FreeBSD Foundation
Tested by: pho, scottl
tunable by default.
This will allow GENERIC configurations to boot on small memory boxes, but
not require end users who want to use CTL to recompile their kernel. They
can simply set kern.cam.ctl.disable=0 in loader.conf.
The eventual solution to the memory usage problem is to change the way
CTL allocates memory to be more configurable, but this should fix things
for small memory situations in the mean time.
UPDATING: Explain the change in the CTL configuration, and
how users can enable CTL if they would like to use
it.
sys/conf/options: Add a new option, CTL_DISABLE, that prevents CTL
from initializing.
ctl.c: If CTL_DISABLE is turned on, don't initialize.
i386/conf/GENERIC,
amd64/conf/GENERIC: Re-enable device ctl, and add the CTL_DISABLE
option.
PREVENT ALLOW MEDIUM REMOVAL commands return errors on these devices
without returning sense data. In some cases unrelated following commands
start to return errors too, that makes device to be dropped by CAM.
every architecture's busdma_machdep.c. It is done by unifying the
bus_dmamap_load_buffer() routines so that they may be called from MI
code. The MD busdma is then given a chance to do any final processing
in the complete() callback.
The cam changes unify the bus_dmamap_load* handling in cam drivers.
The arm and mips implementations are updated to track virtual
addresses for sync(). Previously this was done in a type specific
way. Now it is done in a generic way by recording the list of
virtuals in the map.
Submitted by: jeff (sponsored by EMC/Isilon)
Reviewed by: kan (previous version), scottl,
mjacob (isp(4), no objections for target mode changes)
Discussed with: ian (arm changes)
Tested by: marius (sparc64), mips (jmallet), isci(4) on x86 (jharris),
amd64 (Fabian Keil <freebsd-listen@fabiankeil.de>)
Make umass return an error code if SCSI sense retrieval request
has failed. Make sure scsi_error_action honors SF_NO_RETRY and
SF_NO_RECOVERY in all cases, even if it cannot parse sense bytes.
Reviewed by: hselasky (umass), scottl (cam)
to avoid sending extra READ CAPACITY requests by dastart(). Schedule periph
again on reprobe completion, or otherwise it may stuck indefinitely long.
This should fix USB explore thread hanging on device unplug, waiting for
periph destruction.
Reported by: hselasky
and da_default_timeout where their current hardcoded values matched the current
default value for said tunables.
PR: kern/169976
Reviewed by: pjd (mentor)
Approved by: mav
DISKFLAG_CANDELETE. While this change makes this layer consistent
other layers such as UFS and ZFS BIO_DELETE support may not notice
any change made manually via these device sysctls until the device
is reopened via a mount.
Also corrected var order in dadeletemethodsysctl
PR: kern/169801
Reviewed by: pjd (mentor)
Approved by: mav
MFC after: 2 weeks
Previously CTL would leave individual LUNs enabled in the target
driver, whether or not the port as a whole was enabled. It would
also leave the wildcard LUN enabled indefinitely.
This change means that CTL will enable and disable any active LUNs,
as well as the wildcard LUN, when enabling and disabling a port.
Also, fix a bug that could crop up due to an uninitialized CCB
type.
ctl.c: Before calling ctl_frontend_online(), run through
the LUN list and enable all active LUNs.
After calling ctl_frontend_offline(), run through
the LUN list and disble all active LUNs.
scsi_ctl.c: Before bringing a port online, allocate the
wildcard peripheral for that bus. And after taking
a port offline, invalidate the wildcard peripheral
for that bus.
Make sure that we hold the SIM lock around all
calls to xpt_action() and other transport layer
interfaces that require it.
Use CAM_SIM_{LOCK|UNLOCK} consistently to acquire
and release the SIM lock.
Update a number of outdated comments. Some of
these should have been fixed long ago.
Actually do LUN disbables now. The newer drivers
in the tree work correctly for this as far as I
know.
Initialize the CCB type to CTLFE_CCB_DEFAULT to
avoid a panic due to uninitialized memory.
Submitted by: Chuck Tuffli (partially)
MFC after: 1 week
ctl_frontend_cam_sim.c: Coalesce cfcs_online() and cfcs_offline()
into a single function since these were
identical except for one line.
Make sure we hold the SIM lock around path
creation, and calling xpt_rescan().
scsi_ctl.c: In ctlfe_onoffline(), make sure we hold the
SIM lock around path creation and free
calls, as well as xpt_action().
In ctlfe_lun_enable(), hold the SIM lock
around path and peripheral operations that
require it.
Sponsored by: Spectra Logic Corporation
MFC after: 1 week
while doing a copyout. That can cause a panic, because copyout
can trigger VM faults, and we can't handle VM faults while holding
a mutex.
The solution here is to malloc a separate buffer to hold the OOA
queue entries, so that we don't risk a VM fault while filling up
the buffer and we don't have to drop the lock. The other solution
would be to wire the user's memory while filling their buffer with
copyout, but that would have been a little more complex.
Also fix a debugging parenthesis issue in ctl_abort_task() pointed
out by Chuck Tuffli.
Sponsored by: Spectra Logic Corporation
MFC after: 1 week
drivers.
The bug occurrs when a userland process has the driver instance
open and the underlying device goes away. We get the devfs
callback that the device node has been destroyed, but not all of
the closes necessary to fully decrement the reference count on the
CAM peripheral.
The reason is that once devfs calls back and says the device has
been destroyed, it is moved off to deadfs, and devfs guarantees
that there will be no more open or close calls. So the solution
is to keep track of how many outstanding open calls there are on
the device, and just release that many references when we get the
callback from devfs.
scsi_pass.c,
scsi_enc.c,
scsi_enc_internal.h: Add an open count to the softc in these
drivers. Increment it on open and
decrement it on close.
When we get a devfs callback to say that
the device node has gone away, decrement
the peripheral reference count by the
number of still outstanding opens.
Make sure we don't access the peripheral
with cam_periph_unlock() after what might
be the final call to
cam_periph_release_locked(). The
peripheral might have been freed, and we
will be dereferencing freed memory.
scsi_ch.c,
scsi_sg.c: For the ch(4) and sg(4) drivers, add the
same changes described above, and in
addition, fix another bug that was
previously fixed in the pass(4) and enc(4)
drivers.
These drivers were calling destroy_dev()
from their cleanup routine, but that could
cause a deadlock because the cleanup
routine could be indirectly called from
the driver's close routine. This would
cause a deadlock, because the device node
is being held open by the active close
call, and can't be destroyed.
Sponsored by: Spectra Logic Corporation
MFC after: 1 week
The problem was a race condition between the EDT traversal used by
things like 'camcontrol devlist', and CAM peripheral driver
removal.
The EDT traversal code holds the CAM topology lock, and wants
to show devices that have been invalidated. It acquires a
reference to the peripheral to make sure the peripheral it is
examining doesn't go away.
However, because the peripheral removal code in camperiphfree()
drops the CAM topology lock to call the peripheral's destructor
routine, we can run into a situation where the EDT traversal
increments the peripheral reference count after free process is
already in progress. At that point, the reference count is
ignored, because it was 0 when we started the process.
Fix this race by setting a flag, CAM_PERIPH_FREE, that I previously
added and checked in xptperiphtraverse() and xptpdperiphtravsere(),
but failed to use. If the EDT traversal code sees that flag,
it will know that the peripheral free process has already started,
and that it should not access that peripheral.
Also, fix an inconsistency in the locking between
xptpdperiphtraverse() and xptperiphtraverse(). They now both
hold the CAM topology lock while calling the peripheral traversal
function.
cam_xpt.c: Change xptperiphtraverse() to hold the CAM topology
lock across calls to the traversal function.
Take out the comment in xptpdperiphtraverse() that
referenced the locking inconsistency.
cam_periph.c: Set the CAM_PERIPH_FREE flag when we are in the
process of freeing a peripheral driver.
Sponsored by: Spectra Logic Corporation
MFC after: 1 week
The device reports support for SATA Asynchronous Notification in its
IDENTIFY data, but returns error on attempt to enable that feature.
Make SATA XPT of CAM only report these errors, but not fail the device.
MFC after: 1 week
Element Descriptor page if it is not supported. This removes one error
message from verbose logs during boot on systems with some enclosures.
Sponsored by: iXsystems, Inc.
safe in some cases to reduce CCB priority after it was scheduled with high
priority. This fixes reproducible deadlock when command sent through the
pass interface while ATA XPT recovers from command timeout.
Instead of that enforce priority at passioctl(). libcam provides no obvious
interface to specify CCB priority and so much (all?) code specifies zero
(highest) priority. This change limits pass CCBs priority to NORMAL run
level, allowing XPT to complete bus and device recovery after reset before
running any payload.
In particular, do not lock Giant conditionally when calling into the
filesystem module, remove the VFS_LOCK_GIANT() and related
macros. Stop handling buffers belonging to non-mpsafe filesystems.
The VFS_VERSION is bumped to indicate the interface change which does
not result in the interface signatures changes.
Conducted and reviewed by: attilio
Tested by: pho
System time is set later on boot process then initial bus scan by CAM.
Until that moment microtime() is equal to microuptime(), and if system
boots quickly, the value can be close to zero. That causes settle time
waiting even for buses that don't use reset during probe.
On my test system this reduces boot time by 1 second if USB enabled, or
by 4 seconds if USB disabled. CAM waited for ctl2cam0 bus "settle".
- Extend the lock to cover xpt_path_release() for the new path.
- While xpt_action() is called while holding right SIM lock for the new
bus, the old path release may require different SIM lock. So we have
to temporary drop the new lock and get the old one.
without holding SIM lock. It really doesn't need that lock, but adding it
removes that specific exception, allowing to assert locking there later.
Submitted by: ken@ (earlier version)
It is required to store extra recovery requests in case of bus resets.
On ATA/SATA this fixes assertion panics on HEAD with INVARIANTS enabled or
possible memory corruptions otherwise if timeout/reset happens when device
CCB queue is already full.
Reported by: gibbs@
MFC after: 1 week
returns zero while request status is not CAM_REQ_CMP. That could cause
partial device attach or other unexpected results.
Found by: Clang Static Analyzer
drivers:
- Remove scsi_low_pisa.*, they were unused.
- Remove <compat/netbsd/physio_proc.h> and calls to the stubs in that
header. They were empty nops.
- Retire sl_xname and use device_get_nameunit() and device_printf() with
the underlying device_t instead.
- Remove unused {ct,ncv,nsp,stg}print() functions.
- Remove empty SOFT_INTR_REQUIRED() macro and the unused sl_irq member.
NetBSD/pc98 was never merged into the main NetBSD tree and is no longer
developed. Adding locking to these drivers would have made the compat
shims hard to impossible to maintain, so remove the shims to ease
future changes.
These changes were verified by md5. Some additional shims can be removed
that do affect the compiled results that I will probably do in another
round.
Approved by: nyan (tentatively)
of this hardware still running (close to twenty years now).
2. Quiesece and use ENC_VLOG instead of ENC_LOG for most
complaints. That is, they're visible with bootverbose, but
otherwise quiesced and not repeatedly spamming messages
with constant reminders that hardware in this space is
rarely fully compliant.
MFC after: 1 month