Commit Graph

2395 Commits

Author SHA1 Message Date
Warner Losh
f5a95d9a07 Remove NAND and NANDFS support
NANDFS has been broken for years. Remove it. The NAND drivers that
remain are for ancient parts that are no longer relevant. They are
polled, have terrible performance and just for ancient arm
hardware. NAND parts have evolved significantly from this early work
and little to none of it would be relevant should someone need to
update to support raw nand. This code has been off by default for
years and has violated the vnode protocol leading to panics since it
was committed.

Numerous posts to arch@ and other locations have found no actual users
for this software.

Relnotes:	Yes
No Objection From: arch@
Differential Revision: https://reviews.freebsd.org/D20745
2019-06-25 04:50:09 +00:00
Warner Losh
97ad52ca4c Use the cam_ed copy of ata_params rather than malloc and freeing
memory for it. This reaches into internal bits of xpt a little, and
I'll clean that up later.
2019-06-24 20:23:19 +00:00
Warner Losh
2afaed2d0f Create ata_param_fixup
Create a common fixup routine to do the canonical fixup of the
ata_param fixup. Call it from both the ATA and the ATA over SCSI
paths.
2019-06-24 20:18:58 +00:00
Warner Losh
161d2a1796 Go ahead and completely fix the ata_params before calling the veto
function. This breaks nothing that uses it in the tree since
ata_params is ignored in storvsc_ada_probe_veto which is the only
in-tree consumer.
2019-06-24 20:18:49 +00:00
Alexander Motin
53f5ac1310 Improve AHCI Enclosure Management and SES interoperation.
Since SES specs do not define mechanism to map enclosure slots to SATA
disks, AHCI EM code I written many years ago appeared quite useless,
that always bugged me.  I was thinking whether it was a good idea, but
if LSI HBAs do that, why I shouldn't?

This change introduces simple non-standard mechanism for the mapping
into both AHCI EM and SES code, that makes AHCI EM on capable controllers
(most of Intel's) a first-class SES citizen, allowing it to report disk
physical path to GEOM, show devices inserted into each enclosure slot in
`sesutil map` and `getencstat`, control locate and fault LEDs for specific
devices with `sesutil locate adaX on` and `sesutil fault adaX on`, etc.

I've successfully tested this on Supermicro X10DRH-i motherboard connected
with sideband cable of its S-SATA Mini-SAS connector to SAS815TQ backplane.
It can indicate with LEDs Locate, Fault and Rebuild/Remap SES statuses for
each disk identical to real SES of Supermicro SAS2 backplanes.

MFC after:	2 weeks
2019-06-23 19:05:01 +00:00
Alexander Motin
6d4d657360 Decouple enc/ses verbosity from bootverbose.
I don't want to be regularly notified that my enclosure violates standards
until there is some real problem I want to debug.

MFC after:	2 weeks
2019-06-22 19:09:10 +00:00
Alexander Motin
b8038d7827 Remove ancient SCSI-2/3 mentioning.
MFC after:	2 weeks
2019-06-22 03:50:43 +00:00
Alexander Motin
6805c9b74d Make ELEMENT INDEX validation more strict.
SES specifications tell: "The Additional Element Status descriptors shall
be in the same order as the status elements in the Enclosure Status
diagnostic page".  It allows us to question ELEMENT INDEX that is lower
then values we already processed.  There are many SAS2 enclosures with
this kind of problem.

While there, add more specific error messages for cases when ELEMENT INDEX
is obviously wrong.  Also skip elements with INVALID bit set.

MFC after:	2 weeks
2019-06-22 01:06:41 +00:00
Scott Long
0feb46b0c6 Refactor xpt_getattr() to make it more readable. No outwardly
visible functional changes, though code flow was modified a bit
internally to lessen the need for goto jumps and chained if
conditionals.
2019-06-21 23:40:26 +00:00
Alexander Motin
7318fcb51d Fix individual_element_index when some type has 0 elements.
When some type has 0 elements, saved_individual_element_index was set
to -1 on second type bump, since individual_element_index was not
restored after the first.  To me it looks easier just to increment
saved_individual_element_index separately than think when to save it.

MFC after:	2 weeks
2019-06-21 23:29:16 +00:00
Alexander Motin
68035f6381 SPC-3 and up require some UAs to be returned as fixed.
MFC after:	2 weeks
2019-06-20 22:20:30 +00:00
Alexander Motin
35a9ffc350 Optimize xpt_getattr().
Do not allocate temporary buffer for attributes we are going to return
as-is, just make sure to NUL-terminate them.  Do not zero temporary 64KB
buffer for CDAI_TYPE_SCSI_DEVID, XPT tells us how much data it filled
and there are also length fields inside the returned data also.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-06-20 20:29:42 +00:00
Warner Losh
15865ae73d Minor white space changes.
Remove trailing white space that's crept into this file.
2019-06-11 20:48:19 +00:00
Bjoern A. Zeeb
6e40542a4e Introduce sim_dev and cam_sim_alloc_dev().
Add cam_sim_alloc_dev() as a wrapper to cam_sim_alloc() which takes
a device_t instead of the unit_number (which we can derive from the
dev again).

Add device_t sim_dev to struct cam_sim. It will be used to pass through
the bus for cases when both sides of CAM speak newbus already and we want
to link them (yet make the calls through CAM for now).

SDIO will be the first consumer of this. For that make use of
cam_sim_alloc_dev() in sdhci under MMCCAM.

This will also allow people to start iterating more on the idea
to newbus-ify CAM without changing 50+ device drivers from the start.
Also to be clear there are callers to cam_sim_alloc() which do not
have a device_t (e.g., XPT) or provide their own unit number so we cannot
simply switch the KPI entirely.

Submitted by:	kibab (original idea, see https://reviews.freebsd.org/D12467)
Reviewed by:	imp, chuck
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19746
2019-06-08 15:19:50 +00:00
Chuck Tuffli
b1f1471064 Fix nda(4) PCIe link status output
Differentiate between PCI Express Endpoint devices and Root Complex
Integrated Endpoints in the nda driver. The Link Status and Capability
registers are not valid for Integrated Endpoints and should not be
displayed. The bhyve emulated NVMe device will advertise as being an
Integrated Endpoint.

Reviewed by:	imp
Approved byL	imp (mentor)
Differential Revision: https://reviews.freebsd.org/D20282
2019-06-07 18:34:48 +00:00
Alexander Motin
0a3b1d8090 Simplify math added in r310524.
Should be no functional change.

Reported by:	danfe
MFC after:	1 week
2019-05-22 15:39:35 +00:00
Alexander Motin
9c91a26579 Fix condition broken at r345815.
Reported by:	danfe
MFC after:	3 days
2019-05-22 15:25:10 +00:00
Conrad Meyer
e2e050c8ef Extract eventfilter declarations to sys/_eventfilter.h
This allows replacing "sys/eventfilter.h" includes with "sys/_eventfilter.h"
in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces header
pollution substantially.

EVENTHANDLER_DECLARE and EVENTHANDLER_LIST_DECLAREs were moved out of .c
files into appropriate headers (e.g., sys/proc.h, powernv/opal.h).

As a side effect of reduced header pollution, many .c files and headers no
longer contain needed definitions.  The remainder of the patch addresses
adding appropriate includes to fix those files.

LOCK_DEBUG and LOCK_FILE_LINE_ARG are moved to sys/_lock.h, as required by
sys/mutex.h since r326106 (but silently protected by header pollution prior
to this change).

No functional change (intended).  Of course, any out of tree modules that
relied on header pollution for sys/eventhandler.h, sys/lock.h, or
sys/mutex.h inclusion need to be fixed.  __FreeBSD_version has been bumped.
2019-05-20 00:38:23 +00:00
Alexander Motin
8cb46437a7 Drop periph lock around cam_periph_unmapmem().
Since r345656 it may call copyout(), that may sleep.

MFC after:	3 days
Sponsored by:	iXsystems, Inc.
2019-05-06 19:08:03 +00:00
Alexander Motin
0404d5981d Decode some more ATA commands found in ACS-4.
MFC after:	1 week
2019-05-05 17:10:12 +00:00
Alexander Motin
5a9170aa4c Report DIF protection type the disk is formatted with.
Some disks formatted with protection report errors if written without
protection used.  This should help to diagnose the problem.

MFC after:	2 weeks
2019-04-22 01:08:14 +00:00
Alexander Motin
ed569aadca Polish SCSI sense data validity checks.
According to specs and common sense, all sense data reported in descriptor
format should be valid.  But practice shows different, some devices return
descriptors with invalid data, resulting in error messages looking worse.

Decouple block/stream commands sense data and information field printing.
Looking on present specs, there are much more cases when those fields are
not related, and incomplete old code was not printing valid sense data and
leaving empty lines for invalid.

MFC after:	2 weeks
2019-04-21 19:07:03 +00:00
Ilya Bakulin
0660cfa0c4 Add new fields to mmc_data in preparation to SDIO CMD53 block mode support
SDIO command CMD53 (IO_RW_EXTENDED) allows data transfers using blocks of 1-2048 bytes,
with a maximum of 511 blocks per request.
Extend mmc_data structure to properly describe such requests,
and initialize the new fields in kernel and userland consumers.

No actual driver changes happen yet, these will follow in the separate changes.

Reviewed by:	bz
Approved by:	imp (mentor)
Differential Revision:	https://reviews.freebsd.org/D19779
2019-04-10 19:49:35 +00:00
Alexander Motin
9345f88f8c List few more ATA commands.
MFC after:	1 week
2019-04-03 18:27:54 +00:00
Alexander Motin
154c6ffd71 Build NVMe CAM transport unrelated to NVMe SIM.
Before this I suppose it was impossible load CAM-based NVMe as module.
Plus this appeared to be needed to build r345815 without NVMe driver.

MFC after:	2 weeks
2019-04-02 20:27:56 +00:00
Alexander Motin
e40d8dbbcb Make cam_error_print() decode NVMe commands.
MFC after:	2 weeks
2019-04-02 19:37:52 +00:00
Alexander Motin
99bad9ca9a Unify SCSI_STATUS_BUSY retry handling with other cases.
- Do not retry if periph was invalidated.
 - Do not decrement retry_count if already zero.
 - Report action_string when applicable.

MFC after:	2 weeks
2019-04-02 14:46:10 +00:00
Ilya Bakulin
1a22fb3f5e Refactor error handling
There is some code duplication in error handling paths in a few functions.
Create a function for printing such errors in human-readable way and get rid
of duplicates.

Approved by:	imp (mentor)
Differential Revision:	https://reviews.freebsd.org/D15912
2019-04-01 18:54:15 +00:00
Ilya Bakulin
5d20e65174 Use information about max data size that the controller is able to operate
Using DFLTPHYS/MAXPHYS is not always OK, instead make it possible for the
controller driver to provide maximum data size to MMCCAM, and use it there.

The old stack already does this.

Reviewed by:	manu
Approved by:	imp (mentor)
Differential Revision:	https://reviews.freebsd.org/D15892
2019-04-01 18:49:39 +00:00
Alexander Motin
b059686a71 Do not map small IOCTL buffers to KVA, but copy.
CAM IOCTL interfaces traditionally mapped user-space data buffers to KVA.
It was nice originally, but now it takes too much to handle respective
TLB shootdowns, while small kernel memory allocations up to 64KB backed
by UMA and accompanied by copyin()/copyout() can be much cheaper.

For large buffers mapping still may have sense, and unmapped I/O would
be even better, but the last unfortunately is more tricky, since unmapped
I/O API is too specific to struct bio now.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-03-28 20:41:02 +00:00
Warner Losh
3899afd370 Upgrade Chipfancier SLC quirk to all versions
The 16GB, 32GB and 128GB versions of this product all have the same
problem. For some reason, the RC10 size is correct, while the RC16
size is larger (oddly by the capacity size / 1024 bytes). Using the
RC16 size results in illegal LBA range errors when geom tastes the
device. So, expand the quirk to cover all versions of this chip.

Ideally, we'd get both READ CAPACITY 10 and READ CAPACITY 16 sizes and
print a warnnig if they differ and use the smaller of the two numbers,
though that may be problematical as well. Furthermore, SBC-4
encourages users transition to RC16 only, which suggests that in the
future RC10 may disappear from some drives. It's unclear how to cope
with these drives generically.

PR: 234503
MFC After: 1 week
2019-03-11 20:57:54 +00:00
Alexander Motin
053db1fefd Reduce CTL threads priority to about PUSER.
Since in most configurations CTL serves as network service, we found
that this change improves local system interactivity under heavy load.
Priority of main threads is set slightly higher then worker taskqueues
to make them quickly sort incoming requests not creating bottlenecks,
while plenty of worker taskqueues should be less sensitive to latency.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2019-03-04 00:49:07 +00:00
Alexander Motin
321f819ba5 Refactor command ordering/blocking mechanism in CTL.
Replace long per-LUN queue of blocked commands, scanned on each command
completion and sometimes even twice, causing up to O(n^^2) processing cost,
by much shorter per-command blocked queues, scanned only when respective
command completes, and check only commands before the previous blocker,
reducing cost to O(n).

While there, unblock aborted commands to make them "complete" ASAP to be
removed from the OOA queue and so not waste time ordering other commands
against them.  Aborted commands that were not sent to execution yet should
have no visible side effects, so this is safe and easy optimization now,
comparing to commands already in processing, which are a still pain.

Together those two optimizations should fix quite pathological case, when
due to backend slowness CTL accumulated many thousands of blocked requests,
partially aborted by initiator and so supposedly not even existing, but
still wasting CTL CPU time.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-02-27 21:29:21 +00:00
Alexander Motin
db53c0adb9 Scrap some debug printf's, unused for years.
MFC after:	2 weeks
2019-02-26 16:05:33 +00:00
Alexander Motin
62e802cf3a Free some space in struct ctl_io_hdr for better use.
- Collapse original_sc and serializing_sc fields into one, since they
are never used simultanously, we have only one local I/O and one remote.

 - Move remote_sglist and local_sglist fields into CTL_PRIV_BACKEND,
since they are used only on Originating SC in XFER mode, where requests
don't ever reach backends, so we can reuse backend's private storage.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-02-23 23:35:52 +00:00
Alexander Motin
e806165bee Remove disabled CTL_LEGACY_STATS support.
It was not only disabled for quite a while, but also appeared to be broken
at r325517, when maximum number of ports was made configurable.

MFC after:	1 week
2019-02-23 04:24:44 +00:00
Warner Losh
a73b2e25e1 Fix panic message.
The panic message lead people to believe some userland CAM request had
caused a problem when in reallity it was for a kernel request (eg the
USER bit was cleared). Reword message. Also, improve a couple of
comments to reflect that the periph shouldn't be completely torn down
before we get here (so the path and sim pointers should be valid, but
aren't and the code is designed to be robust enough in the face of
that to give a specific panic message).
2019-02-13 00:10:12 +00:00
David Bright
3420c04b44 CID 1009492: Logically dead code in sys/cam/scsi/scsi_xpt.c
In `probedone()`, for the `PROBE_REPORT_LUNS` case, all paths that
fall to the bottom of the case set `lp` to `NULL`, so the test for a
non-NULL value of `lp` and call to `free()` if true is dead code as
the test can never be true. Fix by eliminating the whole if
statement. To guard against a possible future change that accidentally
violates this assumption, use a `KASSERT()` to catch if `lp` is
non-NULL.

Reviewed by:	cem
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19109
2019-02-11 22:09:26 +00:00
Warner Losh
a49077d365 Add quirk for Sansisk X400 drives
Certain versions of Sandisk x400 firmware can hang under extremely
heavly load of large I/Os for prolonged periods of time. Newer /
current versions work fine, and should be used where possible. Where
not possible, this quirk ensures that I/O requests are limited to 128k
to avoids the bug, even under extreme load. Since MAXPHYS is 128k,
only users with custom kernels are at risk on the older firmware.
Once all known users of the older firmware have upgraded, this quirk
will be removed.

Sponsored by: Netflix, Inc.
2019-02-05 22:53:36 +00:00
Warner Losh
52467047aa Regularize the Netflix copyright
Use recent best practices for Copyright form at the top of
the license:
1. Remove all the All Rights Reserved clauses on our stuff. Where we
   piggybacked others, use a separate line to make things clear.
2. Use "Netflix, Inc." everywhere.
3. Use a single line for the copyright for grep friendliness.
4. Use date ranges in all places for our stuff.

Approved by: Netflix Legal (who gave me the form), adrian@ (pmc files)
2019-02-04 21:28:25 +00:00
Alexander Motin
6a69d2a400 Use switch instead of chained if/else to improve readability.
Submitted by:	Ryan Moeller <ryan@freqlabs.com>
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D19051
2019-02-04 01:20:56 +00:00
Alexander Motin
441a6b699f Remove stale now comment, forgotten in r343582.
MFC after:	2 weeks
2019-01-30 18:56:45 +00:00
Alexander Motin
a5fde7ef52 Relax BIO_FLUSH ordering in da(4), respecting BIO_ORDERED.
r212160 tightened this from always using MSG_SIMPLE_Q_TAG to always
MSG_ORDERED_Q_TAG.  Since it also marked all BIO_FLUSH requests with
BIO_ORDERED, this commit changes nothing immediately, but it returns
BIO_FLUSH callers ability to actually specify ordering they really
need, alike to other request types.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-01-30 16:50:53 +00:00
Andriy Voskoboinyk
4dafe01e7d Add NO_6_BYTE / NO_SYNC_CACHE quirks for (C|D|E).* Olympus digital cameras
PR:		97472
Submitted by:	Fabio Luis Girardi <papelhigienico@gmail.com>
Reviewed by:	imp
MFC after:	3 weeks
2019-01-27 17:51:49 +00:00
Oleksandr Tymoshenko
fb81f26636 [ata] Add workaround for KingDian S200 SSD crash on receiving TRIM command
- Add ADA_Q_NO_TRIM quirk to be used with the device that falsely advertise TRIM support
- Add ADA_Q_NO_TRIM entry for KingDian S200 SSD

PR:		222802
Submitted by:	Bertrand Petit <bsdpr@phoe.frmug.org>
MFC after:	1 week
2019-01-18 04:23:52 +00:00
Gleb Smirnoff
756a541279 Allocate pager bufs from UMA instead of 80-ish mutex protected linked list.
o In vm_pager_bufferinit() create pbuf_zone and start accounting on how many
  pbufs are we going to have set.
  In various subsystems that are going to utilize pbufs create private zones
  via call to pbuf_zsecond_create(). The latter calls uma_zsecond_create(),
  and sets a limit on created zone. After startup preallocate pbufs according
  to requirements of all pbuf zones.

  Subsystems that used to have a private limit with old allocator now have
  private pbuf zones: md(4), fusefs, NFS client, smbfs, VFS cluster, FFS,
  swap, vnode pager.

  The following subsystems use shared pbuf zone: cam(4), nvme(4), physio(9),
  aio(4). They should have their private limits, but changing that is out of
  scope of this commit.

o Fetch tunable value of kern.nswbuf from init_param2() and while here move
  NSWBUF_MIN to opt_param.h and eliminate opt_swap.h, that was holding only
  this option.
  Default values aren't touched by this commit, but they probably should be
  reviewed wrt to modern hardware.

This change removes a tight bottleneck from sendfile(2) operation, that
uses pbufs in vnode pager. Other pagers also would benefit from faster
allocation.

Together with:	gallatin
Tested by:	pho
2019-01-15 01:02:16 +00:00
Warner Losh
3f41bec239 Add NO_SYNC_CACHE quirk for PENTAX cameras
PR: 93389
Submitted by: Demin Alexander
2019-01-08 20:55:02 +00:00
Warner Losh
e11ed26a1d Add NO_RC16 quirk for Chipfancier 16GB USB stick...
Submitted by: osef.lar@gmail.com
PR: 234503
2018-12-31 22:20:30 +00:00
Andriy Gapon
5b7f9fada1 add a knob that disables detection of write protected disks
It has been reported that on some systems (with real hardware passed
through to a virtual machine) the WP detection causes USB disk probing
failures.

While here, also fix the selection of the next state in the case
of malloc failure in DA_STATE_PROBE_WP.  It was DA_STATE_PROBE_RC
unconditionally even when it should have been DA_STATE_PROBE_RC16.

PR:		225794
Reported by:	David Boyd <David.Boyd49@twc.com>
MFC after:	3 weeks
Differential Revision: https://reviews.freebsd.org/D18496
2018-12-17 16:01:37 +00:00
Chuck Tuffli
87b3975e36 nda(4) fix check for Dataset Management support
In the nda(4) driver, only set DISKFLAG_CANDELETE (a.k.a. can support
BIO_DELETE) if the drive supports Dataset Management. There are reports
that without this check, VMWare Workstation does not work reliably.

Fix is to check the ONCS field in the NVMe Controller Data structure for
support. This check previously existed but did not survive the
big-endian changes.

Reported by: yuripv@yuripv.net
Reviewed by: imp, mav, jimharris
Approved by: imp (mentor)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D18493
2018-12-13 13:25:37 +00:00
Warner Losh
b920de1428 Send a START UNIT command when a disk responds with an ASC of 04/1C.
This will hopefully spin up a disk that's in low-power mode.

Sponsored by: Netflix
Submitted by: scottl@
2018-12-09 21:37:34 +00:00
Scott Long
e024533250 Don't allocate the config_intrhook separately from the softc, it's small
enough that it costs more code to handle the malloc/free than it saves.
2018-12-09 06:16:54 +00:00
Andriy Gapon
2e2b365e47 daprobedone: announce if a disk is write-protected
MFC after:	2 weeks
2018-12-07 12:02:31 +00:00
Warner Losh
d900ade516 NVME trim clocking
Add the ability to set two goals for trims in the I/O scheduler. The
first goal is the number of BIO_DELETEs to accumulate
(kern.cam.XX.U.trim_goal). When non-zero, this many trims will be
accumulated before we start to transfer them to lower layers. This is
useful for devices that like to get lots of trims all at once in one
transaction (not all devices are like this, and some vary by workload).

The second is a number of ticks to defer trims. If you've set a trim
goal, then kern.cam.XX.U.trim_ticks controls how long the system will
defer those trims before timing out and sending them anyway. It has no
effect when trim_goal is 0.

In any event, a BIO_FLUSH will cause all the TRIMs to be released to
the periph drivers. This may be a minor overloading of what BIO_FLUSH
is supposed to mean, but it's useful to preserve other ordering
semantics that users of BIO_FLUSH reply on.

Sponsored by: Netflix, Inc
2018-11-27 00:36:35 +00:00
Warner Losh
1759fd7798 Minor tweaks to the formatting
Tweak the format of the trim + read bias code. Add similar debug to
the read + writes case.

Spondored by: Netflix
2018-11-26 22:50:30 +00:00
Warner Losh
e5436ab5af Add cam_iosched_set_latfcn to set a latency callback for high latency.
It's often useful to have a callback when an I/O takes more than a
threshold amount of time. This adds the infrastructure for periph
devices to register one.

One use-case is as a debugging aide when you need a semi-realtime
indication of an I/O outlier so you can trigger bus capture gear for
vendor analysis.

Sponsored by: Netflix, Inc
2018-11-15 16:02:45 +00:00
Warner Losh
204a1a4d4c Introduce scsi_ata_setfeatures() as a convenient way to make
a passthru ATA SETFEATURES command.

Sponsored by: Netflix, Inc
2018-11-15 16:02:34 +00:00
Warner Losh
ee7eba240b Remove trailing white space in advance of other changes. 2018-11-14 23:15:50 +00:00
Warner Losh
74c0112fef Only assert locked for many async events.
Many async events that we see are called for this specific path. When
calling an async callback for a targetted device, XTP will lock that
specific device's path lock (same as what cam_periph_lock does). For
those AC_ events, assert we have the lock rather than trying to
recusrively take it (which causes panics since it's not recursive).

Add annotations about this and about the fact that AC_SCSI_AEN events
are generated now only in the ata stack (which cannot have a scsi_da
attachment). Leave it in place in case I've overlooked something as
the code is harmless.

This is fallout from my attempts to "fix" locking for softc->flags in
r330796 that's not been triggered often enough to get my attention
until now.

Sponsored by: Netflix
MFC After: 3 days
Differential Revision: https://reviews.freebsd.org/D17837
2018-11-05 18:47:29 +00:00
Warner Losh
9385e92b25 Add comments explaining what hold/unhold do
They act as a simple one-deep semaphore to keep open/close/probe from
running at the same time to avoid races that creates.
2018-11-01 21:51:41 +00:00
Warner Losh
ea657f2c76 Add statistics for TRIM comands
Add a counter for the LBAs, Ranges and hardware commands so that we
can provide additional color to the statistics we provide to vendors.

Sponsored by: Netflix, Inc
2018-10-26 16:23:51 +00:00
Warner Losh
51a2f83991 Retire scsi_low
scsi_low was a common set of routines to do the SCSI bus sequencing
for the ncv, nsp and stg drivers. Those have been removed, so it's no
longer needed since nothing else in the tree uses it and nothing
likely ever will (it's for super-low-end 8-bit parallel SCSI cards).
2018-10-22 02:36:07 +00:00
Brooks Davis
4364eab875 Move 32-bit compat support for CDIOREADTOCENTRYS to the right place.
ioctl(2) commands only have meaning in the context of a file descriptor
so translating them in the syscall layer is incorrect.

The new handler users an accessor to retrieve/construct a pointer from
the last member of the passed structure and relies on type punning to
access the other members which require no translation.

Reviewed by:	kib (prior version), jhb
Approved by:	re (rgrimes)
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Review:	https://reviews.freebsd.org/D17378
2018-10-02 23:23:56 +00:00
Kenneth D. Merry
aabac0c176 Fix a da(4) driver memory leak for SCSI SMR devices.
In the probe case for SCSI SMR Host Aware or Most Managed drives, be sure
to free allocated memory.

sys/cam/scsi/scsi_da.c:
	In dadone_probezone(), free the data pointer before returning.

MFC after:	3 days
Sponsored by:	Spectra Logic
Approved by:	re (kib)
2018-10-01 19:00:46 +00:00
Edward Tomasz Napierala
d4eab13738 Make the wait in cfiscsi_offline() interruptible. This is the second half
of the fix/workaround for the "ctld hanging on reload" problem.

PR:		220175
Reported by:	Eugene M. Zheganin <emz at norma.perm.ru>
Tested by:	Eugene M. Zheganin <emz at norma.perm.ru>
Approved by:	re (kib)
MFC after:	2 weeks
Sponsored by:	playkey.net
2018-09-11 11:39:59 +00:00
Alexander Motin
cae8b43e5c Add missing copyin() to access LUN and port ioctl arguments.
Somehow this was working even after PTI in, at least on amd64, and got
broken by something only very recently.

Reviewed by:	araujo
Approved by:	re (gjb)
2018-09-06 14:03:10 +00:00
Edward Tomasz Napierala
d783154e46 Try harder in cfiscsi_offline(). This is believed to be the workaround
for the "ctld hanging on reload" problem observed in same cases under
high load.  I'm not 100% sure it's _the_ fix, as the issue is rather hard
to reproduce, but it was tested as part of a larger path and the problem
disappeared.  It certainly shouldn't break anything.

Now, technically, it shouldn't be needed.  Quoting mav@, "After
ct->ct_online == 0 there should be no new sessions attached to the target.
And if you see some problems abbout it, it may either mean that there are
some races where single cfiscsi_session_terminate(cs) call may be lost,
or as a guess while this thread was sleeping target was reenabbled and
redisabled again".  Should such race be discovered and properly fixed
in the future, than this and the followup two commits can be backed out.

PR:		220175
Reported by:	Eugene M. Zheganin <emz at norma.perm.ru>
Tested by:	Eugene M. Zheganin <emz at norma.perm.ru>
Discussed with:	mav
Approved by:	re (gjb)
MFC after:	2 weeks
Sponsored by:	playkey.net
2018-09-01 16:16:40 +00:00
Chuck Tuffli
9544e6dcf1 Make NVMe compatible with the original API
The original NVMe API used bit-fields to represent fields in data
structures defined by the specification (e.g. the op-code in the command
data structure). The implementation targeted x86_64 processors and
defined the bit fields for little endian dwords (i.e. 32 bits).

This approach does not work as-is for big endian architectures and was
changed to use a combination of bit shifts and masks to support PowerPC.
Unfortunately, this changed the NVMe API and forces #ifdef's based on
the OS revision level in user space code.

This change reverts to something that looks like the original API, but
it uses bytes instead of bit-fields inside the packed command structure.
As a bonus, this works as-is for both big and little endian CPU
architectures.

Bump __FreeBSD_version to 1200081 due to API change

Reviewed by: imp, kbowling, smh, mav
Approved by: imp (mentor)
Differential Revision: https://reviews.freebsd.org/D16404
2018-08-22 04:29:24 +00:00
Edward Tomasz Napierala
3b95cface1 Remove unneccessary code, which also introduced a (very minor)
race condition, due to a missing call to cfiscsi_target_release().

Discussed with:	mav@
Tested by:	Eugene M. Zheganin <emz at norma.perm.ru> (earlier version)
MFC after:	2 weeks
Sponsored by:	playkey.net
2018-08-21 14:34:24 +00:00
Warner Losh
74cc33ce57 Flesh out a comment about what we're doing with read bias and trims.
Sponsored by: Netflix
2018-08-15 00:15:40 +00:00
Warner Losh
0cc28e3cd5 Create xpt_sim_poll and refactor a bit using it.
xpt_sim_poll takes the sim to poll as an argument. It will do the
proper locking protocol, call the SIM polling routine, and then call
camisr_runqueue to process completions on any CCBs the SIM's poll
routine completed. It will be used during late shutdown when a SIM is
waiting for CCBs it sent during shutdown to finish and the scheduler
isn't running because we've panic'd.

This sequence was used twice in cam_xpt, so refactor those to use this
new function.

Sponsored by: Netflix
Differential Review: https://reviews.freebsd.org/D16663
2018-08-13 19:59:32 +00:00
Conrad Meyer
f053ca1f08 Walk back r337554 while discussion continues
The idea was to get the uncontroversial mechanical change out of the way,
then get the meatier functional changes reviewed subsequently.  I had not
realized that the immediately adjacent issue was addressed in a different
direction in r334506 (see Warner's guidance in D15592).

Discussion continues, trying to determine if there is a secondary issue
still[1] and how best to fix it.  With 12-related activities coming up,
while that is ongoing, just take this back for now.

[1]: Shutdown-time eventhandler events fire normally during panic's reboot
path.  Driver callbacks that attempt to issue and wait on interrupt-
completed IO may never complete, hanging the system.  This is particularly
obnoxious in the shutdown/panic path, as the debugger cannot be entered
anymore and the hang prevents reboot restoring availability.

(There's nothing CAM-specific about this problem -- any shutdown
event-triggered driver could do something like this during panic.  But most
NICs, etc.  don't try to send spin-down commands at shutdown. ;-))

Discussed with:	imp, markj
2018-08-10 19:19:07 +00:00
Conrad Meyer
2077be2b73 cam(4): Add an xpt-neutral flag indicating a valid panic CCB
No functional change.

Note that this change is careful to set the CCB header xflags after
foo_fill_bar() routines, which generally zero existing flags.  An earlier
version of this patch mistakenly set the flag before the fill routines.

Submitted by:	Scott Ferris <sferris AT isilon.com>, jhibbits@
Reviewed by:	bdrewery@, markj@, and non-committer FreeBSD contributor Anton Rang
Sponsored by:	Dell EMC Isilon
2018-08-09 21:53:32 +00:00
Conrad Meyer
bc812246a0 cam_ccb.h: Remove redundant declarations of static inline functions
No functional change.

They're unnecessarily confusing for tools like grep or ctags.

Sponsored by:	Dell EMC Isilon
2018-08-09 21:20:07 +00:00
Warner Losh
62c94a0551 For the dynamic I/O scheduler, make the TRIM stuff also count against
read bias so we do reads in preference to TRIMs. This helps a lot when
many trims are delivered at once from the upper layers as they tend to
delay READs due to priority inversion in the code today.

The non iosched case will be fixed when the trim comibing changes
needed for nvme come in later this year.

Sponsored by: Netflix
2018-07-26 22:55:51 +00:00
Alexander Motin
79fab7d48a Stop further SCSI recovery attempts after one has failed.
We've got a set of probably damaged hard disks, reporting 0x04,0x02
("Logical unit not ready, initializing command required") in response
to READ CAPACITY(16), where attempts to use START STOP UNIT for recovery
results in 0x44,0x00 ("Internal target failure") after ~1 second delay.
As result of all recovery retries, device open attempt took ~3 seconds
before finally reporting to GEOM that device is opened, but has no media.
If the open was for writing and since it hasn't formally failed, following
close triggered GEOM retaste, opening device few more times with respective
delays.

This change reduces whole time of this cycle from ~12 seconds to ~3 by
giving up on recovery after the first failure.

Reviewed by:	ken
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2018-07-21 21:34:10 +00:00
Andriy Gapon
b0af06052c remove unneeded inclusion of sys/interrupt.h from several files
It's likely that the header was needed in the past for swi(9).
But now that code does not use swi(9) or any other interfaces defined
in sys/interrupt.h.

MFC after:	1 week
2018-07-04 09:07:18 +00:00
Ilya Bakulin
e8e5c76419 Fix setting RCA for MMC cards
Unlike SD cards, that publish RCA in response to CMD3,
MMC cards expect the host to set RCA itself.

Since we don't support multiple MMC cards on the bus,
just assign a static RCA of 2 to the attached MMC card.

Approved by:	imp (mentor)
Differential Revision:	https://reviews.freebsd.org/D13063
2018-06-19 20:02:03 +00:00
Ilya Bakulin
8b0e085f65 Don't try to turn power down MMC bus if it is already down
Regulator framework doens't like turning off already turned off
regulators, so we get panic on AllWinner boards.

Approved by:	imp (mentor)
Differential Revision:	https://reviews.freebsd.org/D15890
2018-06-19 11:28:50 +00:00
Ilya Bakulin
4c4200c6d9 Correctly define rawscr so initializing it doesn't result in overwriting memory.
We need 8 bytes of storage for rawscr.

Approved by:	imp (mentor)
Differential Revision:	https://reviews.freebsd.org/D15889
2018-06-19 11:25:40 +00:00
Ilya Bakulin
3f1cfdb122 Set MMC_DATA_MULTI flag when doing multi-block transfers
Lower layers (MMC / SDHCI controller drivers) may make certain decisions
based on the presence of this flag. The fact that sdhci.c doesn't
look at this flag is another problem that should be fixed separately.

Found when adding MMCCAM support to AllWinner MMC controller driver
where the presence of this flag actually matters.

Approved by:	imp (mentor)
Differential Revision:	https://reviews.freebsd.org/D15888
2018-06-19 11:23:48 +00:00
Kenneth D. Merry
e4b58dfe33 Fix da(4) locking when probing SMR drives.
Probing host aware and host managed SMR drives got broken in revision
330796.

The added cam_periph_lock() calls were in areas in dadone() where
the peripheral lock was already held.

Since then, dadone() has been split into separate functions that are
dedicated to each probe state.

The result is that when probing a host aware drive, I ran into a recursive
lock acquisition in dadone_probeatalogdir(). I would have run into the
same problem in dadone_probeataiddir(), and in dadone_probeatasup() and
dadone_probeatazone() in the error paths had the probe continued.

The solution is to take out all of the extra cam_periph_lock() calls. I
also added cam_periph_assert(periph, MA_OWNED) near the top of each of
the dadone_* calls. These make it clear to anyone coming along in the
the future that the lock is held in the probe done functions.

Also add a locking assert in daprobedone(), to make it clear that it must
be called with the periph lock held.

Sponsored by:	Spectra Logic
Differential Revision:	https://reviews.freebsd.org/D15764
2018-06-14 17:08:44 +00:00
Ilya Bakulin
d670d9518f Enable high-speed on the card before increasing frequency on the controller
Increasing operating frequency without telling card to switch
to high-speed mode first upsets some cards and generates CRC errors.

While here, deselect / reselect cards after CMD6 and SCR fetch, as in original code.

Approved by:	imp (mentor)
Differential Revision:	https://reviews.freebsd.org/D15568
2018-06-05 11:03:24 +00:00
Eric van Gyzen
2ebb808f8c cam nvme: fix array overrun
Fix a classic array overrun where the index could be one past the end.

Reported by:	Coverity
CID:		1356596
MFC after:	3 days
Sponsored by:	Dell EMC
2018-05-28 03:14:36 +00:00
Alexander Motin
f439e3a4ff Refactor NVMe CAM integration.
- Remove layering violation, when NVMe SIM code accessed CAM internal
device structures to set pointers on controller and namespace data.
Instead make NVMe XPT probe fetch the data directly from hardware.
 - Cleanup NVMe SIM code, fixing support for multiple namespaces per
controller (reporting them as LUNs) and adding controller detach support
and run-time namespace change notifications.
 - Add initial support for namespace change async events.  So far only
in CAM mode, but it allows run-time namespace arrival and departure.
 - Add missing nvme_notify_fail_consumers() call on controller detach.
Together with previous changes this allows NVMe device detach/unplug.

Non-CAM mode still requires a lot of love to stay on par, but at least
CAM mode code should not stay in the way so much, becoming much more
self-sufficient.

Reviewed by:	imp
MFC after:	1 month
Sponsored by:	iXsystems, Inc.
2018-05-25 03:34:33 +00:00
Warner Losh
b1988d44b3 We can't release the refcount outside of the periph lock.
We're dropping the periph lock then dropping the refcount. However,
that violates the locking protocol and is racy. This seems to be
the cause of weird occasional panics with a bogus assert.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D15517
2018-05-24 16:31:18 +00:00
Ilya Bakulin
96e47614f9 Implement initial MMC partitions support for MMCCAM.
For MMC cards, add partitions found on the card as separate disk(9) devices.
Don't do anything with RPMB partition for now.
Lots of code is copied almost 1:1 from the mmcsd.c in the old stack,
credits Marius Strobl (marius@FreeBSD.org)

Reviewed by:	marius
Approved by:	imp (mentor)
Differential Revision:	https://reviews.freebsd.org/D12762
2018-05-22 22:16:49 +00:00
Ilya Bakulin
7fbf511890 Fix MMCCAM scanning for new cards.
r326645 used an incorrect argument for xpt_path_inq().

Reviewed by:	imp (mentor)
Differential Revision:	https://reviews.freebsd.org/D15521
2018-05-22 16:32:34 +00:00
Warner Losh
d9a7a61b2b Hold the reference count until the CCB is released
When a disk disappears and the periph is invalidated, any I/Os that
are pending with the controller can cause a crash when they
complete. Move to holding the softc reference count taken in dastart()
until the I/O is complete rather than only until xpt_action()
returns. (This approach was suggested by Ken Merry.) This extends
the method used in da to ada, nda, and mda.

Sponsored by: Netflix
Submitted by: Chuck Silvers
2018-05-15 22:22:10 +00:00
Warner Losh
0eedd21317 Hold the reference count until the CCB is released
When a disk disappears and the periph is invalidated, any I/Os that
are pending with the controller can cause a crash when they
complete. Move to holding the softc reference count taken in dastart()
until the I/O is complete rather than only until xpt_action()
returns. (This approach was suggested by Ken Merry.)

Sponsored by: Netflix
Submitted by: Chuck Silvers
Differential Revision: https://reviews.freebsd.org/D15435
2018-05-15 21:25:35 +00:00
Li-Wen Hsu
137c41d763 Fix build for platforms using GCC:
- Remove unused or dead store variable
- Remove unused function ctl_copyin_alloc
- Add missing curly brackets, this seems a regression in r287720

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D15383
2018-05-10 17:22:04 +00:00
Marcelo Araujo
8951f05525 Rework CTL frontend & backend options to use nv(3), allow creating multiple
ioctl frontend ports.

This revision introduces two changes to CTL:
- Changes the way options are passed to CTL_LUN_REQ and CTL_PORT_REQ ioctls.
  Removes ctl_be_arg structure and associated logic and replaces it with
  nv(3)-based logic for passing in and out arguments.
- Allows creating multiple ioctl frontend ports using either ctladm(8) or
  ctld(8).
  New frontend ports are represented by /dev/cam/ctl<pp>.<vp> nodes, eg /dev/cam/ctl5.3.
  Those device nodes respond only to CTL_IO ioctl.

New command-line options for ctladm:
# creates new ioctl frontend port with using free pp and vp=0
ctladm port -c
# creates new ioctl frontend port with pp=10 and vp=0
ctladm port -c -O pp=10
# creates new ioctl frontend port with pp=11 and vp=12
ctladm port -c -O pp=11 -O vp=12
# removes port with number 4 (it's a "targ_port" number, not pp number)
ctladm port -r -p 4

New syntax for ctl.conf:
target ... {
    port ioctl/<pp>
    ...
}

target ... {
    port ioctl/<pp>/<vp>
    ...

Note: Most of this work was made by jceel@, thank you.

Submitted by:	jceel
Reworked by:	myself
Reviewed by:	mav (earlier versions and recently during the rework)
Obtained from:  FreeNAS and TrueOS
Relnotes:	Yes
Sponsored by:	iXsystems Inc.
Differential Revision:	https://reviews.freebsd.org/D9299
2018-05-10 03:50:20 +00:00
Warner Losh
041f49aece Remove the 'All Rights Reserved' clause from some of the stuff I've
done for Netflix, since I'm in the neighborhood.
2018-05-09 20:32:23 +00:00
Scott Long
4899b94bac Refactor dadone(). There was no useful code sharing in it; it was just
a 1500 line switch statement.  Callers now specify a discrete completion
handler, though they're still welcome to track state via ccb_state.

Sponsored by:	Netflix
2018-05-01 21:42:27 +00:00
Scott Long
eed99e7557 cam_periph_runccb() changed several years ago to overwrite the ccb callback
pointer.  It's now unhelpful and misleading for callers to continue to set
it, so bring all callers into conformance.  There's no real functional change,
but it makes reading the code a lot less confusing.

Sponsored by:	Netflix
2018-05-01 20:09:29 +00:00
Scott Long
7631477269 Add and fix comments for cam_periph_runccb()
Sponsored by:	Netflix
2018-05-01 17:48:50 +00:00
Warner Losh
c67f3c609b Just assert that the lock is held here, rather than taking it out and
dropping it.

Sponsored by: Netflix
2018-04-13 16:45:35 +00:00
Kenneth D. Merry
fc774835cb Handle Programmable Early Warning for control commands in sa(4).
When the tape position is inside the Early Warning area, the tape
drive will return a sense key of NO SENSE, and an ASC/ASCQ of
0x00,0x02, which means: End-of-partition/medium detected".  If
this was in response to a control command like WRITE FILEMARKS,
we correctly translate this as informational status and return
0 from saerror().

Programmable Early Warning should be handled the same way, but
we weren't handling it that way.  As a result, if a PEW status
(sense key of NO SENSE, ASC/ASCQ of 0x00,0x07, "Programmable early
warning detected") came back in response to a WRITE FILEMARKS,
we returned an error.

The impact of this was that if an application was writing to a
sa(4) device, and a PEW area was set (in the Device Configuration
Extension subpage -- mode page 0x10, subpage 1), and a filemark
needed to be written on close, we could wind up returning an error
to the user on close because of a "failure" to write the filemarks.

It actually isn't a failure, but rather just a status report from
the drive, and shouldn't be treated as a failure.

sys/cam/scsi/scsi_sa.c:
	For control commands in saerror(), treat asc/ascq 0x00,0x07
	the same as 0x00,{0-5} -- not an error.  Return 0, since
	the command actually did succeed.

Reported by:	Dr. Andreas Haakh <andreas@haakh.de>
Tested by:	Dr. Andreas Haakh <andreas@haakh.de>
Sponsored by:	Spectra Logic
MFC after:	3 days
2018-04-12 21:21:18 +00:00
Alexander Motin
d8d4983e5e Do not fail devices just for errors in descriptor format.
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2018-04-06 19:47:44 +00:00
Brooks Davis
6469bdcdb6 Move most of the contents of opt_compat.h to opt_global.h.
opt_compat.h is mentioned in nearly 180 files. In-progress network
driver compabibility improvements may add over 100 more so this is
closer to "just about everywhere" than "only some files" per the
guidance in sys/conf/options.

Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of
sys/compat/linux/*.c.  A fake _COMPAT_LINUX option ensure opt_compat.h
is created on all architectures.

Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the
set of compiled files.

Reviewed by:	kib, cem, jhb, jtl
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14941
2018-04-06 17:35:35 +00:00
Warner Losh
6a6c0d5844 Flag when we have a pending TUR. Don't schedule another one when we
have one pending. Otherwise, we can race and send two, which is
wasteful in close proximity. It can also cause the acaquire/release
count for TUR to be > 1, which is undexpected.

PR: 226510
Differential Review: https://reviews.freebsd.org/D14792
2018-03-23 16:23:15 +00:00
Warner Losh
df4ee7639e Revert r331273: "Release the "TUR" reference when clearing the TUR work flag. We mostly"
It exposes other issues, so revert to the pervious state of known issues.
2018-03-21 12:55:59 +00:00
Warner Losh
7b0eb8dbf8 Release the "TUR" reference when clearing the TUR work flag. We mostly
do this right, except when there's no BP and we do a TUR by request.
In that case, we clear the flag, but don't release the reference,
leaking the reference on rare occasion.

PR: 226510
Sponsored by: Netflix
2018-03-20 22:07:45 +00:00
John Baldwin
e875be212d Use <stdarg.h> instead of <machine/stdarg.h> in userland.
<machine/stdarg.h> is a kernel-only header.  The standard header for
userland is <stdarg.h>.  Using the standard header in userland avoids
weird build errors when building with external compilers that include
their own stdarg.h header.

Reviewed by:	arichardson, brooks, imp
Sponsored by:	DARPA / AFRL
Differential Revision:	https://reviews.freebsd.org/D14776
2018-03-20 21:00:45 +00:00
Warner Losh
400326b667 Kill assert I shouldn't have committed 2018-03-20 13:14:10 +00:00
Warner Losh
afdbfe1e1b Starting LBA is a 64bit number, so use htole64 instead of htole32. The
latter casts the LBA to a 32-bit number before assigning it to the 64
bit structure entity. This works fine on the first 2TB of TRIMs, but
terrible beyond that due to trucation.

Also, add an assert to make sure we don't end too many DSM TRIM
entries in one request.

Sponsored by: Netflix
2018-03-20 03:37:14 +00:00
Warner Losh
6f591d13fd Make kern.cam.nda.num_trim tunable to limit the number of BIO_DELETE
requests that we'll collapse into one DSM_TRIM. By default it is a
256, which is the max that will fit into a 4k page.

Sponsored by: Netflix
2018-03-20 03:37:09 +00:00
Warner Losh
fdfc0a83a3 Remove some redundant MPSAFE flags.
This was pointed out in a code review I'm having trouble finding right
now, but go ahead and eliminate these.

Sponsored by: Netfix
2018-03-20 03:37:04 +00:00
Kenneth D. Merry
0afdc47158 cam_periph_acquire() now returns an errno.
The ch(4) driver was missed in change 328918, which changed
cam_periph_acquire() to return an errno instead of cam_status.

As a result, ch(4) failed to attach.

Sponsored by:	Spectra Logic
2018-03-19 20:19:00 +00:00
Warner Losh
378e38c1cf Only take out the periph lock when we're modifying the flags of the
softc for an async unit attention. CAM locks, sometimes, the periph
lock and other times does not. We were taking the lock always and
running into lock recursion issues on a non-recursive lock. Now we
take it selectively. It's not clear why xpt takes the lock selectively
before calling us, though, and that's still under investigation.

Reported by:	avg
PR:		226510 (same panic, differnt circumstances)
Sponsored by:	Netflix
2018-03-17 16:04:06 +00:00
Edward Tomasz Napierala
6616539dcc Fix iSCSI target crash on session reinstation.
The crash scenario goes like this: there's a thread waiting on "reinstate";
because it doesn't update the timeout counter it gets terminated by the
callout; at this point the maintenance thread starts the termination routine.
The first thread finishes waiting, proceeds to icl_conn_handoff(), and drops
the refcount, which allows the maintenance thread to free its resources.  At
this point another thread receives a PDU.  Boom.

PR:		222898, 219866
Reported by:	Eugene M. Zheganin <emz at norma.perm.ru>
Tested by:	Eugene M. Zheganin <emz at norma.perm.ru>
Reviewed by:	mav@ (earlier version)
MFC after:	2 weeks
Sponsored by:	playkey.net
2018-03-15 17:36:13 +00:00
Warner Losh
d38677d23c Create a sysctl kern.cam.{,a,n}da.X.invalidate
kern.cam.{,a,n}da.X.invalidate=1 forces *daX to detach by calling
cam_periph_invalidate on the underlying periph. This is for testing
purposes only. Include only with options CAM_TEST_FAILURE and rename
the former [AN]DA_TEST_FAILURE, and fix nda to compile with it set.
We're using it at work to harden geom and the buffer cache to be
resilient in the face of drive failure. Today, it far too often
results in a panic. While much work was done on SIM initiated removal
for the USB thumnb drive removal work, little has been done for periph
initiated removal. This simulates what *daerror() does for some errors
nicely: we get the same panics with it that we do with failing drives.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D14581
2018-03-14 17:53:37 +00:00
Warner Losh
157cb465c4 Fix inverted logic that counted all completions as errors, except when
they were actual errors.

Sponsored by: Netflix
2018-03-14 16:44:57 +00:00
Warner Losh
807e94b2c3 Implement trim collapsing in nda
When multiple trims are in the queue, collapse them as much as
possible. At present, this usually results in only a few trims being
collapsed together, but more work on that will make it possible to do
hundreds (up to some configurable max).

Sponsored by: Netflix
2018-03-14 16:44:50 +00:00
Warner Losh
8a3de7bc34 Allow NULL ccb to cam_iosched_bio_complete
When the ccb is NULL to cam_iosched_bio_complete, just update the
other statistics, but not the time. If many operations are collapsed
together, this is needed to keep stats properly for the grouped bp.
This should fix trim accounting.

Sponsored by: Netflix
2018-03-14 16:44:16 +00:00
Brooks Davis
405b67a225 Reject ioctls to SCSI enclosures from 32-bit compat processes.
The ioctl objects contain pointers and require translation and some
refactoring of the infrastructure to work. For now prevent opertion
on garbage values. This is very slightly overbroad in that ENCIOC_INIT
is safe.

Reviewed by:	imp, kib
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14671
2018-03-12 23:02:01 +00:00
Brooks Davis
871dc9833b Reject CAMIOGET and CAMIOQUEUE ioctl's on pass(4) in 32-bit compat mode.
These take a union ccb argument which is full of kernel pointers.
Substantial translation efforts would be required to make this work.
By rejecting the request we avoid processing or returning entierly
wrong data.

Reviewed by:	imp, ken, markj, cem
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14654
2018-03-12 22:58:07 +00:00
Warner Losh
af1823cde8 Tighten up periph lock to avoid some races
Make sure the periph lock is held around rmw access to softc data,
espeically flags, including work flags in iosched.
Add asserts for the periph lock where it should be held.

PR: 226510
Sponsored by: Netflix
Differential Review: https://reviews.freebsd.org/D14456
2018-03-12 15:17:16 +00:00
Conrad Meyer
2e1fccf2cf nvme_da: Fix minor memory leak in error case
Reported by:	cppcheck
Sponsored by:	Dell EMC Isilon
2018-03-10 01:28:55 +00:00
Warner Losh
2d87718fda Use bool instead of int for predicate functions relating to work
available.
2018-02-23 16:06:54 +00:00
Wojciech Macek
0d787e9b35 NVMe: Add big-endian support
Remove bitfields from defined structures as they are not portable.
Instead use shift and mask macros in the driver and nvmecontrol application.

NVMe is now working on powerpc64 host.

Submitted by:          Michal Stanek <mst@semihalf.com>
Obtained from:         Semihalf
Reviewed by:           imp, wma
Sponsored by:          IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D13916
2018-02-22 13:32:31 +00:00
Warner Losh
07e5967a22 Revert r329814 as well. It should have been in r329819. 2018-02-22 11:51:50 +00:00
Warner Losh
0028abe633 Backout r329818, r329816 and r329815.
These aren't the commits I thought I was testing prior to
commit. Revert until I can sort out what happened and fix it.
2018-02-22 11:18:33 +00:00
Warner Losh
91acaad987 Fix typo in last commit after last rebase before commit... 2018-02-22 10:55:23 +00:00
Warner Losh
4d87e27125 Combine BIO_DELETE requests for nda devices
Now that we're queueing BIO_DELETE requests in the CAM I/O scheduler,
it make sense to try to combine as many as possible into a single
request to send down to hardware. Hopefully, lots of larger requests
like this are better than lots of individual transactions.

Note for future: need to limit based on total size of the trim
request. Should also collapse adjacent ranges where possible to
increase the size of the max payload.

Sponsored by: Netflix
2018-02-22 05:44:00 +00:00
Warner Losh
c5fe3ae9b8 Introduce capacity flags for periphs
Introduce flags word to describe the capacities of the peripheral.
First bit will describe if the periph driver allows multiple
outstanding TRIMS to be active in a device.

Modify the I/O scheduler so that the nda driver can queue trims
for a while after the first one arrives. We'll queue until we see
a I/O scheduler tick, then we'll schedule as many TRIMs as allowed
by other factors (currently this is slocts in the NVMe controller).
This mariginally helps the read latency issues we see with reads,
but sets the stage for the nda driver to do TRIM collapsing like the
da and ada drivers do today.

Sponsored by: Netflix
2018-02-22 05:43:55 +00:00
Warner Losh
c9878d6d63 Note when we tick.
To help implement a policy of 'queue all trims until next I/O sched
tick' policy to help coalesce them, note when we tick so we can do
something special on the first call after the tick to get more work.

Sponsored by: Netflix
2018-02-22 05:43:50 +00:00
Warner Losh
f2b9885036 Wrap an extra long line
This debugging line is too big for even my largest xterm. wrap it at
about 80 columns.

Sponsored by: Netflix
2018-02-22 05:43:45 +00:00
Warner Losh
97f8aa050e Don't sort TRIMs.
While the code for ada and da both assume that the trim list is
ordered when doing the coaleascing the TRIMs, it turns out that
creating the sorted list uses more resources than are saved by having
slightly fewer trims sent to the device.

Sponsored by: Netflix
2018-02-22 05:43:20 +00:00
Warner Losh
6ffe523f18 Minor formatting nits. 2018-02-21 23:49:35 +00:00
Edward Tomasz Napierala
8404ad78db Use proper buffer length (the announce_buf char pointer used to be anarray),
broken in r317143. This fixes those weird "cd0: Attempt" messages at boot.

PR:		222103
Reviewed by:	scottl@
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14369
2018-02-21 14:05:13 +00:00
Warner Losh
bc40691e40 Report the number of remaining retries when we have an error that
we're retrying.
2018-02-15 18:57:54 +00:00
Warner Losh
9d602e4e3d Fix cut and pasted comments to reflect differences in code from the
original source.

Sponsored by: Netflix
2018-02-07 18:33:46 +00:00
Warner Losh
c4b72d8b37 Keep a counter for number of requests completed with an error.
Sponsored by: Netflix
2018-02-06 23:21:08 +00:00
Scott Long
99e7a4ad9e Return a C errno for cam_periph_acquire().
There's no compelling reason to return a cam_status type for this
function and doing so only creates confusion with normal C
coding practices. It's technically an API change, but the periph API
isn't widely used. No efffective change to operation.

Reviewed by:	imp, mav, ken
Sponsored by:	Netflix
Differential Revision:	D14063
2018-02-06 06:42:25 +00:00
Warner Losh
de4f4237bf Do the book-keeping on release before we release the reference. The
periph was going away on final release, and then returning and we
started dancing in free memory.

Sponsored by: Netflix
2018-01-29 18:07:14 +00:00
Scott Long
da2f5dfb35 Finish the incomplete move of CAM_PERIPH_PRINT().
Reported by:	kevans
2018-01-27 07:18:02 +00:00
Scott Long
15747cacb4 Move CAM_PERIPH_PRINT() to cam_periph.h 2018-01-26 23:56:07 +00:00
Warner Losh
60b7691d56 Fix a sleepable malloc in ndastart. We shouldn't be sleeping
here. Return ENOMEM when we can't malloc a buffer for the DSM
TRIM. This should fix the WITNESS warnings similar to the following:

uma_zalloc_arg: zone "16" with the following non-sleepable locks held:
exclusive sleep mutex CAM device lock (CAM device lock) r = 0 (0xfffff800080c34d0) locked @ /usr/src/sys/cam/nvme/nvme_da.c:351

Reviewed by: scottl@
Sponsored by: Netflix
2018-01-26 23:14:46 +00:00
Scott Long
074cc5f66d Fix a cut-and-paste error in a panic message 2018-01-26 18:42:28 +00:00
Justin Hibbits
a99028fc70 Minimum changes for ctl to build on architectures with non-matching physical and
virtual address sizes

Summary:
Some architectures use physical addresses larger than virtual.  This is the
minimal changeset needed to get CAM/CTL to build on these targets.  No
functional changes.  More changes would likely be needed for this to be fully
functional on said platforms, but they can be made when needed.

Reviewed By:	mav, chuck
Differential Revision:	https://reviews.freebsd.org/D14041
2018-01-26 00:58:02 +00:00
Warner Losh
d047fd281d Track Ref / DeRef and Hold / Unhold that da is doing to track down
leaks. We assume each source can be taken / dropped only once and
don't recurse. These are only enabled via DA_TRACK_REFS or
INVARIANTS. There appreas to be a reference leak under extreme load,
and these should help us colaberatively work it out. It also documents
better the reference / holding protocol better.

Reviewed by: ken@, scottl@
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D14040
2018-01-25 21:38:30 +00:00
Warner Losh
de64cba2c7 When devices are invalidated, there's some cases where ccbs for that
device still wind up in xpt_done after the path has been
invalidated. Since we don't always need sim or devq, add some guard
rails to only fail if we have to use them.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D14040
2018-01-25 21:38:09 +00:00
Warner Losh
8f182e3ff6 Minor whitespace cleanup to remove leading space before tab. No
functional changes.
2018-01-25 02:52:44 +00:00
Warner Losh
6a86483da1 This comment is bogus. This is a legit release.
Reviewed by: scottl@, ken@
Sponsored by: Netflix
2018-01-22 17:47:49 +00:00
Pedro F. Giffuni
d821d36419 Unsign some values related to allocation.
When allocating memory through malloc(9), we always expect the amount of
memory requested to be unsigned as a negative value would either stand for
an error or an overflow.
Unsign some values, found when considering the use of mallocarray(9), to
avoid unnecessary casting. Also consider that indexes should be of
at least the same size/type as the upper limit they pretend to index.

MFC after:	3 weeks
2018-01-22 02:08:10 +00:00
Pedro F. Giffuni
ac2fffa4b7 Revert r327828, r327949, r327953, r328016-r328026, r328041:
Uses of mallocarray(9).

The use of mallocarray(9) has rocketed the required swap to build FreeBSD.
This is likely caused by the allocation size attributes which put extra pressure
on the compiler.

Given that most of these checks are superfluous we have to choose better
where to use mallocarray(9). We still have more uses of mallocarray(9) but
hopefully this is enough to bring swap usage to a reasonable level.

Reported by:	wosch
PR:		225197
2018-01-21 15:42:36 +00:00
Scott Long
c35fca48b3 Fix compile errors in r328165
Reported by:	O. Hartmann
Sponsored by:	Netflix
2018-01-19 19:18:14 +00:00
Scott Long
19641ce893 Revert ABI breakage to CAM that came in with MMC/SD support in r320844.
Make it possible to retrieve mmc parameters via the XPT_GET_ADVINFO
call instead.  Convert camcontrol to the new scheme.

Reviewed by:	imp. kibab
Sponsored by:	Netflix
Differential Revision:	D13868
2018-01-19 15:32:27 +00:00
Pedro F. Giffuni
f24882eca5 SPDX: finish tagging sys/cam. 2018-01-16 23:19:57 +00:00
Pedro F. Giffuni
b6eb160014 scsi_ch.c: Small cleanups to the comments.
Move the the NetBSD tag near to the related licence. Update it to reflect
better the point where we started diverging.

Use grouping parenthesis for the SPDX tag.

No functional change.
2018-01-16 23:08:25 +00:00
Pedro F. Giffuni
0699955838 cam: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these ire likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.

X-Differential revision: https://reviews.freebsd.org/D13837
2018-01-15 21:15:25 +00:00
Andriy Gapon
6ce374aa94 geom_disk / scsi_da: deny opening write-protected disks for writing
Ths change consists of two parts.

geom_disk: deny opening a disk for writing if it's marked as
write-protected.  A new disk(9) flag is added to mark write protected
disks.  A possible alternative could be to add another parameter to d_open,
so that the open mode could be passed to it and the disk drivers could
make the decision internally, but the flag required less churn.

scsi_da: add a new phase of disk probing to query the all pages mode
sense page.  We can determine if the disk is write protected using bit 7
of the device specific field in the mode parameter header returned by
MODE SENSE.

PR:		224037
Reviewed by:	mav
MFC after:	4 weeks
Differential Revision: https://reviews.freebsd.org/D13360
2018-01-15 11:20:00 +00:00
Warner Losh
045f8bc8e4 When we crash, we'll stop the scheduler before we call the
shutdown_post_sync event.  For adashutdown, this causes problems
because we need to poll for completion of the commands, but we're not
yet officially dumping yet, so the code from r326964 assumed we could
use the interrupt-driven commands rather than the polled ones. This
lead to a hang. Prevent this by also checking to see if the scheduler
is stopped to do the polling.

Reported by: markj@
Sponsored by: Netflix
Differential Review: https://reviews.freebsd.org/D13845
2018-01-11 03:11:41 +00:00
Scott Long
876f6a6af2 Release the held refcount on the probe periph when probing is
done, now that r327741 lets this happen.

Obtained from:	Netflix
2018-01-09 21:24:05 +00:00
Scott Long
7c1374d5f5 Hold a refcount on the periph while running the allocation
queue.  This will allow sub-transports to release their
probe pseudo-device with fewer convoluted restrictions.

Obtained from:	Netflix
2018-01-09 21:23:16 +00:00
Warner Losh
9c5a114886 Remove ccbque.h from i386/isa.
inline ccbque.h into scsi_low.h. The file isn't MD, so shouldn't live
in i386/isa. It's only used by scsi_low, so move it there so no new
clients accidentally grow. scsi_low may not even still work, and the
locking here is still SPL based. CAM should do the right thing, but
I've received no reports of these cards still working. At least it
compiles still and there's one fewer files in sys/i386/isa. While I'm
here, ansify and de-splize. CCB_MWANTED appears to be a clear-only
flag, but I've not changed that.

Differential Review: https://reviews.freebsd.org/D13672
2018-01-09 16:11:33 +00:00
Scott Long
bff0b56cdf Don't hold the periph locks during dump.
Obtained from:	Netflix
2018-01-09 00:17:15 +00:00
Scott Long
04e814aecd Don't hold the periph lock when calling into cam_periph_runccb()
from the ada and da dump routines.  This avoids difficult locking
problems from needing to be handled.  While it might seem like this
would leave the periphs unprotected during dump, they were aleady
at risk of unexpected removal due to the dump functions not
keeping refcount state across the many calls that come in during
a dump.  This is an exercise for future work.

Obtained from:	Netflix
2018-01-09 00:10:59 +00:00
Scott Long
329e7a8b51 Protect against a possible NULL deference from an accessor
function.

Obtained from:	Netflix
2018-01-09 00:00:55 +00:00
Eitan Adler
d734a3dfcd cam/da: QUIRK: Add 4K quirks for WD Red and Black MHDDs
PR:		188685
Submitted by:	Jeremy Chadwick <jdc@koitsu.org>
Reported by:	Martin Birgmeier <d8zNeCFG@aon.at>
2018-01-05 07:14:39 +00:00
Emmanuel Vadot
8e3f1dcf54 ctl: Correct comment in ctl_worker_thread
The incoming queue is handled before the RtR one.
No functional change.

MFC after:	3 days
2017-12-27 15:39:31 +00:00
Alexander Kabaev
151ba7933a Do pass removing some write-only variables from the kernel.
This reduces noise when kernel is compiled by newer GCC versions,
such as one used by external toolchain ports.

Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial)
Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c)
Differential Revision: https://reviews.freebsd.org/D10385
2017-12-25 04:48:39 +00:00
Warner Losh
4484c8f5d2 Return domain, bus, slot, and function for the transport settings in
PATH_INQ requests for nvme.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D13546
2017-12-20 19:13:55 +00:00
Warner Losh
5cf3cd108f When doing a dump, the scheduler is normally not running, so this
changed worked to capture dumps for me. However, the test for
SCHEDULER_STOPPED() isn't right. We can also call the dump routine
from ddb, in which case the scheduler is still running. This leads to
an assertion panic that we're sleeping when we shouldn't. Instead, use
the proper test for dumping or not. This brings us in line with other
places that do special things while we're doing polled I/O like this.

Noticed by: pho@
Differential Revision: https://reviews.freebsd.org/D13531
2017-12-19 04:13:22 +00:00
Alexander Motin
4d4709520a Reduce size of several on-stack string buffers.
Submitted by:	Dmitry Luhtionov <dmitryluhtionov@gmail.com>
MFC after:	2 weeks
2017-12-13 21:17:00 +00:00
Warner Losh
05972baf11 Use ataio ccb instead of general ccb to avoid excessice stack usage. 2017-12-13 07:07:27 +00:00
Warner Losh
762a7f4f5f Define xpt_path_inq.
This provides a nice wrarpper around the XPT_PATH_INQ ccb creation and
calling.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D13387
2017-12-06 23:05:22 +00:00
Warner Losh
2b31251a64 Now that cam_periph_runccb() can be called from situations where the
kernel scheduler is stopped, replace the by hand calling of
xpt_polled_action() with it.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D13388
2017-12-06 23:05:15 +00:00
Warner Losh
f93a843cd2 Make cam_periph_runccb be safe to call when we can only do polling.
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D13388
2017-12-06 23:05:07 +00:00
Alan Somers
99ae1d3b25 cam: fix sign-extension error in adagetparams
adagetparams contains a sign-extension error that will cause the sector
count to be incorrectly calculated for ATA disks of >=1TiB that still use
CHS addressing. Disks using LBA48 addressing are unaffected.

Reported by:	Coverity
CID:		1007296
Reviewed by:	ken
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
Differential Revision:	https://reviews.freebsd.org/D13198
2017-12-06 17:01:25 +00:00
Warner Losh
553484ae07 Remove unused 4th argument to match the standard error routines.
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D13386
2017-12-06 00:29:50 +00:00
Warner Losh
d2f3208dda Add NVME as a known device type for devstat processing.
Also, reduce the amount of cut and pasted code a little since only two
args are different in the devstat_end_transaction calls.

Sponsored by: Netflix
2017-12-06 00:29:43 +00:00
Warner Losh
b836edd8b8 Remove stray cam_periph_async call. It's called twice this way. While
currently harmless for AC_UNIT_ATTENTION event (cam_periph_async does
nothing with them), it's still in error because if it were to start in
the future, it would be done twice.

Sponsored by: Netflix
2017-12-05 23:02:31 +00:00
Pedro F. Giffuni
bec9534d1d sys/cam: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-27 15:12:43 +00:00
Alan Somers
b0f662fed3 Always null-terminate CAM periph_name and dev_name
Reported by:	Coverity
CID:		1010039, 1010040, 1010041, 1010043
Reviewed by:	ken, imp
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
Differential Revision:	https://reviews.freebsd.org/D13194
2017-11-22 19:57:34 +00:00
Alan Somers
e7dda951a4 Fix uninitialized variable from 326034
Reported by:	Coverity
CID:		1382887
MFC after:	20 days
X-MFC-With:	326034
Sponsored by:	Spectra Logic Corp
2017-11-21 16:38:30 +00:00
Alan Somers
a4557b0509 Quirk Seagate ST8000AS0003-2HH
Like its predecessor ST8000AS0002, this is a drive-managed SMR drive, but
doesn't declare that in its ATA identify data.

MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
2017-11-20 23:45:42 +00:00
Alan Somers
c95dc95b35 da(4): Short-circuit unnecessary BIO_FLUSH commands
sys/cam/scsi/scsi_da.c
	Complete BIO_FLUSH commands immediately if the da(4) device hasn't
	been written to since the last flush. If we haven't written to the
	device, there is no reason to send a flush.

Submitted by:	gibbs
Reviewed by:	imp
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
Differential Revision:	https://reviews.freebsd.org/D13106
2017-11-20 22:27:33 +00:00
Alan Somers
8a0a413e12 Fix multiple bugs in cam_strmatch
* Wrongly matches strings that are shorter than the pattern
* Fails to match negative character sets
* Fails to match character sets that aren't at the end of the pattern
* Fails to match character ranges

Reviewed by:	imp
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
Differential Revision:	https://reviews.freebsd.org/D13173
2017-11-20 22:01:45 +00:00
Alan Somers
1a69c11aab Add assertion in probedone() that we're holding the device lock.
Submitted by:	ken
Reviewed by:	asomers
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
2017-11-17 20:53:52 +00:00
Alan Somers
602740b689 Fix potential NULL pointer dereference of device physical path
In scsi_dev_advinfo(), if the physical path is being stored and there is a
malloc failure (malloc(9) is called with M_NOWAIT), we could wind up in a
situation where the device's physpath_len is set to the length the user
provided, but the physpath itself is NULL.

If another context then comes in to fetch the physical path value, we would
wind up trying to memcpy a NULL pointer into the caller's buffer.

So, set the physpath_len to 0 when we free the physpath on entry into the
store case for the physical path.  Reset the length to a non-zero value only
after we've successfully malloced a buffer to hold it.

Submitted by:	ken
Reviewed by:	asomers
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
2017-11-17 17:13:00 +00:00
Baptiste Daroussin
16f92ad234 Add some 4k quirks for Samsung pm863a SSDs
Submitted by:	Nikita Kozlov <nikita.kozlov at blade-group.com>
MFC after:	3 days
Sponsored by:	blade
Differential Revision:	https://reviews.freebsd.org/D13093
2017-11-16 10:15:17 +00:00
Alan Somers
30083240bc Remove a double free(9) in xpt_bus_register
In xpt_bus_register(), remove superfluous call to free().  This was mostly
benign since free(9) checks for NULL before doing anything, and
xpt_create_path() is nice enough to NULL out the pointer on failure.
However, it could've segfaulted if malloc(9) failed during
xpt_create_path().

Submitted by:	gibbs
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
2017-11-15 15:52:06 +00:00
Warner Losh
5e8a39f6ab Properly decode NVMe state of the drive and print out the information
in the attach to more closely match what SCSI and ATA attached
storage provides.

Sponsored by: Netflix
2017-11-14 05:05:26 +00:00
Warner Losh
4e3b274457 Provide link speed data in XPT_GET_TRAN_SETTINGS. Provide full version
information for that and XPT_PATH_INQ. Provide macros to encode/decode
major/minor versions.  Read the link speed and lane count to compute
the base_transfer_speed for XPT_PATH_INQ.

Sponsored by: Netflix
2017-11-14 05:05:16 +00:00
Emmanuel Vadot
530fdf6771 ctl: Make max_luns and max_ports tunable variables instead of hardcoded
defines.

Reviewed by:	trasz (earlier version), bapt (earlier version), bcr (manpages)
MFC after:	2 Weeks
Sponsored by:	Gandi.net
Differential Revision:	https://reviews.freebsd.org/D12836
2017-11-07 16:59:52 +00:00
Warner Losh
50ee2e2aab Send IDLE IMMEDIATE for warm boot.
We must send either an IDLE IMMEDIATE or a STANDBY IMMEDIATE to drives
on warm boot so their SMART and other volatile data is
persisted. However, for a warm boot we don't want to send STANDBY
IMMEDIATE to some spinning drives because they will spin down. If
there's a lot of these drives on the system, that can cause a
thundering herd problem at startup time (that in extreme cases causes
timeout in device discovery).

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D12811
2017-10-30 03:25:22 +00:00
Warner Losh
712ad71996 nvd alias has caused some problems, revert it for the moment.
Sponsored by: Netflix
2017-10-27 14:57:38 +00:00
Warner Losh
c136b6ecee We should be call adaerror() instead of cam_periph_error() always.
Sponsored by: Netflix
2017-10-26 22:53:55 +00:00
Warner Losh
f902366c48 Always send STANDBY IMMEDIATE when shutting down
To save SMART data and for a drive to understand that it's been nicely
shutdown, we need to send a STANDBY IMMEDIATE. Modify adaspindown to
use a local CCB on the stack. When we're panicing, used
xpt_polled_action rather than cam_periph_runccb so that we can SEND
IMMEDIATE after we've shutdown the scheduler.

Sponsored by: Netflix
Reviewed by: scottl@, gallatin@
Differential Revision: https://reviews.freebsd.org/D12799
2017-10-26 22:53:49 +00:00
Warner Losh
6a9fbe3a70 Handle RB_POWERCYCLE in ada driver
Allow the disks to be spun down when doing a POWERCYCLE as well as
POWEROFF.

Sponsored by: Netflix
2017-10-25 15:30:48 +00:00
Warner Losh
6ca2fb6623 Treat a 'current' value of 0 as unlimited as a failsfe.
When limiting I/O, a value of 0 makes no sense as a limit. No progress
can be made. Trade the possibility that someone might be doing
something clever to achieve ultra-low I/O limits vs the damage of not
ever making progress on an I/O in favor of making progress. Now the
machine won't be useless if this accidentally gets requested.

Sponsored by: Netflix
2017-10-24 02:25:42 +00:00
Warner Losh
1f88be2d3a Zero out the ccb's alloated on the stack for the dump routines to more
closely match a ccb returned from xpt_get_ccb().

Sponsored by: Netflix
2017-10-15 23:54:04 +00:00
Warner Losh
fa271a5d09 Closer examination shows that nvme and CAM both normally zero-fill
allocations (for req and ccb, which ultimately contain the
nvme_cmd). As such, we can micro-optimize these routines. Add a
comment to this effect, and bzero the ccb used to make the requests
for the nda dump rotuine so it more closely matches a ccb allocated
with xpt_get_ccb().

Sponsored by: Netflix
2017-10-15 23:53:55 +00:00
Warner Losh
c4231018d0 Be nicer on the dump stack by allocating only a ccb_nvmeio rather than
a full ccb. This saves a few hundre bytes, which might be important
during a crash dump...

Sponsored by: Netflix
Suggested by: scottl@
2017-10-15 16:18:03 +00:00
Warner Losh
717bff5d85 Update comment to reflect actual default timeout.
Sponsored by: Netflix
2017-10-15 16:17:55 +00:00
Edward Tomasz Napierala
5b4bc31eee Fix iSCSI target panics on concurrent session teardown and display
(eg removing a target and doing "ctladm islist -v" at the same time).

Reviewed by:	manu
Tested by:	manu
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2017-10-04 11:35:04 +00:00
Alexander Motin
748f120f7c Add sysctl/tunable for maximal request time.
MFC after:	1 week
2017-09-30 13:17:31 +00:00
Warner Losh
78ed811e6c cam iosched: Bettar account IOPS for smoother performance
Prevent cam_iosched_iops_tick() from discarding 'unspent' ios unless
it's a new accounting interval.

Previously ios that weren't used between ticks were lost, as a result
the iops limiter could enforce a limit below the configured maximum.

Obtained from: ElectroBSD
Submitted by: Fabian Keil
PR: 221974
2017-09-22 02:36:36 +00:00