Commit Graph

1131 Commits

Author SHA1 Message Date
imp
606deda9b7 Fix spelling of removable 2020-01-29 00:28:50 +00:00
arrowd
b4fe107b5b Fix typo: MANGAEMENT_PROTOCOL_OUT -> MANAGEMENT_PROTOCOL_OUT.
Approved by:	allanjude
2020-01-09 15:21:42 +00:00
imp
72481ca7a4 Revert r355813
It was extracted from a larger tree and is incomplete. Will resubmit after
reworking.
2019-12-16 19:16:26 +00:00
imp
5fa79c6768 Implement a system-wide limit or da and ada devices for delete.
Excesively large TRIMs can result in timeouts, which cause big
problems. Limit trims to 1GB to mititgate these issues.

Reviewed by: scottl
Differential Revision: https://reviews.freebsd.org/D22809
2019-12-16 18:16:44 +00:00
jhb
a224552510 Use callout_func_t instead of the deprecated timeout_t.
Reviewed by:	kib, imp
Differential Revision:	https://reviews.freebsd.org/D22752
2019-12-10 22:06:53 +00:00
asomers
6e2c3cd842 ses: sanitize illegal strings in SES element descriptors
The SES4r3 standard requires that element descriptors may only contain ASCII
characters in the range 0x20 to 0x7e.  Some SuperMicro expanders violate
that rule.  This patch adds a sanity check to ses(4).  Descriptors in
violation will be replaced by "<invalid>".

This patch fixes "sesutil --libxo xml" on such systems.  Previously it would
generate non-well-formed XML output.

PR:		241929
Reviewed by:	allanjude
MFC after:	2 weeks
Sponsored by:	Axcient
2019-12-06 00:06:05 +00:00
ken
3d73341d00 Fix a hang introduced in r351599.
My changes in 351599 (kindly committed by avg) made the cd(4) media check
asynchronous to avoid a sleep while holding a mutex.

There was a difficult to reproduce bug with those changes that caused a
hang on boot on some single processor machines/VMs.  Leandro Lupori
managed to reproduce the bug, diagnose it, and supplied a patch!  Here is
his analysis, from the PR:

======
I was able to reproduce the problem described in comment#14.

Actually, I wasn't trying to reproduce it, I just started seeing it a few
weeks ago, in CURRENT.

I can reproduce it consistently, by using QEMU to run a PowerPC64 VM with a
single core/thread (-smp 1).

It happens only when there is no media in the emulated CD-ROM, a device
that QEMU adds by default, unless -nodefaults is specified in command line.

I've debugged it and this is what I've found:

1- After the CD probe is successful, GEOM will try to open the device,
which will end up calling cdcheckmedia(), that sets CD state to
CD_STATE_MEDIA_PREVENT.
2- Next, scsi_prevent() is executed and succeeds, the CD_FLAG_DISC_LOCKED
flag is set and CD state moves to CD_STATE_MEDIA_SIZE.
3- Next, scsi_read_capacity() is executed and fails, state is set to
CD_STATE_MEDIA_ALLOW, cdmediaprobedone() is called and wakes up
cdcheckmedia().
4- Then, when cdstart() is invoked to process CD_STATE_MEDIA_ALLOW, it
first checks if CD_FLAG_DISC_LOCKED is set, and if so skips directly to
CD_STATE_MEDIA_SIZE state. This will repeat the steps of bullet 3, entering
an infinite MEDIA_SIZE command loop.

When there is a least another core/thread, the GEOM thread that performed
the initial cdopen() will get scheduled again, closing the CD device, that
will call cdprevent(PR_ALLOW) that clears the CD_FLAG_DISC_LOCKED flag and
breaks the loop.

So, apparently, the problem is CD_STATE_MEDIA_ALLOW being skipped when
CD_FLAG_DISC_LOCKED is set. If I understand correctly, in this case, the
state should be advanced to CD_STATE_MEDIA size only when the current state
is CD_STATE_MEDIA_PREVENT.
=====

PR:		kern/219857
Submitted by:	Leandro Lupori <leandro.lupori@gmail.com>
MFC after:	1 week
2019-12-02 19:57:39 +00:00
mav
4a46b2449c Make CAM use root_mount_hold_token() to delay boot.
Before this change CAM used config_intrhook_establish() for this purpose,
but that approach does not allow to delay it again after releasing once.

USB stack uses root_mount_hold() to delay boot until bus scan is complete.
But once it is, CAM had no time to scan SCSI bus, registered by umass(4),
if it already done other scans and called config_intrhook_disestablish().
The new approach makes it work smooth, assuming the USB device is found
during the initial bus scan.  Devices appearing on USB bus later may still
require setting kern.cam.boot_delay, but hopefully those are minority.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-11-22 18:39:51 +00:00
scottl
39d1a64ba3 Remove NEEDGIANT from the scsi_sg /dev node. It likely has not been
needed for many years.

Reported by:	imp
2019-11-22 18:18:36 +00:00
mav
f1c3864b6b Set handling for some "Logical unit not ready" errors.
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-11-20 20:00:03 +00:00
imp
aa00f25b38 Fix a race between daopen and damediapoll
When we do a daopen, we call dareprobe and wait for the results. The repoll runs
the da state machine up through the DA_STATE_RC* and then exits.

For removable media, we poll the device every 3 seconds with a TUR to see if it
has disappeared. This introduces a race. If the removable device has lots of
partitions, and if it's a little slow (like say a USB2 connected USB stick),
then we can have a fair amount of time that this reporbe is going on for. If,
during that time, damediapoll fires, it calls daschedule which changes the
scheduling priority from NONE to NORMAL. When that happens, the careful single
stepping in the da state machine is disrupted and we wind up sceduling multiple
read capacity calls. The first one succeeds and releases the reference. The
second one succeeds and releases the reference (and panics if the right code is
compiled into the da driver).

To avoid the race, only do the TUR calls while in state normal, otherwise just
reschedule damediapoll. This prevents the race from happening.
2019-11-13 01:58:43 +00:00
imp
a2e7a62d8f Add asserts for some state transitions
For the PROBEWP and PROBERC* states, add assertiosn that both the da device
state is in the right state, as well as the ccb state is the right one when we
enter dadone_probe{wp,rc}. This will ensure that we don't sneak through when
we're re-probing the size and write protection status of the device and thereby
leak a reference which can later lead to an invalidated peripheral going away
before all references are released (and resulting panic).

Reviewed by: scottl, ken
Differential Revision: https://reviews.freebsd.org/D22295
2019-11-11 17:36:57 +00:00
imp
60285ad235 Update the softc state of the da driver before releasing the CCB.
There are contexts where releasing the ccb triggers dastart() to be run
inline. When da was written, there was always a deferral, so it didn't matter
much. Now, with direct dispatch, we can call dastart from the dadone*
routines. If the probe state isn't updated, then dastart will redo things with
stale information. This normally isn't a problem, because we run the probe state
machine once at boot... Except that we also run it for each open of the device,
which means we can have multiple threads racing each other to try to kick off
the probe. However, if we update the state before we release the CCB, we can
avoid the race. While it's needed only for the probewp and proberc* states, do
it everywhere because it won't hurt the other places.

The race here happens because we reprobe dozens of times on boot when drives
have lots of partitions.  We should consider caching this info for 1-2 seconds
to avoid this thundering hurd.

Reviewed by: scottl, ken
Differential Revision: https://reviews.freebsd.org/D22295
2019-11-11 17:36:52 +00:00
imp
e137ffd166 Require and enforce that dareprobe() has to be called with the periph lock held.
Reviewed by: scottl, ken
Differential Revision: https://reviews.freebsd.org/D22295
2019-11-11 17:36:47 +00:00
imp
1572f32fc8 Fix panic message to indicate right action that was improper.
Reviewed by: scottl, ken
Differential Revision: https://reviews.freebsd.org/D22295
2019-11-11 17:36:42 +00:00
trasz
76b010eaf2 Add GEOM attribute to report physical device name, and report it
via 'diskinfo -v'.  This avoids the need to track it down via CAM,
and should also work for disks that don't use CAM.  And since it's
inherited thru the GEOM hierarchy, in most cases one doesn't need
to walk the GEOM graph either, eg you can use it on a partition
instead of disk itself.

Reviewed by:	allanjude, imp
Sponsored by:	Klara Inc
Differential Revision:	https://reviews.freebsd.org/D22249
2019-11-09 17:30:19 +00:00
mav
ae9c9b703a Add kern.cam.da.X.quirks tunable, similar existing for ada.
Submitted by:	Michael Lass
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D20677
2019-09-26 14:48:39 +00:00
mav
3157c6b68c Fix assumptions of only one device per SES slot.
It is typical to have one, but no longer true for multi-actuator HDDs
with separate LUN for each actuator.

MFC after:	4 days
Sponsored by:	iXsystems, Inc.
2019-09-11 03:25:30 +00:00
mav
f4dc60c4d5 Supply SAT layer with valid transfer sizes.
This is a rework of r344701, that noticed that number of bytes passes to
8 bit sector count field gets truncated.  First decision was to not pass
anything, since ATA specs define the field as N/A.  But it appeared to be a
problem for some SAT devices, that require information about data transfer
to operate properly.  Some additional investigation shown that it is quite
a common practice to set unused fields of ATA commands (fortunately ATA
specs formally allow it) to supply the information to SAT layer.  I have
found SAS-SATA interposer that does not allow pass-through without it.

As side effect, reduce code duplication by removing ata_do_28bit_cmd()
function, replacing it with more universal ata_do_cmd().

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2019-09-07 15:56:00 +00:00
mav
677a318a1e Take proper lock in ses_setphyspath_callback().
XPT_DEV_ADVINFO call should be protected by the lock of the specific
device it is addressed to, not the lock of SES device.  In some weird
case, probably with hardware violating standards, it sometimes caused
NULL dereference due to race.

To protect from it further, add lock assertion to *_dev_advinfo().

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2019-08-29 17:02:02 +00:00
avg
e1c624c9b6 scsi_cd: whitespace cleanup
Remove trailing whitespace and fix mixed indentation.

MFC after:	3 weeks
2019-08-29 08:26:40 +00:00
avg
818549d232 scsi_cd: ifdef out cdsize()
It was used only by the old cdcheckmedia().

MFC after:	3 weeks
2019-08-29 08:19:11 +00:00
avg
01f4cc17c5 scsi_cd: make the media check asynchronous
This makes the media check process asynchronous, so we no longer block
in cdstrategy() to check for media.

PR:		219857
Obtained from:	ken
MFC after:	3 weeks
2019-08-29 07:51:11 +00:00
mav
30b189e686 Always check cam_periph_error() status for ERESTART.
Even if we do not expect retries, we better be sure, since otherwise it
may result in use after free kernel panic.  I've noticed that it retries
SCSI_STATUS_BUSY even with SF_NO_RECOVERY | SF_NO_RETRY.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2019-08-27 16:41:06 +00:00
mav
8da32f95df Make camcontrol modepage support block descriptors.
It allows to read and write block descriptors alike to mode page parameters.
It allows to change block size or short-stroke HDDs or overprovision SSDs.
Depenting on -P parameter the change can be either persistent or till reset.
In case of block size change device may need reformat after the setting.
In case of SSD overprovisioning format or sanitize may be needed to really
free the flash.

During implementation appeared that csio_encode_visit() can not handle
integers of more then 4 bytes, that makes 8-byte LBA handling awkward.
I had to split it into two 4-byte halves now.

MFC after:	1 week
Relnotes:	yes
Sponsored by:	iXsystems, Inc.
2019-08-07 14:45:10 +00:00
mav
336ad2a4b3 Allow WRITE SAME handle more then 2^^32 blocks.
If not limited by write_same_max_lba option, split operation into several
2^^31 blocks chunks in a loop.  For large disks it may take a while, so
setting write_same_max_lba may be useful to avoid timeouts.

While there, fix build with CAM_CTL_DEBUG.

MFC after:	2 weeks
2019-07-27 17:27:26 +00:00
mav
64c9858a51 Add support for Long LBA mode parameter block descriptor.
It is formally required for SBC Base 2016 feature set.

MFC after:	2 weeks
2019-07-26 19:14:12 +00:00
mav
a516a7f6bf Add device temperature reporting into CTL.
The values to report can be set via LUN options.  It can be useful for
testing, and also required for Drive Maintenance 2016 feature set.

MFC after:	2 weeks
2019-07-26 03:49:16 +00:00
mav
a9029f5dc8 Add reporting of SCSI Feature Sets VPD page from SPC-5.
CTL implements all defined feature sets except Drive Maintenance 2016,
which is not very applicable to such a virtual device, and implemented
only partially now.  But may be it could be fixed later at least for
completeness.

MFC after:	2 weeks
2019-07-26 01:49:28 +00:00
mav
a17d030dcc Make camcontrol sanitize support also ATA devices.
ATA sanitize is functionally identical to SCSI, just uses different
initiation commands and status reporting mechanism.

While there, make kernel better handle sanitize commands and statuses.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-07-25 18:48:31 +00:00
markj
6bcf6e3fcb Remove the CDIOCREADSUBCHANNEL_SYSSPACE ioctl.
This was added for emulation of Linux's CDROMSUBCHNL, but allows
users with read access to a cd(4) device to overwrite kernel memory
provided that the driver detects some media present.

Reimplement CDROMSUBCHNL by bouncing the data from CDIOCREADSUBCHANNEL
through the linux_cdrom_subchnl structure passed from userspace.

admbugs:	768
Reported by:	Alex Fortune
Security:	CVE-2019-5602
Security:	FreeBSD-SA-19:11.cd_ioctl
2019-07-03 00:10:01 +00:00
imp
95e08d5e62 Replay r349342 by imp accidentally reverted by r349352
Use the cam_ed copy of ata_params rather than malloc and freeing
memory for it. This reaches into internal bits of xpt a little, and
I'll clean that up later.
2019-06-25 06:14:31 +00:00
imp
7a03574ddd Replay r349340 by imp accidentally reverted by r349352
Create ata_param_fixup

Create a common fixup routine to do the canonical fixup of the
ata_param fixup. Call it from both the ATA and the ATA over SCSI
paths.
2019-06-25 06:14:21 +00:00
imp
0ea6c510f8 Remove NAND and NANDFS support
NANDFS has been broken for years. Remove it. The NAND drivers that
remain are for ancient parts that are no longer relevant. They are
polled, have terrible performance and just for ancient arm
hardware. NAND parts have evolved significantly from this early work
and little to none of it would be relevant should someone need to
update to support raw nand. This code has been off by default for
years and has violated the vnode protocol leading to panics since it
was committed.

Numerous posts to arch@ and other locations have found no actual users
for this software.

Relnotes:	Yes
No Objection From: arch@
Differential Revision: https://reviews.freebsd.org/D20745
2019-06-25 04:50:09 +00:00
imp
b7ba0372a1 Use the cam_ed copy of ata_params rather than malloc and freeing
memory for it. This reaches into internal bits of xpt a little, and
I'll clean that up later.
2019-06-24 20:23:19 +00:00
imp
f89d54f318 Create ata_param_fixup
Create a common fixup routine to do the canonical fixup of the
ata_param fixup. Call it from both the ATA and the ATA over SCSI
paths.
2019-06-24 20:18:58 +00:00
mav
4c22188d67 Improve AHCI Enclosure Management and SES interoperation.
Since SES specs do not define mechanism to map enclosure slots to SATA
disks, AHCI EM code I written many years ago appeared quite useless,
that always bugged me.  I was thinking whether it was a good idea, but
if LSI HBAs do that, why I shouldn't?

This change introduces simple non-standard mechanism for the mapping
into both AHCI EM and SES code, that makes AHCI EM on capable controllers
(most of Intel's) a first-class SES citizen, allowing it to report disk
physical path to GEOM, show devices inserted into each enclosure slot in
`sesutil map` and `getencstat`, control locate and fault LEDs for specific
devices with `sesutil locate adaX on` and `sesutil fault adaX on`, etc.

I've successfully tested this on Supermicro X10DRH-i motherboard connected
with sideband cable of its S-SATA Mini-SAS connector to SAS815TQ backplane.
It can indicate with LEDs Locate, Fault and Rebuild/Remap SES statuses for
each disk identical to real SES of Supermicro SAS2 backplanes.

MFC after:	2 weeks
2019-06-23 19:05:01 +00:00
mav
3ae789d820 Decouple enc/ses verbosity from bootverbose.
I don't want to be regularly notified that my enclosure violates standards
until there is some real problem I want to debug.

MFC after:	2 weeks
2019-06-22 19:09:10 +00:00
mav
c889cf64dd Remove ancient SCSI-2/3 mentioning.
MFC after:	2 weeks
2019-06-22 03:50:43 +00:00
mav
440ed42621 Make ELEMENT INDEX validation more strict.
SES specifications tell: "The Additional Element Status descriptors shall
be in the same order as the status elements in the Enclosure Status
diagnostic page".  It allows us to question ELEMENT INDEX that is lower
then values we already processed.  There are many SAS2 enclosures with
this kind of problem.

While there, add more specific error messages for cases when ELEMENT INDEX
is obviously wrong.  Also skip elements with INVALID bit set.

MFC after:	2 weeks
2019-06-22 01:06:41 +00:00
mav
daf9d8f42e Fix individual_element_index when some type has 0 elements.
When some type has 0 elements, saved_individual_element_index was set
to -1 on second type bump, since individual_element_index was not
restored after the first.  To me it looks easier just to increment
saved_individual_element_index separately than think when to save it.

MFC after:	2 weeks
2019-06-21 23:29:16 +00:00
imp
f47bc88408 Minor white space changes.
Remove trailing white space that's crept into this file.
2019-06-11 20:48:19 +00:00
mav
db266da594 Simplify math added in r310524.
Should be no functional change.

Reported by:	danfe
MFC after:	1 week
2019-05-22 15:39:35 +00:00
mav
c8679b8bb2 Drop periph lock around cam_periph_unmapmem().
Since r345656 it may call copyout(), that may sleep.

MFC after:	3 days
Sponsored by:	iXsystems, Inc.
2019-05-06 19:08:03 +00:00
mav
b537b150e3 Report DIF protection type the disk is formatted with.
Some disks formatted with protection report errors if written without
protection used.  This should help to diagnose the problem.

MFC after:	2 weeks
2019-04-22 01:08:14 +00:00
mav
956606fe26 Polish SCSI sense data validity checks.
According to specs and common sense, all sense data reported in descriptor
format should be valid.  But practice shows different, some devices return
descriptors with invalid data, resulting in error messages looking worse.

Decouple block/stream commands sense data and information field printing.
Looking on present specs, there are much more cases when those fields are
not related, and incomplete old code was not printing valid sense data and
leaving empty lines for invalid.

MFC after:	2 weeks
2019-04-21 19:07:03 +00:00
imp
3770ce2b78 Upgrade Chipfancier SLC quirk to all versions
The 16GB, 32GB and 128GB versions of this product all have the same
problem. For some reason, the RC10 size is correct, while the RC16
size is larger (oddly by the capacity size / 1024 bytes). Using the
RC16 size results in illegal LBA range errors when geom tastes the
device. So, expand the quirk to cover all versions of this chip.

Ideally, we'd get both READ CAPACITY 10 and READ CAPACITY 16 sizes and
print a warnnig if they differ and use the smaller of the two numbers,
though that may be problematical as well. Furthermore, SBC-4
encourages users transition to RC16 only, which suggests that in the
future RC10 may disappear from some drives. It's unclear how to cope
with these drives generically.

PR: 234503
MFC After: 1 week
2019-03-11 20:57:54 +00:00
dab
a8b33d582f CID 1009492: Logically dead code in sys/cam/scsi/scsi_xpt.c
In `probedone()`, for the `PROBE_REPORT_LUNS` case, all paths that
fall to the bottom of the case set `lp` to `NULL`, so the test for a
non-NULL value of `lp` and call to `free()` if true is dead code as
the test can never be true. Fix by eliminating the whole if
statement. To guard against a possible future change that accidentally
violates this assumption, use a `KASSERT()` to catch if `lp` is
non-NULL.

Reviewed by:	cem
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19109
2019-02-11 22:09:26 +00:00
imp
c558d0f965 Add quirk for Sansisk X400 drives
Certain versions of Sandisk x400 firmware can hang under extremely
heavly load of large I/Os for prolonged periods of time. Newer /
current versions work fine, and should be used where possible. Where
not possible, this quirk ensures that I/O requests are limited to 128k
to avoids the bug, even under extreme load. Since MAXPHYS is 128k,
only users with custom kernels are at risk on the older firmware.
Once all known users of the older firmware have upgraded, this quirk
will be removed.

Sponsored by: Netflix, Inc.
2019-02-05 22:53:36 +00:00
mav
6abbf0c797 Use switch instead of chained if/else to improve readability.
Submitted by:	Ryan Moeller <ryan@freqlabs.com>
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D19051
2019-02-04 01:20:56 +00:00