Commit Graph

2540 Commits

Author SHA1 Message Date
Alexander Motin
135c269d87 Fix build. Sorry.
MFC after:	2 weeks
2022-01-07 14:33:51 -05:00
Alexander Motin
f4d499fd67 CTL: Relax callouts precisions.
MFC after:	2 weeks
2022-01-07 14:30:44 -05:00
Alexander Motin
0e5c50bf60 cam: Relax callouts precisions.
On large systems even relatively rare callouts may fire many times
per second.  This should allow them to aggregate better, since we do
not require any precision when polling for media change, etc.

MFC after:	2 weeks
2022-01-07 12:59:16 -05:00
John Baldwin
8903d8e37f iscsi: Pass the request PDU to icl_conn_transfer_setup().
This matches icl_conn_task_setup() which passes the PDU and avoids the
need for a layering violation in cxgbei to fetch the request PDU from
the ctl_io.

Reviewed by:	mav
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D33746
2022-01-04 14:37:17 -08:00
Robert Wing
bb8441184b cam: don't lock while handling an AC_UNIT_ATTENTION
Don't take the device_mtx lock in daasync() when handling an
AC_UNIT_ATTENTION. Instead, assert the lock is held before modifying the
periph's softc flags.

The device_mtx lock is taken in xptdevicetraverse() before daasync()
is eventually called in xpt_async_bcast().

PR:             240917, 226510, 226578
Reviewed by:    imp
MFC after:      3 weeks
Differential Revision: https://reviews.freebsd.org/D27735
2022-01-03 16:56:48 -09:00
Alexander Motin
757089f01e CAM: List few missed opcodes.
MFC after:	1 weeks
2021-12-31 11:48:03 -05:00
Alexander Motin
b06771aa66 CTL: Allow I/Os up to 8MB, depending on maxphys value.
For years CTL block backend limited I/O size to 1MB, splitting larger
requests into sequentially processed chunks.  It is sufficient for
most of use cases, since typical initiators rarely use bigger I/Os.

One of known exceptions is VMWare VAAI offload, by default sending up
to 8 4MB EXTENDED COPY requests same time.  CTL internally converted
those into 32 1MB READ/WRITE requests, that could overwhelm the block
backend, having finite number of processing threads and making more
important interactive I/Os to wait in its queue.  Previously it was
partially covered by CTL core serializing sequential reads to help
ZFS speculative prefetcher, but that serialization was significantly
relaxed after recent ZFS improvements.

With the new settings block backend receives 8 4MB requests, that
should be easier for both CTL itself and the underlying storage.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2021-12-29 23:49:24 -05:00
Andriy Gapon
15910dc0bc adaspindown: check disk power mode before sending IDLE command
If a disk is already in STANDBY mode, then setting IDLE mode can
actually spin it up.

Reviewed by:	mav
MFC after:	4 weeks
Differential Revision:	https://reviews.freebsd.org/D33588
2021-12-24 11:02:22 +02:00
Wojciech Macek
e0ceec676d cam: don't send scsi commands on shutdown when reboot method RB_NOSYNC
Don't send the SCSI comand SYNCHRONIZE CACHE on devices that are still
open when RB_NOSYNC is the reboot method. This may avoid recursive panics
when doadump is called due to a SCSI/CAM/USB error/bug.

Obtained from:		Semihalf
Sponsored by:		Stormshield
Reviewed by:		imp
Differential revision:	https://reviews.freebsd.org/D31549
2021-12-20 06:32:51 +01:00
Andriy Gapon
8eca341d9b follow up to 18679ab1, actually change size of mmc_sim::name to 16
The change was made locally but was not squashed into the commit.

Fixes:		18679ab1 mmc_sim: fix setting of the mutex name
MFC after:	8 days
2021-12-17 13:24:53 +02:00
Andriy Gapon
18679ab1c0 mmc_sim: fix setting of the mutex name
To quote the manual:
 The pointer passed in as name and type is saved rather than the data
 it points to.  The data pointed to must remain stable until the mutex
 is destroyed.

It seems that the type is actually copied, but the name is stored as
a pointer indeed.
mmc_cam_sim_alloc used a name stored on stack.
So, a corrupt mutex name would be reported.
For example:
  lock order reversal: (sleepable after non-sleepable)
  1st 0xd7285b20 <8A><C0><C0>P@<C1><D0>P@<C1>^D^A (aw_mmc_sim, sleep mutex) @ sys/cam/cam_xpt.c:2804

This change moves the name to struct mmc_sim.
Also, that name is used as the sim name as well.
Unused mtx_name variable is removed too.
The name buffer is reduced to 16 characters.

Reviewed by:	manu, bz
MFC after:	10 days
Differential Revision:	https://reviews.freebsd.org/D33412
2021-12-15 13:42:02 +02:00
Andriy Gapon
8b37048bc5 Revert "mmc_sim: fix setting of the mutex name"
This reverts commit df472af034.

The change hasn't been reviewed.
2021-12-13 13:51:47 +02:00
Andriy Gapon
df472af034 mmc_sim: fix setting of the mutex name
To quote the manual:
 The pointer passed in as name and type is saved rather than the data
 it points to.  The data pointed to must remain stable until the mutex
 is destroyed.

It seems that the type is actually copied, but the name is stored as
a pointer indeed.
mmc_cam_sim_alloc used a name stored on stack.
So, a corrupt mutex name would be reported.
For example:
  lock order reversal: (sleepable after non-sleepable)
  1st 0xd7285b20 <8A><C0><C0>P@<C1><D0>P@<C1>^D^A (aw_mmc_sim, sleep mutex) @ /usr/devel/git/orange/sys/cam/cam_xpt.c:2804

This change moves the name to struct mmc_sim.
Also, that name is used as the sim name as well.
Unused mtx_name variable is removed too.
2021-12-13 13:40:47 +02:00
Mateusz Guzik
a036d73e8f ctl: plug set-but-not-unused var
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-12-10 12:06:48 +00:00
Warner Losh
1d92e81fa4 cam/iosched: fix off by one error
Set the bucket size to be SBT_1US/50000 + 1 to be the first number >
20us. I had this uncommitted in my three when I pushed 2283206935
since kern.cam.iosched.bucket_base_us was reporting 19us.

Fixes:		2283206935
Sponsored by:	Netflix
2021-12-05 23:00:01 -07:00
Warner Losh
2283206935 cam-iosched: Publish parameters of the latency buckets
Add sysctls to publish the latency bucket size, number, and stride. Move
to putting all the iosched stuff under kern.cam.iosched as well and move
kern.cam.do_dynamic_iosched to kern.cam.iosched.dynamic. In addition, move
kern.cam.io_sched_alpha_bits to kern.cam.iosched.alpha_bits. Publish
kern.cam.iosched.bucket_base (the smallest bucket time), .bucket_ratio
(the geometric progression factor * 100), and .buckets (the total number
of buckets).

Move to publishing 20 buckets, starting at 20us. This allows us to get
better resolution on the short end and detect preformance degredation of
the NVMe drives I've tested on, even with the uncertainty of bucketing.

Sponsored by:		Netflix
2021-12-05 22:19:07 -07:00
Warner Losh
3846662dab cam: Initialize wired to false
As part of converting the code to a while loop, the unconditional
initialization of wired to false was lost.

Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D33163
2021-11-30 14:40:30 -07:00
Edward Tomasz Napierala
fbf5246757 cfiscsi(4): Fix "set but not used" warning
No functional changes.

Sponsored By:	EPSRC
2021-11-29 16:45:15 +00:00
Mateusz Guzik
7e1d3eefd4 vfs: remove the unused thread argument from NDINIT*
See b4a58fbf64 ("vfs: remove cn_thread")

Bump __FreeBSD_version to 1400043.
2021-11-25 22:50:42 +00:00
Scott Long
c154feacc4 Fix "set but not used" warnings in CAM. 2021-11-25 03:17:54 +00:00
Warner Losh
1bc9ca3b35 cam: Unbreak CAM_IO_STATS build
Fixes:		6637b74600
Sponsored by:	Netflix
2021-11-24 02:36:48 -07:00
Warner Losh
6637b74600 cam: Remove all the write-only variables
Delete all the write only variables in CAM. At worst, the only behavior
change would be to prevent core dumps from chasing NULL pointers (though
I think in all these cases the pointers can't be NULL).

Sponsored by:		Netflix
2021-11-23 21:21:18 -07:00
Andriy Gapon
e17b58ecbc sddadone: 'error' gets assigned only errno codes, never MMC_ERR codes
MFC after:	2 weeks
2021-11-13 11:20:14 +02:00
Warner Losh
c048ac620f cam_iosched: Fix a comment
Array elements were added, but this comment wasn't updated.

Sponsored by:		Netflix
2021-11-08 14:21:08 -07:00
Warner Losh
d836c48e71 cam_periph: wired is really a bool, update it to a bool.
Sponsored by:		Netflix
Reviewed by:		scottl
Differential Revision:	https://reviews.freebsd.org/D32823
2021-11-05 08:56:48 -06:00
Warner Losh
bd82711aff cam: Remove trailing spaces from serial numbers too
The SanDisk SD8SB8U1 and likely others pad their serial number with
spaces on the end rather than the start (at least when connected to a
SAS3008). This makes them difficult to wire unit numbers to with the
serial because you have to specify the trailing spaces. Instead, strip
out the trailing spaces.

We already strip leading spaces both here. In addition, when glabel
creates the devfs device nodes, leading and trailing spaces are removed
already (so there will be no change there either).

Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D32684
2021-11-05 08:56:41 -06:00
Warner Losh
577f9aa266 cam_periph: Add ability to wire units to a serial number
For scsi, ata and nvme, at least, we read a serial number from the
device (if the device supports it, some scsi drives do not) and record
it during the *_xpt probe device state machine before it posts the
AC_FOUND_DEVICE async event. For mmc, no serial number is ever
retrieved, so it's always NULL. Add the ability to match this serial
number during device wiring.

This mechanism is competely optional, and often times using a label
and/or some other attribute of the device is easier. However, other
times wiring a unit to a serial number simplifies management as most
monitoring tools require the *daX device and having it stable from boot
to boot helps with data continuity. It can be especially helpful for
nvme where no other means exists to reliably tie a ndaX device to an
underlying nvme drive and namespace.

A similar mechanism exists in Linux to mange device unit numbers with
udev.

Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D32683
2021-11-05 08:56:33 -06:00
Warner Losh
710a519ebb cam_periph: fix bug in camperiphunitnext logic
If we assigned just a lun as a wired unit (something that camperiphunit
will accept), we failed to properly skip over that unit when computing a
next unit number. Add lun so the code matches the comments that we have
to skip all the same criteria that camperiphunit uses to select wired
units for a driver.

Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D32682
2021-11-05 08:56:27 -06:00
Warner Losh
bee0133fb9 cam_periph: switch from negative logic to positive logic
When scanning the resources that are wired for this driver, skip any
that whose number doesn't match newunit. They aren't relevant. Switch to
positive logic to break out of the loop (and thus go to the next unit)
if we find either a target resource or an at resource. This makes the
code easier to read and modify.

Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D32681
2021-11-05 08:56:22 -06:00
Warner Losh
00f79c97a4 cam_periph: Remove vestigial "scbus" comparison
The code in camperiphunit rejects "scbus" as an 'at' location that would
allow any other wiring to use that unit number. Yet in
camperiphunitnext, if we have a no target and the 'at' location of
'scbus' it would be excluded on the basis that it's a wiring
cadidate. This is improper and appears to be a hold-over of the
pre-hints / pre-newbus config system, so remove it.

Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D32680
2021-11-05 08:56:13 -06:00
Mark Johnston
6afabf0092 scsi_cd: Improve TOC access validation
1. During CD probing, we read the TOC header to find the number of
   entries, then read the TOC itself.  The header determines the number
   of entries, which determines the amount of data to read from the
   device into the softc in the CD_STATE_MEDIA_TOC_FULL state.  We
   hard-code a limit of 99 tracks (plus one for the lead-out) in the
   softc, but were not validating that the size reported by the media
   would fit in this hard-coded limit.  Kernel memory corruption could
   occur if not.[1]  Add validation to check this, and refuse to cache
   the TOC if it would not fit.

2. The CDIOCPLAYTRACKS ioctl uses caller provided track numbers to index
   into the TOC, but we only validate the starting index.  Add
   validation of the ending index.

Also, raise the hard-coded limit from 100 tracks to 170, per a
suggestion from Ken.

Reported by:	C Turt <ecturt@gmail.com> [1]
Reviewed by:	ken, avg
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32803
2021-11-03 15:09:17 -04:00
Warner Losh
dbfe5dd3f9 cam_periph: style change
wrap a long line at 80 columns

Sponsored by:		Netflix
Reviewed by:		chs
Differential Revision:	https://reviews.freebsd.org/D32679
2021-11-03 08:03:07 -06:00
Jessica Clarke
f350bc1dd3 ada: Fix intra-object buffer overread of identify strings
In the ATA/ATAPI spec these are space-padded fixed-length strings with
no NUL-terminator (and byte swapped). When performing the identify we
call ata_param_fixup to swap the bytes back to be in order, strip any
leading/trailing spaces and coalesce consecutive spaces, padding with
NULs. However, if the input has no padding spaces, the fixed-up strings
are still not NUL-terminated. This causes two issues. The first is that
strlcpy will truncate the string by replacing the final byte with a NUL.
The second is that strlcpy will keep reading src until it finds a NUL in
order to calculate the return value, which is defined as the length of
src (so that callers can then compare it with the dsize input to see if
the input string was truncated), thereby reading past the end of the
buffer and into whatever adjacent fields are in the structure. In
practice there's a NUL byte somewhere in the structure, but on CHERI
with subobject bounds enabled in the compiler this overread will be
detected and trap as a bounds violation.

Note this matches ata_xpt's aprobedone, which does a bcopy to a
malloc'ed buffer and manually NUL-terminates it for the CAM path's
device's serial_num.

Found by:	CHERI
Reviewed by:	imp, scottl
Differential Revision:	https://reviews.freebsd.org/D32567
2021-10-27 18:38:37 +01:00
Alexander Motin
730ea72c70 cam(4): Limit search for disks in SES enclosure by single bus
At least for SAS that we only support now disks are typically
connected to the same bus as the enclosure.  Limiting the search
scope makes it much faster on systems with multiple buses and
thousands of disks.

Reviewed by:	imp
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D32305
2021-10-05 15:01:16 -04:00
Alexander Motin
8f9be1eed1 cam(4): Improve XPT_DEV_MATCH
Remove *_MATCH_NONE enums, making no sense and so never used.  Make
*_MATCH_ANY enums 0 (no any match flags set), previously used by
*_MATCH_NONE.  Bump CAM_VERSION to 0x1a reflecting those changes and
add compat shims.

When traversing through buses and devices do not descend if we can
already see that requested pattern does not match the bus or device.
It allows to save significant amount of time on system with thousands
of disks when doing limited searches.

Reviewed by:	imp
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D32304
2021-10-05 14:54:03 -04:00
Warner Losh
b3b15d9256 cam: Add doxygen for cam_sim_free
Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D32303
2021-10-05 07:07:47 -06:00
Gordon Bergling
15c5f657a0 cam: Fix a typo in a comment
- s/perorming/performing/

MFC after:	3 days
2021-10-02 10:48:43 +02:00
Mark Johnston
ed8ef7ae8b cam: Avoiding waking up doneq threads if we're dumping
Depending on the state of the target doneq thread at the time of the
panic, the wakeup can hang indefinitely in thread_lock_block_wait().
That function should likely be modified to return immediately if the
scheduler is stopped, but it is also preferable to avoid wakeups in
general after a panic.

Reported by:	pho
Reviewed by:	mav, imp
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32126
2021-09-25 10:15:03 -04:00
Warner Losh
da73926566 libcam: Define depop structures and introduce scsi_wrap
Define structures related to the depop set of commands (GET PHYSICAL ELEMENT
STATUS, REMOVE ELEMENT AND TRUNCATE, and RESTORE ELEMENT AND REBUILD) as
well as the CDB construction routines.

Also create scsi_wrap.c. This will have convenience routines that will do all
the elements of allocating the ccb, generating the CDB, sending the command
(looping as necessary for cases where data is returned, but it's size isn't
known up front), etc. As this functionality is fleshed out, calling many
camcontrol commands programatically gets much easier.

Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D29017
2021-09-20 16:27:59 -06:00
John Baldwin
0cd6e85e24 iscsi: Abort data-out tasks queued on a terminating session.
cfiscsi_datamove_out() can race with cfiscsi_session_terminate_tasks()
and enqueue a new task after the latter function has aborted existing
tasks.  This could result in a deadlock as
cfiscsi_session_terminate_tasks() waited forever for this task to
complete.

Reviewed by:	mav
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D31892
2021-09-15 13:25:30 -07:00
John Baldwin
529364b032 iscsi: Add a helper routine to abort a data-out task.
Reviewed by:	mav
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D31891
2021-09-15 13:25:04 -07:00
John Baldwin
d2bc7754a2 Assert that invalid bus widths can't be passed to bus_width_str().
This appeases a -Wreturn-type warning from GCC.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D31935
2021-09-15 09:03:17 -07:00
Ka Ho Ng
49050613ef ctl(4): Do hole-punching for UNMAP to file-backed LUNs
This adds support for SCSI UNMAP command to file-backed LUNs, if the
underlying file system has a non-zerofilling VOP_DEALLOCATE
implementation where some or all parts of the requested operation range
may be deallocated.

Sponsored by:	The FreeBSD Foundation
Reviewed by:	mav
Differential Revision:	https://reviews.freebsd.org/D31922
2021-09-15 03:51:58 +08:00
Alexander Motin
e76786909c Fix data race in scsi cd driver.
There is a data race between cdsysctlinit and cdcheckmedia.  Both
functions change softc->flags without synchronization.

Submitted by:	Arseny Smalyuk <smalukav@gmail.com>
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D31726
2021-09-13 08:59:51 -04:00
Alan Somers
cc2d08d388 ses: Guard the elm_type_names declaration by _KERNEL
MFC after:	2 weeks
Sponsored by:	Axcient
2021-09-02 14:47:18 -06:00
Alan Somers
1fb52e4373 ses: Correct spelling of "Temperature Sensor"
According to SES 4 revision 2 table 71, it should be singular.

MFC after:	2 weeks
Sponsored by:	Axcient
2021-09-02 14:38:06 -06:00
Edward Tomasz Napierala
0f49ecffb7 cam: revert second half of 75b5caa08e
This turns debugging printf() into a KASSERT().

Reviewed By:	imp
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31523
2021-09-01 09:35:27 +00:00
Alexander Motin
f3dcedd3de targ(4): Remove D_NEEDGIANT.
I don't believe this code needs Giant, if ever needed.

MFC after:	1 month
2021-08-21 11:20:54 -04:00
Alexander Motin
84d5b6bd68 cam(4): Fix quick unplug/replug for SCSI.
If some device is plugged back in after unplug before the probe periph
destroyed, it will just restart the probe process. But I've found that
PROBE_INQUIRY_CKSUM flag not cleared between the iterations may cause
AC_FOUND_DEVICE not reported on the second iteration, and because of
AC_LOST_DEVICE reported during the first iteration, the device end up
configured, but without any periphs attached.

We've found that enabled serial console and 102-disk JBOD cause enough
probe delays to easily trigger the issue for half of the disks.  This
change fixes it reliably on my tests.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2021-08-21 09:58:05 -04:00
John Baldwin
c261b6ea4e iscsi: Teach the iSCSI stack about "large" received PDUs.
When using iSCSI PDU offload (cxgbei) on T6 adapters, a burst of
received PDUs can be reported via a single message to the driver.

Previously the driver passed these multi-PDU bursts up to the iSCSI
stack up as a single "large" PDU by rewriting the buffer offset, data
segment length, and DataSN fields in the iSCSI header.  The DataSN
field in particular was rewritten so that each of the "large" PDUs
used consecutively increasing values.  While this worked, the forged
DataSN values did not match the ExpDataSN value in the subsequent SCSI
Response PDU.  The initiator does not currently verify this value, but
the forged DataSN values prevent adding a check.

To avoid this, allow a logical iSCSI PDU (struct icl_pdu) to describe
a burst of PDUs via a new 'ip_additional_pdus' field.  Normally this
field is set to zero when 'struct icl_pdu' represents a single PDU.
If logical PDU represents a burst of on-the-wire PDUs, then 'ip_npdus'
contains the count of additional on-the-wire PDUs.  The header of this
"large" PDU is still modified, but the DataSN field now contains the
DataSN value of the first on-the-wire PDU in the burst.

Reviewed by:	mav
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D31577
2021-08-18 10:56:28 -07:00