Commit Graph

2435 Commits

Author SHA1 Message Date
Alexander Motin
6ed39db257 Do not exit ctl_be_block_worker() prematurely.
Return while there are any I/Os in a queue may result in them stuck
indefinitely, since there is only one taskqueue task for all of them.
I think I've reproduced this by switching ha_role to secondary under
heavy load.

MFC after:	3 days
2021-03-05 22:45:47 -05:00
Warner Losh
b3fce46a3e cam: remove redundant scsi_vpd_block_characteristics definition
There were two definitions for the SCSI VPD Block Device Characteristics (page
0xb1): struct scsi_vpd_block_characteristics and struct
scsi_vpd_block_device_characteristics. The latter is more complete and more
widely used. Convert uses of the former to the latter by tweaking the da driver
and removing sturct scsi_vpd_block_characteristics.
2021-03-02 18:35:09 -07:00
Alexander Motin
a59e2982fe Optimize out few extra memory accesses.
MFC after:	1 week
2021-03-01 18:36:33 -05:00
Alexander Motin
9d9fd8b79f Micro-optimize OOA queue processing.
- Move ctl_get_cmd_entry() calls from every OOA traversal to when
  the requests first inserted, storing seridx in struct ctl_scsiio.
- Move some checks out of the loop in ctl_check_ooa().
- Replace checks for errors that can not happen with asserts.
- Transpose ctl_serialize_table, so that any OOA traversal accessed
  only one row (cache line).  Compact it from enum to uint8_t.
- Optimize static branch predictions in hottest places.

Due to O(n) nature on deep LUN queues this can be the hottest code
path in CTL, and additional 20% of IOPS I see in some 4KB I/O tests
are good to have in reserve.  About 50% of CPU time here according
to the profiles is now spent in two memory accesses per traversed
request in OOA.

Sponsored by:	iXsystems, Inc.
MFC after:	2 weeks
2021-02-27 10:40:24 -05:00
Warner Losh
34d6961108 cam: add new ASC and ASCQ values related to drive depopulation
Add 04/25 Depopulation restoration in progress, 31/04 Depopulation failed, and
31/05 Depopulation restoration failed.

These are defined in SPC-6r2 (though 31/4 was added in an earlier draft). They
relate to different aspects of in-progress or failed depopulation removal and
restoration commands.
2021-02-26 11:45:06 -07:00
Alexander Motin
a9bd22814f Remove pointless lun->be_lun checks.
There is no such thing as LUN without backend, at least for years.

MFC after:	1 week
2021-02-25 19:48:03 -05:00
Alexander Motin
7d4c444374 Bump CTL block backend threads from 14 to 32 per LUN.
This makes random read benchmarks look better on a wide ZFS pools.
I am not sure where the original value goes from, but it is there
for too long now.

MFC after:	1 week
2021-02-23 11:03:32 -05:00
Alexander Motin
c02a28754b Fix build after 2c7dc6bae9.
MFC after:	1 month
2021-02-21 17:21:14 -05:00
Alexander Motin
2c7dc6bae9 Refactor CTL datamove KPI.
- Make frontends call unified CTL core method ctl_datamove_done()
to report move completion.  It allows to reduce code duplication
in differerent backends by accounting DMA time in common code.
 - Add to ctl_datamove_done() and be_move_done() callback samethr
argument, reporting whether the callback is called in the same
context as ctl_datamove().  It allows for some cases like iSCSI
write with immediate data or camsim frontend write save one context
switch, since we know that the context is sleepable.
 - Remove data_move_done() methods from struct ctl_backend_driver,
unused since forever.

MFC after:	 1 month
2021-02-21 16:52:33 -05:00
Alexander Motin
05d882b780 Microoptimize CTL I/O queues.
Switch OOA queue from TAILQ to LIST and change its direction, so that
we traverse it forward, not backward.  There is only one place where
we really need other direction, and it is not critical.

Use STAILQ_REMOVE_HEAD() instead of STAILQ_REMOVE() in backends.

Replace few impossible conditions with assertions.

MFC after:	1 month
2021-02-19 15:49:36 -05:00
Alexander Motin
812c9f48a2 Save context switch per I/O for iSCSI and IOCTL frontends.
Introduce new CTL core KPI ctl_run(), preprocessing I/Os in the caller
context instead of scheduling another thread just for that.  This call
may sleep, that is not acceptable for some frontends like the original
CAM/FC one, but iSCSI already has separate sleepable per-connection RX
threads, and another thread scheduling is mostly just a waste of time.
IOCTL frontend actually waits for the I/O completion in the caller
thread, so the use of another thread for this has even less sense.

With this change I can measure ~5% IOPS improvement on 4KB iSCSI I/Os
to ZFS.

MFC after:	1 month
2021-02-18 22:29:38 -05:00
Alexander Motin
c67a2909a6 Move XPT_IMMEDIATE_NOTIFY handling out of periph lock.
It is a rare, but still better to not have lock dependencies.

MFC after:	1 month
2021-02-18 16:31:38 -05:00
John Baldwin
e6405c8c37 cam: Properly find the sim in the assertion in xpt_pollwait().
I had missed merging this fixup into
447b3557a9 before pushing it.

Pointy hat to:	jhb
MFC after:	2 weeks
2021-02-11 14:06:58 -08:00
John Baldwin
e07ac3f2fd cam: Don't permit crashdumps on non-pollable devices.
If a disk's SIM doesn't support polling, then it can't be used to
store crashdumps.  Leave d_dump NULL in that case so that dumpon(8)
fails gracefully rather than having dumps fail at crash time.

Reviewed by:	scottl, mav, imp
MFC after:	2 weeks
Sponsored by:	Chelsio
Differential Revision:	https://reviews.freebsd.org/D28454
2021-02-11 13:52:18 -08:00
John Baldwin
447b3557a9 cam: Permit non-pollable sims.
Some CAM sim drivers do not support polling (notably iscsi(4)).
Rather than using a no-op poll routine that always times out requests,
permit a SIM to set a NULL poll callback.  cam_periph_runccb() will
fail polled requests non-pollable sims immediately as if they had
timed out.

Reviewed by:	scottl, mav (earlier version)
Reviewed by:	imp
MFC after:	2 weeks
Sponsored by:	Chelsio
Differential Revision:	https://reviews.freebsd.org/D28453
2021-02-11 13:52:12 -08:00
Alexander Motin
b31dae0caa Exclude reserved iSCSI Target Transfer Tag.
RFC 7143 (11.7.4):
   The Target Transfer Tag values are not specified by this protocol,
   except that the value 0xffffffff is reserved and means that the
   Target Transfer Tag is not supplied.

MFC after:	1 month
2021-01-24 13:58:29 -05:00
Mark Johnston
2ce5eef6e2 cam: Remove Giant handling from cam_sim_alloc()
There are no non-MPSAFE SIM drivers left in the tree, verified with
coccinelle.

Reviewed by:	scottl, imp
Differential Revision:	https://reviews.freebsd.org/D27853
2021-01-03 11:50:31 -05:00
Marius Strobl
eae35125e9 ada(4): remove remainder of MD geometry translation support
This was missed in 9cf738228d and
r359718 respectively.
2020-12-25 20:20:54 +01:00
Emmanuel Vadot
ca7ef91285 mmccam: Convert some printf to CAM_DEBUG
This not not useful if you are not debuging mmccam
2020-11-30 14:49:13 +00:00
Alexander Motin
9093e27cc1 Remove alignment requirements for KVA buffer mapping.
After r368124 vmapbuf() should happily map misaligned maxphys-sized buffers
thanks to extra page added to pbuf_zone.
2020-11-29 00:49:14 +00:00
Konstantin Belousov
cd85379104 Make MAXPHYS tunable. Bump MAXPHYS to 1M.
Replace MAXPHYS by runtime variable maxphys. It is initialized from
MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys.

Make b_pages[] array in struct buf flexible.  Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allow several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*).  Overall, we save significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to desired high value.

Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except a place which initialize maxphys.  Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight.  Some drivers, which use MAXPHYS to size embeded structures,
get private MAXPHYS-like constant; their convertion is out of scope
for this work.

Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis, where either submitted by, or based on changes by mav.

Suggested by: mav (*)
Reviewed by:	imp, mav, imp, mckusick, scottl (intermediate versions)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D27225
2020-11-28 12:12:51 +00:00
Emmanuel Vadot
2fe1b4ca42 mmccam: We can't sleep during sdda_add_part so use M_NOWAIT
Reviewed by:	kibab
Differential Revision:	https://reviews.freebsd.org/D25947
2020-11-26 16:39:56 +00:00
Jung-uk Kim
7e06495b9b Do not truncate the last character from serial number.
strlcpy() requires one more byte for the NULL character.

Submitted by:	Henri Hennebert (hlh at restart dot be)
MFC after:	3 days
2020-11-24 21:14:36 +00:00
Alexander Motin
466d0a2572 Microoptimize cam_num_doneqs math in xpt_done().
MFC after:	1 week
2020-11-20 05:46:27 +00:00
Alexander Motin
8054320e07 Make CTL nicer to increased MAXPHYS.
Before this CTL always allocated MAXPHYS-sized buffers, even for 4KB I/O,
that is even more overkill for MAXPHYS of 1MB.  This change limits maximum
allocation to 512KB if MAXPHYS is bigger, plus if one is above 128KB, adds
new 128KB UMA zone for smaller I/Os.  The patch factors out alloc/free,
so later we could make it use more zones or malloc() if we'd like.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2020-11-11 21:59:39 +00:00
Ilya Bakulin
b6b885c4fe Always return MMC errors from mmc_handle_reply()
There are two ways to propagate the error in MMCCAM:
 * Using cmd.error which is set by the peripheral driver;
 * Using CCB status which is... also set by the driver.

The problem is that those two error conditions don't necessarily match.
This leads to the confusion when handling the MMC reply. So enforce the consistency
by panicking if request is marked as completed successfully but MMC-level error
is present (this hints to the programming error).

Reviewed by:	manu
Approved by:	imp (mentor)
Differential Revision:	https://reviews.freebsd.org/D26925
2020-11-03 21:38:59 +00:00
Alexander Motin
06c888ecb9 Add icc (Isochronous Command Completion) ccb_ataio field.
MFC after:	1 week
2020-11-02 01:01:41 +00:00
Edward Tomasz Napierala
bce7ee9d41 Drop "All rights reserved" from all my stuff. This includes
Foundation copyrights, approved by emaste@.  It does not include
files which carry other people's copyrights; if you're one
of those people, feel free to make similar change.

Reviewed by:	emaste, imp, gbe (manpages)
Differential Revision:	https://reviews.freebsd.org/D26980
2020-10-28 13:46:11 +00:00
Alexander Motin
8836496815 Introduce support of SCSI Command Priority.
SAM-3 specification introduced concept of Task Priority, that was renamed
to Command Priority in SAM-4, and supported by all modern SCSI transports.
It provides 15 levels of relative priorities: 1 - highest, 15 - lowest and
0 - default.  SAT specification for SATA devices translates priorities 1-3
into NCQ high priority.

This change adds new "priority" field into empty spots of struct ccb_scsiio
and struct ccb_accept_tio of CAM and struct ctl_scsiio of CTL.  Respective
support is added into iscsi(4), isp(4), mpr(4), mps(4) and ocs_fc(4) drivers
for both initiator and where applicable target roles.  Minimal support was
added to CTL to receive the priority value from different frontends, pass it
between HA controllers and report in few places.

This patch does not add consumers of this functionality, so nothing should
really change yet, since the field is still set to 0 (default) on initiator
and not actively used on target.  Those are to be implemented separately.

I've confirmed priority working on WD Red SATA disks connected via mpr(4)
and properly transferred to CTL target via iscsi(4), isp(4) and ocs_fc(4).

While there, added missing tag_action support to ocs_fc(4) initiator role.

MFC after:	1 month
Relnotes:	yes
Sponsored by:	iXsystems, Inc.
2020-10-25 19:34:02 +00:00
Brooks Davis
44ca4575ea vmapbuf: don't smuggle address or length in buf
Instead, add arguments to vmapbuf.  Since this argument is
always a pointer use a type of void * and cast to vm_offset_t in
vmapbuf.  (In CheriBSD we've altered vm_fault_quick_hold_pages to
take a pointer and check its bounds.)

In no other situtation does b_data contain a user pointer and vmapbuf
replaces b_data with the actual mapping.

Suggested by:	jhb
Reviewed by:	imp, jhb
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26784
2020-10-21 16:00:15 +00:00
Alexander Motin
cd500da924 Fix sbuf_finish() error code check in user-space.
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2020-10-13 23:29:06 +00:00
Warner Losh
863f967f95 cam: Add quirk for Samsung MZ7* behind a SATA-to-SAS interposer
Sometimes, this drive will be present in the system such that the the
firmware identification string doesn't start with ATA, such as when
it's behind a SATA-to-SAS interposer. Add another quirk for that.

Submitted by: github user mr44er
Github PR: 423
2020-10-07 05:44:35 +00:00
Warner Losh
f8503fde31 nvme: Note where the CCB was released for passthrough command 2020-10-06 23:35:26 +00:00
Warner Losh
a1975719dd cam: Assert we have a reference when freeing sim
Before we decrement refcount to sleep on the sim, assert that the
refcount >= 1. If it were 0 here, we'd never wake up.
2020-10-06 23:33:56 +00:00
John Baldwin
83a277830f Revert most of r360179.
I had failed to notice that sgsendccb() was using cam_periph_mapmem()
and thus was not passing down user pointers directly to drivers.  In
practice this broke requests submitted from userland.

PR:		249395
Reported by:	Trenton Schulz <trueos@norwegianrockcat.com>
Reviewed by:	scottl
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D26550
2020-09-25 21:19:56 +00:00
Andriy Gapon
15f4848ab7 mmc_da: universally use uint8_t for the partition index
Also, assert in sdda_init_switch_part() that the index is within the
defined range.

MFC after:	1 week
2020-09-08 06:19:23 +00:00
Andriy Gapon
fd38fa398a mmc_da: fix a typo and a too long line
MFC after:	1 week
2020-09-08 06:18:34 +00:00
Andriy Gapon
4dfdaf4d92 mmc_da: make sure that part_index is not used uninitialized in sddastart
This is a fix to r334065.

Without this change I once got stuck I/O with endless partition switching:

(sdda0:aw_mmc_sim2:0:0:0): sddastart
(sdda0:aw_mmc_sim2:0:0:0): Partition  0 -> -525703168
(sdda0:aw_mmc_sim2:0:0:0): xpt_action: func 0x91d XPT_MMC_IO
(sdda0:aw_mmc_sim2:0:0:0): xpt_done: func= 0x91d XPT_MMC_IO status 0x1
(sdda0:aw_mmc_sim2:0:0:0): sddadone
(sdda0:aw_mmc_sim2:0:0:0): Card status: 00000000
(sdda0:aw_mmc_sim2:0:0:0): Current state: 4
(sdda0:aw_mmc_sim2:0:0:0): Compteting partition switch to 0

Note that -525703168 (an int) is 0xe0aa6800 in binary representation.
The partition indexes are actually stored as uint8_t, so that value
was converted / truncated to zero.

MFC after:	1 week
2020-09-08 05:46:10 +00:00
Bjoern A. Zeeb
7a7ca53f69 cam_sim: harmonize code related to acquiring a mtx
cam_sim_free(), cam_sim_release(), and cam_sim_hold() all assign
a mtx variable during declaration and then if NULL or the mtx is
held may re-asign the variable and/or acquire/release a lock.

Harmonize the code, avoiding double assignments and make it look
the same for all three function (with cam_sim_free() not needing
an extra case).

No functional changes intended.

Reviewed by:	imp; no-objections by: mav
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D26286
2020-09-04 18:18:05 +00:00
Mateusz Guzik
27dcd3d90b cam: clean up empty lines in .c and .h files 2020-09-01 22:13:48 +00:00
Warner Losh
73be5dd2b2 Fix tiny style nit. 2020-08-27 17:46:13 +00:00
Alexander Motin
7758c80f74 Fix CTL ioctl port creation error handling.
Submitted by:	Bret Ketchum
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D26143
2020-08-21 20:10:29 +00:00
Warner Losh
773e541e8d Use devctl.h instead of bus.h to reduce newbus pollution.
There's no need for these parts of the kernel to know about newbus,
so narrow what is included to devctl.h for device_notify_*.

Suggested by: kib@
2020-08-21 00:03:24 +00:00
Mateusz Guzik
7ad2a82da2 vfs: drop the error parameter from vn_isdisk, introduce vn_isdisk_error
Most consumers pass NULL.
2020-08-19 02:51:17 +00:00
Mateusz Guzik
f1f3f7cf0b cam: add missing MAKEDEV_NOWAIT in passregister
Reported by:	dhw
2020-08-18 16:06:16 +00:00
Alexander Motin
c119ccdbfd Extend EIIOE field handling according to ses4r5 draft.
It should not affect any existing systems.

MFC after:	2 weeks
2020-08-17 15:11:46 +00:00
Alexander Motin
654f53ab6a Fill device serial_num and device_id in NVMe XPT.
It allows to report GEOM::lunid for nda(4) same as for nvd(4).  Since
NVMe now allows multiple LUs (namespaces) with multiple paths unique
LU identification is important.  The serial_num field is filled same
as before with the controller serial number, while device_id is based
on namespace GUID and/or EUI64 fields as recommended by "NVM Express:
SCSI Translation Reference" and matching nvd(4) at the end.

MFC after:	1 week
2020-08-13 02:32:46 +00:00
Alexander Motin
1c7decd4de Report proper stripesize for nda(4).
Same as for nvd(4) report NPWG if present, otherise NOIOB.

MFC after:	1 week
2020-08-12 19:37:57 +00:00
Bjoern A. Zeeb
1e5d733503 mmc_da: fix memory leak in sddaregister()
In case we are failing to allocate mmcdata, we are returning with
a softc allocated but not attached to anything and thus leak the
memory.

Reviewed by:	scottl
MFC after:	2 weeks
X-MFC:		only if we also mfc other mmccam changes?
Differential Revision:	https://reviews.freebsd.org/D25987
2020-08-07 19:58:16 +00:00
Alexander Motin
8bdf81e4d1 Add CTL support for REPORT IDENTIFYING INFORMATION command.
It allows to report to initiator LU identifying information, preset via
"ident_info" and "text_ident_info" options.

Unfortunately it is impossible to implement SET IDENTIFYING INFORMATION,
since we have no persistent storage it requires, so the information is
read-only for initiator and has to be set out-of-band.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2020-08-06 19:16:11 +00:00