Commit Graph

32 Commits

Author SHA1 Message Date
Alexander Motin
b75168ed24 Make software iSCSI more configurable.
Move software iSCSI tunables/sysctls into kern.icl.soft subtree.
Replace several hardcoded length constants there with variables.

While there, stretch the limits to better match Linux' open-iscsi
and our own initiator with new MAXPHYS of 1MB.  Our CTL target is
also optimized for up to 1MB I/Os, so there is also a match now.
For Windows 10 and VMware 6.7 initiators at default settings it
should make no change, since previous limits were sufficient there.

Tests of QD1 1MB writes from FreeBSD over 10GigE link show throughput
increase by 29% on idle connection and 132% with concurrent QD8 reads.

MFC after:	3 days
Sponsored by:	iXsystems, Inc.
2021-01-28 16:22:45 -05:00
Alexander Motin
ff751ee05c Remove FirstBurstLength limit for software iSCSI.
For hardware offload solicited data may potentially be handled more
efficiently than unsolicited due to direct data placement.  Or there
can be some unsolicited write buffering limitations.  It may create
situations where FirstBurstLength limit is really useful.

Software driver though has no those factors, having to do memcopy()
any way and having no so hard limit on the temporary storage.  Same
time more active use of unsolicited transfers allows to avoid some
of Ready To Transfer (R2T) PDU round-trip times and processing.

This change effectively doubles from 64KB to 128KB the maximum size
of write command that can be transferred within one link RTT.  Tests
of (64KB, 128KB] QD1 writes mixed with simultaneous QD8 reads over
the same connection, increasing RTT, shows almost double write speed
and half latency, while we should be able to afford few megabytes of
RAM for additional buffering on a target these days.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2021-01-20 23:17:12 -05:00
Mateusz Guzik
6b3a9a0f3d Convert remaining cap_rights_init users to cap_rights_init_one
semantic patch:

@@

expression rights, r;

@@

- cap_rights_init(&rights, r)
+ cap_rights_init_one(&rights, r)
2021-01-12 13:16:10 +00:00
Edward Tomasz Napierala
bce7ee9d41 Drop "All rights reserved" from all my stuff. This includes
Foundation copyrights, approved by emaste@.  It does not include
files which carry other people's copyrights; if you're one
of those people, feel free to make similar change.

Reviewed by:	emaste, imp, gbe (manpages)
Differential Revision:	https://reviews.freebsd.org/D26980
2020-10-28 13:46:11 +00:00
Mateusz Guzik
2140d5b64f iscsi: clean up empty lines in .c and .h files 2020-09-01 21:30:22 +00:00
Alexander Motin
9a4510ac32 Implement zero-copy iSCSI target transmission/read.
Add ICL_NOCOPY flag to icl_pdu_append_data(), specifying that the method
can just reference the data buffer instead of immediately copying it.

Extend the offload KPI with optional PDU queue method, allowing to specify
completion callback, called when all the data referenced by above has been
transferred and won't be accessed any more (the buffers can be freed).

Implement the above functionality in software iSCSI driver using mbufs
with external storage and reference counter.  Note that some NICs (ixl(4))
may keep the mbuf in TX queue for a long time, so CTL has to be ready.

Add optional method to struct ctl_scsiio for buffer reference counting.
Implement it for CTL block backend, allowing to delay free of the struct
ctl_be_block_io and memory it references as needed.  In first reincarnation
of the patch I tried to delay whole I/O as it is done for FibreChannel,
that was cleaner, but due to the above callback delays I had to rewrite
it this way to not leave LUN referenced potentially for hours or more.

All together on sequential read from ZFS ARC this saves about 30% of CPU
time and memory bandwidth by avoiding one of 3 memory copies (the other
two are from ZFS ARC to DMU cache and then from DMU cache to CTL buffers).
On tests with 2x Xeon Silver 4114 this allows to reach full line rate of
100GigE NIC.  Tests with Gold CPUs and two 100GigE NICs are stil TBD,
but expectations to saturate them are pretty high. ;)

Discussed with:	Chelsio
Sponsored by:	iXsystems, Inc.
2020-06-08 20:53:57 +00:00
Alexander Motin
1f29b46c42 Do not try to fill socket send buffer to the last byte.
Setting so_snd.sb_lowat to at least 1/8 of the socket buffer size allows
send thread more actively use PDUs coalescing, that dramatically reduces
TCP lock congestion and number of context switches, when the socket is
full and PDUs are small.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2020-05-22 18:10:46 +00:00
Xin LI
f89d207279 Separate kernel crc32() implementation to its own header (gsb_crc32.h) and
rename the source to gsb_crc32.c.

This is a prerequisite of unifying kernel zlib instances.

PR:		229763
Submitted by:	Yoshihiro Ota <ota at j.email.ne.jp>
Differential Revision:	https://reviews.freebsd.org/D20193
2019-06-17 19:49:08 +00:00
Edward Tomasz Napierala
43ee6e9d7b Add SPDX tags to iscsi(4).
MFC after:	2 weeks
2018-01-24 16:58:26 +00:00
Edward Tomasz Napierala
22d3bb2625 Move the DIAGNOSTIC check for lost iSCSI PDUs from icl_conn_close()
to icl_conn_free().  It's perfectly valid for the counter to be non-zero
in the former.

MFC after:	2 weeks
Sponsored by:	playkey.net
2017-12-09 15:34:40 +00:00
Hans Petter Selasky
9ac7c5a64c Make sure the iSCSI I/O limits are set properly so that the ISCSIDSEND IOCTL
can be used prior to the ISCSIDHANDOFF IOCTL which set the negotiated values.
Else the login PDU will fail when passing the "-r" option to "iscsictl" which
means iSCSI over RDMA instead of TCP/IP.

Discussed with:	np@ and trasz@
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-23 13:57:44 +00:00
Alexander Motin
82f7fa7ae6 Inline some trivial wrapper functions.
MFC after:	2 weeks
2017-03-02 16:14:15 +00:00
Alexander Motin
605703b5df Fix handling of negative sbspace() return values.
I found that at least with Chelsio NICs TOE sockets quite often report
negative sbspace() values.  Using unsigned variable to store it resulted
in attempts to aggregate too much data in one sosend() call, that caused
errors and following connection termination.

MFC after:	2 weeks
2017-02-15 19:46:00 +00:00
Alexander Motin
33d9db92e2 Directly call m_gethdr() instead of m_getm2() for BHS.
All this code is based on assumption that data will be stored in one piece,
and since buffer size if known and fixed, it is easier to hardcode it.

MFC after:	2 weeks
2017-02-14 18:34:25 +00:00
Alexander Motin
875ac6cfac Temporary attach AHS to BHS to calculate header digest.
MFC after:	2 weeks
2017-02-14 18:29:07 +00:00
Alexander Motin
d0d587c787 Do not rely on data alignment after m_pullup().
In general case m_pullup() does not really guarantee any data alignment.
Instead of depenting on side effects caused by data being always copied
out of mbuf cluster (which is probably a bug by itself), always allocate
aligned BHS buffer and read data there directly from socket.

While there, reuse new icl_conn_receive_buf() function to read digests.
The code could probably be even more optimized to aggregate those reads,
but until that done, this is still easier then the way it was before.

MFC after:	2 weeks
2017-02-14 16:33:42 +00:00
Alexander Motin
898fd11f5e Remove M_PKTHDR from m_getm2() in icl_pdu_append_data().
ip_data_mbuf is always appended to ip_bhs_mbuf, so it does not need own
packet header.  This change first avoids allocation/initialization of the
header, and then avoids dropping one when it later gets to socket buffer.

MFC after:	2 weeks
2017-02-13 20:36:28 +00:00
Navdeep Parhar
97b84d344d Make the iSCSI parameter negotiation more flexible.
Decouple the send and receive limits on the amount of data in a single
iSCSI PDU.  MaxRecvDataSegmentLength is declarative, not negotiated, and
is direction-specific so there is no reason for both ends to limit
themselves to the same min(initiator, target) value in both directions.

Allow iSCSI drivers to report their send, receive, first burst, and max
burst limits explicitly instead of using hardcoded values or trying to
derive all of them from the receive limit (which was the only limit
reported by the drivers prior to this change).

Display the send and receive limits separately in the userspace iSCSI
utilities.

Reviewed by:	jpaetzel@ (earlier version), trasz@
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7279
2016-08-25 05:22:53 +00:00
Edward Tomasz Napierala
b891159418 Add mechanism for choosing iSER-capable ICL modules.
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-05-24 08:44:45 +00:00
Edward Tomasz Napierala
7deb68ab2c Provide a way for ICL modules to declare they support PIM_UNMAPPED.
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-05-21 11:10:48 +00:00
Edward Tomasz Napierala
906a424b26 Call the ICL module's handoff method even when using ICL proxy.
The upcoming iSER code uses this.

MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-05-20 17:38:51 +00:00
Edward Tomasz Napierala
f41492b00f Add icl_conn_connect() ICL method, required for iSER.
Obtained from:	Mellanox Technologies (earlier version)
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-05-17 11:10:44 +00:00
Edward Tomasz Napierala
604c023f94 Extend the ICL interface to include the PDU pointer in the task_setup
method.  This is required for upcoming iSER support.

Obtained from:	Mellanox Technologies (earlier version)
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-05-17 08:55:21 +00:00
Pedro F. Giffuni
266078c6db dev/iscsi: minor spelling fixes.
No functional change.

Reviewed by:	trasz
2016-05-03 14:49:49 +00:00
Alexander Motin
5b157f2144 Close some potential races around socket start/close.
There are some reports about panics on ic->ic_socket NULL derefence.
This kind of races is the only way I can imagine it to happen.

MFC after:	2 weeks
2015-05-15 13:36:50 +00:00
Edward Tomasz Napierala
aca050aaaa Remove icl_conn_connected(); was unused.
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2015-04-04 22:11:38 +00:00
Edward Tomasz Napierala
7a03d007cf Extend ICL to add receive offload methods. For software ICL backend
they are no-ops.

MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2015-02-08 19:15:14 +00:00
Edward Tomasz Napierala
d4b195d315 Make output of "iscsictl -v" and "ctladm islist -v" a little prettier
by capitalizing "None".

MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2015-02-08 10:58:25 +00:00
Edward Tomasz Napierala
5aabcd7c4e Tidy up; no functional changes.
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2015-02-07 14:15:17 +00:00
Edward Tomasz Napierala
82babffba9 Make it possible to set (via iscsi.conf(5)) and query (via iscsictl -v)
initiator iSCSI offload.  Pass maximum data segment size supported by
chosen offload module to iscsid(8), and make iscsid(8) not try to negotiate
anything larger than that.

MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2015-02-05 06:37:59 +00:00
Edward Tomasz Napierala
872d2d9276 Use proper module name in MODULE_VERSION().
Sponsored by:	The FreeBSD Foundation
2015-01-31 15:22:45 +00:00
Edward Tomasz Napierala
321b17ec15 Add kobj interface between ICL and the rest of the iSCSI stack.
Review note - icl.c was moved to icl_soft.c.

MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2015-01-31 07:49:50 +00:00