Commit Graph

445 Commits

Author SHA1 Message Date
Enji Cooper
092af585e1 Remove redundant declaration for tcp_dooptions, similar to r302576
netinet/tcp_var.h already defines this function

Differential Revision:	https://reviews.freebsd.org/D7189
MFC after:	1 week
PR:	209920
Reported by:	Mark Millard <markmi@dsl-only.net>
Reviewed by:	np
Tested with:	clang 3.8.0, gcc 4.2.1, gcc 5.3.0
Sponsored by:	EMC / Isilon Storage Division
2016-07-11 17:11:18 +00:00
Navdeep Parhar
a33b046750 cxgbe(4): Add sysctl to display the RSS indirection table size for an
interface.

dev.cxl.<n>.rss_size
dev.vcxl.<n>.rss_size

MFC after:	3 days
2016-07-08 18:13:23 +00:00
Navdeep Parhar
671bf2b8b2 cxgbe(4): Changes to the CPL-handler registration mechanism and code
related to "shared" CPLs.

a) Combine t4_set_tcb_field and t4_set_tcb_field_rpl into a single
function.  Allow callers to direct the response to any iq.  Tidy up
set_ulp_mode_iscsi while there to use names from t4_tcb.h instead of
magic constants.

b) Remove all CPL handler tables from struct adapter.  This reduces its
size by around 2KB.  All handlers are now registered at MOD_LOAD instead
of attach or some kind of initialization/activation.  The registration
functions do not need an adapter parameter any more.

c) Add per-iq handlers to deal with CPLs whose destination cannot be
determined solely from the opcode.  There are 2 such CPLs in use right
now: SET_TCB_RPL and L2T_WRITE_RPL.  The base driver continues to send
filter and L2T_WRITEs over the mgmtq and solicits the reply on fwq.
t4_tom (including the DDP code) now uses the port's ctrlq to send
L2T_WRITEs and SET_TCB_FIELDs and solicits the reply on an ofld_rxq.
fwq and ofld_rxq have different handlers that know what kind of tid to
expect in the reply.  Update t4_write_l2e and callers to to support any
wrq/iq combination.

Approved by:	re@ (kib@)
Sponsored by:	Chelsio Communications
2016-07-05 01:29:24 +00:00
Navdeep Parhar
bf9363d72c cxgbe(4): Avoid a NULL dereference while dumping the L2 table. Entries
used by switching filters that rewrite L2 information do not have any
associated ifnet.

Approved by:	re@ (gjb@)
Sponsored by:	Chelsio Communications
2016-07-01 23:18:49 +00:00
Navdeep Parhar
5e03372b18 cxgbe(4): Do not bring up an interface when IFCAP_TOE is enabled on it.
The interface's queues are functional after VI_INIT_DONE (which is short
of interface-up) and that's all that's needed for t4_tom to communicate
with the chip.

Approved by:	re@ (gjb@)
Sponsored by:	Chelsio Communications
2016-06-29 06:55:30 +00:00
Navdeep Parhar
62291463de cxgbe(4): Merge netmap support from the ncxgbe/ncxl interfaces to the
vcxgbe/vcxl interfaces and retire the 'n' interfaces.  The main
cxgbe/cxl interfaces and tunables related to them are not affected by
any of this and will continue to operate as usual.

The driver used to create an additional 'n' interface for every
cxgbe/cxl interface if "device netmap" was in the kernel.  The 'n'
interface shared the wire with the main interface but was otherwise
autonomous (with its own MAC address, etc.).  It did not have normal
tx/rx but had a specialized netmap-only data path.  r291665 added
another set of virtual interfaces (the 'v' interfaces) to the driver.
These had normal tx/rx but no netmap support.

This revision consolidates the features of both the interfaces into the
'v' interface which now has a normal data path, TOE support, and native
netmap support.  The 'v' interfaces need to be created explicitly with
the hw.cxgbe.num_vis tunable.  This means "device netmap" will not
result in the automatic creation of any virtual interfaces.

The following tunables can be used to override the default number of
queues allocated for each 'v' interface.  nofld* = 0 will disable TOE on
the virtual interface and nnm* = 0 to will disable native netmap
support.

# number of normal NIC queues
hw.cxgbe.ntxq_vi
hw.cxgbe.nrxq_vi

# number of TOE queues
hw.cxgbe.nofldtxq_vi
hw.cxgbe.nofldrxq_vi

# number of netmap queues
hw.cxgbe.nnmtxq_vi
hw.cxgbe.nnmrxq_vi

hw.cxgbe.nnm{t,r}xq{10,1}g tunables have been removed.

--- tl;dr version ---
The workflow for netmap on cxgbe starting with FreeBSD 11 is:
1) "device netmap" in the kernel config.
2) "hw.cxgbe.num_vis=2" in loader.conf.  num_vis > 2 is ok too, you'll
end up with multiple autonomous netmap-capable interfaces for every
port.
3) "dmesg | grep vcxl | grep netmap" to verify that the interface has
netmap queues.
4) Use any of the 'v' interfaces for netmap.  pkt-gen -i vcxl<n>... .
One major improvement is that the netmap interface has a normal data
path as expected.
5) Just ignore the cxl interfaces if you want to use netmap only.  No
need to bring them up.  The vcxl interfaces are completely independent
and everything should just work.
---------------------

Approved by:	re@ (gjb@)
Relnotes:	Yes
Sponsored by:	Chelsio Communications
2016-06-23 02:53:00 +00:00
John Baldwin
b1012d8036 Account for AIO socket operations in thread/process resource usage.
File and disk-backed I/O requests store counts of read/written disk
blocks in each AIO job so that they can be charged to the thread that
completes an AIO request via aio_return() or aio_waitcomplete().  This
change extends AIO jobs to store counts of received/sent messages and
updates socket backends to set these counts accordingly.  Note that
the socket backends are careful to only charge a single messages for
each AIO request even though a single request on a blocking socket might
invoke sosend or soreceive multiple times.  This is to mimic the
resource accounting of synchronous read/write.

Adjust the UNIX socketpair AIO test to verify that the message resource
usage counts update accordingly for aio_read and aio_write.

Approved by:	re (hrs)
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D6911
2016-06-21 22:19:06 +00:00
John Baldwin
ae0b1ccbab Use sbused() instead of sbspace() to avoid signed issues.
Inserting a full mbuf with an external cluster into the socket buffer
resulted in sbspace() returning -MLEN.  However, since sb_hiwat is
unsigned, the -MLEN value was converted to unsigned in comparisons.  As a
result, the socket buffer was never autosized.  Note that sb_lowat is signed
to permit direct comparisons with sbspace(), but sb_hiwat is unsigned.
Follow suit with what tcp_output() does and compare the value of sbused()
with sb_hiwat instead.

Approved by:	re (gjb)
Sponsored by:	Chelsio Communications
2016-06-15 21:08:51 +00:00
John Baldwin
fe0bdd1d2c Move backend-specific fields of kaiocb into a union.
This reduces the size of kaiocb slightly. I've also added some generic
fields that other backends can use in place of the BIO-specific fields.

Change the socket and Chelsio DDP backends to use 'backend3' instead of
abusing _aiocb_private.status directly. This confines the use of
_aiocb_private to the AIO internals in vfs_aio.c.

Reviewed by:	kib (earlier version)
Approved by:	re (gjb)
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D6547
2016-06-15 20:56:45 +00:00
Navdeep Parhar
c4765d2743 cxgbe/t4_tom: Fix inverted assertion in r300895. It is RDMA
connections and not others that are allowed to fail the receive window
check.

Approved by:	re (gjb@)
2016-06-14 21:09:00 +00:00
Navdeep Parhar
390006c70f iw_cxgbe: Make sure that send_abort results in a TCP RST and not a FIN.
Release the hold on ep->com immediately after sending the RST.  This
fixes a bug that sometimes leaves userspace iWARP tools hung when the
user presses ^C.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
Approved by:	re (gjb@)
Sponsored by:	Chelsio Communications
2016-06-14 21:02:36 +00:00
Navdeep Parhar
02f972e8f3 cxgbe(4): Add a sysctl to manage the binding of a txq to a traffic class.
Sponsored by:	Chelsio Communications
2016-06-08 14:15:29 +00:00
Navdeep Parhar
468646f716 cxgbe(4): A couple of fixes to set_sched_queue.
- Validate the scheduling class against the actual limit (which is chip
  specific) instead of a magic number.

- Return an error if an attempt is made to manipulate the tx queues of a
  VI that hasn't been initialized.

Sponsored by:	Chelsio Communications
2016-06-07 07:48:36 +00:00
Navdeep Parhar
0bc01d66b6 cxgbe(4): Provide information about traffic classes in the sysctl mib.
Sponsored by:	Chelsio Communications
2016-06-07 06:42:35 +00:00
Navdeep Parhar
46464b95b0 cxgbe(4): Track the state of the hardware traffic schedulers in the
driver.  This works as long as everyone uses set_sched_class_params
to program them.

Sponsored by:	Chelsio Communications
2016-06-07 00:27:55 +00:00
Navdeep Parhar
cfbe2911c9 cxgbe(4): Break up set_sched_class. Validate the channel number and
min/max rates against their actual limits (which are chip and port
specific) instead of hardcoded constants.

Sponsored by:	Chelsio Communications
2016-06-06 22:51:44 +00:00
Navdeep Parhar
db80d07377 cxgbe(4): Create a reusable struct type for scheduling class parameters.
Sponsored by:	Chelsio Communications
2016-06-06 20:42:46 +00:00
Navdeep Parhar
e6b775f48f iw_cxgbe: Fix panic that occurs when c4iw_ev_handler tries to acquire
comp_handler_lock but c4iw_destroy_cq has already freed the CQ memory
(which is where the lock resides).

Submitted by:	Krishnamraju Eraparaju @ Chelsio
Sponsored by:	Chelsio Communications
2016-06-01 18:46:54 +00:00
Edward Tomasz Napierala
ac3453eba0 Reduce the priority of cxgbei(4) driver, so it doesn't get chosen
by default.  This is a workaround for a too simplistic ICL module
choosing mechanism.  To use it, specify offload in ctl.conf
or iscsi.conf.

This fixes a problem where "kldload cxgbei" wedges the iSCSI stack,
if you don't have a Chelsio card installed, or the endpoints of the
iSCSI session are not reachable through addresses configured
on that interface.

Reviewed by:	np@
MFC after:	1 month
2016-06-01 12:04:04 +00:00
Navdeep Parhar
addd6a52c4 cxgbe/t4_tom: Exempt RDMA connections from a TCP sanity test for now, to
avoid panicking debug kernels.

t4_tom does not keep track of a connection once it switches to ULP mode
iWARP.  If the connection falls out of ULP mode the driver/hardware seq#
etc. are out of sync.  A better fix would be to figure out what the
current seq# are, update the driver's state, and perform all sanity
checks as usual.
2016-05-28 00:38:17 +00:00
Navdeep Parhar
fc511a3302 iw_cxgbe: Plug a lock leak in process_mpa_request().
If the parent is DEAD or connect_request_upcall() fails, the parent
mutex is left locked.  This leads to a hang when process_mpa_request()
is called again for another child of the listening endpoint.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
Obtained from:	upstream iw_cxgb4
Sponsored by:	Chelsio Communications
2016-05-27 23:44:33 +00:00
Navdeep Parhar
69b913d68f iw_cxgbe: Use vmem(9) to manage PBL and RQT allocations.
Submitted by:	Krishnamraju Eraparaju at Chelsio
Reviewed by:	Steve Wise
Sponsored by:	Chelsio Communications
2016-05-27 21:26:26 +00:00
Hans Petter Selasky
fa201e28fc Prepare for activation of LinuxKPI module parameters as read-only
tunable SYSCTL's. Linux module parameters are associated with the
module they belong to. FreeBSD does not share this concept of a parent
module. Instead add macros which define the prefix to use for the
module parameters in the LinuxKPI consumers.

While at it convert all "bool" LinuxKPI module parameters to "byte"
type, because we don't have a "bool" type of SYSCTL in FreeBSD.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-05-25 12:03:21 +00:00
Edward Tomasz Napierala
b891159418 Add mechanism for choosing iSER-capable ICL modules.
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-05-24 08:44:45 +00:00
Edward Tomasz Napierala
7deb68ab2c Provide a way for ICL modules to declare they support PIM_UNMAPPED.
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-05-21 11:10:48 +00:00
John Baldwin
1081d2766c Move the KTR for the update of ddp_active_id on each completion under
VERBOSE_TRACES.

Sponsored by:	Chelsio Communications
2016-05-20 23:08:22 +00:00
Edward Tomasz Napierala
604c023f94 Extend the ICL interface to include the PDU pointer in the task_setup
method.  This is required for upcoming iSER support.

Obtained from:	Mellanox Technologies (earlier version)
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-05-17 08:55:21 +00:00
Navdeep Parhar
96807c23ce cxgbe(4): Update T5 and T4 firmwares to 1.15.37.0.
These firmwares were obtained from the "Chelsio T5/T4 Unified Wire
v2.12.0.3 for Linux" release.  Changes since 1.14.4.0 (which is the
firmware in -STABLE branches) are in the "Release Notes" accompanying
the Unified Wire release and are copy-pasted here as well.

22.1. T5 Firmware
+++++++++++++++++++++++++++++++++

Version : 1.15.37.0
Date    : 04/27/2016
================================================================================

FIXES
-----

BASE:
 - Fixed an issue in FW_RSS_VI_CONFIG_CMD handling where the default ingress
   queue was ignored.
 - Fixed an issue where adapter failed to load fw by adjusting DRAM frequency.
 - Fixed an issue in watchdog which was causing VM bring-up failure after reboot.
 - Fixed 40G link failures with some switches when auto-negotiation enabled.
 - Fixed to improve on link bring-up time.
 - Per port buffer groups size doubled to improve performance.
 - Fixed an issue where bogus d3hot bits were set causing traffic stall.
 - Fixed an issue where sometimes adapter was not seen after reboot.
 - Fixed an issue where iWARP was crashing in conjunction with traffic management.
 - Fixed an issue where link failed to come up after removing twinax cable and
   inserting optical module.

ETH
 - Fixed a link flap issue on T580-CR.

OFLD
 - Fixed a potential iSCSI data corruption issue by disabling RxFragEn flag.

FOiSCSI
 - Fixed an issue in recovery path where connection was getting closed before
   recovery processing was done.
 - Fixed an issue in TCP port reuse.
 - Fixed an issue in recovery path when large number (>64) of iSCSI connections
   were in use.
 - Returned ENETUNREACH if IP was not been provisioned yet and driver tried to
   use given inerface.
 - Fixed an issue where fw was sending ENETUNREACH event for normal tcp
   disconnection.

DCBX
 - Fixed an issue where iscsi tlv is sent incorrectly to host. (DCBX CEE)
 - Fixed an issue where apply bit set for APP id was affecting the ETS and PFC
  settings.(DCBX IEEE)
 - Fixed an issue where app priority values are not handled correctly in fw.
  (DCBX IEEE)
 - Fixed an issue where enable/disable dcbx can cause crash. (DCBX CEE,DCBX IEEE)

FOFCoE
 - Removed BB6 support.

ENHANCEMENTS
------------

BASE:
 - Added new interface to program DCA settings in SGE contexts; allow 32-byte
   IQE size
 - Added PTP interface fw_ptp_ts to support PTP Frequeny and Offset adjustment.
 - Added MPS raw interface.

ETH:
 - New mailbox command FW_DCB_IEEE_CMD api added for IEEE dcbx.

OFLD:
 - WR opcode is returned to host in cqe error response.

22.2. T4 Firmware
+++++++++++++++++

Version : 1.15.37.0
Date    : 04/27/2016
================================================================================

FIXES
-----

BASE:
 - Fixed an issue in FW_RSS_VI_CONFIG_CMD handling where default ingress queue
   was ignored.
 - Fixed an issue in watchdog which was causing VM bring-up failure after reboot.
 - Per port buffer groups size doubled to improve performance.
 - Fixed an issue where iWARP was crashing in conjunction with traffic management.

FOiSCSI:
 - Fixed an issue in recovery path where connection was getting closed before
   recovery processing was done.
 - Fixed an issue in TCP port reuse.
 - Fixed an issue in recovery path when large number (>64) of iSCSI connections
   were in use.
 - Returned ENETUNREACH if IP had not been provisioned yet and driver tried to
   use given inerface.

DCBX
 - Fixed an issue where iscsi tlv is sent incorrectly to host.(DCBX CEE)
 - Fixed an issue where enable/disable dcbx can cause crash in firmware.(DCBX CEE)

FOiSCSI
 - Fixes an issue where fw was sending ENETUNREACH event for normal tcp
   disconnection.

FOFCoE
 - Removed BB6 support.

ENHANCEMENTS
------------

BASE:
 - Added MPS raw interface.

ETH:
 - New mailbox command FW_DCB_IEEE_CMD api added for IEEE dcbx.
================================================================================

Obtained from:	Chelsio Communications
MFC after:	6 weeks
Relnotes:	yes
Sponsored by:	Chelsio Communications
2016-05-13 17:38:59 +00:00
Hans Petter Selasky
30de20448d The idr_for_each() function is now part of the LinuxKPI. Use the
LinuxKPI's idr_for_each() function instead of the local one to avoid
compilation issues.

Discussed with:	np @
MFC after:	1 week
2016-05-11 17:17:48 +00:00
John Baldwin
f7bc393477 Forward declare 'struct cpl_set_tcb_rpl' before including t4_tom.h. 2016-05-10 03:32:22 +00:00
John Baldwin
d3cd2df1f2 Forward declare 'struct cpl_set_tcb_rpl' before including t4_tom.h.
Other structures needed by prototypes in t4_tom.h are explicitly
declared in this file, so adding the prototype here seems most
consistent with existing code.
2016-05-09 20:01:34 +00:00
John Baldwin
dc9643853d Use DDP to implement zerocopy TCP receive with aio_read().
Chelsio's TCP offload engine supports direct DMA of received TCP payload
into wired user buffers.  This feature is known as Direct-Data Placement.
However, to scale well the adapter needs to prepare buffers for DDP
before data arrives.  aio_read() is more amenable to this requirement than
read() as applications often call read() only after data is available in
the socket buffer.

When DDP is enabled, TOE sockets use the recently added pru_aio_queue
protocol hook to claim aio_read(2) requests instead of letting them use
the default AIO socket logic.  The DDP feature supports scheduling DMA
to two buffers at a time so that the second buffer is ready for use
after the first buffer is filled.  The aio/DDP code optimizes the case
of an application ping-ponging between two buffers (similar to the
zero-copy bpf(4) code) by keeping the two most recently used AIO buffers
wired.  If a buffer is reused, the aio/DDP code is able to reuse the
vm_page_t array as well as page pod mappings (a kind of MMU mapping the
Chelsio NIC uses to describe user buffers).  The generation of the
vmspace of the calling process is used in conjunction with the user
buffer's address and length to determine if a user buffer matches a
previously used buffer.  If an application queues a buffer for AIO that
does not match a previously used buffer then the least recently used
buffer is unwired before the new buffer is wired.  This ensures that no
more than two user buffers per socket are ever wired.

Note that this feature is best suited to applications sending a steady
stream of data vs short bursts of traffic.

Discussed with:	np
Relnotes:	yes
Sponsored by:	Chelsio Communications
2016-05-07 00:33:35 +00:00
John Baldwin
826c2372c5 Set the correct vnet in TOE event handlers.
Differential Revision:	https://reviews.freebsd.org/D6152
2016-05-06 23:49:10 +00:00
Pedro F. Giffuni
612226d756 Revert r298955 for the cxgbe firmware.
These files have checksums that are none of my business.

Requested by:	np
2016-05-03 11:49:29 +00:00
Pedro F. Giffuni
453130d9bf sys/dev: minor spelling fixes.
Most affect comments, very few have user-visible effects.
2016-05-03 03:41:25 +00:00
Pedro F. Giffuni
4ed3c0e713 sys: Make use of our rounddown() macro when sys/param.h is available.
No functional change.
2016-04-30 14:41:18 +00:00
Pedro F. Giffuni
b66bb393f2 Cleanup redundant parenthesis from existing howmany()/roundup() macro uses. 2016-04-22 16:57:42 +00:00
Pedro F. Giffuni
d9c9c81c08 sys: use our roundup2/rounddown2() macros when param.h is available.
rounddown2 tends to produce longer lines than the original code
and when the code has a high indentation level it was not really
advantageous to do the replacement.

This tries to strike a balance between readability using the macros
and flexibility of having the expressions, so not everything is
converted.
2016-04-21 19:57:40 +00:00
Navdeep Parhar
cda2ab0e7a cxgbe(4): Always dispatch all work requests that have been written to the
descriptor ring before leaving drain_wrq_wr_list.
2016-04-12 22:11:29 +00:00
Navdeep Parhar
fdb562a960 cxgbe(4): Always read the entire mailbox into the reply buffer.
The size of the reply can be different from the size of the command in
case a debug firmware asserts.  fw_asrt() needs the entire reply in
order to decode the location of the assert.

Sponsored by:   Chelsio Communications
2016-04-12 21:17:19 +00:00
John Baldwin
b3b4ccc70a Rename the 'M_B' macro in t4_regs.h to 'CXGBE_M_B'.
This fixes a conflict with the M_B macro in powerpc's
<machine/db_machdep.h> exposed by the recent addition of DDB commands
to the cxgbe driver.

Discussed with:	np
Reported by:	bz
Sponsored by:	Chelsio Communications
2016-04-12 17:44:34 +00:00
Navdeep Parhar
2829f76d1a cxgbe(4): Provide an explicit value for nqpcq in the firmware
configuration file.
2016-04-11 02:18:59 +00:00
Pedro F. Giffuni
74b8d63dcc Cleanup unnecessary semicolons from the kernel.
Found with devel/coccinelle.
2016-04-10 23:07:00 +00:00
John Baldwin
307734b6d4 Add a 'show t4 devlog <nexus>' DDB command.
This command displays the adapter's firmware device log similar to the
dev.<nexus>.misc.devlog sysctl.

Sponsored by:	Chelsio Communications
2016-04-10 06:19:26 +00:00
John Baldwin
113f2316c6 Add a 'show t4 tcb <nexus> <tid>' command to dump a TCB from DDB.
This allows the contents of a TCB to be extracted from a T4/T5 card in
DDB after a panic.
2016-04-10 05:06:58 +00:00
Sepherosa Ziehau
6dd38b8716 tcp/lro: Use tcp_lro_flush_all in device drivers to avoid code duplication
And factor out tcp_lro_rx_done, which deduplicates the same logic with
netinet/tcp_lro.c

Reviewed by:	gallatin (1st version), hps, zbb, np, Dexuan Cui <decui microsoft com>
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5725
2016-04-01 06:28:33 +00:00
John Baldwin
80f3b01958 Remove #ifdef's from various structures used in the cxgbe/cxl driver.
This provides a constant ABI and layout for these structures (especially
struct adapter) avoiding some foot shooting.

Discussed with:	np
Sponsored by:	Chelsio Communications
2016-03-31 18:36:50 +00:00
Navdeep Parhar
79bb86ac2e cxgbe/iw_cxgbe: Fix for stray "start_ep_timer timer already started!"
messages.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
Sponsored by:	Chelsio Communications
2016-03-29 00:10:47 +00:00
Navdeep Parhar
78552b23a5 cxgbe(4): Be consistent and call ETHER_BPF_MTAP before writing anything
to the descriptor ring no matter what path the frame takes within the
driver's tx.
2016-03-22 18:56:23 +00:00
Navdeep Parhar
8d814a458c iw_cxgbe/libcxgb4: Pull in many applicable fixes from the upstream Linux
iWARP driver and userspace library to the FreeBSD iw_cxgbe and libcxgb4.

This commit includes internal changesets 6785 8111 8149 8478 8617 8648
8650 9110 9143 9440 9511 9894 10164 10261 10450 10980 10981 10982 11730
11792 12218 12220 12222 12223 12225 12226 12227 12228 12229 12654.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
Sponsored by:	Chelsio Communications
2016-03-21 00:29:45 +00:00