Commit Graph

683 Commits

Author SHA1 Message Date
Navdeep Parhar
e8257dbe43 cxgbe(4): Attach to the 2x25 debug card. This is for internal use only.
MFC after:	3 days
2017-01-10 01:36:50 +00:00
Navdeep Parhar
b91f227f60 cxgbe(4): Refresh t4_msg.h, mainly for definitions related to the crypto
engine.

Obtained from:	Chelsio Communications
MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2017-01-10 01:30:41 +00:00
Navdeep Parhar
87b027ba69 cxgbe(4): Enable automatic cidx flush for all control queues.
MFC after:	3 days
2017-01-09 22:20:09 +00:00
Navdeep Parhar
f50c49cca7 cxgbe(4): The wraparound logic in start_wrq_wr() should not get involved
in work requests that end at the end of the descriptor ring, even though
the pidx wraps around to 0.

MFC after:	3 days
2017-01-09 22:18:08 +00:00
Navdeep Parhar
d663d5ca23 cxgbe/t4_tom: Fix tid accounting. An offloaded IPv6 connection uses 2
tids, not 1, in the hardware.

MFC after:	3 days
Sponsored by:	Chelsio Communications
2017-01-07 20:26:19 +00:00
Navdeep Parhar
e4612db2cd Fix comment in t4_tom. No functional change.
MFC after:	3 days
2017-01-07 00:08:55 +00:00
Navdeep Parhar
c88fa71928 cxgbe(4): Update T4, T5 and T6 firmwares to 1.16.26.0. Changelog for
all public firmwares for all chips since the last release (1.15.37.0)
follows (it's a straight copy-paste from the Release Notes for the
12/30/2016 Unified Wire release on Chelsio's website).

T6 Firmware
++++++++++++

Version : 1.16.26.0
Date    : 12/28/2016

Fixes
-----

BASE:
- Max number of egress and control queues adjusted to accomodate
  co-processor mode queues.
- Fixed intermittent DDR3/4 ECC errors.
- Fixed a traffic stall when ETS BW is configured as 0%.
- Max number of ethctrl queue in VF set to 1.

ETH:
- Added a new config file option 'speed' under port section to set the
  port speed.  Use only when auto negotiation is off.
- FEC option removed from firmware config file. cxgbtool can be used to
  change the fec setting.
- CPL_TX_TNL_LSO cpl handling added in ETH_TX_PKT_VM handler. This fixes
  large tunnel tcp packet support for VxLAN.

Version : 1.16.22.0
Date    : 12/05/2016

Fixes
-----

BASE:
- fw_port_type updated in fw API to match kernel.org definitions.
- Saved power by disaling unused MAC lanes.
- Configures correct power bin.
- Enhanced DDR4 performance.
- Enabled interrupts.
- Fixed an issue where filter rule for 'unicast hash' is not working.

ETH:
- Disabled auto negotiation by default because most of 100G switches do
  not support AN as of today.
- Fixed flow control not getting disabled problem.
- Fixed an issue where port0 doesn't come up sometimes.
- Fixed 10G link not coming up issue.
- Fixed an issue with promiscuous mode when dcbx disabled.

OFLD:
- Fixed a connection stuck issue when abort is received during out of tx
  pages backpressure.

ENHANCEMENTS
------------

BASE:
- Added inline TLS mode support.

Version : 1.16.12.0
Date    : 11/11/2016

ENHANCEMENTS
------------

BASE:
- Added T6 support.
- Added T6 1G/10G/25G/40G/100G link speeds.
- Added T6 co-processor mode crypto support.
- Added facility to increase link AN+AEC timeout.

OFLD:
- Added support for all T5 offload protocols except FCoE.

iSCSI:
- iscsi completion moderation enabled.

=======================================================================

T5 Firmware
++++++++++++

Version : 1.16.26.0
Date    : 12/28/2016

FIXES
-----

BASE:
- Max number of ethctrl queue in VF set to 1.

Version : 1.16.22.0
Date    : 12/05/2016

FIXES
-----

BASE:
- Fixed an issue where filter rule for 'unicast hash' is not working.

ETH:
- Fixed an issue with promiscuous mode when dcbx disabled.

ENHANCEMENTS
------------

ETH:
- Added 40G-KR support.

Version : 1.16.12.0
Date    : 11/11/2016

FIXES
-----

BASE:
- Fixed multiple issues related with VFs FLR processing.
- Fixed channel assignment based on number of ports in adapter.
- Fixed a crash when VM having PF assigned as passthrough mode is
  rebooted.
- Handled 2nd HELLO command from the same PF without seeing BYE from the
  same PF and if that is the only PF.
- A warning is printed in firmware log if PCI-E cookie generation is
  enabled in serial initialization file.
- Fixed multiple issues related with Filtering.
- Enabled DSGL memory write for iscsi and rdma.
- Added new FW_PARAMS_CMD[DEV] options to retrieve Serial Configuration
  and VPD version numbers.
- Fixed an issue where LVDS output was not getting enabled using vpd.

DCBX:
- Fixed DCBX CEE Incorrect class to pririty mapping.
- Fixed incorrect interpretation of DCBX IEEE PFC.

ETH:
- Adjusted the link related delay timings according to the QSFP spec.
- Improved 40G link bringup time with few switches.

OFLD:
- Do not reserve qp/cq if rdma capability is not enabled.
- Fixed an issue where approx 1600+ TOE connections were causing a
  firmware fatal error.

FOiSCSI:
- Fixed an issue where unloading foiscsi driver causes mailbox timeout.

ENHANCEMENTS
------------

BASE:
- Added 10G KR/KX support.
- Added T540-BT adapter support.
- Added 4 new rss key modes for PFs and VFs.

OFLD:
- Added new WR FW_RI_FR_NSMR_TPTE_WR to improve fast MR write
  performance in RDMA.

Version : 1.16.5.0
Date    : 10/26/2016

FIXES
-----

BASE:
- Fixed multiple issues where FLR from multiple VFs can cause firmware
  crash.
- Fixed channel assignment based on number of ports in adapter.
- Fixed the HELLO command master force api to handle the 2nd HELLO
  correctly without getting BYE from the PF driver.
- Added facility to retrieve Serial configuration and VPD version. Two
  new FW_PARAMS_CMD[DEV] options added to retrieve these values.
- Fixed multiple issues where FLR from multiple VFs are not completing.
- Added new RSS hash secret key modes.
- Fixed an issue where LVDS output was not getting enabled using vpd.

DCBX:
- Fixed an issue where iscsi tlv is sent incorrectly to host (DCBX CEE).
- Fixed an issue where app priority values are not handled correctly
  in fw (DCBX IEEE).

ETH:
- Adjusts the link related delay timings according to the QSFP spec.
- Changed 2.5G mac speed bit to 25G mac speed bit in fw API.
- Improvement in 40G link bringup time with few switches.

OFLD:
- Do not reserve qp/cq if rdma capability is not enabled.
- Fixed an issue where approx 1600+ TOE connections were causing a
  firmware fatal error.
- Fixed DSGL memory write in T5. Now iwarp and iscsi can use DSGL to do
  memory write.
- Fixed multiple issues in hash filter mode where incorrect protocol
  mask was getting used and affecting hash filter functionality.
- New fastpath WR FW_RI_FR_NSMR_TPTE_WR (with fully populated TPTE) is
  added for small REG_MR operations.

FOiSCSI:
- Fixed an issue in foiscsi recovery path.
- Fixed an issue where foiscsi (in VM in PCIE passthrough mode) didn't
  come up after VM FLR.

ENHANCEMENTS
------------

ETH:
- Implemented 1G/10G KR/KX ability.
- Implemented T540-BT adapter support.

=======================================================================

T4 Firmware
+++++++++++

Version : 1.16.12.0
Date    : 11/11/2016

FIXES
-----

BASE:
- Fixed an issue where reading temperature sesors using ldst command
  causes mailbox timeout.
- Added new FW_PARAMS_CMD[DEV] options to retrieve Serial Configuration
  and VPD version numbers.

ETH:
- Fixed DCBX CEE Incorrect class to pririty mapping.

FOiSCSI:
- Fixed an issue where unloading foiscsi driver causes mailbox timeout.

MFC after:	3 days
Sponsored by:	Chelsio Communications
2017-01-03 22:05:07 +00:00
Navdeep Parhar
358bca3bc6 cxgbe(4): Updates to link configuration.
- Update struct link_settings and associated shared code.

- Add tunables to control FEC and autonegotiation.  All ports inherit
  these values as their initial settings.
  hw.cxgbe.fec
  hw.cxgbe.autoneg

- Add per-port sysctls to control FEC and autonegotiation.  These can be
  modified at any time.
  dev.<port>.<n>.fec
  dev.<port>.<n>.autoneg

MFC after:	3 days
Sponsored by:	Chelsio Communications
2016-12-30 08:59:49 +00:00
Navdeep Parhar
5a8b662ea6 cxgbe(4): Fix typo in an unused macro.
MFC after:	3 days
Sponsored by:	Chelsio Communications
2016-12-16 06:30:07 +00:00
Navdeep Parhar
07b5fdfd01 cxgbe(4): Changes to the default T6 firmware configuration file.
- Disable features that are not supported or not used on FreeBSD.
- Increase the RSS table slice per interface.
- Increase the share of the TCAM reserved for filtering.

MFH:		2 weeks
Sponsored by:	Chelsio Communications
2016-12-16 06:25:51 +00:00
Navdeep Parhar
1de8c69de7 cxgbe(4): Deal with compressed error vectors.
MFC after:	3 days
Sponsored by:	Chelsio Communications
2016-12-15 02:05:29 +00:00
Navdeep Parhar
ca276276f1 cxgbe(4): Fix the tid range shown for T6 cards in misc.tids.
MFC after:	3 days
2016-12-14 07:36:36 +00:00
Navdeep Parhar
1521ca71d3 cxgbe(4): Retire t4_bus_space_read_8 and t4_bus_space_write_8.
MFC after:	3 days
Sponsored by:	Chelsio Communications
2016-12-13 20:35:57 +00:00
Navdeep Parhar
b8c1ffef80 cxgbe(4): netmap does not set IFCAP_NETMAP in an ifnet's if_capabilities
any more (since r307394).  Do it in the driver instead.

MFC after:	1 week
2016-12-09 02:21:27 +00:00
Navdeep Parhar
aa7792f2f6 cxgbe(4): unsigned short isn't large enough to store link speed (which
is in Mbps) for 100Gbps links.

MFC after:	3 days
2016-12-07 04:23:08 +00:00
Navdeep Parhar
3cbaf64f2e cxgbe(4): Update firmwares from version 1.16.12.0 to 1.16.22.0.
Obtained from:	Chelsio Communications
MFC after:	3 days
Sponsored by:	Chelsio Communications
2016-12-06 12:43:07 +00:00
Navdeep Parhar
a10443e8ba cxgbe(4): Include firmware for T6 cards in the driver. Update all
firmwares to 1.16.12.0.

Obtained from:	Chelsio Communications
MFC after:	3 days
Sponsored by:	Chelsio Communications
2016-11-30 00:26:35 +00:00
Navdeep Parhar
041e5aed63 cxgbe(4): Accurate statistics for all chip settings.
There are 4 independent knobs in T5+ chips to include or exclude PAUSE
frames from the "total frames" and "multicast frames" counters in either
direction.  This change lets the driver deal with any combination of
these settings.
2016-10-28 23:01:11 +00:00
Navdeep Parhar
77e9044c47 cxgbe(4): Fix bug in the calculation of the number of physically
contiguous regions in an mbuf chain.

If the payload of an mbuf ends at a page boundary count_mbuf_nsegs would
incorrectly consider the next mbuf's payload physically contiguous based
solely on a KVA comparison.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2016-10-24 19:09:56 +00:00
Navdeep Parhar
aaa4ddd9d0 cxgbe(4): Dump any mailbox command that times out. 2016-10-22 00:48:58 +00:00
Navdeep Parhar
b806e571bf cxgbe(4): Adjust whitespace to line up the column titles in cim_qcfg
with the values displayed.
2016-10-17 20:57:54 +00:00
Navdeep Parhar
721f5406c8 cxgbe(4): Allow the interface MTU to be set as high as the actual
hardware limit.

Submitted by:	jpaetzel@
Differential Revision:	https://reviews.freebsd.org/D8237
2016-10-13 19:40:21 +00:00
Navdeep Parhar
35b5ef914c cxgbe(4): Add an ioctl to copy a firmware config file to the card's flash. 2016-10-07 19:02:39 +00:00
Navdeep Parhar
d4d953bf37 cxgbe(4): Fix whitespace in the pm_stats display. 2016-10-06 21:25:17 +00:00
Navdeep Parhar
4e6b9efc86 cxgbe(4): Claim the T6 -DBG card. 2016-09-30 00:16:54 +00:00
Eitan Adler
eb1c1a7f24 Remove a a duplicated word. 2016-09-29 13:59:14 +00:00
Navdeep Parhar
788f3c06f6 cxgbe(4): Use the port's top speed to figure out whether it is "high
speed" or not (for the purpose of calculating the number of queues etc.)
This does the right thing for 25Gbps and 100Gbps ports.
2016-09-24 19:03:05 +00:00
Navdeep Parhar
d44268d135 cxgbe(4): Support SIOGIFXMEDIA so that ifconfig displays correct media
for 25Gbps and 100Gbps ports.   This should have been part of r305713,
which is when the driver first started reporting extended media types.
2016-09-24 13:23:47 +00:00
Navdeep Parhar
aa93b99aa0 cxgbe(4): Make the location/length of all descriptor rings available in
the sysctl MIB.
2016-09-23 20:03:28 +00:00
Navdeep Parhar
3cdfcb51aa cxgbe(4): Fix netmap with T6, which doesn't encapsulate SGE_EGR_UPDATE
message inside a FW_MSG.  The base NIC already deals with updates in
either form.

Sponsored by:	Chelsio Communications
2016-09-23 17:24:06 +00:00
Navdeep Parhar
de58013128 cxgbe(4): Fix the output of the "tids" sysctl on T6. 2016-09-22 21:19:25 +00:00
Navdeep Parhar
d4759c89e5 cxgbe(4): Catch up with the different layout of WHOAMI in T6.
Note that the code moved below t4_prep_adapter() as part of this change
because now it needs a working chip_id().
2016-09-22 18:47:07 +00:00
Navdeep Parhar
8c0ca00b72 cxgbe(4): Setup congestion response for T6 rx queues. 2016-09-21 00:50:22 +00:00
Navdeep Parhar
6a0cae68b4 cxgbe(4): Show wcwr_stats for T6 cards. 2016-09-21 00:46:08 +00:00
Navdeep Parhar
0459a175eb cxgbe(4): Fixes to wrq stats.
- Increment tx_wrs_copied in the correct place.
- Add tx_wrs_sspace to the sysctl MIB.

Sponsored by:	Chelsio Communications
2016-09-19 17:16:51 +00:00
Navdeep Parhar
efe00fd923 cxgbe/t4_tom: Update the active/passive open code to support T6. Data
path works as-is.

Sponsored by:	Chelsio Communications
2016-09-17 23:08:49 +00:00
Navdeep Parhar
b0c554c3a5 cxgbe/t4_tom: The SMAC entry for a VI is at a different location in the T6.
Sponsored by:	Chelsio Communications
2016-09-17 22:13:03 +00:00
Navdeep Parhar
e6b81479f9 cxgbe(4): Attach to cards with the Terminator 6 ASIC. T6 cards will
come up as 't6nex' nexus devices with 'cc' ports hanging off them.

The T6 firmware and configuration files will be added as soon as they
are released.  For now the driver will try to work with whatever
firmware and configuration is on the card's flash.

Sponsored by:	Chelsio Communications
2016-09-16 00:08:37 +00:00
Navdeep Parhar
83a202cac5 Whitespace nits. 2016-09-15 22:31:49 +00:00
Navdeep Parhar
97f2919d54 cxgbe(4): Use the interface's viid to calculate the PF/VF/VFValid fields
to use in tx work requests.
2016-09-15 08:30:47 +00:00
John Baldwin
1fedfdd5c1 Remove explicit device_verbose() from the t4iov driver detach routine
now that this case is handled generically.
2016-09-12 18:07:06 +00:00
Navdeep Parhar
4cf3aa135b cxgbe(4): Catch up with the rename of tlscaps -> cryptocaps. TLS is one
of the capabilities of the crypto engine in T6.

Sponsored by:	Chelsio Communications
2016-09-12 00:15:40 +00:00
Navdeep Parhar
9113e53d54 cxgbe(4): Add support for additional port types and link speeds.
Sponsored by:	Chelsio Communications.
2016-09-11 23:08:57 +00:00
Navdeep Parhar
769ef07a38 cxgbe(4): Rename the debug_flags driver tunable/sysctl to dflags.
Tunables that end with _flags are special.

Sponsored by:	Chelsio Communications
2016-09-11 18:05:37 +00:00
Navdeep Parhar
82c1d6b762 cxgbe(4): Deal with the slightly different SGE_STAT_CFG in T6.
Sponsored by:	Chelsio Communications
2016-09-11 17:57:53 +00:00
Navdeep Parhar
ed7e5640a5 cxgbe(4): Use smaller min/max bursts for fl descriptors with a T6.
Sponsored by:	Chelsio Communications
2016-09-11 17:51:17 +00:00
Navdeep Parhar
0dbc6cfd75 cxgbe(4): Update the pad_boundary calculation for T6, which has a
different range of boundaries.

Sponsored by:	Chelsio Communications
2016-09-11 17:22:54 +00:00
Navdeep Parhar
472a6004cf cxgbe(4): Use correct macro for header length with T6 ASICs. This
affects the transmit of the VF driver only.

Sponsored by:	Chelsio Communications
2016-09-11 16:11:51 +00:00
Navdeep Parhar
774168be39 cxgbe(4): Set up fl_starve_threshold2 accurately for T6.
Sponsored by:	Chelsio Communications
2016-09-11 16:06:17 +00:00
Navdeep Parhar
3aea32935c cxgbe(4): Avoid a NULL dereference in the clearstats ioctl handler.
Port softc's are not initialized when the adapter is in recovery mode.
2016-09-09 17:15:16 +00:00
Navdeep Parhar
9e7cb06c17 cxgbe(4): Do not prescreen frames before attempting LRO.
Sponsored by:	Chelsio Communications
2016-09-09 07:34:14 +00:00
John Baldwin
6af45170c1 Chelsio T4/T5 VF driver.
The cxgbev/cxlv driver supports Virtual Function devices for Chelsio
T4 and T4 adapters.  The VF devices share most of their code with the
existing PF4 driver (cxgbe/cxl) and as such the VF device driver
currently depends on the PF4 driver.

Similar to the cxgbe/cxl drivers, the VF driver includes a t4vf/t5vf
PCI device driver that attaches to the VF device.  It then creates
child cxgbev/cxlv devices representing ports assigned to the VF.
By default, the PF driver assigns a single port to each VF.

t4vf_hw.c contains VF-specific routines from the shared code used to
fetch VF-specific parameters from the firmware.

t4_vf.c contains the VF-specific PCI device driver and includes its
own attach routine.

VF devices are required to use a different firmware request when
transmitting packets (which in turn requires a different CPL message
to encapsulate messages).  This alternate firmware request does not
permit chaining multiple packets in a single message, so each packet
results in a firmware request.  In addition, the different CPL message
requires more detailed information when enabling hardware checksums,
so parse_pkt() on VF devices must examine L2 and L3 headers for all
packets (not just TSO packets) for VF devices.  Finally, L2 checksums
on non-UDP/non-TCP packets do not work reliably (the firmware trashes
the IPv4 fragment field), so IPv4 checksums for such packets are
calculated in software.

Most of the other changes in the non-VF-specific code are to expose
various variables and functions private to the PF driver so that they
can be used by the VF driver.

Note that a limited subset of cxgbetool functions are supported on VF
devices including register dumps, scheduler classes, and clearing of
statistics.  In addition, TOE is not supported on VF devices, only for
the PF interfaces.

Reviewed by:	np
MFC after:	2 months
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7599
2016-09-07 18:13:57 +00:00
John Baldwin
e06ab612d2 Don't break out of the m_advance() loop if len drops to zero.
If a packet contains the Ethernet header (14 bytes) in the first mbuf
and the payload (IP + UDP + data) in the second mbuf, then the attempt
to fetch the l3hdr will return a NULL pointer.  The first loop iteration
will drop len to zero and exit the loop without setting 'p'.  However,
the desired data is at the start of the second mbuf, so the correct
behavior is to loop around and let the conditional set 'p' to m_data of
the next mbuf (and leave offset as 0).

Reviewed by:	np
Sponsored by:	Chelsio Communications
2016-09-07 18:08:43 +00:00
Navdeep Parhar
5aaa3bc3b9 cxgbe/t4_tom: toepcb should be all-zero on allocation because the code
that cleans up on failure assumes that non-NULL values indicate
initialized items.

Sponsored by:	Chelsio Communications
2016-09-05 19:37:47 +00:00
Navdeep Parhar
5a63431265 Use correct CTR<n> variant. 2016-09-03 18:54:26 +00:00
Navdeep Parhar
af4251f3a1 cxgbe/cxgbei: Provide a knob to set the DDP threshold for iSCSI
transfers.

The Initiator and Target both perform zero copy receive for transfers
greater than or equal to this threshold.

Sponsored by:	Chelsio Communications
2016-09-02 00:21:24 +00:00
Navdeep Parhar
22536e1556 cxgbe/cxgbei: Count various events related to iSCSI operation and make
these counters available in the sysctl MIB.

Sponsored by:	Chelsio Communications
2016-09-01 23:58:36 +00:00
Navdeep Parhar
fec7f83342 cxgbe/cxgbei: Minor changes in the iSCSI CPL handlers.
- Use m_copydata instead of bcopy.
- Add new assertions.
- Log some more information.

Sponsored by:	Chelsio Communications
2016-09-01 22:40:55 +00:00
Navdeep Parhar
7cba15b16e cxgbe/cxgbei: Retire all DDP related code from cxgbei and switch to
routines available in t4_tom to manage the iSCSI DDP page pod region.

This adds the ability to use multiple DDP page sizes to the iSCSI
driver, among other improvements.

Sponsored by:	Chelsio Communications
2016-09-01 20:43:01 +00:00
Navdeep Parhar
a9feb2cdbb cxgbe/t4_tom: Two new routines to allocate and write page pods for a
buffer in the kernel's address space.
2016-09-01 00:51:59 +00:00
Navdeep Parhar
968267fdb8 cxgbe/t4_tom: Add general purpose routines to deal with page pod regions
and allocations within them.  Switch to these routines to manage the TOE
DDP region.

Sponsored by:	Chelsio Communications
2016-08-31 23:23:46 +00:00
John Baldwin
bc32f05443 Use device_verbose() to undo device_quiet() when detaching from t[45]iovX.
The device quiet flag is not automatically reset on detach, so it is
inherited by other device drivers (e.g. when switching a device driver
over to ppt for PCI pass through).  Cope with this behavior by explicitly
marking the device verbose during detach so that the next driver can make
its own decision.

Sponsored by:	Chelsio Communications
2016-08-29 22:47:14 +00:00
Navdeep Parhar
e25621e5ea cxgbe(4): Provide more details about the card in the sysctl MIB.
dev.t5nex.0.%desc: Chelsio T580-CR
dev.t5nex.0.hw_revision: 1
dev.t5nex.0.sn: PT13140042
dev.t5nex.0.pn: 110117150A0
dev.t5nex.0.ec: 0000000000000000
dev.t5nex.0.na: 0007432AF490
dev.t5nex.0.vpd_version: 3
dev.t5nex.0.scfg_version: 53255
dev.t5nex.0.bs_version: 1.1.0.0
dev.t5nex.0.er_version: 1.0.0.68
dev.t5nex.0.tp_version: 0.1.4.9
dev.t5nex.0.firmware_version: 1.16.2.0

Sponsored by:	Chelsio Communications
2016-08-27 00:13:41 +00:00
Navdeep Parhar
7df135c055 cxgbe/iw_cxgbe: Various fixes to the iWARP driver.
- Return appropriate error code instead of ENOMEM when sosend() fails in
  send_mpa_req.
- Fix for problematic race during destroy_qp.
- Abortive close in the failure of send_mpa_reject() instead of normal close.
- Remove the unnecessary doorbell flowcontrol logic.

Submitted by:	Krishnamraju Eraparaju at Chelsio
MFC after:	1 month
Sponsored by:	Chelsio communications
2016-08-26 17:38:13 +00:00
Navdeep Parhar
c149da1661 cxgbe/cxgbei: There is no need for multiple modules in the KLD.
Sponsored by:	Chelsio Communications
2016-08-26 01:28:31 +00:00
Navdeep Parhar
a95e0bfb6c cxgbe/cxgbei: Convert the driver-private PDU flags to enums and replace
pdu_ prefix with icp_ in struct icl_cxgbei_pdu.

Sponsored by:	Chelsio Communications
2016-08-25 23:06:12 +00:00
Navdeep Parhar
5d83392ac5 cxgbe/cxgbei: Read the chip's configuration to determine the actual
hardware send and receive PDU limits.  Report these limits to ICL and
take them into account when setting the socket's send and receive buffer
sizes.  The driver used a single hardcoded limit everywhere prior to
this change.

Sponsored by:	Chelsio Communications
2016-08-25 21:55:17 +00:00
Navdeep Parhar
97b84d344d Make the iSCSI parameter negotiation more flexible.
Decouple the send and receive limits on the amount of data in a single
iSCSI PDU.  MaxRecvDataSegmentLength is declarative, not negotiated, and
is direction-specific so there is no reason for both ends to limit
themselves to the same min(initiator, target) value in both directions.

Allow iSCSI drivers to report their send, receive, first burst, and max
burst limits explicitly instead of using hardcoded values or trying to
derive all of them from the receive limit (which was the only limit
reported by the drivers prior to this change).

Display the send and receive limits separately in the userspace iSCSI
utilities.

Reviewed by:	jpaetzel@ (earlier version), trasz@
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7279
2016-08-25 05:22:53 +00:00
John Baldwin
bd6ff0807e Reorder sysctls so that nodes shared with the VF driver are added first.
This permits a single early return for VF devices in the routines that
add sysctl nodes.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7512
2016-08-19 17:54:51 +00:00
John Baldwin
301abe0f3f Adjust t4_port_init() to work with VF devices.
Specifically, the FW_PORT_CMD may or may not work for a VF (the PF
driver can choose whether or not to permit access to this command),
so don't attempt to fetch port information on a VF if permission is
denied by the PF.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7511
2016-08-19 17:52:48 +00:00
John Baldwin
577422fc53 Add structures for VF-specific adapter parameters.
While here, mark which parameters are PF-specific and which are
VF-specific.

Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7508
2016-08-19 17:49:49 +00:00
John Baldwin
8b2246b349 Add support for register dumps on VF devices.
- Add handling of VF register sets to t4_get_regs_len() and t4_get_regs().
- While here, use t4_get_regs_len() in the ioctl handler for regdump
  instead of inlining it.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7484
2016-08-15 17:42:54 +00:00
John Baldwin
e5e9897c5f Update mailbox writes to work with VF devices.
- Use alternate register locations for the data and control registers for
  VFs.
- Do a dummy read to force the writes to the  mailbox data registers to
  post before the write to the control register on VFs.
- Do not check the PCI-e firmware register for errors on VFs.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7483
2016-08-15 17:41:34 +00:00
John Baldwin
59c1e950b9 Make SGE parameter handling more VF-friendly.
Add fields to hold the SGE control register and free list buffer sizes to
the sge_params structure.  Populate these new fields in
t4_init_sge_params() for PF devices and change t4_read_chip_settings() to
pull these values out of the params structure instead of reading
registers directly.  This will permit t4_read_chip_settings() to be reused
for VF devices which cannot read SGE registers directly.

While here, move the call to t4_init_sge_params() to
get_params__post_init().  The VF driver will populate the SGE parameters
structure via a different method before calling t4_read_chip_settings().

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7476
2016-08-15 17:40:05 +00:00
John Baldwin
ec55567ce6 Track the base absolute ID of ingress and egress queues.
Use this to map an absolute queue ID to a logical queue ID in interrupt
handlers.  For the regular cxgbe/cxl drivers this should be a no-op as
the base absolute ID should be zero.  VF devices have a non-zero base
absolute ID and require this change.  While here, export the absolute ID
of egress queues via a sysctl.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7446
2016-08-09 17:49:42 +00:00
John Baldwin
a56745092c Reserve an adapter flag IS_VF to mark VF devices vs PF devices.
Sponsored by:	Chelsio Communications
2016-08-08 21:45:39 +00:00
John Baldwin
8f6690d385 Fix a typo. 2016-08-08 21:28:02 +00:00
John Baldwin
f454e7ebf5 Add __printflike() to bus_describe_intr() to enable -Wformat checks.
Fix a few places that were passing a raw string as the format to use
a "%s" format string instead.

MFC after:	2 months
2016-08-04 18:29:16 +00:00
Navdeep Parhar
9217931fb4 cxgbe/t4_tom: The page pod arena allocates from pod address space and
not index space.  The minimum valid allocation out of this arena is the
size of a single page pod.

Sponsored by:	Chelsio Communications
2016-08-04 17:29:42 +00:00
John Baldwin
847bfa8eb7 Use the port device name for the iov device for Chelsio T4/T5 cards.
Chelsio T4/T5 adapters are multifunction cards.  The main driver uses
physical function 4 (PF4).  However, VF devices for SR-IOV are only
supported on physical functions 0 through 3, where PF0 creates VFs tied
to port 0, etc.  The t4iov/t5iov driver was previously added to
create VF devices for ports that are present on each adapter.  This
change uses the recently added pci_iov_attach_name() function to
name the character device in /dev/iov after the associated port on
the card (e.g. /dev/iov/cxl0 is used to create VFs that share the
cxl0 port).  With this in place, mark the t4iov/t5iov devices quiet
to prevent them from cluttering dmesg.

Reviewed by:	rstone
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7402
2016-08-03 17:11:08 +00:00
Navdeep Parhar
515b36c5b5 cxgbe/t4_tom: Read the chip's DDP page sizes and save them in a
per-adapter data structure.  This replaces a global array with hardcoded
page sizes.

Sponsored by:	Chelsio Communications
2016-08-02 23:54:21 +00:00
John Baldwin
315048f2ad Store the offset of the KDOORBELL and GTS registers in the softc.
VF devices use a different register layout than PF devices.  Storing
the offset in a value in the softc allows code to be shared between the
PF and VF drivers.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7389
2016-08-01 22:39:51 +00:00
John Baldwin
8dd07eab0e Various fixes to the t4/5nex character device.
- Remove null open/close methods.
- Don't set d_flags to 0 explicitly.
- Remove t5_cdevsw as the .d_name member isn't really used and doesn't
  warrant a separate cdevsw just for the name.
- Use ENOTTY as the error value for an unknown ioctl request.
- Use make_dev_s() to close race with setting si_drv1.

Sponsored by:	Chelsio Communications
2016-07-29 22:11:29 +00:00
John Baldwin
29c229e9fc Mark spg_len and fl_pktshift static.
These variables are no longer exported to t4_netmap.c after r296478.
2016-07-28 17:37:12 +00:00
John Baldwin
07159830be Add support for zero-copy aio_write() on TOE sockets.
AIO write requests for a TOE socket on a Chelsio T4+ adapter can now
DMA directly from the user-supplied buffer.  This is implemented by
wiring the pages backing the user-supplied buffer and queueing special
mbufs backed by raw VM pages to the socket buffer.  The TOE code
recognizes these special mbufs and builds a sglist from the VM page
array associated with the mbuf when queueing a work request to the TOE.

Because these mbufs do not have an associated virtual address, m_data
is not valid.  Thus, the AIO handler does not invoke sosend() directly
for these mbufs but instead inlines portions of sosend_generic() and
tcp_usr_send().

An aiotx_buffer structure is used to describe the user buffer (e.g.
it holds the array of VM pages and a reference to the AIO job).  The
special mbufs reference this structure via m_ext.  Note that a single
job might be split across multiple mbufs (e.g. if it is larger than
the socket buffer size).  The 'ext_arg2' member of each mbuf gives an
offset relative to the backing aiotx_buffer.  The AIO job associated
with an aiotx_buffer structure is completed when the last reference to
the structure is released.

Zero-copy aio_write()'s for connections associated with a given
adapter can be enabled/disabled at runtime via the
'dev.t[45]nex.N.toe.tx_zcopy' sysctl.

MFC after:	1 month
Relnotes:	yes
Sponsored by:	Chelsio Communications
2016-07-27 18:29:35 +00:00
Navdeep Parhar
17146cd543 cxgbe(4): Initialize the adapter queues (fwq and mgmtq) instead of
returning EAGAIN if they aren't available when the user tries to program
a filter.  Do this after validating the filter so that the driver
doesn't bring up the queues if it doesn't have to.
2016-07-26 23:29:37 +00:00
John Baldwin
f91fca5ba7 Add a driver to create VF devices on Chelsio T4/T5 NICs.
Chelsio NICs are a bit unique compared to some other NICs in that they
expose different functionality on different physical functions.  In
particular, PF4 is used to manage the NIC interfaces ('t4nex' and 't5nex').
However, PF4 is not able to create VF devices.  Instead, VFs are only
supported by physical functions 0 through 3.  This commit adds 't4iov'
and 't5iov' drivers that attach to PF0-3.

One extra wrinkle is that the iov devices cannot enable SR-IOV until the
firwmare has been initialized by the main PF4 driver.  To handle this
case, a new t4_if kobj interface has been added to permit cross-calls
between the PF drivers.  The PF4 driver notifies sibling drivers when it
is fully attached.  It also requests sibling drivers to detach before it
detaches.  Sibling drivers query the PF4 driver during their attach
routine to see if it is attached.  If not, the sibling drivers defer
their attach actions until the PF4 driver informs them it is attached.

VF devices are associated with a single port on the NIC.  VF devices
created from PF0 are associated with the first port on the NIC, VFs
from PF1 are associated with the second port, etc.  VF devices can
only be created from a PF device that has an associated port.  Thus,
on a 2-port card, VFs are only supported on PF0 and PF1.

Reviewed by:	np (earlier versions)
MFC after:	1 month
Sponsored by:	Chelsio Communications
2016-07-22 22:46:41 +00:00
John Baldwin
069af0eb14 Install a handler for firmware work request error messages.
If a driver sends an malformed or disallowed work request, the firmware
responds with a work request error.  Previously the driver treated this is
as an unexpected message and panicked.  Now it decodes the error message
to aid in debugging.

Reviewed by:	np (older version)
MFC after:	1 month
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D6950
2016-07-22 21:52:07 +00:00
Enji Cooper
092af585e1 Remove redundant declaration for tcp_dooptions, similar to r302576
netinet/tcp_var.h already defines this function

Differential Revision:	https://reviews.freebsd.org/D7189
MFC after:	1 week
PR:	209920
Reported by:	Mark Millard <markmi@dsl-only.net>
Reviewed by:	np
Tested with:	clang 3.8.0, gcc 4.2.1, gcc 5.3.0
Sponsored by:	EMC / Isilon Storage Division
2016-07-11 17:11:18 +00:00
Navdeep Parhar
a33b046750 cxgbe(4): Add sysctl to display the RSS indirection table size for an
interface.

dev.cxl.<n>.rss_size
dev.vcxl.<n>.rss_size

MFC after:	3 days
2016-07-08 18:13:23 +00:00
Navdeep Parhar
671bf2b8b2 cxgbe(4): Changes to the CPL-handler registration mechanism and code
related to "shared" CPLs.

a) Combine t4_set_tcb_field and t4_set_tcb_field_rpl into a single
function.  Allow callers to direct the response to any iq.  Tidy up
set_ulp_mode_iscsi while there to use names from t4_tcb.h instead of
magic constants.

b) Remove all CPL handler tables from struct adapter.  This reduces its
size by around 2KB.  All handlers are now registered at MOD_LOAD instead
of attach or some kind of initialization/activation.  The registration
functions do not need an adapter parameter any more.

c) Add per-iq handlers to deal with CPLs whose destination cannot be
determined solely from the opcode.  There are 2 such CPLs in use right
now: SET_TCB_RPL and L2T_WRITE_RPL.  The base driver continues to send
filter and L2T_WRITEs over the mgmtq and solicits the reply on fwq.
t4_tom (including the DDP code) now uses the port's ctrlq to send
L2T_WRITEs and SET_TCB_FIELDs and solicits the reply on an ofld_rxq.
fwq and ofld_rxq have different handlers that know what kind of tid to
expect in the reply.  Update t4_write_l2e and callers to to support any
wrq/iq combination.

Approved by:	re@ (kib@)
Sponsored by:	Chelsio Communications
2016-07-05 01:29:24 +00:00
Navdeep Parhar
bf9363d72c cxgbe(4): Avoid a NULL dereference while dumping the L2 table. Entries
used by switching filters that rewrite L2 information do not have any
associated ifnet.

Approved by:	re@ (gjb@)
Sponsored by:	Chelsio Communications
2016-07-01 23:18:49 +00:00
Navdeep Parhar
5e03372b18 cxgbe(4): Do not bring up an interface when IFCAP_TOE is enabled on it.
The interface's queues are functional after VI_INIT_DONE (which is short
of interface-up) and that's all that's needed for t4_tom to communicate
with the chip.

Approved by:	re@ (gjb@)
Sponsored by:	Chelsio Communications
2016-06-29 06:55:30 +00:00
Navdeep Parhar
62291463de cxgbe(4): Merge netmap support from the ncxgbe/ncxl interfaces to the
vcxgbe/vcxl interfaces and retire the 'n' interfaces.  The main
cxgbe/cxl interfaces and tunables related to them are not affected by
any of this and will continue to operate as usual.

The driver used to create an additional 'n' interface for every
cxgbe/cxl interface if "device netmap" was in the kernel.  The 'n'
interface shared the wire with the main interface but was otherwise
autonomous (with its own MAC address, etc.).  It did not have normal
tx/rx but had a specialized netmap-only data path.  r291665 added
another set of virtual interfaces (the 'v' interfaces) to the driver.
These had normal tx/rx but no netmap support.

This revision consolidates the features of both the interfaces into the
'v' interface which now has a normal data path, TOE support, and native
netmap support.  The 'v' interfaces need to be created explicitly with
the hw.cxgbe.num_vis tunable.  This means "device netmap" will not
result in the automatic creation of any virtual interfaces.

The following tunables can be used to override the default number of
queues allocated for each 'v' interface.  nofld* = 0 will disable TOE on
the virtual interface and nnm* = 0 to will disable native netmap
support.

# number of normal NIC queues
hw.cxgbe.ntxq_vi
hw.cxgbe.nrxq_vi

# number of TOE queues
hw.cxgbe.nofldtxq_vi
hw.cxgbe.nofldrxq_vi

# number of netmap queues
hw.cxgbe.nnmtxq_vi
hw.cxgbe.nnmrxq_vi

hw.cxgbe.nnm{t,r}xq{10,1}g tunables have been removed.

--- tl;dr version ---
The workflow for netmap on cxgbe starting with FreeBSD 11 is:
1) "device netmap" in the kernel config.
2) "hw.cxgbe.num_vis=2" in loader.conf.  num_vis > 2 is ok too, you'll
end up with multiple autonomous netmap-capable interfaces for every
port.
3) "dmesg | grep vcxl | grep netmap" to verify that the interface has
netmap queues.
4) Use any of the 'v' interfaces for netmap.  pkt-gen -i vcxl<n>... .
One major improvement is that the netmap interface has a normal data
path as expected.
5) Just ignore the cxl interfaces if you want to use netmap only.  No
need to bring them up.  The vcxl interfaces are completely independent
and everything should just work.
---------------------

Approved by:	re@ (gjb@)
Relnotes:	Yes
Sponsored by:	Chelsio Communications
2016-06-23 02:53:00 +00:00
John Baldwin
b1012d8036 Account for AIO socket operations in thread/process resource usage.
File and disk-backed I/O requests store counts of read/written disk
blocks in each AIO job so that they can be charged to the thread that
completes an AIO request via aio_return() or aio_waitcomplete().  This
change extends AIO jobs to store counts of received/sent messages and
updates socket backends to set these counts accordingly.  Note that
the socket backends are careful to only charge a single messages for
each AIO request even though a single request on a blocking socket might
invoke sosend or soreceive multiple times.  This is to mimic the
resource accounting of synchronous read/write.

Adjust the UNIX socketpair AIO test to verify that the message resource
usage counts update accordingly for aio_read and aio_write.

Approved by:	re (hrs)
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D6911
2016-06-21 22:19:06 +00:00
John Baldwin
ae0b1ccbab Use sbused() instead of sbspace() to avoid signed issues.
Inserting a full mbuf with an external cluster into the socket buffer
resulted in sbspace() returning -MLEN.  However, since sb_hiwat is
unsigned, the -MLEN value was converted to unsigned in comparisons.  As a
result, the socket buffer was never autosized.  Note that sb_lowat is signed
to permit direct comparisons with sbspace(), but sb_hiwat is unsigned.
Follow suit with what tcp_output() does and compare the value of sbused()
with sb_hiwat instead.

Approved by:	re (gjb)
Sponsored by:	Chelsio Communications
2016-06-15 21:08:51 +00:00
John Baldwin
fe0bdd1d2c Move backend-specific fields of kaiocb into a union.
This reduces the size of kaiocb slightly. I've also added some generic
fields that other backends can use in place of the BIO-specific fields.

Change the socket and Chelsio DDP backends to use 'backend3' instead of
abusing _aiocb_private.status directly. This confines the use of
_aiocb_private to the AIO internals in vfs_aio.c.

Reviewed by:	kib (earlier version)
Approved by:	re (gjb)
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D6547
2016-06-15 20:56:45 +00:00
Navdeep Parhar
c4765d2743 cxgbe/t4_tom: Fix inverted assertion in r300895. It is RDMA
connections and not others that are allowed to fail the receive window
check.

Approved by:	re (gjb@)
2016-06-14 21:09:00 +00:00
Navdeep Parhar
390006c70f iw_cxgbe: Make sure that send_abort results in a TCP RST and not a FIN.
Release the hold on ep->com immediately after sending the RST.  This
fixes a bug that sometimes leaves userspace iWARP tools hung when the
user presses ^C.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
Approved by:	re (gjb@)
Sponsored by:	Chelsio Communications
2016-06-14 21:02:36 +00:00
Navdeep Parhar
02f972e8f3 cxgbe(4): Add a sysctl to manage the binding of a txq to a traffic class.
Sponsored by:	Chelsio Communications
2016-06-08 14:15:29 +00:00
Navdeep Parhar
468646f716 cxgbe(4): A couple of fixes to set_sched_queue.
- Validate the scheduling class against the actual limit (which is chip
  specific) instead of a magic number.

- Return an error if an attempt is made to manipulate the tx queues of a
  VI that hasn't been initialized.

Sponsored by:	Chelsio Communications
2016-06-07 07:48:36 +00:00
Navdeep Parhar
0bc01d66b6 cxgbe(4): Provide information about traffic classes in the sysctl mib.
Sponsored by:	Chelsio Communications
2016-06-07 06:42:35 +00:00
Navdeep Parhar
46464b95b0 cxgbe(4): Track the state of the hardware traffic schedulers in the
driver.  This works as long as everyone uses set_sched_class_params
to program them.

Sponsored by:	Chelsio Communications
2016-06-07 00:27:55 +00:00
Navdeep Parhar
cfbe2911c9 cxgbe(4): Break up set_sched_class. Validate the channel number and
min/max rates against their actual limits (which are chip and port
specific) instead of hardcoded constants.

Sponsored by:	Chelsio Communications
2016-06-06 22:51:44 +00:00
Navdeep Parhar
db80d07377 cxgbe(4): Create a reusable struct type for scheduling class parameters.
Sponsored by:	Chelsio Communications
2016-06-06 20:42:46 +00:00
Navdeep Parhar
e6b775f48f iw_cxgbe: Fix panic that occurs when c4iw_ev_handler tries to acquire
comp_handler_lock but c4iw_destroy_cq has already freed the CQ memory
(which is where the lock resides).

Submitted by:	Krishnamraju Eraparaju @ Chelsio
Sponsored by:	Chelsio Communications
2016-06-01 18:46:54 +00:00
Edward Tomasz Napierala
ac3453eba0 Reduce the priority of cxgbei(4) driver, so it doesn't get chosen
by default.  This is a workaround for a too simplistic ICL module
choosing mechanism.  To use it, specify offload in ctl.conf
or iscsi.conf.

This fixes a problem where "kldload cxgbei" wedges the iSCSI stack,
if you don't have a Chelsio card installed, or the endpoints of the
iSCSI session are not reachable through addresses configured
on that interface.

Reviewed by:	np@
MFC after:	1 month
2016-06-01 12:04:04 +00:00
Navdeep Parhar
addd6a52c4 cxgbe/t4_tom: Exempt RDMA connections from a TCP sanity test for now, to
avoid panicking debug kernels.

t4_tom does not keep track of a connection once it switches to ULP mode
iWARP.  If the connection falls out of ULP mode the driver/hardware seq#
etc. are out of sync.  A better fix would be to figure out what the
current seq# are, update the driver's state, and perform all sanity
checks as usual.
2016-05-28 00:38:17 +00:00
Navdeep Parhar
fc511a3302 iw_cxgbe: Plug a lock leak in process_mpa_request().
If the parent is DEAD or connect_request_upcall() fails, the parent
mutex is left locked.  This leads to a hang when process_mpa_request()
is called again for another child of the listening endpoint.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
Obtained from:	upstream iw_cxgb4
Sponsored by:	Chelsio Communications
2016-05-27 23:44:33 +00:00
Navdeep Parhar
69b913d68f iw_cxgbe: Use vmem(9) to manage PBL and RQT allocations.
Submitted by:	Krishnamraju Eraparaju at Chelsio
Reviewed by:	Steve Wise
Sponsored by:	Chelsio Communications
2016-05-27 21:26:26 +00:00
Hans Petter Selasky
fa201e28fc Prepare for activation of LinuxKPI module parameters as read-only
tunable SYSCTL's. Linux module parameters are associated with the
module they belong to. FreeBSD does not share this concept of a parent
module. Instead add macros which define the prefix to use for the
module parameters in the LinuxKPI consumers.

While at it convert all "bool" LinuxKPI module parameters to "byte"
type, because we don't have a "bool" type of SYSCTL in FreeBSD.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-05-25 12:03:21 +00:00
Edward Tomasz Napierala
b891159418 Add mechanism for choosing iSER-capable ICL modules.
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-05-24 08:44:45 +00:00
Edward Tomasz Napierala
7deb68ab2c Provide a way for ICL modules to declare they support PIM_UNMAPPED.
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-05-21 11:10:48 +00:00
John Baldwin
1081d2766c Move the KTR for the update of ddp_active_id on each completion under
VERBOSE_TRACES.

Sponsored by:	Chelsio Communications
2016-05-20 23:08:22 +00:00
Edward Tomasz Napierala
604c023f94 Extend the ICL interface to include the PDU pointer in the task_setup
method.  This is required for upcoming iSER support.

Obtained from:	Mellanox Technologies (earlier version)
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-05-17 08:55:21 +00:00
Navdeep Parhar
96807c23ce cxgbe(4): Update T5 and T4 firmwares to 1.15.37.0.
These firmwares were obtained from the "Chelsio T5/T4 Unified Wire
v2.12.0.3 for Linux" release.  Changes since 1.14.4.0 (which is the
firmware in -STABLE branches) are in the "Release Notes" accompanying
the Unified Wire release and are copy-pasted here as well.

22.1. T5 Firmware
+++++++++++++++++++++++++++++++++

Version : 1.15.37.0
Date    : 04/27/2016
================================================================================

FIXES
-----

BASE:
 - Fixed an issue in FW_RSS_VI_CONFIG_CMD handling where the default ingress
   queue was ignored.
 - Fixed an issue where adapter failed to load fw by adjusting DRAM frequency.
 - Fixed an issue in watchdog which was causing VM bring-up failure after reboot.
 - Fixed 40G link failures with some switches when auto-negotiation enabled.
 - Fixed to improve on link bring-up time.
 - Per port buffer groups size doubled to improve performance.
 - Fixed an issue where bogus d3hot bits were set causing traffic stall.
 - Fixed an issue where sometimes adapter was not seen after reboot.
 - Fixed an issue where iWARP was crashing in conjunction with traffic management.
 - Fixed an issue where link failed to come up after removing twinax cable and
   inserting optical module.

ETH
 - Fixed a link flap issue on T580-CR.

OFLD
 - Fixed a potential iSCSI data corruption issue by disabling RxFragEn flag.

FOiSCSI
 - Fixed an issue in recovery path where connection was getting closed before
   recovery processing was done.
 - Fixed an issue in TCP port reuse.
 - Fixed an issue in recovery path when large number (>64) of iSCSI connections
   were in use.
 - Returned ENETUNREACH if IP was not been provisioned yet and driver tried to
   use given inerface.
 - Fixed an issue where fw was sending ENETUNREACH event for normal tcp
   disconnection.

DCBX
 - Fixed an issue where iscsi tlv is sent incorrectly to host. (DCBX CEE)
 - Fixed an issue where apply bit set for APP id was affecting the ETS and PFC
  settings.(DCBX IEEE)
 - Fixed an issue where app priority values are not handled correctly in fw.
  (DCBX IEEE)
 - Fixed an issue where enable/disable dcbx can cause crash. (DCBX CEE,DCBX IEEE)

FOFCoE
 - Removed BB6 support.

ENHANCEMENTS
------------

BASE:
 - Added new interface to program DCA settings in SGE contexts; allow 32-byte
   IQE size
 - Added PTP interface fw_ptp_ts to support PTP Frequeny and Offset adjustment.
 - Added MPS raw interface.

ETH:
 - New mailbox command FW_DCB_IEEE_CMD api added for IEEE dcbx.

OFLD:
 - WR opcode is returned to host in cqe error response.

22.2. T4 Firmware
+++++++++++++++++

Version : 1.15.37.0
Date    : 04/27/2016
================================================================================

FIXES
-----

BASE:
 - Fixed an issue in FW_RSS_VI_CONFIG_CMD handling where default ingress queue
   was ignored.
 - Fixed an issue in watchdog which was causing VM bring-up failure after reboot.
 - Per port buffer groups size doubled to improve performance.
 - Fixed an issue where iWARP was crashing in conjunction with traffic management.

FOiSCSI:
 - Fixed an issue in recovery path where connection was getting closed before
   recovery processing was done.
 - Fixed an issue in TCP port reuse.
 - Fixed an issue in recovery path when large number (>64) of iSCSI connections
   were in use.
 - Returned ENETUNREACH if IP had not been provisioned yet and driver tried to
   use given inerface.

DCBX
 - Fixed an issue where iscsi tlv is sent incorrectly to host.(DCBX CEE)
 - Fixed an issue where enable/disable dcbx can cause crash in firmware.(DCBX CEE)

FOiSCSI
 - Fixes an issue where fw was sending ENETUNREACH event for normal tcp
   disconnection.

FOFCoE
 - Removed BB6 support.

ENHANCEMENTS
------------

BASE:
 - Added MPS raw interface.

ETH:
 - New mailbox command FW_DCB_IEEE_CMD api added for IEEE dcbx.
================================================================================

Obtained from:	Chelsio Communications
MFC after:	6 weeks
Relnotes:	yes
Sponsored by:	Chelsio Communications
2016-05-13 17:38:59 +00:00
Hans Petter Selasky
30de20448d The idr_for_each() function is now part of the LinuxKPI. Use the
LinuxKPI's idr_for_each() function instead of the local one to avoid
compilation issues.

Discussed with:	np @
MFC after:	1 week
2016-05-11 17:17:48 +00:00
John Baldwin
f7bc393477 Forward declare 'struct cpl_set_tcb_rpl' before including t4_tom.h. 2016-05-10 03:32:22 +00:00
John Baldwin
d3cd2df1f2 Forward declare 'struct cpl_set_tcb_rpl' before including t4_tom.h.
Other structures needed by prototypes in t4_tom.h are explicitly
declared in this file, so adding the prototype here seems most
consistent with existing code.
2016-05-09 20:01:34 +00:00
John Baldwin
dc9643853d Use DDP to implement zerocopy TCP receive with aio_read().
Chelsio's TCP offload engine supports direct DMA of received TCP payload
into wired user buffers.  This feature is known as Direct-Data Placement.
However, to scale well the adapter needs to prepare buffers for DDP
before data arrives.  aio_read() is more amenable to this requirement than
read() as applications often call read() only after data is available in
the socket buffer.

When DDP is enabled, TOE sockets use the recently added pru_aio_queue
protocol hook to claim aio_read(2) requests instead of letting them use
the default AIO socket logic.  The DDP feature supports scheduling DMA
to two buffers at a time so that the second buffer is ready for use
after the first buffer is filled.  The aio/DDP code optimizes the case
of an application ping-ponging between two buffers (similar to the
zero-copy bpf(4) code) by keeping the two most recently used AIO buffers
wired.  If a buffer is reused, the aio/DDP code is able to reuse the
vm_page_t array as well as page pod mappings (a kind of MMU mapping the
Chelsio NIC uses to describe user buffers).  The generation of the
vmspace of the calling process is used in conjunction with the user
buffer's address and length to determine if a user buffer matches a
previously used buffer.  If an application queues a buffer for AIO that
does not match a previously used buffer then the least recently used
buffer is unwired before the new buffer is wired.  This ensures that no
more than two user buffers per socket are ever wired.

Note that this feature is best suited to applications sending a steady
stream of data vs short bursts of traffic.

Discussed with:	np
Relnotes:	yes
Sponsored by:	Chelsio Communications
2016-05-07 00:33:35 +00:00
John Baldwin
826c2372c5 Set the correct vnet in TOE event handlers.
Differential Revision:	https://reviews.freebsd.org/D6152
2016-05-06 23:49:10 +00:00
Pedro F. Giffuni
612226d756 Revert r298955 for the cxgbe firmware.
These files have checksums that are none of my business.

Requested by:	np
2016-05-03 11:49:29 +00:00
Pedro F. Giffuni
453130d9bf sys/dev: minor spelling fixes.
Most affect comments, very few have user-visible effects.
2016-05-03 03:41:25 +00:00
Pedro F. Giffuni
4ed3c0e713 sys: Make use of our rounddown() macro when sys/param.h is available.
No functional change.
2016-04-30 14:41:18 +00:00
Pedro F. Giffuni
b66bb393f2 Cleanup redundant parenthesis from existing howmany()/roundup() macro uses. 2016-04-22 16:57:42 +00:00
Pedro F. Giffuni
d9c9c81c08 sys: use our roundup2/rounddown2() macros when param.h is available.
rounddown2 tends to produce longer lines than the original code
and when the code has a high indentation level it was not really
advantageous to do the replacement.

This tries to strike a balance between readability using the macros
and flexibility of having the expressions, so not everything is
converted.
2016-04-21 19:57:40 +00:00
Navdeep Parhar
cda2ab0e7a cxgbe(4): Always dispatch all work requests that have been written to the
descriptor ring before leaving drain_wrq_wr_list.
2016-04-12 22:11:29 +00:00
Navdeep Parhar
fdb562a960 cxgbe(4): Always read the entire mailbox into the reply buffer.
The size of the reply can be different from the size of the command in
case a debug firmware asserts.  fw_asrt() needs the entire reply in
order to decode the location of the assert.

Sponsored by:   Chelsio Communications
2016-04-12 21:17:19 +00:00
John Baldwin
b3b4ccc70a Rename the 'M_B' macro in t4_regs.h to 'CXGBE_M_B'.
This fixes a conflict with the M_B macro in powerpc's
<machine/db_machdep.h> exposed by the recent addition of DDB commands
to the cxgbe driver.

Discussed with:	np
Reported by:	bz
Sponsored by:	Chelsio Communications
2016-04-12 17:44:34 +00:00
Navdeep Parhar
2829f76d1a cxgbe(4): Provide an explicit value for nqpcq in the firmware
configuration file.
2016-04-11 02:18:59 +00:00
Pedro F. Giffuni
74b8d63dcc Cleanup unnecessary semicolons from the kernel.
Found with devel/coccinelle.
2016-04-10 23:07:00 +00:00
John Baldwin
307734b6d4 Add a 'show t4 devlog <nexus>' DDB command.
This command displays the adapter's firmware device log similar to the
dev.<nexus>.misc.devlog sysctl.

Sponsored by:	Chelsio Communications
2016-04-10 06:19:26 +00:00
John Baldwin
113f2316c6 Add a 'show t4 tcb <nexus> <tid>' command to dump a TCB from DDB.
This allows the contents of a TCB to be extracted from a T4/T5 card in
DDB after a panic.
2016-04-10 05:06:58 +00:00
Sepherosa Ziehau
6dd38b8716 tcp/lro: Use tcp_lro_flush_all in device drivers to avoid code duplication
And factor out tcp_lro_rx_done, which deduplicates the same logic with
netinet/tcp_lro.c

Reviewed by:	gallatin (1st version), hps, zbb, np, Dexuan Cui <decui microsoft com>
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5725
2016-04-01 06:28:33 +00:00
John Baldwin
80f3b01958 Remove #ifdef's from various structures used in the cxgbe/cxl driver.
This provides a constant ABI and layout for these structures (especially
struct adapter) avoiding some foot shooting.

Discussed with:	np
Sponsored by:	Chelsio Communications
2016-03-31 18:36:50 +00:00
Navdeep Parhar
79bb86ac2e cxgbe/iw_cxgbe: Fix for stray "start_ep_timer timer already started!"
messages.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
Sponsored by:	Chelsio Communications
2016-03-29 00:10:47 +00:00
Navdeep Parhar
78552b23a5 cxgbe(4): Be consistent and call ETHER_BPF_MTAP before writing anything
to the descriptor ring no matter what path the frame takes within the
driver's tx.
2016-03-22 18:56:23 +00:00
Navdeep Parhar
8d814a458c iw_cxgbe/libcxgb4: Pull in many applicable fixes from the upstream Linux
iWARP driver and userspace library to the FreeBSD iw_cxgbe and libcxgb4.

This commit includes internal changesets 6785 8111 8149 8478 8617 8648
8650 9110 9143 9440 9511 9894 10164 10261 10450 10980 10981 10982 11730
11792 12218 12220 12222 12223 12225 12226 12227 12228 12229 12654.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
Sponsored by:	Chelsio Communications
2016-03-21 00:29:45 +00:00
Navdeep Parhar
784a631dc5 cxgbe(4): Tidy up PAUSE frame accounting.
Figure out if the chip is counting PAUSE frames in the "normal" stats
and take them out if it is.  This fixes a bug in the tx stats because
the default hardware behavior is different for Tx and Rx but the driver
was treating both the same way.  The result was that OPACKETS, OBYTES,
and OMCASTS were under-reported (if tx_pause > 0) before this change.

Note that the mac_stats sysctl still gives you the raw value of these
statistics straight from the device registers.
2016-03-17 01:15:16 +00:00
Navdeep Parhar
460f25e5ca cxgbe(4): Enable PFs 0-3, and allow creation of SR-IOV VFs on these PFs
in the default configuration files.
2016-03-16 19:46:22 +00:00
Navdeep Parhar
fc2aa1fc0b cxgbe(4): Enable additional capabilities in the default configuration
files.  All features with FreeBSD drivers of some kind are now in the
default configuration.
2016-03-16 19:43:44 +00:00
Navdeep Parhar
c47c408a38 cxgbe(4): Update some register settings in the default configuration
files to match the "uwire" configuration.
2016-03-16 19:41:00 +00:00
Navdeep Parhar
78aa2c994c cxgbe(4): Remove a couple of pointless assignments in sysctl_meminfo.
Do not display range if start = stop (this is a workaround for some
unused regions).
2016-03-16 19:36:11 +00:00
Dimitry Andric
353cae4976 Fix the following gcc warnings on sparc64, when TCP_OFFLOAD is not
defined:

    sys/dev/cxgbe/t4_main.c:7474: warning: 'sysctl_tp_tick' defined but not used
    sys/dev/cxgbe/t4_main.c:7505: warning: 'sysctl_tp_dack_timer' defined but not used
    sys/dev/cxgbe/t4_main.c:7519: warning: 'sysctl_tp_timer' defined but not used

This just adds a bunch of #ifdef TCP_OFFLOAD in the right places.

Reviewed by:	np
Differential Revision: https://reviews.freebsd.org/D5620
2016-03-12 18:38:51 +00:00
Navdeep Parhar
537e5f9a74 cxgbe(4): Fix typo in previous commit. 2016-03-12 03:02:33 +00:00
Navdeep Parhar
0f2f53efd2 cxgbe(4): Catch up with the latest list of card capabilities as reported
by the firmware.
2016-03-12 02:54:55 +00:00
Navdeep Parhar
959824daba cxgbe(4): sysctls to display the TOE's TCP timers.
cask:~# sysctl -d dev.t5nex.0.toe
dev.t5nex.0.toe.finwait2_timer: FINWAIT2 timer (us)
dev.t5nex.0.toe.initial_srtt: Initial SRTT (us)
dev.t5nex.0.toe.keepalive_intvl: Keepidle interval (us)
dev.t5nex.0.toe.keepalive_idle: Keepidle idle timer (us)
dev.t5nex.0.toe.persist_max: Persist timer max (us)
dev.t5nex.0.toe.persist_min: Persist timer min (us)
dev.t5nex.0.toe.rexmt_max: Retransmit max (us)
dev.t5nex.0.toe.rexmt_min: Retransmit min (us)
dev.t5nex.0.toe.dack_timer: DACK timer (us)
dev.t5nex.0.toe.dack_tick: DACK tick (us)
dev.t5nex.0.toe.timestamp_tick: TCP timestamp tick (us)
dev.t5nex.0.toe.timer_tick: TP timer tick (us)
...

cask:~# sysctl dev.t5nex.0.toe
dev.t5nex.0.toe.finwait2_timer: 9765440
dev.t5nex.0.toe.initial_srtt: 244128
dev.t5nex.0.toe.keepalive_intvl: 73240800
dev.t5nex.0.toe.keepalive_idle: 7031116800
dev.t5nex.0.toe.persist_max: 9765440
dev.t5nex.0.toe.persist_min: 976544
dev.t5nex.0.toe.rexmt_max: 9765440
dev.t5nex.0.toe.rexmt_min: 244128
dev.t5nex.0.toe.dack_timer: 19520
dev.t5nex.0.toe.dack_tick: 32.768
dev.t5nex.0.toe.timestamp_tick: 1048.576
dev.t5nex.0.toe.timer_tick: 32.768
...
2016-03-11 23:24:04 +00:00
Navdeep Parhar
9945ceb857 cxgbe(4): Add sysctls to display the TP microcode version and the
expansion rom version (if there's one).

trantor:~# sysctl dev.t4nex dev.t5nex | grep _version
dev.t4nex.0.firmware_version: 1.15.28.0
dev.t4nex.0.tp_version: 0.1.9.4
dev.t5nex.0.firmware_version: 1.15.28.0
dev.t5nex.0.exprom_version: 1.0.0.68
dev.t5nex.0.tp_version: 0.1.4.9
2016-03-11 03:15:17 +00:00
Navdeep Parhar
5849e3bbcd cxgbe(4): Add a sysctl for the event capture mask of the TP block's
logic analyzer.

dev.t5nex.<n>.misc.tp_la_mask
dev.t4nex.<n>.misc.tp_la_mask
2016-03-11 01:54:43 +00:00
Navdeep Parhar
98790d33c6 cxgbe(4): Improvements to the code that deals with the firmware's log.
- Query the location of the log very early during attach.  Refresh the
  location later after establishing contact with the firmware.
- Save the log's location as a flat address in devlog_params.
- Use a memory window instead of backdoor access to the EDC/MC to read
  the log.
2016-03-10 23:17:26 +00:00
Navdeep Parhar
30d816a7b5 cxgbe(4): Fix bug in r296603. The memory window needs to be
repositioned if the start address isn't in the window already.  One
of the bounds check used the end address instead.
2016-03-10 20:36:32 +00:00
Navdeep Parhar
c912289045 cxgbe(4): Add general purpose routines that offer safe access to the
chip's memory windows.  Convert existing users of these windows to the
new routines.
2016-03-10 06:15:31 +00:00
Navdeep Parhar
f808381742 cxgbe(4): Allow the addr/len pair that is being validated in
validate_mem_range to span multiple memory types.  Update
validate_mt_off_len to use validate_mem_range.
2016-03-10 02:43:10 +00:00
Navdeep Parhar
4d131308f3 cxgbe(4): Rename regwin_lock to reg_lock. It is used to protect access
to indirect registers only.
2016-03-08 22:23:30 +00:00
Navdeep Parhar
40f1f5c6c4 cxgbe(4): Reshuffle and rototill t4_hw.c, solely to reduce diffs with
the internal shared code.

Obtained from:	Chelsio Communications
2016-03-08 19:34:56 +00:00
Navdeep Parhar
fa6d184d93 cxgbe(4): Minor updates to the shared routines that deal with firmware images. 2016-03-08 10:07:40 +00:00
Navdeep Parhar
89308a40ba cxgbe(4): Fix t4_tp_get_rdma_stats. 2016-03-08 09:40:45 +00:00
Navdeep Parhar
a5eff82112 cxgbe(4): Many new functions in the shared code, unused at this time.
Obtained from:	Chelsio Communications
2016-03-08 09:34:56 +00:00
Navdeep Parhar
f799998f84 cxgbe(4): Use t4_link_down_rc_str in shared code to decode the reason
the link is down, instead of doing it in OS specific code.
2016-03-08 08:59:34 +00:00
Navdeep Parhar
9b6d39a019 cxgbe(4): Updates to shared routines that get/set various parameters via
the firmware.

Obtained from:	Chelsio Communications
2016-03-08 08:39:53 +00:00
Navdeep Parhar
05e2c36c20 cxgbe(4): Remove __devinit and SPEED_<foo> as part of catch up with
internal shared code.

Obtained from:	Chelsio Communications
2016-03-08 08:13:37 +00:00
Navdeep Parhar
b3500921c4 cxgbe(4): Updates to the shared routines that deal with the serial EEPROM,
flash, and VPD.

Obtained from:	Chelsio Communications
2016-03-08 07:48:55 +00:00
Navdeep Parhar
948d0ec074 cxgbe(4): Updates to mailbox routines in the shared code.
Obtained from:	Chelsio Communications
2016-03-08 06:27:47 +00:00
Navdeep Parhar
207d9fea78 cxgbe(4): Update the interrupt handlers for hardware errors.
Obtained from:	Chelsio Communications
2016-03-08 02:44:32 +00:00
Navdeep Parhar
700cfba72d cxgbe(4): Overhaul the shared code that deals with the chip's TP block,
which is responsible for filtering and RSS.

Add the ability to use filters that match on PF/VF (aka "VNIC id") while
here.  This is mutually exclusive with filtering on outer VLAN tag with
Q-in-Q.

Sponsored by:	Chelsio Communications
2016-03-08 02:04:05 +00:00
Navdeep Parhar
90e7434a6d cxgbe(4): Add a struct sge_params to store per-adapter SGE parameters.
Move the code that reads all the parameters to t4_init_sge_params in the
shared code.  Use these per-adapter values instead of globals.

Sponsored by:	Chelsio Communications
2016-03-08 00:23:56 +00:00
Navdeep Parhar
4f8a1fd8cd cxgbe(4): Updated register dumps.
- Get the list of registers to read during a regdump from the shared
  code instead of the OS specific code.  This follows a similar move
  internally.  The shared code includes the list for T6.

- Update cxgbetool to be able to decode T5 VF, T6, and T6 VF register
  dumps (and catch up with some updates to T4 and T5 register decode).

Obtained from:	Chelsio Communications
Sponsored by:	Chelsio Communications
2016-03-07 21:11:35 +00:00
Navdeep Parhar
d1205d093d cxgbe(4): Very basic T6 awareness. This is part of ongoing work to
update to the latest internal shared code.

- Add a chip_params structure to keep track of hardware constants for
  all generations of Terminators handled by cxgbe.
- Update t4_hw_pci_read_cfg4 to work with T6.
- Update the hardware debug sysctls (hidden within dev.<tNnex>.<n>.misc.*) to
  work with T6.  Most of the changes are in the decoders for the CIM
  logic analyzer and the MPS TCAM.
- Acquire the regwin lock around indirect register accesses.

Obtained from:	Chelsio Communications
Sponsored by:	Chelsio Communications
2016-03-04 13:11:13 +00:00
Navdeep Parhar
ecdff8fa76 cxgbe(4): First of many changes to reduce diffs with internal shared
code:

- Rename some CamelCase variables.
- s/t4_link_start/t4_link_l1cfg/g
- Pull in t4_get_port_type_description.
- Move t4_wait_op_done to t4_hw.c.
- Flip the order of the RDMA stats.
- Remove unsused function t4_iq_start_stop.
- Move t4_wait_op_done and t4_wait_op_done_val to t4_hw.c

Obtained from:	Chelsio Communications
2016-03-03 01:41:53 +00:00
Navdeep Parhar
dd991bd5a1 cxgbe(4): Update T5 and T4 firmwares to 1.15.28.0.
These firmwares were obtained from the beta "Chelsio T5/T4 Unified Wire
v2.12.0.2 for Linux" release.  Changes since last release are listed in the
"Release Notes" accompanying the beta release and are copy-pasted here as well.

The plan is to have only GA'd firmwares in any -STABLE FreeBSD branch so I'll
MFC this (after 2 months) only if it ends up in a GA release.

================================================================================
================================================================================

22.1. T5 Firmware
+++++++++++++++++++++++++++++++++

Version : 1.15.28.0
Date    : 02/29/2016
================================================================================

FIXES
-----

BASE:
 - Fixed an issue in FW_RSS_VI_CONFIG_CMD handling where the default ingress
   queue was ignored.
 - Fixed an issue where adapter failed to load fw by adjusting DRAM frequency.
 - Fixed an issue in watchdog which was causing VM bring-up failure after
   reboot.
 - Fixed 40G link failures with some switches when auto-negotiation enabled.
 - Fixed to improve on link bring-up time.
 - Per port buffer groups size doubled to improve performance.
 - Fixed an issue where bogus d3hot bits were set causing traffic stall.
 - Fixed an issue where sometimes adapter was not seen after reboot.
 - Fixed an issue where iWARP was crashing in conjunction with traffic
   management.
 - Fixed an issue where link failed to come up after removing twinax cable and
   inserting optical module.

OFLD
 - Fixed a potential iSCSI data corruption issue by disabling RxFragEn flag.

FOiSCSI
 - Fixed an issue in recovery path where connection was getting closed before
   recovery processing was done.
 - Fixed an issue in TCP port reuse.
 - Fixed an issue in recovery path when large number (>64) of iSCSI connections
   were in use.
 - Returned ENETUNREACH if IP was not been provisioned yet and driver tried to
   use given inerface.

ENHANCEMENTS
------------

BASE:
 - Added new interface to program DCA settings in SGE contexts; allow 32-byte
   IQE size
 - Added PTP interface fw_ptp_ts to support PTP Frequeny and Offset adjustment.
 - Added MPS raw interface.

ETH:
 - New mailbox command FW_DCB_IEEE_CMD api added for IEEE dcbx.

OFLD:
 - WR opcode is returned to host in cqe error response.

================================================================================
================================================================================

22.2. T4 Firmware
+++++++++++++++++

Version : 1.15.28.0
Date    : 02/29/2016
================================================================================

FIXES
-----

BASE:
 - Fixed an issue in FW_RSS_VI_CONFIG_CMD handling where default ingress queue
   was ignored.
 - Fixed an issue in watchdog which was causing VM bring-up failure after
   reboot.
 - Per port buffer groups size doubled to improve performance.
 - Fixed an issue where iWARP was crashing in conjunction with traffic
   management.

FOiSCSI:
 - Fixed an issue in recovery path where connection was getting closed before
   recovery processing was done.
 - Fixed an issue in TCP port reuse.
 - Fixed an issue in recovery path when large number (>64) of iSCSI connections
   were in use.
 - Returned ENETUNREACH if IP had not been provisioned yet and driver tried to
   use given inerface.

ENHANCEMENTS
------------

BASE:
 - Added MPS raw interface.

ETH:
 - New mailbox command FW_DCB_IEEE_CMD api added for IEEE dcbx.

================================================================================

Obtained from:	Chelsio Communications
MFC after:	2 months
Sponsored by:	Chelsio Communications
2016-03-01 02:36:50 +00:00
Navdeep Parhar
e8c6ba7265 cxgbe(4): Add a sysctl to retrieve the maximum speed/bandwidth supported by a
port.

dev.cxgbe.<n>.max_speed
dev.cxl.<n>.max_speed

Sponsored by:	Chelsio Communications
2016-02-25 01:10:56 +00:00
Navdeep Parhar
40bf7442fa cxgbe: catch up with the latest hardware-related definitions.
Obtained from:	Chelsio Communications
Sponsored by:	Chelsio Communications
2016-02-19 00:29:16 +00:00
Navdeep Parhar
748d440809 Remove duplicate definition (CPL_TRACE_PKT_T5). 2016-02-12 20:14:03 +00:00
Gleb Smirnoff
b4b12e52fb Garbage collect unused arguments of m_init(). 2016-02-10 18:54:18 +00:00
Gleb Smirnoff
f353ae1c62 More fixes to the build. 2016-01-27 05:15:53 +00:00
Gleb Smirnoff
57a78e3bae Augment struct tcpstat with tcps_states[], which is used for book-keeping
the amount of TCP connections by state.  Provides a cheap way to get
connection count without traversing the whole pcb list.

Sponsored by:	Netflix
2016-01-27 00:45:46 +00:00
Navdeep Parhar
097f289f25 Fix for iWARP servers that listen on INADDR_ANY.
The iWARP Connection Manager (CM) on FreeBSD creates a TCP socket to
represent an iWARP endpoint when the connection is over TCP. For
servers the current approach is to invoke create_listen callback for
each iWARP RNIC registered with the CM. This doesn't work too well for
INADDR_ANY because a listen on any TCP socket already notifies all
hardware TOEs/RNICs of the new listener. This patch fixes the server
side of things for FreeBSD. We've tried to keep all these modifications
in the iWARP/TCP specific parts of the OFED infrastructure as much as
possible.

Submitted by:	Krishnamraju Eraparaju @ Chelsio (with design inputs from Steve Wise)
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D4801
2016-01-22 23:33:34 +00:00
Navdeep Parhar
a34ec16adb iw_cxgbe: fix a couple of problems int the RDMA_TERMINATE handler.
a) Look for the CPL in the payload buffer instead of the descriptor.
b) Retrieve the socket associated with the tid with the inpcb lock held.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
2016-01-21 00:42:48 +00:00
Hans Petter Selasky
e936121d31 Add optimizing LRO wrapper:
- Add optimizing LRO wrapper which pre-sorts all incoming packets
  according to the hash type and flowid. This prevents exhaustion of
  the LRO entries due to too many connections at the same time.
  Testing using a larger number of higher bandwidth TCP connections
  showed that the incoming ACK packet aggregation rate increased from
  ~1.3:1 to almost 3:1. Another test showed that for a number of TCP
  connections greater than 16 per hardware receive ring, where 8 TCP
  connections was the LRO active entry limit, there was a significant
  improvement in throughput due to being able to fully aggregate more
  than 8 TCP stream. For very few very high bandwidth TCP streams, the
  optimizing LRO wrapper will add CPU usage instead of reducing CPU
  usage. This is expected. Network drivers which want to use the
  optimizing LRO wrapper needs to call "tcp_lro_queue_mbuf()" instead
  of "tcp_lro_rx()" and "tcp_lro_flush_all()" instead of
  "tcp_lro_flush()". Further the LRO control structure must be
  initialized using "tcp_lro_init_args()" passing a non-zero number
  into the "lro_mbufs" argument.

- Make LRO statistics 64-bit. Previously 32-bit integers were used for
  statistics which can be prone to wrap-around. Fix this while at it
  and update all SYSCTL's which expose LRO statistics.

- Ensure all data is freed when destroying a LRO control structures,
  especially leftover LRO entries.

- Reduce number of memory allocations needed when setting up a LRO
  control structure by precomputing the total amount of memory needed.

- Add own memory allocation counter for LRO.

- Bump the FreeBSD version to force recompilation of all KLDs due to
  change of the LRO control structure size.

Sponsored by:	Mellanox Technologies
Reviewed by:	gallatin, sbruno, rrs, gnn, transport
Tested by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D4914
2016-01-19 15:33:28 +00:00
Navdeep Parhar
5725f0e490 cxgbe: bind the ithreads that handle NIC rx to the correct CPU if the kernel
is built with option RSS.
2016-01-11 17:52:42 +00:00
Alexander V. Chernikov
8a9f7532b0 Convert cxgb/cxgbe to the new routing API.
Discussed with:		np
2016-01-07 08:07:17 +00:00
Gleb Smirnoff
0c39d38d21 Historically we have two fields in tcpcb to describe sender MSS: t_maxopd,
and t_maxseg. This dualism emerged with T/TCP, but was not properly cleaned
up after T/TCP removal. After all permutations over the years the result is
that t_maxopd stores a minimum of peer offered MSS and MTU reduced by minimum
protocol header. And t_maxseg stores (t_maxopd - TCPOLEN_TSTAMP_APPA) if
timestamps are in action, or is equal to t_maxopd otherwise. That's a very
rough estimate of MSS reduced by options length. Throughout the code it
was used in places, where preciseness was not important, like cwnd or
ssthresh calculations.

With this change:

- t_maxopd goes away.
- t_maxseg now stores MSS not adjusted by options.
- new function tcp_maxseg() is provided, that calculates MSS reduced by
  options length. The functions gives a better estimate, since it takes
  into account SACK state as well.

Reviewed by:	jtl
Differential Revision:	https://reviews.freebsd.org/D3593
2016-01-07 00:14:42 +00:00
Navdeep Parhar
9f6b62e791 iw_cxgbe: Shut down the socket but do not close the fd in case of error.
The fd is closed later in this case.  This fixes a "SS_NOFDREF on enter"
panic.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
Reviewed by:	Steve Wise @ Open Grid Computing
2016-01-05 01:32:40 +00:00
Alexander V. Chernikov
4fb3a8208c Implement interface link header precomputation API.
Add if_requestencap() interface method which is capable of calculating
  various link headers for given interface. Right now there is support
  for INET/INET6/ARP llheader calculation (IFENCAP_LL type request).
  Other types are planned to support more complex calculation
  (L2 multipath lagg nexthops, tunnel encap nexthops, etc..).

Reshape 'struct route' to be able to pass additional data (with is length)
  to prepend to mbuf.

These two changes permits routing code to pass pre-calculated nexthop data
  (like L2 header for route w/gateway) down to the stack eliminating the
  need for other lookups. It also brings us closer to more complex scenarios
  like transparently handling MPLS nexthops and tunnel interfaces.
  Last, but not least, it removes layering violation introduced by flowtable
  code (ro_lle) and simplifies handling of existing if_output consumers.

ARP/ND changes:
Make arp/ndp stack pre-calculate link header upon installing/updating lle
  record. Interface link address change are handled by re-calculating
  headers for all lles based on if_lladdr event. After these changes,
  arpresolve()/nd6_resolve() returns full pre-calculated header for
  supported interfaces thus simplifying if_output().
Move these lookups to separate ether_resolve_addr() function which ether
  returs error or fully-prepared link header. Add <arp|nd6_>resolve_addr()
  compat versions to return link addresses instead of pre-calculated data.

BPF changes:
Raw bpf writes occupied _two_ cases: AF_UNSPEC and pseudo_AF_HDRCMPLT.
Despite the naming, both of there have ther header "complete". The only
  difference is that interface source mac has to be filled by OS for
  AF_UNSPEC (controlled via BIOCGHDRCMPLT). This logic has to stay inside
  BPF and not pollute if_output() routines. Convert BPF to pass prepend data
  via new 'struct route' mechanism. Note that it does not change
  non-optimized if_output(): ro_prepend handling is purely optional.
Side note: hackish pseudo_AF_HDRCMPLT is supported for ethernet and FDDI.
  It is not needed for ethernet anymore. The only remaining FDDI user is
  dev/pdq mostly untouched since 2007. FDDI support was eliminated from
  OpenBSD in 2013 (sys/net/if_fddisubr.c rev 1.65).

Flowtable changes:
  Flowtable violates layering by saving (and not correctly managing)
  rtes/lles. Instead of passing lle pointer, pass pointer to pre-calculated
  header data from that lle.

Differential Revision:	https://reviews.freebsd.org/D4102
2015-12-31 05:03:27 +00:00
Navdeep Parhar
e3148e46b2 cxgbei: Hardware accelerated iSCSI target and initiator for TOE capable
cards supported by cxgbe(4).

On the host side this driver interfaces with the storage stack via the
ICL (iSCSI Common Layer) in the kernel.  On the wire the traffic is
standard iSCSI (SCSI over TCP as per RFC 3720/7143 etc.) that
interoperates with all other standards compliant implementations.  The
driver is layered on top of the TOE driver (t4_tom) and promotes
connections being handled by t4_tom to iSCSI ULP (Upper Layer Protocol)
mode.  Hardware assistance in this mode includes:

- Full TCP processing.
- iSCSI PDU identification and recovery within the TCP stream.
- Header and/or data digest insertion (tx) and verification (rx).
- Zero copy (both tx and rx).

Man page will follow in a separate commit in a couple of weeks.

Relnotes:	Yes
Sponsored by:	Chelsio Communications
2015-12-26 06:05:21 +00:00
Navdeep Parhar
9eb533d3b4 cxgbe(4): Updates to the base NIC driver and t4_tom to support the iSCSI
offload driver.  These changes come from projects/cxl_iscsi.
2015-12-26 00:26:02 +00:00
Navdeep Parhar
a89012b294 Fix RSS build.
Reported by:	arybchik@
2015-12-05 10:10:18 +00:00
Konstantin Belousov
f28b929a9e Fix build for !TCP_OFFLOAD case. 2015-12-03 10:33:57 +00:00
John Baldwin
fe2ebb7644 Add support for configuring additional virtual interfaces (VIs) on a port.
Each virtual interface has its own MAC address, queues, and statistics.
The dedicated netmap interfaces (ncxgbeX / ncxlX) were already implemented
as additional VIs on each port.  This change allows additional non-netmap
interfaces to be configured on each port.  Additional virtual interfaces
use the naming scheme vcxgbeX or vcxlX.

Additional VIs are enabled by setting the hw.cxgbe.num_vis tunable to a
value greater than 1 before loading the cxgbe(4) or cxl(4) driver.
NB: The first VI on each port is the "main" interface (cxgbeX or cxlX).

T4/T5 NICs provide a limited number of MAC addresses for each physical port.
As a result, a maximum of six VIs can be configured on each port (including
the "main" interface and the netmap interface when netmap is enabled).

One user-visible result is that when netmap is enabled, packets received
or transmitted via the netmap interface are no longer counted in the stats
for the "main" interface, but are not accounted to the netmap interface.

The netmap interfaces now also have a new-bus device and export various
information sysctl nodes via dev.n(cxgbe|cxl).X.

The cxgbetool 'clearstats' command clears the stats for all VIs on the
specified port along with the port's stats.  There is currently no way to
clear the stats of an individual VI.

Reviewed by:	np
MFC after:	1 month
Sponsored by:	Chelsio
2015-12-03 00:02:01 +00:00
Navdeep Parhar
00cc2faabb cxgbe/t4_tom: add a knob to the default configuration file to tune
the TOE for LAN operation.  It is possible to set this to other values
(cluster for networks with little loss and really tight RTTs, and wan
for relatively large RTTs and/or lossy networks) depending on the
environment in which the TOE is being used.

None of this affects plain NIC operation in any way.

MFC after:	1 week
2015-11-10 02:29:19 +00:00
John Baldwin
02da5bb12a Chelsio T5 chips do not properly echo the No Snoop and Relaxed Ordering
attributes when replying to a TLP from a Root Port.  As a workaround,
disable No Snoop and Relaxed Ordering in the Root Port of each T5 adapter
during attach so that CPU-initiated requests do not contain these flags.

Note that this affects CPU-initiated requests to all devices under this
root port.

Reviewed by:	np
MFC after:	1 week
Sponsored by:	Chelsio
2015-11-05 21:33:15 +00:00
Navdeep Parhar
baa7d0bf9d cxgbe/tom: decide whether to shove segments or not only if there is
payload to transmit.

MFC after:	1 week
2015-10-30 01:18:07 +00:00
Hans Petter Selasky
2da3897d01 Rename linuxapi[.ko] into linuxkpi[.ko], to reflect that it is a
kernel programming interface module, KPI, to avoid confusion with the
existing Linux userspace binary compatibility shims. Bump the
FreeBSD_version number.

Reviewed by:	np @
Suggested by:	dumbbell @
Sponsored by:	Mellanox Technologies
2015-10-22 09:50:45 +00:00
Hans Petter Selasky
f556cede8a Merge LinuxKPI changes from DragonflyBSD:
- Define the kref structure identical to the one found in Linux.
- Update clients referring inside the kref structure.
- Implement kref_sub() for FreeBSD.

Reviewed by:	np @
Sponsored by:	Mellanox Technologies
2015-10-19 12:26:38 +00:00
Navdeep Parhar
07d684f712 cxgbe(4): support for the kernel RSS option.
You need PCBGROUP and RSS in the kernel config to use this.

Relnotes:	Yes
Sponsored by:	Chelsio Communications
2015-10-16 01:19:55 +00:00
Navdeep Parhar
66eb134d76 iw_cxgbe: use correct RFC number. 2015-10-14 23:29:19 +00:00
Navdeep Parhar
ca7c0f2e2d iw_cxgbe: MPA v2 is always available.
Submitted by:	Krishnamraju Eraparaju at chelsio dot com
Reviewed by:	Steve Wise at opengridcomputing dot com
2015-10-13 01:04:38 +00:00
Navdeep Parhar
0d18010dec iw_cxgbe: fix for page fault in cm_close_handler().
This is roughly the iw_cxgbe equivalent of
be13b2dff8
-----------------
RDMA/cxgb4: Connect_request_upcall fixes

When processing an MPA Start Request, if the listening endpoint is
DEAD, then abort the connection.

If the IWCM returns an error, then we must abort the connection and
release resources.  Also abort_connection() should not post a CLOSE
event, so clean that up too.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
-----------------

Submitted by:	Krishnamraju Eraparaju at chelsio dot com.
2015-10-10 01:41:07 +00:00
John Baldwin
8fb15ddb00 Add a comment that to clarify how to determine the amount of received DDP
data.

Reviewed by:	np
Differential Revision:	https://reviews.freebsd.org/D3619
2015-09-10 21:41:11 +00:00
Navdeep Parhar
8faf57012b cxgbe(4): Save the flags for the last adapter-wide synchronized
operation that was initiated successfully.  (The caller and thread are
already recorded).

MFC after:	1 week
2015-08-19 15:40:03 +00:00