Figure out if the chip is counting PAUSE frames in the "normal" stats
and take them out if it is. This fixes a bug in the tx stats because
the default hardware behavior is different for Tx and Rx but the driver
was treating both the same way. The result was that OPACKETS, OBYTES,
and OMCASTS were under-reported (if tx_pause > 0) before this change.
Note that the mac_stats sysctl still gives you the raw value of these
statistics straight from the device registers.
defined:
sys/dev/cxgbe/t4_main.c:7474: warning: 'sysctl_tp_tick' defined but not used
sys/dev/cxgbe/t4_main.c:7505: warning: 'sysctl_tp_dack_timer' defined but not used
sys/dev/cxgbe/t4_main.c:7519: warning: 'sysctl_tp_timer' defined but not used
This just adds a bunch of #ifdef TCP_OFFLOAD in the right places.
Reviewed by: np
Differential Revision: https://reviews.freebsd.org/D5620
- Query the location of the log very early during attach. Refresh the
location later after establishing contact with the firmware.
- Save the log's location as a flat address in devlog_params.
- Use a memory window instead of backdoor access to the EDC/MC to read
the log.
which is responsible for filtering and RSS.
Add the ability to use filters that match on PF/VF (aka "VNIC id") while
here. This is mutually exclusive with filtering on outer VLAN tag with
Q-in-Q.
Sponsored by: Chelsio Communications
Move the code that reads all the parameters to t4_init_sge_params in the
shared code. Use these per-adapter values instead of globals.
Sponsored by: Chelsio Communications
- Get the list of registers to read during a regdump from the shared
code instead of the OS specific code. This follows a similar move
internally. The shared code includes the list for T6.
- Update cxgbetool to be able to decode T5 VF, T6, and T6 VF register
dumps (and catch up with some updates to T4 and T5 register decode).
Obtained from: Chelsio Communications
Sponsored by: Chelsio Communications
update to the latest internal shared code.
- Add a chip_params structure to keep track of hardware constants for
all generations of Terminators handled by cxgbe.
- Update t4_hw_pci_read_cfg4 to work with T6.
- Update the hardware debug sysctls (hidden within dev.<tNnex>.<n>.misc.*) to
work with T6. Most of the changes are in the decoders for the CIM
logic analyzer and the MPS TCAM.
- Acquire the regwin lock around indirect register accesses.
Obtained from: Chelsio Communications
Sponsored by: Chelsio Communications
code:
- Rename some CamelCase variables.
- s/t4_link_start/t4_link_l1cfg/g
- Pull in t4_get_port_type_description.
- Move t4_wait_op_done to t4_hw.c.
- Flip the order of the RDMA stats.
- Remove unsused function t4_iq_start_stop.
- Move t4_wait_op_done and t4_wait_op_done_val to t4_hw.c
Obtained from: Chelsio Communications
These firmwares were obtained from the beta "Chelsio T5/T4 Unified Wire
v2.12.0.2 for Linux" release. Changes since last release are listed in the
"Release Notes" accompanying the beta release and are copy-pasted here as well.
The plan is to have only GA'd firmwares in any -STABLE FreeBSD branch so I'll
MFC this (after 2 months) only if it ends up in a GA release.
================================================================================
================================================================================
22.1. T5 Firmware
+++++++++++++++++++++++++++++++++
Version : 1.15.28.0
Date : 02/29/2016
================================================================================
FIXES
-----
BASE:
- Fixed an issue in FW_RSS_VI_CONFIG_CMD handling where the default ingress
queue was ignored.
- Fixed an issue where adapter failed to load fw by adjusting DRAM frequency.
- Fixed an issue in watchdog which was causing VM bring-up failure after
reboot.
- Fixed 40G link failures with some switches when auto-negotiation enabled.
- Fixed to improve on link bring-up time.
- Per port buffer groups size doubled to improve performance.
- Fixed an issue where bogus d3hot bits were set causing traffic stall.
- Fixed an issue where sometimes adapter was not seen after reboot.
- Fixed an issue where iWARP was crashing in conjunction with traffic
management.
- Fixed an issue where link failed to come up after removing twinax cable and
inserting optical module.
OFLD
- Fixed a potential iSCSI data corruption issue by disabling RxFragEn flag.
FOiSCSI
- Fixed an issue in recovery path where connection was getting closed before
recovery processing was done.
- Fixed an issue in TCP port reuse.
- Fixed an issue in recovery path when large number (>64) of iSCSI connections
were in use.
- Returned ENETUNREACH if IP was not been provisioned yet and driver tried to
use given inerface.
ENHANCEMENTS
------------
BASE:
- Added new interface to program DCA settings in SGE contexts; allow 32-byte
IQE size
- Added PTP interface fw_ptp_ts to support PTP Frequeny and Offset adjustment.
- Added MPS raw interface.
ETH:
- New mailbox command FW_DCB_IEEE_CMD api added for IEEE dcbx.
OFLD:
- WR opcode is returned to host in cqe error response.
================================================================================
================================================================================
22.2. T4 Firmware
+++++++++++++++++
Version : 1.15.28.0
Date : 02/29/2016
================================================================================
FIXES
-----
BASE:
- Fixed an issue in FW_RSS_VI_CONFIG_CMD handling where default ingress queue
was ignored.
- Fixed an issue in watchdog which was causing VM bring-up failure after
reboot.
- Per port buffer groups size doubled to improve performance.
- Fixed an issue where iWARP was crashing in conjunction with traffic
management.
FOiSCSI:
- Fixed an issue in recovery path where connection was getting closed before
recovery processing was done.
- Fixed an issue in TCP port reuse.
- Fixed an issue in recovery path when large number (>64) of iSCSI connections
were in use.
- Returned ENETUNREACH if IP had not been provisioned yet and driver tried to
use given inerface.
ENHANCEMENTS
------------
BASE:
- Added MPS raw interface.
ETH:
- New mailbox command FW_DCB_IEEE_CMD api added for IEEE dcbx.
================================================================================
Obtained from: Chelsio Communications
MFC after: 2 months
Sponsored by: Chelsio Communications
The iWARP Connection Manager (CM) on FreeBSD creates a TCP socket to
represent an iWARP endpoint when the connection is over TCP. For
servers the current approach is to invoke create_listen callback for
each iWARP RNIC registered with the CM. This doesn't work too well for
INADDR_ANY because a listen on any TCP socket already notifies all
hardware TOEs/RNICs of the new listener. This patch fixes the server
side of things for FreeBSD. We've tried to keep all these modifications
in the iWARP/TCP specific parts of the OFED infrastructure as much as
possible.
Submitted by: Krishnamraju Eraparaju @ Chelsio (with design inputs from Steve Wise)
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D4801
a) Look for the CPL in the payload buffer instead of the descriptor.
b) Retrieve the socket associated with the tid with the inpcb lock held.
Submitted by: Krishnamraju Eraparaju @ Chelsio
- Add optimizing LRO wrapper which pre-sorts all incoming packets
according to the hash type and flowid. This prevents exhaustion of
the LRO entries due to too many connections at the same time.
Testing using a larger number of higher bandwidth TCP connections
showed that the incoming ACK packet aggregation rate increased from
~1.3:1 to almost 3:1. Another test showed that for a number of TCP
connections greater than 16 per hardware receive ring, where 8 TCP
connections was the LRO active entry limit, there was a significant
improvement in throughput due to being able to fully aggregate more
than 8 TCP stream. For very few very high bandwidth TCP streams, the
optimizing LRO wrapper will add CPU usage instead of reducing CPU
usage. This is expected. Network drivers which want to use the
optimizing LRO wrapper needs to call "tcp_lro_queue_mbuf()" instead
of "tcp_lro_rx()" and "tcp_lro_flush_all()" instead of
"tcp_lro_flush()". Further the LRO control structure must be
initialized using "tcp_lro_init_args()" passing a non-zero number
into the "lro_mbufs" argument.
- Make LRO statistics 64-bit. Previously 32-bit integers were used for
statistics which can be prone to wrap-around. Fix this while at it
and update all SYSCTL's which expose LRO statistics.
- Ensure all data is freed when destroying a LRO control structures,
especially leftover LRO entries.
- Reduce number of memory allocations needed when setting up a LRO
control structure by precomputing the total amount of memory needed.
- Add own memory allocation counter for LRO.
- Bump the FreeBSD version to force recompilation of all KLDs due to
change of the LRO control structure size.
Sponsored by: Mellanox Technologies
Reviewed by: gallatin, sbruno, rrs, gnn, transport
Tested by: Netflix
Differential Revision: https://reviews.freebsd.org/D4914
and t_maxseg. This dualism emerged with T/TCP, but was not properly cleaned
up after T/TCP removal. After all permutations over the years the result is
that t_maxopd stores a minimum of peer offered MSS and MTU reduced by minimum
protocol header. And t_maxseg stores (t_maxopd - TCPOLEN_TSTAMP_APPA) if
timestamps are in action, or is equal to t_maxopd otherwise. That's a very
rough estimate of MSS reduced by options length. Throughout the code it
was used in places, where preciseness was not important, like cwnd or
ssthresh calculations.
With this change:
- t_maxopd goes away.
- t_maxseg now stores MSS not adjusted by options.
- new function tcp_maxseg() is provided, that calculates MSS reduced by
options length. The functions gives a better estimate, since it takes
into account SACK state as well.
Reviewed by: jtl
Differential Revision: https://reviews.freebsd.org/D3593
The fd is closed later in this case. This fixes a "SS_NOFDREF on enter"
panic.
Submitted by: Krishnamraju Eraparaju @ Chelsio
Reviewed by: Steve Wise @ Open Grid Computing
Add if_requestencap() interface method which is capable of calculating
various link headers for given interface. Right now there is support
for INET/INET6/ARP llheader calculation (IFENCAP_LL type request).
Other types are planned to support more complex calculation
(L2 multipath lagg nexthops, tunnel encap nexthops, etc..).
Reshape 'struct route' to be able to pass additional data (with is length)
to prepend to mbuf.
These two changes permits routing code to pass pre-calculated nexthop data
(like L2 header for route w/gateway) down to the stack eliminating the
need for other lookups. It also brings us closer to more complex scenarios
like transparently handling MPLS nexthops and tunnel interfaces.
Last, but not least, it removes layering violation introduced by flowtable
code (ro_lle) and simplifies handling of existing if_output consumers.
ARP/ND changes:
Make arp/ndp stack pre-calculate link header upon installing/updating lle
record. Interface link address change are handled by re-calculating
headers for all lles based on if_lladdr event. After these changes,
arpresolve()/nd6_resolve() returns full pre-calculated header for
supported interfaces thus simplifying if_output().
Move these lookups to separate ether_resolve_addr() function which ether
returs error or fully-prepared link header. Add <arp|nd6_>resolve_addr()
compat versions to return link addresses instead of pre-calculated data.
BPF changes:
Raw bpf writes occupied _two_ cases: AF_UNSPEC and pseudo_AF_HDRCMPLT.
Despite the naming, both of there have ther header "complete". The only
difference is that interface source mac has to be filled by OS for
AF_UNSPEC (controlled via BIOCGHDRCMPLT). This logic has to stay inside
BPF and not pollute if_output() routines. Convert BPF to pass prepend data
via new 'struct route' mechanism. Note that it does not change
non-optimized if_output(): ro_prepend handling is purely optional.
Side note: hackish pseudo_AF_HDRCMPLT is supported for ethernet and FDDI.
It is not needed for ethernet anymore. The only remaining FDDI user is
dev/pdq mostly untouched since 2007. FDDI support was eliminated from
OpenBSD in 2013 (sys/net/if_fddisubr.c rev 1.65).
Flowtable changes:
Flowtable violates layering by saving (and not correctly managing)
rtes/lles. Instead of passing lle pointer, pass pointer to pre-calculated
header data from that lle.
Differential Revision: https://reviews.freebsd.org/D4102
cards supported by cxgbe(4).
On the host side this driver interfaces with the storage stack via the
ICL (iSCSI Common Layer) in the kernel. On the wire the traffic is
standard iSCSI (SCSI over TCP as per RFC 3720/7143 etc.) that
interoperates with all other standards compliant implementations. The
driver is layered on top of the TOE driver (t4_tom) and promotes
connections being handled by t4_tom to iSCSI ULP (Upper Layer Protocol)
mode. Hardware assistance in this mode includes:
- Full TCP processing.
- iSCSI PDU identification and recovery within the TCP stream.
- Header and/or data digest insertion (tx) and verification (rx).
- Zero copy (both tx and rx).
Man page will follow in a separate commit in a couple of weeks.
Relnotes: Yes
Sponsored by: Chelsio Communications