Allow the filter mask (aka the hashfilter mode when hashfilters are
in use) to be set any time it is safe to do so. The requested mask
must be a subset of the filter mode already. The driver will not change
the mode or ingress config just to support a new mask.
MFC after: 2 weeks
Sponsored by: Chelsio Communications
1. Query the firmware for filter mode, mask, and related ingress config
instead of trying to figure them out from hardware registers. Read
configuration from the registers only when the firmware does not
support this query.
2. Use the firmware to set the filter mode. This is the correct way to
do it and is more flexible as well. The filter mode (and associated
ingress config) can now be changed any time it is safe to do so.
The user can specify a subset of a valid mode and the driver will
enable enough bits to make sure that the mode is maxed out -- that
is, it is not possible to set another bit without exceeding the
total width for optional filter fields. This is a hardware
requirement that was not enforced by the driver previously.
MFC after: 2 weeks
Sponsored by: Chelsio Communications
subset of a VI's RSS indirection table.
This makes it possible to make groups out of rx queues and steer
different kinds of traffic to different groups. For example, an
interface with 8 rx queues could have all non-TCP traffic delivered to
queues 0-3 and all TCP traffic to queues 4-7.
Note that it is already possible for filters to steer traffic to a
particular queue or to distribute it using the full indirection table
(much like normal rx does).
Sponsored by: Chelsio Communications
These filters reside in the card's memory instead of its TCAM and can be
configured via a new "hashfilter" subcommand in cxgbetool. Hash and
normal TCAM filters can be used together. The hardware does an
exact-match of packet fields for hash filters, unlike the masked match
performed for TCAM filters. Any T5/T6 card with memory can support at
least half a million hash filters. The sample config file with the
driver configures 512K of these, it is possible to double this to 1
million+ in some cases.
The chip does an exact-match of fields of incoming datagrams with hash
filters and performs the action configured for the filter if it matches.
The fields to match are specified in a "filter mask" in the firmware
config file. The filter mask always includes the 5-tuple (sip, dip,
sport, dport, ipproto). It can, optionally, also include any subset of
the filter mode (see filterMode and filterMask in the firmware config
file).
For example:
filterMode = fragmentation, mpshittype, protocol, vlan, port, fcoe
filterMask = protocol, port, vlan
Exact values of the 5-tuple, the physical port, and VLAN tag would have
to be provided while setting up a hash filter with the chip
configuration above.
Hash filters support all actions supported by TCAM filters. A packet
that hits a hash filter can be dropped, let through (with optional
steering to a specific queue or RSS region), switched out of another
port (with optional L2 rewrite of DMAC, SMAC, VLAN tag), or get NAT'ed.
(Support for some of these will show up in the driver in a follow-up
commit very shortly).
Sponsored by: Chelsio Communications
COP allows fine-grained control on whether to offload a TCP connection
using t4_tom, and what settings to apply to a connection selected for
offload. t4_tom must still be loaded and IFCAP_TOE must still be
enabled for full TCP offload to take place on an interface. The
difference is that IFCAP_TOE used to be the only knob and would enable
TOE for all new connections on the inteface, but now the driver will
also consult the COP, if any, before offloading to the hardware TOE.
A policy is a plain text file with any number of rules, one per line.
Each rule has a "match" part consisting of a socket-type (L = listen,
A = active open, P = passive open, D = don't care) and a pcap-filter(7)
expression, and a "settings" part that specifies whether to offload the
connection or not and the parameters to use if so. The general format
of a rule is: [socket-type] expr => settings
Example. See cxgbetool(8) for more information.
[L] ip && port http => offload
[L] port 443 => !offload
[L] port ssh => offload
[P] src net 192.168/16 && dst port ssh => offload !nagle !timestamp cong newreno
[P] dst port ssh => offload !nagle ecn cong tahoe
[P] dst port http => offload
[A] dst port 443 => offload tls
[A] dst net 192.168/16 => offload !timestamp cong highspeed
The driver processes the rules for each new listen, active open, or
passive open and stops at the first match. There is an implicit rule at
the end of every policy that prohibits offload when no rule in the
policy matches:
[D] all => !offload
This is a reworked and expanded version of a patch submitted by
Krishnamraju Eraparaju @ Chelsio.
Sponsored by: Chelsio Communications
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
debug (cudbg) code, hooked up to the main driver via an ioctl.
The ioctl can be used to collect the chip's internal state in a
compressed dump file. These dumps can be decoded with the "view"
component of cudbg.
Obtained from: Chelsio Communications
MFC after: 2 months
Sponsored by: Chelsio Communications
which is responsible for filtering and RSS.
Add the ability to use filters that match on PF/VF (aka "VNIC id") while
here. This is mutually exclusive with filtering on outer VLAN tag with
Q-in-Q.
Sponsored by: Chelsio Communications
scheduling classes in the chip and to bind tx queue(s) to a scheduling
class respectively. These can be used for various kinds of tx traffic
throttling (to force selected tx queues to drain at a fixed Kbps rate,
or a % of the port's total bandwidth, or at a fixed pps rate, etc.).
Obtained from: Chelsio
all T4 and T5 based cards and is useful for analyzing TSO, LRO, TOE, and
for general purpose monitoring without tapping any cxgbe or cxl ifnet
directly.
Tracers on the T4/T5 chips provide access to Ethernet frames exactly as
they were received from or transmitted on the wire. On transmit, a
tracer will capture a frame after TSO segmentation, hw VLAN tag
insertion, hw L3 & L4 checksum insertion, etc. It will also capture
frames generated by the TCP offload engine (TOE traffic is normally
invisible to the kernel). On receive, a tracer will capture a frame
before hw VLAN extraction, runt filtering, other badness filtering,
before the steering/drop/L2-rewrite filters or the TOE have had a go at
it, and of course before sw LRO in the driver.
There are 4 tracers on a chip. A tracer can trace only in one direction
(tx or rx). For now cxgbetool will set up tracers to capture the first
128B of every transmitted or received frame on a given port. This is a
small subset of what the hardware can do. A pseudo ifnet with the same
name as the nexus driver (t4nex0 or t5nex0) will be created for tracing.
The data delivered to this ifnet is an additional copy made inside the
chip. Normal delivery to cxgbe<n> or cxl<n> will be made as usual.
/* watch cxl0, which is the first port hanging off t5nex0. */
# cxgbetool t5nex0 tracer 0 tx0 (watch what cxl0 is transmitting)
# cxgbetool t5nex0 tracer 1 rx0 (watch what cxl0 is receiving)
# cxgbetool t5nex0 tracer list
# tcpdump -i t5nex0 <== all that cxl0 sees and puts on the wire
If you were doing TSO, a tcpdump on cxl0 may have shown you ~64K
"frames" with no L3/L4 checksum but this will show you the frames that
were actually transmitted.
/* all done */
# cxgbetool t5nex0 tracer 0 disable
# cxgbetool t5nex0 tracer 1 disable
# cxgbetool t5nex0 tracer list
# ifconfig t5nex0 destroy
includes support for the NIC and TOE features of the 40G, 10G, and
1G/100M cards based on the T5.
The ASIC is mostly backward compatible with the Terminator 4 so cxgbe(4)
has been updated instead of writing a brand new driver. T5 cards will
show up as cxl (short for cxlgb) ports attached to the t5nex bus driver.
Sponsored by: Chelsio
This lets userspace read arbitrary information from the SFP+ modules
etc. on this bus.
Reading multiple bytes in the same transaction isn't possible right now.
I'll update the driver once the chip's firmware supports this.
MFC after: 3 days
- Device configuration via plain text config file. Also able to operate
when not attached to the chip as the master driver.
- Generic "work request" queue that serves as the base for both ctrl and
ofld tx queues.
- Generic interrupt handler routine that can process any event on any
kind of ingress queue (via a dispatch table).
- A couple of new driver ioctls. cxgbetool can now install a firmware
to the card ("loadfw" command) and can read the card's memory
("memdump" and "tcb" commands).
- Lots of assorted information within dev.t4nex.X.misc.* This is
primarily for debugging and won't show up in sysctl -a.
- Code to manage the L2 tables on the chip.
- Updates to cxgbe(4) man page to go with the tunables that have changed.
- Updates to the shared code in common/
- Updates to the driver-firmware interface (now at fw 1.4.16.0)
MFC after: 1 month
filters working. (All other filters - switch without L2 info rewrite,
steer, and drop - were already fully-functional).
Some contrived examples of "switch" filters with L2 rewriting:
# cxgbetool t4nex0 iport 0 dport 80 action switch vlan +9 eport 3
Intercept all packets received on physical port 0 with TCP port 80 as
destination, insert a vlan tag with VID 9, and send them out of port 3.
# cxgbetool t4nex0 sip 192.168.1.1/32 ivlan 5 action switch \
vlan =9 smac aa:bb:cc:dd:ee:ff eport 0
Intercept all packets (received on any port) with source IP address
192.168.1.1 and VLAN id 5, rewrite the VLAN id to 9, rewrite source mac
to aa:bb:cc:dd:ee:ff, and send it out of port 0.
MFC after: 1 week
- Enable 5-tuple and every-packet lookup.
- Setup the default filter mode to allow filtering/steering based on IP
protocol, ingress port, inner VLAN ID, IP frag, FCoE, and MPS match
type; all combined together. You can also filter based on MAC index,
Ethernet type, IP TOS/IPv6 Traffic Class, and outer VLAN ID but you'll
have to modify the default filter mode and exclude some of the
match-fields in it.
IPv4 and IPv6 SIP/DIP/SPORT/DPORT are always available in all filter
rules.
- Add driver ioctls to get/set the global filter mode.
- Add driver ioctls to program and delete hardware filters. A couple of
the "switch" actions that rewrite Ethernet and VLAN information and
switch the packet out of another port may not work as the L2 code is not
yet in place. Everything else, including all "drop" and "pass" rules
with RSS or absolute qid, should work.
Obtained from: Chelsio Communications