Commit Graph

158 Commits

Author SHA1 Message Date
Navdeep Parhar
c3fce948fb cxgbe(4): Suppress a warning about code that is used only with options
RATELIMIT.

Reported by:	mmacy@
2018-05-25 18:57:41 +00:00
Navdeep Parhar
786099de5e cxgbe(4): Data path for rate-limited tx.
This is hardware support for the SO_MAX_PACING_RATE sockopt (see
setsockopt(2)), which is available in kernels built with "options
RATELIMIT".

Relnotes:	Yes
Sponsored by:	Chelsio Communications
2018-05-24 10:18:14 +00:00
Navdeep Parhar
9c707b3287 cxgbe(4): Make FW4_ACK a shared CPL. ETHOFLD in the base driver will
use it for per-flow rate limiting.

Sponsored by:	Chelsio Communications
2018-05-24 08:21:43 +00:00
Navdeep Parhar
a6a8ff351d cxgbe(4): Slightly simpler needs_<foo> functions. 2018-05-24 07:38:46 +00:00
Navdeep Parhar
2e09fe9116 cxgbe(4): Make sure that the egress queue's cidx is updated periodically
when the driver is writing WRs using start_wrq_wr/commit_wrq_wr all the
time.

Sponsored by:	Chelsio Communications
2018-05-24 06:44:06 +00:00
Navdeep Parhar
eff62dba61 cxgbe(4): Allocate offload Tx queues when a card has resources
provisioned for NIC_ETHOFLD and the kernel has option RATELIMIT.

It is possible to use the chip's offload queues for normal NIC Tx and
not just TOE Tx.  The difference is that these queues support out of
order processing of work requests and have a per-"flowid" mechanism for
tracking credits between the driver and hardware.  This allows Tx for
any number of flows bound to different rate limits to be submitted to a
single Tx queue and the work requests for slow flows won't cause HOL
blocking for the rest.

Sponsored by:	Chelsio Communications
2018-05-17 01:42:18 +00:00
Navdeep Parhar
89f651e704 cxgbe(4): Add support for hash filters.
These filters reside in the card's memory instead of its TCAM and can be
configured via a new "hashfilter" subcommand in cxgbetool.  Hash and
normal TCAM filters can be used together.  The hardware does an
exact-match of packet fields for hash filters, unlike the masked match
performed for TCAM filters.  Any T5/T6 card with memory can support at
least half a million hash filters.  The sample config file with the
driver configures 512K of these, it is possible to double this to 1
million+ in some cases.

The chip does an exact-match of fields of incoming datagrams with hash
filters and performs the action configured for the filter if it matches.
The fields to match are specified in a "filter mask" in the firmware
config file.  The filter mask always includes the 5-tuple (sip, dip,
sport, dport, ipproto).  It can, optionally, also include any subset of
the filter mode (see filterMode and filterMask in the firmware config
file).

For example:
filterMode = fragmentation, mpshittype, protocol, vlan, port, fcoe
filterMask = protocol, port, vlan

Exact values of the 5-tuple, the physical port, and VLAN tag would have
to be provided while setting up a hash filter with the chip
configuration above.

Hash filters support all actions supported by TCAM filters.  A packet
that hits a hash filter can be dropped, let through (with optional
steering to a specific queue or RSS region), switched out of another
port (with optional L2 rewrite of DMAC, SMAC, VLAN tag), or get NAT'ed.
(Support for some of these will show up in the driver in a follow-up
commit very shortly).

Sponsored by:	Chelsio Communications
2018-05-09 04:09:49 +00:00
Navdeep Parhar
111638bf68 cxgbe(4): Convert ACT_OPEN_RPL to a shared CPL.
Reserve 3b in the 14b atid to identify the owner and use it to dispatch
the CPL.  This allows all CPLs that use an atid to be used as shared
CPLs, although ACT_OPEN_RPL is the only one being converted in this
revision.

Sponsored by:	Chelsio Communications
2018-04-30 21:47:30 +00:00
Navdeep Parhar
4535e8046f cxgbe(4): Use opaque cookies or tid range-checks to determine the
intended recipient of a CPL when it can't be determined solely from the
opcode.  Retire the per-queue handlers for such CPLs in favor of the new
scheme.

Sponsored by:	Chelsio Communications
2018-04-30 15:18:38 +00:00
Navdeep Parhar
1131c927c4 cxgbe(4): Add support for Connection Offload Policy (aka COP).
COP allows fine-grained control on whether to offload a TCP connection
using t4_tom, and what settings to apply to a connection selected for
offload.  t4_tom must still be loaded and IFCAP_TOE must still be
enabled for full TCP offload to take place on an interface.  The
difference is that IFCAP_TOE used to be the only knob and would enable
TOE for all new connections on the inteface, but now the driver will
also consult the COP, if any, before offloading to the hardware TOE.

A policy is a plain text file with any number of rules, one per line.
Each rule has a "match" part consisting of a socket-type (L = listen,
A = active open, P = passive open, D = don't care) and a pcap-filter(7)
expression, and a "settings" part that specifies whether to offload the
connection or not and the parameters to use if so.  The general format
of a rule is: [socket-type] expr => settings

Example.  See cxgbetool(8) for more information.
[L] ip && port http => offload
[L] port 443 => !offload
[L] port ssh => offload
[P] src net 192.168/16 && dst port ssh => offload !nagle !timestamp cong newreno
[P] dst port ssh => offload !nagle ecn cong tahoe
[P] dst port http => offload
[A] dst port 443 => offload tls
[A] dst net 192.168/16 => offload !timestamp cong highspeed

The driver processes the rules for each new listen, active open, or
passive open and stops at the first match.  There is an implicit rule at
the end of every policy that prohibits offload when no rule in the
policy matches:
[D] all => !offload

This is a reworked and expanded version of a patch submitted by
Krishnamraju Eraparaju @ Chelsio.

Sponsored by:	Chelsio Communications
2018-04-14 19:07:56 +00:00
Wojciech Macek
f078ecf642 CXGBE: fix get_filt to be endianness-aware
Unconditional 32-bit shift is not endianness-safe.
Modify the logic to work both on LE and BE.

Submitted by:          Wojciech Macek <wma@freebsd.org>
Reviewed by:           np
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D13102
2018-01-11 09:17:02 +00:00
Navdeep Parhar
348694dada cxgbe(4): Reduce duplication by consolidating minor variations of the
same code into a single routine.

Sponsored by:	Chelsio Communications
2017-12-29 02:30:21 +00:00
Navdeep Parhar
f549e3521d cxgbe(4): Do not forward interrupts to queues with freelists. This
leaves the firmware event queue (fwq) as the only queue that can take
interrupts for others.

This simplifies cfg_itype_and_nqueues and queue allocation in the driver
at the cost of a little (never?) used configuration.  It also allows
service_iq to be split into two specialized variants in the future.

MFC after:	2 months
Sponsored by:	Chelsio Communications
2017-12-22 19:10:19 +00:00
Pedro F. Giffuni
718cf2ccb9 sys/dev: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-27 14:52:40 +00:00
Navdeep Parhar
5bcae8ddfa cxgbe(4): Read the MPS buffer group map from the firmware as it could be
different from hardware defaults.  The congestion channel map, which is
still fixed, needs to be tracked separately now.  Change the congestion
setting for TOE rx queues to match the drivers on other OSes while here.

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2017-10-24 05:41:48 +00:00
Gleb Smirnoff
e8fd18f306 Shorten list of arguments to mbuf external storage freeing function.
All of these arguments are stored in m_ext, so there is no reason
to pass them in the argument list.  Not all functions need the second
argument, some don't even need the first one.  The second argument
lives in next cache line, so not dereferencing it is a performance
gain.  This was discovered in sendfile(2), which will be covered by
next commits.

The second goal of this commit is to bring even more flexibility
to m_ext mbufs, allowing to create more fields in m_ext, opaque to
the generic mbuf code, and potentially set and dereferenced by
subsystems.

Reviewed by:	gallatin, kbowling
Differential Revision:	https://reviews.freebsd.org/D12615
2017-10-09 20:35:31 +00:00
Navdeep Parhar
08cd1f11bd cxgbe(4): Provide knobs to set the holdoff parameters of TOE rx queues
separately from NIC rx queues instead of using the same parameters for
both types of queues.

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2017-10-05 07:18:16 +00:00
Navdeep Parhar
8d6ae10af6 cxgbe(4): Fix a couple of problems in the sge_wrq data path.
- start_wrq_wr must not drain the wr_list if there are incomplete_wrs
  pending.  This can happen when a t4_wrq_tx runs between two
  start_wrq_wr.

- commit_wrq_wr must examine the cookie's pidx and ndesc with the
  queue's lock held.  Otherwise there is a bad race when incomplete WRs
  are being completed and commit_wrq_wr for the WR that is ahead in the
  queue updates the next incomplete WR's cookie's pidx/ndesc but the
  commit_wrq_wr for the second one is using stale values that it read
  without the lock.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-09-09 05:12:14 +00:00
Navdeep Parhar
2f318252cb cxgbe(4): Add two new debug flags -- one to allow manual firmware
install after full initialization, and another to disable the TCB
cache (T6+).  The latter works as a tunable only.

Note that debug_flags are for debugging only and should not be set
normally.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-08-30 23:41:04 +00:00
Navdeep Parhar
0fa7560dfd cxgbe(4): Fix some assertions during driver detach. The netmap queues
can't be initialized if the VI isn't.

MFC after:	3 days
2017-08-28 03:25:41 +00:00
Navdeep Parhar
c45b18681f cxgbe(4): Some updates to the common code.
- Updated register ranges.
- Helper routines for access to TP registers.
- Updated routine to read flash parameters.

Obtained from:	Chelsio Communications
MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2017-07-26 20:20:58 +00:00
Navdeep Parhar
a8c4fcb9c7 cxgbe(4): Fix per-queue netmap operation.
Do not attempt to initialize netmap queues that are already initialized
or aren't supposed to be initialized.  Similarly, do not free queues
that are not initialized or aren't supposed to be freed.

PR:		217156
Sponsored by:	Chelsio Communications
2017-06-15 19:56:59 +00:00
Navdeep Parhar
a59a14773a cxgbe(4): Update the statistics for compound tx work requests once per
work request, not once per frame.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-06-02 17:57:27 +00:00
Navdeep Parhar
3f1466a535 cxgbe(4): Avoid an out of bounds access when an attempt to unbind a tx
queue from a traffic class fails.

Reported by:	x ksi <s3810 at pjwstk edu pl>
MFC after:	3 days
2017-05-15 18:18:32 +00:00
Navdeep Parhar
c8da9163bf cxgbe(4): netmap-only interrupts for a VI do not have an associated rxq
or ofld_rxq and should be ignored by vi_intr_iq.

MFC after:	3 days.
Sponsored by:	Chelsio Communications
2017-05-14 09:07:13 +00:00
Navdeep Parhar
1404daa76c cxgbe(4): Do not assume that if_qflush is always followed by inteface-down.
MFC after:	3 days
Sponsored by:	Chelsio Communications
2017-05-09 18:33:41 +00:00
Navdeep Parhar
2204b42716 cxgbe(4): Support routines for Tx traffic scheduling.
- Create a new file, t4_sched.c, and move all of the code related to
  traffic management from t4_main.c and t4_sge.c to this file.
- Track both Channel Rate Limiter (ch_rl) and Class Rate Limiter (cl_rl)
  parameters in the PF driver.
- Initialize all the cl_rl limiters with somewhat arbitrary default
  rates and provide routines to update them on the fly.
- Provide routines to reserve and release traffic classes.

MFC after:	1 month
Sponsored by:	Chelsio Communications
2017-05-02 20:38:10 +00:00
Navdeep Parhar
46f48ee519 cxgbe: Add tunables to control the number of LRO entries and the number
of rx mbufs that should be presorted before LRO.  There is no change in
default behavior.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-04-17 09:00:20 +00:00
Navdeep Parhar
d491f8ca2f cxgbe: Add a tunable to configure the SGE time scaler, which is
available starting with T6.  The values in the timer holdoff registers
are multiplied by the scaling factor before use.

dev.<nexus>.<n>.holdoff_timers shows the final values of the
timers in microseconds.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-04-15 17:00:50 +00:00
Navdeep Parhar
87b027ba69 cxgbe(4): Enable automatic cidx flush for all control queues.
MFC after:	3 days
2017-01-09 22:20:09 +00:00
Navdeep Parhar
f50c49cca7 cxgbe(4): The wraparound logic in start_wrq_wr() should not get involved
in work requests that end at the end of the descriptor ring, even though
the pidx wraps around to 0.

MFC after:	3 days
2017-01-09 22:18:08 +00:00
Navdeep Parhar
1de8c69de7 cxgbe(4): Deal with compressed error vectors.
MFC after:	3 days
Sponsored by:	Chelsio Communications
2016-12-15 02:05:29 +00:00
Navdeep Parhar
77e9044c47 cxgbe(4): Fix bug in the calculation of the number of physically
contiguous regions in an mbuf chain.

If the payload of an mbuf ends at a page boundary count_mbuf_nsegs would
incorrectly consider the next mbuf's payload physically contiguous based
solely on a KVA comparison.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2016-10-24 19:09:56 +00:00
Navdeep Parhar
aa93b99aa0 cxgbe(4): Make the location/length of all descriptor rings available in
the sysctl MIB.
2016-09-23 20:03:28 +00:00
Navdeep Parhar
8c0ca00b72 cxgbe(4): Setup congestion response for T6 rx queues. 2016-09-21 00:50:22 +00:00
Navdeep Parhar
0459a175eb cxgbe(4): Fixes to wrq stats.
- Increment tx_wrs_copied in the correct place.
- Add tx_wrs_sspace to the sysctl MIB.

Sponsored by:	Chelsio Communications
2016-09-19 17:16:51 +00:00
Navdeep Parhar
97f2919d54 cxgbe(4): Use the interface's viid to calculate the PF/VF/VFValid fields
to use in tx work requests.
2016-09-15 08:30:47 +00:00
Navdeep Parhar
ed7e5640a5 cxgbe(4): Use smaller min/max bursts for fl descriptors with a T6.
Sponsored by:	Chelsio Communications
2016-09-11 17:51:17 +00:00
Navdeep Parhar
0dbc6cfd75 cxgbe(4): Update the pad_boundary calculation for T6, which has a
different range of boundaries.

Sponsored by:	Chelsio Communications
2016-09-11 17:22:54 +00:00
Navdeep Parhar
472a6004cf cxgbe(4): Use correct macro for header length with T6 ASICs. This
affects the transmit of the VF driver only.

Sponsored by:	Chelsio Communications
2016-09-11 16:11:51 +00:00
Navdeep Parhar
9e7cb06c17 cxgbe(4): Do not prescreen frames before attempting LRO.
Sponsored by:	Chelsio Communications
2016-09-09 07:34:14 +00:00
John Baldwin
6af45170c1 Chelsio T4/T5 VF driver.
The cxgbev/cxlv driver supports Virtual Function devices for Chelsio
T4 and T4 adapters.  The VF devices share most of their code with the
existing PF4 driver (cxgbe/cxl) and as such the VF device driver
currently depends on the PF4 driver.

Similar to the cxgbe/cxl drivers, the VF driver includes a t4vf/t5vf
PCI device driver that attaches to the VF device.  It then creates
child cxgbev/cxlv devices representing ports assigned to the VF.
By default, the PF driver assigns a single port to each VF.

t4vf_hw.c contains VF-specific routines from the shared code used to
fetch VF-specific parameters from the firmware.

t4_vf.c contains the VF-specific PCI device driver and includes its
own attach routine.

VF devices are required to use a different firmware request when
transmitting packets (which in turn requires a different CPL message
to encapsulate messages).  This alternate firmware request does not
permit chaining multiple packets in a single message, so each packet
results in a firmware request.  In addition, the different CPL message
requires more detailed information when enabling hardware checksums,
so parse_pkt() on VF devices must examine L2 and L3 headers for all
packets (not just TSO packets) for VF devices.  Finally, L2 checksums
on non-UDP/non-TCP packets do not work reliably (the firmware trashes
the IPv4 fragment field), so IPv4 checksums for such packets are
calculated in software.

Most of the other changes in the non-VF-specific code are to expose
various variables and functions private to the PF driver so that they
can be used by the VF driver.

Note that a limited subset of cxgbetool functions are supported on VF
devices including register dumps, scheduler classes, and clearing of
statistics.  In addition, TOE is not supported on VF devices, only for
the PF interfaces.

Reviewed by:	np
MFC after:	2 months
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7599
2016-09-07 18:13:57 +00:00
John Baldwin
e06ab612d2 Don't break out of the m_advance() loop if len drops to zero.
If a packet contains the Ethernet header (14 bytes) in the first mbuf
and the payload (IP + UDP + data) in the second mbuf, then the attempt
to fetch the l3hdr will return a NULL pointer.  The first loop iteration
will drop len to zero and exit the loop without setting 'p'.  However,
the desired data is at the start of the second mbuf, so the correct
behavior is to loop around and let the conditional set 'p' to m_data of
the next mbuf (and leave offset as 0).

Reviewed by:	np
Sponsored by:	Chelsio Communications
2016-09-07 18:08:43 +00:00
Navdeep Parhar
7cba15b16e cxgbe/cxgbei: Retire all DDP related code from cxgbei and switch to
routines available in t4_tom to manage the iSCSI DDP page pod region.

This adds the ability to use multiple DDP page sizes to the iSCSI
driver, among other improvements.

Sponsored by:	Chelsio Communications
2016-09-01 20:43:01 +00:00
John Baldwin
59c1e950b9 Make SGE parameter handling more VF-friendly.
Add fields to hold the SGE control register and free list buffer sizes to
the sge_params structure.  Populate these new fields in
t4_init_sge_params() for PF devices and change t4_read_chip_settings() to
pull these values out of the params structure instead of reading
registers directly.  This will permit t4_read_chip_settings() to be reused
for VF devices which cannot read SGE registers directly.

While here, move the call to t4_init_sge_params() to
get_params__post_init().  The VF driver will populate the SGE parameters
structure via a different method before calling t4_read_chip_settings().

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7476
2016-08-15 17:40:05 +00:00
John Baldwin
ec55567ce6 Track the base absolute ID of ingress and egress queues.
Use this to map an absolute queue ID to a logical queue ID in interrupt
handlers.  For the regular cxgbe/cxl drivers this should be a no-op as
the base absolute ID should be zero.  VF devices have a non-zero base
absolute ID and require this change.  While here, export the absolute ID
of egress queues via a sysctl.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7446
2016-08-09 17:49:42 +00:00
John Baldwin
8f6690d385 Fix a typo. 2016-08-08 21:28:02 +00:00
John Baldwin
315048f2ad Store the offset of the KDOORBELL and GTS registers in the softc.
VF devices use a different register layout than PF devices.  Storing
the offset in a value in the softc allows code to be shared between the
PF and VF drivers.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7389
2016-08-01 22:39:51 +00:00
John Baldwin
29c229e9fc Mark spg_len and fl_pktshift static.
These variables are no longer exported to t4_netmap.c after r296478.
2016-07-28 17:37:12 +00:00
John Baldwin
069af0eb14 Install a handler for firmware work request error messages.
If a driver sends an malformed or disallowed work request, the firmware
responds with a work request error.  Previously the driver treated this is
as an unexpected message and panicked.  Now it decodes the error message
to aid in debugging.

Reviewed by:	np (older version)
MFC after:	1 month
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D6950
2016-07-22 21:52:07 +00:00