Commit Graph

202 Commits

Author SHA1 Message Date
Navdeep Parhar
f02cc9b2a8 cxgbe(4): Fall back to a basic configuration in case of any error during
card initialization.  This is an expanded version of r333682.

Break up prep_firmware into simpler routines while here.  Load the
firmware/config KLD only if needed.

MFC after:	1 month
Sponsored by:	Chelsio Communications
2018-12-06 06:18:21 +00:00
John Baldwin
78afed1396 Move CLIP table handling out of TOM and into the base driver.
- Store the clip table in 'struct adapter' instead of in the TOM softc.
- Init the clip table during attach and teardown during detach.
- While here, add a dev.<nexus>.<unit>.misc.clip sysctl to dump the
  CLIP table.

This does mean that we update the clip table even if TOE is not enabled,
but non-TOE things need the CLIP table anyway.

Reviewed by:	np, Krishnamraju Eraparaju @ Chelsio
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D18010
2018-11-29 01:15:53 +00:00
John Baldwin
2d714dbcc7 Add read-only sysctls for all tunables in the cxgbe(4) driver.
Reviewed by:	np
MFC after:	1 month
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D18360
2018-11-27 22:02:54 +00:00
John Baldwin
bc13c69bef Move the TLS key map into the adapter softc so non-TOE code can use it.
Sponsored by:	Chelsio Communications
2018-11-15 23:00:30 +00:00
John Baldwin
5cdaef71a9 Add a facility for transmitting "raw" work requests on regular NIC queues.
- Use PH_loc.eight[1] as a general 'cflags' (Chelsio flags) field to
  describe properties of a queued packet.  The MC_RAW_WR flag
  indicates an mbuf holding a raw work request.  mbuf_cflags() returns
  the current flags.
- Raw work request mbufs are allocated via alloc_wr_mbuf() which will
  allocate a single contiguous range to hold the mbuf data.  The
  consumer can use mtod() to obtain the start of the work request and
  write the required work request in the buffer.  The mbuf can then be
  enqueued directly to the txq via mp_ring_enqueue().
- Since raw work requests might potentially send arbitrary work
  requests, only set the EQUIQ and EQUEQ bits on work requests that
  support them such as the normal tunneled Ethernet packet work
  requests.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D17811
2018-11-06 00:11:36 +00:00
Navdeep Parhar
b77aaff9bc cxgbe(4): Update the VI's default queue when netmap is enabled/disabled.
Sponsored by:	Chelsio Communications
2018-10-25 06:24:42 +00:00
Navdeep Parhar
17e81b7863 cxgbe(4): improve the accuracy of various TSO limits reported to the kernel.
Sponsored by:	Chelsio Communications
2018-10-22 23:57:59 +00:00
Navdeep Parhar
ea710848dc cxgbe(4): Link related changes.
- Switch to using 32b port/link capabilities in the driver.  The 32b
  format is used internally by firmwares > 1.16.45.0 and the driver will
  now interact with the firmware in its native format, whether it's 16b
  or 32b.  Note that the 16b format doesn't have room for 50G, 200G, or
  400G speeds.

- Add a bit in the pause_settings knobs to allow negotiated PAUSE
  settings to override manual settings.

- Ensure that manual link settings persist across an administrative
  down/up as well as transceiver unplug/replug.

- Remove unused is_*G_port() functions.

Approved by:	re@ (gjb@)
MFC after:	1 month
Sponsored by:	Chelsio Communications
2018-09-25 05:52:42 +00:00
Navdeep Parhar
b8bfcb71fd cxgbev(4): Updates to the VF driver to cope with recent ifmedia and
ctrlq changes in the base driver.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-08-23 00:58:10 +00:00
Navdeep Parhar
e7e0844422 cxgbe(4): Replace T4_PKT_TIMESTAMP with something slightly less hackish.
2018-08-18 04:23:51 +00:00
Navdeep Parhar
9f78434942 cxgbe(4): Use VLAN_TRUNKDEV instead of private cookie to figure out the
parent of a VLAN ifnet.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-08-15 21:24:05 +00:00
Navdeep Parhar
51347c3ff1 cxgbe(4): Use two hashes instead of a table to keep track of
hashfilters.  Two because the driver needs to look up a hashfilter by
its 4-tuple or tid.

A couple of fixes while here:
- Reject attempts to add duplicate hashfilters.
- Do not assume that any part of the 4-tuple that isn't specified is 0.
  This makes it consistent with all other mandatory parameters that
  already require explicit user input.
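
The two-hash arrangement described above can be sketched as follows.  This
is a toy model in Python rather than the driver's C code; the class name,
field names, and dictionary layout are all assumptions made for
illustration.

```python
class HashFilterTable:
    """Toy model: two dictionaries index the same filter objects."""

    def __init__(self):
        self.by_4tuple = {}  # (sip, dip, sport, dport) -> filter
        self.by_tid = {}     # tid -> filter

    def add(self, tid, sip, dip, sport, dport, action):
        # Every part of the 4-tuple must be explicit; nothing defaults to 0.
        if None in (sip, dip, sport, dport):
            raise ValueError("all four parts of the 4-tuple are required")
        key = (sip, dip, sport, dport)
        if key in self.by_4tuple:
            raise KeyError("duplicate hashfilter")
        flt = {"tid": tid, "key": key, "action": action}
        self.by_4tuple[key] = flt
        self.by_tid[tid] = flt
        return flt

    def remove_by_tid(self, tid):
        # Removing through one index must also clean up the other.
        flt = self.by_tid.pop(tid)
        del self.by_4tuple[flt["key"]]
        return flt
```

Both indexes point at the same object, so a lookup by tid (for example
when a reply from the hardware arrives) and a lookup by 4-tuple (when
checking for duplicates) see the same state.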

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2018-08-15 03:03:01 +00:00
Navdeep Parhar
37310a98a8 cxgbe(4): Move all control queues to the adapter.
There used to be one control queue per adapter (the mgmtq) that was
initialized during adapter init and one per port that was initialized
later during port init.  This change moves all the control queues (one
per port/channel) to the adapter so that they are initialized during
adapter init and are available before any port is up.  This allows the
driver to issue ctrlq work requests over any channel without having to
bring up any port.

MFH:		2 weeks
Sponsored by:	Chelsio Communications
2018-08-11 21:10:08 +00:00
Navdeep Parhar
3098bcfc05 cxgbe(4): Create two variants of service_iq, one for queues with
freelists and one for those without.

MFH:		3 weeks
Sponsored by:	Chelsio Communications
2018-08-11 04:55:47 +00:00
Navdeep Parhar
09a7189fb7 cxgbe(4): Allow the driver to specify a burst size when configuring a
traffic class for rate limiting.

Add experimental knobs that allow the user to specify a default pktsize
and burstsize for traffic classes associated with a port:

dev.<ifname>.<instance>.tc.pktsize
dev.<ifname>.<instance>.tc.burstsize

Sponsored by:	Chelsio Communications
2018-08-07 22:13:03 +00:00
Navdeep Parhar
1979b51141 cxgbe(4): Allow user-configured and driver-configured traffic classes to
be used simultaneously.  Move sysctl_tc and sysctl_tc_params to
t4_sched.c while here.

MFC after:	3 weeks
Sponsored by:	Chelsio Communications
2018-08-06 23:21:13 +00:00
Navdeep Parhar
af8854fdc1 cxgbe(4): Do not leak the filters in the hashfilter table on module
unload.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-06-27 01:51:17 +00:00
Navdeep Parhar
0afe96c7bf cxgbe(4): Add a hw.cxgbe.starve_fl sysctl that can be used to starve the
freelists of netmap receive queues.  This is primarily to test various
congestion scenarios in the chip.

Sponsored by:	Chelsio Communications
2018-06-15 23:42:22 +00:00
Navdeep Parhar
b9330ed7a2 cxgbe(4): Retire an old check.
2018-06-01 01:05:34 +00:00
Navdeep Parhar
2dae2a7487 cxgbe(4): Add code to deal with the chip's source MAC table (aka SMT).
Submitted by:	Krishnamraju Eraparaju @ Chelsio
Sponsored by:	Chelsio Communications
2018-05-31 21:31:08 +00:00
Navdeep Parhar
56226f5673 cxgbe(4): Consider all supported speeds when building the ifmedia list
for a port.  Fix other related issues while here:
- Require port lock for access to link_config.
- Allow 100Mbps operation by tracking the speed in Mbps.  Yes, really.
- New port flag to indicate that the media list is immutable.  It will
  be used in future refinements.

This also fixes a bug where the driver reports incorrect media with
recent firmwares.

MFC after:	2 days
Sponsored by:	Chelsio Communications
2018-05-30 22:36:09 +00:00
Navdeep Parhar
786099de5e cxgbe(4): Data path for rate-limited tx.
This is hardware support for the SO_MAX_PACING_RATE sockopt (see
setsockopt(2)), which is available in kernels built with "options
RATELIMIT".

Relnotes:	Yes
Sponsored by:	Chelsio Communications
2018-05-24 10:18:14 +00:00
Navdeep Parhar
9c707b3287 cxgbe(4): Make FW4_ACK a shared CPL. ETHOFLD in the base driver will
use it for per-flow rate limiting.

Sponsored by:	Chelsio Communications
2018-05-24 08:21:43 +00:00
Navdeep Parhar
67e071128d cxgbe(4): Implement ifnet callbacks that deal with send tags.
An etid (ethoffload tid) is allocated for a send tag and it acquires a
reference on the traffic class that matches the send parameters
associated with the tag.

Sponsored by:	Chelsio Communications
2018-05-18 06:09:15 +00:00
Navdeep Parhar
89f651e704 cxgbe(4): Add support for hash filters.
These filters reside in the card's memory instead of its TCAM and can be
configured via a new "hashfilter" subcommand in cxgbetool.  Hash and
normal TCAM filters can be used together.  The hardware does an
exact-match of packet fields for hash filters, unlike the masked match
performed for TCAM filters.  Any T5/T6 card with memory can support at
least half a million hash filters.  The sample config file with the
driver configures 512K of these; it is possible to double this to 1
million+ in some cases.

The chip does an exact-match of fields of incoming datagrams with hash
filters and performs the action configured for the filter if it matches.
The fields to match are specified in a "filter mask" in the firmware
config file.  The filter mask always includes the 5-tuple (sip, dip,
sport, dport, ipproto).  It can, optionally, also include any subset of
the filter mode (see filterMode and filterMask in the firmware config
file).

For example:
filterMode = fragmentation, mpshittype, protocol, vlan, port, fcoe
filterMask = protocol, port, vlan

Exact values of the 5-tuple, the physical port, and VLAN tag would have
to be provided while setting up a hash filter with the chip
configuration above.

Hash filters support all actions supported by TCAM filters.  A packet
that hits a hash filter can be dropped, let through (with optional
steering to a specific queue or RSS region), switched out of another
port (with optional L2 rewrite of DMAC, SMAC, VLAN tag), or get NAT'ed.
(Support for some of these will show up in the driver in a follow-up
commit very shortly).
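
The difference between the masked match of TCAM filters and the exact
match of hash filters can be sketched as follows.  This is an
illustrative Python model, not driver or firmware code; the function and
class names are hypothetical, and the key layout simply follows the
"filterMask = protocol, port, vlan" example above (5-tuple plus physical
port plus VLAN tag).

```python
def tcam_match(fields, value, mask):
    """Masked match: compare only the bits selected by mask."""
    return (fields & mask) == (value & mask)


class HashFilters:
    """Exact match: the complete key must equal a stored key."""

    def __init__(self):
        self.table = {}

    def add(self, key, action):
        self.table[key] = action

    def lookup(self, key):
        # Returns the configured action, or None when nothing matches.
        return self.table.get(key)


# Key for the example configuration: 5-tuple + physical port + VLAN tag.
key = ("10.0.0.1", "10.0.0.2", 1234, 80, 6, 0, 100)
filters = HashFilters()
filters.add(key, "drop")
```

A packet that differs from the key in any single field (for example a
different VLAN tag) misses the hash filter, whereas a TCAM filter with a
suitable mask could still match it.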

Sponsored by:	Chelsio Communications
2018-05-09 04:09:49 +00:00
Navdeep Parhar
e1320420d5 cxgbe(4): Move all TCAM filter code into a separate file.
Sponsored by:	Chelsio Communications
2018-05-01 20:17:22 +00:00
Navdeep Parhar
4535e8046f cxgbe(4): Use opaque cookies or tid range-checks to determine the
intended recipient of a CPL when it can't be determined solely from the
opcode.  Retire the per-queue handlers for such CPLs in favor of the new
scheme.

Sponsored by:	Chelsio Communications
2018-04-30 15:18:38 +00:00
Navdeep Parhar
8896672a77 cxgbe(4): Move release_tid to the base NIC driver for future consumers.
Sponsored by:	Chelsio Communications.
2018-04-26 22:04:21 +00:00
Navdeep Parhar
3747c1ffc7 cxgbe(4): Break up alloc_tid_tabs and move the atid routines to the base
NIC driver.  The atid services will be used by new features (hashfilters
and inline TLS) that do not involve TOE.

Sponsored by:	Chelsio Communications
2018-04-26 19:00:35 +00:00
Navdeep Parhar
1131c927c4 cxgbe(4): Add support for Connection Offload Policy (aka COP).
COP allows fine-grained control on whether to offload a TCP connection
using t4_tom, and what settings to apply to a connection selected for
offload.  t4_tom must still be loaded and IFCAP_TOE must still be
enabled for full TCP offload to take place on an interface.  The
difference is that IFCAP_TOE used to be the only knob and would enable
TOE for all new connections on the interface, but now the driver will
also consult the COP, if any, before offloading to the hardware TOE.

A policy is a plain text file with any number of rules, one per line.
Each rule has a "match" part consisting of a socket-type (L = listen,
A = active open, P = passive open, D = don't care) and a pcap-filter(7)
expression, and a "settings" part that specifies whether to offload the
connection or not and the parameters to use if so.  The general format
of a rule is: [socket-type] expr => settings

Example.  See cxgbetool(8) for more information.
[L] ip && port http => offload
[L] port 443 => !offload
[L] port ssh => offload
[P] src net 192.168/16 && dst port ssh => offload !nagle !timestamp cong newreno
[P] dst port ssh => offload !nagle ecn cong tahoe
[P] dst port http => offload
[A] dst port 443 => offload tls
[A] dst net 192.168/16 => offload !timestamp cong highspeed

The driver processes the rules for each new listen, active open, or
passive open and stops at the first match.  There is an implicit rule at
the end of every policy that prohibits offload when no rule in the
policy matches:
[D] all => !offload
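
The rule format and first-match semantics can be sketched as follows.
This is a Python illustration only, not the parser used by cxgbetool or
the driver.  Evaluating the pcap-filter(7) expression is out of scope
here, so the hypothetical expr_matches callback stands in for it, and
treating a D rule as matching every socket type is an assumption based on
the "don't care" wording above.

```python
import re

# [socket-type] expr => settings
RULE_RE = re.compile(r"^\[([LAPD])\]\s*(.*?)\s*=>\s*(.*)$")


def parse_policy(text):
    """Parse a policy into (socket_type, expr, settings) tuples."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        m = RULE_RE.match(line)
        if m is None:
            raise ValueError(f"bad rule: {line!r}")
        sock_type, expr, settings = m.groups()
        rules.append((sock_type, expr, settings.split()))
    return rules


def first_match(rules, sock_type, expr_matches):
    """Return the settings of the first matching rule.

    expr_matches is a caller-supplied predicate that stands in for
    pcap-filter evaluation against the connection's addresses and ports.
    """
    for rule_type, expr, settings in rules:
        if rule_type in ("D", sock_type) and expr_matches(expr):
            return settings
    # Implicit final rule: [D] all => !offload
    return ["!offload"]
```

For example, a passive open that matches no rule falls through to the
implicit rule and is not offloaded.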

This is a reworked and expanded version of a patch submitted by
Krishnamraju Eraparaju @ Chelsio.

Sponsored by:	Chelsio Communications
2018-04-14 19:07:56 +00:00
Navdeep Parhar
f8fea0d90e cxgbe: Implement tcp_info handler for connections handled by t4_tom.
The TCB is read using a memory window right now.  A better way to
get self-consistent, uncached information would be to use a GET_TCB
request but waiting for a reply from hw while holding non-sleepable
locks is quite inconvenient.

Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D14817
2018-04-03 01:22:15 +00:00
John Baldwin
1e9538d253 Support for TLS offload of TOE connections on T6 adapters.
The TOE engine in Chelsio T6 adapters supports offloading of TLS
encryption and TCP segmentation for offloaded connections.  Sockets
using TLS are required to use a set of custom socket options to upload
RX and TX keys to the NIC and to enable RX processing.  Currently
these socket options are implemented as TCP options in the vendor
specific range.  A patched OpenSSL library will be made available in a
port / package for use with the TLS TOE support.

TOE sockets can either offload both transmit and reception of TLS
records or just transmit.  TLS offload (both RX and TX) is enabled by
setting the dev.t6nex.<x>.tls sysctl to 1 and requires TOE to be
enabled on the relevant interface.  Transmit offload can be used on
any "normal" or TLS TOE socket by using the custom socket option to
program a transmit key.  This permits most TOE sockets to
transparently offload TLS when applications use a patched SSL library
(e.g. using LD_LIBRARY_PATH to request use of a patched OpenSSL
library).  Receive offload can only be used with TOE sockets using the
TLS mode.  The dev.t6nex.0.toe.tls_rx_ports sysctl can be set to a
list of TCP port numbers.  Any connection with either a local or
remote port number in that list will be created as a TLS socket rather
than a plain TOE socket.  Note that although this sysctl accepts an
arbitrary list of port numbers, the sysctl(8) tool is only able to set
sysctl nodes to a single value.  A TLS socket will hang without
receiving data if used by an application that is not using a patched
SSL library.  Thus, the tls_rx_ports node should be used with care.
For a server mostly concerned with offloading TLS transmit, this node
is not needed as plain TOE sockets will fall back to software crypto
when using an unpatched SSL library.

New per-interface statistics nodes are added giving counts of TLS
packets and payload bytes (payload bytes do not include TLS headers or
authentication tags/MACs) offloaded via the TOE engine, e.g.:

dev.cc.0.stats.rx_tls_octets: 149
dev.cc.0.stats.rx_tls_records: 13
dev.cc.0.stats.tx_tls_octets: 26501823
dev.cc.0.stats.tx_tls_records: 1620

TLS transmit work requests are constructed by a new variant of
t4_push_frames() called t4_push_tls_records() in tom/t4_tls.c.

TLS transmit work requests require a buffer containing IVs.  If the
IVs are too large to fit into the work request, a separate buffer is
allocated when constructing a work request.  This buffer is associated
with the transmit descriptor and freed when the descriptor is ACKed by
the adapter.

Received TLS frames use two new CPL messages.  The first message is a
CPL_TLS_DATA containing the decrypted payload of a single TLS record.
The handler places the mbuf containing the received payload on an
mbufq in the TOE pcb.  The second message is a CPL_RX_TLS_CMP message
which includes a copy of the TLS header and indicates if there were
any errors.  The handler for this message places the TLS header into
the socket buffer followed by the saved mbuf with the payload data.
Both of these handlers are contained in tom/t4_tls.c.
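
The two-message receive sequence can be sketched as follows.  This is a
simplified Python model, not the handlers in tom/t4_tls.c: the class and
method names are hypothetical, byte strings stand in for mbufs, and the
error indication carried by CPL_RX_TLS_CMP is left out.

```python
from collections import deque


class TlsRxModel:
    def __init__(self):
        self.pending = deque()  # payloads waiting for their CPL_RX_TLS_CMP
        self.sockbuf = []       # what the application will eventually read

    def on_tls_data(self, payload):
        # First message: hold on to the decrypted payload.
        self.pending.append(payload)

    def on_rx_tls_cmp(self, header):
        # Second message: the TLS header goes into the socket buffer
        # first, followed by the payload saved earlier.
        payload = self.pending.popleft()
        self.sockbuf.append(header)
        self.sockbuf.append(payload)
```

The payload is queued rather than delivered immediately because the
header that must precede it in the socket buffer only arrives with the
second message.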

A few routines were exposed from t4_cpl_io.c for use by t4_tls.c
including send_rx_credits(), a new send_rx_modulate(), and
t4_close_conn().

TLS keys for both transmit and receive are stored in onboard memory
in the NIC in the "TLS keys" memory region.

In some cases a TLS socket can hang with pending data available in the
NIC that is not delivered to the host.  As a workaround, TLS sockets
are more aggressive about sending CPL_RX_DATA_ACK messages anytime that
any data is read from a TLS socket.  In addition, a fallback timer will
periodically send CPL_RX_DATA_ACK messages to the NIC for connections
that are still in the handshake phase.  Once the connection has
finished the handshake and programmed RX keys via the socket option,
the timer is stopped.

A new function select_ulp_mode() is used to determine what sub-mode a
given TOE socket should use (plain TOE, DDP, or TLS).  The existing
set_tcpddp_ulp_mode() function has been renamed to set_ulp_mode() and
handles initialization of TLS-specific state when necessary in
addition to DDP-specific state.

Since TLS sockets do not receive individual TCP segments but always
receive full TLS records, they can receive more data than is available
in the current window (e.g. if a 16k TLS record is received but the
socket buffer is itself 16k).  To cope with this, just drop the window
to 0 when this happens, but track the overage and "eat" the overage as
it is read from the socket buffer, not opening the window (or adding
rx_credits) for the overage bytes.
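
The overage accounting can be sketched as follows.  This is a Python
model of the bookkeeping only, under the assumption that the window and
the overage are tracked as plain byte counts; the class and attribute
names are hypothetical and do not come from the driver.

```python
class RxWindowModel:
    def __init__(self, size):
        self.window = size  # bytes the peer is currently allowed to send
        self.overage = 0    # bytes received beyond the available window

    def receive_record(self, length):
        """A full TLS record arrives, possibly larger than the window."""
        if length > self.window:
            self.overage += length - self.window
            self.window = 0
        else:
            self.window -= length

    def read(self, nbytes):
        """The application reads nbytes; return the credits given back."""
        eaten = min(nbytes, self.overage)
        self.overage -= eaten
        credits = nbytes - eaten  # only these bytes reopen the window
        self.window += credits
        return credits
```

With a 100-byte window and a 120-byte record, the window drops to 0 and
the overage is 20; a later read of 50 bytes returns only 30 bytes of
credit because the first 20 bytes pay off the overage.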

Reviewed by:	np (earlier version)
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D14529
2018-03-13 23:05:51 +00:00
John Baldwin
52f8c52677 Move ccr_aes_getdeckey() from ccr(4) to the cxgbe(4) driver.
This routine will also be used by the TOE module to manage TLS keys.

Sponsored by:	Chelsio Communications
2018-02-26 22:12:31 +00:00
Wojciech Macek
19a5b68236 CXGBE: implement prefetch on non-Intel architectures
Submitted by:          Michal Stanek <mst@semihalf.com>
Obtained from:         Semihalf
Reviewed by:           np, pdk@semihalf.com
Sponsored by:          IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D14452
2018-02-21 08:05:56 +00:00
Navdeep Parhar
f549e3521d cxgbe(4): Do not forward interrupts to queues with freelists. This
leaves the firmware event queue (fwq) as the only queue that can take
interrupts for others.

This simplifies cfg_itype_and_nqueues and queue allocation in the driver
at the cost of a little (never?) used configuration.  It also allows
service_iq to be split into two specialized variants in the future.

MFC after:	2 months
Sponsored by:	Chelsio Communications
2017-12-22 19:10:19 +00:00
Pedro F. Giffuni
718cf2ccb9 sys/dev: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.
2017-11-27 14:52:40 +00:00
Hans Petter Selasky
937d37fc6c Merge ^/head r325842 through r325998.
2017-11-19 12:36:03 +00:00
Navdeep Parhar
8c61c6bbda cxgbe(4): Combine all _10g and _1g tunables and drop the suffix from
their names.  The finer-grained knobs weren't practically useful.

Sponsored by:	Chelsio Communications
2017-11-15 23:48:02 +00:00
Hans Petter Selasky
55b1c6e7e4 Merge ^/head r325663 through r325841.
2017-11-15 11:28:11 +00:00
Wojciech Macek
ec7f8d58b9 CXGBE: fix big-endian behaviour
The setbit/clearbit pair casts the bitfield pointer
to uint8_t* which effectively treats its contents as
a little-endian variable. The ffs() function accepts an int as
the parameter, which is big-endian. Use uint8_t here to
avoid mismatch, as we have only 4 doorbells.
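
The mismatch can be reproduced with a small model.  This is a Python
sketch, not the driver code: setbit() here mimics the byte-wise kernel
macro, ffs() mimics ffs(3), and a 4-byte buffer stands in for the storage
of an int.

```python
def setbit(buf, i):
    """Mimic setbit(): treat the storage as an array of bytes."""
    buf[i // 8] |= 1 << (i % 8)


def ffs(x):
    """1-based index of the least significant set bit, 0 if none."""
    return (x & -x).bit_length()


buf = bytearray(4)  # storage for a 32-bit int
setbit(buf, 2)      # mark doorbell bit 2

as_le = int.from_bytes(buf, "little")  # how a little-endian CPU reads it
as_be = int.from_bytes(buf, "big")     # how a big-endian CPU reads it

le_result = ffs(as_le)     # 3: agrees with the bit that was set
be_result = ffs(as_be)     # 27: the byte landed in the top of the int
byte_result = ffs(buf[0])  # 3: a uint8_t gives the same answer everywhere
```

Because only 4 doorbell bits are needed, they all fit in the first byte,
which is why switching to uint8_t removes the dependence on byte order.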

Submitted by:          Wojciech Macek <wma@freebsd.org>
Reviewed by:           np
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
Differential revision: https://reviews.freebsd.org/D13084
2017-11-15 06:45:33 +00:00
Navdeep Parhar
5c2bacde58 Update the iw_cxgbe bits in the projects branch.
Submitted by:	Krishnamraju Eraparaju @ Chelsio
Sponsored by:	Chelsio Communications
2017-11-07 23:52:14 +00:00
Navdeep Parhar
5bcae8ddfa cxgbe(4): Read the MPS buffer group map from the firmware as it could be
different from hardware defaults.  The congestion channel map, which is
still fixed, needs to be tracked separately now.  Change the congestion
setting for TOE rx queues to match the drivers on other OSes while here.

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2017-10-24 05:41:48 +00:00
Navdeep Parhar
08cd1f11bd cxgbe(4): Provide knobs to set the holdoff parameters of TOE rx queues
separately from NIC rx queues instead of using the same parameters for
both types of queues.

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2017-10-05 07:18:16 +00:00
Navdeep Parhar
2f318252cb cxgbe(4): Add two new debug flags -- one to allow manual firmware
install after full initialization, and another to disable the TCB
cache (T6+).  The latter works as a tunable only.

Note that debug_flags are for debugging only and should not be set
normally.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-08-30 23:41:04 +00:00
Navdeep Parhar
7023d9d4c6 cxgbe(4): Maintain one ifmedia per physical port instead of one per
Virtual Interface (VI).  All autonomous VIs that share a port share the
same media.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-08-28 21:44:25 +00:00
Navdeep Parhar
f6d9d14b93 cxgbe(4): Verify that the driver accesses the firmware mailbox in a
thread-safe manner.

MFC after:	3 days
2017-08-28 03:13:16 +00:00
Navdeep Parhar
5d973bad2a cxgbe(4): Save the last reported link parameters and compare them with
the current state to determine whether to generate a link-state change
notification.  This fixes a bug introduced in r321063 that caused the
driver to sometimes skip these notifications.
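
The compare-with-last-reported approach can be sketched as follows.  This
is a Python illustration, not the driver's code; the LinkState fields and
the PortModel class are hypothetical stand-ins for whatever link
parameters the driver actually saves.

```python
from collections import namedtuple

# Hypothetical set of link parameters; the real driver may track others.
LinkState = namedtuple("LinkState", "up speed fc fec")


class PortModel:
    def __init__(self):
        self.last_reported = None  # nothing has been reported yet

    def update(self, current):
        """Return True if a link-state change notification is needed."""
        if current == self.last_reported:
            return False
        self.last_reported = current
        return True
```

Comparing against what was last reported, rather than against some
intermediate state, means a change is only suppressed when the consumer
has already been told about exactly that state.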

Reported by:	Jason Eggleston @ LLNW
MFC after:	3 days
Sponsored by:	Chelsio Communications
2017-08-12 14:02:19 +00:00
Navdeep Parhar
01285747aa cxgbe(4): Various link/media related improvements.
- Deal with changes to port_type, and not just port_mod when a
  transceiver is changed.  This fixes hot swapping of transceivers of
  different types (QSFP+ or QSA or QSFP28 in a QSFP28 port, SFP+ or
  SFP28 in a SFP28 port, etc.).

- Always refresh media information for ifconfig if the port is down.
  The firmware does not generate transceiver-change interrupts unless at
  least one VI is enabled on the physical port.  Before this change
  ifconfig displayed potentially stale information for ports that were
  administratively down.

- Always recalculate and reapply L1 config on a transceiver change.

- Display PAUSE settings in ifconfig.  The driver sysctls for this
  continue to work as well.

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2017-07-17 00:42:13 +00:00
Navdeep Parhar
a8c4fcb9c7 cxgbe(4): Fix per-queue netmap operation.
Do not attempt to initialize netmap queues that are already initialized
or aren't supposed to be initialized.  Similarly, do not free queues
that are not initialized or aren't supposed to be freed.

PR:		217156
Sponsored by:	Chelsio Communications
2017-06-15 19:56:59 +00:00
John Baldwin
5033c43b7a Add a driver for the Chelsio T6 crypto accelerator engine.
The ccr(4) driver supports use of the crypto accelerator engine on
Chelsio T6 NICs in "lookaside" mode via the opencrypto framework.

Currently, the driver supports AES-CBC, AES-CTR, AES-GCM, and AES-XTS
cipher algorithms as well as the SHA1-HMAC, SHA2-256-HMAC, SHA2-384-HMAC,
and SHA2-512-HMAC authentication algorithms.  The driver also supports
chaining one of AES-CBC, AES-CTR, or AES-XTS with an authentication
algorithm for encrypt-then-authenticate operations.

Note that this driver is still under active development and testing and
may not yet be ready for production use.  It does pass the tests in
tests/sys/opencrypto with the exception that the AES-GCM implementation
in the driver does not yet support requests with a zero byte payload.

To use this driver currently, the "uwire" configuration must be used
along with explicitly enabling support for lookaside crypto capabilities
in the cxgbe(4) driver.  These can be done by setting the following
tunables before loading the cxgbe(4) driver:

    hw.cxgbe.config_file=uwire
    hw.cxgbe.cryptocaps_allowed=-1

MFC after:	1 month
Relnotes:	yes
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D10763
2017-05-17 22:13:07 +00:00
Navdeep Parhar
b2d8f4934e Adjust whitespace and fix a comment. No functional change.
MFC after:	3 days
2017-05-10 00:42:28 +00:00
Navdeep Parhar
1404daa76c cxgbe(4): Do not assume that if_qflush is always followed by interface-down.
MFC after:	3 days
Sponsored by:	Chelsio Communications
2017-05-09 18:33:41 +00:00
Navdeep Parhar
e006d2a6fd cxgbe(4): Fixes related to the knob that controls link autonegotiation.
- Do not leak the adapter lock in sysctl_autoneg.
- Accept only 0 or 1 as valid settings for autonegotiation.
- A fixed speed must be requested by the driver when autonegotiation is
  disabled otherwise the firmware will reject the l1cfg command.  Use
  the top speed supported by the port for now.

MFC after:	3 days
Sponsored by:	Chelsio Communications
2017-05-09 08:08:28 +00:00
Navdeep Parhar
2204b42716 cxgbe(4): Support routines for Tx traffic scheduling.
- Create a new file, t4_sched.c, and move all of the code related to
  traffic management from t4_main.c and t4_sge.c to this file.
- Track both Channel Rate Limiter (ch_rl) and Class Rate Limiter (cl_rl)
  parameters in the PF driver.
- Initialize all the cl_rl limiters with somewhat arbitrary default
  rates and provide routines to update them on the fly.
- Provide routines to reserve and release traffic classes.

MFC after:	1 month
Sponsored by:	Chelsio Communications
2017-05-02 20:38:10 +00:00
Navdeep Parhar
46f48ee519 cxgbe: Add tunables to control the number of LRO entries and the number
of rx mbufs that should be presorted before LRO.  There is no change in
default behavior.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-04-17 09:00:20 +00:00
Navdeep Parhar
358bca3bc6 cxgbe(4): Updates to link configuration.
- Update struct link_settings and associated shared code.

- Add tunables to control FEC and autonegotiation.  All ports inherit
  these values as their initial settings.
  hw.cxgbe.fec
  hw.cxgbe.autoneg

- Add per-port sysctls to control FEC and autonegotiation.  These can be
  modified at any time.
  dev.<port>.<n>.fec
  dev.<port>.<n>.autoneg

MFC after:	3 days
Sponsored by:	Chelsio Communications
2016-12-30 08:59:49 +00:00
Navdeep Parhar
1521ca71d3 cxgbe(4): Retire t4_bus_space_read_8 and t4_bus_space_write_8.
MFC after:	3 days
Sponsored by:	Chelsio Communications
2016-12-13 20:35:57 +00:00
Navdeep Parhar
b0c554c3a5 cxgbe/t4_tom: The SMAC entry for a VI is at a different location in the T6.
Sponsored by:	Chelsio Communications
2016-09-17 22:13:03 +00:00
Navdeep Parhar
e6b81479f9 cxgbe(4): Attach to cards with the Terminator 6 ASIC. T6 cards will
come up as 't6nex' nexus devices with 'cc' ports hanging off them.

The T6 firmware and configuration files will be added as soon as they
are released.  For now the driver will try to work with whatever
firmware and configuration is on the card's flash.

Sponsored by:	Chelsio Communications
2016-09-16 00:08:37 +00:00
Navdeep Parhar
4cf3aa135b cxgbe(4): Catch up with the rename of tlscaps -> cryptocaps. TLS is one
of the capabilities of the crypto engine in T6.

Sponsored by:	Chelsio Communications
2016-09-12 00:15:40 +00:00
Navdeep Parhar
9113e53d54 cxgbe(4): Add support for additional port types and link speeds.
Sponsored by:	Chelsio Communications.
2016-09-11 23:08:57 +00:00
John Baldwin
6af45170c1 Chelsio T4/T5 VF driver.
The cxgbev/cxlv driver supports Virtual Function devices for Chelsio
T4 and T5 adapters.  The VF devices share most of their code with the
existing PF4 driver (cxgbe/cxl) and as such the VF device driver
currently depends on the PF4 driver.

Similar to the cxgbe/cxl drivers, the VF driver includes a t4vf/t5vf
PCI device driver that attaches to the VF device.  It then creates
child cxgbev/cxlv devices representing ports assigned to the VF.
By default, the PF driver assigns a single port to each VF.

t4vf_hw.c contains VF-specific routines from the shared code used to
fetch VF-specific parameters from the firmware.

t4_vf.c contains the VF-specific PCI device driver and includes its
own attach routine.

VF devices are required to use a different firmware request when
transmitting packets (which in turn requires a different CPL message
to encapsulate messages).  This alternate firmware request does not
permit chaining multiple packets in a single message, so each packet
results in a firmware request.  In addition, the different CPL message
requires more detailed information when enabling hardware checksums,
so parse_pkt() on VF devices must examine L2 and L3 headers for all
packets (not just TSO packets).  Finally, L3 checksums
on non-UDP/non-TCP packets do not work reliably (the firmware trashes
the IPv4 fragment field), so IPv4 checksums for such packets are
calculated in software.

Most of the other changes in the non-VF-specific code are to expose
various variables and functions private to the PF driver so that they
can be used by the VF driver.

Note that a limited subset of cxgbetool functions are supported on VF
devices including register dumps, scheduler classes, and clearing of
statistics.  In addition, TOE is not supported on VF devices, only for
the PF interfaces.

Reviewed by:	np
MFC after:	2 months
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7599
2016-09-07 18:13:57 +00:00
Navdeep Parhar
e25621e5ea cxgbe(4): Provide more details about the card in the sysctl MIB.
dev.t5nex.0.%desc: Chelsio T580-CR
dev.t5nex.0.hw_revision: 1
dev.t5nex.0.sn: PT13140042
dev.t5nex.0.pn: 110117150A0
dev.t5nex.0.ec: 0000000000000000
dev.t5nex.0.na: 0007432AF490
dev.t5nex.0.vpd_version: 3
dev.t5nex.0.scfg_version: 53255
dev.t5nex.0.bs_version: 1.1.0.0
dev.t5nex.0.er_version: 1.0.0.68
dev.t5nex.0.tp_version: 0.1.4.9
dev.t5nex.0.firmware_version: 1.16.2.0

Sponsored by:	Chelsio Communications
2016-08-27 00:13:41 +00:00
John Baldwin
ec55567ce6 Track the base absolute ID of ingress and egress queues.
Use this to map an absolute queue ID to a logical queue ID in interrupt
handlers.  For the regular cxgbe/cxl drivers this should be a no-op as
the base absolute ID should be zero.  VF devices have a non-zero base
absolute ID and require this change.  While here, export the absolute ID
of egress queues via a sysctl.
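
The mapping from an absolute queue ID to a logical one can be sketched as
follows.  This is a Python illustration with hypothetical names; the
range check is an assumption added for the example and is not described
in the commit.

```python
def logical_qid(abs_id, abs_id_base, nqueues):
    """Map an absolute queue ID to an index into the driver's own table."""
    idx = abs_id - abs_id_base
    if not 0 <= idx < nqueues:
        raise IndexError(f"absolute queue ID {abs_id} is out of range")
    return idx
```

With a base of zero, as expected for the regular cxgbe/cxl drivers, the
function returns its input unchanged; with a non-zero base, as on VF
devices, the subtraction is what makes the table lookup valid.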

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7446
2016-08-09 17:49:42 +00:00
John Baldwin
a56745092c Reserve an adapter flag IS_VF to mark VF devices vs PF devices.
Sponsored by:	Chelsio Communications
2016-08-08 21:45:39 +00:00
John Baldwin
315048f2ad Store the offset of the KDOORBELL and GTS registers in the softc.
VF devices use a different register layout than PF devices.  Storing
the offset in a value in the softc allows code to be shared between the
PF and VF drivers.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7389
2016-08-01 22:39:51 +00:00
Navdeep Parhar
671bf2b8b2 cxgbe(4): Changes to the CPL-handler registration mechanism and code
related to "shared" CPLs.

a) Combine t4_set_tcb_field and t4_set_tcb_field_rpl into a single
function.  Allow callers to direct the response to any iq.  Tidy up
set_ulp_mode_iscsi while there to use names from t4_tcb.h instead of
magic constants.

b) Remove all CPL handler tables from struct adapter.  This reduces its
size by around 2KB.  All handlers are now registered at MOD_LOAD instead
of attach or some kind of initialization/activation.  The registration
functions do not need an adapter parameter any more.

c) Add per-iq handlers to deal with CPLs whose destination cannot be
determined solely from the opcode.  There are 2 such CPLs in use right
now: SET_TCB_RPL and L2T_WRITE_RPL.  The base driver continues to send
filter and L2T_WRITEs over the mgmtq and solicits the reply on fwq.
t4_tom (including the DDP code) now uses the port's ctrlq to send
L2T_WRITEs and SET_TCB_FIELDs and solicits the reply on an ofld_rxq.
fwq and ofld_rxq have different handlers that know what kind of tid to
expect in the reply.  Update t4_write_l2e and callers to support any
wrq/iq combination.

Approved by:	re@ (kib@)
Sponsored by:	Chelsio Communications
2016-07-05 01:29:24 +00:00
Navdeep Parhar
62291463de cxgbe(4): Merge netmap support from the ncxgbe/ncxl interfaces to the
vcxgbe/vcxl interfaces and retire the 'n' interfaces.  The main
cxgbe/cxl interfaces and tunables related to them are not affected by
any of this and will continue to operate as usual.

The driver used to create an additional 'n' interface for every
cxgbe/cxl interface if "device netmap" was in the kernel.  The 'n'
interface shared the wire with the main interface but was otherwise
autonomous (with its own MAC address, etc.).  It did not have normal
tx/rx but had a specialized netmap-only data path.  r291665 added
another set of virtual interfaces (the 'v' interfaces) to the driver.
These had normal tx/rx but no netmap support.

This revision consolidates the features of both the interfaces into the
'v' interface which now has a normal data path, TOE support, and native
netmap support.  The 'v' interfaces need to be created explicitly with
the hw.cxgbe.num_vis tunable.  This means "device netmap" will not
result in the automatic creation of any virtual interfaces.

The following tunables can be used to override the default number of
queues allocated for each 'v' interface.  nofld* = 0 will disable TOE on
the virtual interface and nnm* = 0 will disable native netmap
support.

# number of normal NIC queues
hw.cxgbe.ntxq_vi
hw.cxgbe.nrxq_vi

# number of TOE queues
hw.cxgbe.nofldtxq_vi
hw.cxgbe.nofldrxq_vi

# number of netmap queues
hw.cxgbe.nnmtxq_vi
hw.cxgbe.nnmrxq_vi

hw.cxgbe.nnm{t,r}xq{10,1}g tunables have been removed.

--- tl;dr version ---
The workflow for netmap on cxgbe starting with FreeBSD 11 is:
1) "device netmap" in the kernel config.
2) "hw.cxgbe.num_vis=2" in loader.conf.  num_vis > 2 is ok too, you'll
end up with multiple autonomous netmap-capable interfaces for every
port.
3) "dmesg | grep vcxl | grep netmap" to verify that the interface has
netmap queues.
4) Use any of the 'v' interfaces for netmap.  pkt-gen -i vcxl<n>... .
One major improvement is that the netmap interface has a normal data
path as expected.
5) Just ignore the cxl interfaces if you want to use netmap only.  No
need to bring them up.  The vcxl interfaces are completely independent
and everything should just work.
---------------------

Approved by:	re@ (gjb@)
Relnotes:	Yes
Sponsored by:	Chelsio Communications
2016-06-23 02:53:00 +00:00
Navdeep Parhar
02f972e8f3 cxgbe(4): Add a sysctl to manage the binding of a txq to a traffic class.
Sponsored by:	Chelsio Communications
2016-06-08 14:15:29 +00:00
Navdeep Parhar
46464b95b0 cxgbe(4): Track the state of the hardware traffic schedulers in the
driver.  This works as long as everyone uses set_sched_class_params
to program them.

Sponsored by:	Chelsio Communications
2016-06-07 00:27:55 +00:00
Pedro F. Giffuni
453130d9bf sys/dev: minor spelling fixes.
Most affect comments, very few have user-visible effects.
2016-05-03 03:41:25 +00:00
John Baldwin
80f3b01958 Remove #ifdef's from various structures used in the cxgbe/cxl driver.
This provides a constant ABI and layout for these structures (especially
struct adapter) avoiding some foot shooting.

Discussed with:	np
Sponsored by:	Chelsio Communications
2016-03-31 18:36:50 +00:00
Navdeep Parhar
0f2f53efd2 cxgbe(4): Catch up with the latest list of card capabilities as reported
by the firmware.
2016-03-12 02:54:55 +00:00
Navdeep Parhar
9945ceb857 cxgbe(4): Add sysctls to display the TP microcode version and the
expansion rom version (if there's one).

trantor:~# sysctl dev.t4nex dev.t5nex | grep _version
dev.t4nex.0.firmware_version: 1.15.28.0
dev.t4nex.0.tp_version: 0.1.9.4
dev.t5nex.0.firmware_version: 1.15.28.0
dev.t5nex.0.exprom_version: 1.0.0.68
dev.t5nex.0.tp_version: 0.1.4.9
2016-03-11 03:15:17 +00:00
Navdeep Parhar
c912289045 cxgbe(4): Add general purpose routines that offer safe access to the
chip's memory windows.  Convert existing users of these windows to the
new routines.
2016-03-10 06:15:31 +00:00
Navdeep Parhar
4d131308f3 cxgbe(4): Rename regwin_lock to reg_lock. It is used to protect access
to indirect registers only.
2016-03-08 22:23:30 +00:00
Navdeep Parhar
b3500921c4 cxgbe(4): Updates to the shared routines that deal with the serial EEPROM,
flash, and VPD.

Obtained from:	Chelsio Communications
2016-03-08 07:48:55 +00:00
Navdeep Parhar
700cfba72d cxgbe(4): Overhaul the shared code that deals with the chip's TP block,
which is responsible for filtering and RSS.

Add the ability to use filters that match on PF/VF (aka "VNIC id") while
here.  This is mutually exclusive with filtering on outer VLAN tag with
Q-in-Q.

Sponsored by:	Chelsio Communications
2016-03-08 02:04:05 +00:00
Navdeep Parhar
90e7434a6d cxgbe(4): Add a struct sge_params to store per-adapter SGE parameters.
Move the code that reads all the parameters to t4_init_sge_params in the
shared code.  Use these per-adapter values instead of globals.

Sponsored by:	Chelsio Communications
2016-03-08 00:23:56 +00:00
Navdeep Parhar
d1205d093d cxgbe(4): Very basic T6 awareness. This is part of ongoing work to
update to the latest internal shared code.

- Add a chip_params structure to keep track of hardware constants for
  all generations of Terminators handled by cxgbe.
- Update t4_hw_pci_read_cfg4 to work with T6.
- Update the hardware debug sysctls (hidden within dev.<tNnex>.<n>.misc.*) to
  work with T6.  Most of the changes are in the decoders for the CIM
  logic analyzer and the MPS TCAM.
- Acquire the regwin lock around indirect register accesses.

Obtained from:	Chelsio Communications
Sponsored by:	Chelsio Communications
2016-03-04 13:11:13 +00:00
Navdeep Parhar
e8c6ba7265 cxgbe(4): Add a sysctl to retrieve the maximum speed/bandwidth supported by a
port.

dev.cxgbe.<n>.max_speed
dev.cxl.<n>.max_speed

Sponsored by:	Chelsio Communications
2016-02-25 01:10:56 +00:00
Navdeep Parhar
40bf7442fa cxgbe: catch up with the latest hardware-related definitions.
Obtained from:	Chelsio Communications
Sponsored by:	Chelsio Communications
2016-02-19 00:29:16 +00:00
Navdeep Parhar
9eb533d3b4 cxgbe(4): Updates to the base NIC driver and t4_tom to support the iSCSI
offload driver.  These changes come from projects/cxl_iscsi.
2015-12-26 00:26:02 +00:00
John Baldwin
fe2ebb7644 Add support for configuring additional virtual interfaces (VIs) on a port.
Each virtual interface has its own MAC address, queues, and statistics.
The dedicated netmap interfaces (ncxgbeX / ncxlX) were already implemented
as additional VIs on each port.  This change allows additional non-netmap
interfaces to be configured on each port.  Additional virtual interfaces
use the naming scheme vcxgbeX or vcxlX.

Additional VIs are enabled by setting the hw.cxgbe.num_vis tunable to a
value greater than 1 before loading the cxgbe(4) or cxl(4) driver.
NB: The first VI on each port is the "main" interface (cxgbeX or cxlX).

T4/T5 NICs provide a limited number of MAC addresses for each physical port.
As a result, a maximum of six VIs can be configured on each port (including
the "main" interface and the netmap interface when netmap is enabled).

One user-visible result is that when netmap is enabled, packets received
or transmitted via the netmap interface are no longer counted in the stats
for the "main" interface, but are now accounted to the netmap interface.

The netmap interfaces now also have a new-bus device and export various
information sysctl nodes via dev.n(cxgbe|cxl).X.

The cxgbetool 'clearstats' command clears the stats for all VIs on the
specified port along with the port's stats.  There is currently no way to
clear the stats of an individual VI.

Reviewed by:	np
MFC after:	1 month
Sponsored by:	Chelsio
2015-12-03 00:02:01 +00:00
Navdeep Parhar
8faf57012b cxgbe(4): Save the flags for the last adapter-wide synchronized
operation that was initiated successfully.  (The caller and thread are
already recorded).

MFC after:	1 week
2015-08-19 15:40:03 +00:00
Navdeep Parhar
a1ed88571f cxgbe(4): Ask the firmware for the start of the RSS slice for a port and
save it for later.  This enables direct manipulation of the indirection
tables (although the stock driver doesn't do that right now).

MFC after:	1 month
2015-07-17 06:46:18 +00:00
Navdeep Parhar
9af71ab3bc cxgbe(4): Add a new knob that controls the congestion response of netmap
rx queues.  The default is to drop rather than backpressure.

This decouples the congestion settings of NIC and netmap rx queues.

MFC after:	3 days
2015-07-06 20:56:59 +00:00
Navdeep Parhar
0e4cd4a2e0 cxgbe(4): Add the ability to dump mailbox commands and replies. It is
enabled/disabled via bit 0 of adapter->debug_flags (which is available
at dev.t5nex.<n>.debug_flags).

MFC after:	1 week
2015-06-16 12:36:29 +00:00
Navdeep Parhar
1605bac6fb cxgbe(4): set up congestion management for netmap rx queues.
The hw.cxgbe.cong_drop knob controls the response of the chip when
netmap queues are congested.
2015-02-24 18:40:10 +00:00
Navdeep Parhar
b3d44a6800 cxgbe(4): tidy up some of the interaction between the Upper Layer
Drivers (ULDs) and the base if_cxgbe driver.

Track the per-adapter activation of ULDs in a new "active_ulds" field.
This was done pretty arbitrarily before this change -- via TOM_INIT_DONE
in adapter->flags for TOM, and the (1 << MAX_NPORTS) bit in
adapter->offload_map for iWARP.

iWARP and hw-accelerated iSCSI rely on the TOE (supported by the TOM
ULD).  The rules are:
a) If the iWARP and/or iSCSI ULDs are available when TOE is enabled then
   iWARP and/or iSCSI are enabled too.
b) When the iWARP and iSCSI modules are loaded they go looking for
   adapters with TOE enabled and enable themselves on that adapter.
c) You cannot deactivate or unload the TOM module from underneath iWARP
   or iSCSI.  Any such attempt will fail with EBUSY.
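
Rules a) to c) can be pictured as a per-adapter bitmask.  This is only
a sketch: the enum values, struct adapter_sketch, the helper names, and
the error returned when TOE is missing are all assumptions, not the
driver's actual code.  Only the EBUSY case comes from the text above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ULD identifiers; the real driver's numbering may differ. */
enum uld_id { ULD_TOM = 0, ULD_IWARP = 1, ULD_ISCSI = 2 };

struct adapter_sketch {
	uint32_t active_ulds;	/* bit N set => ULD N is active here */
};

static int
uld_active(const struct adapter_sketch *sc, enum uld_id id)
{
	return ((sc->active_ulds & (1u << id)) != 0);
}

/* Activate a ULD.  iWARP and iSCSI need TOE (the TOM ULD) to be active. */
static int
uld_activate(struct adapter_sketch *sc, enum uld_id id)
{
	assert(sc != NULL);
	if (id != ULD_TOM && !uld_active(sc, ULD_TOM))
		return (EOPNOTSUPP);	/* illustrative choice of errno */
	sc->active_ulds |= 1u << id;
	return (0);
}

/* Deactivate a ULD.  TOM cannot go away underneath iWARP or iSCSI. */
static int
uld_deactivate(struct adapter_sketch *sc, enum uld_id id)
{
	assert(sc != NULL);
	if (id == ULD_TOM &&
	    (uld_active(sc, ULD_IWARP) || uld_active(sc, ULD_ISCSI)))
		return (EBUSY);
	sc->active_ulds &= ~(1u << id);
	return (0);
}
```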

MFC after:	2 weeks
2015-02-08 09:28:55 +00:00
Navdeep Parhar
d86a5ff917 cxgbe(4): a change to the synchronization rules within the driver.
This is purely cosmetic because the new rules are already followed.

MFC after:	1 week
2015-02-08 08:42:45 +00:00
Navdeep Parhar
7951040f8a cxgbe(4): major tx rework.
a) Front load as much work as possible in if_transmit, before any driver
lock or software queue has to get involved.

b) Replace buf_ring with a brand new mp_ring (multiproducer ring).  This
is specifically for the tx multiqueue model where one of the if_transmit
producer threads becomes the consumer and other producers carry on as
usual.  mp_ring is implemented as standalone code and it should be
possible to use it in any driver with tx multiqueue.  It also has:
- the ability to enqueue/dequeue multiple items.  This might become
  significant if packet batching is ever implemented.
- an abdication mechanism to allow a thread to give up writing tx
  descriptors and have another if_transmit thread take over.  A thread
  that's writing tx descriptors can end up doing so for an unbounded
  time period if a) there are other if_transmit threads continuously
  feeding the software queue, and b) the chip keeps up with whatever the
  thread is throwing at it.
- accurate statistics about interesting events even when the stats come
  at the expense of additional branches/conditional code.
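
The "one producer becomes the consumer" idea in b) can be sketched as
follows.  This is not mp_ring: it uses a small spinlock instead of
mp_ring's own synchronization, has no abdication or batch enqueue, and
every name in it (ring_sketch, ring_enqueue, sum_drain) is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

#define RING_SIZE 64	/* illustrative capacity, power of 2 */

struct ring_sketch {
	atomic_flag lock;	/* tiny spinlock guarding items/head/tail */
	atomic_uint pending;	/* items enqueued but not yet drained */
	unsigned head, tail;	/* free-running; tail - head = queued */
	int items[RING_SIZE];
	void (*drain)(int item, void *arg);	/* stand-in for tx work */
	void *arg;
};

static void
ring_lock(struct ring_sketch *r)
{
	while (atomic_flag_test_and_set_explicit(&r->lock,
	    memory_order_acquire))
		;	/* spin */
}

static void
ring_unlock(struct ring_sketch *r)
{
	atomic_flag_clear_explicit(&r->lock, memory_order_release);
}

static void
ring_init(struct ring_sketch *r, void (*drain)(int, void *), void *arg)
{
	atomic_flag_clear(&r->lock);
	atomic_init(&r->pending, 0);
	r->head = r->tail = 0;
	r->drain = drain;
	r->arg = arg;
}

/* Example drain callback: adds each item to the long that arg points to. */
static void
sum_drain(int item, void *arg)
{
	*(long *)arg += item;
}

static int
ring_enqueue(struct ring_sketch *r, int item)
{
	int v;

	ring_lock(r);
	if (r->tail - r->head == RING_SIZE) {
		ring_unlock(r);
		return (ENOBUFS);
	}
	r->items[r->tail++ % RING_SIZE] = item;
	ring_unlock(r);

	/* The producer that raises pending from 0 becomes the consumer. */
	if (atomic_fetch_add(&r->pending, 1) != 0)
		return (0);	/* someone else is draining; carry on */
	do {
		ring_lock(r);
		v = r->items[r->head++ % RING_SIZE];
		ring_unlock(r);
		r->drain(v, r->arg);
	} while (atomic_fetch_sub(&r->pending, 1) != 1);
	return (0);
}
```

Every increment of pending happens after the item is in the ring, so
the draining thread always finds an item for each count it consumes.
Like the thread described above, it can keep draining for as long as
other producers keep feeding the ring.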

The NIC txq lock is uncontested on the fast path at this point.  I've
left it there for synchronization with the control events (interface
up/down, modload/unload).

c) Add support for "type 1" coalescing work request in the normal NIC tx
path.  This work request is optimized for frames with a single item in
the DMA gather list.  These are very common when forwarding packets.
Note that netmap tx in cxgbe already uses these "type 1" work requests.

d) Do not request automatic cidx updates every 32 descriptors.  Instead,
request updates via bits in individual work requests (still every 32
descriptors approximately).  Also, request an automatic final update
when the queue idles after activity.  This means NIC tx reclaim is still
performed lazily but it will catch up quickly as soon as the queue
idles.  This seems to be the best middle ground and I'll probably do
something similar for netmap tx as well.

e) Implement a faster tx path for WRQs (used by TOE tx and control
queues, _not_ by the normal NIC tx).  Allow work requests to be written
directly to the hardware descriptor ring if room is available.  I will
convert t4_tom and iw_cxgbe modules to this faster style gradually.

MFC after:	2 months
2014-12-31 23:19:16 +00:00
Navdeep Parhar
a7570ee305 Move KTR_CXGBE from t4_tom.h to adapter.h so that the base if_cxgbe
code can use it too.

MFC after:	1 week
2014-12-12 21:54:59 +00:00
Navdeep Parhar
b741402c40 cxgbe(4): allow the driver to use rx buffers that do not end on a pack
boundary.

MFC after:	2 weeks
2014-12-06 01:47:38 +00:00
Navdeep Parhar
e3207e1973 cxgbe(4): Allow for different pad and pack boundaries for different
adapters.  Set the pack boundary for T5 cards to be the same as the
PCIe max payload size.  The chip likes it this way.

In this revision the driver allocates rx buffers that align on both
boundaries.  This is not a strict requirement and a followup commit
will switch the driver to a more relaxed allocation strategy.
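
Aligning on both boundaries can be sketched as rounding up to the
larger of the two.  This assumes both boundaries are powers of 2 (not
stated above), so the larger is a multiple of the smaller.  The helper
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

static int
is_pow2(size_t x)
{
	return (x != 0 && (x & (x - 1)) == 0);
}

/*
 * Round len up to a multiple of both the pad and the pack boundary.
 * Assumption: both are powers of 2, so rounding to the larger suffices.
 */
static size_t
round_to_both(size_t len, size_t pad, size_t pack)
{
	size_t b = pad > pack ? pad : pack;

	assert(is_pow2(pad) && is_pow2(pack));
	return ((len + b - 1) & ~(b - 1));
}
```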

MFC after:	2 weeks
2014-12-06 00:13:56 +00:00
Navdeep Parhar
2d8910854b cxgbe(4): implement if_get_counter.
2014-09-27 05:50:31 +00:00
Navdeep Parhar
4d6db4e0f7 cxgbe(4): some optimizations in freelist handling.
MFC after:	2 weeks.
2014-08-02 06:55:36 +00:00
Navdeep Parhar
b2daa9a9cd cxgbe(4): minor optimizations in ingress queue processing.
Reorganize struct sge_iq.  Make the iq entry size a compile time
constant.  While here, eliminate RX_FL_ESIZE and use EQ_ESIZE directly.

MFC after:	2 weeks
2014-08-02 00:56:34 +00:00
Navdeep Parhar
82eff304b6 cxgbe(4): Keep track of the clusters that have to be freed by the
custom free routine (rxb_free) in the driver.  Fail MOD_UNLOAD with
EBUSY if any such cluster has been handed up to the kernel but hasn't
been freed yet.  This prevents a panic later when the cluster finally
needs to be freed but rxb_free is gone from the kernel.
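
The bookkeeping above can be sketched with a single atomic counter.
The names (extfree_refs, cluster_handed_up, cluster_freed,
mod_unload_check) are hypothetical and the real driver may track this
differently.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Clusters handed up to the kernel and not yet freed by rxb_free. */
static atomic_int extfree_refs;

/* Called when a cluster that needs the custom free routine goes up. */
static void
cluster_handed_up(void)
{
	atomic_fetch_add(&extfree_refs, 1);
}

/* Called from the custom free routine when the cluster comes back. */
static void
cluster_freed(void)
{
	int old = atomic_fetch_sub(&extfree_refs, 1);

	assert(old > 0);
	(void)old;
}

/* MOD_UNLOAD is refused while any such cluster is still outstanding. */
static int
mod_unload_check(void)
{
	return (atomic_load(&extfree_refs) != 0 ? EBUSY : 0);
}
```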

MFC after:	1 week
2014-07-23 22:29:22 +00:00
Navdeep Parhar
c3fb772502 Simplify r267600, there's no need to distinguish between allocated and
inlined mbufs.

MFC after:	1 week
2014-07-22 02:02:39 +00:00