Commit Graph

992 Commits

Author SHA1 Message Date
Navdeep Parhar
a71c41ccc4 cxgbe(4): Delay the panic due to a fatal error by 30s.
This lets information logged by the interrupt handler reach the system
log before the system goes down.
2019-02-09 01:49:53 +00:00
Navdeep Parhar
c0a248ef93 cxgbev(4): Initialize debug_flags from the environment like in the PF driver. 2019-02-08 03:31:38 +00:00
Navdeep Parhar
644b22ae36 cxgbe(4): Auto-dump the CIM block's logic analyzer on a TIMER0 interrupt.
Sponsored by:	Chelsio Communications
2019-02-07 05:40:51 +00:00
Navdeep Parhar
286fd42ba6 cxgbe(4): Auto-dump the device log on a mailbox timeout or when the
firmware reports an error in pcie_fw.

Sponsored by:	Chelsio Communications
2019-02-07 05:06:29 +00:00
Navdeep Parhar
cb7c3f124a cxgbe(4): Improved error reporting and diagnostics.
"slow" interrupt handler:
- Expand the list of INT_CAUSE registers known to the driver.
- Add decode information for many more bits but decouple it from the
  rest of intr_info so that it is entirely optional.
- Call t4_fatal_err exactly once, and from the top level PL intr handler.

t4_fatal_err:
- Use t4_shutdown_adapter from the common code to stop the adapter.
- Stop servicing slow interrupts after the first fatal one.

Driver/firmware interaction:
- CH_DUMP_MBOX: note whether the mailbox being dumped is a command or a
  reply or something else.
- Log the raw value of pcie_fw for some errors.
- Use correct log levels (debug vs. error).

Sponsored by:	Chelsio Communications
2019-02-01 20:42:49 +00:00
Navdeep Parhar
3496224a96 cxgbe/iw_cxgbe: Fix an address calculation in the memory registration code that
was added in r342266.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
Sponsored by:	Chelsio Communications
2019-01-30 05:39:47 +00:00
Navdeep Parhar
ef96741259 cxgbe(4): Add adapter information to messages logged by the OS-agnostic
code in t4_hw.c.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-01-29 00:49:12 +00:00
John Baldwin
cecf8bebe7 Fix a few more places to handle ofld tx queues for RATELIMIT.
- Drain offload transmit queues when RATELIMIT is enabled but
  TCP_OFFLOAD is not.
- Expose the per-VI nofldtxq and first_ofld_txq sysctls when
  RATELIMIT is enabled but TCP_OFFLOAD is not.
- Clear offload transmit queue stats as part of a 'cxgbetool clearstats'
  request when RATELIMIT is enabled but TCP_OFFLOAD is not.

Reviewed by:	np
MFC after:	2 weeks
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D18966
2019-01-25 20:54:18 +00:00
Navdeep Parhar
2a857082dc cxgbe(4): Allow negative values in hw.cxgbe.fw_install and take them to
mean that the driver should taste the firmware in the KLD and use that
firmware's version for all its fw_install checks.

The driver gets firmware version information from compiled-in values by
default and this change allows custom (or older/newer) firmware modules
to be used with the stock driver.

There is no change in default behavior.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-01-21 18:42:16 +00:00
Navdeep Parhar
8d5106760c cxgbe(4): Use a truncated firmware header for version checks. All the
version numbers are towards the begining of the header.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-01-21 17:58:06 +00:00
Navdeep Parhar
6baf1e4803 cxgbe(4): Clear the reply-pending status of a hashfilter when the reply
indicates an error.  Also, do not remove it twice from the hf list in
this case.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
MFC after:	1 week
Sponsored by:	Chelsio Communicatons
2019-01-20 23:30:16 +00:00
John Baldwin
475d54fac3 Reject new sessions if the necessary queues aren't initialized.
ccr reuses the control queue and first rx queue from the first port on
each adapter.  The driver cannot send requests until those queues are
initialized.  Refuse to create sessions for now if the queues aren't
ready.  This is a workaround until cxgbe allocates one or more
dedicated queues for ccr.

PR:		233851
MFC after:	1 week
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D18478
2019-01-15 18:53:45 +00:00
Navdeep Parhar
1dca7005b1 cxgbe(4): Move some INTx specific code to a more appropriate place. 2019-01-12 04:44:25 +00:00
Navdeep Parhar
1a35ae9353 cxgbe(4): Clear FW_OK if the firmware reports an error.
Sponsored by:	Chelsio Communications
2019-01-04 04:15:17 +00:00
Navdeep Parhar
450ffb7cb2 cxgbe(4): Attach to two T540 variants.
MFC after:	1 week
2018-12-30 01:57:11 +00:00
Navdeep Parhar
6c5c0137a9 Remove unused macros from t4_tom.h. 2018-12-21 20:46:45 +00:00
Navdeep Parhar
ad025209ba cxgbe/iw_cxgbe: Remove redundant CTRs from c4iw_alloc/c4iw_rdev_open.
This information is readily available elsewhere.

Sponsored by:	Chelsio Communications
2018-12-20 22:39:58 +00:00
Navdeep Parhar
6bb034658d cxgbe/iw_cxgbe: Do not terminate CTRx messages with \n. 2018-12-20 22:31:07 +00:00
Navdeep Parhar
9877f73541 cxgbe(4): Make sure the rx queues start off with the correct timestamp
settings on initialization.

Sponsored by:	Chelsio Communications
2018-12-20 20:34:21 +00:00
Navdeep Parhar
8953e80f5e cxgbe/iw_cxgbe: Use -ve errno when interfacing with linuxkpi/OFED.
Submitted by:	Krishnamraju Eraparaju @ Chelsio
Sponsored by:	Chelsio Communications
2018-12-20 01:35:45 +00:00
Navdeep Parhar
b562884d63 cxgbe/iw_cxgbe: Add a knob for testing that lets iWARP connections cycle
through 4-tuples quickly.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
Sponsored by:	Chelsio Communications
2018-12-20 01:00:21 +00:00
Navdeep Parhar
121684b714 cxgbe/iw_cxgbe: Use DSGLs to write to card's memory when appropriate.
Submitted by:	Krishnamraju Eraparaju @ Chelsio
Sponsored by:	Chelsio Communications
2018-12-19 23:29:01 +00:00
Navdeep Parhar
377328701d cxgbe(4): Do not issue mbox commands after t4_fw_bye.
Sponsored by:	Chelsio Communications
2018-12-19 19:21:29 +00:00
Navdeep Parhar
b156a400a6 cxgbe/t4_tom: fixes for issues on the passive open side.
- Fix PR 227760 by getting the TOE to respond to the SYN after the call
  to toe_syncache_add, not during it.  The kernel syncache code calls
  syncache_respond just before syncache_insert.  If the ACK to the
  syncache_respond is processed in another thread it may run before the
  syncache_insert and won't find the entry.  Note that this affects only
  t4_tom because it's the only driver trying to insert and expand
  syncache entries from different threads.

- Do not leak resources if an embryonic connection terminates at
  SYN_RCVD because of L2 lookup failures.

- Retire lctx->synq and associated code because there is never a need to
  walk the list of embryonic connections associated with a listener.
  The per-tid state is still called a synq entry in the driver even
  though the synq itself is now gone.

PR:		227760
MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2018-12-19 01:37:00 +00:00
Navdeep Parhar
9b11a65d1c cxgbe(4): Get Linux cxgb4vf working in bhyve VMs with VFs passed
through.

cxgb4vf doesn't own the buffer size list but still expects the first two
entries to be 4K and some power of 2 respectively.  The BSD cxgbe
doesn't care where its preferred buffer sizes are as long as they're in
the list somewhere, so just move its entries towards the end as a
workaround.

MFC after:	1 month
Sponsored by:	Chelsio Communicatons
2018-12-06 21:33:08 +00:00
Navdeep Parhar
f02cc9b2a8 cxgbe(4): Fall back to a basic configuration in case of any error during
card initialization.  This is an expanded version of r333682.

Break up prep_firmware into simpler routines while here.  Load the
firmware/config KLD only if needed.

MFC after:	1 month
Sponsored by:	Chelsio Communications
2018-12-06 06:18:21 +00:00
John Baldwin
31562c4440 Make most of the CLIP code conditional on #ifdef INET6.
This fixes builds of kernels without INET6 such as LINT-NOINET6.

Reported by:	arybchik
Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D18384
2018-11-29 23:14:54 +00:00
John Baldwin
78afed1396 Move CLIP table handling out of TOM and into the base driver.
- Store the clip table in 'struct adapter' instead of in the TOM softc.
- Init the clip table during attach and teardown during detach.
- While here, add a dev.<nexus>.<unit>.misc.clip sysctl to dump the
  CLIP table.

This does mean that we update the clip table even if TOE is not enabled,
but non-TOE things need the CLIP table anyway.

Reviewed by:	np, Krishnamraju Eraparaju @ Chelsio
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D18010
2018-11-29 01:15:53 +00:00
Vincenzo Maffione
43cf589ced cxgbe: revert r309725
After the fix contained in r341144, cxgbe does not need anymore
to set the IFCAP_NETMAP flag manually.

Reviewed by:	np
Approved by:	gnn (mentor)
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D17987
2018-11-28 15:29:58 +00:00
John Baldwin
2d714dbcc7 Add read-only sysctls for all tunables in the cxgbe(4) driver.
Reviewed by:	np
MFC after:	1 month
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D18360
2018-11-27 22:02:54 +00:00
Mark Johnston
423700997b Check for an allocation failure before dereferencing the pointer.
Reported by:	Ilja Van Sprundel <ivansprundel@ioactive.com>
Reviewed by:	np
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18310
2018-11-26 22:42:52 +00:00
Navdeep Parhar
c7a20141cc cxgbe(4): Update T4/5/6 firmwares to 1.22.0.3.
Obtained from:	Chelsio Communications
MFC after:	2 months
Sponsored by:	Chelsio Communications
2018-11-19 21:59:07 +00:00
John Baldwin
d09389fd05 Consolidate on a single set of constants for SCMD fields.
Both ccr(4) and the TOE TLS code had separate sets of constants for
fields in SCMD messages.

Sponsored by:	Chelsio Communications
2018-11-16 19:08:52 +00:00
John Baldwin
f0aefccb70 Restore the <sys/vmem.h> header to fix build of cxgbe(4) TOM.
vmem's are not just used for TLS memory in TOM and the #include actually
predates the TLS code so should not have been removed when the TLS vmem
moved in r340466.

Pointy hat to:	jhb
Sponsored by:	Chelsio Communications
2018-11-16 01:27:24 +00:00
John Baldwin
47c64f9e3e Remove bogus roundup2() of the key programming work request header.
The key context is always placed immediately after the work request
header.  The total work request length has to be rounded up by 16
however.

MFC after:	1 month
Sponsored by:	Chelsio Communications
2018-11-15 23:31:04 +00:00
John Baldwin
2939ecd3ce Change the quantum for TLS key addresses to 32 bytes.
The addresses passed when reading and writing keys are always shifted
right by 5 as the memory locations are addressed in 32-byte chunks, so
the quantum needs to be 32, not 8.

MFC after:	1 month
Sponsored by:	Chelsio Communications
2018-11-15 23:10:46 +00:00
John Baldwin
bc13c69bef Move the TLS key map into the adapter softc so non-TOE code can use it.
Sponsored by:	Chelsio Communications
2018-11-15 23:00:30 +00:00
John Baldwin
c15600b71a Use sbsndptr_adv() instead of sbsndptr() for TOE TLS.
For TOE TLS, we just want to advance the send pointer to skip over the
record just sent to the TOE.  The recently added sbsndptr_adv() is
sufficient for that and is cheaper.

MFC after:	1 month
Sponsored by:	Chelsio Communications
2018-11-15 22:47:47 +00:00
Julien Charbon
23d903a783 cxgbe/netmap: Fix cxgbe netmap when interface is DOWN
A kernel panic can occur if the cxgbe interface is DOWN
when activating netmap. This patch prevents the driver
from freeing up cxgbe netmap resources when they have not
been allocated.

Submitted by:	Nicolas Witkowski <nwitkowski@verisign.com>
Reviewed by:	np
MFC after:	1 week
Sponsored by:	Verisign, Inc.
Differential Revision:	https://reviews.freebsd.org/D17802
2018-11-12 17:57:12 +00:00
John Baldwin
fe03ca08a6 Use tcp_state_change() in the cxgbe(4) TOE module.
r254889 added tcp_state_change() as a centralized place to log state
changes in TCP connections for DTrace.  r294869 and r296881 took
advantage of this central location to manage per-state counters.
However, TOE sockets were still performing some (but not all) state
change updates via direct assignments to t_state.  This resulted in
state counters underflowing when TOE was in use.  Fix by using
tcp_state_change() when changing a TOE connection's state.

Reviewed by:	np, markj
MFC after:	1 month
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D17915
2018-11-09 21:16:45 +00:00
John Baldwin
2f3736eb65 Treat the memory lengths for CHELSIO_T4_GET_MEM as unsigned.
Previously attempts to read the MC region were failing since the
length was greater than 2^31.

Reviewed by:	np
MFC after:	2 months
Differential Revision:	https://reviews.freebsd.org/D17857
2018-11-06 22:33:36 +00:00
John Baldwin
5cdaef71a9 Add a facility for transmitting "raw" work requests on regular NIC queues.
- Use PH_loc.eight[1] as a general 'cflags' (Chelsio flags) field to
  describe properties of a queued packet.  The MC_RAW_WR flag
  indicates an mbuf holding a raw work request.  mbuf_cflags() returns
  the current flags.
- Raw work request mbufs are allocated via alloc_wr_mbuf() which will
  allocate a single contiguous range to hold the mbuf data.  The
  consumer can use mtod() to obtain the start of the work request and
  write the required work request in the buffer.  The mbuf can then be
  enqueued directly to the txq via mp_ring_enqueue().
- Since raw work requests might potentially send arbitrary work
  requests, only set the EQUIQ and EQUEQ bits on work requests that
  support them such as the normal tunneled Ethernet packet work
  requests.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D17811
2018-11-06 00:11:36 +00:00
Navdeep Parhar
5c239d80c0 cxgbe/iw_cxgbe: Suppress spurious "Unexpected streaming data ..."
messages.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
MFC after:	1 month
Sponsored by:	Chelsio Communications
2018-11-02 16:21:44 +00:00
John Baldwin
7890b5c14e Check cannot_use_txpkts() rather than needs_tso() in add_to_txpkts().
Currently this is a no-op, but will matter in the future when
cannot_use_txpkts() starts checking other conditions than just
needs_tso().

Sponsored by:	Chelsio Communications
2018-11-01 21:49:49 +00:00
John Baldwin
1d8d91db74 Add support for port unit wiring to cxgbe(4).
- Add a bus_child_location_str method to the nexus drivers that prints
  out 'port=N' as the location string exported via devinfo and the
  '%location' sysctl node.

- We can't use a bus_hint_device_unit to wire the unit numbers of
  devices with a fixed devclass as the device gets assigned a unit in
  make_device() before the device creator can set softc, etc.
  Instead, when adding a child device, use a helper function much like
  a bus_hint_device_unit method to look for wiring hints or to return
  -1 to let the system choose a unit number.  This function requires
  an "at" hint for the port pointing to the nexus device and a "port"
  hint listing the port number.  For example:

hint.cxl.4.at="t5nex0"
hint.cxl.4.port="0"

  wires cxl4 to the first port on the t5nex0 adapter.

Requested by:	gallatin
MFC after:	2 months
2018-11-01 21:46:37 +00:00
John Baldwin
dcd50a20b7 Assert that reclaim_tx_descs() is always making forward progress.
MFC after:	2 months
Sponsored by:	Chelsio Communications
2018-11-01 21:39:33 +00:00
Navdeep Parhar
6933902df4 cxgbe(4): Add rate limiting support for UDP.
MFC after:	1 month
Sponsored by:	Chelsio Communications
2018-10-31 19:19:13 +00:00
Navdeep Parhar
1272051e82 cxgbe(4): Report a reasonable non-zero if_hw_tsomaxsegsize to the
kernel.

This reverts an accidental change that snuck in with r339628.

Sponsored by:	Chelsio Communications
2018-10-31 18:30:17 +00:00
Navdeep Parhar
f01fc2d0e8 cxgbe/iw_cxgbe: Install the socket upcall before calling soconnect to
ensure that it always runs when soisconnected does.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
MFC after:	1 month
Sponsored by:	Chelsio Communications
2018-10-29 22:35:46 +00:00
John Baldwin
567a3784c2 Add support for "plain" (non-HMAC) SHA digests.
MFC after:	2 months
Sponsored by:	Chelsio Communications
2018-10-29 22:24:31 +00:00
Navdeep Parhar
f02c9e69cb cxgbe(4): Add a knob to split the rx queues for a netmap enabled
interface into two groups.  Filters can be used to match traffic
and distribute it across a group.

hw.cxgbe.nm_split_rss

Sponsored by:	Chelsio Communications
2018-10-25 22:55:18 +00:00
Navdeep Parhar
d54dafc600 cxgbe(4): Allow "pass" filters to distribute matching traffic using a
subset of a VI's RSS indirection table.

This makes it possible to make groups out of rx queues and steer
different kinds of traffic to different groups.  For example, an
interface with 8 rx queues could have all non-TCP traffic delivered to
queues 0-3 and all TCP traffic to queues 4-7.

Note that it is already possible for filters to steer traffic to a
particular queue or to distribute it using the full indirection table
(much like normal rx does).

Sponsored by:	Chelsio Communications
2018-10-25 14:37:26 +00:00
Navdeep Parhar
b77aaff9bc cxgbe(4): Update the VI's default queue when netmap is enabled/disabled.
Sponsored by:	Chelsio Communications
2018-10-25 06:24:42 +00:00
Navdeep Parhar
b5da13f72d cxgbe(4): new sysctl to display the start of the RSS region for a VI.
dev.<ifname>.<inst>.rss_base

For example:
dev.cc.0.rss_base: 0
dev.cc.1.rss_base: 128
dev.vcc.0.rss_base: 256
dev.vcc.1.rss_base: 384

Sponsored by:	Chelsio Communications
2018-10-25 01:20:32 +00:00
Navdeep Parhar
980ab1baa6 cxgbe/iw_cxgbe: save the ep in the driver-private provider_data field.
Submitted By: Lily Wang @ Netapp

MFC after:	1 week
2018-10-23 18:32:55 +00:00
John Baldwin
1146377b4b Support the SHA224 HMAC algorithm in ccr(4).
MFC after:	2 months
Sponsored by:	Chelsio Communications
2018-10-23 18:31:39 +00:00
Navdeep Parhar
17e81b7863 cxgbe(4): improve the accuracy of various TSO limits reported to the kernel.
Sponsored by:	Chelsio Communications
2018-10-22 23:57:59 +00:00
Navdeep Parhar
ddf09ad64b cxgbe(4): Use automatic cidx updates with ofld and ctrl queues.
The bits that explicitly request cidx updates do not work reliably with
all possible WRs that can be sent over the queue.  The F_FW_WR_EQUIQ
requests that still remain may also have to be replaced with explicit
credit flush WRs in the future.

MFC after:	2 days
Sponsored by:	Chelsio Communications
2018-10-22 23:06:23 +00:00
Navdeep Parhar
c3a88be466 cxgbe(4): Fix a divide-by-zero that occurs when hw.cxgbe.toecaps_allowed
is set to 0 with a kernel that has both TCP_OFFLOAD and RATELIMIT.

Approved by:	re@ (gjb@)
Sponsored by:	Chelsio Communications
2018-10-13 00:13:24 +00:00
Navdeep Parhar
ac4031d805 cxgbe(4): Enable support for per-connection rate limiting in the default
firmware configuration files.

Approved by:	re@ (gjb@)
Sponsored by:	Chelsio Communications
2018-09-26 21:16:07 +00:00
Navdeep Parhar
cb8dedc76e cxgbe(4): Treat base/end of firmware parameters as signed integers when
figuring out whether the range is valid or not.

Approved by:	re@ (rgrimes@)
MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-09-26 02:27:37 +00:00
Navdeep Parhar
ea710848dc cxgbe(4): Link related changes.
- Switch to using 32b port/link capabilities in the driver.  The 32b
  format is used internally by firmwares > 1.16.45.0 and the driver will
  now interact with the firmware in its native format, whether it's 16b
  or 32b.  Note that the 16b format doesn't have room for 50G, 200G, or
  400G speeds.

- Add a bit in the pause_settings knobs to allow negotiated PAUSE
  settings to override manual settings.

- Ensure that manual link settings persist across an administrative
  down/up as well as transceiver unplug/replug.

- Remove unused is_*G_port() functions.

Approved by:	re@ (gjb@)
MFC after:	1 month
Sponsored by:	Chelsio Communications
2018-09-25 05:52:42 +00:00
Navdeep Parhar
061bbaf7e7 cxgbe(4): Reuse existing "switching" L2T entries when possible.
Approved by:	re@ (rgrimes@)
Sponsored by:	Chelsio Communications
2018-09-22 01:24:30 +00:00
Navdeep Parhar
5f65b4cab2 cxgbe(4): Enable TXRTLMT by default when the feature is available in the
kernel (options RATELIMIT) and provisioned in the driver's configuration
file (nethofld > 0).

Submitted by:	gallatin@
Approved by:	re@ (kib@)
2018-09-18 21:34:37 +00:00
Navdeep Parhar
4bb64e96e4 cxgbe(4): Use the correct number of parameters when querying the tid
range for hashfilters.

Approved by:	re@ (gjb@)
2018-09-13 22:58:13 +00:00
Navdeep Parhar
6f3a49c317 cxgbe/iw_cxgbe: Fix reported build breakage when the kernel
configuration has "device cxgbe' but no VIMAGE.

Reported by:	mav@
Approved by:	re@ (kib@)
2018-09-13 16:27:21 +00:00
Navdeep Parhar
e32cd65c5b cxgbe/iw_cxgbe: Fix iWARP RDMA + VIMAGE operation by setting the VNET
properly in a couple of places in the driver.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
Approved by:	re@ (rgrimes@)
Sponsored by:	Chelsio Communications
2018-08-29 04:37:53 +00:00
Navdeep Parhar
d6ddb0848c cxgbe/tom: Unregister shared CPL handlers on module unload. This fixes
a panic with INVARIANTS that occurs when t4_tom is unloaded and reloaded.

Approved by:	re@ (kib@)
2018-08-28 18:16:02 +00:00
Navdeep Parhar
becda72184 cxgbe(4): Use fcmpset instead of cmpset when appropriate.
Suggested by:	mjg@
MFC after:	1 month
Sponsored by:	Chelsio Communications
2018-08-23 16:24:27 +00:00
Navdeep Parhar
b8bfcb71fd cxgbev(4): Updates to the VF driver to cope with recent ifmedia and
ctrlq changes in the base driver.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-08-23 00:58:10 +00:00
Navdeep Parhar
da6e33875c cxgbe(4): Be explicit about ignoring the return value of cmpset in some
cases.

Reported by:	Coverity (CIDs 1009398, 1009400, 1009401, 1357325, 1394783).  All false positives.
Sponsored by:	Chelsio Communications
2018-08-21 23:33:38 +00:00
Navdeep Parhar
24bc8671f9 cxgbe/tom: Make sure 'matched' is always initialized before use.
Reported by:	Coverity (CID 1390894)
MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-08-21 22:19:34 +00:00
Navdeep Parhar
e758d7758a cxgbe(4): Do not leak memory in case of errors during VI initialization.
Reported by:	Coverity (CID 1392026)
MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-08-21 22:15:57 +00:00
Navdeep Parhar
eb6f5d6e72 cxgbe(4): Make it clear that VI_INIT_DONE implies vi->ntxq > 0, and so
rc will never be returned uninitialized.

Reported by:	Coverity (CID 1394884).  This is a false positive though.
Sponsored by:	Chelsio Communications
2018-08-21 21:42:17 +00:00
Navdeep Parhar
036ff794dd cxgbe(4): Check the RO bit properly before disabling relaxed ordering.
Reported by:	Coverity (CID 1384286)
MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-08-21 21:32:51 +00:00
Navdeep Parhar
5a35633d7f cxgbe(4): Avoid overflow while calculating channel rate.
Reported by:	Coverity (CID 1008352)
MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-08-21 21:08:58 +00:00
Navdeep Parhar
7576fe761e cxgbe/tom: Provide the hardware tid in tcp_info.
Submitted by:	marius@
2018-08-20 21:40:14 +00:00
Navdeep Parhar
e7e0844422 cxgbe(4): Replace T4_PKT_TIMESTAMP with something slightly less hackish. 2018-08-18 04:23:51 +00:00
Navdeep Parhar
a56e2056a3 cxgbe(4): Adjust ntids to account for nhptids in the TOE case too.
This should have been part of r337538.
2018-08-17 20:28:31 +00:00
Navdeep Parhar
72049e7395 cxgbe/tom: Put the ifnet or VLAN's PCP value in the 802.1Q tag of frames
generated by the TOE.  Works with vid 0 (no VLAN, just priority) too.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-08-17 19:22:46 +00:00
Navdeep Parhar
9f78434942 cxgbe(4): Use VLAN_TRUNKDEV instead of private cookie to figure out the
parent of a VLAN ifnet.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-08-15 21:24:05 +00:00
Navdeep Parhar
51347c3ff1 cxgbe(4): Use two hashes instead of a table to keep track of
hashfilters.  Two because the driver needs to look up a hashfilter by
its 4-tuple or tid.

A couple of fixes while here:
- Reject attempts to add duplicate hashfilters.
- Do not assume that any part of the 4-tuple that isn't specified is 0.
  This makes it consistent with all other mandatory parameters that
  already require explicit user input.

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2018-08-15 03:03:01 +00:00
Navdeep Parhar
408954013a Whitespace nit in t4_tom.h 2018-08-13 19:21:28 +00:00
Navdeep Parhar
4a89444d7e Remove unused stuff from iw_cxgbe.h 2018-08-12 03:36:09 +00:00
Navdeep Parhar
37310a98a8 cxgbe(4): Move all control queues to the adapter.
There used to be one control queue per adapter (the mgmtq) that was
initialized during adapter init and one per port that was initialized
later during port init.  This change moves all the control queues (one
per port/channel) to the adapter so that they are initialized during
adapter init and are available before any port is up.  This allows the
driver to issue ctrlq work requests over any channel without having to
bring up any port.

MFH:		2 weeks
Sponsored by:	Chelsio Communications
2018-08-11 21:10:08 +00:00
Navdeep Parhar
3098bcfc05 cxgbe(4): Create two variants of service_iq, one for queues with
freelists and one for those without.

MFH:		3 weeks
Sponsored by:	Chelsio Communications
2018-08-11 04:55:47 +00:00
Navdeep Parhar
2d73ac5e4a cxgbe(4): Add a sysctl to control the tx credit reclaim mechanism for
netmap tx queues.  There is no change in default behavior.

Sponsored by:	Chelsio Communications
2018-08-09 21:52:51 +00:00
Navdeep Parhar
518bca2c21 cxgbe(4): Set fl_pktshift to 0 by default.
Sponsored by:	Chelsio Communications
2018-08-09 21:07:32 +00:00
Navdeep Parhar
8a684e1fd1 cxgbe(4): Display pkt-size and burst-size in traffic class parameters. 2018-08-09 14:36:44 +00:00
Navdeep Parhar
5fc0f72f3b cxgbe(4): Add support for high priority filters on T6+. They have their
own region in the TCAM starting with T6, unlike previous chips where
they were in the same region as normal filters.

These filters "hit" before anything else in the LE's lookup.  The exact
order is:
a) High priority filters
b) TOE's active region (TCAM and/or hash)
c) Servers (TOE hw listeners)
d) Normal filters

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-08-09 14:19:47 +00:00
Navdeep Parhar
09a7189fb7 cxgbe(4): Allow the driver to specify a burst size when configuring a
traffic class for rate limiting.

Add experimental knobs that allow the user to specify a default pktsize
and burstsize for traffic classes associated with a port:

dev.<ifname>.<instance>.tc.pktsize
dev.<ifname>.<instance>.tc.burstsize

Sponsored by:	Chelsio Communications
2018-08-07 22:13:03 +00:00
Navdeep Parhar
1979b51141 cxgbe(4): Allow user-configured and driver-configured traffic classes to
be used simultaneously.  Move sysctl_tc and sysctl_tc_params to
t4_sched.c while here.

MFC after:	3 weeks
Sponsored by:	Chelsio Communications
2018-08-06 23:21:13 +00:00
Navdeep Parhar
7b8f5a200a cxgbe(4): Break up sysctl_bitfield into 8 bit and 16 bit variants. Have
them display the current value of the bitfield rather than the fixed
value that was provided when the sysctl node was created.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-08-06 21:54:51 +00:00
Navdeep Parhar
564ec04ea8 Fix typo in cxgbe/t4_tom. 2018-08-06 19:09:55 +00:00
Navdeep Parhar
0c71c9ccb2 cxgbe(4): Improvements in TID management.
- Ignore any type of TID where the start/end values are not in the
  correct order.  There are situations where the firmware isn't able to
  reserve room for the number requested in the config file but doesn't
  report a failure during configuration and instead sets end <= start.

- Track start/end in tid_tab and remove some redundant copies from
  adapter->params.

- Move all the start/end and other read-only parameters to a quiet part
  of tid_tab, away from the tid locks.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-08-02 22:52:05 +00:00
Navdeep Parhar
ac8ec5fea6 cxgbe(4): Use the tx credit limit for ethofld rather than TOE when
initializing the softc for a per-flow rate limiter.  The limit happens
to be the same for both and the existing code worked by accident for
common configurations.

Reported by:	gallatin@
Sponsored by:	Chelsio Communications
2018-08-02 19:50:12 +00:00
Navdeep Parhar
aa8c29e5e7 cxgbe(4): Consider rateunit before ratemode when displaying information
about a traffic class.  This matches the order in which the firmware
evaluates unit and mode internally.

Sponsored by:	Chelsio Communications
2018-07-26 07:29:44 +00:00
Navdeep Parhar
f7c6e09244 cxgbe(4): Better defaults for all cl-rl rate limiters.
Start in "class" instead of "flow" mode.  This eliminates the need to
specify an MTU, which is not available that early anyway.  It also
allows the user to manually configure ch-rl rate limiting after attach.
This used to fail because ch-rl isn't supported if cl-rl "flow" mode is
configured.

Set all traffic classes to 1Gbps during initialization.  The goal is to
start off with _any_ valid configuration and 1Gbps works even for
gigabit cards.

Sponsored by:	Chelsio Communications
2018-07-26 06:42:09 +00:00
Navdeep Parhar
2095de1c3f cxgbe(4): Remove useless code that crept in with r336718.
MFC after:	3 days
X-MFC With:	336718
2018-07-25 17:45:43 +00:00
Navdeep Parhar
9c3b8b3c32 cxgbe(4): Validate only those parameters that are relevant to the
type of rate limiter being programmed.  Skip the ones that are not
applicable.

MFC after:	3 days
Sponsored by:	Chelsio Communications
2018-07-25 17:20:06 +00:00
Conrad Meyer
1b0909d51a OpenCrypto: Convert sessions to opaque handles instead of integers
Track session objects in the framework, and pass handles between the
framework (OCF), consumers, and drivers.  Avoid redundancy and complexity in
individual drivers by allocating session memory in the framework and
providing it to drivers in ::newsession().

Session handles are no longer integers with information encoded in various
high bits.  Use of the CRYPTO_SESID2FOO() macros should be replaced with the
appropriate crypto_ses2foo() function on the opaque session handle.

Convert OCF drivers (in particular, cryptosoft, as well as myriad others) to
the opaque handle interface.  Discard existing session tracking as much as
possible (quick pass).  There may be additional code ripe for deletion.

Convert OCF consumers (ipsec, geom_eli, krb5, cryptodev) to handle-style
interface.  The conversion is largely mechnical.

The change is documented in crypto.9.

Inspired by
https://lists.freebsd.org/pipermail/freebsd-arch/2018-January/018835.html .

No objection from:	ae (ipsec portion)
Reported by:	jhb
2018-07-18 00:56:25 +00:00
Navdeep Parhar
069262a734 Fix vertical whitespace nit in cxgbe. 2018-07-10 06:09:25 +00:00
Navdeep Parhar
82df14c3ab cxgbe(4): Add a sysctl to report the chip's microprocessor's load
averages.  This works with debug or custom firmwares only.

sysctl dev.<nexus>.<instance>.loadavg
sysctl dev.t6nex.0.loadavg

MFC after:	1 month
Sponsored by:	Chelsio Communications
2018-07-10 03:03:10 +00:00
Navdeep Parhar
4d96a1b772 cxgbe(4): Assume that any unknown flash on the card is 4MB and has 64KB
sectors, instead of refusing to attach to the card.

Submitted by:	Casey Leedom @ Chelsio
MFC after:	3 days
Sponsored by:	Chelsio Communications
2018-07-06 19:33:58 +00:00
Matt Macy
6573d7580b epoch(9): allow preemptible epochs to compose
- Add tracker argument to preemptible epochs
- Inline epoch read path in kernel and tied modules
- Change in_epoch to take an epoch as argument
- Simplify tfb_tcp_do_segment to not take a ti_locked argument,
  there's no longer any benefit to dropping the pcbinfo lock
  and trying to do so just adds an error prone branchfest to
  these functions
- Remove cases of same function recursion on the epoch as
  recursing is no longer free.
- Remove the the TAILQ_ENTRY and epoch_section from struct
  thread as the tracker field is now stack or heap allocated
  as appropriate.

Tested by: pho and Limelight Networks
Reviewed by: kbowling at llnw dot com
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D16066
2018-07-04 02:47:16 +00:00
Navdeep Parhar
d207040226 cxgbe/cxgbei: Fix harmful typo in the iSCSI offload driver.
Reported by:	gcc8 (via mmacy@)
MFC after:	3 days
Sponsored by:	Chelsio Communications
2018-06-27 14:29:13 +00:00
Navdeep Parhar
af8854fdc1 cxgbe(4): Do not leak the filters in the hashfilter table on module
unload.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-06-27 01:51:17 +00:00
Navdeep Parhar
b605d9cd51 cxgbe(4): Some mailbox commands require access to the Tx pipeline and
can time out if it's backed up due to a non-stop deluge of PAUSE frames
from a misbehaving peer.  Detect this situation and toggle MPS TxEn
to allow forward progress.

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2018-06-19 00:50:27 +00:00
Navdeep Parhar
0afe96c7bf cxgbe(4): Add a hw.cxgbe.starve_fl sysctl that can be used to starve the
freelists of netmap receive queues.  This is primarily to test various
congestion scenarios in the chip.

Sponsored by:	Chelsio Communications
2018-06-15 23:42:22 +00:00
Navdeep Parhar
31f494cdf5 cxgbe(4): Track the number of received frames separately from the number
of descriptors processed.  Add the ability to gather a certain maximum
number of frames in the driver's rx before waking up netmap rx.  If
there aren't enough frames then netmap rx will be woken up as usual.

hw.cxgbe.nm_rx_nframes

Sponsored by:	Chelsio Communications
2018-06-15 21:23:03 +00:00
Navdeep Parhar
6ddad9de86 cxgbe(4): sysctls to display the local and intr CPUs for the adapter.
The driver assumes the list can change (even though it does't right now)
and queries it every time the sysctl runs.

sysctl dev.<nexus>.<inst>.local_cpus
sysctl dev.<nexus>.<inst>.intr_cpus

sysctl dev.t6nex.0.local_cpus
sysctl dev.t6nex.0.intr_cpus

Sponsored by:	Chelsio Communications
2018-06-15 18:04:44 +00:00
Navdeep Parhar
2ea5b0f54a cxgbe(4): Catch up with recent changes in the kernel -- it no longer
holds non-sleepable locks around any of the driver ioctls.

Sponsored by:	Chelsio Communications
2018-06-14 01:27:35 +00:00
Navdeep Parhar
1bb577b4c2 cxgbe(4): Remove homemade version of htobe32 from the driver.
It was needed only for ia64 where it was implemented as a call to
bswapXX, which was always a real function.  htobeXX with a constant
argument is calculated at compile-time everywhere else.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-06-12 06:46:03 +00:00
Navdeep Parhar
c27fcc70cc cxgbe(4): Include full duplex mediaopt in media that can be reported as
active.  Always report full duplex in active media.

Sponsored by:	Chelsio Communications
2018-06-01 16:46:29 +00:00
Navdeep Parhar
b9330ed7a2 cxgbe(4): Retire an old check. 2018-06-01 01:05:34 +00:00
Navdeep Parhar
2c87bdc706 cxgbe(4): Add support for SMAC-rewriting filters.
Submitted by:	Krishnamraju Eraparaju @ Chelsio
Sponsored by:	Chelsio Communications
2018-05-31 21:56:57 +00:00
Navdeep Parhar
2dae2a7487 cxgbe(4): Add code to deal with the chip's source MAC table (aka SMT).
Submitted by:	Krishnamraju Eraparaju @ Chelsio
Sponsored by:	Chelsio Communications
2018-05-31 21:31:08 +00:00
Navdeep Parhar
1e3e6b634e cxgbe(4): Use ifm for ifmedia just like the rest of the kernel.
No functional change.
2018-05-31 02:22:40 +00:00
Navdeep Parhar
7cff4fd2d7 cxgbe(4): Implement ifm_change callback.
Sponsored by:	Chelsio Communications
2018-05-31 02:10:50 +00:00
Navdeep Parhar
56226f5673 cxgbe(4): Consider all supported speeds when building the ifmedia list
for a port.  Fix other related issues while here:
- Require port lock for access to link_config.
- Allow 100Mbps operation by tracking the speed in Mbps.  Yes, really.
- New port flag to indicate that the media list is immutable.  It will
  be used in future refinements.

This also fixes a bug where the driver reports incorrect media with
recent firmwares.

MFC after:	2 days
Sponsored by:	Chelsio Communications
2018-05-30 22:36:09 +00:00
Navdeep Parhar
c3fce948fb cxgbe(4): Suppress a warning about code that is used only with options
RATELIMIT.

Reported by:	mmacy@
2018-05-25 18:57:41 +00:00
Navdeep Parhar
475d42db4a cxgbe(4): Report IFCAP_TXRTLMT to kernels built with RATELIMIT if the
firmware has provisioned resources for this feature.

Sponsored by:	Chelsio Communications
2018-05-24 10:55:26 +00:00
Navdeep Parhar
786099de5e cxgbe(4): Data path for rate-limited tx.
This is hardware support for the SO_MAX_PACING_RATE sockopt (see
setsockopt(2)), which is available in kernels built with "options
RATELIMIT".

Relnotes:	Yes
Sponsored by:	Chelsio Communications
2018-05-24 10:18:14 +00:00
Navdeep Parhar
c90a8cf80a cxgbe/t4_tom: ABORT_RPL_RSS is a shared CPL and t4_tom shouldn't remove
the global handler when it's being unloaded.
2018-05-24 08:32:02 +00:00
Navdeep Parhar
9c707b3287 cxgbe(4): Make FW4_ACK a shared CPL. ETHOFLD in the base driver will
use it for per-flow rate limiting.

Sponsored by:	Chelsio Communications
2018-05-24 08:21:43 +00:00
Navdeep Parhar
1dd95f641e cxgbe(4): Fix range checks in is_etid. 2018-05-24 08:02:11 +00:00
Navdeep Parhar
a6a8ff351d cxgbe(4): Slightly simpler needs_<foo> functions. 2018-05-24 07:38:46 +00:00
Navdeep Parhar
2e09fe9116 cxgbe(4): Make sure that the egress queue's cidx is updated periodically
when the driver is writing WRs using start_wrq_wr/commit_wrq_wr all the
time.

Sponsored by:	Chelsio Communications
2018-05-24 06:44:06 +00:00
Navdeep Parhar
80259b6c12 cxgbe(4): Only valid filters are expected to have a valid tid. 2018-05-22 16:23:14 +00:00
Matt Macy
d7c5a620e2 ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
4.98   0.00   4.42   0.00 4235592     33   83.80 4720653 2149771   1235 247.32
4.73   0.00   4.20   0.00 4025260     33   82.99 4724900 2139833   1204 247.32
4.72   0.00   4.20   0.00 4035252     33   82.14 4719162 2132023   1264 247.32
4.71   0.00   4.21   0.00 4073206     33   83.68 4744973 2123317   1347 247.32
4.72   0.00   4.21   0.00 4061118     33   80.82 4713615 2188091   1490 247.32
4.72   0.00   4.21   0.00 4051675     33   85.29 4727399 2109011   1205 247.32
4.73   0.00   4.21   0.00 4039056     33   84.65 4724735 2102603   1053 247.32

After the patch

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
5.43   0.00   4.20   0.00 3313143     33   84.96 5434214 1900162   2656 245.51
5.43   0.00   4.20   0.00 3308527     33   85.24 5439695 1809382   2521 245.51
5.42   0.00   4.19   0.00 3316778     33   87.54 5416028 1805835   2256 245.51
5.42   0.00   4.19   0.00 3317673     33   90.44 5426044 1763056   2332 245.51
5.42   0.00   4.19   0.00 3314839     33   88.11 5435732 1792218   2499 245.52
5.44   0.00   4.19   0.00 3293228     33   91.84 5426301 1668597   2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15366
2018-05-18 20:13:34 +00:00
John Baldwin
24ddd0ec9c Be more robust against garbage input on a TOE TLS TX socket.
If a socket is closed or shutdown and a partial record (or what
appears to be a partial record) is waiting in the socket buffer,
discard the partial record and close the connection rather than
waiting forever for the rest of the record.

Reported by:	Harsh Jain @ Chelsio
Sponsored by:	Chelsio Communications
2018-05-18 19:09:11 +00:00
Navdeep Parhar
67e071128d cxgbe(4): Implement ifnet callbacks that deal with send tags.
An etid (ethoffload tid) is allocated for a send tag and it acquires a
reference on the traffic class that matches the send parameters
associated with the tag.

Sponsored by:	Chelsio Communications
2018-05-18 06:09:15 +00:00
Navdeep Parhar
d88a0442bb cxgbe(4): Fix s->neq miscalculation in r333698. 2018-05-17 06:04:50 +00:00
Navdeep Parhar
eff62dba61 cxgbe(4): Allocate offload Tx queues when a card has resources
provisioned for NIC_ETHOFLD and the kernel has option RATELIMIT.

It is possible to use the chip's offload queues for normal NIC Tx and
not just TOE Tx.  The difference is that these queues support out of
order processing of work requests and have a per-"flowid" mechanism for
tracking credits between the driver and hardware.  This allows Tx for
any number of flows bound to different rate limits to be submitted to a
single Tx queue and the work requests for slow flows won't cause HOL
blocking for the rest.

Sponsored by:	Chelsio Communications
2018-05-17 01:42:18 +00:00
Navdeep Parhar
93c0bfb85b cxgbe(4): Add NIC_ETHOFLD to the NIC capabilities allowed by the driver
by default.

This is the first of a series of commits that will add support for
RATELIMIT kernel option to the base if_cxgbe driver, for use with
ordinary NIC traffic "flows".  RATELIMIT is already supported by t4_tom
for the fully-offloaded TCP connections that it handles.

Sponsored by:	Chelsio Communications
2018-05-17 00:52:48 +00:00
Navdeep Parhar
47ae7a7e4f cxgbe(4): Fall back to a failsafe configuration built into the firmware
if an error is reported while pre-processing the configuration file that
the driver attempted to use.

Also, allow the user to explicitly use the built-in configuration with
hw.cxgbe.config_file="built-in"

MFC after:	2 days
Sponsored by:	Chelsio Communications
2018-05-16 17:55:16 +00:00
Navdeep Parhar
b6ce1d5b1d cxgbe(4): Add support for two more flash parts.
Obtained from:	Chelsio Communications
MFC after:	2 days
Sponsored by:	Chelsio Communications
2018-05-15 22:26:09 +00:00
Navdeep Parhar
40f242e440 cxgbe(4): Claim some more T5 and T6 boards.
MFC after:	2 days
Sponsored by:	Chelsio Communications
2018-05-15 21:54:59 +00:00
Navdeep Parhar
b3daa684d8 cxgbe(4): Filtering related features and fixes.
- Driver support for hardware NAT.
- Driver support for swapmac action.
- Validate a request to create a hashfilter against the filter mask.
- Add a hashfilter config file for T5.

Sponsored by:	Chelsio Communications
2018-05-15 04:24:38 +00:00
Navdeep Parhar
f348cdad1a cxgbe(4): Add fields to support configuration of hardware NAT and
swapmac (SMAC/DMAC switcheroo) from userspace.

Sponsored by:	Chelsio Communications
2018-05-10 20:39:04 +00:00
Navdeep Parhar
f7a203bc21 cxgbe(4): Disable write-combined doorbells by default.
This had been the default behavior but was changed accidentally as part
of the recent iw_cxgbe+OFED overhaul.  Fix another bug in that change
while here: the global knob affects all the adapters in the system and
should be left alone by per-adapter code.

MFC after:	3 days
Sponsored by:	Chelsio Communications
2018-05-10 06:33:54 +00:00
Navdeep Parhar
5174205de5 cxgbe(4): Determine whether the firmware supports the FILTER2 work
request, which can be used to configure hardware NAT and swapmac.

All firmwares released after Jan 2017 support this work request.

Sponsored by:	Chelsio Communications
2018-05-10 00:04:14 +00:00
Navdeep Parhar
89f651e704 cxgbe(4): Add support for hash filters.
These filters reside in the card's memory instead of its TCAM and can be
configured via a new "hashfilter" subcommand in cxgbetool.  Hash and
normal TCAM filters can be used together.  The hardware does an
exact-match of packet fields for hash filters, unlike the masked match
performed for TCAM filters.  Any T5/T6 card with memory can support at
least half a million hash filters.  The sample config file with the
driver configures 512K of these, it is possible to double this to 1
million+ in some cases.

The chip does an exact-match of fields of incoming datagrams with hash
filters and performs the action configured for the filter if it matches.
The fields to match are specified in a "filter mask" in the firmware
config file.  The filter mask always includes the 5-tuple (sip, dip,
sport, dport, ipproto).  It can, optionally, also include any subset of
the filter mode (see filterMode and filterMask in the firmware config
file).

For example:
filterMode = fragmentation, mpshittype, protocol, vlan, port, fcoe
filterMask = protocol, port, vlan

Exact values of the 5-tuple, the physical port, and VLAN tag would have
to be provided while setting up a hash filter with the chip
configuration above.

Hash filters support all actions supported by TCAM filters.  A packet
that hits a hash filter can be dropped, let through (with optional
steering to a specific queue or RSS region), switched out of another
port (with optional L2 rewrite of DMAC, SMAC, VLAN tag), or get NAT'ed.
(Support for some of these will show up in the driver in a follow-up
commit very shortly).

Sponsored by:	Chelsio Communications
2018-05-09 04:09:49 +00:00
Navdeep Parhar
b6f2c452cb cxgbe(4): Update all firmwares to 1.19.1.0.
These firmwares and the following list of changes are from the public
ChelsioUwire-3.7.1.0 release.

T6 Firmware
================================================================================
Version : 1.19.1.0
Date    : 04/23/2018
================================================================================

Fixes
-----

BASE:
- Fixed traffic stall when rate-limit is modified while running traffic.
- Fixes a firmware crash in FW_ETH_TX_EO_WR handling.
- Fixes host DCB support when FW_PORT_CMD is used.

ETH:
- Exit Auto-Negotiation if we don't receive base page from peer within 10s.
  This fixes some cases where in we keep on restarting auto negotiation without
  ever exiting, resulting in link failure.
- Fixes an issue where VF packets counter were not increasing if VF packets
  coalesced WR is used by driver.

OFLD:
- Kernel and user mode NVMEoF performance enhancements.

FOiSCSI:
- Fixes fw crash when trying to connect to non-existence IPv6 iSNS target.

================================================================================
Version : 1.18.9.0
Date    : 03/27/2018
================================================================================

Fixes
-----

BASE:
- For Ethernet frames less than 64B, pad them with zero bytes as per IEEE spec
  (RFC 894).
- Added a new parameter iqtype to FW_IQ_CMD to identify the ingress NIC or offload
  queues. This fixes an issue where driver was receiving interrupt with no new
  messages in queue.
- FW_PARAMS_CMD processes all the valaid paramaters and returns value 0UL for
  any unknown parameter.

OFLD:
- Fixes connection failure during SRQ reuse.
- Fixes incorrect cqe in case of WRITE with immediate operation.

FOiSCSI:
- Fixes a fw crash when wrong node-id is passed to FW_FOISCSI_CTRL_WR.

FOFCoE:
- Fixes a fw hang while creating NPIV.

Enhancements
------------

ETH:
- A new WR FW_ETH_TX_PKTS_VM_WR added to support VM packet coalescing.

================================================================================
Version : 1.18.4.0
Date    : 02/28/2018
================================================================================

Fixes
-----

BASE:
- Fixed Rate limiting not working for 101Mbps<=rate limit<=163Mbps range.
- Fixed starting more than 32 VMs on PF4 causing firmware hang.

ETH:
- Fixed link failure due to FEC mismatch with optics.
- Fixed link failure with link toggle stress tests.
- Only BaseR FEC is supported for 50G.
- Fixed a bug in next page handling which sometimes causes link down.
- Fixed port down due to failre to read eeprom contents of some modules.
- Fixed a bug causing adapter to fail with spider configuration.

FOiSCSI:
- Fixed a bug causing login failure when connecting to multiple targets.

Enhancements
------------

BASE:
- Added a new firmware API to retrieve the maximum temperaturethreshold for
  the chip (FW_PARAM_DEV_DIAG_MAXTMPTHRESH).

ETH:
- Added support for user to contol pause negotiation during auto negotiation.

FOiSCSI:
- Added a new facility to redirect few fw events to offload rx queue
  (based on driver's configration)
- Driver can ignore providing ipv6 prefix len during ipv6 address configuration.

================================================================================
Version : 1.17.14.0
Date    : 12/27/2017
================================================================================

FIXES
-----

BASE:
- Fixed an FLR failure during simulteneous power up of VM.
- Fixed an issue in vlan acl which was limiting vlan range to 1024.

ETH:
- Enabled RS-FEC for 25G active copper cable and 25GBASE-SR.
- When auto negotiation is enabled, final pause settings are resolved
  based on local and peer pause settings.
- Handle NACK for an I2C access.

OFLD
- Fixed rdma connection cleanup in SO adpater.
- Fixed rdma connections during read invalidate.
- Fixed the crash when invalid BW rate is passed to fw.
- Fixed the traffic hang when BW allocation is changed from switch during traffic.

FOFCoE:
- Fixed an issue where initiator remains logged-in even after LLDP is disabled
  on switch.

ENHANCEMENTS
------------

BASE:
- Added support for 248 VFs.
- Added fw driver periodic calibration for MC.

ETH:
- Added XLAUI port type support.
- Added raw mac entry deletion support (FW_VI_MAC_ID_BASED_FREE).

OFLD:
- Inline IPSec support added (flag F_FW_ULPTX_WR_DATA indicates the inline
  IPSec WR).
- New work request FW_RI_RDMA_WRITE_CMPL_WR (write with completion) added to

T5 Firmware
================================================================================
Version : 1.19.1.0
Date    : 04/23/2018
================================================================================

Fixes
-----

BASE:
- Fixes a firmware crash in FW_ETH_TX_EO_WR handling.
- Fixes host DCB support when FW_PORT_CMD is used.

ETH:
- Fixes an issue where VF packets counter were not increasing if VF packets
  coalesced WR is used by driver.

OFLD:
- Fixes an issue where fw hangs if max traffic rate passed is 0.

FOiSCSI:
-  Fixes fw crash when trying to connect to non-existence IPv6 iSNS target.

================================================================================
Version : 1.18.9.0
Date    : 03/27/2018
================================================================================

Fixes
-----

BASE:
- For Ethernet frames less than 64B, pad them with zero bytes as per IEEE spec
  (RFC 894).
- Added a new parameter iqtype to FW_IQ_CMD to identify the ingress NIC or offload
  queues. This fixes an issue where driver was receiving interrupt with no new
  messages in queue.

ETH:
- Pad the Ethernet packets of size less than 64B with zeros. This fixes the
  incorrect checksum generation of packets less then 64B.

FOiSCSI:
- Fixes a fw crash when wrong node-id is passed to FW_FOISCSI_CTRL_WR.

FOFCoE:
- Fixes a fw hang while creating NPIV.

Enhancements
------------

ETH:
- A new WR FW_ETH_TX_PKTS_VM_WR added to support VM packet coalescing.

================================================================================
Version : 1.18.4.0
Date    : 02/28/2018
================================================================================

Fixes
-----

BASE:
- Fixed starting more than 32 VMs on PF4 causing firmware hang.

FOiSCSI:
- Fixed a bug causing login failure when connecting to multiple targets.

Enhancements
------------

BASE:
- Added a new firmware API to retrieve the maximum temperaturethreshold for
  the chip (FW_PARAM_DEV_DIAG_MAXTMPTHRESH).

ETH:
- Added support for user to contol pause negotiation during auto negotiation.

FOiSCSI:
- Added a new facility to redirect few fw events to offload rx queue
  (based on driver's configration)
- Driver can ignore providing ipv6 prefix len during ipv6 address configuration.

================================================================================
Version : 1.17.14.0
Date    : 12/27/2017
================================================================================

FIXES
-----

BASE:
- Fixed an issue in vlan acl which was limiting vlan range to 1024.

ETH:
- Corrected lane inversion logic.
- Fixed improper LED behavior in T580 cards.
- When auto negotiation is enabled, final pause settings are resolved
  based on local and peer pause settings.
- Handle NACK for an I2C access.

OFLD
- Fixed rdma connections during read invalidate.

FOiSCSI:
- Fixed a connections hang when link is toggled frequently.

FOFCoE:
- Fixed an issue where initiator remains logged-in even after LLDP is disabled
  on switch.

ENHANCEMENTS
------------

BASE:
- Added support for 124 VFs.

ETH:
- Added XLAUI port type support.
- Added raw mac entry deletion support (FW_VI_MAC_ID_BASED_FREE).

OFLD:
- New work request FW_RI_RDMA_WRITE_CMPL_WR (write with completion) added to
  optimize NVMEoF write.

T4 Firmware
================================================================================
Version : 1.19.1.0
Date    : 04/23/2018
================================================================================

Fixes
-----

BASE:
- Fixes a firmware crash in FW_ETH_TX_EO_WR handling.
- Fixes host DCB support when FW_PORT_CMD is used.

FOiSCSI:
-  Fixes fw crash when trying to connect to non-existence IPv6 iSNS target.

================================================================================
Version : 1.18.9.0
Date    : 03/27/2018
================================================================================

Fixes
-----

BASE:
- Added a new paramter iqtype to FW_IQ_CMD to identify the ingress NIC or
  offload queues. This fixes an issue where driver was receiving interrupt with
  no new messages in queue.

FOFCoE:
- Fixes a fw hang while creating NPIV.

Enhancements
------------

ETH:
- A new WR FW_ETH_TX_PKTS_VM_WR added to support VM packet coalescing.

================================================================================
Version : 1.18.4.0
Date    : 02/28/2018
================================================================================

Enhancements
------------

BASE:
- Added a new firmware API to retrieve the maximum temperaturethreshold for
  the chip (FW_PARAM_DEV_DIAG_MAXTMPTHRESH).

================================================================================
Version : 1.17.14.0
Date    : 12/27/2017
================================================================================

FIXES
-----

BASE:
- Fixed an issue in vlan acl which was limiting vlan range to 1024.

MFC after:	3 days
Sponsored by:	Chelsio Communications
2018-05-05 20:16:08 +00:00
Navdeep Parhar
e1320420d5 cxgbe(4): Move all TCAM filter code into a separate file.
Sponsored by:	Chelsio Communications
2018-05-01 20:17:22 +00:00
Andrew Gallatin
a378e59420 Optionally panic when cxgbe encounters a fatal error
Sometimes it is better to panic than to leave a machine
unreachable.

Reviewed by:	np
Sponsored by:	Netflix
2018-05-01 15:33:21 +00:00
Navdeep Parhar
faf6d96b45 cxgbe(4): Destroy the cdev before disabling interrupts in driver detach.
Filter work requests are submitted in the nexus cdev's ioctl which then
blocks waiting for a reply.  If driver detach runs in this state and
disables interrupts the ioctl will never complete and detach will hang
in destroy_cdev.

Sponsored by:	Chelsio Communications
2018-05-01 14:59:38 +00:00
Navdeep Parhar
111638bf68 cxgbe(4): Convert ACT_OPEN_RPL to a shared CPL.
Reserve 3b in the 14b atid to identify the owner and use it to dispatch
the CPL.  This allows all CPLs that use an atid to be used as shared
CPLs, although ACT_OPEN_RPL is the only one being converted in this
revision.

Sponsored by:	Chelsio Communications
2018-04-30 21:47:30 +00:00
Navdeep Parhar
f8bc08fa57 cxgbe/t4_tom: Use appropriate macros instead of magic math while
constructing the atid of an active open work request.

Sponsored by:	Chelsio Communications
2018-04-30 17:33:44 +00:00
Navdeep Parhar
4535e8046f cxgbe(4): Use opaque cookies or tid range-checks to determine the
intended recipient of a CPL when it can't be determined solely from the
opcode.  Retire the per-queue handlers for such CPLs in favor of the new
scheme.

Sponsored by:	Chelsio Communications
2018-04-30 15:18:38 +00:00
John Baldwin
4217c685b1 Use the correct key address when renegotiating the transmit key.
Previously, get_keyid() was returning the address of the receive key
instead of the transmit key when renegotiating the transmit key.  This
could either hang the card (if a connection was only offloading TLS TX
and thus had a receive key address of -1) or cause the connection to
fail by overwriting the wrong key (if both RX and TX TLS were
offloaded).

Submitted by:	Harsh Jain @ Chelsio
Sponsored by:	Chelsio Communications
2018-04-27 17:20:23 +00:00
Navdeep Parhar
8896672a77 cxgbe(4): Move release_tid to the base NIC driver for future consumers.
Sponsored by:	Chelsio Communications.
2018-04-26 22:04:21 +00:00
Navdeep Parhar
3747c1ffc7 cxgbe(4): Break up alloc_tid_tabs and move the atid routines to the base
NIC driver.  The atid services will be used by new features (hashfilters
and inline TLS) that do not involve TOE.

Sponsored by:	Chelsio Communications
2018-04-26 19:00:35 +00:00
Navdeep Parhar
8aa1c1d8b4 cxgbe(4): Fix bugs in the handling of COP rules that match on VLAN tag.
Retrieve the tag from the correct ifnet and use the provided tag
(instead of hardcoded 0xffff, implying no tag) in the routines that
process offload policy.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
Sponsored by:	Chelsio Communications
2018-04-19 18:10:44 +00:00
Navdeep Parhar
1131c927c4 cxgbe(4): Add support for Connection Offload Policy (aka COP).
COP allows fine-grained control on whether to offload a TCP connection
using t4_tom, and what settings to apply to a connection selected for
offload.  t4_tom must still be loaded and IFCAP_TOE must still be
enabled for full TCP offload to take place on an interface.  The
difference is that IFCAP_TOE used to be the only knob and would enable
TOE for all new connections on the inteface, but now the driver will
also consult the COP, if any, before offloading to the hardware TOE.

A policy is a plain text file with any number of rules, one per line.
Each rule has a "match" part consisting of a socket-type (L = listen,
A = active open, P = passive open, D = don't care) and a pcap-filter(7)
expression, and a "settings" part that specifies whether to offload the
connection or not and the parameters to use if so.  The general format
of a rule is: [socket-type] expr => settings

Example.  See cxgbetool(8) for more information.
[L] ip && port http => offload
[L] port 443 => !offload
[L] port ssh => offload
[P] src net 192.168/16 && dst port ssh => offload !nagle !timestamp cong newreno
[P] dst port ssh => offload !nagle ecn cong tahoe
[P] dst port http => offload
[A] dst port 443 => offload tls
[A] dst net 192.168/16 => offload !timestamp cong highspeed

The driver processes the rules for each new listen, active open, or
passive open and stops at the first match.  There is an implicit rule at
the end of every policy that prohibits offload when no rule in the
policy matches:
[D] all => !offload

This is a reworked and expanded version of a patch submitted by
Krishnamraju Eraparaju @ Chelsio.

Sponsored by:	Chelsio Communications
2018-04-14 19:07:56 +00:00
Vincenzo Maffione
2ff91c175e netmap: align codebase to the current upstream (commit id 3fb001303718146)
Changelist:
    - Turn tx_rings and rx_rings arrays into arrays of pointers to kring
      structs. This patch includes fixes for ixv, ixl, ix, re, cxgbe, iflib,
      vtnet and ptnet drivers to cope with the change.
    - Generalize the nm_config() callback to accept a struct containing many
      parameters.
    - Introduce NKR_FAKERING to support buffers sharing (used for netmap
      pipes)
    - Improved API for external VALE modules.
    - Various bug fixes and improvements to the netmap memory allocator,
      including support for externally (userspace) allocated memory.
    - Refactoring of netmap pipes: now linked rings share the same netmap
      buffers, with a separate set of kring pointers (rhead, rcur, rtail).
      Buffer swapping does not need to happen anymore.
    - Large refactoring of the control API towards an extensible solution;
      the goal is to allow the addition of more commands and extension of
      existing ones (with new options) without the need of hacks or the
      risk of running out of configuration space.
      A new NIOCCTRL ioctl has been added to handle all the requests of the
      new control API, which cover all the functionalities so far supported.
      The netmap API bumps from 11 to 12 with this patch. Full backward
      compatibility is provided for the old control command (NIOCREGIF), by
      means of a new netmap_legacy module. Many parts of the old netmap.h
      header has now been moved to netmap_legacy.h (included by netmap.h).

Approved by:	hrs (mentor)
2018-04-12 07:20:50 +00:00
Navdeep Parhar
de93353248 cxgbe(4): Always display an error message if SIOCSIFFLAGS will leave
IFF_UP and IFF_DRV_RUNNING out of sync.  ifhwioctl in the kernel pays no
attention to the return code from the driver ioctl during SIOCSIFFLAGS
so these messages are the only indication that the ioctl was called but
failed.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-04-04 22:52:24 +00:00
Navdeep Parhar
f8fea0d90e cxgbe: Implement tcp_info handler for connections handled by t4_tom.
The TCB is read using a memory window right now.  A better alternate to
get self-consistent, uncached information would be to use a GET_TCB
request but waiting for a reply from hw while holding non-sleepable
locks is quite inconvenient.

Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D14817
2018-04-03 01:22:15 +00:00
Brooks Davis
541d96aaaf Use an accessor function to access ifr_data.
This fixes 32-bit compat (no ioctl command defintions are required
as struct ifreq is the same size).  This is believed to be sufficent to
fully support ifconfig on 32-bit systems.

Reviewed by:	kib
Obtained from:	CheriBSD
MFC after:	1 week
Relnotes:	yes
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14900
2018-03-30 18:50:13 +00:00
John Baldwin
edf95feba4 Use the offload transmit queue to set flags on TLS connections.
Requests to modify the state of TLS connections need to be sent on the
same queue as TLS record transmit requests to ensure ordering.

However, in order to use the offload transmit queue in t4_set_tcb_field(),
the function needs to be updated to do proper flow control / credit
management when queueing a request to an offload queue.  This required
passing a pointer to the toepcb itself to this function, so while here
remove the 'tid' and 'iqid' parameters and obtain those values from the
toepcb in t4_set_tcb_field() itself.

Submitted by:	Harsh Jain @ Chelsio (original version)
Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D14871
2018-03-27 20:54:57 +00:00
Navdeep Parhar
d57241d2e7 cxgbe(4): Always initialize requested_speed to a valid value.
This fixes an avoidable EINVAL when the user tries to disable AN after
the port is initialized but l1cfg doesn't have a valid speed to use.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-03-24 01:07:58 +00:00
Navdeep Parhar
1b4df78b42 cxgbe(4): Do not read MFG diags information from custom boards.
MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-03-22 04:42:29 +00:00
Navdeep Parhar
5401e09688 cxgbe(4): Tunnel congestion drops on a port should be cleared when the
stats for that port are cleared.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-03-22 02:04:57 +00:00
John Baldwin
6b02bb1aa7 Fix the check for an empty send socket buffer on a TOE TLS socket.
Compare sbavail() with the cached sb_off of already-sent data instead of
always comparing with zero.  This will correctly close the connection and
send the FIN if the socket buffer contains some previously-sent data but
no unsent data.

Reported by:	Harsh Jain @ Chelsio
Sponsored by:	Chelsio Communications
2018-03-14 20:49:51 +00:00
John Baldwin
02d2bcfaba Remove TLS-related inlines from t4_tom.h to fix iw_cxgbe(4) build.
- Remove the one use of is_tls_offload() and the function.  AIO special
  handling only needs to be disabled when a TOE socket is actively doing
  TLS offload on transmit.  The TOE socket's mode (which affects receive
  operation) doesn't matter, so remove the check for the socket's mode and
  only check if a TOE socket has TLS transmit keys configured to determine
  if an AIO write request should fall back to the normal socket handling
  instead of the TOE fast path.
- Move can_tls_offload() into t4_tls.c.  It is not used in critical paths,
  so inlining isn't that important.  Change return type to bool while here.

Sponsored by:	Chelsio Communications
2018-03-14 20:46:25 +00:00
John Baldwin
1e9538d253 Support for TLS offload of TOE connections on T6 adapters.
The TOE engine in Chelsio T6 adapters supports offloading of TLS
encryption and TCP segmentation for offloaded connections.  Sockets
using TLS are required to use a set of custom socket options to upload
RX and TX keys to the NIC and to enable RX processing.  Currently
these socket options are implemented as TCP options in the vendor
specific range.  A patched OpenSSL library will be made available in a
port / package for use with the TLS TOE support.

TOE sockets can either offload both transmit and reception of TLS
records or just transmit.  TLS offload (both RX and TX) is enabled by
setting the dev.t6nex.<x>.tls sysctl to 1 and requires TOE to be
enabled on the relevant interface.  Transmit offload can be used on
any "normal" or TLS TOE socket by using the custom socket option to
program a transmit key.  This permits most TOE sockets to
transparently offload TLS when applications use a patched SSL library
(e.g. using LD_LIBRARY_PATH to request use of a patched OpenSSL
library).  Receive offload can only be used with TOE sockets using the
TLS mode.  The dev.t6nex.0.toe.tls_rx_ports sysctl can be set to a
list of TCP port numbers.  Any connection with either a local or
remote port number in that list will be created as a TLS socket rather
than a plain TOE socket.  Note that although this sysctl accepts an
arbitrary list of port numbers, the sysctl(8) tool is only able to set
sysctl nodes to a single value.  A TLS socket will hang without
receiving data if used by an application that is not using a patched
SSL library.  Thus, the tls_rx_ports node should be used with care.
For a server mostly concerned with offloading TLS transmit, this node
is not needed as plain TOE sockets will fall back to software crypto
when using an unpatched SSL library.

New per-interface statistics nodes are added giving counts of TLS
packets and payload bytes (payload bytes do not include TLS headers or
authentication tags/MACs) offloaded via the TOE engine, e.g.:

dev.cc.0.stats.rx_tls_octets: 149
dev.cc.0.stats.rx_tls_records: 13
dev.cc.0.stats.tx_tls_octets: 26501823
dev.cc.0.stats.tx_tls_records: 1620

TLS transmit work requests are constructed by a new variant of
t4_push_frames() called t4_push_tls_records() in tom/t4_tls.c.

TLS transmit work requests require a buffer containing IVs.  If the
IVs are too large to fit into the work request, a separate buffer is
allocated when constructing a work request.  This buffer is associated
with the transmit descriptor and freed when the descriptor is ACKed by
the adapter.

Received TLS frames use two new CPL messages.  The first message is a
CPL_TLS_DATA containing the decryped payload of a single TLS record.
The handler places the mbuf containing the received payload on an
mbufq in the TOE pcb.  The second message is a CPL_RX_TLS_CMP message
which includes a copy of the TLS header and indicates if there were
any errors.  The handler for this message places the TLS header into
the socket buffer followed by the saved mbuf with the payload data.
Both of these handlers are contained in tom/t4_tls.c.

A few routines were exposed from t4_cpl_io.c for use by t4_tls.c
including send_rx_credits(), a new send_rx_modulate(), and
t4_close_conn().

TLS keys for both transmit and receive are stored in onboard memory
in the NIC in the "TLS keys" memory region.

In some cases a TLS socket can hang with pending data available in the
NIC that is not delivered to the host.  As a workaround, TLS sockets
are more aggressive about sending CPL_RX_DATA_ACK messages anytime that
any data is read from a TLS socket.  In addition, a fallback timer will
periodically send CPL_RX_DATA_ACK messages to the NIC for connections
that are still in the handshake phase.  Once the connection has
finished the handshake and programmed RX keys via the socket option,
the timer is stopped.

A new function select_ulp_mode() is used to determine what sub-mode a
given TOE socket should use (plain TOE, DDP, or TLS).  The existing
set_tcpddp_ulp_mode() function has been renamed to set_ulp_mode() and
handles initialization of TLS-specific state when necessary in
addition to DDP-specific state.

Since TLS sockets do not receive individual TCP segments but always
receive full TLS records, they can receive more data than is available
in the current window (e.g. if a 16k TLS record is received but the
socket buffer is itself 16k).  To cope with this, just drop the window
to 0 when this happens, but track the overage and "eat" the overage as
it is read from the socket buffer not opening the window (or adding
rx_credits) for the overage bytes.

Reviewed by:	np (earlier version)
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D14529
2018-03-13 23:05:51 +00:00
John Baldwin
9689995d23 Simplify error handling in t4_tom.ko module loading.
- Change t4_ddp_mod_load() to return void instead of always returning
  success.  This avoids having to pretend to have proper support for
  unloading when only part of t4_tom_mod_load() has run.
- If t4_register_uld() fails, don't invoke t4_tom_mod_unload() directly.
  The module handling code in the kernel invokes MOD_UNLOAD on a module
  whose MOD_LOAD fails with an error already.

Reviewed by:	np (part of a larger patch)
MFC after:	1 month
Sponsored by:	Chelsio Communications
2018-03-13 21:42:38 +00:00
Hans Petter Selasky
1456d97c01 Optimize ibcore RoCE address handle creation from user-space.
Creating a UD address handle from user-space or from the kernel-space,
when the link layer is ethernet, requires resolving the remote L3
address into a L2 address. Doing this from the kernel is easy because
the required ARP(IPv4) and ND6(IPv6) address resolving APIs are readily
available. In userspace such an interface does not exist and kernel
help is required.

It should be noted that in an IP-based GID environment, the GID itself
does not contain all the information needed to resolve the destination
IP address. For example information like VLAN ID and SCOPE ID, is not
part of the GID and must be fetched from the GID attributes. Therefore
a source GID should always be referred to as a GID index. Instead of
going through various racy steps to obtain information about the
GID attributes from user-space, this is now all done by the kernel.

This patch optimises the L3 to L2 address resolving using the existing
create address handle uverbs interface, retrieving back the L2 address
as an additional user-space information structure.

This commit combines the following Linux upstream commits:

IB/core: Let create_ah return extended response to user
IB/core: Change ib_resolve_eth_dmac to use it in create AH
IB/mlx5: Make create/destroy_ah available to userspace
IB/mlx5: Use kernel driver to help userspace create ah
IB/mlx5: Report that device has udata response in create_ah

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-05 14:34:52 +00:00
John Baldwin
6b554d26ed Move #include for rijndael.h out of x86-specific region.
The #include was added inside of the conditional by accident and the lack
of it broke non-x86 builds.

Reported by:	lwhsu (jenkins), andrew
2018-02-27 17:51:58 +00:00
John Baldwin
db631975fe Don't overflow the ipad[] array when clearing the remainder.
After the auth key is copied into the ipad[] array, any remaining bytes
are cleared to zero (in case the key is shorter than one block size).
The full block size was used as the length of the zero rather than the
size of the remaining ipad[].  In practice this overflow was harmless as
it could only clear bytes in the following opad[] array which is
initialized with a copy of ipad[] in the next statement.

Sponsored by:	Chelsio Communications
2018-02-26 22:17:27 +00:00
John Baldwin
52f8c52677 Move ccr_aes_getdeckey() from ccr(4) to the cxgbe(4) driver.
This routine will also be used by the TOE module to manage TLS keys.

Sponsored by:	Chelsio Communications
2018-02-26 22:12:31 +00:00
John Baldwin
198729ea7d Fetch TLS key parameters from the firmware.
The parameters describe how much of the adapter's memory is reserved for
storing TLS keys.  The 'meminfo' sysctl now lists this region of adapter
memory as 'TLS keys' if present.

Sponsored by:	Chelsio Communications
2018-02-26 21:56:06 +00:00
John Baldwin
6619d9fb70 Bring in additional constants and message fields for TLS-related messages.
Sponsored by:	Chelsio Communications
2018-02-22 02:02:31 +00:00
John Baldwin
125d42fe81 Move DDP PCB state into a helper structure.
This consolidates all of the DDP state in one place.  Also, the code has
now been fixed to ensure that DDP state is only accessed for DDP
connections.  This should not be a functional change but makes it cleaner
and easier to add state for other TOE socket modes in the future.

MFC after:	1 month
Sponsored by:	Chelsio Communications
2018-02-22 01:50:30 +00:00
Wojciech Macek
19a5b68236 CXGBE: implement prefetch on non-Intel architectures
Submitted by:          Michal Stanek <mst@semihalf.com>
Obtained from:         Semihalf
Reviewed by:           np, pdk@semihalf.com
Sponsored by:          IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D14452
2018-02-21 08:05:56 +00:00
Navdeep Parhar
7cb7c6e37a Catch up with the removal of nktr_slot_flags from upstream netmap. No
functional impact intended.

Submitted by:	Vincenzo Maffione <v.maffione@gmail.com>
2018-02-20 21:42:45 +00:00
Navdeep Parhar
919da4ceff iw_cxgbe: Remove declaration of a function that no longer exists. 2018-02-07 20:13:08 +00:00
John Baldwin
f17985319d Export tcp_always_keepalive for use by the Chelsio TOM module.
This used to work by accident with ld.bfd even though always_keepalive
was marked as static. LLD honors static more correctly, so export this
variable properly (including moving it into the tcp_* namespace).

Reviewed by:	bz, emaste
MFC after:	2 weeks
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D14129
2018-01-30 23:01:37 +00:00
Navdeep Parhar
ea3c5e5d04 cxgbe(4): Accept old names of a couple of tunables. 2018-01-26 00:45:40 +00:00
Navdeep Parhar
39e7cb038b cxgbe(4): Do not display harmless warning in non-debug builds.
MFC after:	3 days
Sponsored by:	Chelsio Communications
2018-01-26 00:03:14 +00:00
John Baldwin
c0154062c7 Store IV in output buffer in GCM software fallback when requested.
Properly honor the lack of the CRD_F_IV_PRESENT flag in the GCM
software fallback case for encryption requests.

Submitted by:	Harsh Jain @ Chelsio
Sponsored by:	Chelsio Communications
2018-01-24 20:16:48 +00:00
John Baldwin
2bc40b6ca9 Don't read or generate an IV until all error checking is complete.
In particular, this avoids edge cases where a generated IV might be
written into the output buffer even though the request is failed with
an error.

Sponsored by:	Chelsio Communications
2018-01-24 20:15:49 +00:00
John Baldwin
04043b3dcd Expand the software fallback for GCM to cover more cases.
- Extend ccr_gcm_soft() to handle requests with a non-empty payload.
  While here, switch to allocating the GMAC context instead of placing
  it on the stack since it is over 1KB in size.
- Allow ccr_gcm() to return a special error value (EMSGSIZE) which
  triggers a fallback to ccr_gcm_soft().  Move the existing empty
  payload check into ccr_gcm() and change a few other cases
  (e.g. large AAD) to fallback to software via EMSGSIZE as well.
- Add a new 'sw_fallback' stat to count the number of requests
  processed via the software fallback.

Submitted by:	Harsh Jain @ Chelsio (original version)
Sponsored by:	Chelsio Communications
2018-01-24 20:14:57 +00:00
John Baldwin
bf5b662033 Clamp DSGL entries to a length of 2KB.
This works around an issue in the T6 that can result in DMA engine
stalls if an error occurs while processing a DSGL entry with a length
larger than 2KB.

Submitted by:	Harsh Jain @ Chelsio
Sponsored by:	Chelsio Communications
2018-01-24 20:13:07 +00:00
John Baldwin
f7b61e2fcc Fail crypto requests when the resulting work request is too large.
Most crypto requests will not trigger this condition, but a request
with a highly-fragmented data buffer (and a resulting "large" S/G
list) could trigger it.

Sponsored by:	Chelsio Communications
2018-01-24 20:12:00 +00:00
John Baldwin
5929c9fb13 Don't discard AAD and IV output data for AEAD requests.
The T6 can hang when processing certain AEAD requests if the request
sets a flag asking the crypto engine to discard the input IV and AAD
rather than copying them into the output buffer.  The existing driver
always discards the IV and AAD as we do not need it.  As a workaround,
allocate a single "dummy" buffer when the ccr driver attaches and
change all AEAD requests to write the IV and AAD to this scratch
buffer.  The contents of the scratch buffer are never used (similar to
"bogus_page"), and it is ok for multiple in-flight requests to share
this dummy buffer.

Submitted by:	Harsh Jain @ Chelsio (original version)
Sponsored by:	Chelsio Communications
2018-01-24 20:11:00 +00:00
John Baldwin
acaabdbbee Reject requests with AAD and IV larger than 511 bytes.
The T6 crypto engine's control messages only support a total AAD
length (including the prefixed IV) of 511 bytes.  Reject requests with
large AAD rather than returning incorrect results.

Sponsored by:	Chelsio Communications
2018-01-24 20:08:10 +00:00
John Baldwin
020ce53af3 Always set the IV location to IV_NOP.
The firmware ignores this field in the FW_CRYPTO_LOOKASIDE_WR work
request.

Submitted by:	Harsh Jain @ Chelsio
Sponsored by:	Chelsio Communications
2018-01-24 20:06:02 +00:00
John Baldwin
d3f25aa152 Always store the IV in the immediate portion of a work request.
Combined authentication-encryption and GCM requests already stored the
IV in the immediate explicitly.  This extends this behavior to block
cipher requests to work around a firmware bug.  While here, simplify
the AEAD and GCM handlers to not include always-true conditions.

Submitted by:	Harsh Jain @ Chelsio
Sponsored by:	Chelsio Communications
2018-01-24 20:04:08 +00:00
Pedro F. Giffuni
ac2fffa4b7 Revert r327828, r327949, r327953, r328016-r328026, r328041:
Uses of mallocarray(9).

The use of mallocarray(9) has rocketed the required swap to build FreeBSD.
This is likely caused by the allocation size attributes which put extra pressure
on the compiler.

Given that most of these checks are superfluous we have to choose better
where to use mallocarray(9). We still have more uses of mallocarray(9) but
hopefully this is enough to bring swap usage to a reasonable level.

Reported by:	wosch
PR:		225197
2018-01-21 15:42:36 +00:00
Pedro F. Giffuni
26c1d774b5 dev: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these is likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.
2018-01-13 22:30:30 +00:00
Navdeep Parhar
e62fb4b142 cxgbe/iw_cxgbe: Remove duplicates to fix compilation with recent gcc. 2018-01-13 00:04:11 +00:00
Wojciech Macek
f078ecf642 CXGBE: fix get_filt to be endianness-aware
Unconditional 32-bit shift is not endianness-safe.
Modify the logic to work both on LE and BE.

Submitted by:          Wojciech Macek <wma@freebsd.org>
Reviewed by:           np
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D13102
2018-01-11 09:17:02 +00:00
Navdeep Parhar
218c5b2115 cxgbe(4): Add a knob to enable/disable PCIe relaxed ordering. Disable it by
default when running on Intel CPUs.

This is a crude fix for the performance issues alluded to in these Linux commits:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=87e09cdec4dae08acdb4aa49beb793c19d73e73e
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a99b646afa8a02571ea298bedca6592d818229cd

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-01-03 19:24:57 +00:00
Navdeep Parhar
348694dada cxgbe(4): Reduce duplication by consolidating minor variations of the
same code into a single routine.

Sponsored by:	Chelsio Communications
2017-12-29 02:30:21 +00:00
Navdeep Parhar
adbb3637ec cxgbe/iw_cxgbe: Fix iWARP over VLANs (catch up with r326169).
Submitted by:	KrishnamRaju ErapaRaju @ Chelsio
Sponsored by:	Chelsio Communications
2017-12-27 22:44:50 +00:00
Navdeep Parhar
f549e3521d cxgbe(4): Do not forward interrupts to queues with freelists. This
leaves the firmware event queue (fwq) as the only queue that can take
interrupts for others.

This simplifies cfg_itype_and_nqueues and queue allocation in the driver
at the cost of a little (never?) used configuration.  It also allows
service_iq to be split into two specialized variants in the future.

MFC after:	2 months
Sponsored by:	Chelsio Communications
2017-12-22 19:10:19 +00:00
Navdeep Parhar
426b6bd53c cxgbe(4): Read the MFG diags version from the VPD and make it available
in the sysctl MIB.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-12-21 15:19:43 +00:00
Pedro F. Giffuni
718cf2ccb9 sys/dev: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-27 14:52:40 +00:00
Hans Petter Selasky
82725ba9bf Merge ^/head r325999 through r326131. 2017-11-23 14:28:14 +00:00