Commit Graph

164 Commits

Author SHA1 Message Date
Hans Petter Selasky
55b1c6e7e4 Merge ^/head r325663 through r325841. 2017-11-15 11:28:11 +00:00
Wojciech Macek
ec7f8d58b9 CXGBE: fix big-endian behaviour
The setbit/clearbit pair casts the bitfield pointer
to uint8_t* which effectively treats its contents as
little-endian variable. The ffs() function accepts int as
the parameter, which is big-endian. Use uint8_t here to
avoid mismatch, as we have only 4 doorbells.

Submitted by:          Wojciech Macek <wma@freebsd.org>
Reviewed by:           np
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
Differential revision: https://reviews.freebsd.org/D13084
2017-11-15 06:45:33 +00:00
Navdeep Parhar
5c2bacde58 Update the iw_cxgbe bits in the projects branch.
Submitted by:	Krishnamraju Eraparaju @ Chelsio
Sponsored by:	Chelsio Communications
2017-11-07 23:52:14 +00:00
Navdeep Parhar
5bcae8ddfa cxgbe(4): Read the MPS buffer group map from the firmware as it could be
different from hardware defaults.  The congestion channel map, which is
still fixed, needs to be tracked separately now.  Change the congestion
setting for TOE rx queues to match the drivers on other OSes while here.

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2017-10-24 05:41:48 +00:00
Navdeep Parhar
08cd1f11bd cxgbe(4): Provide knobs to set the holdoff parameters of TOE rx queues
separately from NIC rx queues instead of using the same parameters for
both types of queues.

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2017-10-05 07:18:16 +00:00
Navdeep Parhar
2f318252cb cxgbe(4): Add two new debug flags -- one to allow manual firmware
install after full initialization, and another to disable the TCB
cache (T6+).  The latter works as a tunable only.

Note that debug_flags are for debugging only and should not be set
normally.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-08-30 23:41:04 +00:00
Navdeep Parhar
7023d9d4c6 cxgbe(4): Maintain one ifmedia per physical port instead of one per
Virtual Interface (VI).  All autonomous VIs that share a port share the
same media.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-08-28 21:44:25 +00:00
Navdeep Parhar
f6d9d14b93 cxgbe(4): Verify that the driver accesses the firmware mailbox in a
thread-safe manner.

MFC after:	3 days
2017-08-28 03:13:16 +00:00
Navdeep Parhar
5d973bad2a cxgbe(4): Save the last reported link parameters and compare them with
the current state to determine whether to generate a link-state change
notification.  This fixes a bug introduced in r321063 that caused the
driver to sometimes skip these notifications.

Reported by:	Jason Eggleston @ LLNW
MFC after:	3 days
Sponsored by:	Chelsio Communications
2017-08-12 14:02:19 +00:00
Navdeep Parhar
01285747aa cxgbe(4): Various link/media related improvements.
- Deal with changes to port_type, and not just port_mod when a
  transceiver is changed.  This fixes hot swapping of transceivers of
  different types (QSFP+ or QSA or QSFP28 in a QSFP28 port, SFP+ or
  SFP28 in a SFP28 port, etc.).

- Always refresh media information for ifconfig if the port is down.
  The firmware does not generate tranceiver-change interrupts unless at
  least one VI is enabled on the physical port.  Before this change
  ifconfig diplayed potentially stale information for ports that were
  administratively down.

- Always recalculate and reapply L1 config on a transceiver change.

- Display PAUSE settings in ifconfig.  The driver sysctls for this
  continue to work as well.

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2017-07-17 00:42:13 +00:00
Navdeep Parhar
a8c4fcb9c7 cxgbe(4): Fix per-queue netmap operation.
Do not attempt to initialize netmap queues that are already initialized
or aren't supposed to be initialized.  Similarly, do not free queues
that are not initialized or aren't supposed to be freed.

PR:		217156
Sponsored by:	Chelsio Communications
2017-06-15 19:56:59 +00:00
John Baldwin
5033c43b7a Add a driver for the Chelsio T6 crypto accelerator engine.
The ccr(4) driver supports use of the crypto accelerator engine on
Chelsio T6 NICs in "lookaside" mode via the opencrypto framework.

Currently, the driver supports AES-CBC, AES-CTR, AES-GCM, and AES-XTS
cipher algorithms as well as the SHA1-HMAC, SHA2-256-HMAC, SHA2-384-HMAC,
and SHA2-512-HMAC authentication algorithms.  The driver also supports
chaining one of AES-CBC, AES-CTR, or AES-XTS with an authentication
algorithm for encrypt-then-authenticate operations.

Note that this driver is still under active development and testing and
may not yet be ready for production use.  It does pass the tests in
tests/sys/opencrypto with the exception that the AES-GCM implementation
in the driver does not yet support requests with a zero byte payload.

To use this driver currently, the "uwire" configuration must be used
along with explicitly enabling support for lookaside crypto capabilities
in the cxgbe(4) driver.  These can be done by setting the following
tunables before loading the cxgbe(4) driver:

    hw.cxgbe.config_file=uwire
    hw.cxgbe.cryptocaps_allowed=-1

MFC after:	1 month
Relnotes:	yes
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D10763
2017-05-17 22:13:07 +00:00
Navdeep Parhar
b2d8f4934e Adjust whitespace and fix a comment. No functional change.
MFC after:	3 days
2017-05-10 00:42:28 +00:00
Navdeep Parhar
1404daa76c cxgbe(4): Do not assume that if_qflush is always followed by inteface-down.
MFC after:	3 days
Sponsored by:	Chelsio Communications
2017-05-09 18:33:41 +00:00
Navdeep Parhar
e006d2a6fd cxgbe(4): Fixes related to the knob that controls link autonegotiation.
- Do not leak the adapter lock in sysctl_autoneg.
- Accept only 0 or 1 as valid settings for autonegotiation.
- A fixed speed must be requested by the driver when autonegotiation is
  disabled otherwise the firmware will reject the l1cfg command.  Use
  the top speed supported by the port for now.

MFC after:	3 days
Sponsored by:	Chelsio Communications
2017-05-09 08:08:28 +00:00
Navdeep Parhar
2204b42716 cxgbe(4): Support routines for Tx traffic scheduling.
- Create a new file, t4_sched.c, and move all of the code related to
  traffic management from t4_main.c and t4_sge.c to this file.
- Track both Channel Rate Limiter (ch_rl) and Class Rate Limiter (cl_rl)
  parameters in the PF driver.
- Initialize all the cl_rl limiters with somewhat arbitrary default
  rates and provide routines to update them on the fly.
- Provide routines to reserve and release traffic classes.

MFC after:	1 month
Sponsored by:	Chelsio Communications
2017-05-02 20:38:10 +00:00
Navdeep Parhar
46f48ee519 cxgbe: Add tunables to control the number of LRO entries and the number
of rx mbufs that should be presorted before LRO.  There is no change in
default behavior.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-04-17 09:00:20 +00:00
Navdeep Parhar
358bca3bc6 cxgbe(4): Updates to link configuration.
- Update struct link_settings and associated shared code.

- Add tunables to control FEC and autonegotiation.  All ports inherit
  these values as their initial settings.
  hw.cxgbe.fec
  hw.cxgbe.autoneg

- Add per-port sysctls to control FEC and autonegotiation.  These can be
  modified at any time.
  dev.<port>.<n>.fec
  dev.<port>.<n>.autoneg

MFC after:	3 days
Sponsored by:	Chelsio Communications
2016-12-30 08:59:49 +00:00
Navdeep Parhar
1521ca71d3 cxgbe(4): Retire t4_bus_space_read_8 and t4_bus_space_write_8.
MFC after:	3 days
Sponsored by:	Chelsio Communications
2016-12-13 20:35:57 +00:00
Navdeep Parhar
b0c554c3a5 cxgbe/t4_tom: The SMAC entry for a VI is at a different location in the T6.
Sponsored by:	Chelsio Communications
2016-09-17 22:13:03 +00:00
Navdeep Parhar
e6b81479f9 cxgbe(4): Attach to cards with the Terminator 6 ASIC. T6 cards will
come up as 't6nex' nexus devices with 'cc' ports hanging off them.

The T6 firmware and configuration files will be added as soon as they
are released.  For now the driver will try to work with whatever
firmware and configuration is on the card's flash.

Sponsored by:	Chelsio Communications
2016-09-16 00:08:37 +00:00
Navdeep Parhar
4cf3aa135b cxgbe(4): Catch up with the rename of tlscaps -> cryptocaps. TLS is one
of the capabilities of the crypto engine in T6.

Sponsored by:	Chelsio Communications
2016-09-12 00:15:40 +00:00
Navdeep Parhar
9113e53d54 cxgbe(4): Add support for additional port types and link speeds.
Sponsored by:	Chelsio Communications.
2016-09-11 23:08:57 +00:00
John Baldwin
6af45170c1 Chelsio T4/T5 VF driver.
The cxgbev/cxlv driver supports Virtual Function devices for Chelsio
T4 and T4 adapters.  The VF devices share most of their code with the
existing PF4 driver (cxgbe/cxl) and as such the VF device driver
currently depends on the PF4 driver.

Similar to the cxgbe/cxl drivers, the VF driver includes a t4vf/t5vf
PCI device driver that attaches to the VF device.  It then creates
child cxgbev/cxlv devices representing ports assigned to the VF.
By default, the PF driver assigns a single port to each VF.

t4vf_hw.c contains VF-specific routines from the shared code used to
fetch VF-specific parameters from the firmware.

t4_vf.c contains the VF-specific PCI device driver and includes its
own attach routine.

VF devices are required to use a different firmware request when
transmitting packets (which in turn requires a different CPL message
to encapsulate messages).  This alternate firmware request does not
permit chaining multiple packets in a single message, so each packet
results in a firmware request.  In addition, the different CPL message
requires more detailed information when enabling hardware checksums,
so parse_pkt() on VF devices must examine L2 and L3 headers for all
packets (not just TSO packets) for VF devices.  Finally, L2 checksums
on non-UDP/non-TCP packets do not work reliably (the firmware trashes
the IPv4 fragment field), so IPv4 checksums for such packets are
calculated in software.

Most of the other changes in the non-VF-specific code are to expose
various variables and functions private to the PF driver so that they
can be used by the VF driver.

Note that a limited subset of cxgbetool functions are supported on VF
devices including register dumps, scheduler classes, and clearing of
statistics.  In addition, TOE is not supported on VF devices, only for
the PF interfaces.

Reviewed by:	np
MFC after:	2 months
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7599
2016-09-07 18:13:57 +00:00
Navdeep Parhar
e25621e5ea cxgbe(4): Provide more details about the card in the sysctl MIB.
dev.t5nex.0.%desc: Chelsio T580-CR
dev.t5nex.0.hw_revision: 1
dev.t5nex.0.sn: PT13140042
dev.t5nex.0.pn: 110117150A0
dev.t5nex.0.ec: 0000000000000000
dev.t5nex.0.na: 0007432AF490
dev.t5nex.0.vpd_version: 3
dev.t5nex.0.scfg_version: 53255
dev.t5nex.0.bs_version: 1.1.0.0
dev.t5nex.0.er_version: 1.0.0.68
dev.t5nex.0.tp_version: 0.1.4.9
dev.t5nex.0.firmware_version: 1.16.2.0

Sponsored by:	Chelsio Communications
2016-08-27 00:13:41 +00:00
John Baldwin
ec55567ce6 Track the base absolute ID of ingress and egress queues.
Use this to map an absolute queue ID to a logical queue ID in interrupt
handlers.  For the regular cxgbe/cxl drivers this should be a no-op as
the base absolute ID should be zero.  VF devices have a non-zero base
absolute ID and require this change.  While here, export the absolute ID
of egress queues via a sysctl.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7446
2016-08-09 17:49:42 +00:00
John Baldwin
a56745092c Reserve an adapter flag IS_VF to mark VF devices vs PF devices.
Sponsored by:	Chelsio Communications
2016-08-08 21:45:39 +00:00
John Baldwin
315048f2ad Store the offset of the KDOORBELL and GTS registers in the softc.
VF devices use a different register layout than PF devices.  Storing
the offset in a value in the softc allows code to be shared between the
PF and VF drivers.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7389
2016-08-01 22:39:51 +00:00
Navdeep Parhar
671bf2b8b2 cxgbe(4): Changes to the CPL-handler registration mechanism and code
related to "shared" CPLs.

a) Combine t4_set_tcb_field and t4_set_tcb_field_rpl into a single
function.  Allow callers to direct the response to any iq.  Tidy up
set_ulp_mode_iscsi while there to use names from t4_tcb.h instead of
magic constants.

b) Remove all CPL handler tables from struct adapter.  This reduces its
size by around 2KB.  All handlers are now registered at MOD_LOAD instead
of attach or some kind of initialization/activation.  The registration
functions do not need an adapter parameter any more.

c) Add per-iq handlers to deal with CPLs whose destination cannot be
determined solely from the opcode.  There are 2 such CPLs in use right
now: SET_TCB_RPL and L2T_WRITE_RPL.  The base driver continues to send
filter and L2T_WRITEs over the mgmtq and solicits the reply on fwq.
t4_tom (including the DDP code) now uses the port's ctrlq to send
L2T_WRITEs and SET_TCB_FIELDs and solicits the reply on an ofld_rxq.
fwq and ofld_rxq have different handlers that know what kind of tid to
expect in the reply.  Update t4_write_l2e and callers to to support any
wrq/iq combination.

Approved by:	re@ (kib@)
Sponsored by:	Chelsio Communications
2016-07-05 01:29:24 +00:00
Navdeep Parhar
62291463de cxgbe(4): Merge netmap support from the ncxgbe/ncxl interfaces to the
vcxgbe/vcxl interfaces and retire the 'n' interfaces.  The main
cxgbe/cxl interfaces and tunables related to them are not affected by
any of this and will continue to operate as usual.

The driver used to create an additional 'n' interface for every
cxgbe/cxl interface if "device netmap" was in the kernel.  The 'n'
interface shared the wire with the main interface but was otherwise
autonomous (with its own MAC address, etc.).  It did not have normal
tx/rx but had a specialized netmap-only data path.  r291665 added
another set of virtual interfaces (the 'v' interfaces) to the driver.
These had normal tx/rx but no netmap support.

This revision consolidates the features of both the interfaces into the
'v' interface which now has a normal data path, TOE support, and native
netmap support.  The 'v' interfaces need to be created explicitly with
the hw.cxgbe.num_vis tunable.  This means "device netmap" will not
result in the automatic creation of any virtual interfaces.

The following tunables can be used to override the default number of
queues allocated for each 'v' interface.  nofld* = 0 will disable TOE on
the virtual interface and nnm* = 0 to will disable native netmap
support.

# number of normal NIC queues
hw.cxgbe.ntxq_vi
hw.cxgbe.nrxq_vi

# number of TOE queues
hw.cxgbe.nofldtxq_vi
hw.cxgbe.nofldrxq_vi

# number of netmap queues
hw.cxgbe.nnmtxq_vi
hw.cxgbe.nnmrxq_vi

hw.cxgbe.nnm{t,r}xq{10,1}g tunables have been removed.

--- tl;dr version ---
The workflow for netmap on cxgbe starting with FreeBSD 11 is:
1) "device netmap" in the kernel config.
2) "hw.cxgbe.num_vis=2" in loader.conf.  num_vis > 2 is ok too, you'll
end up with multiple autonomous netmap-capable interfaces for every
port.
3) "dmesg | grep vcxl | grep netmap" to verify that the interface has
netmap queues.
4) Use any of the 'v' interfaces for netmap.  pkt-gen -i vcxl<n>... .
One major improvement is that the netmap interface has a normal data
path as expected.
5) Just ignore the cxl interfaces if you want to use netmap only.  No
need to bring them up.  The vcxl interfaces are completely independent
and everything should just work.
---------------------

Approved by:	re@ (gjb@)
Relnotes:	Yes
Sponsored by:	Chelsio Communications
2016-06-23 02:53:00 +00:00
Navdeep Parhar
02f972e8f3 cxgbe(4): Add a sysctl to manage the binding of a txq to a traffic class.
Sponsored by:	Chelsio Communications
2016-06-08 14:15:29 +00:00
Navdeep Parhar
46464b95b0 cxgbe(4): Track the state of the hardware traffic schedulers in the
driver.  This works as long as everyone uses set_sched_class_params
to program them.

Sponsored by:	Chelsio Communications
2016-06-07 00:27:55 +00:00
Pedro F. Giffuni
453130d9bf sys/dev: minor spelling fixes.
Most affect comments, very few have user-visible effects.
2016-05-03 03:41:25 +00:00
John Baldwin
80f3b01958 Remove #ifdef's from various structures used in the cxgbe/cxl driver.
This provides a constant ABI and layout for these structures (especially
struct adapter) avoiding some foot shooting.

Discussed with:	np
Sponsored by:	Chelsio Communications
2016-03-31 18:36:50 +00:00
Navdeep Parhar
0f2f53efd2 cxgbe(4): Catch up with the latest list of card capabilities as reported
by the firmware.
2016-03-12 02:54:55 +00:00
Navdeep Parhar
9945ceb857 cxgbe(4): Add sysctls to display the TP microcode version and the
expansion rom version (if there's one).

trantor:~# sysctl dev.t4nex dev.t5nex | grep _version
dev.t4nex.0.firmware_version: 1.15.28.0
dev.t4nex.0.tp_version: 0.1.9.4
dev.t5nex.0.firmware_version: 1.15.28.0
dev.t5nex.0.exprom_version: 1.0.0.68
dev.t5nex.0.tp_version: 0.1.4.9
2016-03-11 03:15:17 +00:00
Navdeep Parhar
c912289045 cxgbe(4): Add general purpose routines that offer safe access to the
chip's memory windows.  Convert existing users of these windows to the
new routines.
2016-03-10 06:15:31 +00:00
Navdeep Parhar
4d131308f3 cxgbe(4): Rename regwin_lock to reg_lock. It is used to protect access
to indirect registers only.
2016-03-08 22:23:30 +00:00
Navdeep Parhar
b3500921c4 cxgbe(4): Updates to the shared routines that deal with the serial EEPROM,
flash, and VPD.

Obtained from:	Chelsio Communications
2016-03-08 07:48:55 +00:00
Navdeep Parhar
700cfba72d cxgbe(4): Overhaul the shared code that deals with the chip's TP block,
which is responsible for filtering and RSS.

Add the ability to use filters that match on PF/VF (aka "VNIC id") while
here.  This is mutually exclusive with filtering on outer VLAN tag with
Q-in-Q.

Sponsored by:	Chelsio Communications
2016-03-08 02:04:05 +00:00
Navdeep Parhar
90e7434a6d cxgbe(4): Add a struct sge_params to store per-adapter SGE parameters.
Move the code that reads all the parameters to t4_init_sge_params in the
shared code.  Use these per-adapter values instead of globals.

Sponsored by:	Chelsio Communications
2016-03-08 00:23:56 +00:00
Navdeep Parhar
d1205d093d cxgbe(4): Very basic T6 awareness. This is part of ongoing work to
update to the latest internal shared code.

- Add a chip_params structure to keep track of hardware constants for
  all generations of Terminators handled by cxgbe.
- Update t4_hw_pci_read_cfg4 to work with T6.
- Update the hardware debug sysctls (hidden within dev.<tNnex>.<n>.misc.*) to
  work with T6.  Most of the changes are in the decoders for the CIM
  logic analyzer and the MPS TCAM.
- Acquire the regwin lock around indirect register accesses.

Obtained from:	Chelsio Communications
Sponsored by:	Chelsio Communications
2016-03-04 13:11:13 +00:00
Navdeep Parhar
e8c6ba7265 cxgbe(4): Add a sysctl to retrieve the maximum speed/bandwidth supported by a
port.

dev.cxgbe.<n>.max_speed
dev.cxl.<n>.max_speed

Sponsored by:	Chelsio Communications
2016-02-25 01:10:56 +00:00
Navdeep Parhar
40bf7442fa cxgbe: catch up with the latest hardware-related definitions.
Obtained from:	Chelsio Communications
Sponsored by:	Chelsio Communications
2016-02-19 00:29:16 +00:00
Navdeep Parhar
9eb533d3b4 cxgbe(4): Updates to the base NIC driver and t4_tom to support the iSCSI
offload driver.  These changes come from projects/cxl_iscsi.
2015-12-26 00:26:02 +00:00
John Baldwin
fe2ebb7644 Add support for configuring additional virtual interfaces (VIs) on a port.
Each virtual interface has its own MAC address, queues, and statistics.
The dedicated netmap interfaces (ncxgbeX / ncxlX) were already implemented
as additional VIs on each port.  This change allows additional non-netmap
interfaces to be configured on each port.  Additional virtual interfaces
use the naming scheme vcxgbeX or vcxlX.

Additional VIs are enabled by setting the hw.cxgbe.num_vis tunable to a
value greater than 1 before loading the cxgbe(4) or cxl(4) driver.
NB: The first VI on each port is the "main" interface (cxgbeX or cxlX).

T4/T5 NICs provide a limited number of MAC addresses for each physical port.
As a result, a maximum of six VIs can be configured on each port (including
the "main" interface and the netmap interface when netmap is enabled).

One user-visible result is that when netmap is enabled, packets received
or transmitted via the netmap interface are no longer counted in the stats
for the "main" interface, but are not accounted to the netmap interface.

The netmap interfaces now also have a new-bus device and export various
information sysctl nodes via dev.n(cxgbe|cxl).X.

The cxgbetool 'clearstats' command clears the stats for all VIs on the
specified port along with the port's stats.  There is currently no way to
clear the stats of an individual VI.

Reviewed by:	np
MFC after:	1 month
Sponsored by:	Chelsio
2015-12-03 00:02:01 +00:00
Navdeep Parhar
8faf57012b cxgbe(4): Save the flags for the last adapter-wide synchronized
operation that was initiated successfully.  (The caller and thread are
already recorded).

MFC after:	1 week
2015-08-19 15:40:03 +00:00
Navdeep Parhar
a1ed88571f cxgbe(4): Ask the firmware for the start of the RSS slice for a port and
save it for later.  This enables direct manipulation of the indirection
tables (although the stock driver doesn't do that right now).

MFC after:	1 month
2015-07-17 06:46:18 +00:00
Navdeep Parhar
9af71ab3bc cxgbe(4): Add a new knob that controls the congestion response of netmap
rx queues.  The default is to drop rather than backpressure.

This decouples the congestion settings of NIC and netmap rx queues.

MFC after:	3 days
2015-07-06 20:56:59 +00:00
Navdeep Parhar
0e4cd4a2e0 cxgbe(4): Add the ability to dump mailbox commands and replies. It is
enabled/disabled via bit 0 of adapter->debug_flags (which is available
at dev.t5nex.<n>.debug_flags).

MFC after:	1 week
2015-06-16 12:36:29 +00:00
Navdeep Parhar
1605bac6fb cxgbe(4): set up congestion management for netmap rx queues.
The hw.cxgbe.cong_drop knob controls the response of the chip when
netmap queues are congested.
2015-02-24 18:40:10 +00:00
Navdeep Parhar
b3d44a6800 cxgbe(4): tidy up some of the interaction between the Upper Layer
Drivers (ULDs) and the base if_cxgbe driver.

Track the per-adapter activation of ULDs in a new "active_ulds" field.
This was done pretty arbitrarily before this change -- via TOM_INIT_DONE
in adapter->flags for TOM, and the (1 << MAX_NPORTS) bit in
adapter->offload_map for iWARP.

iWARP and hw-accelerated iSCSI rely on the TOE (supported by the TOM
ULD).  The rules are:
a) If the iWARP and/or iSCSI ULDs are available when TOE is enabled then
   iWARP and/or iSCSI are enabled too.
b) When the iWARP and iSCSI modules are loaded they go looking for
   adapters with TOE enabled and enable themselves on that adapter.
c) You cannot deactivate or unload the TOM module from underneath iWARP
   or iSCSI.  Any such attempt will fail with EBUSY.

MFC after:	2 weeks
2015-02-08 09:28:55 +00:00
Navdeep Parhar
d86a5ff917 cxgbe(4): a change to the synchronization rules within the the driver.
This is purely cosmetic because the new rules are already followed.

MFC after:	1 week
2015-02-08 08:42:45 +00:00
Navdeep Parhar
7951040f8a cxgbe(4): major tx rework.
a) Front load as much work as possible in if_transmit, before any driver
lock or software queue has to get involved.

b) Replace buf_ring with a brand new mp_ring (multiproducer ring).  This
is specifically for the tx multiqueue model where one of the if_transmit
producer threads becomes the consumer and other producers carry on as
usual.  mp_ring is implemented as standalone code and it should be
possible to use it in any driver with tx multiqueue.  It also has:
- the ability to enqueue/dequeue multiple items.  This might become
  significant if packet batching is ever implemented.
- an abdication mechanism to allow a thread to give up writing tx
  descriptors and have another if_transmit thread take over.  A thread
  that's writing tx descriptors can end up doing so for an unbounded
  time period if a) there are other if_transmit threads continuously
  feeding the sofware queue, and b) the chip keeps up with whatever the
  thread is throwing at it.
- accurate statistics about interesting events even when the stats come
  at the expense of additional branches/conditional code.

The NIC txq lock is uncontested on the fast path at this point.  I've
left it there for synchronization with the control events (interface
up/down, modload/unload).

c) Add support for "type 1" coalescing work request in the normal NIC tx
path.  This work request is optimized for frames with a single item in
the DMA gather list.  These are very common when forwarding packets.
Note that netmap tx in cxgbe already uses these "type 1" work requests.

d) Do not request automatic cidx updates every 32 descriptors.  Instead,
request updates via bits in individual work requests (still every 32
descriptors approximately).  Also, request an automatic final update
when the queue idles after activity.  This means NIC tx reclaim is still
performed lazily but it will catch up quickly as soon as the queue
idles.  This seems to be the best middle ground and I'll probably do
something similar for netmap tx as well.

e) Implement a faster tx path for WRQs (used by TOE tx and control
queues, _not_ by the normal NIC tx).  Allow work requests to be written
directly to the hardware descriptor ring if room is available.  I will
convert t4_tom and iw_cxgbe modules to this faster style gradually.

MFC after:	2 months
2014-12-31 23:19:16 +00:00
Navdeep Parhar
a7570ee305 Move KTR_CXGBE from t4_tom.h to adapter.h so that the base if_cxgbe
code can use it too.

MFC after:	1 week
2014-12-12 21:54:59 +00:00
Navdeep Parhar
b741402c40 cxgbe(4): allow the driver to use rx buffers that do not end on a pack
boundary.

MFC after:	2 weeks
2014-12-06 01:47:38 +00:00
Navdeep Parhar
e3207e1973 cxgbe(4): Allow for different pad and pack boundaries for different
adapters.  Set the pack boundary for T5 cards to be the same as the
PCIe max payload size.  The chip likes it this way.

In this revision the driver allocate rx buffers that align on both
boundaries.  This is not a strict requirement and a followup commit
will switch the driver to a more relaxed allocation strategy.

MFC after:	2 weeks
2014-12-06 00:13:56 +00:00
Navdeep Parhar
2d8910854b cxgbe(4): implement if_get_counter. 2014-09-27 05:50:31 +00:00
Navdeep Parhar
4d6db4e0f7 cxgbe(4): some optimizations in freelist handling.
MFC after:	2 weeks.
2014-08-02 06:55:36 +00:00
Navdeep Parhar
b2daa9a9cd cxgbe(4): minor optimizations in ingress queue processing.
Reorganize struct sge_iq.  Make the iq entry size a compile time
constant.  While here, eliminate RX_FL_ESIZE and use EQ_ESIZE directly.

MFC after:	2 weeks
2014-08-02 00:56:34 +00:00
Navdeep Parhar
82eff304b6 cxgbe(4): Keep track of the clusters that have to be freed by the
custom free routine (rxb_free) in the driver.  Fail MOD_UNLOAD with
EBUSY if any such cluster has been handed up to the kernel but hasn't
been freed yet.  This prevents a panic later when the cluster finally
needs to be freed but rxb_free is gone from the kernel.

MFC after:	1 week
2014-07-23 22:29:22 +00:00
Navdeep Parhar
c3fb772502 Simplify r267600, there's no need to distinguish between allocated and
inlined mbufs.

MFC after:	1 week
2014-07-22 02:02:39 +00:00
Navdeep Parhar
30f337891d cxgbe(4): Add an iSCSI softc to the adapter structure. 2014-07-11 21:02:54 +00:00
Navdeep Parhar
ccc69b2fa9 cxgbe(4): Fix bug in the fast rx buffer recycle path. In some cases rx
buffers were getting recycled when they should have been left alone.

MFC after:	3 days
2014-06-18 00:16:35 +00:00
Navdeep Parhar
298d969c53 cxgbe(4): netmap support for Terminator 5 (T5) based 10G/40G cards.
Netmap gets its own hardware-assisted virtual interface and won't take
over or disrupt the "normal" interface in any way.  You can use both
simultaneously.

For kernels with DEV_NETMAP, cxgbe(4) carves out an ncxl<N> interface
(note the 'n' prefix) in the hardware to accompany each cxl<N>
interface.  These two ifnet's per port share the same wire but really
are separate interfaces in the hardware and software.  Each gets its own
L2 MAC addresses (unicast and multicast), MTU, checksum caps, etc.  You
should run netmap on the 'n' interfaces only, that's what they are for.

With this, pkt-gen is able to transmit > 45Mpps out of a single 40G port
of a T580 card.  2 port tx is at ~56Mpps total (28M + 28M) as of now.
Single port receive is at 33Mpps but this is very much a work in
progress.  I expect it to be closer to 40Mpps once done.  In any case
the current effort can already saturate multiple 10G ports of a T5 card
at the smallest legal packet size.  T4 gear is totally untested.

trantor:~# ./pkt-gen -i ncxl0 -f tx -D 00:07:43🆎cd:ef
881.952141 main [1621] interface is ncxl0
881.952250 extract_ip_range [275] range is 10.0.0.1:0 to 10.0.0.1:0
881.952253 extract_ip_range [275] range is 10.1.0.1:0 to 10.1.0.1:0
881.962540 main [1804] mapped 334980KB at 0x801dff000
Sending on netmap:ncxl0: 4 queues, 1 threads and 1 cpus.
10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> 00:07:43🆎cd:ef)
881.962562 main [1882] Sending 512 packets every  0.000000000 s
881.962563 main [1884] Wait 2 secs for phy reset
884.088516 main [1886] Ready...
884.088535 nm_open [457] overriding ifname ncxl0 ringid 0x0 flags 0x1
884.088607 sender_body [996] start
884.093246 sender_body [1064] drop copy
885.090435 main_thread [1418] 45206353 pps (45289533 pkts in 1001840 usec)
886.091600 main_thread [1418] 45322792 pps (45375593 pkts in 1001165 usec)
887.092435 main_thread [1418] 45313992 pps (45351784 pkts in 1000834 usec)
888.094434 main_thread [1418] 45315765 pps (45406397 pkts in 1002000 usec)
889.095434 main_thread [1418] 45333218 pps (45378551 pkts in 1001000 usec)
890.097434 main_thread [1418] 45315247 pps (45405877 pkts in 1002000 usec)
891.099434 main_thread [1418] 45326515 pps (45417168 pkts in 1002000 usec)
892.101434 main_thread [1418] 45333039 pps (45423705 pkts in 1002000 usec)
893.103434 main_thread [1418] 45324105 pps (45414708 pkts in 1001999 usec)
894.105434 main_thread [1418] 45318042 pps (45408723 pkts in 1002001 usec)
895.106434 main_thread [1418] 45332430 pps (45377762 pkts in 1001000 usec)
896.107434 main_thread [1418] 45338072 pps (45383410 pkts in 1001000 usec)
...

Relnotes:	Yes
Sponsored by:	Chelsio Communications.
2014-05-27 18:18:41 +00:00
Navdeep Parhar
38035ed6dc cxgbe(4): significant rx rework.
- More flexible cluster size selection, including the ability to fall
  back to a safe cluster size (PAGE_SIZE from zone_jumbop by default) in
  case an allocation of a larger size fails.
- A single get_fl_payload() function that assembles the payload into an
  mbuf chain for any kind of freelist.  This replaces two variants: one
  for freelists with buffer packing enabled and another for those without.
- Buffer packing with any sized cluster.  It was limited to 4K clusters
  only before this change.
- Enable buffer packing for TOE rx queues as well.
- Statistics and tunables to go with all these changes.  The driver's
  man page will be updated separately.

MFC after:	5 weeks
2014-03-18 20:14:13 +00:00
Scott Long
f7a74e061b Add a new sysctl, dev.cxgbe.N.rsrv_noflow, and a companion tunable,
hw.cxgbe.rsrv_noflow.  When set, queue 0 of the port is reserved for
TX packets without a flowid.  The hash value of packets with a flowid
is bumped up by 1.  The intent is to provide a private queue for
link-level packets like LACP that is unlikely to overflow or suffer
deep queue latency.

Reviewed by:	np
Obtained from:	Netflix
MFC after:	3 days
2014-02-06 18:40:38 +00:00
Navdeep Parhar
e46dcc5670 cxgbe(4): Use the rx channel map (instead of the tx channel map) as the
congestion channel map.

MFC after:	1 week
2014-02-06 03:30:12 +00:00
Navdeep Parhar
7293a15f54 cxgbe(4): The T5 allows for a different freelist starvation threshold
for queues with buffer packing.  Use the correct value to calculate a
freelist's low water mark.

MFC after:	1 week
2014-02-06 03:21:43 +00:00
Adrian Chadd
3af0f449ae Add an option to enable or disable the small RX packet copying that
is done to improve performance of small frames.

When doing RX packing, the RX copying isn't necessarily required.

Reviewed by:	np
2014-01-02 23:23:33 +00:00
Navdeep Parhar
273ef9912d cxgbe(4): save a copy of the RSS map for each port for the driver's use. 2013-12-08 17:47:37 +00:00
Gleb Smirnoff
76039bc84f The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare
to this event, adding if_var.h to files that do need it. Also, include
all includes that now are included due to implicit pollution via if_var.h

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-26 17:58:36 +00:00
Navdeep Parhar
b3eda7872d cxgbe(4): Store the log2 of the # of doorbells per BAR2 page for both
ingress and egress queues, and for both T4 and T5.  These values are
used by the T4/T5 iWARP driver.
2013-10-14 23:32:56 +00:00
Navdeep Parhar
1458bff9a4 Implement support for rx buffer packing. Enable it by default for T5
cards.

This is a T4 and T5 chip feature which lets the chip deliver multiple
Ethernet frames in a single buffer.  This is more efficient within the
chip, in the driver, and reduces wastage of space in rx buffers.

- Always allocate rx buffers from the jumbop zone, no matter what the
  MTU is.  Do not use the normal cluster refcounting mechanism.
- Reserve space for an mbuf and a refcount in the cluster itself and let
  the chip DMA multiple frames in the rest.
- Use the embedded mbuf for the first frame and allocate mbufs on the
  fly for any additional frames delivered in the cluster.  Each of these
  mbufs has a reference on the underlying cluster.
2013-08-30 01:45:36 +00:00
Navdeep Parhar
480e603c79 Merge r254386 from user/np/cxl_tuning. Add an INET|INET6 check missing
in said revision.

r254386:
Flush inactive LRO entries periodically.
2013-08-29 06:26:22 +00:00
Navdeep Parhar
9800517691 Add hooks in base cxgbe(4) for the iWARP upper-layer driver. Update a
couple of assertions in the TOE driver as well.
2013-08-28 20:45:45 +00:00
Navdeep Parhar
6e22f9f3da Display SGE tunables in the sysctl tree.
dev.t5nex.0.fl_pktshift: payload DMA offset in rx buffer (bytes)
dev.t5nex.0.fl_pad: payload pad boundary (bytes)
dev.t5nex.0.spg_len: status page size (bytes)
dev.t5nex.0.cong_drop: congestion drop setting

Discussed with:	scottl
2013-07-31 05:12:51 +00:00
Navdeep Parhar
caf20efcde Add support for packet-sniffing tracers to cxgbe(4). This works with
all T4 and T5 based cards and is useful for analyzing TSO, LRO, TOE, and
for general purpose monitoring without tapping any cxgbe or cxl ifnet
directly.

Tracers on the T4/T5 chips provide access to Ethernet frames exactly as
they were received from or transmitted on the wire.  On transmit, a
tracer will capture a frame after TSO segmentation, hw VLAN tag
insertion, hw L3 & L4 checksum insertion, etc.  It will also capture
frames generated by the TCP offload engine (TOE traffic is normally
invisible to the kernel).  On receive, a tracer will capture a frame
before hw VLAN extraction, runt filtering, other badness filtering,
before the steering/drop/L2-rewrite filters or the TOE have had a go at
it, and of course before sw LRO in the driver.

There are 4 tracers on a chip.  A tracer can trace only in one direction
(tx or rx).  For now cxgbetool will set up tracers to capture the first
128B of every transmitted or received frame on a given port.  This is a
small subset of what the hardware can do.  A pseudo ifnet with the same
name as the nexus driver (t4nex0 or t5nex0) will be created for tracing.
The data delivered to this ifnet is an additional copy made inside the
chip.  Normal delivery to cxgbe<n> or cxl<n> will be made as usual.

/* watch cxl0, which is the first port hanging off t5nex0. */
# cxgbetool t5nex0 tracer 0 tx0  (watch what cxl0 is transmitting)
# cxgbetool t5nex0 tracer 1 rx0  (watch what cxl0 is receiving)
# cxgbetool t5nex0 tracer list
# tcpdump -i t5nex0   <== all that cxl0 sees and puts on the wire

If you were doing TSO, a tcpdump on cxl0 may have shown you ~64K
"frames" with no L3/L4 checksum but this will show you the frames that
were actually transmitted.

/* all done */
# cxgbetool t5nex0 tracer 0 disable
# cxgbetool t5nex0 tracer 1 disable
# cxgbetool t5nex0 tracer list
# ifconfig t5nex0 destroy
2013-07-26 22:04:11 +00:00
Navdeep Parhar
3a760ee793 - Show the reason why link is down if this information is available.
- Display the temperature and PHY firmware version of the BT PHY.

MFC after:	1 day
2013-07-05 01:53:51 +00:00
Navdeep Parhar
6eb3180fb2 - Make note of interface MTU change if the rx queues exist, and not just
when the interface is up.
- Add a tunable to control the TOE's rx coalesce feature (enabled by
  default as it always has been).  Consider the interface MTU or the
  coalesce size when deciding which cluster zone to use to fill the
  offload rx queue's free list.  The tunable is:
  dev.{t4nex,t5nex}.<N>.toe.rx_coalesce

MFC after:	1 day
2013-07-04 21:19:01 +00:00
Navdeep Parhar
c337fa30af - Read all TP parameters in one place.
- Read the filter mode, calculate various shifts, and use them
  properly during active open (in select_ntuple).

MFC after:	1 day
2013-07-04 17:55:52 +00:00
Navdeep Parhar
8cf31b85b5 - Provide accurate ifmedia information so that 40G ports/transceivers are
displayed properly in ifconfig, etc.

- Use the same number of tx and rx queues for a 40G port as for a 10G port.

MFC after:	1 week
2013-04-30 05:51:52 +00:00
Navdeep Parhar
77ad3c4146 Cosmetic change (s/wrwc/wcwr/;s/WRWC/WCWR/).
MFC after:	3 days.
2013-04-11 22:49:29 +00:00
Navdeep Parhar
d14b0ac129 cxgbe(4): Add support for Chelsio's Terminator 5 (aka T5) ASIC. This
includes support for the NIC and TOE features of the 40G, 10G, and
1G/100M cards based on the T5.

The ASIC is mostly backward compatible with the Terminator 4 so cxgbe(4)
has been updated instead of writing a brand new driver.  T5 cards will
show up as cxl (short for cxlgb) ports attached to the t5nex bus driver.

Sponsored by:	Chelsio
2013-03-30 02:26:20 +00:00
Navdeep Parhar
0abd31e2f7 cxgbe(4): Ask the card's firmware to pad up tiny CPLs by encapsulating
them in a firmware message if it is able to do so.  This works out
better for one of the FIFOs in the chip.

MFC after:	5 days
2013-02-26 00:27:27 +00:00
Navdeep Parhar
1cdc889916 Force the 404-BT card (4 x 1G) to use the "uwire" configuration file.
MFC after:	3 days
2013-01-26 03:10:28 +00:00
Navdeep Parhar
e13fe79820 cxgbe: Make the for_each macros safer to use by turning them
into a single statement each.

Submitted by:	Christoph Mallon <christoph dot mallon at gmx dot de>
MFC after:	1 week
2013-01-17 18:52:49 +00:00
Navdeep Parhar
5bb17208d7 cxgbe: Fix the for_each_foo macros -- the last argument should not share
its name with any member of struct sge.

MFC after:	3 days
2013-01-16 23:48:55 +00:00
Navdeep Parhar
b174b65819 cxgbe(4): Add functions to help synchronize "slow" operations (those not
on the fast data path) and use them instead of frobbing the adapter lock
and busy flag directly.

Other changes made while reworking all slow operations:
- Wait for the reply to a filter request (add/delete).  This guarantees
  that the operation is complete by the time the ioctl returns.
- Tidy up the tid_info structure.
- Do not allow the tx queue size to be set to something that's not a
  power of 2.

MFC after:	1 week
2013-01-10 23:56:50 +00:00
Ed Schouten
6b946662d9 Prefer __containerof() over __member2struct().
The former works better with qualifiers, but also properly type checks
the input pointer.
2012-10-19 13:26:40 +00:00
Navdeep Parhar
5323ca8f4b Remove unused item. cxgbe's rx queue's lock was removed a long time ago.
MFC after:	3 days
2012-10-10 16:52:39 +00:00
Navdeep Parhar
5f7a640879 Initialize various DDP parameters in the main cxgbe(4) driver:
- Setup multiple DDP page sizes.  When the driver attempts DDP it will
  try to combine physically contiguous pages into regions of these sizes.

- Set the indicate size such that the payload carried in the indicate can
  be copied in the header mbuf (and the 16K rx buffer can be recycled).

- Set DDP threshold to the max payload that the chip will coalesce and
  deliver to the driver (this is ~16K by default, which is also why the
  offload rx queue is backed by 16K buffers).  If the chip is able to
  coalesce up to the max it's allowed to, it's a good sign that the peer
  is transmitting in bulk without any TCP PSH.

MFC after:	2 weeks
2012-08-16 22:33:56 +00:00
Navdeep Parhar
1f1b5a0f6f Add a routine (t4_set_tcb_field) to update arbitrary parts of a hardware
TCB.  Filters are programmed by modifying the TCB too (via a different
routine) and the reply to any TCB update is delivered via a
CPL_SET_TCB_RPL.  Figure out whether the reply is for a filter-write or
something else and route it appropriately.

MFC after:	2 weeks
2012-08-16 20:15:29 +00:00
Navdeep Parhar
1b4cc91fcc Allow for a different handler for each type of firmware message.
MFC after:	2 weeks
2012-08-16 18:31:50 +00:00
Navdeep Parhar
a1ea9a8276 cxgbe(4): support for IPv6 TSO and LRO.
Submitted by:	bz (this is a modified version of that patch)
2012-06-29 19:51:06 +00:00
Navdeep Parhar
09fe63205c - Updated TOE support in the kernel.
- Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs.
  These are available as t3_tom and t4_tom modules that augment cxgb(4)
  and cxgbe(4) respectively.  The cxgb/cxgbe drivers continue to work as
  usual with or without these extra features.

- iWARP driver for Terminator 3 ASIC (kernel verbs).  T4 iWARP in the
  works and will follow soon.

Build-tested with make universe.

30s overview
============
What interfaces support TCP offload?  Look for TOE4 and/or TOE6 in the
capabilities of an interface:
# ifconfig -m | grep TOE

Enable/disable TCP offload on an interface (just like any other ifnet
capability):
# ifconfig cxgbe0 toe
# ifconfig cxgbe0 -toe

Which connections are offloaded?  Look for toe4 and/or toe6 in the
output of netstat and sockstat:
# netstat -np tcp | grep toe
# sockstat -46c | grep toe

Reviewed by:	bz, gnn
Sponsored by:	Chelsio communications.
MFC after:	~3 months (after 9.1, and after ensuring MFC is feasible)
2012-06-19 07:34:13 +00:00
Bjoern A. Zeeb
62b5b6ecd0 MFp4 bz_ipv6_fast:
Significantly update tcp_lro for mostly two things:
  1) introduce basic support for IPv6 without extension headers.
  2) try hard to also get the incremental checksum updates right,
     especially also in the IPv4 case for the IP and TCP header.

  Move variables around for better locality, factor things out into
  functions, allow checksum updates to be compiled out, ...

  Leave a few comments on further things to look at in the future,
  though that is not the full list.

  Update drivers with appropriate #includes as needed for IPv6 data
  type in LRO.

  Sponsored by:	The FreeBSD Foundation
  Sponsored by:	iXsystems

Reviewed by:	gnn (as part of the whole)
MFC After:	3 days
2012-05-24 23:03:23 +00:00
Navdeep Parhar
bfb08b6b6b cxgbe: reduce diffs with other branches.
Will help future MFCs from HEAD.

MFC after:	3 days
2012-02-07 06:21:59 +00:00
Navdeep Parhar
733b92779e Many updates to cxgbe(4)
- Device configuration via plain text config file.  Also able to operate
  when not attached to the chip as the master driver.

- Generic "work request" queue that serves as the base for both ctrl and
  ofld tx queues.

- Generic interrupt handler routine that can process any event on any
  kind of ingress queue (via a dispatch table).

- A couple of new driver ioctls.  cxgbetool can now install a firmware
  to the card ("loadfw" command) and can read the card's memory
  ("memdump" and "tcb" commands).

- Lots of assorted information within dev.t4nex.X.misc.*  This is
  primarily for debugging and won't show up in sysctl -a.

- Code to manage the L2 tables on the chip.

- Updates to cxgbe(4) man page to go with the tunables that have changed.

- Updates to the shared code in common/

- Updates to the driver-firmware interface (now at fw 1.4.16.0)

MFC after:	1 month
2011-12-16 02:09:51 +00:00
Navdeep Parhar
59bc8ce035 - driver ioctl to get SGE context for any given queue.
- sysctls to display the context id, cidx, and pidx of all kinds of queues.

MFC after:	3 days
2011-06-11 04:50:54 +00:00