Commit Graph

414 Commits

Author SHA1 Message Date
jhb
5a10a1e2e6 Use DDP to implement zerocopy TCP receive with aio_read().
Chelsio's TCP offload engine supports direct DMA of received TCP payload
into wired user buffers.  This feature is known as Direct-Data Placement.
However, to scale well the adapter needs to prepare buffers for DDP
before data arrives.  aio_read() is more amenable to this requirement than
read() as applications often call read() only after data is available in
the socket buffer.

When DDP is enabled, TOE sockets use the recently added pru_aio_queue
protocol hook to claim aio_read(2) requests instead of letting them use
the default AIO socket logic.  The DDP feature supports scheduling DMA
to two buffers at a time so that the second buffer is ready for use
after the first buffer is filled.  The aio/DDP code optimizes the case
of an application ping-ponging between two buffers (similar to the
zero-copy bpf(4) code) by keeping the two most recently used AIO buffers
wired.  If a buffer is reused, the aio/DDP code is able to reuse the
vm_page_t array as well as page pod mappings (a kind of MMU mapping the
Chelsio NIC uses to describe user buffers).  The generation of the
vmspace of the calling process is used in conjunction with the user
buffer's address and length to determine if a user buffer matches a
previously used buffer.  If an application queues a buffer for AIO that
does not match a previously used buffer then the least recently used
buffer is unwired before the new buffer is wired.  This ensures that no
more than two user buffers per socket are ever wired.

Note that this feature is best suited to applications sending a steady
stream of data vs short bursts of traffic.

Discussed with:	np
Relnotes:	yes
Sponsored by:	Chelsio Communications
2016-05-07 00:33:35 +00:00
jhb
fb93838519 Set the correct vnet in TOE event handlers.
Differential Revision:	https://reviews.freebsd.org/D6152
2016-05-06 23:49:10 +00:00
pfg
748b87203d Revert r298955 for the cxgbe firmware.
These files have checksums that are none of my business.

Requested by:	np
2016-05-03 11:49:29 +00:00
pfg
eed4bd22ad sys/dev: minor spelling fixes.
Most affect comments, very few have user-visible effects.
2016-05-03 03:41:25 +00:00
pfg
e72339bbf0 sys: Make use of our rounddown() macro when sys/param.h is available.
No functional change.
2016-04-30 14:41:18 +00:00
pfg
b4106812fd Cleanup redundant parenthesis from existing howmany()/roundup() macro uses. 2016-04-22 16:57:42 +00:00
pfg
729533413f sys: use our roundup2/rounddown2() macros when param.h is available.
rounddown2 tends to produce longer lines than the original code
and when the code has a high indentation level it was not really
advantageous to do the replacement.

This tries to strike a balance between readability using the macros
and flexibility of having the expressions, so not everything is
converted.
2016-04-21 19:57:40 +00:00
np
b38b6a219d cxgbe(4): Always dispatch all work requests that have been written to the
descriptor ring before leaving drain_wrq_wr_list.
2016-04-12 22:11:29 +00:00
np
e89a20428b cxgbe(4): Always read the entire mailbox into the reply buffer.
The size of the reply can be different from the size of the command in
case a debug firmware asserts.  fw_asrt() needs the entire reply in
order to decode the location of the assert.

Sponsored by:   Chelsio Communications
2016-04-12 21:17:19 +00:00
jhb
1f54f511a0 Rename the 'M_B' macro in t4_regs.h to 'CXGBE_M_B'.
This fixes a conflict with the M_B macro in powerpc's
<machine/db_machdep.h> exposed by the recent addition of DDB commands
to the cxgbe driver.

Discussed with:	np
Reported by:	bz
Sponsored by:	Chelsio Communications
2016-04-12 17:44:34 +00:00
np
65b4f5af73 cxgbe(4): Provide an explicit value for nqpcq in the firmware
configuration file.
2016-04-11 02:18:59 +00:00
pfg
b63211eed5 Cleanup unnecessary semicolons from the kernel.
Found with devel/coccinelle.
2016-04-10 23:07:00 +00:00
jhb
f04f5692d1 Add a 'show t4 devlog <nexus>' DDB command.
This command displays the adapter's firmware device log similar to the
dev.<nexus>.misc.devlog sysctl.

Sponsored by:	Chelsio Communications
2016-04-10 06:19:26 +00:00
jhb
28641629e2 Add a 'show t4 tcb <nexus> <tid>' command to dump a TCB from DDB.
This allows the contents of a TCB to be extracted from a T4/T5 card in
DDB after a panic.
2016-04-10 05:06:58 +00:00
sephe
d0428dd51c tcp/lro: Use tcp_lro_flush_all in device drivers to avoid code duplication
And factor out tcp_lro_rx_done, which deduplicates the same logic with
netinet/tcp_lro.c

Reviewed by:	gallatin (1st version), hps, zbb, np, Dexuan Cui <decui microsoft com>
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5725
2016-04-01 06:28:33 +00:00
jhb
20ec114656 Remove #ifdef's from various structures used in the cxgbe/cxl driver.
This provides a constant ABI and layout for these structures (especially
struct adapter) avoiding some foot shooting.

Discussed with:	np
Sponsored by:	Chelsio Communications
2016-03-31 18:36:50 +00:00
np
0d21619c69 cxgbe/iw_cxgbe: Fix for stray "start_ep_timer timer already started!"
messages.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
Sponsored by:	Chelsio Communications
2016-03-29 00:10:47 +00:00
np
ba13a348f8 cxgbe(4): Be consistent and call ETHER_BPF_MTAP before writing anything
to the descriptor ring no matter what path the frame takes within the
driver's tx.
2016-03-22 18:56:23 +00:00
np
151439df17 iw_cxgbe/libcxgb4: Pull in many applicable fixes from the upstream Linux
iWARP driver and userspace library to the FreeBSD iw_cxgbe and libcxgb4.

This commit includes internal changesets 6785 8111 8149 8478 8617 8648
8650 9110 9143 9440 9511 9894 10164 10261 10450 10980 10981 10982 11730
11792 12218 12220 12222 12223 12225 12226 12227 12228 12229 12654.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
Sponsored by:	Chelsio Communications
2016-03-21 00:29:45 +00:00
np
0b6bb279a2 cxgbe(4): Tidy up PAUSE frame accounting.
Figure out if the chip is counting PAUSE frames in the "normal" stats
and take them out if it is.  This fixes a bug in the tx stats because
the default hardware behavior is different for Tx and Rx but the driver
was treating both the same way.  The result was that OPACKETS, OBYTES,
and OMCASTS were under-reported (if tx_pause > 0) before this change.

Note that the mac_stats sysctl still gives you the raw value of these
statistics straight from the device registers.
2016-03-17 01:15:16 +00:00
np
7d1fd13da9 cxgbe(4): Enable PFs 0-3, and allow creation of SR-IOV VFs on these PFs
in the default configuration files.
2016-03-16 19:46:22 +00:00
np
691b4852d0 cxgbe(4): Enable additional capabilities in the default configuration
files.  All features with FreeBSD drivers of some kind are now in the
default configuration.
2016-03-16 19:43:44 +00:00
np
39c3dc6b78 cxgbe(4): Update some register settings in the default configuration
files to match the "uwire" configuration.
2016-03-16 19:41:00 +00:00
np
76345b1fe4 cxgbe(4): Remove a couple of pointless assignments in sysctl_meminfo.
Do not display range if start = stop (this is a workaround for some
unused regions).
2016-03-16 19:36:11 +00:00
dim
83673368eb Fix the following gcc warnings on sparc64, when TCP_OFFLOAD is not
defined:

    sys/dev/cxgbe/t4_main.c:7474: warning: 'sysctl_tp_tick' defined but not used
    sys/dev/cxgbe/t4_main.c:7505: warning: 'sysctl_tp_dack_timer' defined but not used
    sys/dev/cxgbe/t4_main.c:7519: warning: 'sysctl_tp_timer' defined but not used

This just adds a bunch of #ifdef TCP_OFFLOAD in the right places.

Reviewed by:	np
Differential Revision: https://reviews.freebsd.org/D5620
2016-03-12 18:38:51 +00:00
np
96b33bc25d cxgbe(4): Fix typo in previous commit. 2016-03-12 03:02:33 +00:00
np
3114028bf2 cxgbe(4): Catch up with the latest list of card capabilities as reported
by the firmware.
2016-03-12 02:54:55 +00:00
np
ef752a91d6 cxgbe(4): sysctls to display the TOE's TCP timers.
cask:~# sysctl -d dev.t5nex.0.toe
dev.t5nex.0.toe.finwait2_timer: FINWAIT2 timer (us)
dev.t5nex.0.toe.initial_srtt: Initial SRTT (us)
dev.t5nex.0.toe.keepalive_intvl: Keepidle interval (us)
dev.t5nex.0.toe.keepalive_idle: Keepidle idle timer (us)
dev.t5nex.0.toe.persist_max: Persist timer max (us)
dev.t5nex.0.toe.persist_min: Persist timer min (us)
dev.t5nex.0.toe.rexmt_max: Retransmit max (us)
dev.t5nex.0.toe.rexmt_min: Retransmit min (us)
dev.t5nex.0.toe.dack_timer: DACK timer (us)
dev.t5nex.0.toe.dack_tick: DACK tick (us)
dev.t5nex.0.toe.timestamp_tick: TCP timestamp tick (us)
dev.t5nex.0.toe.timer_tick: TP timer tick (us)
...

cask:~# sysctl dev.t5nex.0.toe
dev.t5nex.0.toe.finwait2_timer: 9765440
dev.t5nex.0.toe.initial_srtt: 244128
dev.t5nex.0.toe.keepalive_intvl: 73240800
dev.t5nex.0.toe.keepalive_idle: 7031116800
dev.t5nex.0.toe.persist_max: 9765440
dev.t5nex.0.toe.persist_min: 976544
dev.t5nex.0.toe.rexmt_max: 9765440
dev.t5nex.0.toe.rexmt_min: 244128
dev.t5nex.0.toe.dack_timer: 19520
dev.t5nex.0.toe.dack_tick: 32.768
dev.t5nex.0.toe.timestamp_tick: 1048.576
dev.t5nex.0.toe.timer_tick: 32.768
...
2016-03-11 23:24:04 +00:00
np
7dc9277bbc cxgbe(4): Add sysctls to display the TP microcode version and the
expansion rom version (if there's one).

trantor:~# sysctl dev.t4nex dev.t5nex | grep _version
dev.t4nex.0.firmware_version: 1.15.28.0
dev.t4nex.0.tp_version: 0.1.9.4
dev.t5nex.0.firmware_version: 1.15.28.0
dev.t5nex.0.exprom_version: 1.0.0.68
dev.t5nex.0.tp_version: 0.1.4.9
2016-03-11 03:15:17 +00:00
np
2f6fcdc437 cxgbe(4): Add a sysctl for the event capture mask of the TP block's
logic analyzer.

dev.t5nex.<n>.misc.tp_la_mask
dev.t4nex.<n>.misc.tp_la_mask
2016-03-11 01:54:43 +00:00
np
641cb8b823 cxgbe(4): Improvements to the code that deals with the firmware's log.
- Query the location of the log very early during attach.  Refresh the
  location later after establishing contact with the firmware.
- Save the log's location as a flat address in devlog_params.
- Use a memory window instead of backdoor access to the EDC/MC to read
  the log.
2016-03-10 23:17:26 +00:00
np
03f8f8e396 cxgbe(4): Fix bug in r296603. The memory window needs to be
repositioned if the start address isn't in the window already.  One
of the bounds check used the end address instead.
2016-03-10 20:36:32 +00:00
np
a2eea2e32a cxgbe(4): Add general purpose routines that offer safe access to the
chip's memory windows.  Convert existing users of these windows to the
new routines.
2016-03-10 06:15:31 +00:00
np
30f0441902 cxgbe(4): Allow the addr/len pair that is being validated in
validate_mem_range to span multiple memory types.  Update
validate_mt_off_len to use validate_mem_range.
2016-03-10 02:43:10 +00:00
np
21185aab4f cxgbe(4): Rename regwin_lock to reg_lock. It is used to protect access
to indirect registers only.
2016-03-08 22:23:30 +00:00
np
7793331c97 cxgbe(4): Reshuffle and rototill t4_hw.c, solely to reduce diffs with
the internal shared code.

Obtained from:	Chelsio Communications
2016-03-08 19:34:56 +00:00
np
b2c73c6f43 cxgbe(4): Minor updates to the shared routines that deal with firmware images. 2016-03-08 10:07:40 +00:00
np
f1c5a8fddd cxgbe(4): Fix t4_tp_get_rdma_stats. 2016-03-08 09:40:45 +00:00
np
85671180d8 cxgbe(4): Many new functions in the shared code, unused at this time.
Obtained from:	Chelsio Communications
2016-03-08 09:34:56 +00:00
np
077b368018 cxgbe(4): Use t4_link_down_rc_str in shared code to decode the reason
the link is down, instead of doing it in OS specific code.
2016-03-08 08:59:34 +00:00
np
2e631eec3d cxgbe(4): Updates to shared routines that get/set various parameters via
the firmware.

Obtained from:	Chelsio Communications
2016-03-08 08:39:53 +00:00
np
fcb5d99d47 cxgbe(4): Remove __devinit and SPEED_<foo> as part of catch up with
internal shared code.

Obtained from:	Chelsio Communications
2016-03-08 08:13:37 +00:00
np
54eec7a19d cxgbe(4): Updates to the shared routines that deal with the serial EEPROM,
flash, and VPD.

Obtained from:	Chelsio Communications
2016-03-08 07:48:55 +00:00
np
475b5a6654 cxgbe(4): Updates to mailbox routines in the shared code.
Obtained from:	Chelsio Communications
2016-03-08 06:27:47 +00:00
np
2b06c278d4 cxgbe(4): Update the interrupt handlers for hardware errors.
Obtained from:	Chelsio Communications
2016-03-08 02:44:32 +00:00
np
eeb2a9e4d2 cxgbe(4): Overhaul the shared code that deals with the chip's TP block,
which is responsible for filtering and RSS.

Add the ability to use filters that match on PF/VF (aka "VNIC id") while
here.  This is mutually exclusive with filtering on outer VLAN tag with
Q-in-Q.

Sponsored by:	Chelsio Communications
2016-03-08 02:04:05 +00:00
np
1dda046f14 cxgbe(4): Add a struct sge_params to store per-adapter SGE parameters.
Move the code that reads all the parameters to t4_init_sge_params in the
shared code.  Use these per-adapter values instead of globals.

Sponsored by:	Chelsio Communications
2016-03-08 00:23:56 +00:00
np
503abbeeca cxgbe(4): Updated register dumps.
- Get the list of registers to read during a regdump from the shared
  code instead of the OS specific code.  This follows a similar move
  internally.  The shared code includes the list for T6.

- Update cxgbetool to be able to decode T5 VF, T6, and T6 VF register
  dumps (and catch up with some updates to T4 and T5 register decode).

Obtained from:	Chelsio Communications
Sponsored by:	Chelsio Communications
2016-03-07 21:11:35 +00:00
np
38b9655bb9 cxgbe(4): Very basic T6 awareness. This is part of ongoing work to
update to the latest internal shared code.

- Add a chip_params structure to keep track of hardware constants for
  all generations of Terminators handled by cxgbe.
- Update t4_hw_pci_read_cfg4 to work with T6.
- Update the hardware debug sysctls (hidden within dev.<tNnex>.<n>.misc.*) to
  work with T6.  Most of the changes are in the decoders for the CIM
  logic analyzer and the MPS TCAM.
- Acquire the regwin lock around indirect register accesses.

Obtained from:	Chelsio Communications
Sponsored by:	Chelsio Communications
2016-03-04 13:11:13 +00:00
np
048dbd1d8a cxgbe(4): First of many changes to reduce diffs with internal shared
code:

- Rename some CamelCase variables.
- s/t4_link_start/t4_link_l1cfg/g
- Pull in t4_get_port_type_description.
- Move t4_wait_op_done to t4_hw.c.
- Flip the order of the RDMA stats.
- Remove unsused function t4_iq_start_stop.
- Move t4_wait_op_done and t4_wait_op_done_val to t4_hw.c

Obtained from:	Chelsio Communications
2016-03-03 01:41:53 +00:00