Commit Graph

1035 Commits

Author SHA1 Message Date
Navdeep Parhar
5323ca8f4b Remove unused item. cxgbe's rx queue's lock was removed a long time ago.
MFC after:	3 days
2012-10-10 16:52:39 +00:00
Kevin Lo
9823d52705 Revert previous commit...
Pointyhat to:	kevlo (myself)
2012-10-10 08:36:38 +00:00
Kevin Lo
a10cee30c9 Prefer NULL over 0 for pointers 2012-10-09 08:27:40 +00:00
Gavin Atkinson
e935190a33 Switch some PCI register reads from using magic numbers to using the names
defined in pcireg.h

MFC after:	1 week
2012-09-19 12:27:23 +00:00
Gavin Atkinson
389c8bd51e Align the PCI Express #defines with the style used for the PCI-X
#defines.  This also has the advantage that it makes the names more
compact, iand also allows us to correct the non-uniform naming of
the PCIM_LINK_* defines, making them all consistent amongst themselves.

This is a mostly mechanical rename:
  s/PCIR_EXPRESS_/PCIER_/g
  s/PCIM_EXP_/PCIEM_/g
  s/PCIM_LINK_/PCIEM_LINK_/g

When this is MFC'd, #defines will be added for the old names to assist
out-of-tree drivers.

Discussed with:	jhb
MFC after:	1 week
2012-09-18 22:04:59 +00:00
Navdeep Parhar
8a599c0859 Install interrupt handlers early, during attach, for the reason
explained in r239913 by jhb.

MFC after:	1 week
2012-09-13 09:18:13 +00:00
Navdeep Parhar
57c60f98b8 Use native FreeBSD facilities everywhere except the shared code in common/
MFC after:	1 week
2012-09-13 09:10:10 +00:00
Navdeep Parhar
8a7ba352b0 Update interface to firmware 1.6.2 and include the firmware in the driver.
Obtained from:	Chelsio
MFC after:	1 week
2012-09-13 06:32:52 +00:00
Navdeep Parhar
87a74dd6e3 Deal with the case where a syncache entry added by the TOE driver is
evicted from the syncache but a later syncache_expand succeeds because
of syncookies.  The TOE driver has to resort to more direct means to
install its hooks in the socket in this case.
2012-08-21 22:23:17 +00:00
Navdeep Parhar
f9796f4373 Avoid a NULL pointer dereference. 2012-08-21 19:45:19 +00:00
Navdeep Parhar
36fd646e38 Cannot hold a mutex around vm_fault_quick_hold_pages, so don't. Tweak
some comments while here.
2012-08-21 19:39:09 +00:00
Navdeep Parhar
c91bcaaab5 Minor cleanup: use bitwise ops instead of pointless wrappers around
setbit/clrbit.
2012-08-21 18:30:16 +00:00
Navdeep Parhar
06fd9875aa Correctly handle the case where an inp has already been dropped by the time
the TOE driver reports that an active open failed.  toe_connect_failed is
supposed to handle this but it should be provided the inpcb instead of the
tcpcb which may no longer be around.
2012-08-21 18:09:33 +00:00
Navdeep Parhar
e682d02e12 Support for TCP DDP (Direct Data Placement) in the T4 TOE module.
Basically, this is automatic rx zero copy when feasible.  TCP payload is
DMA'd directly into the userspace buffer described by the uio submitted
in soreceive by an application.

- Works with sockets that are being handled by the TCP offload engine
  of a T4 chip (you need t4_tom.ko module loaded after cxgbe, and an
  "ifconfig +toe" on the cxgbe interface).
- Does not require any modification to the application.
- Not enabled by default.  Use hw.t4nex.<X>.toe.ddp="1" to enable it.
2012-08-17 00:49:29 +00:00
Navdeep Parhar
5f7a640879 Initialize various DDP parameters in the main cxgbe(4) driver:
- Setup multiple DDP page sizes.  When the driver attempts DDP it will
  try to combine physically contiguous pages into regions of these sizes.

- Set the indicate size such that the payload carried in the indicate can
  be copied in the header mbuf (and the 16K rx buffer can be recycled).

- Set DDP threshold to the max payload that the chip will coalesce and
  deliver to the driver (this is ~16K by default, which is also why the
  offload rx queue is backed by 16K buffers).  If the chip is able to
  coalesce up to the max it's allowed to, it's a good sign that the peer
  is transmitting in bulk without any TCP PSH.

MFC after:	2 weeks
2012-08-16 22:33:56 +00:00
Navdeep Parhar
efc9fddc3d Make room for DDP page pods in the default configuration profile. While
here, bump up the L2 table's size to 4K entries.

MFC after:	2 weeks
2012-08-16 20:30:14 +00:00
Navdeep Parhar
1f1b5a0f6f Add a routine (t4_set_tcb_field) to update arbitrary parts of a hardware
TCB.  Filters are programmed by modifying the TCB too (via a different
routine) and the reply to any TCB update is delivered via a
CPL_SET_TCB_RPL.  Figure out whether the reply is for a filter-write or
something else and route it appropriately.

MFC after:	2 weeks
2012-08-16 20:15:29 +00:00
Navdeep Parhar
1b4cc91fcc Allow for a different handler for each type of firmware message.
MFC after:	2 weeks
2012-08-16 18:31:50 +00:00
Navdeep Parhar
8340ece577 The size of the buffers in an Ethernet freelist has to be higher than the
interface's MTU.  Initialize such freelists with correct values.

This wasn't a problem for common MTUs (1500 and 9000) as the buffers (2048
and 9216 in size) happened to have enough spare room.  I ran into it when
playing around with unusual MTUs.

MFC after:	2 weeks
2012-08-15 01:03:13 +00:00
Navdeep Parhar
2951cbabf0 if_iqdrops should include frames truncated within the chip.
MFC after:	2 weeks
2012-08-14 22:15:12 +00:00
Navdeep Parhar
9fb8886b29 Convert some fixed parameters to tunables (with reasonable default
values).

- cong_drop specifies what to do on congestion: nothing, backpressure,
  or drop.
- fl_pktshift specifies the padding before Ethernet payload.
- fl_pad specifies the boundary upto which to pad Ethernet payload.
- spg_len controls the length of the status page.

MFC after:	2 weeks
2012-08-14 21:47:41 +00:00
Dimitry Andric
daccbb811d In sys/dev/cxgbe/firmware/t4fw_interface.h, change the enum
'fw_hdr_intfver' into an anonymous enum, which avoids a clang 3.2
warning about all the enum values being the same value.

Reviewed by:	np
MFC after:	1 week
2012-08-06 18:54:17 +00:00
Navdeep Parhar
c8d954ab75 Fix a bug in code that calculates the number of the first interrupt
vector for a port.  This affected the gigabit ports of T422 cards (the
ones with 2x10G ports and 2x1G ports).

MFC after:	will check with re@
2012-07-09 21:53:50 +00:00
Navdeep Parhar
9a0b948f98 Fix inverted test that resulted in incorrect multicast hw programming. 2012-07-03 06:56:11 +00:00
Navdeep Parhar
9f1dae79da Instruct the firmware not to provision resources for TCP offload if the
kernel is being built without TCP_OFFLOAD.  But never override
toecaps_allowed if it has been set manually.
2012-07-02 20:42:43 +00:00
Navdeep Parhar
932b1a5f1d - Assign (don't OR) the CSUM_XXX bits to csum_flags in the rx checksum code.
- Fix TSO/TSO4 mixup.
- Add IFCAP_LINKSTATE to the available/enabled capabilities.
2012-06-30 02:05:09 +00:00
Navdeep Parhar
a1ea9a8276 cxgbe(4): support for IPv6 TSO and LRO.
Submitted by:	bz (this is a modified version of that patch)
2012-06-29 19:51:06 +00:00
Navdeep Parhar
9600bf00bb cxgbe(4): support for IPv6 hardware checksumming (rx and tx). 2012-06-29 16:50:52 +00:00
Navdeep Parhar
2cd9f0711d Allow cxgbe(4) running within a VM to attach to its devices that have been
exported via PCI passthrough.

- Do not check for a specific physical function (PF) before claiming a device.
  Different PFs have different device-ids so this check is redundant anyway.

- Obtain the PF# from the WHOAMI register instead of pci_get_function().

- Setup the memory windows using the real BAR0 address, not what the VM says it
  is.

Obtained from:	Chelsio Communications
2012-06-26 00:34:34 +00:00
Navdeep Parhar
4defc81b0e Better way to determine the status page length and rx pad boundary. 2012-06-23 22:12:27 +00:00
Navdeep Parhar
3c51d1544a Do not allocate extra vectors when adapter is not TOE
capable (or toecaps have been disallowed by the user).

+ one very minor unrelated cleanup in t4_sge.c
2012-06-22 22:59:42 +00:00
Navdeep Parhar
afce448c1a Do not read registers with read side effects while performing a register
dump for cxgbetool.
2012-06-22 08:37:33 +00:00
Navdeep Parhar
2a5f6b0e65 cxgbe(4): update to firmware interface 1.5.2.0; updates to shared code. 2012-06-22 07:51:15 +00:00
Navdeep Parhar
09fe63205c - Updated TOE support in the kernel.
- Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs.
  These are available as t3_tom and t4_tom modules that augment cxgb(4)
  and cxgbe(4) respectively.  The cxgb/cxgbe drivers continue to work as
  usual with or without these extra features.

- iWARP driver for Terminator 3 ASIC (kernel verbs).  T4 iWARP in the
  works and will follow soon.

Build-tested with make universe.

30s overview
============
What interfaces support TCP offload?  Look for TOE4 and/or TOE6 in the
capabilities of an interface:
# ifconfig -m | grep TOE

Enable/disable TCP offload on an interface (just like any other ifnet
capability):
# ifconfig cxgbe0 toe
# ifconfig cxgbe0 -toe

Which connections are offloaded?  Look for toe4 and/or toe6 in the
output of netstat and sockstat:
# netstat -np tcp | grep toe
# sockstat -46c | grep toe

Reviewed by:	bz, gnn
Sponsored by:	Chelsio communications.
MFC after:	~3 months (after 9.1, and after ensuring MFC is feasible)
2012-06-19 07:34:13 +00:00
Bjoern A. Zeeb
62b5b6ecd0 MFp4 bz_ipv6_fast:
Significantly update tcp_lro for mostly two things:
  1) introduce basic support for IPv6 without extension headers.
  2) try hard to also get the incremental checksum updates right,
     especially also in the IPv4 case for the IP and TCP header.

  Move variables around for better locality, factor things out into
  functions, allow checksum updates to be compiled out, ...

  Leave a few comments on further things to look at in the future,
  though that is not the full list.

  Update drivers with appropriate #includes as needed for IPv6 data
  type in LRO.

  Sponsored by:	The FreeBSD Foundation
  Sponsored by:	iXsystems

Reviewed by:	gnn (as part of the whole)
MFC After:	3 days
2012-05-24 23:03:23 +00:00
Navdeep Parhar
7a32954c40 Change the default to not use packet counters to generate rx interrupts.
Rely solely on the timer based mechanism.

Update man page to reflect this change.

MFC after:	1 week
2012-04-30 09:46:05 +00:00
Navdeep Parhar
e07f03e8fc Make sure that the firmware version is available in
dev.t4nex.X.firmware_version even if the driver fails to attach
properly.  At least it'll be easy to tell what we're dealing with.

MFC after:	1 week
2012-04-30 08:44:10 +00:00
Navdeep Parhar
d513f5b690 Use the non-sleeping variang of t4_wr_mbox in code that can be called
with locks held.

MFC after:	1 day
2012-02-13 18:41:32 +00:00
Navdeep Parhar
62795b70eb Program the MAC exact match table in batches of 7 addresses at
a time when possible.  This is more efficient than one at a time.

Submitted by:	gnn
MFC after:	3 days
2012-02-08 00:36:36 +00:00
Navdeep Parhar
17c60e7b50 Acquire the adapter lock before updating fields of the filter structure.
Submitted by:	gnn (different version)
MFC after:	3 days
2012-02-07 09:39:46 +00:00
Navdeep Parhar
65d43cc6e7 Remove if_start from cxgb and cxgbe.
Submitted by:	jhb
MFC after:	3 days
2012-02-07 07:32:39 +00:00
Navdeep Parhar
bfb08b6b6b cxgbe: reduce diffs with other branches.
Will help future MFCs from HEAD.

MFC after:	3 days
2012-02-07 06:21:59 +00:00
Navdeep Parhar
733b92779e Many updates to cxgbe(4)
- Device configuration via plain text config file.  Also able to operate
  when not attached to the chip as the master driver.

- Generic "work request" queue that serves as the base for both ctrl and
  ofld tx queues.

- Generic interrupt handler routine that can process any event on any
  kind of ingress queue (via a dispatch table).

- A couple of new driver ioctls.  cxgbetool can now install a firmware
  to the card ("loadfw" command) and can read the card's memory
  ("memdump" and "tcb" commands).

- Lots of assorted information within dev.t4nex.X.misc.*  This is
  primarily for debugging and won't show up in sysctl -a.

- Code to manage the L2 tables on the chip.

- Updates to cxgbe(4) man page to go with the tunables that have changed.

- Updates to the shared code in common/

- Updates to the driver-firmware interface (now at fw 1.4.16.0)

MFC after:	1 month
2011-12-16 02:09:51 +00:00
Navdeep Parhar
214c358257 Do not clobber the ingress queue's congestion setting.
MFC after:	1 month
2011-12-14 05:34:23 +00:00
Matthew D Fleming
103af58f59 Do not define bool/true/false if the symbols already exist.
MFC after:	2 weeks
Sponsored by:	Isilon Systems, LLC
2011-12-12 18:43:24 +00:00
Marius Strobl
4b7ec27007 - There's no need to overwrite the default device method with the default
one. Interestingly, these are actually the default for quite some time
  (bus_generic_driver_added(9) since r52045 and bus_generic_print_child(9)
  since r52045) but even recently added device drivers do this unnecessarily.
  Discussed with: jhb, marcel
- While at it, use DEVMETHOD_END.
  Discussed with: jhb
- Also while at it, use __FBSDID.
2011-11-22 21:28:20 +00:00
Ed Schouten
6472ac3d8a Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.
The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.
2011-11-07 15:43:11 +00:00
Navdeep Parhar
59bc8ce035 - driver ioctl to get SGE context for any given queue.
- sysctls to display the context id, cidx, and pidx of all kinds of queues.

MFC after:	3 days
2011-06-11 04:50:54 +00:00
Navdeep Parhar
9104663338 Cause backpressure (instead of dropping frames) on congestion.
MFC after:	3 days
2011-06-04 23:36:19 +00:00
Navdeep Parhar
9b4d7b4e67 Allow lazy fill up of freelists.
MFC after:	3 days
2011-06-04 23:31:33 +00:00
Navdeep Parhar
272cba15b8 Provide hit-count with rest of the information about a filter.
MFC after:	1 week
2011-06-01 01:32:58 +00:00
Navdeep Parhar
136e410ceb Firmware device log.
# sysctl dev.t4nex.0.devlog

MFC after:	mdf's sysctl+sbuf changes are MFC'd
2011-05-31 23:49:13 +00:00
Navdeep Parhar
b400f1ea97 Update to firmware interface 1.3.10
MFC after:	1 week
2011-05-30 21:56:37 +00:00
Navdeep Parhar
56599263c5 - Specialized ingress queues that take interrupts for other ingress
queues.  Try to have a set of these per port when possible, fall back
  to sharing a common pool between all ports otherwise.

- One control queue per port (used to be one per hardware channel).

- t4_eth_rx now handles Ethernet rx only.

- sysctls to display pidx/cidx for some queues.

MFC after:	1 week
2011-05-30 21:34:44 +00:00
Navdeep Parhar
4dba21f17e L2 table code. This is enough to get the T4's switch + L2 rewrite
filters working.  (All other filters - switch without L2 info rewrite,
steer, and drop - were already fully-functional).

Some contrived examples of "switch" filters with L2 rewriting:

# cxgbetool t4nex0  iport 0  dport 80  action switch  vlan +9  eport 3
Intercept all packets received on physical port 0 with TCP port 80 as
destination, insert a vlan tag with VID 9, and send them out of port 3.

# cxgbetool t4nex0  sip 192.168.1.1/32  ivlan 5  action switch \
	vlan =9  smac aa:bb:cc:dd:ee:ff  eport 0
Intercept all packets (received on any port) with source IP address
192.168.1.1 and VLAN id 5, rewrite the VLAN id to 9, rewrite source mac
to aa:bb:cc:dd:ee:ff, and send it out of port 0.

MFC after:	1 week
2011-05-30 21:07:26 +00:00
Navdeep Parhar
b0775aef77 Simplify t4_os_find_pci_capability.
MFC after:	3 days
2011-05-19 19:37:41 +00:00
Navdeep Parhar
bc14b14d62 - Enable per-channel congestion notification.
- Enable PCIe relaxed ordering for all egress queues and rx data buffers.

MFC after:	3 days
2011-05-18 22:09:04 +00:00
Navdeep Parhar
c43431465e Add missing header. The test for VLAN_CAPABILITIES later in the file
doesn't make sense without it.

MFC after:	3 days
2011-05-17 00:40:11 +00:00
Navdeep Parhar
af49c94220 sysctl that displays the absolute queue id of an rxq. 2011-05-14 19:27:15 +00:00
Navdeep Parhar
3792a4d286 Bump up the number of egress queues that the driver is allowed to use.
MFC after:	3 days
2011-05-05 23:09:17 +00:00
Navdeep Parhar
489eeba940 T4 packet timestamps.
Reference code that shows how to get a packet's timestamp out of
cxgbe(4).  Disabled by default because we don't have a standard way
today to pass this information up the stack.

The timestamp is 60 bits wide and each increment represents 1 tick of
the T4's core clock.  As an example, the timestamp granularity is ~4.4ns
for this card:

# sysctl dev.t4nex.0.core_clock
dev.t4nex.0.core_clock: 228125

MFC after:	1 week
2011-05-05 02:38:08 +00:00
Navdeep Parhar
8820ce5fe7 T4 packet filtering/steering.
- Enable 5-tuple and every-packet lookup.

- Setup the default filter mode to allow filtering/steering based on IP
  protocol, ingress port, inner VLAN ID, IP frag, FCoE, and MPS match
  type; all combined together.  You can also filter based on MAC index,
  Ethernet type, IP TOS/IPv6 Traffic Class, and outer VLAN ID but you'll
  have to modify the default filter mode and exclude some of the
  match-fields in it.

  IPv4 and IPv6 SIP/DIP/SPORT/DPORT are always available in all filter
  rules.

- Add driver ioctls to get/set the global filter mode.

- Add driver ioctls to program and delete hardware filters.  A couple of
  the "switch" actions that rewrite Ethernet and VLAN information and
  switch the packet out of another port may not work as the L2 code is not
  yet in place.  Everything else, including all "drop" and "pass" rules
  with RSS or absolute qid, should work.

Obtained from:	 Chelsio Communications
2011-05-05 02:04:56 +00:00
Navdeep Parhar
b815af1b74 Always re-arm an iq's interrupt before leaving the handler.
MFC after:	1 week
2011-05-04 23:07:30 +00:00
Navdeep Parhar
fb12416c9f Ring the freelist doorbell from within refill_fl. While here, fix a bug
that could have allowed the hardware pidx to reach the cidx even though
the freelist isn't empty.  (Haven't actually seen this but it was there
waiting to happen..)

MFC after:	1 week
2011-04-20 23:20:00 +00:00
Navdeep Parhar
b5a6d97e1e Use the correct free routine when destroying a control queue.
X-MFC after:	r220873
2011-04-20 18:04:34 +00:00
Navdeep Parhar
657d9381b1 Use Toeplitz hash for RSS.
MFC after:	3 days
2011-04-19 22:14:18 +00:00
Navdeep Parhar
f7dfe243b4 - Move all Ethernet specific items from sge_eq to sge_txq. sge_eq is
now a suitable base for all kinds of egress queues.

- Add control queues (sge_ctrlq) and allocate one of these per hardware
  channel.  They can be used to program filters and steer traffic (and
  more).

MFC after:	1 week
2011-04-19 22:08:28 +00:00
Navdeep Parhar
2be67d2948 Fix a couple of bad races that can occur when a cxgbe interface is taken
down.  The ingress queue lock was unused and has been removed as part of
these changes.

- An in-flight egress update from the SGE must be handled before the
  queue that requested it is destroyed.  Wait for the update to arrive.

- Interrupt handlers must stop processing rx events for a queue before
  the queue is destroyed.  Events that have not yet been processed
  should be ignored once the queue disappears.

MFC after:	1 week
2011-04-15 03:09:27 +00:00
Navdeep Parhar
6b49a4ece8 There is no need to request a tx credit flush if such a request is already
pending.

MFC after:	3 days
2011-04-14 20:06:23 +00:00
Navdeep Parhar
bd0d62016a Modify read/write ioctls to work with 64 bit registers too.
MFC after:	3 days
2011-04-07 07:10:42 +00:00
Navdeep Parhar
37ba135472 Update header and related code for firmware 1.3.8
MFC after:	3 days
2011-04-01 00:40:24 +00:00
Navdeep Parhar
a91fea93ad Do not over-allocate MSI interrupts for the case where each ingress
queue has its own interrupt.  If the exact number that we need is not a
power of 2 and we're using MSI, then switch to interrupt multiplexing.

While here, replace the magic numbers with something more readable.

MFC after:	3 days
2011-03-24 01:03:01 +00:00
Navdeep Parhar
9f1f7ec9a8 Fix an error while constructing the table that maps context id -> egress
queue.

MFC after:	1 day
2011-03-22 21:05:56 +00:00
Navdeep Parhar
d986a01abf Display holdoff timers and packet counts as a list of numbers.
MFC after:	1 week
2011-03-09 21:07:09 +00:00
Navdeep Parhar
9458619309 cxgbe shouldn't directly know of the UMA zones where network buffers
come from.

MFC after:	1 week
2011-03-08 03:04:07 +00:00
Navdeep Parhar
99bb3c5399 Be sure to stay within the bounds of the mod_str array when displaying
the transceiver type.
2011-03-05 04:19:38 +00:00
Navdeep Parhar
83d58badea There is no need to hold an ingress queue's lock while processing its
descriptors.

MFC after:	1 week
2011-03-05 04:04:23 +00:00
Navdeep Parhar
e874ff7a8b Calculate how many descriptors can be reclaimed before calling
reclaim_tx_descs
2011-03-05 03:54:37 +00:00
Navdeep Parhar
7d29df5931 Tweaks for rx:
- everything related to LRO should be in #ifdef INET blocks
- reorder sge_iq's fields so that the most frequently used are all together
- pull all rx code into t4_intr_data directly
- let go of the ingress queue lock when passing up data
- refill the freelist only if it is short of at least 32 buffers
2011-03-05 03:42:03 +00:00
Navdeep Parhar
29ca78e104 Store the ifnet rather than the port_info in each txq and rxq struct.
MFC after:	1 week
2011-03-05 03:27:14 +00:00
Navdeep Parhar
aa2457e17c A txpkts work request should have a valid FID.
MFC after:	1 week
2011-03-05 03:18:56 +00:00
Navdeep Parhar
56c2cdaf9b Upgrade the firmware on the card automatically if a better version is
available.  Downgrade only for a major version mismatch.

MFC after:	1 week
2011-03-05 03:12:50 +00:00
Navdeep Parhar
ecb79ca4f6 Resume tx immediately in response to an SGE egress update from the hardware.
MFC after:	1 week
2011-03-05 03:06:38 +00:00
Navdeep Parhar
4a1bd0e4e8 Fix incorrect assertion.
MFC after:	3 days
2011-03-05 03:01:14 +00:00
Navdeep Parhar
54e4ee7163 cxgbe(4) - NIC driver for Chelsio T4 (Terminator 4) based 10Gb/1Gb adapters.
MFC after:	3 weeks
2011-02-18 08:00:26 +00:00