Commit Graph

138 Commits

Author SHA1 Message Date
Gleb Smirnoff
56b61ca27a Remove ifq_drops from struct ifqueue. Now queue drops are accounted in
struct ifnet if_oqdrops.

Some netgraph modules used ifqueue w/o ifnet. Accounting of queue drops
is simply removed from them. There were no API to read this statistic.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-09-19 09:01:19 +00:00
Hans Petter Selasky
af3b2549c4 Pull in r267961 and r267973 again. Fix for issues reported will follow. 2014-06-28 03:56:17 +00:00
Glen Barber
37a107a407 Revert r267961, r267973:
These changes prevent sysctl(8) from returning proper output,
such as:

 1) no output from sysctl(8)
 2) erroneously returning ENOMEM with tools like truss(1)
    or uname(1)
 truss: can not get etype: Cannot allocate memory
2014-06-27 22:05:21 +00:00
Hans Petter Selasky
3da1cf1e88 Extend the meaning of the CTLFLAG_TUN flag to automatically check if
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating childrens of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading cludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.

Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.

MFC after:	2 weeks
Sponsored by:	Mellanox Technologies
2014-06-27 16:33:43 +00:00
Gleb Smirnoff
76039bc84f The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare
to this event, adding if_var.h to files that do need it. Also, include
all includes that now are included due to implicit pollution via if_var.h

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-26 17:58:36 +00:00
Konstantin Belousov
5ada86640b Add dependencies on the firmware, which allows the loading of the cxgb
and cxgbe modules.

Reviewed and approved by:	np
MFC after:	1 week
2013-05-16 13:07:02 +00:00
Gleb Smirnoff
c6499eccad Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags in sys/dev.
2012-12-04 09:32:43 +00:00
Eitan Adler
db702c59cf remove duplicate semicolons where possible.
Approved by:	cperciva
MFC after:	1 week
2012-10-22 03:00:37 +00:00
Gavin Atkinson
389c8bd51e Align the PCI Express #defines with the style used for the PCI-X
#defines.  This also has the advantage that it makes the names more
compact, iand also allows us to correct the non-uniform naming of
the PCIM_LINK_* defines, making them all consistent amongst themselves.

This is a mostly mechanical rename:
  s/PCIR_EXPRESS_/PCIER_/g
  s/PCIM_EXP_/PCIEM_/g
  s/PCIM_LINK_/PCIEM_LINK_/g

When this is MFC'd, #defines will be added for the old names to assist
out-of-tree drivers.

Discussed with:	jhb
MFC after:	1 week
2012-09-18 22:04:59 +00:00
John Baldwin
ec9a9cf1e0 Attach interrupt handlers during attach instead of during the first time
the interface is brought up.  Without this, the boot time interrupt
round-robin assignment does not think the allocated interrupt resources
are active and leaves them assigned to CPU 0.

While here, add descriptive tags to each interrupt handler when MSI-X
is used.

Reviewed by:	np
MFC after:	1 week
2012-08-30 17:47:39 +00:00
Navdeep Parhar
0a7049095f cxgb(4): IPv6 rx/tx hw checksum, IPv6 TSO and LRO too.
(Some parts already worked, this makes it complete).
2012-06-30 02:11:53 +00:00
Navdeep Parhar
09fe63205c - Updated TOE support in the kernel.
- Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs.
  These are available as t3_tom and t4_tom modules that augment cxgb(4)
  and cxgbe(4) respectively.  The cxgb/cxgbe drivers continue to work as
  usual with or without these extra features.

- iWARP driver for Terminator 3 ASIC (kernel verbs).  T4 iWARP in the
  works and will follow soon.

Build-tested with make universe.

30s overview
============
What interfaces support TCP offload?  Look for TOE4 and/or TOE6 in the
capabilities of an interface:
# ifconfig -m | grep TOE

Enable/disable TCP offload on an interface (just like any other ifnet
capability):
# ifconfig cxgbe0 toe
# ifconfig cxgbe0 -toe

Which connections are offloaded?  Look for toe4 and/or toe6 in the
output of netstat and sockstat:
# netstat -np tcp | grep toe
# sockstat -46c | grep toe

Reviewed by:	bz, gnn
Sponsored by:	Chelsio communications.
MFC after:	~3 months (after 9.1, and after ensuring MFC is feasible)
2012-06-19 07:34:13 +00:00
Navdeep Parhar
3e7cc3cab3 Add IPv6 TSO (including TSO+VLAN) support to cxgb(4).
If an IPv6 packet has extension headers the kernel needs to deal with it
itself.  For the rest it can set various CSUM_XXX flags and the driver
will act on them.
2012-02-09 23:19:09 +00:00
Navdeep Parhar
c3286cd2b6 Allocate the BAR for userspace doorbells after the is_offload check
is functional.

MFC after:	3 days
2012-02-08 03:02:12 +00:00
Navdeep Parhar
65d43cc6e7 Remove if_start from cxgb and cxgbe.
Submitted by:	jhb
MFC after:	3 days
2012-02-07 07:32:39 +00:00
Marius Strobl
4b7ec27007 - There's no need to overwrite the default device method with the default
one. Interestingly, these are actually the default for quite some time
  (bus_generic_driver_added(9) since r52045 and bus_generic_print_child(9)
  since r52045) but even recently added device drivers do this unnecessarily.
  Discussed with: jhb, marcel
- While at it, use DEVMETHOD_END.
  Discussed with: jhb
- Also while at it, use __FBSDID.
2011-11-22 21:28:20 +00:00
Navdeep Parhar
7eeb16cee7 t3_free_sge_resources should be given the number of qsets it needs to free.
MFC after:	1 week
2011-03-24 01:16:48 +00:00
John Baldwin
3b0a4aef96 Do a sweep of the tree replacing calls to pci_find_extcap() with calls to
pci_find_cap() instead.
2011-03-23 13:10:15 +00:00
Rebecca Cran
6bccea7c2b Fix typos - remove duplicate "the".
PR:	bin/154928
Submitted by:	Eitan Adler <lists at eitanadler.com>
MFC after: 	3 days
2011-02-21 09:01:34 +00:00
Matthew D Fleming
deceab8792 sysctl(9) cleanup checkpoint: amd64 GENERIC builds cleanly.
Commit the cxgb driver piece.
2011-01-12 19:53:44 +00:00
Navdeep Parhar
61cb6c9076 wakeup is required if the adapter lock is released anywhere during
init and not just for the may_sleep case.

Pointed out by:	Isilon
MFC after:	3 days
2010-08-15 20:34:51 +00:00
John Baldwin
b739a509f2 - Change the warning about PCI-e links narrower than x8 to only apply to
10G cards.  1G cards are x4 only.
- Use constants from pcireg.h for reading the current link width.
- Use pci_set_max_read_req() rather than implementing it by hand.

Reviewed by:	np
MFC after:	1 week
2010-07-26 17:31:15 +00:00
Navdeep Parhar
bd1a9fbad6 Improve cxgb(4)'s behaviour when faced with temporarily "bouncy" links:
- Run the adapter's tick at 1Hz and remove link state checks from it.
  Instead, have each port check its link state.  Delay the check so that
  it takes place slightly after the driver is notified of a change in
  link state.  This is a cheap way to debounce these notifications if
  many are received in rapid succession.  POLL_LINK_1ST_TIME flag can
  also be eliminated as a side effect of these changes.
- Do not reset the PHY when link goes down.
- Clear port's link_fault flag if the PHY indicates link is down.
- get_link_status_r should leave speed and duplex alone when link is down.

MFC after:	1 month
2010-07-09 00:38:00 +00:00
Navdeep Parhar
2c32b50248 Eliminate ext_intr_task. The "slow" interrupt handler is already
running on the adapter's task queue.  Just do what the task does
instead of enqueueing it.

MFC after:	3 days
2010-07-09 00:36:35 +00:00
Navdeep Parhar
29c54b85f9 Fix bufsize calculation so that cxgbtool can display information for the
last I/O queue too.

MFC after:	3 days
2010-07-09 00:35:09 +00:00
Navdeep Parhar
06eace6376 make format string a string literal.
Reported by:	clang
2010-06-12 22:24:39 +00:00
Navdeep Parhar
3a2c6562f3 cxgb(4): add an 'nfilters' tunable that lets the user place an upper
limit on the number of hardware filters (and thus the amount of TCAM
reserved for filtering).
2010-06-07 08:23:16 +00:00
Navdeep Parhar
cb958aba98 Remove invalid assertion.
Holding the adapter lock while changing the LRO settings is sufficient.

PR:		kern/146759
MFC after:	3 days
2010-05-20 18:22:45 +00:00
Navdeep Parhar
b85998cb48 Do not hold the T3 firmware in memory all the time. firmware(9) can
load/unload it as needed.
2010-05-05 22:29:54 +00:00
Navdeep Parhar
d6da836201 Add support for hardware filters to cxgb(4). The T3 chip can inspect
L2/3/4 headers and can drop or steer packets as instructed.  Filtering
based on src ip, dst ip, src port, dst port, 802.1q, udp/tcp, and mac
addr is possible.  Add support in cxgbtool to program these filters.
Some simple examples:

Drop all tcp/80 traffic coming from the subnet specified.
# cxgbtool cxgb2 filter 0 sip 192.168.1.0/24 dport 80 type tcp action drop

Steer all incoming UDP traffic to qset 0.
# cxgbtool cxgb2 filter 1 type udp queue 0 action pass

Steer all tcp traffic from 192.168.1.1 to qset 1.
# cxgbtool cxgb2 filter 2 sip 192.168.1.1 type tcp queue 1 action pass

Drop fragments.
# cxgbtool cxgb2 filter 3 type frag action drop

List all filters.
# cxgbtool cxgb2 filter list
index         SIP                DIP     sport dport VLAN PRI P/MAC type Q
    0     192.168.1.0/24         0.0.0.0     *    80    0 0/1 */*    tcp -
    1         0.0.0.0/0          0.0.0.0     *     *    0 0/1 */*    udp 0
    2     192.168.1.1/32         0.0.0.0     *     *    0 0/1 */*    tcp 1
    3         0.0.0.0/0          0.0.0.0     *     *    0 0/1 */*   frag -
16367         0.0.0.0/0          0.0.0.0     *     *    0 0/1 */*      * *

MFC after:	2 weeks
2010-05-05 00:41:40 +00:00
Navdeep Parhar
2caefebb07 Add IFCAP_LINKSTATE to cxgb's capabilities.
MFC after:	3 days
2010-05-04 23:55:08 +00:00
Maxim Sobolev
e50d35e6c6 Add new tunable 'net.link.ifqmaxlen' to set default send interface
queue length. The default value for this parameter is 50, which is
quite low for many of today's uses and the only way to modify this
parameter right now is to edit if_var.h file. Also add read-only
sysctl with the same name, so that it's possible to retrieve the
current value.

MFC after:	1 month
2010-05-03 07:32:50 +00:00
Navdeep Parhar
489ca05be7 Increase response queue size to avoid starvation, add a counter
to track it when it does occur.
2010-04-02 17:50:52 +00:00
Navdeep Parhar
97ae3bc359 Multiple fixes related to queue set sizing and resources:
- Only the tunnelq (TXQ_ETH) requires a buf_ring, an ifq, and the watchdog/timer
  callouts.  Do not allocate these for the other tx queues.

- Use 16k jumbo clusters only on offload capable cards by default.

- Do not allocate a full tx ring for the offload queue if the card is not
  offload capable.

- Slightly better freelist size calculation.

- Fix nmbjumbo4 typo, remove unneeded global variables.

MFC after:	3 days
2010-03-31 00:27:49 +00:00
Navdeep Parhar
92f61ecb4b Fix tx drop statistics.
MFC after:	3 days
2010-03-31 00:26:02 +00:00
Navdeep Parhar
1d609d51f0 Do not attempt to retrieve interrupt information before it is available.
MFC after:	3 days
2010-03-31 00:22:58 +00:00
Navdeep Parhar
a9da6d239c Refresh the firmware version immediately after it is upgraded (or downgraded).
MFC after:	3 days
2010-03-31 00:19:39 +00:00
Navdeep Parhar
cd5c70b2ba Better TwinAx transceiver detection.
Originally submitted by: <Bruno dot Bittner at isilon dot com>
(This is a rewritten, corrected version of that patch)

MFC after:    1 week
2010-03-09 19:57:44 +00:00
Navdeep Parhar
f9c6e16451 Support IFCAP_VLANHWTSO in cxgb(4). It works with or without vlanhwtag.
While here, remove old DPRINTFs and tidy up the capability code a bit.
2010-02-26 07:08:44 +00:00
Navdeep Parhar
e83ec3e5c3 There is no need to test __FreeBSD_version for features that have
been around for a long time now (7.1-ish or even earlier); assume
they are present.  These includes MSI, TSO, LRO, VLAN, INTR_FILTERS,
FIRMWARE, etc.

Also, eliminate some dead code and clean up in other places as part
of this quick once-over.

MFC after:	1 week
2010-02-24 10:16:18 +00:00
Navdeep Parhar
3c0e59de3e Don't forget to release the adapter lock for a no-op. 2010-01-23 01:44:30 +00:00
Navdeep Parhar
b302b77ca7 Fix for a cxgb(4) panic. cxgb_ioctl can be called by the IP and IPv6
layers with non-sleepable locks held.  Don't (potentially) sleep in
those situations.
2010-01-20 03:40:43 +00:00
Navdeep Parhar
a9c23ef044 Don't disable the XGMAC's tx on ifconfig down. It is unnecessary
and can cause false backpressure in the chip.  Fix a us/ms mixup
while here.
2009-11-13 00:37:29 +00:00
Navdeep Parhar
7ead19d4a8 sc->rev and is_offload(sc) will always be 0 during probe. Wait till
attach to get correct values.
2009-11-13 00:28:16 +00:00
John Baldwin
e1b17582f4 Take a step towards removing if_watchdog/if_timer. Don't explicitly set
if_watchdog/if_timer to NULL/0 when initializing an ifnet.  if_alloc()
sets those members to NULL/0 already.
2009-11-06 14:55:01 +00:00
Navdeep Parhar
c01f2b8301 cxgb(4) updates, including:
- support for the new Gen-2, BT, and LP-CR cards.
- T3 firmware 7.7.0
- shared "common code" updates.

Approved by:	gnn (mentor)
Obtained from:	Chelsio
MFC after:	1 month
2009-10-05 20:21:41 +00:00
John Baldwin
6b2eaa836f Fill the reverse RSS map with 0xff's so that the subsequent loop to
calculate the values will work properly.

Reviewed by:	np
MFC after:	1 month
2009-09-04 21:00:45 +00:00
Navdeep Parhar
2975f78738 Various ifmedia related fixes in cxgb(4), including:
- build ifmedia list based on phy->caps, not string comparisons.
- rebuild media list when a transceiver change is detected.
- return EOPNOTSUPP instead of ENXIO in cxgb_media_status.

Approved by:	gnn (mentor)
MFC after:	2 weeks.
2009-06-24 21:56:05 +00:00
Navdeep Parhar
16daf36814 Fix cxgb's ifmedia ioctl handling. Also fixed a comment.
Reviewed by:	kmacy
Approved by:	gnn (mentor)
2009-06-22 21:42:57 +00:00
Kip Macy
3f345a5d09 Greatly simplify cxgb by removing almost all of the custom mbuf management logic
- remove mbuf iovec - useful, but adds too much complexity when isolated to
   the driver

- remove driver private caching - insufficient benefit over UMA to justify
  the added complexity and maintenance overhead

- remove separate logic for managing multiple transmit queues, with the
  new drbr routines the control flow can be made to much more closely resemble
  legacy drivers

- remove dedicated service threads, with per-cpu callouts one can get the same
  benefit much more simply by registering a callout 1 tick in the future if there
  are still buffered packets

- remove embedded mbuf usage - Jeffr's changes will (I hope) soon be integrated
  greatly reducing the overhead of using kernel APIs for reference counting
  clusters

- add hysteresis to descriptor coalescing logic

- add coalesce threshold sysctls to allow users to decide at run-time
  between optimizing for forwarding / UDP or optimizing for TCP

- add once per second watchdog to effectively close the very rare races
  occurring from coalescing

- incorporate Navdeep's changes to the initialization path required to
  convert port and adapter locks back to ordinary mutexes (silencing BPF
  LOR complaints)

- enable prefetches in get_packet and tx cleaning

Reviewed by:	navdeep@
MFC after:	2 weeks
2009-06-19 23:34:32 +00:00