150 Commits

Author SHA1 Message Date
Ryan Moeller
cbb9ccf735 Avoid trying to toggle TSO twice
Remove TSO from the toggle mask when automatically disabled by TXCKSUM* in
various NIC drivers.

Reviewed by:	hselasky, np, gallatin, jpaetzel
Approved by:	mav (mentor)
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D25120
2020-06-15 16:35:27 +00:00
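
The fix described in cbb9ccf735 above can be sketched as follows. This is a minimal userland illustration, not the driver code: the CAP_* bits and apply_caps() are hypothetical stand-ins for the IFCAP_* flags and the capability-change path of a NIC driver. The point is that once TSO has been switched off as a side effect of disabling TX checksumming, its bit is cleared from the toggle mask so the later TSO step does not flip it back on.

```c
#include <stdint.h>

/* Hypothetical capability bits; real drivers use the IFCAP_* flags. */
#define CAP_TXCSUM 0x1u
#define CAP_TSO4   0x2u

/*
 * Apply a capability request to the currently enabled set and return
 * the resulting set.
 */
static uint32_t
apply_caps(uint32_t enabled, uint32_t requested)
{
	uint32_t mask = enabled ^ requested;	/* bits the caller wants flipped */

	if (mask & CAP_TXCSUM) {
		enabled ^= CAP_TXCSUM;
		/* TSO depends on TX checksum offload, so turn it off as well. */
		if ((enabled & CAP_TSO4) && !(enabled & CAP_TXCSUM)) {
			enabled &= ~CAP_TSO4;
			mask &= ~CAP_TSO4;	/* already handled, do not toggle again */
		}
	}
	if (mask & CAP_TSO4)
		enabled ^= CAP_TSO4;

	return (enabled);
}
```

Without the `mask &= ~CAP_TSO4` line, a request that clears both bits at once would disable TSO in the first step and then re-enable it in the second.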
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is a non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT.

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Conrad Meyer
7790c8c199 Split out a more generic debugnet(4) from netdump(4)
Debugnet is a simplistic and specialized panic- or debug-time reliable
datagram transport.  It can drive a single connection at a time and is
currently unidirectional (debug/panic machine transmit to remote server
only).

It is mostly a verbatim code lift from netdump(4).  Netdump(4) remains
the only consumer (until the rest of this patch series lands).

The INET-specific logic has been extracted somewhat more thoroughly than
previously in netdump(4), into debugnet_inet.c.  UDP-layer logic and up, to
the extent that it is protocol-independent, remains in debugnet.c.  The
separation is not perfect and future improvement is welcome.  Supporting
INET6 is a long-term goal.

Much of the diff is "gratuitous" renaming from 'netdump_' or 'nd_' to
'debugnet_' or 'dn_' -- sorry.  I thought keeping the netdump name on the
generic module would be more confusing than the refactoring.

The only functional change here is the mbuf allocation / tracking.  Instead
of initiating solely on netdump-configured interface(s) at dumpon(8)
configuration time, we watch for any debugnet-enabled NIC for link
activation and query it for mbuf parameters at that time.  If they exceed
the existing high-water mark allocation, we re-allocate and track the new
high-water mark.  Otherwise, we leave the pre-panic mbuf allocation alone.
In a future patch in this series, this will allow initiating netdump from
panic ddb(4) without pre-panic configuration.

No other functional change intended.

Reviewed by:	markj (earlier version)
Some discussion with:	emaste, jhb
Objection from:	marius
Differential Revision:	https://reviews.freebsd.org/D21421
2019-10-17 16:23:03 +00:00
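
The mbuf high-water-mark tracking described in 7790c8c199 above can be sketched as follows. This is an illustration under assumptions, not the debugnet(4) code: struct mbuf_req, its fields, and hwm_update() are hypothetical names. Each time a NIC reports its requirements at link activation, they are folded into the tracked maximum, and a reallocation is signalled only when some requirement exceeds it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-NIC mbuf requirements; field names are illustrative. */
struct mbuf_req {
	int nmbuf;	/* mbuf headers needed */
	int ncl;	/* clusters needed */
	int clsize;	/* cluster size in bytes */
};

/*
 * Fold one NIC's requirements into the tracked high-water mark.
 * Returns true when any field grew, meaning the preallocated pool would
 * have to be reallocated; otherwise the existing pool is left alone.
 */
static bool
hwm_update(struct mbuf_req *hwm, const struct mbuf_req *req)
{
	bool grew = false;

	assert(hwm != NULL && req != NULL);
	if (req->nmbuf > hwm->nmbuf) {
		hwm->nmbuf = req->nmbuf;
		grew = true;
	}
	if (req->ncl > hwm->ncl) {
		hwm->ncl = req->ncl;
		grew = true;
	}
	if (req->clsize > hwm->clsize) {
		hwm->clsize = req->clsize;
		grew = true;
	}
	return (grew);
}
```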
Conrad Meyer
3948ad29e9 cxgb(4): Netdump: only reference allocated qsets
SGE_QSETS is an upper bound -- fewer qsets may be allocated depending on
the number of CPUs.

Reviewed by:	markj, np, vangyzen
X-MFC-With:	r333288
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D17274
2019-03-01 01:57:22 +00:00
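
The distinction drawn in 3948ad29e9 above, between the compile-time upper bound and the number of qsets actually allocated, can be illustrated with a small sketch. The types, MAX_QSETS, and total_pending() are hypothetical and only stand in for the driver's structures; the value 8 is illustrative.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_QSETS 8	/* stand-in for SGE_QSETS; the value is illustrative */

struct qset_sketch {
	int pending;
};

struct adapter_sketch {
	int nqsets;				/* qsets actually allocated */
	struct qset_sketch *qs[MAX_QSETS];	/* entries past nqsets stay NULL */
};

/* Sum a per-qset counter over the allocated qsets only. */
static int
total_pending(const struct adapter_sketch *sc)
{
	int i, total = 0;

	assert(sc != NULL && sc->nqsets <= MAX_QSETS);
	/* Bound the loop by nqsets, not MAX_QSETS, to avoid NULL entries. */
	for (i = 0; i < sc->nqsets; i++)
		total += sc->qs[i]->pending;
	return (total);
}
```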
Warner Losh
0dc34160f3 Add PNP info to PCI attachments of cbb, cxgb, ida, iwn, ixl, ixlv,
mfi, mps, mpr, mvs, my, oce, pcn, ral, rl. This only labels existing
pci device tables, and has no probe / attach code changes.

Reviewed by: imp, chuck
Submitted by: Lakhan Shiva Kamireddy <lakhanshiva@gmail.com>
Sponsored by: Google, Inc. (GSoC 2018)
Approved by: re (glen)
2018-09-26 17:12:30 +00:00
Mark Johnston
eb07d67ef3 Add netdump support to cxgb(4).
Tested with a T320 adapter.

Reviewed by:	np
MFC after:	1 month
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D15258
2018-05-06 00:48:43 +00:00
Navdeep Parhar
76aca1d671 cxgb(4): Validate offset/len in the GET_EEPROM ioctl.
Reported by:	Ilja Van Sprundel <ivansprundel@ioactive.com>
2018-01-24 05:16:11 +00:00
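
The commit message for 76aca1d671 above does not show the check itself. A common way to validate a user-supplied offset/len pair against a fixed-size region is sketched below; EEPROM_SIZE and eeprom_range_ok() are assumptions made for illustration, not names or values taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

#define EEPROM_SIZE 8192u	/* hypothetical size, not taken from the driver */

/*
 * Return true when [offset, offset + len) lies inside the EEPROM.
 * The comparison is arranged so that offset + len is never computed,
 * because that sum could wrap around for large inputs.
 */
static bool
eeprom_range_ok(uint32_t offset, uint32_t len)
{
	return (offset <= EEPROM_SIZE && len <= EEPROM_SIZE - offset);
}
```

A naive `offset + len <= EEPROM_SIZE` test would accept a pair such as offset 0xFFFFFFFF and len 2, because the 32-bit sum wraps to 1.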
Pedro F. Giffuni
718cf2ccb9 sys/dev: further adoption of SPDX licensing ID tags.
Mainly focus on files that use the BSD 2-Clause license. However, the tool I
was using misidentified many licenses, so this was mostly a manual,
error-prone task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.
2017-11-27 14:52:40 +00:00
Jason A. Harmening
eb36b1d0bc Clean up MD pollution of bus_dma.h:
--Remove special-case handling of sparc64 bus_dmamap* functions.
  Replace with a more generic mechanism that allows MD busdma
  implementations to generate inline mapping functions by
  defining WANT_INLINE_DMAMAP in <machine/bus_dma.h>.  This
  is currently useful for sparc64, x86, and arm64, which all
  implement non-load dmamap operations as simple wrappers
  around map objects which may be bus- or device-specific.

--Remove NULL-checked bus_dmamap macros.  Implement the
  equivalent NULL checks in the inlined x86 implementation.
  For non-x86 platforms, these checks are a minor pessimization
  as those platforms do not currently allow NULL maps.  NULL
  maps were originally allowed on arm64, which appears to have
  been the motivation behind adding arm[64]-specific barriers
  to bus_dma.h, but that support was removed in r299463.

--Simplify the internal interface used by the bus_dmamap_load*
  variants and move it to bus_dma_internal.h

--Fix some drivers that directly include sys/bus_dma.h
  despite the recommendations of bus_dma(9)

Reviewed by:	kib (previous revision), marius
Differential Revision:	https://reviews.freebsd.org/D10729
2017-07-01 05:35:29 +00:00
Jung-uk Kim
fd90e2ed54 CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten
years for head.  However, it is continuously misused as the mpsafe argument
for callout_init(9).  Deprecate the flag and clean up callout_init() calls
to make them more consistent.

Differential Revision:	https://reviews.freebsd.org/D2613
Reviewed by:	jhb
MFC after:	2 weeks
2015-05-22 17:05:21 +00:00
Navdeep Parhar
a5eb009b49 cxgb: replace r273280 with a more comprehensive fix.
Poll for link state when the link is down, even for interrupt capable
PHYs.

Allow PHYs to report a dubious "partial" link.  If this state is seen 3
consecutive times (each check is ~1s apart) then reset the PHY.  This is
a workaround for a situation where repeatedly toggling the link from the
peer gets the AEL2005 PHY into a state where it never establishes a PCS
block lock even when everything is in order.

MFC after:	1 week
2015-01-11 07:51:58 +00:00
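
The "partial link" workaround described in a5eb009b49 above amounts to counting consecutive reports. A minimal sketch follows, assuming hypothetical names (enum link_state, struct link_monitor, link_poll()); the actual driver logic is not shown in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum link_state { LINK_DOWN, LINK_PARTIAL, LINK_UP };	/* hypothetical names */

#define PARTIAL_LIMIT 3	/* consecutive partial reports before a reset */

struct link_monitor {
	int partial_count;
};

/*
 * Called once per link check (about 1s apart in the commit's description).
 * Returns true when the PHY should be reset.
 */
static bool
link_poll(struct link_monitor *lm, enum link_state st)
{
	assert(lm != NULL);
	if (st != LINK_PARTIAL) {
		lm->partial_count = 0;	/* any other state breaks the streak */
		return (false);
	}
	if (++lm->partial_count < PARTIAL_LIMIT)
		return (false);
	lm->partial_count = 0;		/* start counting afresh after the reset */
	return (true);
}
```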
Navdeep Parhar
e26e6373c8 cxgb(4): implement if_get_counter.
2014-09-27 18:35:16 +00:00
Gleb Smirnoff
56b61ca27a Remove ifq_drops from struct ifqueue. Now queue drops are accounted in
struct ifnet if_oqdrops.

Some netgraph modules used ifqueue w/o ifnet. Accounting of queue drops
is simply removed from them. There was no API to read this statistic.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-09-19 09:01:19 +00:00
Hans Petter Selasky
af3b2549c4 Pull in r267961 and r267973 again. Fix for issues reported will follow.
2014-06-28 03:56:17 +00:00
Glen Barber
37a107a407 Revert r267961, r267973:
These changes prevent sysctl(8) from returning proper output,
such as:

 1) no output from sysctl(8)
 2) erroneously returning ENOMEM with tools like truss(1)
    or uname(1)
 truss: can not get etype: Cannot allocate memory
2014-06-27 22:05:21 +00:00
Hans Petter Selasky
3da1cf1e88 Extend the meaning of the CTLFLAG_TUN flag to automatically check if
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating the children of a static/extern SYSCTL
node. This operation should probably be factored out into a
common macro, since some device drivers use it. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading kludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.

Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection with removing unneeded
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, since malloc()/free() are
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.

MFC after:	2 weeks
Sponsored by:	Mellanox Technologies
2014-06-27 16:33:43 +00:00
Gleb Smirnoff
76039bc84f r48589 promised to remove the implicit inclusion of if_var.h soon. Prepare
for this event by adding if_var.h to files that need it. Also, add
all includes that are currently pulled in through implicit pollution via if_var.h.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-26 17:58:36 +00:00
Konstantin Belousov
5ada86640b Add dependencies on the firmware, which allows the loading of the cxgb
and cxgbe modules.

Reviewed and approved by:	np
MFC after:	1 week
2013-05-16 13:07:02 +00:00
Gleb Smirnoff
c6499eccad Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags in sys/dev.
2012-12-04 09:32:43 +00:00
Eitan Adler
db702c59cf remove duplicate semicolons where possible.
Approved by:	cperciva
MFC after:	1 week
2012-10-22 03:00:37 +00:00
Gavin Atkinson
389c8bd51e Align the PCI Express #defines with the style used for the PCI-X
#defines.  This also has the advantage that it makes the names more
compact, and also allows us to correct the non-uniform naming of
the PCIM_LINK_* defines, making them all consistent amongst themselves.

This is a mostly mechanical rename:
  s/PCIR_EXPRESS_/PCIER_/g
  s/PCIM_EXP_/PCIEM_/g
  s/PCIM_LINK_/PCIEM_LINK_/g

When this is MFC'd, #defines will be added for the old names to assist
out-of-tree drivers.

Discussed with:	jhb
MFC after:	1 week
2012-09-18 22:04:59 +00:00
John Baldwin
ec9a9cf1e0 Attach interrupt handlers during attach instead of during the first time
the interface is brought up.  Without this, the boot time interrupt
round-robin assignment does not think the allocated interrupt resources
are active and leaves them assigned to CPU 0.

While here, add descriptive tags to each interrupt handler when MSI-X
is used.

Reviewed by:	np
MFC after:	1 week
2012-08-30 17:47:39 +00:00
Navdeep Parhar
0a7049095f cxgb(4): IPv6 rx/tx hw checksum, IPv6 TSO and LRO too.
(Some parts already worked, this makes it complete).
2012-06-30 02:11:53 +00:00
Navdeep Parhar
09fe63205c - Updated TOE support in the kernel.
- Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs.
  These are available as t3_tom and t4_tom modules that augment cxgb(4)
  and cxgbe(4) respectively.  The cxgb/cxgbe drivers continue to work as
  usual with or without these extra features.

- iWARP driver for Terminator 3 ASIC (kernel verbs).  T4 iWARP is in the
  works and will follow soon.

Build-tested with make universe.

30s overview
============
What interfaces support TCP offload?  Look for TOE4 and/or TOE6 in the
capabilities of an interface:
# ifconfig -m | grep TOE

Enable/disable TCP offload on an interface (just like any other ifnet
capability):
# ifconfig cxgbe0 toe
# ifconfig cxgbe0 -toe

Which connections are offloaded?  Look for toe4 and/or toe6 in the
output of netstat and sockstat:
# netstat -np tcp | grep toe
# sockstat -46c | grep toe

Reviewed by:	bz, gnn
Sponsored by:	Chelsio communications.
MFC after:	~3 months (after 9.1, and after ensuring MFC is feasible)
2012-06-19 07:34:13 +00:00
Navdeep Parhar
3e7cc3cab3 Add IPv6 TSO (including TSO+VLAN) support to cxgb(4).
If an IPv6 packet has extension headers the kernel needs to deal with it
itself.  For the rest it can set various CSUM_XXX flags and the driver
will act on them.
2012-02-09 23:19:09 +00:00
Navdeep Parhar
c3286cd2b6 Allocate the BAR for userspace doorbells after the is_offload check
is functional.

MFC after:	3 days
2012-02-08 03:02:12 +00:00
Navdeep Parhar
65d43cc6e7 Remove if_start from cxgb and cxgbe.
Submitted by:	jhb
MFC after:	3 days
2012-02-07 07:32:39 +00:00
Marius Strobl
4b7ec27007 - There's no need to override the default device methods with the defaults.
  Interestingly, these have been the defaults for quite some time
  (bus_generic_driver_added(9) since r52045 and bus_generic_print_child(9)
  since r52045) but even recently added device drivers do this unnecessarily.
  Discussed with: jhb, marcel
- While at it, use DEVMETHOD_END.
  Discussed with: jhb
- Also while at it, use __FBSDID.
2011-11-22 21:28:20 +00:00
Navdeep Parhar
7eeb16cee7 t3_free_sge_resources should be given the number of qsets it needs to free.
MFC after:	1 week
2011-03-24 01:16:48 +00:00
John Baldwin
3b0a4aef96 Do a sweep of the tree replacing calls to pci_find_extcap() with calls to
pci_find_cap() instead.
2011-03-23 13:10:15 +00:00
Rebecca Cran
6bccea7c2b Fix typos - remove duplicate "the".
PR:	bin/154928
Submitted by:	Eitan Adler <lists at eitanadler.com>
MFC after: 	3 days
2011-02-21 09:01:34 +00:00
Matthew D Fleming
deceab8792 sysctl(9) cleanup checkpoint: amd64 GENERIC builds cleanly.
Commit the cxgb driver piece.
2011-01-12 19:53:44 +00:00
Navdeep Parhar
61cb6c9076 wakeup is required if the adapter lock is released anywhere during
init and not just for the may_sleep case.

Pointed out by:	Isilon
MFC after:	3 days
2010-08-15 20:34:51 +00:00
John Baldwin
b739a509f2 - Change the warning about PCI-e links narrower than x8 to only apply to
10G cards.  1G cards are x4 only.
- Use constants from pcireg.h for reading the current link width.
- Use pci_set_max_read_req() rather than implementing it by hand.

Reviewed by:	np
MFC after:	1 week
2010-07-26 17:31:15 +00:00
Navdeep Parhar
bd1a9fbad6 Improve cxgb(4)'s behaviour when faced with temporarily "bouncy" links:
- Run the adapter's tick at 1Hz and remove link state checks from it.
  Instead, have each port check its link state.  Delay the check so that
  it takes place slightly after the driver is notified of a change in
  link state.  This is a cheap way to debounce these notifications if
  many are received in rapid succession.  POLL_LINK_1ST_TIME flag can
  also be eliminated as a side effect of these changes.
- Do not reset the PHY when link goes down.
- Clear port's link_fault flag if the PHY indicates link is down.
- get_link_status_r should leave speed and duplex alone when link is down.

MFC after:	1 month
2010-07-09 00:38:00 +00:00
Navdeep Parhar
2c32b50248 Eliminate ext_intr_task. The "slow" interrupt handler is already
running on the adapter's task queue.  Just do what the task does
instead of enqueueing it.

MFC after:	3 days
2010-07-09 00:36:35 +00:00
Navdeep Parhar
29c54b85f9 Fix bufsize calculation so that cxgbtool can display information for the
last I/O queue too.

MFC after:	3 days
2010-07-09 00:35:09 +00:00
Navdeep Parhar
06eace6376 Make format string a string literal.
Reported by:	clang
2010-06-12 22:24:39 +00:00
Navdeep Parhar
3a2c6562f3 cxgb(4): add an 'nfilters' tunable that lets the user place an upper
limit on the number of hardware filters (and thus the amount of TCAM
reserved for filtering).
2010-06-07 08:23:16 +00:00
Navdeep Parhar
cb958aba98 Remove invalid assertion.
Holding the adapter lock while changing the LRO settings is sufficient.

PR:		kern/146759
MFC after:	3 days
2010-05-20 18:22:45 +00:00
Navdeep Parhar
b85998cb48 Do not hold the T3 firmware in memory all the time. firmware(9) can
load/unload it as needed.
2010-05-05 22:29:54 +00:00
Navdeep Parhar
d6da836201 Add support for hardware filters to cxgb(4). The T3 chip can inspect
L2/3/4 headers and can drop or steer packets as instructed.  Filtering
based on src ip, dst ip, src port, dst port, 802.1q, udp/tcp, and mac
addr is possible.  Add support in cxgbtool to program these filters.
Some simple examples:

Drop all tcp/80 traffic coming from the subnet specified.
# cxgbtool cxgb2 filter 0 sip 192.168.1.0/24 dport 80 type tcp action drop

Steer all incoming UDP traffic to qset 0.
# cxgbtool cxgb2 filter 1 type udp queue 0 action pass

Steer all tcp traffic from 192.168.1.1 to qset 1.
# cxgbtool cxgb2 filter 2 sip 192.168.1.1 type tcp queue 1 action pass

Drop fragments.
# cxgbtool cxgb2 filter 3 type frag action drop

List all filters.
# cxgbtool cxgb2 filter list
index         SIP                DIP     sport dport VLAN PRI P/MAC type Q
    0     192.168.1.0/24         0.0.0.0     *    80    0 0/1 */*    tcp -
    1         0.0.0.0/0          0.0.0.0     *     *    0 0/1 */*    udp 0
    2     192.168.1.1/32         0.0.0.0     *     *    0 0/1 */*    tcp 1
    3         0.0.0.0/0          0.0.0.0     *     *    0 0/1 */*   frag -
16367         0.0.0.0/0          0.0.0.0     *     *    0 0/1 */*      * *

MFC after:	2 weeks
2010-05-05 00:41:40 +00:00
Navdeep Parhar
2caefebb07 Add IFCAP_LINKSTATE to cxgb's capabilities.
MFC after:	3 days
2010-05-04 23:55:08 +00:00
Maxim Sobolev
e50d35e6c6 Add new tunable 'net.link.ifqmaxlen' to set default send interface
queue length. The default value for this parameter is 50, which is
quite low for many of today's uses and the only way to modify this
parameter right now is to edit if_var.h file. Also add read-only
sysctl with the same name, so that it's possible to retrieve the
current value.

MFC after:	1 month
2010-05-03 07:32:50 +00:00
Navdeep Parhar
489ca05be7 Increase response queue size to avoid starvation, add a counter
to track it when it does occur.
2010-04-02 17:50:52 +00:00
Navdeep Parhar
97ae3bc359 Multiple fixes related to queue set sizing and resources:
- Only the tunnelq (TXQ_ETH) requires a buf_ring, an ifq, and the watchdog/timer
  callouts.  Do not allocate these for the other tx queues.

- Use 16k jumbo clusters only on offload capable cards by default.

- Do not allocate a full tx ring for the offload queue if the card is not
  offload capable.

- Slightly better freelist size calculation.

- Fix nmbjumbo4 typo, remove unneeded global variables.

MFC after:	3 days
2010-03-31 00:27:49 +00:00
Navdeep Parhar
92f61ecb4b Fix tx drop statistics.
MFC after:	3 days
2010-03-31 00:26:02 +00:00
Navdeep Parhar
1d609d51f0 Do not attempt to retrieve interrupt information before it is available.
MFC after:	3 days
2010-03-31 00:22:58 +00:00
Navdeep Parhar
a9da6d239c Refresh the firmware version immediately after it is upgraded (or downgraded).
MFC after:	3 days
2010-03-31 00:19:39 +00:00
Navdeep Parhar
cd5c70b2ba Better TwinAx transceiver detection.
Originally submitted by: <Bruno dot Bittner at isilon dot com>
(This is a rewritten, corrected version of that patch)

MFC after:    1 week
2010-03-09 19:57:44 +00:00