Commit Graph

135 Commits

Author SHA1 Message Date
Krzysztof Galazka
20a52706c8 ixl(4): Add tunable to override Flow Control settings
Add flow_control to hw.ixl tunables tree to let override
initial flow control configuration for all interfaces.
Keep using configuration set by NVM by default.

Reviewed by:	erj@, gallatin@
Tested by:	gowtham.kumar.ks_intel.com
MFC after:	1 week
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D29338
2021-04-05 11:17:55 -07:00
Gordon Bergling
5666643a95 Fix some common typos in comments
- occured -> occurred
- normaly -> normally
- controling -> controlling
- fileds -> fields
- insterted -> inserted
- outputing -> outputting

MFC after:	1 week
2021-03-13 18:26:15 +01:00
Mark Johnston
ffe3def903 iflib: Make if_shared_ctx_t a pointer to const
This structure is shared among multiple instances of a driver, so we
should ensure that it doesn't somehow get treated as if there's a
separate instance per interface.  This is especially important for
software-only drivers like wg.

DEVICE_REGISTER() still returns a void * and so the per-driver sctx
structures are not yet defined with the const qualifier.

Reviewed by:	gallatin, erj
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D29102
2021-03-08 12:39:06 -05:00
Krzysztof Galazka
21802a127d ixl(4): Add ability to control link state on ifconfig down
Add sysctl link_active_on_if_down, which allows user to control
if interface is kept in active state when it is brought
down with ifconfig. Set it to enabled by default to preserve
backwards compatibility.

Reviewed by:	erj
Tested by:	gowtham.kumar.ks@intel.com
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D28028
2021-03-02 17:43:38 -08:00
Krzysztof Galazka
9f99061ef9 ixl(4): Report RX errors as sum of all RX error counters
HW keeps track of RX errors using several counters, each for
specific type of errors. Report RX errors to OS as sum
of all those counters: CRC errors, illegal bytes, checksum,
length, undersize, fragment, oversize and jabber errors.

There is no HW counter for frames with invalid L3/L4 checksums
so add a SW one.

Also add a "rx_errors" sysctl with a copy of netstat IERRORS
counter value to make it easier accessible from scripts.

Reviewed By:	erj
Tested By:	gowtham.kumar.ks@intel.com
Sponsored By:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D27639
2021-03-02 17:37:04 -08:00
Krzysztof Galazka
7d4dceec10 ixl(4): Fix VLAN HW filtering
X700 family of controllers has limited number of available VLAN
HW filters. Driver did not handle properly a case when user
assigned more VLANs to the interface which had all filters
already in use. Fix that by disabling HW filtering when
it is impossible to create filters for all requested VLANs.
Keep track of registered VLANs using bitstring to be able
to re-enable HW filtering when number of requested VLANs
drops below the limit.

Also switch all allocations to use M_IXL malloc type
to ease detecting memory leaks in the driver.

Reviewed by:	erj
Tested by:	gowtham.kumar.ks@intel.com
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D28137
2021-02-04 15:33:42 -08:00
Sai Rajesh Tallamraju
38bfc6dee3 iflib: Free resources in a consistent order during detach
Memory and PCI resources are freed with no particular order.  This could
cause use-after-frees when detaching following a failed attach.  For
instance, iflib_tx_structures_free() frees ctx->ifc_txqs[] but
iflib_tqg_detach() attempts to access this array. Similarly, adapter
queues gets freed by IFDI_QUEUES_FREE() but IFDI_DETACH() attempts to
access adapter queues to free PCI resources.

MFC after:	2 weeks
Sponsored by:	NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D27634
2021-02-01 11:15:54 -05:00
Lutz Donnerhacke
fa6662b368 ixl: Permit 802.1ad frames to pass though the chip
This patch is a quick hack to change the internal Ethertype used
within the chip.  All frames with this type are dropped silently.
This patch allows you to overwrite the factory default 0x88a8, which
is used by IEEE 802.1ad VLAN stacking.

Reviewed by:	kp, philip, brueffer
Approved by:	kp (mentor)
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D24179
2021-01-19 16:01:09 +01:00
Matt Macy
81be655266 iflib: ensure that tx interrupts enabled and cleanups
Doing a 'dd' over iscsi will reliably cause stalls. Tx
cleaning _should_ reliably happen as data is sent.
However, currently if the transmit queue fills it will
wait until the iflib timer (hz/2) runs.

This change causes the the tx taskq thread to be run
if there are completed descriptors.

While here:

- make timer interrupt delay a sysctl

- simplify txd_db_check handling

- comment on INTR types

Background on the change:

Initially doorbell updates were minimized by only writing to the register
on every fourth packet. If txq_drain would return without writing to the
doorbell it scheduled a callout on the next tick to do the doorbell write
to ensure that the write otherwise happened "soon". At that time a sysctl
was added for users to avoid the potential added latency by simply writing
to the doorbell register on every packet. This worked perfectly well for
e1000 and ixgbe ... and appeared to work well on ixl. However, as it
turned out there was a race to this approach that would lockup the ixl MAC.
It was possible for a lower producer index to be written after a higher one.
On e1000 and ixgbe this was harmless - on ixl it was fatal. My initial
response was to add a lock around doorbell writes - fixing the problem but
adding an unacceptable amount of lock contention.

The next iteration was to use transmit interrupts to drive delayed doorbell
writes. If there were no packets in the queue all doorbell writes would be
immediate as the queue started to fill up we could delay doorbell writes
further and further. At the start of drain if we've cleaned any packets we
know we've moved the state machine along and we write the doorbell (an
obvious missing optimization was to skip that doorbell write if db_pending
is zero). This change required that tx interrupts be scheduled periodically
as opposed to just when the hardware txq was full. However, that just leads
to our next problem.

Initially dedicated msix vectors were used for both tx and rx. However, it
was often possible to use up all available vectors before we set up all the
queues we wanted. By having rx and tx share a vector for a given queue we
could halve the number of vectors used by a given configuration. The problem
here is that with this change only e1000 passed the necessary value to have
the fast interrupt drive tx when appropriate.

Reported by: mav@
Tested by: mav@
Reviewed by:    gallatin@
MFC after:      1 month
Sponsored by:   iXsystems
Differential Revision:  https://reviews.freebsd.org/D27683
2021-01-07 14:07:35 -08:00
Eric Joyner
2984a8dd7c ixl(4): Add support for X710-T*L devices
Add support for new devices which are capable of 2.5 and 5G speeds, as well as
Energy Efficient Ethernet (EEE):

- introduce new device ids
- add ability to select 2.5 and 5G speeds on devices which support it
- add sysctls to enable EEE and read related statistics

Submitted by:	Krzysztof Galazka <krzysztof.galazka@intel.com>
Reviewed by:	#IntelNetworking
MFC after:	3 days
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D25549
2020-09-01 23:16:38 +00:00
Li-Wen Hsu
fc4c42c9e3 Remove redeclaration found by gcc build
Reviewed by:	Jacob Keller <jacob.e.keller@intel.com>
Suggested editing from:	Krzysztof Galazka <krzysztof.galazka@intel.com>
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25954
2020-08-15 03:26:00 +00:00
Pawel Biernacki
bae599753b dev.ixl.<N>.debug: mark as MPSAFE
This node provides no handler, it's implicitly MPSAFE.

Reviewed by:	erj
Sponsored by:	Mysterious Code Ltd.
Differential Revision:	https://reviews.freebsd.org/D25408
2020-07-04 14:20:03 +00:00
Eric Joyner
b4a7ce0690 ixl(4): Add FW recovery mode support and other things
Update the iflib version of ixl driver based on the OOT version ixl-1.11.29.

Major changes:

- Extract iflib specific functions from ixl_pf_main.c to ixl_pf_iflib.c
  to simplify code sharing between legacy and iflib version of driver

- Add support for most recent FW API version (1.10), which extends FW
  LLDP Agent control by user to X722 devices

- Improve handling of device global reset

- Add support for the FW recovery mode

- Use virtchnl function to validate virtual channel messages instead of
  using separate checks

- Fix MAC/VLAN filters accounting

Submitted by:	Krzysztof Galazka <krzysztof.galazka@intel.com>
Reviewed by:	erj@
Tested by:	Jeffrey Pieper <jeffrey.e.pieper@intel.com>
MFC after:	1 week
Relnotes:	yes
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D24564
2020-06-09 22:42:54 +00:00
Eric Joyner
cf1509179c em/ix/ixv/ixl/iavf: Implement ifdi_needs_restart iflib method
Pursuant to r360398, implement driver-specific versions of the
ifdi_needs_restart iflib device method.

Some (if not most?) Intel network cards don't need reinitializing when a
VLAN is added or removed from the device hardware, so these implement
ifdi_needs_restart in a way that tell iflib not to bring the interface
up or down when a VLAN is added or removed, regardless of whether the
VLAN_HWFILTER interface capability flag is set or not.

This could potentially solve several PRs relating to link flaps that
occur when VLANs are added/removed to devices.

Signed-off-by: Eric Joyner <erj@freebsd.org>

PR:		240818, 241785
Reviewed by:	gallatin@, olivier@
MFC after:	3 days
MFC with:	r360398
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D24659
2020-05-11 17:42:04 +00:00
Leandro Lupori
c65f571c89 ixl: Add missing conversions from/to LE16
This fixes some errors on PPC64, during attach and when trying to assign an IP
to an interface.  With this change, basic operation of X710 NICs is now
possible.

This also fixes builds with IXL_DEBUG enabled

Reviewed by:	erj
Sponsored by:	Eldorado Research Institute (eldorado.org.br)
Differential Revision:	https://reviews.freebsd.org/D23975
2020-03-06 12:37:04 +00:00
Pawel Biernacki
20b91f0aa5 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (15 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
2020-02-24 10:51:26 +00:00
Ed Maste
5aa0576b33 Miscellaneous typo fixes
Submitted by:	Gordon Bergling <gbergling_gmail.com>
Differential Revision:	https://reviews.freebsd.org/D23453
2020-02-07 19:53:07 +00:00
Eric Joyner
ab43ce7a22 ixl: prevent non-privileged access to NVM update interface
Add a privilege check to the ixl_handle_nvmupd_cmd function, ensuring
that only privileged users are allowed to access the NVM update
interface.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>

Submitted by:	Jacob Keller <jacob.e.keller@intel.com>
Reported by:	markj@
Reviewed by:	markj@, erj@, jeffrey.e.pieper@intel.com
MFC after:	3 days
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D22870
2020-01-02 23:24:57 +00:00
Eric Joyner
087ea4103c Fix compile error introduced in r353658
"adapter" doesn't exist in ixl.

Reported by:	O. Hartmann <ohartmann@walstatt.org>
2019-10-16 18:12:22 +00:00
Eric Joyner
f17e0a71cd ixl: report whether device received pause frames
From Jake:
When updating the device statistics, report whether or not we have
received any pause frames to the iflib stack. This allows the iflib
stack to avoid generating a Tx hang message while the device is paused.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>

Submitted by:	Jacob Keller <jacob.e.keller@intel.com>
Reviewed by:	gallatin@
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D21870
2019-10-16 17:19:17 +00:00
Gleb Smirnoff
ba76aa6357 Convert if_foreach_llmaddr() KPI.
Reviewed by:	erj
2019-10-14 20:21:02 +00:00
Li-Wen Hsu
f1b0e65941 Add the missing braces to fix the code not guarded by the if clause and has
misleading indentation.  This is found by gcc -Wmisleading-indentation

Approved by:	erj
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20428
2019-05-30 20:42:36 +00:00
Eric Joyner
1b9d93948a iflib: expose the Rx mbuf buffer size to drivers
From Jake:
iflib_fl_setup calculates a suitable buffer size for the Rx mbufs based
on the isc_max_frame_size value that drivers setup. This calculation is
repeated by drivers when programming their hardware with the size of
each Rx buffer.

This can lead to a mismatch where the iflib mbuf size is different from
the expected size of the buffer as programmed by the hardware. This can
lead to unexpected results.

If iflib ever wants to support mbuf sizes larger than one page, every
driver must be updated to account for the new possible buffer sizes.

Fix this by calculating the mbuf size prior to calling IFDI_INIT, and
adding the iflib_get_rx_mbuf_sz function which will expose this value to
drivers, so that they do not repeat the same calculation.

Submitted by:	Jacob Keller <jacob.e.keller@intel.com>
Reviewed by:	shurd@, erj@
MFC after:	1 week
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D19489
2019-03-19 17:59:56 +00:00
Eric Joyner
af06fa2652 ixl: Fix panic caused by bug exposed by r344062
Don't use a struct if_irq for IFLIB_INTR_IOV type interrupts since that results
in get_core_offset() being called on them, and get_core_offset() doesn't
handle IFLIB_INTR_IOV type interrupts, which results in an assert() being triggered
in iflib_irq_set_affinity().

PR:		235730
Reported by:	Jeffrey Pieper <jeffrey.e.pieper@intel.com>
MFC after:	1 day
Sponsored by:	Intel Corporation
2019-02-14 18:02:37 +00:00
Marius Strobl
b97de13ae0 - Stop iflib(4) from leaking MSI messages on detachment by calling
bus_teardown_intr(9) before pci_release_msi(9).
- Ensure that iflib(4) and associated drivers pass correct RIDs to
  bus_release_resource(9) by obtaining the RIDs via rman_get_rid(9)
  on the corresponding resources instead of using the RIDs initially
  passed to bus_alloc_resource_any(9) as the latter function may
  change those RIDs. Solely em(4) for the ioport resource (but not
  others) and bnxt(4) were using the correct RIDs by caching the ones
  returned by bus_alloc_resource_any(9).
- Change the logic of iflib_msix_init() around to only map the MSI-X
  BAR if MSI-X is actually supported, i. e. pci_msix_count(9) returns
  > 0. Otherwise the "Unable to map MSIX table " message triggers for
  devices that simply don't support MSI-X and the user may think that
  something is wrong while in fact everything works as expected.
- Put some (mostly redundant) debug messages emitted by iflib(4)
  and em(4) during attachment under bootverbose. The non-verbose
  output of em(4) seen during attachment now is close to the one
  prior to the conversion to iflib(4).
- Replace various variants of spelling "MSI-X" (several in messages)
  with "MSI-X" as used in the PCI specifications.
- Remove some trailing whitespace from messages emitted by iflib(4)
  and change them to consistently start with uppercase.
- Remove some obsolete comments about releasing interrupts from
  drivers and correct a few others.

Reviewed by:	erj, Jacob Keller, shurd
Differential Revision:	https://reviews.freebsd.org/D18980
2019-01-30 13:21:26 +00:00
Eric Joyner
2de16a737f ixl(4): Fix handling data passed with ioctl from NVM update tool
From Krzysztof:

Ensure that the entire data buffer passed from the NVM update tool is copied in
to kernel space and copied back out to user space using copyin() and copyout().

PR:		234104
Submitted by:	Krzysztof Galazka <krzysztof.galazka@intel.com>
Reported by:	Finn <ixbug@riseup.net>
MFC after:	5 days
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D18817
2019-01-24 01:08:37 +00:00
Eric Joyner
088a0b27ad intel iflib drivers: correct initialization of tx_cidx_processed
From Jake:

In r341156 ("Fix first-packet completion", 2018-11-28) a hack to work
around a delta calculation determining how many descriptors were used
was added to ixl_isc_tx_credits_update_dwb.

The same fix was also applied to the em and igb drivers in r340310, and
to ix in r341156.

The hack checked the case where prev and cur were equal, and then added
one. This works, because by the time we do the delta check, we already
know there is at least one packet available, so the delta should be at
least one.

However, it's not a complete fix, and as indicated by the comment is
really a hack to work around the real bug.

The real problem is that the first time that we transmit a packet,
tx_cidx_processed will be set to point to the start of the ring.
Ultimately, the credits_update function expects it to point to the
*last* descriptor that was processed. Since we haven't yet processed any
descriptors, pointing it to 0 results in this incorrect calculation.

Fix the initialization code to have it point to the end of the ring
instead. One way to think about this, is that we are setting the value
to be one prior to the first available descriptor.

Doing so, corrects the delta calculation in all cases. The original fix
only works if the first packet has exactly one descriptor. Otherwise, we
will report 1 less than the correct value.

As part of this fix, also update the MPASS assertions to match the real
expectations. First, ensure that prev is not equal to cur, since this
should never happen. Second, remove the assertion about prev==0 || delta
!= 0. It looks like that originated from when the em driver was
converted to iflib. It seems like it was supposed to ensure that delta
was non-zero. However, because we originally returned 0 delta for the
first calculation, the "prev == 0" was tacked on.

Instead, replace this with a check that delta is greater than zero,
after the correction necessary when the ring pointers wrap around.

This new solution should fix the same bug as r341156 did, but in a more
robust way.

Submitted by:	Jacob Keller <jacob.e.keller@intel.com>
Reviewed by:	shurd@
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D18545
2019-01-24 01:03:00 +00:00
Stephen Hurd
e8e0ecb969 Fix first-packet completion
The first packet after the ring is initialized was never
completed as isc_txd_credits_update() would not include it in the
count of completed packets. This caused netmap to never complete
a batch. See PR 233022 for more details.

This is the same fix as the r340310 for e1000

PR:		233607
Reported by:	lev
Reviewed by:	lev
MFC after:	3 days
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D18368
2018-11-28 17:37:17 +00:00
Eric Joyner
37761e2eda ixl/iavf(4): Fix TSO offloads when TXCSUM is disabled
From Jake:
The iflib stack does not disable TSO automatically when TXCSUM is
disabled, instead assuming that the driver will correctly handle TSOs
even when CSUM_IP is not set.

This results in iflib calling ixl_isc_txd_encap with packets which have
CSUM_IP_TSO, but do not have CSUM_IP or CSUM_IP_TCP set. Because of
this, ixl_tx_setup_offload will not setup the IPv4 checksum offloading.

This results in bad TSO packets being sent if a user disables TXCSUM
without disabling TSO.

Fix this by updating the ixl_tx_setup_offload function to check both
CSUM_IP and CSUM_IP_TSO when deciding whether to enable IPv4 checksums.

Once this is corrected, another issue for TSO packets is revealed. The
driver sets IFLIB_NEED_ZERO_CSUM in order to enable a work around that
causes the ip->sum field to be zero'd. This is necessary for ixl
hardware to correctly perform TSOs.

However, if TXCSUM is disabled, then the work around is not enabled, as
CSUM_IP will not be set when the iflib stack checks to see if it should
clear the sum field.

Fix this by adding IFLIB_TSO_INIT_IP to the iflib flags for the iavf and
ixl interface files.

It is uncertain if the hardware needs IFLIB_NEED_ZERO_CSUM for any other
case besides TSO, so leave that flag assigned. It may be worth
investigating to see if this work around flag could be disabled in
a future change.

Once both of these changes are made, the ixl driver should correctly
offload TSO packets when TSO4 offload is enabled, regardless of whether
TXCSUM is enabled or disabled.

Submitted by:	Jacob Keller <jacob.e.keller@intel.com>
Reviewed by:	erj@, shurd@
MFC after:	0 days
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D17900
2018-11-08 19:10:43 +00:00
Eric Joyner
5e6652144d ixl/iavf(4): Update remaining references of "num_queues" to "num_rx_queues"
This should fix a build issue when "options RSS" is set.

Reported by:	bz@
Sponsored by:	Intel Corporation
2018-11-01 17:29:14 +00:00
Conrad Meyer
1f71b44df7 ixl/iavf(4): Fix GCC 6.4.0 build
Don't define redundant prototypes.

Sponsored by:	Dell EMC Isilon
2018-10-20 18:00:12 +00:00
Eric Joyner
3f74c0272e iavf(4): Finish rename/rebrand internally
Rename functions and variables from ixlv to iavf to match the
user-facing name change. There shouldn't be any functional changes
with this change, but this may help with browsing the source code
and reducing diffs in the future.

Submitted by:   kbowling@
Reviewed by:    erj@, sbruno@
Approved by:	re (gjb@)
Differential Revision:  https://reviews.freebsd.org/D17544
2018-10-15 17:23:41 +00:00
Eric Joyner
77c1fcec91 ixl/iavf(4): Change ixlv to iavf and update it to use iflib(9)
Finishes the conversion of the 40Gb Intel Ethernet drivers to iflib(9) for
FreeBSD 12.0, and fixes numerous bugs in both ixl(4) and iavf(4).

This commit also re-adds the VF driver to GENERIC since it now compiles and
functions.

The VF driver name was changed from ixlv(4) to iavf(4) because the VF driver is
now intended to be used with future products, not just with Fortville/Fort Park
VFs.

A man page update that documents these drivers is forthcoming in a separate
commit.

Reviewed by:    sbruno@, kbowling@
Tested by:      jeffrey.e.pieper@intel.com
Approved by:	re (gjb@)
Relnotes:       yes
Sponsored by:   Intel Corporation
Differential Revision: https://reviews.freebsd.org/D16429
2018-10-12 22:40:54 +00:00
Warner Losh
0dc34160f3 Add PNP info to PCI attachments of cbb, cxgb, ida, iwn, ixl, ixlv,
mfi, mps, mpr, mvs, my, oce, pcn, ral, rl. This only labels existing
pci device tables, and has no probe / attach code changes.

Reviewed by: imp, chuck
Submitted by: Lakhan Shiva Kamireddy <lakhanshiva@gmail.com>
Sponsored by: Google, Inc. (GSoC 2018)
Approved by: re (glen)
2018-09-26 17:12:30 +00:00
Marius Strobl
6daf59aa79 Remove a duplicated interface capability bit missed in r336313. 2018-08-23 18:11:55 +00:00
Marius Strobl
7f87c0406d Assorted TSO fixes for em(4)/iflib(9) and dead code removal:
- Ever since the workaround for the silicon bug of TSO4 causing MAC hangs
  was committed in r295133, CSUM_TSO always got disabled unconditionally
  by em(4) on the first invocation of em_init_locked(). However, even with
  that problem fixed, it turned out that for at least e. g. 82579 not all
  necessary TSO workarounds are in place, still causing MAC hangs even at
  Gigabit speed. Thus, for stable/11, TSO usage was deliberately disabled
  in r323292 (r323293 for stable/10) for the EM-class by default, allowing
  users to turn it on if it happens to work with their particular EM MAC
  in a Gigabit-only environment.
  In head, the TSO workaround for speeds other than Gigabit was lost with
  the conversion to iflib(9) in r311849 (possibly along with another one
  or two TSO workarounds). Yet at the same time, for EM-class MACs TSO4
  got enabled by default again, causing device hangs. Therefore, change the
  default for this hardware class back to have TSO4 off, allowing users
  to turn it on manually if it happens to work in their environment as
  we do in stable/{10,11}. An alternative would be to add a whitelist of
  EM-class devices where TSO4 actually is reliable with the workarounds in
  place, but given that the advantage of TSO at Gigabit speed is rather
  limited - especially with the overhead of these workarounds -, that's
  really not worth it. [1]
  This change includes the addition of an isc_capabilities to struct
  if_softc_ctx so iflib(9) can also handle interface capabilities that
  shouldn't be enabled by default which is used to handle the default-off
  capabilities of e1000 as suggested by shurd@ and moving their handling
  from em_setup_interface() to em_if_attach_pre() accordingly.
- Although 82543 support TSO4 in theory, the former lem(4) didn't have
  support for TSO4, presumably because TSO4 is even more broken in the
  LEM-class of MACs than the later EM ones. Still, TSO4 for LEM-class
  devices was enabled as part of the conversion to iflib(9) in r311849,
  causing device hangs. So revert back to the pre-r311849 behavior of
  not supporting TSO4 for LEM-class at all, which includes not creating
  a TSO DMA tag in iflib(9) for devices not having IFCAP_TSO4 set. [2]
- In fact, the FreeBSD TCP stack can handle a TSO size of IP_MAXPACKET
  (65535) rather than FREEBSD_TSO_SIZE_MAX (65518). However, the TSO
  DMA must have a maxsize of the maximum TSO size plus the size of a
  VLAN header for software VLAN tagging. The iflib(9) converted em(4),
  thus, first correctly sets scctx->isc_tx_tso_size_max to EM_TSO_SIZE
  in em_if_attach_pre(), but later on overrides it with IP_MAXPACKET
  in em_setup_interface() (apparently, left-over from pre-iflib(9)
  times). So remove the later and correct iflib(9) to correctly cap
  the maximum TSO size reported to the stack at IP_MAXPACKET. While at
  it, let iflib(9) use if_sethwtsomax*().
  This change includes the addition of isc_tso_max{seg,}size DMA engine
  constraints for the TSO DMA tag to struct if_shared_ctx and letting
  iflib_txsd_alloc() automatically adjust the maxsize of that tag in case
  IFCAP_VLAN_MTU is supported as requested by shurd@.
- Move the if_setifheaderlen(9) call for adjusting the maximum Ethernet
  header length from {ixgbe,ixl,ixlv,ixv,em}_setup_interface() to iflib(9)
  so adjustment is automatically done in case IFCAP_VLAN_MTU is supported.
  As a consequence, this adjustment now is also done in case of bnxt(4)
  which missed it previously.
- Move the reduction of the maximum TSO segment count reported to the
  stack by the number of m_pullup(9) calls (which in the worst case,
  can add another mbuf and, thus, the requirement for another DMA
  segment each) in the transmit path for performance reasons from
  em_setup_interface() to iflib_txsd_alloc() as these pull-ups are now
  done in iflib_parse_header() rather than in the no longer existing
  em_xmit(). Moreover, this optimization applies to all drivers using
  iflib(9) and not just em(4); all in-tree iflib(9) consumers still
  have enough room to handle full size TSO packets. Also, reduce the
  adjustment to the maximum number of m_pullup(9)'s now performed in
  iflib_parse_header().
- Prior to the conversion of em(4)/igb(4)/lem(4) and ixl(4) to iflib(9)
  in r311849 and r335338 respectively, these drivers didn't enable
  IFCAP_VLAN_HWFILTER by default due to VLAN events not being passed
  through by lagg(4). With iflib(9), IFCAP_VLAN_HWFILTER was turned on
  by default but also lagg(4) was fixed in that regard in r203548. So
  just remove the now redundant and defunct IFCAP_VLAN_HWFILTER handling
  in {em,ixl,ixlv}_setup_interface().
- Nuke other redundant IFCAP_* setting in {em,ixl,ixlv}_setup_interface()
  which is (more completely) already done in {em,ixl,ixlv}_if_attach_pre()
  now.
- Remove some redundant/dead setting of scctx->isc_tx_csum_flags in
  em_if_attach_pre().
- Remove some IFCAP_* duplicated either directly or indirectly (e. g.
  via IFCAP_HWCSUM) in {EM,IGB,IXL}_CAPS.
- Don't bother to fiddle with IFCAP_HWSTATS in ixgbe(4)/ixgbev(4) as
  iflib(9) adds that capability unconditionally.
- Remove some unused macros from em(4).
- Bump __FreeBSD_version as some of the above changes require the modules
  of drivers using iflib(9) to be recompiled.

Okayed by:	sbruno@ at 201806 DevSummit Transport Working Group [1]
Reviewed by:	sbruno (earlier version), erj
PR:	219428 (part of; comment #10) [1], 220997 (part of; comment #3) [2]
Differential Revision:	https://reviews.freebsd.org/D15720
2018-07-15 19:04:23 +00:00
Marius Strobl
52a05efab9 As suggested by a comment in ixl_initialize_vsi(), use if_getcapenable(9)
instead of directly interrogating ifp->if_capenable.

Reviewed by:	erj (ixl_initialize_vsi())
Differential Revision:	https://reviews.freebsd.org/D15720 (part of)
2018-07-15 18:02:50 +00:00
Eric Joyner
c9da8d8beb ixl(4): Set baudrate on link up using proper link_speed variable
And remove old, now-completely unused link_speed variable.

Reported by:	Jacob Keller <jacob.e.keller@intel.com>
MFC after:	1 month
2018-07-12 17:42:36 +00:00
Eric Joyner
94c86dd038 ixl(4): Fix gcc build errors
By removing redundant function declarations.

Reported by:	ci.freebsd.org via Mark Millard <marklmi@yahoo.com>
MFC after:	1 month
2018-06-20 22:16:46 +00:00
Sean Bruno
d6c579b29f Remove "diff" line indicator. Next to see if this code works or not.
Submitted by:	mmacy
Sponsored by:	Limelight Networks
2018-06-19 15:55:21 +00:00
Eric Joyner
f4cc2d1710 ixl(4): Update version number to 2.0.0-k
And update copyrights to current year.

MFC after:	1 month
Sponsored by:	Intel Corporation
2018-06-18 20:32:53 +00:00
Eric Joyner
1031d839aa ixl(4): Update to use iflib
Update the driver to use iflib in order to bring performance,
maintainability, and (hopefully) stability benefits to the driver.

The driver currently isn't completely ported; features that are missing:

- VF driver (ixlv)
- SR-IOV host support
- RDMA support

The plan is to have these re-added to the driver before the next FreeBSD release.

Reviewed by:	gallatin@
Contributions by: gallatin@, mmacy@, krzysztof.galazka@intel.com
Tested by:	jeffrey.e.pieper@intel.com
MFC after:	1 month
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D15577
2018-06-18 20:12:54 +00:00
Matt Macy
d7c5a620e2 ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
4.98   0.00   4.42   0.00 4235592     33   83.80 4720653 2149771   1235 247.32
4.73   0.00   4.20   0.00 4025260     33   82.99 4724900 2139833   1204 247.32
4.72   0.00   4.20   0.00 4035252     33   82.14 4719162 2132023   1264 247.32
4.71   0.00   4.21   0.00 4073206     33   83.68 4744973 2123317   1347 247.32
4.72   0.00   4.21   0.00 4061118     33   80.82 4713615 2188091   1490 247.32
4.72   0.00   4.21   0.00 4051675     33   85.29 4727399 2109011   1205 247.32
4.73   0.00   4.21   0.00 4039056     33   84.65 4724735 2102603   1053 247.32

After the patch

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
5.43   0.00   4.20   0.00 3313143     33   84.96 5434214 1900162   2656 245.51
5.43   0.00   4.20   0.00 3308527     33   85.24 5439695 1809382   2521 245.51
5.42   0.00   4.19   0.00 3316778     33   87.54 5416028 1805835   2256 245.51
5.42   0.00   4.19   0.00 3317673     33   90.44 5426044 1763056   2332 245.51
5.42   0.00   4.19   0.00 3314839     33   88.11 5435732 1792218   2499 245.52
5.44   0.00   4.19   0.00 3293228     33   91.84 5426301 1668597   2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15366
2018-05-18 20:13:34 +00:00
Eric Joyner
ceebc2f348 ixl(4): Update to 1.9.9-k
Refresh upstream driver before impending conversion to iflib.

Major changes:

- Support for descriptor writeback mode (required by ixlv(4) for AVF support)
- Ability to disable firmware LLDP agent by user (PR 221530)
- Fix for TX queue hang when using TSO (PR 221919)
- Separate descriptor ring sizes for TX and RX rings

PR:		221530, 221919
Submitted by:	Krzysztof Galazka <krzysztof.galazka@intel.com>
Reviewed by:	#IntelNetworking
MFC after:	1 day
Relnotes:	Yes
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D14985
2018-05-01 18:50:12 +00:00
Vincenzo Maffione
2ff91c175e netmap: align codebase to the current upstream (commit id 3fb001303718146)
Changelist:
    - Turn tx_rings and rx_rings arrays into arrays of pointers to kring
      structs. This patch includes fixes for ixv, ixl, ix, re, cxgbe, iflib,
      vtnet and ptnet drivers to cope with the change.
    - Generalize the nm_config() callback to accept a struct containing many
      parameters.
    - Introduce NKR_FAKERING to support buffers sharing (used for netmap
      pipes)
    - Improved API for external VALE modules.
    - Various bug fixes and improvements to the netmap memory allocator,
      including support for externally (userspace) allocated memory.
    - Refactoring of netmap pipes: now linked rings share the same netmap
      buffers, with a separate set of kring pointers (rhead, rcur, rtail).
      Buffer swapping does not need to happen anymore.
    - Large refactoring of the control API towards an extensible solution;
      the goal is to allow the addition of more commands and extension of
      existing ones (with new options) without the need of hacks or the
      risk of running out of configuration space.
      A new NIOCCTRL ioctl has been added to handle all the requests of the
      new control API, which cover all the functionalities so far supported.
      The netmap API bumps from 11 to 12 with this patch. Full backward
      compatibility is provided for the old control command (NIOCREGIF), by
      means of a new netmap_legacy module. Many parts of the old netmap.h
      header has now been moved to netmap_legacy.h (included by netmap.h).

Approved by:	hrs (mentor)
2018-04-12 07:20:50 +00:00
Brooks Davis
541d96aaaf Use an accessor function to access ifr_data.
This fixes 32-bit compat (no ioctl command defintions are required
as struct ifreq is the same size).  This is believed to be sufficent to
fully support ifconfig on 32-bit systems.

Reviewed by:	kib
Obtained from:	CheriBSD
MFC after:	1 week
Relnotes:	yes
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14900
2018-03-30 18:50:13 +00:00
Pedro F. Giffuni
ac2fffa4b7 Revert r327828, r327949, r327953, r328016-r328026, r328041:
Uses of mallocarray(9).

The use of mallocarray(9) has rocketed the required swap to build FreeBSD.
This is likely caused by the allocation size attributes which put extra pressure
on the compiler.

Given that most of these checks are superfluous we have to choose better
where to use mallocarray(9). We still have more uses of mallocarray(9) but
hopefully this is enough to bring swap usage to a reasonable level.

Reported by:	wosch
PR:		225197
2018-01-21 15:42:36 +00:00
Pedro F. Giffuni
efaa3e0789 dev/(e1000,ixl): Make some use of mallocarray(9).
Reviewed by:	erj
Differential Revision: https://reviews.freebsd.org/D13833
2018-01-11 15:25:26 +00:00
Sepherosa Ziehau
03b04fd4f3 ixl: Fix mbuf hash type settings.
IPV6_EXs in RSS never mean fragment.  They mean:
"- Home address from the home address option in the IPv6 destination
   options header.  If the extension header is not present, use the
   Source IPv6 Address.
 - IPv6 address that is contained in the Routing-Header-Type-2 from
   the associated extension header.  If the extension header is not
   present, use the Destination IPv6 Address."

UDP_IPV4_EX is an invalid RSS hash type, which will be removed.

Quoted from:
https://docs.microsoft.com/en-us/windows-hardware/drivers/network/rss-hashing-types#ndishashipv6ex

Reviewed by:	erj
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12450
2017-09-27 05:59:54 +00:00
Sean Bruno
b7e0bde0af Drop IXL RX lock during TCP_LRO, fixes LOR mahem while holding the RX
queue lock when the uppoer stack is called inside TCP_LRO

Submitted by:	Kevin Bowling <kevin.bowling@kev009.com>
Reviewed by:	erj
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D11724
2017-07-27 23:01:07 +00:00