Doing a 'dd' over iscsi will reliably cause stalls. Tx
cleaning _should_ reliably happen as data is sent.
However, currently if the transmit queue fills it will
wait until the iflib timer (hz/2) runs.
This change causes the the tx taskq thread to be run
if there are completed descriptors.
While here:
- make timer interrupt delay a sysctl
- simplify txd_db_check handling
- comment on INTR types
Background on the change:
Initially doorbell updates were minimized by only writing to the register
on every fourth packet. If txq_drain would return without writing to the
doorbell it scheduled a callout on the next tick to do the doorbell write
to ensure that the write otherwise happened "soon". At that time a sysctl
was added for users to avoid the potential added latency by simply writing
to the doorbell register on every packet. This worked perfectly well for
e1000 and ixgbe ... and appeared to work well on ixl. However, as it
turned out there was a race to this approach that would lockup the ixl MAC.
It was possible for a lower producer index to be written after a higher one.
On e1000 and ixgbe this was harmless - on ixl it was fatal. My initial
response was to add a lock around doorbell writes - fixing the problem but
adding an unacceptable amount of lock contention.
The next iteration was to use transmit interrupts to drive delayed doorbell
writes. If there were no packets in the queue all doorbell writes would be
immediate as the queue started to fill up we could delay doorbell writes
further and further. At the start of drain if we've cleaned any packets we
know we've moved the state machine along and we write the doorbell (an
obvious missing optimization was to skip that doorbell write if db_pending
is zero). This change required that tx interrupts be scheduled periodically
as opposed to just when the hardware txq was full. However, that just leads
to our next problem.
Initially dedicated msix vectors were used for both tx and rx. However, it
was often possible to use up all available vectors before we set up all the
queues we wanted. By having rx and tx share a vector for a given queue we
could halve the number of vectors used by a given configuration. The problem
here is that with this change only e1000 passed the necessary value to have
the fast interrupt drive tx when appropriate.
Reported by: mav@
Tested by: mav@
Reviewed by: gallatin@
MFC after: 1 month
Sponsored by: iXsystems
Differential Revision: https://reviews.freebsd.org/D27683
Coverity flagged this condition: The condition
offset == 0 && offset == 65535
can never be true because offset cannot be equal
to two different values at the same time.
Submitted by: bret_ketchum@dell.com
Reported by: Coverity
Reviewed by: tsoome, cem
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D26144
Pursuant to r360398, implement driver-specific versions of the
ifdi_needs_restart iflib device method.
Some (if not most?) Intel network cards don't need reinitializing when a
VLAN is added or removed from the device hardware, so these implement
ifdi_needs_restart in a way that tell iflib not to bring the interface
up or down when a VLAN is added or removed, regardless of whether the
VLAN_HWFILTER interface capability flag is set or not.
This could potentially solve several PRs relating to link flaps that
occur when VLANs are added/removed to devices.
Signed-off-by: Eric Joyner <erj@freebsd.org>
PR: 240818, 241785
Reviewed by: gallatin@, olivier@
MFC after: 3 days
MFC with: r360398
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D24659
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
All of these uses of sizeof() were on the wrong type in relation to the pointer
passed to SYSCTL_ADD_PROC as arg1. Fortunately, none of the handlers actually
use arg2. So just don't pass a (non-zero) arg2.
Reported by: Coverity
CID: 1007701
It is reported that those VFs share their RSS configuration with PF and,
thus, they cannot be configured independently.
Also:
- add missing opt_rss.h to if_ixv.c, otherwise RSS kernel option could
not be seen
- do not enable IXGBE_FEATURE_RSS on the older VFs
- set flowid / hash type to M_HASHTYPE_NONE or M_HASHTYPE_OPAQUE_HASH
(based on what the hardware reports) if IXGBE_FEATURE_RSS is not set
Reviewed by: nobody
MFC after: 4 weeks
Sponsored by: Panzura
Differential Revision: https://reviews.freebsd.org/D21705
From Jake:
Notify the iflib stack of whether we received any pause frames during
the timer window. This allows the stack to avoid reporting a Tx hang due
to the device being paused.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Submitted by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed by: gallatin@
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D21869
From Zach:
Intel documentation indicates that backplane X550EM_X KR devices do not
support Energy Efficient Ethernet. Prior to this patch, X552 devices
(device ID 0x15AB) will crash the system when transitioning EEE state
via sysctl.
Signed-off-by: Zach Vargas <zvargas@xes-inc.com>
PR: 240320
Submitted by: Zach Vargas <zvargas@xes-inc.com>
Reviewed by: erj@
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D21673
Instead of predicting the MSI-X bar index based on the device's MAC
type, read it from the device's PCI configuration instead.
PR: 239704
Submitted by: Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Reviewed by: erj@
MFC after: 3 days
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D21547
- Wrong order of casting and bit shift caused that enabling and disabling
queues didn't work properly for queues number larger than 32. Use literals
with right suffix instead.
- TX ring tail address was not updated during reinitiailzation of TX
structures. It could block sending traffic.
- Also remove unused variables 'eims' and 'active_queues'.
Submitted by: Krzysztof Galazka <krzysztof.galazka@intel.com>
Reviewed by: erj@
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D20826
Similar to r348026, exhaustive search for uses of CTRn() and cross reference
ktr.h includes. Where it was obvious that an OS compat header of some kind
included ktr.h indirectly, .c files were left alone. Some of these files
clearly got ktr.h via header pollution in some scenarios, or tinderbox would
not be passing prior to this revision, but go ahead and explicitly include it
in files using it anyway.
Like r348026, these CUs did not show up in tinderbox as missing the include.
Reported by: peterj (arm64/mp_machdep.c)
X-MFC-With: r347984
Sponsored by: Dell EMC Isilon
From Krzysztof:
The driver built as KLD cannot be unloaded, if this flag is not set.
Submitted by: Krzysztof Galazka <krzysztof.galazka@intel.com>
Reviewed by: shurd@, erj@
MFC after: 1 week
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D19402
From Jake:
iflib_fl_setup calculates a suitable buffer size for the Rx mbufs based
on the isc_max_frame_size value that drivers setup. This calculation is
repeated by drivers when programming their hardware with the size of
each Rx buffer.
This can lead to a mismatch where the iflib mbuf size is different from
the expected size of the buffer as programmed by the hardware. This can
lead to unexpected results.
If iflib ever wants to support mbuf sizes larger than one page, every
driver must be updated to account for the new possible buffer sizes.
Fix this by calculating the mbuf size prior to calling IFDI_INIT, and
adding the iflib_get_rx_mbuf_sz function which will expose this value to
drivers, so that they do not repeat the same calculation.
Submitted by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed by: shurd@, erj@
MFC after: 1 week
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D19489
r344162 exposed a bug in one of ixgbe's interrupt filters; they are never
supposed to return 0. Fix the interrupt filter to return the proper nonzero
return value.
Reported by: Oleg Ginzburg <olevole@olevole.ru>
MFC after: 1 week
Sponsored by: Intel Corporation
This patch and commit message are based on r340256 created by Jacob Keller:
The iflib stack does not disable TSO automatically when TXCSUM is
disabled, instead assuming that the driver will correctly handle TSOs
even when CSUM_IP is not set.
This results in iflib calling ixgbe_isc_txd_encap with packets which have
CSUM_IP_TSO, but do not have CSUM_IP or CSUM_IP_TCP set. Because of
this, ixgbe_tx_ctx_setup will not setup the IPv4 checksum offloading.
This results in bad TSO packets being sent if a user disables TXCSUM
without disabling TSO.
Fix this by updating the ixgbe_tx_ctx_setup function to check both
CSUM_IP and CSUM_IP_TSO when deciding whether to enable checksums.
Once this is corrected, another issue for TSO packets is revealed. The
driver sets IFLIB_NEED_ZERO_CSUM in order to enable a work around that
causes the ip->sum field to be zero'd. This is necessary for ix
hardware to correctly perform TSOs.
However, if TXCSUM is disabled, then the work around is not enabled, as
CSUM_IP will not be set when the iflib stack checks to see if it should
clear the sum field.
Fix this by adding IFLIB_TSO_INIT_IP to the iflib flags for the ix and
ixv interface files.
Once both of these changes are made, the ix and ixv drivers should
correctly offload TSO packets when TSO offload is enabled, regardless
of whether TXCSUM is enabled or disabled.
Submitted by: Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Reviewed by: IntelNetworking
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D18470
From Piotr:
This patch introduces adapter->task_requests register responsible for
recording requests for mod_task, msf_task, mbx_task, fdir_task and
phy_task calls. Instead of enqueueing these tasks with
GROUPTASK_ENQUEUE, handlers will be called directly from
ixgbe_if_update_admin_status() while holding ctx lock.
SIOCGIFXMEDIA ioctl() call reads adapter->media list. The list is
deleted and rewritten in ixgbe_handle_msf() task without holding ctx
lock. This change is needed to maintain data coherency when sharing
adapter info via ioctl() calls.
Patch co-authored by Krzysztof Galazka <krzysztof.galazka@intel.com>.
PR: 221317
Submitted by: Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Reviewed by: sbruno@, IntelNetworking
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D18468
iflib is already a module, but it is unconditionally compiled into the
kernel. There are drivers which do not need iflib(4), and there are
situations where somebody might not want iflib in kernel because of
using the corresponding driver as module.
Reviewed by: marius
Discussed with: erj
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D19041
bus_teardown_intr(9) before pci_release_msi(9).
- Ensure that iflib(4) and associated drivers pass correct RIDs to
bus_release_resource(9) by obtaining the RIDs via rman_get_rid(9)
on the corresponding resources instead of using the RIDs initially
passed to bus_alloc_resource_any(9) as the latter function may
change those RIDs. Solely em(4) for the ioport resource (but not
others) and bnxt(4) were using the correct RIDs by caching the ones
returned by bus_alloc_resource_any(9).
- Change the logic of iflib_msix_init() around to only map the MSI-X
BAR if MSI-X is actually supported, i. e. pci_msix_count(9) returns
> 0. Otherwise the "Unable to map MSIX table " message triggers for
devices that simply don't support MSI-X and the user may think that
something is wrong while in fact everything works as expected.
- Put some (mostly redundant) debug messages emitted by iflib(4)
and em(4) during attachment under bootverbose. The non-verbose
output of em(4) seen during attachment now is close to the one
prior to the conversion to iflib(4).
- Replace various variants of spelling "MSI-X" (several in messages)
with "MSI-X" as used in the PCI specifications.
- Remove some trailing whitespace from messages emitted by iflib(4)
and change them to consistently start with uppercase.
- Remove some obsolete comments about releasing interrupts from
drivers and correct a few others.
Reviewed by: erj, Jacob Keller, shurd
Differential Revision: https://reviews.freebsd.org/D18980
From Jake:
In r341156 ("Fix first-packet completion", 2018-11-28) a hack to work
around a delta calculation determining how many descriptors were used
was added to ixl_isc_tx_credits_update_dwb.
The same fix was also applied to the em and igb drivers in r340310, and
to ix in r341156.
The hack checked the case where prev and cur were equal, and then added
one. This works, because by the time we do the delta check, we already
know there is at least one packet available, so the delta should be at
least one.
However, it's not a complete fix, and as indicated by the comment is
really a hack to work around the real bug.
The real problem is that the first time that we transmit a packet,
tx_cidx_processed will be set to point to the start of the ring.
Ultimately, the credits_update function expects it to point to the
*last* descriptor that was processed. Since we haven't yet processed any
descriptors, pointing it to 0 results in this incorrect calculation.
Fix the initialization code to have it point to the end of the ring
instead. One way to think about this, is that we are setting the value
to be one prior to the first available descriptor.
Doing so, corrects the delta calculation in all cases. The original fix
only works if the first packet has exactly one descriptor. Otherwise, we
will report 1 less than the correct value.
As part of this fix, also update the MPASS assertions to match the real
expectations. First, ensure that prev is not equal to cur, since this
should never happen. Second, remove the assertion about prev==0 || delta
!= 0. It looks like that originated from when the em driver was
converted to iflib. It seems like it was supposed to ensure that delta
was non-zero. However, because we originally returned 0 delta for the
first calculation, the "prev == 0" was tacked on.
Instead, replace this with a check that delta is greater than zero,
after the correction necessary when the ring pointers wrap around.
This new solution should fix the same bug as r341156 did, but in a more
robust way.
Submitted by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed by: shurd@
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D18545
The recent gcc versions (7 and 8 at least) can check for switch case
statements for fall through (implicit-fallthrough). When fall through
is intentional, the default method for warning suppression is to place
comment /* FALLTHROUGH */ exactly before next case statement.
Differential Revision: https://reviews.freebsd.org/D18577
The first packet after the ring is initialized was never
completed as isc_txd_credits_update() would not include it in the
count of completed packets. This caused netmap to never complete
a batch. See PR 233022 for more details.
This is the same fix as the r340310 for e1000
PR: 233607
Reported by: lev
Reviewed by: lev
MFC after: 3 days
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D18368
- Fix assert/panic on receive when Jumbo Frames are enabled.
From the commit I made to ixl:
"It turns out that *_isc_rxd_available is supposed to return how many
packets are available to be cleaned on the rx ring. This patch removes
a section of code where if the budget argument is 1, the function would return
one if there was a descriptor available, not necessarily a packet.
This is okay in regular mtu 1500 traffic since the max frame size is less
than the configured receive buffer size (2048), but this doesn't work when
received packets can span more than one descriptor, as is the case when the
mtu is 9000 and the receive buffer size is 4096."
- Fix possible Tx hang because *_isc_txd_credits_update returns incorrect result
From the commit by Krzysztof Galazka to ixl: "Function isc_txd_update_credits
called with clear set to false should return 1 if there are TX descriptors
already handled by HW. It was always returning 0 causing troubles with UDP TX
traffic."
PR: 231659
Reported by: lev@
Approved by: re (gjb@)
Sponsored by: Intel Corporation
Remove unused and easy to misuse PNP macro parameter
Inspired by r338025, just remove the element size parameter to the
MODULE_PNP_INFO macro entirely. The 'table' parameter is now required to
have correct pointer (or array) type. Since all invocations of the macro
already had this property and the emitted PNP data continues to include the
element size, there is no functional change.
Mostly done with the coccinelle 'spatch' tool:
$ cat modpnpsize0.cocci
@normaltables@
identifier b,c;
expression a,d,e;
declarer MODULE_PNP_INFO;
@@
MODULE_PNP_INFO(a,b,c,d,
-sizeof(d[0]),
e);
@singletons@
identifier b,c,d;
expression a;
declarer MODULE_PNP_INFO;
@@
MODULE_PNP_INFO(a,b,c,&d,
-sizeof(d),
1);
$ rg -l MODULE_PNP_INFO -- sys | \
xargs spatch --in-place --sp-file modpnpsize0.cocci
(Note that coccinelle invokes diff(1) via a PATH search and expects diff to
tolerate the -B flag, which BSD diff does not. So I had to link gdiff into
PATH as diff to use spatch.)
Tinderbox'd (-DMAKE_JUST_KERNELS).
Approved by: re (glen)
From Piotr:
ix(4), ixv(4): Add VLAN tag strip check when receiving packets
ixv(4): Fix support for VLAN_HWTAGGING and VLAN_HWFILTER flags
This change will prevent driver from passing VLAN tags when
interface configuration is not expecting them. VF driver will
check for VLAN_HWTAGGING and VLAN_HWFILTER flags and act adequately.
This patch resolves problem occuring on EC2 platforms.
Submitted by: Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Reported by: cperciva@
Reviewed by: cperciva@, Intel Networking
Approved by: re
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D17061
I was not aware Warner was making or planning to make forward progress in
this area and have since been informed of that.
It's easy to apply/reapply when churn dies down.
Inspired by r338025, just remove the element size parameter to the
MODULE_PNP_INFO macro entirely. The 'table' parameter is now required to
have correct pointer (or array) type. Since all invocations of the macro
already had this property and the emitted PNP data continues to include the
element size, there is no functional change.
Mostly done with the coccinelle 'spatch' tool:
$ cat modpnpsize0.cocci
@normaltables@
identifier b,c;
expression a,d,e;
declarer MODULE_PNP_INFO;
@@
MODULE_PNP_INFO(a,b,c,d,
-sizeof(d[0]),
e);
@singletons@
identifier b,c,d;
expression a;
declarer MODULE_PNP_INFO;
@@
MODULE_PNP_INFO(a,b,c,&d,
-sizeof(d),
1);
$ rg -l MODULE_PNP_INFO -- sys | \
xargs spatch --in-place --sp-file modpnpsize0.cocci
(Note that coccinelle invokes diff(1) via a PATH search and expects diff to
tolerate the -B flag, which BSD diff does not. So I had to link gdiff into
PATH as diff to use spatch.)
Tinderbox'd (-DMAKE_JUST_KERNELS).
- Ever since the workaround for the silicon bug of TSO4 causing MAC hangs
was committed in r295133, CSUM_TSO always got disabled unconditionally
by em(4) on the first invocation of em_init_locked(). However, even with
that problem fixed, it turned out that for at least e. g. 82579 not all
necessary TSO workarounds are in place, still causing MAC hangs even at
Gigabit speed. Thus, for stable/11, TSO usage was deliberately disabled
in r323292 (r323293 for stable/10) for the EM-class by default, allowing
users to turn it on if it happens to work with their particular EM MAC
in a Gigabit-only environment.
In head, the TSO workaround for speeds other than Gigabit was lost with
the conversion to iflib(9) in r311849 (possibly along with another one
or two TSO workarounds). Yet at the same time, for EM-class MACs TSO4
got enabled by default again, causing device hangs. Therefore, change the
default for this hardware class back to have TSO4 off, allowing users
to turn it on manually if it happens to work in their environment as
we do in stable/{10,11}. An alternative would be to add a whitelist of
EM-class devices where TSO4 actually is reliable with the workarounds in
place, but given that the advantage of TSO at Gigabit speed is rather
limited - especially with the overhead of these workarounds -, that's
really not worth it. [1]
This change includes the addition of an isc_capabilities to struct
if_softc_ctx so iflib(9) can also handle interface capabilities that
shouldn't be enabled by default which is used to handle the default-off
capabilities of e1000 as suggested by shurd@ and moving their handling
from em_setup_interface() to em_if_attach_pre() accordingly.
- Although 82543 support TSO4 in theory, the former lem(4) didn't have
support for TSO4, presumably because TSO4 is even more broken in the
LEM-class of MACs than the later EM ones. Still, TSO4 for LEM-class
devices was enabled as part of the conversion to iflib(9) in r311849,
causing device hangs. So revert back to the pre-r311849 behavior of
not supporting TSO4 for LEM-class at all, which includes not creating
a TSO DMA tag in iflib(9) for devices not having IFCAP_TSO4 set. [2]
- In fact, the FreeBSD TCP stack can handle a TSO size of IP_MAXPACKET
(65535) rather than FREEBSD_TSO_SIZE_MAX (65518). However, the TSO
DMA must have a maxsize of the maximum TSO size plus the size of a
VLAN header for software VLAN tagging. The iflib(9) converted em(4),
thus, first correctly sets scctx->isc_tx_tso_size_max to EM_TSO_SIZE
in em_if_attach_pre(), but later on overrides it with IP_MAXPACKET
in em_setup_interface() (apparently, left-over from pre-iflib(9)
times). So remove the later and correct iflib(9) to correctly cap
the maximum TSO size reported to the stack at IP_MAXPACKET. While at
it, let iflib(9) use if_sethwtsomax*().
This change includes the addition of isc_tso_max{seg,}size DMA engine
constraints for the TSO DMA tag to struct if_shared_ctx and letting
iflib_txsd_alloc() automatically adjust the maxsize of that tag in case
IFCAP_VLAN_MTU is supported as requested by shurd@.
- Move the if_setifheaderlen(9) call for adjusting the maximum Ethernet
header length from {ixgbe,ixl,ixlv,ixv,em}_setup_interface() to iflib(9)
so adjustment is automatically done in case IFCAP_VLAN_MTU is supported.
As a consequence, this adjustment now is also done in case of bnxt(4)
which missed it previously.
- Move the reduction of the maximum TSO segment count reported to the
stack by the number of m_pullup(9) calls (which in the worst case,
can add another mbuf and, thus, the requirement for another DMA
segment each) in the transmit path for performance reasons from
em_setup_interface() to iflib_txsd_alloc() as these pull-ups are now
done in iflib_parse_header() rather than in the no longer existing
em_xmit(). Moreover, this optimization applies to all drivers using
iflib(9) and not just em(4); all in-tree iflib(9) consumers still
have enough room to handle full size TSO packets. Also, reduce the
adjustment to the maximum number of m_pullup(9)'s now performed in
iflib_parse_header().
- Prior to the conversion of em(4)/igb(4)/lem(4) and ixl(4) to iflib(9)
in r311849 and r335338 respectively, these drivers didn't enable
IFCAP_VLAN_HWFILTER by default due to VLAN events not being passed
through by lagg(4). With iflib(9), IFCAP_VLAN_HWFILTER was turned on
by default but also lagg(4) was fixed in that regard in r203548. So
just remove the now redundant and defunct IFCAP_VLAN_HWFILTER handling
in {em,ixl,ixlv}_setup_interface().
- Nuke other redundant IFCAP_* setting in {em,ixl,ixlv}_setup_interface()
which is (more completely) already done in {em,ixl,ixlv}_if_attach_pre()
now.
- Remove some redundant/dead setting of scctx->isc_tx_csum_flags in
em_if_attach_pre().
- Remove some IFCAP_* duplicated either directly or indirectly (e. g.
via IFCAP_HWCSUM) in {EM,IGB,IXL}_CAPS.
- Don't bother to fiddle with IFCAP_HWSTATS in ixgbe(4)/ixgbev(4) as
iflib(9) adds that capability unconditionally.
- Remove some unused macros from em(4).
- Bump __FreeBSD_version as some of the above changes require the modules
of drivers using iflib(9) to be recompiled.
Okayed by: sbruno@ at 201806 DevSummit Transport Working Group [1]
Reviewed by: sbruno (earlier version), erj
PR: 219428 (part of; comment #10) [1], 220997 (part of; comment #3) [2]
Differential Revision: https://reviews.freebsd.org/D15720
events are passed through by lagg(4) ever since r203548. Deactivation of
this capability by default due to lagg(4) was already not done for ixgbev(4)
and has been - although inadvertently - broken when em(4)/igb(4)/lem(4) and
ixl(4) were converted to iflib(9) in r311849 and r335338 respectively.
Reviewed by: erj
Differential Revision: https://reviews.freebsd.org/D15720 (part of)
When ixgbe was converted to iflib, it lost the SIOCGI2C support
that allows ifconfig to print SFP state, optical light levels, etc.
Restore this by plugging in to the ifdi_i2c_req iflib method. Note
that the sanity checking on dev_addr that used to be done in ixgbe is
now done in iflib.
Reviewed by: erj, Matthew Macy <mmacy@mattmacy.io>
Sponsored by: Netflix
Changelist:
- Turn tx_rings and rx_rings arrays into arrays of pointers to kring
structs. This patch includes fixes for ixv, ixl, ix, re, cxgbe, iflib,
vtnet and ptnet drivers to cope with the change.
- Generalize the nm_config() callback to accept a struct containing many
parameters.
- Introduce NKR_FAKERING to support buffers sharing (used for netmap
pipes)
- Improved API for external VALE modules.
- Various bug fixes and improvements to the netmap memory allocator,
including support for externally (userspace) allocated memory.
- Refactoring of netmap pipes: now linked rings share the same netmap
buffers, with a separate set of kring pointers (rhead, rcur, rtail).
Buffer swapping does not need to happen anymore.
- Large refactoring of the control API towards an extensible solution;
the goal is to allow the addition of more commands and extension of
existing ones (with new options) without the need of hacks or the
risk of running out of configuration space.
A new NIOCCTRL ioctl has been added to handle all the requests of the
new control API, which cover all the functionalities so far supported.
The netmap API bumps from 11 to 12 with this patch. Full backward
compatibility is provided for the old control command (NIOCREGIF), by
means of a new netmap_legacy module. Many parts of the old netmap.h
header has now been moved to netmap_legacy.h (included by netmap.h).
Approved by: hrs (mentor)
This patch will:
- Update ixgbe shared code
- Add support for Intel(R) Ethernet Connection X552 1000BASE-T
- Add error handling for link state check preventing VF from stopping traffic
after changing PF's MTU value
Submitted by: Krzysztof Galazka <krzysztof.galazka@intel.com>
Reviewed by: Intel Networking
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D13885
In iflib, the device-specific init() function isn't supposed to edit
the struct ifnet driver flags. If it does, it'll cause an MPASS() assert
in iflib to fail.
PR: 225312
Reported by: bhughes@
Initial update to the ixgbe PF and VF drivers to support the iflib interface.
The PF driver version is bumped to 4.0.0, and the VF driver version is bumped to 2.0.0.
Special thanks to sbruno@ for the support in helping make this conversion happen.
Submitted by: Jeb Cramer <cramerj@intel.com>, Krzysztof Galazka (Chris) <krzysztof.galazka@intel.com>, Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Reviewed by: sbruno@, shurd@, #IntelNetworking
Tested by: Jeffrey Pieper <jeffrey.e.pieper@intel.com>, Sergey Kozlov <kozlov.sergey.404@gmail.com>
Sponsored by: Limelight Networks, Intel Corporation
Differential Revision: https://reviews.freebsd.org/D11727
Mainly focus on files that use BSD 3-Clause license.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
will now prevent the driver from stripping vlan tags from packets.
PR: 219390
Submitted by: Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Reported by: Charles Goncalves <halfling@halfling.com.br>
Obtained from: 1 week
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D12795
The previous update to the driver to 3.2.12-k changed the VF's API version
to 1.2, but did not let the VF fall back to 1.1 or 1.0 versions. So, this
patch tries 1.2 first, then the older versions in succession if that fails.
This should allow the VF driver to negotiate 1.1 and work with older PF
drivers, such as the one used in Amazon's EC2 service.
PR: 220872
Submitted by: Jeb Cramer <jeb.j.cramer@intel.com>
MFC after: 1 week
Sponsored by: Intel Corporation
of LOR detection and a bit of lock release/acquire collision when using LRO.
Submitted by: Kevin Bowling <kevin.bowling@kev009.com>
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D11712
Includes:
- Support for X550EM devices.
- Support for Bypass adapters.
- Flow Director code moved to separate files
- SR-IOV code moved to separate files
- Netmap code moved to separate files
Differential Revision: https://reviews.freebsd.org/D11232
Submitted by: Jeb Cramer <cramerj@intel.com>
Reviewed by: erj@
Tested by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Sponsored by: Intel Corporation