This patch enables fm10k TSO feature for both non-tunneling packet
and tunneling packet.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
This patch adds #pragma pack(push, 1) around some structures which are passed
via TLV messages. These structures must not be left unpacked as GCC and
other compilers are wont to do. Otherwise, we get invalid message
responses from the Switch Manager software since it sends 20 bytes and
we expect 24.
Solaris (and other OS's) are not C99 compliant, so they are not able
to use the C99 style #pragma pack() code. Wrap with C99 tag for easy
stripping.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
The SYSTIME_CFG.Adjust field has a Direction bit to indicate whether the
adjustment is positive or negative. However, we incorrectly read the
documentation and the direction bit should be set 1 when positive, not
when negative.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
Add support for clock offset message from switch manager. Each PEP will
be responsible for notifying its own VFs, and the originating PEP must
notify its own VFs prior or in addition to sending, as it will not
receive a copy of its own message. Base drivers are expected to need
custom implementations so no message handler is provided in shared code.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
Add support for tx timestamp mode response message. The switch manager
should send this message whenever the owner changes or when a new port
appears. To simplify logic, treat this as full clock ownership, and call
it the CLOCK_OWNER message. Implement this as a hw->flags field, so that
base driver may use it to disable any functions which modify the clock
including Tx timestamps, frequency adjustments, and offset adjustments.
This ensures only one PEP will be handling these at a time.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
Remove support for VF transmit timestamps. VFs should not write the
timestamp bit in the Tx descriptor. Only one Tx timestamp can be
realistically handled at once. It is expected that the switch manager
use FFU logic to disable all timestamp requests except for those
originating from a specific virtual port. It is not possible to
correlate this timestamp accurately if more than one occurs out any
given EPL at a time. Since the primary purpose of Tx timestamps is to
implement PTP daemon, which also requires BAR4 access to change the
clock, do not allow VFs to transmit timestamp. Remove the PF<->VF
message for this behavior.
Note, the VF already didn't have ability to request Tx timestamp mode,
so it essentially wasn't allowed to timestamp before anyways under the
old API.
No longer support old API of request-response timestamp mode messages.
New API only sends timestamp-response when the switch decides which port
will be given control of timestamps. To simplify review of this code,
completely remove the support and re-add support for the response
message in a future patch.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
It is possible that the PF has not yet assigned resources to the VF.
Although rare, this could result in the VF attempting to read queues it
does not own and result in FUM or THI faults in the PF. To prevent this,
check queue 0 before we continue in init_hw_vf.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
When a VF issues an LPORT_STATE request to enable a port which is
already enabled, the PF will first disable the VF. Then it is supposed
to re-enable the VF again with new settings. This is primarily done in
order to ensure that the switch management software properly clears the
previous VF settings. (ie: switch flow rules and so forth). However,
there is a bug in the flow because we check if VF is enabled and don't
re-enable it at the end. The issue is that we disable the VF in order to
clear switch rules, and never follow-up with a re-enable. This results in
a call to enable the VF results in disabling the logical port.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
During initialization, the VF counts its rings by walking the TQDLOC
registers. This only works if the TQMAP/RQMAP registers are set to map
the out-of-bound rings to the first one, so the VF driver can detect when
it has run out of queues cleanly. Update the PF to reset the empty
TQMAP/RQMAP registers post-VFLR to prevent innocent VF drivers from
triggering malicious events.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
The VF will send a message to request multicast addresses with the
default vid. In the current code, if the PF has statically assigned a
VLAN to a VF, then the VF will not get the multicast addresses. Fix up
all of the various vlan messages to use identical checks (since each
check was different). Also use set as a variable, so that it simplifies
our check for whether vlan matches the pf_vid.
The new logic will allow set of a vlan if it is zero, automatically
converting to the default vid. Otherwise it will allow setting the PF
vid, or any VLAN if PF has not statically assigned a VLAN. This is
consistent behavior, and allows VF to request either 0 or the
default_vid without silently failing. Note that we need the check for
zero since VFs might not get the default VID message in time to actually
request non-zero VLANs.
Create a function, fm10k_iov_select_vid which implements the logic for
selecting a default vid. This helps us remove duplicate code and
streamlines location of this logic so that we don't make similar bugs in
the future.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
VFs were being improperly added to the switch's multicast group. The
error stems from the fact that incorrect arguments were passed to the
"update_mc_addr" function. It would seem to be a copy paste error since
the parameters are similar to the "update_uc_addr" function.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
A previous bug was uncovered by addition of a debug stat to indicate the
actual number of DWORDS we pulled from the mbmem. It turned out this was
not the same as the tx_dwords counter. While the previous bug fix should
have corrected this in all cases, add some debug stats that count the
number of DWORDs pushed or pulled from the mbmem. Base drivers can use
this in debug builds to help detect this problem in the future.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
When we connect to the mailbox, we insert a fake disconnect header so
that the code does not see an error and thus instantly error every time
we bring up the mailbox. However, we incorrectly record the tail and
head from the local perspective. Since the remote end shouldn't have
anything for us, add a "create_fake_disconnect_hdr" function which
inverts the TAIL and HEAD fields. This enables us to connect without any
errors of either TAIL or HEAD incorrectness, and prevents creating
extraneous error messages. This is necessary now since mbx_reset_work
does not actually clear the Tx FIFO head and tail pointers.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
The phantom messages were a result of incorrectly forgetting to drop
already transmitted messages. We would reset pulled, and tail_len but
left the head/tail pointers alone.
The correct fix is to loop through pulled and drop messages until we've
dropped at least as many bytes as we pulled (possibly dropping a message
we've only partially transmitted. However, we also have to account for
tail_len variable and the 'ack' value as in mbx_pull_head. This means
that we need to re-read the HEAD field of the mailbox header.
Based on testing, this resolves the phantom messages issue, as well as
correctly keeping messages which have yet to be transmitted at all in
the Tx FIFO. Thus, we will begin re-transmission once we have
re-connected.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
When we call update_max_size, it does not drop all oversized messages.
This is due to the difficulty in performing this operation, since it is
a FIFO which makes updating anything other than head or tail very
difficult. To fix this, modify validate_msg_size to ensure that we error
out later when trying to transmit the message that could be oversized.
This will generally be a rare condition, as it requires the FIFO to
include a message larger than the max_size negotiated during mailbox
connect. Note that max_size is always smaller than rx.size, so it should
be safe to use here.
Also, update the update_max_size function header comment to clearly
indicate that it does not drop all oversized messages, but only those at
the head of the FIFO.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
After shutting down the mailbox by force, we then go about resetting max
size to 0, and clearing all messages in the FIFO. However, we should
just reset the head pointer so that the FIFO will become empty, rather than
changing the max size to 0. This helps prevent increment in tx_dropped
counter during mailbox negotiation, which is confusing to viewers of
Linux ethtool statistics output.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
Red Rock Canyon's interrupt throttle timers are based on the PCIe link
speed. Because of this, the value being programmed into the ITR
registers must be scaled.
For the PF, this is as simple as reading the PCIe link speed and storing
the result. However, in the case of SR-IOV, the VF's interrupt throttle
timers are based on the link speed of the PF. However, the VF is unable
to get the link speed information from its configuration space, so the
PF must inform it of what scale to use.
Rather than passing this scale via mailbox message, we take advantage of
unused bits in the TDLEN register to pass the scale. It is the
responsibility of the PF to program this for the VF while setting up the
VF queues and the responsibility of the VF to get the information
accordingly. This is preferable because it allows the VF to set up the
interrupts properly during initialization and matches how the MAC
address is passed in the TDBAL/TDBAH registers.
A VF unload followed by a reload incorrectly left this value as 0.
If the VF driver blindly trusted this value it could cause a divide by
zero failure.
Fix this by having stop_hw_vf reset the ITR scale as the device goes
down, similar to the way we handle the MAC address.
To prevent divide-by-zero issues, ensure that we always have an ITR
scale. Default to Gen3 scaling if we don't know the speed. Also ensure
the VF checks the register value and ensures we use Gen3 if we are
provided a zero value.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
This patch resolves a bug in Linux where we called the
request_tx_timestamp_mode function that is undefined for VF. Implement a
no-op function that simply ensures that the mode is NONE, otherwise it
would fail with ERR_PARAM.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
We need a handler function to be able to listen for Tx timestamp mode
responses. Without this, core driver code for PTP can't determine if the
Timestamp mode request was successful. This was overlooked in the
previous commit.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
To keep consistency with ND team, I add macro definitions about
FM10K_IS_VALID_ETHER_ADD in fm10k_type.h, though they have already
been defined in fm10k_osdep.h.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
The reference to err_no was left around after an old re-factor. We never
use this value again, and the macros called on the function appear to
have no relevant side effect I could see. Discovered via cppcheck
fm10k_mbx.c:1312: (style) Variable 'err_no' is assigned a value that is never used.
This occurred because a previous commit refactored and removed all used
references to err_no.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
The header comment included a miscopy of a C-code line, and also
mis-used Rx FIFO when it clearly meant Tx FIFO.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
The extended unified packet type is now part of the standard ABI.
As mbuf struct is changed, the mbuf library version is incremented.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
When a Tx queue fails to start in fm10k_dev_start, all Rx queues
and Tx queues that are started should be cleaned before the
function returns an error.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
In Rx and Tx queue_disable functions, the index of queue should
be qnum other than i which is the iteration of time expiration.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
Some drivers was not following DPDK convention and
was leaving logging always in even if LOG_LEVEL was configured
to disable debug logs.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
[Thomas: apply same fix to i40e, fm10k and bnx2x]
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
fm10k has the capability to do checksum offload in TX side. This
change will expose the capability to application in infos_get
function.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
The fm10k driver was reading the interrupt cause register but then
using the interrupt mask register defines to look at the bits.
The result is that if a fault happens, the driver would never clear
the fault and would get into an infinite cycle of interrupts.
Note: I don't work for Intel or have the hardware manuals (probably
requires NDA anyway), but this looks logical and matches how the
known working Linux driver handles these bits.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Jing Chen <jing.d.chen@intel.com>
The return in fm10k_dev_handle_fault has useless conditional
since it always returns 0 here.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Jing Chen <jing.d.chen@intel.com>
If FM10K_DEBUG_DRIVER is enabled, then the log messages about
function entry are missing newline causing extremely long lines.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Jing Chen <jing.d.chen@intel.com>
Move malloc inside eal and create a new section in MAINTAINERS file for
Memory Allocation in EAL.
Create a dummy malloc library to avoid breaking applications that have
librte_malloc in their DT_NEEDED entries.
This is the first step towards using malloc to allocate memory directly
from memsegs. Thus, memzones would allocate memory through malloc,
allowing to free memzones.
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
fm10k has 128 RETA entries in 32 registers, but it only initialized
first 32 when doing multiple rx queue configurations. This fix will
initialize all 128 entries instead.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
The default MAC address is read from hardware and copied to
Device Ethernet Link address array in the device initialization phase,
which bypasses fm10k MAC address number check mechanism,
and will cause an error message when adding default VLAN:
"MAC address number not match"
Fix it by moving default MAC address registration to device
initialize phase.
Fixes: f5c1a236a2 ("fm10k: fix default mac/vlan in switch")
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
To support querying hash key size per port, an new field of
'hash_key_size' was added in 'struct rte_eth_dev_info' for storing
hash key size in bytes.
The correct hash key size in bytes should be filled into the
'struct rte_eth_dev_info', to support querying it.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
The parameter tx_free_thresh is not consistent between the drivers:
some use it as rte_eth_tx_burst() requires, some release buffers when
the number of free descriptors drop below this value.
Let's use it as most fast-path code does, which is the latter, and update
comments throughout the code to reflect that.
Signed-off-by: Zoltan Kiss <zoltan.kiss@linaro.org>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
fm10k was failing to run in XEN domain0, as the physical
memory for DMA should be allocated and translated
in a different way for XEN domain0. So
rte_memzone_reserve_bounded() should be used for DMA
memory allocation, and rte_mem_phy2mch() should be used
for DMA memory address translation to support running
fm10k PMD in XEN domain0.
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
This patch includes 3 changes related to MAC/VLAN address table
when the system (e.g. testpmd) is started and closed:
- remove default MAC address with fixed VLAN 0 which was for the
debug purpose before the MAC/VLAN filter function was implemented.
- enable VF MAC/VLAN filter for the first valid MAC address
and first valid VLAN ID. This is needed for system (e.g. testpmd)
to setup default MAC address and default VLAN for VF.
Later attempt to change these default value will be refused by
under layer shared code and PF host functions.
- un-register any combination of VLAN and MAC address from fm10k
switch side MAC table when the system (e.g. testpmd) is closed.
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
Tested-by: Michael Qiu <michael.qiu@intel.com>
Fm10k PF/VF does not support QinQ; VLAN strip and filter are always on
for PF/VF ports.
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
MAC filter function was newly added, each PF and VF can have up to
64 MAC addresses.
VF filter needs support from PF host, which is not available now.
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
VLAN filter was updated to add/delete one static entry in MAC table for
each combination of VLAN and MAC address. More sanity checks were added.
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
Since the communication between PF/Switch Manager, VF/PF is
asynchronous through mailbox, it's hard to determine when Switch
Manager/PF host will send the default vlan to PF/VF. So, it's
necessary to set default vlan until the device is started.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
After acquiring MAC address from HW, it's necessary to validate
MAC address before use.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
Tested-by: Michael Qiu <michael.qiu@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
In fm10k, PF driver needs to communicate with switch through
mailbox if it needs to add/delete MAC address.
This fix will validate if switch is ready before going forward.
Then, it is necessary to acquire LPORT_MAP info after issuing
MAC addr request to switch.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
Tested-by: Michael Qiu <michael.qiu@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
In TX side, bit FM10K_TXD_FLAG_LAST in TX descriptor only is set
in the last descriptor for multi-segment packets. But current
implementation didn't set all the fields of TX descriptor, which
will cause descriptors processed now to re-use fields set in last
scroll. If FM10K_TXD_FLAG_LAST bit was set in the last round and
it happened this is not the last descriptor of a multi-segnment
packet, HW will send out the incomplete packet out and leads to
data intergrity issue.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
Tested-by: Michael Qiu <michael.qiu@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>