Commit Graph

3161 Commits

Author SHA1 Message Date
Simon Kagstrom
139debc42d mbuf: move chaining from ip_frag library
Chaining/segmenting mbufs can be useful in many places, so make it
global.

Signed-off-by: Simon Kagstrom <simon.kagstrom@netinsight.net>
Signed-off-by: Johan Faltstrom <johan.faltstrom@netinsight.net>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2015-10-25 00:00:34 +02:00
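To illustrate what chaining means in 139debc42d above, here is a minimal sketch that appends one segment chain to another. It does not use the real mbuf structure: `struct seg`, its fields, `seg_chain()` and the `MAX_SEGS` limit are hypothetical stand-ins chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SEGS 255 /* hypothetical limit on segments per packet */

/* Hypothetical, simplified stand-in for a packet buffer segment. */
struct seg {
	struct seg *next;  /* next segment in the chain, or NULL */
	uint32_t pkt_len;  /* total packet length, valid in the first segment */
	uint16_t data_len; /* bytes held by this segment */
	uint16_t nb_segs;  /* segment count, valid in the first segment */
};

/* Append the chain starting at tail to the chain starting at head.
 * Returns 0 on success, -1 if the result would exceed MAX_SEGS. */
static int seg_chain(struct seg *head, struct seg *tail)
{
	struct seg *last = head;

	assert(head != NULL && tail != NULL);

	if (head->nb_segs + tail->nb_segs > MAX_SEGS)
		return -1;

	/* Walk to the last segment of the first chain. */
	while (last->next != NULL)
		last = last->next;

	last->next = tail;
	head->nb_segs = (uint16_t)(head->nb_segs + tail->nb_segs);
	head->pkt_len += tail->pkt_len;
	return 0;
}
```

Only the head segment carries the packet-wide totals in this sketch, so the caller would keep using `head` as the handle for the whole packet.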
Mark Smith
fd4b6f78ad acl: improve rules sorting
Replace O(n^2) list sort with an O(n log n) merge sort.
The merge sort is based on the solution suggested in:
http://cslibrary.stanford.edu/105/LinkedListProblems.pdf
Tested sort_rules() improvement:
100K rules: O(n^2):  31382 milliseconds; O(n log n): 10 milliseconds
259K rules: O(n^2): 133753 milliseconds; O(n log n): 22 milliseconds

Signed-off-by: Mark Smith <marsmith@akamai.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-10-24 22:52:53 +02:00
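The O(n log n) list sort described in fd4b6f78ad above can be sketched as a top-down merge sort on a singly linked list. This is not the ACL library code: `struct rule`, the `priority` field and the descending sort order are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rule node; the real ACL build structures differ. */
struct rule {
	struct rule *next;
	int priority;
};

/* Merge two lists that are already sorted by descending priority. */
static struct rule *merge_sorted(struct rule *a, struct rule *b)
{
	struct rule head = { NULL, 0 }; /* dummy node simplifies appending */
	struct rule *tail = &head;

	while (a != NULL && b != NULL) {
		if (a->priority >= b->priority) {
			tail->next = a;
			a = a->next;
		} else {
			tail->next = b;
			b = b->next;
		}
		tail = tail->next;
	}
	tail->next = (a != NULL) ? a : b;
	return head.next;
}

/* Split the list in half with slow/fast pointers, sort each half
 * recursively, then merge the two sorted halves. */
static struct rule *sort_list(struct rule *list)
{
	struct rule *slow, *fast, *second;

	if (list == NULL || list->next == NULL)
		return list;

	slow = list;
	fast = list->next;
	while (fast != NULL && fast->next != NULL) {
		slow = slow->next;
		fast = fast->next->next;
	}
	second = slow->next;
	slow->next = NULL;

	return merge_sorted(sort_list(list), sort_list(second));
}
```

Because ties are taken from the first list, rules with equal priority keep their original relative order.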
Stephen Hurd
7acf894d07 app/testpmd: detect numa socket count
Currently, there is a MAX_SOCKET macro which artificially limits the
number of NUMA sockets testpmd can use.  Anything on a higher socket
ends up using socket zero.  This patch replaces this with a variable
set during set_default_fwd_lcores_config() and uses RTE_MAX_NUMA_NODES
where a hard-coded max number of sockets is required.

Signed-off-by: Stephen Hurd <shurd@broadcom.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2015-10-24 21:41:17 +02:00
Ravi Kerur
e83982c4b3 mpipe: return error for init allocation failure
In function rte_pmd_mpipe_devinit, if rte_eth_dev_allocate
fails, return an error, which is in line with other drivers.

Signed-off-by: Ravi Kerur <rkerur@gmail.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Zhigang Lu <zlu@ezchip.com>
2015-10-24 19:24:17 +02:00
Jasvinder Singh
ca743ea84e cfgfile: increase entry name and value sizes
This patch refers to the ABI change proposed for
librte_cfgfile (rte_cfgfile.h). In order to allow
for longer names and values, the values of the macros
CFG_NAME_LEN and CFG_VAL_LEN are increased.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2015-10-22 18:35:11 +02:00
Michal Jastrzebski
a73c93ec14 examples/qos_sched: remove duplicated cfgfile library
This is a supplement to the previous patch, which was incomplete.
Previous commit message: This is a modification of the qos_sched
example to use librte_cfgfile for parsing the configuration file.

Fixes: db935d0171 ("examples/qos_sched: use librte_cfgfile")

Signed-off-by: Michal Jastrzebski <michalx.k.jastrzebski@intel.com>
2015-10-22 18:09:36 +02:00
Christoph Gysin
7499ef45c3 eal: fix C++ build
'virtual' is a keyword and can't be used if the code is to compile with
C++ compilers.

If rte_devargs.h was included in C++ code, compilation with clang++
failed with an error. g++ did not fail, but only because of a bug
that treats it as an anonymous struct with a decl-specifier which it
ignores.

This simply renames the member to 'virt'.

Reported-by: Ming Zhao <mzhao@luminatewireless.com>
Signed-off-by: Christoph Gysin <christoph.gysin@gmail.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2015-10-22 17:50:51 +02:00
Wen-Chi Yang
d08d304508 eal/linux: make alarm not affected by system time jump
eal_alarm_callback() and rte_eal_alarm_set() use gettimeofday() to get
the current time, and gettimeofday() is affected by system time jumps.

For example, set up an rte_alarm which will be triggered in the next second
(current time + 1 second) by rte_eal_alarm_set(). The callback
function of this rte_alarm then sets up another rte_alarm which will be
triggered in the following second (current time + 2 seconds).
If the system time is changed while the callback function is being triggered,
the rte_alarm functions may not behave as expected.

Replacing gettimeofday() with clock_gettime(CLOCK_MONOTONIC_RAW, &now)
avoids this problem.

Signed-off-by: Wen-Chi Yang <wolkayang@gmail.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2015-10-21 17:01:24 +02:00
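As a minimal sketch of the approach in d08d304508 above, the helpers below compute alarm deadlines from clock_gettime(CLOCK_MONOTONIC_RAW, ...), which is not affected by changes to the wall-clock time. The helper names (`now_us`, `alarm_deadline_us`, `alarm_expired`) are hypothetical and are not the EAL functions; the fallback to CLOCK_MONOTONIC is an assumption for systems that lack the raw clock.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes clock_gettime() and CLOCK_MONOTONIC_RAW on glibc */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

#ifndef CLOCK_MONOTONIC_RAW
#define CLOCK_MONOTONIC_RAW CLOCK_MONOTONIC /* fallback where the raw clock is missing */
#endif

/* Current monotonic time in microseconds, or 0 if the clock read fails. */
static uint64_t now_us(void)
{
	struct timespec now;

	if (clock_gettime(CLOCK_MONOTONIC_RAW, &now) != 0)
		return 0;
	return (uint64_t)now.tv_sec * 1000000u + (uint64_t)now.tv_nsec / 1000u;
}

/* Absolute deadline for an alarm that should fire delay_us from now. */
static uint64_t alarm_deadline_us(uint64_t delay_us)
{
	return now_us() + delay_us;
}

/* An alarm is due once the monotonic clock has reached its deadline. */
static int alarm_expired(uint64_t deadline_us)
{
	return now_us() >= deadline_us;
}
```

Since both the deadline and the expiry check read the same monotonic clock, setting the system date forwards or backwards between the two calls does not change when the alarm fires.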
Stephen Hemminger
1e7bd2380f virtio: fix Coverity unsigned warnings
There are some places in virtio driver where uint16_t or int are used
where it would be safer to use unsigned.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2015-10-21 16:14:02 +02:00
Stephen Hemminger
954ea11540 virtio: do not report link state feature unless available
If host does not support virtio link state (like current DPDK vhost)
then don't set the flag. This keeps applications from incorrectly
assuming that link state is available when it is not. It also
avoids useless "guess what works in the config".

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
2015-10-21 16:12:32 +02:00
Jerome Jutteau
6c6373c763 vhost: fix missing device checks
virtio-net searches for its device in reset_owner, but the function does
not check the return value of get_config_ll_entry, so no error is reported
when the device is not found. This patch fixes this by using get_device
instead of get_config_ll_entry.

In user_get_vring_base, the return value of get_device is not checked, which
may cause a segfault when the device is not found.

Signed-off-by: Jerome Jutteau <jerome.jutteau@outscale.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2015-10-21 12:21:18 +02:00
Jerome Jutteau
2c95f4de6a vhost: keep device identifier after reset owner
virtio-net cleans and re-initializes the device after a VHOST_USER_RESET_OWNER.
This resets the device identifier to 0 and breaks the ll_root listing logic.
This patch keeps the old device identifier and rewrites it on the cleaned
device.

Signed-off-by: Jerome Jutteau <jerome.jutteau@outscale.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2015-10-21 12:03:57 +02:00
Bernard Iremonger
ce8e121870 virtio: fix crash when releasing null queue
If the input parameter vq is NULL, the assignment hw = vq->hw causes a segmentation fault.

Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2015-10-20 23:29:37 +02:00
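The kind of guard described in ce8e121870 above can be sketched as follows. The types `struct vq` and `struct hw_state` and the function `queue_release()` are hypothetical simplifications, not the virtio driver's own definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified device state for this sketch. */
struct hw_state {
	int queues_in_use;
};

/* Hypothetical, simplified queue that points back at its device. */
struct vq {
	struct hw_state *hw;
};

/* Release a heap-allocated queue. A NULL queue is treated as a no-op
 * instead of being dereferenced. */
static void queue_release(struct vq *vq)
{
	struct hw_state *hw;

	if (vq == NULL)
		return; /* check before reading vq->hw */

	hw = vq->hw;
	hw->queues_in_use--;
	free(vq);
}
```

The key point is ordering: the NULL test comes before the first read through the pointer, so the assignment from `vq->hw` is never reached for a NULL argument.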
David Marchand
fd6949c55c eal: fix io permission for virtio interrupt handler
For virtio-net pmd, the interrupt management thread must be created after
this driver has initialised so that iopl() has been properly called and
its effects are inherited by all eal children threads.

Before this change, changing link status on a virtio-net device would
trigger a segfault in the interrupt thread:

$ mkdir -p /mnt/huge
$ echo 256 > /proc/sys/vm/nr_hugepages
$ mount -t hugetlbfs none /mnt/huge
$ lspci |grep Ethernet
00:03.0 Ethernet controller: Red Hat, Inc Virtio network device
$ modprobe uio
$ insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
$ echo 0000:00:03.0 > /sys/bus/pci/devices/0000\:00\:03.0/driver/unbind
$ echo 1af4 1000 > /sys/bus/pci/drivers/igb_uio/new_id
$ ./x86_64-native-linuxapp-gcc/app/testpmd -c 0x6 -n 3 -w 0000:00:03.0 -- -i --txqflags=0xf01 --total-num-mbufs 2048
[snip]
EAL: PCI device 0000:00:03.0 on NUMA socket -1
EAL:   probe driver: 1af4:1000 rte_virtio_pmd
Interactive-mode selected
Configuring Port 0 (socket 0)
Port 0: DE:AD:DE:01:02:03
Checking link statuses...
Port 0 Link Up - speed 10000 Mbps - full-duplex
Done
testpmd>

Then, from qemu monitor:
(qemu) set_link virtio-net-pci.0 off

testpmd> Segmentation fault

Fixes: 565b85dcd9 ("eal: set iopl only when needed")

Reported-by: Stephen Hemminger <shemming@brocade.com>
Suggested-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
2015-10-20 23:20:42 +02:00
Didier Pallard
50d86a005e mlx4: do not expose broadcast address in MAC list
Use the last array entry to store the broadcast address and keep it hidden
by not reporting the entire array size.

This is done to prevent DPDK applications from attempting to modify or
remove it.

Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
2015-10-20 21:58:16 +02:00
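A minimal sketch of the idea in 50d86a005e above: reserve the last slot of the MAC table for the broadcast address and report one entry fewer to callers, so the reserved slot cannot be modified through the public interface. The table size, type and function names here are hypothetical and do not come from the mlx4 driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_TABLE_SIZE 8 /* hypothetical table size */
#define BCAST_INDEX (MAC_TABLE_SIZE - 1) /* last slot is reserved */

struct mac_table {
	uint8_t addr[MAC_TABLE_SIZE][6];
};

/* Clear the table and store the broadcast address in the reserved slot. */
static void mac_table_init(struct mac_table *t)
{
	memset(t, 0, sizeof(*t));
	memset(t->addr[BCAST_INDEX], 0xFF, 6);
}

/* Number of entries reported to applications; the reserved slot is hidden. */
static unsigned int mac_table_visible(void)
{
	return MAC_TABLE_SIZE - 1;
}

/* Set an entry, rejecting any index outside the visible range. */
static int mac_table_set(struct mac_table *t, unsigned int idx,
			 const uint8_t mac[6])
{
	if (idx >= mac_table_visible())
		return -1;
	memcpy(t->addr[idx], mac, 6);
	return 0;
}
```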
Francesco Santoro
04c6383de9 mlx4: save bound interface
Allows applications to retrieve the name of the related netdevice.

Signed-off-by: Francesco Santoro <francesco.santoro@6wind.com>
2015-10-20 21:50:29 +02:00
Adrien Mazarguil
08c192054b mlx4: fix missing offload flags in scattered Rx
They were dropped by mistake in the commit below.

Fixes: ab351fe1c9 ("mbuf: remove packet type from offload flags")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-10-20 21:49:24 +02:00
David Marchand
922a5466c1 enic: fix hash creation when not using first numa node
If dpdk is run with memory only available on socket != 0, then hash
creation will fail and flow director feature won't be available.
Fix this by asking for allocation on caller socket.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Sujith Sankar <ssujith@cisco.com>
2015-10-20 21:32:06 +02:00
David Marchand
7a0b8b7cab enic: fix allocation when not using first numa node
Seen by code review.

If dpdk is run with memory only available on socket != 0, then enic pmd
refuses to initialize ports as this pmd requires some memory on socket 0.
Fix this by setting socket to SOCKET_ID_ANY, so that allocations happen on
the caller socket.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Sujith Sankar <ssujith@cisco.com>
2015-10-20 21:30:48 +02:00
Rahul Lakkireddy
0ec33be4c8 cxgbe: allow to change mtu
Add a mtu_set() eth_dev_ops to allow DPDK apps to modify device mtu.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
2015-10-20 18:49:18 +02:00
Rahul Lakkireddy
0740f6ea3c cxgbe: receive jumbo frames
Ensure jumbo mode is enabled and that the mbuf data room size can
accommodate jumbo size.  If the mbuf data room size can't accommodate
jumbo size, chain mbufs to jumbo size.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
2015-10-20 18:49:18 +02:00
Rahul Lakkireddy
0dae2ba2cb cxgbe: transmit jumbo frames
Add a non-coalesce path.  Skip coalescing for Jumbo Frames, and send the
packet through non-coalesced path if there are enough credits.  Also,
free these non-coalesced packets while reclaiming credits.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
2015-10-20 18:49:18 +02:00
Rahul Lakkireddy
4b2eff452d cxgbe: enable jumbo frames
Increase max_rx_pktlen to accommodate jumbo frame size. Perform sanity
checks and enable jumbo mode in rx queue setup. Set link mtu based on
max_rx_pktlen.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
2015-10-20 18:49:18 +02:00
Rahul Lakkireddy
bf89cbedd2 cxgbe: optimize forwarding performance for 40G
Update sge initialization with respect to free-list manager configuration
and ingress arbiter. Also update refill logic to refill mbufs only after
a certain threshold for rx.  Optimize tx packet prefetch.

Approx. 3 MPPS improvement seen in forwarding performance after the
optimization.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
2015-10-20 18:49:18 +02:00
Rahul Lakkireddy
14b094a401 cxgbe: update documentation
- Add a missed step to mount huge pages in Linux.
- Re-structure Sample Application Notes.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
2015-10-20 18:49:18 +02:00
John W. Linville
b89664c07b af_packet: check Tx error
Coverity CID # 13200

If sendto fails, the packets will not get transmitted.  Return 0 as
the number of packets transmitted.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2015-10-20 17:59:19 +02:00
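The error check described in b89664c07b above can be sketched as below. The function `tx_kick()` and its parameters are hypothetical; the sketch assumes frames have already been queued elsewhere and that a zero-length sendto() is what triggers their transmission.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical transmit "kick": returns the number of packets that can
 * be reported as sent. */
static uint16_t tx_kick(int sockfd, uint16_t queued)
{
	/* Zero-length send used only to start transmission of frames that
	 * were queued beforehand (an assumption of this sketch). */
	if (sendto(sockfd, NULL, 0, MSG_DONTWAIT, NULL, 0) == -1)
		return 0; /* nothing went out, so report nothing */

	return queued;
}
```

Returning 0 on failure keeps the caller's accounting honest: it still owns the packets and can retry or drop them.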
John W. Linville
43254a3367 af_packet: refactor error handling to avoid NULL pointer dereference
Coverity CID # 13321

Checking *internals != NULL before accessing req is not good enough,
because **internals is a function argument and the function doesn't
really know what is passed-in.  We can close our eyes and ignore the
warning on the basis of controlling all the calling code, or we can
refactor the error exit to avoid the issue entirely...

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-10-20 17:58:10 +02:00
Fan Zhang
ba92d511dd port: move metadata offset reference at mbuf head
This patch relates to the ABI change proposed for librte_port. Macros to
access the packet meta-data stored within the packet buffer have been
adjusted to cover the packet mbuf structure.

The LIBABIVER number is incremented.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2015-10-19 17:00:36 +02:00
Jasvinder Singh
5aaf45e09a apps: add name to LPM parameters
LPM table and pipeline apps have been modified to
include the name parameter of the LPM table.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2015-10-12 16:04:10 +02:00
Jasvinder Singh
f71c7fc0b9 table: add name to LPM parameters
This patch relates to the ABI change proposed for librte_table
(lpm table). A new parameter to hold the table name has
been added to the LPM table parameter structures
rte_table_lpm_params and rte_table_lpm_ipv6_params.

The LIBABIVER number is incremented. The release notes
are updated and the deprecation announcement is removed.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2015-10-12 16:03:19 +02:00
Piotr Azarewicz
4f8e575f89 ip_frag: fix bit-fields in ipv6 fragment extension
The previous implementation does not work in every environment. The order of
allocation of bit-fields within a unit (high-order to low-order or
low-order to high-order) is implementation-defined.
Solution: use bytes instead of bit-fields.

Signed-off-by: Piotr Azarewicz <piotrx.t.azarewicz@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2015-10-08 13:15:17 +02:00
Wang Xiao W
97661df7d2 fm10k/base: add FM10420 device ids
Add the device IDs for Boulder Rapids and Atwood Channel to enable
drivers to support those devices.

Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
2015-10-07 13:35:53 +02:00
Wang Xiao W
925c862cbc fm10k/base: pack TLV overlay structures
This patch adds #pragma pack(push, 1) around some structures which are passed
via TLV messages. These structures must not be left unpacked as GCC and
other compilers are wont to do. Otherwise, we get invalid message
responses from the Switch Manager software since it sends 20 bytes and
we expect 24.

Solaris (and other OS's) are not C99 compliant, so they are not able
to use the C99 style #pragma pack() code. Wrap with C99 tag for easy
stripping.

Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
2015-10-07 13:35:53 +02:00
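The 20-byte versus 24-byte mismatch mentioned in 925c862cbc above can be reproduced with a small sketch. The two structs below are hypothetical and are not the fm10k TLV structures; they only have the same total field size (8 + 8 + 4 bytes).

```c
#include <assert.h>
#include <stdint.h>

/* Natural layout: the compiler may add tail padding so that the size is
 * a multiple of the strictest member alignment. */
struct msg_natural {
	uint64_t a;
	uint64_t b;
	uint32_t c;
};

/* Packed layout: members are laid out back to back, with no padding. */
#pragma pack(push, 1)
struct msg_packed {
	uint64_t a;
	uint64_t b;
	uint32_t c;
};
#pragma pack(pop)

/* The packed form must match the 20 bytes carried in the message. */
static_assert(sizeof(struct msg_packed) == 20,
	      "packed layout must be exactly 20 bytes");
```

On a typical 64-bit ABI, `sizeof(struct msg_natural)` is 24 because of 4 bytes of tail padding, which is the kind of mismatch the commit describes.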
Wang Xiao W
b56d0781aa fm10k/base: fix ieee1588 adjustment direction
The SYSTIME_CFG.Adjust field has a Direction bit to indicate whether the
adjustment is positive or negative. However, we misread the
documentation: the direction bit should be set to 1 when the adjustment
is positive, not when it is negative.

Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
2015-10-07 13:35:48 +02:00
Wang Xiao W
d6eaec3589 fm10k/base: add clock offset message
Add support for clock offset message from switch manager. Each PEP will
be responsible for notifying its own VFs, and the originating PEP must
notify its own VFs prior or in addition to sending, as it will not
receive a copy of its own message. Base drivers are expected to need
custom implementations so no message handler is provided in shared code.

Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
2015-10-07 13:35:48 +02:00
Wang Xiao W
001c2f311a fm10k/base: add ieee1588 clock owner message support
Add support for tx timestamp mode response message. The switch manager
should send this message whenever the owner changes or when a new port
appears. To simplify logic, treat this as full clock ownership, and call
it the CLOCK_OWNER message. Implement this as a hw->flags field, so that
base driver may use it to disable any functions which modify the clock
including Tx timestamps, frequency adjustments, and offset adjustments.
This ensures only one PEP will be handling these at a time.

Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
2015-10-07 13:35:40 +02:00
Wang Xiao W
685e5fb30b fm10k/base: remove 1588 VF API
Remove support for VF transmit timestamps. VFs should not write the
timestamp bit in the Tx descriptor. Only one Tx timestamp can be
realistically handled at once. It is expected that the switch manager
use FFU logic to disable all timestamp requests except for those
originating from a specific virtual port. It is not possible to
correlate this timestamp accurately if more than one occurs out of any
given EPL at a time. Since the primary purpose of Tx timestamps is to
implement PTP daemon, which also requires BAR4 access to change the
clock, do not allow VFs to transmit timestamp. Remove the PF<->VF
message for this behavior.

Note, the VF already lacked the ability to request Tx timestamp mode,
so it essentially wasn't allowed to timestamp before anyway under the
old API.

No longer support old API of request-response timestamp mode messages.
New API only sends timestamp-response when the switch decides which port
will be given control of timestamps. To simplify review of this code,
completely remove the support and re-add support for the response
message in a future patch.

Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
2015-10-07 13:25:07 +02:00
Wang Xiao W
8b8264bdb9 fm10k/base: check VF has a queue
It is possible that the PF has not yet assigned resources to the VF.
Although rare, this could result in the VF attempting to read queues it
does not own and result in FUM or THI faults in the PF. To prevent this,
check queue 0 before we continue in init_hw_vf.

Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
2015-10-07 13:25:07 +02:00
Wang Xiao W
6eca661890 fm10k/base: fix VF re-enabling
When a VF issues an LPORT_STATE request to enable a port which is
already enabled, the PF will first disable the VF. Then it is supposed
to re-enable the VF again with new settings. This is primarily done in
order to ensure that the switch management software properly clears the
previous VF settings (i.e. switch flow rules and so forth). However,
there is a bug in the flow because we check if VF is enabled and don't
re-enable it at the end. The issue is that we disable the VF in order to
clear switch rules, and never follow up with a re-enable. As a result,
a call to enable the VF ends up disabling the logical port.

Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
2015-10-07 13:25:07 +02:00
Wang Xiao W
46e018c501 fm10k/base: fix VF queues counting
During initialization, the VF counts its rings by walking the TQDLOC
registers. This only works if the TQMAP/RQMAP registers are set to map
the out-of-bound rings to the first one, so the VF driver can detect when
it has run out of queues cleanly. Update the PF to reset the empty
TQMAP/RQMAP registers post-VFLR to prevent innocent VF drivers from
triggering malicious events.

Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
2015-10-07 13:25:07 +02:00
Wang Xiao W
e78a599651 fm10k/base: fix VF multicast
The VF will send a message to request multicast addresses with the
default vid. In the current code, if the PF has statically assigned a
VLAN to a VF, then the VF will not get the multicast addresses. Fix up
all of the various vlan messages to use identical checks (since each
check was different). Also use set as a variable, so that it simplifies
our check for whether vlan matches the pf_vid.

The new logic allows setting a vlan if it is zero, automatically
converting it to the default vid. Otherwise it will allow setting the PF
vid, or any VLAN if PF has not statically assigned a VLAN. This is
consistent behavior, and allows VF to request either 0 or the
default_vid without silently failing. Note that we need the check for
zero since VFs might not get the default VID message in time to actually
request non-zero VLANs.

Create a function, fm10k_iov_select_vid, which implements the logic for
selecting a default vid. This helps us remove duplicate code and
keeps this logic in one place so that we don't introduce similar bugs in
the future.

Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
2015-10-07 13:25:07 +02:00
Wang Xiao W
9d87822572 fm10k/base: fix VF multicast update
VFs were being improperly added to the switch's multicast group. The
error stems from the fact that incorrect arguments were passed to the
"update_mc_addr" function. It would seem to be a copy paste error since
the parameters are similar to the "update_uc_addr" function.

Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
2015-10-07 13:25:07 +02:00
Wang Xiao W
ba4ea5ee35 fm10k/base: add mailbox counters
A previous bug was uncovered by addition of a debug stat to indicate the
actual number of DWORDS we pulled from the mbmem. It turned out this was
not the same as the tx_dwords counter. While the previous bug fix should
have corrected this in all cases, add some debug stats that count the
number of DWORDs pushed or pulled from the mbmem. Base drivers can use
this in debug builds to help detect this problem in the future.

Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
2015-10-07 13:25:07 +02:00
Wang Xiao W
c3cb69d9a6 fm10k/base: fix mailbox connect
When we connect to the mailbox, we insert a fake disconnect header so
that the code does not see an error and thus instantly error every time
we bring up the mailbox. However, we incorrectly record the tail and
head from the local perspective. Since the remote end shouldn't have
anything for us, add a "create_fake_disconnect_hdr" function which
inverts the TAIL and HEAD fields. This enables us to connect without any
errors of either TAIL or HEAD incorrectness, and prevents creating
extraneous error messages. This is necessary now since mbx_reset_work
does not actually clear the Tx FIFO head and tail pointers.

Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
2015-10-07 13:25:07 +02:00
Wang Xiao W
00b18e3de1 fm10k/base: fix mailbox phantom messages
The phantom messages were a result of forgetting to drop
already transmitted messages. We would reset pulled and tail_len but
leave the head/tail pointers alone.

The correct fix is to loop through pulled and drop messages until we've
dropped at least as many bytes as we pulled (possibly dropping a message
we've only partially transmitted). However, we also have to account for
tail_len variable and the 'ack' value as in mbx_pull_head. This means
that we need to re-read the HEAD field of the mailbox header.

Based on testing, this resolves the phantom messages issue, as well as
correctly keeping messages which have yet to be transmitted at all in
the Tx FIFO. Thus, we will begin re-transmission once we have
re-connected.

Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
2015-10-07 13:25:07 +02:00
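The drop loop described in 00b18e3de1 above can be sketched on a simplified model. The `struct tx_fifo` below, which stores only message lengths, and the function `fifo_drop_pulled()` are hypothetical; the sketch deliberately omits the tail_len and 'ack' accounting that the commit message says the real fix also needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical Tx FIFO modeled as an array of message lengths. */
struct tx_fifo {
	const uint16_t *msg_len; /* length of each queued message */
	size_t count;            /* number of queued messages */
	size_t head;             /* index of the oldest message still queued */
};

/* Drop whole messages from the head until at least `pulled` units have
 * been dropped. The last dropped message may have been only partially
 * transmitted. Returns the number of messages dropped. */
static size_t fifo_drop_pulled(struct tx_fifo *fifo, uint32_t pulled)
{
	size_t dropped_msgs = 0;
	uint32_t dropped_len = 0;

	assert(fifo != NULL);

	while (dropped_len < pulled && fifo->head < fifo->count) {
		dropped_len += fifo->msg_len[fifo->head];
		fifo->head++;
		dropped_msgs++;
	}
	return dropped_msgs;
}
```

Messages beyond the new head were never pulled at all, so they stay queued and can be sent after reconnecting.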
Wang Xiao W
63cfcf90fe fm10k/base: ignore oversized mailbox messages
When we call update_max_size, it does not drop all oversized messages.
This is due to the difficulty in performing this operation, since it is
a FIFO which makes updating anything other than head or tail very
difficult. To fix this, modify validate_msg_size to ensure that we error
out later when trying to transmit the message that could be oversized.
This will generally be a rare condition, as it requires the FIFO to
include a message larger than the max_size negotiated during mailbox
connect. Note that max_size is always smaller than rx.size, so it should
be safe to use here.

Also, update the update_max_size function header comment to clearly
indicate that it does not drop all oversized messages, but only those at
the head of the FIFO.

Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
2015-10-07 13:25:07 +02:00
Wang Xiao W
400f3c18fa fm10k/base: avoid Tx drop increment during mailbox negotiation
After shutting down the mailbox by force, we then go about resetting max
size to 0, and clearing all messages in the FIFO. However, we should
just reset the head pointer so that the FIFO will become empty, rather than
changing the max size to 0. This helps prevent increments of the tx_dropped
counter during mailbox negotiation, which are confusing to viewers of
Linux ethtool statistics output.

Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
2015-10-07 13:25:07 +02:00
Wang Xiao W
e18683f8ef fm10k/base: scale interrupt on PCIe link speed
Red Rock Canyon's interrupt throttle timers are based on the PCIe link
speed. Because of this, the value being programmed into the ITR
registers must be scaled.

For the PF, this is as simple as reading the PCIe link speed and storing
the result. However, in the case of SR-IOV, the VF's interrupt throttle
timers are based on the link speed of the PF. However, the VF is unable
to get the link speed information from its configuration space, so the
PF must inform it of what scale to use.

Rather than passing this scale via mailbox message, we take advantage of
unused bits in the TDLEN register to pass the scale. It is the
responsibility of the PF to program this for the VF while setting up the
VF queues and the responsibility of the VF to get the information
accordingly. This is preferable because it allows the VF to set up the
interrupts properly during initialization and matches how the MAC
address is passed in the TDBAL/TDBAH registers.

A VF unload followed by a reload incorrectly left this value as 0.
If the VF driver blindly trusted this value it could cause a divide by
zero failure.
Fix this by having stop_hw_vf reset the ITR scale as the device goes
down, similar to the way we handle the MAC address.

To prevent divide-by-zero issues, ensure that we always have an ITR
scale. Default to Gen3 scaling if we don't know the speed. Also ensure
the VF checks the register value and ensures we use Gen3 if we are
provided a zero value.

Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
2015-10-07 13:25:07 +02:00
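The divide-by-zero guard described in e18683f8ef above can be sketched as follows. The scale constants, the function names and the use of the scale as a divisor are assumptions made for this example; the real register encoding may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scale factors per PCIe generation. */
#define ITR_SCALE_GEN1 4u
#define ITR_SCALE_GEN2 2u
#define ITR_SCALE_GEN3 1u

/* Return a usable scale. A value of 0 (never programmed, or reset when
 * the device went down) falls back to the Gen3 scale. */
static uint32_t itr_scale_or_default(uint32_t reg_scale)
{
	return reg_scale != 0 ? reg_scale : ITR_SCALE_GEN3;
}

/* Scale an interval; the divisor is guaranteed to be non-zero. */
static uint32_t itr_scaled(uint32_t usecs, uint32_t reg_scale)
{
	return usecs / itr_scale_or_default(reg_scale);
}
```

Checking the value at the point of use means the consumer stays safe even if the producer forgets to program the scale.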
Wang Xiao W
76cf5b44b9 fm10k/base: set unlimited bandwidth for PF queues
Set PF queues used for VMDq to unlimited bandwidth when virtualization
resources are assigned.

Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
2015-10-07 13:25:07 +02:00
Wang Xiao W
ce1a476adc fm10k/base: add VF Tx timestamp mode no-op
This patch resolves a bug in Linux where we called the
request_tx_timestamp_mode function that is undefined for VF. Implement a
no-op function that simply ensures that the mode is NONE, otherwise it
would fail with ERR_PARAM.

Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
2015-10-07 13:25:07 +02:00