38 Commits

Author SHA1 Message Date
Thomas Monjalon
924e6b7634 drivers: replace page size definitions with function
The page size is often retrieved from the macro PAGE_SIZE.
If PAGE_SIZE is not defined, the code either uses a hard-coded default
or gets the system value from the UNIX-only function sysconf().

Such definitions are replaced with the generic function
rte_mem_page_size() defined for each supported OS.

Removing PAGE_SIZE definitions will fix dlb drivers for musl libc,
because #ifdef checks were missing, causing redefinition errors.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Boyer <aboyer@pensando.io>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
2021-03-23 08:41:05 +01:00
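
As a rough illustration of commit 924e6b7634 above, the sketch below replaces a compile-time PAGE_SIZE macro with a run-time query. The helper names `page_size_get()` and `page_align_up()` are hypothetical stand-ins, not DPDK symbols, and only a POSIX variant based on sysconf() is shown; the commit itself relies on rte_mem_page_size(), which is defined for each supported OS.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: query the page size at run time (POSIX only). */
static size_t page_size_get(void)
{
	long sz = sysconf(_SC_PAGESIZE);

	assert(sz > 0);
	return (size_t)sz;
}

/* Round len up to a whole number of pages without using a PAGE_SIZE macro. */
static size_t page_align_up(size_t len)
{
	size_t pg = page_size_get();

	return (len + pg - 1) / pg * pg;
}
```

Because no macro is defined, there is nothing for a libc header to redefine, which is the class of error the commit message mentions for musl.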
Long Li
a2a23a794b net/netvsc: support VF device hot add/remove
When a VF device is present, netvsc can send or receive packets over the
VF device. The VF device driver communicates directly with the PCI device
via the PF from the host hypervisor. This is faster than exchanging data
with netvsp via vmbus, i.e. the synthetic path.

In Azure and Hyper-V environments, a VF device can be hot added or hot
removed at any time while the guest VM is running. This patch improves
netvsc to support VF device hot add/remove.

1. netvsc monitors all system hot add activities over the PCI bus. When it
detects a VF device is added to the system and is managed under this
netvsc device, it asks EAL to probe and start this VF device, then it
attaches and switches data path to the VF device.

2. After a VF device is attached to netvsc, netvsc monitors this device on
hot remove. When this VF device is hot removed, netvsc switches data path
to synthetic, stops this VF device and removes it from EAL.

3. If any failure happens during a VF device hot remove or add, the netvsc
falls back to synthetic path for all data traffic.

Signed-off-by: Long Li <longli@microsoft.com>
2021-01-17 22:37:28 +01:00
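
The three steps in commit a2a23a794b above can be pictured as a small state machine. This is only a sketch under stated assumptions: the types and function names are hypothetical, and the real driver's probing and event monitoring through EAL are reduced to a boolean `probe_ok` argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum datapath { DATAPATH_SYNTHETIC, DATAPATH_VF };

/* Hypothetical per-device state; not the driver's real structure. */
struct nvs_dev {
	enum datapath path;
	bool vf_attached;
};

/* Step 1: a VF appeared. Switch to it only if probe and start succeeded. */
static void on_vf_hot_add(struct nvs_dev *dev, bool probe_ok)
{
	assert(dev != NULL);
	if (!probe_ok) {
		/* Step 3: any failure falls back to the synthetic path. */
		dev->vf_attached = false;
		dev->path = DATAPATH_SYNTHETIC;
		return;
	}
	dev->vf_attached = true;
	dev->path = DATAPATH_VF;
}

/* Step 2: the VF went away. Switch the data path first, then detach. */
static void on_vf_hot_remove(struct nvs_dev *dev)
{
	assert(dev != NULL);
	dev->path = DATAPATH_SYNTHETIC;
	dev->vf_attached = false;
}
```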
Long Li
096b31fc0d net/netvsc: control use of external mbuf on Rx
When receiving packets, netvsp puts data in a buffer mapped through UIO.
Depending on packet size, netvsc may attach the buffer as an external
mbuf. This is not a problem if this mbuf is consumed in the application,
and the application can correctly read data out of an external mbuf.

However, there are two problems with data in an external mbuf.
1. Due to the limitation of the kernel UIO implementation, physical
   address of this external buffer is not exposed to the user-mode. If
   this mbuf is passed to another driver, the other driver is unable to
   map this buffer to iova.
2. Some DPDK applications are not aware of external mbufs, and may
   malfunction when they receive an mbuf with an external buffer attached.

Introduce a driver parameter "rx_extmbuf_enable" to control whether netvsc
should use external mbufs for receiving packets. The default value is 0:
netvsc does not use external mbufs and always allocates an mbuf and copies
the data into it. A non-zero value tells netvsc to attach external buffers
to mbufs on receive, thus avoiding a memory copy.

Signed-off-by: Long Li <longli@microsoft.com>
2020-11-03 23:35:07 +01:00
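
For commit 096b31fc0d above, the semantics of the "rx_extmbuf_enable" value (default 0, any non-zero value enables external mbufs) might be sketched as follows. The function name is hypothetical, and treating a malformed value as the default is an assumption made for this example; the driver's actual argument parsing is not shown in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * value: the text after "rx_extmbuf_enable=", or NULL when the
 * parameter was not given. Hypothetical helper, not driver code.
 */
static bool rx_extmbuf_enabled(const char *value)
{
	char *end;
	unsigned long v;

	if (value == NULL)
		return false;		/* default is 0: always copy */

	v = strtoul(value, &end, 0);
	if (end == value || *end != '\0')
		return false;		/* assumption: malformed means default */

	return v != 0;			/* any non-zero value enables it */
}
```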
Stephen Hemminger
74a5a6663b net/netvsc: allow setting Rx and Tx copy break
The values for Rx and Tx copy break should be tunable rather
than hard coded constants.

The rx_copybreak sets the threshold where the driver uses an
external mbuf to avoid having to copy data. Setting 0 for copybreak
will cause driver to always create an external mbuf. Setting
a value greater than the MTU would prevent it from ever making
an external mbuf and always copy. The default value is 256 (bytes).

Likewise the tx_copybreak sets the threshold where the driver
aggregates multiple small packets into one request. If tx_copybreak
is 0 then each packet goes as a VMBus request (no copying).
If tx_copybreak is set larger than the MTU, then all packets smaller
than the chunk size of the VMBus send buffer will be copied; larger
packets always have to go as a single direct request. The default
value is 512 (bytes).

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Long Li <longli@microsoft.com>
2020-11-03 23:35:07 +01:00
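
The two thresholds described in commit 74a5a6663b above can be expressed as a pair of predicates. This is a simplified sketch: the function names are hypothetical, the exact boundary comparison (strict or not) is an assumption, and per-packet header overhead in the send buffer is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Rx: packets above the threshold are attached as external mbufs. */
static bool rx_attach_external(uint32_t pkt_len, uint32_t rx_copybreak)
{
	return pkt_len > rx_copybreak;
}

/*
 * Tx: packets below the threshold are copied into the send buffer and
 * aggregated, provided they fit in one send-buffer chunk. Larger
 * packets go out as a single direct request.
 */
static bool tx_copy_to_send_buffer(uint32_t pkt_len, uint32_t tx_copybreak,
				   uint32_t chunk_size)
{
	return pkt_len < tx_copybreak && pkt_len < chunk_size;
}
```

With rx_copybreak set to 0 every non-empty packet is attached, and with tx_copybreak set to 0 nothing is copied, which matches the two extremes described in the message.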
Long Li
b8c3c628af net/netvsc: allocate contiguous physical memory for RNDIS
When sending data, netvsc assumes the tx_rndis buffer is contiguous and
calculates physical addresses based on this assumption.

Use memzone to allocate tx_rndis so it's guaranteed that this buffer is
physically contiguous.

Cc: stable@dpdk.org

Signed-off-by: Long Li <longli@microsoft.com>
2020-11-03 23:24:26 +01:00
Ivan Ilchenko
62024eb827 ethdev: change stop operation callback to return int
Change eth_dev_stop_t return value from void to int.
Make eth_dev_stop_t implementations across all drivers return
negative errno values in case of error conditions.

Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-10-16 22:26:41 +02:00
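
The shape of the change in commit 62024eb827 above, a stop callback that reports failure through a negative errno value instead of returning void, could look roughly like this. The `fake_dev` type and both functions are hypothetical and stand in for a real driver and the ethdev layer.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device; hw_fail simulates a hardware error on stop. */
struct fake_dev {
	bool started;
	bool hw_fail;
};

/* Callback type after the change: int instead of void. */
typedef int (*dev_stop_fn)(struct fake_dev *dev);

static int fake_dev_stop(struct fake_dev *dev)
{
	assert(dev != NULL);
	if (dev->hw_fail)
		return -EIO;	/* negative errno on error */
	dev->started = false;
	return 0;
}

/* Caller side: the error is propagated rather than silently dropped. */
static int port_stop(struct fake_dev *dev, dev_stop_fn stop)
{
	if (!dev->started)
		return 0;
	return stop(dev);
}
```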
Thomas Monjalon
8a5a0aad5d ethdev: allow close function to return an error
The API function rte_eth_dev_close() was returning void.
The return type is changed to int for notifying of errors.

If an error happens during a close operation,
the status of the port is undefined,
although as many resources as possible will have been freed.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Liron Himi <lironh@marvell.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-10-16 22:26:41 +02:00
Long Li
ac837bdd22 net/netvsc: fix multiple channel Rx
netvsc uses rxbuf_info buffer to track received packets attached via
rte_pktmbuf_attach_extbuf() and ack the host based on usage count. It
uses the transaction_id in the VMBus packet to locate where to use
memory in the rxbuf_info.

This is not correct in multiple channel setup, as different channels may
return identical transaction_ids at a time, and may corrupt the
rxbuf_info buffer.

Fix this by defining rxbuf_info for each queue.

Fixes: 4e9c73e96e ("net/netvsc: add Hyper-V network device")
Cc: stable@dpdk.org

Signed-off-by: Long Li <longli@microsoft.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2020-09-18 18:55:06 +02:00
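
The fix in commit ac837bdd22 above amounts to moving the tracking table from the device into each queue. The sketch below uses hypothetical types and a simple modulo mapping from transaction id to slot, which is an assumption made for illustration; it only shows that identical transaction ids on two queues no longer share a slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RXBUF_SLOTS 4

struct rxbuf_slot {
	uint32_t refs;
};

/* Each queue owns its own table instead of sharing one per device. */
struct rx_queue {
	struct rxbuf_slot rxbuf_info[RXBUF_SLOTS];
};

/* Locate the slot for a transaction id within one queue's table. */
static struct rxbuf_slot *rxbuf_lookup(struct rx_queue *q, uint64_t xactid)
{
	assert(q != NULL);
	return &q->rxbuf_info[xactid % RXBUF_SLOTS];
}
```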
Long Li
d43b8c7108 net/netvsc: fix underflow when Rx external mbuf
When rte_pktmbuf_attach_extbuf() is used, the driver should not decrease
the reference count in its callback function hn_rx_buf_free_cb, because
the reference count is already decreased by rte_pktmbuf. Doing it twice
may result in underflow and driver may never send an ack packet over
vmbus to host.

Also declares rxbuf_outstanding as atomic, because this value is shared
among all receive queues.

Fixes: 4e9c73e96e ("net/netvsc: add Hyper-V network device")
Cc: stable@dpdk.org

Signed-off-by: Long Li <longli@microsoft.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2020-07-11 06:18:53 +02:00
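
The underflow described in commit d43b8c7108 above comes from decrementing the same reference count in two places. In this sketch `ext_buf_release()` is a hypothetical stand-in for the mbuf library side and `rx_buf_free_cb()` for the driver callback; only the library side touches the count, while the callback updates a shared atomic counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

struct rxbuf {
	uint16_t refcnt;
};

/* Shared among all receive queues, hence atomic. */
static atomic_uint rxbuf_outstanding;

/* Library side: drop one reference, call the callback on the last one. */
static void ext_buf_release(struct rxbuf *b, void (*free_cb)(struct rxbuf *))
{
	assert(b != NULL && b->refcnt > 0);
	if (--b->refcnt == 0)
		free_cb(b);
}

/* Driver callback: must NOT decrement b->refcnt a second time. */
static void rx_buf_free_cb(struct rxbuf *b)
{
	(void)b;
	atomic_fetch_sub(&rxbuf_outstanding, 1);
	/* The real driver would ack the host over vmbus at this point. */
}
```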
Stephen Hemminger
a4f53bec7c net/netvsc: do not query VF link state
When the primary device link state is queried, there is no
need to query the VF state as well. The application only sees
the state of the synthetic device.

Fixes: dc7680e859 ("net/netvsc: support integrated VF")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2020-05-28 17:57:07 +02:00
Stephen Hemminger
b757deb8e3 net/netvsc: change datapath logging
The PMD_TX_LOG and PMD_RX_LOG can hide errors since this
debug log is typically disabled. Change the code to use
PMD_DRV_LOG for errors.

Under load, the ring buffer to the host can fill.
Add some statistics to estimate the impact and see other errors.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2020-05-28 17:57:07 +02:00
Stephen Hemminger
a41ef8eefe net/netvsc: implement descriptor status
These functions are useful for applications and debugging.
The netvsc PMD also transparently handles the rx/tx descriptor
functions for underlying VF device.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2020-05-28 17:57:07 +02:00
Stephen Hemminger
c7b82b14e3 net/netvsc: support per-queue info requests
There is not a lot of info here from this driver.
But worth supporting these additional info queries.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2020-05-28 17:57:07 +02:00
Stephen Hemminger
81938ebb54 net/netvsc: manage VF port under read/write lock
With multiple channels, the primary channel may receive notification
that VF has been added or removed while secondary channel is in
process of doing receive or transmit.  Resolve this race by converting
existing vf_lock to a reader/writer lock.

Users of lock (tx/rx/stats) acquire for read, and actions like
add/remove acquire it for write.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2020-05-11 22:27:39 +02:00
Stephen Hemminger
30408aab2d net/netvsc: fix memory free on device close
The netvsc PMD was putting the mac address in private data but the
core rte_ethdev doesn't allow that. It has to be in rte_malloc'd
memory or a message will be printed on shutdown/close.
 EAL: Invalid memory

Fixes: f8279f47dd ("net/netvsc: fix crash in secondary process")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2020-04-21 13:57:06 +02:00
Stephen Hemminger
cc02518132 net/netvsc: split send buffers from Tx descriptors
The VMBus has reserved transmit area (per device) and transmit
descriptors (per queue). The previous code was always having a 1:1
mapping between send buffers and descriptors.
This can lead to one queue starving another and also buffer bloat.

Change to working more like FreeBSD where there is a pool of transmit
descriptors per queue. If send buffer is not available then no
aggregation happens but the queue can still drain.

Fixes: 4e9c73e96e ("net/netvsc: add Hyper-V network device")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2020-04-21 13:57:06 +02:00
Ivan Ilchenko
ca041cd44f ethdev: change allmulticast callbacks to return status
Enabling/disabling of allmulticast mode is not always successful and
it should be taken into account to be able to handle it properly.

When correct return status is unclear from driver code, -EAGAIN is used.

Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
2019-10-07 15:00:55 +02:00
Igor Romanov
9970a9ad07 ethdev: make stats and xstats reset callbacks return int
Change return value of the callbacks from void to int. Make
implementations across all drivers return negative errno
values in case of error conditions.

Both callbacks are updated together because a large number of
drivers assign the same function to both callbacks.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-10-07 15:00:54 +02:00
Andrew Rybchenko
9039c81257 ethdev: change promiscuous callbacks to return status
Enabling/disabling of promiscuous mode is not always successful and
it should be taken into account to be able to handle it properly.

When correct return status is unclear from driver code, -EAGAIN is used.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
2019-10-07 15:00:54 +02:00
Ivan Ilchenko
2db039a759 net/netvsc: check status of getting ethdev info
rte_eth_dev_info_get() return value was changed from void to int,
so this patch modifies rte_eth_dev_info_get() usage across
net/netvsc according to its new return type.

Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-10-07 14:45:35 +02:00
Stephen Hemminger
cc9271f9e7 net/netvsc: fix xstats for VF device
The id values for VF stats were not being offset correctly.
And getting xstats for VF device only worked if VF device supported
it; it did not support the generic stats.

Fixes: dc7680e859 ("net/netvsc: support integrated VF")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2019-06-28 20:32:18 +02:00
Stephen Hemminger
92d23a57ca net/netvsc: support configuring RSS parameters
Add RSS hash key and reta update and query functions.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
2019-06-28 20:32:18 +02:00
Olivier Matz
6d13ea8e8e net: add rte prefix to ether structures
Add 'rte_' prefix to structures:
- rename struct ether_addr as struct rte_ether_addr.
- rename struct ether_hdr as struct rte_ether_hdr.
- rename struct vlan_hdr as struct rte_vlan_hdr.
- rename struct vxlan_hdr as struct rte_vxlan_hdr.
- rename struct vxlan_gpe_hdr as struct rte_vxlan_gpe_hdr.

Do not update the command line library to avoid adding a dependency to
librte_net.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-05-24 13:34:45 +02:00
Stephen Hemminger
8428da7285 net/netvsc: free all queues on close
When dev_close is called, the netvsc driver will clean up all
queues including the primary ring buffer.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
2019-05-03 18:45:23 +02:00
Stephen Hemminger
4a9efcddad net/netvsc: fix VF support with secondary process
The VF device management in netvsc was using a pointer into
rte_eth_devices. But the actual rte_eth_devices array is likely to
be placed elsewhere in the secondary process, which causes a crash.

The solution is to record the port of the VF (instead of a pointer)
and find the device in the per process array as needed.

Fixes: dc7680e859 ("net/netvsc: support integrated VF")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
2019-03-29 13:43:55 +01:00
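
The approach in commit 4a9efcddad above, storing a port number in shared data and resolving it against the per-process device array on each use, might be sketched like this. All names here are hypothetical; `devices[]` stands in for the per-process array and `shared_priv` for driver data visible to both processes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PORTS 8
#define VF_PORT_NONE UINT16_MAX

struct eth_dev {
	int in_use;
};

/* Per-process array: its address may differ between processes. */
static struct eth_dev devices[MAX_PORTS];

/* Shared data keeps a port id, which is valid in every process. */
struct shared_priv {
	uint16_t vf_port;
};

/* Resolve the port id against this process's own array on each use. */
static struct eth_dev *vf_dev_get(const struct shared_priv *priv)
{
	assert(priv != NULL);
	if (priv->vf_port == VF_PORT_NONE || priv->vf_port >= MAX_PORTS)
		return NULL;
	return &devices[priv->vf_port];
}
```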
Stephen Hemminger
c578d8507b net/netvsc: fix transmit descriptor pool cleanup
On device close or startup errors, the transmit descriptor pool
was being left behind.

Fixes: 4e9c73e96e ("net/netvsc: add Hyper-V network device")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
2018-12-21 16:22:40 +01:00
Stephen Hemminger
3c32fdf432 net/netvsc: support receive without VLAN strip
In some cases, VLAN stripping is not desirable. If necessary,
re-insert the stripped VLAN tag.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
2018-12-21 16:22:40 +01:00
Stephen Hemminger
7d146e1769 net/netvsc: support multicast/promiscuous settings on VF
Provide APIs to enable allmulticast and promiscuous in Netvsc PMD
with VF. This keeps the VF and PV path in sync.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
2018-10-11 18:53:48 +02:00
Stephen Hemminger
dc7680e859 net/netvsc: support integrated VF
Integrate accelerated networking support into netvsc PMD.
This allows netvsc to manage VF without using failsafe or vdev_netvsc.
For the exception vswitch path some tests like transmit
get a 22% increase in packets/sec.
For the VF path, the code is slightly shorter but has no
real change in performance.

Pro:
   * using netvsc is more like other DPDK NICs
   * the exception packet uses less CPU
   * much smaller code size
   * no locking required on VF transmit/receive path
   * no legacy Linux network device to get mangled by userspace
   * much simpler (1K vs 9K) LOC
   * unified extended statistics

Con:
   * using netvsc has more complex startup model
   * no bifurcated driver support
   * no flow support (since host does not have flow API).
   * no tunnel offload support
   * no receive interrupt support

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
2018-09-14 20:08:41 +02:00
Stephen Hemminger
f6ddcf80ad net/netvsc: implement link state change callback
Implement callback functionality on link state changes.
This is not really driven off an interrupt file descriptor as in most other
PMDs. Instead, it happens when a link state change message arrives
in the common ring buffer.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
2018-09-14 20:08:41 +02:00
Stephen Hemminger
85c4209189 net/netvsc: exhausting Tx descriptors is not an error
If the application sends faster than the vswitch can keep up, then the
transmit descriptor pool will be exhausted. This is not a failure,
so change the name of the statistic and don't include it in oerrors.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
2018-09-14 20:08:41 +02:00
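
The accounting change in commit 85c4209189 above can be illustrated with a small counter structure. The field name `ring_full` and both functions are hypothetical; the commit message does not give the new statistic's name. The point is only that descriptor exhaustion is counted separately and left out of the error total.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct tx_stats {
	uint64_t packets;
	uint64_t errors;	/* real transmit failures */
	uint64_t ring_full;	/* descriptor pool exhausted; not an error */
};

/* Try to send one packet using one descriptor from the pool. */
static bool tx_one(struct tx_stats *st, unsigned int *free_desc)
{
	assert(st != NULL && free_desc != NULL);
	if (*free_desc == 0) {
		st->ring_full++;
		return false;	/* caller may retry later */
	}
	(*free_desc)--;
	st->packets++;
	return true;
}

/* Value reported as output errors: exhaustion is excluded. */
static uint64_t tx_oerrors(const struct tx_stats *st)
{
	return st->errors;
}
```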
Stephen Hemminger
a25d39a3eb net/netvsc: allow tuning latency with devargs
Allow overriding the default guest-to-host latency on a per-device basis
with devargs.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
2018-09-14 20:08:41 +02:00
Stephen Hemminger
1f2766b7ee net/netvsc: resize event buffer as needed
Changing the event buffer to a fixed size caused a couple of issues.
The big one is that rte_free was still being called for a pointer
that was not set up with rte_malloc().

The event buffer was also too small to handle heavy receive
traffic; and running the event buffer out would crash
the application.

Fix by going back to a dynamically resized event buffer.
And grow it by 25% to avoid lots of realloc's.

Fixes: 530af95a78 ("bus/vmbus: avoid signalling host on read")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
2018-08-28 15:27:39 +02:00
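
The resizing policy from commit 1f2766b7ee above, growing the event buffer with 25% headroom so that reallocations stay rare, could be sketched as follows. The structure and function names are hypothetical, plain realloc() stands in for the DPDK allocator, and sizing to the requested length plus a quarter is an assumption about how the 25% is applied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct event_buf {
	void *data;
	size_t size;
};

/* Ensure the buffer holds at least need bytes; returns 0 or -1. */
static int event_buf_reserve(struct event_buf *eb, size_t need)
{
	size_t new_size;
	void *p;

	assert(eb != NULL);
	if (need <= eb->size)
		return 0;		/* already large enough */

	new_size = need + need / 4;	/* 25% headroom */
	p = realloc(eb->data, new_size);
	if (p == NULL)
		return -1;		/* old buffer is still valid */

	eb->data = p;
	eb->size = new_size;
	return 0;
}
```

A request slightly above the previous one is then served from the headroom without another allocation.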
Stephen Hemminger
7a866f0d1b net/netvsc: implement free Tx mbuf on demand
Add tx_done_cleanup ethdev hook to allow application to
control if/when it wants completions to be handled.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
2018-08-28 15:27:39 +02:00
Stephen Hemminger
0312753ef2 net/netvsc: set lower host latency
Tune the vmbus connection so the host scans faster. This improves
transmit performance. The host default value is 100us but setting
to 50us reduces packet loss significantly.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
2018-08-28 15:27:39 +02:00
Stephen Hemminger
530af95a78 bus/vmbus: avoid signalling host on read
Don't signal host that receive ring has been read until all events
have been processed. This reduces the number of guest exits and
therefore improves performance.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
2018-08-05 11:03:18 +02:00
Stephen Hemminger
2d2c4991b4 net/netvsc: add queue info
This helps when diagnosing ring issues in testpmd.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
2018-08-05 11:03:15 +02:00
Stephen Hemminger
4e9c73e96e net/netvsc: add Hyper-V network device
The driver supports Hyper-V networking directly like
virtio for KVM or vmxnet3 for VMware.

This code is based off of the FreeBSD driver. The file and variable
names are kept the same to help with understanding (with most of the
BSD style warts removed).

This version supports the latest NetVSP 6.1 version and
older versions.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
2018-07-13 23:48:07 +02:00