Commit Graph

53 Commits

Author SHA1 Message Date
Michal Krawczyk
1c808fcd85 Allocate BAR for ENA MSIx vector table
In the new ENA-based instances like c6gn, the vector table moved to a
new PCIe bar - BAR1. Previously it was always located on the BAR0, so
the resources were already allocated together with the registers.

As the FreeBSD isn't doing any resource allocation behind the scenes,
the driver is responsible to allocate them explicitly, before other
parts of the OS (like the PCI code allocating MSIx) will be able to
access them.

To determine dynamically BAR on which the MSIx vector table is present
the pci_msix_table_bar() is being used and the new BAR is allocated if
needed.

Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc
MFC after: 3 days
2021-02-18 13:54:36 +01:00
Marcin Wojtas
7dee315ed7 Update ENA driver version to v2.3.0
The v2.3.0 introduces new ena_com layer, ENI metrics updates and SPDX
license tags.

Submitted by:   Michal Krawczyk <mk@semihalf.com>
Obtained from:  Semihalf
Sponsored by:   Amazon, Inc
MFC after:      1 week
Differential revision:  https://reviews.freebsd.org/D27120
2020-11-18 15:25:38 +00:00
Marcin Wojtas
7d2e6f207e Rename descriptions of the supported ENA devices
Some of the PCI ID were described as ENA with LLQ support - it's not
fully accurate and because of that, their names were changed.

Instead of LLQ, use RSERV0 for the description of those devices.

Submitted by:   Michal Krawczyk <mk@semihalf.com>
Obtained from:  Semihalf
Sponsored by:   Amazon, Inc
MFC after:      1 week
Differential revision:  https://reviews.freebsd.org/D27119
2020-11-18 15:20:01 +00:00
Marcin Wojtas
f180142c76 Add ENI metrics for the ENA driver
The new HAL allows the driver to read extra ENI stats. Exact meaning of
each of them can be found in base/ena_defs/ena_admin_defs.h file and
structure ena_admin_eni_stats.

Those stats are being updated inside of the timer service, which is
executed every second.
ENI metrics are turned off by default. They can be enabled, using the
sysctl node: dev.ena.X.eni_metrics.update_delay
0 value in this node means that the update is turned off. Other values
determine how many seconds must pass, before ENI metrics will be
updated.

They can be acquired, using sysctl:

sysctl dev.ena.X.eni_metrics

Where X stands for the interface number.

Submitted by:   Michal Krawczyk <mk@semihalf.com>
Obtained from:  Semihalf
Sponsored by:   Amazon, Inc
MFC after:      1 week
Differential revision: https://reviews.freebsd.org/D27118
2020-11-18 15:17:55 +00:00
Marcin Wojtas
0835cc783b Add SPDX license tag to the ENA driver files
Refering to guide: https://wiki.freebsd.org/SPDX the SPDX tag should not
replace the standard license text, however it should be added over the
standard license text to make the automation easier.

Because of that, the old license was kept, but the SPDX tag was added
on top of every ENA driver file.

Submited by:    Michal Krawczyk <mk@semihalf.com>
Obtained from:  Semihalf
Sponsored by:   Amazon, Inc
MFC after:      1 week
Differential revision:  https://reviews.freebsd.org/D27117
2020-11-18 15:07:34 +00:00
Marcin Wojtas
9eb1615f33 Adjust ENA driver files to latest ena-com changes
* Use the new API of ena_trace_*
* Fix typo syndrom --> syndrome
* Remove validation of the Rx req ID (already performed in the ena-com)
* Remove usage of deprecated ENA_ASSERT macro

Submitted by:   Ido Segev <idose@amazon.com>
Submitted by:   Michal Krawczyk <mk@semihalf.com>
Obtained from:  Semihalf
Sponsored by:   Amazon, Inc
MFC after:      1 week
Differential revision:  https://reviews.freebsd.org/D27115
2020-11-18 14:59:22 +00:00
Marcin Wojtas
2287afd818 Update ENA driver version to v2.2.0
Driver version upgrade is connected with support for the new device
fetures, like Tx drops reporting or disabling meta caching.

Moreover, the driver configuration from the sysctl was reworked to
provide safer and better flow for configuring:
* number of IO queues (new feature),
* drbr size on Tx,
* Rx queue size.

Moreover, a lot of minor bug fixes and improvements were added.

Copyright date in the license of the modified files in this release was
updated to 2020.

Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
2020-05-26 16:11:46 +00:00
Marcin Wojtas
0b432b702e Allow disabling meta caching for ENA Tx path
Determined by a flag passed from the device. No metadata is set within
ena_tx_csum when caching is disabled.

Submitted by:  Maciej Bielski <mba@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 16:00:30 +00:00
Marcin Wojtas
9762a033da Create ENA IO queues with optional backoff
If requested size of IO queues is not supported try to decrease it until
finding the highest value that can be satisfied.

Submitted by:  Maciej Bielski <mba@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:58:48 +00:00
Marcin Wojtas
56d41ad5fe Add sysctl node for ENA IO queues number adjustment
By default, in ena_attach() the driver attempts to acquire
ena_adapter::max_num_io_queues MSI-X vectors for the purpose of IO
queues, however this is not guaranteed. The number of vectors acquired
depends also on system resources availability.

Regardless of that, enable the number of effectively used IO queues to
be further limited through the sysctl node.

Example: Assumming that there are 8 IO queues configured by default, the
command

$ sysctl dev.ena.0.io_queues_nb=4

will reduce the number of available IO queues to 4. Similarly, the value
can be also increased up to maximum supported value. A value higher than
maximum supported number of IO queues is ignored. Zero is ignored too.

Submitted by:  Maciej Bielski <mba@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:57:02 +00:00
Marcin Wojtas
2182354622 Rework ENA Tx buffer ring size reconfiguration
This method has been aligned with the way how the Rx queue size is being
updated - so it's now done synchronously instead of resetting the
device.

Moreover, the input parameter is now being validated if it's a power of
2. Without this, it can cause kernel panic.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:50:30 +00:00
Marcin Wojtas
7d8c4fee95 Rework ENA Rx queue size configuration
This patch reworks how the Rx queue size is being reconfigured and how
the information from the device is being processed.

Reconfiguration of the queues and reset of the device in order to make
the changes alive isn't the best approach. It can be done synchronously
and it will let to pass information if the reconfiguration was
successful to the user. It now is done in the ena_update_queue_size()
function.

To avoid reallocation of the ring buffer, statistic counters and the
reinitialization of the mutexes when only new size has to be assigned,
the io queues initialization function has been split into 2 stages:
basic, which is just copying appropriate fields and the advanced, which
allocates and inits more advanced structures for the IO rings.

Moreover, now the max allowed Rx and Tx ring size is being kept
statically in the adapter and the size of the variables holding those
values has been changed to uint32_t everywhere.

Information about IO queues size is now being logged in the up routine
instead of the attach.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:48:06 +00:00
Marcin Wojtas
02a2a7cea4 Expose argument names for non static ENA driver functions
As functions which are declared in the header files are intended to be
the interface and are going to be used by other files, it's better to
include argument names in the definition, so the caller won't have to
check the .c file in order to check their meaning and order.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:41:53 +00:00
Marcin Wojtas
6959869eae Use single global lock in the ENA driver
Currently, the driver had 2 global locks - one was sx lock used for
up/down synchronization and the second one was mutex, which was used
for link configuration and timer service callout.

It is better to have single lock for that. We cannot use mutex, as it
can sleep and cause witness errors in up/down configuration, so sx lock
seems to be the only choice.

Callout cannot use sx lock, but the timer service is MP safe, so we just
need to avoid race between ena_down() and ena_detach(). It can be
avoided by acquiring sx lock.

Simple macros were added that are encapsulating implementation of the
lock and makes the code cleaner.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:39:41 +00:00
Marcin Wojtas
7926bc4492 Add trigger reset function in the ENA driver
As the reset triggering is no longer a simple macro that was just
setting appropriate flag, the new function for triggering reset was
added. It improves code readability a lot, as we are avoiding additional
indentation.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:37:55 +00:00
Marcin Wojtas
6c84cec373 Enable Tx drops reporting in the ENA driver
Tx drops statistics are fetched from HW every ena_keepalive_wd() call
and are observable using one of the commands:
* sysctl dev.ena.0.hw_stats.tx_drops
* netstat -I ena0 -d

Submitted by:  Maciej Bielski <mba@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:31:28 +00:00
Marcin Wojtas
8483b844e7 Adjust ENA driver to the new HAL
* Removed adaptive interrupt moderation (not suported on FreeBSD).
* Use ena_com_free_q_entries instead of ena_com_free_desc.
* Don't use ENA_MEM_FREE outside of the ena_com.
* Don't use barriers before calling doorbells as it's already done in
  the HAL.
* Add function that generates random RSS key, common for all driver's
  interfaces.
* Change admin stats sysctls to U64.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:29:19 +00:00
Marcin Wojtas
04cf2b885d Optimize ENA Rx refill for low memory conditions
Sometimes, especially when there is not much memory in the system left,
allocating mbuf jumbo clusters (like 9KB or 16KB) can take a lot of time
and it is not guaranteed that it'll succeed. In that situation, the
fallback will work, but if the refill needs to take a place for a lot of
descriptors at once, the time spent in m_getjcl looking for memory can
cause system unresponsiveness due to high priority of the Rx task. This
can also lead to driver reset, because Tx cleanup routine is being
blocked and timer service could detect that Tx packets aren't cleaned
up. The reset routine can further create another unresponsiveness - Rx
rings are being refilled there, so m_getjcl will again burn the CPU.
This was causing NVMe driver timeouts and resets, because network driver
is having higher priority.

Instead of 16KB jumbo clusters for the Rx buffers, 9KB clusters are
enough - ENA MTU is being set to 9K anyway, so it's very unlikely that
more space than 9KB will be needed.

However, 9KB jumbo clusters can still cause issues, so by default the
page size mbuf cluster will be used for the Rx descriptors. This can have a
small (~2%) impact on the throughput of the device, so to restore
original behavior, one must change sysctl "hw.ena.enable_9k_mbufs" to
"1" in "/boot/loader.conf" file.

As a part of this patch (important fix), the version of the driver
was updated to v2.1.2.

Submitted by:   cperciva
Reviewed by:    Michal Krawczyk <mk@semihalf.com>
Reviewed by:    Ido Segev <idose@amazon.com>
Reviewed by:    Guy Tzalik <gtzalik@amazon.com>
MFC after:      3 days
PR:             225791, 234838, 235856, 236989, 243531
Differential Revision: https://reviews.freebsd.org/D24546
2020-05-07 11:28:39 +00:00
Marcin Wojtas
888810f0fb Rework and simplify Tx DMA mapping in ENA
Driver working in LLQ mode in some cases can send only few last segments
of the mbuf using DMA engine, and the rest of them are sent to the
device using direct PCI transaction. To map the only necessary data, two DMA
maps were used. That solution was very rough and was causing a bug - if
both maps were used (head_map and seg_map), there was a race in between
two flows on two queues and the device was receiving corrupted
data which could be further received on the other host if the Tx cksum
offload was enabled.

As it's ok to map whole mbuf and then send to the device only needed
segments, the design was simplified to use only single DMA map.

The driver version was updated to v2.1.1 as it's important bug fix.

Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
2020-02-24 15:35:31 +00:00
Warner Losh
1c028d7227 Make valdiate_rx_req_id static inline because it uses other static
inline functions. gcc complains about this, most likely due to
the subtle differences between inline and static inline functions
defined in headers.
2019-11-02 02:05:09 +00:00
Marcin Wojtas
2731abe8e6 Update ENA version to v2.1.0
In this release the netmap support was introduced.

Moreover, it is also now possible to use the LLQ mode of the driver on
the arm64 AWS instances (A1 type).

Differential Revision: https://reviews.freebsd.org/D21938
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2019-10-31 16:03:43 +00:00
Marcin Wojtas
6f2128c7ed Add support for ENA NETMAP Tx
Two new tables are added to ena_tx_buffer structure:
* netmap_map_seg stores DMA mapping structures,
* netmap_buf_idx stores buff indexes taken from the slots.

When Tx resources are being set, the new mapping structures are created
and netmap Tx rings are being reset.

When Tx resources are being released, used netmap bufs are unmapped from
DMA and then mapping structures are destroyed.

When Tx interrupt occurrs, ena_netmap_tx_irq is called.

ena_netmap_txsync callback signalizes that there are new packets which
should be transmitted.
First, it fills ena_netmap_ctx. Then it performs two actions:
* ena_netmap_tx_frames moves packets from netmap ring to NIC,
* ena_netmap_tx_cleanup restores buffers from NIC and gives them back
to the userspace app.
0 is returned in case of Tx error that could be handled by the driver.

ena_netmap_tx_frames checks if there are packets ready for transmission.
Then, for each of them, ena_netmap_tx_frame is called. If error occurs,
transmitting is stopped, but if the error was cause due to HW ring being
full, information about that is not propagated to the userspace app.
When all packets are ready, doorbell is written to NIC and netmap ring
state is updated.

Parsing of one packet is done by the ena_netmap_tx_frame function.
First, it checks if number of slots does not exceed NIC limit. Invalid
packets are being dropped and the error is propagated to the upper
layer. As each netmap buffer has equal size, which is typically greater
then 2KiB, there shouldn't be any packets which contain too many slots.
Then, the ena_com_tx_ctx structure is being filled. As netmap does not
support any hardware offloads, ena_com_tx_meta structure is set to zero.
After that, ena_netmap_map_slots maps all memory slots for DMA.
If the device works in the LLQ mode, the push header is being determined
by checking if the header fits within the first socket.
If so, the portion of data is being copied directly from the slot.
In other case, the data is copied to the intermediate buffer.
First slots are treated the same as as the others, because DMA mapping
has no impact on LLQ mode. Index of each netmap buffer is taken from
slot and stored in netmap_buf_idx array. In case of mapping error,
memory is unmapped and packets are put back to the netmap ring.

ena_netmap_tx_cleanup performs out of order cleanup of sent buffers.
First, req_id is taken and is validated. As validate_tx_req_id from
ena.c is specific to kernels mbuf, another implementation is provided.
Each req_id is cleaned up by ena_netmap_tx_clean_one function. Buffers
are being unmaped from DMA and put back to netmap ring. In the end,
state of netmap and NIC rings are being updated.

Differential Revision: https://reviews.freebsd.org/D21936
Submitted by: Rafal Kozik <rk@semihalf.com>
              Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2019-10-31 15:59:29 +00:00
Marcin Wojtas
9a0f2079ca Add support for ENA NETMAP Rx
Most of code used for Rx ring initialization could be reused in NETMAP.
Reset of NETMAP ring and new alloc method was added. Driver decides if
use kernels mbufs or NETMAPs slots based on IFCAP_NETMAP flag. It
allows to reuse ena_refill_rx_bufs, which provides proper handling of
Rx out of order completion.

ena_netmap_alloc_rx_slot takes exactly the same arguments as
ena_alloc_rx_mbuf, but instead of allocating one mbuf it takes one slot
from NETMAP ring. Based on queue id proper netmap_ring is found. As
NETMAP provides the "partial opening" feature not all of the rings are
avaiable. Not used points to invalid ring. If there is available slot,
it is taken from the ring. Its buffer is mapped to DMA and its index is
stored in ena_rx_buffer field in ena_rx_buffer structure. Then ena_buf
is filled with addresses and ring state is updated.

Cleanup is handled by ena_netmap_free_rx_slot. It unmaps DMA and returns
buffer to ring. As we could not return more bufs than we have taken and
we should not override occupied slots, buf_index should be 0. It is
being checked by assertion.

ena_netmap_rxsync callback puts received packets back to NETMAP ring and
passes them to user space by updating ring pointers. First it fills
ena_netmap_ctx.
Then it performs two actions:
* ena_netmap_rx_frames moves received frames from NIC to NETMAP ring,
* ena_netmap_rx_cleanup fills NIC ring with slots released by userspace
app.

In case of Rx error that could be handled by NIC driver (for example by
performing reset) rx sync should return 0.

ena_netmap_rx_frames first checks if NETMAP ring is in consistent
state and then in the loop receives new frames. When all available
frames are taken nr_hwtail is updated.

Receiving one frame is handled by ena_netmap_rx_frame. If no error
occurrs, each Descriptor is loaded by ena_netmap_rx_load_desc function.
If packets take more than one segments NS_MOREFRAG flag must be set in
all, but not last slot. In case of wrong req_id packet is removed from
NETMAP ring. If packet is successful received counters are updated.

Refiling of NIC ring is performed by ena_netmap_rx_cleanup function.
It calculates number of available slots and call ena_refill_rx_bufs with
proper number.

Differential Revision: https://reviews.freebsd.org/D21935
Submitted by: Rafal Kozik <rk@semihalf.com>
              Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2019-10-31 15:57:44 +00:00
Marcin Wojtas
38c7b96517 Split Rx/Tx from initialization code in ENA driver
Move Rx/Tx routines to separate file.
Some functions:
* ena_restore_device,
* ena_destroy_device,
* ena_up,
* ena_down,
* ena_refill_rx_bufs
could be reused in upcoming netmap code in the driver. To make it
possible, they were moved to ena.h header.

Differential Revision: https://reviews.freebsd.org/D21933
Submitted by:  Rafal Kozik <rk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2019-10-31 15:44:26 +00:00
Marcin Wojtas
9d0073e413 Update ENA version to v2.0.0
ENAv2 introduces many new features, bug fixes and improvements.

Main new features are LLQ (Low Latency Queues) and independent queues
reconfiguration using sysctl commands.

The year in copyright notice was updated to 2019.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2019-05-30 13:52:32 +00:00
Marcin Wojtas
32f63fa7f9 Split ENA reset routine into restore and destroy stages
For alignment with Linux driver and better handling ena_detach(), the
reset is now calling ena_device_restore() and ena_device_destroy().

The ena_device_destroy() is also being called on ena_detach(), so the
code will be more readable.

The watchdog is now being activated after reset only, if it was active
before.

There were added additional checks to ensure, that there is no race with
the link state change AENQ handler.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2019-05-30 13:39:25 +00:00
Marcin Wojtas
fd43fd2af0 Use bitfield for storing global ENA device states
As the ENA can have multiple states turned on/off, it is more convenient
to store them in single bitfield instead of multiple boolean variables.

The bitset FreeBSD API was used for the bitfield implementation, as it
provides flexible structure together with API which also supports atomic
bitfield operations.

For better readability basic macros from API were wrapped into custom
ENA_FLAG_* macros, which are filling up common parameters for all calls.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2019-05-30 13:37:15 +00:00
Marcin Wojtas
af66d7d029 Add additional doorbells on ENA Tx path
The new ENA HAL is introducing API, which can determine on Tx path if
the doorbell is needed.

That way, it can tell the driver, that it should call an doorbell.
The old threshold value wasn't removed, as not all HW is supporting this
feature - so it was reworked to also work with the new API.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2019-05-30 13:33:31 +00:00
Marcin Wojtas
82f5a7921c Limit maximum size of Rx refill threshold in ENA
The Rx ring size can be as high as 8k. Because of that we want to limit
the cleanup threshold by maximum value of 256.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2019-05-30 13:31:35 +00:00
Marcin Wojtas
4fa9e02d9b Add support for the LLQv2 and WC in ENA
LLQ (Low Latency Queue) is the feature, that allows pushing header
directly to the device through PCI before even DMA is triggered.

It reduces latency, because device can start preparing packet before
payload is sent through DMA.

To speed up sending data through PCI, the Write Combining is enabled,
which allows hardware to buffer data before sending them on the PCI - it
allows to reduce number of PCI IO operations.

ENAv2 is using special descriptor for the negotiation of the LLQ.
Currently, only the default configuration is supported.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2019-05-30 13:30:52 +00:00
Marcin Wojtas
5cb9db0706 Lock optimization in ENA
Handle IO interrupts using filter routine. That way, the main cleanup
task could be moved to the separate thread using taskqueue.

The deferred Rx cleanup task was removed, and now the cleanup task is
begin called instead. That way, the Rx lock could be removed.

In addition, Queue management (wake up and stop TX ring) was added, so
the TX cleanup task can be performed mostly lockless.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2019-05-30 13:29:24 +00:00
Marcin Wojtas
6064f2899f Add tuneable drbr ring size and hw queues depth for ENA
The driver now supports per adapter tuning of buffer ring size and HW Rx
ring size.

It can be achieved using sysctl node dev.ena.X.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2019-05-30 13:28:03 +00:00
Marcin Wojtas
d12f7bfc17 Check for missing MSI-x and Tx completions in ENA
If the first MSI-x won't be executed, then the timer service will detect
that and trigger device reset.

The checking for missing Tx completion was reworked, so it will also
check for missing interrupts. Checking number of missing Tx completions
can be performed after loop, instead of checking it every iteration.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2019-05-30 13:16:56 +00:00
Marcin Wojtas
c2e7e247bf Prevent double activation of admin interrupt in ENA
The resource is already being activated in the bus_alloc_resource(),
because the flag RF_ACTIVE is being passed.

Double activation on arm64 is causing kernel panic.

Version of the driver was upgraded to 0.8.4.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Reported-by:   Greg V <greg@unrelenting.technology>
Tested-by:     cperciva, Greg V <greg@unrelenting.technology>
Obtained from: Semihalf
MFC after:     2 weeks
Sponsored by:  Amazon, Inc.
Differential revision: https://reviews.freebsd.org/D19655
2019-03-21 10:46:10 +00:00
Marcin Wojtas
1d65b4c095 Do not use ntc for obtaining buffer on Rx in the ENA
In out of order mode Rx buffer are accesses by req_id.
Accessing and validating mbuf using ntc is causing false error.

Increase driver revision after latest RX OOO completion fixes.

Submitted by: Rafal Kozik <rk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
MFC after: 1 week
2019-02-15 10:40:41 +00:00
Marcin Wojtas
4c220feb7e Suppress excessive error prints in ENA TX hotpath
In FreeBSD, this is normal situation that the Tx ring is being full. In
hat case, the packet is put back into drbr and the next attempt to send
it is taken after the cleanup.

Too much logs like this can cause system instability and even cause the
device reset (because keep alive or cleanup could be missed).

To fix that, the log level of this message is changed to debug.

Upon this change upgrade the driver version to v0.8.2.

Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
2019-01-16 02:13:21 +00:00
Warner Losh
40abe76bf0 Add PNP info to PCI attachment of ena driver
Make unsigned values uint16_t for pnp table. They are properly
uint16_t befause they are 16-bit PCI IDs. The PNP_INFO language has no
type for bare unsigned.

Reviewed by: imp, chuck
Submitted by: Lakhan Shiva Kamireddy <lakhanshiva@gmail.com>
Sponsored by: Google, Inc. (GSoC 2018)
Pull Request: https://github.com/bsdimp/freebsd/pull/5
2018-07-08 20:39:38 +00:00
Marcin Wojtas
fbb0ed71b2 Upgrade ENA version to v0.8.1
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
2018-05-10 09:06:21 +00:00
Marcin Wojtas
4727bda6f2 Allow usage of more RX descriptors than 1 in ENA driver
Using only 1 descriptor on RX could be an issue, if system would be low
on resources and could not provide driver with large chunks of
contiguous memory.

Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12871
2017-11-09 13:36:42 +00:00
Marcin Wojtas
3cfadb28c3 Read max MTU from the ENA device
The device now provides driver with max available MTU value it
can handle.

The function setting MTU for the interface was simplified and reworked
to follow up this changes.

Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12870
2017-11-09 13:35:07 +00:00
Marcin Wojtas
5a9902126e Cleanup of the ENA driver header file
Remove unused macros and fields - some of them were only initialized,
without further usage.

Implement minor style fixes and add required comments.

On the occasion add missing TX completion counter, which was existing,
but mistakenly remained unused.

Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12864
2017-11-09 12:07:02 +00:00
Marcin Wojtas
8805021a2a Allow partial MSI-x allocation in ENA driver
The situation, where part of the MSI-x was not configured properly, was
not properly handled. Now, the driver reduces number of queues to
reflect number of existing and properly configured MSI-x vectors.

Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12863
2017-11-09 12:03:06 +00:00
Marcin Wojtas
0052f3b580 Remove deprecated and unused counters in ENA driver
Few counters were imported from the Linux driver and never used,
because of differences between the Linux and FreeBSD APIs.

Queue stops and resumes are no longer supported by the driver and
counters were incremented indicating false events.

Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: rlibby
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12862
2017-11-09 12:01:46 +00:00
Marcin Wojtas
0bdffe5959 Refactor style of the ENA driver
* Change all conditional checks in "if" statement to boolean expressions
* Initialize variables with too complex values outside the declaration
* Fix indentations
* Move code associated with sysctls to ena_sysctl.c file
* For consistency, remove unnecesary "return" from void functions
* Use if_getdrvflags() function instead of accesing variable directly

Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12860
2017-11-09 11:57:02 +00:00
Marcin Wojtas
efe6ab18da Check for Rx ring state to prevent from stall in the ENA driver
In case when Rx ring is full and driver will fail to allocate Rx mbufs,
the ring could be stalled.

Keep alive is checking every second for Rx ring state, and if it is full
for two cycles, then trigger rx_cleanup routine in another thread.

Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12856
2017-11-09 11:48:22 +00:00
Marcin Wojtas
43fefd1629 Add RX OOO completion feature
The RX out of order completion feature, allows to complete RX
descriptors out of order, by keeping trace of all free descriptors in
the separate array.

Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12855
2017-11-09 11:45:59 +00:00
Marcin Wojtas
30217e2dff Rework counting of hardware statistics in ENA driver
Do not read all statistics from the device, instead count them in the
driver except from RX drops - they are received directly from the NIC
in the AENQ descriptor.

Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: imp
Obtained from: Semihalf
Sponsored by: Amazon.com, Inc.
Differential Revision: https://reviews.freebsd.org/D12852
2017-10-31 16:31:23 +00:00
Marcin Wojtas
a195fab02b Update ena-com HAL to v1.1.4.3 and update driver accordingly
The newest ena-com HAL supports LLQv2 and introduces
API changes. In order not to break the driver compilation
it was updated/fixed in a following way:

* Change version of the driver to 0.8.0
* Provide reset cause when triggering reset of the device
* Reset device after attach fails
* In the reset task free management irq after calling ena_down. Admin
  queue can still be used before ena_down is called, or when it is
  being handled
* Do not reset device if ena_reset_task fails
* Move call of the ena_com_dev_reset to the ena_down() routine - it
  should be called only if interface was up
* Use different function for checking empty space on the sq ring
  (ena-com API change)
* Fix typo on ENA_TX_CLEANUP_THRESHOLD
* Change checking for EPERM with EOPNOTSUPP - change in the ena-com API
* Minor style fixes

Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Amazon.com, Inc.
               Semihalf
Sponsored by: Amazon.com, Inc.
Differential Revision: https://reviews.freebsd.org/D12143
2017-10-31 12:41:07 +00:00
Zbigniew Bodek
1b069f1c69 Replace mbuf defragmentation with collapse
Collapse should be more effective than defragmentation.
Added missing declaration of ena_check_and_collapse_mbuf().

Submitted by:   Michal Krawczyk <mk@semihalf.com>
Obtained from:  Semihalf
Sponsored by:   Amazon.com Inc.
2017-07-04 00:10:29 +00:00
Zbigniew Bodek
8a573700af Fix creation of dma tags and TSO settings
TSO settings were not reflecting real HW capabilities.

DMA tags were created with wrong window - high address was the same as
low, so excluding window was not working.

Capabilities of TX dma transaction were not set properly - TSO max size
had been increased and size of one segment had been adjusted.

Submitted by:   Michal Krawczyk <mk@semihalf.com>
Obtained from:  Semihalf
Sponsored by:   Amazon.com Inc.
2017-07-04 00:08:47 +00:00