Add support for Amazon Elastic Network Adapter (ENA) NIC

ENA is a networking interface designed to make good use of modern CPU
features and system architectures.

The ENA device exposes a lightweight management interface with a
minimal set of memory mapped registers and extendable command set
through an Admin Queue.

The driver supports a range of ENA devices, is link-speed independent
(i.e., the same driver is used for 10GbE, 25GbE, 40GbE, etc.), and has
a negotiated and extendable feature set.

Some ENA devices support SR-IOV. This driver is used for both the
SR-IOV Physical Function (PF) and Virtual Function (VF) devices.

ENA devices enable high speed and low overhead network traffic
processing by providing multiple Tx/Rx queue pairs (the maximum number
is advertised by the device via the Admin Queue), a dedicated MSI-X
interrupt vector per Tx/Rx queue pair, and CPU cacheline optimized
data placement.

The ENA driver supports industry standard TCP/IP offload features such
as checksum offload and TCP transmit segmentation offload (TSO).
Receive-side scaling (RSS) is supported for multi-core scaling.

The ENA driver and its corresponding devices implement health
monitoring mechanisms such as watchdog, enabling the device and driver
to recover in a manner transparent to the application, as well as
debug logs.

Some ENA devices support a working mode called Low-latency Queue
(LLQ), which saves several more microseconds. This feature will be
implemented in the driver in a future release.

Submitted by:	Michal Krawczyk <mk@semihalf.com>
		Jakub Palider <jpa@semihalf.com>
		Jan Medala <jan@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon.com Inc.
Differential revision: https://reviews.freebsd.org/D10427
Zbigniew Bodek 2017-05-22 14:46:13 +00:00
parent fefd043963
commit 9b8d05b8ac
Notes: svn2git 2020-12-20 02:59:44 +00:00
svn path=/head/; revision=318647
8 changed files with 4794 additions and 0 deletions

255
share/man/man4/ena.4 Normal file

@ -0,0 +1,255 @@
.\" Copyright (c) 2015-2017 Amazon.com, Inc. or its affiliates.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\"
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\"
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in
.\" the documentation and/or other materials provided with the
.\" distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
.\" "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
.\" LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
.\" A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
.\" OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
.\" SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
.\" LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
.\" DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
.\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
.\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
.\" OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd May 04, 2017
.Dt ENA 4
.Os
.Sh NAME
.Nm ena
.Nd "FreeBSD kernel driver for Elastic Network Adapter (ENA) family"
.Sh SYNOPSIS
To compile this driver into the kernel,
place the following line in your
kernel configuration file:
.Bd -ragged -offset indent
.Cd "device ena"
.Ed
.Pp
Alternatively, to load the driver as a
module at boot time, place the following line in
.Xr loader.conf 5 :
.Bd -literal -offset indent
if_ena_load="YES"
.Ed
.Sh DESCRIPTION
The ENA is a networking interface designed to make good use of modern CPU
features and system architectures.
.Pp
The ENA device exposes a lightweight management interface with a
minimal set of memory mapped registers and extendable command set
through an Admin Queue.
.Pp
The driver supports a range of ENA devices, is link-speed independent
(i.e., the same driver is used for 10GbE, 25GbE, 40GbE, etc.), and has
a negotiated and extendable feature set.
.Pp
Some ENA devices support SR-IOV. This driver is used for both the
SR-IOV Physical Function (PF) and Virtual Function (VF) devices.
.Pp
The ENA devices enable high speed and low overhead network traffic
processing by providing multiple Tx/Rx queue pairs (the maximum number
is advertised by the device via the Admin Queue), a dedicated MSI-X
interrupt vector per Tx/Rx queue pair, and CPU cacheline optimized
data placement.
.Pp
The
.Nm
driver supports industry standard TCP/IP offload features such
as checksum offload and TCP transmit segmentation offload (TSO).
Receive-side scaling (RSS) is supported for multi-core scaling.
.Pp
The
.Nm
driver and its corresponding devices implement health
monitoring mechanisms such as watchdog, enabling the device and driver
to recover in a manner transparent to the application, as well as
debug logs.
.Pp
Some ENA devices support a working mode called Low-latency
Queue (LLQ), which saves several more microseconds.
This feature will be implemented in the driver in a future release.
.Sh HARDWARE
Supported PCI vendor ID/device IDs:
.Pp
.Bl -bullet -compact
.It
1d0f:0ec2 - ENA PF
.It
1d0f:1ec2 - ENA PF with LLQ support
.It
1d0f:ec20 - ENA VF
.It
1d0f:ec21 - ENA VF with LLQ support
.El
.Sh DIAGNOSTICS
.Ss Device initialization phase:
.Bl -diag
.It ena%d: failed to init mmio read less
.Pp
An error occurred during initialization of the MMIO register read request.
.It ena%d: Can not reset device
.Pp
Device could not be reset; the device may not be responding or is already
being reset.
.It ena%d: device version is too low
.Pp
Version of the controller is too low and it is not supported by the driver.
.It ena%d: Invalid dma width value %d
.Pp
The controller is able to request a DMA transaction width.
The device either stopped responding or requested an invalid value.
.It ena%d: Can not initialize ena admin queue with device
.Pp
Initialization of the Admin Queue failed; device may not be responding or there
was a problem with initialization of the resources.
.It ena%d: Cannot get attribute for ena device rc: %d
.Pp
Failed to get attributes of the device from the controller.
.It ena%d: Cannot configure aenq groups rc: %d
.Pp
Errors occurred when trying to configure AENQ groups.
.El
.Ss Driver initialization/shutdown phase:
.Bl -diag
.It ena%d: PCI resource allocation failed!
.It ena%d: allocating ena_dev failed
.It ena%d: failed to pmap registers bar
.It ena%d: Error while setting up bufring
.It ena%d: Error with initialization of IO rings
.It ena%d: can not allocate ifnet structure
.It ena%d: Error with network interface setup
.It ena%d: Failed to enable and set the admin interrupts
.It ena%d: Failed to allocate %d, vectors %d
.It ena%d: Failed to enable MSIX, vectors %d rc %d
.It ena%d: Error with MSI-X enablement
.It ena%d: could not allocate irq vector: %d
.It ena%d: Unable to allocate bus resource: registers
.Pp
Resource allocation failed when initializing the device; driver will not
be attached.
.It ena%d: ENA device init failed (err: %d)
.Pp
Device initialization failed; driver will not be attached.
.It ena%d: could not activate irq vector: %d
.Pp
An error occurred when trying to activate interrupt vectors for the Admin Queue.
.It ena%d: failed to register interrupt handler for irq %ju: %d
.Pp
An error occurred when trying to register the Admin Queue interrupt handler.
.It ena%d: Cannot setup mgmnt queue intr
.Pp
An error occurred during configuration of the Admin Queue interrupts.
.It ena%d: Enable MSI-X failed
.Pp
Configuration of MSI-X for the Admin Queue failed; resources may be
insufficient or the interrupts could not be configured; driver will
not be attached.
.It ena%d: VLAN is in use, detach first
.Pp
VLANs are still in use while the driver is being detached; detach the VLANs
first and then call the detach routine again.
.It ena%d: Unmapped RX DMA tag associations
.It ena%d: Unmapped TX DMA tag associations
.Pp
An error occurred when trying to destroy the RX/TX DMA tag.
.It ena%d: Cannot init RSS
.It ena%d: Cannot fill indirect table
.It ena%d: Cannot fill hash function
.It ena%d: Cannot fill hash control
.It ena%d: WARNING: RSS was not properly initialized, it will affect bandwidth
.Pp
An error occurred during initialization of one of the RSS resources; the device
will still work, but performance will suffer because all RX packets will be
passed to queue 0 and there will be no hash information.
.It ena%d: failed to tear down irq: %d
.It ena%d: dev has no parent while releasing res for irq: %d
.Pp
Release of the interrupts failed.
.El
.Ss Additional diagnostics:
.Bl -diag
.It ena%d: Cannot get attribute for ena device
.Pp
This message appears when changing the MTU if the driver is unable to get
attributes from the device.
.It ena%d: Invalid MTU setting. new_mtu: %d
.Pp
Requested MTU value is not supported and will not be set.
.It ena%d: keep alive watchdog timeout
.Pp
Device stopped responding and will be reset.
.It ena%d: Found a Tx that wasn't completed on time, qid %d, index %d.
.Pp
Packet was pushed to the NIC but not sent within the given time limit; this may
be caused by a hang of the IO queue.
.It ena%d: The number of lost tx completion is above the threshold (%d > %d). Reset the device
.Pp
If too many Tx packets were not completed on time, the device is going to be
reset; this may be caused by a hung queue or device.
.It ena%d: trigger reset is on
.Pp
Device will be reset; reset is triggered either by watchdog or if too many TX
packets were not completed on time.
.It ena%d: invalid value recvd
.Pp
Link status received from the device in the AENQ handler is invalid.
.It ena%d: Allocation for Tx Queue %u failed
.It ena%d: Allocation for Rx Queue %u failed
.It ena%d: Unable to create Rx DMA map for buffer %d
.It ena%d: Failed to create io TX queue #%d rc: %d
.It ena%d: Failed to get TX queue handlers. TX queue num %d rc: %d
.It ena%d: Failed to create io RX queue[%d] rc: %d
.It ena%d: Failed to get RX queue handlers. RX queue num %d rc: %d
.It ena%d: failed to request irq
.It ena%d: could not allocate irq vector: %d
.It ena%d: failed to register interrupt handler for irq %ju: %d
.Pp
IO resources initialization failed. Interface will not be brought up.
.It ena%d: LRO[%d] Initialization failed!
.Pp
Initialization of the LRO for the RX ring failed.
.It ena%d: failed to alloc buffer for rx queue
.It ena%d: failed to add buffer for rx queue %d
.It ena%d: refilled rx queue %d with %d pages only
.Pp
Allocation of resources used on the RX path failed; if this happens during
initialization of the IO queue, the interface will not be brought up.
.It ena%d: ioctl promisc/allmulti
.Pp
IOCTL request for the device to work in promiscuous/allmulti mode; see
.Xr ifconfig 8
for more details.
.It ena%d: too many fragments. Last fragment: %d!
.Pp
A packet with an unsupported number of segments was queued for sending to the
device; the packet will be dropped.
.El
.Sh SUPPORT
If an issue is identified with the released source code with a supported adapter,
email the specific information related to the issue to
.Aq Mt mk@semihalf.com
and
.Aq Mt mw@semihalf.com .
.Sh SEE ALSO
.Xr vlan 4 ,
.Xr ifconfig 8
.Sh AUTHORS
The
.Nm
driver was written by
.An Semihalf .

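The HARDWARE section of ena.4 above lists four vendor/device ID pairs. A minimal sketch of how such a list can be matched against a probed PCI device follows. The struct `ena_id_sketch`, the table `ena_id_table`, and the function `ena_id_lookup` are hypothetical names used only for illustration; the driver's real probe code lives in sys/dev/ena/ena.c, whose diff is suppressed on this page, and may be organized differently.

```c
#include <stddef.h>
#include <stdint.h>

/* IDs copied from the HARDWARE section of ena.4 and from ena.h. */
#define PCI_VENDOR_ID_AMAZON	0x1d0f
#define PCI_DEV_ID_ENA_PF	0x0ec2
#define PCI_DEV_ID_ENA_LLQ_PF	0x1ec2
#define PCI_DEV_ID_ENA_VF	0xec20
#define PCI_DEV_ID_ENA_LLQ_VF	0xec21

/* Hypothetical table entry; this is not the driver's ena_vendor_info_t. */
struct ena_id_sketch {
	uint16_t	vendor_id;
	uint16_t	device_id;
	int		llq;	/* 1 if the man page lists LLQ support */
	const char	*desc;
};

static const struct ena_id_sketch ena_id_table[] = {
	{ PCI_VENDOR_ID_AMAZON, PCI_DEV_ID_ENA_PF,     0, "ENA PF" },
	{ PCI_VENDOR_ID_AMAZON, PCI_DEV_ID_ENA_LLQ_PF, 1, "ENA PF with LLQ support" },
	{ PCI_VENDOR_ID_AMAZON, PCI_DEV_ID_ENA_VF,     0, "ENA VF" },
	{ PCI_VENDOR_ID_AMAZON, PCI_DEV_ID_ENA_LLQ_VF, 1, "ENA VF with LLQ support" },
};

/* Return the matching table entry, or NULL if the device is not listed. */
static const struct ena_id_sketch *
ena_id_lookup(uint16_t vendor, uint16_t device)
{
	size_t i;

	for (i = 0; i < sizeof(ena_id_table) / sizeof(ena_id_table[0]); i++) {
		if (ena_id_table[i].vendor_id == vendor &&
		    ena_id_table[i].device_id == device)
			return (&ena_id_table[i]);
	}
	return (NULL);
}
```

A probe routine would typically call a lookup like this with the IDs read from PCI configuration space and decline the device when the result is NULL.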

@ -1584,6 +1584,12 @@ dev/e1000/e1000_mbx.c optional em \
dev/e1000/e1000_osdep.c optional em \
compile-with "${NORMAL_C} -I$S/dev/e1000"
dev/et/if_et.c optional et
dev/ena/ena.c optional ena \
compile-with "${NORMAL_C} -I$S/contrib"
dev/ena/ena_sysctl.c optional ena \
compile-with "${NORMAL_C} -I$S/contrib"
contrib/ena-com/ena_com.c optional ena
contrib/ena-com/ena_eth_com.c optional ena
dev/ep/if_ep.c optional ep
dev/ep/if_ep_isa.c optional ep isa
dev/ep/if_ep_pccard.c optional ep pccard

3769
sys/dev/ena/ena.c Normal file

File diff suppressed because it is too large

434
sys/dev/ena/ena.h Normal file

@ -0,0 +1,434 @@
/*-
* BSD LICENSE
*
* Copyright (c) 2015-2017 Amazon.com, Inc. or its affiliates.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*
* $FreeBSD$
*
*/
#ifndef ENA_H
#define ENA_H
#include <sys/types.h>
#include "ena-com/ena_com.h"
#include "ena-com/ena_eth_com.h"
#define DRV_MODULE_VER_MAJOR 0
#define DRV_MODULE_VER_MINOR 7
#define DRV_MODULE_VER_SUBMINOR 0
#define DRV_MODULE_NAME "ena"
#ifndef DRV_MODULE_VERSION
#define DRV_MODULE_VERSION \
__XSTRING(DRV_MODULE_VER_MAJOR) "." \
__XSTRING(DRV_MODULE_VER_MINOR) "." \
__XSTRING(DRV_MODULE_VER_SUBMINOR)
#endif
#define DEVICE_NAME "Elastic Network Adapter (ENA)"
#define DEVICE_DESC "ENA adapter"
/* Calculate DMA mask - width for ena cannot exceed 48, so it is safe */
#define ENA_DMA_BIT_MASK(x) ((1ULL << (x)) - 1ULL)
/* 1 for AENQ + ADMIN */
#define ENA_MAX_MSIX_VEC(io_queues) (1 + (io_queues))
#define ENA_REG_BAR 0
#define ENA_MEM_BAR 2
#define ENA_BUS_DMA_SEGS 32
#define ENA_DEFAULT_RING_SIZE 1024
#define ENA_DEFAULT_SMALL_PACKET_LEN 128
#define ENA_DEFAULT_MAX_RX_BUFF_ALLOC_SIZE 1536
#define ENA_RX_REFILL_THRESH_DEVIDER 8
#define ENA_MAX_PUSH_PKT_SIZE 128
#define ENA_NAME_MAX_LEN 20
#define ENA_IRQNAME_SIZE 40
#define ENA_PKT_MAX_BUFS 19
#define ENA_STALL_TIMEOUT 100
#define ENA_RX_RSS_TABLE_LOG_SIZE 7
#define ENA_RX_RSS_TABLE_SIZE (1 << ENA_RX_RSS_TABLE_LOG_SIZE)
#define ENA_HASH_KEY_SIZE 40
#define ENA_DMA_BITS_MASK 40
#define ENA_MAX_FRAME_LEN 10000
#define ENA_MIN_FRAME_LEN 60
#define ENA_RX_HASH_KEY_NUM 10
#define ENA_RX_THASH_TABLE_SIZE (1 << 8)
#define ENA_TX_CLEANUP_TRESHOLD 128
#define DB_THRESHOLD 64
#define TX_COMMIT 32
/*
* TX budget for cleaning. It should be half of the RX budget to reduce amount
* of TCP retransmissions.
*/
#define TX_BUDGET 128
/* RX cleanup budget. -1 stands for infinity. */
#define RX_BUDGET 256
/*
* How many times we can repeat cleanup in the io irq handling routine if the
* RX or TX budget was depleted.
*/
#define CLEAN_BUDGET 8
#define RX_IRQ_INTERVAL 20
#define TX_IRQ_INTERVAL 50
#define ENA_MAX_MTU 9216
#define ENA_TSO_MAXSIZE PAGE_SIZE
#define ENA_TSO_NSEGS ENA_PKT_MAX_BUFS
#define ENA_RX_OFFSET (NET_SKB_PAD + NET_IP_ALIGN)
#define ENA_MMIO_DISABLE_REG_READ BIT(0)
#define ENA_TX_RING_IDX_NEXT(idx, ring_size) (((idx) + 1) & ((ring_size) - 1))
#define ENA_RX_RING_IDX_NEXT(idx, ring_size) (((idx) + 1) & ((ring_size) - 1))
#define ENA_RX_RING_IDX_ADD(idx, n, ring_size) \
(((idx) + (n)) & ((ring_size) - 1))
#define ENA_IO_TXQ_IDX(q) (2 * (q))
#define ENA_IO_RXQ_IDX(q) (2 * (q) + 1)
#define ENA_MGMNT_IRQ_IDX 0
#define ENA_IO_IRQ_FIRST_IDX 1
#define ENA_IO_IRQ_IDX(q) (ENA_IO_IRQ_FIRST_IDX + (q))
/*
* ENA device should send keep alive msg every 1 sec.
* We wait for 6 sec just to be on the safe side.
*/
#define DEFAULT_KEEP_ALIVE_TO (SBT_1S * 6)
/* Time (in sbintime_t units) before concluding the transmitter is hung. */
#define DEFAULT_TX_CMP_TO (SBT_1S * 5)
/* Number of queues to check for missing TX completions per timer tick */
#define DEFAULT_TX_MONITORED_QUEUES (4)
/* Max number of timed-out packets before device reset */
#define DEFAULT_TX_CMP_THRESHOLD (128)
/*
* Supported PCI vendor and devices IDs
*/
#define PCI_VENDOR_ID_AMAZON 0x1d0f
#define PCI_DEV_ID_ENA_PF 0x0ec2
#define PCI_DEV_ID_ENA_LLQ_PF 0x1ec2
#define PCI_DEV_ID_ENA_VF 0xec20
#define PCI_DEV_ID_ENA_LLQ_VF 0xec21
struct msix_entry {
int entry;
int vector;
};
typedef struct _ena_vendor_info_t {
unsigned int vendor_id;
unsigned int device_id;
unsigned int index;
} ena_vendor_info_t;
struct ena_irq {
/* Interrupt resources */
struct resource *res;
driver_intr_t *handler;
void *data;
void *cookie;
unsigned int vector;
bool requested;
int cpu;
char name[ENA_IRQNAME_SIZE];
};
struct ena_que {
struct ena_adapter *adapter;
struct ena_ring *tx_ring;
struct ena_ring *rx_ring;
uint32_t id;
int cpu;
};
struct ena_tx_buffer {
struct mbuf *mbuf;
/* # of ena desc for this specific mbuf
* (includes data desc and metadata desc) */
unsigned int tx_descs;
/* # of buffers used by this mbuf */
unsigned int num_of_bufs;
bus_dmamap_t map;
/* Used to detect missing tx packets */
struct bintime timestamp;
bool print_once;
struct ena_com_buf bufs[ENA_PKT_MAX_BUFS];
} __aligned(CACHE_LINE_SIZE);
struct ena_rx_buffer {
struct mbuf *mbuf;
bus_dmamap_t map;
struct ena_com_buf ena_buf;
} __aligned(CACHE_LINE_SIZE);
struct ena_stats_tx {
counter_u64_t cnt;
counter_u64_t bytes;
counter_u64_t queue_stop;
counter_u64_t prepare_ctx_err;
counter_u64_t queue_wakeup;
counter_u64_t dma_mapping_err;
/* Not counted */
counter_u64_t unsupported_desc_num;
/* Not counted */
counter_u64_t napi_comp;
/* Not counted */
counter_u64_t tx_poll;
counter_u64_t doorbells;
counter_u64_t missing_tx_comp;
counter_u64_t bad_req_id;
};
struct ena_stats_rx {
counter_u64_t cnt;
counter_u64_t bytes;
counter_u64_t refil_partial;
counter_u64_t bad_csum;
/* Not counted */
counter_u64_t page_alloc_fail;
counter_u64_t mbuf_alloc_fail;
counter_u64_t dma_mapping_err;
counter_u64_t bad_desc_num;
/* Not counted */
counter_u64_t small_copy_len_pkt;
};
struct ena_ring {
/* Holds the empty requests for TX out of order completions */
uint16_t *free_tx_ids;
struct ena_com_dev *ena_dev;
struct ena_adapter *adapter;
struct ena_com_io_cq *ena_com_io_cq;
struct ena_com_io_sq *ena_com_io_sq;
/* The maximum length the driver can push to the device (For LLQ) */
enum ena_admin_placement_policy_type tx_mem_queue_type;
uint16_t rx_small_copy_len;
uint16_t qid;
uint16_t mtu;
uint8_t tx_max_header_size;
struct ena_com_rx_buf_info ena_bufs[ENA_PKT_MAX_BUFS];
uint32_t smoothed_interval;
enum ena_intr_moder_level moder_tbl_idx;
struct ena_que *que;
struct lro_ctrl lro;
uint16_t next_to_use;
uint16_t next_to_clean;
union {
struct ena_tx_buffer *tx_buffer_info; /* context of tx packet */
struct ena_rx_buffer *rx_buffer_info; /* context of rx packet */
};
int ring_size; /* number of tx/rx_buffer_info's entries */
struct buf_ring *br; /* only for TX */
struct mtx ring_mtx;
char mtx_name[16];
struct task enqueue_task;
struct taskqueue *enqueue_tq;
struct task cmpl_task;
struct taskqueue *cmpl_tq;
union {
struct ena_stats_tx tx_stats;
struct ena_stats_rx rx_stats;
};
} __aligned(CACHE_LINE_SIZE);
struct ena_stats_dev {
/* Not counted */
counter_u64_t tx_timeout;
/* Not counted */
counter_u64_t io_suspend;
/* Not counted */
counter_u64_t io_resume;
/* Not counted */
counter_u64_t wd_expired;
counter_u64_t interface_up;
counter_u64_t interface_down;
/* Not counted */
counter_u64_t admin_q_pause;
};
struct ena_hw_stats {
uint64_t rx_packets;
uint64_t tx_packets;
uint64_t rx_bytes;
uint64_t tx_bytes;
uint64_t rx_drops;
};
/* Board specific private data structure */
struct ena_adapter {
struct ena_com_dev *ena_dev;
/* OS defined structs */
if_t ifp;
device_t pdev;
struct ifmedia media;
/* OS resources */
struct resource * memory;
struct resource * registers;
struct mtx global_mtx;
struct sx ioctl_sx;
/* MSI-X */
uint32_t msix_enabled;
struct msix_entry *msix_entries;
int msix_vecs;
/* DMA tags used throughout the driver adapter for Tx and Rx */
bus_dma_tag_t tx_buf_tag;
bus_dma_tag_t rx_buf_tag;
int dma_width;
/*
* RX packets that are shorter than this len will be copied to the skb
* header
*/
unsigned int small_copy_len;
uint16_t max_tx_sgl_size;
uint16_t max_rx_sgl_size;
uint32_t tx_offload_cap;
/* Tx fast path data */
int num_queues;
unsigned int tx_usecs, rx_usecs; /* Interrupt coalescing */
unsigned int tx_ring_size;
unsigned int rx_ring_size;
/* RSS*/
uint8_t rss_ind_tbl[ENA_RX_RSS_TABLE_SIZE];
bool rss_support;
uint32_t msg_enable;
uint8_t mac_addr[ETHER_ADDR_LEN];
/* mdio and phy*/
char name[ENA_NAME_MAX_LEN];
bool link_status;
bool trigger_reset;
bool up;
bool running;
uint32_t wol;
/* Queue will represent one TX and one RX ring */
struct ena_que que[ENA_MAX_NUM_IO_QUEUES]
__aligned(CACHE_LINE_SIZE);
/* TX */
struct ena_ring tx_ring[ENA_MAX_NUM_IO_QUEUES]
__aligned(CACHE_LINE_SIZE);
/* RX */
struct ena_ring rx_ring[ENA_MAX_NUM_IO_QUEUES]
__aligned(CACHE_LINE_SIZE);
struct ena_irq irq_tbl[ENA_MAX_MSIX_VEC(ENA_MAX_NUM_IO_QUEUES)];
/* Timer service */
struct callout timer_service;
sbintime_t keep_alive_timestamp;
uint32_t next_monitored_tx_qid;
struct task reset_task;
struct taskqueue *reset_tq;
int wd_active;
sbintime_t keep_alive_timeout;
sbintime_t missing_tx_timeout;
uint32_t missing_tx_max_queues;
uint32_t missing_tx_threshold;
/* Statistics */
struct ena_stats_dev dev_stats;
struct ena_hw_stats hw_stats;
};
#define ENA_DEV_LOCK mtx_lock(&adapter->global_mtx)
#define ENA_DEV_UNLOCK mtx_unlock(&adapter->global_mtx)
#define ENA_RING_MTX_LOCK(_ring) mtx_lock(&(_ring)->ring_mtx)
#define ENA_RING_MTX_TRYLOCK(_ring) mtx_trylock(&(_ring)->ring_mtx)
#define ENA_RING_MTX_UNLOCK(_ring) mtx_unlock(&(_ring)->ring_mtx)
struct ena_dev *ena_efa_enadev_get(device_t pdev);
int ena_register_adapter(struct ena_adapter *adapter);
void ena_unregister_adapter(struct ena_adapter *adapter);
int ena_update_stats_counters(struct ena_adapter *adapter);
static inline int ena_mbuf_count(struct mbuf *mbuf)
{
int count = 1;
while ((mbuf = mbuf->m_next) != NULL)
++count;
return count;
}
#endif /* !(ENA_H) */

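The index macros in ena.h above rely on two conventions that are easy to miss: ring indices wrap with a bit mask, which is only correct when the ring size is a power of two (as `ENA_DEFAULT_RING_SIZE`, 1024, is), and queue pair `q` maps to TX queue `2q`, RX queue `2q + 1`, and IRQ vector `1 + q` (vector 0 is the management interrupt). The sketch below copies the macros verbatim from the header and adds one hypothetical helper, `ring_size_is_valid`, which is not part of the driver.

```c
#include <stdint.h>

/* Copied verbatim from sys/dev/ena/ena.h. */
#define ENA_DMA_BIT_MASK(x)			((1ULL << (x)) - 1ULL)
#define ENA_TX_RING_IDX_NEXT(idx, ring_size)	(((idx) + 1) & ((ring_size) - 1))
#define ENA_RX_RING_IDX_ADD(idx, n, ring_size)	\
	(((idx) + (n)) & ((ring_size) - 1))
#define ENA_IO_TXQ_IDX(q)			(2 * (q))
#define ENA_IO_RXQ_IDX(q)			(2 * (q) + 1)
#define ENA_MGMNT_IRQ_IDX			0
#define ENA_IO_IRQ_FIRST_IDX			1
#define ENA_IO_IRQ_IDX(q)			(ENA_IO_IRQ_FIRST_IDX + (q))

/*
 * Hypothetical helper (not in the driver): the mask-based wrap above
 * only behaves like a modulo when ring_size is a non-zero power of two.
 */
static int
ring_size_is_valid(uint32_t ring_size)
{

	return (ring_size != 0 && (ring_size & (ring_size - 1)) == 0);
}
```

With a ring of 1024 entries, advancing from index 1023 wraps to 0, and adding 8 to index 1020 yields 4. With a size such as 1000 the mask would silently skip or repeat slots, which is why a check like `ring_size_is_valid` is worth making before the macros are used.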
243
sys/dev/ena/ena_sysctl.c Normal file

@ -0,0 +1,243 @@
/*-
* BSD LICENSE
*
* Copyright (c) 2015-2017 Amazon.com, Inc. or its affiliates.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#include <sys/cdefs.h>
__FBSDID("$FreeBSD$");
#include "ena_sysctl.h"
static int ena_sysctl_update_stats(SYSCTL_HANDLER_ARGS);
static void ena_sysctl_add_stats(struct ena_adapter *);
void
ena_sysctl_add_nodes(struct ena_adapter *adapter)
{
ena_sysctl_add_stats(adapter);
}
static void
ena_sysctl_add_stats(struct ena_adapter *adapter)
{
device_t dev;
struct ena_ring *tx_ring;
struct ena_ring *rx_ring;
struct ena_hw_stats *hw_stats;
struct ena_stats_dev *dev_stats;
struct ena_stats_tx *tx_stats;
struct ena_stats_rx *rx_stats;
struct ena_com_stats_admin *admin_stats;
struct sysctl_ctx_list *ctx;
struct sysctl_oid *tree;
struct sysctl_oid_list *child;
struct sysctl_oid *queue_node, *tx_node, *rx_node, *hw_node;
struct sysctl_oid *admin_node;
struct sysctl_oid_list *queue_list, *tx_list, *rx_list, *hw_list;
struct sysctl_oid_list *admin_list;
#define QUEUE_NAME_LEN 32
char namebuf[QUEUE_NAME_LEN];
int i;
dev = adapter->pdev;
ctx = device_get_sysctl_ctx(dev);
tree = device_get_sysctl_tree(dev);
child = SYSCTL_CHILDREN(tree);
tx_ring = adapter->tx_ring;
rx_ring = adapter->rx_ring;
hw_stats = &adapter->hw_stats;
dev_stats = &adapter->dev_stats;
admin_stats = &adapter->ena_dev->admin_queue.stats;
SYSCTL_ADD_COUNTER_U64(ctx, child, OID_AUTO, "tx_timeout",
CTLFLAG_RD, &dev_stats->tx_timeout,
"Driver TX timeouts");
SYSCTL_ADD_COUNTER_U64(ctx, child, OID_AUTO, "io_suspend",
CTLFLAG_RD, &dev_stats->io_suspend,
"IO queue suspends");
SYSCTL_ADD_COUNTER_U64(ctx, child, OID_AUTO, "io_resume",
CTLFLAG_RD, &dev_stats->io_resume,
"IO queue resumes");
SYSCTL_ADD_COUNTER_U64(ctx, child, OID_AUTO, "wd_expired",
CTLFLAG_RD, &dev_stats->wd_expired,
"Watchdog expiry count");
SYSCTL_ADD_COUNTER_U64(ctx, child, OID_AUTO, "interface_up",
CTLFLAG_RD, &dev_stats->interface_up,
"Network interface up count");
SYSCTL_ADD_COUNTER_U64(ctx, child, OID_AUTO, "interface_down",
CTLFLAG_RD, &dev_stats->interface_down,
"Network interface down count");
SYSCTL_ADD_COUNTER_U64(ctx, child, OID_AUTO, "admin_q_pause",
CTLFLAG_RD, &dev_stats->admin_q_pause,
"Admin queue pauses");
for (i = 0; i < adapter->num_queues; ++i, ++tx_ring, ++rx_ring) {
snprintf(namebuf, QUEUE_NAME_LEN, "queue%d", i);
queue_node = SYSCTL_ADD_NODE(ctx, child, OID_AUTO,
namebuf, CTLFLAG_RD, NULL, "Queue Name");
queue_list = SYSCTL_CHILDREN(queue_node);
/* TX specific stats */
tx_node = SYSCTL_ADD_NODE(ctx, queue_list, OID_AUTO,
"tx_ring", CTLFLAG_RD, NULL, "TX ring");
tx_list = SYSCTL_CHILDREN(tx_node);
tx_stats = &tx_ring->tx_stats;
SYSCTL_ADD_COUNTER_U64(ctx, tx_list, OID_AUTO,
"count", CTLFLAG_RD,
&tx_stats->cnt, "Packets sent");
SYSCTL_ADD_COUNTER_U64(ctx, tx_list, OID_AUTO,
"bytes", CTLFLAG_RD,
&tx_stats->bytes, "Bytes sent");
SYSCTL_ADD_COUNTER_U64(ctx, tx_list, OID_AUTO,
"prepare_ctx_err", CTLFLAG_RD,
&tx_stats->prepare_ctx_err,
"TX buffer preparation failures");
SYSCTL_ADD_COUNTER_U64(ctx, tx_list, OID_AUTO,
"queue_wakeup", CTLFLAG_RD,
&tx_stats->queue_wakeup, "Queue wakeups");
SYSCTL_ADD_COUNTER_U64(ctx, tx_list, OID_AUTO,
"dma_mapping_err", CTLFLAG_RD,
&tx_stats->dma_mapping_err, "DMA mapping failures");
SYSCTL_ADD_COUNTER_U64(ctx, tx_list, OID_AUTO,
"unsupported_desc_num", CTLFLAG_RD,
&tx_stats->unsupported_desc_num,
"Excessive descriptor packet discards");
SYSCTL_ADD_COUNTER_U64(ctx, tx_list, OID_AUTO,
"napi_comp", CTLFLAG_RD,
&tx_stats->napi_comp, "Napi completions");
SYSCTL_ADD_COUNTER_U64(ctx, tx_list, OID_AUTO,
"tx_poll", CTLFLAG_RD,
&tx_stats->tx_poll, "TX poll count");
SYSCTL_ADD_COUNTER_U64(ctx, tx_list, OID_AUTO,
"doorbells", CTLFLAG_RD,
&tx_stats->doorbells, "Queue doorbells");
SYSCTL_ADD_COUNTER_U64(ctx, tx_list, OID_AUTO,
"missing_tx_comp", CTLFLAG_RD,
&tx_stats->missing_tx_comp, "TX completions missed");
SYSCTL_ADD_COUNTER_U64(ctx, tx_list, OID_AUTO,
"bad_req_id", CTLFLAG_RD,
&tx_stats->bad_req_id, "Bad request id count");
SYSCTL_ADD_COUNTER_U64(ctx, tx_list, OID_AUTO,
"stops", CTLFLAG_RD,
&tx_stats->queue_stop, "Queue stops");
/* RX specific stats */
rx_node = SYSCTL_ADD_NODE(ctx, queue_list, OID_AUTO,
"rx_ring", CTLFLAG_RD, NULL, "RX ring");
rx_list = SYSCTL_CHILDREN(rx_node);
rx_stats = &rx_ring->rx_stats;
SYSCTL_ADD_COUNTER_U64(ctx, rx_list, OID_AUTO,
"count", CTLFLAG_RD,
&rx_stats->cnt, "Packets received");
SYSCTL_ADD_COUNTER_U64(ctx, rx_list, OID_AUTO,
"bytes", CTLFLAG_RD,
&rx_stats->bytes, "Bytes received");
SYSCTL_ADD_COUNTER_U64(ctx, rx_list, OID_AUTO,
"refil_partial", CTLFLAG_RD,
&rx_stats->refil_partial, "Partial refilled mbufs");
SYSCTL_ADD_COUNTER_U64(ctx, rx_list, OID_AUTO,
"bad_csum", CTLFLAG_RD,
&rx_stats->bad_csum, "Bad RX checksum");
SYSCTL_ADD_COUNTER_U64(ctx, rx_list, OID_AUTO,
"page_alloc_fail", CTLFLAG_RD,
&rx_stats->page_alloc_fail, "Failed page allocs");
SYSCTL_ADD_COUNTER_U64(ctx, rx_list, OID_AUTO,
"mbuf_alloc_fail", CTLFLAG_RD,
&rx_stats->mbuf_alloc_fail, "Failed mbuf allocs");
SYSCTL_ADD_COUNTER_U64(ctx, rx_list, OID_AUTO,
"dma_mapping_err", CTLFLAG_RD,
&rx_stats->dma_mapping_err, "DMA mapping errors");
SYSCTL_ADD_COUNTER_U64(ctx, rx_list, OID_AUTO,
"bad_desc_num", CTLFLAG_RD,
&rx_stats->bad_desc_num, "Bad descriptor count");
SYSCTL_ADD_COUNTER_U64(ctx, rx_list, OID_AUTO,
"small_copy_len_pkt", CTLFLAG_RD,
&rx_stats->small_copy_len_pkt, "Small copy packet count");
}
/* Stats read from device */
hw_node = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, "hw_stats",
CTLFLAG_RD, NULL, "Statistics from hardware");
hw_list = SYSCTL_CHILDREN(hw_node);
SYSCTL_ADD_U64(ctx, hw_list, OID_AUTO, "rx_packets", CTLFLAG_RD,
&hw_stats->rx_packets, 0, "Packets received");
SYSCTL_ADD_U64(ctx, hw_list, OID_AUTO, "tx_packets", CTLFLAG_RD,
&hw_stats->tx_packets, 0, "Packets transmitted");
SYSCTL_ADD_U64(ctx, hw_list, OID_AUTO, "rx_bytes", CTLFLAG_RD,
&hw_stats->rx_bytes, 0, "Bytes received");
SYSCTL_ADD_U64(ctx, hw_list, OID_AUTO, "tx_bytes", CTLFLAG_RD,
&hw_stats->tx_bytes, 0, "Bytes transmitted");
SYSCTL_ADD_U64(ctx, hw_list, OID_AUTO, "rx_drops", CTLFLAG_RD,
&hw_stats->rx_drops, 0, "Receive packet drops");
SYSCTL_ADD_PROC(ctx, hw_list, OID_AUTO, "update_stats",
CTLTYPE_STRING|CTLFLAG_RD, adapter, 0, ena_sysctl_update_stats,
"A", "Update stats from hardware");
/* ENA Admin queue stats */
admin_node = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, "admin_stats",
CTLFLAG_RD, NULL, "ENA Admin Queue statistics");
admin_list = SYSCTL_CHILDREN(admin_node);
SYSCTL_ADD_U32(ctx, admin_list, OID_AUTO, "aborted_cmd", CTLFLAG_RD,
&admin_stats->aborted_cmd, 0, "Aborted commands");
SYSCTL_ADD_U32(ctx, admin_list, OID_AUTO, "submitted_cmd", CTLFLAG_RD,
&admin_stats->submitted_cmd, 0, "Submitted commands");
SYSCTL_ADD_U32(ctx, admin_list, OID_AUTO, "completed_cmd", CTLFLAG_RD,
&admin_stats->completed_cmd, 0, "Completed commands");
SYSCTL_ADD_U32(ctx, admin_list, OID_AUTO, "out_of_space", CTLFLAG_RD,
&admin_stats->out_of_space, 0, "Queue out of space");
SYSCTL_ADD_U32(ctx, admin_list, OID_AUTO, "no_completion", CTLFLAG_RD,
&admin_stats->no_completion, 0, "Commands not completed");
}
static int
ena_sysctl_update_stats(SYSCTL_HANDLER_ARGS)
{
struct ena_adapter *adapter = (struct ena_adapter *)arg1;
int rc;
if (adapter->up)
ena_update_stats_counters(adapter);
rc = sysctl_handle_string(oidp, "", 1, req);
return (rc);
}

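ena_sysctl_add_stats() above builds a per-queue subtree: one `queue%d` node per queue pair, each with `tx_ring` and `rx_ring` children holding the counters. The sketch below composes the resulting OID name in plain C, reusing the same `snprintf(namebuf, QUEUE_NAME_LEN, "queue%d", i)` step as the driver. The `dev.ena.<unit>` prefix is an assumption based on the usual layout of FreeBSD per-device sysctl trees, and `ena_stat_oid_name` is a hypothetical helper, not a driver function.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define QUEUE_NAME_LEN 32

/*
 * Hypothetical helper: write "dev.ena.<unit>.queue<q>.<ring>.<stat>"
 * into buf. Returns 0 on success, -1 if the ring name is not one of the
 * two nodes created by the driver or if the result would not fit.
 */
static int
ena_stat_oid_name(char *buf, size_t len, int unit, int queue,
    const char *ring, const char *stat)
{
	char namebuf[QUEUE_NAME_LEN];
	int n;

	if (strcmp(ring, "tx_ring") != 0 && strcmp(ring, "rx_ring") != 0)
		return (-1);

	/* Same node naming as the loop in ena_sysctl_add_stats(). */
	snprintf(namebuf, QUEUE_NAME_LEN, "queue%d", queue);

	n = snprintf(buf, len, "dev.ena.%d.%s.%s.%s", unit, namebuf,
	    ring, stat);
	if (n < 0 || (size_t)n >= len)
		return (-1);
	return (0);
}
```

Under these assumptions, the TX packet counter of queue 2 on the first adapter would be named `dev.ena.0.queue2.tx_ring.count`, which is the kind of name one would pass to sysctl(8).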
44
sys/dev/ena/ena_sysctl.h Normal file

@ -0,0 +1,44 @@
/*-
* BSD LICENSE
*
* Copyright (c) 2015-2017 Amazon.com, Inc. or its affiliates.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*
* $FreeBSD$
*
*/
#ifndef ENA_SYSCTL_H
#define ENA_SYSCTL_H
#include <sys/types.h>
#include <sys/sysctl.h>
#include "ena.h"
void ena_sysctl_add_nodes(struct ena_adapter *);
#endif /* !(ENA_SYSCTL_H) */


@ -107,6 +107,7 @@ SUBDIR= \
${_efirt} \
${_elink} \
${_em} \
${_ena} \
${_ep} \
${_epic} \
esp \
@ -566,6 +567,7 @@ _drm= drm
_drm2= drm2
_ed= ed
_em= em
_ena= ena
_ep= ep
_et= et
_exca= exca

41
sys/modules/ena/Makefile Normal file

@ -0,0 +1,41 @@
#
# BSD LICENSE
#
# Copyright (c) 2015-2017 Amazon.com, Inc. or its affiliates.
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
#
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# $FreeBSD$
#
.PATH: ${SRCTOP}/sys/dev/ena \
${SRCTOP}/sys/contrib/ena-com
KMOD = if_ena
SRCS = ena.c ena_com.c ena_eth_com.c ena_sysctl.c
SRCS += device_if.h bus_if.h pci_if.h
CFLAGS += -I${SRCTOP}/sys/contrib
.include <bsd.kmod.mk>