eb0d471a89
Some PMDs (e.g. hns3) can detect hardware or firmware errors. One error recovery mode is to report the RTE_ETH_EVENT_INTR_RESET event and wait for the application to invoke rte_eth_dev_reset() to recover the port. However, this mode has the following weaknesses:

1) Due to differences in hardware and software design, the recovery process of some NIC ports requires multiple handshakes with the firmware and the PF (when the port is a VF). It takes a long time to complete the entire operation for one port. If multiple ports (for example, multiple VFs of a PF) are reset at the same time, other VFs may fail to be reset, because reset processing is serial: earlier VFs must be processed before subsequent ones.
2) The impact on the application layer is great: it must stop the working queues, stop calling the Rx and Tx functions, call rte_eth_dev_reset(), and then set everything up again.

This patch introduces a proactive error handling mode, in which the PMD tries to recover from the errors itself. During this process, the PMD sets the data path pointers to dummy functions (which prevents a crash) and makes sure that control path operations fail with return code -EBUSY.

Because the PMD recovers automatically, the application only senses that the data flow is disconnected for a while and that the control API returns an error during this period.

To let the application sense when an error occurs and when recovery finishes, three events are introduced:

1) RTE_ETH_EVENT_ERR_RECOVERING: notifies the application that the PMD has detected an error and that recovery is starting. Upon receiving the event, the application should not invoke any control path APIs until it receives the RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED event.
2) RTE_ETH_EVENT_RECOVERY_SUCCESS: notifies the application that the port has recovered successfully from the error. The PMD has already re-configured the port, and the effect is the same as that of a restart operation.
3) RTE_ETH_EVENT_RECOVERY_FAILED: notifies the application that recovery from the error failed and the port is no longer usable. The application should close the port.

Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
||
---|---|---|
dumpcap | ||
pdump | ||
proc-info | ||
test | ||
test-acl | ||
test-bbdev | ||
test-cmdline | ||
test-compress-perf | ||
test-crypto-perf | ||
test-eventdev | ||
test-fib | ||
test-flow-perf | ||
test-gpudev | ||
test-pipeline | ||
test-pmd | ||
test-regex | ||
test-sad | ||
meson.build |
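
The three recovery events described in the commit message above suggest a small per-port state machine on the application side. The following is a minimal sketch of that idea, not code from the patch. To stay compilable without DPDK, it uses hypothetical stand-in names (`port_event`, `port_state`, `port_event_handle()`, `port_control_allowed()`, `port_should_close()`, `MAX_PORTS`) instead of the real `RTE_ETH_EVENT_*` values. In a real application, the same logic would presumably live in a callback registered with `rte_eth_dev_callback_register()`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for RTE_ETH_EVENT_ERR_RECOVERING,
 * RTE_ETH_EVENT_RECOVERY_SUCCESS and RTE_ETH_EVENT_RECOVERY_FAILED.
 * A real application would use the values from <rte_ethdev.h>. */
enum port_event {
    PORT_EVENT_ERR_RECOVERING,
    PORT_EVENT_RECOVERY_SUCCESS,
    PORT_EVENT_RECOVERY_FAILED,
};

/* Application-side view of a port (assumed, not part of the patch). */
enum port_state {
    PORT_STATE_ACTIVE = 0,   /* control path APIs may be called */
    PORT_STATE_RECOVERING,   /* PMD is recovering; avoid control path APIs */
    PORT_STATE_DEAD,         /* recovery failed; the port should be closed */
};

#define MAX_PORTS 32         /* illustrative limit, chosen for this sketch */

/* Zero-initialized, so every port starts as PORT_STATE_ACTIVE.
 * Event callbacks may run on another thread; a real application
 * would need proper synchronization around this table. */
static enum port_state port_states[MAX_PORTS];

/* Update the tracked state of a port when a recovery event arrives.
 * Returns 0 on success, -1 for an unknown port or event. */
int port_event_handle(uint16_t port_id, enum port_event event)
{
    if (port_id >= MAX_PORTS)
        return -1;

    switch (event) {
    case PORT_EVENT_ERR_RECOVERING:
        port_states[port_id] = PORT_STATE_RECOVERING;
        break;
    case PORT_EVENT_RECOVERY_SUCCESS:
        /* The PMD has already re-configured the port. */
        port_states[port_id] = PORT_STATE_ACTIVE;
        break;
    case PORT_EVENT_RECOVERY_FAILED:
        port_states[port_id] = PORT_STATE_DEAD;
        break;
    default:
        return -1;
    }
    return 0;
}

/* True when the application may invoke control path APIs on the port. */
bool port_control_allowed(uint16_t port_id)
{
    return port_id < MAX_PORTS && port_states[port_id] == PORT_STATE_ACTIVE;
}

/* True when recovery failed and the application should close the port. */
bool port_should_close(uint16_t port_id)
{
    return port_id < MAX_PORTS && port_states[port_id] == PORT_STATE_DEAD;
}
```

With this sketch, the application would check `port_control_allowed()` before each control path call and close any port for which `port_should_close()` returns true. The data path needs no such check, since the commit message states that the PMD swaps in dummy Rx and Tx functions during recovery.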