freebsd-dev/sys
Ka Ho Ng 74ada297e8 AMD-vi: Fix IOMMU device interrupts being overridden
Currently, AMD-vi PCI-e passthrough will lead to the following lines in
dmesg:
"kernel: CPU0: local APIC error 0x40
ivhd0: Error: completion failed tail:0x720, head:0x0."

After some tracing, the problem turns out to be caused by the
interaction between amdvi_alloc_intr_resources() and pci_driver_added().
In ivrs_drv, AMD-vi IVHDs are identified by walking the ACPI IVRS
table, and ivhdX device_ts are added under the acpi bus, while no
driver handles the corresponding IOMMU PCI function. In
amdvi_alloc_intr_resources(), the MSI interrupts are allocated with the
ivhdX device_t instead of the IOMMU PCI function device_t, and
bus_setup_intr() is called on ivhdX. The IOMMU PCI function device_t is
only used for pci_enable_msi(). Since bus_setup_intr() is never called
on the IOMMU PCI function, that device_t's dinfo->cfg.msi is never
updated to reflect the intended msi_data and msi_addr, so both stay at
0. When pci_driver_added() loops over the children of a PCI bus and
calls pci_cfg_restore() on each of them, msi_addr and msi_data with
value 0 are written to the MSI capability of the IOMMU PCI function,
which explains the errors in dmesg.
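
The failure sequence can be illustrated with a small user-space model.
This is only a sketch: every toy_* type and function below is
hypothetical and merely stands in for the kernel state named above
(dinfo->cfg.msi, the MSI capability, pci_enable_msi() and
pci_cfg_restore()); it is not the FreeBSD implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: hypothetical names, not FreeBSD kernel code. */

struct toy_msi {
	uint64_t addr;
	uint16_t data;
};

struct toy_pci_func {
	struct toy_msi cached;	/* plays the role of dinfo->cfg.msi */
	struct toy_msi hw;	/* plays the role of the MSI capability */
};

/* Models the pci_enable_msi() step: programs the hardware copy only. */
static void
toy_enable_msi(struct toy_pci_func *f, uint64_t addr, uint16_t data)
{
	f->hw.addr = addr;
	f->hw.data = data;
}

/* Models the pci_cfg_restore() step: writes the cached copy to hardware. */
static void
toy_cfg_restore(struct toy_pci_func *f)
{
	f->hw = f->cached;
}

/* Enable MSI without updating the cache, then run a restore. */
static struct toy_msi
toy_reproduce_bug(uint64_t addr, uint16_t data)
{
	struct toy_pci_func f = { { 0, 0 }, { 0, 0 } };

	toy_enable_msi(&f, addr, data);
	assert(f.hw.addr == addr && f.hw.data == data);
	toy_cfg_restore(&f);	/* hardware is overwritten with the zeroed cache */
	return (f.hw);
}
```

In this model, calling toy_reproduce_bug() with any non-zero address
and data returns a zeroed struct, mirroring how the restore clobbers
the MSI settings of the IOMMU PCI function.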

This change adds an amdiommu driver which currently handles attach and
detach and provides DEVMETHODs for setting up and tearing down
interrupts. The purpose of the driver is to prevent pci_driver_added()
from calling pci_cfg_restore() on the IOMMU PCI function device_t.
The amdiommu driver also handles allocation of an IRQ resource within
the IOMMU PCI function, so that dinfo->cfg.msi is populated.
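
The effect of the fix can be sketched in the same user-space style.
Again, the model_* names are hypothetical and this is not the amdiommu
driver itself. It assumes only what the description above states: once
a driver has claimed the function, the restore pass leaves it alone,
and interrupt setup done on the function itself keeps the cached MSI
values populated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: hypothetical names, not the amdiommu driver. */

struct model_msi {
	uint64_t addr;
	uint16_t data;
};

struct model_pci_func {
	bool attached;			/* true once a driver claimed the function */
	struct model_msi cached;	/* plays the role of dinfo->cfg.msi */
	struct model_msi hw;		/* plays the role of the MSI capability */
};

/* Interrupt setup on the function itself: cache and hardware stay in sync. */
static void
model_setup_intr(struct model_pci_func *f, uint64_t addr, uint16_t data)
{
	f->cached.addr = addr;
	f->cached.data = data;
	f->hw = f->cached;
}

/* Restore pass over a bus: only functions without a driver are rewritten. */
static void
model_driver_added(struct model_pci_func *funcs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (funcs[i].attached)
			continue;
		funcs[i].hw = funcs[i].cached;
	}
}

/* Hardware MSI state of a claimed function after a restore pass. */
static struct model_msi
model_with_driver(uint64_t addr, uint16_t data)
{
	struct model_pci_func f = { false, { 0, 0 }, { 0, 0 } };

	f.attached = true;
	model_setup_intr(&f, addr, data);
	model_driver_added(&f, 1);
	assert(f.cached.addr == addr && f.cached.data == data);
	return (f.hw);
}
```

Here model_with_driver() returns the same address and data it was
given, because the function is skipped by the restore pass and its
cache is no longer zero.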

This has been tested on EPYC Rome 7282 with Radeon 5700XT GPU.

Sponsored by:	The FreeBSD Foundation
Reviewed by:	jhb
Approved by:	philip (mentor)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D28984
2021-03-22 17:33:43 +08:00
amd64 AMD-vi: Fix IOMMU device interrupts being overridden 2021-03-22 17:33:43 +08:00
arm Remove PCPU_INC 2021-03-20 19:23:59 -07:00
arm64 Remove PCPU_INC 2021-03-20 19:23:59 -07:00
bsm Add aio_writev and aio_readv 2021-01-02 19:57:58 -07:00
cam cam: Run all XPT_ASYNC ccbs in a dedicated thread 2021-03-12 13:29:42 -07:00
cddl Handle functions that use a nop in the arm64 fbt 2021-03-03 14:18:03 +00:00
compat Rename linux_set_upcall_kse() to linux_set_upcall(). 2021-03-18 12:14:34 -07:00
conf x86: consolidate hw watchpoint logic into new file 2021-03-19 16:51:52 -03:00
contrib zfs: merge OpenZFS master-891568c99 2021-03-21 02:17:59 +01:00
crypto armv8crypto: note derivation in armv8_crypto_wrap.c 2021-03-19 10:53:49 -03:00
ddb ddb: enable the use of ^C and ^S/^Q 2021-03-14 16:04:27 -07:00
dev netmap: fix issues in nm_os_extmem_create() 2021-03-20 17:15:50 +00:00
dts Remove DTS files for arm boards we don't support 2021-01-27 10:02:01 +00:00
fs fusefs: fix a dead store in fuse_vnop_advlock 2021-03-19 19:38:57 -06:00
gdb Use atomic loads/stores when updating td->td_state 2021-02-18 14:02:48 +00:00
geom gmirror: Pre-allocate the timeout event structure 2021-03-11 15:45:15 -05:00
gnu Remove the old dts imported tree. 2021-01-15 20:09:55 +01:00
i386 Remove PCPU_INC 2021-03-20 19:23:59 -07:00
isa Move back the isa non-PNP driver deadline to FreeBSD 14. 2021-03-08 16:00:23 -07:00
kern Add device and ifnet logging methods, similar to device_printf / if_printf 2021-03-22 00:02:34 +00:00
kgssapi opencrypto: Introduce crypto_dispatch_async() 2021-02-08 09:19:19 -05:00
libkern Restore the augmented strlen commentary 2021-02-08 19:15:21 +00:00
mips Remove PCPU_INC 2021-03-20 19:23:59 -07:00
modules AMD-vi: Fix IOMMU device interrupts being overridden 2021-03-22 17:33:43 +08:00
net Add device and ifnet logging methods, similar to device_printf / if_printf 2021-03-22 00:02:34 +00:00
net80211 net80211: split up ieee80211_probereq() 2021-03-18 11:02:45 +00:00
netgraph netgraph/ng_bridge: Add counters for the first link, too 2021-02-10 19:05:37 +01:00
netinet fix panic when rescue retransmission and FIN overlap 2021-03-17 17:12:04 +01:00
netinet6 base: remove if_wg(4) and associated utilities, manpage 2021-03-17 09:14:48 -05:00
netipsec Revert "SO_RERROR indicates that receive buffer overflows should be handled as errors." 2021-02-08 22:32:32 +00:00
netpfil pfsync: Unconditionally push packets when requesting state updates 2021-03-17 19:18:14 +01:00
netsmb
nfs nfs: Cleanup dead files 2021-03-17 06:16:31 +11:00
nfsclient nfs: Cleanup dead files 2021-03-17 06:16:31 +11:00
nfsserver nfs: Cleanup dead files 2021-03-17 06:16:31 +11:00
nlm
ofed ofed: quiet gcc -Wint-in-bool-context 2021-02-24 15:56:16 -08:00
opencrypto ktls: Fix non-inplace TLS 1.3 encryption. 2021-03-10 11:07:40 -08:00
powerpc Remove PCPU_INC 2021-03-20 19:23:59 -07:00
riscv Remove PCPU_INC 2021-03-20 19:23:59 -07:00
rpc nfs-over-tls: handle res.gid.gid_val correctly for memory allocation 2021-01-12 13:59:52 -08:00
security Add a comment on why the call to mac_vnode_relabel() might be in the wrong 2021-02-27 16:25:26 +00:00
sys Add device and ifnet logging methods, similar to device_printf / if_printf 2021-03-22 00:02:34 +00:00
teken loader: implement framebuffer console 2021-01-02 21:41:36 +02:00
tests Enable running fib tests inside vnet jail. 2021-01-17 20:32:26 +00:00
tools syscalls.master: Add a new syscall type: RESERVED 2021-01-26 18:27:44 +00:00
ufs softdep_unmount: assert that no dandling dependencies are left 2021-03-12 13:31:08 +02:00
vm Remove unused obj variable missed in r354870. 2021-03-17 15:29:15 -07:00
x86 x86: consolidate hw watchpoint logic into new file 2021-03-19 16:51:52 -03:00
xdr
xen xen: move x86-specific xen_vector_callback_enabled to sys/x86 2021-03-15 14:20:21 +01:00
Makefile