Teach the pci host generic ACPI attachment about PCI_ID_OFW_IOMMU. This
will be used by the arm64 smmu IOMMU driver to read the xref and ID
this interface provides in a bus-agnostic way.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D39182
Add a n attachment to the pci_host_generic driver for the Arm DEN0115
PCI Configuration Space Access Firmware Interface [1]. This can be used
when PCI controllers need to implement quirks in the PCI root bus.
To handle this the firmware implements a SMCCC interface the driver can
use to read and write the configuration register.
This has been tested on a Raspberry Pi 4 booting with EDK2.
[1] https://developer.arm.com/documentation/den0115/latest
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D39228
To allow for attachments that don't use memory mapped registers add
a flag they can set when the base driver shouldn't map them.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D39227
If the tunable is set to 0, the tuning of the MPS (maximum payload size)
is disabled and the default MPS values set by the BIOS are used. In this
case the system may use a lower speed or operate in a less optimized
state, but it can resolve issues with stability and compatibility. With
specific devices the tuning of the mps, can lead to a complete freeze of
the system.
Reviewed by: manu
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D38397
On arm64 PCI config memory is expected to be mapped with a non-posted
device type. To handle this use the new bus_map_resource support in
arm64 to map memory with the new VM_MEMATTR_DEVICE_NP attribute. This
memory has already been allocated and activated, it just needs to be
mapped.
Reviewed by: kevans, mmel
Differential Revision: https://reviews.freebsd.org/D30079
Add ACPI_RESOURCE_TYPE_FIXED_MEMORY32 to the PCI ECAM driver. This is
used on the Microsoft Dev Kit 2023 and reportedly the Lenovo x13s.
Reviewed by: Robert Clausecker <fuz@fuz.su> (Earlier version)
Tested by: Robert Clausecker <fuz@fuz.su> (Earlier version)
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D38031
These values come from section 7.7.11 ("ACS Extended Capability") of the
PCI Express Base Specification Revision 6.0, dated 16 Dec 2021.
MFC after: 1 week
Sponsored by: Chelsio Communications
Reviewed by: kib@
Differential Revision: https://reviews.freebsd.org/D37270
Add sysctl/tunable to control Electromechanical Interlock support.
Disable it by default since Linux does not do it either and it seems
the number of systems having it broken is higher than having working.
This fixes NVMe backplane operation on ASUS RS500A-E11-RS12U server
with AMD EPYC 7402 CPU, where attempts to control reported interlock
for some reason end up in PCIe link loss, while interlock status does
not change (it is not really there).
MFC after: 2 weeks
Translating the provided range prior to rman_reserve_resource(9) is
decidedly wrong; the caller may be trying to do a wildcard allocation,
for which the implementation is expected to DTRT and clamp the range to
what's actually feasible.
We don't use the resulting translation here anyways, so just remove it
entirely -- the rman in the default implementation is derived from
sc->ranges, so the translation should trivially succeed every time as
long as the reservation succeeded. If something has gone awry in a
derived driver, we'll detect it when we translate prior to activation,
so there's likely no diagnostic value in retaining the translation after
reservation either.
Reviewed by: andrew
Noticed by: jhb
Differential Revision: https://reviews.freebsd.org/D36618
This matches the return type of pmap_mapdev/bios.
Reviewed by: kib, markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D36548
This is not completely exhaustive, but covers a large majority of
commands in the tree.
Reviewed by: markj
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D35583
Add clean up on failure and a detach function to the pci host generic
driver.
Reviewed by: jhb (earlier version)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35291
Some AMD EPYC VCPUs generated boot message of the type:
pci4: <unknown> at device 0.0 (no driver attached)
These are displayed for device class 0x13 devices, e.g.:
none8@pci0:130:0:0: class=0x130000 rev=0x00 hdr=0x00 vendor=0x1022 \
device=0x148a subvendor=0x1022 subdevice=0x148a
vendor = 'Advanced Micro Devices, Inc. [AMD]'
device = 'Starship/Matisse PCIe Dummy Function'
class = non-essential instrumentation
Since these devices serve no purpose (no driver attaches) I have
enabled the reporting of suich devices only for verbose boots (a
diversion from the patch provided in the PR).
A verbose boot will now display such devices as:
pci4: <non-essential instrumentation> at device 0.0 (no driver attached)
PR: 263469
Reported by: jfc@mit.edu (John F. Carr)
MFC after: 1 week
If the pciX:Y:Z and pciW:X:Y:Z 'at' locations don't work, allow try the
LOCATOR:PATH syntax. Use dev_wired_cache to generically look them up.
Sponsored by: Netflix
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D32784
If we find a match, then assign it. Flip the logic in the if and assign
the unit rather than continuing if it doesn't match. Will make it easier
to expand to other matching schemes.
Sponsored by: Netflix
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D32779
Add a UEFI locator type. It prints the UEFI device names for a FreeBSD
device_t name. It works with PCI and ACPI device nodes. USB forthcoming.
Sponsored by: Netflix
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D32749
A panic has been observed on a system with a Intel X520 dual LAN
device. The panic is caused by a KASSERT() noticing that the amount
of VPD data copied out to the pciconf command does not match the
amount of data read from the device.
The cause of the size mismatch was VPD data that started with 0x82,
the VPD tag that indicates that a VPD ident follows, but with a length
of more than 255 characters, which happens to be the maximum ident
size supported by the API between kernel and the pciconf program.
The data provided did not resemble an actual VPD identifier, and it
can be assumed that the initial tag value 0x82 happens to be there
by accident.
An ident size of 255 far exceeds the sensible length of that data
element, which is in the order of at most 30 to 40 bytes.
This patch adds several consitstency checks to the VPD parser, the
most critical being that ident lengths of more than 255 bytes are
rejected. Other checks reject VPD with more than one ident tag or
with an empty (zero length) ident string.
This patch prevents the panic that occured when "pciconf -lV" was
executed on the affected system.
During the anaylsis of the issue and the VPD code it has been
found that the VPD parser uses a state machine that accepts tags
in any order and combination. This is a bad match for the actual
VPD data, which has a very simple structure that can be parsed
with a non-recursive direct descent parser (which always knows
exactly which token to expect next).
A review fpr a much simpler VPD parser that performs many more
consistency checks and rejects invalid VPD has been proposed in
review https://reviews.freebsd.org/D34268.
Reported by: mikej at paymentallianceintl.com (Michael Jung)
Approved by: jhb
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D34255
Create a wrapper for newbus to take giant and for busses to take it too.
bus_topo_lock() should be called before interacting with newbus routines
and unlocked with bus_topo_unlock(). If you need the topology lock for
some reason, bus_topo_mtx() will provide that.
Sponsored by: Netflix
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D31831
Instead of returning 0xffs some controllers, such as Layerscape generate
an external exception when someone attempts to read any register
of config space of a non-existing device other than PCIR_VENDOR.
This causes a kernel panic.
Fix it by bailing during device enumeration if a device vendor register
returns invalid value. (0xffff)
Use this opportunity to replace some hardcoded values with a macro.
I believe that this change won't have any unintended side-effects since
it is safe to assume that vendor == 0xffff -> hdr_type == 0xffff.
Sponsored by: Alstom
Obtained from: Semihalf
Reviewed by: jhb
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D33059
- Return an errno value upon failure, instead of 1.
- Provide a bus_translate_resource() wrapper.
- Implement the generic version, which traverses the hierarchy until a
bus driver with a non-trivial implementation is found, in subr_bus.c
like other similar default implementations.
- Make ofw_pcib_translate_resource() return an error if a matching PCI
address range is not found.
- Make generic_pcie_translate_resource_common() return an int instead of
a bool. Fix up callers.
No functional change intended.
Reviewed by: imp, jhb
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32855
In a VF's configuration space, "memory space enable" is hard-wired to 0,
so the existing implementation always returns false. We need to read
the SR-IOV control register from the PF device to get the value of the
MSE bit.
Fix pci_bar_enabled() to read this register instead for VFs. I don't
see any way to access the PF's config space without a backpointer in the
pci device ivars, so I added one.
This fixes a regression where bhyve(8) fails to map the MSI-X table
after commit 7fa2335347 ("bhyve: Map the MSI-X table unconditionally
for passthrough") when a VF is passed through, since with that commit we
use PCIOCBARMMAP to map the table and that ioctl always fails for VFs
without this change. As a bonus, pciconf(8) now correctly reports the
enablement of BARs for VFs.
Reported and tested by: Raúl Muñoz <raul.munoz@custos.es>
Reviewed by: rstone, jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32839
Linux KPIs like pci_resource_start/len assume that BARs have been
allocated, but FreeBSD lazily allocates BARs if it cannot allocate the
firmware-allocated BARs. Thus using the Linux KPIs must force allocation
of the BARs rather than returning 0 for the start and length, which can
crash drm-kmod drivers that assume the BARs are valid. This is needed
for the AMDGPU driver to be able to attach on SiFive's HiFive Unmatched.
Reviewed by: hselasky, jhb, mav
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D32447
This is the same underlying problem as 2624598064, just for bus ranges
rather than windows. SiFive's HiFive Unmatched has the following
topology:
Root Port <---> Bridge <---> Bridge <-+-> Bridge <---> (Unused)
(pcib0) (pcib1) (pcib2) | (pcib3)
+-> Bridge <---> xHCI
| (pcib4)
+-> Bridge <---> M.2 E-key
| (pcib5)
+-> Bridge <---> M.2 M-key
| (pcib6)
+-> Bridge <---> x16 slot
(pcib7)
If a device is plugged into the x16 slot that itself has a bridge, such
as many graphics cards, we currently fail to allocate a bus number for
its child bus (and so pcib_attach_child skips adding a child bus for
further enumeration) as, when the new child bridge attaches, it attempts
to allocate a bus number from its parent (pcib7) which in turn attempts
to grow its own bus range by calling bus_adjust_resource on its own
parent (pcib2) whose bus rman cannot accommodate the request and needs
to itself be extended by calling its own parent (pcib1). Note that
pcib3-7 do not face the same issue when they attach since pcib1 ends up
managing bus numbers 1-255 from the beginning and so never needs to grow
its own range.
Reviewed by: jhb, mav
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D32011
In D21096 BUS_TRANSLATE_RESOURCE was introduced to allow LinuxKPI to get
physical addresses in pci_resource_start for PowerPC and implemented
in ofw_pci.
When the translation was implemented in pci_host_generic in 372c142b4f,
this method was not implemented; instead a local static function was
added for a similar purpose.
Rename the static function to "_common" and implement the bus function
as a wrapper around that. With this a LinuxKPI driver using
physical addresses correctly finds the configuration registers of
the GPU.
This unbreaks amdgpu on NXP Layerscape LX2160A SoC (SolidRun HoneyComb
LX2K workstation) which has a Translation Offset in ACPI for
below-4G PCI addresses.
More info: https://github.com/freebsd/drm-kmod/issues/84
Tested by: dan.kotowski_a9development.com
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D30986
Due to the quirky nature of the Synopsys Designware PCIe IP,
the type 0 configuration is broadcast and whatever device
is plugged into slot, will appear at each 32 device
positions of bus0. Mitigate the issue by filtering out
duplicated devices on this bus for both DT and ACPI cases.
Reviewed by: mw
Sponsored by: Semihalf
MFC: after 3 weeks
Differential revision: https://reviews.freebsd.org/D31887
Set domain number to device unit.
Some boards have multiple RCs handled by different drivers,
this ensures that there are no collisions with ofw_pcib.
Obtained from: Semihalf
Reviewed by: wma
Differential revision: https://reviews.freebsd.org/D31508
When adjusting resources we should write updated window base/limit into
the registers. Without this newly added address range won't be routed
through the bridge properly.
Use MIN()/MAX() against current window base/limit to not shrink it on
the other side if the window is shared by several resources.
Align passed resource start/end to the set window granularity to keep
it properly aligned. Currently this is mostly called by other bridges
having the same window alignment, but it may be change one day.
Reviewed by: jrtc27, jhb
MFC after: 1 week
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D31693
This is useful for bhyve, which otherwise has to use /dev/io to handle
accesses to I/O port BARs when PCI passthrough is in use.
Reviewed by: imp, kib
Discussed with: jhb
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31307
This has been present since the first revision of the file. The debugf
macros have always been unused so it doesn't actually do anything
useful, and besides, debugging should not be unconditionally turned on
for a production driver. Moreover, this breaks the riscv LINT kernel
build as sys/conf/NOTES includes options DEBUG, resulting in a macro
redefinition error. This does not show up in the arm64 LINT kernel build
since that has an explicit nooptions DEBUG, which is dubious and should
be revisited. Rather than copy such a hack to riscv's NOTES, fix this
specific instance of DEBUG breaking.
Fixes: 896e217a0e ("fu740_pci_dw: Add SiFive FU740 PCIe controller driver")
MFC after: 1 week
If we allocate a new window for a bridge rather than reusing an existing
one set up by firmware to cover all the devices then the new window only
includes the range needed for the first device to allocate the resource.
If a request comes in to adjust this resource in order to extend a
downstream window for another device then this will fail as the rman
doesn't have any space, so we must first grow the bridge's own window.
This is needed to support successfully attaching more than one PCI
device on SiFive's HiFive Unmatched, which has the following topology:
Root Port <---> Bridge <---> Bridge <-+-> Bridge <---> (Unused)
(pcib0) (pcib1) (pcib2) | (pcib3)
+-> Bridge <---> xHCI
| (pcib4)
+-> Bridge <---> M.2 E-key
| (pcib5)
+-> Bridge <---> M.2 M-key
| (pcib6)
+-> Bridge <---> x16 slot
(pcib7)
Without this, the xHCI endpoint successfully attaches but NVMe M.2 M-key
endpoint fails to attach as, when its adjacent bridge (pcib6) attempts
to allocate a window from its parent (pcib2) on the other side of the
switch, its parent attempts to grow its own window by calling
bus_adjust_resource on its own parent (pcib1) which fails to call the
root port device (pcib0) to request more memory to grow its own window.
Had the root port been directly connected to the switch without the
bridge in the middle then the existing code would have worked, but the
extra hop broke it.
Reviewed by: jhb
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31035
Currently we use the num-viewports property to decide how many outbound
regions there are we can use, defaulting to 2. However, Linux has
stopped using that and so it no longer appears in new device trees, such
as for the SiFive FU740. Instead, it's possible to just probe the
hardware directly.
Reviewed by: mmel
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31030
This supersedes the old legacy mode where a viewport register was used
to mux multiple regions behind a single set of registers, and is used on
the SiFive FU740.
Reviewed by: mmel
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31029