Implement the interace to create SR-IOV Virtual Functions (VFs).
When a driver registers that they support SR-IOV by calling
pci_setup_iov(), the SR-IOV code creates a new node in /dev/iov
for that device. An ioctl can be invoked on that device to
create VFs and have the driver initialize them.
At this point, allocating memory I/O windows (BARs) is not
supported.
Differential Revision: https://reviews.freebsd.org/D76
Reviewed by: jhb
MFC after: 1 month
Sponsored by: Sandvine Inc.
* Add a bus_if.m method - get_domain() - returning the VM domain or
ENOENT if the device isn't in a VM domain;
* Add bus methods to print out the domain of the device if appropriate;
* Add code in srat.c to save the PXM -> VM domain mapping that's done and
expose a function to translate VM domain -> PXM;
* Add ACPI and ACPI PCI methods to check if the bus has a _PXM attribute
and if so map it to the VM domain;
* (.. yes, this works recursively.)
* Have the pci bus glue print out the device VM domain if present.
Note: this is just the plumbing to start enumerating information -
it doesn't at all modify behaviour.
Differential Revision: D906
Reviewed by: jhb
Sponsored by: Norse Corp
This patch adds support for MSI interrupts when running on Xen. Apart
from adding the Xen related code needed in order to register MSI
interrupts this patch also makes the msi_init function a hook in
init_ops, so different MSI implementations can have different
initialization functions.
Sponsored by: Citrix Systems R&D
xen/interface/physdev.h:
- Add the MAP_PIRQ_TYPE_MULTI_MSI to map multi-vector MSI to the Xen
public interface.
x86/include/init.h:
- Add a hook for setting custom msi_init methods.
amd64/amd64/machdep.c:
i386/i386/machdep.c:
- Set the default msi_init hook to point to the native MSI
initialization method.
x86/xen/pv.c:
- Set the Xen MSI init hook when running as a Xen guest.
x86/x86/local_apic.c:
- Call the msi_init hook instead of directly calling msi_init.
xen/xen_intr.h:
x86/xen/xen_intr.c:
- Introduce support for registering/releasing MSI interrupts with
Xen.
- The MSI interrupts will use the same PIC as the IO APIC interrupts.
xen/xen_msi.h:
x86/xen/xen_msi.c:
- Introduce a Xen MSI implementation.
x86/xen/xen_nexus.c:
- Overwrite the default MSI hooks in the Xen Nexus to use the Xen MSI
implementation.
x86/xen/xen_pci.c:
- Introduce a Xen specific PCI bus that inherits from the ACPI PCI
bus and overwrites the native MSI methods.
- This is needed because when running under Xen the MSI messages used
to configure MSI interrupts on PCI devices are written by Xen
itself.
dev/acpica/acpi_pci.c:
- Lower the quality of the ACPI PCI bus so the newly introduced Xen
PCI bus can take over when needed.
conf/files.i386:
conf/files.amd64:
- Add the newly created files to the build process.
1.3 of Intelб╝ Virtualization Technology for Directed I/O Architecture
Specification. The Extended Context and PASIDs from the rev. 2.2 are
not supported, but I am not aware of any released hardware which
implements them. Code does not use queued invalidation, see comments
for the reason, and does not provide interrupt remapping services.
Code implements the management of the guest address space per domain
and allows to establish and tear down arbitrary mappings, but not
partial unmapping. The superpages are created as needed, but not
promoted. Faults are recorded, fault records could be obtained
programmatically, and printed on the console.
Implement the busdma(9) using DMARs. This busdma backend avoids
bouncing and provides security against misbehaving hardware and driver
bad programming, preventing leaks and corruption of the memory by wild
DMA accesses.
By default, the implementation is compiled into amd64 GENERIC kernel
but disabled; to enable, set hw.dmar.enable=1 loader tunable. Code is
written to work on i386, but testing there was low priority, and
driver is not enabled in GENERIC. Even with the DMAR turned on,
individual devices could be directed to use the bounce busdma with the
hw.busdma.pci<domain>:<bus>:<device>:<function>.bounce=1 tunable. If
DMARs are capable of the pass-through translations, it is used,
otherwise, an identity-mapping page table is constructed.
The driver was tested on Xeon 5400/5500 chipset legacy machine,
Haswell desktop and E5 SandyBridge dual-socket boxes, with ahci(4),
ata(4), bce(4), ehci(4), mfi(4), uhci(4), xhci(4) devices. It also
works with em(4) and igb(4), but there some fixes are needed for
drivers, which are not committed yet. Intel GPUs do not work with
DMAR (yet).
Many thanks to John Baldwin, who explained me the newbus integration;
Peter Holm, who did all testing and helped me to discover and
understand several incredible bugs; and to Jim Harris for the access
to the EDS and BWG and for listening when I have to explain my
findings to somebody.
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
The tag enforces a single restriction that all DMA transactions must not
cross a 4GB boundary. Note that while this restriction technically only
applies to PCI-express, this change applies it to all PCI devices as it
is simpler to implement that way and errs on the side of caution.
- Add a softc structure for PCI bus devices to hold the bus_dma tag and
a new pci_attach_common() routine that performs actions common to the
attach phase of all PCI bus drivers. Right now this only consists of
a bootverbose printf and the allocate of a bus_dma tag if necessary.
- Adjust all PCI bus drivers to allocate a PCI bus softc and to call
pci_attach_common() from their attach routines.
MFC after: 2 weeks
ACPI Device() objects that do not have any device IDs available via the
_HID or _CID methods. Without a device ID a device driver cannot attach
to the device anyway. Namespace objects that are devices but not of
type ACPI_TYPE_DEVICE are not affected.
A few BIOSes have also attached a _CRS method to a PCI device to
allocate resources that are not managed via a BAR. With the previous
code those resources are allocated from acpi0 directly which can interfere
with the new PCI-PCI bridge driver (since the PCI device in question may
be behind a bridge and its resources should be allocated from that
bridge's windows instead). The resources were also orphaned and
and would end up associated with some other random device whose device_t
reused the pointer of the original ACPI-enumerated device (after it was
free'd by the ACPI PCI bus driver) in devinfo output which was confusing.
If we want to handle _CRS on PCI devices we can adjust the ACPI PCI bus
driver to do that in the future and associate the resources with the
proper device object respecting PCI-PCI bridges, etc.
Note that with this change the ACPI PCI bus driver no longer has to
delete ACPI-enumerated device_t devices that mirror PCI devices since
they should in general not exist. There are rare cases when a BIOS
will give a PCI device a _HID (e.g. I've seen a PCI-ISA bridge given
a _HID for a system resource device). In that case we leave both the
ACPI and PCI-enumerated device_t objects around just as in the previous
code.
doesn't "fail", it may merely return garbage if it is not a valid ivar
for a given device. Our parent device must be a 'pcib' device, so we
can just assume it implements pcib IVARs, and all pcib devices have a
bus number.
Submitted by: clang via rdivacky
knowledges from the file. All PCI devices enumerated in ACPI tree must use
correct one from acpi_pci.c any way. Reduce duplicate codes as we did for
pci.c in r213905. Do not return ESRCH from PCIB_POWER_FOR_SLEEP method.
When the method is not found, just return zero without modifying the given
default value as it is completely optional. As a side effect, the return
state must not be NULL. Note there is actually no functional change by
removing ESRCH because acpi_pcib_power_for_sleep() always returns zero.
Adjust debugging messages and add new ones under bootverbose to help
debugging device power state related issues.
Reviewed by: jhb, imp (earlier versions)
handle to the PCI device_t if the ACPI device_t is already attached to a
driver. This happens on the Tablet TC1000 which for some reason includes
two PCI-ISA bridges and treats the second bridge as an ACPI system resource
device.
Reviewed by: njl (a while ago)
MFC after: 3 days
support machines having multiple independently numbered PCI domains
and don't support reenumeration without ambiguity amongst the
devices as seen by the OS and represented by PCI location strings.
This includes introducing a function pci_find_dbsf(9) which works
like pci_find_bsf(9) but additionally takes a domain number argument
and limiting pci_find_bsf(9) to only search devices in domain 0 (the
only domain in single-domain systems). Bge(4) and ofw_pcibus(4) are
changed to use pci_find_dbsf(9) instead of pci_find_bsf(9) in order
to no longer report false positives when searching for siblings and
dupe devices in the same domain respectively.
Along with this change the sole host-PCI bridge driver converted to
actually make use of PCI domain support is uninorth(4), the others
continue to use domain 0 only for now and need to be converted as
appropriate later on.
Note that this means that the format of the location strings as used
by pciconf(8) has been changed and that consumers of <sys/pciio.h>
potentially need to be recompiled.
Suggested by: jhb
Reviewed by: grehan, jhb, marcel
Approved by: re (kensmith), jhb (PCI maintainer hat)
to search for a specific extended capability. If the specified capability
is found for the given device, then the function returns success and
optionally returns the offset of that capability. If the capability is
not found, the function returns an error.
Unify the code to disable GPEs with the enable code. Shutdown is handled
the same way. ACPI now does all wake/sleep prep for child devices so
now they no longer need to call external functions in the suspend/resume
path. Add the flags to non-ACPI busses (i.e., pci).
device associated with any PCI devices that are enumerated in the ACPI
tree when adding children to an ACPI PCI bus and remove the duplicate
ACPI-only device_t and replace the device_t associated with the handle with
the ACPI and PCI aware device_t.
which doesn't support ACPI power states. Return AE_NOT_FOUND for these
cases and don't print the warning message. Also, print the name of the
handle instead of device when unable to switch states. The device is often
not attached at this point and so its name is NULL, which doesn't help
debugging.
1) In pci.c, we need to check the child device's state, not the parent
device's state.
2) In acpi_pci.c, we have to run the power state change after the acpi
method when the old_state is > new state, not the other way around.
Submitted by: Dmitry Remesov
PR: 65694
o Save and restore bars for suspend/resume as well as for D3->D0
transitions.
o preallocate resources that the PCI devices use to avoid resource
conflicts
o lazy allocation of resources not allocated by the BIOS.
o set unattached drivers to state D3. Set power state to D0
before probe/attach. Right now there's two special cases
for this (display and memory devices) that need work in other
areas of the tree.
Please report any bugs to me.
are enumerated in the ACPI device tree. In addition to the normal PCI
powerstate functionality, the ACPI _PSx methods are executed and ACPI
PowerResources are switched on and off via the acpi_pwr_switch_consumer()
function.
Glanced at by: imp, njl
interrupt to be used for a device. This is intended solely for internal
use of PCI bus implementations, and exists so that PCI bus drivers
implementing special interrupt assignment methods which require
additional work at the bus level to work right can be easily derived
from the generic driver (or any other one) without resorting to hacks.
It will be used in the sparc64 ofw_pcibus driver, which will be
committed shortly.
Make use of this method in the generic implementation, and add it to
the method table of bus drivers derived from the PCI one.
Reviewed by: imp, -hackers
pci busses implement this.
Also minor comment smithing in cardbus. Fix copyright to this year
with my name on it since I've been doing a lot to this file.
Reviewed by: jhb
when the first PCI bus attaches.
- Create /dev/pci during MOD_LOAD as well.
- Destroy /dev/pci during MOD_UNLOAD (not that you can kldunload pci, but
might as well get the code right)
driver. This driver overrides the probe, attach, and read_ivar methods.
If the parent bridge is an ACPI PCI bridge, then the probe routine will
match, otherwise it will fail. It tests this by seeing if it can get
the ACPI_HANDLE ivar from the bridge device.
In the attach routine, it uses pci_add_children() to add all the child
devices (but with a slightly larger ivar so it can store ACPI_HANDLE's
for child devices) and then walks through the ACPI namespace below the
bus device to cache ACPI_HANDLE's for all child devices present in the
namespace. It does this by comparing the pci slot and function to the
address encoded in _ADR of the devices in the ACPI namespace.
The read_ivar routine passes most requests off to pci_read_ivar()
and adds a new request so that the ACPI_HANDLE for a child device can
be read.
To add proper power support the power methods can be overridden as well,
but that is not currently implemented. Also, there are cases where a
device may show in the ACPI namespace as a PCI device that the PCI probe
does not find. Currently such devices are ignored.
Tested on: i386, ia64