Powermac G5 systems. MSI and several other things are not presently
supported.
The U3/U4 internal device support portions of this change were contributed
by Andreas Tobler.
MFC after: 1 week
pci_delete_child() function called by the cardbus driver. The new function
uses resource_list_unreserve() to release the BARs decoded by the device
being removed.
Reviewed by: imp
Tested by: brooks
handling for the PCIR_BIOS decoding enable bit from the cardbus driver.
The PCIR_BIOS BAR does include type bits like other BARs. Instead, it is
always a 32-bit non-prefetchable memory BAR where the low bit is used as a
flag to enable decoding.
Reviewed by: imp
are not allocated by the device driver. These resources should still appear
allocated from the system's perspective so that their assigned ranges are
not reused by other resource requests. The PCI bus driver has used a hack
to effect this for a while now where it uses rman_set_device() to assign
devices to the PCI bus when they are first encountered and later assigns
them to the actual device when a driver allocates a BAR. A few downsides of
this approach is that it results in somewhat confusing devinfo -r output as
well as not being very easily portable to other bus drivers.
This commit adds generic support for "reserved" resources to the resource
list API used by many bus drivers to manage the resources of child devices.
A resource may be reserved via resource_list_reserve(). This will allocate
the resource from the bus' parent without activating it.
resource_list_alloc() recognizes an attempt to allocate a reserved resource.
When this happens it activates the resource (if requested) and then returns
the reserved resource. Similarly, when a reserved resource is released via
resource_list_release(), it is deactivated (if it is active) and the
resource is then marked reserved again, but is left allocated from the
bus' parent. To completely remove a reserved resource, a bus driver may
use resource_list_unreserve(). A bus driver may use resource_list_busy()
to determine if a reserved resource is allocated by a child device or if
it can be unreserved.
The PCI bus driver has been changed to use this framework instead of
abusing rman_set_device() to keep track of reserved vs allocated resources.
Submitted by: imp (an older version many moons ago)
MFC after: 1 month
Have the early USB takeover enabled for i386 and amd64
by default.
This also avoids a panic on PowerPC where the resource
isn't released properly and we find a busy resource
when the USB host controller wants to allocate it...
all host controllers at the same time, we avoid problems where the BIOS will
actually write to the USB registers of all the USB host controllers every time
we handover one of them, and consequently reset the OS programmed values.
Submitted by: avg
Reviewed by: jhb
decoding "took". Other OS's that I checked do not do this and it breaks
some amdpm(4) devices. Prior to 7.2 we did not honor the error returned
when this failed anyway, so this in effect restores previous behavior.
PR: kern/137668
Tested by: Aurelien Mere aurelien.mere amc-os.com
MFC after: 3 days
actually specify valid bases that should be treated just as normal.
The PCI specifications have no indication that 0 would be a magic value
indicating a disabled BAR as commonly used on at least amd64 and i386
but not sparc64. It's unclear what to do in pci_delete_resource()
instead of writing 0 to a BAR though as there's no (other) way do
disable individual BARs so its decoding is left enabled in case of
__PCI_BAR_ZERO_VALID for now.
Approved by: re (kib), jhb
MFC after: 1 week
addresses to BARs into new pci_read_bar() and pci_write_bar() routines.
pci_add_map(), pci_alloc_map(), and pci_delete_resource() now use these
routines to work with BARs.
- Just pass the device_t for the new PCI device to various routines instead
of passing the device, bus, slot, and function.
Reviewed by: imp
when determining the size of a BAR by writing all 1's to the BAR and
reading back the result, always operate on the full 64-bit size.
Reviewed by: imp
MFC after: 1 month
flag when calling bus_alloc_resource() to allocate resources from a parent
PCI bridge. For PCI-PCI bridges this asks the bridge to satisfy the
request using the prefetchable memory range rather than the normal
memory range.
Reviewed by: imp
Reported by: scottl
MFC after: 1 week
We now explicitly enable INTx during bus_setup_intr() if it is needed.
Several of the ata drivers were managing this bit internally. This is
better handled in pci and it should work for all drivers now.
We also mask INTx during bus_teardown_intr() by setting this bit.
Reviewed by: jhb
MFC after: 3 days
A while back, Warner changed the PCI bus code to reserve resources when
enumerating devices and simply give devices the previously allocated
resources when they call bus_alloc_resource(). This ensures that address
ranges being decoded by a BAR are always allocated in the nexus0 device
(or whatever device the PCI bus gets its address space from) even if a
device driver is not attached to the device. This patch extends this
behavior further:
- To let the PCI bus distinguish between a resource being allocated by
a device driver vs. merely being allocated by the bus, use
rman_set_device() to assign the device to the bus when it is owned
by the bus and to the child device when it is allocated by the child
device's driver. We can now prevent a device driver from allocating
the same device twice. Doing so could result in odd things like
allocating duplicate virtual memory to map the resource on some
archs and leaking the original mapping.
- When a PCI device driver releases a resource, don't pass the request
all the way up the tree and release it in the nexus (or similar device)
since the BAR is still active and decoding. Otherwise, another device
could later allocate the same range even though it is still in use.
Instead, deactivate the resource and assign it back to the PCI bus
using rman_set_device().
- pci_delete_resource() will actually completely free a BAR including
attemping to disable it.
- Disable BAR decoding via the command register when sizing a BAR in
pci_alloc_map() which is used to allocate resources for a BAR when
the BIOS/firmware did not assign a usable resource range during boot.
This mirrors an earlier fix to pci_add_map() which is used when to
size BARs during boot.
- Move the activation of I/O decoding in the PCI command register into
pci_activate_resource() instead of doing it in pci_alloc_resource().
Previously we could actually enable decoding before a BAR was
initialized via pci_alloc_map().
Glanced at by: bsdimp
This addresses interrupt storms that were noticed after enabling MSI
in drm. I think this is due to a loose interpretation of the PCI 2.3
spec, which states that a function using MSI is prohibitted from using
INTx. It appears that some vendors interpretted that to mean that they
should handle it in hardware, while others felt it was the drivers
responsibility.
This fix will also likely resolve interrupt storm related issues with
devices other than drm.
Reviewed by: jhb@
MFC after: 3 days
by writing all 1's to it to determine its length. This fixes issues with
MCFG on at least some machines where a trashed BAR claimed subsequent
attempts at PCI config transactions because the addresses in the MCFG
window fell in the decoding range of the BAR.
In general it is a bad idea to leave the BARs enabled while we are
frobbing with them in this manner.
Sleuthing by: tegge
MFC after: 1 week
only in low memory situations, so the error fork of these fixes is
lightly tested, but they should do the least-wrong thing...
Submitted by: Hans Petter Selasky
pci_add_map(). First, this condition is already handled earlier in
the function. Second, as written the check would never fire as the
'start' value was overwritten with a long value (rman_get_start() returns
long) before the comparison was done.
Discussed with: imp
MFC after: 2 weeks
for a PCI device during the boot-time probe of the parent PCI bus, then
zero the BAR and clear the resource list entry for that BAR. This forces
the PCI bus driver to request a valid resource range from the parent bridge
driver when the device driver tries to allocate the BAR. Similarly, if the
initial value of a BAR is a valid range but it is > 4GB and the current OS
only has 32-bit longs, then do a full teardown of the initial value of the
BAR to force a reallocation.
Reviewed by: imp
MFC after: 1 week
used but MSI to HyperTransport IRQ mapping is enabled, and would act as
if MSI is turned on, resulting in interrupt loss.
This commit will,
1. enable MSI mapping on a device only when MSI is enabled for that
device and the MSI address matches the HT mapping window.
2. enable MSI mapping on a bridge only when a downstream device is
allocated an MSI address in the mapping window
PR: kern/118842
Reviewed by: jhb
MFC after: 1 week
PCI-express chipset (and thus has functional MSI) if there are any
PCI-express devices in the system, not requiring a root port device.
With PCI-X the chipset detection has to be very conservative because there
are known systems with PCI-X devices that do not appear to have PCI-X
chipsets. However, with PCI-express I'm not sure it is possible to have
a PCI-express device in a system with a non-PCI-express chipset. If we
assume that is the case then this change is valid. It is also required
for at least some PCI-express systems that don't have any devices with
a root port capability (some ICH9 systems).
MFC after: 1 week
Reported by: jfv
- Implement timing out of VPD register access.[1]
- Fix an off-by-one error of freeing malloc'd space when checksum is invalid.
- Fix style(9) bugs, i.e., sizeof cannot be followed by space.
- Retire now obsolete 'hw.pci.enable_vpd' tunable.
Submitted by: cokane (initial revision)[1]
Reviewed by: marius (intermediate revision)
Silence from: jhb, jmg, rwatson
Tested by: cokane, jkim
MFC after: 3 days
support machines having multiple independently numbered PCI domains
and don't support reenumeration without ambiguity amongst the
devices as seen by the OS and represented by PCI location strings.
This includes introducing a function pci_find_dbsf(9) which works
like pci_find_bsf(9) but additionally takes a domain number argument
and limiting pci_find_bsf(9) to only search devices in domain 0 (the
only domain in single-domain systems). Bge(4) and ofw_pcibus(4) are
changed to use pci_find_dbsf(9) instead of pci_find_bsf(9) in order
to no longer report false positives when searching for siblings and
dupe devices in the same domain respectively.
Along with this change the sole host-PCI bridge driver converted to
actually make use of PCI domain support is uninorth(4), the others
continue to use domain 0 only for now and need to be converted as
appropriate later on.
Note that this means that the format of the location strings as used
by pciconf(8) has been changed and that consumers of <sys/pciio.h>
potentially need to be recompiled.
Suggested by: jhb
Reviewed by: grehan, jhb, marcel
Approved by: re (kensmith), jhb (PCI maintainer hat)
the duration of the function. The device we would otherwise
have left in an useless state may just as well be the low-level
console. When booting verbose, we do need it addressable if we
want to avoid a MCA.
Approved by: re (kensmith)
the power_nodriver tunable is off. pci_cfg_save() already checks the
tunable internally, and no other callers of pci_cfg_save() check the
tunable.
Reviewed by: imp
- Simplify the amount of work that has be done for each architecture by
pushing more of the truly MI code down into the PCI bus driver.
- Don't bind MSI-X indicies to IRQs so that we can allow a driver to map
multiple MSI-X messages into a single IRQ when handling a message
shortage.
The changes include:
- Add a new pcib_if method: PCIB_MAP_MSI() which is called by the PCI bus
to calculate the address and data values for a given MSI/MSI-X IRQ.
The x86 nexus drivers map this into a call to a new 'msi_map()' function
in msi.c that does the mapping.
- Retire the pcib_if method PCIB_REMAP_MSIX() and remove the 'index'
parameter from PCIB_ALLOC_MSIX(). MD code no longer has any knowledge
of the MSI-X index for a given MSI-X IRQ.
- The PCI bus driver now stores more MSI-X state in a child's ivars.
Specifically, it now stores an array of IRQs (called "message vectors" in
the code) that have associated address and data values, and a small
virtual version of the MSI-X table that specifies the message vector
that a given MSI-X table entry uses. Sparse mappings are permitted in
the virtual table.
- The PCI bus driver now configures the MSI and MSI-X address/data
registers directly via custom bus_setup_intr() and bus_teardown_intr()
methods. pci_setup_intr() invokes PCIB_MAP_MSI() to determine the
address and data values for a given message as needed. The MD code
no longer has to call back down into the PCI bus code to set these
values from the nexus' bus_setup_intr() handler.
- The PCI bus code provides a callout (pci_remap_msi_irq()) that the MD
code can call to force the PCI bus to re-invoke PCIB_MAP_MSI() to get
new values of the address and data fields for a given IRQ. The x86
MSI code uses this when an MSI IRQ is moved to a different CPU, requiring
a new value of the 'address' field.
- The x86 MSI psuedo-driver loses a lot of code, and in fact the separate
MSI/MSI-X pseudo-PICs are collapsed down into a single MSI PIC driver
since the only remaining diff between the two is a substring in a
bootverbose printf.
- The PCI bus driver will now restore MSI-X state (including programming
entries in the MSI-X table) on device resume.
- The interface for pci_remap_msix() has changed. Instead of accepting
indices for the allocated vectors, it accepts a mini-virtual table
(with a new length parameter). This table is an array of u_ints, where
each value specifies which allocated message vector to use for the
corresponding MSI-X message. A vector of 0 forces a message to not
have an associated IRQ. The device may choose to only use some of the
IRQs assigned, in which case the unused IRQs must be at the "end" and
will be released back to the system. This allows a driver to use the
same remap table for different shortage values. For example, if a driver
wants 4 messages, it can use the same remap table (which only uses the
first two messages) for the cases when it only gets 2 or 3 messages and
in the latter case the PCI bus will release the 3rd IRQ back to the
system.
MFC after: 1 month