Commit Graph

17 Commits

Author SHA1 Message Date
John Baldwin
e706f7f0c7 Revamp the MSI/MSI-X code a bit to achieve two main goals:
- Simplify the amount of work that has be done for each architecture by
  pushing more of the truly MI code down into the PCI bus driver.
- Don't bind MSI-X indicies to IRQs so that we can allow a driver to map
  multiple MSI-X messages into a single IRQ when handling a message
  shortage.

The changes include:
- Add a new pcib_if method: PCIB_MAP_MSI() which is called by the PCI bus
  to calculate the address and data values for a given MSI/MSI-X IRQ.
  The x86 nexus drivers map this into a call to a new 'msi_map()' function
  in msi.c that does the mapping.
- Retire the pcib_if method PCIB_REMAP_MSIX() and remove the 'index'
  parameter from PCIB_ALLOC_MSIX().  MD code no longer has any knowledge
  of the MSI-X index for a given MSI-X IRQ.
- The PCI bus driver now stores more MSI-X state in a child's ivars.
  Specifically, it now stores an array of IRQs (called "message vectors" in
  the code) that have associated address and data values, and a small
  virtual version of the MSI-X table that specifies the message vector
  that a given MSI-X table entry uses.  Sparse mappings are permitted in
  the virtual table.
- The PCI bus driver now configures the MSI and MSI-X address/data
  registers directly via custom bus_setup_intr() and bus_teardown_intr()
  methods.  pci_setup_intr() invokes PCIB_MAP_MSI() to determine the
  address and data values for a given message as needed.  The MD code
  no longer has to call back down into the PCI bus code to set these
  values from the nexus' bus_setup_intr() handler.
- The PCI bus code provides a callout (pci_remap_msi_irq()) that the MD
  code can call to force the PCI bus to re-invoke PCIB_MAP_MSI() to get
  new values of the address and data fields for a given IRQ.  The x86
  MSI code uses this when an MSI IRQ is moved to a different CPU, requiring
  a new value of the 'address' field.
- The x86 MSI psuedo-driver loses a lot of code, and in fact the separate
  MSI/MSI-X pseudo-PICs are collapsed down into a single MSI PIC driver
  since the only remaining diff between the two is a substring in a
  bootverbose printf.
- The PCI bus driver will now restore MSI-X state (including programming
  entries in the MSI-X table) on device resume.
- The interface for pci_remap_msix() has changed.  Instead of accepting
  indices for the allocated vectors, it accepts a mini-virtual table
  (with a new length parameter).  This table is an array of u_ints, where
  each value specifies which allocated message vector to use for the
  corresponding MSI-X message.  A vector of 0 forces a message to not
  have an associated IRQ.  The device may choose to only use some of the
  IRQs assigned, in which case the unused IRQs must be at the "end" and
  will be released back to the system.  This allows a driver to use the
  same remap table for different shortage values.  For example, if a driver
  wants 4 messages, it can use the same remap table (which only uses the
  first two messages) for the cases when it only gets 2 or 3 messages and
  in the latter case the PCI bus will release the 3rd IRQ back to the
  system.

MFC after:	1 month
2007-05-02 17:50:36 +00:00
John Baldwin
5fe82bca57 Expand the MSI/MSI-X API to address some deficiencies in the MSI-X support.
- First off, device drivers really do need to know if they are allocating
  MSI or MSI-X messages.  MSI requires allocating powerof2() messages for
  example where MSI-X does not.  To address this, split out the MSI-X
  support from pci_msi_count() and pci_alloc_msi() into new driver-visible
  functions pci_msix_count() and pci_alloc_msix().  As a result,
  pci_msi_count() now just returns a count of the max supported MSI
  messages for the device, and pci_alloc_msi() only tries to allocate MSI
  messages.  To get a count of the max supported MSI-X messages, use
  pci_msix_count().  To allocate MSI-X messages, use pci_alloc_msix().
  pci_release_msi() still handles both MSI and MSI-X messages, however.
  As a result of this change, drivers using the existing API will only
  use MSI messages and will no longer try to use MSI-X messages.
- Because MSI-X allows for each message to have its own data and address
  values (and thus does not require all of the messages to have their
  MD vectors allocated as a group), some devices allow for "sparse" use
  of MSI-X message slots.  For example, if a device supports 8 messages
  but the OS is only able to allocate 2 messages, the device may make the
  best use of 2 IRQs if it enables the messages at slots 1 and 4 rather
  than default of using the first N slots (or indicies) at 1 and 2.  To
  support this, add a new pci_remap_msix() function that a driver may call
  after a successful pci_alloc_msix() (but before allocating any of the
  SYS_RES_IRQ resources) to allow the allocated IRQ resources to be
  assigned to different message indices.  For example, from the earlier
  example, after pci_alloc_msix() returned a value of 2, the driver would
  call pci_remap_msix() passing in array of integers { 1, 4 } as the
  new message indices to use.  The rid's for the SYS_RES_IRQ resources
  will always match the message indices.  Thus, after the call to
  pci_remap_msix() the driver would be able to access the first message
  in slot 1 at SYS_RES_IRQ rid 1, and the second message at slot 4 at
  SYS_RES_IRQ rid 4.  Note that the message slots/indices are 1-based
  rather than 0-based so that they will always correspond to the rid
  values (SYS_RES_IRQ rid 0 is reserved for the legacy INTx interrupt).
  To support this API, a new PCIB_REMAP_MSIX() method was added to the
  pcib interface to change the message index for a single IRQ.

Tested by:	scottl
2007-01-22 21:48:44 +00:00
John Baldwin
9bf4c9c1b0 First cut at MI support for PCI Message Signalled Interrupts (MSI):
- Add 3 new functions to the pci_if interface along with suitable wrappers
  to provide the device driver visible API:
  - pci_alloc_msi(dev, int *count) backed by PCI_ALLOC_MSI().  '*count'
    here is an in and out parameter.  The driver stores the desired number
    of messages in '*count' before calling the function.  On success,
    '*count' holds the number of messages allocated to the device.  Also on
    success, the driver can access the messages as SYS_RES_IRQ resources
    starting at rid 1.  Note that the legacy INTx interrupt resource will
    not be available when using MSI.  Note that this function will allocate
    either MSI or MSI-X messages depending on the devices capabilities and
    the 'hw.pci.enable_msix' and 'hw.pci.enable_msi' tunables.  Also note
    that the driver should activate the memory resource that holds the
    MSI-X table and pending bit array (PBA) before calling this function
    if the device supports MSI-X.
  - pci_release_msi(dev) backed by PCI_RELEASE_MSI().  This function
    releases the messages allocated for this device.  All of the
    SYS_RES_IRQ resources need to be released for this function to succeed.
  - pci_msi_count(dev) backed by PCI_MSI_COUNT().  This function returns
    the maximum number of MSI or MSI-X messages supported by this device.
    MSI-X is preferred if present, but this function will honor the
    'hw.pci.enable_msix' and 'hw.pci.enable_msi' tunables.  This function
    should return the largest value that pci_alloc_msi() can return
    (assuming the MD code is able to allocate sufficient backing resources
    for all of the messages).
- Add default implementations for these 3 methods to the pci_driver generic
  PCI bus driver.  (The various other PCI bus drivers such as for ACPI and
  OFW will inherit these default implementations.)  This default
  implementation depends on 4 new pcib_if methods that bubble up through
  the PCI bridges to the MD code to allocate IRQ values and perform any
  needed MD setup code needed:
  - PCIB_ALLOC_MSI() attempts to allocate a group of MSI messages.
  - PCIB_RELEASE_MSI() releases a group of MSI messages.
  - PCIB_ALLOC_MSIX() attempts to allocate a single MSI-X message.
  - PCIB_RELEASE_MSIX() releases a single MSI-X message.
- Add default implementations for these 4 methods that just pass the
  request up to the parent bus's parent bridge driver and use the
  default implementation in the various MI PCI bridge drivers.
- Add MI functions for use by MD code when managing MSI and MSI-X
  interrupts:
  - pci_enable_msi(dev, address, data) programs the MSI capability address
    and data registers for a group of MSI messages
  - pci_enable_msix(dev, index, address, data) initializes a single MSI-X
    message in the MSI-X table
  - pci_mask_msix(dev, index) masks a single MSI-X message
  - pci_unmask_msix(dev, index) unmasks a single MSI-X message
  - pci_pending_msix(dev, index) returns true if the specified MSI-X
    message is currently pending
- Save the MSI capability address and data registers in the pci_cfgreg
  block in a PCI devices ivars and restore the values when a device is
  resumed.  Note that the MSI-X table is not currently restored during
  resume.
- Add constants for MSI-X register offsets and fields.
- Record interesting data about any MSI-X capability blocks we come
  across in the pci_cfgreg block in the ivars for PCI devices.

Tested on:	em (i386, MSI), bce (amd64/i386, MSI), mpt (amd64, MSI-X)
Reviewed by:	scottl, grehan, jfv
MFC after:	2 months
2006-11-13 21:47:30 +00:00
John Baldwin
04dda605c5 - Make pcib_devclass private to sys/dev/pci/pci_pci.c and change all the
various pcib drivers to use their own private devclass_t variables for
  their modules.
- Use the DEFINE_CLASS_0() macro to declare drivers for the various pcib
  drivers while I'm here.
2006-01-06 19:22:19 +00:00
David E. O'Brien
2a191126de Canonize the include of acpi.h. 2005-09-11 18:39:03 +00:00
Nate Lawson
d05fa56bb3 Remove trailing whitespace. 2004-12-27 05:36:47 +00:00
John Baldwin
5e1ba6d4ae Rework the ACPI PCI link code.
- Use a new-bus device driver for the ACPI PCI link devices.  The devices
  are called pci_linkX.  The driver includes suspend/resume support so that
  the ACPI bridge drivers no longer have to poke the links to get them
  to handle suspend/resume.  Also, the code to handle which IRQs a link is
  routed to and choosing an IRQ when a link is not already routed is all
  contained in the link driver.  The PCI bridge drivers now ask the link
  driver which IRQ to use once they determine that a _PRT entry does not
  use a hardwired interrupt number.
- The new link driver includes support for multiple IRQ resources per
  link device as well as preserving any non-IRQ resources when adjusting
  the IRQ that a link is routed to.
- The entire approach to routing when using a link device is now
  link-centric rather than pci bus/device/pin specific.  Thus, when
  using a tunable to override the default IRQ settings, one now uses
  a single tunable to route an entire link rather than routing a single
  device that uses the link (which has great foot-shooting potential if
  the user tries to route the same link to two different IRQs using two
  different pci bus/device/pin hints).  For example, to adjust the IRQ
  that \_SB_.LNKA uses, one would set 'hw.pci.link.LNKA.irq=10' from the
  loader.
- As a side effect of having the link driver, unused link devices will now
  be disabled when they are probed.
- The algorithm for choosing an IRQ for a link that doesn't already have an
  IRQ assigned is now much closer to the one used in $PIR routing.  When a
  link is routed via an ISA IRQ, only known-good IRQs that the BIOS has
  already used are used for routing instead of using probabilities to
  guess at which IRQs are probably not used by an ISA device.  One change
  from $PIR is that the SCI is always considered a viable ISA IRQ, so that
  if the BIOS does not setup any IRQs the kernel will degenerate to routing
  all interrupts over the SCI.  For non ISA IRQs, interrupts are picked
  from the possible pool using a simplistic weighting algorithm.

Tested by:	ru, scottl, others on acpi@
Reviewed by:	njl
2004-11-23 22:26:44 +00:00
Nate Lawson
e4116e931c Re-work ACPI PCI IRQ routing (_PRT, link devices). The old approach was
incomplete in that the PRT routing was not aware of link programming.
Fix this by doing all routing through the link devices.  The new algorithm
for setting up links is:

1. Read _CRS to get current setting.  If invalid (not in _PRS), then set
   to 0.
2. Attempt to call _DIS on the link.  If successful, mark the link as not
   routed.  Otherwise, assume it still is.

Then when a routing request occurs:

3. Update weights for all IRQs
4. Attempt to route the initial IRQ if valid
5. If that fails, walk through the sorted list, attempting to route IRQs.
6. Configure the trigger/polarity based on _PRS.

Other changes:
* Add acpi_pci_find_prt() to look up the PRT entry for a given device and
  acpi_pci_link_route() to select/route the best IRQ for it.
* Remove duplicated code in acpi_pcib_route_interrupt() that picked the
  first IRQ from _PRS.
* Remove unneeded arguments from acpi_pcib_resume() and friends.
* Ignore _STA on link devices but report if it seems strange.
* Add a prt_source handle to the PRT structure since the ACPI struct
  ACPI_PCI_ROUTING_TABLE uses a fixed-size entry for it.  We'll need to
  dynamically size this object if we want to use it the same way ACPI-CA
  does.  Null-terminate the source.

Tested by:	Luo Hong <luohong99_at_mails.tsinghua.edu.cn>,
		Jeffrey Katcher <jmkatcher_at_yahoo.com>
Info from:	jhb, Len Brown (Intel)
2004-08-11 14:52:50 +00:00
Poul-Henning Kamp
fe12f24bb0 Add missing <sys/module.h> includes 2004-05-30 20:08:47 +00:00
Nate Lawson
5acd02180c Style cleanups, don't set the device description before the probe routine
has completed successfully.
2004-05-29 04:32:50 +00:00
John Baldwin
fb0ac5433b If an ACPI PCI-PCI bridge doesn't have a _PRT object, fall back to using
the swizzle method for routing PCI interrupts across the bridge.  This
fixes problems with motherboards (typically laptops) whose BIOS doesn't
provide a PRT for the AGP bridge even though there is a device entry for
the bridge in the ACPI namespace.

Tested by:	Kenneth Culver culverk at sweetdreamsracing dot biz
2004-05-10 18:26:22 +00:00
Nate Lawson
64278df5e0 Add MODULE_DEPEND entries so some of these drivers can eventually be
loaded separately from ACPI (i.e., embedded use).
2004-04-09 18:14:32 +00:00
David E. O'Brien
aad970f1fe Use __FBSDID().
Also some minor style cleanups.
2003-08-24 17:55:58 +00:00
Warner Losh
cace7a2a4d Prefer new location of pci include files (which have only been in the
tree for two or more years now), except in a few places where there's
code to be compatible with older versions of FreeBSD.
2003-08-22 06:06:16 +00:00
Mitsuru IWASAKI
5e0ca5771b Make sure that ACPI PCI driver probe routine call pci_cfgregopen()
before start accessing PCI config space.

Reviewed by:	jhb
2002-10-05 02:16:49 +00:00
Mitsuru IWASAKI
ba835e3fe6 Add code for ACPI PCI link object manipulation.
This allocate the best IRQ to boot-disable devices (have IRQ 0).
Allocated IRQ will be used for PCI interrupt routing when ACPI is
enabled.

Note that verbose messaging enabled for the time being so that
people can easily notice the strange behavior if it happened.
2002-10-05 02:01:05 +00:00
John Baldwin
2ccfc93222 Overhaul the ACPI PCI bridge driver a bit:
- Add an ACPI PCI-PCI bridge driver (the previous driver just handled
  Host-PCI bridges) that is a PCI driver that is a subclass of the generic
  PCI-PCI bridge driver.  It overrides probe, attach, read_ivar, and
  pci_route_interrupt.
  - The probe routine only succeeds if our parent is an ACPI PCI bus which
    we test for by seeing if we can read our ACPI_HANDLE as an ivar.
  - The attach routine saves a copy of our handle and calls the new
    acpi_pcib_attach_common() function described below.
  - The read_ivar routine handles normal PCI-PCI bridge ivars and adds an
    ivar to return the ACPI_HANDLE of the bus this bridge represents.
  - The route_interrupt routine fetches the _PRT (PCI Interrupt Routing
    Table) from the bridge device's softc and passes it off to
    acpi_pcib_route_interrupt() to route the interrupt.
- Split the old ACPI Host-PCI bridge driver into two pieces.  Part of
  the attach routine and most of the route_interrupt routine remain in
  acpi_pcib.c and are shared by both ACPI PCI bridge drivers.
  - The attach routine verifies the PCI bridge is present, reads in
    the _PRT for the bridge, and attaches the child PCI bus.
  - The route_interrupt routine uses the passed in _PRT to route a PCI
    interrupt.
  The rest of the driver is the ACPI Host-PCI bridge specific bits that
  live in acpi_pcib_acpi.c.
  - We no longer duplicate pcib_maxslots but use it directly.
  - The driver now uses the pcib devclass instead of its own devclass.
    This means that PCI busses are now only children of pcib devices.
  - Allow the ACPI_HANDLE for the child PCI bus to be read as an ivar
    of the child bus.
  - Fetch the _PRT for routing PCI interrupts directly from our softc
    instead of walking the devclass to find ourself and then fetch our
    own softc.

With this change and the new ACPI PCI bus driver, ACPI can now properly
route interrupts for devices behind PCI-PCI bridges.  That is, the
Itanium2 with like 10 PCI busses can now boot ok and route all the PCI
interrupts.  Hopefully this will also fix problems people are having with
CardBus bridges behind PCI-PCI bridges not properly routing interrupts
when ACPI is used.

Tested on:	i386, ia64
2002-08-26 18:30:27 +00:00