Commit Graph

233 Commits

Author SHA1 Message Date
John Baldwin
e706f7f0c7 Revamp the MSI/MSI-X code a bit to achieve two main goals:
- Simplify the amount of work that has to be done for each architecture by
  pushing more of the truly MI code down into the PCI bus driver.
- Don't bind MSI-X indices to IRQs so that we can allow a driver to map
  multiple MSI-X messages into a single IRQ when handling a message
  shortage.

The changes include:
- Add a new pcib_if method: PCIB_MAP_MSI() which is called by the PCI bus
  to calculate the address and data values for a given MSI/MSI-X IRQ.
  The x86 nexus drivers map this into a call to a new 'msi_map()' function
  in msi.c that does the mapping.
- Retire the pcib_if method PCIB_REMAP_MSIX() and remove the 'index'
  parameter from PCIB_ALLOC_MSIX().  MD code no longer has any knowledge
  of the MSI-X index for a given MSI-X IRQ.
- The PCI bus driver now stores more MSI-X state in a child's ivars.
  Specifically, it now stores an array of IRQs (called "message vectors" in
  the code) that have associated address and data values, and a small
  virtual version of the MSI-X table that specifies the message vector
  that a given MSI-X table entry uses.  Sparse mappings are permitted in
  the virtual table.
- The PCI bus driver now configures the MSI and MSI-X address/data
  registers directly via custom bus_setup_intr() and bus_teardown_intr()
  methods.  pci_setup_intr() invokes PCIB_MAP_MSI() to determine the
  address and data values for a given message as needed.  The MD code
  no longer has to call back down into the PCI bus code to set these
  values from the nexus' bus_setup_intr() handler.
- The PCI bus code provides a callout (pci_remap_msi_irq()) that the MD
  code can call to force the PCI bus to re-invoke PCIB_MAP_MSI() to get
  new values of the address and data fields for a given IRQ.  The x86
  MSI code uses this when an MSI IRQ is moved to a different CPU, requiring
  a new value of the 'address' field.
- The x86 MSI pseudo-driver loses a lot of code, and in fact the separate
  MSI/MSI-X pseudo-PICs are collapsed down into a single MSI PIC driver
  since the only remaining diff between the two is a substring in a
  bootverbose printf.
- The PCI bus driver will now restore MSI-X state (including programming
  entries in the MSI-X table) on device resume.
- The interface for pci_remap_msix() has changed.  Instead of accepting
  indices for the allocated vectors, it accepts a mini-virtual table
  (with a new length parameter).  This table is an array of u_ints, where
  each value specifies which allocated message vector to use for the
  corresponding MSI-X message.  A vector of 0 forces a message to not
  have an associated IRQ.  The device may choose to only use some of the
  IRQs assigned, in which case the unused IRQs must be at the "end" and
  will be released back to the system.  This allows a driver to use the
  same remap table for different shortage values.  For example, if a driver
  wants 4 messages, it can use the same remap table (which only uses the
  first two messages) for the cases when it only gets 2 or 3 messages and
  in the latter case the PCI bus will release the 3rd IRQ back to the
  system.
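
The remap-table rules in the last item can be modeled in user space. The sketch below is not the kernel code: remap_check() and its return convention are hypothetical names introduced only for illustration. It assumes table entries are 1-based vector numbers (0 meaning "no IRQ"), rejects entries that name a vector the driver never received, requires that unused vectors sit only at the end, and reports how many vectors stay in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of the remap-table rules described above.
 * table[i] names the allocated vector (1-based) used by MSI-X message i;
 * 0 means "no IRQ for this message".  Returns the number of vectors that
 * remain in use (any allocated vectors above that count would be
 * released), or -1 if the table breaks one of the rules.
 */
static int
remap_check(const unsigned int *table, size_t len, unsigned int allocated)
{
	bool used[64] = { false };
	unsigned int highest = 0;

	assert(allocated < 64);		/* limit of this sketch only */
	for (size_t i = 0; i < len; i++) {
		if (table[i] > allocated)
			return (-1);	/* names a vector we never got */
		if (table[i] != 0) {
			used[table[i]] = true;
			if (table[i] > highest)
				highest = table[i];
		}
	}
	/* Unused vectors may only be at the end: 1..highest must be used. */
	for (unsigned int v = 1; v <= highest; v++)
		if (!used[v])
			return (-1);
	return ((int)highest);
}
```

With a table such as { 1, 2, 1, 2 } for a driver that wanted 4 messages, the model reports 2 vectors in use whether 2 or 3 were allocated; in the latter case the third would be the one released.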

MFC after:	1 month
2007-05-02 17:50:36 +00:00
John Baldwin
5fe82bca57 Expand the MSI/MSI-X API to address some deficiencies in the MSI-X support.
- First off, device drivers really do need to know if they are allocating
  MSI or MSI-X messages.  MSI requires allocating powerof2() messages for
  example where MSI-X does not.  To address this, split out the MSI-X
  support from pci_msi_count() and pci_alloc_msi() into new driver-visible
  functions pci_msix_count() and pci_alloc_msix().  As a result,
  pci_msi_count() now just returns a count of the max supported MSI
  messages for the device, and pci_alloc_msi() only tries to allocate MSI
  messages.  To get a count of the max supported MSI-X messages, use
  pci_msix_count().  To allocate MSI-X messages, use pci_alloc_msix().
  pci_release_msi() still handles both MSI and MSI-X messages, however.
  As a result of this change, drivers using the existing API will only
  use MSI messages and will no longer try to use MSI-X messages.
- Because MSI-X allows for each message to have its own data and address
  values (and thus does not require all of the messages to have their
  MD vectors allocated as a group), some devices allow for "sparse" use
  of MSI-X message slots.  For example, if a device supports 8 messages
  but the OS is only able to allocate 2 messages, the device may make the
  best use of 2 IRQs if it enables the messages at slots 1 and 4 rather
  than the default of using the first N slots (or indices) at 1 and 2.  To
  support this, add a new pci_remap_msix() function that a driver may call
  after a successful pci_alloc_msix() (but before allocating any of the
  SYS_RES_IRQ resources) to allow the allocated IRQ resources to be
  assigned to different message indices.  For example, from the earlier
  example, after pci_alloc_msix() returned a value of 2, the driver would
  call pci_remap_msix() passing in an array of integers { 1, 4 } as the
  new message indices to use.  The rid's for the SYS_RES_IRQ resources
  will always match the message indices.  Thus, after the call to
  pci_remap_msix() the driver would be able to access the first message
  in slot 1 at SYS_RES_IRQ rid 1, and the second message at slot 4 at
  SYS_RES_IRQ rid 4.  Note that the message slots/indices are 1-based
  rather than 0-based so that they will always correspond to the rid
  values (SYS_RES_IRQ rid 0 is reserved for the legacy INTx interrupt).
  To support this API, a new PCIB_REMAP_MSIX() method was added to the
  pcib interface to change the message index for a single IRQ.
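
The power-of-two constraint in the first item is the reason a driver has to know which kind of message it is allocating. A minimal user-space sketch, assuming a driver that simply rounds its request down for MSI: is_powerof2() mirrors the powerof2() macro from <sys/param.h>, and msi_round_down() is a hypothetical helper, not part of the PCI API.

```c
#include <assert.h>

/* Mirrors powerof2() from <sys/param.h>; redefined so this builds anywhere. */
#define	is_powerof2(x)	((((x) - 1) & (x)) == 0)

/*
 * Hypothetical helper: largest power of two that does not exceed the
 * number of messages the driver would like.  A count like this is what
 * an MSI request needs; an MSI-X request can use 'wanted' as-is.
 */
static int
msi_round_down(int wanted)
{
	int count = 1;

	assert(wanted >= 1);
	while (count * 2 <= wanted)
		count *= 2;
	return (count);
}
```

A driver that would like 5 messages would ask for 4 under MSI but could ask for all 5 under MSI-X.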

Tested by:	scottl
2007-01-22 21:48:44 +00:00
John Baldwin
8964299ac8 Give Host-PCI bridge drivers their own pcib_alloc_msi() and
pcib_alloc_msix() methods instead of using the method from the generic
PCI-PCI bridge driver as the PCI-PCI methods will be gaining some PCI-PCI
specific logic soon.
2006-12-12 19:27:01 +00:00
John Baldwin
d748ef4792 Replace a few magic numbers. 2006-12-12 19:23:52 +00:00
John Baldwin
4184900911 MD support for PCI Message Signalled Interrupts on amd64 and i386:
- Add a new apic_alloc_vectors() method to the local APIC support code
  to allocate N contiguous IDT vectors (aligned on a M >= N boundary).
  This function is used to allocate IDT vectors for a group of MSI
  messages.
- Add MSI and MSI-X PICs.  The PIC code here provides methods to manage
  edge-triggered MSI messages as x86 interrupt sources.  In addition to
  the PIC methods, msi.c also includes methods to allocate and release
  MSI and MSI-X messages.  For x86, we allow for up to 128 different
  MSI IRQs starting at IRQ 256 (IRQs 0-15 are reserved for ISA IRQs,
  16-254 for APIC PCI IRQs, and IRQ 255 is reserved).
- Add pcib_(alloc|release)_msi[x]() methods to the MD x86 PCI bridge
  drivers to bubble the request up to the nexus driver.
- Add pcib_(alloc|release)_msi[x]() methods to the x86 nexus drivers that
  ask the MSI PIC code to allocate resources and IDT vectors.
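
The "N contiguous vectors aligned on an M >= N boundary" requirement in the first item can be illustrated with a small first-fit search. This is only a sketch over a plain array of busy flags; alloc_contig() and its parameters are hypothetical and do not reflect how apic_alloc_vectors() is actually written.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical first-fit allocator: find 'count' free, contiguous slots
 * whose first index is a multiple of 'align' (align >= count), mark them
 * busy, and return the first index.  Returns -1 if no such run exists.
 */
static int
alloc_contig(bool *busy, int total, int count, int align)
{
	assert(count >= 1 && align >= count);
	for (int start = 0; start + count <= total; start += align) {
		int i;

		for (i = 0; i < count; i++)
			if (busy[start + i])
				break;
		if (i == count) {
			for (i = 0; i < count; i++)
				busy[start + i] = true;
			return (start);
		}
	}
	return (-1);
}
```

Only aligned starting points are probed, so a group of 4 can never straddle a 4-slot boundary even when 4 free slots exist across one.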

MFC after:	2 months
2006-11-13 22:23:34 +00:00
John Baldwin
fdaac72fcd Don't dump the $PIR table under bootverbose. The pirtool program in
src/tools/tools works fine, and dumping this table can add a lot of noise.

MFC after:	1 week
2006-11-09 18:03:36 +00:00
John Baldwin
04dda605c5 - Make pcib_devclass private to sys/dev/pci/pci_pci.c and change all the
  various pcib drivers to use their own private devclass_t variables for
  their modules.
- Use the DEFINE_CLASS_0() macro to declare drivers for the various pcib
  drivers while I'm here.
2006-01-06 19:22:19 +00:00
John Baldwin
5b2119223e Move the hostb driver out of the i386 and amd64 PCI code (where it was
duplicated anyways) and into a single MI driver.  Extend the driver a bit
to implement the bus and PCI kobj interfaces such that other drivers can
attach to it and transparently act as if their parent device is the PCI
bus (for the most part).
2005-12-20 21:09:45 +00:00
Craig Rodrigues
16f99fe169 Add support for 7320 and 915 PCIe chipsets.
Submitted by:	Gavin Atkinson <gavin.atkinson at ury dot york dot ac dot uk>
PR:		kern/79139
Reviewed by:	scottl
2005-12-08 18:55:15 +00:00
Warner Losh
421552a580 Provide a dummy NO_XBOX option that lives in opt_xbox.h for pc98.
This allows us to eliminate three ifdef PC98 instances.
2005-11-14 00:43:44 +00:00
Yoshihiro Takahashi
1ba0023e33 Fix pc98 build. 2005-11-09 12:22:26 +00:00
Warner Losh
51ef421d92 Add support for XBOX to the FreeBSD port. The xbox architecture is
nearly identical to wintel/ia32, with a couple of tweaks.  Since it is
so similar to ia32, it is optionally added to a i386 kernel.  This
port is preliminary, but seems to work well.  Further improvements
will improve the interaction with syscons(4), port Linux nforce driver
and future versions of the xbox.

This supports the 64MB and 128MB boxes.  You'll need the most recent
CVS version of Cromwell (the Linux BIOS for the XBOX) to boot.

Rink will be maintaining this port, and is interested in feedback.
He's set up a website http://xbox-bsd.nl to report the latest
developments.

Any silly mistakes are my fault.

Submitted by: Rink P.W. Springer rink at stack dot nl and
	Ed Schouten ed at fxq dot nl
2005-11-09 03:55:40 +00:00
Peter Wemm
68a443c292 MFamd64: indent with tabs instead of spaces. 2005-11-04 22:53:44 +00:00
Bill Paul
8a3a26385c Undo the change to pci_cfgdisable() on i386 for now. It seems to fix
the amd64 case, but makes the i386 case fail even more often.
2005-10-25 05:32:44 +00:00
Bill Paul
ba3af76df7 Modify the pci_cfgdisable() routine to bring it more in line with
other OSes (Solaris, Linux, VxWorks). It's not necessary to write a 0
to the config address register when using config mechanism 1 to turn
off config access. In fact, it can be downright troublesome, since it
seems to confuse the PCI-PCI bridge in the AMD8111 chipset and cause
it to sporadically botch reads from some devices. This is the cause
of the missing USB ports problem I was experiencing with my Sun Opteron
system.

Also correct the case for mechanism 2: it's only necessary to write
a 0 to the ENABLE port.
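
For context, configuration mechanism 1 selects a register by writing a 32-bit value to the config address port, with bit 31 acting as the enable bit. The sketch below only composes that value using the standard mechanism 1 layout; it performs no port I/O, and conf1_address() is a hypothetical helper name rather than a function from pci_cfgreg.c.

```c
#include <assert.h>
#include <stdint.h>

#define	CONF1_ENABLE	0x80000000u	/* bit 31 of the config address */

/*
 * Hypothetical helper: build the value written to the config address
 * port for mechanism 1.  The low two bits of the register offset are
 * dropped because the address selects an aligned 32-bit register.
 */
static uint32_t
conf1_address(unsigned bus, unsigned slot, unsigned func, unsigned reg)
{
	assert(bus <= 255 && slot <= 31 && func <= 7 && reg <= 255);
	return (CONF1_ENABLE | (bus << 16) | (slot << 11) | (func << 8) |
	    (reg & ~3u));
}
```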
2005-10-25 04:53:29 +00:00
Warner Losh
e429f92618 Expose legacy_pcib_alloc_resource, and use it in the mptable pci bus
implementation, like other routines in the legacy bus.

This should fix problems with resource allocation on MP systems without
ACPI enabled.
2005-09-17 23:57:53 +00:00
Warner Losh
dca2069084 Commit a workaround to a problem with resource allocation. This helps
with some Dell servers that booted w/o a problem[*] on 5.4, but failed
with 6.0-BETA.

On the PCI bus, when we do lazy resource allocation, we narrow the
range requested as we pass through bridges to reflect how the bridges
are programmed and what addresses they pass.  However, when we're
doing an allocation on a bus that's directly connected to a host
bridge, no such translation can take place.  We already had a fallback
range for memory requests, but none for ioports.  As such, provide a
fallback for I/O ports so we don't allocate location 0, which will
have undesired side effects when the resources are actually used.

This fixes a problem with booting a Dell server with usb in the
kernel.  However, it is an unsatisfying solution.  I don't like the
hard coded value, and I think we should start narrowing the resources
returned to not be in the so-called isa alias area (where the range &
0x0300 must be 0 iirc).  Doing such filtering will have to wait for
another day.

This may be a good 6 candidate, maybe after it's had a chance to be
refined.

Tested by: glebius@
2005-09-16 07:02:29 +00:00
Warner Losh
b3ffa2ae22 Note that pc98 specific defines maybe would be better in a header file. 2005-09-08 17:07:12 +00:00
John Baldwin
11f3a4f069 - Ignore BIOS IRQs (that is, IRQ settings left by the BIOS or a previous OS
  in the PCI config registers) that are > 15 as $PIR can only route PCI
  interrupts to ISA IRQs which are limited to the 0 to 15 range.
- Remove an extra word from a printf.

Reported by:	othermark atkin901 at yahoo dot com
MFC after:	3 days
2005-07-13 15:41:16 +00:00
John Baldwin
84c7fde72e Trust the settings programmed by the BIOS over what the $PIR says.
Specifically, if the BIOS has programmed an IRQ for a device that doesn't
match the list of valid IRQs for the link, use it anyway as some BIOSes
don't correctly list the valid IRQs in the $PIR.  Also, allow the user
to specify an IRQ that $PIR claims is invalid as an override, but emit a
warning in that case.
2005-04-14 18:25:09 +00:00
John Baldwin
5165a17df5 Add code to read the primary PCI bus number out of the Compaq/HP 6010
hotplug Host to PCI bridge.  This is only needed for the non-ACPI case
as the BIOS includes a proper _BBN method in ACPI.
2005-03-25 14:18:50 +00:00
Poul-Henning Kamp
c711aea6ca Make a bunch of malloc types static.
Found by:	src/tools/tools/kernxref
2005-02-10 12:02:37 +00:00
Warner Losh
86cb007f9f /* -> /*- for copyright notices, minor format tweaks as necessary 2005-01-06 22:18:23 +00:00
Scott Long
5662cf3c92 Remove a stray critical_exit().
Submitted by: johan
2004-12-13 07:08:44 +00:00
Scott Long
245e410ba7 Expand the scope of the critical section in the PCIe read and write methods
on the advice of Alan Cox.
2004-12-10 15:44:12 +00:00
Scott Long
568b7ee1b2 Due to a significant addition of code, add my copyright to this file. Also
note that the PCIe work was made possible due to hardware donations from
the FreeBSD Foundation and Intel.  Thanks!
2004-12-06 18:19:32 +00:00
Scott Long
aa2ea23220 Add support for the memory-mapped PCI Express configuration mechanism. This
actually is a property of the northbridge and applies to all PCI/PCI-X/PCIe
devices in the system, though only PCIe devices will respond to registers
higher than 256.  This uses per-CPU pools of temporary mappings so that
the whole 256MB of configuration space doesn't have to be mapped all at
once.  While the sf_buf API was considered for this, the fact that it
requires sleep locks and can return failure made it unsuitable for this use.
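
The 256MB figure follows from the standard PCIe memory-mapped configuration layout: 256 buses, each with 32 slots of 8 functions, and 4KB of registers per function. A minimal sketch of the offset calculation, assuming that standard layout; pcie_cfg_offset() is a hypothetical name, and the per-CPU temporary mappings mentioned above are not modeled.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: byte offset of a register from the start of the
 * memory-mapped configuration window (bus in bits 27:20, slot in 19:15,
 * function in 14:12, register in 11:0).
 */
static uint32_t
pcie_cfg_offset(unsigned bus, unsigned slot, unsigned func, unsigned reg)
{
	assert(bus <= 255 && slot <= 31 && func <= 7 && reg <= 4095);
	return ((uint32_t)bus << 20 | (uint32_t)slot << 15 |
	    (uint32_t)func << 12 | reg);
}
```

The largest possible offset is 0x0FFFFFFF, which is the last byte of a 256MB window.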

For now only the Intel Grantsdale and Lindenhurst (925 and 752x) chipsets are
supported.  Since there doesn't appear to be a compatible way to determine
northbridge support, new chipsets will have to be explicitly added in the
future.
2004-12-06 08:27:10 +00:00
Dag-Erling Smørgrav
b0e1e474f7 Add TUNABLE_LONG and TUNABLE_ULONG, and use the latter for the
hw.pci.host_mem_start tunable.  Add comments to TUNABLE_INT and
TUNABLE_QUAD recommending against their use.

MFC after:	3 weeks
2004-10-31 15:50:33 +00:00
Dag-Erling Smørgrav
38228f7221 Whitespace cleanup 2004-10-31 15:02:53 +00:00
Warner Losh
fd492ee0e6 Make the lower range of the memory area 0x80000000 again. Also
introduce hw.{pci,acpi}.host_mem_start tunable to change this.

MFC: ASAP
2004-10-11 21:10:23 +00:00
Warner Losh
e625cbacaf Add missing 'static' 2004-10-06 15:18:12 +00:00
Warner Losh
0b3a486f21 For legacy PCI bridges, limit memory allocation to the top 32MB of
RAM.  Many older, legacy bridges only allow allocation from this
range.  This only applies to devices that don't have their memory
assigned by the BIOS (since we allocate the ranges so assigned
exactly), so should have minimal impact.

However, CardBus bridges (cbb) rarely get the resources
allocated by the BIOS, and this patch helps them greatly.  Typically
the 'bad Vcc' messages are caused by this problem.
2004-10-06 07:22:58 +00:00
Stefan Farfeleder
5908d366fb Consistently use __inline instead of __inline__ as the former is an empty macro
in <sys/cdefs.h> for compilers without support for inline.
2004-07-04 16:11:03 +00:00
John Baldwin
39981fed82 Trim a few things from the dmesg output and stick them under bootverbose to
cut down on the clutter including PCI interrupt routing, MTRR, pcibios,
etc.

Discussed with:	USENIX Cabal
2004-07-01 07:46:29 +00:00
John Baldwin
092a5c4530 Remove atdevbase and replace its remaining uses with direct references to
KERNBASE instead.
2004-06-10 20:31:00 +00:00
John Baldwin
4468ab0a61 Allow the pir0 device add to fail since pir0 may already exist. This should
fix the panics in device_set_ivars() that people were seeing on boxes with
multiple Host-PCI bridges but not using ACPI.
2004-06-01 19:51:29 +00:00
Poul-Henning Kamp
41ee9f1c69 Add some missing <sys/module.h> includes which are masked by the
one on death-row in <sys/kernel.h>
2004-05-30 17:57:46 +00:00
John Baldwin
7a64d8d74c - Create a pir0 pseudo device as a child of legacy0 if we attach a legacy
  host-PCI bridge device and find a valid $PIR.
- Make pci_pir_parse() private to pci_pir.c and have pir0's attach routine
  call it instead of having legacy_pcib_attach() call it.
- Implement suspend/resume support for the $PIR by giving pir0 a resume
  method that calls the BIOS to reroute each link that was already routed
  before the machine was suspended.
- Dump the state of the routed flag in the links display code.
- If a link's IRQ is set by a tunable, then force that link to be re-routed
  the first time it is used.
- Move the 'Found $PIR' message under bootverbose as the pir0 description
  line lists the number of entries already.  The pir0 line also only shows
  up if we are actually using the $PIR which is a bonus.
- Use BUS_CONFIG_INTR() to ensure that any IRQs used by a PCI link are
  set to level/low trigger/polarity.
2004-05-04 21:17:52 +00:00
John Baldwin
be16306ad3 Make the legacy_pcib_attach() function static. 2004-05-03 14:49:43 +00:00
John Baldwin
86f4fd6f71 Don't call the BIOS to route a link that has already been routed by the
BIOS during POST as it apparently makes some machines unhappy.

Tested by:	mux
2004-04-16 18:54:05 +00:00
John Baldwin
ccab16610b Add back an include to fix the build for the CPU_ELAN case. 2004-02-19 18:34:26 +00:00
John Baldwin
77fa00fa7c Switch to using the new $PIR interrupt routing code and remove the old
code.  The pci_cfgreg.c file now just controls reading/writing PCI config
registers.
2004-02-18 22:41:53 +00:00
John Baldwin
2e41ba54d6 Rework the $PIR (aka PCIBIOS) PCI interrupt routing code and split it off
into its own file:
- All of the $PIR interrupt routing is now done in a link-centric fashion.
  When a host-PCI bridge that uses the $PIR attaches, it calls pir_parse()
  to parse the table.  This scans for link devices and merges all the masks
  for each link device from the table entries.  It then looks at the intline
  register of PCI devices connected to a link to figure out if the BIOS has
  routed this link and if so to which IRQ.
- The IRQ for any given link can be overridden via a hint like so:
  'hw.pci.link.0x62.irq=10'  Any IRQ set in this manner is treated as if it
  were set that way by the BIOS.
- We only call the BIOS to route each link device once.
- When a PCI device wants to route an interrupt, we look it up in the $PIR
  to find the associated link.  If the link is routed, we simply return the
  IRQ it is using.  If it is not routed, we have to pick one.  This uses a
  different algorithm from the old code.  First off, when we try to pick
  an interrupt from a mask of possible interrupts, we try to pick the one
  that is least loaded as far as PCI devices.  We maintain this weight based
  on the number of devices attached to each link device.  When choosing an
  IRQ, we first attempt to route using any PCI only interrupts (the old
  code did this as well).  If that doesn't work, we try to use the list of
  IRQs that the BIOS has used.  This is a new step that the old code didn't
  do and avoids using IRQ 3 or 4 for every virgin interrupt routing.  If
  none of the IRQs that the BIOS used worked, then we fall back to trying
  anything.
- The fallback mask for !PC98 was fixed to include IRQ 3 and not allow IRQ
  2.
- We don't use the $PIR to route interrupts on a PCI-PCI bridge unless it
  has already been used to route on at least one Host-PCI bridge.  This
  helps to avoid mixing and matching x86 firmware PCI interrupt routing
  methods (which is a Bad Thing(tm)).
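
The three-step IRQ choice for an unrouted link can be sketched as follows. This is a user-space model under stated assumptions, not the pci_pir.c code: choose_irq(), pick_least_loaded(), the 16-bit masks, and the weight array are all hypothetical, and the final "try anything" step is modeled as "anything in the link's own mask".

```c
#include <assert.h>
#include <stdint.h>

/* Return the least-loaded IRQ set in 'mask', or -1 if the mask is empty. */
static int
pick_least_loaded(uint16_t mask, const int weight[16])
{
	int best = -1;

	for (int irq = 0; irq < 16; irq++) {
		if ((mask & (1u << irq)) == 0)
			continue;
		if (best == -1 || weight[irq] < weight[best])
			best = irq;
	}
	return (best);
}

/*
 * Hypothetical model of the selection order: PCI-only IRQs first, then
 * IRQs the BIOS has already used, then any IRQ the link allows.
 */
static int
choose_irq(uint16_t link_mask, uint16_t pci_only, uint16_t bios_used,
    const int weight[16])
{
	int irq;

	assert(link_mask != 0);
	if ((irq = pick_least_loaded(link_mask & pci_only, weight)) != -1)
		return (irq);
	if ((irq = pick_least_loaded(link_mask & bios_used, weight)) != -1)
		return (irq);
	return (pick_least_loaded(link_mask, weight));
}
```

Ties go to the lowest-numbered IRQ in this model; the commit message does not say how the real code breaks ties.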

Silence on:	current@
2004-02-18 22:40:23 +00:00
John Baldwin
21e25fa607 Replace an outb() during the test for configuration mechanism #1 with a
DELAY(1) instead.  After wading through old commit logs, I found that the
outb() was added not as part of the test but as an intentional delay. In
fact, according to Shanley's PCI book, the configuration 1 data and address
ports should only be accessed using aligned 32-bit accesses (i.e. inl()
and outl()).  Thus, using outb() to just the last byte of the port violates
the PCI spec it would seem.  On at least one box doing so broke the probe
for PCI, whereas changing it to a DELAY(1) fixed the probe.

Reported by:	Sean Welch <welchsm@earthlink.net>
MFC after:	1 week
2003-12-31 16:56:32 +00:00
John Baldwin
6f92bdd0c1 New APIC support code:
- The apic interrupt entry points have been rewritten so that each entry
  point can serve 32 different vectors.  When the entry is executed, it
  uses one of the 32-bit ISR registers to determine which vector in its
  assigned range was triggered.  Thus, the apic code can support 159
  different interrupt vectors with only 5 entry points.
- We now always disable the local APIC to work around an erratum in
  certain PPros and then re-enable it again if we decide to use the APICs
  to route interrupts.
- We no longer map IO APICs or local APICs using special page table
  entries.  Instead, we just use pmap_mapdev().  We also no longer
  export the virtual address of the local APIC as a global symbol to
  the rest of the system, but only in local_apic.c.  To aid this, the
  APIC ID of each CPU is exported as a per-CPU variable.
- Interrupt sources are provided for each intpin on each IO APIC.
  Currently, each source is given a unique interrupt vector meaning that
  PCI interrupts are not shared on most machines with an I/O APIC.
  That mapping for interrupt sources to interrupt vectors is up to the
  APIC enumerator driver however.
- We no longer probe to see if we need to use mixed mode to route IRQ 0,
  instead we always use mixed mode to route IRQ 0 for now.  This can be
  disabled via the 'NO_MIXED_MODE' kernel option.
- The npx(4) driver now always probes to see if a built-in FPU is present
  since this test can now be performed with the new APIC code.  However,
  an SMP kernel will panic if there is more than one CPU and a built-in
  FPU is not found.
- PCI interrupts are now properly routed when using APICs to route
  interrupts, so remove the hack to pseudo-route interrupts when the
  intpin register was read.
- The apic.h header was moved to apicreg.h and a new apicvar.h header
  that declares the APIs used by the new APIC code was added.
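
The first item's scheme, where one entry point covers 32 vectors and consults a 32-bit ISR register to find out which one fired, can be modeled in plain C. The sketch assumes the entry point with a given index reads the ISR word covering vectors 32 * index through 32 * index + 31 and takes the highest bit set; isr_to_vector() is a hypothetical name, and the real entry points are written in assembly.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model: given the entry point's index and the value read
 * from its ISR word, return the vector being serviced, or -1 if no bit
 * in that word is set.
 */
static int
isr_to_vector(unsigned index, uint32_t isr)
{
	int bit = 31;

	assert(index <= 7);	/* 8 ISR words cover 256 vectors */
	if (isr == 0)
		return (-1);
	while ((isr & (1u << bit)) == 0)
		bit--;
	return ((int)(32 * index + (unsigned)bit));
}
```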
2003-11-03 21:53:38 +00:00
John Baldwin
221111f6f2 Lower the priority of the legacy host to pci bridge driver so that other
non-ACPI host-bridge drivers can preempt this driver.
2003-10-31 21:00:37 +00:00
Mike Silbersack
184dcdc7c8 Change all SYSCTLS which are readonly and have a related TUNABLE
from CTLFLAG_RD to CTLFLAG_RDTUN so that sysctl(8) can provide
more useful error messages.
2003-10-21 18:28:36 +00:00
John Baldwin
810cb9ef5e We represent PCI intpins in two different ways. One is the way that the
intpin register is expressed in hardware where 0 means none, 1 means INTA,
2 INTB, etc.  The other way is commonly used in loops where 0 means INTA,
1 means INTB, etc.  The matchpin argument to pci_cfgintr_search() is
supposed to be the first form, but we passed in a loop index of the
second.  This fix adds one to the loop index to convert to the first form.
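
The two representations differ by one, which is easy to get wrong at a call boundary. A small sketch of the conversion; intpin_from_index() and intpin_name() are hypothetical helpers for illustration and are not functions in the PCI code.

```c
#include <assert.h>

/*
 * Register form: 0 = no interrupt pin, 1 = INTA, 2 = INTB, 3 = INTC,
 * 4 = INTD.  Loop form: 0 = INTA ... 3 = INTD.
 */
static int
intpin_from_index(int idx)
{
	assert(idx >= 0 && idx <= 3);
	return (idx + 1);	/* the "add one" described above */
}

/* Letter for a register-form pin, or '-' when no pin is used. */
static char
intpin_name(int pin)
{
	return (pin >= 1 && pin <= 4 ? (char)('A' + pin - 1) : '-');
}
```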

Reported by:	Pavlin Radoslavov <pavlin@icir.org>
2003-09-10 06:00:53 +00:00
John Baldwin
729d7ffbcf - Rename PCIx_HEADERTYPE* to PCIx_HDRTYPE* so the constants aren't so long.
- Add a new PCIM_HDRTYPE constant for the field in PCIR_HDRTYPE that holds
  the header type.
- Replace several magic numbers with appropriate constants for the header
  type register and a couple of PCI_FUNCMAX.
- Merge to amd64 the fix to the i386 bridge code to skip devices with
  unknown header types.

Requested by:	imp (1, 2)
2003-08-28 21:22:25 +00:00
Warner Losh
19b7ffd1b8 Prefer new location of pci include files (which have only been in the
tree for two or more years now), except in a few places where there's
code to be compatible with older versions of FreeBSD.
2003-08-22 07:20:27 +00:00