Commit Graph

712 Commits

Author SHA1 Message Date
Adrian Chadd
2da2ade021 Use the correct device (child) when asking the bus layer about which power
state said device should go into.

This was a snafu introduced in the ACPI/PCI awareness separation.

When putting a device into a power state, the bus (and thus firmware,
eg ACPI) should be asked before hand to check whether the device
can indeed go into that power state.

There's a set of nodes in ACPI under each device - the _SxD nodes - which
state which ACPI power state to put the device into when the system is
going into power save state 'x'.  So when going into S3, the existence
of an _S3D node would override whatever the system was trying to do.

By default the PCI code wants to put devices into D3 before suspending.

I have a laptop here (Asus Zenbook - check the PR) whose EHCI controller
really wants to be in D2 during suspend, not D3.  So if we put it into
D3 and then try to enter S3, everything hangs.  The device itself
can go into D3 - it just can't be there when the call to ACPI to enter
S3 occurs.  The PCI patch fixes this.

jkim@ noticed that the same is needed for the ACPI child device
enumeration.

Thankyou to Matt Dillon (the programmer, not the actor) for buying me
this particular laptop so I could debug the issues with the Atheros
AR9485 that is in it.  It's his fault that I ended up with this
laptop and was sufficiently annoyed by the lack of USB suspend
to go down this rabbit hole.

Tested:

* Thinkpad T400
* Thinkpad X230
* Thinkpad T42
* Thinkpad T60
* Asus Zenbook (see PR)
* Asus EEEPC 701
* Asus EEEPC 1001PX

TODO:

* Figure out what we should do about devices we unload drivers for
  that want to be in a specific state when entering S3 / S4 -
  the "put devices into D3 if they're not bound to a driver" option
  may also mess with things.

PR:		kern/194884
Reviewed by:	jhb, jkim
MFC after:	1 week
Relnotes:	yes
Sponsored by:	Matt Dillon <dillon@apollo.backplane.com> (hardware)
2014-11-11 17:14:11 +00:00
Davide Italiano
2be111bf7d Follow up to r225617. In order to maximize the re-usability of kernel code
in userland rename in-kernel getenv()/setenv() to kern_setenv()/kern_getenv().
This fixes a namespace collision with libc symbols.

Submitted by:   kmacy
Tested by:      make universe
2014-10-16 18:04:43 +00:00
Adrian Chadd
ffcf962dab Add a bus method to fetch the VM domain for the given device/bus.
* Add a bus_if.m method - get_domain() - returning the VM domain or
  ENOENT if the device isn't in a VM domain;
* Add bus methods to print out the domain of the device if appropriate;
* Add code in srat.c to save the PXM -> VM domain mapping that's done and
  expose a function to translate VM domain -> PXM;
* Add ACPI and ACPI PCI methods to check if the bus has a _PXM attribute
  and if so map it to the VM domain;
* (.. yes, this works recursively.)
* Have the pci bus glue print out the device VM domain if present.

Note: this is just the plumbing to start enumerating information -
it doesn't at all modify behaviour.

Differential Revision:	D906
Reviewed by:	jhb
Sponsored by:	Norse Corp
2014-10-09 05:33:25 +00:00
Pyun YongHyeon
dd6d49aad2 Oops, fix typo made in r272729. 2014-10-08 05:53:04 +00:00
Pyun YongHyeon
b19487dff8 Add new quirk PCI_QUIRK_MSI_INTX_BUG to pci(4).
QAC AR816x/E2200 controller has a silicon bug that MSI interrupt
does not assert if PCIM_CMD_INTxDIS bit of command register is set.

Reviewed by:	jhb
2014-10-08 05:34:39 +00:00
Justin Hibbits
a1c1634858 Stage one of multipass suspend/resume
Summary:
Add the beginnings of multipass suspend/resume, by introducing
BUS_SUSPEND_CHILD/BUS_RESUME_CHILD, and move the PCI driver to this.

Reviewers: jhb

Reviewed By: jhb

Differential Revision: https://reviews.freebsd.org/D590
2014-09-23 02:56:40 +00:00
Roger Pau Monné
cd407ca216 pci: add a new pci_child_added newbus method.
This is needed so when running under Xen the calls to pci_child_added
can be intercepted and a custom Xen method can be used to register
those devices with Xen. This should not include any functional
change, since the Xen implementation will be added in a following
patch and the native implementation is a noop.

Sponsored by: Citrix Systems R&D
Reviewed by: jhb

dev/pci/pci.c:
dev/pci/pci_if.m:
dev/pci/pci_private.h:
dev/pci/pcivar.h:
 - Add the pci_child_added newbus method.
2014-08-22 15:05:51 +00:00
Roger Pau Monné
073bf9dd70 pci: make MSI(-X) enable and disable methods of the PCI bus
Make the functions pci_disable_msi, pci_enable_msi and pci_enable_msix
methods of the newbus PCI bus. This code should not include any
functional change.

Sponsored by: Citrix Systems R&D
Reviewed by: imp, jhb
Differential Revision: https://reviews.freebsd.org/D354

dev/pci/pci.c:
 - Convert the mentioned functions to newbus methods.
 - Fix the callers of the converted functions.

sys/dev/pci/pci_private.h:
dev/pci/pci_if.m:
 - Declare the new methods.

dev/pci/pcivar.h:
 - Add helpers to call the newbus methods.

ofed/include/linux/pci.h:
 - Add define to prevent the ofed version of pci_enable_msix from
   clashing with the FreeBSD native version.
2014-08-20 14:57:20 +00:00
Marcel Moolenaar
e7d939bda2 Remove ia64.
This includes:
o   All directories named *ia64*
o   All files named *ia64*
o   All ia64-specific code guarded by __ia64__
o   All ia64-specific makefile logic
o   Mention of ia64 in comments and documentation

This excludes:
o   Everything under contrib/
o   Everything under crypto/
o   sys/xen/interface
o   sys/sys/elf_common.h

Discussed at: BSDcan
2014-07-07 00:27:09 +00:00
Hans Petter Selasky
af3b2549c4 Pull in r267961 and r267973 again. Fix for issues reported will follow. 2014-06-28 03:56:17 +00:00
Glen Barber
37a107a407 Revert r267961, r267973:
These changes prevent sysctl(8) from returning proper output,
such as:

 1) no output from sysctl(8)
 2) erroneously returning ENOMEM with tools like truss(1)
    or uname(1)
 truss: can not get etype: Cannot allocate memory
2014-06-27 22:05:21 +00:00
Hans Petter Selasky
3da1cf1e88 Extend the meaning of the CTLFLAG_TUN flag to automatically check if
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating childrens of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading cludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.

Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.

MFC after:	2 weeks
Sponsored by:	Mellanox Technologies
2014-06-27 16:33:43 +00:00
Alexander Motin
e887c1dedf Add IOMMU PCI subclass, found on Tyan S8236 motherboard.
Submitted by:	Dmitry Luhtionov <dmitryluhtionov@gmail.com>
MFC after:	2 weeks
2014-05-20 14:39:22 +00:00
Steven Hartland
8197f45b6a Make uninteresting PCI devices with no attached drivers only print out
on a verbose boot

MFC after:	2 weeks
2014-04-30 16:42:12 +00:00
Ryan Stone
a998d4b50e Be consistent with the whitespace in the rest of these files.
X-MFC-With: r264007
2014-04-29 20:49:47 +00:00
Ryan Stone
c8912fcbdf Correct a PCI enumeration bug introduced in r264011
Ensure that first_func is set to 0 on every iteration of the PCI slot
enumeration loop after the first.  There is a continue statement that would
cause first_func to stay at 1 any PCI device where slot 0 has no functions
until we find a slot that does have a function.  This would cause us to
not enumerate the first PCI function on the device.

Credit to markj@ for spotting the bug.

X-MFC-With: r264011
2014-04-03 22:32:12 +00:00
Ryan Stone
aea992e1f5 Add missing copyright date.
MFC after:	2 months
2014-04-01 17:35:31 +00:00
Ryan Stone
55d3ea1731 Add support for PCIe ARI
PCIe Alternate RID Interpretation (ARI) is an optional feature that
allows devices to have up to 256 different functions.  It is
implemented by always setting the PCI slot number to 0 and
re-purposing the 5 bits used to encode the slot number to instead
contain the function number.  Combined with the original 3 bits
allocated for the function number, this allows for 256 functions.

This is enabled by default, but it's expected to be a no-op on currently
supported hardware.  It's a prerequisite for supporting PCI SR-IOV, and
I want the ARI support to go in early to help shake out any bugs in it.
ARI can be disabled by setting the tunable hw.pci.enable_ari=0.

Reviewed by:	kib
MFC after:	2 months
Sponsored by:	Sandvine Inc.
2014-04-01 16:02:02 +00:00
Ryan Stone
5605a99e36 Add a method to get the PCI RID for a device.
Reviewed by:	kib
MFC after:	2 months
Sponsored by:	Sandvine Inc.
2014-04-01 15:47:24 +00:00
Ryan Stone
7036ae46bf Revert PCI RID changes.
My PCI RID changes somehow got intermixed with my PCI ARI patch when I
committed it.  I may have accidentally applied a patch to a non-clean
working tree.  Revert everything while I figure out what went wrong.

Pointy hat to: rstone
2014-04-01 15:06:03 +00:00
Ryan Stone
d773f48b1e Add a method to get the PCI Routing ID for a device
Reviewed by:	kib
Sponsored by:	Sandvine, Inc
2014-04-01 14:49:25 +00:00
John Baldwin
4edef187b8 Add support for managing PCI bus numbers. As with BARs and PCI-PCI bridge
I/O windows, the default is to preserve the firmware-assigned resources.
PCI bus numbers are only managed if NEW_PCIB is enabled and the architecture
defines a PCI_RES_BUS resource type.
- Add a helper API to create top-level PCI bus resource managers for each
  PCI domain/segment.  Host-PCI bridge drivers use this API to allocate
  bus numbers from their associated domain.
- Change the PCI bus and CardBus drivers to allocate a bus resource for
  their bus number from the parent PCI bridge device.
- Change the PCI-PCI and PCI-CardBus bridge drivers to allocate the
  full range of bus numbers from secbus to subbus from their parent bridge.
  The drivers also always program their primary bus register.  The bridge
  drivers also support growing their bus range by extending the bus resource
  and updating subbus to match the larger range.
- Add support for managing PCI bus resources to the Host-PCI bridge drivers
  used for amd64 and i386 (acpi_pcib, mptable_pcib, legacy_pcib, and qpi_pcib).
- Define a PCI_RES_BUS resource type for amd64 and i386.

Reviewed by:	imp
MFC after:	1 month
2014-02-12 04:30:37 +00:00
John Baldwin
0070c94be9 Add two tunables to ignore certain firmware-assigned resources. These
are mostly useful for debugging.
- hw.pci.clear_bars ignores all firmware-assigned ranges for BARs when
  set.
- hw.pci.clear_pcib ignores all firmware-assigned ranges for PCI-PCI
  bridge I/O windows when set.

MFC after:	1 week
2014-02-05 20:52:12 +00:00
John Baldwin
b986a7ec18 Simplify pci_reserve_map() by calling resource_list_reserve() to allocate
the resource after creating a resource list entry rather than reimplementing
it by hand.

MFC after:	1 week
2014-02-05 20:47:49 +00:00
John Baldwin
8d280bcbfc Properly set the alignment flags when allocating the initial range for a
BAR.  This only really matters when pci_do_realloc_bars is enabled and
the initial allocation of a specific range fails.

MFC after:	1 week
2014-02-05 19:24:16 +00:00
John Baldwin
e0c34b5787 Fix a typo. 2014-02-05 19:23:05 +00:00
John Baldwin
e432d5f6a7 Drop the 3rd clause from all 3 clause BSD licenses where I am the sole
holder to convert them to 2 clause BSD licenses.

MFC after:	1 week
2014-02-05 18:13:27 +00:00
John Baldwin
84b755dfe5 Add support for displaying VPD for PCI devices via pciconf.
- Store the length of each read-only VPD value since not all values are
  guaranteed to be ASCII values (though most are).
- Add a new pciio ioctl to fetch VPD for a single PCI device.  The values
  are returned as a list of variable length records, one for the device
  name and each keyword.
- Add a new -V flag to pciconf's list mode which displays VPD data for
  each device.

MFC after:	1 week
2014-01-20 20:56:09 +00:00
Jean-Sébastien Pédron
2b22d5cc07 vga_pci: Improve boot display detection
The previous code was checking the "VGA Enable" bit on the video card's
parent PCI-to-PCI bridge only. This didn't work for the case where the
video card is attached to the root PCI bus (ie. the card has no parent
PCI-to-PCI bridge).

Now, the new code:
    1. checks the "VGA Enable" bit on the parent bridge only if it's a
       PCI-to-PCI bridge;
    2. always checks the "I/O" and "Memory address space decoding" bits
       on the video card itself.

However, vendor-specific bits are not used.

This fixes the use of many integrated Radeon cards: without this patch,
we fail to detect them as the boot display and, when radeonkms looks for
the Video BIOS, it skips the shadow copy made by the System BIOS. It
then fails to fully initialize the card, because the shadow copy is the
only way to read the Video BIOS in these situations. A workaround was to
force the boot display selection using the "hw.pci.default_vgapci_unit"
tunable.

A previous version of this patch added a new function doing the checks.
Now, the vga_pci_is_boot_display() function is used to perform the
checks (only until the boot display is found) and return if the given
device is the boot display or not.

Furthermore, vga_pci_attach() logs "Boot video device" if the card being
attached it the Chosen One:
    vgapci0: <VGA-compatible display> [...]
    vgapci0: Boot video device

Reviewed by:	kib@, jhb@ (both a previous version)
Tested by:	lunatic_ (#freebsd-xorg, integrated Radeon card,
		xmj (#freebsd-xorg, i915+NVIDIA cards)
2013-12-21 12:55:42 +00:00
Konstantin Belousov
c54a713f6a Make pci_get_dma_tag() non-static. Since the function is only
referenced by pointer, making it non-static should not have even the
negligible impact on the existing code.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2013-10-24 20:29:29 +00:00
Konstantin Belousov
83cc1cfaea Add some definitions for the bits in root control and status PCIe cap
registers.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2013-10-24 20:25:29 +00:00
Konstantin Belousov
feb96b46a2 Move the PCI_DMA_BOUNDARY definition into the pcivar.h.
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2013-10-24 20:21:37 +00:00
Jean-Sébastien Pédron
a38326da80 vgapci: Use vga_pci_alloc_resource() to map PCI Expansion ROM
This is cleaner and fixes Video BIOS mapping when the given device isn't
the boot display.

Submitted by:	jhb@
Approved by:	re (kib)
2013-09-14 17:17:32 +00:00
Jean-Sébastien Pédron
6537c655e3 vga_pci: Remove left-over debugging printf()'s 2013-08-25 18:23:15 +00:00
Jean-Sébastien Pédron
509a8ff67a vga_pci: Add API to map the Video BIOS
Here are two new functions to map and unmap the Video BIOS:
    void * vga_pci_map_bios(device_t dev, size_t *size);
    void   vga_pci_unmap_bios(device_t dev, void *bios);

The BIOS is either taken from the shadow copy made by the System BIOS at
boot time if the given device was used for the default display (i386,
amd64 and ia64 only), or from the PCI expansion ROM.

Additionally, one can determine if a given device was the default
display at boot time using the following new function:
    void   vga_pci_unmap_bios(device_t dev, void *bios);
2013-08-25 18:09:11 +00:00
Rui Paulo
957c6e86b1 Use device_printf(). 2013-08-11 06:57:57 +00:00
John Baldwin
c825d4dc50 Properly handle I/O windows in bridges with the ISA enable bit set. These
beasts still exist unfortunately.  More details can be found in other
references, but the short version is that bridges with this bit set ignore
I/O port ranges that alias to valid ISA I/O port ranges.  In the driver
this requires not allocating these alias regions from the parent device
(so they are free to be acquired by ISA devices), and ensuring no child
devices use resources from these alias regions.
- Change the pcib_window structure to allow for an array of backing
  resources rather than a single resource and update the existing code
  to cope with this.  Some of the coping requires using the saved
  base and limit values in pcib_window instead of using rman operations
  on the backing resource.
- Add special handling for allocating and adjusting the I/O port window
  of an ISA-enabled bridge to only allocate the non-alias ranges and
  add those to the associated resource manager.
- Reject I/O port allocations for a fixed request that conflicts with an
  ISA alias range.
- Remove the "no prefected decode" verbose printf during boot.  The absence
  of a "prefetched decode" line is sufficient.
- Replace the "subtractively decoded bridge" verbose printf with a single
  printf that lists all the "special" decoding modes of a bridge: ISA,
  subtractive, and VGA.
- Add a custom bus_release_resource() method to the PCI bus driver so that
  it can properly free resources for I/O windows of PCI-PCI bridges.
  (These resources are not stored in the bridge device's resource list.)

PR:		misc/179033
MFC after:	2 weeks
2013-07-18 15:17:11 +00:00
Marius Strobl
68e9cbd385 - As it turns out, not only MSI-X is broken for devices passed through by
VMware up to at least ESXi 5.1. Actually, using INTx in that case instead
  may still result in interrupt storms, with MSI being the only working
  option in some configurations. So introduce a PCI_QUIRK_DISABLE_MSIX quirk
  which only blacklists MSI-X but not also MSI and use it for the VMware
  PCI-PCI-bridges. Note that, currently, we still assume that if MSI doesn't
  work, MSI-X won't work either - but that's part of the internal logic and
  not guaranteed as part of the API contract. While at it, add and employ
  a pci_has_quirk() helper.
  Reported and tested by: Paul Bucher
- Use NULL instead of 0 for pointers.

Submitted by:	jhb (mostly)
Approved by:	jhb
MFC after:	3 days
2013-07-09 23:12:26 +00:00
John Baldwin
e35ce1f271 Make detaching drivers from PCI devices more robust. While here, fix a
bug where a PCI device would be powered down if it failed to probe, but
not when its driver was detached (e.g. via kldunload).
- Add a new helper method resource_list_release_active() which forcefully
  releases any active resources of a specified type from a resource list.
- Add a bus_child_detached method for the PCI bus driver which forces any
  active resources to be released (and whines to the console if it finds
  any) and then powers the device down.
- Call pci_child_detached() if we fail to probe a device when a driver
  is kldloaded.  This isn't perfect but can avoid leaking resources
  from a probe() routine in the kldload case.

Reviewed by:	imp, brooks
MFC after:	1 month
2013-06-27 20:21:54 +00:00
John Baldwin
19e1c4d115 Disable hw.pci.realloc_bars by default. It wasn't needed for the original
tester of this fix, and realloc_bars breaks some other cases as a small
BAR that is reallocated can end up grabbing space needed by a much larger
BAR in the existing window of a PCI-PCI bridge.

MFC after:	3 days
2013-06-24 18:30:44 +00:00
Konstantin Belousov
f05f3c1764 Add new capability types encodings from HyperTransport I/O Link
Specification revisions 3.00 and 3.10.

Submitted by:	Dmitry Luhtionov <dmitryluhtionov@gmail.com>
MFC after:	1 week
2013-05-17 14:04:31 +00:00
John Baldwin
5569a8b8df Revision 233677 broke certain machines. Specifically, if the firmware/BIOS
assigned conflicting ranges to BARs then leaving the BARs alone could
result in one device stealing mmio accesses intended to go to a second
device.  Prior to 233677 the PCI bus driver attempted to handle this case
by clearing the BAR to 0 depending on BARs based at 0 not decoding (which
is not guaranteed to be true).  Now when a conflicting BAR is detected the
following steps are taken:

 1) If hw.pci.realloc_bars (a new tunable) is enabled (default is enabled),
    then ignore the current BAR setting from the firmware and attempt to
    allocate a fresh resource range for the BAR.

 2) If 1) failed (or was disabled), disable decoding for the relevant
    BAR type (e.g. disable mem decoding for a memory BAR) and emit a
    warning if booting verbose.

Tested by:	Alex Keda <admin@lissyara.su>
MFC after:	1 week
2013-05-09 19:24:50 +00:00
Konstantin Belousov
5c818b67a4 Usnure that PCI bus BIS_GET_DMA_TAG() method sees the actual PCI
device which makes the request for dma tag, instead of some descendant
of the PCI device, by creating a pass-through trampoline for vga_pci
and ata_pci buses.

Sponsored by:	The FreeBSD Foundation
Suggested by:	jhb
Discussed with:	jhb, mav
MFC after:	1 week
2013-04-14 14:02:34 +00:00
John Baldwin
9f905daf6e Proxy allocation requests for the PCI ROM BAR from child devices similar
to how the VGA bus driver currently proxies allocation requests for other
PCI BARs.

MFC after:	1 week
2013-04-09 19:36:34 +00:00
Marius Strobl
4495286fb2 - Complete r231621 by also blacklisting the bridge used by VMware for PCIe
devices. While at it, update the comment now that we know that MSI-X
  doesn't work with ESXi 5.1 for Intel 82576 either and the underlying issue
  is a bug in the MSI-X allocation code of the hypervisor.
  Reported by: Harald Schmalzbauer
- Make the nomatch table const.

MFC after:	1 week
2013-03-02 15:54:02 +00:00
Neel Natu
5bf80db676 Remove the quirk to allow use of MSI when the guest is running inside bhyve.
This became redundant after the hostbridge presented to the guest started
advertising the PCI-E capability (r246846).

Obtained from:	NetApp
2013-02-28 01:00:32 +00:00
Neel Natu
a61359a9a6 Add quirk to indicate that the bhyve hostbridge is capable of supporting
MSI and MSI-X even though it does not advertise the PCI-E capability
itself.

Obtained from:	NetApp
2013-01-05 18:48:23 +00:00
David Xu
fee3029ce7 Always initialize pattern_buf pointers to NULL, otherwise AMD64 machine
panics with:
   free: address xxx(yyy) has not been allocated.
it can be triggered by hald.
2012-12-26 13:07:17 +00:00
Dimitry Andric
29658c96ce Remove duplicate const specifiers in many drivers (I hope I got all of
them, please let me know if not).  Most of these are of the form:

static const struct bzzt_type {
	[...list of members...]
} const bzzt_devs[] = {
	[...list of initializers...]
};

The second const is unnecessary, as arrays cannot be modified anyway,
and if the elements are const, the whole thing is const automatically
(e.g. it is placed in .rodata).

I have verified this does not change the binary output of a full kernel
build (except for build timestamps embedded in the object files).

Reviewed by:	yongari, marius
MFC after:	1 week
2012-11-05 19:16:27 +00:00
Warner Losh
764f2139e6 Add missing Extended Capability ID Numbers from PCIe 3.0. 2012-10-19 23:10:55 +00:00