MSI/MSI-x interrupts are edge-triggered. If an interrupt
arrives when IRQ line is masked, it will be lost and will
never recover. Perform MSI_EOI always after unmask to give
a chance for PHB/XICS to send an interrupt again if MSI/MSI-x
pending bit is set in MSI/MSI-x BAR space.
Submitted by: Wojciech Macek <wma@semihalf.org>
Obtained from: Semihalf
Sponsored by: IBM, QCM Technologies
Commits r326203 and r326978 broke 64-bit booke kernels by introducing a 1MB
zero-pad between the ELF header and the start of the kernel. This didn't
cause a build failure, but caused kernels to need to be loaded into memory
1MB lower, which could easily break scripts expecting previous behavior.
This change matches the similar change made to AIM in r327358.
Uses of mallocarray(9).
The use of mallocarray(9) has rocketed the required swap to build FreeBSD.
This is likely caused by the allocation size attributes which put extra pressure
on the compiler.
Given that most of these checks are superfluous we have to choose better
where to use mallocarray(9). We still have more uses of mallocarray(9) but
hopefully this is enough to bring swap usage to a reasonable level.
Reported by: wosch
PR: 225197
kernel by PHYS_TO_DMAP() as previously present on amd64, arm64, riscv, and
powerpc64. This introduces a new MI macro (PMAP_HAS_DMAP) that can be
evaluated at runtime to determine if the architecture has a direct map;
if it does not (or does) unconditionally and PMAP_HAS_DMAP is either 0 or
1, the compiler can remove the conditional logic.
As part of this, implement PHYS_TO_DMAP() on sparc64 and mips64, which had
similar things but spelled differently. 32-bit MIPS has a partial direct-map
that maps poorly to this concept and is unchanged.
Reviewed by: kib
Suggestions from: marius, alc, kib
Runtime tested on: amd64, powerpc64, powerpc, mips64
In platform_smp_ap_init we are doing some crucial code (eg. set LPCR register)
which have influence over further execution.
Practiculary in PowerNV platform we have experienced Data Storage Interrupt
before we set apropriate LPCR. It caused code execution from location which was
legal in bootloader (petitboot based on linux) but illegal in FreeBSD
Set stack pointer to correct value after thread's stack pointer restore
Restoring new thread's stack pointer caused stack corruption because
restored stack pointer didn't point to callee (cpu_switch) stack frame but
caller stack frame.
As a result we had mysterious errors in caller function (sched_switch).
Solution: simply set stack pointer to correct value
Also, initialize TOC to a valid pointer once the thread is being
created.
Created by: Patryk Duda <pdk@semihalf.com>
Submitted by: Wojciech Macek <wma@semihalf.com>
Obtained from: Semihalf
Reviewed by: nwhitehorn
Differential revision: https://reviews.freebsd.org/D13947
Sponsored by: QCM Technologies
Zero BSS always. The only case when this operation is
ommitted is when booting on BookE.
Created by: Wojciech Macek <wma@semihalf.com>
Obtained from: Semihalf
Reviewed by: imp, nwhitehorn
Differential revision: https://reviews.freebsd.org/D13948
Sponsored by: QCM Technologies
Add missing little-endian 64-bit read and write. Since there
is no direct ASM opcode for this, perform byte swap if
necessary.
Created by: Wojciech Macek <wma@semihalf.com>
Obtained from: Semihalf
Sponsored by: QCM Technologies
Use current userspace address for segment mapping. Previously,
there was a bug which made the funciton constantly using the userspace
base address which could cause data integrity issues.
Created by: Wojciech Macek <wma@semihalf.com>
Obtained from: Semihalf
Sponsored by: QCM Technologies
Add CXGBE driver which is required for PowerNV system.
Also, remove AHCI which does not work in BigEndian.
Created by: Wojciech Macek <wma@semihalf.com>
Obtained from: Semihalf
Sponsored by: QCM Technologies
FreeBSD prints text char-by-char, which is not what OPAL
is designed to. Poll events more frequently to avoid buffer
overflow and loosing data.
Created by: Wojciech Macek <wma@semihalf.com>
Obtained from: Semihalf
Sponsored by: QCM Technologies
Fixes:
- map all devices to PE0
- use 1:1 TCE mapping
- provide the same TCE mapping for all PEs (not only PE0)
- add TCE reset and alignment (required by OPAL)
Created by: Wojciech Macek <wma@semihalf.com>
Obtained from: Semihalf
Sponsored by: QCM Technologies
Make XICS to be OPAL-aware.
Created by: Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by: Wojciech Macek <wma@semihalf.com>
Sponsored by: FreeBSD Foundation
P1022 SATA controller may set the wrong CCR bit for a command completion.
This would previously cause an interrupt storm. Solve this by marking all
commands complete, and letting the end_transaction deal with the successes.
Causes no problems on P5020.
While here, fix a minor bug in collision detection. The Freescale SATA
controller only has 16 slots, not 32.
Focus on code where we are doing multiplications within malloc(9). None of
these ire likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.
This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.
X-Differential revision: https://reviews.freebsd.org/D13837
to which it is specific, rather than in the generic AIM startup code. This
will be required to support the radix-table-based MMU introduced with POWER9.
buffers into a new pmap-module function pmap_map_user_ptr() that can
be implemented by the respective modules. This is required to implement
non-segment-based AIM-ish MMU systems such as the radix-tree page tables
introduced by POWER ISA 3.0 and present on POWER9.
Reviewed by: jhibbits
using a new macro PHYS_TO_DMAP, which deliberately has the same name as the
equivalent macro on amd64. This also sets the stage for moving the direct
map to another base address.
Some PowerQUICC and QorIQ platforms have a L2 cache managed via the
memory-mapped configuration registers, and appear as a node in the device
tree. This adds basic support to enable the cache.
allocated with a tag to come from the specified domain if it meets the
other constraints provided by the tag. Automatically create a tag at
the root of each bus specifying the domain local to that bus if
available.
Reviewed by: jhb, kib
Tested by: pho
Sponsored by: Netflix, Dell/EMC Isilon
Differential Revision: https://reviews.freebsd.org/D13545
domains can be done by the _domain() API variants. UMA also supports a
first-touch policy via the NUMA zone flag.
The slab layer is now segregated by VM domains and is precise. It handles
iteration for round-robin directly. The per-cpu cache layer remains
a mix of domains according to where memory is allocated and freed. Well
behaved clients can achieve perfect locality with no performance penalty.
The direct domain allocation functions have to visit the slab layer and
so require per-zone locks which come at some expense.
Reviewed by: Attilio (a slightly older version)
Tested by: pho
Sponsored by: Netflix, Dell/EMC Isilon
Provide initial support for PCIe host controller as
well as for IOMMU mapping. This commit allows proper
bus enumeration, but does not guarantee DMA operations
are working.
Created by: Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by: Wojciech Macek <wma@semihalf.com>
Sponsored by: FreeBSD Foundation
Avoid the lock in vtophys() by providing a static direct-mapped
spinlock- protected output buffer to use when the console driver
cannot acquire locks for some reason. This allows the idle thread
to use printf() (e.g. the SMP startup messages) without crashing
the kernel.
Created by: Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by: Wojciech Macek <wma@freebsd.org>
Sponsored by: FreeBSD Foundation
Make sure to set LPCR[LPES] so that external interrupts set SRR0 and SRR1
instead of HSRR0 and HSRR1. Without this, external interrupt handlers would
get the wrong MSR value when executing, causing eventual madness.
Created by: Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by: Wojciech Macek <wma@freebsd.org>
Sponsored by: FreeBSD Foundation
Fix AP startup, which was broken.
Created by: Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by: Wojciech Macek <wma@freebsd.org>
Sponsored by: FreeBSD Foundation
Add basic power control (reset, power off) and bind
ttyuX to opal console so that init will start login there.
Created by: Nathan Whitehorn <nw@freebsd.org>
Submitted by: Wojciech Macek <wma@freebsd.org>
Sponsored by: FreeBSD Foundation
OPAL is a dedicated firmware acting as a hypervisor.
Add generic functions to provide all access.
Created by: Nathan Whitehorn <nw@freebsd.org>
Submitted by: Wojciech Macek <wma@freebsd.org>
- Call resource_int_value() once during attach, rather than within the
pci_(read|write)_config() code path; this avoids taking a blocking mutex
to read kenv variables.
- Use a spin lock to protect non-atomic config space accesses; this matches
the behavior of Darwin's AppleMacRiscPCI driver.
Reviewed by: jhibbits
Differential Revision: https://reviews.freebsd.org/D13839
supported on newer POWER hardware and in graphical VMs run on the same,
which are typically XHCI-only. The 32-bit GENERIC kernel, which
does not run on hardware made in the last decade and is unlikely to
encounter XHCI devices, is left unchanged.
PR: kern/224940
Submitted by: Gustavo Romero
MFC after: 1 week
The data segement was too big.
Add a fix-up function like on ia32 for MAXDSIZ.
While here, bring also the MAXSSIZ closer to amd64 and add an equal fix-up
function for MAXSSIZ.
Reviewed by: jhibbits@
Obtained from: jhibbits@
Differential Revision: https://reviews.freebsd.org/D13753
is of limited utility outside of platform-specific code and can vary
at runtime when running as a hypervisor guest, so does not even have the
virtue of being a static identifier.
Reviewed by: jhibbits
instead of hard-coding a default. This information is passed implicitly by
the PS3 firmware and can be relied upon. Also adjust the default mode, if
somehow firmware doesn't pass one, to 1920x1080 from 720x480 since it is
2017.
MFC after: 2 weeks
- Densely number CPUs to avoid systems with CPUs with very high ID numbers
- Always have the BSP be CPU 0 to avoid remnant brokenness with non-0 BSPs
in other parts of the kernel.
- Improve parsing of the device tree CPU listings on SMT systems.
- Allow reboot via RTAS as well as OF for pSeries systems booted by FDT
without functioning Open Firmware.
Obtained from: projects/powernv
MFC after: 3 weeks
is used as the bootloader on a number of PPC64 platforms. This involves the
following pieces:
- Making the first instruction a valid kernel entry point, since kexec
ignores the ELF entry value. This requires a separate section and linker
magic to prevent the linker from filling the beginning of the section
with stubs.
- Adding an entry point at 0x60 past the first instruction for systems
lacking firmware CPU shutdown support (notably PS3).
- Linker script changes to support the above.
MFC after: 1 month
If these are not aligned, the linker has to emit a different type of
relocation that the early boot self-relocation code cannot handle, even
in principle, resulting in them being set to zero and the kernel crashing.
MFC after: 1 week