Follow-up to r353959 and r368070: do the same for other architectures.
arm32 already seems to use its own .fnstart/.fnend directives, which
appear to be ARM-specific variants of the same thing. Likewise, MIPS
uses .frame directives.
Reviewed by: arichardson
Differential Revision: https://reviews.freebsd.org/D27387
Summary:
The existing page table is fraught with errors, since it creates a hole
in the address space bits. Fix this by taking a cue from the POWER9
radix pmap, and make the page table 4 levels, 52 bits.
Reviewed by: bdragon
Differential Revision: https://reviews.freebsd.org/D24220
The memory range between VM_MAXUSER_ADDRESS and VM_MIN_KERNEL_ADDRESS is
reserved for devices currently, which are always mapped in TLB1, and
therefore do not exist in the kernel page table. Any page fault in this
range is therefore automatically a fatal fault.
Since TLB_MAXNEST is 3, the insert mask should only be 2 bits. Given that 2
bits counts to 4, and that we already have plenty of space wasted in
padding, make the nest level 4 to match the mask.
This involved several changes:
* Since lld does not like text relocations, replace SMP boot page text relocs
in booke/locore.S with position-independent math, and track the virtual base
in the SMP boot page header.
* As some SPRs are interpreted differently on clang due to the way it handles
platform-specific SPRs, switch m*dear and m*esr mnemonics out for regular
m*spr. Add both forms of SPR_DEAR to spr.h so the correct encoding is selected.
* Change some hardcoded 32 bit things in the boot page to be pointer-sized, and
fix alignment.
* Fix 64-bit build of booke/pmap.c when enabling pmap debugging.
Additionally, I took the opportunity to document how the SMP boot page works.
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D21999
MAS8 is hypervisor privileged, defining the logical partition (VM) to
operate on for TLB accesses. It's already guaranteed to be cleared when
booting bare metal (bootloader needs it zeroed to work), and we can't touch
it from a guest. Assume that if/when we eventually port bhyve to PowerPC
(and Book-E) the hypervisor module will take care of managing MAS8. This
saves several (tens) of clocks on each TLB miss.
MFC after: 2 weeks
This is part 2 of r347078, pulling the page directory out of the Book-E
pmap. This breaks KBI for anything that uses struct pmap (such as vm_map)
so any modules that access this must be rebuilt.
Unconditional writing to MAS7, which doesn't exist on the e500v1 core, in a
TLB miss handler has been in the code for several years now. Since this has
gone unnoticed for so long, it's easily concluded that e500v1 is not in use
with FreeBSD. Simplify the code path a bit, by unconditionally zeroing MAS7
instead of calling a subroutine to do it.
The Signal Processing Engine (SPE) found in Freescale e500 cores (and
others) offloads IEEE-754 compliance (NaN, Inf handling, overflow,
underflow) to software, most likely as a means of simplifying the APU
silicon. Some software, like AbiWord, needs full IEEE-754 compliance,
including NaN handling. Implement the necessary bits to enable it.
Differential Revision: https://reviews.freebsd.org/D17446
Make int_external_input, int_decrementer, and int_performance_counter all
now use trap_common, just like on AIM. The effects of this are:
* All traps are now properly displayed in ddb. Previously traps from
external input, decrementer, and performance counters, would display as
just basic stack traces. Now the frame is displayed.
* External interrupts are now handled with interrupts enabled, so handling
can be preempted. This seems to fix a hang found post-r329882.
A reservation granule on PowerPC is a cache line.
On e500mc and derivatives a cacheline size is 64 bytes, not 32. Allocate
the maximum size permitted, but only utilize the size that is needed. On
e500v1 and e500v2 the reservation granule will still be 32 bytes.
According to EREF rlwinm is supposed to clear the upper 32 bits of the
register of 64-bit cores. However, from experience it seems there's a bug
in the e5500 which causes the result to be duplicated in the upper bits of
the register. This causes problems when applied to stashed SRR1 accessed
to retrieve context, as the upper bits are not masked out, so a
set_mcontext() fails. This causes sigreturn() to in turn return with
EINVAL, causing make(1) to exit with error.
This bit is unused in e500mc derivatives (including e5500), so could just be
conditional on non-powerpc64, but there may be other non-Freescale cores
which do use it. This is also the same as the POW bit on Book-S, so could
be cleared unconditionally with the only penalty being a few clock cycles
for these two interrupts.
The current method only sort of works, and usually doesn't work reliably.
Also, on Book-E the return address from DEBUG exceptions is not the sentinel
addresses, so it won't exit the loop correctly.
Fix this by better handling trap frames during unwinding, and using the
common trap handler for debug traps, as the code in that segment is
identical between the two.
MFC after: 1 week
Extend the Book-E pmap to support 64-bit operation. Much of this was taken from
Juniper's Junos FreeBSD port. It uses a 3-level page table (page directory
list -- PP2D, page directory, page table), but has gaps in the page directory
list where regions will repeat, due to the design of the PP2D hash (a 20-bit gap
between the two parts of the index). In practice this may not be a problem
given the expanded address space. However, an alternative to this would be to
use a 4-level page table, like Linux, and possibly reduce the available address
space; Linux appears to use a 46-bit address space. Alternatively, a cache of
page directory pointers could be used to keep the overall design as-is, but
remove the gaps in the address space.
This includes a new kernel config for 64-bit QorIQ SoCs, based on MPC85XX, with
the following notes:
* The DPAA driver has not yet been ported to 64-bit so is not included in the
kernel config.
* This has been tested on the AmigaOne X5000, using a MD_ROOT compiled in
(total size kernel+mdroot must be under 64MB).
* This can run both 32-bit and 64-bit processes, and has even been tested to run
a 32-bit init with 64-bit children.
Many thanks to stevek and marcel for getting Juniper's FreeBSD patches open
sourced to be used here, and to stevek for reviewing, and providing some
historical contexts on quirks of the code.
Reviewed by: stevek
Obtained from: Juniper (in part)
MFC after: 2 months
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D9433
Freescale added the E.D profile to e500mc and derivative cores. From
Freescale's EREF reference manual this is enabled by a bit in HID0 and should
otherwise default to traditional debug. However, none of the Freescale cores
support that bit, and instead always use E.D. This results in kernel panics
using the standard debug on e500mc+ cores.
Enhanced debug allows debugging of interrupts, including critical interrupts,
as it uses a different save/restore registers (srr*). At this time we don't use
this ability, so instead share the core of the debug handler code between both
handlers.
MFC after: 3 weeks
Summary:
The revised Book-E spec, adding the specification for the MMUv2 and e6500,
includes a hardware PTE layout for indirect page tables. In order to support
this in the future, migrate the PTE format to match the MMUv2 hardware PTE
format.
Test Plan: Boot tested on a P5020 board. Booted to multiuser mode.
Differential Revision: https://reviews.freebsd.org/D5224
e500mc, e5500, and e6500 all use the normal FPU, with the same behavior as AIM
hardware. e6500 also supports Altivec, so, although we don't yet have e6500
hardware to test on, add these IVORs as well. Theoretically, since it boots the
same as a e5500, it should work, single-threaded, single-core, with full altivec
support as of this commit.
With this commit, and some other patches to be committed shortly FreeBSD now
boots on the P5020, single-core, all the way to user space, and should boot just
fine on e500mc.
Relnotes: Yes (e500mc, e5500 support)
Sponsored by: Alex Perez/Inertial Computing
As part of this, clean up tlb1_init(), since bootinfo is always NULL here just
eliminate the loop altogether.
Also, fix a bug in mmu_booke_mapdev_attr() where it's possible to map a larger
immediately following a smaller page, causing the mappings to overlap. Instead,
break up the new mapping into smaller chunks. The downside to this is that it
uses more precious TLB1 entries, which, on smaller chips (e500v2) it could cause
problems with TLB1 being out of space (e500v2 only has 16 TLB1 entries).
Obtained from: Semihalf (partial)
Sponsored by: Alex Perez/Inertial Computing
Summary:
Both booke and AIM interrupt.c files contain nearly identical code. This merges
the two files, to reduce duplication.
Reviewers: #powerpc, marcel
Reviewed By: marcel
Subscribers: imp
Differential Revision: https://reviews.freebsd.org/D2991
This supports e500v1, e500v2, and e500mc. Tested only on e500v2, but the
performance counters are identical across all, with e500mc having some
additional events.
Relnotes: Yes
executables. The goal here, not yet accomplished, is to let the e500 kernel
run under QEMU by setting KERNBASE to something that fits in low memory and
then having the kernel relocate itself at runtime.
PCPU fields for curthread, by doing the same to Book-E. This closes
some potential races switching between CPUs. As a side effect, it turns out
the AIM and Book-E swtch.S implementations were the same to within a few
registers, so move that to powerpc/powerpc.
MFC after: 3 months
Note that setting the PTE_MODIFIED bit based on whether write is possible
is incorrect. We should set PTE_MODIFIED based on whether the access
is a write operation.
routines.
This unbreaks Book-E build after the recent machine/mutex.h removal.
While there move tlb_*lock() prototypes to machine/tlb.h.
Submitted by: jhb
Previously, DBCR0 flags were set "globally", but this leads to problems
because Book-E fine grained debug settings work only in conjuction with the
debug master enable bit in MSR: in scenarios when the DBCR0 was set with
intention to debug one process, but another one with MSR[DE] set got
scheduled, the latter would immediately cause debug exceptions to occur upon
execution of its own code instructions (and not the one intended for
debugging).
To avoid such problems and properly handle debugging context, DBCR0 state
should be managed individually per process.
Submitted by: Grzegorz Bernacki gjb ! semihalf dot com
Reviewed by: marcel
o Eliminate tlb0[] (a s/w copy of TLB0)
- The table contents cannot be maintained reliably in multiple MMU
environments, where asynchronous events (invalidations from other cores)
can change our local TLB0 contents underneath.
- Simplify and optimize TLB flushing: system wide invalidations are
performed using tlbivax instruction (propagates to other cores), for
local MMU invalidations a new optimized routine (assembly) is introduced.
o Improve and simplify TID allocation and management.
- Let each core keep track of its TID allocations.
- Simplify TID recycling, eliminate dead code.
- Drop the now unused powerpc/booke/support.S file.
o Improve page tables management logic.
o Simplify TLB1 manipulation routines.
o Other improvements and polishing.
Obtained from: Freescale, Semihalf
The PQ3 is a high performance integrated communications processing system
based on the e500 core, which is an embedded RISC processor that implements
the 32-bit Book E definition of the PowerPC architecture. For details refer
to: http://www.freescale.com/webapp/sps/site/prod_summary.jsp?code=MPC8555E
This port was tested and successfully run on the following members of the PQ3
family: MPC8533, MPC8541, MPC8548, MPC8555.
The following major integrated peripherals are supported:
* On-chip peripherals bus
* OpenPIC interrupt controller
* UART
* Ethernet (TSEC)
* Host/PCI bridge
* QUICC engine (SCC functionality)
This commit brings the main functionality and will be followed by individual
drivers that are logically separate from this base.
Approved by: cognet (mentor)
Obtained from: Juniper, Semihalf
MFp4: e500