Summary:
This code adds the basic infrastructure for the facility subsystem. A facility
trap is raised when an unavailable instruction is executed. One example is
executing a Hardware Transactional Memory instruction while the MSR[TM] is
disabled. In the past, there was a specific interrupt for it (FP, VEC), but the
new instructions seem to be multiplexed on this facility interrupt.
The root cause of the trap is provided on Facility Status and Control Register
(FSCR) register.
Submitted by: Breno Leitao
Reviewed by: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D14566
Summary:
Print current MSR on printtrap(). Currently, printtrap just prints srr1, which
contains part of the MSR prior to the exception. I find useful to dump the
current value of the MSR, since it changes when there is an interruption.
With this patch, this is the new printtrap model:
handled user trap:
exception = 0x700 (program)
srr0 = 0x100008a0 (0x100008a0)
srr1 = 0x800000000002f032
current msr = 0x8000000000009032
lr = 0x1000089c (0x1000089c)
curthread = 0x7a50000
pid = 714, comm = ttrap2
Submitted by: Breno Leitao
Reviewed by: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D14600
Summary:
It is not necessary to call isync() after calling mtmsr() function, mainly
because the mtmsr() calls 'isync' internally to synchronize the machine state
register. Other than that, isync() just calls the 'isync' instruction, thus,
the 'isync' instruction is being called twice, and that seems to be unnecessary.
This patch just remove the unecessary calls to isync() after mtmsr().
Submitted by: Breno Leitao
Differential Revision: https://reviews.freebsd.org/D14583
Summary:
Currently ofw_real_bounce_alloc() is requesting memory, using WAITOK, holding a
non-sleepable locks, called 'OF Bounce Page'.
Fix this by allocating the pages outside of the lock, and only updating the
global variables while holding the lock.
Submitted by: Breno Leitao
Differential Revision: https://reviews.freebsd.org/D14955
opt_compat.h is mentioned in nearly 180 files. In-progress network
driver compabibility improvements may add over 100 more so this is
closer to "just about everywhere" than "only some files" per the
guidance in sys/conf/options.
Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of
sys/compat/linux/*.c. A fake _COMPAT_LINUX option ensure opt_compat.h
is created on all architectures.
Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the
set of compiled files.
Reviewed by: kib, cem, jhb, jtl
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14941
TLB1 can handle ranges up to 4GB (through e5500, larger in e6500), but
ilog2() took a unsigned int, which maxes out at 4GB-1, but truncates
silently. Increase the input range to the largest supported, at least for
64-bit targets. This lets the DMAP be completely mapped, instead of only
1GB blocks with it assuming being fully mapped.
As with AIM64, map the DMAP at the beginning of the fourth "quadrant" of
memory, and move the KERNBASE to the the start of KVA.
Eventually we may run the kernel out of the DMAP, but for now, continue
booting as it has been.
assym is only to be included by other .s files, and should never
actually be assembled by itself.
Reviewed by: imp, bdrewery (earlier)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D14180
OF_finddevices returns ((phandle_t)-1) in case of failure. Some code
in existing drivers checked return value to be equal to 0 or
less/equal to 0 which is also wrong because phandle_t is unsigned
type. Most of these checks were for negative cases that were never
triggered so trhere was no impact on functionality.
Reviewed by: nwhitehorn
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D14645
When the kernel can be in real mode in early boot, we can execute from
high addresses aliased to the kernel's physical memory. If that high
address has the first two bits set to 1 (0xc...), those addresses will
automatically become part of the direct map. This reduces page table
pressure from the kernel and it sets up the kernel to be used with
radix translation, for which it has to be up here.
This is accomplished by exploiting the fact that all PowerPC kernels are
built as position-independent executables and relocate themselves
on start. Before this patch, the kernel runs at 1:1 VA:PA, but that
VA/PA is random and set by the bootloader. Very early, it processes
its ELF relocations to operate wherever it happens to find itself.
This patch uses that mechanism to re-enter and re-relocate the kernel
a second time witha new base address set up in the early parts of
powerpc_init().
Reviewed by: jhibbits
Differential Revision: D14647
accomplishes a few things:
- Makes NULL an invalid address in the kernel, which is useful for catching
bugs.
- Lays groundwork for radix-tree translation on POWER9, which requires the
direct map be at high memory.
- Similarly lays groundwork for a direct map on 64-bit Book-E.
The new base address is chosen as the base of the fourth radix quadrant
(the minimum kernel address in this translation mode) and because all
supported CPUs ignore at least the first two bits of addresses in real
mode, allowing direct-map addresses to be used in real-mode handlers.
This is required by Linux and is part of the architecture standard
starting in POWER ISA 3, so can be relied upon.
Reviewed by: jhibbits, Breno Leitao
Differential Revision: D14499
correctly for the data contained on each memory page.
There are several components to this change:
* Add a variable to indicate the start of the R/W portion of the
initial memory.
* Stop detecting NX bit support for each AP. Instead, use the value
from the BSP and, if supported, activate the feature on the other
APs just before loading the correct page table. (Functionally, we
already assume that the BSP and all APs had the same support or
lack of support for the NX bit.)
* Set the RW and NX bits correctly for the kernel text, data, and
BSS (subject to some caveats below).
* Ensure DDB can write to memory when necessary (such as to set a
breakpoint).
* Ensure GDB can write to memory when necessary (such as to set a
breakpoint). For this purpose, add new MD functions gdb_begin_write()
and gdb_end_write() which the GDB support code can call before and
after writing to memory.
This change is not comprehensive:
* It doesn't do anything to protect modules.
* It doesn't do anything for kernel memory allocated after the kernel
starts running.
* In order to avoid excessive memory inefficiency, it may let multiple
types of data share a 2M page, and assigns the most permissions
needed for data on that page.
Reviewed by: jhb, kib
Discussed with: emaste
MFC after: 2 weeks
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D14282
Add I2C OPAL driver and a set of dummy-ones to allow
all I2C things on Power8 to attach.
TODO: better async token management
Submitted by: Wojciech Macek <wma@semihalf.com>
Obtained from: Semihalf
Sponsored by: IBM, QCM Technologies
A reservation granule on PowerPC is a cache line.
On e500mc and derivatives a cacheline size is 64 bytes, not 32. Allocate
the maximum size permitted, but only utilize the size that is needed. On
e500v1 and e500v2 the reservation granule will still be 32 bytes.
These interfaces were put in place to let QorIQ SoCs dictate CPU idling
semantics, in order to support capabilities such as NAP mode and deep sleep.
However, this never stabilized, and the idling support reverted back to
CPU-level rather than SoC level. Move this code back to cpu.c instead. If
at a later date the lower power modes do come to fruition, it should be done
by overriding the cpu_idle_hook instead of this platform hook.
NVMe support is ready and should be compiled-in
to the ppc64 kernel.
Submitted by: Wojciech Macek <wma@semihalf.org>
Obtained from: Semihalf
Sponsored by: IBM, QCM Technologies
We don't support float in the boot loaders, so don't include
interfaces for float or double in systems headers. In addition, take
the unusual step of spiking double and float to prevent any more
accidental seepage.
When processor enters power-save state it releases resources shared with other
cpu threads which makes other cores working much faster.
This patch also implements saving and restoring registers that might get
corrupted in power-save state.
Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Reviewed by: jhibbits, nwhitehorn, wma
Sponsored by: IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D14330
Add function which can store RTC values to OPAL.
Submitted by: Wojciech Macek <wma@semihalf.org>
Obtained from: Semihalf
Sponsored by: IBM, QCM Technologies
Summary:
This compartmentalizes the CPU-specific trap components into its own
function, rather than littering the general printtrap() with various checks.
This will let us replace a series of #ifdef's with a runtime conditional check
in the future.
Reviewed By: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D14416
Make vm_wait() take the vm_object argument which specifies the domain
set to wait for the min condition pass. If there is no object
associated with the wait, use curthread' policy domainset. The
mechanics of the wait in vm_wait() and vm_wait_domain() is supplied by
the new helper vm_wait_doms(), which directly takes the bitmask of the
domains to wait for passing min condition.
Eliminate pagedaemon_wait(). vm_domain_clear() handles the same
operations.
Eliminate VM_WAIT and VM_WAITPFAULT macros, the direct functions calls
are enough.
Eliminate several control state variables from vm_domain, unneeded
after the vm_wait() conversion.
Scetched and reviewed by: jeff
Tested by: pho
Sponsored by: The FreeBSD Foundation, Mellanox Technologies
Differential revision: https://reviews.freebsd.org/D14384
On POWER8 architecture there is a timer with 512Mhz frequency.
It has about 1,95ns period, but it is rounded to 1ns which is not accurate.
Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Reviewed by: wma
Sponsored by: IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D14433
Currently Hypervisor Emulation Assistance interrupt is unhandled.
Executing an undefined instruction in userland triggers kernel panic.
Handle this the same way as Facility Unavailable Interrupt - send
SIGILL signal to userspace.
Submitted by: Michal Stanek <mst@semihalf.com>
Obtained from: Semihalf
Reviewed by: nwhitehorn, pdk@semihalf.com, wma
Sponsored by: IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D14437
zero, matching the IEEE 1275 standard. Since these internal error paths
have never, to my knowledge, been taken, behavior is unchanged.
Reported by: gonzo
MFC after: 2 weeks
This is part of a long-term goal of merging Book-E and AIM into a single GENERIC
kernel. As more work is done, the struct may be optimized further.
Reviewed by: nwhitehorn
Summary:
After revision rS328534('PPC64: use hwref instead of cpuid'), FreeBSD on
powerpc64 virtual machine panics since it is unable to read the
timebase, showing the following error:
get-property for timebase-frequency on zero phandle
panic: Unable to determine timebase frequency!
With the change above, cpuref->cr_hwref does not contain the phandle
anymore, thus, it never reads the proper CPU entry in OF.
Submitted by: Breno Leitao
Differential Revision: https://reviews.freebsd.org/D14204
significant source of cache line contention from vm_page_alloc(). Use
accessors and vm_page_unwire_noq() so that the mechanism can be easily
changed in the future.
Reviewed by: markj
Discussed with: kib, glebius
Tested by: pho (earlier version)
Sponsored by: Netflix, Dell/EMC Isilon
Differential Revision: https://reviews.freebsd.org/D14273
We don't support older compilers. Most of the code in these files is
for pre-3.0 gcc, which is at least 15 years obsolete. Move to using
phk's sys/_stdargs.h for all these platforms.
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D14323