We don't support float in the boot loaders, so don't include
interfaces for float or double in systems headers. In addition, take
the unusual step of spiking double and float to prevent any more
accidental seepage.
When processor enters power-save state it releases resources shared with other
cpu threads which makes other cores working much faster.
This patch also implements saving and restoring registers that might get
corrupted in power-save state.
Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Reviewed by: jhibbits, nwhitehorn, wma
Sponsored by: IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D14330
This is part of a long-term goal of merging Book-E and AIM into a single GENERIC
kernel. As more work is done, the struct may be optimized further.
Reviewed by: nwhitehorn
We don't support older compilers. Most of the code in these files is
for pre-3.0 gcc, which is at least 15 years obsolete. Move to using
phk's sys/_stdargs.h for all these platforms.
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D14323
threads from compile-time defines to global variables. This removes a
significant amount of duplicated runtime patches to the compile-time
defines, centralizing the conditional logic in the early startup code.
Reviewed by: jhibbits
used with hashed page tables on AIM and place it into a new, modular pmap
function called pmap_decode_kernel_ptr(). This function is the inverse
of pmap_map_user_ptr(). With POWER9 radix tables, which mapping to use
becomes more complex than just AIM/BOOKE and it is best to have it in
the same place as pmap_map_user_ptr().
Reviewed by: jhibbits
Commits r326203 and r326978 broke 64-bit booke kernels by introducing a 1MB
zero-pad between the ELF header and the start of the kernel. This didn't
cause a build failure, but caused kernels to need to be loaded into memory
1MB lower, which could easily break scripts expecting previous behavior.
This change matches the similar change made to AIM in r327358.
kernel by PHYS_TO_DMAP() as previously present on amd64, arm64, riscv, and
powerpc64. This introduces a new MI macro (PMAP_HAS_DMAP) that can be
evaluated at runtime to determine if the architecture has a direct map;
if it does not (or does) unconditionally and PMAP_HAS_DMAP is either 0 or
1, the compiler can remove the conditional logic.
As part of this, implement PHYS_TO_DMAP() on sparc64 and mips64, which had
similar things but spelled differently. 32-bit MIPS has a partial direct-map
that maps poorly to this concept and is unchanged.
Reviewed by: kib
Suggestions from: marius, alc, kib
Runtime tested on: amd64, powerpc64, powerpc, mips64
buffers into a new pmap-module function pmap_map_user_ptr() that can
be implemented by the respective modules. This is required to implement
non-segment-based AIM-ish MMU systems such as the radix-tree page tables
introduced by POWER ISA 3.0 and present on POWER9.
Reviewed by: jhibbits
using a new macro PHYS_TO_DMAP, which deliberately has the same name as the
equivalent macro on amd64. This also sets the stage for moving the direct
map to another base address.
Make sure to set LPCR[LPES] so that external interrupts set SRR0 and SRR1
instead of HSRR0 and HSRR1. Without this, external interrupt handlers would
get the wrong MSR value when executing, causing eventual madness.
Created by: Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by: Wojciech Macek <wma@freebsd.org>
Sponsored by: FreeBSD Foundation
The data segement was too big.
Add a fix-up function like on ia32 for MAXDSIZ.
While here, bring also the MAXSSIZ closer to amd64 and add an equal fix-up
function for MAXSSIZ.
Reviewed by: jhibbits@
Obtained from: jhibbits@
Differential Revision: https://reviews.freebsd.org/D13753
is of limited utility outside of platform-specific code and can vary
at runtime when running as a hypervisor guest, so does not even have the
virtue of being a static identifier.
Reviewed by: jhibbits
is used as the bootloader on a number of PPC64 platforms. This involves the
following pieces:
- Making the first instruction a valid kernel entry point, since kexec
ignores the ELF entry value. This requires a separate section and linker
magic to prevent the linker from filling the beginning of the section
with stubs.
- Adding an entry point at 0x60 past the first instruction for systems
lacking firmware CPU shutdown support (notably PS3).
- Linker script changes to support the above.
MFC after: 1 month
They provide relaxed-ordered atomic access semantic. Due to the
FreeBSD memory model, the operations are syntaxical wrappers around
the volatile accesses. The volatile qualifier is used to ensure that
the access not optimized out and in turn depends on the volatile
semantic as implemented by supported compilers.
The motivation for adding the operation is to help people coming from
other systems or knowing the C11/C++ standards where atomics have
special type and require use of the special access operations. It is
still the case that FreeBSD requires plain load and stores of aligned
integer types to be atomic.
Suggested by: jhb
Reviewed by: alc, jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D13534
Currently Facility Unavailable is absent and once an application
tries to use or access a register from a feature disabled in the
CPU it causes a kernel panic.
A simple test-case is:
int main() { asm volatile ("tbegin.;"); }
which will use TM (Hardware Transactional Memory) feature which
is not supported by the kernel and so will trigger the following
kernel panic:
----
fatal user trap:
exception = 0xf60 (unknown)
srr0 = 0x10000890
srr1 = 0x800000000000f032
lr = 0x100004e4
curthread = 0x5f93000
pid = 1021, comm = htm
panic: unknown trap
cpuid = 40
KDB: stack backtrace:
Uptime: 3m18s
Dumping 10 MB (3 chunks)
chunk 0: 11MB (2648 pages) ... ok
chunk 1: 1MB (24 pages) ... ok
chunk 2: 1MB (2 pages)panic: IOMMU mapping error: -4
cpuid = 40
Uptime: 3m18s
----
Since Hardware Transactional Memory is not yet supported by FreeBSD, treat
this as an illegal instruction.
PR: 224350
Submitted by: Gustavo Romero <gromero_AT_ibm_DOT_com>
MFC after: 2 weeks
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
is set and the right thing to do may be platform-dependent (it requires
firmware on PowerNV, for instance). Make it a new platform method called
platform_smp_timebase_sync().
MFC after: 3 weeks
Mainly focus on files that use BSD 3-Clause license.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
Initially, only tag files that use BSD 4-Clause "Original" license.
RelNotes: yes
Differential Revision: https://reviews.freebsd.org/D13133
- allocate value for new AT_HWCAP2 auxiliary vector on all platforms.
- expand 'struct sysentvec' by new 'u_long *sv_hwcap2', in exactly
same way as for AT_HWCAP.
MFC after: 1 month
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D12699
A new 'u_long *sv_hwcap' field is added to 'struct sysentvec'. A
process ABI can set this field to point to a value holding a mask of
architecture-specific CPU feature flags. If an ABI does not wish to
supply AT_HWCAP to processes the field can be left as NULL.
The support code for AT_EHDRFLAGS was already present on all systems,
just the #define was not present. This is a step towards unifying the
AT_* constants across platforms.
Reviewed by: kib
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D12290
These processors may not be supported yet, but add them for completion.
POWER9 is planned for support. e300 may work (based on 603e core).
P5040/P5021 are similar to P5020, so should work as well. One addition is
needed for P5040, to support the number of LAWs, and will be a separate commit.
--Remove special-case handling of sparc64 bus_dmamap* functions.
Replace with a more generic mechanism that allows MD busdma
implementations to generate inline mapping functions by
defining WANT_INLINE_DMAMAP in <machine/bus_dma.h>. This
is currently useful for sparc64, x86, and arm64, which all
implement non-load dmamap operations as simple wrappers
around map objects which may be bus- or device-specific.
--Remove NULL-checked bus_dmamap macros. Implement the
equivalent NULL checks in the inlined x86 implementation.
For non-x86 platforms, these checks are a minor pessimization
as those platforms do not currently allow NULL maps. NULL
maps were originally allowed on arm64, which appears to have
been the motivation behind adding arm[64]-specific barriers
to bus_dma.h, but that support was removed in r299463.
--Simplify the internal interface used by the bus_dmamap_load*
variants and move it to bus_dma_internal.h
--Fix some drivers that directly include sys/bus_dma.h
despite the recommendations of bus_dma(9)
Reviewed by: kib (previous revision), marius
Differential Revision: https://reviews.freebsd.org/D10729
AKA Make time_t 64 bits on powerpc(32).
PowerPC currently (until now) was one of two architectures with a 32-bit time_t
on 32-bit archs (the other being i386). This is an ABI breakage, so all ports,
and all local binaries, *must* be recompiled.
Tested by: andreast, others
MFC after: Never
Relnotes: Yes
from machine/proc.h, consistently on all architectures.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
X-Differential revision: https://reviews.freebsd.org/D11080
in place. To do per-cpu stats, convert all fields that previously were
maintained in the vmmeters that sit in pcpus to counter(9).
- Since some vmmeter stats may be touched at very early stages of boot,
before we have set up UMA and we can do counter_u64_alloc(), provide an
early counter mechanism:
o Leave one spare uint64_t in struct pcpu, named pc_early_dummy_counter.
o Point counter(9) fields of vmmeter to pcpu[0].pc_early_dummy_counter,
so that at early stages of boot, before counters are allocated we already
point to a counter that can be safely written to.
o For sparc64 that required a whole dummy pcpu[MAXCPU] array.
Further related changes:
- Don't include vmmeter.h into pcpu.h.
- vm.stats.vm.v_swappgsout and vm.stats.vm.v_swappgsin changed to 64-bit,
to match kernel representation.
- struct vmmeter hidden under _KERNEL, and only vmstat(1) is an exclusion.
This is based on benno@'s 4-year old patch:
https://lists.freebsd.org/pipermail/freebsd-arch/2013-July/014471.html
Reviewed by: kib, gallatin, marius, lidl
Differential Revision: https://reviews.freebsd.org/D10156
The MFC will include a compat definition of smp_no_rendevous_barrier()
that calls smp_no_rendezvous_barrier().
Reviewed by: gnn, kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D10313
I fixed this in 1997, but the fix was over-engineered and fragile and
was broken in 2003 if not before. i386 parameters were copied to 8
other arches verbatim, mostly after they stopped working on i386, and
mostly without the large comment saying how the values were chosen on
i386. powerpc has a non-verbatim copy which just changes the uncritical
parameter and seems to add a sign extension bug to it.
Just treat negative offsets as offsets if they are no more negative than
-db_offset_max (default -64K), and remove all the broken parameters.
-64K is not very negative, but it is enough for frame and stack pointer
offsets since kernel stacks are small.
The over-engineering was mainly to go more negative than -64K for the
negative offset format, without affecting printing for more than a
single address.
Addresses in the top 64K of a (full 32-bit or 64-bit) address space
are now printed less well, but there aren't many interesting ones.
For arches that have many interesting ones very near the top (e.g.,
68k has interrupt vectors there), there would be no good limit for
the negative offset format and -64K is a good as anything.
Extend the Book-E pmap to support 64-bit operation. Much of this was taken from
Juniper's Junos FreeBSD port. It uses a 3-level page table (page directory
list -- PP2D, page directory, page table), but has gaps in the page directory
list where regions will repeat, due to the design of the PP2D hash (a 20-bit gap
between the two parts of the index). In practice this may not be a problem
given the expanded address space. However, an alternative to this would be to
use a 4-level page table, like Linux, and possibly reduce the available address
space; Linux appears to use a 46-bit address space. Alternatively, a cache of
page directory pointers could be used to keep the overall design as-is, but
remove the gaps in the address space.
This includes a new kernel config for 64-bit QorIQ SoCs, based on MPC85XX, with
the following notes:
* The DPAA driver has not yet been ported to 64-bit so is not included in the
kernel config.
* This has been tested on the AmigaOne X5000, using a MD_ROOT compiled in
(total size kernel+mdroot must be under 64MB).
* This can run both 32-bit and 64-bit processes, and has even been tested to run
a 32-bit init with 64-bit children.
Many thanks to stevek and marcel for getting Juniper's FreeBSD patches open
sourced to be used here, and to stevek for reviewing, and providing some
historical contexts on quirks of the code.
Reviewed by: stevek
Obtained from: Juniper (in part)
MFC after: 2 months
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D9433
Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.
Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96
When committing DTrace in 2012/2013 era I inadvertently broke breakpoints, by
setting EXC_DTRACE to the same value as BKPT_INST. Change EXC_DTRACE to a
different, yet logically identical, trap (tw <all>,31,31).
MFC after: 2 weeks
Implement get_pcpu() for amd64/sparc64/mips/powerpc, and use it to
replace pcpu_find(curcpu) in MI code.
Reviewed by: andreast, kan, lidl
Tested by: lidl(mips, sparc64), andreast(powerpc)
Differential Revision: https://reviews.freebsd.org/D9587
The types are for the byte offset and page index in vm object. They
are similar to off_t, which is defined as 64bit MI integer. Using MI
definitions will allow to provide consistent MD values of vm
object-related maximum sizes.
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
The switch to get_pcpu() in MI code seems to cause hangs on MIPS.
Back out until we can get a better idea of what's happening there.
Reported by: kan, lidl
Freescale added the E.D profile to e500mc and derivative cores. From
Freescale's EREF reference manual this is enabled by a bit in HID0 and should
otherwise default to traditional debug. However, none of the Freescale cores
support that bit, and instead always use E.D. This results in kernel panics
using the standard debug on e500mc+ cores.
Enhanced debug allows debugging of interrupts, including critical interrupts,
as it uses a different save/restore registers (srr*). At this time we don't use
this ability, so instead share the core of the debug handler code between both
handlers.
MFC after: 3 weeks
The desired behavior of atomic_fcmpset_() is to always exit on error. Instead
of retrying on lost reservation, leave the retry to the caller, and return
error.
Reported by: kib
There are no alternatives defined, so there's no point in keeping them. Also,
they weren't around every inline asm block anyway. Without __GNUCLIKE_ASM
defined, the guarded functions return garbage.
Reported by: Andrew Thompson
Summary:
atomic_fcmpset_*() is analogous to atomic_cmpset(), but saves off the read value
from the target memory location into the 'old' pointer in the case of failure.
Requested by: mjg
Differential Revision: https://reviews.freebsd.org/D9325
There are places where checks are made against VM_MAX_KERNEL_ADDRESS, or
virtual_end (set to VM_MAX_KERNEL_ADDRESS). With 32-bit checks, an address will
always be less than or equal to 0xffffffff. Drop a page, so those checks can
terminate loops safely.
MPC750 User Manual Errata (rev 1) adds a note to C.4.2.2 noting that mtsr,
mtsrin, and mtmsr all require a isync after the instruction and before data
address translation uses any of the segment registers. This should make FreeBSD
run correctly on the G3 again.
Reported by: Mark Millard
MFC after: 1 week
r309017 removed two fields from struct vmmeter, which is embedded in struct
pcpu. This caused the struct size to change, triggering the CTASSERT in
sys/pcpu.h. Add the extra 8 bytes back in as padding.
vmpage requires struct pmap to exist and contain a pm_stats field. As of
r308817, either AIM or BOOKE is required to be set in order to get their
respective pmap structs. Rather than expose them both, or try to unify them
unnecessarily, add a third option which contains only a pm_stats field, and
change the two existing pmap structures to place the common fields at the
beginning of the struct. This actually fixes the stats collection by libkvm on
AIM hardware, because before it was accessing a possibly different offset, which
would cause it to read garbage.
Bump __FreeBSD_version to denote this ABI change, so that ports which depend on
libkvm can be rebuilt.
Change the pv_tracked flag to an int, just in case userspace decides to include
this file and defines BOOKE.
Guard this block from unintentional inclusion with ifdef BOOKE.
Reported by: emaste
Drop the tracking down to the pmap layer, with optimizations to only track
necessary pages. This should give a (slight) performance improvement, as well
as a stability improvement, as the tracking is already mostly handled by the
pmap layer.
Summary:
The Freescale e500v2 PowerPC core does not use a standard FPU.
Instead, it uses a Signal Processing Engine (SPE)--a DSP-style vector processor
unit, which doubles as a FPU. The PowerPC SPE ABI is incompatible with the
stock powerpc ABI, so a new MACHINE_ARCH was created to deal with this.
Additionaly, the SPE opcodes overlap with Altivec, so these are mutually
exclusive. Taking advantage of this fact, a new file, powerpc/booke/spe.c, was
created with the same function set as in powerpc/powerpc/altivec.c, so it
becomes effectively a drop-in replacement. setjmp/longjmp were modified to save
the upper 32-bits of the now-64-bit GPRs (upper 32-bits are only accessible by
the SPE).
Note: This does _not_ support the SPE in the e500v1, as the e500v1 SPE does not
support double-precision floating point.
Also, without a new MACHINE_ARCH it would be impossible to provide binary
packages which utilize the SPE.
Additionally, no work has been done to support ports, work is needed for this.
This also means no newer gcc can yet be used. However, gcc's powerpc support
has been refactored which would make adding a powerpcspe-freebsd target very
easy.
Test Plan:
This was lightly tested on a RouterBoard RB800 and an AmigaOne A1222
(P1022-based) board, compiled against the new ABI. Base system utilities
(/bin/sh, /bin/ls, etc) still function appropriately, the system is able to boot
multiuser.
Reviewed By: bdrewery, imp
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D5683
with no creative content. Include "lost" changes from git:
o Use /dev/efi instead of /dev/efidev
o Remove redundant NULL checks.
Submitted by: kib@, dim@, zbb@, emaste@
Summary:
Kernel maps only one page of FDT. When FDT is more than one page in size, data
TLB miss occurs on memmove() when FDT is moved to kernel storage
(sys/powerpc/booke/booke_machdep.c, booke_init())
This introduces a pmap_early_io_unmap() to complement pmap_early_io_map(), which
can be used for any early I/O mapping, but currently is only used when mapping
the fdt.
Submitted by: Ivan Krivonos <int0dster_gmail.com>
Differential Revision: https://reviews.freebsd.org/D7605
Several files use the internal name of `struct device` instead of
`device_t` which is part of the public API. This patch changes all
`struct device *` to `device_t`.
The remaining occurrences of `struct device` are those referring to the
Linux or OpenBSD version of the structure, or the code is not built on
FreeBSD and it's unclear what to do.
Submitted by: Matthew Macy <mmacy@nextbsd.org> (previous version)
Approved by: emaste, jhibbits, sbruno
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D7447
Without enabling this bit, tlbre and tlbsx don't update the MAS7 register,
resulting in garbage in the register after a read (rather, the previous setting
of it for a tlbwe). This can result in mmu_booke_mapdev_attr() thinking
mappings that should match actually don't, because tlb1_read_entry() can't
determine the correct address of a given entry.
MFC after: 11-RELEASE
L3 cache is not defined by Book-E, so is platform specific. Since it was
already moved for e500-based devices into mpc85xx in r292903, just eliminate it
altogether. Any device that supports L3 cache should have its own platform
means to enable it.
mp_maxid or CPU_FOREACH() as appropriate. This fixes a number of places in
the kernel that assumed CPU IDs are dense in [0, mp_ncpus) and would try,
for example, to run tasks on CPUs that did not exist or to allocate too
few buffers on systems with sparse CPU IDs in which there are holes in the
range and mp_maxid > mp_ncpus. Such circumstances generally occur on
systems with SMT, but on which SMT is disabled. This patch restores system
operation at least on POWER8 systems configured in this way.
There are a number of other places in the kernel with potential problems
in these situations, but where sparse CPU IDs are not currently known
to occur, mostly in the ARM machine-dependent code. These will be fixed
in a follow-up commit after the stable/11 branch.
PR: kern/210106
Reviewed by: jhb
Approved by: re (glebius)
Summary:
PowerPC Book-E SMP is currently broken for unknown reasons. Pull in
Semihalf changes made c2012 for e500mc/e5500, which enables booting SMP.
This eliminates the shared software TLB1 table, replacing it with
tlb1_read_entry() function.
This does not yet support ePAPR SMP booting, and doesn't handle resetting CPUs
already released (ePAPR boot releases APs to a spin loop waiting on a specific
address). This will be addressed in the near future by using the MPIC to reset
the AP into our own alternate boot address.
This does include a change to the dpaa/dtsec(4) driver, to mark the portals as
CPU-private.
Test Plan:
Tested on Amiga X5000/20 (P5020). Boots, prints the following
messages:
Adding CPU 0, pir=0, awake=1
Waking up CPU 1 (dev=1)
Adding CPU 1, pir=20, awake=1
SMP: AP CPU #1 launched
top(1) shows CPU1 active.
Obtained from: Semihalf
Relnotes: Yes
Differential Revision: https://reviews.freebsd.org/D5945
Summary:
There is currently a 1GB hole between user and kernel address spaces
into which direct (1:1 PA:VA) device mappings go. This appears to go largely
unused, leaving all devices to contend with the 128MB block at the end of the
32-bit space (0xf8000000-0xffffffff). This easily fills up, and needs to be
densely packed. However, dense packing wastes precious TLB1 space, of which
there are only 16 (e500v2) or 64(e5500) entries available.
Change this by using the 1GB space for all device mappings, and allow the kernel
to use the entire upper 1GB for KVA. This also allows us to use sparse device
mappings, freeing up TLB entries.
Test Plan: Boot tested on p5020.
Differential Revision: https://reviews.freebsd.org/D5832
Freescale's QorIQ line includes a new ethernet controller, based on their
Datapath Acceleration Architecture (DPAA). This uses a combination of a Frame
manager, Buffer manager, and Queue manager to improve performance across all
interfaces by being able to pass data directly between hardware acceleration
interfaces.
As part of this import, Freescale's Netcomm Software (ncsw) driver is imported.
This was an attempt by Freescale to create an OS-agnostic sub-driver for
managing the hardware, using shims to interface to the OS-specific APIs. This
work was abandoned, and Freescale's primary work is in the Linux driver (dual
BSD/GPL license). Hence, this was imported directly to sys/contrib, rather than
going through the vendor area. Going forward, FreeBSD-specific changes may be
made to the ncsw code, diverging from the upstream in potentially incompatible
ways. An alternative could be to import the Linux driver itself, using the
linuxKPI layer, as that would maintain parity with the vendor-maintained driver.
However, the Linux driver has not been evaluated for reliability yet, and may
have issues with the import, whereas the ncsw-based driver in this commit was
completed by Semihalf 4 years ago, and is very stable.
Other SoC modules based on DPAA, which could be added in the future:
* Security and Encryption engine (SEC4.x, SEC5.x)
* RAID engine
Additional work to be done:
* Implement polling mode
* Test vlan support
* Add support for the Pattern Matching Engine, which can do regular expression
matching on packets.
This driver has been tested on the P5020 QorIQ SoC. Others listed in the
dtsec(4) manual page are expected to work as the same DPAA engine is included in
all.
Obtained from: Semihalf
Relnotes: Yes
Sponsored by: Alex Perez/Inertial Computing
Summary:
Some drivers need special memory requirements. X86 solves this with a
pmap_change_attr() API, which DRM uses for changing the mapping of the GART and
other memory regions. Implement the same function for PowerPC. AIM currently
does not need this, but will in the future for DRM, so a default is added for
that, for business as usual. Book-E has some drivers coming down that do
require non-default memory coherency. In this case, the Datapath Acceleration
Architecture (DPAA) based ethernet controller has 2 regions for the buffer
portals: cache-inhibited, and cache-enabled. By default, device memory is
cache-inhibited. If the cache-enabled memory regions are mapped
cache-inhibited, an alignment exception is thrown on access.
Test Plan:
Tested with a new driver to be added after this (DPAA dTSEC ethernet driver).
No alignment exceptions thrown, driver works as expected with this.
Reviewed By: nwhitehorn
Sponsored by: Alex Perez/Inertial Computing
Differential Revision: https://reviews.freebsd.org/D5471
ucontext_t available. Our code even has XXX comment about this.
Add a bit of compliance by moving struct __ucontext definition into
sys/_ucontext.h and including it into signal.h and sys/ucontext.h.
Several machine/ucontext.h headers were changed to use namespace-safe
types (like uint64_t->__uint64_t) to not depend on sys/types.h.
struct __stack_t from sys/signal.h is made always visible in private
namespace to satisfy sys/_ucontext.h requirements.
Apparently mips _types.h pollutes global namespace with f_register_t
type definition. This commit does not try to fix the issue.
PR: 207079
Reported and tested by: Ting-Wei Lan <lantw44@gmail.com>
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Summary:
The revised Book-E spec, adding the specification for the MMUv2 and e6500,
includes a hardware PTE layout for indirect page tables. In order to support
this in the future, migrate the PTE format to match the MMUv2 hardware PTE
format.
Test Plan: Boot tested on a P5020 board. Booted to multiuser mode.
Differential Revision: https://reviews.freebsd.org/D5224
The PT_{GET,SET}FPREGS requests use 'struct fpreg' and the NT_FPREGSET
core note stores a copy of 'struct fpreg'. As with x86 and the floating
point state there compared to the extended state in XSAVE, struct fpreg
on powerpc now only holds the 'base' FP state, and setting it via
PT_SETFPREGS leaves the extended vector state in a thread unchanged.
Reviewed by: jhibbits
Differential Revision: https://reviews.freebsd.org/D5004
VM_MAX_KERNEL_ADDERESS is the maximum KVA address. 0xf8000000 is the start of
device mapping space. Since several conditional checks use '<=' against
VM_MAX_KERNEL_ADDRESS, bad things could feasibly happen.
Newer Book-E cores (e500mc, e5500, e6500) do not support the WE bit in the MSR,
and instead delegate CPU idling to the SoC.
Perhaps in the future the QORIQ_DPAA option for the mpc85xx platform will become
a subclass, which will eliminate most of the #ifdef's.
Summary:
With some additional changes for AIM, that could also support much
larger physmem sizes. Given that 32-bit AIM is more or less obsolete, though,
it's not worth it at this time.
Differential Revision: https://reviews.freebsd.org/D4345
into a new function that other platforms can share.
This creates a new ofw_reg_to_paddr() function (in a new ofw_subr.c file)
that contains most of the existing ppc implementation, mostly unchanged.
The ppc code now calls the new MI code from the MD code, then creates a
ppc-specific bus_space mapping from the results. The new arm implementation
does the same in an arm-specific way.
This also moves the declaration of OF_decode_addr() from ofw_machdep.h to
openfirm.h, except on sparc64 which uses a different function signature.
This will help all FDT platforms to set up early console access using
OF_decode_addr().
e500mc, e5500, and e6500 all use the normal FPU, with the same behavior as AIM
hardware. e6500 also supports Altivec, so, although we don't yet have e6500
hardware to test on, add these IVORs as well. Theoretically, since it boots the
same as a e5500, it should work, single-threaded, single-core, with full altivec
support as of this commit.
With this commit, and some other patches to be committed shortly FreeBSD now
boots on the P5020, single-core, all the way to user space, and should boot just
fine on e500mc.
Relnotes: Yes (e500mc, e5500 support)
Sponsored by: Alex Perez/Inertial Computing
lwsync instruction, which does not provide Store/Load barrier. Fix
this by using "full" sync barrier for mb().
atomic_store_rel() does not need full barrier, change mb() call there
to the lwsync instruction if not hitting the known CPU erratas
(i.e. on 32bit). Provide powerpc_lwsync() helper to isolate the
lwsync/sync compile time selection, and use it in atomic_store_rel()
and several other places which duplicate the code.
Noted by: alc
Reviewed and tested by: nwhitehorn
Sponsored by: The FreeBSD Foundation
new, simplified, ELF ABI that avoids some of the stranger aspects of the
existing 64-bit PowerPC ABI (function descriptors, in particular). Actually
generating such executables requires a new version of binutils and a newer
compiler (either GCC or clang) than GCC 4.2.1.
As part of this, clean up tlb1_init(), since bootinfo is always NULL here just
eliminate the loop altogether.
Also, fix a bug in mmu_booke_mapdev_attr() where it's possible to map a larger
immediately following a smaller page, causing the mappings to overlap. Instead,
break up the new mapping into smaller chunks. The downside to this is that it
uses more precious TLB1 entries, which, on smaller chips (e500v2) it could cause
problems with TLB1 being out of space (e500v2 only has 16 TLB1 entries).
Obtained from: Semihalf (partial)
Sponsored by: Alex Perez/Inertial Computing
initial thread stack is not adjusted by the tunable, the stack is
allocated too early to get access to the kernel environment. See
TD0_KSTACK_PAGES for the thread0 stack sizing on i386.
The tunable was tested on x86 only. From the visual inspection, it
seems that it might work on arm and powerpc. The arm
USPACE_SVC_STACK_TOP and powerpc USPACE macros seems to be already
incorrect for the threads with non-default kstack size. I only
changed the macros to use variable instead of constant, since I cannot
test.
On arm64, mips and sparc64, some static data structures are sized by
KSTACK_PAGES, so the tunable is disabled.
Sponsored by: The FreeBSD Foundation
MFC after: 2 week
vm_offset_t pmap_quick_enter_page(vm_page_t m)
void pmap_quick_remove_page(vm_offset_t kva)
These will create and destroy a temporary, CPU-local KVA mapping of a specified page.
Guarantees:
--Will not sleep and will not fail.
--Safe to call under a non-sleepable lock or from an ithread
Restrictions:
--Not guaranteed to be safe to call from an interrupt filter or under a spin mutex on all platforms
--Current implementation does not guarantee more than one page of mapping space across all platforms. MI code should not make nested calls to pmap_quick_enter_page.
--MI code should not perform locking while holding onto a mapping created by pmap_quick_enter_page
The idea is to use this in busdma, for bounce buffer copies as well as virtually-indexed cache maintenance on mips and arm.
NOTE: the non-i386, non-amd64 implementations of these functions still need review and testing.
Reviewed by: kib
Approved by: kib (mentor)
Differential Revision: http://reviews.freebsd.org/D3013
provide a semantic defined by the C11 fences with corresponding
memory_order.
atomic_thread_fence_acq() gives r | r, w, where r and w are read and
write accesses, and | denotes the fence itself.
atomic_thread_fence_rel() is r, w | w.
atomic_thread_fence_acq_rel() is the combination of the acquire and
release in single operation. Note that reads after the acq+rel fence
could be made visible before writes preceeding the fence.
atomic_thread_fence_seq_cst() orders all accesses before/after the
fence, and the fence itself is globally ordered against other
sequentially consistent atomic operations.
Reviewed by: alc
Discussed with: bde
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
On Book-E, physical addresses are actually 36-bits, not 32-bits. This is
currently worked around by ignoring the top bits. However, in some cases, the
boot loader configures CCSR to something above the 32-bit mark. This is stage 1
in updating the pmap to handle 36-bit physaddr.
This will print out the Memory Subsystem Status Register on MPC745x (G4+ class),
and the Machine Check Status Register on Book-E class CPUs, to aid in debugging
machine checks. Other relevant registers, for other CPUs, can be added in the
future.
This supports e500v1, e500v2, and e500mc. Tested only on e500v2, but the
performance counters are identical across all, with e500mc having some
additional events.
Relnotes: Yes
and export them to userland.
- Define __HAVE_REG32 on platforms that define a reg32 structure and check
for this in <sys/procfs.h> to control when to export prstatus32, etc.
- Add prstatus32_t and prpsinfo32_t typedefs for the 32-bit structures.
libbfd looks for these types, and having them fixes 'gcore' in gdb of a
32-bit process on a 64-bit platform.
- Use the structure definitions from <sys/procfs.h> in gcore's elf32 core
dump code instead of duplicating the definitions.
Differential Revision: https://reviews.freebsd.org/D2142
Reviewed by: kib, nathanw (powerpc bits)
MFC after: 1 week
have the same meaning and occupy the same memory address in the trapframe
courtesy of union. Avoid some pointless #ifdef by spelling them both 'DAR'
in the trapframe.
this change is to improve concurrency:
- Drop global state stored in the shadow overflow page table (and all other
global state)
- Remove all global locks
- Use per-PTE lock bits to allow parallel page insertion
- Reconstruct state when requested for evicted PTEs instead of buffering
it during overflow
This drops total wall time for make buildworld on a 32-thread POWER8 system
by a factor of two and system time by a factor of three, providing performance
20% better than similarly clocked Core i7 Xeons per-core. Performance on
smaller SMP systems, where PMAP lock contention was not as much of an issue,
is nearly unchanged.
Tested on: POWER8, POWER5+, G5 UP, G5 SMP (64-bit and 32-bit kernels)
Merged from: user/nwhitehorn/ppc64-pmap-rework
Looked over by: jhibbits, andreast
MFC after: 3 months
Relnotes: yes
Sponsored by: FreeBSD Foundation
and POWER8. This instruction set unifies the 32 64-bit scalar floating
point registers with the 32 128-bit vector registers into a single bank
of 64 128-bit registers. Kernel support mostly amounts to saving and
restoring the wider version of the floating point registers and making
sure that both scalar FP and vector registers are enabled once a VSX
instruction is executed. get_mcontext() and friends currently cannot
see the high bits, which will require a little more work.
As the system compiler (GCC 4.2) does not support VSX, making use of this
from userland requires either newer GCC or clang.
Relnotes: yes
Sponsored by: FreeBSD Foundation
instructions to call through pointers instead. In general, these are set
implicitly through relocation processing. One has to be set explicitly in
machdep.c, however, to fit one handler in the tiny (8 instruction) space
available.
Reviewed by: andreast
Differential revision: D1554
Tested on: UP and SMP G5, Cell, POWER5+
This allows executing static clang built with -O0.
The value is configurable by a sysctl, so if one needs to clamp it down, they
still can.
Discussed with: nwhitehorn,emaste
code in sys/kern/kern_dump.c. Most dumpsys() implementations are nearly
identical and simply redefine a number of constants and helper subroutines;
a generic implementation will make it easier to implement features around
kernel core dumps. This change does not alter any minidump code and should
have no functional impact.
PR: 193873
Differential Revision: https://reviews.freebsd.org/D904
Submitted by: Conrad Meyer <conrad.meyer@isilon.com>
Reviewed by: jhibbits (earlier version)
Sponsored by: EMC / Isilon Storage Division
Summary:
Revert the initial FBT-with-KDB changes for trap_subr*.S, and instead use the
db_trap filter function to handle dtrace trap filtering. With this, the MMU is
enabled by the support code, simplifying the codepath altogether.
Test Plan: Tested on my G4 PowerBook
Reviewers: #powerpc, nwhitehorn
Reviewed By: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D1207
MFC after: 3 weeks
It is automatically set when -fPIC is passed to the compiler.
Reviewed by: dim, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D1179
physical address of the page to direct map address, in case
SFBUF_OPTIONAL_DIRECT_MAP returns true. The case of PowerPC AIM
64bit, where the page physical address is identical to the direct map
address, is accidental.
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
The MD allocators were very common, however there were some minor
differencies. These differencies were all consolidated in the MI allocator,
under ifdefs. The defines from machine/vmparam.h turn on features required
for a particular machine. For details look in the comment in sys/sf_buf.h.
As result no MD code left in sys/*/*/vm_machdep.c. Some arches still have
machine/sf_buf.h, which is usually quite small.
Tested by: glebius (i386), tuexen (arm32), kevlo (arm32)
Reviewed by: kib
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
This appears to fix a strange condition with X on 32-bit PowerBooks I observed,
caused by one of these bits getting set in the mcontext, but not set in the
thread, which may be a symptom of another problem, more difficult to diagnose.
Since these bits aren't exported anyway, this change makes it more explicit that
the bits aren't MSR-related in SRR1.
MFC after: 3 weeks
The NetBSD Foundation states "Third parties are encouraged to change the
license on any files which have a 4-clause license contributed to the
NetBSD Foundation to a 2-clause license."
This change removes clauses 3 and 4 from copyright / license blocks that
list The NetBSD Foundation as the only copyright holder.
Sponsored by: The FreeBSD Foundation
This also fixes asserts on removal of the module for the mpc74xx.
The PowerPC 970 processors have two different types of events: direct events
and indirect events. Thus far only direct events are supported. I included
some documentation in the driver on how indirect events work, but support is
for the future.
MFC after: 1 month
obsolete. This involves the following pieces:
- Remove it entirely on PowerPC, where it is not used by MD code either
- Remove all references to machine/fdt.h in non-architecture-specific code
(aside from uart_cpu_fdt.c, shared by ARM and MIPS, and so is somewhat
non-arch-specific).
- Fix code relying on header pollution from machine/fdt.h includes
- Legacy fdtbus.c (still used on x86 FDT systems) now passes resource
requests to its parent (nexus). This allows x86 FDT devices to allocate
both memory and IO requests and removes the last notionally MI use of
fdtbus_bs_tag.
- On those architectures that retain a machine/fdt.h, unused bits like
FDT_MAP_IRQ and FDT_INTR_MAX have been removed.
regions which represent the total amount of memory. The size of these regions
is not the physical size of the chip but it is a logical one and it is given
by the OpenFirmware, it is selectable at boot time and varies between 16MB and
256MB in my case. There is an 'automatic' option which would select the size as
64MB in case you have around 16GB of RAM.
To make sure we can allocate RAM with the automatic option bump this value
of PHYS_AVAIL_SZ to 256.
Open Firmware-centric:
- Keep the static list of regions in platform.c instead of ofw_machdep.c
- Move various merging and sorting operations to platform.c as well
- Move apple_hacks code out of ofw_machdep.c and into platform_powermac.c,
where it belongs
- Move CHRP-specific dynamic-reconfiguration memory parsing into
platform_chrp.c instead of pretending it is shared code
It turned out that on pSeries machines the call into OF modified the trap
vectors and this made further behaviour unpredictable.
With this commit I'm now able to boot multi user on a network booted
environment on my IntelliStation 285. This is a POWER5+ machine.
Discussed with: nwhitehorn
MFC after: 1 week
allows FPU emulation on AIM as well as providing support for the mfpvr
and lwsync instructions from userland on e500 cores. lwsync, in particular,
is required for many C++ programs to work correctly.
MFC after: 1 week
the actual FPU is enabled, while PCB_FPREGS indicates that the FPU state
structure in the PCB is valid. This separation reflects the situation on
FPU-less systems in which the FP state is used by the emulator but we don't
actually want to try to turn on the non-existant FPU.
Use this flag to save and restore FP regs properly on both AIM and Book-E.
As a side effect, this sets up hard-FP and Altivec on Book-E CPUs with such
abilities except for a trap handler to call enable_fpu()/enable_altivec().
I found a stack overflow when a coredump was taken onto a ZFS volume with
heavy network activity. 2 DSI traps, plus one DECR trap, along with several
function calls in the stack, overflowed the 4 pages. 8 page stack fixes this.
Discussed with: nwhitehorn
MFC after: 1 week
PCPU fields for curthread, by doing the same to Book-E. This closes
some potential races switching between CPUs. As a side effect, it turns out
the AIM and Book-E swtch.S implementations were the same to within a few
registers, so move that to powerpc/powerpc.
MFC after: 3 months
words, every architecture is now auto-sizing the kmem arena. This revision
changes kmeminit() so that the definition of VM_KMEM_SIZE_SCALE becomes
mandatory and the definition of VM_KMEM_SIZE becomes optional.
Replace or eliminate all existing definitions of VM_KMEM_SIZE. With
auto-sizing enabled, VM_KMEM_SIZE effectively became an alternate spelling
for VM_KMEM_SIZE_MIN on most architectures. Use VM_KMEM_SIZE_MIN for
clarity.
Change kmeminit() so that the effect of defining VM_KMEM_SIZE is similar to
that of setting the tunable vm.kmem_size. Whereas the macros
VM_KMEM_SIZE_{MAX,MIN,SCALE} have had the same effect as the tunables
vm.kmem_size_{max,min,scale}, the effects of VM_KMEM_SIZE and vm.kmem_size
have been distinct. In particular, whereas VM_KMEM_SIZE was overridden by
VM_KMEM_SIZE_{MAX,MIN,SCALE} and vm.kmem_size_{max,min,scale}, vm.kmem_size
was not. Remedy this inconsistency. Now, VM_KMEM_SIZE can be used to set
the size of the kmem arena at compile-time without that value being
overridden by auto-sizing.
Update the nearby comments to reflect the kmem submap being replaced by the
kmem arena. Stop duplicating the auto-sizing formula in every machine-
dependent vmparam.h and place it in kmeminit() where auto-sizing takes
place.
Reviewed by: kib (an earlier version)
Sponsored by: EMC / Isilon Storage Division
- Remove explicit requirement that the SOC registers be found except as an
optimization (although the MPC85XX LAW drivers still require they be found
externally, which should change).
- Remove magic CCSRBAR_VA value.
- Allow bus_machdep.c's early-boot code to handle non 1:1 mappings and
systems not in real-mode or global 1:1 maps in early boot.
- Allow pmap_mapdev() on Book-E to reissue previous addresses if the
area is already mapped. Additionally have it check all mappings, not
just the CCSR area.
This allows the console on e500 systems to actually work on systems where
the boot loader was not kind enough to set up a 1:1 mapping before starting
the kernel.
Use it universally. Book-E traps may also need revisiting due to the
introduction of fixed-offset traps and the deprecation of IVORs in POWER
ISA 2.06, but that's very much an issue for another day.
platform modules. Whether to call this function or not is highly machine
dependent: on some systems, it is required, while on others it breaks
everything. Platform modules are in a better position to figure this
out. This is required for POWER hypervisor SCSI to work correctly. There
are no functional changes on Powermac systems.
Approved by: re (kib)
making sure they are all misaligned at +8 bytes. This fixes clang builds
of powerpc64 kernels (aside from a required increase in KSTACK_PAGES which
will come later).
This commit from FreeBSD/powerpc64 with a clang-built kernel.
MFC after: 2 weeks
Issues were noted by Bruce Evans and are present on all architectures.
On i386, a counter fetch should use atomic read of 64bit value,
otherwise carry from the increment on other CPU could be lost for the
given fetch, making error of 2^32. If 64bit read (cmpxchg8b) is not
available on the machine, it cannot be SMP and it is enough to disable
preemption around read to avoid the split read.
On x86 the counter increment is not atomic on purpose, which makes it
possible for the store of the incremented result to override just
zeroed per-cpu slot. The effect would be a counter going off by
arbitrary value after zeroing. Perform the counter zeroing on the
same processor which does the increments, making the operations
mutually exclusive. On i386, same as for the fetching, if the
cmpxchg8b is not available, machine is not SMP and we disable
preemption for zeroing.
PowerPC64 is treated the same as amd64.
For other architectures, the changes made to allow the compilation to
succeed, without fixing the issues with zeroing or fetching. It
should be possible to handle them by using the 64bit loads and stores
atomic WRT preemption (assuming the architectures also converted from
using critical sections to proper asm). If architecture does not
provide the facility, using global (spin) mutex would be non-optimal
but working solution.
Noted by: bde
Sponsored by: The FreeBSD Foundation
order to match the MAXCPU concept. The change should also be useful
for consolidation and consistency.
Sponsored by: EMC / Isilon storage division
Obtained from: jeff
Reviewed by: alc
as CTASSERT in MI pcpu.h, stop including all possible mutually exclusive
PCPU_MD_FIELDS fields into LINT kernels, due to brekaing
aforementioned CTASSERT.
Introduce counter(9) API, that implements fast and raceless counters,
provided (but not limited to) for gathering of statistical data.
See http://lists.freebsd.org/pipermail/freebsd-arch/2013-April/014204.html
for more details.
In collaboration with: kib
Reviewed by: luigi
Tested by: ae, ray
Sponsored by: Nginx, Inc.
This is required for ARM EABI. Section 7.1.1 of the Procedure Call for the
ARM Architecture (AAPCS) defines wchar_t as either an unsigned int or an
unsigned short with the former preferred.
Because of this requirement we need to move the definition of __wchar_t to
a machine dependent header. It also cleans up the macros defining the limits
of wchar_t by defining __WCHAR_MIN and __WCHAR_MAX in the same machine
dependent header then using them to define WCHAR_MIN and WCHAR_MAX
respectively.
Discussed with: bde
usermode, using shared page. The structures and functions have vdso
prefix, to indicate the intended location of the code in some future.
The versioned per-algorithm data is exported in the format of struct
vdso_timehands, which mostly repeats the content of in-kernel struct
timehands. Usermode reading of the structure can be lockless.
Compatibility export for 32bit processes on 64bit host is also
provided. Kernel also provides usermode with indication about
currently used timecounter, so that libc can fall back to syscall if
configured timecounter is unknown to usermode code.
The shared data updates are initiated both from the tc_windup(), where
a fast task is queued to do the update, and from sysctl handlers which
change timecounter. A manual override switch
kern.timecounter.fast_gettime allows to turn off the mechanism.
Only x86 architectures export the real algorithm data, and there, only
for tsc timecounter. HPET counters page could be exported as well, but
I prefer to not further glue the kernel and libc ABI there until
proper vdso-based solution is developed.
Minimal stubs neccessary for non-x86 architectures to still compile
are provided.
Discussed with: bde
Reviewed by: jhb
Tested by: flo
MFC after: 1 month