VM_MAX_KERNEL_ADDERESS is the maximum KVA address. 0xf8000000 is the start of
device mapping space. Since several conditional checks use '<=' against
VM_MAX_KERNEL_ADDRESS, bad things could feasibly happen.
On powerpc64, pointers are 64 bits, so casting from uint32_t changes the integer
width.
The alternative was to use register_t, but I didn't see register_t used as
argument type for any other functions, though didn't look too closely. u_long
was an acceptable alternative. On 64-bit it's 64 bits, on 32-bit it's 32 bits.
powerpc_init() initializes the mmu. Since this may clear pages via
pmap_zero_page(), set the cacheline size before calling into it, so
pmap_zero_page() has the right cacheline size. This isn't completely
necessary now, but will be when 64-bit book-e is completed.
For rs6000, most memory insns and addi/addis do not allow GPR0 for RA
(they use literal zero there instead). So use a 'b' constraint to make
sure to have a base register other than GPR0.
GCC-4.7 and up handles this with allocating r9 instead of r0.
providing compiled-in static environment data that is used instead of any
data passed in from a boot loader.
Previously 'env' worked only on i386 and arm xscale systems, because it
required the MD startup code to examine the global envmode variable and
decide whether to use static_env or an environment obtained from the boot
loader, and set the global kern_envp accordingly. Most startup code wasn't
doing so. Making things even more complex, some mips startup code uses an
alternate scheme that involves calling init_static_kenv() to pass an empty
buffer and its size, then uses a series of kern_setenv() calls to populate
that buffer.
Now all MD startup code calls init_static_kenv(), and that routine provides
a single point where envmode is checked and the decision is made whether to
use the compiled-in static_kenv or the values provided by the MD code.
The routine also continues to serve its original purpose for mips; if a
non-zero buffer size is passed the routine installs the empty buffer ready
to accept kern_setenv() values. Now if the size is zero, the provided buffer
full of existing env data is installed. A NULL pointer can be passed if the
boot loader provides no env data; this allows the static env to be installed
if envmode is set to do so.
Most of the work here is a near-mechanical change to call the init function
instead of directly setting kern_envp. A notable exception is in xen/pv.c;
that code was originally installing a buffer full of preformatted env data
along with its non-zero size (like mips code does), which would have allowed
kern_setenv() calls to wipe out the preformatted data. Now it passes a zero
for the size so that the buffer of data it installs is treated as
non-writeable.
LBC block size can only be up to 4GB. The existing code already clamps it, but
mixes unsigned long and uint32_t. This works on 32-bit targets, but not 64-bit,
so isn't completely correct. This fixes the type confusion.
Newer Book-E cores (e500mc, e5500, e6500) do not support the WE bit in the MSR,
and instead delegate CPU idling to the SoC.
Perhaps in the future the QORIQ_DPAA option for the mpc85xx platform will become
a subclass, which will eliminate most of the #ifdef's.
This includes the following changes:
* SMP kickoff for QorIQ (tested on P5020)
* Errata fixes for some silicon revisions
* Enables L2 (and L3 if available) caches
Obtained from: Semihalf
Sponsored by: Alex Perez/Inertial Computing
There's no need for it to be in asm. Also, by writing in C, and marking it
static in pmap.c, it saves a branch to the function itself, as it's only used in
one location. The generated asm is virtually identical to the handwritten code.
Summary:
With some additional changes for AIM, that could also support much
larger physmem sizes. Given that 32-bit AIM is more or less obsolete, though,
it's not worth it at this time.
Differential Revision: https://reviews.freebsd.org/D4345
into a new function that other platforms can share.
This creates a new ofw_reg_to_paddr() function (in a new ofw_subr.c file)
that contains most of the existing ppc implementation, mostly unchanged.
The ppc code now calls the new MI code from the MD code, then creates a
ppc-specific bus_space mapping from the results. The new arm implementation
does the same in an arm-specific way.
This also moves the declaration of OF_decode_addr() from ofw_machdep.h to
openfirm.h, except on sparc64 which uses a different function signature.
This will help all FDT platforms to set up early console access using
OF_decode_addr().
e500mc, e5500, and e6500 all use the normal FPU, with the same behavior as AIM
hardware. e6500 also supports Altivec, so, although we don't yet have e6500
hardware to test on, add these IVORs as well. Theoretically, since it boots the
same as a e5500, it should work, single-threaded, single-core, with full altivec
support as of this commit.
With this commit, and some other patches to be committed shortly FreeBSD now
boots on the P5020, single-core, all the way to user space, and should boot just
fine on e500mc.
Relnotes: Yes (e500mc, e5500 support)
Sponsored by: Alex Perez/Inertial Computing
sysent.
sv_prepsyscall is unused.
sv_sigsize and sv_sigtbl translate signal number from the FreeBSD
namespace into the ABI domain. It is only utilized on i386 for iBCS2
binaries. The issue with this approach is that signals for iBCS2 were
delivered with the FreeBSD signal frame layout, which does not follow
iBCS2. The same note is true for any other potential user if
sv_sigtbl. In other words, if ABI needs signal number translation, it
really needs custom sv_sendsig method instead.
Sponsored by: The FreeBSD Foundation
lwsync instruction, which does not provide Store/Load barrier. Fix
this by using "full" sync barrier for mb().
atomic_store_rel() does not need full barrier, change mb() call there
to the lwsync instruction if not hitting the known CPU erratas
(i.e. on 32bit). Provide powerpc_lwsync() helper to isolate the
lwsync/sync compile time selection, and use it in atomic_store_rel()
and several other places which duplicate the code.
Noted by: alc
Reviewed and tested by: nwhitehorn
Sponsored by: The FreeBSD Foundation
new, simplified, ELF ABI that avoids some of the stranger aspects of the
existing 64-bit PowerPC ABI (function descriptors, in particular). Actually
generating such executables requires a new version of binutils and a newer
compiler (either GCC or clang) than GCC 4.2.1.
Summary:
* Take advantage of NEW_PCIB to remove a lot of setup code.
* Fix some bugs related to multiple PCI bridges.
There's still room for more cleanup, and still some bugs leftover, but this
cleans up a lot.
Test Plan: Tested on P5020 board with IDT PCIe switch.
Differential Revision: https://reviews.freebsd.org/D4127
created for bus_dma_tag_t tag, bounce pages should be allocated
only if needed.
Before the fix, they were allocated always if BUS_DMA_COULD_BOUNCE flag
was set but BUS_DMA_MIN_ALLOC_COMP not. As bounce pages are never freed,
it could cause memory exhaustion when a lot of such tags together with
their maps were created.
Note that there could be more maps in one tag by current design.
However BUS_DMA_MIN_ALLOC_COMP flag is tag's flag. It's set after
bounce pages are allocated. Thus, they are allocated only for first
tag's map which needs them.
Approved by: kib (mentor)
sizeof(unsigned long) < sizeof(vm_paddr_t) on Book-E, which uses 36-bit
addressing. With this, a CCSR with a physical address above 4GB successfully
maps.
Sponsored by: Alex Perez/Inertial Computing
QorIQ SoCs (e5500 core, P5 family) have 2 BARs for local access windows, while
MPC85XX, and P1/P2 families use only a single BAR register.
This also adds the QORIQ_DPAA option, mutually exclusive to MPC85XX, to handle
this difference.
Obtained from: Semihalf
Sponsored by: Alex Perez/Inertial Computing
OF_getprop() to get encode-int encoded values from the OF tree. This is
a no-op at present, since all existing PowerPC ports are big-endian, but
it is a correctness improvement and will be required if we have a
little-endian kernel at some future point.
Where it is totally impossible for the code ever to be used on a
little-endian system (much of powerpc/powermac, for instance), I have not
necessarily made the appropriate changes.
MFC after: 1 month
On the mpc85xx SoC family, writes to any part of a word in the CCSR affect the
full word. This prevents single-byte writes from taking the desired effect.
Code copied directly from ARM.
This will enable the elimination of a workaround in the USB driver that
artifically allocates buffers twice as big as they need to be (which
actually saves memory for very small buffers on the buggy platforms).
When deciding how to allocate a dma buffer, armv4, armv6, mips, and
x86/iommu all correctly check for the tag alignment <= maxsize as enabling
simple uma/malloc based allocation. Powerpc, sparc64, x86/bounce, and
arm64/bounce were all checking for alignment < maxsize; on those platforms
when alignment was equal to the max size it would fall back to page-based
allocators even for very small buffers.
This change makes all platforms use the <= check. It should be noted that
on all platforms other than arm[v6] and mips, this check is relying on
undocumented behavior in malloc(9) that if you allocate a block of a given
size it will be aligned to the next larger power-of-2 boundary. There is
nothing in the malloc(9) man page that makes that explicit promise (but the
busdma code has been relying on this behavior all along so I guess it works).
Arm and mips code uses the allocator in kern/subr_busdma_buffalloc.c, which
does explicitly implement this promise about size and alignment. Other
platforms probably should switch to the aligned allocator.
Make it clearer what each one means in the comments that define them.
IIC_BUSBSY was used in many places to mean two different things, either
"someone else has reserved the bus so you have to wait until they're done"
or "the signal level on the bus was not in the state I expected before/after
issuing some command".
Now IIC_BUSERR is used consistantly to refer to protocol/signaling errors,
and IIC_BUSBSY refers to ownership/reservation of the bus.
linkers no longer raise an error when undefined weak symbols are
found, but relocate as if the symbol value was 0. Note that we do not
repeat the mistake of userspace dynamic linker of making the symbol
lookup prefer non-weak symbol definition over the weak one, if both
are available. In fact, kernel linker uses the first definition
found, and ignores duplicates.
Signature of the elf_lookup() and elf_obj_lookup() functions changed
to split result/error code and the symbol address returned.
Otherwise, it is impossible to return zero address as the symbol
value, to MD relocation code. This explains the mechanical changes in
elf_machdep.c sources.
The powerpc64 R_PPC_JMP_SLOT handler did not checked error from the
lookup() call, the patch leaves the code as is (untested).
Reported by: glebius
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
When the system has more than a single PCI domain, the bus numbers
are not unique, thus they cannot be used for "pci" device numbering.
Change bus numbers to -1 (i.e. to-be-determined automatically)
wherever the code did not care about domains.
Reviewed by: jhb
Obtained from: Semihalf
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3406
running thread.
It is currently implemented only on amd64 and i386; on these
architectures, it is implemented by raising an NMI on the CPU on which
the target thread is currently running. Unlike stack_save_td(), it may
fail, for example if the thread is running in user mode.
This change also modifies the kern.proc.kstack sysctl to use this function,
so that stacks of running threads are shown in the output of "procstat -kk".
This is handy for debugging threads that are stuck in a busy loop.
Reviewed by: bdrewery, jhb, kib
Sponsored by: EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D3256
The only operation which is prevented by the hold is the kernel stack
swapout for the faulted thread, which should be fine to allow.
Remove useless checks for NULL curproc or curproc->p_vmspace from the
trap_pfault() wrappers on x86 and powerpc.
Reviewed by: alc (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
As part of this, clean up tlb1_init(), since bootinfo is always NULL here just
eliminate the loop altogether.
Also, fix a bug in mmu_booke_mapdev_attr() where it's possible to map a larger
immediately following a smaller page, causing the mappings to overlap. Instead,
break up the new mapping into smaller chunks. The downside to this is that it
uses more precious TLB1 entries, which, on smaller chips (e500v2) it could cause
problems with TLB1 being out of space (e500v2 only has 16 TLB1 entries).
Obtained from: Semihalf (partial)
Sponsored by: Alex Perez/Inertial Computing
FDT_DTB_STATIC is defined in opt_platform.h, and fdt_static_dtb is in
fdt_common.h, so include those files.
Sponsored by: Alex Perez/Inertial Computing
Summary:
This is (probably step 1) of enhancing the book-e pmap to support the full
36-bit physical address space on Freescale e500 and e5500 cores.
Thus far it has only been regression tested on one platform. Since I only have
one other Book-E platform (e5500), that needs work beyond this, I haven't yet
tested it on this.
Test Plan: Regression tested on my RouterBoard RB800.
Reviewed By: marcel
Differential Revision: https://reviews.freebsd.org/D3027
Summary:
The RouterBoard uses a predefined partition map which doesn't exist in the fdt.
This change allows overriding the fdt slicer with a custom slicer, and uses this
custom slicer to define the flash map on the RouterBoard RB800.
D3305 converts the mpc85xx platform into a base class, so that systems based on
the mpc85xx platform can add their own overrides. This change builds on D3305,
and creates a RouterBoard (RB800) platform to initialize the slicer override.
Reviewed By: nwhitehorn, imp
Differential Revision: https://reviews.freebsd.org/D3345
Summary:
Some systems are based around mpc85xx, but need special initialization. By
making the mpc85xx platform a base class, these systems can be platform
subclasses, and perform board-specific initialization in addition to the mpc85xx
initialization.
Test Plan:
Tested on my RB800. A platform class was created, and will be committed
separately.
Reviewed By: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D3305
* Since r257190 the kernel must actually be loaded at a 64MB boundary, not 16MB.
* Don't program HID1 register on e500mc or e5500, they don't have this SPR.
* Set proper HID0 defaults for these new architectures.
There is still more work to be done for the various SoCs, and the PMAP code
still needs to be extended to 36-bit paddr, coming soon.
Obtained from: Semihalf
Sponsored by: Alex Perez/Inertial Computing
Rather than special casing on PCIC_BRIDGE || PCIC_PROCESSOR, allow all
HDRTYPE_BRIDGE types.
Obtained from: Semihalf
Sponsored by: Alex Perez/Intertial Computing
initial thread stack is not adjusted by the tunable, the stack is
allocated too early to get access to the kernel environment. See
TD0_KSTACK_PAGES for the thread0 stack sizing on i386.
The tunable was tested on x86 only. From the visual inspection, it
seems that it might work on arm and powerpc. The arm
USPACE_SVC_STACK_TOP and powerpc USPACE macros seems to be already
incorrect for the threads with non-default kstack size. I only
changed the macros to use variable instead of constant, since I cannot
test.
On arm64, mips and sparc64, some static data structures are sized by
KSTACK_PAGES, so the tunable is disabled.
Sponsored by: The FreeBSD Foundation
MFC after: 2 week
vm_offset_t pmap_quick_enter_page(vm_page_t m)
void pmap_quick_remove_page(vm_offset_t kva)
These will create and destroy a temporary, CPU-local KVA mapping of a specified page.
Guarantees:
--Will not sleep and will not fail.
--Safe to call under a non-sleepable lock or from an ithread
Restrictions:
--Not guaranteed to be safe to call from an interrupt filter or under a spin mutex on all platforms
--Current implementation does not guarantee more than one page of mapping space across all platforms. MI code should not make nested calls to pmap_quick_enter_page.
--MI code should not perform locking while holding onto a mapping created by pmap_quick_enter_page
The idea is to use this in busdma, for bounce buffer copies as well as virtually-indexed cache maintenance on mips and arm.
NOTE: the non-i386, non-amd64 implementations of these functions still need review and testing.
Reviewed by: kib
Approved by: kib (mentor)
Differential Revision: http://reviews.freebsd.org/D3013
in lockstat.ko. This means that lockstat probes now have typed arguments and
will utilize SDT probe hot-patching support when it arrives.
Reviewed by: gnn
Differential Revision: https://reviews.freebsd.org/D2993
If KSTACK_PAGES was changed to anything alse than the default,
the value from param.h was taken instead in some places and
the value from KENRCONF in some others. This resulted in
inconsistency which caused corruption in SMP envorinment.
Ensure all places where KSTACK_PAGES are used the opt_kstack_pages.h
is included.
The file opt_kstack_pages.h could not be included in param.h
because was breaking the toolchain compilation.
Reviewed by: kib
Obtained from: Semihalf
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3094
It appears that the linker will not handle 64-bit relocations at addresses that
are not aligned to 8-byte boundaries. Prior to this change the line:
.llong generictrap
was aligned to a 4-byte address, and the linker replaced that with an 8-byte
0x0. Aligning that address to 8 bytes caused the linker to generate the proper
relocation. As a follow-through, the dblow from trap_subr33.S used the code
sequence 'lwz %r1, TRAP_GENTRAP(0)', so this reproduces the analogue of that for
64-bit.
provide a semantic defined by the C11 fences with corresponding
memory_order.
atomic_thread_fence_acq() gives r | r, w, where r and w are read and
write accesses, and | denotes the fence itself.
atomic_thread_fence_rel() is r, w | w.
atomic_thread_fence_acq_rel() is the combination of the acquire and
release in single operation. Note that reads after the acq+rel fence
could be made visible before writes preceeding the fence.
atomic_thread_fence_seq_cst() orders all accesses before/after the
fence, and the fence itself is globally ordered against other
sequentially consistent atomic operations.
Reviewed by: alc
Discussed with: bde
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Summary:
Both booke and AIM interrupt.c files contain nearly identical code. This merges
the two files, to reduce duplication.
Reviewers: #powerpc, marcel
Reviewed By: marcel
Subscribers: imp
Differential Revision: https://reviews.freebsd.org/D2991
On Book-E, physical addresses are actually 36-bits, not 32-bits. This is
currently worked around by ignoring the top bits. However, in some cases, the
boot loader configures CCSR to something above the 32-bit mark. This is stage 1
in updating the pmap to handle 36-bit physaddr.
This will print out the Memory Subsystem Status Register on MPC745x (G4+ class),
and the Machine Check Status Register on Book-E class CPUs, to aid in debugging
machine checks. Other relevant registers, for other CPUs, can be added in the
future.
This will require for AArch64 as we dont have modules yet.
Sponsored by: HEIF5
Sponsored by: ARM Ltd.
Differential Revision: https://reviews.freebsd.org/D1997
Thread credentials are maintained as follows: each thread has a pointer to
creds and a reference on them. The pointer is compared with proc's creds on
userspace<->kernel boundary and updated if needed.
This patch introduces a counter which can be compared instead, so that more
structures can use this scheme without adding more comparisons on the boundary.
Native ABI do not need signal conversion, only emulators may want this. Usually
emulators implements its own sv_sendsig method. For now only ibcs2 emulator does
not have own sv_sendsig implementation and depends on native sendsig() method.
So, remove any extra attempts to convert signal numbers from native sendsig()
methods except from i386 where ibsc2 is living.