On arm64 we currently use a non-posted write for device memory, however
we should move to use posted writes. This is expected to work on most
hardware, however we will need to support a non-posted option for some
broken hardware.
Reviewed by: imp, manu, bcr (manpage)
Differential Revision: https://reviews.freebsd.org/D29722
On some systems (e.g. Lenovo ThinkPad X240, Apple MacBookPro12,1)
the SMBIOS entry point is not found in the <0xFFFFF space.
Follow the SMBIOS spec and use the EFI Configuration Table for
locating the entry point on EFI systems.
Reviewed by: rpokala, dab
MFC after: 1 week
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D29276
The remote protocol allows for implementations to report more specific
reasons for the break in execution back to the client [1]. This is
entirely optional, so it is only implemented for amd64, arm64, and i386
at the moment.
[1] https://sourceware.org/gdb/current/onlinedocs/gdb/Stop-Reply-Packets.html
Reviewed by: jhb
MFC after: 3 weeks
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
NetApp PR: 51
Differential Revision: https://reviews.freebsd.org/D29174
Add wrappers around the debug_monitor interface, to be consumed by MI
kernel debugger code. Update dbg_setup_watchpoint() and
dbg_remove_watchpoint() to return specific error codes, not just -1.
Reviewed by: jhb, kib, markj
MFC after: 3 weeks
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D29155
This change serves two purposes.
First, we take advantage of the compiler provided endian definitions to
eliminate some long-standing duplication between the different versions
of this header. __BYTE_ORDER__ has been defined since GCC 4.6, so there
is no need to rely on platform defaults or e.g. __MIPSEB__ to determine
endianness. A new common sub-header is added, but there should be no
changes to the visibility of these definitions.
Second, this eliminates the hand-rolled __bswapNN() routines, again in
favor of the compiler builtins. This was done already for x86 in
e6ff6154d2. The benefit here is that we no longer have to maintain our
own implementations on each arch, and can instead rely on the compiler
to emit appropriate instructions or libcalls, as available. This should
result in equivalent or better code generation. Notably 32-bit arm will
start using the `rev` instruction for these routines, which is available
on armv6+.
PR: 236920
Reviewed by: arichardson, imp
Tested by: bdragon (BE powerpc)
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D29012
Make it easy to define interceptors for new sanitizer runtimes, rather
than assuming KCSAN. Lay a bit of groundwork for KASAN and KMSAN.
When a sanitizer is compiled in, atomic(9) and bus_space(9) definitions
in atomic_san.h are used by default instead of the inline
implementations in the platform's atomic.h. These definitions are
implemented in the sanitizer runtime, which includes
machine/{atomic,bus}.h with SAN_RUNTIME defined to pull in the actual
implementations.
No functional change intended.
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
e4b8deb222 removed the last in-tree uses of PCPU_INC(). Its
potential benefit is also practically nonexistent. Non-x86
platforms already implement it as PCPU_ADD(..., 1), and according
to [0] there are no recent x86 processors for which the 'inc'
instruction provides a performance benefit over the equivalent
memory-operand form of the 'add' instruction. The only remaining
benefit of 'inc' is smaller instruction size, which in this case
is inconsequential given the limited number of per-CPU data consumers.
[0]: https://www.agner.org/optimize/instruction_tables.pdf
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D29308
I noticed that many of the math-related tests were failing on AArch64.
After a lot of debugging, I noticed that the floating point exception flags
were not being reset when starting a new process. This change resets the
VFP inside exec_setregs() to ensure no VFP register state is leaked from
parent processes to children.
This commit also moves the clearing of fpcr that was added in 65618fdda0
from fork() to execve() since that makes more sense: fork() can retain
current register values, but execve() should result in a well-defined
clean state.
Reviewed By: andrew
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D29060
Other kernel sanitizers (KMSAN, KASAN) require interceptors as well, so
put these in a more generic place as a step towards importing the other
sanitizers.
No functional change intended.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29103
To trace leaf asm functions we can insert a single nop instruction as
the first instruction in a function and trigger off this.
Reviewed by: gnn
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D28132
This reduces the memory mapped to be closer to the minimal memory
needed to enable the MMU.
Reviewed by: mmel
Sponsored by: Innovate UK
Differential Revision:://reviews.freebsd.org/D27765
arm64 has a distinct exception code for single-step, so we can use this
to detect when an unexpected SS trap is encountered, or when an expected
one is not. See db_stop_at_pc().
Reviewed by: markj, jhb
MFC after: 5 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28942
This value should be kept in sync with updates to kdb_frame->tf_elr,
since it is queried by PC_REGS() in several places.
Reviewed by: markj, jhb
MFC after: 5 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28943
The motivation is to provide access to these registers from userspace
via ptrace(2) requests PT_GETDBREGS and PT_SETDBREGS.
This change breaks the ABI of these particular requests, but is
justified by the fact that the intended consumers (debuggers) have not
been taught to use them yet. Making this change now enables active
upstream work on lldb to begin using this interface, and take advantage
of the hardware debugging registers available on the platform.
PR: 252860
Reported by: Michał Górny (mgorny@gentoo.org)
Reviewed by: andrew, markj (earlier version)
Tested by: Michał Górny (mgorny@gentoo.org)
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28415
This is a prerequisite to allowing the use of hardware watchpoints for
userspace debuggers.
This is also a slight departure from the x86 behaviour, since `si_addr`
returns the data address that triggered the watchpoint, not the
address of the instruction that was executed. Otherwise, there is no
straightforward way for the application to determine which watchpoint
was triggered. Make a note of this in the siginfo(3) man page.
Reviewed by: jhb, markj (earlier version)
Tested by: Michał Górny (mgorny@gentoo.org)
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28561
In particular, we want to disallow setting breakpoints on kernel
addresses from userspace. The control register fields are validated or
ignored as appropriate.
Reviewed by: markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28560
On arm64 we can select how strongly we order device memory. Currently
we use the strongest type of non-Gathering, non-Reordering, no Early
write acknowledgement. This is equivalent to VM_MEMATTR_SO in the 32-bit
arm code.
Create a new memory type to remove the no Early write acknowledgement
option to create a memory attribute that is equivalent to the arm
VM_MEMATTR_DEVICE.
Keep the the old nGnRnE memory as what we provide for VM_MEMATTR_DEVICE
until we can test nGnRE on more hardware. A method for dynamically
switching back may be needed as at least one vendor is known to have
broken nGnRE memory.
Sponsored by: Innovate UK
The RW fields in this register reset to architecturally unknown values,
so initialize these to the proper rounding and denormal mode.
MFC after: 1 week
The existing implementation relies on each trap handler saving a normal
stack frame record, which is a waste of time and space when we're
already saving a trapframe to the stack. It's also wrong as it currently
saves LR not ELR.
Instead of patching it up, rewrite it based on the RISC-V implementation
with inspiration from the amd64 implementation for how to handle
vectored traps to provide an improved implementation. This includes
compressing the information down to one line like other architectures
rather than the highly-verbose old form that repeats itself by printing
LR and FP in one frame only to print them as PC and SP in the next. It
also includes printing out actually useful information about the traps
that occurred, though FAR is not saved in the trapframe so we cannot
print it (in general it can be clobbered between when the trap happened
and now), only ESR.
The AAPCS also allows the stack frame record to be located anywhere in
the frame, not just the top, so the caller's SP is not at a fixed offset
from the callee's FP like on almost all other architectures in
existence. This means there is no way to derive the caller's SP in the
unwinder, and so we have to drop that bit of (unused) state everywhere.
Reviewed by: jhb, markj
Differential Revision: https://reviews.freebsd.org/D28026
This setting limits the amount of memory that can be allocated to UMA.
On systems with a direct map and ample KVA, however, there is no reason
for VM_KMEM_SIZE_SCALE to be larger than 1. This appears to have been
inherited from the 32-bit ARM platform definitions.
Also remove VM_KMEM_SIZE_MIN, which is not needed when
VM_KMEM_SIZE_SCALE is defined to be 1.[*]
Reviewed by: alc, kp, kib
Reported by: alc [*]
Submitted by: Klara, Inc.
Sponsored by: Ampere Computing
Differential Revision: https://reviews.freebsd.org/D28225
This setting places a (small) limit on the size of the buffer cache,
constraining UFS performance on large servers. The setting comes from
the initial arm64 implementation and appears to be vestigal. Remove it.
Reviewed by: kib
Submitted by: Klara, Inc.
Sponsored by: Ampere Computing
Differential Revision: https://reviews.freebsd.org/D28162
This allows us to use it when we only need to check if the virtual address
is valid. For example when checking if an address in the DMAP region is
mapped.
Reviewed by: kib, markj
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D27621
This removes an unneeded instruction to move the pointer from x18 to a
temporary register.
Reviewed by: emaste
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D26971
Everything required for remote kernel debugging over a serial
connection. For FDT-based systems, a debug port can be specified by
setting hw.fdt.dbgport to the desired device tree node in loader.conf.
For example, hw.fdt.dbgport="uart1", or
hw.fdt.dbgport="serial@ff1a0000".
Looks good: emaste
Tested by: rwatson
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27727
The program counter field in the PCB is written in exactly one place,
makectx(), upon entry to the debugger. For threads other than curthread,
its value will be empty, or bogus. Rather than writing to this field in
more places, it can be removed in favor of using the value in the link
register.
To make this clearer, pcb->pcb_x[30] is renamed to pcb->pcb_lr, similar
to what already exists in struct trapframe. Also, prefer lr to x30 in
assembly, as it better conveys intention.
This improves PC_REGS() for kdb_thread != curthread. It is required for
a functional gdb(4) stub, fixing the output of `info threads`, in
particular.
The space occupied by pcb_pc is retained, for compatibility with kgdb.
Reviewed by: markj, jhb
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27720
These macros generate both the 32- and 64-bit ops, but the mask was hard
coded for 32-bit ops, causing the 64-bit ops always to affect only the
low 32 bits.
PR: 252324
Reported by: gbe, mmel
Reviewed by: markj, mmel
Tested by: mmel, rwatson
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D27886
This same check is used on other architectures. Previously this would
permit a stack frame to unwind into any arbitrary kernel address
(including unmapped addresses).
Reviewed by: andrew, markj
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D27362
- record MPIDR for all started cores in pcpu, they will be used as link
between physical locality of given core, ID in external description
(FDT or ACPI) and cupid.
- because of above, cpuid can (and should) be freely assigned, only boot
CPU must have cpuid 0. Simplify startup code according this.
Please note that pure cpuid is not sufficient instrument to hold any
information about core or cluster topology, nor to determistically iterate
over subpart of cores in CPU (iterate over all cores in single cluster for
example). Situation is more complicated by fact that PSCI can reject start
of core without reporting error (because power budget for example), or by
fact that is possible that we booted on non-first core in cluster (thus with
cpuid 0 assigned to random core).
Given cores topology should be exhibited to other parts of system
(for example to scheduler for big.little or multicluster systems) by using
smp_topo interface.
Differential Revision: https://reviews.freebsd.org/D13863
Follow-up to r353959 and r368070: do the same for other architectures.
arm32 already seems to use its own .fnstart/.fnend directives, which
appear to be ARM-specific variants of the same thing. Likewise, MIPS
uses .frame directives.
Reviewed by: arichardson
Differential Revision: https://reviews.freebsd.org/D27387
On some of the server-grade ARM64 machines the number of NUMA domains is higher
than 2. When booting GENERIC kernel on such machines the SRAT parser fails
leaving the system with a single domain. To make GENERIC kernel usable on those
server, match the parameter value with the one for amd64 arch.
Reviewed by: allanjude
Differential Revision: https://reviews.freebsd.org/D27368
Sponsored by: Ampere Computing
Submitted by: Klara, Inc.
This adds an arm64 iommu interface and a driver for Arm System Memory
Management Unit version 3.2 (ARM SMMU v3.2) specified in ARM IHI 0070C
document.
Hardware overview is provided in the header of smmu.c file.
The support is disabled by default. To enable add 'options IOMMU' to your
kernel configuration file.
The support was developed on Arm Neoverse N1 System Development Platform
(ARM N1SDP), kindly provided by ARM Ltd.
Currently, PCI-based devices and ACPI platforms are supported only.
The support was tested on IOMMU-enabled Marvell SATA controller,
Realtek Ethernet controller and a TI xHCI USB controller with a low to
medium load only.
Many thanks to Konstantin Belousov for help forming the generic IOMMU
framework that is vital for this project; to Andrew Turner for adding
IOMMU support to MSI interrupt code; to Mark Johnston for help with SMMU
page management; to John Baldwin for explaining various IOMMU bits.
Reviewed by: mmel
Relnotes: yes
Sponsored by: DARPA / AFRL
Sponsored by: Innovate UK (Digital Security by Design programme)
Differential Revision: https://reviews.freebsd.org/D24618
Use ELR register value instead of LR for PMC_TRAPFRAME_TO_PC macro since
it's the former that indicates PC if the interrupted execution thread.
This fixes a bug where pmcstat lost the leaf function of the call chain
and started with the second function in the chain.
Although this change is an improvement over the previous logic there is still
posibility for incomplete data: if the leaf function does not have stack
variables and does not call any other functions compiler would not generate
a stack frame for it and the FP value would point to the caller's frame, so
instead of the actual "caller1 -> caller2 -> leaf" chain only
"caller1 -> leaf" would be captured.
Sponsored by: Ampere Computing
Submitted by: Klara, Inc.
This brings these definitions in sync with the ARMv8.6 version of the
architecture reference manual.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26706
Ampere Altra in a dual socket configuration has 12 ITSes for the
12 PCIe root complexes. The NIRQ interrupts are statically split
between each child of the gic bus, so here we increase that
value. 16k is enough for
(#cpus * #its * max_pcie_bifurcation) LPIs + (#SPIs and #PPIs)
Reviewed by: jhb
Approved by: scottl (implicit)
MFC after: 1 week
Sponsored by: Ampere Computing
Differential Revision: https://reviews.freebsd.org/D26766
On Ampere Altra systems, the sparse population of RAM within the
physical address space causes the vm_page_dump bitmap to be much
larger than necessary, increasing the size from ~8 Mib to > 2 Gib
(and overflowing `int` for the size).
Changing the page dump bitmap also changes the minidump file
format, so changes are also necessary in libkvm.
Reviewed by: jhb
Approved by: scottl (implicit)
MFC after: 1 week
Sponsored by: Ampere Computing, Inc.
Differential Revision: https://reviews.freebsd.org/D26131
These definitions were repeated by all architectures, with small
variations. Consolidate the common definitons in machine
independent code and use bitset(9) macros for manipulation. Many
opportunities for deduplication remain in the machine dependent
minidump logic. The only intended functional change is increasing
the bit index type to vm_pindex_t, allowing the indexing of pages
with address of 8 TiB and greater.
Reviewed by: kib, markj
Approved by: scottl (implicit)
MFC after: 1 week
Sponsored by: Ampere Computing, Inc.
Differential Revision: https://reviews.freebsd.org/D26129
One problem with the bus_space_read_N() and bus_space_write_N() family of
functions is that they provide no protection against exceptions which can
occur when no physical hardware or device responds to the read or write
cycles. In such a situation, the system typically would panic due to a
kernel-mode bus error. The bus_space_peek_N() and bus_space_poke_N() family
of functions provide a mechanism to handle these exceptions gracefully
without the risk of crashing the system.
Typical example is access to PCI(e) configuration space in bus enumeration
function on badly implemented PCI(e) root complexes (RK3399 or Neoverse
N1 N1SDP and/or access to PCI(e) register when device is in deep sleep state.
This commit adds a real implementation for arm64 only. The remaining
architectures have bus_space_peek()/bus_space_poke() emulated by using
bus_space_read()/bus_space_write() (without exception handling).
MFC after: 1 month
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D25371