When mapping the arm64 KASAN shadow map we use Ln_TABLE_MASK to align
physical addresses, however these should already be aligned either
by rounding to a greater alignment, or the VM subsystem is giving us
a correctly aligned page.
Remove these extra alignment masks.
Reviewed by: kevans
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D39752
It appears that PAC registers are configured to trap upon access, but
since the kernel starts in EL1 on this platform it has no ability to
inspect or modify this configuration. Simply disable PAC on this
platform for now, since the kernel otherwise hangs during boot.
PR: 270472
Reviewed by: andrew, emaste
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D39748
Export default MINSIGSTKSZ value for the x86 until we do not preserve AVX
registers in the signal context.
Differential Revision: https://reviews.freebsd.org/D39644
MFC after: 1 month
Have more accruate comments. While #if, #else, etc are copied to the
header files, lines that don't start with # are not. And #include files
are only output to sysinc (which winds up at the front of init_sysent.c
which seems a bit odd). This is all radically undocumented, and likely
has drifted somewhat from 4.4BSD and what other systems do (they've
drifted too, fwiw).
Sponsored by: Netflix
When vm_map_remove() is called from vm_swapout_map_deactivate_pages()
due to swapout, PKRU attributes for the removed range must be kept
intact. Provide a variant of pmap_remove(), pmap_map_delete(), to
allow pmap to distinguish between real removes of the UVA mappings
and any other internal removes, e.g. swapout.
For non-amd64, pmap_map_delete() is stubbed by define to pmap_remove().
Reported by: andrew
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D39556
To allow it to be used before ENTRY we need to ensure the symbol is
in the .text section. It also needs to be aligned correctly.
While here mark the symbol type as a function as in the ENTRY macro.
Reported by: jrtc27
Sponsored by: Arm Ltd
Handle data-lanes property for pcie phy and set it accordingly.
This makes devices attached to pcie3 work properly.
For some RK3568 based boards, RTL8125B based device is
connected it. So with this, realtek-re-kmod driver attaches
and works.
Partially obtained from OpenBSD.
Tested on NanoPI-R5S, FireFly Station P2 boards.
NETLINK is going to replace rtsock and a number of other ioctl/sysctl interfaces.
In-base utilies such as route(8), netstat(8) and soon ifconfig(8)
are being converted to use netlink sockets as a transport between
kernel and userland.
In the current configuration, it still possible have the kernel
without NETLINK (`nooptions NETLINK`) and use the aforementioned
utilies by buidling the world with `WITHOUT_NETLINK` src.conf knob.
However, this approach does not cover the cases when person unintentionally
builds a custom kernel without netlink and tries to use the standard userland.
This change adds `option NETLINK` to the default options for each
architecture, fixing the custom kernel issue.
For arm, this change uses `std.armv6` and `std.armv7` (netlink already in)
instead of DEFAULTS.
Reviewed By: imp
Differential Revision: https://reviews.freebsd.org/D39339
Use the GICD_SIZE macro (0x10000), which is half the size of the current
fixed-sized mapping (128 * 1024 == 0x20000).
In ARM64 Hyper-V instances, it seems the Distributor's registers are
located immediately preceding a range of physical memory in the bus
address space. Thus, when ram0 is attaching and attempts to reserve
SYS_RES_MEMORY resources corresponding to its physmem ranges, it fails,
because the first 0x10000 bytes of this range are already owned by gic0.
PR: 270415
Reported by: whu
Tested by: whu
Differential Revision: https://reviews.freebsd.org/D39260
init_pagetables is mapped into the segment containing the BSS, but does
not get zeroed by locore. It is used for bootstrap page table pages.
It happens that the bootstrap kernel stack is also placed in that
section, but there's no reason it shouldn't live in the BSS, so move it
there. No functional change intended.
Reviewed by: andrew
MFC after: 1 week
Sponsored by: Klara, Inc.
Sponsored by: Juniper Networks
Differential Revision: https://reviews.freebsd.org/D39367
The ENTRY macro adds instructions to the start of a function but not
EENTRY. To use these instructions in both functions move the EENTRY
use before the ENTRY use.
Sponsored by: Arm Ltd
On arm64, the PCB is stored at the top of the thread stack. For thread0
this comes from the static "initstack" region, which is placed in the
.init_pagetable section, which is not part of the BSS and thus doesn't
get zeroed by locore. (See the comment in ldscript.arm64.) It is thus
possible for the pcb_flags field to be uninitialized, which can result
in PCB_SINGLE_STEP being set.
Fix this by simply initializing the field. A separate commit will move
initstack out of the .init_pagetable section, since it has no reason to
be there, but it is preferable to explicitly initialize PCB fields
anyway. In particular, regular kernel stacks are not zeroed upon
allocation, so we should be consistent here.
Reviewed by: andrew
MFC after: 1 week
Sponsored by: Klara, Inc.
Sponsored by: Juniper Networks
Differential Revision: https://reviews.freebsd.org/D39343
Rather than falling through to the default case handle the unknown
exception with its own panic message. As ESR_EL1 is zero for this
exception stop printing it.
Sponsored by: Arm Ltd
Previously this would zero out x18 in the pcb, now it's attacking the
innocent pcb_onfault -- drop it entirely.
This technically fixes
e605b87a9e ("Save only callee-saved registers in pcb"), but it's
harmless until the below commit trims down pcb_x.
Reported by: mmel
Reviewed by: andrew, mmel
Fixes: 1c1f31a5e5 ("Remove unused registes from the arm pcb")
Differential Revision: https://reviews.freebsd.org/D39277
This entails:
- Marking some obvious candidates for __nosanitizeaddress
- Similar trap frame markings as amd64, for similar reasons
- Shadow map implementation
The shadow map implementation is roughly similar to what was done on
amd64, with some exceptions. Attempting to use available space at
preinit_map_va + PMAP_PREINIT_MAPPING_SIZE (up to the end of that range,
as depicted in the physmap) results in odd failures, so we instead
search the physmap for free regions that we can carve out, fragmenting
the shadow map as necessary to try and fit as much as we need for the
initial kernel map. pmap_bootstrap_san() is thus after
pmap_bootstrap(), which still included some technically reserved areas
of the memory map that needed to be included in the DMAP.
The odd failure noted above may be a bug, but I haven't investigated it
all that much.
Initial work by mhorne with additional fixes from kevans and markj.
Reviewed by: andrew, markj
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D36701
These were kept for ABI reasons. Remove them and bump __FreeBSD_version
so debuggers can be updated to use the new layout.
Reviewed by: jhb
Sponsored by: Arm Ltd
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35378
The size of the spsr field in struct reg has changed. Mask the bits
that userspace doesn't know about out as they may be invalid.
While here add a comment why we don't need compat support in set_regs.
Sponsored by: Arm Ltd
It was previously possible for the fault address register to get
clobbered before it was saved. This small window occurred when an
additional exception was encountered inside the exception handler,
overwriting the previous value.
Commit f29942229d ("Read the arm64 far early in el0 exceptions")
patched this issue, but avoided changing the trapframe since this could
be considered a KBI change in FreeBSD 13.
Revert the above fix and save the fault address in the trapframe
instead. This saves the fault address even earlier in the exception
handling process, and is a more robust and simple fix.
Reviewed by: andrew, jhb, jrtc27
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D38984
For the Exception Syndrome Register, ESR_ELx, the upper 32b were
previously unused, but now may contain additional exception info as of
Armv8.7 (FEAT_LS64).
Extend ESR from u32->u64 in exception handling code to support this. In
addition, also extend Saved Program Status Register SPSR_ELx in the same
way to allow for future extensions.
Reviewed by: andrew
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D38983
sig_atomic_t is defined as a long and thus is 64-bit on arm64. For some
reason its limit was incorrectly specified as a 32-bit number. This had
the unfortunate side effect of causing gnulib to override most of the
definitions in stdint.h. On CheriBSD this breaks all software that uses
gnulib in annoying and hard to debug ways.
Technically updating the limits might be an ABI change, but these
defines are largely unused (the only use in tree is in the libc++ test
suite where it's use an assertion that will fail due to this bug).
Further, since the underlying type remains the same, we're just
increasing the range of values a paranoid program might use.
Reviewed by: andrew, emaste
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D39193
Add macros for offsets of macros we set in the arm64 pcb pcb_x array.
This will simplift reducing the size of this array in a later change.
Sponsored by: Arm Ltd
Make a pass at the various nexus implementations, fixing some very minor
style issues, obsolete comments, etc.
The method declaration section has become unwieldy in many respects.
Attempt to tame it by:
- Using generated method typedefs
- Grouping methods roughly by category, and then alphabetically.
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D38495
Adding a missing include file, which provides the definition of
SYSCTL_INT.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D39149
Move device memory to a weaker type. The new device memory type allows
the system to acknowledge a write to a device before the write has
completed. This is inline with VM_MEMATTR_DEVICE on armv6/armv7.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D38945
To allow for debugging after changing the arm64 VM_MEMATTR_DEVICE
memory type add a new set of tunables to tell the kernel to use
non-posted memory.
This adds the following tunables:
- kern.force_nonposted: When set to non-zero the kernel will use
non-posted memory for all device allocations.
- hint.<dev>.<unit>.force_nonposted: As above, however only forces
non-posted memory on the named device.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D38944
We only ever build a 4 level page table for the Arm SMMU. Remove the
support for a 3 level table.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D38949
These are SMMU (and MALI GPU) specific. Give them a SMMU specific name.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D38948
The fields we need to adjust are different in stage 1 and stage 2
tables. Handle this by adding variables to hold the bits to check,
set, and clear.
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37399
Consolidate add_efi_map_entry() and exclude_efi_map_entry() into a
single function, handle_efi_map_entry(), so that the exact set of entry
types handled is the same in the addition or exclusion cases. Before,
exclude_efi_map_entry() had a 'default' case that would exclude all
entry types that were not listed explicitly in the switch statement.
Logically, we do not need to exclude a range that could not possibly be
added to physmem, and we do not need to exclude bus ranges that are not
physical memory, for example EFI_MD_TYPE_IOMEM.
Since physmem's ram0 device will reserve bus memory resources for its
owned ranges, this was preventing attachment of the watchdog device on
the RPI4B. For some reason its region of memory-mapped I/O appeared in
the EFI memory map (with the aforementioned EFI_MD_TYPE_IOMEM type).
This change fixes the attachment issue, as we prevent the physmem API
from messing with this range of bus space.
PR: 270044
Reported by: karels, Mark Millard
Reviewed by: andrew, karels, imp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D39003
To invalidate stage 2 mappings on arm64 we may need to call into the
hypervisor so add a function pointer that bhyve can use to implement
this.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37254
Hyper-V does not provide Mellanox hardware, some of Azure's instances
do, thus the configuration to enable them does not belong in the generic
std.hyperv config.
Fixes: 15e7fa83ef ("arm64: Hyper-V: Add vPCI and Mellanox driver modules into build")
This is the only mapping remaining which needs to respect nonposted-mmio
to avoid breaking the boot on Apple silicon.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D38920
To allow hardware to work around a broken memory bus where we need to
support the nonposted-mmio flag.
Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D34333
On arm64 PCI config memory is expected to be mapped with a non-posted
device type. To handle this use the new bus_map_resource support in
arm64 to map memory with the new VM_MEMATTR_DEVICE_NP attribute. This
memory has already been allocated and activated, it just needs to be
mapped.
Reviewed by: kevans, mmel
Differential Revision: https://reviews.freebsd.org/D30079
On some hardware, we can't clear HCR_EL2.E2H so accesses to the physical
timer hopelessly trap to EL2. Stash off the value of HCR_EL2 and use it
in has_hyp() to avoid this.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D38884
This makes the detection of VMs common between platforms that
have SMBios.
Reviewed by: imp, kib
Differential Revision: https://reviews.freebsd.org/D38800
A subsequent commit will instead use existing infrastructure to
exclude the files from hwpmc.ko for non-ACPI builds. Note that the
original commit left the files as optional in sys/conf/files.arm64.
This reverts commit 751d88119f.
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D38736
This patch shaves off up to two three instructions in
save_registers_head in exception.S for arm64, which would make more
space for instructions that could be added in CheriBSD.
This is done by:
1. Combining pointer arithmetic with pre-incrementing STP instructions
2. Removing the instruction that sets the frame pointer (x29) as its
content is unused
Differential Revision: https://reviews.freebsd.org/D34631
On Apple Silicon systems, E2H can't actually be cleared; we're stuck
with it. Check it again when we're setting up CPTR_EL2 and set FPEN
appropriately to avoid later trapping to EL2 on writes to SIMD
registers.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D38819
Most options in kernel config files use "options<space><tab>OPTION".
This allows the option to be commented out without shifting columns.
A few options had two tabs, and some had spaces. Make them consistent.
They don't provide any value and are quite arbitrary.
Note arm64 GENERIC-MMCCAM was already excluded, just not the NODEBUG
variant.
The option is already build-tested with arm64 LINT kernel.
Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D38458
To avoid confusing people, rename linux_timer.h to linux_time.h,
as linux_timer.c is the implementation of timer syscalls only,
while linux_time.c contains implementation of all stuff declared
in linux_time.h.
MFC after: 2 weeks
Include vm headers directly where they needed. The linux_util.h included
in a most source files of the Linuxulator, avoid collecting a rarely used
includes here.
MFC after: 2 weeks
pmap_init_pv_table makes a first pass over the memory segments to
compute the amount of address space needed to allocate per-superpage
locks. It then makes a second pass over each segment allocating
domain-local memory to back the pages for the locks belonging to each
segment. This second pass rounds each segment's allocation up to a
page size since the domain-local allocation has to be a multiple of
pages. However, the first pass was only doing a single round of the
total page counts up at the end not accounting for the padding present
in each segment. To fix, apply the rounding in each segment in the
first pass instead of just at the end.
While here, tidy the second pass a bit by trimming some
not-quite-right logic copied from amd64. In particular, compute pages
directly at the start of the loop iteration to more closely match the
first loop. Then, drop an always-false condition as 'end' was
computed as 'start + pages' where 'start == highest + 1'. Thus, the
actual condition being tested was 'if (highest >= highest + 1 +
pages)'. Finally, remove 'highest' entirely by keep the result of the
'pvd' increment in the existing loop.
Reported by: CHERI (overflow)
Reviewed by: markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D38377
The architecture nexus should handle allocation and release of memory and
interrupts. This is to ensure that system-wide resources such as these
are available to all devices, not just children of ofwbus0.
On powerpc this moves the ownership of these resources up one level,
from ofwbus0 to nexus0. Other architectures already have the required
logic in their nexus implementation, so this eliminates the duplication
of resources. An implementation of nexus_adjust_resource() is added for
arm, arm64, and riscv.
As noted by ian@ in the review, resource handling was the main bit of
logic distinguishing ofwbus from simplebus. With some attention to
detail, it should be possible to merge the two in the future.
Co-authored by: mhorne
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D30554
When handling userspace exceptions on arm64 we need to dereference the
current thread pointer. If this is being promoted/demoted there is a
small window where it will cause another exception to be hit. As this
second exception will set the fault address register we will read the
incorrect value in the userspace exception handler.
Fix this be always reading the fault address before dereferencing the
current thread pointer.
Reported by: olivier@
Reviewed by: markj
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D38196
Before adding the ITS interrupt controller driver to handle MSI/MSI-X
interrupts check if it is present in the IO Remapping Table (IORT).
If not don't attach as devices expect to use this table to find the
correct MSI interrupt controller.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D37772
Microsoft Azure Hyper-V uses SPI to map MSI in ARM64, instead of
using LPI IDS. To enable that we need to have gic registered with
ACPI_MSI_XREF and gic acpi to map SPI for MSI.
This is the 1st of the three patchs to enable Hyper-V vPCI support
in arm64.
Reviewed by: andrew, emaste, whu
Tested by: Souradeep Chakrabarti <schakrabarti@microsoft.com>
Obtained from: Souradeep Chakrabarti <schakrabarti@microsoft.com>
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D37763
This is a followup of 692e19cf51 (add netlink to GENERIC@amd64).
Netlink is a communication protocol defined in RFC 3549. It is async,
TLV-based protocol, providing 1-1 and 1-many communications between kernel
and userland. Netlink is currently used in Linux kernel to modify, read and
subscribe for nearly all networking states. Interface state, addresses, routes,
firewall, rules, fibs, etc, are controlled via Netlink.
Netlink support was added in D36002. It has got a number of improvements and
first customers since then:
* net/bird2 got netlink support, enabling route multipath in FreeBSD
* netlink-based devd notifications are being worked on ( D37574 ).
* linux(4) fully supports and depends on Netlink
Enabling Netlink in GENERIC targets two goals.
The first one is to provide stability for the third-party userland applications,
so they can rely on the fact that netlink always exists since 14.0 and potentially 13.2.
Loadable module makes life of the app delepers harder. For example, `net/bird2` can be
either build with netlink or rtsock support, but not both.
The second goal is to enable gradual conversion of the base userland tools
to use netlink(4) interfaces. Converting tools like netstat (D36529), route,
ifconfig one-by-one simplifies testing and addressing the feedback.
Othewise, switching all base to use netlink at once may be too big of a leap.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D37783
The scmi driver in its current form requires the arm_doorbell
driver to communicate with the firmware.
The arm_doorbell is only found in ARM Juno reference board (and
apparently on Morello too).
If we want to use scmi on other platform (like some rockchip or imx
soc), the driver needs to be updated to support svc/shmem communication
with the firmware.
For now since it can be only used with arm_doorbell move the device to
std.arm otherwise kernel configs like ALLWINNER or ROCKCHIP fails to build.
Reviewed by: br, imp
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D37953
FreeBSD defines EF_ARM_EABI_VERSION in a non-standard way (at least
differently than everybody else). We use this only in elf*machdep.c to
make sure the image is new enough. Switch to the more standard way of
defining this and adjust other constants to match.
Fixes: c52c98e69a
Sponsored by: Netflix
In set_fpcontext we only need a critical section around vfp_discard.
The remainder of the code can run without it.
While here add an assert to check the passed in thread is the
current thread as the code already this.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D38000
If a thread enters a kernel FP context the PCB_FP_STARTED may be
unset when calling get_fpcontext even if the VFP unit has been used
by the current thread.
Reduce the use of this flag to just decide when to store the VFP state.
While here add an assert to check the assumption that the passed in
thread is the current thread and remove the unneeded critical section.
The latter is unneeded as the only place we would need it is in
vfp_save_state and this already has a critical section when needed.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D37998
The PCB_FP_STARTED is used to indicate that the current VFP context
has been used since either 1. the start of the thread, or 2. exiting
a kernel FP context.
When case 2 was added to the kernel this could cause incorrect results
to be returned when a thread exits the kernel FP context and fill_fpregs
is called before it has restored the VFP state, e.g. by trappin on a
userspace VFP instruction.
In both of the cases the base save area is still valid so reduce the
use of the PCB_FP_STARTED flag check to help decide if we need to
store the current threads VFP state.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D37994
Commit 86a994d653 initialized
thread0.td_kstack_pages to KSTACK_PAGES. Due to the lack of an
include of opt_kstack_pages.h it used the fallback value of 4 from
machine/param.h. This meant that increasing KSTACK_PAGES in the kernel
config resulted in a panic in _epoch_enter_preempt as the following
assertion was false during network stack setup:
MPASS((vm_offset_t)et >= td->td_kstack &&
(vm_offset_t)et + sizeof(struct epoch_tracker) <=
td->td_kstack + td->td_kstack_pages * PAGE_SIZE);
Switch to initializing with kstack_pages following other architectures.
Reviewed by: imp, markj
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D38048
The SPDX-License-Identifier was wrong in the Arm CoreLink CMN-600
driver files. It used the incorrect FreeBSD variant of the BSD-2-Clause
identifier. According to [1] all files should use BSD-2-Clause.
[1] https://tools.spdx.org/app/check_license/
Reported by: emaste
Sponsored by: Arm Ltd
In 0a9a4d2cd6 a check for OPT_ACPI was added to the hwpmc Makefile
to fix loading the module in a kernel where ACPI has been disabled.
This broke loading the module when ACPI was enabled in the build as
OPT_ACPI isn't a Makefile macro so was always disabled.
Move this check to the C files where the DEV_ACPI macro does exist.
Reviewed by: gnn
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D37773
The td_ast member is an int so only 4 bytes, yet we were using an 8 byte
load and thus also got td_inhibitors in the upper bits. The code prior
to the commit that introduced td_ast did also do a bogus 8 byte load of
td_flags but masked the flags so arguably was correct, if dodgy. Now
that we're using the right width for the load we can also fold the
immediate offset back into the load; because td_ast is at an odd
multiple of 4 bytes from the start of struct thread the normal scaled
load couldn't be used with such an immediate offset when doing an 8 byte
load due to its limited immediate range, but we can use a scaled load
once more now that the offset is a multiple of the load width.
Reviewed by: andrew, kib
Fixes: c6d31b8306 ("AST: rework")
Differential Revision: https://reviews.freebsd.org/D37751
The order of asserting/deasserting the resets doesn't matter so use
the new hwreset_array to manage them all.
Reviewed by: manu
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37358
The SCMI specification describes a set of standard interfaces for power,
performance and system management.
SCMI is extensible and provides interfaces to access functions which are
often implemented in firmwares in the System Control Processor (SCP).
This implements Shared Memory-based transfer, which is one of the ways on
how messages are exchanged between agents and the platform.
This includes a driver for ARM Message Handling Unit (MHU) Doorbell, which
is a mechanism that the caller can use to alert the callee of the presence
of a message.
The support implements clock management interface. For instance this allows
us to control HDMI pixel clock on ARM Morello Board.
Tested on ARM Morello Board.
Obtained from: CheriBSD
Differential Revision: https://reviews.freebsd.org/D37316
Reviewed by: manu
Sponsored by: UKRI
Partially from: https://reviews.freebsd.org/D36027
This can be eventually improved or simplified or fixed if necessary.
Following devices work with proper drivers and with the necessary clocks:
Native networking via eqos driver
USB3 and USB2
PCIe support is working but a bit picky about what hardware it supports (but so is Linux)
SD & (e)MMC
With the EDK2 loader video also works
Supported hardwares are Quartz64, NanoPI R5S and Firefly Station P2, more to come as DTS files gets done.
On amd64, don't abort promotion due to a missing accessed bit in a
mapping before possibly write protecting that mapping. Previously,
in some cases, we might not repromote after madvise(MADV_FREE) because
there was no write fault to trigger the repromotion. Conversely, on
arm64, don't pointlessly, yet harmlessly, write protect physical pages
that aren't part of the physical superpage.
Don't count aborted promotions due to explicit promotion prohibition
(arm64) or hardware errata (amd64) as ordinary promotion failures.
Reviewed by: kib, markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D36916
When Linux loads a new kernel via kexec, somtiems it must reserve memory
for devices that are still active (and typically can't be reset or
shutdown). When present, this table is a linked list of ranges that are
still in use that the OS must avoid using.
Mark these areas as reserved.
This is part of the GICv3 workaround code where we must use the PA
addresses already programmed into the GICv3 when we take over. This part
ensure we don't allocate the mmeory for anything else.
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D37440
It would be nice to be able to pass an arbitrary pointer to the callback
code. Add one, and pass NULL in all the places that we do that today.
As noted by andrew@, we should likely refactor this into MI code and use
it here and amd64, but for the future.
Sponsored by: Netflix
Reviewed by: rpokala
Differential Revision: https://reviews.freebsd.org/D37439
When PV_STATS is defined, freed is used. Otherwise it isn't. Mark it as
__pvused and define __pvused appropriately.
Sponsored by: Netflix
Reviewed by: tsoome, rpokala, andrew
Differential Revision: https://reviews.freebsd.org/D37438
As with amd64 pmap introduce per-superpage locks backed by pages
allocated by their respective domains.
This significiantly reduces lock contantion from pmap when running
poudriere on a 160 core Ampere Altra server
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D36310
A misaligned frame pointer is certainly not a valid frame pointer and
with strict alignment enabled (as on CHERI) can cause panics when it is
loaded from later in the code.
This is a recommit of 40e0fa10f5 with
is_aligned() corrected to __is_aligned().
Reviewed By: jhb
Differential Revision: https://reviews.freebsd.org/D34646
The RK3328 dts doesn't have the glue node so we need the dwc3 driver
to attach directly.
Differential Revision: https://reviews.freebsd.org/D37396
Sponsored by: Beckhoff Automation GmbH & Co. KG
These were originally in locore.S as they are only needed so we have
a valid value to put into the vbar_el2 register. As these will soon
be used by bhyve so move them to a new file as we already have with
the EL1 exception vectors in exception.S.
Obtained from: https://github.com/FreeBSD-UPB/freebsd-src (earlier version)
Sponsored by: Innovate UK
Sponsored by: The FreeBSD Foundation
Zero the vttbr_el2 register on each CPU so we can tell if we are
running the host or guest kernel from a hypervisor.
Obtained from: https://github.com/FreeBSD-UPB/freebsd-src (earlier version)
Sponsored by: Innovate UK
Sponsored by: The FreeBSD Foundation
For completeness add accessors for the MIDR field. As the field is
always 0xf on arm64 it is unneeded in the current MICR handling, but
will be used in the vmm module for bhyve.
Obtained from: https://github.com/FreeBSD-UPB/freebsd-src (earlier version)
Sponsored by: The FreeBSD Foundation
These all work on stage 1 tables. Rename them so we can add similar
functions that operate on stage 2 tables.
Reviewed by: alc, markj, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37302
When modifying a stage 2 mapping we may need to call into the
hypervisor to invalidate the TLB. Until it is known if the cost of
this operation is less than the performance gains superpages offers
disable their use.
Reviewed by: kib. markj
Sponsored by: Innovate UK
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37299
A misaligned frame pointer is certainly not a valid frame pointer and
with strict alignment enabled (as on CHERI) can cause panics when it is
loaded from later in the code.
Reviewed By: jhb
Differential Revision: https://reviews.freebsd.org/D34646
The M1 has no EL3, so we're limited to a spin-table implementation if we
want to eventually use bhyve on it. Implement spin-table now, but note
that we still prefer PSCI where possible.
Reviewed by: mmel
Differential Revision: https://reviews.freebsd.org/D34661
The pull-up/pull-down register offset was wrong on the Rockchip rk356x.
It was set such that the driver would modify the IOMUX control registers.
This seems to work with the current device tree files, but fails with
upstream files. Fix the offset so the later calculation has the correct
offset for the pull-up/pull-down control register.
Sponsored by: The FreeBSD Foundation
With PERTHREAD_SSP configured, the compiler's stack-smashing protection
uses a per-thread canary value instead of a global value. The value is
stored in td->td_md.md_canary; the sp_el0 register always contains a
pointer to that value, and certain functions selected by the compiler
will store the canary value on the stack as a part of the function
prologue (and will verify the copy as part of the epilogue). In
particular, the thread structure may be accessed.
This happens to occur in data_abort(), which leads to the same problem
addressed by commit 2c10be9e06 ("arm64: Handle translation faults for
thread structures"). This commit fixes that directly, by disabling SSP
in data_abort() and a couple of related functions by using a function
attribute. It also moves the update of sp_el0 out of C code in case
the compiler decides to start checking the canary in pmap_switch()
someday.
A different solution might be to move the canary value to the PCB, which
currently lives on the kernel stack and isn't subject to the same
problem as thread structures (if only because guard pages inhibit
superpage promotion). However, there isn't any particular reason the
PCB has to live on the stack today; on amd64 it is embedded in struct
thread, reintroducing the same problem. Keeping the reference canary
value at the top of the stack is also rather dubious since it could be
clobbered by a sufficiently large stack overflow.
A third solution could be to go back to the approach of commit
5aa5420ff2, and modify UMA to use the direct map for thread structures
even if KASAN is enabled. But, transient promotions and demotions in
the direct map are possible too.
Reviewed by: alc, kib, andrew
MFC after: 1 month
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D37255
This reverts commit fe36346a89.
The arm64 Hyper-v code now checks it is running under Hyper-v before
calling into the hypervisor.
Sponsored by: The FreeBSD Foundation
The break-before-make requirement poses a problem when promoting or
demoting mappings containing thread structures: a CPU may raise a
translation fault while accessing curthread, and data_abort() accesses
the thread again before pmap_fault() can translate the address and
return.
Normally this isn't a problem because we have a hack to ensure that
slabs used by the thread zone are always accessed via the direct map,
where promotions and demotions are rare. However, this hack doesn't
work properly with UMA_MD_SMALL_ALLOC disabled, as is the case with
KASAN configured (since our KASAN implementation does not shadow the
direct map and so tries to force the use of the kernel map wherever
possible).
Fix the problem by modifying data_abort() to handle translation faults
in the kernel map without dereferencing "td", i.e., curthread, and
without enabling interrupts. pmap_klookup() has special handling for
translation faults which makes it safe to call in this context. Then,
revert the aforementioned hack.
Reviewed by: kevans, alc, kib, andrew
MFC after: 1 month
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D37231
Wake On LAN packets sent by wake(8) via BPF are lost if txcsum is
enabled. These fall into the "other protocol" case where gen_parse_tx
did nothing. Add code to shift up to gen_tx_hdr_min bytes of the
packet along with the Ethernet header in this case.
This allows the syscallname() function to give a usable result for Linux
ABIs.
Reported by: jrtc27
Reviewed by: jrtc27, markj, jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37199
Add a minimal implementation of cpu_ptrace() for arm64. It is only used to
get/set VFP registers for 32bits binaries, as it is apparently what we use
there, instead of the MI PT_GETFPREGS/PT_SETFPREGS.
PR: 267361
MFC After: 1 week
The cpu_dcache_wb_range function is an expensive function that is
unneeded in ddb. It is used when the cache needs to be written to RAM,
e.g. when working with a non-cache coherent device.
Remove it as cpu_icache_sync_range already has the needed d-cache
handling to ensure any changed memory is visible to the i-cache.
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37037
Mostly to document basic harware present on the platform; knowing that
Graviton exposes an ns8250 uart alone is quite helpful.
Reviewed by: andrew, imp, manu
Seems accurate: cperciva
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D36776
This is the last part for ARM64 Hyper-V enablement. This includes
commone files and make file changes to enable the ARM64 FreeBSD
guest on Hyper-V. With this patch, it should be able to build
the ARM64 image and install it on Hyper-V.
Reviewed by: emaste, andrew, whu
Tested by: Souradeep Chakrabarti <schakrabarti@microsoft.com>
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D36744
They are used when ASLR is not applied.
The need for adjusting is due to rtld direct exec mode puts ld-elf.so.1
at the PIE load address, and this address must not conflict with the
default linker' load address for non-PIE binaries. Otherwise rtld in
direct mode cannot activate image. Example of implicit failure is ldd(1)
refusing to run.
Reported by: kp
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D37085
New driver to ACPI generic event device, defined in ACPI spec.
Some ACPI power button may not work without this.
In qemu arm64 with "virt" machine, with ACPI firmware,
enable devd check devd message by
and invoke following command in qemu monitor
(qemu) system_powerdown
and make sure some power button input event appear.
(setting sysctl hw.acpi.power_button_state=S5 is not work,
because ACPI tree does not have \_S5 object.)
Reviewed by: andrew, hrs
Differential Revision: https://reviews.freebsd.org/D37032
This add BUS_GET_DEVICE_PATH interface,
which shows device tree of openfirm/fdt.
In qemu-system-arm64 with "virt" machine with device-tree firmware,
% devctl getpath OFW cpu0
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D37031
DPAA2 is a hardware-level networking architecture found in some NXP
SoCs which contain hardware blocks including Management Complex
(MC, a command interface to manipulate DPAA2 objects), Wire Rate I/O
processor (WRIOP, packets distribution, queuing, drop decisions),
Queues and Buffers Manager (QBMan, Rx/Tx queues control, Rx buffer
pools) and the others.
The Management Complex runs NXP-supplied firmware which provides DPAA2
objects as an abstraction layer over those blocks to simplify an
access to the underlying hardware. Each DPAA2 object has its own
driver (to perform an initialization at least) and will be visible
as a separate device in the device tree.
Two new drivers (dpaa2_mc and dpaa2_rc) act like firmware buses in
order to form a hierarchy of the DPAA2 devices:
acpiX (or simplebusX)
dpaa2_mcX
dpaa2_rcX
dpaa2_mcp0
...
dpaa2_mcpN
dpaa2_bpX
dpaa2_macX
dpaa2_io0
...
dpaa2_ioM
dpaa2_niX
dpaa2_mc is suppossed to be a root of the hierarchy, comes in ACPI
and FDT flavours and implements helper interfaces to allocate and
assign bus resources, MSI and "managed" DPAA2 devices (NXP treats some
of the objects as resources for the other DPAA2 objects to let them
function properly). Almost all of the DPAA2 objects are assigned to
the resource containers (dpaa2_rc) to implement isolation.
The initial implementation focuses on the DPAA2 network interface
to be operational. It is the most complex object in terms of
dependencies which uses I/O objects to transmit/receive packets.
Approved by: bz (mentor)
Tested by: manu, bz
MFC after: 3 months
Differential Revision: https://reviews.freebsd.org/D36638
These are 64-bit. Mark them as unsigned long so we don't rely on
undefined behaviour or shift a 32-bit value more than 32 bits.
Sponsored by: Innovate UK
Sponsored by: The FreeBSD Foundation
Add TCP_BLACKBOX to the remaining platforms (arm64, RISC-V) and add
TCP_RFC7413 to the remaining platform (RISC-V).
Reviewed by: rscheff@
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D36918
Add a <sys/_pv_entry.h> intended for use in <machine/pmap.h> to
define struct pv_entry, pv_chunk, and related macros and inline
functions.
Note that powerpc does not yet use this as while the mmu_radix pmap
in powerpc uses the new scheme (albeit with fewer PV entries in a
chunk than normal due to an used pv_pmap field in struct pv_entry),
the Book-E pmaps for powerpc use the older style PV entries without
chunks (and thus require the pv_pmap field).
Suggested by: kib
Reviewed by: kib
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D36685
Summary:
The indirect flag tells the hardware to use a flat or two level table.
As we only support using the flat table ensure the flag that marks
which is in use is set correctly.
We can't rely on this being set correctly as some firmware may set the
indirect flag, e.g. booting from LinuxBoot.
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D36873
static_kenv is only used under `FDT`, and `try_load_dtb` is only defined
with `FDT`.
Reviewed by: andrew, imp, manu
Differential Revision: https://reviews.freebsd.org/D36791
coresight_cpu_debug only has an FDT attachment, so let's not build it
for kernels without FDT.
coresight.h includes sys/malloc.h via header pollution
dev/ofw/openfirm.h; include it directly in case we're building without
FDT.
Reviewed by: andrew, manu
Differential Revision: https://reviews.freebsd.org/D36789
On systems with different CPUs we may print all the ID registers for
all CPUs. Reduce this to just print them when they change from the
previous CPU.
Sponsored by: The FreeBSD Foundation
As with amd64 use a per-domain pv chunk lock to reduce contention as
chunks get created and removed all the time.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D36307
As with amd64 batch chunk removal in pmap_remove_pages to move it out
of the pv list lock. This is one of the main contested locks when
running poudriere on a 160 core Ampere Altra server.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D36305
Reduce holes in pmap_bootstrap_state by moving freemempos after the
pointers as they are more likely to change size in any future ABI.
Sponsored by: The FreeBSD Foundation
Rework the pmap_bootstrap table generation so we can use it with
partially filled out page tables after the DMAP has been bootstrapped.
This allows it to be reused by the later bootstrap code.
Sponsored by: The FreeBSD Foundation
We'll only be redefining the various bus_* macros, not the definition of
struct bus_space.
Reviewed by: andrew
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D36718
The intent is to assert that either no mapping exists at the given VA,
or that the existing L1 block mapping maps the same PA.
Fixes: 36f1526a59 ("Add experimental 16k page support on arm64")
Reviewed by: alc
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D36698
strchr returns a pointer to the ',', so if the first option in the list
isn't available, we need to step over the , to look at the next
option. So if kern.cfg.order="acpi,fdt" and we have no acpi, we'd loop
forever with order=',fdt'.
Sponsored by: Netflix
Reviewed by: andrew, jhb
Differential Revision: https://reviews.freebsd.org/D36682
As with the GICv1/2 driver teach the GICv3 driver to translate memory
ranges of children. This allows us to create a common
bus_alloc_resource implementation for bot hACPI and FDT attachments.
Sponsored by: The FreeBSD Foundation
This matches the return type of pmap_mapdev/bios.
Reviewed by: kib, markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D36548
This will be used by bhyve to attach a virtual GIC driver.
Sponsored by: Innovate UK
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D36590
This merges TCA6416, TCA6408 drivers and adds PCA9555 support.
They handle 8 pin and 16 pin ICs with basic INPUT/OUTPUT functionality.
The register map is fairly similar so there is no point in having two
separate drivers.
Reviewed by: kd
Obtained from: Semihalf
Differential Revision: https://reviews.freebsd.org/D36559
This changes the default TCP Congestion Control (CC) to CUBIC.
For small, transactional exchanges (e.g. web objects <15kB), this
will not have a material effect. However, for long duration data
transfers, CUBIC allocates a slightly higher fraction of the
available bandwidth, when competing against NewReno CC.
Reviewed By: tuexen, mav, #transport, guest-ccui, emaste
Relnotes: Yes
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D36537
When attempting to promote 4KB user-space mappings to a 2MB user-space
mapping, the address of the struct vm_page representing the page table
page that contains the 4KB mappings is already known to the caller.
Pass that address to the promotion function rather than making the
promotion function recompute it, which on arm64 entails iteration over
the vm_phys_segs array by PHYS_TO_VM_PAGE(). And, while I'm here,
eliminate unnecessary arithmetic from the calculation of the first PTE's
address on arm64.
MFC after: 1 week
On boot we cache the length the 'dc zva' instruction will zero. Use
this in the memset function to decide when to use it. As the cached
value is in .bss it will be zero on boot so memset is safe to use
before the value has been read.
Sponsored by: The FreeBSD Foundation