Rework the pmap_bootstrap table generation so we can use it with
partially filled out page tables after the DMAP has been bootstrapped.
This allows it to be reused by the later bootstrap code.
Sponsored by: The FreeBSD Foundation
We'll only be redefining the various bus_* macros, not the definition of
struct bus_space.
Reviewed by: andrew
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D36718
The intent is to assert that either no mapping exists at the given VA,
or that the existing L1 block mapping maps the same PA.
Fixes: 36f1526a59 ("Add experimental 16k page support on arm64")
Reviewed by: alc
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D36698
strchr returns a pointer to the ',', so if the first option in the list
isn't available, we need to step over the , to look at the next
option. So if kern.cfg.order="acpi,fdt" and we have no acpi, we'd loop
forever with order=',fdt'.
Sponsored by: Netflix
Reviewed by: andrew, jhb
Differential Revision: https://reviews.freebsd.org/D36682
As with the GICv1/2 driver teach the GICv3 driver to translate memory
ranges of children. This allows us to create a common
bus_alloc_resource implementation for bot hACPI and FDT attachments.
Sponsored by: The FreeBSD Foundation
This matches the return type of pmap_mapdev/bios.
Reviewed by: kib, markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D36548
This will be used by bhyve to attach a virtual GIC driver.
Sponsored by: Innovate UK
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D36590
This merges TCA6416, TCA6408 drivers and adds PCA9555 support.
They handle 8 pin and 16 pin ICs with basic INPUT/OUTPUT functionality.
The register map is fairly similar so there is no point in having two
separate drivers.
Reviewed by: kd
Obtained from: Semihalf
Differential Revision: https://reviews.freebsd.org/D36559
This changes the default TCP Congestion Control (CC) to CUBIC.
For small, transactional exchanges (e.g. web objects <15kB), this
will not have a material effect. However, for long duration data
transfers, CUBIC allocates a slightly higher fraction of the
available bandwidth, when competing against NewReno CC.
Reviewed By: tuexen, mav, #transport, guest-ccui, emaste
Relnotes: Yes
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D36537
When attempting to promote 4KB user-space mappings to a 2MB user-space
mapping, the address of the struct vm_page representing the page table
page that contains the 4KB mappings is already known to the caller.
Pass that address to the promotion function rather than making the
promotion function recompute it, which on arm64 entails iteration over
the vm_phys_segs array by PHYS_TO_VM_PAGE(). And, while I'm here,
eliminate unnecessary arithmetic from the calculation of the first PTE's
address on arm64.
MFC after: 1 week
On boot we cache the length the 'dc zva' instruction will zero. Use
this in the memset function to decide when to use it. As the cached
value is in .bss it will be zero on boot so memset is safe to use
before the value has been read.
Sponsored by: The FreeBSD Foundation
Bring in the latest Arm Optimized Routines memcpy/memmove into the
arm64 kernel. As these functions have been merged in the current
version remove the now unneeded memmove.S.
Sponsored by: The FreeBSD Foundation
In 8db2e8fd16 ("Remove the secondary_stacks array in arm64 [...]"),
bootstacks was setup to be allocated dynamically. While this is
generally how x86 does it, it inadvertently shrunk each boot stack from
KSTACK_PAGES pages to a single page.
Resize these back up to the expected size using the kstack_pages
tunable, as we'll need larger stacks with upcoming sanitizer work.
Reviewed by: andrew, imp, markj
Fixes: 8db2e8fd16 ("Remove the secondary_stacks array [...]")
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D36475
Failure to map RSDP, XSLT and checksum failures are events that can't
happen unless something has gone wrong. As such, they should be reported
always, and not in bootverbose. This has been this way since it was
originally brought in to parse APIC tables.
Sponsored by: Netflix
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D36406
Add missing pmap_unmapbios() calls for when we return 0. Otherwise we
can leave the table mapped when it is of no use.
Sponsored by: Netflix
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D36405
arm64 requires ACPI RSDP Revision 2.0 since it requires 64-bit physical
addresses. It is an error worth reporting if we have a RSDP pointer, but
it points to the wrong version.
Sponsored by: Netflix
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D36404
The mpidr register is 64 bit on arm64 and 32 bit on arm. Fix this by
extending the arm64 definition to include the top 32 bits.
To preserve KBI when MFCing split the value into two 32 bit values.
This will be cleaned up later only on main.
Reviewed by: bz
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D36346
When the IDC flag is set in the cache type register we don't need to
clean the data cache to the point of unification. Previously we
supported this flag being set only when the DIC flags was also set.
Add a new handler for when this is not the case.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation, Ampere (hardware)
Differential Revision: https://reviews.freebsd.org/D36296
We don't use EFI_MEMORY_DESCRIPTOR that's typedef'd here. We use the one
from sys/efi.h instead. Remove the clutter here as these two are subtly
different (though wind up with the same layout due to alignment rules).
Sponsored by: Netflix
It supports gpio type checking. Depending on gpio type some
register addresses are different.
Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D36262
This permits inlining the comparisons even in the 16K page case.
Note that since PC_FREEN is -1, values can be compared efficiently
without having to fetch words of pc_freemask from memory via the
'cmn <reg>, #0x1' instruction.
Reviewed by: markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D36218
- Define PC_FREEL and _NPCM in terms of _NPCPV rather than via magic
numbers.
- Remove assertions about _NPC* values from pmap.c. This is less
relevant now that PC_FREEL and _NPCM are derived from _NPCPV.
- Add a helper inline function pc_is_full() which uses a loop to check
if pc_map is all zeroes. Use this to replace three places that
check for a full mask assuming there are only 3 entries in pc_map.
Reviewed by: markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D36217
We need to be careful to not promote or demote the memory containing
the per-CPU structures as the exception handlers will dereference it
so any time it's invalid may cause recursive exceptions.
Add a new pmap function to set a flag in the pte marking memory that
cannot be promoted or demoted and use it to mark pcpu memory.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35434
For RK356x platform, we can set bit 26 of DWC3_GUCTL1 register
for usb 2.0 device.
Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D36211
With clang 15, the following -Werror warning is produced:
sys/arm64/arm64/db_trace.c:53:23: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
db_md_list_watchpoints()
^
void
This is because db_md_list_watchpoints() is declared with a (void)
argument list, but defined with an empty argument list. Make the
definition match the declaration.
MFC after: 3 days
With clang 15, the following -Werror warning is produced:
sys/arm64/rockchip/rk_spi.c:229:6: error: variable 'cnt' set but not used [-Werror,-Wunused-but-set-variable]
int cnt = 0;
^
The 'cnt' variable was in rk_spi.c when it was first added, but it
appears to have been a debugging aid that has never been used, so remove
it.
MFC after: 3 days
Changing mode on a pin (input/output/pullup/pulldown) is a bit slow.
Improve this by caching what we can.
We need to check if the pin is in gpio mode, do that the first time
that we have a request for this pin and cache the result. We can't do
that at attach as we are a child of rk_pinctrl and it didn't finished
its attach then.
Cache also the flags specific to the pinctrl (pullup or pulldown) if the
pin is in input mode.
Cache the registers that deals with input/output mode and output value. Also
remove some register reads when we change the direction of a pin or when we
change the output value since the bit changed in the registers only affect output
pins.
Define PAGE_SIZE and PAGE_MASK based on PAGE_SHIFT. With this we only
need to set one value to change one value to change the page size.
While here remove the unused PAGE_MASK_* macros.
Sponsored by: The FreeBSD Foundation
Node names for gpio bank were made generic in Linux 5.16 so stop
using them to map the gpio controller to the pin controller bank unit.
Sponsored by: Beckhoff Automation GmbH & Co. KG
Make most AST handlers dynamically registered. This allows to have
subsystem-specific handler source located in the subsystem files,
instead of making subr_trap.c aware of it. For instance, signal
delivery code on return to userspace is now moved to kern_sig.c.
Also, it allows to have some handlers designated as the cleanup (kclear)
type, which are called both at AST and on thread/process exit. For
instance, ast(), exit1(), and NFS server no longer need to be aware
about UFS softdep processing.
The dynamic registration also allows third-party modules to register AST
handlers if needed. There is one caveat with loadable modules: the
code does not make any effort to ensure that the module is not unloaded
before all threads processed through AST handler in it. In fact, this
is already present behavior for hwpmc.ko and ufs.ko. I do not think it
is worth the efforts and the runtime overhead to try to fix it.
Reviewed by: markj
Tested by: emaste (arm64), pho
Discussed with: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D35888
Since IOMMU map entries store a reference to the domain in which they
reside, there is no need to pass the domain to iommu_gas_free_entry(),
iommu_gas_free_space(), and iommu_gas_free_region().
Push down the acquisition and release of the IOMMU domain lock into
iommu_gas_free_space() and iommu_gas_free_region().
Both of these changes allow for simplifications in the callers of the
functions without really complicating the functions themselves.
Moreover, the latter change eliminates the direct use of the IOMMU
domain lock from the x86-specific DMAR code.
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D35995
The devmap variants used vm_offset_t for some reason, and a few places
explicitly cast bus addresses to vm_offset_t. (Probably those casts
along with similar casts for vm_size_t should just be removed and
instead permit the compiler to DTRT.)
Reviewed by: markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D35961
Eliminate a possible case of use-after-free in an error handling path
after a mapping failure. Specifically, eliminate IOMMU_MAP_ENTRY_QI_NF
and instead perform the IOTLB invalidation synchronously. Otherwise,
when iommu_domain_unload_entry() is called and told not to free the
IOMMU map entry, the caller could free the entry before dmar_qi_task()
is finished with it.
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D35878
Add initial 16k page support on arm64. It is considered experimental,
with no guarantee of compatibility with a userspace or kernel modules
built with the current a 4k page size as code will likely try to pass
in a too small size when working with APIs that take a multiple of a
page, e.g. mmap.
As this is experimental, and because userspace and the kernel need to
have the PAGE_SIZE macro kept in sync there is no kernel option to
enable this. To test a new image should be built with the
PAGE_{SIZE,SHIFT,MASK} macros changed to the 16k versions.
There are currently known issues with loading modules from an old
loader as it can misalign them to load on a non-16k boundary.
Testing has shown good results in kernel workloads that allocate and
free large amounts of memory as only a quarter of the number of calls
into the VM subsystem are needed in the best case.
Reviewed by: markj
Tested by: gallatin
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34793
Store the shared page address in struct vmspace.
Also instead of storing absolute addresses of various shared page
segments save their offsets with respect to the shared page address.
This will be more useful when the shared page address is randomized.
Approved by: mw(mentor)
Sponsored by: Stormshield
Obtained from: Semihalf
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D35393
Use a getter macro instead of fetching the sigcode address directly
from a sysent of a given process. It assumes that the sigcode is stored
in the shared page, which is true in all cases, except for a.out
binaries. This will be later useful when the shared page address
randomization is introduced.
No functional change intended.
Approved by: mw(mentor)
Sponsored by: Stormshield
Obtained from: Semihalf
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D35392
For version 2 extend the TMUV2_TMSAR() write loop over all site_ids
registered for a particular SoC and actually use the site_id rather
than always just the first [0] (which for the LX2080 would be a
problem given there is no site0).
Later, while version 2 adds the SITEs to enable to TMSR in bits 0..<n>,
version 1 (e.g., LS1028, LS1046, LS1088) add MSITEs to TMR
bits 16..31 or rather 15..0(16-<n>). Adjust the loops to only enable
the site_ids listed for the particular SoC for monitoring. This now
also deals with sparse site_ids (not starting at 0, or not being
contiguous).
MFC after: 1 week
Sponsored by: Traverse Technologies (providing Ten64 HW for testing)
Reviewed by: mmel
Differential Revision: https://reviews.freebsd.org/D35764
Configure the number of sites (sensors) based on SoC.
This avoids timeouts reading non-existent sensors.
The changes are based on mmel's initial work at:
914e3f0098
MFC after: 1 week
Sponsored by: Traverse Technologies (providing Ten64 HW for testing)
Reviewed by: mmel
Differential Revision: https://reviews.freebsd.org/D35759
Not all gicv3 fdt children have a compatible property. Those that don't
are configuration data rather than something that should have a driver
attach.
Sponsored by: The FreeBSD Foundation
The two implementations for the pca9548 switch and the pca9547 mux
seemed close enough so we can put them together and with a bit more
abstraction add pca9540 support.
While here apply a bit of consistency in variable and driver naming and
use device_has_property instead of the FDT-only OF_ variant.
This disconnects pca9547 from the build but does not yet delete it.
MFC after: 2 weeks
Reviewed by: mmel (earlier version), avg
Differential Revision: https://reviews.freebsd.org/D35701
We should clean up on failure as it may panic the kernel later, e.g.
if we crate the rman, but fail to destroy it on attach faulure.
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35682
This is needed with some dtb files.
While here use a switch statement as the two options are mutually
exclusive in any iteration of the loop.
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35680
Add a driver for NXP LS1088a clockgen support which passes
configuration information to QorIQ clockgen class.
The implementaiton started off as copy of ls1028 support and was
adjusted accordingly.
Reviewed by: dgr_semihalf.com (earlier), mmel
MFC after: 1 week
Sponsored by: Traverse Technologies (providing Ten64 HW for testing)
Differential Revision: https://reviews.freebsd.org/D35617
arm64 wasn't updated to grab this from acpi.rsdp when x86 was
update. belatedly update the kernel to grab this information from the
preferred kenv.
Sponsored by: Netflix
Reviewed by: andrew, jhb
Differential Revision: https://reviews.freebsd.org/D35631
The field values are only valid when the ID_AA64PFR0_EL1.SVE or
ID_AA64PFR1_EL1.SME vields are non-zero. When this is not the case
the register is reserved as zero so is safe to read, but the SVEver
field will be incorrect so only print the decoded register when
the SVE or SME fields indicate it is valid.
Sponsored by: The FreeBSD Foundation
On arm64 all registers have a name that encodes op0, op1, CRn, CRm, and
op2 that are used to encode the register in the instruction. As some
registers we need to access may not be supportedby older compilers, or
are only supported when specific extensions are enabled support this
alternative form.
Sponsored by: The FreeBSD Foundation
To keep the vfp thread creation code in one place move into vfp.c. This
will also help with adding SVE support as it depends on VFP.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35615
Add support of ARM CMN-600 controller, PMU access functions only.
Add support of PMU counters of ARM CMN-600 controller.
Reviewed by: mhorne
Sponsored By: ARM
Differential Revision: https://reviews.freebsd.org/D32321
On arm64, testing pc_curpcb != NULL is not correct since pc_curpcb is
set in pmap_switch() while the bootstrap stack is still in use. As a
result, smp_after_idle_runnable() can free the boot stack prematurely.
Take a different approach: use smp_rendezvous() to wait for all APs to
acknowledge an interrupt. Since APs must not enable interrupts until
they've entered the scheduler, i.e., switched off the boot stack, this
provides the right guarantee without depending as much on the
implementation of cpu_throw(). And, this approach applies to all
platforms, so convert x86 and riscv as well.
Reported by: mmel
Tested by: mmel
Reviewed by: kib
Fixes: 8db2e8fd16 ("Remove the secondary_stacks array in arm64 and riscv kernels.")
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35435
Let the caller to iommu_map pass the size parameter without rounding
it up to a multiple of page size. Let iommu_map round it up when
necessary, which is not all of the time, so that in some cases less
space is reserved.
Reviewed by: alc, kib (previous version)
Tested by: pho, br
Discussed with: andrew
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D35424
Summary:
It can be useful to see a summary of CPU caches on bootup. This is done
for most platforms already, so add this to arm64, in the form of (taken
from Apple M1 pro test):
L1 cache: 192KB (instruction), 128KB (data)
L2 cache: 12288KB (unified)
This is printed out per-CPU, only under bootverbose.
Future refinements could instead determine if a cache level is shared
with other cores (L2 is shared among cores on some SoCs, for instance),
and perform a better calculation to the full true cache sizes. For
instance, it's known that the M1 pro, on which this test was done, has 2
12MB L2 clusters, for a total of 24MB. Seeing each CPU with 12288KB L2
would make one think that there's 12MB * NCPUs, for possibly 120MB
cache, which is incorrect.
Sponsored by: Juniper Networks, Inc.
Reviewed by: #arm64, andrew
Differential Revision: https://reviews.freebsd.org/D35366
On AArch64, registers x9-x18 are not callee-saved, yet they are
preserved at many placed in swtch.S. This patch removes code that
preserves these registers.
Set the Common not Private bit in the ttbr registers when supported on
arm64. This tells the hardware it can share the translation table
entries on multiple CPUs.
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation
To enable it user-space needs to call feenableexcept().
FPE_FLTIDO has been added as the IDF bit can't be mapped to any existing
FPE code.
Reviewed by: andrew@
Differential revision: https://reviews.freebsd.org/D35247
MFC after: 2 weeks
As with atomic(9) use the ARMv8.1 Large System Extension atomic
instructions to implement the userspace compare and swap functions.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35234
This is needed to support non-PCI devices like memory-mapped
display controllers.
Split-out some initialization code from iommu_ctx_alloc() into
iommu_ctx_init() method so we could pass controller's MD-data
obtained from DTS to the driver prior to a CTX initialization.
Tested on Morello SoC.
Sponsored by: UKRI
For PCI devices we have entire L1 descriptor for every session ID (SID),
but for non-PCI (e.g. Display Processing Unit DPU), a single L1
descriptor serves multiple SIDs.
So prevent re-initialization of L1 descriptor if already initialized.
Don't free entire L1 descriptor on every STE removal.
Sponsored by: UKRI
The implemenation differs from others Linuxulators.
For unwinders Linux ucontext_t is stored, however native machine context
is used to store/restore process state to avoid code duplication.
As DWARF Aarch64 does not define a register number for PC and provides no
direct way to encode the PC of the previous frame, CFI cannot describe a
signal trampoline frame. So, modified the vdso linker script to discard
unused sections.
Extensions are not implemented.
MFC after: 2 weeks
Rework the defintion of struct siginfo so that the array padding
struct siginfo to SI_MAX_SIZE can be placed in a union along side of the
rest of the struct siginfo members. The result is that we no longer need
the __ARCH_SI_PREAMBLE_SIZE or SI_PAD_SIZE definitions.
Move struct siginfo definition under /compat/linux to reduce MD part.
To avoid headers polution include linux_siginfo.h in the MD linux.h
MFC after: 2 weeks
The signal trampoine-related definitions are used only in the MD part
of code, wherefore moved from everywhere used linux.h to separate MD
headers.
MFC after: 2 weeks
It is unused, especially now that the underlying d_dumper methods do not
accept the argument.
Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D35174
An earlier stage may have set HCR_EL2.E2H, the clearing of which may
break address translation. We don't need the EL2 MMU at this point, so
we can avoid re-enabling it for now and just drop to EL1 as usual.
Suggested by: andrew
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D34644
The code for the "shift" block in the COPY macro set the pointer for
the next copy block to the wrong value. In this case, the link-layer
header would be overwritten by the network-layer header. This case is
difficult or impossible to exercise in the current driver without
changing the value of the hw.genet.tx_hdr_min sysctl. Correct the
pointer. While here, remove a line in the macro that was marked
"unneeded", which was actually wrong.
PR: 263824
Submitted by: jiahali@blackberry.com
MFC after: 2 weeks
One of the SMMU interrupt lines (priq) is optional and may be ommited in FDT.
Tested on ARM Morello Board, which has three SMMU units: first two have four
interrupt lines, last one has three interrupt lines.
Sponsored by: UKRI
On i386 are two semtimedop. The old one is called via multiplexor and
uses 32-bit timespec, and new semtimedop_tim64, which is uses 64-bit
timespec.
MFC after: 2 weeks
As the Linux semop syscall is not defined in i386, and as it is equal
to the native semop syscall, call it directly.
Fix semop definition to match Linux actual one - nsops is size_t type.
MFC after: 2 weeks
When we try to load these tables via acpidump(8) we need them to be in
the DMAP for /dev/mem to access. Add the EFI ACPI reclaim memory type
to the list of memory we map into DMAP but not used by the kernel as
this is the recommended place to put these.
Sponsored by: The FreeBSD Foundation
Deduplicate code to iterate over the bpages list in a bus_dmamap_t
freeing bounce pages during bus_dmamap_unload.
Reviewed by: imp
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D34967
The genet interface did not resume operation correctly after doing
ifconfig down then up. The down/reset procedure did not clear the
RUNNING flag, and did not reset enough of the hardware state. This
patch is modeled on OpenBSD code, with a call to gen_reset added
to reset the controller completely. Regularize the parameter to
gen_dma_disable() while here.
PR: 263091
Submitted by: jiahali@blackberry.com
All supported compilers (modern versions of GCC and clang) support
this.
Many places didn't have an #else so would just silently do the wrong
thing. Ancient versions of icc (the original motivation for this) are
no longer a compiler FreeBSD supports.
PR: 263102 (exp-run)
Reviewed by: brooks, imp
Differential Revision: https://reviews.freebsd.org/D34797
On some hardware the EFI system table is not in memory mapped in the
DMAP section. Rather than panic the kernel check if it is mapped and
return a failure if not from efi_phys_to_kva.
Reported by: kevans
Differential Revision: https://reviews.freebsd.org/D34858
Some UEFI implementations place the system table in a runtime code
memory region. Include it in the DMAP so we can read it later.
Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D34861
In the arm64 busdma we have an internal flag to signal when a tag is
for a cache-coherent device. In this case we don't need to adjust the
size and alignment of allocated buffers to be within a cache line.
The cache line adjustment was incorrectly using the coherent flag
passed in to bus_dma_tag_create and not the internal flag. Fix it to
use the latter to reduce the memory usage slightly.
Reviewed by: bz
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34763
To simplify the creation of the direct map (DMAP) region on arm64 move
it from the pre-C code into pmap. This simplifies the DMAP creation
as we can use the notmal index macros, and should reduce the number
of pages needed to hold the level 1 tables to just those needed.
Reviewed by: alc, dch
Tested by: dch, kevans
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34568
This more clearly differentiates system call arguments from integer
registers and return values. On current architectures it has no effect,
but on architectures where pointers are not integers (CHERI) and may
not even share registers (CHERI-MIPS) it is necessiary to differentiate
between system call arguments (syscallarg_t) and integer register values
(register_t).
Obtained from: CheriBSD
Reviewed by: imp, kib
Differential Revision: https://reviews.freebsd.org/D33780
On arm64 we can ask the hardware to perform cache operations from
userspace. These require read permission however when the memory is
unmapped the kernel will receive a write exception. Add a check to
see if the cause of the exception is from the cache and pass a memory
read fault type to the vm subsystem.
PR: 262836
Reported by: dch
Sponsored by: The FreeBSD Foundation
Following ARMARM sec D5.2.11, which says:
> Where an instruction results in an update to a System register,
> as is the case with the AT * address translation instructions,
> explicit synchronization must be performed before the result is
> guaranteed to be visible to subsequent direct reads of the
> PAR_EL1.
Reviewed By: andrew
MFC after: 3 weeks
Sponsored by: Ampere Computing
Differential Revision: https://reviews.freebsd.org/D34665
Bits 43:0 of the TLBI operand are bits 55:12 of the VA. Leaving
bits 63:55 of the VA in bits 51:44 of the operand might wind up
setting the TTL field (47:44) and accidentally restricting which
translation levels are flushed in the TLB.
Reviewed By: andrew
MFC after: 3 days
Sponsored by: Ampere Computing
Differential Revision: https://reviews.freebsd.org/D34664
This register set exposes the per-thread TLS register. It matches the
layout used by Linux on arm64. Linux does not implement this note for
32-bit arm.
Reviewed by: andrew, markj
Sponsored by: University of Cambridge, Google, Inc.
Differential Revision: https://reviews.freebsd.org/D34595
This includes adding support for NT_ARM_VFP for 32-bit binaries
running under aarch64 kernels both for ptrace(), and coredumps via the
kernel and gcore.
Reviewed by: andrew, markj
Sponsored by: University of Cambridge, Google, Inc.
Differential Revision: https://reviews.freebsd.org/D34448
Similar to fill_fpregs(), only invoke vfp_save_state() for curthread.
While here, zero the buffer if FP hasn't been started to avoid leaking
kernel stack memory.
Reviewed by: andrew, markj
Sponsored by: University of Cambridge, Google, Inc.
Differential Revision: https://reviews.freebsd.org/D34525
It's unneeded as it was just used to align KERNBASE to a level 2
block start address. KERNBASE was already aligned correctly.
Sponsored by: The FreeBSD Foundation
When the page size the kernel is built for is not the same as
EFI_PAGE_SIZE we need to increment the page index at a faster rate.
Add this adjustment to the arm64 EFIRT support in preperation for
experimental 16k PAGE_SIZE support.
Sponsored by: The FreeBSD Foundation
To support cc -pg on arm64 we need to implement .mcount. As clang and
gcc think it is function like it just needs to load the arguments
to _mcount and call it.
On gcc the first argument is passed in x0, however this is missing on
clang so we need to load it from the stack. As it's the caller return
address this will be at a known location.
PR: 262709
Reviewed by: emaste (earlier version)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34634
The ITS is defined to be disabled on a warm reset, but we may be coming
in via another kernel/hypervisor type setup where the ITS has been
previously configured then relinquished to the next kernel in the chain.
If it's enabled, the later configuration of GITS_BASER will almost
certainly fail -- clear it to prevent that.
Reviewed by: andrew
Sponsored by: Ampere Computing
Submitted by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D34546
When using 16k or 64k pages atop will shift the address by more than
the needed amount for a tlbi instruction. Replace this with a new macro
to shift the address by 12 and use PAGE_SIZE in the for loop to let the
code work with any page size.
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34516
To allow for a future 16k or 64k page size we need to tell libkvm which
is being used. Add a flag field in unused space in minidumphdr and use
it to signal between the different options.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34548
When moving from the l1 index to l0 index we need to use the l1 shift
value not the l0 shift value. With 4k pages they are identical, however
with 16k pages we only have 2 l0 entries so the shift value is incorrect.
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34517
In particular, use a generic wrapper around struct regset rather than
requiring per-regset helpers. This helper replaces the MI
__elfN(note_prstatus) and __elfN(note_fpregset) helpers. It also
removes the need to explicitly dump NT_ARM_ADDR_MASK in the arm64
__elfN(dump_thread).
Reviewed by: markj, emaste
Sponsored by: University of Cambridge, Google, Inc.
Differential Revision: https://reviews.freebsd.org/D34446
When creating the DMAP region we may need to create level 2 page table
entries at the start and end of a block of memory. The code to do this
was almost identical so we can merge into a single function.
Sponsored by: The FreeBSD Foundation
They are in a different order to the TCR_TG1 values but appear to have
been copied incorrectly.
While here use TCR_TG0_4K in locore.S to make it explicit the userspace
page size is 4K.
Sponsored by: The FreeBSD Foundation
We assume EFI_PAGE_SIZE is the same as PAGE_SIZE, however this may not
be the case. Use the former when working with a list of pages from the
UEFI firmware so the correct size is used.
This will be needed on arm64 where PAGE_SIZE could be 16k or 64k in the
future. The other architectures have been updated to be consistent.
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34510
We assume the pointer returned from get_pcpu will be consistent even
if the thread is moved to a new CPU. Fix this by partially reverting
63c858a04d to make get_pcpu a function again.
Sponsored by: The FreeBSD Foundation
The arm64 unknown exception will be raised when we execute an
instruction that id invalid or disabled. To help debug these print
the instruction that failed.
Sponsored by: The FreeBSD Foundation
To help with switching to a vdso sigtramp switch to passing through the
sigcode function when entering a signal. This ensures the return address
is within the function.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33690
We need to not include the MI _san.h files when builing some parts of
the kernel. Fix these checks in the arm64 header files.
Sponsored by: The FreeBSD Foundation
This can be used by debuggers to find which bits in a virtual address
should be masked off to get a canonical address. This is currently used
by the Pointer Authentication Code support to get its mask. It could also
be used if we support Top Byte Ignore for the same purpose.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34302
Use the new ofw_iicbus_set_devinfo() method to implant an OFW compatibility
string for a manually created RTC sub-device.
MFC after: 4 weeks
Reported by: archimedes.gaviola_at_gmail.com
bscott_at_bunyatech.com.au
Add a static assert for the siginfo{,32}_t, mcontext{,32}_t and
ucontext{,32}_t sizes. These are de-facto ABI options and cannot change
size ever.
Reviewed by: kib, andrew, jhb
Differential Revision: https://reviews.freebsd.org/D32958
We should clear the single step flag when entering a signal hander and
set it when returning. This fixes the ptrace__PT_STEP_with_signal test.
While here add support for userspace to set the single step bit as on
x86. This can be used by userspace for self tracing.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34170
When debugging 32-bit programs a debugger may insert a instruction that
will raise the undefined instruction trap. The kernel handles these
by raising a SIGTRAP, however the code was incorrect.
Fix this by using the expected TRAP_BRKPT signal code.
Sponsored by: The FreeBSD Foundation
While here clean up the names for the naming convention of the other
registers in this file.
Reviewed by: kib, mhorne (earlier version)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34060
This allows entering the debugger at the earliest possible time, if
the '-d' argument is passed to the kernel.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D34120
This adds the PT_GETREGSET and PT_SETREGSET ptrace types. These can be
used to access all the registers from a specified core dump note type.
The NT_PRSTATUS and NT_FPREGSET notes are initially supported. Other
machine-dependant types are expected to be added in the future.
The ptrace addr points to a struct iovec pointing at memory to hold the
registers along with its length. On success the length in the iovec is
updated to tell userspace the actual length the kernel wrote or, if the
base address is NULL, the length the kernel would have written.
Because the data field is an int the arguments are backwards when
compared to the Linux PTRACE_GETREGSET call.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D19831
ASLR stack randomization will reappear in a forthcoming commit. Rather
than inserting a random gap into the stack mapping, the entire stack
mapping itself will be randomized in the same way that other mappings
are when ASLR is enabled.
No functional change intended, as the stack gap implementation is
currently disabled by default.
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33704
Rather than fetching the ps_strings address directly from a process'
sysentvec, use this macro. With stack address randomization the
ps_strings address is no longer fixed.
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33704
The size of the ps_strings structure varies between ABIs, so this is
useful for computing the address of the ps_strings structure relative to
the top of the stack when stack address randomization is enabled.
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33704
Missed issues in truss on at least armv7 and powerpcspe need to be
resolved before recommit.
This reverts commit 3889fb8af0.
This reverts commit 1544e0f5d1.
This more clearly differentiates system call arguments from integer
registers and return values. On current architectures it has no effect,
but on architectures where pointers are not integers (CHERI) and may
not even share registers (CHERI-MIPS) it is necessiary to differentiate
between system call arguments (syscallarg_t) and integer register values
(register_t).
Obtained from: CheriBSD
Reviewed by: imp, kib
Differential Revision: https://reviews.freebsd.org/D33780
Pointer authentication allows userspace to add instructions to insert
a Pointer Authentication Code (PAC) into a register based on an address
and modifier and check if the PAC is correct. If the check fails it will
either return an invalid address or fault to the kernel.
As many of these instructions are a NOP when disabled and in earlier
revisions of the architecture this can be used, for example, to sign
the return address before pushing it to the stack making Return-oriented
programming (ROP) attack more difficult on hardware that supports them.
The kernel manages five 128 bit signing keys: 2 instruction keys, 2 data
keys, and a generic key. The instructions then use one of these when
signing the registers. Instructions that use the first four store the
PAC in the register being signed, however the instructions that use the
generic key store the PAC in a separate register.
Currently all userspace threads share all the keys within a process
with a new set of userspace keys being generated when executing a new
process. This means a forked child will share its keys with its parent
until it calls an appropriate exec system call.
In the kernel we allow the use of one of the instruction keys, the ia
key. This will be used to sign return addresses in function calls.
Unlike userspace each kernel thread has its own randomly generated.
Thread0 has a static key as does the early code on secondary CPUs.
This should be safe as there is minimal user interaction with these
threads, however we could generate random keys when the Armv8.5
Random number generation instructions are present.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31261
We have 32 instructions in each exception vector on arm64. Previously
only one was used to branch to the handler function. We can split the
start of these functions and move some of the instructions into the
vectors.
Reviewed by: kib, markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33751
- Move busdma_lock_mutex to subr_bus_dma.c.
- Move _busdma_lock_dflt to subr_bus_dma.c. This function was named a
couple of different things previously. It is not a public API but
an internal helper used in place of a NULL pointer. The prototype
is in <sys/bus_dma.h> as not all backends include
<sys/bus_dma_internal.h>.
Reviewed by: kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D33694
Move mostly duplicated code in various MD bus_dma backends to support
bounce pages into sys/kern/subr_busdma_bounce.c. This file is
currently #include'd into the backends rather than compiled standalone
since it requires access to internal members of opaque bus_dma
structures such as bus_dmamap_t and bus_dma_tag_t.
Reviewed by: kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D33684
Add driver for Marvell, Armada-37xx peripheral clock.
Register clocks for various peripheral devices in
north bridge or south bridge domain. Dump clock's
domain while verbose boot.
Reviewed by:
Obtained from: Semihalf
Differential revision: https://reviews.freebsd.org/D32294
Driver for tbg clocks. Read reference frequency from parent
and modify it depending on parameters read from register.
Reviewed by: manu
Obtained from: Semihalf
Differential revision: https://reviews.freebsd.org/D32293
Driver registers new clock device. Clock frequency is set depending
on tenth bit's value obtained from syscon register. Full information
about the clock is dumped if bootverbose is enabled.
Driver was tested on EspressoBin.
Reviewed by: manu
Obtained from: Semihalf
Differential revision: https://reviews.freebsd.org/D32292
A feature of arm64's instruction for TLB invalidation is the ability
to determine whether cached intermediate entries, i.e., L{0,1,2}_TABLE
entries, are invalidated in addition to the final entry, e.g., an
L3_PAGE entry.
Update pmap_invalidate_{page,range}() to support both types of
invalidation, allowing the caller to determine which type of
invalidation is performed.
Update the callers to request the appropriate type of invalidation.
Eliminate redundant TLB invalidations in pmap_abort_ptp() and
pmap_remove_l3_range().
Add a comment to pmap_invalidate_all() making clear that it always
invalidates entries at all levels.
As expected, these changes result in a tiny yet measurable
performance improvement.
Reviewed by: kib, markj
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D33705
Currently, any errors when adding a PIC child handler are ignored,
instead just continuing on to registering that PIC as an MSI, and
ignoring any errors that occur for that too.
Reviewed by: andrew
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D33342
A recent change introduced a one-off error into a test allowing
coalescing chunks into segments. This fixes that error.
broke a check in _bus_dmamap_addseg on many architectures. This change makes it clear that it is not a particular range that is being boundary-checked, but the proposed union of the two adjacent ranges.
Reported by: se
Reviewed by: se
Fixes: c606ab59e7 vm_extern: use standard address checkers everywhere
Differential Revision: https://reviews.freebsd.org/D33715
Simplify control flow around handling of the execpath length and signal
trampoline. Cache the sysentvec pointer in a local variable.
No functional change intended.
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33703
Define simple functions for alignment and boundary checks and use them
everywhere instead of having slightly different implementations
scattered about. Define them in vm_extern.h and use them where
possible where vm_extern.h is included.
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D33685
When a DMA request using bounce pages completes, a swi is triggered to
schedule pending DMA requests using the just-freed bounce pages. For
a long time this bus_dma swi has been tied to a "virtual memory" swi
(swi_vm). However, all of the swi_vm implementations are the same and
consist of checking a flag (busdma_swi_pending) which is always true
and if set calling busdma_swi. I suspect this dates back to the
pre-SMPng days and that the intention was for swi_vm to serve as a
mux. However, in the current scheme there's no need for the mux.
Instead, remove swi_vm and vm_ih. Each bus_dma implementation that
uses bounce pages is responsible for creating its own swi (busdma_ih)
which it now schedules directly. This swi invokes busdma_swi directly
removing the need for busdma_swi_pending.
One consequence is that the swi now works on RISC-V which had previously
failed to invoke busdma_swi from swi_vm.
Reviewed by: imp, kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D33447
The current implementations never correctly return TRUE. In all cases,
when they currently return TRUE, they should have returned FALSE. And,
in some cases, when they currently return FALSE, they should have
returned TRUE. Except for its effects on performance, specifically,
additional page faults and pointless calls to pmap_enter_quick() that
abort, this error is harmless. That is why it has gone unnoticed.
Add a comment to the amd64, arm64, and riscv implementations
describing how their return values are computed.
Reviewed by: kib, markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D33659
This will be used by the vdso signal trampoline on arm64.
While here fix the license as this part of locore.S to correct the
copyright owner.
Sponsored by: The FreeBSD Foundation
Use pmap_pte_exists() instead of pmap_pte() when the caller expects a
mapping to exist at a particular level. The caller benefits in two
ways from using pmap_pte_exists(). First, because the level is
specified to pmap_pte_exists() as a constant, rather than returned, the
compiler can specialize the implementation of pmap_pte_exists() to the
caller's exact needs, i.e., generate fewer instructions. Consequently,
within a GENERIC-NODEBUG kernel, 704 bytes worth of instructions are
eliminated from the inner loops of various pmap functions. Second,
suppose that the mapping doesn't exist. Rather than requiring every
caller to implement its own KASSERT()s to report missing mappings, the
caller can optionally have pmap_pte_exists() provide the KASSERT().
Reviewed by: andrew, kib
Tested by: andrew (an earlier version)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D33597
We only need to include sys/_atomic_subword.h on arm64 to provide
atomic_testandset_acq_long. Add an implementation in the arm64 atomic.h
based on the existing atomic_testandset macro.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33587
When recursing in pmap_change_props_locked we may fail because there is
no pte. This shouldn't be considered a fail as it may happen in a few
cases, e.g. there are multiple normal memory ranges with device memory
between them.
Reported by: cperciva
Tested by: cperciva
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33459
Notably, the current compat_options only makes sense for native and
freebsd32 ABIs. For the others, it just adds cruft. Switch to having
sets of compat options, and default to the native set. Setup the other
ABIs where it doesn't make sense to opt-out of the native set.
This removes some redundant COMPAT_FREEBSD* stuff from Linuxolator bits.
line_expr in makesyscalls.lua is fixed to allow empty strings to be
specified, since they're harmless.
Reviewed by: brooks, kib (both earlier version)
Differential Revision: https://reviews.freebsd.org/D33356
- maximum number of bytes that can be sent is 32, not 8;
- previous interface required callers to bump sc->msg->len in addition
to setting sc->tx_slave_addr;
- because of the above there was an issue with writing one too many bytes
because sc->cnt is not advanced when the slave address is written;
- the inetraction between outer and inner loops was confusing as the former
was bounded on the number of bytes to write and the counter was
incremented by one, but the inner loop advanced four bytes at a time;
- the return value was incorrect in the tx_slave_addr case; one call place
had to use its own (and incorrect in some cases) notion of the write
lenth.
All of the above issues should be fixed.
Some sanity asserts are added.
All callers use the return value to program RK_I2C_MTXCNT.
iic_msg::len no longer needs to be hacked.
A constant is added to reflect the maximum number of octets that can be
sent or received in one go (they are the same).
MFC after: 1 week
No need to mask a uint8_t with 0xff, the mask covers the whole type.
Explcitly cast to uint32_t before bit shifting instead of relying on
the implicit promotion to signed int.
MFC after: 1 week
Previously the driver would happily talk to addresses with no device
returning some garbage for reads and sending bits into the void for writes.
MFC after: 1 week
Previously the code would decalre the transfer complete after sending
first 31 bytes (plus the slave address) of a larger I2C write transfer.
That was tested using a large write to an EEPROM with 32-byte write page
size and a 2-byte address type. Such a transaction needed to send 34
bytes, 2 bytes for an offset and 32 bytes of actual data.
MFC after: 1 week