This function is referenced, but never called from DRM2 code. Also,
real behavior of pmap_mapdev_attr() in ARM world is unclear as we don't
have any additional attribute for a device memory type.
MFC after: 2 weeks
machine/armreg.h requires access to the __ARM_ARCH macro, which is not
always properly defined (especially by gcc 4.2.1). We should include
sys/cdefs.h in order to get the definitions in machine/acle-compat.h,
which would properly define the __ARM_ARCH macro in these cases.
So, in cases where machine/armreg.h is included without _SYS_CDEFS_H_
being defined - generate an #error.
Reviewed by: andrew
Sponsored by: Smartcom - Bulgaria AD
Differential Revision: https://reviews.freebsd.org/D8460
* Rename ARM_HAVE_MP_EXTENSIONS to ARM_USE_MP_EXTENSIONS and extend it to
handle more cases, including when SMP is not enabled.
* Check ARM_USE_MP_EXTENSIONS when building for ARMv7+, even if no SMP.
* Use ARM_USE_MP_EXTENSIONS in pmap-v6.c to detect when to set PRRR_NS1.
With this we should be able to boot on all ARMv7+ Cortex-A cores with
32-bit support.
Reviewed by: mmel, imp (earlier version)
Relnotes: yes
Sponsored by: ABT Systems Ltd
Differential Revision: https://reviews.freebsd.org/D8335
hardware that supports the mp extensions. If so it should use the broadcast
tlb invalidate instructions as other CPUs or devices may need to know about
the invalidation.
To simplify the code have the compiler optimise out the else case when not
builing for Cortex-A8.
Sponsored by: ABT Systems Ltd
Differential Revision: https://reviews.freebsd.org/D8092
with no creative content. Include "lost" changes from git:
o Use /dev/efi instead of /dev/efidev
o Remove redundant NULL checks.
Submitted by: kib@, dim@, zbb@, emaste@
for later Cortex-A CPUs that support the Multiprocessor Extensions. This
will be needed to support both in a single GENERIC kernel while still
being able to only build for a single SoC.
Reviewed by: mmel
Relnotes: yes
Sponsored by: ABT Systems Ltd
Differential Revision: https://reviews.freebsd.org/D8138
On amd64, arm64 and i386, we have the possibility to switch between TLS
areas in userspace. The nice thing about this is that it makes it easier
to do light-weight threading, if we ever feel like doing that. On armv6,
let's go into the same direction by making it possible to safely use the
TPIDRURW register, which is intended for this purpose.
Clean up the ARMv6 code to remove md_tp entirely. Simply add a dedicated
field to the PCB to hold the value of TPIDRURW across context switches,
like we do for any other register. As userspace currently uses the
read-only TPIDRURO register, simply ensure that we keep both values in
sync where possible. The system calls for modifying the read-only
register will simply write the intended value into both registers, so
that it lazily ends up in the PCB during the next context switch.
Reviewed by: https://reviews.freebsd.org/D7951
Approved by: andrew
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D7951
Right now, userspace (fast) gettimeofday(2) on x86 only works for
RDTSC. For older machines, like Core2, where RDTSC is not C2/C3
invariant, and which fall to HPET hardware, this means that the call
has both the penalty of the syscall and of the uncached hw behind the
QPI or PCIe connection to the sought bridge. Nothing can me done
against the access latency, but the syscall overhead can be removed.
System already provides mappable /dev/hpetX devices, which gives
straight access to the HPET registers page.
Add yet another algorithm to the x86 'vdso' timehands. Libc is updated
to handle both RDTSC and HPET. For HPET, the index of the hpet device
to mmap is passed from kernel to userspace, index might be changed and
libc invalidates its mapping as needed.
Remove cpu_fill_vdso_timehands() KPI, instead require that
timecounters which can be used from userspace, to provide
tc_fill_vdso_timehands{,32}() methods. Merge i386 and amd64
libc/<arch>/sys/__vdso_gettc.c into one source file in the new
libc/x86/sys location. __vdso_gettc() internal interface is changed
to move timecounter algorithm detection into the MD code.
Measurements show that RDTSC even with the syscall overhead is faster
than userspace HPET access. But still, userspace HPET is three-four
times faster than syscall HPET on several Core2 and SandyBridge
machines.
Tested by: Howard Su <howard0su@gmail.com>
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
Differential revision: https://reviews.freebsd.org/D7473
fully-pessimized implementation that requires a type to be aligned to
its natural size.
On armv6+ the compiler might generate load-/store-multiple instructions
which require 4-byte alignment even though the source code is only
accessing individual uint32_t values in a way that doesn't require any
particular alignment at all. The compiler apparently feels free to
combine multiple accesses into a single instruction that requires a
more-strict alignment, and no set of compiler flags seems to disable
this behavior (at least in clang 3.8).
This fixes alignment faults on arm systems using wifi adapters. The
wifi code uses ALIGNED_POINTER(p, uint32_t) to decide whether it needs
to copy-align tcp headers. Because clang is combining several uint32_t
accesses into a single ldm instruction, we need to say that accessing a
uint32_t requires 4-byte alignment.
Approved by: re(gjb)
are no longer natural-alignment strict, there are still some restrictions.
FreeBSD network code assumes data is naturally-aligned or is running
on a platform with no restrictions; pointers are not annotated to
indicate the data pointed to may be packed or unaligned. The clang
optimizer can sometimes combine the load or store of a pair of adjacent
32-bit values into a single doubleword load/store, and that operation
requires at least 4-byte alignment. __NO_STRICT_ALIGNMENT can lead
to tcp headers being only 2-byte aligned.
Note that alignment faults remain disabled on armv6, this change reverts
only the defining of the symbol which leads to some overly-agressive code
shortcuts when building common/shared drivers and network code for arm.
Approved by: re(kib)
the exact CPU we are running on to set the cpu functions. Relax the check
to ignore the CPU revision. Even so this may still be too specific.
Reviewed by: mmel
Sponsored by: ABT Systems Ltd
Differential Revision: https://reviews.freebsd.org/D6504
- Reset debug architecture and enable monitor for secondary
CPUs in init_secondary() rather than when configuring watchpoint, etc.
- Disable HW debugging capabilities when one of the CPU cores fails
to set up.
- Use dbg_capable() in a more atomic manner to avoid any mismatch
between CPUs.
Differential Revision: https://reviews.freebsd.org/D6009
to match the new state of affairs. The hardware we support has always been
able to do unaligned accesses, we've just never enabled it until now.
This brings FreeBSD into line with all the other major OSes, and should help
with the growing volume of 3rd-party software that assumes unaligned access
will just work on armv6 and armv7.
have ACLE support built in. The ACLE (ARM C Language Extensions) defines
a set of standardized symbols which indicate the architecture version and
features available. ACLE support is built in to modern compilers (both
clang and gcc), but absent from gcc prior to 4.4.
ARM (the company) provides the acle-compat.h header file to define the
right symbols for older versions of gcc. Basically, acle-compat.h does
for arm about the same thing cdefs.h does for freebsd: defines
standardized macros that work no matter which compiler you use. If ARM
hadn't provided this file we would have ended up with a big #ifdef __arm__
section in cdefs.h with our own compatibility shims.
Remove #include <machine/acle-compat.h> from the zillion other places (an
ever-growing list) that it appears. Since style(9) requires sys/types.h
or sys/param.h early in the include list, and both of those lead to
including cdefs.h, only a couple special cases still need to include
acle-compat.h directly.
Loves it: imp
where possible. In the places that doesn't work (multi-line inline asm,
and places where the old armv4 cpufuncs mechanism is used), annotate the
accesses with a comment that includes SCTLR. Now a grep -i sctlr can find
all the system control register manipulations.
No functional changes.
compilers can emit arm instructions that require 8-byte alignment. The
alignment-sensitive instructions were added in armv5, which has to be
supported by our combined v4/v5 kernels, so the value is set uncoditionally
for all arm architecture versions.
Also adjust the comment to explain in more detail why the macros have the
form and values they do.
Per advice from bde@, maintain the unsignedness of the value of _ALIGNBYTES
(but do so using his second choice of allowing sizeof() to supply the
unsignedness, rather than just hardcoding '8U', which in my mind would
require an even more verbose comment to explain why it's right). Also
explain in the comment that the resulting type of _ALIGN() is equivelent
to uinptr_t on arm (32-bit unsigned int), but it's purposely spelled as
"unsigned" to avoid problems with including other header files. Even
including machine/_types.h to allow use of __uintptr_t causes compilation
failures because of this header being included (indirectly) in asm code.
The discussion that led to this change (albeit at a glacial pace) is at
https://lists.freebsd.org/pipermail/svn-src-head/2014-November/064593.html
implementations. Early in the boot the kernel will use an approximate,
however after the timer has been probed it will switch to a more accurate
implementation.
Reviewed by: manu
Sponsored by: ABT Systems Ltd
Differential Revision: https://reviews.freebsd.org/D5762
and R/W emulation aborts under pmap lock.
There were two reasons for using of atomic operations:
(1) the pmap code is based on i386 one where they are used,
(2) there was an idea that access and R/W emulation aborts should be
handled as quick as possible, without pmap locking.
However, the atomic operations in i386 pmap code are used only because
page table entries may be modified by hardware. At the beginning, we
were not sure that it's the only reason. So even if arm hardware does
not modify them, we did not risk to not use them at that time. Further,
it turns out after some testing that using of pmap lock for access and
R/W emulation aborts does not bring any extra cost and there was no
measurable difference. Thus, we have decided finally to use pmap lock
for all operations on page table entries and so, there is no reason for
atomic operations on them. This makes the code cleaner and safer.
This decision introduce a question if it's safe to use pmap lock for
access and R/W emulation aborts. Anyhow, there may happen two cases in
general:
(A) Aborts while the pmap lock is locked already - this should not
happen as pmap lock is not recursive. However, under pmap lock only
internal kernel data should be accessed and such data should be mapped
with A bit set and NM bit cleared. If double abort happens, then
a mapping of data which has caused it must be fixed.
(B) Aborts while another lock(s) is/are locked - this already can
happen. There is no difference here if it's either access or R/W
emulation abort, or if it's some other abort.
Reviewed by: kib
(PL1) and unprivileged (PL0) read/write access. As cp15 virtual to
physical address translation operations are used, interrupts must be
disabled to get consistent result when they are called.
These functions should be used only in very specific occasions like
during abort handling or kernel debugging. One of them is going to be
used in pmap_fault(). However, complete function set is added. It cost
nothing, as they are inlined.
While here, fix comment of #endif.
Reviewed by: kib