Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.
Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96
On arm64 use atomics. Then, both arm and arm64 do not need a critical
section around update. Replace all cpus loop by CPU_FOREACH().
This brings arm and arm64 counter(9) implementation closer to current
amd64, but being more RISC-y, arm* version cannot avoid atomics.
Reported by: Alexandre Martins <alexandre.martins@stormshield.eu>
Reviewed by: andrew
Tested by: Alexandre Martins, andrew
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
The types are for the byte offset and page index in vm object. They
are similar to off_t, which is defined as 64bit MI integer. Using MI
definitions will allow to provide consistent MD values of vm
object-related maximum sizes.
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
- Use new option SMP_ON_UP instead of (mis)using specific CPU type.
By this, any SMP kernel can be compiled with SMP_ON_UP support.
- Enable runtime detection of CPU multiprocessor extensions only
if SMP_ON_UP option is used. In other cases (pure SMP or UP),
statically compile only required variant.
- Don't leak multiprocessor instructions to UP kernel.
- Correctly handle data cache write back to point of unification.
DCCMVAU is supported on all armv7 cpus.
- For SMP_ON_UP kernels, detect proper TTB flags on runtime.
Differential Revision: https://reviews.freebsd.org/D9133
- Replace pcpu_find(curcpu) with get_pcpu(), which is much
more direct.
- Remove armv4 pcpu fields which I added in r286296 but never
needed to use.
- armv6 pc_qmap_addr was leftover from the old armv6 pmap
implementation. Rename it and put it to use in the new one.
Noted by: skra
Reviewed by: skra
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D9312
mappings for armv6 pmap zero and copy operations to the MD PCPU region.
Change sysmap initialization to only allocate KVA pages for CPUs that
are actually present.
While here, collapse CMAP3 into CMAP2 (their use was mutually exclusive
anyway) and "recover" some space in PCPU padding that has always been
available due to 64-byte cacheline padding.
Reviewed by: skra
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D9172
it into pmap-v4.h where they are used. Other than those few lines of
support for different MMU types, nothing in cpuconf.h has been used in our
code for quite a while.
The file existed to set up a variety of symbols to describe the
architecture. Over the past few years we have converted all of our source
to use the new architecture symbols standardized by ARM Inc, and predefined
by both clang and gcc.
PR: 216104
This function is referenced, but never called from DRM2 code. Also,
real behavior of pmap_mapdev_attr() in ARM world is unclear as we don't
have any additional attribute for a device memory type.
MFC after: 2 weeks
machine/armreg.h requires access to the __ARM_ARCH macro, which is not
always properly defined (especially by gcc 4.2.1). We should include
sys/cdefs.h in order to get the definitions in machine/acle-compat.h,
which would properly define the __ARM_ARCH macro in these cases.
So, in cases where machine/armreg.h is included without _SYS_CDEFS_H_
being defined - generate an #error.
Reviewed by: andrew
Sponsored by: Smartcom - Bulgaria AD
Differential Revision: https://reviews.freebsd.org/D8460
* Rename ARM_HAVE_MP_EXTENSIONS to ARM_USE_MP_EXTENSIONS and extend it to
handle more cases, including when SMP is not enabled.
* Check ARM_USE_MP_EXTENSIONS when building for ARMv7+, even if no SMP.
* Use ARM_USE_MP_EXTENSIONS in pmap-v6.c to detect when to set PRRR_NS1.
With this we should be able to boot on all ARMv7+ Cortex-A cores with
32-bit support.
Reviewed by: mmel, imp (earlier version)
Relnotes: yes
Sponsored by: ABT Systems Ltd
Differential Revision: https://reviews.freebsd.org/D8335
hardware that supports the mp extensions. If so it should use the broadcast
tlb invalidate instructions as other CPUs or devices may need to know about
the invalidation.
To simplify the code have the compiler optimise out the else case when not
builing for Cortex-A8.
Sponsored by: ABT Systems Ltd
Differential Revision: https://reviews.freebsd.org/D8092
with no creative content. Include "lost" changes from git:
o Use /dev/efi instead of /dev/efidev
o Remove redundant NULL checks.
Submitted by: kib@, dim@, zbb@, emaste@
for later Cortex-A CPUs that support the Multiprocessor Extensions. This
will be needed to support both in a single GENERIC kernel while still
being able to only build for a single SoC.
Reviewed by: mmel
Relnotes: yes
Sponsored by: ABT Systems Ltd
Differential Revision: https://reviews.freebsd.org/D8138
On amd64, arm64 and i386, we have the possibility to switch between TLS
areas in userspace. The nice thing about this is that it makes it easier
to do light-weight threading, if we ever feel like doing that. On armv6,
let's go into the same direction by making it possible to safely use the
TPIDRURW register, which is intended for this purpose.
Clean up the ARMv6 code to remove md_tp entirely. Simply add a dedicated
field to the PCB to hold the value of TPIDRURW across context switches,
like we do for any other register. As userspace currently uses the
read-only TPIDRURO register, simply ensure that we keep both values in
sync where possible. The system calls for modifying the read-only
register will simply write the intended value into both registers, so
that it lazily ends up in the PCB during the next context switch.
Reviewed by: https://reviews.freebsd.org/D7951
Approved by: andrew
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D7951
Right now, userspace (fast) gettimeofday(2) on x86 only works for
RDTSC. For older machines, like Core2, where RDTSC is not C2/C3
invariant, and which fall to HPET hardware, this means that the call
has both the penalty of the syscall and of the uncached hw behind the
QPI or PCIe connection to the sought bridge. Nothing can me done
against the access latency, but the syscall overhead can be removed.
System already provides mappable /dev/hpetX devices, which gives
straight access to the HPET registers page.
Add yet another algorithm to the x86 'vdso' timehands. Libc is updated
to handle both RDTSC and HPET. For HPET, the index of the hpet device
to mmap is passed from kernel to userspace, index might be changed and
libc invalidates its mapping as needed.
Remove cpu_fill_vdso_timehands() KPI, instead require that
timecounters which can be used from userspace, to provide
tc_fill_vdso_timehands{,32}() methods. Merge i386 and amd64
libc/<arch>/sys/__vdso_gettc.c into one source file in the new
libc/x86/sys location. __vdso_gettc() internal interface is changed
to move timecounter algorithm detection into the MD code.
Measurements show that RDTSC even with the syscall overhead is faster
than userspace HPET access. But still, userspace HPET is three-four
times faster than syscall HPET on several Core2 and SandyBridge
machines.
Tested by: Howard Su <howard0su@gmail.com>
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
Differential revision: https://reviews.freebsd.org/D7473
fully-pessimized implementation that requires a type to be aligned to
its natural size.
On armv6+ the compiler might generate load-/store-multiple instructions
which require 4-byte alignment even though the source code is only
accessing individual uint32_t values in a way that doesn't require any
particular alignment at all. The compiler apparently feels free to
combine multiple accesses into a single instruction that requires a
more-strict alignment, and no set of compiler flags seems to disable
this behavior (at least in clang 3.8).
This fixes alignment faults on arm systems using wifi adapters. The
wifi code uses ALIGNED_POINTER(p, uint32_t) to decide whether it needs
to copy-align tcp headers. Because clang is combining several uint32_t
accesses into a single ldm instruction, we need to say that accessing a
uint32_t requires 4-byte alignment.
Approved by: re(gjb)
are no longer natural-alignment strict, there are still some restrictions.
FreeBSD network code assumes data is naturally-aligned or is running
on a platform with no restrictions; pointers are not annotated to
indicate the data pointed to may be packed or unaligned. The clang
optimizer can sometimes combine the load or store of a pair of adjacent
32-bit values into a single doubleword load/store, and that operation
requires at least 4-byte alignment. __NO_STRICT_ALIGNMENT can lead
to tcp headers being only 2-byte aligned.
Note that alignment faults remain disabled on armv6, this change reverts
only the defining of the symbol which leads to some overly-agressive code
shortcuts when building common/shared drivers and network code for arm.
Approved by: re(kib)