651 Commits

Author SHA1 Message Date
Svatopluk Kraus
36fb9d5fc8 Fix comment about unpriviledged instructions. Now, it matches with
current state after r289372.

While here, do some style and comment cleanups.  No functional changes.

Approved by:	kib (mentor)
2015-11-04 15:35:22 +00:00
Ian Lepore
0c1daec859 Eliminate the last dregs of the old global arm_root_dma_tag.
In the old days, device drivers passed NULL for the parent tag when creating
a new tag, and on arm platforms that resulted in a global tag representing
overall platform constraints being substituted in the busdma code.  Now all
drivers use bus_get_dma_tag() and if there is a need to represent overall
platform constraints they will be inherited from a tag supplied by nexus or
some bus driver in the hierarchy.

The only arm platforms still relying on the old global-tag scheme were some
xscale boards with special PCI-bus constraints.  This change provides those
constraints through a tag supplied by the xscale PCI bus driver, and
eliminates the few remaining references to the old global var.

Reviewed by:	cognet
2015-11-02 22:49:39 +00:00
Zbigniew Bodek
232e189a56 Add support for branch instruction on armv7 with ptrace single step
Previous code supported only "continuous" code without any kind of
branch instructions. To change that, new function was implemented
which parses current instruction and returns an addres where
the jump might happen (alternative addr).
mdthread structure was extended to support two breakpoints
(one directly below current instruction and the second placed
at the alternative location).
One of them must trigger regardless the instruction has or has not been
executed due to condition field.
Upon cleanup, both software breakpoints are removed.

This implementation parses only the most common instructions
that are present in the code (like 99.99% of all), but there
is a chance there are some left, not covered by the parsing routine.
Parsing is done only for 32-bit instruction, no Thumb nor Thumb-2
support is provided.

Reviewed by:   kib
Submitted by:  Wojciech Macek <wma@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Juniper Networks Inc.
Differential Revision: https://reviews.freebsd.org/D4021
2015-11-02 16:56:34 +00:00
Oleksandr Tymoshenko
0265aa0a15 Treat synchronous VFP exception just like aynchronous: as an FP exception,
not as illegal instruction
2015-11-01 21:59:56 +00:00
Jason A. Harmening
d58d7ad4a8 Retire pmap_dmap_iscurrent(). It is only a wrapper around pmap_is_current(), and is no longer called. 2015-10-28 21:17:38 +00:00
Ian Lepore
ce6ce41cac Provide armv4/v5 implementations of several of the armv6 cache maintenance
functions.  This will make it possible to use the same busdma code for all
arm platforms v4 thru v7.
2015-10-24 21:25:53 +00:00
Ian Lepore
d818f2b6b9 Rename dcache_dma_preread() to dcache_inv_poc_dma() to make it clear that it
is a dcache invalidate to point of coherency just like dcache_inv_poc(), but
a slightly different version specific to dma operations.  Elaborate the
comment about how and why it's different.
2015-10-24 19:39:41 +00:00
Jason A. Harmening
0820c78e93 Use pmap_quick* functions in armv6 busdma, for bounce buffers and cache maintenance. This makes it safe to sync buffers that have no VA mapping associated with the busdma map, but may have other mappings, possibly on different CPUs. This also makes it safe to sync unmapped bounce buffers in non-sleepable thread contexts.
Similar to r286787 for x86, this treats userspace buffers the same as unmapped buffers and no longer borrows the UVA for sync operations.

Submitted by: 	Svatopluk Kraus <onwahe@gmail.com> (earlier revision)
Tested by:	Svatopluk Kraus
Differential Revision:	https://reviews.freebsd.org/D3869
2015-10-22 16:38:01 +00:00
Ian Lepore
935c21a18e Set the correct values in the arm aux control register, based on chip type.
The bits in the aux control register vary based on the processor type.  In
the past we've always just set the 'smp' and "broadcast tlb/cache ops' bits,
which worked fine for the first few SoCs we supported.  Now that we support
most of the cortex-a series processors, it's important to get the right bits
set based on the processor type.

Submitted by:	Svatopluk Kraus <onwahe@gmail.com>
2015-10-19 19:18:02 +00:00
Ian Lepore
26b9f0eb98 Only decode fdt data which belongs to the GIC controller.
The interrupts-extended property is a list of controller-specific
interrupt tuples for more than one controller.  The decode routine of
every PIC gets called in the pre-INTRNG code (nexus doesn't know which
device instance belongs to which fdt node), so the GIC code has to
check each FDT node it is asked to decode to ensure it is the owner.

Because in the pre-INTRNG world there can only be one instance of a GIC,
it's safe to cache the results of a positive lookup in a static variable
to avoid the expensive lookups on subsequent calls.

Submitted by:	Svatopluk Kraus <onwahe@gmail.com>
Differential Revision:	https://reviews.freebsd.org/D2345
2015-10-18 20:37:10 +00:00
Ian Lepore
686450c898 Import ARM_INTRNG, the "next generation" interrupt architecture for arm
and armv6 architecures.  The primary enhancement over the old design is
support for hierarchical interrupt controllers (such as a gpio driver
which can receive interrupts from a root PIC and act as a PIC itself for
clients interested in handling a change of gpio pin state as an
interrupt).  The new code also provides an infrastructure for mapping
interrupts described in metadata in the form of a "controller reference
plus interrupt number" tuple into the simple "0-n" flat numeric space
understood by rman and the bus resource mechanisms.

Use of the new code is enabled by setting the ARM_INTRNG option, and by
making a few simple changes to the platform's support code.  In addition
each existing PIC driver needs changes to be ready for INTRNG; this commit
contains the changes for the arm/gic driver, which most armv6 SoCs use, but
it does not enable the new code yet on any platform.

This project has been many years in the making, starting as a GSoC project
by Jakub Klama (jceel@) in 2012.  That didn't get committed right away and
the source base evolved out from under it to some degree.  In 2014 I rebased
the diffs to then -current and did some enhancements in the area of mapping
interrupt numbers and storing associated fdt data, then the project went
cold again for a while.  Eventually Svata Kraus took that work in progress
and did another big round of work on it, removing most of the remaining
rough edges.  Finally I took that and made one more pass through it, mostly
disabling the "INTR_SOLO" feature for now, pending further design
discussions on how to most efficiently dispatch a pending interrupt through
more than one layer of PIC.  The current code with the INTR_SOLO feature
disabled uses approximate 100 extra cpu cycles for each cascaded PIC the
interrupt has to be passed to, so what's left to do is about efficiency, not
correct operation.

Differential Revision:	https://reviews.freebsd.org/D2047
2015-10-18 18:26:19 +00:00
Ian Lepore
7ce00ee7b4 Rename arm_init_secondary_ic() -> arm_pic_init_secondary(). The latter is
the name the function will have when the new ARM_INTRNG code is integrated,
and doing this rename first will make it easier to toggle the new interrupt
handling code on/off with a config option for debugging.
2015-10-18 16:54:34 +00:00
Konstantin Belousov
97140827bb ARM userspace accessors, e.g. {s,f}uword(9), copy{in,out}(9),
casuword(9) and others, use LDRT and STRT instructions to access
memory with the privileges of userspace.  If the *RT instruction
faults on the kernel address, then additional checks must be done to
not confuse the VM system with invalid kernel-mode faults.

Put ARM on line with other FreeBSD architectures and disallow usermode
buffers which intersect with the kernel address space in advance,
before any accesses are performed.  In other words, vm_fault(9) is no
longer called when e.g. suword(9) stores to invalid (i.e. not
userspace) address.

Also, switch ARM to use fueword(9) and casueword(9).

Note: there is a pending patch in D3617, which adds the special
processing for faults from LDRT and STRT.  The addition of the
processing is useful for potential other uses of the instructions and
for completeness, but standard userspace accessors are better served
by not allowing such faults beforehand.

Reviewed by:	andrew
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D3816
MFC after:	2 weeks
2015-10-15 17:40:39 +00:00
Konstantin Belousov
c890751da6 A follow-up to r288492. In fact, revert the mentioned commit for
pre-VFPv3 processors, since they do require software support code to
handle denormals.  For VFPv3 and later, enable flush-to-zero if
hardware does not claim full denormals arithmetic support by VMVFR1_FZ
field in mvfr1 register.

The end result is that we do use correct fpu environment on Cortexes
with VFPv3, while ARM11 (e.g. rpi) is in non-compliant flush-to-zero
mode.  At least CPUs without complete hardware implementation of
IEEE 754 do not cause unhandled floating point exception on underflow,
as it was before r288492.

Noted by:	ian
Tested by:	gjb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-10-07 09:12:49 +00:00
Robert Watson
ba2f5f5ed3 Add missing stack unwind information to several assembly functions on
ARMv6/7:

- Define _SAVE() macro to allow unwind data to be conditionally defined for
  ARM assembly code in the kernel.

- Use _SAVE() to provide unwind information for bcopy_page(), and two (of
  many) instances of copyin() and copyout().

Reviewed by:	andrew, imp
MFC after:	3 days
Sponsored by:	University of Cambridge
2015-10-04 09:39:40 +00:00
Konstantin Belousov
c39e422eed FreeBSD does not support SMP on ARMv5. Since processor is always
self-consistent, there is no need in anything but compiler barrier in
the implementation of atomic_thread_fence_*() on ARMv5.  Split
implementation of fences for ARMv4/5 and ARMv6; the former use
compiler barriers, the later also perform hardware barriers.

An issue which is fixed by the change is the faults from the CP15
coprocessor accesses in the user mode.  This was uncovered by the
pthread_once() changes in r287556.

Reported by:	Mattia Rossi <mattia.rossi.mailinglists@gmail.com>
Discussed with:	alc, cognet, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-10-02 13:21:08 +00:00
Konstantin Belousov
4a0589d1ba Typo. 2015-08-20 13:37:08 +00:00
Marcel Moolenaar
5b44efcf47 The Broadcom BCM56060 chip has a Cortex-A9R4 core.
Submitted by:	Steve Kiernan <stevek@juniper.net>
Reviewed by:	imp@
Differential Revision:	https://reviews.freebsd.org/D3357
2015-08-13 14:50:11 +00:00
Konstantin Belousov
edc8222303 Make kstack_pages a tunable on arm, x86, and powepc. On i386, the
initial thread stack is not adjusted by the tunable, the stack is
allocated too early to get access to the kernel environment. See
TD0_KSTACK_PAGES for the thread0 stack sizing on i386.

The tunable was tested on x86 only.  From the visual inspection, it
seems that it might work on arm and powerpc.  The arm
USPACE_SVC_STACK_TOP and powerpc USPACE macros seems to be already
incorrect for the threads with non-default kstack size.  I only
changed the macros to use variable instead of constant, since I cannot
test.

On arm64, mips and sparc64, some static data structures are sized by
KSTACK_PAGES, so the tunable is disabled.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 week
2015-08-10 17:18:21 +00:00
Ed Maste
96226a9aa7 Rationalize BSD license on sys/*/include/float.h
Remove the advertising clause from the Regents of the University of
California's license, per the letter dated July 22, 1999.

Update clause numbering.
2015-08-05 17:05:35 +00:00
Jason A. Harmening
713841afb2 Add two new pmap functions:
vm_offset_t pmap_quick_enter_page(vm_page_t m)
void pmap_quick_remove_page(vm_offset_t kva)

These will create and destroy a temporary, CPU-local KVA mapping of a specified page.

Guarantees:
--Will not sleep and will not fail.
--Safe to call under a non-sleepable lock or from an ithread

Restrictions:
--Not guaranteed to be safe to call from an interrupt filter or under a spin mutex on all platforms
--Current implementation does not guarantee more than one page of mapping space across all platforms. MI code should not make nested calls to pmap_quick_enter_page.
--MI code should not perform locking while holding onto a mapping created by pmap_quick_enter_page

The idea is to use this in busdma, for bounce buffer copies as well as virtually-indexed cache maintenance on mips and arm.

NOTE: the non-i386, non-amd64 implementations of these functions still need review and testing.

Reviewed by:	kib
Approved by:	kib (mentor)
Differential Revision:	http://reviews.freebsd.org/D3013
2015-08-04 19:46:13 +00:00
Andrew Turner
70888b7ed5 Fix atomic_store_64, it should write the value passed in, not the value
read by the load.

Pointy Hat:	andrew
2015-07-19 16:55:47 +00:00
Andrew Turner
a612bbfa12 Clean up the style of the armv6 atomic code.
Sponsored by:	ABT Systems Ltd
2015-07-19 15:44:51 +00:00
Andrew Turner
d6a2102846 Sort the ARM atomic functions to be in alphabetical order.
Sponsored by:	ABT Systems Ltd
2015-07-19 13:10:47 +00:00
Andrew Turner
8fa2222f46 Split out the arm and armv6 parts of atomic.h to new files. While here use
__ARM_ARCH to determine which revision of the architecture is applicable.

Sponsored by:	ABT Systems Ltd
2015-07-16 13:33:03 +00:00
Konstantin Belousov
8954a9a4e6 Add the atomic_thread_fence() family of functions with intent to
provide a semantic defined by the C11 fences with corresponding
memory_order.

atomic_thread_fence_acq() gives r | r, w, where r and w are read and
write accesses, and | denotes the fence itself.

atomic_thread_fence_rel() is r, w | w.

atomic_thread_fence_acq_rel() is the combination of the acquire and
release in single operation.  Note that reads after the acq+rel fence
could be made visible before writes preceeding the fence.

atomic_thread_fence_seq_cst() orders all accesses before/after the
fence, and the fence itself is globally ordered against other
sequentially consistent atomic operations.

Reviewed by:	alc
Discussed with:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
2015-07-08 18:12:24 +00:00
Andrew Turner
7061124034 Stop using VFP in pcpu.h when we mean ARMv6 and later. 2015-06-11 13:58:40 +00:00
Alan Cox
966272ca33 Retire VM_FREEPOOL_CACHE as the next step in eliminating PG_CACHE pages.
Differential Revision:	https://reviews.freebsd.org/D2712
Reviewed by:	kib
Sponsored by:	EMC / Isilon Storage Division
2015-06-08 04:59:32 +00:00
Andrew Turner
e1f2f9bdfa Remove pc_cpu, it was duplicating pc_cpuid so was unneeded. 2015-06-07 10:50:15 +00:00
Andrew Turner
ae160b23fa We only support the ARM EABI in head, remove the check on __ARM_EABI__. 2015-05-31 10:51:06 +00:00
Andrew Turner
ad25ff4509 Add more cp15_ functions, and use them in cpufunc.c where possible. 2015-05-24 12:12:01 +00:00
Warner Losh
d36eec691a Export the eflags field from the elf header. This allows better
discrimination between different subarch binaries, at least for mips
and arm. Arm is implemented, mips is still tbd, so not currently
exported. aarch64 does not export this because aarch64 binaries use
different tags and flags than arm.

Differential Revision: https://reviews.freebsd.org/D2611
2015-05-22 20:50:35 +00:00
Andrew Turner
b7112ead32 Clean up struct syscall_args:
1. Align to a 64-bit address so 64-bit data will be correctly aligned.
 2. Add a comment explaining why.
 3. Remove an unneeded value from the struct.

This fixes an issue where the struct may not be correctly aligned on the
stack in the syscall function. This may lead to accesing a 64-bit value
at a non 64-bit. This will raise an exception and panic the kernel.

We have been lucky where on arm and armv6 both clang and gcc correctly
align the data, even without us asking to, however, on armeb with clang to
not be the case. This tells the compiler we really do need this to be
aligned.

Reported and tested by:	jmg (on armeb with clang)
MFC after:	1 Week [1, 2]
2015-05-17 18:35:58 +00:00
Ian Lepore
a8a4800fff Add assertions that the addresses passed to tlb maintenance are page-aligned.
Perform cache writebacks and invalidations in the correct (inner to outer
or vice versa) order, and add comments that explain that.

Consistantly use 'va' as the variable name for virtual addresses.

Submitted by:	Michal Meloun <meloun@miracle.cz>
2015-05-15 18:10:00 +00:00
Ganbold Tsagaankhuu
3f67197271 It appears to be armv7_sleep is a duplication of armv7_cpu_sleep.
For consistency with the naming conventions used by the other
implementations kill armv7_sleep and keep armv7_cpu_sleep.

Differential Revision:	https://reviews.freebsd.org/D2537
Submitted by:	John Wehle
Reviewed by:	ian@, andrew@
2015-05-15 00:39:51 +00:00
Alan Cox
dfb378345f Retire pmap_lazyfix(). This function only existed in the new armv6 pmap
because the i386 pmap on which the new armv6 pmap is based had it, and in
r281707 pmap_lazyfix() was removed from the i386 pmap.

Discussed with:	kib
Submitted by:	Michal Meloun (via Svatopluk Kraus)
2015-05-11 19:55:01 +00:00
Andrew Turner
8465de8e6c Mark thumb entry points as such when building for thumb, otherwise mark
them as arm.
2015-05-11 19:04:32 +00:00
Andrew Turner
827422e3fd Use the Thumb compliant version of the add instruction. We can only use
"add Rd, Rn, Rm" from within an IT (if-then) block.
2015-05-11 19:00:02 +00:00
Andrew Turner
ec94f63bca List both registers to use in the 64-bit atomic instructions. We will need
these to build for Thumb-2.
2015-05-11 18:52:06 +00:00
Andrew Turner
6bd9126da9 cpu-v6.h should only be used in the kernel, add an error to enforce this. 2015-05-11 12:44:02 +00:00
Ed Maste
185bf88e33 Correct PL310_POWER_CTRL offset
Offet for the power control register was specified incorrectly (it had
the same value as the prefetch control register.) This change corrects
the offset value to 0xF80, per the ARM PL310 documentation.

Submitted by:	Steve Kiernan <stevek@juniper.net>
Obtained from:	Juniper Networks, Inc.
2015-05-07 16:56:20 +00:00
Zbigniew Bodek
c4b8fcd66c Add new CP15 operations and DB_SHOW_COMMAND to print CP15 registers
Submitted by:   Wojciech Macek <wma@semihalf.com>
Reviewed by:    imp, Michal Meloun <meloun@miracle.cz>
Obtained from:  Semihalf
2015-05-06 15:17:28 +00:00
Ian Lepore
7669914f32 Add a pmap_kremove_device() to undo mappings made with pmap_kenter_device().
Previously we used pmap_kremove(), but with ARM_NEW_PMAP it does the remove
in a way that isn't SMP-coherent (which is appropriate in some circumstances
such as mapping/unmapping sf buffers).  With matching enter/remove routines
for device mappings, each low-level implementation can do the right thing.

Reviewed by:	Svatopluk Kraus <onwahe@gmail.com>
2015-04-10 13:26:35 +00:00
Andrew Turner
63d071445e Add support to the efi boot1 and loader for 32-bit ARM. This will be used
by the future qemu virt support.

Differential Revision:	https://reviews.freebsd.org/D2238
Reviewed by:	emaste
2015-04-06 15:50:20 +00:00
Andrew Turner
02df43b866 dev/ofw/openfirm.h is not needed in the arm machine/fdt.h 2015-04-05 09:50:22 +00:00
Andrew Turner
e8cfd62ed8 Re-add machine/bus.h to machine/fdt.h on arm, it's still needed. 2015-04-04 23:10:13 +00:00
Andrew Turner
8c7336d57f Don't include unneeded files in the arm machine/fdt.h. While here, remove
it from more files.
2015-04-04 22:22:04 +00:00
Andrew Turner
7defa9a75c Move the definition of fdt_localbus_devmap to a Marvell specific file as
it's only used there.
2015-04-04 22:05:43 +00:00
Andrew Turner
7b309274e3 Add the generic timer registers to sysreg.h and cpu-v6.h, and use the
access functions in the generic timer driver.

Differential Revision:	https://reviews.freebsd.org/D2198
Sponsored by:	The FreeBSD Foundation
2015-04-02 12:56:06 +00:00
Andrew Turner
a3db11e053 Remove support for CPU_XSCALE_80200. None of our configs support it, and
there wasn;t an option to enable it.

While here remove a check for CPU_ARM10 being defined as it has also been
removed.
2015-03-30 09:29:45 +00:00