Commit Graph

285 Commits

Author SHA1 Message Date
andrew
6d3c93b78f Set the upper limit of the DMAP region to the limit of RAM as was found in
the physmap. This will reduce the likelihood of an issue where we have
device memory mapped in the DMAP. This can only happen if it is within the
same 1G block of normal memory.

Reviewed by:	kib
Obtained from:	ABT Systems Ltd
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D5938
2016-04-14 10:43:28 +00:00
emaste
f015fe6ded arm64: Avoid null dereference in its_init_cpu
its_init_cpu() is called from gic_v3_init_secondary(), and its_sc will
be NULL if its did not attach.

Sponsored by:	The FreeBSD Foundation
2016-04-13 21:00:00 +00:00
andrew
fcf07e4e7b Document the memory ranges within the kernel region to help with debugging
to track down which region an address is from.

Obtained from:	ABT Systems Ltd
Sponsored by:	The FreeBSD Foundation
2016-04-13 11:43:03 +00:00
andrew
66ef37d31b Increase the arm64 kernel address space to 512GB, and the DMAP region to
2TB. The latter can be increased in 512GB chunks by adjusting the lower
address, however more work will be needed to increase the former.

There is still some work needed to only create a DMAP region for the RAM
address space as on ARM architectures all mappings should have the same
memory attributes, and these will be different for device and normal memory.

Reviewed by:	kib
Obtained from:	ABT Systems Ltd
Relnotes:	yes
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D5859
2016-04-13 09:44:32 +00:00
pfg
b63211eed5 Cleanup unnecessary semicolons from the kernel.
Found with devel/coccinelle.
2016-04-10 23:07:00 +00:00
zbb
fa19c68f9f Fix interrupts delivery on ThunderX for VF IDs beyond 8
SR-IOV devices usually use Alternative Routing ID (ARI).
In that case slot/device is always assumed to be 0 and
function/identifier is extended to 8 bits.

Fix interrupts delivery to VF IDs beyond 8 by using a correct
DevID if ARI is enabled.

Reviewed by:   jhb, wma
Obtained from: Semihalf
Sponsored by:  Cavium
Differential Revision: https://reviews.freebsd.org/D5855
2016-04-07 10:36:50 +00:00
andrew
d28cfa553c Use PHYS_IN_DMAP to check if a physical address is within the DMAP region.
Approved by:	ABT Systems Ltd
Sponsored by:	The FreeBSD Foundation
2016-04-06 14:16:37 +00:00
andrew
c287d03447 Cleanup the early pagetable creation code in preperation for increasing
the size of the arm64 DMAP region.

Approved by:	ABT Systems Ltd
Sponsored by:	The FreeBSD Foundation
2016-04-06 14:12:00 +00:00
andrew
d532565c6d Allow vmparam.h to be included from assembly files on arm64.
Obtained from:	ABT Systems Ltd
Sponsored by:	The FreeBSD Foundation
2016-04-06 14:08:10 +00:00
ed
e55c02e6f8 Make CloudABI's way of doing TLS more friendly to userspace emulators.
We're currently seeing how hard it would be to run CloudABI binaries on
operating systems cannot be modified easily (Windows, Mac OS X). The
idea is that we want to just run them without any sandboxing. Now
that CloudABI executables are PIE, this is already a bit easier, but TLS
is still problematic:

- CloudABI executables want to write to the %fs, which typically
  requires extra system calls by the emulator every time it needs to
  switch between CloudABI's and its own TLS.

- If CloudABI executables overwrite the %fs base unconditionally, it
  also becomes harder for the emulator to store a backup of the old
  value of %fs. To solve this, let's no longer overwrite %fs, but just
  %fs:0.

As CloudABI's C library does not use a TCB, this space can now be used
by an emulator to keep track of its internal state. The executable can
now safely overwrite %fs:0, as long as it makes sure that the TCB is
copied over to the new TLS area.

Ensure that there is an initial TLS area set up when the process starts,
only containing a bogus TCB. We don't really care about its contents on
FreeBSD.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D5836
2016-04-06 11:11:31 +00:00
wma
f5a4347e1c Implement dtrace_getupcstack in ARM64
Allow using DTRACE for performance analysis of userspace
applications - the function call stack can be captured.
This is almost an exact copy of AMD64 solution.

Obtained from:         Semihalf
Sponsored by:          Cavium
Reviewed by:           emaste, gnn, jhibbits
Differential Revision: https://reviews.freebsd.org/D5779
2016-04-06 05:13:36 +00:00
andrew
5b3779aaee Add a table to map from the FreeBSD CPUID space to the GIC CPUID space. On
many SoCs these two are the same, however there is no requirement for this
to be the case, e.g. on the ARM Juno we boot on what the GIC thinks of as
CPU 2, but FreeBSD numbers it CPU 0.

Obtained from:	ABT Systems Ltd
Sponsored by:	The FreeBSD Foundation
2016-04-04 17:04:33 +00:00
andrew
5a20a1f054 Reduce the diff for when we switch to intrng. The IPI interrupts will be
split out to multiple handlers.

Obtained from:	ABT Systems Ltd
Sponsored by:	The FreeBSD Foundation
2016-04-04 15:13:17 +00:00
wma
153a55cf7f arm64: pagezero improvement
This change has been provided to improve pagezero call performance.

Submitted by:          Dominik Ermel <der@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          Cavium
Reviewed by:           kib
Differential Revision: https://reviews.freebsd.org/D5741
2016-04-04 07:16:43 +00:00
wma
da4a811387 Add bzero.S to ARM64 machdep
Add fille missing from https://svnweb.freebsd.org/changeset/base/297536
2016-04-04 07:11:33 +00:00
wma
4ba80483c0 arm64: bzero optimization
This optimization attempts to utylize as wide as possible register store instructions to zero large buffers.
The implementation, if possible, will use 'dc zva' to zero buffer by cache lines.

Speedup: 60x faster memory zeroing

Submitted by:          Dominik Ermel <der@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          Cavium
Reviewed by:           kib
Differential Revision: https://reviews.freebsd.org/D5726
2016-04-04 07:06:20 +00:00
ed
910e4d679c Make Position Independent Executables work for CloudABI.
- Set BI_CAN_EXEC_DYN, so we can execute ET_DYN ELF files in addition to
  regular ET_EXECs.
- Provide an AT_BASE entry in the auxiliary vector, so the executable
  knows at which address it got loaded and can apply relocations.
2016-03-31 18:52:00 +00:00
andrew
b45ea0fe80 Add support for 4 level pagetables. The userland address space has been
increased to 256TiB. The kernel address space can also be increased to be
the same size, but this will be performed in a later change.

To help work with an extra level of page tables two new functions have
been added, one to file the lowest level table entry, and one to find the
block/page level. Both of these find the entry for a given pmap and virtual
address.

This has been tested with a combination of buildworld, stress2 tests, and
by using sort to consume a large amount of memory by sorting /dev/zero. No
new issues are known to be present from this change.

Reviewed by:	kib
Obtained from:	ABT Systems Ltd
Relnotes:	yes
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D5720
2016-03-31 11:07:24 +00:00
andrew
68e9a48abb Read the CPU ID for the current CPU from the GIC. The GIC may have a
different ID space than the kernel. Because of this we need to read the
ID from the hardware. The hardware will provide this value to the CPU by
reading any of the first 8 Interrupt Processor Targets Registers.

Obtained from:	ABT Systems Ltd
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D5706
2016-03-29 13:51:26 +00:00
wma
9141d11115 arm64: Fixing user space boudary checking in copyinout.S
Big buffer size could cause integer overflow and as a result
attempt to copy beyond VM_USERMAX_ADDRESS.

Fixing copyinstr boundary checking where compared value has been
overwritten by accident when setting fault handler.

Submitted by:          Dominik Ermel <der@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          Cavium
Reviewed by:           kib
Differential Revision: https://reviews.freebsd.org/D5719
2016-03-24 13:28:33 +00:00
wma
15e11f0a03 ARM64 copyinout improvements
The first of set of patches.
Use wider load/stores when aligned buffer is being copied.

In a simple test:
  dd if=/dev/zero of=/dev/null bs=1M count=1024
the performance jumped from 410MB/s up to 3.6GB/s.

TODO:
 - better handling of unaligned buffers (WiP)
 - implement similar mechanism to bzero

Submitted by:          Dominik Ermel <der@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          Cavium
Reviewed by:           kib, andrew, emaste
Differential Revision: https://reviews.freebsd.org/D5664
2016-03-23 13:29:52 +00:00
andrew
acaab8fb10 Use the saved program state register to detect when an exception frame is
from userpsace. Previously we could have triggered a panic by trying to
jump to a kernel address from userland as the trap handling code thought we
received an ast in kernel mode.

Obtained from:	ABT Systems Ltd
Sponsored by:	The FreeBSD Foundation
2016-03-22 08:36:25 +00:00
andrew
47415357ac Move the opt_ files to be included first so their definitions can be used
from within all further included files.

Obtained from:	ABT Systems Ltd
Sponsored by:	The FreeBSD Foundation
2016-03-18 16:32:22 +00:00
andrew
d5c75c2b10 Rename COUNT_IPI to INTR_IPI_COUNT to reduce the diff with intrng.
Obtained from:	ABT Systems Ltd
Sponsored by:	The FreeBSD Foundation
2016-03-18 16:29:58 +00:00
andrew
162f59a4a0 Reduce the diff with intrng by renaming similar functions. This is a noop,
but will help move to use the common interrupt handling code later.

Obtained from:	ABT Systems Ltd
Sponsored by:	The FreeBSD Foundation
2016-03-18 16:18:29 +00:00
andrew
df8c050ab1 Remove the invalid L0_BLOCK definition. ARMv8 doesn't support block
translation in the level 0 descriptor.

Obtained from:	ABT Systems Ltd
Sponsored by:	The FreeBSD Foundation
2016-03-18 10:01:25 +00:00
wma
ca92c5bf09 pmap arm64: fixing pmap_invalidate_range
It seems that if range within one page is given this page will not be
invalidated at all. Clean it up.

Submitted by:          Dominik Ermel <der@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          Cavium
Reviewed by:           wma, zbb
Approved by:           cognet (mentor)
Differential Revision: https://reviews.freebsd.org/D5569
2016-03-14 07:26:38 +00:00
jhb
e641458f70 Fix reporting of the CloudABI ABI in kdump.
- Advertise the word size for CloudABI ABIs via the SV_LP64 flag.  All of
  the other ABIs include either SV_ILP32 or SV_LP64.
- Fix kdump to not assume a 32-bit ABI if the ABI flags field is non-zero
  but SV_LP64 isn't set.  Instead, only assume a 32-bit ABI if SV_ILP32 is
  set and fallback to the unknown value of "00" if neither SV_LP64 nor
  SV_ILP32 is set.

Reviewed by:	kib, ed
Differential Revision:	https://reviews.freebsd.org/D5560
2016-03-09 18:38:30 +00:00
bz
983c664612 Force re-routing PCI interrupts (this is for legacy INTx not MSI).
Need this for gem5, but was not needed on real hadrware (yet) as it
was always MSI.

Reviewed by:		andrew, jhb
Discovered by:		andrew
Sponsored by:		DARPA/AFRL
Differential Revision:	https://reviews.freebsd.org/D5494
2016-03-02 15:20:42 +00:00
wma
486a6ad7e0 Improve ThunderX PEM driver to work on pass2 revision
Things changed:
    * do not allocate 4GB of SLI space, because it's the waste of
      system resources. Allocate only small portions when needed.
    * provide own implementation of activate_resource which performs
      address translation between PCI bus and host PA address space.
      This is temporary solution, should be replaced by bus_map_resource
      once implemented.

Obtained from:         Semihalf
Sponsored by:          Cavium
Approved by:           cognet (mentor)
Reviewed by:           jhb
Differential revision: https://reviews.freebsd.org/D5294
2016-03-02 08:39:59 +00:00
wma
0a7f0b77a3 Get memory ranges from FDT if no EFI API is available on ARM64
Obtained from:         Semihalf
Submitted by:          Michal Stanek <mst@semihalf.com>
Sponsored by:          Annapurna Labs
Approved by:           cognet (mentor)
Reviewed by:           andrew, wma
Differential revision: https://reviews.freebsd.org/D5408
2016-03-01 12:50:24 +00:00
wma
ad7622cfab Enable SRE_EL2 on ARM64
Enable system register access for EL2. Alpine-V2 is
the first device requiring this to be enabled.
It is also in-sync with Linux initialization code,
and compatible with Alpine-V2 uboot requirements.

Obtained from:         Semihalf
Submitted by:          Michal Stanek <mst@semihalf.com>
Sponsored by:          Annapurna Labs
Approved by:           cognet (mentor)
Reviewed by:           wma
Differential revision: https://reviews.freebsd.org/D5394
2016-03-01 08:15:00 +00:00
wma
8076bc06d9 Add uart 8250 device to GENERIC arm64 configuration
Obtained from:         Semihalf
Submitted by:          Michal Stanek <mst@semihalf.com>
Sponsored by:          Annapurna Labs
Approved by:           cognet (mentor)
Reviewed by:           zbb, wma
Differential revision: https://reviews.freebsd.org/D5406
2016-03-01 07:06:36 +00:00
jhibbits
23e52c3512 Correct the memory rman ranges to be to BUS_SPACE_MAXADDR
Summary:
As part of the migration of rman_res_t to be typed to uintmax_t, memory ranges
must be clamped appropriately for the bus, to prevent completely bogus addresses
from being used.

This is extracted from D4544.

Reviewed By: cem
Sponsored by:	Alex Perez/Inertial Computing
Differential Revision: https://reviews.freebsd.org/D5134
2016-03-01 02:59:06 +00:00
wma
ac0c41dc12 Restore ThunderX Pass1.1 PCI changes removed by r295962
If Enhanced Allocation is not used, we can't allocate any random
    range. All internal devices have hardcoded place where they can
    be located within PCI address space. Fortunately, we can read
    this value from BAR.

Obtained from:         Semihalf
Sponsored by:          Cavium
Approved by:           cognet (mentor)
Reviewed by:           zbb
Differential revision: https://reviews.freebsd.org/D5455
2016-02-26 12:16:11 +00:00
wma
7d1ae5faf2 Make pci_host_generic and thunderx_pci common
* provided OFW interface for pci_host_generic (for handling devices which are present in DTS under the PCI node)
  * removed support for internal PCI from arm64/cavium
  * cleaned up and made most of the code common

Obtained from:         Semihalf
Sponsored by:          Cavium
Approved by:           cognet (mentor)
Reviewed by:           zbb
Differential revision: https://reviews.freebsd.org/D5261
2016-02-24 06:05:30 +00:00
wma
e960072e83 Add Intel 10Gb support to ARM64 GENERIC kernel config
Obtained from:         Semihalf
Sponsored by:          Cavium
Approved by:           cognet (mentor)
Reviewed by:           zbb
Differential revision: https://reviews.freebsd.org/D5347
2016-02-22 13:34:43 +00:00
skra
ad68cf93b1 As <machine/vmparam.h> is included from <vm/vm_param.h>, there is no
need to include it explicitly when <vm/vm_param.h> is already included.

Suggested by:	alc
Reviewed by:	alc
Differential Revision:	https://reviews.freebsd.org/D5379
2016-02-22 09:08:04 +00:00
skra
812447f90a As <machine/param.h> is included from <sys/param.h>, there is no need
to include it explicitly when <sys/param.h> is already included.

Reviewed by:	alc, kib
Differential Revision:	https://reviews.freebsd.org/D5378
2016-02-22 09:04:36 +00:00
skra
f4b6499ab5 As <machine/pmap.h> is included from <vm/pmap.h>, there is no need to
include it explicitly when <vm/pmap.h> is already included.

Reviewed by:	alc, kib
Differential Revision:	https://reviews.freebsd.org/D5373
2016-02-22 09:02:20 +00:00
jhibbits
f8385663ee Introduce a RMAN_IS_DEFAULT_RANGE() macro, and use it.
This simplifies checking for default resource range for bus_alloc_resource(),
and improves readability.

This is part of, and related to, the migration of rman_res_t from u_long to
uintmax_t.

Discussed with:	jhb
Suggested by:	marcel
2016-02-20 01:32:58 +00:00
zbb
d2a1177be6 Introduce bus_get_bus_tag() method
Provide bus_get_bus_tag() for sparc64, powerpc, arm, arm64 and mips
nexus and its children in order to return a platform specific default tag.

This is required to ensure generic correctness of the bus_space tag.
It is especially needed for arches where child bus tag does not match
the parent bus tag. This solves the problem with ppc architecture
where the PCI bus tag differs from parent bus tag which is big-endian.

This commit is a part of the following patch:
https://reviews.freebsd.org/D4879

Submitted by:  Marcin Mazurek <mma@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Annapurna Labs
Reviewed by:   jhibbits, mmel
Differential Revision: https://reviews.freebsd.org/D4879
2016-02-18 13:00:04 +00:00
wma
607785ea5a Fix ThunderX external PEM bus offset
Obtained from:         Semihalf
Sponsored by:          Cavium
Approved by:           cognet (mentor)
Reviewed by:           zbb
Differential revision: https://reviews.freebsd.org/D5293
2016-02-18 11:26:08 +00:00
skra
19710a8b47 Remove pd_prot and pd_cache members from struct arm_devmap_entry.
The struct is used for definition of static device mappings which
should always have same protection and attributes.
2016-02-17 12:36:24 +00:00
andrew
05274f8a8b Allow callers of OF_decode_addr to get the size of the found mapping. This
will allow for code that uses the old fdt_get_range and fdt_regsize
functions to find a range, map it, access, then unmap to replace this, up
to and including the map, with a call to OF_decode_addr.

As this function should only be used in the early boot code the unmap is
mostly do document we no longer need the mapping as it's a no-op, at least
on arm.

Reviewed by:	jhibbits
Sponsored by:	ABT Systems Ltd
Differential Revision:	https://reviews.freebsd.org/D5258
2016-02-16 15:18:12 +00:00
zbb
a48186faea Support PEM that is not a PCI endpoint on ThunderX
Some chip revisions don't have their external PCIe buses
behind the internal bridge. Add support for FDT-configurable
PEMs but keep ability for PCIe enumeration.

Reviewed by:   andrew, wma
Obtained from: Semihalf
Sponsored by:  Cavium
Differential Revision: https://reviews.freebsd.org/D5285
2016-02-16 11:43:57 +00:00
andrew
9305b5748d Only update curthread and curpcb after we have finished using the old
values.

If switching from a thread that used floating-point registers to a thread
that is still running, but holding the blocked_lock lock we would switch
the curthread to the new (running) thread, then call critical_enter. This
will non-atomically increment td_critnest, and later call critical_exit to
non-atomically decrement this value.

This can happen at the same time as the new thread is still running on the
old core, also calling these functions. In this case there will be a race
between these non-atomic operations. This can be an issue as we could loose
one of these operations leading to the value to not return to zero.

If, later on, we then hit a data abort we check if the td_critnest is zero.
If this check fails we will panic the kernel.

This has been observed when running pcmstat on a Cavium ThunderX. The pcm
thread will use the blocked_lock lock and there is a high chance userspace
will use the floating-point registers. When, later on, pmcstat triggers a
data abort we will hit this panic.

The fix is to update these values after storing the floating-point state.
This means we use the correct curthread while storing the state so it will
not be an issue that the changes to td_critnest are non-atomic.

Sponsored by:	ABT Systems Ltd
2016-02-12 12:38:04 +00:00
zbb
5d05778815 Support interrupts binding in GICv3 and ITS
- Add MOVI command and routine for the LPI migration
- Allow to search for the ITS device descriptor using
  not only devID but also LPI number.
- Bind SPIs in the Distributor
- Don't bind its_dev to collection. Keep track of the collection
  IDs for each LPI.

Reviewed by:   wma
Obtained from: Semihalf
Sponsored by:  Cavium
Differential Revision: https://reviews.freebsd.org/D5231
2016-02-11 12:04:58 +00:00
zbb
1187007450 Implement finer locking in ITS
- Change locks' names to be more suitable
- Don't use blocking mutex. Lock only basic operations such
  as lists or bitmaps modifications.

Reviewed by:   wma
Obtained from: Semihalf
Sponsored by:  Cavium
Differential Revision: https://reviews.freebsd.org/D5230
2016-02-11 12:03:11 +00:00
zbb
b8738668ec Initially bind all interrupts to the boot CPU when using GICv3
This should be done by routing all interrupts to CPU0,
different assignment will be induced by either interrupts
shuffling or bus_bind_intr().

Reviewed by:   wma
Obtained from: Semihalf
Sponsored by:  Cavium
Differential Revision: https://reviews.freebsd.org/D5229
2016-02-11 12:01:33 +00:00