Commit Graph

3167 Commits

Author SHA1 Message Date
Justin Hibbits
402c7806cb Fix CTR formatting for moea64_native bootstrap
On very large memory systems 'size' can become 2GB or larger, resulting in a
negative value being formatted.  Also, moea64_pteg_count is already a long, so
format it as such.
2018-06-14 16:01:11 +00:00
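
The formatting problem described in this commit can be reproduced in user space. The following is a minimal sketch, not the kernel's CTR code: `format_as_int` and `format_as_long` are hypothetical helper names, and the narrowing conversion to `int` is implementation-defined (it wraps to a negative value on common two's-complement compilers).

```c
#include <stdio.h>

/* Hypothetical helper: formats a byte count the old way, as an int. */
static void format_as_int(char *buf, size_t len, unsigned long size)
{
    /* Values of 2GB or more no longer fit in an int and wrap negative
     * on common compilers (the conversion is implementation-defined). */
    snprintf(buf, len, "%d", (int)size);
}

/* Hypothetical helper: formats the same value with a long-sized conversion. */
static void format_as_long(char *buf, size_t len, unsigned long size)
{
    snprintf(buf, len, "%lu", size);
}
```

Passing a size of exactly 2GB to both helpers shows the difference: the first produces a negative number, the second the expected value.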
Breno Leitao
5ecc8c2077 powerpc64/powernv: Avoid type promotion
There is a type promotion that converts count = -1 into an unsigned int, so the
default TCE SEG SIZE is not returned on a Boston POWER9 machine.

This machine does not have the 'ibm,supported-tce-sizes' entries; thus, count
is set to -1, and the function continues to execute instead of returning.

Reviewed by: jhibbits, wma
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D15763
2018-06-12 19:50:33 +00:00
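
The promotion issue can be illustrated with a small sketch. The function names below are hypothetical and the comparison against `sizeof(uint32_t)` is an assumption made for illustration; the point is only that a signed -1 compared against a `size_t` is converted to a very large unsigned value.

```c
#include <stddef.h>
#include <stdint.h>

/* Buggy check: count is converted to size_t for the comparison,
 * so -1 becomes SIZE_MAX and the "too short" case is never detected. */
static int too_short_buggy(int count)
{
    return count < sizeof(uint32_t);
}

/* Fixed check: both operands are compared as signed integers. */
static int too_short_fixed(int count)
{
    return count < (int)sizeof(uint32_t);
}
```

With `count == -1`, the buggy variant reports that the property is long enough, while the fixed variant correctly reports a failure.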
Matt Macy
eb7c901995 hwpmc: simplify calling convention for hwpmc interrupt handling
pmc_process_interrupt takes 5 arguments when only 3 are needed.
cpu is always available in curcpu and inuserspace can always be
derived from the passed trapframe.

While facially a reasonable cleanup, this change was motivated
by the need to work around a compiler bug.

core2_intr(cpu, tf) ->
  pmc_process_interrupt(cpu, ring, pmc, tf, inuserspace) ->
    pmc_add_sample(cpu, ring, pm, tf, inuserspace)

In the process of optimizing the tail call the tf pointer was getting
clobbered:

(kgdb) up
    at /storage/mmacy/devel/freebsd/sys/dev/hwpmc/hwpmc_mod.c:4709
4709                                pmc_save_kernel_callchain(ps->ps_pc,
(kgdb) up
1205                    error = pmc_process_interrupt(cpu, PMC_HR, pm, tf,

resulting in a crash in pmc_save_kernel_callchain.
2018-06-08 04:58:03 +00:00
Breno Leitao
6d645c57a3 Fix excise_initrd_region() to support 32- and 64-bit initrd params.
Changed excise_initrd_region to support both 32- and 64-bit
values for linux,initrd-start and linux,initrd-end.

This fixes the boot problem on some machines after rS334485.

Submitted by: Luis Pires <lffpires@ruabrasil.org>
Reviewed by: jhibbits, leitao
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D15667
2018-06-07 21:24:21 +00:00
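
One way to accept both widths is to decode the property according to its length. This is a sketch under stated assumptions, not the code from the commit: `decode_initrd_param` is a hypothetical helper, and it assumes the property holds a big-endian value of either one cell (4 bytes) or two cells (8 bytes).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode a big-endian device-tree property that may
 * be 4 or 8 bytes long. Returns 0 on success, -1 for any other length. */
static int decode_initrd_param(const unsigned char *prop, size_t len,
    uint64_t *out)
{
    uint64_t v = 0;
    size_t i;

    if (len != 4 && len != 8)
        return -1;
    for (i = 0; i < len; i++)
        v = (v << 8) | prop[i];   /* most significant byte first */
    *out = v;
    return 0;
}
```

The caller would obtain the property length from the device tree and pass it along, so a 32-bit and a 64-bit `linux,initrd-start` both end up in a 64-bit variable.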
Justin Hibbits
5167f178ab Included VSX registers in powerpc core dumps
Summary: Included VSX registers in powerpc core dumps (both kernel and gcore)

Submitted by:	Luis Pires
Differential Revision: https://reviews.freebsd.org/D15512
2018-06-02 20:28:58 +00:00
Justin Hibbits
2e65567500 Added ptrace support for reading/writing powerpc VSX registers
Summary:
Added ptrace support for getting/setting the remaining part of the VSX registers
(the part that's not already covered by FPR or VR registers).

This is necessary to add support for VSX registers in debuggers.

Submitted by:	Luis Pires
Differential Revision: https://reviews.freebsd.org/D15458
2018-06-02 19:17:11 +00:00
Justin Hibbits
3254c39f83 Increase powerpc64 KVA from ~7.25GB to 32GB
This will let us use much more KVA for ZFS ARC where needed.  This may be
increased in the future if memory requirements increase.

Discussed with:	nwhitehorn
2018-06-01 21:37:20 +00:00
Justin Hibbits
a608b7d313 Unbreak 32-bit binaries on powerpc64
Recently a change was made which broke loading 32-bit binaries on powerpc64,
with an assertion in ld-elf32.so.1:

ld-elf32.so.1: assert failed:
/usr/local/poudriere/jails/ppc64/usr/src/libexec/rtld-elf/rtld.c:390

It turns out Elf32_AuxInfo was broken for a very long time on powerpc64, as
it uses long and pointers, which are both 64 bits on powerpc64; the breakage
only manifested with the recent work on auxargs.
2018-06-01 16:31:05 +00:00
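
The size mismatch can be sketched with two illustrative layouts. Both structures below are hypothetical and are not FreeBSD's definitions; they only show why a record built from `long` and pointer members changes size with the host ABI, while one built from fixed-width members always matches a 32-bit process.

```c
#include <stdint.h>

/* Hypothetical layout sized by the host ABI: 16 bytes on LP64. */
struct aux_native {
    long a_type;
    union {
        long a_val;
        void *a_ptr;
    } a_un;
};

/* Hypothetical layout with fixed-width members: always 8 bytes. */
struct aux_fixed32 {
    int32_t a_type;
    union {
        int32_t a_val;
        uint32_t a_ptr;   /* a 32-bit address, not a host pointer */
    } a_un;
};
```

A 64-bit kernel that writes records of the first shape for a 32-bit binary produces entries the 32-bit runtime linker cannot parse.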
Breno Leitao
48f64992f2 powerpc64: Avoid overwriting initrd area
Currently kexec loads an initrd file into the main memory but does not
mark that region as reserved, thus the area is not protected.

If any initrd/md file is loaded from kexec/petitboot, the region might become
corrupted/overwritten since FreeBSD does not know the region is 'reserved'.

This patch simply adds the initrd area as a reserved memory region.

Approved by: jhibbits
Differential Revision: https://reviews.freebsd.org/D15610
2018-06-01 12:43:13 +00:00
Justin Hibbits
e69b55eadb Remove a debug printf from opal_pci driver
2018-05-31 04:11:40 +00:00
Justin Hibbits
dceea51efe Make opal_pci driver work with POWER9
Summary:
Coupled with r334365, this makes PCI work on POWER9.  There is still more to
do to fully exploit the hardware capabilities, but this is sufficient to
enable USB and ethernet controllers on a POWER9 Talos II system.

Reviewed by:	nwhitehorn, leitao
Differential Revision: https://reviews.freebsd.org/D15566
2018-05-30 03:00:57 +00:00
Justin Hibbits
f07ee2a7c0 Cache the phandle of the PCI node in opal_pci_attach
Simple cleanup, no functional change.  This is related to the fixups needed
for POWER9 support.
2018-05-30 02:47:23 +00:00
Justin Hibbits
0b1f36b6c5 Make ALT_BREAK_TO_DEBUGGER work with OPAL console
Match other consoles by using the higher level cngetc() in the interrupt
handler, so that kdb_alt_break() can check for console break.
2018-05-28 01:59:48 +00:00
Justin Hibbits
38cfc8c393 Print the full-width pointer values in hex.
PRI0ptrX is used to print a zero-padded hex value of the architecture's bitness,
so on 64-bit architectures it'll print the full 64 bit address.
2018-05-28 00:19:08 +00:00
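
The effect of a width-aware format can be sketched in portable C. `format_ptr_hex` is a hypothetical helper and does not use the kernel's PRI0ptrX macro; it assumes that the desired width is two hex digits per byte of a pointer.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: zero-padded hex at the platform's pointer width. */
static void format_ptr_hex(char *buf, size_t len, uintptr_t value)
{
    int digits = (int)(2 * sizeof(void *));   /* 8 on ILP32, 16 on LP64 */

    snprintf(buf, len, "%0*jx", digits, (uintmax_t)value);
}
```

On a 64-bit target the value 0x1000 is rendered with 16 digits, so addresses line up in trap and debugger output.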
Justin Hibbits
877c96bedc Match style of the other prototypes, and don't name the argument.
2018-05-27 20:36:43 +00:00
Justin Hibbits
ce7b8e55e3 Stop idle threads on power9 in the idle task until an interrupt.
This reduces the CPU cycle wastage on power9, which is SMT4.  Any idle
thread that's spinning is simply starving working threads on the same core
of valuable resources.

This can be reduced further by taking more advantage of the PSSCR supported
states, as well as permitting state loss, as is currently done for power8.
The currently implemented stop state is the lowest latency, which may still
consume resources.
2018-05-27 20:24:24 +00:00
Justin Hibbits
5ab39b6552 On POWER9 clear the HID0_RADIX before enabling the page tables
POWER9 supports Radix page tables in addition to Hashed page tables.  When
Radix page tables are in use, the TLB is cut in half, so that half of the
TLB is used for the page walk cache.  This is the default behavior, however
FreeBSD currently does not support Radix tables.  Clear this bit so that we
can use the full TLB.  Do this in the MMU logic so that configuration can be
localized to the specific translation format.  Once we do support Radix
tables, the setup for that will be localized to the Radix MMU kobj.
2018-05-26 04:33:19 +00:00
Justin Hibbits
9ae8a6d3d1 Fix a typo missed in r334232
2018-05-26 04:24:25 +00:00
Justin Hibbits
459e54f990 Correct a typo for opal temperature sensor type constant
2018-05-26 02:45:41 +00:00
Justin Hibbits
204d74320d Only crop the VPN on POWER4 and derivatives for TLBIE operations
Summary:
PowerISA 2.03 and later require bits 14:65 in the RB register argument,
which is the full value of the vpn argument post-shift.  Only POWER4, POWER4+,
and PPC970* need the upper 16 bits cropped.

With this change FreeBSD can boot to multi-user on POWER9.

Reviewed by:	nwhitehorn
Differential Revision: https://reviews.freebsd.org/D15581
2018-05-26 00:41:50 +00:00
Justin Hibbits
1a3eaf6cc8 Add an IPMI attachment for PowerNV systems
IPMI access on PowerNV systems is done through the OPAL firmware.  This adds a
simple attachment for communicating with the FSP/BMC on these machines.  This
has been tested on a Talos POWER9 workstation, only in the bootup phase, noting
the successful attachment messages:

...
ipmi0: IPMI device rev. 0, firmware rev. 2.00, version 2.0, device support mask 0
ipmi0: Number of channels 2
...

The ipmi device has not been added to GENERIC64, but may be after further
testing.  It may also eventually be added to the ipmi module at that point.
2018-05-22 03:57:32 +00:00
Justin Hibbits
5272c9bd07 Add a comment explaining the need of a global temporary variable
cpu_xirr is used only as a temporary location for the OPAL call in
PIC_DISPATCH().

Requested by:	nwhitehorn
2018-05-22 03:24:16 +00:00
Justin Hibbits
9c6ba29de1 Basic OPAL sensor support for POWER9 platforms
Summary:
PowerNV architectures (in the test case POWER9) export sensors via the device
tree, which are accessed via OPAL calls.  This adds sysctl nodes for each
device in a generic fashion.  New sysctl nodes are:

dev.opal_sensor.N.sensor
dev.opal_sensor.N.sensor_min
dev.opal_sensor.N.sensor_max
dev.opal_sensor.N.type
dev.opal_sensor.N.label

These are rooted at a parent attachment under opal, called opalsens.  This does
not add support for the "sensor groups" defined in the device tree.

Reviewed by:	breno.leitao_gmail.com
Differential Revision: https://reviews.freebsd.org/D15362
2018-05-22 02:42:53 +00:00
Nathan Whitehorn
6cff19a3be Fix build with PSERIES but not POWERNV defined.
2018-05-20 18:26:09 +00:00
Justin Hibbits
ef6da5e5c7 Add support for the XIVE XICS emulation mode for POWER9 systems
Summary:
POWER9 systems use a new interrupt controller, XIVE, managed through OPAL
firmware calls.  The OPAL firmware includes support for emulating the previous
generation XICS presentation layer in addition to a new "XIVE Exploitation"
mode.  As a stopgap until we have XIVE exploitation mode, enable XICS emulation
mode so that we at least have an interrupt controller.

Since the CPPR is local to the current CPU, it cannot be updated for APs when
initializing on the BSP.  This adds a new function, directly called by the
powernv platform code, to initialize the CPPR on AP bringup.

Reviewed by:	nwhitehorn
Differential Revision: https://reviews.freebsd.org/D15492
2018-05-20 03:23:17 +00:00
Mark Johnston
892bdccca0 Enable kernel dump features in GENERIC for most platforms.
This turns on support for kernel dump encryption and compression, and
netdump. arm and mips platforms are omitted for now, since they are more
constrained and don't benefit as much from these features.

Reviewed by:	cem, manu, rgrimes
Tested by:	manu (arm64)
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D15465
2018-05-19 19:53:23 +00:00
Justin Hibbits
4a11ed7159 Add SPR_HSRR0/SPR_HSRR1 definitions
Reported by:	Mark Millard
Pointy-hat to:	jhibbits
2018-05-19 04:56:10 +00:00
Justin Hibbits
5321c01b50 Add hypervisor trap handling, using HSRR0/HSRR1
Summary:
Some hypervisor exceptions on POWER architecture only save state to HSRR0/HSRR1.
Until we have bhyve on POWER, use a lightweight exception frontend which copies
HSRR0/HSRR1 into SRR0/SRR1, and run the normal trap handler.

The first user of this is the Hypervisor Virtualization Interrupt, which targets
the XIVE interrupt controller on POWER9.

Reviewed By: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D15487
2018-05-19 04:21:50 +00:00
Justin Hibbits
30f3b0f5f7 powerpc64: Add OPAL definitions
Summary:
Add additional OPAL PCI definitions and expand the code to use them in order to
ease the OPAL interface process for newcomers.

These definitions came directly from the OPAL code and they are the same for
both PHB3 (POWER8) and PHB4 (POWER9).

Submitted by:	Breno Leitao
Differential Revision: https://reviews.freebsd.org/D15432
2018-05-19 04:01:15 +00:00
Justin Hibbits
c07c77a311 Fix a manual copy from the original diff for r333825
The 'else' was in the original diff.

Submitted by:	Breno Leitao
2018-05-19 03:47:28 +00:00
Justin Hibbits
876f3b9295 Add yet another option for gathering available memory
On some POWER9 systems, 'reg' denotes the full memory in the system, while
'linux,usable-memory' denotes the usable memory.  Some memory is reserved for
NVLink usage, so is partitioned off.

Submitted by:	Breno Leitao
2018-05-19 03:45:38 +00:00
Justin Hibbits
829c98b82e Add some Hypervisor interrupt definitions
This mostly completes the interrupt definitions.  There are still some left out,
less likely to be used in the near term.
2018-05-19 03:23:46 +00:00
Matt Macy
d7c5a620e2 ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host is receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
4.98   0.00   4.42   0.00 4235592     33   83.80 4720653 2149771   1235 247.32
4.73   0.00   4.20   0.00 4025260     33   82.99 4724900 2139833   1204 247.32
4.72   0.00   4.20   0.00 4035252     33   82.14 4719162 2132023   1264 247.32
4.71   0.00   4.21   0.00 4073206     33   83.68 4744973 2123317   1347 247.32
4.72   0.00   4.21   0.00 4061118     33   80.82 4713615 2188091   1490 247.32
4.72   0.00   4.21   0.00 4051675     33   85.29 4727399 2109011   1205 247.32
4.73   0.00   4.21   0.00 4039056     33   84.65 4724735 2102603   1053 247.32

After the patch

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
5.43   0.00   4.20   0.00 3313143     33   84.96 5434214 1900162   2656 245.51
5.43   0.00   4.20   0.00 3308527     33   85.24 5439695 1809382   2521 245.51
5.42   0.00   4.19   0.00 3316778     33   87.54 5416028 1805835   2256 245.51
5.42   0.00   4.19   0.00 3317673     33   90.44 5426044 1763056   2332 245.51
5.42   0.00   4.19   0.00 3314839     33   88.11 5435732 1792218   2499 245.52
5.44   0.00   4.19   0.00 3293228     33   91.84 5426301 1668597   2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15366
2018-05-18 20:13:34 +00:00
Nathan Whitehorn
b00df92b1f Final fix for alignment issues with the page table first patched with
r333273 and partially reverted with r333594.

Older CPUs implement addition of offsets into the page table by a
bitwise OR rather than actual addition, which only works if the table is
aligned at a multiple of its own size (they also require it to be aligned
at a multiple of 256KB). Newer ones do not have that requirement, but it
hardly matters to enforce it anyway.

The original code was failing on newer systems with huge amounts of RAM
(> 512 GB), in which the page table was 4 GB in size. Because the
bootstrap memory allocator took its alignment parameter as an int, this
turned into a 0, removing any alignment constraint at all and making
the MMU fail. The first round of this patch (r333273) fixed this case by
aligning it at 256 KB, which broke older CPUs. Fix this instead by widening
the alignment parameter.
2018-05-14 04:00:52 +00:00
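
Both halves of this explanation can be checked with a short sketch. The helper names are hypothetical; the first shows why OR-based offsetting needs a base aligned to the table size, and the second shows a 4 GB alignment collapsing to 0 when passed through an `int` (an implementation-defined conversion that keeps the low 32 bits on common compilers).

```c
#include <stdint.h>

/* Older CPUs combine base and offset with OR. That equals addition only
 * when the base is aligned to the (power-of-two) table size. */
static uint64_t pte_addr_or(uint64_t base, uint64_t offset)
{
    return base | offset;
}

/* Hypothetical model of the old allocator interface, which took the
 * alignment as an int: a 4 GB alignment loses its only set bit. */
static int narrow_alignment(uint64_t table_size)
{
    int align = (int)table_size;

    return align;
}
```

With a base aligned to the table size, OR and addition agree; with a base aligned only to 256 KB they can differ, which is the failure mode for the older CPUs.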
Nathan Whitehorn
b9ff14e6e9 Revert changes to hash table alignment in r333273, which broke booting on all G5
systems, pending further analysis.
2018-05-13 23:56:43 +00:00
Justin Hibbits
04de51dbab No need to bzero splpar_vpa entries
splpar_vpa is in the BSS, so is already zeroed when the kernel starts up.

Tested by:	Leandro Lupori
2018-05-11 02:04:01 +00:00
Justin Hibbits
b4a0a59871 Fix PPC symbol resolution
Summary:
There were 2 issues that were preventing correct symbol resolution
on PowerPC/pseries:

1- memory corruption at chrp_attach() - this caused the initial
   part of the symbol table to become zeroed, which would cause
   the kernel linker to fail to parse it.
   (this was probably zeroing out other memory parts as well)

2- DDB symbol resolution wasn't working because symtab contained
   non-relocated addresses but it was given relocated offsets.
   Although relocating the symbol table fixed this, it broke the
   linker, which already handled this case.
   Thus, the fix for this consists in adding a new DDB macro:
   DB_STOFFS(offs) that converts a (potentially) relocated offset
   into one that can be compared with symbol table values.

PR:		227093
Submitted by:	Leandro Lupori <leandro.lupori_gmail.com>
Differential Revision: https://reviews.freebsd.org/D15372
2018-05-10 03:59:48 +00:00
Warner Losh
5aa07b053a Move MI-ish bcopy routine to libkern
riscv and powerpc have nearly identical bcopy.c that's
supposed to be mostly MI. Move it to the MI libkern.

Differential Revision: https://reviews.freebsd.org/D15374
2018-05-10 02:31:38 +00:00
Justin Hibbits
151c44e22b Fix wrong cpu0 identification
Summary:
chrp_cpuref_init() was relying on the bootstrap processor to be
the first child of /cpus. That was not always the case, especially
on pseries with FDT.

This change uses the "reg" property of each CPU instead and also
adds several sanity checks to avoid unexpected behavior (maybe
too many panics?).

The main observed symptom was interrupts being missed by the main
processor, leading to timeouts and the kernel aborting the boot.

Submitted by:	Leandro Lupori
Reviewed by:	nwhitehorn
Differential Revision: https://reviews.freebsd.org/D15174
2018-05-08 13:23:39 +00:00
Justin Hibbits
10d0cdfc6e Add support for powernv POWER9 MMU initialization
The POWER9 MMU (PowerISA 3.0) is slightly different from current
configurations, using a partition table even for hypervisor mode, and
dropping the SDR1 register.  Key off the newly early-enabled CPU features
flags for the new architecture, and configure the MMU appropriately.

The POWER9 MMU ignores the "PSIZ" field in the PTCR, and expects a 64kB
table.  As we are enabled for powernv (hypervisor mode, no VMs), only
initialize partition table entry 0, and zero out the rest.  The actual
contents of the register are identical to SDR1 from previous architectures.

Along with this, fix a bug in the page table allocation with very large
memory.  The table can be allocated on any 256k boundary.  The
bootstrap_alloc alignment argument is an int, and with large amounts of
memory passing the size of the table as the alignment will overflow an
integer.  Hard-code the alignment at 256k as wider alignment is not
necessary.

Reviewed by:	nwhitehorn
Tested by:	Breno Leitao
Relnotes:	Yes
2018-05-05 16:00:02 +00:00
Justin Hibbits
55a12bbda2 Break out the cpu_features setup to its own function, to be run earlier
The new POWER9 MMU configuration is slightly different from current setups.
Rather than special-casing on POWER9, move the initialization of cpu_features
and cpu_features2 to as early as possible, so that platform and MMU
configuration can be based upon CPU features instead of specific CPUs if at all
possible.

Reviewed by:	nwhitehorn
2018-05-05 15:48:39 +00:00
Justin Hibbits
4f4f92c58f Add POWER9 to the POWER8 bootstrap case blocks
POWER8 and POWER9 have similar configuration requirements for hypervisor setup,
and in the cases here they're identical.  Add the POWER9 constant to the POWER8
list so it's initialized correctly.

Reviewed by:	nwhitehorn
2018-05-05 15:42:58 +00:00
Mateusz Guzik
a571c38536 Allow __builtin_memmove instead of bcopy for small buffers of known size
See r323329 for an explanation why this is a good idea.
2018-05-04 04:00:48 +00:00
Justin Hibbits
971b5e4da8 Remove dead errata fixup code
This code caused more problems than it should have fixed (boot failures) on
the machines I tested, so has been commented out for a while now.  Remove
it, and assume the errata fixups were done by the bootloader where they
belong.
2018-05-01 04:31:17 +00:00
Nathan Whitehorn
47280ef170 Fix null pointer dereference on nodes without a "compatible" property.
MFC after:	1 week
2018-04-30 19:37:32 +00:00
Justin Hibbits
42ca1d5cc3 Increase the fdtmemreserv array limit to boot on POWER9
Discussing with others, this needs to be at least 20 to boot on some POWER9
nodes.  Linux made a similar change for the same reason, so increase to 32
to give us some extra breathing room as well.  The input and output arrays
are sized at 256, so much greater than the increase in the property array
size.
2018-04-25 02:42:11 +00:00
Justin Hibbits
38c32a140b Fix the build post r332859
sysentvec::sv_hwcap/sv_hwcap2 are pointers to  u_long, so cpu_features* need
to be u_long to use the pointers.  This also requires a temporary cast in
printing the bitfields, which is fine because the feature flag fields are
only 32 bits anyway.
2018-04-22 03:58:04 +00:00
Justin Hibbits
611c02b1d2 Export powerpc CPU features for auxvec
FreeBSD exports the AT_HWCAP* auxvec items if provided by the ELF sysentvec
structure.  Add the CPU features to be exported, so user space can more
easily check for them without using the hw.cpu_features and hw.cpu_features2
sysctls.
2018-04-21 15:15:47 +00:00
Justin Hibbits
18f48e0c72 Sync powerpc feature flags with Linux
Not all feature flags are synced.  Those for processors we don't currently
support are ignored currently.  Those that are supported are synced as best I
can tell.  One flag was renamed to match the Linux flag name
(PPC_FEATURE2_VCRYPTO -> PPC_FEATURE2_VEC_CRYPTO).
2018-04-21 04:18:17 +00:00
Justin Hibbits
567dd766f6 powerpc64: Set n_slbs = 32 for POWER9
Summary:
POWER9 also contains 32 SLB entries, as explained by the POWER9 User Manual:

 "For HPT translation, the POWER9 core contains a unified (combined for both
   instruction and data), 32-entry, fully-associative SLB per thread"

Submitted by:	Breno Leitao
Differential Revision: https://reviews.freebsd.org/D15128
2018-04-20 03:23:19 +00:00
Justin Hibbits
2914706ab0 powerpc64: Add DSCR support
Summary:
Powerpc64 has support for a register called Data Stream Control Register
(DSCR), which basically controls how the hardware handles caching and
prefetch for stream operations.

Since mfdscr and mtdscr are privileged instructions, we need to emulate them
and keep the custom DSCR configuration per thread.

The purpose of this feature is to change DSCR depending on the operation, for
example setting the DSCR Default Prefetch Depth to deepest for string
operations such as memcpy.

Submitted by:	Breno Leitao
Differential Revision: https://reviews.freebsd.org/D15081
2018-04-20 03:19:44 +00:00
Nathan Whitehorn
323e673945 Fix detection of memory overlap with the kernel in the case where a memory
region marked "available" by firmware is contained entirely in the kernel.

This had a tendency to happen with FDTs passed by loader, though could for
other reasons as well, and would result in the kernel slowly cannibalizing
itself for other purposes, eventually resulting in a crash.

A similar fix is needed for mmu_oea.c and should probably just be rolled
at that point into some generic code in platform.c for taking a mem_region
list and removing chunks.

PR:		226974
Submitted by:	leandro.lupori@gmail.com
Reviewed by:	jhibbits
Differential Revision:	D15121
2018-04-19 18:34:38 +00:00
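
The case analysis behind this fix can be sketched as a range-excision routine. This is an illustration under stated assumptions, not the code from the commit: `struct region` and `excise_range` are hypothetical names, and the kernel's extent is given as a half-open range.

```c
#include <stdint.h>

struct region {          /* hypothetical stand-in for a memory region */
    uint64_t start;
    uint64_t size;
};

/* Remove [kstart, kend) from r and write the 0, 1, or 2 leftover pieces
 * to out. Returns the number of pieces written. */
static int excise_range(struct region r, uint64_t kstart, uint64_t kend,
    struct region out[2])
{
    uint64_t rend = r.start + r.size;
    int n = 0;

    if (rend <= kstart || r.start >= kend) {   /* no overlap at all */
        out[n++] = r;
        return n;
    }
    if (r.start < kstart) {                    /* piece below the kernel */
        out[n].start = r.start;
        out[n].size = kstart - r.start;
        n++;
    }
    if (rend > kend) {                         /* piece above the kernel */
        out[n].start = kend;
        out[n].size = rend - kend;
        n++;
    }
    return n;   /* 0 means the region lies entirely inside the kernel */
}
```

The return value of 0 corresponds to the case named in the commit message: a region marked available by firmware that is contained entirely in the kernel and must be dropped rather than trimmed.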
Alexander Motin
596e6ade92 Release memory resource on cuda driver attach failure.
Submitted by:	Dmitry Luhtionov <dmitryluhtionov@gmail.com>
2018-04-19 15:29:10 +00:00
Andriy Gapon
f3f6ecb450 set kdb_why to "trap" when calling kdb_trap from trap_fatal
This allows hooking a ddb script to the "kdb.enter.trap" event.
Previously there was no specific name for this event, so it could only
be handled by either "kdb.enter.unknown" or "kdb.enter.default" hooks.
Both are very unspecific.

Having a specific event is useful because the fatal trap condition is
very similar to panic but it has an additional property that the current
stack frame is the frame where the trap occurred.  So, both a register
dump and a stack bottom dump have additional information that can help
analyze the problem.

I have added the event only on architectures that have trap_fatal()
function defined.  I haven't looked at other architectures.  Their
maintainers can add support for the event later.

Sample script:
kdb.enter.trap=bt; show reg; x/aS $rsp,20; x/agx $rsp,20

Reviewed by:	kib, jhb, markj
MFC after:	11 days
Sponsored by:	Panzura
Differential Revision: https://reviews.freebsd.org/D15093
2018-04-19 05:06:56 +00:00
Andriy Gapon
6d83b2e971 don't check for kdb reentry in trap_fatal(), it's impossible
trap() checks for it earlier and calls kdb_reentry().

Discussed with:	jhb
MFC after:	12 days
Sponsored by:	Panzura
2018-04-18 15:44:54 +00:00
Brooks Davis
9c11d8d483 Remove the unused fuwintr() and suiwintr() functions.
Half of implementations always failed (returned (-1)) and they were
previously used in only one place.

Reviewed by:	kib, andrew
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15102
2018-04-17 18:04:28 +00:00
Warner Losh
9d89a9f326 No need to force md code to define a macro that's the same as
_BYTE_ORDER. Use that instead.
2018-04-16 13:52:23 +00:00
Justin Hibbits
3877c32ec9 Use a resource hint instead of environment variable for DIU mode
This makes it more consistent with FreeBSD norms, rather than using Linux's
norms.  Now, instead of needing an environment variable

  video-mode=fslfb:1280x1024@60

Now one would use a hint:

  hint.fb.0.mode=1280x1024@60
2018-04-16 04:02:53 +00:00
Justin Hibbits
bda8aa770a Reenter KDB on fault on powerpc, instead of panicking
Most other architectures already re-enter KDB on faults; powerpc and mips
are the only outliers.  Correct this for powerpc, so that now bad addresses
can be handled gracefully instead of panicking.
2018-04-10 21:14:54 +00:00
Justin Hibbits
99adcecf38 Call through powerpc_interrupt for all Book-E interrupts
Make int_external_input, int_decrementer, and int_performance_counter all
now use trap_common, just like on AIM.  The effects of this are:

* All traps are now properly displayed in ddb.  Previously traps from
  external input, decrementer, and performance counters, would display as
  just basic stack traces.  Now the frame is displayed.

* External interrupts are now handled with interrupts enabled, so handling
  can be preempted.  This seems to fix a hang found post-r329882.
2018-04-10 17:32:27 +00:00
Oleksandr Tymoshenko
f7604b1b27 Align OF_getencprop_alloc API with OF_getencprop and OF_getprop_alloc
Change OF_getencprop_alloc semantics to be a combination of malloc and
OF_getencprop and return the size of the property, not the number of
elements allocated.

For the use cases where the number of elements is preferred, introduce the
OF_getencprop_alloc_multi helper function, which copies the semantics
of OF_getencprop_alloc prior to this change.

This is to make OF_getencprop_alloc and OF_getencprop_alloc_multi
function signatures consistent with OF_getprop_alloc and
OF_getprop_alloc_multi.

Functionality-wise this patch is mostly a rename of OF_getencprop_alloc
to OF_getencprop_alloc_multi, except for two calls in ofw_bus_setup_iinfo
where 1 was used as a block size.
2018-04-09 22:06:16 +00:00
Oleksandr Tymoshenko
217d17bcd3 Clean up OF_getprop_alloc API
OF_getprop_alloc takes an element size argument and returns the number of
elements in the property. There are valid use cases for such behavior,
but mostly API consumers pass 1 as the element size to get string
properties. What API users would expect from OF_getprop_alloc is to be
a combination of malloc + OF_getprop with the same semantics for the return
value. This patch modifies the API signature to match these expectations.

For the valid use cases with element size != 1, and to reduce the
modification scope, a new OF_getprop_alloc_multi function has been
introduced that behaves the same way OF_getprop_alloc behaved prior to
this patch.

Reviewed by:	ian, manu
Differential Revision:	https://reviews.freebsd.org/D14850
2018-04-08 22:59:34 +00:00
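
The difference between the two return conventions can be sketched with stand-in functions. `prop_alloc` and `prop_alloc_multi` are hypothetical and operate on an in-memory buffer; they are not the Open Firmware API and their signatures are assumptions chosen for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in: malloc + copy. Returns the property size in
 * bytes, or -1 on failure. The caller frees *buf. */
static long prop_alloc(const void *prop, size_t len, void **buf)
{
    *buf = malloc(len ? len : 1);
    if (*buf == NULL)
        return -1;
    memcpy(*buf, prop, len);
    return (long)len;
}

/* Hypothetical stand-in: same copy, but returns the number of
 * elsz-sized elements, or -1 if the length is not a multiple of elsz. */
static long prop_alloc_multi(const void *prop, size_t len, size_t elsz,
    void **buf)
{
    *buf = NULL;
    if (elsz == 0 || len % elsz != 0)
        return -1;
    if (prop_alloc(prop, len, buf) < 0)
        return -1;
    return (long)(len / elsz);
}
```

For an 8-byte property, the first form returns 8 and the second, called with an element size of 4, returns 2. Callers that want a string use the first; callers that want an array of cells use the second.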
Justin Hibbits
b4b4b17687 Fix typo
Reserved cause is 6, not 5.

Reported by:	cem
2018-04-08 19:33:05 +00:00
Justin Hibbits
ac2605b1d1 Powerpc64: Add the facility unavailable trap subsystem
Summary:
This code adds the basic infrastructure for the facility subsystem. A facility
trap is raised when an unavailable instruction is executed. One example is
executing a Hardware Transactional Memory instruction while the MSR[TM] is
disabled. In the past, there was a specific interrupt for it (FP, VEC), but the
new instructions seem to be multiplexed on this facility interrupt.

The root cause of the trap is provided in the Facility Status and Control
Register (FSCR).

Submitted by:	Breno Leitao
Reviewed by:	nwhitehorn
Differential Revision: https://reviews.freebsd.org/D14566
2018-04-08 19:11:25 +00:00
Justin Hibbits
3762bafa7b powerpc64: Print current MSR on printtrap()
Summary:
Print current MSR on printtrap(). Currently, printtrap just prints srr1, which
contains part of the MSR prior to the exception. I find it useful to dump the
current value of the MSR, since it changes when there is an interruption.

With this patch, this is the new printtrap model:

handled user trap:

    exception       = 0x700 (program)
    srr0            = 0x100008a0 (0x100008a0)
    srr1            = 0x800000000002f032
    current msr     = 0x8000000000009032
    lr              = 0x1000089c (0x1000089c)
    curthread       = 0x7a50000
	pid = 714, comm = ttrap2

Submitted by:	Breno Leitao
Reviewed by:	nwhitehorn
Differential Revision: https://reviews.freebsd.org/D14600
2018-04-08 16:55:28 +00:00
Justin Hibbits
8238d3423d powerpc64: Avoid calling isync twice
Summary:
It is not necessary to call isync() after calling the mtmsr() function, mainly
because mtmsr() executes 'isync' internally to synchronize the machine state
register. Since isync() just executes the 'isync' instruction, the
instruction was being executed twice, which is unnecessary.

This patch just removes the unnecessary calls to isync() after mtmsr().

Submitted by:	Breno Leitao
Differential Revision: https://reviews.freebsd.org/D14583
2018-04-08 16:46:24 +00:00
Justin Hibbits
d6d0670814 powerpc/ofw: Fix malloc inside lock
Summary:
Currently ofw_real_bounce_alloc() is requesting memory, using WAITOK, while
holding a non-sleepable lock called 'OF Bounce Page'.

Fix this by allocating the pages outside of the lock, and only updating the
global variables while holding the lock.

Submitted by:	Breno Leitao
Differential Revision:	https://reviews.freebsd.org/D14955
2018-04-08 16:43:56 +00:00
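
The allocate-then-lock pattern described here can be sketched in user space with a pthread mutex standing in for the kernel lock. All names below are hypothetical, and freeing the previous buffer is an assumption of this sketch rather than something the commit message states.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t bounce_lock = PTHREAD_MUTEX_INITIALIZER;
static void *bounce_page;      /* hypothetical globals guarded by the lock */
static size_t bounce_size;

static int bounce_alloc(size_t size)
{
    void *page = malloc(size); /* may block, so done before locking */
    void *old;

    if (page == NULL)
        return -1;

    pthread_mutex_lock(&bounce_lock);
    old = bounce_page;         /* only the pointer swap happens under lock */
    bounce_page = page;
    bounce_size = size;
    pthread_mutex_unlock(&bounce_lock);

    free(old);                 /* release the old buffer outside the lock */
    return 0;
}
```

The critical section contains only assignments, so nothing inside it can sleep.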
Brooks Davis
6469bdcdb6 Move most of the contents of opt_compat.h to opt_global.h.
opt_compat.h is mentioned in nearly 180 files. In-progress network
driver compatibility improvements may add over 100 more so this is
closer to "just about everywhere" than "only some files" per the
guidance in sys/conf/options.

Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of
sys/compat/linux/*.c.  A fake _COMPAT_LINUX option ensures opt_compat.h
is created on all architectures.

Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the
set of compiled files.

Reviewed by:	kib, cem, jhb, jtl
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14941
2018-04-06 17:35:35 +00:00
Justin Hibbits
9225dfbdf0 Correct the ilog2() for calculating memory sizes.
TLB1 can handle ranges up to 4GB (through e5500, larger in e6500), but
ilog2() took an unsigned int, which maxes out at 4GB-1 and truncates larger
values silently.  Increase the input range to the largest supported, at least
for 64-bit targets.  This lets the DMAP be completely mapped, instead of only
1GB blocks being mapped while it is assumed to be fully mapped.
2018-04-04 02:13:27 +00:00
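
The silent truncation can be demonstrated with two illustrative variants. These are hypothetical implementations written for this sketch, not the kernel's ilog2(); both return floor(log2(n)) for n > 0 and -1 for n == 0, and the sketch assumes a 32-bit `unsigned int`.

```c
#include <stdint.h>

/* Narrow variant: a 64-bit argument is truncated before the body runs. */
static int ilog2_u32(unsigned int n)
{
    int r = -1;

    while (n != 0) {
        n >>= 1;
        r++;
    }
    return r;
}

/* Wide variant: accepts the full 64-bit range. */
static int ilog2_u64(uint64_t n)
{
    int r = -1;

    while (n != 0) {
        n >>= 1;
        r++;
    }
    return r;
}
```

A size of exactly 4GB becomes 0 when narrowed to 32 bits, so the narrow variant cannot report 32 for it, while the wide variant does.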
Justin Hibbits
9f5b999aca Add support for a pmap direct map for 64-bit Book-E
As with AIM64, map the DMAP at the beginning of the fourth "quadrant" of
memory, and move the KERNBASE to the start of KVA.

Eventually we may run the kernel out of the DMAP, but for now, continue
booting as it has been.
2018-04-03 00:45:38 +00:00
Justin Hibbits
9ae2eed9f8 Debug interrupts aren't instruction traps
The EXC_DEBUG type is akin to the MPC74xx "Instruction Breakpoint" trap.
Don't treat it as a trap instruction.
2018-03-23 00:40:08 +00:00
Ed Maste
fc2a8776a2 Rename assym.s to assym.inc
assym is only to be included by other .s files, and should never
actually be assembled by itself.

Reviewed by:	imp, bdrewery (earlier)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D14180
2018-03-20 17:58:51 +00:00
Justin Hibbits
a029f84189 Fix powerpc Book-E build post-331018/331048.
pagedaemon_wakeup() was moved from vm_pageout.h to vm_pagequeue.h.
2018-03-20 01:07:22 +00:00
Oleksandr Tymoshenko
108117cc22 [ofw] fix erroneous checks for OF_finddevice(9) return value
OF_finddevice returns ((phandle_t)-1) in case of failure. Some code
in existing drivers checked whether the return value was equal to 0 or
less than or equal to 0, which is also wrong because phandle_t is an
unsigned type. Most of these checks were for negative cases that were never
triggered, so there was no impact on functionality.

Reviewed by:	nwhitehorn
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D14645
2018-03-20 00:03:49 +00:00
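The class of bug fixed in 108117cc22 can be sketched outside the kernel. This is an illustration only: the typedef assumes phandle_t is an unsigned 32-bit type, and PHANDLE_INVALID, node_missing_broken, and node_missing are hypothetical names, not part of the OFW API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: phandle_t is an unsigned 32-bit type. */
typedef uint32_t phandle_t;

/* Hypothetical name for the failure value described in the commit. */
#define	PHANDLE_INVALID	((phandle_t)-1)

/*
 * Broken pattern: for an unsigned type "<= 0" is the same as "== 0",
 * so the (phandle_t)-1 failure value is treated as a valid node.
 */
static bool
node_missing_broken(phandle_t node)
{
	return (node <= 0);
}

/* Corrected pattern: compare against the documented failure value. */
static bool
node_missing(phandle_t node)
{
	return (node == PHANDLE_INVALID);
}
```

With these helpers, node_missing_broken(PHANDLE_INVALID) is false, which is why the old checks never fired, while node_missing(PHANDLE_INVALID) is true.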
Wojciech Macek
d90930743f Reverting r330925 for now 2018-03-15 06:19:45 +00:00
Wojciech Macek
22eedd96c7 PowerNV: Fix I2C to compile if FDT is disabled
Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-03-14 09:20:03 +00:00
Nathan Whitehorn
9b0ec025d4 Restore missing temporary variable, deleted by accident in r330845. This
unbreaks the ppc32 AIM build.

Reported by:	jhibbits
2018-03-13 18:24:21 +00:00
Nathan Whitehorn
8864f35942 Execute PowerPC64/AIM kernel from direct map region when possible.
When the kernel can be in real mode in early boot, we can execute from
high addresses aliased to the kernel's physical memory. If that high
address has the first two bits set to 1 (0xc...), those addresses will
automatically become part of the direct map. This reduces page table
pressure from the kernel and it sets up the kernel to be used with
radix translation, for which it has to be up here.

This is accomplished by exploiting the fact that all PowerPC kernels are
built as position-independent executables and relocate themselves
on start. Before this patch, the kernel runs at 1:1 VA:PA, but that
VA/PA is random and set by the bootloader. Very early, it processes
its ELF relocations to operate wherever it happens to find itself.
This patch uses that mechanism to re-enter and re-relocate the kernel
a second time with a new base address set up in the early parts of
powerpc_init().

Reviewed by:	jhibbits
Differential Revision:	D14647
2018-03-13 15:03:58 +00:00
Nathan Whitehorn
35feca377d Make FDT-using parts of ofw_machdep.c condition on options FDT. This fixes
the kernel build when options FDT is absent.
2018-03-11 01:09:31 +00:00
Nathan Whitehorn
f9edb09d70 Move the powerpc64 direct map base address from zero to high memory. This
accomplishes a few things:
- Makes NULL an invalid address in the kernel, which is useful for catching
  bugs.
- Lays groundwork for radix-tree translation on POWER9, which requires the
  direct map be at high memory.
- Similarly lays groundwork for a direct map on 64-bit Book-E.

The new base address is chosen as the base of the fourth radix quadrant
(the minimum kernel address in this translation mode) and because all
supported CPUs ignore at least the first two bits of addresses in real
mode, allowing direct-map addresses to be used in real-mode handlers.
This is required by Linux and is part of the architecture standard
starting in POWER ISA 3, so can be relied upon.

Reviewed by:	jhibbits, Breno Leitao
Differential Revision:	D14499
2018-03-07 17:08:07 +00:00
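The address arithmetic behind f9edb09d70 can be sketched as follows. This is an illustration under stated assumptions: the base value is inferred from the "0xc..." prefix mentioned in 8864f35942, and the SKETCH_* constants and sketch_* helpers are hypothetical names, not the kernel's macros.

```c
#include <stdint.h>

/* Assumed direct map base: the two most significant address bits set. */
#define	SKETCH_DMAP_BASE	0xc000000000000000ULL
/* Mask modelling a CPU that ignores the top two bits in real mode. */
#define	SKETCH_REAL_MODE_MASK	0x3fffffffffffffffULL

/* Translate a physical address to its direct-map virtual address. */
static uint64_t
sketch_phys_to_dmap(uint64_t pa)
{
	return (pa | SKETCH_DMAP_BASE);
}

/* Address the CPU would actually use for 'va' in real mode. */
static uint64_t
sketch_real_mode_addr(uint64_t va)
{
	return (va & SKETCH_REAL_MODE_MASK);
}
```

Under these assumptions physical address 0 maps to a non-NULL virtual address, and a direct-map address used in a real-mode handler resolves back to the original physical address.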
Nathan Whitehorn
72820025dd Fix use of uninitialized variables. 2018-03-06 15:52:43 +00:00
Jonathan T. Looney
beb2406556 amd64: Protect the kernel text, data, and BSS by setting the RW/NX bits
correctly for the data contained on each memory page.

There are several components to this change:
 * Add a variable to indicate the start of the R/W portion of the
   initial memory.
 * Stop detecting NX bit support for each AP.  Instead, use the value
   from the BSP and, if supported, activate the feature on the other
   APs just before loading the correct page table.  (Functionally, we
   already assume that the BSP and all APs had the same support or
   lack of support for the NX bit.)
 * Set the RW and NX bits correctly for the kernel text, data, and
   BSS (subject to some caveats below).
 * Ensure DDB can write to memory when necessary (such as to set a
   breakpoint).
 * Ensure GDB can write to memory when necessary (such as to set a
   breakpoint).  For this purpose, add new MD functions gdb_begin_write()
   and gdb_end_write() which the GDB support code can call before and
   after writing to memory.

This change is not comprehensive:
 * It doesn't do anything to protect modules.
 * It doesn't do anything for kernel memory allocated after the kernel
   starts running.
 * In order to avoid excessive memory inefficiency, it may let multiple
   types of data share a 2M page, and assigns the most permissions
   needed for data on that page.

Reviewed by:	jhb, kib
Discussed with:	emaste
MFC after:	2 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D14282
2018-03-06 14:28:37 +00:00
Nathan Whitehorn
c1d6f6ebe8 Honor physical memory regions marked unavailable in the FDT, when present.
The most notable of these is the FDT itself, which it is a bad idea to
overwrite.
2018-03-03 02:06:48 +00:00
Nathan Whitehorn
1a60bed731 Remove assumption that all physical memory is available to the kernel and
that the physical and available memory arrays are interchangeable.
2018-03-03 02:04:40 +00:00
Wojciech Macek
4ffd72e34c PowerNV: Initial support for OPAL I2C transfers
Add I2C OPAL driver and a set of dummy-ones to allow
all I2C things on Power8 to attach.

TODO: better async token management

Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-03-01 14:11:07 +00:00
Justin Hibbits
5903f5954a Fix the psl_userset32 definition.
It should be based on psl_userset, not psl_kernset.  As kernset, it would
inherit kernel config, including privilege level.
2018-03-01 04:44:17 +00:00
Justin Hibbits
2d5320a818 Increase the size of a reservation granule for TLB locks
A reservation granule on PowerPC is a cache line.

On e500mc and derivatives a cacheline size is 64 bytes, not 32.  Allocate
the maximum size permitted, but only utilize the size that is needed.  On
e500v1 and e500v2 the reservation granule will still be 32 bytes.
2018-02-27 04:38:27 +00:00
Justin Hibbits
7fa00cd0ab Fix a minor typo. 2018-02-27 04:23:03 +00:00
Justin Hibbits
e6939726ef Correct a copy&paste-o -- altivec assist interrupt, not watchdog 2018-02-26 03:05:36 +00:00
Nathan Whitehorn
f638d50513 Avoid dereferencing random memory when kickstarting DMA.
MFC after: 1 week
2018-02-24 22:34:56 +00:00
Justin Hibbits
7eb6081727 Unbreak 64-bit Book-E builds post r329712
can_wakeup is defined only in AIM's locore64.S, so conditionalize use of it
on AIM in addition to powerpc64.
2018-02-24 18:12:38 +00:00
Justin Hibbits
635d2bed1d Make MPC85XXSPE kernel conf ident match the file name 2018-02-24 17:54:12 +00:00
Justin Hibbits
675147dd7b Change ident for QORIQ64 kernel conf
Make it match the conf file name.
2018-02-24 17:53:22 +00:00
Justin Hibbits
571892ff4d Unbreak the build after r329891
I was apparently a little too excited with deleting code, and apparently
didn't do a final test build before commit.  Restore cpu_idle_wakeup().
2018-02-24 17:29:29 +00:00
Justin Hibbits
6708989b60 Remove platform_cpu_idle() and platform_cpu_idle_wakeup() interfaces
These interfaces were put in place to let QorIQ SoCs dictate CPU idling
semantics, in order to support capabilities such as NAP mode and deep sleep.
However, this never stabilized, and the idling support reverted back to
CPU-level rather than SoC level.  Move this code back to cpu.c instead.  If
at a later date the lower power modes do come to fruition, it should be done
by overriding the cpu_idle_hook instead of this platform hook.
2018-02-24 01:46:56 +00:00
Wojciech Macek
3c41c1d446 powerpc64: add NVMe to GENERIC64
NVMe support is ready and should be compiled-in
to the ppc64 kernel.

Submitted by:          Wojciech Macek <wma@semihalf.org>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-02-23 07:43:52 +00:00
Warner Losh
ef1fcaf0f5 Do not include float interfaces when using libsa.
We don't support float in the boot loaders, so don't include
interfaces for float or double in systems headers. In addition, take
the unusual step of spiking double and float to prevent any more
accidental seepage.
2018-02-23 04:04:25 +00:00
Nathan Whitehorn
d9dbc2104f Add definition for the PowerPC A2. 2018-02-21 15:15:58 +00:00
Nathan Whitehorn
dddf28585d Add definitions for the new Radix MMU mode on POWER9+ CPUs. 2018-02-21 15:15:31 +00:00
Wojciech Macek
6d13fd638c PowerNV: Put processor to power-save state in idle thread
When a processor enters the power-save state, it releases resources shared with
other CPU threads, which makes other cores run much faster.

This patch also implements saving and restoring registers that might get
corrupted in power-save state.

Submitted by:          Patryk Duda <pdk@semihalf.com>
Obtained from:         Semihalf
Reviewed by:           jhibbits, nwhitehorn, wma
Sponsored by:          IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D14330
2018-02-21 14:28:40 +00:00
Wojciech Macek
eb96cc1364 PowerNV: add missing RTC_WRITE support
Add function which can store RTC values to OPAL.

Submitted by:          Wojciech Macek <wma@semihalf.org>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-02-21 08:13:17 +00:00
Justin Hibbits
fcc491a3fe Split printtrap() into generic and CPU-specific components
Summary:
This compartmentalizes the CPU-specific trap components into its own
function, rather than littering the general printtrap() with various checks.
This will let us replace a series of #ifdef's with a runtime conditional check
in the future.

Reviewed By:	nwhitehorn
Differential Revision:	https://reviews.freebsd.org/D14416
2018-02-21 03:34:33 +00:00
Konstantin Belousov
2c0f13aa59 vm_wait() rework.
Make vm_wait() take the vm_object argument which specifies the domain
set to wait for the min condition pass.  If there is no object
associated with the wait, use curthread's policy domainset.  The
mechanics of the wait in vm_wait() and vm_wait_domain() is supplied by
the new helper vm_wait_doms(), which directly takes the bitmask of the
domains to wait for passing min condition.

Eliminate pagedaemon_wait().  vm_domain_clear() handles the same
operations.

Eliminate VM_WAIT and VM_WAITPFAULT macros, the direct functions calls
are enough.

Eliminate several control state variables from vm_domain, unneeded
after the vm_wait() conversion.

Sketched and reviewed by:	jeff
Tested by:	pho
Sponsored by:	The FreeBSD Foundation, Mellanox Technologies
Differential revision:	https://reviews.freebsd.org/D14384
2018-02-20 10:13:13 +00:00
Wojciech Macek
f32ebdc85c PowerPC: Switch to more accurate unit to avoid division rounding
On POWER8 architecture there is a timer with 512 MHz frequency.
It has a period of about 1.95 ns, but it is rounded to 1 ns, which is not accurate.

Submitted by:          Patryk Duda <pdk@semihalf.com>
Obtained from:         Semihalf
Reviewed by:           wma
Sponsored by:          IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D14433
2018-02-20 07:30:57 +00:00
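The rounding problem described in f32ebdc85c is plain integer division. The sketch below assumes picoseconds as the finer unit, which the commit message does not state; all function names are hypothetical.

```c
#include <stdint.h>

/* Timer period in whole nanoseconds (integer division truncates). */
static uint64_t
period_ns(uint64_t hz)
{
	return (1000000000ULL / hz);
}

/* Timer period in whole picoseconds; far less is lost to truncation. */
static uint64_t
period_ps(uint64_t hz)
{
	return (1000000000000ULL / hz);
}

/* Convert ticks to nanoseconds through the picosecond period. */
static uint64_t
ticks_to_ns_via_ps(uint64_t ticks, uint64_t hz)
{
	return (ticks * period_ps(hz) / 1000);
}
```

For a 512 MHz timer, period_ns() yields 1 while period_ps() yields 1953, so 1000 ticks come out as 1953 ns rather than 1000 ns. The multiplication in ticks_to_ns_via_ps() can overflow for very large tick counts; the sketch ignores that.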
Wojciech Macek
838070d5f4 PowerNV: Send SIGILL on HEA illegal instruction exception
Currently Hypervisor Emulation Assistance interrupt is unhandled.
Executing an undefined instruction in userland triggers kernel panic.
Handle this the same way as Facility Unavailable Interrupt - send
SIGILL signal to userspace.

Submitted by:          Michal Stanek <mst@semihalf.com>
Obtained from:         Semihalf
Reviewed by:           nwhitehorn, pdk@semihalf.com, wma
Sponsored by:          IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D14437
2018-02-20 06:38:55 +00:00
Nathan Whitehorn
65184f89b6 Set internal error returns for OF_peer(), OF_child(), and OF_parent() to
zero, matching the IEEE 1275 standard. Since these internal error paths
have never, to my knowledge, been taken, behavior is unchanged.

Reported by:	gonzo
MFC after:	2 weeks
2018-02-19 15:49:14 +00:00
Justin Hibbits
bce6d88bc1 Merge AIM and Book-E PCPU fields
This is part of a long-term goal of merging Book-E and AIM into a single GENERIC
kernel.  As more work is done, the struct may be optimized further.

Reviewed by:	nwhitehorn
2018-02-17 20:59:12 +00:00
Justin Hibbits
a00ce4e854 PPC64: Get the timebase from the proper OF field
Summary:
After revision rS328534('PPC64: use hwref instead of cpuid'), FreeBSD on
powerpc64 virtual machine panics since it is unable to read the
timebase, showing the following error:

     get-property for timebase-frequency on zero phandle

     panic: Unable to determine timebase frequency!

With the change above,  cpuref->cr_hwref does not contain the phandle
anymore, thus, it never reads the proper CPU entry in OF.

Submitted by:	Breno Leitao
Differential Revision:	https://reviews.freebsd.org/D14204
2018-02-14 02:51:28 +00:00
Justin Hibbits
26e251b55c powerpc64/pseries: Define new hcalls
Summary:
Define new hcalls as in 'Linux on Power Architecture Platform Reference'
version 1.1 (24 March 2016) downloaded from:

        https://members.openpowerfoundation.org/document/dl/469

Submitted by:	Breno Leitao
Differential Revision:	https://reviews.freebsd.org/D14281
2018-02-14 02:48:27 +00:00
Jeff Roberson
e958ad4cf3 Make v_wire_count a per-cpu counter(9) counter. This eliminates a
significant source of cache line contention from vm_page_alloc().  Use
accessors and vm_page_unwire_noq() so that the mechanism can be easily
changed in the future.

Reviewed by:	markj
Discussed with:	kib, glebius
Tested by:	pho (earlier version)
Sponsored by:	Netflix, Dell/EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D14273
2018-02-12 22:53:00 +00:00
Warner Losh
62bca77843 Move __va_list and related defines to sys/sys/_types.h
__va_list and related defines are identical in all the
ARCH/include/_types.h files. Move them to sys/sys/_types.h

Sponsored by: Netflix
2018-02-12 14:48:20 +00:00
Warner Losh
982e7bdafc We don't support gcc < 4.2.1, so varargs.h now is just #error
always. Unifdef for versions prior to 4.2.1 and remove now-unused
header files.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D14323
2018-02-12 14:48:14 +00:00
Warner Losh
33e959abab Use standard pattern for stdargs.h
We don't support older compilers. Most of the code in these files is
for pre-3.0 gcc, which is at least 15 years obsolete. Move to using
phk's sys/_stdargs.h for all these platforms.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D14323
2018-02-12 14:48:05 +00:00
Nathan Whitehorn
778c8dac96 Fix PowerMac G5 thermal management, plus likely other bugs, introduced in
r328113 and affecting SMP systems.

The way the time is set on PowerMacs is racy and relies on all the
CPUs in the system setting a register simultaneously in a rendezvous. A
few-cycle delay can result in out-of-sync times, which can break the
scheduler and result in calls like mtx_sleep() and pause() never timing out
if the thread is migrated while sleeping. r328113 added a call to a no-op
function between the beginning of the rendezvous and setting the time that
was only called on APs and added enough cycles to cause a problematic offset.
For some reason, the fan-management code was the first place this appeared.

Clue from:	andreast
Reported by:	many
2018-02-09 20:09:32 +00:00
Mark Johnston
ab7c09f121 Use vm_page_unwire_noq() instead of directly modifying page wire counts.
No functional change intended.

Reviewed by:	alc, kib (previous revision)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D14266
2018-02-08 19:28:51 +00:00
Jeff Roberson
e2068d0bcd Use per-domain locks for vm page queue free. Move paging control from
global to per-domain state.  Protect reservations with the free lock
from the domain that they belong to.  Refactor to make vm domains more
of a first class object.

Reviewed by:    markj, kib, gallatin
Tested by:      pho
Sponsored by:   Netflix, Dell/EMC Isilon
Differential Revision:  https://reviews.freebsd.org/D14000
2018-02-06 22:10:07 +00:00
Justin Hibbits
b0d3bb2613 Only look for L2 cache controllers for mpc85xx_cache
The L3 cache controller (Corenet Platform Cache) is listed with one of its
compatible strings as "cache", which this driver can't attach to.  Restrict
to a known list of primary cache controller strings, as found in the l2cache
devicetree binding.
2018-02-04 20:07:08 +00:00
Justin Hibbits
ce2d51972f Start building modules for MPC85XX and MPC85XXSPE
These kernels aren't restricted to development boards anymore, they are
closer in behavior to GENERIC, so build modules.
2018-02-04 15:40:48 +00:00
Justin Hibbits
2c26c98c89 Add sdhci to MPC85XX build 2018-02-04 15:39:15 +00:00
Steve Wills
aa3c83c3c6 Create GENERIC64-NODEBUG for powerpc64
Approved by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D14192
2018-02-04 14:27:12 +00:00
Steve Wills
e1782bae5f Correct longjmp
Reviewed by:	nwhitehorn
Differential Revision:	https://reviews.freebsd.org/D14159
2018-02-02 02:28:25 +00:00
Nathan Whitehorn
619282986d Change the default MSR values used when starting userland and kernel
threads from compile-time defines to global variables. This removes a
significant amount of duplicated runtime patches to the compile-time
defines, centralizing the conditional logic in the early startup code.

Reviewed by:	jhibbits
2018-02-01 05:31:24 +00:00
Nathan Whitehorn
564ac41556 Fix build on 32-bit PowerPC, broken in r328537. 2018-02-01 05:28:02 +00:00
Wojciech Macek
d32802f0c3 PowerNV: fix compilation on non-NV platforms
Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-01-31 06:42:01 +00:00
Wojciech Macek
70bb600a0a PowerNV: move LPCR and LPID altering to cpudep_ap_early_bootstrap
It turns out that under some circumstances we can get DSI or DSE before we set
LPCR and LPID so we should set it as early as possible.

Authored by:           Patryk Duda <pdk@semihalf.com>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-01-29 09:27:02 +00:00
Wojciech Macek
f0393bbf34 PPC64: use hwref instead of cpuid
On CHRP and PowerNV, use the interrupt server number in the cpuref and pcpu
hwref field instead of the device-tree phandle and make the CPU IDs reported
to the scheduler dense and with the BSP at 0.

Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D14011
2018-01-29 09:15:38 +00:00
Wojciech Macek
b74fb1e713 PPC64: cleanup APs startup routines
Cleaning up AP startup routines. This is a mix of changes
required to make PowerNV running and to modify the code
to be more robust. Previously, some races were seen if more
than 90CPUs were online.

Authored by:           Patryk Duda <pdk@semihalf.com>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D14026
2018-01-29 08:10:03 +00:00
Nathan Whitehorn
eb1baf72ae Remove hard-coded trap-handling logic involving the segmented memory model
used with hashed page tables on AIM and place it into a new, modular pmap
function called pmap_decode_kernel_ptr(). This function is the inverse
of pmap_map_user_ptr(). With POWER9 radix tables, which mapping to use
becomes more complex than just AIM/BOOKE and it is best to have it in
the same place as pmap_map_user_ptr().

Reviewed by:	jhibbits
2018-01-29 04:33:41 +00:00
Warner Losh
d6b6639713 Add ISA PNP tables to ISA drivers. Fix a few incidental comments.
ACPI ISA PNP tables not tagged, there's bigger issues with them.
2018-01-29 00:22:30 +00:00
Nathan Whitehorn
21776ff850 Remove some unused AIM register declarations that existed to support some
CPUs we have never run on. As a side-effect, removes some #ifdef AIM/#else.
2018-01-28 21:30:57 +00:00
Justin Hibbits
0a3ef103a3 Start building modules for QORIQ64
There's no reason not to build modules for 64-bit QorIQ devices.  This
config has evolved to be analogous to the AIM GENERIC64 kernel, so will grow
to match it in more ways as well.
2018-01-28 20:35:48 +00:00
Justin Hibbits
a72b951348 Consolidate trap instruction checks to a single function
Summary:
Rather than duplicating the checks for programmatic traps all over the code, put
it all in one function.  This helps to remove some of the #ifdefs between AIM
and Book-E.

Reviewed By: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D14082
2018-01-28 19:18:40 +00:00
Wojciech Macek
919736d252 PPC: Add place for NULL chars in intrnames
In a corner case we could fall into OOB error.

Authored by:           Patryk Duda <pdk@semihalf.com>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-01-26 09:38:40 +00:00
Nathan Whitehorn
e649493c6d Avoid all SLB operations in trap handling if the process is not using a
software-managed SLB.
2018-01-25 18:10:33 +00:00
Nathan Whitehorn
3e1b393a51 Treat DSE exceptions like DSI exceptions when generating siginfo.
Both can generate SIGSEGV, but DSEs would have put the wrong address
into the siginfo structure when the signal was delivered.

MFC after:	1 week
2018-01-25 18:09:26 +00:00
Wojciech Macek
68c2d255bc PPC: Add KASSERT in intrcnt_add which checks for buffer overflow
Authored by:           Patryk Duda <pdk@semihalf.com>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-01-24 12:01:32 +00:00
Wojciech Macek
a81a290f34 PowerNV: send MSI_EOI always after MSI unmask
MSI/MSI-x interrupts are edge-triggered. If an interrupt
arrives when IRQ line is masked, it will be lost and will
never recover. Perform MSI_EOI always after unmask to give
a chance for PHB/XICS to send an interrupt again if MSI/MSI-x
pending bit is set in MSI/MSI-x BAR space.

Submitted by:          Wojciech Macek <wma@semihalf.org>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-01-23 08:07:00 +00:00
Justin Hibbits
1a55e1d038 Fix 64-bit booke kernel builds after the ldscript changes
Commits r326203 and r326978 broke 64-bit booke kernels by introducing a 1MB
zero-pad between the ELF header and the start of the kernel.  This didn't
cause a build failure, but caused kernels to need to be loaded into memory
1MB lower, which could easily break scripts expecting previous behavior.
This change matches the similar change made to AIM in r327358.
2018-01-23 02:52:12 +00:00
Pedro F. Giffuni
ac2fffa4b7 Revert r327828, r327949, r327953, r328016-r328026, r328041:
Uses of mallocarray(9).

The use of mallocarray(9) has rocketed the required swap to build FreeBSD.
This is likely caused by the allocation size attributes which put extra pressure
on the compiler.

Given that most of these checks are superfluous we have to choose better
where to use mallocarray(9). We still have more uses of mallocarray(9) but
hopefully this is enough to bring swap usage to a reasonable level.

Reported by:	wosch
PR:		225197
2018-01-21 15:42:36 +00:00
Nathan Whitehorn
7790e46cf0 On AIM systems without a software-managed SLB, such as POWER9 systems using
either hardware segment tables or radix-tree-based page tables, do not try
to install SLB entries at trap boundaries.
2018-01-19 22:19:50 +00:00
Nathan Whitehorn
9a8196ce19 Remove SFBUF_OPTIONAL_DIRECT_MAP and such hacks, replacing them across the
kernel by PHYS_TO_DMAP() as previously present on amd64, arm64, riscv, and
powerpc64. This introduces a new MI macro (PMAP_HAS_DMAP) that can be
evaluated at runtime to determine if the architecture has a direct map;
if it does not (or does) unconditionally and PMAP_HAS_DMAP is either 0 or
1, the compiler can remove the conditional logic.

As part of this, implement PHYS_TO_DMAP() on sparc64 and mips64, which had
similar things but spelled differently. 32-bit MIPS has a partial direct-map
that maps poorly to this concept and is unchanged.

Reviewed by:		kib
Suggestions from:	marius, alc, kib
Runtime tested on:	amd64, powerpc64, powerpc, mips64
2018-01-19 17:46:31 +00:00
Wojciech Macek
720212d30d Call platform_smp_ap_init before decr_ap_init
In platform_smp_ap_init we run some crucial code (e.g. setting the LPCR
register) that influences further execution.

In particular, on the PowerNV platform we experienced a Data Storage Interrupt
before the appropriate LPCR was set. It caused code execution from a location
that was legal in the bootloader (petitboot, based on Linux) but illegal in
FreeBSD.
2018-01-18 08:34:20 +00:00
Wojciech Macek
054a090d49 PPC64: fix TOC behavior on process initialization
Set stack pointer to correct value after thread's stack pointer restore

Restoring new thread's stack pointer caused stack corruption because
restored stack pointer didn't point to callee (cpu_switch) stack frame but
caller stack frame.

As a result we had mysterious errors in caller function (sched_switch).

Solution: simply set stack pointer to correct value

Also, initialize TOC to a valid pointer once the thread is being
created.

Created by:            Patryk Duda <pdk@semihalf.com>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Reviewed by:           nwhitehorn
Differential revision: https://reviews.freebsd.org/D13947
Sponsored by:          QCM Technologies
2018-01-18 07:42:51 +00:00
Wojciech Macek
8a0112ca65 PPC: machdep, zero BSS always but BookE
Zero BSS always. The only case when this operation is
omitted is when booting on BookE.

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Reviewed by:           imp, nwhitehorn
Differential revision: https://reviews.freebsd.org/D13948
Sponsored by:          QCM Technologies
2018-01-18 07:41:04 +00:00
Wojciech Macek
e70f868f17 PPC64: add AHCI back to GENERIC64 2018-01-18 06:28:21 +00:00
Wojciech Macek
f7b509a109 PPC64: implement missing busdma ops
Add missing little-endian 64-bit read and write. Since there
is no direct ASM opcode for this, perform byte swap if
necessary.

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
2018-01-17 09:45:18 +00:00
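The "byte swap if necessary" approach from f7b509a109 can be sketched in portable C. This is not the busdma implementation: sketch_bswap64, host_is_big_endian, and sketch_le64_to_host are hypothetical names, and the endianness probe is done at run time purely for illustration.

```c
#include <stdint.h>

/* Reverse the byte order of a 64-bit value. */
static uint64_t
sketch_bswap64(uint64_t v)
{
	uint64_t r = 0;
	int i;

	for (i = 0; i < 8; i++) {
		r = (r << 8) | (v & 0xff);
		v >>= 8;
	}
	return (r);
}

/* Return nonzero when the host stores the most significant byte first. */
static int
host_is_big_endian(void)
{
	const uint16_t probe = 1;

	return (*(const uint8_t *)&probe == 0);
}

/* Interpret a raw 64-bit load as little-endian data. */
static uint64_t
sketch_le64_to_host(uint64_t raw)
{
	return (host_is_big_endian() ? sketch_bswap64(raw) : raw);
}
```

On a big-endian host every little-endian access pays for the swap; on a little-endian host sketch_le64_to_host() returns its argument unchanged.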
Wojciech Macek
91769f6452 PPC64: fix copyinout ranges
Use current userspace address for segment mapping. Previously,
there was a bug which made the function always use the userspace
base address, which could cause data integrity issues.

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
2018-01-17 09:36:48 +00:00
Wojciech Macek
55b823e52a PPC64: add CXGBE and remove AHCI from GENERIC64
Add CXGBE driver which is required for PowerNV system.
Also, remove AHCI which does not work in BigEndian.

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
2018-01-17 09:33:16 +00:00
Wojciech Macek
6005affb74 PowerNV: workaround console on OPAL 5.4
FreeBSD prints text char-by-char, which is not what OPAL
is designed for. Poll events more frequently to avoid buffer
overflow and losing data.

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
2018-01-17 08:01:51 +00:00
Wojciech Macek
5c3e53ef19 PowerNV: make PowerNV PCIe work on real hardware
Fixes:
- map all devices to PE0
- use 1:1 TCE mapping
- provide the same TCE mapping for all PEs (not only PE0)
- add TCE reset and alignment (required by OPAL)

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
2018-01-17 07:39:11 +00:00
Wojciech Macek
8fc8068eba PowerNV: XICS support for PowerNV/OPAL
Make XICS OPAL-aware.

Created by:            Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Sponsored by:          FreeBSD Foundation
2018-01-16 06:24:19 +00:00
Justin Hibbits
e64428edf7 Make fsl_sata driver work on P1022
P1022 SATA controller may set the wrong CCR bit for a command completion.
This would previously cause an interrupt storm.  Solve this by marking all
commands complete, and letting the end_transaction deal with the successes.
Causes no problems on P5020.

While here, fix a minor bug in collision detection.  The Freescale SATA
controller only has 16 slots, not 32.
2018-01-16 04:50:23 +00:00
Pedro F. Giffuni
6d5bc1bcab powerpc: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these are likely to overflow; however, the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the surrounding code could handle NULL values.

X-Differential revision: https://reviews.freebsd.org/D13837
2018-01-15 21:10:40 +00:00
Nathan Whitehorn
fc8ea4be2a Install the SLB miss trap-handling code in the SLB-based MMU driver set up,
to which it is specific, rather than in the generic AIM startup code. This
will be required to support the radix-table-based MMU introduced with POWER9.
2018-01-15 16:08:34 +00:00
Nathan Whitehorn
04329fa708 Move the pmap-specific code in copyinout.c that gets pointers to userland
buffers into a new pmap-module function pmap_map_user_ptr() that can
be implemented by the respective modules. This is required to implement
non-segment-based AIM-ish MMU systems such as the radix-tree page tables
introduced by POWER ISA 3.0 and present on POWER9.

Reviewed by:	jhibbits
2018-01-15 06:46:33 +00:00
Nathan Whitehorn
68b9c019aa Document places we assume that physical memory is direct-mapped at zero by
using a new macro PHYS_TO_DMAP, which deliberately has the same name as the
equivalent macro on amd64. This also sets the stage for moving the direct
map to another base address.
2018-01-13 23:14:53 +00:00
Justin Hibbits
4a20766452 Include only the headers needed
The extra headers came through evolution of the file.
2018-01-13 21:10:42 +00:00
Justin Hibbits
8e14018389 Add SPDX identifier to header
Reported by:	pfg
2018-01-13 17:25:48 +00:00
Nathan Whitehorn
222393d5ca Chase removal of FDT fixup code on PowerPC in r327907. 2018-01-13 03:09:05 +00:00
Justin Hibbits
e9f96ff457 Enable L2 cache on supported PowerQUICC and QorIQ platforms
Some PowerQUICC and QorIQ platforms have a L2 cache managed via the
memory-mapped configuration registers, and appear as a node in the device
tree.  This adds basic support to enable the cache.
2018-01-13 01:36:37 +00:00
Jeff Roberson
6f4acaf4c9 Add support for NUMA domains to bus dma tags. This causes all memory
allocated with a tag to come from the specified domain if it meets the
other constraints provided by the tag.  Automatically create a tag at
the root of each bus specifying the domain local to that bus if
available.

Reviewed by:	jhb, kib
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13545
2018-01-12 23:34:16 +00:00
Jeff Roberson
ab3185d15e Implement NUMA support in uma(9) and malloc(9). Allocations from specific
domains can be done by the _domain() API variants.  UMA also supports a
first-touch policy via the NUMA zone flag.

The slab layer is now segregated by VM domains and is precise.  It handles
iteration for round-robin directly.  The per-cpu cache layer remains
a mix of domains according to where memory is allocated and freed.  Well
behaved clients can achieve perfect locality with no performance penalty.

The direct domain allocation functions have to visit the slab layer and
so require per-zone locks which come at some expense.

Reviewed by:	Attilio (a slightly older version)
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
2018-01-12 23:25:05 +00:00
Wojciech Macek
504d9b6029 PowerNV: update OPAL driver
Update OPAL driver with:
- better console support
- proper AP configuration
- enhanced IRQ/OFW mapping
- RTC support

Created by:            Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Sponsored by:          FreeBSD Foundation
2018-01-12 12:14:52 +00:00
Wojciech Macek
ac9b43252a PowerNV: initial support for PCIe host controller
Provide initial support for PCIe host controller as
well as for IOMMU mapping. This commit allows proper
bus enumeration, but does not guarantee DMA operations
are working.

Created by:            Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Sponsored by:          FreeBSD Foundation
2018-01-12 07:55:49 +00:00
Wojciech Macek
fc1689021f PowerNV: add buffer for OPAL console
Avoid the lock in vtophys() by providing a static direct-mapped
spinlock-protected output buffer to use when the console driver
cannot acquire locks for some reason. This allows the idle thread
to use printf() (e.g. the SMP startup messages) without crashing
the kernel.

Created by:            Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by:          Wojciech Macek <wma@freebsd.org>
Sponsored by:          FreeBSD Foundation
2018-01-11 09:42:24 +00:00
Wojciech Macek
c024897601 PowerNV: set LPCR[LPES] correctly
Make sure to set LPCR[LPES] so that external interrupts set SRR0 and SRR1
instead of HSRR0 and HSRR1. Without this, external interrupt handlers would
get the wrong MSR value when executing, causing eventual madness.

Created by:            Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by:          Wojciech Macek <wma@freebsd.org>
Sponsored by:          FreeBSD Foundation
2018-01-11 09:39:38 +00:00
Wojciech Macek
01d7bda7b7 PowerNV: correctly start secondary CPUs
Fix AP startup, which was broken.

Created by:            Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by:          Wojciech Macek <wma@freebsd.org>
Sponsored by:          FreeBSD Foundation
2018-01-11 09:34:33 +00:00
Wojciech Macek
32d1354a39 PowerNV: add reset, poweroff, OPAL console
Add basic power control (reset, power off) and bind
ttyuX to opal console so that init will start login there.

Created by:            Nathan Whitehorn <nw@freebsd.org>
Submitted by:          Wojciech Macek <wma@freebsd.org>
Sponsored by:          FreeBSD Foundation
2018-01-11 09:26:28 +00:00
Wojciech Macek
fb3855e0e7 PowerNV: initial support for OPAL
OPAL is a dedicated firmware acting as a hypervisor.
Add generic functions to provide all access.

Created by:            Nathan Whitehorn <nw@freebsd.org>
Submitted by:          Wojciech Macek <wma@freebsd.org>
2018-01-11 07:40:06 +00:00
Landon J. Fuller
a1df0d9592 Fix minor locking issues in the Power Mac Uninorth PCI bridge driver.
- Call resource_int_value() once during attach, rather than within the
  pci_(read|write)_config() code path; this avoids taking a blocking mutex
  to read kenv variables.

- Use a spin lock to protect non-atomic config space accesses; this matches
  the behavior of Darwin's AppleMacRiscPCI driver.

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D13839
2018-01-10 22:19:11 +00:00
Nathan Whitehorn
566a135bd5 Add XHCI support to powerpc64 GENERIC. This is useful to get input devices
supported on newer POWER hardware and in graphical VMs run on the same,
which are typically XHCI-only. The 32-bit GENERIC kernel, which
does not run on hardware made in the last decade and is unlikely to
encounter XHCI devices, is left unchanged.

PR:		kern/224940
Submitted by:	Gustavo Romero
MFC after:	1 week
2018-01-09 19:41:10 +00:00
Nathan Whitehorn
09f07b0017 Revert r327360, which can cause boot problems on high-CPU-count (>60)
POWER8 and POWER9 systems, pending further analysis.

PR:		224841
2018-01-04 23:07:51 +00:00
Andreas Tobler
7e792cb8f5 The recent bump of MAXDSIZ made 32-bit binary execution on 64-bit powerpc fail.
The data segment was too big.

Add a fix-up function like on ia32 for MAXDSIZ.

While here, bring also the MAXSSIZ closer to amd64 and add an equal fix-up
function for MAXSSIZ.
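
A fix-up of this kind can be sketched as a simple clamp. The function name and the idea of passing the cap as a parameter are assumptions made for illustration; the actual fix-up functions are not shown in this log.

```c
#include <stdint.h>

/*
 * Clamp a resource limit for a 32-bit process to a cap. A cap of zero
 * means "no cap configured" in this sketch.
 */
static uint64_t
sketch_fix_limit(uint64_t limit, uint64_t cap32)
{
	if (cap32 != 0 && limit > cap32)
		return (cap32);
	return (limit);
}
```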

Reviewed by:	jhibbits@
Obtained from:  jhibbits@
Differential Revision:	https://reviews.freebsd.org/D13753
2018-01-03 20:20:43 +00:00
Nathan Whitehorn
67530f82dd Fix reversed endianness that crept in at some point. Blue is now blue
instead of pink.

MFC after:	3 days
2018-01-02 03:59:46 +00:00
Nathan Whitehorn
3972f4c1d4 Remove PIR from PCPU data. It has an implementation-defined meaning that
is of limited utility outside of platform-specific code and can vary
at runtime when running as a hypervisor guest, so does not even have the
virtue of being a static identifier.

Reviewed by:	jhibbits
2017-12-31 20:23:39 +00:00
Nathan Whitehorn
4e05ac247c Fix 32-bit build. 2017-12-31 20:20:55 +00:00
Nathan Whitehorn
f81dfc7f6b Make newer binutils happy by using a bl-type branch instead of b, which
displeases it for some reason. LR is not relevant in this code, so just
do what it wants.
2017-12-31 20:10:08 +00:00
Nathan Whitehorn
ec75f647cc Provide relative, as well as absolute, addresses in trap panics. This
makes it easier to cross-correlate them with instruction listings without
worrying about where the kernel was relocated to.

MFC after:	1 week
2017-12-31 20:08:16 +00:00
Colin Percival
d5d7606c0c Use the TSLOG framework to record entry/exit timestamps for DELAY and
_vprintf; these functions are called in many places and can contribute
meaningfully to the total time spent booting.
2017-12-31 09:24:41 +00:00
Nathan Whitehorn
5261ac0eda Use data from the boot loader to pick the appropriate output graphics mode
instead of hard-coding a default. This information is passed implicitly by
the PS3 firmware and can be relied upon. Also adjust the default mode, if
somehow firmware doesn't pass one, to 1920x1080 from 720x480 since it is
2017.

MFC after:	2 weeks
2017-12-31 06:10:07 +00:00
Nathan Whitehorn
a891d21aac Make sure the first instruction of the low-memory spinloop is in the
cacheline being invalidated.

MFC after:	1 month
2017-12-31 05:38:19 +00:00
Nathan Whitehorn
3fca788024 Remove logic for early console with loader.ps3 now that loader.ps3 is dead. 2017-12-30 20:25:33 +00:00
Nathan Whitehorn
ba06dbb874 Change the way SMP startup works to match the new multi-AP features in
locore64.S introduced in r327358.

MFC after:	3 weeks
2017-12-30 20:24:33 +00:00
Nathan Whitehorn
f9d6e0a5d0 Enhance the CHRP/pSeries platform layer:
- Densely number CPUs to avoid systems with CPUs with very high ID numbers
- Always have the BSP be CPU 0 to avoid remnant brokenness with non-0 BSPs
  in other parts of the kernel.
- Improve parsing of the device tree CPU listings on SMT systems.
- Allow reboot via RTAS as well as OF for pSeries systems booted by FDT
  without functioning Open Firmware.
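
The first two items above (dense numbering, BSP as CPU 0) can be sketched as follows. The function and its inputs are hypothetical stand-ins for the device tree CPU listing, not the platform code itself.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Assign dense logical IDs to sparse hardware CPU IDs. The boot
 * processor always gets logical ID 0; the others get 1, 2, ... in
 * listing order. Returns 0 on success, -1 if the BSP is not listed.
 */
static int
sketch_number_cpus(const uint32_t *hw_ids, size_t n, uint32_t bsp_hw_id,
    int *logical)
{
	size_t i;
	int next = 1, found = 0;

	for (i = 0; i < n; i++) {
		if (!found && hw_ids[i] == bsp_hw_id) {
			logical[i] = 0;
			found = 1;
		} else
			logical[i] = next++;
	}
	return (found ? 0 : -1);
}
```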

Obtained from:	projects/powernv
MFC after:	3 weeks
2017-12-29 21:09:17 +00:00
Nathan Whitehorn
70f654991a Add support for 64-bit PowerPC kernels to be directly loaded by kexec, which
is used as the bootloader on a number of PPC64 platforms. This involves the
following pieces:
- Making the first instruction a valid kernel entry point, since kexec
  ignores the ELF entry value. This requires a separate section and linker
  magic to prevent the linker from filling the beginning of the section
  with stubs.
- Adding an entry point at 0x60 past the first instruction for systems
  lacking firmware CPU shutdown support (notably PS3).
- Linker script changes to support the above.

MFC after:	1 month
2017-12-29 20:30:10 +00:00
Nathan Whitehorn
8469e0fe35 Maintain alignment of in-code 64-bit quantities by design rather than luck.
If these are not aligned, the linker has to emit a different type of
relocation that the early boot self-relocation code cannot handle, even
in principle, resulting in them being set to zero and the kernel crashing.

MFC after:	1 week
2017-12-29 20:25:15 +00:00
Nathan Whitehorn
2ad331874e Remove ELF note for Open Firmware. It is marked optional in a single 1996
draft of a never-finalized standard (CHRP) and is irrelevant in practice
on FreeBSD since we load the kernel with loader(8) on Open Firmware
platforms anyway. Moreover, loader(8), which is directly loaded by Open
Firmware, has never had an equivalent note.

MFC after:	2 weeks
2017-12-28 23:49:53 +00:00
Eitan Adler
caa7e52f3f kernel: Fix several typos and minor errors
- duplicate words
- typos
- references to old versions of FreeBSD

Reviewed by:	imp, benno
2017-12-27 03:23:21 +00:00
Justin Hibbits
87879ba805 Increase default MAXDSIZ to 32G on powerpc64
Linking LLVM now seems to require more than 1GB data size, so increase the
default to 32G, which matches amd64.

Reviewed by:	nwhitehorn
2017-12-20 16:49:45 +00:00
Nathan Whitehorn
d6716aa2af The highest-order bit of the bootloader cookie is 1, with the result that
the 32-bit cookie can be sign-extended on its way out of the loader and
through Open Firmware. If sign-extended, the in-kernel check of its value
would fail on 64-bit systems, resulting in a mountroot prompt. Solve this
by telling the kernel to ignore the high-order bits.
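
The failure mode and the fix can be illustrated with a short sketch. The cookie value below is hypothetical (chosen only because its top bit is set); the real value is not given in this log.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit cookie with bit 31 set. */
#define SKETCH_COOKIE	0x8badf00dU

/*
 * Compare only the low 32 bits of the register value, so a cookie that
 * was sign-extended to 0xffffffff8badf00d still matches.
 */
static bool
sketch_cookie_matches(uint64_t reg)
{
	return ((uint32_t)reg == SKETCH_COOKIE);
}
```

A naive 64-bit comparison against the cookie would reject the sign-extended form, which is the mountroot-prompt symptom described above.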

PR:		kern/224437
Submitted by:	Gustavo Romero
2017-12-19 16:45:40 +00:00
Konstantin Belousov
30d4f9e888 Add atomic_load(9) and atomic_store(9) operations.
They provide relaxed-ordered atomic access semantics.  Due to the
FreeBSD memory model, the operations are syntactic wrappers around
the volatile accesses.  The volatile qualifier is used to ensure that
the access is not optimized out and in turn depends on the volatile
semantics as implemented by supported compilers.

The motivation for adding the operation is to help people coming from
other systems or knowing the C11/C++ standards where atomics have
special type and require use of the special access operations.  It is
still the case that FreeBSD requires plain load and stores of aligned
integer types to be atomic.
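
A rough sketch of what "wrappers around volatile accesses" means for one integer width. The sketch_ names are hypothetical and this is not the atomic(9) implementation; it only illustrates the idea under the assumption stated above that aligned plain loads and stores are atomic.

```c
#include <stdint.h>

/* Relaxed load: the volatile access keeps the compiler from eliding it. */
static inline uint32_t
sketch_atomic_load_32(volatile uint32_t *p)
{
	return (*p);
}

/* Relaxed store: again just a volatile access, with no ordering implied. */
static inline void
sketch_atomic_store_32(volatile uint32_t *p, uint32_t v)
{
	*p = v;
}
```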

Suggested by:	jhb
Reviewed by:	alc, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D13534
2017-12-19 09:59:20 +00:00
Justin Hibbits
7cd4e55c43 Handle the Facility Unavailable exception as a SIGILL
Currently the Facility Unavailable exception is not handled, so when an
application tries to use or access a register from a feature disabled in
the CPU it causes a kernel panic.

A simple test-case is:

int main() { asm volatile ("tbegin.;"); }

which will use TM (Hardware Transactional Memory) feature which
is not supported by the kernel and so will trigger the following
kernel panic:

----

fatal user trap:

    exception       = 0xf60 (unknown)
    srr0            = 0x10000890
    srr1            = 0x800000000000f032
    lr              = 0x100004e4
    curthread       = 0x5f93000
    pid = 1021, comm = htm

panic: unknown trap
cpuid = 40
KDB: stack backtrace:
Uptime: 3m18s
Dumping 10 MB (3 chunks)
    chunk 0: 11MB (2648 pages) ... ok
    chunk 1: 1MB (24 pages) ... ok
    chunk 2: 1MB (2 pages)panic: IOMMU mapping error: -4

cpuid = 40
Uptime: 3m18s

----

Since Hardware Transactional Memory is not yet supported by FreeBSD, treat
this as an illegal instruction.

PR:		224350
Submitted by:	Gustavo Romero <gromero_AT_ibm_DOT_com>
MFC after:	2 weeks
2017-12-15 04:11:20 +00:00
Justin Hibbits
9ee02cd6f8 Add identifier for POWER9 CPU to CPU list
Without the identifier in the list booting FreeBSD results in printing the
following (from a PowerKVM boot):

cpu0: Unknown PowerPC CPU revision 0x1201, 2550.00 MHz

For now, add the same feature list as POWER8.  As new capabilities are added to
support POWER9 specific features, they will be added to this.

PR:		224344
Submitted by:	Breno Leitao <breno_DOT_leitao_AT_gmail_DOT_com>
2017-12-14 20:01:04 +00:00
Justin Hibbits
bf1b92967f Decode some PowerPC trap registers
Decode on Book-E:
* ESR (Exception Syndrome Register)
* MCSR (Machine Check Status Register)

On AIM:
* MSSSR (Memory Subsystem Status Register)

Makes it easier to tell at a glance the type of trap and machine check
conditions now.
2017-12-12 03:16:10 +00:00
Mark Johnston
5bab623438 Pass the trap frame to fasttrap hooks.
The DTrace fasttrap entry points expect a struct reg containing the
register values of the calling thread. Perform the conversion in
fasttrap rather than in the trap handler: this reduces the number of
ifdefs and avoids wasting stack space for traps that don't involve
DTrace.

MFC after:	2 weeks
2017-12-11 19:21:39 +00:00
Justin Hibbits
713e844971 Retrieve the page outside of holding locks
pmap_track_page() only works with physical memory pages, which have a
constant vm_page_t address.  Microoptimize pmap_track_page() to perform one
less operation under the lock.
2017-12-10 04:43:27 +00:00
Justin Hibbits
94a9d7c3b9 Remove PTE VA mappings for tracked pages in 64-bit mode
This was done in 32-bit mode, but not duplicated when 64-bit mode was
brought in.  Without this, stale mappings can be left, leading to odd
crashes when the wrong VA is checked in XX_PhysToVirt() (dpaa(4)).
2017-12-08 03:49:53 +00:00
Bruce Evans
fb3cc1c37d Move instantiation of msgbufp from 9 MD files to subr_prf.c.
This variable should be pure MI except possibly for reading it in MD
dump routines.  Its initialization was pure MD in 4.4BSD, but FreeBSD
changed this in r36441 in 1998.  There were many imperfections in
r36441.  This commit fixes only a small one, to simplify fixing the
others 1 arch at a time.  (r47678 added support for
special/early/multiple message buffer initialization which I want in
a more general form, but this was too fragile to use because hacking
on the msgbufp global corrupted it, and was only used for 5 hours in
-current...)
2017-12-07 07:55:38 +00:00
Justin Hibbits
89c3a53299 Override memattr for mmap on the Freescale DIU driver
The Display Interface Unit (DIU) uses main memory for the framebuffer, which
is already mapped as cache coherent physical memory.  Prevent mmap() from
using its own attributes which may otherwise conflict.
2017-12-02 01:42:07 +00:00
Pedro F. Giffuni
796df753f4 SPDX: Consider code from Carnegie-Mellon University.
Interesting cases, most likely from CMU Mach sources.
2017-11-30 15:48:35 +00:00
Scott Long
c15269ccb8 It's time to retire AHC_REG_PRETTY_PRINT and AHD_REG_PRETTY_PRINT from
the standard kernels.  They are still available as custom compile
options.
2017-11-29 23:41:49 +00:00
Justin Hibbits
3de971a61a Only check the page tables if within the KVA.
Devices aren't mapped within the KVA, and with the way 64-bit hashes the
addresses pte_vatopa() may not return a 0 physical address for a device.

MFC after:	1 week
2017-11-29 01:26:07 +00:00
Nathan Whitehorn
2bfca5775b Back out OF module installation in the event of failure. PS3 firmware gives
some ancient FDT version (2) that fails the init check in OFW_FDT. It is
still possible to make progress, but not while the OF layer is going crazy.
2017-11-28 06:31:39 +00:00
Pedro F. Giffuni
71e3c3083b sys/powerpc: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.
2017-11-27 15:09:59 +00:00
Justin Hibbits
41eeef87ca Synchronize TLB1 mappings when created
This allows modules creating mappings to be loaded post-boot, after SMP has
started.  Without this, the TLB1 mappings can become unsynchronized and lead
to kernel page faults when accessed on the alternate CPUs.

MFC after:	3 weeks
2017-11-26 20:30:02 +00:00
Nathan Whitehorn
b78f74e564 Remove another extern int n_slbs made redundant by declaring this in
mmu_oea64.h.

MFC after:	3 weeks
2017-11-26 04:34:13 +00:00
Nathan Whitehorn
47f69f4f2b Use the cookie now set by loader to determine whether the value passed to
PowerPC kernels in r6 is actually metadata from loader(8) or gibberish
left in r6, which is not required to be anything under the
PAPR/ePAPR/CHRP/OF standards, by another boot loader.

Note that, as a result, systems need a new boot loader to boot PPC kernels
after this revision without ending up at a mountroot prompt. New boot
loaders are backwards compatible and can boot older kernels.

Reviewed by:	jhibbits
MFC after:	2 months
2017-11-26 03:53:20 +00:00
Nathan Whitehorn
91419bdafd Avoid assumptions about the BSP being CPU 0.
MFC after:	3 weeks
2017-11-25 23:23:24 +00:00
Nathan Whitehorn
8a92c52a84 On AIM systems, it is not actually possible to stop the CPU timer, so we
just set it to a large default value (and inherit any previously existing
value), hoping it never turns over. Instead, silently allow spurious
one-shots from rollovers.

MFC after:	10 days
2017-11-25 22:43:52 +00:00
Nathan Whitehorn
e54979488d Return base IRQ of PIC when added and massively increase the number of
available IRQs per PIC for large systems.

MFC after:	3 weeks
2017-11-25 22:42:05 +00:00
Nathan Whitehorn
50d82d6f6a Missed gate on __powerpc64__ for setting LPCR in r326207.
MFC after:	3 weeks
X-MFC-with:	r326207
2017-11-25 22:15:56 +00:00
Nathan Whitehorn
c0650b2f69 When booting from an FDT, make sure the FDT itself isn't included in the range
of available memory. Boot loaders are supposed to add a reserved entry for
it, but not all do.
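
Excluding a blob from the available-memory list amounts to interval subtraction. The sketch below uses a hypothetical region structure and function name, and assumes half-open ranges that do not wrap around the address space.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_region {
	uint64_t start;
	uint64_t size;
};

/*
 * Remove [ex_start, ex_start + ex_size) from each input region. The
 * output array must have room for 2 * n entries, since one region can
 * be split in two. Returns the number of output regions.
 */
static size_t
sketch_excise(const struct sketch_region *in, size_t n, uint64_t ex_start,
    uint64_t ex_size, struct sketch_region *out)
{
	uint64_t ex_end = ex_start + ex_size;
	size_t i, k = 0;

	for (i = 0; i < n; i++) {
		uint64_t s = in[i].start, e = s + in[i].size;

		if (ex_end <= s || ex_start >= e) {
			out[k++] = in[i];	/* no overlap: keep as is */
			continue;
		}
		if (s < ex_start) {		/* piece below the hole */
			out[k].start = s;
			out[k].size = ex_start - s;
			k++;
		}
		if (ex_end < e) {		/* piece above the hole */
			out[k].start = ex_end;
			out[k].size = e - ex_end;
			k++;
		}
	}
	return (k);
}
```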

MFC after:	2 weeks
2017-11-25 22:14:30 +00:00
Nathan Whitehorn
5bcc3e4277 Allow platform modules to set the size of large pages, as potentially
discovered from firmware, and better handle highly-discontiguous memory
and CPU maps.

MFC after:	3 weeks
2017-11-25 22:13:19 +00:00
Nathan Whitehorn
312fb3d8dd Invalidate TLB at boot using the correct IS settings on newer-than-POWER5
CPUs.

MFC after:	3 weeks
2017-11-25 22:10:10 +00:00
Nathan Whitehorn
d225a2a9c9 Definitions for registers and trap types found on new POWER CPUs.
MFC after:	3 weeks
2017-11-25 22:08:40 +00:00
Nathan Whitehorn
66d6978c27 Missed platform_smp_timebase_sync() in r326205.
MFC after:	3 weeks
X-MFC-With:	r326205
2017-11-25 22:06:40 +00:00
Nathan Whitehorn
5d7c76afc6 Make n_slbs public in a more straightforward way. Some platforms (like
PowerNV) use firmware-assisted mechanisms to discover it and need access
to the variable.

MFC after:	3 weeks
2017-11-25 22:05:05 +00:00
Nathan Whitehorn
cb74659e0c Preserve the LPCR on new-ish (POWER7 and POWER8) CPUs, preventing exceptions
and such from ending on the wrong CPU on SMP systems. It would be good to
have this be more generic somehow as POWER9s appear, but PPC does not
have features bits, unfortunately.

MFC after:	3 weeks
2017-11-25 22:03:25 +00:00
Nathan Whitehorn
f04a8fd6a9 Yield while spinning on APs and avoid announcing all CPUs unless bootverbose
is set. These improve startup performance on massively multithreaded systems
with 8-way SMT and dozens to hundreds of CPUs.

MFC after:	3 weeks
2017-11-25 22:01:55 +00:00
Nathan Whitehorn
de2dd83fb9 Whether you can use mttb() or not is more complicated than whether PSL_HV
is set and the right thing to do may be platform-dependent (it requires
firmware on PowerNV, for instance). Make it a new platform method called
platform_smp_timebase_sync().

MFC after:	3 weeks
2017-11-25 21:59:59 +00:00
Ed Schouten
814629dd64 Don't let cpu_set_syscall_retval() clobber exec_setregs().
Upon successful completion, the execve() system call invokes
exec_setregs() to initialize the registers of the initial thread of the
newly executed process. What is weird is that when execve() returns, it
still goes through the normal system call return path, clobbering the
registers with the system call's return value (td->td_retval).

Though this doesn't seem to be problematic for x86 most of the time (as
the value of eax/rax doesn't matter upon startup), this can be pretty
frustrating for architectures where function argument and return
registers overlap (e.g., ARM). On these systems, exec_setregs() also
needs to initialize td_retval.

Even worse are architectures where cpu_set_syscall_retval() sets
registers to values not derived from td_retval. On these architectures,
there is no way cpu_set_syscall_retval() can set registers to the way it
wants them to be upon the start of execution.

To get rid of this madness, let sys_execve() return EJUSTRETURN. This
will cause cpu_set_syscall_retval() to leave registers intact. This
makes process execution easier to understand. It also eliminates the
difference between execution of the initial process and successive ones.
The initial call to sys_execve() is not performed through a system call
context.
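
The effect of returning EJUSTRETURN can be shown with a toy model of the return path. The frame structure and function below are hypothetical simplifications, not any architecture's real cpu_set_syscall_retval(); the value -2 is the one FreeBSD's sys/errno.h assigns to EJUSTRETURN.

```c
#include <stdint.h>

#define SKETCH_EJUSTRETURN	(-2)

/* Toy register frame: one return register and an error flag. */
struct sketch_frame {
	uint64_t ret_reg;
	int	 err_flag;
};

static void
sketch_set_syscall_retval(struct sketch_frame *f, int error, uint64_t retval)
{
	switch (error) {
	case 0:
		f->ret_reg = retval;
		f->err_flag = 0;
		break;
	case SKETCH_EJUSTRETURN:
		/* Leave the frame exactly as exec_setregs() set it up. */
		break;
	default:
		f->ret_reg = (uint64_t)error;
		f->err_flag = 1;
		break;
	}
}
```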

Reviewed by:	kib, jhibbits
Differential Revision:	https://reviews.freebsd.org/D13180
2017-11-24 07:35:08 +00:00
Justin Hibbits
1ccb14588b Check the page table before TLB1 in pmap_kextract()
The vast majority of pmap_kextract() calls are looking for a physical memory
address, not a device address.  By checking the page table first this saves
the formerly inevitable 64 (on e500mc and derivatives) iteration loop
through TLB1 in the most common cases.
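
The reordering can be sketched with a toy model: a cheap page-table lookup runs first, and the linear TLB1 scan happens only on a miss. All names, the address constants, and the "0 means not found" convention are assumptions for illustration, not the Book-E pmap code.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_KVA_START	0xc0000000ULL
#define SKETCH_KVA_END		0xe0000000ULL
#define SKETCH_PHYS_BASE	0x10000000ULL

struct sketch_tlb1 {
	uint64_t va, pa, size;
};

/* Toy page-table lookup: linear mapping inside the KVA, 0 elsewhere. */
static uint64_t
sketch_pt_lookup(uint64_t va)
{
	if (va >= SKETCH_KVA_START && va < SKETCH_KVA_END)
		return (va - SKETCH_KVA_START + SKETCH_PHYS_BASE);
	return (0);
}

static uint64_t
sketch_kextract(uint64_t va, const struct sketch_tlb1 *tlb1, size_t n)
{
	uint64_t pa;
	size_t i;

	/* Common case first: ordinary memory found via the page table. */
	pa = sketch_pt_lookup(va);
	if (pa != 0)
		return (pa);

	/* Fall back to scanning the fixed device mappings. */
	for (i = 0; i < n; i++) {
		if (va >= tlb1[i].va && va < tlb1[i].va + tlb1[i].size)
			return (tlb1[i].pa + (va - tlb1[i].va));
	}
	return (0);
}
```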

Benchmarking this on the P5020 (e5500 core) yields a 300% throughput
improvement on dtsec(4) (115Mbit/s -> 460Mbit/s) measured with iperf.

Benchmarked on the P1022 (e500v2 core, 16 TLB1 entries) yields a 50%
throughput improvement on tsec(4) (~93Mbit/s -> 165Mbit/s) measured with
iperf.

MFC after:	1 week
Relnotes:	Maybe (significant performance improvement)
2017-11-21 03:12:16 +00:00
Pedro F. Giffuni
51369649b0 sys: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
2017-11-20 19:43:44 +00:00
Pedro F. Giffuni
df57947f08 spdx: initial adoption of licensing ID tags.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.

Initially, only tag files that use BSD 4-Clause "Original" license.

RelNotes:	yes
Differential Revision:	https://reviews.freebsd.org/D13133
2017-11-18 14:26:50 +00:00
Justin Hibbits
bb7137e1a3 Stop special casing 32-bit AIM in memory parsing
There's no need to special case 32-bit AIM to short circuit processing.
Some AIM CPUs can handle 36 bit addresses, and 64-bit CPUs can run 32-bit
OSes, so this will allow us to expand for that in the future if we desire.
2017-11-17 04:10:52 +00:00
Justin Hibbits
27353776c4 Expand the Freescale PCIe root complex driver with the ofw_pcib_pci
The interrupt map wasn't being allocated properly, preventing IRQs from being
allocated to children of the PCIe bus.  Fix this by cloning the ofw_pcib_pci
code, which handles all cases -- device tree and probed.

In the future this may become a subclass of the ofw_pcib_pci driver, but as
that's not an exported class, it's cloned for now.

MFC after:	3 weeks
2017-11-14 03:53:15 +00:00
Justin Hibbits
2d968f6dd4 Properly initialize the full md_page structure 2017-11-10 04:23:58 +00:00
Justin Hibbits
06ba753a7a Book-E pmap_mapdev_attr() improvements
* Check TLB1 in all mapdev cases, in case the memattr matches an existing
  mapping (doesn't need to be MAP_DEFAULT).
* Fix mapping where the starting address is not a multiple of the widest size
  base.  For instance, it will now properly map 0xffffef000, size 0x11000 using
  2 TLB entries, basing it at 0x****f000, instead of 0x***00000.
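
The example in the second item can be checked with a greedy splitter. This sketch assumes entry sizes are powers of 4 starting at 4 KB and capped at 4 GB, and that each entry must be naturally aligned; those limits and the function name are assumptions, not details taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MIN_SIZE	0x1000ULL		/* 4 KB */
#define SKETCH_MAX_SIZE	0x100000000ULL		/* 4 GB, assumed cap */

/*
 * Count the entries needed to cover [pa, pa + size) when each entry is
 * the largest power-of-4 block that is aligned at the current address
 * and still fits in what remains.
 */
static size_t
sketch_tlb1_entries(uint64_t pa, uint64_t size)
{
	size_t n = 0;

	/* Both the base and the length must be 4 KB multiples. */
	assert(((pa | size) & (SKETCH_MIN_SIZE - 1)) == 0);

	while (size > 0) {
		uint64_t sz = SKETCH_MIN_SIZE;

		while (sz < SKETCH_MAX_SIZE && sz * 4 <= size &&
		    (pa & (sz * 4 - 1)) == 0)
			sz *= 4;
		pa += sz;
		size -= sz;
		n++;
	}
	return (n);
}
```

For 0xffffef000 with size 0x11000 this yields a 4 KB entry at 0xffffef000 followed by a 64 KB entry at 0xfffff0000, matching the two entries mentioned above.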

MFC after:	2 weeks
2017-11-10 04:14:48 +00:00
Jeff Roberson
8d6fbbb867 Replace many instances of VM_WAIT with blocking page allocation flags
similar to the kernel memory allocator.

This simplifies NUMA allocation because the domain will be known at wait
time and races between failure and sleeping are eliminated.  This also
reduces boilerplate code and simplifies callers.

A wait primitive is supplied for uma zones for similar reasons.  This
eliminates some non-specific VM_WAIT calls in favor of more explicit
sleeps that may be satisfied without new pages.

Reviewed by:	alc, kib, markj
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
2017-11-08 02:39:37 +00:00
Justin Hibbits
81d7ebb695 Add the ISEL feature macro for those powerpc cores that have it
This is mostly for completeness, we don't currently use it for anything else.
2017-11-08 01:26:44 +00:00
Justin Hibbits
bc3acf82cc Clear the WE bit in C code rather than the asm
According to EREF, rlwinm is supposed to clear the upper 32 bits of the
register on 64-bit cores.  However, from experience it seems there's a bug
in the e5500 which causes the result to be duplicated in the upper bits of
the register.  This causes problems when applied to stashed SRR1 accessed
to retrieve context, as the upper bits are not masked out, so a
set_mcontext() fails.  This causes sigreturn() to in turn return with
EINVAL, causing make(1) to exit with error.

This bit is unused in e500mc derivatives (including e5500), so could just be
conditional on non-powerpc64, but there may be other non-Freescale cores
which do use it.  This is also the same as the POW bit on Book-S, so could
be cleared unconditionally with the only penalty being a few clock cycles
for these two interrupts.
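
Done in C, the clear is a plain 64-bit mask that leaves the upper half of the saved SRR1 untouched. The bit value below is assumed to match PSL_WE and the function name is hypothetical.

```c
#include <stdint.h>

/* Assumed value of the WE (wait state enable) bit in the MSR. */
#define SKETCH_PSL_WE	0x00040000ULL

/* Clear WE in a saved SRR1 without disturbing any other bit. */
static inline uint64_t
sketch_clear_we(uint64_t srr1)
{
	return (srr1 & ~SKETCH_PSL_WE);
}
```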
2017-11-08 01:23:37 +00:00
Justin Hibbits
37f275860c Set the PRD extension list base address in little endian
All data accesses with the SATA controller are little endian.  This was
missed when writing the extension code.
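
A portable way to see what "in little endian" means for a 32-bit field is to write the bytes out explicitly. In FreeBSD kernel code the htole32() macro from sys/endian.h does the conversion; the helper below is a hypothetical stand-alone illustration.

```c
#include <stdint.h>

/* Store a 32-bit value least-significant byte first, on any host. */
static inline void
sketch_store_le32(uint8_t *dst, uint32_t v)
{
	dst[0] = (uint8_t)(v & 0xff);
	dst[1] = (uint8_t)((v >> 8) & 0xff);
	dst[2] = (uint8_t)((v >> 16) & 0xff);
	dst[3] = (uint8_t)((v >> 24) & 0xff);
}
```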
2017-11-06 05:09:18 +00:00
Justin Hibbits
78220c7be8 Fix an off-by-one error missed in the initial commit of this driver
When the segment count is > 16 it spills into an 'indirect descriptor list',
which immediately follows the main table, but the indirect list is entry 15, so
needs to be skipped for the general list.
2017-11-05 22:09:59 +00:00
Justin Hibbits
809cd50ff5 Add Freescale QorIQ SATA controller support.
The Freescale SATA controller has many similarities to AHCI controllers, so
this driver is a heavily modified AHCI driver.  Currently it seems to only
do SATA 1.0 speeds (~100-150MB/s), so there is still room for improvement.

Still to be done:
* Address erratum SATA-A-006187 -- Spread Spectrum Support (intermittent
  non-recoverable transient data integrity error seen when SSC enabled).
* Linux doesn't read the log page as it hangs on the P1022.  See if that's
  applicable to this, and address accordingly.
* Try to determine what's holding back performance, and address it.

MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D6071
2017-11-05 02:47:46 +00:00
Justin Hibbits
51cfee5d13 Stop passing -me500 to the assembler for Book-E kernels
We already pass -many to the assembler, and -me500 drops 64-bit instruction
handling, for some reason only breaking module building for 64-bit kernels.

Additionally, build with CTF for dtrace.
2017-11-04 00:47:21 +00:00
Justin Hibbits
8c6037c4f8 Fix integer type and format in debug print
gcc complains "cast to pointer from integer of different size".  phandle_t is
*always* a uint32_t, so treat it as such, not as a pointer.  Fixes 64-bit build.
2017-11-03 03:13:15 +00:00
Justin Hibbits
140db60323 Enable a bunch more options in the QORIQ64 kernel
This brings it closer to par with GENERIC64.  In the future I hope to have a
GENERIC64-E and GENERIC-E kernels as Book-E analogues to the GENERIC64/GENERIC
AIM kernels.
2017-11-01 03:54:07 +00:00
Justin Hibbits
7561a31ed9 Rename a couple files to not conflict with ZFS filenames
Now a kernel can be built with both ZFS and DPAA compiled in.
2017-11-01 03:09:16 +00:00
Justin Hibbits
8ccebb4435 Add Guest State (GS) bit to MSR bits
For completeness only.  It will be used by a hypervisor if/when one is written.
While here, sort the MSR bits into the proper categories.
2017-11-01 02:54:48 +00:00
Justin Hibbits
61b9e7ef6a Fix debug interrupts on 64-bit Book-E
Use a WORD_SIZE macro to define the correct offset to the second word
needed.  This corrects the offset calculation in 64-bit builds.
2017-11-01 02:40:15 +00:00
Justin Hibbits
a32b54357f Make DPAA work in 64-bit mode
Rework the dTSEC and FMan drivers to be more like a full bus relationship,
so that dtsec can use bus_alloc_resource() instead of trying to handle the
offset from the dts.  This required taking some code from the sparc64 ebus
driver to allow subdividing the fman region for the dTSEC devices.
2017-10-31 02:53:50 +00:00
Justin Hibbits
852ba10081 Update DPAA SDK to SDK 2.0
This adds some support for ARM as well as 64-bit.  64-bit on PowerPC is
currently not working, and ARM support has not been completed or tested on the
FreeBSD side.

As this was imported from a Linux tree, it includes some Linux-isms
(ioread/iowrite), so compile with the LinuxKPI for now.  This may change in the
future.
2017-10-30 03:41:04 +00:00
Justin Hibbits
f6bd9666a5 Add P5010/P5010E for completeness 2017-10-30 01:55:38 +00:00
Eitan Adler
a2aef24aa3 Update several more URLs
- Primarily http -> https
- Primarily FreeBSD project URLs
2017-10-29 08:17:03 +00:00
Michal Meloun
904d8c492f Add AT_HWCAP2 ELF auxiliary vector.
- Allocate a value for the new AT_HWCAP2 auxiliary vector on all platforms.
- Expand 'struct sysentvec' with a new 'u_long *sv_hwcap2' field, in exactly
  the same way as for AT_HWCAP.

MFC after:	1 month
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D12699
2017-10-21 12:05:01 +00:00
Bjoern A. Zeeb
8e94025b41 With r181803 on 2008-08-17 23:27:27Z the first VIMAGE commit went into
HEAD.  Enable VIMAGE in GENERIC kernels and some others (where GENERIC does
not exist) on HEAD.

Disable building LINT-VIMAGE with VIMAGE being default.

This should give it a lot more exposure in the run-up to 12 to help
us evaluate whether to keep it on by default or not.
We are also hoping to get better performance testing.
The feature can be disabled using nooptions.

Requested by:		many
Reviewed by:		kristof, emaste, hiren
X-MFC after:		never
Relnotes:		yes
Differential Revision:	https://reviews.freebsd.org/D12639
2017-10-20 21:40:59 +00:00
Justin Hibbits
d41742b585 Expand the TLB nest level mask to 3 bits to match the 32-bit mask
This really doesn't change anything right now, because BOOKE_TLB_MAXNEST is only
3, which fits into the 2 bits currently used.
2017-10-20 03:31:23 +00:00
Justin Hibbits
95ce4c00ec No need to check for AIM here
This block is already in a #ifdef AIM block.
2017-10-20 03:13:31 +00:00
Justin Hibbits
15f9620e36 Book-E debug trace fixes
* Book-E can have Altivec exceptions, so move it out of the AIM-only block.
* Print the right DSI trap mode (read vs write) for Book-E

While here, fix some whitespace found while reviewing other diffs.
2017-10-20 03:03:04 +00:00
Justin Hibbits
12accff186 Add some more devices to the MPC85XX-based configs
These devices bring the configs closer to a desktop-like (GENERIC) kernel
config.
* The Freescale DIU support was added to the config in r306358.
  Without keyboard support video support is nearly pointless, so add ukbd and
  ums.
* The AmigaOne X5000, and P1022 devboard, both use a variant of the ds1307 RTC
* cpufreq scaling is currently supported by the p1022.  More SoCs will be added
  eventually.
2017-10-19 03:38:53 +00:00
Justin Hibbits
5ff24e4eb4 Remove some unnecessary includes 2017-10-19 02:14:39 +00:00
Wojciech Macek
10b980e75b PPC: increase MAX_PICS to 32
Previous value was too low on dual-socket POWER8 system.

Submitted by:          Wojciech Macek <wma@freebsd.org>
Reviewed by:           nwhitehorn
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
Differential revision: https://reviews.freebsd.org/D12540
2017-10-02 06:05:19 +00:00
Josh Paetzel
c77037f16f Fix indentation for r323068
PR:	220170
Reported by:	lidl
MFC after:	3 days
Pointyhat to:	jpaetzel
2017-09-19 20:40:05 +00:00
Justin Hibbits
d11e86549e Don't use a non-zero argument for __builtin_frame_address
__builtin_frame_address with a non-zero argument is unsafe and rejected by
newer gcc.  Since it doesn't seem to impact the stacktrace, don't bother
with gymnastics to unwind to a different frame for starting.

PR:		kern/220118
MFC after:	2 weeks
2017-09-17 20:07:20 +00:00
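For reference, a user-space sketch of the safe form of the builtin, assuming GCC or Clang. `demo_current_frame()` is a hypothetical helper, not the kernel's stack-trace code.

```c
#include <stdint.h>

/*
 * Return the frame address of this function itself.  Level 0 is the only
 * argument the compiler documentation describes as safe; higher levels
 * walk the caller chain and can have unpredictable effects.
 */
__attribute__((noinline)) uintptr_t
demo_current_frame(void)
{
	return ((uintptr_t)__builtin_frame_address(0));
}
```

A stack walker would typically start from this frame and follow saved back-chain pointers itself rather than asking the compiler for an outer frame.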
Justin Hibbits
6b7530563b Print the correct bitmask for the running Book-E CPU
Not all of the Book-E world is e500v{1,2} anymore.  e500mc and the 64-bit derivatives do
not use the DOZE/NAP bits with MSR[WE], instead using the `wait' instruction to
wait for interrupts, and SoC plane controls (via CCSR) for power management.

MFC after:	1 week
2017-09-17 19:40:17 +00:00
Mark Johnston
b999e9c813 Implement mmu_page_init for AIM platforms.
As of r323290 we cannot rely on the vm_page array being
zero-initialized.

Reported and tested by:	andreast
MFC after:	1 week
2017-09-17 15:40:12 +00:00
John Baldwin
c2f37b9245 Add AT_HWCAP and AT_EHDRFLAGS on all platforms.
A new 'u_long *sv_hwcap' field is added to 'struct sysentvec'.  A
process ABI can set this field to point to a value holding a mask of
architecture-specific CPU feature flags.  If an ABI does not wish to
supply AT_HWCAP to processes the field can be left as NULL.

The support code for AT_EHDRFLAGS was already present on all systems,
just the #define was not present.  This is a step towards unifying the
AT_* constants across platforms.

Reviewed by:	kib
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D12290
2017-09-14 14:26:55 +00:00
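The NULL convention described above can be sketched as follows. The structures, the `DEMO_AT_HWCAP` tag value, and `demo_emit_hwcap()` are hypothetical stand-ins; the real `struct sysentvec` and the kernel's auxiliary vector code differ.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real struct sysentvec has many more fields. */
struct demo_sysentvec {
	unsigned long *sv_hwcap;	/* NULL: ABI supplies no AT_HWCAP */
};

struct demo_auxent {
	int		type;
	unsigned long	val;
};

#define DEMO_AT_HWCAP	100	/* placeholder tag, not the real AT_HWCAP value */

/* Write one aux entry if the ABI provides a mask; return entries written. */
int
demo_emit_hwcap(const struct demo_sysentvec *sv, struct demo_auxent *out)
{
	if (sv->sv_hwcap == NULL)
		return (0);
	out->type = DEMO_AT_HWCAP;
	out->val = *sv->sv_hwcap;
	return (1);
}
```

Storing a pointer rather than a value lets an ABI point at a mask that is filled in later, for example after CPU feature detection.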
Mateusz Guzik
4dabeda46c Fix riscv and powerpc compilation after r323329.
On these archs bzero is a C function, which triggers a compilation error
as the compiler tries to expand the macro.
2017-09-09 05:56:04 +00:00
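The failure mode can be reproduced in user space. In this sketch `demo_bzero` is a hypothetical name standing in for the kernel's bzero. Parenthesizing the name in the definition is one standard C way to keep a function-like macro from expanding there; it is not necessarily the fix this commit applied.

```c
#include <stddef.h>
#include <string.h>

/* Function-like macro, as on architectures where the call is a builtin. */
#define demo_bzero(buf, len)	memset((buf), 0, (len))

/*
 * Writing "void demo_bzero(void *buf, size_t len)" here would be expanded
 * by the macro above and fail to compile.  A name in parentheses is not
 * followed by '(' so the preprocessor leaves it alone.
 */
void
(demo_bzero)(void *buf, size_t len)
{
	unsigned char *p = buf;

	while (len-- > 0)
		*p++ = 0;
}
```

Callers that write `demo_bzero(p, n)` still get the macro, while `(demo_bzero)(p, n)` reaches the C function.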
Justin Hibbits
c5fea8adf0 Add P5021 and P5040 conditions for LAW count check.
P5040/P5021 have the same number of LAWs as P5020.  There may be a better way of
getting the count from the FDT (fsl,num-laws property on soc/corenet-law or
soc/ecm-law), but that's not supported everywhere, so we still need this check
for those other cases.
2017-09-09 02:19:44 +00:00
Justin Hibbits
dc72081153 Add some more PVR and SVR defines
These processors may not be supported yet, but add them for completeness.

POWER9 is planned for support.  e300 may work (based on 603e core).
P5040/P5021 are similar to P5020, so should work as well.  One addition is
needed for P5040, to support the number of LAWs, and will be a separate commit.
2017-09-09 02:08:22 +00:00
Josh Paetzel
9d0ec2a920 Revert r323087
This needs more thinking out and consensus, and the commit message
was wrong AND there was a typo in the commit.

pointyhat:	jpaetzel
2017-09-01 17:03:48 +00:00
Josh Paetzel
0be04b100c Take options IPSEC out of GENERIC
PR:	220170
Submitted by:	delphij
Reviewed by:	ae, glebius
MFC after:	2 weeks
Differential Revision:	D11806
2017-09-01 15:54:53 +00:00
Josh Paetzel
3b65550eec Allow kldload tcpmd5
PR:	220170
MFC after:	2 weeks
2017-08-31 20:16:28 +00:00
Bruce Evans
7692d200c1 Use better hard-coded defaults for the cursor shape, and remove nearby
redundant initializations.

Hard-code base = 0, height = (approx. 1/8 of the boot-time font height)
in all cases, and remove the BIOS/MD support for setting these values.
This asks for an underline cursor sized for the boot-time font instead
of various less hard-coded but worse values.  I used to think that
the x86 BIOS always gave the same values as the above hard-coding, but
on 1 of my systems it gives the wrong value of base = 1.

The remaining BIOS fields are shift_state and bell_pitch.  These are now
consistently not explicitly reinitialized to 0.  All sc_get_bios_value()
functions except x86's are now empty, and the only useful thing that x86
returns is shift_state.  This really belongs in atkbdc, but heavier
use of the BIOS to read the more useful typematic rate has been removed
there.  fb still makes much heavier use of the BIOS.
2017-08-19 19:33:16 +00:00
Justin Hibbits
452adeee95 Add cpufreq support for P1022 and MPC8536
P1022 and MPC8536 include a 'jog' feature for clock control
(jog being a slower form of run mode).  This is done by changing the
PLL multiplier, and cannot be done if any core is in doze or sleep mode.
2017-07-21 03:40:05 +00:00
Justin Hibbits
da7266dd6b Remove an obsolete comment
This has been wrong for well over a year; we support the full 36-bit
(or more) PA space.
2017-07-05 02:20:03 +00:00
Jason A. Harmening
eb36b1d0bc Clean up MD pollution of bus_dma.h:
--Remove special-case handling of sparc64 bus_dmamap* functions.
  Replace with a more generic mechanism that allows MD busdma
  implementations to generate inline mapping functions by
  defining WANT_INLINE_DMAMAP in <machine/bus_dma.h>.  This
  is currently useful for sparc64, x86, and arm64, which all
  implement non-load dmamap operations as simple wrappers
  around map objects which may be bus- or device-specific.

--Remove NULL-checked bus_dmamap macros.  Implement the
  equivalent NULL checks in the inlined x86 implementation.
  For non-x86 platforms, these checks are a minor pessimization
  as those platforms do not currently allow NULL maps.  NULL
  maps were originally allowed on arm64, which appears to have
  been the motivation behind adding arm[64]-specific barriers
  to bus_dma.h, but that support was removed in r299463.

--Simplify the internal interface used by the bus_dmamap_load*
  variants and move it to bus_dma_internal.h

--Fix some drivers that directly include sys/bus_dma.h
  despite the recommendations of bus_dma(9)

Reviewed by:	kib (previous revision), marius
Differential Revision:	https://reviews.freebsd.org/D10729
2017-07-01 05:35:29 +00:00
Justin Hibbits
d7fd731d06 Use the more common Book-E idiom for disabling interrupts.
Book-E has the wrteei/wrtee instructions for writing the PSL_EE bit, ignoring
all others.  Use this instead of the AIM-typical mtmsr.

MFC with:	r320392
2017-06-30 02:11:32 +00:00
Justin Hibbits
3d1357108a Disable interrupts when updating the TLB
Without disabling interrupts it's possible for another thread to preempt
and update the registers post-read (tlb1_read_entry) or pre-write
(tlb1_write_entry), and confuse the kernel with mixed register states.

MFC after:	2 weeks
2017-06-27 01:57:22 +00:00
Justin Hibbits
fbcf7bcdf4 Solve the y2038 problem for powerpc
AKA Make time_t 64 bits on powerpc(32).

Until now, PowerPC was one of two architectures with a 32-bit time_t
on 32-bit archs (the other being i386).  This is an ABI breakage, so all ports,
and all local binaries, *must* be recompiled.

Tested by:	andreast, others
MFC after:	Never
Relnotes:	Yes
2017-06-26 02:25:19 +00:00
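As a reminder of the limit being removed: a signed 32-bit second count tops out at 2^31 - 1, which corresponds to 2038-01-19 03:14:07 UTC. The helper below is a hypothetical illustration, not code from the commit.

```c
#include <stdint.h>

/* True if a 64-bit second count survives a round trip through 32 bits. */
int
demo_fits_time32(int64_t secs)
{
	return (secs >= INT32_MIN && secs <= INT32_MAX);
}
```

With a 64-bit time_t the check becomes unnecessary, at the cost of the ABI break noted in the commit message.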
Justin Hibbits
37ea599bf7 Actually add the mpc85xx_get_platform_clock() function.
Follow up r319935 by actually committing the mpc85xx_get_platform_clock()
function.  This function was created to facilitate other development, and I
thought I had committed it earlier.

Some blocks depend on the platform clock rather than the system clock.
The system clock is derived from the platform clock as one-half the
platform clock.  Rewrite mpc85xx_get_system_clock() to use the new
function.

Pointy-hat to:	jhibbits
2017-06-14 04:26:37 +00:00
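The relationship between the two clocks, as a sketch. `demo_system_clock()` is a hypothetical stand-in that takes the platform clock as a parameter; the real functions read it from the platform.

```c
#include <stdint.h>

/* Hypothetical helper: derive the system clock from the platform clock. */
uint32_t
demo_system_clock(uint32_t platform_hz)
{
	return (platform_hz / 2);
}
```

Keeping the division in one place means users of either clock agree on the ratio.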
Justin Hibbits
3c804fef82 Use mpc85xx_get_platform_clock() instead of rolling our own.
Now that we have a single source for the platform clock, we don't need to
roll our own in every user.
2017-06-14 04:16:37 +00:00
Konstantin Belousov
2d88da2f06 Move struct syscall_args syscall arguments parameters container into
struct thread.

For all architectures, the syscall trap handlers have to allocate the
structure on the stack.  The structure takes 88 bytes on 64bit arches
which is not negligible.  Also, it cannot be easily found by other
code, which e.g. caused duplication of some members of the structure
to struct thread already.  The change removes td_dbg_sc_code and
td_dbg_sc_nargs which were directly copied from syscall_args.

The structure is put into the copied on fork part of the struct thread
to make the syscall arguments information correct in the child after
fork.

This move will also allow several more uses shortly.

Reviewed by:	jhb (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
X-Differential revision:	https://reviews.freebsd.org/D11080
2017-06-12 21:03:23 +00:00
Konstantin Belousov
43f41dd393 Make struct syscall_args visible to userspace compilation environment
from machine/proc.h, consistently on all architectures.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
X-Differential revision:	https://reviews.freebsd.org/D11080
2017-06-12 20:53:44 +00:00
John Baldwin
5033c43b7a Add a driver for the Chelsio T6 crypto accelerator engine.
The ccr(4) driver supports use of the crypto accelerator engine on
Chelsio T6 NICs in "lookaside" mode via the opencrypto framework.

Currently, the driver supports AES-CBC, AES-CTR, AES-GCM, and AES-XTS
cipher algorithms as well as the SHA1-HMAC, SHA2-256-HMAC, SHA2-384-HMAC,
and SHA2-512-HMAC authentication algorithms.  The driver also supports
chaining one of AES-CBC, AES-CTR, or AES-XTS with an authentication
algorithm for encrypt-then-authenticate operations.

Note that this driver is still under active development and testing and
may not yet be ready for production use.  It does pass the tests in
tests/sys/opencrypto with the exception that the AES-GCM implementation
in the driver does not yet support requests with a zero byte payload.

To use this driver currently, the "uwire" configuration must be used
along with explicitly enabling support for lookaside crypto capabilities
in the cxgbe(4) driver.  These can be done by setting the following
tunables before loading the cxgbe(4) driver:

    hw.cxgbe.config_file=uwire
    hw.cxgbe.cryptocaps_allowed=-1

MFC after:	1 month
Relnotes:	yes
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D10763
2017-05-17 22:13:07 +00:00
Justin Hibbits
611aec2545 Correct pa argument type for pmap_kenter_attr()
Physical addresses are vm_paddr_t, not vm_offset_t.  This can make a difference
when sizeof(vm_offset_t) != sizeof(vm_paddr_t).
2017-05-16 03:31:49 +00:00
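A sketch of the truncation this guards against, assuming a 32-bit virtual address type and a wider physical address type. The typedefs and function names are hypothetical, not the pmap interface.

```c
#include <stdint.h>

/* Hypothetical 32-bit-kernel types: virtual offsets narrower than PAs. */
typedef uint32_t demo_vm_offset_t;
typedef uint64_t demo_vm_paddr_t;

/* Old-style prototype: the physical address is squeezed through 32 bits. */
demo_vm_paddr_t
demo_kenter_narrow(demo_vm_offset_t pa)
{
	return (pa);
}

/* Corrected prototype: the full physical address reaches the callee. */
demo_vm_paddr_t
demo_kenter_wide(demo_vm_paddr_t pa)
{
	return (pa);
}
```

With these types, a 36-bit address such as 0xF00001000 arrives as 0x00001000 through the narrow prototype and intact through the wide one.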
Justin Hibbits
675cad71e7 Fix stack tracing in dtrace for powerpc
The current method only sort of works, and usually doesn't work reliably.
Also, on Book-E the return address from DEBUG exceptions is not one of the
sentinel addresses, so the loop won't exit correctly.

Fix this by better handling trap frames during unwinding, and using the
common trap handler for debug traps, as the code in that segment is
identical between the two.

MFC after:	1 week
2017-05-11 00:23:51 +00:00
Gleb Smirnoff
83c9dea1ba - Remove 'struct vmmeter' from 'struct pcpu', leaving only global vmmeter
in place.  To do per-cpu stats, convert all fields that previously were
  maintained in the vmmeters that sit in pcpus to counter(9).
- Since some vmmeter stats may be touched at very early stages of boot,
  before we have set up UMA and we can do counter_u64_alloc(), provide an
  early counter mechanism:
  o Leave one spare uint64_t in struct pcpu, named pc_early_dummy_counter.
  o Point counter(9) fields of vmmeter to pcpu[0].pc_early_dummy_counter,
    so that at early stages of boot, before counters are allocated we already
    point to a counter that can be safely written to.
  o For sparc64 that required a whole dummy pcpu[MAXCPU] array.

Further related changes:
- Don't include vmmeter.h into pcpu.h.
- vm.stats.vm.v_swappgsout and vm.stats.vm.v_swappgsin changed to 64-bit,
  to match kernel representation.
- struct vmmeter hidden under _KERNEL, and only vmstat(1) is an exception.

This is based on benno@'s 4-year old patch:
https://lists.freebsd.org/pipermail/freebsd-arch/2013-July/014471.html

Reviewed by:	kib, gallatin, marius, lidl
Differential Revision:	https://reviews.freebsd.org/D10156
2017-04-17 17:34:47 +00:00
Gleb Smirnoff
9ed01c32e0 All these files need sys/vmmeter.h, but until now they got it implicitly
included via sys/pcpu.h.
2017-04-17 17:07:00 +00:00
Patrick Kelsey
67d955aab4 Corrected misspelled versions of rendezvous.
The MFC will include a compat definition of smp_no_rendevous_barrier()
that calls smp_no_rendezvous_barrier().

Reviewed by:	gnn, kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D10313
2017-04-09 02:00:03 +00:00
Justin Hibbits
d139c624a9 Add Freescale eSPI driver found on QorIQ SoCs 2017-04-02 01:21:35 +00:00
Justin Hibbits
bba2d2bd51 Add a helper function to get system reference clock
Many devices are clocked from the SoC's platform clock / 2.  Some device nodes
include their own clock-frequency property, while others are dependent on the
SoC's bus-frequency property instead.  To simplify, add a helper function to get
this clock.
2017-04-01 22:29:11 +00:00
Bruce Evans
f434f3515b Fix printing of negative offsets (typically from frame pointers) again.
I fixed this in 1997, but the fix was over-engineered and fragile and
was broken in 2003 if not before.  i386 parameters were copied to 8
other arches verbatim, mostly after they stopped working on i386, and
mostly without the large comment saying how the values were chosen on
i386.  powerpc has a non-verbatim copy which just changes the uncritical
parameter and seems to add a sign extension bug to it.

Just treat negative offsets as offsets if they are no more negative than
-db_offset_max (default -64K), and remove all the broken parameters.

-64K is not very negative, but it is enough for frame and stack pointer
offsets since kernel stacks are small.

The over-engineering was mainly to go more negative than -64K for the
negative offset format, without affecting printing for more than a
single address.

Addresses in the top 64K of a (full 32-bit or 64-bit) address space
are now printed less well, but there aren't many interesting ones.
For arches that have many interesting ones very near the top (e.g.,
68k has interrupt vectors there), there would be no good limit for
the negative offset format and -64K is as good as anything.
2017-03-26 18:46:35 +00:00
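The rule described above reduces to a single range check. This is a sketch with hypothetical names (`DEMO_OFFSET_MAX`, `demo_is_negative_offset()`), not the ddb source.

```c
#include <stdint.h>

#define DEMO_OFFSET_MAX	0x10000	/* 64K, the default quoted above */

/* True if value should be shown as a small negative offset like "-0x10". */
int
demo_is_negative_offset(intptr_t value)
{
	return (value < 0 && value >= -(intptr_t)DEMO_OFFSET_MAX);
}
```

Anything more negative than -64K falls outside the range and is treated as an ordinary address.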
Justin Hibbits
457797001e Don't bother checking core version
We already constrain by SoC, so there's no need to check the core version, too.
2017-03-24 01:52:10 +00:00
Justin Hibbits
52f0686952 Switch qoriq_gpio over to using ofw_bus_search_compatible
This will make it easier to add more compatibility strings in the future, if
necessary.
2017-03-24 01:30:18 +00:00
Justin Hibbits
e683c328f8 Introduce 64-bit PowerPC Book-E support
Extend the Book-E pmap to support 64-bit operation.  Much of this was taken from
Juniper's Junos FreeBSD port.  It uses a 3-level page table (page directory
list -- PP2D, page directory, page table), but has gaps in the page directory
list where regions will repeat, due to the design of the PP2D hash (a 20-bit gap
between the two parts of the index).  In practice this may not be a problem
given the expanded address space.  However, an alternative to this would be to
use a 4-level page table, like Linux, and possibly reduce the available address
space; Linux appears to use a 46-bit address space.  Alternatively, a cache of
page directory pointers could be used to keep the overall design as-is, but
remove the gaps in the address space.

This includes a new kernel config for 64-bit QorIQ SoCs, based on MPC85XX, with
the following notes:
* The DPAA driver has not yet been ported to 64-bit so is not included in the
  kernel config.
* This has been tested on the AmigaOne X5000, using a MD_ROOT compiled in
  (total size kernel+mdroot must be under 64MB).
* This can run both 32-bit and 64-bit processes, and has even been tested to run
  a 32-bit init with 64-bit children.

Many thanks to stevek and marcel for getting Juniper's FreeBSD patches open
sourced to be used here, and to stevek for reviewing, and providing some
historical context on quirks of the code.

Reviewed by:	stevek
Obtained from:	Juniper (in part)
MFC after:	2 months
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D9433
2017-03-17 21:40:14 +00:00
Justin Hibbits
62c6b30e5c Fix booting with >4GB RAM on PowerMac G5 hardware
===
From Nathan Whitehorn:

Open Firmware runs in virtual mode on the Powermac G5. This runs inside the
kernel page table, which preserves all address translations made by OF before
the kernel starts; as a result, the kernel address space is a strict superset of
OF's.

Where this explodes is if OF uses an unmapped SLB entry. The SLB fault handler
runs in real mode and refers to the PCPU pointer in SPRG0, which blows up the
kernel. Having a value of SPRG0 that works for the kernel is less fatal than
preserving OF's value in this case.

===

The result of this is seemingly random panics from NULL dereferences, or hangs
immediately upon boot.  By not restoring SPRG0 for Open Firmware entry the
kernel PCPU pointer is preserved and SLB faults are successful, resulting in a
stable kernel.

PR:		205458
Reported by:	several (over bugzilla, lists, IRC)
Reviewed by:	andreast
Tested by:	many (various forms)
MFC after:	2 weeks
2017-03-07 22:11:57 +00:00
Warner Losh
fbbd9655e5 Renumber copyright clause 4
Renumber clause 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistence on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.

Submitted by:	Jan Schaumann <jschauma@stevens.edu>
Pull Request:	https://github.com/freebsd/freebsd/pull/96
2017-02-28 23:42:47 +00:00
Justin Hibbits
15fc4ab7fc Make kernel breakpoints work for book-e
Add the necessary bits to enable kernel breakpoints for Book-E.  The entrypoint
for program exception is very trivial, so rather than expand it to be similar to
AIM, add it into the standard trap handler.

This wasn't blocked out as Book-E specific because it is only a minor redundancy
over AIM, which should have already called db_trap_glue() at this point.  If
it's going to panic with a fatal trap anyway, it doesn't matter if it goes
through this path again.
2017-02-28 04:31:28 +00:00
Justin Hibbits
b3ae819e0a Unbreak kernel breakpoints, broken for ~4 years now
When committing DTrace in the 2012/2013 era I inadvertently broke breakpoints, by
setting EXC_DTRACE to the same value as BKPT_INST.  Change EXC_DTRACE to a
different, yet logically identical, trap (tw <all>,31,31).

MFC after:	2 weeks
2017-02-28 04:13:20 +00:00
Ruslan Bukin
c214a270f5 Allow setting access-width for UART registers.
This is required for FDT's standard "reg-io-width" property
(similar to "reg-shift" property) found in many DTS files.

This fixes operation on Altera Arria 10 SOC Development Kit,
where standard ns8250 uart allows 4-byte access only.

Reviewed by:	kan, marcel
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D9785
2017-02-27 20:08:42 +00:00
Warner Losh
55157631f1 Include pcib_private.h for prototypes.
Noticed by: rpokala@
Sponsored by: Netflix
2017-02-26 21:33:18 +00:00
Warner Losh
28586889c2 Convert PCIe Hot Plug to using pci_request_feature
Convert PCIe hot plug support over to asking the firmware, if any, for
permission to use the HotPlug hardware. Implement pci_request_feature
for ACPI. All other host PCI connections default to allowing all valid
feature requests.

Sponsored by: Netflix
2017-02-25 06:11:59 +00:00
Marius Strobl
4874af73c1 - Allow different slicers for different flash types to be registered
with geom_flashmap(4) and teach it about MMC for slicing enhanced
  user data area partitions. The FDT slicer still is the default for
  CFI, NAND and SPI flash on FDT-enabled platforms.
- In addition to a device_t, also pass the name of the GEOM provider
  in question to the slicers as a single device may provide more than
  one provider.
- Build a geom_flashmap.ko.
- Use MODULE_VERSION() so other modules can depend on geom_flashmap(4).
- Remove redundant/superfluous GEOM routines that either do nothing
  or provide/just call default GEOM (slice) functionality.
- Trim/adjust includes

Submitted by:	jhibbits (RouterBoard bits)
Reviewed by:	jhibbits
2017-02-22 10:21:39 +00:00
Justin Hibbits
27da2007da Correct the return value for pmap_change_attr()
pmap_change_attr() returns an error code, not a paddr.  This function is
currently unused for powerpc.

MFC after:	2 weeks
2017-02-21 05:08:07 +00:00
Justin Hibbits
d9720179fd Add a driver for the RouterBoard RB800 User LED
This may work on other RouterBoard PPC platforms, but I don't have any to test
with.
2017-02-19 19:56:12 +00:00
Jason A. Harmening
e2a8d17887 Bring back r313037, with fixes for mips:
Implement get_pcpu() for amd64/sparc64/mips/powerpc, and use it to
replace pcpu_find(curcpu) in MI code.

Reviewed by:	andreast, kan, lidl
Tested by:	lidl(mips, sparc64), andreast(powerpc)
Differential Revision:	https://reviews.freebsd.org/D9587
2017-02-19 02:03:09 +00:00
Konstantin Belousov
9fb10d635e Define the vm_ooffset_t and vm_pindex_t types as machine-independent.
The types are for the byte offset and page index in vm object.  They
are similar to off_t, which is defined as 64bit MI integer.  Using MI
definitions will allow providing consistent MD values of vm
object-related maximum sizes.

Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-02-04 12:26:38 +00:00
Jason A. Harmening
ad62ba6e96 Revert r313037
The switch to get_pcpu() in MI code seems to cause hangs on MIPS.
Back out until we can get a better idea of what's happening there.

Reported by:	kan, lidl
2017-02-04 06:24:49 +00:00
Jason A. Harmening
65ed483615 Implement get_pcpu() for the remaining architectures and use it to
replace pcpu_find(curcpu) in MI code.
2017-02-01 03:32:49 +00:00