Commit Graph

3275 Commits

Author SHA1 Message Date
Justin Hibbits
07b57507c9 powerpc64/pmap: No need for moea64_pvo_remove_from_page_locked() wrapper
The only consumer of moea64_pvo_remove_from_page_locked() already has the
page in hand, so there is no need to search for the page while holding the
lock.  Drop the wrapper, and rename _moea64_pvo_remove_from_page_locked().

Reported by:	alc
2019-07-13 03:39:46 +00:00
Justin Hibbits
a7e6ec601a powerpc64/pmap: Reduce scope of PV_LOCK in remove path
Summary:
Since the 'page pv' lock is one of the most highly contended locks, we
need to try to do as much work outside of the lock as we can.  The
moea64_pvo_remove_from_page() path is a low hanging fruit, where we can
do some heavy work (PHYS_TO_VM_PAGE()) outside of the lock if needed.
In one path, moea64_remove_all(), the PV lock is already held and can't
be swizzled, so we provide two ways to perform the locked operation, one
that can call PHYS_TO_VM_PAGE outside the lock, and one that calls with
the lock already held.

Reviewed By: luporl
Differential Revision: https://reviews.freebsd.org/D20694
2019-07-13 03:02:11 +00:00
Justin Hibbits
21e47be25e Set pcpu curpmap for powerpc64
Summary:
If an illegal instruction is encountered on a process running on a
powerpc64 kernel it would attempt to sync the cache before retrying the
instruction "just in case".  However, since curpmap is not set, when
moea64_sync_icache() attempts to lock the pmap, it's locking on a NULL pointer,
triggering a panic.  Fix this by adding a (assumed unnecessary) fallback to
curthread's pmap in moea64_sync_icache().

Reported by:	alfredo.junior_eldorado.org.br
Reviewed by:	luporl, alfredo.junior_eldorado.org.br
Differential Revision: https://reviews.freebsd.org/D20911
2019-07-13 00:19:57 +00:00
Konstantin Belousov
30b3018d48 Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely.  Otherwise, rogue userspace livelocks the
kernel.

To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison.  The primitive no longer retries
the operation if it failed spuriously.  Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.

On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.

Reviewed by:	andrew (arm64, previous version), markj
Tested by:	pho
Reported by:	https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
Justin Hibbits
9ac516a6f1 powerpc: Only worry about the lower 32 bits of SP in a 32-bit process
Summary:
Running a 32-bit process on a 64-bit POWER CPU may still use all 64-bits
in calculations, while ignoring the upper 32 bits for addressing
storage.  It so happens that some processes end up with r1 (SP) having
bit 31 set in some cases (33-bit address).  Writing out to this 33-bit
address obviosly fails.  Since the CPU ignores the upper bits, we should
as well.

sendsig() and cpu_fetch_syscall_args() appear to be the only functions
that actually rely on userspace register values for copy in/out, and
cpu_fetch_syscall_args() doesn't seem to be bitten in practice yet.

Reviewed By: luporl
Differential Revision: https://reviews.freebsd.org/D20896
2019-07-11 03:29:25 +00:00
Leandro Lupori
8b55f9f853 [PPC64] pseries: fix realmaxaddr calculation
On POWER9/pseries, QEMU passes several regions of memory,
instead of a single region containing all memory, as the
code was expecting.

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D20857
2019-07-10 13:36:17 +00:00
Justin Hibbits
b51bfc30ac powerpc: Clamp 32-bit binaries to 32-bit MAXUSER
sv_maxuser specifies the maximum addressable space for user space.  Presently
this is all 64-bits worth, which is impossible for a 32-bit process.

This bug has existed since the initial import of powerpc64 in 2010.

MFC after:	2 weeks
2019-07-10 04:09:15 +00:00
Mark Johnston
eeacb3b02f Merge the vm_page hold and wire mechanisms.
The hold_count and wire_count fields of struct vm_page are separate
reference counters with similar semantics.  The remaining essential
differences are that holds are not counted as a reference with respect
to LRU, and holds have an implicit free-on-last unhold semantic whereas
vm_page_unwire() callers must explicitly determine whether to free the
page once the last reference to the page is released.

This change removes the KPIs which directly manipulate hold_count.
Functions such as vm_fault_quick_hold_pages() now return wired pages
instead.  Since r328977 the overhead of maintaining LRU for wired pages
is lower, and in many cases vm_fault_quick_hold_pages() callers would
swap holds for wirings on the returned pages anyway, so with this change
we remove a number of page lock acquisitions.

No functional change is intended.  __FreeBSD_version is bumped.

Reviewed by:	alc, kib
Discussed with:	jeff
Discussed with:	jhb, np (cxgbe)
Tested by:	pho (previous version)
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D19247
2019-07-08 19:46:20 +00:00
Leandro Lupori
54fdf3bf19 [PPC] Add missing SLB allocation KASSERT
Although PPC SLB code doesn't handle allocation failures,
which are rare, in most places it asserts that the pointer
returned by uma_zalloc() is not NULL, making it easier to
identify the failure and avoiding an invalid pointer dereference.

This change simply adds a missing KASSERT in SLB code.
2019-07-08 13:01:54 +00:00
Leandro Lupori
57d0d4a271 [PPC64] pseries llan: fix MAC address
There was an issue in pseries llan driver, that resulted in the first 2 bytes
of the MAC address getting stripped, and the last 2 being always 0.

In most cases the network interface still worked, despite the MAC being
different of what was specified to QEMU, but when some other host or DHCP
server expected a specific MAC, this would fail.

This change fixes this by shifting right by 2 the local-mac-address read from
device tree, if its length is 6 instead of 8, as observed in QEMU DT, that
always presents a 6 bytes value for this property.

PR:		237471
Reported by:	Alfredo Dal'Ava Junior
Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D20843
2019-07-04 12:31:24 +00:00
Justin Hibbits
088c26aee8 powerpc/booke: Handle misaligned floating point loads/stores as on AIM
Misaligned floating point loads and stores are already handled for AIM, but
use the DSISR to obtain the necessary data.  Book-E does not have the DSISR,
so these fixups are not performed, leading to a SIGBUS on misaligned FP
loads or stores.  Obtain the necessary data on the Book-E side, similar to
how is done for SPE.

MFC after:	1 week
2019-06-26 01:14:39 +00:00
Justin Hibbits
f62da49b2f powerpc: Transition to Secure-PLT, like most other OSs
Summary:
PowerPC has two PLT models: BSS-PLT and Secure-PLT.  BSS-PLT uses runtime
code generation to generate the PLT stubs.  Secure-PLT was introduced with
GCC 4.1 and Binutils 2.17 (base has GCC 4.2.1 and Binutils 2.17), and is a
more secure PLT format, using a read-only linkage table, with the dynamic
linker populating a non-executable index table.

This is the libc, rtld, and kernel support only.  The toolchain and build
parts will be updated separately.

Reviewed By: nwhitehorn, bdragon, pfg
Differential Revision: https://reviews.freebsd.org/D20598
MFC after:	1 month
2019-06-25 00:40:44 +00:00
Conrad Meyer
c363b16c63 sys: Remove DEV_RANDOM device option
Remove 'device random' from kernel configurations that reference it (most).
Replace perhaps mistaken 'nodevice random' in two MIPS configs with 'options
RANDOM_LOADABLE' instead.  Document removal in UPDATING; update NOTES and
random.4.

Reviewed by:	delphij, markm (previous version)
Approved by:	secteam(delphij)
Differential Revision:	https://reviews.freebsd.org/D19918
2019-06-21 00:16:30 +00:00
Nathan Whitehorn
4d210a60c3 Fix bug on newbus device deletion: we should delete the child's devinfo
on deletion, not the parent's.

MFC after:	3 weeks
2019-06-16 21:56:45 +00:00
Brandon Bergren
664104b4af Fix PPC970 boot after r348783
r348783 changed the behavior of the kernel mappings and broke booting on G5.

- Split the kernel mapping logic out so that the case where we are
running from the wrong memory space is handled using identity
mappings, and the case where we are not using a DMAP is handled by
forcibly mapping the kernel into the dmap range as intended by
r348783.

Reported by:	Mikael Urankar
Reviewed by:	luporl
Approved by:	jhibbits (mentor)
Differential Revision:	https://reviews.freebsd.org/D20608
2019-06-12 15:58:11 +00:00
Leandro Lupori
49f10b5181 [PPC] Fix build error when POWERNV is disabled
When building a kernel supporting PSERIES but not POWERNV,
the compiler would complain about an error variable being
possibly used before being initialized.

In practice, however, this should never happen. In any case, it
is now initialized to an error value.
2019-06-11 11:23:15 +00:00
Leandro Lupori
5d8c3a0657 [PPC64] Fix ofw_initrd
Before this change, OFW initrd (as md) handling code was simulating an ofwbus
device. But as there isn't really a Device Tree (DT) node representing OFW
initrd (it is specified in 2 properties under /chosen), its driver was in fact
stealing other driver's DT node.  This was noticed after MD_ROOT_MEM became
default and QEMU's USB keyboard stopped working under VNC.

This change consists in simplifying the process of detection and mapping of
initrd memory, turning it into a simple startup step, instead of trying to
simulate a device.

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D20553
2019-06-11 11:16:41 +00:00
Justin Hibbits
fdb916d53e powernv: Port HMI handler to use the message framework
When an HMI occurs a message event also gets created with the details of the
exception.  Hook into the messaging framework to retrieve the HMI message.
Nothing is done with it yet, except to panic on unhandled exception.
2019-06-10 03:24:38 +00:00
Justin Hibbits
f433dab2de powerpc/powernv: Reduce the scope of the sensor guarding mutex
vmem_xalloc() cannot be called while holding a nonblocking mutex, warned
by WITNESS.  The lock may not be necessary in general, but it avoids
superfluous concurrent OPAL calls for the same sensor.

Reported by:	pkubaj
2019-06-10 03:16:55 +00:00
Justin Hibbits
988d63af1c powerpc/pmap: Move the SLB spill handlers to a better place
The SLB spill handlers are AIM-specific, and belong better with the rest of
the SLB code anyway.  No functional change.
2019-06-08 03:07:08 +00:00
Justin Hibbits
b7918b86b3 powerpc/aim: Use nitems() for calculating size of phys_avail in AIM pmaps
Same thing was already done in r347164 for Book-E pmap.
2019-06-08 02:36:07 +00:00
Leandro Lupori
b934fc7468 [PPC64] Support QEMU/KVM pseries without hugepages
This set of changes make it possible to run FreeBSD for PowerPC64/pseries,
under QEMU/KVM, without requiring the host to make hugepages available to the
guest.

While there was already this possibility, by means of setting hw_direct_map to
0, on PowerPC64 there were a couple of issues/wrong assumptions that prevented
this from working, before this changelist.

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D20522
2019-06-07 17:58:59 +00:00
Justin Hibbits
4420fc895f powerpc/moea: Fix moea64 native VA invalidation
Summary:
moea64_insert_pteg_native()'s invalidation only works by happenstance.
The purpose of the shifts and XORs is to extract the VSID in order to
reverse-engineer the lower bits of the VPN.  Currently a segment size is 256MB
(2**28), and ADDR_API_SHFT64 is 16, so ADDR_PIDX_SHIFT is equivalent.  However,
it's semantically incorrect, in that we don't want to shift by the page shift
size, we want to shift to get to the VSID.

Tested by:	bdragon
Differential Revision: https://reviews.freebsd.org/D20467
2019-06-01 01:40:14 +00:00
Justin Hibbits
8cd3016c00 powerpc64/pmap: Reapply r334235 to OEA64 pmap, clearing HID0_RADIX
This was lost in the re-merger of ISA3 MMU into moea64_native.
2019-05-25 04:56:06 +00:00
Piotr Kubaj
57557e714e Add snd_hda(4) to GENERIC64 used by powerpc64.
amd64 also has snd_hda(4) in GENERIC.

Approved by:	jhibbits (src committer), linimon (mentor)
2019-05-24 20:01:59 +00:00
Leandro Lupori
d3f78f00db Make options MD_ROOT_MEM default on PPC64
Having this option enabled by default on PowerPC64 kernels makes
booting ISO images much easier when on PowerNV.

With it, the ISO may simply be given to the -i flag of kexec.
Better yet, the ISO may be loop mounted on PetitBoot and its
kernel may be used to load itself.

Without this option, booting ISOs on remote PPC64 machines usually
involve preparing a separate kernel, with this option enabled.
2019-05-24 18:41:31 +00:00
Justin Hibbits
702818d200 powerpc/mpc85xx: Use the proper (EREF) form of writing to DBCR0
DBCR0, according to the Freescale EREF, is guaranteed to be updated, and
changes take effect, after an isync plus change of MSR[DE] from 0 to 1.
Otherwise it's guaranteed to be updated "eventually".  Use the expected
synchronization sequence to write it for resetting.

This prevents "Reset failed" from being printed immediately before the CPU
resets.

MFC after:	2 weeks
2019-05-23 03:47:25 +00:00
Justin Hibbits
72e58595b1 powerpc/booke: It helps to set variables before using them
Actually set the source and destination VA's before using them.  Fixes a
bizarre panic on 32-bit Book-E.  Not sure why this wasn't caught by the
compiler.
2019-05-23 03:40:48 +00:00
Leandro Lupori
0632bb89db Fix PPC64 kernel build with clang8 + lld8
This patch fixes the following lld link errors:

- unsupported dynamic relocations on read-only sections
- out-of-range TOC references

Submitted by:	git_bdragon.rtk0.net
Reviewed by:	jhibbits, luporl
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D19352
2019-05-22 15:56:41 +00:00
Justin Hibbits
b4b2e7a7b9 powerpc/booke: Use wrtee instead of msr to restore EE bit
The MSR[EE] bit does not require synchronization when changing.  This is a
trivial micro-optimization, removing the trailing isync from mtmsr().

MFC after:	1 week
2019-05-22 02:43:17 +00:00
Conrad Meyer
e2e050c8ef Extract eventfilter declarations to sys/_eventfilter.h
This allows replacing "sys/eventfilter.h" includes with "sys/_eventfilter.h"
in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces header
pollution substantially.

EVENTHANDLER_DECLARE and EVENTHANDLER_LIST_DECLAREs were moved out of .c
files into appropriate headers (e.g., sys/proc.h, powernv/opal.h).

As a side effect of reduced header pollution, many .c files and headers no
longer contain needed definitions.  The remainder of the patch addresses
adding appropriate includes to fix those files.

LOCK_DEBUG and LOCK_FILE_LINE_ARG are moved to sys/_lock.h, as required by
sys/mutex.h since r326106 (but silently protected by header pollution prior
to this change).

No functional change (intended).  Of course, any out of tree modules that
relied on header pollution for sys/eventhandler.h, sys/lock.h, or
sys/mutex.h inclusion need to be fixed.  __FreeBSD_version has been bumped.
2019-05-20 00:38:23 +00:00
Justin Hibbits
c3acaec564 powerpc: Fix moea64 pmap from 347952
vm_paddr_t is only 32 bits on AIM32 (currently), causing a build failure with
the shifting.

MFC after:	2 weeks
MFC with:	r347952
2019-05-18 14:55:59 +00:00
Justin Hibbits
a57ec59f9c powerpc64/pmap: NUMA-ize the page pv lock pool to reduce contention
It was found during building llvm that the page pv lock pool was seeing very
high contention.  Since the pmap is already NUMA aware, it was surmised that
the domains were referencing similar pages in the different domains.  This
reduces contention to the point of noise in a lockstat(8) run (~51% down to
under 5%), reducing build times by up to 20%.

This doesn't do a perfect domain alignment, just a best-guess based on
hardware available, that the domain is roughly specified in the upper bits
of the PA.  Trying to be more clever would more than likely result in
reduced performance just on the work needed.

MFC after:	2 weeks
2019-05-18 11:14:43 +00:00
Brooks Davis
9e774e5340 FCP-101: Remove bm(4).
Relnotes:	yes
FCP:		https://github.com/freebsd/fcp/blob/master/fcp-0101.md
Reviewed by:	jhb, imp
Differential Revision:	https://reviews.freebsd.org/D20230
2019-05-17 15:20:51 +00:00
Justin Hibbits
f04019c3c6 powerpc: Initialize the Hardware Interrupt Offset Register (HIOR) earlier for ppc970
Since we now have a much larger KVA on powerpc64, it's possible to get SLB
traps earlier in boot, possibly even before the HIOR is properly configured
for us.  Move the HIOR setup to immediately after reset, so that we use our
exception handlers instead of Open Firmware's.

PR:		233863
Submitted by:	Mark Millard (partial)
Reported by:	Mark Millard
MFC after:	2 weeks
2019-05-10 19:36:14 +00:00
Andrew Gallatin
542970fa2d Remove IPSEC from GENERIC due to performance issues
Having IPSEC compiled into the kernel imposes a non-trivial
performance penalty on multi-threaded workloads due to IPSEC
refcounting. In my benchmarks of multi-threaded UDP
transmit (connected sockets), I've seen a roughly 20% performance
penalty when the IPSEC option is included in the kernel (16.8Mpps
vs 13.8Mpps with 32 senders on a 14 core / 28 HTT Xeon
2697v3)). This is largely due to key_addref() incrementing and
decrementing an atomic reference count on the default
policy. This cause all CPUs to stall on the same cacheline, as it
bounces between different CPUs.

Given that relatively few users use ipsec, and that it can be
loaded as a module, it seems reasonable to ask those users to
load the ipsec module so as to avoid imposing this penalty on the
GENERIC kernel. Its my hope that this will make FreeBSD look
better in "out of the box" benchmark comparisons with other
operating systems.

Many thanks to ae for fixing auto-loading of ipsec.ko when
ifconfig tries to configure ipsec, and to cy for volunteering
to ensure the the racoon ports will load the ipsec.ko module

Reviewed by:	cem, cy, delphij, gnn, jhb, jpaetzel
Differential Revision:	https://reviews.freebsd.org/D20163
2019-05-09 22:38:15 +00:00
Justin Hibbits
2b03b6bd45 powerpc/booke: Rewrite pmap_sync_icache() a bit
* Make mmu_booke_sync_icache() use the DMAP on 64-bit prcoesses, no need to
  map the page into the user's address space.  This removes the
  pvh_global_lock from the equation on 64-bit.
* Don't map the page with user-readability on 32-bit.  I don't know what the
  chance of a given user process being able to access the NULL page when
  another process's page is added there, but it doesn't seem like a good
  idea to map it to NULL with user read permissions.
* Only sync as much as we need to.  There are only two significant places
  where pmap_sync_icache is used: proc_rwmem(), and the SIGILL second-chance
  for powerpc.  The SIGILL second chance is likely the most common, and only
  syncs 4 bytes, so avoid the other 127 loop iterations (4096 / 32 byte
  cacheline) in __syncicache().
2019-05-08 16:15:28 +00:00
Justin Hibbits
4023311a29 powerpc/booke: Do as much work outside of TLB locks as possible
Reduce the surface area of the TLB locks.  Unfortunately the same trick for
serializing the tlbie instruction on OEA64 cannot be used here to reduce the
scope of the tlbivax mutex to the tlbsync only, as the mutex also serializes
the TLB miss lock as a side effect, so contention on this lock may not be
reducible any further.
2019-05-08 16:05:18 +00:00
Justin Hibbits
7d91f528a6 powerpc: hide innocuous printf behind bootverbose
NUMA associativity, and OFW node existence, is completely optional, and
shouldn't warn always.
2019-05-08 03:15:22 +00:00
Kyle Evans
251a32b5b2 tun/tap: merge and rename to tuntap
tun(4) and tap(4) share the same general management interface and have a lot
in common. Bugs exist in tap(4) that have been fixed in tun(4), and
vice-versa. Let's reduce the maintenance requirements by merging them
together and using flags to differentiate between the three interface types
(tun, tap, vmnet).

This fixes a couple of tap(4)/vmnet(4) issues right out of the gate:
- tap devices may no longer be destroyed while they're open [0]
- VIMAGE issues already addressed in tun by kp

[0] emaste had removed an easy-panic-button in r240938 due to devdrn
blocking. A naive glance over this leads me to believe that this isn't quite
complete -- destroy_devl will only block while executing d_* functions, but
doesn't block the device from being destroyed while a process has it open.
The latter is the intent of the condvar in tun, so this is "fixed" (for
certain definitions of the word -- it wasn't really broken in tap, it just
wasn't quite ideal).

ifconfig(8) also grew the ability to map an interface name to a kld, so
that `ifconfig {tun,tap}0` can continue to autoload the correct module, and
`ifconfig vmnet0 create` will now autoload the correct module. This is a
low overhead addition.

(MFC commentary)

This may get MFC'd if many bugs in tun(4)/tap(4) are discovered after this,
and how critical they are. Changes after this are likely easily MFC'd
without taking this merge, but the merge will be easier.

I have no plans to do this MFC as of now.

Reviewed by:	bcr (manpages), tuexen (testing, syzkaller/packetdrill)
Input also from:	melifaro
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D20044
2019-05-08 02:32:11 +00:00
Justin Hibbits
2154a866b6 powerpc/booke: Use #ifdef __powerpc64__ instead of hw_direct_map in places
Since the DMAP is only available on powerpc64, and is *always* available on
Book-E powerpc64, don't penalize either side (32-bit or 64-bit) by always
checking hw_direct_map to perform operations.  This saves 5-10% time on
various ports builds, and on buildworld+buildkernel on Book-E hardware.

MFC after:	3 weeks
2019-05-05 20:23:43 +00:00
Justin Hibbits
bfd0787769 powerpc/booke: Fix size check for phys_avail in pmap bootstrap
Use the nitems() macro instead of the expansion, a'la r298352.  Also, fix the
location of this check to after initializing availmem_regions_sz, so that the
check isn't always against 0, thus always failing (nitems(phys_avail) is always
more than 0).
2019-05-05 20:05:50 +00:00
Justin Hibbits
73a30b035e powerpc/mpc85xx: Attach MPC85xx PCI bus and root complex at the right pass
No signifcant change, just matches other PCI attachments, attaching at
BUS_PASS_BUS.

MFC after:	2 weeks
2019-05-04 16:24:43 +00:00
Justin Hibbits
e280e2ea3d powerpc: Optimize padding in bus_dma_tag
Avoid 8 bytes of padding (2 noncontiguous ints).

Submitted by:	Brandon Bergren <git_bdragon.rtk0.net>
Differential Revision: https://reviews.freebsd.org/D20121
2019-05-04 02:45:24 +00:00
Justin Hibbits
5d67b612d0 powerpc: Merge all pmap struct definitions
Summary:
A few ports fail to build due to missing pmap-related definitions, which are
specific per-pmap type.  This tries to appease those ports, by merging all
pmaps together.

A future change will move the inline page directory out of the Book-E pmap,
to eliminate the last #ifdefs in pmap.h and complete the merge.

Reviewed By: luporl
Differential Revision: https://reviews.freebsd.org/D20119
2019-05-04 02:34:28 +00:00
Conrad Meyer
d6745408c7 Add a COMPAT_FREEBSD12 kernel option.
Use it wherever COMPAT_FREEBSD11 is currently specified, like r309749.

Reviewed by:	imp, jhb, markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20120
2019-05-02 18:10:23 +00:00
Justin Hibbits
b4698b7a6c powerpc: Drop OPAL_HANDLE_HMI2 for now, to avoid panicking
It's possible for a Hypervisor Maintenance Interrupt (HMI) to occur while in
the pmap code, holding locks.  This can cause WITNESS to panic due to lock
errors in calling pmap_kextract().  Since we don't yet handle the flags
returned by OPAL_HANDLE_HMI2, just stop using it, so that we don't call into
pmap_kextract().

Reported by:	pkubaj
2019-05-02 03:39:03 +00:00
Justin Hibbits
0af5d6f7d9 powerpc: Stop pretending we run on e500v1 cores
Unconditional writing to MAS7, which doesn't exist on the e500v1 core, in a
TLB miss handler has been in the code for several years now.  Since this has
gone unnoticed for so long, it's easily concluded that e500v1 is not in use
with FreeBSD.  Simplify the code path a bit, by unconditionally zeroing MAS7
instead of calling a subroutine to do it.
2019-04-30 03:45:46 +00:00
Justin Hibbits
7122ab6ed3 powerpc64: Fix switch panic from cpu_throw()
r18 is used to hold the old PCB flags, but cpu_throw doesn't populate r18
with PCB flags, since the old thread is gone.  This can lead to panics on
cores that don't have the registers guarded by these flags.
2019-04-29 22:37:35 +00:00
Leandro Lupori
508864649b [PPC64] Turn opal_flash.c into a device
This change makes it easier to enable/disable the inclusion of
OPAL flash in the kernel.

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D20098
2019-04-29 16:50:33 +00:00
Justin Hibbits
e2e3e7d28e powerpc: Make OPAL root node probe at bus pass
This way its children can attach earlier if needed, and some subsystems are
attached earlier, like the asynchronous token management.

MFC after:	2 weeks
2019-04-29 01:10:57 +00:00
Justin Hibbits
d1d73b0e27 powerpc: Add support for additional FSCR-managed facilities
Add support to enable, save, and restore the following facilities:
* Target Address Register (bctar) -- seemingly just another register to
  branch to.
* Event-based branching -- an interrupt-like userspace event handler
  subsystem.
* Load-monitored facility -- A facility that allows monitoring a range of
  physical memory, and triggering an event on access.  Targeted to garbage
  collection software features.
2019-04-27 22:30:22 +00:00
Justin Hibbits
3eb5d5dd25 powerpc: Add SPR definitions for additional POWER8/POWER9 facilities
This only adds the new SPR definitions and the associated FSCR bits.  The
facilities themselves will be added in separate commits.
2019-04-27 19:32:33 +00:00
Justin Hibbits
8b7f0d83e6 powerpc64: Add the DSCR facility on POWER8 and later
The Data Stream Control Register (DSCR) is privileged on POWER7, but
unprivileged (different register) on POWER8 and later.  However, it's now
guarded by a new register, the Facility Status and Control Register, instead of
the MSR like other pre-existing facilities (FPU, Altivec).  The FSCR must be
managed explicitly, since it's effectively an extension of the MSR.

Tested by:	Brandon Bergren
2019-04-27 16:28:34 +00:00
Justin Hibbits
f074eff155 powerpc: Add POWER8NVL definition
The POWER8NVL (POWER8 NVLink) architecturally behaves identically to the
POWER8, with a different PVR identifier.  Mark it as such, so it shows up
appropriately to the user.

Reported by:	Alexey Kardashevskiy
MFC after:	2 weeks
2019-04-27 02:33:49 +00:00
Justin Hibbits
19cfd8759e powerpc: micro-optimize cpu_switch()
Since the non-volatile registers are restored at the end of cpu_switchin (of
the new thread) they're free for us to use for our own purposes.  Load the
PCB_FLAGS into a non-volatile register so it's preserved across the C
function calls that manage FPU and altivec state.  This removes 4 loads from
each file.  Might be a trivial performance improvement (~12 clock cycles per
context switch).

MFC after:	3 weeks
2019-04-27 00:53:41 +00:00
Justin Hibbits
17b72853f4 powerpc64: Clear FSCR SPR, so that it's in a known state
This now turns any access to the DSCR SPR into a SIGILL.  Later commits will
make DCSR work correctly on POWER8 and POWER9.

PR:		237208
2019-04-26 03:18:49 +00:00
Justin Hibbits
38a6d5495b powerpc: Fix whitespace in SPR header. 2019-04-26 03:13:44 +00:00
Justin Hibbits
da54cd8721 powerpc: Add another feature2 flag, and update power9 definition
Also fix the definition of PPC_FEATURE2_HTM_NOSUSPEND, a bad line copy.

This now closer matches Linux's definition.
2019-04-26 02:30:03 +00:00
Justin Hibbits
19b86243f4 powerpc: Add a couple missing isyncs
mtmsr and mtsr require context synchronizing instructions to follow.  Without
a CSI, there's a chance for a machine check exception.  This reportedly does
occur on a MPC750 (PowerMac G3).

Reported by:	Mark Millard
2019-04-24 02:51:58 +00:00
Leandro Lupori
8920043674 [PPC64] Fix wrong KASSERT in mphyp_pte_insert()
As mphyp_pte_unset() can also remove PTE entries, and as this can
happen in parallel with PTEs evicted by mphyp_pte_insert(), there
is a (rare) chance the PTE being evicted gets removed before
mphyp_pte_insert() is able to do so. Thus, the KASSERT should
check wether the result is H_SUCCESS or H_NOT_FOUND, to avoid
panics if the situation described above occurs.

More details about this issue can be found in PR 237470.

PR:		237470
Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D20012
2019-04-23 17:11:45 +00:00
Justin Hibbits
f4c5f64d30 [PowerPC64] pseries-llan: increment packet output counters on error and success
Summary: when using pseries-llan driver, Opkts and Oerrs counters (netstat
-i) are always zero. This patch adds an small error handling to increment
these counters.

Submitted by:	alfredo.junior_eldorado.org.br
Differential Revision: https://reviews.freebsd.org/D20009
2019-04-23 03:19:03 +00:00
Justin Hibbits
ba5189f7be powerpc64/pseries: Fix hypervisor call with extra arguments
Some hypervisor calls, such as H_SEND_LOGICAL_LAN, take more arguments than
are traditionally passed in registers.  The HCALL ABI will accept these
arguments in r11 and r12.  With ELFv2 ABI, these arguments are 2
double-words lower than ELFv1 ABI, as two double-words in the stack frame
are no longer used, and therefore removed from the frame.  Fix the offsets
for loading the registers for the HCALL.  This fixes the phyp_llan driver
with ELFv2 kernel.

Submitted by:	alfredo.junior_eldorado.org.br
Differential Revision:	https://reviews.freebsd.org/D20008
2019-04-23 03:05:26 +00:00
Justin Hibbits
93096fecb6 powerpc64/powernv: Relax flash block write requirements
Since writes don't necessarily need to be on erase-block boundaries, we can
relax the block size and alignments down to sector size.  If it needs to be
erased, opalflash_erase() will check proper alignment and size.
2019-04-20 02:44:38 +00:00
Justin Hibbits
bc60451a47 powerpc/powernv: Make erasing before writes optional
If the OPAL flash driver supports writing without erase, it adds a
'no-erase' property to the flash device node.  Honor that property and don't
bother erasing if it exists.
2019-04-19 02:28:04 +00:00
Warner Losh
f7ab01581a Move mpr/mps drivers from per-arch NOTES files into the MI notes
file. They are in more arches they they aren't. Add appropriate
nodevice directives in powerpc and arm.
2019-04-13 06:30:45 +00:00
Justin Hibbits
49d9a59783 Add NUMA support to powerpc
Summary:
Initial NUMA support:
    - associate CPU with domain
    - associate memory ranges with domain
    - identify domain for devices
    - limit device interrupt binding to appropriate domain

- Additionally fixes a bug in the setting of Maxmem which led to
  only memory attached to the first socket being enabled for DMA

A pmap variant can opt in to numa support by by calling `numa_mem_regions`
at the end of pmap_bootstrap - registering the corresponding ranges with the
VM.

This yields a ~20% improvement in build times of llvm on dual socket POWER9
over non-NUMA.

Original patch by mmacy.

Differential Revision: https://reviews.freebsd.org/D17933
2019-04-13 04:03:18 +00:00
Justin Hibbits
77eb50c7a3 powerpc: Add file forgotten in r346144
Forgot to add the changes for DELAY(), which lowers priority during the
delay period.  Also, mark the timebase read as volatile so newer GCC does
not optimize it away, as it reportedly does currently.

MFC after:	2 weeks
MFC with:	r346144
2019-04-13 02:29:30 +00:00
Justin Hibbits
e92d228bbb powerpc: Adjust priority NOPs, and make them functions
PowerISA 2.07 and PowerISA 3.0 both specify special NOPs for priority
adjustments, with "medium" priority being normal.  We had been setting
medium-low as our normal priority.  Rather than guess each time as to what
we want and the right NOP, wrap them in inline functions, and replace the
occurrances of the NOPs with the functions.  Also, make DELAY() drop to very
low priority while waiting, so we don't burn CPU.

Coupled with r346143, this shaves off a modest 5-8% on buildworld times with
-j72.  There may be more room for improvement with judicious use of these
NOPs.

MFC after:	2 weeks
2019-04-12 00:53:30 +00:00
Justin Hibbits
6b74fa3f3e powerpc64: Increase the nap level on power9 idling
The POWER9 documentation specifies that levels 0-3 are the 'lightest' sleep
level, meaning lowest latency and with no state loss.  However, state 3 is
not implemented, and is instead reserved for future chips.  This now
properly configures the PSSCR, specifying state 2 as the lowest level to
enter, but request level 0 for quickest sleep level.  If the OCC determines
that the CPU can enter states 1 or 2 it will trigger the transition to those
states on demand.

MFC after:	1 week
2019-04-12 00:44:33 +00:00
Justin Hibbits
3c8c50f955 powerpc/powernv: Fix major bugs in opal_flash
* The BIO bio_data may not be page aligned.  Only the base address of each
  page worth of data is extracted to pass to OPAL.  Without page alignment
  it can scribble over random memory when finishing the page read.  Fix this
  by short-reading the first page to properly align for full page reads.
* Fix the definition of OPAL_FLASH_ERASE.
* Properly handle the async message result, as now returned from r345974.
2019-04-06 02:39:56 +00:00
Justin Hibbits
947079ebee powerpc/powernv: Fix issues in opal_async
* Properly return the full opal_msg from an async completion.
* Don't keep bugging OPAL, wait 100us or so.  With some minor changes to
  DELAY() to drop to very low priority, the thread won't hog the CPU while
  polling for the async completion.
2019-04-06 02:31:01 +00:00
Justin Hibbits
62c7ea1f1d powerpc: Allow emulating optional FPU instructions on CPUs with an FPU
The e5500 has an FPU, but lacks the optional fsqrt instruction.  This
instruction gets emulated in the kernel, but the emulation uses stale data,
from the last switch out, and does not return the result of the operation
immediately.  Fix both of these conditions by saving and restoring the FPRs
around the emulation point.

MFC after:	1 week
MFC with:	r345829
2019-04-03 04:01:08 +00:00
Justin Hibbits
81dd9c5e69 powerpc: Apply r178139 from sparc64 to powerpc's fpu_sqrt
This fix was committed less than 2 months after the code was forked into the
powerpc kernel.  Though powerpc doesn't use quad-precision floating point,
or need it for emulation, the changes do look like correctness fixes
overall.

This was found while trying to get fsqrt emulation working on e5500, which
does have a real FPU, but lacks the fsqrt instruction.  This is not the
complete fix, the rest is to be committed separately.

MFC after:	1 week
2019-04-03 03:54:30 +00:00
Justin Hibbits
fbf7737949 powernv: Port OPAL asynchronous framework to use the new message framework
Since OPAL_GET_MSG does not discriminate between message types, asynchronous
completion events may be received in the OPAL_GET_MSG call, which dequeues
them from the list, thus preventing OPAL_CHECK_ASYNC_COMPLETION from
succeeding.  Handle this case by integrating with the messaging framework.
2019-04-02 04:02:57 +00:00
Justin Hibbits
911a92603e powerpc/powernv: Add OPAL heartbeat thread
Summary:
OPAL needs to be kicked periodically in order for the firmware to make
progress on its tasks.  To do so, create a heartbeat thread to perform this task
every N milliseconds, defined by the device tree.  This task is also a central
location to handle all messages received from OPAL.

Reviewed By: luporl
Differential Revision: https://reviews.freebsd.org/D19743
2019-04-02 04:00:01 +00:00
Justin Hibbits
0499e9c619 powerpc64: Use medium code model in asm files for TOC references
Summary:
With a sufficiently large TOC, it's possible to index out of range, as
the immediate load instructions only permit 16-bit indices, allowing up
to 64kB range (signed) from the base pointer.  Allow +/- 2GB range, with
the medium code model TOC accesses in asm.

Patch originally by Brandon Bergren.  The issue appears to impact ELFv2
more than ELFv1.

Reviewed by:	luporl
Differential Revision: https://reviews.freebsd.org/D19708
2019-03-29 02:38:30 +00:00
Justin Hibbits
4b4b6f0191 powerpc: Remove now-obsolete P9H MMU name 2019-03-29 02:11:48 +00:00
Justin Hibbits
9f1a007da7 powerpc64: Micro-optimize moea64 native pmap tlbie
* Cache moea64_need_lock in a local variable; gcc generates slightly better
  code this way, it doesn't need to reload the value from memory each read.
* VPN cropping is only needed on PowerPC ISA 2.02 and older cores, a subset
  of those that need serialization, so move this under the need_lock check,
  so those that don't need the lock don't even need to check this.
2019-03-26 02:53:35 +00:00
Justin Hibbits
8af4cc4d5a powernv: Add Hypervisor Maintenance Interrupt handler
Attempting to build www/firefox on POWER9 resulted in a HMI exception being
thrown, a fatal trap currently.  This is typically caused by timer facility
errors, but examination of the Hypervisor Maintenance Exception Register
(HMER) yielded only that an exception had recovered, with no information of
the actual exception cause.

When an HMI occurs, OPAL_HANDLE_HMI or OPAL_HANDLE_HMI2 must be called to
handle the exception at the firmware level.  If the exception is handled, we
can continue.

This adds only the preliminary handler, enough to prevent package building
from panicking.  An enhancement in the future is to use the flags returned
by OPAL_HANDLE_HMI2 to print more useful error messages, and log maintenance
events.

Reviewed by:	luporl
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D19634
2019-03-23 03:23:20 +00:00
Justin Hibbits
bc94b70098 powerpc: Re-merge isa3 HPT with moea64 native HPT
r345402 fixed the bug that led to the split of the ISA 3.0 HPT handling from
the existing manager.  The cause of the bug was gcc moving the register
holding VPN to a different register (not r0), which triggered bizarre
behaviors.  With the fix, things work, so they can be re-merged.  No
performance lost with the merge.
2019-03-22 22:14:14 +00:00
Justin Hibbits
091a23cbf8 powerpc64: Handle the modern (2.05+) implementaiton of tlbie
By happenstance gcc4 puts 'vpn' into r0 in all uses of TLBIE(), but modern
gcc does not.  Also, the single-argument form of tlbie zeros all unused
arguments, making the modern tlbie instruction use r0 as the RS field
(LPID).

The vpn argument has the bottom 12 bits cleared (the input having been
left-shifted by 12 bits), which just so happens, on the POWER9 and previous
incarnations, to be the number of LPID bits supported.  With those bits
being zero, the instruction:

	tlbie r0, r0

will invalidate the VPN in r0, in LPAR 0 (ignoring the upper bits of r0 for
the RS field).  One build with gcc8 yields:

	tlbie r9, r0

with r0 having arbitrary contents, not equal to r9.  This leads to strange
crashes, behaviors, and panics, due to the requested TLB entry not actually
being invalidated.

As the moea64_native must work on both old and new, we explicitly zero out
r0 so that it can work with only the single argument, built with base gcc
and modern gcc.  isa3_hashtb takes a different approach, encoding the
two-argument form, soas not to explicitly clobber r0, and instead let the
compiler decide.

Reported by:	Brandon Bergren
Tested by:	Brandon Bergren
MFC after:	1 week
2019-03-22 01:43:31 +00:00
Konstantin Belousov
fd8d844f76 amd64 KPTI: add control from procctl(2).
Add the infrastructure to allow MD procctl(2) commands, and use it to
introduce amd64 PTI control and reporting.  PTI mode cannot be
modified for existing pmap, the knob controls PTI of the new vmspace
created on exec.

Requested by:	jhb
Reviewed by:	jhb, markj (previous version)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D19514
2019-03-16 11:44:33 +00:00
Konstantin Belousov
6f1fe3305a amd64: Add md process flags and first P_MD_PTI flag.
PTI mode for the process pmap on exec is activated iff P_MD_PTI is set.

On exec, the existing vmspace can be reused only if pti mode of the
pmap matches the P_MD_PTI flag of the process.  Add MD
cpu_exec_vmspace_reuse() callback for exec_new_vmspace() which can
vetoed reuse of the existing vmspace.

MFC note: md_flags change struct proc KBI.

Reviewed by:	jhb, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D19514
2019-03-16 11:31:01 +00:00
Justin Hibbits
093f7de620 powerpc: Print trap frame address in ddb backtraces
Registers visible from 'show reg' don't always match the registers from the
offending trap frame.  Knowing the frame address lets one examine the
registers manually.

MFC after:	1 week
2019-03-09 03:24:39 +00:00
Justin Hibbits
66306e6acd powerpc: Print trap frame address for fatal traps
MFC after:	1 week
2019-03-09 03:18:37 +00:00
Justin Hibbits
b2c820735a powerpc: Print data address register on alignment exceptions
MFC after:	1 week
2019-03-09 03:10:56 +00:00
Justin Hibbits
1cd7081eb1 powerpc64: Fix early exit with invalid kernel SLB entries
The check for early exit should be checking the SLB entry itself.  As
currently written it was checking the address of the SLB, which is always
non-zero, so would go through the kernel SR restore loop regardless.

Submitted by:	mmacy
MFC after:	2 weeks
2019-03-08 04:20:33 +00:00
Justin Hibbits
9ffdae0fd7 powerpc: Fix cpufreq statement scoping
The second statements on the lines are not guarded by the `if' condition.
This triggers a warning with newer gcc.  It's relatively harmless given the
usage, but incorrect.  Instead, wrap the statements so they're properly
guarded.

Reported by:	powerpc64-gcc xtoolchain
MFC after:	1 week
2019-03-08 03:59:53 +00:00
Justin Hibbits
058250a8ab powerpc: Save stack pointer in savectx
This allows 'show acttrace' to show backtrace on processes currently running
on CPUs.

Reported by:	Brandon Bergren
MFC after:	1 week
2019-03-07 04:43:08 +00:00
Justin Hibbits
83b009dab5 powerpc: fix 'show spr' for ELFv1 powerpc64
Update and flush the right cache range for the ELFv1 ABI.

MFC after:	1 week
2019-03-02 21:11:46 +00:00
Justin Hibbits
5b4c63b781 powerpc/booke: Depessimize MAS register updates even more
Remove isyncs between MAS register updates in the TLB miss handler, since
it's only needed before the TLB update instructions.
2019-03-02 20:59:18 +00:00
Justin Hibbits
51244b1e46 powerpc: Scale intrcnt by mp_ncpus
On very large powerpc64 systems (2x22x4 power9) it's very easy to run out of
available IRQs and crash the system at boot.  Scale the count by mp_ncpus,
similar to x86, so this doesn't happen.  Further work can be done in the future
to scale the I/O IRQs as well, but that's left for the future.

Submitted by:	mmacy
MFC after:	3 weeks
2019-03-02 01:51:41 +00:00
Edward Tomasz Napierala
1699546def Remove sv_pagesize, originally introduced with r100384.
In all of the architectures we have today, we always use PAGE_SIZE.
While in theory one could define different things, none of the
current architectures do, even the ones that have transitioned from
32-bit to 64-bit like i386 and arm. Some ancient mips binaries on
other systems used 8k instead of 4k, but we don't support running
those and likely never will due to their age and obscurity.

Reviewed by:	imp (who also contributed the commit message)
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19280
2019-03-01 16:16:38 +00:00
Justin Hibbits
6775dfdf54 powerpc/powernv: Add OPAL flash device driver
Firmware needed by petitboot, for example, GPU firmware, can be installed to
a partition in the flash filesystem.  This driver exposes the full flash
given by the device tree, letting the user manage firmware, etc, from
FreeBSD.

To use the partitions provided by the flash module, the fdt_slicer module is
needed, but the module isn't needed for raw access, so there's no direct
dependency link in here.

MFC after:	2 weeks
2019-03-01 04:36:55 +00:00
Justin Hibbits
dac618a648 powerpc/powernv: Add asynchronous token management for powernv
The OPAL firmware only supports a finite number of in-flight asynchronous
operations.  Rather than have each subsystem try to manage its own, use a
central management service to hand out tokens.

More work can be done to improve asynchronous behavior, such as funneling
things through a future OPAL heartbeat handler, but capabilities will be
added as needed.

Augment the existing consumers (i2c and sensors) to use this new API.

MFC after:	4 weeks
2019-03-01 02:49:47 +00:00
Justin Hibbits
0d69f00b4d powerpc/mpc85xx: Synchronize timebase the platform correct way
Summary:
To safely synchronize timebase we need to disable the timebase on all
cores, set timebase, and resynchronize.  This adds two new devices, mutually
exclusive, which attach on the SoC simplebus, to freeze and unfreeze the
timebase.  The devices are singletons, and platform-specific, so no reason
to make them optional and in separate files.

This was found to be necessary for top(1) to work correctly on an AmigaOne
X5000 (P5020 SoC).  It also fixes bufdaemon and bufspacedaemon hangs at
shutdown.

Test Plan: Regression test on various Book-E hardware.

Reviewed by:	nwhitehorn
Tested by:	Brandon Bergren (git_bdragon.rtk0.net)
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D19208
2019-02-27 03:30:49 +00:00
Konstantin Belousov
e7a9df16e6 Add kernel support for Intel userspace protection keys feature on
Skylake Xeons.

See SDM rev. 68 Vol 3 4.6.2 Protection Keys and the description of the
RDPKRU and WRPKRU instructions.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D18893
2019-02-20 09:51:13 +00:00
Justin Hibbits
a9b033c2f3 powerpc/booke: Fix 32-bit build
MFC after:	2 weeks
MFC with:	344202
2019-02-16 04:47:33 +00:00
Justin Hibbits
0454ed9794 powerpc/booke: depessimize MAS register updates
We only need to isync before we actually use the MAS registers, so before and
after the TLB read/write/sync/search operations.

MFC after:	2 weeks
2019-02-16 04:38:34 +00:00
Justin Hibbits
18f7e2b45e powerpc/booke: Use DMAP where possible for page copy and zeroing
This avoids several locks and pmap_kenter()'s, improving performance
marginally.

MFC after:	2 weeks
2019-02-16 04:16:10 +00:00
Leandro Lupori
59621b207c [PPC64] Fix mismatch between thread flags and MSR
When sigreturn() restored a thread's context, SRR1 was being restored
to its previous value, but pcb_flags was not being touched.

This could cause a mismatch between the thread's MSR and its pcb_flags.
For instance, when the thread used the FPU for the first time inside
the signal handler, sigreturn() would clear SRR1, but not pcb_flags.
Then, the thread would return with the FPU bit cleared in MSR and,
the next time it tried to use the FPU, it would fail on a KASSERT
that checked if the FPU was disabled.

This change clears the FPU bit in both pcb_flags and frame->srr1,
as the code that restores the context expects to use the FPU trap
to re-enable it.

PR:		234539
Reported by:	sbruno
Reviewed by:	jhibbits, sbruno
Differential Revision:	https://reviews.freebsd.org/D19166
2019-02-14 15:15:32 +00:00
Konstantin Belousov
72091bb393 Enable enabling ASLR on non-x86 architectures.
Discussed with:	emaste
Sponsored by:	The FreeBSD Foundation
2019-02-14 14:44:53 +00:00
Justin Hibbits
64143619ab powerpc/booke: Use the 'tlbilx' instruction on newer cores
Newer cores have the 'tlbilx' instruction, which doesn't broadcast over
CoreNet.  This is significantly faster than walking the TLB to invalidate
the PID mappings.  tlbilx with the arguments given takes 131 clock cycles to
complete, as opposed to 512 iterations through the loop plus tlbre/tlbwe at
each iteration.

MFC after:	3 weeks
2019-02-13 03:11:12 +00:00
Leandro Lupori
b8efbfb9d3 [ppc64] prevent infinite loop on icache sync
At moea64_sync_icache(), when the 'va' argument has page size
alignment, round_page() will return the same value as 'va'.
This would cause 'len' to be 0 and thus an infinite loop.

With this change, 'lim' will always point to the next page boundary.

This issue occurred especially during debugging sessions, when a breakpoint
was placed on an exact page-aligned offset, for instance.

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D19149
2019-02-12 11:29:03 +00:00
Justin Hibbits
dcbd7de5b6 powerpc: Clamp MAXCPU for MPC85XXSPE kernel to 2
SoCs with e500v2 chips only have at most 2 cores, and there are no plans to
release any more e500v2-based SoCs.  Clamping MAXCPU down to 2 saves 5MB of
data, and 1.5MB bss.
2019-02-10 20:21:20 +00:00
Justin Hibbits
83191e19b7 powerpc: Fix AIM build
cpu_idle_e500mc is only used in booke, so ignore it completely in AIM.

MFC after:	2 weeks
MFC with:	r343944
2019-02-09 23:19:33 +00:00
Justin Hibbits
d6919f21dc powerpc: Split out the e500mc idling from rest of Book-E
The e500v2 and e500mc (and derivatives) have different idling procedures, so
make them different functions.

MFC after:	2 weeks
2019-02-09 21:19:53 +00:00
Leandro Lupori
59a8224976 [ppc64] fix /dev/kmem
For direct mapped kernel addresses, ppc64 function was not
performing the dmap to physical conversion, before jumping
to the code that fetched the value from physical memory.

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D19086
2019-02-07 17:30:44 +00:00
Justin Hibbits
4290b4b849 powerpc: Bind IRQs to only one interrupt on QorIQ SoCs
The QorIQ SoCs don't actually support multicast interrupts, and the
references state explicitly that multicast is undefined behavior.  Avoid the
undefined behavior by binding to only a single CPU, using a quirk to
determine if this is necessary.

MFC after:	3 weeks
2019-02-06 03:52:14 +00:00
Leandro Lupori
4a8450ceff [ppc64] llan: fix fatal kernel trap when system is low on memory
When running several builders in parallel, on QEMU, with 8GB of
memory, a fatal kernel trap (0x300 (data storage interrupt))
caused by llan driver is sometimes observed, when the system
starts to run out of swap space.

This happens because, at llan_intr(), a phyp call to add a
logical LAN buffer is always made when llan_add_rxbuf() fails,
even if it fails to allocate a new buffer.

PR:	235489
Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D19084
2019-02-05 18:16:14 +00:00
Justin Hibbits
9c22a13345 powerpc: Don't idle with the wait instruction on booke
It appears idling via 'wait' on e5500 causes strange behaviors, such as
top(1) simply hanging sporadically, until input.  Until this can possibly be
sorted out (interrupt issue?), just don't idle on this hardware.  The SoCs
are low power already, and the wait state doesn't save much anyway.
2019-02-05 04:47:41 +00:00
Leandro Lupori
6174048251 powerpc64: Add a trap stack area
Currently, the trap code switches to the the temporary stack in the dbtrap
section. It works in most cases, but in the beginning of the execution, the
temp stack is being used, as starting in the powerpc_init() code.

In this current scenario, the stack is being overwritten, which causes the
return of breakpoint() to take abnormal execution.

This current patchset create a small stack to use by the dbtrap: codepath
avoiding the corruption of the temporary stack.

PR:		224872
Submitted by:	breno.leitao_gmail.com
Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D14484
2019-02-04 16:02:03 +00:00
Justin Hibbits
d49fc192c1 powerpc/powernv: Add a driver for the POWER9 XIVE interrupt controller
The XIVE (External Interrupt Virtualization Engine) is a new interrupt
controller present in IBM's POWER9 processor.  It's a very powerful,
very complex device using queues and shared memory to improve interrupt
dispatch performance in a virtualized environment.

This yields a ~10% performance improvment over the XICS emulation mode,
measured in both buildworld, and 'dd' from nvme to /dev/null.

Currently, this only supports native access.

MFC after:	1 month
2019-02-02 04:15:16 +00:00
Konstantin Belousov
c75f49f7d8 Make iflib a loadable module.
iflib is already a module, but it is unconditionally compiled into the
kernel.  There are drivers which do not need iflib(4), and there are
situations where somebody might not want iflib in kernel because of
using the corresponding driver as module.

Reviewed by:	marius
Discussed with:	erj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D19041
2019-01-31 19:05:56 +00:00
Andriy Voskoboinyk
86d535ab47 Garbage collect AH_SUPPORT_AR5416 config option.
It does nothing since r318857.
2019-01-25 13:48:40 +00:00
Justin Hibbits
15fba9d3be powerpc: Fix opaque irq data initialization
The powerpc_intr structure is not zero-initialized, so on an invariants
build would panic in the xics driver with an invalid pointer.  Also fix the
xics driver to share the private data setup code between xics_enable() and
xics_bind().

Reported by:	Leonardo Bianconi
2019-01-19 04:47:19 +00:00
Justin Hibbits
bd326619e8 powerpc: Fix FPU fsqrt emulation special case results
If fsqrts is emulated with +INF as its argument, the 0 return value causes a
NULL pointer dereference, panicking the system.  Follow the PowerISA and
return +INF with no FP exception.

MFC after:	1 week
2019-01-16 03:52:43 +00:00
Justin Hibbits
2da4e52d79 powerpcspe: Correct SPE high-component loading
Don't clobber the low part of the register restoring the high component of.
This could lead to very bad behavior if it's an ABI-affected register.

While here, also mark the asm volatile in the SPE high save case, to match
the load case.

Reported by:	Branden Bergren (git_bdragon.rtk0.net)
MFC after:	1 week
2019-01-13 04:51:24 +00:00
Justin Hibbits
02f2e80c3f Add AT_HWCAP / AT_HWCAP2 to elf64_sysvec_v2.
Summary:
I was working on implementing ifuncs on powerpc64 elfv2 today, and I suddenly
realized that the reason I was having so much trouble with AT_HWCAP and
AT_HWCAP2 is they are missing from the sysentvec.

After adding them, the auxv is being filled like it should.

Submitted by:	Brandon Bergren (git_bdragon.rtk0.net)
Differential Revision: https://reviews.freebsd.org/D18575
2019-01-13 02:28:37 +00:00
Justin Hibbits
431d31e0bf powerpc/pseries: Cache the IPI vector to avoid the common static lookup
The IPI vector is static, and happens to be the most common interrupt by far
on some systems.  Rather than searching for the interrupt every time, cache
the index.

This appears to yield a small performance boost, of about 8% reduction in
buildworld times, on my POWER9 system, when paired with r342975.
2019-01-12 22:10:31 +00:00
Justin Hibbits
56505ec016 powerpc: Add opaque 'private data' to interrupt vectors
The XICS and XIVE need extra data beyond irq and vector.  Rather than
performing a separate search, it's better for the general interrupt facility
to hold a private pointer, since the search already must be done anyway at
that level.
2019-01-12 22:05:42 +00:00
Conrad Meyer
bba9cbe374 powerpc: Fix regression introduced in r342771
In r342771, I introduced a regression in Power by abusing the platform
smp_topo() method as a shortcut for providing the MI information needed for
the stated sysctls.  The smp_topo() method was already called later by
sched_ule (under the name cpu_topo()), and initializes a static array of
scheduler topology information.  I had skimmed the smp_topo_foo() functions
and assumed they were idempotent; empirically, they are not (or at least,
detect re-initialization and panic).

Do the cleaner thing I should have done in the first place and add a
platform method specifically for core- and thread-count probing.

Reported by:	luporl via jhibbits
Reviewed by:	luporl
X-MFC-With:	r342771
Differential Revision:	https://reviews.freebsd.org/D18777
2019-01-07 19:39:31 +00:00
Conrad Meyer
6b83069e05 Expose threads-per-core and physical core count information
With new sysctls (to the best of our ability do detect them).  Restructured
smp.4 slightly for clarity (keep relevant stuff closer to the top) while
documenting.

Reviewed by:	markj, jhibbits (ppc parts)
MFC after:	3 days
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D18322
2019-01-04 18:31:17 +00:00
Mateusz Guzik
628888f0e0 Remove iBCS2, part2: general kernel
Reviewed by:	kib (previous version)
Sponsored by:	The FreeBSD Foundation
2018-12-19 21:57:58 +00:00
Justin Hibbits
45b33e809f powerpc/booke: Change KERNBASE to be physical load address
Previous commits have made VM_MIN_KERNEL_ADDRESS its own separate entity,
and rebased the kernel around that address instead of KERNBASE.  This commit
pulls the trigger to rebase KERNBASE to a physical load address.  The
eventual goal is to align the address with the AIM KERNBASE, but at this
time that's not an option.

Currently a Book-E kernel must be loaded on a 64MB boundary, due to size
issues.  The common load address is at the 64MB mark (0x04000000), so simply
make that the default KERNBASE.

As of this commit, Book-E kernels can be loaded and booted with ubldr.

MFC after:	3 weeks
2018-12-13 05:07:39 +00:00
Justin Hibbits
3067a880ce powerpcspe: Fix GPR handling in SPE exception handler
Optimize the exception handler to only save and load the upper word of the
GPRs used in the emulating instruction.  This reduces the save/load
overhead, and as a side effect does not overwrite the upper word of any
temporary register.

With this commit I am now able to run editors/abiword and math/gnumeric on a
e500-based system.

MFC after:	1 week
MFC With:	r341752,r341751
2018-12-13 04:48:28 +00:00
Justin Hibbits
9d720d45c9 powerpc/booke: Don't get and use the load offset for TOC on APs
The code was a near exact copy of the code in startup, but it doesn't need
the complexity since the kernel is already relocated.  With
VM_MIN_KERNEL_ADDRESS as currently set to KERNBASE, this doesn't cause a
problem, because it's a zero offset.  However, when KERNBASE is changed to a
physical load address, it then has a non-zero offset, and ends up with an
invalid stack pointer, causing the AP to hang.
2018-12-11 02:03:00 +00:00
Leandro Lupori
be2bd024de ppc64: handle exception 0x1500 (soft patch)
This change adds a hypervisor trap handler for exception 0x1500 (soft patch),
normalizing all VSX registers and returning.
This avoids a kernel panic due to unknown exception.

Change made with the collaboration of leonardo.bianconi_eldorado.org.br,
that found out that this is a hypervisor exception and not a supervisor one,
and fixed this in the code.

Reviewed by:	jhibbits, sbruno
Differential Revision:	https://reviews.freebsd.org/D17806
2018-12-10 14:54:28 +00:00
Hans Petter Selasky
d7a9bfee8f Implement atomic_swap_xxx() for all platforms.
Differential Revision:	https://reviews.freebsd.org/D18450
Reviewed by:		kib@
MFC after:		3 days
Sponsored by:		Mellanox Technologies
2018-12-10 13:38:13 +00:00
Justin Hibbits
870d94c50a powerpc/booke: Replace a logical equivalent of pmap_kextract() with a real call
No sense in reinventing the wheel here.  AP bringup is not a time-critical
point.
2018-12-10 04:16:40 +00:00
Scott Long
44f299a3cc Remove the mps driver from powerpc 32bit GENERIC, and don't build it and
mpr as a module for powerpc or mips.  An upcoming commit will cause these
drivers to rely on the presence of 64bit atomic operations.  Discussed
with jhibbits.
2018-12-09 06:06:06 +00:00
Justin Hibbits
ddc6c1fa3d powerpc/SPE: Copy lower part of source register to target for efdabs/efdnabs/efdneg
MFC after:	1 week
MFC With:	r341751
2018-12-09 04:54:55 +00:00
Justin Hibbits
3d6bebd3a2 powerpc/SPE: Reload vector registers after efdabs/efdnabs/efdneg
While here, also style(9)-adjust indents around this code.
2018-12-09 04:13:14 +00:00
Justin Hibbits
76748087bf powerpc: Set very low priority mode while waiting for AP unleash event
The POWER9 does not recognize 'or 27,27,27' as a thread priority NOP.  On
earlier POWER architectures, this NOP would note to the processor to give up
resources if able, to improve performance of other threads.

All processors that support the thread priority NOPs recognize the
'or 31,31,31' NOP as very low priority, so use this to perform a similar
function, and not burn cycles on POWER9.
2018-12-06 04:36:02 +00:00
Justin Hibbits
ac37786a0a powerpc: Fix ELFv2 JMP_SLOT relocation fixup
The jump slot is a function pointer, not a descriptor pointer, in ELFv2.  Just
write the pointer itself over, not the contents of the pointer, which would be
the first instruction of the function.
2018-12-06 04:30:24 +00:00
Justin Hibbits
7c4f1a1c5a powerpc/powermac: Fix macgpio(4) child interrupt resource handling
The 'interrupts' property is actually 2 words, not one, on macgpio child
nodes.  Open Firmware's getprop function might be returning the value
copied, not the total size of the property, but FDT's returns the total
size.  Prior to this patch, this would cause the SYS_RES_IRQ resource list
to not be populated when running with the 'usefdt' loader variable set, to
convert the OFW device tree to a FDT.  Since the property is always 2 words,
read both words, and ignore the second.

Tested by:	Dennis Clarke (previous attempt)
MFC after:	2 weeks
2018-12-06 04:25:12 +00:00
Justin Hibbits
bfed756af6 Sprinkle EARLY_DRIVER_MODULE around the tree
Mark some buses as BUS_PASS_BUS, and some resources as BUS_PASS_RESOURCE.
This also decouples some resource attachment orderings from being races by
device tree ordering, instead relying on the bus pass to provide the
ordering.

This was originally intended to support multipass suspend/resume, but it's
also needed on PowerMacs when using fdt, as the device tree seems to get
created in reverse of the OFW tree.
Reviewed by:	nwhitehorn (long ago)
Differential Revision:	https://reviews.freebsd.org/D918
2018-12-04 04:55:49 +00:00
Justin Hibbits
f1e0cb5ef1 powerpc: preload_addr_relocate is no longer necessary for booke
The same behavior was moved to machdep.c, paired with AIM's relocation,
making this redundant.  With this, it's now possible to boot FreeBSD with
ubldr on a uboot Book-E platform, even with a
KERNBASE != VM_MIN_KERNEL_ADDRESS.
2018-12-04 03:51:10 +00:00
Justin Hibbits
f095905ca4 powerpc: Check for a fdt in the metadata if it doesn't already exist
It's possible the fdt pointer was passed in via the metadata, as is done in
ubldr.  Check for the fdt here, instead of working with a NULL fdt, and
panicking.
2018-12-03 04:56:06 +00:00
Justin Hibbits
3c0b081966 powerpc/booke: Check for the metadata address by physical address
The metadata pointer will almost never be at or above 'btext', as btext is a
relocated symbol, so will be based at VM_MIN_KERNEL_ADDRESS, not at
KERNBASE.  Check the address against kernload, where the kernel is
physically loaded.
2018-12-03 04:47:28 +00:00
Conrad Meyer
d86e0b338f pmcr: Fix pstate setting on Power8
Fix p-state setting on Power8 by removing the accidental double-indirection of
the pstate_ids table.

The pstate_ids table comes from the OF property "ibm,pstate-ids."  On Power9,
the values happen to be identical to the indices, so the extra indirection was
harmless.  On Power8, the values were out of the range [0, npstates], so
pmcr_set() would fail the spec[0] range check with EINVAL.

While here, include both the value and index in the driver-specific register
array as spec[0] and spec[1] respectively.  They're redundant, but relatively
harmless, and it may aid debugging.

While here, fix the range check to exclude the index npstates, which is one
past the last valid index.

PR:		233693
Reported and tested by:	sbruno
Reviewed by:	jhibbits
2018-12-01 21:37:47 +00:00
Justin Hibbits
f4a80449ce Fix thread creation in PowerPC64 ELFv2 processes.
Summary:
Currently, the upcall used to create threads assumes ELFv1.

Instead, we should check which sysentvec is in use on the process and act
accordingly.

This makes ELFv2 threaded processes work.

Submitted by:	git_bdragon.rtk0.net
Differential Revision: https://reviews.freebsd.org/D18330
2018-11-29 03:39:11 +00:00
Justin Hibbits
8f69e36d87 powerpc: Don't include KERNBASE in genassym, it's unnecessary
A related future change, which changes KERNBASE for Book-E for some reason
causes a "KERNBASE redefined" error with assym.inc, even though it only changed
the value of KERNBASE and nothing else.  Since machine/vmparam.h is already
included in booke/locore.S, and the requisite guards are already in place for
properly handling KERNBASE in vmparam.h, just remove it from genassym, and
include vmparam.h in the AIM locore files.
2018-11-28 16:00:52 +00:00
Justin Hibbits
b996435337 powerpc/booke: Fix debug printfs in pmap
Add missing '%'s so printf formats are actually handled.
2018-11-28 04:02:26 +00:00
Justin Hibbits
24c3112f0c powerpc: Fix the powerpc64 build post-r341102
VM_MIN_KERNEL_ADDRESS is now used in locore.S, but the UL suffix isn't
permitted in .S files.
2018-11-28 02:48:43 +00:00
Justin Hibbits
ea32838af0 powerpc: Prepare Book-E kernels for KERNBASE != run base
Book-E kernels really run at VM_MIN_KERNEL_ADDRESS, which currently happens to
be the same as KERNBASE.  KERNBASE is the linked address, which the loader also
takes to be the physical load address.  Treat KERNBASE as a physical address,
not a virtual, and change virtual address references for KERNBASE to use
something more appropriate.
2018-11-28 02:00:27 +00:00
Eric van Gyzen
f5e7d8bdb5 Prevent kernel stack disclosure in getcontext/swapcontext
Expand r338982 to cover freebsd32 interfaces on amd64, mips, and powerpc.

MFC after:	2 days
Security:	FreeBSD-EN-18:12.mem
Security:	CVE-2018-17155
Sponsored by:	Dell EMC Isilon
2018-11-26 20:50:55 +00:00
Niclas Zeising
bd62da641d Enable evdev on ppc32
Enable evdev on ppc32 as well, similar to what was done i386 and amd64 in
r340387 and ppc64 in r340632.

Evdev can be used by X and is used by wayland to handle input devices.

Approved by:	jhibbits
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D18049
2018-11-20 19:31:02 +00:00
Justin Hibbits
fe5e88fabf powerpc: Sync icache on SIGILL, in case of cache issues
The update of jemalloc to 5.1.0 exposed a cache syncing issue on a Freescale
e500 base system.  There was already code in the FPU emulator to address
this, but it was limited to a single static variable, and did not attempt to
sync the cache.  This pulls that out to the higher level program exception
handler, and syncs the cache.

If a SIGILL is hit a second time at the same address, it will be treated as
a real illegal instruction, and handled accordingly.
2018-11-19 23:54:49 +00:00
Niclas Zeising
4651a02f07 Enable evdev on ppc64
Enable evdev on ppc64 as well, similar to what was done for amd64 and i386
in r340387.

Evdev can be used by X and is used by wayland to handle input devices.

Approved by:	mmacy
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D18026
2018-11-19 15:36:58 +00:00
Kevin Bowling
0d909f4ccf powerpc64: reduce GENERIC64 diff versus amd64 GENERIC
Reviewed by:	jhibbits
Approved by:	timur (mentor)
Differential Revision:	https://reviews.freebsd.org/D17515
2018-11-13 09:19:07 +00:00
Justin Hibbits
266b2aa146 powerpc: Use MAX() macro instead of max() inline function to calculate Maxmem
Maxmem is the highest address for physical memory in the system.  It's
measured in pages which, since max() returns a u_int, should allow for up to
2^44 bytes of memory addressable by the system.  However, on POWER9 systems
at least, memory addressed by additional socketed CPUs begins at addresses
far above the 2^44 mark, causing issues with memory accesses and DMA, when
memory is addressed on the auxiliary CPUs.  Use the MAX() macro instead,
which doesn't convert arguments, so retains Maxmem and all calculations as
its defined long type (64-bit on powerpc64), keeping the maximum address
correct.

Submitted by:	mmacy
2018-11-10 02:37:56 +00:00
Justin Hibbits
8f04c0c06b powerpc64: Fix "show spr" command on ELFv2 kernels
Summary: When compiling for ELFv2, it is necessary to adjust the offset to
get_spr and factor in the function prologue to ensure the correct instruction is
being edited.

Test Plan:
Before:
```
db> show spr 110
KDB: reentering
KDB: stack backtrace:
0xc008000020fb96e0: at 0xc000000002bb2e34 = kdb_backtrace+0x68
0xc008000020fb97f0: at 0xc000000002bb3798 = kdb_reenter+0x54
0xc008000020fb9860: at 0xc000000002f87090 = trap+0x4e4
0xc008000020fb9990: at 0xc000000002f78a60 = powerpc_interrupt+0x110
0xc008000020fb9a20: kernel trap 0xe40 by 0xc000000002401978 = get_spr+0x8: srr1=0x9000000000001032
            r1=0xc008000020fb9cd0 cr=0x80009438 xer=0x20040000 ctr=0xc000000002f7b40c r2=0xc0000000037fd000
saved LR(0xfffffffffffffffb) is invalid.
```

After:

```
db> show spr 110
SPR 272(110): c000000003cae900
```

Submitted by:	git_bdragon.rtk0.net
Differential Revision: https://reviews.freebsd.org/D17813
2018-11-08 20:48:44 +00:00
Justin Hibbits
ad39591ad2 powerpc/powernv: Restrict the busdma tag to only POWER8
It seems this tag is causing problems on POWER9 systems.  Since no POWER9 user
has encountered the problem fixed by r339589 just restrict it to POWER8 for now.
A better fix will likely be to update powerpc/busdma_machdep.c to handle the
window correctly.

Reported by:	mmacy, others
2018-11-08 20:31:12 +00:00
Justin Hibbits
6a0fd1a51b powerpc/atomic: Loosen the memory barrier on atomic_load_acq_*()
'sync' is pretty heavy-handed, and is unnecessary for this use case.  It's a
full barrier, which is applicable for all storage types.  However,
atomic_load_acq_*() is only expected to operate on physical memory, not
device memory, so lwsync is sufficient (lwsync provides access ordering on
memory that is marked as Coherency Required and is not Write Through nor
Cache Inhibited).  On 32-bit systems, this is a nop, since powerpc_lwsync()
is defined to use sync, as a workaround for a silicon bug in the Freescale
e500 core.
2018-11-07 01:42:00 +00:00
John Baldwin
4cbbb74888 Add a KPI for the delay while spinning on a spin lock.
Replace a call to DELAY(1) with a new cpu_lock_delay() KPI.  Currently
cpu_lock_delay() is defined to DELAY(1) on all platforms.  However,
platforms with a DELAY() implementation that uses spin locks should
implement a custom cpu_lock_delay() doesn't use locks.

Reviewed by:	kib
MFC after:	3 days
2018-11-05 21:34:17 +00:00
Justin Hibbits
b465e0bb56 powerpc/SMP: Don't spam the console with AP bringup messages
Especially on new POWER9 systems, the console can be filled with

  SMP: AP CPU #XX launched

messages.  This can also slow down the console printing.  Instead, do what
x86 now does, as of r333335, and print it all on one line, unless
bootverbose is set.
2018-11-05 01:53:20 +00:00
John Baldwin
b317cfd4c0 Don't enter DDB for fatal traps before panic by default.
Add a new 'debugger_on_trap' knob separate from 'debugger_on_panic'
and make the calls to kdb_trap() in MD fatal trap handlers prior to
calling panic() conditional on this new knob instead of
'debugger_on_panic'.  Disable the new knob by default.  Developers who
wish to recover from a fatal fault by adjusting saved register state
and retrying the faulting instruction can still do so by enabling the
new knob.  However, for the more common case this makes the user
experience for panics due to a fatal fault match the user experience
for other panics, e.g. 'c' in DDB will generate a crash dump and
reboot the system rather than being stuck in an infinite loop of fatal
fault messages and DDB prompts.

Reviewed by:	kib, avg
MFC after:	2 months
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D17768
2018-11-01 21:34:17 +00:00
Kyle Evans
be352d20d5 Compile in VERBOSE_SYSINIT support by default, remain silent by default
The loader tunable 'debug.verbose_sysinit' may be used to toggle verbosity.
This is added to the debugging section of these kernconfs to be turned off
in stable branches for clarity of intent.

MFC after:	never
2018-10-31 22:38:19 +00:00
Michael Tuexen
d59a162c11 Bump the number of fans supported from 8 to 12.
The number of fans on a PowerMac7,3 with liquid cooling is 9.

Reviewed by:		andreast@
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D17754
2018-10-30 11:51:09 +00:00
Justin Hibbits
a37c714a0f powerpc/mpc85xx: Reset the PCIe bus on attach
It seems if a Radeon card is already initialized by u-boot, it won't be
reinitialized by the kernel, and the DRM module will fail to attach.  This
steals the reset code from mips/octopci.c to blindly reset the bus on attach.
This was tested on a AmigaOne X5000/20, such that it can be booted from the
local video console, and get a video console in FreeBSD.
2018-10-30 00:47:40 +00:00
Brooks Davis
c3adaa3305 Consolidate identical ELF auxargs type defintions.
All platforms except powerpc use the same values and powerpc shares a
majority of them.

Go ahead and declare AT_NOTELF, AT_UID, and AT_EUID in favor of the
unused AT_DCACHEBSIZE, AT_ICACHEBSIZE, and AT_UCACHEBSIZE for powerpc.

Reviewed by:	jhb, imp
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17397
2018-10-22 22:24:32 +00:00
Leandro Lupori
d93e635a81 ppc64: limited 32-bit DMA address range
Further investigation of issues with 32-bit DMA on PowerNV revealed that
its window is hardcoded by OPAL (at least in skiboot version 5.4.9) and
cannot be changed by the OS.
Thus, now jhb suggestion of limiting the range in PCI DMA tag seems
the best way to deal with it.

Reviewed by:	jhibbits, nwhitehorn, sbruno
Approved by:	jhibbits(mentor)
Differential Revision:	https://reviews.freebsd.org/D17601
2018-10-22 13:40:50 +00:00
Justin Hibbits
24dd643d54 powerpc: stash off srr0 in si_addr for signals
si_addr is the address of the instruction executing at the time the
signal was sent.  Populate this field with srr0, which, though not
always the case, is most often the instruction that triggered the fault.
2018-10-22 00:27:37 +00:00
Justin Hibbits
6f4827aca7 powerpc/booke: Turn tlb*_print_tlbentries() into 'show tlb*' DDB commands
debugf() is unnecessary for the TLB printing functions, as they're only
intended to be used from ddb.  Instead, make them full DDB 'show'
commands, so now it can be written as 'show tlb1' and 'show tlb0'
instead of calling the function, hoping DEBUG has been defined.
2018-10-22 00:21:27 +00:00
Justin Hibbits
c9bd5bee1a powerpc/mpc85xx: Make Freescale PCI bridge driver a subclass of ofw_pcib_pci
This driver was already 99% identical to the ofw_pcib_pci driver, except for
the attachment.  Since ofw_pcib_pci is already a subclass of pcib, this
creates a private declaration of that class, to use for the base class for
this driver.

At some point in the future, ofw_pcib_pci_driver should probably be exported
to a header, so we're not tracking the softc struct contents, but for now,
since there's only this one other driver, it's not a pressing issue.
2018-10-21 02:39:13 +00:00
Justin Hibbits
27ef2ca86b powerpc64/powernv: Add pnpinfo strings to opal device children
This makes it easier to see what's left unattached as new drivers are
written, and to see what drivers get attached to what nodes.
2018-10-21 02:30:34 +00:00
Justin Hibbits
d692cd43c4 powerpc64/pmap: Correct the logic for minidump KVA chunk
r279252 inverted the logic in moea64_scan_init, such that instead of
terminating when reaching a dead page, it terminates when reaching a live
page, ostensibly preserving exactly one page of KVA.
2018-10-21 02:28:04 +00:00
Justin Hibbits
54b310b892 powerpc64/xics: Fix comment typo 2018-10-21 02:25:56 +00:00
Justin Hibbits
2756851a77 powerpc64/powernv:opal_pci: Fix the alignment of the TCE table
The TCE table need only be aligned to the size of the table, not the size of
the TCE segment.
2018-10-21 02:24:37 +00:00
Justin Hibbits
289041e2cb powerpcspe: Implement SPE exception handling
The Signal Processing Engine (SPE) found in Freescale e500 cores (and
others) offloads IEEE-754 compliance (NaN, Inf handling, overflow,
underflow) to software, most likely as a means of simplifying the APU
silicon.  Some software, like AbiWord, needs full IEEE-754 compliance,
including NaN handling.  Implement the necessary bits to enable it.

Differential Revision: https://reviews.freebsd.org/D17446
2018-10-21 00:43:27 +00:00
Leandro Lupori
461298ffa6 Initialize SPRG0 before its first possible use
At early boot, PCPU_GET(), that obtains a pointer from SPRG0, was being
used with SPRG0 not yet initialized. If it pointed to an invalid
address, the machine would hang.

Approved by:	re(gjb), jhibbits(mentor)
2018-10-15 16:43:07 +00:00
Michael Tuexen
20a2f77eec Enable TCP Fast Open support for PPC platforms.
Reviewed by:		kbowling@, andreast@
Approved by:		re (kib@)
Differential Revision:	https://reviews.freebsd.org/D17407
2018-10-07 12:56:05 +00:00
Justin Hibbits
7e524b0746 powerpc/pseries: EOI interrupts in XICS by setting lowest priority
Discussing with Benjamin Herrenschmidt, OPAL_INT_GET_XIRR masks the
returned priority, so must be resumed before more interrupts can be
handled at this priority.  Since there are only two priorities used in
FreeBSD, we know that the previous priority in an EOI will always be
0xff (lowest priority).

Reviewed by:	nwhitehorn
Approved by:	re(rgrimes)
Differential Revision: https://reviews.freebsd.org/D17361
2018-10-06 18:51:49 +00:00
Justin Hibbits
013cc176c9 powerpc64/powernv: Don't mask MSIs in OPAL
Summary:
Discussing with Benjamin Herrenschmidt, MSIs, and edge-triggered
interrupts in general, must not be masked in XICS and XIVE, else
subsequent interrupts may be ignored.

Testing locally on my Talos II (single CPU, 18-core POWER9), NVMe now
works with MSI, improving read throughput by ~70% (900MB/s -> 1.67GB/s,
with 64MB block size) over INTx interrupts, and snd_hda(4) now will
actually play music with MSI.  Previously, snd_hda(4) would not receive
interrupts, timing out, and declaring the channels dead.

This has also been tested by Kevin Bowling, and others, with great
success.  Kevin reported NVMe unusable on his Talos II prior to this
patch.

Reviewed by:	nwhitehorn, kbowling
Approved by:	re(rgrimes)
Differential Revision: https://reviews.freebsd.org/D17356
2018-10-06 03:20:26 +00:00
Kevin Bowling
8ac2f3ba9f Use nda(4) on powerpc64
Approved by:	re@ (kib), krion (mentor), imp
Differential Revision:	https://reviews.freebsd.org/D17368
2018-10-02 21:36:00 +00:00
Justin Hibbits
fd8cf3be5f powerpc: Blacklist the top 64kB range of the lower 4GB PA space
The PHB4 host bridge used by the POWER9 uses a 64kB range in 32-bit
space at the address 0xffff0000-0xffffffff.  Reserve this range so that
DMA memory cannot be allocated within this range.  This fixes seemingly
random crashes on a POWER9 system.  Ideally this range will have been
reserved by the firmware, but as of now this is not the case.

Submitted by:	git_bdragon.rtk0.net
Reviewed by:	nwhitehorn
Approved by:	re(kib)
Differential Revision:	https://reviews.freebsd.org/D17183
2018-09-25 02:34:28 +00:00
Breno Leitao
03e83a838e powerpc64: Add initial support for HTM (kABI)
This patch adds the very initial support for HTM that might come at FreeBSD
version 12.1. This basic support defines a new kABI, so, we do not need to change
it later during 12.1 time frame, when the full implementation will come.

Reviewed by: jhibbits
Approved by: re(marius), jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D16889
2018-09-06 17:07:21 +00:00
Alan Cox
49bfa624ac Eliminate the arena parameter to kmem_free(). Implicitly this corrects an
error in the function hypercall_memfree(), where the wrong arena was being
passed to kmem_free().

Introduce a per-page flag, VPO_KMEM_EXEC, to mark physical pages that are
mapped in kmem with execute permissions.  Use this flag to determine which
arena the kmem virtual addresses are returned to.

Eliminate UMA_SLAB_KRWX.  The introduction of VPO_KMEM_EXEC makes it
redundant.

Update the nearby comment for UMA_SLAB_KERNEL.

Reviewed by:	kib, markj
Discussed with:	jeff
Approved by:	re (marius)
Differential Revision:	https://reviews.freebsd.org/D16845
2018-08-25 19:38:08 +00:00
Mark Johnston
36716fe2e6 Prepare the kernel linker to handle PC-relative ifunc relocations.
The boot-time ifunc resolver assumes that it only needs to apply
IRELATIVE relocations to PLT entries.  With an upcoming optimization,
this assumption no longer holds, so add the support required to handle
PC-relative relocations targeting GNU_IFUNC symbols.
- Provide a custom symbol lookup routine that can be used in early boot.
  The default lookup routine uses kobj, which is not functional at that
  point.
- Apply all existing relocations during boot rather than filtering
  IRELATIVE relocations.
- Ensure that we continue to apply ifunc relocations in a second pass
  when loading a kernel module.

Reviewed by:	kib
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D16749
2018-08-22 20:44:30 +00:00
Alan Cox
83a90bffd8 Eliminate kmem_malloc()'s unused arena parameter. (The arena parameter
became unused in FreeBSD 12.x as a side-effect of the NUMA-related
changes.)

Reviewed by:	kib, markj
Discussed with:	jeff, re@
Differential Revision:	https://reviews.freebsd.org/D16825
2018-08-21 16:43:46 +00:00
Alan Cox
44d0efb215 Eliminate kmem_alloc_contig()'s unused arena parameter.
Reviewed by:	hselasky, kib, markj
Discussed with:	jeff
Differential Revision:	https://reviews.freebsd.org/D16799
2018-08-20 15:57:27 +00:00
Matt Macy
381388b9c4 add snps IP uart support / genaralize UART
This is an amalgam of a patch by Doug Ambrisko to
generalize uart_acpi_find_device, imp moving the
ACPI table to uart_dev_ns8250.c and advice by jhb
to work around a bug in the EPYC 3151 BIOS
(the BIOS incorrectly marks the serial ports as
disabled)

Reviewed by: imp
MFC after: 8 weeks
Differential Revision: https://reviews.freebsd.org/D16432
2018-08-19 21:10:21 +00:00
Justin Hibbits
8d67357c5c powerpc conf: Add PRINTF_BUFR_SIZE option to Book-E configs
Without this, printf is very hard to follow at times on multicore systems.
2018-08-19 19:07:59 +00:00
Justin Hibbits
b793c8ab28 Sort SPR_SPEFSCR in the SPR list
Also remove duplicate definition of SPR_IBAT0U.
2018-08-19 19:03:43 +00:00
Justin Hibbits
340a810bf0 booke pmap: hide debug-ish printf behind bootverbose
It's not necessary during normal operation to know the mapped region size
and wasted space.
2018-08-19 18:54:43 +00:00
John Baldwin
8cd385fda0 Make 'device crypto' lines more consistent.
- In configurations with a pseudo devices section, move 'device crypto'
  into that section.
- Use a consistent comment.  Note that other things common in kernel
  configs such as GELI also require 'device crypto', not just IPSEC.

Reviewed by:	rgrimes, cem, imp
Differential Revision:	https://reviews.freebsd.org/D16775
2018-08-18 20:32:08 +00:00
Justin Hibbits
7d849dc1a4 powerpc: Add lwsync and ptesync 'sync' opcode variants to ddb disassembler
The canonical form of sync is:

  sync L, E (if Category Elemental Memory Barriers implemented)

The L bits (2) denote the type of sync:

  0 -- hwsync
  1 -- lwsync
  2 -- ptesync or hwsync

It's been found that most 32-bit CPUs designed prior to the introduction of
lwsync will ignore the L bits.  However, some cores, particularly the e500 core,
will trigger an illegal instruction exception.  Adding these variants will make
it easier to see which sync variant is actually being used in case of a trap.
2018-08-10 03:28:40 +00:00
Breno Leitao
78f4e2fea0 powerpc64/powernv: re-read RTC after polling
If OPAL_RTC_READ is busy and does not return the information on the first run,
as returning OPAL_BUSY_EVENT, the system will crash since ymd and hmsm variable
will contain junk values.

This is happening because we were not calling OPAL_RTC_READ again after
OPAL_POLL_EVENTS' return, which would finally replace the old/junk hmsm and ymd
values.

The code was also mixing OPAL_RTC_READ and OPAL_POLL_EVENTS return values.

This patch fix this logic and guarantee that we call OPAL_RTC_READ after
OPAL_POLL_EVENTS return, and guarantee the code will only proceed if
OPAL_RTC_READ returns OPAL_SUCCESS.

Reviewed by: jhibbits
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D16617
2018-08-08 21:19:07 +00:00
Konstantin Belousov
e45b89d23d Add pmap_is_valid_memattr(9).
Discussed with:	alc
Sponsored by:	The FreeBSD Foundation, Mellanox Technologies
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D15583
2018-08-01 18:45:51 +00:00
Justin Hibbits
9b8d0a4615 powerpcspe: Unconditionally save an restore SPEFSCR on task switch
The SPEFSCR is not guarded by the SPV bit in MSR, it's just another SPR.
Protect processes from other tasks setting the SPEFSCR for their own needs.
2018-07-30 17:03:15 +00:00
Justin Hibbits
0bf0bb832f Support building IPMI as a module on powerpc64
This still only supports IPMI via OPAL on powerpc64, but now it can be tested
with a GENERIC kernel.
2018-07-25 18:58:57 +00:00
Justin Hibbits
adc9dcf3e3 Fix floating point exception definitions for powerpcspe
These were incorrectly implemented in the original port.
2018-07-24 22:04:56 +00:00
Breno Leitao
3ddc2cde51 ofw: Load initrd file
This is an OFW initrd module that would load the initrd from device tree
parameters and give the to the md driver.

With this patch, it is possible to pass a rootfs image through kexec in PowerNV
mode (powerpc64). In order to user it, you should set the MD_ROOT_MEM option in
your kernel configuration.

Reviewed by: jhibbits
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D15705
2018-07-24 16:52:52 +00:00
Justin Hibbits
038c615929 Revert r336509. Fails buildworld.
I had naively assumed that building kernel would be sufficient to test that
the header is sane.  However, it turns out this now needs -fms-extensions to
build.  Rather than sprinkling -fms-extensions all over the place, revert
for now, and revisit with a better fix.
2018-07-19 21:06:58 +00:00
Justin Hibbits
7fb935da15 Merge the md_page structs for AIM and Book-E into a single unioned struct
Summary:
Ports like sysutils/lsof troll through kernel structures, and
therefore include kernel headers and all the dirty secrets involved.  struct
vm_page includes the struct md_page inline, which currently is only defined
if AIM or BOOKE is defined.  Thus, by default, sysutils/lsof cannot build,
due to the struct md_page having an incomplete type.  Fix this by merging
the two struct definitions into an anonymous struct-union.

A similar change could be made to unify the pmap structures as well.

Reviewed By: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D16232
2018-07-19 20:13:33 +00:00
Justin Hibbits
3395ab28eb powerpc/powernv: Make opal_i2c driver work with attached i2c drivers
* FreeBSD stores addresses in 8 bit format, but the OPAL API requires the 7-bit
  address, and encodes the direction elsewhere.  Behave like other i2c drivers,
  and shift accordingly.
* The OPAL API can already handle multiple requests in flight.  Change the async
  token to be private to the thread, so as not to stomp across i2c accesses,
  remove the limitation error message, and use the correct message index to
  transfer all messages in the list.
* Micro-optimize the async handler to not continuously call pmap_kextract() when
  spin-waiting for the operation to complete.

This has been tested by hexdumping an EEPROM attached via the icee(4) driver.
2018-07-09 20:33:48 +00:00
Justin Hibbits
fedd55f14b Let ofw_iicbus work its magic on OPAL i2c buses.
ofw_iicbus already has attachments on iichb.  Rather than adding an explicit
attachment onto opal_i2c, simply change the exposed name of the OPAL i2c bus
to 'iichb'.
2018-07-07 01:58:40 +00:00
Matt Macy
ab3059a8e7 Back pcpu zone with domain correct pages
- Change pcpu zone consumers to use a stride size of PAGE_SIZE.
  (defined as UMA_PCPU_ALLOC_SIZE to make future identification easier)

- Allocate page from the correct domain for a given cpu.

- Don't initialize pc_domain to non-zero value if NUMA is not defined
  There are some misconceptions surrounding this field. It is the
  _VM_ NUMA domain and should only ever correspond to valid domain
  values as understood by the VM.

The former slab size of sizeof(struct pcpu) was somewhat arbitrary.
The new value is PAGE_SIZE because that's the smallest granularity
which the VM can allocate a slab for a given domain. If you have
fewer than PAGE_SIZE/8 counters on your system there will be some
memory wasted, but this is obviously something where you want the
cache line to be coming from the correct domain.

Reviewed by: jeff
Sponsored by: Limelight Networks
Differential Revision:  https://reviews.freebsd.org/D15933
2018-07-06 02:06:03 +00:00