Commit Graph

123 Commits

Author SHA1 Message Date
John Baldwin
9b9a532782 powerpc pseries: Remove unused devclass arguments to DRIVER_MODULE. 2022-05-10 10:21:38 -07:00
John Baldwin
c90ea83112 Remove unused uart_devclass. 2022-05-06 15:46:57 -07:00
John Baldwin
e4a38f5407 powerpc pseries xics: Use devclass_find to lookup xicp devclass.
Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D35083
2022-04-27 15:08:47 -07:00
John Baldwin
1c311640c0 powerpc: Use __diagused for variables only used in KASSERT(). 2022-04-13 16:08:23 -07:00
John Baldwin
b2a8b342f3 llan: Remove unused variables.
In theory the errors during llan_attach should be handled, but other
errors in llan_attach (e.g. bus_setup_intr) are already ignored, so
just remove the unused variable to preserve the status quo.
2022-04-12 14:58:59 -07:00
Justin Hibbits
5ae48eb998 powerpc/pseries: Allow radix pmap in pseries for ISA 3.0
ISA 3.0 allows for nested radix translations with minimal to no
involvement of the hypervisor.  This should make pseries signficantly
faster on POWER9 pseries instances, as fewer hypercalls are needed to
manage pmap now.

MFC after:	2 weeks
Relnotes:	yes
2021-08-11 19:07:04 -05:00
Warner Losh
ddfc9c4c59 newbus: Move from bus_child_{pnpinfo,location}_src to bus_child_{pnpinfo,location} with sbuf
Now that the upper layers all go through a layer to tie into these
information functions that translates an sbuf into char * and len. The
current interface suffers issues of what to do in cases of truncation,
etc. Instead, migrate all these functions to using struct sbuf and these
issues go away. The caller is also in charge of any memory allocation
and/or expansion that's needed during this process.

Create a bus_generic_child_{pnpinfo,location} and make it default. It
just returns success. This is for those busses that have no information
for these items. Migrate the now-empty routines to using this as
appropriate.

Document these new interfaces with man pages, and oversight from before.

Reviewed by:		jhb, bcr
Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D29937
2021-06-22 20:52:06 -06:00
Leandro Lupori
4a66b8083c powerpc: fix boot on pseries without hugepages
Commit 49c894ddce introduced an issue that prevented pseries boot,
when hugepages were not available to the guest. Now large page
info must be available before moea64_install is called, so this change
moves the code that scans large page sizes before the call.

Reviewed by:	jhibbits (IRC)
Sponsored by:	Instituto de Pesquisas Eldorado (eldorado.org.br)
2021-06-02 16:27:36 -03:00
Marcin Wojtas
240429103c Rename ofwpci.c to ofw_pcib.c
It's a class0 driver that implements some pcib methods and creates
a pci bus as its children.
The "ofw_pci" name will be used by a new driver that will be a subclass
of the pci bus.
No functional changes intended.

Submitted by: Kornel Duleba <mindal@semihalf.com>
Reviewed by: andrew
Obtained from: Semihalf
Sponsored by: Alstom Group
Differential Revision: https://reviews.freebsd.org/D30226
2021-05-20 11:22:25 +02:00
Justin Hibbits
49c894ddce powerpc64: Split out DMAP and non-DMAP implementations of some methods
Summary:
Some methods are split between DMAP and non-DMAP, conditional on
hw_direct_map variable.  Rather than checking this variable every time,
use it to install different functions via IFUNCs.

Reviewed By: luporl
Differential Revision: https://reviews.freebsd.org/D30071
2021-05-05 20:57:33 -05:00
Justin Hibbits
a5f07fa0c6 powerpc/pseries: Add new hypercall definition, H_REGISTER_PROC_TBL
This will be used by the Radix MMU on pseries.

MFC after:	1 week
2021-03-30 21:22:21 -05:00
Brandon Bergren
5001c579ba [PowerPC64LE] pseries: Fix input buffering logic.
In uart_phyp_get(), when the internal buffer is empty, we make a
hypercall to retrieve up to 16 bytes of input data from the
hypervisor. As this is specified to be returned in BE format, we need
to do a 64-bit byte swap on the first and second half of the data.

If the buffer being passed in was insufficient to return the fetched
data, we store the remainder in the internal buffer and use it to
satisfy the following calls to uart_phyp_get() until it is drained.

However, in this case, we were accidentally byteswapping the internal
buffer again.

Move the byteswapping code to just after the hypercall so it only gets
swapped when we're filling the buffer.

Fixes arrow keys in qemu on pseries, among other console oddities.

Sponsored by:	Tag1 Consulting, Inc.
MFC after:	3 days
2021-02-25 14:50:13 -06:00
Conrad Meyer
78599c32ef Add CFI start/end proc directives to arm64, i386, and ppc
Follow-up to r353959 and r368070: do the same for other architectures.

arm32 already seems to use its own .fnstart/.fnend directives, which
appear to be ARM-specific variants of the same thing.  Likewise, MIPS
uses .frame directives.

Reviewed by:	arichardson
Differential Revision:	https://reviews.freebsd.org/D27387
2020-12-05 00:33:28 +00:00
Leandro Lupori
e2d6c417e3 Implement superpages for PowerPC64 (HPT)
This change adds support for transparent superpages for PowerPC64
systems using Hashed Page Tables (HPT). All pmap operations are
supported.

The changes were inspired by RISC-V implementation of superpages,
by @markj (r344106), but heavily adapted to fit PPC64 HPT architecture
and existing MMU OEA64 code.

While these changes are not better tested, superpages support is disabled by
default. To enable it, use vm.pmap.superpages_enabled=1.

In this initial implementation, when superpages are disabled, system
performance stays at the same level as without these changes. When
superpages are enabled, buildworld time increases a bit (~2%). However,
for workloads that put a heavy pressure on the TLB the performance boost
is much bigger (see HPC Challenge and pgbench on D25237).

Reviewed by:	jhibbits
Sponsored by:	Eldorado Research Institute (eldorado.org.br)
Differential Revision:	https://reviews.freebsd.org/D25237
2020-11-06 14:12:45 +00:00
Brandon Bergren
c0290b3de8 [PowerPC64LE] Fix endianness issues in phyp_vscsi.
Unlike virtio, which in legacy mode is guest endian, the hypervisor vscsi
interface operates in big endian, so we must convert back and forth in several
places.

These changes are enough to attach a rootdisk.

Sponsored by:	Tag1 Consulting, Inc.
2020-09-23 00:13:58 +00:00
Brandon Bergren
15be37cb7f [PowerPC64LE] Fix endianness issues in phyp and opal consoles.
This applies to both pseries and powernv, which were tested at different
points during the patchset development.

Sponsored by:	Tag1 Consulting, Inc.
2020-09-23 00:06:48 +00:00
Brandon Bergren
35ef395191 [PowerPC64LE] Tell the hypervisor to switch interrupts to LE at CHRP attach.
Since we will need to be able to take traps relatively early in the process,
ensure that the hypervisor changes our ILE for us as soon as we are ready.

Sponsored by:	Tag1 Consulting, Inc.
2020-09-23 00:03:35 +00:00
Brandon Bergren
a662559264 [PowerPC64LE] LE bringup work: locore / machdep / platform
This is the initial LE changes required in the machdep code to get as far
as platform attachment on qemu pseries.

Sponsored by:	Tag1 Consulting, Inc.
2020-09-22 23:55:34 +00:00
Mateusz Guzik
b64b31338f powerpc: clean up empty lines in .c and .h files 2020-09-01 21:20:08 +00:00
Justin Hibbits
45b69dd63e powerpc/mmu: Convert PowerPC pmap drivers to ifunc from kobj
With IFUNC support in the kernel, we can finally get rid of our poor-man's
ifunc for pmap, utilizing kobj.  Since moea64 uses a second tier kobj as
well, for its own private methods, this adds a second pmap install function
(pmap_mmu_init()) to perform pmap 'post-install pre-bootstrap'
initialization, before the IFUNCs get initialized.

Reviewed by:	bdragon
2020-05-27 01:24:12 +00:00
Gleb Smirnoff
de086f1a6c This is Ethernet driver so mark the interrupt appropriately. 2020-01-23 01:46:05 +00:00
Leandro Lupori
35f294270c Enable use of ofwcons for early debug
This change enables the use of OpenFirmware Console (ofwcons), even when VGA is
available, allowing early kernel messages to be seen, that is important in case
of crashes before VGA console initialization.

This is specially useful in virtualized environments, where the user/developer
doesn't have full control of the virtualization engine (e.g. OpenStack).

The old behavior is preserved by default and, in order to use ofwcons, a few
tunables that have been introduced need to be set:
- hw.ofwfb.disable=1     - disable OFW FrameBuffer device
- machdep.ofw.mtx_spin=1 - change PPC OFW mutex to SPIN type, to match kernel
                           console's mutex type
- debug.quiesce_ofw=0    - don't call OFW quiesce, needed to keep ofwcons I/O
                           working

More details can be found at differential revision D20640.

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D20640
2019-12-09 13:40:23 +00:00
Leandro Lupori
a16111e6a2 [PPC64] Enable opal console use as a GDB DBGPORT
This change makes it possible to use OPAL console as a GDB debug port.

Similar to uart and uart_phyp debug ports, it has to be enabled by
setting the hw.uart.dbgport variable to the serial console node
of the device tree.

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D22649
2019-12-09 13:09:32 +00:00
Leandro Lupori
0b0c23fee3 [PPC] Remove extra \0 char inserted on vty by QEMU
Since version 2.11.0, QEMU became bug-compatible with
PowerVM's vty implementation, by inserting a \0 after
every \r going to the guest. Guests are expected to
workaround this issue by removing every \0 immediately
following a \r.

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D22171
2019-11-29 11:34:11 +00:00
Leandro Lupori
4ceaf951fb [PPC64] Enable phyp vty use as a GDB DBGPORT
This change makes it possible to use a POWER Hypervisor virtual
terminal device (phyp vty) as a GDB debug port.

Similar to the uart debug port, it has to be enabled by setting
the hw.uart_phyp.dbgport variable to the vty node of the device
tree.

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D22205
2019-11-25 16:30:38 +00:00
Gleb Smirnoff
38e1a6585b Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:13:37 +00:00
Leandro Lupori
0ecc478b74 [PPC64] Initial kernel minidump implementation
Based on POWER9BSD implementation, with all POWER9 specific code removed and
addition of new methods in PPC64 MMU interface, to isolate platform specific
code. Currently, the new methods are implemented on pseries and PowerNV
(D21643).

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D21551
2019-10-14 13:04:04 +00:00
Justin Hibbits
7c382eea30 powerpc/pmap64: Make moea64 statistics optional
Summary:
It turns out statistics accounting is very expensive in the pmap driver,
and doesn't seem necessary in the common case.  Make this optional
behind a MOEA64_STATS #define, which one can set if they really need
statistics.

This saves ~7-8% on buildworld time on a POWER9.

Found by bdragon.

Reviewed by:	luporl
Differential Revision: https://reviews.freebsd.org/D20903
2019-07-25 03:47:27 +00:00
Leandro Lupori
8b55f9f853 [PPC64] pseries: fix realmaxaddr calculation
On POWER9/pseries, QEMU passes several regions of memory,
instead of a single region containing all memory, as the
code was expecting.

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D20857
2019-07-10 13:36:17 +00:00
Leandro Lupori
57d0d4a271 [PPC64] pseries llan: fix MAC address
There was an issue in pseries llan driver, that resulted in the first 2 bytes
of the MAC address getting stripped, and the last 2 being always 0.

In most cases the network interface still worked, despite the MAC being
different of what was specified to QEMU, but when some other host or DHCP
server expected a specific MAC, this would fail.

This change fixes this by shifting right by 2 the local-mac-address read from
device tree, if its length is 6 instead of 8, as observed in QEMU DT, that
always presents a 6 bytes value for this property.

PR:		237471
Reported by:	Alfredo Dal'Ava Junior
Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D20843
2019-07-04 12:31:24 +00:00
Leandro Lupori
49f10b5181 [PPC] Fix build error when POWERNV is disabled
When building a kernel supporting PSERIES but not POWERNV,
the compiler would complain about an error variable being
possibly used before being initialized.

In practice, however, this should never happen. In any case, it
is now initialized to an error value.
2019-06-11 11:23:15 +00:00
Leandro Lupori
b934fc7468 [PPC64] Support QEMU/KVM pseries without hugepages
This set of changes make it possible to run FreeBSD for PowerPC64/pseries,
under QEMU/KVM, without requiring the host to make hugepages available to the
guest.

While there was already this possibility, by means of setting hw_direct_map to
0, on PowerPC64 there were a couple of issues/wrong assumptions that prevented
this from working, before this changelist.

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D20522
2019-06-07 17:58:59 +00:00
Conrad Meyer
e2e050c8ef Extract eventfilter declarations to sys/_eventfilter.h
This allows replacing "sys/eventfilter.h" includes with "sys/_eventfilter.h"
in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces header
pollution substantially.

EVENTHANDLER_DECLARE and EVENTHANDLER_LIST_DECLAREs were moved out of .c
files into appropriate headers (e.g., sys/proc.h, powernv/opal.h).

As a side effect of reduced header pollution, many .c files and headers no
longer contain needed definitions.  The remainder of the patch addresses
adding appropriate includes to fix those files.

LOCK_DEBUG and LOCK_FILE_LINE_ARG are moved to sys/_lock.h, as required by
sys/mutex.h since r326106 (but silently protected by header pollution prior
to this change).

No functional change (intended).  Of course, any out of tree modules that
relied on header pollution for sys/eventhandler.h, sys/lock.h, or
sys/mutex.h inclusion need to be fixed.  __FreeBSD_version has been bumped.
2019-05-20 00:38:23 +00:00
Leandro Lupori
8920043674 [PPC64] Fix wrong KASSERT in mphyp_pte_insert()
As mphyp_pte_unset() can also remove PTE entries, and as this can
happen in parallel with PTEs evicted by mphyp_pte_insert(), there
is a (rare) chance the PTE being evicted gets removed before
mphyp_pte_insert() is able to do so. Thus, the KASSERT should
check wether the result is H_SUCCESS or H_NOT_FOUND, to avoid
panics if the situation described above occurs.

More details about this issue can be found in PR 237470.

PR:		237470
Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D20012
2019-04-23 17:11:45 +00:00
Justin Hibbits
f4c5f64d30 [PowerPC64] pseries-llan: increment packet output counters on error and success
Summary: when using pseries-llan driver, Opkts and Oerrs counters (netstat
-i) are always zero. This patch adds an small error handling to increment
these counters.

Submitted by:	alfredo.junior_eldorado.org.br
Differential Revision: https://reviews.freebsd.org/D20009
2019-04-23 03:19:03 +00:00
Justin Hibbits
ba5189f7be powerpc64/pseries: Fix hypervisor call with extra arguments
Some hypervisor calls, such as H_SEND_LOGICAL_LAN, take more arguments than
are traditionally passed in registers.  The HCALL ABI will accept these
arguments in r11 and r12.  With ELFv2 ABI, these arguments are 2
double-words lower than ELFv1 ABI, as two double-words in the stack frame
are no longer used, and therefore removed from the frame.  Fix the offsets
for loading the registers for the HCALL.  This fixes the phyp_llan driver
with ELFv2 kernel.

Submitted by:	alfredo.junior_eldorado.org.br
Differential Revision:	https://reviews.freebsd.org/D20008
2019-04-23 03:05:26 +00:00
Leandro Lupori
4a8450ceff [ppc64] llan: fix fatal kernel trap when system is low on memory
When running several builders in parallel, on QEMU, with 8GB of
memory, a fatal kernel trap (0x300 (data storage interrupt))
caused by llan driver is sometimes observed, when the system
starts to run out of swap space.

This happens because, at llan_intr(), a phyp call to add a
logical LAN buffer is always made when llan_add_rxbuf() fails,
even if it fails to allocate a new buffer.

PR:	235489
Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D19084
2019-02-05 18:16:14 +00:00
Justin Hibbits
d49fc192c1 powerpc/powernv: Add a driver for the POWER9 XIVE interrupt controller
The XIVE (External Interrupt Virtualization Engine) is a new interrupt
controller present in IBM's POWER9 processor.  It's a very powerful,
very complex device using queues and shared memory to improve interrupt
dispatch performance in a virtualized environment.

This yields a ~10% performance improvment over the XICS emulation mode,
measured in both buildworld, and 'dd' from nvme to /dev/null.

Currently, this only supports native access.

MFC after:	1 month
2019-02-02 04:15:16 +00:00
Justin Hibbits
15fba9d3be powerpc: Fix opaque irq data initialization
The powerpc_intr structure is not zero-initialized, so on an invariants
build would panic in the xics driver with an invalid pointer.  Also fix the
xics driver to share the private data setup code between xics_enable() and
xics_bind().

Reported by:	Leonardo Bianconi
2019-01-19 04:47:19 +00:00
Justin Hibbits
431d31e0bf powerpc/pseries: Cache the IPI vector to avoid the common static lookup
The IPI vector is static, and happens to be the most common interrupt by far
on some systems.  Rather than searching for the interrupt every time, cache
the index.

This appears to yield a small performance boost, of about 8% reduction in
buildworld times, on my POWER9 system, when paired with r342975.
2019-01-12 22:10:31 +00:00
Justin Hibbits
56505ec016 powerpc: Add opaque 'private data' to interrupt vectors
The XICS and XIVE need extra data beyond irq and vector.  Rather than
performing a separate search, it's better for the general interrupt facility
to hold a private pointer, since the search already must be done anyway at
that level.
2019-01-12 22:05:42 +00:00
Conrad Meyer
bba9cbe374 powerpc: Fix regression introduced in r342771
In r342771, I introduced a regression in Power by abusing the platform
smp_topo() method as a shortcut for providing the MI information needed for
the stated sysctls.  The smp_topo() method was already called later by
sched_ule (under the name cpu_topo()), and initializes a static array of
scheduler topology information.  I had skimmed the smp_topo_foo() functions
and assumed they were idempotent; empirically, they are not (or at least,
detect re-initialization and panic).

Do the cleaner thing I should have done in the first place and add a
platform method specifically for core- and thread-count probing.

Reported by:	luporl via jhibbits
Reviewed by:	luporl
X-MFC-With:	r342771
Differential Revision:	https://reviews.freebsd.org/D18777
2019-01-07 19:39:31 +00:00
Conrad Meyer
6b83069e05 Expose threads-per-core and physical core count information
With new sysctls (to the best of our ability do detect them).  Restructured
smp.4 slightly for clarity (keep relevant stuff closer to the top) while
documenting.

Reviewed by:	markj, jhibbits (ppc parts)
MFC after:	3 days
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D18322
2019-01-04 18:31:17 +00:00
Justin Hibbits
54b310b892 powerpc64/xics: Fix comment typo 2018-10-21 02:25:56 +00:00
Justin Hibbits
7e524b0746 powerpc/pseries: EOI interrupts in XICS by setting lowest priority
Discussing with Benjamin Herrenschmidt, OPAL_INT_GET_XIRR masks the
returned priority, so must be resumed before more interrupts can be
handled at this priority.  Since there are only two priorities used in
FreeBSD, we know that the previous priority in an EOI will always be
0xff (lowest priority).

Reviewed by:	nwhitehorn
Approved by:	re(rgrimes)
Differential Revision: https://reviews.freebsd.org/D17361
2018-10-06 18:51:49 +00:00
Justin Hibbits
5272c9bd07 Add a comment explaining the need of a global temporary variable
cpu_xirr is used only as a temporary location for the OPAL call in
PIC_DISPATCH().

Requested by:	nwhitehorn
2018-05-22 03:24:16 +00:00
Nathan Whitehorn
6cff19a3be Fix build with PSERIES but not POWERNV defined. 2018-05-20 18:26:09 +00:00
Justin Hibbits
ef6da5e5c7 Add support for the XIVE XICS emulation mode for POWER9 systems
Summary:
POWER9 systems use a new interrupt controller, XIVE, managed through OPAL
firmware calls.  The OPAL firmware includes support for emulating the previous
generation XICS presentation layer in addition to a new "XIVE Exploitation"
mode.  As a stopgap until we have XIVE exploitation mode, enable XICS emulation
mode so that we at least have an interrupt controller.

Since the CPPR is local to the current CPU, it cannot be updated for APs when
initializing on the BSP.  This adds a new function, directly called by the
powernv platform code, to initialize the CPPR on AP bringup.

Reviewed by:	nwhitehorn
Differential Revision: https://reviews.freebsd.org/D15492
2018-05-20 03:23:17 +00:00
Matt Macy
d7c5a620e2 ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
4.98   0.00   4.42   0.00 4235592     33   83.80 4720653 2149771   1235 247.32
4.73   0.00   4.20   0.00 4025260     33   82.99 4724900 2139833   1204 247.32
4.72   0.00   4.20   0.00 4035252     33   82.14 4719162 2132023   1264 247.32
4.71   0.00   4.21   0.00 4073206     33   83.68 4744973 2123317   1347 247.32
4.72   0.00   4.21   0.00 4061118     33   80.82 4713615 2188091   1490 247.32
4.72   0.00   4.21   0.00 4051675     33   85.29 4727399 2109011   1205 247.32
4.73   0.00   4.21   0.00 4039056     33   84.65 4724735 2102603   1053 247.32

After the patch

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
5.43   0.00   4.20   0.00 3313143     33   84.96 5434214 1900162   2656 245.51
5.43   0.00   4.20   0.00 3308527     33   85.24 5439695 1809382   2521 245.51
5.42   0.00   4.19   0.00 3316778     33   87.54 5416028 1805835   2256 245.51
5.42   0.00   4.19   0.00 3317673     33   90.44 5426044 1763056   2332 245.51
5.42   0.00   4.19   0.00 3314839     33   88.11 5435732 1792218   2499 245.52
5.44   0.00   4.19   0.00 3293228     33   91.84 5426301 1668597   2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15366
2018-05-18 20:13:34 +00:00
Justin Hibbits
04de51dbab No need to bzero splpar_vpa entries
splpar_vpa is in the BSS, so is already zeroed when the kernel starts up.

Tested by:	Leandro Lupori
2018-05-11 02:04:01 +00:00