Summary:
Consolidate the NUMA associativity handling into a platform function.
Non-NUMA platforms will just fall back to the default (0). Currently
only implemented for powernv, which uses a lookup table to map the
device tree associativity into a system NUMA domain.
Fixes hangs on powernv after r356534, and corrects a fairly longstanding
bug in powernv's NUMA handling, which ended up using domains 1 and 2 for
devices and memory on power9, while CPUs were bound to domains 0 and 1.
Reviewed by: bdragon, luporl
Differential Revision: https://reviews.freebsd.org/D23220
Due to a bug in clang 9.0.0 source tracking, the trap vector copying will
always trigger a fortify-source warning.
The destination buffers are 0x2f00 bytes, and the bcopy region is 0x2e00
bytes, so there is not an overflow here.
(I have been running with this patch since September.)
On POWER8 systems with only one memory domain, the "ibm,associativity"
number that corresponds to it is 0, unlike POWER9 systems with two
or more domains, in which the minimum value is 1.
In POWER8 case, subtracting 1 causes an underflow on the unsigned domain
variable and a subsequent index out-of-bounds access.
Reviewed by: jhibbits
Tested by: bdragon, luporl
This allows replacing "sys/eventfilter.h" includes with "sys/_eventfilter.h"
in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces header
pollution substantially.
EVENTHANDLER_DECLARE and EVENTHANDLER_LIST_DECLAREs were moved out of .c
files into appropriate headers (e.g., sys/proc.h, powernv/opal.h).
As a side effect of reduced header pollution, many .c files and headers no
longer contain needed definitions. The remainder of the patch addresses
adding appropriate includes to fix those files.
LOCK_DEBUG and LOCK_FILE_LINE_ARG are moved to sys/_lock.h, as required by
sys/mutex.h since r326106 (but silently protected by header pollution prior
to this change).
No functional change (intended). Of course, any out of tree modules that
relied on header pollution for sys/eventhandler.h, sys/lock.h, or
sys/mutex.h inclusion need to be fixed. __FreeBSD_version has been bumped.
Summary:
Initial NUMA support:
- associate CPU with domain
- associate memory ranges with domain
- identify domain for devices
- limit device interrupt binding to appropriate domain
- Additionally fixes a bug in the setting of Maxmem which led to
only memory attached to the first socket being enabled for DMA
A pmap variant can opt in to numa support by by calling `numa_mem_regions`
at the end of pmap_bootstrap - registering the corresponding ranges with the
VM.
This yields a ~20% improvement in build times of llvm on dual socket POWER9
over non-NUMA.
Original patch by mmacy.
Differential Revision: https://reviews.freebsd.org/D17933
The PHB4 host bridge used by the POWER9 uses a 64kB range in 32-bit
space at the address 0xffff0000-0xffffffff. Reserve this range so that
DMA memory cannot be allocated within this range. This fixes seemingly
random crashes on a POWER9 system. Ideally this range will have been
reserved by the firmware, but as of now this is not the case.
Submitted by: git_bdragon.rtk0.net
Reviewed by: nwhitehorn
Approved by: re(kib)
Differential Revision: https://reviews.freebsd.org/D17183
Changed excise_initrd_region to support both 32- and 64-bit
values for linux,initrd-start and linux,initrd-end.
This fixes the boot problem on some machines after rS334485.
Submitted by: Luis Pires <lffpires@ruabrasil.org>
Reviewed by: jhibbits, leitao
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D15667
Currently kexec loads an initrd file into the main memory but does not
mark that region as reserved, thus the area is not protected.
If any initrd/md file is loaded from kexec/petitboot, the region might become
corarupted/overwritten since FreeBSD does not know the region is 'reserved'.
This patch simply adds the initrd area as a reserved memory region.
Approved by: jhibbits
Differential Revision: https://reviews.freebsd.org/D15610
On some POWER9 systems, 'reg' denotes the full memory in the system, while
'linux,usable-memory' denotes the usable memory. Some memory is reserved for
NVLink usage, so is partitioned off.
Submitted by: Breno Leitao
Discussing with others, this needs to be at least 20 to boot on some POWER9
nodes. Linux made a similar change for the same reason, so increase to 32
to give us some extra breathing room as well. The input and output arrays
are sized at 256, so much greater than the increase in the property array
size.
accomplishes a few things:
- Makes NULL an invalid address in the kernel, which is useful for catching
bugs.
- Lays groundwork for radix-tree translation on POWER9, which requires the
direct map be at high memory.
- Similarly lays groundwork for a direct map on 64-bit Book-E.
The new base address is chosen as the base of the fourth radix quadrant
(the minimum kernel address in this translation mode) and because all
supported CPUs ignore at least the first two bits of addresses in real
mode, allowing direct-map addresses to be used in real-mode handlers.
This is required by Linux and is part of the architecture standard
starting in POWER ISA 3, so can be relied upon.
Reviewed by: jhibbits, Breno Leitao
Differential Revision: D14499
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
There's no need to special case 32-bit AIM to short circuit processing.
Some AIM CPUs can handle 36 bit addresses, and 64-bit CPUs can run 32-bit
OSes, so this will allow us to expand for that in the future if we desire.
The MFC will include a compat definition of smp_no_rendevous_barrier()
that calls smp_no_rendezvous_barrier().
Reviewed by: gnn, kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D10313
===
From Nathan Whitehorn:
Open Firmware runs in virtual mode on the Powermac G5. This runs inside the
kernel page table, which preserves all address translations made by OF before
the kernel starts; as a result, the kernel address space is a strict superset of
OF's.
Where this explodes is if OF uses an unmapped SLB entry. The SLB fault handler
runs in real mode and refers to the PCPU pointer in SPRG0, which blows up the
kernel. Having a value of SPRG0 that works for the kernel is less fatal than
preserving OF's value in this case.
===
The result of this is seemingly random panics from NULL dereferences, or hangs
immediately upon boot. By not restoring SPRG0 for Open Firmware entry the
kernel PCPU pointer is preserved and SLB faults are successful, resulting in a
stable kernel.
PR: 205458
Reported by: several (over bugzilla, lists, IRC)
Reviewed by: andreast
Tested by: many (various forms)
MFC after: 2 weeks
Summary:
If the environment variable is set, U-boot adds a 'bootargs' property to
/chosen. This is already handled by ARM and MIPS, but should be handled in a
central location. For now, ofw_subr.c is a good place until we determine if it
should be moved to init_main.c, or somewhere more central to all architectures.
Eventually arm and mips should be modified to use ofw_parse_bootargs() as well,
rather than using the duplicate code already.
Reviewed By: adrian
Differential Revision: https://reviews.freebsd.org/D7846
will allow for code that uses the old fdt_get_range and fdt_regsize
functions to find a range, map it, access, then unmap to replace this, up
to and including the map, with a call to OF_decode_addr.
As this function should only be used in the early boot code the unmap is
mostly do document we no longer need the mapping as it's a no-op, at least
on arm.
Reviewed by: jhibbits
Sponsored by: ABT Systems Ltd
Differential Revision: https://reviews.freebsd.org/D5258
Summary:
With some additional changes for AIM, that could also support much
larger physmem sizes. Given that 32-bit AIM is more or less obsolete, though,
it's not worth it at this time.
Differential Revision: https://reviews.freebsd.org/D4345
into a new function that other platforms can share.
This creates a new ofw_reg_to_paddr() function (in a new ofw_subr.c file)
that contains most of the existing ppc implementation, mostly unchanged.
The ppc code now calls the new MI code from the MD code, then creates a
ppc-specific bus_space mapping from the results. The new arm implementation
does the same in an arm-specific way.
This also moves the declaration of OF_decode_addr() from ofw_machdep.h to
openfirm.h, except on sparc64 which uses a different function signature.
This will help all FDT platforms to set up early console access using
OF_decode_addr().
OF_getprop() to get encode-int encoded values from the OF tree. This is
a no-op at present, since all existing PowerPC ports are big-endian, but
it is a correctness improvement and will be required if we have a
little-endian kernel at some future point.
Where it is totally impossible for the code ever to be used on a
little-endian system (much of powerpc/powermac, for instance), I have not
necessarily made the appropriate changes.
MFC after: 1 month
FDT_DTB_STATIC is defined in opt_platform.h, and fdt_static_dtb is in
fdt_common.h, so include those files.
Sponsored by: Alex Perez/Inertial Computing
in ofw_mem_regions(). This function is actually MI and should move to
dev/ofw at some point in the near future so that ARM and MIPS can use the
same code.
the Open Firmware, as provided by petitboot, for example. Note that this is
not quite complete, since RTAS instantiation still depends on callable
firmware.
MFC after: 2 weeks
Open Firmware-centric:
- Keep the static list of regions in platform.c instead of ofw_machdep.c
- Move various merging and sorting operations to platform.c as well
- Move apple_hacks code out of ofw_machdep.c and into platform_powermac.c,
where it belongs
- Move CHRP-specific dynamic-reconfiguration memory parsing into
platform_chrp.c instead of pretending it is shared code
It turned out that on pSeries machines the call into OF modified the trap
vectors and this made further behaviour unpredictable.
With this commit I'm now able to boot multi user on a network booted
environment on my IntelliStation 285. This is a POWER5+ machine.
Discussed with: nwhitehorn
MFC after: 1 week
assumed that the MDIO bus was a direct child of the Ethernet interface. It
may not be and indeed on many device trees is not. While here, add proper
locking for MII transactions, which may be on a bus shared by several MACs.
Hardware donated by: Benjamin Perrault
platform modules. Whether to call this function or not is highly machine
dependent: on some systems, it is required, while on others it breaks
everything. Platform modules are in a better position to figure this
out. This is required for POWER hypervisor SCSI to work correctly. There
are no functional changes on Powermac systems.
Approved by: re (kib)
property such as those found on some real and emulated IBM systems. The
approach, which is taken from Linux, is to scan through the PCI bars
until we find one large enough to contain the linear framebuffer and
which is ideally prefetchable if no "address" property can be found.
This makes the graphical console work with the pSeries target in QEMU.
Approved by: re (delphij)