Move dump_avail[] extern declaration and inlines into a new header
vm/vm_dumpset.h. This fixes default gcc build for mips.
Reviewed by: alc, scottph
Tested by: kevans (previous version)
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D26741
Push the root seed version to userspace through the VDSO page, if
the RANDOM_FENESTRASX algorithm is enabled. Otherwise, there is no
functional change. The mechanism can be disabled with
debug.fxrng_vdso_enable=0.
arc4random(3) obtains a pointer to the root seed version published by
the kernel in the shared page at allocation time. Like arc4random(9),
it maintains its own per-process copy of the seed version corresponding
to the root seed version at the time it last rekeyed. On read requests,
the process seed version is compared with the version published in the
shared page; if they do not match, arc4random(3) reseeds from the
kernel before providing generated output.
This change does not implement the FenestrasX concept of PCPU userspace
generators seeded from a per-process base generator. That change is
left for future discussion/work.
Reviewed by: kib (previous version)
Approved by: csprng (me -- only touching FXRNG here)
Differential Revision: https://reviews.freebsd.org/D22839
Now that config(8) has supported include for 19 years, transition to
including the NOTES files. include support didn't exist at the time,
nor did the envvar stuff recently added. Now that it does, eliminate
the building of LINT files by just including everything you need.
Note: This may cause conflicts with updating in some cases.
find sys -name LINT\* -rm
is suggested across this commit to remove the generated LINT
files.
Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D26540
Add support for sysctl 'machdep.uprintf_signal' that prints debugging
information on trap signal.
Reviewed by: jhibbits, luporl, bdragon
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D26004
It is unlikely, but possible, that an unrecognized or unsupported
relocation type is encountered while trying to load a kernel module. If
this occurs we should offer the symbol index as a hint to the user.
While here, fix some small style issues.
Reviewed by: markj, kib (amd64 part, in D26701)
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Similar to OPAL calls, switch to big endian to do calls to RTAS.
(Missed this one when I was doing the bulk commit of PowerPC64LE support.)
Sponsored by: Tag1 Consulting, Inc.
Due to enter_idle_powerx fabricating a MSR from scratch, it is necessary
for it to care about the endianness, so we don't accidentally switch
endian the first time we idle a thread.
Took about five seconds to spot after seeing an unmangled backtrace.
The hard bit was needing to temporarily set up a mutex to sort out the
logjam that happens when every thread simultaneously wakes up in the wrong
endian due to the panic IPI and panics, leaving what I can best describe as
"alphabet soup" on the console.
Luckily, I already had a patch sitting around to do that.
This brings POWER8 up to equivilence with POWER9 on PPC64LE.
Sponsored by: Tag1 Consulting, Inc.
OPAL unconditionally enters secondary CPUs with only HV and SF set.
I tried writing a secondary entry point instead, but OPAL rejected it
and I am unsure why, so I resorted to making the system reset interrupt
endian-flexible.
This means we take a slight performance hit on wakeup on LE, but it is
a good stopgap until we can figure out a reliable way to make OPAL enter
where we want it to.
It probably makes sense to have it around anyway, because I can imagine
scenarios where the cpu resets itself to BE and does a software reset.
Sponsored by: Tag1 Consulting, Inc.
More endian conversion.
* Install TCEs correctly (i.e. in big endian)
* Convert to big endian and back when setting up queue pages and IRQs.
Sponsored by: Tag1 Consulting, Inc.
Since OPAL runs in big endian, any data being passed back and forth
via memory instead of registers needs to be byteswapped.
From my notes during development:
"A good way to find candidates is to look for vtophys() in opal_call()
parameters. The memory being passed will be written into in BE."
Sponsored by: Tag1 Consulting, Inc.
It's much easier to implement this in an endian-independent way when we
don't also have to worry about masking half of the dword off.
Given that this code ran on a machine that ran a poudriere bulk with no
kernel oddities, I am relatively certain it is correctly implemented. ;)
This should be a minor performance boost on BE as well.
Sponsored by: Tag1 Consulting, Inc.
For a body of code that had its endian conversion bits written blind without
the ability to test, moea64 was VERY close to being correct.
There were only four instances where the existing code was getting it wrong.
Sponsored by: Tag1 Consulting, Inc.
This is slightly stripped down from GENERIC64, as PowerMac G5 machines
are incapable of running in LE mode (so we can skip the Mac drivers.)
While technically POWER6 and POWER7 have the hardware capability of running
in LE mode, they have a tendency to trap excessively when a load/store is
misaligned. (an extremely common occurrence in LE code, and one of the main
reasons I consider BE to be superior, as it turns potential security issues
into immediately obvious mangled numbers.)
Additionally, there was no mechanism to control what endian interrupts
are delivered in, so supporting LE operation on POWER6 and POWER7 involves
some really dirty tricks in the interrupt vectors that I would rather
avoid.
IBM drew the line in the sand at POWER8 some time around 2013, embracing
full support for LE in the platform, and making a push across the board
for LE code to target POWER8 as a minimum requirement. As such, usage of
LE kernels on POWER6 and POWER7 is practically nil, despite it being
technically possible to do.
The so-called "TRUELE" feature bit which is the baseline requirement for
needed for PowerPC64LE was introduced in POWER8.
Sponsored by: Tag1 Consulting, Inc.
When running without a hypervisor, we need to set the ILE bit in the LPCR
ourselves.
For the boot processor, handle it in powernv_attach() like we do for other
LPCR bits.
No change for the APs, as they will use the lpcr global to set up their own
LPCR when they do their own cpudep_ap_early_bootstrap() and pick up this
automatically.
Sponsored by: Tag1 Consulting, Inc.
OPAL runs in big endian, so we need to rfid into it to switch endian
atomically when branching to it, and we need to do the
RETURN_TO_NATIVE_ENDIAN dance when it returns to us.
Sponsored by: Tag1 Consulting, Inc.
Unlike virtio, which in legacy mode is guest endian, the hypervisor vscsi
interface operates in big endian, so we must convert back and forth in several
places.
These changes are enough to attach a rootdisk.
Sponsored by: Tag1 Consulting, Inc.
The TCG implementation of mtmsrd in qemu blindly copies the entire register
to the MSR, instead of the specific bit positions listed in the ISA.
This means that qemu will prematurely switch endian out from under the
running code instead of waiting for the rfid, causing an immediate trap
as it attempts to interpret the next instruction in the wrong endianness.
To work around this, ensure PSL_LE is still set before doing the mtmsrd.
In the future, we may wish to just turn off translation and unconditionally
use rfid to switch to the ofmsr instead of quasi-switching to the ofmsr.
Add a new platform option so this can be disabled. (And so that we can
conditonalize additional QEMU-specific hacks in the platform code.)
Sponsored by: Tag1 Consulting, Inc.
Since we will need to be able to take traps relatively early in the process,
ensure that the hypervisor changes our ILE for us as soon as we are ready.
Sponsored by: Tag1 Consulting, Inc.
Since OFW always runs in big endian in practice, we need to convert several
bits back and forth.
This is necessary to communicate with SLOF on LE pseries.
Sponsored by: Tag1 Consulting, Inc.
This is the initial set up for PowerPC64LE.
The current plan is for this arch to remain experimental for FreeBSD 13.
This started as a weekend learning project for me and kinda snowballed from
there.
(More to follow momentarily.)
Reviewed by: imp (earlier version), emaste
Sponsored by: Tag1 Consulting, Inc.
Differential Revision: https://reviews.freebsd.org/D26399
On Ampere Altra systems, the sparse population of RAM within the
physical address space causes the vm_page_dump bitmap to be much
larger than necessary, increasing the size from ~8 Mib to > 2 Gib
(and overflowing `int` for the size).
Changing the page dump bitmap also changes the minidump file
format, so changes are also necessary in libkvm.
Reviewed by: jhb
Approved by: scottl (implicit)
MFC after: 1 week
Sponsored by: Ampere Computing, Inc.
Differential Revision: https://reviews.freebsd.org/D26131
These definitions were repeated by all architectures, with small
variations. Consolidate the common definitons in machine
independent code and use bitset(9) macros for manipulation. Many
opportunities for deduplication remain in the machine dependent
minidump logic. The only intended functional change is increasing
the bit index type to vm_pindex_t, allowing the indexing of pages
with address of 8 TiB and greater.
Reviewed by: kib, markj
Approved by: scottl (implicit)
MFC after: 1 week
Sponsored by: Ampere Computing, Inc.
Differential Revision: https://reviews.freebsd.org/D26129
One problem with the bus_space_read_N() and bus_space_write_N() family of
functions is that they provide no protection against exceptions which can
occur when no physical hardware or device responds to the read or write
cycles. In such a situation, the system typically would panic due to a
kernel-mode bus error. The bus_space_peek_N() and bus_space_poke_N() family
of functions provide a mechanism to handle these exceptions gracefully
without the risk of crashing the system.
Typical example is access to PCI(e) configuration space in bus enumeration
function on badly implemented PCI(e) root complexes (RK3399 or Neoverse
N1 N1SDP and/or access to PCI(e) register when device is in deep sleep state.
This commit adds a real implementation for arm64 only. The remaining
architectures have bus_space_peek()/bus_space_poke() emulated by using
bus_space_read()/bus_space_write() (without exception handling).
MFC after: 1 month
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D25371
Due to a check that should have been an endian check being an #if 0,
the wrong checksum mask table was being used on LE, which was causing
extreme strangeness in DNS resolution -- *some* hosts would be resolvable,
but most would not.
This fixes DNS resolution.
(I am committing some parts of the LE patchset ahead of time to reduce the
amount of work I have to do while committing the main patchset.)
Sponsored by: Tag1 Consulting, Inc.
On ibm,extended-clock-frequency, ensure we be64toh() the value.
On clock-frequency, remove the right-shifting hack (which was needed due to
reading a 32 bit value into a 64 bit variable) and switch to OF_getencprop()
for reading (which will handle endian conversion internally.)
Reviewed by: jhibbits (in irc)
Sponsored by: Tag1 Consulting, Inc.
that can be extended, but also ensure compile-time type checking. Refactor
common code out of arch-specific implementations. Move the mpr and mps
drivers to this new API. The template type remains visible to the consumer
so that it can be allocated on the stack, but should be considered opaque.
To make it easier to work with this in the future, convert to c99
designated initializer syntax.
Tested on powerpc, powerpc64, and powerpc64le. No functional change.
Sponsored by: Tag1 Consulting, Inc.
The intention of the bus_be naming was for those to be the no-endian-swapping
and for the bus_le to be endian-swapping in all the functions.
This naming breaks down when we're actually are running in LE and need to
use the opposite sense.
As such, rename bs_be_* to native_bs_* and rename bs_le_* to swapped_bs_*.
No functional change.
Sponsored by: Tag1 Consulting, Inc.
Swap the BE and LE bus_space tags when on LE, and adjust the nexus tag
to match.
This is prep for a a followup that makes the powerpc bus_space macros easier
to maintain in the future.
Sponsored by: Tag1 Consulting, Inc.
Implement pmap_mincore() for moea64.
This will need some slight tweaks when large page support in HPT lands.
Submitted by: Fernando Eckhardt Valle <fernando.valle@eldorado.org.br>
Reviewed by: bdragon
Differential Revision: https://reviews.freebsd.org/D26314
* Add LOAD_LR_NIA define. This is preferred to "bl 1f; 1:" because it
doesn't pollute the branch predictor.
* Add magic sequence to return the CPU to the correct endianness after
jumping to cross-endian code, similar to the sequence from Linux.
Sponsored by: Tag1 Consulting, Inc.
There were multiple bugs in the OPAL RTC code which had never been
discovered, as the default configuration of OPAL machines is to
have the BMC / FSP control the RTC.
* Fix calling convention for setting the time -- the variables are passed
directly in CPU registers, not via memory.
* Fix bug in the bcd encoding routines. (from jhibbits)
Tested on POWER9 Talos II (BE) and POWER9 Blackbird (LE).
Reviewed by: jhibbits (in irc)
Sponsored by: Tag1 Consulting, Inc.
When emulating a single thread system for testing reasons, mp_maxid can
be 0. This trips up our math for calculating the order.
Account for this to fix xive attachment when emulating a single-thread
core on qemu powernv (a configuration that doesn't exist in the real world.)
Sponsored by: Tag1 Consulting, Inc.
When doing a build for a modern CPUTYPE, llvm will throw errors if obsolete
instructions are used, even if they will never run due to runtime checks.
Hiding the dssall instruction from the assembler fixes kernel build when
overriding CPUTYPE, without having any effect on the generated binary.
This has been in my local tree for over a year and is well tested across
a variety of machines.
Sponsored by: Tag1 Consulting, Inc.
When enabling an interrupt, assert that we do in fact have a root PIC.
This would have saved me some debugging effort.
Sponsored by: Tag1 Consulting, Inc.
Implement the remaining pieces needed to allow userland timestamp reading.
Rewritten based on an intial essay into the problem by Justin Hibbits.
(Copyright changed to my own on his request.)
Tested on ppc64 (POWER9 Talos II), powerpcspe (e500v2 RB800), and
powerpc (g4 PowerBook).
Reviewed by: jhibbits (in irc)
Sponsored by: Tag1 Consulting, Inc.
Differential Revision: https://reviews.freebsd.org/D26347