OPAL unconditionally enters secondary CPUs with only HV and SF set.
I tried writing a secondary entry point instead, but OPAL rejected it
and I am unsure why, so I resorted to making the system reset interrupt
endian-flexible.
This means we take a slight performance hit on wakeup on LE, but it is
a good stopgap until we can figure out a reliable way to make OPAL enter
where we want it to.
It probably makes sense to have it around anyway, because I can imagine
scenarios where the cpu resets itself to BE and does a software reset.
Sponsored by: Tag1 Consulting, Inc.
More endian conversion.
* Install TCEs correctly (i.e. in big endian)
* Convert to big endian and back when setting up queue pages and IRQs.
Sponsored by: Tag1 Consulting, Inc.
Since OPAL runs in big endian, any data being passed back and forth
via memory instead of registers needs to be byteswapped.
From my notes during development:
"A good way to find candidates is to look for vtophys() in opal_call()
parameters. The memory being passed will be written into in BE."
Sponsored by: Tag1 Consulting, Inc.
It's much easier to implement this in an endian-independent way when we
don't also have to worry about masking half of the dword off.
Given that this code ran on a machine that ran a poudriere bulk with no
kernel oddities, I am relatively certain it is correctly implemented. ;)
This should be a minor performance boost on BE as well.
Sponsored by: Tag1 Consulting, Inc.
For a body of code that had its endian conversion bits written blind without
the ability to test, moea64 was VERY close to being correct.
There were only four instances where the existing code was getting it wrong.
Sponsored by: Tag1 Consulting, Inc.
This was originally part of the initial commit, but after discussion in
D26399, I split it out into its own commit after the kernel config file.
Sponsored by: Tag1 Consulting, Inc.
This is slightly stripped down from GENERIC64, as PowerMac G5 machines
are incapable of running in LE mode (so we can skip the Mac drivers.)
While technically POWER6 and POWER7 have the hardware capability of running
in LE mode, they have a tendency to trap excessively when a load/store is
misaligned. (an extremely common occurrence in LE code, and one of the main
reasons I consider BE to be superior, as it turns potential security issues
into immediately obvious mangled numbers.)
Additionally, there was no mechanism to control what endian interrupts
are delivered in, so supporting LE operation on POWER6 and POWER7 involves
some really dirty tricks in the interrupt vectors that I would rather
avoid.
IBM drew the line in the sand at POWER8 some time around 2013, embracing
full support for LE in the platform, and making a push across the board
for LE code to target POWER8 as a minimum requirement. As such, usage of
LE kernels on POWER6 and POWER7 is practically nil, despite it being
technically possible to do.
The so-called "TRUELE" feature bit which is the baseline requirement for
needed for PowerPC64LE was introduced in POWER8.
Sponsored by: Tag1 Consulting, Inc.
There was a small window cp was broken. Work around this by using :>
instead of cp /dev/null. Ideally, we'd keep the cp /dev/null in the
build as a regression test, but doing so breaks people that upgraded
during the cp breakage and this is simpler than bootstrapping a
working cp since there's no good __FreeBSD_version sign posts for
that.
Suggested by: lots of people
Too stubborn for his own good: imp
When running without a hypervisor, we need to set the ILE bit in the LPCR
ourselves.
For the boot processor, handle it in powernv_attach() like we do for other
LPCR bits.
No change for the APs, as they will use the lpcr global to set up their own
LPCR when they do their own cpudep_ap_early_bootstrap() and pick up this
automatically.
Sponsored by: Tag1 Consulting, Inc.
OPAL runs in big endian, so we need to rfid into it to switch endian
atomically when branching to it, and we need to do the
RETURN_TO_NATIVE_ENDIAN dance when it returns to us.
Sponsored by: Tag1 Consulting, Inc.
Given that we have converted to ELFv2 for BE already, endianness is the only
difference between the two ARCHs.
As such, there is no need to differentiate LIBC_ARCH between the two.
Combining them like this lets us avoid needing to have two copies of several
bits for no good reason.
Sponsored by: Tag1 Consulting, Inc.
Unlike virtio, which in legacy mode is guest endian, the hypervisor vscsi
interface operates in big endian, so we must convert back and forth in several
places.
These changes are enough to attach a rootdisk.
Sponsored by: Tag1 Consulting, Inc.
The TCG implementation of mtmsrd in qemu blindly copies the entire register
to the MSR, instead of the specific bit positions listed in the ISA.
This means that qemu will prematurely switch endian out from under the
running code instead of waiting for the rfid, causing an immediate trap
as it attempts to interpret the next instruction in the wrong endianness.
To work around this, ensure PSL_LE is still set before doing the mtmsrd.
In the future, we may wish to just turn off translation and unconditionally
use rfid to switch to the ofmsr instead of quasi-switching to the ofmsr.
Add a new platform option so this can be disabled. (And so that we can
conditonalize additional QEMU-specific hacks in the platform code.)
Sponsored by: Tag1 Consulting, Inc.
Since we will need to be able to take traps relatively early in the process,
ensure that the hypervisor changes our ILE for us as soon as we are ready.
Sponsored by: Tag1 Consulting, Inc.
Since OFW always runs in big endian in practice, we need to convert several
bits back and forth.
This is necessary to communicate with SLOF on LE pseries.
Sponsored by: Tag1 Consulting, Inc.
This is the initial set up for PowerPC64LE.
The current plan is for this arch to remain experimental for FreeBSD 13.
This started as a weekend learning project for me and kinda snowballed from
there.
(More to follow momentarily.)
Reviewed by: imp (earlier version), emaste
Sponsored by: Tag1 Consulting, Inc.
Differential Revision: https://reviews.freebsd.org/D26399
I had overthought how to do the FICL_TRUE change. We do not need to
explicitly specify how big the 0 is before the cast to the correct size.
The same change was suggested by both imp@ and Gunther Nikl independently.
Tested on powerpc.
Reported by: imp, Gunther Nikl
Document the calls to send messages to userland via devctl.
devctl_notify will create a message for the specified system,
subsystem and type, optionally adding additional information.
Reviewed by: bcr
Differential Revision: https://reviews.freebsd.org/D26520
Belatedly document the quoting requirements for the devctl protocol. I
thought they'd been previously documented.
Also, while I'm here, make igor happy.
Reviewed by: bcr
Differential Revision: https://reviews.freebsd.org/D26520
This routine centralizes the knowledge needed for properly quoting
'value' in all key="value" items that appear in devctl messages.
Reviewed by: bcr
Differential Revision: https://reviews.freebsd.org/D26520
It is like O_BENEATH, but disables to walk out of the subtree rooted
in the starting directory. O_BENEATH does not care if path walks out
if it returned.
Requested by: Dan Gohman <dev@sunfishcode.online>
PR: 248335
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D25886
Do not care if path walks out of the topping directory if it returns back.
Requested and reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D25886
not unconditionally on any dotdot component.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D25886
the helper to calculate namei flags both for open(2) and creat(2).
Suggested and reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D25886
the helper to convert AT_ flags for *at() syscalls to namei flags.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D25886
Stop abusing internal namei flag NI_LCF_STRICTRELATIVE as indicator of
cap-restricted lookup. Add designated returned flag NIRES_STRICTREL
to inform kern_openat() that lookup was restricted.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D25886
This makes open("/a/../a", O_BENEATH) with cwd == "/a" work.
Reviewed by: markj
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D25886
Explain why tracker is needed at all.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D25886
Otherwise a corrupted file entry containing invalid extended attribute
lengths or allocation descriptor lengths can trigger an overflow when
the file entry is loaded.
admbug: 965
PR: 248613
Reported by: C Turt <ecturt@gmail.com>
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
In the spirit of the GENERIC config, we should include the drivers required to
run on most supported platforms.
Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D26501
This allows the PF interfaces to communicate with the VF interfaces over
the internal switch in the ASIC. Fix the GL limits for VM work requests
while here.
MFC after: 3 days
Sponsored by: Chelsio Communications
Extend the powerpc relative relocation handling from r240782 to a
handful of other architectures. This is needed to properly read
dependency information from kernel modules.
Reviewed by: jhb
Approved by: scottl (implicit)
MFC after: 1 week
Sponsored by: Ampere Computing, Inc.
Differential Revision: https://reviews.freebsd.org/D26365