e4b8deb22227 removed the last in-tree uses of PCPU_INC(). Its
potential benefit is also practically nonexistent. Non-x86
platforms already implement it as PCPU_ADD(..., 1), and according
to [0] there are no recent x86 processors for which the 'inc'
instruction provides a performance benefit over the equivalent
memory-operand form of the 'add' instruction. The only remaining
benefit of 'inc' is smaller instruction size, which in this case
is inconsequential given the limited number of per-CPU data consumers.
[0]: https://www.agner.org/optimize/instruction_tables.pdf
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D29308
In r361544, the pmap drivers were converted to ifuncs. When doing so,
this changed the call type of pmap functions to be called via the
secure-plt stubs.
These stubs depend on the TOC base being loaded to r30 to run properly.
On SMP AIM (i.e. a dual processor G4 or running 32-bit on G5), since the
APs were being started up from the reset vector instead of going
through __start, they had never had r30 initialized properly, so when the
cpu_reset code in trap_subr32.S attempted to branch to
pmap_cpu_bootstrap(), it was loading the target from the wrong location.
Ensure r30 is set up directly in the cpu_reset trap code, so we can make
PLT calls as normal.
Fixes boot on my SMP G4.
Reviewed by: jhibbits
MFC after: 3 days
Sponsored by: Tag1 Consulting, Inc.
In uart_phyp_get(), when the internal buffer is empty, we make a
hypercall to retrieve up to 16 bytes of input data from the
hypervisor. As this is specified to be returned in BE format, we need
to do a 64-bit byte swap on the first and second half of the data.
If the buffer being passed in was insufficient to return the fetched
data, we store the remainder in the internal buffer and use it to
satisfy the following calls to uart_phyp_get() until it is drained.
However, in this case, we were accidentally byteswapping the internal
buffer again.
Move the byteswapping code to just after the hypercall so it only gets
swapped when we're filling the buffer.
Fixes arrow keys in qemu on pseries, among other console oddities.
Sponsored by: Tag1 Consulting, Inc.
MFC after: 3 days
This macro returns true if a provided virtual address is contained
in the kernel's clean submap.
In CHERI kernels, the buffer cache and transient I/O map are allocated
as separate regions. Abstracting this check reduces the diff relative
to FreeBSD. It is perhaps slightly more readable as well.
Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D28710
In 78599c32efed3247d165302a1fbe8d9203e38974, CFI endproc decoration was
added to locore64.S. However, it missed the subtle detail that
__restartkernel_virtual() falls through to __restartkernel(). This was
causing boot failure on PowerMac G5, as it tried to execute the
epilogue as code.
Fix this by branching to __restartkernel() instead of intentionally
running off the end of the function.
While here, add some additional notes on how the virtual mode restart
works.
MFC after: 3 days
lang/rust needs COMPAT_FREEBSD11 to build, even though powerpc64le itself is supported only since 13.0.
I also corrected a comment, because if we ever have lib32 for powerpc64le, it will be for powerpcle.
Reviewed by: bdragon (on IRC)
Submitted by: Andre Fernando da Silva <andre.silva@eldorado.org.br>
Reviewed by: luporl, alfredo, kadesai (on email)
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D26531
Summary:
The values returned by this function are reported to the gdb client as
the reason for the break in execution, a signal value such as SIGTRAP,
SIGEMT, or SIGSEGV. As such, exact vector numbers can be misidentified.
Return SIGEMT in the default case instead.
Reviewed by: alfredo
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D28046
Support for powerpc64le appeared in 13, so there's no point to enable COMPAT_* for older releases.
Also disable COMPAT_FREEBSD32, since there's no powerpcle. Since that may change in the future, leave the option commented out.
Approved by: bdragon, jhibbits (on IRC)
This is the superset of the nooptions found in the -DEBUG kernels.
Reviewed by: emaste, manu
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D28152
This does an import of quirk stubs, debugging macros from USB code and
numerous usage constants used by dependent drivers.
Besides, this change renames some functions to get a better matching
with userland library and NetBSD/OpenBSD HID code. Namely:
- Old hid_report_size() renamed to hid_report_size_max()
- New hid_report_size() calculates size of given report rather than
maximum size of all reports.
- hid_get_data_unsigned() renamed to hid_get_udata()
- hid_put_data_unsigned() renamed to hid_put_udata()
Compat shim functions are provided in usbhid.h to make possible compile
of legacy code unmodified after this change.
Reviewed by: manu, hselasky
Differential revision: https://reviews.freebsd.org/D27887
It will be used by the upcoming HID-over-i2C implementation. Should be
no-op, except hid.ko module dependency is to be added to affected drivers.
Reviewed by: hselasky, manu
Differential revision: https://reviews.freebsd.org/D27867
It's possible for a context switch, and CPU migration, to occur between
fetching the PCPU context and extracting the pc_curpcb. This can cause
the fault handler to be installed for the wrong thread, leading to a
panic in copyin()/copyout(). Since curthread is already in %r13, just
use that directly, as GPRs are migrated, so there is no migration race
risk.
Currently copyinstr() uses fubyte() to read each byte from userspace.
However, this means that for each byte, it calls pmap_map_user_ptr() to
map the string into memory. This is needlessly wasteful, since the
string will rarely ever cross a segment boundary. Instead, map a
segment at a time, and copy as much from that segment as possible at a
time.
Measured with the HPT pmap on powerpc64, this saves roughly 8% time on
buildkernel, and 5% on buildworld, in wallclock time.
- fix values returned by 'sysctls dev.opal_sensor.*.sensor'
- fix missing 'dev.opal_sensor.*.sensor_[max|min]' on sysctl
Reviewed-by: jhibbits
Sponsored-by: Eldorado Research Institute (eldorado.org.br)
Differential-Revision: https://reviews.freebsd.org/D27365
Ability to load-balance traffic over multiple path is a must-have thing for routers.
It may be used by the servers to balance outgoing traffic over multiple default gateways.
The previous implementation, RADIX_MPATH stayed in the shadow for too long.
It was not well maintained, which lead us to a vicious circle - people were using
non-contiguous mask or firewalls to achieve similar goals. As a result, some routing
daemons implementation still don't have multipath support enabled for FreeBSD.
Turning on ROUTE_MPATH by default would fix it. It will allow to reduce networking
feature gap to other operating systems. Linux and OpenBSD enabled similar support
at least 5 years ago.
ROUTE_MPATH does not consume memory unless actually used. It enables around ~1k LOC.
It does not bring any behaviour changes for userland.
Additionally, feature is (temporarily) turned off by the net.route.multipath sysctl
defaulting to 0.
Differential Revision: https://reviews.freebsd.org/D27428
Follow-up to r353959 and r368070: do the same for other architectures.
arm32 already seems to use its own .fnstart/.fnend directives, which
appear to be ARM-specific variants of the same thing. Likewise, MIPS
uses .frame directives.
Reviewed by: arichardson
Differential Revision: https://reviews.freebsd.org/D27387
In the PCB struct, we need to match the VSX register file layout
correctly, as the VSRs shadow the FPRs.
In LE, we need to have a dword of padding before the fprs so they end up
on the correct side, as the struct may be manipulated by either the FP
routines or the VSX routines.
Additionally, when saving and restoring fprs, we need to explicitly target
the fpr union member so it gets offset correctly on LE.
Fixes weirdness with FP registers in VSX-using programs (A FPR that was
saved by the FP routines but restored by the VSX routines was becoming 0
due to being loaded to the wrong side of the VSR.)
Original patch by jhibbits.
Reviewed by: jhibbits
Differential Revision: https://reviews.freebsd.org/D27431
Replace MAXPHYS by runtime variable maxphys. It is initialized from
MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys.
Make b_pages[] array in struct buf flexible. Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allow several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*). Overall, we save significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to desired high value.
Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except a place which initialize maxphys. Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight. Some drivers, which use MAXPHYS to size embeded structures,
get private MAXPHYS-like constant; their convertion is out of scope
for this work.
Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis, where either submitted by, or based on changes by mav.
Suggested by: mav (*)
Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions)
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D27225
It can useful for code outside the VM system to look up the NUMA domain
of a page backing a virtual or physical address, specifically when
creating NUMA-aware data structures. We have _vm_phys_domain() for
this, but the leading underscore implies that it's an internal function,
and vm_phys.h has dependencies on a number of other headers.
Rename vm_phys_domain() to vm_page_domain(), and _vm_phys_domain() to
vm_phys_domain(). Make the latter an inline function.
Add _vm_phys.h and define struct vm_phys_seg there so that it's easier
to use in other headers. Include it from vm_page.h so that
vm_page_domain() can be defined there.
Include machine/vmparam.h from _vm_phys.h since it depends directly on
some constants defined there.
Reviewed by: alc
Reviewed by: dougm, kib (earlier versions)
Differential Revision: https://reviews.freebsd.org/D27207
r367416 should have called save_fpu() before kern_sigprocmask to avoid
race condition
Thanks jhibbits and bdragon for pointing it out
Reviewed by: jhibbits
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D27241
After r367417, both mmu_oea64 and mmu_radix were defining the vm.pmap
sysctl node, resulting in the later definition hiding the properties of
the previous one. Avoid this issue by defining vm.pmap in a common
source file and declaring it where needed.
This change also standardizes the tunable name used to enable superpages
and change its default to disabled on radix MMU, because it still has some
issues with superpages.
Reviewed by: bdragon, jhibbits
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D27156
There were many, many endianness fixes needed for Radix MMU. The Radix
pagetable is stored in BE (as it is read and written to by the MMU hw),
so we need to convert back and forth every time we interact with it when
running in LE.
With these changes, I can successfully boot with radix enabled on POWER9 hw.
Reviewed by: luporl, jhibbits
Sponsored by: Tag1 Consulting, Inc.
Differential Revision: https://reviews.freebsd.org/D27181
The HPT is always stored in big-endian, as it is accessed directly by the
hardware as well as the kernel. As such, it is necessary to convert values
to and from native endian when running on LE.
Some unconverted accesses snuck in accidentally with r367417.
Apply the appropriate conversions to fix boot hanging on powerpc64le.
Sponsored by: Tag1 Consulting, Inc.
This brings its 'struct syscall_args' in sync with other architectures.
Reviewed by: bdragon, jhibbits
MFC after: 2 weeks
Sponsored by: EPSRC
Differential Revision: https://reviews.freebsd.org/D26605
Fix build errors introduced by r367417 and r367390:
- Guard label reached only by powerpc64
- Guard vm_reserv_level_iffullpop call, that is not defined on powerpc
variants that don't support superpages
- Add missing hwpmc file, for when hwpmc is built into kernel
This change adds support for transparent superpages for PowerPC64
systems using Hashed Page Tables (HPT). All pmap operations are
supported.
The changes were inspired by RISC-V implementation of superpages,
by @markj (r344106), but heavily adapted to fit PPC64 HPT architecture
and existing MMU OEA64 code.
While these changes are not better tested, superpages support is disabled by
default. To enable it, use vm.pmap.superpages_enabled=1.
In this initial implementation, when superpages are disabled, system
performance stays at the same level as without these changes. When
superpages are enabled, buildworld time increases a bit (~2%). However,
for workloads that put a heavy pressure on the TLB the performance boost
is much bigger (see HPC Challenge and pgbench on D25237).
Reviewed by: jhibbits
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D25237
Add support for Floating-Point Exception traps on 32 and 64 bit platforms.
Also make sure to clean FPSCR on EXEC and thread exit
Author of initial version: Renato Riolino <renato.riolino@eldorad.org.br>
Reviewed by: jhibbits
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D23623
This change adds support for POWER8 and POWER9 PMCs (bare metal and
pseries).
All PowerISA 2.07B non-random events are supported.
Implementation was based on that of PPC970.
Reviewed by: jhibbits
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D26110
And add a _74XX suffix to 74XX SPRs.
This is a preparation for adding support to POWER8/9 PMCs, which have most
SPRs equal to 970 ones.
Reviewed by: jhibbits
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D26532
VM_ALLOC_WAITOK and vm_page_unwire_noq(), have eliminated the need for
many of the #includes.
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D27052
Move dump_avail[] extern declaration and inlines into a new header
vm/vm_dumpset.h. This fixes default gcc build for mips.
Reviewed by: alc, scottph
Tested by: kevans (previous version)
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D26741
Push the root seed version to userspace through the VDSO page, if
the RANDOM_FENESTRASX algorithm is enabled. Otherwise, there is no
functional change. The mechanism can be disabled with
debug.fxrng_vdso_enable=0.
arc4random(3) obtains a pointer to the root seed version published by
the kernel in the shared page at allocation time. Like arc4random(9),
it maintains its own per-process copy of the seed version corresponding
to the root seed version at the time it last rekeyed. On read requests,
the process seed version is compared with the version published in the
shared page; if they do not match, arc4random(3) reseeds from the
kernel before providing generated output.
This change does not implement the FenestrasX concept of PCPU userspace
generators seeded from a per-process base generator. That change is
left for future discussion/work.
Reviewed by: kib (previous version)
Approved by: csprng (me -- only touching FXRNG here)
Differential Revision: https://reviews.freebsd.org/D22839
Now that config(8) has supported include for 19 years, transition to
including the NOTES files. include support didn't exist at the time,
nor did the envvar stuff recently added. Now that it does, eliminate
the building of LINT files by just including everything you need.
Note: This may cause conflicts with updating in some cases.
find sys -name LINT\* -rm
is suggested across this commit to remove the generated LINT
files.
Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D26540
Add support for sysctl 'machdep.uprintf_signal' that prints debugging
information on trap signal.
Reviewed by: jhibbits, luporl, bdragon
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D26004
It is unlikely, but possible, that an unrecognized or unsupported
relocation type is encountered while trying to load a kernel module. If
this occurs we should offer the symbol index as a hint to the user.
While here, fix some small style issues.
Reviewed by: markj, kib (amd64 part, in D26701)
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Similar to OPAL calls, switch to big endian to do calls to RTAS.
(Missed this one when I was doing the bulk commit of PowerPC64LE support.)
Sponsored by: Tag1 Consulting, Inc.