This is an amalgam of a patch by Doug Ambrisko to
generalize uart_acpi_find_device, imp moving the
ACPI table to uart_dev_ns8250.c and advice by jhb
to work around a bug in the EPYC 3151 BIOS
(the BIOS incorrectly marks the serial ports as
disabled)
Reviewed by: imp
MFC after: 8 weeks
Differential Revision: https://reviews.freebsd.org/D16432
- In configurations with a pseudo devices section, move 'device crypto'
into that section.
- Use a consistent comment. Note that other things common in kernel
configs such as GELI also require 'device crypto', not just IPSEC.
Reviewed by: rgrimes, cem, imp
Differential Revision: https://reviews.freebsd.org/D16775
The canonical form of sync is:
sync L, E (if Category Elemental Memory Barriers implemented)
The L bits (2) denote the type of sync:
0 -- hwsync
1 -- lwsync
2 -- ptesync or hwsync
It's been found that most 32-bit CPUs designed prior to the introduction of
lwsync will ignore the L bits. However, some cores, particularly the e500 core,
will trigger an illegal instruction exception. Adding these variants will make
it easier to see which sync variant is actually being used in case of a trap.
If OPAL_RTC_READ is busy and does not return the information on the first run,
as returning OPAL_BUSY_EVENT, the system will crash since ymd and hmsm variable
will contain junk values.
This is happening because we were not calling OPAL_RTC_READ again after
OPAL_POLL_EVENTS' return, which would finally replace the old/junk hmsm and ymd
values.
The code was also mixing OPAL_RTC_READ and OPAL_POLL_EVENTS return values.
This patch fix this logic and guarantee that we call OPAL_RTC_READ after
OPAL_POLL_EVENTS return, and guarantee the code will only proceed if
OPAL_RTC_READ returns OPAL_SUCCESS.
Reviewed by: jhibbits
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D16617
This is an OFW initrd module that would load the initrd from device tree
parameters and give the to the md driver.
With this patch, it is possible to pass a rootfs image through kexec in PowerNV
mode (powerpc64). In order to user it, you should set the MD_ROOT_MEM option in
your kernel configuration.
Reviewed by: jhibbits
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D15705
I had naively assumed that building kernel would be sufficient to test that
the header is sane. However, it turns out this now needs -fms-extensions to
build. Rather than sprinkling -fms-extensions all over the place, revert
for now, and revisit with a better fix.
Summary:
Ports like sysutils/lsof troll through kernel structures, and
therefore include kernel headers and all the dirty secrets involved. struct
vm_page includes the struct md_page inline, which currently is only defined
if AIM or BOOKE is defined. Thus, by default, sysutils/lsof cannot build,
due to the struct md_page having an incomplete type. Fix this by merging
the two struct definitions into an anonymous struct-union.
A similar change could be made to unify the pmap structures as well.
Reviewed By: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D16232
* FreeBSD stores addresses in 8 bit format, but the OPAL API requires the 7-bit
address, and encodes the direction elsewhere. Behave like other i2c drivers,
and shift accordingly.
* The OPAL API can already handle multiple requests in flight. Change the async
token to be private to the thread, so as not to stomp across i2c accesses,
remove the limitation error message, and use the correct message index to
transfer all messages in the list.
* Micro-optimize the async handler to not continuously call pmap_kextract() when
spin-waiting for the operation to complete.
This has been tested by hexdumping an EEPROM attached via the icee(4) driver.
ofw_iicbus already has attachments on iichb. Rather than adding an explicit
attachment onto opal_i2c, simply change the exposed name of the OPAL i2c bus
to 'iichb'.
- Change pcpu zone consumers to use a stride size of PAGE_SIZE.
(defined as UMA_PCPU_ALLOC_SIZE to make future identification easier)
- Allocate page from the correct domain for a given cpu.
- Don't initialize pc_domain to non-zero value if NUMA is not defined
There are some misconceptions surrounding this field. It is the
_VM_ NUMA domain and should only ever correspond to valid domain
values as understood by the VM.
The former slab size of sizeof(struct pcpu) was somewhat arbitrary.
The new value is PAGE_SIZE because that's the smallest granularity
which the VM can allocate a slab for a given domain. If you have
fewer than PAGE_SIZE/8 counters on your system there will be some
memory wasted, but this is obviously something where you want the
cache line to be coming from the correct domain.
Reviewed by: jeff
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15933
On arm64 (and possible other architectures) we are unable to use static
DPCPU data in kernel modules. This is because the compiler will generate
PC-relative accesses, however the runtime-linker expects to be able to
relocate these.
In preparation to fix this create two macros depending on if the data is
global or static.
Reviewed by: bz, emaste, markj
Sponsored by: ABT Systems Ltd
Differential Revision: https://reviews.freebsd.org/D16140
Summary: If the chosen console is not the OPAL uart, but OPAL uart devices
exist, the console device doesn't attach properly, and faults in the interrupt
handler, with a NULL pointer dereference. To fix this, and as a byproduct, also
support multiple OPAL consoles, refactor to have the console getc callback use
the appropriate softc instead of the global console_sc, which may be NULL in the
case of a different device being the console.
Reviewed by: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D16071
Summary: In r220638, stoppcbs started being tracked. This never got exposed to
ddb though, so kdb_thr_ctx() didn't know how to look them up.
This allows switching to threads on stopped CPUs in kdb.
Submitted by: Brandon Bergren <git_bdragon.rkt0.net>
Differential Revision: https://reviews.freebsd.org/D15986
r330610 relocated the DMAP from the base of memory to the base of the fourth
quadrant of memory. This broke synthetic traps, such as KDB forced
breakpoints. Use GET_TOCBASE() so the DMAP offset is handled.
Submitted by: git_bdragon.rkt0.net
Differential Revision: https://reviews.freebsd.org/D15973
Summary: POWER8 and POWER9 use a single CPU register, per core, to change clock
speed. Everything else is handled by the on-chip controller. This change
necessitates a change to the cpufreq global kernel driver to bump supported
levels, as the device tree for these systems can have theoretically 256
different options. On my POWER9 Talos, the list consists of 100 items. At
16.67MHz intervals, that allows for a change of roughly 1.67GHz between lowest
and highest.
This has only been tested on the POWER9. However, since they're similar, this
should work on POWER8 as well.
Reviewed By: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D15932
PowerISA 3.0 makes several changes to not only the format of the HPT but
also the behavior surrounding it. For instance, TLBIE no longer requires
serialization. Removing this lock cuts buildworld time in half on a
18-core/72-thread POWER9 system, demonstrating that this lock is highly
contended on such a system.
There was odd behavior observed trying to make this change in a
backwards-compatible manner in moea64_native.c, so the best option was to
fully split it, and largely revert the original changes adding POWER9
support to the original file.
Suggested by: nwhitehorn
On very large memory systems 'size' can become 2GB or larger, resulting in a
negative value being formatted. Also, moea64_pteg_count is already a long, so
format it as such.
There is a type promotion that transform count = -1 into a unsigned int causing
the default TCE SEG SIZE not being returned on a Boston POWER9 machine.
This machine does not have the 'ibm,supported-tce-sizes' entries, thus, count
is set to -1, and the function continue to execute instead of returning.
Reviewed by: jhibbits, wma
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D15763
pmc_process_interrupt takes 5 arguments when only 3 are needed.
cpu is always available in curcpu and inuserspace can always be
derived from the passed trapframe.
While facially a reasonable cleanup this change was motivated
by the need to workaround a compiler bug.
core2_intr(cpu, tf) ->
pmc_process_interrupt(cpu, ring, pmc, tf, inuserspace) ->
pmc_add_sample(cpu, ring, pm, tf, inuserspace)
In the process of optimizing the tail call the tf pointer was getting
clobbered:
(kgdb) up
at /storage/mmacy/devel/freebsd/sys/dev/hwpmc/hwpmc_mod.c:4709
4709 pmc_save_kernel_callchain(ps->ps_pc,
(kgdb) up
1205 error = pmc_process_interrupt(cpu, PMC_HR, pm, tf,
resulting in a crash in pmc_save_kernel_callchain.
Changed excise_initrd_region to support both 32- and 64-bit
values for linux,initrd-start and linux,initrd-end.
This fixes the boot problem on some machines after rS334485.
Submitted by: Luis Pires <lffpires@ruabrasil.org>
Reviewed by: jhibbits, leitao
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D15667
Summary: Included VSX registers in powerpc core dumps (both kernel and gcore)
Submitted by: Luis Pires
Differential Revision: https://reviews.freebsd.org/D15512
Summary:
Added ptrace support for getting/setting the remaining part of the VSX registers
(the part that's not already covered by FPR or VR registers).
This is necessary to add support for VSX registers in debuggers.
Submitted by: Luis Pires
Differential Revision: https://reviews.freebsd.org/D15458
This will let us use much more KVA for ZFS ARC where needed. This may be
incresed in the future if memory requirements increase.
Discussed with: nwhitehorn
Recently a change was made which broke loading 32-bit binaries on powerpc64,
with an assertion in ld-elf32.so.1:
ld-elf32.so.1: assert failed:
/usr/local/poudriere/jails/ppc64/usr/src/libexec/rtld-elf/rtld.c:390
It turns out Elf32_AuxInfo was broken for a very long time on powerpc64, as
it uses long and pointers, which are both 64 bits on powerpc64, and only
manifested with the recent work on auxargs.
Currently kexec loads an initrd file into the main memory but does not
mark that region as reserved, thus the area is not protected.
If any initrd/md file is loaded from kexec/petitboot, the region might become
corarupted/overwritten since FreeBSD does not know the region is 'reserved'.
This patch simply adds the initrd area as a reserved memory region.
Approved by: jhibbits
Differential Revision: https://reviews.freebsd.org/D15610
Summary:
Coupled with r334365, this makes PCI work on POWER9. There is still more to
do to fully exploit the hardware capabilities, but this is sufficient to
enable USB and ethernet controllers on a POWER9 Talos II system.
Reviewed by: nwhitehorn, leitao
Differential Revision: https://reviews.freebsd.org/D15566
This reduces the CPU cycle wastage on power9, which is SMT4. Any idle
thread that's spinning is simply starving working threads on the same core
of valuable resources.
This can be reduced further by taking more advantage of the PSSCR supported
states, as well as permitting state loss, as is currently done for power8.
The currently implemented stop state is the lowest latency, which may still
consume resources.
POWER9 supports Radix page tables in addition to Hashed page tables. When
Radix page tables are in use, the TLB is cut in half, so that half of the
TLB is used for the page walk cache. This is the default behavior, however
FreeBSD currently does not support Radix tables. Clear this bit so that we
can use the full TLB. Do this in the MMU logic so that configuration can be
localized to the specific translation format. Once we do support Radix
tables, the setup for that will be localized to the Radix MMU kobj.
Summary:
PowerISA 2.03 and later require bits 14:65 in the RB register argument,
which is the full value of the vpn argument post-shift. Only POWER4, POWER4+,
and PPC970* need the upper 16 bits cropped.
With this change FreeBSD can boot to multi-user on POWER9.
Reviewed by: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D15581
IPMI access on PowerNV systems is done through the OPAL firmware. This adds a
simple attachment for communicating with the FSP/BMC on these machines. This
has been tested on a Talos POWER9 workstation, only in the bootup phase, noting
the successful attachment messages:
...
ipmi0: IPMI device rev. 0, firmware rev. 2.00, version 2.0, device support mask 0
ipmi0: Number of channels 2
...
The ipmi device has not been added to GENERIC64, but may be after further
testing. It may also eventually be added to the ipmi module at that point.
Summary:
PowerNV architectures (in the test case POWER9) export sensors via the device
tree, which are accessed via OPAL calls. This adds sysctl nodes for each
device in a generic fashion. New sysctl nodes are:
dev.opal_sensor.N.sensor
dev.opal_sensor.N.sensor_min
dev.opal_sensor.N.sensor_max
dev.opal_sensor.N.type
dev.opal_sensor.N.label
These are rooted at a parent attachment under opal, called opalsens. This does
not add support for the "sensor groups" defined in the device tree.
Reviewed by: breno.leitao_gmail.com
Differential Revision: https://reviews.freebsd.org/D15362