Commit Graph

3167 Commits

Author SHA1 Message Date
John Baldwin
b317cfd4c0 Don't enter DDB for fatal traps before panic by default.
Add a new 'debugger_on_trap' knob separate from 'debugger_on_panic'
and make the calls to kdb_trap() in MD fatal trap handlers prior to
calling panic() conditional on this new knob instead of
'debugger_on_panic'.  Disable the new knob by default.  Developers who
wish to recover from a fatal fault by adjusting saved register state
and retrying the faulting instruction can still do so by enabling the
new knob.  However, for the more common case this makes the user
experience for panics due to a fatal fault match the user experience
for other panics, e.g. 'c' in DDB will generate a crash dump and
reboot the system rather than being stuck in an infinite loop of fatal
fault messages and DDB prompts.

Reviewed by:	kib, avg
MFC after:	2 months
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D17768
2018-11-01 21:34:17 +00:00
Kyle Evans
be352d20d5 Compile in VERBOSE_SYSINIT support by default, remain silent by default
The loader tunable 'debug.verbose_sysinit' may be used to toggle verbosity.
This is added to the debugging section of these kernconfs to be turned off
in stable branches for clarity of intent.

MFC after:	never
2018-10-31 22:38:19 +00:00
Michael Tuexen
d59a162c11 Bump the number of fans supported from 8 to 12.
The number of fans on a PowerMac7,3 with liquid cooling is 9.

Reviewed by:		andreast@
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D17754
2018-10-30 11:51:09 +00:00
Justin Hibbits
a37c714a0f powerpc/mpc85xx: Reset the PCIe bus on attach
It seems if a Radeon card is already initialized by u-boot, it won't be
reinitialized by the kernel, and the DRM module will fail to attach.  This
steals the reset code from mips/octopci.c to blindly reset the bus on attach.
This was tested on a AmigaOne X5000/20, such that it can be booted from the
local video console, and get a video console in FreeBSD.
2018-10-30 00:47:40 +00:00
Brooks Davis
c3adaa3305 Consolidate identical ELF auxargs type defintions.
All platforms except powerpc use the same values and powerpc shares a
majority of them.

Go ahead and declare AT_NOTELF, AT_UID, and AT_EUID in favor of the
unused AT_DCACHEBSIZE, AT_ICACHEBSIZE, and AT_UCACHEBSIZE for powerpc.

Reviewed by:	jhb, imp
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17397
2018-10-22 22:24:32 +00:00
Leandro Lupori
d93e635a81 ppc64: limited 32-bit DMA address range
Further investigation of issues with 32-bit DMA on PowerNV revealed that
its window is hardcoded by OPAL (at least in skiboot version 5.4.9) and
cannot be changed by the OS.
Thus, now jhb suggestion of limiting the range in PCI DMA tag seems
the best way to deal with it.

Reviewed by:	jhibbits, nwhitehorn, sbruno
Approved by:	jhibbits(mentor)
Differential Revision:	https://reviews.freebsd.org/D17601
2018-10-22 13:40:50 +00:00
Justin Hibbits
24dd643d54 powerpc: stash off srr0 in si_addr for signals
si_addr is the address of the instruction executing at the time the
signal was sent.  Populate this field with srr0, which, though not
always the case, is most often the instruction that triggered the fault.
2018-10-22 00:27:37 +00:00
Justin Hibbits
6f4827aca7 powerpc/booke: Turn tlb*_print_tlbentries() into 'show tlb*' DDB commands
debugf() is unnecessary for the TLB printing functions, as they're only
intended to be used from ddb.  Instead, make them full DDB 'show'
commands, so now it can be written as 'show tlb1' and 'show tlb0'
instead of calling the function, hoping DEBUG has been defined.
2018-10-22 00:21:27 +00:00
Justin Hibbits
c9bd5bee1a powerpc/mpc85xx: Make Freescale PCI bridge driver a subclass of ofw_pcib_pci
This driver was already 99% identical to the ofw_pcib_pci driver, except for
the attachment.  Since ofw_pcib_pci is already a subclass of pcib, this
creates a private declaration of that class, to use for the base class for
this driver.

At some point in the future, ofw_pcib_pci_driver should probably be exported
to a header, so we're not tracking the softc struct contents, but for now,
since there's only this one other driver, it's not a pressing issue.
2018-10-21 02:39:13 +00:00
Justin Hibbits
27ef2ca86b powerpc64/powernv: Add pnpinfo strings to opal device children
This makes it easier to see what's left unattached as new drivers are
written, and to see what drivers get attached to what nodes.
2018-10-21 02:30:34 +00:00
Justin Hibbits
d692cd43c4 powerpc64/pmap: Correct the logic for minidump KVA chunk
r279252 inverted the logic in moea64_scan_init, such that instead of
terminating when reaching a dead page, it terminates when reaching a live
page, ostensibly preserving exactly one page of KVA.
2018-10-21 02:28:04 +00:00
Justin Hibbits
54b310b892 powerpc64/xics: Fix comment typo 2018-10-21 02:25:56 +00:00
Justin Hibbits
2756851a77 powerpc64/powernv:opal_pci: Fix the alignment of the TCE table
The TCE table need only be aligned to the size of the table, not the size of
the TCE segment.
2018-10-21 02:24:37 +00:00
Justin Hibbits
289041e2cb powerpcspe: Implement SPE exception handling
The Signal Processing Engine (SPE) found in Freescale e500 cores (and
others) offloads IEEE-754 compliance (NaN, Inf handling, overflow,
underflow) to software, most likely as a means of simplifying the APU
silicon.  Some software, like AbiWord, needs full IEEE-754 compliance,
including NaN handling.  Implement the necessary bits to enable it.

Differential Revision: https://reviews.freebsd.org/D17446
2018-10-21 00:43:27 +00:00
Leandro Lupori
461298ffa6 Initialize SPRG0 before its first possible use
At early boot, PCPU_GET(), that obtains a pointer from SPRG0, was being
used with SPRG0 not yet initialized. If it pointed to an invalid
address, the machine would hang.

Approved by:	re(gjb), jhibbits(mentor)
2018-10-15 16:43:07 +00:00
Michael Tuexen
20a2f77eec Enable TCP Fast Open support for PPC platforms.
Reviewed by:		kbowling@, andreast@
Approved by:		re (kib@)
Differential Revision:	https://reviews.freebsd.org/D17407
2018-10-07 12:56:05 +00:00
Justin Hibbits
7e524b0746 powerpc/pseries: EOI interrupts in XICS by setting lowest priority
Discussing with Benjamin Herrenschmidt, OPAL_INT_GET_XIRR masks the
returned priority, so must be resumed before more interrupts can be
handled at this priority.  Since there are only two priorities used in
FreeBSD, we know that the previous priority in an EOI will always be
0xff (lowest priority).

Reviewed by:	nwhitehorn
Approved by:	re(rgrimes)
Differential Revision: https://reviews.freebsd.org/D17361
2018-10-06 18:51:49 +00:00
Justin Hibbits
013cc176c9 powerpc64/powernv: Don't mask MSIs in OPAL
Summary:
Discussing with Benjamin Herrenschmidt, MSIs, and edge-triggered
interrupts in general, must not be masked in XICS and XIVE, else
subsequent interrupts may be ignored.

Testing locally on my Talos II (single CPU, 18-core POWER9), NVMe now
works with MSI, improving read throughput by ~70% (900MB/s -> 1.67GB/s,
with 64MB block size) over INTx interrupts, and snd_hda(4) now will
actually play music with MSI.  Previously, snd_hda(4) would not receive
interrupts, timing out, and declaring the channels dead.

This has also been tested by Kevin Bowling, and others, with great
success.  Kevin reported NVMe unusable on his Talos II prior to this
patch.

Reviewed by:	nwhitehorn, kbowling
Approved by:	re(rgrimes)
Differential Revision: https://reviews.freebsd.org/D17356
2018-10-06 03:20:26 +00:00
Kevin Bowling
8ac2f3ba9f Use nda(4) on powerpc64
Approved by:	re@ (kib), krion (mentor), imp
Differential Revision:	https://reviews.freebsd.org/D17368
2018-10-02 21:36:00 +00:00
Justin Hibbits
fd8cf3be5f powerpc: Blacklist the top 64kB range of the lower 4GB PA space
The PHB4 host bridge used by the POWER9 uses a 64kB range in 32-bit
space at the address 0xffff0000-0xffffffff.  Reserve this range so that
DMA memory cannot be allocated within this range.  This fixes seemingly
random crashes on a POWER9 system.  Ideally this range will have been
reserved by the firmware, but as of now this is not the case.

Submitted by:	git_bdragon.rtk0.net
Reviewed by:	nwhitehorn
Approved by:	re(kib)
Differential Revision:	https://reviews.freebsd.org/D17183
2018-09-25 02:34:28 +00:00
Breno Leitao
03e83a838e powerpc64: Add initial support for HTM (kABI)
This patch adds the very initial support for HTM that might come at FreeBSD
version 12.1. This basic support defines a new kABI, so, we do not need to change
it later during 12.1 time frame, when the full implementation will come.

Reviewed by: jhibbits
Approved by: re(marius), jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D16889
2018-09-06 17:07:21 +00:00
Alan Cox
49bfa624ac Eliminate the arena parameter to kmem_free(). Implicitly this corrects an
error in the function hypercall_memfree(), where the wrong arena was being
passed to kmem_free().

Introduce a per-page flag, VPO_KMEM_EXEC, to mark physical pages that are
mapped in kmem with execute permissions.  Use this flag to determine which
arena the kmem virtual addresses are returned to.

Eliminate UMA_SLAB_KRWX.  The introduction of VPO_KMEM_EXEC makes it
redundant.

Update the nearby comment for UMA_SLAB_KERNEL.

Reviewed by:	kib, markj
Discussed with:	jeff
Approved by:	re (marius)
Differential Revision:	https://reviews.freebsd.org/D16845
2018-08-25 19:38:08 +00:00
Mark Johnston
36716fe2e6 Prepare the kernel linker to handle PC-relative ifunc relocations.
The boot-time ifunc resolver assumes that it only needs to apply
IRELATIVE relocations to PLT entries.  With an upcoming optimization,
this assumption no longer holds, so add the support required to handle
PC-relative relocations targeting GNU_IFUNC symbols.
- Provide a custom symbol lookup routine that can be used in early boot.
  The default lookup routine uses kobj, which is not functional at that
  point.
- Apply all existing relocations during boot rather than filtering
  IRELATIVE relocations.
- Ensure that we continue to apply ifunc relocations in a second pass
  when loading a kernel module.

Reviewed by:	kib
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D16749
2018-08-22 20:44:30 +00:00
Alan Cox
83a90bffd8 Eliminate kmem_malloc()'s unused arena parameter. (The arena parameter
became unused in FreeBSD 12.x as a side-effect of the NUMA-related
changes.)

Reviewed by:	kib, markj
Discussed with:	jeff, re@
Differential Revision:	https://reviews.freebsd.org/D16825
2018-08-21 16:43:46 +00:00
Alan Cox
44d0efb215 Eliminate kmem_alloc_contig()'s unused arena parameter.
Reviewed by:	hselasky, kib, markj
Discussed with:	jeff
Differential Revision:	https://reviews.freebsd.org/D16799
2018-08-20 15:57:27 +00:00
Matt Macy
381388b9c4 add snps IP uart support / genaralize UART
This is an amalgam of a patch by Doug Ambrisko to
generalize uart_acpi_find_device, imp moving the
ACPI table to uart_dev_ns8250.c and advice by jhb
to work around a bug in the EPYC 3151 BIOS
(the BIOS incorrectly marks the serial ports as
disabled)

Reviewed by: imp
MFC after: 8 weeks
Differential Revision: https://reviews.freebsd.org/D16432
2018-08-19 21:10:21 +00:00
Justin Hibbits
8d67357c5c powerpc conf: Add PRINTF_BUFR_SIZE option to Book-E configs
Without this, printf is very hard to follow at times on multicore systems.
2018-08-19 19:07:59 +00:00
Justin Hibbits
b793c8ab28 Sort SPR_SPEFSCR in the SPR list
Also remove duplicate definition of SPR_IBAT0U.
2018-08-19 19:03:43 +00:00
Justin Hibbits
340a810bf0 booke pmap: hide debug-ish printf behind bootverbose
It's not necessary during normal operation to know the mapped region size
and wasted space.
2018-08-19 18:54:43 +00:00
John Baldwin
8cd385fda0 Make 'device crypto' lines more consistent.
- In configurations with a pseudo devices section, move 'device crypto'
  into that section.
- Use a consistent comment.  Note that other things common in kernel
  configs such as GELI also require 'device crypto', not just IPSEC.

Reviewed by:	rgrimes, cem, imp
Differential Revision:	https://reviews.freebsd.org/D16775
2018-08-18 20:32:08 +00:00
Justin Hibbits
7d849dc1a4 powerpc: Add lwsync and ptesync 'sync' opcode variants to ddb disassembler
The canonical form of sync is:

  sync L, E (if Category Elemental Memory Barriers implemented)

The L bits (2) denote the type of sync:

  0 -- hwsync
  1 -- lwsync
  2 -- ptesync or hwsync

It's been found that most 32-bit CPUs designed prior to the introduction of
lwsync will ignore the L bits.  However, some cores, particularly the e500 core,
will trigger an illegal instruction exception.  Adding these variants will make
it easier to see which sync variant is actually being used in case of a trap.
2018-08-10 03:28:40 +00:00
Breno Leitao
78f4e2fea0 powerpc64/powernv: re-read RTC after polling
If OPAL_RTC_READ is busy and does not return the information on the first run,
as returning OPAL_BUSY_EVENT, the system will crash since ymd and hmsm variable
will contain junk values.

This is happening because we were not calling OPAL_RTC_READ again after
OPAL_POLL_EVENTS' return, which would finally replace the old/junk hmsm and ymd
values.

The code was also mixing OPAL_RTC_READ and OPAL_POLL_EVENTS return values.

This patch fix this logic and guarantee that we call OPAL_RTC_READ after
OPAL_POLL_EVENTS return, and guarantee the code will only proceed if
OPAL_RTC_READ returns OPAL_SUCCESS.

Reviewed by: jhibbits
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D16617
2018-08-08 21:19:07 +00:00
Konstantin Belousov
e45b89d23d Add pmap_is_valid_memattr(9).
Discussed with:	alc
Sponsored by:	The FreeBSD Foundation, Mellanox Technologies
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D15583
2018-08-01 18:45:51 +00:00
Justin Hibbits
9b8d0a4615 powerpcspe: Unconditionally save an restore SPEFSCR on task switch
The SPEFSCR is not guarded by the SPV bit in MSR, it's just another SPR.
Protect processes from other tasks setting the SPEFSCR for their own needs.
2018-07-30 17:03:15 +00:00
Justin Hibbits
0bf0bb832f Support building IPMI as a module on powerpc64
This still only supports IPMI via OPAL on powerpc64, but now it can be tested
with a GENERIC kernel.
2018-07-25 18:58:57 +00:00
Justin Hibbits
adc9dcf3e3 Fix floating point exception definitions for powerpcspe
These were incorrectly implemented in the original port.
2018-07-24 22:04:56 +00:00
Breno Leitao
3ddc2cde51 ofw: Load initrd file
This is an OFW initrd module that would load the initrd from device tree
parameters and give the to the md driver.

With this patch, it is possible to pass a rootfs image through kexec in PowerNV
mode (powerpc64). In order to user it, you should set the MD_ROOT_MEM option in
your kernel configuration.

Reviewed by: jhibbits
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D15705
2018-07-24 16:52:52 +00:00
Justin Hibbits
038c615929 Revert r336509. Fails buildworld.
I had naively assumed that building kernel would be sufficient to test that
the header is sane.  However, it turns out this now needs -fms-extensions to
build.  Rather than sprinkling -fms-extensions all over the place, revert
for now, and revisit with a better fix.
2018-07-19 21:06:58 +00:00
Justin Hibbits
7fb935da15 Merge the md_page structs for AIM and Book-E into a single unioned struct
Summary:
Ports like sysutils/lsof troll through kernel structures, and
therefore include kernel headers and all the dirty secrets involved.  struct
vm_page includes the struct md_page inline, which currently is only defined
if AIM or BOOKE is defined.  Thus, by default, sysutils/lsof cannot build,
due to the struct md_page having an incomplete type.  Fix this by merging
the two struct definitions into an anonymous struct-union.

A similar change could be made to unify the pmap structures as well.

Reviewed By: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D16232
2018-07-19 20:13:33 +00:00
Justin Hibbits
3395ab28eb powerpc/powernv: Make opal_i2c driver work with attached i2c drivers
* FreeBSD stores addresses in 8 bit format, but the OPAL API requires the 7-bit
  address, and encodes the direction elsewhere.  Behave like other i2c drivers,
  and shift accordingly.
* The OPAL API can already handle multiple requests in flight.  Change the async
  token to be private to the thread, so as not to stomp across i2c accesses,
  remove the limitation error message, and use the correct message index to
  transfer all messages in the list.
* Micro-optimize the async handler to not continuously call pmap_kextract() when
  spin-waiting for the operation to complete.

This has been tested by hexdumping an EEPROM attached via the icee(4) driver.
2018-07-09 20:33:48 +00:00
Justin Hibbits
fedd55f14b Let ofw_iicbus work its magic on OPAL i2c buses.
ofw_iicbus already has attachments on iichb.  Rather than adding an explicit
attachment onto opal_i2c, simply change the exposed name of the OPAL i2c bus
to 'iichb'.
2018-07-07 01:58:40 +00:00
Matt Macy
ab3059a8e7 Back pcpu zone with domain correct pages
- Change pcpu zone consumers to use a stride size of PAGE_SIZE.
  (defined as UMA_PCPU_ALLOC_SIZE to make future identification easier)

- Allocate page from the correct domain for a given cpu.

- Don't initialize pc_domain to non-zero value if NUMA is not defined
  There are some misconceptions surrounding this field. It is the
  _VM_ NUMA domain and should only ever correspond to valid domain
  values as understood by the VM.

The former slab size of sizeof(struct pcpu) was somewhat arbitrary.
The new value is PAGE_SIZE because that's the smallest granularity
which the VM can allocate a slab for a given domain. If you have
fewer than PAGE_SIZE/8 counters on your system there will be some
memory wasted, but this is obviously something where you want the
cache line to be coming from the correct domain.

Reviewed by: jeff
Sponsored by: Limelight Networks
Differential Revision:  https://reviews.freebsd.org/D15933
2018-07-06 02:06:03 +00:00
Andrew Turner
2bf9501287 Create a new macro for static DPCPU data.
On arm64 (and possible other architectures) we are unable to use static
DPCPU data in kernel modules. This is because the compiler will generate
PC-relative accesses, however the runtime-linker expects to be able to
relocate these.

In preparation to fix this create two macros depending on if the data is
global or static.

Reviewed by:	bz, emaste, markj
Sponsored by:	ABT Systems Ltd
Differential Revision:	https://reviews.freebsd.org/D16140
2018-07-05 17:13:37 +00:00
Justin Hibbits
341679e1f2 Support multiple OPAL consoles, and don't crash if uart is not stdout
Summary: If the chosen console is not the OPAL uart, but OPAL uart devices
exist, the console device doesn't attach properly, and faults in the interrupt
handler, with a NULL pointer dereference.  To fix this, and as a byproduct, also
support multiple OPAL consoles, refactor to have the console getc callback use
the appropriate softc instead of the global console_sc, which may be NULL in the
case of a different device being the console.

Reviewed by:	nwhitehorn
Differential Revision: https://reviews.freebsd.org/D16071
2018-06-29 19:35:25 +00:00
Justin Hibbits
4a28eaada4 Expose stopped cpu contexts to ddb on PowerPC
Summary: In r220638, stoppcbs started being tracked. This never got exposed to
ddb though, so kdb_thr_ctx() didn't know how to look them up.

This allows switching to threads on stopped CPUs in kdb.

Submitted by:	Brandon Bergren <git_bdragon.rkt0.net>
Differential Revision: https://reviews.freebsd.org/D15986
2018-06-25 22:05:33 +00:00
Justin Hibbits
ab42fbe2e9 powerpc64: Fix stack setup in dbtrap
r330610 relocated the DMAP from the base of memory to the base of the fourth
quadrant of memory.  This broke synthetic traps, such as KDB forced
breakpoints.  Use GET_TOCBASE() so the DMAP offset is handled.

Submitted by:	git_bdragon.rkt0.net
Differential Revision:	https://reviews.freebsd.org/D15973
2018-06-23 01:42:34 +00:00
Justin Hibbits
88cafb5b91 Fix the build post-PMCR addition.
Submitted by:	lwhsu
2018-06-21 15:59:05 +00:00
Justin Hibbits
b99540b655 Add the rest of the files for r335481
Missed hooking PMCR cpufreq(4) to the build, and adding the SPR to the header.
2018-06-21 14:30:14 +00:00
Justin Hibbits
22c1b4c0f1 Introduce PMCR-based cpufreq(4) driver, for IBM POWER8 and POWER9 systems
Summary: POWER8 and POWER9 use a single CPU register, per core, to change clock
speed.  Everything else is handled by the on-chip controller.  This change
necessitates a change to the cpufreq global kernel driver to bump supported
levels, as the device tree for these systems can have theoretically 256
different options.  On my POWER9 Talos, the list consists of 100 items.  At
16.67MHz intervals, that allows for a change of roughly 1.67GHz between lowest
and highest.

This has only been tested on the POWER9.  However, since they're similar, this
should work on POWER8 as well.

Reviewed By: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D15932
2018-06-21 14:26:43 +00:00
Justin Hibbits
ebf95d96d9 Split the PowerISA 3.0 HPT implementation from historic
PowerISA 3.0 makes several changes to not only the format of the HPT but
also the behavior surrounding it.  For instance, TLBIE no longer requires
serialization.  Removing this lock cuts buildworld time in half on a
18-core/72-thread POWER9 system, demonstrating that this lock is highly
contended on such a system.

There was odd behavior observed trying to make this change in a
backwards-compatible manner in moea64_native.c, so the best option was to
fully split it, and largely revert the original changes adding POWER9
support to the original file.

Suggested by:	nwhitehorn
2018-06-14 17:23:51 +00:00
Justin Hibbits
402c7806cb Fix CTR formatting for moea64_native bootstrap
On very large memory systems 'size' can become 2GB or larger, resulting in a
negative value being formatted.  Also, moea64_pteg_count is already a long, so
format it as such.
2018-06-14 16:01:11 +00:00
Breno Leitao
5ecc8c2077 powerpc64/powernv: Avoid type promotion
There is a type promotion that transform count = -1 into a unsigned int causing
the default TCE SEG SIZE not being returned on a Boston POWER9 machine.

This machine does not have the 'ibm,supported-tce-sizes' entries, thus, count
is set to -1, and the function continue to execute instead of returning.

Reviewed by: jhibbits, wma
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D15763
2018-06-12 19:50:33 +00:00
Matt Macy
eb7c901995 hwpmc: simplify calling convention for hwpmc interrupt handling
pmc_process_interrupt takes 5 arguments when only 3 are needed.
cpu is always available in curcpu and inuserspace can always be
derived from the passed trapframe.

While facially a reasonable cleanup this change was motivated
by the need to workaround a compiler bug.

core2_intr(cpu, tf) ->
  pmc_process_interrupt(cpu, ring, pmc, tf, inuserspace) ->
    pmc_add_sample(cpu, ring, pm, tf, inuserspace)

In the process of optimizing the tail call the tf pointer was getting
clobbered:

(kgdb) up
    at /storage/mmacy/devel/freebsd/sys/dev/hwpmc/hwpmc_mod.c:4709
4709                                pmc_save_kernel_callchain(ps->ps_pc,
(kgdb) up
1205                    error = pmc_process_interrupt(cpu, PMC_HR, pm, tf,

resulting in a crash in pmc_save_kernel_callchain.
2018-06-08 04:58:03 +00:00
Breno Leitao
6d645c57a3 Fix excise_initrd_region() to support 32- and 64-bit initrd params.
Changed excise_initrd_region to support both 32- and 64-bit
values for linux,initrd-start and linux,initrd-end.

This fixes the boot problem on some machines after rS334485.

Submitted by: Luis Pires <lffpires@ruabrasil.org>
Reviewed by: jhibbits, leitao
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D15667
2018-06-07 21:24:21 +00:00
Justin Hibbits
5167f178ab Included VSX registers in powerpc core dumps
Summary: Included VSX registers in powerpc core dumps (both kernel and gcore)

Submitted by:	Luis Pires
Differential Revision: https://reviews.freebsd.org/D15512
2018-06-02 20:28:58 +00:00
Justin Hibbits
2e65567500 Added ptrace support for reading/writing powerpc VSX registers
Summary:
Added ptrace support for getting/setting the remaining part of the VSX registers
(the part that's not already covered by FPR or VR registers).

This is necessary to add support for VSX registers in debuggers.

Submitted by:	Luis Pires
Differential Revision: https://reviews.freebsd.org/D15458
2018-06-02 19:17:11 +00:00
Justin Hibbits
3254c39f83 Increase powerpc64 KVA from ~7.25GB to 32GB
This will let us use much more KVA for ZFS ARC where needed.  This may be
incresed in the future if memory requirements increase.

Discussed with:	nwhitehorn
2018-06-01 21:37:20 +00:00
Justin Hibbits
a608b7d313 Unbreak 32-bit binaries on powerpc64
Recently a change was made which broke loading 32-bit binaries on powerpc64,
with an assertion in ld-elf32.so.1:

ld-elf32.so.1: assert failed:
/usr/local/poudriere/jails/ppc64/usr/src/libexec/rtld-elf/rtld.c:390

It turns out Elf32_AuxInfo was broken for a very long time on powerpc64, as
it uses long and pointers, which are both 64 bits on powerpc64, and only
manifested with the recent work on auxargs.
2018-06-01 16:31:05 +00:00
Breno Leitao
48f64992f2 powerpc64: Avoid overwriting initrd area
Currently kexec loads an initrd file into the main memory but does not
mark that region as reserved, thus the area is not protected.

If any initrd/md file is loaded from kexec/petitboot, the region might become
corarupted/overwritten since FreeBSD does not know the region is 'reserved'.

This patch simply adds the initrd area as a reserved memory region.

Approved by: jhibbits
Differential Revision: https://reviews.freebsd.org/D15610
2018-06-01 12:43:13 +00:00
Justin Hibbits
e69b55eadb Remove a debug printf from opal_pci driver 2018-05-31 04:11:40 +00:00
Justin Hibbits
dceea51efe Make opal_pci driver work with POWER9
Summary:
Coupled with r334365, this makes PCI work on POWER9.  There is still more to
do to fully exploit the hardware capabilities, but this is sufficient to
enable USB and ethernet controllers on a POWER9 Talos II system.

Reviewed by:	nwhitehorn, leitao
Differential Revision: https://reviews.freebsd.org/D15566
2018-05-30 03:00:57 +00:00
Justin Hibbits
f07ee2a7c0 Cache the phandle of the PCI node in opal_pci_attach
Simple cleanup, no functional change.  This is related to the fixups needed
for POWER9 support.
2018-05-30 02:47:23 +00:00
Justin Hibbits
0b1f36b6c5 Make ALT_BREAK_TO_DEBUGGER work with OPAL console
Match other consoles by using the higher level cngetc() in the interrupt
handler, so that kdb_alt_break() can check for console break.
2018-05-28 01:59:48 +00:00
Justin Hibbits
38cfc8c393 Print the full-width pointer values in hex.
PRI0ptrX is used to print a zero-padded hex value of the architecture's bitness,
so on 64-bit architectures it'll print the full 64 bit address.
2018-05-28 00:19:08 +00:00
Justin Hibbits
877c96bedc Match style of the other prototypes, and don't name the argument. 2018-05-27 20:36:43 +00:00
Justin Hibbits
ce7b8e55e3 Stop idle threads on power9 in the idle task until an interrupt.
This reduces the CPU cycle wastage on power9, which is SMT4.  Any idle
thread that's spinning is simply starving working threads on the same core
of valuable resources.

This can be reduced further by taking more advantage of the PSSCR supported
states, as well as permitting state loss, as is currently done for power8.
The currently implemented stop state is the lowest latency, which may still
consume resources.
2018-05-27 20:24:24 +00:00
Justin Hibbits
5ab39b6552 On POWER9 clear the HID0_RADIX before enabling the page tables
POWER9 supports Radix page tables in addition to Hashed page tables.  When
Radix page tables are in use, the TLB is cut in half, so that half of the
TLB is used for the page walk cache.  This is the default behavior, however
FreeBSD currently does not support Radix tables.  Clear this bit so that we
can use the full TLB.  Do this in the MMU logic so that configuration can be
localized to the specific translation format.  Once we do support Radix
tables, the setup for that will be localized to the Radix MMU kobj.
2018-05-26 04:33:19 +00:00
Justin Hibbits
9ae8a6d3d1 Fix a typo missed in r334232 2018-05-26 04:24:25 +00:00
Justin Hibbits
459e54f990 Correct a typo for opal temperature sensor type constant 2018-05-26 02:45:41 +00:00
Justin Hibbits
204d74320d Only crop the VPN on POWER4 and derivatives for TLBIE operations
Summary:
PowerISA 2.03 and later require bits 14:65 in the RB register argument,
which is the full value of the vpn argument post-shift.  Only POWER4, POWER4+,
and PPC970* need the upper 16 bits cropped.

With this change FreeBSD can boot to multi-user on POWER9.

Reviewed by:	nwhitehorn
Differential Revision: https://reviews.freebsd.org/D15581
2018-05-26 00:41:50 +00:00
Justin Hibbits
1a3eaf6cc8 Add an IPMI attachment for PowerNV systems
IPMI access on PowerNV systems is done through the OPAL firmware.  This adds a
simple attachment for communicating with the FSP/BMC on these machines.  This
has been tested on a Talos POWER9 workstation, only in the bootup phase, noting
the successful attachment messages:

...
ipmi0: IPMI device rev. 0, firmware rev. 2.00, version 2.0, device support mask 0
ipmi0: Number of channels 2
...

The ipmi device has not been added to GENERIC64, but may be after further
testing.  It may also eventually be added to the ipmi module at that point.
2018-05-22 03:57:32 +00:00
Justin Hibbits
5272c9bd07 Add a comment explaining the need of a global temporary variable
cpu_xirr is used only as a temporary location for the OPAL call in
PIC_DISPATCH().

Requested by:	nwhitehorn
2018-05-22 03:24:16 +00:00
Justin Hibbits
9c6ba29de1 Basic OPAL sensor support for POWER9 platforms
Summary:
PowerNV architectures (in the test case POWER9) export sensors via the device
tree, which are accessed via OPAL calls.  This adds sysctl nodes for each
device in a generic fashion.  New sysctl nodes are:

dev.opal_sensor.N.sensor
dev.opal_sensor.N.sensor_min
dev.opal_sensor.N.sensor_max
dev.opal_sensor.N.type
dev.opal_sensor.N.label

These are rooted at a parent attachment under opal, called opalsens.  This does
not add support for the "sensor groups" defined in the device tree.

Reviewed by:	breno.leitao_gmail.com
Differential Revision: https://reviews.freebsd.org/D15362
2018-05-22 02:42:53 +00:00
Nathan Whitehorn
6cff19a3be Fix build with PSERIES but not POWERNV defined. 2018-05-20 18:26:09 +00:00
Justin Hibbits
ef6da5e5c7 Add support for the XIVE XICS emulation mode for POWER9 systems
Summary:
POWER9 systems use a new interrupt controller, XIVE, managed through OPAL
firmware calls.  The OPAL firmware includes support for emulating the previous
generation XICS presentation layer in addition to a new "XIVE Exploitation"
mode.  As a stopgap until we have XIVE exploitation mode, enable XICS emulation
mode so that we at least have an interrupt controller.

Since the CPPR is local to the current CPU, it cannot be updated for APs when
initializing on the BSP.  This adds a new function, directly called by the
powernv platform code, to initialize the CPPR on AP bringup.

Reviewed by:	nwhitehorn
Differential Revision: https://reviews.freebsd.org/D15492
2018-05-20 03:23:17 +00:00
Mark Johnston
892bdccca0 Enable kernel dump features in GENERIC for most platforms.
This turns on support for kernel dump encryption and compression, and
netdump. arm and mips platforms are omitted for now, since they are more
constrained and don't benefit as much from these features.

Reviewed by:	cem, manu, rgrimes
Tested by:	manu (arm64)
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D15465
2018-05-19 19:53:23 +00:00
Justin Hibbits
4a11ed7159 Add SPR_HSRR0/SPR_HSRR1 definitions
Reported by:	Mark Millard
Pointy-hat to:	jhibbits
2018-05-19 04:56:10 +00:00
Justin Hibbits
5321c01b50 Add hypervisor trap handling, using HSRR0/HSRR1
Summary:
Some hypervisor exceptions on POWER architecture only save state to HSRR0/HSRR1.
Until we have bhyve on POWER, use a lightweight exception frontend which copies
HSRR0/HSRR1 into SRR0/SRR1, and run the normal trap handler.

The first user of this is the Hypervisor Virtualization Interrupt, which targets
the XIVE interrupt controller on POWER9.

Reviewed By: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D15487
2018-05-19 04:21:50 +00:00
Justin Hibbits
30f3b0f5f7 powerpc64: Add OPAL definitions
Summary:
Add additional OPAL PCI definitions and expand the code to use them in order to
ease the OPAL interface process for new comers.

These definitions came directly from the OPAL code and they are the same for
both PHB3 (POWER8) and PHB4 (POWER9).

Submitted by:	Breno Leitao
Differential Revision: https://reviews.freebsd.org/D15432
2018-05-19 04:01:15 +00:00
Justin Hibbits
c07c77a311 Fix a manual copy from the original diff for r333825
The 'else' was in the original diff.

Submitted by:	Breno Leitao
2018-05-19 03:47:28 +00:00
Justin Hibbits
876f3b9295 Add yet another option for gathering available memory
On some POWER9 systems, 'reg' denotes the full memory in the system, while
'linux,usable-memory' denotes the usable memory.  Some memory is reserved for
NVLink usage, so is partitioned off.

Submitted by:	Breno Leitao
2018-05-19 03:45:38 +00:00
Justin Hibbits
829c98b82e Add some Hypervisor interrupt definitions
This mostly completes the interrupt definitions.  There are still some left out,
less likely to be used in the near term.
2018-05-19 03:23:46 +00:00
Matt Macy
d7c5a620e2 ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
4.98   0.00   4.42   0.00 4235592     33   83.80 4720653 2149771   1235 247.32
4.73   0.00   4.20   0.00 4025260     33   82.99 4724900 2139833   1204 247.32
4.72   0.00   4.20   0.00 4035252     33   82.14 4719162 2132023   1264 247.32
4.71   0.00   4.21   0.00 4073206     33   83.68 4744973 2123317   1347 247.32
4.72   0.00   4.21   0.00 4061118     33   80.82 4713615 2188091   1490 247.32
4.72   0.00   4.21   0.00 4051675     33   85.29 4727399 2109011   1205 247.32
4.73   0.00   4.21   0.00 4039056     33   84.65 4724735 2102603   1053 247.32

After the patch

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
5.43   0.00   4.20   0.00 3313143     33   84.96 5434214 1900162   2656 245.51
5.43   0.00   4.20   0.00 3308527     33   85.24 5439695 1809382   2521 245.51
5.42   0.00   4.19   0.00 3316778     33   87.54 5416028 1805835   2256 245.51
5.42   0.00   4.19   0.00 3317673     33   90.44 5426044 1763056   2332 245.51
5.42   0.00   4.19   0.00 3314839     33   88.11 5435732 1792218   2499 245.52
5.44   0.00   4.19   0.00 3293228     33   91.84 5426301 1668597   2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15366
2018-05-18 20:13:34 +00:00
Nathan Whitehorn
b00df92b1f Final fix for alignment issues with the page table first patched with
r333273 and partially reverted with r333594.

Older CPUs implement addition of offsets into the page table by a
bitwise OR rather than actual addition, which only works if the table is
aligned at a multiple of its own size (they also require it to be aligned
at a multiple of 256KB). Newer ones do not have that requirement, but it
hardly matters to enforce it anyway.

The original code was failing on newer systems with huge amounts of RAM
(> 512 GB), in which the page table was 4 GB in size. Because the
bootstrap memory allocator took its alignment parameter as an int, this
turned into a 0, removing any alignment constraint at all and making
the MMU fail. The first round of this patch (r333273) fixed this case by
aligning it at 256 KB, which broke older CPUs. Fix this instead by widening
the alignment parameter.
2018-05-14 04:00:52 +00:00
Nathan Whitehorn
b9ff14e6e9 Revert changes to hash table alignment in r333273, which booting on all G5
systems, pending further analysis.
2018-05-13 23:56:43 +00:00
Justin Hibbits
04de51dbab No need to bzero splpar_vpa entries
splpar_vpa is in the BSS, so is already zeroed when the kernel starts up.

Tested by:	Leandro Lupori
2018-05-11 02:04:01 +00:00
Justin Hibbits
b4a0a59871 Fix PPC symbol resolution
Summary:
There were 2 issues that were preventing correct symbol resolution
on PowerPC/pseries:

1- memory corruption at chrp_attach() - this caused the inital
   part of the symbol table to become zeroed, which would cause
   the kernel linker to fail to parse it.
   (this was probably zeroing out other memory parts as well)

2- DDB symbol resolution wasn't working because symtab contained
   not relocated addresses but it was given relocated offsets.
   Although relocating the symbol table fixed this, it broke the
   linker, that already handled this case.
   Thus, the fix for this consists in adding a new DDB macro:
   DB_STOFFS(offs) that converts a (potentially) relocated offset
   into one that can be compared with symbol table values.

PR:		227093
Submitted by:	Leandro Lupori <leandro.lupori_gmail.com>
Differential Revision: https://reviews.freebsd.org/D15372
2018-05-10 03:59:48 +00:00
Warner Losh
5aa07b053a Move MI-ish bcopy routine to libkern
riscv and powerpc have nearly identical bcopy.c that's
supposed to be mostly MI. Move it to the MI libkern.

Differential Revision: https://reviews.freebsd.org/D15374
2018-05-10 02:31:38 +00:00
Justin Hibbits
151c44e22b Fix wrong cpu0 identification
Summary:
chrp_cpuref_init() was relying on the boot strap processor to be
the first child of /cpus. That was not always the case, specially
on pseries with FDT.

This change uses the "reg" property of each CPU instead and also
adds several sanity checks to avoid unexpected behavior (maybe
too many panics?).

The main observed symptom was interrupts being missed by the main
processor, leading to timeouts and the kernel aborting the boot.

Submitted by:	Leandro Lupori
Reviewed by:	nwhitehorn
Differential Revision: https://reviews.freebsd.org/D15174
2018-05-08 13:23:39 +00:00
Justin Hibbits
10d0cdfc6e Add support for powernv POWER9 MMU initialization
The POWER9 MMU (PowerISA 3.0) is slightly different from current
configurations, using a partition table even for hypervisor mode, and
dropping the SDR1 register.  Key off the newly early-enabled CPU features
flags for the new architecture, and configure the MMU appropriately.

The POWER9 MMU ignores the "PSIZ" field in the PTCR, and expects a 64kB
table.  As we are enabled for powernv (hypervisor mode, no VMs), only
initialize partition table entry 0, and zero out the rest.  The actual
contents of the register are identical to SDR1 from previous architectures.

Along with this, fix a bug in the page table allocation with very large
memory.  The table can be allocated on any 256k boundary.  The
bootstrap_alloc alignment argument is an int, and with large amounts of
memory passing the size of the table as the alignment will overflow an
integer.  Hard-code the alignment at 256k as wider alignment is not
necessary.

Reviewed by:	nwhitehorn
Tested by:	Breno Leitao
Relnotes:	Yes
2018-05-05 16:00:02 +00:00
Justin Hibbits
55a12bbda2 Break out the cpu_features setup to its own function, to be run earlier
The new POWER9 MMU configuration is slightly different from current setups.
Rather than special-casing on POWER9, move the initialization of cpu_features
and cpu_features2 to as early as possible, so that platform and MMU
configuration can be based upon CPU features instead of specific CPUs if at all
possible.

Reviewed by:	nwhitehorn
2018-05-05 15:48:39 +00:00
Justin Hibbits
4f4f92c58f Add POWER9 to the POWER8 bootstrap case blocks
POWER8 and POWER9 have similar configuration requirements for hypervisor setup,
and in the cases here they're identical.  Add the POWER9 constant to the POWER8
list so it's initialized correctly.

Reviewed by:	nwhitehorn
2018-05-05 15:42:58 +00:00
Mateusz Guzik
a571c38536 Allow __builtin_memmove instead of bcopy for small buffers of known size
See r323329 for an explanation why this is a good idea.
2018-05-04 04:00:48 +00:00
Justin Hibbits
971b5e4da8 Remove dead errata fixup code
This code caused more problems than it should have fixed (boot failures) on
the machines I tested, so has been commented out for a while now.  Remove
it, and assume the errata fixups were done by the bootloader where they
belong.
2018-05-01 04:31:17 +00:00
Nathan Whitehorn
47280ef170 Fix null pointer dereference on nodes without a "compatible" property.
MFC after:	1 week
2018-04-30 19:37:32 +00:00
Justin Hibbits
42ca1d5cc3 Increase the fdtmemreserv array limit to boot on POWER9
Discussing with others, this needs to be at least 20 to boot on some POWER9
nodes.  Linux made a similar change for the same reason, so increase to 32
to give us some extra breathing room as well.  The input and output arrays
are sized at 256, so much greater than the increase in the property array
size.
2018-04-25 02:42:11 +00:00
Justin Hibbits
38c32a140b Fix the build post r332859
sysentvec::sv_hwcap/sv_hwcap2 are pointers to  u_long, so cpu_features* need
to be u_long to use the pointers.  This also requires a temporary cast in
printing the bitfields, which is fine because the feature flag fields are
only 32 bits anyway.
2018-04-22 03:58:04 +00:00
Justin Hibbits
611c02b1d2 Export powerpc CPU features for auxvec
FreeBSD exports the AT_HWCAP* auxvec items if provided by the ELF sysentvec
structure.  Add the CPU features to be exported, so user space can more
easily check for them without using the hw.cpu_features and hw.cpu_features2
sysctls.
2018-04-21 15:15:47 +00:00
Justin Hibbits
18f48e0c72 Sync powerpc feature flags with Linux
Not all feature flags are synced.  Those for processors we don't currently
support are ignored currently.  Those that are supported are synced best I
can tell.  One flag was renamed to match the Linux flag name
(PPC_FEATURE2_VCRYPTO -> PPC_FEATURE2_VEC_CRYPTO).
2018-04-21 04:18:17 +00:00
Justin Hibbits
567dd766f6 powerpc64: Set n_slbs = 32 for POWER9
Summary:
POWER9 also contains 32 slbs entries as explained by the POWER9 User Manual:

 "For HPT translation, the POWER9 core contains a unified (combined for both
   instruction and data), 32-entry, fully-associative SLB per thread"

Submitted by:	Breno Leitao
Differential Revision: https://reviews.freebsd.org/D15128
2018-04-20 03:23:19 +00:00
Justin Hibbits
2914706ab0 powerpc64: Add DSCR support
Summary:
Powerpc64 has support for a register called Data Stream Control Register
(DSCR), which basically controls how the hardware controls the caching and
prefetch for stream operations.

Since mfdscr and mtdscr are privileged instructions, we need to emulate them,
and
keep the custom DSCR configuration per thread.

The purpose of this feature is to change DSCR depending on the operation, set
to DSCR Default Prefetch Depth to deepest on string operations, as memcpy.

Submitted by:	Breno Leitao
Differential Revision: https://reviews.freebsd.org/D15081
2018-04-20 03:19:44 +00:00
Nathan Whitehorn
323e673945 Fix detection of memory overlap with the kernel in the case where a memory
region marked "available" by firmware is contained entirely in the kernel.

This had a tendency to happen with FDTs passed by loader, though could for
other reasons as well, and would result in the kernel slowly cannibalizing
itself for other purposes, eventually resulting in a crash.

A similar fix is needed for mmu_oea.c and should probably just be rolled
at that point into some generic code in platform.c for taking a mem_region
list and removing chunks.

PR:		226974
Submitted by:	leandro.lupori@gmail.com
Reviewed by:	jhibbits
Differential Revision:	D15121
2018-04-19 18:34:38 +00:00
Alexander Motin
596e6ade92 Release memory resource on cuda driver attach failure.
Submitted by:	Dmitry Luhtionov <dmitryluhtionov@gmail.com>
2018-04-19 15:29:10 +00:00
Andriy Gapon
f3f6ecb450 set kdb_why to "trap" when calling kdb_trap from trap_fatal
This will allow to hook a ddb script to "kdb.enter.trap" event.
Previously there was no specific name for this event, so it could only
be handled by either "kdb.enter.unknown" or "kdb.enter.default" hooks.
Both are very unspecific.

Having a specific event is useful because the fatal trap condition is
very similar to panic but it has an additional property that the current
stack frame is the frame where the trap occurred.  So, both a register
dump and a stack bottom dump have additional information that can help
analyze the problem.

I have added the event only on architectures that have trap_fatal()
function defined.  I haven't looked at other architectures.  Their
maintainers can add support for the event later.

Sample script:
kdb.enter.trap=bt; show reg; x/aS $rsp,20; x/agx $rsp,20

Reviewed by:	kib, jhb, markj
MFC after:	11 days
Sponsored by:	Panzura
Differential Revision: https://reviews.freebsd.org/D15093
2018-04-19 05:06:56 +00:00
Andriy Gapon
6d83b2e971 don't check for kdb reentry in trap_fatal(), it's impossible
trap() checks for it earlier and calls kdb_reentry().

Discussed with:	jhb
MFC after:	12 days
Sponsored by:	Panzura
2018-04-18 15:44:54 +00:00
Brooks Davis
9c11d8d483 Remove the unused fuwintr() and suiwintr() functions.
Half of implementations always failed (returned (-1)) and they were
previously used in only one place.

Reviewed by:	kib, andrew
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15102
2018-04-17 18:04:28 +00:00
Warner Losh
9d89a9f326 No need to force md code to define a macro that's the same as
_BYTE_ORDER. Use that instead.
2018-04-16 13:52:23 +00:00
Justin Hibbits
3877c32ec9 Use a resource hint instead of environment variable for DIU mode
This makes it more consistent with FreeBSD norms, rather than using Linux's
norms.  Now, instead of needing an environment variable

  video-mode=fslfb:1280x1024@60

Now one would use a hint:

  hint.fb.0.mode=1280x1024@60
2018-04-16 04:02:53 +00:00
Justin Hibbits
bda8aa770a Reenter KDB on fault on powerpc, instead of panicking
Most other architectures already re-enter KDB on faults, powerpc and mips
are the only outliers.  Correct this for powerpc, so that now bad addresses
can be handled gracefully instead of panicking.
2018-04-10 21:14:54 +00:00
Justin Hibbits
99adcecf38 Call through powerpc_interrupt for all Book-E interrupts
Make int_external_input, int_decrementer, and int_performance_counter all
now use trap_common, just like on AIM.  The effects of this are:

* All traps are now properly displayed in ddb.  Previously traps from
  external input, decrementer, and performance counters, would display as
  just basic stack traces.  Now the frame is displayed.

* External interrupts are now handled with interrupts enabled, so handling
  can be preempted.  This seems to fix a hang found post-r329882.
2018-04-10 17:32:27 +00:00
Oleksandr Tymoshenko
f7604b1b27 Align OF_getencprop_alloc API with OF_getencprop and OF_getprop_alloc
Change OF_getencprop_alloc semantics to be combination of malloc and
OF_getencprop and return size of the property, not number of elements
allocated.

For the use cases where number of elements is preferred introduce
OF_getencprop_alloc_multi helper function that copies semantics
of OF_getencprop_alloc prior to this change.

This is to make OF_getencprop_alloc and OF_getencprop_alloc_multi
function signatures consistent with OF_getencprop_alloc and
OF_getencprop_alloc_multi.

Functionality-wise this patch is mostly rename of OF_getencprop_alloc
to OF_getencprop_alloc_multi except two calls in ofw_bus_setup_iinfo
where 1 was used as a block size.
2018-04-09 22:06:16 +00:00
Oleksandr Tymoshenko
217d17bcd3 Clean up OF_getprop_alloc API
OF_getprop_alloc takes element size argument and returns number of
elements in the property. There are valid use cases for such behavior
but mostly API consumers pass 1 as element size to get string
properties. What API users would expect from OF_getprop_alloc is to be
a combination of malloc + OF_getprop with the same semantic of return
value. This patch modifies API signature to match these expectations.

For the valid use cases with element size != 1 and to reduce
modification scope new OF_getprop_alloc_multi function has been
introduced that behaves the same way OF_getprop_alloc behaved prior to
this patch.

Reviewed by:	ian, manu
Differential Revision:	https://reviews.freebsd.org/D14850
2018-04-08 22:59:34 +00:00
Justin Hibbits
b4b4b17687 Fix typo
Reserved cause is 6, not 5.

Reported by:	cem
2018-04-08 19:33:05 +00:00
Justin Hibbits
ac2605b1d1 Powerpc64: Add the facility unavailable trap subsystem
Summary:
This code adds the basic infrastructure for the facility subsystem. A facility
trap is raised when an unavailable instruction is executed. One example is
executing a Hardware Transactional Memory instruction while the MSR[TM] is
disabled. In the past, there was a specific interrupt for it (FP, VEC), but the
new instructions seem to be multiplexed on this facility interrupt.

The root cause of the trap is provided on Facility Status and Control Register
(FSCR) register.

Submitted by:	Breno Leitao
Reviewed by:	nwhitehorn
Differential Revision: https://reviews.freebsd.org/D14566
2018-04-08 19:11:25 +00:00
Justin Hibbits
3762bafa7b powerpc64: Print current MSR on printtrap()
Summary:
Print current MSR on printtrap(). Currently, printtrap just prints srr1, which
contains part of the MSR prior to the exception. I find useful to dump the
current value of the MSR, since it changes when there is an interruption.

With this patch, this is the new printtrap model:

handled user trap:

    exception       = 0x700 (program)
    srr0            = 0x100008a0 (0x100008a0)
    srr1            = 0x800000000002f032
    current msr     = 0x8000000000009032
    lr              = 0x1000089c (0x1000089c)
    curthread       = 0x7a50000
	pid = 714, comm = ttrap2

Submitted by:	Breno Leitao
Reviewed by:	nwhitehorn
Differential Revision: https://reviews.freebsd.org/D14600
2018-04-08 16:55:28 +00:00
Justin Hibbits
8238d3423d powerpc64: Avoid calling isync twice
Summary:
It is not necessary to call isync() after calling mtmsr() function, mainly
because the mtmsr() calls 'isync' internally to synchronize the machine state
register. Other than that, isync() just calls the 'isync' instruction, thus,
the 'isync' instruction is being called twice, and that seems to be unnecessary.

This patch just remove the unecessary calls to isync() after mtmsr().

Submitted by:	Breno Leitao
Differential Revision: https://reviews.freebsd.org/D14583
2018-04-08 16:46:24 +00:00
Justin Hibbits
d6d0670814 powerpc/ofw: Fix malloc inside lock
Summary:
Currently ofw_real_bounce_alloc() is requesting memory, using WAITOK, holding a
non-sleepable locks, called 'OF Bounce Page'.

Fix this by allocating the pages outside of the lock, and only updating the
global variables while holding the lock.

Submitted by:	Breno Leitao
Differential Revision:	https://reviews.freebsd.org/D14955
2018-04-08 16:43:56 +00:00
Brooks Davis
6469bdcdb6 Move most of the contents of opt_compat.h to opt_global.h.
opt_compat.h is mentioned in nearly 180 files. In-progress network
driver compabibility improvements may add over 100 more so this is
closer to "just about everywhere" than "only some files" per the
guidance in sys/conf/options.

Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of
sys/compat/linux/*.c.  A fake _COMPAT_LINUX option ensure opt_compat.h
is created on all architectures.

Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the
set of compiled files.

Reviewed by:	kib, cem, jhb, jtl
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14941
2018-04-06 17:35:35 +00:00
Justin Hibbits
9225dfbdf0 Correct the ilog2() for calculating memory sizes.
TLB1 can handle ranges up to 4GB (through e5500, larger in e6500), but
ilog2() took a unsigned int, which maxes out at 4GB-1, but truncates
silently.  Increase the input range to the largest supported, at least for
64-bit targets.  This lets the DMAP be completely mapped, instead of only
1GB blocks with it assuming being fully mapped.
2018-04-04 02:13:27 +00:00
Justin Hibbits
9f5b999aca Add support for a pmap direct map for 64-bit Book-E
As with AIM64, map the DMAP at the beginning of the fourth "quadrant" of
memory, and move the KERNBASE to the the start of KVA.

Eventually we may run the kernel out of the DMAP, but for now, continue
booting as it has been.
2018-04-03 00:45:38 +00:00
Justin Hibbits
9ae2eed9f8 Debug interrupts aren't instruction traps
The EXC_DEBUG type is akin to the MPC74xx "Instruction Breakpoint" trap.
Don't treat it as a trap instruction.
2018-03-23 00:40:08 +00:00
Ed Maste
fc2a8776a2 Rename assym.s to assym.inc
assym is only to be included by other .s files, and should never
actually be assembled by itself.

Reviewed by:	imp, bdrewery (earlier)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D14180
2018-03-20 17:58:51 +00:00
Justin Hibbits
a029f84189 Fix powerpc Book-E build post-331018/331048.
pagedaemon_wakeup() was moved from vm_pageout.h to vm_pagequeue.h.
2018-03-20 01:07:22 +00:00
Oleksandr Tymoshenko
108117cc22 [ofw] fix errneous checks for OF_finddevice(9) return value
OF_finddevices returns ((phandle_t)-1) in case of failure. Some code
in existing drivers checked return value to be equal to 0 or
less/equal to 0 which is also wrong because phandle_t is unsigned
type. Most of these checks were for negative cases that were never
triggered so trhere was no impact on functionality.

Reviewed by:	nwhitehorn
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D14645
2018-03-20 00:03:49 +00:00
Wojciech Macek
d90930743f Reverting r330925 for now 2018-03-15 06:19:45 +00:00
Wojciech Macek
22eedd96c7 PowerNV: Fix I2C to compile if FDT is disabled
Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-03-14 09:20:03 +00:00
Nathan Whitehorn
9b0ec025d4 Restore missing temporary variable, deleted by accident in r330845. This
unbreaks the ppc32 AIM build.

Reported by:	jhibbits
2018-03-13 18:24:21 +00:00
Nathan Whitehorn
8864f35942 Execute PowerPC64/AIM kernel from direct map region when possible.
When the kernel can be in real mode in early boot, we can execute from
high addresses aliased to the kernel's physical memory. If that high
address has the first two bits set to 1 (0xc...), those addresses will
automatically become part of the direct map. This reduces page table
pressure from the kernel and it sets up the kernel to be used with
radix translation, for which it has to be up here.

This is accomplished by exploiting the fact that all PowerPC kernels are
built as position-independent executables and relocate themselves
on start. Before this patch, the kernel runs at 1:1 VA:PA, but that
VA/PA is random and set by the bootloader. Very early, it processes
its ELF relocations to operate wherever it happens to find itself.
This patch uses that mechanism to re-enter and re-relocate the kernel
a second time witha new base address set up in the early parts of
powerpc_init().

Reviewed by:	jhibbits
Differential Revision:	D14647
2018-03-13 15:03:58 +00:00
Nathan Whitehorn
35feca377d Make FDT-using parts of ofw_machdep.c condition on options FDT. This fixes
the kernel build when options FDT is absent.
2018-03-11 01:09:31 +00:00
Nathan Whitehorn
f9edb09d70 Move the powerpc64 direct map base address from zero to high memory. This
accomplishes a few things:
- Makes NULL an invalid address in the kernel, which is useful for catching
  bugs.
- Lays groundwork for radix-tree translation on POWER9, which requires the
  direct map be at high memory.
- Similarly lays groundwork for a direct map on 64-bit Book-E.

The new base address is chosen as the base of the fourth radix quadrant
(the minimum kernel address in this translation mode) and because all
supported CPUs ignore at least the first two bits of addresses in real
mode, allowing direct-map addresses to be used in real-mode handlers.
This is required by Linux and is part of the architecture standard
starting in POWER ISA 3, so can be relied upon.

Reviewed by:	jhibbits, Breno Leitao
Differential Revision:	D14499
2018-03-07 17:08:07 +00:00
Nathan Whitehorn
72820025dd Fix use of unitialized variables. 2018-03-06 15:52:43 +00:00
Jonathan T. Looney
beb2406556 amd64: Protect the kernel text, data, and BSS by setting the RW/NX bits
correctly for the data contained on each memory page.

There are several components to this change:
 * Add a variable to indicate the start of the R/W portion of the
   initial memory.
 * Stop detecting NX bit support for each AP.  Instead, use the value
   from the BSP and, if supported, activate the feature on the other
   APs just before loading the correct page table.  (Functionally, we
   already assume that the BSP and all APs had the same support or
   lack of support for the NX bit.)
 * Set the RW and NX bits correctly for the kernel text, data, and
   BSS (subject to some caveats below).
 * Ensure DDB can write to memory when necessary (such as to set a
   breakpoint).
 * Ensure GDB can write to memory when necessary (such as to set a
   breakpoint).  For this purpose, add new MD functions gdb_begin_write()
   and gdb_end_write() which the GDB support code can call before and
   after writing to memory.

This change is not comprehensive:
 * It doesn't do anything to protect modules.
 * It doesn't do anything for kernel memory allocated after the kernel
   starts running.
 * In order to avoid excessive memory inefficiency, it may let multiple
   types of data share a 2M page, and assigns the most permissions
   needed for data on that page.

Reviewed by:	jhb, kib
Discussed with:	emaste
MFC after:	2 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D14282
2018-03-06 14:28:37 +00:00
Nathan Whitehorn
c1d6f6ebe8 Honor physical memory regions marked unavailable in the FDT, when present.
The most notable of these is the FDT itself, which it is a bad idea to
overwrite.
2018-03-03 02:06:48 +00:00
Nathan Whitehorn
1a60bed731 Remove assumption that all physical memory is available to the kernel and
that the physical and available memory arrays are interchangeable.
2018-03-03 02:04:40 +00:00
Wojciech Macek
4ffd72e34c PowerNV: Initial support for OPAL I2C transfers
Add I2C OPAL driver and a set of dummy-ones to allow
all I2C things on Power8 to attach.

TODO: better async token management

Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-03-01 14:11:07 +00:00
Justin Hibbits
5903f5954a Fix the psl_userset32 definition.
It should be based on psl_userset, not psl_kernset.  As kernset, it would
inherit kernel config, including privilege level.
2018-03-01 04:44:17 +00:00
Justin Hibbits
2d5320a818 Increase the size of a reservation granule for TLB locks
A reservation granule on PowerPC is a cache line.

On e500mc and derivatives a cacheline size is 64 bytes, not 32.  Allocate
the maximum size permitted, but only utilize the size that is needed.  On
e500v1 and e500v2 the reservation granule will still be 32 bytes.
2018-02-27 04:38:27 +00:00
Justin Hibbits
7fa00cd0ab Fix a minor typo. 2018-02-27 04:23:03 +00:00
Justin Hibbits
e6939726ef Correct a copy&paste-o -- altivec assist interrupt, not watchdog 2018-02-26 03:05:36 +00:00
Nathan Whitehorn
f638d50513 Avoid dereferencing random memory when kickstarting DMA.
MFC after: 1 week
2018-02-24 22:34:56 +00:00
Justin Hibbits
7eb6081727 Unbreak 64-bit Book-E builds post r329712
can_wakeup is defined only in AIM's locore64.S, so conditionalize use of it
on AIM in addition to powerpc64.
2018-02-24 18:12:38 +00:00
Justin Hibbits
635d2bed1d Make MPC85XXSPE kernel conf ident match the file name 2018-02-24 17:54:12 +00:00
Justin Hibbits
675147dd7b Change ident for QORIQ64 kernel conf
Make it match the conf file name.
2018-02-24 17:53:22 +00:00
Justin Hibbits
571892ff4d Unbreak the build after r329891
I was apparently a little too excited with deleting code, and apparently
didn't do a final test build before commit.  Restore cpu_idle_wakeup().
2018-02-24 17:29:29 +00:00
Justin Hibbits
6708989b60 Remove platform_cpu_idle() and platform_cpu_idle_wakeup() interfaces
These interfaces were put in place to let QorIQ SoCs dictate CPU idling
semantics, in order to support capabilities such as NAP mode and deep sleep.
However, this never stabilized, and the idling support reverted back to
CPU-level rather than SoC level.  Move this code back to cpu.c instead.  If
at a later date the lower power modes do come to fruition, it should be done
by overriding the cpu_idle_hook instead of this platform hook.
2018-02-24 01:46:56 +00:00
Wojciech Macek
3c41c1d446 powerpc64: add NVMe to GENERIC64
NVMe support is ready and should be compiled-in
to the ppc64 kernel.

Submitted by:          Wojciech Macek <wma@semihalf.org>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-02-23 07:43:52 +00:00
Warner Losh
ef1fcaf0f5 Do not include float interfaces when using libsa.
We don't support float in the boot loaders, so don't include
interfaces for float or double in systems headers. In addition, take
the unusual step of spiking double and float to prevent any more
accidental seepage.
2018-02-23 04:04:25 +00:00
Nathan Whitehorn
d9dbc2104f Add definition for the PowerPC A2. 2018-02-21 15:15:58 +00:00
Nathan Whitehorn
dddf28585d Add definitions for the new Radix MMU mode on POWER9+ CPUs. 2018-02-21 15:15:31 +00:00
Wojciech Macek
6d13fd638c PowerNV: Put processor to power-save state in idle thread
When processor enters power-save state it releases resources shared with other
cpu threads which makes other cores working much faster.

This patch also implements saving and restoring registers that might get
corrupted in power-save state.

Submitted by:          Patryk Duda <pdk@semihalf.com>
Obtained from:         Semihalf
Reviewed by:           jhibbits, nwhitehorn, wma
Sponsored by:          IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D14330
2018-02-21 14:28:40 +00:00
Wojciech Macek
eb96cc1364 PowerNV: add missing RTC_WRITE support
Add function which can store RTC values to OPAL.

Submitted by:          Wojciech Macek <wma@semihalf.org>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-02-21 08:13:17 +00:00
Justin Hibbits
fcc491a3fe Split printtrap() into generic and CPU-specific components
Summary:
This compartmentalizes the CPU-specific trap components into its own
function, rather than littering the general printtrap() with various checks.
This will let us replace a series of #ifdef's with a runtime conditional check
in the future.

Reviewed By:	nwhitehorn
Differential Revision:	https://reviews.freebsd.org/D14416
2018-02-21 03:34:33 +00:00
Konstantin Belousov
2c0f13aa59 vm_wait() rework.
Make vm_wait() take the vm_object argument which specifies the domain
set to wait for the min condition pass.  If there is no object
associated with the wait, use curthread' policy domainset.  The
mechanics of the wait in vm_wait() and vm_wait_domain() is supplied by
the new helper vm_wait_doms(), which directly takes the bitmask of the
domains to wait for passing min condition.

Eliminate pagedaemon_wait().  vm_domain_clear() handles the same
operations.

Eliminate VM_WAIT and VM_WAITPFAULT macros, the direct functions calls
are enough.

Eliminate several control state variables from vm_domain, unneeded
after the vm_wait() conversion.

Scetched and reviewed by:	jeff
Tested by:	pho
Sponsored by:	The FreeBSD Foundation, Mellanox Technologies
Differential revision:	https://reviews.freebsd.org/D14384
2018-02-20 10:13:13 +00:00
Wojciech Macek
f32ebdc85c PowerPC: Switch to more accurate unit to avoid division rounding
On POWER8 architecture there is a timer with 512Mhz frequency.
It has about 1,95ns period, but it is rounded to 1ns which is not accurate.

Submitted by:          Patryk Duda <pdk@semihalf.com>
Obtained from:         Semihalf
Reviewed by:           wma
Sponsored by:          IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D14433
2018-02-20 07:30:57 +00:00
Wojciech Macek
838070d5f4 PowerNV: Send SIGILL on HEA illegal instruction exception
Currently Hypervisor Emulation Assistance interrupt is unhandled.
Executing an undefined instruction in userland triggers kernel panic.
Handle this the same way as Facility Unavailable Interrupt - send
SIGILL signal to userspace.

Submitted by:          Michal Stanek <mst@semihalf.com>
Obtained from:         Semihalf
Reviewed by:           nwhitehorn, pdk@semihalf.com, wma
Sponsored by:          IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D14437
2018-02-20 06:38:55 +00:00
Nathan Whitehorn
65184f89b6 Set internal error returns for OF_peer(), OF_child(), and OF_parent() to
zero, matching the IEEE 1275 standard. Since these internal error paths
have never, to my knowledge, been taken, behavior is unchanged.

Reported by:	gonzo
MFC after:	2 weeks
2018-02-19 15:49:14 +00:00
Justin Hibbits
bce6d88bc1 Merge AIM and Book-E PCPU fields
This is part of a long-term goal of merging Book-E and AIM into a single GENERIC
kernel.  As more work is done, the struct may be optimized further.

Reviewed by:	nwhitehorn
2018-02-17 20:59:12 +00:00
Justin Hibbits
a00ce4e854 PPC64: Get the timestap from the proper OF field
Summary:
After revision rS328534('PPC64: use hwref instead of cpuid'), FreeBSD on
powerpc64 virtual machine panics since it is unable to read the
timebase, showing the following error:

     get-property for timebase-frequency on zero phandle

     panic: Unable to determine timebase frequency!

With the change above,  cpuref->cr_hwref does not contain the phandle
anymore, thus, it never reads the proper CPU entry in OF.

Submitted by:	Breno Leitao
Differential Revision:	https://reviews.freebsd.org/D14204
2018-02-14 02:51:28 +00:00
Justin Hibbits
26e251b55c powerpc64/pseries: Define new hcalls
Summary:
Define new hcalls as in 'Linux on Power Architecture Platform Reference'
version 1.1 (24 March 2016) downloaded from:

        https://members.openpowerfoundation.org/document/dl/469

Submitted by:	Breno Leitao
Differential Revision:	https://reviews.freebsd.org/D14281
2018-02-14 02:48:27 +00:00
Jeff Roberson
e958ad4cf3 Make v_wire_count a per-cpu counter(9) counter. This eliminates a
significant source of cache line contention from vm_page_alloc().  Use
accessors and vm_page_unwire_noq() so that the mechanism can be easily
changed in the future.

Reviewed by:	markj
Discussed with:	kib, glebius
Tested by:	pho (earlier version)
Sponsored by:	Netflix, Dell/EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D14273
2018-02-12 22:53:00 +00:00
Warner Losh
62bca77843 Move __va_list and related defines to sys/sys/_types.h
__va_list and related defines are identical in all the
ARCH/include/_types.h files. Move them to sys/sys/_types.h

Sponsored by: Netflix
2018-02-12 14:48:20 +00:00
Warner Losh
982e7bdafc We don't support gcc < 4.2.1, so varargs.h now is just #error
always. Unifdef for versions prior to 4.2.1 and remove now-unused
header files.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D14323
2018-02-12 14:48:14 +00:00
Warner Losh
33e959abab Use standard pattern for stdargs.h
We don't support older compilers. Most of the code in these files is
for pre-3.0 gcc, which is at least 15 years obsolete. Move to using
phk's sys/_stdargs.h for all these platforms.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D14323
2018-02-12 14:48:05 +00:00
Nathan Whitehorn
778c8dac96 Fix PowerMac G5 thermal management, plus likely other bugs, introduced in
r328113 and affecting SMP systems.

The way the time is set on PowerMacs is racy and relies on all the
CPUs in the system setting a register simultaneously in a rendezvous. A
few-cycle delay can result in out-of-sync times, which can break the
scheduler and result in calls like mtx_sleep() and pause() never timing out
if the thread is migrated while sleeping. r328113 added a call to a no-op
function between the beginning of the rendezvous and setting the time that
was only called on APs and added enough cycles to cause a problematic offset.
For some reason, the fan-management code was the first place this appeared.

Clue from:	andreast
Reported by:	many
2018-02-09 20:09:32 +00:00
Mark Johnston
ab7c09f121 Use vm_page_unwire_noq() instead of directly modifying page wire counts.
No functional change intended.

Reviewed by:	alc, kib (previous revision)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D14266
2018-02-08 19:28:51 +00:00
Jeff Roberson
e2068d0bcd Use per-domain locks for vm page queue free. Move paging control from
global to per-domain state.  Protect reservations with the free lock
from the domain that they belong to.  Refactor to make vm domains more
of a first class object.

Reviewed by:    markj, kib, gallatin
Tested by:      pho
Sponsored by:   Netflix, Dell/EMC Isilon
Differential Revision:  https://reviews.freebsd.org/D14000
2018-02-06 22:10:07 +00:00
Justin Hibbits
b0d3bb2613 Only look for L2 cache controllers for mpc85xx_cache
The L3 cache controller (Corenet Platform Cache) is listed with one of its
compatible strings as "cache", which this driver can't attach to.  Restrict
to a known list of primary cache controller strings, as found in the l2cache
devicetree binding.
2018-02-04 20:07:08 +00:00
Justin Hibbits
ce2d51972f Start building modules for MPC85XX and MPC85XXSPE
These kernels aren't restricted to development boards anymore, they are
closer in behavior to GENERIC, so build modules.
2018-02-04 15:40:48 +00:00
Justin Hibbits
2c26c98c89 Add sdhci to MPC85XX build 2018-02-04 15:39:15 +00:00
Steve Wills
aa3c83c3c6 Create GENERIC64-NODEBUG for powerpc64
Approved by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D14192
2018-02-04 14:27:12 +00:00
Steve Wills
e1782bae5f Correct longjmp
Reviewed by:	nwhitehorn
Differential Revision:	https://reviews.freebsd.org/D14159
2018-02-02 02:28:25 +00:00
Nathan Whitehorn
619282986d Change the default MSR values used when starting userland and kernel
threads from compile-time defines to global variables. This removes a
significant amount of duplicated runtime patches to the compile-time
defines, centralizing the conditional logic in the early startup code.

Reviewed by:	jhibbits
2018-02-01 05:31:24 +00:00
Nathan Whitehorn
564ac41556 Fix build on 32-bit PowerPC, broken in r328537. 2018-02-01 05:28:02 +00:00
Wojciech Macek
d32802f0c3 PowerNV: fix compilation on non-NV platforms
Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-01-31 06:42:01 +00:00
Wojciech Macek
70bb600a0a PowerNV: move LPCR and LPID altering to cpudep_ap_early_bootstrap
It turns out that under some circumstances we can get DSI or DSE before we set
LPCR and LPID so we should set it as early as possible.

Authored by:           Patryk Duda <pdk@semihalf.com>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-01-29 09:27:02 +00:00
Wojciech Macek
f0393bbf34 PPC64: use hwref instead of cpuid
On CHRP and PowerNV, use the interrupt server number in the cpuref and pcpu
hwref field instead of the device-tree phandle and make the CPU IDs reported
to the scheduler dense and with the BSP at 0.

Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D14011
2018-01-29 09:15:38 +00:00
Wojciech Macek
b74fb1e713 PPC64: cleanup APs startup routines
Cleaning up AP startup routines. This is a mix of changes
required to make PowerNV running and to modify the code
to be more robust. Previously, some races were seen if more
than 90CPUs were online.

Authored by:           Patryk Duda <pdk@semihalf.com>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D14026
2018-01-29 08:10:03 +00:00
Nathan Whitehorn
eb1baf72ae Remove hard-coded trap-handling logic involving the segmented memory model
used with hashed page tables on AIM and place it into a new, modular pmap
function called pmap_decode_kernel_ptr(). This function is the inverse
of pmap_map_user_ptr(). With POWER9 radix tables, which mapping to use
becomes more complex than just AIM/BOOKE and it is best to have it in
the same place as pmap_map_user_ptr().

Reviewed by:	jhibbits
2018-01-29 04:33:41 +00:00
Warner Losh
d6b6639713 Add ISA PNP tables to ISA drivers. Fix a few incidental comments.
ACPI ISA PBP tables not tagged, there's bigger issues with them.
2018-01-29 00:22:30 +00:00
Nathan Whitehorn
21776ff850 Remove some unused AIM register declarations that existed to support some
CPUs we have never run on. As a side-effect, removes some #ifdef AIM/#else.
2018-01-28 21:30:57 +00:00
Justin Hibbits
0a3ef103a3 Start building modules for QORIQ64
There's no reason not to build modules for 64-bit QorIQ devices.  This
config has evolved to be analogous to the AIM GENERIC64 kernel, so will grow
to match it in more ways as well.
2018-01-28 20:35:48 +00:00
Justin Hibbits
a72b951348 Consolidate trap instruction checks to a single function
Summary:
Rather than duplicating the checks for programmatic traps all over the code, put
it all in one function.  This helps to remove some of the #ifdefs between AIM
and Book-E.

Reviewed By: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D14082
2018-01-28 19:18:40 +00:00
Wojciech Macek
919736d252 PPC: Add place for NULL chars in intrnames
In a corner case we could fall into OOB error.

Authored by:           Patryk Duda <pdk@semihalf.com>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-01-26 09:38:40 +00:00
Nathan Whitehorn
e649493c6d Avoid all SLB operations in trap handling if the process is not using a
software-managed SLB.
2018-01-25 18:10:33 +00:00
Nathan Whitehorn
3e1b393a51 Treat DSE exceptions like DSI exceptions when generating signinfo.
Both can generate SIGSEGV, but DSEs would have put the wrong address
into the siginfo structure when the signal was delivered.

MFC after:	1 week
2018-01-25 18:09:26 +00:00
Wojciech Macek
68c2d255bc PPC: Add KASSERT in intrcnt_add which checks for buffer overflow
Authored by:           Patryk Duda <pdk@semihalf.com>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-01-24 12:01:32 +00:00
Wojciech Macek
a81a290f34 PowerNV: send MSI_EOI always after MSI unmask
MSI/MSI-x interrupts are edge-triggered. If an interrupt
arrives when IRQ line is masked, it will be lost and will
never recover. Perform MSI_EOI always after unmask to give
a chance for PHB/XICS to send an interrupt again if MSI/MSI-x
pending bit is set in MSI/MSI-x BAR space.

Submitted by:          Wojciech Macek <wma@semihalf.org>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-01-23 08:07:00 +00:00
Justin Hibbits
1a55e1d038 Fix 64-bit booke kernel builds after the ldscript changes
Commits r326203 and r326978 broke 64-bit booke kernels by introducing a 1MB
zero-pad between the ELF header and the start of the kernel.  This didn't
cause a build failure, but caused kernels to need to be loaded into memory
1MB lower, which could easily break scripts expecting previous behavior.
This change matches the similar change made to AIM in r327358.
2018-01-23 02:52:12 +00:00
Pedro F. Giffuni
ac2fffa4b7 Revert r327828, r327949, r327953, r328016-r328026, r328041:
Uses of mallocarray(9).

The use of mallocarray(9) has rocketed the required swap to build FreeBSD.
This is likely caused by the allocation size attributes which put extra pressure
on the compiler.

Given that most of these checks are superfluous we have to choose better
where to use mallocarray(9). We still have more uses of mallocarray(9) but
hopefully this is enough to bring swap usage to a reasonable level.

Reported by:	wosch
PR:		225197
2018-01-21 15:42:36 +00:00
Nathan Whitehorn
7790e46cf0 On AIM systems without a software-managed SLB, such as POWER9 systems using
either hardware segment tables or radix-tree-based page tables, do not try
to install SLB entries at trap boundaries.
2018-01-19 22:19:50 +00:00
Nathan Whitehorn
9a8196ce19 Remove SFBUF_OPTIONAL_DIRECT_MAP and such hacks, replacing them across the
kernel by PHYS_TO_DMAP() as previously present on amd64, arm64, riscv, and
powerpc64. This introduces a new MI macro (PMAP_HAS_DMAP) that can be
evaluated at runtime to determine if the architecture has a direct map;
if it does not (or does) unconditionally and PMAP_HAS_DMAP is either 0 or
1, the compiler can remove the conditional logic.

As part of this, implement PHYS_TO_DMAP() on sparc64 and mips64, which had
similar things but spelled differently. 32-bit MIPS has a partial direct-map
that maps poorly to this concept and is unchanged.

Reviewed by:		kib
Suggestions from:	marius, alc, kib
Runtime tested on:	amd64, powerpc64, powerpc, mips64
2018-01-19 17:46:31 +00:00
Wojciech Macek
720212d30d Call platform_smp_ap_init before decr_ap_init
In platform_smp_ap_init we are doing some crucial code (eg. set LPCR register)
    which have influence over further execution.

    Practiculary in PowerNV platform we have experienced Data Storage Interrupt
    before we set apropriate LPCR. It caused code execution from location which was
    legal in bootloader (petitboot based on linux) but illegal in FreeBSD
2018-01-18 08:34:20 +00:00
Wojciech Macek
054a090d49 PPC64: fix TOC behavior on process initialization
Set stack pointer to correct value after thread's stack pointer restore

Restoring new thread's stack pointer caused stack corruption because
restored stack pointer didn't point to callee (cpu_switch) stack frame but
caller stack frame.

As a result we had mysterious errors in caller function (sched_switch).

Solution: simply set stack pointer to correct value

Also, initialize TOC to a valid pointer once the thread is being
created.

Created by:            Patryk Duda <pdk@semihalf.com>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Reviewed by:           nwhitehorn
Differential revision: https://reviews.freebsd.org/D13947
Sponsored by:          QCM Technologies
2018-01-18 07:42:51 +00:00
Wojciech Macek
8a0112ca65 PPC: machdep, zero BSS always but BookE
Zero BSS always. The only case when this operation is
ommitted is when booting on BookE.

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Reviewed by:           imp, nwhitehorn
Differential revision: https://reviews.freebsd.org/D13948
Sponsored by:          QCM Technologies
2018-01-18 07:41:04 +00:00
Wojciech Macek
e70f868f17 PPC64: add AHCI back to GENERIC64 2018-01-18 06:28:21 +00:00
Wojciech Macek
f7b509a109 PPC64: implement missing busdma ops
Add missing little-endian 64-bit read and write. Since there
is no direct ASM opcode for this, perform byte swap if
necessary.

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
2018-01-17 09:45:18 +00:00
Wojciech Macek
91769f6452 PPC64: fix copyinout ranges
Use current userspace address for segment mapping. Previously,
there was a bug which made the funciton constantly using the userspace
base address which could cause data integrity issues.

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
2018-01-17 09:36:48 +00:00
Wojciech Macek
55b823e52a PPC64: add CXGBE and remove AHCI from GENERIC64
Add CXGBE driver which is required for PowerNV system.
Also, remove AHCI which does not work in BigEndian.

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
2018-01-17 09:33:16 +00:00
Wojciech Macek
6005affb74 PowerNV: workaround console on OPAL 5.4
FreeBSD prints text char-by-char, which is not what OPAL
is designed to. Poll events more frequently to avoid buffer
overflow and loosing data.

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
2018-01-17 08:01:51 +00:00
Wojciech Macek
5c3e53ef19 PowerNV: make PowerNV PCIe working on a real hardware
Fixes:
- map all devices to PE0
- use 1:1 TCE mapping
- provide the same TCE mapping for all PEs (not only PE0)
- add TCE reset and alignment (required by OPAL)

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
2018-01-17 07:39:11 +00:00
Wojciech Macek
8fc8068eba PowerNV: XICS support for PowerNV/OPAL
Make XICS to be OPAL-aware.

Created by:            Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Sponsored by:          FreeBSD Foundation
2018-01-16 06:24:19 +00:00
Justin Hibbits
e64428edf7 Make fsl_sata driver work on P1022
P1022 SATA controller may set the wrong CCR bit for a command completion.
This would previously cause an interrupt storm.  Solve this by marking all
commands complete, and letting the end_transaction deal with the successes.
Causes no problems on P5020.

While here, fix a minor bug in collision detection.  The Freescale SATA
controller only has 16 slots, not 32.
2018-01-16 04:50:23 +00:00
Pedro F. Giffuni
6d5bc1bcab powerpc: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these ire likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.

X-Differential revision: https://reviews.freebsd.org/D13837
2018-01-15 21:10:40 +00:00
Nathan Whitehorn
fc8ea4be2a Install the SLB miss trap-handling code in the SLB-based MMU driver set up,
to which it is specific, rather than in the generic AIM startup code. This
will be required to support the radix-table-based MMU introduced with POWER9.
2018-01-15 16:08:34 +00:00
Nathan Whitehorn
04329fa708 Move the pmap-specific code in copyinout.c that gets pointers to userland
buffers into a new pmap-module function pmap_map_user_ptr() that can
be implemented by the respective modules. This is required to implement
non-segment-based AIM-ish MMU systems such as the radix-tree page tables
introduced by POWER ISA 3.0 and present on POWER9.

Reviewed by:	jhibbits
2018-01-15 06:46:33 +00:00
Nathan Whitehorn
68b9c019aa Document places we assume that physical memory is direct-mapped at zero by
using a new macro PHYS_TO_DMAP, which deliberately has the same name as the
equivalent macro on amd64. This also sets the stage for moving the direct
map to another base address.
2018-01-13 23:14:53 +00:00
Justin Hibbits
4a20766452 Include only the headers needed
The extra headers came through evolution of the file.
2018-01-13 21:10:42 +00:00
Justin Hibbits
8e14018389 Add SPDX identifier to header
Reported by:	pfg
2018-01-13 17:25:48 +00:00
Nathan Whitehorn
222393d5ca Chase removal of FDT fixup code on PowerPC in r327907. 2018-01-13 03:09:05 +00:00
Justin Hibbits
e9f96ff457 Enable L2 cache on supported PowerQUICC and QorIQ platforms
Some PowerQUICC and QorIQ platforms have a L2 cache managed via the
memory-mapped configuration registers, and appear as a node in the device
tree.  This adds basic support to enable the cache.
2018-01-13 01:36:37 +00:00
Jeff Roberson
6f4acaf4c9 Add support for NUMA domains to bus dma tags. This causes all memory
allocated with a tag to come from the specified domain if it meets the
other constraints provided by the tag.  Automatically create a tag at
the root of each bus specifying the domain local to that bus if
available.

Reviewed by:	jhb, kib
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13545
2018-01-12 23:34:16 +00:00
Jeff Roberson
ab3185d15e Implement NUMA support in uma(9) and malloc(9). Allocations from specific
domains can be done by the _domain() API variants.  UMA also supports a
first-touch policy via the NUMA zone flag.

The slab layer is now segregated by VM domains and is precise.  It handles
iteration for round-robin directly.  The per-cpu cache layer remains
a mix of domains according to where memory is allocated and freed.  Well
behaved clients can achieve perfect locality with no performance penalty.

The direct domain allocation functions have to visit the slab layer and
so require per-zone locks which come at some expense.

Reviewed by:	Attilio (a slightly older version)
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
2018-01-12 23:25:05 +00:00
Wojciech Macek
504d9b6029 PowerNV: update OPAL driver
Update OPAL driver with:
- better console support
- proper AP configuration
- enhanced IRQ/OFW mapping
- RTC support

Created by:            Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Sponsored by:          FreeBSD Foundation
2018-01-12 12:14:52 +00:00
Wojciech Macek
ac9b43252a PowerNV: initial support for PCIe host controller
Provide initial support for PCIe host controller as
well as for IOMMU mapping. This commit allows proper
bus enumeration, but does not guarantee DMA operations
are working.

Created by:            Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Sponsored by:          FreeBSD Foundation
2018-01-12 07:55:49 +00:00
Wojciech Macek
fc1689021f PowerNV: add buffer for OPAL console
Avoid the lock in vtophys() by providing a static direct-mapped
spinlock- protected output buffer to use when the console driver
cannot acquire locks for some reason. This allows the idle thread
to use printf() (e.g. the SMP startup messages) without crashing
the kernel.

Created by:            Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by:          Wojciech Macek <wma@freebsd.org>
Sponsored by:          FreeBSD Foundation
2018-01-11 09:42:24 +00:00
Wojciech Macek
c024897601 PowerNV: set LPCR[LPES] correctly
Make sure to set LPCR[LPES] so that external interrupts set SRR0 and SRR1
instead of HSRR0 and HSRR1. Without this, external interrupt handlers would
get the wrong MSR value when executing, causing eventual madness.

Created by:            Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by:          Wojciech Macek <wma@freebsd.org>
Sponsored by:          FreeBSD Foundation
2018-01-11 09:39:38 +00:00
Wojciech Macek
01d7bda7b7 PowerNV: correctly start secondary CPUs
Fix AP startup, which was broken.

Created by:            Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by:          Wojciech Macek <wma@freebsd.org>
Sponsored by:          FreeBSD Foundation
2018-01-11 09:34:33 +00:00
Wojciech Macek
32d1354a39 PowerNV: add reset, poweroff, OPAL console
Add basic power control (reset, power off) and bind
ttyuX to opal console so that init will start login there.

Created by:            Nathan Whitehorn <nw@freebsd.org>
Submitted by:          Wojciech Macek <wma@freebsd.org>
Sponsored by:          FreeBSD Foundation
2018-01-11 09:26:28 +00:00
Wojciech Macek
fb3855e0e7 PowerNV: initial support for OPAL
OPAL is a dedicated firmware acting as a hypervisor.
Add generic functions to provide all access.

Created by:            Nathan Whitehorn <nw@freebsd.org>
Submitted by:          Wojciech Macek <wma@freebsd.org>
2018-01-11 07:40:06 +00:00
Landon J. Fuller
a1df0d9592 Fix minor locking issues in the Power Mac Uninorth PCI bridge driver.
- Call resource_int_value() once during attach, rather than within the
  pci_(read|write)_config() code path; this avoids taking a blocking mutex
  to read kenv variables.

- Use a spin lock to protect non-atomic config space accesses; this matches
  the behavior of Darwin's AppleMacRiscPCI driver.

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D13839
2018-01-10 22:19:11 +00:00
Nathan Whitehorn
566a135bd5 Add XHCI support to powerpc64 GENERIC. This is useful to get input devices
supported on newer POWER hardware and in graphical VMs run on the same,
which are typically XHCI-only. The 32-bit GENERIC kernel, which
does not run on hardware made in the last decade and is unlikely to
encounter XHCI devices, is left unchanged.

PR:		kern/224940
Submitted by:	Gustavo Romero
MFC after:	1 week
2018-01-09 19:41:10 +00:00
Nathan Whitehorn
09f07b0017 Revert r327360, which can cause boot problems on high-CPU-count (>60)
POWER8 and POWER9 systems, pending further analysis.

PR:		224841
2018-01-04 23:07:51 +00:00
Andreas Tobler
7e792cb8f5 The recent bump of MAXDSIZ made 32-bit binary execution on 64-bit powerpc fail.
The data segement was too big.

Add a fix-up function like on ia32 for MAXDSIZ.

While here, bring also the MAXSSIZ closer to amd64 and add an equal fix-up
function for MAXSSIZ.

Reviewed by:	jhibbits@
Obtained from:  jhibbits@
Differential Revision:	https://reviews.freebsd.org/D13753
2018-01-03 20:20:43 +00:00
Nathan Whitehorn
67530f82dd Fix reversed endianness that crept in at some point. Blue is now blue
instead of pink.

MFC after:	3 days
2018-01-02 03:59:46 +00:00
Nathan Whitehorn
3972f4c1d4 Remove PIR from PCPU data. It has an implementation-defined meaning that
is of limited utility outside of platform-specific code and can vary
at runtime when running as a hypervisor guest, so does not even have the
virtue of being a static identifier.

Reviewed by:	jhibbits
2017-12-31 20:23:39 +00:00
Nathan Whitehorn
4e05ac247c Fix 32-bit build. 2017-12-31 20:20:55 +00:00
Nathan Whitehorn
f81dfc7f6b Make newer binutils happy by using a bl-type branch instead of b, which
displeases it for some reason. LR is not relevant in this code, so just
do what it wants.
2017-12-31 20:10:08 +00:00
Nathan Whitehorn
ec75f647cc Provide relative, as well as absolute, addresses in trap panic panics. This
makes it easier to cross-correlate them with instruction listings without
worrying about where the kernel was relocated to.

MFC after:	1 week
2017-12-31 20:08:16 +00:00
Colin Percival
d5d7606c0c Use the TSLOG framework to record entry/exit timestamps for DELAY and
_vprintf; these functions are called in many places and can contribute
meaningfully to the total time spent booting.
2017-12-31 09:24:41 +00:00
Nathan Whitehorn
5261ac0eda Use data from the boot loader to pick the appropriate output graphics mode
instead of hard-coding a default. This information is passed implicitly by
the PS3 firmware and can be relied upon. Also adjust the default mode, if
somehow firmware doesn't pass one, to 1920x1080 from 720x480 since it is
2017.

MFC after:	2 weeks
2017-12-31 06:10:07 +00:00
Nathan Whitehorn
a891d21aac Make sure the first instruction of the low-memory spinloop is in the
cacheline being invalidated.

MFC after:	1 month
2017-12-31 05:38:19 +00:00
Nathan Whitehorn
3fca788024 Remove logic for early console with loader.ps3 now that loader.ps3 is dead. 2017-12-30 20:25:33 +00:00
Nathan Whitehorn
ba06dbb874 Change the way SMP startup works to match the new multi-AP features in
locore64.S introduced in r327358.

MFC after:	3 weeks
2017-12-30 20:24:33 +00:00
Nathan Whitehorn
f9d6e0a5d0 Enhance the CHRP/pSeries platform layer:
- Densely number CPUs to avoid systems with CPUs with very high ID numbers
- Always have the BSP be CPU 0 to avoid remnant brokenness with non-0 BSPs
  in other parts of the kernel.
- Improve parsing of the device tree CPU listings on SMT systems.
- Allow reboot via RTAS as well as OF for pSeries systems booted by FDT
  without functioning Open Firmware.

Obtained from:	projects/powernv
MFC after:	3 weeks
2017-12-29 21:09:17 +00:00
Nathan Whitehorn
70f654991a Add support for 64-bit PowerPC kernels to be directly loaded by kexec, which
is used as the bootloader on a number of PPC64 platforms. This involves the
following pieces:
- Making the first instruction a valid kernel entry point, since kexec
  ignores the ELF entry value. This requires a separate section and linker
  magic to prevent the linker from filling the beginning of the section
  with stubs.
- Adding an entry point at 0x60 past the first instruction for systems
  lacking firmware CPU shutdown support (notably PS3).
- Linker script changes to support the above.

MFC after:	1 month
2017-12-29 20:30:10 +00:00
Nathan Whitehorn
8469e0fe35 Maintain alignment of in-code 64-bit quantities by design rather than luck.
If these are not aligned, the linker has to emit a different type of
relocation that the early boot self-relocation code cannot handle, even
in principle, resulting in them being set to zero and the kernel crashing.

MFC after:	1 week
2017-12-29 20:25:15 +00:00
Nathan Whitehorn
2ad331874e Remove ELF note for Open Firmware. It is marked optional in a single 1996
draft of a never-finalized standard (CHRP) and is irrelevant in practice
on FreeBSD since we load the kernel with loader(8) on Open Firmware
platforms anyway. Moreover, loader(8), which is directly loaded by Open
Firmware, has never had an equivalent note.

MFC after:	2 weeks
2017-12-28 23:49:53 +00:00
Eitan Adler
caa7e52f3f kernel: Fix several typos and minor errors
- duplicate words
- typos
- references to old versions of FreeBSD

Reviewed by:	imp, benno
2017-12-27 03:23:21 +00:00
Justin Hibbits
87879ba805 Increase default MAXDSIZ to 32G on powerpc64
Linking LLVM now seems to require more than 1GB data size, so increase the
default to 32G, which matches amd64.

Reviewed by:	nwhitehorn
2017-12-20 16:49:45 +00:00
Nathan Whitehorn
d6716aa2af The highest-order bit of the bootloader cookie is 1, with the result that
the 32-bit cookie can be sign-extended on its way out of the loader and
through Open Firmware. If sign-extended, the in-kernel check of its value
would fail on 64-bit systems, resulting in a mountroot prompt. Solve this
by telling the kernel to ignore the high-order bits.

PR:		kern/224437
Submitted by:	Gustavo Romero
2017-12-19 16:45:40 +00:00
Konstantin Belousov
30d4f9e888 Add atomic_load(9) and atomic_store(9) operations.
They provide relaxed-ordered atomic access semantic.  Due to the
FreeBSD memory model, the operations are syntaxical wrappers around
the volatile accesses.  The volatile qualifier is used to ensure that
the access not optimized out and in turn depends on the volatile
semantic as implemented by supported compilers.

The motivation for adding the operation is to help people coming from
other systems or knowing the C11/C++ standards where atomics have
special type and require use of the special access operations.  It is
still the case that FreeBSD requires plain load and stores of aligned
integer types to be atomic.

Suggested by:	jhb
Reviewed by:	alc, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D13534
2017-12-19 09:59:20 +00:00
Justin Hibbits
7cd4e55c43 Handle the Facility Unavailable exception as a SIGILL
Currently Facility Unavailable is absent and once an application
tries to use or access a register from a feature disabled in the
CPU it causes a kernel panic.

A simple test-case is:

int main() { asm volatile ("tbegin.;"); }

which will use TM (Hardware Transactional Memory) feature which
is not supported by the kernel and so will trigger the following
kernel panic:

----

fatal user trap:

    exception       = 0xf60 (unknown)
    srr0            = 0x10000890
    srr1            = 0x800000000000f032
    lr              = 0x100004e4
    curthread       = 0x5f93000
    pid = 1021, comm = htm

panic: unknown trap
cpuid = 40
KDB: stack backtrace:
Uptime: 3m18s
Dumping 10 MB (3 chunks)
    chunk 0: 11MB (2648 pages) ... ok
    chunk 1: 1MB (24 pages) ... ok
    chunk 2: 1MB (2 pages)panic: IOMMU mapping error: -4

cpuid = 40
Uptime: 3m18s

----

Since Hardware Transactional Memory is not yet supported by FreeBSD, treat
this as an illegal instruction.

PR:		224350
Submitted by:	Gustavo Romero <gromero_AT_ibm_DOT_com>
MFC after:	2 weeks
2017-12-15 04:11:20 +00:00
Justin Hibbits
9ee02cd6f8 Add identifier for POWER9 CPU to CPU list
Without the identifier in the list booting FreeBSD results in printing the
following (from a PowerKVM boot):

cpu0: Unknown PowerPC CPU revision 0x1201, 2550.00 MHz

For now, add the same feature list as POWER8.  As new capabilities are added to
support POWER9 specific features, they will be added to this.

PR:		224344
Submitted by:	Breno Leitao <breno_DOT_leitao_AT_gmail_DOT_com>
2017-12-14 20:01:04 +00:00
Justin Hibbits
bf1b92967f Decode some PowerPC trap registers
Decode on Book-E:
* ESR (Exception Syndrome Register)
* MCSR (Machine Check Status Register)

On AIM:
* MSSSR (Memory Subsystem Status Register)

Makes it easier to tell at a glance the type of trap and machine check
conditions now.
2017-12-12 03:16:10 +00:00
Mark Johnston
5bab623438 Pass the trap frame to fasttrap hooks.
The DTrace fasttrap entry points expect a struct reg containing the
register values of the calling thread. Perform the conversion in
fasttrap rather than in the trap handler: this reduces the number of
ifdefs and avoids wasting stack space for traps that don't involve
DTrace.

MFC after:	2 weeks
2017-12-11 19:21:39 +00:00
Justin Hibbits
713e844971 Retrieve the page outside of holding locks
pmap_track_page() only works with physical memory pages, which have a
constant vm_page_t address.  Microoptimize pmap_track_page() to perform one
less operation under the lock.
2017-12-10 04:43:27 +00:00
Justin Hibbits
94a9d7c3b9 Remove PTE VA mappings for tracked pages in 64-bit mode
This was done in 32-bit mode, but not duplicated when 64-bit mode was
brought in.  Without this, stale mappings can be left, leading to odd
crashes when the wrong VA is checked in XX_PhysToVirt() (dpaa(4)).
2017-12-08 03:49:53 +00:00
Bruce Evans
fb3cc1c37d Move instantiation of msgbufp from 9 MD files to subr_prf.c.
This variable should be pure MI except possibly for reading it in MD
dump routines.  Its initialization was pure MD in 4.4BSD, but FreeBSD
changed this in r36441 in 1998.  There were many imperfections in
r36441.  This commit fixes only a small one, to simplify fixing the
others 1 arch at a time.  (r47678 added support for
special/early/multiple message buffer initialization which I want in
a more general form, but this was too fragile to use because hacking
on the msgbufp global corrupted it, and was only used for 5 hours in
-current...)
2017-12-07 07:55:38 +00:00
Justin Hibbits
89c3a53299 Override memattr for mmap on the Freescale DIU driver
The Display Interface Unit (DIU) uses main memory for the framebuffer, which
is already mapped as cache coherent physical memory.  Prevent mmap() from
using its own attributes which may otherwise conflict.
2017-12-02 01:42:07 +00:00
Pedro F. Giffuni
796df753f4 SPDX: Consider code from Carnegie-Mellon University.
Interesting cases, most likely from CMU Mach sources.
2017-11-30 15:48:35 +00:00