3284 Commits

Author SHA1 Message Date
markj
84cd531f96 Remove page locking from pmap_mincore().
After r352110 the page lock no longer protects a page's identity, so
there is no purpose in locking the page in pmap_mincore().  Instead,
if vm.mincore_mapped is set to the non-default value of 0, re-lookup
the page after acquiring its object lock, which holds the page's
identity stable.

The change removes the last callers of vm_page_pa_tryrelock(), so
remove it.

Reviewed by:	kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21823
2019-10-16 22:03:27 +00:00
markj
957cc17627 Clear PGA_WRITEABLE in moea_pvo_remove().
moea_pvo_remove() might remove the last mapping of a page, in which case
it is clearly no longer writeable.  This can happen via pmap_remove(),
or when a CoW fault removes the last mapping of the old page.

Reported and tested by:	bdragon
Reviewed by:	alc, bdragon, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22044
2019-10-16 15:50:12 +00:00
kib
ae3c37c2e4 Fix assert in PowerPC pmaps after introduction of object busy.
The VM_PAGE_OBJECT_BUSY_ASSERT() in pmap_enter() implementation should
be only asserted when the code is executed as result of pmap_enter(),
not when the same code is entered from e.g. pmap_enter_quick().  This
is relevant for all PowerPC pmap variants, because mmu_*_enter() is
used as the backend, and assert is located there.

Add a PowerPC private pmap_enter() PMAP_ENTER_QUICK_LOCKED flag to
indicate that the call is not from pmap_enter().  For non-quick-locked
calls, assert that the object is locked.

Reported and tested by:	bdragon
Reviewed by:	alc, bdragon, markj
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D22041
2019-10-16 07:09:15 +00:00
jhibbits
1537de8003 powerpc/mpc85xx: Fix function type for fsl_pcib_error_intr()
Since it's only called as an interrupt handler, fsl_pcib_eror_intr() should just
match the driver_intr_t type.

Reported by:	bdragon
2019-10-16 03:03:59 +00:00
jhibbits
cf6db46977 powerpc: Add AmigaOne platform, a subclass of MPC85xx
Summary:
The AmigaOne platform, encompassing the X5000 and A1222 at this time, is
based on the mpc85xx platform, but includes some things not listed in
the device tree.  Some custom devices, like CPLD, could be added to the
device tree with an overlay, or other means.  However, some cannot
easily be done, such as the power button interrupt.

The directory will also become a location to add AmigaOne platform drivers,
such as the aforementioned CPLD, and its children.

Reviewed by:	bdragon
Differential Revision:	https://reviews.freebsd.org/D21829
2019-10-16 00:38:50 +00:00
jeff
50eb2e4288 (6/6) Convert pmap to expect busy in write related operations now that all
callers hold it.

This simplifies pmap code and removes a dependency on the object lock.

Reviewed by:    kib, markj
Tested by:      pho
Sponsored by:   Netflix, Intel
Differential Revision:	https://reviews.freebsd.org/D21596
2019-10-15 03:51:46 +00:00
jeff
0a6e7a4266 (3/6) Add a shared object busy synchronization mechanism that blocks new page
busy acquires while held.

This allows code that would need to acquire and release a very large number
of page busy locks to use the old mechanism where busy is only checked and
not held.  This comes at the cost of false positives but never false
negatives which the single consumer, vm_fault_soft_fast(), handles.

Reviewed by:    kib
Tested by:      pho
Sponsored by:   Netflix, Intel
Differential Revision:	https://reviews.freebsd.org/D21592
2019-10-15 03:41:36 +00:00
jhibbits
4ec246542a powerpc/atomic: Fix atomic_cmpset_rel()
Need a release barrier, not an acquire barrier, else bad things happen.
2019-10-15 03:37:21 +00:00
luporl
cad5513a14 Fix powerpc/powerpcspe builds
Revision 353489 introduced some new function calls in common powerpc code,
but these must be called only on powerpc64.
2019-10-14 19:06:17 +00:00
luporl
57d28447c8 [PPC64] Initial kernel minidump implementation
Based on POWER9BSD implementation, with all POWER9 specific code removed and
addition of new methods in PPC64 MMU interface, to isolate platform specific
code. Currently, the new methods are implemented on pseries and PowerNV
(D21643).

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D21551
2019-10-14 13:04:04 +00:00
jhibbits
0c685cb80e powerpc/pmap: Tighten condition for removing tracked pages in Book-E pmap
There are cases where there's no vm_page_t structure for a given physical
address, such as the CCSR.  In this case, trying to obtain the
md.page_tracked struct member would lead to a NULL dereference, and panic.
Tighten this up by checking for kernel_pmap AND that the page structure
actually exists before dereferencing.  The flag can only be set when it's
tracked in the kernel pmap anyway.

MFC after:	3 weeks
2019-10-13 19:33:00 +00:00
jhibbits
a6677a026c powerpc: Implement atomic_(f)cmpset_ for short and char
|
This adds two implementations for each atomic_fcmpset_ and atomic_cmpset_
short and char functions, selectable at compile time for the target
architecture.  By default, it uses a generic shift-and-mask to perform atomic
updates to sub-components of 32-bit words from <sys/_atomic_subword.h>.
However, if ISA_206_ATOMICS is defined it uses the ll/sc instructions for
halfword and bytes, introduced in PowerISA 2.06.  These instructions are
supported by all IBM processors from POWER7 on, as well as the Freescale/NXP
e6500 core.  Although the e5500 and e500mc both implement PowerISA 2.06 they
do not implement these instructions.

As part of this, clean up the atomic_(f)cmpset_acq and _rel wrappers, by
using macros to reduce code duplication.

ISA_206_ATOMICS requires clang or newer binutils (2.20 or later).

Differential Revision:	https://reviews.freebsd.org/D21682
2019-10-08 01:36:34 +00:00
jhibbits
90ce64d1c6 powerpc64/pmap: Fix release order to match lock order in moea64_enter()
Page PV lock is always taken first, so should be released last.  This also
(trivially) shortens the hold time of the pmap lock.

Submitted by:	mjg
2019-10-07 02:36:42 +00:00
jhibbits
a5519d9dd3 powerpc/pmap64: Properly parenthesize PV_LOCK_COUNT macros
As pointed out by mjg, without the parentheses the calculations done against
these macros are incorrect, resulting in only 1/3 of locks being used.

Reported by:	mjg
2019-10-06 19:11:01 +00:00
jhibbits
3def8cb160 powerpc/booke64: Align initial stack setting to match that of aim64's
Clang9/LLD9 appears to get quite confused with the instruction stream used
to obtain the tmpstack pointer, almost as though it thinks this is a C
function, so tries to optimize it.  Since the AIM64 method doesn't use the
TOC to obtain the tmpstack, just follow that model, and lld won't get
confused.

Reported by:	bdragon
MFC after:	2 weeks
2019-09-28 03:33:07 +00:00
kib
957270782d Improve MD page fault handlers.
Centralize calculation of signal and ucode delivered on unhandled page
fault in new function vm_fault_trap().  MD trap_pfault() now almost
always uses the signal numbers and error codes calculated in
consistent MI way.

This introduces the protection fault compatibility sysctls to all
non-x86 architectures which did not have that bug, but apparently they
were already much more wrong in selecting delivered signals on
protection violations.

Change the delivered signal for accesses to mapped area after the
backing object was truncated.  According to POSIX description for
mmap(2):
   The system shall always zero-fill any partial page at the end of an
   object. Further, the system shall never write out any modified
   portions of the last page of an object which are beyond its
   end. References within the address range starting at pa and
   continuing for len bytes to whole pages following the end of an
   object shall result in delivery of a SIGBUS signal.

   An implementation may generate SIGBUS signals when a reference
   would cause an error in the mapped object, such as out-of-space
   condition.
Adjust according to the description, keeping the existing
compatibility code for SIGSEGV/SIGBUS on protection failures.

For situations where kernel cannot handle page fault due to resource
limit enforcement, SIGBUS with a new error code BUS_OBJERR is
delivered.  Also, provide a new error code SEGV_PKUERR for SIGSEGV on
amd64 due to protection key access violation.

vm_fault_hold() is renamed to vm_fault().  Fixed some nits in
trap_pfault()s like mis-interpreting Mach errors as errnos.  Removed
unneeded truncations of the fault addresses reported by hardware.

PR:	211924
Reviewed by:	alc
Discussed with:	jilles, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D21566
2019-09-27 18:43:36 +00:00
markj
fbe7e9c7e4 Complete the removal of the "wire_count" field from struct vm_page.
Convert all remaining references to that field to "ref_count" and update
comments accordingly.  No functional change intended.

Reviewed by:	alc, kib
Sponsored by:	Intel, Netflix
Differential Revision:	https://reviews.freebsd.org/D21768
2019-09-25 16:11:35 +00:00
jhibbits
53af0efd39 powerpc/atomic: Follow recommendations on atomic primitive comparisons
Both IBM and Freescale programming examples presume the cmpset operands will
favor equal, and pessimize the non-equal case instead.  Do the same for
atomic_cmpset_* and atomic_fcmpset_*.  This slightly pessimizes the failure
case, in favor of the success case.

MFC after:	3 weeks
2019-09-25 01:39:58 +00:00
jhibbits
c9cf854b0a powerpc: Allocate DPCPU block from domain-local memory
This should improve NUMA scalability a little, by binding to the CPU's NUMA
domain.  This matches what's done on amd64.
2019-09-25 01:23:08 +00:00
markj
3616760326 Revert r352406, which contained changes I didn't intend to commit. 2019-09-16 15:04:45 +00:00
markj
543f9366b9 Fix a couple of nits in r352110.
- Remove a dead variable from the amd64 pmap_extract_and_hold().
- Fix grammar in the vm_page_wire man page.

Reported by:	alc
Reviewed by:	alc, kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21639
2019-09-16 15:03:12 +00:00
jhibbits
476a473cab powerpc64/powernv: Add opal NVRAM driver for PowerNV systems
Add a very basic NVRAM driver for OPAL which can be used by the IBM
powerpc-utils nvram utility, not to be confused with the base nvram utility,
which only operates on powermac_nvram.

The IBM utility handles all partitions itself, treating the nvram device as
a plain store.

An alternative would be to manage partitions in the kernel, and augment the
base nvram utility to deal with different backing stores, but that
complicates the driver significantly.  Instead, present the same interface
IBM's utlity expects, and we get the usage for free.

Tested by:	bdragon
2019-09-14 03:30:34 +00:00
markj
ccbfa8304f Change synchonization rules for vm_page reference counting.
There are several mechanisms by which a vm_page reference is held,
preventing the page from being freed back to the page allocator.  In
particular, holding the page's object lock is sufficient to prevent the
page from being freed; holding the busy lock or a wiring is sufficent as
well.  These references are protected by the page lock, which must
therefore be acquired for many per-page operations.  This results in
false sharing since the page locks are external to the vm_page
structures themselves and each lock protects multiple structures.

Transition to using an atomically updated per-page reference counter.
The object's reference is counted using a flag bit in the counter.  A
second flag bit is used to atomically block new references via
pmap_extract_and_hold() while removing managed mappings of a page.
Thus, the reference count of a page is guaranteed not to increase if the
page is unbusied, unmapped, and the object's write lock is held.  As
a consequence of this, the page lock no longer protects a page's
identity; operations which move pages between objects are now
synchronized solely by the objects' locks.

The vm_page_wire() and vm_page_unwire() KPIs are changed.  The former
requires that either the object lock or the busy lock is held.  The
latter no longer has a return value and may free the page if it releases
the last reference to that page.  vm_page_unwire_noq() behaves the same
as before; the caller is responsible for checking its return value and
freeing or enqueuing the page as appropriate.  vm_page_wire_mapped() is
introduced for use in pmap_extract_and_hold().  It fails if the page is
concurrently being unmapped, typically triggering a fallback to the
fault handler.  vm_page_wire() no longer requires the page lock and
vm_page_unwire() now internally acquires the page lock when releasing
the last wiring of a page (since the page lock still protects a page's
queue state).  In particular, synchronization details are no longer
leaked into the caller.

The change excises the page lock from several frequently executed code
paths.  In particular, vm_object_terminate() no longer bounces between
page locks as it releases an object's pages, and direct I/O and
sendfile(SF_NOCACHE) completions no longer require the page lock.  In
these latter cases we now get linear scalability in the common scenario
where different threads are operating on different files.

__FreeBSD_version is bumped.  The DRM ports have been updated to
accomodate the KPI changes.

Reviewed by:	jeff (earlier version)
Tested by:	gallatin (earlier version), pho
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20486
2019-09-09 21:32:42 +00:00
jhibbits
7d61399d06 powerpc64/pmap: Fix a WITNESS error in alloc_pvo_entry()
We only call alloc_pvo_entry() with M_WAITOK from one location.  However,
this can be called while holding nonsleepable locks.  Rather than passing
M_WAITOK down, use vm_wait() and loop.
2019-09-06 03:02:12 +00:00
jhibbits
508197caf8 powerpc64/pmap: Simplify the code path for moea64_pte_replace_native()
Summary:
MOEA64_PTE_REPLACE() is called often with the pmap lock held, and
sometimes with the page pv lock held.  The less work done while holding
a lock, the better.  Since we are intending to replace the same PTE
(same hash index), we don't need to recalculate anything, just flat
replace the PTE.  This cuts more than 200 instructions off the
invalidating code path.  In addition, we don't need to replace a PTE
that's not occupied by this PVO.

Reviewed by:	luporl
Differential Revision:	https://reviews.freebsd.org/D21515
2019-09-06 02:45:46 +00:00
kib
d33fdaeb7e Centralize __pcpu definitions.
Many extern struct pcpu <something>__pcpu declarations were
copied/pasted in sources.  The issue is that the definition is MD, but
it cannot be provided by machine/pcpu.h due to actual struct pcpu
defined in sys/pcpu.h later than the inclusion of machine/pcpu.h.
This forced the copying when other code needed direct access to
__pcpu.  There is no way around it, due to machine/pcpu.h supplying
part of struct pcpu fields.

To work around the problem, add a new machine/pcpu_aux.h header, which
should fill any needed MD definitions after struct pcpu definition is
completed. This allows to remove copies of __pcpu spread around the
source.  Also on x86 it makes it possible to remove work arounds like
OFFSETOF_CURTHREAD or clang specific warnings supressions.

Reported and tested by:	lwhsu, bcran
Reviewed by:	imp, markj (previous version)
Discussed with:	jhb
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D21418
2019-08-29 07:25:27 +00:00
jhibbits
ef2b3ca7b6 Revert a part of r350883 that should never have gone in
The wire_count change is not part of the unification, and doesn't even make
sense.

Reported by:	markj
2019-08-27 14:04:32 +00:00
jhibbits
7e5b076efc powerpc/booke: Clean up pmap a little for 64-bit
64-bit Book-E pmap doesn't need copy and zero bounce pages, nor the mutex.
Don't initialize them or reserve space for them.
2019-08-25 20:11:35 +00:00
jhibbits
e856fe61f0 powerpc/booke: Use the DMAP if possible in pmap_map()
This avoids unnecessary TLB usage for statically mapped regions, such as
vm_page_array.
2019-08-25 20:08:48 +00:00
imp
3a6755f808 Fix bogusly declared WERRORs in kernel build
Many arm kernel configs bogusly specified WERROR=-Werror. There's no
reason for this because the default is that and there's no reason to
override. These date from a time when we needed to add additional
warning->error suppression. They are obsolete and were cut and paste
propagated from file to file.

Comment out all the WERROR=.... lines in powerpc. They aren't bogus,
but were appropriate for the old defaults for gcc4.2.1. Now that we've
made the policy decision to suppress -Werror by default on these
platforms, it is appropriate to comment these out. People wishing to
fix these errors can still un-comment them out, or say WERROR=-Werror
on the command line.

Fix two instances (cut and paste propagation) of hard-coded -Werror
in x86 code. Replace with ${WERROR} instead. This is a no-op change
except for people who build WERROR=-Wno-error :).

This should fix tinderbox / CI breakage.
2019-08-25 19:39:31 +00:00
jhibbits
4c524d8f8d powerpc: Link Book-E kernels at the same address as AIM kernels
Summary:
Reduce the diff between AIM and Book-E even more.  This also cleans up
vmparam.h significantly.

Reviewed by:	luporl
Differential Revision:	https://reviews.freebsd.org/D21301
2019-08-20 01:26:02 +00:00
jeff
685a292036 Move phys_avail definition into MI code. It is consumed in the MI layer and
doing so adds more flexibility with less redundant code.

Reviewed by:	jhb, markj, kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21250
2019-08-16 00:45:14 +00:00
jhibbits
49916caf58 powerpc/pmap: Enable UMA_MD_SMALL_ALLOC for 64-bit booke
The only thing blocking UMA_MD_SMALL_ALLOC from working on 64-bit booke
powerpc was a missing check in pmap_kextract().  Adding DMAP handling into
pmap_kextract(), we can now use UMA_MD_SMALL_ALLOC.  This should improve
performance and stability a bit, since DMAP is always mapped in TLB1, so
this relieves pressure on TLB0.

MFC after:	3 weeks
2019-08-15 03:42:15 +00:00
imp
75967bea53 Fix powerpc LINT build
tcpratelimit isn't supported as there's now atomic_add_64, so add it to the exclusion list
Add comment for why PPC_PROBE_CHIPSET is on the list
Remove UKBD_DFLT_KEYMAP now that ukbd works on all platforms.
2019-08-12 23:25:21 +00:00
imp
75e675195a Floppy driver really only works on x86
Move the floppy driver to the x86 specific notes file.

Reviewed by: jhb, manu, jhibbits, emaste
Differential Revision: https://reviews.freebsd.org/D21208
2019-08-12 22:58:50 +00:00
imp
c0702cb45a Move sc out of the global file
x86 needs sc, as does sparc64. powerpc doesn't use it by default, but some old
powermac notebooks do not work with vt yet for reasons unknonw. Even so, I've
removed it from powerpc LINT. It's not in daily use there, and the intent is to
100% switch to vt now that it works for that platform to limit support burden.

All the other architectures omit some or all of the screen savers from their
lint config. Move them to the x86 NOTES files and remove the exclusions. This
reduces slightly the number of savers sparc64 compiles, but since they are in
GENERIC, the overage is adequate and if someone reaelly wants to sort them out
in sparc64 they can sweat the details and the testing.

Reviewed by: jhb (earlier version), manu (earlier version), jhibbits
Differential Revision: https://reviews.freebsd.org/D21233
2019-08-12 22:58:44 +00:00
luporl
dcf7a30f5f Fix style and clarify comment
Fix code style in proc.h and clarify comment about empty structs.
2019-08-12 19:44:57 +00:00
luporl
d412969e89 [ppc] avoid empty mdproc struct
Avoid empty structs, that have undefined behavior in C99 and
make compilers complain about it
(empty struct has size 0 in C, size 1 in C++).

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D21231
2019-08-12 17:18:20 +00:00
luporl
29bd1140e7 [PPC64] Save FPU registers before enabling VSX
Fixed trap handler logic, in order to make it save FPU registers,
if FPU is enabled, before enabling VSX. Without this change, FPU
register contents were being lost when set before VSX was enabled.
2019-08-12 12:51:47 +00:00
jhibbits
afb0e48ecd powerpc: Unify pmap definitions between AIM and Book-E
This is part 2 of r347078, pulling the page directory out of the Book-E
pmap.  This breaks KBI for anything that uses struct pmap (such as vm_map)
so any modules that access this must be rebuilt.
2019-08-12 03:03:56 +00:00
jhibbits
0e244c1b3c powerpc/pmap: Minor optimizations to 64-bit booke pmap
Don't recalculate the VM page of the page table pages, just pass them down
to free.  Also, use the pmap's page zero function instead of bzero().
2019-08-08 03:18:35 +00:00
jhibbits
0c4cc09c8d Fix build from r350622
It helps if my local kernel build has INVARIANTS.
2019-08-06 03:49:40 +00:00
jhibbits
e52c32c64e powerpc/pmap: Simplify Book-E 64-bit page table management
There is no need for the 64-bit pmap to have a fixed number of page table
buffers.  Since the 64-bit pmap has a DMAP, we can effectively have user
page tables limited only by total RAM size.
2019-08-06 03:16:06 +00:00
jhibbits
b8553eed10 powerpc: Get 32-bit AIM building with secure-PLT
The last few changes needed before 32-bit AIM builds with secure-PLT with
base GCC.  Because ofwcall32.S and swtch32.S were branching to the GOT it
could not use secure PLT.
2019-08-05 01:37:18 +00:00
jhibbits
8300ee499c powerpc: Enable the BUS_DMA_NOCACHE flag handling
The flag handling was committed commented out 7 years ago.  It works, and is
needed for LinuxKPI-based DRM drivers.

Also mark a local as potentially unusable, as it's only really used when KTR
is enabled.

Submitted by:	mmacy
2019-08-04 19:25:23 +00:00
jhibbits
38a7c04854 powerpc/powernv: Fix OPAL cfgread/cfgwrite error handling
Freeze clearing needs to heppen any time OPAL reads return either an error
(except OPAL_HARDWARE), AND any time it returns 0xff for all bytes.

For cfgwrite, any error that's not OPAL_HARDWARE should be cleaned up.
2019-08-03 01:55:51 +00:00
jhibbits
307d7334c5 powerpc/powernv: Only clear EEH freeze for some errors
Only clear an EEH freeze if an error occurs.  However, if an OPAL_HARDWARE
error is returned, this indicates a hardware failure which cannot be
unfrozen, and instead needs a hardware reset.  Attempting to unfreeze a
broken PCH will result in console spam for each attempt.  To avoid the spam,
just don't do it.
2019-08-01 03:59:25 +00:00
jhibbits
13f43c365e powerpc64/mmu: Use a SLIST for the PVO delete list, instead of a RB_TREE
Summary:
Although it's convenient to reuse the pvo_plist for deletion, RB_TREE
insertion and removal is not free, and can result in a lot of extra work
to rebalance the tree.  Instead, use a SLIST as a LIFO delete queue,
which gives us almost free insertion, deletion, and traversal.

Reviewed by:	luporl
Differential Revision: https://reviews.freebsd.org/D21061
2019-08-01 03:55:58 +00:00
luporl
1eefc66c5f powerpc: Improve pvo allocation code
Added allocation retry loop in alloc_pvo_entry(), to wait for
memory to become available if the caller specifies the M_WAITOK flag.

Also, the loop in moa64_enter() was removed, as moea64_pvo_enter()
never returns ENOMEM. It is alloc_pvo_entry() memory allocation that
can fail and must be retried.

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D21035
2019-07-25 15:27:05 +00:00
jhibbits
d7f60f9264 powerpc/pmap64: Make moea64 statistics optional
Summary:
It turns out statistics accounting is very expensive in the pmap driver,
and doesn't seem necessary in the common case.  Make this optional
behind a MOEA64_STATS #define, which one can set if they really need
statistics.

This saves ~7-8% on buildworld time on a POWER9.

Found by bdragon.

Reviewed by:	luporl
Differential Revision: https://reviews.freebsd.org/D20903
2019-07-25 03:47:27 +00:00