Commit Graph

3210 Commits

Author SHA1 Message Date
jhibbits
9894543692 powerpc64: Don't guard ISA 3.0 partition table setup with hw_direct_map
PowerISA 3.0 eliminated the 64-bit bridge mode which allowed 32-bit kernels
to run on 64-bit AIM/Book-S hardware.  Since therefore only a 64-bit kernel
can run on this hardware, and 64-bit native always has the direct map, there
is no need to guard it.
2019-11-13 02:22:00 +00:00
jhibbits
988efbe321 powerpc: Don't savectx() twice in IPI_STOP handler
We already save context in stoppcbs[] array, so there's no need to also save it
in the PCB, it won't be used.
2019-11-13 02:16:24 +00:00
jhibbits
7ad5aa93b3 powerpc64/powernv: Use OPAL call for non-POWER8 PCI TCE reset
According to the OPAL documentation, only the POWER8 (PHB3) should use
the register write TCE reset method.  All others should use the OPAL
call.

On POWER9 the call is semantically identical to the register write, with
a wait for completion.
2019-11-10 04:24:36 +00:00
jhibbits
38170f38fd powerpc/booke: Only handle kernel page faults in KVA range
The memory range between VM_MAXUSER_ADDRESS and VM_MIN_KERNEL_ADDRESS is
reserved for devices currently, which are always mapped in TLB1, and
therefore do not exist in the kernel page table.  Any page fault in this
range is therefore automatically a fatal fault.
2019-11-08 04:26:19 +00:00
jhibbits
f3bf35451c powerpc/booke: Make the TLB save area and mask match
Since TLB_MAXNEST is 3, the insert mask should only be 2 bits.  Given that 2
bits counts to 4, and that we already have plenty of space wasted in
padding, make the nest level 4 to match the mask.
2019-11-08 03:45:13 +00:00
jhibbits
bdcbfe133b powerpc/mpc85xx: Add MSI support for Freescale PowerPC SoCs
Freescale SoCs use a set of IRQs at the high end of the OpenPIC IRQ
list, not counted in the NIRQs of the Feature reporting register.  Some
SoCs include a MSI inbound window in the PCIe controller configuration
registers as well, but some don't.  Currently, this only handles the
SoCs *with* the MSI window.

There are 256 MSIs per MSI bank (32 per MSI IRQ, 8 IRQs per MSI bank).
The P5020 has 3 banks, yielding up to 768 MSIs; older SoCs have only one
bank.
2019-11-08 03:36:19 +00:00
jhibbits
b7e0d70189 powerpc/booke: Fix pmap_mapdev_attr() for multi-TLB1 entry mappings
Also, fix pmap_change_attr() to ignore non-kernel mappings.

* Fix a masking bug in mmu_booke_mapdev_attr() which caused it to align
  mappings to the smallest mapping alignment, instead of the largest.  This
  caused mappings to be potentially pessimally aligned, using more TLB
  entries than necessary.
* Return existing mappings from mmu_booke_mapdev_attr() that span more than
  one TLB1 entry.  The drm-current-kmod drivers map discontiguous segments
  of the GPU, resulting in more than one TLB entry being used to satisfy the
  mapping.
* Ignore non-kernel mappings in mmu_booke_change_attr().  There's a bug in
  the linuxkpi layer that causes it to actually try to change physical
  address mappings, instead of virtual addresses.  amd64 doesn't encounter
  this because it ignores non-kernel mappings.

With this it's possible to use drm-current-kmod on Book-E.
2019-11-06 04:40:12 +00:00
jhibbits
d80c8cf1c1 powerpc/pmap: Make use of tlb1_mapin_region in pmap_mapdev_attr()
tlb1_mapin_region() and pmap_mapdev_attr() do roughly the same thing -- map
a chunk of physical address space(memory or MMIO) into virtual, but do it in
differing ways.  Unify the code, settling on pmap_mapdev_attr()'s algorithm,
to simplify and unify the logic.  This fixes a bug with growing the kernel
mappings in mmu_booke_bootstrap(), where part of the mapping was not getting
done, leading to a hang when the unmapped VAs were accessed.
2019-11-04 00:35:40 +00:00
bdragon
f7c2a05990 powerpc: Add display of raw instruction values to x/I in ddb.
The "alternate format" character 'I' previously had the same behavior as
the "display as an instruction" character 'i'. With this change, it will now
prefix each disassembled instruction with the raw hex value.

As PowerPC instructions are always 32 bits and always aligned, and there are
no alternate modes that would affect instruction decoding or display, this
seemed to me to be the obvious interpretation of "alternate format".

Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D22223
2019-11-03 02:18:45 +00:00
bdragon
7f395b6413 powerpc: Fix incorrect disassembly of the cntlzw instruction in ddb.
Noticed while comparing disassembly between ddb and objdump.

Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D22121
2019-11-03 01:52:50 +00:00
bdragon
69432bab7d Add support for building Book-E kernels with clang/lld.
This involved several changes:

* Since lld does not like text relocations, replace SMP boot page text relocs
in booke/locore.S with position-independent math, and track the virtual base
in the SMP boot page header.

* As some SPRs are interpreted differently on clang due to the way it handles
platform-specific SPRs, switch m*dear and m*esr mnemonics out for regular
m*spr. Add both forms of SPR_DEAR to spr.h so the correct encoding is selected.

* Change some hardcoded 32 bit things in the boot page to be pointer-sized, and
fix alignment.

* Fix 64-bit build of booke/pmap.c when enabling pmap debugging.

Additionally, I took the opportunity to document how the SMP boot page works.

Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D21999
2019-11-02 21:15:56 +00:00
jhibbits
f56e17d76e powerpc/mpc85xx: Set description for the MPC85xx RC bridge 2019-11-02 02:24:53 +00:00
jhibbits
9a201b070a powerpc/booke: Fix TLB1 entry accounting
It's possible, with per-CPU mappings, for TLB1 indices to get out of sync.
This presents a problem when trying to insert an entry into TLB1 of all
CPUs.  Currently that's done by assuming (hoping) that the TLBs are
perfectly synced, and inserting to the same index for all CPUs.  However,
with aforementioned private mappings, this can result in overwriting
mappings on the other CPUs.

An example:

    CPU0                    CPU1
    <setup all mappings>    <idle>
        3 private mappings
      kick off CPU 1
                            initialize shared mappings (3 indices low)
                            Load kernel module, triggers 20 new mappings
      Sync mappings at N-3
                            initialize 3 private mappings.

At this point, CPU 1 has all the correct mappings, while CPU 0 is missing 3
mappings that were shared across to CPU 1.  When CPU 0 tries to access
memory in one of the overwritten mappings, it hangs while tripping through
the TLB miss handler.  Device mappings are not stored in any page table.

This fixes by introducing a '-1' index for tlb1_write_entry_int(), so each
CPU searches for an available index private to itself.

MFC after:	3 weeks
2019-11-01 02:55:58 +00:00
luporl
bd2e67a200 Fix GDB machdep code for PPC/PPC64
There was a couple issues with GDB machdep code for PPC/PPC64, the main ones being:
- wrong register sizes being returned
- pcb_context index was wrong (this affects all PPC variants)

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D22201
2019-10-31 12:03:47 +00:00
luporl
1ac2596a4d [PPC64] Fix trapstk overflow
In some scenarios, the 4K trapstk may overflow, corrupting tmpstk.

This was observed during remote debugging, with the following steps:

At remote host (R):
- enter kdb during boot
- switch to gdb backend

At local host (L):
- attach gdb to R
- try to read an invalid memory position

At R:
- a DSI trap occurs and kdb restarts (all this occurs on trapstk)
- while printing the stacktrace, trapstk overflows and corrupts tmpstk

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D22200
2019-10-31 11:59:00 +00:00
jhibbits
92b9f222a9 powerpc/booke: Simplify the MPC85XX PCIe root complex driver
Summary:
Due to bugs in the enumeration code, fsl_pcib_init() was not configuring
sub-bridges properly, so devices hanging off a separate bridge would not
be found.  Since the generic PCI code already supports probing child
buses, just delete this code and initialize only the device itself,
letting the generic code handle all the additional probing and
initializing.

This also deletes setup for some PCI peripherals found on some MPC85XX
evaluation boards.  The code can be resurrected if needed, but overly
complicated this code in the first place.

Reviewed by:	bdragon
Differential Revision:	https://reviews.freebsd.org/D22050
2019-10-24 03:51:33 +00:00
jhibbits
75b58f4b94 powerpc/booke: Fix Book-E boot post-minidump
r353489 added minidump support for powerpc64, but it added a dependency on
the dump_avail array.  Leaving it uninitialized caused breakage in late
boot.  Initialize dump_avail, even though the 64-bit booke pmap doesn't yet
support minidumps, but will in the future.
2019-10-23 00:31:19 +00:00
luporl
97ed980e58 [PPC] Avoid underflows in NUMA domains
On POWER8 systems with only one memory domain, the "ibm,associativity"
number that corresponds to it is 0, unlike POWER9 systems with two
or more domains, in which the minimum value is 1.

In POWER8 case, subtracting 1 causes an underflow on the unsigned domain
variable and a subsequent index out-of-bounds access.

Reviewed by:	jhibbits
Tested by:	bdragon, luporl
2019-10-22 18:28:58 +00:00
glebius
5e4ec8a4ab Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:13:37 +00:00
glebius
94b667e416 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:13:33 +00:00
luporl
023b61aa0a [PPC64] Add minidump support to PowerNV
Implementation of PowerNV specific minidump code.

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D21643
2019-10-21 11:56:57 +00:00
jhibbits
9a2f00bcbd powerpc/booke: Don't zero MAS8, it's unnecessary
MAS8 is hypervisor privileged, defining the logical partition (VM) to
operate on for TLB accesses.  It's already guaranteed to be cleared when
booting bare metal (bootloader needs it zeroed to work), and we can't touch
it from a guest.  Assume that if/when we eventually port bhyve to PowerPC
(and Book-E) the hypervisor module will take care of managing MAS8.  This
saves several (tens) of clocks on each TLB miss.

MFC after:	2 weeks
2019-10-20 15:50:33 +00:00
jhibbits
483d2889cb powerpc/booke pmap: Fix printf format type warnings 2019-10-19 16:09:06 +00:00
jhibbits
43e6fd09e6 powerpc/aim: Fix comment typo 2019-10-19 02:47:32 +00:00
jhibbits
23cda530a2 powerpc/mpc85xx: Replace global PCI config mutex with per-controller mutex
PCI controllers need to enforce exclusive config register access on their
own bus, not between all buses.
2019-10-19 01:07:35 +00:00
cem
f3a0ee41db Split out a more generic debugnet(4) from netdump(4)
Debugnet is a simplistic and specialized panic- or debug-time reliable
datagram transport.  It can drive a single connection at a time and is
currently unidirectional (debug/panic machine transmit to remote server
only).

It is mostly a verbatim code lift from netdump(4).  Netdump(4) remains
the only consumer (until the rest of this patch series lands).

The INET-specific logic has been extracted somewhat more thoroughly than
previously in netdump(4), into debugnet_inet.c.  UDP-layer logic and up, as
much as possible as is protocol-independent, remains in debugnet.c.  The
separation is not perfect and future improvement is welcome.  Supporting
INET6 is a long-term goal.

Much of the diff is "gratuitous" renaming from 'netdump_' or 'nd_' to
'debugnet_' or 'dn_' -- sorry.  I thought keeping the netdump name on the
generic module would be more confusing than the refactoring.

The only functional change here is the mbuf allocation / tracking.  Instead
of initiating solely on netdump-configured interface(s) at dumpon(8)
configuration time, we watch for any debugnet-enabled NIC for link
activation and query it for mbuf parameters at that time.  If they exceed
the existing high-water mark allocation, we re-allocate and track the new
high-water mark.  Otherwise, we leave the pre-panic mbuf allocation alone.
In a future patch in this series, this will allow initiating netdump from
panic ddb(4) without pre-panic configuration.

No other functional change intended.

Reviewed by:	markj (earlier version)
Some discussion with:	emaste, jhb
Objection from:	marius
Differential Revision:	https://reviews.freebsd.org/D21421
2019-10-17 16:23:03 +00:00
markj
84cd531f96 Remove page locking from pmap_mincore().
After r352110 the page lock no longer protects a page's identity, so
there is no purpose in locking the page in pmap_mincore().  Instead,
if vm.mincore_mapped is set to the non-default value of 0, re-lookup
the page after acquiring its object lock, which holds the page's
identity stable.

The change removes the last callers of vm_page_pa_tryrelock(), so
remove it.

Reviewed by:	kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21823
2019-10-16 22:03:27 +00:00
markj
957cc17627 Clear PGA_WRITEABLE in moea_pvo_remove().
moea_pvo_remove() might remove the last mapping of a page, in which case
it is clearly no longer writeable.  This can happen via pmap_remove(),
or when a CoW fault removes the last mapping of the old page.

Reported and tested by:	bdragon
Reviewed by:	alc, bdragon, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22044
2019-10-16 15:50:12 +00:00
kib
ae3c37c2e4 Fix assert in PowerPC pmaps after introduction of object busy.
The VM_PAGE_OBJECT_BUSY_ASSERT() in pmap_enter() implementation should
be only asserted when the code is executed as result of pmap_enter(),
not when the same code is entered from e.g. pmap_enter_quick().  This
is relevant for all PowerPC pmap variants, because mmu_*_enter() is
used as the backend, and assert is located there.

Add a PowerPC private pmap_enter() PMAP_ENTER_QUICK_LOCKED flag to
indicate that the call is not from pmap_enter().  For non-quick-locked
calls, assert that the object is locked.

Reported and tested by:	bdragon
Reviewed by:	alc, bdragon, markj
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D22041
2019-10-16 07:09:15 +00:00
jhibbits
1537de8003 powerpc/mpc85xx: Fix function type for fsl_pcib_error_intr()
Since it's only called as an interrupt handler, fsl_pcib_eror_intr() should just
match the driver_intr_t type.

Reported by:	bdragon
2019-10-16 03:03:59 +00:00
jhibbits
cf6db46977 powerpc: Add AmigaOne platform, a subclass of MPC85xx
Summary:
The AmigaOne platform, encompassing the X5000 and A1222 at this time, is
based on the mpc85xx platform, but includes some things not listed in
the device tree.  Some custom devices, like CPLD, could be added to the
device tree with an overlay, or other means.  However, some cannot
easily be done, such as the power button interrupt.

The directory will also become a location to add AmigaOne platform drivers,
such as the aforementioned CPLD, and its children.

Reviewed by:	bdragon
Differential Revision:	https://reviews.freebsd.org/D21829
2019-10-16 00:38:50 +00:00
jeff
50eb2e4288 (6/6) Convert pmap to expect busy in write related operations now that all
callers hold it.

This simplifies pmap code and removes a dependency on the object lock.

Reviewed by:    kib, markj
Tested by:      pho
Sponsored by:   Netflix, Intel
Differential Revision:	https://reviews.freebsd.org/D21596
2019-10-15 03:51:46 +00:00
jeff
0a6e7a4266 (3/6) Add a shared object busy synchronization mechanism that blocks new page
busy acquires while held.

This allows code that would need to acquire and release a very large number
of page busy locks to use the old mechanism where busy is only checked and
not held.  This comes at the cost of false positives but never false
negatives which the single consumer, vm_fault_soft_fast(), handles.

Reviewed by:    kib
Tested by:      pho
Sponsored by:   Netflix, Intel
Differential Revision:	https://reviews.freebsd.org/D21592
2019-10-15 03:41:36 +00:00
jhibbits
4ec246542a powerpc/atomic: Fix atomic_cmpset_rel()
Need a release barrier, not an acquire barrier, else bad things happen.
2019-10-15 03:37:21 +00:00
luporl
cad5513a14 Fix powerpc/powerpcspe builds
Revision 353489 introduced some new function calls in common powerpc code,
but these must be called only on powerpc64.
2019-10-14 19:06:17 +00:00
luporl
57d28447c8 [PPC64] Initial kernel minidump implementation
Based on POWER9BSD implementation, with all POWER9 specific code removed and
addition of new methods in PPC64 MMU interface, to isolate platform specific
code. Currently, the new methods are implemented on pseries and PowerNV
(D21643).

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D21551
2019-10-14 13:04:04 +00:00
jhibbits
0c685cb80e powerpc/pmap: Tighten condition for removing tracked pages in Book-E pmap
There are cases where there's no vm_page_t structure for a given physical
address, such as the CCSR.  In this case, trying to obtain the
md.page_tracked struct member would lead to a NULL dereference, and panic.
Tighten this up by checking for kernel_pmap AND that the page structure
actually exists before dereferencing.  The flag can only be set when it's
tracked in the kernel pmap anyway.

MFC after:	3 weeks
2019-10-13 19:33:00 +00:00
jhibbits
a6677a026c powerpc: Implement atomic_(f)cmpset_ for short and char
|
This adds two implementations for each atomic_fcmpset_ and atomic_cmpset_
short and char functions, selectable at compile time for the target
architecture.  By default, it uses a generic shift-and-mask to perform atomic
updates to sub-components of 32-bit words from <sys/_atomic_subword.h>.
However, if ISA_206_ATOMICS is defined it uses the ll/sc instructions for
halfword and bytes, introduced in PowerISA 2.06.  These instructions are
supported by all IBM processors from POWER7 on, as well as the Freescale/NXP
e6500 core.  Although the e5500 and e500mc both implement PowerISA 2.06 they
do not implement these instructions.

As part of this, clean up the atomic_(f)cmpset_acq and _rel wrappers, by
using macros to reduce code duplication.

ISA_206_ATOMICS requires clang or newer binutils (2.20 or later).

Differential Revision:	https://reviews.freebsd.org/D21682
2019-10-08 01:36:34 +00:00
jhibbits
90ce64d1c6 powerpc64/pmap: Fix release order to match lock order in moea64_enter()
Page PV lock is always taken first, so should be released last.  This also
(trivially) shortens the hold time of the pmap lock.

Submitted by:	mjg
2019-10-07 02:36:42 +00:00
jhibbits
a5519d9dd3 powerpc/pmap64: Properly parenthesize PV_LOCK_COUNT macros
As pointed out by mjg, without the parentheses the calculations done against
these macros are incorrect, resulting in only 1/3 of locks being used.

Reported by:	mjg
2019-10-06 19:11:01 +00:00
jhibbits
3def8cb160 powerpc/booke64: Align initial stack setting to match that of aim64's
Clang9/LLD9 appears to get quite confused with the instruction stream used
to obtain the tmpstack pointer, almost as though it thinks this is a C
function, so tries to optimize it.  Since the AIM64 method doesn't use the
TOC to obtain the tmpstack, just follow that model, and lld won't get
confused.

Reported by:	bdragon
MFC after:	2 weeks
2019-09-28 03:33:07 +00:00
kib
957270782d Improve MD page fault handlers.
Centralize calculation of signal and ucode delivered on unhandled page
fault in new function vm_fault_trap().  MD trap_pfault() now almost
always uses the signal numbers and error codes calculated in
consistent MI way.

This introduces the protection fault compatibility sysctls to all
non-x86 architectures which did not have that bug, but apparently they
were already much more wrong in selecting delivered signals on
protection violations.

Change the delivered signal for accesses to mapped area after the
backing object was truncated.  According to POSIX description for
mmap(2):
   The system shall always zero-fill any partial page at the end of an
   object. Further, the system shall never write out any modified
   portions of the last page of an object which are beyond its
   end. References within the address range starting at pa and
   continuing for len bytes to whole pages following the end of an
   object shall result in delivery of a SIGBUS signal.

   An implementation may generate SIGBUS signals when a reference
   would cause an error in the mapped object, such as out-of-space
   condition.
Adjust according to the description, keeping the existing
compatibility code for SIGSEGV/SIGBUS on protection failures.

For situations where kernel cannot handle page fault due to resource
limit enforcement, SIGBUS with a new error code BUS_OBJERR is
delivered.  Also, provide a new error code SEGV_PKUERR for SIGSEGV on
amd64 due to protection key access violation.

vm_fault_hold() is renamed to vm_fault().  Fixed some nits in
trap_pfault()s like mis-interpreting Mach errors as errnos.  Removed
unneeded truncations of the fault addresses reported by hardware.

PR:	211924
Reviewed by:	alc
Discussed with:	jilles, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D21566
2019-09-27 18:43:36 +00:00
markj
fbe7e9c7e4 Complete the removal of the "wire_count" field from struct vm_page.
Convert all remaining references to that field to "ref_count" and update
comments accordingly.  No functional change intended.

Reviewed by:	alc, kib
Sponsored by:	Intel, Netflix
Differential Revision:	https://reviews.freebsd.org/D21768
2019-09-25 16:11:35 +00:00
jhibbits
53af0efd39 powerpc/atomic: Follow recommendations on atomic primitive comparisons
Both IBM and Freescale programming examples presume the cmpset operands will
favor equal, and pessimize the non-equal case instead.  Do the same for
atomic_cmpset_* and atomic_fcmpset_*.  This slightly pessimizes the failure
case, in favor of the success case.

MFC after:	3 weeks
2019-09-25 01:39:58 +00:00
jhibbits
c9cf854b0a powerpc: Allocate DPCPU block from domain-local memory
This should improve NUMA scalability a little, by binding to the CPU's NUMA
domain.  This matches what's done on amd64.
2019-09-25 01:23:08 +00:00
markj
3616760326 Revert r352406, which contained changes I didn't intend to commit. 2019-09-16 15:04:45 +00:00
markj
543f9366b9 Fix a couple of nits in r352110.
- Remove a dead variable from the amd64 pmap_extract_and_hold().
- Fix grammar in the vm_page_wire man page.

Reported by:	alc
Reviewed by:	alc, kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21639
2019-09-16 15:03:12 +00:00
jhibbits
476a473cab powerpc64/powernv: Add opal NVRAM driver for PowerNV systems
Add a very basic NVRAM driver for OPAL which can be used by the IBM
powerpc-utils nvram utility, not to be confused with the base nvram utility,
which only operates on powermac_nvram.

The IBM utility handles all partitions itself, treating the nvram device as
a plain store.

An alternative would be to manage partitions in the kernel, and augment the
base nvram utility to deal with different backing stores, but that
complicates the driver significantly.  Instead, present the same interface
IBM's utlity expects, and we get the usage for free.

Tested by:	bdragon
2019-09-14 03:30:34 +00:00
markj
ccbfa8304f Change synchonization rules for vm_page reference counting.
There are several mechanisms by which a vm_page reference is held,
preventing the page from being freed back to the page allocator.  In
particular, holding the page's object lock is sufficient to prevent the
page from being freed; holding the busy lock or a wiring is sufficent as
well.  These references are protected by the page lock, which must
therefore be acquired for many per-page operations.  This results in
false sharing since the page locks are external to the vm_page
structures themselves and each lock protects multiple structures.

Transition to using an atomically updated per-page reference counter.
The object's reference is counted using a flag bit in the counter.  A
second flag bit is used to atomically block new references via
pmap_extract_and_hold() while removing managed mappings of a page.
Thus, the reference count of a page is guaranteed not to increase if the
page is unbusied, unmapped, and the object's write lock is held.  As
a consequence of this, the page lock no longer protects a page's
identity; operations which move pages between objects are now
synchronized solely by the objects' locks.

The vm_page_wire() and vm_page_unwire() KPIs are changed.  The former
requires that either the object lock or the busy lock is held.  The
latter no longer has a return value and may free the page if it releases
the last reference to that page.  vm_page_unwire_noq() behaves the same
as before; the caller is responsible for checking its return value and
freeing or enqueuing the page as appropriate.  vm_page_wire_mapped() is
introduced for use in pmap_extract_and_hold().  It fails if the page is
concurrently being unmapped, typically triggering a fallback to the
fault handler.  vm_page_wire() no longer requires the page lock and
vm_page_unwire() now internally acquires the page lock when releasing
the last wiring of a page (since the page lock still protects a page's
queue state).  In particular, synchronization details are no longer
leaked into the caller.

The change excises the page lock from several frequently executed code
paths.  In particular, vm_object_terminate() no longer bounces between
page locks as it releases an object's pages, and direct I/O and
sendfile(SF_NOCACHE) completions no longer require the page lock.  In
these latter cases we now get linear scalability in the common scenario
where different threads are operating on different files.

__FreeBSD_version is bumped.  The DRM ports have been updated to
accomodate the KPI changes.

Reviewed by:	jeff (earlier version)
Tested by:	gallatin (earlier version), pho
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20486
2019-09-09 21:32:42 +00:00
jhibbits
7d61399d06 powerpc64/pmap: Fix a WITNESS error in alloc_pvo_entry()
We only call alloc_pvo_entry() with M_WAITOK from one location.  However,
this can be called while holding nonsleepable locks.  Rather than passing
M_WAITOK down, use vm_wait() and loop.
2019-09-06 03:02:12 +00:00