Commit Graph

343 Commits

Author SHA1 Message Date
Jeff Roberson
638f867814 (6/6) Convert pmap to expect busy in write related operations now that all
callers hold it.

This simplifies pmap code and removes a dependency on the object lock.

Reviewed by:    kib, markj
Tested by:      pho
Sponsored by:   Netflix, Intel
Differential Revision:	https://reviews.freebsd.org/D21596
2019-10-15 03:51:46 +00:00
Jeff Roberson
205be21d99 (3/6) Add a shared object busy synchronization mechanism that blocks new page
busy acquires while held.

This allows code that would need to acquire and release a very large number
of page busy locks to use the old mechanism where busy is only checked and
not held.  This comes at the cost of false positives but never false
negatives which the single consumer, vm_fault_soft_fast(), handles.

Reviewed by:    kib
Tested by:      pho
Sponsored by:   Netflix, Intel
Differential Revision:	https://reviews.freebsd.org/D21592
2019-10-15 03:41:36 +00:00
Justin Hibbits
ec17d5e06a powerpc/pmap: Tighten condition for removing tracked pages in Book-E pmap
There are cases where there's no vm_page_t structure for a given physical
address, such as the CCSR.  In this case, trying to obtain the
md.page_tracked struct member would lead to a NULL dereference, and panic.
Tighten this up by checking for kernel_pmap AND that the page structure
actually exists before dereferencing.  The flag can only be set when it's
tracked in the kernel pmap anyway.

MFC after:	3 weeks
2019-10-13 19:33:00 +00:00
Justin Hibbits
69cea06f34 powerpc/booke64: Align initial stack setting to match that of aim64's
Clang9/LLD9 appears to get quite confused with the instruction stream used
to obtain the tmpstack pointer, almost as though it thinks this is a C
function, so tries to optimize it.  Since the AIM64 method doesn't use the
TOC to obtain the tmpstack, just follow that model, and lld won't get
confused.

Reported by:	bdragon
MFC after:	2 weeks
2019-09-28 03:33:07 +00:00
Mark Johnston
b119329d81 Complete the removal of the "wire_count" field from struct vm_page.
Convert all remaining references to that field to "ref_count" and update
comments accordingly.  No functional change intended.

Reviewed by:	alc, kib
Sponsored by:	Intel, Netflix
Differential Revision:	https://reviews.freebsd.org/D21768
2019-09-25 16:11:35 +00:00
Mark Johnston
e8bcf6966b Revert r352406, which contained changes I didn't intend to commit. 2019-09-16 15:04:45 +00:00
Mark Johnston
41fd4b9422 Fix a couple of nits in r352110.
- Remove a dead variable from the amd64 pmap_extract_and_hold().
- Fix grammar in the vm_page_wire man page.

Reported by:	alc
Reviewed by:	alc, kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21639
2019-09-16 15:03:12 +00:00
Mark Johnston
fee2a2fa39 Change synchonization rules for vm_page reference counting.
There are several mechanisms by which a vm_page reference is held,
preventing the page from being freed back to the page allocator.  In
particular, holding the page's object lock is sufficient to prevent the
page from being freed; holding the busy lock or a wiring is sufficent as
well.  These references are protected by the page lock, which must
therefore be acquired for many per-page operations.  This results in
false sharing since the page locks are external to the vm_page
structures themselves and each lock protects multiple structures.

Transition to using an atomically updated per-page reference counter.
The object's reference is counted using a flag bit in the counter.  A
second flag bit is used to atomically block new references via
pmap_extract_and_hold() while removing managed mappings of a page.
Thus, the reference count of a page is guaranteed not to increase if the
page is unbusied, unmapped, and the object's write lock is held.  As
a consequence of this, the page lock no longer protects a page's
identity; operations which move pages between objects are now
synchronized solely by the objects' locks.

The vm_page_wire() and vm_page_unwire() KPIs are changed.  The former
requires that either the object lock or the busy lock is held.  The
latter no longer has a return value and may free the page if it releases
the last reference to that page.  vm_page_unwire_noq() behaves the same
as before; the caller is responsible for checking its return value and
freeing or enqueuing the page as appropriate.  vm_page_wire_mapped() is
introduced for use in pmap_extract_and_hold().  It fails if the page is
concurrently being unmapped, typically triggering a fallback to the
fault handler.  vm_page_wire() no longer requires the page lock and
vm_page_unwire() now internally acquires the page lock when releasing
the last wiring of a page (since the page lock still protects a page's
queue state).  In particular, synchronization details are no longer
leaked into the caller.

The change excises the page lock from several frequently executed code
paths.  In particular, vm_object_terminate() no longer bounces between
page locks as it releases an object's pages, and direct I/O and
sendfile(SF_NOCACHE) completions no longer require the page lock.  In
these latter cases we now get linear scalability in the common scenario
where different threads are operating on different files.

__FreeBSD_version is bumped.  The DRM ports have been updated to
accomodate the KPI changes.

Reviewed by:	jeff (earlier version)
Tested by:	gallatin (earlier version), pho
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20486
2019-09-09 21:32:42 +00:00
Justin Hibbits
7038069e9c Revert a part of r350883 that should never have gone in
The wire_count change is not part of the unification, and doesn't even make
sense.

Reported by:	markj
2019-08-27 14:04:32 +00:00
Justin Hibbits
8da08d5123 powerpc/booke: Clean up pmap a little for 64-bit
64-bit Book-E pmap doesn't need copy and zero bounce pages, nor the mutex.
Don't initialize them or reserve space for them.
2019-08-25 20:11:35 +00:00
Justin Hibbits
e7f2151c75 powerpc/booke: Use the DMAP if possible in pmap_map()
This avoids unnecessary TLB usage for statically mapped regions, such as
vm_page_array.
2019-08-25 20:08:48 +00:00
Justin Hibbits
6793e5b23d powerpc: Link Book-E kernels at the same address as AIM kernels
Summary:
Reduce the diff between AIM and Book-E even more.  This also cleans up
vmparam.h significantly.

Reviewed by:	luporl
Differential Revision:	https://reviews.freebsd.org/D21301
2019-08-20 01:26:02 +00:00
Jeff Roberson
2194393787 Move phys_avail definition into MI code. It is consumed in the MI layer and
doing so adds more flexibility with less redundant code.

Reviewed by:	jhb, markj, kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21250
2019-08-16 00:45:14 +00:00
Justin Hibbits
141a0ab012 powerpc/pmap: Enable UMA_MD_SMALL_ALLOC for 64-bit booke
The only thing blocking UMA_MD_SMALL_ALLOC from working on 64-bit booke
powerpc was a missing check in pmap_kextract().  Adding DMAP handling into
pmap_kextract(), we can now use UMA_MD_SMALL_ALLOC.  This should improve
performance and stability a bit, since DMAP is always mapped in TLB1, so
this relieves pressure on TLB0.

MFC after:	3 weeks
2019-08-15 03:42:15 +00:00
Justin Hibbits
3be09f300d powerpc: Unify pmap definitions between AIM and Book-E
This is part 2 of r347078, pulling the page directory out of the Book-E
pmap.  This breaks KBI for anything that uses struct pmap (such as vm_map)
so any modules that access this must be rebuilt.
2019-08-12 03:03:56 +00:00
Justin Hibbits
937e8f20af powerpc/pmap: Minor optimizations to 64-bit booke pmap
Don't recalculate the VM page of the page table pages, just pass them down
to free.  Also, use the pmap's page zero function instead of bzero().
2019-08-08 03:18:35 +00:00
Justin Hibbits
8b1531eca8 Fix build from r350622
It helps if my local kernel build has INVARIANTS.
2019-08-06 03:49:40 +00:00
Justin Hibbits
dc825fed55 powerpc/pmap: Simplify Book-E 64-bit page table management
There is no need for the 64-bit pmap to have a fixed number of page table
buffers.  Since the 64-bit pmap has a DMAP, we can effectively have user
page tables limited only by total RAM size.
2019-08-06 03:16:06 +00:00
Justin Hibbits
cafceaebea powerpc/SPE: Enable SPV bit for EFSCFD instruction emulation
EFSCFD (floating point single convert from double) emulation requires saving
the high word of the register, which uses SPE instructions.  Enable the SPE
to avoid an SPV Unavailable exception.

MFC after:	1 week
2019-07-20 18:22:01 +00:00
Mark Johnston
eeacb3b02f Merge the vm_page hold and wire mechanisms.
The hold_count and wire_count fields of struct vm_page are separate
reference counters with similar semantics.  The remaining essential
differences are that holds are not counted as a reference with respect
to LRU, and holds have an implicit free-on-last unhold semantic whereas
vm_page_unwire() callers must explicitly determine whether to free the
page once the last reference to the page is released.

This change removes the KPIs which directly manipulate hold_count.
Functions such as vm_fault_quick_hold_pages() now return wired pages
instead.  Since r328977 the overhead of maintaining LRU for wired pages
is lower, and in many cases vm_fault_quick_hold_pages() callers would
swap holds for wirings on the returned pages anyway, so with this change
we remove a number of page lock acquisitions.

No functional change is intended.  __FreeBSD_version is bumped.

Reviewed by:	alc, kib
Discussed with:	jeff
Discussed with:	jhb, np (cxgbe)
Tested by:	pho (previous version)
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D19247
2019-07-08 19:46:20 +00:00
Justin Hibbits
72e58595b1 powerpc/booke: It helps to set variables before using them
Actually set the source and destination VA's before using them.  Fixes a
bizarre panic on 32-bit Book-E.  Not sure why this wasn't caught by the
compiler.
2019-05-23 03:40:48 +00:00
Justin Hibbits
b4b2e7a7b9 powerpc/booke: Use wrtee instead of msr to restore EE bit
The MSR[EE] bit does not require synchronization when changing.  This is a
trivial micro-optimization, removing the trailing isync from mtmsr().

MFC after:	1 week
2019-05-22 02:43:17 +00:00
Justin Hibbits
2b03b6bd45 powerpc/booke: Rewrite pmap_sync_icache() a bit
* Make mmu_booke_sync_icache() use the DMAP on 64-bit prcoesses, no need to
  map the page into the user's address space.  This removes the
  pvh_global_lock from the equation on 64-bit.
* Don't map the page with user-readability on 32-bit.  I don't know what the
  chance of a given user process being able to access the NULL page when
  another process's page is added there, but it doesn't seem like a good
  idea to map it to NULL with user read permissions.
* Only sync as much as we need to.  There are only two significant places
  where pmap_sync_icache is used: proc_rwmem(), and the SIGILL second-chance
  for powerpc.  The SIGILL second chance is likely the most common, and only
  syncs 4 bytes, so avoid the other 127 loop iterations (4096 / 32 byte
  cacheline) in __syncicache().
2019-05-08 16:15:28 +00:00
Justin Hibbits
4023311a29 powerpc/booke: Do as much work outside of TLB locks as possible
Reduce the surface area of the TLB locks.  Unfortunately the same trick for
serializing the tlbie instruction on OEA64 cannot be used here to reduce the
scope of the tlbivax mutex to the tlbsync only, as the mutex also serializes
the TLB miss lock as a side effect, so contention on this lock may not be
reducible any further.
2019-05-08 16:05:18 +00:00
Justin Hibbits
2154a866b6 powerpc/booke: Use #ifdef __powerpc64__ instead of hw_direct_map in places
Since the DMAP is only available on powerpc64, and is *always* available on
Book-E powerpc64, don't penalize either side (32-bit or 64-bit) by always
checking hw_direct_map to perform operations.  This saves 5-10% time on
various ports builds, and on buildworld+buildkernel on Book-E hardware.

MFC after:	3 weeks
2019-05-05 20:23:43 +00:00
Justin Hibbits
bfd0787769 powerpc/booke: Fix size check for phys_avail in pmap bootstrap
Use the nitems() macro instead of the expansion, a'la r298352.  Also, fix the
location of this check to after initializing availmem_regions_sz, so that the
check isn't always against 0, thus always failing (nitems(phys_avail) is always
more than 0).
2019-05-05 20:05:50 +00:00
Justin Hibbits
0af5d6f7d9 powerpc: Stop pretending we run on e500v1 cores
Unconditional writing to MAS7, which doesn't exist on the e500v1 core, in a
TLB miss handler has been in the code for several years now.  Since this has
gone unnoticed for so long, it's easily concluded that e500v1 is not in use
with FreeBSD.  Simplify the code path a bit, by unconditionally zeroing MAS7
instead of calling a subroutine to do it.
2019-04-30 03:45:46 +00:00
Justin Hibbits
0499e9c619 powerpc64: Use medium code model in asm files for TOC references
Summary:
With a sufficiently large TOC, it's possible to index out of range, as
the immediate load instructions only permit 16-bit indices, allowing up
to 64kB range (signed) from the base pointer.  Allow +/- 2GB range, with
the medium code model TOC accesses in asm.

Patch originally by Brandon Bergren.  The issue appears to impact ELFv2
more than ELFv1.

Reviewed by:	luporl
Differential Revision: https://reviews.freebsd.org/D19708
2019-03-29 02:38:30 +00:00
Justin Hibbits
5b4c63b781 powerpc/booke: Depessimize MAS register updates even more
Remove isyncs between MAS register updates in the TLB miss handler, since
it's only needed before the TLB update instructions.
2019-03-02 20:59:18 +00:00
Justin Hibbits
a9b033c2f3 powerpc/booke: Fix 32-bit build
MFC after:	2 weeks
MFC with:	344202
2019-02-16 04:47:33 +00:00
Justin Hibbits
0454ed9794 powerpc/booke: depessimize MAS register updates
We only need to isync before we actually use the MAS registers, so before and
after the TLB read/write/sync/search operations.

MFC after:	2 weeks
2019-02-16 04:38:34 +00:00
Justin Hibbits
18f7e2b45e powerpc/booke: Use DMAP where possible for page copy and zeroing
This avoids several locks and pmap_kenter()'s, improving performance
marginally.

MFC after:	2 weeks
2019-02-16 04:16:10 +00:00
Justin Hibbits
64143619ab powerpc/booke: Use the 'tlbilx' instruction on newer cores
Newer cores have the 'tlbilx' instruction, which doesn't broadcast over
CoreNet.  This is significantly faster than walking the TLB to invalidate
the PID mappings.  tlbilx with the arguments given takes 131 clock cycles to
complete, as opposed to 512 iterations through the loop plus tlbre/tlbwe at
each iteration.

MFC after:	3 weeks
2019-02-13 03:11:12 +00:00
Justin Hibbits
2da4e52d79 powerpcspe: Correct SPE high-component loading
Don't clobber the low part of the register restoring the high component of.
This could lead to very bad behavior if it's an ABI-affected register.

While here, also mark the asm volatile in the SPE high save case, to match
the load case.

Reported by:	Branden Bergren (git_bdragon.rtk0.net)
MFC after:	1 week
2019-01-13 04:51:24 +00:00
Justin Hibbits
3067a880ce powerpcspe: Fix GPR handling in SPE exception handler
Optimize the exception handler to only save and load the upper word of the
GPRs used in the emulating instruction.  This reduces the save/load
overhead, and as a side effect does not overwrite the upper word of any
temporary register.

With this commit I am now able to run editors/abiword and math/gnumeric on a
e500-based system.

MFC after:	1 week
MFC With:	r341752,r341751
2018-12-13 04:48:28 +00:00
Justin Hibbits
9d720d45c9 powerpc/booke: Don't get and use the load offset for TOC on APs
The code was a near exact copy of the code in startup, but it doesn't need
the complexity since the kernel is already relocated.  With
VM_MIN_KERNEL_ADDRESS as currently set to KERNBASE, this doesn't cause a
problem, because it's a zero offset.  However, when KERNBASE is changed to a
physical load address, it then has a non-zero offset, and ends up with an
invalid stack pointer, causing the AP to hang.
2018-12-11 02:03:00 +00:00
Justin Hibbits
ddc6c1fa3d powerpc/SPE: Copy lower part of source register to target for efdabs/efdnabs/efdneg
MFC after:	1 week
MFC With:	r341751
2018-12-09 04:54:55 +00:00
Justin Hibbits
3d6bebd3a2 powerpc/SPE: Reload vector registers after efdabs/efdnabs/efdneg
While here, also style(9)-adjust indents around this code.
2018-12-09 04:13:14 +00:00
Justin Hibbits
f1e0cb5ef1 powerpc: preload_addr_relocate is no longer necessary for booke
The same behavior was moved to machdep.c, paired with AIM's relocation,
making this redundant.  With this, it's now possible to boot FreeBSD with
ubldr on a uboot Book-E platform, even with a
KERNBASE != VM_MIN_KERNEL_ADDRESS.
2018-12-04 03:51:10 +00:00
Justin Hibbits
3c0b081966 powerpc/booke: Check for the metadata address by physical address
The metadata pointer will almost never be at or above 'btext', as btext is a
relocated symbol, so will be based at VM_MIN_KERNEL_ADDRESS, not at
KERNBASE.  Check the address against kernload, where the kernel is
physically loaded.
2018-12-03 04:47:28 +00:00
Justin Hibbits
b996435337 powerpc/booke: Fix debug printfs in pmap
Add missing '%'s so printf formats are actually handled.
2018-11-28 04:02:26 +00:00
Justin Hibbits
ea32838af0 powerpc: Prepare Book-E kernels for KERNBASE != run base
Book-E kernels really run at VM_MIN_KERNEL_ADDRESS, which currently happens to
be the same as KERNBASE.  KERNBASE is the linked address, which the loader also
takes to be the physical load address.  Treat KERNBASE as a physical address,
not a virtual, and change virtual address references for KERNBASE to use
something more appropriate.
2018-11-28 02:00:27 +00:00
Justin Hibbits
6f4827aca7 powerpc/booke: Turn tlb*_print_tlbentries() into 'show tlb*' DDB commands
debugf() is unnecessary for the TLB printing functions, as they're only
intended to be used from ddb.  Instead, make them full DDB 'show'
commands, so now it can be written as 'show tlb1' and 'show tlb0'
instead of calling the function, hoping DEBUG has been defined.
2018-10-22 00:21:27 +00:00
Justin Hibbits
289041e2cb powerpcspe: Implement SPE exception handling
The Signal Processing Engine (SPE) found in Freescale e500 cores (and
others) offloads IEEE-754 compliance (NaN, Inf handling, overflow,
underflow) to software, most likely as a means of simplifying the APU
silicon.  Some software, like AbiWord, needs full IEEE-754 compliance,
including NaN handling.  Implement the necessary bits to enable it.

Differential Revision: https://reviews.freebsd.org/D17446
2018-10-21 00:43:27 +00:00
Justin Hibbits
340a810bf0 booke pmap: hide debug-ish printf behind bootverbose
It's not necessary during normal operation to know the mapped region size
and wasted space.
2018-08-19 18:54:43 +00:00
Justin Hibbits
38cfc8c393 Print the full-width pointer values in hex.
PRI0ptrX is used to print a zero-padded hex value of the architecture's bitness,
so on 64-bit architectures it'll print the full 64 bit address.
2018-05-28 00:19:08 +00:00
Justin Hibbits
971b5e4da8 Remove dead errata fixup code
This code caused more problems than it should have fixed (boot failures) on
the machines I tested, so has been commented out for a while now.  Remove
it, and assume the errata fixups were done by the bootloader where they
belong.
2018-05-01 04:31:17 +00:00
Justin Hibbits
99adcecf38 Call through powerpc_interrupt for all Book-E interrupts
Make int_external_input, int_decrementer, and int_performance_counter all
now use trap_common, just like on AIM.  The effects of this are:

* All traps are now properly displayed in ddb.  Previously traps from
  external input, decrementer, and performance counters, would display as
  just basic stack traces.  Now the frame is displayed.

* External interrupts are now handled with interrupts enabled, so handling
  can be preempted.  This seems to fix a hang found post-r329882.
2018-04-10 17:32:27 +00:00
Brooks Davis
6469bdcdb6 Move most of the contents of opt_compat.h to opt_global.h.
opt_compat.h is mentioned in nearly 180 files. In-progress network
driver compabibility improvements may add over 100 more so this is
closer to "just about everywhere" than "only some files" per the
guidance in sys/conf/options.

Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of
sys/compat/linux/*.c.  A fake _COMPAT_LINUX option ensure opt_compat.h
is created on all architectures.

Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the
set of compiled files.

Reviewed by:	kib, cem, jhb, jtl
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14941
2018-04-06 17:35:35 +00:00
Justin Hibbits
9225dfbdf0 Correct the ilog2() for calculating memory sizes.
TLB1 can handle ranges up to 4GB (through e5500, larger in e6500), but
ilog2() took a unsigned int, which maxes out at 4GB-1, but truncates
silently.  Increase the input range to the largest supported, at least for
64-bit targets.  This lets the DMAP be completely mapped, instead of only
1GB blocks with it assuming being fully mapped.
2018-04-04 02:13:27 +00:00