Commit Graph

190 Commits

Author SHA1 Message Date
Konstantin Belousov
39ffa8c138 Change pmap_enter(9) interface to take flags parameter and superpage
mapping size (currently unused).  The flags includes the fault access
bits, wired flag as PMAP_ENTER_WIRED, and a new flag
PMAP_ENTER_NOSLEEP to indicate that pmap should not sleep.

For powerpc aim both 32 and 64 bit, fix implementation to ensure that
the requested mapping is created when PMAP_ENTER_NOSLEEP is not
specified, in particular, wait for the available memory required to
proceed.

In collaboration with:	alc
Tested by:	nwhitehorn (ppc aim32 and booke)
Sponsored by:	The FreeBSD Foundation and EMC / Isilon Storage Division
MFC after:	2 weeks
2014-08-08 17:12:03 +00:00
Alan Cox
a695d9b25b Retire pmap_change_wiring(). We have never used it to wire virtual pages.
We continue to use pmap_enter() for that.  For unwiring virtual pages, we
now use pmap_unwire(), which unwires a range of virtual addresses instead
of a single virtual page.

Sponsored by:	EMC / Isilon Storage Division
2014-08-03 20:40:51 +00:00
Alan Cox
a844c68fc2 Implement pmap_unwire(). See r268327 for the motivation behind this change. 2014-07-13 16:27:57 +00:00
Bryan Drewery
44f1c91610 Rename global cnt to vm_cnt to avoid shadowing.
To reduce the diff struct pcu.cnt field was not renamed, so
PCPU_OP(cnt.field) is still used. pc_cnt and pcpu are also used in
kvm(3) and vmstat(8). The goal was to not affect externally used KPI.

Bump __FreeBSD_version_ in case some out-of-tree module/code relies on the
the global cnt variable.

Exp-run revealed no ports using it directly.

No objection from:	arch@
Sponsored by:	EMC / Isilon Storage Division
2014-03-22 10:26:09 +00:00
Nathan Whitehorn
a3845e4242 Avoid spurious compiler warning about an uninitialized variable. 2014-02-01 20:06:52 +00:00
Nathan Whitehorn
c1cb22d755 Rearchitect platform memory map parsing to make it less
Open Firmware-centric:
- Keep the static list of regions in platform.c instead of ofw_machdep.c
- Move various merging and sorting operations to platform.c as well
- Move apple_hacks code out of ofw_machdep.c and into platform_powermac.c,
  where it belongs
- Move CHRP-specific dynamic-reconfiguration memory parsing into
  platform_chrp.c instead of pretending it is shared code
2013-12-01 19:43:15 +00:00
Nathan Whitehorn
a0d0d6d88b badaddr() is used only in the grackle PCI driver, so move its definition
there. Clean up a spurious setfault() declaration as well.
2013-11-27 22:01:09 +00:00
Nathan Whitehorn
e537388b84 Unify handling of illegal instruction faults between AIM and Book-E. This
allows FPU emulation on AIM as well as providing support for the mfpvr
and lwsync instructions from userland on e500 cores. lwsync, in particular,
is required for many C++ programs to work correctly.

MFC after:	1 week
2013-11-17 15:12:03 +00:00
Nathan Whitehorn
debe445512 Split the function of the PCB_FPU flags into two: PCB_FPU now indicates that
the actual FPU is enabled, while PCB_FPREGS indicates that the FPU state
structure in the PCB is valid. This separation reflects the situation on
FPU-less systems in which the FP state is used by the emulator but we don't
actually want to try to turn on the non-existant FPU.

Use this flag to save and restore FP regs properly on both AIM and Book-E.
As a side effect, this sets up hard-FP and Altivec on Book-E CPUs with such
abilities except for a trap handler to call enable_fpu()/enable_altivec().
2013-11-17 14:44:22 +00:00
Nathan Whitehorn
52cfe485fb Move CCSR discovery into the platform module, while simultaneously making
it more flexible about how the CCSR range is found. With this change, the
stock MPC85XX will boot on a Routerboard 800.

Hardware donated by:	Benjamin Perrault
2013-11-17 02:03:36 +00:00
Nathan Whitehorn
b899f5d5c9 Make sure that TLB1 mappings are aligned correctly. 2013-11-17 01:59:42 +00:00
Nathan Whitehorn
e39c26a950 Use the same implementation of copyinout.c for both AIM and Book-E. This
fixes some bugs in both implementations related to validity checks on
mapping bounds.
2013-11-11 23:37:16 +00:00
Nathan Whitehorn
bdac436008 Follow up r223485, which made AIM use the ABI thread pointer instead of
PCPU fields for curthread, by doing the same to Book-E. This closes
some potential races switching between CPUs. As a side effect, it turns out
the AIM and Book-E swtch.S implementations were the same to within a few
registers, so move that to powerpc/powerpc.

MFC after: 3 months
2013-11-11 17:37:50 +00:00
Nathan Whitehorn
302acc2e5f Rename the "bare" platform "mpc85xx", which is what it actually is, and
add actual platform probing based on PVR. Still needs a little more work:
in particular, the CCRS setup should move here.

Also turn "bare" into a truly bare platform that doesn't pretend to know how
to do anything except get the memory map. This should also be enhanced to
process the FDT reserved memory list, but that is for another day.
2013-11-11 16:14:25 +00:00
Nathan Whitehorn
0f3406e079 Do not panic if pmap_mincore() is called. This prevents crashing userland
binaries from bringing down the kernel.

MFC after:	3 days
2013-11-06 14:36:38 +00:00
Nathan Whitehorn
0b8a792e0b Make devices with registers into the KVA region work reliably. Without this,
previous KVA allocations (which the PMAP lazily invalidates) in TLB0 could
shadow device maps in TLB1. Add a big block comment about some of the
caveats with this approach.
2013-10-26 20:57:26 +00:00
Nathan Whitehorn
b5c192462c Handle (in a slightly ugly way) ePAPR-type loaders that just place a
device tree into r3. Rather than worrying about mapping that tree, reserving
its space in the global physical memory space, etc., just copy it to some
memory after the kernel.
2013-10-26 19:50:40 +00:00
Nathan Whitehorn
d2a406dd46 Bump initial TLB size. The kernel is not necessarily less than 16 MB any
more.
2013-10-26 19:49:09 +00:00
Nathan Whitehorn
33724f17d2 Interrelated improvements to early boot mappings:
- Remove explicit requirement that the SOC registers be found except as an
  optimization (although the MPC85XX LAW drivers still require they be found
  externally, which should change).
- Remove magic CCSRBAR_VA value.
- Allow bus_machdep.c's early-boot code to handle non 1:1 mappings and
  systems not in real-mode or global 1:1 maps in early boot.
- Allow pmap_mapdev() on Book-E to reissue previous addresses if the
  area is already mapped. Additionally have it check all mappings, not
  just the CCSR area.

This allows the console on e500 systems to actually work on systems where
the boot loader was not kind enough to set up a 1:1 mapping before starting
the kernel.
2013-10-26 18:18:14 +00:00
Nathan Whitehorn
928554a24f Fix concurrency issues with TLB1 updates and make pmap_kextract() search
TLB1 mappings as well, which is required for the console to work after
r257111.
2013-10-26 16:49:41 +00:00
Nathan Whitehorn
8bcd59f2c0 Add pmap_mapdev_attr() and pmap_kenter_attr() interfaces. pmap_set_memattr()
is slightly more complicated and is left unimplemented for now. Also
prevent pmap_mapdev() from mapping over the kernel and KVA regions if
devices happen to have high physical addresses.

MFC after:	2 weeks
2013-10-26 14:52:55 +00:00
Nathan Whitehorn
1c02b4c9fb A quick addendum: the standard says that timebase-frequency can be either
32 or 64 bits, so allow either.
2013-10-23 14:34:04 +00:00
Nathan Whitehorn
d26eb2c194 If the device tree directly contains the timebase frequency, use it. This
property is required by ePAPR, but maintain the fallback to bus-frequency
for compatibility.

MFC after:	2 weeks
2013-10-23 14:28:59 +00:00
Nathan Whitehorn
c8d23639da Make hard-wired TLB allocations be at minimum one page. This is required by
some implementations, most notably (in my case) QEMU's e500 emulation.
2013-10-21 22:25:54 +00:00
Nathan Whitehorn
58ac6f25fa Avoid sign overflow if there are more than 2 GB of RAM. 2013-10-20 23:02:16 +00:00
Nathan Whitehorn
228f09b3ef Replace the two almost-exactly-identical AIM and Book-E clock.c
implementations with a single one after the application of a very small
amount of #ifdef.
2013-10-20 16:37:03 +00:00
Nathan Whitehorn
1cfdc97153 Unify the AIM and Book-E vm_machdep.c implementations, which previously
differed only with respect to the AIM version not following style(9) and
some additional features for 64-bit systems and machines with direct maps
in the AIM implementation that are no-ops on Book-E (at least for now).
2013-10-20 16:14:03 +00:00
Gleb Smirnoff
255c1caae3 - Create kern.ipc.sendfile namespace, and put the new "readhead" OID
there as "kern.ipc.sendfile.readahead".
- Push all nsfbuf related tunables into MD code. Don't move them
  to new namespace in favor of POLA.

Reviewed by:	scottl
Approved by:	re (gjb)
2013-09-22 13:36:52 +00:00
Alan Cox
deb179bb4c The pmap function pmap_clear_reference() is no longer used. Remove it.
pmap_clear_reference() has had exactly one caller in the kernel for
several years, more precisely, since FreeBSD 8.  Now, that call no
longer exists.

Approved by:	re (kib)
Sponsored by:	EMC / Isilon Storage Division
2013-09-20 04:30:18 +00:00
Konstantin Belousov
e68c64f0ba Revert r254501. Instead, reuse the type stability of the struct pmap
which is the part of struct vmspace, allocated from UMA_ZONE_NOFREE
zone.  Initialize the pmap lock in the vmspace zone init function, and
remove pmap lock initialization and destruction from pmap_pinit() and
pmap_release().

Suggested and reviewed by:	alc (previous version)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
2013-08-22 18:12:24 +00:00
Attilio Rao
c7aebda8a1 The soft and hard busy mechanism rely on the vm object lock to work.
Unify the 2 concept into a real, minimal, sxlock where the shared
acquisition represent the soft busy and the exclusive acquisition
represent the hard busy.
The old VPO_WANTED mechanism becames the hard-path for this new lock
and it becomes per-page rather than per-object.
The vm_object lock becames an interlock for this functionality:
it can be held in both read or write mode.
However, if the vm_object lock is held in read mode while acquiring
or releasing the busy state, the thread owner cannot make any
assumption on the busy state unless it is also busying it.

Also:
- Add a new flag to directly shared busy pages while vm_page_alloc
  and vm_page_grab are being executed.  This will be very helpful
  once these functions happen under a read object lock.
- Move the swapping sleep into its own per-object flag

The KPI is heavilly changed this is why the version is bumped.
It is very likely that some VM ports users will need to change
their own code.

Sponsored by:	EMC / Isilon storage division
Discussed with:	alc
Reviewed by:	jeff, kib
Tested by:	gavin, bapt (older version)
Tested by:	pho, scottl
2013-08-09 11:11:11 +00:00
Jeff Roberson
5df87b21d3 Replace kernel virtual address space allocation with vmem. This provides
transparent layering and better fragmentation.

 - Normalize functions that allocate memory to use kmem_*
 - Those that allocate address space are named kva_*
 - Those that operate on maps are named kmap_*
 - Implement recursive allocation handling for kmem_arena in vmem.

Reviewed by:	alc
Tested by:	pho
Sponsored by:	EMC / Isilon Storage Division
2013-08-07 06:21:20 +00:00
Andrey V. Elsukov
05d1f5bce0 Introduce new structure sfstat for collecting sendfile's statistics
and remove corresponding fields from struct mbstat. Use PCPU counters
and SFSTAT_INC() macro for update these statistics.

Discussed with:	glebius
2013-07-15 06:16:57 +00:00
Attilio Rao
9af6d512f5 o Relax locking assertions for vm_page_find_least()
o Relax locking assertions for pmap_enter_object() and add them also
  to architectures that currently don't have any
o Introduce VM_OBJECT_LOCK_DOWNGRADE() which is basically a downgrade
  operation on the per-object rwlock
o Use all the mechanisms above to make vm_map_pmap_enter() to work
  mostl of the times only with readlocks.

Sponsored by:	EMC / Isilon storage division
Reviewed by:	alc
2013-05-21 20:38:19 +00:00
Alan Cox
658f180b3b Relax the object locking assertion in pmap_enter_locked().
Reviewed by:	attilio
Sponsored by:	EMC / Isilon Storage Division
2013-05-17 18:59:00 +00:00
Konstantin Belousov
e8a4a618cf Add pmap function pmap_copy_pages(), which copies the content of the
pages around, taking array of vm_page_t both for source and
destination.  Starting offsets and total transfer size are specified.

The function implements optimal algorithm for copying using the
platform-specific optimizations.  For instance, on the architectures
were the direct map is available, no transient mappings are created,
for i386 the per-cpu ephemeral page frame is used.  The code was
typically borrowed from the pmap_copy_page() for the same
architecture.

Only i386/amd64, powerpc aim and arm/arm-v6 implementations were
tested at the time of commit. High-level code, not committed yet to
the tree, ensures that the use of the function is only allowed after
explicit enablement.

For sparc64, the existing code has known issues and a stab is added
instead, to allow the kernel linking.

Sponsored by:	The FreeBSD Foundation
Tested by:	pho (i386, amd64), scottl (amd64), ian (arm and arm-v6)
MFC after:	2 weeks
2013-03-14 20:18:12 +00:00
Attilio Rao
89f6b8632c Switch the vm_object mutex to be a rwlock. This will enable in the
future further optimizations where the vm_object lock will be held
in read mode most of the time the page cache resident pool of pages
are accessed for reading purposes.

The change is mostly mechanical but few notes are reported:
* The KPI changes as follow:
  - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
  - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
  - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
  - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED()
    (in order to avoid visibility of implementation details)
  - The read-mode operations are added:
    VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(),
    VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
* The vm/vm_pager.h namespace pollution avoidance (forcing requiring
  sys/mutex.h in consumers directly to cater its inlining functions
  using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h
  consumers now must include also sys/rwlock.h.
* zfs requires a quite convoluted fix to include FreeBSD rwlocks into
  the compat layer because the name clash between FreeBSD and solaris
  versions must be avoided.
  At this purpose zfs redefines the vm_object locking functions
  directly, isolating the FreeBSD components in specific compat stubs.

The KPI results heavilly broken by this commit.  Thirdy part ports must
be updated accordingly (I can think off-hand of VirtualBox, for example).

Sponsored by:	EMC / Isilon storage division
Reviewed by:	jeff
Reviewed by:	pjd (ZFS specific review)
Discussed with:	alc
Tested by:	pho
2013-03-09 02:32:23 +00:00
Alexander Motin
fdc5dd2d2f MFcalloutng:
Switch eventtimers(9) from using struct bintime to sbintime_t.
Even before this not a single driver really supported full dynamic range of
struct bintime even in theory, not speaking about practical inexpediency.
This change legitimates the status quo and cleans up the code.
2013-02-28 13:46:03 +00:00
Attilio Rao
dc1558d1cd Merge from vmobj-rwlock:
VM_OBJECT_LOCKED() macro is only used to implement a custom version
of lock assertions right now (which likely spread out thanks to
copy and paste).
Remove it and implement actual assertions.

Sponsored by:	EMC / Isilon storage division
Reviewed by:	alc
Tested by:	pho
2013-02-27 18:12:13 +00:00
Attilio Rao
a4915c21d9 Merge from vmc-playground branch:
Replace the sub-optimal uma_zone_set_obj() primitive with more modern
uma_zone_reserve_kva().  The new primitive reserves before hand
the necessary KVA space to cater the zone allocations and allocates pages
with ALLOC_NOOBJ.  More specifically:
- uma_zone_reserve_kva() does not need an object to cater the backend
  allocator.
- uma_zone_reserve_kva() can cater M_WAITOK requests, in order to
  serve zones which need to do uma_prealloc() too.
- When possible, uma_zone_reserve_kva() uses directly the direct-mapping
  by uma_small_alloc() rather than relying on the KVA / offset
  combination.

The removal of the object attribute allows 2 further changes:
1) _vm_object_allocate() becomes static within vm_object.c
2) VM_OBJECT_LOCK_INIT() is removed.  This function is replaced by
   direct calls to mtx_init() as there is no need to export it anymore
   and the calls aren't either homogeneous anymore: there are now small
   differences between arguments passed to mtx_init().

Sponsored by:	EMC / Isilon storage division
Reviewed by:	alc (which also offered almost all the comments)
Tested by:	pho, jhb, davide
2013-02-26 23:35:27 +00:00
Rui Paulo
eaba9848dd Introduce PLATFORMMETHOD_END and use it. 2013-02-13 02:21:45 +00:00
Alan Cox
e33d0ab830 Replace all uses of the page queues lock by a R/W lock that is private
to this pmap.

Eliminate two redundant #include's.

Tested by:	marcel
2012-11-03 23:22:49 +00:00
Marcel Moolenaar
4a49da83e1 1. Have the APs initialize the TLB1 entries from what has been
programmed on the BSP during (early) boot. This makes sure
    that the APs get configured the same as the BSP, irrspective
    of how FreeBSD was loaded.
2.  Make sure to flush the dcache after writing the TLB1 entries
    to the boot page. The APs aren't part of the coherency domain
    just yet.
3.  Set pmap_bootstrapped after calling pmap_bootstrap(). The
    FDT code now maps the devices (like OF), and this resulted
    in a panic.
4.  Since we pre-wire the CCSR, make sure not to map chunks of
    it in pmap_mapdev().
2012-11-03 22:02:12 +00:00
Attilio Rao
324e57150d userret() already checks for td_locks when INVARIANTS is enabled, so
there is no need to check if Giant is acquired after it.

Reviewed by:	kib
MFC after:	1 week
2012-09-08 18:27:11 +00:00
Alan Cox
8d9e6d9f93 Avoid recursion on the pvh global lock in the aim oea pmap.
Correct the return type of the pmap_ts_referenced() implementations.

Reported by:	jhibbits [1]
Tested by:	andreast
2012-07-10 22:10:21 +00:00
Marcel Moolenaar
8ab303584d Fix a typo that resulted in or-ing PTE_UW twice whrn PTE_SW was needed.
Note that setting the PTE_MODIFIED bit based on whether write is possible
is incorrect. We should set PTE_MODIFIED based on whether the access
is a write operation.
2012-07-02 21:21:12 +00:00
Marcel Moolenaar
863fcb91a4 Handle traps from the debugger. We need to catch them and re-enter
the debugger where they're being taken care of.
2012-07-02 21:18:09 +00:00
Marcel Moolenaar
816da2204a Invalidate any TLB1 entries we don't need. The firmware (e.g. U-Boot)
may have added entries that conflict with TLB0 entries.
2012-07-02 21:15:56 +00:00
Marcel Moolenaar
ab83b69996 Implement cpu_flush_dcache(). This allows us to optimize __syncicache()
for the common case in chich D-caches are coherent by virtue of busdma.
2012-07-02 21:11:01 +00:00
Rafal Jaworowski
691df1a1f8 Panic openly if we cannot retrieve memory information from the device tree.
This is a critical condition and can lead to all sorts of misterious hangs if
not handled.

Obtained from:	Semihalf
Also reported by: thompsa
2012-05-30 18:05:48 +00:00