155 Commits

Author SHA1 Message Date
Konstantin Belousov
e8a4a618cf Add pmap function pmap_copy_pages(), which copies the content of the
pages around, taking array of vm_page_t both for source and
destination.  Starting offsets and total transfer size are specified.

The function implements optimal algorithm for copying using the
platform-specific optimizations.  For instance, on the architectures
where the direct map is available, no transient mappings are created,
for i386 the per-cpu ephemeral page frame is used.  The code was
typically borrowed from the pmap_copy_page() for the same
architecture.

Only i386/amd64, powerpc aim and arm/arm-v6 implementations were
tested at the time of commit. High-level code, not committed yet to
the tree, ensures that the use of the function is only allowed after
explicit enablement.

For sparc64, the existing code has known issues and a stub is added
instead, to allow the kernel to link.

Sponsored by:	The FreeBSD Foundation
Tested by:	pho (i386, amd64), scottl (amd64), ian (arm and arm-v6)
MFC after:	2 weeks
2013-03-14 20:18:12 +00:00
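
A minimal user-space sketch of the copy loop that e8a4a618cf describes, assuming the direct-map case in which every page is directly addressable. The names `copy_pages` and `SKETCH_PAGE_SIZE`, and the use of plain byte buffers in place of vm_page_t, are illustrative assumptions and not the committed code.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, for illustration only */

/*
 * Copy xfersize bytes starting at logical offset a_offset in the source
 * page array to logical offset b_offset in the destination page array.
 * Each chunk is clamped so that it never crosses a page boundary on
 * either the source or the destination side.
 */
static void
copy_pages(unsigned char *src[], size_t a_offset,
    unsigned char *dst[], size_t b_offset, size_t xfersize)
{
	while (xfersize > 0) {
		size_t a_pg_off = a_offset % SKETCH_PAGE_SIZE;
		size_t b_pg_off = b_offset % SKETCH_PAGE_SIZE;
		size_t cnt = xfersize;

		if (cnt > SKETCH_PAGE_SIZE - a_pg_off)
			cnt = SKETCH_PAGE_SIZE - a_pg_off;
		if (cnt > SKETCH_PAGE_SIZE - b_pg_off)
			cnt = SKETCH_PAGE_SIZE - b_pg_off;

		memcpy(dst[b_offset / SKETCH_PAGE_SIZE] + b_pg_off,
		    src[a_offset / SKETCH_PAGE_SIZE] + a_pg_off, cnt);

		a_offset += cnt;
		b_offset += cnt;
		xfersize -= cnt;
	}
}
```

On architectures without a direct map, the same loop would presumably map each source and destination page transiently before the copy; that part is omitted here.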
Attilio Rao
89f6b8632c Switch the vm_object mutex to be a rwlock. This will enable
further optimizations in the future, where the vm_object lock is held
in read mode most of the time that the page cache resident pool of pages
is accessed for reading purposes.

The change is mostly mechanical, but a few notes are in order:
* The KPI changes as follows:
  - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
  - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
  - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
  - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED()
    (in order to avoid visibility of implementation details)
  - The read-mode operations are added:
    VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(),
    VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
* To avoid namespace pollution, vm/vm_pager.h does not pull in the lock
  headers itself; consumers had to include sys/mutex.h directly to
  satisfy its inline functions that use VM_OBJECT_LOCK().  As a result,
  all vm/vm_pager.h consumers must now also include sys/rwlock.h.
* zfs requires a quite convoluted fix to include FreeBSD rwlocks into
  the compat layer because the name clash between FreeBSD and solaris
  versions must be avoided.
  For this purpose zfs redefines the vm_object locking functions
  directly, isolating the FreeBSD components in specific compat stubs.

The KPI is heavily broken by this commit.  Third-party ports must
be updated accordingly (I can think off-hand of VirtualBox, for example).

Sponsored by:	EMC / Isilon storage division
Reviewed by:	jeff
Reviewed by:	pjd (ZFS specific review)
Discussed with:	alc
Tested by:	pho
2013-03-09 02:32:23 +00:00
Alexander Motin
fdc5dd2d2f MFcalloutng:
Switch eventtimers(9) from using struct bintime to sbintime_t.
Even before this, not a single driver really supported the full dynamic
range of struct bintime even in theory, not to mention the practical
inexpediency.  This change legitimizes the status quo and cleans up the code.
2013-02-28 13:46:03 +00:00
Attilio Rao
dc1558d1cd Merge from vmobj-rwlock:
VM_OBJECT_LOCKED() macro is only used to implement a custom version
of lock assertions right now (which likely spread out thanks to
copy and paste).
Remove it and implement actual assertions.

Sponsored by:	EMC / Isilon storage division
Reviewed by:	alc
Tested by:	pho
2013-02-27 18:12:13 +00:00
Attilio Rao
a4915c21d9 Merge from vmc-playground branch:
Replace the sub-optimal uma_zone_set_obj() primitive with the more modern
uma_zone_reserve_kva().  The new primitive reserves beforehand
the necessary KVA space to serve the zone allocations and allocates pages
with ALLOC_NOOBJ.  More specifically:
- uma_zone_reserve_kva() does not need an object to serve the backend
  allocator.
- uma_zone_reserve_kva() can serve M_WAITOK requests, in order to
  support zones which need to do uma_prealloc() too.
- When possible, uma_zone_reserve_kva() uses the direct mapping
  via uma_small_alloc() rather than relying on the KVA / offset
  combination.

The removal of the object attribute allows 2 further changes:
1) _vm_object_allocate() becomes static within vm_object.c
2) VM_OBJECT_LOCK_INIT() is removed.  This function is replaced by
   direct calls to mtx_init() as there is no need to export it anymore
   and the calls are no longer homogeneous either: there are now small
   differences between the arguments passed to mtx_init().

Sponsored by:	EMC / Isilon storage division
Reviewed by:	alc (which also offered almost all the comments)
Tested by:	pho, jhb, davide
2013-02-26 23:35:27 +00:00
Rui Paulo
eaba9848dd Introduce PLATFORMMETHOD_END and use it.
2013-02-13 02:21:45 +00:00
Alan Cox
e33d0ab830 Replace all uses of the page queues lock by a R/W lock that is private
to this pmap.

Eliminate two redundant #include's.

Tested by:	marcel
2012-11-03 23:22:49 +00:00
Marcel Moolenaar
4a49da83e1 1.  Have the APs initialize the TLB1 entries from what has been
    programmed on the BSP during (early) boot. This makes sure
    that the APs get configured the same as the BSP, irrespective
    of how FreeBSD was loaded.
2.  Make sure to flush the dcache after writing the TLB1 entries
    to the boot page. The APs aren't part of the coherency domain
    just yet.
3.  Set pmap_bootstrapped after calling pmap_bootstrap(). The
    FDT code now maps the devices (like OF), and this resulted
    in a panic.
4.  Since we pre-wire the CCSR, make sure not to map chunks of
    it in pmap_mapdev().
2012-11-03 22:02:12 +00:00
Attilio Rao
324e57150d userret() already checks for td_locks when INVARIANTS is enabled, so
there is no need to check if Giant is acquired after it.

Reviewed by:	kib
MFC after:	1 week
2012-09-08 18:27:11 +00:00
Alan Cox
8d9e6d9f93 Avoid recursion on the pvh global lock in the aim oea pmap.
Correct the return type of the pmap_ts_referenced() implementations.

Reported by:	jhibbits [1]
Tested by:	andreast
2012-07-10 22:10:21 +00:00
Marcel Moolenaar
8ab303584d Fix a typo that resulted in or-ing PTE_UW twice when PTE_SW was needed.
Note that setting the PTE_MODIFIED bit based on whether write is possible
is incorrect. We should set PTE_MODIFIED based on whether the access
is a write operation.
2012-07-02 21:21:12 +00:00
Marcel Moolenaar
863fcb91a4 Handle traps from the debugger. We need to catch them and re-enter
the debugger where they're being taken care of.
2012-07-02 21:18:09 +00:00
Marcel Moolenaar
816da2204a Invalidate any TLB1 entries we don't need. The firmware (e.g. U-Boot)
may have added entries that conflict with TLB0 entries.
2012-07-02 21:15:56 +00:00
Marcel Moolenaar
ab83b69996 Implement cpu_flush_dcache(). This allows us to optimize __syncicache()
for the common case in which D-caches are coherent by virtue of busdma.
2012-07-02 21:11:01 +00:00
Rafal Jaworowski
691df1a1f8 Panic openly if we cannot retrieve memory information from the device tree.
This is a critical condition and can lead to all sorts of mysterious hangs if
not handled.

Obtained from:	Semihalf
Also reported by: thompsa
2012-05-30 18:05:48 +00:00
Rafal Jaworowski
aa6bc7dc29 Extract vendor specific Book-E pieces into separate files and have a common
skeleton (maybe we should kobj-tize this one day).

Note the PPC4xx bit is not connected to the build yet.

Obtained from:	AppliedMicro, Semihalf.
2012-05-30 17:34:40 +00:00
Rafal Jaworowski
b504a44a4b Remove redundant check, we catch ULE platform support in common
sys/kern/sched_ule.c
2012-05-27 10:32:10 +00:00
Rafal Jaworowski
17f4cae4a5 Let us manage differences of Book-E PowerPC variations i.e. vendor /
implementation specific vs. the common architecture definition.

Bring PPC4XX defines (PSL, SPR, TLB). Note the new definitions under
BOOKE_PPC4XX are not used in the code yet.

This change set is not supposed to affect existing E500 support, it's just
another reorg step before bringing support for E500mc, E5500 and PPC465.

Obtained from:	AppliedMicro, Freescale, Semihalf
2012-05-27 10:25:20 +00:00
Rafal Jaworowski
925f0a6ed6 Retrieve CPU number info from the device tree.
Obtained from:	Freescale, Semihalf.
2012-05-26 13:42:55 +00:00
Rafal Jaworowski
2f6bd24181 Rename e500 prefix to match other Book-E CPU variations. CPU id tidbits for
the new cores.

Obtained from:	Freescale, Semihalf.
2012-05-26 13:36:18 +00:00
Rafal Jaworowski
20b7961267 Fix physical address type to vm_paddr_t.
2012-05-24 21:13:24 +00:00
Marcel Moolenaar
a45d9127bd o   Rename kernload_ap to bp_kernelload. This introduces a common
    prefix for variables that live in the boot page.
o   Add bp_trace (yes, it's in the boot page) that gets zeroed before we
    try to wake a core and to which the core being woken can write markers
    so that we know where the core was in case it doesn't wake up. The
    boot code does not yet write markers (to follow).
o   Disable the boot page translation to allow the last 4K page to be used
    for whatever we please. It would get mapped otherwise.
o   Fix kernstart in the case of SMP. The start argument is typically page
    aligned due to the alignment requirements that come with having a boot
    page. The point of using trunc_page is that we get the actual load
    address given that the entry point is immediately following the ELF
    headers. In the SMP case this ended up exactly 4K after the load
    address. Hence subtracting 1 from start.
2012-05-24 20:58:40 +00:00
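
The kernstart arithmetic in the last item of a45d9127bd can be checked with a small sketch. The helper names and the 4K page size constant are assumptions made for illustration; only the "truncate `start - 1` to a page boundary" step comes from the commit message.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed 4K page */

/* Round an address down to the start of its page. */
static uint32_t
sketch_trunc_page(uint32_t addr)
{
	return (addr & ~(uint32_t)(SKETCH_PAGE_SIZE - 1));
}

/*
 * Derive the load address from the entry point.  Subtracting 1 first
 * keeps a page-aligned entry point (the SMP case, exactly 4K past the
 * load address) from truncating to itself.
 */
static uint32_t
sketch_kernstart(uint32_t start)
{
	return (sketch_trunc_page(start - 1));
}
```

With a made-up load address of 0xc0000000, an SMP entry point of 0xc0001000 yields 0xc0000000, and so does a non-SMP entry point that sits a few hundred bytes past the ELF headers.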
Konstantin Belousov
62c625fdd2 Finally, try to enable the nxstacks on amd64 and powerpc64 for both 64bit
and 32bit ABIs. Also try to enable nxstacks for PAE/i386 when supported,
and some variants of powerpc32.

MFC after:	2 months (if ever)
2012-01-30 07:56:00 +00:00
Jayachandran C.
07042bef45 Fix OF_finddevice error return value in case of FDT.
According to the open firmware standard, finddevice call has to return
a phandle with value of -1 in case of error.

This commit is to:
- Fix the FDT implementation of this interface (ofw_fdt_finddevice) to
  return (phandle_t)-1 in case of error, instead of 0 as it does now.
- Fix up the callers of OF_finddevice() to compare the return value with
  -1 instead of 0 to check for errors.
- Since phandle_t is unsigned, the return value of OF_finddevice should
  be checked with '== -1' rather than '<= 0' or '> 0', fix up these cases
  as well.

Reported by:	nwhitehorn

Reviewed by:	raj
Approved by:	raj, nwhitehorn
2011-12-02 15:24:39 +00:00
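
A short sketch of why 07042bef45 insists on comparing against -1. The commit states only that phandle_t is unsigned; the `uint32_t` typedef and all function names below are assumptions for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t phandle_t;	/* assumed width; the commit only says "unsigned" */

/* Hypothetical stand-in for a failed lookup: returns (phandle_t)-1. */
static phandle_t
sketch_finddevice_fail(void)
{
	return ((phandle_t)-1);
}

/* Check recommended by the commit: compare against -1. */
static bool
lookup_failed(phandle_t node)
{
	return (node == (phandle_t)-1);
}

/*
 * Old-style check.  An unsigned value is never negative, so this is
 * true only for 0 and misses the (phandle_t)-1 error value entirely.
 */
static bool
lookup_failed_old(phandle_t node)
{
	return (node <= 0);
}
```

Because (phandle_t)-1 is the largest representable value, a `> 0` success test would also accept it, which is the other pattern the commit fixes.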
Konstantin Belousov
578113aaa3 Remove locking of the vm page queues from several pmaps, which only
protected the dirty mask updates. The dirty mask updates are handled
by atomics after the r225840.

Submitted by:	alc
Tested by:	flo (sparc64)
MFC after:	2 weeks
2011-09-28 15:01:20 +00:00
Konstantin Belousov
26ccf4f10f Inline the syscallenter() and syscallret(). This reduces the time measured
by the syscall entry speed microbenchmarks by ~10% on amd64.

Submitted by:	jhb
Approved by:	re (bz)
MFC after:	2 weeks
2011-09-11 16:05:09 +00:00
Konstantin Belousov
3407fefef6 Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into atomic
flags field. Updates to the atomic flags are performed using the atomic
ops on the containing word, do not require any vm lock to be held, and
are non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9)
functions are provided to modify aflags.

Document the changes to flags field to only require the page lock.

Introduce vm_page_reference(9) function to provide a stable KPI and
KBI for filesystems like tmpfs and zfs which need to mark a page as
referenced.

Reviewed by:    alc, attilio
Tested by:      marius, flo (sparc64); andreast (powerpc, powerpc64)
Approved by:	re (bz)
2011-09-06 10:30:11 +00:00
Konstantin Belousov
d98d0ce27a - Move the PG_UNMANAGED flag from m->flags to m->oflags, renaming the flag
to VPO_UNMANAGED (and also making the flag protected by the vm object
  lock, instead of vm page queue lock).
- Mark the fake pages with both PG_FICTITIOUS (as it is now) and
  VPO_UNMANAGED. As a consequence, pmap code now can use just
  VPO_UNMANAGED to decide whether the page is unmanaged.

Reviewed by:	alc
Tested by:	pho (x86, previous version), marius (sparc64),
    marcel (arm, ia64, powerpc), ray (mips)
Sponsored by:	The FreeBSD Foundation
Approved by:	re (bz)
2011-08-09 21:01:36 +00:00
Marcel Moolenaar
5ce36fdb77 Cross a T and dot an I:
o   Fix awkward use of braces in combination with mis-indentation.
    A mistake, that happened to yield the right behaviour?
o   Fix typo in comment.

No functional change.

Approved by:	re (blanket)
2011-08-02 23:49:23 +00:00
Marcel Moolenaar
d50c56183e It's invalid to use GLOBAL() for kernload_ap, as the macro switches
to the .data section. We need kernload_ap in the boot page.

Approved by:	re (blanket)
2011-08-02 23:33:44 +00:00
Marcel Moolenaar
d7f74bdca7 There's no ':' after GLOBAL(). Missed due to no SMP testing.
Approved by:	re (blanket)
2011-08-02 23:06:59 +00:00
Marcel Moolenaar
2b5bf115ae Add support for Juniper's loader. The difference between FreeBSD's and
Juniper's loader is that Juniper's loader maps all of the kernel and
preloaded modules at the right virtual address before jumping into the
kernel. FreeBSD's loader simply maps 16MB using the physical address
and expects the kernel to jump through hoops to relocate itself to
its virtual address. The problem with the FreeBSD loader's approach is
that it typically maps too much or too little. There's no harm if it's
too much (other than wasting space), but if it's too little then the
kernel will simply not boot, because the first thing the kernel needs
is the bootinfo structure, which is never mapped in that case. The page
fault that early is fatal.

The changes constitute:
1.  Do not remap the kernel in locore.S. We're mapped where we need to
    be so we can pretty much call into C code after setting up the
    stack.
2.  With kernload and kernload_ap not set in locore.S, we need to set
    them in pmap.c: kernload gets defined when we preserve the TLB1.
    Here we also determine the size of the kernel mapped. kernload_ap
    is set first thing in the pmap_bootstrap() method.
3.  Fix tlb1_map_region() and its use to properly extend the mapped
    kernel size to include low-level data structures.

Approved by:	re (blanket)
Obtained from:	Juniper Networks, Inc
2011-08-02 15:35:43 +00:00
Marcel Moolenaar
9668a15a6a Fix r224187: .word defines a 16-bit object and size_t is defined as
a 32-bit integer. Use .long to define sintrcnt and sintrname.

Approved by:	re (blanket)
2011-07-31 18:26:47 +00:00
Attilio Rao
521ea19d1c - Remove the eintrcnt/eintrnames usage and introduce the concept of
sintrcnt/sintrnames which are symbols containing the size of the 2
  tables.
- For amd64/i386 remove the storage of intr* stuff from assembly files.
  This area can be widely improved by applying the same to other
  architectures and likely finding a unified approach among them and
  move the whole code to be MI. More work in this area is expected to
  happen fairly soon.

No MFC is planned for this patch.

Tested by:	pluknet
Reviewed by:	jhb
Approved by:	re (kib)
2011-07-18 15:19:40 +00:00
Attilio Rao
de138ec703 MFC
2011-06-24 16:35:40 +00:00
Nathan Whitehorn
e69dff491d Use the ABI-mandated thread pointer register (r2 for ppc32, r13 for ppc64)
instead of a PCPU field for curthread. This averts a race on SMP systems
with a high interrupt rate where the thread looking up the value of
curthread could be preempted and migrated between obtaining the PCPU
pointer and reading the value of pc_curthread, resulting in curthread being
observed to be the current thread on the thread's original CPU. This played
merry havoc with the system, in particular with mutexes. Many thanks to
jhb for helping me work this one out.

Note that Book-E is in principle susceptible to the same problem, but has
not been modified yet due to lack of Book-E hardware.

MFC after:	2 weeks
2011-06-23 22:21:28 +00:00
Attilio Rao
c7c2767e33 Remove pc_other_cpus and pc_cpumask usage from powerpc support.
Tested and reviewed by:	andreast
2011-06-16 07:27:13 +00:00
Attilio Rao
61b926921f MFC
2011-05-31 21:22:44 +00:00
Nathan Whitehorn
d098f93019 On multi-core, multi-threaded PPC systems, it is important that the threads
be brought up in the order they are enumerated in the device tree (in
particular, that thread 0 on each core be brought up first). The SLIST
through which we loop to start the CPUs has all of its entries added with
SLIST_INSERT_HEAD(), which means it is in reverse order of enumeration
and so AP startup would always fail in such situations (causing a machine
check or RTAS failure). Fix this by changing the SLIST into an STAILQ,
and inserting new CPUs at the end.

Reviewed by:	jhb
2011-05-31 15:11:43 +00:00
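
The ordering problem fixed by d098f93019 can be reproduced with the `sys/queue.h` macros in user space. The `struct cpuref` layout and the helper names are hypothetical and stand in for the real CPU list.

```c
#include <stddef.h>
#include <sys/queue.h>

struct cpuref {
	int id;				/* position in device-tree enumeration */
	SLIST_ENTRY(cpuref) s_link;
	STAILQ_ENTRY(cpuref) t_link;
};

SLIST_HEAD(cpu_slist, cpuref);
STAILQ_HEAD(cpu_stailq, cpuref);

/* Head insertion: the last CPU enumerated ends up first in the list. */
static int
first_started_slist(struct cpuref *cpus, int n)
{
	struct cpu_slist head = SLIST_HEAD_INITIALIZER(head);
	int i;

	for (i = 0; i < n; i++)
		SLIST_INSERT_HEAD(&head, &cpus[i], s_link);
	return (SLIST_FIRST(&head)->id);
}

/* Tail insertion: the list preserves enumeration order. */
static int
first_started_stailq(struct cpuref *cpus, int n)
{
	struct cpu_stailq head = STAILQ_HEAD_INITIALIZER(head);
	int i;

	for (i = 0; i < n; i++)
		STAILQ_INSERT_TAIL(&head, &cpus[i], t_link);
	return (STAILQ_FIRST(&head)->id);
}
```

With three CPUs enumerated as 0, 1, 2, the SLIST version would start CPU 2 first, while the STAILQ version starts CPU 0 first as the commit requires.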
Attilio Rao
c7df91af4b MFC
2011-05-29 00:59:38 +00:00
Marcel Moolenaar
ebfbeb83f6 o Add system versions for the P4040(E) and P4080(E).
o   In bare_probe(), change the logic that determines the maximum
    number of processors/cores into a switch statement and take
    advantage of the fact that bit 3 of the SVR value indicates
    whether we're running on a security enabled version. Since we
    don't care about that here, mask the bit. All -E versions
    are taken care of automatically.
2011-05-29 00:27:42 +00:00
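
A rough sketch of the switch described in ebfbeb83f6. The version codes and core counts below are hypothetical placeholders rather than real SVR values, and the mask assumes that "bit 3" is counted from the least significant bit (0x0008).

```c
#include <stdint.h>

/* Hypothetical system version codes, chosen only for illustration. */
#define SKETCH_SVR_DUAL	0x1000u
#define SKETCH_SVR_QUAD	0x2000u
#define SKETCH_SVR_OCTO	0x3000u

/* Assumed position of the security-enabled (-E) flag: bit 3. */
#define SKETCH_SVR_SECURITY	0x0008u

/*
 * Return the maximum number of cores for a system version.  Masking
 * the security bit first lets each case cover both the plain and the
 * -E variant of a part.
 */
static int
sketch_max_cpus(uint32_t svr_ver)
{
	switch (svr_ver & ~SKETCH_SVR_SECURITY) {
	case SKETCH_SVR_DUAL:
		return (2);
	case SKETCH_SVR_QUAD:
		return (4);
	case SKETCH_SVR_OCTO:
		return (8);
	default:
		return (1);
	}
}
```

Adding a new part then needs one case label, with its -E sibling handled automatically.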
Marcel Moolenaar
ebf84ceca7 Better support different kernel hand-offs. When loaded directly
from U-Boot, the kernel is passed a standard argc/argv pair.
The Juniper loader passes the metadata pointer as the second
argument and passes 0 in the first. The FreeBSD loader passes
the metadata pointer in the first argument.

As such, have locore preserve the first 2 arguments in registers
r30 & r31. Change e500_init() to accept these arguments. Don't
pass global offsets (i.e. kernel_text and _end) as arguments to
e500_init(). We can reference those directly.

Rename e500_init() to booke_init() now that we're changing the
prototype.

In booke_init(), "decode" arg1 and arg2 to obtain the metadata
pointer correctly. For the U-Boot case, clear SBSS and BSS and
bank on having a static FDT for now. This allows loading the
ELF kernel and jumping to the entry point without trampoline.
2011-05-28 04:10:44 +00:00
Marcel Moolenaar
7faf44ba96 o The P1020(E) & P2020(E) also have two cores. This conditional has
a tendency to grow unwieldy so we may want to revisit this in due
    time.
o   Simplify the CPU reset function by writing to the reset control
    register irrespective of whether the CPU has one and automatically
    falling back to the debug control register if we didn't reset the
    CPU. The side-effect is that we now properly reset future processors
    without first having to add the system version to the list.
2011-05-27 23:18:41 +00:00
Marcel Moolenaar
6a76463e30 Wire the kernel using TLB1 entry 0 rather than entry 1. A more recent
U-Boot as found on the P1020RDB doesn't like it when we use entry 1
(for some reason) whereas an older U-Boot doesn't mind if we use entry
0. If nothing else, this simplifies the code a bit.
2011-05-27 23:09:12 +00:00
Attilio Rao
9cb46334ee MFC
2011-05-27 16:09:10 +00:00
Marcel Moolenaar
7512c508df Don't assume we have a valid bootinfo pointer.
2011-05-26 20:47:05 +00:00
Attilio Rao
20bf92c280 Fix usage of cpumask that cannot be used like that anymore.
Reported by:	pluknet
2011-05-18 16:56:36 +00:00
Attilio Rao
c98b35868f Revert r222069,222068 as they were intended to be committed to the
largeSMP branch.

Reported by:	pluknet
2011-05-18 16:50:13 +00:00
Attilio Rao
1a203896c3 Fix warning spit out.
Reported by:	sbruno
2011-05-18 16:42:01 +00:00
Attilio Rao
db4b2ef5a2 Fix newly introduced code.
Reported by:	sbruno
2011-05-18 16:41:38 +00:00