Commit Graph

243 Commits

Author SHA1 Message Date
Marcel Moolenaar
816da2204a Invalidate any TLB1 entries we don't need. The firmware (e.g. U-Boot)
may have added entries that conflict with TLB0 entries.
2012-07-02 21:15:56 +00:00
Marcel Moolenaar
ab83b69996 Implement cpu_flush_dcache(). This allows us to optimize __syncicache()
for the common case in chich D-caches are coherent by virtue of busdma.
2012-07-02 21:11:01 +00:00
Rafal Jaworowski
691df1a1f8 Panic openly if we cannot retrieve memory information from the device tree.
This is a critical condition and can lead to all sorts of misterious hangs if
not handled.

Obtained from:	Semihalf
Also reported by: thompsa
2012-05-30 18:05:48 +00:00
Rafal Jaworowski
aa6bc7dc29 Extract vendor specific Book-E pieces into separate files and have a common
skeleton (maybe we should kobj-tize this one day).

Note the PPC4xx bit is not connected to the build yet.

Obtained from:	AppliedMicro, Semihalf.
2012-05-30 17:34:40 +00:00
Rafal Jaworowski
b504a44a4b Remove redundant check, we catch ULE platform support in common
sys/kern/sched_ule.c
2012-05-27 10:32:10 +00:00
Rafal Jaworowski
17f4cae4a5 Let us manage differences of Book-E PowerPC variations i.e. vendor /
implementation specific vs. the common architecture definition.

Bring PPC4XX defines (PSL, SPR, TLB). Note the new definitions under
BOOKE_PPC4XX are not used in the code yet.

This change set is not supposed to affect existing E500 support, it's just
another reorg step before bringing support for E500mc, E5500 and PPC465.

Obtained from:	AppliedMicro, Freescale, Semihalf
2012-05-27 10:25:20 +00:00
Rafal Jaworowski
925f0a6ed6 Retrieve CPU number info from the device tree.
Obtained from:	Freescale, Semihalf.
2012-05-26 13:42:55 +00:00
Rafal Jaworowski
2f6bd24181 Rename e500 prefix to match other Book-E CPU variations. CPU id tidbits for
the new cores.

Obtained from:	Freescale, Semihalf.
2012-05-26 13:36:18 +00:00
Rafal Jaworowski
20b7961267 Fix physical address type to vm_paddr_t. 2012-05-24 21:13:24 +00:00
Marcel Moolenaar
a45d9127bd o Rename kernload_ap to bp_kernelload. This to introduce a common prefix
for variables that live in the boot page.
o   Add bp_trace (yes, it's in the boot page) that gets zeroed before we
    try to wake a core and to which the core being woken can write markers
    so that we know where the core was in case it doesn't wake up. The
    boot code does not yet write markers (too follow).
o   Disable the boot page translation to allow the last 4K page to be used
    for whatever we please. It would get mapped otherwise.
o   Fix kernstart in the case of SMP. The start argument is typically page
    aligned due to the alignment requirements that come with having a boot
    page. The point of using trunc_page is that we get the actual load
    address given that the entry point is immediately following the ELF
    headers. In the SMP case this ended up exactly 4K after the load
    address. Hence subtracting 1 from start.
2012-05-24 20:58:40 +00:00
Konstantin Belousov
62c625fdd2 Finally, try to enable the nxstacks on amd64 and powerpc64 for both 64bit
and 32bit ABIs. Also try to enable nxstacks for PAE/i386 when supported,
and some variants of powerpc32.

MFC after:	2 months (if ever)
2012-01-30 07:56:00 +00:00
Jayachandran C.
07042bef45 Fix OF_finddevice error return value in case of FDT.
According to the open firmware standard, finddevice call has to return
a phandle with value of -1 in case of error.

This commit is to:
- Fix the FDT implementation of this interface (ofw_fdt_finddevice) to
  return (phandle_t)-1 in case of error, instead of 0 as it does now.
- Fix up the callers of OF_finddevice() to compare the return value with
  -1 instead of 0 to check for errors.
- Since phandle_t is unsigned, the return value of OF_finddevice should
  be checked with '== -1' rather than '<= 0' or '> 0', fix up these cases
  as well.

Reported by:	nwhitehorn

Reviewed by:	raj
Approved by:	raj, nwhitehorn
2011-12-02 15:24:39 +00:00
Konstantin Belousov
578113aaa3 Remove locking of the vm page queues from several pmaps, which only
protected the dirty mask updates. The dirty mask updates are handled
by atomics after the r225840.

Submitted by:	alc
Tested by:	flo (sparc64)
MFC after:	2 weeks
2011-09-28 15:01:20 +00:00
Konstantin Belousov
26ccf4f10f Inline the syscallenter() and syscallret(). This reduces the time measured
by the syscall entry speed microbenchmarks by ~10% on amd64.

Submitted by:	jhb
Approved by:	re (bz)
MFC after:	2 weeks
2011-09-11 16:05:09 +00:00
Konstantin Belousov
3407fefef6 Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into atomic
flags field. Updates to the atomic flags are performed using the atomic
ops on the containing word, do not require any vm lock to be held, and
are non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9)
functions are provided to modify afalgs.

Document the changes to flags field to only require the page lock.

Introduce vm_page_reference(9) function to provide a stable KPI and
KBI for filesystems like tmpfs and zfs which need to mark a page as
referenced.

Reviewed by:    alc, attilio
Tested by:      marius, flo (sparc64); andreast (powerpc, powerpc64)
Approved by:	re (bz)
2011-09-06 10:30:11 +00:00
Konstantin Belousov
d98d0ce27a - Move the PG_UNMANAGED flag from m->flags to m->oflags, renaming the flag
to VPO_UNMANAGED (and also making the flag protected by the vm object
  lock, instead of vm page queue lock).
- Mark the fake pages with both PG_FICTITIOUS (as it is now) and
  VPO_UNMANAGED. As a consequence, pmap code now can use use just
  VPO_UNMANAGED to decide whether the page is unmanaged.

Reviewed by:	alc
Tested by:	pho (x86, previous version), marius (sparc64),
    marcel (arm, ia64, powerpc), ray (mips)
Sponsored by:	The FreeBSD Foundation
Approved by:	re (bz)
2011-08-09 21:01:36 +00:00
Marcel Moolenaar
5ce36fdb77 Cross a T and dot an I:
o   Fix awkward use of braces in combination with mis-indentation.
    A mistake, that happened to yield the right behaviour?
o   Fix typo in comment.

No functional change.

Approved by:	re (blanket)
2011-08-02 23:49:23 +00:00
Marcel Moolenaar
d50c56183e It's invalid to use GLOBAL() for kernload_ap, as the macro switches
to the .data section. We need kernload_ap in the boot page.

Approved by:	re (blanket)
2011-08-02 23:33:44 +00:00
Marcel Moolenaar
d7f74bdca7 There's no ':' after GLOBAL(). Missed due to no SMP testing.
Approved by:	re (blanket)
2011-08-02 23:06:59 +00:00
Marcel Moolenaar
2b5bf115ae Add support for Juniper's loader. The difference between FreeBSD's and
Juniper's loader is that Juniper's loader maps all of the kernel and
preloaded modules at the right virtual address before jumping into the
kernel. FreeBSD's loader simply maps 16MB using the physical address
and expects the kernel to jump through hoops to relocate itself to
it's virtual address. The problem with the FreeBSD loader's approach is
that it typically maps too much or too little. There's no harm if it's
too much (other than wasting space), but if it's too little then the
kernel will simply not boot, because the first thing the kernel needs
is the bootinfo structure, which is never mapped in that case. The page
fault that early is fatal.

The changes constitute:
1.  Do not remap the kernel in locore.S. We're mapped where we need to
    be so we can pretty much call into C code after setting up the
    stack.
2.  With kernload and kernload_ap not set in locore.S, we need to set
    them in pmap.c: kernload gets defined when we preserve the TLB1.
    Here we also determine the size of the kernel mapped. kernload_ap
    is set first thing in the pmap_bootstrap() method.
3.  Fix tlb1_map_region() and its use to properly externd the mapped
    kernel size to include low-level data structures.

Approved by:	re (blanket)
Obtained from:	Juniper Networks, Inc
2011-08-02 15:35:43 +00:00
Marcel Moolenaar
9668a15a6a Fix r224187: .word defines a 16-bit object and size_t is defined as
a 32-bit intergal. Use .long to define sintrcnt and sintrname.

Approved by:	re (blanket)
2011-07-31 18:26:47 +00:00
Attilio Rao
521ea19d1c - Remove the eintrcnt/eintrnames usage and introduce the concept of
sintrcnt/sintrnames which are symbols containing the size of the 2
  tables.
- For amd64/i386 remove the storage of intr* stuff from assembly files.
  This area can be widely improved by applying the same to other
  architectures and likely finding an unified approach among them and
  move the whole code to be MI. More work in this area is expected to
  happen fairly soon.

No MFC is previewed for this patch.

Tested by:	pluknet
Reviewed by:	jhb
Approved by:	re (kib)
2011-07-18 15:19:40 +00:00
Attilio Rao
de138ec703 MFC 2011-06-24 16:35:40 +00:00
Nathan Whitehorn
e69dff491d Use the ABI-mandated thread pointer register (r2 for ppc32, r13 for ppc64)
instead of a PCPU field for curthread. This averts a race on SMP systems
with a high interrupt rate where the thread looking up the value of
curthread could be preempted and migrated between obtaining the PCPU
pointer and reading the value of pc_curthread, resulting in curthread being
observed to be the current thread on the thread's original CPU. This played
merry havoc with the system, in particular with mutexes. Many thanks to
jhb for helping me work this one out.

Note that Book-E is in principle susceptible to the same problem, but has
not been modified yet due to lack of Book-E hardware.

MFC after:	2 weeks
2011-06-23 22:21:28 +00:00
Attilio Rao
c7c2767e33 Remove pc_other_cpus and pc_cpumask usage from powerpc support.
Tested and reviewed by:	andreast
2011-06-16 07:27:13 +00:00
Attilio Rao
61b926921f MFC 2011-05-31 21:22:44 +00:00
Nathan Whitehorn
d098f93019 On multi-core, multi-threaded PPC systems, it is important that the threads
be brought up in the order they are enumerated in the device tree (in
particular, that thread 0 on each core be brought up first). The SLIST
through which we loop to start the CPUs has all of its entries added with
SLIST_INSERT_HEAD(), which means it is in reverse order of enumeration
and so AP startup would always fail in such situations (causing a machine
check or RTAS failure). Fix this by changing the SLIST into an STAILQ,
and inserting new CPUs at the end.

Reviewed by:	jhb
2011-05-31 15:11:43 +00:00
Attilio Rao
c7df91af4b MFC 2011-05-29 00:59:38 +00:00
Marcel Moolenaar
ebfbeb83f6 o Add system versions for the P4040(E) and P4080(E).
o   In bare_probe(), change the logic that determines the maximum
    number of processors/cores into a switch statement and take
    advantage of the fact that bit 3 of the SVR value indicates
    whether we're running on a security enabled version. Since we
    don't care about that here, mask the bit. All -E versions
    are taken care of automatically.
2011-05-29 00:27:42 +00:00
Marcel Moolenaar
ebf84ceca7 Better support different kernel hand-offs. When loaded directly
from U-Boot, the kernel is passed a standard argc/argv pair.
The Juniper loader passes the metadata pointer as the second
argument and passes 0 in the first. The FreeBSD loader passes
the metadata pointer in the first argument.

As such, have locore preserve the first 2 arguments in registers
r30 & r31. Change e500_init() to accept these arguments. Don't
pass global offsets (i.e. kernel_text and _end) as arguments to
e500_init(). We can reference those directly.

Rename e500_init() to booke_init() now that we're changing the
prototype.

In booke_init(), "decode" arg1 and arg2 to obtain the metadata
pointer correctly. For the U-Boot case, clear SBSS and BSS and
bank on having a static FDT for now. This allows loading the
ELF kernel and jumping to the entry point without trampoline.
2011-05-28 04:10:44 +00:00
Marcel Moolenaar
7faf44ba96 o The P1020(E) & P2020(E) also have two cores. This conditional has
a tendency to grow unwieldy so we may want to revisit this in due
    time.
o   Simplify the CPU reset function by writing to the reset control
    register irrespective of whether the CPU has one and automatically
    falling back to the debug control register if we didn't reset the
    CPU. The side-effect is that we now properly reset future processors
    without first having to add the system version to the list.
2011-05-27 23:18:41 +00:00
Marcel Moolenaar
6a76463e30 Wire the kernel using TLB1 entry 0 rather than entry 1. A more recent
U-Boot as found on the P1020RDB doesn't like it when we use entry 1
(for some reason) whereas an older U-Boot doesn't mind if we use entry
0. If anything else, this simplifies the code a bit.
2011-05-27 23:09:12 +00:00
Attilio Rao
9cb46334ee MFC 2011-05-27 16:09:10 +00:00
Marcel Moolenaar
7512c508df Don't assume we have a valid bootinfo pointer. 2011-05-26 20:47:05 +00:00
Attilio Rao
20bf92c280 Fix usage of cpumask that cannot be used like that anymore.
Reported by:	pluknet
2011-05-18 16:56:36 +00:00
Attilio Rao
c98b35868f Revert r222069,222068 as they were intended to be committed to the
largeSMP branch.

Reported by:	pluknet
2011-05-18 16:50:13 +00:00
Attilio Rao
1a203896c3 Fix warning spit out.
Reported by:	sbruno
2011-05-18 16:42:01 +00:00
Attilio Rao
db4b2ef5a2 Fix newly introduced code.
Reported by:	sbruno
2011-05-18 16:41:38 +00:00
Attilio Rao
fea3a3fa94 MFC 2011-05-17 22:03:01 +00:00
Attilio Rao
c47dd3db8c Add the powerpc support.
Note that there is a dirty hack for calling openpic_write(), but
nwhitehorn approved it.

Discussed with:	nwhitehorn
2011-05-09 16:16:15 +00:00
Sergey Kandaurov
4053b05b91 Make MSGBUF_SIZE kernel option a loader tunable kern.msgbufsize.
Submitted by:	perryh pluto.rain.com (previous version)
Reviewed by:	jhb
Approved by:	kib (mentor)
Tested by:	universe
2011-01-21 10:26:26 +00:00
Konstantin Belousov
55aabb7fd1 For architectures not using direct map , and requiring real KVA page for
sf buf allocation, use wakeup() instead of wakeup_one() to notify sf
buffer waiters about free buffer.

sf_buf_alloc() calls msleep(PCATCH) when SFB_CATCH flag was given,
and for simultaneous wakeup and signal delivery, msleep() returns
EINTR/ERESTART despite the thread was selected for wakeup_one(). As
result, we loose a wakeup, and some other waiter will not be woken up.

Reported and tested by:	az
Reviewed by:	alc, jhb
MFC after:	1 week
2011-01-18 21:57:02 +00:00
Marcel Moolenaar
e3d4100642 Support booting non FDT-capable loaders:
1.  Allow embedding the FDT into the kernel, just like PowerPC/book-E.
2.  If the loader passes us a pointer to the bootinfo structure, save
    it and use it to fill in the gaps (e.g. bus frequencies, etc).
2011-01-17 23:54:50 +00:00
Dimitry Andric
3e288e6238 After some off-list discussion, revert a number of changes to the
DPCPU_DEFINE and VNET_DEFINE macros, as these cause problems for various
people working on the affected files.  A better long-term solution is
still being considered.  This reversal may give some modules empty
set_pcpu or set_vnet sections, but these are harmless.

Changes reverted:

------------------------------------------------------------------------
r215318 | dim | 2010-11-14 21:40:55 +0100 (Sun, 14 Nov 2010) | 4 lines

Instead of unconditionally emitting .globl's for the __start_set_xxx and
__stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu
sections are actually defined.

------------------------------------------------------------------------
r215317 | dim | 2010-11-14 21:38:11 +0100 (Sun, 14 Nov 2010) | 3 lines

Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout
the tree.

------------------------------------------------------------------------
r215316 | dim | 2010-11-14 21:23:02 +0100 (Sun, 14 Nov 2010) | 2 lines

Add macros to define static instances of VNET_DEFINE and DPCPU_DEFINE.
2010-11-22 19:32:54 +00:00
Dimitry Andric
31c6a0037e Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout
the tree.
2010-11-14 20:38:11 +00:00
Nathan Whitehorn
6413b05739 Add some platform KOBJ extensions and continue integrating PowerPC
hypervisor infrastructure support:
- Fix coexistence of multiple platform modules in the same kernel
- Allow platform modules to provide an SMP topology
- PowerPC hypervisors limit the amount of memory accessible in real mode.
  Allow the platform modules to specify the maximum real-mode address,
  and modify the bits of the kernel that need to allocate
  real-mode-accessible buffers to respect this limits.
2010-11-12 04:18:19 +00:00
Nathan Whitehorn
5b7ed13bc8 Centralize CPU idle routines into powerpc/cpu.c and use the same
cpu_idle_hook mechanism that x86 uses for overriding the idle routine.
This is required for supporting ilding the CPU under PowerPC hypervisors.
2010-11-12 03:43:22 +00:00
Rafal Jaworowski
1f87b29431 Fix typo in the comment. 2010-11-11 13:46:28 +00:00
Rafal Jaworowski
1d56a2801d Use local TLB_UNLOCKED marker instead of MTX_UNOWNED for Book-E PowerPC trap
routines.

This unbreaks Book-E build after the recent machine/mutex.h removal.

While there move tlb_*lock() prototypes to machine/tlb.h.

Submitted by:	jhb
2010-11-11 13:35:23 +00:00
John Baldwin
b3e3402d3a Remove unused includes of <sys/mutex.h> and <machine/mutex.h>. 2010-11-09 20:41:10 +00:00
John Baldwin
0108cce0a4 Adjust the order of operations in spinlock_enter() and spinlock_exit() to
work properly with single-stepping in a kernel debugger.  Specifically,
these routines have always disabled interrupts before increasing the nesting
count and restored the prior state of interrupts after decreasing the nesting
count to avoid problems with a nested interrupt not disabling interrupts
when acquiring a spin lock.  However, trap interrupts for single-stepping
can still occur even when interrupts are disabled.  Now the saved state of
interrupts is not saved in the thread until after interrupts have been
disabled and the nesting count has been increased.  Similarly, the saved
state from the thread cannot be read once the nesting count has been
decreased to zero.  To fix this, use temporary variables to store interrupt
state and shuffle it between the thread's MD area and the appropriate
registers.

In cooperation with:	bde
MFC after:     1 month
2010-11-05 13:42:58 +00:00
Peter Grehan
33529b98d5 Introduce inheritance into the PowerPC MMU kobj interface.
include/mmuvar.h - Change the MMU_DEF macro to also create the class
definition as well as define the DATA_SET. Add a macro, MMU_DEF_INHERIT,
which has an extra parameter specifying the MMU class to inherit methods
from. Update the comments at the start of the header file to describe the
new macros.

booke/pmap.c
aim/mmu_oea.c
aim/mmu_oea64.c - Collapse mmu_def_t declaration into updated MMU_DEF macro

The MMU_DEF_INHERIT macro will be used in the PS3 MMU implementation to
allow it to inherit the stock powerpc64 MMU methods.

Reviewed by:	nwhitehorn
2010-09-15 00:17:52 +00:00
Alexander Motin
a157e42516 Refactor timer management code with priority to one-shot operation mode.
The main goal of this is to generate timer interrupts only when there is
some work to do. When CPU is busy interrupts are generating at full rate
of hz + stathz to fullfill scheduler and timekeeping requirements. But
when CPU is idle, only minimum set of interrupts (down to 8 interrupts per
second per CPU now), needed to handle scheduled callouts is executed.
This allows significantly increase idle CPU sleep time, increasing effect
of static power-saving technologies. Also it should reduce host CPU load
on virtualized systems, when guest system is idle.

There is set of tunables, also available as writable sysctls, allowing to
control wanted event timer subsystem behavior:
  kern.eventtimer.timer - allows to choose event timer hardware to use.
On x86 there is up to 4 different kinds of timers. Depending on whether
chosen timer is per-CPU, behavior of other options slightly differs.
  kern.eventtimer.periodic - allows to choose periodic and one-shot
operation mode. In periodic mode, current timer hardware taken as the only
source of time for time events. This mode is quite alike to previous kernel
behavior. One-shot mode instead uses currently selected time counter
hardware to schedule all needed events one by one and program timer to
generate interrupt exactly in specified time. Default value depends of
chosen timer capabilities, but one-shot mode is preferred, until other is
forced by user or hardware.
  kern.eventtimer.singlemul - in periodic mode specifies how much times
higher timer frequency should be, to not strictly alias hardclock() and
statclock() events. Default values are 2 and 4, but could be reduced to 1
if extra interrupts are unwanted.
  kern.eventtimer.idletick - makes each CPU to receive every timer interrupt
independently of whether they busy or not. By default this options is
disabled. If chosen timer is per-CPU and runs in periodic mode, this option
has no effect - all interrupts are generating.

As soon as this patch modifies cpu_idle() on some platforms, I have also
refactored one on x86. Now it makes use of MONITOR/MWAIT instrunctions
(if supported) under high sleep/wakeup rate, as fast alternative to other
methods. It allows SMP scheduler to wake up sleeping CPUs much faster
without using IPI, significantly increasing performance on some highly
task-switching loads.

Tested by:	many (on i386, amd64, sparc64 and powerc)
H/W donated by:	Gheorghe Ardelean
Sponsored by:	iXsystems, Inc.
2010-09-13 07:25:35 +00:00
Alexander Motin
707c2fb950 Update PowerPC event timer code to use new event timers infrastructure.
Reviewed by:	nwitehorn
Tested by:	andreast
H/W donated by:	Gheorghe Ardelean
2010-09-11 04:45:51 +00:00
Nathan Whitehorn
b2a237be5c Restructure how reset and poweroff are handled on PowerPC systems, since
the existing code was very platform specific, and broken for SMP systems
trying to reboot from KDB.

- Add a new PLATFORM_RESET() method to the platform KOBJ interface, and
  migrate existing reset functions into platform modules.
- Modify the OF_reboot() routine to submit the request by hand to avoid
  the IPIs involved in the regular openfirmware() routine. This fixes
  reboot from KDB on SMP machines.
- Move non-KDB reset and poweroff functions on the Powermac platform
  into the relevant power control drivers (cuda, pmu, smu), instead of
  using them through the Open Firmware backdoor.
- Rename platform_chrp to platform_powermac since it has become
  increasingly Powermac specific. When we gain support for IBM systems,
  we will grow a new platform_chrp.
2010-08-31 15:27:46 +00:00
John Baldwin
8c7a92bd4a Remove unused KTRACE includes. 2010-08-19 16:41:27 +00:00
Rafal Jaworowski
4f124b977c Eliminate FDT_IMMR_VA define.
This removes platform dependencies from <machine>/fdt.h for the benfit of
portability.
2010-07-19 18:47:18 +00:00
Nathan Whitehorn
c3e289e1ce MFppc64:
Kernel sources for 64-bit PowerPC, along with build-system changes to keep
32-bit kernels compiling (build system changes for 64-bit kernels are
coming later). Existing 32-bit PowerPC kernel configurations must be
updated after this change to specify their architecture.
2010-07-13 05:32:19 +00:00
Nathan Whitehorn
cc81c44dd8 Unify ABI-related bits of the Book-E and AIM machdep routines
(exec_setregs, etc.) in order to simplify the addition of 64-bit support,
and possible future extension of the Book-E code to handle hard floating
point and Altivec.

MFC after:	1 month
2010-07-12 16:08:07 +00:00
Rafal Jaworowski
d1d3233ebd Convert Freescale PowerPC platforms to FDT convention.
The following systems are affected:

  - MPC8555CDS
  - MPC8572DS

This overhaul covers the following major changes:

  - All integrated peripherals drivers for Freescale MPC85XX SoC, which are
    currently in the FreeBSD source tree are reworked and adjusted so they
    derive config data out of the device tree blob (instead of hard coded /
    tabelarized values).

  - This includes: LBC, PCI / PCI-Express, I2C, DS1553, OpenPIC, TSEC, SEC,
    QUICC, UART, CFI.

  - Thanks to the common FDT infrastrucutre (fdtbus, simplebus) we retire
    ocpbus(4) driver, which was based on hard-coded config data.

Note that world for these platforms has to be built WITH_FDT.

Reviewed by:	imp
Sponsored by:	The FreeBSD Foundation
2010-07-11 21:08:29 +00:00
Nathan Whitehorn
f6421f31e3 Replace the existing PowerPC busdma implementation with the one from
amd64 (with slight modifications). This provides support for bounce
buffers, which are required on systems with RAM above 4 GB.
2010-07-08 15:38:55 +00:00
Marcel Moolenaar
8a35d194f2 Remove the unneeded header <machine/intr.h>. 2010-07-02 02:17:39 +00:00
John Baldwin
fc0de8f0b6 Move prototypes for kern_sigtimedwait() and kern_sigprocmask() to
<sys/syscallsubr.h> where all other kern_<syscall> prototypes live.
2010-06-30 18:03:42 +00:00
Marcel Moolenaar
ed571bca9f Remove debugging printf() -- that is, I assume it was for debugging :-) 2010-06-24 05:47:00 +00:00
Nathan Whitehorn
eaef5f0af8 Provide for multiple, cascaded PICs on PowerPC systems, and extend the
OFW interrupt map interface to also return the device's interrupt parent.

MFC after:	8.1-RELEASE
2010-06-18 14:06:27 +00:00
Alan Cox
9124d0d6a3 Relax one of the new assertions in pmap_enter() a little. Specifically,
allow pmap_enter() to be performed on an unmanaged page that doesn't have
VPO_BUSY set.  Having VPO_BUSY set really only matters for managed pages.
(See, for example, pmap_remove_write().)
2010-06-11 15:49:39 +00:00
Alan Cox
ce18658792 Reduce the scope of the page queues lock and the number of
PG_REFERENCED changes in vm_pageout_object_deactivate_pages().
Simplify this function's inner loop using TAILQ_FOREACH(), and shorten
some of its overly long lines.  Update a stale comment.

Assert that PG_REFERENCED may be cleared only if the object containing
the page is locked.  Add a comment documenting this.

Assert that a caller to vm_page_requeue() holds the page queues lock,
and assert that the page is on a page queue.

Push down the page queues lock into pmap_ts_referenced() and
pmap_page_exists_quick().  (As of now, there are no longer any pmap
functions that expect to be called with the page queues lock held.)

Neither pmap_ts_referenced() nor pmap_page_exists_quick() should ever
be passed an unmanaged page.  Assert this rather than returning "0"
and "FALSE" respectively.

ARM:

Simplify pmap_page_exists_quick() by switching to TAILQ_FOREACH().

Push down the page queues lock inside of pmap_clearbit(), simplifying
pmap_clear_modify(), pmap_clear_reference(), and pmap_remove_write().
Additionally, this allows for avoiding the acquisition of the page
queues lock in some cases.

PowerPC/AIM:

moea*_page_exits_quick() and moea*_page_wired_mappings() will never be
called before pmap initialization is complete.  Therefore, the check
for moea_initialized can be eliminated.

Push down the page queues lock inside of moea*_clear_bit(),
simplifying moea*_clear_modify() and moea*_clear_reference().

The last parameter to moea*_clear_bit() is never used.  Eliminate it.

PowerPC/BookE:

Simplify mmu_booke_page_exists_quick()'s control flow.

Reviewed by:	kib@
2010-06-10 16:56:35 +00:00
Alan Cox
85a71b2578 Don't set PG_WRITEABLE in pmap_enter() unless the page is managed.
Correct a typo in a nearby comment on sparc64.
2010-06-05 18:20:09 +00:00
Alan Cox
b5bde83122 In the case that mmu_booke_enter_locked() is changing the attributes of a
mapping but not changing the physical page being mapped, the wrong flags
were being inspected in order to determine whether or not to flush the
instruction cache.  The effect of looking at the wrong flags was that the
instruction cache was never being flushed.

Reviewed by:	marcel
2010-06-01 19:56:02 +00:00
Alan Cox
c46b90e90a Push down page queues lock acquisition in pmap_enter_object() and
pmap_is_referenced().  Eliminate the corresponding page queues lock
acquisitions from vm_map_pmap_enter() and mincore(), respectively.  In
mincore(), this allows some additional cases to complete without ever
acquiring the page queues lock.

Assert that the page is managed in pmap_is_referenced().

On powerpc/aim, push down the page queues lock acquisition from
moea*_is_modified() and moea*_is_referenced() into moea*_query_bit().
Again, this will allow some additional cases to complete without ever
acquiring the page queues lock.

Reorder a few statements in vm_page_dontneed() so that a race can't lead
to an old reference persisting.  This scenario is described in detail by a
comment.

Correct a spelling error in vm_page_dontneed().

Assert that the object is locked in vm_page_clear_dirty(), and restrict the
page queues lock assertion to just those cases in which the page is
currently writeable.

Add object locking to vnode_pager_generic_putpages().  This was the one
and only place where vm_page_clear_dirty() was being called without the
object being locked.

Eliminate an unnecessary vm_page_lock() around vnode_pager_setsize()'s call
to vm_page_clear_dirty().

Change vnode_pager_generic_putpages() to the modern-style of function
definition.  Also, change the name of one of the parameters to follow
virtual memory system naming conventions.

Reviewed by:	kib
2010-05-26 18:00:44 +00:00
Alan Cox
567e51e18c Roughly half of a typical pmap_mincore() implementation is machine-
independent code.  Move this code into mincore(), and eliminate the
page queues lock from pmap_mincore().

Push down the page queues lock into pmap_clear_modify(),
pmap_clear_reference(), and pmap_is_modified().  Assert that these
functions are never passed an unmanaged page.

Eliminate an inaccurate comment from powerpc/powerpc/mmu_if.m:
Contrary to what the comment says, pmap_mincore() is not simply an
optimization.  Without a complete pmap_mincore() implementation,
mincore() cannot return either MINCORE_MODIFIED or MINCORE_REFERENCED
because only the pmap can provide this information.

Eliminate the page queues lock from vfs_setdirty_locked_object(),
vm_pageout_clean(), vm_object_page_collect_flush(), and
vm_object_page_clean().  Generally speaking, these are all accesses
to the page's dirty field, which are synchronized by the containing
vm object's lock.

Reduce the scope of the page queues lock in vm_object_madvise() and
vm_page_dontneed().

Reviewed by:	kib (an earlier version)
2010-05-24 14:26:57 +00:00
Konstantin Belousov
afe1a68827 Reorganize syscall entry and leave handling.
Extend struct sysvec with three new elements:
sv_fetch_syscall_args - the method to fetch syscall arguments from
  usermode into struct syscall_args. The structure is machine-depended
  (this might be reconsidered after all architectures are converted).
sv_set_syscall_retval - the method to set a return value for usermode
  from the syscall. It is a generalization of
  cpu_set_syscall_retval(9) to allow ABIs to override the way to set a
  return value.
sv_syscallnames - the table of syscall names.

Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding
the call to cpu_set_syscall_retval().

The new functions syscallenter(9) and syscallret(9) are provided that
use sv_*syscall* pointers and contain the common repeated code from
the syscall() implementations for the architecture-specific syscall
trap handlers.

Syscallenter() fetches arguments, calls syscall implementation from
ABI sysent table, and set up return frame. The end of syscall
bookkeeping is done by syscallret().

Take advantage of single place for MI syscall handling code and
implement ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and
PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the
thread is stopped at syscall entry or return point respectively.  The
EXEC flag augments SCX and notifies debugger that the process address
space was changed by one of exec(2)-family syscalls.

The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are
changed to use syscallenter()/syscallret(). MIPS and arm are not
converted and use the mostly unchanged syscall() implementation.

Reviewed by:	jhb, marcel, marius, nwhitehorn, stas
Tested by:	marcel (ia64), marius (sparc64), nwhitehorn (powerpc),
	stas (mips)
MFC after:	1 month
2010-05-23 18:32:02 +00:00
Alan Cox
9ab6032f73 On entry to pmap_enter(), assert that the page is busy. While I'm
here, make the style of assertion used by pmap_enter() consistent
across all architectures.

On entry to pmap_remove_write(), assert that the page is neither
unmanaged nor fictitious, since we cannot remove write access to
either kind of page.

With the push down of the page queues lock, pmap_remove_write() cannot
condition its behavior on the state of the PG_WRITEABLE flag if the
page is busy.  Assert that the object containing the page is locked.
This allows us to know that the page will neither become busy nor will
PG_WRITEABLE be set on it while pmap_remove_write() is running.

Correct a long-standing bug in vm_page_cowsetup().  We cannot possibly
do copy-on-write-based zero-copy transmit on unmanaged or fictitious
pages, so don't even try.  Previously, the call to pmap_remove_write()
would have failed silently.
2010-05-16 23:45:10 +00:00
Alan Cox
3c4a24406b Push down the page queues into vm_page_cache(), vm_page_try_to_cache(), and
vm_page_try_to_free().  Consequently, push down the page queues lock into
pmap_enter_quick(), pmap_page_wired_mapped(), pmap_remove_all(), and
pmap_remove_write().

Push down the page queues lock into Xen's pmap_page_is_mapped().  (I
overlooked the Xen pmap in r207702.)

Switch to a per-processor counter for the total number of pages cached.
2010-05-08 20:34:01 +00:00
Alan Cox
c7a0df65b1 MFamd64/i386 r207205
Clearing a page table entry's accessed bit and setting the page's
  PG_REFERENCED flag in pmap_protect() can't really be justified, so
  don't do it.

Additionally, two changes that make this pmap behave like the others do:

Change pmap_protect() such that it calls vm_page_dirty() only if the
page is managed.

Change pmap_remove_write() such that it doesn't clear a page table
entry's accessed bit.
2010-04-30 15:22:52 +00:00
Kip Macy
2965a45315 On Alan's advice, rather than do a wholesale conversion on a single
architecture from page queue lock to a hashed array of page locks
(based on a patch by Jeff Roberson), I've implemented page lock
support in the MI code and have only moved vm_page's hold_count
out from under page queue mutex to page lock. This changes
pmap_extract_and_hold on all pmaps.

Supported by: Bitgravity Inc.

Discussed with: alc, jeffr, and kib
2010-04-30 00:46:43 +00:00
Alan Cox
7b85f59183 Resurrect pmap_is_referenced() and use it in mincore(). Essentially,
pmap_ts_referenced() is not always appropriate for checking whether or
not pages have been referenced because it clears any reference bits
that it encounters.  For example, in mincore(), clearing the reference
bits has two negative consequences.  First, it throws off the activity
count calculations performed by the page daemon.  Specifically, a page
on which mincore() has called pmap_ts_referenced() looks less active
to the page daemon than it should.  Consequently, the page could be
deactivated prematurely by the page daemon.  Arguably, this problem
could be fixed by having mincore() duplicate the activity count
calculation on the page.  However, there is a second problem for which
that is not a solution.  In order to clear a reference on a 4KB page,
it may be necessary to demote a 2/4MB page mapping.  Thus, a mincore()
by one process can have the side effect of demoting a superpage
mapping within another process!
2010-04-24 17:32:52 +00:00
Nathan Whitehorn
a107d8aac9 Change the arguments of exec_setregs() so that it receives a pointer
to the image_params struct instead of several members of that struct
individually. This makes it easier to expand its arguments in the future
without touching all platforms.

Reviewed by:	jhb
2010-03-25 14:24:00 +00:00
Marcel Moolenaar
33d56ab39b Enable power management for E500 cores. Use "doze" for now to make
sure the caches remain coherent. For single-core configurations and
with busdma changes we could eventually switch to "nap" and force
a D-cache invalidation as part of the DMA completion. To this end,
clear PSL_WE until after we handled the decrementer or external
interrupt as it tells us whether we just woke up or not.
2010-03-23 19:30:56 +00:00
Marcel Moolenaar
6d58efc75d Actually pass a pointer to the trapframe to powerpc_extr_interrupt(). 2010-03-23 01:07:30 +00:00
Nathan Whitehorn
ec3c90f3c8 Place interrupt handling in a critical section and remove double
counting in incrementing the interrupt nesting level. This fixes a number
of bugs in which the interrupt thread could be preempted by an IPI,
indefinitely delaying acknowledgement of the interrupt to the PIC, causing
interrupt starvation and hangs.

Reported by:	linimon
Reviewed by:	marcel, jhb
MFC after:	1 week
2010-03-09 02:00:53 +00:00
Nathan Whitehorn
19317dfd85 Merge r198724 to Book-E. casuword() non-atomically read the current value
of its argument before atomically replacing it, which could occasionally
return the wrong value on an SMP system. This resulted in user mutex
operations hanging when using threaded applications.
2010-02-20 16:13:43 +00:00
Rafal Jaworowski
e95b7f61b9 Call the proper linkup routine in PowerPC Book-E machdep.
Submitted by:	attilio
MFC after:	1 week
2010-02-15 14:38:30 +00:00
Martin Blapp
c2ede4b379 Remove extraneous semicolons, no functional changes.
Submitted by:	Marc Balmer <marc@msys.ch>
MFC after:	1 week
2010-01-07 21:01:37 +00:00
Marcel Moolenaar
1c56376494 Remove a warning in DELAY about large delays. In kern_shutdown.c
we use excessive delays quite habitually.
2009-12-19 20:42:56 +00:00
Nathan Whitehorn
227f66048e Add a CPU features framework on PowerPC and simplify CPU setup a little
more. This provides three new sysctls to user space:
hw.cpu_features - A bitmask of available CPU features
hw.floatingpoint - Whether or not there is hardware FP support
hw.altivec - Whether or not Altivec is available

PR:		powerpc/139154
MFC after:	10 days
2009-11-28 17:33:19 +00:00
Alan Cox
e2997fea72 Simplify the invocation of vm_fault(). Specifically, eliminate the flag
VM_FAULT_DIRTY.  The information provided by this flag can be trivially
inferred by vm_fault().

Discussed with:	kib
2009-11-27 20:24:11 +00:00
Konstantin Belousov
a7b890448c Extract the code that records syscall results in the frame into MD
function cpu_set_syscall_retval().

Suggested by:	marcel
Reviewed by:	marcel, davidxu
PowerPC, ARM, ia64 changes:	marcel
Sparc64 tested and reviewed by:	marius, also sunv reviewed
MIPS tested by:	gonzo
MFC after:	1 month
2009-11-10 11:43:07 +00:00
Konstantin Belousov
d6e029adbe In r197963, a race with thread being selected for signal delivery
while in kernel mode, and later changing signal mask to block the
signal, was fixed for sigprocmask(2) and ptread_exit(3). The same race
exists for sigreturn(2), setcontext(2) and swapcontext(2) syscalls.

Use kern_sigprocmask() instead of direct manipulation of td_sigmask to
reschedule newly blocked signals, closing the race.

Reviewed by:	davidxu
Tested by:	pho
MFC after:	1 month
2009-10-27 10:47:58 +00:00
Nathan Whitehorn
999987e51a Add SMP support on U3-based G5 systems. This does not yet work perfectly:
at least on my Xserve, getting the decrementer and timebase on APs to tick
requires setting up a clock chip over I2C, which is not yet done.

While here, correct the 64-bit tlbie function to set the CPU to 64-bit
mode correctly.

Hardware donated by:	grehan
2009-10-23 03:17:02 +00:00
Marcel Moolenaar
1a4fcaebe3 o Introduce vm_sync_icache() for making the I-cache coherent with
the memory or D-cache, depending on the semantics of the platform.
    vm_sync_icache() is basically a wrapper around pmap_sync_icache(),
    that translates the vm_map_t argumument to pmap_t.
o   Introduce pmap_sync_icache() to all PMAP implementation. For powerpc
    it replaces the pmap_page_executable() function, added to solve
    the I-cache problem in uiomove_fromphys().
o   In proc_rwmem() call vm_sync_icache() when writing to a page that
    has execute permissions. This assures that when breakpoints are
    written, the I-cache will be coherent and the process will actually
    hit the breakpoint.
o   This also fixes the Book-E PMAP implementation that was missing
    necessary locking while trying to deal with the I-cache coherency
    in pmap_enter() (read: mmu_booke_enter_locked).

The key property of this change is that the I-cache is made coherent
*after* writes have been done. Doing it in the PMAP layer when adding
or changing a mapping means that the I-cache is made coherent *before*
any writes happen. The difference is key when the I-cache prefetches.
2009-10-21 18:38:02 +00:00
John Baldwin
013818111a Add a new type of VM object: OBJT_SG. An OBJT_SG object is very similar to
a device pager (OBJT_DEVICE) object in that it uses fictitious pages to
provide aliases to other memory addresses.  The primary difference is that
it uses an sglist(9) to determine the physical addresses for a given offset
into the object instead of invoking the d_mmap() method in a device driver.

Reviewed by:	alc
Approved by:	re (kensmith)
MFC after:	2 weeks
2009-07-24 13:50:29 +00:00
Jeff Roberson
50c202c592 Implement a facility for dynamic per-cpu variables.
- Modules and kernel code alike may use DPCPU_DEFINE(),
   DPCPU_GET(), DPCPU_SET(), etc. akin to the statically defined
   PCPU_*.  Requires only one extra instruction more than PCPU_* and is
   virtually the same as __thread for builtin and much faster for shared
   objects.  DPCPU variables can be initialized when defined.
 - Modules are supported by relocating the module's per-cpu linker set
   over space reserved in the kernel.  Modules may fail to load if there
   is insufficient space available.
 - Track space available for modules with a one-off extent allocator.
   Free may block for memory to allocate space for an extent.

Reviewed by:    jhb, rwatson, kan, sam, grehan, marius, marcel, stas
2009-06-23 22:42:39 +00:00
Alan Cox
cd2b3416c3 Correct the method of waking the page daemon when the number of allocated
pv entries surpasses the high water mark.  The problem was that the page
daemon would only be awakened the first time that the high water mark was
surpassed.  (The variable "pagedaemon_waken" is a non-working vestige of
FreeBSD 4.x, in which it was external and reset by the page daemon whenever
it ran.  This reset allowed subsequent wakeups by the pv entry allocator.)
2009-06-13 18:35:29 +00:00
Rafal Jaworowski
661ee6eea5 Fix Book-E/MPC85XX build. Some prototypes were wrong and got revealed with
the recent kobj signature checking.
2009-06-13 08:57:04 +00:00
Rafal Jaworowski
2b7b2d7952 Discover and handle the number of E500 CPUs in run time. 2009-06-05 09:46:00 +00:00
Rafal Jaworowski
29794416db Fill PTEs covering kernel code and data.
Without this fix pte_vatopa() was not able to retrieve physical address of
data structures inside kernel, for example EFAULT was reported while acessing
/dev/kmem ('netstat -nr').

Submitted by:	Piotr Ziecik
Obtained from:	Semihalf
2009-06-05 09:09:46 +00:00
Nathan Whitehorn
9eb9db93da Introduce support for cpufreq on PowerPC with the dynamic frequency
switching capabilities of the MPC7447A and MPC7448.
2009-05-31 09:01:23 +00:00
Rafal Jaworowski
816192653f Set PG_WRITEABLE in Book-E pmap_enter[_locked] if it creates a mapping that
permits write access. This is similar to r192671.

Pointed out and reviewed by:	alc
2009-05-26 06:24:50 +00:00
Rafal Jaworowski
5a065915b0 Improve style(9), clean up. 2009-05-21 12:05:15 +00:00
Rafal Jaworowski
28bb01e5ba Initial support for SMP on PowerPC MPC85xx.
Tested with Freescale dual-core MPC8572DS development system.

Obtained from:	Freescale, Semihalf
2009-05-21 11:43:37 +00:00
Marcel Moolenaar
dbb95048da Add cpu_flush_dcache() for use after non-DMA based I/O so that a
possible future I-cache coherency operation can succeed. On ARM
for example the L1 cache can be (is) virtually mapped, which
means that any I/O that uses temporary mappings will not see the
I-cache made coherent. On ia64 a similar behaviour has been
observed. By flushing the D-cache, execution of binaries backed
by md(4) and/or NFS work reliably.
For Book-E (powerpc), execution over NFS exhibits SIGILL once in
a while as well, though cpu_flush_dcache() hasn't been implemented
yet.

Doing an explicit D-cache flush as part of the non-DMA based I/O
read operation eliminates the need to do it as part of the
I-cache coherency operation itself and as such avoids pessimizing
the DMA-based I/O read operations for which D-cache are already
flushed/invalidated. It also allows future optimizations whereby
the bcopy() followed by the D-cache flush can be integrated in a
single operation, which could be implemented using on-chips DMA
engines, by-passing the D-cache altogether.
2009-05-18 18:37:18 +00:00
Rafal Jaworowski
7ad9c533ef PowerPC common SMP startup and time base rework.
- make mftb() shared, rewrite in C, provide complementary mttb()
- adjust SMP startup per the above, additional comments, minor naming
  changes
- eliminate redundant TB defines, other minor cosmetics

Reviewed by:	marcel, nwhitehorn
Obtained from:	Freescale, Semihalf
2009-05-14 16:48:25 +00:00
Nathan Whitehorn
b40ce02a2f Factor out platform dependent things unrelated to device drivers into a
new platform module. These are probed in early boot, and have the
responsibility of determining the layout of physical memory, determining
the CPU timebase frequency, and handling the zoo of SMP mechanisms
found on PowerPC.

Reviewed by:	marcel, raj
Book-E parts by: raj
2009-05-14 00:34:26 +00:00
Marcel Moolenaar
d6a8fa0577 Remove PTE_ISFAKE. While here remove code
between "#if 0" and "#endif".
2009-04-24 02:53:38 +00:00
Rafal Jaworowski
d701728e3d Minor style consistency fix. 2009-04-22 13:18:04 +00:00
Rafal Jaworowski
7b6f38c007 Provide cpu_throw() for Book-E. Adjust cpu_switch() towards ULE support.
Obtained from:	Freescale, Semihalf
2009-04-22 13:13:34 +00:00
Rafal Jaworowski
b9b8eb777f Centralize setting HID0/1 for E500. Rename HID defines which are specific
to E500 rather than shared within Book-E family.

Obtained from:	Freescale, Semihalf
2009-04-22 13:11:38 +00:00
Marcel Moolenaar
2cf3f80c1b o Properly set ksym_start & ksym_end when options DDB is set.
Include opt_ddb.h for that. Now you can actually boot with
   -d and set breakpoints using function names.
o  Make sure to include opt_msgbuf.h.
o  Carve out the first 1MB of physical memory. The MPC85xx has
   DMA problems with addresses below 1MB. Ideally busdma knows
   how to avoid allocating below 1MB for MPC85xx, but that
   requires a bit more work. For now, ignore the 1MB of DRAM.
2009-04-21 17:04:01 +00:00
Marcel Moolenaar
48d6f243a6 Implement kernel core dump support for Book-E processors.
Both raw physical memory dumps and virtual minidumps are
supported. The default being minidumps.

Obtained from:	Juniper Networks
2009-04-04 22:01:43 +00:00
Nathan Whitehorn
1c96bdd146 Add support for 64-bit PowerPC CPUs operating in the 64-bit bridge mode
provided, for example, on the PowerPC 970 (G5), as well as on related CPUs
like the POWER3 and POWER4.

This also adds support for various built-in hardware found on Apple G5
hardware (e.g. the IBM CPC925 northbridge).

Reviewed by:    grehan
2009-04-04 00:22:44 +00:00
Ed Schouten
802cb57e34 Add memmove() to the kernel, making the kernel compile with Clang.
When copying big structures, LLVM generates calls to memmove(), because
it may not be able to figure out whether structures overlap. This caused
linker errors to occur. memmove() is now implemented using bcopy().
Ideally it would be the other way around, but that can be solved in the
future. On ARM we don't do add anything, because it already has
memmove().

Discussed on:	arch@
Reviewed by:	rdivacky
2009-02-28 16:21:25 +00:00
Rafal Jaworowski
0834dc77ba Prefer register usage style to be more consistent with the rest of the
trap_subr.S code.
2009-02-27 12:18:17 +00:00
Rafal Jaworowski
0a35b40f8d Make Book-E debug register state part of the PCB context.
Previously, DBCR0 flags were set "globally", but this leads to problems
because Book-E fine grained debug settings work only in conjuction with the
debug master enable bit in MSR: in scenarios when the DBCR0 was set with
intention to debug one process, but another one with MSR[DE] set got
scheduled, the latter would immediately cause debug exceptions to occur upon
execution of its own code instructions (and not the one intended for
debugging).

To avoid such problems and properly handle debugging context, DBCR0 state
should be managed individually per process.

Submitted by:	Grzegorz Bernacki gjb ! semihalf dot com
Reviewed by:	marcel
2009-02-27 12:08:24 +00:00
Rafal Jaworowski
3cb2642f86 Clean up BookE low-level exceptions code.
Improve comments, fix style(9) and typos, unify separators.

Obtained from:	Freescale, Semihalf
2009-01-13 16:19:58 +00:00
Rafal Jaworowski
0f31d4ea25 Clean up BookE pmap.
Improve comments, eliminate redundant debug output, fix style(9) and other
minor tweaks for code readability.

Obtained from:	Freescale, Semihalf
2009-01-13 16:15:49 +00:00
Rafal Jaworowski
b2b734e771 Rework BookE pmap towards multi-core support.
o Eliminate tlb0[] (a s/w copy of TLB0)
  - The table contents cannot be maintained reliably in multiple MMU
    environments, where asynchronous events (invalidations from other cores)
    can change our local TLB0 contents underneath.
  - Simplify and optimize TLB flushing: system wide invalidations are
    performed using tlbivax instruction (propagates to other cores), for
    local MMU invalidations a new optimized routine (assembly) is introduced.

o Improve and simplify TID allocation and management.
  - Let each core keep track of its TID allocations.
  - Simplify TID recycling, eliminate dead code.
  - Drop the now unused powerpc/booke/support.S file.

o Improve page tables management logic.

o Simplify TLB1 manipulation routines.

o Other improvements and polishing.

Obtained from:	Freescale, Semihalf
2009-01-13 15:41:58 +00:00
Nathan Whitehorn
91416fb268 Modularize the Open Firmware client interface to allow run-time switching
of OFW access semantics, in order to allow future support for real-mode
OF access and flattened device frees. OF client interface modules are
implemented using KOBJ, in a similar way to the PPC PMAP modules.

Because we need Open Firmware to be available before mutexes can be used on
sparc64, changes are also included to allow KOBJ to be used very early in
the boot process by only using the mutex once we know it has been initialized.

Reviewed by:    marius, grehan
2008-12-20 00:33:10 +00:00
Rafal Jaworowski
f01415c34b Minor spelling fix in E500 locore. 2008-12-18 18:28:03 +00:00
Rafal Jaworowski
ece0de20a8 Fix E500 cache invalidation routines.
When invalidating the i/d-cache we need to wait until the core complex is
really finished with the operation.

Obtained from:	Semihalf
2008-12-17 15:54:25 +00:00
Rafal Jaworowski
fdd28cb882 Rework E500 locore.
- split bootstrap code into more modular routines, which will also be used for
  the non-booting cores
- clean up registers usage
- improve comments to better reflect reality
- eliminate dead or redundant code
- other minor fixes

This refactoring is a preliminary step before importing dual-core (MPC8572)
support.

Obtained from:	Freescale, Semihalf
2008-12-17 15:44:34 +00:00
Rafal Jaworowski
51d059c6de Minor clean up of BookE/MPC85XX: iprove naming and style(9). 2008-12-17 15:31:15 +00:00
Marcel Moolenaar
dc9d16844c Add support for kernel profiling for both AIM and BookE.
Obtained from:	Juniper Networks, Inc (BookE support).
2008-10-27 02:36:03 +00:00
Marcel Moolenaar
1463295f81 Remove unused declarations (interrupt_vector_{base|top}). 2008-10-27 01:51:30 +00:00
Marcel Moolenaar
9b85e175fd In mmu_booke_mapdev(), handle mappings that cannot be represented
by a single TLB entry. The boot ROM on the MPC85555CDS is 8MB, for
example, and in order to map that we need 2 4MB TLB entries.
2008-10-25 03:36:21 +00:00
Nathan Whitehorn
4c01c0b965 Allow the cacheline size on PowerPC to be set at runtime. This is essential for
supporting 64-bit CPUs, which often have 128-byte cache lines instead of the
standard 32.
2008-09-24 00:28:46 +00:00
Rafal Jaworowski
c8e780791e Move initialization of tlb0, ptbl_bufs and kernel_pdir regions after we are
100% sure that TLB1 mapping covers for them; previously we could lock the CPU
with an untranslated references.

Obtained from:	Semihalf
2008-08-28 07:38:08 +00:00
Rafal Jaworowski
959aea56c1 Improve kernel stack handling on e500.
- Allocate thread0.td_kstack in pmap_bootstrap(), provide guard page
- Switch to thread0.td_kstack as soon as possible i.e. right after return
  from e500_init() and before mi_startup() happens
- Clean up temp stack area
- Other minor cosmetics in machdep.c

Obtained from:	Semihalf
2008-08-26 17:07:37 +00:00
Wojciech A. Koszek
b390a5baf8 Fix a typo in a comment. 2008-06-11 20:05:25 +00:00
Alan Cox
d1fdd63483 The VM system no longer uses setPQL2(). Remove it and its helpers. 2008-05-23 04:03:54 +00:00
Alan Cox
1ec1304bdb Retire pmap_addr_hint(). It is no longer used. 2008-05-18 04:16:57 +00:00
Marcel Moolenaar
12640815f8 MFp4: SMP support 2008-04-27 22:33:43 +00:00
Marcel Moolenaar
b66bd41daa Eliminate track_modified_needed(), better known as pmap_track_modified()
on other platforms.  We no longer need it because we do not create managed
mappings within the clean submap.

Pointed out by: alc
2008-04-27 21:04:54 +00:00
Rafal Jaworowski
a1cd472a40 Introduce a dedicated file for MPC85xx-specific routines. Move cpu_reset()
there, as it's not relevant to Book-E specification, but is an implementation
detail, directly dependent on the given SoC version.
2008-04-26 17:57:29 +00:00
Jeff Roberson
6c47aaae12 - Add an integer argument to idle to indicate how likely we are to wake
from idle over the next tick.
 - Add a new MD routine, cpu_wake_idle() to wakeup idle threads who are
   suspended in cpu specific states.  This function can fail and cause the
   scheduler to fall back to another mechanism (ipi).
 - Implement support for mwait in cpu_idle() on i386/amd64 machines that
   support it.  mwait is a higher performance way to synchronize cpus
   as compared to hlt & ipis.
 - Allow selecting the idle routine by name via sysctl machdep.idle.  This
   replaces machdep.cpu_idle_hlt.  Only idle routines supported by the
   current machine are permitted.

Sponsored by:	Nokia
2008-04-25 05:18:50 +00:00
Poul-Henning Kamp
9b4a8ab7ba Now that all platforms use genclock, shuffle things around slightly
for better structure.

Much of this is related to <sys/clock.h>, which should really have
been called <sys/calendar.h>, but unless and until we need the name,
the repocopy can wait.

In general the kernel does not know about minutes, hours, days,
timezones, daylight savings time, leap-years and such.  All that
is theoretically a matter for userland only.

Parts of kernel code does however care: badly designed filesystems
store timestamps in local time and RTC chips almost universally
track time in a YY-MM-DD HH:MM:SS format, and sometimes in local
timezone instead of UTC.  For this we have <sys/clock.h>

<sys/time.h> on the other hand, deals with time_t, timeval, timespec
and so on.  These know only seconds and fractions thereof.

Move inittodr() and resettodr() prototypes to <sys/time.h>.
Retain the names as it is one of the few surviving PDP/VAX references.

Move startrtclock() to <machine/clock.h> on relevant platforms, it
is a MD call between machdep.c/clock.c.  Remove references to it
elsewhere.

Remove a lot of unnecessary <sys/clock.h> includes.

Move the machdep.disable_rtc_set sysctl to subr_rtc.c where it belongs.
XXX: should be kern.disable_rtc_set really, it's not MD.
2008-04-22 19:38:30 +00:00
Poul-Henning Kamp
943f6ddd39 Get rid of an empty RTC implementation and hook up genclock instead. 2008-04-13 21:20:26 +00:00
Poul-Henning Kamp
e465985885 The "free-lance" timer in the i8254 is only used for the speaker
these days, so de-generalize the acquire_timer/release_timer api
to just deal with speakers.

The new (optional) MD functions are:
	timer_spkr_acquire()
	timer_spkr_release()
and
	timer_spkr_setfreq()

the last of which configures the timer to generate a tone of a given
frequency, in Hz instead of 1/1193182th of seconds.

Drop entirely timer2 on pc98, it is not used anywhere at all.

Move sysbeep() to kern/tty_cons.c and use the timer_spkr*() if
they exist, and do nothing otherwise.

Remove prototypes and empty acquire-/release-timer() and sysbeep()
functions from the non-beeping archs.

This eliminate the need for the speaker driver to know about
i8254frequency at all.  In theory this makes the speaker driver MI,
contingent on the timer_spkr_*() functions existing but the driver
does not know this yet and still attaches to the ISA bus.

Syscons is more tricky, in one function, sc_tone(), it knows the hz
and things are just fine.

In the other function, sc_bell() it seems to get the period from
the KDMKTONE ioctl in terms if 1/1193182th second, so we hardcode
the 1193182 and leave it at that.  It's probably not important.

Change a few other sysbeep() uses which obviously knew that the
argument was in terms of i8254 frequency, and leave alone those
that look like people thought sysbeep() took frequency in hertz.

This eliminates the knowledge of i8254_freq from all but the actual
clock.c code and the prof_machdep.c on amd64 and i386, where I think
it would be smart to ask for help from the timecounters anyway [TBD].
2008-03-26 20:09:21 +00:00
Robert Watson
237fdd787b In keeping with style(9)'s recommendations on macros, use a ';'
after each SYSINIT() macro invocation.  This makes a number of
lightweight C parsers much happier with the FreeBSD kernel
source, including cflow's prcc and lxr.

MFC after:	1 month
Discussed with:	imp, rink
2008-03-16 10:58:09 +00:00
Rafal Jaworowski
ecb1ab1761 Obtain TSEC h/w address from the parent bus (OCP) and not rely blindly on what
might be currently programmed into the registers.

Underlying firmware (U-Boot) would typically program MAC address into the
first unit only, and others are left uninitialized. It is now possible to
retrieve and program MAC address for all units properly, provided they were
passed on in the bootinfo metadata.

Reviewed by:	imp, marcel
Approved by:	cognet (mentor)
2008-03-12 16:32:08 +00:00
Jeff Roberson
6617724c5f Remove kernel support for M:N threading.
While the KSE project was quite successful in bringing threading to
FreeBSD, the M:N approach taken by the kse library was never developed
to its full potential.  Backwards compatibility will be provided via
libmap.conf for dynamically linked binaries and static binaries will
be broken.
2008-03-12 10:12:01 +00:00
Marcel Moolenaar
704bb9b36f Enable the D-cache and I-cache when not already enabled.
It so happens that U-Boot disables the D-cache when booting
an ELF image, so this change makes sure we run with the
D-cache enabled from now on. It shows too...

While here, remove the duplicate definition of the hw.model
sysctl.
2008-03-08 05:36:25 +00:00
Rafal Jaworowski
6b7ba54456 Initial support for Freescale PowerQUICC III MPC85xx system-on-chip family.
The PQ3 is a high performance integrated communications processing system
based on the e500 core, which is an embedded RISC processor that implements
the 32-bit Book E definition of the PowerPC architecture. For details refer
to: http://www.freescale.com/webapp/sps/site/prod_summary.jsp?code=MPC8555E

This port was tested and successfully run on the following members of the PQ3
family: MPC8533, MPC8541, MPC8548, MPC8555.

The following major integrated peripherals are supported:

  * On-chip peripherals bus
  * OpenPIC interrupt controller
  * UART
  * Ethernet (TSEC)
  * Host/PCI bridge
  * QUICC engine (SCC functionality)

This commit brings the main functionality and will be followed by individual
drivers that are logically separate from this base.

Approved by:	cognet (mentor)
Obtained from:	Juniper, Semihalf
MFp4:		e500
2008-03-03 17:17:00 +00:00