29 Commits

Author SHA1 Message Date
Alexander Motin
a157e42516 Refactor timer management code with priority to one-shot operation mode.
The main goal of this is to generate timer interrupts only when there is
some work to do. When CPU is busy interrupts are generating at full rate
of hz + stathz to fullfill scheduler and timekeeping requirements. But
when CPU is idle, only minimum set of interrupts (down to 8 interrupts per
second per CPU now), needed to handle scheduled callouts is executed.
This allows significantly increase idle CPU sleep time, increasing effect
of static power-saving technologies. Also it should reduce host CPU load
on virtualized systems, when guest system is idle.

There is set of tunables, also available as writable sysctls, allowing to
control wanted event timer subsystem behavior:
  kern.eventtimer.timer - allows to choose event timer hardware to use.
On x86 there is up to 4 different kinds of timers. Depending on whether
chosen timer is per-CPU, behavior of other options slightly differs.
  kern.eventtimer.periodic - allows to choose periodic and one-shot
operation mode. In periodic mode, current timer hardware taken as the only
source of time for time events. This mode is quite alike to previous kernel
behavior. One-shot mode instead uses currently selected time counter
hardware to schedule all needed events one by one and program timer to
generate interrupt exactly in specified time. Default value depends of
chosen timer capabilities, but one-shot mode is preferred, until other is
forced by user or hardware.
  kern.eventtimer.singlemul - in periodic mode specifies how much times
higher timer frequency should be, to not strictly alias hardclock() and
statclock() events. Default values are 2 and 4, but could be reduced to 1
if extra interrupts are unwanted.
  kern.eventtimer.idletick - makes each CPU to receive every timer interrupt
independently of whether they busy or not. By default this options is
disabled. If chosen timer is per-CPU and runs in periodic mode, this option
has no effect - all interrupts are generating.

As soon as this patch modifies cpu_idle() on some platforms, I have also
refactored one on x86. Now it makes use of MONITOR/MWAIT instrunctions
(if supported) under high sleep/wakeup rate, as fast alternative to other
methods. It allows SMP scheduler to wake up sleeping CPUs much faster
without using IPI, significantly increasing performance on some highly
task-switching loads.

Tested by:	many (on i386, amd64, sparc64 and powerc)
H/W donated by:	Gheorghe Ardelean
Sponsored by:	iXsystems, Inc.
2010-09-13 07:25:35 +00:00
Rafal Jaworowski
4f124b977c Eliminate FDT_IMMR_VA define.
This removes platform dependencies from <machine>/fdt.h for the benfit of
portability.
2010-07-19 18:47:18 +00:00
Nathan Whitehorn
c3e289e1ce MFppc64:
Kernel sources for 64-bit PowerPC, along with build-system changes to keep
32-bit kernels compiling (build system changes for 64-bit kernels are
coming later). Existing 32-bit PowerPC kernel configurations must be
updated after this change to specify their architecture.
2010-07-13 05:32:19 +00:00
Nathan Whitehorn
cc81c44dd8 Unify ABI-related bits of the Book-E and AIM machdep routines
(exec_setregs, etc.) in order to simplify the addition of 64-bit support,
and possible future extension of the Book-E code to handle hard floating
point and Altivec.

MFC after:	1 month
2010-07-12 16:08:07 +00:00
Rafal Jaworowski
d1d3233ebd Convert Freescale PowerPC platforms to FDT convention.
The following systems are affected:

  - MPC8555CDS
  - MPC8572DS

This overhaul covers the following major changes:

  - All integrated peripherals drivers for Freescale MPC85XX SoC, which are
    currently in the FreeBSD source tree are reworked and adjusted so they
    derive config data out of the device tree blob (instead of hard coded /
    tabelarized values).

  - This includes: LBC, PCI / PCI-Express, I2C, DS1553, OpenPIC, TSEC, SEC,
    QUICC, UART, CFI.

  - Thanks to the common FDT infrastrucutre (fdtbus, simplebus) we retire
    ocpbus(4) driver, which was based on hard-coded config data.

Note that world for these platforms has to be built WITH_FDT.

Reviewed by:	imp
Sponsored by:	The FreeBSD Foundation
2010-07-11 21:08:29 +00:00
John Baldwin
fc0de8f0b6 Move prototypes for kern_sigtimedwait() and kern_sigprocmask() to
<sys/syscallsubr.h> where all other kern_<syscall> prototypes live.
2010-06-30 18:03:42 +00:00
Nathan Whitehorn
a107d8aac9 Change the arguments of exec_setregs() so that it receives a pointer
to the image_params struct instead of several members of that struct
individually. This makes it easier to expand its arguments in the future
without touching all platforms.

Reviewed by:	jhb
2010-03-25 14:24:00 +00:00
Marcel Moolenaar
33d56ab39b Enable power management for E500 cores. Use "doze" for now to make
sure the caches remain coherent. For single-core configurations and
with busdma changes we could eventually switch to "nap" and force
a D-cache invalidation as part of the DMA completion. To this end,
clear PSL_WE until after we handled the decrementer or external
interrupt as it tells us whether we just woke up or not.
2010-03-23 19:30:56 +00:00
Rafal Jaworowski
e95b7f61b9 Call the proper linkup routine in PowerPC Book-E machdep.
Submitted by:	attilio
MFC after:	1 week
2010-02-15 14:38:30 +00:00
Nathan Whitehorn
227f66048e Add a CPU features framework on PowerPC and simplify CPU setup a little
more. This provides three new sysctls to user space:
hw.cpu_features - A bitmask of available CPU features
hw.floatingpoint - Whether or not there is hardware FP support
hw.altivec - Whether or not Altivec is available

PR:		powerpc/139154
MFC after:	10 days
2009-11-28 17:33:19 +00:00
Konstantin Belousov
d6e029adbe In r197963, a race with thread being selected for signal delivery
while in kernel mode, and later changing signal mask to block the
signal, was fixed for sigprocmask(2) and ptread_exit(3). The same race
exists for sigreturn(2), setcontext(2) and swapcontext(2) syscalls.

Use kern_sigprocmask() instead of direct manipulation of td_sigmask to
reschedule newly blocked signals, closing the race.

Reviewed by:	davidxu
Tested by:	pho
MFC after:	1 month
2009-10-27 10:47:58 +00:00
Nathan Whitehorn
9eb9db93da Introduce support for cpufreq on PowerPC with the dynamic frequency
switching capabilities of the MPC7447A and MPC7448.
2009-05-31 09:01:23 +00:00
Rafal Jaworowski
28bb01e5ba Initial support for SMP on PowerPC MPC85xx.
Tested with Freescale dual-core MPC8572DS development system.

Obtained from:	Freescale, Semihalf
2009-05-21 11:43:37 +00:00
Marcel Moolenaar
dbb95048da Add cpu_flush_dcache() for use after non-DMA based I/O so that a
possible future I-cache coherency operation can succeed. On ARM
for example the L1 cache can be (is) virtually mapped, which
means that any I/O that uses temporary mappings will not see the
I-cache made coherent. On ia64 a similar behaviour has been
observed. By flushing the D-cache, execution of binaries backed
by md(4) and/or NFS work reliably.
For Book-E (powerpc), execution over NFS exhibits SIGILL once in
a while as well, though cpu_flush_dcache() hasn't been implemented
yet.

Doing an explicit D-cache flush as part of the non-DMA based I/O
read operation eliminates the need to do it as part of the
I-cache coherency operation itself and as such avoids pessimizing
the DMA-based I/O read operations for which D-cache are already
flushed/invalidated. It also allows future optimizations whereby
the bcopy() followed by the D-cache flush can be integrated in a
single operation, which could be implemented using on-chips DMA
engines, by-passing the D-cache altogether.
2009-05-18 18:37:18 +00:00
Nathan Whitehorn
b40ce02a2f Factor out platform dependent things unrelated to device drivers into a
new platform module. These are probed in early boot, and have the
responsibility of determining the layout of physical memory, determining
the CPU timebase frequency, and handling the zoo of SMP mechanisms
found on PowerPC.

Reviewed by:	marcel, raj
Book-E parts by: raj
2009-05-14 00:34:26 +00:00
Marcel Moolenaar
2cf3f80c1b o Properly set ksym_start & ksym_end when options DDB is set.
Include opt_ddb.h for that. Now you can actually boot with
   -d and set breakpoints using function names.
o  Make sure to include opt_msgbuf.h.
o  Carve out the first 1MB of physical memory. The MPC85xx has
   DMA problems with addresses below 1MB. Ideally busdma knows
   how to avoid allocating below 1MB for MPC85xx, but that
   requires a bit more work. For now, ignore the 1MB of DRAM.
2009-04-21 17:04:01 +00:00
Nathan Whitehorn
1c96bdd146 Add support for 64-bit PowerPC CPUs operating in the 64-bit bridge mode
provided, for example, on the PowerPC 970 (G5), as well as on related CPUs
like the POWER3 and POWER4.

This also adds support for various built-in hardware found on Apple G5
hardware (e.g. the IBM CPC925 northbridge).

Reviewed by:    grehan
2009-04-04 00:22:44 +00:00
Rafal Jaworowski
0a35b40f8d Make Book-E debug register state part of the PCB context.
Previously, DBCR0 flags were set "globally", but this leads to problems
because Book-E fine grained debug settings work only in conjuction with the
debug master enable bit in MSR: in scenarios when the DBCR0 was set with
intention to debug one process, but another one with MSR[DE] set got
scheduled, the latter would immediately cause debug exceptions to occur upon
execution of its own code instructions (and not the one intended for
debugging).

To avoid such problems and properly handle debugging context, DBCR0 state
should be managed individually per process.

Submitted by:	Grzegorz Bernacki gjb ! semihalf dot com
Reviewed by:	marcel
2009-02-27 12:08:24 +00:00
Rafal Jaworowski
b2b734e771 Rework BookE pmap towards multi-core support.
o Eliminate tlb0[] (a s/w copy of TLB0)
  - The table contents cannot be maintained reliably in multiple MMU
    environments, where asynchronous events (invalidations from other cores)
    can change our local TLB0 contents underneath.
  - Simplify and optimize TLB flushing: system wide invalidations are
    performed using tlbivax instruction (propagates to other cores), for
    local MMU invalidations a new optimized routine (assembly) is introduced.

o Improve and simplify TID allocation and management.
  - Let each core keep track of its TID allocations.
  - Simplify TID recycling, eliminate dead code.
  - Drop the now unused powerpc/booke/support.S file.

o Improve page tables management logic.

o Simplify TLB1 manipulation routines.

o Other improvements and polishing.

Obtained from:	Freescale, Semihalf
2009-01-13 15:41:58 +00:00
Nathan Whitehorn
91416fb268 Modularize the Open Firmware client interface to allow run-time switching
of OFW access semantics, in order to allow future support for real-mode
OF access and flattened device frees. OF client interface modules are
implemented using KOBJ, in a similar way to the PPC PMAP modules.

Because we need Open Firmware to be available before mutexes can be used on
sparc64, changes are also included to allow KOBJ to be used very early in
the boot process by only using the mutex once we know it has been initialized.

Reviewed by:    marius, grehan
2008-12-20 00:33:10 +00:00
Rafal Jaworowski
51d059c6de Minor clean up of BookE/MPC85XX: iprove naming and style(9). 2008-12-17 15:31:15 +00:00
Nathan Whitehorn
4c01c0b965 Allow the cacheline size on PowerPC to be set at runtime. This is essential for
supporting 64-bit CPUs, which often have 128-byte cache lines instead of the
standard 32.
2008-09-24 00:28:46 +00:00
Rafal Jaworowski
959aea56c1 Improve kernel stack handling on e500.
- Allocate thread0.td_kstack in pmap_bootstrap(), provide guard page
- Switch to thread0.td_kstack as soon as possible i.e. right after return
  from e500_init() and before mi_startup() happens
- Clean up temp stack area
- Other minor cosmetics in machdep.c

Obtained from:	Semihalf
2008-08-26 17:07:37 +00:00
Alan Cox
d1fdd63483 The VM system no longer uses setPQL2(). Remove it and its helpers. 2008-05-23 04:03:54 +00:00
Jeff Roberson
6c47aaae12 - Add an integer argument to idle to indicate how likely we are to wake
from idle over the next tick.
 - Add a new MD routine, cpu_wake_idle() to wakeup idle threads who are
   suspended in cpu specific states.  This function can fail and cause the
   scheduler to fall back to another mechanism (ipi).
 - Implement support for mwait in cpu_idle() on i386/amd64 machines that
   support it.  mwait is a higher performance way to synchronize cpus
   as compared to hlt & ipis.
 - Allow selecting the idle routine by name via sysctl machdep.idle.  This
   replaces machdep.cpu_idle_hlt.  Only idle routines supported by the
   current machine are permitted.

Sponsored by:	Nokia
2008-04-25 05:18:50 +00:00
Robert Watson
237fdd787b In keeping with style(9)'s recommendations on macros, use a ';'
after each SYSINIT() macro invocation.  This makes a number of
lightweight C parsers much happier with the FreeBSD kernel
source, including cflow's prcc and lxr.

MFC after:	1 month
Discussed with:	imp, rink
2008-03-16 10:58:09 +00:00
Rafal Jaworowski
ecb1ab1761 Obtain TSEC h/w address from the parent bus (OCP) and not rely blindly on what
might be currently programmed into the registers.

Underlying firmware (U-Boot) would typically program MAC address into the
first unit only, and others are left uninitialized. It is now possible to
retrieve and program MAC address for all units properly, provided they were
passed on in the bootinfo metadata.

Reviewed by:	imp, marcel
Approved by:	cognet (mentor)
2008-03-12 16:32:08 +00:00
Marcel Moolenaar
704bb9b36f Enable the D-cache and I-cache when not already enabled.
It so happens that U-Boot disables the D-cache when booting
an ELF image, so this change makes sure we run with the
D-cache enabled from now on. It shows too...

While here, remove the duplicate definition of the hw.model
sysctl.
2008-03-08 05:36:25 +00:00
Rafal Jaworowski
6b7ba54456 Initial support for Freescale PowerQUICC III MPC85xx system-on-chip family.
The PQ3 is a high performance integrated communications processing system
based on the e500 core, which is an embedded RISC processor that implements
the 32-bit Book E definition of the PowerPC architecture. For details refer
to: http://www.freescale.com/webapp/sps/site/prod_summary.jsp?code=MPC8555E

This port was tested and successfully run on the following members of the PQ3
family: MPC8533, MPC8541, MPC8548, MPC8555.

The following major integrated peripherals are supported:

  * On-chip peripherals bus
  * OpenPIC interrupt controller
  * UART
  * Ethernet (TSEC)
  * Host/PCI bridge
  * QUICC engine (SCC functionality)

This commit brings the main functionality and will be followed by individual
drivers that are logically separate from this base.

Approved by:	cognet (mentor)
Obtained from:	Juniper, Semihalf
MFp4:		e500
2008-03-03 17:17:00 +00:00