Here, "suitably endowed" means that the System Control Coprocessor
(#15) has Performance Monitoring Registers, including a CCNT (Cycle
Count) register.
The CCNT register is used in a way similar to the TSC register in
x86 processors by the get_cyclecount(9) function. The entropy-harvesting
thread is a heavy user of this function, and will benefit from not
having to call binuptime(9) instead.
One problem with the CCNT register is that it is 32-bit only, so
the upper 32-bits of the returned number are always 0. The entropy
harvester does not care, but in case any one else does, follow-up
work may include an interrup trap to increment an upper-32-bit
counter on CCNT overflow.
Another problem is that the CCNT register is not readable in user-mode
code; in can be made readable by userland, but then it is also
writable, and so is a good chunk of the PMU system. For that reason,
the CCNT is not enabled for user-mode access in this commit.
Like the x86, there is one CCNT per core, so they don't all run in
perfect sync.
Reviewed by: ian@ (an earlier version)
Tested by: ian@ (same earlier version)
Committed from: WANDBOARD-QUAD
that and the need to be in a critical section when switching to idleclock
mode for event timers, use spinlock_enter()/exit() to achieve both needs.
The ARM WFI (wait for interrupt) instruction blocks until an interrupt is
asserted, and it will unblock even if interrupts are masked, and it will
unblock immediately if an interrupt is already pending. It is necessary
to execute it with interrupts disabled, otherwise the interrupt that
should unblock it may occur and be serviced just prior to executing the
instruction. At that point the system is inappropriately asleep until
the next timer tick or some other random interrupt happens.
In general, interrupts need to be disabled continuously from the time the
decision is made that there is no work to be done and sleeping is needed
until actually going to sleep, to avoid a race where handling a new
interrupt changes the basis for deciding there is no work to be done.
Submitted by: hps@ (in slightly different form)
On modern ARM SoCs the L2 cache controller sits between the CPU and the
AXI bus, and most on-chip memory-mapped devices are on the AXI bus. We
map the device registers using the 'Device' memory attribute, which means
the memory is not cached, but writes to it are buffered. Ensuring that a
write has made it all the way to a device may require that the L2
controller take some action.
There is currently only one implementation of the new function, for the
PL310 cache controller. It invokes a function that the controller
manual calls "cache sync" but it actually has nothing to do with cache at
all, it triggers a drain of all pending store buffer writes and it blocks
until they complete.
The sheeva and xscale L2 controllers (which predate the concept of Device
memory) don't seem to have a corresponding function. It appears that the
standard armv5 drain_writebuf function includes draining all the way
through the L2 controller.
On armv4 these are defined as synonyms right now, but it's a bit ambiguous
what NOCACHE means (is buffering/write-combining also enabled or not?); this
is a first step towards replacing PTE_NOCACHE with a less ambiguous name.
Remove some other ifdefs that came in with a copy/paste that mean basically
"if this processor supports multicore stuff", because if you're starting up
an AP core... it does.
case where the controller is already enabled.
Some of the pl310 configuration registers cannot be changed while the
controller is active, so if there is any platform-specific init to be done
it must happen before enabling the controller.
The controller should not be enabled upon entry to the kernel, but u-boot
has recently developed the bad habit of leaving caches enabled when
launching the kernel, and since we have no control over that source code
we have to do our best to cope with it. The PL310 manual doesn't document
a safe sequence for disabling the controller, but the sequence used here
(force write-through mode and disable linefill allocations, then clean and
invalidate the current contents before disabling the hardware) appears to
be sound both by analysis and empirical testing.
These changes were developed and tested in collaboration with
Svatopluk Kraus <onwahe@gmail.com>.
Reviewed by: cognet@
#NO_UNIVERSE. Many of these config files are important examples, but
add little to no regresive value to the intended purpose of
UNIVERSE. We now build over 120 kernels during universe. There's
really little to no value to this over building say 60 or even 30 of
them (either is still a way too big number). This is especially true
for kernels that are nothing more than including a common base and
adding a static DTB file. Start by pruning 1/3 of the arm kernels that
add little regresion value.
Flushing the caches is required before doing a panic dump, but ARM
doesn't provide a flavor of flush that gets broadcast to other cores.
However, all cores except one are stopped before doing a dump, so this
works around the lack of a global flush/invalidate by doing it locally
on each CPU as part of stopping.
Discussed with: cognet@
This was added ca. 2004 for the purpose of ensuring the caches were in the
right state after the debugger set a breakpoint. kdb_cpu_sync_icache()
was added in 2007 to handle that situation, and now the wbinv_all is
actually harmful because the operation isn't broadcast to other cores.
using armv7_idcache_wbinv_all, because wbinv_all doesn't broadcast the
operation to other cores. In elf_cpu_load_file() use icache_sync_all()
and explain why it's needed (and why other sync operations aren't).
As part of doing this, all callers of cpu_icache_sync_all() were
inspected to ensure they weren't relying on the old side effect of
doing a wbinv_all along with the icache work.
the cpufreq code. Replace its use with smp_started. There's at least
one userland tool that still looks at the kern.smp.active sysctl, so
preserve it but point it to smp_started as well.
Discussed with: peter, jhb
MFC after: 3 days
Obtained from: Netflix
define a few imx_ccm_foo() functions that are implemented by the imx51 or
imx6 ccm code. Of course, the imx6 ccm code is still more a wish than
reality, so for now its implementations just return hard-coded numbers.
- These were needed on armv4/5 (VIVT cache), not needed on armv6.
- The wbinv_all call can't be used on SMP systems; cache operations by
set/way are not broadcast to other cores.
- The TLB maintenance operations needed for pmap_growkernel() happen in
pmap_grow_l2_bucket(), so there's no need to flush all TLB entries at
the end.
- There may not be any need for the TLB flush at the beginning of
pmap_release(), but it's left in for now pending more investigation.
Pointed out by: Svatopluk Kraus <onwahe@gmail.com>
Discussed with: cognet@
While it is the recommended initialization procedure, it hangs on the reset
of the second GPIO module on pandaboard.
Removes the module reset for now as more investigation would be needed.
Reported by: jceel
These should have been part of r264129, they are part of the overall set
of changes that got several weeks of testing. I must have fumbled them
while merging various patchsets.
On AM335x each one of the four GPIO banks has two physical interrupt
lines, so we now allocate resources and setup our interrupt handler for
all the (8) available interrupts.
On OMAP3 and OMAP4 there is only one interrupt for each GPIO bank (6
banks, 6 interrupts), but there are two set of registers where the
first one is used to setup the delivery of interrupts to the MPU and
the second set, setup the delivery of interrupts to the DSP.
On AM335x, each set of registers controls each one of the interrupt
lines.
- Remove nonexistent registers for OMAP4 and AM335x, replace their use with
the correct ones for these SoCs.
- Remove stray whitespace.
Based on OMAP3, OMAP4 and AM335x TRMs.
Tested on Beaglebone-black.