case where the controller is already enabled.
Some of the pl310 configuration registers cannot be changed while the
controller is active, so if there is any platform-specific init to be done
it must happen before enabling the controller.
The controller should not be enabled upon entry to the kernel, but u-boot
has recently developed the bad habit of leaving caches enabled when
launching the kernel, and since we have no control over that source code
we have to do our best to cope with it. The PL310 manual doesn't document
a safe sequence for disabling the controller, but the sequence used here
(force write-through mode and disable linefill allocations, then clean and
invalidate the current contents before disabling the hardware) appears to
be sound both by analysis and empirical testing.
These changes were developed and tested in collaboration with
Svatopluk Kraus <onwahe@gmail.com>.
Reviewed by: cognet@
#NO_UNIVERSE. Many of these config files are important examples, but
add little to no regresive value to the intended purpose of
UNIVERSE. We now build over 120 kernels during universe. There's
really little to no value to this over building say 60 or even 30 of
them (either is still a way too big number). This is especially true
for kernels that are nothing more than including a common base and
adding a static DTB file. Start by pruning 1/3 of the arm kernels that
add little regresion value.
Flushing the caches is required before doing a panic dump, but ARM
doesn't provide a flavor of flush that gets broadcast to other cores.
However, all cores except one are stopped before doing a dump, so this
works around the lack of a global flush/invalidate by doing it locally
on each CPU as part of stopping.
Discussed with: cognet@
This was added ca. 2004 for the purpose of ensuring the caches were in the
right state after the debugger set a breakpoint. kdb_cpu_sync_icache()
was added in 2007 to handle that situation, and now the wbinv_all is
actually harmful because the operation isn't broadcast to other cores.
using armv7_idcache_wbinv_all, because wbinv_all doesn't broadcast the
operation to other cores. In elf_cpu_load_file() use icache_sync_all()
and explain why it's needed (and why other sync operations aren't).
As part of doing this, all callers of cpu_icache_sync_all() were
inspected to ensure they weren't relying on the old side effect of
doing a wbinv_all along with the icache work.
the cpufreq code. Replace its use with smp_started. There's at least
one userland tool that still looks at the kern.smp.active sysctl, so
preserve it but point it to smp_started as well.
Discussed with: peter, jhb
MFC after: 3 days
Obtained from: Netflix
define a few imx_ccm_foo() functions that are implemented by the imx51 or
imx6 ccm code. Of course, the imx6 ccm code is still more a wish than
reality, so for now its implementations just return hard-coded numbers.
- These were needed on armv4/5 (VIVT cache), not needed on armv6.
- The wbinv_all call can't be used on SMP systems; cache operations by
set/way are not broadcast to other cores.
- The TLB maintenance operations needed for pmap_growkernel() happen in
pmap_grow_l2_bucket(), so there's no need to flush all TLB entries at
the end.
- There may not be any need for the TLB flush at the beginning of
pmap_release(), but it's left in for now pending more investigation.
Pointed out by: Svatopluk Kraus <onwahe@gmail.com>
Discussed with: cognet@
While it is the recommended initialization procedure, it hangs on the reset
of the second GPIO module on pandaboard.
Removes the module reset for now as more investigation would be needed.
Reported by: jceel
These should have been part of r264129, they are part of the overall set
of changes that got several weeks of testing. I must have fumbled them
while merging various patchsets.
On AM335x each one of the four GPIO banks has two physical interrupt
lines, so we now allocate resources and setup our interrupt handler for
all the (8) available interrupts.
On OMAP3 and OMAP4 there is only one interrupt for each GPIO bank (6
banks, 6 interrupts), but there are two set of registers where the
first one is used to setup the delivery of interrupts to the MPU and
the second set, setup the delivery of interrupts to the DSP.
On AM335x, each set of registers controls each one of the interrupt
lines.
- Remove nonexistent registers for OMAP4 and AM335x, replace their use with
the correct ones for these SoCs.
- Remove stray whitespace.
Based on OMAP3, OMAP4 and AM335x TRMs.
Tested on Beaglebone-black.
enabled. Also switch IMX6 to use SCHED_ULE.
The now-unreferenced WANDBOARD.common config will be deleted after giving
folks who may be including it a heads-up to switch to IMX6.
- Add cpu_cpwait to comply with the convention.
- Add missing TLB invalidations, especially in pmap_kenter & pmap_kremove
with distinguishing between D and ID pages.
- Modify pmap init/bootstrap invalidations to ID, just to be safe.
- Fix TLB-inv and PTE_SYNC ordering.
This combines changes submitted by ian@, cognet@, and Wojciech Macek,
which have all been tested together as a unit.
Perform sychronization (by "isb" barrier) after TTB is set. This
is done to ensure that TLB invalidation always executes after
TTB modification and operates on valid CP15 data (per specification).
Submitted by: Wojciech Macek <wma@semihalf.com>
Reviewed by: ian@, cognet@
- Don't allow high-speed mode on OMAP4 due to hardware erratum.
- Check the proper bit in the status register when waiting for the
controller to come out of reset.
- Add handling for the "non-removable" fdt property by always returning
"card is present" status.
- Add the non-removable property for the MMC card on a Beaglebone Black.
- Add the non-removable property for Pandaboard as a workaround.
For Pandaboard the card detect pin is handled by the twl6030 fpga device
which gets an interrupt on pin change and then has to query the fpga
for the actual status. We don't have code to do that yet.
Submitted by: Svatopluk Kraus <onwahe@gmail.com>
errors.
Reset the GPIO module during the initialization. This is guaranteed to be
the same as a hardware reset. Tested on AM335x (BBB) and checked against
the omap3 and omap4 TRM.
Do a better job freeing resources when there are errors and on
ti_gpio_detach().
We don't know our ARM security state, so one of them will operate.
- Don't set frequency, since it's unpossible in non-secure state.
Only rely on DTS clock-frequency value or get clock from timer.
Discussed with: ian, cognet
This adds the concept of "operating points," combinations of frequency
and voltage at which the cpu is known to work correctly. Some day these
should come from FDT data, but for now the table is hard-coded.
This also allows tuning the min and max operating frequencies. The min
frequency is what the thermal management code will slow down to if the
core temperature gets too high. The max frequency is what gets used if
the temperature is okay.
Normally the max cannot be set higher than the value burned into the
ocotp fuses as the chip's rated max, but there is now a new sysctl+tunable
cpu_overclock_enable; when set to non-zero it allows raising the frequency
above the ocotp value: USE WITH CARE! (At least one of my imx6 boards
has a cpu whose ocotp values never got set correctly; they claim a max
of 792mhz, but the physical markings on the chip say it's good to 1ghz.)
Because all these values affect the entire SoC, there is a new sysctl
node, hw.imx6, where all these values live. The values that are currently
under dev.imx6_anatop.0 should probably move to hw.imx6 too, because
"anatop" doesn't even mean anything to me, let alone to an end user.
description was eaten by the dog (or an editor crash or something).
Add variable-frequency support to the arm mpcore eventtimer driver.
This allows a platform's early init code to tell the mpcore driver that the
clock frequency can vary. That causes the mpcore driver to register an
eventtimer, but not a timecounter. The platform has to provide a time
counter using some other fixed-frequency clock, but can still use the
per-cpu goodness of the mpcore hardware for event timers.
When the platform support code does something to change the frequency of
the CPU clocks (power saving, thermal management) it must tell the mpcore
driver code about it using arm_tmr_change_frequency().
register values, then restart the timer. This prevents a situation where
an old event fires just as we're about to load a new value into the timer,
when the start routine is called to change the time of the current event.
Also re-nest the parens properly for casting the result of converting
time and frequency to a count. This doesn't actually change the result of
the calcs, but will some day prevent a loss-of-precision warning on the
assignment, if that warning gets enabled.
Support covers device drivers for:
- Interrupt Combiner
- gpio/pad, External Interrupts Controller (pad)
- I2C Interface
- Chrome Embedded Controller
- Chrome Keyboard
Also:
- Use new gpio dev class in EHCI driver
- Expand device tree information
* Save the required VFP registers on context switch. If the exception bit
is set we need to save and restore the FPINST register, and if the fp2v
bit is also set we need to save and restore FPINST2.
* Move saving and restoring the floating point control registers to C.
* Clear the fpexc exception and fp2v flags on a floating-point exception.
* Signal a SIGFPE if the fpexc exception flag is set on an undefined
instruction. This is how the ARM core signals to software there is a
floating-point exception.
The ADC has a 12bit resolution and its raw output can be read via sysctl(8)
interface.
The driver allows the setup of ADC clock, samples average and open delay
(the number of clock cycles to wait before start the conversion).
The TSC_ADC module is set in the general purpose mode (no touchscreen
support).
Tested on Beaglebone-black.
Written based on AM335x TRM.
Reviewed by: rpaulo
Approved by: adrian (mentor)
Tested by: me, Brian J. McGovern, Sulev-Madis Silber (ketas)
To reduce the diff struct pcu.cnt field was not renamed, so
PCPU_OP(cnt.field) is still used. pc_cnt and pcpu are also used in
kvm(3) and vmstat(8). The goal was to not affect externally used KPI.
Bump __FreeBSD_version_ in case some out-of-tree module/code relies on the
the global cnt variable.
Exp-run revealed no ports using it directly.
No objection from: arch@
Sponsored by: EMC / Isilon Storage Division
The armv7 architecture uses a unified TLB model for maintenence ops even
if separate instruction and data TLBs are implemented in hardware. That
means that there's no distinction between the 'I' and 'D' flavors of flush,
they all use the same 'ID' implementation. On the other hand, there is a
difference between SMP and UP on armv7, but not on arm11, so use the armv7
routines for cortex-a processors.
further refinement is required as some device drivers intended to be
portable over FreeBSD versions rely on __FreeBSD_version to decide whether
to include capability.h.
MFC after: 3 weeks
which was added by cognet in 2012, so remove the no-longer-applicable
license stuff that referred to all the old contents, and put in a
standard 2-clause BSD license (to cover the 6 lines of useful code left
in here).
swi_exit code in exception.S instead of having its own inline expansion
of the DO_AST and PULLFRAME macros. That means that now all references
to the PUSH/PULLFRAME and DO_AST macros are localized to exception.S,
so move the macros themselves into there and remove them from asmacros.h
and eliminate vectors.S. All low-level exception handling is now
consolidated into exception.S.
Along with moving the default FIQ handler, change it to disable FIQs
before returning. An FIQ should never happen, but if it does, it's got
to be disabled as part of ignoring it.
In general, we have hand-wavy support for FIQs that probably hasn't been
used for 10 years and probably doesn't work (almost certainly doesn't
work for SMP because it only updates the vector on the current cpu). This
change doesn't really make the overall situation any better or worse.
and the functionality it provided into arm/exception.S. Rename the main
irq handling routine from arm_handler_execute() to arm_irq_handler() to
make it more congruent with how other exception handlers are named, and
also update its signature to reflect what has long been reality: it is
passed just a trapframe pointer, no interrupt number argument.
never actually ran on these chips (other than using SA1 support in an
emulator to do the early porting to FreeBSD long long ago). The clutter
and complexity of some of this code keeps getting in the way of other
maintenance, so it's time to go.
to the actual handler routine. All the pointers are static-intialized to
the only handlers available, and yet various platform-specific inits still
set those pointers (to the values they're already initialized to). Begin
to drain the swamp by removing all the redundant external declarations and
runtime setting of the pointers that's scattered around various places.
The old code used static storage to preserve a couple registers while
setting up the trapframe for the main handler. Doing so was the last
leftover crumbs from the days when a low-level debugger was hooked into
the exception entry code.
Now the exception entry sequence is essentially the same as for the
other exceptions, which still involves needlessly indirecting through
a function pointer which points to the same code on every platform.
Removing that indirection will be handled as a separate cleanup.
This work is based on an analysis by Juergen Weiss.
enabled. In vfp_discard(), if the state in the VFP hardware belongs to
the thread which is dying, NULL out pcpu fpcurthread to indicate the
state currently in the hardware belongs to nobody.
Submitted by: Juergen Weiss
Pointy hat to: me
a leftover from the days when a low-level debugger had hooks in the
undefined exception vector and needed stack space to function. These days
it effectively isn't used because we switch immediately to the svc32 mode
stack on exception entry. For that, the single undef mode stack per core
that gets set up at init time works fine.
The stack wasn't necessary but it was harmful, because the space for it
was carved out of the normal per-thread svc32 stack, in effect cutting
that 8K stack in half. If svc32 mode used more than 4k of stack space it
wandered down into the undef mode stack, and then an undef exception would
overwrite a couple words on the stack while switching to svc32 mode,
corrupting the scv32 stack. Having another stack abut the bottom of the
svc32 stack also effectively mooted the guard page below the stack.
This work is based on analysis and patches submitted by Juergen Weiss.
The old code was full of complexity that would only matter if the
kernel itself used the VFP hardware. Now that's reduced to either killing
the userland process or panicking the kernel on an illegal VFP instruction.
This removes most of the complexity from the assembler code, reducing it
to just calling the save code if the outgoing thread used the VFP.
The routine that stores the VFP state now takes a flag that indicates
whether the hardware should be disabled after saving state. Right now it
always is, but this makes the code ready to be used by get/set_mcontext()
(doing so will be addressed in a future commit).
Remove the arm-specific pc_vfpcthread from struct pcpu and use the MI
field pc_fpcurthread instead.
Reviewed by: cognet
appropriate for each of the 'foo' in the tree. This will allow us to
compile them together (although symbol conflicts prevent us from doing
that today, this just fixes the file name collision).
struct sigframe containing the struct ucontext.
The signal trampoline return code on the other hand DOES have just a
struct sigframe on the stack to work with, so have it get a pointer to
the ucontext out of there to pass along to sys_sigreturn.
In other words, make everything work right whether sys_sigreturn is
invoked from the trampoline or from userland code calling sigreturn(2).
Submitted by: Takashi Komatsu <komatsu.taka@jp.panasonic.com>
Reviewed by: cognet
It is available mostly in A10 devices like Hackberry, Marsboard,
Mele A1000, A2000, A100 HTPC, cubieboard1 and A20 device
like cubieboard2.
TX performance can be improved using both channels 0 and 1.
RX performance is poor and needs improvement with the assistance of
external DMA controller in case there is bulk TCP receiver.
Reviewed by: yongari@
Approved by: stas (mentor)
we've been using was actually just spinning due to ARM having redefined
the old 'wait for interrupt' operation via the system coprocessor as a nop
and replacing it with a WFI instruction.
code can supply a better implementation. A SoC with variable CPU frequency
is likely to use a fixed-frequency timer for DELAY() (but still use the
mpcore private timers as eventtimers).
Also remove spaces from the eventtimer and timecounter names.
implementation in arm/machdep.c. Most arm platforms either don't need to
do anything, or just need to call the standard eventtimer init routines.
A generic implementation that does that is now provided via weak linkage.
Any platform that needs to do something different can provide a its own
implementation to override the generic one.
- Don't use spaces or dots in the eventtimer or timecounter names.
They turn into sysctl node names, and it's just confusing.
- Use comparator #3 instead of #1 for one-shot events. There's an
extra 1-cycle penalty in the hardware for accessing the registers
for comparator 1, no point in paying that penalty.
- Lower the quality of the eventtimer from 1000 to 800, because the
device can't support PERCPU timers and some other device in the system
may be able to provide that.
Quartz is a tiny module utilized Freescale VF6xx
system-on-chip and development kit produced by
Device Solutions.
Quartz is available in a form of LGA (38x38x2mm)
or as a module with high-density connectors.
Sponsored by: Device Solutions
the Bufferable bit in the PDE entries of the secondary processor startup
pagetables.
The caches really need to be invalidated even earlier than this, but this
is a big step in the right direction. The invalidate needs to happen
before the MMU is enabled, which means it has to be called from asm code
that's running with physical addressing. Fixing that will be handled in
a future change.
implementations for each of the chips we support. Most chips up through
armv6 can use the armv4 implementation which has a single coprocessor
opcode for this operation. The rather more complex armv7 implementation
comes from netbsd.
While this is technically more correct, I don't think it much matters,
because the only thing in the tree that calls cpu_flush_dcache() is md(4)
and I'm > 99% sure it's bogus that it does so; md has no ability to do
anything that can perturb data cache coherency.
it into a bunch of different .c files. Remove declarations for the unused
mptramp() function from everywhere except AramadaXP (and I think it's
really not used there either, because the code that references it appears
to be insanely does-nothing in nature).
WANDBOARD.common config, but with the freescale-specific optons and devices
all together at the bottom now. In addition to reformatting and shuffling
lines around, two new options are added because they're now known to work,
VFP and FREEBSD_BOOT_LOADER.
This config does not include any static DTB, it requires that u-boot
provide a DTB (or a custom kernel config can compile one in).
This will supercede all the existing WANDBOARD* configs, but those will
be left around for a while to help people transition their customized
configs to include this new one instead.
The temperature monitor device is enabled to sample the die temperature at
16hz. The temperature is published via sysctl. A callout routine at 10hz
monitors the temperature and throttles back the cpu if the temperature
goes over a user-settable throttle point (by default 10C less than the
critical high-point temperature for the chip). The hardware is supposed
to be able to deliver an interrupt when the temperature exceeds a settable
limit, but the interrupt never arrives so for now a callout does the job.
At attach time we read the maximum cpu frequency the chip is allowed to run
at and the cpu is set to run at that speed. It's reported at attach time.
A sysctl variable reports the current speed when queried.
New sysctl values:
dev.imx6_anatop.0.cpu_mhz: 984
dev.imx6_anatop.0.temperature: 37.9C
dev.imx6_anatop.0.throttle_temperature: 95.0C
Steven Lawrance did the initial heavy lifting on this, but I changed
enough stuff that I'm the one to blame if anything breaks.
Submitted by: Steven Lawrance <stl@koffein.net>
and CAMBRIA boards that does not use loader to load the kernel. This
is basically how it was done for i386. This way tunables can also be
set. For example in config file:
env "/conf/AVILA.env"
And in AVILA.env:
vfs.unmapped_buf_allowed=0
MFC after: 2 weeks
is attached, by establishing a temporary mapping of the registers when
necessary. This is a temporary measure to keep progress moving; in the
long run we need better control over the order in which devices attach
(better than "the order they appear in the fdt dts source").
Attempt to demote the superpage if trying to pmap_enter() on
one. Panic only when the particular superpage should
no longer exist for that pmap and address.
Because pmap_enter_locked() is called from few different functions
some redundancy in superpage promotion attempts can be observed.
Hence, avoid promotion in pmap_enter_object() (if the object can
be mapped by superpage it will be handled by pmap_enter_object()
itself) and also do not waste time in pmap_enter_quick().
From now on the promotion will be performed only in pmap_enter().
It was possible to create RW superpage mapping even if
the base pages were RO due to wrong setting of the prot
flag passed to pmap_map_section().
Promotion attempt should be canceled in case of attributes
mismatch between any two base pages. Since we still use
pv_flags to maintain permission to write (PVF_WRITE) and
wired status (PVF_WIRED) for a page, it is also necessary
to take those variables into account.
Invalidate L1 PTE regardles of existance of the corresponding
l2_bucket. This is relevant when superpage is entered via
pmap_enter_object() and will fix crash on entering page
in place of not properly removed superpage.
because we use the 1MiB section maps as they only need a single pagetable.
To allow this we only use pc relative loads to ensure we only load from
physical addresses until we are running from a known virtual address.
As a side effect any data from before or 64MiB after the kernel needs to
be mapped in to be used. This should not be an issue for kernels loaded
with ubldr as it places this data just after the kernel. It will be a
problem when loading directly from anything using the Linux ABI that
places the ATAG data outside this range, for example U-Boot.
Add gpioled(4) to BEAGLEBONE kernel and add the description of the four
on-board leds of beaglebone-black to its DTS file.
Approved by: adrian (mentor, implicit)
change the gpio children can be described as directly connected to the GPIO
controller without the need of describing the OFW GPIO bus itself on the
DTS file.
With this commit the OFW GPIO bus is fully functional on BBB and RPi.
GPIO controllers which want to use the OFW GPIO bus will need similar
changes.
Approved by: adrian (mentor, implicit)
to decide what protections are required by the faulting access. The old
code disassembled the faulting instruction, and there are a lot of new
instructions that aren't handled. The old code is still used for armv4/5
which doesn't have the WNR bit)
routine, now a platform can provide a pointer to an early_putc() routine
which is used instead of cn_putc(). Control can be handed off from early
printf support to standard console support by NULLing out the pointer
during standard console init.
This leverages all the existing error reporting that uses printf calls,
such as panic() which can now be usefully employed even in early
platform init code (useful at least to those who maintain that code and
build kernels with EARLY_PRINTF defined).
Reviewed by: imp, eadler
Real means the one TSC / Symmetricom / Microsemi actually uses on their 4370
and other rm9200 boards. This code demonstrates a variety of useful things
board init code can do, including adjusting the master clock frequency.
communicate the kernel's physical load address from where it's known in
initarm() into cpu_mp_start() which is called from non-arm code and
takes no parameters.
This adds the global variable and ensures that all the various copies
of initarm() set it. It uses the variable in cpu_mp_start(), eliminating
the last uses of KERNPHYSADDR outside of locore.S (where we can now
calculate it instead of relying on the constant).
a new physmem.c file. The new code provides helper routines that can be
used by legacy SoCs and newer FDT-based systems. There are routines to
add one or more regions of physically contiguous ram, and exclude one or
more physically contiguous regions of ram. Ram can be excluded from crash
dumps, from being given over to the vm system for allocation management,
or both. After all the included and excluded regions have been added,
arm_physmem_init_kernel_globals() processes the regions into the global
dump_avail and phys_avail arrays and realmem and physmem variables that
communicate memory configuration to the rest of the kernel.
Convert all existing SoCs to use the new helper code.
This was an optimization used only by a few xscale platforms. Part of
the optimization was to create a direct map for all physical pages, and
that resulted in making multiple mappings of pages in a way that bypassed
the logic in pmap.c to handle VIVT cache aliasing. It also just generally
made the code more complex and hard to maintain for all SoCs.
Reviewed by: cognet
and where the code that references it can safely be elided if it's not
defined (meaning the code is used for legacy arm platforms that still
define the compile-time PHYSADDR but not on newer systems that calculate
the value at runtime).
in effect due to r250753. That is sufficient for all SoCs with a 32 byte
cache line size. Systems with 64 byte cache lines will need the option;
that will be done in a separate commit.
Thanks to loos@ for pointing out r250753.
a sub-node of nexus (ofwbus) rather than direct attach under nexus. This
fixes FDT on x86 and will make coexistence with ACPI on ARM systems easier.
SPARC is unchanged.
Reviewed by: imp, ian
the old way was to store pcpu in a register, and get curthread from pcpu,
which is not very atomic, and led to issues if the thread was migrated
to another core between the time we got the pcpu address and the time we
got curthread.
Instead, we now store curthread where pcpu used to be store, and we
calculate the pcpu address based on the cpu id.
to check the status property in their probe routines.
Simplebus used to only instantiate its children whose status="okay"
but that was improper behavior, fixed in r261352. Now that it doesn't
check anymore and probes all its children; the children all have to
do the check because really only the children know how to properly
interpret their status property strings.
Right now all existing drivers only understand "okay" versus something-
that's-not-okay, so they all use the new ofw_bus_status_okay() helper.
It turns out the version of gas we're using interprets the old '_all' mask
as 'fc' instead of 'fsxc'. That is, "all" doesn't really mean "all".
This was the cause of the "wrong-endian register restore" bug that's
been causing problems with some cortex-a9 chips. The 'endian' bit in the
spsr register would never get changed (it falls into the 'x' mask group)
and the first return-from-exception would fail if the chip had powered on
with garbage in the spsr register that included the big-endian bit. It's
unknown why this affected only certain cortex-a9 chips.
strings and include arbitrary information (IRQ line/domain/sense). When the
ofw_bus_map_intr() API was introduced, it assumed that, as on most systems,
these were either 1 cell, containing an interrupt line, or 2, containing
a line number plus a sense code. It turns out a non-negligible number of
ARM systems use 3 (or even 4!) cells for interrupts, so make this more
general.
devices. This is a nop, except for what's reported by atmelbus for the
resources.
It would be nice if we could dymanically allocated these things, but
the pmap_mapdev panics if we don't keep the static mappings, so we
still need to play the carefully allocate VA space between all
supported SoC game.
User's with their own devices may need to make adjustments.
Note that this commit hasn't been compile tested because these files
are not hooked up to the build...
PR: 186129
Submitted by: Takanori Sawada
Approved by: rpaulo