communicate the kernel's physical load address from where it's known in
initarm() into cpu_mp_start() which is called from non-arm code and
takes no parameters.
This adds the global variable and ensures that all the various copies
of initarm() set it. It uses the variable in cpu_mp_start(), eliminating
the last uses of KERNPHYSADDR outside of locore.S (where we can now
calculate it instead of relying on the constant).
a new physmem.c file. The new code provides helper routines that can be
used by legacy SoCs and newer FDT-based systems. There are routines to
add one or more regions of physically contiguous ram, and exclude one or
more physically contiguous regions of ram. Ram can be excluded from crash
dumps, from being given over to the vm system for allocation management,
or both. After all the included and excluded regions have been added,
arm_physmem_init_kernel_globals() processes the regions into the global
dump_avail and phys_avail arrays and realmem and physmem variables that
communicate memory configuration to the rest of the kernel.
Convert all existing SoCs to use the new helper code.
This was an optimization used only by a few xscale platforms. Part of
the optimization was to create a direct map for all physical pages, and
that resulted in making multiple mappings of pages in a way that bypassed
the logic in pmap.c to handle VIVT cache aliasing. It also just generally
made the code more complex and hard to maintain for all SoCs.
Reviewed by: cognet
the old way was to store pcpu in a register, and get curthread from pcpu,
which is not very atomic, and led to issues if the thread was migrated
to another core between the time we got the pcpu address and the time we
got curthread.
Instead, we now store curthread where pcpu used to be store, and we
calculate the pcpu address based on the cpu id.
It turns out the version of gas we're using interprets the old '_all' mask
as 'fc' instead of 'fsxc'. That is, "all" doesn't really mean "all".
This was the cause of the "wrong-endian register restore" bug that's
been causing problems with some cortex-a9 chips. The 'endian' bit in the
spsr register would never get changed (it falls into the 'x' mask group)
and the first return-from-exception would fail if the chip had powered on
with garbage in the spsr register that included the big-endian bit. It's
unknown why this affected only certain cortex-a9 chips.
related to setting up static device mappings. Since it was only used by
arm/mv/mv_pci.c, it's now just static functions within that file, plus
one public function that gets called only from arm/mv/mv_machdep.c.
obsolete. This involves the following pieces:
- Remove it entirely on PowerPC, where it is not used by MD code either
- Remove all references to machine/fdt.h in non-architecture-specific code
(aside from uart_cpu_fdt.c, shared by ARM and MIPS, and so is somewhat
non-arch-specific).
- Fix code relying on header pollution from machine/fdt.h includes
- Legacy fdtbus.c (still used on x86 FDT systems) now passes resource
requests to its parent (nexus). This allows x86 FDT devices to allocate
both memory and IO requests and removes the last notionally MI use of
fdtbus_bs_tag.
- On those architectures that retain a machine/fdt.h, unused bits like
FDT_MAP_IRQ and FDT_INTR_MAX have been removed.
Add suport for setting triggering level and polarity in GIC.
New function pointer was added to nexus which corresponds
to the function which sets level/sense in the hardware (GIC).
Submitted by: Wojciech Macek <wma@semihalf.com>
Obtained from: Semihalf
Qualcomm Snapdragon S4 and Snapdragon 400/600/800 SoCs and has architectural
similarities to ARM Cortex-A15. As for development boards IFC6400 series embedded
boards from Inforce Computing uses Snapdragon S4 Pro/APQ8064.
Approved by: stas (mentor)
shifts into the sign bit. Instead use (1U << 31) which gets the
expected result.
This fix is not ideal as it assumes a 32 bit int, but does fix the issue
for most cases.
A similar change was made in OpenBSD.
Discussed with: -arch, rdivacky
Reviewed by: cperciva
words, every architecture is now auto-sizing the kmem arena. This revision
changes kmeminit() so that the definition of VM_KMEM_SIZE_SCALE becomes
mandatory and the definition of VM_KMEM_SIZE becomes optional.
Replace or eliminate all existing definitions of VM_KMEM_SIZE. With
auto-sizing enabled, VM_KMEM_SIZE effectively became an alternate spelling
for VM_KMEM_SIZE_MIN on most architectures. Use VM_KMEM_SIZE_MIN for
clarity.
Change kmeminit() so that the effect of defining VM_KMEM_SIZE is similar to
that of setting the tunable vm.kmem_size. Whereas the macros
VM_KMEM_SIZE_{MAX,MIN,SCALE} have had the same effect as the tunables
vm.kmem_size_{max,min,scale}, the effects of VM_KMEM_SIZE and vm.kmem_size
have been distinct. In particular, whereas VM_KMEM_SIZE was overridden by
VM_KMEM_SIZE_{MAX,MIN,SCALE} and vm.kmem_size_{max,min,scale}, vm.kmem_size
was not. Remedy this inconsistency. Now, VM_KMEM_SIZE can be used to set
the size of the kmem arena at compile-time without that value being
overridden by auto-sizing.
Update the nearby comments to reflect the kmem submap being replaced by the
kmem arena. Stop duplicating the auto-sizing formula in every machine-
dependent vmparam.h and place it in kmeminit() where auto-sizing takes
place.
Reviewed by: kib (an earlier version)
Sponsored by: EMC / Isilon Storage Division
allocates kva space from the top down for the device mappings and builds
entries in an internal table which is automatically used later by
arm_devmap_bootstrap(). The platform code just calls the new
arm_devmap_add_entry() function as many times as it needs to (up to 32
entries allowed; most platforms use 2 or 3 at most).
There is also a new arm_devmap_lastaddr() function that returns the lowest
kva address allocated; this can be used to implement initarm_lastaddr()
which is used to initialize vm_max_kernel_address.
The new code is based on a similar concept developed for the imx family
SoCs recently. They will soon be converted to use this new common code.
static device mappings, rather than as the first of the initializations
that a platform can hook into. This allows a platform to allocate KVA
from the top of the address space downwards for things like static device
mapping, and return the final "last usable address" result after that and
other early init work is done.
Because some platforms were doing work in initarm_lastaddr() that needs to
be done early, add a new initarm_early_init() routine and move the early
init code to that routine on those platforms.
Rename platform_devmap_init() to initarm_devmap_init() to match all the
other init routines called from initarm() that are designed to be
implemented by platform code.
Add a comment block that explains when these routines are called and the
type of work expected to be done in each of them.
new devmap.[ch] files. Emphasize the MD nature of these things by using
the prefix arm_devmap_ on the function and type names (already a few of
these things found their way into MI code, hopefully it will be harder to
do by accident in the future).
out common code related to mapping device memory into a new devmap.c file.
Remove the growing duplication of code that used pmap_devmap_find_pa() and
then did some math with the returned results to generate a virtual address,
and likewise in reverse to get a physical address. Now there are a pair
of functions, arm_devmap_vtop() and arm_devmap_ptov(), to do that. The
bus_space_map() implementations are rewritten in terms of these.
accessed through the direct map unless the kernel configuration actually
includes a direct map. Only a few configurations do, and for the rest the
unnecessary free page pool is a small pessimization.
Tested by: zbb
MFC after: 6 weeks
Use values of the correct defines to determine statement's result.
ARM_ARCH_ symbols are always defined, hence only values are relevant.
Reviewed by: cognet
Sheeva PJ4Bv6 - based chips were only prototypes for V7 class Armada
SoC family. Current in-tree support for PJ4Bv6 will not work and also
there should be no platforms in active use that would incorporate that
CPU revision.
The only remaining user was the code that allocates bounce pages for armv4
busdma. It's not clear why bounce pages would need uncached memory, but
if that ever changes, kmem_alloc_attr() would be the way to get it.
really need it. That would be almost everywhere it was included. Add
it in a couple files that really do need it and were previously getting
it by accident via another header.
included by vm/pmap.h, which is a prerequisite for arm/machine/pmap.h
so there's no reason to ever include it directly.
Thanks to alc@ for pointing this out.
Promoting base pages to superpages can increase TLB coverage and allow for
efficient use of page table entries. This development provides FreeBSD/ARM
with superpages management mechanism roughly equivalent to what we have for
i386 and amd64 architectures.
1. Add mechanism for automatic promotion of 4KB page mappings to 1MB section
mappings (and demotion when not needed, respectively).
2. Managed and non-kernel mappings are now superpages-aware.
3. The functionality can be enabled by setting "vm.pmap.sp_enabled" tunable to
a non-zero value (either in loader.conf or by modifying "sp_enabled"
variable in pmap-v6.c file). By default, automatic promotion is currently
disabled.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: alc
Sponsored by: The FreeBSD Foundation, Semihalf
This allows for enabling and configuring superpages reservation mechanism in
order to allocate and populate 256 4KB base pages (for the purpose of
promotion to a 1MB superpage).
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: alc
Sponsored by: The FreeBSD Foundation, Semihalf
Revise L2_S_PROT_MASK to include all of the protection bits. Notice that
clearing these bits does not always take away the corresponding permissions
(for example, permission is granted when the bit is cleared). The bits are
cleared but are to be set or left cleared accordingly in pmap_set_prot(),
pmap_enter_locked(), etc.
Clear L2_XN along with L2_S_PROT_MASK in pmap_set_prot() so that all
permissions related bits are cleared before actual configuration.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: gber
Sponsored by: The FreeBSD Foundation, Semihalf
because an exception may happen at any time. The stack alignment rules on
ARM EABI state the only place the stack must be 8-byte aligned is on a
function boundary.
If an exception happens while a function is setting up or tearing down it's
stack frame it may not be correctly aligned. There is also no requirement
for it to be when the function is a leaf node.
The fix is to align the stack after we have stored a backup of the old stack
pointer, but before we have stored anything in the trapframe. Along with
this we need to adjust the size of the trapframe by 4 bytes to ensure the
stack below it is also correctly aligned.
instruction set. Thumb-2 requires an if-then instruction to implement
conditional codes.
When building for ARM mode the it-then instructions do not generate any
assembled instruction as per the ARMv7-A Architecture Reference Manual, and
are safe to use.
While this allows the atomic instructions to be built, it doesn't mean we
fully support Thumb code. It works in small tests, but is still known to
fail in a large number of places.
While here add a check for the armv6t2 architecture.
Issues were noted by Bruce Evans and are present on all architectures.
On i386, a counter fetch should use atomic read of 64bit value,
otherwise carry from the increment on other CPU could be lost for the
given fetch, making error of 2^32. If 64bit read (cmpxchg8b) is not
available on the machine, it cannot be SMP and it is enough to disable
preemption around read to avoid the split read.
On x86 the counter increment is not atomic on purpose, which makes it
possible for the store of the incremented result to override just
zeroed per-cpu slot. The effect would be a counter going off by
arbitrary value after zeroing. Perform the counter zeroing on the
same processor which does the increments, making the operations
mutually exclusive. On i386, same as for the fetching, if the
cmpxchg8b is not available, machine is not SMP and we disable
preemption for zeroing.
PowerPC64 is treated the same as amd64.
For other architectures, the changes made to allow the compilation to
succeed, without fixing the issues with zeroing or fetching. It
should be possible to handle them by using the 64bit loads and stores
atomic WRT preemption (assuming the architectures also converted from
using critical sections to proper asm). If architecture does not
provide the facility, using global (spin) mutex would be non-optimal
but working solution.
Noted by: bde
Sponsored by: The FreeBSD Foundation
past a trapframe.
Use this macro in exception_exit as it is the function the unwinder enters
as the functions that store the frame setting lr to point to it.
check which variant we are on, and if it is a VFPv3 or v4, and has 32
double registers we save these. This fixes VFP support on Raspberry Pi.
While here clean fmrx and fmxr up to use the register names from vfp.h
as opposed to the raw register names.
* Stop pretending we support anything other than ELF by removing code
surrounded by #ifdef __ELF__ ... #endif.
* Remove _JB_MAGIC_SETJMP and _JB_MAGIC__SETJMP, they are defined in
setjmp.h, which is able to be included from asm.
* Fix the spelling of dependent.
* Rename END _END and add END and ASEND to complement ENTRY and ASENTRY
respectively
* Add macros to simplify accessing the Global Offset Table, some of these
will be used in the upcoming update to the setjmp functions.
Using PVF_MOD, PVF_REF and PVF_EXEC is redundant as we can get the proper
info from PTE bits.
When the mapping is marked as executable and has been referenced we assume
that it has been executed. Similarly, when the mapping is set to be writable
and is referenced, it must have been due to write access to it.
PVF_MOD and PVF_REF flags are kept just for pmap_clearbit() usage,
to pass the information on which bit should be cleared.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf
Use pmap_find_pv if needed instead of multiplying its code throughout
pmap-v6.
Avoid possible NULL pointer dereference in pmap_enter_locked()
When trying to get m->md.pv_memattr, make sure that m != NULL,
in particular that vector_page is set to be NULL.
Do not set PGA_REFERENCED flag in pmap_enter_pv().
On ARM any new page reference will result in either entering the new
mapping by calling pmap_enter, etc. or fixing-up the existing mapping in
pmap_fault_fixup().
Therefore we set PGA_REFERENCED flag in the earlier mentioned cases and
setting it later in pmap_enter_pv() is just waste of cycles.
Delete unused pm_pdir pointer from the pmap structure.
Rearrange brackets in the fault cause detection in trap.c
Place the brackets correctly in order to see course of the conditions
instantaneously.
Unify naming in pmap-v6.c and improve style
Use naming common for whole pmap and compatible with other pmaps,
improve style where possible:
pm -> pmap
pg -> m
opg -> om
*pt -> *ptep
*pte -> *ptep
*pde -> *pdep
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf
bit in PTE.
Enable Access Flag in CPU control. With AF enabled each valid mapping
needs to have referenced bit in PTE set in order to be able to cache
it in the TLB.
AP[0] bit is to be used as reference flag.
All access permissions are encoded by AP[2:1] wherein AP[1] is in fact
"user enable" and AP[2](APX) is "write disable".
All mappings are always set to be valid. Reference emulation is performed
by setting/clearing reference flag in PTE.
md.pvh_attrs are no longer necessary however pv_flags are still being used
for now.
Marking vm_page as "dirty" or "referenced" is being performed on:
- page or flag fault servicing in pmap_fault_fixup(), basing on the fault
type
- vm_fault servicing in pmap_enter() according to the desired protections
and faulty access type
Redundant page marking has been removed as on ARM we know exactly when the
particular page is referenced or is going to be written.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf
PV entries are now roughly half the size.
Instead of using a shared UMA zone for 28 byte pv entries
(two 8-byte tailq nodes, a 4 byte pointer, a 4 byte address and 4 byte
flags), we allocate a page at a time per process.
This provides 252 pv entries per process (actually, per pmap address space)
and eliminates one of the 8-byte tailq entries since we now can track
per-process pv entries implicitly.
The pointer to the pmap can be eliminated by doing address arithmetic to
find the metadata on the page headers to find a single pointer shared by
all 252 entries. There is an 8-int bitmap for the freelist of those 252
entries.
When in serious low memory condition, allocation of another pv_chunk is
possible by freeing some pages in pmap_pv_reclaim().
Added pv_entry/pv_chunk related statistics to pmap.
pv_entry/pv_chunk statistics can be accessed via sysctl vm.pmap.
Ported PTE freelist of KVA allocation and maintenance from i386.
Using an idea from Stephan Uphoff, use the empty pte's that correspond
to the unused kva in the pv memory block to thread a freelist through.
This allows us to free pages that used to be used for pv entry chunks
since we can now track holes in the kva memory block.
As both ARM pmap.c and pmap-v6.c use the same header and pv_entry, pmap and
md_page structures are different, it was needed to separate code designed
for ARMv6/7 from the one for other ARMs.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: alc
Sponsored by: The FreeBSD Foundation, Semihalf
order to match the MAXCPU concept. The change should also be useful
for consolidation and consistency.
Sponsored by: EMC / Isilon storage division
Obtained from: jeff
Reviewed by: alc
Keep following access permissions:
APX AP Kernel User
1 01 R N
1 10 R R
0 01 R/W N
0 11 R/W R/W
Avoid using reserved in ARMv6 APX|AP settings:
- In case of unprivileged (user) access without permission to write,
the access permission bits were being set to reserved for ARMv6
(but valid for ARMv7) value of APX|AP = 111.
Fix-up faulting userland accesses properly:
- Wrong condition statement in pmap_fault_fixup() caused that
any genuine, unprivileged access was being fixed-up instead of
just skip doing anything and return. Staring from now we ensure
proper reaction for illicit user accesses.
L2_S_PROT_R and L2_S_PROT_U names might be misleading as they do not
reflect real permission levels. It will be clarified in following
patches (switch to AP[2:1] permissions model).
Obtained from: Semihalf
Introduce counter(9) API, that implements fast and raceless counters,
provided (but not limited to) for gathering of statistical data.
See http://lists.freebsd.org/pipermail/freebsd-arch/2013-April/014204.html
for more details.
In collaboration with: kib
Reviewed by: luigi
Tested by: ae, ray
Sponsored by: Nginx, Inc.
add the ability for userland to be notified of changes on gpio pins via
a select(2)/read(2) interface.
Change the interrupt handler from filtered to threaded.
Because of the uiomove() calls in the new interface, change locking from
standard mutex to sx.
Add / restore the at91_gpio_high_z() function.
Reviewed by: imp (long ago)
register from a bus space resource.
Note that this macro is just for ARM, and is intended to have a short
lifespan. The DMA engines in some SoCs need the physical address of a
memory-mapped device register as one of the arguments for the transfer.
Several scattered ad-hoc solutions have been converted to use this macro,
which now also serves to mark the places where a more complete fix needs
to be applied (after that fix has been designed).
pages around, taking array of vm_page_t both for source and
destination. Starting offsets and total transfer size are specified.
The function implements optimal algorithm for copying using the
platform-specific optimizations. For instance, on the architectures
were the direct map is available, no transient mappings are created,
for i386 the per-cpu ephemeral page frame is used. The code was
typically borrowed from the pmap_copy_page() for the same
architecture.
Only i386/amd64, powerpc aim and arm/arm-v6 implementations were
tested at the time of commit. High-level code, not committed yet to
the tree, ensures that the use of the function is only allowed after
explicit enablement.
For sparc64, the existing code has known issues and a stab is added
instead, to allow the kernel linking.
Sponsored by: The FreeBSD Foundation
Tested by: pho (i386, amd64), scottl (amd64), ian (arm and arm-v6)
MFC after: 2 weeks
other architectures [1].
While here:
- Remove an unused and commented out include.
- Add a comment describing the file that other copies have.
- Fix the style of the defines and add a comment on what each one is.
Suggested by: [1] alc
sent a SIGABRT when it is loaded as it is too large. This is the smallest
power of two MiB value that allows us to execute clang.
While here wrap it in an #ifndef to be consistent with the other
architectures.
Submitted by: Daisuke Aoyama <aoyama at peach.ne.jp>
fact, use the same values here that we use on 32-bit x86 and MIPS. Some
machines were reported to have problems with the more aggressive values.
Reported and tested by: andrew
submap. Otherwise, after r246204, the auto-scaling logic in kern_malloc.c
tries to create a kmem submap that consumes the entire kernel map on a
Pandaboard with 1 GB of RAM.
Tested by: gonzo
machine to another. Therefore, VM_MAX_KERNEL_ADDRESS can't be a constant.
Instead, #define it to be a variable, vm_max_kernel_address, just like we
do on sparc64.
Reviewed by: kib
Tested by: ian
VM_KMEM_SIZE_SCALE specifies which fraction of the available physical
memory, after deduction of the kernel itself and other early statically
allocated memory, can be used for the kmem_map. The kmem_map provides
for all UMA/malloc allocations in KVM space.
Previously ARM was using a fixed kmem_map size of (12*1024*1024) = 12MB
without regard to effectively available memory. This is too small for
recent ARM SoC with more than 128MB of RAM.
For reference a description of others related kmem_map parameters:
VM_KMEM_SIZE default start size of kmem_map if SCALE is
not defined
VM_KMEM_SIZE_MIN hard floor on the kmem_map size
VM_KMEM_SIZE_MAX hard ceiling on the kmem_map size
VM_KMEM_SIZE_SCALE fraction of the available real memory to
be used for the kmem_map, limited by the
MIN and MAX parameters.
Tested by: ian
MFC after: 1 week
interrupt counts and names, by making the names into an array of fixed-length
strings that can be directly indexed. This eliminates extra memory accesses
on every interrupt to increment the counts.
As a side effect, it also fixes a bug that would corrupt the names data
if a name was longer than MAXCOMLEN, which led to incorrect vmstat -i output.
Approved by: cognet (mentor)
- Add pl310.disable tunable to disable L2 cache altogether. In
order to make sure that it's 100% disabled we use cache event
counters for cache line eviction and read allocate events
and panic if any of these counters increased. This is purely
for debugging purpose
- Direct access DEBUG_CTRL and CTRL might be unavailable in
unsecure mode, so use platform-specific functions for
these registers
- Replace #if 1 with proper erratum numbers
- Add erratum 753970 workaround
- Remove wait function for atomic operations
- Protect cache operations with spin mutex in order to prevent race condition
- Disable instruction cache prefetch and make sure data cache
prefetch is enabled in OMAP4-specific intialization
interfere with structure fields of the same name in drivers, like
the intr_disable function pointer in struct cphy_ops in cxgb(4).
Instead define intr_disable and intr_restore as inline functions.
With intr_disable() an inline function, the I32_bit and F32_bit
macros now need to be visible in MI code and given the rather
poor names, this is not at all good. Define ARM_CPSR_F32 and
ARM_CPSR_I32 and use that instead of F32_bit and I32_bit (resp)
for now.
The copies of initarm used on platforms with FDT support were almost
identical. The differences were pulled out into separate functions that
were called by initarm.
This change merges the, now identical, copies of initarm and a few of it's
support functions. This is a step towards a common kernel on ARMv6.
this some compilers will place a cmp instruction before the atomic operation
and expect to be able to use the result afterwards. By adding "cc" to the
list of used registers we tell the compiler to not do this.
problematic because some callers to pmap_kextract() expect its
implementation to be lock-less. In particular, uma_dbg_alloc() implicitly
requires this. Otherwise, lock-order reversals occur between pmap locks and
UMA zone locks. So, this change introduces a lock-less implementation of
pmap_kextract().
Disable recursion on the pvh global lock in the new armv6 pmap. While
recursion on this locks occurs in the old arm pmap, it thankfully doesn't
occur in the armv6 pmap.
Tested by: jmg
On single core devices set_stackptrs is only ever called with cpu = 0 in
initarm and will be identical to the existing function. On SMP this needs
to be implemented for sys/arm/mp_machdep.c, but the implementations are
identical for each SoC.
MSI are implemented via software interrupt. PCIe cards will write
into software interrupt register which will cause inbound shared
interrupt which will be interpreted as a MSI.
Obtained from: Marvell, Semihalf
- Add functions to calculate clocks instead using hardcoded values
- Update reset and timers functions
- Update number of interrupts
- Change name of platform from db88f78100 to db78460
- Correct DRAM size and PCI IRQ routing in dts file.
Obtained from: Semihalf
Cummulative patch of changes that are not vendor-specific:
- ARMv6 and ARMv7 architecture support
- ARM SMP support
- VFP/Neon support
- ARM Generic Interrupt Controller driver
- Simplification of startup code for all platforms
arm platform. Add all the atmel boards to the ATMEL kernel for
testing purposes. Until boot loader arg parsing of baord type
is done, this won't actually be able to do the runtime selection.
This is required for ARM EABI. Section 7.1.1 of the Procedure Call for the
ARM Architecture (AAPCS) defines wchar_t as either an unsigned int or an
unsigned short with the former preferred.
Because of this requirement we need to move the definition of __wchar_t to
a machine dependent header. It also cleans up the macros defining the limits
of wchar_t by defining __WCHAR_MIN and __WCHAR_MAX in the same machine
dependent header then using them to define WCHAR_MIN and WCHAR_MAX
respectively.
Discussed with: bde
usermode, using shared page. The structures and functions have vdso
prefix, to indicate the intended location of the code in some future.
The versioned per-algorithm data is exported in the format of struct
vdso_timehands, which mostly repeats the content of in-kernel struct
timehands. Usermode reading of the structure can be lockless.
Compatibility export for 32bit processes on 64bit host is also
provided. Kernel also provides usermode with indication about
currently used timecounter, so that libc can fall back to syscall if
configured timecounter is unknown to usermode code.
The shared data updates are initiated both from the tc_windup(), where
a fast task is queued to do the update, and from sysctl handlers which
change timecounter. A manual override switch
kern.timecounter.fast_gettime allows to turn off the mechanism.
Only x86 architectures export the real algorithm data, and there, only
for tsc timecounter. HPET counters page could be exported as well, but
I prefer to not further glue the kernel and libc ABI there until
proper vdso-based solution is developed.
Minimal stubs neccessary for non-x86 architectures to still compile
are provided.
Discussed with: bde
Reviewed by: jhb
Tested by: flo
MFC after: 1 month
layer, but it is read directly by the MI VM layer. This change introduces
pmap_page_is_write_mapped() in order to completely encapsulate all direct
access to PGA_WRITEABLE in the pmap layer.
Aesthetics aside, I am making this change because amd64 will likely begin
using an alternative method to track write mappings, and having
pmap_page_is_write_mapped() in place allows me to make such a change
without further modification to the MI VM layer.
As an added bonus, tidy up some nearby comments concerning page flags.
Reviewed by: kib
MFC after: 6 weeks
this array either from Linux boot data, when enabled, or in the
typical way that most ports do it. arm_pyhs_avail_init is coming
soon since it must be a separate function.
redboot. Support is very preiminary and likely needs some work. Also,
do some minor code shuffling of the FreeBSD /boot/loader metadata
parsing code. This code is preliminary and should be used with
caution.
is enabled, sets values based on the metadata passed in. Otherwise
fake_preload_metadata is called. Change the default parse_boot_param
to default_parse_boot_param. Enable this functionality only on the mv
platform, which is where most of the code is from.
Reviewed by: cognet, Ian Lapore
the boot parameters from initarm first thing. parse_boot_param parses
the boot arguments and converts them to the /boot/loader metadata the
rest of the kernel uses. parse_boot_param is a weak alias to
fake_preload_metadata, which all the platforms use now, but may become
more extensive in the future.
Since it is a weak symbol, specific boards may define their own
parse_boot_param to interface to custom boot loaders.
Reviewed by: cognet@, Ian Lapore
structure with the first 4 registers to allow a wider range of boot
loaders to work. Future commits will make use of this to centralize
support for the different loaders.
in_cksum.h required ip.h to be included for struct ip. To be
able to use some general checksum functions like in_addword()
in a non-IPv4 context, limit the (also exported to user space)
IPv4 specific functions to the times, when the ip.h header is
present and IPVERSION is defined (to 4).
We should consider more general checksum (updating) functions
to also allow easier incremental checksum updates in the L3/4
stack and firewalls, as well as ponder further requirements by
certain NIC drivers needing slightly different pseudo values
in offloading cases. Thinking in terms of a better "library".
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
Reviewed by: gnn (as part of the whole)
MFC After: 3 days
New kernel events can be added at various location for sampling or counting.
This will for example allow easy system profiling whatever the processor is
with known tools like pmcstat(8).
Simultaneous usage of software PMC and hardware PMC is possible, for example
looking at the lock acquire failure, page fault while sampling on
instructions.
Sponsored by: NETASQ
MFC after: 1 month
dynamic rounding modes, but FPUless chips that use softfloat can support it
because everything is emulated anyway. (We presently have incomplete
support for hardware FPUs.)
Submitted by: Ian Lepore
- Write Buffers have to be drained after write to Page Table even if caches
are in write-through mode.
- Make sure to sync PTE in pmap_zero_page_generic().
Submitted by: Michal Mazur
Reviewed by: cognet
Obtained from: Semihalf
MFC after: 1 month
implement a deprecated FPU control interface in addition to the
standard one. To make this clearer, further deprecate ieeefp.h
by not declaring the function prototypes except on architectures
that implement them already.
Currently i386 and amd64 implement the ieeefp.h interface for
compatibility, and for fp[gs]etprec(), which doesn't exist on
most other hardware. Powerpc, sparc64, and ia64 partially implement
it and probably shouldn't, and other architectures don't implement it
at all.
- A race condition could happen if two threads were using RAS at the same time
as the code didn't reset RAS_END, the RAS code could believe we were not in
a RAS, when we were in fact.
- Using signed value logic to compare addresses wasn't such a good idea.
Many thanks to Ian to investigate on these issues.
Pointy hat to: cognet
PR: arm/161498
Submitted by: Ian Lepore <freebsd At damnhippie DOT dyndns dot org
MFC after: 1 week
This patch is going to help in cases like mips flavours where you
want a more granular support on MAXCPU.
No MFC is previewed for this patch.
Tested by: pluknet
Approved by: re (kib)
architectures (i386, for example) the virtual memory space may be
constrained enough that 2MB is a large chunk. Use 64K for arches
other than amd64 and ia64, with special handling for sparc64 due to
differing hardware.
Also commit the comment changes to kmem_init_zero_region() that I
missed due to not saving the file. (Darn the unfamiliar development
environment).
Arch maintainers, please feel free to adjust ZERO_REGION_SIZE as you
see fit.
Requested by: alc
MFC after: 1 week
MFC with: r221853
cpuset_t objects.
That is going to offer the underlying support for a simple bump of
MAXCPU and then support for number of cpus > 32 (as it is today).
Right now, cpumask_t is an int, 32 bits on all our supported architecture.
cpumask_t on the other side is implemented as an array of longs, and
easilly extendible by definition.
The architectures touched by this commit are the following:
- amd64
- i386
- pc98
- arm
- ia64
- XEN
while the others are still missing.
Userland is believed to be fully converted with the changes contained
here.
Some technical notes:
- This commit may be considered an ABI nop for all the architectures
different from amd64 and ia64 (and sparc64 in the future)
- per-cpu members, which are now converted to cpuset_t, needs to be
accessed avoiding migration, because the size of cpuset_t should be
considered unknown
- size of cpuset_t objects is different from kernel and userland (this is
primirally done in order to leave some more space in userland to cope
with KBI extensions). If you need to access kernel cpuset_t from the
userland please refer to example in this patch on how to do that
correctly (kgdb may be a good source, for example).
- Support for other architectures is going to be added soon
- Only MAXCPU for amd64 is bumped now
The patch has been tested by sbruno and Nicholas Esborn on opteron
4 x 12 pack CPUs. More testing on big SMP is expected to came soon.
pluknet tested the patch with his 8-ways on both amd64 and i386.
Tested by: pluknet, sbruno, gianni, Nicholas Esborn
Reviewed by: jeff, jhb, sbruno
Compile sys/dev/mem/memutil.c for all supported platforms and remove now
unnecessary dev_mem_md_init(). Consistently define mem_range_softc from
mem.c for all platforms. Add missing #include guards for machine/memdev.h
and sys/memrange.h. Clean up some nearby style(9) nits.
MFC after: 1 month
architecture macros (__mips_n64, __powerpc64__) when 64 bit types (and
corresponding macros) are different from 32 bit. [1]
Correct the type of INT64_MIN, INT64_MAX and UINT64_MAX.
Define (U)INTMAX_C as an alias for (U)INT64_C matching the type definition
for (u)intmax_t. Do this on all architectures for consistency.
Suggested by: bde [1]
Approved by: kib (mentor)
of (unsigned) int __attribute__((__mode__(__DI__))). This aligns better
with macros such as (U)INT64_C, (U)INT64_MAX, etc. which assume (u)int64_t
has type (unsigned) long long.
The mode attribute was used because long long wasn't standardised until
C99. Nowadays compilers should support long long and use of the mode
attribute is discouraged according to GCC Internals documentation.
The type definition has to be marked with __extension__ to support
compilation with "-std=c89 -pedantic".
Discussed with: bde
Approved by: kib (mentor)
On some architectures UCHAR_MAX and USHRT_MAX had type unsigned int.
However, lacking integer suffixes for types smaller than int, their type
should correspond to that of an object of type unsigned char (or short)
when used in an expression with objects of type int. In that case unsigned
char (short) are promoted to int (i.e. signed) so the type of UCHAR_MAX and
USHRT_MAX should also be int.
Where MIN/MAX constants implicitly have the correct type the suffix has
been removed.
While here, correct some comments.
Reviewed by: bde
Approved by: kib (mentor)
It was used mainly to discover and fix some 64-bit portability problems
before 64-bit arches were widely available.
Discussed with: bde
Approved by: kib (mentor)
quite right and hasn't been used in ages and is likely broken. QEMU
with GUMSTIX is a more promising road to FreeBSD/arm in emulation
anyway.
Reviewed by: cognet@
Passing a count of zero on i386 and amd64 for [I386|AMD64]_BUS_SPACE_MEM
causes a crash/hang since the 'loop' instruction decrements the counter
before checking if it's zero.
PR: kern/80980
Discussed with: jhb
contents of the ones that were not empty were stale and unused.
- Now that <machine/mutex.h> no longer exists, there is no need to allow it
to override various helper macros in <sys/mutex.h>.
- Rename various helper macros for low-level operations on mutexes to live
in the _mtx_* or __mtx_* namespaces. While here, change the names to more
closely match the real API functions they are backing.
- Drop support for including <sys/mutex.h> in assembly source files.
Suggested by: bde (1, 2)
In particular, provide pagesize and pagesizes array, the canary value
for SSP use, number of host CPUs and osreldate.
Tested by: marius (sparc64)
MFC after: 1 month
now it uses a very dumb first-touch allocation policy. This will change in
the future.
- Each architecture indicates the maximum number of supported memory domains
via a new VM_NDOMAIN parameter in <machine/vmparam.h>.
- Each cpu now has a PCPU_GET(domain) member to indicate the memory domain
a CPU belongs to. Domain values are dense and numbered from 0.
- When a platform supports multiple domains, the default freelist
(VM_FREELIST_DEFAULT) is split up into N freelists, one for each domain.
The MD code is required to populate an array of mem_affinity structures.
Each entry in the array defines a range of memory (start and end) and a
domain for the range. Multiple entries may be present for a single
domain. The list is terminated by an entry where all fields are zero.
This array of structures is used to split up phys_avail[] regions that
fall in VM_FREELIST_DEFAULT into per-domain freelists.
- Each memory domain has a separate lookup-array of freelists that is
used when fulfulling a physical memory allocation. Right now the
per-domain freelists are listed in a round-robin order for each domain.
In the future a table such as the ACPI SLIT table may be used to order
the per-domain lookup lists based on the penalty for each memory domain
relative to a specific domain. The lookup lists may be examined via a
new vm.phys.lookup_lists sysctl.
- The first-touch policy is implemented by using PCPU_GET(domain) to
pick a lookup list when allocating memory.
Reviewed by: alc
The following systems are involved:
- DB-88F5182
- DB-88F5281
- DB-88F6281
- DB-78100
- SheevaPlug
This overhaul covers the following major changes:
- All integrated peripherals drivers for Marvell ARM SoC, which are
currently in the FreeBSD source tree are reworked and adjusted so they
derive config data out of the device tree blob (instead of hard coded /
tabelarized values).
- Since the common FDT infrastrucutre (fdtbus, simplebus) is used we say
good by to obio / mbus drivers and numerous hard-coded config data.
Note that world needs to be built WITH_FDT for the affected platforms.
Reviewed by: imp
Sponsored by: The FreeBSD Foundation.
o This is disabled by default for now, and can be enabled using WITH_FDT at
build time.
o Tested with ARM and PowerPC.
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
architecture from page queue lock to a hashed array of page locks
(based on a patch by Jeff Roberson), I've implemented page lock
support in the MI code and have only moved vm_page's hold_count
out from under page queue mutex to page lock. This changes
pmap_extract_and_hold on all pmaps.
Supported by: Bitgravity Inc.
Discussed with: alc, jeffr, and kib
previously know by StarSemi STR9104.
Tested by the submitter on an Emprex NSD-100 board.
Submitted by: Yohanes Nugroho <yohanes at gmail.com>
Reviewed by: freebsd-arm, stas
Obtained from: //depot/projects/str91xx/...
This brings hwpmc(4) support for 2nd and 3rd generation XScale cores.
Right now it's enabled by default to make sure we test this a bit.
When the time comes it can be disabled by default.
Tested on Gateworks boards.
A man page is coming.
Obtained from: //depot/user/rpaulo/xscalepmc/...
by looking at the bases used for non-relocatable executables by gnu ld(1),
and adjusting it slightly.
Discussed with: bz
Reviewed by: kan
Tested by: bz (i386, amd64), bsam (linux)
MFC after: some time
dependent memory attributes:
Rename vm_cache_mode_t to vm_memattr_t. The new name reflects the
fact that there are machine-dependent memory attributes that have
nothing to do with controlling the cache's behavior.
Introduce vm_object_set_memattr() for setting the default memory
attributes that will be given to an object's pages.
Introduce and use pmap_page_{get,set}_memattr() for getting and
setting a page's machine-dependent memory attributes. Add full
support for these functions on amd64 and i386 and stubs for them on
the other architectures. The function pmap_page_set_memattr() is also
responsible for any other machine-dependent aspects of changing a
page's memory attributes, such as flushing the cache or updating the
direct map. The uses include kmem_alloc_contig(), vm_page_alloc(),
and the device pager:
kmem_alloc_contig() can now be used to allocate kernel memory with
non-default memory attributes on amd64 and i386.
vm_page_alloc() and the device pager will set the memory attributes
for the real or fictitious page according to the object's default
memory attributes.
Update the various pmap functions on amd64 and i386 that map pages to
incorporate each page's memory attributes in the mapping.
Notes: (1) Inherent to this design are safety features that prevent
the specification of inconsistent memory attributes by different
mappings on amd64 and i386. In addition, the device pager provides a
warning when a device driver creates a fictitious page with memory
attributes that are inconsistent with the real page that the
fictitious page is an alias for. (2) Storing the machine-dependent
memory attributes for amd64 and i386 as a dedicated "int" in "struct
md_page" represents a compromise between space efficiency and the ease
of MFCing these changes to RELENG_7.
In collaboration with: jhb
Approved by: re (kib)
o add to platforms where it was missing (arm, i386, powerpc, sparc64, sun4v)
o define as "1" on amd64 and i386 where there is no restriction
o make the type returned consistent with ALIGN
o remove _ALIGNED_POINTER
o make associated comments consistent
Reviewed by: bde, imp, marcel
Approved by: re (kensmith)
required by video card drivers. Specifically, this change introduces
vm_cache_mode_t with an appropriate VM_CACHE_DEFAULT definition on all
architectures. In addition, this changes adds a vm_cache_mode_t parameter
to kmem_alloc_contig() and vm_phys_alloc_contig(). These will be the
interfaces for allocating mapped kernel memory and physical memory,
respectively, with non-default cache modes.
In collaboration with: jhb
structure. When the page is shared, the kernel mapping becomes a special
type of managed page to force the cache off the page mappings. This is
needed to avoid stale entries on all ARM VIVT caches, and VIPT caches
with cache color issue.
Submitted by: Mark Tinguely
Reviewed by: alc
Tested by: Grzegorz Bernacki, thompsa
the implementation can guarantee forward progress in the event of
a stuck interrupt or interrupt storm. This is especially critical
for fast interrupt handlers, as they can cause a hard hang in that
case. When first called, arm_get_next_irq() is passed -1.
Obtained from: Juniper Networks, Inc.
a fair number of static data structures, making this an unlikely
option to try to change without also changing source code. [1]
Change default cache line size on ia64, sparc64, and sun4v to 128
bytes, as this was what rtld-elf was already using on those
platforms. [2]
Suggested by: bde [1], jhb [2]
MFC after: 2 weeks
CACHE_LINE_SIZE constant. These constants are intended to
over-estimate the cache line size, and be used at compile-time
when a run-time tuning alternative isn't appropriate or
available.
Defaults for all architectures are 64 bytes, except powerpc
where it is 128 bytes (used on G5 systems).
MFC after: 2 weeks
Discussed on: arch@
but I see no benefit from it today.
VM_PROT_READ_IS_EXEC was only intended for use on processors that do not
distinguish between read and execute permission. On an mmap(2) or
mprotect(2), it automatically added execute permission if the caller
specified permissions included read permission. The hope was that this
would reduce the number of vm map entries needed to implement an address
space because there would be fewer neighboring vm map entries that differed
only in the presence or absence of VM_PROT_EXECUTE. (See vm/vm_mmap.c
revision 1.56.)
Today, I don't see any real applications that benefit from
VM_PROT_READ_IS_EXEC. In any case, vm map entries are now organized
as a self-adjusting binary search tree instead of an ordered list. So,
the need for coalescing vm map entries is not as great as it once was.
to the full path of the image that is being executed.
Increase AT_COUNT.
Remove no longer true comment about types used in Linux ELF binaries,
listed types contain FreeBSD-specific entries.
Reviewed by: kan
- The contents of 'feroceon_cpufuncs' dispatch table was really dedicated for the
new Sheeva CPU (in 88F6xxx and MV-78xxx SOCs), and NOT Feroceon.
- Feroceon CPU (in 88F5xxx SOCs) appears as a regular ARM926EJ-S core and does
not require dedicated routines.
This will be accompanied by a file rename commit.
FPA floating-point format is identical to the VFP format,
but is always stored in big-endian.
Introduce _IEEE_WORD_ORDER to describe the byte-order of
the FP representation.
Obtained from: Juniper Networks, Inc
o recognize ixp435 cpu
o change memory layout for for ixp4xx to not assume memory is aliases
to 0x10000000 (Cambria/ixp435 memory starts at zero)
o handle 64 irqs for ixp435
o dual EHCI USB 2.0 controller integral to ixp435
o overhaul NPE code for ixp435 and better MAC+MII naming
o updated NPE firmware (including NPE-A image for ixp435/ixp465)
o Gateworks Cambria board support:
- IDE compact flash
- MCU
- front panel LED on i2c bus
- Octal LED latch
Sanity-tested with NFS-root on Avila and Cambria boards. Requires
pending boot2 mods for CF-boot on Cambria.
and ifnet functions
- add memory barriers to <machine/atomic.h>
- update drivers to only conditionally define their own
- add lockless producer / consumer ring buffer
- remove ring buffer implementation from cxgb and update its callers
- add if_transmit(struct ifnet *ifp, struct mbuf *m) to ifnet to
allow drivers to efficiently manage multiple hardware queues
(i.e. not serialize all packets through one ifq)
- expose if_qflush to allow drivers to flush any driver managed queues
This work was supported by Bitgravity Inc. and Chelsio Inc.
This uses the common U-Boot support lib (sys/boot/uboot, already used on
FreeBSD/powerpc), and assumes the underlying firmware has the modern API for
stand-alone apps enabled in the config (CONFIG_API).
Only netbooting is supported at the moment.
Obtained from: Marvell, Semihalf
* Orion
- 88F5181
- 88F5182
- 88F5281
* Kirkwood
- 88F6281
* Discovery
- MV78100
The above families of SOCs are built around CPU cores compliant with ARMv5TE
instruction set architecture definition. They share a number of integrated
peripherals. This commit brings support for the following basic elements:
* GPIO
* Interrupt controller
* L1, L2 cache
* Timers, watchdog, RTC
* TWSI (I2C)
* UART
Other peripherals drivers will be introduced separately.
Reviewed by: imp, marcel, stass (Thanks guys!)
Obtained from: Marvell, Semihalf
- Fix nexus_setup_intr() abuse of setting up multiple IRQs in one go. Calling
arm_setup_irqhandler() in loop is bogus, as there's just one cookie given
from the caller and it is overwritten in each iteration so that only the
last handler's cookie value prevails.
- Proper intr masking/unmasking handling: the IRQ source is masked at PIC level
only after the last handler has been removed from the list.
Reviewed by: cognet, imp, sam, stass
Obtained from: Grzegorz Bernacki gjb ! semihalf dot com
Now that st_rdev is being automatically generated by the kernel, there
is no need to define static major/minor numbers for the iodev and
memdev. We still need the minor numbers for the memdev, however, to
distinguish between /dev/mem and /dev/kmem.
Approved by: philip (mentor)
boards. This is enough to net-boot to multiuser.
Also supported is the SMSC LAN91C111 parts used on the netCF, netDUO and netMMC
add-on boards.
I'll be putting some instructions on how to boot this on the Gumstix boards
online soon.
This is still fairly rough and will be refined over time but I felt it was
better to get this out there where other people can help out.
interrupt. So, add a new function pointer, arm_post_filter, which defaults
to NULL, and which will be used as the post_filter arg for
intr_event_create(). Set it properly for the AT91, so that it boots again.
Reported by: hps
- Pull all the code to deal with the trampoline stuff into one
centeralized place and use it from everywhere.
- Some minor style tidiness
Reviewed by: tinguely