2024 Commits

Author SHA1 Message Date
alc
aa9a7bb9e6 Significantly reduce the cost, i.e., run time, of calls to madvise(...,
MADV_DONTNEED) and madvise(..., MADV_FREE).  Specifically, introduce a new
pmap function, pmap_advise(), that operates on a range of virtual addresses
within the specified pmap, allowing for a more efficient implementation of
MADV_DONTNEED and MADV_FREE.  Previously, the implementation of
MADV_DONTNEED and MADV_FREE relied on per-page pmap operations, such as
pmap_clear_reference().  Intuitively, the problem with this implementation
is that the pmap-level locks are acquired and released and the page table
traversed repeatedly, once for each resident page in the range
that was specified to madvise(2).  A more subtle flaw with the previous
implementation is that pmap_clear_reference() would clear the reference bit
on all mappings to the specified page, not just the mapping in the range
specified to madvise(2).

Since our malloc(3) makes heavy use of madvise(2), this change can have a
measureable impact.  For example, the system time for completing a parallel
"buildworld" on a 6-core amd64 machine was reduced by about 1.5% to 2.0%.

Note: This change only contains pmap_advise() implementations for a subset
of our supported architectures.  I will commit implementations for the
remaining architectures after further testing.  For now, a stub function is
sufficient because of the advisory nature of pmap_advise().

Discussed with: jeff, jhb, kib
Tested by:      pho (i386), marcel (ia64)
Sponsored by:   EMC / Isilon Storage Division
2013-08-29 15:49:05 +00:00
raj
f925cc353c Introduce superpages support for ARMv6/v7.
Promoting base pages to superpages can increase TLB coverage and allow for
efficient use of page table entries.  This development provides FreeBSD/ARM
with superpages management mechanism roughly equivalent to what we have for
i386 and amd64 architectures.

1. Add mechanism for automatic promotion of 4KB page mappings to 1MB section
   mappings (and demotion when not needed, respectively).

2. Managed and non-kernel mappings are now superpages-aware.

3. The functionality can be enabled by setting "vm.pmap.sp_enabled" tunable to
   a non-zero value (either in loader.conf or by modifying "sp_enabled"
   variable in pmap-v6.c file).  By default, automatic promotion is currently
   disabled.

Submitted by:	Zbigniew Bodek <zbb@semihalf.com>
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation, Semihalf
2013-08-26 17:12:30 +00:00
raj
6a3d0fcc3e Provide settings for superpage reservation system on ARM.
This allows for enabling and configuring superpages reservation mechanism in
order to allocate and populate 256 4KB base pages (for the purpose of
promotion to a 1MB superpage).

Submitted by:	Zbigniew Bodek <zbb@semihalf.com>
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation, Semihalf
2013-08-26 16:23:54 +00:00
raj
1075825bed Add missing TAILQ initializer (omitted in r250634).
Submitted by:	Zbigniew Bodek <zbb@semihalf.com>
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation, Semihalf
2013-08-26 15:38:27 +00:00
andrew
8e4fda6171 Update the root device to be correct for use with crochet. 2013-08-26 10:27:15 +00:00
andrew
36534f8c4e Revert r251370 as it contains a deadlock. 2013-08-26 10:24:59 +00:00
andrew
4eefe80087 Add the frame information to cpu_switch to allow us to unwind out of it,
for example when dumping threads in the kernel debugger.
2013-08-25 11:23:38 +00:00
andrew
a3b354036c Add the unwind information to irq_entry so we can pass through it when
unwinding the stack.
2013-08-25 11:21:03 +00:00
kib
05a9dff802 Revert r254501. Instead, reuse the type stability of the struct pmap
which is the part of struct vmspace, allocated from UMA_ZONE_NOFREE
zone.  Initialize the pmap lock in the vmspace zone init function, and
remove pmap lock initialization and destruction from pmap_pinit() and
pmap_release().

Suggested and reviewed by:	alc (previous version)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
2013-08-22 18:12:24 +00:00
ian
13ce4e66a1 Add support for uarts other than the serial console in TI OMAP SoCs.
The TI uart hardware is ns16550-compatible, except that before it can
be used the clocks and power have to be enabled and a non-standard
mode control register has to be set to put the device in uart mode
(as opposed to irDa or other serial protocols).  This adds the extra
code in an extension to the standard ns8250 probe routine, and the
rest of the driver is just the standard ns8250 code.
2013-08-21 14:33:02 +00:00
ian
06b56d3bab Make the noop clock successfully do nothing, because doing nothing and
returning an error status (which the NULL method pointers caused) isn't
nearly as useful.
2013-08-21 04:49:58 +00:00
ian
7d7f7f1bb2 Define the uart clocks so that they can be en/disabled at runtime. 2013-08-21 04:20:17 +00:00
andrew
636d64b28d Enable VFP on ARMADA XP. 2013-08-20 20:40:20 +00:00
ian
d8b6d1ce61 Make the standard sdhci(4) driver work for the TI OMAP family SoCs.
The MMCHS hardware is pretty much a standard SDHCI v2.0 controller with a
couple quirks, which are now supported by sdhci(4) as of r254507.

This should work for all TI SoCs that use the MMCHS hardware, but it has
only been tested on AM335x right now, so this enables it on those platforms
but leaves the existing ti_mmchs driver in place for other OMAP variants
until they can be tested.

This initial incarnation lacks DMA support (coming soon).  Even without it
this improves performance pretty noticibly over the ti_mmchs driver,
primarily because it now does multiblock IO.
2013-08-20 12:33:35 +00:00
andrew
832f853d49 Enable VFP on the Zedboard. 2013-08-19 22:25:36 +00:00
raj
f55114b9af Do not use pv_kva on ARMv6/v7 and save some space on each vm_page. It's only
relevant for older ARM variants (with virtual cache).

Submitted by:	Zbigniew Bodek <zbb@semihalf.com>
Reviewed by:	gber
Sponsored by:	The FreeBSD Foundation, Semihalf
2013-08-19 16:16:49 +00:00
raj
23a66842db Simplify and clean up pmap_clearbit()
There is no need for calling vm_page_dirty() when clearing "modified" flag as
it is already set for that page in pmap_fault_fixup() or pmap_enter() thanks
to "modified" bit emulation.

Also, there is no need for checking PTE "referenced" or "writeable" flags.  If
there is a request to clear a particular flag we should just do it.

Submitted by:	Zbigniew Bodek <zbb@semihalf.com>
Reviewed by:	gber
Sponsored by:	The FreeBSD Foundation, Semihalf
2013-08-19 15:58:39 +00:00
raj
af6a4d3dba Fix ARMv6/v7 mapping's wired status.
Last input argument in pmap_modify_pv() should be a mask of flags to be set.
In pmap_change_wiring() however, the straight wired status was used, which
does not represent valid flags (and is of type boolean).

This commit fixes the issue so that wired flag is passed to pmap_modify_pv()
properly.

Submitted by:	Zbigniew Bodek <zbb@semihalf.com>
Reviewed by:	gber
Sponsored by:	The FreeBSD Foundation, Semihalf
2013-08-19 15:36:23 +00:00
raj
f8dbb37330 Clear all L2 PTE protection bits before their configuration.
Revise L2_S_PROT_MASK to include all of the protection bits.  Notice that
clearing these bits does not always take away the corresponding permissions
(for example, permission is granted when the bit is cleared). The bits are
cleared but are to be set or left cleared accordingly in pmap_set_prot(),
pmap_enter_locked(), etc.

Clear L2_XN along with L2_S_PROT_MASK in pmap_set_prot() so that all
permissions related bits are cleared before actual configuration.

Submitted by:	Zbigniew Bodek <zbb@semihalf.com>
Reviewed by:	gber
Sponsored by:	The FreeBSD Foundation, Semihalf
2013-08-19 15:12:36 +00:00
raj
38347b87b7 Simplify pv_entry removal or ARMv6/v7:
- PGA_WRITEABLE indicates that there *might be* a writable mapping for the
  particular page, so to avoid frequent sweeping of the pv_entries whenever
  pmap_nuke_pv(), pmap_modify_pv(), etc. is called, it is sufficient to
  clear that flag if there are no managed mappings for that page anymore
  (notice that only pmap_enter is authorized to set this flag).
- Avoid redundant checking for PVF_WIRED flag when this flag cannot be set
  anyway.
- Clear PGA_WRITEABLE only once for each vm_page instead of multiple,
  redundant clearing it in loop when there are no writeable mappings
  to that page anymore.

Submitted by:	Zbigniew Bodek <zbb@semihalf.com>
Reviewed by:	gber
Sponsored by:	The FreeBSD Foundation, Semihalf
2013-08-19 14:56:17 +00:00
andrew
c36cd113d1 Enable VFP on the Arndale Board. 2013-08-19 08:28:35 +00:00
cognet
eb608cee63 Increase the max KVA available for general consumption on the Exynos 5.
Submitted by:	Ruslan Bukin <br@bsdpad.com>
2013-08-18 18:08:12 +00:00
andrew
dd347357a6 Enable VFP in the Versatile PB (QEMU) kernel. Tested on QEMU 1.6.0. 2013-08-18 17:18:52 +00:00
andrew
31f3d7c917 Enable VFP on the CubieBoard and CubieBoard 2. 2013-08-18 16:16:36 +00:00
andrew
03dd89947b Enable VFP support on EFIKA MX. 2013-08-18 11:54:20 +00:00
ian
85c6da6a6e Enable VFP support for BeagleBone. 2013-08-17 19:29:51 +00:00
andrew
06c57264bc Rename device vfp to option VFP and retire the ARM_VFP_SUPPORT option. This
simplifies enabling as previously both options were required to be enabled,
now we only need a single option.

While here enable VFP on the PandaBoard.
2013-08-17 18:51:38 +00:00
andrew
5dd64f1c48 Remove the ARMFPE option. It is unsupported, and appears to be broken as
arm_fpe_core_changecontext is not a function.
2013-08-17 15:09:14 +00:00
andrew
fe0cf8866d Remove fpe_sp_state as we don't support fpe. 2013-08-17 14:53:53 +00:00
andrew
7fa0395ca3 Remove unused FPE code. This is not enabled anywhere as it is the only
file I can find containing FAST_FPE. It appears this would not work as
want_resched is not defined anywhere.
2013-08-17 14:52:19 +00:00
ian
ddc35de90b Rename imx_machdep.c to imx51_machdep.c, because it contains hardware
addresses which are specific to the imx51 chips.
2013-08-13 21:12:28 +00:00
ian
14fc3cda65 Add imx6 compatibility and make the driver work for any clock frequency.
There are still a couple references to imx51 ccm driver functions that will
need to be changed after an imx6 ccm driver is written.

Reviewed by:	ray
2013-08-13 13:14:13 +00:00
cognet
35e0e7dfd4 Only allocate 2 bounce pages for maps that can only use them for buffers that
are unaligned on cache lines boundary, as we will never need more.
2013-08-11 21:21:02 +00:00
cognet
c5083d5610 Use the correct address when calling kva_free()
Pointy hat to:	cognet
Spotted out by:	alc
2013-08-10 00:53:22 +00:00
cognet
e5d07cf879 - The address lies in the bus space handle, not in the cookie
- Use the right address when calling kva_free()
(Is there any reason why the s3c2xx0 comes with its own version of bs_map/
 bs_unmap ? It seems to be just the same as in bus_space_generic.c)
2013-08-10 00:31:49 +00:00
cognet
51c3f72bfa Instead of just trying to do it for arm, make sure vm_kmem_size is properly
aligned in kmeminit(), where it'll work for any arch.

Suggested by:	alc
2013-08-09 22:30:54 +00:00
cognet
4933953bc5 - The address lies in the bus space handle, not in the cookie
- Use the right address when calling kva_free()
2013-08-09 21:56:28 +00:00
cognet
634eb147bd Make sure vm_kmem_size is aligned on a page boundary, since that's what vmem
expects.
2013-08-09 21:53:02 +00:00
attilio
16c7563cf4 The soft and hard busy mechanism rely on the vm object lock to work.
Unify the 2 concept into a real, minimal, sxlock where the shared
acquisition represent the soft busy and the exclusive acquisition
represent the hard busy.
The old VPO_WANTED mechanism becames the hard-path for this new lock
and it becomes per-page rather than per-object.
The vm_object lock becames an interlock for this functionality:
it can be held in both read or write mode.
However, if the vm_object lock is held in read mode while acquiring
or releasing the busy state, the thread owner cannot make any
assumption on the busy state unless it is also busying it.

Also:
- Add a new flag to directly shared busy pages while vm_page_alloc
  and vm_page_grab are being executed.  This will be very helpful
  once these functions happen under a read object lock.
- Move the swapping sleep into its own per-object flag

The KPI is heavilly changed this is why the version is bumped.
It is very likely that some VM ports users will need to change
their own code.

Sponsored by:	EMC / Isilon storage division
Discussed with:	alc
Reviewed by:	jeff, kib
Tested by:	gavin, bapt (older version)
Tested by:	pho, scottl
2013-08-09 11:11:11 +00:00
cognet
410d14e3ed Don't bother trying to work around buffers which are not aligned on a cache
line boundary. It has never been 100% correct, and it can't work on SMP,
because nothing prevents another core from accessing data from an unrelated
buffer in the same cache line while we invalidated it. Just use bounce pages
instead.

Reviewed by:	ian
Approved by:	mux (mentor) (implicit)
2013-08-07 15:44:58 +00:00
ganbold
7f7658c9c8 Bring initial support for Allwinner A20 SoC (Cubieboard2).
Add support for A20 timer.
	Correct interrupt offset depending from chip.
	Add basic code for CPU configuration module.
	For now, add kernel config and dts file
	(only FDT blob related problem needs to be solved later in
	order to have one kernel for both cubieboard1 and 2).

Approved by: ray@
2013-08-07 11:07:56 +00:00
jeff
de4ecca213 Replace kernel virtual address space allocation with vmem. This provides
transparent layering and better fragmentation.

 - Normalize functions that allocate memory to use kmem_*
 - Those that allocate address space are named kva_*
 - Those that operate on maps are named kmap_*
 - Implement recursive allocation handling for kmem_arena in vmem.

Reviewed by:	alc
Tested by:	pho
Sponsored by:	EMC / Isilon Storage Division
2013-08-07 06:21:20 +00:00
andrew
eacc009ef6 We no longer need to align the stack before calling swi_handler as it is
already aligned correctly in the PUSHFRAME macro.
2013-08-06 10:03:44 +00:00
cognet
6c012d075a Let the platform calculate the timer frequency at runtime, and use that for
the omap4, instead of relying on the (wrong) value provided in the dts.
2013-08-05 20:14:56 +00:00
andrew
1e070fe985 When entering exception handlers we may not have an aligned stack. This is
because an exception may happen at any time. The stack alignment rules on
ARM EABI state the only place the stack must be 8-byte aligned is on a
function boundary.

If an exception happens while a function is setting up or tearing down it's
stack frame it may not be correctly aligned. There is also no requirement
for it to be when the function is a leaf node.

The fix is to align the stack after we have stored a backup of the old stack
pointer, but before we have stored anything in the trapframe. Along with
this we need to adjust the size of the trapframe by 4 bytes to ensure the
stack below it is also correctly aligned.
2013-08-05 19:06:28 +00:00
ian
22800b9326 Tweak the imx debug console code so that it works with multiple SoCs.
Instead of hard-coding the uart register addresses for the imx51, use
a variable that defaults to the imx51 address.  When debugging another
imx-family SoC, the variable can be set early in initarm() to provide
full console/printf support for debugging early boot.
2013-08-03 13:31:10 +00:00
cognet
f4eaa79f0b Only receive the interrupts on the first core, to avoid duplicate interrupts. 2013-08-02 20:32:26 +00:00
ganbold
027a033578 Add identification for Cortex-A7 (R0) cores.
Reviewed by: cognet@
2013-08-01 10:06:19 +00:00
obrien
7999076e3e Back out r253779 & r253786. 2013-07-31 17:21:18 +00:00
rpaulo
349cc93fd1 Initialisation routines for the mailbox, spinlock and PRU-ICSS clocks. 2013-07-31 05:52:03 +00:00