While here, correct all consumers to pass NULL instead of 0 as we pass
capability rights as pointers now, not uint64_t.
Reported by: Daniel Peyrolon
Tested by: Daniel Peyrolon
Approved by: re (marius)
Apply the given advice to the specified range of addresses within the
given pmap. Depending on the advice, clear the referenced and/or
modified flags in each mapping. Superpage within the given range will
be demoted or destroyed.
Reviewed by: alc
Approved by: cognet (mentor)
Approved by: re
When clearing the modification status of the superpage, one of the
base pages produced during demotion should be marked as write disabled.
The intention is that subsequent write access may repromote.
In the current implementation this was done wrong as write permission was
granted instead of forbidden.
Approved by: cognet (mentor)
Approved by: re
for ARM.
This is quite ugly, because it has to work around a clang bug that does not
allow built-in functions to be defined, even when they're ones that are
expected to be built as part of a library.
Reviewed by: ed
MADV_DONTNEED) and madvise(..., MADV_FREE). Specifically, introduce a new
pmap function, pmap_advise(), that operates on a range of virtual addresses
within the specified pmap, allowing for a more efficient implementation of
MADV_DONTNEED and MADV_FREE. Previously, the implementation of
MADV_DONTNEED and MADV_FREE relied on per-page pmap operations, such as
pmap_clear_reference(). Intuitively, the problem with this implementation
is that the pmap-level locks are acquired and released and the page table
traversed repeatedly, once for each resident page in the range
that was specified to madvise(2). A more subtle flaw with the previous
implementation is that pmap_clear_reference() would clear the reference bit
on all mappings to the specified page, not just the mapping in the range
specified to madvise(2).
Since our malloc(3) makes heavy use of madvise(2), this change can have a
measureable impact. For example, the system time for completing a parallel
"buildworld" on a 6-core amd64 machine was reduced by about 1.5% to 2.0%.
Note: This change only contains pmap_advise() implementations for a subset
of our supported architectures. I will commit implementations for the
remaining architectures after further testing. For now, a stub function is
sufficient because of the advisory nature of pmap_advise().
Discussed with: jeff, jhb, kib
Tested by: pho (i386), marcel (ia64)
Sponsored by: EMC / Isilon Storage Division
Promoting base pages to superpages can increase TLB coverage and allow for
efficient use of page table entries. This development provides FreeBSD/ARM
with superpages management mechanism roughly equivalent to what we have for
i386 and amd64 architectures.
1. Add mechanism for automatic promotion of 4KB page mappings to 1MB section
mappings (and demotion when not needed, respectively).
2. Managed and non-kernel mappings are now superpages-aware.
3. The functionality can be enabled by setting "vm.pmap.sp_enabled" tunable to
a non-zero value (either in loader.conf or by modifying "sp_enabled"
variable in pmap-v6.c file). By default, automatic promotion is currently
disabled.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: alc
Sponsored by: The FreeBSD Foundation, Semihalf
This allows for enabling and configuring superpages reservation mechanism in
order to allocate and populate 256 4KB base pages (for the purpose of
promotion to a 1MB superpage).
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: alc
Sponsored by: The FreeBSD Foundation, Semihalf
which is the part of struct vmspace, allocated from UMA_ZONE_NOFREE
zone. Initialize the pmap lock in the vmspace zone init function, and
remove pmap lock initialization and destruction from pmap_pinit() and
pmap_release().
Suggested and reviewed by: alc (previous version)
Tested by: pho
Sponsored by: The FreeBSD Foundation
The TI uart hardware is ns16550-compatible, except that before it can
be used the clocks and power have to be enabled and a non-standard
mode control register has to be set to put the device in uart mode
(as opposed to irDa or other serial protocols). This adds the extra
code in an extension to the standard ns8250 probe routine, and the
rest of the driver is just the standard ns8250 code.
The MMCHS hardware is pretty much a standard SDHCI v2.0 controller with a
couple quirks, which are now supported by sdhci(4) as of r254507.
This should work for all TI SoCs that use the MMCHS hardware, but it has
only been tested on AM335x right now, so this enables it on those platforms
but leaves the existing ti_mmchs driver in place for other OMAP variants
until they can be tested.
This initial incarnation lacks DMA support (coming soon). Even without it
this improves performance pretty noticibly over the ti_mmchs driver,
primarily because it now does multiblock IO.
There is no need for calling vm_page_dirty() when clearing "modified" flag as
it is already set for that page in pmap_fault_fixup() or pmap_enter() thanks
to "modified" bit emulation.
Also, there is no need for checking PTE "referenced" or "writeable" flags. If
there is a request to clear a particular flag we should just do it.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: gber
Sponsored by: The FreeBSD Foundation, Semihalf
Last input argument in pmap_modify_pv() should be a mask of flags to be set.
In pmap_change_wiring() however, the straight wired status was used, which
does not represent valid flags (and is of type boolean).
This commit fixes the issue so that wired flag is passed to pmap_modify_pv()
properly.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: gber
Sponsored by: The FreeBSD Foundation, Semihalf
Revise L2_S_PROT_MASK to include all of the protection bits. Notice that
clearing these bits does not always take away the corresponding permissions
(for example, permission is granted when the bit is cleared). The bits are
cleared but are to be set or left cleared accordingly in pmap_set_prot(),
pmap_enter_locked(), etc.
Clear L2_XN along with L2_S_PROT_MASK in pmap_set_prot() so that all
permissions related bits are cleared before actual configuration.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: gber
Sponsored by: The FreeBSD Foundation, Semihalf
- PGA_WRITEABLE indicates that there *might be* a writable mapping for the
particular page, so to avoid frequent sweeping of the pv_entries whenever
pmap_nuke_pv(), pmap_modify_pv(), etc. is called, it is sufficient to
clear that flag if there are no managed mappings for that page anymore
(notice that only pmap_enter is authorized to set this flag).
- Avoid redundant checking for PVF_WIRED flag when this flag cannot be set
anyway.
- Clear PGA_WRITEABLE only once for each vm_page instead of multiple,
redundant clearing it in loop when there are no writeable mappings
to that page anymore.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: gber
Sponsored by: The FreeBSD Foundation, Semihalf
- Use the right address when calling kva_free()
(Is there any reason why the s3c2xx0 comes with its own version of bs_map/
bs_unmap ? It seems to be just the same as in bus_space_generic.c)
Unify the 2 concept into a real, minimal, sxlock where the shared
acquisition represent the soft busy and the exclusive acquisition
represent the hard busy.
The old VPO_WANTED mechanism becames the hard-path for this new lock
and it becomes per-page rather than per-object.
The vm_object lock becames an interlock for this functionality:
it can be held in both read or write mode.
However, if the vm_object lock is held in read mode while acquiring
or releasing the busy state, the thread owner cannot make any
assumption on the busy state unless it is also busying it.
Also:
- Add a new flag to directly shared busy pages while vm_page_alloc
and vm_page_grab are being executed. This will be very helpful
once these functions happen under a read object lock.
- Move the swapping sleep into its own per-object flag
The KPI is heavilly changed this is why the version is bumped.
It is very likely that some VM ports users will need to change
their own code.
Sponsored by: EMC / Isilon storage division
Discussed with: alc
Reviewed by: jeff, kib
Tested by: gavin, bapt (older version)
Tested by: pho, scottl
line boundary. It has never been 100% correct, and it can't work on SMP,
because nothing prevents another core from accessing data from an unrelated
buffer in the same cache line while we invalidated it. Just use bounce pages
instead.
Reviewed by: ian
Approved by: mux (mentor) (implicit)
Add support for A20 timer.
Correct interrupt offset depending from chip.
Add basic code for CPU configuration module.
For now, add kernel config and dts file
(only FDT blob related problem needs to be solved later in
order to have one kernel for both cubieboard1 and 2).
Approved by: ray@
transparent layering and better fragmentation.
- Normalize functions that allocate memory to use kmem_*
- Those that allocate address space are named kva_*
- Those that operate on maps are named kmap_*
- Implement recursive allocation handling for kmem_arena in vmem.
Reviewed by: alc
Tested by: pho
Sponsored by: EMC / Isilon Storage Division
because an exception may happen at any time. The stack alignment rules on
ARM EABI state the only place the stack must be 8-byte aligned is on a
function boundary.
If an exception happens while a function is setting up or tearing down it's
stack frame it may not be correctly aligned. There is also no requirement
for it to be when the function is a leaf node.
The fix is to align the stack after we have stored a backup of the old stack
pointer, but before we have stored anything in the trapframe. Along with
this we need to adjust the size of the trapframe by 4 bytes to ensure the
stack below it is also correctly aligned.
Instead of hard-coding the uart register addresses for the imx51, use
a variable that defaults to the imx51 address. When debugging another
imx-family SoC, the variable can be set early in initarm() to provide
full console/printf support for debugging early boot.
* Make Yarrow an optional kernel component -- enabled by "YARROW_RNG" option.
The files sha2.c, hash.c, randomdev_soft.c and yarrow.c comprise yarrow.
* random(4) device doesn't really depend on rijndael-*. Yarrow, however, does.
* Add random_adaptors.[ch] which is basically a store of random_adaptor's.
random_adaptor is basically an adapter that plugs in to random(4).
random_adaptor can only be plugged in to random(4) very early in bootup.
Unplugging random_adaptor from random(4) is not supported, and is probably a
bad idea anyway, due to potential loss of entropy pools.
We currently have 3 random_adaptors:
+ yarrow
+ rdrand (ivy.c)
+ nehemeiah
* Remove platform dependent logic from probe.c, and move it into
corresponding registration routines of each random_adaptor provider.
probe.c doesn't do anything other than picking a specific random_adaptor
from a list of registered ones.
* If the kernel doesn't have any random_adaptor adapters present then the
creation of /dev/random is postponed until next random_adaptor is kldload'ed.
* Fix randomdev_soft.c to refer to its own random_adaptor, instead of a
system wide one.
Submitted by: arthurmesh@gmail.com, obrien
Obtained from: Juniper Networks
Reviewed by: obrien
instruction set. Thumb-2 requires an if-then instruction to implement
conditional codes.
When building for ARM mode the it-then instructions do not generate any
assembled instruction as per the ARMv7-A Architecture Reference Manual, and
are safe to use.
While this allows the atomic instructions to be built, it doesn't mean we
fully support Thumb code. It works in small tests, but is still known to
fail in a large number of places.
While here add a check for the armv6t2 architecture.
- We should check is_d32 to see howmany registers we have
- In vfp_restore mark vfpscr as an output register
Without the second part it appears we can return the incorrect value from
vfp_bounce if the VFP condition flags are set as it may override the
register holding the return value.
This follows section 18.4.2.2 SD Soft Reset Flow in the TI AM335x Technical
Reference Manual and seems to fix the "ti_mmchs0: Error: current cmd NULL,
already done?" messages.
Gcc outputs pre-UAL asm and expects the ldcl instruction with a condition
in the form ldc<c>l, where the code produces the instruction in the UAL
form ldcl<c>. Work around this by checking if we are using clang or gcc and
adjusting the instruction.
While here correct the cmp instruction's value to include the # before the
immediate value.
ePWM is controlled by sysctl nodes dev.am335x_pwm.N.period,
dev.am335x_pwm.N.dutyA and dev.am335x_pwm.N.dutyB that controls
PWM period and duty cycles for channels A and B respectively.
Period and duty cycle are measured in clock ticks. Default
clock frequency for AM335x PWM subsystem is 100MHz
pmap_remove_all()
This flag should already be cleared by pmap_nuke_pv()
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf
When doing pmap_enter_locked(), enable write permission only when access
type indicates attempt to write. Otherwise, leave the page read only but
mark it writable in pv_flags.
This will result in:
1. Marking page writable during pmap_enter() but only when ensured that it
will be written right away so that we will not get redundant permissions
fault on write attempt.
2. Keeping page read only when it is permitted to be written but there was
no actual write attempt. Hence, we will get permissions fault on write
access and mark page writable in pmap_fault_fixup() what will indicate
modification status.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf
Force UMA zone to allocate service structures like slabs using own
allocator. uma_debug code performs atomic ops on uma_slab_t fields
and safety of this operation is not guaranteed for write-back caches
Issues were noted by Bruce Evans and are present on all architectures.
On i386, a counter fetch should use atomic read of 64bit value,
otherwise carry from the increment on other CPU could be lost for the
given fetch, making error of 2^32. If 64bit read (cmpxchg8b) is not
available on the machine, it cannot be SMP and it is enough to disable
preemption around read to avoid the split read.
On x86 the counter increment is not atomic on purpose, which makes it
possible for the store of the incremented result to override just
zeroed per-cpu slot. The effect would be a counter going off by
arbitrary value after zeroing. Perform the counter zeroing on the
same processor which does the increments, making the operations
mutually exclusive. On i386, same as for the fetching, if the
cmpxchg8b is not available, machine is not SMP and we disable
preemption for zeroing.
PowerPC64 is treated the same as amd64.
For other architectures, the changes made to allow the compilation to
succeed, without fixing the issues with zeroing or fetching. It
should be possible to handle them by using the 64bit loads and stores
atomic WRT preemption (assuming the architectures also converted from
using critical sections to proper asm). If architecture does not
provide the facility, using global (spin) mutex would be non-optimal
but working solution.
Noted by: bde
Sponsored by: The FreeBSD Foundation
- Initialize SMAPx registers too although they're unused in QEMU
- Do not pass IO/MEM resources to upper bus for activation, handle them locally.
Previously ACTIVATE method of upper bus was no-op so nothing bad
happened. But now FDT maps physaddr to vaddr and it causes
troubles: fdtbus_activate_resource resource assumes that
bustag/bushandle are already set which in this case is wrong.
past a trapframe.
Use this macro in exception_exit as it is the function the unwinder enters
as the functions that store the frame setting lr to point to it.
Provide both __sync_*-style and __atomic_*-style functions that perform
the atomic operations on ARMv5 by using Restartable Atomic Sequences.
While there, clean up some pieces of code where it's sufficient to use
regular uint32_t to store register contents and don't need full reg_t's.
Also sync this back to the MIPS code.
check which variant we are on, and if it is a VFPv3 or v4, and has 32
double registers we save these. This fixes VFP support on Raspberry Pi.
While here clean fmrx and fmxr up to use the register names from vfp.h
as opposed to the raw register names.
Basically the situation is as follows:
- When using Clang + armv6, we should not need any intrinsics. It should
support it, even though due to a target misconfiguration it does not.
We should fix this in Clang.
- When using Clang + noarmv6, provide __atomic_* functions that disable
interrupts.
- When using GCC + armv6, we can provide __sync_* intrinsics, similar to
what we did for MIPS. As ARM and MIPS are quite similar, simply base
this implementation on the one I did for MIPS.
- When using GCC + noarmv6, disable the interrupts, like we do for
Clang.
This implementation still lacks functions for noarmv6 userspace. To be
done.
* Stop pretending we support anything other than ELF by removing code
surrounded by #ifdef __ELF__ ... #endif.
* Remove _JB_MAGIC_SETJMP and _JB_MAGIC__SETJMP, they are defined in
setjmp.h, which is able to be included from asm.
* Fix the spelling of dependent.
* Rename END _END and add END and ASEND to complement ENTRY and ASENTRY
respectively
* Add macros to simplify accessing the Global Offset Table, some of these
will be used in the upcoming update to the setjmp functions.
In order to become independent of Coherency Fabric frequency, configure
Timer and Watchdog to operate in 25MHz mode.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Copy the given range of mappings from the source map to the
destination map, thereby reducing the number of VM faults on fork.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf
pmap_enter_locked() implementation was very ambiguous and confusing.
Rearrange it so that each part of the mapping creation is separated.
Avoid walking through the redundant conditions.
Extract vector_page specific PTE setup from normal PTE setting.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf
Using PVF_MOD, PVF_REF and PVF_EXEC is redundant as we can get the proper
info from PTE bits.
When the mapping is marked as executable and has been referenced we assume
that it has been executed. Similarly, when the mapping is set to be writable
and is referenced, it must have been due to write access to it.
PVF_MOD and PVF_REF flags are kept just for pmap_clearbit() usage,
to pass the information on which bit should be cleared.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf
Use pmap_find_pv if needed instead of multiplying its code throughout
pmap-v6.
Avoid possible NULL pointer dereference in pmap_enter_locked()
When trying to get m->md.pv_memattr, make sure that m != NULL,
in particular that vector_page is set to be NULL.
Do not set PGA_REFERENCED flag in pmap_enter_pv().
On ARM any new page reference will result in either entering the new
mapping by calling pmap_enter, etc. or fixing-up the existing mapping in
pmap_fault_fixup().
Therefore we set PGA_REFERENCED flag in the earlier mentioned cases and
setting it later in pmap_enter_pv() is just waste of cycles.
Delete unused pm_pdir pointer from the pmap structure.
Rearrange brackets in the fault cause detection in trap.c
Place the brackets correctly in order to see course of the conditions
instantaneously.
Unify naming in pmap-v6.c and improve style
Use naming common for whole pmap and compatible with other pmaps,
improve style where possible:
pm -> pmap
pg -> m
opg -> om
*pt -> *ptep
*pte -> *ptep
*pde -> *pdep
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf
bit in PTE.
Enable Access Flag in CPU control. With AF enabled each valid mapping
needs to have referenced bit in PTE set in order to be able to cache
it in the TLB.
AP[0] bit is to be used as reference flag.
All access permissions are encoded by AP[2:1] wherein AP[1] is in fact
"user enable" and AP[2](APX) is "write disable".
All mappings are always set to be valid. Reference emulation is performed
by setting/clearing reference flag in PTE.
md.pvh_attrs are no longer necessary however pv_flags are still being used
for now.
Marking vm_page as "dirty" or "referenced" is being performed on:
- page or flag fault servicing in pmap_fault_fixup(), basing on the fault
type
- vm_fault servicing in pmap_enter() according to the desired protections
and faulty access type
Redundant page marking has been removed as on ARM we know exactly when the
particular page is referenced or is going to be written.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf
o Relax locking assertions for pmap_enter_object() and add them also
to architectures that currently don't have any
o Introduce VM_OBJECT_LOCK_DOWNGRADE() which is basically a downgrade
operation on the per-object rwlock
o Use all the mechanisms above to make vm_map_pmap_enter() to work
mostl of the times only with readlocks.
Sponsored by: EMC / Isilon storage division
Reviewed by: alc
PV entries are now roughly half the size.
Instead of using a shared UMA zone for 28 byte pv entries
(two 8-byte tailq nodes, a 4 byte pointer, a 4 byte address and 4 byte
flags), we allocate a page at a time per process.
This provides 252 pv entries per process (actually, per pmap address space)
and eliminates one of the 8-byte tailq entries since we now can track
per-process pv entries implicitly.
The pointer to the pmap can be eliminated by doing address arithmetic to
find the metadata on the page headers to find a single pointer shared by
all 252 entries. There is an 8-int bitmap for the freelist of those 252
entries.
When in serious low memory condition, allocation of another pv_chunk is
possible by freeing some pages in pmap_pv_reclaim().
Added pv_entry/pv_chunk related statistics to pmap.
pv_entry/pv_chunk statistics can be accessed via sysctl vm.pmap.
Ported PTE freelist of KVA allocation and maintenance from i386.
Using an idea from Stephan Uphoff, use the empty pte's that correspond
to the unused kva in the pv memory block to thread a freelist through.
This allows us to free pages that used to be used for pv entry chunks
since we can now track holes in the kva memory block.
As both ARM pmap.c and pmap-v6.c use the same header and pv_entry, pmap and
md_page structures are different, it was needed to separate code designed
for ARMv6/7 from the one for other ARMs.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: alc
Sponsored by: The FreeBSD Foundation, Semihalf
EABI ARM kernels or clang-compiled ARM kernels.
This fixes a crash seen in clang-compiled ARM
kernels that include WITNESS.
This code could be easily modified to walk the stack
for current clang-generated code (including EABI)
but Andrew Turner has raised concerns that the
stack frame currently emitted by clang isn't actually
required by EABI so such a change might cause problems
down the road.
In case anyone wants to experiment, the change
to support current clang-compiled kernels
involves simply setting FR_RFP=0 and FR_SCP=1.
order to match the MAXCPU concept. The change should also be useful
for consolidation and consistency.
Sponsored by: EMC / Isilon storage division
Obtained from: jeff
Reviewed by: alc
Keep following access permissions:
APX AP Kernel User
1 01 R N
1 10 R R
0 01 R/W N
0 11 R/W R/W
Avoid using reserved in ARMv6 APX|AP settings:
- In case of unprivileged (user) access without permission to write,
the access permission bits were being set to reserved for ARMv6
(but valid for ARMv7) value of APX|AP = 111.
Fix-up faulting userland accesses properly:
- Wrong condition statement in pmap_fault_fixup() caused that
any genuine, unprivileged access was being fixed-up instead of
just skip doing anything and return. Staring from now we ensure
proper reaction for illicit user accesses.
L2_S_PROT_R and L2_S_PROT_U names might be misleading as they do not
reflect real permission levels. It will be clarified in following
patches (switch to AP[2:1] permissions model).
Obtained from: Semihalf
- On ARMADAXP B0 (GP development board) we are not able to use PCI due to
whole 32-bit address space used by 4GB of RAM memory.
- Change is required to destroy unnecessary window to free address space
for PCI and other devices
- Fix offset value for SDRAM decoding windows
Obtained from: Semihalf
to driver specific files.
- window initialization is done during device attach
- CESA TDMA decoding windows values are set based on DTS,
not copied from CPU registers
- remove unnecessary virtual mapping
- update dts file
Obtained from: Semihalf
fork_trampoline (thread entry point) assembler routines, because it's
not possible to unwind beyond those points.
Also insert STOP_UNWINDING in the exception_exit routine, to prevent an
unwind-loop at that point. This is just a stopgap until we get around
to instrumenting all assembler functions with proper unwind metadata.
exit the loop until after printing info about the current frame. Also,
if executing the unwind function for a frame doesn't change the values of
any registers, log that and exit the loop rather than looping endlessly.
Communication on src-commiters, Sat, 27 Apr 2013 22:09:06 -0700,
Subject was: "Re: svn commit: r249997"
As I'm here, fix the style main block comments in files' headers.
sys/arm and sys/mips), squelching the clang 3.3 warnings about this.
Noticed by: tinderbox and many irate spectators
Submitted by: Luiz Otavio O Souza <loos.br@gmail.com>
PR: kern/177759
MFC after: 3 days
and kern.cam.ctl.disable tunable; those were introduced as a workaround
to make it possible to boot GENERIC on low memory machines.
With ctl(4) being built as a module and automatically loaded by ctladm(8),
this makes CTL work out of the box.
Reviewed by: ken
Sponsored by: FreeBSD Foundation
Introduce counter(9) API, that implements fast and raceless counters,
provided (but not limited to) for gathering of statistical data.
See http://lists.freebsd.org/pipermail/freebsd-arch/2013-April/014204.html
for more details.
In collaboration with: kib
Reviewed by: luigi
Tested by: ae, ray
Sponsored by: Nginx, Inc.
most kernels before FreeBSD 9.0. Remove such modules and respective kernel
options: atadisk, ataraid, atapicd, atapifd, atapist, atapicam. Remove the
atacontrol utility and some man pages. Remove useless now options ATA_CAM.
No objections: current@, stable@
MFC after: never
uart(4) allocates send and receiver buffers in attach() before it calls
the low-level driver's attach routine. Many low-level drivers set the
fifo sizes in their attach routine, which is too late. Other drivers set
them in the probe() routine, so that they're available when uart(4)
allocates buffers. This fixes the ones that were setting the values too
late by moving the code to probe().
Changes to make rtc/cts flow control work...
This does not turn on the builtin hardware flow control on the SoC's usart
device, because that doesn't work on uart1 due to a chip erratum (they
forgot to wire up pin PA21 to RTS0 internally). Instead it uses the
hardware flow control logic where the tty layer calls the driver to assert
and de-assert the flow control lines as needed. This prevents overruns at
the tty layer (app doesn't read fast enough), but does nothing for overruns
at the driver layer (interrupts not serviced fast enough).
To work around the wiring problem with RTS0, the driver reassigns that pin
as a GPIO and controls it manually. It only does so if given permission via
hint.uart.1.use_rts0_workaround=1, to prevent accidentally driving the pin
if uart1 is used without flow control (because something not related to
serial IO could be wired to that pin).
In addition to the RTS0 workaround, driver changes were needed in the area
of reading the current set of DCE signals. A priming read is now done at
attach() time, and the interrupt routine now sets SER_INT_SIGCHG when any
of the DCE signals change. Without these changes, nothing could ever be
transmitted, because the tty layer thought CTS was de-asserted (when in fact
we had just never read the status register, and the hwsig variable was
init'd to CTS de-asserted).
Changes to support bulk high-speed (230kbps and higher) data reception...
Allow the receive fifo size to be tuned with hint.uart.<dev>.fifo_bytes.
For high speed receive, a fifo size of 1024 works well. The default is
still 128 bytes if no hint is provided. Using a value larger than 384
requires a change in dev/uart/uart_core.c to size the intermediate
buffer as MAX(384, 3*sc->sc_rxfifosize).
Recalculate the receive timeout whenever the baud rate changes. At low
baud rates (19.2kbps and below) the timeout is the number of bits in 2
characters. At higher speed it's calculated to be 500 microseconds
worth of bits. The idea is to compromise between being responsive in
interactive situations and not timing out prematurely during a brief
pause in bulk data flow. The old fixed timeout of 1.5 characters was
just 32 microseconds at 460kbps.
At interrupt time, check for receiver holding register overrun status
and set the corresponding status bit in the return value.
When handling a buffer overrun, get a single buffer emptied and handed
back to the hardware as quickly as possible, then deal with the second
buffer. This at least minimizes data loss compared to the old logic
that fully processed both buffers before restarting the hardware.
Rewrite the logic for handling buffers after a receive timeout. The
original author speculated in a comment that there may be a race with
high speed data. There was, although it was rare. The code now handles
all three possible scenarios on receive timeout: two empty buffers, one
empty and one partial buffer, or one full and one partial buffer.
Reviewed by: imp
add the ability for userland to be notified of changes on gpio pins via
a select(2)/read(2) interface.
Change the interrupt handler from filtered to threaded.
Because of the uiomove() calls in the new interface, change locking from
standard mutex to sx.
Add / restore the at91_gpio_high_z() function.
Reviewed by: imp (long ago)
of bits, not just a 0/1 indicating whether any of the masked bits are on.
This is compatible with the single in-tree caller of this function right now
(at91_vbus_poll() in dev/usb/controller/at91dci_atemelarm.c).
With some recent busdma refactoring, sometimes it happens that a sync
op gets called when bus_dmamap_load() never got called, which results
in a spurious warning about a map mismatch when no sync operations will
actually happen anyway. Now the check is done only if a sync operation
is actually performed, and the result of the check is a panic, not just
a printf.
Reviewed by: cognet (who prevented me from donning a point hat)
original 2us are indeed not enough, 3us are working quite well on my tests.
To be more safe set minimal period to 5us and to be even more safe replicate
here from HPET mechanism of rereading counter after programming comparator.
This change allows to handle 30K of short nanosleep() calls per second on
Raspberry Pi instead of just 8K before.
Discussed with: gonzo
do not map the b_pages pages into buffer_map KVA. The use of the
unmapped buffers eliminate the need to perform TLB shootdown for
mapping on the buffer creation and reuse, greatly reducing the amount
of IPIs for shootdown on big-SMP machines and eliminating up to 25-30%
of the system time on i/o intensive workloads.
The unmapped buffer should be explicitely requested by the GB_UNMAPPED
flag by the consumer. For unmapped buffer, no KVA reservation is
performed at all. The consumer might request unmapped buffer which
does have a KVA reserve, to manually map it without recursing into
buffer cache and blocking, with the GB_KVAALLOC flag.
When the mapped buffer is requested and unmapped buffer already
exists, the cache performs an upgrade, possibly reusing the KVA
reservation.
Unmapped buffer is translated into unmapped bio in g_vfs_strategy().
Unmapped bio carry a pointer to the vm_page_t array, offset and length
instead of the data pointer. The provider which processes the bio
should explicitely specify a readiness to accept unmapped bio,
otherwise g_down geom thread performs the transient upgrade of the bio
request by mapping the pages into the new bio_transient_map KVA
submap.
The bio_transient_map submap claims up to 10% of the buffer map, and
the total buffer_map + bio_transient_map KVA usage stays the
same. Still, it could be manually tuned by kern.bio_transient_maxcnt
tunable, in the units of the transient mappings. Eventually, the
bio_transient_map could be removed after all geom classes and drivers
can accept unmapped i/o requests.
Unmapped support can be turned off by the vfs.unmapped_buf_allowed
tunable, disabling which makes the buffer (or cluster) creation
requests to ignore GB_UNMAPPED and GB_KVAALLOC flags. Unmapped
buffers are only enabled by default on the architectures where
pmap_copy_page() was implemented and tested.
In the rework, filesystem metadata is not the subject to maxbufspace
limit anymore. Since the metadata buffers are always mapped, the
buffers still have to fit into the buffer map, which provides a
reasonable (but practically unreachable) upper bound on it. The
non-metadata buffer allocations, both mapped and unmapped, is
accounted against maxbufspace, as before. Effectively, this means that
the maxbufspace is forced on mapped and unmapped buffers separately.
The pre-patch bufspace limiting code did not worked, because
buffer_map fragmentation does not allow the limit to be reached.
By Jeff Roberson request, the getnewbuf() function was split into
smaller single-purpose functions.
Sponsored by: The FreeBSD Foundation
Discussed with: jeff (previous version)
Tested by: pho, scottl (previous version), jhb, bf
MFC after: 2 weeks
register from a bus space resource.
Note that this macro is just for ARM, and is intended to have a short
lifespan. The DMA engines in some SoCs need the physical address of a
memory-mapped device register as one of the arguments for the transfer.
Several scattered ad-hoc solutions have been converted to use this macro,
which now also serves to mark the places where a more complete fix needs
to be applied (after that fix has been designed).
pages around, taking array of vm_page_t both for source and
destination. Starting offsets and total transfer size are specified.
The function implements optimal algorithm for copying using the
platform-specific optimizations. For instance, on the architectures
were the direct map is available, no transient mappings are created,
for i386 the per-cpu ephemeral page frame is used. The code was
typically borrowed from the pmap_copy_page() for the same
architecture.
Only i386/amd64, powerpc aim and arm/arm-v6 implementations were
tested at the time of commit. High-level code, not committed yet to
the tree, ensures that the use of the function is only allowed after
explicit enablement.
For sparc64, the existing code has known issues and a stab is added
instead, to allow the kernel linking.
Sponsored by: The FreeBSD Foundation
Tested by: pho (i386, amd64), scottl (amd64), ian (arm and arm-v6)
MFC after: 2 weeks
and that can drive someone crazy. While m_get2() is young and not
documented yet, change its order of arguments to match m_getm2().
Sorry for churn, but better now than later.
when the kernel attempts to unwind through this function.
The .fnstart and .fnend in this function should be moved to macros but we
are currently missing an END macro on ARM.
future further optimizations where the vm_object lock will be held
in read mode most of the time the page cache resident pool of pages
are accessed for reading purposes.
The change is mostly mechanical but few notes are reported:
* The KPI changes as follow:
- VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
- VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
- VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
- VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED()
(in order to avoid visibility of implementation details)
- The read-mode operations are added:
VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(),
VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
* The vm/vm_pager.h namespace pollution avoidance (forcing requiring
sys/mutex.h in consumers directly to cater its inlining functions
using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h
consumers now must include also sys/rwlock.h.
* zfs requires a quite convoluted fix to include FreeBSD rwlocks into
the compat layer because the name clash between FreeBSD and solaris
versions must be avoided.
At this purpose zfs redefines the vm_object locking functions
directly, isolating the FreeBSD components in specific compat stubs.
The KPI results heavilly broken by this commit. Thirdy part ports must
be updated accordingly (I can think off-hand of VirtualBox, for example).
Sponsored by: EMC / Isilon storage division
Reviewed by: jeff
Reviewed by: pjd (ZFS specific review)
Discussed with: alc
Tested by: pho
other architectures [1].
While here:
- Remove an unused and commented out include.
- Add a comment describing the file that other copies have.
- Fix the style of the defines and add a comment on what each one is.
Suggested by: [1] alc
sent a SIGABRT when it is loaded as it is too large. This is the smallest
power of two MiB value that allows us to execute clang.
While here wrap it in an #ifndef to be consistent with the other
architectures.
Submitted by: Daisuke Aoyama <aoyama at peach.ne.jp>
Switch eventtimers(9) from using struct bintime to sbintime_t.
Even before this not a single driver really supported full dynamic range of
struct bintime even in theory, not speaking about practical inexpediency.
This change legitimates the status quo and cleans up the code.
Although AM335x TRM states that GPIO_OE register is not used and just
reflects pads configuration in practice it does control pin behavior
and shoiuld be set in addition to pinmux setup
Replace the sub-optimal uma_zone_set_obj() primitive with more modern
uma_zone_reserve_kva(). The new primitive reserves before hand
the necessary KVA space to cater the zone allocations and allocates pages
with ALLOC_NOOBJ. More specifically:
- uma_zone_reserve_kva() does not need an object to cater the backend
allocator.
- uma_zone_reserve_kva() can cater M_WAITOK requests, in order to
serve zones which need to do uma_prealloc() too.
- When possible, uma_zone_reserve_kva() uses directly the direct-mapping
by uma_small_alloc() rather than relying on the KVA / offset
combination.
The removal of the object attribute allows 2 further changes:
1) _vm_object_allocate() becomes static within vm_object.c
2) VM_OBJECT_LOCK_INIT() is removed. This function is replaced by
direct calls to mtx_init() as there is no need to export it anymore
and the calls aren't either homogeneous anymore: there are now small
differences between arguments passed to mtx_init().
Sponsored by: EMC / Isilon storage division
Reviewed by: alc (which also offered almost all the comments)
Tested by: pho, jhb, davide
fact, use the same values here that we use on 32-bit x86 and MIPS. Some
machines were reported to have problems with the more aggressive values.
Reported and tested by: andrew
thread scheduled by interrupt fired after we entered critical section.
None of cpu_sleep() implementations on ARM check sched_runnable() now, so
put the first line of defence here. This mostly fixes unexpectedly long
sleeps in synthetic tests of calloutng code and probably other situations.
seems to cause more problems then previous behavior: it either breaks
initilization sequence in other places or uncovers problems with
high-speed mode timing for SDHCI 3.0
Fix pull-up and pull-down values of gpio.
According to A10 user manual possible pull register
values are 00 Pull-up/down disable, 01 Pull-up, 10 Pull-down.
Approved by: gonzo@
submap. Otherwise, after r246204, the auto-scaling logic in kern_malloc.c
tries to create a kmem submap that consumes the entire kernel map on a
Pandaboard with 1 GB of RAM.
Tested by: gonzo
machine to another. Therefore, VM_MAX_KERNEL_ADDRESS can't be a constant.
Instead, #define it to be a variable, vm_max_kernel_address, just like we
do on sparc64.
Reviewed by: kib
Tested by: ian
SDHCI driver
Suggested by: Daisuke Aoyama
- Set initilization sequence frequency to 8MHz. It should fix Data CRC
errors. Standard requires initialization sequence to be executed
at 400KHz but on this hardware low frequncies seems to cause
Data CRC errors.
Value was derived from analyzing hardware signals after
Raspberry Pi is powered up. Before any data is read though DATA line
adapter's clock frequency is changed to 8MHz.
Modern cards should function fine at 8MHz but for older MMC cards it
can be overriden by setting hw.bcm2835.sdhci.min_freq tunable.
every architecture's busdma_machdep.c. It is done by unifying the
bus_dmamap_load_buffer() routines so that they may be called from MI
code. The MD busdma is then given a chance to do any final processing
in the complete() callback.
The cam changes unify the bus_dmamap_load* handling in cam drivers.
The arm and mips implementations are updated to track virtual
addresses for sync(). Previously this was done in a type specific
way. Now it is done in a generic way by recording the list of
virtuals in the map.
Submitted by: jeff (sponsored by EMC/Isilon)
Reviewed by: kan (previous version), scottl,
mjacob (isp(4), no objections for target mode changes)
Discussed with: ian (arm changes)
Tested by: marius (sparc64), mips (jmallet), isci(4) on x86 (jharris),
amd64 (Fabian Keil <freebsd-listen@fabiankeil.de>)