instruction loads/stores at its will.
The macro __compiler_membar() is currently supported for both gcc and
clang, but kernel compilation will fail otherwise.
Reviewed by: bde, kib
Discussed with: dim, theraven
MFC after: 2 weeks
advantages. First, PV entries are roughly half the size. Second, this
allocator doesn't access the paging queues, and thus it will allow for the
removal of the page queues lock from this pmap.
Fix a rather serious bug in pmap_remove_write(). After removing write
access from the specified page's first mapping, pmap_remove_write() then
used the wrong "next" pointer. Consequently, the page's second, third,
etc. mappings were not write protected.
Tested by: jchandra
Reduce the size of a PV entry by eliminating pv_ptem. There is no need
to store a pointer to the page table page in the PV entry because it is
easily computed during the walk down the page table.
Eliminate the ptphint from the pmap. Long, long ago, page table pages
belonged to a vm object, and we would look up page table pages based
upon their offset within this vm object. In those days, this hint may
have had tangible benefits.
Tested by: jchandra
This is required for ARM EABI. Section 7.1.1 of the Procedure Call for the
ARM Architecture (AAPCS) defines wchar_t as either an unsigned int or an
unsigned short with the former preferred.
Because of this requirement we need to move the definition of __wchar_t to
a machine dependent header. It also cleans up the macros defining the limits
of wchar_t by defining __WCHAR_MIN and __WCHAR_MAX in the same machine
dependent header then using them to define WCHAR_MIN and WCHAR_MAX
respectively.
Discussed with: bde
usermode, using shared page. The structures and functions have vdso
prefix, to indicate the intended location of the code in some future.
The versioned per-algorithm data is exported in the format of struct
vdso_timehands, which mostly repeats the content of in-kernel struct
timehands. Usermode reading of the structure can be lockless.
Compatibility export for 32bit processes on 64bit host is also
provided. Kernel also provides usermode with indication about
currently used timecounter, so that libc can fall back to syscall if
configured timecounter is unknown to usermode code.
The shared data updates are initiated both from the tc_windup(), where
a fast task is queued to do the update, and from sysctl handlers which
change timecounter. A manual override switch
kern.timecounter.fast_gettime allows to turn off the mechanism.
Only x86 architectures export the real algorithm data, and there, only
for tsc timecounter. HPET counters page could be exported as well, but
I prefer to not further glue the kernel and libc ABI there until
proper vdso-based solution is developed.
Minimal stubs neccessary for non-x86 architectures to still compile
are provided.
Discussed with: bde
Reviewed by: jhb
Tested by: flo
MFC after: 1 month
layer, but it is read directly by the MI VM layer. This change introduces
pmap_page_is_write_mapped() in order to completely encapsulate all direct
access to PGA_WRITEABLE in the pmap layer.
Aesthetics aside, I am making this change because amd64 will likely begin
using an alternative method to track write mappings, and having
pmap_page_is_write_mapped() in place allows me to make such a change
without further modification to the MI VM layer.
As an added bonus, tidy up some nearby comments concerning page flags.
Reviewed by: kib
MFC after: 6 weeks
in_cksum.h required ip.h to be included for struct ip. To be
able to use some general checksum functions like in_addword()
in a non-IPv4 context, limit the (also exported to user space)
IPv4 specific functions to the times, when the ip.h header is
present and IPVERSION is defined (to 4).
We should consider more general checksum (updating) functions
to also allow easier incremental checksum updates in the L3/4
stack and firewalls, as well as ponder further requirements by
certain NIC drivers needing slightly different pseudo values
in offloading cases. Thinking in terms of a better "library".
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
Reviewed by: gnn (as part of the whole)
MFC After: 3 days
This makes our naming scheme more closely match other systems and the
expectations of much third-party software. MIPS builds which are little-endian
should require and exhibit no changes. Big-endian TARGET_ARCHes must be
changed:
From: To:
mipseb mips
mipsn32eb mipsn32
mips64eb mips64
An entry has been added to UPDATING and some foot-shooting protection (complete
with warnings which should become errors in the near future) to the top-level
base system Makefile.
New kernel events can be added at various location for sampling or counting.
This will for example allow easy system profiling whatever the processor is
with known tools like pmcstat(8).
Simultaneous usage of software PMC and hardware PMC is possible, for example
looking at the lock acquire failure, page fault while sampling on
instructions.
Sponsored by: NETASQ
MFC after: 1 month
- Replace MIPS24K-specific code with more generic framework that will
make adding new CPU support easier
- Add MIPS24K support for new framework
- Limit backtrace depth to 1 for stability reasons and add option
HWPMC_MIPS_BACKTRACE to override this limitation
required for the ABI the kernel is being built for.
XXX This is implemented in a kind-of nasty way that involves including source
files, but it's still an improvement.
o) Retire ISA_* options since they're unused and were always wrong.
implementations or no implementation on all platforms.
Some of these functions might be good ideas, but their semantics were unclear
given the lack of implementation, and an unlucky porter could be fooled into
trying to implement them or, worse, being baffled when something like
platform_trap_enter() failed to be called.
o) Get rid of some unused macros related to features we don't intend to
provide.
o) Get rid of macro definitions for MIPS-I CPUs. We are not likely to
support anything that predartes MIPS-III.
o) Respell MIPS3_* macros as MIPS_*, which is how most of them were being
used already.
o) Eliminate a duplicate and mostly-unused set of exception vector macros.
There's still considerable duplication and lots more obsolete in our headers,
but this reduces one of the larger files to a size where one could reckon
about the correctness of its contents with a mere few hours of contemplation.
There is, of course, a question of whether we need definitions for fields,
registers and configurations that we are unlikely to ever use or implement,
even if they're not obsolete since 1991. FreeBSD is not a processor
reference manual, and things that aren't used may be wrong, or may be
duplicated because nobody could possibly actually know whether they're
already defined.
TLS:
o) The mc_tls field used to store the TLS base when doing context gets and
restores was left a pointer and not converted to a 32-bit integer. This
had the bug of not correctly capturing the TLS value desired by the user,
and the extra nastiness of making the structure the wrong size.
o) The mc_tls field was not being saved by sendsig. As a result, the TLS base
would always be set to NULL when restoring from a signal handler.
Thanks to gonzo for helping track down a bunch of other TLS bugs that came out
of tracking these down.
and offset it only if requested by RDHWR handler. Otherwise things
get overly complicated - we need to track whether address passsed in
request for setting td_md.md_tls is already offseted or not.
using the o32 ABI. This mostly follows nwhitehorn's lead in implementing
COMPAT_FREEBSD32 on powerpc64.
o) Add a new type to the freebsd32 compat layer, time32_t, which is time_t in the
32-bit ABI being used. Since the MIPS port is relatively-new, even the 32-bit
ABIs use a 64-bit time_t.
o) Because time{spec,val}32 has the same size and layout as time{spec,val} on MIPS
with 32-bit compatibility, then, disable some code which assumes otherwise
wrongly when built for MIPS. A more general macro to check in this case would
seem like a good idea eventually. If someone adds support for using n32
userland with n64 kernels on MIPS, then they will have to add a variety of
flags related to each piece of the ABI that can vary. That's probably the
right time to generalize further.
o) Add MIPS to the list of architectures which use PAD64_REQUIRED in the
freebsd32 compat code. Probably this should be generalized at some point.
Reviewed by: gonzo
Reading register $29 with RDHWR is becoming the de-facto standard to
implement TLS. According to linux-mips wiki, MIPS Technologies has
reserved hardware register $29 for ABI use. Furthermore current GCC
makes the following assumptions:
- RDHWR is natively available or otherwise emulated by the kernel
- Register $29 holds the TLS pointer
Submitted by: Robert Millan <rmh@debian.org>
implement a deprecated FPU control interface in addition to the
standard one. To make this clearer, further deprecate ieeefp.h
by not declaring the function prototypes except on architectures
that implement them already.
Currently i386 and amd64 implement the ieeefp.h interface for
compatibility, and for fp[gs]etprec(), which doesn't exist on
most other hardware. Powerpc, sparc64, and ia64 partially implement
it and probably shouldn't, and other architectures don't implement it
at all.
If we handle an interrupt just before the 'wait' and the interrupt
schedules some work, we need to skip the 'wait' call. The simple solution
of calling sched_runnable() with interrupts disabled immediately before
wait still leaves a window after the call and before 'wait' in which
the same issue can occur.
The solution implemented is to check the EPC in the interrupt handler, and
if it is in a region before the 'wait' call, to fix up the EPC to skip the
wait call.
Reported/analysed by: adrian
Fix suggested by: kib
Reviewed by: jmallett, imp
- update xlp_machdep.c to read arguments from FDT if FDT support is
compiled in.
- define rmi_uart_bus_space, and use it as fdtbus_bs_tag
- update conf files for FDT support
- add default dts file xlp-basic.dts
Wrong in that it must be guarded (it's configurable)
and bogus in that there's absolutely no rationale for
it not default to a page size like all other archs.
This patch is going to help in cases like mips flavours where you
want a more granular support on MAXCPU.
No MFC is previewed for this patch.
Tested by: pluknet
Approved by: re (kib)
This patch adds support for the Netlogic XLP mips64 processors in
the common MIPS code. The changes are :
- Add CPU_NLM processor type
- Add cases for CPU_NLM, mostly were CPU_RMI is used.
- Update cache flush changes for CPU_NLM
- Add kernel build configuration files for xLP.
In collaboration with: Prabhath Raman <prabhathpr at netlogicmicro com>
Approved by: bz(re), jmallett, imp(mips)
architectures (i386, for example) the virtual memory space may be
constrained enough that 2MB is a large chunk. Use 64K for arches
other than amd64 and ia64, with special handling for sparc64 due to
differing hardware.
Also commit the comment changes to kmem_init_zero_region() that I
missed due to not saving the file. (Darn the unfamiliar development
environment).
Arch maintainers, please feel free to adjust ZERO_REGION_SIZE as you
see fit.
Requested by: alc
MFC after: 1 week
MFC with: r221853
a number of cores, this allows for a sparse set of CPUs. Implement support
for sparse core masks on Octeon.
XXX jeff@ suggests that all_cpus should include cores that are offline or
running other applications/OSes, so the platform API should be further
extended to allow us to set all_cpus to include all cores that are
physically-present as opposed to only those that are running FreeBSD.
Submitted by: Bhanu Prakash (with modifications)
Reviewed by: jchandra
Glanced at by: kib, jeff, jhb
o) Have mips_wblush just do syncw, not sync on Cavium Octeon.
o) Add support for reading and writing some Octeon-specific registers.
NB: Some of these are not entirely Octeon-specific.
Submitted by: Bhanu Prakash
- Provide trivial implementation of sf_buf_alloc(), sf_buf_free(),
sf_buf_kva() and sf_buf_page() using direct map for n64.
- uio_machdep.c - use macros so that the direct map will be used in
case of n64.
Reviewed by: imp (earlier version)
Obtained from: jmallett (user/jmallett/octeon)
Compile sys/dev/mem/memutil.c for all supported platforms and remove now
unnecessary dev_mem_md_init(). Consistently define mem_range_softc from
mem.c for all platforms. Add missing #include guards for machine/memdev.h
and sys/memrange.h. Clean up some nearby style(9) nits.
MFC after: 1 month
In n32 and n64, add support for physical address above 4GB by having
64 bit page table entries and physical addresses. Major changes are:
- param.h: update PTE sizes, masks and shift values to support 64 bit PTEs.
- param.h: remove DELAY(), mips_btop(same as atop), mips_ptob (same as
ptoa), and reformat.
- param.h: remove casting to unsigned long in trunc_page and round_page
since this will be used on physical addresses.
- _types.h: have 64 bit __vm_paddr_t for n32.
- pte.h: update TLB LO0/1 access macros to support 64 bit PTE
- pte.h: assembly macros for PTE operations.
- proc.h: md_upte is now 64 bit for n32 and n64.
- exception.S and swtch.S: use the new PTE macros for PTE operations.
- cpufunc.h: TLB_LO0/1 registers are 64bit for n32 and n64.
- xlr_machdep.c: Add memory segments above 4GB to phys_avail[] as they are
supported now.
Reviewed by: jmallett (earlier version)
1. Use vm_paddr_t for physical addresses.
There are a few places in the MIPS platform code where vm_offset_t is
used for physical addresses, change these to use vm_paddr_t:
- phys_avail[], physmem_desc[] arrays
- pmap_mapdev(), page_is_managed(), is_cacheable_mem() pmap_map() args
- local variables of various pmap functions
2. Change init_pte_prot() return from int to pt_entry_t, as this can be
64 bit when using 64 bit TLB entries.
3. Update printing of pt_entry_t and of vm_paddr_t to use 'j' format with
uintmax_t. This will be useful later if we plan to use 64bit phsical addr
on 32 bit n32 compilation.
Reviewed by: imp
and pointers don't always have the same size, e.g. the __mips_n32 ABI
(ILP32) has 64 bit registers but 32 bit pointers.
On mips introduce PRIptr to fix the format specifier for (u)intptr_t.
Prefix PRI64 and PRIptr with underscores because macro names starting with
PRI[a-zX] are reserved for future use.
Approved by: kib (mentor)
architecture macros (__mips_n64, __powerpc64__) when 64 bit types (and
corresponding macros) are different from 32 bit. [1]
Correct the type of INT64_MIN, INT64_MAX and UINT64_MAX.
Define (U)INTMAX_C as an alias for (U)INT64_C matching the type definition
for (u)intmax_t. Do this on all architectures for consistency.
Suggested by: bde [1]
Approved by: kib (mentor)
of (unsigned) int __attribute__((__mode__(__DI__))). This aligns better
with macros such as (U)INT64_C, (U)INT64_MAX, etc. which assume (u)int64_t
has type (unsigned) long long.
The mode attribute was used because long long wasn't standardised until
C99. Nowadays compilers should support long long and use of the mode
attribute is discouraged according to GCC Internals documentation.
The type definition has to be marked with __extension__ to support
compilation with "-std=c89 -pedantic".
Discussed with: bde
Approved by: kib (mentor)
On some architectures UCHAR_MAX and USHRT_MAX had type unsigned int.
However, lacking integer suffixes for types smaller than int, their type
should correspond to that of an object of type unsigned char (or short)
when used in an expression with objects of type int. In that case unsigned
char (short) are promoted to int (i.e. signed) so the type of UCHAR_MAX and
USHRT_MAX should also be int.
Where MIN/MAX constants implicitly have the correct type the suffix has
been removed.
While here, correct some comments.
Reviewed by: bde
Approved by: kib (mentor)
It was used mainly to discover and fix some 64-bit portability problems
before 64-bit arches were widely available.
Discussed with: bde
Approved by: kib (mentor)
The macros here for generating coprocessor 0 accessors are named like:
MIPS_RDRW32_COP0
That macro would produce mips_rd_<register>() and mips_wr_<register>()
inlines to access the specified register by name from C. The problem is that
the R and the W were swapped in the macros originally; it was meant to be named
RDWR because it generated mips_rd_* and mips_wr_* functions, but was instead
spelled RDRW, which nobody should be expected to get right by anything other
than copy and paste.
It's too many consonants in a row to keep straight anyway, so just prefer e.g.:
MIPS_RW32_COP0
While here, add a missing #undef.
o) Make the octeon_wdog driver work on multi-CPU systems and to also print more
information on NMI that may aid debugging. Simplify and clean up internal
API and structure.
Implement uma_small_alloc() and uma_small_free() for mips that allocates
pages from direct mapped memory. Uses the same mechanism as the page table
page allocator, so that we allocate from KSEG0 in 32 bit, and from XKPHYS
on 64 bit.
Reviewed by: alc, jmallett
Passing a count of zero on i386 and amd64 for [I386|AMD64]_BUS_SPACE_MEM
causes a crash/hang since the 'loop' instruction decrements the counter
before checking if it's zero.
PR: kern/80980
Discussed with: jhb
contents of the ones that were not empty were stale and unused.
- Now that <machine/mutex.h> no longer exists, there is no need to allow it
to override various helper macros in <sys/mutex.h>.
- Rename various helper macros for low-level operations on mutexes to live
in the _mtx_* or __mtx_* namespaces. While here, change the names to more
closely match the real API functions they are backing.
- Drop support for including <sys/mutex.h> in assembly source files.
Suggested by: bde (1, 2)
The main goal of this is to generate timer interrupts only when there is
some work to do. When CPU is busy interrupts are generating at full rate
of hz + stathz to fullfill scheduler and timekeeping requirements. But
when CPU is idle, only minimum set of interrupts (down to 8 interrupts per
second per CPU now), needed to handle scheduled callouts is executed.
This allows significantly increase idle CPU sleep time, increasing effect
of static power-saving technologies. Also it should reduce host CPU load
on virtualized systems, when guest system is idle.
There is set of tunables, also available as writable sysctls, allowing to
control wanted event timer subsystem behavior:
kern.eventtimer.timer - allows to choose event timer hardware to use.
On x86 there is up to 4 different kinds of timers. Depending on whether
chosen timer is per-CPU, behavior of other options slightly differs.
kern.eventtimer.periodic - allows to choose periodic and one-shot
operation mode. In periodic mode, current timer hardware taken as the only
source of time for time events. This mode is quite alike to previous kernel
behavior. One-shot mode instead uses currently selected time counter
hardware to schedule all needed events one by one and program timer to
generate interrupt exactly in specified time. Default value depends of
chosen timer capabilities, but one-shot mode is preferred, until other is
forced by user or hardware.
kern.eventtimer.singlemul - in periodic mode specifies how much times
higher timer frequency should be, to not strictly alias hardclock() and
statclock() events. Default values are 2 and 4, but could be reduced to 1
if extra interrupts are unwanted.
kern.eventtimer.idletick - makes each CPU to receive every timer interrupt
independently of whether they busy or not. By default this options is
disabled. If chosen timer is per-CPU and runs in periodic mode, this option
has no effect - all interrupts are generating.
As soon as this patch modifies cpu_idle() on some platforms, I have also
refactored one on x86. Now it makes use of MONITOR/MWAIT instrunctions
(if supported) under high sleep/wakeup rate, as fast alternative to other
methods. It allows SMP scheduler to wake up sleeping CPUs much faster
without using IPI, significantly increasing performance on some highly
task-switching loads.
Tested by: many (on i386, amd64, sparc64 and powerc)
H/W donated by: Gheorghe Ardelean
Sponsored by: iXsystems, Inc.
PMAP_DIAGNOSTIC was eliminated from amd64/i386, and, in fact, the
non-MIPS parts of the kernel, several years ago. Any of the interesting
checks were turned into KASSERT()s. Basically, the motivation was that
lots of people run with INVARIANTS but no one runs with DIAGNOSTIC.
panic strings needn't and shouldn't have a terminating newline.
Finally, there is one functional change. The sched_pin() in
pmap_remove_pages() is an artifact of the way we temporarily map page
table pages on i386. (The mappings are processor private. We don't do
a system-wide shootdown.) It isn't needed by MIPS.
Tested by: jchandra
Submitted by: alc
1. On n64, use XKPHYS to map page table pages instead of KSEG0. Maintain
just one freepages list on n64.
The changes are mainly to introduce MIPS_PHYS_TO_DIRECT(pa),
MIPS_DIRECT_TO_PHYS(), which will use KSEG0 in 32 bit compilation
and XKPHYS in 64 bit compilation.
2. Change macro based PMAP_LMEM_MAP1(), PMAP_LMEM_MAP2(), PMAP_LMEM_UNMAP()
to inline functions.
3. Introduce MIPS_DIRECT_MAPPABLE(pa), which will further reduce the cases
in which we will need to have a special case for 64 bit compilation.
4. Update CP0 hazard definitions for CPU_RMI - the cpu does not need any
nops
Reviewed by: neel
In particular, provide pagesize and pagesizes array, the canary value
for SSP use, number of host CPUs and osreldate.
Tested by: marius (sparc64)
MFC after: 1 month
1. Move dirty bit emulation code that is duplicted for kernel and user
in trap.c to a function pmap_emulate_modified() in pmap.c.
2. While doing dirty bit emulation, it is not necessary to update the
TLB entry on all CPUs using smp_rendezvous(), we can just update the
TLB entry on the current CPU, and let the other CPUs update their TLB
entry lazily if they get an exception.
Reviewed by: alc, neel