functions are implemented with CMPXCHG8B instruction where it is available,
i. e., all Pentium-class and later processors. Note this instruction is
also used for atomic_store_rel_64() because a simple XCHG-like instruction
for 64-bit memory access does not exist, unfortunately. If the processor
lacks the instruction, i. e., 80486-class CPUs, two 32-bit load/store are
performed with interrupt temporarily disabled, assuming it does not support
SMP. Although this assumption may be little naive, it is true in reality.
This implementation is inspired by Linux.
at run-time on i386. cpu_ticks() is set to use RDTSC early enough on i386
where it is available. Otherwise, cpu_ticks() is driven by the current
timecounter hardware as binuptime(9) does. This also avoids unnecessary
namespace pollution from <machine/cputypes.h>.
Also, express this new maximum as a fraction of the kernel's address
space size rather than a constant so that increasing KVA_PAGES will
automatically increase this maximum. As a side-effect of this change,
kern.maxvnodes will automatically increase by a proportional amount.
While I'm here ensure that this change doesn't result in an unintended
increase in maxpipekva on i386. Calculate maxpipekva based upon the
size of the kernel address space and the amount of physical memory
instead of the size of the kmem map. The memory backing pipes is not
allocated from the kmem map. It is allocated from its own submap of
the kernel map. In short, it has no real connection to the kmem map.
(In fact, the commit messages for the maxpipekva auto-sizing talk
about using the kernel map size, cf. r117325 and r117391, even though
the implementation actually used the kmem map size.) Although the
calculation is now done differently, the resulting value for
maxpipekva should remain almost the same on i386. However, on amd64,
the value will be reduced by 2/3. This is intentional. The recent
change to VM_KMEM_SIZE_SCALE on amd64 for the benefit of ZFS also had
the unnecessary side-effect of increasing maxpipekva. This change is
effectively restoring maxpipekva on amd64 to its prior value.
Eliminate init_param3() since it is no longer used.
Unfortunately, it pulls in <machine/cputypes.h> but it is small enough and
namespace pollution is minimal, I hope.
Pointed out by: bde
Pointy hat: jkim
soon as possible for stack protector. However, dummy timecounter does not
have enough entropy and we don't need to sacrifice Pentium class and later.
Pointed out by: Maxim Dounin (mdounin at mdounin dot ru)
Compile sys/dev/mem/memutil.c for all supported platforms and remove now
unnecessary dev_mem_md_init(). Consistently define mem_range_softc from
mem.c for all platforms. Add missing #include guards for machine/memdev.h
and sys/memrange.h. Clean up some nearby style(9) nits.
MFC after: 1 month
architecture macros (__mips_n64, __powerpc64__) when 64 bit types (and
corresponding macros) are different from 32 bit. [1]
Correct the type of INT64_MIN, INT64_MAX and UINT64_MAX.
Define (U)INTMAX_C as an alias for (U)INT64_C matching the type definition
for (u)intmax_t. Do this on all architectures for consistency.
Suggested by: bde [1]
Approved by: kib (mentor)
of (unsigned) int __attribute__((__mode__(__DI__))). This aligns better
with macros such as (U)INT64_C, (U)INT64_MAX, etc. which assume (u)int64_t
has type (unsigned) long long.
The mode attribute was used because long long wasn't standardised until
C99. Nowadays compilers should support long long and use of the mode
attribute is discouraged according to GCC Internals documentation.
The type definition has to be marked with __extension__ to support
compilation with "-std=c89 -pedantic".
Discussed with: bde
Approved by: kib (mentor)
On some architectures UCHAR_MAX and USHRT_MAX had type unsigned int.
However, lacking integer suffixes for types smaller than int, their type
should correspond to that of an object of type unsigned char (or short)
when used in an expression with objects of type int. In that case unsigned
char (short) are promoted to int (i.e. signed) so the type of UCHAR_MAX and
USHRT_MAX should also be int.
Where MIN/MAX constants implicitly have the correct type the suffix has
been removed.
While here, correct some comments.
Reviewed by: bde
Approved by: kib (mentor)
It was used mainly to discover and fix some 64-bit portability problems
before 64-bit arches were widely available.
Discussed with: bde
Approved by: kib (mentor)
mechanical change. This opens the door for using PV device drivers
under Xen HVM on i386, as well as more general harmonisation of i386
and amd64 Xen support in FreeBSD.
Reviewed by: cperciva
MFC after: 3 weeks
When cleaning up a thread, reset its LDT to the default LDT.
Note: Casting the LDT pointer to an int and storing it in pc_currentldt is
wildly bogus, but is harmless since pc_currentldt is a write-only variable.
MFC after: 3 days
Don't map physical to machine page numbers in pte_load_store, since it uses
PT_SET_VA (which takes a physical page number and converts it to a machine
page number).
MFC after: 3 days
the original amd64 and i386 headers with stubs.
Rename (AMD64|I386)_BUS_SPACE_* to X86_BUS_SPACE_* everywhere.
Reviewed by: imp (previous version), jhb
Approved by: kib (mentor)
Passing a count of zero on i386 and amd64 for [I386|AMD64]_BUS_SPACE_MEM
causes a crash/hang since the 'loop' instruction decrements the counter
before checking if it's zero.
PR: kern/80980
Discussed with: jhb
functions, they are unused. Remove 'user' from npxgetuserregs()
etc. names.
For {npx,fpu}{get,set}regs(), always use pcb->pcb_user_save for FPU
context storage. This eliminates the need for ugly copying with
overwrite of the newly added and reserved fields in ucontext on i386
to satisfy alignment requirements for fpusave() and fpurstor().
pc98 version was copied from i386.
Suggested and reviewed by: bde
Tested by: pho (i386 and amd64)
MFC after: 1 week
_HYPERVISOR_multicall, and create a new HYPERVISOR_multicall function which
invokes _HYPERVISOR_multicall and checks that the individual hypercalls all
succeeded.
Clean up the code by converting xpmap_ptom(VTOP(...)) to VTOM(...) and
converting xpmap_ptom(VM_PAGE_TO_PHYS(...)) to VM_PAGE_TO_MACH(...). In
a few places we take advantage of the fact that xpmap_ptom can commute with
setting PG_* flags.
This commit should have no net effect save to improve the readability of
this code.
These MSRs can be used to determine actual (average) performance as
compared to a maximum defined performance.
Availability of these MSRs is indicated by bit0 in CPUID.6.ECX on both
Intel and AMD processors.
MFC after: 5 days
It seems that this MSR has been available in a range of AMD processors
families for quite a while now.
Note1: not all AMD MSRs that are found in amd64 specialreg.h are also in
the i386 version.
Note2: perhaps some additional name component is needed to distinguish
AMD-specific MSRs.
MFC after: 5 days
contents of the ones that were not empty were stale and unused.
- Now that <machine/mutex.h> no longer exists, there is no need to allow it
to override various helper macros in <sys/mutex.h>.
- Rename various helper macros for low-level operations on mutexes to live
in the _mtx_* or __mtx_* namespaces. While here, change the names to more
closely match the real API functions they are backing.
- Drop support for including <sys/mutex.h> in assembly source files.
Suggested by: bde (1, 2)
pmap_kextract()) before pmap_bootstrap() is called.
Document the set of pmap functions that may be called before
pmap_bootstrap() is called.
Tested by: bde@
Reviewed by: kib@
Discussed with: jhb@
MFC after: 6 weeks
The main goal of this is to generate timer interrupts only when there is
some work to do. When CPU is busy interrupts are generating at full rate
of hz + stathz to fullfill scheduler and timekeeping requirements. But
when CPU is idle, only minimum set of interrupts (down to 8 interrupts per
second per CPU now), needed to handle scheduled callouts is executed.
This allows significantly increase idle CPU sleep time, increasing effect
of static power-saving technologies. Also it should reduce host CPU load
on virtualized systems, when guest system is idle.
There is set of tunables, also available as writable sysctls, allowing to
control wanted event timer subsystem behavior:
kern.eventtimer.timer - allows to choose event timer hardware to use.
On x86 there is up to 4 different kinds of timers. Depending on whether
chosen timer is per-CPU, behavior of other options slightly differs.
kern.eventtimer.periodic - allows to choose periodic and one-shot
operation mode. In periodic mode, current timer hardware taken as the only
source of time for time events. This mode is quite alike to previous kernel
behavior. One-shot mode instead uses currently selected time counter
hardware to schedule all needed events one by one and program timer to
generate interrupt exactly in specified time. Default value depends of
chosen timer capabilities, but one-shot mode is preferred, until other is
forced by user or hardware.
kern.eventtimer.singlemul - in periodic mode specifies how much times
higher timer frequency should be, to not strictly alias hardclock() and
statclock() events. Default values are 2 and 4, but could be reduced to 1
if extra interrupts are unwanted.
kern.eventtimer.idletick - makes each CPU to receive every timer interrupt
independently of whether they busy or not. By default this options is
disabled. If chosen timer is per-CPU and runs in periodic mode, this option
has no effect - all interrupts are generating.
As soon as this patch modifies cpu_idle() on some platforms, I have also
refactored one on x86. Now it makes use of MONITOR/MWAIT instrunctions
(if supported) under high sleep/wakeup rate, as fast alternative to other
methods. It allows SMP scheduler to wake up sleeping CPUs much faster
without using IPI, significantly increasing performance on some highly
task-switching loads.
Tested by: many (on i386, amd64, sparc64 and powerc)
H/W donated by: Gheorghe Ardelean
Sponsored by: iXsystems, Inc.