APIC interrupt that fires when a threshold of corrected machine check
events is reached. CMCI also includes a count of events when reporting
corrected errors in the bank's status register. Note that individual
banks may or may not support CMCI. If they do, each bank includes its own
threshold register that determines when the interrupt fires. Currently
the code uses a very simple strategy where it doubles the threshold on
each interrupt until it succeeds in throttling the interrupt to occur
only once a minute (this interval can be tuned via sysctl). The threshold
is also adjusted on each hourly poll which will lower the threshold once
events stop occurring.
Tested by: Sailaja Bangaru sbappana at yahoo com
MFC after: 1 month
arbitrary frequencies into hardclock(), statclock() and profclock() calls.
Same code with minor variations duplicated several times over the tree for
different timer drivers and architectures.
- Switch all x86 archs to new functions, simplifying the code and removing
extra logic from timer drivers. Other archs are also welcome.
Extend struct sysvec with three new elements:
sv_fetch_syscall_args - the method to fetch syscall arguments from
usermode into struct syscall_args. The structure is machine-depended
(this might be reconsidered after all architectures are converted).
sv_set_syscall_retval - the method to set a return value for usermode
from the syscall. It is a generalization of
cpu_set_syscall_retval(9) to allow ABIs to override the way to set a
return value.
sv_syscallnames - the table of syscall names.
Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding
the call to cpu_set_syscall_retval().
The new functions syscallenter(9) and syscallret(9) are provided that
use sv_*syscall* pointers and contain the common repeated code from
the syscall() implementations for the architecture-specific syscall
trap handlers.
Syscallenter() fetches arguments, calls syscall implementation from
ABI sysent table, and set up return frame. The end of syscall
bookkeeping is done by syscallret().
Take advantage of single place for MI syscall handling code and
implement ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and
PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the
thread is stopped at syscall entry or return point respectively. The
EXEC flag augments SCX and notifies debugger that the process address
space was changed by one of exec(2)-family syscalls.
The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are
changed to use syscallenter()/syscallret(). MIPS and arm are not
converted and use the mostly unchanged syscall() implementation.
Reviewed by: jhb, marcel, marius, nwhitehorn, stas
Tested by: marcel (ia64), marius (sparc64), nwhitehorn (powerpc),
stas (mips)
MFC after: 1 month
architecture from page queue lock to a hashed array of page locks
(based on a patch by Jeff Roberson), I've implemented page lock
support in the MI code and have only moved vm_page's hold_count
out from under page queue mutex to page lock. This changes
pmap_extract_and_hold on all pmaps.
Supported by: Bitgravity Inc.
Discussed with: alc, jeffr, and kib
In the end, it does help fixing /dev/io usage from multithreaded
processes.
- On i386 and amd64 the old behaviour is kept but multithreaded
processes must use the new interface in order to work well.
- Support for the other architectures is greatly improved, where
necessary, by the necessity to define very small things now.
Manpage update will happen shortly.
Sponsored by: Sandvine Incorporated
PR: threads/116181
Reviewed by: emaste, marcel
MFC after: 3 weeks
domain clock, 8 programmable PMC.
- Westmere based CPU (Xeon 5600, Corei7 980X) support.
- New man pages with events list for core and uncore.
- Updated Corei7 events with Intel 253669-033US December 2009 doc.
There is some removed events in the documentation, they have been
kept in the code but documented in the man page as obsolete.
- Offcore response events can be setup with rsp token.
Sponsored by: NETASQ
for parsing model-specific and other fields in machine check events
including the global machine check capabilities and status registers,
CPU identification, and the FreeBSD CPU ID.
- Report these added fields in the console log of a machine check so that
a record structure can be reconstituted from the console messages.
- Parse new architectural errors including memory controller errors.
MFC after: 1 week
for upcoming 64-bit PowerPC and MIPS support. This renames the COMPAT_IA32
option to COMPAT_FREEBSD32, removes some IA32-specific code from MI parts
of the kernel and enhances the freebsd32 compatibility code to support
big-endian platforms.
Reviewed by: kib, jhb
processors. With this workaround, superpage promotion can be re-enabled
under virtualization. Moreover, machine check exceptions can safely be
enabled when FreeBSD is running natively on Family 10h processors.
Most of the credit should go to Andriy Gapon for diagnosing the error and
working with Borislav Petkov at AMD to document it. Andriy also reviewed
and tested my patches.
Discussed with: jhb
MFC after: 3 weeks
correctly initialized and just then assign to softclock/profclock.
Right now, some atrtc seems reporting strange diagnostic error* making the
current pattern bogus.
In order to do that cleanly, lapic_setup_clock(), on both ia32 and amd64,
now accepts as arguments the desired sources to handle, and returns the
actual ones (LAPIC_CLOCK_NONE is forbidden because otherwise there is no
meaning in calling such function).
This allows to bring out into commont x86 code the handling part for
machdep.lapic_allclocks tunable, which is retained.
Sponsored by: Sandvine Incorporated
Tested by: yongari, Richard Todd
<rmtodd at ichotolot dot servalan dot com>
MFC: 3 weeks
X-MFC: r202387, 204309
This header file uses __packed, without including <sys/cdefs.h>. This
means it cannot be used in the way described in sysarch(3) by only
including <machine/sysarch.h>.
LAPIC may lead to aliasing for softclock and profclock because frequencies
are sized in order to fit mainly hardclock.
atrtc used to take care of the softclock and profclock and it does still
do, if the LAPIC can't handle the clocks properly.
Revert the change when the LAPIC started taking charge of all three of
them and let atrtc handle softclock and profclock if not explicitly
requested. Such request can be made setting != 0 the new tunable
machdep.lapic_allclocks or if the new device ATPIC is not present
within the i386 kernel config (atrtc is linked to atpic presence).
Diagnosed by: Sandvine Incorporated
Reviewed by: jhb, emaste
Sponsored by: Sandvine Incorporated
MFC: 3 weeks
I/O port access is implemented on Itanium by reading and writing to a
special region in memory. To hide details and avoid misaligned memory
accesses, a process did I/O port reads and writes by making a MD system
call. There's one fatal problem with this approach: unprivileged access
was not being prevented. /dev/io serves that purpose on amd64/i386, so
employ it on ia64 as well. Use an ioctl for doing the actual I/O and
remove the sysarch(2) interface.
Backward compatibility is not being considered. The sysarch(2) approach
was added to support X11, but support for FreeBSD/ia64 was never fully
implemented in X11. Thus, nothing gets broken that didn't need more work
to begin with.
MFC after: 1 week
sys/vmmeter.h: warning: shadowed declaration is here
machine/cpufunc.h: In function 'insw':
machine/cpufunc.h: warning: declaration of 'cnt' shadows a global declaration
..snip..
- directly print mca information in case we fail to allocate memory
for a record
- include bank number into mca record
- print raw mca status value for extended information
Reviewed by: jhb
MFC after: 10 days
not properly set up. r199067 added the call to TUNABLE_INT_FETCH() to
initializecpu() that results in hang because AP are started when kernel
environment is already dynamic and thus needs to acquire mutex, that is
too early in AP start sequence to work.
Extract the code that should be executed only once, because it sets
up global variables, from initializecpu() to initializecpucache(),
and call the later only from hammer_time() executed on BSP. Now,
TUNABLE_INT_FETCH() is done only once at BSP at the early boot stage.
In collaboration with: Mykola Dzham <freebsd levsha org ua>
Reviewed by: jhb
Tested by: ed, battlez
handlers. This is primarily intended as a way to allow devices that use
multiple interrupts (e.g. MSI) to meaningfully distinguish the various
interrupt handlers.
- Add a new BUS_DESCRIBE_INTR() method to the bus interface to associate
a description with an active interrupt handler setup by BUS_SETUP_INTR.
It has a default method (bus_generic_describe_intr()) which simply passes
the request up to the parent device.
- Add a bus_describe_intr() wrapper around BUS_DESCRIBE_INTR() that supports
printf(9) style formatting using var args.
- Reserve MAXCOMLEN bytes in the intr_handler structure to hold the name of
an interrupt handler and copy the name passed to intr_event_add_handler()
into that buffer instead of just saving the pointer to the name.
- Add a new intr_event_describe_handler() which appends a description string
to an interrupt handler's name.
- Implement support for interrupt descriptions on amd64 and i386 by having
the nexus(4) driver supply a custom bus_describe_intr method that invokes
a new intr_describe() MD routine which in turn looks up the associated
interrupt event and invokes intr_event_describe_handler().
Requested by: many
Reviewed by: scottl
MFC after: 2 weeks
by looking at the bases used for non-relocatable executables by gnu ld(1),
and adjusting it slightly.
Discussed with: bz
Reviewed by: kan
Tested by: bz (i386, amd64), bsam (linux)
MFC after: some time
specify their own version of atomic_cmpset_* which could have been
different than the membar version.
Right now, however, FreeBSD is bound mostly to GCC-like compilers and
it is desired to add new support and compat shim mostly when there is
a real necessity, in order to avoid too much compatibility bloats.
In this optic, bring back atomic_cmpset_{acq, rel}_* to be the same as
atomic_cmpset_* and unwind the atomic_cmpset_barr_* introduction.
Requested by: jhb
Reviewed by: jhb
Tested by: Giovanni Trematerra <giovanni dot trematerra at
gmail dot com>
not defined through macros or similar) in order to be later compiled in
the kernel and offer this way the support for modules (and
compatibility among the UP case and SMP case).
Fix this for the newly introduced atomic_cmpset_barr_* cases by defining
and specifying a template. Note that the new DEFINE_CMPSET_GEN()
template save more typing on amd64 than the current code. [1]
- Fix the style for memory barriers on amd64.
[1] Reported by: Paul B. Mahol <onemda at gmail dot com>
memory barriers should also ensure that the compiler doesn't reorder paths
where they are used. GCC, however, does that aggressively, even in
presence of volatile operands. The most reliable way GCC offers for avoid
instructions reordering is clobbering "memory" even if that is
theoretically an heavy-weight operation, flushing the content of all
the registers and forcing reload of them (We could rely, however, on
gcc DTRT by just understanding the purpose as this is a well-known
pattern for many modern operating-systems).
Not all our memory barriers, right now, clobber memory for GCC-like
compilers. The most notable cases are IA32 and amd64 where the memory
barrier are treacted the same as normal atomic instructions.
Fix this by offering the possibility to implement atomic instructions
with memory barriers separately from the normal version and implement
the GCC-like specific one using memory clobbering.
Thanks to Chris Lattner (@apple) for his discussion on llvm specifics.
Reported by: jhb
Reviewed by: jhb
Tested by: rdivacky, Giovanni Trematerra
<giovanni dot trematerra at gmail dot com>
startup and genericize it so it can be reused to map other tables as well:
- Add a routine to walk a list of ACPI subtables such as those used in the
APIC and SRAT tables in the MI acpi(4) driver.
- Move the routines for mapping and unmapping an ACPI table as well as
mapping the RSDT or XSDT and searching for a table with a given signature
out into acpica_machdep.c for both amd64 and i386.
- Provide lapic_disable_pmc(), lapic_enable_pmc(), and lapic_reenable_pmc()
routines in the local APIC code that the hwpmc(4) driver can use to
manage the local APIC PMC interrupt vector.
- Do not enable the local APIC PMC interrupt vector by default when
HWPMC_HOOKS is enabled. Instead, the hwpmc(4) driver explicitly
enables the interrupt when it is succesfully initialized and disables
the interrupt when it is unloaded. This avoids enabling the interrupt
on unsupported CPUs which may result in spurious NMIs.
Reported by: rnoland
Reviewed by: jkoshy
Approved by: re (kib)
MFC after: 2 weeks
has proven to have a good effect when entering KDB by using a NMI,
but it completely violates all the good rules about interrupts
disabled while holding a spinlock in other occasions. This can be the
cause of deadlocks on events where a normal IPI_STOP is expected.
* Adds an new IPI called IPI_STOP_HARD on all the supported architectures.
This IPI is responsible for sending a stop message among CPUs using a
privileged channel when disponible. In other cases it just does match a
normal IPI_STOP.
Right now the IPI_STOP_HARD functionality uses a NMI on ia32 and amd64
architectures, while on the other has a normal IPI_STOP effect. It is
responsibility of maintainers to eventually implement an hard stop
when necessary and possible.
* Use the new IPI facility in order to implement a new userend SMP kernel
function called stop_cpus_hard(). That is specular to stop_cpu() but
it does use the privileged channel for the stopping facility.
* Let KDB use the newly introduced function stop_cpus_hard() and leave
stop_cpus() for all the other cases
* Disable interrupts on CPU0 when starting the process of APs suspension.
* Style cleanup and comments adding
This patch should fix the reboot/shutdown deadlocks many users are
constantly reporting on mailing lists.
Please don't forget to update your config file with the STOP_NMI
option removal
Reviewed by: jhb
Tested by: pho, bz, rink
Approved by: re (kib)
established, OS shall flush the caches on all processors that may have
used the mapping previously. This operation is not needed if processors
support self-snooping. If not, but clflush instruction is implemented
on the CPU, series of the clflush can be used on the mapping region.
Otherwise, we have to flush the whole cache. The later operation is very
expensive, and AMD-made CPUs do not have self-snooping.
Implement cache flush for remapped region by using clflush for amd64,
when supported by CPU.
Proposed and reviewed by: alc
Approved by: re (kensmith)
dependent memory attributes:
Rename vm_cache_mode_t to vm_memattr_t. The new name reflects the
fact that there are machine-dependent memory attributes that have
nothing to do with controlling the cache's behavior.
Introduce vm_object_set_memattr() for setting the default memory
attributes that will be given to an object's pages.
Introduce and use pmap_page_{get,set}_memattr() for getting and
setting a page's machine-dependent memory attributes. Add full
support for these functions on amd64 and i386 and stubs for them on
the other architectures. The function pmap_page_set_memattr() is also
responsible for any other machine-dependent aspects of changing a
page's memory attributes, such as flushing the cache or updating the
direct map. The uses include kmem_alloc_contig(), vm_page_alloc(),
and the device pager:
kmem_alloc_contig() can now be used to allocate kernel memory with
non-default memory attributes on amd64 and i386.
vm_page_alloc() and the device pager will set the memory attributes
for the real or fictitious page according to the object's default
memory attributes.
Update the various pmap functions on amd64 and i386 that map pages to
incorporate each page's memory attributes in the mapping.
Notes: (1) Inherent to this design are safety features that prevent
the specification of inconsistent memory attributes by different
mappings on amd64 and i386. In addition, the device pager provides a
warning when a device driver creates a fictitious page with memory
attributes that are inconsistent with the real page that the
fictitious page is an alias for. (2) Storing the machine-dependent
memory attributes for amd64 and i386 as a dedicated "int" in "struct
md_page" represents a compromise between space efficiency and the ease
of MFCing these changes to RELENG_7.
In collaboration with: jhb
Approved by: re (kib)
return path only when neither thread was context switched while
executing syscall code nor syscall explicitely modified LDT or MSRs.
Save segment registers in trap handlers before interrupts are enabled,
to not allow context switches to happen before registers are saved.
Use separated byte in pcb for indication of fast/full return, since
pcb_flags are not synchronized with context switches.
The change puts back syscall microbenchmark numbers that were slowed
down after commit of the support for LDT on amd64.
Reviewed by: jeff
Tested (and tested, and tested ...) by: pho
Approved by: re (kensmith)
o add to platforms where it was missing (arm, i386, powerpc, sparc64, sun4v)
o define as "1" on amd64 and i386 where there is no restriction
o make the type returned consistent with ALIGN
o remove _ALIGNED_POINTER
o make associated comments consistent
Reviewed by: bde, imp, marcel
Approved by: re (kensmith)
- For x86, change the interrupt source method to assign an interrupt source
to a specific CPU to return an error value instead of void, thus allowing
it to fail.
- If moving an interrupt to a CPU fails due to a lack of IDT vectors in the
destination CPU, fail the request with ENOSPC rather than panicing.
- For MSI interrupts on x86 (but not MSI-X), only allow cpuset to be used
on the first interrupt in a group. Moving the first interrupt in a group
moves the entire group.
- Use the icu_lock to protect intr_next_cpu() on x86 instead of the
intr_table_lock to fix a LOR introduced in the last set of MSI changes.
- Add a new privilege PRIV_SCHED_CPUSET_INTR for using cpuset with
interrupts. Previously, binding an interrupt to a CPU only performed a
privilege check if the interrupt had an interrupt thread. Interrupts
without a thread could be bound by non-root users as a result.
- If an interrupt event's assign_cpu method fails, then restore the original
cpuset mask for the associated interrupt thread.
Approved by: re (kib)
required by video card drivers. Specifically, this change introduces
vm_cache_mode_t with an appropriate VM_CACHE_DEFAULT definition on all
architectures. In addition, this changes adds a vm_cache_mode_t parameter
to kmem_alloc_contig() and vm_phys_alloc_contig(). These will be the
interfaces for allocating mapped kernel memory and physical memory,
respectively, with non-default cache modes.
In collaboration with: jhb
This is mostly important for the multiple MSI message case where the
IDT vectors for the entire group need to be allocated together. This
also restores the assumptions made by the PCI bus code that it could
invoke PCIB_MAP_MSI() once MSI vectors were allocated.
- To avoid whiplash with CPU assignments, change the way that CPUs are
assigned to interrupt sources on activation. Instead of assigning the
CPU via pic_assign_cpu() before calling enable_intr(), allow the
different interrupt source drivers to ask the MD interrupt code which
CPU to use when they allocate an IDT vector. I/O APIC interrupt pins
do this in their pic_enable_intr() routines giving the same behavior as
before. MSI sources do it when the IDT vectors are allocated during
msi_alloc() and msix_alloc().
- Change the intr_table_lock from an sx lock to a mutex.
Tested by: rnoland
With the arrival of 128+ cores it is necessary to handle more than that.
One of the first thing to change is the support for cpumask_t that needs
to handle more than 32 bits masking (which happens now). Some places,
however, still assume that cpumask_t is a 32 bits mask.
Fix that situation by using always correctly cpumask_t when needed.
While here, remove the part under STOP_NMI for the Xen support as it
is broken in any case.
Additively make ipi_nmi_pending as static.
Reviewed by: jhb, kmacy
Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
- For CPUs that only support MCE (the machine check exception) but not MCA
(i.e. Pentium), all this does is print out the value of the machine check
registers and then panic when a machine check exception occurs.
- For CPUs that support MCA (the machine check architecture), the support is
a bit more involved.
- First, there is limited support for decoding the CPU-independent MCA
error codes in the kernel, and the kernel uses this to output a short
description of any machine check events that occur.
- When a machine check exception occurs, all of the MCx banks on the
current CPU are scanned and any events are reported to the console
before panic'ing.
- To catch events for correctable errors, a periodic timer kicks off a
task which scans the MCx banks on all CPUs. The frequency of these
checks is controlled via the "hw.mca.interval" sysctl.
- Userland can request an immediate scan of the MCx banks by writing
a non-zero value to "hw.mca.force_scan".
- If any correctable events are encountered, the appropriate details
are stored in a 'struct mca_record' (defined in <machine/mca.h>).
The "hw.mca.count" is a count of such records and each record may
be queried via the "hw.mca.records" tree by specifying the record
index (0 .. count - 1) as the next name in the MIB similar to using
PIDs with the kern.proc.* sysctls. The idea is to export machine
check events to userland for more detailed processing.
- The periodic timer and hw.mca sysctls are only present if the CPU
supports MCA.
Discussed with: emaste (briefly)
MFC after: 1 month
and hide it inside of atrtc driver. Add new tunable hint.atrtc.0.clock
controlling it. Setting it to 0 disables using RTC clock as stat-/
profclock sources.
Teach i386 and amd64 SMP platforms to emulate stat-/profclocks using i8254
hardclock, when LAPIC and RTC clocks are disabled.
This allows to reduce global interrupt rate of idle system down to about
100 interrupts per core, permitting C3 and deeper C-states provide maximum
CPU power efficiency.
topology of nehalem/corei7 based systems.
- Remove the cpu_cores/cpu_logical detection from identcpu.
- Describe the layout of the system in cpu_mp_announce().
Sponsored by: Nokia
a fair number of static data structures, making this an unlikely
option to try to change without also changing source code. [1]
Change default cache line size on ia64, sparc64, and sun4v to 128
bytes, as this was what rtld-elf was already using on those
platforms. [2]
Suggested by: bde [1], jhb [2]
MFC after: 2 weeks
CACHE_LINE_SIZE constant. These constants are intended to
over-estimate the cache line size, and be used at compile-time
when a run-time tuning alternative isn't appropriate or
available.
Defaults for all architectures are 64 bytes, except powerpc
where it is 128 bytes (used on G5 systems).
MFC after: 2 weeks
Discussed on: arch@
- Do not iterate int 15h, function e820h twice. Instead, we use STAILQ to
store each return buffer and copy all at once.
- Export optional extended attributes defined in ACPI 3.0 as separate
metadata. Currently, there are only two bits defined in the specification.
For example, if the descriptor has extended attributes and it is not
enabled, it has to be ignored by OS. We may implement it in the kernel
later if it is necessary and proven correct in reality.
- Check return buffer size strictly as suggested in ACPI 3.0.
Reviewed by: jhb
Remove a hack to generate more efficient code for port numbers below
0x100, which has been obsolete for at least ten years, because GCC has
an asm constraint to specify that.
Submitted by: Christoph Mallon <christoph mallon gmx de>
Most compilers nowadays (including GCC) are smart enough to know what's
going on and generate more efficient code anyway.
Submitted by: Christoph Mallon <christoph.mallon@gmx.de>
Because the "c" input constaint is used, the compiler will already place
the MSR_FSBASE/MSR_GSBASE constants in ecx. Using __asm("ecx") makes
LLVM crash. Even though this is also an LLVM bug, we'd better remove the
unnecessary GCCism as well.
Submitted by: Christoph Mallon <christoph.mallon@gmx.de>
the kernel on amd64. Fill and read segment registers for mcontext and
signals. Handle traps caused by restoration of the
invalidated selectors.
Implement user-mode creation and manipulation of the process-specific
LDT descriptors for amd64, see sysarch(2).
Implement support for TSS i/o port access permission bitmap for amd64.
Context-switch LDT and TSS. Do not save and restore segment registers on
the context switch, that is handled by kernel enter/leave trampolines
now. Remove segment restore code from the signal trampolines for
freebsd/amd64, freebsd/ia32 and linux/i386 for the same reason.
Implement amd64-specific compat shims for sysarch.
Linuxolator (temporary ?) switched to use gsbase for thread_area pointer.
TODO:
Currently, gdb is not adapted to show segment registers from struct reg.
Also, no machine-depended ptrace command is added to set segment
registers for debugged process.
In collaboration with: pho
Discussed with: peter
Reviewed by: jhb
Linuxolator tested by: dchagin
Reorder amd64 gdt descriptors so that user-accessible selectors are the
same as on i386. At least Wine hard-codes this into the binary.
In collaboration with: pho
Reviewed by: jhb
Provides i386/freebsd API-compatible definitions for the argument
structures of the above sysarch commands. struct i386_ioperm_args
definition is ABI-compatible.
In collaboration with: pho
Reviewed by: jhb
To keep these structures ABI-compatible, half the size of r_trapno,
r_err, mc_trapno, mc_flags.
Add fsbase and gsbase to mcontext on both amd64 and i386.
Add flags to amd64 mcontext to indicate that it contains valid segments
or bases.
In collaboration with: pho
Discussed with: peter
Reviewed by: jhb
stored in the pmap is from the direct map region. The two exceptions have
been the kernel pmap and the swapper's pmap. These pmaps have used a
kernel virtual address established by pmap_bootstrap() for their shared
pml4 page table page. However, there is no reason not to use the direct
map for these pmaps as well.
to the full path of the image that is being executed.
Increase AT_COUNT.
Remove no longer true comment about types used in Linux ELF binaries,
listed types contain FreeBSD-specific entries.
Reviewed by: kan
This code is heavily inspired by Takanori Watanabe's experimental SMP patch
for i386 and large portion was shamelessly cut and pasted from Peter Wemm's
AP boot code.
ABIs:
- Store the FPU initial control word in the pcb for each thread.
- When first using the FPU, load the initial control word after restoring
the clean state if it is not the standard control word.
- Provide a correct control word for Linux/i386 binaries under
FreeBSD/amd64.
- Adjust the control word returned for fpugetregs()/npxgetregs() when a
thread hasn't used the FPU yet to reflect the real initial control
word for the current ABI.
- The Linux/i386 ABI for FreeBSD/i386 now properly sets the right control
word instead of trashing whatever the current state of the FPU is.
Reviewed by: bde
- fpudna() always returned 1 since amd64 CPUs always have FPUs. Change
the function to return void and adjust the calling code in trap() to
assume the return 1 case is the only case.
- Remove fpu_cleanstate_ready as it is always true when it is tested.
Also, only initialize fpu_cleanstate when fpuinit() is called on the BSP.
Reviewed by: bde
mode.
- Make the NMI handler run on its own stack (TSS_IST2).
- Store the GSBASE value for each CPU just before the start of
each NMI stack, permitting efficient retrieval using %rsp-relative
addressing.
- For NMIs taken from kernel mode, program MSR_GSBASE explicitly
since one or both of MSR_GSBASE and MSR_KGSBASE can be potentially
invalid. The current contents of MSR_GSBASE are saved and restored
at exit.
- For NMIs handled from user mode, continue to use 'swapgs' to
load the per-CPU GSBASE.
Reviewed by: jeff
Debugging help: jeff
Tested by: gnn, Artem Belevich <artemb at gmail dot com>
more irqs as we have more cpus. This is principally useful on systems
with msi devices which may want many irqs per-cpu.
Discussed with: jhb
Sponsored by: Nokia
32-bit processes. The value matches the initial setting used by
FreeBSD/i386. Otherwise, 32-bit binaries using floating point would use
a slightly different initial state when run on FreeBSD/amd64.
MFC after: 1 week
and Core Duo), models 0xF (Core2), model 0x17 (Core2Extreme) and
model 0x1C (Atom).
In these CPUs, the actual numbers, kinds and widths of PMCs present
need to queried at run time. Support for specific "architectural"
events also needs to be queried at run time.
Model 0xE CPUs support programmable PMCs, subsequent CPUs
additionally support "fixed-function" counters.
- Use event names that are close to vendor documentation, taking in
account that:
- events with identical semantics on two or more CPUs in this family
can have differing names in vendor documentation,
- identical vendor event names may map to differing events across
CPUs,
- each type of CPU supports a different subset of measurable
events.
Fixed-function and programmable counters both use the same vendor
names for events. The use of a class name prefix ("iaf-" or
"iap-" respectively) permits these to be distinguished.
- In libpmc, refactor pmc_name_of_event() into a public interface
and an internal helper function, for use by log handling code.
- Minor code tweaks: staticize a global, freshen a few comments.
Tested by: gnn