Rather than trying to KASSERT for callers that invoke this on
IO tags, either do nothing (for write_8) or return ~0 (for read_8).
Using KASSERT here just makes bus.h too messy, both by
polluting bus.h with systm.h (for any number of drivers that include
bus.h without first including systm.h) and for ports that use bus.h
directly (e.g. libpciaccess), as reported by zeising@.
Also don't try to implement all of the other bus_space functions for
8 byte access since realistically only these two are needed for some
devices that expose 64-bit memory-mapped registers.
Put the amd64-specific functions here rather than sys/amd64/include/bus.h
so that we can keep this header unified for x86, as requested by mdf@
and tijl@.
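
A minimal sketch of the behaviour described above, assuming simplified
stand-ins for the bus.h types. The sketch_* names and tag values are
hypothetical and are not the identifiers used in the header:

```c
#include <stdint.h>

/* Hypothetical stand-ins for the bus_space tag/handle/size types. */
typedef int sketch_tag_t;
typedef uintptr_t sketch_handle_t;
typedef uintptr_t sketch_size_t;

#define SKETCH_TAG_IO	0	/* port I/O space: no 8 byte access */
#define SKETCH_TAG_MEM	1	/* memory-mapped space */

/* 8 byte read: memory tags are dereferenced, IO tags return ~0. */
static inline uint64_t
sketch_read_8(sketch_tag_t tag, sketch_handle_t handle, sketch_size_t offset)
{
	if (tag == SKETCH_TAG_IO)
		return (~(uint64_t)0);
	return (*(volatile uint64_t *)(handle + offset));
}

/* 8 byte write: memory tags are stored to, IO tags are silently ignored. */
static inline void
sketch_write_8(sketch_tag_t tag, sketch_handle_t handle, sketch_size_t offset,
    uint64_t value)
{
	if (tag == SKETCH_TAG_IO)
		return;
	*(volatile uint64_t *)(handle + offset) = value;
}
```

No assertion and no extra header are needed for the IO case, which is
the point of the change.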
Submitted by: Carl Delsey <carl.r.delsey@intel.com>
MFC after: 3 days
introduced with the IvyBridge CPUs. Provide the definitions for new
bits in CR3 and CR4 registers.
Tested by: avg, Michael Moll <kvedulv@kvedulv.de>
MFC after: 2 weeks
instruction loads/stores at will.
The macro __compiler_membar() is currently supported for both gcc and
clang; kernel compilation will fail with any other compiler.
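
As an illustration, a compiler-only barrier for gcc and clang can be
sketched as below. The sketch_* names are hypothetical. The empty asm
statement with a "memory" clobber is the usual idiom, assumed here
rather than quoted from the tree:

```c
#if defined(__GNUC__) || defined(__clang__)
/* Empty asm with a "memory" clobber: the compiler may not move
 * loads or stores across it.  No CPU fence instruction is emitted. */
#define sketch_compiler_membar()	__asm__ __volatile__("" : : : "memory")
#else
#error "no compiler barrier known for this compiler"
#endif

static int sketch_data;
static int sketch_ready;

/* Store the payload, then the flag; the barrier keeps the compiler
 * from reordering the two stores. */
static void
sketch_publish(int value)
{
	sketch_data = value;
	sketch_compiler_membar();
	sketch_ready = 1;
}
```

This constrains only the compiler. Ordering as seen by other CPUs still
needs the appropriate hardware barrier.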
Reviewed by: bde, kib
Discussed with: dim, theraven
MFC after: 2 weeks
mostly meets the guidelines set by the Intel SDM:
1. We use XRSTOR and XSAVE from the same CPL using the same linear
address for the store area
2. Contrary to the recommendations, we cannot zero the FPU save area
for a new thread, since fork semantics require a copy of the
previous state. This recommendation also seems to contradict the
advice in item 6.
3. We do use XSAVEOPT in the context switch code only, and the area
for XSAVEOPT already always contains the data saved by XSAVE.
4. We do not modify the save area between XRSTOR, when the area is
loaded into FPU context, and XSAVE. We always spill the FPU context
into the save area and start emulation when directly writing into
the FPU context.
5. We do not use segmented addressing to access save area, or rather,
always address it using %ds basing.
6. XSAVEOPT is only ever executed on an area which was previously
loaded with XRSTOR, since the context switch code checks for FPU use
by the outgoing thread before saving, and a thread which stopped
emulation forcibly gets its context loaded with XRSTOR.
7. The PCB cannot be paged out while FPU emulation is turned off, since
the stack of the executing thread is never swapped out.
The context switch code is patched to issue XSAVEOPT instead of XSAVE
if supported. This approach eliminates one conditional in the context
switch code, which would be needed otherwise.
For user-visible machine context to have proper data, fpugetregs()
checks for unsaved extension blocks and manually copies pristine FPU
state into them, according to the description provided by CPUID leaf
0xd.
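
The copy step can be sketched as follows. This is an illustration
under stated assumptions, not the fpugetregs() code: the sk_* names are
hypothetical, the per-component offsets and sizes are passed in as a
table instead of being read from CPUID leaf 0xd, and the pristine state
is whatever buffer the caller supplies.

```c
#include <stdint.h>
#include <string.h>

#define SK_XSAVE_HDR_OFF	512	/* header follows the legacy area */

/* Hypothetical layout entry, as CPUID leaf 0xd sub-leaf i reports it. */
struct sk_xstate_desc {
	uint32_t offset;	/* from the start of the save area */
	uint32_t size;
};

/*
 * For every enabled extension block (bit 2 and up) that the header
 * bitmap says was not saved, copy the pristine state into the area.
 */
static void
sk_fill_unsaved(uint8_t *area, const uint8_t *pristine,
    uint64_t enabled_mask, const struct sk_xstate_desc *desc, int ndesc)
{
	uint64_t bv, bit;
	int i;

	/* The saved-state bitmap is the first 8 bytes of the header. */
	memcpy(&bv, area + SK_XSAVE_HDR_OFF, sizeof(bv));
	for (i = 2; i < ndesc && i < 64; i++) {
		bit = 1ULL << i;
		if ((enabled_mask & bit) == 0 || (bv & bit) != 0)
			continue;
		memcpy(area + desc[i].offset, pristine + desc[i].offset,
		    desc[i].size);
	}
}
```

Blocks whose bit is set in the bitmap are left alone, since they
already hold data written by the save instruction.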
MFC after: 1 month
This is required for ARM EABI. Section 7.1.1 of the Procedure Call
Standard for the ARM Architecture (AAPCS) defines wchar_t as either an
unsigned int or an unsigned short with the former preferred.
Because of this requirement we need to move the definition of __wchar_t to
a machine dependent header. It also cleans up the macros defining the limits
of wchar_t by defining __WCHAR_MIN and __WCHAR_MAX in the same machine
dependent header then using them to define WCHAR_MIN and WCHAR_MAX
respectively.
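
The split can be pictured roughly as below. The SK_* and sk_* names and
the SK_ARM_EABI switch are hypothetical stand-ins for the reserved
names and the per-architecture headers, used so the fragment compiles
in userland:

```c
#include <limits.h>

/* "Machine dependent header": the type and its limits live together. */
#if defined(SK_ARM_EABI)
typedef unsigned int sk_wchar_t;	/* AAPCS prefers unsigned int */
#define SK_MD_WCHAR_MIN	0U
#define SK_MD_WCHAR_MAX	UINT_MAX
#else
typedef int sk_wchar_t;
#define SK_MD_WCHAR_MIN	INT_MIN
#define SK_MD_WCHAR_MAX	INT_MAX
#endif

/* "Machine independent header": the public limits just forward. */
#define SK_WCHAR_MIN	SK_MD_WCHAR_MIN
#define SK_WCHAR_MAX	SK_MD_WCHAR_MAX
```

With this shape the machine independent header no longer needs to know
whether the type is signed.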
Discussed with: bde
usermode, using the shared page. The structures and functions have a
vdso prefix, to indicate the intended future location of the code.
The versioned per-algorithm data is exported in the format of struct
vdso_timehands, which mostly repeats the content of in-kernel struct
timehands. Usermode reading of the structure can be lockless.
A compatibility export for 32bit processes on a 64bit host is also
provided. The kernel also tells usermode which timecounter is
currently in use, so that libc can fall back to the syscall if the
configured timecounter is unknown to the usermode code.
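
A lockless reader of versioned data of this kind can be sketched with
a generation counter. This is an assumption-laden illustration, not the
libc code: struct sk_timehands, its fields and the enabled flag are
hypothetical, and C11 atomics stand in for the kernel primitives.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, much reduced analogue of the exported structure. */
struct sk_timehands {
	atomic_uint gen;		/* 0 while the writer is updating */
	_Atomic uint64_t sec;
	_Atomic uint64_t frac;
};

/*
 * Returns 0 with a consistent snapshot, or -1 when the caller should
 * fall back to the syscall (mechanism off or algorithm unknown).
 */
static int
sk_read_time(struct sk_timehands *th, int enabled, uint64_t *sec,
    uint64_t *frac)
{
	unsigned int gen;

	if (!enabled)
		return (-1);
	do {
		gen = atomic_load_explicit(&th->gen, memory_order_acquire);
		*sec = atomic_load_explicit(&th->sec, memory_order_relaxed);
		*frac = atomic_load_explicit(&th->frac, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		/* Retry if an update was in progress or happened meanwhile. */
	} while (gen == 0 ||
	    gen != atomic_load_explicit(&th->gen, memory_order_relaxed));
	return (0);
}
```

The reader takes no lock. It only re-reads when the generation changed
underneath it.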
The shared data updates are initiated both from tc_windup(), where
a fast task is queued to do the update, and from the sysctl handlers
which change the timecounter. A manual override switch,
kern.timecounter.fast_gettime, allows turning the mechanism off.
Only x86 architectures export the real algorithm data, and there, only
for the tsc timecounter. The HPET counters page could be exported as
well, but I prefer not to glue the kernel and libc ABI together any
further until a proper vdso-based solution is developed.
Minimal stubs necessary for non-x86 architectures to still compile
are provided.
Discussed with: bde
Reviewed by: jhb
Tested by: flo
MFC after: 1 month
an uncorrected ECC error tends to fire on all CPUs in a package
simultaneously and the current printf hacks are not sufficient to make
the messages legible. Instead, use the existing mca_lock spinlock to
serialize calls to mca_log() and change the machine check code to panic
directly when an unrecoverable error is encountered rather than falling
back to a trap_fatal() call in trap() (which adds nearly a screen-full of
logging messages that aren't useful for machine checks).
MFC after: 2 weeks
that revision, the bswapXX_const() macros were renamed to bswapXX_gen().
Also, bswap64_gen() was implemented as two calls to bswap32(), and
similarly, bswap32_gen() as two calls to bswap16(). This mainly helps
our base gcc to produce more efficient assembly.
However, the arguments are not properly masked, which results in the
wrong value being calculated in some instances. For example,
bswap32(0x12345678) returns 0x7c563412, and bswap64(0x123456789abcdef0)
returns 0xfcdefc9a7c563412.
Fix this by appropriately masking the arguments to bswap16() in
bswap32_gen(), and to bswap32() in bswap64_gen(). This should also
silence warnings from clang.
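
The effect of the missing mask can be reproduced with simplified
macros. The SK_* and sk_* names are hypothetical and the bodies are a
reduced model of the macros discussed, not a copy of endian.h:

```c
#include <stdint.h>

/* 16-bit swap that trusts its argument to fit in 16 bits. */
#define SK_BSWAP16(x)	((uint16_t)((x) << 8 | (x) >> 8))

/* Unmasked: the low half is passed with the high bits still set. */
#define SK_BSWAP32_BAD(x) \
	(((uint32_t)SK_BSWAP16(x) << 16) | SK_BSWAP16((x) >> 16))

/* Masked: only the low 16 bits reach the first 16-bit swap. */
#define SK_BSWAP32_OK(x) \
	(((uint32_t)SK_BSWAP16((x) & 0xffff) << 16) | SK_BSWAP16((x) >> 16))

static inline uint32_t
sk_bswap32_bad(uint32_t x)
{
	return (SK_BSWAP32_BAD(x));
}

static inline uint32_t
sk_bswap32_ok(uint32_t x)
{
	return (SK_BSWAP32_OK(x));
}
```

For 0x12345678 the unmasked variant yields 0x7c563412, matching the
value quoted above: bits from 0x34 leak into the swapped low half. The
masked variant yields 0x78563412.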
Submitted by: jh
revision has two problems:
- It can produce worse code with both clang and gcc.
- It doesn't fix the actual issue introduced in r232721, which will be
fixed in the next commit.
Submitted by: bde, tijl and jh
Pointy hat to: dim
recent changes in sys/x86/include/endian.h:
sys/dev/dcons/dcons.c:190:15: error: implicit conversion from '__uint32_t' (aka 'unsigned int') to '__uint16_t' (aka 'unsigned short') changes value from 1684238190 to 28526 [-Werror,-Wconstant-conversion]
buf->magic = ntohl(DCONS_MAGIC);
^~~~~~~~~~~~~~~~~~
sys/sys/param.h:306:18: note: expanded from:
#define ntohl(x) __ntohl(x)
^
./x86/endian.h:128:20: note: expanded from:
#define __ntohl(x) __bswap32(x)
^
./x86/endian.h:78:20: note: expanded from:
__bswap32_gen((__uint32_t)(x)) : __bswap32_var(x))
^
./x86/endian.h:68:26: note: expanded from:
(((__uint32_t)__bswap16(x) << 16) | __bswap16((x) >> 16))
^
./x86/endian.h:75:53: note: expanded from:
__bswap16_gen((__uint16_t)(x)) : __bswap16_var(x)))
~~~~~~~~~~~~~ ^
This is because the __bswapXX_gen() macros (for x86) call the regular
__bswapXX() macros. Since the __bswapXX_gen() variants are only called
when their arguments are constant, there is no need to do that constancy
check recursively. Also, it causes the above error with clang.
Fix it by calling __bswap16_gen() from __bswap32_gen(), and similarly,
__bswap32_gen() from __bswap64_gen().
While here, add extra parentheses around the __bswap16_gen() macro
expansion, to prevent unexpected side effects.
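
As a general illustration of why a macro expansion is wrapped in
parentheses (simplified, hypothetical macros, not the text of
endian.h), compare how the two forms below combine with a following
operator:

```c
#include <stdint.h>

/* Expansion left open: a following operator binds into its tail. */
#define SK_MIX_LOOSE(x)	(x) << 8 | (x) >> 8

/* Expansion closed: the macro behaves as a single operand. */
#define SK_MIX_TIGHT(x)	((x) << 8 | (x) >> 8)

/* Parsed as (x << 8) | ((x >> 8) & 0xff): the mask hits one half only. */
static inline uint32_t
sk_low_byte_loose(uint32_t x)
{
	return (SK_MIX_LOOSE(x) & 0xff);
}

/* Parsed as ((x << 8) | (x >> 8)) & 0xff, as the caller intended. */
static inline uint32_t
sk_low_byte_tight(uint32_t x)
{
	return (SK_MIX_TIGHT(x) & 0xff);
}
```

For x = 0x1234 the loose form gives 0x123412 and the tight form gives
0x12.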
segments.h to a new x86 segments.h.
Add __packed attribute to some structs (just to be sure).
Also make it clear that i386 GDT and LDT entries are used in ia64 code.
reg.h with stubs.
The tREGISTER macros are only made visible on i386. These macros are
deprecated and should not be available on amd64.
The i386 and amd64 versions of struct reg have been renamed to struct
__reg32 and struct __reg64. During compilation either __reg32 or __reg64
is defined as reg depending on the machine architecture. On amd64 the i386
struct is also available as struct reg32 which is used in COMPAT_FREEBSD32
code.
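
One way to picture the renaming, assuming a preprocessor-based scheme.
The sk_* names and the two-field structs are hypothetical, and the real
header may arrange this differently:

```c
#include <stdint.h>

/* Pick which of the two structs becomes the native "reg". */
#if defined(__x86_64__) || defined(__amd64__)
#define SK_IS_AMD64	1
#define sk__reg32	sk_reg32	/* i386 layout kept for compat code */
#define sk__reg64	sk_reg
#else
#define SK_IS_AMD64	0
#define sk__reg32	sk_reg
#define sk__reg64	sk_reg64
#endif

/* Both layouts are always declared; only the names differ per arch. */
struct sk__reg32 {
	uint32_t r_ip;
	uint32_t r_sp;
};

struct sk__reg64 {
	uint64_t r_ip;
	uint64_t r_sp;
};
```

Code that says struct sk_reg gets the native layout, while 64-bit
compat code can still name the 32-bit layout explicitly.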
Most of compat/ia32/ia32_reg.h is now IA64 only.
Reviewed by: kib (previous version)
Remove FPU types from compat/ia32/ia32_reg.h that are no longer needed.
Create machine/npx.h on amd64 to allow compiling i386 code that uses
this header.
The original npx.h and fpu.h define struct envxmm differently. Both
definitions have been included in the new x86 header as struct __envxmm32
and struct __envxmm64. During compilation either __envxmm32 or __envxmm64
is defined as envxmm depending on machine architecture. On amd64 the i386
struct is also available as struct envxmm32.
Reviewed by: kib
didn't already have them. This is because the ternary expression will
have type int, due to the integer promotions applied by the Usual
Arithmetic Conversions. Such casts are not needed for the 32 and 64
bit variants.
While here, add additional parentheses around the x86 variant, to
protect against unintended consequences.
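
The promotion is easy to observe with sizeof. A small sketch with
hypothetical SK_* and sk_* names, assuming a compiler that provides
__builtin_constant_p (gcc or clang):

```c
#include <stdint.h>

static inline uint16_t
sk_swap16_var(uint16_t x)
{
	return ((uint16_t)(x << 8 | x >> 8));
}

#define SK_SWAP16_GEN(x)	((uint16_t)((x) << 8 | (x) >> 8))

/* Both arms are uint16_t, yet the ?: result is promoted to int. */
#define SK_SWAP16_NOCAST(x) \
	(__builtin_constant_p(x) ? \
	    SK_SWAP16_GEN((uint16_t)(x)) : sk_swap16_var(x))

/* The outer cast restores the 16-bit type. */
#define SK_SWAP16(x)	((uint16_t)SK_SWAP16_NOCAST(x))
```

sizeof(SK_SWAP16_NOCAST(v)) equals sizeof(int), while
sizeof(SK_SWAP16(v)) equals sizeof(uint16_t). Wider operands are
unaffected because they are not subject to integer promotion.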
MFC after: 2 weeks
- Remove extern "C". There are no functions with external linkage here. [1]
- Rename bswapNN_const(x) to bswapNN_gen(x) to indicate that these macros
are generic implementations that can take non-constant arguments. [1]
- Split up __GNUCLIKE_ASM && __GNUCLIKE_BUILTIN_CONSTANT_P and deal with
each separately.
- Replace _LP64 with __amd64__ because asm instructions are machine
dependent, not ABI dependent.
Submitted by: bde [1]
Reviewed by: bde
amd64/i386/pc98 ptrace.h with stubs.
For amd64 PT_GETXSTATE and PT_SETXSTATE have been redefined to match the
i386 values. The old values are still supported but should no longer be
used.
Reviewed by: kib
amd64/i386/pc98 endian.h with stubs.
In __bswap64_const(x) the conflict between 0xffUL and 0xffULL has been
resolved by reimplementing the macro in terms of __bswap32(x). As a side
effect __bswap64_var(x) is now implemented using two bswap instructions on
i386 and should be much faster. __bswap32_const(x) has been reimplemented
in terms of __bswap16(x) for consistency.
resource allocation on x86 platforms:
- Add a new helper API that Host-PCI bridge drivers can use to restrict
resource allocation requests to a set of address ranges for different
resource types.
- For the ACPI Host-PCI bridge driver, use Producer address range resources
in _CRS to enumerate valid address ranges for a given Host-PCI bridge.
This can be disabled by including "hostres" in the debug.acpi.disabled
tunable.
- For the MPTable Host-PCI bridge driver, use entries in the extended
MPTable to determine the valid address ranges for a given Host-PCI
bridge. This required adding code to parse extended table entries.
Similar to the new PCI-PCI bridge driver, these changes are only enabled
if the NEW_PCIB kernel option is enabled (which is enabled by default on
amd64 and i386).
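
The core check such a helper has to perform can be sketched as below.
This is only an illustration: struct sk_range and sk_host_decodes() are
hypothetical names, and the real helper API operates on bus resources
rather than on a bare array.

```c
#include <stddef.h>
#include <stdint.h>

/* One address range decoded by a Host-PCI bridge, bounds inclusive. */
struct sk_range {
	uint64_t start;
	uint64_t end;
};

/* Returns 1 when [start, end] lies entirely inside one decoded range. */
static int
sk_host_decodes(const struct sk_range *ranges, size_t n, uint64_t start,
    uint64_t end)
{
	size_t i;

	if (start > end)
		return (0);
	for (i = 0; i < n; i++) {
		if (start >= ranges[i].start && end <= ranges[i].end)
			return (1);
	}
	return (0);
}
```

A request that straddles the edge of a range is rejected, which is how
an allocation gets confined to what the bridge actually decodes.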
Approved by: re (kib)