Commit Graph

2156 Commits

Author SHA1 Message Date
fabient
5edfb77dd3 Add software PMC support.
New kernel events can be added at various location for sampling or counting.
This will for example allow easy system profiling whatever the processor is
with known tools like pmcstat(8).

Simultaneous usage of software PMC and hardware PMC is possible, for example
looking at the lock acquire failure, page fault while sampling on
instructions.

Sponsored by: NETASQ
MFC after:	1 month
2012-03-28 20:58:30 +00:00
alc
e02fd6b842 Handle spurious page faults that may occur in no-fault sections of the
kernel.

When access restrictions are added to a page table entry, we flush the
corresponding virtual address mapping from the TLB.  In contrast, when
access restrictions are removed from a page table entry, we do not
flush the virtual address mapping from the TLB.  This is exactly as
recommended in AMD's documentation.  In effect, when access
restrictions are removed from a page table entry, AMD's MMUs will
transparently refresh a stale TLB entry.  In short, this saves us from
having to perform potentially costly TLB flushes.  In contrast,
Intel's MMUs are allowed to generate a spurious page fault based upon
the stale TLB entry.  Usually, such spurious page faults are handled
by vm_fault() without incident.  However, when we are executing
no-fault sections of the kernel, we are not allowed to execute
vm_fault().  This change introduces special-case handling for spurious
page faults that occur in no-fault sections of the kernel.

In collaboration with:	kib
Tested by:		gibbs (an earlier version)

I would also like to acknowledge Hiroki Sato's assistance in
diagnosing this problem.

MFC after:	1 week
2012-03-22 04:52:51 +00:00
tijl
fee85d364e Copy amd64 sysarch.h to x86 and merge with i386 sysarch.h. Replace
amd64/i386/pc98 sysarch.h with stubs.
2012-03-19 21:57:31 +00:00
tijl
42f636fdcd Copy i386 specialreg.h to x86 and merge with amd64 specialreg.h. Replace
amd64/i386/pc98 specialreg.h with stubs.
2012-03-19 21:34:11 +00:00
tijl
ea1f38e564 Copy i386 psl.h to x86 and replace amd64/i386/pc98 psl.h with stubs. 2012-03-19 21:29:57 +00:00
tijl
2a126f1975 Move userland bits (and some common kernel bits) from amd64 and i386
segments.h to a new x86 segments.h.

Add __packed attribute to some structs (just to be sure).
Also make it clear that i386 GDT and LDT entries are used in ia64 code.
2012-03-19 21:24:50 +00:00
tijl
2bf580ea66 Copy i386 reg.h to x86 and merge with amd64 reg.h. Replace i386/amd64/pc98
reg.h with stubs.

The tREGISTER macros are only made visible on i386. These macros are
deprecated and should not be available on amd64.

The i386 and amd64 versions of struct reg have been renamed to struct
__reg32 and struct __reg64. During compilation either __reg32 or __reg64
is defined as reg depending on the machine architecture. On amd64 the i386
struct is also available as struct reg32 which is used in COMPAT_FREEBSD32
code.

Most of compat/ia32/ia32_reg.h is now IA64 only.

Reviewed by:	kib (previous version)
2012-03-18 19:06:38 +00:00
tijl
f36691a4af Use exact width integer types in amd64/i386 reg.h to prepare for a merge.
The only real change is replacing long with int on i386.
2012-03-18 18:44:42 +00:00
tijl
9c671fcaca Move userland bits of i386 npx.h and amd64 fpu.h to x86 fpu.h.
Remove FPU types from compat/ia32/ia32_reg.h that are no longer needed.
Create machine/npx.h on amd64 to allow compiling i386 code that uses
this header.

The original npx.h and fpu.h define struct envxmm differently. Both
definitions have been included in the new x86 header as struct __envxmm32
and struct __envxmm64. During compilation either __envxmm32 or __envxmm64
is defined as envxmm depending on machine architecture. On amd64 the i386
struct is also available as struct envxmm32.

Reviewed by:	kib
2012-03-16 20:24:30 +00:00
tijl
9d9a56e6ef Use exact width integer types instead of long in struct env87 in
preparation to merge with amd64.

Reviewed by:	kib
2012-03-16 19:42:39 +00:00
tijl
4b85eeee48 Remove prototypes of _amd64_get_fsbase et al. The functions were removed
in r145571.
2012-03-16 10:10:17 +00:00
jhb
d6f48ea8cc Allow a native i386 kernel to be built with 'nodevice atpic'. Just as on
amd64, if 'device isa' is present quiesce the 8259A's during boot and
resume from suspend.

While here, be more selective on amd64 about which kernel configurations
need elcr.c.

MFC after:	2 weeks
2012-03-09 19:42:48 +00:00
tijl
ab137477da Copy amd64 ptrace.h to x86 and merge with i386 ptrace.h. Replace
amd64/i386/pc98 ptrace.h with stubs.

For amd64 PT_GETXSTATE and PT_SETXSTATE have been redefined to match the
i386 values. The old values are still supported but should no longer be
used.

Reviewed by:	kib
2012-03-04 20:24:28 +00:00
tijl
bb4b86d0ab Copy amd64 trap.h to x86 and replace amd64/i386/pc98 trap.h with stubs. 2012-03-04 14:12:57 +00:00
tijl
28848902c7 Copy amd64 float.h to x86 and merge with i386 float.h. Replace
amd64/i386/pc98 float.h with stubs.
2012-03-04 14:00:32 +00:00
tijl
2db0395534 Copy amd64 stdarg.h to x86 and replace amd64/i386/pc98 stdarg.h with stubs. 2012-02-28 22:30:58 +00:00
tijl
e5cc7570cf Copy amd64 setjmp.h to x86 and replace amd64/i386/pc98 setjmp.h with stubs. 2012-02-28 22:17:52 +00:00
tijl
bfd7a07500 Copy amd64 endian.h to x86 and merge with i386 endian.h. Replace
amd64/i386/pc98 endian.h with stubs.

In __bswap64_const(x) the conflict between 0xffUL and 0xffULL has been
resolved by reimplementing the macro in terms of __bswap32(x). As a side
effect __bswap64_var(x) is now implemented using two bswap instructions on
i386 and should be much faster. __bswap32_const(x) has been reimplemented
in terms of __bswap16(x) for consistency.
2012-02-28 19:39:54 +00:00
tijl
738f5859fb Copy amd64 _stdint.h to x86 and merge with i386 _stdint.h. Replace
amd64/i386/pc98 _stdint.h with stubs.
2012-02-28 18:38:33 +00:00
tijl
86b4719ed5 Copy amd64 _limits.h to x86 and merge with i386 _limits.h. Replace
amd64/i386/pc98 _limits.h with stubs.
2012-02-28 18:24:28 +00:00
tijl
3b29ed2286 Copy amd64 _types.h to x86 and merge with i386 _types.h. Replace existing
amd64/i386/pc98 _types.h with stubs.
2012-02-28 18:15:28 +00:00
jhb
c5aa114d99 Remove completely duplicate '#ifdef XEN' section. 2012-02-27 17:30:21 +00:00
jhb
8e4f58c4d2 Resort the IDT_DTRACE_RET constant after it was changed to be less than
IDT_SYSCALL.
2012-02-27 17:29:37 +00:00
ken
15e9b1348e Fix the netback driver build for i386.
netback.c:	Add missing VM includes.

xen/xenvar.h,
xen/xenpmap.h:	Move some XENHVM macros from <machine/xen/xenpmap.h> to
		<machine/xen/xenvar.h> on i386 to match the amd64 headers.

conf/files:	Add netback to the build.

Submitted by:	jhb
MFC after:	3 days
2012-02-02 17:54:35 +00:00
kib
82e1b5ac86 Synchronize the struct sigcontext definitions on x86 with mcontext_t.
Pointed out by:	bde
MFC after:	1 month
2012-01-30 07:51:52 +00:00
das
9feb719605 Add C11 macros describing subnormal numbers to float.h.
Reviewed by:	bde
2012-01-23 06:36:41 +00:00
kib
361bfae5c2 Add support for the extended FPU states on amd64, both for native
64bit and 32bit ABIs.  As a side-effect, it enables AVX on capable
CPUs.

In particular:

- Query the CPU support for XSAVE, list of the supported extensions
  and the required size of FPU save area. The hw.use_xsave tunable is
  provided for disabling XSAVE, and hw.xsave_mask may be used to
  select the enabled extensions.

- Remove the FPU save area from PCB and dynamically allocate the
  (run-time sized) user save area on the top of the kernel stack,
  right above the PCB. Reorganize the thread0 PCB initialization to
  postpone it after BSP is queried for save area size.

- The dumppcb, stoppcbs and susppcbs now do not carry the FPU state as
  well. FPU state is only useful for suspend, where it is saved in
  dynamically allocated suspfpusave area.

- Use XSAVE and XRSTOR to save/restore FPU state, if supported and
  enabled.

- Define new mcontext_t flag _MC_HASFPXSTATE, indicating that
  mcontext_t has a valid pointer to out-of-struct extended FPU
  state. Signal handlers are supplied with stack-allocated fpu
  state. The sigreturn(2) and setcontext(2) syscall honour the flag,
  allowing the signal handlers to inspect and manipilate extended
  state in the interrupted context.

- The getcontext(2) never returns extended state, since there is no
  place in the fixed-sized mcontext_t to place variable-sized save
  area. And, since mcontext_t is embedded into ucontext_t, makes it
  impossible to fix in a reasonable way.  Instead of extending
  getcontext(2) syscall, provide a sysarch(2) facility to query
  extended FPU state.

- Add ptrace(2) support for getting and setting extended state; while
  there, implement missed PT_I386_{GET,SET}XMMREGS for 32bit binaries.

- Change fpu_kern KPI to not expose struct fpu_kern_ctx layout to
  consumers, making it opaque. Internally, struct fpu_kern_ctx now
  contains a space for the extended state. Convert in-kernel consumers
  of fpu_kern KPI both on i386 and amd64.

First version of the support for AVX was submitted by Tim Bird
<tim.bird am sony com> on behalf of Sony. This version was written
from scratch.

Tested by:	pho (previous version), Yamagi Burmeister <lists yamagi org>
MFC after:	1 month
2012-01-21 17:45:27 +00:00
kib
69f3e63a09 Add definitions for the FPU extended state header, legacy extended
state and AVX state.

MFC after:	1 week
2012-01-17 17:07:13 +00:00
kib
e94bd75cc5 Add definitions related to XCR0.
MFC after:	1 week
2012-01-17 07:23:43 +00:00
uqs
d61d88a310 Convert files to UTF-8 2012-01-15 13:23:18 +00:00
ed
3bad498373 Also import WEAK_ALIAS() from the MIPS code. 2012-01-05 08:51:06 +00:00
ed
29cd68a581 Add support for strong aliasing of symbols in i386 assembly.
This macro is a literal copy from the MIPS version of <machine/asm.h>.
2012-01-03 07:06:35 +00:00
kib
a1ca31e42c Make the comment in i386/include/ucontext.h identical to the one in
amd64/include/ucontext.h. The later is better worded.

Requested by:	deischen
MFC after:	3 days
2011-12-31 14:44:42 +00:00
ed
cb983d98e7 Replace __signed by signed.
The signed keyword is an integral part of the C syntax. There's no need
to use __signed.
2011-12-13 13:38:03 +00:00
alc
57f34640a0 Avoid the possibility of integer overflow in the calculation of
VM_KMEM_SIZE_MAX.  Specifically, if the user/kernel address space split
was changed such that the kernel address space was greater than or equal
to 2 GB, then overflow would occur.

PR:		161721
MFC after:	3 weeks
2011-12-10 18:42:00 +00:00
kib
d8e6287519 Attempt to improve formatting and content of several comments for
amd64 and i386 MD code.

Based on suggestions by:	bde
MFC after:	1 week
2011-11-09 18:25:50 +00:00
rstone
7513eaddcf Fix the DTrace pid return trap interrupt vector. Previously we were using
31, but that vector is reserved.

Without this fix, running dtrace -p <pid> would either cause the target
process to crash or the kernel to page fault.

Obtained from:	rpaulo
MFC after:	3days
2011-11-07 01:53:25 +00:00
das
28e8dea258 People porting FreeBSD to new architectures ought not have to
implement a deprecated FPU control interface in addition to the
standard one.  To make this clearer, further deprecate ieeefp.h
by not declaring the function prototypes except on architectures
that implement them already.

Currently i386 and amd64 implement the ieeefp.h interface for
compatibility, and for fp[gs]etprec(), which doesn't exist on
most other hardware.  Powerpc, sparc64, and ia64 partially implement
it and probably shouldn't, and other architectures don't implement it
at all.
2011-10-21 06:41:46 +00:00
kib
1134edae2b Remove unused define.
MFC after:	1 month
2011-10-07 16:09:44 +00:00
attilio
a73e834ebb Add the possibility to specify from kernel configs MAXCPU value.
This patch is going to help in cases like mips flavours where you
want a more granular support on MAXCPU.

No MFC is previewed for this patch.

Tested by:	pluknet
Approved by:	re (kib)
2011-07-19 00:37:24 +00:00
jkim
52539f62b4 Correct cpu_monitor() and cpu_mwait() for amd64. These instructions take
%rcx as "extensions" in long mode.  If any unused bit is set in %rcx, these
instructions cause general protection fault.  Fix style nits and synchronize
i386 with amd64.
2011-07-05 18:42:10 +00:00
jhb
83fca1d193 Move {amd64,i386}/pci/pci_bus.c and {amd64,i386}/include/pci_cfgreg.h to
the x86 tree.  The $PIR code is still only enabled on i386 and not amd64.
While here, make the qpi(4) driver on conditional on 'device pci'.
2011-06-22 21:04:13 +00:00
jhb
3fa22c485f Oops, missed these in 223424.
Reported by:	jkim
2011-06-22 18:48:07 +00:00
avg
74204e61b2 remove code for dynamic offlining/onlining of CPUs on x86
The code has definitely been broken for SCHED_ULE, which is a default
scheduler.  It may have been broken for SCHED_4BSD in more subtle ways,
e.g. with manually configured CPU affinities and for interrupt devilery
purposes.
We still provide a way to disable individual CPUs or all hyperthreading
"twin" CPUs before SMP startup.  See the UPDATING entry for details.

Interaction between building CPU topology and disabling CPUs still
remains fuzzy: topology is first built using all availble CPUs and then
the disabled CPUs should be "subtracted" from it.  That doesn't work
well if the resulting topology becomes non-uniform.

This work is done in cooperation with Attilio Rao who in addition to
reviewing also provided parts of code.

PR:		kern/145385
Discussed with:	gcooper, ambrisko, mdf, sbruno
Reviewed by:	attilio
Tested by:	pho, pluknet
X-MFC after:	never
2011-06-08 08:12:15 +00:00
attilio
ccbb37970b Reintroduce the lazypmap infrastructure and convert it to using
cpuset_t.

Requested by:	alc
2011-05-20 14:53:16 +00:00
attilio
6a2b7fdc52 MFC 2011-05-18 16:01:29 +00:00
attilio
9ff3491e67 MFC 2011-05-13 20:58:48 +00:00
attilio
d7cb9e4814 MFC 2011-05-09 18:53:13 +00:00
attilio
a0b51ba62f MFC 2011-05-06 22:45:33 +00:00
attilio
fe4de567b5 Commit the support for removing cpumask_t and replacing it directly with
cpuset_t objects.
That is going to offer the underlying support for a simple bump of
MAXCPU and then support for number of cpus > 32 (as it is today).

Right now, cpumask_t is an int, 32 bits on all our supported architecture.
cpumask_t on the other side is implemented as an array of longs, and
easilly extendible by definition.

The architectures touched by this commit are the following:
- amd64
- i386
- pc98
- arm
- ia64
- XEN

while the others are still missing.
Userland is believed to be fully converted with the changes contained
here.

Some technical notes:
- This commit may be considered an ABI nop for all the architectures
  different from amd64 and ia64 (and sparc64 in the future)
- per-cpu members, which are now converted to cpuset_t, needs to be
  accessed avoiding migration, because the size of cpuset_t should be
  considered unknown
- size of cpuset_t objects is different from kernel and userland (this is
  primirally done in order to leave some more space in userland to cope
  with KBI extensions). If you need to access kernel cpuset_t from the
  userland please refer to example in this patch on how to do that
  correctly (kgdb may be a good source, for example).
- Support for other architectures is going to be added soon
- Only MAXCPU for amd64 is bumped now

The patch has been tested by sbruno and Nicholas Esborn on opteron
4 x 12 pack CPUs. More testing on big SMP is expected to came soon.
pluknet tested the patch with his 8-ways on both amd64 and i386.

Tested by:	pluknet, sbruno, gianni, Nicholas Esborn
Reviewed by:	jeff, jhb, sbruno
2011-05-05 14:39:14 +00:00