Commit Graph

1376 Commits

Author SHA1 Message Date
Alan Cox
3cfc28b0a0 Now that amd64's kernel map is 512GB (SVN rev 192216), there is no reason
to cap its buffer map at 1GB.

MFC after:	6 weeks
2009-06-08 16:43:40 +00:00
John Baldwin
8aba835b8e Bump CACHE_LINE_SIZE to 128 for x86. Intel's manuals explicitly recommend
using 128 byte alignment for locks.  (See IA-32 SDM Vol 3A 7.11.6.7)
2009-05-18 19:33:59 +00:00
Kip Macy
b522d2c99b correct range in comment
pointed out by alc
2009-05-16 22:08:00 +00:00
Kip Macy
e127902229 update vm map comment
pointed out by Larry Rosenman
2009-05-16 22:00:13 +00:00
Kip Macy
b6d82b1ae9 Increase default kernel map to 512GB
I briefly discussed this with alc. It could lead to problems for greater than 64GB.
However, that seems unlikely in practice.
2009-05-16 20:57:08 +00:00
Attilio Rao
120b18d86f FreeBSD right now support 32 CPUs on all the architectures at least.
With the arrival of 128+ cores it is necessary to handle more than that.
One of the first thing to change is the support for cpumask_t that needs
to handle more than 32 bits masking (which happens now).  Some places,
however, still assume that cpumask_t is a 32 bits mask.
Fix that situation by using always correctly cpumask_t when needed.

While here, remove the part under STOP_NMI for the Xen support as it
is broken in any case.

Additively make ipi_nmi_pending as static.

Reviewed by:	jhb, kmacy
Tested by:	Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
2009-05-14 17:43:00 +00:00
John Baldwin
9dc0b3d54f Implement simple machine check support for amd64 and i386.
- For CPUs that only support MCE (the machine check exception) but not MCA
  (i.e. Pentium), all this does is print out the value of the machine check
  registers and then panic when a machine check exception occurs.
- For CPUs that support MCA (the machine check architecture), the support is
  a bit more involved.
  - First, there is limited support for decoding the CPU-independent MCA
    error codes in the kernel, and the kernel uses this to output a short
    description of any machine check events that occur.
  - When a machine check exception occurs, all of the MCx banks on the
    current CPU are scanned and any events are reported to the console
    before panic'ing.
  - To catch events for correctable errors, a periodic timer kicks off a
    task which scans the MCx banks on all CPUs.  The frequency of these
    checks is controlled via the "hw.mca.interval" sysctl.
  - Userland can request an immediate scan of the MCx banks by writing
    a non-zero value to "hw.mca.force_scan".
  - If any correctable events are encountered, the appropriate details
    are stored in a 'struct mca_record' (defined in <machine/mca.h>).
    The "hw.mca.count" is a count of such records and each record may
    be queried via the "hw.mca.records" tree by specifying the record
    index (0 .. count - 1) as the next name in the MIB similar to using
    PIDs with the kern.proc.* sysctls.  The idea is to export machine
    check events to userland for more detailed processing.
  - The periodic timer and hw.mca sysctls are only present if the CPU
    supports MCA.

Discussed with:	emaste (briefly)
MFC after:	1 month
2009-05-13 17:53:04 +00:00
Doug Rabson
8480241102 Fix XENHVM build. 2009-05-06 17:48:39 +00:00
Alexander Motin
1703f2b424 Rename statclock_disable variable to atrtcclock_disable that it actually is,
and hide it inside of atrtc driver. Add new tunable hint.atrtc.0.clock
controlling it. Setting it to 0 disables using RTC clock as stat-/
profclock sources.

Teach i386 and amd64 SMP platforms to emulate stat-/profclocks using i8254
hardclock, when LAPIC and RTC clocks are disabled.

This allows to reduce global interrupt rate of idle system down to about
100 interrupts per core, permitting C3 and deeper C-states provide maximum
CPU power efficiency.
2009-05-03 17:47:21 +00:00
Alexander Motin
6a3a164d6e Add support for using i8254 and rtc timers as event sources for amd64 SMP
system. Redistribute hard-/stat-/profclock events to other CPUs using IPIs.
2009-05-02 12:20:43 +00:00
Jeff Roberson
82fcb0f192 - Add support for cpuid leaf 0xb. This allows us to determine the
topology of nehalem/corei7 based systems.
 - Remove the cpu_cores/cpu_logical detection from identcpu.
 - Describe the layout of the system in cpu_mp_announce().

Sponsored by:   Nokia
2009-04-29 06:54:40 +00:00
Robert Watson
9725389e1e Don't conditionally define CACHE_LINE_SHIFT, as we anticipate sizing
a fair number of static data structures, making this an unlikely
option to try to change without also changing source code. [1]

Change default cache line size on ia64, sparc64, and sun4v to 128
bytes, as this was what rtld-elf was already using on those
platforms. [2]

Suggested by:	bde [1], jhb [2]
MFC after:	2 weeks
2009-04-20 12:59:23 +00:00
Robert Watson
22037b2d2c Add description and cautionary note regarding CACHE_LINE_SIZE.
MFC after:	2 weeks
Suggested by:	alc
2009-04-19 21:26:36 +00:00
Robert Watson
a93fa8f2bb For each architecture, define CACHE_LINE_SHIFT and a derived
CACHE_LINE_SIZE constant.  These constants are intended to
over-estimate the cache line size, and be used at compile-time
when a run-time tuning alternative isn't appropriate or
available.

Defaults for all architectures are 64 bytes, except powerpc
where it is 128 bytes (used on G5 systems).

MFC after:	2 weeks
Discussed on:   arch@
2009-04-19 20:19:13 +00:00
Jung-uk Kim
cebe9dc98a A simple rewrite of biossmap.c:
- Do not iterate int 15h, function e820h twice.  Instead, we use STAILQ to
store each return buffer and copy all at once.
- Export optional extended attributes defined in ACPI 3.0 as separate
metadata.  Currently, there are only two bits defined in the specification.
For example, if the descriptor has extended attributes and it is not
enabled, it has to be ignored by OS.  We may implement it in the kernel
later if it is necessary and proven correct in reality.
- Check return buffer size strictly as suggested in ACPI 3.0.

Reviewed by:	jhb
2009-04-15 17:31:22 +00:00
Ed Schouten
e1048f7678 Simplify in/out functions (for i386 and AMD64).
Remove a hack to generate more efficient code for port numbers below
0x100, which has been obsolete for at least ten years, because GCC has
an asm constraint to specify that.

Submitted by:	Christoph Mallon <christoph mallon gmx de>
2009-04-11 14:01:01 +00:00
Ed Schouten
2c97d32a81 Also remove the unused __word_swap_int*() macros.
Submitted by:	Christoph Mallon <christoph.mallon@gmx.de>
2009-04-08 19:10:20 +00:00
Ed Schouten
17cfde3df4 Implement __bswap16() without using inline assembly.
Most compilers nowadays (including GCC) are smart enough to know what's
going on and generate more efficient code anyway.

Submitted by:	Christoph Mallon <christoph.mallon@gmx.de>
2009-04-08 19:06:47 +00:00
Ed Schouten
db26a6714a Don't explicitly force ecx to be used for MSR_FSBASE/MSR_GSBASE.
Because the "c" input constaint is used, the compiler will already place
the MSR_FSBASE/MSR_GSBASE constants in ecx. Using __asm("ecx") makes
LLVM crash. Even though this is also an LLVM bug, we'd better remove the
unnecessary GCCism as well.

Submitted by:	Christoph Mallon <christoph.mallon@gmx.de>
2009-04-07 19:31:36 +00:00
Jung-uk Kim
4a608e44b5 Garbage collect unused stack segment since r190620. 2009-04-01 16:24:24 +00:00
Konstantin Belousov
7496ce7d74 Sync definitions for struct sigcontext for i386 and amd64 architectures
to struct mcontext.
2009-04-01 13:44:28 +00:00
Konstantin Belousov
2c66cccab7 Save and restore segment registers on amd64 when entering and leaving
the kernel on amd64. Fill and read segment registers for mcontext and
signals. Handle traps caused by restoration of the
invalidated selectors.

Implement user-mode creation and manipulation of the process-specific
LDT descriptors for amd64, see sysarch(2).

Implement support for TSS i/o port access permission bitmap for amd64.

Context-switch LDT and TSS. Do not save and restore segment registers on
the context switch, that is handled by kernel enter/leave trampolines
now. Remove segment restore code from the signal trampolines for
freebsd/amd64, freebsd/ia32 and linux/i386 for the same reason.

Implement amd64-specific compat shims for sysarch.

Linuxolator (temporary ?) switched to use gsbase for thread_area pointer.

TODO:
Currently, gdb is not adapted to show segment registers from struct reg.
Also, no machine-depended ptrace command is added to set segment
registers for debugged process.

In collaboration with:	pho
Discussed with:	peter
Reviewed by:	jhb
Linuxolator tested by:	dchagin
2009-04-01 13:09:26 +00:00
Konstantin Belousov
c11d6143ca Add separate gdt descriptors for %fs and %gs on amd64.
Reorder amd64 gdt descriptors so that user-accessible selectors are the
same as on i386. At least Wine hard-codes this into the binary.

In collaboration with:	pho
Reviewed by:	jhb
2009-04-01 12:53:01 +00:00
Konstantin Belousov
59aff0f894 Fully enumerate all i386 sysarch commands an amd64 include file.
Provides i386/freebsd API-compatible definitions for the argument
structures of the above sysarch commands. struct i386_ioperm_args
definition is ABI-compatible.

In collaboration with:	pho
Reviewed by:	jhb
2009-04-01 12:48:17 +00:00
Konstantin Belousov
0cdf4ffabc Add all segment registers for the amd64 CPU to struct reg and mcontext.
To keep these structures ABI-compatible, half the size of r_trapno,
r_err, mc_trapno, mc_flags.

Add fsbase and gsbase to mcontext on both amd64 and i386.
Add flags to amd64 mcontext to indicate that it contains valid segments
or bases.

In collaboration with:	pho
Discussed with:	peter
Reviewed by:	jhb
2009-04-01 12:44:17 +00:00
Konstantin Belousov
49c9cff881 Provide convenient definition of the union descriptor, similar to the
i386 one. Fully enumerate system segments and gate types.

In collaboration with:	pho
Reviewed by:	jhb
2009-04-01 12:31:04 +00:00
Alan Cox
b4862e19af Update stale comments. The alternate address space mapping was eliminated
when PAE support was added to i386.  The direct mapping exists on amd64.
2009-03-22 18:56:26 +00:00
Alan Cox
0c645b7267 In general, the kernel virtual address of the pml4 page table page that is
stored in the pmap is from the direct map region.  The two exceptions have
been the kernel pmap and the swapper's pmap.  These pmaps have used a
kernel virtual address established by pmap_bootstrap() for their shared
pml4 page table page.  However, there is no reason not to use the direct
map for these pmaps as well.
2009-03-22 04:32:05 +00:00
Konstantin Belousov
a4f2b2b0c6 Add AT_EXECPATH ELF auxinfo entry type. The value's a_ptr is a pointer
to the full path of the image that is being executed.
Increase AT_COUNT.

Remove no longer true comment about types used in Linux ELF binaries,
listed types contain FreeBSD-specific entries.

Reviewed by:	kan
2009-03-17 12:50:16 +00:00
Jung-uk Kim
c66d2b38c8 Initial suspend/resume support for amd64.
This code is heavily inspired by Takanori Watanabe's experimental SMP patch
for i386 and large portion was shamelessly cut and pasted from Peter Wemm's
AP boot code.
2009-03-17 00:48:11 +00:00
Doug Rabson
1267802438 Merge in support for Xen HVM on amd64 architecture. 2009-03-11 15:30:12 +00:00
John Baldwin
2ee8325f42 A better fix for handling different FPU initial control words for different
ABIs:
- Store the FPU initial control word in the pcb for each thread.
- When first using the FPU, load the initial control word after restoring
  the clean state if it is not the standard control word.
- Provide a correct control word for Linux/i386 binaries under
  FreeBSD/amd64.
- Adjust the control word returned for fpugetregs()/npxgetregs() when a
  thread hasn't used the FPU yet to reflect the real initial control
  word for the current ABI.
- The Linux/i386 ABI for FreeBSD/i386 now properly sets the right control
  word instead of trashing whatever the current state of the FPU is.

Reviewed by:	bde
2009-03-05 19:42:11 +00:00
John Baldwin
a8346a9865 A few cleanups to the FPU code on amd64:
- fpudna() always returned 1 since amd64 CPUs always have FPUs.  Change
  the function to return void and adjust the calling code in trap() to
  assume the return 1 case is the only case.
- Remove fpu_cleanstate_ready as it is always true when it is tested.
  Also, only initialize fpu_cleanstate when fpuinit() is called on the BSP.

Reviewed by:	bde
2009-03-05 16:56:16 +00:00
John Baldwin
9edc34f864 Move the PCB flag macros up next to the 'pcb_flags' member in the struct. 2009-03-05 16:52:50 +00:00
Warner Losh
3282e64ac0 Companion for r188301: fix the prototypes. 2009-02-08 07:03:34 +00:00
Joseph Koshy
bb471e3315 Improve robustness of NMI handling, for NMIs recognized in kernel
mode.

- Make the NMI handler run on its own stack (TSS_IST2).
- Store the GSBASE value for each CPU just before the start of
  each NMI stack, permitting efficient retrieval using %rsp-relative
  addressing.
- For NMIs taken from kernel mode, program MSR_GSBASE explicitly
  since one or both of MSR_GSBASE and MSR_KGSBASE can be potentially
  invalid.  The current contents of MSR_GSBASE are saved and restored
  at exit.
- For NMIs handled from user mode, continue to use 'swapgs' to
  load the per-CPU GSBASE.

Reviewed by:	jeff
Debugging help:	jeff
Tested by:	gnn, Artem Belevich <artemb at gmail dot com>
2009-02-03 09:01:45 +00:00
David E. O'Brien
e6493bbebf Change some movl's to mov's. Newer GAS no longer accept 'movl' instructions
for moving between a segment register and a 32-bit memory location.

Looked at by:	jhb
2009-01-31 11:37:21 +00:00
Jeff Roberson
9c8e8e3aa7 - Allocate apic vectors on a per-cpu basis. This allows us to allocate
more irqs as we have more cpus.  This is principally useful on systems
   with msi devices which may want many irqs per-cpu.

Discussed with:	jhb
Sponsored by:	Nokia
2009-01-29 09:22:56 +00:00
John Baldwin
de43ac6044 Use a different value for the initial control word for the FPU state for
32-bit processes.  The value matches the initial setting used by
FreeBSD/i386.  Otherwise, 32-bit binaries using floating point would use
a slightly different initial state when run on FreeBSD/amd64.

MFC after:	1 week
2009-01-28 20:35:16 +00:00
Jung-uk Kim
92df0bda99 Add basic amd64 support for VIA Nano processors. 2009-01-12 19:17:35 +00:00
Jung-uk Kim
6811e5d474 Add Centaur/IDT/VIA vendor ID for Nano family, which has long mode support. 2009-01-05 21:51:49 +00:00
Warner Losh
db3cd725a5 AT_DEBUG and AT_BRK were OBE like 10 years ago, so retire them.
Reviewed by:	peter
2008-12-17 06:56:58 +00:00
Jung-uk Kim
39e52304e0 Add more CPUID bits from AMD CPUID Specification Rev. 2.28. 2008-12-12 23:17:00 +00:00
John Baldwin
660f08b291 Add constants for fields in the local APIC error status register and a
routine to read it.
2008-12-11 15:56:30 +00:00
Joseph Koshy
0cfab8ddc1 - Add support for PMCs in Intel CPUs of Family 6, model 0xE (Core Solo
and Core Duo), models 0xF (Core2), model 0x17 (Core2Extreme) and
  model 0x1C (Atom).

  In these CPUs, the actual numbers, kinds and widths of PMCs present
  need to queried at run time.  Support for specific "architectural"
  events also needs to be queried at run time.

  Model 0xE CPUs support programmable PMCs, subsequent CPUs
  additionally support "fixed-function" counters.

- Use event names that are close to vendor documentation, taking in
  account that:
  - events with identical semantics on two or more CPUs in this family
    can have differing names in vendor documentation,
  - identical vendor event names may map to differing events across
    CPUs,
  - each type of CPU supports a different subset of measurable
    events.

  Fixed-function and programmable counters both use the same vendor
  names for events.  The use of a class name prefix ("iaf-" or
  "iap-" respectively) permits these to be distinguished.

- In libpmc, refactor pmc_name_of_event() into a public interface
  and an internal helper function, for use by log handling code.

- Minor code tweaks: staticize a global, freshen a few comments.

Tested by:	gnn
2008-11-27 09:00:47 +00:00
Jung-uk Kim
5113aa0af3 Introduce cpu_vendor_id and replace a lot of strcmp(cpu_vendor, "...").
Reviewed by:	jhb, peter (early amd64 version)
2008-11-26 19:25:13 +00:00
Kip Macy
db7f0b974f - bump __FreeBSD version to reflect added buf_ring, memory barriers,
and ifnet functions

- add memory barriers to <machine/atomic.h>
- update drivers to only conditionally define their own

- add lockless producer / consumer ring buffer
- remove ring buffer implementation from cxgb and update its callers

- add if_transmit(struct ifnet *ifp, struct mbuf *m) to ifnet to
  allow drivers to efficiently manage multiple hardware queues
  (i.e. not serialize all packets through one ifq)
- expose if_qflush to allow drivers to flush any driver managed queues

This work was supported by Bitgravity Inc. and Chelsio Inc.
2008-11-22 05:55:56 +00:00
Joseph Koshy
e829eb6d61 - Separate PMC class dependent code from other kinds of machine
dependencies.  A 'struct pmc_classdep' structure describes operations
  on PMCs; 'struct pmc_mdep' contains one or more 'struct pmc_classdep'
  structures depending on the CPU in question.

  Inside PMC class dependent code, row indices are relative to the
  PMCs supported by the PMC class; MI code in "hwpmc_mod.c" translates
  global row indices before invoking class dependent operations.

- Augment the OP_GETCPUINFO request with the number of PMCs present
  in a PMC class.

- Move code common to Intel CPUs to file "hwpmc_intel.c".

- Move TSC handling to file "hwpmc_tsc.c".
2008-11-09 17:37:54 +00:00
Jung-uk Kim
e39dddd413 Simplify AMD64_CPU_MODEL() and AMD64_CPU_FAMILY() macros as the base family
should be at least 0xf00 for all supported platforms.
2008-10-22 17:36:52 +00:00
Jung-uk Kim
87c919e808 Set kern.timecounter.invariant_tsc to 1 for AMD CPU family 10h and higher
even if BIOS does not advertise it.
2008-10-22 00:01:53 +00:00