5878 Commits

Author SHA1 Message Date
jkim
56b80da7ca Move identical copies of apm_bios.h to sys/x86/include, replace them with
stubs, and adjust PC98 stub accordingly.

Reviewed by:	imp, nyan
2010-11-11 19:36:21 +00:00
avg
6ba2297b75 amd64: introduce minidump version 2
After KVA space was increased to 512GB on amd64 it became impractical
to use PTEs as entries in the minidump map of dumped pages, because size
of that map alone would already be 1GB.
Instead, we now use PDEs as page map entries and employ two stage lookup
in libkvm: virtual address -> PDE -> PTE -> physical address.  PTEs are
now dumped as regular pages.  Fixed page map size now is 2MB.

libkvm keeps support for accessing amd64 minidumps of version 1.
Support for 1GB pages is added.

Many thanks to Alan Cox for his guidance, numerous reviews, suggestions,
enhancments and corrections.

Reviewed by:	alc [kernel part]
MFC after:	15 days
2010-11-11 18:35:28 +00:00
jkim
d6fa755921 Make APM emulation look more closer to its origin. Use device_get_softc(9)
instead of hardcoding acpi(4) unit number as we have device_t for it.
2010-11-10 18:50:12 +00:00
jkim
f01ec2a1c6 Refactor acpi_machdep.c for amd64 and i386, move APM emulation into a new
file acpi_apm.c, and place it on sys/x86/acpica.
2010-11-10 01:29:56 +00:00
jhb
acd72eb169 - Remove <machine/mutex.h>. Most of the headers were empty, and the
contents of the ones that were not empty were stale and unused.
- Now that <machine/mutex.h> no longer exists, there is no need to allow it
  to override various helper macros in <sys/mutex.h>.
- Rename various helper macros for low-level operations on mutexes to live
  in the _mtx_* or __mtx_* namespaces.  While here, change the names to more
  closely match the real API functions they are backing.
- Drop support for including <sys/mutex.h> in assembly source files.

Suggested by:	bde (1, 2)
2010-11-09 20:46:41 +00:00
attilio
4963bf694d Move the mptable.h under x86/include/.
Sponsored by:	Sandvine Incorporated
MFC after:	14 days
2010-11-09 20:28:09 +00:00
jkim
83f0c9c278 Now OsdEnvironment.c is identical on amd64 and i386. Move it to a new home. 2010-11-09 00:27:18 +00:00
jkim
36651dc0f0 Reduce diff between platforms and fix style(9) bugs. 2010-11-09 00:14:39 +00:00
jhb
81db049683 Move the MADT parser for amd64 and i386 to sys/x86/acpica now that it is
identical on both platforms.
2010-11-08 20:57:02 +00:00
jhb
e4b52a71f4 A few small style and whitespace fixes. 2010-11-08 20:05:22 +00:00
alc
046fb85874 Don't call pmap_demote_DMAP() on MTRR entries from the BIOS that are marked
as "bogus".

Reported by:	Jia-Shiun Li
2010-11-07 21:48:49 +00:00
jhb
45c0759920 Adjust the order of operations in spinlock_enter() and spinlock_exit() to
work properly with single-stepping in a kernel debugger.  Specifically,
these routines have always disabled interrupts before increasing the nesting
count and restored the prior state of interrupts after decreasing the nesting
count to avoid problems with a nested interrupt not disabling interrupts
when acquiring a spin lock.  However, trap interrupts for single-stepping
can still occur even when interrupts are disabled.  Now the saved state of
interrupts is not saved in the thread until after interrupts have been
disabled and the nesting count has been increased.  Similarly, the saved
state from the thread cannot be read once the nesting count has been
decreased to zero.  To fix this, use temporary variables to store interrupt
state and shuffle it between the thread's MD area and the appropriate
registers.

In cooperation with:	bde
MFC after:     1 month
2010-11-05 13:42:58 +00:00
avg
18175daafc x86 topo_probe: do not probe smp topology if only one cpu is visible
This could lead to a division by zero if hardware is multi-core and/or
multi-threaded, but for some (quite unusual) reason FreeBSD sees only
one logical processor.  This could happen, for example, if neither MADT
nor MP Table are presented by BIOS.

Also:
- assert in topo_probe_0x4 that BSP is accounted for
- neither cpu_cores nor cpu_logical should be zero after successful
  probing, so either being zero is an indication of failed probing

Reported by:	vwe, Dan Allen <danallen46@airwired.net>
Tested by:	Dan Allen <danallen46@airwired.net>
MFC after:	3 days
2010-11-04 08:51:45 +00:00
jhb
e0a2a85d3a Move <machine/apicreg.h> to <x86/apicreg.h>. 2010-11-01 18:18:46 +00:00
jhb
c7dd85142c Move the <machine/mca.h> header to <x86/mca.h>. 2010-11-01 17:40:35 +00:00
alc
c1d60ad229 Add another safety belt to pmap_demote_DMAP(). 2010-10-30 23:49:37 +00:00
alc
17ccf31867 Don't demote in pmap_demote_DMAP() if the specified length is zero. 2010-10-30 17:21:32 +00:00
attilio
7ab661360c Merge nexus.c from amd64 and i386 to x86 subtree.
Sponsored by:	Sandvine Incorporated
Tested by:	gianni
2010-10-28 16:31:39 +00:00
jhb
c1f742a56c Use 'PCPU_GET(apic_id)' to determine the BSP's APIC ID on a UP machine
when routing interrupts instead of cpu_apic_ids[0] since cpu_apic_ids[]
is only populated for multiple-CPU machines.  This also matches what the
code does when SMP is not enabled.

PR:		bin/151616
Tested by:	"Damian S. Kolodziejczyk"  damkol | gmail
Submitted by:	avg
MFC after:	1 week
2010-10-28 13:44:19 +00:00
attilio
0237c602c6 Merge the mptable support from MD bits to x86 subtree.
Sponsored by:	Sandvine Incorporated
Discussed with:	jhb
2010-10-28 07:58:06 +00:00
alc
c190531c4f [1] According to the x86 architectural specifications, no virtual-to-
physical page mapping should span two or more MTRRs of different types.
Add a pmap function, pmap_demote_DMAP(), by which the MTRR module can
ensure that the direct map region doesn't have such a mapping.

[2] Fix a couple of nearby style errors in amd64_mrset().

[3] Re-enable the use of 1GB page mappings for implementing the direct
map.  (See also r197580 and r213897.)

Tested by:	kib@ on a Westmere-family processor [3]
MFC after:	3 weeks
2010-10-27 16:46:37 +00:00
attilio
efd2e37632 Merge dump_machdep.c i386/amd64 under the x86 subtree.
Sponsored by:	Sandvine Incorporated
Tested by:	gianni
2010-10-26 12:46:26 +00:00
jhb
9bbda5d7a4 Use 'saveintr' instead of 'savecrit' or 'eflags' to hold the state returned
by intr_disable().

Requested by:	bde
2010-10-25 15:31:13 +00:00
jhb
5979f73be6 Use intr_disable() and intr_restore() instead of frobbing the flags register
directly to disable interrupts.

Reviewed by:	bde (earlier version)
MFC after:	2 weeks
2010-10-25 15:28:03 +00:00
alc
9ee8aa4dbf Update pmap_extract() to handle 1GB page mappings. Some device drivers
use pmap_extract() rather than pmap_kextract() on direct map addresses.
Thus, pmap_extract() needs to be able to deal with 1GB page mappings if
we are to use 1GB page mappings for the direct map.  (See r197580.)
2010-10-15 15:23:34 +00:00
jkim
94b2db5cb7 Remove trailing ", " from `sysctl machdep.idle_available' output. 2010-10-12 20:53:12 +00:00
kib
84212c7551 Add macro DECLARE_MODULE_TIED to denote a module as requiring the
kernel of exactly the same __FreeBSD_version as the headers module was
compiled against.

Mark our in-tree ABI emulators with DECLARE_MODULE_TIED. The modules
use kernel interfaces that the Release Engineering Team feel are not
stable enough to guarantee they will not change during the life cycle
of a STABLE branch. In particular, the layout of struct sysentvec is
declared to be not part of the STABLE KBI.

Discussed with:	bz, rwatson
Approved by:	re (bz, kensmith)
MFC after:	2 weeks
2010-10-12 09:18:17 +00:00
kib
35416f69f7 Regen. 2010-10-08 07:19:05 +00:00
kib
cc1aae90a2 Fix typo.
Submitted by:	arundel
MFC after:	3 days
2010-10-08 07:18:44 +00:00
kib
4c0eac49f2 Display PCID capability of CPU and add CPUID define for it.
MFC after:	1 week
2010-10-05 15:31:56 +00:00
kib
e68de2436e The makectx() function, used by kdb_trap() to reconstruct pcb from
trap frame when trap initiated kdb entry, incorrectly calculated the
value of %rsp for trapped thread.

According to Intel(R) 64 and IA-32 Architectures Software Developer's Manual
Volume 3A: System Programming Guide, Part 1, rev. 035, 6.14.2 64-Bit Mode
Stack Frame, "64-bit mode ... pushes SS:RSP unconditionally, rather than
only on a CPL change."
Even assuming the conditional push of the %ss:%rsp, the calculation
was still wrong because sizeof(tf_ss) + sizeof(tf_rsp) == 16 on amd64.

Always use the tf_rsp from trap frame. The change supposedly fixes
stepping when using kgdb backend for kdb.

Submitted by:	Zhouyi Zhou <zhouzhouyi gmail com>
PR:	amd64/151167
Reviewed by:	avg
MFC after:	1 week
2010-10-03 13:52:17 +00:00
avg
9c29d0d1f0 i386 and amd64 mp_machdep: improve topology detection for Intel CPUs
This patch is significantly based on previous work by jkim.
List of changes:
- added comments that describe topology uniformity assumption
- added reference to Intel Processor Topology Enumeration article
- documented a few global variables that describe topology
- retired weirdly set and used logical_cpus variable
- changed fallback code for mp_ncpus > 0 case, so that CPUs are treated
  as being different packages rather than cores in a single package
- moved AMD-specific code to topo_probe_amd [jkim]
- in topo_probe_0x4() follow Intel-prescribed procedure of deriving SMT
  and core masks and match APIC IDs against those masks [started by
  jkim]
- in topo_probe_0x4() drop code for double-checking topology parameters
  by looking at L1 cache properties [jkim]
- in topo_probe_0xb() add fallback path to topo_probe_0x4() as
  prescribed by Intel [jkim]

Still to do:
- prepare for upcoming AMD CPUs by using new mechanism of uniform
  topology description [pointed by jkim]
- probe cache topology in addition to CPU topology and probably use that
  for scheduler affinity topology; e.g. Core2 Duo and Athlon II X2 have
  the same CPU topology, but Athlon cores do not share L2 cache while
  Core2's do (no L3 cache in both cases)
- think of supporting non-uniform topologies if they are ever
  implemented for platforms in question
- think how to better described old HTT vs new HTT distinction, HTT vs
  SMT can be confusing as SMT is a generic term
- more robust code for marking CPUs as "logical" and/or "hyperthreaded",
  use HTT mask instead of modulo operation
- correct support for halting logical and/or hyperthreaded CPUs, let
  scheduler know that it shouldn't schedule any threads on those CPUs

PR:			kern/145385 (related)
In collaboration with:	jkim
Tested by:		Sergey Kandaurov <pluknet@gmail.com>,
			Jeremy Chadwick <freebsd@jdc.parodius.com>,
			Chip Camden <sterling@camdensoftware.com>,
			Steve Wills <steve@mouf.net>,
			Olivier Smedts <olivier@gid0.org>,
			Florian Smeets <flo@smeets.im>
MFC after:		1 month
2010-10-01 10:32:54 +00:00
neel
11129fcf49 Fix bogus error message from bus_dmamem_alloc() about incorrect alignment.
The check for alignment should be made against the physical address and not
the virtual address that maps it.

Sponsored by:	NetApp
Submitted by:	Will McGovern (will at netapp dot com)
Reviewed by:	mjacob, jhb
2010-09-29 21:53:11 +00:00
davidxu
b9eeaa21c2 Now userland POSIX semaphore is based on umtx. The kernel module
is only used to support binary compatible, if want to run old
binary, you need to kldload the module.
2010-09-24 09:04:16 +00:00
nork
9673f78b29 Add support 'device tpm' for amd64.
Add tpm(4)'s default setting to /boot/defaults/loader.conf.
Add 'device tpm' to NOTES for amd64 and i386.

Discussed with:	takawata
Approved by:	imp (mentor)
2010-09-19 14:40:37 +00:00
avg
eb6f40cd9b amd64: reduce VM_KMEM_SIZE_SCALE to 1 allowing kernel to use more memory
KVA space is abundant on amd64, so there is no reason to limit kernel
map size to a fraction of available physical memory.  In fact, it could
be larger than physical memory.

This should help with memory auto-tuning for ZFS and shouldn't affect
other workloads.
This should reduce number of circumstances for "kmem_map too small"
panics, but probably won't eliminate them entirely due to potential kmem
fragmentation.

In fact, you might want/need to limit maximum ARC size after this commit
if you need to resrve more memory for applications.

This change was discussed on arch@ and nobody said "don't do it".

MFC after:	6 weeks
2010-09-17 07:36:32 +00:00
mav
eb4931dc6c Refactor timer management code with priority to one-shot operation mode.
The main goal of this is to generate timer interrupts only when there is
some work to do. When CPU is busy interrupts are generating at full rate
of hz + stathz to fullfill scheduler and timekeeping requirements. But
when CPU is idle, only minimum set of interrupts (down to 8 interrupts per
second per CPU now), needed to handle scheduled callouts is executed.
This allows significantly increase idle CPU sleep time, increasing effect
of static power-saving technologies. Also it should reduce host CPU load
on virtualized systems, when guest system is idle.

There is set of tunables, also available as writable sysctls, allowing to
control wanted event timer subsystem behavior:
  kern.eventtimer.timer - allows to choose event timer hardware to use.
On x86 there is up to 4 different kinds of timers. Depending on whether
chosen timer is per-CPU, behavior of other options slightly differs.
  kern.eventtimer.periodic - allows to choose periodic and one-shot
operation mode. In periodic mode, current timer hardware taken as the only
source of time for time events. This mode is quite alike to previous kernel
behavior. One-shot mode instead uses currently selected time counter
hardware to schedule all needed events one by one and program timer to
generate interrupt exactly in specified time. Default value depends of
chosen timer capabilities, but one-shot mode is preferred, until other is
forced by user or hardware.
  kern.eventtimer.singlemul - in periodic mode specifies how much times
higher timer frequency should be, to not strictly alias hardclock() and
statclock() events. Default values are 2 and 4, but could be reduced to 1
if extra interrupts are unwanted.
  kern.eventtimer.idletick - makes each CPU to receive every timer interrupt
independently of whether they busy or not. By default this options is
disabled. If chosen timer is per-CPU and runs in periodic mode, this option
has no effect - all interrupts are generating.

As soon as this patch modifies cpu_idle() on some platforms, I have also
refactored one on x86. Now it makes use of MONITOR/MWAIT instrunctions
(if supported) under high sleep/wakeup rate, as fast alternative to other
methods. It allows SMP scheduler to wake up sleeping CPUs much faster
without using IPI, significantly increasing performance on some highly
task-switching loads.

Tested by:	many (on i386, amd64, sparc64 and powerc)
H/W donated by:	Gheorghe Ardelean
Sponsored by:	iXsystems, Inc.
2010-09-13 07:25:35 +00:00
ken
f65b2b217c MFp4 (//depot/projects/mps/...)
Bring in a driver for the LSI Logic MPT2 6Gb SAS controllers.

This driver supports basic I/O, and works with SAS and SATA drives and
expanders.

Basic error recovery works (i.e. timeouts and aborts) as well.

Integrated RAID isn't supported yet, and there are some known bugs.

So this isn't ready for production use, but is certainly ready for
testing and additional development.  For the moment, new commits to this
driver should go into the FreeBSD Perforce repository first
(//depot/projects/mps/...) and then get merged into -current once
they've been vetted.

This has only been added to the amd64 GENERIC, since that is the only
architecture I have tested this driver with.

Submitted by:	scottl
Discussed with:	imp, gibbs, will
Sponsored by:	Yahoo, Spectra Logic Corporation
2010-09-10 15:03:56 +00:00
avg
c9fe8ad7f0 bus_add_child: change type of order parameter to u_int
This reflects actual type used to store and compare child device orders.
Change is mostly done via a Coccinelle (soon to be devel/coccinelle)
semantic patch.
Verified by LINT+modules kernel builds.

Followup to:	r212213
MFC after:	10 days
2010-09-10 11:19:03 +00:00
rdivacky
b464d39d95 Change the parameter passed to the inline assembly to u_short
as we are dealing with 16bit segment registers. Change mov
to movw.

Approved by:    rpaulo (mentor)
Reviewed by:    kib, rink
2010-09-03 14:25:17 +00:00
jkim
4c2b72a3c0 Save MSR_FSBASE, MSR_GSBASE and MSR_KGSBASE directly to PCB as we do not use
these values in the function.
2010-08-30 21:19:42 +00:00
rpaulo
707917e57b Register an interrupt vector for DTrace return probes. There is some
code missing in lapic to make sure that we don't overwrite this entry,
but this will be done on a sequent commit.

Sponsored by:	The FreeBSD Foundation
2010-08-28 08:03:29 +00:00
rpaulo
fb4382e1f0 Call the necessary DTrace function pointers when we have different kinds
of traps.

Sponsored by:	The FreeBSD Foundation
2010-08-25 09:10:32 +00:00
rpaulo
2aa1a2c156 Add two DTrace trap type values. Used by fasttrap.
Sponsored by:	The FreeBSD Foundation
2010-08-24 13:13:24 +00:00
attilio
df3fa718d4 Revert part of the r211149 as I erroneously ported the logical_cpus from
Yahoo! patchset as a mask (and according manipulating variables) while
it is actually a CPU count.

Submitted by:	neel
MFC after:	1 month
X-MFC:		211149
2010-08-19 22:37:43 +00:00
jhb
d02cab2556 Remove unused KTRACE includes. 2010-08-19 16:41:27 +00:00
gahr
e748dbf471 - The iMac9,1 needs the PAT workaround as well
Approved by:	cognet
2010-08-17 12:17:24 +00:00
kib
d9f088a03e Supply some useful information to the started image using ELF aux vectors.
In particular, provide pagesize and pagesizes array, the canary value
for SSP use, number of host CPUs and osreldate.

Tested by:	marius (sparc64)
MFC after:	1 month
2010-08-17 08:55:45 +00:00
jkim
e677a6e721 Reset switchtime to zero rather than the current CPU ticker (TSC) value.
It is more appropriate in this context because TSC MSR is reset to zero
when the CPU is restarted from S3 and above.  Move acpi_resync_clock() back
to where it was before r211202.  It does not make a difference any more.
2010-08-13 22:08:42 +00:00
attilio
c47bc039c0 Revert r211176:
As long as interrupts are disabled and there is not explicit call to
sched_add() there can't be any preemption there, thus the calls may be
consistent.

Reported by:	kib, jhb
2010-08-12 13:46:43 +00:00