Commit Graph

1858 Commits

Author SHA1 Message Date
Attilio Rao
c8790f5d09 Fix some entries in the locks static table of witness.
In particular:
- smp_tlb_mtx is no longer used, so it is axed.
- smp rendezvous lock isn't really a leaf spin-mutex. Its bad placement in
  the table, however, has been the source of a false positive LOR reporting
  with the dt_lock.  However, smp rendezvous lock would have had sched_lock
  there for older lock, so it wasn't still a leaf lock.
- allpmaps is only used in ia32 architecture, so it is inserted in the
  appropriate stub.

Addictionally:
- kse_zombie_lock is no longer present, so its definition is axed out.
- zombie_lock doesn't need to have an exported symbol, so just let's it be
  declared as static.

Tested by: kris
Approved by: jeff (mentor)
Approved by: re
2007-09-20 20:38:43 +00:00
Joseph Koshy
298889efcb Define an END() macro for use in i386 and amd64 assembly code, akin
to the one available on the ia64, sparc64, and sun4v architectures.

Approved by:	re (kensmith)
2007-08-22 04:26:07 +00:00
Dag-Erling Smørgrav
83d18f2283 Add a driver for the on-die digital thermal sensor found on Intel Core
and newer CPUs (including Core 2 and Core / Core 2 based Xeons).  The
driver attaches to each cpu device and creates a sysctl node in that
device's sysctl context (dev.cpu.N.temperature).  When invoked, the
handler binds to the appropriate CPU to ensure a correct reading.

Submitted by:	Rui Paulo <rpaulo@fnop.net>
Sponsored by:	Google Summer of Code 2007
Tested by:	des, marcus, Constantine A. Murenin, Ian FREISLICH
Approved by:	re (kensmith)
MFC after:	3 weeks
2007-08-15 19:26:03 +00:00
Nate Lawson
3b3f28135f Add "show sysregs" command to ddb. On i386, this gives gdt, idt, ldt,
cr0-4, etc.  Support should be added for other platforms that have a
different set of registers for system use.

Loosely based on: OpenBSD
Approved by:	re
2007-08-09 20:14:35 +00:00
Matt Jacob
06b642b55d Remove the internal use of __packed and put it on the structures
themselves.

Reviewed by:	nate, peter, warner, robert
Approved by:	re (ken)
2007-07-11 22:34:34 +00:00
Bjoern A. Zeeb
5b919cdc47 I4B header files were repo-copied from sys/i386/include/ to
sys/i4b/include/ so they will be available to all architectures
once I4B compiles on those.

Approved by:	re (kensmith)
2007-07-06 07:23:39 +00:00
Peter Wemm
e106f3d812 __packed has no effect on u_int8_t's except to cause a warning (and
never has had any effect).

Approved by:  re (rwatson)
2007-07-05 07:28:38 +00:00
Marcel Moolenaar
01bd17cc99 Add kdb_cpu_sync_icache(), intended to synchronize instruction
caches with data caches after writing to memory. This typically
is required to make breakpoints work on ia64 and powerpc. For
those architectures the function is implemented.
2007-06-09 21:55:17 +00:00
Alan Cox
e5c45405f0 Add the machine-specific definitions for configuring the new physical
memory allocator.

Set the size of phys_avail[] and dump_avail[] using one of these
definitions.

Approved by:	re
2007-06-05 05:17:20 +00:00
Attilio Rao
6759608248 Rework the PCPU_* (MD) interface:
- Rename PCPU_LAZY_INC into PCPU_INC
- Add the PCPU_ADD interface which just does an add on the pcpu member
  given a specific value.

Note that for most architectures PCPU_INC and PCPU_ADD are not safe.
This is a point that needs some discussions/work in the next days.

Reviewed by: alc, bde
Approved by: jeff (mentor)
2007-06-04 21:38:48 +00:00
Dag-Erling Smørgrav
753bcb5c34 Add CPUID2_PDCM
Requested by:	jkim
MFC after:	3 days
2007-05-31 11:26:45 +00:00
Alan Cox
c155d5d059 Eliminate an unused definition. 2007-05-27 20:34:26 +00:00
Jeff Roberson
0ad5e7f326 - Move GDT/LDT locking into a seperate spinlock, removing the global
scheduler lock from this responsibility.

Contributed by:	Attilio Rao <attilio@FreeBSD.org>
Tested by:	jeff, kkenn
2007-05-20 22:03:57 +00:00
Alexander Kabaev
fa298d5ea8 Include machine/pcb.hto turn extern struct pcb stoppcbs[]; construct
into the valid C.
2007-05-19 05:01:43 +00:00
John Baldwin
2e025791ce Handle CPUs with APIC IDs higher than 32 (at least one IBM server uses
an APIC ID of 38 for its second CPU):
- Add a new MAX_APIC_ID constant for the highest valid APIC ID for modern
  systems.
- Size the various arrays in the MADT, MP Table, and SMP code that are
  indexed by APIC IDs to allow for up to MAX_APIC_ID.
- Explicitly go through and assign logical cpu ids to local APICs before
  starting any of the APs up rather than doing it while starting up the
  APs.  This step is now where we honor MAXCPU.

MFC after:	1 week
2007-05-08 22:01:04 +00:00
John Baldwin
fb610ca1f9 Minor fixes and tweaks to the x86 interrupt code:
- Split the intr_table_lock into an sx lock used for most things, and a
  spin lock to protect intrcnt_index.  Originally I had this as a spin lock
  so interrupt code could use it to lookup sources.  However, we don't
  actually do that because it would add a lot of overhead to interrupts,
  and if we ever do support removing interrupt sources, we can use other
  means to safely do so w/o locking in the interrupt handling code.
- Replace is_enabled (boolean) with is_handlers (a count of handlers) to
  determine if a source is enabled or not.  This allows us to notice when
  a source is no longer in use.  When that happens, we now invoke a new
  PIC method (pic_disable_intr()) to inform the PIC driver that the
  source is no longer in use.  The I/O APIC driver frees the APIC IDT
  vector when this happens.  The MSI driver no longer needs to have a
  hack to clear is_enabled during msi_alloc() and msix_alloc() as a result
  of this change as well.
- Add an apic_disable_vector() to reset an IDT vector back to Xrsvd to
  complement apic_enable_vector() and use it in the I/O APIC and MSI code
  when freeing an IDT vector.
- Add a new nexus hook: nexus_add_irq() to ask the nexus driver to add an
  IRQ to its irq_rman.  The MSI code uses this when it creates new
  interrupt sources to let the nexus know about newly valid IRQs.
  Previously the msi_alloc() and msix_alloc() passed some extra stuff
  back to the nexus methods which then added the IRQs.  This approach is
  a bit cleaner.
- Change the MSI sx lock to a mutex.  If we need to create new sources,
  drop the lock, create the required number of sources, then get the lock
  and try the allocation again.
2007-05-08 21:29:14 +00:00
Alan Cox
04a18977c8 Define every architecture as either VM_PHYSSEG_DENSE or
VM_PHYSSEG_SPARSE depending on whether the physical address space is
densely or sparsely populated with memory.  The effect of this
definition is to determine which of two implementations of
vm_page_array and PHYS_TO_VM_PAGE() is used.  The legacy
implementation is obtained by defining VM_PHYSSEG_DENSE, and a new
implementation that trades off time for space is obtained by defining
VM_PHYSSEG_SPARSE.  For now, all architectures except for ia64 and
sparc64 define VM_PHYSSEG_DENSE.  Defining VM_PHYSSEG_SPARSE on ia64
allows the entirety of my Itanium 2's memory to be used.  Previously,
only the first 1 GB could be used.  Defining VM_PHYSSEG_SPARSE on
sparc64 allows USIIIi-based systems to boot without crashing.

This change is a combination of Nathan Whitehorn's patch and my own
work in perforce.

Discussed with: kmacy, marius, Nathan Whitehorn
PR:		112194
2007-05-05 19:50:28 +00:00
John Baldwin
e706f7f0c7 Revamp the MSI/MSI-X code a bit to achieve two main goals:
- Simplify the amount of work that has be done for each architecture by
  pushing more of the truly MI code down into the PCI bus driver.
- Don't bind MSI-X indicies to IRQs so that we can allow a driver to map
  multiple MSI-X messages into a single IRQ when handling a message
  shortage.

The changes include:
- Add a new pcib_if method: PCIB_MAP_MSI() which is called by the PCI bus
  to calculate the address and data values for a given MSI/MSI-X IRQ.
  The x86 nexus drivers map this into a call to a new 'msi_map()' function
  in msi.c that does the mapping.
- Retire the pcib_if method PCIB_REMAP_MSIX() and remove the 'index'
  parameter from PCIB_ALLOC_MSIX().  MD code no longer has any knowledge
  of the MSI-X index for a given MSI-X IRQ.
- The PCI bus driver now stores more MSI-X state in a child's ivars.
  Specifically, it now stores an array of IRQs (called "message vectors" in
  the code) that have associated address and data values, and a small
  virtual version of the MSI-X table that specifies the message vector
  that a given MSI-X table entry uses.  Sparse mappings are permitted in
  the virtual table.
- The PCI bus driver now configures the MSI and MSI-X address/data
  registers directly via custom bus_setup_intr() and bus_teardown_intr()
  methods.  pci_setup_intr() invokes PCIB_MAP_MSI() to determine the
  address and data values for a given message as needed.  The MD code
  no longer has to call back down into the PCI bus code to set these
  values from the nexus' bus_setup_intr() handler.
- The PCI bus code provides a callout (pci_remap_msi_irq()) that the MD
  code can call to force the PCI bus to re-invoke PCIB_MAP_MSI() to get
  new values of the address and data fields for a given IRQ.  The x86
  MSI code uses this when an MSI IRQ is moved to a different CPU, requiring
  a new value of the 'address' field.
- The x86 MSI psuedo-driver loses a lot of code, and in fact the separate
  MSI/MSI-X pseudo-PICs are collapsed down into a single MSI PIC driver
  since the only remaining diff between the two is a substring in a
  bootverbose printf.
- The PCI bus driver will now restore MSI-X state (including programming
  entries in the MSI-X table) on device resume.
- The interface for pci_remap_msix() has changed.  Instead of accepting
  indices for the allocated vectors, it accepts a mini-virtual table
  (with a new length parameter).  This table is an array of u_ints, where
  each value specifies which allocated message vector to use for the
  corresponding MSI-X message.  A vector of 0 forces a message to not
  have an associated IRQ.  The device may choose to only use some of the
  IRQs assigned, in which case the unused IRQs must be at the "end" and
  will be released back to the system.  This allows a driver to use the
  same remap table for different shortage values.  For example, if a driver
  wants 4 messages, it can use the same remap table (which only uses the
  first two messages) for the cases when it only gets 2 or 3 messages and
  in the latter case the PCI bus will release the 3rd IRQ back to the
  system.

MFC after:	1 month
2007-05-02 17:50:36 +00:00
Stephane E. Potvin
0e5179e441 Add support for specifying a minimal size for vm.kmem_size in the loader via
vm.kmem_size_min. Useful when using ZFS to make sure that vm.kmem size will
be at least 256mb (for example) without forcing a particular value via vm.kmem_size.

Approved by: njl (mentor)
Reviewed by: alc
2007-04-21 01:14:48 +00:00
Alan Cox
1434c1f34b MFamd64
Define PGEX_RSV.
2007-04-12 17:00:56 +00:00
Ruslan Ermilov
2e137367b4 Add the PG_NX support for i386/PAE.
Reviewed by:	alc
2007-04-06 18:15:03 +00:00
Jung-uk Kim
2be4e4713a Catch up with ACPI-CA 20070320 import. 2007-03-22 18:16:43 +00:00
John Baldwin
b8783b00f8 Add a new apic0 psuedo-device to claim memory resources for the memory
address ranges used by local and I/O APICs in the system.  Some systems
also reserve these ranges as system resources via either PnPBIOS or
ACPI, so this device currently attaches after acpi0 and legacy0 so that
the system resources are given precedence.
2007-03-20 21:53:31 +00:00
Jung-uk Kim
2498f259d4 - Add macros for newly added CPUID bits in the corresponding header files.
- Use correct capticalization in xTPR as Intel uses in their documents.
- Use proper description instead of vendor code name in comment.
2007-03-20 20:22:45 +00:00
Alan Cox
8cfba7267f Eliminate an unused parameter. 2007-03-17 19:42:06 +00:00
Jung-uk Kim
ab5916a526 Add another CPUID for AMD CPUs and fix style(9) while I am here. 2007-03-12 20:27:21 +00:00
Alan Cox
c640357f04 Push down the implementation of PCPU_LAZY_INC() into the machine-dependent
header file.  Reimplement PCPU_LAZY_INC() on amd64 and i386 making it
atomic with respect to interrupts.

Reviewed by: bde, jhb
2007-03-11 05:54:29 +00:00
John Baldwin
4c5bec1161 Change the x86 interrupt code to use FreeBSD CPU IDs (i.e. PCPU_GET(cpuid))
rather than local APIC IDs to keep track of CPUs which can handle
interrupts.
2007-03-06 17:16:47 +00:00
John Baldwin
aa7a005ee0 Use vm_paddr_t rather than uintptr_t when passing the physical address of
APICs to lapic_init() and ioapic_create().
2007-03-05 20:35:17 +00:00
Paolo Pisati
ef544f6312 o break newbus api: add a new argument of type driver_filter_t to
bus_setup_intr()

o add an int return code to all fast handlers

o retire INTR_FAST/IH_FAST

For more info: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=465712+0+current/freebsd-current

Reviewed by: many
Approved by: re@
2007-02-23 12:19:07 +00:00
Bruce Evans
1300fd67f3 Fixed some style bugs. Routine except:
- don't use __GNUCLIKE___OFFSETOF, since __offsetof() is a standard
  FreeBSD implementaion detail which has nothing to do with GNUC.
2007-02-06 18:04:02 +00:00
Bruce Evans
3764a82377 Simplified PCPU_GET() and PCPU_SET(). We must copy through a temporary
variable to avoid invalid constraints in dead code.  Use an array of
u_char's (inside a struct) instead of a char/short/int/long variable so
that the variable and its accesses can be spelled in the same way in all
cases and code doesn't need to be cloned just to hold the spelling
differences.

Fixed strict-aliasing errors in PCPU_SET() and in the amd64 PCPU_GET().
Cast to (void *) as in rev.1.37 of the i386 version where the errors
were fixed for the i386 PCPU_GET() only.  It would be more correct to
copy to and from the temp. variable using memcpy(), but then an
ifdef tangle would be required to ensure using the builtin memcpy().
We depend on fairly aggressive optimization to put the temp. variable
only in a register despite it being copied using
*(type *)(void *)&anothertype and could depend on this when using
memcpy() too.  This seems to work right even for -O0, but the -O0 case
has not been completely tested.

This change gives identical object code for all object files in LINT
on amd64 (except for one file with a __TIME__ stamp).  For LINT on
i386 it gives unimportant differences in instruction order and padding
in a few object files.  This was only tested for -O.

This change (actually a previous version of it) gives the following
reductions in the number of object files in LINT that fail to compile
with -O2 but without the -fno-strict-aliasing kludge:
- amd64: 29 (down from 211)
- i386: 36 (down from 47)

gcc-3.4.6 actually allows the invalid constraints that result from not
using the temp. variable, at least with -O[1-2], but gcc-3.3.3 crashes
on them and I don't want to depend on compiler bugs.
2007-02-06 16:21:09 +00:00
Bruce Evans
71799af2d5 Cleaned up declaration and initialization of clock_lock. It is only
used by clock code, so don't export it to the world for machdep.c to
initialize.  There is a minor problem initializing it before it is
used, since although clock initialization is split up so that parts
of it can be done early, the first part was never done early enough
to actually work.  Split it up a bit more and do the first part as
late as possible to document the necessary order.  The functions that
implement the split are still bogusly exported.

Cleaned up initialization of the i8254 clock hardware using the new
split.  Actually initialize it early enough, and don't work around it
not being initialized in DELAY() when DELAY() is called early for
initialization of some console drivers.

This unfortunately moves a little more code before the early debugger
breakpoint so that it is harder to debug.  The ordering of console and
related initialization is delicate because we want to do as little as
possible before the breakpoint, but must initialize a console.
2007-01-23 08:01:20 +00:00
John Baldwin
5fe82bca57 Expand the MSI/MSI-X API to address some deficiencies in the MSI-X support.
- First off, device drivers really do need to know if they are allocating
  MSI or MSI-X messages.  MSI requires allocating powerof2() messages for
  example where MSI-X does not.  To address this, split out the MSI-X
  support from pci_msi_count() and pci_alloc_msi() into new driver-visible
  functions pci_msix_count() and pci_alloc_msix().  As a result,
  pci_msi_count() now just returns a count of the max supported MSI
  messages for the device, and pci_alloc_msi() only tries to allocate MSI
  messages.  To get a count of the max supported MSI-X messages, use
  pci_msix_count().  To allocate MSI-X messages, use pci_alloc_msix().
  pci_release_msi() still handles both MSI and MSI-X messages, however.
  As a result of this change, drivers using the existing API will only
  use MSI messages and will no longer try to use MSI-X messages.
- Because MSI-X allows for each message to have its own data and address
  values (and thus does not require all of the messages to have their
  MD vectors allocated as a group), some devices allow for "sparse" use
  of MSI-X message slots.  For example, if a device supports 8 messages
  but the OS is only able to allocate 2 messages, the device may make the
  best use of 2 IRQs if it enables the messages at slots 1 and 4 rather
  than default of using the first N slots (or indicies) at 1 and 2.  To
  support this, add a new pci_remap_msix() function that a driver may call
  after a successful pci_alloc_msix() (but before allocating any of the
  SYS_RES_IRQ resources) to allow the allocated IRQ resources to be
  assigned to different message indices.  For example, from the earlier
  example, after pci_alloc_msix() returned a value of 2, the driver would
  call pci_remap_msix() passing in array of integers { 1, 4 } as the
  new message indices to use.  The rid's for the SYS_RES_IRQ resources
  will always match the message indices.  Thus, after the call to
  pci_remap_msix() the driver would be able to access the first message
  in slot 1 at SYS_RES_IRQ rid 1, and the second message at slot 4 at
  SYS_RES_IRQ rid 4.  Note that the message slots/indices are 1-based
  rather than 0-based so that they will always correspond to the rid
  values (SYS_RES_IRQ rid 0 is reserved for the legacy INTx interrupt).
  To support this API, a new PCIB_REMAP_MSIX() method was added to the
  pcib interface to change the message index for a single IRQ.

Tested by:	scottl
2007-01-22 21:48:44 +00:00
Warner Losh
fed32d7544 Remove 3rd clause, renumber, ok per email 2007-01-12 07:26:21 +00:00
Jung-uk Kim
5efc6c44ff Add SSSE3 extensions and correct CNXT-ID spelling for Intel processors. 2007-01-09 19:23:22 +00:00
Bruce Evans
0b194ec872 Fix oops in previous commit. 2006-12-29 15:48:18 +00:00
Bruce Evans
f28e1c8f99 Fixed some style bugs (mainly assorted errors in comments, and inconsistent
spelling of `result').
2006-12-29 15:29:49 +00:00
Bruce Evans
6c296ffa81 Fixed some style bugs (whitespace only). 2006-12-29 14:28:23 +00:00
Bruce Evans
7e4277e591 Try harder to garbage-collect the "LOCORE" (really asm) version of
MPLOCKED.  The cleaning in rev.1.25 was supposed to have been undone
by rev.1.26, but 1.26 could never have actually affected asm files
since atomic.h is full of C declarations so including it in asm files
would just give syntax errors.  The asm MPLOCKED is even less needed
than when misplaced definitions of it were first removed, and is now
unused in any asm file in the src tree except in anachronismns in
sys/i386/i386/support.s.
2006-12-29 13:36:26 +00:00
Bruce Evans
26ab2d1d23 Avoid an instruction in atomic_cmpset_{int_long)() in most cases.
These functions are used a lot for mutexes, so this reduces the text
size of an average kernel by about 0.75%.  This wasn't intended to
be a significant optimization, but it somehow increased the maximum
number of packets per second that can be transmitted by my bge hardware
from 320000 to 460000 (this benchmark is CPU-bound and remarkably
sensitive to changes in the text section).

Details: we would prefer to leave the result of the cmpxchg in %al,
but cannot tell gcc that it is there, so we have to convert it to an
integer register.  We converted  to %al, then to %[re]ax, but the
latter step is usually wasted since gcc usually only wants the condition
code and can recover it from %al just as easily as from %[re]ax.  Let
gcc promote %al in the few cases where this is needed.

Nearby style fixes;
- let gcc manage the load of `res', and don't abuse `res' for a copy of `exp'
- don't echo `res's name in comments
- consistently spell the condition code as 'e' after comparison for equality
- don't hard-code %al anywhere except in constraints
- for the version that doesn't use cmpxchg, there is no requirement to use
  %al anywhere, so don't hard-code it in the constraints either.

Style non-fix:
- for the versions that use cmpxchg, keep using "a" (was %[re]ax, now %al)
  for the main output operand, although this is not required.  The input
  and output operands that use the "a" constraint are now decoupled, and
  this makes things clearer except for the reason that the output register
  is hard-coded.  It is now just a hack to tell gcc that the input "a" has
  been clobbered without increasing the number of operands.
2006-12-27 20:26:00 +00:00
Kip Macy
a5c5d4402c Evidently FreeBSD has long relied on the compiler to treat structures
passed by value (trap frames) as if they were in fact being passed by
reference. For better or worse, this incorrect behaviour is no longer
present in gcc 4.1. In this patch I convert all trapframe arguments to
be explicitly pass by reference. I also remove vm86_initflags, pushing
the very little work that it actually does up into vm86_prepcall.

Reviewed by: kan
Tested by: kan
2006-12-17 05:07:01 +00:00
John Baldwin
fde45e231a Sort function prototypes. 2006-12-12 19:24:45 +00:00
Alan Cox
da44960498 The global variable avail_end is redundant and only used once. Eliminate
it.  Make avail_start static to the pmap on amd64.  (It no longer exists
on other architectures.)
2006-11-19 20:54:58 +00:00
John Baldwin
7693afca4e - Add macro constants for the various fields in %dr7 and use them in place
of various scattered magic values.
- Pretty print the address of hardware watchpoints in 'show watch' rather
  than just displaying hex.
- Expand address field width on amd64 for 64-bit pointers.
2006-11-17 19:20:32 +00:00
John Baldwin
71f4007710 Various whitespace and style fixes. 2006-11-15 19:53:48 +00:00
John Baldwin
4184900911 MD support for PCI Message Signalled Interrupts on amd64 and i386:
- Add a new apic_alloc_vectors() method to the local APIC support code
  to allocate N contiguous IDT vectors (aligned on a M >= N boundary).
  This function is used to allocate IDT vectors for a group of MSI
  messages.
- Add MSI and MSI-X PICs.  The PIC code here provides methods to manage
  edge-triggered MSI messages as x86 interrupt sources.  In addition to
  the PIC methods, msi.c also includes methods to allocate and release
  MSI and MSI-X messages.  For x86, we allow for up to 128 different
  MSI IRQs starting at IRQ 256 (IRQs 0-15 are reserved for ISA IRQs,
  16-254 for APIC PCI IRQs, and IRQ 255 is reserved).
- Add pcib_(alloc|release)_msi[x]() methods to the MD x86 PCI bridge
  drivers to bubble the request up to the nexus driver.
- Add pcib_(alloc|release)_msi[x]() methods to the x86 nexus drivers that
  ask the MSI PIC code to allocate resources and IDT vectors.

MFC after:	2 months
2006-11-13 22:23:34 +00:00
Ruslan Ermilov
d77f5882e7 Fix NKPT comments to match reality. Note that the current value
of NKPT is no longer enough to run amd64 with 16G of RAM, as it
doesn't have space for mapping a kernel (16M kernel would require
additionally 8 page tables).
2006-11-13 20:33:54 +00:00
Ruslan Ermilov
26af9ac7d0 Fix a comment. 2006-11-13 06:26:57 +00:00
Bruce Evans
43f0ea0a27 i386/include/profile.h:
Fixed a syntax error for the (!__KERNEL && !__GNUCLIKE_ASM) case in
rev.1.36.  Apparently, this case has never been reached even by lint.

Submitted by:	stefanf

{amd64,i386}/include/profile.h:
In case the above case is actually reached, break it properly by
providing null support that will fail at link time instead of a stub
that gives wrong (null) profiling at runtime.
2006-10-28 11:03:03 +00:00
Bruce Evans
853b92dacf In MCOUNT_OVERHEAD(label), actually use the `label' parameter. We were
still using the global label named "profil", and this worked accidentally
because all callers use the same name.
2006-10-28 07:59:11 +00:00
Bruce Evans
94450a83e8 Removed all traces of HIDENAME() in amd64 and i386 kernel code. Using
this used to be slightly cleaner than using ifdefs in a few places to
support both a.out and elf, but using it now just causes messes and
unportabilities.  It seems to be impossible to implement the elf
HIDENAME() portably in cpp (since token pasting of "." and <name> is
invalid).

*/prof_machdep.c:
- Removed all uses of CNAME().  CNAME() is easy enough to use in pure
  asm code, but using it in inline asm requires messy quoting.  The
  core pure asm code has been hacked on more and all uses of CNAME() in
  it have already gone away.  Just assume the elf convention here too.
- Removed now-uneeded include of <machine/asmacros.h>.
- Removed the workaround for a namespace conflict with this include.
2006-10-28 06:04:29 +00:00
Bruce Evans
447647908c Don't call mexitcount or provide a stub mexitcount to call when
profiling is configured but high resolution profiling is not configured.
Only functions in *.[Ss] called the stub, so efficiency was not
significantly affected.
2006-10-27 14:17:50 +00:00
John Baldwin
520ffff83e Change the x86 interrupt code to suspend/resume interrupt controllers
(PICs) rather than interrupt sources.  This allows interrupt controllers
with no interrupt pics (such as the 8259As when APIC is in use) to
participate in suspend/resume.
- Always register the 8259A PICs even if we don't use any of their pins.
- Explicitly reset the 8259As on resume on amd64 if 'device atpic' isn't
  included.
- Add a "dummy" PIC for the local APIC on the BSP to reset the local APIC
  on resume.  This gets suspend/resume working with APIC on UP systems.
  SMP still needs more work to bring the APs back to life.

The MFC after is tentative.

Tested by:	anholt (i386)
Submitted by:	Andrea Bittau <a.bittau at cs.ucl.ac.uk> (3)
MFC after:	1 week
2006-10-10 23:23:12 +00:00
John Baldwin
6e20fe33ba Oops, fix sign bug in #ifdef for value of INTRCNT_COUNT.
PR:		kern/99870
Submitted by:	jkim
MFC after:	3 days
2006-10-10 19:26:35 +00:00
John Birrell
6825d60738 PR:
Submitted by:
Reviewed by:
Approved by:
Obtained from:
MFC after:
Security:
Move the relocation definitions to the common elf header so that DTrace
can use them on one architecture targeted to a different one.

Add the additional ELF types defines in Sun's "Linker and Libraries"
manual.
2006-10-04 21:37:10 +00:00
Poul-Henning Kamp
f645b0b51c First part of a little cleanup in the calendar/timezone/RTC handling.
Move relevant variables to <sys/clock.h> and fix #includes as necessary.

Use libkern's much more time- & spamce-efficient BCD routines.
2006-10-02 12:59:59 +00:00
Alexander Kabaev
d9cb97ff9d Use __builtin_va_start instead of __builtin_stdarg_start. GCC4 obsoletes
the former and  __builtin_va_start was present in all GCC version 3.1 and
later.
2006-09-21 01:37:02 +00:00
John Baldwin
7e9f73f3ed First pass at allowing memory to be mapped using cache modes other than
WB (write-back) on x86 via control bits in PTEs and PDEs (including making
use of the PAT MSR).  Changes include:
- A new pmap_mapdev_attr() function for amd64 and i386 which takes an
  additional parameter (relative to pmap_mapdev()) specifying the cache
  mode for this mapping.  Note that on amd64 only WB mappings are done with
  the direct map, all other modes result in a private mapping.
- pmap_mapdev() on i386 and amd64 now defaults to using UC (uncached)
  mappings rather than WB.  Previously we relied on the BIOS setting up
  MTRR's to enforce memio regions being treated as UC.  This might make
  hw.cbb_start_memory unnecessary in some cases now for example.
- A new pmap_mapbios()/pmap_unmapbios() API has been added to allow places
  that used pmap_mapdev() to map non-device memory (such as ACPI tables)
  to do so using WB as before.
- A new pmap_change_attr() function for amd64 and i386 that changes the
  caching mode for a range of KVA.

Reviewed by:	alc
2006-08-11 19:22:57 +00:00
Jung-uk Kim
0758eaa227 Sync specialreg.h changes between amd64 and i386 with few fixes. 2006-07-13 16:09:40 +00:00
Michael Reifenberger
df2f5de4e5 Initialise (if necessary) the VIA C3/C7 features.
Store the capabilities for further use by random(4), padlock(4), ...

Obtained from:	mostly OpenBSD
MFC after:	1 week
2006-07-12 19:46:08 +00:00
Michael Reifenberger
9b6560e483 fix typo in identcpu.c and add one define to specialreg.h.
MFC after:	1 week
2006-07-12 16:52:56 +00:00
Michael Reifenberger
e5f87cebb3 First step to identify and initialize the newer VIA C7 CPU
as found in a VIA EPIA EN-15000 board.

Obtained from:	large parts from OpenBSD
2006-07-12 14:52:32 +00:00
Jung-uk Kim
444576c0c4 Add two new CPUID bits for AMD CPUs, i. e., SVM and extended APIC register. 2006-07-12 06:04:12 +00:00
Thomas Wintergerst
5d0c7501b6 Extend i4b to support CAPI manager based ISDN controllers (CAPI manager is part of
c4b, CAPI for BSD). This is a preparation to add CAPI for BSD to the source tree.

Approved by:	hm (mentor)
MFC after:	2 weeks
2006-07-09 21:16:06 +00:00
David Xu
d037e6d6d0 Style fix, use low-case. 2006-06-19 07:55:29 +00:00
David Xu
85b2d575de Clear bit 22 in MSR IA32_MISC_ENABLE, according to Intel document,
when the bit 22 is set to 1, CPUID with EAX=0 returns a maximum
value in EAX[7..0] of 3, when set to 0(default), CPUID with EAX=0
returns the number corresponding to the maximum standard function
supported. On my machine, BIOS sets the bit to 1 to make it to be
compatible with old OS, this causes dual-core Pentium-D (two
physical cores) to be identified as hyperthreading (two logical
cores) by function mp_topology().
2006-06-19 07:51:47 +00:00
David Xu
afedf1a7f1 Use the method described in IA-32 Intel Architecture Software Developer's
Manual chapter 11.6.6 to get valid mxcsr bits, use the mxcsr mask to clear
invalid bits passed by user code.

Reviewed by: bde
2006-05-30 23:44:21 +00:00
David Xu
5d84379dd6 Backout changes trying to inherit floating-point environment, although
POSIX (susv3) requires this, but it is unclear what should be inherited,
duplicating whole 387 stack for new thread seems to be unnecessary and
dangerous. Revert to previous code, force a new thread to be started with
clean FP state.
2006-05-29 02:58:37 +00:00
David Xu
38fd748725 When creating a new thread, inherit floating-point environment from
current thread, this is required by POSIX pthread_create document.
2006-05-28 02:03:13 +00:00
Maxim Sobolev
aa1807d5d6 Move clock_lock prototype into <machine/clock.h>, where it is more
appropriate.

Discussed with:	jhb
2006-05-19 18:53:50 +00:00
Poul-Henning Kamp
f6ce2a64f7 Send the pcvt(4) driver off to retirement. 2006-05-17 09:33:15 +00:00
Peter Wemm
374757c7cb Test commit after repoman upgrade. Remove one of my many email addresses
from a copyright message.
2006-05-12 22:41:58 +00:00
Peter Wemm
b02a3351e8 Test commit after repoman upgrade. Remove one of my many email addresses
from a coyright message.
2006-05-12 22:38:53 +00:00
Poul-Henning Kamp
5405ab4889 Clean out sysctl machdep.* related defines.
The cmos clock related stuff should really be in MI code.
2006-05-11 17:29:25 +00:00
John Baldwin
2b8a339c7e Add various constants for the PAT MSR and the PAT PTE and PDE flags.
Initialize the PAT MSR during boot to map PAT type 2 to Write-Combining
(WC) instead of Uncached (UC-).

MFC after:	1 month
2006-05-01 22:07:00 +00:00
John Baldwin
4ac60df584 Add a new 'pmap_invalidate_cache()' to flush the CPU caches via the
wbinvd() instruction.  This includes a new IPI so that all CPU caches on
all CPUs are flushed for the SMP case.

MFC after:	1 month
2006-05-01 21:36:47 +00:00
Peter Wemm
041a991fa7 MFamd64: shrink pv entries from 24 bytes to about 12 bytes. (336 pv entries
per page = effectively 12.19 bytes per pv entry after overheads).
Instead of using a shared UMA zone for 24 byte pv entries (two 8-byte tailq
nodes, a 4 byte pointer, and a 4 byte address), we allocate a page at a
time per process.  This provides 336 pv entries per process (actually, per
pmap address space) and eliminates one of the 8-byte tailq entries since
we now can track per-process pv entries implicitly.  The pointer to
the pmap can be eliminated by doing address arithmetic to find the metadata
on the page headers to find a single pointer shared by all 336 entries.
There is an 11-int bitmap for the freelist of those 336 entries.

This is mostly a mechanical conversion from amd64, except:
* i386 has to allocate kvm and map the pages, amd64 has them outside of kvm
* native word size is smaller, so bitmaps etc become 32 bit instead of 64
* no dump_add_page() etc stuff because they are in kvm always.
* various pmap internals tweaks because pmap uses direct map on amd64 but
  on i386 it has to use sched_pin and temporary mappings.

Also, sysctl vm.pmap.pv_entry_max and vm.pmap.shpgperproc are now
dynamic sysctls.  Like on amd64, i386 can now tune the pv entry limits
without a recompile or reboot.

This is important because of the following scenario.   If you have a 1GB
file (262144 pages) mmap()ed into 50 processes, that requires 13 million
pv entries.  At 24 bytes per pv entry, that is 314MB of ram and kvm, while
at 12 bytes it is 157MB.  A 157MB saving is significant.

Test-run by:  scottl (Thanks!)
2006-04-26 21:49:20 +00:00
Peter Wemm
4503a06eef Merge minidumps from amd64 where they were originally developed.
Major differences:
 * since there is no direct map region, there is no custom uma memory
   allocator to modify to include its pages in the dumps.
 * Various data entries are reduced from 64 bit to 32 bit to match the
   native size.

dump_add_page() and dump_drop_page() are still present in case one wants to
arrange for arbitary pages to be dumped.  This is of marginal use though
because libkvm+kgdb cannot address physical memory that isn't mapped into
kvm.
2006-04-21 04:28:43 +00:00
Marcel Moolenaar
bfcdefd8aa Eliminate HAVE_STOPPEDPCBS. On ia64 the PCPU holds a pointer to the
PCB in which the context of stopped CPUs is stored. To access this
PCB from KDB, we introduce a new define, called KDB_STOPPEDPCB. The
definition, when present, lives in <machine/kdb.h> and abstracts
where MD code saves the context. Define KDB_STOPPEDPCB on i386,
amd64, alpha and sparc64 in accordance to previous code.
2006-04-03 22:51:47 +00:00
Dag-Erling Smørgrav
6f0f8cca25 Use wrapper macros for atomic pointer operations in order to perform the
correct casts.  This should probably be merged to other architectures.
2006-03-28 14:34:48 +00:00
Rink Springer
5fa7c51ff6 Committed the xbox syscons(8)-able console driver.
Reviewed by:    arch@ (no comments)
Approved by:    imp (mentor)
2006-03-03 14:52:57 +00:00
John Baldwin
215e7c161a Rework how we wire up interrupt sources to CPUs:
- Throw out all of the logical APIC ID stuff.  The Intel docs are somewhat
  ambiguous, but it seems that the "flat" cluster model we are currently
  using is only supported on Pentium and P6 family CPUs.  The other
  "hierarchy" cluster model that is supported on all Intel CPUs with
  local APICs is severely underdocumented.  For example, it's not clear
  if the OS needs to glean the topology of the APIC hierarchy from
  somewhere (neither ACPI nor MP Table include it) and setup the logical
  clusters based on the physical hierarchy or not.  Not only that, but on
  certain Intel chipsets, even though there were 4 CPUs in a logical
  cluster, all the interrupts were only sent to one CPU anyway.
- We now bind interrupts to individual CPUs using physical addressing via
  the local APIC IDs.  This code has also moved out of the ioapic PIC
  driver and into the common interrupt source code so that it can be
  shared with MSI interrupt sources since MSI is addressed to APICs the
  same way that I/O APIC pins are.
- Interrupt source classes grow a new method pic_assign_cpu() to bind an
  interrupt source to a specific local APIC ID.
- The SMP code now tells the interrupt code which CPUs are avaiable to
  handle interrupts in a simpler and more intuitive manner.  For one thing,
  it means we could now choose to not route interrupts to HT cores if we
  wanted to (this code is currently in place in fact, but under an #if 0
  for now).
- For now we simply do static round-robin of IRQs to CPUs when the first
  interrupt handler just as before, with the change that IRQs are now
  bound to individual CPUs rather than groups of up to 4 CPUs.
- Because the IRQ to CPU mapping has now been moved up a layer, it would
  be easier to manage this mapping from higher levels.  For example, we
  could allow drivers to specify a CPU affinity map for their interrupts,
  or we could allow a userland tool to bind IRQs to specific CPUs.

The MFC is tentative, but I want to see if this fixes problems some folks
had with UP APIC kernels on 6.0 on SMP machines (an SMP kernel would work
fine, but a UP APIC kernel (such as GENERIC in RELENG_6) would lose
interrupts).

MFC after:	1 week
2006-02-28 22:24:55 +00:00
Sam Leffler
3f676959ae guard function decls with _KERNEL so user code can include this file
MFC after:	1 week
2006-02-22 21:38:33 +00:00
Rink Springer
424d9b482d Cleaned the memory initialization up, moved some defines from the framebuffer
to an include file.

Reviewed by:		imp
Approved by:		imp (mentor)
2006-02-10 18:48:22 +00:00
Roman Kurakin
8edb110aa3 Prepare for sconfig(8) update.
Change also my e-mail.
2006-01-30 13:34:57 +00:00
Warner Losh
d5e61c97a6 By popular demand, move __HAVE_ACPI and __PCI_REROUTE_INTERRUPT into
param.h.  Per request, I've placed these just after the
_NO_NAMESPACE_POLLUTION ifndef.  I've not renamed anything yet, but
may since we don't need the __.

Submitted by: bde, jhb, scottl, many others.
2006-01-09 06:05:57 +00:00
Warner Losh
501755f4f6 Define __HAVE_ACPI and/or __PCI_REROUTE_INTERRUPT, as appropriate for
each platform.  These will be used in the pci code in preference to
the complicated #ifdefs we have there now.
2006-01-01 20:59:28 +00:00
David Xu
f71ba3d4a7 Remove pcb_switchout, it has not been used for a long time. 2005-12-29 13:23:48 +00:00
David Xu
1bfa910843 Move global variable private_tss into per-cpu area.
Reviewed by: jhb
2005-12-26 00:07:19 +00:00
John Baldwin
b439e431bf Tweak how the MD code calls the fooclock() methods some. Instead of
passing a pointer to an opaque clockframe structure and requiring the
MD code to supply CLKF_FOO() macros to extract needed values out of the
opaque structure, just pass the needed values directly.  In practice this
means passing the pair (usermode, pc) to hardclock() and profclock() and
passing the boolean (usermode) to hardclock_cpu() and hardclock_process().
Other details:
- Axe clockframe and CLKF_FOO() macros on all architectures.  Basically,
  all the archs were taking a trapframe and converting it into a clockframe
  one way or another.  Now they can just extract the PC and usermode values
  directly out of the trapframe and pass it to fooclock().
- Renamed hardclock_process() to hardclock_cpu() as the latter is more
  accurate.
- On Alpha, we now run profclock() at hz (profhz == hz) rather than at
  the slower stathz.
- On Alpha, for the TurboLaser machines that don't have an 8254
  timecounter, call hardclock() directly.  This removes an extra
  conditional check from every clock interrupt on Alpha on the BSP.
  There is probably room for even further pruning here by changing Alpha
  to use the simplified timecounter we use on x86 with the lapic timer
  since we don't get interrupts from the 8254 on Alpha anyway.
- On x86, clkintr() shouldn't ever be called now unless using_lapic_timer
  is false, so add a KASSERT() to that affect and remove a condition
  to slightly optimize the non-lapic case.
- Change prototypeof  arm_handler_execute() so that it's first arg is a
  trapframe pointer rather than a void pointer for clarity.
- Use KCOUNT macro in profclock() to lookup the kernel profiling bucket.

Tested on:	alpha, amd64, arm, i386, ia64, sparc64
Reviewed by:	bde (mostly)
2005-12-22 22:16:09 +00:00
John Baldwin
696effb697 - Cleanup whitespace and extra ()s in vtophys() macros.
- Move vtophys() macros next to vtopte() where vtopte() exists to match
  comments above vtopte().
- Remove references to the alternate address space in the comment above
  vtopte().  amd64 never had the alternate address space, and i386 lost it
  prior to PAE support being added.
- s/entires/entries/ in comments.

Reviewed by:	alc
2005-12-06 21:09:01 +00:00
Ruslan Ermilov
224d140293 Drop _MACHINE_ARCH and _MACHINE defines (not to be confused with
MACHINE_ARCH and MACHINE).  Their purpose was to be able to test
in cpp(1), but cpp(1) only understands integer type expressions.
Using such unsupported expressions introduced a number of subtle
bugs, which were discovered by compiling with -Wundef.
2005-12-06 13:27:21 +00:00
John Baldwin
2dce95a085 Change the i386 code to pass the interrupt vector as a separate argument
rather than embedding it in the intrframe as if_vec.  This reduces diffs
with amd64 somewhat.
- Remove cf_vec from clockframe (it wasn't used anyway) and stop pushing
  dummy vector arguments for ipi_bitmap_handler() and lapic_handle_timer()
  since clockframe == trapframe now.
- Fix ddb to handle stack traces across interrupt entry points that just
  have a trapframe on their stack and not a trapframe + vector.
- Change intr_execute_handlers() to take a trapframe rather than an
  intrframe pointer.
- Change lapic_handle_intr() and atpic_handle_intr() to take a vector and
  trapframe rather than an intrframe.
- GC struct intrframe now that nothing uses it anymore.
- GC CLOCK_TO_TRAPFRAME() and INTR_TO_TRAPFRAME().

Reviewed by:	bde
Requested by:	peter
2005-12-05 22:39:09 +00:00
John Baldwin
f0b9813920 - Move the code to deal with handling an IPI_STOP IPI out of
ipi_nmi_handler() and into a new cpustop_handler() function.  Change
  the Xcpustop IPI_STOP handler to call this function instead of
  duplicating all the same logic in assembly.
- EOI the local APIC for the lapic timer interrupt in C rather than
  assembly.
- Bump the lazypmap IPI counter if COUNT_IPIS is defined in C rather than
  assembly.
2005-12-05 22:25:41 +00:00
John Baldwin
48c8cbcb82 - Move PUSH_FRAME and POP_FRAME into machine/asmacros.h.
- Add a new SET_KERNEL_SREGS macro that sets up %ds and %es to point to
  kernel data and %fs to point to per-CPU data and use the new macro
  in several kernel entry points including trap and interrupt handlers.
- Convert the IPI_STOP handler Xcpustop to push a standard trap frame
  rather than an application frame.
- Make the TRAP() macro private to exception.s since it is only used
  there.
- Move the PCPU_*() macros in asmacros.h out of the middle of the
  profiling macros.

Reviewed by:	bde
Requested by:	bde (4, 5)
2005-12-05 21:44:47 +00:00
Ruslan Ermilov
342ed5d948 Fix -Wundef warnings found when compiling i386 LINT, GENERIC and
custom kernels.
2005-12-05 11:58:35 +00:00
John Baldwin
1dab802e37 Garbage collect machine/smptests.h now that it is empty and no longer used. 2005-11-22 22:55:48 +00:00
John Baldwin
c21ba8d166 Make COUNT_IPIS and COUNT_XINVLTLB_HITS real kernel options and take
them out of machine/smptests.h.
2005-11-22 22:54:42 +00:00
John Baldwin
e36e973da9 Garbage collect unused {VERBOSE_,}CPUSTOP_ON_DDBBREAK macros. 2005-11-22 22:37:13 +00:00