Commit Graph

166 Commits

Author SHA1 Message Date
John Baldwin
824fc46089 MFamd64: Add support for extended FPU states on i386. This includes
support for AVX on i386.
- Similar to amd64, move the FPU save area out of the PCB and instead
  store saved FPU state in a variable-sized buffer after the PCB on the
  stack.
- To support the variable PCB location, alter the locore code to only use
  the bottom-most page of proc0stack for init386().  init386() returns
  the correct stack pointer to locore which adjusts the stack for thread0
  before calling mi_startup().
- Don't bother setting cr3 in thread0's pcb in locore before calling
  init386().  It wasn't used (init386() overwrote it at the end) and
  it doesn't work with the variable-sized FPU save area.
- Remove the new-bus attachment from npx.  This was only ever useful for
  external co-processors using IRQ13, but those have not been supported
  for several years.  npxinit() is now called much earlier during boot
  (init386()) similar to amd64.
- Implement PT_{GET,SET}XSTATE and I386_GET_XFPUSTATE.
- npxsave() is now only called from context switch contexts so it can
  use XSAVEOPT.

Differential Revision:	https://reviews.freebsd.org/D1058
Reviewed by:	kib
Tested on:	FreeBSD/i386 VM under bhyve on Intel i5-2520
2014-11-02 22:58:30 +00:00
Andriy Gapon
f25e50cf0b provide fast versions of ffsl and flsl for i386; ffsll and flsll for amd64
Reviewed by:	jhb
MFC after:	10 days
X-MFC note:	consider thirdparty modules depending on these symbols
Sponsored by:	HybridCluster
2014-02-14 15:18:37 +00:00
Konstantin Belousov
0220d04fe3 Add lfence().
MFC after:	1 week
2012-08-01 17:24:53 +00:00
John Baldwin
d706ec297a Add a clts() wrapper around the 'clts' instruction to <machine/cpufunc.h>
on x86 and use that to implement stop_emulating() in the fpu/npx code.
Reimplement start_emulating() in the non-XEN case by using load_cr0() and
rcr0() instead of the 'lmsw' and 'smsw' instructions.  Intel explicitly
discourages the use of 'lmsw' and 'smsw' on 80386 and later processors in
the description of these instructions in Volume 2 of the ADM.

Reviewed by:	kib
MFC after:	1 month
2012-07-09 20:55:39 +00:00
Jung-uk Kim
f0b28f005e Correct cpu_monitor() and cpu_mwait() for amd64. These instructions take
%rcx as "extensions" in long mode.  If any unused bit is set in %rcx, these
instructions cause general protection fault.  Fix style nits and synchronize
i386 with amd64.
2011-07-05 18:42:10 +00:00
Jung-uk Kim
0e72764232 Add a function rdtsc32() to read lower 32 bits from TSC and discard upper
32 bits.  Some times compiler inserts unnecessary instructions to preserve
unused upper 32 bits even when it is casted to a 32-bit value.  It reduces
such compiler mistakes where every cycle counts.
2011-04-14 16:53:32 +00:00
Jung-uk Kim
4854ae249c Consistently use __volatile as the rest of this file. 2011-04-14 16:19:41 +00:00
Jung-uk Kim
f54c13ea44 Consistently use C99 standard integers as the rest of this file. 2011-04-14 16:02:52 +00:00
Roman Divacky
27d4fea6c5 Change the parameter passed to the inline assembly to u_short
as we are dealing with 16bit segment registers. Change mov
to movw.

Approved by:    rpaulo (mentor)
Reviewed by:    kib, rink
2010-09-03 14:25:17 +00:00
David E. O'Brien
93d8be03d9 Quiet variable "shadows" warning:
sys/vmmeter.h: warning: shadowed declaration is here
  machine/cpufunc.h: In function 'insw':
  machine/cpufunc.h: warning: declaration of 'cnt' shadows a global declaration
  ..snip..
2010-01-01 20:55:11 +00:00
Kip Macy
46aba52a50 make read_eflags and write_eflags accomplish the same effect on PVM as native,
simplifying interrupt handling
2009-10-01 22:05:38 +00:00
Andriy Gapon
beb2c1f3e9 cpufunc.h: unify/correct style of c extension names
i386 and amd64 archs only.
inline => __inline. [1]
__asm__ => __asm. [2]

Reviewed by:	kib, jhb [1]
Suggested by:	kib [2]
MFC after:	1 week
2009-09-30 16:34:50 +00:00
Konstantin Belousov
8a5ac5d56f As was done in r195820 for amd64, use clflush for flushing cache lines
when memory page caching attributes changed, and CPU does not support
self-snoop, but implemented clflush, for i386.

Take care of possible mappings of the page by sf buffer by utilizing
the mapping for clflush, otherwise map the page transiently. Amd64
used direct map.

Proposed and reviewed by:  alc
Approved by:   re (kensmith)
2009-07-29 08:49:58 +00:00
John Baldwin
38a9df71f9 Move (read|write)_cyrix_reg() inlines from specialreg.h to cpufunc.h.
specialreg.h now consists solely of register-related macros.
2009-06-16 15:13:18 +00:00
Ed Schouten
2b7ceeb0b3 Clobber "cc" instead of using volatile.
Submitted by:	Christoph Mallon
2009-06-13 14:30:08 +00:00
Ed Schouten
e1048f7678 Simplify in/out functions (for i386 and AMD64).
Remove a hack to generate more efficient code for port numbers below
0x100, which has been obsolete for at least ten years, because GCC has
an asm constraint to specify that.

Submitted by:	Christoph Mallon <christoph mallon gmx de>
2009-04-11 14:01:01 +00:00
David E. O'Brien
e6493bbebf Change some movl's to mov's. Newer GAS no longer accept 'movl' instructions
for moving between a segment register and a 32-bit memory location.

Looked at by:	jhb
2009-01-31 11:37:21 +00:00
Kip Macy
9bf38e47a3 - move gdt, ldt allocation to before KPT allocation
- fix bugs where we would:
    - try to map the hypervisors address space
    - accidentally kick out an existing kernel mapping for some domain creation memory allocation sizes
    - accidentally skip a 2MB kernel mapping for some domain creation memory allocation sizes
- don't rely on trapping in to xen to read rcr2, reference through vcpu
- whitespace cleanups
2008-10-19 01:27:40 +00:00
Kip Macy
18bad85737 - clean up interrupt handling for xen a tiny bit
- parse the command line in to kenv
- defer shutdown watcher until later in boot

MFC after:	1 month
2008-08-20 09:16:46 +00:00
Kip Macy
93ee134a24 Integrate support for xen in to i386 common code.
MFC after:	1 month
2008-08-15 20:51:31 +00:00
Stanislav Sedov
e085f869d5 - Add cpuctl(4) pseudo-device driver to provide access to some low-level
features of CPUs like reading/writing machine-specific registers,
  retrieving cpuid data, and updating microcode.
- Add cpucontrol(8) utility, that provides userland access to
  the features of cpuctl(4).
- Add subsequent manpages.

The cpuctl(4) device operates as follows. The pseudo-device node cpuctlX
is created for each cpu present in the systems. The pseudo-device minor
number corresponds to the cpu number in the system. The cpuctl(4) pseudo-
device allows a number of ioctl to be preformed, namely RDMSR/WRMSR/CPUID
and UPDATE. The first pair alows the caller to read/write machine-specific
registers from the correspondent CPU. cpuid data could be retrieved using
the CPUID call, and microcode updates are applied via UPDATE.

The permissions are inforced based on the pseudo-device file permissions.
RDMSR/CPUID will be allowed when the caller has read access to the device
node, while WRMSR/UPDATE will be granted only when the node is opened
for writing. There're also a number of priv(9) checks.

The cpucontrol(8) utility is intened to provide userland access to
the cpuctl(4) device features. The utility also allows one to apply
cpu microcode updates.

Currently only Intel and AMD cpus are supported and were tested.

Approved by:	kib
Reviewed by:	rpaulo, cokane, Peter Jeremy
MFC after:	1 month
2008-08-08 16:26:53 +00:00
Jeff Roberson
66247efa5a - Add inlines for the monitor and mwait instructions.
Sponsored by:	Nokia
2008-04-18 05:47:56 +00:00
Nate Lawson
3b3f28135f Add "show sysregs" command to ddb. On i386, this gives gdt, idt, ldt,
cr0-4, etc.  Support should be added for other platforms that have a
different set of registers for system use.

Loosely based on: OpenBSD
Approved by:	re
2007-08-09 20:14:35 +00:00
Jacques Vidrine
f6108b6158 Add a knob for disabling/enabling HTT, "machdep.hyperthreading_allowed".
Default off due to information disclosure on multi-user systems.

Submitted by:	cperciva
Reviewed by:	jhb
2005-05-13 00:10:56 +00:00
Joerg Wunsch
a5f50ef9e4 netchild's mega-patch to isolate compiler dependencies into a central
place.

This moves the dependency on GCC's and other compiler's features into
the central sys/cdefs.h file, while the individual source files can
then refer to #ifdef __COMPILER_FEATURE_FOO where they by now used to
refer to #if __GNUC__ > 3.1415 && __BARC__ <= 42.

By now, GCC and ICC (the Intel compiler) have been actively tested on
IA32 platforms by netchild.  Extension to other compilers is supposed
to be possible, of course.

Submitted by:	netchild
Reviewed by:	various developers on arch@, some time ago
2005-03-02 21:33:29 +00:00
Warner Losh
f36cfd49ad Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson
2004-04-07 20:46:16 +00:00
Marcel Moolenaar
60aa1737df Move the definition of rss() from db_interface.c to cpufunc.h where
it belongs. Change the implementation to match those of rfs() and
rgs() for consistency and irrespective of whether the original was
more correct or not (technically speaking).
2004-04-03 22:23:36 +00:00
Tom Rhodes
a122cca953 These are changes to allow to use the Intel C/C++ compiler (lang/icc)
to build the kernel. It doesn't affect the operation if gcc.

Most of the changes are just adding __INTEL_COMPILER to #ifdef's, as
icc v8 may define __GNUC__ some parts may look strange but are
necessary.

Additional changes:
 - in_cksum.[ch]:
   * use a generic C version instead of the assembly version in the !gcc
     case (ASM code breaks with the optimizations icc does)
     -> no bad checksums with an icc compiled kernel
     Help from:		andre, grehan, das
     Stolen from: 	alpha version via ppc version
     The entire checksum code should IMHO be replaced with the DragonFly
     version (because it isn't guaranteed future revisions of gcc will
     include similar optimizations) as in:
        ---snip---
          Revision  Changes    Path
          1.12      +1 -0      src/sys/conf/files.i386
          1.4       +142 -558  src/sys/i386/i386/in_cksum.c
          1.5       +33 -69    src/sys/i386/include/in_cksum.h
          1.5       +2 -0      src/sys/netinet/igmp.c
          1.6       +0 -1      src/sys/netinet/in.h
          1.6       +2 -0      src/sys/netinet/ip_icmp.c

          1.4       +3 -4      src/contrib/ipfilter/ip_compat.h
          1.3       +1 -2      src/sbin/natd/icmp.c
          1.4       +0 -1      src/sbin/natd/natd.c
          1.48      +1 -0      src/sys/conf/files
          1.2       +0 -1      src/sys/conf/files.amd64
          1.13      +0 -1      src/sys/conf/files.i386
          1.5       +0 -1      src/sys/conf/files.pc98
          1.7       +1 -1      src/sys/contrib/ipfilter/netinet/fil.c
          1.10      +2 -3      src/sys/contrib/ipfilter/netinet/ip_compat.h
          1.10      +1 -1      src/sys/contrib/ipfilter/netinet/ip_fil.c
          1.7       +1 -1      src/sys/dev/netif/txp/if_txp.c
          1.7       +1 -1      src/sys/net/ip_mroute/ip_mroute.c
          1.7       +1 -2      src/sys/net/ipfw/ip_fw2.c
          1.6       +1 -2      src/sys/netinet/igmp.c
          1.4       +158 -116  src/sys/netinet/in_cksum.c
          1.6       +1 -1      src/sys/netinet/ip_gre.c
          1.7       +1 -2      src/sys/netinet/ip_icmp.c
          1.10      +1 -1      src/sys/netinet/ip_input.c
          1.10      +1 -2      src/sys/netinet/ip_output.c
          1.13      +1 -2      src/sys/netinet/tcp_input.c
          1.9       +1 -2      src/sys/netinet/tcp_output.c
          1.10      +1 -1      src/sys/netinet/tcp_subr.c
          1.10      +1 -1      src/sys/netinet/tcp_syncache.c
          1.9       +1 -2      src/sys/netinet/udp_usrreq.c

          1.5       +1 -2      src/sys/netinet6/ipsec.c
          1.5       +1 -2      src/sys/netproto/ipsec/ipsec.c
          1.5       +1 -1      src/sys/netproto/ipsec/ipsec_input.c
          1.4       +1 -2      src/sys/netproto/ipsec/ipsec_output.c

          and finally remove
            sys/i386/i386        in_cksum.c
            sys/i386/include     in_cksum.h
        ---snip---
 - endian.h:
   * DTRT in C++ mode
 - quad.h:
   * we don't use gcc v1 anymore, remove support for it
   Suggested by:	bde (long ago)
 - assym.h:
   * avoid zero-length arrays (remove dependency on a gcc specific
     feature)
     This change changes the contents of the object file, but as it's
     only used to generate some values for a header, and the generator
     knows how to handle this, there's no impact in the gcc case.
   Explained by:	bde
   Submitted by:	Marius Strobl <marius@alchemy.franken.de>
 - aicasm.c:
   * minor change to teach it about the way icc spells "-nostdinc"
   Not approved by:	gibbs (no reply to my mail)
 - bump __FreeBSD_version (lang/icc needs to know about the changes)

Incarnations of this patch survive gcc compiles since a loooong time,
I use it on my desktop. An icc compiled kernel works since Nov. 2003
(exceptions: snd_* if used as modules), it survives a build of the
entire ports collection with icc.

Parts of this commit contains suggestions or submissions from
Marius Strobl <marius@alchemy.franken.de>.

Reviewed by:	-arch
Submitted by:	netchild
2004-03-12 21:45:33 +00:00
Bruce Evans
a67ef0a77a Don't implement anything in the ffs family in <machine/cpufunc.h>
in the non-_KERNEL case.  This "fixes" applications that include
this "kernel-only" header and also include <strings.h> (or get
<strings.h> via the default _BSD_VISIBLE pollution in <string.h>.
In C++ there was a fatal error: the declaration specifies C linkage
but the implementation gives C++ linkage.  In C there was only a
static/extern mismatch if the headers were included in a certain order
order, and a partially redundant declaration for all include orders;
gcc emits incomplete or wrong diagnostics for these, but only for
compiling with -Wsystem-headers and certain other warning options, so
the problem was usually not seen for C.

Ports breakage reported by:	kris
2004-03-11 13:38:54 +00:00
Bruce Evans
39f0cfa27b Fixed insertion sort errors in prototype list. 2004-03-05 15:30:40 +00:00
Bruce Evans
83f1e7f9f4 Removed garbage:
- completely unused things
- all of rev.1.102 (C++ support).  <sys/cdefs.h> is included by the
  prerequisite <sys/types.h>.  __BEGIN_DECLS/__END_DECLS has no effect
  (except possibly if undefined behaviour is invoked using a hack like
  defining away __inline) since this header doesn't really support any
  extern functions.
2004-03-05 15:22:05 +00:00
Lukas Ertl
1bcf24ee9d Fix syntax errors and wrong function prototypes in several MD header
files when using non-GNUC compilers.

PR:             kern/58515
Submitted by:   Stefan Farfeleder <stefan@fafoe.narf.at>
Approved by:    grog (mentor), obrien
2004-03-05 09:19:59 +00:00
Bruce Evans
90630944c8 Backed out previous commit. This restores the warning about pessimized
(short) types for the port arg of inb() (rev.1.56).  The warning started
working for u_short types with gcc-3.3.  The pessimizations exposed
by this been fixed except for the cx and oltr drivers where the breakage
of the warning has been pushed to the drivers.
2003-08-06 18:21:27 +00:00
Poul-Henning Kamp
8b30546120 Stop GCC from whining when people use a 16 bit port number for inb() and outb() 2003-07-23 20:28:23 +00:00
David Xu
d1fc2022c3 Backout my last commit.
Requested by: bde
2003-04-20 01:35:21 +00:00
David Xu
2bdf11638e Don't return garbage in high 16 bits. 2003-04-19 02:40:39 +00:00
Peter Wemm
eb1443c8dd Create inlines for ltr(sel), lldt(sel), lidt(addr) rather than
functions that have one instruction.
2002-09-22 04:45:21 +00:00
Mark Murray
d7ee442578 Provide in inline function for the (GNUC) assembler "hlt" instruction. 2002-09-21 18:26:53 +00:00
Peter Wemm
e344afe7c9 Move SWTCH_OPTIM_STATS related code out of cpufunc.h. (This sort of stat
gathering is not an x86 cpu feature)
2002-07-21 05:22:16 +00:00
Mark Murray
7e622d3c84 Cast to prevent "signed/unsigned comparison" warnings. 2002-07-15 13:27:43 +00:00
Peter Wemm
f1b665c8fe Revive backed out pmap related changes from Feb 2002. The highlights are:
- It actually works this time, honest!
- Fine grained TLB shootdowns for SMP on i386.  IPI's are very expensive,
  so try and optimize things where possible.
- Introduce ranged shootdowns that can be done as a single IPI.
- PG_G support for i386
- Specific-cpu targeted shootdowns.  For example, there is no sense in
  globally purging the TLB cache for where we are stealing a page from
  the local unshared process on the local cpu.  Use pm_active to track
  this.
- Add some instrumentation for the tlb shootdown code.
- Rip out SMP code from <machine/cpufunc.h>
- Try and fix some very bogus PG_G and PG_PS interactions that were bad
  enough to cause vm86 bios calls to break.  vm86 depended on our existing
  bugs and this was the cause of the VESA panics last time.
- Fix the silly one-line error that caused the 'panic: bad pte' last time.
- Fix a couple of other silly one-line errors that should have caused more
  pain than they did.

Some more work is needed:
- pmap_{zero,copy}_page[_idle].  These can be done without IPI's if we
  have a hook in cpu_switch.
- The IPI handlers need some cleanup.  I have a bogus %ds load that can
  be avoided.
- APTD handling is rather bogus and appears to be a large source of
  global TLB IPI shootdowns for no really good reason.

I see speedups of between 1.5% and ~4% on buildworlds in a while 1 loop.
I expect to see a bigger difference when there is significant pageout
activity or the system otherwise has memory shortages.

I have backed out a few optimizations that I had been using over the last
few days in order to be a little more conservative.  I'll revisit these
again over the next few days as the dust settles.

New option:  DISABLE_PG_G - In case I missed something.
2002-07-12 07:56:11 +00:00
John Baldwin
6b8c698908 Rename pause() to ia32_pause() so it doesn't conflict with the pause()
function defined in <unistd.h>.  I didn't #ifdef _KERNEL it because the
mutex implementation in libpthread will probably need this.
2002-05-22 20:32:39 +00:00
John Baldwin
07508f90b6 Debug registers aren't selectors, so use saner names for the variables in
the inline functions for reading and writing the debug registers.
2002-05-22 13:29:18 +00:00
John Baldwin
2be69f326a - Sort the pause() inline into the appropriate location.
- Add many missing prototypes to the non-GCC section.
2002-05-22 13:27:05 +00:00
John Baldwin
0228ea4e0b Rename cpu_pause() to pause(). Originally I was going to make this an
MI API with empty cpu_pause() functions on other arch's, but this
functionality is definitely unique to IA-32, so I decided to leave it
as i386-only and wrap it in #ifdef's.  I should have dropped the cpu_
prefix when I made that decision.

Requested by:	bde
2002-05-22 13:19:22 +00:00
John Baldwin
bb0d293f15 Add an inline function cpu_pause() for the IA32 'pause' instruction. 2002-05-21 20:21:53 +00:00
David Malone
a983fdfe4c Move do_cpuid into the correct place in this file and make
the indentation more like the other multi-line assembley in
this file.

Someone who understands gcc constraints could update the
constraints for do_cpuid.
2002-04-10 21:18:46 +00:00
Matthew Dillon
182da8209d Stage-2 commit of the critical*() code. This re-inlines cpu_critical_enter()
and cpu_critical_exit() and moves associated critical prototypes into their
own header file, <arch>/<arch>/critical.h, which is only included by the
three MI source files that need it.

Backout and re-apply improperly comitted syntactical cleanups made to files
that were still under active development.  Backout improperly comitted program
structure changes that moved localized declarations to the top of two
procedures.  Partially re-apply one of the program structure changes to
move 'mask' into an intermediate block rather then in three separate
sub-blocks to make the code more readable.  Re-integrate bug fixes that Jake
made to the sparc64 code.

Note: In general, developers should not gratuitously move declarations out
of sub-blocks.  They are where they are for reasons of structure, grouping,
readability, compiler-localizability, and to avoid developer-introduced bugs
similar to several found in recent years in the VFS and VM code.

Reviewed by:	jake
2002-04-01 23:51:23 +00:00
Matthew Dillon
d74ac6819b Compromise for critical*()/cpu_critical*() recommit. Cleanup the interrupt
disablement assumptions in kern_fork.c by adding another API call,
cpu_critical_fork_exit().  Cleanup the td_savecrit field by moving it
from MI to MD.  Temporarily move cpu_critical*() from <arch>/include/cpufunc.h
to <arch>/<arch>/critical.c (stage-2 will clean this up).

Implement interrupt deferral for i386 that allows interrupts to remain
enabled inside critical sections.  This also fixes an IPI interlock bug,
and requires uses of icu_lock to be enclosed in a true interrupt disablement.

This is the stage-1 commit.  Stage-2 will occur after stage-1 has stabilized,
and will move cpu_critical*() into its own header file(s) + other things.
This commit may break non-i386 architectures in trivial ways.  This should
be temporary.

Reviewed by:	core
Approved by:	core
2002-03-27 05:39:23 +00:00
Bruce Evans
809dbbc99b Fixed some style bugs in the removal of __P(()). The main ones were
not removing tabs before "__P((", and not outdenting continuation lines
to preserve non-KNF lining up of code with parentheses.  Switch to KNF
formatting and/or rewrap the whole prototype in some cases.
2002-03-23 15:09:35 +00:00