freebsd-skq

Author	SHA1	Message	Date
avg	84460fd88d	provide fast versions of ffsl and flsl for i386; ffsll and flsll for amd64 Reviewed by: jhb MFC after: 10 days X-MFC note: consider thirdparty modules depending on these symbols Sponsored by: HybridCluster	2014-02-14 15:18:37 +00:00
gibbs	437790b349	Implement PV IPIs for PVHVM guests and further converge PV and HVM IPI implmementations. Submitted by: Roger Pau Monné Sponsored by: Citrix Systems R&D Submitted by: gibbs (misc cleanup, table driven config) Reviewed by: gibbs MFC after: 2 weeks sys/amd64/include/cpufunc.h: sys/amd64/amd64/pmap.c: Move invltlb_globpcid() into cpufunc.h so that it can be used by the Xen HVM version of tlb shootdown IPI handlers. sys/x86/xen/xen_intr.c: sys/xen/xen_intr.h: Rename xen_intr_bind_ipi() to xen_intr_alloc_and_bind_ipi(), and remove the ipi vector parameter. This api allocates an event channel port that can be used for ipi services, but knows nothing of the actual ipi for which that port will be used. Removing the unused argument and cleaning up the comments surrounding its declaration helps clarify its actual role. sys/amd64/amd64/mp_machdep.c: sys/amd64/include/cpu.h: sys/i386/i386/mp_machdep.c: sys/i386/include/cpu.h: Implement a generic framework for amd64 and i386 that allows the implementation of certain CPU management functions to be selected at runtime. Currently this is only used for the ipi send function, which we optimize for Xen when running on a Xen hypervisor, but can easily be expanded to support more operations. sys/x86/xen/hvm.c: Implement Xen PV IPI handlers and operations, replacing native send IPI. sys/amd64/include/pcpu.h: sys/i386/include/pcpu.h: sys/i386/include/smp.h: Remove NR_VIRQS and NR_IPIS from FreeBSD headers. NR_VIRQS is defined already for us in the xen interface files. NR_IPIS is only needed in one file per Xen platform and is easily inferred by the IPI vector table that is defined in those files. sys/i386/xen/mp_machdep.c: Restructure to more closely match the HVM implementation by performing table driven IPI setup.	2013-09-06 22:17:02 +00:00
kib	f8c0849eff	Provide a wrapper for the INVPCID instruction, definition of the descriptor and symbolic names for the operation types. Sponsored by: The FreeBSD Foundation Reviewed by: alc Tested by: pho, bf	2013-08-30 07:42:38 +00:00
kib	36babd37ca	Add lfence(). MFC after: 1 week	2012-08-01 17:24:53 +00:00
jhb	501beca671	Add a clts() wrapper around the 'clts' instruction to <machine/cpufunc.h> on x86 and use that to implement stop_emulating() in the fpu/npx code. Reimplement start_emulating() in the non-XEN case by using load_cr0() and rcr0() instead of the 'lmsw' and 'smsw' instructions. Intel explicitly discourages the use of 'lmsw' and 'smsw' on 80386 and later processors in the description of these instructions in Volume 2 of the ADM. Reviewed by: kib MFC after: 1 month	2012-07-09 20:55:39 +00:00
jhb	01cf370271	Now that our assembler supports the xsave family of instructions, use them natively rather than hand-assembled versions. For xgetbv/xsetbv, add a wrapper API to deal with xcr* registers: rxcr() and load_xcr(). Reviewed by: kib MFC after: 1 month	2012-07-05 18:19:35 +00:00
alc	aacced50df	Optimize reserve_pv_entries() using the popcnt instruction.	2012-06-30 20:25:12 +00:00
jhb	f436893eac	Correct function prototype for read_rflags().	2012-02-27 17:28:47 +00:00
kib	a39b6a3bff	Move xrstor/xsave/xsetbv into fpu.c and reorder them. Requested by: bde MFC after: 1 month	2012-01-30 07:53:33 +00:00
kib	dbd94fb4b8	Order newly added functions alphabetically. Requested by: bde MFC after: 3 days	2012-01-25 12:43:27 +00:00
kib	6a34de7c5a	Implement xsetbv(), xsave() and xrstor() providing C access to the similarly named CPU instructions. Since our in-tree binutils gas is not aware of the instructions, and I have to use the byte-sequence to encode them, hardcode the r/m operand as (%rdi). This way, first argument of the pseudo-function is already placed into proper register. MFC after: 1 week	2012-01-17 07:30:36 +00:00
jkim	52539f62b4	Correct cpu_monitor() and cpu_mwait() for amd64. These instructions take %rcx as "extensions" in long mode. If any unused bit is set in %rcx, these instructions cause general protection fault. Fix style nits and synchronize i386 with amd64.	2011-07-05 18:42:10 +00:00
jkim	7d94642dc2	Add a function rdtsc32() to read lower 32 bits from TSC and discard upper 32 bits. Some times compiler inserts unnecessary instructions to preserve unused upper 32 bits even when it is casted to a 32-bit value. It reduces such compiler mistakes where every cycle counts.	2011-04-14 16:53:32 +00:00
jkim	8ada5a0bae	Consistently use __volatile as the rest of this file.	2011-04-14 16:19:41 +00:00
jkim	581c70b26c	Prefer C99 standard integers to reduce diff from i386 version.	2011-04-14 16:14:35 +00:00
rdivacky	b464d39d95	Change the parameter passed to the inline assembly to u_short as we are dealing with 16bit segment registers. Change mov to movw. Approved by: rpaulo (mentor) Reviewed by: kib, rink	2010-09-03 14:25:17 +00:00
obrien	bbbada417d	Quiet variable "shadows" warning: sys/vmmeter.h: warning: shadowed declaration is here machine/cpufunc.h: In function 'insw': machine/cpufunc.h: warning: declaration of 'cnt' shadows a global declaration ..snip..	2010-01-01 20:55:11 +00:00
avg	b6e8843767	cpufunc.h: unify/correct style of c extension names i386 and amd64 archs only. inline => __inline. [1] __asm__ => __asm. [2] Reviewed by: kib, jhb [1] Suggested by: kib [2] MFC after: 1 week	2009-09-30 16:34:50 +00:00
kib	f8feb430b0	When the page caching attributes are changed, after new mapping is established, OS shall flush the caches on all processors that may have used the mapping previously. This operation is not needed if processors support self-snooping. If not, but clflush instruction is implemented on the CPU, series of the clflush can be used on the mapping region. Otherwise, we have to flush the whole cache. The later operation is very expensive, and AMD-made CPUs do not have self-snooping. Implement cache flush for remapped region by using clflush for amd64, when supported by CPU. Proposed and reviewed by: alc Approved by: re (kensmith)	2009-07-22 14:32:38 +00:00
ed	a0f5dad6a9	Simplify in/out functions (for i386 and AMD64). Remove a hack to generate more efficient code for port numbers below 0x100, which has been obsolete for at least ten years, because GCC has an asm constraint to specify that. Submitted by: Christoph Mallon <christoph mallon gmx de>	2009-04-11 14:01:01 +00:00
ed	0d9b7a69e5	Don't explicitly force ecx to be used for MSR_FSBASE/MSR_GSBASE. Because the "c" input constaint is used, the compiler will already place the MSR_FSBASE/MSR_GSBASE constants in ecx. Using __asm("ecx") makes LLVM crash. Even though this is also an LLVM bug, we'd better remove the unnecessary GCCism as well. Submitted by: Christoph Mallon <christoph.mallon@gmx.de>	2009-04-07 19:31:36 +00:00
obrien	7a153194ec	Change some movl's to mov's. Newer GAS no longer accept 'movl' instructions for moving between a segment register and a 32-bit memory location. Looked at by: jhb	2009-01-31 11:37:21 +00:00
stas	a782fc10fe	- Add cpuctl(4) pseudo-device driver to provide access to some low-level features of CPUs like reading/writing machine-specific registers, retrieving cpuid data, and updating microcode. - Add cpucontrol(8) utility, that provides userland access to the features of cpuctl(4). - Add subsequent manpages. The cpuctl(4) device operates as follows. The pseudo-device node cpuctlX is created for each cpu present in the systems. The pseudo-device minor number corresponds to the cpu number in the system. The cpuctl(4) pseudo- device allows a number of ioctl to be preformed, namely RDMSR/WRMSR/CPUID and UPDATE. The first pair alows the caller to read/write machine-specific registers from the correspondent CPU. cpuid data could be retrieved using the CPUID call, and microcode updates are applied via UPDATE. The permissions are inforced based on the pseudo-device file permissions. RDMSR/CPUID will be allowed when the caller has read access to the device node, while WRMSR/UPDATE will be granted only when the node is opened for writing. There're also a number of priv(9) checks. The cpucontrol(8) utility is intened to provide userland access to the cpuctl(4) device features. The utility also allows one to apply cpu microcode updates. Currently only Intel and AMD cpus are supported and were tested. Approved by: kib Reviewed by: rpaulo, cokane, Peter Jeremy MFC after: 1 month	2008-08-08 16:26:53 +00:00
jeff	b82673ba40	- Add inlines for the monitor and mwait instructions. Sponsored by: Nokia	2008-04-18 05:47:56 +00:00
nectar	9ee23cec03	Add a knob for disabling/enabling HTT, "machdep.hyperthreading_allowed". Default off due to information disclosure on multi-user systems. Submitted by: cperciva Reviewed by: jhb	2005-05-13 00:10:56 +00:00
peter	aae81bd5b3	Remove diffs to i386 version that came in via the compiler support ifdefs. This changes things like whitespace, inconsistent use of #ifndef vs #if !defined(), different macro argument orders, mismatched comments, etc.	2005-03-11 22:16:09 +00:00
joerg	c85a3e95f7	netchild's mega-patch to isolate compiler dependencies into a central place. This moves the dependency on GCC's and other compiler's features into the central sys/cdefs.h file, while the individual source files can then refer to #ifdef __COMPILER_FEATURE_FOO where they by now used to refer to #if __GNUC__ > 3.1415 && __BARC__ <= 42. By now, GCC and ICC (the Intel compiler) have been actively tested on IA32 platforms by netchild. Extension to other compilers is supposed to be possible, of course. Submitted by: netchild Reviewed by: various developers on arch@, some time ago	2005-03-02 21:33:29 +00:00
ps	1be1b43db4	MFia64: Fix -O builds with gcc 3.4 by defining ffs as __builtin_ffs instead of creating an inline function that just calls __builtin_ffs.	2004-07-30 16:44:29 +00:00
peter	b035268c4a	MFi386: move rss() from db_interface.c to cpufunc.h	2004-04-07 00:41:05 +00:00
imp	ceab5f79a3	Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and email from Peter Wemm. Approved by: core, peter	2004-04-05 23:55:14 +00:00
bde	7bc1e8430f	Don't implement anything in the ffs family in <machine/cpufunc.h> in the non-_KERNEL case. This "fixes" applications that include this "kernel-only" header and also include <strings.h> (or get <strings.h> via the default _BSD_VISIBLE pollution in <string.h>. In C++ there was a fatal error: the declaration specifies C linkage but the implementation gives C++ linkage. In C there was only a static/extern mismatch if the headers were included in a certain order order, and a partially redundant declaration for all include orders; gcc emits incomplete or wrong diagnostics for these, but only for compiling with -Wsystem-headers and certain other warning options, so the problem was usually not seen for C. Ports breakage reported by: kris	2004-03-11 13:38:54 +00:00
peter	d2aca56c04	MFi386: re-sort non-gcc function prototypes, trim includes	2004-03-08 00:24:15 +00:00
le	d23ba57557	Fix syntax errors and wrong function prototypes in several MD header files when using non-GNUC compilers. PR: kern/58515 Submitted by: Stefan Farfeleder <stefan@fafoe.narf.at> Approved by: grog (mentor), obrien	2004-03-05 09:19:59 +00:00
peter	9ba1ee132d	Re-add debug register functions	2004-01-28 23:53:04 +00:00
peter	955dcc87c1	Add 64 bit bsf/ffs routines. Have the ffs() inline use gcc's builtin because it uses the better cmove instructions to avoid branches.	2003-12-06 23:22:43 +00:00
peter	2edc2f597b	Update the graffiti.	2003-11-08 04:39:22 +00:00
peter	12d7e4bee6	Collect the nastiness for preserving the kernel MSR_GSBASE around the load_gs() calls into a single place that is less likely to go wrong. Eliminate the per-process context switching of MSR_GSBASE, because it should be constant for a single cpu. Instead, save/restore it during the loading of the new %gs selector for the new process. Approved by: re (amd64/* blanket)	2003-05-15 00:23:40 +00:00
peter	770abdbb9c	Add BASIC i386 binary support for the amd64 kernel. This is largely stolen from the ia64/ia32 code (indeed there was a repocopy), but I've redone the MD parts and added and fixed a few essential syscalls. It is sufficient to run i386 binaries like /bin/ls, /usr/bin/id (dynamic) and p4. The ia64 code has not implemented signal delivery, so I had to do that. Before you say it, yes, this does need to go in a common place. But we're in a freeze at the moment and I didn't want to risk breaking ia64. I will sort this out after the freeze so that the common code is in a common place. On the AMD64 side, this required adding segment selector context switch support and some other support infrastructure. The %fs/%gs etc code is hairy because loading %gs will clobber the kernel's current MSR_GSBASE setting. The segment selectors are not used by the kernel, so they're only changed at context switch time or when changing modes. This still needs to be optimized. Approved by: re (amd64/* blanket)	2003-05-14 04:10:49 +00:00
peter	45949ccde1	Commit MD parts of a loosely functional AMD64 port. This is based on a heavily stripped down FreeBSD/i386 (brutally stripped down actually) to attempt to get a stable base to start from. There is a lot missing still. Worth noting: - The kernel runs at 1GB in order to cheat with the pmap code. pmap uses a variation of the PAE code in order to avoid having to worry about 4 levels of page tables yet. - It boots in 64 bit "long mode" with a tiny trampoline embedded in the i386 loader. This simplifies locore.s greatly. - There are still quite a few fragments of i386-specific code that have not been translated yet, and some that I cheated and wrote dumb C versions of (bcopy etc). - It has both int 0x80 for syscalls (but using registers for argument passing, as is native on the amd64 ABI), and the 'syscall' instruction for syscalls. int 0x80 preserves all registers, 'syscall' does not. - I have tried to minimize looking at the NetBSD code, except in a couple of places (eg: to find which register they use to replace the trashed %rcx register in the syscall instruction). As a result, there is not a lot of similarity. I did look at NetBSD a few times while debugging to get some ideas about what I might have done wrong in my first attempt.	2003-05-01 01:05:25 +00:00
davidxu	f781b4eab2	Backout my last commit. Requested by: bde	2003-04-20 01:35:21 +00:00
davidxu	6223a95348	Don't return garbage in high 16 bits.	2003-04-19 02:40:39 +00:00
peter	e8f9ac26c8	Create inlines for ltr(sel), lldt(sel), lidt(addr) rather than functions that have one instruction.	2002-09-22 04:45:21 +00:00
markm	fed5dfb239	Provide in inline function for the (GNUC) assembler "hlt" instruction.	2002-09-21 18:26:53 +00:00
peter	4a13c91393	Move SWTCH_OPTIM_STATS related code out of cpufunc.h. (This sort of stat gathering is not an x86 cpu feature)	2002-07-21 05:22:16 +00:00
markm	ba39a3d3df	Cast to prevent "signed/unsigned comparison" warnings.	2002-07-15 13:27:43 +00:00
peter	4d88d6566a	Revive backed out pmap related changes from Feb 2002. The highlights are: - It actually works this time, honest! - Fine grained TLB shootdowns for SMP on i386. IPI's are very expensive, so try and optimize things where possible. - Introduce ranged shootdowns that can be done as a single IPI. - PG_G support for i386 - Specific-cpu targeted shootdowns. For example, there is no sense in globally purging the TLB cache for where we are stealing a page from the local unshared process on the local cpu. Use pm_active to track this. - Add some instrumentation for the tlb shootdown code. - Rip out SMP code from <machine/cpufunc.h> - Try and fix some very bogus PG_G and PG_PS interactions that were bad enough to cause vm86 bios calls to break. vm86 depended on our existing bugs and this was the cause of the VESA panics last time. - Fix the silly one-line error that caused the 'panic: bad pte' last time. - Fix a couple of other silly one-line errors that should have caused more pain than they did. Some more work is needed: - pmap_{zero,copy}_page[_idle]. These can be done without IPI's if we have a hook in cpu_switch. - The IPI handlers need some cleanup. I have a bogus %ds load that can be avoided. - APTD handling is rather bogus and appears to be a large source of global TLB IPI shootdowns for no really good reason. I see speedups of between 1.5% and ~4% on buildworlds in a while 1 loop. I expect to see a bigger difference when there is significant pageout activity or the system otherwise has memory shortages. I have backed out a few optimizations that I had been using over the last few days in order to be a little more conservative. I'll revisit these again over the next few days as the dust settles. New option: DISABLE_PG_G - In case I missed something.	2002-07-12 07:56:11 +00:00
jhb	096c0249dc	Rename pause() to ia32_pause() so it doesn't conflict with the pause() function defined in <unistd.h>. I didn't #ifdef _KERNEL it because the mutex implementation in libpthread will probably need this.	2002-05-22 20:32:39 +00:00
jhb	06e51352f6	Debug registers aren't selectors, so use saner names for the variables in the inline functions for reading and writing the debug registers.	2002-05-22 13:29:18 +00:00
jhb	fd3295715e	- Sort the pause() inline into the appropriate location. - Add many missing prototypes to the non-GCC section.	2002-05-22 13:27:05 +00:00
jhb	2f66cc911b	Rename cpu_pause() to pause(). Originally I was going to make this an MI API with empty cpu_pause() functions on other arch's, but this functionality is definitely unique to IA-32, so I decided to leave it as i386-only and wrap it in #ifdef's. I should have dropped the cpu_ prefix when I made that decision. Requested by: bde	2002-05-22 13:19:22 +00:00

1 2 3 4

171 Commits