freebsd-dev

Author	SHA1	Message	Date
Roger Pau Monné	079f7ef839	xen: add a hook to perform AP startup AP startup on PVH follows the PV method, so we need to add a hook in order to diverge from bare metal. Approved by: gibbs Sponsored by: Citrix Systems R&D amd64/amd64/machdep.c: - Add hook for start_all_aps on native (using native_start_all_aps defined in mp_machdep). amd64/amd64/mp_machdep.c: - Make some variables global because they will also be used by the Xen PVH AP startup code. - Use the start_all_aps hook to start APs. - Rename start_all_aps to native_start_all_aps. amd64/include/smp.h: - Add declaration for native_start_all_aps. x86/include/init.h: - Declare start_all_aps hook in init_ops. x86/xen/pv.c: - Pick external declarations from mp_machdep. - Introduce Xen PV code to start APs on PVH. - Set start_all_aps init hook to use the Xen PVH implementation.	2014-03-11 10:27:57 +00:00
Roger Pau Monné	e8da1c4877	amd64/i386: switch IPI handlers to C code. Move asm IPIs handlers to C code, so both Xen and native IPI handlers share the same code. Reviewed by: jhb Approved by: gibbs Sponsored by: Citrix Systems R&D amd64/amd64/apic_vector.S: i386/i386/apic_vector.s: - Remove asm coded IPI handlers and instead call the newly introduced C variants. amd64/amd64/mp_machdep.c: i386/i386/mp_machdep.c: - Add C coded clones to the asm IPI handlers (moved from x86/xen/hvm.c). i386/include/smp.h: amd64/include/smp.h: - Add prototypes for the C IPI handlers. x86/xen/hvm.c: - Move the C IPI handlers to mp_machdep and call those in the Xen IPI handlers. i386/xen/mp_machdep.c: - Add dummy IPI handlers to the i386 Xen PV port (this port doesn't support SMP).	2014-03-11 10:03:29 +00:00
John Baldwin	e07ef9b0f6	Move <machine/apicvar.h> to <x86/apicvar.h>.	2014-01-23 20:10:22 +00:00
Konstantin Belousov	6aceaa3e17	Tidy up some loose ends in the PCID code: - Restore the pre-PCID TLB shootdown handlers for whole address space and single page invalidation asm code, and assign the IPI handler to them when PCID is not supported or disabled. Old handlers have linear control flow. But, still use the common return sequence. - Stop using pcpu for INVPCID descriptors in the invlrg handler. It is enough to allocate descriptors on the stack. As result, two SWAPGS instructions are shaved off from the code for Haswell+. - Fix the reverted condition in invlrng for checking of the PCID support [1], also in invlrng check that pmap is kernel pmap before performing other tests. For the kernel pmap, which provides global mappings, the INVLPG must be used for invalidation always. - Save the pre-computed pmap' %CR3 register in the struct pmap. This allows to remove several checks for pm_pcid validity when %CR3 is reloaded [2]. Noted by: gibbs [1] Discussed with: alc [2] Tested by: pho, flo Sponsored by: The FreeBSD Foundation	2013-09-04 23:31:29 +00:00
Konstantin Belousov	37eed8419c	Implement support for the process-context identifiers ('PCID') on Intel CPUs. The feature tags TLB entries with the Id of the address space and allows to avoid TLB invalidation on the context switch, it is available only in the long mode. In the microbenchmarks, using the PCID decreased latency of the context switches by ~30% on SandyBridge class desktop CPUs, measured with the lat_ctx program from lmbench. If available, use INVPCID instruction when a TLB entry in non-current address space needs to be invalidated. The instruction is typically available on the Haswell. If needed, the use of PCID can be turned off with the vm.pmap.pcid_enabled loader tunable set to 0. The state of the feature is reported by the vm.pmap.pcid_enabled sysctl. The sysctl vm.pmap.pcid_save_cnt reports the number of context switches which avoided invalidating the TLB; compare with the total number of context switches, available as sysctl vm.stats.sys.v_swtch. Sponsored by: The FreeBSD Foundation Reviewed by: alc Tested by: pho, bf	2013-08-30 07:59:49 +00:00
Mitsuru IWASAKI	77c80e2e5b	Share IPI init and startup code of mp_machdep.c with acpi_wakeup.c as ipi_startup().	2012-06-12 00:14:54 +00:00
Andriy Gapon	234dab4a82	remove code for dynamic offlining/onlining of CPUs on x86 The code has definitely been broken for SCHED_ULE, which is a default scheduler. It may have been broken for SCHED_4BSD in more subtle ways, e.g. with manually configured CPU affinities and for interrupt devilery purposes. We still provide a way to disable individual CPUs or all hyperthreading "twin" CPUs before SMP startup. See the UPDATING entry for details. Interaction between building CPU topology and disabling CPUs still remains fuzzy: topology is first built using all availble CPUs and then the disabled CPUs should be "subtracted" from it. That doesn't work well if the resulting topology becomes non-uniform. This work is done in cooperation with Attilio Rao who in addition to reviewing also provided parts of code. PR: kern/145385 Discussed with: gcooper, ambrisko, mdf, sbruno Reviewed by: attilio Tested by: pho, pluknet X-MFC after: never	2011-06-08 08:12:15 +00:00
Attilio Rao	71a19bdc64	Commit the support for removing cpumask_t and replacing it directly with cpuset_t objects. That is going to offer the underlying support for a simple bump of MAXCPU and then support for number of cpus > 32 (as it is today). Right now, cpumask_t is an int, 32 bits on all our supported architecture. cpumask_t on the other side is implemented as an array of longs, and easilly extendible by definition. The architectures touched by this commit are the following: - amd64 - i386 - pc98 - arm - ia64 - XEN while the others are still missing. Userland is believed to be fully converted with the changes contained here. Some technical notes: - This commit may be considered an ABI nop for all the architectures different from amd64 and ia64 (and sparc64 in the future) - per-cpu members, which are now converted to cpuset_t, needs to be accessed avoiding migration, because the size of cpuset_t should be considered unknown - size of cpuset_t objects is different from kernel and userland (this is primirally done in order to leave some more space in userland to cope with KBI extensions). If you need to access kernel cpuset_t from the userland please refer to example in this patch on how to do that correctly (kgdb may be a good source, for example). - Support for other architectures is going to be added soon - Only MAXCPU for amd64 is bumped now The patch has been tested by sbruno and Nicholas Esborn on opteron 4 x 12 pack CPUs. More testing on big SMP is expected to came soon. pluknet tested the patch with his 8-ways on both amd64 and i386. Tested by: pluknet, sbruno, gianni, Nicholas Esborn Reviewed by: jeff, jhb, sbruno	2011-05-05 14:39:14 +00:00
Attilio Rao	8c0ef2464e	Revert md_assert_preempt() introduction. Discussed with: jeff, jhb	2011-05-04 20:29:40 +00:00
Attilio Rao	f1edea81ac	Add the function md_assert_nopreempt(), which is a very consistent function on the possibility of a thread to not preempt. As this function is very tied to x86 (interrupts disabled checkings) it is not intended to be used in MI code.	2011-04-30 23:12:37 +00:00
Alan Cox	63740078e0	Amd64 doesn't have a lazypmap ipi.	2011-03-27 16:18:51 +00:00
John Baldwin	d9d8d1449d	Add a new ipi_cpu() function to the MI IPI API that can be used to send an IPI to a specific CPU by its cpuid. Replace calls to ipi_selected() that constructed a mask for a single CPU with calls to ipi_cpu() instead. This will matter more in the future when we transition from cpumask_t to cpuset_t for CPU masks in which case building a CPU mask is more expensive. Submitted by: peter, sbruno Reviewed by: rookie Obtained from: Yahoo! (x86) MFC after: 1 month	2010-08-06 15:36:59 +00:00
Alexander Motin	d364638110	Merge COUNT_XINVLTLB_HITS and COUNT_IPIS kernel options from i386 to amd64. This information can be very valuable for CPU sleep-time (and respectively idle power consumption) optimization. Add counters for timer-related IPIs. Reviewed by: jhb@ (previous version)	2010-06-17 11:54:49 +00:00
Attilio Rao	dc6fbf6545	* Completely Remove the option STOP_NMI from the kernel. This option has proven to have a good effect when entering KDB by using a NMI, but it completely violates all the good rules about interrupts disabled while holding a spinlock in other occasions. This can be the cause of deadlocks on events where a normal IPI_STOP is expected. * Adds an new IPI called IPI_STOP_HARD on all the supported architectures. This IPI is responsible for sending a stop message among CPUs using a privileged channel when disponible. In other cases it just does match a normal IPI_STOP. Right now the IPI_STOP_HARD functionality uses a NMI on ia32 and amd64 architectures, while on the other has a normal IPI_STOP effect. It is responsibility of maintainers to eventually implement an hard stop when necessary and possible. * Use the new IPI facility in order to implement a new userend SMP kernel function called stop_cpus_hard(). That is specular to stop_cpu() but it does use the privileged channel for the stopping facility. * Let KDB use the newly introduced function stop_cpus_hard() and leave stop_cpus() for all the other cases * Disable interrupts on CPU0 when starting the process of APs suspension. * Style cleanup and comments adding This patch should fix the reboot/shutdown deadlocks many users are constantly reporting on mailing lists. Please don't forget to update your config file with the STOP_NMI option removal Reviewed by: jhb Tested by: pho, bz, rink Approved by: re (kib)	2009-08-13 17:09:45 +00:00
Attilio Rao	120b18d86f	FreeBSD right now support 32 CPUs on all the architectures at least. With the arrival of 128+ cores it is necessary to handle more than that. One of the first thing to change is the support for cpumask_t that needs to handle more than 32 bits masking (which happens now). Some places, however, still assume that cpumask_t is a 32 bits mask. Fix that situation by using always correctly cpumask_t when needed. While here, remove the part under STOP_NMI for the Xen support as it is broken in any case. Additively make ipi_nmi_pending as static. Reviewed by: jhb, kmacy Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2009-05-14 17:43:00 +00:00
Jeff Roberson	82fcb0f192	- Add support for cpuid leaf 0xb. This allows us to determine the topology of nehalem/corei7 based systems. - Remove the cpu_cores/cpu_logical detection from identcpu. - Describe the layout of the system in cpu_mp_announce(). Sponsored by: Nokia	2009-04-29 06:54:40 +00:00
Jung-uk Kim	c66d2b38c8	Initial suspend/resume support for amd64. This code is heavily inspired by Takanori Watanabe's experimental SMP patch for i386 and large portion was shamelessly cut and pasted from Peter Wemm's AP boot code.	2009-03-17 00:48:11 +00:00
Marius Strobl	6f04e7b9aa	Remove ipi_all() and ipi_self() as the former hasn't been used at all to date and the latter also is only used in ia64 and powerpc code which no longer serves a real purpose after bring-up and just can be removed as well. Note that architectures like sun4u also provide no means of implementing IPI'ing a CPU itself natively in the first place. Suggested by: jhb Reviewed by: arch, grehan, jhb	2008-09-28 18:34:14 +00:00
Jeff Roberson	81aa71755b	- Remove the old smp cpu topology specification with a new, more flexible tree structure that encodes the level of cache sharing and other properties. - Provide several convenience functions for creating one and two level cpu trees as well as a default flat topology. The system now always has some topology. - On i386 and amd64 create a seperate level in the hierarchy for HTT and multi-core cpus. This will allow the scheduler to intelligently load balance non-uniform cores. Presently we don't detect what level of the cache hierarchy is shared at each level in the topology. - Add a mechanism for testing common topologies that have more information than the MD code is able to provide via the kern.smp.topology tunable. This should be considered a debugging tool only and not a stable api. Sponsored by: Nokia	2008-03-02 07:58:42 +00:00
Attilio Rao	c8790f5d09	Fix some entries in the locks static table of witness. In particular: - smp_tlb_mtx is no longer used, so it is axed. - smp rendezvous lock isn't really a leaf spin-mutex. Its bad placement in the table, however, has been the source of a false positive LOR reporting with the dt_lock. However, smp rendezvous lock would have had sched_lock there for older lock, so it wasn't still a leaf lock. - allpmaps is only used in ia32 architecture, so it is inserted in the appropriate stub. Addictionally: - kse_zombie_lock is no longer present, so its definition is axed out. - zombie_lock doesn't need to have an exported symbol, so just let's it be declared as static. Tested by: kris Approved by: jeff (mentor) Approved by: re	2007-09-20 20:38:43 +00:00
Alexander Kabaev	fa298d5ea8	Include machine/pcb.hto turn extern struct pcb stoppcbs[]; construct into the valid C.	2007-05-19 05:01:43 +00:00
John Baldwin	4c5bec1161	Change the x86 interrupt code to use FreeBSD CPU IDs (i.e. PCPU_GET(cpuid)) rather than local APIC IDs to keep track of CPUs which can handle interrupts.	2007-03-06 17:16:47 +00:00
John Baldwin	4ac60df584	Add a new 'pmap_invalidate_cache()' to flush the CPU caches via the wbinvd() instruction. This includes a new IPI so that all CPU caches on all CPUs are flushed for the SMP case. MFC after: 1 month	2006-05-01 21:36:47 +00:00
John Baldwin	b439e431bf	Tweak how the MD code calls the fooclock() methods some. Instead of passing a pointer to an opaque clockframe structure and requiring the MD code to supply CLKF_FOO() macros to extract needed values out of the opaque structure, just pass the needed values directly. In practice this means passing the pair (usermode, pc) to hardclock() and profclock() and passing the boolean (usermode) to hardclock_cpu() and hardclock_process(). Other details: - Axe clockframe and CLKF_FOO() macros on all architectures. Basically, all the archs were taking a trapframe and converting it into a clockframe one way or another. Now they can just extract the PC and usermode values directly out of the trapframe and pass it to fooclock(). - Renamed hardclock_process() to hardclock_cpu() as the latter is more accurate. - On Alpha, we now run profclock() at hz (profhz == hz) rather than at the slower stathz. - On Alpha, for the TurboLaser machines that don't have an 8254 timecounter, call hardclock() directly. This removes an extra conditional check from every clock interrupt on Alpha on the BSP. There is probably room for even further pruning here by changing Alpha to use the simplified timecounter we use on x86 with the lapic timer since we don't get interrupts from the 8254 on Alpha anyway. - On x86, clkintr() shouldn't ever be called now unless using_lapic_timer is false, so add a KASSERT() to that affect and remove a condition to slightly optimize the non-lapic case. - Change prototypeof arm_handler_execute() so that it's first arg is a trapframe pointer rather than a void pointer for clarity. - Use KCOUNT macro in profclock() to lookup the kernel profiling bucket. Tested on: alpha, amd64, arm, i386, ia64, sparc64 Reviewed by: bde (mostly)	2005-12-22 22:16:09 +00:00
John Baldwin	333b8de537	MFi386: - Move PUSH_FRAME and POP_FRAME to asmacros.h and use PUSH_FRAME in atpic entry points. - Move PCPU_* asm macros out of the middle of the asm profiling macros. - Pass IRQ vector argument as an int rather than void * to reduce diffs with i386. - EOI the lapic in C for the lapic timer handler. - GC unused Xcpuast function. - Split IPI_STOP handling code of ipi_nmi_handler() out into a cpustop_handler() function and call it from Xcpustop rather than duplicating all the logic in assembly. - Fixup the list of symbols with interrupt frames in ddb traces. Xatpic_fastintr* have never existed on amd64, and the lapic timer handler and various IPI handlers were missing. - Use trapframe instead of intrframe for interrupt entry points (on amd64 the interrupt vector was already a separate argument, so the two frames were already identical) and GC intrframe. Submitted by: peter (3)	2005-12-08 18:33:30 +00:00
John Baldwin	58553b9925	Rename the KDB_STOP_NMI kernel option to STOP_NMI and make it apply to all IPI_STOP IPIs. - Change the i386 and amd64 MD IPI code to send an NMI if STOP_NMI is enabled if an attempt is made to send an IPI_STOP IPI. If the kernel option is enabled, there is also a sysctl to change the behavior at runtime (debug.stop_cpus_with_nmi which defaults to enabled). This includes removing stop_cpus_nmi() and making ipi_nmi_selected() a private function for i386 and amd64. - Fix ipi_all(), ipi_all_but_self(), and ipi_self() on i386 and amd64 to properly handle bitmapped IPIs as well as IPI_STOP IPIs when STOP_NMI is enabled. - Fix ipi_nmi_handler() to execute the restart function on the first CPU that is restarted making use of atomic_readandclear() rather than assuming that the BSP is always included in the set of restarted CPUs. Also, the NMI handler didn't clear the function pointer meaning that subsequent stop and restarts could execute the function again. - Define a new macro HAVE_STOPPEDPCBS on i386 and amd64 to control the use of stoppedpcbs[] and always enable it for i386 and amd64 instead of being dependent on KDB_STOP_NMI. It works fine in both the NMI and non-NMI cases.	2005-10-24 21:04:19 +00:00
Doug White	fdc9713bf7	Implement an alternate method to stop CPUs when entering DDB. Normally we use a regular IPI vector, but this vector is blocked when interrupts are disabled. With "options KDB_STOP_NMI" and debug.kdb.stop_cpus_with_nmi set, KDB will send an NMI to each CPU instead. The code also has a context-stuffing feature which helps ddb extract the state of processes running on the stopped CPUs. KDB_STOP_NMI is only useful with SMP and complains if SMP is not defined. This feature only applies to i386 and amd64 at the moment, but could be used on other architectures with the appropriate MD bits. Submitted by: ups	2005-04-30 20:01:00 +00:00
Peter Wemm	c29f1e2b3b	MFi386: Bring over John's local apic timer code	2005-02-28 23:37:35 +00:00
Peter Wemm	b6e89c6d47	JumboMFi386: use bitmapped IPI handler. Update elcr and default mptable config handler. Tidy up various local apic initialization.	2005-01-21 06:01:20 +00:00
Warner Losh	46280ae719	Begin all license/copyright comments with /*-	2005-01-05 20:17:21 +00:00
Peter Wemm	12c1418ccf	Kill the LAZYPMAP ifdefs. While they worked, they didn't do anything to help the AMD cpus (which have a hardware tlb flush filter). I held off to see what the 64 bit Intel cpus did, but it doesn't seem to help much there either. Oh well, store it in the Attic.	2004-05-16 22:11:50 +00:00
Peter Wemm	b29fd7c4db	MFi386: mp_topology().	2004-01-28 23:51:16 +00:00
Peter Wemm	0d2a298904	Initial landing of SMP support for FreeBSD/amd64. - This is heavily derived from John Baldwin's apic/pci cleanup on i386. - I have completely rewritten or drastically cleaned up some other parts. (in particular, bootstrap) - This is still a WIP. It seems that there are some highly bogus bioses on nVidia nForce3-150 boards. I can't stress how broken these boards are. I have a workaround in mind, but right now the Asus SK8N is broken. The Gigabyte K8NPro (nVidia based) is also mind-numbingly hosed. - Most of my testing has been with SCHED_ULE. SCHED_4BSD works. - the apic and acpi components are 'standard'. - If you have an nVidia nForce3-150 board, you are stuck with 'device atpic' in addition, because they somehow managed to forget to connect the 8254 timer to the apic, even though its in the same silicon! ARGH! This directly violates the ACPI spec.	2003-11-17 08:58:16 +00:00
Peter Wemm	afa8862328	Commit MD parts of a loosely functional AMD64 port. This is based on a heavily stripped down FreeBSD/i386 (brutally stripped down actually) to attempt to get a stable base to start from. There is a lot missing still. Worth noting: - The kernel runs at 1GB in order to cheat with the pmap code. pmap uses a variation of the PAE code in order to avoid having to worry about 4 levels of page tables yet. - It boots in 64 bit "long mode" with a tiny trampoline embedded in the i386 loader. This simplifies locore.s greatly. - There are still quite a few fragments of i386-specific code that have not been translated yet, and some that I cheated and wrote dumb C versions of (bcopy etc). - It has both int 0x80 for syscalls (but using registers for argument passing, as is native on the amd64 ABI), and the 'syscall' instruction for syscalls. int 0x80 preserves all registers, 'syscall' does not. - I have tried to minimize looking at the NetBSD code, except in a couple of places (eg: to find which register they use to replace the trashed %rcx register in the syscall instruction). As a result, there is not a lot of similarity. I did look at NetBSD a few times while debugging to get some ideas about what I might have done wrong in my first attempt.	2003-05-01 01:05:25 +00:00
Peter Wemm	cc66ebe2a9	Commit a partial lazy thread switch mechanism for i386. it isn't as lazy as it could be and can do with some more cleanup. Currently its under options LAZY_SWITCH. What this does is avoid %cr3 reloads for short context switches that do not involve another user process. ie: we can take an interrupt, switch to a kthread and return to the user without explicitly flushing the tlb. However, this isn't as exciting as it could be, the interrupt overhead is still high and too much blocks on Giant still. There are some debug sysctls, for stats and for an on/off switch. The main problem with doing this has been "what if the process that you're running on exits while we're borrowing its address space?" - in this case we use an IPI to give it a kick when we're about to reclaim the pmap. Its not compiled in unless you add the LAZY_SWITCH option. I want to fix a few more things and get some more feedback before turning it on by default. This is NOT a replacement for Bosko's lazy interrupt stuff. This was more meant for the kthread case, while his was for interrupts. Mine helps a little for interrupts, but his helps a lot more. The stats are enabled with options SWTCH_OPTIM_STATS - this has been a pseudo-option for years, I just added a bunch of stuff to it. One non-trivial change was to select a new thread before calling cpu_switch() in the first place. This allows us to catch the silly case of doing a cpu_switch() to the current process. This happens uncomfortably often. This simplifies a bit of the asm code in cpu_switch (no longer have to call choosethread() in the middle). This has been implemented on i386 and (thanks to jake) sparc64. The others will come soon. This is actually seperate to the lazy switch stuff. Glanced at by: jake, jhb	2003-04-02 23:53:30 +00:00
Paul Saab	87437b0b89	Nuke options HTT infavor of machdep.hlt_logical_cpus tunable/sysctl. This keeps the logical cpu's halted in the idle loop. By default the logical cpu's are halted at startup. It is also possible to halt any cpu in the idle loop now using machdep.hlt_cpus. Examples of how to use this: machdep.hlt_cpus=1 halt cpu0 machdep.hlt_cpus=2 halt cpu1 machdep.hlt_cpus=4 halt cpu2 machdep.hlt_cpus=3 halt cpu0,cpu1 Reviewed by: jhb, peter	2003-03-26 19:49:34 +00:00
Jake Burkholder	238dd3209a	Split statclock into statclock and profclock, and made the method for driving statclock based on profhz when profiling is enabled MD, since most platforms don't use this anyway. This removes the need for statclock_process, whose only purpose was to subdivide profhz, and gets the profiling clock running outside of sched_lock on platforms that implement suswintr. Also changed the interface for starting and stopping the profiling clock to do just that, instead of changing the rate of statclock, since they can now be separate. Reviewed by: jhb, tmm Tested on: i386, sparc64	2003-02-03 17:53:15 +00:00
Jim Pirzyk	b2eb172cc3	Add the !define(COMPILING_LINT) pass the pointy hat... Requested by: Juli Mallett <jmallett@FreeBSD.org>	2002-10-17 18:17:28 +00:00
Jim Pirzyk	c8c1cf0ca7	put an #error directive when SMP and CPU_DISABLE_CMPXCHG are set together. Requested by: Lars Eggart <larse@isi.edu> Enlighted how to do it by: John Baldwin <jhb@freebsd.org>	2002-10-17 05:51:36 +00:00
Peter Wemm	f1b665c8fe	Revive backed out pmap related changes from Feb 2002. The highlights are: - It actually works this time, honest! - Fine grained TLB shootdowns for SMP on i386. IPI's are very expensive, so try and optimize things where possible. - Introduce ranged shootdowns that can be done as a single IPI. - PG_G support for i386 - Specific-cpu targeted shootdowns. For example, there is no sense in globally purging the TLB cache for where we are stealing a page from the local unshared process on the local cpu. Use pm_active to track this. - Add some instrumentation for the tlb shootdown code. - Rip out SMP code from <machine/cpufunc.h> - Try and fix some very bogus PG_G and PG_PS interactions that were bad enough to cause vm86 bios calls to break. vm86 depended on our existing bugs and this was the cause of the VESA panics last time. - Fix the silly one-line error that caused the 'panic: bad pte' last time. - Fix a couple of other silly one-line errors that should have caused more pain than they did. Some more work is needed: - pmap_{zero,copy}_page[_idle]. These can be done without IPI's if we have a hook in cpu_switch. - The IPI handlers need some cleanup. I have a bogus %ds load that can be avoided. - APTD handling is rather bogus and appears to be a large source of global TLB IPI shootdowns for no really good reason. I see speedups of between 1.5% and ~4% on buildworlds in a while 1 loop. I expect to see a bigger difference when there is significant pageout activity or the system otherwise has memory shortages. I have backed out a few optimizations that I had been using over the last few days in order to be a little more conservative. I'll revisit these again over the next few days as the dust settles. New option: DISABLE_PG_G - In case I missed something.	2002-07-12 07:56:11 +00:00
Bruce Evans	809dbbc99b	Fixed some style bugs in the removal of __P(()). The main ones were not removing tabs before "__P((", and not outdenting continuation lines to preserve non-KNF lining up of code with parentheses. Switch to KNF formatting and/or rewrap the whole prototype in some cases.	2002-03-23 15:09:35 +00:00
Alfred Perlstein	b63dc6ad47	Remove __P.	2002-03-20 05:48:58 +00:00
Peter Wemm	d1693e1701	Back out all the pmap related stuff I've touched over the last few days. There is some unresolved badness that has been eluding me, particularly affecting uniprocessor kernels. Turning off PG_G helped (which is a bad sign) but didn't solve it entirely. Userland programs still crashed.	2002-02-27 09:51:33 +00:00
Peter Wemm	6bd95d70db	Work-in-progress commit syncing up pmap cleanups that I have been working on for a while: - fine grained TLB shootdown for SMP on i386 - ranged TLB shootdowns.. eg: specify a range of pages to shoot down with a single IPI, since the IPI is very expensive. Adjust some callers that used to trigger this inside tight loops to do a ranged shootdown at the end instead. - PG_G support for SMP on i386 (options ENABLE_PG_G) - defer PG_G activation till after we decide what we are going to do with PSE and the 4MB pages at the start of the kernel. This should solve some rumored strangeness about stale PG_G entries getting stuck underneath the 4MB pages. - add some instrumentation for the fine TLB shootdown - convert some asm instruction wrappers from functions to inlines. gcc seems to do a fair bit better with this. - [temporarily!] pessimize the tlb shootdown IPI handlers. I will fix this again shortly. This has been working fairly well for me for a while, but I have tweaked it again prior to commit since my last major testing round. The only outstanding problem that I know of is PG_G related, which is why there is an option for it (not on by default for SMP). I have seen a world speedups by a few percent (as much as 4 or 5% in one case) but I have not accurately measured this - I am a bit sceptical of these numbers.	2002-02-25 23:49:51 +00:00
John Baldwin	1ecf0d56c8	Small cleanups to the SMP code: - Axe inlvtlb_ok as it was completely redundant with smp_active. - Remove references to non-existent variable and non-existent file in i386/include/smp.h. - Don't perform initializations local to each CPU while holding the ap boot lock on i386 while an AP bootstraps itself. - Reorganize the AP startup code some to unify the latter half of the functions to bring an AP up. Eventually this might be broken out into a MI function in subr_smp.c.	2001-12-17 23:14:35 +00:00
John Baldwin	6caa8a1501	Overhaul of the SMP code. Several portions of the SMP kernel support have been made machine independent and various other adjustments have been made to support Alpha SMP. - It splits the per-process portions of hardclock() and statclock() off into hardclock_process() and statclock_process() respectively. hardclock() and statclock() call the _process() functions for the current process so that UP systems will run as before. For SMP systems, it is simply necessary to ensure that all other processors execute the _process() functions when the main clock functions are triggered on one CPU by an interrupt. For the alpha 4100, clock interrupts are delievered in a staggered broadcast fashion, so we simply call hardclock/statclock on the boot CPU and call the _process() functions on the secondaries. For x86, we call statclock and hardclock as usual and then call forward_hardclock/statclock in the MD code to send an IPI to cause the AP's to execute forwared_hardclock/statclock which then call the _process() functions. - forward_signal() and forward_roundrobin() have been reworked to be MI and to involve less hackery. Now the cpu doing the forward sets any flags, etc. and sends a very simple IPI_AST to the other cpu(s). AST IPIs now just basically return so that they can execute ast() and don't bother with setting the astpending or needresched flags themselves. This also removes the loop in forward_signal() as sched_lock closes the race condition that the loop worked around. - need_resched(), resched_wanted() and clear_resched() have been changed to take a process to act on rather than assuming curproc so that they can be used to implement forward_roundrobin() as described above. - Various other SMP variables have been moved to a MI subr_smp.c and a new header sys/smp.h declares MI SMP variables and API's. The IPI API's from machine/ipl.h have moved to machine/smp.h which is included by sys/smp.h. - The globaldata_register() and globaldata_find() functions as well as the SLIST of globaldata structures has become MI and moved into subr_smp.c. Also, the globaldata list is only available if SMP support is compiled in. Reviewed by: jake, peter Looked over by: eivind	2001-04-27 19:28:25 +00:00
John Baldwin	ca7ef17c08	Remove the BETTER_CLOCK #ifdef's. The code is on by default and is here to stay for the foreseeable future. OK'd by: peter (the idea)	2001-04-10 21:34:13 +00:00
Tor Egge	48bed92485	Defer assignment of low level interrupt handlers for PCI interrupts described in the MP table until something asks for the interrupt number later on.	2001-01-28 01:07:54 +00:00
Peter Wemm	a263238c86	Move io_apic_{read,write} from apic_ipl.s (where they do not belong) into mpapic.c. This gives us the benefit of C type checking. These functions are not called in any critical paths and are not used by the interrupt routines.	2000-12-06 01:04:02 +00:00
Peter Wemm	9a18987b73	GC unused assembler function apic_eoi()	2000-12-06 00:38:04 +00:00

1 2 3

110 Commits