freebsd-skq

Author	SHA1	Message	Date
scottl	136dd28a2e	Double the number of kernel page tables for amd64 and for i386/PAE. The old value was only enough for 8GB of RAM, the new value can do 16GB. This still isn't optimal since it doesn't scale. Fixing this for amd64 looks to be fairly easy, but for i386 will be quite difficult. Reviewed by: peter	2004-09-11 01:31:26 +00:00
scottl	d9af98161a	Turn PREEMPTION into a kernel option. Make sure that it's defined if FULL_PREEMPTION is defined. Add a runtime warning to ULE if PREEMPTION is enabled (code inspired by the PREEMPTION warning in kern_switch.c). This is a possible MT5 candidate.	2004-09-02 18:59:15 +00:00
julian	2c3ff98534	Give up trying to make preemption dependent on SCHED_4BSD the list of breakages was getting too long	2004-09-01 20:41:18 +00:00
julian	ecf87479de	Don't ask for this for modules. no modules need to know about preemption at the moment	2004-09-01 18:29:57 +00:00
scottl	c0ab9d7bd6	Protect the PREEMPTION logic with #ifdef _KERNEL to fix the build.	2004-09-01 10:12:08 +00:00
julian	5a545bc8bc	Only turn preemption for 4bsd. it's still poison for ULE.	2004-09-01 09:01:32 +00:00
julian	8354ba9e3a	Give the 4bsd scheduler the ability to wake up idle processors when there is new work to be done. MFC after: 5 days	2004-09-01 06:42:02 +00:00
marcel	01fd13440d	Move the kernel-specific logic to adjust frompc from MI to MD. For these two reasons: 1. On ia64 a function pointer does not hold the address of the first instruction of a functions implementation. It holds the address of a function descriptor. Hence the user(), btrap(), eintr() and bintr() prototypes are wrong for getting the actual code address. 2. The logic forces interrupt, trap and exception entry points to be layed-out contiguously. This can not be achieved on ia64 and is generally just bad programming. The MCOUNT_FROMPC_USER macro is used to set the frompc argument to some kernel address which represents any frompc that falls outside the kernel text range. The macro can expand to ~0U to bail out in that case. The MCOUNT_FROMPC_INTR macro is used to set the frompc argument to some kernel address to represent a call to a trap or interrupt handler. This to avoid that the trap or interrupt handler appear to be called from everywhere in the call graph. The macro can expand to ~0U to prevent adjusting frompc. Note that the argument is selfpc, not frompc. This commit defines the macros on all architectures equivalently to the original code in sys/libkern/mcount.c. People can take it from here... Compile-tested on: alpha, amd64, i386, ia64 and sparc64 Boot-tested on: i386	2004-08-27 19:42:35 +00:00
obrien	cb4ed35612	Fix a bug in in_cksum_hdr w/o -O. The C code assumes that the carry bit is always kept from the previous operation. However, the pointer indexing requires another add operation. Thus, the carry bit from the first operation is tromped over by the "addl" operation that ends up following it, so the "adcl" that follows that has no effect because the carry bit is cleared before it. The result is checksum failure on received packets. The larger issue is that there isn't any other way of preventing the compiler inserting arbitrary instructions between different __asm statements (and that the commit message in revision 1.13 of in_cksum.h is wrong on this point). From http://developer.apple.com/documentation/DeveloperTools/gcc-3.3/gcc/Extended-Asm.html ---8<---8<---8<--- You can't expect a sequence of volatile asm instructions to remain perfectly consecutive. If you want consecutive output, use a single asm. Also, GCC will perform some optimizations across a volatile asm instruction; GCC does not "forget everything" when it encounters a volatile asm instruction the way some other compilers do. ---8<---8<---8<--- Also, this change also makes the ASM code much easier to read. PR: 69257 Submitted by: Mike Bristow <mike@urgle.com>, Qing Li <qing.li@bluecoat.com>	2004-08-25 18:28:15 +00:00
obrien	01b52fbc3e	Increase the scaling of VM_KMEM_SIZE_MAX. Submitted by: alc	2004-08-16 08:35:22 +00:00
rwatson	abf6ea2973	Add an "options MP_WATCHDOG" to i386. This option allows one of the logical CPUs on a system to be used as a dedicated watchdog to cause a drop to the debugger and/or generate an NMI to the boot processor if the kernel ceases to respond. A sysctl enables the watchdog running out of the processor's idle thread; a callout is launched to reset a timer in the watchdog. If the callout fails to reset the timer for ten seconds, the watchdog will fire. The sysctl allows you to select which CPU will run the watchdog. A sample "debug.leak_schedlock" is included, which causes a sysctl to spin holding sched_lock in order to trigger the watchdog. On my Xeons, the watchdog is able to detect this failure mode and break into the debugger, which cannot otherwise be done without an NMI button. This option does not currently work with sched_ule due to ule's push notion of scheduling, similar to machdep.hlt_logical_cpus failing to work with that scheduler. On face value, this might seem somewhat inefficient, but there are a lot of dual-processor Xeons with HTT around, so using one as a watchdog for testing is not as inefficient as one might fear.	2004-08-15 18:02:09 +00:00
mux	35780dc21a	Instead of calling ia32_pause() conditionally on __i386__ or __amd64__ being defined, define and use a new MD macro, cpu_spinwait(). It only expands to something on i386 and amd64, so the compiled code should be identical. Name of the macro found by: jhb Reviewed by: jhb	2004-08-03 18:44:27 +00:00
dfr	090a3e72f1	Add definitions for TLS relocations.	2004-08-02 19:12:17 +00:00
scottl	fb7f90d7ec	Optimize intr_execute_handlers() by combining the pic_disable_source() and pic_eoi_source() into one call. This halves the number of spinlock operations and indirect function calls in the normal case of handling a normal (ithread) interrupt. Optimize the atpic and ioapic drivers to use inlines where appropriate in supporting the intr_execute_handlers() change. This knocks 900ns, or roughly 1350 cycles, off of the time spent servicing an interrupt in the common case on my 1.5GHz P4 uniprocessor system. SMP systems likely won't see as much of a gain due to the ioapic being more efficient than the atpic. I'll investigate porting this to amd64 soon. Reviewed by: jhb	2004-08-02 15:31:10 +00:00
scottl	30cf65f35d	Turn off PREEMPTION by default while it gets debugged. It's been causing 4 weeks of problems including deadlocks and instant panics. Note that the real bugs are likely in the scheduler.	2004-08-01 14:31:45 +00:00
markm	a6c822020d	Break out the MI part of the /dev/[k]mem and /dev/io drivers into their own directory and module, leaving the MD parts in the MD area (the MD parts _are_ part of the modules). /dev/mem and /dev/io are now loadable modules, thus taking us one step further towards a kernel created entirely out of modules. Of course, there is nothing preventing the kernel from having these statically compiled.	2004-08-01 11:40:54 +00:00
rwatson	4ab080249a	Pass a thread argument into cpu_critical_{enter,exit}() rather than dereference curthread. It is called only from critical_{enter,exit}(), which already dereferences curthread. This doesn't seem to affect SMP performance in my benchmarks, but improves MySQL transaction throughput by about 1% on UP on my Xeon. Head nodding: jhb, bmilekic	2004-07-27 16:41:01 +00:00
das	86c293bf54	Make FLT_ROUNDS correctly reflect the dynamic rounding mode.	2004-07-19 08:17:25 +00:00
marcel	aae5483213	Mega update for the KDB framework: turn DDB into a KDB backend. Most of the changes are a direct result of adding thread awareness. Typically, DDB_REGS is gone. All registers are taken from the trapframe and backtraces use the PCB based contexts. DDB_REGS was defined to be a trapframe on all platforms anyway. Thread awareness introduces the following new commands: thread X switch to thread X (where X is the TID), show threads list all threads. The backtrace code has been made more flexible so that one can create backtraces for any thread by giving the thread ID as an argument to trace. With this change, ia64 has support for breakpoints.	2004-07-10 23:47:20 +00:00
marcel	70e2b7a25f	Update for the KDB framework: o s/ddb_on_nmi/kdb_on_nmi/g o Rename sysctl machdep.ddb_on_nmi to machdep.kdb_on_nmi o Make debugging support conditional upon KDB instead of DDB. o Call kdb_reenter() when kdb_active is non-zero. o Call kdb_trap() to enter the debugger when not already active. o Update comments accordingly. o Remove misplaced prototype of kdb_trap().	2004-07-10 22:11:14 +00:00
marcel	006d404cf7	Implement makectx(). The makectx() function is used by KDB to create a PCB from a trapframe for purposes of unwinding the stack. The PCB is used as the thread context and all but the thread that entered the debugger has a valid PCB. This function can also be used to create a context for the threads running on the CPUs that have been stopped when the debugger got entered. This however is not done at the time of this commit.	2004-07-10 19:56:00 +00:00
marcel	60b53542e9	Introduce the KDB debugger frontend. The frontend provides a framework in which multiple (presumably different) debugger backends can be configured and which provides basic services to those backends. Besides providing services to backends, it also serves as the single point of contact for any and all code that wants to make use of the debugger functions, such as entering the debugger or handling of the alternate break sequence. For this purpose, the frontend has been made non-optional. All debugger requests are forwarded or handed over to the current backend, if applicable. Selection of the current backend is done by the debug.kdb.current sysctl. A list of configured backends can be obtained with the debug.kdb.available sysctl. One can enter the debugger by writing to the debug.kdb.enter sysctl.	2004-07-10 18:40:12 +00:00
marcel	6e0dcca8a9	Introduce the GDB debugger backend for the new KDB framework. The backend improves over the old GDB support in the following ways: o Unified implementation with minimal MD code. o A simple interface for devices to register themselves as debug ports, ala consoles. o Compression by using run-length encoding. o Implements GDB threading support.	2004-07-10 17:47:22 +00:00
jhb	696704716d	Implement preemption of kernel threads natively in the scheduler rather than as one-off hacks in various other parts of the kernel: - Add a function maybe_preempt() that is called from sched_add() to determine if a thread about to be added to a run queue should be preempted to directly. If it is not safe to preempt or if the new thread does not have a high enough priority, then the function returns false and sched_add() adds the thread to the run queue. If the thread should be preempted to but the current thread is in a nested critical section, then the flag TDF_OWEPREEMPT is set and the thread is added to the run queue. Otherwise, mi_switch() is called immediately and the thread is never added to the run queue since it is switch to directly. When exiting an outermost critical section, if TDF_OWEPREEMPT is set, then clear it and call mi_switch() to perform the deferred preemption. - Remove explicit preemption from ithread_schedule() as calling setrunqueue() now does all the correct work. This also removes the do_switch argument from ithread_schedule(). - Do not use the manual preemption code in mtx_unlock if the architecture supports native preemption. - Don't call mi_switch() in a loop during shutdown to give ithreads a chance to run if the architecture supports native preemption since the ithreads will just preempt DELAY(). - Don't call mi_switch() from the page zeroing idle thread for architectures that support native preemption as it is unnecessary. - Native preemption is enabled on the same archs that supported ithread preemption, namely alpha, i386, and amd64. This change should largely be a NOP for the default case as committed except that we will do fewer context switches in a few cases and will avoid the run queues completely when preempting. Approved by: scottl (with his re@ hat)	2004-07-02 20:21:44 +00:00
peter	486f03d1c3	Reduce the size of pv entries by 15%. This saves 1MB of KVA for mapping pv entries per 1GB of user virtual memory. (eg: if we had 1GB file was mmaped into 30 processes, that would theoretically reduce the KVA demand by 30MB for pv entries. In reality though, we limit pv entries so we don't have that many at once.) We used to store the vm_page_t for the page table page. But we recently had the pa of the ptp, or can calculate it fairly quickly. If we wanted to avoid the shift/mask operation in pmap_pde(), we could recover the pa but that means we have to store it for a while. This does not measurably change performance. Suggested by: alc Tested by: alc	2004-06-29 15:57:05 +00:00
imp	dde16666cb	bde points out that this can't do anything useful. The full patch has other parts that I can't locat at the moment, so back it out until I can.	2004-06-28 05:37:22 +00:00
imp	c83f7f6bdc	When opening /dev/io, preserve iopl properly. Otherwise, if you open /dev/io multiple times, the first close remove the privs.	2004-06-28 03:16:32 +00:00
imp	16d11e08eb	remove needless blankline	2004-06-28 03:08:17 +00:00
jhb	22d8e1a72e	Various cleanups in support of a future ioapic_config_intr() function: - Allow ioapic_set_{nmi,smi,extint}() to be called multiple times on the same pin so long as the pin's mode is the same as the mode being requested. - Add a notion of bus type for the interrupt associated with interrupt pin. This is needed so that we can force all EISA interrupts to be active high in the forthcoming ioapic_config_intr(). - Fix a bug for EISA systems that didn't remap IRQs. This would have broken EISA systems that tried to disable mixed mode for IRQ 0.	2004-06-23 15:29:20 +00:00
bde	da4e7c693b	Backed out previous commit. Blind substitution of dev_t by `struct cdev *' was just wrong here because the dev_t's are user dev_t's.	2004-06-20 03:52:50 +00:00
bde	8f6039173c	Include <sys/_lock.h>'s prerequisite <sys/queue.h> before including the former, not after.	2004-06-20 00:33:14 +00:00
phk	dfd1f7fd50	Do the dreaded s/dev_t/struct cdev */ Bump __FreeBSD_version accordingly.	2004-06-16 09:47:26 +00:00
alc	16451deefe	MFamd64 Introduce pmap locking to many of the pmap functions.	2004-06-16 07:03:15 +00:00
alc	91c65a303c	- Remove an unused declaration. - Move a definition inside the scope of a #ifdef _KERNEL.	2004-06-13 03:44:11 +00:00
jhb	ee8370535a	- Use the correct devclass name ("acpi" vs "ACPI") to detect if acpi0 is present and thus that the PnPBIOS probe should be skipped instead of having ACPI zero out the PnPBIOStable pointer. - Make the PnPBIOStable pointer static to i386/i386/bios.c now that that is the only place it is used.	2004-06-10 20:43:04 +00:00
jhb	f7c8770deb	Remove atdevbase and replace it's remaining uses with direct references to KERNBASE instead.	2004-06-10 20:31:00 +00:00
phk	89408985ea	Add new bios_string() which will hunt for a string inside a given range of the BIOS. This can be used for finding arbitrary magic in the BIOS in order to recognize particular platforms.	2004-06-03 22:36:24 +00:00
phk	cd2173882e	The NatSemi (now AMD) Geode SC1100 needs special treatment here and there because it is an embedded gadget. Give it it's own value for the "cpu" variable and add code to reset it lacking a keyboard controller.	2004-06-03 21:14:53 +00:00
phk	cfb99155bf	struct cpu_nameclass is a private to identcpu.c, move it there.	2004-05-30 15:16:07 +00:00
bde	36151be4c9	Moved i386 asms to an i386 header. The asms are for calibration of high resolution kernel profiling (options GUPROF. "U" in GUPROF stands for microseconds resolution, but the resolution is now smaller than 1 nanosecond on multi-GHz machines and the accuracy is heading towards 1 nanosecond too). Arches that support GUPROF must now provide certain macros for the calibration. GUPROF is now only supported for i386's, so the absence of the new macros for other arches doesn't break anything that wasn't already broken. amd64's have uncommitted support for GUPROF, and sparc64's have support that seems to be complete except here (there was an #error for non-i386 cases; now there are undefined macros). Changed the asms a little: - declare them as __volatile. They must not be moved, and exporting a label across asms is technically incorrect, so try harder to stop gcc moving them. - don't put the non-clobbered register "bx" in the clobber list. The clobber lists are still more conservative than necessary. - drop the non-support for gcc-1. It just gave a better error message, and this is not useful since compiling with gcc-1 would cause thousands of worse error messages. - drop the support for aout.	2004-05-20 16:12:19 +00:00
bde	1808931bc7	Fixed some style bugs (mainly misalignment of backslashes).	2004-05-19 16:04:26 +00:00
bde	f309a749f5	Moved most of the "MI" definitions and declarations from <machine/profile.h> to <sys/gmon.h>. Cleaned them up a little by not attempting to ifdef for incomplete and out of date support for GUPROF in userland, as in the sparc64 version.	2004-05-19 15:41:26 +00:00
stefanf	e784f59f15	<stdint.h> should define WINT_M{AX,IN} independent from whether WCHAR_MIN is defined. Otherwise first including <wchar.h> and then <stdint.h> leads to no WINT_M{AX,IN} at all. PR: 64956 Approved by: das (mentor)	2004-05-18 16:04:57 +00:00
jhb	3564a3448d	Rework the APIC mixed mode support a bit: - Require the APIC enumerators to explicitly enable mixed mode by calling ioapic_enable_mixed_mode(). Calling this function tells the apic driver that the PC-AT 8259A PICs are present and routable through the first I/O APIC via an ExtINT pin. The mptable enumerator always calls this function for now. The MADT enumerator only enables mixed mode if the PC-AT compatability flag is set in the MADT header. - Allow mixed mode to be enabled or disabled via a 'hw.apic.mixed_mode' tunable. By default this tunable is set to 1 (true). The kernel option NO_MIXED_MODE changes the default to 0 to preserve existing behavior, but adding 'hw.apic.mixed_mode=0' to loader.conf achieves the same effect. - Only use mixed mode to route IRQ 0 if it is both enabled by the APIC enumerator and activated by the loader tunable. Note that both conditions must be true, so if the APIC enumerator does not enable mixed mode, then you can't set the tunable to try to override the enumerator.	2004-05-10 18:49:58 +00:00
njl	f6bbab06a9	Add an MI implementation of the ACPI global lock routines and retire the individual asm versions. The global lock is shared between the BIOS and OS and thus cannot use our mutexes. It is defined in section 5.2.9.1 of the ACPI specification. Reviewed by: marcel, bde, jhb	2004-05-05 20:04:14 +00:00
jhb	e7ea4d6595	- Create a pir0 psuedo device as a child of legacy0 if we attach a legacy host-PCI bridge device and find a valid $PIR. - Make pci_pir_parse() private to pci_pir.c and have pir0's attach routine call it instead of having legacy_pcib_attach() call it. - Implement suspend/resume support for the $PIR by giving pir0 a resume method that calls the BIOS to reroute each link that was already routed before the machine was suspended. - Dump the state of the routed flag in the links display code. - If a link's IRQ is set by a tunable, then force that link to be re-routed the first time it is used. - Move the 'Found $PIR' message under bootverbose as the pir0 description line lists the number of entries already. The pir0 line also only shows up if we are actually using the $PIR which is a bonus. - Use BUS_CONFIG_INTR() to ensure that any IRQs used by a PCI link are set to level/low trigger/polarity.	2004-05-04 21:17:52 +00:00
jhb	415131cd6d	- Add a new pic method pic_config_intr() to set the trigger mode and polarity for a specified IRQ. The intr_config_intr() function wraps this pic method hiding the IRQ to interrupt source lookup. - Add a config_intr() method to the atpic(4) driver that reconfigures the interrupt using the ELCR if possible and returns an error otherwise. - Add a config_intr() method to the apic(4) driver that just logs any requests that would change the existing programming under bootverbose. Currently, the only changes the apic(4) driver receives are due to bugs in the acpi(4) driver and its handling of link devices, hence the reason for such requests currently being ignored. - Have the nexus(4) driver on i386 implement the bus_config_intr() function by calling intr_config_intr().	2004-05-04 21:02:56 +00:00
jhb	e077495344	- Change the APIC code to mostly use the recently added intr_trigger and intr_polarity enums for passing around interrupt trigger modes and polarity rather than using the magic numbers 0 for level/low and 1 for edge/high. - Convert the mptable parsing code to use the new ELCR wrapper code rather than reading the ELCR directly. Also, use the ELCR settings to control both the trigger and polarity of EISA IRQs instead of just the trigger mode. - Rework the MADT's handling of the ACPI SCI again: - If no override entry for the SCI exists at all, use level/low trigger instead of the default edge/high used for ISA IRQs. - For the ACPI SCI, use level/low values for conforming trigger and polarity rather than the edge/high values we use for all other ISA IRQs. - Rework the tunables available to override the MADT. The hw.acpi.force_sci_lo tunable is no longer supported. Instead, there are now two tunables that can independently override the trigger mode and/or polarity of the SCI. The hw.acpi.sci.trigger tunable can be set to either "edge" or "level", and the hw.acpi.sci.polarity tunable can be set to either "high" or "low". To simulate hw.acpi.force_sci_lo, set hw.acpi.sci.trigger to "level" and hw.acpi.sci.polarity to "low". If you are having problems with ACPI either causing an interrupt storm or not working at all (e.g., the power button doesn't turn invoke a shutdown -p now), you can try tweaking these two tunables to find the combination that works.	2004-05-04 20:39:24 +00:00
jhb	9f40cdcc06	Add a simple mini-driver for the ELCR register. Originally, the ELCR register controlled the trigger mode and polarity of EISA interrupts. However, it appears that most (all?) PCI systems use the ELCR to manage the trigger mode and polarity of ISA interrupts as well since ISA IRQs used to route PCI interrupts need to be level triggered with active low polarity. We check to see if the ELCR exists by sanity checking the value we get back ensuring that IRQS 0 (8254), 1 (atkbd), 2 (the link from the slave PIC), and 8 (RTC) are all clear indicating edge trigger and active high polarity. This mini-driver will be used by the atpic driver to manage the trigger and polarity of ISA IRQs. Also, the mptable parsing code will use this mini driver rather than examining the ELCR directly.	2004-05-04 20:07:46 +00:00
nyan	87d450b90b	Increase BUS_SPACE_IAT_MAXSIZE to 33 for the ed driver.	2004-05-04 06:38:13 +00:00

1 2 3 4 5 ...

1634 Commits