freebsd-dev

Author	SHA1	Message	Date
Sam Leffler	5225f08dc9	guard function decls with _KERNEL so user code can include this file	2006-03-01 05:59:56 +00:00
John Baldwin	215e7c161a	Rework how we wire up interrupt sources to CPUs: - Throw out all of the logical APIC ID stuff. The Intel docs are somewhat ambiguous, but it seems that the "flat" cluster model we are currently using is only supported on Pentium and P6 family CPUs. The other "hierarchy" cluster model that is supported on all Intel CPUs with local APICs is severely underdocumented. For example, it's not clear if the OS needs to glean the topology of the APIC hierarchy from somewhere (neither ACPI nor MP Table include it) and setup the logical clusters based on the physical hierarchy or not. Not only that, but on certain Intel chipsets, even though there were 4 CPUs in a logical cluster, all the interrupts were only sent to one CPU anyway. - We now bind interrupts to individual CPUs using physical addressing via the local APIC IDs. This code has also moved out of the ioapic PIC driver and into the common interrupt source code so that it can be shared with MSI interrupt sources since MSI is addressed to APICs the same way that I/O APIC pins are. - Interrupt source classes grow a new method pic_assign_cpu() to bind an interrupt source to a specific local APIC ID. - The SMP code now tells the interrupt code which CPUs are avaiable to handle interrupts in a simpler and more intuitive manner. For one thing, it means we could now choose to not route interrupts to HT cores if we wanted to (this code is currently in place in fact, but under an #if 0 for now). - For now we simply do static round-robin of IRQs to CPUs when the first interrupt handler just as before, with the change that IRQs are now bound to individual CPUs rather than groups of up to 4 CPUs. - Because the IRQ to CPU mapping has now been moved up a layer, it would be easier to manage this mapping from higher levels. For example, we could allow drivers to specify a CPU affinity map for their interrupts, or we could allow a userland tool to bind IRQs to specific CPUs. The MFC is tentative, but I want to see if this fixes problems some folks had with UP APIC kernels on 6.0 on SMP machines (an SMP kernel would work fine, but a UP APIC kernel (such as GENERIC in RELENG_6) would lose interrupts). MFC after: 1 week	2006-02-28 22:24:55 +00:00
David Malone	0cbae93607	It seems bit 5 of cpu_feature2 is the VMX (Virtual Machine Extensions) bit. While I'm here, delete a comment that was cut and past from the cpu_features code that doesn't belong here.	2006-02-15 14:48:59 +00:00
Poul-Henning Kamp	e8444a7e6f	CPU time accounting speedup (step 2) Keep accounting time (in per-cpu) cputicks and the statistics counts in the thread and summarize into struct proc when at context switch. Don't reach across CPUs in calcru(). Add code to calibrate the top speed of cpu_tickrate() for variable cpu_tick hardware (like TSC on power managed machines). Don't enforce monotonicity (at least for now) in calcru. While the calibrated cpu_tickrate ramps up it may not be true. Use 27MHz counter on i386/Geode. Use TSC on amd64 & i386 if present. Use tick counter on sparc64	2006-02-11 09:33:07 +00:00
Poul-Henning Kamp	eb2da9a51f	Simplify system time accounting for profiling. Rename struct thread's td_sticks to td_pticks, we will need the other name for more appropriately named use shortly. Reduce it from uint64_t to u_int. Clear td_pticks whenever we enter the kernel instead of recording its value as reference for userret(). Use the absolute value of td->pticks in userret() and eliminate third argument.	2006-02-08 08:09:17 +00:00
Poul-Henning Kamp	5b1a8eb397	Modify the way we account for CPU time spent (step 1) Keep track of time spent by the cpu in various contexts in units of "cputicks" and scale to real-world microsec^H^H^H^H^H^H^H^Hclock_t only when somebody wants to inspect the numbers. For now "cputicks" are still derived from the current timecounter and therefore things should by definition remain sensible also on SMP machines. (The main reason for this first milestone commit is to verify that hypothesis.) On slower machines, the avoided multiplications to normalize timestams at every context switch, comes out as a 5-7% better score on the unixbench/context1 microbenchmark. On more modern hardware no change in performance is seen.	2006-02-07 21:22:02 +00:00
John Baldwin	8917b8d28c	- Always call exec_free_args() in kern_execve() instead of doing it in all the callers if the exec either succeeds or fails early. - Move the code to call exit1() if the exec fails after the vmspace is gone to the bottom of kern_execve() to cut down on some code duplication.	2006-02-06 22:06:54 +00:00
Wayne Salamon	4f9ac41fba	Call the audit syscall enter/exit functions for the amd64 architecture, both 32-bit and 64-bit paths. System calls will now be audited. Obtained from: TrustedBSD Project Approved by: rwatson (mentor)	2006-02-04 20:37:20 +00:00
David Xu	6d7c1bdccd	MFi386: Clear carry flag in get_mconetxt so that setcontext does not return a bogus error.	2006-02-03 02:49:14 +00:00
Peter Wemm	e2a5e4efdb	Make PV entries dynamic on amd64. i386 has a pre-reserved block of kva dedicated to storing pv entries, originally so that kva didn't have to be allocated at inconvenient times. For amd64, we can get the same effect by using the direct map area. Allocating pages is the same as with the object backed method, but now we can just lookup the page in the direct map area. Thus, no more pageable kva is reserved. This is the single largest consumer of kva on our work machines and this change should help conserve the fixed size 2GB pageable kva on the amd64 kernel. There are a pair of sysctl nodes introduced, named the same as their tunable counterparts. vm.pmap.shpgperproc and vm.pmap.pv_entry_max They work just like the tunables of the same path, except the values are linked. The pv entry cap is now dynamically changeable. I didn't make them totally unlimited because we need some sort of safety limit still. One could consume all physical memory without a cap.	2006-02-03 00:16:36 +00:00
John Baldwin	6966c33482	Call WITNESS_CHECK() in the page fault handler and immediately assume it is a fatal fault if we are holding any non-sleepable locks. This should cut down on the number of bogus LORs we currently get when the kernel panics due to a NULL (or bogus) pointer dereference that goes wandering off into the VM system which tries to acquire locks and then kicks off the spurious LORs. This should probably be ported to all the archs at some point. Tested on: i386	2006-01-27 22:22:10 +00:00
Scott Long	0af57729a6	Free the newtag if we exit with a failure from alloc_bounce_zone(). Found by: Coverity Prevent(tm)	2006-01-14 17:22:47 +00:00
David E. O'Brien	f8ed1e340d	Move linux support to the linux section.	2006-01-12 01:20:59 +00:00
Poul-Henning Kamp	d3e64681d6	Move the old BSD4.3 tty compatibility from (!BURN_BRIDGES && COMPAT_43) to COMPAT_43TTY. Add COMPAT_43TTY to NOTES and */conf/GENERIC Compile tty_compat.c only under the new option. Spit out #warning "Old BSD tty API used, please upgrade." if ioctl_compat.h gets #included from userland.	2006-01-10 09:19:10 +00:00
Warner Losh	d5e61c97a6	By popular demand, move __HAVE_ACPI and __PCI_REROUTE_INTERRUPT into param.h. Per request, I've placed these just after the _NO_NAMESPACE_POLLUTION ifndef. I've not renamed anything yet, but may since we don't need the __. Submitted by: bde, jhb, scottl, many others.	2006-01-09 06:05:57 +00:00
John Baldwin	04dda605c5	- Make pcib_devclass private to sys/dev/pci/pci_pci.c and change all the various pcib drivers to use their own private devclass_t variables for their modules. - Use the DEFINE_CLASS_0() macro to declare drivers for the various pcib drivers while I'm here.	2006-01-06 19:22:19 +00:00
John Baldwin	360c3c2d1a	Fix various places that were testing td_critnest to see if interrupts should remain disabled during a trap or not to check td_md.md_spinlock_count instead.	2006-01-06 18:02:12 +00:00
Jung-uk Kim	dccb7faff6	- Explicitly validate an empty filter to match bpf_filter() comment[1]. - Do not use BPF JIT compiler for an empty filter. [1] Pointed out by: darrenr	2006-01-03 20:26:03 +00:00
Warner Losh	501755f4f6	Define __HAVE_ACPI and/or __PCI_REROUTE_INTERRUPT, as appropriate for each platform. These will be used in the pci code in preference to the complicated #ifdefs we have there now.	2006-01-01 20:59:28 +00:00
Alexander Leidinger	e3d101c377	Unbreak kernel build. A happy new year to all. Submitted by: Goran Gajic <ggajic@afrodita.rcub.bg.ac.yu>, bz Pointy hat to: netchild Appologies to: all	2006-01-01 05:35:57 +00:00
Alexander Leidinger	ef39c05baa	MI changes: - provide an interface (macros) to the page coloring part of the VM system, this allows to try different coloring algorithms without the need to touch every file [1] - make the page queue tuning values readable: sysctl vm.stats.pagequeue - autotuning of the page coloring values based upon the cache size instead of options in the kernel config (disabling of the page coloring as a kernel option is still possible) MD changes: - detection of the cache size: only IA32 and AMD64 (untested) contains cache size detection code, every other arch just comes with a dummy function (this results in the use of default values like it was the case without the autotuning of the page coloring) - print some more info on Intel CPU's (like we do on AMD and Transmeta CPU's) Note to AMD owners (IA32 and AMD64): please run "sysctl vm.stats.pagequeue" and report if the cache* values are zero (= bug in the cache detection code) or not. Based upon work by: Chad David <davidc@acns.ab.ca> [1] Reviewed by: alc, arch (in 2004) Discussed with: alc, Chad David, arch (in 2004)	2005-12-31 14:39:20 +00:00
Pawel Jakub Dawidek	70665fda32	Fix watch address truncation. The address was truncated when it was passed to amd64_set_watch() as 'unsigned int' and 'unsigned int' is 32bit long on amd64. Even with that fix hardware watchpoint don't work for me on amd64, ie. when I set the watchpoint and write a byte there, nothing happens.	2005-12-27 23:23:47 +00:00
Maxim Sobolev	900b28f9f6	Remove kern.elf32.can_exec_dyn sysctl. Instead extend Brandinfo structure with flags bitfield and set BI_CAN_EXEC_DYN flag for all brands that usually allow executing elf dynamic binaries (aka shared libraries). When it is requested to execute ET_DYN elf image check if this flag is on after we know the elf brand allowing execution if so. PR: kern/87615 Submitted by: Marcin Koziej <creep@desk.pl>	2005-12-26 21:23:57 +00:00
Jeff Roberson	660002d398	- Improve the INKERNEL macro such that it can no longer give false positives. This fixes the stack(9) functionality. Submitted by: Antoine Brodin <antoine.brodin@laposte.net>	2005-12-23 21:33:55 +00:00
John Baldwin	b439e431bf	Tweak how the MD code calls the fooclock() methods some. Instead of passing a pointer to an opaque clockframe structure and requiring the MD code to supply CLKF_FOO() macros to extract needed values out of the opaque structure, just pass the needed values directly. In practice this means passing the pair (usermode, pc) to hardclock() and profclock() and passing the boolean (usermode) to hardclock_cpu() and hardclock_process(). Other details: - Axe clockframe and CLKF_FOO() macros on all architectures. Basically, all the archs were taking a trapframe and converting it into a clockframe one way or another. Now they can just extract the PC and usermode values directly out of the trapframe and pass it to fooclock(). - Renamed hardclock_process() to hardclock_cpu() as the latter is more accurate. - On Alpha, we now run profclock() at hz (profhz == hz) rather than at the slower stathz. - On Alpha, for the TurboLaser machines that don't have an 8254 timecounter, call hardclock() directly. This removes an extra conditional check from every clock interrupt on Alpha on the BSP. There is probably room for even further pruning here by changing Alpha to use the simplified timecounter we use on x86 with the lapic timer since we don't get interrupts from the 8254 on Alpha anyway. - On x86, clkintr() shouldn't ever be called now unless using_lapic_timer is false, so add a KASSERT() to that affect and remove a condition to slightly optimize the non-lapic case. - Change prototypeof arm_handler_execute() so that it's first arg is a trapframe pointer rather than a void pointer for clarity. - Use KCOUNT macro in profclock() to lookup the kernel profiling bucket. Tested on: alpha, amd64, arm, i386, ia64, sparc64 Reviewed by: bde (mostly)	2005-12-22 22:16:09 +00:00
John Baldwin	5b2119223e	Move the hostb driver out of the i386 and amd64 PCI code (where it was duplicated anyways) and into a single MI driver. Extend the driver a bit to implement the bus and PCI kobj interfaces such that other drivers can attach to it and transparently act as if their parent device is the PCI bus (for the most part).	2005-12-20 21:09:45 +00:00
Marcel Moolenaar	757686b115	Make our ELF64 type definitions match standards. In particular this means: o Remove Elf64_Quarter, o Redefine Elf64_Half to be 16-bit, o Redefine Elf64_Word to be 32-bit, o Add Elf64_Xword and Elf64_Sxword for 64-bit entities, o Use Elf_Size in MI code to abstract the difference between Elf32_Word and Elf64_Word. o Add Elf_Ssize as the signed counterpart of Elf_Size. MFC after: 2 weeks	2005-12-18 04:52:37 +00:00
Scott Long	0717619c5c	Don peril sensitive sunglasses and jack up the MAX_BPAGES limit to 8192 on amd64. If you're going to stuff >4GB into your box, reserving 32MB for bonce pages amounts to a rounding error in the overall scheme of things.	2005-12-16 05:57:18 +00:00
John Baldwin	410d857972	Remove linux_mib_destroy() (which I actually added in between 5.0 and 5.1) which existed to cleanup the linux_osname mutex. Now that MTX_SYSINIT() has grown a SYSUNINIT to destroy mutexes on unload, the extra destroy here was redundant and resulted in panics in debug kernels. MFC after: 1 week Reported by: Goran Gajic ggajic at afrodita dot rcub dot bg dot ac dot yu	2005-12-15 16:30:41 +00:00
John Baldwin	05ee80c796	Fix stale comment.	2005-12-14 21:47:02 +00:00
John Baldwin	e83f6bcb75	Revert previous commit. The BIOS braindamage is even worse than I originally thought. The BIOS that cleared CPUID_APIC actually managed to disable the local APIC entirely and even Windows 64 doesn't boot on it. Reported by: bz	2005-12-13 18:29:10 +00:00
John Baldwin	15b7edbeaa	Don't check the CPUID_APIC bit in the cpu_features flags field to determine if the boot CPU has a local APIC because some BIOS vendors are not competent enough to set this bit. Instead, just assume that we always have a local APIC on amd64. For i386 the check is a bit more subtle. FreeBSD requires either an MP Table or an ACPI MADT table to enumerate APICs. The only systems that have one of those tables that don't have local APICs are some presumably rare (and old) SMP 486 systems using external APICs. Thus, instead of checking the CPUID_APIC flag, check the CPU class and abort if we are running on a 486. MFC after: 1 week Reported by: bz	2005-12-13 15:09:40 +00:00
Peter Wemm	6bcdd71391	For the amd64 platform, we can depend on the TSC being present. This patch changes DELAY to use the TSC once it has been calibrated. This does NOT use the TSC for long-term timekeeping. It only uses it to bound the DELAY() spinloop. This should not be affected by the Athlon64 X2 TSC quirks because the cpu is not halted while we use DELAY().	2005-12-12 22:27:07 +00:00
David Xu	992ee51fc0	Sync with i386, fix compiling for non-SMP.	2005-12-09 13:30:34 +00:00
John Baldwin	333b8de537	MFi386: - Move PUSH_FRAME and POP_FRAME to asmacros.h and use PUSH_FRAME in atpic entry points. - Move PCPU_* asm macros out of the middle of the asm profiling macros. - Pass IRQ vector argument as an int rather than void * to reduce diffs with i386. - EOI the lapic in C for the lapic timer handler. - GC unused Xcpuast function. - Split IPI_STOP handling code of ipi_nmi_handler() out into a cpustop_handler() function and call it from Xcpustop rather than duplicating all the logic in assembly. - Fixup the list of symbols with interrupt frames in ddb traces. Xatpic_fastintr* have never existed on amd64, and the lapic timer handler and various IPI handlers were missing. - Use trapframe instead of intrframe for interrupt entry points (on amd64 the interrupt vector was already a separate argument, so the two frames were already identical) and GC intrframe. Submitted by: peter (3)	2005-12-08 18:33:30 +00:00
Peter Wemm	79880f7327	Catch up to the system siginfo changes. Use a union for the ia32 layout of siginfo just like the system one. There are now two fields to copy instead of one.	2005-12-06 23:06:29 +00:00
John Baldwin	696effb697	- Cleanup whitespace and extra ()s in vtophys() macros. - Move vtophys() macros next to vtopte() where vtopte() exists to match comments above vtopte(). - Remove references to the alternate address space in the comment above vtopte(). amd64 never had the alternate address space, and i386 lost it prior to PAE support being added. - s/entires/entries/ in comments. Reviewed by: alc	2005-12-06 21:09:01 +00:00
Jung-uk Kim	50c9fad9ce	Fix ZERO_EDX() macro from the previous commit. It was emitting `xor %ecx, %ecx', not `xor %edx, %edx'.	2005-12-06 20:11:07 +00:00
Ruslan Ermilov	224d140293	Drop _MACHINE_ARCH and _MACHINE defines (not to be confused with MACHINE_ARCH and MACHINE). Their purpose was to be able to test in cpp(1), but cpp(1) only understands integer type expressions. Using such unsupported expressions introduced a number of subtle bugs, which were discovered by compiling with -Wundef.	2005-12-06 13:27:21 +00:00
Jung-uk Kim	6a96c4832f	s/M_WAITOK/M_NOWAIT/ while mutex is held. Pointed out by: csjp	2005-12-06 07:22:01 +00:00
Jung-uk Kim	23a8fc28c2	- Micro-optimize `mov $0, %edx' ->` xor %edx, %edx'. - Correct amd64 macro style (no functional change).	2005-12-06 06:45:39 +00:00
Jung-uk Kim	ae275efcae	Add experimental BPF Just-In-Time compiler for amd64 and i386. Use the following kernel configuration option to enable: options BPF_JITTER If you want to use bpf_filter() instead (e. g., debugging), do: sysctl net.bpf.jitter.enable=0 to turn it off. Currently BIOCSETWF and bpf_mtap2() are unsupported, and bpf_mtap() is partially supported because 1) no need, 2) avoid expensive m_copydata(9). Obtained from: WinPcap 3.1 (for i386)	2005-12-06 02:58:12 +00:00
John Baldwin	5ae84c09e7	Really slam the door on mixed mode now that we don't depend on it for a working IRQ0 with APIC anymore. Previously, it was possible to have some other ATPIC IRQS "leak" through in a few edge cases. For example, on my x86 test machine, ACPI re-routes the SCI (IRQ 9) to intpin 13 on the first I/O APIC. This leaves a hole for IRQ 13 (since the APIC doesn't provide a source for IRQ 13 in that case) with the result that the ATPIC IRQ13 source was registered instead. This changes the 8259A drivers to only register their interrupt sources if none of the 16 ISA IRQs have an interrupt source already installed. MFC after: 1 week	2005-12-05 22:09:30 +00:00
Eric Anholt	69b9fffc84	Merge DRM CVS as of 2005-12-02, adding i915 DRM support thanks to Alexey Popov, and a new r300 PCI ID.	2005-12-03 01:23:50 +00:00
Eric Anholt	9fb0767374	Update DRM to CVS snapshot as of 2005-11-28. Notable changes: - S3 Savage driver ported. - Added support for ATI_fragment_shader registers for r200. - Improved r300 support, needed for latest r300 DRI driver. - (possibly) r300 PCIE support, needs X.Org server from CVS. - Added support for PCI Matrox cards. - Software fallbacks fixed for Rage 128, which used to render badly or hang. - Some issues reported by WITNESS are fixed. - i915 module Makefile added, as the driver may now be working, but is untested. - Added scripts for copying and preprocessing DRM CVS for inclusion in the kernel. Thanks to Daniel Stone for getting me started on that.	2005-11-28 23:13:57 +00:00
John Baldwin	d6ef938e56	If we get a stray interrupt, return after logging it. In the extremely rare case of a stray interrupt to an unregistered source (such as a stray interrupt from the 8259As when using APIC), this could result in a page fault when it tried to walk the list of interrupt handlers to execute INTR_FAST handlers. This bug was introduced with the intr_event changes, so it's not present in 5.x or 6.x. Submitted by: Mark Tinguely tinguely at casselton dot net	2005-11-28 20:18:43 +00:00
Ruslan Ermilov	6646524f34	- Allow duplicate "machine" directives with the same arguments. - Move existing "machine" directives to DEFAULTS.	2005-11-27 23:17:00 +00:00
Lukas Ertl	ae5a74ec72	Fix typo.	2005-11-24 15:28:32 +00:00
Ruslan Ermilov	1a581012df	Add missing "struct" in i386/i386/machdep.c,v 1.497 by deischen@.	2005-11-24 08:16:18 +00:00
John Baldwin	7417e80b4e	Don't enable PUC_FASTINTR by default in the source. Instead, enable it via the DEFAULTS kernel configs. This allows folks to turn it that option off in the kernel configs if desired without having to hack the source. This is especially useful since PUC_FASTINTR hangs the kernel boot on my ultra60 which has two uart(4) devices hung off of a puc(4) device. I did not enable PUC_FASTINTR by default on powerpc since powerpc does not currently allow sharing of INTR_FAST with non-INTR_FAST like the other archs.	2005-11-21 20:22:35 +00:00

1 2 3 4 5 ...

4537 Commits