freebsd-nq

Author	SHA1	Message	Date
Bruce Evans	71799af2d5	Cleaned up declaration and initialization of clock_lock. It is only used by clock code, so don't export it to the world for machdep.c to initialize. There is a minor problem initializing it before it is used, since although clock initialization is split up so that parts of it can be done early, the first part was never done early enough to actually work. Split it up a bit more and do the first part as late as possible to document the necessary order. The functions that implement the split are still bogusly exported. Cleaned up initialization of the i8254 clock hardware using the new split. Actually initialize it early enough, and don't work around it not being initialized in DELAY() when DELAY() is called early for initialization of some console drivers. This unfortunately moves a little more code before the early debugger breakpoint so that it is harder to debug. The ordering of console and related initialization is delicate because we want to do as little as possible before the breakpoint, but must initialize a console.	2007-01-23 08:01:20 +00:00
John Baldwin	5fe82bca57	Expand the MSI/MSI-X API to address some deficiencies in the MSI-X support. - First off, device drivers really do need to know if they are allocating MSI or MSI-X messages. MSI requires allocating powerof2() messages for example where MSI-X does not. To address this, split out the MSI-X support from pci_msi_count() and pci_alloc_msi() into new driver-visible functions pci_msix_count() and pci_alloc_msix(). As a result, pci_msi_count() now just returns a count of the max supported MSI messages for the device, and pci_alloc_msi() only tries to allocate MSI messages. To get a count of the max supported MSI-X messages, use pci_msix_count(). To allocate MSI-X messages, use pci_alloc_msix(). pci_release_msi() still handles both MSI and MSI-X messages, however. As a result of this change, drivers using the existing API will only use MSI messages and will no longer try to use MSI-X messages. - Because MSI-X allows for each message to have its own data and address values (and thus does not require all of the messages to have their MD vectors allocated as a group), some devices allow for "sparse" use of MSI-X message slots. For example, if a device supports 8 messages but the OS is only able to allocate 2 messages, the device may make the best use of 2 IRQs if it enables the messages at slots 1 and 4 rather than default of using the first N slots (or indicies) at 1 and 2. To support this, add a new pci_remap_msix() function that a driver may call after a successful pci_alloc_msix() (but before allocating any of the SYS_RES_IRQ resources) to allow the allocated IRQ resources to be assigned to different message indices. For example, from the earlier example, after pci_alloc_msix() returned a value of 2, the driver would call pci_remap_msix() passing in array of integers { 1, 4 } as the new message indices to use. The rid's for the SYS_RES_IRQ resources will always match the message indices. Thus, after the call to pci_remap_msix() the driver would be able to access the first message in slot 1 at SYS_RES_IRQ rid 1, and the second message at slot 4 at SYS_RES_IRQ rid 4. Note that the message slots/indices are 1-based rather than 0-based so that they will always correspond to the rid values (SYS_RES_IRQ rid 0 is reserved for the legacy INTx interrupt). To support this API, a new PCIB_REMAP_MSIX() method was added to the pcib interface to change the message index for a single IRQ. Tested by: scottl	2007-01-22 21:48:44 +00:00
Craig Rodrigues	5a09873361	Revert previous change. Requested by: kan	2007-01-18 05:46:32 +00:00
Craig Rodrigues	e76c6d8cd3	Forward declare __pcpu as a pointer type instead of an array type to eliminate GCC 4.1 error: "array type has incomplete element type".	2007-01-18 02:00:04 +00:00
Warner Losh	fed32d7544	Remove 3rd clause, renumber, ok per email	2007-01-12 07:26:21 +00:00
Jung-uk Kim	5efc6c44ff	Add SSSE3 extensions and correct CNXT-ID spelling for Intel processors.	2007-01-09 19:23:22 +00:00
Bruce Evans	f28e1c8f99	Fixed some style bugs (mainly assorted errors in comments, and inconsistent spelling of `result').	2006-12-29 15:29:49 +00:00
Bruce Evans	6c296ffa81	Fixed some style bugs (whitespace only).	2006-12-29 14:28:23 +00:00
Bruce Evans	7e4277e591	Try harder to garbage-collect the "LOCORE" (really asm) version of MPLOCKED. The cleaning in rev.1.25 was supposed to have been undone by rev.1.26, but 1.26 could never have actually affected asm files since atomic.h is full of C declarations so including it in asm files would just give syntax errors. The asm MPLOCKED is even less needed than when misplaced definitions of it were first removed, and is now unused in any asm file in the src tree except in anachronismns in sys/i386/i386/support.s.	2006-12-29 13:36:26 +00:00
Bruce Evans	276c702d8d	Removed gratuitous cosmetic differences with the i386 version. This mainly involves removing all __CC_SUPPORTS___INLINE__ ifdefs. These ifdefs are even less needed for amd64 than for i386, but the i386 atomic.h never had them. The ifdefs here were just an optimization of obsolescent compatibility cruft (__inline) for a null set of compilers. I think null sets of compilers should only be supported in cases where this is more than an optimization, doesn't require extensive ifdefs, and only involves not-so-obsolescent compatibility cruft (plain inline here).	2006-12-28 08:15:14 +00:00
Bruce Evans	26ab2d1d23	Avoid an instruction in atomic_cmpset_{int_long)() in most cases. These functions are used a lot for mutexes, so this reduces the text size of an average kernel by about 0.75%. This wasn't intended to be a significant optimization, but it somehow increased the maximum number of packets per second that can be transmitted by my bge hardware from 320000 to 460000 (this benchmark is CPU-bound and remarkably sensitive to changes in the text section). Details: we would prefer to leave the result of the cmpxchg in %al, but cannot tell gcc that it is there, so we have to convert it to an integer register. We converted to %al, then to %[re]ax, but the latter step is usually wasted since gcc usually only wants the condition code and can recover it from %al just as easily as from %[re]ax. Let gcc promote %al in the few cases where this is needed. Nearby style fixes; - let gcc manage the load of `res', and don't abuse `res' for a copy of `exp' - don't echo `res's name in comments - consistently spell the condition code as 'e' after comparison for equality - don't hard-code %al anywhere except in constraints - for the version that doesn't use cmpxchg, there is no requirement to use %al anywhere, so don't hard-code it in the constraints either. Style non-fix: - for the versions that use cmpxchg, keep using "a" (was %[re]ax, now %al) for the main output operand, although this is not required. The input and output operands that use the "a" constraint are now decoupled, and this makes things clearer except for the reason that the output register is hard-coded. It is now just a hack to tell gcc that the input "a" has been clobbered without increasing the number of operands.	2006-12-27 20:26:00 +00:00
Kip Macy	e5f8d4099d	Newer versions of gcc don't support treating structures passed by value as if they were really passed by reference. Specifically, the dead stores elimination pass in the GCC 4.1 optimiser breaks the non-compliant behavior on which FreeBSD relied. This change brings FreeBSD up to date by switching trap frames to being explicitly passed by reference. Reviewed by: kan Tested by: kan	2006-12-17 06:48:40 +00:00
John Baldwin	fde45e231a	Sort function prototypes.	2006-12-12 19:24:45 +00:00
Ruslan Ermilov	3cbc967ef7	Use a different bitmask for superpages' base address so that it doesn't conflict with the PG_PDE_PAT bit. (We still don't mask off all the reserved bits but that's okay for now.) Reviewed by: alc	2006-12-05 11:31:33 +00:00
Alan Cox	da44960498	The global variable avail_end is redundant and only used once. Eliminate it. Make avail_start static to the pmap on amd64. (It no longer exists on other architectures.)	2006-11-19 20:54:58 +00:00
John Baldwin	81efc3d94c	Add support for 8 byte hardware watches in long mode. Kernel hardware watches support 8 byte watches. For userland, we disallow 8 byte watches for 32-bit tasks.	2006-11-17 20:27:01 +00:00
John Baldwin	7693afca4e	- Add macro constants for the various fields in %dr7 and use them in place of various scattered magic values. - Pretty print the address of hardware watchpoints in 'show watch' rather than just displaying hex. - Expand address field width on amd64 for 64-bit pointers.	2006-11-17 19:20:32 +00:00
John Baldwin	71f4007710	Various whitespace and style fixes.	2006-11-15 19:53:48 +00:00
John Baldwin	4184900911	MD support for PCI Message Signalled Interrupts on amd64 and i386: - Add a new apic_alloc_vectors() method to the local APIC support code to allocate N contiguous IDT vectors (aligned on a M >= N boundary). This function is used to allocate IDT vectors for a group of MSI messages. - Add MSI and MSI-X PICs. The PIC code here provides methods to manage edge-triggered MSI messages as x86 interrupt sources. In addition to the PIC methods, msi.c also includes methods to allocate and release MSI and MSI-X messages. For x86, we allow for up to 128 different MSI IRQs starting at IRQ 256 (IRQs 0-15 are reserved for ISA IRQs, 16-254 for APIC PCI IRQs, and IRQ 255 is reserved). - Add pcib_(alloc\|release)_msi[x]() methods to the MD x86 PCI bridge drivers to bubble the request up to the nexus driver. - Add pcib_(alloc\|release)_msi[x]() methods to the x86 nexus drivers that ask the MSI PIC code to allocate resources and IDT vectors. MFC after: 2 months	2006-11-13 22:23:34 +00:00
Ruslan Ermilov	d77f5882e7	Fix NKPT comments to match reality. Note that the current value of NKPT is no longer enough to run amd64 with 16G of RAM, as it doesn't have space for mapping a kernel (16M kernel would require additionally 8 page tables).	2006-11-13 20:33:54 +00:00
Ruslan Ermilov	26af9ac7d0	Fix a comment.	2006-11-13 06:26:57 +00:00
Bruce Evans	91b4d1bfc2	In the userland .mcount(): - Don't use a frame pointer. Our callers need a frame pointer, but we could only use one to support things that aren't supported. (These things are: - profiling of profiling - debugging of profiling. The core ENTRY() macro doesn't support forcing a frame pointer for debugging, so don't do more here.) - Ensure that we are in the text section and have normal alignment. - Use the normal syntax for `.type'.	2006-10-28 13:12:06 +00:00
Bruce Evans	43f0ea0a27	i386/include/profile.h: Fixed a syntax error for the (!__KERNEL && !__GNUCLIKE_ASM) case in rev.1.36. Apparently, this case has never been reached even by lint. Submitted by: stefanf {amd64,i386}/include/profile.h: In case the above case is actually reached, break it properly by providing null support that will fail at link time instead of a stub that gives wrong (null) profiling at runtime.	2006-10-28 11:03:03 +00:00
Bruce Evans	853b92dacf	In MCOUNT_OVERHEAD(label), actually use the `label' parameter. We were still using the global label named "profil", and this worked accidentally because all callers use the same name.	2006-10-28 07:59:11 +00:00
Bruce Evans	94450a83e8	Removed all traces of HIDENAME() in amd64 and i386 kernel code. Using this used to be slightly cleaner than using ifdefs in a few places to support both a.out and elf, but using it now just causes messes and unportabilities. It seems to be impossible to implement the elf HIDENAME() portably in cpp (since token pasting of "." and <name> is invalid). */prof_machdep.c: - Removed all uses of CNAME(). CNAME() is easy enough to use in pure asm code, but using it in inline asm requires messy quoting. The core pure asm code has been hacked on more and all uses of CNAME() in it have already gone away. Just assume the elf convention here too. - Removed now-uneeded include of <machine/asmacros.h>. - Removed the workaround for a namespace conflict with this include.	2006-10-28 06:04:29 +00:00
Bruce Evans	447647908c	Don't call mexitcount or provide a stub mexitcount to call when profiling is configured but high resolution profiling is not configured. Only functions in *.[Ss] called the stub, so efficiency was not significantly affected.	2006-10-27 14:17:50 +00:00
John Baldwin	520ffff83e	Change the x86 interrupt code to suspend/resume interrupt controllers (PICs) rather than interrupt sources. This allows interrupt controllers with no interrupt pics (such as the 8259As when APIC is in use) to participate in suspend/resume. - Always register the 8259A PICs even if we don't use any of their pins. - Explicitly reset the 8259As on resume on amd64 if 'device atpic' isn't included. - Add a "dummy" PIC for the local APIC on the BSP to reset the local APIC on resume. This gets suspend/resume working with APIC on UP systems. SMP still needs more work to bring the APs back to life. The MFC after is tentative. Tested by: anholt (i386) Submitted by: Andrea Bittau <a.bittau at cs.ucl.ac.uk> (3) MFC after: 1 week	2006-10-10 23:23:12 +00:00
John Baldwin	6e20fe33ba	Oops, fix sign bug in #ifdef for value of INTRCNT_COUNT. PR: kern/99870 Submitted by: jkim MFC after: 3 days	2006-10-10 19:26:35 +00:00
John Birrell	6825d60738	PR: Submitted by: Reviewed by: Approved by: Obtained from: MFC after: Security: Move the relocation definitions to the common elf header so that DTrace can use them on one architecture targeted to a different one. Add the additional ELF types defines in Sun's "Linker and Libraries" manual.	2006-10-04 21:37:10 +00:00
Poul-Henning Kamp	f645b0b51c	First part of a little cleanup in the calendar/timezone/RTC handling. Move relevant variables to <sys/clock.h> and fix #includes as necessary. Use libkern's much more time- & spamce-efficient BCD routines.	2006-10-02 12:59:59 +00:00
Alexander Kabaev	d9cb97ff9d	Use __builtin_va_start instead of __builtin_stdarg_start. GCC4 obsoletes the former and __builtin_va_start was present in all GCC version 3.1 and later.	2006-09-21 01:37:02 +00:00
John Baldwin	7e9f73f3ed	First pass at allowing memory to be mapped using cache modes other than WB (write-back) on x86 via control bits in PTEs and PDEs (including making use of the PAT MSR). Changes include: - A new pmap_mapdev_attr() function for amd64 and i386 which takes an additional parameter (relative to pmap_mapdev()) specifying the cache mode for this mapping. Note that on amd64 only WB mappings are done with the direct map, all other modes result in a private mapping. - pmap_mapdev() on i386 and amd64 now defaults to using UC (uncached) mappings rather than WB. Previously we relied on the BIOS setting up MTRR's to enforce memio regions being treated as UC. This might make hw.cbb_start_memory unnecessary in some cases now for example. - A new pmap_mapbios()/pmap_unmapbios() API has been added to allow places that used pmap_mapdev() to map non-device memory (such as ACPI tables) to do so using WB as before. - A new pmap_change_attr() function for amd64 and i386 that changes the caching mode for a range of KVA. Reviewed by: alc	2006-08-11 19:22:57 +00:00
Alan Cox	f8883c0160	Define the additional page fault error codes that are implemented by amd64.	2006-08-02 16:24:23 +00:00
Jung-uk Kim	0758eaa227	Sync specialreg.h changes between amd64 and i386 with few fixes.	2006-07-13 16:09:40 +00:00
Jung-uk Kim	444576c0c4	Add two new CPUID bits for AMD CPUs, i. e., SVM and extended APIC register.	2006-07-12 06:04:12 +00:00
David Xu	4d70df3fee	MFi386: Use the method described in IA-32 Intel Architecture Software Developer's Manual chapter 11.6.6 to get valid mxcsr bits, use the mxcsr mask to clear invalid bits passed by user code.	2006-06-19 22:36:01 +00:00
Maxim Sobolev	aa1807d5d6	Move clock_lock prototype into <machine/clock.h>, where it is more appropriate. Discussed with: jhb	2006-05-19 18:53:50 +00:00
Poul-Henning Kamp	5405ab4889	Clean out sysctl machdep.* related defines. The cmos clock related stuff should really be in MI code.	2006-05-11 17:29:25 +00:00
John Baldwin	2b8a339c7e	Add various constants for the PAT MSR and the PAT PTE and PDE flags. Initialize the PAT MSR during boot to map PAT type 2 to Write-Combining (WC) instead of Uncached (UC-). MFC after: 1 month	2006-05-01 22:07:00 +00:00
John Baldwin	4ac60df584	Add a new 'pmap_invalidate_cache()' to flush the CPU caches via the wbinvd() instruction. This includes a new IPI so that all CPU caches on all CPUs are flushed for the SMP case. MFC after: 1 month	2006-05-01 21:36:47 +00:00
Peter Wemm	c0345a84aa	Introduce minidumps. Full physical memory crash dumps are still available via the debug.minidump sysctl and tunable. Traditional dumps store all physical memory. This was once a good thing when machines had a maximum of 64M of ram and 1GB of kvm. These days, machines often have many gigabytes of ram and a smaller amount of kvm. libkvm+kgdb don't have a way to access physical ram that is not mapped into kvm at the time of the crash dump, so the extra ram being dumped is mostly wasted. Minidumps invert the process. Instead of dumping physical memory in in order to guarantee that all of kvm's backing is dumped, minidumps instead dump only memory that is actively mapped into kvm. amd64 has a direct map region that things like UMA use. Obviously we cannot dump all of the direct map region because that is effectively an old style all-physical-memory dump. Instead, introduce a bitmap and two helper routines (dump_add_page(pa) and dump_drop_page(pa)) that allow certain critical direct map pages to be included in the dump. uma_machdep.c's allocator is the intended consumer. Dumps are a custom format. At the very beginning of the file is a header, then a copy of the message buffer, then the bitmap of pages present in the dump, then the final level of the kvm page table trees (2MB mappings are expanded into a 4K page mappings), then the sparse physical pages according to the bitmap. libkvm can now conveniently access the kvm page table entries. Booting my test 8GB machine, forcing it into ddb and forcing a dump leads to a 48MB minidump. While this is a best case, I expect minidumps to be in the 100MB-500MB range. Obviously, never larger than physical memory of course. minidumps are on by default. It would want be necessary to turn them off if it was necessary to debug corrupt kernel page table management as that would mess up minidumps as well. Both minidumps and regular dumps are supported on the same machine.	2006-04-21 04:24:50 +00:00
Marcel Moolenaar	b1fb1bb19a	Sync with i386: Map exceptions to signals in gdb_cpu_signal() so that kgdb(1) gets a SIGTRAP when it needs to. Pointed out by: grehan@	2006-04-04 03:00:20 +00:00
Marcel Moolenaar	470d831703	The PC is register 16, not 18. Pointed out by: grehan@	2006-04-04 02:44:51 +00:00
Marcel Moolenaar	bfcdefd8aa	Eliminate HAVE_STOPPEDPCBS. On ia64 the PCPU holds a pointer to the PCB in which the context of stopped CPUs is stored. To access this PCB from KDB, we introduce a new define, called KDB_STOPPEDPCB. The definition, when present, lives in <machine/kdb.h> and abstracts where MD code saves the context. Define KDB_STOPPEDPCB on i386, amd64, alpha and sparc64 in accordance to previous code.	2006-04-03 22:51:47 +00:00
Peter Wemm	68ac481184	Shrink the amd64 pv entry from 48 bytes to about 24 bytes. On a machine with large mmap files mapped into many processes, this saves hundreds of megabytes of ram. pv entries were individually allocated and had two tailq entries and two pointers (or addresses). Each pv entry was linked to a vm_page_t and a process's address space (pmap). It had the virtual address and a pointer to the pmap. This change replaces the individual allocation with a per-process allocation system. A page ("pv chunk") is allocated and this provides 168 pv entries for that process. We can now eliminate one of the 16 byte tailq entries because we can simply iterate through the pv chunks to find all the pv entries for a process. We can eliminate one of the 8 byte pointers because the location of the pv entry implies the containing pv chunk, which has the pointer. After overheads from the pv chunk bitmap and tailq linkage, this works out that each pv entry has an effective size of 24.38 bytes. Future work still required, and other problems: * when running low on pv entries or system ram, we may need to defrag the chunk pages and free any spares. The stats (vm.pmap.) show that this doesn't seem to be that much of a problem, but it can be done if needed. running low on pv entries is now a much bigger problem. The old get_pv_entry() routine just needed to reclaim one other pv entry. Now, since they are per-process, we can only use pv entries that are assigned to our current process, or by stealing an entire page worth from another process. Under normal circumstances, the pmap_collect() code should be able to dislodge some pv entries from the current process. But if needed, it can still reclaim entire pv chunk pages from other processes. * This should port to i386 really easily, except there it would reduce pv entries from 24 bytes to about 12 bytes. (I have integrated Alan's recent changes.)	2006-04-03 21:36:01 +00:00
Peter Wemm	8d0593f54e	Merge/sync with i386: various cosmetic tweaks	2006-03-14 00:01:56 +00:00
Peter Wemm	cfa7ffb1d7	MFi386: The SIGFPE macros were moved to signal.h (FPE_INTOVF etc)	2006-03-14 00:01:22 +00:00
Sam Leffler	5225f08dc9	guard function decls with _KERNEL so user code can include this file	2006-03-01 05:59:56 +00:00
John Baldwin	215e7c161a	Rework how we wire up interrupt sources to CPUs: - Throw out all of the logical APIC ID stuff. The Intel docs are somewhat ambiguous, but it seems that the "flat" cluster model we are currently using is only supported on Pentium and P6 family CPUs. The other "hierarchy" cluster model that is supported on all Intel CPUs with local APICs is severely underdocumented. For example, it's not clear if the OS needs to glean the topology of the APIC hierarchy from somewhere (neither ACPI nor MP Table include it) and setup the logical clusters based on the physical hierarchy or not. Not only that, but on certain Intel chipsets, even though there were 4 CPUs in a logical cluster, all the interrupts were only sent to one CPU anyway. - We now bind interrupts to individual CPUs using physical addressing via the local APIC IDs. This code has also moved out of the ioapic PIC driver and into the common interrupt source code so that it can be shared with MSI interrupt sources since MSI is addressed to APICs the same way that I/O APIC pins are. - Interrupt source classes grow a new method pic_assign_cpu() to bind an interrupt source to a specific local APIC ID. - The SMP code now tells the interrupt code which CPUs are avaiable to handle interrupts in a simpler and more intuitive manner. For one thing, it means we could now choose to not route interrupts to HT cores if we wanted to (this code is currently in place in fact, but under an #if 0 for now). - For now we simply do static round-robin of IRQs to CPUs when the first interrupt handler just as before, with the change that IRQs are now bound to individual CPUs rather than groups of up to 4 CPUs. - Because the IRQ to CPU mapping has now been moved up a layer, it would be easier to manage this mapping from higher levels. For example, we could allow drivers to specify a CPU affinity map for their interrupts, or we could allow a userland tool to bind IRQs to specific CPUs. The MFC is tentative, but I want to see if this fixes problems some folks had with UP APIC kernels on 6.0 on SMP machines (an SMP kernel would work fine, but a UP APIC kernel (such as GENERIC in RELENG_6) would lose interrupts). MFC after: 1 week	2006-02-28 22:24:55 +00:00
Warner Losh	d5e61c97a6	By popular demand, move __HAVE_ACPI and __PCI_REROUTE_INTERRUPT into param.h. Per request, I've placed these just after the _NO_NAMESPACE_POLLUTION ifndef. I've not renamed anything yet, but may since we don't need the __. Submitted by: bde, jhb, scottl, many others.	2006-01-09 06:05:57 +00:00

1 2 3 4 5 ...

1238 Commits