freebsd-skq

Author	SHA1	Message	Date
Konstantin Belousov	6259969d36	Implement read_default_ldt in linux_modify_ldt(). It copies out zeroed descriptor, like real Linux does. Tested by: Yuriy Tsibizov <yuriy.tsibizov at gmail com> Submitted by: rdivacky MFC after: 1 week	2007-11-26 11:06:19 +00:00
Joseph Koshy	4c8e514bdc	MFP4: Add assembly language symbols used by hwpmc(4)'s callchain capture.	2007-11-23 03:03:30 +00:00
Scott Long	8611774e5e	Extend critical section coverage in the low-level interrupt handlers to include the ithread scheduling step. Without this, a preemption might occur in between the interrupt getting masked and the ithread getting scheduled. Since the interrupt handler runs in the context of curthread, the scheudler might see it as having a such a low priority on a busy system that it doesn't get to run for a _long_ time, leaving the interrupt stranded in a disabled state. The only way that the preemption can happen is by a fast/filter handler triggering a schduling event earlier in the handler, so this problem can only happen for cases where an interrupt is being shared by both a fast/filter handler and an ithread handler. Unfortunately, it seems to be common for this sharing to happen with network and USB devices, for example. This fixes many of the mysterious TCP session timeouts and NIC watchdogs that were being reported. Many thanks to Sam Lefler for getting to the bottom of this problem. Reviewed by: jhb, jeff, silby	2007-11-21 04:03:51 +00:00
Alan Cox	59677d3c0e	Prevent the leakage of wired pages in the following circumstances: First, a file is mmap(2)ed and then mlock(2)ed. Later, it is truncated. Under "normal" circumstances, i.e., when the file is not mlock(2)ed, the pages beyond the EOF are unmapped and freed. However, when the file is mlock(2)ed, the pages beyond the EOF are unmapped but not freed because they have a non-zero wire count. This can be a mistake. Specifically, it is a mistake if the sole reason why the pages are wired is because of wired, managed mappings. Previously, unmapping the pages destroys these wired, managed mappings, but does not reduce the pages' wire count. Consequently, when the file is unmapped, the pages are not unwired because the wired mapping has been destroyed. Moreover, when the vm object is finally destroyed, the pages are leaked because they are still wired. The fix is to reduce the pages' wired count by the number of wired, managed mappings destroyed. To do this, I introduce a new pmap function pmap_page_wired_mappings() that returns the number of managed mappings to the given physical page that are wired, and I use this function in vm_object_page_remove(). Reviewed by: tegge MFC after: 6 weeks	2007-11-17 22:52:29 +00:00
Marcel Moolenaar	0c3967e7fe	o Rename cpu_thread_setup() to cpu_thread_alloc() to better communicate that it relates to (is called by) thread_alloc() o Add cpu_thread_free() which is called from thread_free() to counter-act cpu_thread_alloc(). i386: Have cpu_thread_free() call cpu_thread_clean() to preserve behaviour. ia64: Have cpu_thread_free() call mtx_destroy() for the mutex initialized in cpu_thread_alloc(). PR: ia64/118024	2007-11-14 20:21:54 +00:00
Julian Elischer	e01eafef2a	A bunch more files that should probably print out a thread name instead of a process name.	2007-11-14 06:51:33 +00:00
Julian Elischer	431f890614	generally we are interested in what thread did something as opposed to what process. Since threads by default have teh name of the process unless over-written with more useful information, just print the thread name instead.	2007-11-14 06:21:24 +00:00
Julian Elischer	502e39a873	Apply the same sort of locking done in sys/dev/acpica/acpi.c rev 1.196 a while ago: Grab Giant around calls to DEVICE_SUSPEND/RESUME in acpi_SetSleepState(). If we are resuming non-MPSAFE drivers, they need Giant held for them. This may fix some obscure suspend/resume problems. It has fixed keyrate setting problems that were triggered by cardbus (MPSAFE) changing the ordering for syscons resume (non-MPSAFE). Also, add some asserts that Giant is held in our suspend/resume and shutdown methods. Submitted by: Marko Zec	2007-11-14 05:43:55 +00:00
Peter Wemm	6dd3a6c06e	Drastically simplify the i386 pcpu backend by merging parts of the amd64 mechanism over. Instead of page table hackery that isn't actually needed, just use 'struct pcpu __pcpu[MAXCPU]' for backing like all the other platforms do. Get rid of 'struct privatespace' and a while mess of #ifdef SMP garbage that set it up. As a bonus, this returns the 4MB of KVA that we stole to implement it the old way. This also allows you to read the pcpu data for each cpu when reading a minidump. Background information: Originally, pcpu stuff was implemented as having per-cpu page tables and magic to make different data structures appear at the same actual address. In order to share page tables, we switched to using the GDT and %fs/%gs to access it. But we still did the evil magic to set it up for the old way. The "idle stacks" are not used for the idle process anymore and are just used for a few functions during bootup, then ignored. (excercise for reader: free these afterwards).	2007-11-13 23:00:24 +00:00
Benjamin Close	037347714a	Link wpi(4) into the build. This includes: o mtree (for legal/intel_wpi) o manpage for i386/amd64 archs o module for i386/amd64 archs o NOTES for i386/amd64 archs Approved by: mlaier (comentor)	2007-11-08 22:09:37 +00:00
Alan Cox	605385f843	Add comments explaining why all stores updating a non-kernel page table must be globally performed before calling any of the TLB invalidation functions. With one exception, on amd64, this requirement was already met. Fix this one case. Also, as a clarification, change an existing atomic op into a release. (Suggested by: jhb) Reported and reviewed by: ups MFC after: 3 days	2007-11-05 18:13:34 +00:00
Konstantin Belousov	89b57fcf01	Fix for the panic("vm_thread_new: kstack allocation failed") and silent NULL pointer dereference in the i386 and sparc64 pmap_pinit() when the kmem_alloc_nofault() failed to allocate address space. Both functions now return error instead of panicing or dereferencing NULL. As consequence, vmspace_exec() and vmspace_unshare() returns the errno int. struct vmspace arg was added to vm_forkproc() to avoid dealing with failed allocation when most of the fork1() job is already done. The kernel stack for the thread is now set up in the thread_alloc(), that itself may return NULL. Also, allocation of the first process thread is performed in the fork1() to properly deal with stack allocation failure. proc_linkup() is separated into proc_linkup() called from fork1(), and proc_linkup0(), that is used to set up the kernel process (was known as swapper). In collaboration with: Peter Holm Reviewed by: jhb	2007-11-05 11:36:16 +00:00
Sam Leffler	a3393da7bf	fix build: when usb was enabled wireless drivers were brought in so remove the nodevice lines that elided wlan support	2007-11-03 19:26:49 +00:00
Andrew Thompson	c596337ff8	Remove zyd as wireless is not supported on PAE.	2007-11-03 07:11:07 +00:00
Alan Cox	6afd4b92f7	Eliminate spurious "Approaching the limit on PV entries, ..." warnings. Specifically, whenever vm_page_alloc(9) returned NULL to get_pv_entry(), we issued a warning regardless of the number of pv entries in use. (Note: The older pv entry allocator in RELENG_6 does not have this problem.) Reported by: Jeremy Chadwick Eliminate the direct call to pagedaemon_wakeup() by get_pv_entry(). This was a holdover from earlier times when the page daemon was responsible for the reclamation of pv entries. MFC after: 5 days	2007-11-03 05:15:26 +00:00
Peter Wemm	c8b14fa8f0	Move nvram out of DEFAULTS. There really isn't a lot of justification for consuming the memory. The module works just fine in the unlikely case that this is needed. It can still be compiled into a custom kernel.	2007-10-29 22:19:08 +00:00
John Baldwin	8518d50a63	- Add constants for the different memory types in the SMAP table. - Use the SMAP types and constants from <machine/pc/bios.h> in the boot code rather than duplicating it.	2007-10-28 21:23:49 +00:00
Peter Wemm	d556638404	Split /dev/nvram driver out of isa/clock.c for i386 and amd64. I have not refactored it to be a generic device. Instead of being part of the standard kernel, there is now a 'nvram' device for i386/amd64. It is in DEFAULTS like io and mem, and can be turned off with 'nodevice nvram'. This matches the previous behavior when it was first committed.	2007-10-26 03:23:54 +00:00
Warner Losh	97816f8e3d	Add usb serial devices by default. I'm tired of telling people how to do this that should know better :-).	2007-10-26 02:20:29 +00:00
John Baldwin	e0f5da6d08	Update copyright attribution. MFC after: 3 days	2007-10-24 21:16:22 +00:00
Robert Watson	30d239bc4c	Merge first in a series of TrustedBSD MAC Framework KPI changes from Mac OS X Leopard--rationalize naming for entry points to the following general forms: mac_<object>_<method/action> mac_<object>_check_<method/action> The previous naming scheme was inconsistent and mostly reversed from the new scheme. Also, make object types more consistent and remove spaces from object types that contain multiple parts ("posix_sem" -> "posixsem") to make mechanical parsing easier. Introduce a new "netinet" object type for certain IPv4/IPv6-related methods. Also simplify, slightly, some entry point names. All MAC policy modules will need to be recompiled, and modules not updates as part of this commit will need to be modified to conform to the new KPI. Sponsored by: SPARTA (original patches against Mac OS X) Obtained from: TrustedBSD Project, Apple Computer	2007-10-24 19:04:04 +00:00
John Baldwin	5c5b5d4607	Slightly cleanup the 'bootdev' concept on x86 by changing the various macros to treat the 'slice' field as a real part of the bootdev instead of as hack that spans two other fields (adaptor (sic) and controller) that are not used in any modern FreeBSD boot code. MFC after: 1 week	2007-10-24 04:03:25 +00:00
John Baldwin	7e68ed1218	Stop disabling USB in the PAE kernel config. The USB code has been using bus_dma(9) for quite a while now and has been used on 64-bit archs as well. MFC after: 1 month	2007-10-24 03:53:10 +00:00
Julian Elischer	3745c395ec	Rename the kthread_xxx (e.g. kthread_create()) calls to kproc_xxx as they actually make whole processes. Thos makes way for us to add REAL kthread_create() and friends that actually make theads. it turns out that most of these calls actually end up being moved back to the thread version when it's added. but we need to make this cosmetic change first. I'd LOVE to do this rename in 7.0 so that we can eventually MFC the new kthread_xxx() calls.	2007-10-20 23:23:23 +00:00
Bjoern A. Zeeb	2b3e7485f6	Fold multiple asm statements into one so that the compiler at a certain optimization level (-march=pentium-mmx for example) does not insert intermediate ops which would trash the carry. Change both sys/i386/i386/in_cksum.c[1] and sys/i386/include/in_cksum.h. To my best understanding the same problem was addressed in rev. 1.16 of src/sys/i386/include/in_cksum.h for just a single function 3y ago. Reviewed by: jhb Submitted by: Zhouyi ZHOU <zhouzhouyi FreeBSD.org> (intial version of [1]) MFC after: 5 days PR: 115678, 69257	2007-10-20 22:18:42 +00:00
Ken Smith	95b55771b2	Switch over to ULE as the default scheduler for amd64 and i386 architectures.	2007-10-19 12:30:33 +00:00
Alexander Leidinger	9f05d312b3	Backout sensors framework. Requested by: phk Discussed on: cvs-all	2007-10-15 20:00:24 +00:00
Alexander Leidinger	989500bf1a	Import it(4) and lm(4), supporting most popular Super I/O Hardware Monitors. Submitted by: Constantine A. Murenin <cnst@FreeBSD.org> Sponsored by: Google Summer of Code 2007 (GSoC2007/cnst-sensors) Mentored by: syrinx Tested by: many OKed by: kensmith Obtained from: OpenBSD (parts)	2007-10-14 10:55:50 +00:00
Marius Strobl	55aaf894e8	Make the PCI code aware of PCI domains (aka PCI segments) so we can support machines having multiple independently numbered PCI domains and don't support reenumeration without ambiguity amongst the devices as seen by the OS and represented by PCI location strings. This includes introducing a function pci_find_dbsf(9) which works like pci_find_bsf(9) but additionally takes a domain number argument and limiting pci_find_bsf(9) to only search devices in domain 0 (the only domain in single-domain systems). Bge(4) and ofw_pcibus(4) are changed to use pci_find_dbsf(9) instead of pci_find_bsf(9) in order to no longer report false positives when searching for siblings and dupe devices in the same domain respectively. Along with this change the sole host-PCI bridge driver converted to actually make use of PCI domain support is uninorth(4), the others continue to use domain 0 only for now and need to be converted as appropriate later on. Note that this means that the format of the location strings as used by pciconf(8) has been changed and that consumers of <sys/pciio.h> potentially need to be recompiled. Suggested by: jhb Reviewed by: grehan, jhb, marcel Approved by: re (kensmith), jhb (PCI maintainer hat)	2007-09-30 11:05:18 +00:00
Christian Brueffer	4fabde5686	Use the correct expanded name for SCTP. PR: 116496 Submitted by: koitsu Reviewed by: rrs Approved by: re (kensmith)	2007-09-26 20:05:07 +00:00
Alan Cox	7bfda801a8	Change the management of cached pages (PQ_CACHE) in two fundamental ways: (1) Cached pages are no longer kept in the object's resident page splay tree and memq. Instead, they are kept in a separate per-object splay tree of cached pages. However, access to this new per-object splay tree is synchronized by the _free_ page queues lock, not to be confused with the heavily contended page queues lock. Consequently, a cached page can be reclaimed by vm_page_alloc(9) without acquiring the object's lock or the page queues lock. This solves a problem independently reported by tegge@ and Isilon. Specifically, they observed the page daemon consuming a great deal of CPU time because of pages bouncing back and forth between the cache queue (PQ_CACHE) and the inactive queue (PQ_INACTIVE). The source of this problem turned out to be a deadlock avoidance strategy employed when selecting a cached page to reclaim in vm_page_select_cache(). However, the root cause was really that reclaiming a cached page required the acquisition of an object lock while the page queues lock was already held. Thus, this change addresses the problem at its root, by eliminating the need to acquire the object's lock. Moreover, keeping cached pages in the object's primary splay tree and memq was, in effect, optimizing for the uncommon case. Cached pages are reclaimed far, far more often than they are reactivated. Instead, this change makes reclamation cheaper, especially in terms of synchronization overhead, and reactivation more expensive, because reactivated pages will have to be reentered into the object's primary splay tree and memq. (2) Cached pages are now stored alongside free pages in the physical memory allocator's buddy queues, increasing the likelihood that large allocations of contiguous physical memory (i.e., superpages) will succeed. Finally, as a result of this change long-standing restrictions on when and where a cached page can be reclaimed and returned by vm_page_alloc(9) are eliminated. Specifically, calls to vm_page_alloc(9) specifying VM_ALLOC_INTERRUPT can now reclaim and return a formerly cached page. Consequently, a call to malloc(9) specifying M_NOWAIT is less likely to fail. Discussed with: many over the course of the summer, including jeff@, Justin Husted @ Isilon, peter@, tegge@ Tested by: an earlier version by kris@ Approved by: re (kensmith)	2007-09-25 06:25:06 +00:00
Attilio Rao	c8790f5d09	Fix some entries in the locks static table of witness. In particular: - smp_tlb_mtx is no longer used, so it is axed. - smp rendezvous lock isn't really a leaf spin-mutex. Its bad placement in the table, however, has been the source of a false positive LOR reporting with the dt_lock. However, smp rendezvous lock would have had sched_lock there for older lock, so it wasn't still a leaf lock. - allpmaps is only used in ia32 architecture, so it is inserted in the appropriate stub. Addictionally: - kse_zombie_lock is no longer present, so its definition is axed out. - zombie_lock doesn't need to have an exported symbol, so just let's it be declared as static. Tested by: kris Approved by: jeff (mentor) Approved by: re	2007-09-20 20:38:43 +00:00
Konstantin Belousov	96a2b63525	Fill in cr2 in the signal context from ksi->ksi_addr. Together with the sys/i386/i386/trap.c rev. 1.306 it fixes the PR. Submitted by: rdivacky Suggested by: jhb Sponsored by: Google Summer of Code 2007 PR: kern/77710 Approved by: re (kensmith)	2007-09-20 13:46:26 +00:00
David Malone	e4a76144d0	regen. Approved by: re (kensmith)	2007-09-18 19:51:49 +00:00
David Malone	3ab8526963	The kernel version of Linux statfs64 is actually supposed to take 3 arguments, but we had forgotten the second argument. Also make the Linux statfs64 struct depend on the architecture because it has an extra 4 bytes padding on amd64 compared to i386. The three argument fix is from David Taylor, the struct statfs64 stuff is my fault. With this patch I can install i386 Linux matlab on an amd64 machine. Submitted by: David Taylor <davidt_at_yadt.co.uk> Approved by: re (kensmith)	2007-09-18 19:50:33 +00:00
Poul-Henning Kamp	f1be017900	Recognize the Soekris NET5501 and configure the error led. Add watchdog(4) support by using the MFGPT0 in the Geode LX CX5536. (Supported range: 2^30 .. 2^44 ns = 1s ... 5h) Approved by: re (bmah)	2007-09-18 09:19:44 +00:00
Peter Wemm	8bff6a112b	Fix an undefined symbol that as/ld neglected to flag as a problem. It was used in assembler code in such a way that no unresolved relocation records were generated, so ld didn't flag the problem. You can see this with an 'nm' of the kernel. There will be 'U MAXCPU' on SMP systems. The impact of this is that the intrcount/intrnames arrays do not have the intended amount of space reserved. This could lead to interesting problems due to the arrays being present in the middle of kernel code. An overflow would be rather interesting as executable code would be used as per-cpu incrementing interrupt counters. This fixes it for now by exporting MAXCPU to the assembler. A better fix might be to define these data structures in C - they're only referenced in the kernel from C code these days anyway. Approved by: re (kensmith)	2007-09-17 21:55:28 +00:00
Jeff Roberson	b61ce5b0e6	- Move all of the PS_ flags into either p_flag or td_flags. - p_sflag was mostly protected by PROC_LOCK rather than the PROC_SLOCK or previously the sched_lock. These bugs have existed for some time. - Allow swapout to try each thread in a process individually and then swapin the whole process if any of these fail. This allows us to move most scheduler related swap flags into td_flags. - Keep ki_sflag for backwards compat but change all in source tools to use the new and more correct location of P_INMEM. Reported by: pho Reviewed by: attilio, kib Approved by: re (kensmith)	2007-09-17 05:31:39 +00:00
Attilio Rao	0b2e598c14	This is a follow-up, cleaning-up commit about recent changes involving topology foo functions. Working at the patch for topology problems in ia32/amd64 evicted some problems regarding functions ordering in the SI_SUB_CPU family of SYSINIT'ed subsystems. In order to avoid problems with new modified to involved functions, a correct ordering is not semantically specified for SI_SUB_CPU functions (for a larger view of the issue please visit: http://lists.freebsd.org/pipermail/freebsd-current/2007-July/075409.html ) Discussed with: peter Tested by: kris, Rui Paulo <rpaulo@FreeBSD.org> Approved by: jeff Approved by: re	2007-09-11 22:54:09 +00:00
Yoshihiro Takahashi	7b226dfaa8	Fix a kernel panic due to a NULL pointer access on pc98. When any PnP device exists, isa_release_resource() is called with no activated resource. So a bushandle is not allocated yet. Approved by: re (kensmith)	2007-09-01 12:18:28 +00:00
Konstantin Belousov	0e6ed4feab	Regenerate. Approved by: re (kensmith)	2007-08-28 12:36:23 +00:00
Konstantin Belousov	b6e645c90f	Implement fake linux sched_getaffinity() syscall to enable java to work with Linux 2.6 emulation. This shall be reimplemented once FreeBSD gets native scheduler affinity syscalls. Submitted by: rdivacky Reviewed by: jkim Sponsored by: Google Summer of Code 2007 Approved by: re (kensmith)	2007-08-28 12:26:35 +00:00
Joseph Koshy	ea49750231	Assign sizes to assembly language support functions. Approved by: re (kensmith)	2007-08-22 05:06:14 +00:00
Joseph Koshy	298889efcb	Define an END() macro for use in i386 and amd64 assembly code, akin to the one available on the ia64, sparc64, and sun4v architectures. Approved by: re (kensmith)	2007-08-22 04:26:07 +00:00
Alan Cox	8beae25391	In general, when we map a page into the kernel's address space, we no longer create a pv entry for that mapping. (The two exceptions are mappings into the kernel's exec and pipe submaps.) Consequently, there is no reason for get_pv_entry() to dig deep into the free page queues, i.e., use VM_ALLOC_SYSTEM, by default. This revision changes get_pv_entry() to use VM_ALLOC_NORMAL by default, i.e., before calling pmap_collect() to reclaim pv entries. Approved by: re (kensmith)	2007-08-21 04:59:34 +00:00
Dag-Erling Smørgrav	83d18f2283	Add a driver for the on-die digital thermal sensor found on Intel Core and newer CPUs (including Core 2 and Core / Core 2 based Xeons). The driver attaches to each cpu device and creates a sysctl node in that device's sysctl context (dev.cpu.N.temperature). When invoked, the handler binds to the appropriate CPU to ensure a correct reading. Submitted by: Rui Paulo <rpaulo@fnop.net> Sponsored by: Google Summer of Code 2007 Tested by: des, marcus, Constantine A. Murenin, Ian FREISLICH Approved by: re (kensmith) MFC after: 3 weeks	2007-08-15 19:26:03 +00:00
Nate Lawson	3b3f28135f	Add "show sysregs" command to ddb. On i386, this gives gdt, idt, ldt, cr0-4, etc. Support should be added for other platforms that have a different set of registers for system use. Loosely based on: OpenBSD Approved by: re	2007-08-09 20:14:35 +00:00
Peter Wemm	b7778ae08f	Move mp_topology() from apic_init(i386) and apic_setup_local(amd64) to cpu_start_mp(). This is after we have read the cpuid registers to calculate the hyperthreading_cpus value for the sysctl that enables or disables hyperthread cores. Change mp_topology() to use that information rather than trying to do it itself. This solves the problem of ULE being incorrectly told that dual core Athlon64 X2 or Operton cpus are hyperthreading cores. At the very least, we now have a single piece of code to identify hyperthreading. Obtained from: jhb Approved by: re (kensmith)	2007-08-02 21:17:58 +00:00
David Malone	9be70a793e	It seems that some i386 mothermoards either do not implement the day of week field correctly, or they remember bad values that are written into the day of week field. For this reason, ignore the day of week field when reading the clock on i386 rather than bailing if it is set incorrectly. Problems were seen on a number of platforms, including VMWare, qemu, EPIA ME6000, Epox-3PTA and ABIT-SL30T. This is a slightly different fix to that proposed by Ted in his PR, but the same basic idea. PR: 111117 Submitted by: Ted Faber <faber@lunabase.org> Approved by: re (rwatson) MFC after: 3 weeks	2007-07-27 09:34:42 +00:00
John Baldwin	de016534a8	If the trap number stored in the trapframe is corrupted into a negative value, then we would use a negative index into the trap_msg[] array resulting in a nested page fault. Make the 'type' variable holding the trap number unsigned to avoid this. MFC after: 2 weeks Approved by: re (rwatson)	2007-07-26 15:32:55 +00:00
David Malone	6d8617d42a	If clock_ct_to_ts fails to convert time time from the real time clock, print a one line error message. Add some comments on not being able to trust the day of week field (I'll act on these comments in a follow up commit). Approved by: re MFC after: 3 weeks	2007-07-23 09:42:32 +00:00
Attilio Rao	52739c2d25	i386_set_ioperm, i386_get_ldt and i386_set_ldt are now MPSAFE (Giant/sched_lock free) so remove unuseful Giant cruft. Approved by: jeff Approved by: re Sponsorized by: NGX Italy (http://www.ngx.it)	2007-07-20 08:35:18 +00:00
Jeff Roberson	56a114967b	- Add support for blocking and releasing threads to i386 cpu_switch(). This is required for per-cpu scheduler lock support. Obtained from: attilio Tested by: current@ many users Approved by: re	2007-07-17 22:34:14 +00:00
Matt Jacob	06b642b55d	Remove the internal use of __packed and put it on the structures themselves. Reviewed by: nate, peter, warner, robert Approved by: re (ken)	2007-07-11 22:34:34 +00:00
Attilio Rao	ea11c140d0	NULL_LDT_BASE is used in !SMP kernels too and set_user_ldt() is not properly called. Address these two issues. Reported by: Tinderbox Tested by: le Approved by: jeff (mentor) Approved by: re	2007-07-08 18:17:42 +00:00
Nate Lawson	d73144e778	Now that we have a function that can be called from a cdevsw close() entry point, use it. Approved by: re	2007-07-07 17:54:33 +00:00
Attilio Rao	05dfa22fe9	Actual code shows several problems in ia32 LDT handling: - When a LDT entry changes, the old one is freed while it is still referenced by gdt and ldtr. This can lead to disruptive behaviours in particular on SMP machines. - When a LDT entry changes, it is assumed that the only one entity sharing the same LDT are threads in the same proc. It doesn't take in account edge cases where two processes share the same VM (rfork'ed ones, for example). This patch addresses these two problems and addictionally it fixes the usage of refcount switching back it to the old manually-grown refcount (since in this case would be faster). Diagnosed by: tegge Tested by: pho (a former version) Reviewed by: kib Approved by: jeff (mentor) Approved by: re	2007-07-07 16:59:01 +00:00
Bjoern A. Zeeb	5b919cdc47	I4B header files were repo-copied from sys/i386/include/ to sys/i4b/include/ so they will be available to all architectures once I4B compiles on those. Approved by: re (kensmith)	2007-07-06 07:23:39 +00:00
Peter Wemm	e106f3d812	__packed has no effect on u_int8_t's except to cause a warning (and never has had any effect). Approved by: re (rwatson)	2007-07-05 07:28:38 +00:00
Peter Wemm	b811e070b4	Remove pad argument from ftruncate wrapper. Oops. Approved by: re (kensmith)	2007-07-05 05:32:44 +00:00
Peter Wemm	79d5bdcca5	Don't add the 'pad' argument to the mmap/truncate/etc syscalls. Submitted by: kensmith Approved by: re (kensmith)	2007-07-04 23:06:43 +00:00
Bjoern A. Zeeb	118043c6b1	Temporary disconnect i4bing, i4bisppp and i4bipr from the build for the 7.0 timeframe. This is needed because I4B is not locked and NET_NEEDS_GIANT goes away. The plan is to lock I4B and bring everything back for 7.1. Approved by: re (kensmith)	2007-07-04 00:18:39 +00:00
Nate Lawson	a1ec53930b	Revert previous commit, retaining cpufreq. Approved by: re (implicitly)	2007-07-01 22:19:20 +00:00
Nate Lawson	a7b811a620	Add cpufreq(4) to GENERIC. It does not change the frequency by default, so systems should be relatively unaffected. Users can then simply enable powerd(8) in rc.conf to take advantage of it. Approved by: re	2007-07-01 21:47:45 +00:00
Alan Cox	ba4b85e482	Pages that do belong to an object and page queue can now be freed without holding the page queues lock. Thus, the page table pages released by pmap_remove() and pmap_remove_pages() can be freed after the page queues lock is released. Approved by: re (kensmith)	2007-07-01 07:08:26 +00:00
Nate Lawson	00a304487f	Update the suspend/resume user API while maintaining backwards compat. Improvements: * /etc/rc.suspend,rc.resume are always run, no matter the source of the suspend request (user or kernel, apm or acpi) * suspend now requires positive user acknowledgement. If a user program wants to cancel the suspend, they can. If one of the user programs hangs or doesn't respond within 10 seconds, the system suspends anyway. * /dev/apm is clonable, allowing multiple listeners for suspend events. In the future, xorg-server can use this to be informed about suspend even if there are other listeners (i.e. apmd). Changes: * Two new ACPI ioctls: REQSLPSTATE and ACKSLPSTATE. Request begins the process of suspending by notifying all listeners. acpi is monitored by devd(8) and /dev/apm listener(s) are also counted. Users register their approval or disapproval via Ack. If anyone disapproves, suspend is vetoed. * Old user programs or kernel modules that used SETSLPSTATE continue to work. A message is printed once that this interface is deprecated. * acpiconf gains the -k flag to ack the suspend request. This flag is undocumented on purpose since it's only used by /etc/rc.suspend. It is not intended to be a permanent change and will be removed once a better power API is implemented. * S5 (power off) is no longer supported via acpiconf -s 5 or apm -z/-Z. This restores previous behavior of halt/shutdown -p being the interface. * Miscellaneous improvements to error reporting Approved by: re	2007-06-21 22:50:37 +00:00
Nate Lawson	eb988b9d42	Use bus_dma to get a page in the first 4 GB. Since the physical address of the magic string is passed in a 32-bit register, we can't use high memory in the PAE case. This also eliminates a use of vtophys(). Tested by: Jeff Shimbo <jts767 / gmail.com> MFC after: 1 week	2007-06-17 07:18:23 +00:00
Marius Strobl	7a89ac4d26	- Define data of struct gfb_font a const as it's only used to supply font data and remove the array size from the definition as f.e. the gallant 12 x 22 font data is 256 * 44 in size, exceeding the previously hard- coded size. - Declare the bold8x16 instance of struct gfb_font as const as it's not intended to be changed at run-time as a whole either. - Use __FBSDID in xboxfb.c Tested by: rink	2007-06-16 21:31:53 +00:00
Peter Wemm	5915fb72fb	Prototype (but functional) Linux-ish /dev/nvram interface to the extra 114 bytes of cmos ram in the PC clock chip. The big difference between this and the Linux version is that we do not recalculate the checksums for bytes 16..31. We use this at work when cloning identical machines - we can copy the bios settings as well. Reading /dev/nvram gives 114 bytes of data but you can seek/read/write whichever bytes you like. Yes, this is a "foot, gun, fire!" type of device.	2007-06-15 22:58:14 +00:00
Xin LI	a2346f7c3c	Enable SCTP by default for GENERIC kernels in order to give it more exposure. The current state of SCTP implementation is considered to be ready for 32-bit platforms, but still need some work/testing on 64-bit platforms. Approved by: re (kensmith) Discussed with: rrs	2007-06-14 17:14:27 +00:00
John Baldwin	7dba15b72b	Don't clobber tf_err with the eva from a page fault as the page fault address is saved in ksi_addr already. PR: i386/101379 Submitted by: Tijl Coosemans : tijl ulyssis org	2007-06-13 22:37:48 +00:00
Pyun YongHyeon	b5f0caf909	Add nfe(4) to the list of drivers supported by GENERIC kernel. While I'm here comment out nve(4) as nfe(4) will take over. Approved by: re	2007-06-12 02:24:30 +00:00
Andrew Thompson	7302aa80a9	Exclude wlan_scan_* from PAE like the rest of wlan.	2007-06-11 19:29:42 +00:00
Matt Jacob	f2114f3bcd	Check against maxsegsz being zero in bus_dma_tag_create and return EINVAL if it is. Reviewed by: scott long	2007-06-11 17:57:24 +00:00
Andrew Thompson	ed3247cea7	Add wlan_scan_ap and wlan_scan_sta to platforms that include wlan.	2007-06-11 08:26:40 +00:00
Marcel Moolenaar	2b39bb4f4f	Use default options for default partitioning schemes, rather than making the relevant files standard. This avoids duplication and makes it easier to override/disable unwanted schemes. Since ARM doesn't have a DEFAULTS configuration file, leave the source files for the BSD and MBR partitioning schemes in files.arm for now.	2007-06-11 00:38:06 +00:00
Attilio Rao	393a081d42	Optimize vmmeter locking. In particular: - Add an explicative table for locking of struct vmmeter members - Apply new rules for some of those members - Remove some unuseful comments Heavily reviewed by: alc, bde, jeff Approved by: jeff (mentor)	2007-06-10 21:59:14 +00:00
Marcel Moolenaar	01bd17cc99	Add kdb_cpu_sync_icache(), intended to synchronize instruction caches with data caches after writing to memory. This typically is required to make breakpoints work on ia64 and powerpc. For those architectures the function is implemented.	2007-06-09 21:55:17 +00:00
Robert Watson	68d4cc614a	Enable AUDIT by default in the GENERIC kernel, allowing security event auditing to be turned on without a kernel recompile, just an rc.conf option. Approved by: re (kensmith) Obtained from: TrustedBSD Project	2007-06-08 20:29:07 +00:00
David Xu	42ce445fed	Backout experimental adaptive-spin umtx code.	2007-06-06 07:35:08 +00:00
John Baldwin	ce0b0c05aa	Move a warning under bootverbose as no machines that trigger it have ended up being broken.	2007-06-05 18:57:48 +00:00
Alan Cox	e5c45405f0	Add the machine-specific definitions for configuring the new physical memory allocator. Set the size of phys_avail[] and dump_avail[] using one of these definitions. Approved by: re	2007-06-05 05:17:20 +00:00
Jeff Roberson	982d11f836	Commit 14/14 of sched_lock decomposition. - Use thread_lock() rather than sched_lock for per-thread scheduling sychronization. - Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization. Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)	2007-06-05 00:00:57 +00:00
Jeff Roberson	1b1618fb12	- Change comments and asserts to reflect the removal of the global scheduler lock. Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)	2007-06-04 23:57:32 +00:00
Jeff Roberson	74aaec43e8	Commit 11/14 of sched_lock decomposition. - There is no globally visible scheduler lock any longer. For now the watchdog can only check Giant. This model of checking particular locks is flawed and should be revisited. Other metrics should be considered. Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)	2007-06-04 23:56:33 +00:00
Jeff Roberson	e4b5aee3a8	Commit 10/14 of sched_lock decomposition. - Use sched_throw() rather than replicating the same cpu_throw() code for each architecture. This also allows the scheduler to use any locking it may want to. - Use the thread_lock() rather than sched_lock when preempting. - The scheduler lock is not required to synchronize release_aps. Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)	2007-06-04 23:56:08 +00:00
Attilio Rao	6759608248	Rework the PCPU_* (MD) interface: - Rename PCPU_LAZY_INC into PCPU_INC - Add the PCPU_ADD interface which just does an add on the pcpu member given a specific value. Note that for most architectures PCPU_INC and PCPU_ADD are not safe. This is a point that needs some discussions/work in the next days. Reviewed by: alc, bde Approved by: jeff (mentor)	2007-06-04 21:38:48 +00:00
David Malone	041b706b2f	Despite several examples in the kernel, the third argument of sysctl_handle_int is not sizeof the int type you want to export. The type must always be an int or an unsigned int. Remove the instances where a sizeof(variable) is passed to stop people accidently cut and pasting these examples. In a few places this was sysctl_handle_int was being used on 64 bit types, which would truncate the value to be exported. In these cases use sysctl_handle_quad to export them and change the format to Q so that sysctl(1) can still print them.	2007-06-04 18:25:08 +00:00
Attilio Rao	2feb50bf7d	Revert VMCNT_* operations introduction. Probabilly, a general approach is not the better solution here, so we should solve the sched_lock protection problems separately. Requested by: alc Approved by: jeff (mentor)	2007-05-31 22:52:15 +00:00
Paolo Pisati	3401f2c1df	In some particular cases (like in pccard and pccbb), the real device handler is wrapped in a couple of functions - a filter wrapper and an ithread wrapper. In this case (and just in this case), the filter wrapper could ask the system to schedule the ithread and mask the interrupt source if the wrapped handler is composed of just an ithread handler: modify the "old" interrupt code to make it support this situation, while the "new" interrupt code is already ok. Discussed with: jhb	2007-05-31 19:25:35 +00:00
Konstantin Belousov	9e223287c0	Revert UF_OPENING workaround for CURRENT. Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file. Proposed and reviewed by: jhb Reviewed by: daichi (unionfs) Approved by: re (kensmith)	2007-05-31 11:51:53 +00:00
Dag-Erling Smørgrav	753bcb5c34	Add CPUID2_PDCM Requested by: jkim MFC after: 3 days	2007-05-31 11:26:45 +00:00
Dag-Erling Smørgrav	3da3c3632e	Add descriptive comment to PDCM entry.	2007-05-29 19:39:18 +00:00
Dag-Erling Smørgrav	a99f0a4653	Remove a pointless bootverbose message. MFC after: 3 days	2007-05-29 19:25:50 +00:00
Dag-Erling Smørgrav	cc1eed925f	Add feature name for features2 bit 15. PR: i386/113133 Submitted by: Pankov Pavel <pankov_p@mail.ru> MFC after: 3 days	2007-05-29 19:21:53 +00:00
Attilio Rao	02b0a160dc	Fix some problems introduced with the last descriptors tables locking patch: - Do the correct test for ldt allocation - Drop dt_lock just before to call kmem_free (since it acquires blocking locks inside) - Solve a deadlock with smp_rendezvous() where other CPU will wait undefinitively for dt_lock acquisition. - Add dt_lock in the WITNESS list of spinlocks While applying these modifies, change the requirement for user_ldt_free() making that returning without dt_lock held. Tested by: marcus, tegge Reviewed by: tegge Approved by: jeff (mentor)	2007-05-29 18:55:41 +00:00
Pyun YongHyeon	590f73f72e	Honor maxsegsz of less than a page size in a DMA tag. Previously it used to return PAGE_SIZE without respect to restrictions of a DMA tag. This affected all of the busdma load functions that use _bus_dmamap_loader_buffer() as their back-end. Reviewed by: scottl	2007-05-29 06:30:26 +00:00
Hidetoshi Shimokawa	35fafac2ac	Enable fwip and dcons in GENERIC. They seem fairly stable. Note on dcons: To enable dcons in kernel, put the following lines in /boot/loader.conf. You may also want to enable dcons in /etc/ttys. boot_multicons="YES" #Force dcons to be the high-level console if a firewire bus presents. #hw.firewire.dcons_crom.force_console=1 FireWire/dcons support in loader will come shortly. (i386/amd64 only)	2007-05-28 14:38:43 +00:00
Alan Cox	c155d5d059	Eliminate an unused definition.	2007-05-27 20:34:26 +00:00
Robert Watson	2129d3ea2b	Remove "XXX Giant" comments before calls to kdb_trap() -- the kernel debugger is quite capable of handling Giant-free execution at this point. Several other similar comments remain in trap.c on both i386 and amd64 awaiting analysis.	2007-05-27 19:16:45 +00:00
Robert Watson	5aabce7698	Don't check curproc->p_comm for NULL as it is an array allocated as part of struct proc. Found with: Coverity Prevent(tm) CID: 2096	2007-05-27 17:27:59 +00:00
Konstantin Belousov	1c182de9a9	Move futex support code from <arch>/support.s into linux compat directory. Implement all futex atomic operations in assembler to not depend on the fuword() that does not allow to distinguish between -1 and failure return. Correctly return 0 from atomic operations on success. In collaboration with: rdivacky Tested by: Scot Hetzel <swhetzel gmail com>, Milos Vyletel <mvyletel mzm cz> Sponsored by: Google SoC 2007	2007-05-23 08:33:06 +00:00
Alexander Kabaev	23a29e45cd	Allow FreeBSD's native ELF image activators to execute shared libraries the same way it was enabled for Linux binares in linuxulator. This allows binaries built with -pie. Many ports auto-detect -fPIE support in GCC 4.2 and build binaries FreeBSD was unable to run.	2007-05-22 02:22:58 +00:00
Jeff Roberson	80b200da28	- rename VMCNT_DEC to VMCNT_SUB to reflect the count argument. Suggested by: julian@ Contributed by: attilio@	2007-05-20 22:33:42 +00:00
Jeff Roberson	0ad5e7f326	- Move GDT/LDT locking into a seperate spinlock, removing the global scheduler lock from this responsibility. Contributed by: Attilio Rao <attilio@FreeBSD.org> Tested by: jeff, kkenn	2007-05-20 22:03:57 +00:00
Matt Jacob	2d5f1502fe	Initializae lastaddr to 0 in bus_dmamap_load_uio so that _bus_dmamap_load_buffer won't (potentially) be confused. Discovered by: gcc 4.2 MFC after: 3 days	2007-05-20 16:53:45 +00:00
Alexander Kabaev	fa298d5ea8	Include machine/pcb.hto turn extern struct pcb stoppcbs[]; construct into the valid C.	2007-05-19 05:01:43 +00:00
Jeff Roberson	222d01951f	- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines. Contributed by: Attilio Rao <attilio@FreeBSD.org>	2007-05-18 07:10:50 +00:00
Kirk McKusick	2f79c90982	Update entries for building tags.	2007-05-13 18:21:54 +00:00
Alexander Kabaev	ec69a8a6d2	Do not dereference linux_to_bsd_signal[-1] if userland has passed zero as exit signal. GCC 4.2 changes the kernel data segment layout not to have 0 in that memory location. This code ran by luck before and now the luck has run out.	2007-05-11 01:25:51 +00:00
Scott Long	f73e86c383	It turns out that the hptiop driver isn't portable after all. Confine it to amd64 and i386 for now.	2007-05-09 15:55:45 +00:00
Maxim Konovalov	856d5abeed	o Fix typo: comments start by "#" not "*".	2007-05-09 11:43:04 +00:00
Scott Long	4439f8b4b6	Introduce a driver for the Highpoint RocketRAID 3xxx series of controllers. The driver relies on CAM. Many thanks to Highpoint for providing this driver.	2007-05-09 07:07:26 +00:00
John Baldwin	2e025791ce	Handle CPUs with APIC IDs higher than 32 (at least one IBM server uses an APIC ID of 38 for its second CPU): - Add a new MAX_APIC_ID constant for the highest valid APIC ID for modern systems. - Size the various arrays in the MADT, MP Table, and SMP code that are indexed by APIC IDs to allow for up to MAX_APIC_ID. - Explicitly go through and assign logical cpu ids to local APICs before starting any of the APs up rather than doing it while starting up the APs. This step is now where we honor MAXCPU. MFC after: 1 week	2007-05-08 22:01:04 +00:00
John Baldwin	fb610ca1f9	Minor fixes and tweaks to the x86 interrupt code: - Split the intr_table_lock into an sx lock used for most things, and a spin lock to protect intrcnt_index. Originally I had this as a spin lock so interrupt code could use it to lookup sources. However, we don't actually do that because it would add a lot of overhead to interrupts, and if we ever do support removing interrupt sources, we can use other means to safely do so w/o locking in the interrupt handling code. - Replace is_enabled (boolean) with is_handlers (a count of handlers) to determine if a source is enabled or not. This allows us to notice when a source is no longer in use. When that happens, we now invoke a new PIC method (pic_disable_intr()) to inform the PIC driver that the source is no longer in use. The I/O APIC driver frees the APIC IDT vector when this happens. The MSI driver no longer needs to have a hack to clear is_enabled during msi_alloc() and msix_alloc() as a result of this change as well. - Add an apic_disable_vector() to reset an IDT vector back to Xrsvd to complement apic_enable_vector() and use it in the I/O APIC and MSI code when freeing an IDT vector. - Add a new nexus hook: nexus_add_irq() to ask the nexus driver to add an IRQ to its irq_rman. The MSI code uses this when it creates new interrupt sources to let the nexus know about newly valid IRQs. Previously the msi_alloc() and msix_alloc() passed some extra stuff back to the nexus methods which then added the IRQs. This approach is a bit cleaner. - Change the MSI sx lock to a mutex. If we need to create new sources, drop the lock, create the required number of sources, then get the lock and try the allocation again.	2007-05-08 21:29:14 +00:00
Kevin Lo	dd5ee92cb5	Add rum(4)	2007-05-07 02:06:03 +00:00
Paolo Pisati	bafe5a3118	Bring in the reminaing bits to make interrupt filtering work: o push much of the i386 and amd64 MD interrupt handling code (intr_machdep.c::intr_execute_handlers()) into MI code (kern_intr.c::ithread_loop()) o move filter handling to kern_intr.c::intr_filter_loop() o factor out the code necessary to mask and ack an interrupt event (intr_machdep.c::intr_eoi_src() and intr_machdep.c::intr_disab_eoi_src()), and make them part of 'struct intr_event', passing them as arguments to kern_intr.c::intr_event_create(). o spawn a private ithread per handler (struct intr_handler::ih_thread) with filter and ithread functions. Approved by: re (implicit?)	2007-05-06 17:02:50 +00:00
Kevin Lo	0738dfc386	Add support for Ralink Technology RT2501USB/RT2601USB devices. Reviewed by: sam, sephe Obtained from: OpenBSD	2007-05-06 10:07:21 +00:00
Alan Cox	04a18977c8	Define every architecture as either VM_PHYSSEG_DENSE or VM_PHYSSEG_SPARSE depending on whether the physical address space is densely or sparsely populated with memory. The effect of this definition is to determine which of two implementations of vm_page_array and PHYS_TO_VM_PAGE() is used. The legacy implementation is obtained by defining VM_PHYSSEG_DENSE, and a new implementation that trades off time for space is obtained by defining VM_PHYSSEG_SPARSE. For now, all architectures except for ia64 and sparc64 define VM_PHYSSEG_DENSE. Defining VM_PHYSSEG_SPARSE on ia64 allows the entirety of my Itanium 2's memory to be used. Previously, only the first 1 GB could be used. Defining VM_PHYSSEG_SPARSE on sparc64 allows USIIIi-based systems to boot without crashing. This change is a combination of Nathan Whitehorn's patch and my own work in perforce. Discussed with: kmacy, marius, Nathan Whitehorn PR: 112194	2007-05-05 19:50:28 +00:00
John Baldwin	e706f7f0c7	Revamp the MSI/MSI-X code a bit to achieve two main goals: - Simplify the amount of work that has be done for each architecture by pushing more of the truly MI code down into the PCI bus driver. - Don't bind MSI-X indicies to IRQs so that we can allow a driver to map multiple MSI-X messages into a single IRQ when handling a message shortage. The changes include: - Add a new pcib_if method: PCIB_MAP_MSI() which is called by the PCI bus to calculate the address and data values for a given MSI/MSI-X IRQ. The x86 nexus drivers map this into a call to a new 'msi_map()' function in msi.c that does the mapping. - Retire the pcib_if method PCIB_REMAP_MSIX() and remove the 'index' parameter from PCIB_ALLOC_MSIX(). MD code no longer has any knowledge of the MSI-X index for a given MSI-X IRQ. - The PCI bus driver now stores more MSI-X state in a child's ivars. Specifically, it now stores an array of IRQs (called "message vectors" in the code) that have associated address and data values, and a small virtual version of the MSI-X table that specifies the message vector that a given MSI-X table entry uses. Sparse mappings are permitted in the virtual table. - The PCI bus driver now configures the MSI and MSI-X address/data registers directly via custom bus_setup_intr() and bus_teardown_intr() methods. pci_setup_intr() invokes PCIB_MAP_MSI() to determine the address and data values for a given message as needed. The MD code no longer has to call back down into the PCI bus code to set these values from the nexus' bus_setup_intr() handler. - The PCI bus code provides a callout (pci_remap_msi_irq()) that the MD code can call to force the PCI bus to re-invoke PCIB_MAP_MSI() to get new values of the address and data fields for a given IRQ. The x86 MSI code uses this when an MSI IRQ is moved to a different CPU, requiring a new value of the 'address' field. - The x86 MSI psuedo-driver loses a lot of code, and in fact the separate MSI/MSI-X pseudo-PICs are collapsed down into a single MSI PIC driver since the only remaining diff between the two is a substring in a bootverbose printf. - The PCI bus driver will now restore MSI-X state (including programming entries in the MSI-X table) on device resume. - The interface for pci_remap_msix() has changed. Instead of accepting indices for the allocated vectors, it accepts a mini-virtual table (with a new length parameter). This table is an array of u_ints, where each value specifies which allocated message vector to use for the corresponding MSI-X message. A vector of 0 forces a message to not have an associated IRQ. The device may choose to only use some of the IRQs assigned, in which case the unused IRQs must be at the "end" and will be released back to the system. This allows a driver to use the same remap table for different shortage values. For example, if a driver wants 4 messages, it can use the same remap table (which only uses the first two messages) for the cases when it only gets 2 or 3 messages and in the latter case the PCI bus will release the 3rd IRQ back to the system. MFC after: 1 month	2007-05-02 17:50:36 +00:00
Ariff Abdullah	1d80d190af	Disable C1 Enhanced mode on AMD K8 Family Revision F and above to keep local APIC timer alive. Reviewed by: jhb PR: i386/104678 MFC after: 3 days	2007-04-25 19:58:42 +00:00
John Baldwin	a5b6b9a68e	Fix the triple fault used as a last resort during a reboot to actually fault. The previous method zero'd out the page tables, invalidated the TLB, and then entered a spin loop. The idea was that the instruction after the TLB invalidate would result in a page fault and the page fault and subsequent double fault wouldn't be able to determine the physical page for their fault handlers' first instruction. This stopped working when PGE (PG_G PTE/PDE bit) support was added as a TLB invalidate via %cr3 reload doesn't clear TLB entries with PG_G set. Thus, the CPU was still able to map the virtual address for the spin loop and happily performed its infinite loop. The triple fault now uses a much more deterministic sledge-hammer approach to generate a triple fault. First, the IDT descriptor is set to point to an empty IDT, so any interrupts (including a double fault) will instantly fault. Second, we trigger a int 3 breakpoint to force an interrupt and kick off a triple fault. MFC after: 3 days	2007-04-24 21:17:45 +00:00
John Baldwin	b72d374cee	Update comments for the 0xcf9 and 0x92 reset methods to explain what we are actually doing and what the various bits mean.	2007-04-24 15:16:27 +00:00
John Baldwin	194617769b	Tweak printf string.	2007-04-23 22:53:01 +00:00
Robert Watson	c14d15ae3e	Remove MAC Framework access control check entry points made redundant with the introduction of priv(9) and MAC Framework entry points for privilege checking/granting. These entry points exactly aligned with privileges and provided no additional security context: - mac_check_sysarch_ioperm() - mac_check_kld_unload() - mac_check_settime() - mac_check_system_nfsd() Add mpo_priv_check() implementations to Biba and LOMAC policies, which, for each privilege, determine if they can be granted to processes considered unprivileged by those two policies. These mostly, but not entirely, align with the set of privileges granted in jails. Obtained from: TrustedBSD Project	2007-04-22 15:31:22 +00:00
Stephan Uphoff	31b4f4a916	Modify TLB invalidation handling. Reviewed by: alc@, peter@ MFC after: 1 week	2007-04-21 14:17:30 +00:00
Stephane E. Potvin	0e5179e441	Add support for specifying a minimal size for vm.kmem_size in the loader via vm.kmem_size_min. Useful when using ZFS to make sure that vm.kmem size will be at least 256mb (for example) without forcing a particular value via vm.kmem_size. Approved by: njl (mentor) Reviewed by: alc	2007-04-21 01:14:48 +00:00
Poul-Henning Kamp	3f17cc74af	style nit	2007-04-19 09:18:51 +00:00
Poul-Henning Kamp	cc76e59ded	On AMD's Geode LX: Force the TSC to run through core-suspension so we can use it as a timecounter. Sponsored by: Soekris Engineering	2007-04-18 10:08:24 +00:00
John Baldwin	88a5255bc4	Honor the BUS_DMA_NOCACHE flag to bus_dmamem_alloc() on amd64 and i386 by mapping the pages as UC (uncacheable) using pmap_change_attr(). MFC after: 1 week Requested by: ariff Reviewed by: scottl	2007-04-17 21:05:34 +00:00
Alan Cox	0b76504872	Eliminate the misuse of PG_FRAME to truncate a virtual address to a virtual page boundary. Reviewed by: ru@	2007-04-13 16:07:29 +00:00
Alan Cox	1434c1f34b	MFamd64 Define PGEX_RSV.	2007-04-12 17:00:56 +00:00
Ruslan Ermilov	f76f6cfd25	Fix PAE on SMP by enabling EFER_NXE on all APs. Reported by: kris Diagnosed by: alc	2007-04-12 11:05:24 +00:00
Pawel Jakub Dawidek	fef2a25971	Remove trailing '.' for consistency!	2007-04-10 21:40:13 +00:00
Pawel Jakub Dawidek	57bcf75fd2	Add UFS_GJOURNAL options to the GENERIC kernel. Approved by: re (kensmith)	2007-04-10 16:49:41 +00:00
Ruslan Ermilov	2e137367b4	Add the PG_NX support for i386/PAE. Reviewed by: alc	2007-04-06 18:15:03 +00:00
Jung-uk Kim	357afa7113	MFP4: Turn emul_lock into a mutex. Submitted by: rdivacky	2007-04-02 18:38:13 +00:00
John Baldwin	4e7f640dfb	Optimize sx locks to use simple atomic operations for the common cases of obtaining and releasing shared and exclusive locks. The algorithms for manipulating the lock cookie are very similar to that rwlocks. This patch also adds support for exclusive locks using the same algorithm as mutexes. A new sx_init_flags() function has been added so that optional flags can be specified to alter a given locks behavior. The flags include SX_DUPOK, SX_NOWITNESS, SX_NOPROFILE, and SX_QUITE which are all identical in nature to the similar flags for mutexes. Adaptive spinning on select locks may be enabled by enabling the ADAPTIVE_SX kernel option. Only locks initialized with the SX_ADAPTIVESPIN flag via sx_init_flags() will adaptively spin. The common cases for sx_slock(), sx_sunlock(), sx_xlock(), and sx_xunlock() are now performed inline in non-debug kernels. As a result, <sys/sx.h> now requires <sys/lock.h> to be included prior to <sys/sx.h>. The new kernel option SX_NOINLINE can be used to disable the aforementioned inlining in non-debug kernels. The size of struct sx has changed, so the kernel ABI is probably greatly disturbed. MFC after: 1 month Submitted by: attilio Tested by: kris, pjd	2007-03-31 23:23:42 +00:00
Jung-uk Kim	46bd727a1e	Correct BB-profiling and adjust comments. Pointed out by: bde Reviewed by: bde	2007-03-31 01:47:37 +00:00
Jung-uk Kim	6a4abad780	Fix off-by-4 error in address validation for i386, reduce PCB reloading, and fix more style(9) nits. Pointed out by: bde Discussed with: kib Reviewd by: bde	2007-03-30 23:19:08 +00:00
Jung-uk Kim	80f87d5e55	Fix more style(9) nits[1] and remove unnecessary use of '#if !defined(_KERNEL)'. Pointed out by: bde[1]	2007-03-30 19:33:53 +00:00
Jung-uk Kim	6403d3a160	Use the same wisdom of sys/i386/i386/support.s 1.97 to remove obfuscation. Pointed out by: bde	2007-03-30 18:27:57 +00:00
Jung-uk Kim	a328699b34	MFP4: Linux futex support for amd64. Initial patch was submitted by kib and additional work was done by Divacky Roman. Tested by: emulation	2007-03-30 01:07:28 +00:00
Julian Elischer	6734f35eac	Implement the openat() linux syscall Submitted by: Roman Divacky (rdivacky@) MFC after: 2 weeks	2007-03-29 02:11:46 +00:00
Nick Hibma	f29fa1dfa4	Revisit the watchdogs: Resetting the error to EINVAL after failing to set the watchdog might hide the succesful arming of an earlier one. Accept that on failing to arm any watchdog (because of non-supported timeouts) EOPNOTSUPP is returned instead of the more appropriate EINVAL. MFC after: 3 days	2007-03-27 21:03:37 +00:00
Kris Kennaway	67eae018cb	Remove unnecessary giant acquisition around panic in #ifdef DIAGNOSTIC code. # There is some question about whether this code is even relevant any # longer (it dates back to prehistoric times, i.e. present in r1.1), # especially on amd64. Reviewed by: jhb	2007-03-26 21:45:44 +00:00
Nate Lawson	0d4ac62a35	Add an interface for drivers to be notified of changes to CPU frequency. cpufreq_pre_change is called before the change, giving each driver a chance to revoke the change. cpufreq_post_change provides the results of the change (success or failure). cpufreq_levels_changed gives the unit number of the cpufreq device whose number of available levels has changed. Hook in all the drivers I could find that needed it. * TSC: update TSC frequency value. When the available levels change, take the highest possible level and notify the timecounter set_cputicker() of that freq. This gets rid of the "calcru: runtime went backwards" messages. * identcpu: updates the sysctl hw.clockrate value * Profiling: if profiling is active when the clock changes, let the user know the results may be inaccurate. Reviewed by: bde, phk MFC after: 1 month	2007-03-26 18:03:29 +00:00
John Baldwin	9a53f6d97a	Fix a silly bogon that broke ibcs2_rename(). CID: 1065 Found by: Coverity Prevent (tm) Reported by: netchild	2007-03-26 15:39:49 +00:00
Alan Cox	8a5e898d63	In order to satisfy ACPI's need for an identity mapping, modify the temporary mapping created by locore so that the lowest two to four megabytes can become a permanent identity mapping. This implementation avoids any use of a large page mapping.	2007-03-24 19:53:22 +00:00
Jung-uk Kim	2be4e4713a	Catch up with ACPI-CA 20070320 import.	2007-03-22 18:16:43 +00:00
John Baldwin	d66ff27773	Change the amd64, i386, and ia64 nexus drivers to setup bus space tags and handles when activating a resource via bus_activate_resource() rather than doing some of the work in bus_alloc_resource() and some of it in bus_activate_resource(). One note is that when using isa_alloc_resourcev() on PC-98, drivers now need to just use bus_release_resource() without explicitly calling bus_deactivate_resource() first. nyan@ has already fixed all of the PC-98 drivers.	2007-03-21 15:36:38 +00:00
John Baldwin	b8783b00f8	Add a new apic0 psuedo-device to claim memory resources for the memory address ranges used by local and I/O APICs in the system. Some systems also reserve these ranges as system resources via either PnPBIOS or ACPI, so this device currently attaches after acpi0 and legacy0 so that the system resources are given precedence.	2007-03-20 21:53:31 +00:00
John Baldwin	95a07592ee	Add a new ram0 pseudo-device that claims memory resouces for physical addresses corresponding to system RAM. On amd64 ram0 uses the SMAP and claims all the type 1 SMAP regions. On i386 ram0 uses the dump_avail[] array. Note that on i386 we have to ignore regions above 4G in PAE kernels since bus resources use longs.	2007-03-20 21:08:39 +00:00
Jung-uk Kim	2498f259d4	- Add macros for newly added CPUID bits in the corresponding header files. - Use correct capticalization in xTPR as Intel uses in their documents. - Use proper description instead of vendor code name in comment.	2007-03-20 20:22:45 +00:00
John Baldwin	ce533e82a2	Tweak the probe/attach order of devices on the x86 nexus devices. Various BIOS-related psuedo-devices are added at an order of 5. acpi0 is added at an order of 10, and legacy0 is added at an order of 11.	2007-03-20 20:21:44 +00:00
Sam Leffler	09dce7a13d	display two new Intel feature bits Submitted by: "Rui Paulo" <rpaulo@gmail.com> MFC after: 2 weeks	2007-03-19 05:23:42 +00:00
Alan Cox	8cfba7267f	Eliminate an unused parameter.	2007-03-17 19:42:06 +00:00
Nate Lawson	9b0df55b61	Create an identity mapping (V=P) super page for the low memory region on boot. Then, just switch to the kernel pmap when suspending instead of allocating/freeing our own mapping every time. This should solve a panic of pmap_remove() being called with interrupts disabled. Thanks to Alan Cox for developing this patch. Note: this means that ACPI requires super page (PG_PS) support in the CPU. This has been present since the Pentium and first documented in the Pentium Pro. However, it may need to be revisited later. Submitted by: alc MFC after: 1 month	2007-03-14 22:30:02 +00:00
Jung-uk Kim	ab5916a526	Add another CPUID for AMD CPUs and fix style(9) while I am here.	2007-03-12 20:27:21 +00:00
Alan Cox	c640357f04	Push down the implementation of PCPU_LAZY_INC() into the machine-dependent header file. Reimplement PCPU_LAZY_INC() on amd64 and i386 making it atomic with respect to interrupts. Reviewed by: bde, jhb	2007-03-11 05:54:29 +00:00
John Baldwin	07bb813626	Defer calling lapic_init() until we've completed the 'MPTable: <...>' printf. Otherwise, printfs inside of lapic_init() (such as during a verbose boot) can uglify the output.	2007-03-09 15:49:57 +00:00
Mohan Srinivasan	f9bb753844	Over NFS, an open() call could result in multiple over-the-wire GETATTRs being generated - one from lookup()/namei() and the other from nfs_open() (for cto consistency). This change eliminates the GETATTR in nfs_open() if an otw GETATTR was done from the namei() path. Instead of extending the vop interface, we timestamp each attr load, and use this to detect whether a GETATTR was done from namei() for this syscall. Introduces a thread-local variable that counts the syscalls made by the thread and uses <pid, tid, thread syscalls> as the attrload timestamp. Thanks to jhb@ and peter@ for a discussion on thread state that could be used as the timestamp with minimal overhead.	2007-03-09 04:02:38 +00:00
Scott Long	94e6bc303f	Don't increment total_bounced when doing no-op dmamap_sync ops.	2007-03-06 18:28:43 +00:00
John Baldwin	4c5bec1161	Change the x86 interrupt code to use FreeBSD CPU IDs (i.e. PCPU_GET(cpuid)) rather than local APIC IDs to keep track of CPUs which can handle interrupts.	2007-03-06 17:16:47 +00:00
John Baldwin	6cd68eaa7e	Trim trailing whitespace.	2007-03-05 22:27:55 +00:00
Alan Cox	8da3fc95a7	Acquiring smp_ipi_mtx on every call to pmap_invalidate_() is wasteful. For example, during a buildworld more than half of the calls do not generate an IPI because the only TLB entry invalidated is on the calling processor. This revision pushes down the acquisition and release of smp_ipi_mtx into smp_tlb_shootdown() and smp_targeted_tlb_shootdown() and instead uses sched_pin() and sched_unpin() in pmap_invalidate_() so that thread migration doesn't lead to a missed TLB invalidation. Reviewed by: jhb MFC after: 3 weeks	2007-03-05 21:40:10 +00:00
John Baldwin	aa7a005ee0	Use vm_paddr_t rather than uintptr_t when passing the physical address of APICs to lapic_init() and ioapic_create().	2007-03-05 20:35:17 +00:00
John Baldwin	db181dc741	Add a simple device driver to "eat" any I/O APICs that show up as PCI devices. MFC after: 1 week	2007-03-05 16:22:49 +00:00
Bruce Evans	d78180f8f5	Partial fix for a bug in rev.1.231. If suspend/resume clobbers the RTC state, then it may clobber the RTC index register, so the index register must be restored before using it to restore control registers in rtc_restore(). The following problems remain: - rtc_restore() is only called if pmtimer is configured. Buggy suspend/resumes are more likely to clobber the index register than a control register, so pmtimer is more needed than it used to be. - pmtimer doesn't exist for amd64. - Restoring of the RTC state may race with rtcintr(). If an RTC interrupt is handled before the state is restored, then rtcin(RTC_INTR) in rtcintr() may read from the wrong register, so rtcintr() may spin forever. This may be mitigated by the most common state clobbering being to turn off RTC interrupts.	2007-03-05 09:10:17 +00:00
Yoshihiro Takahashi	de038d5ffc	style(9).	2007-03-04 04:55:19 +00:00
Jung-uk Kim	a4e3bad794	MFP4: 115220, 115222 - Fix style(9) and reduce diff between amd64 and i386. - Prefix Linuxulator macros with LINUX_ to prevent future collision.	2007-03-02 00:08:47 +00:00
John Baldwin	4d70511ac3	Use pause() rather than tsleep() on stack variables and function pointers.	2007-02-27 17:23:29 +00:00
Jung-uk Kim	6a5964d385	MFP4: 115094 Linux does not check file descriptor when MAP_ANONYMOUS is set. This should fix recent LTP test regressions. Reported by: Scot Hetzel (swhetzel at gmail dot com) netchild	2007-02-27 02:08:01 +00:00
Alexander Leidinger	802e08a360	Partial MFp4 of 114977: Whitespace commit: Fix grammar, spelling and punctuation. Submitted by: "Scot Hetzel" <swhetzel@gmail.com>	2007-02-24 16:49:25 +00:00
Alexander Leidinger	1a26db0a3a	MFp4 (114193 (i386 part), 114194, 114195, 114200): - Dont "return" in linux_clone() after we forked the new process in a case of problems. - Move the copyout of p2->p_pid outside the emul_lock coverage in linux_clone(). - Cache the em->pdeath_signal in a local variable and move the copyout out of the emul_lock coverage. - Move the free() out of the emul_shared_lock coverage in a preparation to switch emul_lock to non-sleepable lock (mutex). Submitted by: rdivacky	2007-02-23 22:39:26 +00:00
John Baldwin	860d8e2312	Use ih_filter instead of ih_handler in a couple of places. This fixes most INTR_FAST handlers on i386. Reviewed by: piso	2007-02-23 20:03:24 +00:00
Paolo Pisati	ef544f6312	o break newbus api: add a new argument of type driver_filter_t to bus_setup_intr() o add an int return code to all fast handlers o retire INTR_FAST/IH_FAST For more info: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=465712+0+current/freebsd-current Reviewed by: many Approved by: re@	2007-02-23 12:19:07 +00:00
Konstantin Belousov	3c97ab97bf	Unbreak ddb stepping over special frames after the following commit: Revision Changes Path 1.113 +4 -2 src/sys/i386/i386/apic_vector.s 1.117 +7 -1 src/sys/i386/i386/exception.s 1.36 +7 -7 src/sys/i386/i386/local_apic.c 1.298 +61 -63 src/sys/i386/i386/trap.c 1.62 +15 -22 src/sys/i386/i386/vm86.c 1.32 +4 -2 src/sys/i386/i386/vm86bios.s 1.21 +2 -2 src/sys/i386/include/apicvar.h 1.27 +2 -2 src/sys/i386/isa/atpic.c 1.50 +2 -1 src/sys/i386/isa/atpic_vector.s 1.35 +1 -1 src/sys/i386/isa/icu.h Tested by: kris, Peter Holm No objections from: kmacy	2007-02-19 10:57:47 +00:00
Alan Cox	ae0663a383	Eliminate some acquisitions and releases of the page queues lock that are no longer necessary.	2007-02-18 06:33:02 +00:00
John Baldwin	9be403be00	Add bootverbose printfs to indicate which IDT vectors are assigned to MSI interrupts.	2007-02-15 22:22:57 +00:00
Jung-uk Kim	1441cab7b2	Regen.	2007-02-15 00:57:04 +00:00
Jung-uk Kim	10931a467a	MFP4: 113025, 113146, 113177, 113203, 113500, 113546, 113570 - PROT_READ, PROT_WRITE, or PROT_EXEC implies PROT_READ and PROT_EXEC. Linux/ia64's i386 emulation layer does this and it complies with Linux header files. This fixes mmap05 LTP test case on amd64. - Do not adjust stack size when failure has occurred. - Synchronize i386 mmap/mprotect with amd64.	2007-02-15 00:54:40 +00:00
Brooks Davis	983f970981	Include GEOM_LABEL in GENERIC. It's very useful and not well publicized enough. Approved by: pjd	2007-02-09 19:03:18 +00:00
John Baldwin	71ddf30bd2	Don't send interrupts to CPUs disabled via lapic hints. Reported by: Ludger Bolmerg <lbolmerg ! web.de> MFC after: 3 days Pointy hat to: jhb	2007-02-08 16:49:59 +00:00
Marcel Moolenaar	1d3aed33e8	Evolve the ctlreq interface added to geom_gpt into a generic partitioning class that supports multiple schemes. Current schemes supported are APM (Apple Partition Map) and GPT. Change all GEOM_APPLE anf GEOM_GPT options into GEOM_PART_APM and GEOM_PART_GPT (resp). The ctlreq interface supports verbs to create and destroy partitioning schemes on a disk; to add, delete and modify partitions; and to commit or undo changes made.	2007-02-07 18:55:31 +00:00
Bruce Evans	1300fd67f3	Fixed some style bugs. Routine except: - don't use __GNUCLIKE___OFFSETOF, since __offsetof() is a standard FreeBSD implementaion detail which has nothing to do with GNUC.	2007-02-06 18:04:02 +00:00
Bruce Evans	3764a82377	Simplified PCPU_GET() and PCPU_SET(). We must copy through a temporary variable to avoid invalid constraints in dead code. Use an array of u_char's (inside a struct) instead of a char/short/int/long variable so that the variable and its accesses can be spelled in the same way in all cases and code doesn't need to be cloned just to hold the spelling differences. Fixed strict-aliasing errors in PCPU_SET() and in the amd64 PCPU_GET(). Cast to (void ) as in rev.1.37 of the i386 version where the errors were fixed for the i386 PCPU_GET() only. It would be more correct to copy to and from the temp. variable using memcpy(), but then an ifdef tangle would be required to ensure using the builtin memcpy(). We depend on fairly aggressive optimization to put the temp. variable only in a register despite it being copied using (type )(void )&anothertype and could depend on this when using memcpy() too. This seems to work right even for -O0, but the -O0 case has not been completely tested. This change gives identical object code for all object files in LINT on amd64 (except for one file with a __TIME__ stamp). For LINT on i386 it gives unimportant differences in instruction order and padding in a few object files. This was only tested for -O. This change (actually a previous version of it) gives the following reductions in the number of object files in LINT that fail to compile with -O2 but without the -fno-strict-aliasing kludge: - amd64: 29 (down from 211) - i386: 36 (down from 47) gcc-3.4.6 actually allows the invalid constraints that result from not using the temp. variable, at least with -O[1-2], but gcc-3.3.3 crashes on them and I don't want to depend on compiler bugs.	2007-02-06 16:21:09 +00:00
Konstantin Belousov	d0b2365eec	Introduce some more SO_ option equivalents from Linux to FreeBSD. The msg variable in linux_recvmsg() was not initialized. Copy it from userspace. Submitted by: rdivacky	2007-02-01 13:36:19 +00:00
Konstantin Belousov	a9ccaccfc3	Fix LOR that occurs because proctree_lock was acquired while holding emuldata lock by moving the code upwards outside the emul_lock coverage. Submitted by: rdivacky	2007-02-01 13:27:52 +00:00
Robert Watson	b1e5dcf778	As we now have an SFB_NOWAIT flag, change 'will' to 'may' where the comment for sf_buf_alloc(9) talks about sleeping.	2007-01-28 17:39:03 +00:00
Bruno Ducrot	8867dfa953	o introduce a flags 'errata' for HW bugs onto the softc. o remove errata_a0 and introduce the corresponding flags into 'errata'. o introduce a new errata for K8, namely some platform might set the PENDING_BIT but aren't able to unset it, also don't loop forever waiting PENDING_BIT being cleared. o try to introduce a workaround for the PENDING_BIT stuck problem, o support now half multipliers for K8. Tested by: Abdullah Al-Marrie Approved by: njl	2007-01-23 19:20:30 +00:00
Jeff Roberson	f0393f063a	- Remove setrunqueue and replace it with direct calls to sched_add(). setrunqueue() was mostly empty. The few asserts and thread state setting were moved to the individual schedulers. sched_add() was chosen to displace it for naming consistency reasons. - Remove adjustrunqueue, it was 4 lines of code that was ifdef'd to be different on all three schedulers where it was only called in one place each. - Remove the long ifdef'd out remrunqueue code. - Remove the now redundant ts_state. Inspect the thread state directly. - Don't set TSF_* flags from kern_switch.c, we were only doing this to support a feature in one scheduler. - Change sched_choose() to return a thread rather than a td_sched. Also, rely on the schedulers to return the idlethread. This simplifies the logic in choosethread(). Aside from the run queue links kern_switch.c mostly does not care about the contents of td_sched. Discussed with: julian - Move the idle thread loop into the per scheduler area. ULE wants to do something different from the other schedulers. Suggested by: jhb Tested on: x86/amd64 sched_{4BSD, ULE, CORE}.	2007-01-23 08:46:51 +00:00
Jeff Roberson	3c93ca7d2f	- Allow the schedulers to IPI_PREEMPT idlethread. This puts the decision for this behavior on the initiator side.	2007-01-23 08:38:39 +00:00
Bruce Evans	71799af2d5	Cleaned up declaration and initialization of clock_lock. It is only used by clock code, so don't export it to the world for machdep.c to initialize. There is a minor problem initializing it before it is used, since although clock initialization is split up so that parts of it can be done early, the first part was never done early enough to actually work. Split it up a bit more and do the first part as late as possible to document the necessary order. The functions that implement the split are still bogusly exported. Cleaned up initialization of the i8254 clock hardware using the new split. Actually initialize it early enough, and don't work around it not being initialized in DELAY() when DELAY() is called early for initialization of some console drivers. This unfortunately moves a little more code before the early debugger breakpoint so that it is harder to debug. The ordering of console and related initialization is delicate because we want to do as little as possible before the breakpoint, but must initialize a console.	2007-01-23 08:01:20 +00:00
John Baldwin	5fe82bca57	Expand the MSI/MSI-X API to address some deficiencies in the MSI-X support. - First off, device drivers really do need to know if they are allocating MSI or MSI-X messages. MSI requires allocating powerof2() messages for example where MSI-X does not. To address this, split out the MSI-X support from pci_msi_count() and pci_alloc_msi() into new driver-visible functions pci_msix_count() and pci_alloc_msix(). As a result, pci_msi_count() now just returns a count of the max supported MSI messages for the device, and pci_alloc_msi() only tries to allocate MSI messages. To get a count of the max supported MSI-X messages, use pci_msix_count(). To allocate MSI-X messages, use pci_alloc_msix(). pci_release_msi() still handles both MSI and MSI-X messages, however. As a result of this change, drivers using the existing API will only use MSI messages and will no longer try to use MSI-X messages. - Because MSI-X allows for each message to have its own data and address values (and thus does not require all of the messages to have their MD vectors allocated as a group), some devices allow for "sparse" use of MSI-X message slots. For example, if a device supports 8 messages but the OS is only able to allocate 2 messages, the device may make the best use of 2 IRQs if it enables the messages at slots 1 and 4 rather than default of using the first N slots (or indicies) at 1 and 2. To support this, add a new pci_remap_msix() function that a driver may call after a successful pci_alloc_msix() (but before allocating any of the SYS_RES_IRQ resources) to allow the allocated IRQ resources to be assigned to different message indices. For example, from the earlier example, after pci_alloc_msix() returned a value of 2, the driver would call pci_remap_msix() passing in array of integers { 1, 4 } as the new message indices to use. The rid's for the SYS_RES_IRQ resources will always match the message indices. Thus, after the call to pci_remap_msix() the driver would be able to access the first message in slot 1 at SYS_RES_IRQ rid 1, and the second message at slot 4 at SYS_RES_IRQ rid 4. Note that the message slots/indices are 1-based rather than 0-based so that they will always correspond to the rid values (SYS_RES_IRQ rid 0 is reserved for the legacy INTx interrupt). To support this API, a new PCIB_REMAP_MSIX() method was added to the pcib interface to change the message index for a single IRQ. Tested by: scottl	2007-01-22 21:48:44 +00:00
Alexander Leidinger	d071f5048c	MFp4 (113077, 113083, 113103, 113124, 113097): Dont expose em->shared to the outside world before its properly initialized. Might not affect anything but its at least a better coding style. Dont expose em via p->p_emuldata until its properly initialized. This also enables us to get rid of some locking and simplify the code because we are workin on a local copy. In linux_fork and linux_vfork create the process in stopped state to be sure that the new process runs with fully initialized emuldata structure [1]. Also fix the vfork (both in linux_clone and linux_vfork) race that could result in never woken up process [2]. Reported by: Scot Hetzel [1] Suggested by: jhb [2] Reviewed by: jhb (at least some important parts) Submitted by: rdivacky Tested by: Scot Hetzel (on amd64) Change 2 comments (in the new code) to comply to style(9). Suggested by: jhb	2007-01-20 14:58:59 +00:00
Xin LI	f67af5c918	Use FOREACH_PROC_IN_SYSTEM instead of using its unrolled form.	2007-01-17 15:05:52 +00:00
Alexander Leidinger	973ac082f8	MFp4 (112893): Make linux_vfork() actually work. This enables make to work again with 2.6. It also fixes the LTP vfork tests. Submitted by: rdivacky	2007-01-14 16:20:37 +00:00
Kip Macy	6f3a846eeb	exclude the icu and clock lock from LOCK_PROFILING	2007-01-14 02:13:07 +00:00
Warner Losh	fed32d7544	Remove 3rd clause, renumber, ok per email	2007-01-12 07:26:21 +00:00

... 2 3 4 5 6 ...

11231 Commits