freebsd-nq

Author	SHA1	Message	Date
Alexander Leidinger	1a26db0a3a	MFp4 (114193 (i386 part), 114194, 114195, 114200): - Dont "return" in linux_clone() after we forked the new process in a case of problems. - Move the copyout of p2->p_pid outside the emul_lock coverage in linux_clone(). - Cache the em->pdeath_signal in a local variable and move the copyout out of the emul_lock coverage. - Move the free() out of the emul_shared_lock coverage in a preparation to switch emul_lock to non-sleepable lock (mutex). Submitted by: rdivacky	2007-02-23 22:39:26 +00:00
John Baldwin	860d8e2312	Use ih_filter instead of ih_handler in a couple of places. This fixes most INTR_FAST handlers on i386. Reviewed by: piso	2007-02-23 20:03:24 +00:00
Paolo Pisati	ef544f6312	o break newbus api: add a new argument of type driver_filter_t to bus_setup_intr() o add an int return code to all fast handlers o retire INTR_FAST/IH_FAST For more info: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=465712+0+current/freebsd-current Reviewed by: many Approved by: re@	2007-02-23 12:19:07 +00:00
Konstantin Belousov	3c97ab97bf	Unbreak ddb stepping over special frames after the following commit: Revision Changes Path 1.113 +4 -2 src/sys/i386/i386/apic_vector.s 1.117 +7 -1 src/sys/i386/i386/exception.s 1.36 +7 -7 src/sys/i386/i386/local_apic.c 1.298 +61 -63 src/sys/i386/i386/trap.c 1.62 +15 -22 src/sys/i386/i386/vm86.c 1.32 +4 -2 src/sys/i386/i386/vm86bios.s 1.21 +2 -2 src/sys/i386/include/apicvar.h 1.27 +2 -2 src/sys/i386/isa/atpic.c 1.50 +2 -1 src/sys/i386/isa/atpic_vector.s 1.35 +1 -1 src/sys/i386/isa/icu.h Tested by: kris, Peter Holm No objections from: kmacy	2007-02-19 10:57:47 +00:00
Alan Cox	ae0663a383	Eliminate some acquisitions and releases of the page queues lock that are no longer necessary.	2007-02-18 06:33:02 +00:00
John Baldwin	9be403be00	Add bootverbose printfs to indicate which IDT vectors are assigned to MSI interrupts.	2007-02-15 22:22:57 +00:00
Jung-uk Kim	1441cab7b2	Regen.	2007-02-15 00:57:04 +00:00
Jung-uk Kim	10931a467a	MFP4: 113025, 113146, 113177, 113203, 113500, 113546, 113570 - PROT_READ, PROT_WRITE, or PROT_EXEC implies PROT_READ and PROT_EXEC. Linux/ia64's i386 emulation layer does this and it complies with Linux header files. This fixes mmap05 LTP test case on amd64. - Do not adjust stack size when failure has occurred. - Synchronize i386 mmap/mprotect with amd64.	2007-02-15 00:54:40 +00:00
Brooks Davis	983f970981	Include GEOM_LABEL in GENERIC. It's very useful and not well publicized enough. Approved by: pjd	2007-02-09 19:03:18 +00:00
John Baldwin	71ddf30bd2	Don't send interrupts to CPUs disabled via lapic hints. Reported by: Ludger Bolmerg <lbolmerg ! web.de> MFC after: 3 days Pointy hat to: jhb	2007-02-08 16:49:59 +00:00
Marcel Moolenaar	1d3aed33e8	Evolve the ctlreq interface added to geom_gpt into a generic partitioning class that supports multiple schemes. Current schemes supported are APM (Apple Partition Map) and GPT. Change all GEOM_APPLE anf GEOM_GPT options into GEOM_PART_APM and GEOM_PART_GPT (resp). The ctlreq interface supports verbs to create and destroy partitioning schemes on a disk; to add, delete and modify partitions; and to commit or undo changes made.	2007-02-07 18:55:31 +00:00
Bruce Evans	1300fd67f3	Fixed some style bugs. Routine except: - don't use __GNUCLIKE___OFFSETOF, since __offsetof() is a standard FreeBSD implementaion detail which has nothing to do with GNUC.	2007-02-06 18:04:02 +00:00
Bruce Evans	3764a82377	Simplified PCPU_GET() and PCPU_SET(). We must copy through a temporary variable to avoid invalid constraints in dead code. Use an array of u_char's (inside a struct) instead of a char/short/int/long variable so that the variable and its accesses can be spelled in the same way in all cases and code doesn't need to be cloned just to hold the spelling differences. Fixed strict-aliasing errors in PCPU_SET() and in the amd64 PCPU_GET(). Cast to (void ) as in rev.1.37 of the i386 version where the errors were fixed for the i386 PCPU_GET() only. It would be more correct to copy to and from the temp. variable using memcpy(), but then an ifdef tangle would be required to ensure using the builtin memcpy(). We depend on fairly aggressive optimization to put the temp. variable only in a register despite it being copied using (type )(void )&anothertype and could depend on this when using memcpy() too. This seems to work right even for -O0, but the -O0 case has not been completely tested. This change gives identical object code for all object files in LINT on amd64 (except for one file with a __TIME__ stamp). For LINT on i386 it gives unimportant differences in instruction order and padding in a few object files. This was only tested for -O. This change (actually a previous version of it) gives the following reductions in the number of object files in LINT that fail to compile with -O2 but without the -fno-strict-aliasing kludge: - amd64: 29 (down from 211) - i386: 36 (down from 47) gcc-3.4.6 actually allows the invalid constraints that result from not using the temp. variable, at least with -O[1-2], but gcc-3.3.3 crashes on them and I don't want to depend on compiler bugs.	2007-02-06 16:21:09 +00:00
Konstantin Belousov	d0b2365eec	Introduce some more SO_ option equivalents from Linux to FreeBSD. The msg variable in linux_recvmsg() was not initialized. Copy it from userspace. Submitted by: rdivacky	2007-02-01 13:36:19 +00:00
Konstantin Belousov	a9ccaccfc3	Fix LOR that occurs because proctree_lock was acquired while holding emuldata lock by moving the code upwards outside the emul_lock coverage. Submitted by: rdivacky	2007-02-01 13:27:52 +00:00
Robert Watson	b1e5dcf778	As we now have an SFB_NOWAIT flag, change 'will' to 'may' where the comment for sf_buf_alloc(9) talks about sleeping.	2007-01-28 17:39:03 +00:00
Bruno Ducrot	8867dfa953	o introduce a flags 'errata' for HW bugs onto the softc. o remove errata_a0 and introduce the corresponding flags into 'errata'. o introduce a new errata for K8, namely some platform might set the PENDING_BIT but aren't able to unset it, also don't loop forever waiting PENDING_BIT being cleared. o try to introduce a workaround for the PENDING_BIT stuck problem, o support now half multipliers for K8. Tested by: Abdullah Al-Marrie Approved by: njl	2007-01-23 19:20:30 +00:00
Jeff Roberson	f0393f063a	- Remove setrunqueue and replace it with direct calls to sched_add(). setrunqueue() was mostly empty. The few asserts and thread state setting were moved to the individual schedulers. sched_add() was chosen to displace it for naming consistency reasons. - Remove adjustrunqueue, it was 4 lines of code that was ifdef'd to be different on all three schedulers where it was only called in one place each. - Remove the long ifdef'd out remrunqueue code. - Remove the now redundant ts_state. Inspect the thread state directly. - Don't set TSF_* flags from kern_switch.c, we were only doing this to support a feature in one scheduler. - Change sched_choose() to return a thread rather than a td_sched. Also, rely on the schedulers to return the idlethread. This simplifies the logic in choosethread(). Aside from the run queue links kern_switch.c mostly does not care about the contents of td_sched. Discussed with: julian - Move the idle thread loop into the per scheduler area. ULE wants to do something different from the other schedulers. Suggested by: jhb Tested on: x86/amd64 sched_{4BSD, ULE, CORE}.	2007-01-23 08:46:51 +00:00
Jeff Roberson	3c93ca7d2f	- Allow the schedulers to IPI_PREEMPT idlethread. This puts the decision for this behavior on the initiator side.	2007-01-23 08:38:39 +00:00
Bruce Evans	71799af2d5	Cleaned up declaration and initialization of clock_lock. It is only used by clock code, so don't export it to the world for machdep.c to initialize. There is a minor problem initializing it before it is used, since although clock initialization is split up so that parts of it can be done early, the first part was never done early enough to actually work. Split it up a bit more and do the first part as late as possible to document the necessary order. The functions that implement the split are still bogusly exported. Cleaned up initialization of the i8254 clock hardware using the new split. Actually initialize it early enough, and don't work around it not being initialized in DELAY() when DELAY() is called early for initialization of some console drivers. This unfortunately moves a little more code before the early debugger breakpoint so that it is harder to debug. The ordering of console and related initialization is delicate because we want to do as little as possible before the breakpoint, but must initialize a console.	2007-01-23 08:01:20 +00:00
John Baldwin	5fe82bca57	Expand the MSI/MSI-X API to address some deficiencies in the MSI-X support. - First off, device drivers really do need to know if they are allocating MSI or MSI-X messages. MSI requires allocating powerof2() messages for example where MSI-X does not. To address this, split out the MSI-X support from pci_msi_count() and pci_alloc_msi() into new driver-visible functions pci_msix_count() and pci_alloc_msix(). As a result, pci_msi_count() now just returns a count of the max supported MSI messages for the device, and pci_alloc_msi() only tries to allocate MSI messages. To get a count of the max supported MSI-X messages, use pci_msix_count(). To allocate MSI-X messages, use pci_alloc_msix(). pci_release_msi() still handles both MSI and MSI-X messages, however. As a result of this change, drivers using the existing API will only use MSI messages and will no longer try to use MSI-X messages. - Because MSI-X allows for each message to have its own data and address values (and thus does not require all of the messages to have their MD vectors allocated as a group), some devices allow for "sparse" use of MSI-X message slots. For example, if a device supports 8 messages but the OS is only able to allocate 2 messages, the device may make the best use of 2 IRQs if it enables the messages at slots 1 and 4 rather than default of using the first N slots (or indicies) at 1 and 2. To support this, add a new pci_remap_msix() function that a driver may call after a successful pci_alloc_msix() (but before allocating any of the SYS_RES_IRQ resources) to allow the allocated IRQ resources to be assigned to different message indices. For example, from the earlier example, after pci_alloc_msix() returned a value of 2, the driver would call pci_remap_msix() passing in array of integers { 1, 4 } as the new message indices to use. The rid's for the SYS_RES_IRQ resources will always match the message indices. Thus, after the call to pci_remap_msix() the driver would be able to access the first message in slot 1 at SYS_RES_IRQ rid 1, and the second message at slot 4 at SYS_RES_IRQ rid 4. Note that the message slots/indices are 1-based rather than 0-based so that they will always correspond to the rid values (SYS_RES_IRQ rid 0 is reserved for the legacy INTx interrupt). To support this API, a new PCIB_REMAP_MSIX() method was added to the pcib interface to change the message index for a single IRQ. Tested by: scottl	2007-01-22 21:48:44 +00:00
Alexander Leidinger	d071f5048c	MFp4 (113077, 113083, 113103, 113124, 113097): Dont expose em->shared to the outside world before its properly initialized. Might not affect anything but its at least a better coding style. Dont expose em via p->p_emuldata until its properly initialized. This also enables us to get rid of some locking and simplify the code because we are workin on a local copy. In linux_fork and linux_vfork create the process in stopped state to be sure that the new process runs with fully initialized emuldata structure [1]. Also fix the vfork (both in linux_clone and linux_vfork) race that could result in never woken up process [2]. Reported by: Scot Hetzel [1] Suggested by: jhb [2] Reviewed by: jhb (at least some important parts) Submitted by: rdivacky Tested by: Scot Hetzel (on amd64) Change 2 comments (in the new code) to comply to style(9). Suggested by: jhb	2007-01-20 14:58:59 +00:00
Xin LI	f67af5c918	Use FOREACH_PROC_IN_SYSTEM instead of using its unrolled form.	2007-01-17 15:05:52 +00:00
Alexander Leidinger	973ac082f8	MFp4 (112893): Make linux_vfork() actually work. This enables make to work again with 2.6. It also fixes the LTP vfork tests. Submitted by: rdivacky	2007-01-14 16:20:37 +00:00
Kip Macy	6f3a846eeb	exclude the icu and clock lock from LOCK_PROFILING	2007-01-14 02:13:07 +00:00
Warner Losh	fed32d7544	Remove 3rd clause, renumber, ok per email	2007-01-12 07:26:21 +00:00
John Baldwin	cad688f011	Remove magic from rman_activate_resource() that uses the direct map at KERNBASE for the first 1 MB of RAM instead of calling pmap_mapdev(). pmap_mapdev() knows how to handle the first 1 MB (and has known for a while now) and properly maps the memory as UC to boot. MFC after: 2 weeks	2007-01-11 19:40:19 +00:00
Jeff Roberson	b31e373bf7	- Use the correct test in the ipi bitmask handler for IPI_PREEMPT so that we actually issue preemptions. - Remove the #ifdef IPI_PREEMPTION so it is always compiled in. Leave the option which optionally enables support in sched_4bsd. sched_ule.c will soon use this functionality as a run time rather than compile time option. - Compare against the idlethread rather than the priority. There are some idle prio tasks that we can preempt. Discussed with: ups Tested on: i386, amd64	2007-01-11 00:17:02 +00:00
Jung-uk Kim	5efc6c44ff	Add SSSE3 extensions and correct CNXT-ID spelling for Intel processors.	2007-01-09 19:23:22 +00:00
Alexander Leidinger	1c65504ca8	MFp4 (112498): Rename the locking flags to EMUL_DOLOCK and EMUL_DONTLOCK to prevent confusion. Submitted by: rdivacky	2007-01-07 19:00:38 +00:00
Alexander Leidinger	99e9dcf022	regen after addition of linux_utimes and linux_rt_sigtimedwait	2006-12-31 13:20:31 +00:00
Alexander Leidinger	c9447c7551	MFp4 (111746, 108671, 108945, 112352): - add linux utimes syscall [1] - add linux rt_sigtimedwait syscall [2] Submitted by: "Scot Hetzel" <swhetzel@gmail.com> [1] Submitted by: Bruce Becker <hostmaster@whois.gts.net> [2] PR: 93199 [2]	2006-12-31 13:16:00 +00:00
Bruce Evans	0b194ec872	Fix oops in previous commit.	2006-12-29 15:48:18 +00:00
Bruce Evans	f28e1c8f99	Fixed some style bugs (mainly assorted errors in comments, and inconsistent spelling of `result').	2006-12-29 15:29:49 +00:00
Bruce Evans	6c296ffa81	Fixed some style bugs (whitespace only).	2006-12-29 14:28:23 +00:00
Bruce Evans	7e4277e591	Try harder to garbage-collect the "LOCORE" (really asm) version of MPLOCKED. The cleaning in rev.1.25 was supposed to have been undone by rev.1.26, but 1.26 could never have actually affected asm files since atomic.h is full of C declarations so including it in asm files would just give syntax errors. The asm MPLOCKED is even less needed than when misplaced definitions of it were first removed, and is now unused in any asm file in the src tree except in anachronismns in sys/i386/i386/support.s.	2006-12-29 13:36:26 +00:00
Bruce Evans	26ab2d1d23	Avoid an instruction in atomic_cmpset_{int_long)() in most cases. These functions are used a lot for mutexes, so this reduces the text size of an average kernel by about 0.75%. This wasn't intended to be a significant optimization, but it somehow increased the maximum number of packets per second that can be transmitted by my bge hardware from 320000 to 460000 (this benchmark is CPU-bound and remarkably sensitive to changes in the text section). Details: we would prefer to leave the result of the cmpxchg in %al, but cannot tell gcc that it is there, so we have to convert it to an integer register. We converted to %al, then to %[re]ax, but the latter step is usually wasted since gcc usually only wants the condition code and can recover it from %al just as easily as from %[re]ax. Let gcc promote %al in the few cases where this is needed. Nearby style fixes; - let gcc manage the load of `res', and don't abuse `res' for a copy of `exp' - don't echo `res's name in comments - consistently spell the condition code as 'e' after comparison for equality - don't hard-code %al anywhere except in constraints - for the version that doesn't use cmpxchg, there is no requirement to use %al anywhere, so don't hard-code it in the constraints either. Style non-fix: - for the versions that use cmpxchg, keep using "a" (was %[re]ax, now %al) for the main output operand, although this is not required. The input and output operands that use the "a" constraint are now decoupled, and this makes things clearer except for the reason that the output register is hard-coded. It is now just a hack to tell gcc that the input "a" has been clobbered without increasing the number of operands.	2006-12-27 20:26:00 +00:00
Jung-uk Kim	5e448826b7	Regen (just to fix 'generated from' line from the previous commit).	2006-12-20 20:42:58 +00:00
Jung-uk Kim	8187e7d7ad	Add linux_nanosleep() and regen.	2006-12-20 20:21:48 +00:00
Jung-uk Kim	77424f4177	MFP4: 109655 - Move linux_nanosleep() from src/sys/amd64/linux32/linux32_machdep.c to src/sys/compat/linux/linux_time.c. - Validate timespec ranges before use as Linux kernel does. - Fix l_timespec structure. - Clean up style(9) nits.	2006-12-20 20:17:35 +00:00
David Xu	4e32b7b3cc	Add a lwpid field into per-cpu structure, the lwpid represents current running thread's id on each cpu. This allow us to add in-kernel adaptive spin for user level mutex. While spinning in user space is possible, without correct thread running state exported from kernel, it hardly can be implemented efficiently without wasting cpu cycles, however exporting thread running state unlikely will be implemented soon as it has to design and stablize interfaces. This implementation is transparent to user space, it can be disabled dynamically. With this change, mutex ping-pong program's performance is improved massively on SMP machine. performance of mysql super-smack select benchmark is increased about 7% on Intel dual dual-core2 Xeon machine, it indicates on systems which have bunch of cpus and system-call overhead is low (athlon64, opteron, and core-2 are known to be fast), the adaptive spin does help performance. Added sysctls: kern.threads.umtx_dflt_spins if the sysctl value is non-zero, a zero umutex.m_spincount will cause the sysctl value to be used a spin cycle count. kern.threads.umtx_max_spins the sysctl sets upper limit of spin cycle count. Tested on: Athlon64 X2 3800+, Dual Xeon 5130	2006-12-20 04:40:39 +00:00
Kip Macy	a5c5d4402c	Evidently FreeBSD has long relied on the compiler to treat structures passed by value (trap frames) as if they were in fact being passed by reference. For better or worse, this incorrect behaviour is no longer present in gcc 4.1. In this patch I convert all trapframe arguments to be explicitly pass by reference. I also remove vm86_initflags, pushing the very little work that it actually does up into vm86_prepcall. Reviewed by: kan Tested by: kan	2006-12-17 05:07:01 +00:00
Kip Macy	2c1709c67b	vm86_initflags was causing gcc41 and even gcc346 to get rather confused - de-obfuscate Suggested by: kan Reviewed by: kan Tested by: kan	2006-12-17 03:17:46 +00:00
Nick Hibma	9079fff550	Align the interfaces for the various watchdogs and make the interface behave as expected. Also: - Return an error if WD_PASSIVE is passed in to the ioctl as only WD_ACTIVE is implemented at the moment. See sys/watchdog.h for an explanation of the difference between WD_ACTIVE and WD_PASSIVE. - Remove the I_HAVE_TOTALLY_LOST_MY_SENSE_OF_HUMOR define. If you've lost your sense of humor, than don't add a define. Specific changes: i80321_wdog.c Don't roll your own passive watchdog tickle as this would defeat the purpose of an active (userland) watchdog tickle. ichwd.c / ipmi.c: WD_ACTIVE means active patting of the watchdog by a userland process, not whether the watchdog is active. See sys/watchdog.h. kern_clock.c: (software watchdog) Remove a check for WD_ACTIVE as this does not make sense here. This reverts r1.181.	2006-12-15 21:44:49 +00:00
Pyun YongHyeon	1f90cf9895	Add msk(4) to the list of drivers supported by GENERIC kernel.	2006-12-13 03:41:47 +00:00
John Baldwin	8964299ac8	Give Host-PCI bridge drivers their own pcib_alloc_msi() and pcib_alloc_msix() methods instead of using the method from the generic PCI-PCI bridge driver as the PCI-PCI methods will be gaining some PCI-PCI specific logic soon.	2006-12-12 19:27:01 +00:00
John Baldwin	fde45e231a	Sort function prototypes.	2006-12-12 19:24:45 +00:00
John Baldwin	d748ef4792	Replace a few magic numbers.	2006-12-12 19:23:52 +00:00
John Baldwin	c304531851	Add a function to return the MD interrupt source cookie associated with an interrupt event. Use this in the x86 code to fixup the intrcnt names when an interrupt handler is removed.	2006-12-12 19:20:19 +00:00
Maxim Sobolev	efa43a53bd	Allow machdep.cpu_idle_hlt to be set from the loader. This should allow to workaround the problem with SMP kernels on Turion64 X2 processors described in kern/104678 and may be useful in other situations too. MFC after: 3 days	2006-12-06 18:27:17 +00:00
Julian Elischer	ad1e7d285a	Threading cleanup.. part 2 of several. Make part of John Birrell's KSE patch permanent.. Specifically, remove: Any reference of the ksegrp structure. This feature was never fully utilised and made things overly complicated. All code in the scheduler that tried to make threaded programs fair to unthreaded programs. Libpthread processes will already do this to some extent and libthr processes already disable it. Also: Since this makes such a big change to the scheduler(s), take the opportunity to rename some structures and elements that had to be moved anyhow. This makes the code a lot more readable. The ULE scheduler compiles again but I have no idea if it works. The 4bsd scheduler still reqires a little cleaning and some functions that now do ALMOST nothing will go away, but I thought I'd do that as a separate commit. Tested by David Xu, and Dan Eischen using libthr and libpthread.	2006-12-06 06:34:57 +00:00
Bruce Evans	b73057227b	Optimized RTC accesses by avoiding null writes to the index register and by only delaying when an RTC register is written to. The delay after writing to the data register is now not just a workaround. This reduces the number of ISA accesses in the usual case from 4 to 1. The usual case is 2 rtcin()'s for each RTC interrupt. The index register is almost always RTC_INTR for this. The 3 extra ISA accesses were 1 for writing the index and 2 for delays. Some delays are needed in theory, but in practice they now just slow down slow accesses some more since almost eveyone including us does them wrong so modern systems enforce sufficient delays in hardware. I used to have the delays ifdefed out, but with the index register optimization the delays are rarely executed so the old magic ones can be kept or even implemented non- magically without significant cost. Optimizing RTC interrupt handling is more interesting than it used to be because RTC interrupts are currently needed to fix the more efficient apic timer interrupts on some systems. apic_timer_hz is normally 2000 so the RTC interrupt rate needs to be 2048 to keep the apic timer firing on such systems. Without these changes, each RTC interrupt normally took 10 ISA accesses (2 PIC accesses and 2 sets of 4 RTC accesses). Each ISA access takes 1-1.5uS so 10 of then at 2048 Hz takes 2-3% of a CPU. Now 4 of them take 0.8-1.2% of a CPU.	2006-12-03 03:49:28 +00:00
John Birrell	e0b651251d	Turn console printf buffering into a kernel option and only on by default for sun4v where it is absolutely required. This change moves the buffer from struct pcpu to the stack to avoid using the critical section which created a LOR in a couple of cases due to interaction with the tty code and kqueue. The LOR can't be fixed with the critical section and the pcpu buffer can't be used without the critical section. Putting the buffer on the stack was my initial solution, but it was pointed out that the stress on the stack might cause problems depending on the call path. We don't have a way of creating tests for those possible cases, so it's best to leave this as an option for the time being. In time we may get enough data to enable this option more generally.	2006-11-30 04:17:05 +00:00
Ruslan Ermilov	ca0fa71fde	Tweak the comment about mapping a kernel using large pages.	2006-11-25 23:00:46 +00:00
Alan Cox	da44960498	The global variable avail_end is redundant and only used once. Eliminate it. Make avail_start static to the pmap on amd64. (It no longer exists on other architectures.)	2006-11-19 20:54:58 +00:00
John Baldwin	7693afca4e	- Add macro constants for the various fields in %dr7 and use them in place of various scattered magic values. - Pretty print the address of hardware watchpoints in 'show watch' rather than just displaying hex. - Expand address field width on amd64 for 64-bit pointers.	2006-11-17 19:20:32 +00:00
John Baldwin	5527d3ed75	Trim some noise from bootverbose: - Drop the printf in intr_machdep.c when we assign an interrupt souce to a CPU. Each source already has a more detailed printf. - Don't output a line for each ioapic pin showing its initial state, this has outlived its usefulness. - When an APIC enumerator sets the bus, polarity, or trigger mode of an ioapic pin, just return success without printing anything if the new value matches the current one. MFC after: 2 weeks	2006-11-17 16:41:03 +00:00
John Baldwin	5d346a567c	A few more style fixes.	2006-11-17 16:37:35 +00:00
Maxim Konovalov	79ba24ca87	o Make pv_maxchunks no less than maxproc. This helps to survive a forkbomb explosion. Reviewed by: alc Security: local DoS X-MFC atfer: RELENG_6 is not affected due to a different pv_entry allocation code.	2006-11-16 11:46:24 +00:00
John Baldwin	71f4007710	Various whitespace and style fixes.	2006-11-15 19:53:48 +00:00
John Baldwin	15f266289d	Fix a typo that broke MSI (MSI-X worked fine) in the later revisions of the MSI patches.	2006-11-15 18:40:00 +00:00
John Baldwin	4184900911	MD support for PCI Message Signalled Interrupts on amd64 and i386: - Add a new apic_alloc_vectors() method to the local APIC support code to allocate N contiguous IDT vectors (aligned on a M >= N boundary). This function is used to allocate IDT vectors for a group of MSI messages. - Add MSI and MSI-X PICs. The PIC code here provides methods to manage edge-triggered MSI messages as x86 interrupt sources. In addition to the PIC methods, msi.c also includes methods to allocate and release MSI and MSI-X messages. For x86, we allow for up to 128 different MSI IRQs starting at IRQ 256 (IRQs 0-15 are reserved for ISA IRQs, 16-254 for APIC PCI IRQs, and IRQ 255 is reserved). - Add pcib_(alloc\|release)_msi[x]() methods to the MD x86 PCI bridge drivers to bubble the request up to the nexus driver. - Add pcib_(alloc\|release)_msi[x]() methods to the x86 nexus drivers that ask the MSI PIC code to allocate resources and IDT vectors. MFC after: 2 months	2006-11-13 22:23:34 +00:00
Ruslan Ermilov	d77f5882e7	Fix NKPT comments to match reality. Note that the current value of NKPT is no longer enough to run amd64 with 16G of RAM, as it doesn't have space for mapping a kernel (16M kernel would require additionally 8 page tables).	2006-11-13 20:33:54 +00:00
Ruslan Ermilov	26af9ac7d0	Fix a comment.	2006-11-13 06:26:57 +00:00
Alan Cox	44b8bd66f9	Make pmap_enter() responsible for setting PG_WRITEABLE instead of its caller. (As a beneficial side-effect, a high-contention acquisition of the page queues lock in vm_fault() is eliminated.)	2006-11-12 21:48:34 +00:00
Tom Rhodes	6aeb05d7be	Merge posix4/* into normal kernel hierarchy. Reviewed by: glanced at by jhb Approved by: silence on -arch@ and -standards@	2006-11-11 16:26:58 +00:00
John Baldwin	fdaac72fcd	Don't dump the $PIR table under bootverbose. The pirtool program in src/tools/tools works fine, and dumping this table can add a lot of noise. MFC after: 1 week	2006-11-09 18:03:36 +00:00
Ruslan Ermilov	7eae4829bf	Spelling.	2006-11-07 21:57:18 +00:00
John Baldwin	203886d93c	Remove old XXX comment about possibly adding a print_Intel_info() function to dump CPUID level=2 stuff. A print_INTEL_info() function that does just that was added a while ago.	2006-11-07 18:48:18 +00:00
John Baldwin	3900a3be21	Remove duplicate IDTVEC macro definition, it's already defined in <machine/intr_machdep.h>.	2006-11-07 18:46:33 +00:00
Robert Watson	acd3428b7d	Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>	2006-11-06 13:42:10 +00:00
John Birrell	8391a99bf7	Remove the KDTRACE option again because of the complaints about having it as a default. For the record, the KDTRACE option caused _no_ additional source files to be compiled in; certainly no CDDL source files. All it did was to allow existing BSD licensed kernel files to include one or more CDDL header files. By removing this from DEFAULTS, the onus is on a kernel builder to add the option to the kernel config, possibly by including GENERIC and customising from there. It means that DTrace won't be a feature available in FreeBSD by default, which is the way I intended it to be. Without this option, you can't load the dtrace module (which contains the dtrace device and the DTrace framework). This is equivalent to requiring an option in a kernel config before you can load the linux emulation module, for example. I think it is a mistake to have DTrace ported to FreeBSD, but not to have it available to everyone, all the time. The only exception to this is the companies which distribute systems with FreeBSD embedded. Those companies will customise their systems anyway. The KDTRACE option was intended for them, and only them.	2006-11-04 23:50:12 +00:00
John Birrell	1f80cd9398	Build in kernel support for loading DTrace modules by default. This adds the hooks that DTrace modules register with, and adds a few functions which have the dtrace_ prefix to allow the DTrace FBT (function boundary trace) provider to avoid tracing because they are called from the DTtrace probe context. Unlike other forms of tracing and debug, DTrace support in the kernel incurs negligible run-time cost. I think the only reason why anyone wouldn't want to have kernel support enabled for DTrace would be due to the license (CDDL) under which DTrace is released.	2006-11-04 04:58:10 +00:00
John Birrell	3d068827c2	Add a cnputs() function to write a string to the console with a lock to prevent interspersed strings written from different CPUs at the same time. To avoid putting a buffer on the stack or having to malloc one, space is incorporated in the per-cpu structure. The buffer size if 128 bytes; chosen because it's the next power of 2 size up from 80 characters. String writes to the console are buffered up the end of the line or until the buffer fills. Then the buffer is flushed to all console devices. Existing low level console output via cnputc() is unaffected by this change. ithread calls to log() are also unaffected to avoid blocking those threads. A minor change to the behaviour in a panic situation is that console output will still be buffered, but won't be written to a tty as before. This should prevent interspersed panic output as a number of CPUs panic before we end up single threaded running ddb. Reviewed by: scottl, jhb MFC after: 2 weeks	2006-11-01 04:54:51 +00:00
Takanori Watanabe	0967107190	Fix Typo. Pointed out by: ru	2006-10-31 07:22:24 +00:00
Takanori Watanabe	1cc5605910	Add conf file entries for acpi_aiboost drivers.	2006-10-30 05:51:54 +00:00
Alexander Leidinger	96ed72ac81	regen after linux_io_* backout	2006-10-29 14:12:44 +00:00
Alexander Leidinger	3680a41902	Backout the linux aio stuff. Several problems where identified and the dynamic nature (if no native aio code is available, the linux part returns ENOSYS because of missing requisites) should be solved differently than it is. All this will be done in P4. Not included in this commit is a backout of the changes to the native aio code (removing static in some places). Those changes (and some more) will also be needed when the reworked linux aio stuff will reenter the tree. Requested by: rwatson Discussed with: rwatson	2006-10-29 14:02:39 +00:00
Bruce Evans	6a70163fcc	Removed some SMP ifdefs so that using the TSC as a cputime clock is not completely decided at config time. Just don't default to using the TSC if there are multiple active CPUs. Also, don't default to using the TSC if it is broken. SMP ifdefs are still used to disallow using perfmon since perfmon is always broken if SMP is just configured. This only helps much for SMP kernels running on 1 CPU. The overheads for using the i8254 cputime clock were a bit too high on 486/33's, and now on multi-GHz CPUs they are usually in the 99-99.9% range. Switching from the old default of an i8254 clock to the TSC works poorly because the overheads are not recalibrated. Use the same condition for declaring perfmon stuff as for using it.	2006-10-29 09:48:44 +00:00
Alexander Leidinger	c1ea90bfd3	regen (prctl addition)	2006-10-28 11:24:38 +00:00
Bruce Evans	43f0ea0a27	i386/include/profile.h: Fixed a syntax error for the (!__KERNEL && !__GNUCLIKE_ASM) case in rev.1.36. Apparently, this case has never been reached even by lint. Submitted by: stefanf {amd64,i386}/include/profile.h: In case the above case is actually reached, break it properly by providing null support that will fail at link time instead of a stub that gives wrong (null) profiling at runtime.	2006-10-28 11:03:03 +00:00
Alexander Leidinger	955d762aca	MFP4: Implement prctl(). Submitted by: rdivacky Tested with: LTP	2006-10-28 10:59:59 +00:00
Bruce Evans	853b92dacf	In MCOUNT_OVERHEAD(label), actually use the `label' parameter. We were still using the global label named "profil", and this worked accidentally because all callers use the same name.	2006-10-28 07:59:11 +00:00
Bruce Evans	3a110062fd	Cleaned up includes. <machine/profile.h> was unused. <machine/timerreg.h> was only used in the GUPROF case, so the messes to get its i386 prerequisites included shouldn't have been needed. Fixed some style bugs. Quote #error contents, and don't repeat an #error directive on amd64.	2006-10-28 06:38:51 +00:00
Bruce Evans	94450a83e8	Removed all traces of HIDENAME() in amd64 and i386 kernel code. Using this used to be slightly cleaner than using ifdefs in a few places to support both a.out and elf, but using it now just causes messes and unportabilities. It seems to be impossible to implement the elf HIDENAME() portably in cpp (since token pasting of "." and <name> is invalid). */prof_machdep.c: - Removed all uses of CNAME(). CNAME() is easy enough to use in pure asm code, but using it in inline asm requires messy quoting. The core pure asm code has been hacked on more and all uses of CNAME() in it have already gone away. Just assume the elf convention here too. - Removed now-uneeded include of <machine/asmacros.h>. - Removed the workaround for a namespace conflict with this include.	2006-10-28 06:04:29 +00:00
Bruce Evans	447647908c	Don't call mexitcount or provide a stub mexitcount to call when profiling is configured but high resolution profiling is not configured. Only functions in *.[Ss] called the stub, so efficiency was not significantly affected.	2006-10-27 14:17:50 +00:00
John Birrell	3750d1ecad	Remove the KSE option now that it's in DEFAULTS on these arches/machines. The 'nooption' kernel config entry has to be used to turn KSE off now. This isn't my preferred way of dealing with this, but I'll defer to scottl's experience with the io/mem kernel option change and the grief experienced over that. Submitted by: scottl@	2006-10-26 22:11:35 +00:00
John Birrell	013d6d8cb4	Add 'options KSE' to the kernel config DEFAULTS on all arches/machines except sun4v. This change makes the transition from a default to an option more transparent and is an attempt to head off all the compliants that are likely from people who don't read UPDATING, based on experience with the io/mem change. Submitted by: scottl@	2006-10-26 22:05:25 +00:00
John Birrell	8460a577a4	Make KSE a kernel option, turned on by default in all GENERIC kernel configs except sun4v (which doesn't process signals properly with KSE). Reviewed by: davidxu@	2006-10-26 21:42:22 +00:00
Ruslan Ermilov	837f167eb2	Move "device splash" back to MI NOTES and "files", it's MI.	2006-10-23 13:23:14 +00:00
Robert Watson	aed5570872	Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead. This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd. Obtained from: TrustedBSD Project Sponsored by: SPARTA	2006-10-22 11:52:19 +00:00
Alan Cox	43200cd3ed	Eliminate unnecessary PG_BUSY tests.	2006-10-22 04:18:01 +00:00
Alexander Leidinger	0f0549587b	Fix a recent regression regarding valid signals. Submitted by: rdivacky	2006-10-20 10:09:40 +00:00
Dag-Erling Smørgrav	c43ac89acc	Move more MD devices and options out of MI NOTES.	2006-10-20 09:52:27 +00:00
Bruce Evans	045f738b58	Don't show debug registers in "show registers". Special registers should be displayed specially, and debug registers are among of the least interesting special registers (far behind %cr3). The debug registers are still accessible as variables and displayed in another bogus place ("show watches").	2006-10-20 09:44:21 +00:00
Dag-Erling Smørgrav	c276283866	The VGA_DEBUG option only exists on {amd64,i386,ia64}. Also remove 'device io' from amd64 NOTES; DEFAULTS takes care of it.	2006-10-20 08:56:26 +00:00
Ruslan Ermilov	034f5f8e72	Add missing acpi_wakecode.o: assym.s dependency, so that if assym.s is newer than acpi_wakecode.h, the latter is rebuilt. Reported by: bde	2006-10-19 05:55:09 +00:00
Warner Losh	e54ad0a189	Remove references to pccard.conf	2006-10-19 05:17:55 +00:00
David Xu	5f641fc0fb	o Add keyword volatile for user mutex owner field. o Fix type consistent problem by using type long for old umtx and wait channel. o Rename casuptr to casuword.	2006-10-17 02:24:47 +00:00
Alexander Leidinger	95f2da66d3	regen (linux AIO stuff)	2006-10-15 14:24:10 +00:00
Alexander Leidinger	6a1162d4cd	MFP4 (with some minor changes): Implement the linux_io_* syscalls (AIO). They are only enabled if the native AIO code is available (either compiled in to the kernel or as a module) at the time the functions are used. If the AIO stuff is not available there will be a ENOSYS. From the submitter: ---snip--- DESIGN NOTES: 1. Linux permits a process to own multiple AIO queues (distinguished by "context"), but FreeBSD creates only one single AIO queue per process. My code maintains a request queue (STAILQ of queue(3)) per "context", and throws all AIO requests of all contexts owned by a process into the single FreeBSD per-process AIO queue. When the process calls io_destroy(2), io_getevents(2), io_submit(2) and io_cancel(2), my code can pick out requests owned by the specified context from the single FreeBSD per-process AIO queue according to the per-context request queues maintained by my code. 2. The request queue maintained by my code stores contrast information between Linux IO control blocks (struct linux_iocb) and FreeBSD IO control blocks (struct aiocb). FreeBSD IO control block actually exists in userland memory space, required by FreeBSD native aio_XXXXXX(2). 3. It is quite troubling that the function io_getevents() of libaio-0.3.105 needs to use Linux-specific "struct aio_ring", which is a partial mirror of context in user space. I would rather take the address of context in kernel as the context ID, but the io_getevents() of libaio forces me to take the address of the "ring" in user space as the context ID. To my surprise, one comment line in the file "io_getevents.c" of libaio-0.3.105 reads: Ben will hate me for this REFERENCE: 1. Linux kernel source code: http://www.kernel.org/pub/linux/kernel/v2.6/ (include/linux/aio_abi.h, fs/aio.c) 2. Linux manual pages: http://www.kernel.org/pub/linux/docs/manpages/ (io_setup(2), io_destroy(2), io_getevents(2), io_submit(2), io_cancel(2)) 3. Linux Scalability Effort: http://lse.sourceforge.net/io/aio.html The design notes: http://lse.sourceforge.net/io/aionotes.txt 4. The package libaio, both source and binary: http://rpmfind.net/linux/rpm2html/search.php?query=libaio Simple transparent interface to Linux AIO system calls. 5. Libaio-oracle: http://oss.oracle.com/projects/libaio-oracle/ POSIX AIO implementation based on Linux AIO system calls (depending on libaio). ---snip--- Submitted by: Li, Xiao <intron@intron.ac>	2006-10-15 14:22:14 +00:00
Alexander Leidinger	0a62e03542	MFP4 (106538 + 106541): Implement CLONE_VFORK. This fixes the clone05 LTP test. Submitted by: rdivacky	2006-10-15 13:39:40 +00:00
Alexander Leidinger	2482245b0c	Revert my previous commit, I mismerged this to the wrong place. Pointy hat to: netchild	2006-10-15 13:30:45 +00:00
Alexander Leidinger	21aed094a9	MFP4 (106541): Fix the clone05 test in the LTP. Submitted by: rdivacky	2006-10-15 13:25:23 +00:00
Alexander Leidinger	4b3583a354	MFP4 (107144[1]): Implement CLONE_FS on i386[1] and amd64. Submitted by: rdivacky [1]	2006-10-15 13:22:14 +00:00
Alexander Leidinger	687c23be1d	MFP4 (107868 - 107870): Use a macro to test for a valid signal instead of doing it my hand everywhere. Submitted by: rdivacky	2006-10-15 12:51:43 +00:00
John Baldwin	520ffff83e	Change the x86 interrupt code to suspend/resume interrupt controllers (PICs) rather than interrupt sources. This allows interrupt controllers with no interrupt pics (such as the 8259As when APIC is in use) to participate in suspend/resume. - Always register the 8259A PICs even if we don't use any of their pins. - Explicitly reset the 8259As on resume on amd64 if 'device atpic' isn't included. - Add a "dummy" PIC for the local APIC on the BSP to reset the local APIC on resume. This gets suspend/resume working with APIC on UP systems. SMP still needs more work to bring the APs back to life. The MFC after is tentative. Tested by: anholt (i386) Submitted by: Andrea Bittau <a.bittau at cs.ucl.ac.uk> (3) MFC after: 1 week	2006-10-10 23:23:12 +00:00
John Baldwin	6e20fe33ba	Oops, fix sign bug in #ifdef for value of INTRCNT_COUNT. PR: kern/99870 Submitted by: jkim MFC after: 3 days	2006-10-10 19:26:35 +00:00
Simon L. B. Nielsen	4517aab293	- Remove SCHED_ULE from GENERIC to better avoid foot-shooting by unsuspecting users. - Add a comment in NOTES about experimental status of SCHED_ULE. - Make warning about experimental status in sched_ule(4) a bit stronger. Suggested and reviewed by: dougb Discussed on: developers MFC after: 3 days	2006-10-05 20:31:58 +00:00
John Birrell	6825d60738	PR: Submitted by: Reviewed by: Approved by: Obtained from: MFC after: Security: Move the relocation definitions to the common elf header so that DTrace can use them on one architecture targeted to a different one. Add the additional ELF types defines in Sun's "Linker and Libraries" manual.	2006-10-04 21:37:10 +00:00
Poul-Henning Kamp	e4c9547050	Use calendaric calculation support from subr_clock.c instead of home-rolled. Eventually, this RTC should probably use subr_rtc.c as well	2006-10-02 16:18:40 +00:00
Poul-Henning Kamp	b69f71eb29	Second part of a little cleanup in the calendar/timezone/RTC handling. Split subr_clock.c in two parts (by repo-copy): subr_clock.c contains generic RTC and calendaric stuff. etc. subr_rtc.c contains the newbus'ified RTC interface. Centralize the machdep.{adjkerntz,disable_rtc_set,wall_cmos_clock} sysctls and associated variables into subr_clock.c. They are not machine dependent and we have generic code that relies on being present so they are not even optional.	2006-10-02 15:42:02 +00:00
Poul-Henning Kamp	f645b0b51c	First part of a little cleanup in the calendar/timezone/RTC handling. Move relevant variables to <sys/clock.h> and fix #includes as necessary. Use libkern's much more time- & spamce-efficient BCD routines.	2006-10-02 12:59:59 +00:00
Poul-Henning Kamp	c29ba5fe6e	Remove the no longer relevant or correct bootinfo sysctls.	2006-09-30 10:08:09 +00:00
Maxim Sobolev	2c473eaf67	Extend comment explaining why code is conditional at !defined(SCHED_ULE). Suggested by: ru	2006-09-27 22:09:35 +00:00
Maxim Sobolev	6e93c19e3d	Since ULE doesn't honor hlt_cpus_mask don't compile code that prevents timer interrupt servicing for disabled HTT cores in ULE case. Should be probably fixed in ULE code instead, but we have no real maintainer for ULE to do it. PR: 103697	2006-09-27 18:51:19 +00:00
Scott Long	31e2a87d4d	The need to run a filter also implies that bouncing could be possible, so just use the COULD_BOUNCE flag for both and retire the USE_FILTER flag. This fixes the problem that rev 1.81 introduced with the if_bfe driver (and possibly others).	2006-09-26 23:14:42 +00:00
Ruslan Ermilov	6c9fdda750	Added COMPAT_FREEBSD6 option.	2006-09-26 12:36:34 +00:00
Warner Losh	2dca50b6a1	Add a newline to the printf.	2006-09-24 19:24:26 +00:00
John Baldwin	d72a078647	Update the ipmi(4) driver: - Split out the communication protocols into their own files and use a couple of function pointers in the softc that the commuication protocols setup in their own attach routine. - Add support for the SSIF interface (talking to IPMI over SMBus). - Add an ACPI attachment. - Add a PCI attachment that attaches to devices with the IPMI interface subclass. - Split the ISA attachment out into its own file: ipmi_isa.c. - Change the code to probe the SMBIOS table for an IPMI entry to just use pmap_mapbios() to map the table in rather than trying to setup a fake resource on an isa device and then activating the resource to map in the table. - Make bus attachments leaner by adding attach functions for each communication interface (ipmi_kcs_attach(), ipmi_smic_attach(), etc.) that setup per-interface data. - Formalize the model used by the driver to handle requests by adding an explicit struct ipmi_request object that holds the state of a given request and reply for the entire lifetime of the request. By bundling the request into an object, it is easier to add retry logic to the various communication backends (as well as eventually support BT mode which uses a slightly different message format than KCS, SMIC, and SSIF). - Add a per-softc lock and remove D_NEEDGIANT as the driver is now MPSAFE. - Add 32-bit compatibility ioctl shims so you can use a 32-bit ipmitool on FreeBSD/amd64. - Add ipmi(4) to i386 and amd64 NOTES. Submitted by: ambrisko (large portions of 2 and 3) Sponsored by: IronPort Systems, Yahoo! MFC after: 6 days	2006-09-22 22:11:29 +00:00
Robert Watson	827f0e85a6	Regenerate.	2006-09-21 16:20:38 +00:00
Robert Watson	e6f188152c	Use AUE_CREAT instead of AUE_O_CREAT for linux_creat(). Obtained from: TrustedBSD Project	2006-09-21 16:18:33 +00:00
Robert Watson	753a5e888c	Regenerate.	2006-09-21 16:13:16 +00:00
Robert Watson	b5ca51459a	Use AUE_GETDIRENTRIES instead of AUE_O_GETDENTS and AUE_NULL for a number of directory reading system calls. Respell a mis-spelled event name. Clean up white space/line wraps in a couple of places. Assign event numbers to some new system call entries that have turned up in the list since audit support was added. Obtained from: TrustedBSD Project	2006-09-21 16:12:58 +00:00
Alexander Kabaev	d9cb97ff9d	Use __builtin_va_start instead of __builtin_stdarg_start. GCC4 obsoletes the former and __builtin_va_start was present in all GCC version 3.1 and later.	2006-09-21 01:37:02 +00:00
Alexander Leidinger	6dc4e81071	style(9) While I'm here add a MFC reminder, I forgot it in the previous commit. Noticed by: ssouhlal MFC after: 1 week	2006-09-20 19:27:11 +00:00
Alexander Leidinger	a312f6a30a	Bring the i386 linux mmap code more into line with how linux (2.4.x) behaves. This fixes a lot of test which failed before. For amd64 there are still some problems, but without any testers which apply patches and run some predefines tests we can't do more ATM. Submitted by: Marcin Cieslak <saper@SYSTEM.PL> (minor fixups by myself) Tested with: LTP	2006-09-20 17:24:20 +00:00
Wojciech A. Koszek	6a535c2e4a	Fix 'interrupt interrupt' -> 'interrupt' in the comment. Approved by: cognet (mentor)	2006-09-20 12:23:33 +00:00
Scott Long	adab0fdc4f	Remove duplicated code. Declare functions non-static that shouldn't be inlined.	2006-09-13 09:35:59 +00:00
John-Mark Gurney	c14c65ed52	document that PAE kernels needs twice the value of non-PAE kernels for KVA_PAGES, and that it it likely needed for >4GB memory boxes.. MFC after: 3 days	2006-09-13 01:23:08 +00:00
John Baldwin	884ff1813f	Add a new ddb command 'show lapic' to dump details about the local APIC registers for the current CPU. MFC after: 3 days	2006-09-11 20:12:42 +00:00
John Baldwin	5c15c7e71d	Actually hook up the IPI_INVLCACHE IDT vectors backing pmap_invalidate_cache() in the SMP case so pmap_mapdev() in multiuser doesn't panic with a trap 30. I broke this many months ago when I added pmap_invalidate_cache() as early parts of the PAT work. Patience from: jmg Pointy hat: jhb	2006-09-11 20:10:42 +00:00
John Baldwin	9914a8cc7d	- Fix rman_manage_region() to be a lot more intelligent. It now checks for overlaps, but more importantly, it collapses adjacent free regions. This is needed to cope with BIOSen that split up ports for system devices (like IPMI controllers) across multiple system resource entries. - Now that rman_manage_region() is not so dumb, remove extra logic in the x86 nexus drivers to populate the IRQ rman that manually coalesced the regions. MFC after: 1 week	2006-09-11 19:31:52 +00:00
Scott Long	88591e04af	The run_filter() procedure is a means of working around DMA engine bugs in old/broken hardware. Unfortunately, it adds cache pressure and possible mispredicted branches to the fast path of the bus_dmamap_load collection of functions. Since it's meant for slow path exception processing, de-inline it and allow its conditions to be pre-computed at tag_create time and thus short-circuited at runtime. While here, cut down on the size of _bus_dmamap_load_buffer() by pushing the bounce page logic into a non-inlined function. Again, this helps with cache pressure and mispredicted branches. According to the TSC, this shaves off a few cycles on average. Unfortunately, the data varies quite a bit due to interrupts and preemption, so it's hard to get a good measurement. Real world measurements of network PPS are welcomed. A merge to amd64 and other arches is pending more testing.	2006-09-11 06:48:53 +00:00
Alexander Leidinger	bb59e63f8f	Change futex lock from mutex to sx. Make futex_get atomic (protected by the futex lock). Sponsored by: Google SoC 2006 Submitted by: rdivacky Suggested by: jhb	2006-09-09 16:25:25 +00:00
Robert Watson	98bf5a707d	Audit sysarch() operation argument. MFC after: 3 days	2006-09-09 10:20:31 +00:00
John Baldwin	f6c48c1932	Use a single constant to define the sizes of the physmap[], phys_avail[], and dump_avail[] arrays so they are in sync (previously it was possible to store more entries in the physmap[] then we could store in phys_avail[], which was pointless). While I'm here, bump up the length of these tables to hold 30 entries on amd64 and 16 on i386. This allows machines with fairly fragmented memory maps to boot ok (at least one machine would not boot FreeBSD/i386 but would boot FreeBSD/amd64 because amd64 allowed for more fragments). MFC after: 3 days	2006-09-07 15:03:02 +00:00
Maxim Sobolev	e7d33dcbc5	Unbreak in the case when device apic is compiled into non-SMP kernel. Reported by: jhay MFC after: 2 weeks	2006-09-06 22:05:34 +00:00
Ruslan Ermilov	4962065404	Refine previous revision to allow acpi_wakecode.h to be safely built from both the acpi module build directory and a kernel build directory. The latter didn't work when one attempted to build a kernel which had "device acpi" with the "make kernel-toolchain buildkernel" command because a cross-compiler couldn't find anything in the standard system include path (it's empty in the kernel-toolchain case). Fix this by passing a better root path to kernel headers (src/sys) which works for both cases, kernel and module (-I@ only worked for module). Also, while here, pass -nostdinc (and a different spelling for icc) -- it's a feature that the kernel source tree is self-contained, and this change enforces this. Reported by: glebius	2006-09-06 14:23:40 +00:00
Maxim Sobolev	23da540855	The FreeBSD by default "disables" hyper-threading cores, by not scheduling any threads to them. However, it still counts those cores as "active but permanently idle" when calculating system-wide CPUs statistics. It is incorrect, since it skews statistics quite a bit and creates real problems for certain types of applications (monitoring applications for example), by making them believe that the system does have enough idle CPU resources, while in fact it does not. Correct the problem by not calling performance counting routines on "disabled" cores. The cleaner solution would be to just disable APIC timer interrupts on those cores completely, but ENOTIME here and it is not clear if the additional complexity really worth minor performance gain. Reviewed by: ssouhlal Sponsored by: Sippy Software, Inc. MFC after: 2 weeks	2006-09-05 17:15:24 +00:00
David Xu	66e1c26dba	Implement casuword32, compare and set user integer, thank Marcel Moolenarr who wrote the IA64 version of casuword32.	2006-08-28 02:28:15 +00:00
Alexander Leidinger	a6c5f81339	Fix video playing and network connections in realplayer (and most likely other stuff) in the osrelease=2.6.16 case: - implement CLONE_PARENT semantic - fix TLS loading in clone CLONE_SETTLS - lock proc in the currently disabled part of CLONE_THREAD I suggest to not unload the linux module after testing this, there are some "<defunct>" processes hanging around after exiting (they aren't with osrelease=2.4.2) and they may panic your kernel when unloading the linux module. They are in state Z and some of them consume CPU according to ps. But I don't trust the CPU part, the idle threads gets too much CPU that this may be possible (accumulating idle, X and 2 defunct processes results in 104.7%, this looks to much to be a rounding error). Noticed by: Intron <mag@intron.ac> Submitted by: rdivacky (in collaboration with Intron) Tested by: Intron, netchild Reviewed by: jhb (previous version)	2006-08-27 18:51:32 +00:00
Alexander Leidinger	084556f5d7	regen	2006-08-27 08:58:00 +00:00
Alexander Leidinger	835e506190	Add the linux statfs64 call. This allows Tivoli backup to proceed a little but further on -current (still not successful, but a step into the right direction). Sponsored by: Google SoC 2006 Submitted by: rdivacky Tested by: Paul Mather <paul@gromit.dlib.vt.edu>	2006-08-27 08:56:54 +00:00
Alexander Leidinger	40f734dd0d	Emulate what vfork does instead of using it in linux_vfork. This way we can do the stuff we need to do with linux processes at fork and don't panic the kernel at exit of the child. Submitted by: rdivacky Tested with: tst-vfork* (glibc regression tests) Tested by: netchild	2006-08-25 11:59:56 +00:00
Alexander Leidinger	29ddc19bbf	Get rid of some nested includes. Sponsored by: Google SoC 2006 Submitted by: rdivacky Noticed by: jhb	2006-08-19 15:13:01 +00:00
Alexander Leidinger	94cb2ecf79	Move some stuff into headers where they belong. Sponsored by: Google SoC 2006 Submitted by: rdivacky Noticed by: jhb, ssouhlal	2006-08-17 21:06:48 +00:00
Alexander Leidinger	0eef2f8a4e	Style fixes to comments. Sponsored by: Google SoC 2006 Submitted by: rdivacky Noticed by: jhb, ssouhlal	2006-08-16 18:54:51 +00:00
Pav Lucistnik	a08875fbb2	- Fix typo in #error pragma: compitable -> compatible Submitted by: neologism	2006-08-15 20:10:49 +00:00
John Baldwin	f8f1f7fb85	Regen to propogate <prefix>_AUE_<mumble> changes as well as the earlier systrace changes.	2006-08-15 17:37:01 +00:00
John Baldwin	df78f6d313	- Remove unused sysvec variables from various syscalls.conf. - Send the systrace_args files for all the compat ABIs to /dev/null for now. Right now makesyscalls.sh generates a file with a hardcoded function name, so it wouldn't work for any of the ABIs anyway. Probably the function name should be configurable via a 'systracename' variable and the functions should be stored in a function pointer in the sysvec structure.	2006-08-15 17:25:55 +00:00
Warner Losh	72fa8a0268	No need for opt_global.h here	2006-08-15 15:48:58 +00:00
Alexander Leidinger	e9fe50af7e	Remove the include of opt_global.h. It's included globally by a command line switch. Other files which may make the same mistake (according to fxr.watson.org) but aren't fixed in this commit (people with more clue about those files should fix this): - i386/xbox/xbox.c - arm/arm/elf_trampoline.c - arm/arm/mem.c Noticed by: cognet	2006-08-15 15:27:13 +00:00
Alexander Leidinger	94930c140c	Add include of opt_global.h, else the futex operations aren't locked on SMP systems. Sponsored by: Google SoC 2006 Submitted by: rdivacky	2006-08-15 13:45:39 +00:00
Alexander Leidinger	77b959aa51	add autogenerated systrace_args stuff for dtrace	2006-08-15 12:56:36 +00:00
Alexander Leidinger	9b44bfc556	Add the linux 2.6.x stuff (not used by default!): - TLS - complete - pid/tid mangling - complete - thread area - complete - futexes - complete with issues - clone() extension - complete with some possible minor issues - mq/timer/clock* stuff - complete but untested and the mq* stuff is disabled when not build as part of the kernel with native FreeBSD mq* support (module support for this will come later) Tested with: - linux-firefox - works, tested - linux-opera - works, tested - linux-realplay - doesnt work, issue with futexes - linux-skype - doesnt work, issue with futexes - linux-rt2-demo - works, tested - linux-acroread - doesnt work, unknown reason (coredump) and sometimes issue with futexes - various unix utilities in linux-base-gentoo3 and linux-base-fc4: everything tried worked On amd64 not everything is supported like on i386, the catchup is planned for later when the remaining bugs in the new functions are fixed. To test this new stuff, you have to run sysctl compat.linux.osrelease=2.6.16 to switch back use sysctl compat.linux.osrelease=2.4.2 Don't switch while running a linux program, strange things may or may not happen. Sponsored by: Google SoC 2006 Submitted by: rdivacky Some suggestions/help by: jhb, kib, manu@NetBSD.org, netchild	2006-08-15 12:54:30 +00:00
Alexander Leidinger	c107650561	regen	2006-08-15 12:51:45 +00:00
Alexander Leidinger	b4359bd8e5	Add new syscalls in the linuxolator (only used when the sysctl compat.linux.osrelease is changed to "2.6.16" or similar). On amd64 not everything is supported like on i386, the catchup is planned for later when the remaining bugs in the new functions are fixed. Sponsored by: Google SoC 2006 Submitted by: rdivacky	2006-08-15 12:28:14 +00:00
Alexander Leidinger	fb5592c084	- Add some ASM stuff needed for futexes (linuxolator). Sponsored by: Google SoC 2006 Submitted by: rdivacky With help from: kib	2006-08-15 12:14:36 +00:00
Alan Cox	c3b182d235	Eliminate an unnecessary initialization from trap_pfault() that also happens to contain a style error.	2006-08-14 19:53:53 +00:00
John Baldwin	d25fdf539e	Don't try to preserve PAT bits in pmap_enter(). We currently on pages that aren't mapped via pmap_enter() (KVA). We will eventually support PAT bits on user pages, but those will require some sort of MI caching mode stored in the vm_page. Reviewed by: alc	2006-08-14 15:39:41 +00:00
John Baldwin	7e9f73f3ed	First pass at allowing memory to be mapped using cache modes other than WB (write-back) on x86 via control bits in PTEs and PDEs (including making use of the PAT MSR). Changes include: - A new pmap_mapdev_attr() function for amd64 and i386 which takes an additional parameter (relative to pmap_mapdev()) specifying the cache mode for this mapping. Note that on amd64 only WB mappings are done with the direct map, all other modes result in a private mapping. - pmap_mapdev() on i386 and amd64 now defaults to using UC (uncached) mappings rather than WB. Previously we relied on the BIOS setting up MTRR's to enforce memio regions being treated as UC. This might make hw.cbb_start_memory unnecessary in some cases now for example. - A new pmap_mapbios()/pmap_unmapbios() API has been added to allow places that used pmap_mapdev() to map non-device memory (such as ACPI tables) to do so using WB as before. - A new pmap_change_attr() function for amd64 and i386 that changes the caching mode for a range of KVA. Reviewed by: alc	2006-08-11 19:22:57 +00:00
Alexander Leidinger	50e422f056	Add some more errno mappings (bsd -> linux) and a comment about the status.. Submitted by: "Intron" <mag@intron.ac>	2006-08-10 22:05:25 +00:00
Warner Losh	ddebcb409b	Eliminate one set of XBOX #ifdefs. The Xbox code just needs to set a different TIMER_FREQ value than default. Accomplish this via the config file rather than via an #ifdef.	2006-08-09 23:47:38 +00:00
Warner Losh	f5190c1277	Minor style(9) nit.	2006-08-09 23:37:30 +00:00
Nate Lawson	ad3d78ead0	If a beep was enabled, turn it off 3 seconds after resume. MFC after: 3 days	2006-08-08 01:30:54 +00:00
Alan Cox	14aaab5329	Eliminate the acquisition and release of the page queues lock around a call to vm_page_sleep_if_busy().	2006-08-06 06:29:16 +00:00
Michael Reifenberger	0b2aa43a65	Dont overwrite cpu_model in the case of Via's C3-CPU. Noticed by: Mike Tancsa MFC after: 2 days	2006-08-04 13:49:16 +00:00
Yaroslav Tykhiy	776fc0e90e	Commit the results of the typo hunt by Darren Pilgrim. This change affects documentation and comments only, no real code involved. PR: misc/101245 Submitted by: Darren Pilgrim <darren pilgrim bitfreak org> Tested by: md5(1) MFC after: 1 week	2006-08-04 07:56:35 +00:00
Alan Cox	78985e424a	Complete the transition from pmap_page_protect() to pmap_remove_write(). Originally, I had adopted sparc64's name, pmap_clear_write(), for the function that is now pmap_remove_write(). However, this function is more like pmap_remove_all() than like pmap_clear_modify() or pmap_clear_reference(), hence, the name change. The higher-level rationale behind this change is described in src/sys/amd64/amd64/pmap.c revision 1.567. The short version is that I'm trying to clean up and fix our support for execute access. Reviewed by: marcel@ (ia64)	2006-08-01 19:06:06 +00:00
David E. O'Brien	a4755e0e13	Correct spelling of 3DNow!.	2006-08-01 01:23:39 +00:00
Marcel Moolenaar	302981e72a	Remove sio(4) and related options from MI files to amd64, i386 and pc98 MD files. Remove nodevice and nooption lines specific to sio(4) from ia64, powerpc and sparc64 NOTES. There were no such lines for arm yet. sio(4) is usable on less than half the platforms, not counting a future mips platform. Its presence in MI files is therefore increasingly becoming a burden.	2006-07-29 18:38:54 +00:00
John Baldwin	cb76d9b05c	Retire SYF_ARGMASK and remove both SYF_MPSAFE and SYF_ARGMASK. sy_narg is now back to just being an argument count.	2006-07-28 20:22:58 +00:00
John Baldwin	91ce2694d1	Regen for MPSAFE flag removal.	2006-07-28 19:08:37 +00:00
John Baldwin	af5bf12239	Now that all system calls are MPSAFE, retire the SYF_MPSAFE flag used to mark system calls as being MPSAFE: - Stop conditionally acquiring Giant around system call invocations. - Remove all of the 'M' prefixes from the master system call files. - Remove support for the 'M' prefix from the script that generates the syscall-related files from the master system call files. - Don't explicitly set SYF_MPSAFE when registering nfssvc.	2006-07-28 19:05:28 +00:00
John Baldwin	e0b4add8d8	Various fixes to comments in the syscall master files including removing cruft from the audit import and adding mention of COMPAT4 to freebsd32.	2006-07-28 18:55:18 +00:00
John Baldwin	22ea1bc57a	Unify the checking for lock misbehavior in the various syscall() implementations and adjust some of the checks while I'm here: - Add a new check to make sure we don't return from a syscall in a critical section. - Add a new explicit check before userret() to make sure we don't return with any locks held. The advantage here is that we can include the syscall number and name in syscall() whereas that info is not available in userret(). - Drop the mtx_assert()'s of sched_lock and Giant. They are replaced by the more general checks just added. MFC after: 2 weeks	2006-07-27 22:32:30 +00:00
John Baldwin	7abb84313a	Argh, fix compile with XBOX enabled. Somehow I missed a LINT compile. :(	2006-07-27 22:19:02 +00:00
John Baldwin	57b16b0882	Don't allow MAXMEM or hw.physmem to extend the top of memory if our memory map was obtained from the SMAP. SMAP is trustworthy, and the memory extending feature is a band-aid for older systems where FreeBSD's methods of detecting memory were not always trustworthy. This fixes the issue where using hw.physmem could result in the ACPI tables getting trashed breaking ACPI. MFC after: 3 days Tested on: i386	2006-07-27 19:47:22 +00:00
Pyun YongHyeon	ae05dd24c0	Add stge(4) to the list of drivers supported by GENERIC kernel.	2006-07-25 01:06:32 +00:00
John Baldwin	3ce72960e8	Regen.	2006-07-21 20:41:33 +00:00
John Baldwin	b4c63329d5	- Pass the MPSAFE flag to namei() in linux_uselib() and handle conditional Giant VFS locking in that function. - Remove bogus code to handle the case where namei() returns success but a NULL vnode pointer. - Note that this code duplicates exec_check_permissions() and annotate where it differs. - Hold the vnode lock longer to protect the write to set VV_TEXT in v_vflag. - Mark linux_uselib() MPSAFE. Reviewed by: rwatson	2006-07-21 20:22:13 +00:00
Alan Cox	3cad40e517	Add pmap_clear_write() to the interface between the virtual memory system's machine-dependent and machine-independent layers. Once pmap_clear_write() is implemented on all of our supported architectures, I intend to replace all calls to pmap_page_protect() by calls to pmap_clear_write(). Why? Both the use and implementation of pmap_page_protect() in our virtual memory system has subtle errors, specifically, the management of execute permission is broken on some architectures. The "prot" argument to pmap_page_protect() should behave differently from the "prot" argument to other pmap functions. Instead of meaning, "give the specified access rights to all of the physical page's mappings," it means "don't take away the specified access rights from all of the physical page's mappings, but do take away the ones that aren't specified." However, owing to our i386 legacy, i.e., no support for no-execute rights, all but one invocation of pmap_page_protect() specifies VM_PROT_READ only, when the intent is, in fact, to remove only write permission. Consequently, a faithful implementation of pmap_page_protect(), e.g., ia64, would remove execute permission as well as write permission. On the other hand, some architectures that support execute permission have basically ignored whether or not VM_PROT_EXECUTE is passed to pmap_page_protect(), e.g., amd64 and sparc64. This change represents the first step in replacing pmap_page_protect() by the less subtle pmap_clear_write() that is already implemented on amd64, i386, and sparc64. Discussed with: grehan@ and marcel@	2006-07-20 17:48:41 +00:00
Alan Cox	7c9cc27f26	MFamd64 pmap_clear_ptes() is already convoluted. This will worsen with the implementation of superpages. Eliminate it and add pmap_clear_write(). There are no functional changes. Checked by: md5	2006-07-18 03:17:12 +00:00
Alan Cox	e4cec28398	Now that free_pv_entry() accesses the pmap, call free_pv_entry() in pmap_remove_all() before rather than after the pmap is unlocked. At present, the page queues lock provides sufficient sychronization. In the future, the page queues lock may not always be held when free_pv_entry() is called.	2006-07-17 03:10:17 +00:00
Alan Cox	d259662b81	MFamd64 Make three simplifications to pmap_ts_referenced(): Eliminate an initialized but otherwise unused variable. Eliminate an unnecessary test. Exit the loop in a shorter way.	2006-07-16 21:05:58 +00:00
Alan Cox	ddd6244a16	Eliminate the remaining uses of "register". Convert the remaining K&R-style function declarations to ANSI-style. Eliminate excessive white space from pmap_ts_referenced().	2006-07-16 19:43:49 +00:00
Alan Cox	d3dd65ab60	Make pc_freemask an array of uint32_t, rather than uint64_t. (I believe that the use of the latter is simply an oversight in porting the new pv entry code from amd64.)	2006-07-15 07:24:30 +00:00
John Baldwin	2e5ea45f19	Regen.	2006-07-14 15:42:47 +00:00
John Baldwin	5560239759	Somewhat surprisingly, ibcs2_ioctl() is MPSAFE as it is without needing any further fixes.	2006-07-14 15:42:21 +00:00
John Baldwin	4028fd5b49	Regen.	2006-07-14 15:31:01 +00:00
John Baldwin	86759770f9	Mark ibcs2_mount() (just returns EINVAL) and ibcs2_umount() (just calls unmount(2)) MPSAFE.	2006-07-14 15:30:50 +00:00
John Baldwin	e4edf2558e	Regen.	2006-07-14 15:11:46 +00:00
John Baldwin	7186587b82	ibcs2_sigprocmask() is already marked MPSAFE in syscalls.xenix, so mark it MPSAFE in syscalls.isc.	2006-07-14 15:11:20 +00:00
Jung-uk Kim	0758eaa227	Sync specialreg.h changes between amd64 and i386 with few fixes.	2006-07-13 16:09:40 +00:00
John Baldwin	19e9205a23	Simplify the pager support in DDB. Allowing different db commands to install custom pager functions didn't actually happen in practice (they all just used the simple pager and passed in a local quit pointer). So, just hardcode the simple pager as the only pager and make it set a global db_pager_quit flag that db commands can check when the user hits 'q' (or a suitable variant) at the pager prompt. Also, now that it's easy to do so, enable paging by default for all ddb commands. Any command that wishes to honor the quit flag can do so by checking db_pager_quit. Note that the pager can also be effectively disabled by setting $lines to 0. Other fixes: - 'show idt' on i386 and pc98 now actually checks the quit flag and terminates early. - 'show intr' now actually checks the quit flag and terminates early.	2006-07-12 21:22:44 +00:00
Michael Reifenberger	df2f5de4e5	Initialise (if necessary) the VIA C3/C7 features. Store the capabilities for further use by random(4), padlock(4), ... Obtained from: mostly OpenBSD MFC after: 1 week	2006-07-12 19:46:08 +00:00
Michael Reifenberger	9b6560e483	fix typo in identcpu.c and add one define to specialreg.h. MFC after: 1 week	2006-07-12 16:52:56 +00:00
Michael Reifenberger	e5f87cebb3	First step to identify and initialize the newer VIA C7 CPU as found in a VIA EPIA EN-15000 board. Obtained from: large parts from OpenBSD	2006-07-12 14:52:32 +00:00
Jung-uk Kim	444576c0c4	Add two new CPUID bits for AMD CPUs, i. e., SVM and extended APIC register.	2006-07-12 06:04:12 +00:00
John Baldwin	90aff9de2d	Regen.	2006-07-11 20:55:23 +00:00
John Baldwin	be5747d5b5	- Add conditional VFS Giant locking to getdents_common() (linux ABIs), ibcs2_getdents(), ibcs2_read(), ogetdirentries(), svr4_sys_getdents(), and svr4_sys_getdents64() similar to that in getdirentries(). - Mark ibcs2_getdents(), ibcs2_read(), linux_getdents(), linux_getdents64(), linux_readdir(), ogetdirentries(), svr4_sys_getdents(), and svr4_sys_getdents64() MPSAFE.	2006-07-11 20:52:08 +00:00
John Baldwin	a46a6706c5	Retire the stackgap macros from ibcs2 as they are no longer used. Push the includes of <sys/exec.h> and <sys/sysent.h> down into the only files that now need them.	2006-07-10 17:59:26 +00:00
John Baldwin	33f016341e	Regen.	2006-07-10 15:55:38 +00:00
John Baldwin	036fd5f3bc	Mark ibcs2_msgsys(), ibcs2_semsys(), and ibcs2_shmsys() MPSAFE.	2006-07-10 15:55:17 +00:00
Thomas Wintergerst	5d0c7501b6	Extend i4b to support CAPI manager based ISDN controllers (CAPI manager is part of c4b, CAPI for BSD). This is a preparation to add CAPI for BSD to the source tree. Approved by: hm (mentor) MFC after: 2 weeks	2006-07-09 21:16:06 +00:00
Matt Jacob	626b06c711	Make the firmware assist driver resident in preparation for isp using it. Reviewed by: sam, max	2006-07-09 16:41:22 +00:00
Matt Jacob	1b530437b6	If PAE is built w/o modules, make sure that isp(4) has its firmware resident as well.	2006-07-09 16:38:58 +00:00
John Baldwin	16a9155a67	Regen.	2006-07-08 20:14:34 +00:00
John Baldwin	d9f4623307	- Split ioctl() up into ioctl() and kern_ioctl(). The kern_ioctl() assumes that the 'data' pointer is already setup to point to a valid KVM buffer or contains the copied-in data from userland as appropriate (ioctl(2) still does this). kern_ioctl() takes care of looking up a file pointer, implementing FIONCLEX and FIOCLEX, and calling fi_ioctl(). - Use kern_ioctl() to implement xenix_rdchk() instead of using the stackgap and mark xenix_rdchk() MPSAFE.	2006-07-08 20:12:14 +00:00
John Baldwin	43e757a78d	Use kern_connect() in spx_open() to avoid the need for the stackgap. I also used kern_close() for simplicity though close(2) wasn't requiring the use of the stackgap.	2006-07-08 20:05:04 +00:00
John Baldwin	c68b315699	- Split the IBCS2 ipc foosys() system calls up into subfunctions matching the organization in svr4_ipc.c. - Use kern_msgctl(), kern_semctl(), and kern_shmctl() instead of the stackgap.	2006-07-08 19:54:12 +00:00
John Baldwin	839cea4b0a	Use ibsc2_key_t rather than key_t.	2006-07-08 19:52:49 +00:00
John Baldwin	ec982ae761	Regen.	2006-07-06 21:43:14 +00:00
John Baldwin	ad6d226d43	- Protect the list of linux ioctl handlers with an sx lock. - Hold Giant while calling linux ioctl handlers for now as they aren't all known to be MPSAFE yet. - Mark linux_ioctl() MPSAFE.	2006-07-06 21:42:36 +00:00
John Baldwin	42fd98d94b	Regen.	2006-07-06 21:33:14 +00:00
John Baldwin	3cb83e714d	Add kern_setgroups() and kern_getgroups() and use them to implement ibcs2_[gs]etgroups() rather than using the stackgap. This also makes ibcs2_[gs]etgroups() MPSAFE. Also, it cleans up one bit of weirdness in the old setgroups() where it allocated an entire credential just so it had a place to copy the group list into. Now setgroups just allocates a NGROUPS_MAX array on the stack that it copies into and then passes to kern_setgroups().	2006-07-06 21:32:20 +00:00
John Baldwin	c79c04f176	Use the regular poll(2) function to implement poll(2) for the IBCS2 compat ABI as FreeBSD's poll(2) is ABI compatible. The ibcs2_poll() function attempted to implement poll(2) using a wrapper around select(2). Besides being somewhat ugly, it also had at least one bug in that instead of allocating complete fdset's on the stack via the stackgap it just allocated pointers to fdsets.	2006-07-06 21:29:05 +00:00
David Xu	20197a8c49	Temporarily remove SCHED_CORE, it seems I have so many works can do now, one example is POSIX priority mutex for libthr.	2006-07-05 02:32:55 +00:00
Alan Cox	da536e6348	Correct an error in the new pmap_collect(), thus only affecting HEAD. Specifically, the pv entry was always being freed to the caller's pmap instead of the pmap to which the pv entry belongs.	2006-07-02 18:22:47 +00:00
Rink Springer	44a8277bc6	Updated the XBOX kernel to use the new nfe(4) driver obtained from OpenBSD. This driver seems to give a small performance increase, and should lead to better maintainability in the future. The nForce Ethernet-specific hack in sys/i386/xbox/xbox.c is still required, judging from dev/nfe/if_nfe.c. The condition it hacks will almost certainly only occur on XBOX-es anyway, so it is best left there. Approved by: imp (mentor)	2006-06-27 20:22:32 +00:00
John Baldwin	cec34dbf79	Regen.	2006-06-27 18:32:16 +00:00
John Baldwin	49d409a108	- Add a kern_semctl() helper function for __semctl(). It accepts a pointer to a copied-in copy of the 'union semun' and a uioseg to indicate which memory space the 'buf' pointer of the union points to. This is then used in linux_semctl() and svr4_sys_semctl() to eliminate use of the stackgap. - Mark linux_ipc() and svr4_sys_semsys() MPSAFE.	2006-06-27 18:28:50 +00:00
John Baldwin	0cceebeeb2	Regen.	2006-06-27 14:47:08 +00:00
John Baldwin	597d608f86	- Expand the scope of Giant some in mount(2) to protect the vfsp structure from going away. mount(2) is now MPSAFE. - Expand the scope of Giant some in unmount(2) to protect the mp structure (or rather, to handle concurrent unmount races) from going away. umount(2) is now MPSAFE, as well as linux_umount() and linux_oldumount(). - nmount(2) and linux_mount() were already MPSAFE.	2006-06-27 14:46:31 +00:00
Alan Cox	8e0e1e2239	Correct a very old and very obscure bug: vmspace_fork() calls pmap_copy() if the mapping is VM_INHERIT_SHARE. Suppose the mapping is also wired. vmspace_fork() clears the wiring attributes in the vm map entry but pmap_copy() copies the PG_W attribute in the PTE. I don't think this is catastrophic. It blocks pmap_remove_pages() from destroying the mapping and corrupts the pmap's wiring count. This revision fixes the problem by changing pmap_copy() to clear the PG_W attribute. Reviewed by: tegge@	2006-06-27 04:28:23 +00:00
David E. O'Brien	bfc788c283	Add a pure open source nForce Ethernet driver, under BSDL. This driver was ported from OpenBSD by Shigeaki Tagashira <shigeaki@se.hiroshima-u.ac.jp> and posted at http://www.se.hiroshima-u.ac.jp/~shigeaki/software/freebsd-nfe.html It was additionally cleaned up by me. It is still a work-in-progress and thus is purposefully not in GENERIC. And it conflicts with nve(4), so only one should be loaded.	2006-06-26 23:41:07 +00:00
Sergey Babkin	d81175c738	Backed out the change by request from rwatson. PR: kern/14584	2006-06-26 22:03:22 +00:00
John Baldwin	b820787fb3	Regen.	2006-06-26 18:37:36 +00:00
John Baldwin	cf837b8943	linux_brk() is MPSAFE.	2006-06-26 18:36:16 +00:00
Alan Cox	031bf5ea1e	Eliminate a comment that became stale after revision 1.547.	2006-06-25 22:15:02 +00:00
Sergey Babkin	7a799f1ef0	The common UID/GID space implementation. It has been discussed on -arch in 1999, and there are changes to the sysctl names compared to PR, according to that discussion. The description is in sys/conf/NOTES. Lines in the GENERIC files are added in commented-out form. I'll attach the test script I've used to PR. PR: kern/14584 Submitted by: babkin	2006-06-25 18:37:44 +00:00
Alan Cox	f05446648b	Change get_pv_entry() such that the call to vm_page_alloc() specifies VM_ALLOC_NORMAL instead of VM_ALLOC_SYSTEM when try is TRUE. In other words, when get_pv_entry() is permitted to fail, it no longer tries as hard to allocate a page. Change pmap_enter_quick_locked() to fail rather than wait if it is unable to allocate a page table page. This prevents a race between pmap_enter_object() and the page daemon. Specifically, an inactive page that is a successor to the page that was given to pmap_enter_quick_locked() might become a cache page while pmap_enter_quick_locked() waits and later pmap_enter_object() maps the cache page violating the invariant that cache pages are never mapped. Similarly, change pmap_enter_quick_locked() to call pmap_try_insert_pv_entry() rather than pmap_insert_entry(). Generally speaking, pmap_enter_quick_locked() is used to create speculative mappings. So, it should not try hard to allocate memory if free memory is scarce. Add an assertion that the object containing m_start is locked in pmap_enter_object(). Remove a similar assertion from pmap_enter_quick_locked() because that function no longer accesses the containing object. Remove a stale comment. Reviewed by: ups@	2006-06-20 20:52:11 +00:00
Alexander Leidinger	aff681d258	regen after change to syscalls.master	2006-06-20 20:41:29 +00:00
Alexander Leidinger	502195ac72	Switch to using the DUMMY infrastructure instead of UNIMPL for the new syscalls. This way there will be a log message printed to the console (this time for real). Note: UNIMPL should be used for syscalls we do not implement ever, e.g. syscalls to load linux kernel modules. Submitted by: rdivacky Sponsored by: Goole SoC 2006 P4 IDs: 99600, 99602	2006-06-20 20:38:44 +00:00
Yaroslav Tykhiy	15a901e263	We no longer need to disable interrupts in MD trap machinery when we're about to call kdb_trap() because the latter MI function can disable interrupts by itself now. Pointed out by: bde X-MFC remark: depends on kern/subr_kdb.c#1.18 Sponsored by: RiNet (Cronyx Plus LLC)	2006-06-20 12:44:21 +00:00
David Xu	d037e6d6d0	Style fix, use low-case.	2006-06-19 07:55:29 +00:00
David Xu	85b2d575de	Clear bit 22 in MSR IA32_MISC_ENABLE, according to Intel document, when the bit 22 is set to 1, CPUID with EAX=0 returns a maximum value in EAX[7..0] of 3, when set to 0(default), CPUID with EAX=0 returns the number corresponding to the maximum standard function supported. On my machine, BIOS sets the bit to 1 to make it to be compatible with old OS, this causes dual-core Pentium-D (two physical cores) to be identified as hyperthreading (two logical cores) by function mp_topology().	2006-06-19 07:51:47 +00:00
Yaroslav Tykhiy	8276a67ad0	Fix style while I'm here.	2006-06-18 12:13:49 +00:00
Yaroslav Tykhiy	042bbfae5a	The i386 "call" instruction works as follows: it pushes the return address on the stack and only then "dereferences" %pc. Therefore, in the case of a call to an invalid address, we arrive to the trap handler with the invalid value in tf_eip. This used to prevent db_backtrace() from assigning the most recent and interesting frame on the stack to the right spot in the right function, from which the invalid call was attempted. Try to detect and work around that by recovering the return address from the stack. The work-around requires the fault address be passed to db_backtrace(). Smuggle it as tf_err. MFC after: 1 month Sponsored by: RiNet (Cronyx Plus LLC)	2006-06-18 12:07:00 +00:00
Matt Jacob	375e362989	Unbreak tinderbox- fix device_printf arg to accomodate different sizes of vm_paddr_t in different contexts (e.g., PAE vs. non PAE).	2006-06-16 14:04:21 +00:00
Yaroslav Tykhiy	a436dbf123	Return -1 from db_numargs() if number of args couldn't be guessed. Use this later to indicate in backtrace output that args shown are uncertain. Sponsored by: RiNet (Cronyx Plus LLC)	2006-06-16 11:49:37 +00:00
Yaroslav Tykhiy	70b906ae82	Guess the number of arguments to a function somewhat better. Now GCC likes to stick a "mov %eax, %FOO" instruction before "addl $BAR, %esp" if the function just called returns an int, which is a very common case in the kernel. Sponsored by: RiNet (Cronyx Plus LLC)	2006-06-16 11:14:54 +00:00
Alexander Leidinger	28a3ae7f88	Remove COMPAT_43 from GENERIC (and other kernel configs). For amd64 there's an explicit comment that it's needed for the linuxolator. This is not the case anymore. For all other architectures there was only a "KEEP THIS". I'm (and other people too) running a COMPAT_43-less kernel since it's not necessary anymore for the linuxolator. Roman is running such a kernel for a for longer time. No problems so far. And I doubt other (newer than ia32 or alpha) architectures really depend on it. This may result in a small performance increase for some workloads. If the removal of COMPAT_43 results in a not working program, please recompile it and all dependencies and try again before reporting a problem. The only place where COMPAT_43 is needed (as in: does not compile without it) is in the (outdated/not usable since too old) svr4 code. Note: this does not remove the COMPAT_43TTY option. Nagging by: rdivacky	2006-06-15 19:58:53 +00:00
Stephan Uphoff	2053c12705	Remove mpte optimization from pmap_enter_quick(). There is a race with the current locking scheme and removing it should have no measurable performance impact. This fixes page faults leading to panics in pmap_enter_quick_locked() on amd64/i386. Reviewed by: alc,jhb,peter,ps	2006-06-15 01:01:06 +00:00
Alexander Leidinger	4946fe7c4d	regen after MFP4 (soc2006/rdivacky_linuxolator) of syscalls.master P4-Changes: similar to 98673 and 98675 but regenerated locally Sponsored by: Google SoC 2006 Submitted by: rdivacky	2006-06-13 18:48:30 +00:00
Alexander Leidinger	c8b579c182	MFP4 (soc2006/rdivacky_linuxolator) Update of syscall.master: o Adding of several new dummy syscalls (268-310) o Synchronization of amd64 syscall.master with i386 one o Auditing added to amd64 syscall.master o Change auditing type for lstat syscall (bugfix). [1] P4-Changes: 98672, 98674 Noticed by: rwatson [1] Sponsored by: Google SoC 2006 Submitted by: rdivacky	2006-06-13 18:43:55 +00:00
David Xu	b41f1452d9	Add scheduler CORE, the work I have done half a year ago, recent, I picked it up again. The scheduler is forked from ULE, but the algorithm to detect an interactive process is almost completely different with ULE, it comes from Linux paper "Understanding the Linux 2.6.8.1 CPU Scheduler", although I still use same word "score" as a priority boost in ULE scheduler. Briefly, the scheduler has following characteristic: 1. Timesharing process's nice value is seriously respected, timeslice and interaction detecting algorithm are based on nice value. 2. per-cpu scheduling queue and load balancing. 3. O(1) scheduling. 4. Some cpu affinity code in wakeup path. 5. Support POSIX SCHED_FIFO and SCHED_RR. Unlike scheduler 4BSD and ULE which using fuzzy RQ_PPQ, the scheduler uses 256 priority queues. Unlike ULE which using pull and push, the scheduelr uses pull method, the main reason is to let relative idle cpu do the work, but current the whole scheduler is protected by the big sched_lock, so the benefit is not visible, it really can be worse than nothing because all other cpu are locked out when we are doing balancing work, which the 4BSD scheduelr does not have this problem. The scheduler does not support hyperthreading very well, in fact, the scheduler does not make the difference between physical CPU and logical CPU, this should be improved in feature. The scheduler has priority inversion problem on MP machine, it is not good for realtime scheduling, it can cause realtime process starving. As a result, it seems the MySQL super-smack runs better on my Pentium-D machine when using libthr, despite on UP or SMP kernel.	2006-06-13 13:12:56 +00:00
Marius Strobl	acb8c14985	Make the ISAPNP code optional and only enable it on i386 and pc98 (used for CBUS-PNP cards there) by default, as there are no amd64 and sparc64 machines with ISA slots and which therefore could make use of this code known to exist. For sparc64 this additionally allows to get rid of the compat shims for in{b,w,l}()/out{b,w,l}() etc and the associated hacks. OK'ed by: imp, peter	2006-06-12 21:07:13 +00:00
John Baldwin	e3d7caf487	Enable a few more things in x86 NOTES to get broader LINT coverage: - Turn on iwi(4), ipw(4), and ndis(4) on amd64 and i386. - Turn on ral(4) and ural(4) on i386, pc98, and amd64.	2006-06-12 20:38:17 +00:00

... 3 4 5 6 7 ...

11107 Commits