freebsd-dev

Author	SHA1	Message	Date
John Baldwin	78c85e8dfc	Rework how we store process times in the kernel such that we always store the raw values including for child process statistics and only compute the system and user timevals on demand. - Fix the various kern_wait() syscall wrappers to only pass in a rusage pointer if they are going to use the result. - Add a kern_getrusage() function for the ABI syscalls to use so that they don't have to play stackgap games to call getrusage(). - Fix the svr4_sys_times() syscall to just call calcru() to calculate the times it needs rather than calling getrusage() twice with associated stackgap, etc. - Add a new rusage_ext structure to store raw time stats such as tick counts for user, system, and interrupt time as well as a bintime of the total runtime. A new p_rux field in struct proc replaces the same inline fields from struct proc (i.e. p_[isu]ticks, p_[isu]u, and p_runtime). A new p_crux field in struct proc contains the "raw" child time usage statistics. ruadd() has been changed to handle adding the associated rusage_ext structures as well as the values in rusage. Effectively, the values in rusage_ext replace the ru_utime and ru_stime values in struct rusage. These two fields in struct rusage are no longer used in the kernel. - calcru() has been split into a static worker function calcru1() that calculates appropriate timevals for user and system time as well as updating the rux_[isu]u fields of a passed in rusage_ext structure. calcru() uses a copy of the process' p_rux structure to compute the timevals after updating the runtime appropriately if any of the threads in that process are currently executing. It also now only locks sched_lock internally while doing the rux_runtime fixup. calcru() now only requires the caller to hold the proc lock and calcru1() only requires the proc lock internally. calcru() also no longer allows callers to ask for an interrupt timeval since none of them actually did. - calcru() now correctly handles threads executing on other CPUs. - A new calccru() function computes the child system and user timevals by calling calcru1() on p_crux. Note that this means that any code that wants child times must now call this function rather than reading from p_cru directly. This function also requires the proc lock. - This finishes the locking for rusage and friends so some of the Giant locks in exit1() and kern_wait() are now gone. - The locking in ttyinfo() has been tweaked so that a shared lock of the proctree lock is used to protect the process group rather than the process group lock. By holding this lock until the end of the function we now ensure that the process/thread that we pick to dump info about will no longer vanish while we are trying to output its info to the console. Submitted by: bde (mostly) MFC after: 1 month	2004-10-05 18:51:11 +00:00
Alan Cox	caa665aae3	Undo revision 1.251. This change was a performance pessimizing work-around that is no longer required. (In fact, it is not clear that it was ever required in HEAD or RELENG_4, only RELENG_3 required a work-around.) Now, as before revision 1.251, if the preexisting PTE is invalid, pmap_enter() does not call pmap_invalidate_page() to update the TLB(s). Note: Even with this change, the handling of a copy-on-write fault is inefficient, in such cases pmap_enter() calls pmap_invalidate_page() twice. Discussed with: bde@ PR: kern/16568	2004-10-03 20:14:07 +00:00
Alan Cox	8ceb3dcb60	The physical address stored in the vm_page is page aligned. There is no need to mask off the page offset bits. (This operation made some sense prior to i386/i386/pmap.c revision 1.254 when we passed a physical address rather than a vm_page pointer to pmap_enter().)	2004-10-03 00:16:43 +00:00
Alan Cox	07b3303943	Eliminate unnecessary uses of PHYS_TO_VM_PAGE() from pmap_enter(). These uses predate the change in the pmap_enter() interface that replaced the page's physical address by the address of its vm_page structure. The PHYS_TO_VM_PAGE() was being used to compute the address of the same vm_page structure that was being passed in.	2004-10-02 07:34:58 +00:00
Alan Cox	bbda1f18d9	Remove an unused declaration. (I should have included this change in revision 1.486.)	2004-10-02 05:58:32 +00:00
Alan Cox	0a752e9843	Prevent the unexpected deallocation of a page table page while performing pmap_copy(). This entails additional locking in pmap_copy() and the addition of a "flags" parameter to the page table page allocator for specifying whether it may sleep when memory is unavailable. (Already, pmap_copy() checks the availability of memory, aborting if it is scarce. In theory, another CPU could, however, allocate memory between pmap_copy()'s check and the call to the page table page allocator, causing the current thread to release its locks and sleep. This change makes this scenario impossible.) Reviewed by: tegge@	2004-09-29 19:20:40 +00:00
Peter Wemm	ffee5dac09	MFi386: rev 1.239 - invalidate tlb after pte update	2004-09-29 01:59:10 +00:00
Peter Wemm	083e5bdc72	MFi386: rev 1.236 - improve panic message for a busted mptable	2004-09-29 01:58:24 +00:00
Peter Wemm	6fdf763cef	Like on i386, use the definition of struct bios_smap from machine/pc/bios.h again.	2004-09-24 01:11:11 +00:00
Peter Wemm	2169193596	Converge towards i386. I originally resisted creating <machine/pc/bios.h> because it was mostly irrelevant - except for the silly BIOS_PADDRTOVADDR etc macros. Along the way of working around this, I missed a few things. * Make syscons properly inherit the bios capslock/shiftlock/etc state like i386 does. Note that we cannot inherit the bios key repeat rate because that requires a bios call (which is impossible for us). * Give syscons the ability to beep on amd64. Oops. While here, make bios.c compile and add it to files.amd64.	2004-09-24 01:08:34 +00:00
Peter Wemm	c3277f936c	Severely strip down the repocopied i386/bios.c and bios.h files. It turns out that bios_sigsearch() etc is useful for finding tables in roms.	2004-09-24 00:42:36 +00:00
Alan Cox	a971139680	Correct a long-standing error in _pmap_unwire_pte_hold() affecting multiprocessors. Specifically, the error is conditioning the call to pmap_invalidate_page() on whether the pmap is active on the current CPU. This call must be unconditional. Regardless of whether the pmap is active on the CPU performing _pmap_unwire_pte_hold(), it could be active on another CPU. For example, a call to pmap_remove_all() by the page daemon could result in a call to _pmap_unwire_pte_hold() with the pmap inactive on the current CPU and active on another CPU. In such circumstances, failing to call pmap_invalidate_page() results in a stale TLB entry on the other CPU that still maps the now deallocated page table page. What happens next is typically a mysterious panic in pmap_enter() by the other CPU, either "pmap_enter: attempted pmap_enter on 4MB page" or "pmap_enter: pte vanished, va: 0x%lx". Both occur because the former page table page has been recycled and allocated to a new purpose. Consequently, it no longer contains zeroes. See also Peter's i386/i386/pmap.c revision 1.448 and the related e-mail thread last year. Many thanks to the engineers at Sandvine for providing clear and concise information until all of the pieces of the puzzle fell into place and for testing an earlier patch. MT5 Candidate	2004-09-22 05:01:48 +00:00
Peter Wemm	7789933b6a	MFi386: adapt rev 1.19 (debugger fixes)	2004-09-22 01:27:06 +00:00
Peter Wemm	b05ba1f73d	Minor sync-up with i386. Catch up on de-quoting and de-counting after config changes.	2004-09-22 01:04:54 +00:00
Peter Wemm	46ec31b083	MFi386: add ispfw (except using correct device<tab><tab>ispfw format, <space><tab> is for the options line)	2004-09-22 00:44:13 +00:00
John Baldwin	76764432e4	- Add support for "paging" in stack trace output. That is, when you do a stack trace from ddb, the output will pause with a '--More--' prompt every 18 lines. If you hit Enter, it will print another line and prompt again. If you hit space it will output another page and then prompt. If you hit 'q' or 'x' it will abort the rest of the stack trace. - Fix the sparc64 userland stack trace to honor the total count of lines to print. This is useful if your trace happens to walk back onto 0xdeadc0de and gets stuck in an endless loop. MFC after: 1 month Tested on: i386, alpha, sparc64	2004-09-20 19:05:32 +00:00
Alan Cox	de6c3db01f	Simplify the reference counting of page table pages. Specifically, use the page table page's wired count rather than its hold count to contain the reference count. My rationale for this change is based on several factors: 1. The machine-independent and pmap layers used the same hold count field in subtly different ways. The machine-independent layer uses the hold count to implement a form of ephemeral wiring that is used by pipes, physio, etc. In other words, subsystems where we wish to temporarily block a page from being swapped out while it is mapped into the kernel's address space. Such pages are never removed from the page queues. Instead, the page daemon recognizes a non-zero hold count to mean "hands off this page." In contrast, page table pages are never in the page queues; they are wired from birth to death. The hold count was being used as a kind of reference count, specifically, the number of valid page table entries within the page. Not surprisingly, these two different uses imply different synchronization rules: in the machine- independent layer access to the hold count requires the page queues lock; whereas in the pmap layer the pmap lock is required. Thus, continued use by the pmap layer of vm_page_unhold(), which asserts that the page queues lock is held, made no sense. 2. _pmap_unwire_pte_hold() was too forgiving in its handling of the wired count. An unexpected wired count on a page table page was ignored and the underlying page leaked. 3. In a word, microoptimization. Using the wired count exclusively, rather than a combination of the wired and hold counts, makes the code slightly smaller and faster. Reviewed by: tegge@	2004-09-19 21:20:01 +00:00
Alan Cox	8478ea241b	Remove an outdated assertion from _pmap_allocpte(). (When vm_page_alloc() succeeds, the page's queue field is unconditionally set to PQ_NONE by vm_pageq_remove_nowakeup().)	2004-09-19 02:39:31 +00:00
Alan Cox	7580b56bdc	Release the page queues lock earlier in pmap_protect() and pmap_remove() in order to reduce contention.	2004-09-18 22:56:58 +00:00
Poul-Henning Kamp	7ce1979be6	Add new a function isa_dma_init() which returns an errno when it fails and which takes a M_WAITOK/M_NOWAIT flag argument. Add compatibility isa_dmainit() macro which whines loudly if isa_dma_init() fails. Problem uncovered by: tegge	2004-09-15 12:09:50 +00:00
Poul-Henning Kamp	5757a0b985	Remove now unused #include files.	2004-09-15 12:02:35 +00:00
Alan Cox	031102cc7b	Use an atomic op to update the pte in pmap_protect(). This is to prevent the loss of a page modified (PG_M) bit in a race between processors. Quoting Tor: One scenario where the old code could cause a lost PG_M bit is a multithreaded linux program (or FreeBSD program using the linuxthreads port) where one thread was starting a subprocess. The thread doing fork() would call vmspace_fork(), which would then call vm_map_copy_entry() which would call pmap_protect() on an area possibly accessed by other threads. Additionally, make the clearing of PG_M by pmap_protect() unconditional if write permission is removed. Previously, PG_M could persist on a read-only unmanaged page. That seems inconsistent and confusing. In collaboration with: tegge@ MT5 candidate PR: 61852	2004-09-12 20:20:40 +00:00
Scott Long	9e0c3bdf64	Double the number of kernel page tables for amd64 and for i386/PAE. The old value was only enough for 8GB of RAM, the new value can do 16GB. This still isn't optimal since it doesn't scale. Fixing this for amd64 looks to be fairly easy, but for i386 will be quite difficult. Reviewed by: peter	2004-09-11 01:31:26 +00:00
Bill Paul	a07bd003bf	Add device driver support for the VIA Networking Technologies VT6122 gigabit ethernet chip and integrated 10/100/1000 copper PHY. The vge driver has been added to GENERIC for i386, pc98 and amd64, but not to sparc or ia64 since I don't have the ability to test it there. The vge(4) driver supports VLANs, checksum offload and jumbo frames. Also added the lge(4) and nge(4) drivers to GENERIC for i386 and pc98 since I was in the neighborhood. There's no reason to leave them out anymore.	2004-09-10 20:57:46 +00:00
Alan Cox	e232eb8288	Use atomic ops in pmap_clear_ptes() to prevent SMP races that could result in the loss of an accessed or modified bit from the pte. In collaboration with: tegge@ MT5 candidate	2004-09-08 18:58:29 +00:00
Scott Long	50736a153b	Fix a problem with tag->boundary inheritence that has existed since day one and was propagated to nearly every platform. The boundary of the child needs to consider the boundary of the parent and pick the minimum of the two, not the maximum. However, if either is 0 then pick the appropriate one. This bug was exposed by a recent change to ATA, which should now be fixed by this change. The alignment and maxsegsz tag attributes likely also need a similar review in the near future. This is a MT5 candidate. Reviewed by: marcel Submitted by: sos (in part)	2004-09-08 04:54:19 +00:00
Scott Long	444ba94513	Switch the default scheduler to 4BSD to match what will go into RELENG_5 soon. It can be switched back once 5.3 is tested and released. Also turn on PREEMPTION as many of the stability problems with it have been fixed. MT5: 3 days.	2004-09-07 22:37:43 +00:00
Julian Elischer	ed062c8d66	Refactor a bunch of scheduler code to give basically the same behaviour but with slightly cleaned up interfaces. The KSE structure has become the same as the "per thread scheduler private data" structure. In order to not make the diffs too great one is #defined as the other at this time. The KSE (or td_sched) structure is now allocated per thread and has no allocation code of its own. Concurrency for a KSEGRP is now kept track of via a simple pair of counters rather than using KSE structures as tokens. Since the KSE structure is different in each scheduler, kern_switch.c is now included at the end of each scheduler. Nothing outside the scheduler knows the contents of the KSE (aka td_sched) structure. The fields in the ksegrp structure that are to do with the scheduler's queueing mechanisms are now moved to the kg_sched structure. (per ksegrp scheduler private data structure). In other words how the scheduler queues and keeps track of threads is no-one's business except the scheduler's. This should allow people to write experimental schedulers with completely different internal structuring. A scheduler call sched_set_concurrency(kg, N) has been added that notifies teh scheduler that no more than N threads from that ksegrp should be allowed to be on concurrently scheduled. This is also used to enforce 'fainess' at this time so that a ksegrp with 10000 threads can not swamp a the run queue and force out a process with 1 thread, since the current code will not set the concurrency above NCPU, and both schedulers will not allow more than that many onto the system run queue at a time. Each scheduler should eventualy develop their own methods to do this now that they are effectively separated. Rejig libthr's kernel interface to follow the same code paths as linkse for scope system threads. This has slightly hurt libthr's performance but I will work to recover as much of it as I can. Thread exit code has been cleaned up greatly. exit and exec code now transitions a process back to 'standard non-threaded mode' before taking the next step. Reviewed by: scottl, peter MFC after: 1 week	2004-09-05 02:09:54 +00:00
Scott Long	9923b511ed	Turn PREEMPTION into a kernel option. Make sure that it's defined if FULL_PREEMPTION is defined. Add a runtime warning to ULE if PREEMPTION is enabled (code inspired by the PREEMPTION warning in kern_switch.c). This is a possible MT5 candidate.	2004-09-02 18:59:15 +00:00
Julian Elischer	6804a3ab6d	Give the 4bsd scheduler the ability to wake up idle processors when there is new work to be done. MFC after: 5 days	2004-09-01 06:42:02 +00:00
Julian Elischer	2630e4c90c	Give setrunqueue() and sched_add() more of a clue as to where they are coming from and what is expected from them. MFC after: 2 days	2004-09-01 02:11:28 +00:00
Julian Elischer	5995adc206	Remove an unneeded argument.. The removed argument could trivially be derived from the remaining one. That in turn should be the same as curthread, but it is possible that curthread could be expensive to derive on some syste,s so leave it as an argument. Having both proc and thread as an argumen tjust gives an opportunity for them to get out sync. MFC after: 3 days	2004-08-31 07:34:54 +00:00
Julian Elischer	99e9dcb817	Remove sched_free_thread() which was only used in diagnostics. It has outlived its usefulness and has started causing panics for people who turn on DIAGNOSTIC, in what is otherwise good code. MFC after: 2 days	2004-08-31 06:12:13 +00:00
Peter Wemm	5fe155d0fd	Add the mp_watchdog hooks, although it locks up my SMP test box. It might be useable to somebody.	2004-08-30 23:33:33 +00:00
Alan Cox	bfa15df9ba	Remove unnecessary check for curthread == NULL.	2004-08-30 03:52:05 +00:00
David E. O'Brien	dd68efd05b	s/smp_rv_mtx/smp_ipi_mtx/g Requested by: jhb	2004-08-28 00:49:55 +00:00
Tilman Keskinoz	0039290c1f	Fix a comment, IA32 was renamed to COMPAT_IA32 Approved by: marcel	2004-08-27 21:29:20 +00:00
Marcel Moolenaar	0f2fe153bc	Move the kernel-specific logic to adjust frompc from MI to MD. For these two reasons: 1. On ia64 a function pointer does not hold the address of the first instruction of a functions implementation. It holds the address of a function descriptor. Hence the user(), btrap(), eintr() and bintr() prototypes are wrong for getting the actual code address. 2. The logic forces interrupt, trap and exception entry points to be layed-out contiguously. This can not be achieved on ia64 and is generally just bad programming. The MCOUNT_FROMPC_USER macro is used to set the frompc argument to some kernel address which represents any frompc that falls outside the kernel text range. The macro can expand to ~0U to bail out in that case. The MCOUNT_FROMPC_INTR macro is used to set the frompc argument to some kernel address to represent a call to a trap or interrupt handler. This to avoid that the trap or interrupt handler appear to be called from everywhere in the call graph. The macro can expand to ~0U to prevent adjusting frompc. Note that the argument is selfpc, not frompc. This commit defines the macros on all architectures equivalently to the original code in sys/libkern/mcount.c. People can take it from here... Compile-tested on: alpha, amd64, i386, ia64 and sparc64 Boot-tested on: i386	2004-08-27 19:42:35 +00:00
Alan Cox	8991a235cb	The machine-independent parts of the virtual memory system always pass a valid pmap to the pmap functions that require one. Remove the checks for NULL. (These checks have their origins in the Mach pmap.c that was integrated into BSD. None of the new code written specifically for FreeBSD included them.)	2004-08-27 19:06:17 +00:00
Andre Oppermann	c21fd23260	Always compile PFIL_HOOKS into the kernel and remove the associated kernel compile option. All FreeBSD packet filters now use the PFIL_HOOKS API and thus it becomes a standard part of the network stack. If no hooks are connected the entire packet filter hooks section and related activities are jumped over. This removes any performance impact if no hooks are active. Both OpenBSD and DragonFlyBSD have integrated PFIL_HOOKS permanently as well.	2004-08-27 15:16:24 +00:00
John Baldwin	ef36ad6921	Correct the arguments to kern_sigaltstack() as they were reversed. PR: kern/68079 Submitted by: Georg-W. Koltermann gwk at rahn-koltermann dot de	2004-08-24 20:52:52 +00:00
Nate Lawson	dc6851d588	Catch up with i386 nexus.c rev 1.59: add bus_get_resource_list().	2004-08-24 19:22:54 +00:00
Peter Wemm	648bfe0b75	It is now an error to call pmap_unuse_pt without the paddr of the pde that contained the pte.	2004-08-24 00:17:52 +00:00
Peter Wemm	a32f55d629	Oops, I forgot to have the idle loop call mp_grab_cpu_hlt() on the amd64 SMP case.	2004-08-24 00:16:43 +00:00
Peter Wemm	f1009e1e1f	Commit Doug White and Alan Cox's fix for the cross-ipi smp deadlock. We were obtaining different spin mutexes (which disable interrupts after aquisition) and spin waiting for delivery. For example, KSE processes do LDT operations which use smp_rendezvous, while other parts of the system are doing things like tlb shootdowns with a different mutex. This patch uses the common smp_rendezvous mutex for all MD home-grown IPIs that spinwait for delivery. Having the single mutex means that the spinloop to aquire it will enable interrupts periodically, thus avoiding the cross-ipi deadlock. Obtained from: dwhite, alc Reviewed by: jhb	2004-08-23 21:39:29 +00:00
Peter Wemm	1c0dea0f6e	Sync with i386 - Optimize intr_execute_handlers a bit etc.	2004-08-16 23:12:30 +00:00
Peter Wemm	a9cd97ba05	Sync with i386 - remove unused includes	2004-08-16 23:10:46 +00:00
Peter Wemm	e88022749d	Sync with i386 - get the softc via the devclass rather than caching the dev	2004-08-16 23:10:18 +00:00
Peter Wemm	deefc3c4f5	Sync with i386 - add ADAPTIVE_GIANT, remove pcic	2004-08-16 22:59:24 +00:00
Peter Wemm	9727279886	Sync with i386 - add foot shooting protection for the DDB/KDB thing.	2004-08-16 22:57:47 +00:00

1 2 3 4 5 ...

4207 Commits