freebsd-nq

Author	SHA1	Message	Date
Alan Cox	0cd31a0d75	Change the page's CLEANCHK flag from being a page queue mutex synchronized flag to a vm object mutex synchronized flag.	2007-02-22 06:15:52 +00:00
Alan Cox	711585d087	Enable vm_page_free() and vm_page_free_zero() to be called on some pages without the page queues lock being held, specifically, pages that are not contained in a vm object and not a member of a page queue.	2007-02-18 05:54:42 +00:00
Alan Cox	ba000fb2c1	Remove a stale comment. Add punctuation to a nearby comment.	2007-02-17 19:37:00 +00:00
Alan Cox	d3d029bd62	Relax the page queue lock assertions in vm_page_remove() and vm_page_free_toq() to account for recent changes that allow vm_page_free_toq() to be called on some pages without the page queues lock being held, specifically, pages that are not contained in a vm object and not a member of a page queue. (Examples of such pages include page table pages, pv entry pages, and uma small alloc pages.)	2007-02-15 05:43:38 +00:00
Alan Cox	7d60988bad	Avoid the unnecessary acquisition of the free page queues lock when a page is actually being added to the hold queue, not the free queue. At the same time, avoid unnecessary tests to wake up threads waiting for free memory and the idle thread that zeroes free pages. (These tests will be performed later when the page finally moves from the hold queue to the free queue.)	2007-02-14 07:05:55 +00:00
Robert Watson	1e319f6db3	Add uma_set_align() interface, which will be called at most once during boot by MD code to indicate the detected alignment preference. Rather than cache alignment being encoded in UMA consumers by defining a global alignment value of (16 - 1) in UMA_ALIGN_CACHE, UMA_ALIGN_CACHE is now a special value (-1) that causes UMA to look at registered alignment. If no preferred alignment has been selected by MD code, a default alignment of (16 - 1) will be used. Currently, no hardware platforms specify alignment; architecture maintainers will need to modify MD startup code to specify an alignment if desired. This must occur before initialization of UMA so that all UMA zones pick up the requested alignment. Reviewed by: jeff, alc Submitted by: attilio	2007-02-11 20:13:52 +00:00
Alan Cox	5351a2488a	Use the free page queue mutex instead of the page queue mutex to synchronize sleeping and waking of the zero idle thread.	2007-02-11 05:18:40 +00:00
John Baldwin	e8865caffb	- Move 'struct swdevt' back into swap_pager.h and expose it to userland. - Restore support for fetching swap information from crash dumps via kvm_get_swapinfo(3) to fix pstat -T/-s on crash dumps. Reviewed by: arch@, phk MFC after: 1 week	2007-02-07 17:43:11 +00:00
Alan Cox	e9f995d824	Change the pagedaemon, vm_wait(), and vm_waitpfault() to sleep on the vm page queue free mutex instead of the vm page queue mutex.	2007-02-07 06:37:30 +00:00
Alan Cox	3ae3919d0b	Change the free page queue lock from a spin mutex to a default (blocking) mutex. With the demise of Alpha support, there is no longer a reason for it to be a spin mutex.	2007-02-05 06:02:55 +00:00
Mohan Srinivasan	6c125b8df6	Fix for problems that occur when all mbuf clusters migrate to the mbuf packet zone. Cluster allocations fail when this happens. Also processes that may have blocked on cluster allocations will never be woken up. Thanks to rwatson for an overview of the issue and pointers to the mbuma paper and his tool to dump out UMA zones. Reviewed by: andre@	2007-01-25 01:05:23 +00:00
Mohan Srinivasan	7738029183	Fix for a bug where only one process (of multiple) blocked on maxpages on a zone is woken up, with the rest never being woken up as a result of the ZFLAG_FULL flag being cleared. Wake up all such blocked processes instead. This change introduces a thundering herd, but since this should be relatively infrequent, optimizing this (by introducing a count of blocked processes, for example) may be premature. Reviewed by: ups@	2007-01-24 22:49:11 +00:00
Jeff Roberson	f0393f063a	- Remove setrunqueue and replace it with direct calls to sched_add(). setrunqueue() was mostly empty. The few asserts and thread state setting were moved to the individual schedulers. sched_add() was chosen to displace it for naming consistency reasons. - Remove adjustrunqueue, it was 4 lines of code that was ifdef'd to be different on all three schedulers where it was only called in one place each. - Remove the long ifdef'd out remrunqueue code. - Remove the now redundant ts_state. Inspect the thread state directly. - Don't set TSF_* flags from kern_switch.c, we were only doing this to support a feature in one scheduler. - Change sched_choose() to return a thread rather than a td_sched. Also, rely on the schedulers to return the idlethread. This simplifies the logic in choosethread(). Aside from the run queue links kern_switch.c mostly does not care about the contents of td_sched. Discussed with: julian - Move the idle thread loop into the per scheduler area. ULE wants to do something different from the other schedulers. Suggested by: jhb Tested on: x86/amd64 sched_{4BSD, ULE, CORE}.	2007-01-23 08:46:51 +00:00
Xin LI	f67af5c918	Use FOREACH_PROC_IN_SYSTEM instead of using its unrolled form.	2007-01-17 15:05:52 +00:00
Robert Watson	635fd50514	Remove uma_zalloc_arg() hack, which coerced M_WAITOK to M_NOWAIT when allocations were made using improper flags in interrupt context. Replace with a simple WITNESS warning call. This restores the invariant that M_WAITOK allocations will always succeed or die horribly trying, which is relied on by many UMA consumers. MFC after: 3 weeks Discussed with: jhb	2007-01-10 21:04:43 +00:00
Alan Cox	e6eaadba43	Declare the map entry created by kmem_init() for the range from VM_MIN_KERNEL_ADDRESS to the end of the kernel's bootstrap data as MAP_NOFAULT.	2007-01-07 07:32:04 +00:00
John Baldwin	663b416f16	- Add a new function uma_zone_exhausted() to see if a zone is full. - Add a printf in swp_pager_meta_build() to warn if the swapzone becomes exhausted so that there's at least a warning before a box that runs out of swapzone space before running out of swap space deadlocks. MFC after: 1 week Reviewed by: alc	2007-01-05 19:09:01 +00:00
Alan Cox	73000556e8	Optimize vm_object_split(). Specifically, make the number of iterations equal to the number of physical pages that are renamed to the new object rather than the new object's virtual size.	2006-12-17 20:14:43 +00:00
Alan Cox	95442adf05	Simplify the computation of the new object's size in vm_object_split().	2006-12-16 08:17:07 +00:00
Kip Macy	35d10226b7	Remove the requirement that phys_avail be sorted in ascending order by explicitly finding the lowest and highest addresses when calculating the size of the vm_pages array. Reviewed by: alc	2006-12-08 08:44:47 +00:00
Julian Elischer	ad1e7d285a	Threading cleanup.. part 2 of several. Make part of John Birrell's KSE patch permanent.. Specifically, remove: Any reference of the ksegrp structure. This feature was never fully utilised and made things overly complicated. All code in the scheduler that tried to make threaded programs fair to unthreaded programs. Libpthread processes will already do this to some extent and libthr processes already disable it. Also: Since this makes such a big change to the scheduler(s), take the opportunity to rename some structures and elements that had to be moved anyhow. This makes the code a lot more readable. The ULE scheduler compiles again but I have no idea if it works. The 4bsd scheduler still requires a little cleaning and some functions that now do ALMOST nothing will go away, but I thought I'd do that as a separate commit. Tested by David Xu, and Dan Eischen using libthr and libpthread.	2006-12-06 06:34:57 +00:00
Ruslan Ermilov	9bed18a493	The clean_map has been made local to vm_init.c long ago.	2006-11-20 16:23:34 +00:00
Ruslan Ermilov	ef1b7c4804	Remove a redundant pointer-type variable.	2006-11-20 08:33:55 +00:00
Ruslan Ermilov	276096bb3e	When counting vm totals, skip unreferenced objects, including vnodes representing mounted file systems. Reviewed by: alc MFC after: 3 days	2006-11-20 00:16:00 +00:00
Alan Cox	0f3b612a06	There is no point in setting PG_REFERENCED on kmem_object pages because they are "unmanaged", i.e., non-pageable, pages. Remove a stale comment.	2006-11-13 00:27:02 +00:00
Alan Cox	44b8bd66f9	Make pmap_enter() responsible for setting PG_WRITEABLE instead of its caller. (As a beneficial side-effect, a high-contention acquisition of the page queues lock in vm_fault() is eliminated.)	2006-11-12 21:48:34 +00:00
Alan Cox	49c3b92531	I misplaced the assertion that was added to vm_page_startup() in the previous change. Correct its placement.	2006-11-08 19:11:54 +00:00
Alan Cox	9ad3296a25	Simplify the construction of the free queues in vm_page_startup(). Add an assertion to test a hypothesis concerning other redundant computation in vm_page_startup().	2006-11-08 18:43:47 +00:00
Alan Cox	815bc69fb0	Ensure that the page's oflags field is initialized by contigmalloc().	2006-11-08 06:23:29 +00:00
Robert Watson	acd3428b7d	Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>	2006-11-06 13:42:10 +00:00
John Birrell	8460a577a4	Make KSE a kernel option, turned on by default in all GENERIC kernel configs except sun4v (which doesn't process signals properly with KSE). Reviewed by: davidxu@	2006-10-26 21:42:22 +00:00
Robert Watson	ae4e9636ac	Better align output of "show uma" by moving from displaying the basic counters of allocs/frees/use for each zone to the same statistics shown by userspace "vmstat -z". MFC after: 3 days	2006-10-26 12:55:32 +00:00
Alan Cox	66bdd5d619	The page queues lock is no longer required by vm_page_wakeup().	2006-10-23 05:27:31 +00:00
Alan Cox	2a53696fb8	The page queues lock is no longer required by vm_page_busy() or vm_page_wakeup(). Reduce or eliminate its use accordingly.	2006-10-22 21:18:48 +00:00
Robert Watson	aed5570872	Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead. This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd. Obtained from: TrustedBSD Project Sponsored by: SPARTA	2006-10-22 11:52:19 +00:00
Alan Cox	9af80719db	Replace PG_BUSY with VPO_BUSY. In other words, changes to the page's busy flag, i.e., VPO_BUSY, are now synchronized by the per-vm object lock instead of the global page queues lock.	2006-10-22 04:28:14 +00:00
Alan Cox	9fea8cad08	Eliminate unnecessary PG_BUSY tests. They originally served a purpose that is now handled by vm object locking.	2006-10-21 21:02:04 +00:00
Alan Cox	7e2393ff51	Long ago, revision 1.22 of vm/vm_pager.h introduced a bug. Specifically, it introduced a check after the call to file system's get pages method that assumes that the get pages method does not change the array of pages that is passed to it. In the case of vnode_pager_generic_getpages(), this assumption has been incorrect. The contents of the array of pages may be shifted by vnode_pager_generic_getpages(). Likely, the problem has been hidden by vnode_pager_haspage() limiting the set of pages that are passed to vnode_pager_generic_getpages() such that a shift never occurs. The fix implemented herein is to adjust the pointer to the array of pages rather than shifting the pages within the array. MFC after: 3 weeks Fix suggested by: tegge	2006-10-14 23:21:48 +00:00
Alan Cox	bff763439b	Change vnode_pager_addr() such that on returning it distinguishes between an error returned by VOP_BMAP() and a hole in the file. Change the callers to vnode_pager_addr() such that they return VM_PAGER_ERROR when VOP_BMAP fails instead of a zero-filled page. Reviewed by: tegge MFC after: 3 weeks	2006-10-14 22:09:03 +00:00
Kip Macy	600c53adf9	sun4v requires TSBs (translation storage buffers) to be contiguous and size aligned, requiring heavy usage of vm_page_alloc_contig. This change makes vm_page_alloc_contig SMP safe. Approved by: scottl (acting as backup for mentor rwatson)	2006-10-12 04:41:39 +00:00
Alan Cox	1de11f1af3	Distinguish between two distinct kinds of errors from VOP_BMAP() in vnode_pager_generic_getpages(): (1) that VOP_BMAP() is unsupported by the underlying file system and (2) an error in performing the VOP_BMAP(). Previously, vnode_pager_generic_getpages() assumed that all errors were of the first type. If, in fact, the error was of the second type, the likely outcome was for the process to become permanently blocked on a busy page. MFC after: 3 weeks Reviewed by: tegge	2006-10-10 18:26:18 +00:00
Alan Cox	f4f83da02d	Change vnode_pager_generic_getpages() so that it does not panic if the given file is sparse. Instead, it zeroes the requested page. Reviewed by: tegge PR: kern/98116 MFC after: 3 days	2006-10-08 20:26:16 +00:00
Ken Smith	a9a5d47c85	Fix two minor style(9) nits in v1.313 which were noticed during an MFC review. alc@ will be MFCing V1.313 plus style fix to RELENG_6.	2006-09-29 00:20:56 +00:00
Alan Cox	e1cb7bc081	Make vm_page_release_contig() static.	2006-09-03 22:24:08 +00:00
Alan Cox	eb4bbba83a	Refactor vm_page_sleep_if_busy() so that the test for a busy page is inlined and a procedure call is made in the rare case, i.e., when it is necessary to sleep. In this case, inlining the test actually makes the kernel smaller.	2006-08-27 19:50:13 +00:00
Alan Cox	1f081553cc	Prevent a call to contigmalloc() that asks for more physical memory than the machine has from causing a panic. Submitted by: Michael Plass PR: 101668 MFC after: 3 days	2006-08-26 02:43:23 +00:00
Alan Cox	09ef0d6e0c	The return value from vm_pageq_add_new_page() is not used. Eliminate it.	2006-08-25 04:36:19 +00:00
Alan Cox	b276ae6f6a	Add _vm_stats and _vm_stats_misc to the sysctl declarations in sysctl.h and eliminate their declarations from various source files.	2006-08-21 06:27:28 +00:00
Alan Cox	38498c2123	vm_page_zero_idle()'s return value serves no purpose. Eliminate it.	2006-08-21 00:55:05 +00:00
Alan Cox	4f9d17d8ab	Page flags are reset on (re)allocation. There is no need to clear any flags except for PG_ZERO in vm_page_free_toq().	2006-08-21 00:34:31 +00:00
Alan Cox	b146f9e5d2	Reimplement the page's NOSYNC flag as an object-synchronized instead of a page queues-synchronized flag. Reduce the scope of the page queues lock in vm_fault() accordingly. Move vm_fault()'s call to vm_object_set_writeable_dirty() outside of the scope of the page queues lock. Reviewed by: tegge Additionally, eliminate an unnecessary dereference in computing the argument that is passed to vm_object_set_writeable_dirty().	2006-08-13 00:11:09 +00:00
Alan Cox	25017df472	Ensure that the page's new field for object-synchronized flags is always initialized to zero. Call vm_page_sleep_if_busy() instead of duplicating its implementation in vm_page_grab().	2006-08-11 17:18:58 +00:00
Alan Cox	75db2abb2e	Change vm_page_cowfault() so that it doesn't allocate a pre-busied page.	2006-08-10 04:48:29 +00:00
Alan Cox	5786be7cc7	Introduce a field to struct vm_page for storing flags that are synchronized by the lock on the object containing the page. Transition PG_WANTED and PG_SWAPINPROG to use the new field, eliminating the need for holding the page queues lock when setting or clearing these flags. Rename PG_WANTED and PG_SWAPINPROG to VPO_WANTED and VPO_SWAPINPROG, respectively. Eliminate the assertion that the page queues lock is held in vm_page_io_finish(). Eliminate the acquisition and release of the page queues lock around calls to vm_page_io_finish() in kern_sendfile() and vfs_unbusy_pages().	2006-08-09 17:43:27 +00:00
Alan Cox	e7e56b2889	Eliminate the acquisition and release of the page queues lock around a call to vm_page_sleep_if_busy().	2006-08-06 00:17:17 +00:00
Alan Cox	e74814b66a	Change vm_page_sleep_if_busy() so that it no longer requires the caller to hold the page queues lock.	2006-08-06 00:15:40 +00:00
Alan Cox	ab1661cb76	Remove a stale comment.	2006-08-05 19:07:07 +00:00
Alan Cox	91449ce98c	When sleeping on a busy page, use the lock from the containing object rather than the global page queues lock.	2006-08-03 23:56:11 +00:00
Alan Cox	78985e424a	Complete the transition from pmap_page_protect() to pmap_remove_write(). Originally, I had adopted sparc64's name, pmap_clear_write(), for the function that is now pmap_remove_write(). However, this function is more like pmap_remove_all() than like pmap_clear_modify() or pmap_clear_reference(), hence, the name change. The higher-level rationale behind this change is described in src/sys/amd64/amd64/pmap.c revision 1.567. The short version is that I'm trying to clean up and fix our support for execute access. Reviewed by: marcel@ (ia64)	2006-08-01 19:06:06 +00:00
Alan Cox	604c2bbc34	Export the number of object bypasses and collapses through sysctl.	2006-07-22 22:31:57 +00:00
Alan Cox	2cf139527c	Retire debug.mpsafevm. None of the architectures supported in CVS require it any longer.	2006-07-21 23:22:49 +00:00
Alan Cox	af51d7bf57	Eliminate OBJ_WRITEABLE. It hasn't been used in a long time.	2006-07-21 06:40:29 +00:00
Alan Cox	3cad40e517	Add pmap_clear_write() to the interface between the virtual memory system's machine-dependent and machine-independent layers. Once pmap_clear_write() is implemented on all of our supported architectures, I intend to replace all calls to pmap_page_protect() by calls to pmap_clear_write(). Why? Both the use and implementation of pmap_page_protect() in our virtual memory system has subtle errors, specifically, the management of execute permission is broken on some architectures. The "prot" argument to pmap_page_protect() should behave differently from the "prot" argument to other pmap functions. Instead of meaning, "give the specified access rights to all of the physical page's mappings," it means "don't take away the specified access rights from all of the physical page's mappings, but do take away the ones that aren't specified." However, owing to our i386 legacy, i.e., no support for no-execute rights, all but one invocation of pmap_page_protect() specifies VM_PROT_READ only, when the intent is, in fact, to remove only write permission. Consequently, a faithful implementation of pmap_page_protect(), e.g., ia64, would remove execute permission as well as write permission. On the other hand, some architectures that support execute permission have basically ignored whether or not VM_PROT_EXECUTE is passed to pmap_page_protect(), e.g., amd64 and sparc64. This change represents the first step in replacing pmap_page_protect() by the less subtle pmap_clear_write() that is already implemented on amd64, i386, and sparc64. Discussed with: grehan@ and marcel@	2006-07-20 17:48:41 +00:00
Robert Watson	a0d4b0aeaa	Fix build of uma_core.c when DDB is not compiled into the kernel by making uma_zone_sumstat() ifdef DDB, as it's only used with DDB now. Submitted by: Wolfram Fenske <Wolfram.Fenske at Student.Uni-Magdeburg.DE>	2006-07-18 01:13:18 +00:00
Alan Cox	2e9f4a698d	Ensure that vm_object_deallocate() doesn't dereference a stale object pointer: When vm_object_deallocate() sleeps because of a non-zero paging in progress count on either object or object's shadow, vm_object_deallocate() must ensure that object is still the shadow's backing object when it reawakens. In fact, object may have been deallocated while vm_object_deallocate() slept. If so, reacquiring the lock on object can lead to a deadlock. Submitted by: ups@ MFC after: 3 weeks	2006-07-17 06:45:03 +00:00
Robert Watson	eabadd9e4b	Remove sysctl_vm_zone() and vm.zone sysctl from 7.x. As of 6.x, libmemstat(3) is used by vmstat (and friends) to produce more accurate and more detailed statistics information in a machine-readable way, and vmstat continues to provide the same text-based front-end. This change should not be MFC'd.	2006-07-16 22:53:26 +00:00
Alan Cox	ff5ff76116	Set debug.mpsafevm to true on PowerPC. (Now, by default, all architectures in CVS have debug.mpsafevm set to true.) Tested by: grehan@	2006-07-10 07:08:05 +00:00
John Baldwin	9bdaa43379	Move the code to handle the vm.blacklist tunable up a layer into vm_page_startup(). As a result, we now only lookup the tunable once instead of looking it up once for every physical page of memory in the system. This cuts out about a 1 second or so delay in boot on x86 systems. The delay is much larger and more noticeable on sun4v apparently. Reported by: kmacy MFC after: 1 week	2006-06-23 16:44:24 +00:00
Konstantin Belousov	455dd7d4c7	Make the mincore(2) return ENOMEM when requested range is not fully mapped. Requested by: Bruno Haible <bruno at clisp org> Reviewed by: alc Approved by: pjd (mentor) MFC after: 1 month	2006-06-21 12:59:05 +00:00
Alan Cox	379fb6429d	Use ptoa(psize) instead of size to compute the end of the mapping in vm_map_pmap_enter().	2006-06-17 08:45:01 +00:00
Stephan Uphoff	2053c12705	Remove mpte optimization from pmap_enter_quick(). There is a race with the current locking scheme and removing it should have no measurable performance impact. This fixes page faults leading to panics in pmap_enter_quick_locked() on amd64/i386. Reviewed by: alc,jhb,peter,ps	2006-06-15 01:01:06 +00:00
Alan Cox	d2d9e24a89	Correct an error in the previous revision that could lead to a panic: Found mapped cache page. Specifically, if cnt.v_free_count dips below cnt.v_free_reserved after p_start has been set to a non-NULL value, then vm_map_pmap_enter() would break out of the loop and incorrectly call pmap_enter_object() for the remaining address range. To correct this error, this revision truncates the address range so that pmap_enter_object() will not map any cache pages. In collaboration with: tegge@ Reported by: kris@	2006-06-14 17:48:45 +00:00
Alan Cox	e4c7a7b169	Enable debug.mpsafevm on arm by default. Tested by: cognet@	2006-06-10 05:29:37 +00:00
Alan Cox	ce142d9ec0	Introduce the function pmap_enter_object(). It maps a sequence of resident pages from the same object. Use it in vm_map_pmap_enter() to reduce the locking overhead of premapping objects. Reviewed by: tegge@	2006-06-05 20:35:27 +00:00
Paul Saab	4cbb1c1aaa	Fix minidumps to include pages allocated via pmap_map on amd64. These pages are allocated from the direct map, and were not previously tracked. This included the vm_page_array and the early UMA bootstrap pages. Reviewed by: peter	2006-05-31 22:55:23 +00:00
Tor Egge	57051fdc4b	Close race between vmspace_exitfree() and exit1() and races between vmspace_exitfree() and vmspace_free() which could result in the same vmspace being freed twice. Factor out part of exit1() into new function vmspace_exit(). Attach to vmspace0 to allow old vmspace to be freed earlier. Add new function, vmspace_acquire_ref(), for obtaining a vmspace reference for a vmspace belonging to another process. Avoid changing vmspace refcount from 0 to 1 since that could also lead to the same vmspace being freed twice. Change vmtotal() and swapout_procs() to use vmspace_acquire_ref(). Reviewed by: alc	2006-05-29 21:28:56 +00:00
Robert Watson	4f538c7480	When allocating a bucket to hold a free'd item in UMA fails, don't report this as an allocation failure for the item type. The failure will be separately recorded with the bucket type. This may eliminate high mbuf allocation failure counts under some circumstances, which can be alarming in appearance, but not actually a problem in practice. MFC after: 2 weeks Reported by: ps, Peter J. Blok <pblok at bsd4all dot org>, OxY <oxy at field dot hu>, Gabor MICSKO <gmicskoa at szintezis dot hu>	2006-05-21 23:25:32 +00:00
Alan Cox	8f8790a76d	Simplify the implementation of vm_fault_additional_pages() based upon the object's memq being ordered. Specifically, replace repeated calls to vm_page_lookup() by two simple constant-time operations. Reviewed by: tegge	2006-05-13 20:05:44 +00:00
Pawel Jakub Dawidek	61f73c79da	Use better order here.	2006-05-10 06:50:44 +00:00
Alan Cox	fda28c1440	Add synchronization to vm_pageq_add_new_page() so that it can be called safely after kernel initialization. Remove GIANT_REQUIRED. MFC after: 6 weeks	2006-04-25 17:27:24 +00:00
Tom Rhodes	89eae00b84	It seems that POSIX would rather ENODEV returned in place of EINVAL when trying to mmap() an fd that isn't a normal file. Reference: http://www.opengroup.org/onlinepubs/009695399/functions/mmap.html Submitted by: fanf	2006-04-21 07:17:25 +00:00
Peter Wemm	c0345a84aa	Introduce minidumps. Full physical memory crash dumps are still available via the debug.minidump sysctl and tunable. Traditional dumps store all physical memory. This was once a good thing when machines had a maximum of 64M of ram and 1GB of kvm. These days, machines often have many gigabytes of ram and a smaller amount of kvm. libkvm+kgdb don't have a way to access physical ram that is not mapped into kvm at the time of the crash dump, so the extra ram being dumped is mostly wasted. Minidumps invert the process. Instead of dumping physical memory in order to guarantee that all of kvm's backing is dumped, minidumps instead dump only memory that is actively mapped into kvm. amd64 has a direct map region that things like UMA use. Obviously we cannot dump all of the direct map region because that is effectively an old style all-physical-memory dump. Instead, introduce a bitmap and two helper routines (dump_add_page(pa) and dump_drop_page(pa)) that allow certain critical direct map pages to be included in the dump. uma_machdep.c's allocator is the intended consumer. Dumps are a custom format. At the very beginning of the file is a header, then a copy of the message buffer, then the bitmap of pages present in the dump, then the final level of the kvm page table trees (2MB mappings are expanded into 4K page mappings), then the sparse physical pages according to the bitmap. libkvm can now conveniently access the kvm page table entries. Booting my test 8GB machine, forcing it into ddb and forcing a dump leads to a 48MB minidump. While this is a best case, I expect minidumps to be in the 100MB-500MB range. Obviously, never larger than physical memory of course. minidumps are on by default. It would be necessary to turn them off if it were necessary to debug corrupt kernel page table management as that would mess up minidumps as well. Both minidumps and regular dumps are supported on the same machine.	2006-04-21 04:24:50 +00:00
John Baldwin	0f180a7cce	Change msleep() and tsleep() to not alter the calling thread's priority if the specified priority is zero. This avoids a race where the calling thread could read a snapshot of its current priority, then a different thread could change the first thread's priority, then the original thread would call sched_prio() inside msleep() undoing the change made by the second thread. I used a priority of zero as no thread that calls msleep() or tsleep() should be specifying a priority of zero anyway. The various places that passed 'curthread->td_priority' or some variant as the priority now pass 0.	2006-04-17 18:20:38 +00:00
Pawel Jakub Dawidek	0909f38a3c	On shutdown try to turn off all swap devices. This way GEOM providers are properly closed on shutdown. Requested by: ru Reviewed by: alc MFC after: 2 weeks	2006-04-10 10:03:41 +00:00
Peter Wemm	b9eee07e36	Remove the unused sva and eva arguments from pmap_remove_pages().	2006-04-03 21:16:10 +00:00
Joseph Koshy	49874f6ea3	MFP4: Support for profiling dynamically loaded objects. Kernel changes: Inform hwpmc of executable objects brought into the system by kldload() and mmap(), and of their removal by kldunload() and munmap(). A helper function linker_hwpmc_list_objects() has been added to "sys/kern/kern_linker.c" and is used by hwpmc to retrieve the list of currently loaded kernel modules. The unused `MAPPINGCHANGE' event has been deprecated in favour of separate `MAP_IN' and `MAP_OUT' events; this change reduces space wastage in the log. Bump the hwpmc's ABI version to "2.0.00". Teach hwpmc(4) to handle the map change callbacks. Change the default per-cpu sample buffer size to hold 32 samples (up from 16). Increment __FreeBSD_version. libpmc(3) changes: Update libpmc(3) to deal with the new events in the log file; bring the pmclog(3) manual page in sync with the code. pmcstat(8) changes: Introduce new options to pmcstat(8): "-r" (root fs path), "-M" (mapfile name), "-q"/"-v" (verbosity control). Option "-k" now takes a kernel directory as its argument but will also work with the older invocation syntax. Rework string handling in pmcstat(8) to use an opaque type for interned strings. Clean up ELF parsing code and add support for tracking dynamic object mappings reported by a v2.0.00 hwpmc(4). Report statistics at the end of a log conversion run depending on the requested verbosity level. Reviewed by: jhb, dds (kernel parts of an earlier patch) Tested by: gallatin (earlier patch)	2006-03-26 12:20:54 +00:00
Warner Losh	62a59e8f0d	Remove leading __ from __(inline\|const\|signed\|volatile). They are obsolete. This should reduce diffs to NetBSD as well.	2006-03-08 06:31:46 +00:00
Tor Egge	34ef4672d2	Ignore dirty pages owned by "dead" objects.	2006-03-08 00:51:00 +00:00
Tor Egge	3b582b4e72	Eliminate a deadlock when creating snapshots. Blocking vn_start_write() must be called without any vnode locks held. Remove calls to vn_start_write() and vn_finished_write() in vnode_pager_putpages() and add these calls before the vnode lock is obtained to most of the callers that don't already have them.	2006-03-02 22:13:28 +00:00
Tor Egge	6b085058e4	Hold extra reference to vm object while cleaning pages.	2006-03-02 21:38:38 +00:00
John Baldwin	ca95b5146a	Lock the vm_object while checking its type to see if it is a vnode-backed object that requires Giant in vm_object_deallocate(). This is somewhat hairy in that if we can't obtain Giant directly, we have to drop the object lock, then lock Giant, then relock the object lock and verify that we still need Giant. If we don't (because the object changed to OBJT_DEAD for example), then we drop Giant before continuing. Reviewed by: alc Tested by: kris	2006-02-21 22:09:54 +00:00
Tor Egge	625e6c0af4	Expand scope of marker to reduce the number of page queue scan restarts.	2006-02-17 21:02:39 +00:00
Tor Egge	db27dcc0f0	Check return value from nonblocking call to vn_start_write().	2006-02-17 18:22:19 +00:00
Stephan Uphoff	224409590d	When the VM needs to allocate physical memory pages (for non interrupt use) and it does not have plenty of free pages it tries to free pages in the cache queue. Unfortunately freeing a cached page requires the locking of the object that owns the page. However in the context of allocating pages we may not be able to lock the object and thus can only TRY to lock the object. If the locking try fails the cache page can not be freed and is activated to move it out of the way so that we may try to free other cache pages. If all pages in the cache belong to objects that are currently locked the cache queue can be emptied without freeing a single page. This scenario caused two problems: 1) vm_page_alloc always failed allocation when it tried freeing pages from the cache queue and failed to do so. However if there are more than cnt.v_interrupt_free_min pages on the free list it should return pages when requested with priority VM_ALLOC_SYSTEM. Failure to do so can cause resource exhaustion deadlocks. 2) Threads that need to allocate pages spend a lot of time cleaning up the page queue without really getting anything done while the pagedaemon needs to work overtime to refill the cache. This change fixes the first problem. (1) Reviewed by: tegge@	2006-02-15 22:29:53 +00:00
Robert Watson	082dc776db	Skip per-cpu caches associated with absent CPUs when generating a memory statistics record stream via sysctl. MFC after: 3 days	2006-02-11 19:20:56 +00:00
Jeff Roberson	b73f64c484	- Fix silly VI locking that is used to check a single flag. The vnode lock also protects this flag so it is not necessary. - Don't rely on v_mount to detect whether or not we've been recycled, use the more appropriate VI_DOOMED instead. Sponsored by: Isilon Systems, Inc. MFC After: 1 week	2006-02-06 10:14:12 +00:00
Alan Cox	3b7db47d7e	Remove an unnecessary call to pmap_remove_all(). The given page is not mapped because its contents are invalid.	2006-02-04 22:37:10 +00:00
Tor Egge	44ed341759	Adjust old comment (present in rev 1.1) to match changes in rev 1.82. PR: kern/92509 Submitted by: "Bryan Venteicher" <bryanv@daemoninthecloset.org>	2006-02-02 21:55:38 +00:00
Yaroslav Tykhiy	731959b118	Use off_t for file size passed to vnode_create_vobject(). The former type, size_t, was causing truncation to 32 bits on i386, which immediately led to undersizing of VM objects backed by files >4GB. In particular, sendfile(2) was broken for such files. PR: kern/92243 MFC after: 5 days	2006-02-01 12:43:13 +00:00
Jeff Roberson	c05e22d44b	- Install a temporary bandaid in vm_object_reference() that will stop mtx_assert()s from triggering until I find a real long-term solution.	2006-02-01 09:47:02 +00:00
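
Commit 1e319f6db3 above describes a sentinel-based scheme: UMA_ALIGN_CACHE becomes a special value (-1), and UMA substitutes either the alignment registered by MD code or the default of (16 - 1). The following is a minimal sketch of that idea only, not the UMA source; every sketch_* and SKETCH_* name is hypothetical, and the real interface's types and checks may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names: an illustration of the commit message, not uma_core.c. */
#define SKETCH_ALIGN_CACHE   (-1)      /* sentinel: "use the registered alignment" */
#define SKETCH_ALIGN_DEFAULT (16 - 1)  /* default mask named in the commit message */

static int  sketch_cache_align = SKETCH_ALIGN_DEFAULT;
static bool sketch_align_set = false;

/*
 * Assumed to be called at most once by machine-dependent startup code,
 * before any zone is created, as the commit message requires.
 */
static void
sketch_set_align(int align)
{
	assert(!sketch_align_set);            /* "at most once during boot" */
	assert(align != SKETCH_ALIGN_CACHE);  /* the sentinel is not a real mask */
	sketch_cache_align = align;
	sketch_align_set = true;
}

/* Turn the alignment a zone asked for into a concrete alignment mask. */
static int
sketch_resolve_align(int requested)
{
	return (requested == SKETCH_ALIGN_CACHE ? sketch_cache_align : requested);
}
```

With no registration, sketch_resolve_align(SKETCH_ALIGN_CACHE) yields the default mask of 15; after sketch_set_align(63) it yields 63, while an explicit request such as 7 passes through unchanged in both cases.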

2357 Commits