freebsd-skq

Author	SHA1	Message	Date
Alan Cox	d2d9e24a89	Correct an error in the previous revision that could lead to a panic: Found mapped cache page. Specifically, if cnt.v_free_count dips below cnt.v_free_reserved after p_start has been set to a non-NULL value, then vm_map_pmap_enter() would break out of the loop and incorrectly call pmap_enter_object() for the remaining address range. To correct this error, this revision truncates the address range so that pmap_enter_object() will not map any cache pages. In collaboration with: tegge@ Reported by: kris@	2006-06-14 17:48:45 +00:00
Alan Cox	e4c7a7b169	Enable debug.mpsafevm on arm by default. Tested by: cognet@	2006-06-10 05:29:37 +00:00
Alan Cox	ce142d9ec0	Introduce the function pmap_enter_object(). It maps a sequence of resident pages from the same object. Use it in vm_map_pmap_enter() to reduce the locking overhead of premapping objects. Reviewed by: tegge@	2006-06-05 20:35:27 +00:00
Paul Saab	4cbb1c1aaa	Fix minidumps to include pages allocated via pmap_map on amd64. These pages are allocated from the direct map, and were not previous tracked. This included the vm_page_array and the early UMA bootstrap pages. Reviewed by: peter	2006-05-31 22:55:23 +00:00
Tor Egge	57051fdc4b	Close race between vmspace_exitfree() and exit1() and races between vmspace_exitfree() and vmspace_free() which could result in the same vmspace being freed twice. Factor out part of exit1() into new function vmspace_exit(). Attach to vmspace0 to allow old vmspace to be freed earlier. Add new function, vmspace_acquire_ref(), for obtaining a vmspace reference for a vmspace belonging to another process. Avoid changing vmspace refcount from 0 to 1 since that could also lead to the same vmspace being freed twice. Change vmtotal() and swapout_procs() to use vmspace_acquire_ref(). Reviewed by: alc	2006-05-29 21:28:56 +00:00
Robert Watson	4f538c7480	When allocating a bucket to hold a free'd item in UMA fails, don't report this as an allocation failure for the item type. The failure will be separately recorded with the bucket type. This my eliminate high mbuf allocation failure counts under some circumstances, which can be alarming in appearance, but not actually a problem in practice. MFC after: 2 weeks Reported by: ps, Peter J. Blok <pblok at bsd4all dot org>, OxY <oxy at field dot hu>, Gabor MICSKO <gmicskoa at szintezis dot hu>	2006-05-21 23:25:32 +00:00
Alan Cox	8f8790a76d	Simplify the implementation of vm_fault_additional_pages() based upon the object's memq being ordered. Specifically, replace repeated calls to vm_page_lookup() by two simple constant-time operations. Reviewed by: tegge	2006-05-13 20:05:44 +00:00
Pawel Jakub Dawidek	61f73c79da	Use better order here.	2006-05-10 06:50:44 +00:00
Alan Cox	fda28c1440	Add synchronization to vm_pageq_add_new_page() so that it can be called safely after kernel initialization. Remove GIANT_REQUIRED. MFC after: 6 weeks	2006-04-25 17:27:24 +00:00
Tom Rhodes	89eae00b84	It seems that POSIX would rather ENODEV returned in place of EINVAL when trying to mmap() an fd that isn't a normal file. Reference: http://www.opengroup.org/onlinepubs/009695399/functions/mmap.html Submitted by: fanf	2006-04-21 07:17:25 +00:00
Peter Wemm	c0345a84aa	Introduce minidumps. Full physical memory crash dumps are still available via the debug.minidump sysctl and tunable. Traditional dumps store all physical memory. This was once a good thing when machines had a maximum of 64M of ram and 1GB of kvm. These days, machines often have many gigabytes of ram and a smaller amount of kvm. libkvm+kgdb don't have a way to access physical ram that is not mapped into kvm at the time of the crash dump, so the extra ram being dumped is mostly wasted. Minidumps invert the process. Instead of dumping physical memory in in order to guarantee that all of kvm's backing is dumped, minidumps instead dump only memory that is actively mapped into kvm. amd64 has a direct map region that things like UMA use. Obviously we cannot dump all of the direct map region because that is effectively an old style all-physical-memory dump. Instead, introduce a bitmap and two helper routines (dump_add_page(pa) and dump_drop_page(pa)) that allow certain critical direct map pages to be included in the dump. uma_machdep.c's allocator is the intended consumer. Dumps are a custom format. At the very beginning of the file is a header, then a copy of the message buffer, then the bitmap of pages present in the dump, then the final level of the kvm page table trees (2MB mappings are expanded into a 4K page mappings), then the sparse physical pages according to the bitmap. libkvm can now conveniently access the kvm page table entries. Booting my test 8GB machine, forcing it into ddb and forcing a dump leads to a 48MB minidump. While this is a best case, I expect minidumps to be in the 100MB-500MB range. Obviously, never larger than physical memory of course. minidumps are on by default. It would want be necessary to turn them off if it was necessary to debug corrupt kernel page table management as that would mess up minidumps as well. Both minidumps and regular dumps are supported on the same machine.	2006-04-21 04:24:50 +00:00
John Baldwin	0f180a7cce	Change msleep() and tsleep() to not alter the calling thread's priority if the specified priority is zero. This avoids a race where the calling thread could read a snapshot of it's current priority, then a different thread could change the first thread's priority, then the original thread would call sched_prio() inside msleep() undoing the change made by the second thread. I used a priority of zero as no thread that calls msleep() or tsleep() should be specifying a priority of zero anyway. The various places that passed 'curthread->td_priority' or some variant as the priority now pass 0.	2006-04-17 18:20:38 +00:00
Pawel Jakub Dawidek	0909f38a3c	On shutdown try to turn off all swap devices. This way GEOM providers are properly closed on shutdown. Requested by: ru Reviewed by: alc MFC after: 2 weeks	2006-04-10 10:03:41 +00:00
Peter Wemm	b9eee07e36	Remove the unused sva and eva arguments from pmap_remove_pages().	2006-04-03 21:16:10 +00:00
Joseph Koshy	49874f6ea3	MFP4: Support for profiling dynamically loaded objects. Kernel changes: Inform hwpmc of executable objects brought into the system by kldload() and mmap(), and of their removal by kldunload() and munmap(). A helper function linker_hwpmc_list_objects() has been added to "sys/kern/kern_linker.c" and is used by hwpmc to retrieve the list of currently loaded kernel modules. The unused `MAPPINGCHANGE' event has been deprecated in favour of separate `MAP_IN' and `MAP_OUT' events; this change reduces space wastage in the log. Bump the hwpmc's ABI version to "2.0.00". Teach hwpmc(4) to handle the map change callbacks. Change the default per-cpu sample buffer size to hold 32 samples (up from 16). Increment __FreeBSD_version. libpmc(3) changes: Update libpmc(3) to deal with the new events in the log file; bring the pmclog(3) manual page in sync with the code. pmcstat(8) changes: Introduce new options to pmcstat(8): "-r" (root fs path), "-M" (mapfile name), "-q"/"-v" (verbosity control). Option "-k" now takes a kernel directory as its argument but will also work with the older invocation syntax. Rework string handling in pmcstat(8) to use an opaque type for interned strings. Clean up ELF parsing code and add support for tracking dynamic object mappings reported by a v2.0.00 hwpmc(4). Report statistics at the end of a log conversion run depending on the requested verbosity level. Reviewed by: jhb, dds (kernel parts of an earlier patch) Tested by: gallatin (earlier patch)	2006-03-26 12:20:54 +00:00
Warner Losh	62a59e8f0d	Remove leading __ from __(inline\|const\|signed\|volatile). They are obsolete. This should reduce diffs to NetBSD as well.	2006-03-08 06:31:46 +00:00
Tor Egge	34ef4672d2	Ignore dirty pages owned by "dead" objects.	2006-03-08 00:51:00 +00:00
Tor Egge	3b582b4e72	Eliminate a deadlock when creating snapshots. Blocking vn_start_write() must be called without any vnode locks held. Remove calls to vn_start_write() and vn_finished_write() in vnode_pager_putpages() and add these calls before the vnode lock is obtained to most of the callers that don't already have them.	2006-03-02 22:13:28 +00:00
Tor Egge	6b085058e4	Hold extra reference to vm object while cleaning pages.	2006-03-02 21:38:38 +00:00
John Baldwin	ca95b5146a	Lock the vm_object while checking its type to see if it is a vnode-backed object that requires Giant in vm_object_deallocate(). This is somewhat hairy in that if we can't obtain Giant directly, we have to drop the object lock, then lock Giant, then relock the object lock and verify that we still need Giant. If we don't (because the object changed to OBJT_DEAD for example), then we drop Giant before continuing. Reviewed by: alc Tested by: kris	2006-02-21 22:09:54 +00:00
Tor Egge	625e6c0af4	Expand scope of marker to reduce the number of page queue scan restarts.	2006-02-17 21:02:39 +00:00
Tor Egge	db27dcc0f0	Check return value from nonblocking call to vn_start_write().	2006-02-17 18:22:19 +00:00
Stephan Uphoff	224409590d	When the VM needs to allocated physical memory pages (for non interrupt use) and it has not plenty of free pages it tries to free pages in the cache queue. Unfortunately freeing a cached page requires the locking of the object that owns the page. However in the context of allocating pages we may not be able to lock the object and thus can only TRY to lock the object. If the locking try fails the cache page can not be freed and is activated to move it out of the way so that we may try to free other cache pages. If all pages in the cache belong to objects that are currently locked the cache queue can be emptied without freeing a single page. This scenario caused two problems: 1) vm_page_alloc always failed allocation when it tried freeing pages from the cache queue and failed to do so. However if there are more than cnt.v_interrupt_free_min pages on the free list it should return pages when requested with priority VM_ALLOC_SYSTEM. Failure to do so can cause resource exhaustion deadlocks. 2) Threads than need to allocate pages spend a lot of time cleaning up the page queue without really getting anything done while the pagedaemon needs to work overtime to refill the cache. This change fixes the first problem. (1) Reviewed by: tegge@	2006-02-15 22:29:53 +00:00
Robert Watson	082dc776db	Skip per-cpu caches associated with absent CPUs when generating a memory statistics record stream via sysctl. MFC after: 3 days	2006-02-11 19:20:56 +00:00
Jeff Roberson	b73f64c484	- Fix silly VI locking that is used to check a single flag. The vnode lock also protects this flag so it is not necessary. - Don't rely on v_mount to detect whether or not we've been recycled, use the more appropriate VI_DOOMED instead. Sponsored by: Isilon Systems, Inc. MFC After: 1 week	2006-02-06 10:14:12 +00:00
Alan Cox	3b7db47d7e	Remove an unnecessary call to pmap_remove_all(). The given page is not mapped because its contents are invalid.	2006-02-04 22:37:10 +00:00
Tor Egge	44ed341759	Adjust old comment (present in rev 1.1) to match changes in rev 1.82. PR: kern/92509 Submitted by: "Bryan Venteicher" <bryanv@daemoninthecloset.org>	2006-02-02 21:55:38 +00:00
Yaroslav Tykhiy	731959b118	Use off_t for file size passed to vnode_create_vobject(). The former type, size_t, was causing truncation to 32 bits on i386, which immediately led to undersizing of VM objects backed by files >4GB. In particular, sendfile(2) was broken for such files. PR: kern/92243 MFC after: 5 days	2006-02-01 12:43:13 +00:00
Jeff Roberson	c05e22d44b	- Install a temporary bandaid in vm_object_reference() that will stop mtx_assert()s from triggering until I find a real long-term solution.	2006-02-01 09:47:02 +00:00
Alan Cox	6c237adcea	Change #if defined(DIAGNOSTIC) to KASSERT.	2006-01-31 19:06:51 +00:00
Pawel Jakub Dawidek	847a2a1716	Add buffer corruption protection (RedZone) for kernel's malloc(9). It detects both: buffer underflows and buffer overflows bugs at runtime (on free(9) and realloc(9)) and prints backtraces from where memory was allocated and from where it was freed. Tested by: kris	2006-01-31 11:09:21 +00:00
Scott Long	a5cbb43e43	The change a few years ago of having contigmalloc start its scan at the top of physical RAM instead of the bottom was a sound idea, but the implementation left a lot to be desired. Scans would spend considerable time looking at pages that are above of the address range given by the caller, and multiple calls (like what happens in busdma) would spend more time on top of that rescanning the same pages over and over. Solve this, at least for now, with two simple optimizations. The first is to not bother scanning high ordered pages that are outside of the provided address range. Second is to cache the page index from the last successful operation so that subsequent scans don't have to restart from the top. This is conditional on the numpages argument being the same or greater between calls. MFC After: 2 weeks	2006-01-29 08:24:54 +00:00
John Baldwin	ffaf2c55a8	Add a new macro wrapper WITNESS_CHECK() around the witness_warn() function. The difference between WITNESS_CHECK() and WITNESS_WARN() is that WITNESS_CHECK() should be used in the places that the return value of witness_warn() is checked, whereas WITNESS_WARN() should be used in places where the return value is ignored. Specifically, in a kernel without WITNESS enabled, WITNESS_WARN() evaluates to an empty string where as WITNESS_CHECK evaluates to 0. I also updated the one place that was checking the return value of WITNESS_WARN() to use WITNESS_CHECK.	2006-01-27 22:20:15 +00:00
Olivier Houchard	100650dee1	Make sure b_vp and b_bufobj are NULL before calling relpbuf(), as it asserts they are. They should be NULL at this point, except if we're coming from swapdev_strategy(). It should only affect the case where we're swapping directly on a file over NFS.	2006-01-27 21:11:50 +00:00
Alan Cox	0034fd6fff	Style: Add blank line after local variable declarations.	2006-01-27 21:06:37 +00:00
Alan Cox	82eedee4a4	Use the new macros abstracting the page coloring/queues implementation. (There are no functional changes.)	2006-01-27 08:35:32 +00:00
Alan Cox	997e1c252b	Use the new macros abstracting the page coloring/queues implementation. (There are no functional changes.)	2006-01-27 07:28:51 +00:00
Alan Cox	cfc26cd69c	Plug a leak in the newer contigmalloc() implementation. Specifically, if a multipage allocation was aborted midway, the pages that were already allocated were not always returned to the free list. Submitted by: tegge	2006-01-26 05:51:26 +00:00
Jeff Roberson	df59a0fee7	- Avoid calling vm_object_backing_scan() when collapsing an object when the resident page count matches the object size. We know it fully backs its parent in this case. Reviewed by: acl, tegge Sponsored by: Isilon Systems, Inc.	2006-01-25 08:42:58 +00:00
Alan Cox	0883c2d739	The previous revision incorrectly changed a switch statement into an if statement. Specifically, a break statement that previously broke out of the enclosing switch was not changed. Consequently, the enclosing loop terminated prematurely. This could result in "vm_page_insert: page already inserted" panics. Submitted by: tegge	2006-01-25 06:45:57 +00:00
Alan Cox	39fd9b639f	With the recent changes to the implementation of page coloring, the the option PQ_NOOPT is used exclusively by vm_pageq.c. Thus, the include of opt_vmpage.h can be removed from vm_page.h.	2006-01-24 19:24:54 +00:00
Alan Cox	fc3c1bc471	In vm_page_set_invalid() invalidate all of the page's mappings as soon as any part of the page's contents is invalidated. Submitted by: tegge	2006-01-24 07:21:38 +00:00
Alan Cox	02dd83311a	Make vm_object_vndeallocate() static. The external calls to it were eliminated in ufs/ffs/ffs_vnops.c's revision 1.125.	2006-01-22 23:56:20 +00:00
John Baldwin	ca49f12fdb	Reduce the scope of one #ifdef to avoid duplicating a SYSCTL_INT() macro and trim another unneeded #ifdef (it was just around a macro that is already conditionally defined).	2006-01-06 18:03:45 +00:00
Alexander Leidinger	924771865a	Convert the PAGE_SIZE check into a CTASSERT. Suggested by: jhb	2006-01-04 19:19:42 +00:00
Alexander Leidinger	1442a4476a	Prevent divide by zero, use default values in case one of the divisor's is zero. Tested by: Randy Bush <randy@psg.com>	2006-01-04 18:26:54 +00:00
Alexander Leidinger	ef39c05baa	MI changes: - provide an interface (macros) to the page coloring part of the VM system, this allows to try different coloring algorithms without the need to touch every file [1] - make the page queue tuning values readable: sysctl vm.stats.pagequeue - autotuning of the page coloring values based upon the cache size instead of options in the kernel config (disabling of the page coloring as a kernel option is still possible) MD changes: - detection of the cache size: only IA32 and AMD64 (untested) contains cache size detection code, every other arch just comes with a dummy function (this results in the use of default values like it was the case without the autotuning of the page coloring) - print some more info on Intel CPU's (like we do on AMD and Transmeta CPU's) Note to AMD owners (IA32 and AMD64): please run "sysctl vm.stats.pagequeue" and report if the cache* values are zero (= bug in the cache detection code) or not. Based upon work by: Chad David <davidc@acns.ab.ca> [1] Reviewed by: alc, arch (in 2004) Discussed with: alc, Chad David, arch (in 2004)	2005-12-31 14:39:20 +00:00
Pawel Jakub Dawidek	d362c40d3a	Improve memguard a bit: - Provide tunable vm.memguard.desc, so one can specify memory type without changing the code and recompiling the kernel. - Allow to use memguard for kernel modules by providing sysctl vm.memguard.desc, which can be changed to short description of memory type before module is loaded. - Move as much memguard code as possible to memguard.c. - Add sysctl node vm.memguard. and move memguard-specific sysctl there. - Add malloc_desc2type() function for finding memory type based on its short description (ks_shortdesc field). - Memory type can be changed (via vm.memguard.desc sysctl) only if it doesn't exist (will be loaded later) or when no memory is allocated yet. If there is allocated memory for the given memory type, return EBUSY. - Implement two ways of memory types comparsion and make safer/slower the default.	2005-12-30 11:45:07 +00:00
Tor Egge	b898bb1be3	Don't access fs->first_object after dropping reference to it. The result could be a missed or extra giant unlock. Reviewed by: alc	2005-12-20 12:27:59 +00:00
Alan Cox	da61b9a69e	Use sf_buf_alloc() instead of vm_map_find() on exec_map to create the ephemeral mappings that are used as the source for three copy operations from kernel space to user space. There are two reasons for making this change: (1) Under heavy load exec_map can fill up causing vm_map_find() to fail. When it fails, the nascent process is aborted (SIGABRT). Whereas, this reimplementation using sf_buf_alloc() sleeps. (2) Although it is possible to sleep on vm_map_find()'s failure until address space becomes available (see kmem_alloc_wait()), using sf_buf_alloc() is faster. Furthermore, the reimplementation uses a CPU private mapping, avoiding a TLB shootdown on multiprocessors. Problem uncovered by: kris@ Reviewed by: tegge@ MFC after: 3 weeks	2005-12-16 18:34:14 +00:00
Alan Cox	984922d761	Assert that the page that is given to vm_page_free_toq() does not have any managed mappings.	2005-12-13 19:59:09 +00:00
Alan Cox	05406e6f33	Remove unneeded calls to pmap_remove_all(). The given page is not mapped. Reviewed by: tegge	2005-12-11 22:06:57 +00:00
Alan Cox	717f7d5962	Simplify vmspace_dofree().	2005-12-04 22:55:41 +00:00
Alan Cox	51016cdfd7	Eliminate unneeded preallocation at initialization. Reviewed by: tegge	2005-12-03 22:41:15 +00:00
Alan Cox	8215781ba2	Eliminate unneeded preallocation at initialization. Reviewed by: tegge	2005-12-03 19:37:29 +00:00
Alan Cox	97a0c226d6	Eliminate pmap_init2(). It's no longer used.	2005-11-20 06:09:49 +00:00
Alan Cox	7a35a21e7b	Reimplement the reclamation of PV entries. Specifically, perform reclamation synchronously from get_pv_entry() instead of asynchronously as part of the page daemon. Additionally, limit the reclamation to inactive pages unless allocation from the PV entry zone or reclamation from the inactive queue fails. Previously, reclamation destroyed mappings to both inactive and active pages. get_pv_entry() still, however, wakes up the page daemon when reclamation occurs. The reason being that the page daemon may move some pages from the active queue to the inactive queue, making some new pages available to future reclamations. Print the "reclaiming PV entries" message at most once per minute, but don't stop printing it after the fifth time. This way, we do not give the impression that the problem has gone away. Reviewed by: tegge	2005-11-09 08:19:21 +00:00
Alan Cox	7e9d944218	If a physical page is mapped by two or more virtual addresses, transmitted by the zero-copy sockets method, and written to before the transmission completes, we need to destroy all of the existing mappings to the page, not just the one that we fault on. Otherwise, the mappings will no longer be to the same page and changes made through one of the mappings will not be visible through the others. Observed by: tegge	2005-11-08 06:33:21 +00:00
Paul Saab	dd498befc4	Rate limit vnode_pager_putpages printfs to once a second.	2005-11-01 23:00:24 +00:00
Alan Cox	674b706ea0	Consider the zero-copy transmission of a page that was wired by mlock(2). If a copy-on-write fault occurs on the page, the new copy should inherit a part of the original page's wire count. Submitted by: tegge MFC after: 1 week	2005-11-01 04:30:21 +00:00
Robert Watson	5bb84bc84b	Normalize a significant number of kernel malloc type names: - Prefer '_' to ' ', as it results in more easily parsed results in memory monitoring tools such as vmstat. - Remove punctuation that is incompatible with using memory type names as file names, such as '/' characters. - Disambiguate some collisions by adding subsystem prefixes to some memory types. - Generally prefer lower case to upper case. - If the same type is defined in multiple architecture directories, attempt to use the same name in additional cases. Not all instances were caught in this change, so more work is required to finish this conversion. Similar changes are required for UMA zone names.	2005-10-31 15:41:29 +00:00
Alan Cox	f6d8983846	Use of the ZERO_COPY_SOCKETS options can result in an unusual state that vm_object_backing_scan() was not written to handle. Specifically, a wired page within a backing object that is shadowed by a page within the shadow object. Handle this state by removing the wired page from the backing object. The wired page will be freed by socow_iodone(). Stop masking errors: If a page is being freed by vm_object_backing_scan(), assert that it is no longer mapped rather than quietly destroying any mappings. Tested by: Harald Schmalzbauer	2005-10-22 18:46:38 +00:00
Robert Watson	64a266f9e8	Change format string for u_int64_t to %ju from %llu, in order to use the correct format string on 64-bit systems. Pointed out by: pjd	2005-10-20 21:28:31 +00:00
Robert Watson	48c5777e3d	Add a "show uma" command to DDB, which prints out the current stats for available UMA zones. Quite useful for post-mortem debugging of memory leaks without a dump device configured on a panicked box. MFC after: 2 weeks	2005-10-20 16:39:33 +00:00
Diomidis Spinellis	9f5c1d1955	Move execve's access time update functionality into a new vfs_mark_atime() function, and use the new function for performing efficient atime updates in mmap(). Reviewed by: bde MFC after: 2 weeks	2005-10-12 06:56:00 +00:00
Dag-Erling Smørgrav	3803b26bae	As alc pointed out to me, vm_page.c 1.305 was incomplete: uma_startup() still uses the constant UMA_BOOT_PAGES. Change it to accept boot_pages as an additional argument. MFC after: 2 weeks	2005-10-08 21:03:54 +00:00
Diomidis Spinellis	1e3090039d	Update the vnode's access time after an mmap operation on it. Before this change a copy operation with cp(1) would not update the file access times. According to the POSIX mmap(2) documentation: the st_atime field of the mapped file may be marked for update at any time between the mmap() call and the corresponding munmap() call. The initial read or write reference to a mapped region shall cause the file's st_atime field to be marked for update if it has not already been marked for update.	2005-10-04 14:58:58 +00:00
John Baldwin	b65089ccb5	Trim a couple of unneeded includes.	2005-09-29 19:13:52 +00:00
Olivier Houchard	3cfc7651b2	Make sure we have a bufobj before calling bstrategy(). I'm not sure this is the right thing to do, but at least I don't panic anymore when swapping on a NFS file without using md(4). X-MFC after: proper review	2005-09-21 15:01:09 +00:00
Peter Wemm	749474f2f5	Remove unused (but initialized) variable 'objsize' from vm_mmap()	2005-09-20 22:08:27 +00:00
Alan Cox	f353d3388f	Introduce a new lock for the purpose of synchronizing access to the UMA boot pages. Disable recursion on the general UMA lock now that startup_alloc() no longer uses it. Eliminate the variable uma_boot_free. It serves no purpose. Note: This change eliminates a lock-order reversal between a system map mutex and the UMA lock. See http://sources.zabbadoz.net/freebsd/lor.html#109 for details. MFC after: 3 days	2005-09-09 06:03:08 +00:00
Alan Cox	57b5187b16	Eliminate an incorrect cast.	2005-09-07 01:42:30 +00:00
Alan Cox	ba8bca610c	Pass a value of type vm_prot_t to pmap_enter_quick() so that it determine whether the mapping should permit execute access.	2005-09-03 18:20:20 +00:00
Alexander Kabaev	857b66d505	Do not use vm_pager_init() to initialize vnode_pbuf_freecnt variable. vm_pager_init() is run before required nswbuf variable has been set to correct value. This caused system to run with single pbuf available for vnode_pager. Handle both cluster_pbuf_freecnt and vnode_pbuf_freecnt variable in the same way. Reported by: ade Obtained from: alc MFC after: 2 days	2005-08-13 20:21:33 +00:00
Tor Egge	1113a8b44a	Check for marker pages when scanning active and inactive page queues. Reviewed by: alc	2005-08-12 18:17:40 +00:00
Dag-Erling Smørgrav	cfa22bcc4c	Introduce the vm.boot_pages tunable and sysctl, which controls the number of pages reserved to bootstrap the kernel memory allocator. MFC after: 2 weeks	2005-08-12 12:24:19 +00:00
Tor Egge	8dbca793a9	Don't allow pagedaemon to skip pages while scanning PQ_ACTIVE or PQ_INACTIVE due to the vm object being locked. When a process writes large amounts of data to a file, the vm object associated with that file can contain most of the physical pages on the machine. If the process is preempted while holding the lock on the vm object, pagedaemon would be able to move very few pages from PQ_INACTIVE to PQ_CACHE or from PQ_ACTIVE to PQ_INACTIVE, resulting in unlimited cleaning of dirty pages belonging to other vm objects. Temporarily unlock the page queues lock while locking vm objects to avoid lock order violation. Detect and handle relevant page queue changes. This change depends on both the lock portion of struct vm_object and normal struct vm_page being type stable. Reviewed by: alc	2005-08-10 00:17:36 +00:00
Suleiman Souhlal	4f12e0acb0	Use atomic operations on runningbufspace. PR: kern/84318 Submitted by: ade MFC after: 3 days	2005-08-08 22:44:10 +00:00
Robert Watson	8be9063ea1	Don't perform a nested include of opt_vmpage.h if LIBMEMSTAT is defined, as opt_vmpage.h will not be available to user space library builds. A similar existing check is present for KLD_MODULE for similar reasons. MFC after: 3 days	2005-08-04 10:05:11 +00:00
Robert Watson	af17e9a922	Wrap inlines in uma_int.h in #ifdef _KERNEL so that uma_int.h can be used from memstat_uma.c for the purposes of kvm access without lots of additional unsafe includes. MFC after: 3 days	2005-08-04 10:03:53 +00:00
Robert Watson	cbbb4a0089	Rename UMA_MAX_NAME to UTH_MAX_NAME, since it's a maximum in the monitoring API, which might or might not be the same as the internal maximum (currently none). Export flag information on UMA zones -- in particular, whether or not this is a secondary zone, and so the keg free count should be considered in that light. MFC after: 1 day	2005-07-25 00:47:32 +00:00
Alan Cox	ec9c9e7363	Eliminate inconsistency in the setting of the B_DONE flag. Specifically, make the b_iodone callback responsible for setting it if it is needed. Previously, it was set unconditionally by bufdone() without holding whichever lock is shared by the b_iodone callback and the corresponding top-half function. Consequently, in a race, the top-half function could conclude that operation was done before the b_iodone callback finished. See, for example, aio_physwakeup() and aio_fphysio(). Note: I don't believe that the other, more widely-used b_iodone callbacks are affected. Discussed with: jeff Reviewed by: phk MFC after: 2 weeks	2005-07-20 19:06:06 +00:00
Robert Watson	f4ff923bdd	Further UMA statistics related changes: - Add a new uma_zfree_internal() flag, ZFREE_STATFREE, which causes it to to update the zone's uz_frees statistic. Previously, the statistic was updated unconditionally. - Use the flag in situations where a "real" free occurs: i.e., one where the caller is freeing an allocated item, to be differentiated from situations where uma_zfree_internal() is used to tear down the item during slab teardown in order to invoke its fini() method. Also use the flag when UMA is freeing its internal objects. - When exchanging a bucket with the zone from the per-CPU cache when freeing an item, flush cache statistics back to the zone (since the zone lock and critical section are both held) to match the allocation case. MFC after: 3 days	2005-07-20 18:47:42 +00:00
Alan Cox	15d2d31372	Eliminate an incorrect (and unnecessary) cast.	2005-07-20 18:41:08 +00:00
Robert Watson	ab3a57c04d	Use mp_maxid in preference to MAXCPU when creating exports of UMA per-CPU cache statistics. UMA sizes the cache array based on the number of CPUs at boot (mp_maxid + 1), and iterating based on MAXCPU could read off the end of the array (into the next zone). Reported by: yongari MFC after: 1 week	2005-07-16 11:03:06 +00:00
Robert Watson	08ecce74bc	Improve canonicalization of copyrights. Order copyrights by order of assertion (jeff, bmilekic, rwatson). Suggested ages ago by: bde MFC after: 1 week	2005-07-16 09:51:52 +00:00
Robert Watson	2450bbb872	Move the unlocking of the zone mutex in sysctl_vm_zone_stats() so that it covers the following of the uc_alloc/freebucket cache pointers. Originally, I felt that the race wasn't helped by holding the mutex, hence a comment in the code and not holding it across the cache access. However, it does improve consistency, as while it doesn't prevent bucket exchange, it does prevent bucket pointer invalidation. So a race in gathering cache free space statistics still can occur, but not one that follows an invalid bucket pointer, if the mutex is held. Submitted by: yongari MFC after: 1 week	2005-07-16 09:40:34 +00:00
Mike Silbersack	2018f30c01	Increase the flags field for kegs from a 16 to a 32 bit value; we have exhausted all 16 flags.	2005-07-16 02:23:41 +00:00
Robert Watson	2019094a35	Track UMA(9) allocation failures by zone, and export via sysctl. Requested by: victor cruceru <victor dot cruceru at gmail dot com> MFC after: 1 week	2005-07-15 23:34:39 +00:00
John Baldwin	b9d3b80521	Convert a remaining !fs.map->system_map to fs.first_object->flags & OBJ_NEEDGIANT test that was missed in an earlier revision. This fixes mutex assertion failures in the debug.mpsafevm=0 case. Reported by: ps MFC after: 3 days	2005-07-14 21:18:07 +00:00
Robert Watson	7a52a97eb3	Introduce a new sysctl, vm.zone_stats, which exports UMA(9) allocator statistics via a binary structure stream: - Add structure 'uma_stream_header', which defines a stream version, definition of MAXCPUs used in the stream, and the number of zone records in the stream. - Add structure 'uma_type_header', which defines the name, alignment, size, resource allocation limits, current pages allocated, preferred bucket size, and central zone + keg statistics. - Add structure 'uma_percpu_stat', which, for each per-CPU cache, includes the number of allocations and frees, as well as the number of free items in the cache. - When the sysctl is queried, return a stream header, followed by a series of type descriptions, each consisting of a type header followed by a series of MAXCPUs uma_percpu_stat structures holding per-CPU allocation information. Typical values of MAXCPU will be 1 (UP compiled kernel) and 16 (SMP compiled kernel). This query mechanism allows user space monitoring tools to extract memory allocation statistics in a machine-readable form, and to do so at a per-CPU granularity, allowing monitoring of allocation patterns across CPUs in order to better understand the distribution of work and memory flow over multiple CPUs. While here, also export the number of UMA zones as a sysctl vm.uma_count, in order to assist in sizing user swpace buffers to receive the stream. A follow-up commit of libmemstat(3), a library to monitor kernel memory allocation, will occur in the next few days. This change directly supports converting netstat(1)'s "-mb" mode to using UMA-sourced stats rather than separately maintained mbuf allocator statistics. MFC after: 1 week	2005-07-14 16:35:13 +00:00
Robert Watson	773df9ab16	In addition to tracking allocs in the zone, also track frees. Add a zone free counter, as well as a cache free counter. MFC after: 1 week	2005-07-14 16:17:21 +00:00
Robert Watson	2c743d361a	In an earlier world order, UMA would flush per-CPU statistics to the zone whenever it was moving buckets between the zone and the cache, or when coalescing statistics across the CPU. Remove flushing of statistics to the zone when coalescing statistics as part of sysctl, as we won't be running on the right CPU to write to the cache statistics. Add a missed gathering of statistics: when uma_zalloc_internal() does a special case allocation of a single item, make sure to update the zone statistics to represent this. Previously this case wasn't accounted for in user-visible statistics. MFC after: 1 week	2005-07-14 16:13:46 +00:00
Mike Silbersack	cb6e5c1aba	Change the panic in trash_ctor into just a printf for now. Once the reports of panics in trash_ctor relating to mbufs have been examined and a fix found, this will be turned back into a panic. Approved by: re (rwatson)	2005-06-26 23:44:07 +00:00
Alan Cox	eafc7b549a	Increase UMA_BOOT_PAGES to prevent a crash during initialization. See http://docs.FreeBSD.org/cgi/mid.cgi?42AD8270.8060906 for a detailed description of the crash. Reported by: Eric Anderson Approved by: re (scottl) MFC after: 3 days	2005-06-16 17:06:34 +00:00
Brian Feldman	a534973af4	The new contigmalloc(9) has a bad degenerate case where there were many regions checked again and again despite knowing the pages contained were not usable and only satisfied the alignment constraints This case was compounded, especially for large allocations, by the practice of looping from the top of memory so as to keep out of the important low-memory regions. While the old contigmalloc(9) has the same problem, it is not as noticeable due to looping from the low memory to high. This degenerate case is fixed, as well as reversing the sense of the rest of the loops within it, to provide a tremendous speed increase. This makes the best case O(n * VM overhead) much more likely than the worst case O(4 * VM overhead). For comparison, the worst case for old contigmalloc would be O(5 * VM overhead) in addition to its strategy of turning used memory into free being highly pessimal. Also, fix a bug that in practice most likely couldn't have been triggered, int the new contigmalloc(9): it walked backwards from the end of memory without accounting for how many pages it needed. Potentially, nonexistant pages could have been mapped. This hasn't occurred because the kernel generally requests as its first contigmalloc(9) a single page. Reported by: Nicolas Dehaine <nicko@stbernard.com>, wes MFC After: 1 month More testing by: Nicolas Dehaine <nicko@stbernard.com>, wes	2005-06-11 00:05:16 +00:00
Alan Cox	25f2e1c8cc	Add a comment to the effect that fictitious pages do not require the initialization of their machine-dependent fields.	2005-06-10 17:27:54 +00:00
Alan Cox	1c245ae7d1	Introduce a procedure, pmap_page_init(), that initializes the vm_page's machine-dependent fields. Use this function in vm_pageq_add_new_page() so that the vm_page's machine-dependent and machine-independent fields are initialized at the same time. Remove code from pmap_init() for initializing the vm_page's machine-dependent fields. Remove stale comments from pmap_init(). Eliminate the Boolean variable pmap_initialized from the alpha, amd64, i386, and ia64 pmap implementations. Its use is no longer required because of the above changes and earlier changes that result in physical memory that is being mapped at initialization time being mapped without pv entries. Tested by: cognet, kensmith, marcel	2005-06-10 03:33:36 +00:00
Alan Cox	5f7679afd0	Update some comments to reflect the change from spl-based to lock-based synchronization.	2005-05-28 17:56:18 +00:00
Stephan Uphoff	d13ec71369	Use low level constructs borrowed from interrupt threads to wait for work in proc0. Remove the TDP_WAKEPROC0 workaround.	2005-05-23 23:01:53 +00:00

1 2 3 4 5 ...

2286 Commits