freebsd-skq

Author	SHA1	Message	Date
alc	8a01505f5e	The M_ZERO can be eliminated from the uma_zalloc() call in vm_radix_node_get() with a small change to vm_radix_reclaim_allnodes_int(). This change further reduced the average number of cycles per vm_page_insert() call from 532 to 519. Reviewed by: attilio Sponsored by: EMC / Isilon Storage Division	2013-03-17 16:49:37 +00:00
alc	9e48bd7ba9	Most allocation of pages to objects proceeds from lower to higher indices. Consequently, vm_page_insert() should use vm_radix_lookup_le() instead of vm_radix_lookup_ge(). Here's why. In the expected case, vm_radix_lookup_le() will quickly find a page less than the specified key at the same radix node. In contrast, vm_radix_lookup_ge() is expected to return NULL, but to do that it must examine every slot in the radix tree that is greater than the key. Prior to this change, the average cost of a vm_page_insert() call on my test machine was 992 cycles. After this change, the average cost is only 532 cycles, a reduction of 46%. Reviewed by: attilio Sponsored by: EMC / Isilon Storage Division	2013-03-17 16:23:19 +00:00
alc	b346e448af	Simplify the interface to vm_radix_insert() by eliminating the parameter "index". The content of a radix tree leaf, or at least its "key", is not opaque to the other radix tree operations. Specifically, they know how to extract the "key" from a leaf. So, eliminating the parameter "index" isn't breaking the abstraction. Moreover, eliminating the parameter "index" effectively prevents the caller from passing an inconsistent "index" and leaf to vm_radix_insert(). Reviewed by: attilio Sponsored by: EMC / Isilon Storage Division	2013-03-17 16:06:03 +00:00
attilio	a2e67affe3	Expand ambiguous comments some more. Requested by: alc	2013-03-17 15:27:26 +00:00
kib	7ca94eca24	Some style fixes. Sponsored by: The FreeBSD Foundation	2013-03-14 20:31:39 +00:00
kib	63efc821c3	Add pmap function pmap_copy_pages(), which copies the content of the pages around, taking array of vm_page_t both for source and destination. Starting offsets and total transfer size are specified. The function implements optimal algorithm for copying using the platform-specific optimizations. For instance, on the architectures where the direct map is available, no transient mappings are created, for i386 the per-cpu ephemeral page frame is used. The code was typically borrowed from the pmap_copy_page() for the same architecture. Only i386/amd64, powerpc aim and arm/arm-v6 implementations were tested at the time of commit. High-level code, not committed yet to the tree, ensures that the use of the function is only allowed after explicit enablement. For sparc64, the existing code has known issues and a stub is added instead, to allow the kernel linking. Sponsored by: The FreeBSD Foundation Tested by: pho (i386, amd64), scottl (amd64), ian (arm and arm-v6) MFC after: 2 weeks	2013-03-14 20:18:12 +00:00
kib	51407f194b	Remove excessive and inconsistent initializers for the various kernel maps and submaps. MFC after: 2 weeks	2013-03-14 19:50:09 +00:00
attilio	07b5846fc9	Fix compilation. Sponsored by: EMC / Isilon storage division	2013-03-13 01:38:32 +00:00
attilio	3b0a5f0419	Use the _KERNEL protectors. Sponsored by: EMC / Isilon storage division Requested by: alc	2013-03-13 01:02:11 +00:00
attilio	02cf10e6db	Add a further safety belt to prevent inconsistencies. Sponsored by: EMC / Isilon storage division Submitted by: alc	2013-03-13 01:00:34 +00:00
attilio	ba43ac477b	For uniformity, use the user provided index. Sponsored by: EMC / Isilon storage division Reviewed and reported by: alc	2013-03-13 00:41:37 +00:00
attilio	3c52979cb4	MFC	2013-03-12 13:26:12 +00:00
attilio	45af7dd4e7	Simplify vm_page_is_valid(). Sponsored by: EMC / Isilon storage division Reviewed by: alc	2013-03-12 12:20:49 +00:00
alc	07fc599921	When transferring the page from one object to another, don't insert the page into its new object until the page's pindex has been updated. Otherwise, one code path within vm_radix_insert() may use the wrong pindex value. Sponsored by: EMC / Isilon Storage Division	2013-03-12 06:14:31 +00:00
attilio	32a3275e77	MFC	2013-03-11 10:49:02 +00:00
alc	2c9c761886	Introduce vm_radix_is_empty(), and use it in place of vm_object_cache_is_empty() where the caller is aware of the page cache's implementation as a radix trie. Sponsored by: EMC / Isilon Storage Division	2013-03-10 17:30:57 +00:00
alc	854d4fd5e6	Update a comment: The object lock is no longer a mutex.	2013-03-09 21:32:24 +00:00
attilio	76954ad68a	Merge from vmcontention.	2013-03-09 03:19:53 +00:00
attilio	16a80466e5	MFC	2013-03-09 02:51:51 +00:00
attilio	72f7f3e528	Switch the vm_object mutex to be a rwlock. This will enable in the future further optimizations where the vm_object lock will be held in read mode most of the time the page cache resident pool of pages are accessed for reading purposes. The change is mostly mechanical but few notes are reported: * The KPI changes as follows: - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK() - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK() - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK() - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details) - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED() * The vm/vm_pager.h namespace pollution avoidance (forcing requiring sys/mutex.h in consumers directly to cater its inlining functions using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h consumers now must include also sys/rwlock.h. * zfs requires a quite convoluted fix to include FreeBSD rwlocks into the compat layer because the name clash between FreeBSD and solaris versions must be avoided. At this purpose zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs. The KPI results heavily broken by this commit. Third-party ports must be updated accordingly (I can think off-hand of VirtualBox, for example). Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: pjd (ZFS specific review) Discussed with: alc Tested by: pho	2013-03-09 02:32:23 +00:00
attilio	754f3790b8	Merge from vmc-playground: Introduce a new KPI that verifies if the page cache is empty for a specified vm_object. This KPI does not make assumptions about the locking in order to be used also for building assertions at init and destroy time. It is mostly used to hide implementation details of the page cache. Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: alc (vm_radix based version) Tested by: flo, pho, jhb, davide	2013-03-09 02:05:29 +00:00
attilio	7fd2627275	MFC	2013-03-09 01:39:42 +00:00
andre	adea04bda7	Move the callout subsystem initialization to its own SYSINIT() from being indirectly called via cpu_startup()+vm_ksubmap_init(). The boot order position remains the same at SI_SUB_CPU. Allocation of the callout array is changed to standard kernel malloc from a slightly obscure direct kernel_map allocation. kern_timeout_callwheel_alloc() is renamed to callout_callwheel_init() to better describe its purpose. kern_timeout_callwheel_init() is removed simplifying the per-cpu initialization. Reviewed by: davide	2013-03-08 10:37:17 +00:00
attilio	bf1dc90446	MFC	2013-03-08 00:03:07 +00:00
attilio	82aa86d64f	Improve comments. Sponsored by: EMC / Isilon storage division Submitted by: mdf	2013-03-07 23:37:10 +00:00
attilio	1be810ec73	MFC	2013-03-04 13:14:59 +00:00
attilio	e5bdd2f06e	Merge from vmcontention: As vm objects are type-stable there is no need to initialize the resident splay tree pointer and the cache splay tree pointer in _vm_object_allocate() but this could be done in the init UMA zone handler. The destructor UMA zone handler, will further check if the condition is retained at every destruction and catch for bugs. Sponsored by: EMC / Isilon storage division Submitted by: alc	2013-03-04 13:10:59 +00:00
attilio	709ad55889	Evaluations on the likelihood of empty object cache cannot be made in general way but must be evaluated case by case. Embed the decision in the callers themselves rather than in a general purpose KPI. Sponsored by: EMC / Isilon storage division Reported by: alc Reviewed by: alc	2013-03-04 12:33:40 +00:00
alc	a8671df14b	Fix a typo. Sponsored by: EMC / Isilon Storage Division	2013-03-04 07:25:11 +00:00
alc	a855741cf1	A Boolean is more appropriate than an int here. Use what I think is a slightly better variable name. Sponsored by: EMC / Isilon Storage Division	2013-03-04 07:20:59 +00:00
alc	c3be5353b8	Make a pass over most of the comments.	2013-03-04 07:11:10 +00:00
alc	475367da61	Simplify Boolean expressions. Sponsored by: EMC / Isilon Storage Division	2013-03-04 06:26:25 +00:00
alc	5094368613	Fix spelling. Sponsored by: EMC / Isilon Storage Division	2013-03-04 06:13:26 +00:00
attilio	60e39c95b8	Remove the boot-time cache support and rely on UMA boot-time slab cache for allocating the nodes before to have the possibility to carve directly from the UMA subsystem. Sponsored by: EMC / Isilon storage division Reviewed by: alc	2013-03-04 00:07:23 +00:00
alc	f661d6e522	We don't need to reinitialize the root of the page cache trie on every vm object allocation. We can, instead, rely on the type stability of the vm object zone. (Note that we already assert that the page cache trie is empty in the vm object zone destructor.) Sponsored by: EMC / Isilon Storage Division	2013-03-03 20:37:27 +00:00
alc	317a9584fb	Two out of three times that vm_page_find_least() is called, it's going to return the vm object's first page. In those cases, there is no need to traverse the trie. Sponsored by: EMC / Isilon Storage Division	2013-03-03 01:36:31 +00:00
attilio	a345907061	Merge from vmcontention	2013-03-03 01:10:49 +00:00
attilio	c53a782d3a	MFC	2013-03-03 01:06:24 +00:00
alc	2322e91e7c	Revert white space change in the previous commit. Requested by: attilio	2013-03-02 18:27:51 +00:00
alc	c5b028cc14	Assert that the trie is empty when a vm object is destroyed. Since vm objects are allocated from type-stable memory, we don't need to initialize the trie's root in _vm_object_allocate() on every vm object allocation. We can instead do it once in vm_object_zinit(). We don't need to call vm_radix_reclaim_allnodes() in vm_object_terminate() unless the resident page count is non-zero. Reviewed by: attilio Sponsored by: EMC / Isilon Storage Division	2013-03-02 18:18:30 +00:00
alc	90d4aeb975	The value held by the vm object's field pg_color is only considered valid if the flag OBJ_COLORED is set. Since _vm_object_allocate() doesn't set this flag, it needn't initialize pg_color. Sponsored by: EMC / Isilon Storage Division	2013-03-02 18:07:29 +00:00
attilio	e98f58faf6	MFC	2013-03-02 14:48:41 +00:00
attilio	89979cd218	Merge from vmcontention	2013-03-02 14:35:15 +00:00
attilio	17028bb6ae	MFC	2013-03-02 14:28:31 +00:00
pjd	f07ebb8888	Merge Capsicum overhaul: - Capability is no longer separate descriptor type. Now every descriptor has set of its own capability rights. - The cap_new(2) system call is left, but it is no longer documented and should not be used in new code. - The new syscall cap_rights_limit(2) should be used instead of cap_new(2), which limits capability rights of the given descriptor without creating a new one. - The cap_getrights(2) syscall is renamed to cap_rights_get(2). - If CAP_IOCTL capability right is present we can further reduce allowed ioctls list with the new cap_ioctls_limit(2) syscall. List of allowed ioctls can be retrieved with cap_ioctls_get(2) syscall. - If CAP_FCNTL capability right is present we can further reduce fcntls that can be used with the new cap_fcntls_limit(2) syscall and retrieve them with cap_fcntls_get(2). - To support ioctl and fcntl white-listing the filedesc structure was heavily modified. - The audit subsystem, kdump and procstat tools were updated to recognize new syscalls. - Capability rights were revised and even though I tried hard to provide backward API and ABI compatibility there are some incompatible changes that are described in detail below: CAP_CREATE old behaviour: - Allow for openat(2)+O_CREAT. - Allow for linkat(2). - Allow for symlinkat(2). CAP_CREATE new behaviour: - Allow for openat(2)+O_CREAT. Added CAP_LINKAT: - Allow for linkat(2). ABI: Reuses CAP_RMDIR bit. - Allow to be target for renameat(2). Added CAP_SYMLINKAT: - Allow for symlinkat(2). Removed CAP_DELETE. Old behaviour: - Allow for unlinkat(2) when removing non-directory object. - Allow to be source for renameat(2). Removed CAP_RMDIR. Old behaviour: - Allow for unlinkat(2) when removing directory. Added CAP_RENAMEAT: - Required for source directory for the renameat(2) syscall. Added CAP_UNLINKAT (effectively it replaces CAP_DELETE and CAP_RMDIR): - Allow for unlinkat(2) on any object. 
- Required if target of renameat(2) exists and will be removed by this call. Removed CAP_MAPEXEC. CAP_MMAP old behaviour: - Allow for mmap(2) with any combination of PROT_NONE, PROT_READ and PROT_WRITE. CAP_MMAP new behaviour: - Allow for mmap(2)+PROT_NONE. Added CAP_MMAP_R: - Allow for mmap(PROT_READ). Added CAP_MMAP_W: - Allow for mmap(PROT_WRITE). Added CAP_MMAP_X: - Allow for mmap(PROT_EXEC). Added CAP_MMAP_RW: - Allow for mmap(PROT_READ \| PROT_WRITE). Added CAP_MMAP_RX: - Allow for mmap(PROT_READ \| PROT_EXEC). Added CAP_MMAP_WX: - Allow for mmap(PROT_WRITE \| PROT_EXEC). Added CAP_MMAP_RWX: - Allow for mmap(PROT_READ \| PROT_WRITE \| PROT_EXEC). Renamed CAP_MKDIR to CAP_MKDIRAT. Renamed CAP_MKFIFO to CAP_MKFIFOAT. Renamed CAP_MKNODE to CAP_MKNODEAT. CAP_READ old behaviour: - Allow pread(2). - Disallow read(2), readv(2) (if there is no CAP_SEEK). CAP_READ new behaviour: - Allow read(2), readv(2). - Disallow pread(2) (CAP_SEEK was also required). CAP_WRITE old behaviour: - Allow pwrite(2). - Disallow write(2), writev(2) (if there is no CAP_SEEK). CAP_WRITE new behaviour: - Allow write(2), writev(2). - Disallow pwrite(2) (CAP_SEEK was also required). 
Added convenient defines: #define CAP_PREAD (CAP_SEEK \| CAP_READ) #define CAP_PWRITE (CAP_SEEK \| CAP_WRITE) #define CAP_MMAP_R (CAP_MMAP \| CAP_SEEK \| CAP_READ) #define CAP_MMAP_W (CAP_MMAP \| CAP_SEEK \| CAP_WRITE) #define CAP_MMAP_X (CAP_MMAP \| CAP_SEEK \| 0x0000000000000008ULL) #define CAP_MMAP_RW (CAP_MMAP_R \| CAP_MMAP_W) #define CAP_MMAP_RX (CAP_MMAP_R \| CAP_MMAP_X) #define CAP_MMAP_WX (CAP_MMAP_W \| CAP_MMAP_X) #define CAP_MMAP_RWX (CAP_MMAP_R \| CAP_MMAP_W \| CAP_MMAP_X) #define CAP_RECV CAP_READ #define CAP_SEND CAP_WRITE #define CAP_SOCK_CLIENT \ (CAP_CONNECT \| CAP_GETPEERNAME \| CAP_GETSOCKNAME \| CAP_GETSOCKOPT \| \ CAP_PEELOFF \| CAP_RECV \| CAP_SEND \| CAP_SETSOCKOPT \| CAP_SHUTDOWN) #define CAP_SOCK_SERVER \ (CAP_ACCEPT \| CAP_BIND \| CAP_GETPEERNAME \| CAP_GETSOCKNAME \| \ CAP_GETSOCKOPT \| CAP_LISTEN \| CAP_PEELOFF \| CAP_RECV \| CAP_SEND \| \ CAP_SETSOCKOPT \| CAP_SHUTDOWN) Added defines for backward API compatibility: #define CAP_MAPEXEC CAP_MMAP_X #define CAP_DELETE CAP_UNLINKAT #define CAP_MKDIR CAP_MKDIRAT #define CAP_RMDIR CAP_UNLINKAT #define CAP_MKFIFO CAP_MKFIFOAT #define CAP_MKNOD CAP_MKNODAT #define CAP_SOCK_ALL (CAP_SOCK_CLIENT \| CAP_SOCK_SERVER) Sponsored by: The FreeBSD Foundation Reviewed by: Christoph Mallon <christoph.mallon@gmx.de> Many aspects discussed with: rwatson, benl, jonathan ABI compatibility discussed with: kib	2013-03-02 00:53:12 +00:00
attilio	31dffb7b33	Merge from vmcontention	2013-02-27 18:25:57 +00:00
attilio	6ff1954532	MFC	2013-02-27 18:23:12 +00:00
attilio	8d28f94790	Merge from vmobj-rwlock: VM_OBJECT_LOCKED() macro is only used to implement a custom version of lock assertions right now (which likely spread out thanks to copy and paste). Remove it and implement actual assertions. Sponsored by: EMC / Isilon storage division Reviewed by: alc Tested by: pho	2013-02-27 18:12:13 +00:00
attilio	c74a3afc6a	Fix compiling.	2013-02-26 23:54:17 +00:00
attilio	74f58faa15	MFC	2013-02-26 23:52:23 +00:00
attilio	cd86838830	Merge from vmcontention	2013-02-26 23:46:19 +00:00
attilio	cbe7c0e167	MFC	2013-02-26 23:43:28 +00:00
attilio	cc89d0bd92	Merge from vmc-playground branch: Replace the sub-optimal uma_zone_set_obj() primitive with more modern uma_zone_reserve_kva(). The new primitive reserves before hand the necessary KVA space to cater the zone allocations and allocates pages with ALLOC_NOOBJ. More specifically: - uma_zone_reserve_kva() does not need an object to cater the backend allocator. - uma_zone_reserve_kva() can cater M_WAITOK requests, in order to serve zones which need to do uma_prealloc() too. - When possible, uma_zone_reserve_kva() uses directly the direct-mapping by uma_small_alloc() rather than relying on the KVA / offset combination. The removal of the object attribute allows 2 further changes: 1) _vm_object_allocate() becomes static within vm_object.c 2) VM_OBJECT_LOCK_INIT() is removed. This function is replaced by direct calls to mtx_init() as there is no need to export it anymore and the calls aren't either homogeneous anymore: there are now small differences between arguments passed to mtx_init(). Sponsored by: EMC / Isilon storage division Reviewed by: alc (which also offered almost all the comments) Tested by: pho, jhb, davide	2013-02-26 23:35:27 +00:00
attilio	726aa55a61	Merge from vmcontention	2013-02-26 21:17:38 +00:00
attilio	9d00dd1afe	MFC	2013-02-26 21:13:09 +00:00
attilio	820ab571ec	MFC	2013-02-26 21:09:35 +00:00
attilio	5a60eaa26c	Remove white spaces. Sponsored by: EMC / Isilon storage division	2013-02-26 20:35:40 +00:00
attilio	43aa55b4cd	Revert the moving of vm_object objects initialization: the objects zone ensures type-stability and thus we want to execute actual lock initialization only when the objects are brought into the zone otherwise there could be races between lock threads doing re-initialization and other threads that want to acquire the lock without a reference. Sponsored by: EMC / Isilon storage division Reported by: alc	2013-02-26 20:18:25 +00:00
attilio	210a93e7f7	Merge from vmcontention	2013-02-26 18:18:39 +00:00
attilio	134623836d	MFC	2013-02-26 18:11:43 +00:00
attilio	afe5ce0c13	MFC	2013-02-26 17:33:18 +00:00
attilio	49f99b7251	Wrap the sleeps synchronized by the vm_object lock into the specific macro VM_OBJECT_SLEEP(). This hides some implementation details like the usage of the msleep() primitive and the necessity to access to the lock address directly. For this reason VM_OBJECT_MTX() macro is now retired. Sponsored by: EMC / Isilon storage division Reviewed by: alc Tested by: pho	2013-02-26 17:22:08 +00:00
alc	c5315c03fb	Update a comment: noobj_alloc() has replaced obj_alloc(), but it doesn't really make sense for this comment to name specific backend allocators, instead simply refer to backend allocators. Sponsored by: EMC / Isilon Storage Division	2013-02-26 06:38:00 +00:00
attilio	e590e8091e	As VM_OBJECT_SLEEP() is a vm_object_t specific function, make the passed object as the first argument of the function for consistency. Sponsored by: EMC / Isilon storage division	2013-02-26 01:38:12 +00:00
attilio	fc0ecac2f8	Revert wrongly added asserts: lookup and remove from the collection of cached pages doesn't require the object lock to be held. Sponsored by: EMC / Isilon storage division	2013-02-26 00:34:52 +00:00
alc	26f238c055	Revise the comment describing uma_zone_reserve_kva(). Sponsored by: EMC / Isilon Storage Division Reviewed by: attilio	2013-02-26 00:18:50 +00:00
attilio	343c9f6f19	Missing semicolon. Sponsored by: EMC / Isilon storage division Submitted by: alc Pointy hat to: me	2013-02-24 19:10:16 +00:00
attilio	1a753217f3	Simplify return logic. Sponsored by: EMC / Isilon storage division Submitted by: alc	2013-02-24 19:05:11 +00:00
attilio	69d25b60d5	Merge from vmcontention	2013-02-24 17:11:10 +00:00
attilio	cff31deb1a	MFC	2013-02-24 16:50:53 +00:00
attilio	12289fcebc	Retire the old UMA primitive uma_zone_set_obj() and replace it with the more modern uma_zone_reserve_kva(). The difference is that it doesn't rely anymore on an obj to allocate pages and the slab allocator doesn't use any more any specific locking but atomic operations to complete the operation. Where possible, the uma_small_alloc() is instead used and the uk_kva member becomes unused. The subsequent cleanups also bring along the removal of VM_OBJECT_LOCK_INIT() macro which is not used anymore as the code can be easily cleaned up to perform a single mtx_init(), private to vm_object.c. For the same reason, _vm_object_allocate() becomes private as well. Sponsored by: EMC / Isilon storage division Reviewed by: alc	2013-02-24 16:41:36 +00:00
attilio	6b1291b4d1	Do not call vm_radix_lookup_ge() in the reservation system unless it is absolutely necessary. Sponsored by: EMC / Isilon storage division Submitted by: alc	2013-02-24 16:10:43 +00:00
attilio	f6d331e804	Fix an inverted check that was reporting indexes wrongly detected as wrapped. Sponsored by: EMC / Isilon storage division Reported by: alc	2013-02-24 16:08:37 +00:00
alc	96feae12e9	Correctly assert that no page already exists at the offset within the object that is currently being allocated. Sponsored by: EMC / Isilon Storage Division	2013-02-23 19:28:31 +00:00
attilio	8702b26c68	Complete the asserts by defining also assertions for RA_RLOCKED and RA_LOCKED cases. Sponsored by: EMC / Isilon storage division Requested by: alc	2013-02-21 21:56:51 +00:00
attilio	905e648d42	Hide the details for the assertion for VM_OBJECT_LOCK operations. Rename current VM_OBJECT_LOCK_ASSERT(foo, RA_WLOCKED) into VM_OBJECT_ASSERT_WLOCKED(foo) Sponsored by: EMC / Isilon storage division Requested by: alc	2013-02-21 21:54:53 +00:00
attilio	b2afca4987	Add read mode operations to VM_OBJECT_LOCK* class of functions. Sponsored by: EMC / Isilon storage division	2013-02-20 12:06:33 +00:00
attilio	15bf891afe	Rename VM_OBJECT_LOCK(), VM_OBJECT_UNLOCK() and VM_OBJECT_TRYLOCK() to their "write" versions. Sponsored by: EMC / Isilon storage division	2013-02-20 12:03:20 +00:00
attilio	1f1e13ca03	There is no need to use VM_OBJECT_LOCKED() as the assertion won't make the check available in any case if INVARIANTS is switched off. Remove VM_OBJECT_LOCKED().	2013-02-20 10:51:34 +00:00
attilio	6a2e2ce522	Remove unused VM_OBJECT_LOCKPTR(). Sponsored by: EMC / Isilon storage division	2013-02-20 10:40:27 +00:00
attilio	658534ed5a	Switch vm_object lock to be a rwlock. * VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations * VM_OBJECT_SLEEP() is introduced as a general purpose primitive to get a sleep operation using a VM_OBJECT_LOCK() as protection * The approach must bear with vm_pager.h namespace pollution so many files require including directly rwlock.h	2013-02-20 10:38:34 +00:00
alc	88b6705ed6	On arm, like sparc64, the end of the kernel map varies from one type of machine to another. Therefore, VM_MAX_KERNEL_ADDRESS can't be a constant. Instead, #define it to be a variable, vm_max_kernel_address, just like we do on sparc64. Reviewed by: kib Tested by: ian	2013-02-18 01:02:48 +00:00
attilio	47d4527f8f	Remove an unuseful check as looking up into an empty trie should be as fast as checking a NULL ptr. Sponsored by: EMC / Isilon storage division	2013-02-15 17:22:57 +00:00
attilio	ea8f6ef283	Remove whitespace.	2013-02-15 17:21:41 +00:00
attilio	ebcaab64ea	Merge from vmcontention	2013-02-15 16:11:30 +00:00
attilio	b4e24f9126	MFC	2013-02-15 16:08:08 +00:00
attilio	1cf2f4550b	On arches with VM_PHYSSEG_DENSE the vm_page_array is larger than the actual number of vm_page_t that will be derived, so v_page_count should be used appropriately. Besides that, add a panic condition in case UMA fails to properly restrict the area in a way to keep all the desired objects. Sponsored by: EMC / Isilon storage division Reported by: alc	2013-02-15 16:05:18 +00:00
attilio	757b950804	Remove unused headers.	2013-02-15 15:34:19 +00:00
attilio	daa1f2caab	Fix comment.	2013-02-15 14:54:09 +00:00
attilio	fa12391493	Move the radix node zone destructor definition closer to vm_radix_init() definition. Sponsored by: EMC / Isilon storage division	2013-02-15 14:53:42 +00:00
attilio	be627ca24c	- When panicking for "too small boot cache" reason, print the actual cache size value - Add a way to specify the size of the boot cache at compile time Sponsored by: EMC / Isilon storage division	2013-02-15 14:50:36 +00:00
attilio	47ecbcf556	Improve dynamic branch prediction and i-cache utilization: - Use predict_false() to tag boot-time cache decisions - Compact boot-time cache allocation into a separate, non-inline, function that won't be called most of the times. Sponsored by: EMC / Isilon storage division	2013-02-15 14:48:06 +00:00
attilio	af55a9fb46	- Fix style in vm_page_lookup(): there is no whiteline between assertions and other code in this file. - Reinsert some comments that were lost during the work but which are actual yet, reducing differences with HEAD. Sponsored by: EMC / Isilon storage division	2013-02-15 12:35:16 +00:00
jhb	ca6ecf3ea5	Make VM_NDOMAIN a kernel option so that it can be enabled from a kernel config file. Requested by: phk (ages ago) MFC after: 1 month	2013-02-14 19:38:04 +00:00
attilio	642ba82fdf	Remove an unuseful check on resident_page_count. vm_radix_lookup_ge() of an empty trie is as fast as checking a NULL pointer.	2013-02-14 15:25:31 +00:00
attilio	908e129569	Fix style.	2013-02-14 15:24:13 +00:00
attilio	eafe26c8a6	The radix preallocation pages can overflow the biggest segment, so use a different scheme for preallocation: reserve few KB of nodes to be used to cater page allocations before the memory can be efficiently pre-allocated by UMA. This at all effects remove boot_pages further carving and along with this modifies to the boot_pages allocation system and necessity to initialize the UMA zone before pmap_init(). Reported by: pho, jhb	2013-02-14 15:23:00 +00:00
attilio	3db337c2ea	Grammar. Sponsored by: EMC / Isilon storage division	2013-02-13 02:04:49 +00:00
attilio	53f78d1a7d	Implement a new algorithm for managing the radix trie which also includes path-compression. This greatly helps with sparsely populated tries, where an uncompressed trie may end up by having a lot of intermediate nodes for very little leaves. The new algorithm introduces 2 main concepts: the node level and the node owner. Every node represents a branch point where the leaves share the key up to the level specified in the node-level (current level excluded, of course). Such key partly shared is the one contained in the owner. Of course, the root branch is exempted to keep a valid owner, because theoretically all the keys are contained in the space designed by the root branch node. The search algorithm seems very intuitive and that is where one should start reading to understand the full approach. In the end, the algorithm ends up by demanding only one node per insert and this is not necessary in all the cases. To stay safe, we basically preallocate as many nodes as the number of physical pages are in the system, using uma_preallocate(). However, this raises 2 concerns: * As pmap_init() needs to kmem_alloc(), the nodes must be pre-allocated when vm_radix_init() is currently called, which is much before UMA is fully initialized. This means that uma_prealloc() will dig into the UMA_BOOT_PAGES pool of pages, which is often not enough to keep track of such large allocations. In order to fix this, change a bit the concept of UMA_BOOT_PAGES and vm.boot_pages. More specifically make the UMA_BOOT_PAGES an initial "value" as long as vm.boot_pages and extend the boot_pages physical area by as many bytes as needed with the information returned by vm_radix_allocphys_size(). * A small amount of pages will be held in per-cpu buckets and won't be accessible from curcpu, so the vm_radix_node_get() could really panic when the pre-allocation pool is close to be exhausted. 
In theory we could pre-allocate more pages than the number of physical frames to satisfy such request, but as many insert would happen without a node allocation anyway, I think it is safe to assume that the over-allocation is already compensating for such problem. On the field testing can stand me correct, of course. This could be further helped by the case where we allow a single-page insert to not require a complete root node. The use of pre-allocation gets rid of all the non-direct mapping trickery and introduced lock recursion allowance for vm_page_free_queue. The nodes children are reduced in number from 32 -> 16 and from 16 -> 8 (for respectively 64 bits and 32 bits architectures). This would make the children to fit into cacheline for amd64 case, for example, and in general span less cacheline, which may be helpful in lookup_ge() case. Also, path-compression comes to help in cases where there are many levels, making the fallouts of such change less hurting. Sponsored by: EMC / Isilon storage division Reviewed by: jeff (partially) Tested by: flo	2013-02-13 01:19:31 +00:00
attilio	87d8d2eec7	Fix style.	2013-02-10 16:00:14 +00:00
attilio	8743c2878e	Fix wrong object reference. Sponsored by: EMC / Isilon Storage Division	2013-02-10 01:30:13 +00:00
attilio	25a17068be	Remove implementation specific comments from a public interface.	2013-02-07 15:13:35 +00:00
attilio	702feea4c3	Correctly complete r246474.	2013-02-07 15:08:35 +00:00
attilio	5ab232ef16	Strengthen checks.	2013-02-07 15:06:45 +00:00
attilio	59f669fb82	Style.	2013-02-07 15:06:04 +00:00
attilio	12b7890c27	Reduce differences with HEAD.	2013-02-07 11:36:34 +00:00
attilio	950536d7a4	Reformat comments to follow original version and re-add correct locking flags.	2013-02-06 23:48:04 +00:00
attilio	86cff934d5	Do not assume the lock to be held so that this can be used also in safe cases as a short-cut.	2013-02-06 19:03:48 +00:00
attilio	9066f231e3	Tweak comment to remove splay tree references.	2013-02-06 19:02:46 +00:00
attilio	1bab15985e	Make vm_object_cache_is_empty() inline.	2013-02-06 18:59:34 +00:00
attilio	4c22b4bafe	Cleanup vm_radix KPI: - Avoid the return value for vm_radix_insert() - Name the functions argument per-style(9) - Avoid to get and return opaque objects but use vm_page_t as vm_radix is thought to not really be general code but to cater specifically page cache and resident cache.	2013-02-06 18:37:46 +00:00
attilio	3bbca3bce2	Fixup r246423 by adding vm_radix.h includes where it is not present currently.	2013-02-06 18:33:32 +00:00
attilio	c02c27a33d	Avoid a namespace pollution in vm_object.h by defining separately the structure for vm_radix implementation.	2013-02-06 18:04:28 +00:00
attilio	d3fb98bfb4	Enrich comments on newly added assertions.	2013-02-06 17:47:24 +00:00
attilio	abbe2a9b91	- Move the vm_object_cache_is_empty() prototype to be sorted alphabetically. - Change the return type to be boolean_t in order to match what vm_page_is_cached() does.	2013-02-06 17:27:41 +00:00
attilio	cde4f0caa2	Fix mismerge.	2013-02-06 17:22:16 +00:00
attilio	22b0de04b3	Reduce diffs against HEAD.	2013-02-06 17:17:11 +00:00
attilio	9d88c5279c	Now that vm_page_cache_free() and vm_page_cache_transfer() are reimplemented as ranged operations, sync vm_page_is_cached() semantic with HEAD.	2013-02-06 14:50:34 +00:00
attilio	44f85cd1a5	Reduce diffs against HEAD: Reimplement vm_page_cache_free() as a range operation.	2013-02-06 14:29:05 +00:00
attilio	439c0b8cf1	Reduce diffs against HEAD: - Reimplement vm_page_cache_transfer() properly - Remove vm_page_cache_rename() as a subsequent change	2013-02-05 00:09:33 +00:00
attilio	62f53da2e7	Merge from vmcontention	2013-02-04 22:15:36 +00:00
attilio	d3b7ec3a08	MFC	2013-02-04 22:10:01 +00:00
attilio	d61cd60feb	Reduce differences with HEAD.	2013-02-04 22:05:22 +00:00
attilio	b972b67ed7	Merge from vmcontention	2013-02-04 15:44:42 +00:00
marius	790d2fce4f	Try to improve r242655 take III: move these SYSCTLs describing the kernel map, which is defined and initialized in vm/vm_kern.c, to the latter. Submitted by: alc	2013-02-04 09:35:48 +00:00
attilio	b134f527dc	Detect address wrapup without defining the right boundary.	2013-02-04 08:53:51 +00:00
attilio	0d3b58aee0	MFC	2013-02-03 20:13:33 +00:00
glebius	e3e319a0b6	Fix typo in debug printf.	2013-01-29 19:06:16 +00:00
zont	b5edc96a84	- Add system wide page faults requiring I/O counter. Reviewed by: alc MFC after: 2 weeks	2013-01-28 12:54:53 +00:00
zont	875b69507c	- Add sysctls to show number of stats scans. MFC after: 2 weeks	2013-01-28 12:20:20 +00:00
zont	b3905d7835	- Style. MFC after: 2 weeks	2013-01-28 12:08:29 +00:00
zont	ee65990ea4	- Get rid of unused function vmspace_wired_count(). Reviewed by: alc Approved by: kib (mentor) MFC after: 1 week	2013-01-14 12:12:56 +00:00
zont	3b71bce613	- Improve readability of sys_obreak(). Suggested by: alc Reviewed by: alc Approved by: kib (mentor) MFC after: 1 week	2013-01-11 09:58:35 +00:00
zont	d2863e4c68	- Reduce kernel size by removing unnecessary pointer indirections. GENERIC kernel size reduced by 16 bytes and RACCT kernel by 336 bytes. Suggested by: alc Reviewed by: alc Approved by: kib (mentor) MFC after: 1 week	2013-01-10 12:43:58 +00:00
attilio	f458bac614	Remove vm_radix_lookupn() and its usage in the kernel.	2013-01-10 12:30:58 +00:00
ken	1abc90f894	Fix a bug in the device pager code that can trigger an assertion in devfs if a particular race condition is hit in the device pager code. This was a side effect of change 227530 which changed the device pager interface to call a new destructor routine for the cdev. That destructor routine, old_dev_pager_dtor(), takes a VM object handle. The object handle is cast to a struct cdev *, and passed into dev_rel(). That works in most cases, except the case in cdev_pager_allocate() where there is a race condition between two threads allocating an object backed by the same device. The loser of the race deallocates its object at the end of the function. The problem is that before inserting the object into the dev_pager_object_list, the object's handle is changed from the struct cdev pointer to the object's own address. This is to avoid conflicts with the winner of the race, which already inserted an object in the list with a handle that is a pointer to the same cdev structure. The object is then passed to vm_object_deallocate(), and eventually makes its way down to old_dev_pager_dtor(). That function passes the handle pointer (which is actually a VM object, not a struct cdev as usual) into dev_rel(). dev_rel() decrements the reference count in the assumed struct cdev (which happens to be 0), and that triggers the assertion in dev_rel() that the reference count is greater than or equal to 0. The fix is to add a cdev pointer to the VM object, and use that pointer when calling the cdev_pg_dtor() routine. vm_object.h: Add a struct cdev pointer to the VM object structure. device_pager.c: In cdev_pager_allocate(), populate the new cdev pointer. In dev_pager_dealloc(), use the new cdev pointer when calling the object's cdev_pg_dtor() routine. Reviewed by: kib Sponsored by: Spectra Logic Corporation MFC after: 1 week	2013-01-09 16:48:38 +00:00
attilio	fcadd67d75	MFC	2012-12-26 08:20:27 +00:00
glebius	1292747048	Comment fix: there is no ub_ptr, instead explain meaning of uz_count field verbally.	2012-12-21 10:09:45 +00:00
zont	15b694913e	- Fix locked memory accounting for maps with MAP_WIREFUTURE flag. - Add sysctl vm.old_mlock which may turn such accounting off. Reviewed by: avg, trasz Approved by: kib (mentor) MFC after: 1 week	2012-12-18 07:35:01 +00:00
attilio	be719e9167	MFC	2012-12-11 00:07:19 +00:00
alc	02094caa2c	In the past four years, we've added two new vm object types. Each time, similar changes had to be made in various places throughout the machine- independent virtual memory layer to support the new vm object type. However, in most of these places, it's actually not the type of the vm object that matters to us but instead certain attributes of its pages. For example, OBJT_DEVICE, OBJT_MGTDEVICE, and OBJT_SG objects contain fictitious pages. In other words, in most of these places, we were testing the vm object's type to determine if it contained fictitious (or unmanaged) pages. To both simplify the code in these places and make the addition of future vm object types easier, this change introduces two new vm object flags that describe attributes of the vm object's pages, specifically, whether they are fictitious or unmanaged. Reviewed and tested by: kib	2012-12-09 00:32:38 +00:00
pjd	fc89492084	White-space cleanups.	2012-12-08 09:23:05 +00:00
pjd	a585ca9ec8	Implemented uma_zone_set_warning(9) function that sets a warning, which will be printed once the given zone becomes full and cannot allocate an item. The warning will not be printed more often than every five minutes. All UMA warnings can be globally turned off by setting sysctl/tunable vm.zone_warnings to 0. Discussed on: arch Obtained from: WHEEL Systems MFC after: 2 weeks	2012-12-07 22:27:13 +00:00
alc	793b12af79	Add support for the (relatively) new object type OBJT_MGTDEVICE to vm_object_set_memattr(). Also, add a "safety belt" so that vm_object_set_memattr() doesn't silently modify undefined object types. Reviewed by: kib MFC after: 10 days	2012-11-28 18:29:34 +00:00
alc	e44badfb9f	Make a few small changes to vm_map_pmap_enter(): Add detail to the comment describing this function. In particular, describe what MAP_PREFAULT_PARTIAL does. Eliminate the abrupt change in behavior when the specified address range grows from MAX_INIT_PT pages to MAX_INIT_PT plus one pages. Instead of doing nothing, i.e., preloading no mappings whatsoever, map any resident pages that fall within the start of the specified address range, i.e., [addr, addr + ulmin(size, ptoa(MAX_INIT_PT))). Long ago, the vm object's list of resident pages was not ordered, so this function had to choose between probing the global hash table of all resident pages and iterating over the vm object's unordered list of resident pages. Now, the list is ordered, so there is no reason for MAP_PREFAULT_PARTIAL to be concerned with the vm object's count of resident changes. MFC after: 14 days	2012-11-25 19:42:36 +00:00
alc	e12b2ad698	Correct an error in r230623. When both VM_ALLOC_NODUMP and VM_ALLOC_ZERO were specified to vm_page_alloc(), PG_NODUMP wasn't being set on the allocated page when it happened to be pre-zeroed. MFC after: 5 days	2012-11-21 06:26:18 +00:00
jh	25dd09b996	- Don't pass geom and provider names as format strings. - Add __printflike() attributes. - Remove an extra argument for the g_new_geomf() call in swapongeom_ev(). Reviewed by: pjd	2012-11-20 12:32:18 +00:00
alc	1284a0383d	Update a comment to reflect the elimination of the hold queue in r242300.	2012-11-17 04:00:19 +00:00
kib	bc5bfde14d	Move the declaration of vm_phys_paddr_to_vm_page() from vm/vm_page.h to vm/vm_phys.h, where it belongs. Requested and reviewed by: alc MFC after: 2 weeks	2012-11-16 05:55:56 +00:00
kib	75f2aa672f	Explicitly state that M_USE_RESERVE requires M_NOWAIT, using assertion. Reviewed by: alc MFC after: 2 weeks	2012-11-16 05:49:56 +00:00
kib	e8ae50d444	Flip the semantic of M_NOWAIT to only require the allocation to not sleep, and perform the page allocations with VM_ALLOC_SYSTEM class. Previously, the allocation was also allowed to completely drain the reserve of the free pages, being translated to VM_ALLOC_INTERRUPT request class for vm_page_alloc() and similar functions. Allow the caller of malloc* to request the 'deep drain' semantic by providing M_USE_RESERVE flag, now translated to VM_ALLOC_INTERRUPT class. Previously, it resulted in less aggressive VM_ALLOC_SYSTEM allocation class. Centralize the translation of the M_* malloc(9) flags in the single inline function malloc2vm_flags(). Discussion started by: "Sears, Steven" <Steven.Sears@netapp.com> Reviewed by: alc, mdf (previous version) Tested by: pho (previous version) MFC after: 2 weeks	2012-11-14 20:01:40 +00:00
alc	ff7333d33f	Replace the single, global page queues lock with per-queue locks on the active and inactive paging queues. Reviewed by: kib	2012-11-13 02:50:39 +00:00
attilio	7efc7fb950	Fix DDB command "show map XXX": - Check that an argument is always available, otherwise current map printing before to recurse is garbage. - Spit out a message if an argument is not provided. - Remove unread nlines variable. - Use an explicit recursive function, disassociated from the DB_SHOW_COMMAND() body, in order to make clear prototype and recursion of the above mentioned function. The code results now much less obscure. Submitted by: gianni	2012-11-12 00:30:40 +00:00
kib	f16ea99007	The r241025 fixed the case when a binary, executed from nullfs mount, was still possible to open for write from the lower filesystem. There is a symmetric situation where the binary could already have file descriptors opened for write, but it can be executed from the nullfs overlay. Handle the issue by passing one v_writecount reference to the lower vnode if nullfs vnode has non-zero v_writecount. Note that only one write reference can be donated, since nullfs only keeps one use reference on the lower vnode. Always use the lower vnode v_writecount for the checks. Introduce the VOP_GET_WRITECOUNT to read v_writecount, which is currently always bypassed to the lower vnode, and VOP_ADD_WRITECOUNT to manipulate the v_writecount value, which manages a single bypass reference to the lower vnode. Calling the VOPs instead of directly accessing v_writecount provides the fix described in the previous paragraph. Tested by: pho MFC after: 3 weeks	2012-11-02 13:56:36 +00:00
alc	60d5a532fb	In general, we call pmap_remove_all() before calling vm_page_cache(). So, the call to pmap_remove_all() within vm_page_cache() is usually redundant. This change eliminates that call to pmap_remove_all() and introduces a call to pmap_remove_all() before vm_page_cache() in the one place where it didn't already exist. When iterating over a paging queue, if the object containing the current page has a zero reference count, then the page can't have any managed mappings. So, a call to pmap_remove_all() is pointless. Change a panic() call in vm_page_cache() to a KASSERT(). MFC after: 6 weeks	2012-11-01 16:20:02 +00:00
attilio	d38d7bb245	Rework the known mutexes to benefit about staying on their own cache line in order to avoid manual frobbing but using struct mtx_padalign. The sole exception being nvme and sxfge drivers, where the author redefined CACHE_LINE_SIZE manually, so they need to be analyzed and dealt with separately. Reviewed by: jimharris, alc	2012-10-31 18:07:18 +00:00
alc	77582e8298	Replace the page hold queue, PQ_HOLD, by a new page flag, PG_UNHOLDFREE, because the queue itself serves no purpose. When a held page is freed, inserting the page into the hold queue has the side effect of setting the page's "queue" field to PQ_HOLD. Later, when the page is unheld, it will be freed because the "queue" field is PQ_HOLD. In other words, PQ_HOLD is used as a flag, not a queue. So, this change replaces it with a flag. To accommodate the new page flag, make the page's "flags" field wider and "oflags" field narrower. Reviewed by: kib	2012-10-29 06:15:04 +00:00
trasz	edd0d84112	Remove useless check; vm_pindex_t is unsigned on all architectures. CID: 3701 Found with: Coverity Prevent	2012-10-28 20:03:57 +00:00
mdf	1bc1b805d7	Const-ify the zone name argument to uma_zcreate(9). MFC after: 3 days	2012-10-26 17:51:05 +00:00
andre	a8b2ff5af7	Move the corresponding MTX_SYSINIT() next to their struct mtx declaration to make their relationship more obvious as done with the other such mutexes.	2012-10-26 17:31:35 +00:00
kib	ddaaa16d8b	Commit the actual text provided by Alan, instead of the wrong update in r242011. MFC after: 1 week	2012-10-24 18:32:37 +00:00
kib	2a76567642	Dirty the newly copied anonymous pages after the wired region is forked. Otherwise, pagedaemon might reclaim the page without saving its content into the swap file, resulting in the valid content replaced by zeroes. Reported and tested by: pho Reviewed and comment update by: alc MFC after: 1 week	2012-10-24 18:21:59 +00:00
attilio	64eaf39fd7	MFC	2012-10-22 21:26:36 +00:00
kib	560aa751e0	Remove the support for using non-mpsafe filesystem modules. In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems. The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes. Conducted and reviewed by: attilio Tested by: pho	2012-10-22 17:50:54 +00:00
eadler	23c67e54ce	Print flags as hex instead of an integer. PR: kern/168210 Submitted by: linimon Reviewed by: alc Approved by: cperciva MFC after: 3 days	2012-10-22 02:11:57 +00:00
alc	1df941a7f3	Move vm_page_requeue() to the only file that uses it. MFC after: 3 weeks	2012-10-13 20:19:43 +00:00
alc	c84b1820ea	Eliminate the conditional for releasing the page queues lock in vm_page_sleep(). vm_page_sleep() is no longer called with this lock held. Eliminate assertions that the page queues lock is NOT held. These assertions won't translate well to having distinct locks on the active and inactive page queues, and they really aren't that useful. MFC after: 3 weeks	2012-10-13 18:46:46 +00:00
alc	271cefc5f6	Tidy up a bit: Update some of the comments. In particular, use "sleep" in preference to "block" where appropriate. Eliminate some unnecessary casts. Make a few whitespace changes for consistency. Reviewed by: kib MFC after: 3 days	2012-10-03 05:06:45 +00:00
kib	8f845e475e	Fix the mis-handling of the VV_TEXT on the nullfs vnodes. If you have a binary on a filesystem which is also mounted over by nullfs, you could execute the binary from the lower filesystem, or from the nullfs mount. When executed from lower filesystem, the lower vnode gets VV_TEXT flag set, and the file cannot be modified while the binary is active. But, if executed as the nullfs alias, only the nullfs vnode gets VV_TEXT set, and you still can open the lower vnode for write. Add a set of VOPs for the VV_TEXT query, set and clear operations, which are correctly bypassed to lower vnode. Tested by: pho (previous version) MFC after: 2 weeks	2012-09-28 11:25:02 +00:00
alc	a0349df30f	Address a race condition that was introduced in r238212. Unless the page queues lock is acquired before the page lock is released, there is no guarantee that the page will still be in that same page queue when vm_page_requeue() is called. Reported by: pho In collaboration with: kib MFC after: 3 days	2012-09-23 17:42:39 +00:00
attilio	be930066f1	MFC	2012-09-21 03:07:34 +00:00
kib	667e5e154a	Plug the accounting leak for the wired pages when msync(MS_INVALIDATE) is performed on the vnode mapping which is wired in other address space. While there, explicitly assert that the page is unwired and zero the wire_count instead of subtracting. The condition is rechecked later in vm_page_free(_toq) already. Reported and tested by: zont Reviewed by: alc (previous version) MFC after: 1 week	2012-09-20 09:52:57 +00:00
glebius	e4b6b754eb	If caller specifies UMA_ZONE_OFFPAGE explicitly, then do not waste memory in an allocation for a slab. Reviewed by: jeff	2012-09-18 20:28:55 +00:00
eadler	8600cbb5b6	Correct double "the the" Approved by: cperciva MFC after: 3 days	2012-09-14 21:28:56 +00:00
zont	2b9b209471	- Simplify VM code by using vmspace_wired_count() for counting wired memory of a process. Reviewed by: avg Approved by: kib (mentor) MFC after: 2 weeks	2012-09-05 18:19:54 +00:00
des	71dbd73468	Whitespace cleanup.	2012-09-05 12:24:50 +00:00
des	3ed9c078db	No memory barrier is required. This was pointed out by kib@ a while ago, but I got distracted by other matters. (for real this time)	2012-09-04 22:19:33 +00:00
des	dec17a5bb5	Revert previous commit, which was performed in the wrong tree.	2012-09-04 21:06:53 +00:00
des	627c3f1a6e	No memory barrier is required. This was pointed out by kib@ a while ago, but I got distracted by other matters.	2012-09-04 19:04:02 +00:00
zont	f93fc1d719	- After r240026 sgrowsiz should be used in a safer manner. Approved by: kib (mentor) MFC after: 1 week	2012-09-03 09:34:46 +00:00
zont	2f4305a824	- Remove accounting of locked memory from vsunlock(9) that I missed in r239818. Approved by: kib (mentor)	2012-08-30 08:03:33 +00:00
zont	85dfc3b8b7	- Don't take an account of locked memory for current process in vslock(9). There are two consumers of vslock(9): sysctl code and drm driver. These consumers are using locked memory as transient memory, it doesn't belong to a process's memory. Suggested by: avg Reviewed by: alc Approved by: kib (mentor) MFC after: 2 weeks	2012-08-29 11:23:59 +00:00
attilio	d3c5a80b69	MFC	2012-08-27 11:59:04 +00:00
pluknet	91ac59768e	Typo in previous change: print half the theoretical maximum as maximum recommended amount. Reported by: <site freebsd at orientalsensation com> Reviewed by: des	2012-08-27 10:59:49 +00:00
glebius	b1ab314c3f	Fix function name in keg_cachespread_init() assert.	2012-08-26 09:54:11 +00:00
des	5e88649166	- When running out of swzone, instead of spewing an error message every tick until the situation is resolved (if ever), just print a single message when running out and another when space becomes available. - When adding more swap, warn if the total amount exceeds half the theoretical maximum we can handle.	2012-08-16 08:29:49 +00:00
kib	ce7012daf6	For old mmap syscall, when executing on amd64 or ia64, enforce the PROT_EXEC if prot is non-zero, process is 32bit and kern.elf32.i386_read_exec sysctl is enabled. This workaround is needed for old i386 a.out binaries, where dynamic linker did not specify PROT_EXEC for mapping of the text. The kern.elf32.i386_read_exec MIB name looks weird for a.out binaries, but I reused the existing knob which already has the needed semantic. MFC after: 1 week	2012-08-14 12:11:48 +00:00
kib	0d46b47153	Adjust the r205536, by allowing a non-zero offset for anonymous mappings for a.out binaries. Apparently, a.out ld.so from FreeBSD 1.1.5.1 can issue such requests. Reported and tested by: Dan Plassche <dplassche@gmail.com> MFC after: 1 week	2012-08-14 11:47:07 +00:00
kib	a3d0fb0175	Do not leave invalid pages in the object after the short read for network file systems (not only NFS proper). Short reads cause pages other than the requested one, which were not filled by read response, to stay invalid. Change the vm_page_readahead_finish() interface to not take the error code, but instead to make a decision to free or to (de)activate the page only by its validity. As a result, not requested invalid pages are freed even if the read RPC indicated success. Noted and reviewed by: alc MFC after: 1 week	2012-08-14 11:45:47 +00:00
alc	cd8266338a	Never sleep on busy pages in vm_pageout_launder(), always skip them. Long ago, sleeping on busy pages in vm_pageout_launder() made sense. The call to vm_pageout_flush() specified asynchronous I/O and sleeping on busy pages blocked vm_pageout_launder() until the flush had completed. However, in CVS revision 1.35 of vm/vm_contig.c, the call to vm_pageout_flush() was changed to request synchronous I/O, but the sleep on busy pages was not removed.	2012-08-07 04:48:14 +00:00
kib	cac2fe116f	After the PHYS_TO_VM_PAGE() function was de-inlined, the main reason to pull vm_param.h was removed. Other big dependency of vm_page.h on vm_param.h are PA_LOCK* definitions, which are only needed for in-kernel code, because modules use KBI-safe functions to lock the pages. Stop including vm_param.h into vm_page.h. Include vm_param.h explicitly for the kernel code which needs it. Suggested and reviewed by: alc MFC after: 2 weeks	2012-08-05 14:11:42 +00:00
kib	4259905d31	Reduce code duplication and exposure of direct access to struct vm_page oflags by providing helper function vm_page_readahead_finish(), which handles completed reads for pages with indexes other than the requested one, for VOP_GETPAGES(). Reviewed by: alc MFC after: 1 week	2012-08-04 18:16:43 +00:00
attilio	c52a057b19	MFC	2012-08-03 15:58:05 +00:00
alc	5b4712b5a1	Inline vm_page_aflags_clear() and vm_page_aflags_set(). Add comments stating that neither these functions nor the flags that they are used to manipulate are part of the KBI.	2012-08-03 01:48:15 +00:00
alc	ceefb8bf17	Eliminate an unneeded declaration. (I should have removed this as part of r227568.)	2012-07-30 20:38:37 +00:00
kib	4f8212948b	Do not requeue held page or page for which locking failed, just leave them alone. Process the act_count updates for the held pages in the vm_pageout loop over the inactive queue, instead of refusing to do anything with such page. Clarify the intent of the addl_page_shortage counter and change its use for pages which are not processed in the loop according to the description. Reviewed by: alc MFC after: 2 weeks	2012-07-26 09:06:48 +00:00
alc	26fd7fb588	Addendum to r238604. If the inactive queue scan isn't restarted, then the variable "addl_page_shortage_init" isn't needed. X-MFC after: r238604	2012-07-24 02:35:30 +00:00
kib	80c3756a6f	Do not restart scan of the inactive queue when non-inactive page is found. Rather, we shall not find such pages on inactive queue at all. Requested and reviewed by: alc MFC after: 2 weeks	2012-07-18 21:47:50 +00:00
alc	e5949174d4	Move what remains of vm/vm_contig.c into vm/vm_pageout.c, where similar code resides. Rename vm_contig_grow_cache() to vm_pageout_grow_cache(). Reviewed by: kib	2012-07-18 05:21:34 +00:00
alc	ad2692aed9	Correct vm_page_alloc_contig()'s implementation of VM_ALLOC_NODUMP.	2012-07-17 02:36:59 +00:00


3375 Commits