freebsd-skq

Author	SHA1	Message	Date
attilio	e9f37cac74	On all the architectures, avoid to preallocate the physical memory for nodes used in vm_radix. On architectures supporting direct mapping, also avoid to pre-allocate the KVA for such nodes. In order to do so make the operations derived from vm_radix_insert() to fail and handle all the deriving failure of those. vm_radix-wise introduce a new function called vm_radix_replace(), which can replace a leaf node, already present, with a new one, and take into account the possibility, during vm_radix_insert() allocation, that the operations on the radix trie can recurse. This means that if operations in vm_radix_insert() recursed vm_radix_insert() will start from scratch again. Sponsored by: EMC / Isilon storage division Reviewed by: alc (older version) Reviewed by: jeff Tested by: pho, scottl	2013-08-09 11:28:55 +00:00
kib	da0e8446db	Never remove user-wired pages from an object when doing msync(MS_INVALIDATE). The vm_fault_copy_entry() requires that object range which corresponds to the user-wired vm_map_entry, is always fully populated. Add OBJPR_NOTWIRED flag for vm_object_page_remove() to request the preserving behaviour, use it when calling vm_object_page_remove() from vm_object_sync(). Reported and tested by: pho Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2013-07-11 05:47:26 +00:00
attilio	fdf82ef9cf	o Relax locking assertions for vm_page_find_least() o Relax locking assertions for pmap_enter_object() and add them also to architectures that currently don't have any o Introduce VM_OBJECT_LOCK_DOWNGRADE() which is basically a downgrade operation on the per-object rwlock o Use all the mechanisms above to make vm_map_pmap_enter() to work mostl of the times only with readlocks. Sponsored by: EMC / Isilon storage division Reviewed by: alc	2013-05-21 20:38:19 +00:00
kib	2f2c1edec8	Rework the handling of the tmpfs node backing swap object and tmpfs vnode v_object to avoid double-buffering. Use the same object both as the backing store for tmpfs node and as the v_object. Besides reducing memory use up to 2x times for situation of mapping files from tmpfs, it also makes tmpfs read and write operations copy twice bytes less. VM subsystem was already slightly adapted to tolerate OBJT_SWAP object as v_object. Now the vm_object_deallocate() is modified to not reinstantiate OBJ_ONEMAPPING flag and help the VFS to correctly handle VV_TEXT flag on the last dereference of the tmpfs backing object. Reviewed by: alc Tested by: pho, bf MFC after: 1 month	2013-04-28 19:38:59 +00:00
alc	2c9c761886	Introduce vm_radix_is_empty(), and use it in place of vm_object_cache_is_empty() where the caller is aware of the page cache's implementation as a radix trie. Sponsored by: EMC / Isilon Storage Division	2013-03-10 17:30:57 +00:00
attilio	76954ad68a	Merge from vmcontention.	2013-03-09 03:19:53 +00:00
attilio	16a80466e5	MFC	2013-03-09 02:51:51 +00:00
attilio	72f7f3e528	Switch the vm_object mutex to be a rwlock. This will enable in the future further optimizations where the vm_object lock will be held in read mode most of the time the page cache resident pool of pages are accessed for reading purposes. The change is mostly mechanical but few notes are reported: * The KPI changes as follow: - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK() - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK() - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK() - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details) - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED() * The vm/vm_pager.h namespace pollution avoidance (forcing requiring sys/mutex.h in consumers directly to cater its inlining functions using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h consumers now must include also sys/rwlock.h. * zfs requires a quite convoluted fix to include FreeBSD rwlocks into the compat layer because the name clash between FreeBSD and solaris versions must be avoided. At this purpose zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs. The KPI results heavilly broken by this commit. Thirdy part ports must be updated accordingly (I can think off-hand of VirtualBox, for example). Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: pjd (ZFS specific review) Discussed with: alc Tested by: pho	2013-03-09 02:32:23 +00:00
attilio	754f3790b8	Merge from vmc-playground: Introduce a new KPI that verifies if the page cache is empty for a specified vm_object. This KPI does not make assumptions about the locking in order to be used also for building assertions at init and destroy time. It is mostly used to hide implementation details of the page cache. Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: alc (vm_radix based version) Tested by: flo, pho, jhb, davide	2013-03-09 02:05:29 +00:00
attilio	709ad55889	Evaluations on the likelyhood of empty object cache cannot be made in general way but must be evaluated case by case. Embedd the decision in the caller themselves rather than in a general purpose KPI. Sponsored by: EMC / Isilon storage division Reported by: alc Reviewed by: alc	2013-03-04 12:33:40 +00:00
attilio	31dffb7b33	Merge from vmcontention	2013-02-27 18:25:57 +00:00
attilio	6ff1954532	MFC	2013-02-27 18:23:12 +00:00
attilio	8d28f94790	Merge from vmobj-rwlock: VM_OBJECT_LOCKED() macro is only used to implement a custom version of lock assertions right now (which likely spread out thanks to copy and paste). Remove it and implement actual assertions. Sponsored by: EMC / Isilon storage division Reviewed by: alc Tested by: pho	2013-02-27 18:12:13 +00:00
attilio	74f58faa15	MFC	2013-02-26 23:52:23 +00:00
attilio	cbe7c0e167	MFC	2013-02-26 23:43:28 +00:00
attilio	cc89d0bd92	Merge from vmc-playground branch: Replace the sub-optimal uma_zone_set_obj() primitive with more modern uma_zone_reserve_kva(). The new primitive reserves before hand the necessary KVA space to cater the zone allocations and allocates pages with ALLOC_NOOBJ. More specifically: - uma_zone_reserve_kva() does not need an object to cater the backend allocator. - uma_zone_reserve_kva() can cater M_WAITOK requests, in order to serve zones which need to do uma_prealloc() too. - When possible, uma_zone_reserve_kva() uses directly the direct-mapping by uma_small_alloc() rather than relying on the KVA / offset combination. The removal of the object attribute allows 2 further changes: 1) _vm_object_allocate() becomes static within vm_object.c 2) VM_OBJECT_LOCK_INIT() is removed. This function is replaced by direct calls to mtx_init() as there is no need to export it anymore and the calls aren't either homogeneous anymore: there are now small differences between arguments passed to mtx_init(). Sponsored by: EMC / Isilon storage division Reviewed by: alc (which also offered almost all the comments) Tested by: pho, jhb, davide	2013-02-26 23:35:27 +00:00
attilio	210a93e7f7	Merge from vmcontention	2013-02-26 18:18:39 +00:00
attilio	134623836d	MFC	2013-02-26 18:11:43 +00:00
attilio	49f99b7251	Wrap the sleeps synchronized by the vm_object lock into the specific macro VM_OBJECT_SLEEP(). This hides some implementation details like the usage of the msleep() primitive and the necessity to access to the lock address directly. For this reason VM_OBJECT_MTX() macro is now retired. Sponsored by: EMC / Isilon storage division Reviewed by: alc Tested by: pho	2013-02-26 17:22:08 +00:00
attilio	e590e8091e	As VM_OBJECT_SLEEP() is a vm_object_t specific function, make the passed object as the first argument of the function for consistency. Sponsored by: EMC / Isilon storage revision	2013-02-26 01:38:12 +00:00
attilio	12289fcebc	Retire the old UMA primitive uma_zone_set_obj() and replace it with the more modern uma_zone_reserve_kva(). The difference is that it doesn't rely anymore on an obj to allocate pages and the slab allocator doesn't use any more any specific locking but atomic operations to complete the operation. Where possible, the uma_small_alloc() is instead used and the uk_kva member becomes unused. The subsequent cleanups also brings along the removal of VM_OBJECT_LOCK_INIT() macro which is not used anymore as the code can be easilly cleaned up to perform a single mtx_init(), private to vm_object.c. For the same reason, _vm_object_allocate() becomes private as well. Sponsored by: EMC / Isilon storage division Reviewed by: alc	2013-02-24 16:41:36 +00:00
attilio	8702b26c68	Complete the asserts by definining also assertions for RA_RLOCKED and RA_LOCKED cases. Sponsored by: EMC / Isilon storage division Requested by: alc	2013-02-21 21:56:51 +00:00
attilio	905e648d42	Hide the details for the assertion for VM_OBJECT_LOCK operations. Rename current VM_OBJECT_LOCK_ASSERT(foo, RA_WLOCKED) into VM_OBJECT_ASSERT_WLOCKED(foo) Sponsored by: EMC / Isilon storage division Requested by: alc	2013-02-21 21:54:53 +00:00
attilio	b2afca4987	Add read mode operations to VM_OBJECT_LOCK* class of functions. Sponsored by: EMC / Isilon storage division	2013-02-20 12:06:33 +00:00
attilio	15bf891afe	Rename VM_OBJECT_LOCK(), VM_OBJECT_UNLOCK() and VM_OBJECT_TRYLOCK() to their "write" versions. Sponsored by: EMC / Isilon storage division	2013-02-20 12:03:20 +00:00
attilio	1f1e13ca03	There is no need to use VM_OBJECT_LOCKED() as the assertion won't make the check available in any case if INVARIANTS is switched off. Remove VM_OBJECT_LOCKED().	2013-02-20 10:51:34 +00:00
attilio	6a2e2ce522	Remove unused VM_OBJECT_LOCKPTR(). Sponsored by: EMC / Isilon storage division	2013-02-20 10:40:27 +00:00
attilio	658534ed5a	Switch vm_object lock to be a rwlock. * VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations * VM_OBJECT_SLEEP() is introduced as a general purpose primitve to get a sleep operation using a VM_OBJECT_LOCK() as protection * The approach must bear with vm_pager.h namespace pollution so many files require including directly rwlock.h	2013-02-20 10:38:34 +00:00
attilio	12b7890c27	Reduce differences with HEAD.	2013-02-07 11:36:34 +00:00
attilio	950536d7a4	Reformat comments to follow original version and re-add correct locking flags.	2013-02-06 23:48:04 +00:00
attilio	86cff934d5	Do not assume the lock to be held so that this can be used also in safe cases as a short-cut.	2013-02-06 19:03:48 +00:00
attilio	9066f231e3	Tweak comment to remove splay tree references.	2013-02-06 19:02:46 +00:00
attilio	1bab15985e	Make vm_object_cache_is_empty() inline.	2013-02-06 18:59:34 +00:00
attilio	c02c27a33d	Avoid a namespace pollution in vm_object.h by defining separately the structure for vm_radix implementation.	2013-02-06 18:04:28 +00:00
attilio	abbe2a9b91	- Move the vm_object_cache_is_empty() prototype to be sorted alphabetically. - Change the return type to be boolean_t in order to match what vm_page_is_cached() does.	2013-02-06 17:27:41 +00:00
attilio	b972b67ed7	Merge from vmcontention	2013-02-04 15:44:42 +00:00
attilio	0d3b58aee0	MFC	2013-02-03 20:13:33 +00:00
ken	1abc90f894	Fix a bug in the device pager code that can trigger an assertion in devfs if a particular race condition is hit in the device pager code. This was a side effect of change 227530 which changed the device pager interface to call a new destructor routine for the cdev. That destructor routine, old_dev_pager_dtor(), takes a VM object handle. The object handle is cast to a struct cdev *, and passed into dev_rel(). That works in most cases, except the case in cdev_pager_allocate() where there is a race condition between two threads allocating an object backed by the same device. The loser of the race deallocates its object at the end of the function. The problem is that before inserting the object into the dev_pager_object_list, the object's handle is changed from the struct cdev pointer to the object's own address. This is to avoid conflicts with the winner of the race, which already inserted an object in the list with a handle that is a pointer to the same cdev structure. The object is then passed to vm_object_deallocate(), and eventually makes its way down to old_dev_pager_dtor(). That function passes the handle pointer (which is actually a VM object, not a struct cdev as usual) into dev_rel(). dev_rel() decrements the reference count in the assumed struct cdev (which happens to be 0), and that triggers the assertion in dev_rel() that the reference count is greater than or equal to 0. The fix is to add a cdev pointer to the VM object, and use that pointer when calling the cdev_pg_dtor() routine. vm_object.h: Add a struct cdev pointer to the VM object structure. device_pager.c: In cdev_pager_allocate(), populate the new cdev pointer. In dev_pager_dealloc(), use the new cdev pointer when calling the object's cdev_pg_dtor() routine. Reviewed by: kib Sponsored by: Spectra Logic Corporation MFC after: 1 week	2013-01-09 16:48:38 +00:00
attilio	be719e9167	MFC	2012-12-11 00:07:19 +00:00
alc	02094caa2c	In the past four years, we've added two new vm object types. Each time, similar changes had to be made in various places throughout the machine- independent virtual memory layer to support the new vm object type. However, in most of these places, it's actually not the type of the vm object that matters to us but instead certain attributes of its pages. For example, OBJT_DEVICE, OBJT_MGTDEVICE, and OBJT_SG objects contain fictitious pages. In other words, in most of these places, we were testing the vm object's type to determine if it contained fictitious (or unmanaged) pages. To both simplify the code in these places and make the addition of future vm object types easier, this change introduces two new vm object flags that describe attributes of the vm object's pages, specifically, whether they are fictitious or unmanaged. Reviewed and tested by: kib	2012-12-09 00:32:38 +00:00
attilio	e237fbcafa	Merge from vmcontention	2012-07-08 16:12:59 +00:00
attilio	ffa3f082ff	- Split the cached and resident pages tree into 2 distinct ones. This makes the RED/BLACK support go away and simplifies a lot vmradix functions used here. This happens because with patricia trie support the trie will be little enough that keeping 2 diffetnt will be efficient too. - Reduce differences with head, in places like backing scan where the optimizazions used shuffled the code a little bit around. Tested by: flo, Andrea Barberio	2012-07-08 14:01:25 +00:00
attilio	5d9dd820d8	MFC	2012-06-23 02:08:15 +00:00
attilio	a7497af6fd	- Add a comment explaining the locking of the cached pages pool held by vm_objects. - Add flags for the per-object lock and free pages queue mutex lock. Use the newly added flags to mark the cache root within the vm_object structure. Please note that other vm_object members should be marked with correct locking but they are left for other commits. In collabouration with: alc MFC after: 3 days3 days3 days	2012-06-22 18:34:11 +00:00
attilio	3fc8e26872	Introduce a new tree for dealing with cached pages separately and remove the RED/BLACK concept. This is based on the assumption that path-compressed tries will be small and fast enough that a separate trie for cached pages will make sense and will leave the trie code simple enough (along with removing a lot of differences in the userend code).	2012-06-09 12:25:30 +00:00
attilio	7ee59361d5	MFC	2012-03-19 18:54:01 +00:00
jhb	9628d3dbf8	Fix madvise(MADV_WILLNEED) to properly handle individual mappings larger than 4GB. Specifically, the inlined version of 'ptoa' of the the 'int' count of pages overflowed on 64-bit platforms. While here, change vm_object_madvise() to accept two vm_pindex_t parameters (start and end) rather than a (start, count) tuple to match other VM APIs as suggested by alc@.	2012-03-19 18:47:34 +00:00
kib	2963c3c979	In vm_object_page_clean(), do not clean OBJ_MIGHTBEDIRTY object flag if the filesystem performed short write and we are skipping the page due to this. Propogate write error from the pager back to the callers of vm_pageout_flush(). Report the failure to write a page from the requested range as the FALSE return value from vm_object_page_clean(), and propagate it back to msync(2) to return EIO to usermode. While there, convert the clearobjflags variable in the vm_object_page_clean() and arguments of the helper functions to boolean. PR: kern/165927 Reviewed by: alc MFC after: 2 weeks	2012-03-17 23:00:32 +00:00
attilio	d4c43cbb8b	MFC	2012-02-25 18:24:45 +00:00
kib	f315a59476	Account the writeable shared mappings backed by file in the vnode v_writecount. Keep the amount of the virtual address space used by the mappings in the new vm_object un_pager.vnp.writemappings counter. The vnode v_writecount is incremented when writemappings gets non-zero value, and decremented when writemappings is returned to zero. Writeable shared vnode-backed mappings are accounted for in vm_mmap(), and vm_map_insert() is instructed to set MAP_ENTRY_VN_WRITECNT flag on the created map entry. During deferred map entry deallocation, vm_map_process_deferred() checks for MAP_ENTRY_VN_WRITECOUNT and decrements writemappings for the vm object. Now, the writeable mount cannot be demoted to read-only while writeable shared mappings of the vnodes from the mount point exist. Also, execve(2) fails for such files with ETXTBUSY, as it should be. Noted by: tegge Reviewed by: tegge (long time ago, early version), alc Tested by: pho MFC after: 3 weeks	2012-02-23 21:07:16 +00:00

1 2 3 4

185 Commits