freebsd-skq

Author	SHA1	Message	Date
glebius	742daba58c	Fix incorrect assertion that could miss overflows. Reviewed by: kib	2016-10-19 19:50:09 +00:00
kib	b028e9a2d9	Clarify the vnode_destroy_vobject() logic handling for already terminated objects. Assert that there is no new waiters for the already terminated objects. Old waiters should have been notified by the termination calling vnode_pager_dealloc() (old/new are with regard of the lock acquisition interval). Only clear the vp->v_object for the case of already terminated object, since other branches call vnode_pager_dealloc(), which should clear the pointer. Assert this. Tested by: pho Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Approved by: re (gjb)	2016-07-05 11:21:02 +00:00
kib	8da898f26c	Add implementation of robust mutexes, hopefully close enough to the intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013. A robust mutex is guaranteed to be cleared by the system upon either thread or process owner termination while the mutex is held. The next mutex locker is then notified about inconsistent mutex state and can execute (or abandon) corrective actions. The patch mostly consists of small changes here and there, adding neccessary checks for the inconsistent and abandoned conditions into existing paths. Additionally, the thread exit handler was extended to iterate over the userspace-maintained list of owned robust mutexes, unlocking and marking as terminated each of them. The list of owned robust mutexes cannot be maintained atomically synchronous with the mutex lock state (it is possible in kernel, but is too expensive). Instead, for the duration of lock or unlock operation, the current mutex is remembered in a special slot that is also checked by the kernel at thread termination. Kernel must be aware about the per-thread location of the heads of robust mutex lists and the current active mutex slot. When a thread touches a robust mutex for the first time, a new umtx op syscall is issued which informs about location of lists heads. The umtx sleep queues for PP and PI mutexes are split between non-robust and robust. Somewhat unrelated changes in the patch: 1. Style. 2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared pi mutexes. 3. Removal of the userspace struct pthread_mutex m_owner field. 4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls the lifetime of the shared mutex associated with a vnode' page. Reviewed by: jilles (previous version, supposedly the objection was fixed) Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects) Tested by: pho Sponsored by: The FreeBSD Foundation	2016-05-17 09:56:22 +00:00
pfg	8327dbfe4d	sys/vm: minor spelling fixes in comments. No functional change.	2016-05-02 20:16:29 +00:00
kib	6cbc48de82	Add missed relpbuf() for a smallfs page-in. Reported by: Shawn Webb Tested by: pho Sponsored by: The FreeBSD Foundation	2015-12-27 14:42:39 +00:00
glebius	63cd1c131a	A change to KPI of vm_pager_get_pages() and underlying VOP_GETPAGES(). o With new KPI consumers can request contiguous ranges of pages, and unlike before, all pages will be kept busied on return, like it was done before with the 'reqpage' only. Now the reqpage goes away. With new interface it is easier to implement code protected from race conditions. Such arrayed requests for now should be preceeded by a call to vm_pager_haspage() to make sure that request is possible. This could be improved later, making vm_pager_haspage() obsolete. Strenghtening the promises on the business of the array of pages allows us to remove such hacks as swp_pager_free_nrpage() and vm_pager_free_nonreq(). o New KPI accepts two integer pointers that may optionally point at values for read ahead and read behind, that a pager may do, if it can. These pages are completely owned by pager, and not controlled by the caller. This shifts the UFS-specific readahead logic from vm_fault.c, which should be file system agnostic, into vnode_pager.c. It also removes one VOP_BMAP() request per hard fault. Discussed with: kib, alc, jeff, scottl Sponsored by: Nginx, Inc. Sponsored by: Netflix	2015-12-16 21:30:45 +00:00
kib	c95b809b8d	Record proper commit message for r291157. The r289895 revision did not accounted for the block containing the requested page, when calculating the run of pages. Include the pages before/after the requested page, that fit into the reqblock, into the calculation. Noted by: glebius Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-11-22 09:50:13 +00:00
kib	0ae248d3d0	Noted by: glebius Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-11-22 09:48:03 +00:00
glebius	6460e3db4e	Remove remnants of the old NFS from vnode pager. Reviewed by: kib Sponsored by: Netflix	2015-11-20 23:52:27 +00:00
kib	326fa7f350	Reduce the amount of calls to VOP_BMAP() made from the local vnode pager. It is enough to execute VOP_BMAP() once to obtain both the disk block address for the requested page, and the before/after limits for the contiguous run. The clipping of the vm_page_t array passed to the vnode_pager_generic_getpages() and the disk address for the first page in the clipped array can be deduced from the call results. While there, remove some noise (like if (1) {...}) and adjust nearby code. Reviewed by: alc Discussed with: glebius Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 weeks	2015-10-24 21:59:22 +00:00
jeff	3fb666cfae	Refactor unmapped buffer address handling. - Use pointer assignment rather than a combination of pointers and flags to switch buffers between unmapped and mapped. This eliminates multiple flags and generally simplifies the logic. - Eliminate b_saveaddr since it is only used with pager bufs which have their b_data re-initialized on each allocation. - Gather up some convenience routines in the buffer cache for manipulating buf space and buf malloc space. - Add an inline, buf_mapped(), to standardize checks around unmapped buffers. In collaboration with: mlaier Reviewed by: kib Tested by: pho (many small revisions ago) Sponsored by: EMC / Isilon Storage Division	2015-07-23 19:13:41 +00:00
kib	cb622d9740	Satisfy vm_object uma zone destructor requirements after r282660 when vnode object creation raced. Reported by: pho Reviewed by: alc Sponsored by: The FreeBSD Foundation	2015-05-10 08:21:03 +00:00
glebius	a7d547aa97	Fix the KASSERT and improve wording in r282426. Submitted by: alc	2015-05-06 08:07:11 +00:00
glebius	acfa186500	Fix arithmetical bug in vnode_pager_haspage(). The check against object size should be done not with the number of pages in the first block, but with the overall number of pages. While here, add KASSERT that makes sure that BMAP doesn't return completely irrelevant blocks. Reviewed by: kib Tested by: pho Sponsored by: Netflix Sponsored by: Nginx, Inc.	2015-05-04 18:49:25 +00:00
glebius	e4390a8823	Catch up on r271387 and remove unused parameter from VOP_GETPAGES_ASYNC().	2015-03-30 22:49:26 +00:00
alc	b131a2abc8	Introduce vm_object_color() and use it in mmap(2) to set the color of named objects to zero before the virtual address is selected. Previously, the color setting was delayed until after the virtual address was selected. In rtld, this delay effectively prevented the mapping of a shared library's code section using superpages. Now, for example, we see the first 1 MB of libc's code on armv6 mapped by a superpage after we've gotten through the initial cold misses that bring the first 1 MB of code into memory. (With the page clustering that we perform on read faults, this happens quickly.) Differential Revision: https://reviews.freebsd.org/D2013 Reviewed by: jhb, kib Tested by: Svatopluk Kraus (armv6) MFC after: 6 weeks	2015-03-21 17:56:55 +00:00
glebius	398be53682	o Enhance vm_pager_free_nonreq() function: - Allow to call the function with vm object lock held. - Allow to specify reqpage that doesn't match any page in the region, meaning freeing all pages. o Utilize the new function in couple more places in vnode pager. Reviewed by: alc, kib Sponsored by: Netflix Sponsored by: Nginx, Inc.	2015-03-17 19:19:19 +00:00
glebius	df5d850742	Provide a comment explaining r279688. Suggested by: alc	2015-03-16 14:24:47 +00:00
glebius	6bbfdd570a	Fix function name in comment.	2015-03-10 13:06:54 +00:00
glebius	3a52ecfa66	- In vnode_pager_generic_getpages() use different free counters for synchronous and asynchronous requests. The latter can saturate the I/O and we do not want them to affect regular paging. - Allocate the pbuf at the very beginning of the function, so that if we are low on certain kind of pbufs don't even proceed to BMAP, but sleep. Reviewed by: kib Sponsored by: Nginx, Inc. Sponsored by: Netflix	2015-03-06 14:15:30 +00:00
glebius	5e67a4071d	We already have "int i" in this scope. Submitted by: alc	2014-11-24 07:57:20 +00:00
glebius	b4ef8e602d	Merge from projects/sendfile: o Provide a new VOP_GETPAGES_ASYNC(), which works like VOP_GETPAGES(), but doesn't sleep. It returns immediately, and will execute the I/O done handler function that must be supplied as argument. o Provide VOP_GETPAGES_ASYNC() for the FFS, which uses vnode_pager. o Extend pagertab to support pgo_getpages_async method, and implement this method for vnode_pager. Reviewed by: kib Tested by: pho Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-11-23 12:01:52 +00:00
glebius	20a46e6241	Use __func__ in KASSERTs, since the code is about to be moved to other place. Sponsored by: Nginx, Inc.	2014-11-19 16:29:39 +00:00
glebius	c73b720d6a	In vnode_pager_generic_getpages() vp->v_mount is dereferenced in the beginning, thus can't be NULL. Sponsored by: Nginx, Inc.	2014-11-19 15:17:19 +00:00
glebius	b87d94c5df	Collapse three contiguous comment blocks into one. Remove historical note about wrong assumptions 20 years ago. Use proper casing. Sponsored by: Nginx, Inc.	2014-11-18 13:38:07 +00:00
alc	17021239d9	Three improvements to vnode_pager_generic_getpages(): Eliminate an exclusive object lock acquisition and release on the expected execution path. Do page zeroing before the object lock is acquired rather than during the time that the object lock is held. Use vm_pager_free_nonreq() to eliminate duplicated code. Reviewed by: kib MFC after: 6 weeks Sponsored by: EMC / Isilon Storage Division	2014-09-15 17:14:09 +00:00
kib	ebd8a253bb	Provide the unique implementation for the VOP_GETPAGES() method used by ffs and ext2fs. Remove duplicated call to vm_page_zero_invalid(), done by VOP and by vm_pager_getpages(). Use vm_pager_free_nonreq(). Reviewed by: alc (previous version) Sponsored by: The FreeBSD Foundation MFC after: 6 weeks (after r271596)	2014-09-15 12:28:29 +00:00
alc	810772ca9c	Avoid an exclusive acquisition of the object lock on the expected execution path through the NFS clients' getpages functions. Introduce vm_pager_free_nonreq(). This function can be used to eliminate code that is duplicated in many getpages functions. Also, in contrast to the code that currently appears in those getpages functions, vm_pager_free_nonreq() avoids acquiring an exclusive object lock in one case. Reviewed by: kib MFC after: 6 weeks Sponsored by: EMC / Isilon Storage Division	2014-09-14 18:07:55 +00:00
kib	6044029377	Fix mis-spelling of bits and types names in the vnode_pager_putpages(). The changes should not modify the generated code. The pager->pgo_putpages() method takes int flags as its fourth argument, while vnode_pager_putpages() used boolean_t (which is typedef'ed to int). The flags are from VM_PAGER_* namespace, while vnode_pager_putpages() passed TRUE and OBJPC_SYNC to VOP_PUTPAGES(), which both are numerically equal to VM_PAGER_PUT_SYNC. Noted and reviewed by: alc (previous version) Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-09-14 10:27:36 +00:00
glebius	5939c729a8	Remove unused arguments for VOP_GETPAGES(), VOP_PUTPAGES().	2014-09-10 12:36:41 +00:00
bdrewery	6fcf6199a4	Rename global cnt to vm_cnt to avoid shadowing. To reduce the diff struct pcu.cnt field was not renamed, so PCPU_OP(cnt.field) is still used. pc_cnt and pcpu are also used in kvm(3) and vmstat(8). The goal was to not affect externally used KPI. Bump __FreeBSD_version_ in case some out-of-tree module/code relies on the the global cnt variable. Exp-run revealed no ports using it directly. No objection from: arch@ Sponsored by: EMC / Isilon Storage Division	2014-03-22 10:26:09 +00:00
glebius	681dcc3c57	ANSIfy declarations. Ok'ed by: alc	2014-01-20 18:47:56 +00:00
attilio	16c7563cf4	The soft and hard busy mechanism rely on the vm object lock to work. Unify the 2 concept into a real, minimal, sxlock where the shared acquisition represent the soft busy and the exclusive acquisition represent the hard busy. The old VPO_WANTED mechanism becames the hard-path for this new lock and it becomes per-page rather than per-object. The vm_object lock becames an interlock for this functionality: it can be held in both read or write mode. However, if the vm_object lock is held in read mode while acquiring or releasing the busy state, the thread owner cannot make any assumption on the busy state unless it is also busying it. Also: - Add a new flag to directly shared busy pages while vm_page_alloc and vm_page_grab are being executed. This will be very helpful once these functions happen under a read object lock. - Move the swapping sleep into its own per-object flag The KPI is heavilly changed this is why the version is bumped. It is very likely that some VM ports users will need to change their own code. Sponsored by: EMC / Isilon storage division Discussed with: alc Reviewed by: jeff, kib Tested by: gavin, bapt (older version) Tested by: pho, scottl	2013-08-09 11:11:11 +00:00
jeff	f5ce18bd6e	- Correct a stale comment. We don't have vclean() anymore. The work is done by vgonel() and destroy_vobject() should only be called once from VOP_INACTIVE(). Sponsored by: EMC / Isilon Storage Division	2013-07-23 22:52:38 +00:00
kib	dae3935768	Assert that the object type for the vnode' non-NULL v_object, passed to vnode_pager_setsize(), is either OBJT_VNODE, or, if vnode was already reclaimed, OBJT_DEAD. Note that the later is only possible due to some filesystems, in particular, nfsiods from nfs clients, call vnode_pager_setsize() with unlocked vnode. More, if the object is terminated, do not perform the resizing operation. Reviewed by: alc Tested by: pho, bf MFC after: 1 week	2013-04-28 19:19:26 +00:00
kib	0e1bea778f	Convert panic() into KASSERT(). Reviewed by: alc MFC after: 1 week	2013-04-28 18:40:55 +00:00
kib	fde3650fd8	Fix the logic inversion in the r248512. Noted by: mckay	2013-03-20 09:44:23 +00:00
kib	a43491886a	Pass unmapped buffers for page in requests if the filesystem indicated support for the unmapped i/o. Sponsored by: The FreeBSD Foundation Tested by: pho	2013-03-19 14:36:28 +00:00
kib	7ca94eca24	Some style fixes. Sponsored by: The FreeBSD Foundation	2013-03-14 20:31:39 +00:00
attilio	820ab571ec	MFC	2013-02-26 21:09:35 +00:00
attilio	905e648d42	Hide the details for the assertion for VM_OBJECT_LOCK operations. Rename current VM_OBJECT_LOCK_ASSERT(foo, RA_WLOCKED) into VM_OBJECT_ASSERT_WLOCKED(foo) Sponsored by: EMC / Isilon storage division Requested by: alc	2013-02-21 21:54:53 +00:00
attilio	15bf891afe	Rename VM_OBJECT_LOCK(), VM_OBJECT_UNLOCK() and VM_OBJECT_TRYLOCK() to their "write" versions. Sponsored by: EMC / Isilon storage division	2013-02-20 12:03:20 +00:00
attilio	658534ed5a	Switch vm_object lock to be a rwlock. * VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations * VM_OBJECT_SLEEP() is introduced as a general purpose primitve to get a sleep operation using a VM_OBJECT_LOCK() as protection * The approach must bear with vm_pager.h namespace pollution so many files require including directly rwlock.h	2013-02-20 10:38:34 +00:00
kib	f16ea99007	The r241025 fixed the case when a binary, executed from nullfs mount, was still possible to open for write from the lower filesystem. There is a symmetric situation where the binary could already has file descriptors opened for write, but it can be executed from the nullfs overlay. Handle the issue by passing one v_writecount reference to the lower vnode if nullfs vnode has non-zero v_writecount. Note that only one write reference can be donated, since nullfs only keeps one use reference on the lower vnode. Always use the lower vnode v_writecount for the checks. Introduce the VOP_GET_WRITECOUNT to read v_writecount, which is currently always bypassed to the lower vnode, and VOP_ADD_WRITECOUNT to manipulate the v_writecount value, which manages a single bypass reference to the lower vnode. Caling the VOPs instead of directly accessing v_writecount provide the fix described in the previous paragraph. Tested by: pho MFC after: 3 weeks	2012-11-02 13:56:36 +00:00
kib	560aa751e0	Remove the support for using non-mpsafe filesystem modules. In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems. The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes. Conducted and reviewed by: attilio Tested by: pho	2012-10-22 17:50:54 +00:00
kib	8f845e475e	Fix the mis-handling of the VV_TEXT on the nullfs vnodes. If you have a binary on a filesystem which is also mounted over by nullfs, you could execute the binary from the lower filesystem, or from the nullfs mount. When executed from lower filesystem, the lower vnode gets VV_TEXT flag set, and the file cannot be modified while the binary is active. But, if executed as the nullfs alias, only the nullfs vnode gets VV_TEXT set, and you still can open the lower vnode for write. Add a set of VOPs for the VV_TEXT query, set and clear operations, which are correctly bypassed to lower vnode. Tested by: pho (previous version) MFC after: 2 weeks	2012-09-28 11:25:02 +00:00
kib	a3d0fb0175	Do not leave invalid pages in the object after the short read for a network file systems (not only NFS proper). Short reads cause pages other then the requested one, which were not filled by read response, to stay invalid. Change the vm_page_readahead_finish() interface to not take the error code, but instead to make a decision to free or to (de)activate the page only by its validity. As result, not requested invalid pages are freed even if the read RPC indicated success. Noted and reviewed by: alc MFC after: 1 week	2012-08-14 11:45:47 +00:00
kib	cac2fe116f	After the PHYS_TO_VM_PAGE() function was de-inlined, the main reason to pull vm_param.h was removed. Other big dependency of vm_page.h on vm_param.h are PA_LOCK* definitions, which are only needed for in-kernel code, because modules use KBI-safe functions to lock the pages. Stop including vm_param.h into vm_page.h. Include vm_param.h explicitely for the kernel code which needs it. Suggested and reviewed by: alc MFC after: 2 weeks	2012-08-05 14:11:42 +00:00
kib	4259905d31	Reduce code duplication and exposure of direct access to struct vm_page oflags by providing helper function vm_page_readahead_finish(), which handles completed reads for pages with indexes other then the requested one, for VOP_GETPAGES(). Reviewed by: alc MFC after: 1 week	2012-08-04 18:16:43 +00:00
attilio	5d0dc848b7	Do a more targeted check on the page cache and avoid to check the cache pointer directly in vnode_pager_setsize() by using newly introduced vm_page_is_cached() function. Reviewed by: alc MFC after: 2 weeks X-MFC: r234039,234064	2012-06-16 21:39:00 +00:00

1 2 3 4 5 ...

325 Commits