freebsd-dev

Author	SHA1	Message	Date
Jeff Roberson	d6e13f3b4d	Don't hold the object lock while calling getpages. The vnode pager does not want the object lock held. Moving this out allows further object lock scope reduction in callers. While here add some missing paging in progress calls and an assert. The object handle is now protected explicitly with pip. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D23033	2020-01-19 23:47:32 +00:00
Kyle Evans	39eae263cd	shmfd: posix_fallocate(2): only take rangelock for section we need Other mechanisms that resize the shmfd grab a write lock from 0 to OFF_MAX for safety, so we still get proper synchronization of shmfd->shm_size in effect. There's no need to block readers/writers of earlier segments when we're just reserving more space, so narrow the scope -- it would likely be safe to narrow it completely to just the section of the range that extends beyond our current size, but this likely isn't worth it since the size isn't stable until the writelock is granted the first time. Suggested by: cem (passing comment)	2020-01-09 04:03:17 +00:00
Kyle Evans	f10405323a	posixshm: implement posix_fallocate(2) Linux expects to be able to use posix_fallocate(2) on a memfd. Other places would use this with shm_open(2) to act as a smarter ftruncate(2). Test has been added to go along with this. Reviewed by: kib (earlier version) Differential Revision: https://reviews.freebsd.org/D23042	2020-01-08 19:08:44 +00:00
Kyle Evans	535b1df993	shm: correct KPI mistake introduced around memfd_create When file sealing and shm_open2 were introduced, we should have grown a new kern_shm_open2 helper that did the brunt of the work with the new interface while kern_shm_open remains the same. Instead, more complexity was introduced to kern_shm_open to handle the additional features and consumers had to keep changing in somewhat awkward ways, and a kern_shm_open2 was added to wrap kern_shm_open. Backpedal on this and correct the situation- kern_shm_open returns to the interface it had prior to file sealing being introduced, and neither function needs an initial_seals argument anymore as it's handled in kern_shm_open2 based on the shmflags.	2020-01-05 04:06:40 +00:00
Kyle Evans	58366f05c0	shmfd/mmap: restrict maxprot with MAP_SHARED + F_SEAL_WRITE If a write seal is set on a shared mapping, we must exclude VM_PROT_WRITE as the fd is effectively read-only. This was discovered by running devel/linux-ltp, which mmap's with acceptable protections specified then attempts to raise to PROT_READ\|PROT_WRITE with mprotect(2), which we allowed. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D22978	2020-01-05 03:15:16 +00:00
Mark Johnston	9f5632e6c8	Remove page locking for queue operations. With the previous reviews, the page lock is no longer required in order to perform queue operations on a page. It is also no longer needed in the page queue scans. This change effectively eliminates remaining uses of the page lock and also the false sharing caused by multiple pages sharing a page lock. Reviewed by: jeff Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D22885	2019-12-28 19:04:00 +00:00
Jeff Roberson	d29f674f2e	Fix a mistake in r355765. We need to activate the page if it is not yet on a pagequeue. Reported by: pho	2019-12-15 06:26:47 +00:00
Jeff Roberson	a808177864	Add a deferred free mechanism for freeing swap space that does not require an exclusive object lock. Previously swap space was freed on a best effort basis when a page that had valid swap was dirtied, thus invalidating the swap copy. This may be done inconsistently and requires the object lock which is not always convenient. Instead, track when swap space is present. The first dirty is responsible for deleting space or setting PGA_SWAP_FREE which will trigger background scans to free the swap space. Simplify the locking in vm_fault_dirty() now that we can reliably identify the first dirty. Discussed with: alc, kib, markj Differential Revision: https://reviews.freebsd.org/D22654	2019-12-15 03:15:06 +00:00
Jeff Roberson	639676877b	Simplify anonymous memory handling with an OBJ_ANON flag. This eliminates reudundant complicated checks and additional locking required only for anonymous memory. Introduce vm_object_allocate_anon() to create these objects. DEFAULT and SWAP objects now have the correct settings for non-anonymous consumers and so individual consumers need not modify the default flags to create super-pages and avoid ONEMAPPING/NOSPLIT. Reviewed by: alc, dougm, kib, markj Tested by: pho Differential Revision: https://reviews.freebsd.org/D22119	2019-11-19 23:19:43 +00:00
David Bright	2d5603fe65	Jail and capability mode for shm_rename; add audit support for shm_rename Co-mingling two things here: * Addressing some feedback from Konstantin and Kyle re: jail, capability mode, and a few other things * Adding audit support as promised. The audit support change includes a partial refresh of OpenBSM from upstream, where the change to add shm_rename has already been accepted. Matthew doesn't plan to work on refreshing anything else to support audit for those new event types. Submitted by: Matthew Bryan <matthew.bryan@isilon.com> Reviewed by: kib Relnotes: Yes Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D22083	2019-11-18 13:31:16 +00:00
Jeff Roberson	0012f373e4	(4/6) Protect page valid with the busy lock. Atomics are used for page busy and valid state when the shared busy is held. The details of the locking protocol and valid and dirty synchronization are in the updated vm_page.h comments. Reviewed by: kib, markj Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21594	2019-10-15 03:45:41 +00:00
Jeff Roberson	63e9755548	(1/6) Replace busy checks with acquires where it is trival to do so. This is the first in a series of patches that promotes the page busy field to a first class lock that no longer requires the object lock for consistency. Reviewed by: kib, markj Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21548	2019-10-15 03:35:11 +00:00
Kyle Evans	5a391b572b	shm_open2(2): completely unbreak kern_shm_open2(), since conception, completely fails to pass the mode along to kern_shm_open(). This breaks most uses of it. Add tests alongside this that actually check the mode of the returned files. PR: 240934 [pulseaudio breakage] Reported by: ler, Andrew Gierth [postgres breakage] Diagnosed by: Andrew Gierth (great catch) Tested by: ler, tmunro Pointy hat to: kevans	2019-10-02 02:37:34 +00:00
David Bright	9afb12bab4	Add an shm_rename syscall Add an atomic shm rename operation, similar in spirit to a file rename. Atomically unlink an shm from a source path and link it to a destination path. If an existing shm is linked at the destination path, unlink it as part of the same atomic operation. The caller needs the same permissions as shm_unlink to the shm being renamed, and the same permissions for the shm at the destination which is being unlinked, if it exists. If those fail, EACCES is returned, as with the other shm_* syscalls. truss support is included; audit support will come later. This commit includes only the implementation; the sysent-generated bits will come in a follow-on commit. Submitted by: Matthew Bryan <matthew.bryan@isilon.com> Reviewed by: jilles (earlier revision) Reviewed by: brueffer (manpages, earlier revision) Relnotes: yes Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D21423	2019-09-26 15:32:28 +00:00
Kyle Evans	a9ac5e1424	sysent: regenerate after r352705 This also implements it, fixes kdump, and removes no longer needed bits from lib/libc/sys/shm_open.c for the interim.	2019-09-25 18:09:19 +00:00
Kyle Evans	20f7057685	Add a shm_open2 syscall to support upcoming memfd_create shm_open2 allows a little more flexibility than the original shm_open. shm_open2 doesn't enforce CLOEXEC on its callers, and it has a separate shmflag argument that can be expanded later. Currently the only shmflag is to allow file sealing on the returned fd. shm_open and memfd_create will both be implemented in libc to use this new syscall. __FreeBSD_version is bumped to indicate the presence. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D21393	2019-09-25 17:59:15 +00:00
Kyle Evans	0cd95859c8	[2/3] Add an initial seal argument to kern_shm_open() Now that flags may be set on posixshm, add an argument to kern_shm_open() for the initial seals. To maintain past behavior where callers of shm_open(2) are guaranteed to not have any seals applied to the fd they're given, apply F_SEAL_SEAL for existing callers of kern_shm_open. A special flag could be opened later for shm_open(2) to indicate that sealing should be allowed. We currently restrict initial seals to F_SEAL_SEAL. We cannot error out if F_SEAL_SEAL is re-applied, as this would easily break shm_open() twice to a shmfd that already existed. A note's been added about the assumptions we've made here as a hint towards anyone wanting to allow other seals to be applied at creation. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D21392	2019-09-25 17:35:03 +00:00
Kyle Evans	af755d3e48	[1/3] Add mostly Linux-compatible file sealing support File sealing applies protections against certain actions (currently: write, growth, shrink) at the inode level. New fileops are added to accommodate seals - EINVAL is returned by fcntl(2) if they are not implemented. Reviewed by: markj, kib Differential Revision: https://reviews.freebsd.org/D21391	2019-09-25 17:32:43 +00:00
Jeff Roberson	c75757481f	Replace redundant code with a few new vm_page_grab facilities: - VM_ALLOC_NOCREAT will grab without creating a page. - vm_page_grab_valid() will grab and page in if necessary. - vm_page_busy_acquire() automates some busy acquire loops. Discussed with: alc, kib, markj Tested by: pho (part of larger branch) Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21546	2019-09-10 19:08:01 +00:00
Mark Johnston	fee2a2fa39	Change synchonization rules for vm_page reference counting. There are several mechanisms by which a vm_page reference is held, preventing the page from being freed back to the page allocator. In particular, holding the page's object lock is sufficient to prevent the page from being freed; holding the busy lock or a wiring is sufficent as well. These references are protected by the page lock, which must therefore be acquired for many per-page operations. This results in false sharing since the page locks are external to the vm_page structures themselves and each lock protects multiple structures. Transition to using an atomically updated per-page reference counter. The object's reference is counted using a flag bit in the counter. A second flag bit is used to atomically block new references via pmap_extract_and_hold() while removing managed mappings of a page. Thus, the reference count of a page is guaranteed not to increase if the page is unbusied, unmapped, and the object's write lock is held. As a consequence of this, the page lock no longer protects a page's identity; operations which move pages between objects are now synchronized solely by the objects' locks. The vm_page_wire() and vm_page_unwire() KPIs are changed. The former requires that either the object lock or the busy lock is held. The latter no longer has a return value and may free the page if it releases the last reference to that page. vm_page_unwire_noq() behaves the same as before; the caller is responsible for checking its return value and freeing or enqueuing the page as appropriate. vm_page_wire_mapped() is introduced for use in pmap_extract_and_hold(). It fails if the page is concurrently being unmapped, typically triggering a fallback to the fault handler. vm_page_wire() no longer requires the page lock and vm_page_unwire() now internally acquires the page lock when releasing the last wiring of a page (since the page lock still protects a page's queue state). In particular, synchronization details are no longer leaked into the caller. The change excises the page lock from several frequently executed code paths. In particular, vm_object_terminate() no longer bounces between page locks as it releases an object's pages, and direct I/O and sendfile(SF_NOCACHE) completions no longer require the page lock. In these latter cases we now get linear scalability in the common scenario where different threads are operating on different files. __FreeBSD_version is bumped. The DRM ports have been updated to accomodate the KPI changes. Reviewed by: jeff (earlier version) Tested by: gallatin (earlier version), pho Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D20486	2019-09-09 21:32:42 +00:00
Kyle Evans	dca52ab480	posixshm: start counting writeable mappings r351650 switched posixshm to using OBJT_SWAP for shm_object r351795 added support to the swap_pager for tracking writeable mappings Take advantage of this and start tracking writeable mappings; fd sealing will use this to reject a seal on writing with EBUSY if any such mapping exist. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D21456	2019-09-03 20:33:38 +00:00
Kyle Evans	32287ea72b	posixshm: switch to OBJT_SWAP in advance of other changes Future changes to posixshm will start tracking writeable mappings in order to support file sealing. Tracking writeable mappings for an OBJT_DEFAULT object is complicated as it may be swapped out and converted to an OBJT_SWAP. One may generically add this tracking for vm_object, but this is difficult to do without increasing memory footprint of vm_object and blowing up memory usage by a significant amount. On the other hand, the swap pager can be expanded to track writeable mappings without increasing vm_object size. This change is currently in D21456. Switch over to OBJT_SWAP in advance of the other changes to the swap pager and posixshm.	2019-09-01 00:33:16 +00:00
Mark Johnston	b5d239cb97	Wire pages in vm_page_grab() when appropriate. uiomove_object_page() and exec_map_first_page() would previously wire a page after having grabbed it. Ask vm_page_grab() to perform the wiring instead: this removes some redundant code, and is cheaper in the case where the requested page is not resident since the page allocator can be asked to initialize the page as wired, whereas a separate vm_page_wire() call requires the page lock. In vm_imgact_hold_page(), use vm_page_unwire_noq() instead of vm_page_unwire(PQ_NONE). The latter ensures that the page is dequeued before returning, but this is unnecessary since vm_page_free() will trigger a batched dequeue of the page. Reviewed by: alc, kib Tested by: pho (part of a larger patch) MFC after: 1 week Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21440	2019-08-28 16:08:06 +00:00
Kyle Evans	b5a7ac997f	kern_shm_open: push O_CLOEXEC into caller control The motivation for this change is to allow wrappers around shm to be written that don't set CLOEXEC. kern_shm_open currently accepts O_CLOEXEC but sets it unconditionally. kern_shm_open is used by the shm_open(2) syscall, which is mandated by POSIX to set CLOEXEC, and CloudABI's sys_fd_create1(). Presumably O_CLOEXEC is intended in the latter caller, but it's unclear from the context. sys_shm_open() now unconditionally sets O_CLOEXEC to meet POSIX requirements, and a comment has been dropped in to kern_fd_open() to explain the situation and add a pointer to where O_CLOEXEC setting is maintained for shm_open(2) correctness. CloudABI's sys_fd_create1() also unconditionally sets O_CLOEXEC to match previous behavior. This also has the side-effect of making flags correctly reflect the O_CLOEXEC status on this fd for the rest of kern_shm_open(), but a glance-over leads me to believe that it didn't really matter. Reviewed by: kib, markj MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D21119	2019-07-31 15:16:51 +00:00
Mark Johnston	918988576c	Avoid relying on header pollution from sys/refcount.h. MFC after: 3 days Sponsored by: The FreeBSD Foundation	2019-07-29 20:26:01 +00:00
Mark Johnston	eeacb3b02f	Merge the vm_page hold and wire mechanisms. The hold_count and wire_count fields of struct vm_page are separate reference counters with similar semantics. The remaining essential differences are that holds are not counted as a reference with respect to LRU, and holds have an implicit free-on-last unhold semantic whereas vm_page_unwire() callers must explicitly determine whether to free the page once the last reference to the page is released. This change removes the KPIs which directly manipulate hold_count. Functions such as vm_fault_quick_hold_pages() now return wired pages instead. Since r328977 the overhead of maintaining LRU for wired pages is lower, and in many cases vm_fault_quick_hold_pages() callers would swap holds for wirings on the returned pages anyway, so with this change we remove a number of page lock acquisitions. No functional change is intended. __FreeBSD_version is bumped. Reviewed by: alc, kib Discussed with: jeff Discussed with: jhb, np (cxgbe) Tested by: pho (previous version) Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D19247	2019-07-08 19:46:20 +00:00
Konstantin Belousov	5c066cd2e2	Remove TODO comment after posixshmcontrol(1) added. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2019-05-30 16:04:00 +00:00
Konstantin Belousov	56d0e33e7a	Add a kern.ipc.posix_shm_list sysctl. The sysctl provides the listing on named linked posix shared memory segments existing in the system. Reuse shm_fill_kinfo() for filling individual struct kinfo_file. Remove unneeded lock around reading of shmfd->shm_mode. Reviewed by: jilles, tmunro Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D20258	2019-05-23 12:35:40 +00:00
Konstantin Belousov	e4b7754848	Report ref count of the backing object as st_nlink for posix shm fd. Unless there are transient references to the object, the ref count is equal to the number of the shared memory segment mappings plus one. Reviewed by: jilles, tmunro Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D20258	2019-05-23 12:27:45 +00:00
Mark Johnston	2b64ab22e8	Allow FIONBIO and FIOASYNC ioctls on POSIX shm descriptors. They have no effect, as with filesystem file descriptors. This improves compatibility with some existing userspace code. Submitted by: Greg V <greg@unrelenting.technology> Reviewed by: kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D19330	2019-02-28 22:00:36 +00:00
Mateusz Guzik	cc426dd319	Remove unused argument to priv_check_cred. Patch mostly generated with cocinnelle: @@ expression E1,E2; @@ - priv_check_cred(E1,E2,0) + priv_check_cred(E1,E2) Sponsored by: The FreeBSD Foundation	2018-12-11 19:32:16 +00:00
Mateusz Guzik	7883ce1f26	uipc_shm: use unr64 for inode numbers Sponsored by: The FreeBSD Foundation	2018-11-21 22:01:06 +00:00
Pedro F. Giffuni	8a36da99de	sys/kern: adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts.	2017-11-27 15:20:12 +00:00
Jeff Roberson	8d6fbbb867	Replace manyinstances of VM_WAIT with blocking page allocation flags similar to the kernel memory allocator. This simplifies NUMA allocation because the domain will be known at wait time and races between failure and sleeping are eliminated. This also reduces boilerplate code and simplifies callers. A wait primitive is supplied for uma zones for similar reasons. This eliminates some non-specific VM_WAIT calls in favor of more explicit sleeps that may be satisfied without new pages. Reviewed by: alc, kib, markj Tested by: pho Sponsored by: Netflix, Dell/EMC Isilon	2017-11-08 02:39:37 +00:00
Alan Cox	0c0e1e96c6	Use vm_page_active() rather than directly accessing the page's queue field. Reviewed by: kib, markj MFC after: 2 weeks X-MFC with: r324146	2017-10-02 07:30:21 +00:00
Mark Johnston	0ffc7ed7e3	Have uiomove_object_page() keep accessed pages in the active queue. Previously, uiomove_object_page() would maintain LRU by requeuing the accessed page. This involves acquiring one of the heavily contended page queue locks. Moreover, it is unnecessarily expensive for pages in the active queue. As of r254304 the page daemon continually performs a slow scan of the active queue, with the effect that unreferenced pages are gradually moved to the inactive queue, from which they can be reclaimed. Prior to that revision, the active queue was scanned only during shortages of free and inactive pages, meaning that unreferenced pages could get "stuck" in the queue. Thus, tmpfs was required to use the inactive queue and requeue pages in order to maintain LRU. Now that this is no longer the case, tmpfs I/O operations can use the active queue and avoid the page queue locks in most cases, instead setting PGA_REFERENCED on referenced pages to provide pseudo-LRU. Reviewed by: alc (previous version) MFC after: 2 weeks	2017-09-30 23:41:28 +00:00
Konstantin Belousov	34d3e89f33	Do not ignore an error from vm_mmap_object(). Found and reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-06-27 20:12:13 +00:00
Robert Watson	15bcf785ba	Audit arguments to POSIX message queues, semaphores, and shared memory. This requires minor changes to the audit framework to allow capturing paths that are not filesystem paths (i.e., will not be canonicalised relative to the process current working directory and/or filesystem root). Obtained from: TrustedBSD Project MFC after: 3 weeks Sponsored by: DARPA, AFRL	2017-03-31 13:43:00 +00:00
Alan Cox	2a016de1a5	Use IDX_TO_OFF(), not ptoa(), when converting the difference between two vm_pindex_t's into a vm_ooffset_t. The length given to shm_dotruncate() must never be negative. Assert this. Tidy up a comment. Reviewed by: kib MFC after: 1 week	2017-03-20 05:15:55 +00:00
Konstantin Belousov	987ff18184	Consistently handle negative or wrapping offsets in the mmap(2) syscalls. For regular files and posix shared memory, POSIX requires that [offset, offset + size) range is legitimate. At the maping time, check that offset is not negative. Allowing negative offsets might expose the data that filesystem put into vm_object for internal use, esp. due to OFF_TO_IDX() signess treatment. Fault handler verifies that the mapped range is valid, assuming that mmap(2) checked that arithmetic gives no undefined results. For device mappings, leave the semantic of negative offsets to the driver. Correct object page index calculation to not erronously propagate sign. In either case, disallow overflow of offset + size. Update mmap(2) man page to explain the requirement of the range validity, and behaviour when the range becomes invalid after mapping. Reported and tested by: royger (previous version) Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2017-02-12 21:05:44 +00:00
Alan Cox	2d612d2dd2	When tmpfs and POSIX shm pagein a page for the sole purpose of performing truncation, immediately queue the page for asynchronous laundering rather than making the page pass through inactive queue first. Reviewed by: kib, markj	2016-12-11 19:24:41 +00:00
Alan Cox	7667839a7e	Remove most of the code for implementing PG_CACHED pages. (This change does not remove user-space visible fields from vm_cnt or all of the references to cached pages from comments. Those changes will come later.) Reviewed by: kib, markj Tested by: pho Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D8497	2016-11-15 18:22:50 +00:00
Alan Cox	ce3ee09b53	Eliminate unneeded vm_page_xbusy() and vm_page_xunbusy() operations when neither vm_pager_has_page() nor vm_pager_get_pages() is called. Reviewed by: kib, markj MFC after: 3 weeks	2016-08-14 22:00:45 +00:00
Jilles Tjoelker	6ea906eec0	posixshm: Fix lock leak when mac_posixshm_check_read rejects read. While reading the code, I noticed that shm_read() returns without unlocking foffset and rangelock if mac_posixshm_check_read() rejects the read. Reviewed by: kib, jhb, rwatson Approved by: re (gjb) MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D6927	2016-06-23 20:59:13 +00:00
Pedro F. Giffuni	55e0987aea	sys: extend use of the howmany() macro when available. We have a howmany() macro in the <sys/param.h> header that is convenient to re-use as it makes things easier to read.	2016-04-26 15:38:17 +00:00
Jamie Gritton	44c16975a2	Clean up some style(9) violations.	2016-04-14 17:07:26 +00:00
Jamie Gritton	cc7b259a26	Separate POSIX sem/shm objects in jails, by prepending the jail's path name to the object's "path". While the objects don't have real path names, it's a filesystem-like namespace, which allows jails to be kept to their own space, but still allows the system / jail parent to access a jail's IPC. PR: 208082	2016-04-13 20:14:13 +00:00
Konstantin Belousov	1bdbd70599	Implement process-shared locks support for libthr.so.3, without breaking the ABI. Special value is stored in the lock pointer to indicate shared lock, and offline page in the shared memory is allocated to store the actual lock. Reviewed by: vangyzen (previous version) Discussed with: deischen, emaste, jhb, rwatson, Martin Simmons <martin@lispworks.com> Tested by: pho Sponsored by: The FreeBSD Foundation	2016-02-28 17:52:33 +00:00
Gleb Smirnoff	b0cd20172d	A change to KPI of vm_pager_get_pages() and underlying VOP_GETPAGES(). o With new KPI consumers can request contiguous ranges of pages, and unlike before, all pages will be kept busied on return, like it was done before with the 'reqpage' only. Now the reqpage goes away. With new interface it is easier to implement code protected from race conditions. Such arrayed requests for now should be preceeded by a call to vm_pager_haspage() to make sure that request is possible. This could be improved later, making vm_pager_haspage() obsolete. Strenghtening the promises on the business of the array of pages allows us to remove such hacks as swp_pager_free_nrpage() and vm_pager_free_nonreq(). o New KPI accepts two integer pointers that may optionally point at values for read ahead and read behind, that a pager may do, if it can. These pages are completely owned by pager, and not controlled by the caller. This shifts the UFS-specific readahead logic from vm_fault.c, which should be file system agnostic, into vnode_pager.c. It also removes one VOP_BMAP() request per hard fault. Discussed with: kib, alc, jeff, scottl Sponsored by: Nginx, Inc. Sponsored by: Netflix	2015-12-16 21:30:45 +00:00
Ed Schouten	7ee1b208c3	Add kern_shm_open(). This allows you to specify the capabilities that the new file descriptor should have. This allows us to create shared memory objects that only have the rights we're interested in. The idea behind restricting the rights is that it makes it a lot easier for CloudABI to get consistent behaviour across different operating systems. We only need to make sure that a shared memory implementation consistently implements the operations that are whitelisted. Approved by: kib Obtained from: https://github.com/NuxiNL/freebsd	2015-08-01 07:21:14 +00:00

1 2 3

102 Commits