a reservation, unless all of the reservation's pages were free, the
reservation was moved to the head of the partially-populated reservations
queue, where it would be the next reservation to be broken in case the
free page queues were emptied. Now, instead, I am moving it to the tail.
Very likely this reservation is in the process of being freed in its
entirety, so placing it at the tail of the queue makes it more likely that
the underlying physical memory will be returned to the free page queues as
one contiguous chunk. If a reservation must be broken, it will, instead,
be the longest unchanged reservation, which is arguably the reservation
that is least likely to ever achieve promotion or be freed in its entirety.
MFC after: 6 weeks
the mappings without any of read and execution rights, in particular,
the PROT_NONE entries. This makes mlockall(2) work for the process
address space that has such mappings.
Since protection mode of the entry may change between setting
MAP_ENTRY_IN_TRANSITION and final pass over the region that records
the wire status of the entries, allocate new map entry flag
MAP_ENTRY_WIRE_SKIPPED to mark the skipped PROT_NONE entries.
Reported and tested by: Hans Ottevanger <fbsdhackers beasties demon nl>
Reviewed by: alc
MFC after: 3 weeks
but I see no benefit from it today.
VM_PROT_READ_IS_EXEC was only intended for use on processors that do not
distinguish between read and execute permission. On an mmap(2) or
mprotect(2), it automatically added execute permission if the caller
specified permissions included read permission. The hope was that this
would reduce the number of vm map entries needed to implement an address
space because there would be fewer neighboring vm map entries that differed
only in the presence or absence of VM_PROT_EXECUTE. (See vm/vm_mmap.c
revision 1.56.)
Today, I don't see any real applications that benefit from
VM_PROT_READ_IS_EXEC. In any case, vm map entries are now organized
as a self-adjusting binary search tree instead of an ordered list. So,
the need for coalescing vm map entries is not as great as it once was.
address space sizes to be longs instead of ints. Specifically, the follow
values are now longs: runningbufspace, bufspace, maxbufspace,
bufmallocspace, maxbufmallocspace, lobufspace, hibufspace, lorunningspace,
hirunningspace, maxswzone, maxbcache, and maxpipekva. Previously, a
relatively small number (~ 44000) of buffers set in kern.nbuf would result
in integer overflows resulting either in hangs or bogus values of
hidirtybuffers and lodirtybuffers. Now one has to overflow a long to see
such problems. There was a check for a nbuf setting that would cause
overflows in the auto-tuning of nbuf. I've changed it to always check and
cap nbuf but warn if a user-supplied tunable would cause overflow.
Note that this changes the ABI of several sysctls that are used by things
like top(1), etc., so any MFC would probably require a some gross shims
to allow for that.
MFC after: 1 month
fault. In r188331 this update was relocated because of synchronization
changes to a place where it would occur on both hard and soft faults. This
change again restricts the update to hard faults.
function, done in r188334. Instead, collect the entries that shall be
freed, in the deferred_freelist member of the map. Automatically purge
the deferred freelist when map is unlocked.
Tested by: pho
Reviewed by: alc
1 will trigger a pass through the VM's low-memory handlers, such as
protocol and UMA drain routines. This makes it easier to exercise
these otherwise rarely-invoked code paths.
MFC after: 3 days
hold the map lock there, and might need the vnode lock for OBJT_VNODE
objects. Postpone object deallocation until caller of vm_map_delete()
drops the map lock. Link the map entries to be freed into the freelist,
that is released by the new helper function vm_map_entry_free_freelist().
Reviewed by: tegge, alc
Tested by: pho
Reference object, drop the map lock, and then call vm_object_sync().
The object sync might require vnode lock for OBJT_VNODE type objects.
Reviewed by: tegge
Tested by: pho
acquire vnode lock for OBJT_VNODE object after map lock is dropped.
Because we have the busy page(s) in the object, sleeping there would
result in deadlock with vnode resize. Try to get lock without sleeping,
and, if the attempt failed, drop the state, lock the vnode, and restart
the fault handler from the start with already locked vnode.
Because the vnode_pager_lock() function is inlined in vm_fault(),
axe it.
Based on suggestion by: alc
Reviewed by: tegge, alc
Tested by: pho
describing why several calls to vm_deallocate_object() with locked map
do not result in the acquisition of the vnode lock after map lock.
Suggested and reviewed by: tegge
be accessible outside vmspace_fork() yet, but locking it would satisfy
the protocol of the vm_map_entry_link() and other functions called
from vmspace_fork().
Use trylock that is supposedly cannot fail, to silence WITNESS warning
of the nested acquisition of the sx lock with the same name.
Suggested and reviewed by: tegge
backend kegs so it may source compatible memory from multiple backends.
This is useful for cases such as NUMA or different layouts for the same
memory type.
- Provide a new api for adding new backend kegs to secondary zones.
- Provide a new flag for adjusting the layout of zones to stagger
allocations better across cache lines.
Sponsored by: Nokia
inside the SYSCTL() macros and thus does not need to be done for
all of the nodes scattered across the source tree.
- Mark the name-cache related sysctl's (including debug.hashstat.*) MPSAFE.
- Mark vm.loadavg MPSAFE.
- Remove GIANT_REQUIRED from vmtotal() (everything in this routine already
has sufficient locking) and mark vm.vmtotal MPSAFE.
- Mark the vm.stats.(sys|vm).* sysctls MPSAFE.
of the counter, that may happen when too many sendfile(2) calls are
being executed with this vnode [1].
To keep the size of the struct vm_page and offsets of the fields
accessed by out-of-tree modules, swap the types and locations
of the wire_count and cow fields. Add safety checks to detect cow
overflow and force fallback to the normal copy code for zero-copy
sockets. [2]
Reported by: Anton Yuzhaninov <citrin citrin ru> [1]
Suggested by: alc [2]
Reviewed by: alc
MFC after: 2 weeks
vm_map_lookup{,_locked}() to vm_map_lookup_entry(). Having the fast path
in vm_map_lookup{,_locked}() limits its benefits to page faults. Moving
it to vm_map_lookup_entry() extends its benefits to other operations on
the vm map.
In particular following functions KPI results modified:
- bufobj_invalbuf()
- bufsync()
and BO_SYNC() "virtual method" of the buffer objects set.
Main consumers of bufobj functions are affected by this change too and,
in particular, functions which changed their KPI are:
- vinvalbuf()
- g_vfs_close()
Due to the KPI breakage, __FreeBSD_version will be bumped in a later
commit.
As a side note, please consider just temporary the 'curthread' argument
passing to VOP_SYNC() (in bufsync()) as it will be axed out ASAP
Reviewed by: kib
Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
into the separate function vm_pageout_oom(). Supply a parameter for
vm_pageout_oom() describing a reason for the call.
Call vm_pageout_oom() from the swp_pager_meta_build() when swap zone
is exhausted.
Reviewed by: alc
Tested by: pho, jhb
MFC after: 2 weeks
filedescriptor into it. Make sure that td_fpop is NULL when calling
d_mmap from dev_pager_getpages().
Change guards against td_fpop field being non-NULL with private state
for another device, and against sudden clearing the td_fpop. This
could occur when either a driver method calls another driver through
the filedescriptor operation, or a page fault happen while driver is
writing to a memory backed by another driver.
Noted by: rwatson
Tested by: rnoland
MFC after: 3 days