(Saying that the lock on the object that the page belongs to must be held
only represents one aspect of the rules.)
Eliminate the use of the page queues lock for atomically performing read-
modify-write operations on the dirty field when the underlying architecture
supports atomic operations on char and short types.
Document the fact that 32KB pages aren't really supported.
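As an illustration of the lock-free dirty field update described above,
here is a minimal userland sketch using C11 atomics. It is not the kernel's
atomic(9) interface; struct demo_page and the demo_dirty_*() helpers are
hypothetical names, and a one-byte dirty bitmap is assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a page with a one-byte dirty bitmap. */
struct demo_page {
	_Atomic uint8_t dirty;
};

/* Atomically set the given dirty bits; no external lock is taken. */
static void
demo_dirty_set(struct demo_page *m, uint8_t bits)
{
	assert(m != NULL);
	atomic_fetch_or(&m->dirty, bits);
}

/* Atomically clear the given dirty bits. */
static void
demo_dirty_clear(struct demo_page *m, uint8_t bits)
{
	assert(m != NULL);
	atomic_fetch_and(&m->dirty, (uint8_t)~bits);
}

/* Read the current dirty bitmap. */
static uint8_t
demo_dirty_get(struct demo_page *m)
{
	assert(m != NULL);
	return (atomic_load(&m->dirty));
}
```

Each helper is a single atomic read-modify-write, which is why no separate
lock is needed to keep concurrent updates from losing bits.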
Reviewed by: attilio, kib
vm_page_undirty(). The assert is not precise because the VPO_BUSY owner
is not tracked, so the assertion does not catch the case when VPO_BUSY is
owned by another thread.
Reviewed by: alc
VM_PAGER_AGAIN to VM_PAGER_ERROR for the unwritten pages. Return
VM_PAGER_AGAIN for the partially written page. Always advance by at least
one page in the loop of vm_object_page_clean().
VM_PAGER_ERROR causes page reactivation and does not clear the
page dirty state, so the write is not lost.
The change fixes an infinite loop in vm_object_page_clean() when the
filesystem returns permanent errors for some page writes.
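The "always advance by at least one page" rule can be sketched as follows.
This is a simplified model, not the code of vm_object_page_clean(); the
callback type and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical flush callback: returns how many pages starting at idx
 * were written, or 0 when the write failed.
 */
typedef size_t (*demo_flush_fn)(size_t idx);

/* Walk [start, end) and return the number of pages that failed. */
static size_t
demo_clean_range(size_t start, size_t end, demo_flush_fn flush)
{
	size_t failed = 0;
	size_t idx = start;

	assert(flush != NULL);
	while (idx < end) {
		size_t n = flush(idx);

		if (n == 0) {
			/* Permanent error: count it and skip the page. */
			failed++;
			n = 1;
		}
		idx += n;	/* n >= 1, so the loop always terminates */
	}
	return (failed);
}

/* Example callback that fails on every odd page index. */
static size_t
demo_flush_odd_fail(size_t idx)
{
	return ((idx % 2) != 0 ? 0 : 1);
}
```

Without the n = 1 fallback, a callback that keeps returning 0 for the same
index would leave idx unchanged and the loop would never finish.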
Reported and tested by: gavin
Reviewed by: alc, rmacklem
MFC after: 1 week
uma_startup2() was called. Thus, setting the variable "booted" to true in
uma_startup() was ok on machines with UMA_MD_SMALL_ALLOC defined, because
any allocations made after uma_startup() but before uma_startup2() could be
satisfied by uma_small_alloc(). Now, however, some multipage allocations
are necessary before uma_startup2() just to allocate zone structures on
machines with a large number of processors. Thus, a Boolean can no longer
effectively describe the state of the UMA allocator. Instead, make "booted"
have three values to describe how far initialization has progressed. This
allows multipage allocations to continue using startup_alloc() until
uma_startup2(), but single-page allocations may begin using
uma_small_alloc() after uma_startup().
2. With the aforementioned change, only a modest increase in boot pages is
necessary to boot UMA on a large number of processors.
3. Retire UMA_MD_SMALL_ALLOC_NEEDS_VM. It has only been used between
r182028 and r204128.
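A minimal sketch of the three-valued "booted" state described above,
assuming a machine with UMA_MD_SMALL_ALLOC defined. The enum names and
demo_pick_alloc() are hypothetical and only model which allocation path
would be chosen.

```c
#include <assert.h>

/* Hypothetical names for the three initialization stages. */
enum demo_boot_state {
	DEMO_BOOT_COLD,		/* before uma_startup() */
	DEMO_BOOT_STARTUP,	/* after uma_startup(), before uma_startup2() */
	DEMO_BOOT_STARTUP2	/* after uma_startup2() */
};

enum demo_alloc_path {
	DEMO_PATH_STARTUP_ALLOC,	/* boot pages via startup_alloc() */
	DEMO_PATH_SMALL_ALLOC,		/* uma_small_alloc() */
	DEMO_PATH_REGULAR		/* normal multipage allocator */
};

/* Pick an allocation path from the boot stage and the request size. */
static enum demo_alloc_path
demo_pick_alloc(enum demo_boot_state booted, int npages)
{
	assert(npages > 0);
	if (booted == DEMO_BOOT_COLD)
		return (DEMO_PATH_STARTUP_ALLOC);
	if (npages == 1)
		return (DEMO_PATH_SMALL_ALLOC);
	if (booted == DEMO_BOOT_STARTUP)
		return (DEMO_PATH_STARTUP_ALLOC);
	return (DEMO_PATH_REGULAR);
}
```

A Boolean cannot express the middle stage, where single-page and multipage
requests must take different paths.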
Reviewed by: attilio [1], nwhitehorn [3]
Tested by: sbruno
architectures (i386, for example) the virtual memory space may be
constrained enough that 2MB is a large chunk. Use 64K for arches
other than amd64 and ia64, with special handling for sparc64 due to
differing hardware.
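As a rough sketch of the sizing rule above, a per-architecture selection
might look like this. DEMO_ZERO_REGION_SIZE is a hypothetical macro, and
the sparc64 special case is left out because its details are not given
here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro; the real tunable is ZERO_REGION_SIZE. */
#if defined(__amd64__) || defined(__x86_64__) || defined(__ia64__)
#define DEMO_ZERO_REGION_SIZE	((size_t)2 * 1024 * 1024)	/* 2MB */
#else
#define DEMO_ZERO_REGION_SIZE	((size_t)64 * 1024)		/* 64K */
#endif

/* Both choices are multiples of 64K. */
static_assert(DEMO_ZERO_REGION_SIZE % (64 * 1024) == 0,
    "zero region size must be a multiple of 64K");

static size_t
demo_zero_region_size(void)
{
	return (DEMO_ZERO_REGION_SIZE);
}
```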
Also commit the comment changes to kmem_init_zero_region() that I
missed due to not saving the file. (Darn the unfamiliar development
environment).
Arch maintainers, please feel free to adjust ZERO_REGION_SIZE as you
see fit.
Requested by: alc
MFC after: 1 week
MFC with: r221853
Hold the vnode around the region where the object lock is dropped, until
the vnode lock is acquired.
Do not drop the vnode reference in the case when the object was
deallocated during the unlock. Note that in this case, VV_TEXT is cleared
by vnode_pager_dealloc().
Reported and tested by: pho
Reviewed by: alc
MFC after: 3 days
If the supplied length is zero and the user address is invalid, the
function might return -1 due to the truncation and rounding of the address.
The callers interpret this situation as EFAULT. Instead of handling
the zero length in the callers, filter it in vm_fault_quick_hold_pages().
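The zero-length filter can be sketched as below. This is a userland model
of the range check only, not the body of vm_fault_quick_hold_pages(); the
demo_* names, the 4KB page size, and the map bounds passed as arguments are
assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE	((uintptr_t)4096)
#define DEMO_PAGE_MASK	(DEMO_PAGE_SIZE - 1)

/*
 * Return the number of pages covering [addr, addr + len), 0 for an
 * empty request, or -1 when the range falls outside the map bounds.
 */
static int
demo_hold_count(uintptr_t addr, size_t len, uintptr_t map_min,
    uintptr_t map_max)
{
	uintptr_t end;

	assert(map_min <= map_max);
	if (len == 0)
		return (0);	/* nothing to hold, so never an error */
	end = (addr + len + DEMO_PAGE_MASK) & ~DEMO_PAGE_MASK;
	addr &= ~DEMO_PAGE_MASK;
	if (addr < map_min || addr > end || end > map_max)
		return (-1);
	return ((int)((end - addr) / DEMO_PAGE_SIZE));
}
```

With the early return, a zero-length request with a bad address yields 0
instead of -1, so callers no longer report EFAULT for it.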
Sponsored by: The FreeBSD Foundation
Reviewed by: alc
- Hold the proc lock while changing the state from PRS_NEW to PRS_NORMAL
in fork to honor the locking requirements. While here, expand the scope
of the PROC_LOCK() on the new process (p2) to avoid some LORs. Previously
the code was locking the new child process (p2) after it had locked the
parent process (p1). However, when locking two processes, the safe order
is to lock the child first, then the parent.
- Fix various places that were checking p_state against PRS_NEW without
having the process locked so that they take PROC_LOCK() first. Every place
was already locking the process, but only after the PRS_NEW check.
- Remove or reduce the use of PROC_SLOCK() for places that were checking
p_state against PRS_NEW. The PROC_LOCK() alone is sufficient for reading
the current state.
- Reorder fill_kinfo_proc() slightly so it only acquires PROC_SLOCK() once.
MFC after: 1 week
which are not yet fully initialized (i.e. ones with p_state == PRS_NEW).
Without it, we could panic in _thread_lock_flags().
Note that there may be other instances of FOREACH_PROC_IN_SYSTEM() that
require a similar fix.
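The skip described above might be modeled as follows. This is a simplified
userland sketch: the list, the demo_* names, and the per-process thread
count are hypothetical, and the locking that the kernel requires around the
p_state check is omitted.

```c
#include <assert.h>
#include <stddef.h>

enum demo_pstate { DEMO_PRS_NEW, DEMO_PRS_NORMAL, DEMO_PRS_ZOMBIE };

/* Hypothetical, heavily reduced process record. */
struct demo_proc {
	enum demo_pstate state;
	int nthreads;
	struct demo_proc *next;
};

/* Sum thread counts, ignoring processes that are still being set up. */
static int
demo_count_threads(const struct demo_proc *head)
{
	const struct demo_proc *p;
	int total = 0;

	for (p = head; p != NULL; p = p->next) {
		if (p->state == DEMO_PRS_NEW)
			continue;	/* not fully initialized yet */
		assert(p->nthreads >= 0);
		total += p->nthreads;
	}
	return (total);
}
```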
Reported by: pho, keramida
Discussed with: kib
As Alan Cox pointed out, it no longer serves its purpose with
the modern UMA allocator, compared to the old one used in the 4.x days.
The removal of the sysctl eliminates the max_proc_mmap type overflow that
led to broken mmap(2) with large amounts of physical memory on arches
with effectively unbounded KVA space (such as amd64). It was found that
slightly less than 256GB of physmem was enough to trigger the overflow.
Reviewed by: alc, kib
Approved by: avg (mentor)
MFC after: 2 months
vm_map_insert(), the kmem_back() assumption about the newly inserted
entry might be broken by the interaction of two factors. In a low
memory condition, when vm_page_alloc() returns NULL, the supplied map is
unlocked. If another thread performs kmem_malloc() in the meantime, and its
map entry is placed right next to our thread's map entry in the map,
the wire count of both entries is still 0 and the entries are coalesced by
vm_map_simplify_entry().
Mark the new entry with MAP_ENTRY_IN_TRANSITION to prevent coalescing.
Fix some style issues, tighten the assertions to account for
MAP_ENTRY_IN_TRANSITION state.
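The effect of the flag on coalescing can be sketched like this. The
structure and the check are a simplified model, not the logic of
vm_map_simplify_entry(); all demo_* names and the flag value are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_ENTRY_IN_TRANSITION	0x1u	/* hypothetical flag value */

/* Hypothetical, reduced map entry. */
struct demo_entry {
	uintptr_t start;
	uintptr_t end;
	unsigned int flags;
	int wired_count;
};

/* Decide whether two neighboring entries may be merged into one. */
static bool
demo_can_coalesce(const struct demo_entry *prev,
    const struct demo_entry *next)
{
	assert(prev != NULL && next != NULL);
	/* An entry that is in transition must be left alone. */
	if (((prev->flags | next->flags) & DEMO_ENTRY_IN_TRANSITION) != 0)
		return (false);
	return (prev->end == next->start &&
	    prev->flags == next->flags &&
	    prev->wired_count == next->wired_count);
}
```

Two adjacent entries with a wire count of 0 would otherwise qualify for a
merge, which is the case the flag is meant to rule out.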
Reported and tested by: pho
Reviewed by: alc
KASSERT()s and eliminate the rest.
Replace excessive printf()s and a panic() in bufdone_finish() with a
KASSERT() in vm_page_io_finish().
Reviewed by: kib
incorrectly calling vm_object_page_clean(). They are passing the length of
the range rather than the ending offset of the range.
Perform the OFF_TO_IDX() conversion in vm_object_page_clean() rather than the
callers.
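To illustrate the difference between passing a length and passing an
ending offset, here is a small sketch that does the offset-to-index
conversion in the callee. The demo_* names and the 4KB page size are
assumptions, and treating end <= start as an empty range is a choice made
for this sketch only.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT		12
#define DEMO_PAGE_MASK		(((uint64_t)1 << DEMO_PAGE_SHIFT) - 1)
#define DEMO_OFF_TO_IDX(off)	((uint64_t)(off) >> DEMO_PAGE_SHIFT)

/*
 * Take byte offsets [start, end) and return how many pages they span.
 * The caller passes offsets; the conversion to page indices is done here.
 */
static uint64_t
demo_pages_to_clean(uint64_t start, uint64_t end)
{
	uint64_t first, last;

	assert(end <= UINT64_MAX - DEMO_PAGE_MASK);
	if (end <= start)
		return (0);
	first = DEMO_OFF_TO_IDX(start);
	last = DEMO_OFF_TO_IDX(end + DEMO_PAGE_MASK);	/* round up */
	return (last - first);
}
```

A caller that wants one page at offset 8192 passes (8192, 12288). Passing
(8192, 4096), that is, a length in place of the ending offset, describes an
empty range in this model and nothing would be cleaned.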
Reviewed by: kib
MFC after: 3 weeks
MAP_STACK_* entries. (See r71983 and r74235.)
In some cases, performing this call to vm_map_simplify_entry() halves the
number of vm map entries used by the Sun JDK.
sbuf_new_for_sysctl(9). This allows using an sbuf with a SYSCTL_OUT
drain for extremely large amounts of data where the caller knows that
appropriate references are held, and sleeping is not an issue.
Inspired by: rwatson
assertion that is no longer required. Long ago, calls to vm_page_alloc()
from an interrupt handler had to specify VM_ALLOC_INTERRUPT so that
vm_page_alloc() would not attempt to reclaim a PQ_CACHE page from another vm
object. Today, with the synchronization on a vm object's collection of
PQ_CACHE pages, this is no longer an issue. In fact, VM_ALLOC_INTERRUPT now
reclaims PQ_CACHE pages just like VM_ALLOC_{NORMAL,SYSTEM}.
MFC after: 3 weeks
OBJT_PHYS objects. Thus, there is no need for handling them specially
in vm_fault(). In fact, this special case handling would have led to
an assertion failure just before the call to pmap_enter().
Reviewed by: kib@
MFC after: 6 weeks
need it anymore. Moreover, its implementation had a type mismatch: a
long is not necessarily a uint64_t. (This mismatch was hidden by
casting.) Move the remaining two counters up a level in the sysctl
hierarchy. There is no reason for them to be under the vm.pmap node.
Reviewed by: kib
hold this lock until the end of the function.
With the aforementioned change to vm_pageout_clean(), page locks don't need
to support recursive (MTX_RECURSE) or duplicate (MTX_DUPOK) acquisitions.
Reviewed by: kib