sleep, and perform the page allocations with VM_ALLOC_SYSTEM
class. Previously, the allocation was also allowed to completely drain
the reserve of the free pages, being translated to VM_ALLOC_INTERRUPT
request class for vm_page_alloc() and similar functions.
Allow the caller of malloc* to request the 'deep drain' semantic by
providing M_USE_RESERVE flag, now translated to VM_ALLOC_INTERRUPT
class. Previously, it resulted in less aggressive VM_ALLOC_SYSTEM
allocation class.
Centralize the translation of the M_* malloc(9) flags in the single
inline function malloc2vm_flags().
Discussion started by: "Sears, Steven" <Steven.Sears@netapp.com>
Reviewed by: alc, mdf (previous version)
Tested by: pho (previous version)
MFC after: 2 weeks
cache line in order to avoid manual frobbing but using
struct mtx_padalign.
The sole exception being nvme and sxfge drivers, where the author
redefined CACHE_LINE_SIZE manually, so they need to be analyzed and
dealt with separately.
Reviwed by: jimharris, alc
because the queue itself serves no purpose. When a held page is freed,
inserting the page into the hold queue has the side effect of setting the
page's "queue" field to PQ_HOLD. Later, when the page is unheld, it will
be freed because the "queue" field is PQ_HOLD. In other words, PQ_HOLD is
used as a flag, not a queue. So, this change replaces it with a flag.
To accomodate the new page flag, make the page's "flags" field wider and
"oflags" field narrower.
Reviewed by: kib
network file systems (not only NFS proper). Short reads cause pages
other then the requested one, which were not filled by read response,
to stay invalid.
Change the vm_page_readahead_finish() interface to not take the error
code, but instead to make a decision to free or to (de)activate the
page only by its validity. As result, not requested invalid pages are
freed even if the read RPC indicated success.
Noted and reviewed by: alc
MFC after: 1 week
to pull vm_param.h was removed. Other big dependency of vm_page.h on
vm_param.h are PA_LOCK* definitions, which are only needed for
in-kernel code, because modules use KBI-safe functions to lock the
pages.
Stop including vm_param.h into vm_page.h. Include vm_param.h
explicitely for the kernel code which needs it.
Suggested and reviewed by: alc
MFC after: 2 weeks
vm_page oflags by providing helper function
vm_page_readahead_finish(), which handles completed reads for pages
with indexes other then the requested one, for VOP_GETPAGES().
Reviewed by: alc
MFC after: 1 week
This makes the RED/BLACK support go away and simplifies a lot vmradix
functions used here. This happens because with patricia trie support
the trie will be little enough that keeping 2 diffetnt will be
efficient too.
- Reduce differences with head, in places like backing scan where the
optimizazions used shuffled the code a little bit around.
Tested by: flo, Andrea Barberio
layer, but it is read directly by the MI VM layer. This change introduces
pmap_page_is_write_mapped() in order to completely encapsulate all direct
access to PGA_WRITEABLE in the pmap layer.
Aesthetics aside, I am making this change because amd64 will likely begin
using an alternative method to track write mappings, and having
pmap_page_is_write_mapped() in place allows me to make such a change
without further modification to the MI VM layer.
As an added bonus, tidy up some nearby comments concerning page flags.
Reviewed by: kib
MFC after: 6 weeks
The target of this is getting at the point where the recovery path is
completely removed as we could count on pre-allocation once the
path compressed trie is implemented.
for allocation of fictitious pages, for which PHYS_TO_VM_PAGE()
returns proper fictitious vm_page_t. The range should be de-registered
after consumer stopped using it.
De-inline the PHYS_TO_VM_PAGE() since it now carries code to iterate
over registered ranges.
A hash container might be developed instead of range registration
interface, and fake pages could be put automatically into the hash,
were PHYS_TO_VM_PAGE() could look them up later. This should be
considered before the MFC of the commit is done.
Sponsored by: The FreeBSD Foundation
Reviewed by: alc
MFC after: 1 month
vm_page into new interface vm_page_initfake(). Handle the case of fake
page re-initialization with changed memattr.
Sponsored by: The FreeBSD Foundation
Reviewed by: alc
MFC after: 1 month
callers of vm_page_insert().
The default action for every caller is to unwind-back the operation
besides vm_page_rename() where this has proven to be impossible to do.
For that case, it just spins until the page is not available to be
allocated. However, due to vm_page_rename() to be mostly rare (and
having never hit this panic in the past) it is tought to be a very
seldom thing and not a possible performance factor.
The patch has been tested with an atomic counter returning NULL from
the zone allocator every 1/100000 allocations. Per-printf, I've verified
that a typical buildkernel could trigger this 30 times. The patch
survived to 2 hours of repeated buildkernel/world.
Several technical notes:
- The vm_page_insert() is moved, in several callers, closer to failure
points. This could be committed separately before vmcontention hits
the tree just to verify -CURRENT is happy with it.
- vm_page_rename() does not need to have the page lock in the callers
as it hide that as an implementation detail. Do the locking internally.
- now vm_page_insert() returns an int, with 0 meaning everything was ok,
thus KPI is broken by this patch.
excluding other allocations including UMA now entails the addition of
a single flag to kmem_alloc or uma zone create
Reviewed by: alc, avg
MFC after: 2 weeks
Since the address of vm_page lock mutex depends on the kernel options,
it is easy for module to get out of sync with the kernel.
No vm_page_lockptr() accessor is provided for modules. It can be added
later if needed, unless proper KPI is developed to serve the needs.
Reviewed by: attilio, alc
MFC after: 3 weeks
yielding a new public interface, vm_page_alloc_contig(). This new function
addresses some of the limitations of the current interfaces, contigmalloc()
and kmem_alloc_contig(). For example, the physically contiguous memory that
is allocated with those interfaces can only be allocated to the kernel vm
object and must be mapped into the kernel virtual address space. It also
provides functionality that vm_phys_alloc_contig() doesn't, such as wiring
the returned pages. Moreover, unlike that function, it respects the low
water marks on the paging queues and wakes up the page daemon when
necessary. That said, at present, this new function can't be applied to all
types of vm objects. However, that restriction will be eliminated in the
coming weeks.
From a design standpoint, this change also addresses an inconsistency
between vm_phys_alloc_contig() and the other vm_phys_alloc*() functions.
Specifically, vm_phys_alloc_contig() manipulated vm_page fields that other
functions in vm/vm_phys.c didn't. Moreover, vm_phys_alloc_contig() knew
about vnodes and reservations. Now, vm_page_alloc_contig() is responsible
for these things.
Reviewed by: kib
Discussed with: jhb
Use the defined types instead of int when manipulating masks.
Supposedly, it could fix support for 32KB page size in the
machine-independend VM layer.
Reviewed by: alc
MFC after: 2 weeks
Black nodes support standard active pages and red nodes support cached
pages. Red nodes may be removed without the object lock but will not
collapse unused tree nodes. Red nodes may not be directly inserted,
instead a new function is supplied to convert between black and red.
- Handle cached pages and active pages in the same loop in vm_object_split,
vm_object_backing_scan, and vm_object_terminate.
- Retire the splay page handling as the ifdefs are too difficult to
maintain.
- Slightly optimize the vm_radix_lookupn() function.