Implement a new algorithm for managing the radix trie which also
includes path compression. This greatly helps with sparsely populated
tries, where an uncompressed trie can end up with a lot of
intermediate nodes for very few leaves.
The new algorithm introduces two main concepts: the node level and the
node owner. Every node represents a branch point whose leaves share
the key up to the level stored in the node (the current level excluded,
of course). That partly shared key is the one contained in the owner.
The root branch node is exempt from keeping a valid owner, because in
principle all keys are contained in the space covered by the root
branch node. The search algorithm is quite intuitive and is the best
place to start reading in order to understand the full approach.
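To make the two concepts concrete, here is a minimal, self-contained
sketch of a path-compressed node and of the prefix check the search
relies on. It is illustrative only: the real layout and names live in
sys/vm/_vm_radix.h and vm_radix.c, and everything prefixed ex_/EX_
below is an assumption made for the example.

#include <stdbool.h>
#include <stdint.h>

typedef uint64_t ex_pindex_t;		/* stand-in for vm_pindex_t */

#define	EX_WIDTH	4		/* index bits consumed per level */
#define	EX_COUNT	(1 << EX_WIDTH)	/* children per branch node */
#define	EX_MASK		(EX_COUNT - 1)

struct ex_node {
	ex_pindex_t	 rn_owner;	/* a key carrying the shared prefix */
	uint16_t	 rn_count;	/* populated children */
	uint16_t	 rn_clev;	/* level at which the children diverge */
	void		*rn_child[EX_COUNT];	/* branch nodes or page leaves */
};

/* Child slot selected by "index" at level "clev". */
#define	EX_SLOT(index, clev)	(((index) >> ((clev) * EX_WIDTH)) & EX_MASK)

/*
 * True when "index" disagrees with the owner on the levels above rn_clev,
 * i.e. the key cannot possibly live below this branch.
 */
static bool
ex_keybarr(const struct ex_node *rnode, ex_pindex_t index)
{
	ex_pindex_t mask;

	/* The root level is exempt: in principle every key fits below it. */
	if ((rnode->rn_clev + 1) * EX_WIDTH >= sizeof(ex_pindex_t) * 8)
		return (false);
	mask = ~(ex_pindex_t)0 << ((rnode->rn_clev + 1) * EX_WIDTH);
	return (((index ^ rnode->rn_owner) & mask) != 0);
}

A lookup then walks down from the root: if ex_keybarr() fires the
search fails at once, otherwise it follows
rn_child[EX_SLOT(index, rn_clev)] until a leaf is reached and the
leaf's index is compared with the one being searched for.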
In the end, the algorithm demands at most one node per insert, and not
even that in every case. To stay safe, we basically preallocate as
many nodes as there are physical pages in the system, using
uma_prealloc(). However, this raises two concerns (a sketch of the
preallocation follows the list below):
* As pmap_init() needs to kmem_alloc(), the nodes must be preallocated
by the time vm_radix_init() is currently called, which is well before
UMA is fully initialized. This means that uma_prealloc() will dig into
the UMA_BOOT_PAGES pool of pages, which is often not enough to cover
such large allocations.
In order to fix this, change the concept of UMA_BOOT_PAGES and
vm.boot_pages a bit. More specifically, make UMA_BOOT_PAGES only the
initial value of vm.boot_pages, and extend the boot_pages physical area
by as many bytes as needed, using the size returned by
vm_radix_allocphys_size().
* A small number of pages will be held in per-CPU buckets and won't be
accessible from curcpu, so vm_radix_node_get() could really panic when
the preallocation pool is close to exhaustion.
In theory we could preallocate more pages than the number of physical
frames to satisfy such requests, but since many inserts happen without
allocating a node anyway, I think it is safe to assume that the
over-allocation already compensates for this problem.
Field testing may prove me wrong, of course. This could be further
helped by allowing a single-page insert to not require a complete root
node.
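As a rough sketch of what the preallocation mentioned above amounts to
(illustrative only: the ex_ names continue the sketch above,
uma_zcreate()/uma_prealloc() are the ordinary UMA interfaces, and the
page-count argument is a stand-in for however the number of physical
pages is obtained):

#include <sys/param.h>
#include <vm/uma.h>

static uma_zone_t ex_node_zone;

/*
 * Create the node zone and preallocate one node per physical page, so
 * that inserts never need to allocate at run time.  One node per page
 * is an upper bound: every insert needs at most one new branch node,
 * and many need none at all.
 */
static void
ex_radix_init(int ex_phys_page_count)
{

	ex_node_zone = uma_zcreate("EX RADIX NODE", sizeof(struct ex_node),
	    NULL, NULL, NULL, NULL, UMA_ALIGN_PTR,
	    UMA_ZONE_VM | UMA_ZONE_NOFREE);
	uma_prealloc(ex_node_zone, ex_phys_page_count);
}

The boot_pages change is what lets this succeed so early: vm.boot_pages
starts from UMA_BOOT_PAGES and is then grown by the number of bytes
reported by vm_radix_allocphys_size(), so uma_prealloc() has enough
boot memory to draw from.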
The use of preallocation gets rid of all the non-direct-mapping
trickery and of the lock recursion allowance introduced for
vm_page_free_queue.
The number of children per node is reduced from 32 to 16 and from 16
to 8 (for 64-bit and 32-bit architectures, respectively). This makes a
node's children span fewer cache lines in the amd64 case, for example,
and in general touch fewer cache lines, which may be helpful in the
lookup_ge() case. Also, path compression helps in cases where there
are many levels, making the fallout of this change less painful.
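For reference, this is roughly what the narrower fan-out amounts to
(again only a sketch reusing the ex_ names, not the actual vm_radix.c
macros):

#ifdef __LP64__
#define	EX_WIDTH	4	/* 16 children per node (down from 32) */
#else
#define	EX_WIDTH	3	/*  8 children per node (down from 16) */
#endif
#define	EX_COUNT	(1 << EX_WIDTH)

/*
 * On amd64 the child pointer array shrinks from 32 * 8 = 256 bytes to
 * 16 * 8 = 128 bytes, so scanning the children, as lookup_ge() does,
 * touches half as many cache lines.
 */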
Sponsored by: EMC / Isilon storage division
Reviewed by: jeff (partially)
Tested by: flo

/*
* Copyright (c) 2013 EMC Corp.
* Copyright (c) 2011 Jeffrey Roberson <jeff@freebsd.org>
* Copyright (c) 2008 Mayur Shardul <mayur.shardul@gmail.com>
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*
* $FreeBSD$
*/
#ifndef _VM_RADIX_H_
#define _VM_RADIX_H_
#include <vm/_vm_radix.h>
#ifdef _KERNEL
void vm_radix_init(void);
int vm_radix_insert(struct vm_radix *rtree, vm_page_t page);
boolean_t vm_radix_is_singleton(struct vm_radix *rtree);
vm_page_t vm_radix_lookup(struct vm_radix *rtree, vm_pindex_t index);
vm_page_t vm_radix_lookup_ge(struct vm_radix *rtree, vm_pindex_t index);
vm_page_t vm_radix_lookup_le(struct vm_radix *rtree, vm_pindex_t index);
void vm_radix_reclaim_allnodes(struct vm_radix *rtree);
void vm_radix_remove(struct vm_radix *rtree, vm_pindex_t index);
vm_page_t vm_radix_replace(struct vm_radix *rtree, vm_page_t newpage);
#endif /* _KERNEL */
#endif /* !_VM_RADIX_H_ */