for allocating the nodes before it is possible to carve them directly
from the UMA subsystem.
Sponsored by: EMC / Isilon storage division
Reviewed by: alc
more modern uma_zone_reserve_kva(). The difference is that it no longer
relies on an obj to allocate pages, and the slab allocator no longer
uses any specific locking, relying instead on atomic operations to
complete the operation.
Where possible, uma_small_alloc() is used instead and the uk_kva
member becomes unused.
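A minimal sketch of how a zone consumer ends up using the new interface
(the zone name, item size and count below are illustrative, not a
specific committed caller):

    zone = uma_zcreate("example zone", sizeof(struct example), NULL,
        NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_VM);
    /* Reserve the KVA up front; no VM object backs the zone anymore. */
    if (uma_zone_reserve_kva(zone, count) == 0)
        panic("example zone: cannot reserve KVA");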
The subsequent cleanups also bring along the removal of the
VM_OBJECT_LOCK_INIT() macro, which is no longer used as the code
can be easily cleaned up to perform a single mtx_init(), private
to vm_object.c.
For the same reason, _vm_object_allocate() becomes private as well.
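For reference, the single initialization this boils down to is roughly
the following (the exact lock flags are illustrative of the era's
object lock setup):

    /* In vm_object.c, once per object at creation time. */
    mtx_init(&object->mtx, "vm object", NULL, MTX_DEF | MTX_DUPOK);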
Sponsored by: EMC / Isilon storage division
Reviewed by: alc
the actual number of vm_page_t structures that will be derived, so
v_page_count should be used appropriately.
Besides that, add a panic condition in case UMA fails to restrict the
area enough to hold all the desired objects.
Sponsored by: EMC / Isilon storage division
Reported by: alc
- Use __predict_false() to tag boot-time cache decisions
- Compact boot-time cache allocation into a separate, non-inline
function that won't be called most of the time (see the sketch below).
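A minimal sketch of the pattern (the boot cache itself and all of the
names below are illustrative, not the committed code):

    static struct vm_radix_node *boot_cache[64];    /* illustrative size */
    static int boot_cache_cnt;
    static int vm_radix_uma_ready;

    /* Cold path, deliberately kept out of line: only used early at boot. */
    static __noinline struct vm_radix_node *
    vm_radix_carve_bootcache(void)
    {

        if (boot_cache_cnt == 0)
            panic("%s: boot cache exhausted", __func__);
        return (boot_cache[--boot_cache_cnt]);
    }

    static __inline struct vm_radix_node *
    vm_radix_node_get(void)
    {

        if (__predict_false(!vm_radix_uma_ready))
            return (vm_radix_carve_bootcache());
        return (uma_zalloc(vm_radix_node_zone, M_NOWAIT | M_ZERO));
    }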
Sponsored by: EMC / Isilon storage division
use a different scheme for preallocation: reserve a few KB of nodes to
be used to cater for page allocations before the memory can be
efficiently pre-allocated by UMA.
This in effect removes the further carving of boot_pages and, along
with it, the modifications to the boot_pages allocation system and the
need to initialize the UMA zone before pmap_init().
Reported by: pho, jhb
includes path compression. This greatly helps with sparsely populated
tries, where an uncompressed trie may end up with a lot of intermediate
nodes for very few leaves.
The new algorithm introduces 2 main concepts: the node level and the
node owner. Every node represents a branch point where the leaves share
the key up to the level specified in the node level (current level
excluded, of course). The partially shared key is the one contained in
the owner. Of course, the root branch is exempt from keeping a valid
owner, because theoretically all the keys are contained in the space
covered by the root branch node. The search algorithm is very
intuitive and that is where one should start reading to understand the
full approach.
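A simplified sketch of the two concepts (the structure layout, the
helpers prefix_above(), slot_at() and is_leaf(), and the child count
are all illustrative and do not match the actual vm_radix code):

    struct rnode {
        uint64_t     rn_owner;      /* Key prefix shared by the leaves below. */
        uint16_t     rn_clev;       /* Level this node branches at. */
        uint16_t     rn_count;      /* Number of populated children. */
        void        *rn_child[16];  /* Child nodes or leaves. */
    };

    /*
     * Search: at every branch node, verify that the key shares the
     * owner's prefix above rn_clev; a mismatch means the key cannot be
     * in the trie, thanks to path compression.  Otherwise descend into
     * the slot selected by the key bits at rn_clev.
     */
    static void *
    rnode_lookup(struct rnode *rn, uint64_t key)
    {
        void *child;

        while (rn != NULL) {
            if (prefix_above(key, rn->rn_clev) !=
                prefix_above(rn->rn_owner, rn->rn_clev))
                return (NULL);
            child = rn->rn_child[slot_at(key, rn->rn_clev)];
            if (is_leaf(child))
                return (child); /* Caller re-checks the full key. */
            rn = child;
        }
        return (NULL);
    }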
In the end, the algorithm ends up demanding at most one node per insert,
and even that is not necessary in all cases. To stay safe, we basically
preallocate as many nodes as the number of physical pages in the
system, using uma_prealloc(). However, this raises 2 concerns:
* As pmap_init() needs to kmem_alloc(), the nodes must be pre-allocated
when vm_radix_init() is currently called, which is much before UMA
is fully initialized. This means that uma_prealloc() will dig into the
UMA_BOOT_PAGES pool of pages, which is often not enough to keep track
of such large allocations.
In order to fix this, change the concept of UMA_BOOT_PAGES and
vm.boot_pages a bit. More specifically, make UMA_BOOT_PAGES an initial
"value", just as vm.boot_pages is, and extend the boot_pages physical
area by as many bytes as needed according to the information returned
by vm_radix_allocphys_size().
* A small number of pages will be held in per-CPU buckets and won't be
accessible from curcpu, so vm_radix_node_get() could really panic
when the pre-allocation pool is close to being exhausted.
In theory we could pre-allocate more pages than the number of physical
frames to satisfy such a request, but as many inserts would happen
without a node allocation anyway, I think it is safe to assume that the
over-allocation already compensates for this problem.
Field testing can prove me wrong, of course. This could be further
helped by allowing a single-page insert to not require a complete root
node.
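A hedged sketch of what the pre-allocation amounts to;
vm_radix_allocphys_size() is the helper named above, while the
placement of the calls and the use of cnt.v_page_count are illustrative:

    /* While sizing boot_pages: grow the pool by the space the radix
     * nodes will need. */
    boot_pages += howmany(vm_radix_allocphys_size(cnt.v_page_count),
        PAGE_SIZE);

    /* In vm_radix_init(): one node per physical page covers the worst
     * case. */
    uma_prealloc(vm_radix_node_zone, cnt.v_page_count);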
The use of pre-allocation gets rid of all the non-direct-mapping
trickery and of the previously introduced lock recursion allowance for
vm_page_free_queue.
The node children are reduced in number from 32 to 16 and from 16 to 8
(for 64-bit and 32-bit architectures, respectively).
This would make the children fit into a cacheline in the amd64 case,
for example, and in general span fewer cachelines, which may be
helpful in the lookup_ge() case.
Also, path compression comes to help in cases where there are many
levels, making the fallout of such a change less hurtful.
Sponsored by: EMC / Isilon storage division
Reviewed by: jeff (partially)
Tested by: flo
- Avoid the return value for vm_radix_insert()
- Name the functions' arguments per style(9)
- Avoid getting and returning opaque objects; use vm_page_t instead, as
vm_radix is not meant to be really general code but to specifically
cater to the page cache and resident cache (see the sketch after this
list).
This makes the RED/BLACK support go away and simplifies a lot of the
vm_radix functions used here. This happens because, with PATRICIA trie
support, the trie will be small enough that keeping 2 different tries
will be efficient too.
- Reduce differences with head, in places like the backing scan where
the optimizations used shuffled the code around a little bit.
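A hedged sketch of the direction of the narrowed interface (prototypes
are illustrative, not the committed header):

    void      vm_radix_insert(struct vm_radix *rtree, vm_pindex_t index,
                  vm_page_t page);
    vm_page_t vm_radix_lookup(struct vm_radix *rtree, vm_pindex_t index);
    vm_page_t vm_radix_remove(struct vm_radix *rtree, vm_pindex_t index);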
Tested by: flo, Andrea Barberio
The target of this is getting to the point where the recovery path is
completely removed, as we can count on pre-allocation once the
path-compressed trie is implemented.
the recovery path. The bulk of vm_radix_remove() is put into a generic
function, vm_radix_sweep(), which allows 2 different modes (hard and
soft): the soft one will deal with half-constructed paths by cleaning
them up.
Ideally all these complications should go away once a way to
pre-allocate is implemented, possibly by implementing path compression.
Requested and discussed with: jeff
Tested by: pho
64-bit numbers. ktr_tracepoint() in fact casts all the passed values to
u_long values, as that is what the ktr entries can handle.
However, we have to work a lot with vm_pindex_t, which is always
64-bit, also on 32-bit architectures (the most notable case being i386).
Use macros to split the 64-bit printing into 32-bit chunks which
KTR can correctly handle.
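A hedged sketch of the kind of macros this means (the macro names and
the example tracepoint are illustrative):

    /* Pass a 64-bit pindex as two 32-bit halves so that the u_long
     * casts in ktr_tracepoint() cannot truncate it on 32-bit
     * architectures. */
    #define KTR64_HI(x) ((u_long)((uint64_t)(x) >> 32))
    #define KTR64_LO(x) ((u_long)((uint64_t)(x) & 0xffffffffUL))

    CTR3(KTR_VM, "insert: pindex %#lx:%08lx node %p",
        KTR64_HI(pindex), KTR64_LO(pindex), rnode);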
Reported and tested by: flo
- Fix bugs in the free path where the pages were not unwired and the
relevant locking wasn't acquired.
- Introduce rnode_map, a submap of kernel_map, to allocate from.
The reason is that, on architectures without a direct mapping,
kmem_alloc*() will try to insert the newly created mapping while
holding the vm_object lock, introducing a LOR or lock recursion.
rnode_map is, however, a leaf submap, thus there cannot be any
deadlock.
Notes: Size the submap to be, by default, around 64 MB and decrease
the size of the nodes as the allocations will be much smaller (and
when the compacting code in vm_radix is implemented this will aim
for much less space to be used). Note, however, that the size of the
submap can be changed at boot time via the hw.rnode_map_scale scaling
factor (see the sketch after this list).
- Use uma_zone_set_max() covering the size of the submap.
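A hedged sketch of the submap setup described above (the default size
comes from the notes; rnode_zone stands for the nodes' UMA zone, and the
handling of the hw.rnode_map_scale tunable is left out as illustrative):

    vm_map_t rnode_map;             /* Normally a file-scope variable. */
    vm_offset_t minaddr, maxaddr;
    vm_size_t rnode_map_size;

    rnode_map_size = 64 * 1024 * 1024;  /* ~64 MB default. */
    /* (The hw.rnode_map_scale tunable would adjust this size here.) */
    rnode_map = kmem_suballoc(kernel_map, &minaddr, &maxaddr,
        rnode_map_size, FALSE);
    /* Cap the zone so it can never outgrow the submap. */
    uma_zone_set_max(rnode_zone,
        rnode_map_size / sizeof(struct vm_radix_node));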
Tested by: flo
still as it can be useful.
- Make most of the interface private as it is unnecessarily public
right now. This will help in letting the nodes change with the
architecture while still avoiding namespace pollution.
callers of vm_page_insert().
The default action for every caller is to unwind the operation, except
for vm_page_rename(), where this has proven to be impossible to do.
For that case, it just spins until the needed memory is available to be
allocated. However, as vm_page_rename() is mostly rare (and this panic
has never been hit in the past), it is thought to be a very seldom
thing and not a possible performance factor.
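A hedged sketch of the default caller pattern (the unwind action shown
here is illustrative; each real caller differs in its details):

    if (vm_page_insert(m, object, pindex) != 0) {
        /* Insertion failed: unwind by freeing the freshly allocated
         * page instead of panicking. */
        vm_page_lock(m);
        vm_page_free(m);
        vm_page_unlock(m);
        return (NULL);
    }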
The patch has been tested with an atomic counter making the zone
allocator return NULL once every 100000 allocations. Via printf, I've
verified that a typical buildkernel could trigger this 30 times. The
patch survived 2 hours of repeated buildkernel/world.
Several technical notes:
- The vm_page_insert() call is moved, in several callers, closer to the
failure points. This could be committed separately, before vmcontention
hits the tree, just to verify that -CURRENT is happy with it.
- vm_page_rename() does not need to have the page lock held in the
callers as it hides that as an implementation detail. Do the locking
internally.
- vm_page_insert() now returns an int, with 0 meaning everything was
ok, thus the KPI is broken by this patch.
wrap-up at some point.
This bug is triggered very easily by indirect blocks in UFS which grow
negative, resulting in very high counts.
In collaboration with: flo
without the VM_OBJECT_LOCK held, thus they can be concurrent with BLACK
ones.
However, also use a write memory barrier in order to not reorder the
decrement of rn_count with respect to fetching the pointer.
Discussed with: jeff
- Avoid using atomics to manipulate it at level 0 because it seems
unneeded and introduces a bug on big-endian architectures, where only
the top half (2 bytes) of the double-word is written (as sparc64,
for example, doesn't support atomics at 16 bits), leading to wrong
handling of rn_count.
Reported by: flo, andreast
Found by: marius
No answer by: jeff