function, startup_alloc(), that is used for single page allocations prior
to the VM starting up. If it is used after the VM startups up, it
replaces the zone's allocf pointer with either page_alloc() or
uma_small_alloc() where appropriate.
Pointy hat to: me
Tested by: phk/amd64, me/x86
Temporarily disable the UMA_MD_SMALL_ALLOC stuff since recent commits
break sparc64, amd64, ia64 and alpha. It appears only i386 and maybe
powerpc were not broken.
working set cache. This has several advantages. Firstly, we never touch
the per cpu queues now in the timeout handler. This removes one more
reason for having per cpu locks. Secondly, it reduces the size of the zone
by 8 bytes, bringing it under 200 bytes for a single proc x86 box. This
tidies up other logic as well.
- The 'destroy' flag no longer needs to be passed to zone_drain() since it
always frees everything in the zone's slabs.
- cache_drain() is now only called from zone_dtor() and so it destroys by
default. It also does not need the destroy parameter now.
broken consumers of the malloc interface who assume that the allocated
address will be an even multiple of the size.
- Remove disabled time delay code on uma_reclaim(). The comment there said
it all. It was not an effective strategy and it should not be left in
#if 0'd for all eternity.
page_alloc() function from the slab_zalloc() function. This allows us
to unconditionally call uz_allocf().
- In page_alloc() cleanup the boot_pages logic some. Previously memory from
this cache that was not used by the time the system started was left in
the cache and never used. Typically this wasn't more than a few pages,
but now we will use this cache so long as memory is available.
by accepting the user supplied flags directly. Previously this was not
done so that flags for the same field would not be defined in two
different files. Add comments in each header instructing future
developers on how now to shoot their feet.
- Fix a test for !OFFPAGE which should have been a test for HASH. This would
have caused a panic if we had ever destructed a malloc zone. This also
opens up the possibility that other zones could use the vsetobj() method
rather than a hash.
don't cache as many items.
- Introduce the bucket_alloc(), bucket_free() functions to wrap bucket
allocation. These functions select the appropriate bucket zone to
allocate from or free to.
- Rename ub_ptr to ub_cnt to reflect a change in its use. ub_cnt now reflects
the count of free items in the bucket. This gets rid of many unnatural
subtractions by 1 throughout the code.
- Add ub_entries which reflects the number of entries possibly held in a
bucket.
UMA_ZFLAG_INTERNAL zones at all. Apparently, Wilko's alpha
was crashing while entering multi-user because, I think, we
were calculating the garbage cachefree for pcpu caches that
essentially don't exist for at least the 'zones' zone and it so
happened that we were reading from an unmapped location.
Confirmed to fix crash: wilko
Helped debug: wilko, gallatin
compare the zone element size (+1 for the byte of linkage) against
UMA_SLAB_SIZE - sizeof(struct uma_slab), and not just UMA_SLAB_SIZE.
Add a KASSERT in zone_small_init to make sure that the computed
ipers (items per slab) for the zone is not zero, despite the addition
of the check, just to be sure (this part submitted by: silby)
- UMA_ZONE_VM used to imply BUCKETCACHE. Now it implies
CACHEONLY instead. CACHEONLY is like BUCKETCACHE in the
case of bucket allocations, but in addition to that also ensures that
we don't setup the zone with OFFPAGE slab headers allocated from the
slabzone. This means that we're not allowed to have a UMA_ZONE_VM
zone initialized for large items (zone_large_init) because it would
require the slab headers to be allocated from slabzone, and hence
kmem_map. Some of the zones init'd with UMA_ZONE_VM are so init'd
before kmem_map is suballoc'd from kernel_map, which is why this
change is necessary.
concurrent invocations from acquiring the same address(es). Also, in case
of an incomplete allocation, free any allocated pages.
In collaboration with: tegge
sure that uma_dbg_free() is called if we're about to call
uma_zfree_internal() but we're asking it to skip the dtor and
uma_dbg_free() call itself. So, if we're about to call
uma_zfree_internal() from uma_zfree_arg() and skip == 1, call
uma_dbg_free() ourselves.
1) The race has to do with zone destruction. From the zone destructor we
would lock the zone, set the working set size to 0, then unlock the zone,
drain it, and then free the structure. Within the window following the
working-set-size set to 0 and unlocking of the zone and the point where
in zone_drain we re-acquire the zone lock, the uma timer routine could
have fired off and changed the working set size to something non-zero,
thereby potentially preventing us from completely freeing slabs before
destroying the zone (and thus leaking them).
2) The leak has to do with zone destruction as well. When destroying a
zone we would take care to free all the buckets cached in the zone, but
although we would drain the pcpu cache buckets, we would not free them.
This resulted in leaking a couple of bucket structures (512 bytes each)
per cpu on SMP during zone destruction.
While I'm here, also silence GCC warnings by turning uma_slab_alloc()
from inline to real function. It's too big to be an inline.
Reviewed by: JeffR
("UMA Zone") carefully, because it does not have pcpu caches allocated
at all. In the UP case, we did not catch this because one pcpu cache
is always allocated with the zone, but for the MP case, we were getting
bogus stats for this zone.
Tested by: Lukas Ertl <le@univie.ac.at>
- In sysctl_vm_zone use the per cpu locks to read the current cache
statistics this makes them more accurate while under heavy load.
Submitted by: tegge
so not only wastes memory but it can also cause a leak in zones that
will be destroyed later. The problem is that the slab allocation code
places newly created slabs on the partially allocated list because it
assumes that the caller will actually allocate some memory from it.
Failure to do so places an otherwise free slab on the partial slab list
where we wont find it later in zone_drain().
Continuously prodded to fix by: phk (Thanks)
uma_zalloc) is called with exactly one of either M_WAITOK or M_NOWAIT and
that it is called with neither M_TRYWAIT or M_DONTWAIT. Print a warning
if anything is wrong. Default to M_WAITOK of no flag is given. This is the
same test as in malloc(9).
of pcpu locks. This makes uma_zone somewhat smaller (by (LOCKNAME_LEN *
sizeof(char) + sizeof(struct mtx) * maxcpu) bytes, to be exact).
No Objections from jeff.
extra function calls. Refactor uma_zalloc_internal into seperate functions
for finding the most appropriate slab, filling buckets, allocating single
items, and pulling items off of slabs. This makes the code significantly
cleaner.
- This also fixes the "Returning an empty bucket." panic that a few people
have seen.
Tested On: alpha, x86
held. This avoids a lock order reversal when destroying zones.
Unfortunately, this also means that the free checks are not done before
the destructor is called.
Reported by: phk
- Remove all instances of the mallochash.
- Stash the slab pointer in the vm page's object pointer when allocating from
the kmem_obj.
- Use the overloaded object pointer to find slabs for malloced memory.