sure that uma_dbg_free() is called if we're about to call
uma_zfree_internal() but we're asking it to skip the dtor and
uma_dbg_free() call itself. So, if we're about to call
uma_zfree_internal() from uma_zfree_arg() and skip == 1, call
uma_dbg_free() ourselves.
in sync with the backend machdep code. When cpu_thread_init() does not
have the same idea of KSTACK_PAGES as the thing that created the kstack,
all hell breaks loose.
Bad alc! no cookie! :-)
1) The race has to do with zone destruction. From the zone destructor we
would lock the zone, set the working set size to 0, then unlock the zone,
drain it, and then free the structure. Within the window following the
working-set-size set to 0 and unlocking of the zone and the point where
in zone_drain we re-acquire the zone lock, the uma timer routine could
have fired off and changed the working set size to something non-zero,
thereby potentially preventing us from completely freeing slabs before
destroying the zone (and thus leaking them).
2) The leak has to do with zone destruction as well. When destroying a
zone we would take care to free all the buckets cached in the zone, but
although we would drain the pcpu cache buckets, we would not free them.
This resulted in leaking a couple of bucket structures (512 bytes each)
per cpu on SMP during zone destruction.
While I'm here, also silence GCC warnings by turning uma_slab_alloc()
from inline to real function. It's too big to be an inline.
Reviewed by: JeffR
("UMA Zone") carefully, because it does not have pcpu caches allocated
at all. In the UP case, we did not catch this because one pcpu cache
is always allocated with the zone, but for the MP case, we were getting
bogus stats for this zone.
Tested by: Lukas Ertl <le@univie.ac.at>
- In sysctl_vm_zone use the per cpu locks to read the current cache
statistics this makes them more accurate while under heavy load.
Submitted by: tegge
so not only wastes memory but it can also cause a leak in zones that
will be destroyed later. The problem is that the slab allocation code
places newly created slabs on the partially allocated list because it
assumes that the caller will actually allocate some memory from it.
Failure to do so places an otherwise free slab on the partial slab list
where we wont find it later in zone_drain().
Continuously prodded to fix by: phk (Thanks)
contain the filedescriptor number on opens from userland.
The index is used rather than a "struct file *" since it conveys a bit
more information, which may be useful to in particular fdescfs and /dev/fd/*
For now pass -1 all over the place.
uma_zalloc) is called with exactly one of either M_WAITOK or M_NOWAIT and
that it is called with neither M_TRYWAIT or M_DONTWAIT. Print a warning
if anything is wrong. Default to M_WAITOK of no flag is given. This is the
same test as in malloc(9).
the "toss the largest process" emergency handling) from vm_map.c to
swap_pager.c.
The quantity calculated depends strongly on the internals of the
swap_pager and by moving it, we no longer need to expose the
internal metrics of the swap_pager to the world.
paging space and how much of it is in use (in pages).
Use this interface from the Linuxolator instead of groping around in the
internals of the swap_pager.
is not natural and needlessly exposes a lot of dirty laundry.
Move private interfaces between the two from swap_pager.h to swap_pager.c
and staticize as much as possible.
No functional change.
files, so that SWAP_META_PAGES does not vary either.
swap_pager.c ended up with a value of 16, everybody else 8. Go with
the 16 for now.
This should only have any effect in the "kill processes because we
are out of swap" scenario, where it will make some sort of estimate
of something more precise.
- Change vm_pageout_object_deactivate_pages()'s first parameter from a
vm_map_t to a pmap_t.
- Change vm_pageout_object_deactivate_pages()'s and
vm_pageout_map_deactivate_pages()'s last parameter from a vm_pindex_t
to a long. Since the number of pages in an address space doesn't
require 64 bits on an i386, vm_pindex_t is overkill.
order to avoid the overhead of later page faults. In general, it
implements two cases: one for vnode-backed objects and one for
device-backed objects. Only the device-backed case is really
machine-dependent, belonging in the pmap.
This commit moves the vnode-backed case into the (relatively) new
function vm_map_pmap_enter(). On amd64 and i386, this commit only
amounts to code rearrangement. On alpha and ia64, the new machine
independent (MI) implementation of the vnode case is smaller and more
efficient than their pmap-based implementations. (The MI
implementation takes advantage of the fact that objects in -CURRENT
are ordered collections of pages.) On sparc64, pmap_object_init_pt()
hadn't (yet) been implemented.
implementation of a largely MI pmap_object_init_pt() for vnode-backed
objects. pmap_enter_quick() is implemented via pmap_enter() on sparc64
and powerpc.
- Correct a mismatch between pmap_object_init_pt()'s prototype and its
various implementations. (I plan to keep pmap_object_init_pt() as
the MD hook for device-backed objects on i386 and amd64.)
- Correct an error in ia64's pmap_enter_quick() and adjust its interface
to match the other versions. Discussed with: marcel
color in vm_page_alloc(). (This also has small performance benefits.)
- Eliminate vm_page_select_free(); vm_page_alloc() might as well
call vm_pageq_find() directly.
of pcpu locks. This makes uma_zone somewhat smaller (by (LOCKNAME_LEN *
sizeof(char) + sizeof(struct mtx) * maxcpu) bytes, to be exact).
No Objections from jeff.
releasing the lock only if we are about to sleep (e.g., vm_pager_get_pages()
or vm_pager_has_pages()). If we sleep, we have marked the vm object with
the paging-in-progress flag.
Several of the subtypes have an associated vnode which is used for
stuff like the f*() functions.
By giving the vnode a speparate field, a number of checks for the specific
subtype can be replaced simply with a check for f_vnode != NULL, and
we can later free f_data up to subtype specific use.
At this point in time, f_data still points to the vnode, so any code I
might have overlooked will still work.
to the machine-independent parts of the VM. At the same time, this
introduces vm object locking for the non-i386 platforms.
Two details:
1. KSTACK_GUARD has been removed in favor of KSTACK_GUARD_PAGES. The
different machine-dependent implementations used various combinations
of KSTACK_GUARD and KSTACK_GUARD_PAGES. To disable guard page, set
KSTACK_GUARD_PAGES to 0.
2. Remove the (unnecessary) clearing of PG_ZERO in vm_thread_new. In
5.x, (but not 4.x,) PG_ZERO can only be set if VM_ALLOC_ZERO is passed
to vm_page_alloc() or vm_page_grab().
processes in the first pass. Among other things, this will give
us a chance to launder vnode-backed pages before concluding that
we need more swap. This is particularly useful for systems that
have no swap.
While here, update a comment and remove some long-unused code.
Reported by: Lucky Green <shamrock@cypherpunks.to>
Suggested by: dillon
Approved by: re (rwatson)
- Move struct sigacts out of the u-area and malloc() it using the
M_SUBPROC malloc bucket.
- Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(),
sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared().
- Remove the p_sigignore, p_sigacts, and p_sigcatch macros.
- Add a mutex to struct sigacts that protects all the members of the struct.
- Add sigacts locking.
- Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now
that sigacts is locked.
- Several in-kernel functions such as psignal(), tdsignal(), trapsignal(),
and thread_stopped() are now MP safe.
Reviewed by: arch@
Approved by: re (rwatson)