freebsd-dev

Author	SHA1	Message	Date
Bosko Milekic	a5a262c6db	Fix an INVARIANTS-only bug introduced in Revision 1.104: if INVARIANTS is defined, and in the rare case that we have allocated some objects from the slab and at least one initializer on at least one of those objects failed, and we need to fail the allocation and push the uninitialized items back into the slab caches -- in that scenario, we would fail to [re]set the bucket cache's ub_bucket item references to NULL, which would eventually trigger a KASSERT.	2004-10-27 21:19:35 +00:00
Brian Feldman	55fc8c1146	In the previous revision, I did not intend to change the default value of "nosleepwithlocks." Submitted by: ru	2004-10-09 18:51:32 +00:00
Brian Feldman	ab14a3f7aa	Fix critical stability problems that can cause UMA mbuf cluster state management corruption, mbuf leaks, general mbuf corruption, and at least on i386 a first level splash damage radius that encompasses up to about half a megabyte of the memory after an mbuf cluster's allocation slab. In short, this has caused instability nightmares anywhere the right kind of network traffic is present. When the polymorphic refcount slabs were added to UMA, the new types were not used pervasively. In particular, the slab management structure was turned into one for refcounts, and one for non-refcounts (supposed to be mostly like the old slab management structure), but the latter was almost always used throughout. In general, every access to zones with UMA_ZONE_REFCNT turned on corrupted the "next free" slab offset and the refcount with each other and with other allocations (on i386, 2 mbuf clusters per 4096 byte slab). Fix things so that the right type is used to access refcounted zones where it was not before. There are additional errors in gross overestimation of padding, it seems, that would cause large kegs (nee zones) to be allocated when small ones would do. Unless I have analyzed this incorrectly, it is not directly harmful.	2004-10-08 20:19:29 +00:00
Robert Watson	3659f747f1	Generate KTR trace records for uma_zalloc_arg() and uma_zfree_arg(). This doesn't trace every event of interest in UMA, but provides enough basic information to explain lock traces and sleep patterns.	2004-08-06 21:52:38 +00:00
Brian Feldman	b23f72e98a	* Add a "how" argument to uma_zone constructors and initialization functions so that they know whether the allocation is supposed to be able to sleep or not. * Allow uma_zone constructors and initialization functions to return either success or error. Almost all of the ones in the tree currently return success unconditionally, but mbuf is a notable exception: the packet zone constructor wants to be able to fail if it cannot suballocate an mbuf cluster, and the mbuf allocators want to be able to fail in general in a MAC kernel if the MAC mbuf initializer fails. This fixes the panics people are seeing when they run out of memory for mbuf clusters. * Allow debug.nosleepwithlocks on WITNESS to be disabled, without changing the default. Both bmilekic and jeff have reviewed the changes made to make failable zone allocations work.	2004-08-02 00:18:36 +00:00
Bosko Milekic	244f45548a	Rework the way slab header storage space is calculated in UMA. - zone_large_init() stays pretty much the same. - zone_small_init() will try to stash the slab header in the slab page being allocated if the amount of calculated wasted space is less than UMA_MAX_WASTE (for both the UMA_ZONE_REFCNT case and regular case). If the amount of wasted space is >= UMA_MAX_WASTE, then UMA_ZONE_OFFPAGE will be set and the slab header will be allocated separately for better use of space. - uma_startup() calculates the maximum ipers required in offpage slabs (so that the offpage slab header zone(s) can be sized accordingly). The algorithm used to calculate this replaces the old calculation (which only happened to work coincidentally). We now iterate over possible object sizes, starting from the smallest one, until we determine that wastedspace calculated in zone_small_init() might end up being greater than UMA_MAX_WASTE, at which point we use the found object size to compute the maximum possible ipers. The reason this works is because: - wastedspace versus objectsize is a see-saw function with local minima all equal to zero and local maxima growing directly proportional to objectsize. This implies that for objects up to or equal to a certain objectsize, the see-saw remains entirely below UMA_MAX_WASTE, so for those objectsizes it is impossible to ever go OFFPAGE for slab headers. - ipers (items-per-slab) versus objectsize is an inversely proportional function which falls off very quickly (very large for small objectsizes). - To determine the maximum ipers we'll ever need from OFFPAGE slab headers we first find the largest objectsize for which we are guaranteed to not go offpage for and use it to compute ipers (as though we were offpage). 
Since the only objectsizes allowed to go offpage are bigger than the found objectsize, and since ipers vs objectsize is inversely proportional (and monotonically decreasing), then we are guaranteed that the ipers computed is always >= what we will ever need in offpage slab headers. - Define UMA_FRITM_SZ and UMA_FRITMREF_SZ to be the actual (possibly padded) size of each freelist index so that offset calculations are fixed. This might fix weird data corruption problems and certainly allows ARM to now boot to at least single-user (via simulator). Tested on i386 UP by me. Tested on sparc64 SMP by fenner. Tested on ARM simulator to single-user by cognet.	2004-07-29 15:25:40 +00:00
Alan Cox	5285558ac2	- Change uma_zone_set_obj() to call kmem_alloc_nofault() instead of kmem_alloc_pageable(). The difference between these is that an errant memory access to the zone will be detected sooner with kmem_alloc_nofault(). The following changes serve to eliminate the following lock-order reversal reported by witness: 1st 0xc1a3c084 vm object (vm object) @ vm/swap_pager.c:1311 2nd 0xc07acb00 swap_pager swhash (swap_pager swhash) @ vm/swap_pager.c:1797 3rd 0xc1804bdc vm object (vm object) @ vm/uma_core.c:931 There is no potential deadlock in this case. However, witness is unable to recognize this because vm objects used by UMA have the same type as ordinary vm objects. To remedy this, we make the following changes: - Add a mutex type argument to VM_OBJECT_LOCK_INIT(). - Use the mutex type argument to assign distinct types to special vm objects such as the kernel object, kmem object, and UMA objects. - Define a static swap zone object for use by UMA. (Only static objects are assigned a special mutex type.)	2004-07-22 19:44:49 +00:00
Brian Feldman	0c3c862e21	Since breakage of malloc(9)/uma_zalloc(9) is totally non-optional in GENERIC/for WITNESS users, make sure the sysctl to disable the behavior is read-only and always enabled.	2004-07-19 15:05:24 +00:00
Bosko Milekic	0d0837ee6d	Introduce debug.nosleepwithlocks sysctl, 0 by default. If set to 1 and WITNESS is not built, then force all M_WAITOK allocations to M_NOWAIT behavior (transparently). This is to be used temporarily if weird deadlocks are reported because we still have code paths that perform M_WAITOK allocations with lock(s) held, which can lead to deadlock. If WITNESS is compiled, then the sysctl is ignored and we ask witness to tell us whether we have locks held, converting to M_NOWAIT behavior only if it tells us that we do. Note this removes the previous mbuf.h inclusion as well (only needed by last revision), and cleans up unneeded [artificial] comparisons to just the mbuf zones. The problem described above has nothing to do with previous mbuf wait behavior; it is a general problem.	2004-07-04 16:07:44 +00:00
Brian Feldman	7a708c3626	Reextend the M_WAITOK-disabling-hack to all three of the mbuf-related zones, and do it by direct comparison of uma_zone_t instead of strcmp. The mbuf subsystem used to provide M_TRYWAIT/M_DONTWAIT semantics, but this is mostly no longer the case. M_WAITOK has taken over the spot M_TRYWAIT used to have, and for mbuf things, still may return NULL if the code path is incorrectly holding a mutex going into mbuf allocation functions. The M_WAITOK/M_NOWAIT semantics are absolute; though it may deadlock the system to try to malloc or uma_zalloc something with a mutex held and M_WAITOK specified, it is absolutely required to not return NULL and will result in instability and/or security breaches otherwise. There is still room to add the WITNESS_WARN() to all cases so that we are notified of the possibility of deadlocks, but it cannot change the value of the "badness" variable and allow allocation to actually fail except for the specialized cases which used to be M_TRYWAIT.	2004-07-04 15:59:25 +00:00
Brian Feldman	cf107c1d1a	Limit mbuma damage. Suddenly ALL allocations with M_WAITOK are subject to failing -- that is, allocations via malloc(M_WAITOK) that are required to never fail -- if WITNESS is not defined. While everyone should be running WITNESS, in any case, zone "Mbuf" allocations are really the only ones that should be screwed with by this hack. This hack is crashing people, and would continue to do so with or without WITNESS. Things shouldn't be allocating with M_WAITOK with locks held, but it's not okay just to always remove M_WAITOK when !WITNESS. Reported by: Bernd Walter <ticso@cicely5.cicely.de>	2004-07-03 18:11:41 +00:00
Bosko Milekic	cc822cb53e	Make uma_mtx MTX_RECURSE. Here's why: The general UMA lock is a recursion-allowed lock because there is a code path where, while we're still configured to use startup_alloc() for backend page allocations, we may end up in uma_reclaim() which calls zone_foreach(zone_drain), which grabs uma_mtx, only to later call into startup_alloc() because while freeing we needed to allocate a bucket. Since startup_alloc() also takes uma_mtx, we need to be able to recurse on it. This exact explanation also added as comment above mtx_init(). Trace showing recursion reported by: Peter Holm <peter-at-holm.cc>	2004-06-23 21:59:03 +00:00
Bosko Milekic	7fd8788213	Backout previous change, I think Julian has a better solution which does not require type-stable refcnts here.	2004-06-09 20:50:08 +00:00
Bosko Milekic	e66468ea7a	Make the slabrefzone, the zone from which we allocated slabs with internal reference counters, UMA_ZONE_NOFREE. This way, those slabs (with their ref counts) will be effectively type-stable, then using a trick like this on the refcount is no longer dangerous: MEXT_REM_REF(m); if (atomic_cmpset_int(m->m_ext.ref_cnt, 0, 1)) { if (m->m_ext.ext_type == EXT_PACKET) { uma_zfree(zone_pack, m); return; } else if (m->m_ext.ext_type == EXT_CLUSTER) { uma_zfree(zone_clust, m->m_ext.ext_buf); m->m_ext.ext_buf = NULL; } else { (*(m->m_ext.ext_free))(m->m_ext.ext_buf, m->m_ext.ext_args); if (m->m_ext.ext_type != EXT_EXTREF) free(m->m_ext.ref_cnt, M_MBUF); } } uma_zfree(zone_mbuf, m); Previously, a second thread hitting the above cmpset might actually read the refcnt AFTER it has already been freed. A very rare occurrence. Now we'll know that it won't be freed, though. Spotted by: julian, pjd	2004-06-09 19:18:50 +00:00
Bosko Milekic	099a0e588c	Bring in mbuma to replace mballoc. mbuma is an Mbuf & Cluster allocator built on top of a number of extensions to the UMA framework, all included herein. Extensions to UMA worth noting: - Better layering between slab <-> zone caches; introduce Keg structure which splits off slab cache away from the zone structure and allows multiple zones to be stacked on top of a single Keg (single type of slab cache); perhaps we should look into defining a subset API on top of the Keg for special use by malloc(9), for example. - UMA_ZONE_REFCNT zones can now be added, and reference counters automagically allocated for them within the end of the associated slab structures. uma_find_refcnt() does a kextract to fetch the slab struct reference from the underlying page, and lookup the corresponding refcnt. mbuma things worth noting: - integrates mbuf & cluster allocations with extended UMA and provides caches for commonly-allocated items; defines several zones (two primary, one secondary) and two kegs. - change up certain code paths that always used to do: m_get() + m_clget() to instead just use m_getcl() and try to take advantage of the newly defined secondary Packet zone. - netstat(1) and systat(1) quickly hacked up to do basic stat reporting but additional stats work needs to be done once some other details within UMA have been taken care of and it becomes clearer to how stats will work within the modified framework. From the user perspective, one implication is that the NMBCLUSTERS compile-time option is no longer used. The maximum number of clusters is still capped off according to maxusers, but it can be made unlimited by setting the kern.ipc.nmbclusters boot-time tunable to zero. Work should be done to write an appropriate sysctl handler allowing dynamic tuning of kern.ipc.nmbclusters at runtime. Additional things worth noting/known issues (READ): - One report of 'ips' (ServeRAID) driver acting really slow in conjunction with mbuma. Need more data. 
Latest report is that ips is equally sucking with and without mbuma. - Giant leak in NFS code sometimes occurs, can't reproduce but currently analyzing; brueffer is able to reproduce but THIS IS NOT an mbuma-specific problem and currently occurs even WITHOUT mbuma. - Issues in network locking: there is at least one code path in the rip code where one or more locks are acquired and we end up in m_prepend() with M_WAITOK, which causes WITNESS to whine from within UMA. Current temporary solution: force all UMA allocations to be M_NOWAIT from within UMA for now to avoid deadlocks unless WITNESS is defined and we can determine with certainty that we're not holding any locks when we're M_WAITOK. - I've seen at least one weird socketbuffer empty-but-mbuf-still-attached panic. I don't believe this to be related to mbuma but please keep your eyes open, turn on debugging, and capture crash dumps. This change removes more code than it adds. A paper is available detailing the change and considering various performance issues, it was presented at BSDCan2004: http://www.unixdaemons.com/~bmilekic/netbuf_bmilekic.pdf Please read the paper for Future Work and implementation details, as well as credits. Testing and Debugging: rwatson, brueffer, Ketrien I. Saihr-Kesenchedra, ... Reviewed by: Lots of people (for different parts)	2004-05-31 21:46:06 +00:00
Alan Cox	5d328ed44b	- Make the acquisition of Giant in vm_fault_unwire() conditional on the pmap. For the kernel pmap, Giant is not required. In general, for other pmaps, Giant is required by i386's pmap_pte() implementation. Specifically, the use of PMAP2/PADDR2 is synchronized by Giant. Note: In principle, updates to the kernel pmap's wired count could be lost without Giant. However, in practice, we never use the kernel pmap's wired count. This will be resolved when pmap locking appears. - With the above change, cpu_thread_clean() and uma_large_free() need not acquire Giant. (The first case is simply the revival of i386/i386/vm_machdep.c's revision 1.226 by peter.)	2004-03-10 04:44:43 +00:00
Robert Watson	a3c0761103	Mark uma_callout as CALLOUT_MPSAFE, as uma_timeout can run MPSAFE. Reviewed by: jeff	2004-03-07 07:00:46 +00:00
Jeff Roberson	aaa8bb1604	- Fix a problem where we did not drain the cache of buckets in the zone when uma_reclaim() was called. This was introduced when the zone working-set algorithm was removed in favor of using the per cpu caches as the working set.	2004-02-01 06:15:17 +00:00
Dag-Erling Smørgrav	e726bc0e6c	Mechanical whitespace cleanup.	2004-01-30 16:26:29 +00:00
John Baldwin	b6c71225a9	Fix all users of mp_maxid to use the same semantics, namely: 1) mp_maxid is a valid FreeBSD CPU ID in the range 0 .. MAXCPU - 1. 2) For all active CPUs in the system, PCPU_GET(cpuid) <= mp_maxid. Approved by: re (scottl) Tested on: i386, amd64, alpha	2003-12-03 14:57:26 +00:00
Jeff Roberson	e30b97c5f9	- Unbreak UP. mp_maxid is not defined on uni-processor machines, although I believe it and the other MP variables should be. For now, just define it here and wait for jhb to clean it up later. Approved by: re (rwatson)	2003-11-30 22:18:14 +00:00
Jeff Roberson	504d5de3a8	- Replace the local maxcpu with mp_maxid. Previously, if mp_maxid was equal to MAXCPU, we would overrun the pcpu_mtx array because maxcpu was calculated incorrectly. - Add some more debugging code so that memory leaks at the time of uma_zdestroy() are more easily diagnosed. Approved by: re (rwatson)	2003-11-30 08:04:01 +00:00
Alan Cox	d1f42ac2ee	- Remove use of Giant from uma_zone_set_obj().	2003-11-14 17:49:07 +00:00
Jeff Roberson	009b6fcb03	- Fix MD_SMALL_ALLOC on architectures that support it. Define a new alloc function, startup_alloc(), that is used for single page allocations prior to the VM starting up. If it is used after the VM starts up, it replaces the zone's allocf pointer with either page_alloc() or uma_small_alloc() where appropriate. Pointy hat to: me Tested by: phk/amd64, me/x86	2003-09-21 07:39:16 +00:00
Peter Wemm	c43ab0b5a1	Bad Jeffr! No cookie! Temporarily disable the UMA_MD_SMALL_ALLOC stuff since recent commits break sparc64, amd64, ia64 and alpha. It appears only i386 and maybe powerpc were not broken.	2003-09-20 23:35:33 +00:00
Jeff Roberson	9643769a3a	- Remove the working-set algorithm. Instead, use the per cpu buckets as the working set cache. This has several advantages. Firstly, we never touch the per cpu queues now in the timeout handler. This removes one more reason for having per cpu locks. Secondly, it reduces the size of the zone by 8 bytes, bringing it under 200 bytes for a single proc x86 box. This tidies up other logic as well. - The 'destroy' flag no longer needs to be passed to zone_drain() since it always frees everything in the zone's slabs. - cache_drain() is now only called from zone_dtor() and so it destroys by default. It also does not need the destroy parameter now.	2003-09-19 23:27:46 +00:00
Jeff Roberson	3e0cab95c0	- Remove the cache colorization code. We can't use it due to all of the broken consumers of the malloc interface who assume that the allocated address will be an even multiple of the size. - Remove disabled time delay code on uma_reclaim(). The comment there said it all. It was not an effective strategy and it should not be left in #if 0'd for all eternity.	2003-09-19 23:04:44 +00:00
Jeff Roberson	64f051e99a	- There is an endless stream of style(9) errors in this file. Fix a few. Also catch some spelling errors.	2003-09-19 22:31:45 +00:00
Jeff Roberson	44eca34adb	- Don't inspect the zone in page_alloc(). It may be NULL. - Don't cache more items than the zone would like in uma_zalloc_bucket().	2003-09-19 09:22:04 +00:00
Jeff Roberson	45bf76f0f8	- Move the logic for dealing with the uma_boot_pages cache into the page_alloc() function from the slab_zalloc() function. This allows us to unconditionally call uz_allocf(). - In page_alloc() cleanup the boot_pages logic some. Previously memory from this cache that was not used by the time the system started was left in the cache and never used. Typically this wasn't more than a few pages, but now we will use this cache so long as memory is available.	2003-09-19 08:53:33 +00:00
Jeff Roberson	b60f5b794e	- Fix the silly flag situation in UMA. Remove redundant ZFLAG/ZONE flags by accepting the user supplied flags directly. Previously this was not done so that flags for the same field would not be defined in two different files. Add comments in each header instructing future developers on how not to shoot their feet. - Fix a test for !OFFPAGE which should have been a test for HASH. This would have caused a panic if we had ever destructed a malloc zone. This also opens up the possibility that other zones could use the vsetobj() method rather than a hash.	2003-09-19 08:37:44 +00:00
Jeff Roberson	961647dfd0	- Don't abuse M_DEVBUF, define a tag for UMA hashes.	2003-09-19 07:23:50 +00:00
Jeff Roberson	b983089a05	- Eliminate a pair of unnecessary variables.	2003-09-19 06:41:06 +00:00
Jeff Roberson	cae33c1429	- Initialize a pool of bucket zones so that we waste less space on zones that don't cache as many items. - Introduce the bucket_alloc(), bucket_free() functions to wrap bucket allocation. These functions select the appropriate bucket zone to allocate from or free to. - Rename ub_ptr to ub_cnt to reflect a change in its use. ub_cnt now reflects the count of free items in the bucket. This gets rid of many unnatural subtractions by 1 throughout the code. - Add ub_entries which reflects the number of entries possibly held in a bucket.	2003-09-19 06:26:45 +00:00
Bosko Milekic	1c35e213f1	In sysctl_vm_zone, do not calculate per-cpu cache stats on UMA_ZFLAG_INTERNAL zones at all. Apparently, Wilko's alpha was crashing while entering multi-user because, I think, we were calculating the garbage cachefree for pcpu caches that essentially don't exist for at least the 'zones' zone and it so happened that we were reading from an unmapped location. Confirmed to fix crash: wilko Helped debug: wilko, gallatin	2003-08-20 18:22:06 +00:00
Bosko Milekic	20e8e865bd	- When deciding whether to init the zone with small_init or large_init, compare the zone element size (+1 for the byte of linkage) against UMA_SLAB_SIZE - sizeof(struct uma_slab), and not just UMA_SLAB_SIZE. Add a KASSERT in zone_small_init to make sure that the computed ipers (items per slab) for the zone is not zero, despite the addition of the check, just to be sure (this part submitted by: silby) - UMA_ZONE_VM used to imply BUCKETCACHE. Now it implies CACHEONLY instead. CACHEONLY is like BUCKETCACHE in the case of bucket allocations, but in addition to that also ensures that we don't setup the zone with OFFPAGE slab headers allocated from the slabzone. This means that we're not allowed to have a UMA_ZONE_VM zone initialized for large items (zone_large_init) because it would require the slab headers to be allocated from slabzone, and hence kmem_map. Some of the zones init'd with UMA_ZONE_VM are so init'd before kmem_map is suballoc'd from kernel_map, which is why this change is necessary.	2003-08-11 19:39:45 +00:00
Alan Cox	b245ac95cf	Revise obj_alloc(). Most notably, use the object's lock to prevent two concurrent invocations from acquiring the same address(es). Also, in case of an incomplete allocation, free any allocated pages. In collaboration with: tegge	2003-08-03 06:08:48 +00:00
Bosko Milekic	48bf87258f	When INVARIANTS is on and we're in uma_zalloc_free(), we need to make sure that uma_dbg_free() is called if we're about to call uma_zfree_internal() but we're asking it to skip the dtor and uma_dbg_free() call itself. So, if we're about to call uma_zfree_internal() from uma_zfree_arg() and skip == 1, call uma_dbg_free() ourselves.	2003-08-02 22:40:27 +00:00
Bosko Milekic	174ab4501e	Only free the pcpu cache buckets if they are non-NULL. Crashed this person's machine: harti Pointy-hat to: me	2003-08-01 17:42:27 +00:00
Bosko Milekic	d56368d779	Plug a race and a leak in UMA. 1) The race has to do with zone destruction. From the zone destructor we would lock the zone, set the working set size to 0, then unlock the zone, drain it, and then free the structure. Within the window following the working-set-size set to 0 and unlocking of the zone and the point where in zone_drain we re-acquire the zone lock, the uma timer routine could have fired off and changed the working set size to something non-zero, thereby potentially preventing us from completely freeing slabs before destroying the zone (and thus leaking them). 2) The leak has to do with zone destruction as well. When destroying a zone we would take care to free all the buckets cached in the zone, but although we would drain the pcpu cache buckets, we would not free them. This resulted in leaking a couple of bucket structures (512 bytes each) per cpu on SMP during zone destruction. While I'm here, also silence GCC warnings by turning uma_slab_alloc() from inline to real function. It's too big to be an inline. Reviewed by: JeffR	2003-07-30 18:55:15 +00:00
Bosko Milekic	a40fdcb439	When generating the zone stats make sure to handle the master zone ("UMA Zone") carefully, because it does not have pcpu caches allocated at all. In the UP case, we did not catch this because one pcpu cache is always allocated with the zone, but for the MP case, we were getting bogus stats for this zone. Tested by: Lukas Ertl <le@univie.ac.at>	2003-07-30 15:22:37 +00:00
Poul-Henning Kamp	7b4bd98ad5	Remove the disabling of buckets workaround. Thanks to: jeffr	2003-07-30 07:50:19 +00:00
Jeff Roberson	f828e5bedb	- Get rid of the ill-conceived uz_cachefree member of uma_zone. - In sysctl_vm_zone use the per cpu locks to read the current cache statistics; this makes them more accurate while under heavy load. Submitted by: tegge	2003-07-30 05:59:17 +00:00
Jeff Roberson	d11e0ba565	- Check to see if we need a slab prior to allocating one. Failure to do so not only wastes memory but it can also cause a leak in zones that will be destroyed later. The problem is that the slab allocation code places newly created slabs on the partially allocated list because it assumes that the caller will actually allocate some memory from it. Failure to do so places an otherwise free slab on the partial slab list where we won't find it later in zone_drain(). Continuously prodded to fix by: phk (Thanks)	2003-07-30 05:42:55 +00:00
Poul-Henning Kamp	0c32d97ab5	Temporary workaround: Always disable buckets, there is a bug there somewhere. JeffR will look at this as soon as he has time. OK'ed by: jeffr	2003-07-29 22:07:10 +00:00
Alan Cox	234c7726c8	None of the "alloc" functions used by UMA assume that Giant is held any longer. (If they still need it, e.g., contigmalloc(), they acquire it themselves.) Therefore, we need not acquire Giant in slab_zalloc().	2003-07-28 02:29:07 +00:00
Alan Cox	0c1a133f56	Gulp ... call kmem_malloc() without Giant.	2003-07-26 03:55:32 +00:00
Hartmut Brandt	8522511b2a	When INVARIANTS is defined make sure that uma_zalloc_arg (and hence uma_zalloc) is called with exactly one of either M_WAITOK or M_NOWAIT and that it is called with neither M_TRYWAIT nor M_DONTWAIT. Print a warning if anything is wrong. Default to M_WAITOK if no flag is given. This is the same test as in malloc(9).	2003-07-18 16:04:36 +00:00
Bosko Milekic	d88797c2ba	Move the pcpu lock out of the uma_cache and instead have a single set of pcpu locks. This makes uma_zone somewhat smaller (by (LOCKNAME_LEN * sizeof(char) + sizeof(struct mtx) * maxcpu) bytes, to be exact). No Objections from jeff.	2003-06-25 20:49:48 +00:00
Bosko Milekic	5c133dfa0e	Make sure that the zone destructor doesn't get called twice in certain free paths.	2003-06-25 17:25:45 +00:00
David E. O'Brien	874651b13c	Use __FBSDID().	2003-06-11 23:50:51 +00:00
Poul-Henning Kamp	c1f5a18201	Revert last commit, I have no idea what happened.	2003-06-09 22:51:36 +00:00
Poul-Henning Kamp	47f94c12da	A white-space nit I noticed.	2003-06-09 19:40:34 +00:00
Alan Cox	82774d8040	uma_zone_set_obj() must perform VM_OBJECT_LOCK_INIT() if the caller provides storage for the vm_object.	2003-04-28 06:11:32 +00:00
Alan Cox	5103186c8c	Remove an XXX comment. It is no longer a problem.	2003-04-26 05:00:56 +00:00
Alan Cox	410cfc455e	Lock the vm_object in obj_alloc().	2003-04-19 00:30:36 +00:00
Andrew Gallatin	b37d8ead52	Don't grab Giant in slab_zalloc() if M_NOWAIT is specified. This should allow the use of INTR_MPSAFE network drivers. Tested by: njl Glanced at by: jeff	2003-04-18 13:02:29 +00:00
Tor Egge	125ee0d161	Obtain Giant before calling kmem_alloc without M_NOWAIT and before calling kmem_free if Giant isn't already held.	2003-03-26 18:44:53 +00:00
John Baldwin	263067951a	Replace calls to WITNESS_SLEEP() and witness_list() with equivalent calls to WITNESS_WARN().	2003-03-04 21:03:05 +00:00
Warner Losh	a163d034fa	Back out M_* changes, per decision of the TRB. Approved by: trb	2003-02-19 05:47:46 +00:00
Poul-Henning Kamp	886eaaacfa	Change a printf to also tell how many items were left in the zone.	2003-02-04 08:23:18 +00:00
Alfred Perlstein	44956c9863	Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.	2003-01-21 08:56:16 +00:00
Jeff Roberson	ebc85edf5e	- M_WAITOK is 0 and not a real flag. Test for this properly. Submitted by: tmm Pointy hat to: jeff	2003-01-20 01:32:56 +00:00
Jens Schweikhardt	9d5abbddbf	Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup, especially in troff files.	2003-01-01 18:49:04 +00:00
Jeff Roberson	74c924b553	- Wakeup the correct address when a zone is no longer full. Spotted by: jake	2002-11-18 08:27:14 +00:00
Jeff Roberson	f3da1873bc	- Don't forget the flags value when using boot pages. Reported by: grehan	2002-11-16 20:57:41 +00:00
Matt Jacob	81f71edaec	atomic_set_8 isn't MI. Instead, follow Jake's suggestions about ZONE_LOCK.	2002-11-11 11:50:03 +00:00
Jeff Roberson	48eea37508	- Add support for machine dependent page allocation routines. MD code may define UMA_MD_SMALL_ALLOC to make use of this feature. Reviewed by: peter, jake	2002-11-01 01:01:27 +00:00
Jeff Roberson	bbee39c629	- Now that uma_zalloc_internal is not the fast path don't be so fussy about extra function calls. Refactor uma_zalloc_internal into separate functions for finding the most appropriate slab, filling buckets, allocating single items, and pulling items off of slabs. This makes the code significantly cleaner. - This also fixes the "Returning an empty bucket." panic that a few people have seen. Tested On: alpha, x86	2002-10-24 07:59:03 +00:00
Jeff Roberson	bba739abf9	- Move the destructor calls so that they are not called with the zone lock held. This avoids a lock order reversal when destroying zones. Unfortunately, this also means that the free checks are not done before the destructor is called. Reported by: phk	2002-10-24 06:17:30 +00:00
Poul-Henning Kamp	37c841831f	Be consistent about "static" functions: if the function is marked static in its prototype, mark it static at the definition too. Inspired by: FlexeLint warning #512	2002-09-28 17:15:38 +00:00
Jeff Roberson	f461cf2297	- Use my freebsd email alias in the copyright. - Remove redundant instances of my email alias in the file summary.	2002-09-19 06:05:32 +00:00
Jeff Roberson	99571dc345	- Split UMA_ZFLAG_OFFPAGE into UMA_ZFLAG_OFFPAGE and UMA_ZFLAG_HASH. - Remove all instances of the mallochash. - Stash the slab pointer in the vm page's object pointer when allocating from the kmem_obj. - Use the overloaded object pointer to find slabs for malloced memory.	2002-09-18 08:26:30 +00:00
Archie Cobbs	55f7c614fd	Don't use "NULL" when "0" is really meant.	2002-08-21 23:39:52 +00:00
Jeff Roberson	17b9cc4941	Fix a lock order reversal in uma_zdestroy. The uma_mtx needs to be held across calls to zone_drain(). Noticed by: scottl	2002-07-05 21:39:52 +00:00
Jeff Roberson	f5118d6aaf	Remove unnecessary includes.	2002-07-05 05:16:19 +00:00
Jeff Roberson	e221e841b0	Actually use the fini callback. Pointy hat to: me :-( Noticed By: Julian	2002-07-03 00:30:51 +00:00
Jeff Roberson	5c0e403ba2	Reduce the amount of code that runs with the zone lock held in slab_zalloc(). This allows us to run the zone initialization functions without any locks held.	2002-06-25 21:04:50 +00:00
Jeff Roberson	3370c5bfd7	- Remove bogus use of kmem_alloc that was inherited from the old zone allocator. - Properly set M_ZERO when talking to the back end page allocators for non malloc zones. This forces us to zero fill pages when they are first brought into a cache. - Properly handle M_ZERO in uma_zalloc_internal. This fixes a problem where per cpu buckets weren't always getting zeroed.	2002-06-19 20:49:44 +00:00
Jeff Roberson	4741dcbff5	Honor the BUCKETCACHE flag on free as well.	2002-06-17 23:53:58 +00:00
Jeff Roberson	18aa2de5a7	- Introduce the new M_NOVM option which tells uma to only check the currently allocated slabs and bucket caches for free items. It will not go ask the vm for pages. This differs from M_NOWAIT in that it not only doesn't block, it doesn't even ask. - Add a new zcreate option ZONE_VM, that sets the BUCKETCACHE zflag. This tells uma that it should only allocate buckets out of the bucket cache, and not from the VM. It does this by using the M_NOVM option to zalloc when getting a new bucket. This is so that the VM doesn't recursively enter itself while trying to allocate buckets for vm_map_entry zones. If there are already allocated buckets when we get here we'll still use them but otherwise we'll skip it. - Use the ZONE_VM flag on vm map entries and pv entries on x86.	2002-06-17 22:02:41 +00:00
Ian Dowse	f97d6ce396	Correct the logic for determining whether the per-CPU locks need to be destroyed. This fixes a problem where destroying a UMA zone would fail to destroy all zone mutexes. Reviewed by: jeff	2002-06-10 03:25:23 +00:00
Jeff Roberson	494273bead	Add a comment describing a resource leak that occurs during a failure case in obj_alloc.	2002-06-03 22:59:19 +00:00
John Baldwin	4c1cc01cd8	In uma_zalloc_arg(), if we are performing a M_WAITOK allocation, ensure that td_intr_nesting_level is 0 (like malloc() does). Since malloc() calls uma we can probably remove the check in malloc() for this now. Also, perform an extra witness check in that case to make sure we don't hold any locks when performing a M_WAITOK allocation.	2002-05-20 17:54:48 +00:00
Jeff Roberson	713deb3677	Don't call the uz free function while the zone lock is held. This can lead to lock order reversals. uma_reclaim now builds a list of freeable slabs and then unlocks the zones to do all of the frees.	2002-05-13 05:08:18 +00:00
Jeff Roberson	0aef6126a1	Remove the hash_free() lock order reversal. This could have happened for several reasons before. Fixing it involved restructuring the generic hash code to require calling code to handle locking, unlocking, and freeing hashes on error conditions.	2002-05-13 04:39:28 +00:00
Jeff Roberson	c7173f58fa	Use pages instead of uz_maxpages, which has not been initialized yet, when creating the vm_object. This was broken after the code was rearranged to grab giant itself. Spotted by: alc	2002-05-04 21:49:29 +00:00
Jeff Roberson	b9ba893179	Move around the dbg code a bit so it's always under a lock. This stops a weird potential race if we were preempted right as we were doing the dbg checks.	2002-05-02 09:05:36 +00:00
Andrew R. Reiter	c3bdc05fb9	- Changed the size element of uma_zctor_args to be size_t instead of int. - Changed uma_zcreate to accept the size argument as a size_t instead of int. Approved by: jeff	2002-05-02 07:36:30 +00:00
Jeff Roberson	5a34a9f089	malloc/free(9) no longer require Giant. Use the malloc_mtx to protect the mallochash. Mallochash is going to go away as soon as I introduce the kfree/kmalloc api and partially overhaul the malloc wrapper. This can't happen until all users of the malloc api that expect memory to be aligned on the size of the allocation are fixed.	2002-05-02 07:22:19 +00:00
Jeff Roberson	639c9550fb	Remove the temporary alignment check in free(). Implement the following checks on freed memory in the bucket path: - Slab membership - Alignment - Duplicate free This previously was only done if we skipped the buckets. This code will slow down INVARIANTS a bit, but it is smp safe. The checks were moved out of the normal path and into hooks supplied in uma_dbg.	2002-05-02 02:08:48 +00:00
Jeff Roberson	2cc35ff9c6	Move the implementation of M_ZERO into UMA so that it can be passed to uma_zalloc and friends. Remove this functionality from the malloc wrapper. Document this change in uma.h and adjust variable names in uma_core.	2002-04-30 04:26:34 +00:00
Jeff Roberson	28bc44195c	Add a new zone flag UMA_ZONE_MTXCLASS. This puts the zone in its own mutex class. Currently this is only used for kmapentzone because kmapents are potentially allocated when freeing memory. This is not dangerous though because no other allocations will be done while holding the kmapentzone lock.	2002-04-29 23:45:41 +00:00
Andrew R. Reiter	d4d6aee5a0	- Fix a round down bogon in uma_zone_set_max(). Submitted by: jeff@	2002-04-25 06:24:40 +00:00
Jeff Roberson	5300d9dda2	Fix a witness warning when expanding a hash table. We were allocating the new hash while holding the lock on a zone. Fix this by doing the allocation separately from the actual hash expansion. The lock is dropped before the allocation and reacquired before the expansion. The expansion code checks to see if we lost the race and frees the new hash if we do. We really never will lose this race because the hash expansion is single threaded via the timeout mechanism.	2002-04-14 13:47:10 +00:00
Jeff Roberson	0da47b2fc6	Protect the initial list traversal in sysctl_vm_zone() with the uma_mtx.	2002-04-14 12:39:38 +00:00
Jeff Roberson	af7f9b97b6	Fix the calculation that determines uz_maxpages. It was off for large zones. Fortunately we have no large zones with maximums specified yet, so it wasn't breaking anything. Implement blocking when a zone exceeds the maximum and M_WAITOK is specified. Previously this just failed like the old zone allocator did. The old zone allocator didn't support WAITOK/NOWAIT though so we should do what we advertise. While I was in there I cleaned up some more zalloc logic to further simplify that code path and reduce redundant code. This was needed to make the blocking work properly anyway.	2002-04-14 01:56:25 +00:00
Jeff Roberson	bce9779110	Remember to unlock the zone if the fill count is too high. Pointed out by: pete, jake, jhb	2002-04-10 01:52:50 +00:00
Jeff Roberson	86bbae32f4	Add a mechanism to disable buckets when the v_free_count drops below v_free_min. This should help performance in memory starved situations.	2002-04-08 06:20:34 +00:00
Jeff Roberson	605cbd6a08	Don't release the zone lock until after the dtor has been called. As far as I can tell this could not have caused any problems yet because UMA is still called with giant. Pointy hat to: jeff Noticed by: jake	2002-04-08 05:13:48 +00:00
Jeff Roberson	9c2cd7e5a9	Implement uma_zdestroy(). Its prototype changed slightly. I decided that I didn't like the wait argument and that if you were removing a zone it had better be empty. Also, I broke out part of hash_expand and made a separate hash_free() for use in uma_zdestroy.	2002-04-08 04:48:58 +00:00
Jeff Roberson	a553d4b8eb	Rework most of the bucket allocation and free code so that per cpu locks are never held across blocking operations. Also, fix two other lock order reversals that were exposed by jhb's witness change. The free path previously had a bug that would cause it to skip the free bucket list in some cases and go straight to allocating a new bucket. This has been fixed as well. These changes made the bucket handling code much cleaner and removed quite a few lock operations. This should be marginally faster now. It is now possible to call malloc w/o Giant and avoid any witness warnings. This still isn't entirely safe though because malloc_type statistics are not protected by any lock.	2002-04-08 02:42:55 +00:00
Jeff Roberson	d0b06acbe1	This fixes a bug where isitem never got set to 1 if a certain chain of events relating to extreme low memory situations occurred. This was only ever seen on the port build cluster, so many thanks to kris for helping me debug this. Tested by: kris	2002-04-07 22:47:36 +00:00
John Baldwin	6008862bc2	Change callers of mtx_init() to pass in an appropriate lock type name. In most cases NULL is passed, but in some cases such as network driver locks (which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used. Tested on: i386, alpha, sparc64	2002-04-04 21:03:38 +00:00
Alfred Perlstein	157d7b3538	fix comment typo, s/neccisary/necessary/g	2002-04-02 21:25:12 +00:00
Jeff Roberson	f4af24d55d	Reset the cachefree statistics after draining the cache. This fixes a bug where a sysctl within 20 seconds of a cache_drain could yield negative "USED" counts. Also, grab the uma_mtx while in the sysctl handler. This hadn't caused problems yet because Giant is held all the time. Reported by: kkenn	2002-03-24 10:56:11 +00:00
Jeff Roberson	736ee5907f	Add uma_zone_set_max() to add enforced limits to non vm obj backed zones.	2002-03-20 05:28:34 +00:00
Jeff Roberson	8355f576a9	This is the first part of the new kernel memory allocator. This replaces malloc(9) and vm_zone with a slab like allocator. Reviewed by: arch@	2002-03-19 09:11:49 +00:00

208 Commits