freebsd-skq

Author	SHA1	Message	Date
alc	f73575dddd	Assert that the containing vm object is locked in vm_page_cache() and vm_page_try_to_cache().	2004-10-28 05:26:21 +00:00
bmilekic	13ebdd218a	Fix an INVARIANTS-only bug introduced in Revision 1.104: If INVARIANTS is defined, and in the rare case that we have allocated some objects from the slab and at least one initializer on at least one of those objects failed, and we need to fail the allocation and push the uninitialized items back into the slab caches -- in that scenario, we would fail to [re]set the bucket cache's ub_bucket item references to NULL, which would eventually trigger a KASSERT.	2004-10-27 21:19:35 +00:00
alc	ce02afb500	During traversal of the active queue, try locking the page's containing object before accessing the page's flags or the object's reference count. If the trylock fails, handle the page as though it is busy.	2004-10-27 18:29:17 +00:00
phk	1b27d1d3b9	Also check that the sectormask is bigger than zero. Wrap this overly long KASSERT and remove newline.	2004-10-26 19:51:57 +00:00
phk	c66aa10c8e	Put the I/O block size in bufobj->bo_bsize. We keep si_bsize_phys around for now as that is the simplest way to pull the number out of disk device drivers in devfs_open(). The correct solution would be to do an ioctl(DIOCGSECTORSIZE), but the point is probably moot when filesystems sit on GEOM, so don't bother for now.	2004-10-26 07:39:12 +00:00
phk	76b805d6f4	Don't clear flags we just checked were not set.	2004-10-26 05:57:29 +00:00
alc	50d63268a9	Assert that the containing vm object is locked in vm_page_flash().	2004-10-25 19:52:44 +00:00
alc	774f792bae	Assert that the containing vm object is locked in vm_page_busy() and vm_page_wakeup().	2004-10-24 23:53:47 +00:00
phk	1b25a59886	Move the buffer method vector (buf->b_op) to the bufobj. Extend it with a strategy method. Add bufstrategy() which does the usual VOP_SPECSTRATEGY/VOP_STRATEGY song and dance. Rename ibwrite to bufwrite(). Move the two NFS buf_ops to more sensible places, add bufstrategy to them. Add inlines for bwrite() and bstrategy() which call through buf->b_bufobj->b_ops->b_{write,strategy}(). Replace almost all VOP_STRATEGY()/VOP_SPECSTRATEGY() calls with bstrategy().	2004-10-24 20:03:41 +00:00
alc	0041f5bef4	Acquire the vm object lock before rather than after calling vm_page_sleep_if_busy(). (The motivation being to transition synchronization of the vm_page's PG_BUSY flag from the global page queues lock to the per-object lock.)	2004-10-24 19:32:19 +00:00
alc	17eb61eeb6	Use VM_ALLOC_NOBUSY instead of calling vm_page_wakeup().	2004-10-24 18:46:32 +00:00
alc	faeb949021	Introduce VM_ALLOC_NOBUSY, an option to vm_page_alloc() and vm_page_grab() that indicates that the caller does not want a page with its busy flag set. In many places, the global page queues lock is acquired and released just to clear the busy flag on a just allocated page. Both the allocation of the page and the clearing of the busy flag occur while the containing vm object is locked. So, the busy flag might as well never be set.	2004-10-24 06:15:36 +00:00
phk	52a089c526	Add b_bufobj to struct buf which eventually will eliminate the need for b_vp. Initialize b_bufobj for all buffers. Make incore() and gbincore() take a bufobj instead of a vnode. Make inmem() local to vfs_bio.c. Change a lot of VI_[UN]LOCK(bp->b_vp) to BO_[UN]LOCK(bp->b_bufobj), also VI_MTX() to BO_MTX(). Make buf_vlist_add() take a bufobj instead of a vnode. Eliminate other uses of bp->b_vp where bp->b_bufobj will do. Various minor polishing: remove "register", turn panic into KASSERT, use new function declarations, TAILQ_FOREACH_SAFE() etc.	2004-10-22 08:47:20 +00:00
phk	3833976d12	Move the VI_BWAIT flag into a new bo_flag element of bufobj and call it BO_WWAIT. Add bufobj_wref(), bufobj_wdrop() and bufobj_wwait() to handle the write count on a bufobj. Bufobj_wdrop() replaces vwakeup(). Use these functions in all relevant places except in ffs_softdep.c where the use of interlocked_sleep() makes this impossible. Rename b_vnbufs to b_bobufs now that we touch all the relevant files anyway.	2004-10-21 15:53:54 +00:00
alc	a1afafd0da	Correct two errors in PG_BUSY management by vm_page_cowfault(). Both errors are in rarely executed paths. 1. Each time the retry_alloc path is taken, the PG_BUSY must be set again. Otherwise vm_page_remove() panics. 2. There is no need to set PG_BUSY on the newly allocated page before freeing it. The page already has PG_BUSY set by vm_page_alloc(). Setting it again could cause an assertion failure. MFC after: 2 weeks	2004-10-18 08:11:59 +00:00
alc	b36ed839e4	Assert that the containing object is locked in vm_page_io_start() and vm_page_io_finish(). The motivation being to transition synchronization of the vm_page's busy field from the global page queues lock to the per-object lock.	2004-10-17 22:33:40 +00:00
alc	46e3ee9584	Remove unnecessary check for curthread == NULL.	2004-10-17 20:29:28 +00:00
peter	09964b7499	Put on my peril sensitive sunglasses and add a flags field to the internal sysctl routines and state. Add some code to use it for signalling the need to downconvert a data structure to 32 bits on a 64 bit OS when requested by a 32 bit app. I tried to do this in a generic abi wrapper that intercepted the sysctl oid's, or looked up the format string etc, but it was a real can of worms that turned into a fragile mess before I even got it partially working. With this, we can now run 'sysctl -a' on a 32 bit sysctl binary and have it not abort. Things like netstat, ps, etc have a long way to go. This also fixes a bug in the kern.ps_strings and kern.usrstack hacks. These do matter very much because they are used by libc_r and other things.	2004-10-11 22:04:16 +00:00
green	76d153d5ca	In the previous revision, I did not intend to change the default value of "nosleepwithlocks." Submitted by: ru	2004-10-09 18:51:32 +00:00
green	9128ff1ce9	Fix critical stability problems that can cause UMA mbuf cluster state management corruption, mbuf leaks, general mbuf corruption, and at least on i386 a first level splash damage radius that encompasses up to about half a megabyte of the memory after an mbuf cluster's allocation slab. In short, this has caused instability nightmares anywhere the right kind of network traffic is present. When the polymorphic refcount slabs were added to UMA, the new types were not used pervasively. In particular, the slab management structure was turned into one for refcounts, and one for non-refcounts (supposed to be mostly like the old slab management structure), but the latter was almost always used throughout. In general, every access to zones with UMA_ZONE_REFCNT turned on corrupted the "next free" slab offset and the refcount with each other and with other allocations (on i386, 2 mbuf clusters per 4096 byte slab). Fix things so that the right type is used to access refcounted zones where it was not before. There are additional errors in gross overestimation of padding, it seems, that would cause large kegs (nee zones) to be allocated when small ones would do. Unless I have analyzed this incorrectly, it is not directly harmful.	2004-10-08 20:19:29 +00:00
das	2e3453ac22	Don't look for swap blocks in objects that aren't swap-backed. I expect that this will fix the following panic, reported by Jun: swap_pager_isswapped: failed to locate all swap meta blocks MT5 candidate	2004-09-24 16:04:20 +00:00
phk	8d623dca9a	XXX mark two places where we do not hold a threadcount on the dev when frobbing the cdevsw. In both cases we examine only the cdevsw and it is a good question if we weren't better off copying those properties into the cdev in the first place. This question will be revisited.	2004-09-24 08:32:36 +00:00
phk	63c4403fe8	Use dev_re[fl]thread() to maintain a ref on the device driver while we call the ->d_mmap function.	2004-09-24 05:59:11 +00:00
das	8b64b8f028	The zone from which proc structures are allocated is marked UMA_ZONE_NOFREE to guarantee type stability, so proc_fini() should never be called. Move an assertion from proc_fini() to proc_dtor() and garbage-collect the rest of the unreachable code. I have retained vm_proc_dispose(), since I consider its disuse a bug.	2004-09-19 18:34:17 +00:00
phk	1795816cf7	Add a new function isa_dma_init() which returns an errno when it fails and which takes a M_WAITOK/M_NOWAIT flag argument. Add compatibility isa_dmainit() macro which whines loudly if isa_dma_init() fails. Problem uncovered by: tegge	2004-09-15 12:09:50 +00:00
alc	96c3a115d5	System maps are prohibited from mapping vnode-backed objects. Take advantage of this restriction to avoid acquiring and releasing Giant when wiring pages within a system map. In collaboration with: tegge@	2004-09-11 18:49:59 +00:00
phk	71485468ce	add KASSERTS	2004-09-07 07:32:40 +00:00
alc	8ec6e44b75	Enable debug.mpsafevm by default on amd64 and i386. This enables copy-on- write and zero-fill faults to run without holding Giant. It is still possible to disable Giant-free operation by setting debug.mpsafevm to 0 in loader.conf.	2004-09-04 05:51:54 +00:00
alc	82e55fdf76	Push Giant deep into vm_forkproc(), acquiring it only if the process has mapped System V shared memory segments (see shmfork_myhook()) or requires the allocation of an ldt (see vm_fault_wire()).	2004-09-03 05:11:32 +00:00
scottl	d9af98161a	Turn PREEMPTION into a kernel option. Make sure that it's defined if FULL_PREEMPTION is defined. Add a runtime warning to ULE if PREEMPTION is enabled (code inspired by the PREEMPTION warning in kern_switch.c). This is a possible MT5 candidate.	2004-09-02 18:59:15 +00:00
alc	f58b09685d	Remove dead code.	2004-09-01 19:58:37 +00:00
alc	38af5e4b6b	In vm_fault_unwire() eliminate the acquisition and release of Giant in the case of non-kernel pmaps.	2004-09-01 19:18:59 +00:00
julian	e9d9514975	Give setrunqueue() and sched_add() more of a clue as to where they are coming from and what is expected from them. MFC after: 2 days	2004-09-01 02:11:28 +00:00
alc	0dc141e15d	Move the acquisition and release of the lock on the object at the head of the shadow chain outside of the loop in vm_object_madvise(), reducing the number of times that this lock is acquired and released.	2004-08-29 20:14:10 +00:00
iedowse	e97493a743	Prevent vm_page_zero_idle_wakeup() from attempting to wake up the page zeroing thread before it has been created. It was possible for calls to free() very early in the boot process to panic here because the sleep queues were not yet initialised. Specifically, sysinit_add() running at SI_SUB_KLD would trigger this if the array of pointers became big enough to require uma_large_alloc() allocations. Submitted by: peter	2004-08-29 01:02:33 +00:00
marcel	3bbdf52d20	Move the cow field between wire_count and hold_count. This is the position that is 64-bit aligned and makes sure that the valid and dirty fields are also 64-bit aligned. This means that if PAGE_SIZE is 32K, the size of the vm_page structure is only increased by 8 bytes instead of 16 bytes. More importantly, the vm_page structure is either 120 or 128 bytes on ia64. These are "interesting" sizes.	2004-08-22 20:52:23 +00:00
alc	069d1661bd	In the previous revision, I failed to condition an early release of Giant in vm_fault() on debug_mpsafevm. If debug_mpsafevm was not set, the result was an assertion failure early in the boot process. Reported by: green@	2004-08-22 00:08:43 +00:00
alc	bdaf27d7e6	Further reduce the use of Giant by vm_fault(): Giant is held only when manipulating a vnode, e.g., calling vput(). This reduces contention for Giant during many copy-on-write faults, resulting in some additional speedup on SMPs. Note: debug_mpsafevm must be enabled for this optimization to take effect.	2004-08-21 19:20:21 +00:00
alc	268e93a97a	Acquire and release Giant around a call to VOP_BMAP(). (This is a prerequisite to any further reduction in Giant's use by vm_fault().)	2004-08-19 02:37:12 +00:00
alc	336d354baa	- Introduce and use a new tunable "debug.mpsafevm". At present, setting "debug.mpsafevm" results in (almost) Giant-free execution of zero-fill page faults. (Giant is held only briefly, just long enough to determine if there is a vnode backing the faulting address.) Also, condition the acquisition and release of Giant around calls to pmap_remove() on "debug.mpsafevm". The effect on performance is significant. On my dual Opteron, I see a 3.6% reduction in "buildworld" time. - Use atomic operations to update several counters in vm_fault().	2004-08-16 06:16:12 +00:00
green	d4f662585f	Rather than bringing back all of the changes to make VM map deletion wait for system wires to disappear, do so (much more trivially) by instead only checking for system wires of user maps and not kernel maps. Alternative by: tor Reviewed by: alc	2004-08-16 03:11:09 +00:00
alc	8e661e10f6	Remove spl calls.	2004-08-14 18:57:41 +00:00
alc	482b6818af	Replace the linear search in vm_map_findspace() with an O(log n) algorithm built into the map entry splay tree. This replaces the first_free hint in struct vm_map with two fields in vm_map_entry: adj_free, the amount of free space following a map entry, and max_free, the maximum amount of free space in the entry's subtree. These fields make it possible to find a first-fit free region of a given size in one pass down the tree, so O(log n) amortized using splay trees. This significantly reduces the overhead in vm_map_findspace() for applications that mmap() many hundreds or thousands of regions, and has a negligible slowdown (0.1%) on buildworld. See, for example, the discussion of a micro-benchmark titled "Some mmap observations compared to Linux 2.6/OpenBSD" on -hackers in late October 2003. OpenBSD adopted this approach in March 2002, and NetBSD added it in November 2003, both with Red-Black trees. Submitted by: Mark W. Krentel	2004-08-13 08:06:34 +00:00
tegge	c5a462b4d9	The vm map lock is needed in vm_fault() after the page has been found, to avoid later changes before pmap_enter() and vm_fault_prefault() have completed. Simplify deadlock avoidance by not blocking on vm map relookup. In collaboration with: alc	2004-08-12 20:14:49 +00:00
green	09c41336ae	Re-delete the comment from r1.352.	2004-08-12 17:22:28 +00:00
green	d9efb7d719	Back out all behavioral changes.	2004-08-10 14:42:48 +00:00
green	a5ad2c7311	Revamp VM map wiring. * Allow no-fault wiring/unwiring to succeed for consistency; however, the wired count remains at zero, so it's a special case. * Fix issues inside vm_map_wire() and vm_map_unwire() where the exact state of user wiring (one or zero) and system wiring (zero or more) could be confused; for example, system unwiring could succeed in removing a user wire, instead of being an error. * Require all mappings to be unwired before they are deleted. When VM space is still wired upon deletion, it will be waited upon for the following unwire. This makes vslock(9) work rather than allowing kernel-locked memory to be deleted out from underneath of its consumer as it would before.	2004-08-09 19:52:29 +00:00
alc	8c107931a7	Make two changes to vm_fault(). 1. Move a comment to its proper place, updating it. (Except for white- space, this comment had been unchanged since revision 1.1!) 2. Remove spl calls.	2004-08-09 18:46:39 +00:00
alc	197114e321	Remove a stale comment from vm_map_lookup() that pertains to share maps. (The last vestiges of the share map code were removed in revisions 1.153 and 1.159.)	2004-08-09 18:15:46 +00:00
alc	eabee22ac5	Make two changes to vm_fault(). 1. Retain the map lock until after the calls to pmap_enter() and vm_fault_prefault(). 2. Remove a stale comment. Submitted by: tegge@	2004-08-09 06:01:46 +00:00

2028 Commits