187 Commits

Author SHA1 Message Date
alc
b57e5e03fd Make vm_page's PG_ZERO flag immutable between the time of the page's
allocation and deallocation.  This flag's principal use is shortly after
allocation.  For such cases, clearing the flag is pointless.  The only
unusual use of PG_ZERO is in vfs_bio_clrbuf().  However, allocbuf() never
requests a prezeroed page.  So, vfs_bio_clrbuf() never sees a prezeroed
page.

Reviewed by:	tegge@
2004-05-06 05:03:23 +00:00
alc
dbdc402421 - Make the acquisition of Giant in vm_fault_unwire() conditional on the
pmap.  For the kernel pmap, Giant is not required.  In general, for
   other pmaps, Giant is required by i386's pmap_pte() implementation.
   Specifically, the use of PMAP2/PADDR2 is synchronized by Giant.
   Note: In principle, updates to the kernel pmap's wired count could be
   lost without Giant.  However, in practice, we never use the kernel
   pmap's wired count.  This will be resolved when pmap locking appears.
 - With the above change, cpu_thread_clean() and uma_large_free() need
   not acquire Giant.  (The first case is simply the revival of
   i386/i386/vm_machdep.c's revision 1.226 by peter.)
2004-03-10 04:44:43 +00:00
alc
aae81a61cf Correct a long-standing race condition in vm_fault() that could result in a
panic "vm_page_cache: caching a dirty page, ...": Access to the page must
be restricted or removed before calling vm_page_cache().  This race
condition is identical in nature to that which was addressed by
vm_pageout.c's revision 1.251 and vm_page.c's revision 1.275.

Reviewed by:	tegge
MFC after:	7 days
2004-02-15 00:42:26 +00:00
alc
cabea24620 - Locking for the per-process resource limits structure has eliminated
the need for Giant in vm_map_growstack().
 - Use the proc * that is passed to vm_map_growstack() rather than
   curthread->td_proc.
2004-02-05 06:33:18 +00:00
alc
5eb32a5d39 - Reduce Giant's scope in vm_fault().
- Use vm_object_reference_locked() instead of vm_object_reference()
   in vm_fault().
2003-12-26 23:33:37 +00:00
mini
918610ef5e NFC: Update stale comments.
Reviewed by:	alc
2003-11-10 00:44:00 +00:00
alc
b722f9a630 - vm_fault_copy_entry() should not assume that the source object contains
every page.  If the source entry was read-only, one or more wired pages
   could be in backing objects.
 - vm_fault_copy_entry() should not set the PG_WRITEABLE flag on the page
   unless the destination entry is, in fact, writeable.
2003-10-15 08:00:45 +00:00
alc
352d9382c0 Lock the destination object in vm_fault_copy_entry(). 2003-10-08 07:11:19 +00:00
alc
76f6c3b059 Retire vm_page_copy(). Its reason for being ended when peter@ modified
pmap_copy_page() et al. to accept a vm_page_t rather than a physical
address.  Also, this change will facilitate locking access to the vm page's
valid field.
2003-10-08 05:35:12 +00:00
alc
3272fbe303 Synchronize access to a vm page's valid field using the containing
vm object's lock.
2003-10-04 21:35:48 +00:00
alc
b1691aebe4 Migrate pmap_prefault() into the machine-independent virtual memory layer.
A small helper function pmap_is_prefaultable() is added.  This function
encapsulate the few lines of pmap_prefault() that actually vary from
machine to machine.  Note: pmap_is_prefaultable() and pmap_mincore() have
much in common.  Going forward, it's worth considering their merger.
2003-10-03 22:46:53 +00:00
alc
1644dd5fce Add vm object locking to vnode_pager_lock(). (This triggers the movement
of a VM_OBJECT_LOCK() in vm_fault().)
2003-09-18 02:26:03 +00:00
alc
06fbefe190 To implement the sequential access optimization, vm_fault() may need to
reacquire the "first" object's lock while a backing object's lock is held.
Since this is a lock-order reversal, vm_fault() uses trylock to acquire
the first object's lock, skipping the sequential access optimization in
the unlikely event that the trylock fails.
2003-08-23 06:52:32 +00:00
alc
fa54a6610e Maintain a lock on the vm object of interest throughout vm_fault(),
releasing the lock only if we are about to sleep (e.g., vm_pager_get_pages()
or vm_pager_has_pages()).  If we sleep, we have marked the vm object with
the paging-in-progress flag.
2003-06-22 21:35:41 +00:00
alc
5d0aaa2a87 As vm_fault() descends the chain of backing objects, set paging-in-
progress on the next object before clearing it on the current object.
2003-06-22 05:36:53 +00:00
alc
752ed0a2b9 Make some style and white-space changes to the copy-on-write path through
vm_fault(); remove a pointless assignment statement from that path.
2003-06-22 00:00:11 +00:00
alc
ed79b4d625 Lock one of the vm objects involved in an optimized copy-on-write fault. 2003-06-21 06:31:42 +00:00
alc
893b54638f The so-called "optimized copy-on-write fault" case should not require
the vm map lock.  What's really needed is vm object locking, which
is (for the moment) provided Giant.

Reviewed by:	tegge
2003-06-20 04:20:36 +00:00
alc
29c6e6376c Fix a vm object reference leak in the page-based copy-on-write mechanism
used by the zero-copy sockets implementation.

Reviewed by:	gallatin
2003-06-19 01:40:44 +00:00
obrien
b0678d7a44 Use __FBSDID(). 2003-06-11 23:50:51 +00:00
jhb
d5cf4c5275 Prefer the proc lock to sched_lock when testing PS_INMEM now that it is
safe to do so.
2003-04-22 20:01:56 +00:00
alc
033a6f0bc7 - Lock the vm_object when performing vm_object_pip_wakeup(). 2003-04-20 19:25:28 +00:00
alc
5990076d78 - Lock the vm_object when performing vm_object_pip_add(). 2003-04-20 03:41:21 +00:00
jake
783ae539c3 - Add vm_paddr_t, a physical address type. This is required for systems
where physical addresses larger than virtual addresses, such as i386s
  with PAE.
- Use this to represent physical addresses in the MI vm system and in the
  i386 pmap code.  This also changes the paddr parameter to d_mmap_t.
- Fix printf formats to handle physical addresses >4G in the i386 memory
  detection code, and due to kvtop returning vm_paddr_t instead of u_long.

Note that this is a name change only; vm_paddr_t is still the same as
vm_offset_t on all currently supported platforms.

Sponsored by:	DARPA, Network Associates Laboratories
Discussed with:	re, phk (cdevsw change)
2003-03-25 00:07:06 +00:00
ken
471eab1868 Zero copy send and receive fixes:
- On receive, vm_map_lookup() needs to trigger the creation of a shadow
  object.  To make that happen, call vm_map_lookup() with PROT_WRITE
  instead of PROT_READ in vm_pgmoveco().

- On send, a shadow object will be created by the vm_map_lookup() in
  vm_fault(), but vm_page_cowfault() will delete the original page from
  the backing object rather than simply letting the legacy COW mechanism
  take over.  In other words, the new page should be added to the shadow
  object rather than replacing the old page in the backing object.  (i.e.
  vm_page_cowfault() should not be called in this case.)  We accomplish
  this by making sure fs.object == fs.first_object before calling
  vm_page_cowfault() in vm_fault().

Submitted by:	gallatin, alc
Tested by:	ken
2003-03-08 06:58:18 +00:00
alc
c50367da67 Remove ENABLE_VFS_IOOPT. It is a long unfinished work-in-progress.
Discussed on:	arch@
2003-03-06 03:41:02 +00:00
dillon
aba727244c Merge all the various copies of vm_fault_quick() into a single
portable copy.
2003-01-16 00:02:21 +00:00
alc
e45ec3c803 vm_fault_copy_entry() needn't clear PG_ZERO because it didn't pass
VM_ALLOC_ZERO to vm_page_alloc().
2003-01-12 07:33:16 +00:00
alc
3f894298eb Reduce the number of times that we acquire and release the page queues
lock by making vm_page_rename()'s caller, rather than vm_page_rename(),
responsible for acquiring it.
2002-12-29 07:17:06 +00:00
alc
43045f3d3b - Hold the page queues lock around calls to vm_page_flag_clear(). 2002-12-24 19:02:03 +00:00
alc
3f31cbe67c - Hold the page queues lock when performing vm_page_busy() or
vm_page_flag_set().
 - Replace vm_page_sleep_busy() with proper page queues locking
   and vm_page_sleep_if_busy().
2002-12-19 01:20:24 +00:00
alc
5e336b1d19 Now that pmap_remove_all() is exported by our pmap implementations
use it directly.
2002-11-16 07:44:25 +00:00
alc
fc8a5bc419 When prot is VM_PROT_NONE, call pmap_page_protect() directly rather than
indirectly through vm_page_protect().  The one remaining page flag that
is updated by vm_page_protect() is already being updated by our various
pmap implementations.

Note: A later commit will similarly change the VM_PROT_READ case and
eliminate vm_page_protect().
2002-11-10 07:12:04 +00:00
alc
22918c79b0 Complete the page queues locking needed for the page-based copy-
on-write (COW) mechanism.  (This mechanism is used by the zero-copy
TCP/IP implementation.)
 - Extend the scope of the page queues lock in vm_fault()
   to cover vm_page_cowfault().
 - Modify vm_page_cowfault() to release the page queues lock
   if it sleeps.
2002-10-19 18:34:39 +00:00
alc
d5f256dae2 o Retire pmap_pageable(). It's an advisory routine that none
of our platforms implements.
2002-08-25 04:20:05 +00:00
alc
cdcc7b3446 o Retire vm_page_zero_fill() and vm_page_zero_fill_area(). Ever since
pmap_zero_page() and pmap_zero_page_area() were modified to accept
   a struct vm_page * instead of a physical address, vm_page_zero_fill()
   and vm_page_zero_fill_area() have served no purpose.
2002-08-25 00:22:31 +00:00
alc
fe8b5e270e o Move a call to vm_page_wakeup() inside the scope of the page queues lock. 2002-08-10 23:27:06 +00:00
alc
5b6c90a737 o Remove the setting and clearing of the PG_MAPPED flag. (This flag is
obsolete.)
2002-08-10 07:11:16 +00:00
alc
d40c5cf505 o Lock page queue accesses by vm_page_activate(). 2002-07-27 07:20:27 +00:00
alc
17db4f92a1 o Merge vm_fault_wire() and vm_fault_user_wire() by adding a new parameter,
user_wire.
2002-07-24 19:47:56 +00:00
alc
28f5e60e77 o Lock page queue accesses by vm_page_free() and vm_page_deactivate(). 2002-07-21 21:20:57 +00:00
alc
a01b1feba6 o Lock page queue accesses by vm_page_cache() in vm_fault() and
vm_pageout_scan().  (The others are already locked.)
 o Assert that the page queues lock is held in vm_page_cache().
2002-07-20 19:34:21 +00:00
alc
ee4b41f6c5 o Lock some page queue accesses, in particular, those by vm_page_unwire(). 2002-07-13 19:24:04 +00:00
ken
0d3a835f3f At long last, commit the zero copy sockets code.
MAKEDEV:	Add MAKEDEV glue for the ti(4) device nodes.

ti.4:		Update the ti(4) man page to include information on the
		TI_JUMBO_HDRSPLIT and TI_PRIVATE_JUMBOS kernel options,
		and also include information about the new character
		device interface and the associated ioctls.

man9/Makefile:	Add jumbo.9 and zero_copy.9 man pages and associated
		links.

jumbo.9:	New man page describing the jumbo buffer allocator
		interface and operation.

zero_copy.9:	New man page describing the general characteristics of
		the zero copy send and receive code, and what an
		application author should do to take advantage of the
		zero copy functionality.

NOTES:		Add entries for ZERO_COPY_SOCKETS, TI_PRIVATE_JUMBOS,
		TI_JUMBO_HDRSPLIT, MSIZE, and MCLSHIFT.

conf/files:	Add uipc_jumbo.c and uipc_cow.c.

conf/options:	Add the 5 options mentioned above.

kern_subr.c:	Receive side zero copy implementation.  This takes
		"disposable" pages attached to an mbuf, gives them to
		a user process, and then recycles the user's page.
		This is only active when ZERO_COPY_SOCKETS is turned on
		and the kern.ipc.zero_copy.receive sysctl variable is
		set to 1.

uipc_cow.c:	Send side zero copy functions.  Takes a page written
		by the user and maps it copy on write and assigns it
		kernel virtual address space.  Removes copy on write
		mapping once the buffer has been freed by the network
		stack.

uipc_jumbo.c:	Jumbo disposable page allocator code.  This allocates
		(optionally) disposable pages for network drivers that
		want to give the user the option of doing zero copy
		receive.

uipc_socket.c:	Add kern.ipc.zero_copy.{send,receive} sysctls that are
		enabled if ZERO_COPY_SOCKETS is turned on.

		Add zero copy send support to sosend() -- pages get
		mapped into the kernel instead of getting copied if
		they meet size and alignment restrictions.

uipc_syscalls.c:Un-staticize some of the sf* functions so that they
		can be used elsewhere.  (uipc_cow.c)

if_media.c:	In the SIOCGIFMEDIA ioctl in ifmedia_ioctl(), avoid
		calling malloc() with M_WAITOK.  Return an error if
		the M_NOWAIT malloc fails.

		The ti(4) driver and the wi(4) driver, at least, call
		this with a mutex held.  This causes witness warnings
		for 'ifconfig -a' with a wi(4) or ti(4) board in the
		system.  (I've only verified for ti(4)).

ip_output.c:	Fragment large datagrams so that each segment contains
		a multiple of PAGE_SIZE amount of data plus headers.
		This allows the receiver to potentially do page
		flipping on receives.

if_ti.c:	Add zero copy receive support to the ti(4) driver.  If
		TI_PRIVATE_JUMBOS is not defined, it now uses the
		jumbo(9) buffer allocator for jumbo receive buffers.

		Add a new character device interface for the ti(4)
		driver for the new debugging interface.  This allows
		(a patched version of) gdb to talk to the Tigon board
		and debug the firmware.  There are also a few additional
		debugging ioctls available through this interface.

		Add header splitting support to the ti(4) driver.

		Tweak some of the default interrupt coalescing
		parameters to more useful defaults.

		Add hooks for supporting transmit flow control, but
		leave it turned off with a comment describing why it
		is turned off.

if_tireg.h:	Change the firmware rev to 12.4.11, since we're really
		at 12.4.11 plus fixes from 12.4.13.

		Add defines needed for debugging.

		Remove the ti_stats structure, it is now defined in
		sys/tiio.h.

ti_fw.h:	12.4.11 firmware.

ti_fw2.h:	12.4.11 firmware, plus selected fixes from 12.4.13,
		and my header splitting patches.  Revision 12.4.13
		doesn't handle 10/100 negotiation properly.  (This
		firmware is the same as what was in the tree previously,
		with the addition of header splitting support.)

sys/jumbo.h:	Jumbo buffer allocator interface.

sys/mbuf.h:	Add a new external mbuf type, EXT_DISPOSABLE, to
		indicate that the payload buffer can be thrown away /
		flipped to a userland process.

socketvar.h:	Add prototype for socow_setup.

tiio.h:		ioctl interface to the character portion of the ti(4)
		driver, plus associated structure/type definitions.

uio.h:		Change prototype for uiomoveco() so that we'll know
		whether the source page is disposable.

ufs_readwrite.c:Update for new prototype of uiomoveco().

vm_fault.c:	In vm_fault(), check to see whether we need to do a page
		based copy on write fault.

vm_object.c:	Add a new function, vm_object_allocate_wait().  This
		does the same thing that vm_object allocate does, except
		that it gives the caller the opportunity to specify whether
		it should wait on the uma_zalloc() of the object structre.

		This allows vm objects to be allocated while holding a
		mutex.  (Without generating WITNESS warnings.)

		vm_object_allocate() is implemented as a call to
		vm_object_allocate_wait() with the malloc flag set to
		M_WAITOK.

vm_object.h:	Add prototype for vm_object_allocate_wait().

vm_page.c:	Add page-based copy on write setup, clear and fault
		routines.

vm_page.h:	Add page based COW function prototypes and variable in
		the vm_page structure.

Many thanks to Drew Gallatin, who wrote the zero copy send and receive
code, and to all the other folks who have tested and reviewed this code
over the years.
2002-06-26 03:37:47 +00:00
alc
078dff0f24 o Remove GIANT_REQUIRED from vm_fault_user_wire().
o Move pmap_pageable() outside of Giant in vm_fault_unwire().
   (pmap_pageable() is a no-op on all supported architectures.)
 o Remove the acquisition and release of Giant from mlock().
2002-06-16 20:42:29 +00:00
alc
642723e24c o Acquire and release Giant around pmap operations in vm_fault_unwire()
and vm_map_delete().  Assert GIANT_REQUIRED in vm_map_delete()
   only if operating on the kernel_object or the kmem_object.
 o Remove GIANT_REQUIRED from vm_map_remove().
 o Remove the acquisition and release of Giant from munmap().
2002-05-26 04:54:56 +00:00
alc
d60a525036 o Condition the compilation and use of vm_freeze_copyopts()
on ENABLE_VFS_IOOPT.
2002-05-06 05:45:57 +00:00
alc
4d466c829c o Revert vm_fault1() to its original name vm_fault(), eliminating the wrapper
that took its place for the purposes of acquiring and releasing Giant.
2002-04-30 03:44:34 +00:00
alc
d61adfe678 Document three synchronization issues in vm_fault(). 2002-04-29 05:23:01 +00:00
alc
dc5b6882d3 o Introduce and use vm_map_trylock() to replace several direct uses
of lockmgr().
 o Add missing synchronization to vmspace_swap_count(): Obtain a read lock
   on the vm_map before traversing it.
2002-04-28 06:07:54 +00:00