Commit Graph

1724 Commits

Author SHA1 Message Date
Alan Cox
ba2157f218 Introduce a new pmap function, pmap_extract_and_hold(). This function
atomically extracts and holds the physical page that is associated with the
given pmap and virtual address.  Such a function is needed to make the
memory mapping optimizations used by, for example, pipes and raw disk I/O
MP-safe.

Reviewed by:	tegge
2003-09-08 02:45:03 +00:00
Alan Cox
7ebcee376a Revise the locking in mincore(2). 2003-09-07 18:47:54 +00:00
Poul-Henning Kamp
afeb65e61d Don't open with exclusive bit, swapon(8) wants to trash our swapdev.
Add XXX comment with a rating of this concept.
2003-09-02 05:53:44 +00:00
Eivind Eklund
2ae51145e8 Change clean_map from a global to an auto variable 2003-09-01 16:46:47 +00:00
Alan Cox
3562af1215 - Add vm object locking to the part of vm_pageout_scan() that launders
dirty pages.
 - Remove some unused variables.
2003-08-31 00:00:46 +00:00
Marcel Moolenaar
b21a0008ba Introduce MAP_ENTRY_GROWS_DOWN and MAP_ENTRY_GROWS_UP to allow for
growable (stack) entries that not only grow down, but also grow up.
Have vm_map_growstack() take these flags into account when growing
an entry.

This is the first step in adding support for upward growable stacks.
It is a required feature on ia64 to support the register stack (or
rstack as I like to call it -- it also means reverse stack). We do
not currently create rstacks, so the upward growing is not exercised
and the change should be a functional no-op.

Reviewed by: alc
2003-08-30 21:25:23 +00:00
Poul-Henning Kamp
dee34ca4fc Add a close() method to a swapdev.
Add a GEOM based backend.

Remove the device/VOP_SPECSTRATEGY() based backend.
2003-08-30 16:44:26 +00:00
Poul-Henning Kamp
20da9c2eaf Protect the swapdevice tailq with a mutex.
Store the udev_t we will report to userland in the swdevt.
2003-08-30 16:10:28 +00:00
Poul-Henning Kamp
59efee01a3 Continue the objectification of the swapdev backends:
Remove the vnode and dev_t fields and replace them with a void *.

Introduce separate strategy functions for devices and regular (NFS)
vnodes.

For devices we don't need the vnode v_numoutput stuff.

Add a generic swaponsomething() function to add a swapdevice and
split the remainder of swaponvp() into swaponvp() and swapondev()
which calls this backend.
2003-08-30 11:33:25 +00:00
Poul-Henning Kamp
4b03903a46 Make the strategy function a method of the individual swapdev. 2003-08-30 09:42:00 +00:00
Poul-Henning Kamp
2f249180f5 Consistent use modern function definitions 2003-08-30 08:32:42 +00:00
Marcel Moolenaar
23562e4bc6 In vnode_pager_generic_putpages(), change the printf format specifier
to long and explicitly cast field dirty of struct vm_page to unsigned
long. When PAGE_SIZE is 32K, this field is actually unsigned long.
2003-08-29 00:16:30 +00:00
Alan Cox
2370c6d40c Recent pmap changes permit the use of a more precise locking assertion
in vm_page_lookup().
2003-08-28 23:23:04 +00:00
Marcel Moolenaar
16bc6ff39e Assert that u_long is at least 64 bits if PAGE_SIZE is 32K.
Suggested by: phk
2003-08-25 19:58:01 +00:00
Alan Cox
529e15ed69 Held pages, just like wired pages, should not be added to the cache queues.
Submitted by:	tegge
2003-08-23 20:29:29 +00:00
Alan Cox
b7ad744dc5 Hold the page queues lock when performing vm_page_clear_dirty() and
vm_page_set_invalid().
2003-08-23 18:11:53 +00:00
Alan Cox
8d8b9c6e70 To implement the sequential access optimization, vm_fault() may need to
reacquire the "first" object's lock while a backing object's lock is held.
Since this is a lock-order reversal, vm_fault() uses trylock to acquire
the first object's lock, skipping the sequential access optimization in
the unlikely event that the trylock fails.
2003-08-23 06:52:32 +00:00
Marcel Moolenaar
21a708cfde Also define VM_PAGE_BITS_ALL for 16K and 32K pages. Make the constant
unsigned for all page sizes and unsigned long for 32K pages.
2003-08-23 06:30:47 +00:00
Marcel Moolenaar
1fa057c6f1 Add support for 16K and 32K page sizes. The valid and dirty maps
in struct vm_page are defined as u_int for 16K pages and u_long
for 32K pages, with the implied assumption that long will at least
be 64 bits wide on platforms where we support 32K pages.
2003-08-23 06:24:00 +00:00
Alan Cox
0f132ba697 Assert that the vm object's lock is held on entry to vm_page_grab(); remove
code from this function that was needed when vm object locking was
incomplete.
2003-08-21 20:59:07 +00:00
Alan Cox
891c1d4bd3 Assert that the vm object lock is held in vm_page_alloc(). 2003-08-20 20:24:29 +00:00
Bosko Milekic
1c35e213f1 In sysctl_vm_zone, do not calculate per-cpu cache stats on
UMA_ZFLAG_INTERNAL zones at all.  Apparently, Wilko's alpha
was crashing while entering multi-user because, I think, we
were calculating the garbage cachefree for pcpu caches that
essentially don't exist for at least the 'zones' zone and it so
happened that we were reading from an unmapped location.

Confirmed to fix crash: wilko
Helped debug: wilko, gallatin
2003-08-20 18:22:06 +00:00
Poul-Henning Kamp
6a4b58230c Replace a homegrown bdone()/bwait() implementation by the real thing 2003-08-18 19:47:16 +00:00
Alan Cox
ef13663bb6 Three unrelated changes to vm_proc_new(): (1) add vm object locking on the
U pages object; (2) reorganize such that the U pages object is created and
filled in one block; and (3) remove an unnecessary clearing of PG_ZERO.
2003-08-18 01:31:43 +00:00
Poul-Henning Kamp
ec7948490b Use NULL for 3rd argument of VOP_BMAP() rather than custom cast.
Eliminate unused variable.
2003-08-17 18:54:23 +00:00
Marcel Moolenaar
710338e94f In vm_thread_swap{in|out}(), remove the alpha specific conditional
compilation and replace it with a call to cpu_thread_swap{in|out}().
This allows us to add similar code on ia64 without cluttering the
code even more.
2003-08-16 23:15:15 +00:00
Poul-Henning Kamp
395714feb7 Eliminate unnecessary udev_t variable: we can derive it from the dev_t
when we need it.
2003-08-15 13:14:25 +00:00
Poul-Henning Kamp
89dc784fa3 Make swaponvp() static to the swap_pager. 2003-08-15 12:04:29 +00:00
Alan Cox
3e1b578a28 Extend the scope of the page queues lock in vm_pageout_scan() to cover
the traversal of the PQ_INACTIVE queue.
2003-08-15 05:13:36 +00:00
Alan Cox
5402d8ec23 Remove GIANT_REQUIRED from vmspace_alloc(). 2003-08-13 19:23:51 +00:00
Alan Cox
46add12552 Reduce the size of the vm map (and by inclusion the vm space) on 64-bit
architectures by moving a field within the structure.
2003-08-13 03:13:22 +00:00
Warner Losh
06b4bf3e55 Expand inline the relevant parts of src/COPYRIGHT for Matt Dillon's
copyrighted files.

Approved by: Matt Dillon
2003-08-12 23:24:05 +00:00
Alan Cox
c759a3ca06 Reduce the size of the vm object on 64-bit architectures by moving
a field within the structure.
2003-08-12 20:10:32 +00:00
Bosko Milekic
20e8e865bd - When deciding whether to init the zone with small_init or large_init,
compare the zone element size (+1 for the byte of linkage) against
  UMA_SLAB_SIZE - sizeof(struct uma_slab), and not just UMA_SLAB_SIZE.
  Add a KASSERT in zone_small_init to make sure that the computed
  ipers (items per slab) for the zone is not zero, despite the addition
  of the check, just to be sure (this part submitted by: silby)

- UMA_ZONE_VM used to imply BUCKETCACHE.  Now it implies
  CACHEONLY instead.  CACHEONLY is like BUCKETCACHE in the
  case of bucket allocations, but in addition to that also ensures that
  we don't setup the zone with OFFPAGE slab headers allocated from the
  slabzone.  This means that we're not allowed to have a UMA_ZONE_VM
  zone initialized for large items (zone_large_init) because it would
  require the slab headers to be allocated from slabzone, and hence
  kmem_map.  Some of the zones init'd with UMA_ZONE_VM are so init'd
  before kmem_map is suballoc'd from kernel_map, which is why this
  change is necessary.
2003-08-11 19:39:45 +00:00
Bruce M Simpson
abd498aa71 Add the mlockall() and munlockall() system calls.
- All those diffs to syscalls.master for each architecture *are*
   necessary. This needed clarification; the stub code generation for
   mlockall() was disabled, which would prevent applications from
   linking to this API (suggested by mux)
 - Giant has been quoshed. It is no longer held by the code, as
   the required locking has been pushed down within vm_map.c.
 - Callers must specify VM_MAP_WIRE_HOLESOK or VM_MAP_WIRE_NOHOLES
   to express their intention explicitly.
 - Inspected at the vmstat, top and vm pager sysctl stats level.
   Paging-in activity is occurring correctly, using a test harness.
 - The RES size for a process may appear to be greater than its SIZE.
   This is believed to be due to mappings of the same shared library
   page being wired twice. Further exploration is needed.
 - Believed to back out of allocations and locks correctly
   (tested with WITNESS, MUTEX_PROFILING, INVARIANTS and DIAGNOSTIC).

PR:             kern/43426, standards/54223
Reviewed by:    jake, alc
Approved by:    jake (mentor)
MFC after:	2 weeks
2003-08-11 07:14:08 +00:00
Mike Silbersack
cebde06978 More pipe changes:
From alc:
Move pageable pipe memory to a seperate kernel submap to avoid awkward
vm map interlocking issues.  (Bad explanation provided by me.)

From me:
Rework pipespace accounting code to handle this new layout, and adjust
our default values to account for the fact that we now have a solid
limit on allocations.

Also, remove the "maxpipes" limit, as it no longer has a purpose.
(The limit on kva usage solves the problem of having two many pipes.)
2003-08-11 05:51:51 +00:00
Poul-Henning Kamp
ef3c5abdba Make the first two pages magic to protect the BSD labels rather than
only one.
2003-08-06 14:13:38 +00:00
Poul-Henning Kamp
07f81f9159 Remove an unused variable. 2003-08-06 12:09:34 +00:00
Poul-Henning Kamp
751221fd32 Staticize swap_pager_putpages()
Eliminate a lot of checkes to make sure requests are not cross-device
which is unnecessary with the new layout.  We know a sequential request
cannot possibly be cross-device because there is a reserved page between
the devices.

Remove a couple of comments which no longer are relevant.
2003-08-06 12:08:27 +00:00
Poul-Henning Kamp
030b34923d Access the swap_pagers' ->putpages() through swappagerops instead
of directly, this is a cleaner way to do it.
2003-08-06 12:05:48 +00:00
Poul-Henning Kamp
f976cfd99a Add XXX: comment to vm_pager_unswapped(). 2003-08-06 10:51:40 +00:00
Poul-Henning Kamp
5e04322a6e Explicitly set B_PAGING 2003-08-06 09:22:47 +00:00
Poul-Henning Kamp
c37a77ee86 Rip out the totally bogos vnode swapdev_vp with extreeme prejudice.
Don't mark buffers with B_KEEPGIANT, we don't drop giant in strategy
at this point in time.
2003-08-06 06:53:31 +00:00
Poul-Henning Kamp
e04e4bacf6 Use sparse struct initialization for struct pagerops.
Mark our buffers B_KEEPGIANT before sending them downstream.

Remove swap_pager_strategy implementation.
2003-08-05 06:54:56 +00:00
Poul-Henning Kamp
4e6586002d Use sparse struct initializations for struct pagerops.
This makes grepping for which pagers implement which methods easier.
2003-08-05 06:51:26 +00:00
Poul-Henning Kamp
665c0caf03 Put an uncovered page between the swap devices, that way we can be sure
to not get any cross-device I/O requests.  (The unallocated first page
protecting BSD labels already gave us this, but that hack may go away
at some point in time).

Remove the check for cross-device I/O requests in swap_pager_strategy.

Move the repeated statistics updating into flushchainbuf().
2003-08-04 08:22:49 +00:00
Alan Cox
981371629a Use kmem_alloc_nofault() instead of kmem_alloc_pageable() to allocate
swapbkva.  Swapbkva mappings are explicitly managed using pmap_qenter(),
not on-demand by vm_fault(), making kmem_alloc_nofault() more appropriate.

Submitted by:	tegge
2003-08-04 04:35:04 +00:00
Poul-Henning Kamp
12692209a6 Name swap_pager_find_dev() more correctly swp_pager_finde_dev().
Use ->bio_children to count child buffers, rather than abuse the
bio_caller1 pointer.

Expand the relevant bits of waitchainbuf() inline, this clarifies
the code a little bit.
2003-08-03 21:22:42 +00:00
Poul-Henning Kamp
5ff0108d21 I accidentally hit undo before committing, fix the resulting off-by-one. 2003-08-03 14:53:52 +00:00
Poul-Henning Kamp
8f60c087e6 Change the layout policy of the swap_pager from a hardcoded width
striping to a per device round-robin algorithm.

Because of the policy of not attempting to retain previous swap
allocation on page-out, this means that a newly added swap device
almost instantly takes its 1/N share of the I/O load but it takes
somewhat longer for it to assume it's 1/N share of the pages if there
is plenty of space on the other devices.

Change the 8G total swapspace limitation to 8G per device instead
by using a per device blist rather than one global blist.  This
reduces the memory footprint by 75% (typically a couple hundred
kilobytes) for the common case with one swapdevice but NSWAPDEV=4.

Remove the compile time constant limit of number of swap devices,
there is no limit now.  Instead of a fixed size array, store the
per swapdev structure in a TAILQ.

Total swap space is still addressed by a 32 bit page number and
therefore the upper limit is now 2^42 bytes = 16TB (for i386).

We still do not allocate the first page of each device in order to
give some amount of protection to any bsdlabel at the start of the
device.

A new device is appended after the existing devices in the swap space,
no attempt is made to fill in holes left behind by swapoff (this can
trivially be changed should it ever become a problem).

The sysctl vm.nswapdev now reflects the number of currently configured
swap devices.

Rename vm_swap_size to swap_pager_avail for consistency with other
exported names.

Change argument type for vm_proc_swapin_all() and swap_pager_isswapped()
to be a struct swdevt pointer rather than an index.

Not changed: we are still using blists to manage the free space,
but since the swapspace is no longer fragmented by the striping
different resource managers might fare better.
2003-08-03 13:35:31 +00:00