Commit Graph

34990 Commits

Author SHA1 Message Date
dillon
be14f0e426 Mainly changes to support the new swapper. The big adjustment is that
swap blocks are now in PAGE_SIZE'd increments instead of DEV_BSIZE'd
    increments.  We still convert to DEV_BSIZE'd increments for the
    backing store I/O, but everything else is in PAGE_SIZE increments.
1999-01-21 10:17:12 +00:00
dillon
67eada315f Move many of the vm_pager_*() functions from vm_pager.c to inlines in
vm_pager.h
1999-01-21 10:15:47 +00:00
dillon
c11718f005 Move many of the vm_pager_*() functions from vm_pager.c to inlines in
vm_pager.h

     Added argument to getpbuf() and relpbuf() to allow each subsystem to
     specify a different hard limit on the number of simultanious physical
     bufferes that said subsystem may allocate.  Without this feature, one
     subsystem ( e.g. the vfs clustering code ) could hog *ALL* the pbufs,
     causing a deadlock in the pager in a low memory situation.

     Same for trypbuf().
1999-01-21 10:15:24 +00:00
dillon
58521d75ed Reorganized some of the low memory testing code to make it more useful.
Removed call to vm_object_collapse(), which can block.  This was being
    called without the pageout code holding any sort of reference on the
    vm_object or vm_page_t structures being manipulated.  Since this code
    can block, it was possible for other kernel code to shred the state
    the pageout code was assuming remained intact.

    Fixed potential blocking condition in vm_pageout_page_free() ( which
    could cause a deadlock in a low-memory situation ).

    Currently there is a hack in-place to deal with clean filesystem meta-data
    polluting the inactive page queue.  John doesn't like the hack, and neither
    do I.

    Revamped and commented a portion of the pageout loop.

    Added protection against potential memory deadlocks with OBJT_VNODE
    when using VOP_ISLOCKED().  The problem is that vp->v_data can be NULL
    which causes VOP_ISLOCKED() to return a less informed answer.

    remove vm_pager_sync() -- none of the pagers use it any more ( the old
    swapper used to.  The new one does not ).
1999-01-21 10:12:54 +00:00
dillon
fb433b53df The TAILQ hashq has been turned into a singly-linked=list link,
reducing the size of vm_page_t.

    SWAPBLK_NONE and SWAPBLK_MASK are defined here.  These actually are
    more generalized then their names imply, but their placement is somewhat
    of a legacy issue from a prior test version of this code that put
    the swapblk in the vm_page_t structure.  That test code was eventually
    thrown away.  The legacy remains.

    Added vm_page_flash() inline.  Similar to vm_page_wakeup() except that
    it does not clear PG_BUSY ( one assumes that PG_BUSY is already clear ).
    Used by a number of routines to wakeup waiters.

    Collapsed some of the code in inline calls to make other inline calls.
    GCC will optimize this well and it reduces duplication.

    vm_page_free() and vm_page_free_zero() inlines added to convert to
    the proper vm_page_free_toq() call.

    vm_page_sleep_busy() inline added, replacing vm_page_sleep() ( which has
    been removed ).  This implements a much more optimizable page-waiting
    function.
1999-01-21 10:06:24 +00:00
dillon
1ceb9c9b9e The hash table used to be a table of doubly-link list headers ( two
pointers per entry ).  The table has been changed to a singly linked
    list of vm_page_t pointers.  The table has been doubled in size, but
    the entries only take half the space so a net-zero change in memory use.

    The hash function has been changed, hopefully for the better.  The
    combination of the larger hash table size of changed function should
    keep the chain length down to a reasonable number (0-3, average 1).

    vm_object->page_hint has been removed.  This 'optimization' was not
    only never needed, but costs as much as a hash chain link to implement.
    While having page_hint in vm_object might result in better locality
    of reference, the cost is not worth the space in vm_object or the
    extra instructions in my view.

    vm_page_alloc*() functions have been inlined and call a generalized
    non-inlined vm_page_alloc_toq() which combines the standard alloc
    and zero-page alloc functions together, reducing code size and the L1
    cache footprint.  Some reordering has been done... not much.  The
    delinking code should be faster ( because unlinking a doubly-linked list
    requires four memory ops and unlinking a singly linked list only requires
    two ), and we get a hash consistancy check for free.

    vm_page_rename() now automatically sets the page's dirty bits.

    vm_page_alloc() does not try to manually inline freeing a cache page.
    Instead, it now properly calls vm_page_free(m) ... vm_page_free() is
    really too complex to manually inline.

    vm_await(), supporting asleep(), has been added.
1999-01-21 10:01:49 +00:00
dillon
04052d89f8 The vm_object structure is now somewhat smaller due to the removal
of most of the swap-pager-specific fields, the removal of the id,
    and the removal of paging_offset.

    A new inline, vm_object_pip_wakeupn() has been added to subtract an
    arbitrary number n from the paging_in_progress count and then wakeup
    waiters as necessary.  n may be 0, resulting in a 'flash'.
1999-01-21 09:51:21 +00:00
dillon
3f5f4e54ca object->id was badly implemented. It has simply been removed.
object->paging_offset has been removed - it was used to optimize a
    single OBJT_SWAP collapse case yet introduced massive confusion throughout
    vm_object.c.  The optimization was inconsequential except for the
    claim that it didn't have to allocate any memory.  The optimization
    has been removed.

    madvise() has been fixed.  The old madvise() could be made to operate
    on shared objects which is a big no-no.  The new one is much more careful
    in what it modifies.  MADV_FREE was totally broken and has now been fixed.

    vm_page_rename() now automatically dirties a page, so explicit dirtying
    of the page prior to calling vm_page_rename() has been removed.
1999-01-21 09:46:55 +00:00
dillon
5e7ceee4a0 Objects associated with raw devices are no longer counted in the VM stats
total because they may contain absurd numbers ( like the size of all
    of physical memory if you mmap() /dev/mem ).
1999-01-21 09:41:52 +00:00
dillon
bb85a38eff General cleanup related to the new pager. We no longer have to worry
about conversions of objects to OBJT_SWAP, it is done automatically
    now.

    Replaced manually inserted code with inline calls for busy waiting on
    pages, which also incidently fixes a potential PG_BUSY race due to
    the code not running at splvm().

    vm_objects no longer have a paging_offset field ( see vm/vm_object.c )
1999-01-21 09:40:48 +00:00
dillon
8716ba4543 Potential bug fix, do not just clear PG_BUSY... call vm_page_wakeup()
instead to properly handle any waiters.

    Added comments, added support for M_ASLEEP.  Generally treat M_ flags
    as flags instead of constants to compare against.
1999-01-21 09:38:20 +00:00
dillon
f89474ca7b Removed low-memory blockages at fork. This is the wrong place to put
this sort of test.  We need to fix the low-memory handling in general.
1999-01-21 09:36:23 +00:00
dillon
fae5210519 Mainly cleanup. Removed some inappropriate low-memory handling code
and added lots of comments.  Add tie-in to vm_pager ( and thus the
    new swapper ) to deallocate backing swap for dirtied pages on the fly.
1999-01-21 09:35:38 +00:00
dillon
e3c11331d3 The default_pager's interaction with the swap_pager has been reorganized,
and the swap_pager has been completely replaced.

    The new swap pager uses the new blist radix-tree based bitmap allocator
    for low level swap allocation and deallocation.   The new allocator
    is effectively O(5) while the old one was O(N), and the new allocator
    allocates all required memory at init time rather then at allocate
    memory on the fly at run time.

    Swap metadata is allocated in clusters and stored in a hash table,
    eliminating linearly allocated structures.

    Many, many features have been rewritten or added.  Swap space is now
    reallocated on the fly providing a poor-mans auto defragmentation of
    swap space.  Swap space that is no longer needed is freed on a timely
    basis so no garbage collection is necessary.

    Swap I/O is marked B_ASYNC and NFS has been fixed to do the right
    thing with it, so NFS-based paging now has around 10x the performance
    as it did before ( previously NFS enforced synchronous I/O for paging ).
1999-01-21 09:33:07 +00:00
dillon
99dfef7f2a Added support for VOP_FREEBLKS(), reducing MFS's impact on swap and
increasing performance by deallocating at least some of the backing
    store when files are removed.

    Protect mfsp->buf_queue access at splbio().
1999-01-21 09:27:03 +00:00
dillon
92d48e1c28 Access to mfsp->buf_queue must be protected at splbio(). Other minor
adjustments also made, such as passing mfsp to mfs_doio() directly.
1999-01-21 09:24:46 +00:00
eivind
0929496a14 Move EXT2FS to be more visible, and give it a description. Also make
the text from my last commit somewhat better.
1999-01-21 09:24:28 +00:00
dillon
505442286b Renamed M_KERNEL to a more appropriate M_USE_RESERVE. 1999-01-21 09:23:21 +00:00
dillon
6bd52e6f4b The main operational changes are in getblk()'s handling of the
B_DELWRI and B_CACHE flags, fixing a bug that showed up with NFS.
    Also, a number of cases where manually inserted code has been removed
    and replaced with an inline function call giving us better functional
    isolation in the source.
1999-01-21 09:19:33 +00:00
dillon
b90a2bf706 The code that reclaims descriptors from in-transit unix domain
descriptor-passing messages was calling sorflush() without checking
    to see if the descriptor was actually a socket.  This can cause a
    crash by exiting programs that use the mechanism under certain
    circumstances.
1999-01-21 09:02:18 +00:00
dillon
1cb49fc563 Fixed a potential bug ( but maybe not ), where sendfile() clears PG_BUSY
on a page without testing for waiters.  Also collapsed busy wait into
    new vm_page_sleep_busy() inline ( see vm/vm_page.h )
1999-01-21 09:00:26 +00:00
dillon
ed80311739 This module was used only by the old swapper and has been #if'd out,
and will be eventually removed if no other use is found for it.
1999-01-21 08:58:41 +00:00
dillon
df24433bbe This is a rather large commit that encompasses the new swapper,
changes to the VM system to support the new swapper, VM bug
    fixes, several VM optimizations, and some additional revamping of the
    VM code.  The specific bug fixes will be documented with additional
    forced commits.  This commit is somewhat rough in regards to code
    cleanup issues.

Reviewed by:	"John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
1999-01-21 08:29:12 +00:00
dillon
bae5debf72 Add new blist module - radix tree based bitmap allocator with
size hinting.  Will be used by the new swapper.
1999-01-21 08:11:06 +00:00
dillon
b494d8d4c2 Update pstat -s to handle both old and new swapper.
Add pstat -ss to dump new swapper's radix tree.
1999-01-21 08:08:55 +00:00
jkh
e1a2cb0a33 This is now 4.0-current 1999-01-21 03:07:33 +00:00
grog
75812208a8 Update to reflect changes in kernel module
Remove references to LKMs
Change descriptions on read command (HEADS UP: this command has changed
  in an incompatible manner)
1999-01-21 00:55:28 +00:00
grog
216db4f79f Update to reflect major changes in vinum kernel module 1999-01-21 00:45:11 +00:00
grog
295e173f28 Update to reflect changes in kernel lkm 1999-01-21 00:44:11 +00:00
grog
6210d19dff Remove -DRAID5 from CFLAGS 1999-01-21 00:43:00 +00:00
grog
e1730de469 Include Peter Wemm's renaming and restructuring
Change from lkm to kld

Add field plexsdno to sd struct

Add flag VF_NEWBORN to drive, sd, plex and volume structs, indicating
that the object has just been created.

Add object types for raw (unattached) plexes and subdisks

Remove definitions of VOLNO, PLEXNO and SDNO (now functions Volno,
Plexno and Sdno)

Move revive parameters from struct plex to struct sd.

struct plex:
  maintain a count of the number of inaccessible subdisks.
  remove defective and unmapped regions.

Debug flags: make an enum (previously #define)

Set default revive block size to 64kB (was 32 kB)
1999-01-21 00:41:58 +00:00
grog
ff912372cb Retain some of the RAID5 texts even in the non-RAID5 versions.
Previously, accidentally starting the wrong version could corrupt
  the RAID5 configuration.

  Add functions Volno, Plexno and Sdno to replace the old defines
  VOLNO, PLEXNO and SDNO.
1999-01-21 00:41:31 +00:00
grog
e2cc7b94b7 Remove states:
plex_reviving (now we revive subdisks)
  drive_coming_up (never happened)

Add state sd_reviving.
1999-01-21 00:40:50 +00:00
grog
2e66db50bd Include Peter Wemm's renaming and restructuring
Change from lkm to kld

Serious rewrite.  No longer call set_<foo>_state to set the state
based only on other objects; instead, add functions
update_<foo>_state, which determine what the state should be by
themselves.  This allows the set_<foo>_state functions to shrink
enough to be almost intelligible.

Remove flags setstate_recurse and setstate_recursing.

Remove plex defective regions and unmapped regions, which were
maintained but not used.

Change code to allow daemon to perform operations formerly kludged
into an interrupt context.  Remove the DIRTYCONFIG kludge.
1999-01-21 00:40:32 +00:00
grog
4303d609a9 Change the style of revive: revive subdisks instead of plexes. This
makes it easier to keep track of which parts of a plex need reviving.

Use sdio, not vinumstrategy, to write to the subdisk.
1999-01-21 00:40:03 +00:00
grog
311c235179 Include Peter Wemm's renaming and restructuring
Change from lkm to kld

Remove #ifdefs for FreeBSD 2.c

vinumstrategy:
  Support anonymous (`raw') subdisks and plexes.
  Change code to allow daemon to perform operations formerly kludged
  into an interrupt context.  Remove the DIRTYCONFIG kludge.

No longer set B_ORDERED for reviving subdisks.  I suspect this
wouldn't work correctly, and it should be done in a different manner
in vinumrevive.c

sdio: set subdisk state correctly on error
      start to remove code that doesn't make any sense any more.
1999-01-21 00:39:36 +00:00
grog
164b2c9536 Add keywords help, makedev, quit and setdaemon 1999-01-21 00:39:01 +00:00
grog
58a966e894 MMalloc: remove directory part of name correctly 1999-01-21 00:37:56 +00:00
grog
3b8d5388cd Create functions lockdrive and unlockdrive to handle drive locking 1999-01-21 00:37:38 +00:00
grog
5708c2c7d7 Add keywords makedev, setdaemon and help. 1999-01-21 00:37:18 +00:00
grog
cd30f1add6 Remove ioctls VINUM_GETUNMAPPED and VINUM_GETDEFECTIVE (we no longer
have regions).

Add definitions for VINUM_DAEMON, VINUM_FINDDAEMON and VINUM_DAEMON
1999-01-21 00:36:21 +00:00
grog
b98e229210 Include Peter Wemm's renaming and restructuring
Remove #ifdefs for FreeBSD 2.c

Change from lkm to kld

correct type of `flags' in calls to set_drive_state.

set_drive_parms: handle anonymous drives correctly (remove them)

drive VOP functions: use the PID of the original opener to fool the
lock manager.

open_drive: be quiet about failures (they're normal when scanning the
partitions).

close_drive: lock drive before closing.

remove_drive: lock drive before deallocating.

read_drive_label: set drive up when all is OK

check_drive:
  Complete rewrite.  Offload most of the code to the new
  vinum_scandisk

format_config:
  use snprintf and %qd options to make much less emetic.
  Remove old supporting functions.

vinum_scandisk:
  Moved here from vinum.c
  Almost complete rewrite, incorporating much of what was check_drive.
  We still don't have a general way to find the drives on a system, so
  get the user to supply the names via the `read' command.  For each
  device, try each possible compatibility slice name (there's a danger
  of finding both /dev/da1h and /dev/da0s1h otherwise).  Sort the
  partitions found in reverse order of last update time and read them
  in, setting the `update' parameter to parse_config and descendents.

save_config: rename to daemon_save_config, since the function is now
called by the daemon.  Create a new function save_config which queues
the request with the daemon.

daemon_save_config:  some mods to allow for the unfamiliar
environment.
1999-01-21 00:35:35 +00:00
grog
1d199fad1b Include Peter Wemm's renaming and restructuring
Change from lkm to kld

Recover from I/O errors by passing the request to the daemon, except
for plex I/O, which is not fault tolerant.
1999-01-21 00:34:14 +00:00
grog
7a5e4ef068 Remove old cruft 1999-01-21 00:33:47 +00:00
grog
3efec90ecb Include Peter Wemm's renaming and restructuring
Change from lkm to kld

Remove BROKEN_GDB kludge (it's not needed with klds)

Add code for interfacing with daemon

Modify device minor number encoding, use selector functions which also
permit anonymous plexes and subdisks.

Remove code for 2.x support.

Change messages to omit obvious words like 'plex' and 'subdisk.

give_plex_to_volume: invalidate subdisks being given to a plex which
is part of a volume with other plexes.

give_sd_to_plex: keep track of plex size in all cases

lock drives before closing them, to keep the daemon from getting
confused.

config_drive: handle partition type errors more gracefully

config_subdisk: set subdisk state correctly

find_drive, find_drive_by_dev, find_subdisk, find_plex, find_volume:
  set VF_NEWBORN flag when a new object is created

config_drive:
  Handle partition_status returns more cleverly.
  Replace the device name in some cases where it got overwritten.

config_subdisk:
  add parameter `update'.  If the object already exists, exit without
  any changes.
  Set state correctly.

config_plex, config_volume:
  add parameter `update'.  If the object already exists, exit without
  any changes.

parse_config:
  move read function to vinum_scandisk.
  add parameter `update' to pass to config_<object>.

remove_<object>_entry:
  print a message when the object is removed.

update_plex_config:
  Start defusing this function, which will go away some time.
  Remove calls to update_volume_config.
  Make size 64 bits
1999-01-21 00:32:54 +00:00
grog
ded4095a79 Include Peter Wemm's renaming and restructuring
Change from lkm to kld
1999-01-21 00:31:59 +00:00
grog
d98f77d075 Code for vinum daemon. 1999-01-21 00:31:31 +00:00
grog
30b219d068 Include Peter Wemm's renaming and restructuring
Change from lkm to kld

Remove BROKEN_GDB kludge (it's not needed with klds)

Add code for interfacing with daemon

Modify manner of determining when module is idle

Modify device minor number encoding, use selector functions which also
permit anonymous plexes and subdisks.

Remove code for 2.x support.

Move vinum_scandisk to vinumio.c

Remove myproc kludge

Keep track of open volumes by flag, not by pid (the pids caused some
problems with the lock manager).

free_vinum:
  Remove unmapped and defective regions from plexes.
  Wait for daemon to stop before returning

vinumopen:
  Don't refuse an open if the volume is already open.
1999-01-21 00:30:52 +00:00
grog
e95662ec61 Reflect changes in vinumstate.h 1999-01-21 00:30:24 +00:00
grog
5dd4dc80f9 Include Peter Wemm's renaming and restructuring
Change from lkm to kld

Add structures for daemon

Remove old cruft

Increase RQINFO_SIZE to 128
1999-01-21 00:29:44 +00:00