Commit Graph

1089 Commits

Author SHA1 Message Date
tmm
48dbb11156 Add a missing semicolon to unbreak the kernel build with INVARIANTS
(which was unfortunately turned off in the confguration I used for the
last test build).

Spotted by:	jake
Pointy hat to:	tmm
2001-08-05 03:55:02 +00:00
jhb
d47050ac2b Whitespace fixes. 2001-08-04 20:49:29 +00:00
tmm
8388d01b0c Add a zdestroy() function to the zone allocator. This is needed for the
unload case of modules that use their own zones.
It has been tested with the nfs module.
2001-08-04 20:17:05 +00:00
alfred
1d105403d0 Fixups for the initial allocation by dillon:
1) allocate fewer buckets
  2) when failing to allocate swap zone, keep reducing the zone by
     a third rather than a half in order to reduce the chance of
     allocating way too little.

I also moved around some code for readability.

Suggested by: dillon
Reviewed by: dillon
2001-08-02 07:54:58 +00:00
jake
b4050e8494 Oops. Last commit to vm_object.c should have got these files too.
Remove the use of atomic ops to manipulate vm_object and vm_page flags.
Giant is required here, so they are superfluous.

Discussed with:	dillon
2001-07-31 04:09:52 +00:00
jake
2c20bb4e7b Remove the use of atomic ops to manipulate vm_object and vm_page flags.
Giant is required here, so they are superfluous.

Discussed with:	dillon
2001-07-31 04:03:53 +00:00
iedowse
5da45bdf5b Permit direct swapping to NFS regular files using swapon(2). We
already allow this for NFS swap configured via BOOTP, so it is
known to work fine.

For many diskless configurations is is more flexible to have the
client set up swapping itself; it can recreate a sparse swap file
to save on server space for example, and it works with a non-NFS
root filesystem such as an in-kernel filesystem image.
2001-07-28 20:18:38 +00:00
assar
20642ac3f7 make vm_page_select_cache static
Requested by:	bde
2001-07-23 12:34:31 +00:00
assar
e563ff4066 (vm_page_select_cache): add prototype 2001-07-21 17:08:15 +00:00
benno
b923e2f3e6 The i386-specific includes in this file were "fixed" by bracketing them with
#ifndef __alpha__.  Fix this for the rest of the world by turning it into
#ifdef __i386__.

Reviewed by:	obrien
2001-07-15 04:11:51 +00:00
des
17faec39c6 Fix missing newline and terminator at the end of the vm.zone sysctl. 2001-07-09 03:37:33 +00:00
mjacob
1bbbafcc7f Apply field bandages to the includes so compiles happen on alpha. 2001-07-05 06:13:44 +00:00
dillon
1cf218e40f Move vm_page_zero_idle() from machine-dependant sections to a
machine-independant source file, vm/vm_zeroidle.c.  It was exactly the
same for all platforms and updating them all was getting annoying.
2001-07-05 01:32:42 +00:00
dillon
93369f554a Reorg vm_page.c into vm_page.c, vm_pageq.c, and vm_contig.c (for contigmalloc).
Also removed some spl's and added some VM mutexes, but they are not actually
used yet, so this commit does not really make any operational changes
to the system.

vm_page.c relates to vm_page_t manipulation, including high level deactivation,
activation, etc...  vm_pageq.c relates to finding free pages and aquiring
exclusive access to a page queue (exclusivity part not yet implemented).
And the world still builds... :-)
2001-07-04 23:27:09 +00:00
dillon
f45603dee9 Change inlines back into mainline code in preparation for mutexing. Also,
most of these inlines had been bloated in -current far beyond their
original intent.  Normalize prototypes and function declarations to be ANSI
only (half already were).  And do some general cleanup.

(kernel size also reduced by 50-100K, but that isn't the prime intent)
2001-07-04 20:15:18 +00:00
dillon
cbc4469f38 whitespace / register cleanup 2001-07-04 19:00:13 +00:00
dillon
e028603b7e With Alfred's permission, remove vm_mtx in favor of a fine-grained approach
(this commit is just the first stage).  Also add various GIANT_ macros to
formalize the removal of Giant, making it easy to test in a more piecemeal
fashion. These macros will allow us to test fine-grained locks to a degree
before removing Giant, and also after, and to remove Giant in a piecemeal
fashion via sysctl's on those subsystems which the authors believe can
operate without Giant.
2001-07-04 16:20:28 +00:00
jhb
34ed38abdc Fix a XXX comment by moving the initialization of the number of pbuf's
for the vnode pager to a new vnode pager init method instead of making it
a hack in getpages().
2001-07-03 07:35:56 +00:00
jhb
ce09fa9fbc - Protect all accesses to nsw_[rw]count{,_{,a}sync} with the pbuf mutex.
- Don't drop the vm mutex while grabbing the pbuf mutex to manipulate
  said variables.
2001-06-22 21:12:19 +00:00
bmilekic
5d710b296b Introduce numerous SMP friendly changes to the mbuf allocator. Namely,
introduce a modified allocation mechanism for mbufs and mbuf clusters; one
which can scale under SMP and which offers the possibility of resource
reclamation to be implemented in the future. Notable advantages:

 o Reduce contention for SMP by offering per-CPU pools and locks.
 o Better use of data cache due to per-CPU pools.
 o Much less code cache pollution due to excessively large allocation macros.
 o Framework for `grouping' objects from same page together so as to be able
   to possibly free wired-down pages back to the system if they are no longer
   needed by the network stacks.

 Additional things changed with this addition:

  - Moved some mbuf specific declarations and initializations from
    sys/conf/param.c into mbuf-specific code where they belong.
  - m_getclr() has been renamed to m_get_clrd() because the old name is really
    confusing. m_getclr() HAS been preserved though and is defined to the new
    name. No tree sweep has been done "to change the interface," as the old
    name will continue to be supported and is not depracated. The change was
    merely done because m_getclr() sounds too much like "m_get a cluster."
  - TEMPORARILY disabled mbtypes statistics displaying in netstat(1) and
    systat(1) (see TODO below).
  - Fixed systat(1) to display number of "free mbufs" based on new per-CPU
    stat structures.
  - Fixed netstat(1) to display new per-CPU stats based on sysctl-exported
    per-CPU stat structures. All infos are fetched via sysctl.

 TODO (in order of priority):

  - Re-enable mbtypes statistics in both netstat(1) and systat(1) after
    introducing an SMP friendly way to collect the mbtypes stats under the
    already introduced per-CPU locks (i.e. hopefully don't use atomic() - it
    seems too costly for a mere stat update, especially when other locks are
    already present).
  - Optionally have systat(1) display not only "total free mbufs" but also
    "total free mbufs per CPU pool."
  - Fix minor length-fetching issues in netstat(1) related to recently
    re-enabled option to read mbuf stats from a core file.
  - Move reference counters at least for mbuf clusters into an unused portion
    of the cluster itself, to save space and need to allocate a counter.
  - Look into introducing resource freeing possibly from a kproc.

Reviewed by (in parts): jlemon, jake, silby, terry
Tested by: jlemon (Intel & Alpha), mjacob (Intel & Alpha)
Preliminary performance measurements: jlemon (and me, obviously)
URL: http://people.freebsd.org/~bmilekic/mb_alloc/
2001-06-22 06:35:32 +00:00
jhb
c4ce33c0c7 Don't lock around swap_pager_swap_init() that is only called once during
the pagedaemon's startup code since it calls malloc which results in lock
order reversals.
2001-06-20 23:34:06 +00:00
jhb
3910f65aa2 Put the scheduler, vmdaemon, and pagedaemon kthreads back under Giant for
now.  The proc locking isn't actually safe yet and won't be until the proc
locking is finished.
2001-06-20 00:48:20 +00:00
dillon
633b2a04be Cleanup the tabbing 2001-06-11 19:17:05 +00:00
dillon
3d6ba4564b Two fixes to the out-of-swap process termination code. First, start killing
processes a little earlier to avoid a deadlock.  Second, when calculating
the 'largest process' do not just count RSS.  Instead count the RSS + SWAP
used by the process.  Without this the code tended to kill small
inconsequential processes like, oh, sshd, rather then one of the many
'eatmem 200MB' I run on a whim :-).  This fix has been extensively tested on
-stable and somewhat tested on -current and will be MFCd in a few days.

Shamed into fixing this by: ps
2001-06-09 18:06:58 +00:00
tmm
87a858d545 Change the way information about swap devices is exported to be more
canonical: define a versioned struct xswdev, and add a sysctl node
handler that allows the user to get this structure for a certain device
index by specifying this index as last element of the MIB.
This new node handler, vm.swap_info, replaces the old vm.nswapdev
and vm.swapdevX.* (where X was the index) sysctls.
2001-06-01 22:53:10 +00:00
tmm
9ce8a62347 Clean up the code exporting interrupt statistics via sysctl a bit:
- move the sysctl code to kern_intr.c
- do not use INTRCNT_COUNT, but rather eintrcnt - intrcnt to determine
  the length of the intrcnt array
- move the declarations of intrnames, eintrnames, intrcnt and eintrcnt
  from machine-dependent include files to sys/interrupt.h
- remove the hw.nintr sysctl, it is not needed.
- fix various style bugs

Requested by:	bde
Reviewed by:	bde (some time ago)
2001-06-01 13:23:28 +00:00
jhb
0c84f37525 Don't hold the VM lock across VOP's and other things that can sleep. 2001-05-29 16:58:25 +00:00
jhb
cb02b9b724 Stick VM syscalls back under Giant if the BLEED option is not defined. 2001-05-24 18:04:29 +00:00
dillon
a179ee09ab This patch implements O_DIRECT about 80% of the way. It takes a patchset
Tor created a while ago, removes the raw I/O piece (that has cache coherency
problems), and adds a buffer cache / VM freeing piece.

Essentially this patch causes O_DIRECT I/O to not be left in the cache, but
does not prevent it from going through the cache, hence the 80%.  For
the last 20% we need a method by which the I/O can be issued directly to
buffer supplied by the user process and bypass the buffer cache entirely,
but still maintain cache coherency.

I also have the code working under -stable but the changes made to sys/file.h
may not be MFCable, so an MFC is not on the table yet.

Submitted by:	tegge, dillon
2001-05-24 07:22:27 +00:00
jhb
c6bb467070 - Assert Giant is held in the vnode pager methods.
- Lock the VM while walking down a vm_object's backing_object list in
  vnode_pager_lock().
2001-05-23 22:51:23 +00:00
jhb
f7df8897d2 - Add in several asserts of vm_mtx.
- Assert Giant in vm_pageout_scan() for the vnode hacking that it does.
- Don't hold vm_mtx around vget() or vput().
- Lock Giant when calling vm_pageout_scan() from the pagedaemon.  Also,
  lock curproc while setting the P_BUFEXHAUST flag.
- For now we still hold Giant for all of the vm_daemon.  When process
  limits are locked we will be only need Giant for swapout_procs().
2001-05-23 22:48:28 +00:00
jhb
7959aca6f7 - Assert that the vm lock is held for all of _vm_object_allocate().
- Restore the previous order of setting up a new vm_object.  The previous
  had a small bug where we zero'd out the flags after we set the
  OBJ_ONEMAPPING flag.
- Add several asserts of vm_mtx.
- Assert Giant is held rather than locking and unlocking it in a few
  places.
- Add in some #ifdef objlocks code to lock individual vm objects when
  vm objects each have their own lock someday.
- Don't bother acquiring the allproc lock for a ddb command.  If DDB
  blocked on the lock, that would be worse than having an inconsistent
  allproc list.
2001-05-23 22:42:10 +00:00
jhb
cf87d9f5db - Add lots of vm_mtx assertions.
- Add a few KTR tracepoints to track the addition and removal of
  vm_map_entry's and the creation adn free'ing of vmspace's.
- Adjust a few portions of code so that we update the process' vmspace
  pointer to its new vmspace before freeing the old vmspace.
2001-05-23 22:38:00 +00:00
jhb
b6e2dc1111 - Lock the VM around the pmap_swapin_proc() call in faultin().
- Don't lock Giant in the scheduler() function except for when calling
  faultin().
- In swapout_procs(), lock the VM before the proccess to avoid a lock order
  violation.
- In swapout_procs(), release the allproc lock before calling swapout().
  We restart the process scan after swapping out a process.
- In swapout_procs(), un #if 0 the code to bump the vmspace reference count
  and lock the process' vm structures.  This bug was introduced by me and
  could result in the vmspace being free'd out from under a running
  process.
- Fix an old bug where the vmspace reference count was not free'd if we
  failed the swap_idle_threshold2 test.
2001-05-23 22:35:45 +00:00
jhb
38c4b5afb4 - Fix the sw_alloc_interlock to actually lock itself when the lock is
acquired.
- Assert Giant is held in the strategy, getpages, and putpages methods and
  the getchainbuf, flushchainbuf, and waitchainbuf functions.
- Always call flushchainbuf() w/o the VM lock.
2001-05-23 22:31:15 +00:00
jhb
c76962451c Assert Giant is held for the device pager alloc and getpages methods since
we call the mmap method of the cdevsw of the device we are mmap'ing.
2001-05-23 22:27:52 +00:00
jhb
7703731a60 - Obtain Giant in mmap() syscall while messing with file descriptors and
vnodes.
- Fix an old bug that would leak a reference to a fd if the vnode being
  mmap'd wasn't of type VREG or VCHR.
- Lock Giant in vm_mmap() around calls into the VM that can call into
  pager routines that need Giant or into other VM routines that need
  Giant.
- Replace code that used a goto to jump around the else branch of a test
  to use an else branch instead.
2001-05-23 22:17:43 +00:00
jhb
04aacfa658 Acquire Giant around vm_map_remove() inside of the obreak() syscall for
vm_object_terminate().
2001-05-23 22:13:10 +00:00
jhb
a0dd4d53a7 Take a more conservative approach and still lock Giant around VM faults
for now.
2001-05-23 22:09:18 +00:00
jhb
a5ffebeaaa Set the phys_pager_alloc_lock to 1 when it is acquired so that it is
actually locked.
2001-05-23 19:52:23 +00:00
alfred
587340efa6 aquire Giant when playing with the buffercache and doing IO.
use msleep against the vm mutex while waiting for a page IO to complete.
2001-05-23 10:28:11 +00:00
alfred
5382981252 aquire vm mutex in swp_pager_async_iodone. Don't call swp_pager_async_iodone
with the mutex held.
2001-05-22 19:01:26 +00:00
jhb
376941ef28 Remove duplicate include and sort includes. 2001-05-22 07:21:46 +00:00
jhb
85efd8a1d6 Sort includes. 2001-05-22 07:01:11 +00:00
jhb
fd51037384 Unlock the VM lock at the end of munlock() instead of locking it again. 2001-05-22 06:07:36 +00:00
jhb
c2bd2e9bd5 Sort includes from previous commit. 2001-05-22 05:35:45 +00:00
jhb
1dc43912c8 Sort includes. 2001-05-22 00:56:25 +00:00
alfred
a3f0842419 Introduce a global lock for the vm subsystem (vm_mtx).
vm_mtx does not recurse and is required for most low level
vm operations.

faults can not be taken without holding Giant.

Memory subsystems can now call the base page allocators safely.

Almost all atomic ops were removed as they are covered under the
vm mutex.

Alpha and ia64 now need to catch up to i386's trap handlers.

FFS and NFS have been tested, other filesystems will need minor
changes (grabbing the vm lock when twiddling page properties).

Reviewed (partially) by: jake, jhb
2001-05-19 01:28:09 +00:00
jhb
70b98f2d04 - Use a timeout for the tsleep in scheduler() instead of having vmmeter()
wakeup proc0 by hand to enforce the timeout.
- When swapping out a process, keep the process locked via the proc lock
  from the first checks up until we clear PS_INMEM and set PS_SWAPPING in
  swapout().  The swapout() function now must be called with the proc lock
  held and releases it before returning.
- Comment out the code to attempt to lock a process' VM structures before
  swapping out.  It is broken in that it releases the lock after obtaining
  it.  If it does grab the lock, it needs to hand it off to swapout()
  instead of releasing it.  This can be revisisted when the VM is locked
  as this is a valid test to perform.  It also causes a lock order reversal
  for the time being, which is the immediate cause for temporarily
  disabling it.
2001-05-18 00:08:38 +00:00
jhb
c2a1096080 During the code to pick a process to kill when memory is exhausted, keep
the process in question locked as soon as we find it and determine it to
be eligible until we actually kill it.  To avoid deadlock, we don't block
on the process lock but skip any process that is already locked during our
search.
2001-05-17 22:49:03 +00:00
jhb
8865204035 - Use PROC_LOCK_ASSERT instead of a direct mtx_assert.
- Don't hold Giant in the swapper daemon while we walk the list of
  processes looking for a process to swap back in.
- Don't bother grabbing the sched_lock while checking a process' sleep
  time in swapout_procs() to ensure that a process has been idle for at
  least swap_idle_threshold2 before swapping it out.  If we lose the race
  we just let a process stay in memory until the next call of
  swapout_procs().
- Remove some unneeded spl's, sched_lock does all the locking needed in
  this case.
2001-05-15 22:20:44 +00:00
phk
16caeec9b0 Actually biofinish(struct bio *, struct devstat *, int error) is more general
than the bioerror().

Most of this patch is generated by scripts.
2001-05-06 20:00:03 +00:00
markm
b831760191 Putting sys/lockmgr.h in here allows us to depollute userland includes
a bit.
OK'ed by:	bde
2001-05-03 11:33:51 +00:00
markm
bcca5847d5 Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
other "system" header files.

Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h form kernel .c files.

Sort sys/*.h includes where possible in affected files.

OK'ed by:	bde (with reservations)
2001-05-01 08:13:21 +00:00
grog
4b9d9cbaac Revert consequences of changes to mount.h, part 2.
Requested by:	bde
2001-04-29 02:45:39 +00:00
alfred
3abca35872 Address a number of problems with sysctl_vm_zone().
The zone allocator's locks should be leaflocks, meaning that they
should never be held when entering into another subsystem, however
the sysctl grabs the zone global mutex and individual zone mutexes
while holding the lock it calls SYSCTL_OUT which recurses into the
VM subsystem in order to wire user memory to do a safe copy.  This
can block and cause lock order reversals.

To fix this:
  lock zone global.
  get a count of the number of zones.
  unlock global.
  allocate temporary storage.
  format and SYSCTL_OUT the banner.
  lock global.
  traverse list.
    make sure we haven't looped more than the initial count taken
      to avoid overflowing the allocated buffer.
    lock each nodes.
    read values and format into buffer.
    unlock individual node.
  unlock global.
  format and SYSCTL_OUT the rest of the data.
  free storage.
  return.

Other problems included not checking for errors when doing sysctl out
of the column header.  Fixed.

Inconsistant termination of the copied string. Fixed.

Objected to by: des (for not using sbuf)

Since the output is not variable length and I'm actually over
allocating signifigantly and I'd like to get this fixed now, I'll
work on the sbuf convertion at a later date.  I would not object
to someone else taking it upon themselves to convert it to sbuf.
I hold no MAINTIANER rights to this code (for now).
2001-04-27 22:24:45 +00:00
grog
1f5de30718 Correct #includes to work with fixed sys/mount.h. 2001-04-23 09:05:15 +00:00
alfred
1e50a5f33e vnode_pager_freepage() is really vm_page_free() in disguise,
nuke vnode_pager_freepage() and replace all calls to it with vm_page_free()
2001-04-19 06:18:23 +00:00
alfred
e6ee97803e Protect pager object creation with sx locks.
Protect pager object list manipulation with a mutex.

It doesn't look possible to combine them under a single sx lock because
creation may block and we can't have the object list manipulation block
on anything other than a mutex because of interrupt requests.
2001-04-18 20:24:16 +00:00
alfred
cd79f17cad Fix the botched rev 1.59 where I made it such that without INVARIANTS
the map is never locked.

Submitted by: tegge
2001-04-18 05:30:24 +00:00
phk
378e561228 This patch removes the VOP_BWRITE() vector.
VOP_BWRITE() was a hack which made it possible for NFS client
side to use struct buf with non-bio backing.

This patch takes a more general approach and adds a bp->b_op
vector where more methods can be added.

The success of this patch depends on bp->b_op being initialized
all relevant places for some value of "relevant" which is not
easy to determine.  For now the buffers have grown a b_magic
element which will make such issues a tiny bit easier to debug.
2001-04-17 08:56:39 +00:00
alfred
ec0e3ba359 use TAILQ_FOREACH, fix a comment's location 2001-04-15 10:22:04 +00:00
alfred
7e6ce027ec if/panic -> KASSERT 2001-04-13 11:15:40 +00:00
alfred
bbee48d66d protect pbufs and associated counts with a mutex 2001-04-13 10:23:32 +00:00
alfred
bcfbf5a27d use %p for pointer printf, include sys/systm.h for printf proto 2001-04-13 10:22:14 +00:00
alfred
7dcb59378d Use a macro wrapper over printf along with KASSERT to reduce the amount
of code here.
2001-04-13 08:07:37 +00:00
alfred
229635845b remove truncated part from commment 2001-04-12 21:50:03 +00:00
jhb
79cf991a6b Convert the allproc and proctree locks from lockmgr locks to sx locks. 2001-03-28 11:52:56 +00:00
jhb
b47bfbe544 Catch up to header include changes:
- <sys/mutex.h> now requires <sys/systm.h>
- <sys/mutex.h> and <sys/sx.h> now require <sys/lock.h>
2001-03-28 09:17:56 +00:00
tmm
76ba4861cd Export intrnames and intrcnt as sysctls (hw.nintr, hw.intrnames and
hw.intrcnt).

Approved by:	rwatson
2001-03-23 03:45:17 +00:00
dillon
be9d069cf0 Fix a lock reversal problem in the VM subsystem related to threaded
programs.   There is a case during a fork() which can cause a deadlock.

From Tor -
The workaround that consists of setting a flag in the vm map that
indicates that a fork is in progress and using that mark in the page
fault handling to force a revalidation failure.  That change will only
affect (pessimize) page fault handling during fork for threaded
(linuxthreads style) applications and applications using aio_*().

Submited by: tegge
2001-03-14 06:48:53 +00:00
dillon
3b687e6575 Temporarily remove the vm_map_simplify() call from vm_map_insert(). The
call is correct, but it interferes with the massive hack called
vm_map_growstack().  The call will be returned after our stack handling
code is fixed.

Reported by: tegge
2001-03-14 06:09:42 +00:00
iedowse
80d8e16746 When creating a shadow vm_object in vmspace_fork(), only one
reference count was transferred to the new object, but both the
new and the old map entries had pointers to the new object.
Correct this by transferring the second reference.

This fixes a panic that can occur when mmap(2) is used with the
MAP_INHERIT flag.

PR:		i386/25603
Reviewed by:	dillon, alc
2001-03-09 18:25:54 +00:00
jhb
1de39ad1a0 Unrevert the pmap_map() changes. They weren't broken on x86.
Sense beaten into me by:	peter
2001-03-07 05:29:21 +00:00
jhb
74a74a3282 Back out the pmap_map() change for now, it isn't completely stable on the
i386.
2001-03-07 01:04:17 +00:00
jhb
a710dd7194 - Rework pmap_map() to take advantage of direct-mapped segments on
supported architectures such as the alpha.  This allows us to save
  on kernel virtual address space, TLB entries, and (on the ia64) VHPT
  entries.  pmap_map() now modifies the passed in virtual address on
  architectures that do not support direct-mapped segments to point to
  the next available virtual address.  It also returns the actual
  address that the request was mapped to.
- On the IA64 don't use a special zone of PV entries needed for early
  calls to pmap_kenter() during pmap_init().  This gets us in trouble
  because we end up trying to use the zone allocator before it is
  initialized.  Instead, with the pmap_map() change, the number of needed
  PV entries is small enough that we can get by with a static pool that is
  used until pmap_init() is complete.

Submitted by:		dfr
Debugging help:		peter
Tested by:		me
2001-03-06 06:06:42 +00:00
alfred
1c4e3d51b7 Simplify vm_object_deallocate(), by decrementing the refcount first.
This allows some of the conditionals to be combined.
2001-03-04 20:25:23 +00:00
gallatin
845eeac4fc Allocate vm_page_array and vm_page_buckets from the end of the biggest chunk
of memory, rather than from the start.

This fixes problems allocating bouncebuffers on alphas where there is only
1 chunk of memory (unlike PCs where there is generally at least one small
chunk and a large chunk).  Having 1 chunk had been fatal, because these
structures take over 13MB on a machine with 1GB of ram. This doesn't leave
much room for other structures and bounce buffers if they're at the front.

Reviewed by: dfr, anderson@cs.duke.edu, silence on -arch
Tested by: Yoriaki FUJIMORI <fujimori@grafin.fujimori.cache.waseda.ac.jp>
2001-03-01 19:21:24 +00:00
dillon
86a16a841c If we intend to make the page writable without requiring another fault,
make sure that PG_NOSYNC is properly set.  Previously we only set it
for a write-fault, but this can occur on a read-fault too.
(will be MFCd prior to 4.3 freeze)
2001-02-28 04:26:43 +00:00
rwatson
9a2d215a5f Introduce per-swap area accounting in the VM system, and export
this information via the vm.nswapdev sysctl (number of swap areas)
and vm.swapdevX nodes (where X is the device), which contain the MIBs
dev, blocks, used, and flags.  These changes are required to allow
top and other userland swap-monitoring utilities to run without
setgid kmem.

Submitted by:	Thomas Moestl <tmoestl@gmx.net>
Reviewed by:	freebsd-audit
2001-02-23 18:46:21 +00:00
des
8862541dfd Fix formatting bugs introduced in sysctl_vm_zone() by the previous commit.
Also, if SYSCTL_OUT() returns a non-zero value, stop at once.
2001-02-22 14:44:39 +00:00
jake
55d5108ac5 Implement a unified run queue and adjust priority levels accordingly.
- All processes go into the same array of queues, with different
  scheduling classes using different portions of the array.  This
  allows user processes to have their priorities propogated up into
  interrupt thread range if need be.
- I chose 64 run queues as an arbitrary number that is greater than
  32.  We used to have 4 separate arrays of 32 queues each, so this
  may not be optimal.  The new run queue code was written with this
  in mind; changing the number of run queues only requires changing
  constants in runq.h and adjusting the priority levels.
- The new run queue code takes the run queue as a parameter.  This
  is intended to be used to create per-cpu run queues.  Implement
  wrappers for compatibility with the old interface which pass in
  the global run queue structure.
- Group the priority level, user priority, native priority (before
  propogation) and the scheduling class into a struct priority.
- Change any hard coded priority levels that I found to use
  symbolic constants (TTIPRI and TTOPRI).
- Remove the curpriority global variable and use that of curproc.
  This was used to detect when a process' priority had lowered and
  it should yield.  We now effectively yield on every interrupt.
- Activate propogate_priority().  It should now have the desired
  effect without needing to also propogate the scheduling class.
- Temporarily comment out the call to vm_page_zero_idle() in the
  idle loop.  It interfered with propogate_priority() because
  the idle process needed to do a non-blocking acquire of Giant
  and then other processes would try to propogate their priority
  onto it.  The idle process should not do anything except idle.
  vm_page_zero_idle() will return in the form of an idle priority
  kernel thread which is woken up at apprioriate times by the vm
  system.
- Update struct kinfo_proc to the new priority interface.  Deliberately
  change its size by adjusting the spare fields.  It remained the same
  size, but the layout has changed, so userland processes that use it
  would parse the data incorrectly.  The size constraint should really
  be changed to an arbitrary version number.  Also add a debug.sizeof
  sysctl node for struct kinfo_proc.
2001-02-12 00:20:08 +00:00
bmilekic
f364d4ac36 Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)
2001-02-09 06:11:45 +00:00
phk
e87f7a15ad Mechanical change to use <sys/queue.h> macro API instead of
fondling implementation details.

Created with: sed(1)
Reviewed by: md5(1)
2001-02-04 13:13:25 +00:00
dillon
c8a95a285d This commit represents work mainly submitted by Tor and slightly modified
by myself.  It solves a serious vm_map corruption problem that can occur
with the buffer cache when block sizes > 64K are used.  This code has been
heavily tested in -stable but only tested somewhat on -current.  An MFC
will occur in a few days.  My additions include the vm_map_simplify_entry()
and minor buffer cache boundry case fix.

Make the buffer cache use a system map for buffer cache KVM rather then a
normal map.

Ensure that VM objects are not allocated for system maps.  There were cases
where a buffer map could wind up with a backing VM object -- normally
harmless, but this could also result in the buffer cache blocking in places
where it assumes no blocking will occur, possibly resulting in corrupted
maps.

Fix a minor boundry case in the buffer cache size limit is reached that
could result in non-optimal code.

Add vm_map_simplify_entry() calls to prevent 'creeping proliferation'
of vm_map_entry's in the buffer cache's vm_map.  Previously only a simple
linear optimization was made.  (The buffer vm_map typically has only a
handful of vm_map_entry's.  This stabilizes it at that level permanently).

PR: 20609
Submitted by: (Tor Egge) tegge
2001-02-04 06:19:28 +00:00
jhb
ef07536d23 - Doh, lock faultin() with proc lock in scheduler().
- Lock p_swtime with sched_lock in scheduler() as well.
2001-01-25 01:38:09 +00:00
jasone
8d2ec1ebc4 Convert all simplelocks to mutexes and remove the simplelock implementations. 2001-01-24 12:35:55 +00:00
jhb
c5cc2f8e26 Argh, I didn't get this test right when I converted it. Break this up
into two separate if's instead of nested if's.  Also, reorder things
slightly to avoid unnecessary mutex operations.
2001-01-24 12:23:17 +00:00
jhb
9d4ef4f19c - Catch up to proc flag changes.
- Minimal proc locking.
- Use queue macros.
2001-01-24 11:28:36 +00:00
jhb
f2190e5fef Add mtx_assert()'s to verify that kmem_alloc() and kmem_free() are called
with Giant held.
2001-01-24 11:27:29 +00:00
jhb
9e1a5e2c5b - Catch up to proc flag changes.
- Proc locking in a few places.
- faultin() now must be called with the proc lock held.
- Split up swappable() into a couple of tests so that it can be locke in
  swapout_procs().
- Use queue macros.
2001-01-24 11:25:56 +00:00
jhb
963052ead7 - Catch up to proc flag changes. 2001-01-24 11:20:05 +00:00
jhb
85f26795ce Add missing include. 2001-01-24 06:54:24 +00:00
ume
1ae749987d Add mibs to hold the number of forks since boot. New mibs are:
vm.stats.vm.v_forks
	vm.stats.vm.v_vforks
	vm.stats.vm.v_rforks
	vm.stats.vm.v_kthreads
	vm.stats.vm.v_forkpages
	vm.stats.vm.v_vforkpages
	vm.stats.vm.v_rforkpages
	vm.stats.vm.v_kthreadpages

Submitted by:	Paul Herman <pherman@frenchfries.net>
Reviewed by:	alfred
2001-01-23 14:32:01 +00:00
jake
13afa1fa68 Sigh. atomic_add_int takes a pointer, not an integer.
Pointy-hat-to:	des
2001-01-23 03:40:27 +00:00
des
7e93e90119 Use atomic operations to update the stat counters. 2001-01-23 01:11:11 +00:00
des
30bc358ebb Call vm_zone_init() at the appropriate time.
Reviewed by:	jasone, jhb
2001-01-22 07:02:42 +00:00
des
6929628df0 Give this code a major facelift:
- replace the simplelock in struct vm_zone with a mutex.

 - use a proper SLIST rather than a hand-rolled job for the zone list.

 - add a subsystem lock that protects the zone list and the statistics
   counters.

 - merge _zalloc() into zalloc() and _zfree() into zfree(), and
   move them below _zget() so there's no need for a prototype.

 - add two initialization functions: one which initializes the
   subsystem mutex and the zone list, and one that currently doesn't
   do anything.

 - zap zerror(); use KASSERTs instead.

 - dike out half of sysctl_vm_zone(), which was mostly trying to do
   manually what the snprintf() call could do better.

Reviewed by:	jhb, jasone
2001-01-22 07:01:50 +00:00
des
b3c27aaaf7 First step towards an MP-safe zone allocator:
- have zalloc() and zfree() always lock the vm_zone.
 - remove zalloci() and zfreei(), which are now redundant.

Reviewed by:	bmilekic, jasone
2001-01-21 22:23:11 +00:00
alfred
f6aeda2758 fix comment which was outdated 3 years ago
remove useless assignment
purge entire file of 'register' keyword
2000-12-29 13:49:05 +00:00