279 Commits

Author SHA1 Message Date
dyson
e817145c4a Handle the bogus device that MFS uses as its VBLK device. We now don't
try to VMIO open it on MFS mounts.  This will fix the mfs_badops
panic.
1996-03-02 22:18:34 +00:00
dyson
b7119381dd Enable VMIO for non-VDIR metadata and block device. 1996-03-02 03:45:12 +00:00
bde
5183c3b725 Removed vestigial support for the obsolete FIFO option. In ext2fs
it caused null pointer panics for all fifo operations unless FIFO
was defined.
1996-02-25 20:12:36 +00:00
mpp
f3dd75a38d Fix a bunch of spelling errors in the comment fields of
a bunch of system include files.
1996-01-30 23:02:38 +00:00
dyson
8fc8a772af Eliminated many redundant vm_map_lookup operations for vm_mmap.
Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish
	overhead for merged cache.
Efficiency improvement for vfs_cluster.  It used to do alot of redundant
	calls to cluster_rbuild.
Correct the ordering for vrele of .text and release of credentials.
Use the selective tlb update for 486/586/P6.
Numerous fixes to the size of objects allocated for files.  Additionally,
	fixes in the various pagers.
Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs.
Fixes in the swap pager for exhausted resources.  The pageout code
	will not as readily thrash.
Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into
	page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE),
	thereby improving efficiency of several routines.
Eliminate even more unnecessary vm_page_protect operations.
Significantly speed up process forks.
Make vm_object_page_clean more efficient, thereby eliminating the pause
	that happens every 30seconds.
Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the
	case of filesystems mounted async.
Fix a panic with busy pages when write clustering is done for non-VMIO
	buffers.
1996-01-19 04:00:31 +00:00
bde
d1eb245b0a Partially fixed negative and truncated "Avail" counts in df output.
This fixes PR943.

ffs/ffs_vfsops.c:
ffs_statfs() multiplied by (100 - minfree) as part of calculating the
minfree percentage (complemented in 100%), so with the standard minfree
of 8, it was broken for file systems of size >= 1TB/92 = 11GB.  Use the
standard freespace() macro instead.  This also fixes a rounding bug (the
"Avail" count was sometimes 1 too small).

ffs/* (not fixed):
The freespace() macro multiplies by minfree, so with the standard
minfree of 8, it is broken for file systems of size >= 1TB/8 = 128GB.
This bug is more serious since it affects block allocation.

ffs/ffs_alloc.c (not fixed):
Ordinary users are sometimes allowed to allocate 1 (partial) block
too many so that the "Avail" count goes negative.  E.g., if there is
1 fragment available and the file is fairly large, one more full
block is allocated.

df/df.c:
ufs_df() used/uses essentially the same code as ffs_statfs(), so it
had/has the same bugs.

ufs_df() gratuitously replaced "Avail" counts of < 0 by 0, so it
gave different results for non-mounted file systems in this case.
1996-01-14 18:55:09 +00:00
wollman
26b6c4cd73 Convert QUOTA to new-style option. 1996-01-05 18:31:58 +00:00
wollman
39d3a9a3d3 Convert DDB to new-style option. 1996-01-04 21:13:23 +00:00
phk
2a5a36a028 Staticize. 1995-12-17 21:14:36 +00:00
peter
1bce25d080 Silence a harmless warning... 1995-12-15 03:36:25 +00:00
dyson
601ed1a4c0 Changes to support 1Tb filesizes. Pages are now named by an
(object,index) pair instead of (object,offset) pair.
1995-12-11 04:58:34 +00:00
dg
c30f46c534 Untangled the vm.h include file spaghetti. 1995-12-07 12:48:31 +00:00
bde
3688cbda94 Completed function declarations and/or added prototypes and/or #includes
to get the prototypes.
1995-12-03 11:17:15 +00:00
phk
9f8979d178 Fix compiler warnings. 1995-11-20 12:25:37 +00:00
dyson
13fde74b24 General fixes to the vfs clustring code:
1) Make cluster buffer list be a non-malloced chain.  This eliminates
yet another 'evil' M_WAITOK and generally cleans up the code.
2) Fix write clustering for ext2fs.  It was just broken.  Also, ffs
clustering had an efficiency problem that more bawrites were happening
than should have been.
3) Make changes to buf.h to support the above, plus remove b_pfcent
at the request of David Greenman.

Note that the reallocblocks code is disabled pending rewrite for
the cluster buffer list changes.
1995-11-19 19:55:26 +00:00
phk
df7c5ad2a5 Get rid of the last debug sysctl variables of the old style. 1995-11-14 09:40:06 +00:00
bde
449a11eb88 Introduced a type `vop_t' for vnode operation functions and used
it 1138 times (:-() in casts and a few more times in declarations.
This change is null for the i386.

The type has to be `typedef int vop_t(void *)' and not `typedef
int vop_t()' because `gcc -Wstrict-prototypes' warns about the
latter.  Since vnode op functions are called with args of different
(struct pointer) types, neither of these function types is any use
for type checking of the arg, so it would be preferable not to use
the complete function type, especially since using the complete
type requires adding 1138 casts to avoid compiler warnings and
another 40+ casts to reverse the function pointer conversions before
calling the functions.
1995-11-09 08:17:23 +00:00
dyson
cfa6fda252 Make MNT_ASYNC more effective for UFS. It should not be too much more
dangerous than the original MNT_ASYNC.  There might be some minor
security considerations due to data writes not being posted as promptly
as before.  Meta-data operations are still not quite as fast as Linux,
but streaming I/O is still higher.
1995-11-05 21:01:15 +00:00
dyson
b1a28fda3c Finalize GETPAGES layering scheme. Move the device GETPAGES
interface into specfs code.  No need at this point to modify the
PUTPAGES stuff except in the layered-type (NULL/UNION) filesystems.
1995-10-23 02:23:29 +00:00
dyson
667fe1779b Re-enable read clustering. 1995-09-25 06:00:59 +00:00
dg
ee52758175 Shit! I changed the wrong doclusterread! ...Thanks to Steven Wallace and
Poul-Henning for convincing me that I should look at my mistake! :-)
1995-09-22 06:02:40 +00:00
dg
c1d2ad6c46 Disable file read clustering until the bug(s) in vfs_cluster.c are fixed.
This should temporarily fix the sig 10/11 problems that people have been
having for the past 3 weeks.
1995-09-22 00:05:46 +00:00
dg
719bf7a2c2 Slight optimization for the standard case of rotdelay=0. 1995-09-08 17:16:32 +00:00
dyson
55b8bdf1c8 Added indirect pointer for ffs_getpages, and added external declaration. 1995-09-06 05:41:17 +00:00
dyson
251a16a6af Added VOP_GETPAGES/VOP_PUTPAGES and also the "backwards" block count
for VOP_BMAP.  Updated affected filesystems...
1995-09-04 00:21:16 +00:00
julian
ebb726ec45 Reviewed by: julian with quick glances by bruce and others
Submitted by:	terry (terry lambert)
This is  a composite of 3 patch sets submitted by terry.
they are:
New low-level init code that supports loadbal modules better
some cleanups in the namei code to help terry in 16-bit character support
some changes to the mount-root code to make it a little more
modular..

NOTE: mounting root off cdrom or NFS MIGHT be broken as I haven't been able
to test those cases..

certainly mounting root of disk still works just fine..
mfs should work but is untested. (tomorrows task)

The low level init stuff includes a total rewrite of init_main.c
to make it possible for new modules to have an init phase by simply
adding an entry to a TEXT_SET (or is it DATA_SET) list. thus a new module can
be added to the kernel without editing any other files other than the
'files' file.
1995-08-28 09:19:25 +00:00
bde
0bdfde4001 Don't call VOP_UPDATE() with volatile timestamps. 1995-08-25 19:40:32 +00:00
dg
b6d06a9f5d Honor -async mount option when doing the inode update.
Obtained from:	4.4BSD-Lite2
1995-08-16 13:16:58 +00:00
dg
5b4b270015 Converted mountlist to a CIRCLEQ.
Partially obtained from: 4.4BSD-Lite2
1995-08-11 11:31:18 +00:00
dg
20100f1812 Use bdwrite() rather than brelse(). The cylinder group bitmap modification
is not preserved otherwise.
Note that this is a no-op in FreeBSD, however, as we have doreallocblks
disabled.

Submitted by:	Kirk McKusick
1995-08-07 08:16:32 +00:00
dg
cf5a823616 Removed redundant call to vm_object_page_clean - this is already done
in vfs_msync().
1995-08-06 11:56:42 +00:00
dg
060942c07e Use the correct flags (IO_SYNC -> B_SYNC) when deciding to do a sync or
async write in the section that changes the filesize. The bug resulted
in the updates always being async.

Obtained from:	4.4BSD-Lite2
1995-08-04 05:49:17 +00:00
dg
5d7eb9210c Since ufs_ihashget can block, the lock must be checked for each time
the function returns. Also, moved lock into .bss and made minor cosmetic
changes.

Submitted by:	Bruce Evans
1995-07-21 16:20:20 +00:00
dg
ab7b1f1cbf Implement a lock in ffs_vget to prevent a race condition where two processes
try allocate the same inode/vnode, causing a duplicate.

Submitted by:	Matt Dillon, slightly reworked by me.
1995-07-21 03:52:40 +00:00
dg
c8b0a7332c NOTE: libkvm, w, ps, 'top', and any other utility which depends on struct
proc or any VM system structure will have to be rebuilt!!!

Much needed overhaul of the VM system. Included in this first round of
changes:

1) Improved pager interfaces: init, alloc, dealloc, getpages, putpages,
   haspage, and sync operations are supported. The haspage interface now
   provides information about clusterability. All pager routines now take
   struct vm_object's instead of "pagers".

2) Improved data structures. In the previous paradigm, there is constant
   confusion caused by pagers being both a data structure ("allocate a
   pager") and a collection of routines. The idea of a pager structure has
   escentially been eliminated. Objects now have types, and this type is
   used to index the appropriate pager. In most cases, items in the pager
   structure were duplicated in the object data structure and thus were
   unnecessary. In the few cases that remained, a un_pager structure union
   was created in the object to contain these items.

3) Because of the cleanup of #1 & #2, a lot of unnecessary layering can now
   be removed. For instance, vm_object_enter(), vm_object_lookup(),
   vm_object_remove(), and the associated object hash list were some of the
   things that were removed.

4) simple_lock's removed. Discussion with several people reveals that the
   SMP locking primitives used in the VM system aren't likely the mechanism
   that we'll be adopting. Even if it were, the locking that was in the code
   was very inadequate and would have to be mostly re-done anyway. The
   locking in a uni-processor kernel was a no-op but went a long way toward
   making the code difficult to read and debug.

5) Places that attempted to kludge-up the fact that we don't have kernel
   thread support have been fixed to reflect the reality that we are really
   dealing with processes, not threads. The VM system didn't have complete
   thread support, so the comments and mis-named routines were just wrong.
   We now use tsleep and wakeup directly in the lock routines, for instance.

6) Where appropriate, the pagers have been improved, especially in the
   pager_alloc routines. Most of the pager_allocs have been rewritten and
   are now faster and easier to maintain.

7) The pagedaemon pageout clustering algorithm has been rewritten and
   now tries harder to output an even number of pages before and after
   the requested page. This is sort of the reverse of the ideal pagein
   algorithm and should provide better overall performance.

8) Unnecessary (incorrect) casts to caddr_t in calls to tsleep & wakeup
   have been removed. Some other unnecessary casts have also been removed.

9) Some almost useless debugging code removed.

10) Terminology of shadow objects vs. backing objects straightened out.
    The fact that the vm_object data structure escentially had this
    backwards really confused things. The use of "shadow" and "backing
    object" throughout the code is now internally consistent and correct
    in the Mach terminology.

11) Several minor bug fixes, including one in the vm daemon that caused
    0 RSS objects to not get purged as intended.

12) A "default pager" has now been created which cleans up the transition
    of objects to the "swap" type. The previous checks throughout the code
    for swp->pg_data != NULL were really ugly. This change also provides
    the rudiments for future backing of "anonymous" memory by something
    other than the swap pager (via the vnode pager, for example), and it
    allows the decision about which of these pagers to use to be made
    dynamically (although will need some additional decision code to do
    this, of course).

13) (dyson) MAP_COPY has been deprecated and the corresponding "copy
    object" code has been removed. MAP_COPY was undocumented and non-
    standard. It was furthermore broken in several ways which caused its
    behavior to degrade to MAP_PRIVATE. Binaries that use MAP_COPY will
    continue to work correctly, but via the slightly different semantics
    of MAP_PRIVATE.

14) (dyson) Sharing maps have been removed. It's marginal usefulness in a
    threads design can be worked around in other ways. Both #12 and #13
    were done to simplify the code and improve readability and maintain-
    ability. (As were most all of these changes)

TODO:

1) Rewrite most of the vnode pager to use VOP_GETPAGES/PUTPAGES. Doing
   this will reduce the vnode pager to a mere fraction of its current size.

2) Rewrite vm_fault and the swap/vnode pagers to use the clustering
   information provided by the new haspage pager interface. This will
   substantially reduce the overhead by eliminating a large number of
   VOP_BMAP() calls. The VOP_BMAP() filesystem interface should be
   improved to provide both a "behind" and "ahead" indication of
   contiguousness.

3) Implement the extended features of pager_haspage in swap_pager_haspage().
   It currently just says 0 pages ahead/behind.

4) Re-implement the swap device (swstrategy) in a more elegant way, perhaps
   via a much more general mechanism that could also be used for disk
   striping of regular filesystems.

5) Do something to improve the architecture of vm_object_collapse(). The
   fact that it makes calls into the swap pager and knows too much about
   how the swap pager operates really bothers me. It also doesn't allow
   for collapsing of non-swap pager objects ("unnamed" objects backed by
   other pagers).
1995-07-13 08:48:48 +00:00
dg
3c7c1dd62f 1) Converted v_vmdata to v_object.
2) Removed unnecessary vm_object_lookup()/pager_cache(object, TRUE) pairs
   after vnode_pager_alloc() calls - the object is already guaranteed to be
   persistent.
3) Removed some gratuitous casts.
1995-06-28 12:01:13 +00:00
rgrimes
c86f0c7a71 Remove trailing whitespace. 1995-05-30 08:16:23 +00:00
dg
9e52c97c63 Kill bogus vnode_pager_setsize(). It was being called at the wrong time
and resulted in the object size being too small. This caused bad things
to happen later when the file was mapped.

Reviewed by:	John Dyson
1995-05-28 04:32:23 +00:00
dg
2045200a00 Changes to fix the following bugs:
1) Files weren't properly synced on filesystems other than UFS. In some
   cases, this lead to lost data. Most likely would be noticed on NFS.
   The fix is to make the VM page sync/object_clean general rather than
   in each filesystem.
2) Mixing regular and mmaped file I/O on NFS was very broken. It caused
   chunks of files to end up as zeroes rather than the intended contents.
   The fix was to fix several race conditions and to kludge up the
   "b_dirtyoff" and "b_dirtyend" that NFS relies upon - paying attention
   to page modifications that occurred via the mmapping.

Reviewed by:	David Greenman
Submitted by:	John Dyson
1995-05-21 21:39:31 +00:00
dg
240701b33f NFS diskless operation was broken because swapdev_vp wasn't initialized.
These changes solve the problem in a general way by moving the
initialization out of the individual fs_mountroot's and into swaponvp().

Submitted by:	Poul-Henning Kamp
1995-05-19 03:27:08 +00:00
dg
138edd5273 Fixed incompleteness that would allow dirty filesystems to get mounted
when the single user shell was terminated. These changes disallow mounting
or R/W upgrading filesystems that are dirty unless "-f" (force) option
is used with mount. /etc/rc has been modified to abort the startup if
one or more non-nfs partitions fail to mount.

Reviewed by:	Poul-Henning Kamp, Rod Grimes
1995-05-15 08:39:37 +00:00
rgrimes
0e1db07cf9 Fix -Wformat warnings from LINT kernel. 1995-05-11 19:26:53 +00:00
dyson
7972592fe7 Limit filesize to the amount that the VM system can currently handle
(2GB).  If this limit is not imposed, then filesystem corruption will
ensue when files larger than 2GB are created.  This is temporary,
and the underlying limitation will be removed later.
1995-05-01 23:20:24 +00:00
dg
c1c54df7da Handle the "syncing VCHR vnode hang" problem a little differently; just
don't lock the vnode - it doesn't appear to ever be necessary for VCHR
vnode/inodes. This fixes a bug introduced in the previous commit that
caused tty timestamps to act strange (causing 'w' and 'finger' to show
the tty wasn't idle when it may have been for hours).
1995-04-11 04:23:47 +00:00
dg
b804a53282 Changes from John Dyson and myself:
Fixed remaining known bugs in the buffer IO and VM system.

vfs_bio.c:
Fixed some race conditions and locking bugs. Improved performance
by removing some (now) unnecessary code and fixing some broken
logic.
Fixed process accounting of # of FS outputs.
Properly handle NFS interrupts (B_EINTR).

(various)
Replaced calls to clrbuf() with calls to an optimized routine
called vfs_bio_clrbuf().

(various FS sync)
Sync out modified vnode_pager backed pages.

ffs_vnops.c:
Do two passes: Sync out file data first, then indirect blocks.

vm_fault.c:
Fixed deadly embrace caused by acquiring locks in the wrong order.

vnode_pager.c:
Changed to use buffer I/O system for writing out modified pages. This
should fix the problem with the modification date previous not getting
updated. Also dramatically simplifies the code. Note that this is
going to change in the future and be implemented via VOP_PUTPAGES().

vm_object.c:
Fixed a pile of bugs related to cleaning (vnode) objects. The performance
of vm_object_page_clean() is terrible when dealing with huge objects,
but this will change when we implement a binary tree to keep the object
pages sorted.

vm_pageout.c:
Fixed broken clustering of pageouts. Fixed race conditions and other
lockup style bugs in the scanning of pages. Improved performance.
1995-04-09 06:03:56 +00:00
bde
4f64fe43e7 Add and move declarations to fix all of the warnings from `gcc -Wimplicit'
(except in netccitt, netiso and netns) that I didn't notice when I fixed
"all" such warnings before.
1995-03-28 07:58:53 +00:00
dg
1320869559 Removed third arg (vmio) to allocbuf() that was added with the original
merged cache changes, and figure it out based on the B_VMIO buffer flag.
Fixes a problem where delayed write VMIO buffers would sometimes get
recopied into kernel-alloced memory.

Submitted by:	John Dyson
1995-03-26 23:29:13 +00:00
dg
9cd78521d8 Removed redundant newlines that were in some panic strings. 1995-03-19 14:29:26 +00:00
dg
e38e0cc286 Don't sync the inode date changes of character special devices
during the FS sync. The system would appear to hang momentarily
if there was a large backlog of I/O. This is because the vnode
remains locked during the output - preventing normal character
I/O. The problem was exacerbated by the FFS contiguous block
allocation fixes and a semi-broken disksort(). The inode/date
will still be synced during a normal FS dismount and whenever
the inode is changed for other reasons.
1995-03-18 18:03:29 +00:00
bde
289f11acb4 Add and move declarations to fix all of the warnings from `gcc -Wimplicit'
(except in netccitt, netiso and netns) and most of the warnings from
`gcc -Wnested-externs'.  Fix all the bugs found.  There were no serious
ones.
1995-03-16 18:17:34 +00:00