127 Commits

Author SHA1 Message Date
tegge
fbf474f2d8 Update freevnodes when adding a vnode to the head of the free list. 1998-01-31 01:17:58 +00:00
dyson
8726294764 Add better support for larger I/O clusters, including larger physical
I/O.  The support is not mature yet, and some of the underlying implementation
needs help.  However, support does exist for IDE devices now.
1998-01-24 02:01:46 +00:00
dyson
197bd655c4 VM level code cleanups.
1)	Start using TSM.
	Struct procs continue to point to upages structure, after being freed.
	Struct vmspace continues to point to pte object and kva space for kstack.
	u_map is now superfluous.
2)	vm_map's don't need to be reference counted.  They always exist either
	in the kernel or in a vmspace.  The vmspaces are managed by reference
	counts.
3)	Remove the "wired" vm_map nonsense.
4)	No need to keep a cache of kernel stack kva's.
5)	Get rid of strange looking ++var, and change to var++.
6)	Change more data structures to use our "zone" allocator.  Added
	struct proc, struct vmspace and struct vnode.  This saves a significant
	amount of kva space and physical memory.  Additionally, this enables
	TSM for the zone managed memory.
7)	Keep ioopt disabled for now.
8)	Remove the now bogus "single use" map concept.
9)	Use generation counts or id's for data structures residing in TSM, where
	it allows us to avoid unneeded restart overhead during traversals, where
	blocking might occur.
10)	Account better for memory deficits, so the pageout daemon will be able
	to make enough memory available (experimental.)
11)	Fix some vnode locking problems. (From Tor, I think.)
12)	Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp.
	(experimental.)
13)	Significantly shrink, clean up, and make slightly faster the vm_fault.c
	code.  Use generation counts, get rid of unneeded collapse operations,
	and clean up the cluster code.
14)	Make vm_zone more suitable for TSM.

This commit is partially a result of discussions and contributions from
other people, including DG, Tor Egge, PHK, and probably others that I
have forgotten to attribute (so let me know, if I forgot.)

This is not the infamous, final cleanup of the vnode stuff, but a necessary
step.  Vnode mgmt should be correct, but things might still change, and
there is still some missing stuff (like ioopt, and physical backing of
non-merged cache files, debugging of layering concepts.)
1998-01-22 17:30:44 +00:00
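
Item 9 of the commit above (197bd655c4) describes using generation counts so that a traversal only restarts when the structure really changed while the walker might have blocked. The following is a minimal userland sketch of that idea, not the kernel code: `struct glist`, `glist_push`, `glist_sum_with_hook`, and `add_extra_once` are hypothetical names, and the hook merely stands in for an operation that may sleep.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list whose generation is bumped on every structural change. */
struct gnode {
	struct gnode *next;
	int value;
};

struct glist {
	struct gnode *head;
	unsigned long generation;
};

/* Insert at the head and record that the structure changed. */
void
glist_push(struct glist *l, struct gnode *n)
{
	n->next = l->head;
	l->head = n;
	l->generation++;
}

/*
 * Sum all values.  hook() stands in for an operation that may block and
 * let someone else modify the list.  The walk restarts only when the
 * generation changed across the hook; *restarts counts how often.
 */
long
glist_sum_with_hook(struct glist *l, void (*hook)(struct glist *),
    int *restarts)
{
	struct gnode *n;
	unsigned long gen;
	long sum;

	assert(restarts != NULL);
restart:
	sum = 0;
	for (n = l->head; n != NULL; n = n->next) {
		gen = l->generation;
		if (hook != NULL)
			hook(l);
		if (gen != l->generation) {
			/* The list changed while we "slept": n may be stale. */
			(*restarts)++;
			goto restart;
		}
		sum += n->value;
	}
	return (sum);
}

/* Example hook for experiments: modifies the list exactly once. */
static struct gnode extra = { NULL, 100 };
static int extra_added;

void
add_extra_once(struct glist *l)
{
	if (!extra_added) {
		extra_added = 1;
		glist_push(l, &extra);
	}
}
```

When the hook leaves the list alone, the generation check costs one comparison per step and the walk never restarts; a hook that modifies the list on every call would make this sketch restart forever, so a real user would need a bound or a lock.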
dyson
b130b30c96 Tie up some loose ends in vnode/object management. Remove an unneeded
config option in pmap.  Fix a problem with faulting in pages.  Clean-up
some loose ends in swap pager memory management.

The system should be much more stable, but not all subtle bugs are fixed yet.
1998-01-17 09:17:02 +00:00
dyson
9a35ec7fec Fix another vnode leak. 1998-01-12 03:15:01 +00:00
dyson
d9d8bf6d30 Fix some vnode management problems, and better mgmt of vnode free list.
Fix the UIO optimization code.
Fix an assumption in vm_map_insert regarding allocation of swap pagers.
Fix an spl problem in the collapse handling in vm_object_deallocate.
When pages are freed from vnode objects, and the criteria for putting
the associated vnode onto the free list is reached, either put the
vnode onto the list, or put it onto an interrupt safe version of the
list, for further transfer onto the actual free list.
Some minor syntax changes changing pre-decs, pre-incs to post versions.
Remove a bogus timeout (that I added for debugging) from vn_lock.

PHK will likely still have problems with the vnode list management, and
so do I, but it is better than it was.
1998-01-12 01:46:33 +00:00
dyson
66da5c4f34 Disable io optimizations again, minor bug found, and will be fixed in
a few days.
1998-01-07 09:26:29 +00:00
dyson
cb2800cd94 Make our v_usecount vnode reference count work identically to the
original BSD code.  The association between the vnode and the vm_object
no longer includes reference counts.  The major difference is that
vm_object's are no longer freed gratuitously from the vnode, and so
once an object is created for the vnode, it will last as long as the
vnode does.

When a vnode object reference count is incremented, then the underlying
vnode reference count is incremented also.  The two "objects" are now
more intimately related, and so the interactions are now much less
complex.

Vnodes are now normally placed onto the free queue with an object still
attached.  The rundown of the object happens at vnode rundown time, and
happens with exactly the same filesystem semantics of the original VFS
code.  There is absolutely no need for vnode_pager_uncache and other
travesties like that anymore.

A side-effect of these changes is that SMP locking should be much simpler,
the I/O copyin/copyout optimizations work, NFS should be more ponderable,
and further work on layered filesystems should be less frustrating, because
of the totally coherent management of the vnode objects and vnodes.

Please be careful with your system while running this code, but I would
greatly appreciate feedback as soon as reasonably possible.
1998-01-06 05:26:17 +00:00
dyson
7bf56bd14a Add the vnode interlock back around vref. 1997-12-29 16:54:03 +00:00
dyson
8ab3ac77d2 Fix the decl of vfs_ioopt, allow LFS to compile again, fix a minor problem
with the object cache removal.
1997-12-29 01:03:55 +00:00
dyson
cd67bb82fe Lots of improvements, including restructuring the caching and management
of vnodes and objects.  There are some metadata performance improvements
that come along with this.  There are also a few prototypes added when
the need is noticed.  Changes include:

1) Cleaning up vref, vget.
2) Removal of the object cache.
3) Nuke vnode_pager_uncache and friends, because they aren't needed anymore.
4) Correct some missing LK_RETRY's in vn_lock.
5) Correct the page range in the code for msync.

Be gentle, and please give me feedback asap.
1997-12-29 00:25:11 +00:00
dyson
6bd1f74dcf Some performance improvements, and code cleanups (including changing our
expensive OFF_TO_IDX to btoc whenever possible.)
1997-12-19 09:03:37 +00:00
wollman
1a06d88098 Add support for poll(2) on files. vop_nopoll() now returns POLLNVAL
if one of the new poll types is requested; hopefully this will not break
any existing code.  (This is done so that programs have a dependable
way of determining whether a filesystem supports the extended poll types
or not.)

The new poll types added are:

	POLLWRITE - file contents may have been modified
	POLLNLINK - file was linked, unlinked, or renamed
	POLLATTRIB - file's attributes may have been changed
	POLLEXTEND - file was extended

Note that the internal operation of poll() means that it is impossible
for two processes to reliably poll for the same event (this could
be fixed but may not be worth it), so it is not possible to rewrite
`tail -f' to use poll at this time.
1997-12-15 03:09:59 +00:00
bde
975c3797b1 Staticized. 1997-11-22 08:35:46 +00:00
julian
ae22df605c Reviewed by: various.
Ever since I first saw the way the mount flags were used I've hated the
fact that modes, and events, internal and exported, and short-term
and long term flags are all thrown together. Finally it's annoyed me enough..
This patch to the entire FreeBSD tree adds a second mount flag word
to the mount struct. It is not exported to userspace. I have moved
some of the non-exported flags over to this word. This means that we now
have 8 free bits in the mount flags. There are another two that might
well move over, but which I'm not sure about.
The only user visible change would have been in pstat -v, except
that davidg has disabled it anyhow.
I'd still like to move the state flags and the 'command' flags
apart from each other.. e.g. MNT_FORCE really doesn't have the
same semantics as MNT_RDONLY, but that's left  for another day.
1997-11-12 05:42:33 +00:00
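
The split described in the commit above (ae22df605c) can be pictured as two separate flag words, only one of which is ever handed to userland. This is a rough sketch under assumptions: `struct mount_sketch`, `mnt_internal_flag`, and the `SK_*` constants and their values are made up for illustration and are not the definitions used in the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values; not the real MNT_* definitions. */
#define SK_MNT_RDONLY    0x00000001u /* exported: visible to userland */
#define SK_MNT_NOEXEC    0x00000004u /* exported */
#define SK_MNTK_UNMOUNT  0x00000001u /* internal: unmount in progress */
#define SK_MNTK_WANTED   0x00000002u /* internal: a waiter is sleeping */

struct mount_sketch {
	uint32_t mnt_flag;          /* exported flag word */
	uint32_t mnt_internal_flag; /* second word, never copied to userland */
};

/* What a statfs-like call would report: only the exported word. */
uint32_t
mount_exported_flags(const struct mount_sketch *mp)
{
	assert(mp != NULL);
	return (mp->mnt_flag);
}

/* Internal state lives in its own word, so bit values may overlap. */
int
mount_unmounting(const struct mount_sketch *mp)
{
	assert(mp != NULL);
	return ((mp->mnt_internal_flag & SK_MNTK_UNMOUNT) != 0);
}
```

Because the two words are independent, moving an internal flag out of the exported word frees its bit there, which is where the "8 free bits" in the message come from.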
phk
4d26888936 Remove a bunch of variables which were unused both in GENERIC and LINT.
Found by:	-Wunused
1997-11-07 08:53:44 +00:00
phk
e3cdaf12b2 VFS interior redecoration.
Rename vn_default_error to vop_defaultop all over the place.
Move vn_bwrite from vfs_bio.c to vfs_default.c and call it vop_stdbwrite.
Use vop_null instead of nullop.
Move vop_nopoll from vfs_subr.c to vfs_default.c
Move vop_sharedlock from vfs_subr.c to vfs_default.c
Move vop_nolock from vfs_subr.c to vfs_default.c
Move vop_nounlock from vfs_subr.c to vfs_default.c
Move vop_noislocked from vfs_subr.c to vfs_default.c
Use vop_ebadf instead of *_ebadf.
Add vop_defaultop for getpages on master vnode in MFS.
1997-10-26 20:55:39 +00:00
phk
36e7a51ea1 Last major round (Unless Bruce thinks of something :-) of malloc changes.
Distribute all but the most fundamental malloc types.  This time I also
remembered the trick to making things static:  Put "static" in front of
them.

A couple of finer points by:	bde
1997-10-12 20:26:33 +00:00
phk
645e7b2ab6 Distribute and staticize a lot of the malloc M_* types.
Substantial input from:	bde
1997-10-11 18:31:40 +00:00
phk
dde8490d33 Dike out a weird warning. 1997-10-11 07:34:27 +00:00
phk
3783a8767e I lost a bit of my change in the last commit, this is more like it.
Noticed by:	bde
1997-09-26 08:08:58 +00:00
phk
099d96fad2 Reduce the target number of vnodes on the freelist from desiredvnodes
(usually a couple of thousand) to 25.  The measured impact on cache-hits
doesn't justify spending memory this way:

Target number of free vnodes versus namecache hit rate in % during a
make world:
          10    98.5316
         200    98.5479
         500    98.5546
        1000    98.5709
        3000    98.6006
        4000    98.6126
1997-09-25 16:17:57 +00:00
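
To put the table in the commit above (099d96fad2) in perspective, the gain between its first and last rows can be computed directly. The sketch below only does arithmetic on the figures quoted in the message; `struct sample` and `hit_rate_gain` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct sample {
	int target_free; /* target number of free vnodes */
	double hit_pct;  /* namecache hit rate in percent */
};

/* Measurements copied from the commit message above. */
static const struct sample samples[] = {
	{ 10, 98.5316 }, { 200, 98.5479 }, { 500, 98.5546 },
	{ 1000, 98.5709 }, { 3000, 98.6006 }, { 4000, 98.6126 },
};

/* Hit-rate gain, in percentage points, between the first and last row. */
double
hit_rate_gain(void)
{
	size_t n = sizeof(samples) / sizeof(samples[0]);

	assert(n >= 2);
	return (samples[n - 1].hit_pct - samples[0].hit_pct);
}
```

Raising the target from 10 to 4000 free vnodes buys roughly 0.08 percentage points of hit rate, which is the basis for the message's conclusion that the memory is better spent elsewhere.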
phk
4d1dd5bc33 A couple of handles to tweak, more statistics. 1997-09-24 07:46:54 +00:00
bde
1062c10a86 Fixed gratuitous ANSIisms. 1997-09-16 11:44:05 +00:00
peter
1ffbda9a9e Provide a 'return true' poll vnode op rather than duplicating the
'do nothing' case all over the various filesystems.
1997-09-14 02:49:06 +00:00
peter
723553368e print correct function name in a panic (vop_nolock -> vop_sharedlock) 1997-09-13 15:02:28 +00:00
bde
bcade9a903 Removed yet more vestiges of config-time swap configuration and/or
cleaned up nearby cruft.
1997-09-07 16:21:11 +00:00
bde
fc775e3711 Removed vestiges of config-time "argument processing" configuration. 1997-09-07 13:49:56 +00:00
phk
c202b61548 Hmm, this is hopefully better. 1997-09-03 13:29:41 +00:00
phk
cbe64dd591 Revert the v_usecount handling in relation to VOP_INACTIVE. 1997-09-03 09:18:48 +00:00
bde
6ffb8bf9af Removed unused #includes. 1997-09-02 20:06:59 +00:00
phk
0b3a12b83e Change the 0xdeadb hack to a flag called VDOOMED.
Introduce VFREE which indicates that vnode is on freelist.
Rename vholdrele() to vdrop().
Create vfree() and vbusy() to add/delete vnode from freelist.
Add vfree()/vbusy() to keep (v_holdcnt != 0 || v_usecount != 0)
  vnodes off the freelist.
Generalize vhold()/v_holdcnt to mean "do not recycle".
Fix reassignbuf()s lack of use of vhold().
Use vhold() instead of checking v_cache_src list.
Remove vtouch(), the vnodes are always vget'ed soon enough
  after for it to have any measurable effect.
Add sysctl debug.freevnodes to keep track of things.
Move cache_purge() up in getnewvnodes to avoid race.
Decrement v_usecount after VOP_INACTIVE(), put a vhold() on
  it during VOP_INACTIVE()
Unmacroize vhold()/vdrop()
Print out VDOOMED and VFREE flags (XXX: should use %b)

Reviewed by:		dyson
1997-08-31 07:32:39 +00:00
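
The rule in the commit above (0b3a12b83e) that vnodes with `v_holdcnt != 0 || v_usecount != 0` stay off the freelist can be modelled in a few lines. This is a simplified userland sketch, not the kernel implementation: the `sk_` functions, `struct vnode_sketch`, and the `SK_VFREE` value are assumptions, and the freelist itself is reduced to a counter standing in for debug.freevnodes.

```c
#include <assert.h>

#define SK_VFREE 0x1 /* hypothetical flag value: vnode is on the freelist */

struct vnode_sketch {
	int v_usecount; /* active references */
	int v_holdcnt;  /* "do not recycle" holds */
	int v_flag;
};

static int freevnodes; /* models the debug.freevnodes counter */

/* Put an unreferenced, unheld vnode on the freelist. */
void
sk_vfree(struct vnode_sketch *vp)
{
	assert(vp->v_usecount == 0 && vp->v_holdcnt == 0);
	assert((vp->v_flag & SK_VFREE) == 0);
	vp->v_flag |= SK_VFREE;
	freevnodes++;
}

/* Take a vnode off the freelist. */
void
sk_vbusy(struct vnode_sketch *vp)
{
	assert((vp->v_flag & SK_VFREE) != 0);
	vp->v_flag &= ~SK_VFREE;
	freevnodes--;
}

/* A hold means "do not recycle", so a held vnode leaves the freelist. */
void
sk_vhold(struct vnode_sketch *vp)
{
	if (vp->v_flag & SK_VFREE)
		sk_vbusy(vp);
	vp->v_holdcnt++;
}

/* Dropping the last hold on an unused vnode returns it to the freelist. */
void
sk_vdrop(struct vnode_sketch *vp)
{
	assert(vp->v_holdcnt > 0);
	vp->v_holdcnt--;
	if (vp->v_holdcnt == 0 && vp->v_usecount == 0)
		sk_vfree(vp);
}
```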
bde
c312d330af Restored rev.1.92 which was clobbered by the previous commit. 1997-08-26 11:59:20 +00:00
dyson
b90433b1a9 Back out some incorrect changes that were worse than the original bug. 1997-08-26 04:36:27 +00:00
dyson
042ae4067b This is a trial improvement for the vnode reference count while on the vnode
free list problem.  Also, the vnode age flag is no longer used by the
vnode pager.  (It is actually incorrect to use then.)  Constructive
feedback welcome -- just be kind.
1997-08-22 03:56:37 +00:00
bde
6be005551f #include <machine/limits.h> explicitly in the few places that it is required. 1997-08-21 20:33:42 +00:00
wollman
4542c1cf5d Fix all areas of the system (or at least all those in LINT) to avoid storing
socket addresses in mbufs.  (Socket buffers are the one exception.)  A number
of kernel APIs needed to get fixed in order to make this happen.  Also,
fix three protocol families which kept PCBs in mbufs to now malloc them
instead.  Delete some old compatibility cruft while we're at it, and add
some new routines in the in_cksum family.
1997-08-16 19:16:27 +00:00
dyson
4811e46aa5 Fix a problem with the vfs vnode caching that it doesn't grow quickly
enough and can cause some strange performance problems.  Specifically, at
or near startup time is when the problem is worst.  To reproduce
the problem, run "lat_syscall stat" from the alpha lmbench code right
after bootup.  A positive side effect of this mod is that the name
cache can be set to grow again by sysctl.  A noticable positive
performance impact is realized due to a larger namecache being available
as needed (or tuned.)
1997-08-04 07:43:28 +00:00
dfr
df8d8e5713 Merge WebNFS support from NetBSD
Obtained from:	NetBSD
1997-07-17 07:17:33 +00:00
dyson
8786565a86 Remove a window during running down a file vnode. Also, the OBJ_DEAD
flag wasn't being respected during vref(), et al.  Note that this
isn't the eventual fix for the locking problem.  Fine grained SMP
in the VM and VFS code will require (lots) more work.
1997-06-22 03:00:24 +00:00
dg
a1414ec53a Disabled the kern.vnode sysctl variable. It's causing system crashes on
large systems and needs to be rethought or removed wholesale.
1997-06-10 02:48:08 +00:00
phk
d8e3734a09 Fix a race condition that did, after all, exist.
Reviewed by:	phk
Submitted by:	dfr
1997-05-06 15:19:38 +00:00
phk
aa8738a5f3 1. Add a {pointer, v_id} pair to the vnode to store the reference to the
".." vnode.  This is cheaper storagewise than keeping it in the
    namecache, and it makes more sense since it's a 1:1 mapping.

2.  Also handle the case of "." more intelligently rather than stuff
    the namecache with pointless entries.

3.  Add two lists to the vnode and hang namecache entries which go from
    or to this vnode.  When cleaning a vnode, delete all namecache
    entries it invalidates.

4.  Never reuse namecache entries, malloc new ones when we need it, free
    old ones when they die.  No longer a hard limit on how many we can
    have.

5.  Remove the upper limit on namelength of namecache entries.

6.  Make a global list for negative namecache entries, limit their number
    to a sysctl'able (debug.ncnegfactor) fraction of the total namecache.
    Currently the default fraction is 1/16th.  (Suggestions for better
    default wanted!)

7.  Assign v_id correctly in the face of 32bit rollover.

8.  Remove the LRU list for namecache entries, not needed.  Remove the
    #ifdef NCH_STATISTICS stuff, it's not needed either.

9.  Use the vnode freelist as a true LRU list, also for namecache accesses.

10. Reuse vnodes more aggressively but also more selectively, if we can't
    reuse, malloc a new one.  There is no longer a hard limit on their
    number, they grow to the point where we don't reuse potentially
    usable vnodes.  A vnode will not get recycled if it still has pages in
    core or if it is the source of namecache entries (Yes, this does
    indeed work :-)  "." and ".." are not namecache entries any longer...)

11. Do not overload the v_id field in namecache entries with whiteout
    information, use a char sized flags field instead, so we can get
    rid of the vpid and v_id fields from the namecache struct.  Since
    we're linked to the vnodes and purged when they're cleaned, we don't
    have to check the v_id any more.

12. NFS knew about the limitation on name length in the namecache, it
    shouldn't and doesn't now.

Bugs:
        The namecache statistics no longer include hits for ".."
        and ".".

Performance impact:
        Generally in the +/- 0.5% for "normal" workstations, but
        I hope this will allow the system to be selftuning over a
        bigger range of "special" applications.  The case where
        RAM is available but unused for cache because we don't have
        any vnodes should be gone.

Future work:
        Straighten out the namecache statistics.

        "desiredvnodes" is still used to (bogusly ?) size hash
        tables in the filesystems.

        I have still to find a way to safely free unused vnodes
        back so their number can shrink when not needed.

        There are a few uses of the v_id field left in the filesystems,
        scheduled for demolition at a later time.

        Maybe a one slot cache for unused namecache entries should
        be implemented to decrease the malloc/free frequency.
1997-05-04 09:17:38 +00:00
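
Item 6 of the commit above (aa8738a5f3) limits negative namecache entries to a sysctl-tunable fraction of the whole cache, 1/16th by default. A minimal sketch of such a check follows; `neg_over_limit` and its parameters are hypothetical, and only the factor of 16 and the debug.ncnegfactor name come from the message.

```c
#include <assert.h>

/*
 * Models the debug.ncnegfactor sysctl: negative entries may make up at
 * most 1/ncnegfactor of the whole namecache.
 */
static unsigned long ncnegfactor = 16;

/*
 * Return nonzero when the negative entries exceed their share and the
 * oldest one should be dropped.  Multiplying instead of dividing avoids
 * losing precision to integer division.
 */
int
neg_over_limit(unsigned long numneg, unsigned long numcache)
{
	assert(ncnegfactor > 0);
	return (numneg * ncnegfactor > numcache);
}
```

With 1000 entries in total, 62 negative entries are within the limit and 63 are not.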
dyson
43ef323eb0 Staticize an unnecessarily global function: vputrele.
Submitted by:	 Michael Hancock <michaelh@cet.co.jp>
1997-04-30 03:09:15 +00:00
peter
74f136022f copyin the export network mask to the correct variable.
Submitted by: Mike Hibler <mike@marker.cs.utah.edu>, PR#3380
1997-04-25 06:47:12 +00:00
dfr
60008c7902 Add a function vop_sharedlock which is a copy of vop_nolock without the
implementation #ifdef out.  This can be used for now by NFS.  As soon
as all the other filesystems' locking is fixed, this can go away.

Print the vnode address in vprint for easier debugging.
1997-04-04 17:46:21 +00:00
bde
da270d0a3c Use OID_AUTO instead of magic number for the Lite2 sysctl debug.busyprt.
Removed declaration of vfs_unmountroot() again.

Staticized vgonel().
1997-04-01 13:05:34 +00:00
dg
1a326a5d28 Fixed splbio problems in vinvalbuf. Closes PR#2875, although fixed
differently by me.
1997-03-05 04:54:54 +00:00
bde
61a92b4f52 Attach vfs_sysctl() one level lower so that only the levels below
VFS_GENERIC aren't done in the FreeBSD way.  The previous commit
broke the nfs sysctls.
1997-03-04 18:31:56 +00:00
bde
5fc94677bd Merged Lite2's vfs_sysctl(). It doesn't fit very well into FreeBSD's
(phk's) sysctl framework, and I needed special code to disambiguate
the VFS_GENERIC node from the VFS_VFSCONF leaf, so I only converted
the leaves to the FreeBSD framework.  The error handling isn't quite
right.  CSRG's sysctls seem to return ENOTDIR too much and FreeBSD's
sysctls don't agree with the man page.
1997-03-03 12:58:20 +00:00