Commit Graph

499 Commits

Author SHA1 Message Date
dillon
9b8aaaa737 Reviewed by: Julian Elischer <julian@whistle.com>
Add d_parms() to {c,b}devsw[].  If non-NULL this function points to
    a device routine that will properly fill in the specinfo structure.
    vfs_subr.c's checkalias() supplies appropriate defaults.  This change
    should be fully backwards compatible with existing devices.
1999-02-25 05:22:30 +00:00
luoqi
082d37c1ac Hide access to vmspace:vm_pmap with inline function vmspace_pmap(). This
is the preparation step for moving pmap storage out of vmspace proper.

Reviewed by:	Alan Cox	<alc@cs.rice.edu>
		Matthew Dillion	<dillon@apollo.backplane.com>
1999-02-19 14:25:37 +00:00
dillon
98732ec693 Remove MAP_ENTRY_IS_A_MAP 'share' maps. These maps were once used to
attempt to optimize forks but were essentially given-up on due to
    problems and replaced with an explicit dup of the vm_map_entry structure.
    Prior to the removal, they were entirely unused.
1999-02-07 21:48:23 +00:00
jdp
de45986f5a Correct a format mismatch on 64-bit architectures. This should
fix the erroneous values in the procfs "map" file on the Alpha.
1999-02-05 06:18:54 +00:00
dillon
975fba8a24 Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile
1999-01-28 00:57:57 +00:00
dillon
4cc7d69521 Fix but in devfs_strategy(). Switch cases were falling through
instead of breaking out, so a VCHR devices would run the VCHR
    routine and then fall through and run the VBLK routine.  Fixed.
1999-01-27 23:49:45 +00:00
dillon
dbf5cd2b57 Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile
1999-01-27 22:42:27 +00:00
dillon
df24433bbe This is a rather large commit that encompasses the new swapper,
changes to the VM system to support the new swapper, VM bug
    fixes, several VM optimizations, and some additional revamping of the
    VM code.  The specific bug fixes will be documented with additional
    forced commits.  This commit is somewhat rough in regards to code
    cleanup issues.

Reviewed by:	"John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
1999-01-21 08:29:12 +00:00
eivind
62a8838875 Remove declarations for undefined functions and a couple of unused
enotsupp implementations.
1999-01-12 11:49:30 +00:00
peter
1c3fe295c3 A partial implementation of the procfs cmdline pseudo-file. This
is enough to satisfy things like StarOffice.  This is a hack, but doing
it properly would be a LOT of work, and would require extensive grovelling
around in the user address space to find the argv[].

Obtained from: Mostly from Andrzej Bialecki <abial@nask.pl>.
1999-01-05 03:53:06 +00:00
bde
90744a1d3c Made this compile if UMAPFS_DIAGNOSTIC is defined. This has been broken
since before rev.1.1, so UMAPFS_DIAGNOSTIC should not be trusted.
UMAPFS_DIAGNOSTIC is commented out in LINT to hide various bugs.
1999-01-01 10:14:37 +00:00
eivind
f1509aea94 Fix possible NULL-pointer deref in error case (same as DEVFS). 1998-12-16 00:10:51 +00:00
eivind
3a5b7248f8 Avoid NULL-pointer dereference on error condition. 1998-12-15 23:46:59 +00:00
dillon
31ea12c336 Cleanup uninitialized-possibly-used (but really not) warnings 1998-12-14 05:00:59 +00:00
eivind
d2f9690e5c Rename one of the two devfs_link's to devfs_makelink. 1998-12-10 19:57:01 +00:00
archie
60d13c7a9d The "easy" fixes for compiling the kernel -Wunused: remove unreferenced static
and local variables, goto labels, and functions declared but not defined.
1998-12-07 21:58:50 +00:00
eivind
cae573fc34 '\0' is the most ugly NULL pointer constant I've ever seen in real code. 1998-12-07 02:47:46 +00:00
archie
982e80577d Examine all occurrences of sprintf(), strcat(), and str[n]cpy()
for possible buffer overflow problems. Replaced most sprintf()'s
with snprintf(); for others cases, added terminating NUL bytes where
appropriate, replaced constants like "16" with sizeof(), etc.

These changes include several bug fixes, but most changes are for
maintainability's sake. Any instance where it wasn't "immediately
obvious" that a buffer overflow could not occur was made safer.

Reviewed by:	Bruce Evans <bde@zeta.org.au>
Reviewed by:	Matthew Dillon <dillon@apollo.backplane.com>
Reviewed by:	Mike Spengler <mks@networkcs.com>
1998-12-04 22:54:57 +00:00
eivind
3cf3a6389e Staticize. 1998-11-26 18:50:24 +00:00
bde
4084e8c5f1 Return ENOTTY instead of EBADF for ioctls on dead vnodes. This fixes
tcsetpgrp() on controlling terminals that are no longer associated
with the session of the calling process, not to mention ioctl.2.
1998-11-22 09:19:07 +00:00
bde
51ad68ca2d Finished updating module event handlers to be compatible with
modeventhand_t.
1998-11-15 15:33:52 +00:00
peter
c30cd27faf "fix" a warning that has been bugging me for ages. Eliminate a couple
of temporary variables since they are only used once and their types
were the cause of the warnings.
1998-11-09 09:21:25 +00:00
peter
0bc0cd8404 Delete stray extern declaration for non-existing variables. 1998-11-09 07:03:04 +00:00
peter
7819a9450e Change the #ifdef UNION code into a callable hook. Arrange to have this
set up when unionfs is present, either statically or as a kld module.
1998-11-03 08:01:48 +00:00
peter
8ef35acf90 Use TAILQ macros for clean/dirty block list processing. Set b_xflags
rather than abusing the list next pointer with a magic number.
1998-10-31 15:31:29 +00:00
dg
20b2c33d9a Added a second argument, "activate" to the vm_page_unwire() call so that
the caller can select either inactive or active queue to put the page on.
1998-10-28 13:37:02 +00:00
bde
873d7be484 Removed redundant bitrotted checks for major numbers instead of updating
them.
1998-10-26 08:53:13 +00:00
sos
526ee0d833 Make devfs update the atime timestamp so that 'w' works when using
options DEVFS.
1998-09-30 20:33:46 +00:00
phk
f549419b9d various nits that didn't make it through the brucefilter. 1998-09-12 20:21:54 +00:00
bde
d886019b37 Oops, don't assume that the environment is normal in devfs_mount().
It isn't for the hidden mountpoint.  The static vfs's haven't been
attached then, so mp->mnt_vfc can't be valid.
1998-09-08 16:59:37 +00:00
bde
e170b2ba75 Removed statically configured mount type numbers (MOUNT_*) and all
references to them.

The change a couple of days ago to ignore these numbers in statically
configured vfsconf structs was slightly premature because the cd9660,
cfs, devfs, ext2fs, nfs vfs's still used MOUNT_* instead of the number
in their vfsconf struct.
1998-09-07 13:17:06 +00:00
phk
4630814c8b Add a new vnode op, VOP_FREEBLKS(), which filesystems can use to inform
device drivers about sectors no longer in use.

Device-drivers receive the call through d_strategy, if they have
D_CANFREE in d_flags.

This allows flash based devices to erase the sectors and avoid
pointlessly carrying them around in compactions.

Reviewed by:	Kirk Mckusick, bde
Sponsored by:	M-Systems (www.m-sys.com)
1998-09-05 14:13:12 +00:00
dfr
e2df972eb1 Cosmetic changes to the PAGE_XXX macros to make them consistent with
the other objects in vm.
1998-09-04 08:06:57 +00:00
phk
eb85c7e691 sort the prototypes 1998-08-25 17:48:54 +00:00
phk
2eb129c515 Last commit managed to get mangled somehow. 1998-08-24 18:23:18 +00:00
phk
84bd0e1571 Remove the last remaining evidence of B_TAPE.
Reclaim 3 unused bits in b_flags
1998-08-24 17:47:25 +00:00
bde
9ba1aae697 Enabled Lite2 fix for reading from dead ttys. 1998-08-23 11:43:29 +00:00
bde
9e27b29fba Use [u]intptr_t instead of [u_]long for casts between pointers and
integers.  Don't forget to cast to (void *) as well.
1998-08-16 01:21:52 +00:00
bde
0c01873b01 Fixed printf format errors. 1998-07-30 17:40:45 +00:00
alex
d48cc1feee Style fixes and a bug fix: don't remove the exit handler if unmount
fails.

Submitted by:	bde
1998-07-27 22:47:17 +00:00
alex
be85829abc A better solution to the rm_at_exit problem: Register the exit function
during first mount.  Unregister the exit function at last unmount.

Concept by:	sef
Reviewed by:	sef
Implemented by:	alex
1998-07-27 01:07:01 +00:00
alex
4ab63c85e2 Override the default VFS LKM dispatch functions so that a module
unload function can be provided (this is necessary to unregister
the at_exit handler).
1998-07-25 15:52:44 +00:00
bde
bd9ef8a24a Cast pointers to [u]intptr_t instead of to [unsigned] long. 1998-07-15 04:17:55 +00:00
bde
863d5c8b68 Cast pointers to uintptr_t/intptr_t instead of to u_long/long,
respectively.  Most of the longs should probably have been
u_longs, but this changes is just to prevent warnings about
casts between pointers and integers of different sizes, not
to fix poorly chosen types.
1998-07-15 02:32:35 +00:00
bde
f0b863f4b5 Fixed printf format errors. 1998-07-11 07:46:16 +00:00
bde
c0cb9588c8 Quick fix for type mismatches which were fatal if longs aren't 32
bits.  We used a private, wrong, version of `struct dirent' to help
break getdirentries(), and we use a silly check that the size of this
struct is a power of 2 to help break mount() if getdirentries() would
not work.  This fix just changes the struct to match `struct dirent'
(except for the name length).
1998-07-07 04:08:44 +00:00
julian
04d286f647 DEVFS completely bypasses the cdevsw and bdevsw tables now.
Each devfs node has (and has had fro a while) a pointer directly to
the correct cdefsw entry so just use it instead of doing the lookup.

There are several other places in the kernel that still use the tables
however, so they can't go away yet..
1998-07-05 23:10:22 +00:00
julian
0262543b5f There is no such thing any more as "struct bdevsw".
There is only cdevsw (which should be renamed in a later edit to deventry
or something). cdevsw contains the union of what were in both bdevsw an
cdevsw entries.  The bdevsw[] table stiff exists and is a second pointer
to the cdevsw entry of the device. it's major is in d_bmaj rather than
d_maj. some cleanup still to happen (e.g. dsopen now gets two pointers
to the same cdevsw struct instead of one to a bdevsw and one to a cdevsw).

rawread()/rawwrite() went away as part of this though it's not strictly
the same  patch, just that it involves all the same lines in the drivers.

cdroms no longer have write() entries (they did have rawwrite (?)).
tapes no longer have support for bdev operations.

Reviewed by: Eivind Eklund and Mike Smith
	Changes suggested by eivind.
1998-07-04 22:30:26 +00:00
julian
4363221ba2 VOP_STRATEGY grows an (struct vnode *) argument
as the value in b_vp is often not really what you want.
(and needs to be frobbed). more cleanups will follow this.
Reviewed by: Bruce Evans <bde@freebsd.org>
1998-07-04 20:45:42 +00:00
dt
c6c8ca45cd Remove "not hungly" panics. Cookies now used by the linux and ibcs2
emulators. The emulators assume that filesystem may just ignore cookies, and
handle this case correctly. So we just ignore cookies.

Also sync *_readdir "prototypes" with reality.
1998-06-25 16:54:41 +00:00
bde
403bdcb97b Removed unused includes. 1998-06-21 14:53:44 +00:00
bde
a336cb95ff Avoid a 64-bit division in procfs_readdir(). Fixed related overflows.
Check args using the same expression as in fdesc and kernfs.  The check
was actually already correct, modulo overflow.  It could be tightened
up to either allow huge (aligned) offsets, treating them as EOF, or
disallow all offsets beyond EOF.

Didn't fix invalid address calculation &foo[i] where i may be out of
bounds.

Didn't fix shooting of foot using a private unportable dirent struct.
1998-06-14 12:53:39 +00:00
bde
6ee3b26044 Avoid a 64-bit division in kernfs_readdir(). Fixed related overflows
and arg checking.
1998-06-14 12:34:42 +00:00
bde
cc3e00bf7a Avoid a 64-bit division in fdesc_readdir(). Fixed related overflows
and missing arg checking.

Panic instead of returning bogus error codes or forgetting to check
all cases if fdesc_readdir() gets called for a non-directory.  This
can't happen.
1998-06-14 08:46:41 +00:00
dfr
88b8a1bc7e Make these files compile. 1998-06-10 21:21:31 +00:00
alex
a3798e908e ENOPNOTSUPP --> EOPNOTSUPP
PR:		6906
Submitted by:	Steven G. Kargl <kargl@troutmask.apl.washington.edu>
1998-06-10 19:56:06 +00:00
peter
e18ee5bdc0 Don't silently accept attempts to change flags where they are not
supported.
1998-06-10 06:34:57 +00:00
dfr
1d5f38ac22 This commit fixes various 64bit portability problems required for
FreeBSD/alpha.  The most significant item is to change the command
argument to ioctl functions from int to u_long.  This change brings us
inline with various other BSD versions.  Driver writers may like to
use (__FreeBSD_version == 300003) to detect this change.

The prototype FreeBSD/alpha machdep will follow in a couple of days
time.
1998-06-07 17:13:14 +00:00
dyson
d26bce6481 Make flushing dirty pages work correctly on filesystems that
unexpectedly do not complete writes even with sync I/O requests.
This should help the behavior of mmaped files when using
softupdates (and perhaps in other circumstances also.)
1998-05-21 07:47:58 +00:00
tegge
9fdbafa2fe Disallow reading the current kernel stack. Only the user structure and
the current registers should be accessible.
Reviewed by:	David Greenman <dg@root.com>
1998-05-19 00:00:14 +00:00
sos
12211e2fa7 Cleanup after Garret, include unpch.h to get at various macros.. 1998-05-17 09:37:39 +00:00
msmith
964ce778b1 In the words of the submitter:
---------
Make callers of namei() responsible for releasing references or locks
instead of having the underlying filesystems do it.  This eliminates
redundancy in all terminal filesystems and makes it possible for stacked
transport layers such as umapfs or nullfs to operate correctly.

Quality testing was done with testvn, and lat_fs from the lmbench suite.

Some NFS client testing courtesy of Patrik Kudo.

vop_mknod and vop_symlink still release the returned vpp.  vop_rename
still releases 4 vnode arguments before it returns.  These remaining cases
will be corrected in the next set of patches.
---------

Submitted by:	Michael Hancock <michaelh@cet.co.jp>
1998-05-07 04:58:58 +00:00
msmith
c645da3999 As described by the submitter:
Reverse the VFS_VRELE patch.  Reference counting of vnodes does not need
to be done per-fs.  I noticed this while fixing vfs layering violations.
Doing reference counting in generic code is also the preference cited by
John Heidemann in recent discussions with him.

The implementation of alternative vnode management per-fs is still a valid
requirement for some filesystems but will be revisited sometime later,
most likely using a different framework.

Submitted by:	Michael Hancock <michaelh@cet.co.jp>
1998-05-06 05:29:41 +00:00
dyson
b5a79794cd Tighten up management of memory and swap space during map allocation,
deallocation cycles.  This should provide a measurable improvement
on swap and memory allocation on loaded systems.  It is unlikely a
complete solution.  Also, provide more map info with procfs.
Chuck Cranor spurred on this improvement.
1998-04-29 04:28:22 +00:00
julian
cb9166e241 Make the devfs SLICE option a standard type option.
(hopefully it will go away eventually anyhow)
1998-04-20 03:57:41 +00:00
julian
0796a5c56e Add changes and code to implement a functional DEVFS.
This code will be turned on with the TWO options
DEVFS and SLICE. (see LINT)
Two labels PRE_DEVFS_SLICE and POST_DEVFS_SLICE will deliniate these changes.

/dev will be automatically mounted by init (thanks phk)
on bootup. See /sys/dev/slice/slice.4 for more info.
All code should act the same without these options enabled.

Mike Smith, Poul Henning Kamp, Soeren, and a few dozen others

This code does not support the following:
bad144 handling.
Persistance. (My head is still hurting from the last time we discussed this)
ATAPI flopies are not handled by the SLICE code yet.

When this code is running, all major numbers are arbitrary and COULD
be dynamically assigned. (this is not done, for POLA only)
Minor numbers for disk slices ARE arbitray and dynamically assigned.
1998-04-19 23:32:49 +00:00
des
396b114475 Seventy-odd "its" / "it's" typos in comments fixed as per kern/6108. 1998-04-17 22:37:19 +00:00
bde
cd450d6714 Moved some #includes from <sys/param.h> nearer to where they are actually
used.
1998-03-28 10:33:27 +00:00
phk
00475b662a Add two new functions, get{micro|nano}time.
They are atomic, but return in essence what is in the "time" variable.
gettime() is now a macro front for getmicrotime().

Various patches to use the two new functions instead of the various
hacks used in their absence.

Some puntuation and grammer patches from Bruce.

A couple of XXX comments.
1998-03-26 20:54:05 +00:00
kato
0b33917719 If lowervp is NULLVP, vap was clobbered.
Submitted by:	Naofumi Honda <honda@Kururu.math.sci.hokudai.ac.jp>
Obtained from:	NetBSD/pc98
1998-03-17 08:47:50 +00:00
julian
c6b13ec28d Free the vnode in the failure case of vop_symlink()
Suggested by: Michaelh@cet.co.jp
1998-03-10 09:12:19 +00:00
julian
10c5ccc30a Reviewed by: dyson@freebsd.org (john Dyson), dg@root.com (david greenman)
Submitted by:	Kirk McKusick (mcKusick@mckusick.com)
Obtained from:  WHistle development tree
1998-03-08 09:59:44 +00:00
dyson
04efe65302 Initialize b_resid, and also print out better diagnostics on I/O
errors.  This will allow for better tracking of user error reports.
1998-03-08 08:46:18 +00:00
dyson
8ceb6160f4 This mega-commit is meant to fix numerous interrelated problems. There
has been some bitrot and incorrect assumptions in the vfs_bio code.  These
problems have manifest themselves worse on NFS type filesystems, but can
still affect local filesystems under certain circumstances.  Most of
the problems have involved mmap consistancy, and as a side-effect broke
the vfs.ioopt code.  This code might have been committed seperately, but
almost everything is interrelated.

1)	Allow (pmap_object_init_pt) prefaulting of buffer-busy pages that
	are fully valid.
2)	Rather than deactivating erroneously read initial (header) pages in
	kern_exec, we now free them.
3)	Fix the rundown of non-VMIO buffers that are in an inconsistent
	(missing vp) state.
4)	Fix the disassociation of pages from buffers in brelse.  The previous
	code had rotted and was faulty in a couple of important circumstances.
5)	Remove a gratuitious buffer wakeup in vfs_vmio_release.
6)	Remove a crufty and currently unused cluster mechanism for VBLK
	files in vfs_bio_awrite.  When the code is functional, I'll add back
	a cleaner version.
7)	The page busy count wakeups assocated with the buffer cache usage were
	incorrectly cleaned up in a previous commit by me.  Revert to the
	original, correct version, but with a cleaner implementation.
8)	The cluster read code now tries to keep data associated with buffers
	more aggressively (without breaking the heuristics) when it is presumed
	that the read data (buffers) will be soon needed.
9)	Change to filesystem lockmgr locks so that they use LK_NOPAUSE.  The
	delay loop waiting is not useful for filesystem locks, due to the
	length of the time intervals.
10)	Correct and clean-up spec_getpages.
11)	Implement a fully functional nfs_getpages, nfs_putpages.
12)	Fix nfs_write so that modifications are coherent with the NFS data on
	the server disk (at least as well as NFS seems to allow.)
13)	Properly support MS_INVALIDATE on NFS.
14)	Properly pass down MS_INVALIDATE to lower levels of the VM code from
	vm_map_clean.
15)	Better support the notion of pages being busy but valid, so that
	fewer in-transit waits occur.  (use p->busy more for pageouts instead
	of PG_BUSY.)  Since the page is fully valid, it is still usable for
	reads.
16)	It is possible (in error) for cached pages to be busy.  Make the
	page allocation code handle that case correctly.  (It should probably
	be a printf or panic, but I want the system to handle coding errors
	robustly.  I'll probably add a printf.)
17)	Correct the design and usage of vm_page_sleep.  It didn't handle
	consistancy problems very well, so make the design a little less
	lofty.  After vm_page_sleep, if it ever blocked, it is still important
	to relookup the page (if the object generation count changed), and
	verify it's status (always.)
18)	In vm_pageout.c, vm_pageout_clean had rotted, so clean that up.
19)	Push the page busy for writes and VM_PROT_READ into vm_pageout_flush.
20)	Fix vm_pager_put_pages and it's descendents to support an int flag
	instead of a boolean, so that we can pass down the invalidate bit.
1998-03-07 21:37:31 +00:00
dyson
db1e77e742 Fix certain kinds of block device operations. For example, tunefs on
a block device shouldn't crash the system anymore.
1998-03-04 06:44:59 +00:00
msmith
950d32131b The intent is to get rid of WILLRELE in vnode_if.src by making
a complement to all ops that return a vpp, VFS_VRELE.  This is
initially only for file systems that implement the following ops
that do a WILLRELE:

	vop_create, vop_whiteout, vop_mknod, vop_remove, vop_link,
	vop_rename, vop_mkdir, vop_rmdir, vop_symlink

This is initial DNA that doesn't do anything yet.  VFS_VRELE is
implemented but not called.

A default vfs_vrele was created for fs implementations that use the
standard vnode management routines.

VFS_VRELE implementations were made for the following file systems:

Standard (vfs_vrele)
	ffs mfs nfs msdosfs devfs ext2fs

Custom
	union umapfs

Just EOPNOTSUPP
	fdesc procfs kernfs portal cd9660

These implementations may change as VOP changes are implemented.

In the next phase, in the vop implementations calls to vrele and the vrele
part of vput will be moved to the top layer vfs_vnops and made visible
to all layers.  vput will be replaced by unlock in these cases.  Unlocking
will still be done in the per fs layer but the refcount decrement will be
triggered at the top because it doesn't hurt to hold a vnode reference a
little longer.  This will have minimal impact on the structure of the
existing code.

This will only be done for vnode arguments that are released by the various
fs vop implementations.

Wider use of VFS_VRELE will likely require restructuring of the code.

Reviewed by:	phk, dyson, terry et. al.
Submitted by:	Michael Hancock <michaelh@cet.co.jp>
1998-03-01 22:46:53 +00:00
kato
0712ea24bf Deleted KLOCK-hack. 1998-02-26 03:23:56 +00:00
kato
b83f83460c Deleted unused variable. 1998-02-10 08:04:31 +00:00
kato
50af713bea Undo UN_KLOCK hack except union_allocvp(). Now, vput() doesn't lock
the vnode.
1998-02-10 03:32:07 +00:00
eivind
d7a6ab2803 Staticize. 1998-02-09 06:11:36 +00:00
kato
0511880d25 Fixed pagefault when cred == NOCRED.
PR:		5632
1998-02-07 01:36:24 +00:00
kato
3fac3de4d1 Fixed number of entries in gid-mapfile.
PR:		5640
1998-02-07 01:34:32 +00:00
eivind
4547a09753 Back out DIAGNOSTIC changes. 1998-02-06 12:14:30 +00:00
kato
3dee083700 Workarround for DIAGNOSTIC kernel's panic in union_lookup().
Union_removed_upper() clobbers cache when file is removed.
Upper vp will be removed by union_reclaim().
1998-02-06 02:42:21 +00:00
dyson
ebccbfc1ff 1) Start using a cleaner and more consistant page allocator instead
of the various ad-hoc schemes.
2)	When bringing in UPAGES, the pmap code needs to do another vm_page_lookup.
3)	When appropriate, set the PG_A or PG_M bits a-priori to both avoid some
	processor errata, and to minimize redundant processor updating of page
	tables.
4)	Modify pmap_protect so that it can only remove permissions (as it
	originally supported.)  The additional capability is not needed.
5)	Streamline read-only to read-write page mappings.
6)	For pmap_copy_page, don't enable write mapping for source page.
7)	Correct and clean-up pmap_incore.
8)	Cluster initial kern_exec pagin.
9)	Removal of some minor lint from kern_malloc.
10)	Correct some ioopt code.
11)	Remove some dead code from the MI swapout routine.
12)	Correct vm_object_deallocate (to remove backing_object ref.)
13)	Fix dead object handling, that had problems under heavy memory load.
14)	Add minor vm_page_lookup improvements.
15)	Some pages are not in objects, and make sure that the vm_page.c can
	properly support such pages.
16)	Add some more page deficit handling.
17)	Some minor code readability improvements.
1998-02-05 03:32:49 +00:00
eivind
c552a9a1c3 Turn DIAGNOSTIC into a new-style option. 1998-02-04 22:34:03 +00:00
kato
f577f8592b Declare the variable `i' when UMAP_DIAGNOSTIC is defined. 1998-02-03 14:30:01 +00:00
eivind
712a1e61e7 Make the debug options new-style.
This also zaps a DPT option from lint; it wasn't referenced from
anywhere.
1998-01-31 07:23:16 +00:00
kato
a362d5f691 Fixed typo in comment. 1998-01-25 09:44:33 +00:00
dyson
197bd655c4 VM level code cleanups.
1)	Start using TSM.
	Struct procs continue to point to upages structure, after being freed.
	Struct vmspace continues to point to pte object and kva space for kstack.
	u_map is now superfluous.
2)	vm_map's don't need to be reference counted.  They always exist either
	in the kernel or in a vmspace.  The vmspaces are managed by reference
	counts.
3)	Remove the "wired" vm_map nonsense.
4)	No need to keep a cache of kernel stack kva's.
5)	Get rid of strange looking ++var, and change to var++.
6)	Change more data structures to use our "zone" allocator.  Added
	struct proc, struct vmspace and struct vnode.  This saves a significant
	amount of kva space and physical memory.  Additionally, this enables
	TSM for the zone managed memory.
7)	Keep ioopt disabled for now.
8)	Remove the now bogus "single use" map concept.
9)	Use generation counts or id's for data structures residing in TSM, where
	it allows us to avoid unneeded restart overhead during traversals, where
	blocking might occur.
10)	Account better for memory deficits, so the pageout daemon will be able
	to make enough memory available (experimental.)
11)	Fix some vnode locking problems. (From Tor, I think.)
12)	Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp.
	(experimental.)
13)	Significantly shrink, cleanup, and make slightly faster the vm_fault.c
	code.  Use generation counts, get rid of unneded collpase operations,
	and clean up the cluster code.
14)	Make vm_zone more suitable for TSM.

This commit is partially as a result of discussions and contributions from
other people, including DG, Tor Egge, PHK, and probably others that I
have forgotten to attribute (so let me know, if I forgot.)

This is not the infamous, final cleanup of the vnode stuff, but a necessary
step.  Vnode mgmt should be correct, but things might still change, and
there is still some missing stuff (like ioopt, and physical backing of
non-merged cache files, debugging of layering concepts.)
1998-01-22 17:30:44 +00:00
kato
37c49e0fb8 Delete unused code in union_fsync(). 1998-01-22 02:14:59 +00:00
kato
d49fde64e8 - Move SETKLOC and CLEARKLOCK macros into uion.h.
- Set UN_ULOCK in union_lock() when UN_KLOCK is set.  Caller expects
  that vnode is locked correctly, and may call another function which
  expects locked vnode and may unlock the vnode.
- Do not assume the behavior of inside functions in FreeBSD's
  vfs_suber.c is same as 4.4BSD-Lite2.  Vnode may be locked in
  vget() even though flag is zero.  (Locked vnode is, of course,
  unlocked before returning from vget.)
1998-01-20 10:02:54 +00:00
kato
a76ade94a2 Workarround for locking violation while recycling vnode which union fs
used in freelist.
1998-01-18 08:17:48 +00:00
kato
5ff1564039 Improve and revise fixes for locking violation.
Obtained from:	NetBSD/pc98
1998-01-18 07:56:41 +00:00
dyson
cb2800cd94 Make our v_usecount vnode reference count work identically to the
original BSD code.  The association between the vnode and the vm_object
no longer includes reference counts.  The major difference is that
vm_object's are no longer freed gratuitiously from the vnode, and so
once an object is created for the vnode, it will last as long as the
vnode does.

When a vnode object reference count is incremented, then the underlying
vnode reference count is incremented also.  The two "objects" are now
more intimately related, and so the interactions are now much less
complex.

When vnodes are now normally placed onto the free queue with an object still
attached.  The rundown of the object happens at vnode rundown time, and
happens with exactly the same filesystem semantics of the original VFS
code.  There is absolutely no need for vnode_pager_uncache and other
travesties like that anymore.

A side-effect of these changes is that SMP locking should be much simpler,
the I/O copyin/copyout optimizations work, NFS should be more ponderable,
and further work on layered filesystems should be less frustrating, because
of the totally coherent management of the vnode objects and vnodes.

Please be careful with your system while running this code, but I would
greatly appreciate feedback as soon a reasonably possible.
1998-01-06 05:26:17 +00:00
sef
082257799e Use CHECKIO in procfs_ioctl() to ensure that any changes in UID/GID result
in the expected failure.
1998-01-06 01:37:12 +00:00
julian
8895d4eccd add copyrights 1998-01-02 07:31:07 +00:00
bde
9c98de2bba Fixed missing initialization of mp->mnt_stat. At least vm depends on
at least mp->mnt_stat.f_iosize being nonzero.

PR:		5212
1998-01-01 08:28:26 +00:00
bde
05d3a8c532 Fixed a missing/misplaced/misstyled prototype. 1997-12-30 08:46:44 +00:00
dyson
cd67bb82fe Lots of improvements, including restructring the caching and management
of vnodes and objects.  There are some metadata performance improvements
that come along with this.  There are also a few prototypes added when
the need is noticed.  Changes include:

1) Cleaning up vref, vget.
2) Removal of the object cache.
3) Nuke vnode_pager_uncache and friends, because they aren't needed anymore.
4) Correct some missing LK_RETRY's in vn_lock.
5) Correct the page range in the code for msync.

Be gentle, and please give me feedback asap.
1997-12-29 00:25:11 +00:00
bde
3c1b6940fc Unspammed nested include of <vm/vm_zone.h>. 1997-12-27 02:56:39 +00:00
sef
f4669f67bc Clear the p_stops field on change of user/group id, unless the correct
flag is set in the p_pfsflags field.  This, essentially, prevents an SUID
proram from hanging after being traced.  (E.g., "truss /usr/bin/rlogin" would
fail, but leave rlogin in a stopevent state.)  Yet another case where procctl
is (hopefully ;)) no longer needed in the general case.

Reviewed by:	bde (thanks bruce :))
1997-12-20 03:05:47 +00:00
bde
4191aa2363 Set the sender's low watermark to match the maximum size for atomic
writes that we advertise (PIPE_BUF = 512).
1997-12-19 18:58:14 +00:00
wollman
1a06d88098 Add support for poll(2) on files. vop_nopoll() now returns POLLNVAL
if one of the new poll types is requested; hopefully this will not break
any existing code.  (This is done so that programs have a dependable
way of determining whether a filesystem supports the extended poll types
or not.)

The new poll types added are:

	POLLWRITE - file contents may have been modified
	POLLNLINK - file was linked, unlinked, or renamed
	POLLATTRIB - file's attributes may have been changed
	POLLEXTEND - file was extended

Note that the internal operation of poll() means that it is impossible
for two processes to reliably poll for the same event (this could
be fixed but may not be worth it), so it is not possible to rewrite
`tail -f' to use poll at this time.
1997-12-15 03:09:59 +00:00
bde
2c703bda5c Fixed EOF handing.
1. SS_CANTRCVMORE was initially set on the wrong socket, so reads
when there has never been a writer on the socket did not return 0.
Note that such reads are only possible if the fifo was opened in
(O_RDONLY | O_NONBLOCK) mode.

2. SS_CANTSENDMORE was initially set on the wrong socket, but this
was harmless because the wrong socket is never sent from and there
is no need to set the flag initially on the right socket (since open
in (O_WRONLY | O_NONBLOCK) mode fails if there is no reader...).

3. SS_CANTRCVMORE was cleared when read() returns.  This broke the
case where read() returns 0 - subsequent reads are supposed to
return 0 until a writer appears.  There is no need to clear the
flag when read() returns, since it is cleared correctly when a
writer appears.
1997-12-13 13:49:59 +00:00
bde
54238221e6 Restored fifo_pathconf() from rev.1.32. vop_stdpathconf() is too
general to be of much use.  Using it here weakened the _PC_MAX_CANON,
_PC_MAX_INPUT and _PC_VDISABLE cases.

fifo_pathconf() is not quite correct either.  _PC_CHOWN_RESTRICTED
and _PC_LINK_MAX should be handled by the host file system.  For
directories, the host file system should let us handle _PC_PIPE_BUF.
1997-12-13 12:58:09 +00:00
sef
f13ddbc865 Change the ioctls for procfs around a bit; in particular, whever possible,
change from

	ioctl(fd, PIOC<foo>, &i);

to

	ioctl(fd, PIOC<foo>, i);

This is going from the _IOW to _IO ioctl macro.  The kernel, procctl, and
truss must be in synch for it all to work (not doing so will get errors about
inappropriate ioctl's, fortunately).  Hopefully I didn't forget anything :).
1997-12-13 03:13:49 +00:00
sef
9fd84ad693 Fix a problem with procfs_exit() that resulted in missing some procfs
nodes; this also apparantly caused a panic in some circumstances.
Also, since procfs_exit() is getting rid of the nodes when a process
exits, don't bother checking for the process' existance in procfs_inactive().
1997-12-12 03:33:43 +00:00
sef
f33f94c5ae Code to prevent a panic caused by procfs_exit(). Note that i don't know
what is teh root cause -- but, sometimes, a procfs vnode in pfshead is
apparantly corrupt (or a UFS vnode instead).  Without this patch, I can
get it to panic by doing (in csh)

	while (1)
		ps auxwww
	end

and it will panic when the PID's wrap.  With it, it does not panic.
Yes -- I know that this is NOT the right way to fix it.  But I haven't
been able to get it to panic yet (which confuses me).  I am going to
be looking into the vgone() code now, as that may be a part of it.
1997-12-09 05:03:41 +00:00
sef
ed33823bf0 A couple of fixes from bruce: first of all, psignal is a void (stupid
me; unfortunately, also makes it hard ot check for errors); second, I had
managed to forget a change to PIOCSFL (it should be _IOW, not _IOR) I had
in my local copy, and Bruce called me on it.

Submitted by:	bde
1997-12-08 22:09:39 +00:00
sef
287c0e3604 Use at_exit() to invoke procfs_exit() instead of calling it directly.
Note that an unload facility should be used to call rm_at_exit() (if
procfs is being loaded as an LKM and is subsequently removed), but it
was non-obvious how to do this in the VFS framework.

Reviewed by:	Julian Elischer
1997-12-08 01:06:36 +00:00
sef
3442d56966 Clear the stop events and wakeup the process on teh last close of the
procfs/mem file.  While this doesn't prevent an unkillable process, it
means that a broken truss prorgam won't do it accidently now (well,
there's a small window of opportunity).  Note that this requires the
change to truss I am about to commit.
1997-12-07 04:01:03 +00:00
sef
c7d273eccb Changes to allow event-based process monitoring and control. 1997-12-06 04:11:14 +00:00
bde
efd51d84cf Don't include <sys/lock.h> in headers when only `struct simplelock' is
required.  Fixed everything that depended on the pollution.
1997-12-05 19:55:52 +00:00
phk
db4618638e Staticize. 1997-11-18 15:07:35 +00:00
tegge
5ba8a227e1 Don't try to obtain an excluive lock on the vm map, since a deadlock might
occur if the process owning the map is wiring pages.
1997-11-14 22:57:46 +00:00
julian
554abdcebf fix slight breakages from PHK's VFS work.
also remove irrelevant copyright, now that all that code has gone away.
1997-11-08 19:02:28 +00:00
phk
4d26888936 Remove a bunch of variables which were unused both in GENERIC and LINT.
Found by:	-Wunused
1997-11-07 08:53:44 +00:00
phk
4c8218a5c7 Move the "retval" (3rd) parameter from all syscall functions and put
it in struct proc instead.

This fixes a boatload of compiler warning, and removes a lot of cruft
from the sources.

I have not removed the /*ARGSUSED*/, they will require some looking at.

libkvm, ps and other userland struct proc frobbing programs will need
recompiled.
1997-11-06 19:29:57 +00:00
bde
2fa5d41745 KNFize rev.1.31. 1997-10-27 15:39:01 +00:00
bde
ebc42ddbe3 Use unique sleep message strings. 1997-10-27 15:33:04 +00:00
bde
e38eb2dd99 Removed unused #includes. The need for most of them went away with
recent changes (docluster* and vfs improvements).
1997-10-27 13:33:47 +00:00
phk
e3cdaf12b2 VFS interior redecoration.
Rename vn_default_error to vop_defaultop all over the place.
Move vn_bwrite from vfs_bio.c to vfs_default.c and call it vop_stdbwrite.
Use vop_null instead of nullop.
Move vop_nopoll from vfs_subr.c to vfs_default.c
Move vop_sharedlock from vfs_subr.c to vfs_default.c
Move vop_nolock from vfs_subr.c to vfs_default.c
Move vop_nounlock from vfs_subr.c to vfs_default.c
Move vop_noislocked from vfs_subr.c to vfs_default.c
Use vop_ebadf instead of *_ebadf.
Add vop_defaultop for getpages on master vnode in MFS.
1997-10-26 20:55:39 +00:00
roberto
cb0b2ed81c Fix the same leak as in nullfs. Now the lowervp is properly marked inactive.
Reviewed by:	phk
1997-10-21 21:08:17 +00:00
roberto
83a98c9cc8 Fix the file leak bug. The lower layer wasn't informed the vnode was inactive
and kept a reference, preventing the blocks to be reclaimed.

Changed the comment in null_inactive to reflect the current situation.

Reviewed by:	phk
1997-10-21 21:01:34 +00:00
phk
f82436f706 VFS clean up "hekto commit"
1.  Add defaults for more VOPs
        VOP_LOCK        vop_nolock
        VOP_ISLOCKED    vop_noislocked
        VOP_UNLOCK      vop_nounlock
    and remove direct reference in filesystems.

2.  Rename the nfsv2 vnop tables to improve sorting order.
1997-10-16 22:01:05 +00:00
phk
373a865574 Another VFS cleanup "kilo commit"
1.  Remove VOP_UPDATE, it is (also) an UFS/{FFS,LFS,EXT2FS,MFS}
    intereface function, and now lives in the ufsmount structure.

2.  Remove VOP_SEEK, it was unused.

3.  Add mode default vops:

    VOP_ADVLOCK          vop_einval
    VOP_CLOSE            vop_null
    VOP_FSYNC            vop_null
    VOP_IOCTL            vop_enotty
    VOP_MMAP             vop_einval
    VOP_OPEN             vop_null
    VOP_PATHCONF         vop_einval
    VOP_READLINK         vop_einval
    VOP_REALLOCBLKS      vop_eopnotsupp

    And remove identical functionality from filesystems

4.   Add vop_stdpathconf, which returns the canonical stuff.  Use
     it in the filesystems.  (XXX: It's probably wrong that specfs
     and fifofs sets this vop, shouldn't it come from the "host"
     filesystem, for instance ufs or cd9660 ?)

5.   Try to make system wide VOP functions have vop_* names.

6.   Initialize the um_* vectors in LFS.

(Recompile your LKMS!!!)
1997-10-16 20:32:40 +00:00
phk
d166441755 VFS mega cleanup commit (x/N)
1.  Add new file "sys/kern/vfs_default.c" where default actions for
    VOPs go. Implement proper defaults for ABORTOP, BWRITE, LEASE,
    POLL, REVOKE and STRATEGY.  Various stuff spread over the entire
    tree belongs here.

2.  Change VOP_BLKATOFF to a normal function in cd9660.

3.  Kill VOP_BLKATOFF, VOP_TRUNCATE, VOP_VFREE, VOP_VALLOC.  These
    are private interface functions between UFS and the underlying
    storage manager layer (FFS/LFS/MFS/EXT2FS).  The functions now
    live in struct ufsmount instead.

4.  Remove a kludge of VOP_ functions in all filesystems, that did
    nothing but obscure the simplicity and break the expandability.
    If a filesystem doesn't implement VOP_FOO, it shouldn't have an
    entry for it in its vnops table.  The system will try to DTRT
    if it is not implemented.  There are still some cruft left, but
    the bulk of it is done.

5.  Fix another VCALL in vfs_cache.c (thanks Bruce!)
1997-10-16 10:50:27 +00:00
julian
39af6ebf22 remove forgotten debug printf() 1997-10-16 08:18:38 +00:00
julian
8d5dbe5c95 1/ by default make all versions of the same device get the same vnode.
2/ Show the dummy mount in the mount list. it cannot be reached (that I know of)
	but puting it there, means that disks mounted from devfs will have their	things such as the superblock and the bitmaps, synced to disk :)
1997-10-16 07:28:50 +00:00
julian
1c62ef9858 some cleanups of init code, and changes needed to support disk layering. 1997-10-16 06:29:27 +00:00
phk
f7aabc3ac9 vnops megacommit
1.  Use the default function to access all the specfs operations.
2.  Use the default function to access all the fifofs operations.
3.  Use the default function to access all the ufs operations.
4.  Fix VCALL usage in vfs_cache.c
5.  Use VOCALL to access specfs functions in devfs_vnops.c
6.  Staticize most of the spec and fifofs vnops functions.
7.  Make UFS panic if it lacks bits of the underlying storage handling.
1997-10-15 13:24:07 +00:00
phk
92eeb70dc6 Hmm, realign the vnops into two columns. 1997-10-15 10:05:29 +00:00
phk
26130e0b77 Stylistic overhaul of vnops tables.
1. Remove comment stating the blatantly obvious.
        2. Align in two columns.
        3. Sort all but the default element alphabetically.
        4. Remove XXX comments pointing out entries not needed.
1997-10-15 09:22:02 +00:00
julian
5edc58c755 if we free all the links to a node, then by definition
we freed the name we used to find it..
SO DON'T free it again later!

pointy hat over here please..
1997-10-12 22:27:11 +00:00
phk
36e7a51ea1 Last major round (Unless Bruce thinks of somthing :-) of malloc changes.
Distribute all but the most fundamental malloc types.  This time I also
remembered the trick to making things static:  Put "static" in front of
them.

A couple of finer points by:	bde
1997-10-12 20:26:33 +00:00
phk
645e7b2ab6 Distribute and statizice a lot of the malloc M_* types.
Substantial input from:	bde
1997-10-11 18:31:40 +00:00
julian
5f527a96c0 Allow a deleted deveice to delte it's nodes in other mounted devfs
filesystems even if not in SPLIT_DEVS mode.
1997-10-10 07:54:05 +00:00
kato
fe9b86cf0b Clustered read and write are switched at mount-option level.
1. Clustered I/O is switched by the MNT_NOCLUSTERR and MNT_NOCLUSTERW
   bits of the mnt_flag.  The sysctl variables, vfs.foo.doclusterread
   and vfs.foo.doclusterwrite are deleted.  Only mount option can
   control clustered I/O from userland.
2. When foofs_mount mounts block device, foofs_mount checks D_CLUSTERR
   and D_CLUSTERW bits of the d_flags member in the block device switch
   table.  If D_NOCLUSTERR / D_NOCLUSTERW are set, MNT_NOCLUSTERR /
   MNT_NOCLUSTERW bits will be set.  In this case, MNT_NOCLUSTERR and
   MNT_NOCLUSTERW cannot be cleared from userland.
3. Vnode driver disables both clustered read and write.
4. Union filesystem disables clutered write.

Reviewed by:	bde
1997-09-27 13:40:20 +00:00
dyson
e64b1984f9 Change the M_NAMEI allocations to use the zone allocator. This change
plus the previous changes to use the zone allocator decrease the useage
of malloc by half.  The Zone allocator will be upgradeable to be able
to use per CPU-pools, and has more intelligent usage of SPLs.  Additionally,
it has reasonable stats gathering capabilities, while making most calls
inline.
1997-09-21 04:24:27 +00:00
phk
b60a60d3bc Executing binaries on a nullfs (or nullfs-based) filesystem results in
a trap.
PR:		3104
Reviewed by:	phk
Submitted by:	Dan Walters hannibal@cyberstation.net
1997-09-18 18:33:23 +00:00
julian
35b575df34 devfs changes to allow old (better) and newer (braindamaged) behaviour.
I'm going to try migrate back, while keeping the newer code.
1997-09-16 09:10:18 +00:00
peter
ce7feabb13 Convert select -> poll.
Delete 'always succeed' select/poll handlers, replaced with generic call.
Flag missing vnode op table entries.
1997-09-14 02:58:12 +00:00
bde
bcade9a903 Removed yet more vestiges of config-time swap configuration and/or
cleaned up nearby cruft.
1997-09-07 16:21:11 +00:00
bde
fc775e3711 Removed vestiges of config-time "argument processing" configuration. 1997-09-07 13:49:56 +00:00
bde
2075d422d2 Staticized. 1997-09-07 06:46:34 +00:00
bde
e499dfd06d Some staticized variables were still declared to be extern. 1997-09-07 05:27:26 +00:00
kato
6f2d2e6e94 Support read-only mount. 1997-09-04 03:14:49 +00:00
bde
6ffb8bf9af Removed unused #includes. 1997-09-02 20:06:59 +00:00
kato
b687a4a7e7 Include "opt_ddb.h" only when NULLFS_DIAGNOSTIC is defined. 1997-08-28 00:44:43 +00:00