1944 Commits

Author SHA1 Message Date
bde
c799e18e37 Updated generated files. 1998-06-08 11:08:35 +00:00
bde
11d8d54296 Fixed some style bugs in output (missing tabs and unparenthesized macros).
Fixed some style bugs in source (mostly, superfluous backslashes).
1998-06-08 11:02:00 +00:00
dfr
707faaecf6 Fix a typo which prevented i386 elf from working at all (including Linux
emulated elf binaries).
1998-06-08 09:19:35 +00:00
phk
8c3dc868d3 Add one more member function to the timecounters; this one is for use
with latch based PPS implementations.  The client that uses it will
be committed after more testing.
1998-06-07 20:36:55 +00:00
dfr
1d5f38ac22 This commit fixes various 64bit portability problems required for
FreeBSD/alpha.  The most significant item is to change the command
argument to ioctl functions from int to u_long.  This change brings us
in line with various other BSD versions.  Driver writers may like to
use (__FreeBSD_version == 300003) to detect this change.

The prototype FreeBSD/alpha machdep will follow in a couple of days
time.
1998-06-07 17:13:14 +00:00
phk
19aada2084 Add a "this" style argument and a "void *private" so timecounters can
figure out which instance to wount with.
1998-06-07 08:40:53 +00:00
bde
2048f8ec6c Don't attempt to copy the whole slices "struct" for DIOCGSLICEINFO.
The slices "struct" isn't really a struct; we allocate only part of
it in the fully dangerously dedicated case.  Since the "struct" is
malloced, the page beyond it may not be mapped, so attempts to copy
it would crash.  This problem became larger when the full struct was
bloated from < 1K to > 3K by the addition of (mostly unused) DEVFS
tokens some time before 2.2.0 was released.
1998-06-06 03:06:55 +00:00
dg
045446fee6 Moved limit frobbing (and the resulting limcopy()) that occurs for
accounting to the accounting function so that this isn't needlessly
done for some process exits.
Reviewed by:	bde,phk
1998-06-05 21:44:20 +00:00
dg
6ffbdd6b2e If we are out of mb_map space and we failed to m_reclaim() anything and
the alloc is not M_DONTWAIT, then panic with "Out of mbuf clusters".
Callers that specify M_WAIT can't deal with getting a NULL buffer, so this
is a more graceful failure than randomly page faulting in the socket code
or elsewhere.
1998-06-05 21:41:48 +00:00
dyson
5dbf701901 Correct sleep priority. 1998-06-02 05:39:13 +00:00
dufault
cfe0052eea Set PAGE_SIZE for _SC_PAGESIZE sysconf(). 1998-06-01 21:54:43 +00:00
peter
a4063ba4e2 Have the wakeup routine do the upcall if needed.
Obtained from: NetBSD
1998-05-31 18:38:43 +00:00
phk
d3d65c6b2e Some cleanups related to timecounters and weird ifdefs in <sys/time.h>.
Clean up (or if antipodic: down) some of the msgbuf stuff.

Use an inline function rather than a macro for timecounter delta.

Maintain process "on-cpu" time as 64 bits of microseconds to avoid
needless second rollover overhead.

Avoid calling microuptime the second time in mi_switch() if we do
not pass through _idle in cpu_switch()

This should reduce our context-switch overhead a bit, in particular
on pre-P5 and SMP systems.

WARNING:  Programs which muck about with struct proc in userland
will have to be fixed.

Reviewed, but found imperfect by:       bde
1998-05-28 09:30:28 +00:00
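The point above about keeping on-cpu time as 64 bits of microseconds can be illustrated with a small userland sketch. The names `proc_acct`, `account_switch`, and `runtime_split` are hypothetical and are not taken from the kernel source. The idea is only that accumulation becomes one 64-bit addition, with no seconds/microseconds carry on each context switch.

```c
#include <stdint.h>

/* Hypothetical accounting record (not the real struct proc). */
struct proc_acct {
	uint64_t runtime_us;	/* total on-cpu time in microseconds */
};

/* Add the slice that just ended: a single 64-bit addition, no carry logic. */
static void
account_switch(struct proc_acct *p, uint64_t in_us, uint64_t out_us)
{
	p->runtime_us += out_us - in_us;
}

/* Split into seconds and microseconds only when a consumer asks for it. */
static void
runtime_split(const struct proc_acct *p, uint64_t *sec, uint32_t *usec)
{
	*sec = p->runtime_us / 1000000u;
	*usec = (uint32_t)(p->runtime_us % 1000000u);
}
```

The division happens only on the reporting path, which is assumed here to be much rarer than the accumulation path.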
dyson
d26bce6481 Make flushing dirty pages work correctly on filesystems that
unexpectedly do not complete writes even with sync I/O requests.
This should help the behavior of mmaped files when using
softupdates (and perhaps in other circumstances also.)
1998-05-21 07:47:58 +00:00
dufault
9feb272372 1. Add new defs for mins and maxs for the POSIX flavor priorities. They
end up being the same, but it doesn't look like you're comparing
apples and oranges.

2. Use need_resched instead of reset_priority.  This isn't right
either, since for example you'll round-robin against equal priority FIFO
processes when lowering the priority of another process,
but this works better and a real fix needs to be in kern_synch and
not out here.

3. This is not a device driver: copyin/copyout the structure.
1998-05-19 21:11:53 +00:00
phk
159f68347c Change a data type internal to the timecounters, and remove the "delta"
function.

Reviewed, but not entirely approved by: bde
1998-05-19 18:55:02 +00:00
phk
00b3b49e1b Make the size of the msgbuf (dmesg) a "normal" option. 1998-05-19 08:58:53 +00:00
tegge
9fdbafa2fe Disallow reading the current kernel stack. Only the user structure and
the current registers should be accessible.
Reviewed by:	David Greenman <dg@root.com>
1998-05-19 00:00:14 +00:00
dufault
3620341ce9 1. Don't use "nosys" and generate coredumps for unconfigured
system calls - return ENOSYS per the spec.

2. Fix interface stub to set priority properly.
1998-05-18 12:53:45 +00:00
tegge
4347025be3 Add forwarding of roundrobin to other cpus. This gives a more regular
update of cpu usage as shown by top when one process is cpu bound
(no system calls) while the system is otherwise idle (except for top).

Don't attempt to switch to the BSP in boot().  If the system was idle when
an interrupt caused a panic, this won't work.  Instead, switch to the BSP
in cpu_reset.

Remove some spurious forward_statclock/forward_hardclock warnings.
1998-05-17 22:12:14 +00:00
bde
2203e2156d Fixed interval calculation in realitimexpire() again. Obtained from:
rev.1.9.  Broken in: rev.1.50.

Fixed a spelling error.  Obtained from: Lite2.
1998-05-17 20:13:01 +00:00
bde
6bcab2370a Fixed stale references to hzto() in comments. 1998-05-17 20:08:05 +00:00
tegge
93d053207e Supply the correct process argument to dounmount when possible. 1998-05-17 19:38:55 +00:00
tegge
0b804fd802 For SMP, use prv_PPAGE1/prv_PMAP1 instead of PADDR1/PMAP1.
get_ptbase and pmap_pte_quick no longer generate IPIs.
This should reduce the number of IPIs during heavy paging.
1998-05-17 18:53:19 +00:00
phk
86337bf437 s/nanoruntime/nanouptime/g
s/microruntime/microuptime/g

Reviewed by:	bde
1998-05-17 11:53:46 +00:00
wollman
bbc4497ada Convert socket structures to be type-stable and add a version number.
Define a parameter which indicates the maximum number of sockets in a
system, and use this to size the zone allocators used for sockets and
for certain PCBs.

Convert PF_LOCAL PCB structures to be type-stable and add a version number.

Define an external format for information about socket structures and use
it in several places.

Define a mechanism to get all PF_LOCAL and PF_INET PCB lists through
sysctl(3) without blocking network interrupts for an unreasonable
length of time.  This probably still has some bugs and/or race
conditions, but it seems to work well enough on my machines.

It is now possible for `netstat' to get almost all of its information
via the sysctl(3) interface rather than reading kmem (changes to follow).
1998-05-15 20:11:40 +00:00
peter
ac94d525c8 Nuke signanosleep(). (I've left nanosleep1() separate from nanosleep()
as I don't want to mess with the multiple returns)
1998-05-14 11:31:08 +00:00
peter
d0cb29e22f regen after signanosleep nuke 1998-05-14 11:29:06 +00:00
peter
d5bf4ccfc2 deep-six signanosleep(). It sounded like a good idea at the time. 1998-05-14 11:28:11 +00:00
peter
3cca5ff2f6 Commit an old change that has been sitting around for a long while.
signanosleep() did not deal with signal masks properly.  This change was
based on a discussion with bde some time ago (at least 6 months or more).

signanosleep() should probably go away since it was never really used for
more than a few weeks and doesn't appear in released code.  It should
probably be killed before somebody uses it and it becomes a gratuitous
nonstandard feature.
1998-05-14 10:38:52 +00:00
bde
53cc68b743 Backed out previous commit. It is invalid to call d_ioctl() on
possibly non-open devices, and we don't want to restrict dumping
to swap devices anyway.  It is especially invalid to call d_ioctl()
in non-process context for panics.  d_psize() can be called on
non-open devices, at least on non-SLICED ones that support d_dump(),
and setdumpdev() has depended on this for a long time although it
is probably wrong, but even d_psize() can't be called in non-process
context - that's why dumpsys() depends on previously computed values
although these values may be stale.  The historical restriction to
devices with dkpart(dev) == SWAP_PART should go away.
1998-05-12 17:34:02 +00:00
dyson
ee396db7d3 Fix the futimes/undelete/utrace conflict with other BSDs. Note that
the only common usage of utrace (the possible problem with this
commit) is with malloc, so this should not be a real problem.  Add
the various NetBSD syscalls that allow full emulation of their
development environment.
1998-05-11 03:55:28 +00:00
dyson
fac78afe5c Attempt to set write combining mode for graphics devices. 1998-05-11 01:06:08 +00:00
msmith
964ce778b1 In the words of the submitter:
---------
Make callers of namei() responsible for releasing references or locks
instead of having the underlying filesystems do it.  This eliminates
redundancy in all terminal filesystems and makes it possible for stacked
transport layers such as umapfs or nullfs to operate correctly.

Quality testing was done with testvn, and lat_fs from the lmbench suite.

Some NFS client testing courtesy of Patrik Kudo.

vop_mknod and vop_symlink still release the returned vpp.  vop_rename
still releases 4 vnode arguments before it returns.  These remaining cases
will be corrected in the next set of patches.
---------

Submitted by:	Michael Hancock <michaelh@cet.co.jp>
1998-05-07 04:58:58 +00:00
julian
0cb054bfea Add dump support to the DEVFS/slice code.
now we can actually catch our crashes :-)

Submitted by: Luoqi Chen <luoqi@chen.ml.org> (the man who's everywhere)
1998-05-06 22:14:48 +00:00
msmith
c645da3999 As described by the submitter:
Reverse the VFS_VRELE patch.  Reference counting of vnodes does not need
to be done per-fs.  I noticed this while fixing vfs layering violations.
Doing reference counting in generic code is also the preference cited by
John Heidemann in recent discussions with him.

The implementation of alternative vnode management per-fs is still a valid
requirement for some filesystems but will be revisited sometime later,
most likely using a different framework.

Submitted by:	Michael Hancock <michaelh@cet.co.jp>
1998-05-06 05:29:41 +00:00
dyson
65fbc3a74d Fix the shm panic. I mistakenly used the shadow_count to keep the object
from being split, and instead added an OBJ_NOSPLIT.
1998-05-04 17:12:53 +00:00
dyson
dfdb369a7d Work around some VM bugs, the worst being an overly aggressive
swap space free calculation.  More complete fixes will be forthcoming,
in a week.
1998-05-04 03:01:44 +00:00
bde
e5f7e85071 Oops, the previous commit should have changed `i386' to `__i386__',
not `__i386'.
1998-05-01 16:40:21 +00:00
bde
004d5369f3 Partially fixed write clustering for cases where cluster_wbuild() is
called from vfs_bio_awrite() without going through cluster_write()
or ufs_bmaparray(), in particular for all writes to block disk devices.
Only ufs_bmaparray() sets vp->v_maxio in a correct way, and it doesn't
seem to be called early enough even for regular files.
1998-05-01 16:29:27 +00:00
peter
dfa49715ca vm_page_is_valid() wasn't expecting a large offset argument, it's
expecting a sub-page offset.  We were passing the file position,
and vm_page_bits() could do some interesting things when base was
larger than PAGE_SIZE.
if (size > PAGE_SIZE - base)
	size = PAGE_SIZE - base;
is interesting when (PAGE_SIZE - base) is negative.  I could imagine that
this could have interesting consequences for memory page -> device block
bit validation.
1998-05-01 15:10:59 +00:00
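The clamp quoted in this commit message can be reproduced in a few lines of userland C. This is a sketch under assumptions: the page size is fixed at 4096, and `clamp_size` and `subpage_offset` are hypothetical helpers, not the kernel's vm_page_bits(). It shows that a `base` larger than the page size turns the clamped size negative, and that reducing the file position to a sub-page offset first keeps the result in range.

```c
#define EX_PAGE_SIZE 4096			/* assumed page size */
#define EX_PAGE_MASK (EX_PAGE_SIZE - 1)

/* The clamp from the commit message, with signed ints. */
static int
clamp_size(int base, int size)
{
	if (size > EX_PAGE_SIZE - base)
		size = EX_PAGE_SIZE - base;
	return size;
}

/* Reduce a file position to an offset within its page. */
static int
subpage_offset(long file_pos)
{
	return (int)(file_pos & EX_PAGE_MASK);
}
```

With `base = 8192` and `size = 1024`, `clamp_size` returns -4096. Passing `subpage_offset(file_pos)` instead of the raw file position avoids that case.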
peter
e5ab5108e0 Fix one problem with NFSv3 > 2GB file support.
Submitted by: bde
1998-05-01 15:04:35 +00:00
eivind
67c7bb9c04 Translate T_PROTFLT to SIGSEGV instead of SIGBUS when running under
Linux emulation.  This makes Allegro Common Lisp 4.3 work under
FreeBSD!

Submitted by: Fred Gilham <gilham@csl.sri.com>
Commented on by: bde, dg, msmith, tg
Hoping he got everything right:  eivind
1998-04-28 18:15:08 +00:00
obrien
fea3a33427 Discussed with: bde 1998-04-24 11:50:30 +00:00
obrien
82338a65ac Create virgin disklabels with 8 (MAXPARTITIONS) partitions rather than
three (RAW_PART + 1);
This makes ``disklabel -Brw sdN auto'' do the Right Thing.
1998-04-24 11:49:57 +00:00
dg
3253b48e39 Added kern.ipc.nmbclusters 1998-04-24 04:15:52 +00:00
julian
cb9166e241 Make the devfs SLICE option a standard type option.
(hopefully it will go away eventually anyhow)
1998-04-20 03:57:41 +00:00
julian
0796a5c56e Add changes and code to implement a functional DEVFS.
This code will be turned on with the TWO options
DEVFS and SLICE. (see LINT)
Two labels PRE_DEVFS_SLICE and POST_DEVFS_SLICE will delineate these changes.

/dev will be automatically mounted by init (thanks phk)
on bootup. See /sys/dev/slice/slice.4 for more info.
All code should act the same without these options enabled.

Mike Smith, Poul Henning Kamp, Soeren, and a few dozen others

This code does not support the following:
bad144 handling.
Persistence. (My head is still hurting from the last time we discussed this)
ATAPI floppies are not handled by the SLICE code yet.

When this code is running, all major numbers are arbitrary and COULD
be dynamically assigned. (this is not done, for POLA only)
Minor numbers for disk slices ARE arbitrary and dynamically assigned.
1998-04-19 23:32:49 +00:00
des
901c8a6cfa Backed out lseek changes. 1998-04-19 22:20:32 +00:00
des
959a007c3a Return EINVAL and do not change file pointer if resulting offset is negative.
PR:		kern/6184
1998-04-18 19:24:44 +00:00
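The behaviour this commit describes (fail with EINVAL and leave the file pointer alone when the resulting offset would be negative) can be sketched as follows. `checked_seek` is a hypothetical helper, not the kernel's lseek() code, and the log shows the change was backed out the next day, so this illustrates the described check only.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: apply a relative seek to *pos.
 * On a negative result, *pos is left untouched and EINVAL is returned.
 * Overflow of the addition is not handled in this sketch.
 */
static int
checked_seek(int64_t *pos, int64_t delta)
{
	int64_t newpos = *pos + delta;

	if (newpos < 0)
		return EINVAL;
	*pos = newpos;
	return 0;
}
```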
peter
bbf574dac9 In vfs_msync(), test to see if the vnode being examined is "interesting"
(ie: it has a vm_object attached and is marked as OBJ_MIGHTBEDIRTY) before
attempting to lock it.  This should reduce the cpu hit that is incurred
when doing a sync(2) and when the syncer process is doing the 30-second
writeback of dirty mmap() data to disk.  Skip this speedup if we are
doing an unmount() to be sure to get everything - we can afford to
occasionally miss a msync while the system is running, but not at unmount.

I'm not sure about the VXLOCK and MNT_WAIT case, it seems a bit odd to skip
doing a page_clean at unmount time just because a vnode is VXLOCKed, but
that's what was being done before...
1998-04-18 06:26:16 +00:00
des
396b114475 Seventy-odd "its" / "it's" typos in comments fixed as per kern/6108. 1998-04-17 22:37:19 +00:00
bde
e9db829088 Really finish supporting compiling with `gcc -ansi'. 1998-04-17 04:53:44 +00:00
peter
a1bb161dc4 When the softdep conversion took place, the periodic vfs_msync() from
update got lost.  This is responsible for ensuring that dirty mmap() pages
get periodically written to disk.  Without it, long-lived mmaps might not
have their dirty pages written out at all if the system crashes or isn't
cleanly shut down.  This could be nasty if you've got a long-running
process writing via mmap(); dirty pages used to get written to disk
within 30 seconds or so.
1998-04-16 03:31:26 +00:00
tegge
eb926b31ca Unlock mountlist_slock if the mount point was busy (unmount in progress)
during the attempt at lazy fsync.
1998-04-15 18:37:49 +00:00
bde
b598f559b2 Support compiling with `gcc -ansi'. 1998-04-15 17:47:40 +00:00
phk
34d62002ae Fix a minor mbuf leak created by the previous change.
Reviewed by:	phk
Submitted by:	pb@fasterix.freenix.org (Pierre Beyssac)
1998-04-14 06:24:43 +00:00
phk
ead640e967 setsockopt() transports user option data in an mbuf. if the user
data is greater than MLEN, setsockopt is unable to pass it onto
the protocol handler.  Allocate a cluster in such case.

PR:		2575
Reviewed by:	 phk
Submitted by:	Julian Assange proff@iq.org
1998-04-11 20:31:46 +00:00
phk
87312aadcc When pmap_pinit0() allocates a page for proc0's page directory,
the kernel page table may need to be extended.  But while growing the
kernel page table (pmap_growkernel()), newly allocated kernel page
table pages are entered into every process' page directory. For
proc0, the page directory is not allocated yet, and results in a
page fault.  Eventually, the machine panics with "lockmgr: not
holding exclusive lock".

PR:		5458
Reviewed by:	phk
Submitted by:	Luoqi Chen <luoqi@luoqi.watermarkgroup.com>
1998-04-11 17:24:06 +00:00
alex
dd07a29831 Grammar police. 1998-04-10 00:09:04 +00:00
wosch
e3aed30232 New mount option nosymfollow. If enabled, the kernel lookup()
function will not follow symbolic links on the mounted
file system and return EACCES (Permission denied).
1998-04-08 18:31:59 +00:00
phk
c641c2b9ee Minor adjustments to the timecounting and proc0.
Mostly Submitted by:	bde
1998-04-08 09:01:53 +00:00
peter
33b5f6b63d Today is not my lucky day. Fix missing brace and I got a request
to use EMLINK instead.
1998-04-06 19:32:37 +00:00
peter
db031c687b Use a different errno (ELOOP (as sef mentioned) since the text that goes
with the error sounds ok for the condition) if O_NOFOLLOW gets a link.
1998-04-06 18:43:28 +00:00
peter
b0a513624f Rather than let users get fd's to symlink files, make O_NOFOLLOW cause
an error if it gets a link (like it does if it gets a socket).  The
implications of letting users try and do file operations on symlinks
themselves were too worrying.
1998-04-06 18:25:21 +00:00
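A small userland check of the behaviour described here (open(2) with O_NOFOLLOW fails when the final path component is a symlink) might look like the sketch below. The helper name `nofollow_refuses_symlink` and the `/tmp` template are assumptions made for illustration. The errno value is deliberately not checked, since the log itself shows it changing (ELOOP, then EMLINK), and other systems may differ.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Create a regular file and a symlink to it in a fresh temporary
 * directory, then try to open the symlink with O_NOFOLLOW.
 * Returns 1 if the open was refused, 0 if it succeeded, -1 on setup error.
 */
static int
nofollow_refuses_symlink(void)
{
	char dir[] = "/tmp/nofollowXXXXXX";
	char target[64], linkpath[64];
	int fd, refused;

	if (mkdtemp(dir) == NULL)
		return -1;
	snprintf(target, sizeof(target), "%s/target", dir);
	snprintf(linkpath, sizeof(linkpath), "%s/link", dir);

	fd = open(target, O_CREAT | O_WRONLY, 0600);
	if (fd >= 0)
		close(fd);
	if (fd < 0 || symlink(target, linkpath) != 0) {
		unlink(target);
		rmdir(dir);
		return -1;
	}

	/* The call under test: the last path component is a symlink. */
	fd = open(linkpath, O_RDONLY | O_NOFOLLOW);
	refused = (fd < 0);
	if (fd >= 0)
		close(fd);

	unlink(linkpath);
	unlink(target);
	rmdir(dir);
	return refused;
}
```

Opening `target` directly with the same flags would succeed; only the symlink is refused.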
peter
d5ab1c3759 Implement a new open(2) flag: O_NOFOLLOW. This will instruct open
to not follow symlinks, but to open a handle on the link itself(!).
As strange as this might sound, it has several useful applications:
safe race-free ways of opening files in hostile areas (eg: /tmp, a mode
1777 /var/mail, etc).  It also would allow things like fchown() to work
on the link rather than having to implement a new syscall specifically for
that task.

Reviewed by: phk
1998-04-06 17:38:43 +00:00
peter
6e3ec235ff curproc is initialized in locore at the same time for both SMP and UP now. 1998-04-06 15:51:22 +00:00
peter
32a92c71b6 Use real types for the SMP pages being allocated rather than arrays of
ints.  Remove some no longer needed casts.  Initialize the per-cpu
global data area using the structs rather than knowing too much about
layout, alignment, etc.
1998-04-06 15:48:30 +00:00
phk
ab5541db4c Make read_random() take a (void *) argument instead of (char *) 1998-04-06 09:30:42 +00:00
phk
3c122bd961 Make a kernel version of the timer* functions called timerval* to be
more consistent.

OK'ed by:	bde
1998-04-06 08:26:08 +00:00
phk
2fdb617aee More fixes for the iterative case of nanosleep1 from bruce.
I hate the 2-arg time{spec|val}{add|sub} functions!
1998-04-05 12:10:41 +00:00
phk
ef09a47a6d Make the dummy timecounter run at 1 MHz rather than 100kHz (noticed by bde)
fix the itimer(REAL) handling.
1998-04-05 11:49:36 +00:00
peter
0a735b0829 If there is no error code, don't copyout the remaining time. (As
documented in the man page and the standards).  (and besides, nanosleep1
isn't setting it in this case at present anyway, so we'd be copying junk).
1998-04-05 11:17:19 +00:00
phk
08f33aeded Fix nanosleep1 based on Bruce's suggestion. 1998-04-05 10:28:01 +00:00
ache
e0d8a4a2e6 Remove unused atv.tv_usec = 0; from select/poll code 1998-04-05 10:03:52 +00:00
peter
2a16d50561 tsleep() returns EWOULDBLOCK if the timeout expired. Don't return this
to usermode, otherwise sleep(3) fails, cron doesn't work, etc etc etc.
1998-04-05 07:31:44 +00:00
peter
9cde5a8d09 Fix previous commit. Don't people read compiler messages or something?? 1998-04-05 02:59:10 +00:00
phk
9736ec2fbf Handle double fraction overflow in nano & microtime functions (spotted by Bruce)
Use tvtohz() in a place where it fits.
1998-04-04 18:46:13 +00:00
phk
5e9a131f20 Time changes mark 2:

    * Figure out UTC relative to boottime.  Four new functions provide
      time relative to boottime.

    * move "runtime" into struct proc.  This helps fix the calcru()
      problem in SMP.

    * kill mono_time.

    * add timespec{add|sub|cmp} macros to time.h.  (XXX: These may change!)

    * nanosleep, select & poll take long sleeps one day at a time

Reviewed by:    bde
Tested by:      ache and others
1998-04-04 13:26:20 +00:00
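The timespec{add|sub|cmp} helpers mentioned in this commit come down to arithmetic with a nanosecond carry or borrow. The sketch below uses plain functions with hypothetical names (`ts_add`, `ts_sub`, `ts_cmp`) rather than the macros the commit added to time.h, whose exact form is not shown in the log. Inputs are assumed to be normalized, with `tv_nsec` in [0, 1e9).

```c
#include <time.h>

#define EX_NSEC_PER_SEC 1000000000L

/* a + b with carry from nanoseconds into seconds. */
static struct timespec
ts_add(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec + b.tv_sec;
	r.tv_nsec = a.tv_nsec + b.tv_nsec;
	if (r.tv_nsec >= EX_NSEC_PER_SEC) {
		r.tv_sec++;
		r.tv_nsec -= EX_NSEC_PER_SEC;
	}
	return r;
}

/* a - b with borrow from seconds into nanoseconds. */
static struct timespec
ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		r.tv_sec--;
		r.tv_nsec += EX_NSEC_PER_SEC;
	}
	return r;
}

/* Returns -1, 0 or 1 as a is earlier than, equal to, or later than b. */
static int
ts_cmp(struct timespec a, struct timespec b)
{
	if (a.tv_sec != b.tv_sec)
		return a.tv_sec < b.tv_sec ? -1 : 1;
	if (a.tv_nsec != b.tv_nsec)
		return a.tv_nsec < b.tv_nsec ? -1 : 1;
	return 0;
}
```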
dyson
da94b50355 Perhaps fix a problem that some drivers have that they don't properly
initialize the b_kvasize element.  This might fix some of the split
I/O requests that some people have.
1998-04-04 05:55:05 +00:00
phk
f1a4c3bb6f Try to fix poll & select after I broke them. 1998-04-02 07:22:17 +00:00
tegge
028480bfb1 Add two workarounds for broken MP tables:

	- Attempt to handle PCI devices where the interrupt is
	  an ISA/EISA interrupt according to the mp table.

	- Attempt to handle multiple IO APIC pins connected to
	  the same PCI or ISA/EISA interrupt source.  Print a
	  warning if this happens, since performance is suboptimal.
	  This workaround is only used for PCI devices.

With these two workarounds, the -SMP kernel is capable of running on
my Asus P/I-P65UP5 motherboard when version 1.4 of the MP table is disabled.
1998-04-01 21:07:37 +00:00
phk
0b984010ad Fix an off by 1<<32 error. 1998-03-31 10:47:01 +00:00
phk
ab6754b199 Add a dummy timecounter until we find the real thing(s). 1998-03-31 10:44:56 +00:00
phk
9b703b1455 Eradicate the variable "time" from the kernel, using various measures.
"time" wasn't an atomic variable, so splfoo() protection was needed
around any access to it, unless you just wanted the seconds part.

Most uses of time.tv_sec now use the new variable time_second instead.

gettime() changed to getmicrotime().

Remove a couple of unneeded splfoo() protections, the new getmicrotime()
is atomic, (until Bruce sets a breakpoint in it).

A couple of places needed random data, so use read_random() instead
of mucking about with time which isn't random.

Add a new nfs_curusec() function.

Mark a couple of bogosities involving the now disappeared time variable.

Update ffs_update() to avoid the weird "== &time" checks, by fixing the
one remaining call that passed &time as args.

Change profiling in ncr.c to use ticks instead of time.  Resolution is
the same.

Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call
hzto() which subtracts time" sequences.

Reviewed by:	bde
1998-03-30 09:56:58 +00:00
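The motivation given for tvtohz() is that a caller with a relative interval should not have to add the current time only for hzto() to subtract it again. A rough sketch of converting a relative interval straight to clock ticks is shown below. `interval_to_ticks` is a hypothetical function, not the kernel's tvtohz(); its rounding (up, so a sleep is never cut short) is an assumption, since the log does not give the kernel's exact rule.

```c
#include <stdint.h>

/*
 * Hypothetical: convert a relative interval (sec, usec) to ticks at
 * rate hz, rounding up.  Overflow for very long intervals is not
 * handled in this sketch.
 */
static uint64_t
interval_to_ticks(uint64_t sec, uint32_t usec, uint32_t hz)
{
	uint64_t total_us = sec * 1000000u + usec;

	return (total_us * hz + 999999u) / 1000000u;
}
```

At `hz = 100`, a 15 ms interval becomes 2 ticks and a 1 s interval becomes 100 ticks, with no reference to the current time.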
dyson
7f3f758651 Correct a significant problem with the softupdates port. Allow fsync
to work properly within the softupdates framework, and thereby eliminate
some unfortunate panics.
1998-03-29 18:23:44 +00:00
phk
8620163d92 Export MD5Transform in md5.c and remove a private version in random_machdep.c
md5 is standard as a consequence of this.
1998-03-29 11:55:06 +00:00
dufault
a9ef2eb2ee Remove duplicate comment 1998-03-28 18:16:29 +00:00
dufault
c5a445b832 Include sys/resource.h to get PRIO_MAX. 1998-03-28 14:49:47 +00:00
bde
b5ae2c779b Removed unused #includes. 1998-03-28 13:25:01 +00:00
bde
a1015f7749 Don't depend on <sys/mount.h> including <sys/socket.h>. 1998-03-28 12:04:40 +00:00
dufault
8ed0defc6e Finish _POSIX_PRIORITY_SCHEDULING. Needs P1003_1B and
_KPOSIX_PRIORITY_SCHEDULING options to work.  Changes:

Change all "posix4" to "p1003_1b".  Misnamed files are left
as "posix4" until I'm told if I can simply delete them and add
new ones;

Add _POSIX_PRIORITY_SCHEDULING system calls for FreeBSD and Linux;

Add man pages for _POSIX_PRIORITY_SCHEDULING system calls;

Add options to LINT;

Minor fixes to P1003_1B code during testing.
1998-03-28 11:51:01 +00:00
bde
cd450d6714 Moved some #includes from <sys/param.h> nearer to where they are actually
used.
1998-03-28 10:33:27 +00:00
phk
4db0fa09db Split the padding out into a separate function.
Synchronize the kernel and libmd versions of md5c.c

PR:		misc/6127
Reviewed by:	phk
Submitted by:	Ari Suutari <ari@suutari.iki.fi>
1998-03-27 10:23:00 +00:00
dyson
1eaa978a47 Correct a problem where buffers might not be zeroed when needed. The
B_MALLOC buffers might not have been properly zeroed.
1998-03-27 06:48:24 +00:00
phk
00475b662a Add two new functions, get{micro|nano}time.
They are atomic, but return in essence what is in the "time" variable.
gettime() is now a macro front for getmicrotime().

Various patches to use the two new functions instead of the various
hacks used in their absence.

Some punctuation and grammar patches from Bruce.

A couple of XXX comments.
1998-03-26 20:54:05 +00:00
jlemon
8f4e20b1a3 Add the ability to make real-mode BIOS calls from the kernel. Currently,
everything is contained inside #ifdef VM86, so this option must be
present in the config file to use this functionality.

Thanks to Tor Egge, these changes should work on SMP machines.  However,
it may not be thoroughly SMP-safe.

Currently, the only BIOS calls made are memory-sizing routines at bootup,
these replace reading the RTC values.
1998-03-23 19:52:59 +00:00
dyson
aad85d5a04 In kern_physio.c fix tsleep priority messup.
In vfs_bio.c, remove b_generation count usage,
	remove redundant reassignbuf,
	remove redundant spl(s),
	manage page PG_ZERO flags more correctly,
	utilize an invalid value for b_offset until it
		is properly initialized.  Add asserts
		for #ifdef DIAGNOSTIC, when b_offset is
		improperly used.
	when a process is not performing I/O, and just waiting
		on a buffer generally, make the sleep priority
		low.
	only check page validity in getblk for B_VMIO buffers.

In vfs_cluster, add b_offset asserts, correct pointer calculation
	for clustered reads.  Improve readability of certain parts of
	the code.  Remove redundant spl(s).

In vfs_subr, correct usage of vfs_bio_awrite (From Andrew Gallatin
	<gallatin@cs.duke.edu>).  More vtruncbuf problems fixed.
1998-03-19 22:48:16 +00:00
dyson
dc6d7f19f8 Fix an embarrassing problem in vtruncbuf. 1998-03-19 18:46:58 +00:00
dyson
9470d0a235 Correct a problem where data OR metadata could be thrown away if a
buffer is grown.
1998-03-17 17:36:05 +00:00
kato
7fa5494b44 Deleted PC-98 code because (1) machine dependent code should not be in
here, and (2) the flag used in PC-98 code has been assigned to another
purpose.
1998-03-17 08:41:28 +00:00
dyson
ca4225334c Correct a severely evil bug in the vtruncbuf code. It didn't cause
me any problems until after the previous commit.  This problem then
caused a severe case of creeping crud on my diskdrive, and hosed
my system so bad, that I needed to do a complete reinstall.  Sorry!!!

I assume that others have manifest this bug.
1998-03-17 06:30:52 +00:00
julian
54a5f1c1b0 Remove a soft-update hook that was accidentally added to the READ path.
also add some comments, and a couple of very minor cosmetic changes.
1998-03-16 18:39:41 +00:00
phk
edaad77cba A bunch of BNN (Bruce Normal Nits) from bde:
	Bring back the softclock inlining
	save a couple of <<32's
	many white-space shuffles.
1998-03-16 10:19:12 +00:00
dyson
9bd499d7fa Allow vfs_ioopt to be enabled with a (temporary) config option. 1998-03-16 02:13:03 +00:00
dyson
6e92f5716b Some VM improvements, including elimination of a lot of Sig-11
problems.  Tor Egge and others have helped with various VM bugs
lately, but don't blame him -- blame me!!!

pmap.c:
1)	Create an object for kernel page table allocations.  This
	fixes a bogus allocation method previously used for such, by
	grabbing pages from the kernel object, using bogus pindexes.
	(This was a code cleanup, and perhaps a minor system stability
	 issue.)

pmap.c:
2)	Pre-set the modify and accessed bits when prudent.  This will
	decrease bus traffic under certain circumstances.

vfs_bio.c, vfs_cluster.c:
3)	Rather than calculating the beginning virtual byte offset
	multiple times, stick the offset into the buffer header, so
	that the calculated offset can be reused.  (Long long multiplies
	are often expensive, and this is a probably unmeasurable performance
	improvement, and code cleanup.)

vfs_bio.c:
4)	Handle write recursion more intelligently (but not perfectly) so
	that it is less likely to cause a system panic, and is also
	much more robust.

vfs_bio.c:
5)	getblk incorrectly wrote out blocks that are incorrectly sized.
	The problem is fixed, and writes blocks out ONLY when B_DELWRI
	is true.

vfs_bio.c:
6)	Check that already constituted buffers have fully valid pages.  If
	not, then make sure that the B_CACHE bit is not set. (This was
	a major source of Sig-11 type problems.)

vfs_bio.c:
7)	Fix a potential system deadlock due to an incorrectly specified
	sleep priority while waiting for a buffer write operation.  The
	change that I made opens the system up to serious problems, and
	we need to examine the issue of process sleep priorities.

vfs_cluster.c, vfs_bio.c:
8)	Make clustered reads work more correctly (and more completely)
	when buffers are already constituted, but not fully valid.
	(This was another system reliability issue.)

vfs_subr.c, ffs_inode.c:
9)	Create a vtruncbuf function, which is used by filesystems that
	can truncate files.  The vinvalbuf forced a file sync type operation,
	while vtruncbuf only invalidates the buffers past the new end of file,
	and also invalidates the appropriate pages.  (This was a system reliability
	and performance issue.)

10)	Modify FFS to use vtruncbuf.

vm_object.c:
11)	Make the object rundown mechanism for OBJT_VNODE type objects work
	more correctly.  Included in that fix, create pager entries for
	the OBJT_DEAD pager type, so that paging requests that might slip
	in during race conditions are properly handled.  (This was a system
	reliability issue.)

vm_page.c:
12)	Make some of the page validation routines be a little less picky
	about arguments passed to them.  Also, support page invalidation
	change the object generation count so that we handle generation
	counts a little more robustly.

vm_pageout.c:
13)	Further reduce pageout daemon activity when the system doesn't
	need help from it.  There should be no additional performance
	decrease even when the pageout daemon is running.  (This was
	a significant performance issue.)

vnode_pager.c:
14)	Teach the vnode pager to handle race conditions during vnode
	deallocations.
1998-03-16 01:56:03 +00:00
dyson
7c9ff06841 Disable the vfs.ioopt option for now, so that we don't get gratuitous
bugreports.  I might not be able to fix the problems before 3.0, due
to other, more important things.
1998-03-14 19:50:36 +00:00
tegge
fca0f92630 Don't misuse vnode interlocks in routines that can be called from interrupts.
PR:		5893
1998-03-14 02:55:01 +00:00
dufault
06743e32eb idprio processes must be preempted as soon as anything is runnable. 1998-03-11 20:50:42 +00:00
msmith
46304bfe5a If the root mount fails from a device that is not the compatibility slice
of a disk, because that slice does not exist, try again mounting from the
compatibility slice.

This handles the case where a disk has been initialised by 'disklabel
auto', which places a bogus and invalid slice entry on the disk.
The bootstrap is not smart enough to reject this slice, and pretends to
boot from it.  Believing the bootstrap at this point is unwise.

Booting from non-'wd' disks thus prepared is still broken, as
'disklabel -rwB xdN auto' does not initialise the disk type field, and
the bootstrap mistakenly claims that the disk is handled by 'wd'.

Behaviour is now consistent with DEVFS expected characteristics.
1998-03-11 00:10:31 +00:00
jb
a069106ffc Add statements to generate a sys/syscall.mk file for inclusion
during the libc/libc_r build to automatically pick up syscall names on
the assumption that default asm code needs to be generated for them.

In the up-coming changes to the libc makefiles, there is the option
to provide a machine dependent asm source file which will turn off
the automatic generation of the default. There is also an option
to just stop code being generated for a syscall. In most cases,
though, the default asm code is all that is required, so this
change makes that the most convenient way to do business.

Idea suggested by: bde
1998-03-09 04:00:42 +00:00
julian
10c5ccc30a Reviewed by: dyson@freebsd.org (john Dyson), dg@root.com (david greenman)
Submitted by:	Kirk McKusick (mcKusick@mckusick.com)
Obtained from:  Whistle development tree
1998-03-08 09:59:44 +00:00
dyson
54f61de05f Free the first page also if it is not valid. 1998-03-08 06:21:33 +00:00
dyson
8ceb6160f4 This mega-commit is meant to fix numerous interrelated problems. There
has been some bitrot and incorrect assumptions in the vfs_bio code.  These
problems have manifested themselves worse on NFS type filesystems, but can
still affect local filesystems under certain circumstances.  Most of
the problems have involved mmap consistency, and as a side-effect broke
the vfs.ioopt code.  This code might have been committed separately, but
almost everything is interrelated.

1)	Allow (pmap_object_init_pt) prefaulting of buffer-busy pages that
	are fully valid.
2)	Rather than deactivating erroneously read initial (header) pages in
	kern_exec, we now free them.
3)	Fix the rundown of non-VMIO buffers that are in an inconsistent
	(missing vp) state.
4)	Fix the disassociation of pages from buffers in brelse.  The previous
	code had rotted and was faulty in a couple of important circumstances.
5)	Remove a gratuitous buffer wakeup in vfs_vmio_release.
6)	Remove a crufty and currently unused cluster mechanism for VBLK
	files in vfs_bio_awrite.  When the code is functional, I'll add back
	a cleaner version.
7)	The page busy count wakeups associated with the buffer cache usage were
	incorrectly cleaned up in a previous commit by me.  Revert to the
	original, correct version, but with a cleaner implementation.
8)	The cluster read code now tries to keep data associated with buffers
	more aggressively (without breaking the heuristics) when it is presumed
	that the read data (buffers) will be soon needed.
9)	Change to filesystem lockmgr locks so that they use LK_NOPAUSE.  The
	delay loop waiting is not useful for filesystem locks, due to the
	length of the time intervals.
10)	Correct and clean-up spec_getpages.
11)	Implement a fully functional nfs_getpages, nfs_putpages.
12)	Fix nfs_write so that modifications are coherent with the NFS data on
	the server disk (at least as well as NFS seems to allow.)
13)	Properly support MS_INVALIDATE on NFS.
14)	Properly pass down MS_INVALIDATE to lower levels of the VM code from
	vm_map_clean.
15)	Better support the notion of pages being busy but valid, so that
	fewer in-transit waits occur.  (use p->busy more for pageouts instead
	of PG_BUSY.)  Since the page is fully valid, it is still usable for
	reads.
16)	It is possible (in error) for cached pages to be busy.  Make the
	page allocation code handle that case correctly.  (It should probably
	be a printf or panic, but I want the system to handle coding errors
	robustly.  I'll probably add a printf.)
17)	Correct the design and usage of vm_page_sleep.  It didn't handle
	consistency problems very well, so make the design a little less
	lofty.  After vm_page_sleep, if it ever blocked, it is still important
	to relookup the page (if the object generation count changed), and
	verify its status (always.)
18)	In vm_pageout.c, vm_pageout_clean had rotted, so clean that up.
19)	Push the page busy for writes and VM_PROT_READ into vm_pageout_flush.
20)	Fix vm_pager_put_pages and its descendants to support an int flag
	instead of a boolean, so that we can pass down the invalidate bit.
1998-03-07 21:37:31 +00:00
tegge
8644d41f2d The APs now reload the interrupt descriptor table pointer after
f00f_hack has run.

Use the global r_idt descriptor in f00f_hack when in SMP mode,
so the APs find the relocated interrupt descriptor table.

Submitted by:	Partially from David A Adkins <adkin003@tc.umn.edu>
1998-03-07 20:16:49 +00:00
dyson
f5ff9feb67 Some kern_lock code improvements. Add missing wakeup, and enable
disabling some diagnostics when memory or speed is at a premium.
1998-03-07 19:25:34 +00:00
bde
ab37e4723b Set the input and output buffer sizes and the input buffer watermarks
dynamically depending on the line speed(s).  This should give the old
sizes and watermarks until drivers are changed.

Display the input watermarks in pstat and sicontrol.
1998-03-07 15:36:29 +00:00
dufault
e28788f2a4 Reviewed by: msmith, bde long ago
POSIX.4 headers and sysctl variables.  Nothing should change
unless POSIX4 is defined or _POSIX_VERSION is set to 199309.
1998-03-04 10:27:00 +00:00
dufault
8893ec06df Reviewed by: msmith, bde long ago
Fix for RTPRIO scheduler to eliminate invalid context switches.

POSIX.4 headers and sysctl variables.  Nothing should change
unless POSIX4 is defined or _POSIX_VERSION is set to 199309.
1998-03-04 10:25:55 +00:00
dyson
1ae42d49a8 Fix a rounding error for the NFS buffer validend.
Submitted by:	John W. De Boskey <jwd@unx.sas.com>
1998-03-04 03:17:30 +00:00
tegge
9f3982f0f6 When entering the apic version of slow interrupt handler, level
interrupts are masked, and EOI is sent iff the corresponding ISR bit
is set in the local apic. If the CPU cannot obtain the interrupt
service lock (currently the global kernel lock) the interrupt is
forwarded to the CPU holding that lock.

Clock interrupts now have higher priority than other slow interrupts.
1998-03-03 22:56:30 +00:00
tegge
beae57c5b3 Forward the signal if the process runs on a different CPU. This reduces
the signal handling latency for cpu-bound processes that perform very
few system calls.

The IPI for forcing an additional software trap is no longer dependent upon
BETTER_CLOCK being defined.
1998-03-03 20:55:26 +00:00
tegge
9b0c9780e5 Reduce timeout before assuming that forwarding of hardclock or softclock
failed. Don't complain on forwarding failure, unless
BETTER_CLOCK_DIAGNOSTIC is defined.
1998-03-03 20:09:14 +00:00
peter
1cfd0bd061 Update the ELF image activator to use some of the exec resources rather
than rolling its own.  This means that it now uses the "safe"
exec_map_first_page() to get the ld.so headers rather than risking a panic
on a page fault failure (eg: NFS server goes down).
Since all the ELF tools go to a lot of trouble to make sure everything
lives in the first page for executables, this is a win.  I have not seen
any ELF executable on any system where all the headers didn't fit in the
first page with lots of room to spare.
I have been running variations of this code for some time on my pure ELF
systems.
1998-03-02 05:47:58 +00:00
dyson
1928b68b1a Change vfs.ioopt default back to '0'. 1998-03-01 23:07:45 +00:00
msmith
950d32131b The intent is to get rid of WILLRELE in vnode_if.src by making
a complement to all ops that return a vpp, VFS_VRELE.  This is
initially only for file systems that implement the following ops
that do a WILLRELE:

	vop_create, vop_whiteout, vop_mknod, vop_remove, vop_link,
	vop_rename, vop_mkdir, vop_rmdir, vop_symlink

This is initial DNA that doesn't do anything yet.  VFS_VRELE is
implemented but not called.

A default vfs_vrele was created for fs implementations that use the
standard vnode management routines.

VFS_VRELE implementations were made for the following file systems:

Standard (vfs_vrele)
	ffs mfs nfs msdosfs devfs ext2fs

Custom
	union umapfs

Just EOPNOTSUPP
	fdesc procfs kernfs portal cd9660

These implementations may change as VOP changes are implemented.

In the next phase, in the vop implementations calls to vrele and the vrele
part of vput will be moved to the top layer vfs_vnops and made visible
to all layers.  vput will be replaced by unlock in these cases.  Unlocking
will still be done in the per fs layer but the refcount decrement will be
triggered at the top because it doesn't hurt to hold a vnode reference a
little longer.  This will have minimal impact on the structure of the
existing code.

This will only be done for vnode arguments that are released by the various
fs vop implementations.

Wider use of VFS_VRELE will likely require restructuring of the code.

Reviewed by:	phk, dyson, terry et. al.
Submitted by:	Michael Hancock <michaelh@cet.co.jp>
1998-03-01 22:46:53 +00:00
guido
406aea3e09 Make sure that you can only bind a more specific address when it is
done by the same uid.
Obtained from: OpenBSD
1998-03-01 19:39:29 +00:00
dyson
69e5a1e9f5 1) Use a more consistent page wait methodology.
2)	Do not unnecessarily force page blocking when paging
	pages out.
3)	Further improve swap pager performance and correctness,
	including fixing the paging in progress deadlock (except
	in severe I/O error conditions.)
4)	Enable vfs_ioopt=1 as a default.
5)	Fix and enable the page prezeroing in SMP mode.

All in all, SMP systems especially should show a significant
improvement in "snappyness."
1998-03-01 04:18:54 +00:00
guido
391bb65d14 Raise ncallout from NPROC + 16 to NPROC + 16 + MAXFILES. This should
prevent a possible DoS attack. The proper fix (to dynamically grow
the callout list) is in the making.
Submitted by:	Paul Traina
1998-02-27 19:58:29 +00:00
bde
bd35cc6b95 Removed unused #includes. 1998-02-25 13:08:07 +00:00
bde
e15b8a8b3a Removed a stale comment and staler code. 1998-02-25 06:30:15 +00:00
bde
df1db50d54 Don't depend on "implicit int" or bloat the data section in the
declaration of ptc_devsw_installed.

Fixed a spelling error.
1998-02-25 06:19:15 +00:00
bde
6c18b55140 Don't depend on "implicit int". 1998-02-25 06:16:37 +00:00
bde
cd90b66afc Declare function pointer args as pointers, not as functions. 1998-02-25 06:13:32 +00:00
bde
59d87ac480 Fixed a missing newline in a debugging printf.
Fixed punctuation in some comments.
1998-02-25 06:04:46 +00:00
bde
d185d33240 Removed unused #includes. 1998-02-25 05:58:50 +00:00
bde
6898166010 Fixed the calculation of `delta' in settime(). We once set all
times consistently wrong (up to 1 tick too late), but recent changes
fixed the setting of the main clock, making other times inconsistent.
The inconsistencies tended to show up as a negative resource usage
for the process that set the time.

Fixed the check for setting the clock backwards.  A stale timestamp
(`time') was checked, so it was possible to set the clock backwards
by up to almost 1 tick.  Until recently, this bug was compensated
for by setting the clock consistently wrong.

Merged the comment about setting the clock backwards from Lite2.

Removed latency micro-optimizations/speed pessimizations in settime().
microtime() and set_timecounter() are relatively expensive, and
they must be called together with clock updates blocked to get a
consistent `delta', so significant latency optimizations are not
possible.

Removed some stale comments.
1998-02-25 04:10:32 +00:00
dyson
7a5637f439 Try to dynamically size the VM_KMEM_SIZE (it can still be overridden
in the same way as before.)  I had problems with the system properly
handling the number of vnodes when there is a lot of system memory, and the
default VM_KMEM_SIZE.  Two new options "VM_KMEM_SIZE_SCALE" and
"VM_KMEM_SIZE_MAX" have been added to support better auto-sizing for systems
with greater than 128MB.
1998-02-23 07:41:23 +00:00
dyson
32e0a3673a Clean-up the vget mechanism by permanently attaching VM objects to
vnodes, therefore vget doesn't need to do so anymore.  Other minor
improvements include the temp free vnode queue obeying the VAGE
flag and a printf that warns of to-be-removed code being executed.
1998-02-23 06:59:52 +00:00
phk
044e1e6296 Replace TOD clock code with more systematic approach.
Highlights:
    * Simple model for underlying hardware.
    * Hardware basis for timekeeping can be changed on the fly.
    * Only one hardware clock responsible for TOD keeping.
    * Provides a real nanotime() function.
    * Time granularity: .232E-18 seconds.
    * Frequency granularity:  .238E-12 s/s
    * Frequency adjustment is continuous in time.
    * Less overhead for frequency adjustment.
    * Improves xntpd performance.

Reviewed by:    bde, bde, bde
1998-02-20 16:36:17 +00:00
bde
6982cfc5c0 Staticized.
Don't depend on "implicit int".
1998-02-20 13:52:15 +00:00
bde
bfefd71bbf Don't depend on "implicit int" or bloat the data section in the
declaration of xxx_devsw_installed.
1998-02-20 13:46:58 +00:00
bde
999552194c Don't depend on "implicit int". 1998-02-20 13:37:40 +00:00
bde
9fca072392 Removed unused #includes. 1998-02-20 13:11:54 +00:00
fenner
b318e8e53a Revert sosend() to its behavior from 4.3-Tahoe and before: if
so_error is set, clear it before returning it.  The behavior
introduced in 4.3-Reno (to not clear so_error) causes potentially
transient errors (e.g.  ECONNREFUSED if the other end hasn't opened
its socket yet) to be permanent on connected datagram sockets that
are only used for writing.

(soreceive() clears so_error before returning it, as does
getsockopt(...,SO_ERROR,...).)

Submitted by:	Van Jacobson <van@ee.lbl.gov>, via a comment in the vat sources.
1998-02-19 19:38:20 +00:00
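
The difference between the two sosend() behaviors described above can be sketched in a few lines of C. This is only an illustration, not the kernel code: struct toy_socket and both helper functions are hypothetical names, and only the so_error field mentioned in the commit message is modeled.

```c
#include <errno.h>

/* Hypothetical stand-in for the kernel socket; only so_error is modeled. */
struct toy_socket {
	int so_error;	/* pending asynchronous error, 0 if none */
};

/*
 * 4.3-Reno style: report the pending error but leave it set, so a
 * transient error keeps being returned on every later call.
 */
static int
pending_error_sticky(struct toy_socket *so)
{
	return (so->so_error);
}

/*
 * 4.3-Tahoe style (the behavior the commit restores): report the
 * pending error once and clear it, so the next call can succeed.
 */
static int
pending_error_clearing(struct toy_socket *so)
{
	int error = so->so_error;

	so->so_error = 0;
	return (error);
}
```

With the clearing variant, a connected datagram socket that once saw ECONNREFUSED reports it a single time; with the sticky variant, a write-only user of the socket never gets past it.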
eivind
a5daa0b2f8 Add HW_WDOG to LINT, and turn it into a new-style option. 1998-02-16 23:57:49 +00:00
phk
369a4922e3 A bunch of nits from bde. 1998-02-15 14:15:21 +00:00
phk
65d335ce82 Add a nanotime() function so that we can start to use this call. 1998-02-15 13:55:06 +00:00
phk
9b84234e63 unifdef -UEXT_CLOCK, it is irrelevant.
Fix a couple of nits from bde while here anyway.
1998-02-15 13:50:12 +00:00
bde
1743f0cdee Fixed an aliasing bug. It was too easy to defeat the check for moving
or shrinking an open partition (by changing the label for a compatibility
slice while partitions on the corresponding real slice are open, or vice
versa).
1998-02-15 05:41:31 +00:00
dyson
fbe6fe8df6 Make the rootdir handling more consistent. Now, processes always
have a root vnode associated with them, and no special checks for
the null case are needed.
Submitted by:	terry@freebsd.org
1998-02-15 04:17:09 +00:00
eivind
e51227b005 Make NO_LKM a new-style option.
Forgotten by:	dima
1998-02-12 18:02:07 +00:00
dima
8175420eed I'm not sure whether this is a correct way to do it,
but here's a new kernel option - "NO_LKM"

If anyone has better ideas - please let me know.
1998-02-11 20:47:55 +00:00
dg
fa872f3d14 Fix a && that should be an &.
Reviewed by:	"John S. Dyson" <dyson@FreeBSD.ORG>
Submitted by:	jwd@unx.sas.com (John W. DeBoskey)
1998-02-11 20:06:48 +00:00
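
The commit message does not say which expression was affected, so the following is only a generic sketch of the class of bug it names: using logical && where a bitwise & mask test was intended. The flag names and both helper functions are hypothetical.

```c
/* Hypothetical flag bits, chosen only for illustration. */
#define TOY_FLAG_A	0x01
#define TOY_FLAG_B	0x04

/*
 * Buggy form: && is a logical AND, so this is true whenever flags is
 * nonzero at all, regardless of whether TOY_FLAG_B is set.
 */
static int
has_flag_b_buggy(int flags)
{
	return (flags && TOY_FLAG_B);
}

/* Fixed form: & masks out the one bit that is being tested. */
static int
has_flag_b_fixed(int flags)
{
	return ((flags & TOY_FLAG_B) != 0);
}
```

Passing TOY_FLAG_A alone shows the difference: the buggy test reports the flag as present, the fixed test does not.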
eivind
feb116808c Include SIMPLELOCK_DEBUG functions even if SMP if compiling LINT; give
an error for the combination if _not_ compiling LINT.
1998-02-11 00:05:26 +00:00
eivind
01d55887a1 Move include of <machine/ipl.h> inside ifndef SMP where it is used, to
avoid getting 'unused include file' warnings in the SMP case.
1998-02-10 17:10:23 +00:00
kato
8b0f1ac87b Fixed vnode interlock handling.
Reviewed by:	Bruce Evans <bde@zeta.org.au>
            	Tor Egge <Tor.Egge@idi.ntnu.no>
1998-02-10 02:54:24 +00:00
eivind
d7a6ab2803 Staticize. 1998-02-09 06:11:36 +00:00
dyson
9f39ec243f Fix a problem with vn_lock in fsync. 1998-02-08 01:41:33 +00:00
kato
95aa7532e2 When the vp is locked, vget() calls vfs_object_create() with
waslocked = TRUE.  This change may fix lockmgr panic in umapfs/nullfs.

PR:		5634
Reviewed by:	"John S. Dyson" <toor@dyson.iquest.net>
Suggested by:	Bruce Evans <bde@zeta.org.au>
1998-02-07 08:44:31 +00:00
eivind
4547a09753 Back out DIAGNOSTIC changes. 1998-02-06 12:14:30 +00:00
dyson
ebccbfc1ff 1) Start using a cleaner and more consistent page allocator instead
of the various ad-hoc schemes.
2)	When bringing in UPAGES, the pmap code needs to do another vm_page_lookup.
3)	When appropriate, set the PG_A or PG_M bits a-priori to both avoid some
	processor errata, and to minimize redundant processor updating of page
	tables.
4)	Modify pmap_protect so that it can only remove permissions (as it
	originally supported.)  The additional capability is not needed.
5)	Streamline read-only to read-write page mappings.
6)	For pmap_copy_page, don't enable write mapping for source page.
7)	Correct and clean-up pmap_incore.
8)	Cluster initial kern_exec pagein.
9)	Removal of some minor lint from kern_malloc.
10)	Correct some ioopt code.
11)	Remove some dead code from the MI swapout routine.
12)	Correct vm_object_deallocate (to remove backing_object ref.)
13)	Fix dead object handling, that had problems under heavy memory load.
14)	Add minor vm_page_lookup improvements.
15)	Some pages are not in objects, and make sure that the vm_page.c can
	properly support such pages.
16)	Add some more page deficit handling.
17)	Some minor code readability improvements.
1998-02-05 03:32:49 +00:00
eivind
c552a9a1c3 Turn DIAGNOSTIC into a new-style option. 1998-02-04 22:34:03 +00:00
dg
13eef94007 Restrict idleprio to superuser:
Realtime priority has to be restricted for reasons which should be
obvious. However, for idle priority, there is a potential for
system deadlock if an idleprio process gains a lock on a resource
that other processes need (and the idleprio process can't run
due to a CPU-bound normal process). Fix me! XXX
PR: 5639
1998-02-04 18:43:10 +00:00
bde
eba2144b26 Fixed staticization. 1998-02-03 21:41:12 +00:00
bde
4eac1e52b3 Updated generated files. 1998-02-03 17:52:21 +00:00
bde
89e00e64d9 Fixed type of mincore(). 1998-02-03 17:45:43 +00:00
bde
8ccf06d1ca Generate a forward declaration of `struct proc' in <sys/sysproto.h>.
Removed extra args to a printf.

Fixed some style inconsistencies (unnecessary parentheses for printf).
awk is not C.
1998-02-03 17:39:13 +00:00
dyson
9c63bc645f Return the vm_map in the eproc structure, so we can support more accurate
VSZ display in PS.
1998-02-02 05:14:03 +00:00
dyson
2aacd1ab4f Change the busy page mgmt, so that when pages are freed, they
MUST be PG_BUSY.  It is bogus to free a page that isn't busy,
because it is in a state of being "unavailable" when being
freed.  The additional advantage is that the page_remove code
has a better cross-check that the page should be busy and
unavailable for other use.  There were some minor problems
with the collapse code, and this plugs those subtle "holes."

Also, the vfs_bio code wasn't checking correctly for PG_BUSY
pages.  I am going to develop a more consistent scheme for
grabbing pages, busy or otherwise.  For now, we are stuck
with the current morass.
1998-01-31 11:56:53 +00:00
eivind
712a1e61e7 Make the debug options new-style.
This also zaps a DPT option from lint; it wasn't referenced from
anywhere.
1998-01-31 07:23:16 +00:00
eivind
e8dbec0c06 Make POWERFAIL_NMI, PPS_SYNC and NATM new style options.
This also fixes a couple of defunct options; submitted by bde.
1998-01-31 05:00:21 +00:00
tegge
fbf474f2d8 Update freevnodes when adding a vnode to the head of the free list. 1998-01-31 01:17:58 +00:00
phk
bb6f7d8184 Retire LFS.
If you want to play with it, you can find the final version of the
code in the repository under the tag LFS_RETIREMENT.

If somebody makes LFS work again, adding it back is certainly
desirable, but as it is now nobody seems to care much about it,
and it has suffered considerable bitrot since its somewhat haphazard
integration.

R.I.P
1998-01-30 11:34:06 +00:00
steve
c0db3d8218 Fix a couple of operator precedence bugs.
PR:		5450
Submitted by:	Sakari Jalovaara <sja@tekla.fi>
1998-01-25 17:25:41 +00:00
dyson
548a436486 Various NFS fixes:
Make vfs_bio buffer mgmt work better.
	Buffers were being used after brelse.
	Make nfs_getpages work independently of other NFS
		interfaces.  This eliminates some difficult
		recursion problems and decreases pagefault
		overhead.
	Remove an erroneous vfs_unbusy_pages.
	Fix a reentrancy problem, with nfs_vinvalbuf when
		vnode is already being rundown.
	Reassignbuf wasn't being called when needed under
		certain circumstances.

	(Thanks to Bill Paul for help.)
1998-01-25 06:24:09 +00:00
eivind
71ddd31390 Make all file-system (MFS, FFS, NFS, LFS, DEVFS) related options new-style.
This introduces an xxxFS_BOOT for each of the rootable filesystems.
(Presently not required, but encouraged to allow a smooth move of option *FS
to opt_dontuse.h later.)

LFS is temporarily disabled, and will be re-enabled tomorrow.
1998-01-24 02:54:56 +00:00
dyson
8726294764 Add better support for larger I/O clusters, including larger physical
I/O.  The support is not mature yet, and some of the underlying implementation
needs help.  However, support does exist for IDE devices now.
1998-01-24 02:01:46 +00:00
dyson
197bd655c4 VM level code cleanups.
1)	Start using TSM.
	Struct procs continue to point to upages structure, after being freed.
	Struct vmspace continues to point to pte object and kva space for kstack.
	u_map is now superfluous.
2)	vm_map's don't need to be reference counted.  They always exist either
	in the kernel or in a vmspace.  The vmspaces are managed by reference
	counts.
3)	Remove the "wired" vm_map nonsense.
4)	No need to keep a cache of kernel stack kva's.
5)	Get rid of strange looking ++var, and change to var++.
6)	Change more data structures to use our "zone" allocator.  Added
	struct proc, struct vmspace and struct vnode.  This saves a significant
	amount of kva space and physical memory.  Additionally, this enables
	TSM for the zone managed memory.
7)	Keep ioopt disabled for now.
8)	Remove the now bogus "single use" map concept.
9)	Use generation counts or id's for data structures residing in TSM, where
	it allows us to avoid unneeded restart overhead during traversals, where
	blocking might occur.
10)	Account better for memory deficits, so the pageout daemon will be able
	to make enough memory available (experimental.)
11)	Fix some vnode locking problems. (From Tor, I think.)
12)	Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp.
	(experimental.)
13)	Significantly shrink, cleanup, and make slightly faster the vm_fault.c
	code.  Use generation counts, get rid of unneeded collapse operations,
	and clean up the cluster code.
14)	Make vm_zone more suitable for TSM.

This commit is partially as a result of discussions and contributions from
other people, including DG, Tor Egge, PHK, and probably others that I
have forgotten to attribute (so let me know, if I forgot.)

This is not the infamous, final cleanup of the vnode stuff, but a necessary
step.  Vnode mgmt should be correct, but things might still change, and
there is still some missing stuff (like ioopt, and physical backing of
non-merged cache files, debugging of layering concepts.)
1998-01-22 17:30:44 +00:00
bde
421158c94f Set p_retval for the correct process in getpriority(). This fixes
a null pointer panic when the pointer for the incorrect process is
NULL.  getpriority() was broken in rev.1.27.  Rev.1.28 broke the
warning instead of fixing the problem.

PR:	5495
1998-01-19 12:39:00 +00:00
dyson
b130b30c96 Tie up some loose ends in vnode/object management. Remove an unneeded
config option in pmap.  Fix a problem with faulting in pages.  Clean-up
some loose ends in swap pager memory management.

The system should be much more stable, but all subtle bugs aren't fixed yet.
1998-01-17 09:17:02 +00:00
phk
74b3033fff Move almost all the ntp related stuff from kern_clock.c to
kern_ntptime.c.  The only bit left over is that which is executed
in all calls to hardclock().  Various cleanups and staticizing
along the road.
1998-01-14 20:48:16 +00:00
phk
83c5648e27 Make softticks static.
Remove unneeded stuff.
1998-01-14 19:42:47 +00:00
dyson
9a35ec7fec Fix another vnode leak. 1998-01-12 03:15:01 +00:00
dyson
d9d8bf6d30 Fix some vnode management problems, and better mgmt of vnode free list.
Fix the UIO optimization code.
Fix an assumption in vm_map_insert regarding allocation of swap pagers.
Fix an spl problem in the collapse handling in vm_object_deallocate.
When pages are freed from vnode objects, and the criteria for putting
the associated vnode onto the free list is reached, either put the
vnode onto the list, or put it onto an interrupt safe version of the
list, for further transfer onto the actual free list.
Some minor syntax changes changing pre-decs, pre-incs to post versions.
Remove a bogus timeout (that I added for debugging) from vn_lock.

PHK will likely still have problems with the vnode list management, and
so do I, but it is better than it was.
1998-01-12 01:46:33 +00:00
dyson
2aff55e25f Implement the first page access for object type determination more
VM clean.  Also, use vm_map_insert instead of vm_mmap.
Reviewed by:	dg@freebsd.org
1998-01-11 21:35:38 +00:00
phk
dfbd942cf7 Try to solve timeout race by not touching softticks here. 1998-01-11 19:07:58 +00:00
phk
8a2e578b97 Fix softclock calling so we don't lose timeouts (I broke this ~10h ago) 1998-01-11 00:44:31 +00:00
phk
2c2c366569 Whoops. softclock is called from doreti_swi as well. Abandon call from
hardclock().

Forgot this:

Pointed hat sent by:	bd
1998-01-10 14:55:14 +00:00
phk
7b6e03e147 Effect the divorce of kern_clock.c and kern_timeout.c (which was
repository copied from kern_clock.c)
1998-01-10 13:16:26 +00:00
eivind
57d4125c71 Make the BOOTP family new-style options (in opt_bootp.h) 1998-01-09 03:21:07 +00:00
phk
1e948b4b8a Improve hardpps readability a bit:
* Rename usec to p_usec so you can search for it.
* Macroize the huge median_of_3_samples if statement.
1998-01-07 12:29:17 +00:00
dyson
66da5c4f34 Disable io optimizations again, minor bug found, and will be fixed in
a few days.
1998-01-07 09:26:29 +00:00
dyson
cb2800cd94 Make our v_usecount vnode reference count work identically to the
original BSD code.  The association between the vnode and the vm_object
no longer includes reference counts.  The major difference is that
vm_object's are no longer freed gratuitously from the vnode, and so
once an object is created for the vnode, it will last as long as the
vnode does.

When a vnode object reference count is incremented, then the underlying
vnode reference count is incremented also.  The two "objects" are now
more intimately related, and so the interactions are now much less
complex.

Vnodes are now normally placed onto the free queue with an object still
attached.  The rundown of the object happens at vnode rundown time, and
happens with exactly the same filesystem semantics of the original VFS
code.  There is absolutely no need for vnode_pager_uncache and other
travesties like that anymore.

A side-effect of these changes is that SMP locking should be much simpler,
the I/O copyin/copyout optimizations work, NFS should be more ponderable,
and further work on layered filesystems should be less frustrating, because
of the totally coherent management of the vnode objects and vnodes.

Please be careful with your system while running this code, but I would
greatly appreciate feedback as soon as reasonably possible.
1998-01-06 05:26:17 +00:00
alex
017d9a9242 Added missing caddr_t --> void * conversions for sys/mman.h functions.
Submitted by:	bde
1998-01-01 17:07:46 +00:00
bde
62111cffb4 Use a real malloc type for M_LINKER instead of #defining it as M_TEMP.
Fixed a comment.
1998-01-01 08:56:24 +00:00
dyson
7bf56bd14a Add the vnode interlock back around vref. 1997-12-29 16:54:03 +00:00
bde
85fbb446a9 Fixed style bugs in previous commit. 1997-12-29 08:54:52 +00:00
dyson
8ab3ac77d2 Fix the decl of vfs_ioopt, allow LFS to compile again, fix a minor problem
with the object cache removal.
1997-12-29 01:03:55 +00:00
dyson
cd67bb82fe Lots of improvements, including restructuring the caching and management
of vnodes and objects.  There are some metadata performance improvements
that come along with this.  There are also a few prototypes added when
the need is noticed.  Changes include:

1) Cleaning up vref, vget.
2) Removal of the object cache.
3) Nuke vnode_pager_uncache and friends, because they aren't needed anymore.
4) Correct some missing LK_RETRY's in vn_lock.
5) Correct the page range in the code for msync.

Be gentle, and please give me feedback asap.
1997-12-29 00:25:11 +00:00