Commit Graph

1491 Commits

Author SHA1 Message Date
John Dyson
8f9110f6a1 This mega-commit is meant to fix numerous interrelated problems. There
has been some bitrot and incorrect assumptions in the vfs_bio code.  These
problems have manifest themselves worse on NFS type filesystems, but can
still affect local filesystems under certain circumstances.  Most of
the problems have involved mmap consistancy, and as a side-effect broke
the vfs.ioopt code.  This code might have been committed seperately, but
almost everything is interrelated.

1)	Allow (pmap_object_init_pt) prefaulting of buffer-busy pages that
	are fully valid.
2)	Rather than deactivating erroneously read initial (header) pages in
	kern_exec, we now free them.
3)	Fix the rundown of non-VMIO buffers that are in an inconsistent
	(missing vp) state.
4)	Fix the disassociation of pages from buffers in brelse.  The previous
	code had rotted and was faulty in a couple of important circumstances.
5)	Remove a gratuitious buffer wakeup in vfs_vmio_release.
6)	Remove a crufty and currently unused cluster mechanism for VBLK
	files in vfs_bio_awrite.  When the code is functional, I'll add back
	a cleaner version.
7)	The page busy count wakeups assocated with the buffer cache usage were
	incorrectly cleaned up in a previous commit by me.  Revert to the
	original, correct version, but with a cleaner implementation.
8)	The cluster read code now tries to keep data associated with buffers
	more aggressively (without breaking the heuristics) when it is presumed
	that the read data (buffers) will be soon needed.
9)	Change to filesystem lockmgr locks so that they use LK_NOPAUSE.  The
	delay loop waiting is not useful for filesystem locks, due to the
	length of the time intervals.
10)	Correct and clean-up spec_getpages.
11)	Implement a fully functional nfs_getpages, nfs_putpages.
12)	Fix nfs_write so that modifications are coherent with the NFS data on
	the server disk (at least as well as NFS seems to allow.)
13)	Properly support MS_INVALIDATE on NFS.
14)	Properly pass down MS_INVALIDATE to lower levels of the VM code from
	vm_map_clean.
15)	Better support the notion of pages being busy but valid, so that
	fewer in-transit waits occur.  (use p->busy more for pageouts instead
	of PG_BUSY.)  Since the page is fully valid, it is still usable for
	reads.
16)	It is possible (in error) for cached pages to be busy.  Make the
	page allocation code handle that case correctly.  (It should probably
	be a printf or panic, but I want the system to handle coding errors
	robustly.  I'll probably add a printf.)
17)	Correct the design and usage of vm_page_sleep.  It didn't handle
	consistancy problems very well, so make the design a little less
	lofty.  After vm_page_sleep, if it ever blocked, it is still important
	to relookup the page (if the object generation count changed), and
	verify it's status (always.)
18)	In vm_pageout.c, vm_pageout_clean had rotted, so clean that up.
19)	Push the page busy for writes and VM_PROT_READ into vm_pageout_flush.
20)	Fix vm_pager_put_pages and it's descendents to support an int flag
	instead of a boolean, so that we can pass down the invalidate bit.
1998-03-07 21:37:31 +00:00
Tor Egge
1146c3560f The APs now reload the interrupt descriptor table pointer after
f00f_hack has run.

Use the global r_idt descriptor in f00f_hack when in SMP mode,
so the APs find the relocated interrupt descriptor table.

Submitted by:	Partially from David A Adkins <adkin003@tc.umn.edu>
1998-03-07 20:16:49 +00:00
Tor Egge
5dd528cd4e Remove special handling for resuming clock interrupt when using APIC_IO.
The `generic' vector stubs do the right thing.
1998-03-05 21:45:53 +00:00
Tor Egge
622a086be3 Use t_idt instead of idt inside setidt() if f00f_hack() has relocated the IDT.
Submitted by:	Bruce Evans <bde@zeta.org.au>
1998-03-05 19:37:03 +00:00
KATO Takenori
3a94cb7f7d Defined CCR6 and CCR7 (configuration registers of M2 CPU.) 1998-03-04 11:39:16 +00:00
Peter Dufault
f3df61a1cd Reviewed by: msmith, bde long ago
Fix for RTPRIO scheduler to eliminate invalid context switches.
1998-03-04 10:25:03 +00:00
Tor Egge
02c1dc3bbc When entering the apic version of slow interrupt handler, level
interrupts are masked, and EOI is sent iff the corresponding ISR bit
is set in the local apic. If the CPU cannot obtain the interrupt
service lock (currently the global kernel lock) the interrupt is
forwarded to the CPU holding that lock.

Clock interrupts now have higher priority than other slow interrupts.
1998-03-03 22:56:30 +00:00
Tor Egge
3163861c7b Forward the signal if the process runs on a different CPU. This reduces
the signal handling latency for cpu-bound processes that performs very
few system calls.

The IPI for forcing an additional software trap is no longer dependent upon
BETTER_CLOCK being defined.
1998-03-03 20:55:26 +00:00
Tor Egge
fe9cd27373 Reduce timeout before assuming that forwarding of hardclock or softclock
failed. Don't complain on forwarding failure, unless
BETTER_CLOCK_DIAGNOSTIC is defined.
1998-03-03 20:09:14 +00:00
Tor Egge
221e0c595b forward_statclock and forward_hardclock are located in mp_machdep.c. 1998-03-03 19:44:34 +00:00
Peter Wemm
c8a7999933 Update the ELF image activator to use some of the exec resources rather
than rolling it's own.  This means that it now uses the "safe"
exec_map_first_page() to get the ld.so headers rather than risking a panic
on a page fault failure (eg: NFS server goes down).
Since all the ELF tools go to a lot of trouble to make sure everything
lives in the first page for executables, this is a win.  I have not seen
any ELF executable on any system where all the headers didn't fit in the
first page with lots of room to spare.
I have been running variations of this code for some time on my pure ELF
systems.
1998-03-02 05:47:58 +00:00
John Dyson
ffc82b0a70 1) Use a more consistent page wait methodology.
2)	Do not unnecessarily force page blocking when paging
	pages out.
3)	Further improve swap pager performance and correctness,
	including fixing the paging in progress deadlock (except
	in severe I/O error conditions.)
4)	Enable vfs_ioopt=1 as a default.
5)	Fix and enable the page prezeroing in SMP mode.

All in all, SMP systems especially should show a significant
improvement in "snappyness."
1998-03-01 04:18:54 +00:00
Poul-Henning Kamp
3301c800f5 Prevent the TSC from being used on APM machines, we have no idea if
it runs at a constant frequency.  This was less of an issue before,
because the TSC only interpolated in the HZ intervals, but now where
the timecounter is used all the way, this becomes much more visible.

Nit: Fix a printf which triggered the bde-filter.
1998-02-28 21:16:13 +00:00
John Dyson
660957521c Fix page prezeroing for SMP, and fix some potential paging-in-progress
hangs.  The paging-in-progress diagnosis was a result of Tor Egge's
excellent detective work.
Submitted by:	Partially from Tor Egge.
1998-02-25 03:56:15 +00:00
Bruce Evans
6aeb4f6474 Removed vestiges of previous microtime() implementation. 1998-02-25 02:20:30 +00:00
John Dyson
d9bed5bee1 Try to dynamically size the VM_KMEM_SIZE (but is still able to be overridden
in a way identically as before.)  I had problems with the system properly
handling the number of vnodes when there is alot of system memory, and the
default VM_KMEM_SIZE.  Two new options "VM_KMEM_SIZE_SCALE" and
"VM_KMEM_SIZE_MAX" have been added to support better auto-sizing for systems
with greater than 128MB.

Add some accouting for vm_zone memory allocations, and provide properly
for vm_zone allocations out of the kmem_map.  Also move the vm_zone
allocation stats to the VM OID tree from the KERN OID tree.
1998-02-23 07:42:43 +00:00
Bruce Evans
168be6199e Quick fix for the i8254 timecounter often gaining 10 msec. 1998-02-23 00:11:25 +00:00
Jordan K. Hubbard
6bf2dcfc50 Add missing CLOCK_UNLOCK() before write_eflags().
Submitted by:	dave adkins <adkin003@tc.umn.edu>
1998-02-21 20:45:27 +00:00
Poul-Henning Kamp
7ec73f6417 Replace TOD clock code with more systematic approach.
Highlights:
    * Simple model for underlying hardware.
    * Hardware basis for timekeeping can be changed on the fly.
    * Only one hardware clock responsible for TOD keeping.
    * Provides a real nanotime() function.
    * Time granularity: .232E-18 seconds.
    * Frequency granularity:  .238E-12 s/s
    * Frequency adjustment is continuous in time.
    * Less overhead for frequency adjustment.
    * Improves xntpd performance.

Reviewed by:    bde, bde, bde
1998-02-20 16:36:17 +00:00
Bruce Evans
39e4376ba7 Removed unused #includes. 1998-02-20 13:11:54 +00:00
Mike Smith
42be88e88d Remove DISABLE_PSE option which was masking (but not fixing) the problem.
A correct fix for execution off MFS filesystems has been committed.
1998-02-16 23:57:03 +00:00
Mike Smith
354f83d34f TEMPORARILY disable support for the 4MB kernel page, as it appears to be
causing installation images for -current to be unbootable.

Submitted by:	phk
1998-02-16 00:29:05 +00:00
Bruce Evans
ee10b2a475 Removed a superstitious fnop() that broke the usefulness of the FPU's
"last instruction" pointer.
1998-02-15 06:25:26 +00:00
KATO Takenori
53a76bb7ed Use RDMSR instruction instead of WRMSR. 1998-02-13 09:34:42 +00:00
Bruce Evans
359888780e Ifdefed SMP-only declarations. 1998-02-13 06:59:22 +00:00
Bruce Evans
1d804e7925 Update timer0_prescaler_count before calling hardclock() while timer0
is "acquired".  This fixes a TSC biasing error of about 10 msec when
pcaudio is active.

Update `time' before calling hardclock() when timer0 is being released.
This is not known to be important.

Added some delays in writertc().  Efficiency is not critical here, unlike
in rtcin(), and we already use conservative delays there.

Don't touch the hardware when machdep.i8254_freq is being changed but
the maximum count wouldn't change.  This fixes jitter of up to 10 msec
for most small adjustments to machdep.i8254_freq.  When the maximum
count needs to change, the hardware should be adjusted more carefully.
1998-02-13 06:33:16 +00:00
Bruce Evans
9f449d2aa2 Ifdefed some npx code. npx should be optional again. 1998-02-13 05:30:18 +00:00
Bruce Evans
9729f279db Fixed missing privilege checking and off-by-1 bounds checking in
i386_set_ioperm().  Don't use a magic number for the bound.

Fixed missing bounds checking in i386_get_ioperm().  Don't use a
magic number for the bound elsewhere in this function.

Removed some bogus initializers.
1998-02-13 05:25:37 +00:00
Bruce Evans
045b6fefd8 Fixed initialization of the 4MB page. Kernels larger than about 2.75MB
(from _btext to _end) crashed in pmap_bootstrap().  Smaller kernels
worked accidentally.
1998-02-12 22:00:01 +00:00
Bruce Evans
cd59d49d07 Only use the i586-optimized copying and zeroing functions if they are
actually faster (more than 20% faster for zeroing 1 MB at boot time).
This fixes pessimized copying and zeroing on K6's and perhaps on other
CPUs that are misclassified as i586's.
1998-02-12 21:41:10 +00:00
Eivind Eklund
b265178126 Fix warning after previous staticization. 1998-02-10 17:30:26 +00:00
Eivind Eklund
303b270b0a Staticize. 1998-02-09 06:11:36 +00:00
Eivind Eklund
5589ae9dfd Remove warnings from f00f_hack. 1998-02-09 04:45:53 +00:00
Eivind Eklund
0b08f5f737 Back out DIAGNOSTIC changes. 1998-02-06 12:14:30 +00:00
John Dyson
95461b450d 1) Start using a cleaner and more consistant page allocator instead
of the various ad-hoc schemes.
2)	When bringing in UPAGES, the pmap code needs to do another vm_page_lookup.
3)	When appropriate, set the PG_A or PG_M bits a-priori to both avoid some
	processor errata, and to minimize redundant processor updating of page
	tables.
4)	Modify pmap_protect so that it can only remove permissions (as it
	originally supported.)  The additional capability is not needed.
5)	Streamline read-only to read-write page mappings.
6)	For pmap_copy_page, don't enable write mapping for source page.
7)	Correct and clean-up pmap_incore.
8)	Cluster initial kern_exec pagin.
9)	Removal of some minor lint from kern_malloc.
10)	Correct some ioopt code.
11)	Remove some dead code from the MI swapout routine.
12)	Correct vm_object_deallocate (to remove backing_object ref.)
13)	Fix dead object handling, that had problems under heavy memory load.
14)	Add minor vm_page_lookup improvements.
15)	Some pages are not in objects, and make sure that the vm_page.c can
	properly support such pages.
16)	Add some more page deficit handling.
17)	Some minor code readability improvements.
1998-02-05 03:32:49 +00:00
Eivind Eklund
47cfdb166d Turn DIAGNOSTIC into a new-style option. 1998-02-04 22:34:03 +00:00
Eivind Eklund
7dff14331f Make FAILSAFE a new-style option. 1998-02-04 03:47:16 +00:00
Bruce Evans
44429dc4f7 Converted DISABLE_PSE to a new-style option.
Fixed some formatting in options.i386.
1998-02-03 22:09:01 +00:00
Bruce Evans
a8961a8284 Ifdefed some SMP and VM86 code. Note that although VM86 is not a global
option, the ifdef on it in a header works because only the name of the
VM86 extension is hidden.
1998-02-03 21:27:50 +00:00
Bruce Evans
318ed3ed7a Forward declare a union so that this file is self-sufficient.
Cleaned up ifdefs.
1998-02-03 20:46:18 +00:00
Bruce Evans
7a1a679ecb Ifdefed use of a GNU feature. 1998-02-03 20:32:38 +00:00
Bruce Evans
af3b301b36 Fixed disordering of busdma* and swi_vm. 1998-02-01 23:00:53 +00:00
Bruce Evans
e29b867bd6 Fixed a recently broken comment. 1998-02-01 22:45:23 +00:00
Bruce Evans
76d0502a56 Declare printf() instead of including <stdio.h>, so that this doesn't
depend on anything outside of "sys".

Removed an unused include.

Don't use `extern' in a function declaration.
1998-02-01 18:53:09 +00:00
John Dyson
eaf13dd73a Change the busy page mgmt, so that when pages are freed, they
MUST be PG_BUSY.  It is bogus to free a page that isn't busy,
because it is in a state of being "unavailable" when being
freed.  The additional advantage is that the page_remove code
has a better cross-check that the page should be busy and
unavailable for other use.  There were some minor problems
with the collapse code, and this plugs those subtile "holes."

Also, the vfs_bio code wasn't checking correctly for PG_BUSY
pages.  I am going to develop a more consistant scheme for
grabbing pages, busy or otherwise.  For now, we are stuck
with the current morass.
1998-01-31 11:56:53 +00:00
Eivind Eklund
3f2076daf5 Make the debug options new-style.
This also zaps a DPT option from lint; it wasn't referenced from
anywhere.
1998-01-31 07:23:16 +00:00
Eivind Eklund
e0d781f3a5 Make POWERFAIL_NMI, PPS_SYNC and NATM new style options.
This also fixes a couple of defunct options; submitted by bde.
1998-01-31 05:00:21 +00:00
Eivind Eklund
828084926f Skip probing devices which have already probed true. 1998-01-31 03:29:00 +00:00
Eivind Eklund
5212dc9c0b Include "opt_nfs.h"
Pointed out by: Eric L. Hernes <erich@lodgenet.com>
1998-01-31 02:53:41 +00:00
Poul-Henning Kamp
c5b193bfba Retire LFS.
If you want to play with it, you can find the final version of the
code in the repository the tag LFS_RETIREMENT.

If somebody makes LFS work again, adding it back is certainly
desireable, but as it is now nobody seems to care much about it,
and it has suffered considerable bitrot since its somewhat haphazard
integration.

R.I.P
1998-01-30 11:34:06 +00:00