940 Commits

Author SHA1 Message Date
John Dyson
c44013cde6 Get rid of PIPE_NBIO, cleaning up the code a bit.
Reviewed by:	bde
1996-07-04 04:36:56 +00:00
David Greenman
7c818168d5 Fixed a major bug that caused various pmap related panics, hangs, and reboots.
The i386 pmap module uses a special area of kernel virtual memory for mapping
of page tables pages when it needs to modify another process's virtual
address space. It's called the 'alternate page table map'. There is only one
of them and it's expected that only one process will be using it at once and
that the operation is atomic.
When the merged VM/buffer cache was implemented over a year ago, it became
necessary to rundown VM pages at I/O completion. The unfortunate and
unforeseen side effect of this is that pmap functions are now called at bio
interrupt time. If there happend to be a process using the alternate page
table map when this I/O completion occurred, it was possible for a different
process's address space to be switched into the alternate page table map -
leaving the current pmap process with the wrong address space mapped when
the interrupt completed. This resulted in BAD things happening like pages
being mapped or removed from the wrong address space, etc.. Since a very
common case of a process modifying another process's address space is during
fork when the kernel stack is inserted, one of the most common manifestations
of this bug was the kernel stack not being mapped properly, resulting in a
silent hang or reboot. This made it VERY difficult to troubleshoot this bug
(I've been trying to figure out the cause of this for >6 months). Fortunately,
the set of conditions that must be true before this problem occurs is
sufficiently rare enough that most people never saw the bug occur. As I/O
rates increase, however, so does the frequency of the crashes. This problem
used to kill wcarchive about every 10 days, but in more recent times when
the traffic exceeded >100GB/day, the machine could barely manage 6 hours of
uptime.
The fix is to make certain that no process has the pages mapped that are
involved in the I/O, before the I/O is started. The pages are made busy, so
no process will be able to map them, either, until the I/O has finished.
This side-steps the issue by still allowing the pmap functions to be called
at interrupt time, but also assuring that the alternate page table map won't
be switched.
Unfortunately, this appears to not be the only cause of this problem. :-(

Reviewed by:	dyson
1996-06-30 05:17:08 +00:00
John Dyson
4b83b27fea Fix a problem that caused system crashes after physio. This problem
was due to non-aligned 64K transfers taking 17 pages.  We currently
do not support >16 page transfers.  The transfer is unfortunately truncated,
but since buffers are usually malloced, this is a problem only once in
a while.  Savecore is a culprit, but tar/cpio usually aren't.  This
is NOT the final fix (which is likely a bouncing scheme), but will at
least keep the system from crashing.
1996-06-26 05:52:15 +00:00
Bruce Evans
79df6d8597 trap.c:
Fixed profiling of system times.  It was pre-4.4Lite and didn't support
statclocks.  System times were too small by a factor of 8.

Handle deferred profiling ticks the 4.4Lite way: use addupc_task() instead
of addupc().  Call addupc_task() directly instead of using the ADDUPC()
macro.

Removed vestigial support for PROFTIMER.

switch.s:
Removed addupc().

resourcevar.h:
Removed ADDUPC() and declarations of addupc().

cpu.h:
Updated a comment.  i386's never were tahoe's, and the deferred profiling
tick became (possibly) multiple ticks in 4.4Lite.

Obtained from:	mostly from NetBSD
1996-06-25 20:02:16 +00:00
Bruce Evans
cc3d522683 Unstaticize psratio and staticize profprocs. psratio needs to be exported
to trap.c to fix user profiling.
1996-06-23 17:40:47 +00:00
John Dyson
6ead3edd9c Clean-up the new VM map procfs code, and also add support for executable
format file "etype".  It contains a description of the binary type for
a process.
1996-06-18 05:16:00 +00:00
Bill Paul
8a095c52ed Add a couple of #ifdef DEVFS/#endif clauses to slence the following
compiler warnings which occur if you don't have 'options DEVFS' in
your kernel config file:

../../kern/kern_descrip.c: In function `fildesc_drvinit':
../../kern/kern_descrip.c:1103: warning: unused variable `fd'
../../kern/kern_descrip.c: At top level:
../../kern/kern_descrip.c:1095: warning: `devfs_token_stdin' defined but not use
d
../../kern/kern_descrip.c:1096: warning: `devfs_token_stdout' defined but not us
ed
../../kern/kern_descrip.c:1097: warning: `devfs_token_stderr' defined but not us
ed
../../kern/kern_descrip.c:1098: warning: `devfs_token_fildesc' defined but not u
sed
1996-06-17 16:54:03 +00:00
Bruce Evans
f0ea07bffb Reduced nesting of #includes in random.h and adjusted isa/random_machdep.c
to match (pc98/random_machdep.c probably requires a similar change).  This
is a problem area for the PC98 merge - all PC98 ifdefs in <machine/*.h> are
kludges to work around incorrect layering.
1996-06-17 16:47:43 +00:00
Bruce Evans
43be698cb6 Moved initialization of defaults for the label for the whole disk from
disklabel(8) to the kernel (dsopen()).  Drivers should initialize the
hardware values (rpm, interleave, skews).  Drivers currently don't do
this, but it usually doesn't matter since rotational position stuff is
normally disabled.
1996-06-17 14:43:54 +00:00
John Dyson
23fd45be00 Disable direct writes for non-blocking output. 1996-06-17 05:15:01 +00:00
Satoshi Asami
ad63a118b2 The Great PC98 Merge.
All new code is "#ifdef PC98"ed so this should make no difference to
PC/AT (and its clones) users.

Ok'd by:	core
Submitted by:	FreeBSD(98) development team
1996-06-14 11:02:28 +00:00
Satoshi Asami
d7629dff3b A fast memory copy for Pentiums using floating point registers.
It is called from copyin and copyout.

The new routine is conditioned on I586_CPU and I586_FAST_BCOPY, so you
need

options "I586_FAST_BCOPY"

(quotes essenstial) in your kernel config file.

Also, if you have other kernel types configured in your kernel, an
additional check to make sure it is running on a Pentium is inserted.
(It is not clear why it doesn't help on P6s, it may be just that the
 Orion chipset doesn't prefetch as efficiently as Tritons and friends.)

Bruce can now hack this away. :)
1996-06-13 07:17:21 +00:00
Joerg Wunsch
6b1d48f79c Externalize the declaration of dc_list. This is required in order to
get a ``generic'' kernel (``config kernel swap generic'') to compile.
1996-06-12 15:10:30 +00:00
Gary Palmer
c23670e294 Clean up -Wunused warnings.
Reviewed by:		bde
1996-06-12 05:11:41 +00:00
John Dyson
47dcd2e56c Change the symbol name used in the last commit from USRSTACK to
VM_MAXUSER_ADDRESS.  Even though they are the same, the new name
is more descriptive.
1996-06-11 23:50:48 +00:00
John Dyson
9a0a69469d Get rid of the unneeded upper address space. 1996-06-11 23:05:26 +00:00
Nate Williams
1c346c7092 Implemented 'kern_sysctl', which differs from 'userland_sysctl' in that
it assumes all of the data exists in the kernel.  Also, fix
sysctl_new-kernel (unused until now) which had reversed operands to
bcopy().

Reviewed by:	phk

Poul writes:
... actually the lock/sleep/wakeup cruft shouldn't be needed in the
kernel version I think, but just leave it there for now.
1996-06-10 16:23:42 +00:00
Bruce Evans
7a60d58aee Updated some comments in settimeofday(). 1996-06-08 11:55:32 +00:00
Bruce Evans
604396ff31 Fixed accumulation of run time for processes that don't accumulate
any statclock ticks.  Pretend that all the time up to the first
statclock tick is system time.  .  This makes a difference mainly for
benchmarks that test short-lived processes - the user and system
times for processes that each lived for about 1ms only added up to
about 10% of the real time even when there was very little interrupt
activity.

Break the printing of a quad_t variable correctly.
1996-06-08 11:48:28 +00:00
Bruce Evans
5e0fc49ea8 Replaced some memcpy()'s by bcopy()'s.
gcc only inlines memcpy()'s whose count is constant and didn't inline
these.  I want memcpy() in the kernel go away so that it's obvious that
it doesn't need to be optimized.  Now it is only used for one struct
copy in si.c.
1996-06-08 08:18:00 +00:00
Poul-Henning Kamp
3ce93e4e80 Fix the same problem that davidg fixed in -stable some days ago and
restructure sysctl stuff a bit.  KERN_PROC_PID now uses pfind().
1996-06-06 17:19:21 +00:00
Poul-Henning Kamp
7a69d9230f If handler function returns EAGAIN, restart operation. 1996-06-06 17:17:54 +00:00
John Dyson
261fe9665d Fix an error when B_MALLOC buffers are returned from the cluster read
code without the B_READ flag being set.  This is a problem when the
data is not cached, and the result will be a bogus attempted write.
Submitted by:	Kato Takenori <kato@eclogite.eps.nagoya-u.ac.jp>
1996-06-03 04:40:35 +00:00
David Greenman
86064318c4 Use kmem_alloc_wait/kmem_free_wakeup() to avoid allocation failures
from running out of string space in the exec_map.
1996-06-03 04:12:18 +00:00
David Greenman
6120fef1bc Fix declaration of ps_strings. 1996-06-03 04:09:36 +00:00
John Dyson
4ebce1e9a6 Remove the now-unnecessary and incorrect wiring of the "other" processes
page table pages.  The pmap layer now handles that fully.
1996-06-02 06:24:27 +00:00
John Dyson
268e9c5397 Keep brelse from freeing busy pages. 1996-05-31 00:41:37 +00:00
Peter Wemm
114a8cff43 Add an option "EXTRA_VNODES" to cause an extra number of vnode structures
to be allocated at boot time.  This is an expensive option, as they
consume physical ram and are not pageable etc.  In certain situations,
this kind of option is quite useful, especially for news servers that
access a large number of directories at random and torture the name cache.
Defining 5000 or 10000 extra vnodes should cut down the amount of vnode
recycling somewhat, which should allow better name and directory caching
etc.

This is a "your mileage may vary" option, with no real indication of
what works best for your machine except trial and error.  Too many will
cost you ram that you could otherwise use for disk buffers etc.

This is based on something John Dyson mentioned to me a while ago.
1996-05-31 00:20:34 +00:00
David Greenman
cd73303c45 Fix a panic caused by (proc)->p_session being dereferenced for a process
that was exiting.
1996-05-30 01:21:50 +00:00
Peter Wemm
472fe5e4db Dont allow directories to be link()ed or unlink()ed, even for root
(returns EPERM always, the errno is specified by POSIX).

If you really have a desperate need to link or unlink a directory, you
can use fsdb. :-)

This should stop any chance of ftpd, rdist, "rm -rf", etc from
bugging out and damaging the filesystem structure or loosing races
with malicious users.

Reviewed by: davidg, bde
1996-05-24 16:19:23 +00:00
John Dyson
301051a01e Make sure that we don't place a busy or held page onto the PQ_CACHE queue. 1996-05-24 05:21:58 +00:00
John Dyson
c51bd6784e Change the *evil* allocation of memory from kmem_map to the kernel_map.
This will mess things up especially recently.
1996-05-24 01:39:50 +00:00
John Dyson
14bf02f8f4 Minor performance improvement to kern_malloc.c that increases the
probability of reuse of recently freed memory.  This improves cache
hit stats on cached memory, and improves at least fork speed consistancy.
1996-05-18 22:33:13 +00:00
John Dyson
b18bfc3da7 This set of commits to the VM system does the following, and contain
contributions or ideas from Stephen McKay <syssgm@devetir.qld.gov.au>,
Alan Cox <alc@cs.rice.edu>, David Greenman <davidg@freebsd.org> and me:

	More usage of the TAILQ macros.  Additional minor fix to queue.h.
	Performance enhancements to the pageout daemon.
		Addition of a wait in the case that the pageout daemon
		has to run immediately.
		Slightly modify the pageout algorithm.
	Significant revamp of the pmap/fork code:
		1) PTE's and UPAGES's are NO LONGER in the process's map.
		2) PTE's and UPAGES's reside in their own objects.
		3) TOTAL elimination of recursive page table pagefaults.
		4) The page directory now resides in the PTE object.
		5) Implemented pmap_copy, thereby speeding up fork time.
		6) Changed the pv entries so that the head is a pointer
		   and not an entire entry.
		7) Significant cleanup of pmap_protect, and pmap_remove.
		8) Removed significant amounts of machine dependent
		   fork code from vm_glue.  Pushed much of that code into
		   the machine dependent pmap module.
		9) Support more completely the reuse of already zeroed
		   pages (Page table pages and page directories) as being
		   already zeroed.
	Performance and code cleanups in vm_map:
		1) Improved and simplified allocation of map entries.
		2) Improved vm_map_copy code.
		3) Corrected some minor problems in the simplify code.
	Implemented splvm (combo of splbio and splimp.)  The VM code now
		seldom uses splhigh.
	Improved the speed of and simplified kmem_malloc.
	Minor mod to vm_fault to avoid using pre-zeroed pages in the case
		of objects with backing objects along with the already
		existant condition of having a vnode.  (If there is a backing
		object, there will likely be a COW...  With a COW, it isn't
		necessary to start with a pre-zeroed page.)
	Minor reorg of source to perhaps improve locality of ref.
1996-05-18 03:38:05 +00:00
Poul-Henning Kamp
7642f4742d Ups, I removed NMB_INIT too.
Complained about by:	asami
1996-05-12 07:48:47 +00:00
Poul-Henning Kamp
0482730e40 Nail down NCL_INIT = 1, and put a comment there telling what it is. 1996-05-11 20:43:23 +00:00
Bruce Evans
d03b40173c Hide options for emulators and static file systems in opt_dontuse.h.
These options only apply at config time.  Using them at compile time
would break the corresponding lkms.
1996-05-11 04:39:53 +00:00
Garrett Wollman
cb7545a995 Allocate mbufs from a separate submap so that NMBCLUSTERS works as
expected.
1996-05-10 19:28:55 +00:00
Garrett Wollman
82dab6ce62 Make it possible to return more than one piece of control information
(PR #1178).
Define a new SO_TIMESTAMP socket option for datagram sockets to return
packet-arrival timestamps  as control information (PR #1179).

Submitted by:	Louis Mamakos <loiue@TransSys.com>
1996-05-09 20:15:26 +00:00
Gary Palmer
4e31a37b94 Correct a comment. There is no fn `kprintf' 1996-05-09 18:58:06 +00:00
Garrett Wollman
6a06dea05f Our new-old mbugf allocator. This is actually something of a blast from
the past, since it returns to the old system of allocating mbufs out of
a private area rather than using the kernel malloc().  While this may seem
like a backwards step to some, the new allocator is some 20% faster than
the old one and has much better caching properties.

Written by: John Wroclawski <jtw@lcs.mit.edu>
1996-05-08 19:38:27 +00:00
Gary Palmer
6ddbf1e299 Clean up various compiler warnings. Most (if not all) were benign
Reviewed by:	bde
1996-05-08 04:29:08 +00:00
Poul-Henning Kamp
b74a76ee15 An old typo MCLBYTES/CLBYTES became more obvious bogus now.
Submitted by:		wollman
1996-05-06 17:18:12 +00:00
Joerg Wunsch
9e609dde83 uninitialized auto variable shmseg is used in ...
Closes PR #kern/1174

Submitted by:	enami@ba2.so-net.or.jp
1996-05-05 13:53:48 +00:00
Poul-Henning Kamp
aa8de40ae5 Another sweep over the pmap/vm macros, this time with more focus on
the usage.  I'm not satisfied with the naming, but now at least there is
less bogus stuff around.
1996-05-03 21:01:54 +00:00
Poul-Henning Kamp
ab76ac21e7 disksort() is gone, all drivers now use tqdisksort(). 1996-05-03 15:05:17 +00:00
Poul-Henning Kamp
e911eafcba removed:
CLBYTES PD_SHIFT PGSHIFT NBPG PGOFSET CLSIZELOG2 CLSIZE pdei()
        ptei() kvtopte() ptetov() ispt() ptetoav() &c &c
new:
        NPDEPG

Major macro cleanup.
1996-05-02 14:21:14 +00:00
Peter Wemm
88d1b64235 Fix a nasty bug that causes random crashes and lockups particularly on
very busy servers (eg: news, web).  This is an interaction between
embryonic processes that have not yet finished forking, and happen to
cause the kernel VM space to grow, hitting the uninitialised variable.

It was possible for this to strike at any time, depending on the size of
your kernel and load patterns.  One machine had paniced occasionally
when cron launches a job since before the 2.1 release.

If you had "options DIAGNOSTIC", you may have seen references to bogus
addresses like 0xdeadc142 and the like.

This is a minimal change to fix the problem, it will probably be done
better by reordering p_vmspace to be in the startzero section, but it
becomes harder to validate then.

It's been vulnerable since pmap.c rev 1.40 (Jan 9, 1995), so it's been a
cause of problems since well before 2.0.5.  This was when the merged
VM/buffer cache and the dynamic growing kernel VM space were first
committed.  This probably fixes a few of PR's.
1996-05-02 11:38:05 +00:00
Poul-Henning Kamp
f8845af0db First pass at cleaning up macros relating to pages, clusters and all that. 1996-05-02 10:43:17 +00:00
Poul-Henning Kamp
a8c5fef5e6 KGDB is dead. It may come back one day if somebody does it. 1996-05-02 09:34:51 +00:00