Commit Graph

1336 Commits

Author SHA1 Message Date
Garrett Wollman
3f7773458e When APM is configured, turn off the power when halting for good. 1997-06-15 02:03:03 +00:00
Bruce Evans
ac5fcb6d4d Removed unused #includes. 1997-06-14 11:38:46 +00:00
Bruce Evans
bad324ca54 Fixed livelock in getnewbuf().
It is possible for multiple process to sleep concurrently waiting
for a buffer.  When the buffer shortage is a shortage of space but
not a shortage of buffer headers, the processes took turns creating
empty buffers and waking each other to advertise the brelse() of
the empties; progress was never made because tsleep() always found
another high-priority process to run and everything was done at
splbio(), so vfs_update never had a chance to flush delayed writes,
not to mention that i/o never had a chance to complete.

The problem seems to be rare in practice, but it can easily be
reproduced by misusing block devices, at least for sufficently slow
devices on machines with a sufficiently small buffer cache.  E.g.,
`tar cvf /dev/fd0 /kernel' on an 8MB system with no disk in fd0
causes the problem quickly; the same command with a disk in fd0
causes the problem not quite as quickly; and people have reported
problems newfs'ing file systems on block devices.

Block devices only cause this problem indirectly.  They are pessimized
for time and space, and the space pessimization causes the shortage
(it manifests as internal fragmentation in buffer_map).

This should be fixed in 2.2.
1997-06-13 08:30:40 +00:00
David Greenman
2e58c0f892 Disabled the kern.vnode sysctl variable. It's causing system crashes on
large systems and needs to be re-thinked or removed wholesale.
1997-06-10 02:48:08 +00:00
Andrey A. Chernov
7f533ff73f Add safety check in case "conflicts" keyword specified more times than
needed
1997-06-08 17:15:31 +00:00
Bruce Evans
7b3c84247b Preserve %fs and %gs across context switches. This has a relatively low
cost since it is only done in cpu_switch(), not for every exception.
The extra state is kept in the pcb, and handled much like the npx state,
with similar deficiencies (the state is not preserved across signal
handlers, and error handling loses state).
1997-06-07 04:36:10 +00:00
Doug Rabson
e90b93a1d0 Don't throw NFS B_DELWRI buffers back to the vm system in brelse.
Make sure that b_validoff..b_validend is at least as big as
b_dirtyoff..b_dirtyend.
1997-06-06 09:04:28 +00:00
Doug Rabson
501338ca4f Fix some performance problems with the NFS mmap fixes. 1997-06-03 09:42:43 +00:00
Doug Rabson
0c514a25a0 The defines INTR_FAST and INTR_EXCL are part of the public interface. The
previous commit made them private which broke things.
1997-06-02 10:46:28 +00:00
Doug Rabson
6d47a3a499 Change isa_device.h to intr_machdep.h 1997-06-02 10:44:08 +00:00
Doug Rabson
683523378c Move interrupt handling code from isa.c to a new file. This should make
isa.c (slightly) more portable and will make my life developing the really
portable version much easier.

Reviewed by:	peter, fsmp
1997-06-02 08:19:06 +00:00
Julian Elischer
939c19614c tiny spelling fix in comment 1997-06-02 04:56:38 +00:00
Peter Wemm
8c046d14dc Move "typedef struct intrec {} intrec" from sys/interrupt.h to kern_intr.c
since that's the only place that it's used.

Submitted by: se  (apparently on suggestion from dfr)
1997-06-01 16:05:14 +00:00
Peter Wemm
bf5acbf51f oops, fix a braino that I noticed during the commit.. Don't verify the
remaining time pointer if it's NULL, since we don't write back in that
case! (*blush*!)
1997-06-01 09:05:19 +00:00
Peter Wemm
5b870b7ba7 - implement signanosleep(2) by moving common code from nanosleep() into a
shared function.
- use p->p_sleepend to try and get more accurate "time remaining" results
when the time has been adjusted.
- verify writeability of return address so that we can fail before sleeping
if the address for the result is bogus.
1997-06-01 09:01:07 +00:00
Peter Wemm
83a6ec5e6a Regenerate 1997-06-01 08:56:12 +00:00
Peter Wemm
99f06d5c02 New syscall, signanosleep(), which is a hybrid of sigsuspend(2) and
nanosleep(2).  It sleeps until either the time expires, or a signal
permitted by the supplied mask arrives (eg: SIGALRM if appropriate)
1997-06-01 08:52:38 +00:00
Peter Wemm
4407ec71c7 <machine/spl.h> -> <machine/ipl.h>
s/intrmask/intrmask_t/g

Reviewed by: bde, se
1997-05-31 09:30:39 +00:00
Peter Wemm
5400ed3b2f Include file updates.. <machine/spl.h> -> <machine/ipl.h>, add
<machine/ipl.h> to those files that were depending on getting SWI_*
implicitly via <machine/cpufunc.h>
1997-05-31 09:27:31 +00:00
Doug Rabson
bc3718bb36 The previous fix didn't work properly for small block size filesystems,
which caused very slow file access for cd9660 and some ext2fs filesystems.

Reviewed by:	bde
1997-05-30 22:25:35 +00:00
Steve Passe
a8baaafda0 Code such as apic_base[APIC_ID] converted to lapic__id
Changes to pmap.c for lapic_t lapic && ioapic_t ioapic pointers,
currently equal to apic_base && io_apic_base, will stand alone with the
private page mapping.
1997-05-29 05:58:41 +00:00
Peter Wemm
64b3672f39 minor style police (recent divergence from KNF code) 1997-05-29 05:07:10 +00:00
Peter Wemm
ae9249615f remove opt_smp.h and fix the reason it was needed. 1997-05-29 05:04:30 +00:00
Peter Wemm
8f453f3ed3 Don't need "opt_smp.h" on these files 1997-05-29 04:52:04 +00:00
Stefan Eßer
f9220cde63 Fix problem reported by PHK: Panic in pcic probe because of NULL pointer
dereference (head->next in intr_disconnect).
1997-05-28 22:11:00 +00:00
Alexander Langer
b7564dc0b0 Define NPRIMES in terms of the number of elements in 'primes' (as opposed
to hardcoding it).
1997-05-28 00:47:27 +00:00
Steve Passe
c21701d99b Nuke the printing of the unredirect message unless bootverbose. 1997-05-27 19:28:10 +00:00
Stefan Eßer
425f9fda52 Add support for shared interrupts to the kernel. This code is meant
be (eventually) architecture independent. It provides an emulation
of the ISA interrupt registration function register_intr(), but that
function does no longer manipulated the interrupt controller and
interrupt descriptor table, but calls the architecture dependent
function setup_icu() for that purpose.

After theISA/EISA bus code has been modified to directly call the new
interrupt registartion functions (intr_create() and intr_connect()),
the emulation of register_intr() should be dropped.

The C level interrupt handler function should take a (void*) argument,
and the function pointer type (inthand2_t) should defined in some  other
place than isa_device.h.

This commit is a pre-requisite for the removal of the PCI specific shared
interrupt code.

Reviewed by:	dfr,bde
1997-05-26 14:37:43 +00:00
Steve Passe
1ffa54ef38 Added a test called 'LATE_START'.
This is now the default, it delays most of the MP startup to the function
machdep.c:cpu_startup().  It should be possible to move the 2 functions
found there (mp_start() & mp_announce()) even further down the path once
we know exactly where that should be...

Help from: Peter Wemm <peter@spinner.dialix.com.au>
1997-05-26 09:23:30 +00:00
Steve Passe
45fedb144d Broke up parse_mp_table() into 2 passes:
- The 1st (preparse_mp_table()) counts the number of cpus, busses, etc. and
   records the LOCAL and IO APIC addresses.
 - The 2nd pass (parse_mp_table()) does the actual parsing of info and recording
   into the incore MP table.

This will allow us to defer the 2nd pass untill malloc() & private pages
are available (but thats for another day!).
1997-05-25 02:49:03 +00:00
Steve Passe
7f6c65fa06 Now that panic() is properly printing messages for early SMP panics all
the 'printf("..."); panic("\n")' sections are returned to 'panic("...")'.
1997-05-24 18:48:53 +00:00
Steve Passe
47d818975b Move the printing of "cpu#%d" to AFTER the general panic argument string.
When a panic occurs early in the SMP boot process 'cpunumber()' hangs,
causing the panic string to be lost.  Now the system appears to hang
in 'breakpoint()', but at least the user sees the panic string before the
hang.
1997-05-24 18:35:44 +00:00
Peter Wemm
9f90798686 Attempt to convert the ip_divert code to use the new-style protocol request
switch.  I needed 'LINT' to compile for other reasons so I kinda got the
blood on my hands.  Note: I don't know how to test this, I don't know if
it works correctly.
1997-05-24 17:23:11 +00:00
Steve Passe
89921bb594 Convert all:
panic( "xxxxx\n" );

to:
 printf( "xxxxx\n" );
 panic( "\n" );

For some as yet undetermined reason the argument to panic() is often NOT
printed, and the system sometimes hangs before reaching the panic printout.
So we hopefully at least print some useful info before the hang, as oppossed to
leaving the user clueless as to what has happened.
1997-05-22 22:35:42 +00:00
Poul-Henning Kamp
c5318f2363 Remove cruft relating to p_selbits and p_selbits_size 1997-05-22 07:25:20 +00:00
Doug Rabson
32ad9cb531 Fix a few bugs with NFS and mmap caused by NFS' use of b_validoff
and b_validend.  The changes to vfs_bio.c are a bit ugly but hopefully
can be tidied up later by a slight redesign.

PR:		kern/2573, kern/2754, kern/3046 (possibly)
Reviewed by:	dyson
1997-05-19 14:36:56 +00:00
Tor Egge
432aad0e98 Bring in some kernel bootp support. This removes the need for netboot
to fill in the nfs_diskless structure, at the cost of some kernel
bloat. The advantage is that this code works on a wider range of
network adapters than netboot. Several new kernel options are
documented in LINT.
Obtained from: parts of the code comes from NetBSD.
1997-05-11 18:05:39 +00:00
Peter Wemm
708e768480 Fixes from Bruce:
Serious:
- An important timevalfix() in settime[ofday]() was lost.

Not so serious:
- There was a race initializing `delta' in the check for setting the
  time backwards.
- The `#ifdef notyet' check for setting the time more than a day forwards
  was back to front.
[[I deleted the code, it's useless because of iteration - Peter]]
- The timespec was not checked for validity in clock_settime().
- The timespec was not fully checked for validity in nanotime().  The
  check in itimerfix() is too late, since the conversion from a timespec
  to a timeval may overflow.
- A garbage timeval was checked in settimeofday() for the (uap->tv == NULL
  && uap->tzp != NULL) case.  I added the broken check this some time ago.

Cosmetic:
- The "inadvertantly (sic) sleeping forever" test always failed.  hzto()
  always returns >= 1.
- The style wasn't very KNFish.  (I only changed new code.)

Submitted by: bde
1997-05-10 12:00:03 +00:00
Joerg Wunsch
8eea4d3c76 Add a DDB command `show buffer', to display a struct buf. It's impossible
to display everything, so i've chosen a small subset.  Add more to this as
you think seems useful.
1997-05-10 09:09:42 +00:00
Brian Somers
f2cc6198fc Pay attention to what Bruce actually says
rather than what I think he's going to say.
(Now undoing the last timerval change)

Really suggested by:	bde
1997-05-10 06:04:23 +00:00
Brian Somers
93154d87be Don't require that it_interval be valid if
it_value is set to zero - as per documentation.

Suggested by: ache & bde
1997-05-10 05:29:41 +00:00
Peter Wemm
94c8fcd8fb Implementation of posix-style clock_* and nanosleep syscalls as implemented
in NetBSD.  The core of settimeofday() is moved to a seperate static
function settime() which both clock_settime() and settimeofday() call.

Note that I picked up the securelevel > 1 check from NetBSD that prevents
the clock being set backwards in high securelevel mode (this was a hole
that allowed resetting of inode access timestamps to arbitary values)

Obtained from:  mostly from NetBSD, but the settime() function is from
our gettimeofday(), some tweaks by me.
1997-05-08 14:16:25 +00:00
Peter Wemm
3a5322f2f0 regenerate 1997-05-08 14:08:49 +00:00
Peter Wemm
851679e514 oops. NODIDE -> NOHIDE 1997-05-08 14:07:11 +00:00
Peter Wemm
b6f031b70b Define entries for the posix-style clock/timer syscalls including
nanosleep().  Also, note some syscall conflicts with other systems and
indicate slots tagged for use with other syscalls some day.
1997-05-08 14:04:37 +00:00
Steve Passe
c2855f6e47 fix bug in get_isa_apic_mask() where EISA bus was ignored.
Submitted by:	 Peter Wemm <peter@spinner.DIALix.COM>
1997-05-07 22:25:27 +00:00
Peter Wemm
835834c085 md_regs is now a struct trapframe * 1997-05-07 20:08:53 +00:00
Doug Rabson
cea6c86c11 This is the kernel linker. To use it, you will first need to apply
the patches in freefall:/home/dfr/ld.diffs to your ld sources and set
BINFORMAT to aoutkld when linking the kernel.

Library changes and userland utilities will appear in a later commit.
1997-05-07 16:05:47 +00:00
Poul-Henning Kamp
8670684a2f Fix a race condition that did, after all, exist.
Reviewed by:	phk
Submitted by:	dfr
1997-05-06 15:19:38 +00:00
Steve Passe
75877d5a93 removed the "#error ..." line preventing casual invokation of SMP_AUTOSTART.
autostart appears to be working now, at least on my dual P6.
I have no explanation why....
1997-05-06 07:10:06 +00:00
Steve Passe
2479ac60b9 Code to handle SMP/APIC_IO mapping of ISA INTs to APIC pins above IRQ15.
- doesn't break my system.
 - NOT yet verified on the affected motherboard.

Submitted by:	"John S. Dyson" <toor@dyson.iquest.net>
1997-05-05 22:56:37 +00:00
John Dyson
b332d9a66e Make sure that *fork() always returns with %edx == 1 in the
child.  This was sometimes not happening correctly during my
threads code work.
1997-05-05 04:08:12 +00:00
Peter Wemm
b455e1e981 don't #ifdef out reference to i586_ctr_freq. 1997-05-04 14:28:22 +00:00
Poul-Henning Kamp
b15a966ec6 1. Add a {pointer, v_id} pair to the vnode to store the reference to the
".." vnode.  This is cheaper storagewise than keeping it in the
    namecache, and it makes more sense since it's a 1:1 mapping.

2.  Also handle the case of "." more intelligently rather than stuff
    the namecache with pointless entries.

3.  Add two lists to the vnode and hang namecache entries which go from
    or to this vnode.  When cleaning a vnode, delete all namecache
    entries it invalidates.

4.  Never reuse namecache enties, malloc new ones when we need it, free
    old ones when they die.  No longer a hard limit on how many we can
    have.

5.  Remove the upper limit on namelength of namecache entries.

6.  Make a global list for negative namecache entries, limit their number
    to a sysctl'able (debug.ncnegfactor) fraction of the total namecache.
    Currently the default fraction is 1/16th.  (Suggestions for better
    default wanted!)

7.  Assign v_id correctly in the face of 32bit rollover.

8.  Remove the LRU list for namecache entries, not needed.  Remove the
    #ifdef NCH_STATISTICS stuff, it's not needed either.

9.  Use the vnode freelist as a true LRU list, also for namecache accesses.

10. Reuse vnodes more aggresively but also more selectively, if we can't
    reuse, malloc a new one.  There is no longer a hard limit on their
    number, they grow to the point where we don't reuse potentially
    usable vnodes.  A vnode will not get recycled if still has pages in
    core or if it is the source of namecache entries (Yes, this does
    indeed work :-)  "." and ".." are not namecache entries any longer...)

11. Do not overload the v_id field in namecache entries with whiteout
    information, use a char sized flags field instead, so we can get
    rid of the vpid and v_id fields from the namecache struct.  Since
    we're linked to the vnodes and purged when they're cleaned, we don't
    have to check the v_id any more.

12. NFS knew about the limitation on name length in the namecache, it
    shouldn't and doesn't now.

Bugs:
        The namecache statistics no longer includes the hits for ".."
        and "." hits.

Performance impact:
        Generally in the +/- 0.5% for "normal" workstations, but
        I hope this will allow the system to be selftuning over a
        bigger range of "special" applications.  The case where
        RAM is available but unused for cache because we don't have
        any vnodes should be gone.

Future work:
        Straighten out the namecache statistics.

        "desiredvnodes" is still used to (bogusly ?) size hash
        tables in the filesystems.

        I have still to find a way to safely free unused vnodes
        back so their number can shrink when not needed.

        There is a few uses of the v_id field left in the filesystems,
        scheduled for demolition at a later time.

        Maybe a one slot cache for unused namecache entries should
        be implemented to decrease the malloc/free frequency.
1997-05-04 09:17:38 +00:00
Peter Wemm
653abe61a6 Finish off and activate the smp_active sysctl handler.. 1997-05-04 02:08:09 +00:00
Steve Passe
9999694073 code to allow range checking on smp_active.
disabled by default, not sure its ready for prime time.

Submitted by:	 Peter Wemm <peter@spinner.DIALix.COM>
1997-05-03 18:24:25 +00:00
Steve Passe
462e62c9a0 new function to turn an APIC pin# into an INT mask.
added missing APIC_IO define.

Submitted by:	"John S. Dyson" <toor@dyson.iquest.net>
1997-05-03 17:42:01 +00:00
Steve Passe
f3a946e800 fixed spelling error.
Submitted by:	Bruce Albrecht <bruce@zuhause.mn.org>
1997-05-01 19:27:58 +00:00
Peter Wemm
29c70804e6 This is obvious to people who've been using the smp kernel for a while,
but now that we've widened the scope of the smp work to -current, it might
be an idea to warn new people that might not have read all the docs yet
that the SMP support needs to be activated via a sysctl.
1997-05-01 14:18:05 +00:00
John Dyson
82b8e119b4 Staticize an unnecessarily global function: vputrele.
Submitted by:	 Michael Hancock <michaelh@cet.co.jp>
1997-04-30 03:09:15 +00:00
Steve Passe
04964d153d Enabled 'FIX_MP_TABLE_WORKS' code.
This code re-numbers PCI busses in the MP table to match PCI semantics
when the MP BIOS fails to do it properly.

Reviewed by:	Peter Wemm <peter@spinner.DIALix.COM>
1997-04-29 22:12:32 +00:00
Steve Passe
34e63b4cd7 removed all the TEST_UPPERPRIO crud. 1997-04-28 01:08:47 +00:00
Steve Passe
2c5d02fff3 remove all the SMP_INVLTLB defines, making the code default for APIC_IO.
Reviewed by:	informal discussion with Peter Wemm <peter@spinner.DIALix.COM>
1997-04-28 00:25:00 +00:00
Steve Passe
9caa2d558d remove all the SMP_INVLTLB defines, making the code default for APIC_IO.
replace invldebug with invltlb_ok for throttling smp_invltlb() during boot.

Reviewed by:	informal discussion with Peter Wemm <peter@spinner.DIALix.COM>
1997-04-28 00:24:00 +00:00
Alexander Langer
cf72998ef3 Remove bogon from previous commit: doubly included sys/systm.h. 1997-04-27 21:26:29 +00:00
Steve Passe
2897614119 informal discussion between Bruce Evans <bde@zeta.org.au>,
Peter Wemm <peter@spinner.DIALix.COM>, Steve Passe <smp@csn.net>

removed all the IPI_INTS code.
made the XFAST_IPI32 code default, renaming Xfastipi32 to Xinvltlb.
1997-04-27 21:17:56 +00:00
Garrett Wollman
a29f300e80 The long-awaited mega-massive-network-code- cleanup. Part I.
This commit includes the following changes:
1) Old-style (pr_usrreq()) protocols are no longer supported, the compatibility
glue for them is deleted, and the kernel will panic on boot if any are compiled
in.

2) Certain protocol entry points are modified to take a process structure,
so they they can easily tell whether or not it is possible to sleep, and
also to access credentials.

3) SS_PRIV is no more, and with it goes the SO_PRIVSTATE setsockopt()
call.  Protocols should use the process pointer they are now passed.

4) The PF_LOCAL and PF_ROUTE families have been updated to use the new
style, as has the `raw' skeleton family.

5) PF_LOCAL sockets now obey the process's umask when creating a socket
in the filesystem.

As a result, LINT is now broken.  I'm hoping that some enterprising hacker
with a bit more time will either make the broken bits work (should be
easy for netipx) or dike them out.
1997-04-27 20:01:29 +00:00
Alexander Langer
ee7877dfec Prevent debugger attachment to init when securelevel > 0.
Noticed by:	Brian Buchanan <brian@wasteland.calbbs.com>
1997-04-27 19:02:37 +00:00
Peter Wemm
c76e95c3c7 Create sysctl kern.fast_vfork, on for uniprocessor by default, off for
SMP.
1997-04-26 15:59:50 +00:00
Peter Wemm
c32ba2484e Disable RFMEM in vfork for smp case.. It doesn't seem to work too well
yet..
1997-04-26 14:31:36 +00:00
Peter Wemm
477a642cee Man the liferafts! Here comes the long awaited SMP -> -current merge!
There are various options documented in i386/conf/LINT, there is more to
come over the next few days.

The kernel should run pretty much "as before" without the options to
activate SMP mode.

There are a handful of known "loose ends" that need to be fixed, but
have been put off since the SMP kernel is in a moderately good condition
at the moment.

This commit is the result of the tinkering and testing over the last 14
months by many people.  A special thanks to Steve Passe for implementing
the APIC code!
1997-04-26 11:46:25 +00:00
Doug Rabson
be4952f1df Don't zero b_dirtyoff and b_dirtyend on error.
Submitted by:	Hidetoshi Shimokawa <simokawa@sat.t.u-tokyo.ac.jp>
1997-04-25 11:14:00 +00:00
Peter Wemm
5f61c81d66 copyin the export network mask to the correct variable.
Submitted by: Mike Hibler <mike@marker.cs.utah.edu>, PR#3380
1997-04-25 06:47:12 +00:00
Andrey A. Chernov
0eaa559cbf Restore memory space separation (RFMEM) for vfork() after
shell imgact memory clobbering fixed
1997-04-23 22:13:18 +00:00
Andrey A. Chernov
5cf3d12ca5 Don't clobber user space argv0 memory on shell exec, mainly for vfork()
Fix another bug: if argv[0] is NULL, garbadge args might be added for
shell script
Submitted by: Tor Egge <Tor.Egge@idi.ntnu.no> (with yet one fault detect from me)
1997-04-23 22:07:05 +00:00
John Dyson
6b707440d3 Give up on the fast vfork() for a while. 1997-04-23 01:59:14 +00:00
John Dyson
c58494e476 Re-institute the efficent version of vfork. It appears to make a
difference of approx 3mins in make world on my P6!!!  This means
that vfork now has full address space sharing, so beware with
sloppy vfork programming.  Also, you really do need to apply
the previously committed popen fix in libc.
1997-04-20 16:57:12 +00:00
Bruce Evans
0e4f24a34e Avoid division by 0 in check_part(). (It occurred when max_nsectors == 0.
This case is clearly an error, but we keep calling check_part() to get
diagnostics.)

Fixed nearby indentation and commenting bugs.
1997-04-19 14:14:17 +00:00
Doug Rabson
18cab10cb3 Don't allow partial buffers to be cluster-comitted.
Zero the b_dirty{off,end} after cluster-comitting a group of buffers.

With these fixes, I was able to complete a 'make world' with remote src
and obj directories.
1997-04-18 14:12:17 +00:00
David Greenman
1ebd0c5945 Brought fix from the 2.2 branch forward (see rev 1.47.2.7): serious bugs
with reading the image header.
1997-04-18 02:43:05 +00:00
Poul-Henning Kamp
936342eff1 #include <sys/queue.h> 1997-04-14 18:23:48 +00:00
Bruce Evans
58611a61ed Fixed printing of registers in dbflalt_handler(). The registers
were always in a tss; that tss just changed from the one in the
pcb to common_tss (who knows where it was when there was no curpcb?).
Not using the pcb also fixed the problem that there is no pcb in
idle(), so we now always get useful register values.
1997-04-14 13:52:52 +00:00
John Dyson
d7f7f3f20e Make a problem that I cannot reproduce go away for now. This commit
is to decrease the inconvienience of other developers until I can
really fix the code.
Reviewed by:	Donald J. Maddox <dmaddox@scsn.net>
1997-04-14 01:28:58 +00:00
John Dyson
95395ca1c1 Improve the buffer cache memory policy by moving pages over to the
cache queue more often.  The pageout daemon had to be waken up
more often than necessary since pages were not put on the
cache queue, when they should have been.
Submitted by:	David Greenman <dg@freebsd.org>
1997-04-13 03:33:25 +00:00
John Dyson
492da96c9d Correct the previous thread-fix commit. I made a clerical error. 1997-04-13 03:05:31 +00:00
John Dyson
5856e12e69 Fully implement vfork. Vfork is now much much faster than even our
fork. (On my machine, fork is about 240usecs, vfork is 78usecs.)

Implement rfork(!RFPROC !RFMEM), which allows a thread to divorce its memory
	from the other threads of a group.

Implement rfork(!RFPROC RFCFDG), which closes all file descriptors, eliminating
	possible existing shares with other threads/processes.

Implement rfork(!RFPROC RFFDG), which divorces the file descriptors for a
	thread from the rest of the group.

Fix the case where a thread does an exec.  It is almost nonsense for a thread
	to modify the other threads address space by an exec, so we
	now automatically divorce the address space before modifying it.
1997-04-13 01:48:35 +00:00
John Dyson
c04b956c6f Effectively remove the previous commit to fix threads forking. The
change was a false-start, and needs more work.
1997-04-12 04:07:50 +00:00
John Dyson
af9ec88589 Allow a kernel-supported process thread to do an exec without blasting
away the VM space of all of the other, associated threads.
1997-04-11 23:37:23 +00:00
Bruce Evans
9dd8309d56 Removed support for OLD_PIPE. <sys/stat.h> is now missing the hack that
supported nameless pipes being indistinguishable from fifos.  We're not
going back.
1997-04-09 16:53:45 +00:00
Bruce Evans
4e7506495b Include <sys/buf.h> instead of <sys/vnode.h>. kern_sysctl.c no
longer has anything to do with vnodes and never had anything to do
with buffers, but it needs the definitions of B_READ and B_WRITE
for use with the bogus useracc() interface and was getting them
bogusly due to excessive cleanups in rev.1.49.
1997-04-09 15:23:09 +00:00
Peter Wemm
263a339213 Remove explicit zero of p_vmspace on creation, it's now in the startzero
section of the proc struct.
1997-04-07 09:38:39 +00:00
Peter Wemm
a2a1c95c10 The biggie: Get rid of the UPAGES from the top of the per-process address
space. (!)

Have each process use the kernel stack and pcb in the kvm space.  Since
the stacks are at a different address, we cannot copy the stack at fork()
and allow the child to return up through the function call tree to return
to user mode - create a new execution context and have the new process
begin executing from cpu_switch() and go to user mode directly.
In theory this should speed up fork a bit.

Context switch the tss_esp0 pointer in the common tss.  This is a lot
simpler since than swithching the gdt[GPROC0_SEL].sd.sd_base pointer
to each process's tss since the esp0 pointer is a 32 bit pointer, and the
sd_base setting is split into three different bit sections at non-aligned
boundaries and requires a lot of twiddling to reset.

The 8K of memory at the top of the process space is now empty, and unmapped
(and unmappable, it's higher than VM_MAXUSER_ADDRESS).

Simplity the pmap code to manage process contexts, we no longer have to
double map the UPAGES, this simplifies and should measuably speed up fork().

The following parts came from John Dyson:

Set PG_G on the UPAGES that are now in kernel context, and invalidate
them when swapping them out.

Move the upages object (upobj) from the vmspace to the proc structure.

Now that the UPAGES (pcb and kernel stack) are out of user space, make
rfork(..RFMEM..) do what was intended by sharing the vmspace
entirely via reference counting rather than simply inheriting the mappings.
1997-04-07 07:16:06 +00:00
Peter Wemm
271b264e4c No longer use an i386tss as the basis of our pcb - it wasn't particularly
convenient and makes life difficult for my next commit.  We still need
an i386tss to point to for the tss slot in the gdt, so we use a common
tss shared between all processes.

Note that this is going to break debugging until this series of commits
is finished.  core dumps will change again too. :-(  we really need
a more modern core dump format that doesn't depend on the pcb/upages.

This change makes VM86 mode harder, but the following commits will remove
a lot of constraints for the VM86 system, including the possibility of
extending the pcb for an IO port map etc.

Obtained from: bde
1997-04-07 06:45:18 +00:00
Peter Dufault
0ddf9be1f0 Make MOD_* macros almost consistent:
Use the name argument almost the same in all LKM types.  Maintain
the current behavior for the external (e.g., modstat) name for DEV,
EXEC, and MISC types being #name ## "_mod" and SYCALL and VFS only
#name.  This is a candidate for change and I vote just the name without
the "_mod".

Change the DISPATCH macro to MOD_DISPATCH for consistency with the
other macros.

Add an LKM_ANON #define to eliminate the magic -1 and associated
signed/unsigned warnings.

Add MOD_PRIVATE to support wcd.c's poking around in the lkm structure.

Change source in tree to use the new interface.

Reviewed by:	Bruce Evans
1997-04-06 11:14:13 +00:00
John Dyson
a04c970a7a Fix the gdb executable modify problem. Thanks to the detective work
by Alan Cox <alc@cs.rice.edu>, and his description of the problem.

The bug was primarily in procfs_mem, but the mistake likely happened
due to the lack of vm system support for the operation.  I added
better support for selective marking of page dirty flags so that
vm_map_pageable(wiring) will not cause this problem again.

The code in procfs_mem is now less bogus (but maybe still a little
so.)
1997-04-06 02:29:45 +00:00
Doug Rabson
42146e3747 [Previous comment was incorrect for these files]
Added calls to VFS lock debugging macros to make fixing filesystems' locking
easier.
1997-04-04 17:47:43 +00:00
Doug Rabson
de15ef6aef Add a function vop_sharedlock which a copy of vop_nolock without the
implementation #ifdef out.  This can be used for now by NFS.  As soon
as all the other filesystems' locking is fixed, this can go away.

Print the vnode address in vprint for easier debugging.
1997-04-04 17:46:21 +00:00
David Greenman
66141753e6 Killed unnecessary vp == NULL check after namei. 1997-04-04 09:06:20 +00:00
David Greenman
a3cf6ebae3 Oops, only free component name buffer if namei() didn't. This bug has
been in here since I wrote the code 3 years ago! Thanks, Bruce!

Submitted by:	bde
1997-04-04 07:30:06 +00:00
David Greenman
6d5a0a8c23 Various fixes:
1. imgp->image_header needs to be cleared for the bp == NULL && `goto
   interpret' case, else exec_fail_dealloc would free it twice after
   an error.

2. Moved the vp->v_writecount check in exec_check_permissions() to
   near the end.  This fixes execve("/dev/null", ...) returning the
   bogus errno ETXTBSY.  ETXTBSY is still returned for attempts to
   exec interpreted files that are open for writing.  The man page
   is very old and wrong here.  It says that ETXTBSY is for pure
   procedure (shared text) files that are open for writing or reading.

3. Moved the setuid disabling in exec_check_permissions() to the end.
   Cosmetic.  It's more natural to dispose of all the error cases
   first.

...plus a couple of other cosmetic changes.

Submitted by:	bde
1997-04-04 04:17:11 +00:00