Commit Graph

2258 Commits

Author SHA1 Message Date
Bruce Evans
f95ac73519 Use copyout() instead of bcopy() to copy the image to user space.
bcopy() caused panics under heavy paging (not quite as suspected -
the kernel stack seemed to get corrupted).

Fixed long lines.

Reviewed by:	phk
1998-06-16 14:36:40 +00:00
Doug Rabson
b1bf661000 [Add missing files from previous commit]
Major changes to the generic device framework for FreeBSD/alpha:

* Eliminate bus_t and make it possible for all devices to have
  attached children.

* Support dynamically extendable interfaces for drivers to replace
  both the function pointers in driver_t and bus_ops_t (which has been
  removed entirely.  Two system defined interfaces have been defined,
  'device' which is mandatory for all devices and 'bus' which is
  recommended for all devices which support attached children.

* In addition, the alpha port defines two simple interfaces 'clock'
  for attaching various real time clocks to the system and 'mcclock'
  for the many different variations of mc146818 clocks which can be
  attached to different alpha platforms.  This eliminates two more
  function pointer tables in favour of the generic method dispatch
  system provided by the device framework.

Future device interfaces may include:

* cdev and bdev interfaces for devfs to use in replacement for specfs
  and the fixed interfaces bdevsw and cdevsw.

* scsi interface to replace struct scsi_adapter (not sure how this
  works in CAM but I imagine there is something similar there).

* various tailored interfaces for different bus types such as pci,
  isa, pccard etc.
1998-06-14 13:53:12 +00:00
Doug Rabson
99d11cde56 Major changes to the generic device framework for FreeBSD/alpha:
* Eliminate bus_t and make it possible for all devices to have
  attached children.

* Support dynamically extendable interfaces for drivers to replace
  both the function pointers in driver_t and bus_ops_t (which has been
  removed entirely.  Two system defined interfaces have been defined,
  'device' which is mandatory for all devices and 'bus' which is
  recommended for all devices which support attached children.

* In addition, the alpha port defines two simple interfaces 'clock'
  for attaching various real time clocks to the system and 'mcclock'
  for the many different variations of mc146818 clocks which can be
  attached to different alpha platforms.  This eliminates two more
  function pointer tables in favour of the generic method dispatch
  system provided by the device framework.

Future device interfaces may include:

* cdev and bdev interfaces for devfs to use in replacement for specfs
  and the fixed interfaces bdevsw and cdevsw.

* scsi interface to replace struct scsi_adapter (not sure how this
  works in CAM but I imagine there is something similar there).

* various tailored interfaces for different bus types such as pci,
  isa, pccard etc.
1998-06-14 13:46:10 +00:00
Poul-Henning Kamp
938ee3ce4d Introduce std_pps_ioctl() to automagically DTRT.
Add scaling capability to timex.offset, ntpd-4.0.73 will support this.
1998-06-13 09:30:26 +00:00
Doug Rabson
3900ddb2dc Only build this on i386 for now. I may use it for the alpha later but
currently it doesn't compile.
1998-06-11 07:23:59 +00:00
Julian Elischer
32f5d4d843 Replace 'sleep()' with 'tsleep()'
Accidentally imported from Kirk's codebase.

Pointed out by: various.
1998-06-10 22:02:14 +00:00
Julian Elischer
28913ebe4e Submitted by: Kirk McKusick <mckusick@McKusick.COM>
Fix for potential hang when trying to reboot the system or
to forcibly unmount a soft update enabled filesystem.
FreeBSD already handled the reboot case differently, this is however a better
fix.
1998-06-10 18:13:19 +00:00
Doug Rabson
897cd717a5 Add initial support for the FreeBSD/alpha kernel. This is very much a
work in progress and has never booted a real machine.  Initial
development and testing was done using SimOS (see
http://simos.stanford.edu for details).  On the SimOS simulator, this
port successfully reaches single-user mode and has been tested with
loads as high as one copy of /bin/ls :-).

Obtained from: partly from NetBSD/alpha
1998-06-10 10:57:29 +00:00
Doug Rabson
8c12612cf6 64bit fixes: don't cast pointers to int. 1998-06-10 10:31:08 +00:00
Doug Rabson
2b605d0804 64bit fixes: don't cast p->p_retval to an int*. 1998-06-10 10:30:23 +00:00
Doug Rabson
831b9ef2be 64bit fixes: use u_long not int for ioctl command. 1998-06-10 10:29:31 +00:00
Doug Rabson
10d4743f6f 64bit fixes: use size_t not u_int for sizes. 1998-06-10 10:28:29 +00:00
Doug Rabson
2ef49ddfcb 64bit fixes: p->p_retval is a register_t[] not an int[]. 1998-06-10 10:27:43 +00:00
Poul-Henning Kamp
a58f0f8e66 Add a tc_ prefix to struct timecounter members.
Urged by:	bde
1998-06-09 13:10:54 +00:00
Bruce Evans
1afde994e9 Pass lists of possible root devices and their names up to the
machine-independent code and try mounting the devices in the
lists instead of guessing alternative root devices in a machine-
dependent way.

autoconf.c:
Reject preposterous slice numbers instead of silently converting
them to COMPATIBILITY_SLICE.

Don't forget to force slice = COMPATIBILITY_SLICE in the floppy
device name.

Eliminated most magic numbers and magic device names in setroot().

Fixed dozens of style bugs.

vfs_conf.c:
Put the actual root device name instead of "root_device" in the
mount struct if the actual name is available.  This is useful after
booting with -s.  If it were set in all cases then it could be used
to do mount(8)'s ROOTSLICE_HUNT and fsck(8)'s hotroot guess better.
1998-06-09 12:52:35 +00:00
Bruce Evans
e7c1c309fa Don't generate COMPAT_43 cruft if there are no COMPAT_43 syscalls.
In particular, don't generate an include of "opt_compat.h" if it
wouldn't affect anything we create.  This will fix recent breakage
of the ibcs2 LKM.  The ibcs2 syscall files were not regenerated
properly, so the LKM didn't break immediately when we started
generating this extraneous include.
1998-06-09 03:32:05 +00:00
John Dyson
0d3dd8fbc5 Remove some junk left over from a previous commit.
Submitted by:	phk
1998-06-08 18:18:28 +00:00
Bruce Evans
414c93f3aa Updated generated files. 1998-06-08 11:08:35 +00:00
Bruce Evans
bf0955a99d Fixed some style bugs in output (missing tabs and unparenthesized macros).
Fixed some style bugs in source (mostly, superfluous backslashes).
1998-06-08 11:02:00 +00:00
Doug Rabson
2e91d07af9 Fix a typo which prevented i386 elf from working at all (including Linux
emulated elf binaries).
1998-06-08 09:19:35 +00:00
Poul-Henning Kamp
48115288df Add a member function more to the timecounters, this one is for use
with latch based PPS implementations.  The client that uses it will
be committed after more testing.
1998-06-07 20:36:55 +00:00
Doug Rabson
ecbb00a262 This commit fixes various 64bit portability problems required for
FreeBSD/alpha.  The most significant item is to change the command
argument to ioctl functions from int to u_long.  This change brings us
inline with various other BSD versions.  Driver writers may like to
use (__FreeBSD_version == 300003) to detect this change.

The prototype FreeBSD/alpha machdep will follow in a couple of days
time.
1998-06-07 17:13:14 +00:00
Poul-Henning Kamp
dbb3475507 Add a "this" style argument and a "void *private" so timecounters can
figure out which instance to wount with.
1998-06-07 08:40:53 +00:00
Bruce Evans
e3a03f0cfb Don't attempt to copy the whole slices "struct" for DIOCGSLICEINFO.
The slices "struct" isn't really a struct; we allocate only part of
it in the fully dangerously dedicated case.  Since the "struct" is
malloced, the page beyond it may not be mapped, so attempts to copy
it would crash.  This problem became larger when the full struct was
bloated from < 1K to > 3K by the addition of (mostly unused) DEVFS
tokens some time before 2.2.0 was released.
1998-06-06 03:06:55 +00:00
David Greenman
b5afad7198 Moved limit frobbing (and the resulting limcopy()) that occurs for
accounting to the accounting function so that this isn't needlessly
done for some process exits.
Reviewed by:	bde,phk
1998-06-05 21:44:20 +00:00
David Greenman
9523f5c199 If we are out of mb_map space and we failed to m_reclaim() anything and
the alloc is not M_DONTWAIT, then panic with "Out of mbuf clusters".
Callers that specify M_WAIT can't deal with getting a NULL buffer, so this
is a more graceful failure than randomly page faulting in the socket code
or elsewhere.
1998-06-05 21:41:48 +00:00
John Dyson
e8f367853b Correct sleep priority. 1998-06-02 05:39:13 +00:00
Peter Dufault
ce47711dee Set PAGE_SIZE for _SC_PAGESIZE sysconf(). 1998-06-01 21:54:43 +00:00
Peter Wemm
4dc75870b2 Have the wakeup routine do the upcall if needed.
Obtained from: NetBSD
1998-05-31 18:38:43 +00:00
Poul-Henning Kamp
e796e00de3 Some cleanups related to timecounters and weird ifdefs in <sys/time.h>.
Clean up (or if antipodic: down) some of the msgbuf stuff.

Use an inline function rather than a macro for timecounter delta.

Maintain process "on-cpu" time as 64 bits of microseconds to avoid
needless second rollover overhead.

Avoid calling microuptime the second time in mi_switch() if we do
not pass through _idle in cpu_switch()

This should reduce our context-switch overhead a bit, in particular
on pre-P5 and SMP systems.

WARNING:  Programs which muck about with struct proc in userland
will have to be fixed.

Reviewed, but found imperfect by:       bde
1998-05-28 09:30:28 +00:00
John Dyson
cf2819ccb8 Make flushing dirty pages work correctly on filesystems that
unexpectedly do not complete writes even with sync I/O requests.
This should help the behavior of mmaped files when using
softupdates (and perhaps in other circumstances also.)
1998-05-21 07:47:58 +00:00
Peter Dufault
aebde78243 1. Add new defs for mins and maxs for the POSIX flavor priorities. They
end up being the same, but it doesn't look like you're comparing
apples and oranges.

2. Use need_resched instead of reset_priority.  This isn't right
either, since for example you'll round-robin against equal priority FIFO
processes when lowering the priority of another process,
but this works better and a real fix needs to be in kern_synch and
not out here.

3. This is not a device driver: copyin/copyout the structure.
1998-05-19 21:11:53 +00:00
Poul-Henning Kamp
579f4456b9 Change a data type internal to the timecounters, and remove the "delta"
function.

Reviewed, but not entirely approved by: bde
1998-05-19 18:55:02 +00:00
Poul-Henning Kamp
58067a9909 Make the size of the msgbuf (dmesg) a "normal" option. 1998-05-19 08:58:53 +00:00
Tor Egge
afc6ea238f Disallow reading the current kernel stack. Only the user structure and
the current registers should be accessible.
Reviewed by:	David Greenman <dg@root.com>
1998-05-19 00:00:14 +00:00
Peter Dufault
2a61a11038 1. Don't use "nosys" and generate coredumps for unconfigured
system calls - return ENOSYS per the spec.

2. Fix interface stub to set priority properly.
1998-05-18 12:53:45 +00:00
Tor Egge
2f1e70693d Add forwarding of roundrobin to other cpus. This gives a more regular
update of cpu usage as shown by top when one process is cpu bound
(no system calls) while the system is otherwise idle (except for top).

Don't attempt to switch to the BSP in boot().  If the system was idle when
an interrupt caused a panic, this won't work.  Instead, switch to the BSP
in cpu_reset.

Remove some spurious forward_statclock/forward_hardclock warnings.
1998-05-17 22:12:14 +00:00
Bruce Evans
ee002b68d1 Fixed interval calculation in realitimexpire() again. Obtained from:
rev.1.9.  Broken in: rev.1.50.

Fixed a spelling error.  Obtained from: Lite2.
1998-05-17 20:13:01 +00:00
Bruce Evans
c8b4782815 Fixed stale references to hzto() in comments. 1998-05-17 20:08:05 +00:00
Tor Egge
cb87a87c16 Supply the correct process argument to dounmount when possible. 1998-05-17 19:38:55 +00:00
Tor Egge
5931a9c24e For SMP, use prv_PPAGE1/prv_PMAP1 instead of PADDR1/PMAP1.
get_ptbase and pmap_pte_quick no longer generates IPIs.
This should reduce the number of IPIs during heavy paging.
1998-05-17 18:53:19 +00:00
Poul-Henning Kamp
c21410e119 s/nanoruntime/nanouptime/g
s/microruntime/microuptime/g

Reviewed by:	bde
1998-05-17 11:53:46 +00:00
Garrett Wollman
98271db4d5 Convert socket structures to be type-stable and add a version number.
Define a parameter which indicates the maximum number of sockets in a
system, and use this to size the zone allocators used for sockets and
for certain PCBs.

Convert PF_LOCAL PCB structures to be type-stable and add a version number.

Define an external format for infomation about socket structures and use
it in several places.

Define a mechanism to get all PF_LOCAL and PF_INET PCB lists through
sysctl(3) without blocking network interrupts for an unreasonable
length of time.  This probably still has some bugs and/or race
conditions, but it seems to work well enough on my machines.

It is now possible for `netstat' to get almost all of its information
via the sysctl(3) interface rather than reading kmem (changes to follow).
1998-05-15 20:11:40 +00:00
Peter Wemm
9c4aed2ed7 Nuke signanosleep(). (I've left nanosleep1() seperate to nanosleep()
as I don't want to mess with the multiple returns)
1998-05-14 11:31:08 +00:00
Peter Wemm
06b6493558 regen after signanosleep nuke 1998-05-14 11:29:06 +00:00
Peter Wemm
786cf38a29 deep-six signanosleep(). It sounded like a good idea at the time. 1998-05-14 11:28:11 +00:00
Peter Wemm
1973d51bfb Commit an old change that has been sitting around for a long while.
signanosleep() did not deal with signal masks properly.  This change was
based on a discussion with bde some time ago (at least 6 months or more).

signanosleep() should probably go away since it was never really used for
more than a few weeks and doesn't appear in released code.  It should
probably be killed before somebody uses it and it becomes a gratuitous
nonstandard feature.
1998-05-14 10:38:52 +00:00
Bruce Evans
b322fb5d76 Backed out previous commit. It is invalid to call d_ioctl() on
possibly non-open devices, and we don't want to restrict dumping
to swap devices anwyay.  It is especially invalid to call d_ioctl()
in non-process context for panics.  d_psize() can be called on
non-open devices, at least on non-SLICED ones that support d_dump(),
and setdumpdev() has depended on this for a long time although it
is probably wrong, but even d_psize() can't be called in non-process
context - that's why dumpsys() depends on previously computed values
although these values may be stale.  The historical restriction to
devices with dkpart(dev) == SWAP_PART should go away.
1998-05-12 17:34:02 +00:00
John Dyson
1f56217280 Fix the futimes/undelete/utrace conflict with other BSD's. Note that
the only common  usage of utrace (the possible problem with this
commit) is with malloc, so this should be a real problem.  Add
the various NetBSD syscalls that allow full emulation of their
development environment.
1998-05-11 03:55:28 +00:00
John Dyson
f0175db1ee Attempt to set write combining mode for graphics devices. 1998-05-11 01:06:08 +00:00
Mike Smith
7be2d30077 In the words of the submitter:
---------
Make callers of namei() responsible for releasing references or locks
instead of having the underlying filesystems do it.  This eliminates
redundancy in all terminal filesystems and makes it possible for stacked
transport layers such as umapfs or nullfs to operate correctly.

Quality testing was done with testvn, and lat_fs from the lmbench suite.

Some NFS client testing courtesy of Patrik Kudo.

vop_mknod and vop_symlink still release the returned vpp.  vop_rename
still releases 4 vnode arguments before it returns.  These remaining cases
will be corrected in the next set of patches.
---------

Submitted by:	Michael Hancock <michaelh@cet.co.jp>
1998-05-07 04:58:58 +00:00
Julian Elischer
7f2f1b784e Add dump support to the DEVFS/slice code.
now we can actually catch our crashes :-)

Submitted by: Luoqi Chen <luoqi@chen.ml.org> (the man who's everywhere)
1998-05-06 22:14:48 +00:00
Mike Smith
79cc756d8b As described by the submitter:
Reverse the VFS_VRELE patch.  Reference counting of vnodes does not need
to be done per-fs.  I noticed this while fixing vfs layering violations.
Doing reference counting in generic code is also the preference cited by
John Heidemann in recent discussions with him.

The implementation of alternative vnode management per-fs is still a valid
requirement for some filesystems but will be revisited sometime later,
most likely using a different framework.

Submitted by:	Michael Hancock <michaelh@cet.co.jp>
1998-05-06 05:29:41 +00:00
John Dyson
96fb8cf258 Fix the shm panic. I mistakenly used the shadow_count to keep the object
from being split, and instead added an OBJ_NOSPLIT.
1998-05-04 17:12:53 +00:00
John Dyson
cbd8ec0902 Work around some VM bugs, the worst being an overly aggressive
swap space free calculation.  More complete fixes will be forthcoming,
in a week.
1998-05-04 03:01:44 +00:00
Bruce Evans
77849078bf Oops, the previous commit should have changed i386' to __i386__',
not `__i386'.
1998-05-01 16:40:21 +00:00
Bruce Evans
809e3a8464 Partially fixed write clustering for cases where cluster_wbuild() is
called from vfs_bio_awrite() without going through cluster_write()
or ufs_bmaparray(), in particular for all writes to block disk devices.
Only ufs_bmaparray() sets vp->v_maxio in a correct way, and it doesn't
seem to be called early enough even for regular files.
1998-05-01 16:29:27 +00:00
Peter Wemm
b1951f4028 vm_page_is_valid() wasn't expecting a large offset argument, it's
expecting a sub-page offset.  We were passing the file position,
and vm_page_bits() could do some interesting things when base was
larger PAGE_SIZE.
if (size > PAGE_SIZE - base)
	size = PAGE_SIZE - base;
is interesting when (PAGE_SIZE - base) is negative.  I could imagine that
this could have interesting consequences for memory page -> device block
bit validation.
1998-05-01 15:10:59 +00:00
Peter Wemm
f806d5a257 Fix one problem with NFSv3 > 2GB file support.
Submitted by: bde
1998-05-01 15:04:35 +00:00
Eivind Eklund
288078be0f Translate T_PROTFLT to SIGSEGV instead of SIGBUS when running under
Linux emulation.  This make Allegro Common Lisp 4.3 work under
FreeBSD!

Submitted by: Fred Gilham <gilham@csl.sri.com>
Commented on by: bde, dg, msmith, tg
Hoping he got everything right:  eivind
1998-04-28 18:15:08 +00:00
David E. O'Brien
cbcfa1ba6a Discussed with: bde 1998-04-24 11:50:30 +00:00
David E. O'Brien
8f89f24fc3 Create virgin disklabels with 8 (MAXPARTITIONS) partitions rather than
three (RAW_PART + 1);
This makes ``disklabel -Brw sdN auto'' do the Right Thing.
1998-04-24 11:49:57 +00:00
David Greenman
9351a2295a Added kern.ipc.nmbclusters 1998-04-24 04:15:52 +00:00
Julian Elischer
c0bab11dfe Make the devfs SLICE option a standard type option.
(hopefully it will go away eventually anyhow)
1998-04-20 03:57:41 +00:00
Julian Elischer
3e425b968d Add changes and code to implement a functional DEVFS.
This code will be turned on with the TWO options
DEVFS and SLICE. (see LINT)
Two labels PRE_DEVFS_SLICE and POST_DEVFS_SLICE will deliniate these changes.

/dev will be automatically mounted by init (thanks phk)
on bootup. See /sys/dev/slice/slice.4 for more info.
All code should act the same without these options enabled.

Mike Smith, Poul Henning Kamp, Soeren, and a few dozen others

This code does not support the following:
bad144 handling.
Persistance. (My head is still hurting from the last time we discussed this)
ATAPI flopies are not handled by the SLICE code yet.

When this code is running, all major numbers are arbitrary and COULD
be dynamically assigned. (this is not done, for POLA only)
Minor numbers for disk slices ARE arbitray and dynamically assigned.
1998-04-19 23:32:49 +00:00
Dag-Erling Smørgrav
59bad7c53b Backed out lseek changes. 1998-04-19 22:20:32 +00:00
Dag-Erling Smørgrav
25096724e8 Return EINVAL and do not change file pointer if resulting offset is negative.
PR:		kern/6184
1998-04-18 19:24:44 +00:00
Peter Wemm
37b8ccd37a In vfs_msync(), test to see if the vnode being examined is "interesting"
(ie: it has a vm_object attached and is marked as OBJ_MIGHTBEDIRTY) before
attempting to lock it.  This should reduce the cpu hit that is incurred
when doing a sync(2) and when the syncer process is doing the 30-second
writeback of dirty mmap() data to disk.  Skip this speedup if we are
doing an unmount() to be sure to get everything - we can afford to
occasionally miss a msync while the system is running, but not at unmount.

I'm not sure about the VXLOCK and MNT_WAIT case, it seems a bit odd to skip
doing a page_clean at unmount time just because a vnode is VXLOCKed, but
that's what was being done before...
1998-04-18 06:26:16 +00:00
Dag-Erling Smørgrav
dc73342347 Seventy-odd "its" / "it's" typos in comments fixed as per kern/6108. 1998-04-17 22:37:19 +00:00
Bruce Evans
ab36c3d3e7 Really finish supporting compiling with `gcc -ansi'. 1998-04-17 04:53:44 +00:00
Peter Wemm
efdc5523c0 When the softdep conversion took place, the periodic vfs_msync() from
update got lost.  This is responsible for ensuring that dirty mmap() pages
get periodically written to disk.  Without it, long time mmap's might not
have their dirty pages written out at all of the system crashes or isn't
cleanly shut down.  This could be nasty if you've got a long-running
writing via mmap(), dirty pages used to get written to disk within 30
seconds or so.
1998-04-16 03:31:26 +00:00
Tor Egge
71033a8c50 Unlock mountlist_slock if the mount point was busy (unmount in progress)
during the attempt at lazy fsync.
1998-04-15 18:37:49 +00:00
Bruce Evans
c1087c1324 Support compiling with `gcc -ansi'. 1998-04-15 17:47:40 +00:00
Poul-Henning Kamp
115facb29d Fix a minor mbuf leak created by the previous change.
Reviewed by:	phk
Submitted by:	pb@fasterix.freenix.org (Pierre Beyssac)
1998-04-14 06:24:43 +00:00
Poul-Henning Kamp
aba558930b setsockopt() transports user option data in an mbuf. if the user
data is greater than MLEN, setsockopt is unable to pass it onto
the protocol handler.  Allocate a cluster in such case.

PR:		2575
Reviewed by:	 phk
Submitted by:	Julian Assange proff@iq.org
1998-04-11 20:31:46 +00:00
Poul-Henning Kamp
a2481bbe8e When pmap_pinit0() allocates a page for proc0's page directory,
kernal page table may need to be extended.  But while growing the
kernel page table (pmap_growkernel()), newly allocated kernel page
table pages are entered into every process' page directory. For
proc0, the page directory is not allocated yet, and results in a
page fault.  Eventually, the machine panics with "lockmgr: not
holding exclusive lock".

PR:		5458
Reviewed by:	phk
Submitted by:	Luoqi Chen <luoqi@luoqi.watermarkgroup.com>
1998-04-11 17:24:06 +00:00
Alexander Langer
7c2e3d329a Grammar police. 1998-04-10 00:09:04 +00:00
Wolfram Schneider
5ddc8ded1d New mount option nosymfollow. If enabled, the kernel lookup()
function will not follow symbolic links on the mounted
file system and return EACCES (Permission denied).
1998-04-08 18:31:59 +00:00
Poul-Henning Kamp
5f88ec3625 Minor adjustments to the timecounting and proc0.
Mostly Submitted by:	bde
1998-04-08 09:01:53 +00:00
Peter Wemm
100ceca222 Today is not my lucky day. Fix missing brace and I got a request
to use EMLINK instead.
1998-04-06 19:32:37 +00:00
Peter Wemm
193afe0189 Use a different errno (ELOOP (as sef mentioned) since the text that goes
with the error sounds ok for the condition) if O_NOFOLLOW gets a link.
1998-04-06 18:43:28 +00:00
Peter Wemm
0fdc628b41 Rather than let users get fd's to symlink files, make O_NOFOLLOW cause
an error if it gets a link (like it does if it gets a socket).  The
implications of letting users try and do file operations on symlinks
themselves were too worrying.
1998-04-06 18:25:21 +00:00
Peter Wemm
7e3426aa1f Implement a new open(2) flag: O_NOFOLLOW. This will instruct open
to not follow symlinks, but to open a handle on the link itself(!).
As strange as this might sound, it has several useful applications
safe race-free ways of opening files in hostile areas (eg: /tmp, a mode
1777 /var/mail, etc).  It also would allow things like fchown() to work
on the link rather than having to implement a new syscall specifically for
that task.

Reviewed by: phk
1998-04-06 17:38:43 +00:00
Peter Wemm
aacdc613e5 curproc is initialized in locore at the same time for both SMP and UP now. 1998-04-06 15:51:22 +00:00
Peter Wemm
cf34ef61ee Use real types for the SMP pages being allocated rather than arrays of
ints.  Remove some no longer needed casts.  Initialize the per-cpu
global data area using the structs rather than knowing too much about
layout, alignment, etc.
1998-04-06 15:48:30 +00:00
Poul-Henning Kamp
2eeb0e2ea0 Make read_random() take a (void *) argument instead of (char *) 1998-04-06 09:30:42 +00:00
Poul-Henning Kamp
4cf41af3d4 Make a kernel version of the timer* functions called timerval* to be
more consistent.

OK'ed by:	bde
1998-04-06 08:26:08 +00:00
Poul-Henning Kamp
5704ba6a06 More fixes for the iterative case of nanosleep1 from bruce.
I hate the 2-arg time{spec|val}{add|sub} functions!
1998-04-05 12:10:41 +00:00
Poul-Henning Kamp
bfe6c9fabf Make the dummy timecounter run at 1 MHz rather than 100kHz (noticed by bde)
fix the itimer(REAL) handling.
1998-04-05 11:49:36 +00:00
Peter Wemm
d59fbbf6c8 If there is no error code, don't copyout the remaining time. (As
documented in the man page and the standards).  (and besides, nanosleep1
isn't setting it in this case at present anyway, so we'd be copying junk).
1998-04-05 11:17:19 +00:00
Poul-Henning Kamp
338418263d Fix nanosleep1 based on Bruces suggestion. 1998-04-05 10:28:01 +00:00
Andrey A. Chernov
80a39463c9 Remove unused atv.tv_usec = 0; from select/poll code 1998-04-05 10:03:52 +00:00
Peter Wemm
2257b488b9 tsleep() returns EWOULDBLOCK if the timeout expired. Don't return this
to usermode, otherwise sleep(3) fails, cron doesn't work, etc etc etc.
1998-04-05 07:31:44 +00:00
Peter Wemm
b90dcc0c5d Fix previous commit. Don't people read compiler messages or something?? 1998-04-05 02:59:10 +00:00
Poul-Henning Kamp
91ad39c6b3 Handle double fraction overflow in nano & microtime functions (spotted by Bruce)
Use tvtohz() a place where it fits.
1998-04-04 18:46:13 +00:00
Poul-Henning Kamp
00af9731c9 Time changes mark 2:
* Figure out UTC relative to boottime.  Four new functions provide
      time relative to boottime.

    * move "runtime" into struct proc.  This helps fix the calcru()
      problem in SMP.

    * kill mono_time.

    * add timespec{add|sub|cmp} macros to time.h.  (XXX: These may change!)

    * nanosleep, select & poll takes long sleeps one day at a time

Reviewed by:    bde
Tested by:      ache and others
1998-04-04 13:26:20 +00:00
John Dyson
aec0bcdf5b Perhaps fix a problem that some drivers have that they don't properly
initialize the b_kvasize element.  This might fix some of the split
I/O requests that some people have.
1998-04-04 05:55:05 +00:00
Poul-Henning Kamp
4ff16568be Try to fix poll & select after I broke them. 1998-04-02 07:22:17 +00:00
Tor Egge
5758c2de94 Add two workarounds for broken MP tables:
- Attempt to handle PCI devices where the interrupt is
	  an ISA/EISA interrupt according to the mp table.

	- Attempt to handle multiple IO APIC pins connected to
	  the same PCI or ISA/EISA interrupt source.  Print a
	  warning if this happens, since performance is suboptimal.
	  This workaround is only used for PCI devices.

With these two workarounds, the -SMP kernel is capable of running on
my Asus P/I-P65UP5 motherboard when version 1.4 of the MP table is disabled.
1998-04-01 21:07:37 +00:00
Poul-Henning Kamp
460608e768 Fix an off by 1<<32 error. 1998-03-31 10:47:01 +00:00
Poul-Henning Kamp
75da0aa298 Add a dummy timecounter until we find the real thing(s). 1998-03-31 10:44:56 +00:00
Poul-Henning Kamp
227ee8a188 Eradicate the variable "time" from the kernel, using various measures.
"time" wasn't a atomic variable, so splfoo() protection were needed
around any access to it, unless you just wanted the seconds part.

Most uses of time.tv_sec now uses the new variable time_second instead.

gettime() changed to getmicrotime(0.

Remove a couple of unneeded splfoo() protections, the new getmicrotime()
is atomic, (until Bruce sets a breakpoint in it).

A couple of places needed random data, so use read_random() instead
of mucking about with time which isn't random.

Add a new nfs_curusec() function.

Mark a couple of bogosities involving the now disappeard time variable.

Update ffs_update() to avoid the weird "== &time" checks, by fixing the
one remaining call that passwd &time as args.

Change profiling in ncr.c to use ticks instead of time.  Resolution is
the same.

Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call
hzto() which subtracts time" sequences.

Reviewed by:	bde
1998-03-30 09:56:58 +00:00
John Dyson
006b9b7df9 Correct a significant problem with the softupdates port. Allow fsync
to work properly within the softupdates framework, and thereby eliminate
some unfortunate panics.
1998-03-29 18:23:44 +00:00
Poul-Henning Kamp
934f5f3306 Export MD5Transform in md5.c and remove a private version in random_machdep.c
md5 is standard as a consequence of this.
1998-03-29 11:55:06 +00:00
Peter Dufault
7c9f6f8f8b Remove duplicate comment 1998-03-28 18:16:29 +00:00
Peter Dufault
38c76440b8 Include sys/resource.h to get PRIO_MAX. 1998-03-28 14:49:47 +00:00
Bruce Evans
3c1300a6b3 Removed unused #includes. 1998-03-28 13:25:01 +00:00
Bruce Evans
771b51ef7b Don't depend on <sys/mount.h> including <sys/socket.h>. 1998-03-28 12:04:40 +00:00
Peter Dufault
8a6472b723 Finish _POSIX_PRIORITY_SCHEDULING. Needs P1003_1B and
_KPOSIX_PRIORITY_SCHEDULING options to work.  Changes:

Change all "posix4" to "p1003_1b".  Misnamed files are left
as "posix4" until I'm told if I can simply delete them and add
new ones;

Add _POSIX_PRIORITY_SCHEDULING system calls for FreeBSD and Linux;

Add man pages for _POSIX_PRIORITY_SCHEDULING system calls;

Add options to LINT;

Minor fixes to P1003_1B code during testing.
1998-03-28 11:51:01 +00:00
Bruce Evans
08637435f2 Moved some #includes from <sys/param.h> nearer to where they are actually
used.
1998-03-28 10:33:27 +00:00
Poul-Henning Kamp
c6bcf724da Split the padding out into a separate function.
Synchronize the kernel and libmd versions of md5c.c

PR:		misc/6127
Reviewed by:	phk
Submitted by:	Ari Suutari <ari@suutari.iki.fi>
1998-03-27 10:23:00 +00:00
John Dyson
f9be84912c Correct a problem where buffers might not be zeroed when needed. The
B_MALLOC buffers might not have been properly zeroed.
1998-03-27 06:48:24 +00:00
Poul-Henning Kamp
a0502b19d4 Add two new functions, get{micro|nano}time.
They are atomic, but return in essence what is in the "time" variable.
gettime() is now a macro front for getmicrotime().

Various patches to use the two new functions instead of the various
hacks used in their absence.

Some puntuation and grammer patches from Bruce.

A couple of XXX comments.
1998-03-26 20:54:05 +00:00
Jonathan Lemon
640c4313af Add the ability to make real-mode BIOS calls from the kernel. Currently,
everything is contained inside #ifdef VM86, so this option must be
present in the config file to use this functionality.

Thanks to Tor Egge, these changes should work on SMP machines.  However,
it may not be throughly SMP-safe.

Currently, the only BIOS calls made are memory-sizing routines at bootup,
these replace reading the RTC values.
1998-03-23 19:52:59 +00:00
John Dyson
52c64c95c5 In kern_physio.c fix tsleep priority messup.
In vfs_bio.c, remove b_generation count usage,
	remove redundant reassignbuf,
	remove redundant spl(s),
	manage page PG_ZERO flags more correctly,
	utilize in invalid value for b_offset until it
		is properly initialized.  Add asserts
		for #ifdef DIAGNOSTIC, when b_offset is
		improperly used.
	when a process is not performing I/O, and just waiting
		on a buffer generally, make the sleep priority
		low.
	only check page validity in getblk for B_VMIO buffers.

In vfs_cluster, add b_offset asserts, correct pointer calculation
	for clustered reads.  Improve readability of certain parts of
	the code.  Remove redundant spl(s).

In vfs_subr, correct usage of vfs_bio_awrite (From Andrew Gallatin
	<gallatin@cs.duke.edu>).  More vtruncbuf problems fixed.
1998-03-19 22:48:16 +00:00
John Dyson
1c77c6b7b0 Fix an embarassing problem in vtruncbuf. 1998-03-19 18:46:58 +00:00
John Dyson
4641c8ac1d Correct a problem where data OR metadata could be thrown away if a
buffer is grown.
1998-03-17 17:36:05 +00:00
KATO Takenori
f1aca9c33f Deleted PC-98 code because (1) machine dependent code should not be in
here, and (2) the flag used in PC-98 code has been assigned to another
purpose.
1998-03-17 08:41:28 +00:00
John Dyson
2deb5d0417 Correct a severely evil bug in the vtruncbuf code. It didn't cause
me any problems until after the previous commit.  This problem then
caused a severe case of creeping crud on my diskdrive, and hosed
my system so bad, that I needed to do a complete reinstall.  Sorry!!!

I assume that others have manifest this bug.
1998-03-17 06:30:52 +00:00
Julian Elischer
c2a94b7a3c Remove a soft-update hook that was accidentally added to the READ path.
also add some comments, and a couple of very minor cosmetic changes.
1998-03-16 18:39:41 +00:00
Poul-Henning Kamp
b05dcf3c2f A bunch of BNN (Bruce Normal Nits) from bde:
Bring back the softclock inlining
	save a couple of <<32's
	many white-space shuffles.
1998-03-16 10:19:12 +00:00
John Dyson
e85c1afb7c Allow vfs_ioopt to be enabled with a (temporary) config option. 1998-03-16 02:13:03 +00:00
John Dyson
bef608bd7e Some VM improvements, including elimination of alot of Sig-11
problems.  Tor Egge and others have helped with various VM bugs
lately, but don't blame him -- blame me!!!

pmap.c:
1)	Create an object for kernel page table allocations.  This
	fixes a bogus allocation method previously used for such, by
	grabbing pages from the kernel object, using bogus pindexes.
	(This was a code cleanup, and perhaps a minor system stability
	 issue.)

pmap.c:
2)	Pre-set the modify and accessed bits when prudent.  This will
	decrease bus traffic under certain circumstances.

vfs_bio.c, vfs_cluster.c:
3)	Rather than calculating the beginning virtual byte offset
	multiple times, stick the offset into the buffer header, so
	that the calculated offset can be reused.  (Long long multiplies
	are often expensive, and this is a probably unmeasurable performance
	improvement, and code cleanup.)

vfs_bio.c:
4)	Handle write recursion more intelligently (but not perfectly) so
	that it is less likely to cause a system panic, and is also
	much more robust.

vfs_bio.c:
5)	getblk incorrectly wrote out blocks that are incorrectly sized.
	The problem is fixed, and writes blocks out ONLY when B_DELWRI
	is true.

vfs_bio.c:
6)	Check that already constituted buffers have fully valid pages.  If
	not, then make sure that the B_CACHE bit is not set. (This was
	a major source of Sig-11 type problems.)

vfs_bio.c:
7)	Fix a potential system deadlock due to an incorrectly specified
	sleep priority while waiting for a buffer write operation.  The
	change that I made opens the system up to serious problems, and
	we need to examine the issue of process sleep priorities.

vfs_cluster.c, vfs_bio.c:
8)	Make clustered reads work more correctly (and more completely)
	when buffers are already constituted, but not fully valid.
	(This was another system reliability issue.)

vfs_subr.c, ffs_inode.c:
9)	Create a vtruncbuf function, which is used by filesystems that
	can truncate files.  The vinvalbuf forced a file sync type operation,
	while vtruncbuf only invalidates the buffers past the new end of file,
	and also invalidates the appropriate pages.  (This was a system reliabiliy
	and performance issue.)

10)	Modify FFS to use vtruncbuf.

vm_object.c:
11)	Make the object rundown mechanism for OBJT_VNODE type objects work
	more correctly.  Included in that fix, create pager entries for
	the OBJT_DEAD pager type, so that paging requests that might slip
	in during race conditions are properly handled.  (This was a system
	reliability issue.)

vm_page.c:
12)	Make some of the page validation routines be a little less picky
	about arguments passed to them.  Also, support page invalidation
	change the object generation count so that we handle generation
	counts a little more robustly.

vm_pageout.c:
13)	Further reduce pageout daemon activity when the system doesn't
	need help from it.  There should be no additional performance
	decrease even when the pageout daemon is running.  (This was
	a significant performance issue.)

vnode_pager.c:
14)	Teach the vnode pager to handle race conditions during vnode
	deallocations.
1998-03-16 01:56:03 +00:00
John Dyson
26300b34f1 Disable the vfs.ioopt option for now, so that we don't get gratuitious
bugreports.  I might not be able to fix the problems before 3.0, due
to other, more important things.
1998-03-14 19:50:36 +00:00
Tor Egge
8293f20aee Don't misuse vnode interlocks in routines that can be called from interrupts.
PR:		5893
1998-03-14 02:55:01 +00:00
Peter Dufault
917827427e idprio processes must be preempted as soon as anything is runnable. 1998-03-11 20:50:42 +00:00
Mike Smith
617dd81f70 If the root mount fails from a device that is not the compatability slice
of a disk, because that slice does not exist, try again mounting from the
compatability slice.

This handles the case where a disk has been initialised by 'disklabel
auto', which places a bogus and invalid slice entry on the disk.
The bootstrap is not smart enough to reject this slice, and pretends to
boot from it.  Believing the the bootstrap at this point is unwise.

Booting from non-'wd' disks thus prepared is still broken, as
'disklabel -rwB xdN auto' does not initialise the disk type field, and
the bootstrap mistakenly claims that the disk is handled by 'wd'.

Behaviour is now consistent with DEVFS expected characteristics.
1998-03-11 00:10:31 +00:00
John Birrell
cbe0799aaf Add statements to generate a sys/syscall.mk file for inclusion
during the libc/libc_r to automatically pick up syscall names on
the assumption that default asm code needs to generated for them.

In the up-coming changes to the libc makefiles, there is the option
to provide a machine dependent asm source file which will turn off
the automatic generation of the default. There is also an option
to just stop code being generated for a syscall. In most cases,
though, the default asm code is all that is required, so this
change makes that the most convenient was to do business.

Idea suggested by: bde
1998-03-09 04:00:42 +00:00
Julian Elischer
b1897c197c Reviewed by: dyson@freebsd.org (john Dyson), dg@root.com (david greenman)
Submitted by:	Kirk McKusick (mcKusick@mckusick.com)
Obtained from:  WHistle development tree
1998-03-08 09:59:44 +00:00
John Dyson
eed2412e5a Free the first page also if it is not valid. 1998-03-08 06:21:33 +00:00
John Dyson
8f9110f6a1 This mega-commit is meant to fix numerous interrelated problems. There
has been some bitrot and incorrect assumptions in the vfs_bio code.  These
problems have manifest themselves worse on NFS type filesystems, but can
still affect local filesystems under certain circumstances.  Most of
the problems have involved mmap consistancy, and as a side-effect broke
the vfs.ioopt code.  This code might have been committed seperately, but
almost everything is interrelated.

1)	Allow (pmap_object_init_pt) prefaulting of buffer-busy pages that
	are fully valid.
2)	Rather than deactivating erroneously read initial (header) pages in
	kern_exec, we now free them.
3)	Fix the rundown of non-VMIO buffers that are in an inconsistent
	(missing vp) state.
4)	Fix the disassociation of pages from buffers in brelse.  The previous
	code had rotted and was faulty in a couple of important circumstances.
5)	Remove a gratuitious buffer wakeup in vfs_vmio_release.
6)	Remove a crufty and currently unused cluster mechanism for VBLK
	files in vfs_bio_awrite.  When the code is functional, I'll add back
	a cleaner version.
7)	The page busy count wakeups assocated with the buffer cache usage were
	incorrectly cleaned up in a previous commit by me.  Revert to the
	original, correct version, but with a cleaner implementation.
8)	The cluster read code now tries to keep data associated with buffers
	more aggressively (without breaking the heuristics) when it is presumed
	that the read data (buffers) will be soon needed.
9)	Change to filesystem lockmgr locks so that they use LK_NOPAUSE.  The
	delay loop waiting is not useful for filesystem locks, due to the
	length of the time intervals.
10)	Correct and clean-up spec_getpages.
11)	Implement a fully functional nfs_getpages, nfs_putpages.
12)	Fix nfs_write so that modifications are coherent with the NFS data on
	the server disk (at least as well as NFS seems to allow.)
13)	Properly support MS_INVALIDATE on NFS.
14)	Properly pass down MS_INVALIDATE to lower levels of the VM code from
	vm_map_clean.
15)	Better support the notion of pages being busy but valid, so that
	fewer in-transit waits occur.  (use p->busy more for pageouts instead
	of PG_BUSY.)  Since the page is fully valid, it is still usable for
	reads.
16)	It is possible (in error) for cached pages to be busy.  Make the
	page allocation code handle that case correctly.  (It should probably
	be a printf or panic, but I want the system to handle coding errors
	robustly.  I'll probably add a printf.)
17)	Correct the design and usage of vm_page_sleep.  It didn't handle
	consistancy problems very well, so make the design a little less
	lofty.  After vm_page_sleep, if it ever blocked, it is still important
	to relookup the page (if the object generation count changed), and
	verify it's status (always.)
18)	In vm_pageout.c, vm_pageout_clean had rotted, so clean that up.
19)	Push the page busy for writes and VM_PROT_READ into vm_pageout_flush.
20)	Fix vm_pager_put_pages and it's descendents to support an int flag
	instead of a boolean, so that we can pass down the invalidate bit.
1998-03-07 21:37:31 +00:00
Tor Egge
1146c3560f The APs now reload the interrupt descriptor table pointer after
f00f_hack has run.

Use the global r_idt descriptor in f00f_hack when in SMP mode,
so the APs find the relocated interrupt descriptor table.

Submitted by:	Partially from David A Adkins <adkin003@tc.umn.edu>
1998-03-07 20:16:49 +00:00
John Dyson
9b2e5bad34 Some kern_lock code improvements. Add missing wakeup, and enable
disabling some diagnostics when memory or speed is at a premium.
1998-03-07 19:25:34 +00:00
Bruce Evans
0b8a3ff790 Set the input and output buffer sizes and the input buffer watermarks
dynamically depending on the line speed(s).  This should give the old
sizes and watermarks until drivers are changed.

Display the input watermarks in pstat and sicontrol.
1998-03-07 15:36:29 +00:00
Peter Dufault
917e476dad Reviewed by: msmith, bde long ago
POSIX.4 headers and sysctl variables.  Nothing should change
unless POSIX4 is defined or _POSIX_VERSION is set to 199309.
1998-03-04 10:27:00 +00:00
Peter Dufault
644d85f4ca Reviewed by: msmith, bde long ago
Fix for RTPRIO scheduler to eliminate invalid context switches.

POSIX.4 headers and sysctl variables.  Nothing should change
unless POSIX4 is defined or _POSIX_VERSION is set to 199309.
1998-03-04 10:25:55 +00:00
John Dyson
a638dbdbf4 Fix a rounding error for the NFS buffer validend.
Submitted by:	John W. De Boskey <jwd@unx.sas.com>
1998-03-04 03:17:30 +00:00
Tor Egge
02c1dc3bbc When entering the apic version of slow interrupt handler, level
interrupts are masked, and EOI is sent iff the corresponding ISR bit
is set in the local apic. If the CPU cannot obtain the interrupt
service lock (currently the global kernel lock) the interrupt is
forwarded to the CPU holding that lock.

Clock interrupts now have higher priority than other slow interrupts.
1998-03-03 22:56:30 +00:00
Tor Egge
3163861c7b Forward the signal if the process runs on a different CPU. This reduces
the signal handling latency for cpu-bound processes that performs very
few system calls.

The IPI for forcing an additional software trap is no longer dependent upon
BETTER_CLOCK being defined.
1998-03-03 20:55:26 +00:00
Tor Egge
fe9cd27373 Reduce timeout before assuming that forwarding of hardclock or softclock
failed. Don't complain on forwarding failure, unless
BETTER_CLOCK_DIAGNOSTIC is defined.
1998-03-03 20:09:14 +00:00
Peter Wemm
c8a7999933 Update the ELF image activator to use some of the exec resources rather
than rolling it's own.  This means that it now uses the "safe"
exec_map_first_page() to get the ld.so headers rather than risking a panic
on a page fault failure (eg: NFS server goes down).
Since all the ELF tools go to a lot of trouble to make sure everything
lives in the first page for executables, this is a win.  I have not seen
any ELF executable on any system where all the headers didn't fit in the
first page with lots of room to spare.
I have been running variations of this code for some time on my pure ELF
systems.
1998-03-02 05:47:58 +00:00
John Dyson
59228495d7 Change vfs.ioopt default back to '0'. 1998-03-01 23:07:45 +00:00
Mike Smith
34bdbbd0de The intent is to get rid of WILLRELE in vnode_if.src by making
a complement to all ops that return a vpp, VFS_VRELE.  This is
initially only for file systems that implement the following ops
that do a WILLRELE:

	vop_create, vop_whiteout, vop_mknod, vop_remove, vop_link,
	vop_rename, vop_mkdir, vop_rmdir, vop_symlink

This is initial DNA that doesn't do anything yet.  VFS_VRELE is
implemented but not called.

A default vfs_vrele was created for fs implementations that use the
standard vnode management routines.

VFS_VRELE implementations were made for the following file systems:

Standard (vfs_vrele)
	ffs mfs nfs msdosfs devfs ext2fs

Custom
	union umapfs

Just EOPNOTSUPP
	fdesc procfs kernfs portal cd9660

These implementations may change as VOP changes are implemented.

In the next phase, in the vop implementations calls to vrele and the vrele
part of vput will be moved to the top layer vfs_vnops and made visible
to all layers.  vput will be replaced by unlock in these cases.  Unlocking
will still be done in the per fs layer but the refcount decrement will be
triggered at the top because it doesn't hurt to hold a vnode reference a
little longer.  This will have minimal impact on the structure of the
existing code.

This will only be done for vnode arguments that are released by the various
fs vop implementations.

Wider use of VFS_VRELE will likely require restructuring of the code.

Reviewed by:	phk, dyson, terry et. al.
Submitted by:	Michael Hancock <michaelh@cet.co.jp>
1998-03-01 22:46:53 +00:00
Guido van Rooij
4049a04253 Make sure that you can only bind a more specific address when it is
done by the same uid.
Obtained from: OpenBSD
1998-03-01 19:39:29 +00:00
John Dyson
ffc82b0a70 1) Use a more consistent page wait methodology.
2)	Do not unnecessarily force page blocking when paging
	pages out.
3)	Further improve swap pager performance and correctness,
	including fixing the paging in progress deadlock (except
	in severe I/O error conditions.)
4)	Enable vfs_ioopt=1 as a default.
5)	Fix and enable the page prezeroing in SMP mode.

All in all, SMP systems especially should show a significant
improvement in "snappyness."
1998-03-01 04:18:54 +00:00
Guido van Rooij
d3c0af6943 Raise ncallout from NPROC + 16 to NPROC + 16 + MAXFILES. This shold
prevent a possible DOS attack. The proper fix (to dynamically grow
the callout list) is in the make.
Submitted by:	Paul Traina
1998-02-27 19:58:29 +00:00
Bruce Evans
5132080e71 Removed unused #includes. 1998-02-25 13:08:07 +00:00
Bruce Evans
57518a4e83 Removed a stale comment and staler code. 1998-02-25 06:30:15 +00:00
Bruce Evans
79aa4f4704 Don't depend on "implicit int" or bloat the data section in the
declaration of ptc_devsw_installed.

Fixed a spelling error.
1998-02-25 06:19:15 +00:00
Bruce Evans
2094493a6c Don't depend on "implicit int". 1998-02-25 06:16:37 +00:00
Bruce Evans
8f03c6f18f Declare function pointer args as pointers, not as functions. 1998-02-25 06:13:32 +00:00
Bruce Evans
a0d38b495a Fixed a missing newline in a debugging printf.
Fixed punctuation in some comments.
1998-02-25 06:04:46 +00:00
Bruce Evans
6b16931c00 Removed unused #includes. 1998-02-25 05:58:50 +00:00
Bruce Evans
9c8fff87fc Fixed the calculation of `delta' in settime(). We once set all
times consistently wrong (up to 1 tick too late), but recent changes
fixed the setting of the main clock, making other times inconsistent.
The inconsistencies tended to show up as a negative resource usage
for the process that set the time.

Fixed the check for setting the clock backwards.  A stale timestamp
(`time') was checked, so it was possible to set the clock backwards
by up to almost 1 tick.  Until recently, this bug was compensated
for by setting the clock consistently wrong.

Merged the comment about setting the clock backwards from Lite2.

Removed latency micro-optimizations/speed pessimizations in settime().
microtime() and set_timecounter() are relatively expensive, and
they must be called together with clock updates blocked to get a
consistent `delta', so significant latency optimizations are not
possible.

Removed some stale comments.
1998-02-25 04:10:32 +00:00
John Dyson
8a58a9f6c9 Try to dynamically size the VM_KMEM_SIZE (but is still able to be overridden
in a way identically as before.)  I had problems with the system properly
handling the number of vnodes when there is alot of system memory, and the
default VM_KMEM_SIZE.  Two new options "VM_KMEM_SIZE_SCALE" and
"VM_KMEM_SIZE_MAX" have been added to support better auto-sizing for systems
with greater than 128MB.
1998-02-23 07:41:23 +00:00
John Dyson
64d3c7e32d Clean-up the vget mechanism by permanently attaching VM objects to
vnodes, therefore vget doesn't need to do so anymore.  Other minor
improvements include the temp free vnode queue obeying the VAGE
flag and a printf that warns of to-be-removed code being executed.
1998-02-23 06:59:52 +00:00
Poul-Henning Kamp
7ec73f6417 Replace TOD clock code with more systematic approach.
Highlights:
    * Simple model for underlying hardware.
    * Hardware basis for timekeeping can be changed on the fly.
    * Only one hardware clock responsible for TOD keeping.
    * Provides a real nanotime() function.
    * Time granularity: .232E-18 seconds.
    * Frequency granularity:  .238E-12 s/s
    * Frequency adjustment is continuous in time.
    * Less overhead for frequency adjustment.
    * Improves xntpd performance.

Reviewed by:    bde, bde, bde
1998-02-20 16:36:17 +00:00
Bruce Evans
876a94ee2c Staticized.
Don't depend on "implicit int".
1998-02-20 13:52:15 +00:00
Bruce Evans
e31abede1f Don't depend on "implicit int" or bloat the data section in the
declaration of xxx_devsw_installed.
1998-02-20 13:46:58 +00:00
Bruce Evans
d68fa50ccb Don't depend on "implicit int". 1998-02-20 13:37:40 +00:00
Bruce Evans
39e4376ba7 Removed unused #includes. 1998-02-20 13:11:54 +00:00
Bill Fenner
92f57d003c Revert sosend() to its behavior from 4.3-Tahoe and before: if
so_error is set, clear it before returning it.  The behavior
introduced in 4.3-Reno (to not clear so_error) causes potentially
transient errors (e.g.  ECONNREFUSED if the other end hasn't opened
its socket yet) to be permanent on connected datagram sockets that
are only used for writing.

(soreceive() clears so_error before returning it, as does
getsockopt(...,SO_ERROR,...).)

Submitted by:	Van Jacobson <van@ee.lbl.gov>, via a comment in the vat sources.
1998-02-19 19:38:20 +00:00
Eivind Eklund
d94f38ace2 Add HW_WDOG to LINT, and turn it into a new-style option. 1998-02-16 23:57:49 +00:00
Poul-Henning Kamp
15b7a47005 A bunch of nits from bde. 1998-02-15 14:15:21 +00:00
Poul-Henning Kamp
c7c9a816a1 Add a nanotime() function so that we can start to use this call. 1998-02-15 13:55:06 +00:00
Poul-Henning Kamp
9ada5a50f3 unifdef -UEXT_CLOCK fdef -UEXT_CLOCK, it is irrelevant.
Fix a couple of nits from bde while here anyway.
1998-02-15 13:50:12 +00:00
Bruce Evans
338ca54caf Fixed an aliasing bug. It was too easy to defeat the check for moving
or shrinking an open partition (by changing the label for a compatibility
slice while partitions on the corresponding real slice are open, or vice
versa).
1998-02-15 05:41:31 +00:00
John Dyson
9f24f214c3 Make the rootdir handling more consistent. Now, processes always
have a root vnode associated with them, and no special checks for
the null case are needed.
Submitted by:	terry@freebsd.org
1998-02-15 04:17:09 +00:00
Eivind Eklund
be41061b8b Make NO_LKM a new-style option.
Forgotten by:	dima
1998-02-12 18:02:07 +00:00
Dima Ruban
bd45deefaa I'm not sure whether this is a correct way to do it,
but here's a new kernel option - "NO_LKM"

If anyone has better ideas - please let me know.
1998-02-11 20:47:55 +00:00
David Greenman
c78ab18a81 Fix a && that should be an &.
Reviewed by:	"John S. Dyson" <dyson@FreeBSD.ORG>
Submitted by:	jwd@unx.sas.com (John W. DeBoskey)
1998-02-11 20:06:48 +00:00
Eivind Eklund
e9fe146bb4 Include SIMPLELOCK_DEBUG functions even if SMP if compiling LINT; give
an error for the combination if _not_ compiling LINT.
1998-02-11 00:05:26 +00:00
Eivind Eklund
cfa5673efd Move include of <machine/ipl.h> inside ifndef SMP where it is used, to
avoid getting 'unused include file' warnings in the SMP case.
1998-02-10 17:10:23 +00:00
KATO Takenori
1b11919b2b Fixed vnode interlock handling.
Reviewed by:	Bruce Evans <bde@zeta.org.au>
            	Tor Egge <Tor.Egge@idi.ntnu.no>
1998-02-10 02:54:24 +00:00
Eivind Eklund
303b270b0a Staticize. 1998-02-09 06:11:36 +00:00
John Dyson
3217023e7c Fix a problem with vn_lock in fsync. 1998-02-08 01:41:33 +00:00
KATO Takenori
16e3b0b67a When the vp is lcoked, vget() calls vfs_object_create() with
waslocked = TRUE.  This change may fix lockmgr panic in umapfs/nullfs.

PR:		5634
Reviewed by:	"John S. Dyson" <toor@dyson.iquest.net>
Suggested by:	Bruce Evans <bde@zeta.org.au>
1998-02-07 08:44:31 +00:00
Eivind Eklund
0b08f5f737 Back out DIAGNOSTIC changes. 1998-02-06 12:14:30 +00:00
John Dyson
95461b450d 1) Start using a cleaner and more consistant page allocator instead
of the various ad-hoc schemes.
2)	When bringing in UPAGES, the pmap code needs to do another vm_page_lookup.
3)	When appropriate, set the PG_A or PG_M bits a-priori to both avoid some
	processor errata, and to minimize redundant processor updating of page
	tables.
4)	Modify pmap_protect so that it can only remove permissions (as it
	originally supported.)  The additional capability is not needed.
5)	Streamline read-only to read-write page mappings.
6)	For pmap_copy_page, don't enable write mapping for source page.
7)	Correct and clean-up pmap_incore.
8)	Cluster initial kern_exec pagin.
9)	Removal of some minor lint from kern_malloc.
10)	Correct some ioopt code.
11)	Remove some dead code from the MI swapout routine.
12)	Correct vm_object_deallocate (to remove backing_object ref.)
13)	Fix dead object handling, that had problems under heavy memory load.
14)	Add minor vm_page_lookup improvements.
15)	Some pages are not in objects, and make sure that the vm_page.c can
	properly support such pages.
16)	Add some more page deficit handling.
17)	Some minor code readability improvements.
1998-02-05 03:32:49 +00:00
Eivind Eklund
47cfdb166d Turn DIAGNOSTIC into a new-style option. 1998-02-04 22:34:03 +00:00
David Greenman
1540674007 Restrict idleprio to superuser:
Realtime priority has to be restricted for reasons which should be
obvious. However, for idle priority, there is a potential for
system deadlock if an idleprio process gains a lock on a resource
that other processes need (and the idleprio process can't run
due to a CPU-bound normal process). Fix me! XXX
PR: 5639
1998-02-04 18:43:10 +00:00
Bruce Evans
a3bdf7a34c Fixed staticization. 1998-02-03 21:41:12 +00:00
Bruce Evans
a92ae47539 Updated generated files. 1998-02-03 17:52:21 +00:00
Bruce Evans
14f1d4260d Fixed type of mincore(). 1998-02-03 17:45:43 +00:00
Bruce Evans
125ff6b079 Generate a forward declaration of `struct proc' in <sys/sysproto.h>.
Removed extra args to a printf.

Fixed some style inconsistencies (unnecessary parentheses for printf).
awk is not C.
1998-02-03 17:39:13 +00:00
John Dyson
5abb66d243 Return the vm_map in the eproc structure, so we can support more accurate
VSZ display in PS.
1998-02-02 05:14:03 +00:00
John Dyson
eaf13dd73a Change the busy page mgmt, so that when pages are freed, they
MUST be PG_BUSY.  It is bogus to free a page that isn't busy,
because it is in a state of being "unavailable" when being
freed.  The additional advantage is that the page_remove code
has a better cross-check that the page should be busy and
unavailable for other use.  There were some minor problems
with the collapse code, and this plugs those subtile "holes."

Also, the vfs_bio code wasn't checking correctly for PG_BUSY
pages.  I am going to develop a more consistant scheme for
grabbing pages, busy or otherwise.  For now, we are stuck
with the current morass.
1998-01-31 11:56:53 +00:00
Eivind Eklund
3f2076daf5 Make the debug options new-style.
This also zaps a DPT option from lint; it wasn't referenced from
anywhere.
1998-01-31 07:23:16 +00:00
Eivind Eklund
e0d781f3a5 Make POWERFAIL_NMI, PPS_SYNC and NATM new style options.
This also fixes a couple of defunct options; submitted by bde.
1998-01-31 05:00:21 +00:00
Tor Egge
d09a16d804 Update freevnodes when adding a vnode to the head of the free list. 1998-01-31 01:17:58 +00:00
Poul-Henning Kamp
c5b193bfba Retire LFS.
If you want to play with it, you can find the final version of the
code in the repository the tag LFS_RETIREMENT.

If somebody makes LFS work again, adding it back is certainly
desireable, but as it is now nobody seems to care much about it,
and it has suffered considerable bitrot since its somewhat haphazard
integration.

R.I.P
1998-01-30 11:34:06 +00:00
Steve Price
694ad0a9b1 Fix a couple of operator precedence bugs.
PR:		5450
Submitted by:	Sakari Jalovaara <sja@tekla.fi>
1998-01-25 17:25:41 +00:00
John Dyson
33b90a70cd Various NFS fixes:
Make vfs_bio buffer mgmt work better.
	Buffers were being used after brelse.
	Make nfs_getpages work independently of other NFS
		interfaces.  This eliminates some difficult
		recursion problems and decreases pagefault
		overhead.
	Remove an erroneous vfs_unbusy_pages.
	Fix a reentrancy problem, with nfs_vinvalbuf when
		vnode is already being rundown.
	Reassignbuf wasn't being called when needed under
		certain circumstances.

	(Thanks to Bill Paul for help.)
1998-01-25 06:24:09 +00:00
Eivind Eklund
7b778b5e61 Make all file-system (MFS, FFS, NFS, LFS, DEVFS) related option new-style.
This introduce an xxxFS_BOOT for each of the rootable filesystems.
(Presently not required, but encouraged to allow a smooth move of option *FS
to opt_dontuse.h later.)

LFS is temporarily disabled, and will be re-enabled tomorrow.
1998-01-24 02:54:56 +00:00
John Dyson
50ce7ff499 Add better support for larger I/O clusters, including larger physical
I/O.  The support is not mature yet, and some of the underlying implementation
needs help.  However, support does exist for IDE devices now.
1998-01-24 02:01:46 +00:00
John Dyson
2d8acc0f4a VM level code cleanups.
1)	Start using TSM.
	Struct procs continue to point to upages structure, after being freed.
	Struct vmspace continues to point to pte object and kva space for kstack.
	u_map is now superfluous.
2)	vm_map's don't need to be reference counted.  They always exist either
	in the kernel or in a vmspace.  The vmspaces are managed by reference
	counts.
3)	Remove the "wired" vm_map nonsense.
4)	No need to keep a cache of kernel stack kva's.
5)	Get rid of strange looking ++var, and change to var++.
6)	Change more data structures to use our "zone" allocator.  Added
	struct proc, struct vmspace and struct vnode.  This saves a significant
	amount of kva space and physical memory.  Additionally, this enables
	TSM for the zone managed memory.
7)	Keep ioopt disabled for now.
8)	Remove the now bogus "single use" map concept.
9)	Use generation counts or id's for data structures residing in TSM, where
	it allows us to avoid unneeded restart overhead during traversals, where
	blocking might occur.
10)	Account better for memory deficits, so the pageout daemon will be able
	to make enough memory available (experimental.)
11)	Fix some vnode locking problems. (From Tor, I think.)
12)	Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp.
	(experimental.)
13)	Significantly shrink, cleanup, and make slightly faster the vm_fault.c
	code.  Use generation counts, get rid of unneded collpase operations,
	and clean up the cluster code.
14)	Make vm_zone more suitable for TSM.

This commit is partially as a result of discussions and contributions from
other people, including DG, Tor Egge, PHK, and probably others that I
have forgotten to attribute (so let me know, if I forgot.)

This is not the infamous, final cleanup of the vnode stuff, but a necessary
step.  Vnode mgmt should be correct, but things might still change, and
there is still some missing stuff (like ioopt, and physical backing of
non-merged cache files, debugging of layering concepts.)
1998-01-22 17:30:44 +00:00
Bruce Evans
ffbb164e19 Set p_retval for the correct process in getpriority(). This fixes
a null pointer panic when the pointer for the incorrect process is
NULL.  getpriority() was broken in rev.1.27.  Rev.1.28 broke the
warning instead of fixing the problem.

PR:	5495
1998-01-19 12:39:00 +00:00
John Dyson
4722175765 Tie up some loose ends in vnode/object management. Remove an unneeded
config option in pmap.  Fix a problem with faulting in pages.  Clean-up
some loose ends in swap pager memory management.

The system should be much more stable, but all subtile bugs aren't fixed yet.
1998-01-17 09:17:02 +00:00
Poul-Henning Kamp
6f70df1587 Move almost all the ntp related stuff from kern_clock.c to
kern_ntptime.c.  The only bit left over is that which is executed
in all calls to hardclock().  Various cleanups and staticizing
along the road.
1998-01-14 20:48:16 +00:00
Poul-Henning Kamp
7907a6bc55 Make softticks static.
Remove unneeded stuff.
1998-01-14 19:42:47 +00:00
John Dyson
53f6f08545 Fix another vnode leak. 1998-01-12 03:15:01 +00:00
John Dyson
925a3a419a Fix some vnode management problems, and better mgmt of vnode free list.
Fix the UIO optimization code.
Fix an assumption in vm_map_insert regarding allocation of swap pagers.
Fix an spl problem in the collapse handling in vm_object_deallocate.
When pages are freed from vnode objects, and the criteria for putting
the associated vnode onto the free list is reached, either put the
vnode onto the list, or put it onto an interrupt safe version of the
list, for further transfer onto the actual free list.
Some minor syntax changes changing pre-decs, pre-incs to post versions.
Remove a bogus timeout (that I added for debugging) from vn_lock.

PHK will likely still have problems with the vnode list management, and
so do I, but it is better than it was.
1998-01-12 01:46:33 +00:00
John Dyson
1616db3cf8 Implement the first page access for object type determination more
VM clean.  Also, use vm_map_insert instead of vm_mmap.
Reviewed by:	dg@freebsd.org
1998-01-11 21:35:38 +00:00
Poul-Henning Kamp
bb303fe246 Try to solve timeout race by not touching softtics here. 1998-01-11 19:07:58 +00:00
Poul-Henning Kamp
55c449bc0f Fix softclock calling so we don't loose timeouts (I broke this ~10h ago) 1998-01-11 00:44:31 +00:00
Poul-Henning Kamp
eeb355f73f Whoops. softclock is called from doreti_swi as well. Abandon call from
hardclock().

Forgot this:

Pointed hat sent by:	bd
1998-01-10 14:55:14 +00:00
Poul-Henning Kamp
a50ec50568 Effect the divorce of kern_clock.c and kern_timeout.c (which was
repository copied from kern_clock.c)
1998-01-10 13:16:26 +00:00
Eivind Eklund
e4f4247a08 Make the BOOTP family new-style options (in opt_bootp.h) 1998-01-09 03:21:07 +00:00
Poul-Henning Kamp
9cf25fbec7 Improve hardpps readability a bit:
* Rename usec to p_usec so you can search for it.
* Macroize the huge median_of_3_samples if statement.
1998-01-07 12:29:17 +00:00
John Dyson
857d737ed6 Disable io optimizations again, minor bug found, and will be fixed in
a few days.
1998-01-07 09:26:29 +00:00
John Dyson
95e5e988e0 Make our v_usecount vnode reference count work identically to the
original BSD code.  The association between the vnode and the vm_object
no longer includes reference counts.  The major difference is that
vm_object's are no longer freed gratuitiously from the vnode, and so
once an object is created for the vnode, it will last as long as the
vnode does.

When a vnode object reference count is incremented, then the underlying
vnode reference count is incremented also.  The two "objects" are now
more intimately related, and so the interactions are now much less
complex.

When vnodes are now normally placed onto the free queue with an object still
attached.  The rundown of the object happens at vnode rundown time, and
happens with exactly the same filesystem semantics of the original VFS
code.  There is absolutely no need for vnode_pager_uncache and other
travesties like that anymore.

A side-effect of these changes is that SMP locking should be much simpler,
the I/O copyin/copyout optimizations work, NFS should be more ponderable,
and further work on layered filesystems should be less frustrating, because
of the totally coherent management of the vnode objects and vnodes.

Please be careful with your system while running this code, but I would
greatly appreciate feedback as soon a reasonably possible.
1998-01-06 05:26:17 +00:00
Alexander Langer
de17eb59b4 Added missing caddr_t --> void * conversions for sys/mman.h functions.
Submitted by:	bde
1998-01-01 17:07:46 +00:00
Bruce Evans
b1679c0f7e Use a real malloc type for M_LINKER instead of #defining it as M_TEMP.
Fixed a comment.
1998-01-01 08:56:24 +00:00
John Dyson
483140ead1 Add the vnode interlock back around vref. 1997-12-29 16:54:03 +00:00
Bruce Evans
b2ef07b7a2 Fixed style bugs in previous commit. 1997-12-29 08:54:52 +00:00
John Dyson
60f8d46448 Fix the decl of vfs_ioopt, allow LFS to compile again, fix a minor problem
with the object cache removal.
1997-12-29 01:03:55 +00:00
John Dyson
2be70f79f6 Lots of improvements, including restructring the caching and management
of vnodes and objects.  There are some metadata performance improvements
that come along with this.  There are also a few prototypes added when
the need is noticed.  Changes include:

1) Cleaning up vref, vget.
2) Removal of the object cache.
3) Nuke vnode_pager_uncache and friends, because they aren't needed anymore.
4) Correct some missing LK_RETRY's in vn_lock.
5) Correct the page range in the code for msync.

Be gentle, and please give me feedback asap.
1997-12-29 00:25:11 +00:00
Bruce Evans
f82057be9e Handle "%...p" as "%#...x" instead of "0x%...x". This is a quick fix
for field widths being 2 larger than specified for "%<number>p".  Only
printing of null pointers is "wrong" now (it is actually "right", but
inconsistent with printf(3)).
1997-12-28 05:03:33 +00:00
Bruce Evans
64cfdf460e Restored used include of <sys/malloc.h>. malloc() is not used
here, but kmem_malloc() is used and it takes the same "flags" as
malloc().

Use the mbuf allocation "flags" M_WAIT and M_DONTWAIT consistently.
There is really only one boolean flag, M_DONTWAIT, but the "flags"
were always treated as enum-like values, except in some places here
where the values are tacitly converted to boolean flags.  Treat
them as enum-like values everywhere, except where we tacitly assume
that there are only two values in order to convert them to the
corresponding two kmem_malloc() "flags".
1997-12-28 01:01:13 +00:00
Bruce Evans
675ea6f083 Unspammed nested include of <vm/vm_zone.h>. 1997-12-27 02:56:39 +00:00
Poul-Henning Kamp
71f461f86a Rename "i586_ctr" to "tsc" (both upper and lower case instances).
Fix a couple of printfs too.

Warning: This changes the names of a couple of kernel options!
1997-12-26 20:42:37 +00:00
Gary Palmer
b3b84d9b17 Make kern.ncpu reports the number of detected processors when running
with a SMP kernel.
1997-12-25 13:14:21 +00:00
Nate Williams
e1d6dc656d This patch causes the "calltodo" timer list to be decremented by the amount
of time that the laptop was suspending.  Thus, select() calls that might have
suspended rather than firing  at 1hr + "time suspended" since the timer was
posted.

Adding:

    options  APM_FIXUP_CALLTODO

to the kernel config enables the patch.

[
This patch was slightly modified to use a consistant indent style and
I removed some unused local variables.  After this has been tested a
few weeks we'll make the options the default, so for now I'm now
documenting it in LINT.  Mike can later if he wants.
]

Reviewed by:	Mike Smith <msmith@freebsd.org>
Submitted by:	Ken Key  <key@cs.utk.edu>
1997-12-23 16:32:35 +00:00
John Dyson
6d94bea461 Improve my copyright. 1997-12-22 11:54:00 +00:00
Sean Eric Fagan
d5f81602a7 Clear the p_stops field on change of user/group id, unless the correct
flag is set in the p_pfsflags field.  This, essentially, prevents an SUID
proram from hanging after being traced.  (E.g., "truss /usr/bin/rlogin" would
fail, but leave rlogin in a stopevent state.)  Yet another case where procctl
is (hopefully ;)) no longer needed in the general case.

Reviewed by:	bde (thanks bruce :))
1997-12-20 03:05:47 +00:00
Bruce Evans
214279cec9 Use __inline instead of inline to prevent pedantic compiler warnings. 1997-12-19 23:25:16 +00:00
Bruce Evans
1aa9ea7cb9 Removed some bogus casts. 1997-12-19 23:18:37 +00:00
John Dyson
1efb74fbcc Some performance improvements, and code cleanups (including changing our
expensive OFF_TO_IDX to btoc whenever possible.)
1997-12-19 09:03:37 +00:00
Garrett Wollman
e51d1e8707 Revert poll() for UFS files to traditional behavior where polling for read-
or writability always returns true.  This works around bugs in netscape and
squid, at a minimum.
1997-12-17 14:44:23 +00:00
Eivind Eklund
4027779c22 Regenerate after changing makesyscalls.sh. 1997-12-16 22:27:22 +00:00
Eivind Eklund
0b3c4d50c2 Move around opt_compat include to accomodate Linulator brokenness (for
the time being).
1997-12-16 18:51:45 +00:00
Eivind Eklund
5591b823d1 Make COMPAT_43 and COMPAT_SUNOS new-style options. 1997-12-16 17:40:42 +00:00
David Greenman
c7ce9e2634 Fix bug where a struct buf was free()'d back to the system malloc pool.
Quite amazing that the system runs at all with this bug. Also present in
2.2.5. The bug appears to have come in with changes in rev 1.53.

PR:		might fix PR#5313
Submitted by:	bde
1997-12-16 15:40:29 +00:00
Garrett Wollman
1cbbd625cc Add support for poll(2) on files. vop_nopoll() now returns POLLNVAL
if one of the new poll types is requested; hopefully this will not break
any existing code.  (This is done so that programs have a dependable
way of determining whether a filesystem supports the extended poll types
or not.)

The new poll types added are:

	POLLWRITE - file contents may have been modified
	POLLNLINK - file was linked, unlinked, or renamed
	POLLATTRIB - file's attributes may have been changed
	POLLEXTEND - file was extended

Note that the internal operation of poll() means that it is impossible
for two processes to reliably poll for the same event (this could
be fixed but may not be worth it), so it is not possible to rewrite
`tail -f' to use poll at this time.
1997-12-15 03:09:59 +00:00
Mike Smith
0bec68bf7c Consult sa_len before trampling it with MSG_COMPAT set.
PR:             kern/5291
Submitted by:   pb@fasterix.freenix.org (Pierre Beyssac)
1997-12-15 02:29:11 +00:00
Tor Egge
5c623cb649 Add support for low resolution SMP kernel profiling.
- A nonprofiling version of s_lock (called s_lock_np) is used
    by mcount.

  - When profiling is active, more registers are clobbered in
    seemingly simple assembly routines. This means that some
    callers needed to save/restore extra registers.

  - The stack pointer must have space for a 'fake' return address
    in idle, to avoid stack underflow.
1997-12-15 02:18:35 +00:00
Tor Egge
549a42942d Don't forward hardclock or statclock to stopped cpus. Disable forwarding
when a panic has occured.
1997-12-15 01:14:10 +00:00
John Polstra
a8e99ec909 Make gzipped dynamically linked executables work again. There was
an old bug here that failed to copy the a.out header into memory
properly.  It didn't matter until changes were made recently to
the dynamic linker.
1997-12-14 19:36:24 +00:00
Mike Smith
5af7db2b73 As described by the submitter:
... fix a bug with orecvfrom() or recvfrom() called with
the MSG_COMPAT flag on kernels compiled with the COMPAT_43 option.
The symptom is that the fromaddr is not correctly returned.

This affects the Linux emulator.

Submitted by:	pb@fasterix.freenix.org (Pierre Beyssac)
1997-12-14 03:15:21 +00:00
John Dyson
8256655132 After one of my analysis passes to evaluate methods for SMP TLB mgmt, I
noticed some major enhancements available for UP situations.  The number
of UP TLB flushes is decreased much more than significantly with these
changes.  Since a TLB flush appears to cost minimally approx 80 cycles,
this is a "nice" enhancement, equiv to eliminating between 40 and 160
instructions per TLB flush.

Changes include making sure that kernel threads all use the same PTD,
and eliminate unneeded PTD switches at context switch time.
1997-12-14 02:11:23 +00:00
Tor Egge
80db913bd6 Add needed #include.
Problem found by: Bruce Evans <bde@zeta.org.au>
1997-12-12 21:45:23 +00:00
John Dyson
74b2192ae6 We have had support for running the kernel daemons as threads for
quite a while, but forgot to do so.  For now, this code supports
most daemons  running as kernel threads in UP kernels, and as
full processes in SMP.  We will soon be able to run them as
threads in SMP, but not yet.
1997-12-12 04:00:59 +00:00
John Dyson
648899413d Quiet some lint. 1997-12-10 04:14:23 +00:00
Steve Passe
eae8fc2c8a The improvements to clock statistics by Tor Egge
Wrappered and enabled by the define BETTER_CLOCK (on by default in smpyests.h)

Reviewed by:	smp@csn.net
Submitted by:	Tor Egge <Tor.Egge@idi.ntnu.no>
1997-12-08 23:00:24 +00:00
John-Mark Gurney
0c2a51b49e add process id to tmp files... this prevents two runs from stomping
over each other's tmp files...  (usr.bin/truss uncovered this bug)
1997-12-08 09:00:47 +00:00
John Dyson
78922e413c Correct prototypes to match POSIX. Correct return code for aio_cancel.
Submitted by:	Alex Nash <nash@mcs.com>
1997-12-08 02:18:25 +00:00
Sean Eric Fagan
847e5f5f9a Use at_exit() to invoke procfs_exit() instead of calling it directly.
Note that an unload facility should be used to call rm_at_exit() (if
procfs is being loaded as an LKM and is subsequently removed), but it
was non-obvious how to do this in the VFS framework.

Reviewed by:	Julian Elischer
1997-12-08 01:06:36 +00:00
Sean Eric Fagan
ed1b05436a Surround the call to procfs_exit() by #ifdef PROCFS/#endif -- much to my
surprise, procfs actually is optional, and some people truly do generate
kernels without it.  Wow.  I built a kernel without 'options PROCFS' and
it compiled and linked.
1997-12-07 18:16:43 +00:00
John Dyson
f2e6e69d92 Slight performance improvement, removal of unneeded SPLs. 1997-12-07 04:06:41 +00:00
Bruce Evans
df1c78063c Use ENOIOCTL instead of -1 (= ERESTART) for diskslice ioctls that are
not handled at a particular level.
1997-12-06 14:27:56 +00:00
Bruce Evans
239b7b699e Use ENOIOCTL instead of -1 (= ERESTART) for tty ioctls that are
not handled at a particular level.  This fixes mainly restarting
of interrupted TIOCDRAINs and TIOCSETA{W,F}s.
1997-12-06 13:25:01 +00:00
Sean Eric Fagan
2a024a2b05 Changes to allow event-based process monitoring and control. 1997-12-06 04:11:14 +00:00
Bruce Evans
1cd52ec333 Don't include <sys/lock.h> in headers when only `struct simplelock' is
required.  Fixed everything that depended on the pollution.
1997-12-05 19:55:52 +00:00
John Dyson
d4060a8751 Some fixes from John Hood:
1) Fix the initialization of malloc structure that changed
		due to perf opt.
	2) Remove unneeded include.
	3) An initialization assert added to malloc.
Submitted by:	John Hood <cgull@smoke.marlboro.vt.us>
1997-12-05 05:36:58 +00:00
John-Mark Gurney
4d9deedb49 document and make the NO_F00F_HACK a proper option...
also, sort some option includes while I'm here..

Forgotten by:	sef
1997-12-04 21:21:26 +00:00
Jordan K. Hubbard
e41b6f2db7 After consultation with David, change
#ifndef NO_F00F_HACK
to
#if defined(I586_CPU) && !defined(NO_F00F_HACK)
1997-12-04 14:35:40 +00:00
Sean Eric Fagan
c4fbf2774d Work around for the Intel Pentium F00F bug; this is Intel's recommended
workaround.  Note that this currently eats up two pages extra in the system;
this could be alleviated by aligning idt correctly, and then only dealing with
that (as opposed to the current method of allocated two pages and copying the
IDT table to that, and then setting that to be the IDT table).
1997-12-03 02:45:50 +00:00
Poul-Henning Kamp
ab3f746966 In all such uses of struct buf: 's/b_un.b_addr/b_data/g' 1997-12-02 21:07:20 +00:00
Bruce Evans
52aef196f7 Cleaned up __getcwd(). This should be cosmetic except disabled calls
are now counted.

Reviewed by:	phk
1997-12-02 10:32:21 +00:00
John Dyson
b4b3edc1f4 Fix a serious problem during resizing buffers where old buffers
address space wasn't being properly reclaimed.
Submitted by:	Bruce Evans <bde@freebsd.org>
1997-12-01 19:04:00 +00:00
John Dyson
e499ed6f86 Fix a problem when creating a new kernel thread. In some cases, aio_read
or aio_write can return the pid of the new thread.  This is due to the
way that return values from system calls being passed by side-effect in
the proc structure now.  This commit fixes the problem with aio_read and
aio_write.
1997-12-01 18:41:08 +00:00
Julian Elischer
1b0493988c Cleanup my last patch here
Reviewed by: sef@kthrup.com and phk@freebsd.org
1997-12-01 11:34:41 +00:00
John Dyson
11783b142b Fix error handling for VCHR type I/O. Also, fix another spl problem, and
remove alot of overly verbose debugging statements.
ioproclist {
	int aioprocflags;			/* AIO proc flags */
	TAILQ_ENTRY(aioproclist) list;		/* List of processes */
	struct proc *aioproc;			/* The AIO thread */
	TAILQ_HEAD (,aiocblist) jobtorun;	/* suggested job to run */
};

/*
 * data-structure for lio signal management
 */
struct aio_liojob {
	int lioj_flags;
	int	lioj_buffer_count;
	int	lioj_buffer_finished_count;
	int	lioj_queue_count;
	int	lioj_queue_finished_count;
	struct sigevent lioj_signal;	/* signal on all I/O done */
	TAILQ_ENTRY (aio_liojob) lioj_list;
	struct kaioinfo *lioj_ki;
};
#define	LIOJ_SIGNAL			0x1 /* signal on all done (lio) */
#define	LIOJ_SIGNAL_POSTED	0x2	/* signal has been posted */

/*
 * per process aio data structure
 */
struct kaioinfo {
	int	kaio_flags;			/* per process kaio flags */
	int	kaio_maxactive_count;	/* maximum number of AIOs */
	int	kaio_active_count;	/* number of currently used AIOs */
	int	kaio_qallowed_count;	/* maxiumu size of AIO queue */
	int	kaio_queue_count;	/* size of AIO queue */
	int	kaio_ballowed_count;	/* maximum number of buffers */
	int	kaio_queue_finished_count;	/* number of daemon jobs finished */
	int	kaio_buffer_count;	/* number of physio buffers */
	int	kaio_buffer_finished_count;	/* count of I/O done */
	struct proc *kaio_p;			/* process that uses this kaio block */
	TAILQ_HEAD (,aio_liojob) kaio_liojoblist;	/* list of lio jobs */
	TAILQ_HEAD (,aiocblist)	kaio_jobqueue;	/* job queue for process */
	TAILQ_HEAD (,aiocblist)	kaio_jobdone;	/* done queue for process */
	TAILQ_HEAD (,aiocblist)	kaio_bufqueue;	/* buffer job queue for process */
	TAILQ_HEAD (,aiocblist)	kaio_bufdone;	/* buffer done queue for process */
};

#define KAIO_RUNDOWN 0x1		/* process is being run down */
#define KAIO_WAKEUP 0x2			/* wakeup process when there is a significant
								   event */


TAILQ_HEAD (,aioproclist) aio_freeproc, aio_activeproc;
TAILQ_HEAD(,aiocblist) aio_jobs;			/* Async job list */
TAILQ_HEAD(,aiocblist) aio_bufjobs;			/* Phys I/O job list */
TAILQ_HEAD(,aiocblist) aio_freejobs;		/* Pool of free jobs */

static void aio_init_aioinfo(struct proc *p) ;
static void aio_onceonly(void *) ;
static int aio_free_entry(struct aiocblist *aiocbe);
static void aio_process(struct aiocblist *aiocbe);
static int aio_newproc(void) ;
static int aio_aqueue(struct proc *p, struct aiocb *job, int type) ;
static void aio_physwakeup(struct buf *bp);
static int aio_fphysio(struct proc *p, struct aiocblist *aiocbe, int type);
static int aio_qphysio(struct proc *p, struct aiocblist *iocb);
static void aio_daemon(void *uproc);

SYSINIT(aio, SI_SUB_VFS, SI_ORDER_ANY, aio_onceonly, NULL);

static vm_zone_t kaio_zone=0, aiop_zone=0,
	aiocb_zone=0, aiol_zone=0, aiolio_zone=0;

/*
 * Single AIOD vmspace shared amongst all of them
 */
static struct vmspace *aiovmspace = NULL;

/*
 * Startup initialization
 */
void
aio_onceonly(void *na)
{
	TAILQ_INIT(&aio_freeproc);
	TAILQ_INIT(&aio_activeproc);
	TAILQ_INIT(&aio_jobs);
	TAILQ_INIT(&aio_bufjobs);
	TAILQ_INIT(&aio_freejobs);
	kaio_zone = zinit("AIO", sizeof (struct kaioinfo), 0, 0, 1);
	aiop_zone = zinit("AIOP", sizeof (struct aioproclist), 0, 0, 1);
	aiocb_zone = zinit("AIOCB", sizeof (struct aiocblist), 0, 0, 1);
	aiol_zone = zinit("AIOL", AIO_LISTIO_MAX * sizeof (int), 0, 0, 1);
	aiolio_zone = zinit("AIOLIO",
		AIO_LISTIO_MAX * sizeof (struct aio_liojob), 0, 0, 1);
	aiod_timeout = AIOD_TIMEOUT_DEFAULT;
	aiod_lifetime = AIOD_LIFETIME_DEFAULT;
	jobrefid = 1;
}

/*
 * Init the per-process aioinfo structure.
 * The aioinfo limits are set per-process for user limit (resource) management.
 */
void
aio_init_aioinfo(struct proc *p)
{
	struct kaioinfo *ki;
	if (p->p_aioinfo == NULL) {
		ki = zalloc(kaio_zone);
		p->p_aioinfo = ki
1997-12-01 07:01:45 +00:00
John Dyson
f4f0ecefab Correct a last minute code change. Would have been an infinite loop under
certain error conditions.
Submitted by:	pst@shockwave.com
1997-11-30 23:21:08 +00:00
John Dyson
c5efdcbdec Fix an spl nit. 1997-11-30 21:47:36 +00:00
John Dyson
84af4da65a Finish up the vast majority of the AIO/LIO functionality. Proper signal
support was missing in the previous version of the AIO code.  More
tunables added, and very efficient support for VCHR files has been added.
Kernel threads are not used for VCHR files, all work for such files is
done for the requesting process directly.  Some attempt has been made to
charge the requesting process for resource utilization, but more work
is needed.  aio_fsync is still missing (but the original fsync system
call can be used for now.)  aio_cancel is essentially a noop, but that
is okay per POSIX.  More aio_cancel functionality can be added later,
if it is found to be needed.

The functions implemented include:
	aio_read, aio_write, lio_listio, aio_error, aio_return,
	aio_cancel, aio_suspend.

The code has been implemented to support the POSIX spec 1003.1b
(formerly known as POSIX 1003.4 spec) features of the above.  The
async I/O features are truly async, with the VCHR mode of operation
being essentially the same as physio (for appropriate files) for
maximum efficiency.  This code also supports the signal capability,
is highly tunable, allowing management of resource usage, and
has been written to allow a per process usage quota.

Both the O'Reilly POSIX.4 book and the actual POSIX 1003.1b document
were the reference specs used.  Any filedescriptor can be used with
these new system calls.  I know of no exceptions where these
system calls will not work.  (TTY's will also probably work.)
1997-11-30 04:36:31 +00:00
John Dyson
f4feb04e1f Disable the VCHR optimization for AIO until I have implemented it. Just in
case anyone wants to play with the POSIX AIO/LIO stuff.  (As it is, it should
work with ANY vnode, on UP systems only, for now.)
1997-11-29 02:57:46 +00:00
John Dyson
fd3bf77574 Fix and complete the AIO syscalls. There are some performance enhancements
coming up soon, but the code is functional.  Docs will be forthcoming.
1997-11-29 01:33:10 +00:00
Julian Elischer
95802bf803 Shift a few SYSINT() calls around.
this results in a few functions becoming static, and
the SYSINITs being close to the code they are related to.
setting up the dump device is with dumpsys() and
kicking off the scheduler is with the scheduler.
Mounting root is with the code that does it.

Reviewed by: phk
1997-11-25 07:07:48 +00:00
Bruce Evans
c463cf1cae Fixed multiple definitions of boothowto.
Fixed bitrot in the read-only access to kern.boottime.
1997-11-24 18:35:04 +00:00
Bruce Evans
b672aa4ba6 Removed all traces of P_IDLEPROC. It was tested but never set. 1997-11-24 15:15:33 +00:00
Bruce Evans
21e5241572 Fixed some #include messes.
Hid the check of the user %cs in syscall() under `#ifdef DIAGNOSTIC'.
1997-11-24 13:25:37 +00:00
John Dyson
4ced7dd5bf Avoid manipulating the buffer map at interrupt time by deferring bfreekva
to getnewbuf, and remove from brelse.
Reviewed by:	dg@root.com
1997-11-24 06:18:27 +00:00
John Dyson
289500ad9e Fix the buffer flag frobbing. Note: It is invalid to gratuitiously modify
b_flags, and this patch removes unneeded modifications.  Only the needed b_flags
bits are modified now.  (Specifically, it is usually wrong to zero b_flags.)
Submitted by:	bde@freebsd.org
1997-11-24 04:14:21 +00:00
Bruce Evans
a3c78a768e Fixed a missing conversion of retval to p_retval in disabled code.
Fixed overflow of FFLAGS() in fcntl(F_SETFL, ...).  This was not
a security hole, but gave wrong results for silly flags values.
E.g., it make fcntl(F_SETFL, -1) equivalent to fcntl(F_SETFL, 0).
POSIX requires ignoring the open mode bits in fcntl() (even if
they would be invalid for open()).
1997-11-23 12:24:59 +00:00
Bruce Evans
d826c47904 Fixed duplicate definitions of M_FILE (one static). 1997-11-23 10:43:49 +00:00
Bruce Evans
2087c8967c Fixed some style bugs in the poll() code.
Removed dead code to "Avoid inadvertently sleeping forever".  hzto()
never returns 0.
1997-11-23 10:30:50 +00:00
Bruce Evans
cb451ebdbd Staticized. 1997-11-22 08:35:46 +00:00
Bruce Evans
865737f450 Staticized.
Use OID_AUTO instead of a magic number for the debug.syncprt sysctl.
(This sysctl doesn't actually work.  FreeBSD nuked it, but parts
of it were mismerged from Lite2.  It is not very good, but better
than nothing.)
1997-11-22 06:41:21 +00:00
Bruce Evans
d02601f8cf Fixed rev.1.81. mp->mnt_kern_flag was restored in the non-error case of
`mount -u'.  This only matters for `mount -u' competing with unmounts.
If I understand the locking correctly: if mount() blocks, then unmount()
may run and set mp->kern_flag for the same mp.  Then unmount() blocks
waiting for mount() to finish.  When unmount() continues, its MNTK flags
(MNTK_UNMOUNT and MNTK_MWAIT) may have been clobbered.

Didn't fix old bugs:
- restoring mp->mnt_kern_flag is wrong for the same reasons in the error
  case.
- the error case of unmount() seems to be broken too:
  (a) MNTK_UNMOUNT gets clobbered, although another unmount() may have
      set it.  Perhaps it shouldn't be set until after the full lock is
      aquired.
  (b) MNTK_MWAIT isn't honoured.

Fixed a nearby style bug.
1997-11-22 06:10:36 +00:00
Bruce Evans
6881d20fbb Const poisoning from ks_shortdesc. 1997-11-21 11:37:03 +00:00
Bruce Evans
d73424aa6b Fixed a sloppy common-style definitions. 1997-11-20 20:07:59 +00:00
Bruce Evans
01166e9245 Avoid passing a `retval' to wait1()
Disallow wait options that are not a combination of the standard POSIX
options WUNTRACED and WNOHANG, as is required by POSIX.  BSD doesn't
have any extensions here, but the code was `#ifdef notyet' for some
reason.
1997-11-20 19:09:43 +00:00
Bruce Evans
be67169a57 Removed unused includes.
Staticized.

Avoid passing a `retval' to fork1().

Fixed some style bugs.
1997-11-20 16:36:17 +00:00
Bruce Evans
d96bc99dac Removed unused #includes. Ifdefed a conditionally used #include.
Fixed nonblocking mode.  It was per-device instead of per-file.

Don't depend on gcc's misfeature of rewriting char args in old-style
function definitions to match wrong prototypes.  Break K&R1 support
to fix this quickly.
1997-11-18 16:12:51 +00:00
Bruce Evans
fc8f7066d2 Get buffer stuff by #including <sys/buf.h> instead of <sys/vnode.h>.
Staticized boot().

Fixed a gratuitous ANSIism.
1997-11-18 15:16:43 +00:00
Bruce Evans
d662024a0d Removed an unused #include.
Fixed a style bug (one of many KNF breakages in vfs_subr.c moved here).
1997-11-18 13:03:48 +00:00
Bruce Evans
1fe5398ce7 Get tty ioctl numbers by #including <sys/ttycom.h> instead of
<sys/tty.h>.  Don't #include <sys/fcntl.h> (the select -> poll
changes removed all dependencies on it).
1997-11-18 12:59:09 +00:00
Bruce Evans
b7f5d5b54d Removed an unused #include. Added an unsed #include of <sys/ucred.h>
to prepare for not including it in <sys/param.h>.  Moved conditionally
used #includes inside an ifdef.
1997-11-18 12:52:10 +00:00
Bruce Evans
e4474ce8bf Removed an unused #include. Ifdefed a conditionally used #include. 1997-11-18 12:43:41 +00:00
Bruce Evans
99af8d4e43 Removed unused #include. 1997-11-18 12:24:22 +00:00
Bruce Evans
fdebd4f0fd Get locking stuff by #including <sys/lock.h> instead of <sys/vnode.h>. 1997-11-18 10:02:40 +00:00
Peter Wemm
6245570f67 Don't generate new prototype files with the extra int retval[] arg at
the end since pdk deleted them.

Forgotten by: phk
1997-11-18 03:34:39 +00:00
Julian Elischer
52bf64c787 Reviewed by: hackers@freebsd.org in general
Obtained from: Whistle Communications tree

Add an option to the way UFS works dependent on the SUID bit of directories
This changes makes things a whole lot simpler on systems running as
fileservers for PCs and MACS. to enable the new code you must
1/ enable option SUIDDIR on the kernel.
2/ mount the filesystem with option suiddir.
hopefully this makes it difficult enough for people to
do this accidentally.
see the new chmod(2) man page for detailed info.
1997-11-13 00:28:51 +00:00
Tor Egge
d72ec6655e Set return value for the correct process in ptrace(). 1997-11-12 12:28:12 +00:00
Julian Elischer
b1f4a44b03 Reviewed by: various.
Ever since I first say the way the mount flags were used I've hated the
fact that modes, and events, internal and exported, and short-term
and long term flags are all thrown together. Finally it's annoyed me enough..
This patch to the entire FreeBSD tree adds a second mount flag word
to the mount struct. it is not exported to userspace. I have moved
some of the non exported flags over to this word. this means that we now
have 8 free bits in the mount flags. There are another two that might
well move over, but which I'm not sure about.
The only user visible change would have been in pstat -v, except
that davidg has disabled it anyhow.
I'd still like to move the state flags and the 'command' flags
apart from each other.. e.g. MNT_FORCE really doesn't have the
same semantics as MNT_RDONLY, but that's left  for another day.
1997-11-12 05:42:33 +00:00
Jordan K. Hubbard
64bd2f7b57 MF22: MSG_EOR bug fix.
Submitted by:	wollman
1997-11-09 05:07:40 +00:00
Tor Egge
31e5225482 Use UPAGES when setting up private pages for SMP (which includes idle stack). 1997-11-07 19:58:34 +00:00
Poul-Henning Kamp
0abc78a697 Rename some local variables to avoid shadowing other local variables.
Found by: -Wshadow
1997-11-07 09:21:01 +00:00
Poul-Henning Kamp
4a11ca4e29 Remove a bunch of variables which were unused both in GENERIC and LINT.
Found by:	-Wunused
1997-11-07 08:53:44 +00:00
Poul-Henning Kamp
cb226aaa62 Move the "retval" (3rd) parameter from all syscall functions and put
it in struct proc instead.

This fixes a boatload of compiler warning, and removes a lot of cruft
from the sources.

I have not removed the /*ARGSUSED*/, they will require some looking at.

libkvm, ps and other userland struct proc frobbing programs will need
recompiled.
1997-11-06 19:29:57 +00:00
Poul-Henning Kamp
d1bbc7ec65 Remove the long description from the in-kernel datastructure.
Put a magic field in there instead, to help catch uninitialized
malloc types.
1997-10-28 19:01:02 +00:00
Bruce Evans
55b211e3af Removed unused #includes. 1997-10-28 15:59:26 +00:00
Bruce Evans
1315bcf7d4 Fixed style bugs in open() fix. 1997-10-28 10:29:55 +00:00
Bruce Evans
4090154b9e Moved declaration of etext from <machine/md_var.h> to <machine/cpu.h>
and fixed everything that dependended on it being declared in the old
place.  It is used in "machine-independent" code in subr_prof.c.

Moved declaration of btext from subr_prof.c to <machine/cpu.h>.  It
is machine-dependent.
1997-10-27 17:23:18 +00:00
Bruce Evans
32e4d4c5e6 Use 127 instead of CHAR_MAX for the limit on the sequence count. The
limit doesn't have anything to do with characters.  The count mainly
needs to fit in the VOP_READ() ioflag after being left shifted by 16.

Moved vn_lock() before vn_closefile().  vn_lock() was mismerged from
Lite2.

Removed some gratuitous braces.
1997-10-27 15:26:23 +00:00
Poul-Henning Kamp
dba3870c10 VFS interior redecoration.
Rename vn_default_error to vop_defaultop all over the place.
Move vn_bwrite from vfs_bio.c to vfs_default.c and call it vop_stdbwrite.
Use vop_null instead of nullop.
Move vop_nopoll from vfs_subr.c to vfs_default.c
Move vop_sharedlock from vfs_subr.c to vfs_default.c
Move vop_nolock from vfs_subr.c to vfs_default.c
Move vop_nounlock from vfs_subr.c to vfs_default.c
Move vop_noislocked from vfs_subr.c to vfs_default.c
Use vop_ebadf instead of *_ebadf.
Add vop_defaultop for getpages on master vnode in MFS.
1997-10-26 20:55:39 +00:00
Poul-Henning Kamp
e83f76772d Remade syscalls.master derived files. 1997-10-26 20:28:54 +00:00
Poul-Henning Kamp
e6e21bc0a4 Add "NOIMPL" for syscalls we know what is, but don't implement as "STD".
Use this for getfh & nfssvc.
1997-10-26 20:27:51 +00:00
Poul-Henning Kamp
1b09ae776d Simplify the lease_check stuff. 1997-10-26 20:26:33 +00:00
John-Mark Gurney
f173e80c8b make a couple functions static...
also change module_register_static to module_register_init as this
function initalizes the module for both dynamic and static modules...
1997-10-24 05:29:07 +00:00
KATO Takenori
1c1ff2947b Disallow non-root mount. If you want to allow non-root mount, change
vfs.usermount into 1 with sysctl.
1997-10-23 09:29:09 +00:00
Joerg Wunsch
2094bd7342 Reject attempts to call open() with an illegal combination of O_RDONLY,
O_WRONLY, O_RDWR.
1997-10-22 07:28:51 +00:00
Poul-Henning Kamp
c988324000 Add const to a couple of casts to silence some of the warnings Bruce
has let loose on us.
1997-10-21 13:28:36 +00:00
Andrey A. Chernov
5332bc655c Fix returned sleep period for large values
Submitted by: bde
1997-10-20 18:43:49 +00:00
David Greenman
916ca17535 kern.maxproc is not writable since there are tables that are statically
sized at startup.
PR: 4675
1997-10-19 18:45:59 +00:00
David Greenman
7aef05003a Killed non-sensical call to splimp/splx in crfree(). 1997-10-17 23:52:56 +00:00
Poul-Henning Kamp
d54d34b533 Make a set of VOP standard lock, unlock & islocked VOP operators, which
depend on the lock being located at vp->v_data.  Saves 3x3 identical
vop procs, more as the other filesystems becomes lock aware.
1997-10-17 12:36:19 +00:00
Poul-Henning Kamp
e9565321ea VFS clean up "hekto commit"
1.  Add defaults for more VOPs
        VOP_LOCK        vop_nolock
        VOP_ISLOCKED    vop_noislocked
        VOP_UNLOCK      vop_nounlock
    and remove direct reference in filesystems.

2.  Rename the nfsv2 vnop tables to improve sorting order.
1997-10-16 22:01:05 +00:00
Poul-Henning Kamp
987f569678 Another VFS cleanup "kilo commit"
1.  Remove VOP_UPDATE, it is (also) an UFS/{FFS,LFS,EXT2FS,MFS}
    intereface function, and now lives in the ufsmount structure.

2.  Remove VOP_SEEK, it was unused.

3.  Add mode default vops:

    VOP_ADVLOCK          vop_einval
    VOP_CLOSE            vop_null
    VOP_FSYNC            vop_null
    VOP_IOCTL            vop_enotty
    VOP_MMAP             vop_einval
    VOP_OPEN             vop_null
    VOP_PATHCONF         vop_einval
    VOP_READLINK         vop_einval
    VOP_REALLOCBLKS      vop_eopnotsupp

    And remove identical functionality from filesystems

4.   Add vop_stdpathconf, which returns the canonical stuff.  Use
     it in the filesystems.  (XXX: It's probably wrong that specfs
     and fifofs sets this vop, shouldn't it come from the "host"
     filesystem, for instance ufs or cd9660 ?)

5.   Try to make system wide VOP functions have vop_* names.

6.   Initialize the um_* vectors in LFS.

(Recompile your LKMS!!!)
1997-10-16 20:32:40 +00:00
Poul-Henning Kamp
2df896458d Oops. forgot the blasted cvs add.
Pointed hat sent from:	Karl Denninger  <karl@Mcs.Net>
1997-10-16 17:48:22 +00:00
Poul-Henning Kamp
cec0f20ce7 VFS mega cleanup commit (x/N)
1.  Add new file "sys/kern/vfs_default.c" where default actions for
    VOPs go. Implement proper defaults for ABORTOP, BWRITE, LEASE,
    POLL, REVOKE and STRATEGY.  Various stuff spread over the entire
    tree belongs here.

2.  Change VOP_BLKATOFF to a normal function in cd9660.

3.  Kill VOP_BLKATOFF, VOP_TRUNCATE, VOP_VFREE, VOP_VALLOC.  These
    are private interface functions between UFS and the underlying
    storage manager layer (FFS/LFS/MFS/EXT2FS).  The functions now
    live in struct ufsmount instead.

4.  Remove a kludge of VOP_ functions in all filesystems, that did
    nothing but obscure the simplicity and break the expandability.
    If a filesystem doesn't implement VOP_FOO, it shouldn't have an
    entry for it in its vnops table.  The system will try to DTRT
    if it is not implemented.  There are still some cruft left, but
    the bulk of it is done.

5.  Fix another VCALL in vfs_cache.c (thanks Bruce!)
1997-10-16 10:50:27 +00:00
Julian Elischer
4199ed4098 We are mounting the root.
mount it at the HEAD of the queue, DEVFS might already be there..
1997-10-16 07:32:14 +00:00
Guido van Rooij
d021ae3db5 On execing a sgid program, do not set P_SUGID when cr_gid and cr)_uid
do not change.
PR:		4755
Reviewed by:	Bruce Evans
1997-10-15 18:28:34 +00:00
Peter Wemm
987163643c Sigh. Signal handlers are executed on leaving the system call, not
at moment of delivery.  Restoring the signal mask after the tsleep()
is next to useless since the signal is still queued.. This was interacting
with usleep(3) on receipt of a SIGALRM causing it to near busy loop.

Now, we set the new signal mask "permanently" for signanosleep().

Problem noted by:  bde
1997-10-15 13:58:52 +00:00
Poul-Henning Kamp
138ec1f71a vnops megacommit
1.  Use the default function to access all the specfs operations.
2.  Use the default function to access all the fifofs operations.
3.  Use the default function to access all the ufs operations.
4.  Fix VCALL usage in vfs_cache.c
5.  Use VOCALL to access specfs functions in devfs_vnops.c
6.  Staticize most of the spec and fifofs vnops functions.
7.  Make UFS panic if it lacks bits of the underlying storage handling.
1997-10-15 13:24:07 +00:00
Poul-Henning Kamp
a1c995b626 Last major round (Unless Bruce thinks of somthing :-) of malloc changes.
Distribute all but the most fundamental malloc types.  This time I also
remembered the trick to making things static:  Put "static" in front of
them.

A couple of finer points by:	bde
1997-10-12 20:26:33 +00:00
Peter Wemm
0c111a5c17 Try and fix some style problems 1997-10-12 15:24:39 +00:00
Poul-Henning Kamp
55166637cd Distribute and statizice a lot of the malloc M_* types.
Substantial input from:	bde
1997-10-11 18:31:40 +00:00
Poul-Henning Kamp
22c6434807 Freeing with unknown type is a panic kind of thing. 1997-10-11 13:13:09 +00:00
Poul-Henning Kamp
a6cf5c6dfc Remove all traces of M_VFSCONF, which were for all practical
purposes unused.
1997-10-11 13:11:32 +00:00
Poul-Henning Kamp
a42428bb03 Remove a debug printf entirely. 1997-10-11 10:49:43 +00:00
Peter Wemm
bd975223e0 Disable an extremely annoying printf. 1997-10-11 10:41:44 +00:00
Poul-Henning Kamp
f7891f9adb Dike out a weird warning. 1997-10-11 07:34:27 +00:00
John Dyson
c486068695 Make the target for the number of AIO daemons work. 1997-10-11 01:07:03 +00:00
Poul-Henning Kamp
60a513e942 Rename "struct kmemstats" to "struct malloc_type" it makes more sense now.
Fix type argument to hashinit() and phashinit()
1997-10-10 18:14:23 +00:00
Poul-Henning Kamp
254c6cb330 Make malloc more extensible. The malloc type is now a pointer to
the struct kmemstats that describes the type.

This allows subsystems to declare their malloc types locally
and <sys/malloc.h> doesn't need tweaked everytime somebody
gets an idea.  You can even have a type local to a lkm...

I don't know if we really need the longdesc, comments welcome.

TODO: There is a single nit in ext2fs, that will be fixed later,
and I intend to remove all unused malloc types and distribute
the rest closer to their use.
1997-10-10 14:06:34 +00:00
Peter Wemm
b67dffdad2 Compensate for pcb.h tweaks.
(Bruce pointed out the nesting)
1997-10-10 12:42:54 +00:00
Peter Wemm
98823b2366 Convert the VM86 option from a global option to an option only depended
on by the files that use it.  Changing the VM86 option now only causes
a recompile of a dozen files or so rather than the entire kernel.
1997-10-10 09:44:12 +00:00
John Dyson
a624e84fff Major cleanup and debugging of the new AIO/LIO code. The LIO code is
now corrected.  New tunables/instrumentation added.  The code is now
likely "good enough to use."  I will add the userland support soon.
The "high performance" mode for raw devices is still missing, and will
be added next.  POSIX system calls that now appear to work:
aio_cancel, aio_error, aio_read, aio_return, aio_suspend, aio_write,
lio_listio.  Missing, but to be added soon: aio_fsync.
1997-10-09 04:14:41 +00:00
Peter Wemm
115526d03e Ack! Fix excessive cut/paste blunder during poll mods. Who had the
pointy hat last? :-]

When one is selecting (or polling) for write, it helps if we use the
write side of the pipe when requesting wakeups instead of the read side.
This broke ghostview (at least) - I'm suprised it wasn't noticed for
so long.

Reviewed by:  Greg Lehey <grog@lemis.com>
1997-10-06 08:30:08 +00:00
Nate Williams
fd35ecc322 - Hide the 'device doesn't supported shared interrupts' code behind
bootverbose, since the older register_intr() code didn't print out
  anything, and the laptop support will cause lots of these un-necessary
  messages.
1997-10-06 04:27:32 +00:00
John Dyson
e7b0208f61 Relax the vnode locking for read only operations. 1997-10-06 02:38:30 +00:00
John Dyson
42c0de4926 It is possible that MB's with really broken bios's not set up more of
the mtrr registers.  This just fills in more of the registers.
1997-10-06 02:11:32 +00:00
John Dyson
c63ba9f5ae Make sure that the memory type registers are the same for each CPU
in a P6 SMP system.  Some MB bios'es don't set the registers up correctly
for the AP's.  Additionally, set the memory between 0xa0000 and 0xbffff
as write combining.
1997-10-05 03:19:29 +00:00
Poul-Henning Kamp
eabecea346 While booting diskless we have no proc pointer. 1997-10-04 18:21:15 +00:00
Poul-Henning Kamp
ad324c8891 Fix handling of nested mountpoints in __getcwd()
Detected by:	Simon Shapiro <Shimon@i-Connect.Net>
1997-09-28 06:37:02 +00:00
Joerg Wunsch
91f7577b37 Hide the `no magic' babble behind bootverbose, since it has proven to
be too much magic for 99.9 % of the users.
1997-09-27 15:34:34 +00:00
KATO Takenori
81bca6ddae Clustered read and write are switched at mount-option level.
1. Clustered I/O is switched by the MNT_NOCLUSTERR and MNT_NOCLUSTERW
   bits of the mnt_flag.  The sysctl variables, vfs.foo.doclusterread
   and vfs.foo.doclusterwrite are deleted.  Only mount option can
   control clustered I/O from userland.
2. When foofs_mount mounts block device, foofs_mount checks D_CLUSTERR
   and D_CLUSTERW bits of the d_flags member in the block device switch
   table.  If D_NOCLUSTERR / D_NOCLUSTERW are set, MNT_NOCLUSTERR /
   MNT_NOCLUSTERW bits will be set.  In this case, MNT_NOCLUSTERR and
   MNT_NOCLUSTERW cannot be cleared from userland.
3. Vnode driver disables both clustered read and write.
4. Union filesystem disables clutered write.

Reviewed by:	bde
1997-09-27 13:40:20 +00:00
Poul-Henning Kamp
d047b580c6 I lost a bit of my change in the last commit, this is more like it.
Noticed by:	bde
1997-09-26 08:08:58 +00:00
Poul-Henning Kamp
87b1940afa Reduce the target number of vnodes on the freelist from desiredvnodes
(usually a couple of thousand) to 25.  The measured impact on cache-hits
doesn't justify spending memory this way:

Target number of free vnodes versus namecache hit rate in % during a
make world:
          10    98.5316
         200    98.5479
         500    98.5546
        1000    98.5709
        3000    98.6006
        4000    98.6126
1997-09-25 16:17:57 +00:00
Justin T. Gibbs
453276111e Store an absolute tick value in callout entries so that a subtraction on
hash chain traversal isn't needed.  This also allows untimeout to recompute
the hash to find the bucket that the entry to remove is stored in so
that each callout entry no longer needs to store that information.

Reviewed by:	 Nate Williams <nate@mt.sri.com>
1997-09-24 16:39:27 +00:00
Poul-Henning Kamp
46c320bab8 Add one more counter so we can truly find out how good our name cache
is.  If we don't find something and don't what to have found something,
it's actually a success.
1997-09-24 15:54:10 +00:00
Poul-Henning Kamp
0054419366 A couple of handles to tweak, more statistics. 1997-09-24 07:46:54 +00:00
Julian Elischer
e29b30aa7d urk, fix spelling error in comment I just fixed. 1997-09-21 22:20:12 +00:00
Julian Elischer
4010d6d962 Fix a comment. 1997-09-21 22:14:54 +00:00
Justin T. Gibbs
14d9217731 Convert tqdisksort to bufqdisksort. Honor the B_ORDERED buffer flag
so that meta-data writes go out to the device in the right order.
1997-09-21 22:10:49 +00:00
Justin T. Gibbs
ab36c06737 init_main.c subr_autoconf.c:
Add support for "interrupt driven configuration hooks".
	A component of the kernel can register a hook, most likely
	during auto-configuration, and receive a callback once
	interrupt services are available.  This callback will occur before
	the root and dump devices are configured, so the configuration
	task can affect the selection of those two devices or complete
	any tasks that need to be performed prior to launching init.
	System boot is posponed so long as a hook is registered.  The
	hook owner is responsible for removing the hook once their task
	is complete or the system boot can continue.

kern_acct.c kern_clock.c kern_exit.c kern_synch.c kern_time.c:
	Change the interface and implementation for the kernel callout
	service.  The new implemntaion is based on the work of
	Adam M. Costello and George Varghese, published in a technical
	report entitled "Redesigning the BSD Callout and Timer Facilities".
	The interface used in FreeBSD is a little different than the one
	outlined in the paper.  The new function prototypes are:

	struct callout_handle timeout(void (*func)(void *),
				      void *arg, int ticks);

	void untimeout(void (*func)(void *), void *arg,
		       struct callout_handle handle);

	If a client wishes to remove a timeout, it must store the
	callout_handle returned by timeout and pass it to untimeout.

	The new implementation gives 0(1) insert and removal of callouts
	making this interface scale well even for applications that
	keep 100s of callouts outstanding.

	See the updated timeout.9 man page for more details.
1997-09-21 22:00:25 +00:00
Justin T. Gibbs
919429034e autoconf.c:
Add cpu_rootconf and cpu_dumpconf so that configuring these
	two devices can be better controlled by the MI configuration
	code.

machdep.c:
	MD initialization code for the new callout interface.

trap.c:
	Add support for printing out whether cam interrupts are masked
	during a panic.
1997-09-21 21:38:05 +00:00
Peter Wemm
dfd5aef3a8 Implement the parts needed for VM86 under SMP. 1997-09-21 15:03:59 +00:00
John Dyson
a65247e12c Add support for more than 1 page of idle process stack on SMP systems. 1997-09-21 05:50:02 +00:00
John Dyson
804cd17e21 Re-institute a bugfix in allocation of anonymous buffer memory. 1997-09-21 04:49:30 +00:00
John Dyson
99448ed11d Change the M_NAMEI allocations to use the zone allocator. This change
plus the previous changes to use the zone allocator decrease the useage
of malloc by half.  The Zone allocator will be upgradeable to be able
to use per CPU-pools, and has more intelligent usage of SPLs.  Additionally,
it has reasonable stats gathering capabilities, while making most calls
inline.
1997-09-21 04:24:27 +00:00
Peter Wemm
1560a9d538 We were (I think) missing a vrele() on the vnode for the object loaded
via PT_INTERP (usually /usr/libexec/ld-elf.so.1).
1997-09-21 03:13:21 +00:00
Bruce Evans
043a2f3b8b Fixed staticization. buckets[] was staticized but was still declared
extern in <sys/malloc.h> and it should not have been staticized for
the !(KMEMSTATS || DIAGNOSTIC) case.

Fixed the !(KMEMSTATS || DIAGNOSTIC) case.  The MALLOC() and FREE()
macros are evil, but code generally doesn't allow for this and some code
involving else clauses did not compile.

Finished staticization.
1997-09-16 13:52:04 +00:00
Bruce Evans
514ede0953 Fixed gratuitous ANSIisms. 1997-09-16 11:44:05 +00:00
Bruce Evans
0e6021ad31 Reject attempts to set an in-core label which says that the "disk"
or a partition is larger than the slice.

Now `disklabel -Brw sdX auto' should fail properly on sliced disks
without partition of type 165, e.g., on zip disks with the factory
default formatting.  Previously it set a bogus in-core label for
the compatibility slice and used this to corrupt the MBR (the slice
has offset 0 and size 0, but setting the label in effect corrupted
its size to nonzero).

`disklabel -Brw sdX auto' already failed properly on normally (not
dangerously dedicated) sliced disks _with_ partition of type 165,
because the compatibility slice has a nonzero offset so the MBR
remained inaccessible when the size was corrupted.

This bug only affected in-core labels.  On-disk labels are checked
carefully when they read and written.
1997-09-16 10:11:49 +00:00
Poul-Henning Kamp
044839fb8b Don't leak memory, from sef.
Stylistic nits and a blunder, from bde.
1997-09-16 08:05:09 +00:00
Poul-Henning Kamp
7874d7a3bb Solve race-condition, return path in normal order.
A couple of stylistic nits from Bruce.

If your libc contains version 1.11 or 1.12 of getcwd.c, (ie: if
you recompiled libc one of the last couple of days):
>>> Recompile LIBC before you boot a new kernel <<<
A new libc will deal with both old and new kernels.
1997-09-15 19:11:07 +00:00
Poul-Henning Kamp
d56f6402d5 Deal more correctly with mountpoints. 1997-09-15 08:25:43 +00:00
Peter Wemm
921af254ca Regenerate _after_ the commit to syscalls.master 1997-09-15 02:03:45 +00:00
Poul-Henning Kamp
7822f1c624 Add a __getcwd() syscall. This is intentionally undocumented, but all
it does is to try to figure the pwd out from the vfs namecache, and
return a reversed string to it.  libc:getcwd() is responsible for
flipping it back.
1997-09-14 16:51:31 +00:00
Peter Wemm
35b8b2ddab Update select -> poll in drivers. 1997-09-14 03:19:42 +00:00
Peter Wemm
51338ea83c Various select -> poll changes 1997-09-14 02:52:18 +00:00
Peter Wemm
a2f9bc72c1 vn_select -> vn_poll 1997-09-14 02:51:16 +00:00
Peter Wemm
d87652c3de Zap nxselect and noselect. 1997-09-14 02:50:28 +00:00
Peter Wemm
7fab77996c Provide a 'return true' poll vnode op rather than duplicating the
'do nothing' case all over the various filesystems.
1997-09-14 02:49:06 +00:00
Peter Wemm
1514b90f2d Extend select hook to support poll 1997-09-14 02:46:44 +00:00
Peter Wemm
d080b0b0be Implement the poll backend for the pipe file type. 1997-09-14 02:43:25 +00:00
Peter Wemm
659ffb486a Convert select handler to poll style 1997-09-14 02:42:03 +00:00
Peter Wemm
6183953301 Extend to use poll backend. If memory serves correctly, most of this was
adapted from NetBSD..  However, there are some differences in the tty
system that are big enough to cause their code to not fit comfortably.

Obtained from:  NetBSD (I think)
1997-09-14 02:40:46 +00:00
Peter Wemm
6b8e64f55f Change VOP_SELECT to VOP_POLL 1997-09-14 02:35:25 +00:00
Peter Wemm
e25aa68e0c Extend select backend for sockets to work with a poll interface (more
detail is passed back and forwards).  This mostly came from NetBSD, except
that our interfaces have changed a lot and this funciton is in a different
part of the kernel.

Obtained from: NetBSD
1997-09-14 02:34:14 +00:00
Peter Wemm
42d1175732 Implement poll(2). This is mostly taken from the NetBSD implementation
(from some time ago) but with a few tweaks along the way.

Obtained from: NetBSD
1997-09-14 02:30:32 +00:00
Peter Wemm
818661c8f6 Regenerate (added poll etc) 1997-09-14 02:23:46 +00:00
Peter Wemm
8cb0553a7c Activate poll(2) syscall 1997-09-14 02:22:05 +00:00
Joerg Wunsch
245f17d43c Implement SA_NOCLDWAIT.
The implementation is done (unlike what i've originally been
contemplating) by reparenting kids of processes that have the
appropriate bit set to PID 1, and let PID 1 handle the zombie.  This
is far less problematical than what would seem to be ``doing it
right'', for a number of reasons.

Of our currently shipping PID-1-intended programs, 50 % fail the above
assumption. ;-)  (Read this: sysinstall doesn't do it right.  This is
no problem as long as no program called by sysinstall actually uses
SA_NOCLDWAIT.)

ToDo:		. clarify the correct SA_* flag inheritance, compared
		  to other systems,
		. decide whether the compat cruft (osigvec(9)) should
		  deal with new system additions or not,
		. merge OpenBSD's SA_SIGINFO implementation. ;)
Reviewed by:	bde
1997-09-13 19:42:29 +00:00
Peter Wemm
557fe2c5e1 print correct function name in a panic (vop_nolock -> vop_sharedlock) 1997-09-13 15:02:28 +00:00
Poul-Henning Kamp
bf1d104a34 3 lines of code and updates to a number of comments.
Reviewed by:	phk
Submitted by:	 Terry Lambert <tlambert@primenet.com>
1997-09-10 20:11:02 +00:00
Poul-Henning Kamp
c1f95f1378 The patch is needed in order to not throw away unmodified
local filesystem metadata at the first brelse call when the
block device vnode has v_tag set to VT_NFS.

Reviewed by:	phk
Submitted by:	Tor Egge <tegge@idi.ntnu.no>
1997-09-10 20:09:22 +00:00
Steve Passe
20233f27f4 General cleanup of the lock pushdown code. They are grouped and enabled
from machine/smptests.h:

#define PUSHDOWN_LEVEL_1
#define PUSHDOWN_LEVEL_2
#define PUSHDOWN_LEVEL_3
#define PUSHDOWN_LEVEL_4_NOT
1997-09-07 22:04:09 +00:00
Bruce Evans
2d85d0df17 Some staticized variables were still declared to be extern. 1997-09-07 16:56:34 +00:00
Bruce Evans
2931901b1d Removed trailing semicolons from the definitions of the sysctl
declaration macros so that a semicolon can be added when the macros
are invoked without giving a (pedantic) syntax error.  Invocations
need to be followed by a semicolon so that programs like indent and
gtags don't get confused.

Fixed the one invocation that wasn't followed by a trailing semicolon.
1997-09-07 16:53:52 +00:00
Bruce Evans
41fadeeb28 Removed yet more vestiges of config-time swap configuration and/or
cleaned up nearby cruft.
1997-09-07 16:21:11 +00:00
Bruce Evans
a910e75cdc Removed vestiges of config-time "argument processing" configuration. 1997-09-07 13:49:56 +00:00
Bruce Evans
bea0f0be7b Some staticized variables were still declared to be extern. 1997-09-07 05:27:26 +00:00
Peter Wemm
279a69322c Cosmetic adjustment for the trap/double fault/panic cpu id listing.
It now prints the apic id in hex rather than decimal.
1997-09-05 08:54:55 +00:00
Tor Egge
882e68c8a9 sonewconn no longer passes curproc to the protocol attach method
since that might cause in_pcballoc to call MALLOC with M_WAITOK during
a software interrupt.
Reviewed by:	Garrett Wollman <wollman@khavrinen.lcs.mit.edu>
1997-09-04 17:39:16 +00:00
Poul-Henning Kamp
4d1122bdd6 Revert to the previous hashing, double the hashtable size instead. 1997-09-04 08:24:44 +00:00
Poul-Henning Kamp
fd9d9ff13e Hmm, this is hopefully better. 1997-09-03 13:29:41 +00:00
Poul-Henning Kamp
119b6f4cf2 Use 2^N hash sizes rather than primesize, this replaces a division
with an and. (Submitted by davidg)

Preemptively record ".." values.

Reviewed by:	phk
1997-09-03 09:20:17 +00:00
Poul-Henning Kamp
7cb22688e9 Revert the v_usecount handling in relation to VOP_INACTIVE. 1997-09-03 09:18:48 +00:00
Bruce Evans
e4ba6a82b0 Removed unused #includes. 1997-09-02 20:06:59 +00:00
Bruce Evans
4d1d4912ae Added used #include - don't depend on <sys/mbuf.h> including
<sys/malloc.h> (unless we only use the bogusly shared M*WAIT flags).
1997-09-02 01:19:47 +00:00
Steve Passe
7245dff0f1 Cleanup. 1997-09-01 07:31:54 +00:00
Bruce Evans
e28b96049b Move closer to supporting VM86 under SMP.
LINT now compiles but doesn't link.  Other link-time breakage for LINT
is now visible (SMP is incompatible with SIMPLELOCK_DEBUG).
Submitted by:	jlemon
1997-09-01 01:54:52 +00:00
Bruce Evans
6d58e6cbc4 Fixed options SHOW_BUSYBUFS and PANIC_REBOOT_WAIT_TIME which were broken
by incomplete cutting and pasting from machdep.c to kern_shutdown.c.

PR:		3953
1997-08-31 23:08:38 +00:00
Poul-Henning Kamp
a051452ae2 Change the 0xdeadb hack to a flag called VDOOMED.
Introduce VFREE which indicates that vnode is on freelist.
Rename vholdrele() to vdrop().
Create vfree() and vbusy() to add/delete vnode from freelist.
Add vfree()/vbusy() to keep (v_holdcnt != 0 || v_usecount != 0)
  vnodes off the freelist.
Generalize vhold()/v_holdcnt to mean "do not recycle".
Fix reassignbuf()s lack of use of vhold().
Use vhold() instead of checking v_cache_src list.
Remove vtouch(), the vnodes are always vget'ed soon enough
  after for it to have any measuable effect.
Add sysctl debug.freevnodes to keep track of things.
Move cache_purge() up in getnewvnodes to avoid race.
Decrement v_usecount after VOP_INACTIVE(), put a vhold() on
  it during VOP_INACTIVE()
Unmacroize vhold()/vdrop()
Print out VDOOMED and VFREE flags (XXX: should use %b)

Reviewed by:		dyson
1997-08-31 07:32:39 +00:00
Steve Passe
2645264a72 Debug version of simple_lock. This will store the CPU id of the
holding CPU along with the lock.  When a CPU fails to get the lock
it compares its own id to the holder id.  If they are the same it
panic()s, as simple locks are binary, and this would cause a deadlock.

Controlled by smptests.h: SL_DEBUG, ON by default.

Some minor cleanup.
1997-08-31 03:17:48 +00:00
Steve Passe
78292efeef Another round of lock pushdown.
Add a simplelock to deal with disable_intr()/enable_intr() as used in UP kernel.
UP kernel expects that this is enough to guarantee exclusive access to
regions of code bracketed by these 2 functions.
Add a simplelock to bracket clock accesses in clock.c: clock_lock.

Help from:	Bruce Evans <bde@zeta.org.au>
1997-08-30 08:08:10 +00:00
KATO Takenori
662f9a6987 Move MACHINE_ARCH definition from <machine/param.h> to <machine/cpu.h>.
Submitted by:	Bruce Evans <bde@zeta.org.au>
1997-08-30 02:52:04 +00:00
KATO Takenori
664f85174a Added a sysctl arg, hw.machine_arch. The hw.machine_arch is "ibm-pc"
on IBM-PC box and is "pc-98" on NEC PC-98 box.  Userland program can
distinguish architecture on which the program runs.
1997-08-29 09:03:40 +00:00
Jonathan Lemon
5f07393373 Remove the vm86 support as an LKM, and link it directly into the kernel
if 'options "VM86"' is in the config file.  The LKM was really for
development, and has probably outlived its usefulness.
1997-08-28 14:36:56 +00:00
Peter Wemm
90bcb528a8 Correct some things I forgot about until it was too late with smp_active.
smp_active = 1 used to indicate that the system had frozen previously
started AP's, while smp_active = 0 was "AP's not yet started".  I have split
this into smp_started (which is set when the AP's come online), and
smp_active is left for turning on/off AP scheduling.
1997-08-26 18:36:15 +00:00
Peter Wemm
9a3b3e8bce Clean up the SMP AP bootstrap and eliminate the wretched idle procs.
- We now have enough per-cpu idle context, the real idle loop has been
revived (cpu's halt now with nothing to do).
- Some preliminary support for running some operations outside the
global lock (eg: zeroing "free but not yet zeroed pages") is present
but appears to cause problems.  Off by default.
- the smp_active sysctl now behaves differently. It's merely a 'true/false'
option.  Setting smp_active to zero causes the AP's to halt in the idle
loop and stop scheduling processes.
- bootstrap is a lot safer.  Instead of sharing a statically compiled in
stack a number of times (which has caused lots of problems) and then
abandoning it, we use the idle context to boot the AP's directly.  This
should help >2 cpu support since the bootlock stuff was in doubt.
- print physical apic id in traps.. helps identify private pages getting
out of sync.  (You don't want to know how much hair I tore out with this!)

More cleanup to follow, this is more of a checkpoint than a
'finished' thing.
1997-08-26 18:10:38 +00:00
Bruce Evans
d3114049c1 Restored rev.1.92 which was clobbered by the previous commit. 1997-08-26 11:59:20 +00:00
Poul-Henning Kamp
0fa2443f0e Uncut&paste cache_lookup().
This unifies several times in theory indentical 50 lines of code.

The filesystems have a new method: vop_cachedlookup, which is the
meat of the lookup, and use vfs_cache_lookup() for their vop_lookup
method.  vfs_cache_lookup() will check the namecache and pass on
to the vop_cachedlookup method in case of a miss.

It's still the task of the individual filesystems to populate the
namecache with cache_enter().

Filesystems that do not use the namecache will just provide the
vop_lookup method as usual.
1997-08-26 07:32:51 +00:00
John Dyson
a5db4bf475 Back out some incorrect changes that was worse than the original bug. 1997-08-26 04:36:27 +00:00
Bruce Evans
7d7fb492c5 Don't return EINVAL for negative timespecs in the nanosleep functions.
Negative timespecs are perfectly valid.  Just return 0 immediately
for them.  Also, return 0 immediately for zero timespecs.

Fixed some style bugs.
1997-08-26 00:40:04 +00:00
Bruce Evans
8a2d9f5076 Finished staticizing. 1997-08-26 00:31:04 +00:00
Bruce Evans
9a629c9302 Fixed some formatting and style bugs.
Fixed a gratuitous ANSIism.
1997-08-26 00:24:25 +00:00
Bruce Evans
1d9655ae4d Print more info in the "calcru: negative time" message. 1997-08-26 00:20:11 +00:00
Bruce Evans
eb776aea19 Fixed some gratuitous ANSIisms. 1997-08-26 00:15:04 +00:00
Bruce Evans
32545fd108 Removed some stale comments.
Fixed a gratuitous ANSIism.
1997-08-26 00:09:44 +00:00
Bruce Evans
282ec22c77 Removed redundant test against MAXDSIZ (the rlimit test is stronger). 1997-08-26 00:02:24 +00:00
Bruce Evans
2a2968a896 Removed a bogus comment. 1997-08-25 21:28:08 +00:00
Poul-Henning Kamp
c049f06469 Add a new vnode op (cachedlookup) so that filesystems can plug into
a global vfs_cache check.  The rest of this change will come when the
current zero size file problem is resolved.
1997-08-25 20:28:49 +00:00
Steve Passe
8ee0110a44 A clean fix for the spl "deadlock before smp_active" problem.
Added a new variable, 'bsp_apic_ready', which is set as soon as the bootstrap
CPU has initialized its local APIC.  Conditionalize the GENSPLR functions
to call ss_lock ONLY after bsp_apic_ready is TRUE;  This should prevent
any problems with races between the time the 1st AP becomes ready and the
time smp_active is set.
1997-08-24 20:33:32 +00:00
Peter Wemm
e384a9801e Print a warning if an unsupported (under SMP) shared address space fork
is attempted rather than just failing with an errno.
1997-08-22 15:10:00 +00:00
Poul-Henning Kamp
0e61ac7b5d typo in comment. 1997-08-22 07:16:46 +00:00
John Dyson
89721f6f1a This is a trial improvement for the vnode reference count while on the vnode
free list problem.  Also, the vnode age flag is no longer used by the
vnode pager.  (It is actually incorrect to use then.)  Constructive
feedback welcome -- just be kind.
1997-08-22 03:56:37 +00:00
Bruce Evans
b1037dcd53 #include <machine/limits.h> explicitly in the few places that it is required. 1997-08-21 20:33:42 +00:00
Steve Passe
fbca51f50a Added a half dozen casts to eliminate annoying warnings. 1997-08-21 06:39:41 +00:00
Philippe Charnier
40d5099441 Revert my previous commit about using CS_SECURE macro.
Requested by:	Bruce.
1997-08-21 06:33:04 +00:00
Steve Passe
4a73d99f7e Made PEND_INTS default.
Made NEW_STRATEGY default.
Removed misc. old cruft.

Centralized simple locks into mp_machdep.c
Centralized simple lock macros into param.h

More cleanup in the direction of making splxx()/cpl MP-safe.
1997-08-21 05:08:25 +00:00
John Dyson
745b842305 Some corrections to the anonymous page managment.
Submitted by:	Peter Chen <pmchen@eecs.umich.edu>
1997-08-21 01:35:37 +00:00
Steve Passe
7b185ef809 Preperation for moving cpl into critical region access.
Several new fine-grained locks.
New FAST_INTR() methods:
 - separate simplelock for FAST_INTR, no more giant lock.
 - FAST_INTR()s no longer checks ipending on way out of ISR.
sio made MP-safe (I hope).
1997-08-20 05:25:48 +00:00
Steve Passe
77625cfe0b Moved splq() to isa/ipl_funcs.c for SMP only.
This is in preperation for moving all cpl accesses behind a critical region lock.
1997-08-20 05:19:49 +00:00
Peter Wemm
1a5018a043 Implement XPG/SYSV-style getpgid()/getsid() syscalls. getpgid() uses the
same syscall number as NetBSD/OpenBSD.  The getpgid() came from NetBSD
(I think) originally, but it's basically cut/paste/edit from the other
simple get*() syscalls.
1997-08-19 06:00:27 +00:00
Peter Wemm
217cb20cdc Regenerate 1997-08-19 05:57:04 +00:00
Peter Wemm
6871cc6262 SVR4/XPG-style getpgid()/getsid() syscalls. 1997-08-19 05:53:48 +00:00
John Dyson
891e0f24c4 Allow lockmgr to work without a current process. Disallowing that
was a mistake in the lockmgr rewrite.
1997-08-19 00:27:07 +00:00
Philippe Charnier
15f3549108 Use CS_SECURE macro.
Reviewed by:	John Dyson
1997-08-18 06:58:59 +00:00
Steve Passe
7cbfd031b6 Added includes of smp.h for SMP.
This eliminates a bazillion warnings about implicit s_lock & friends.
1997-08-18 03:29:21 +00:00
John Dyson
03e9c6c101 Fix kern_lock so that it will work. Additionally, clean-up some of the
VM systems usage of the kernel lock (lockmgr) code.  This is a first
pass implementation, and is expected to evolve as needed.  The API
for the lock manager code has not changed, but the underlying implementation
has changed significantly.  This change should not materially affect
our current SMP or UP code without non-standard parameters being used.
1997-08-18 02:06:35 +00:00
Julian Elischer
ff36905c57 Take verbal beating by wollman into account and fix DIAGNOSTIC test.
This version.
1/ avoids garret's introduced  potential page fault. (I got one)
2/ removes compiler warnings

Also fix the tunable scheduling quantum to return a better error code when
fed a bad argument.
1997-08-18 01:34:38 +00:00
Garrett Wollman
fa5cde129b Delete a bit of debugging code that mistakenly crept in, and as a consequence
revert rev. 1.28's header file additions which are no longer needed.
1997-08-17 19:47:28 +00:00
Tor Egge
19c0663e5e Use KERNBASE, not 0xf0000000. 1997-08-17 17:40:11 +00:00
Garrett Wollman
57bf258e3d Fix all areas of the system (or at least all those in LINT) to avoid storing
socket addresses in mbufs.  (Socket buffers are the one exception.)  A number
of kernel APIs needed to get fixed in order to make this happen.  Also,
fix three protocol families which kept PCBs in mbufs to not malloc them
instead.  Delete some old compatibility cruft while we're at it, and add
some new routines in the in_cksum family.
1997-08-16 19:16:27 +00:00
Garrett Wollman
62f74f2f72 Dejulianize DIAGNOSTIC panic code. The types are wrong; probably there's
a missing dereference.
1997-08-16 19:07:20 +00:00
Steve Passe
3905c09afb The promised "better fix" for "Trap 9 When Boot SMP" problem.
We now tsleep() in kthread_init() between start_init()
and prepare_usermode() while waiting for ALL the idle_loop()
processes to come online.

Debugged & tested by:	"Thomas D. Dean" <tomdean@ix.netcom.com>

Reviewed by:	David Greenman <dg@root.com>
1997-08-15 02:33:30 +00:00
Andrey A. Chernov
cd9f713d45 setitimer: if it_value == 0 clear it_interval now
non-zero it_interval values have no sense if it_value == 0 but
checked by itimerfix which may cause EINVAL return
1997-08-14 08:15:12 +00:00
Steve Passe
22e5e9b058 Cheap TEMPORARY fix for "Trap 9 When Boot SMP" problem.
This is on the top of my list for a correct fix.

Submitted by:	"Thomas D. Dean" <tomdean@ix.netcom.com>
1997-08-13 23:05:33 +00:00
Julian Elischer
3d0a7bc3b8 add a diagnostic to catch some common cases of tsleep being
called from the wrong place.
1997-08-13 19:29:33 +00:00
Andrey A. Chernov
76aab1da0f Bypass itimerfix 100000000 limit in nanosleep1 using loop through timeouts 1997-08-13 17:55:11 +00:00
John Dyson
0b6e0f74f9 Back out a part of the disk scheduling "improvements" :-(. Let me know
how the system works now!!!
1997-08-12 19:07:42 +00:00
Steve Passe
cb02d4da35 Cheap fix for kern/4255.
If the problem is seen this fix suggests a compile-time work-around then panics.
1997-08-10 19:32:38 +00:00
Steve Passe
7acc960834 Some fixes towards making "default configs" work again.
Still not fixed, no idea why.

Debug help from: "Thomas D. Dean" <tomdean@ix.netcom.com>
1997-08-09 23:01:03 +00:00
John Dyson
c0ecffb96b Modify the scheduling policy to take into account disk I/O waits
as chargeable CPU usage.  This should mitigate the problem of processes
doing disk I/O hogging the CPU.  Various users have reported the
problem, and test code shows that the problem should now be gone.
1997-08-09 10:13:32 +00:00
Julian Elischer
63fe995cb4 Teach both disk drivers how to cope with a hardware watchdog
while dumping core.. I'm tired of getting 1/2 of a core-dump

conditional on -DHW_WDOG for now
this will migrate to 2.2 as that's where I need it.
1997-08-09 01:44:25 +00:00
Julian Elischer
5230cfd2f4 Use up 4 precious bytes to give the kernel a hook to
support hardware watchdogs. The actual functions would be supplied in an LKM
or a linked file, but they need to hang off something.
1997-08-09 01:25:54 +00:00
John Dyson
48a09cf276 VM86 kernel support.
Work done by BSDI, Jonathan Lemon <jlemon@americantv.com>,
	Mike Smith <msmith@gsoft.com.au>, Sean Eric Fagan <sef@kithrup.com>,
	and probably alot of others.
Submitted by:	Jnathan Lemon <jlemon@americantv.com>
1997-08-09 00:04:06 +00:00
Julian Elischer
90405c80e1 Make the scheduler quantum a tunable parameter
Reviewd by: John Dyson  dyson@freebsd.org
1997-08-08 22:48:57 +00:00
Julian Elischer
a39a7bceee Make a function static to quieten gcc 1997-08-08 20:29:47 +00:00
Julian Elischer
e142af9aba Clean up the console muting functionality.
this has been in production now for a long time with no known effects.
1997-08-08 20:09:50 +00:00
Steve Passe
6b556c4b4c Fixes kern/3835: SMP kernel crash on enable "dumps on wd0"
- SMP: set value of curproc in main(), before the SYSINIT stuff runs.

Reviewed by:	Bruce Evans <bde@zeta.org.au>
1997-08-07 21:22:29 +00:00
John Dyson
0d65e566b9 Another attempt at cleaning up the new memory allocator. 1997-08-05 22:24:31 +00:00
John Dyson
e258d33a51 Fix up come cruft that I left on a previous commit. 1997-08-05 00:05:00 +00:00
John Dyson
3075778b63 Get rid of the ad-hoc memory allocator for vm_map_entries, in lieu of
a simple, clean zone type allocator.  This new allocator will also be
used for machine dependent pmap PV entries.
1997-08-05 00:02:08 +00:00
Steve Passe
248fcb669b pushed down "volatility" of simplelock to actual int inside the struct.
Submitted by:	 bde@zeta.org.au, smp@csn.net
1997-08-04 19:11:26 +00:00
John Dyson
23be6be853 Fix a problem with the vfs vnode caching that it doesn't grow quickly
enough and can cause some strange performance problems.  Specifically, at
or near startup time is when the problem is worst.  To reproduce
the problem, run "lat_syscall stat" from the alpha lmbench code right
after bootup.  A positive side effect of this mod is that the name
cache can be set to grow again by sysctl.  A noticable positive
performance impact is realized due to a larger namecache being available
as needed (or tuned.)
1997-08-04 07:43:28 +00:00
Poul-Henning Kamp
2401f27c25 remove unused MAXVNODEUSE macro. 1997-08-04 07:31:36 +00:00
David Greenman
a78e8d2a83 Fixed security hole with sharing the file descriptor table (via rfork)
when execing a setuid/setgid binary. Code submitted by Sean Eric Fagan
(sef@freebsd.org).
Also consolidated the setuid/setgid checks into one place.
Reviewed by:	dyson,sef
1997-08-04 05:39:24 +00:00
Bruce Evans
3fc9295da7 Fixed syscall arg checking in clock_settime(). Stack garbage was
checked to be >= 0.  This bug was introduced in rev.1.26.

Reported by:	John Hay <jhay@mikom.csir.co.za>
1997-08-03 07:26:50 +00:00
Bruce Evans
1fd0b0588f Removed unused #includes. 1997-08-02 14:33:27 +00:00
Steve Passe
da9f018228 Converted the TEST_LOPRIO code to default.
Created mplock functions that save/restore NO registers.
Minor cleanup.
1997-07-31 05:43:05 +00:00
John-Mark Gurney
5e2022633a fix a few problems with pty. warn about how if you only have 1 pty
defined, your really getting 32.  Also warn about how you can't have
more than 256 pty's when your using DEVFS (non DEVFS can use more, just
the makedev script doesn't know how to make >256).  it also doesn't
allocate more memory than needed in this case.

Make sure that the signal passed in TIOCSIG isn't 0 as it might cause
a panic.  I personally haven't seen this happen, but after a similar
bug in syscons crashed my machine, I'm acutely aware of this one. :)
1997-07-30 10:05:18 +00:00
Steve Passe
412f3e4d71 Modified the PEND_INTS algorithm to fix the ISA INT loss problem.
Noticed by:	dave adkins <adkin003@gold.tc.umn.edu> and others.
1997-07-28 03:59:54 +00:00
Steve Passe
f9e8dbb8c3 mpapic.c & mp_machdep:
- removed TEST_ALTTIMER.
 - removed APIC_PIN0_TIMER.
 - removed TIMER_ALL.

mplock.s:
 - minor update of try_mplock for new algorithm where a CPU uses try_mplock
	instead of get_mplock in the ISRs.
1997-07-26 01:55:19 +00:00
Steve Passe
812e4da7a8 New simple_lock code in asm:
- s_lock_init()
 - s_lock()
 - s_lock_try()
 - s_unlock()

Created lock for IO APIC and apic_imen  (SMP version of imen)
 - imen_lock

Code to use imen_lock for access from apic_ipl.s and apic_vector.s.
Moved this code *outside* of mp_lock.

It seems to work!!!
1997-07-23 20:47:19 +00:00
Steve Passe
f2aeb7eaac Cleaned up the FPU init. 1997-07-22 16:49:54 +00:00
Steve Passe
b9f415331e SMP code initializes the FPU of APs.
Suggested by:     Bruce Evans <bde@FreeBSD.ORG>
1997-07-21 17:03:22 +00:00
Steve Passe
35b3c4a0e5 Developed a new strategy for handling the 8254/8259/APIC issue. 1997-07-20 19:41:38 +00:00
Steve Passe
3577278519 Minor cleanup.
Pass string arg to apic_dump.
Moved bootverbose printing of SMP enabled INTs from clock.c to autoconf.c
1997-07-20 18:05:20 +00:00
Bruce Evans
e31521c3dd Removed unused #includes. 1997-07-20 08:37:24 +00:00
Bill Fenner
548af2789b Remove sonewconn() macro kludge, introduced in 4.3-Reno to catch argument
mismatches.  Prototypes do a much better job these days.

Noticed by:	bde
1997-07-19 20:15:43 +00:00
Steve Passe
1dec61e7c0 Added code to support #define APIC_PIN0_TIMER.
This code ALWAYS runs the 8254 timer thru the 8259 ICU.
It depricates the usage of "options SMP_TIMER_NC" in the config file.
1997-07-19 04:00:35 +00:00
Steve Passe
d2ecb616f2 Split TEST_CPUSTOP code into CPUSTOP_ON_DDBBREAK and mainline code. 1997-07-18 21:27:53 +00:00
Steve Passe
75c179003e printf cleanup. 1997-07-18 03:58:14 +00:00
John Dyson
78342719d6 Hopefully fix a few problems that could cause hangs in SMP mode.
1)	Make sure that the region mapped by a 4MB page is
	properly aligned.
2)	Don't turn on the PG_G flag in locore for SMP.  I plan
	to do that later in startup anyway.
3)	Make sure the 2nd processor has PSE enabled, so that 4MB
	pages don't hose it.

We don't use PG_G yet on SMP -- there is work to be done to make that
work correctly.  It isn't that important anyway...
1997-07-17 19:45:01 +00:00
Doug Rabson
f6b4c28555 Merge WebNFS support from NetBSD
Obtained from:	NetBSD
1997-07-17 07:17:33 +00:00
John Dyson
5aaef07c50 Clean up some lint associated with the AIO code. 1997-07-17 04:49:43 +00:00
Steve Passe
6c8a949030 Minor cleanup. 1997-07-15 02:46:37 +00:00
Bruce Evans
7e88aafca7 Use the correct size for a sector in the search for a label in
readdisklabel().  Sectors may be larger than DEV_BSIZE.
1997-07-13 15:53:20 +00:00
Steve Passe
7503ccc1c8 new code to control other CPUs: stop_cpus()/restart_cpus()/_Xstopcpu
this code is controlled by smptests.h: TEST_CPUSTOP, OFF by default

new code for handling mixed-mode 8259/APIC programming without 'ExtInt'
this code is controlled by smptests.h: TEST_ALTTIMER, ON by default
1997-07-13 01:22:48 +00:00
Steve Passe
409ba536dc Cleanup old stop_cpus/restart_cpus() cruft.
Leave TEST_TEST1 for now.
1997-07-13 01:07:57 +00:00
David Nugent
f4e39ee7af Adds sysctl int for shutdown timeout.
Reviewed by:	Poul-Henning Kamp <phk@dk.tfs.com>
1997-07-10 11:44:42 +00:00
Andrey A. Chernov
c6d372f6f3 Back out changes for 'conflicts' with IRQ, remove intr_registered() 1997-07-09 18:06:25 +00:00
Steve Passe
17ebd4d085 General cleanup of APIC code.
stop_cpus()/restart_cpus() STILL not working!
1997-07-08 23:46:00 +00:00
Steve Passe
a8988a70ce Reordered call to apic_initialize and setting invltlb_ok. 1997-07-08 23:25:40 +00:00