Commit Graph

2827 Commits

Author SHA1 Message Date
green
cff0ff5321 Fix a bug that could crash the system if you press ^T while a slower
system is slowed down and in the right spot (a race condition in fork()).

The "previous time" fields have moved from pstat to proc.  Anything which
uses KVM needs to be recompiled with a new libkvm/headers.

A couple wacky u_quad_t's in struct proc are now u_int64_t (the same, but
according to lack of 'quad's in proc.h and usage in kern_resource.c).
This will have no effect on code.

This has been make-world-and-installed-new-kernel-which-works-fine-tested.

Reviewed by:	bde (previous version)
2000-01-28 20:40:29 +00:00
archie
6dd36cc401 Back out previous commit; it was premature. 2000-01-28 17:11:07 +00:00
bde
e79d76144b Fixed a memory leak for slices with an (unsupported) bad sector table.
Broken in: rev.1.80.
2000-01-28 11:51:08 +00:00
bde
dff980d98b Don't permit generation of non-physical disk addresses.
subr_diskmbr.c:
Don't "helpfully" enlarge our idea of the disk size to cover all the
primary slices.  Instead, truncate or discard slices that don't seem
to be on the disk.  The enlargement was a hack for disks that don't
report their size (e.g., MFM disks).  It is just wrong in general.

wd.c:
In CHS mode, limit the disk size so that cylinder numbers >= 65536
cannot occur.  This normally only affects disks larger than 33.8GB.
CHS mode accesses to addresses above the limit are now properly broken
(an error is returned instead of garbage for reads and disk corruption
for writes).

PR:		15611
Reviewed by:	readers of freebsd-bugs did not respond to a request
            	for review
2000-01-28 10:22:07 +00:00
dg
fab6f30ed1 Fixed sign and overflow bugs that caused the allocation size of the kernel
malloc region (kmem_map) to be wrong and semi-random on systems with more
than 1GB of RAM. This is not a complete fix, but is sufficient for
machines with 4GB or less of memory. A complete fix will require some
changes to the getenv stuff so that 64bit values can be passed around.

NOT FIXED: machines with more than 4GB of RAM (e.g. some large Alphas)
since we're still using ints to hold some of the values.

Reviewed by:	bde
2000-01-28 04:04:58 +00:00
archie
219d3e4583 When an attempt to install a line discipline fails, check for
known KLD's that might support it, and load the KLD if found.
Currently the list includes SLIPDISC, PPPDISC, and NETGRAPHDISC.
2000-01-28 02:22:22 +00:00
bde
88cf0ffda8 Quick fix for stack overflow when there are more than about 25 slices.
Using recursion to traverse the recursive data structure for extended
partitions was never good, but when slice support was implemented in
1995, the recursion worked for the default maximum number of slices
(32), and standard fdisk utilities didn't support creating more than
the default number.  Even then, corrupt extended partitions could
cause endless recursion, because we attempt to check all slices, even
ones which we don't turn into devices.

The recursion has succumbed to creeping features.  The stack requirements
for each level had grown to 204 bytes on i386's.  Most of the growth was
caused by adding a 64-byte copy of the DOSpartition table to each frame.
The kernel stack size has shrunk to about 5K on i386's.  Most of the
shrinkage was caused by the growth of `struct sigacts' by 2388 bytes
to support 128 signals.

Linux fdisk (a 1997 version at least) can now create 60 slices (4 standard
ones, 56 for logical drives within extended partitions, and it seems to
be leaving room to map the 4 BSD partitions on my test drive), and Linux
(2.2.29 and 2.3.35 at least) now reports all these slices at boot time.

The fix limits the recursion to 16 levels (4 + 16 slices) and recovers
32 bytes per level caused by gcc pessimizing for space.  Switching to
a static buffer doesn't cause any problems due to recursion, since the
buffer is not passed down.  Using a static buffer is wrong in general
because it requires the giant lock to protect it.  However, this problem
is small compared with using a static buffer for dsname().  We sometimes
neglect to copy the result of dsname() before sleeping.

Also fixed slice names when we find more than MAX_SLICES (32) slices.
The number of the last slice found was not passed passed recursively.
The limit on the recursion now prevents finding more than 32 slices
with a standard extended partition data structure anyway.
2000-01-27 05:11:29 +00:00
mckusick
b5a8876127 Add soft updates to the set of things being tagged. Syntax cleanup. 2000-01-27 01:22:06 +00:00
bde
9fec7300da Improved English in the messages printed by diskerr().
Fixed some formatting bugs.
2000-01-26 10:28:23 +00:00
bde
59f42795e8 Don't follow null pointers if we somehow have a null devswitch entry
despite having a non-null cn_tab entry.  This case now works the same
as if there is no physical console, except i/o at the kernel printf
level may still work.  This frees drivers of physical console drivers
from the responsibility of attaching the device no matter what.
2000-01-25 09:20:08 +00:00
bde
5ac10ba7c8 Fixed some style bugs (mainly ones associated with the bogus name
condev_t for a non-typedef).
2000-01-24 11:48:11 +00:00
bp
0f32b2a255 Backout previous commit. It was a mistake. 2000-01-23 15:47:46 +00:00
bp
530d15757a Replace non obvious number with SPECNAMELEN constant.
Reviewed by:	phk
2000-01-23 14:58:53 +00:00
phk
56d1c048ef Add a couple of strategic sysctls for monitoring.
In the rather obscure case of hardpps(), use a type-II PLL if the external
signal is phase locked, but a FLL if it isn't.
2000-01-23 14:52:37 +00:00
imp
72c8ff7d8a Fix the style bugs in the style bugs fix. The style bug fix made the
new function inconsistant with the rest of this file.  The spelling
and grammer fixes were good and remain.
2000-01-21 06:57:52 +00:00
green
c6da76a1a6 Fix style bugs in the last commit. 2000-01-21 02:52:54 +00:00
imp
f6db7985c4 bdeize last commit:
o Remove opt_dontuse.h and ifdef PROCFS

Subitted by: bde, peter
2000-01-20 17:03:53 +00:00
jasone
cec957051d Back out the previous spl change, since it opens a race window.
Reviewed by:	alfred, dillon, peter
2000-01-20 08:15:13 +00:00
imp
4e884c480a When we are execing a setugid program, and we have a procfs filesystem
file open in one of the special file descriptors (0, 1, or 2), close
it before completing the exec.

Submitted by: nergal@idea.avet.com.pl
Constructive comments: deraadt@openbsd.org, sef, peter, jkh
2000-01-20 07:12:52 +00:00
jasone
ff778f6b27 Don't tsleep() while at splbio().
Correctly return EINPROGRESS from aio_error() even when an aio request
is still in the socket queue.

Submitted by:	Adrian Chadd <adrian@bofh.co.uk>
2000-01-20 01:59:58 +00:00
rwatson
f2d8638a5c Fix bde'isms in acl/extattr syscall interface, renaming syscalls to
prettier (?) names, adding some const's around here, et al.

Reviewed by:	bde
2000-01-19 06:07:34 +00:00
rwatson
3a39a81644 Fix bde'isms in acl/extattr syscall interface, renaming syscalls to
prettier (?) names, adding some const's around here, et al.

Commit 2 out of 3.

Reviewed by:	bde
2000-01-19 06:02:31 +00:00
rwatson
e6adc4e6db Fix bde'isms in acl/extattr syscall interface, renaming syscalls to
prettier (?) names, adding some const's around here, et al.

Commit 1 out of 3.

Reviewed by:	bde
2000-01-19 06:01:07 +00:00
mckusick
41c200930c Need to reset the buffer pointer to avoid reconsidering the same buffer
again (without this the rollback analysis was being lost). Should reduce
the write count for most workloads.

Submitted by:	Craig A Soules <soules+@andrew.cmu.edu>
2000-01-18 02:13:26 +00:00
green
24ae07bb54 Fix vn_isdisk() usage to make AIO work on non-disk-files again, rather
than just return ENOTBLK.

PR:	16163
Submitted by:	Adrian Chadd <adrian@FreeBSD.org>
2000-01-17 21:18:39 +00:00
peter
75fd4c5f10 Implement setres[ug]id() and getres[ug]id(). This has been sitting in
my tree for ages (~2 years) waiting for an excuse to commit it.  Now Linux
has implemented it and it seems that Staroffice (when using the
linux_base6.1 port's libc) calls this in the linux emulator and dies in
setup.  The Linux emulator can call these now.
2000-01-16 16:34:26 +00:00
phk
6daeac3303 Cleanup some more remaining bdev fluff. 2000-01-16 09:25:34 +00:00
jasone
241bd93929 Add aio_waitcomplete(). Make aio work correctly for socket descriptors.
Make gratuitous style(9) fixes (me, not the submitter) to make the aio
code more readable.

PR:		kern/12053
Submitted by:	Chris Sedore <cmsedore@maxwell.syr.edu>
2000-01-14 02:53:29 +00:00
mdodd
c945c09e3c Allow SMP systems with an MCA bus to work properly.
Reviewed by:	peter
2000-01-13 09:09:02 +00:00
luoqi
858958c167 Seconds to ticks conversion was done at the wrong place. 2000-01-12 17:26:42 +00:00
yokota
715966bf8a Add a new mechanism, cndbctl(), to tell the console driver that
ddb is entered.  Don't refer to `in_Debugger' to see if we
are in the debugger.  (The variable used to be static in Debugger()
and wasn't updated if ddb is entered via traps and panic anyway.)

- Don't refer to `in_Debugger'.
- Add `db_active' to i386/i386/db_interface.d (as in
  alpha/alpha/db_interface.c).
- Remove cnpollc() stub from ddb/db_input.c.
- Add the dbctl function to syscons, pcvt, and sio. (The function for
  pcvt and sio is noop at the moment.)

Jointly developed by: bde and me

(The final version was tweaked by me and not reviewed by bde.  Thus,
if there is any error in this commit, that is entirely of mine, not
his.)

Some changes were obtained from: NetBSD
2000-01-11 14:54:01 +00:00
phk
8eb5bbb861 Also handle zero return from dscheck().
PR:		15956
2000-01-10 12:21:39 +00:00
phk
ae0c1ec8f7 Give vn_isdisk() a second argument where it can return a suitable errno.
Suggested by:	bde
2000-01-10 12:04:27 +00:00
imp
92d6fa4fe7 Panic if proc0 hasn't been created and we try to call kthread_create.
This prevents a more mysterious crash later.

XXX The long term solution is defer creation of these things until
XXX proc0 lives
2000-01-10 08:00:58 +00:00
sef
31b9ca1819 Handle the case where we truss an SUGID program -- in particular, we need
to wake up any processes waiting via PIOCWAIT on process exit, and truss
needs to be more aware that a process may actually disappear while it's
waiting.

Reviewed by:	Paul Saab <ps@yahoo-inc.com>
2000-01-10 04:09:05 +00:00
mckusick
d4409da210 Several performance improvements for soft updates have been added:
1) Fastpath deletions. When a file is being deleted, check to see if it
   was so recently created that its inode has not yet been written to
   disk. If so, the delete can proceed to immediately free the inode.
2) Background writes: No file or block allocations can be done while the
   bitmap is being written to disk. To avoid these stalls, the bitmap is
   copied to another buffer which is written thus leaving the original
   available for futher allocations.
3) Link count tracking. Constantly track the difference in i_effnlink and
   i_nlink so that inodes that have had no change other than i_effnlink
   need not be written.
4) Identify buffers with rollback dependencies so that the buffer flushing
   daemon can choose to skip over them.
2000-01-10 00:24:24 +00:00
mckusick
2f9951ffbd Add bwillwrite to all system calls that create things in the filesystem.
Benchmarks that create huge trees of empty files overwhelm the buffer cache.
2000-01-10 00:08:53 +00:00
mckusick
a44e140976 Remove the P_BUFEXHAUST flag from the syncer process (leaving
it only on the buf_daemon process). The problem is that when the
syncer process starts running the worklist, it wants to delete
lots of files. It does this by VFS_VGET'ing the vnodes, clearing
the blocks in them and bdwrite'ing the buffer. It can process close
to a thousand files per second which generates a large number of
dirty buffers. So, giving it special priviledge at the buffer trough
leads to trouble as the buf_daemon does occationally need a free
buffer to proceed and if the syncer has used every last one up,
we are toast.
2000-01-10 00:07:24 +00:00
eivind
767bad2cc1 Change NDFREE() from a macro to a function for the time being; the macro
version caused intolerable bloat (30k).  I'm likely to revisit this with an
attempt at a smarter macro.

Bloat noticed by:       bde
2000-01-08 16:20:06 +00:00
luoqi
35b4c17c79 Allow SMP && NCPU == 1 to work. From now on, there's no restriction on the
value of NCPU relative to the number of cpus physically present, the actual
number of cpus utilized will be the smaller of the two.
2000-01-07 08:49:25 +00:00
luoqi
e100d44d55 Introduce a mechanism to suspend/resume system processes. Suspend syncer
and bufdaemon prior to disk sync during system shutdown.
2000-01-07 08:36:44 +00:00
peter
a1b69f2dc4 Export the nselcoll counter via the kern.nselcoll sysctl so we can see
just how bad it gets in various situations.

Reminded by:  adrian
2000-01-05 19:40:17 +00:00
dillon
c6689c797d Enhance reassignbuf(). When a buffer cannot be time-optimally inserted
into vnode dirtyblkhd we append it to the list instead of prepend it to
    the list in order to maintain a 'forward' locality of reference, which
    is arguably better then 'reverse'.  The original algorithm did things this
    way to but at a huge time cost.

    Enhance the append interlock for NFS writes to handle intr/soft mounts
    better.

    Fix the hysteresis for NFS async daemon I/O requests to reduce the
    number of unnecessary context switches.

    Modify handling of NFS mount options.  Any given user option that is
    too high now defaults to the kernel maximum for that option rather then
    the kernel default for that option.

Reviewed by:	 Alfred Perlstein <bright@wintelcom.net>
2000-01-05 05:11:37 +00:00
tegge
8bfa846d93 ISA device drivers use the ISA source interrupt number in locations where
the low level interrupt handler number should be used.  Change
setup_apic_irq_mapping() to allocate low level interrupt handler X (Xintr${X})
for any ISA interrupt X mentioned in the MP table.

Remove an assumption in the driver for the system clock (clock.c) that
interrupts mentioned in the MP table as delivered to IOAPIC #0 intpin Y
is handled by low level interrupt handler Y (Xintr${Y}) but don't assume
that low level interrupt handler 0 (Xintr0) is used.

Don't allocate two low level interrupt handlers for the system clock.
Reviewed by:	NOKUBI Hirotaka <hnokubi@yyy.or.jp>
2000-01-04 22:24:59 +00:00
phk
be15f1db17 Be more careful about NOUDEV and NODEV.
Submitted by:	bde
2000-01-04 12:51:50 +00:00
phk
325c58929a Create a separate pps_offset variable to use for applying the
hardpps() produced offset component.  This is tested and behaved
stable with frequency offsets from -338.05 to +499.91 PPM.

Interestingly the machine I tested this on would fail if the clock
were slower than 14.3132 MHz whereas it was perfectly happy to run
at 16.384 MHz, in other words [-340PPM ... +14.4%]

Make pps_shift tweakable with sysctl.
2000-01-04 12:04:39 +00:00
phk
1275074122 truss /usr/bin/su
login (or not if root)
	then exit the shell

truss will get stuct in tsleep

I dont know if this is correct, but it fixes the problem and
according to the commends in pioctl.h, PF_ISUGID is set when we
want to ignore UID changes.

The code is checking for when PF_ISUGID is not set and since it
never is set, we always ignore UID changes.

Submitted by:	Paul Saab <ps@yahoo-inc.com>
2000-01-03 14:26:47 +00:00
phk
c5dd6ef219 Don't use time_offset as a leaky bucket variable in hardpps(), this
resulted in vastly optimistic offset values reported to userland
(typically a factor 40+ too small).  Apart from that, the code had
two sign-bugs.

Apply the hardpps() phase with the right sign with a simply
scaling by integration interval.  (This may be too stiff at
long integration intervals, see below).

Allow pps_shiftmax to be reduced again.

Before this, the phase lock in hardpps() were broken, but due to
two bugs mostly cancelling out, it would end up basically working
with a large stochastic component.  Now it behaves as one would
expect: smooth and quiet.

It seems that pps_shiftmax above 7..9 somewhere makes the phaselock
too weak to hold onto random walk phase errors from a HP-105 OCXO,
which basically means that it is too weak for real-life use with
such integration times.  This is yet to be resolved.

Submitted to:	Prof. Dave "NTP" Mills.
Tested by:	Terje Mathisen <Terje.Mathisen@hda.hydro.com>
1999-12-29 14:39:24 +00:00
peter
7e0cb159e7 Remove vnode_if.sh - it's a perl script. This stayed around for a while
because bsd.kmod.mk is usually out of sync with kernel source.  However
bsd.kmod.mk has to be updated now because of the _KERNEL change so there
is no need to keep this (pre-repo copy) version around.
1999-12-29 05:37:14 +00:00
peter
d53e4c1d80 Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL"
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot).  This is consistant with the other
BSD's who made this change quite some time ago.  More commits to come.
1999-12-29 05:07:58 +00:00