Commit Graph

2635 Commits

Author SHA1 Message Date
Poul-Henning Kamp
4d4f932326 s/v_specinfo/v_rdev/ 1999-08-13 10:10:12 +00:00
Alfred Perlstein
f4af31cb1c Replace a redundant vfs_object_create() call (already done in vn_open)
with a KASSERT.

Reviewed by: Eivind, Alan Cox
1999-08-12 20:38:32 +00:00
Peter Wemm
e426af039f Make subr_bus.c actually compile with -DBUS_DEBUG 1999-08-11 22:55:39 +00:00
Nik Clayton
2395507999 Add CPT_NOA, LIBCOMPAT, NODEF, NOARGS, NOPROTO, and NOIMPL to the commented
list of available types.

PR:             docs/13007
Submitted by:   Assar Westerlund <assar@sics.se>
1999-08-11 22:13:46 +00:00
Peter Wemm
3af0907ba4 Zap some stray references to DRIVER_TYPE_foo in the BUS_DEBUG case, as
discovered by Bill Paul.
1999-08-11 22:05:17 +00:00
Warner Losh
fdf4e8b30c Stop profiling on exec.
Obtained from: NetBSD
1999-08-11 20:35:38 +00:00
Alfred Perlstein
59d5fe5a90 When doing a dump, if ENODEV is returned explain what happened to the user,
"the device doesn't support a dump routine"

Only print "dump succeeded" when 0 is returned, instead of when an unexpected
error number is returned, print that error number.

Reviewed by: Eivind
1999-08-11 14:02:20 +00:00
Poul-Henning Kamp
f1fe3bf115 make alpha compile again. 1999-08-09 11:02:45 +00:00
Poul-Henning Kamp
ce9edcf5b5 Merge the cons.c and cons.h to the best of my ability. alpha may or
may not compile, I can't test it.
1999-08-09 10:35:05 +00:00
Poul-Henning Kamp
7517504c24 Enable ttymalloc(). 1999-08-08 20:24:58 +00:00
Poul-Henning Kamp
08add33166 Add new sysctl "kern.ttys" which return all the struct tty's which have
been registered with ttyregister().

register ptys with ttyregister().
1999-08-08 19:47:32 +00:00
Poul-Henning Kamp
ef40c56108 Make the pty driver as close to a cloning device as we can get for now,
we create the pty on the fly when it is first opened.

If you run out of ptys now, just MAKEDEV some more.

This also demonstrate the use of dev_t->si_tty_tty and dev_t->si_drv1
in a device driver.
1999-08-08 19:28:59 +00:00
Poul-Henning Kamp
0ef1c82630 Decommision miscfs/specfs/specdev.h. Most of it goes into <sys/conf.h>,
a few lines into <sys/vnode.h>.

Add a few fields to struct specinfo, paving the way for the fun part.
1999-08-08 18:43:05 +00:00
Greg Lehey
32c0c324d5 cdevsw_remove: place correct value in bmaj2cmaj. This had caused
warnings of the following nature on reloading a kld:

  WARNING: "vinum" is usurping "console"'s bmaj

This only applies to cases where "console" is mentioned.

Broken-by:	  grog
1999-08-08 00:34:00 +00:00
Brian Feldman
301ca4ffe6 Make long longs ("%ll" format) work.
Reviewed by:	msmith
1999-08-07 20:13:32 +00:00
Jordan K. Hubbard
909bbf3c49 Re-commit these files after updating syscalls.master (in the proper order
this time).

Pointed out by:		bde
1999-08-05 08:26:27 +00:00
Jordan K. Hubbard
45f26d4120 Move syscall 180 back to where it was before and fix the
incorrect comment which led me to move it in the first place.
1999-08-05 08:18:45 +00:00
Jordan K. Hubbard
b24eb2795d Reserve a syscall for the arla folks. I'm assuming that since syscalls.c
and init_sysent.c are checked into CVS, I should also commit the regenerated
copies even though they're built by syscalls.master.  Correct?  Bruce? :)
1999-08-04 20:04:25 +00:00
Brian Feldman
e32c66c539 Fix fd race conditions (during shared fd table usage.) Badfileops is
now used in f_ops in place of NULL, and modifications to the files
are more carefully ordered. f_ops should also be set to &badfileops
upon "close" of a file.

This does not fix other problems mentioned in this PR than the first
one.

PR:		11629
Reviewed by:	peter
1999-08-04 18:53:50 +00:00
Warner Losh
711103c1cc o Typo in prior version kept it from compiling (blush).
Noticed by: Nobody!

o Add comment about why we restrict chflags to root for devices.
o nit noticed by bde wrt return values.
1999-08-04 04:52:18 +00:00
Warner Losh
e82ef978fe brucify:
o use suser_xxx rather than suser to support JAIL code.
	o KNF comment convention
	o use vp->type rather than vaddr.type and eliminate call to
	  VOP_GETATTR.  Bruce says that vp->type is valid at this
	  point.

Submitted by: bde.

Not fixed:
	o return (value)
	o Comment needs to be longer and more explicit.  It will be after
	  the advisory.
1999-08-03 17:07:04 +00:00
Warner Losh
f76f09c129 Only allow root to set file flags on devices. 1999-08-02 21:34:46 +00:00
Brian Feldman
ab533dd005 lutimes() bug: FOLLOW should be NOFOLLOW for this one.
Submitted by:	Dan Nelson <dnelson@emsphone.com>
1999-07-29 17:02:56 +00:00
Bruce Evans
992fd07673 Removed references to a nonexistent variable. This fixes building kernels
without -O.
1999-07-29 07:14:28 +00:00
Matthew N. Dodd
f4e3b1e7dd Fix a typo.
Back out a few lines that I haven't dealt with properly yet.

Snickered at by: Mike Smith
1999-07-29 01:51:49 +00:00
Matthew N. Dodd
15317dd875 Alter the behavior of sys/kern/subr_bus.c:device_print_child()
- device_print_child() either lets the BUS_PRINT_CHILD
	  method produce the entire device announcement message or
	  it prints "foo0: not found\n"

Alter sys/kern/subr_bus.c:bus_generic_print_child() to take on
the previous behavior of device_print_child() (printing the
"foo0: <FooDevice 1.1>" bit of the announce message.)

Provide bus_print_child_header() and bus_print_child_footer()
to actually print the output for bus_generic_print_child().
These functions should be used whenever possible (unless you can
just use bus_generic_print_child())

The BUS_PRINT_CHILD method now returns int instead of void.

Modify everything else that defines or uses a BUS_PRINT_CHILD
method to comply with the above changes.

	- Devices are 'on' a bus, not 'at' it.
	- If a custom BUS_PRINT_CHILD method does the same thing
	  as bus_generic_print_child(), use bus_generic_print_child()
	- Use device_get_nameunit() instead of both
	  device_get_name() and device_get_unit()
	- All BUS_PRINT_CHILD methods return the number of
	  characters output.

Reviewed by: dfr, peter
1999-07-29 01:03:04 +00:00
Alan Cox
6745299365 Add sysctl and support code to allow directories to be VMIO'd. The default
setting for the sysctl is OFF, which is the historical operation.

Submitted by:	dillon
1999-07-26 06:25:53 +00:00
Martin Cracauer
a7674320e9 On FPU exceptions, pass a useful error code (one of the FPE_...
macros) to the signal handler, for old-style BSD signal handlers as
the second (int) argument, for SA_SIGINFO signal handlers as
siginfo_t->si_code. This is source-compatible with Solaris, except
that we have no <siginfo.h> (which isn't even mentioned in POSIX
1003.1b).

An rather complete example program is at
  http://www3.cons.org/cracauer/freebsd-signal.c
This will be added to the regression tests in src/.

This commit also adds code to disable the (hardware) FPU from
userconfig, so that you can use a software FP emulator on a machine
that has hardware floating point. See LINT.
1999-07-25 13:16:09 +00:00
Bruce Evans
a1a10fdfc0 Oops, the previous commit only worked in the one case it was tested for. 1999-07-24 20:21:10 +00:00
Kazutaka YOKOTA
3d03248c70 - Correctly initialize cn_dev_t and cn_udev_t.
- Add D_TTY for alpha.

Reviewed by: bde, dfr
1999-07-24 09:41:06 +00:00
Doug Rabson
f1550d9d41 This makes the in kernel printf routines conform to the documented
behavior of their userland counterparts with respect to return values.

Submitted by: Matthew N. Dodd <winter@jurai.net>
1999-07-24 09:34:12 +00:00
Alan Cox
d4da2dbae6 Fix the following problem:
When creating new processes (or performing exec), the new page
directory is initialized too early.  The kernel might grow before
p_vmspace is initialized for the new process.  Since pmap_growkernel
doesn't yet know about the new page directory, it isn't updated, and
subsequent use causes a failure.

The fix is (1) to clear p_vmspace early, to stop pmap_growkernel
from stomping on memory, and (2) to defer part of the initialization
of new page directories until p_vmspace is initialized.

PR:		kern/12378
Submitted by:	tegge
Reviewed by:	dfr
1999-07-21 18:02:27 +00:00
Brian Feldman
57d86fc695 Fix a REALLY embarrassing mistake. Don't look; I warned you. 1999-07-20 21:51:12 +00:00
Brian Feldman
fb30b5bdaf Make a dev2budev() function, and use it. This refixes pstat (working, broken,
working, broken, working) and savecore (working, working, broken, working,
working).

Sorta Reviewed by:	phk
1999-07-20 21:29:13 +00:00
Brian Feldman
240a86a432 dev2udev() returns a CDEV udev_t, but we use block io in savecore. Savecore
also gets the device by st_rdev, which is alright except for the fact that
the sysctl kern.dumpdev passed out a char device. This is a workaround.
Sorry for not committing the fix earlier, before people started having
problems.
1999-07-20 20:55:50 +00:00
Poul-Henning Kamp
698bfad7f2 Now a dev_t is a pointer to struct specinfo which is shared by all specdev
vnodes referencing this device.

Details:
        cdevsw->d_parms has been removed, the specinfo is available
        now (== dev_t) and the driver should modify it directly
        when applicable, and the only driver doing so, does so:
        vn.c.  I am not sure the logic in checking for "<" was right
        before, and it looks even less so now.

        An intial pool of 50 struct specinfo are depleted during
        early boot, after that malloc had better work.  It is
        likely that fewer than 50 would do.

        Hashing is done from udev_t to dev_t with a prime number
        remainder hash, experiments show no better hash available
        for decent cost (MD5 is only marginally better)  The prime
        number used should not be close to a power of two, we use
        83 for now.

        Add new checkalias2() to get around the loss of info from
        dev2udev() in bdevvp();

        The aliased vnodes are hung on a list straight of the dev_t,
        and speclisth[SPECSZ] is unused.  The sharing of struct
        specinfo means that the v_specnext moves into the vnode
        which grows by 4 bytes.

        Don't use a VBLK dev_t which doesn't make sense in MFS, now
        we hang a dummy cdevsw on B/Cmaj 253 so that things look sane.

	Storage overhead from all of this is O(50k).

        Bump __FreeBSD_version to 400009

The next step will add the stuff needed so device-drivers can start to
hang things from struct specinfo
1999-07-20 09:47:55 +00:00
Poul-Henning Kamp
d7bf417de7 add debug.sizeof.specinfo 1999-07-20 07:19:32 +00:00
Mike Smith
91fe3dc1e1 Implement an all-CPU shootdown-style rendezvous facility. This allows
the caller to specify a function to be guarded between an entry and exit
barrier, as well as pre- and post-barrier functions.

The primary use for this function is synchronised update of per-cpu private
data.  The implementation is almost (but not quite) MI; with a better
mechanism for masking per-CPU interrupts it could probably be hoisted.

Reviewed by:	peter (partially)
1999-07-20 06:52:35 +00:00
Poul-Henning Kamp
3de280c443 [click] Now all dev_t's in the kernel have their char device major.
Only know casualy of this is swapinfo/pstat which should be fixes
the right way:  Store the actual pathname in the kernel like mount
does.  [Volounteers sought for this task]

The road map from here is roughly:  expand struct specinfo into struct
based dev_t.  Add dev_t registration facilities for device drivers and
start to use them.
1999-07-19 09:37:59 +00:00
Poul-Henning Kamp
6f13bfc261 Add sysctl tree debug.sizeof to tell us how big things are. First two
entries are struct proc and struct vnode.
1999-07-19 09:13:12 +00:00
Bruce Evans
6b6ef746e5 Added a sysctl "kern.timecounter.hardware" for selecting the hardware
used for timecounting.  The possible values are the names of the
physically present harware timecounters ("i8254" and "TSC" on i386's).

Fixed some nearby bitrot in comments in <sys/time.h>.

Reviewed by:	phk
1999-07-18 15:07:20 +00:00
Poul-Henning Kamp
6ca5486476 Introduce the vn_todev(struct vnode*) function, which returns the dev_t
corresponding to a VBLK or VCHR node, or NODEV.
1999-07-18 14:30:37 +00:00
Peter Wemm
80e907a1df Reset SA_NOCLDWAIT on exec().
PR:		kern/12669
Submitted by:	Doug Ambrisko <ambrisko@whistle.com>
1999-07-18 13:40:11 +00:00
John Polstra
7ac9503b86 Remove four no-op casts. 1999-07-18 01:35:26 +00:00
Poul-Henning Kamp
f06a54f0a0 Centralize dumpdev handling. 1999-07-17 20:47:52 +00:00
Poul-Henning Kamp
68f7448fd7 Reverse the sense of a test, dev2udev() will be much cheaper than
udev2dev().
1999-07-17 20:29:10 +00:00
Poul-Henning Kamp
d21c632c4b Use 256 as magic in bmaj2cmaj[]. Treat BLK/CHR dev_t more correctly. 1999-07-17 19:57:25 +00:00
Poul-Henning Kamp
c7119ea7dd Fix 2nd arg to udev2dev(). 1999-07-17 19:38:00 +00:00
Poul-Henning Kamp
f008cfcc1a I have not one single time remembered the name of this function correctly
so obviously I gave it the wrong name.  s/umakedev/makeudev/g
1999-07-17 18:43:50 +00:00
Peter Wemm
341c61590c Oops, missed out one chunk of the last patch. (*blush*)
Submitted by:	Kazutaka YOKOTA <yokota@zodiac.mech.utsunomiya-u.ac.jp>
Submitted by:	"Matthew N. Dodd" <winter@jurai.net>
1999-07-14 17:37:53 +00:00
Kris Kennaway
e7647e6c20 Correct a couple of spelling errors in comments. 1999-07-12 15:02:51 +00:00
Doug Rabson
ca7036d8cb Add a hook for a bus to detect child devices which didn't find drivers.
This allows the bus to print an informative message about unknown devices.

Submitted by: Matthew N. Dodd <winter@jurai.net>
1999-07-11 13:42:37 +00:00
Peter Wemm
8294196430 Fixes for a couple of problems in last commit:
1. Printing large quads in small bases overflowed the buffer if
   sizeof(u_quad_t) > sizeof(u_long).
2. The sharpflag checks had operator precedence bugs due to excessive
   parentheses in all the wrong places.
3. The explicit 0L was bogus in the quad_t comparison and useless in
   the long comparision.
4. There was some more bitrot in the comment about ksprintn().  Our
   ksprintn() handles bases up to 36 as well as down to 2.

Bruce has other complaints about using %q in kernel and would rather
we went towards using the C9X style %ll and/or %j.  (I agree for that
matter, as long as gcc/egcs know how to deal with that.)

Submitted by:	bde
1999-07-10 15:27:05 +00:00
Poul-Henning Kamp
23d762834b Fix a dev_t/udev_t issue with accounting. lastcomm now shows the
right tty again.

Submitted by:	"D. Rock" <rock@dead-end.net>
Reviewed by:	phk
1999-07-10 06:27:36 +00:00
Peter Wemm
bdbc8c265e Fix the previous warning a different way since the emul_path exposure was
intentional.  Avoid the warning by propagating the const filename through
to elf_load_file() instead.
1999-07-09 19:10:14 +00:00
Peter Wemm
c6bb4a64b8 Minor tweak - don't cause a warning.
I don't know if it was intentional or not, but it would have printed out:
  /compat/linux/foo/bar.so: interpreter not found
If it was, then I've broken it.  De-constifying the 'interp' variable
or carrying the constness through to elf_load_file() are alternatives.
1999-07-09 18:05:03 +00:00
Peter Wemm
7d921a016d Implement the %q prefix for the integer types. Note that egcs on the
Alpha believes that %q is for long long, whereas our quad_t and int64_t
is only just a plain long.  long long on the alpha is the same size (64
bit) as a long.  It was requested, but I have not implemented yet, support
for C9X style %lld - it should be pretty easy though.
1999-07-09 17:54:39 +00:00
Peter Wemm
29a751bf4e bufhashinit() is called with a caddr_t and is expected to return the
same in both the alpha and i386 ports.
1999-07-09 16:41:19 +00:00
Jonathan Lemon
ab001a72be Implement support for hardware debug registers on the i386.
Submitted by:	Brian Dean <brdean@unx.sas.com>
1999-07-09 04:16:00 +00:00
Kirk McKusick
025037833c Condition in KASSERT was reversed. 1999-07-08 17:58:55 +00:00
Kirk McKusick
ad8ac923fa These changes appear to give us benefits with both small (32MB) and
large (1G) memory machine configurations.  I was able to run 'dbench 32'
on a 32MB system without bring the machine to a grinding halt.

    * buffer cache hash table now dynamically allocated.  This will
      have no effect on memory consumption for smaller systems and
      will help scale the buffer cache for larger systems.

    * minor enhancement to pmap_clearbit().  I noticed that
      all the calls to it used constant arguments.  Making
      it an inline allows the constants to propogate to
      deeper inlines and should produce better code.

    * removal of inherent vfs_ioopt support through the emplacement
      of appropriate #ifdef's, with John's permission.  If we do not
      find a use for it by the end of the year we will remove it entirely.

    * removal of getnewbufloops* counters & sysctl's - no longer
      necessary for debugging, getnewbuf() is now optimal.

    * buffer hash table functions removed from sys/buf.h and localized
      to vfs_bio.c

    * VFS_BIO_NEED_DIRTYFLUSH flag and support code added
      ( bwillwrite() ), allowing processes to block when too many dirty
      buffers are present in the system.

    * removal of a softdep test in bdwrite() that is no longer necessary
      now that bdwrite() no longer attempts to flush dirty buffers.

    * slight optimization added to bqrelse() - there is no reason
      to test for available buffer space on B_DELWRI buffers.

    * addition of reverse-scanning code to vfs_bio_awrite().
      vfs_bio_awrite() will attempt to locate clusterable areas
      in both the forward and reverse direction relative to the
      offset of the buffer passed to it.  This will probably not
      make much of a difference now, but I believe we will start
      to rely on it heavily in the future if we decide to shift
      some of the burden of the clustering closer to the actual
      I/O initiation.

    * Removal of the newbufcnt and lastnewbuf counters that Kirk
      added.  They do not fix any race conditions that haven't already
      been fixed by the gbincore() test done after the only call
      to getnewbuf().  getnewbuf() is a static, so there is no chance
      of it being misused by other modules.  ( Unless Kirk can think
      of a specific thing that this code fixes.  I went through it
      very carefully and didn't see anything ).

    * removal of VOP_ISLOCKED() check in flushbufqueues().  I do not
      think this check is necessary, the buffer should flush properly
      whether the vnode is locked or not. ( yes? ).

    * removal of extra arguments passed to getnewbuf() that are not
      necessary.

    * missed cluster_wbuild() that had to be a cluster_wbuild_wb() in
      vfs_cluster.c

    * vn_write() now calls bwillwrite() *PRIOR* to locking the vnode,
      which should greatly aid flushing operations in heavy load
      situations - both the pageout and update daemons will be able
      to operate more efficiently.

    * removal of b_usecount.  We may add it back in later but for now
      it is useless.  Prior implementations of the buffer cache never
      had enough buffers for it to be useful, and current implementations
      which make more buffers available might not benefit relative to
      the amount of sophistication required to implement a b_usecount.
      Straight LRU should work just as well, especially when most things
      are VMIO backed.  I expect that (even though John will not like
      this assumption) directories will become VMIO backed some point soon.

Submitted by:	Matthew Dillon <dillon@backplane.com>
Reviewed by:	Kirk McKusick <mckusick@mckusick.com>
1999-07-08 06:06:00 +00:00
Martin Cracauer
aff66c5455 Implement SA_SIGINFO for i386. Thanks to Bruce Evans for much more
than a review, this was a nice puzzle.

This is supposed to be binary and source compatible with older
applications that access the old FreeBSD-style three arguments to a
signal handler.

Except those applications that access hidden signal handler arguments
bejond the documented third one. If you have applications that do,
please let me know so that we take the opportunity to provide the
functionality they need in a documented manner.

Also except application that use 'struct sigframe' directly. You need
to recompile gdb and doscmd. `make world` is recommended.

Example program that demonstrates how SA_SIGINFO and old-style FreeBSD
handlers (with their three args) may be used in the same process is at
http://www3.cons.org/tmp/fbsd-siginfo.c

Programs that use the old FreeBSD-style three arguments are easy to
change to SA_SIGINFO (although they don't need to, since the old style
will still work):

  Old args to signal handler:
    void handler_sn(int sig, int code, struct sigcontext *scp)

  New args:
    void handler_si(int sig, siginfo_t *si, void *third)
  where:
    old:code == new:second->si_code
    old:scp == &(new:si->si_scp)     /* Passed by value! */

The latter is also pointed to by new:third, but accessing via
si->si_scp is preferred because it is type-save.

FreeBSD implementation notes:
- This is just the framework to make the interface POSIX compatible.
  For now, no additional functionality is provided. This is supposed
  to happen now, starting with floating point values.
- We don't use 'sigcontext_t.si_value' for now (POSIX meant it for
  realtime-related values).
- Documentation will be updated when new functionality is added and
  the exact arguments passed are determined. The comments in
  sys/signal.h are meant to be useful.

Reviewed by:	BDE
1999-07-06 07:13:48 +00:00
Marcel Moolenaar
7a583b02b6 Also try to load the interpreter without prepending "emul_path". This allows
dynamicly linked binaries to run in a chroot'd environment with "emul_path"
as the new root. The new behavior of loading interpreters is identical to the
principle of overlaying.

PR: 10145
1999-07-05 18:38:29 +00:00
Mike Smith
134c934ce7 Move the initialisation/tuning of nmbclusters from param.c/machdep.c
into uipc_mbuf.c.  This reduces three sets of identical tunable code to
one set, and puts the initialisation with the mbuf code proper.

Make NMBUFs tunable as well.

Move the nmbclusters sysctl here as well.

Move the initialisation of maxsockets from param.c to uipc_socket2.c,
next to its corresponding sysctl.

Use the new tunable macros for the kern.vm.kmem.size tunable (this should have
been in a separate commit, whoops).
1999-07-05 08:52:54 +00:00
Poul-Henning Kamp
03016f421b Remove cmaj and bmaj args from DEV_DRIVER_MODULE. 1999-07-04 14:58:56 +00:00
Bruce Evans
1168ab0815 Fixed corruption of the "blocked" list in lf_setlock() when tsleep()
returns 0 after ptrace() attach and/or detach doesn't quite quite
deliver a signal.  Perhaps the process shouldn't be woken in this
case, but avoiding the problem is easy.

PR:		12247

Fixed a couple of places where mechanical fixing of compiler warnings
caused misspelling of NOLOCKF as NULL.
1999-07-04 14:43:01 +00:00
Kirk McKusick
1c9ca5858f The vfs.write_behind sysctl and related code support has been added to
allow changes to the filesystem's write_behind behavior.  By the
default the filesystem aggressively issues write_behind's.  Three values
may be specified for vfs.write_behind.  0 disables write_behind, 1 results
in historical operation (agressive write_behind), and 2 is an experimental
backed-off write_behind.  The values of 0 and 1 are recommended.  The value
of 0 is recommended in conjuction with an increase in the number of
NBUF's and the number of dirty buffers allowed (vfs.{lo,hi}dirtybuffers).
Note that a value of 0 will radically increase the dirty buffer load on
the system.  Future work on write_behind behavior will use values 2 and
greater for testing purposes.

Submitted by:	Matthew Dillon <dillon@apollo.backplane.com>
Reviewed by:	Kirk McKusick <mckusick@mckusick.com>
1999-07-04 00:31:17 +00:00
Kirk McKusick
e929c00d23 The buffer queue mechanism has been reformulated. Instead of having
QUEUE_AGE, QUEUE_LRU, and QUEUE_EMPTY we instead have QUEUE_CLEAN,
QUEUE_DIRTY, QUEUE_EMPTY, and QUEUE_EMPTYKVA.  With this patch clean
and dirty buffers have been separated.  Empty buffers with KVM
assignments have been separated from truely empty buffers.  getnewbuf()
has been rewritten and now operates in a 100% optimal fashion.  That is,
it is able to find precisely the right kind of buffer it needs to
allocate a new buffer, defragment KVM, or to free-up an existing buffer
when the buffer cache is full (which is a steady-state situation for
the buffer cache).

Buffer flushing has been reorganized.  Previously buffers were flushed
in the context of whatever process hit the conditions forcing buffer
flushing to occur.  This resulted in processes blocking on conditions
unrelated to what they were doing.  This also resulted in inappropriate
VFS stacking chains due to multiple processes getting stuck trying to
flush dirty buffers or due to a single process getting into a situation
where it might attempt to flush buffers recursively - a situation that
was only partially fixed in prior commits.  We have added a new daemon
called the buf_daemon which is responsible for flushing dirty buffers
when the number of dirty buffers exceeds the vfs.hidirtybuffers limit.
This daemon attempts to dynamically adjust the rate at which dirty buffers
are flushed such that getnewbuf() calls (almost) never block.

The number of nbufs and amount of buffer space is now scaled past the
8MB limit that was previously imposed for systems with over 64MB of
memory, and the vfs.{lo,hi}dirtybuffers limits have been relaxed
somewhat.  The number of physical buffers has been increased with the
intention that we will manage physical I/O differently in the future.

reassignbuf previously attempted to keep the dirtyblkhd list sorted which
could result in non-deterministic operation under certain conditions,
such as when a large number of dirty buffers are being managed.  This
algorithm has been changed.  reassignbuf now keeps buffers locally sorted
if it can do so cheaply, and otherwise gives up and adds buffers to
the head of the dirtyblkhd list.  The new algorithm is deterministic but
not perfect.  The new algorithm greatly reduces problems that previously
occured when write_behind was turned off in the system.

The P_FLSINPROG proc->p_flag bit has been replaced by the more descriptive
P_BUFEXHAUST bit.  This bit allows processes working with filesystem
buffers to use available emergency reserves.  Normal processes do not set
this bit and are not allowed to dig into emergency reserves.  The purpose
of this bit is to avoid low-memory deadlocks.

A small race condition was fixed in getpbuf() in vm/vm_pager.c.

Submitted by:	Matthew Dillon <dillon@apollo.backplane.com>
Reviewed by:	Kirk McKusick <mckusick@mckusick.com>
1999-07-04 00:25:38 +00:00
Peter Wemm
1943af613f Stop rfork(0) from panicing. (oops!!)
Submitted by:	Peter Holm <peter@holm.cc>
1999-07-03 20:58:44 +00:00
Peter Wemm
ca224f89a9 Fix warnings in last commit (dev_t is not an int, and not even int
compatable in arg lists on the Alpha)
1999-07-03 17:40:31 +00:00
Poul-Henning Kamp
ad6cb55952 Be more informative and try to ask the user in some instances if we can't
figure out the root device.
1999-07-03 08:24:00 +00:00
Poul-Henning Kamp
c31558b215 Warn about drivers which take over other drivers cdevsw entries, but still
grant them squatters right.
1999-07-03 08:22:30 +00:00
Poul-Henning Kamp
8947a90a90 Make sure that stat(2) and friends always return a valid st_dev field.
Pseudo-FS need not fill in the va_fsid anymore, the syscall code
will use the first half of the fsid, which now looks like a udev_t
with major 255.
1999-07-02 16:29:47 +00:00
Peter Wemm
b9dffbec61 Fix a warning - the code is correct but gcc can't tell. 1999-07-01 22:54:55 +00:00
Peter Wemm
7a0dde6879 Moving the initialization for write sooner quiets a warning. 1999-07-01 22:52:40 +00:00
Peter Wemm
00858ccd88 Quiet warnings on an Alpha. CBSIZE has long type and causes the other
ints to promote to long.
1999-07-01 19:46:36 +00:00
Peter Wemm
9c8b8baa38 Slight reorganization of kernel thread/process creation. Instead of using
SYSINIT_KT() etc (which is a static, compile-time procedure), use a
NetBSD-style kthread_create() interface.  kproc_start is still available
as a SYSINIT() hook.  This allowed simplification of chunks of the
sysinit code in the process.  This kthread_create() is our old kproc_start
internals, with the SYSINIT_KT fork hooks grafted in and tweaked to work
the same as the NetBSD one.

One thing I'd like to do shortly is get rid of nfsiod as a user initiated
process.  It makes sense for the nfs client code to create them on the
fly as needed up to a user settable limit.  This means that nfsiod
doesn't need to be in /sbin and is always "available".  This is a fair bit
easier to do outside of the SYSINIT_KT() framework.
1999-07-01 13:21:46 +00:00
Peter Wemm
df8abd0bb9 Slight tweak to fork1() calling conventions. Add a third argument so
the caller can easily find the child proc struct.  fork(), rfork() etc
syscalls set p->p_retval[] themselves.  Simplify the SYSINIT_KT() code
and other kernel thread creators to not need to use pfind() to find the
child based on the pid.  While here, partly tidy up some of the fork1()
code for RF_SIGSHARE etc.
1999-06-30 15:33:41 +00:00
Peter Wemm
ddebd8794d Hopefully fix the remaining glitches with the BUF_*() changes. This should
(really this time) fix pageout to swap and a couple of clustering cases.

This simplifies BUF_KERNPROC() so that it unconditionally reassigns the
lock owner rather than testing B_ASYNC and having the caller decide when
to do the reassign.  At present this is required because some places use
B_CALL/b_iodone to free the buffers without B_ASYNC being set.  Also,
vfs_cluster.c explicitly calls BUF_KERNPROC() when attaching the buffers
rather than the parent walking the cluster_head tailq.

Reviewed by:	Kirk McKusick <mckusick@mckusick.com>
1999-06-29 05:59:47 +00:00
Peter Wemm
72283ee95d Fix a bug that was almost certainly making breadn() fail. BUF_KERNPROC()
was being called on the wrong bp - it should be called on the one that's
just about to be fed to VOP_STRATEGY().
1999-06-28 15:32:10 +00:00
Kirk McKusick
33638e9384 When requesting an exclusive lock with LK_NOWAIT, do not panic
if LK_RECURSIVE is not set, as we will simply return that the
lock is busy and not actually deadlock. This allows processes
to use polling locks against buffers that they may already
hold exclusively locked.
1999-06-28 07:54:58 +00:00
Peter Wemm
e96c1fdc3f Minor tweaks to make sure (new) prerequisites for <sys/buf.h> (mostly
splbio()/splx()) are #included in time.
1999-06-27 11:44:22 +00:00
Doug Rabson
c049aba8c3 Call the chained module handler before unregistering the syscall so that
errors can be detected.

Submitted by: "A.Yu.Isupov" <isupov@moonhe.jinr.ru>
PR:	      kern/12239
1999-06-27 09:38:44 +00:00
Peter Wemm
e6257a9a09 GC the remnants of the old pre-softupdates update daemon. It's been
#if 0'd for a fair while now.
1999-06-26 14:46:35 +00:00
Peter Wemm
fec1aafc01 I'm tired of having a 'hanging root device'.. This isn't a "fix", just
a workaround for a specific case where cam interrupts right in the middle
of this printf.
1999-06-26 14:44:24 +00:00
Peter Wemm
82e4c0061b Quieten some warnings as a result of changes in ls_items[] constness over
time.
1999-06-26 12:19:03 +00:00
Doug Rabson
6d4ce7aa8c * Call cdevsw_remove from the MOD_UNLOAD event.
* Fix a couple of warnings while I'm here.
1999-06-26 11:39:27 +00:00
Doug Rabson
9fd5198d2d Make sure that we record the flags in all cases.
Submitted by: Bernd Walter <ticso@cicely.de>
PR:	      kern/12399
1999-06-26 10:27:30 +00:00
Kirk McKusick
67812eacd7 Convert buffer locking from using the B_BUSY and B_WANTED flags to using
lockmgr locks. This commit should be functionally equivalent to the old
semantics. That is, all buffer locking is done with LK_EXCLUSIVE
requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will
be done in future commits.
1999-06-26 02:47:16 +00:00
Greg Lehey
9a9eb2b92b Add function cdevsw_remove, the opposite of cdevsw_add: remove an
entry in cdevsw (and bdevsw if appropriate).

Reviewed-by: phk
1999-06-25 07:49:01 +00:00
Mike Smith
d42c1ee5c3 Changes in the way that the APs are started appears to have removed the
problem with having more CPUs than NCPU.

PR:		kern/4255
Submitted by:	peter
1999-06-23 23:02:38 +00:00
Luoqi Chen
541e018708 Do not setup 4M pdir until all APs are up. 1999-06-23 21:47:24 +00:00
Mike Smith
b9ab2461b6 Remove an unnecessary panic when sparse PCI bus numbering is encountered.
This is found eg. on some Compaq  Proliant systems.

Submitted by:	peter
1999-06-22 20:54:25 +00:00
Kazutaka YOKOTA
6e8394b8ba The second phase of syscons reorganization.
- Split syscons source code into manageable chunks and reorganize
  some of complicated functions.

- Many static variables are moved to the softc structure.

- Added a new key function, PREV.  When this key is pressed, the vty
  immediately before the current vty will become foreground.  Analogue
  to PREV, which is usually assigned to the PrntScrn key.
  PR: kern/10113
  Submitted by: Christian Weisgerber <naddy@mips.rhein-neckar.de>

- Modified the kernel console input function sccngetc() so that it
  handles function keys properly.

- Reorganized the screen update routine.

- VT switching code is reorganized.  It now should be slightly more
  robust than before.

- Added the DEVICE_RESUME function so that syscons no longer hooks the
  APM resume event directly.

- New kernel configuration options: SC_NO_CUTPASTE, SC_NO_FONT_LOADING,
  SC_NO_HISTORY and SC_NO_SYSMOUSE.
  Various parts of syscons can be omitted so that the kernel size is
  reduced.

  SC_PIXEL_MODE
  Made the VESA 800x600 mode an option, rather than a standard part of
  syscons.

  SC_DISABLE_DDBKEY
  Disables the `debug' key combination.

  SC_ALT_MOUSE_IMAGE
  Inverse the character cell at the mouse cursor position in the text
  console, rather than drawing an arrow on the screen.
  Submitted by: Nick Hibma (n_hibma@FreeBSD.ORG)

  SC_DFLT_FONT
  makeoptions "SC_DFLT_FONT=_font_name_"
  Include the named font as the default font of syscons.  16-line,
  14-line and 8-line font data will be compiled in.  This option replaces
  the existing STD8X16FONT option, which loads 16-line font data only.

- The VGA driver is split into /sys/dev/fb/vga.c and /sys/isa/vga_isa.c.

- The video driver provides a set of ioctl commands to manipulate the
  frame buffer.

- New kernel configuration option: VGA_WIDTH90
  Enables 90 column modes: 90x25, 90x30, 90x43, 90x50, 90x60.  These
  modes are mot always supported by the video card.
  PR: i386/7510
  Submitted by: kbyanc@freedomnet.com and alexv@sui.gda.itesm.mx.

- The header file machine/console.h is reorganized; its contents is now
  split into sys/fbio.h, sys/kbio.h (a new file) and sys/consio.h
  (another new file).  machine/console.h is still maintained for
  compatibility reasons.

- Kernel console selection/installation routines are fixed and
  slightly rebumped so that it should now be possible to switch between
  the interanl kernel console (sc or vt) and a remote kernel console
  (sio) again, as it was in 2.x, 3.0 and 3.1.

- Screen savers and splash screen decoders
  Because of the header file reorganization described above, screen
  savers and splash screen decoders are slightly modified.  After this
  update, /sys/modules/syscons/saver.h is no longer necessary and is
  removed.
1999-06-22 14:14:06 +00:00
Kirk McKusick
45623f31bc When allocating new buffers in getnewbuf, there are several points
at which we may sleep. So, after completing our buffer allocation
we must ensure that another process has not come along and allocated
a different buffer with the same identity. We do this by keeping a
global counter of the number of buffers that getnewbuf has allocated.
We save this count when we enter getnewbuf and check it when we are
about to return. If it has changed, then other buffers were allocated
while we were in getnewbuf, so we must return NULL to let our parent
know that it must recheck to see if it still needs the new buffer.
Hopefully this fix will eliminate the creation of duplicate buffers
with the same identity and the obscure corruptions that they cause.
1999-06-22 01:39:53 +00:00
Tim Vanderhoek
8630d8cfc6 Correctly return ENOEXEC for really short zipped files. The way this is
done is less-than cute, but this whole file is suffering from some amount
of bitrot.  Execution of zipped files should probably be implemented in a
manner similar to that of #!/interpreted files.

PR:		kern/10780
1999-06-21 16:23:13 +00:00
Greg Lehey
53a03d1c97 dsopen: Print a message if the unit has an invalid sector size.
Reviewed-by:	ken, bde
1999-06-21 03:48:16 +00:00
Alan Cox
dc92aa57fd For consistency with other implementations, check for the existence
of the segment before checking its permissions.

PR:		kern/11999
Submitted by:	Brooks Davis <brooks@one-eyed-alien.net>
1999-06-19 23:53:13 +00:00
Bruce Evans
50045fbc7c Changed the global `idt' from an array to a pointer so that npx.c
automatically hacks on the active copy of the IDT if f00f_hack()
has changed it.  This also allows simplifications in setidt().
This fixes breakage of FP exception handling by rev.1.55 of
sys/kernel.h.  FP exceptions were sent to npx.c's probe handlers
because npx.c "restored" the old handlers to the wrong copy of the
IDT.  The SYSINIT for f00f_hack() was purposely run quite late to
avoid problems like this, but it is bogusly associated with the
SYSINIT for proc0 so it was moved with the latter.

Problem reported and fix tested by:  Martin Cracauer <cracauer@cons.org>
1999-06-18 14:32:21 +00:00
Brian Feldman
f29be02190 Reviewed by: the cast of thousands
This is the change to struct sockets that gets rid of so_uid and replaces
it with a much more useful struct pcred *so_cred. This is here to be able
to do socket-level credential checks (i.e. IPFW uid/gid support, to be added
to HEAD soon). Along with this comes an update to pidentd which greatly
simplifies the code necessary to get a uid from a socket. Soon to come:
a sysctl() interface to finding individual sockets' credentials.
1999-06-17 23:54:50 +00:00
Gary Palmer
0625ba2fc3 Add Id strings 1999-06-17 23:42:45 +00:00
Nick Hibma
6979450036 Update the comments on values than can be returned by DEVICE_PROBE.
DEVICE_PROBE can return priorities.

Reviewed by:	Doug Rabson <dfr@nlsystems.com>
1999-06-17 19:22:12 +00:00
Bruce Evans
212dfe6fc7 Fixed a missing userland dev_t to kernel dev_t conversion. 1999-06-17 07:07:55 +00:00
Julian Elischer
4e1b754078 Reformat comment to match indentation of code around it. 1999-06-17 01:25:25 +00:00
Kirk McKusick
f9c8cab591 Add a vnode argument to VOP_BWRITE to get rid of the last vnode
operator special case. Delete special case code from vnode_if.sh,
vnode_if.src, umap_vnops.c, and null_vnops.c.
1999-06-16 23:27:55 +00:00
Dmitrij Tejblum
71ddfdbbd5 Make sure syscall arguments properly aligned in ktrace records.
Make syscall return value a register_t.

Based on a patch from Hidetoshi Shimokawa.
Mostly reviewed by:	Hidetoshi Shimokawa and Bruce Evans.
1999-06-16 18:37:01 +00:00
David Greenman
cd3fe8d008 Changed trypbuf to a getpbuf to work around a problem where redundant writes
would occur when clustering them - caused by running out of buffers
and taking a degenerate code path as a result. It appears that waiting
instead for buffers to become available is okay.

Submitted by:	Matthew Dillon <dillon@apollo.backplane.com>
Discovered by: Craig A Soules <soules+@andrew.cmu.edu>
1999-06-16 15:54:30 +00:00
Tor Egge
01cf8ad024 If we still haven't got a sufficient number of free buffers after the
call to flushdirtybuffers() then sleep in waitfreebuffers().
PR:		11697
Reviewed by:	David Greenman, Matt Dillon
1999-06-16 03:19:04 +00:00
Kirk McKusick
e4ab40bcb6 Get rid of the global variable rushjob and replace it with a function in
kern/vfs_subr.c named speedup_syncer() which handles the speedup request.
Change the various clients of rushjob to use the new function.
1999-06-15 23:37:29 +00:00
Mike Smith
79fc0bf4a0 From the submitter:
- this causes POSIX locking to use the thread group leader
   (p->p_leader) as the locking thread for all advisory locks.
   In non-kernel-threaded code p->p_leader == p, so this will have
   no effect.

   This results in (more) correct POSIX threaded flock-ing semantics.

   It also prevents the leader from exiting before any of the children.
   (so that p->p_leader will never be stale) in exit1().

   We have been running this patch for over a month now in our lab
   under load and at customer sites.

Submitted by:	John Plevyak <jplevyak@inktomi.com>
1999-06-07 20:37:29 +00:00
Archie Cobbs
05292ba234 ksprintn() may be called with base=2, so redefine MAXNBUF accordingly.
Other brucification tweaks.

Obtained from:	bde@freebsd.org
1999-06-07 18:26:26 +00:00
Archie Cobbs
ad4f8dbd23 The function ksprintn(), which is used to convert numbers to ASCII, is not
reentrant because it returns a static buffer. This results in a race condition
when/if an interrupt handler calls log(), printf() etc. Fix this.
1999-06-06 02:41:55 +00:00
Alan Cox
3d41489171 Restructure pipe_read in order to eliminate several race conditions.
Submitted by:	Matthew Dillon <dillon@apollo.backplane.com> and myself
1999-06-05 03:53:57 +00:00
Peter Wemm
9c9906e912 Plug a mbuf leak in tcp_usr_send(). pru_send() routines are expected
to either enqueue or free their mbuf chains, but tcp_usr_send() was
dropping them on the floor if the tcpcb/inpcb has been torn down in the
middle of a send/write attempt.  This has been responsible for a wide
variety of mbuf leak patterns, ranging from slow gradual leakage to rather
rapid exhaustion.  This has been a problem since before 2.2 was branched
and appears to have been fixed in rev 1.16 and lost in 1.23/1.28.

Thanks to Jayanth Vijayaraghavan <jayanth@yahoo-inc.com> for checking
(extensively) into this on a live production 2.2.x system and that it
was the actual cause of the leak and looks like it fixes it.  The machine
in question was loosing (from memory) about 150 mbufs per hour under
load and a change similar to this stopped it.  (Don't blame Jayanth
for this patch though)

An alternative approach to this would be to recheck SS_CANTSENDMORE etc
inside the splnet() right before calling pru_send() after all the potential
sleeps, interrupts and delays have happened.  However, this would mean
exposing knowledge of the tcp stack's reset handling and removal of the
pcb to the generic code.  There are other things that call pru_send()
directly though.

Problem originally noted by:  John Plevyak <jplevyak@inktomi.com>
1999-06-04 02:27:06 +00:00
Dmitrij Tejblum
4ea5ad99d5 || vs && confusion in cdevsw_add(). 1999-06-01 20:41:26 +00:00
Poul-Henning Kamp
6fcd8a7c93 Introduce the makebdev() function, it does the same as the makedev()
function for now, but that will change.
1999-06-01 18:56:26 +00:00
Jonathan Lemon
eb9d435ae7 Unifdef VM86.
Reviewed by:	silence on on -current
1999-06-01 18:20:36 +00:00
Poul-Henning Kamp
2447bec829 Simplify cdevsw registration.
The cdevsw_add() function now finds the major number(s) in the
struct cdevsw passed to it.  cdevsw_add_generic() is no longer
needed, cdevsw_add() does the same thing.

cdevsw_add() will print an message if the d_maj field looks bogus.

Remove nblkdev and nchrdev variables.  Most places they were used
bogusly.  Instead check a dev_t for validity by seeing if devsw()
or bdevsw() returns NULL.

Move bdevsw() and devsw() functions to kern/kern_conf.c

Bump __FreeBSD_version to 400006

This commit removes:
        72 bogus makedev() calls
        26 bogus SYSINIT functions

if_xe.c bogusly accessed cdevsw[], author/maintainer please fix.

I4b and vinum not changed.  Patches emailed to authors.  LINT
probably broken until they catch up.
1999-05-31 11:29:30 +00:00
Poul-Henning Kamp
4e2f199e0c This commit should be a extensive NO-OP:
Reformat and initialize correctly all "struct cdevsw".

        Initialize the d_maj and d_bmaj fields.

        The d_reset field was not removed, although it is never used.

I used a program to do most of this, so all the files now use the
same consistent format.  Please keep it that way.

Vinum and i4b not modified, patches emailed to respective authors.
1999-05-30 16:53:49 +00:00
Doug Rabson
20b62c5ac9 * Add a function devclass_create() which looks up the named devclass and
creates it if it doesn't exist.
* Rename resource_list_remove() to resource_list_delete() for consistency.
1999-05-30 10:27:11 +00:00
Doug Rabson
bea6af4d31 * Change device_add_child_after() to device_add_child_ordered() which is
easier to use and more flexible.
* Change BUS_ADD_CHILD to take an order argument instead of a place.
* Define a partial ordering for isa devices so that sensitive devices are
  probed before non-sensitive ones.
1999-05-28 09:25:16 +00:00
Doug Rabson
8be70a6eac Fix an embarrasing typo in device_add_child_after(). I can't understand
how this hasn't caused problems before.

Submitted by: Kazutaka YOKOTA <yokota@zodiac.mech.utsunomiya-u.ac.jp>
1999-05-27 07:18:41 +00:00
John Birrell
350185153d Back out my previous change (phk didn't like it) in favour of setting
rootdev in the mfs initialisation code iff MFS_ROOT (which Bruce doesn't
like). Damned if I do - damned if I don't.
1999-05-24 00:37:26 +00:00
John Birrell
02013ff886 Remove the test for bdevsw(dev) == NULL from bdevvp() because it fails
if there is no character device associated with the block device. In this
case that doesn't matter because bdevvp() doesn't use the character
device structure.

I can use the pointy bit of the axe too.
1999-05-24 00:34:10 +00:00
John Birrell
d47066829a Make MFS_ROOT work again. MFS_ROOT means that rootdev is not set.
Broken by: phk
Problem ignored by: phk
1999-05-23 10:51:33 +00:00
Dmitrij Tejblum
9d3a442583 Don't call calcru() on a swapped-out process. calcru() access p_stats, which
is in U-area.
1999-05-22 20:10:31 +00:00
Doug Rabson
7e082b48a2 Add some helper functions to make it easier to write a driver for a bus
which needs to manage resources for its children.
1999-05-22 14:57:15 +00:00
Peter Wemm
13b758eb46 Add seatbelt like in previous function.. 1999-05-22 09:52:21 +00:00
Andrey A. Chernov
925fa5c3f5 Realy fix overflow on SO_*TIMEO
Submitted by: bde
1999-05-21 15:54:40 +00:00
Doug Rabson
0053cc2cfe Silently return NULL from devclass_get_device if dc == NULL. The caller
should be handling NULL returns already.

Submitted by: Andrew Gallatin <gallatin@cs.duke.edu>
1999-05-21 08:23:58 +00:00
Peter Wemm
1d23cba9b7 Oops, set module->file..
PR: 1179
Submitted-by: lha@stacken.kth.se
1999-05-20 00:00:58 +00:00
Luoqi Chen
15a33d7c1d TIOCEXT is also inapproriate before the slave is open, return EAGAIN when
these ioctls are attempted. Move a misplaced comment.

Pointed out by:	Bruce
1999-05-18 14:53:52 +00:00
Luoqi Chen
519566d2e3 Avoid negative numbers in dev_t manipulation. This should fix recent MFS
related crashes.
1999-05-18 13:14:43 +00:00
Poul-Henning Kamp
6a0ce00218 Use NOUDEV for udev_t's 1999-05-17 13:50:24 +00:00
Doug Rabson
f40ddd55b0 Change the definition of e_tdev in struct kinfo_proc from dev_t to udev_t
Reviewed by: Poul-Henning Kamp <phk@critter.freebsd.dk>
1999-05-17 13:28:35 +00:00
Alan Cox
e972780a11 Add the options MAP_PREFAULT and MAP_PREFAULT_PARTIAL to vm_map_find/insert,
eliminating the need for the pmap_object_init_pt calls in imgact_* and
mmap.

Reviewed by:	David Greenman <dg@root.com>
1999-05-17 00:53:56 +00:00
Eivind Eklund
1d8290f3c3 Add enough include files to make this actually compile on an a.out system. 1999-05-15 23:18:32 +00:00
Alan Cox
e5f13bdd09 Simplify vm_map_find/insert's interface: remove the MAP_COPY_NEEDED option.
It never makes sense to specify MAP_COPY_NEEDED without also specifying
MAP_COPY_ON_WRITE, and vice versa.  Thus, MAP_COPY_ON_WRITE suffices.

Reviewed by:	David Greenman <dg@root.com>
1999-05-14 23:09:34 +00:00
Luoqi Chen
7af0acae17 Ignore some ioctls on the master until the slave is open. 1999-05-14 20:44:20 +00:00
Luoqi Chen
0ce54cbb0c Legally acquire a major number for mfs. 1999-05-14 20:40:23 +00:00
Doug Rabson
6c2e3dde8c * Define a new static method DEVICE_IDENTIFY which is called to add device
instances to a parent bus.
* Define a new method BUS_ADD_CHILD which can be called from DEVICE_IDENTIFY
  to add new instances.
* Add a generic implementation of DEVICE_PROBE which calls DEVICE_IDENTIFY
  for each driver attached to the parent's devclass.
* Move the hint-based isa probe from the isa driver to a new isahint driver
  which can be shared between i386 and alpha.
1999-05-14 11:22:47 +00:00
Doug Rabson
454a0546ba Adjust method dispatch to ensure that default methods are called properly. 1999-05-14 09:13:43 +00:00
Kirk McKusick
eaea7a9e9f Previously directories were sync'ed every 10 seconds while bitmaps &
inodes were synced every 15 seconds. This is now reversed as during
directory create, we cannot commit the directory entry until its
inode has been written. With this switch, the inodes will be more
likely to be written by the time that the directory is written thus
reducing the number of directory rollbacks that are needed.
1999-05-14 01:29:21 +00:00
Bruce Evans
7d9509f96c Added ../sys/syscall.mk to targets. Back it up like all the other
targets.
1999-05-13 09:19:14 +00:00
Bruce Evans
853cbeeb35 Regenerated. 1999-05-13 09:12:57 +00:00
Bruce Evans
f664346fbe Fixed nonsense arg type `const caddr_t' in the prototype() for utrace().
Changed to `const void *'.  utrace() is undocumented, so nothing should
notice.

Fixed missing consts for utrace() and ktrace() in syscalls.master.

sys/ktrace.h is missing some Lite2 changes of shorts to ints.
1999-05-13 09:09:37 +00:00
Peter Wemm
ccb84588dd Try an fix a couple of dev_t/major/minor etc nits. 1999-05-12 22:30:50 +00:00
Luoqi Chen
0f0fe5a4c5 Unbreak VESA on SMP. 1999-05-12 21:39:07 +00:00
Peter Wemm
cc5881cff5 Fix (?) SPECHASH dev_t/major/minor/etc args 1999-05-12 19:06:40 +00:00
Poul-Henning Kamp
e519e78b42 braino. 1999-05-12 13:06:34 +00:00
Bruce Evans
ce45b512b3 Fixed corruption of the kmemstatistcs list. The first malloc()
with malloc type at the tail of the list changed the list from
linear to circular.  This seemed to cause surprisingly few problems,
but it now causes weird output from `vmstat -m', probably because
a more important malloc type is now at the tail of the list.

Fix it by abusing ks_limit instead of ks_next as a flag for being
on the list.  Don't forget to clear the flag when a malloc type is
uninit'ed.  Uninit'ing is still fundamentally broken -- it loses
history.
1999-05-12 11:11:27 +00:00
Poul-Henning Kamp
adfea48f2b Produce compiler warning if dev_t and udev_t is confused. 1999-05-12 11:06:56 +00:00
Poul-Henning Kamp
8bee45c44e Don't peek into dev_t 1999-05-12 11:06:07 +00:00
Poul-Henning Kamp
bfbb9ce670 Divorce "dev_t" from the "major|minor" bitmap, which is now called
udev_t in the kernel but still called dev_t in userland.

Provide functions to manipulate both types:
        major()         umajor()
        minor()         uminor()
        makedev()       umakedev()
        dev2udev()      udev2dev()

For now they're functions, they will become in-line functions
after one of the next two steps in this process.

Return major/minor/makedev to macro-hood for userland.

Register a name in cdevsw[] for the "filedescriptor" driver.

In the kernel the udev_t appears in places where we have the
major/minor number combination, (ie: a potential device: we
may not have the driver nor the device), like in inodes, vattr,
cdevsw registration and so on, whereas the dev_t appears where
we carry around a reference to a actual device.

In the future the cdevsw and the aliased-from vnode will be hung
directly from the dev_t, along with up to two softc pointers for
the device driver and a few houskeeping bits.  This will essentially
replace the current "alias" check code (same buck, bigger bang).

A little stunt has been provided to try to catch places where the
wrong type is being used (dev_t vs udev_t), if you see something
not working, #undef DEVT_FASCIST in kern/kern_conf.c and see if
it makes a difference.  If it does, please try to track it down
(many hands make light work) or at least try to reproduce it
as simply as possible, and describe how to do that.

Without DEVT_FASCIST I belive this patch is a no-op.

Stylistic/posixoid comments about the userland view of the <sys/*.h>
files welcome now, from userland they now contain the end result.

Next planned step: make all dev_t's refer to the same devsw[] which
means convert BLK's to CHR's at the perimeter of the vnodes and
other places where they enter the game (bootdev, mknod, sysctl).
1999-05-11 19:55:07 +00:00
Peter Wemm
ac48316f68 Send subr_rlist.c off to the big Attic in the sky. It's been #if 0'ed
for quite some time now and can be revived in a moment's notice if needed.
(It was replaced by subr_blist.c for VM/swap)
1999-05-11 14:29:59 +00:00
John Birrell
e40822840b Use colons instead of semi-colons to behave like UNIX instead of DOS.
Suggested by: bde
1999-05-11 10:08:10 +00:00
Peter Wemm
dc97381e37 Update one set of comments.. s/so_q0/so_incomp/ and s/so_q/so_comp/ (that's
incomplete and complete connections I think)
1999-05-10 18:15:40 +00:00
Poul-Henning Kamp
784fc12f7b Use NODEV instead of -1 1999-05-10 18:10:08 +00:00
Don Lewis
bd508d391b Fix descriptor leak provoked by KKIS.05051999.003b exploit code.
unp_internalize() takes a reference to the descriptor.  If the send
fails after unp_internalize(), the control mbuf would be freed ophaning
the reference.

Tested in -CURRENT by: Pierre Beyssac <beyssac@enst.fr>
1999-05-10 18:09:39 +00:00
Nick Hibma
b57947c947 Remove hack to accept French spelling of METHOD (METHODE) 1999-05-10 17:45:49 +00:00
Doug Rabson
8b2970bbe6 * Augment the interface language to allow arbitrary C code to be 'passed
through' to the C compiler.
* Allow the interface to specify a default implementation for methods.
* Allow 'static' methods which are not device specific.
* Add a simple scheme for probe routines to return a priority value. To
  make life simple, priority values are negative numbers (positive numbers
  are standard errno codes) with zero being the highest priority. The
  driver which returns the highest priority will be chosen for the device.
1999-05-10 17:06:14 +00:00
Doug Rabson
276794a4a4 Superceded by makedevops.pl 1999-05-10 16:45:19 +00:00
Peter Wemm
7067d561cc Lites2 seems to have pretty much disappeared from the radar, and I suspect
far more than this hack would be needed now..
1999-05-09 20:42:45 +00:00
Peter Wemm
18c3dd1745 s/main/mi_startup/ for the kernel entry point so that egcs doesn't get
upset about it (and generate things like __main() calls that are reserved
for main()).  Renaming was phk's suggestion, but I'd already thought about
it too.  (phk liked my suggested name tada() but I decided against it :-)

Reviewed by:	phk
1999-05-09 19:01:49 +00:00
Peter Wemm
e37622b251 Fix a couple of warnings and some bitrot in comments. 1999-05-09 16:04:14 +00:00
Poul-Henning Kamp
0a346dab99 major(something) can never become NODEV. 1999-05-09 13:13:52 +00:00
Poul-Henning Kamp
52400704e9 Unconfuse DEV_MODULE() and DEV_DRIVER_MODULE() about the difference between
a major number for a dev_t.
1999-05-09 13:00:50 +00:00
Doug Rabson
c586a73949 Hack the diskslice stuff so that it allows the alpha sysinstall to
manipulate the disklabel. This is almost certainly not the right way
to do it but I'm desperate.
1999-05-09 11:27:41 +00:00
Poul-Henning Kamp
8f0024a54a Peter beat me to half this patch, but didn't do the other half:
set d_bmaj

	don't cast a dev_t to int before comparing to NODEV
1999-05-09 08:18:12 +00:00
Peter Wemm
0da14f00bf Comment advising ordering of cdevsw_add and bdevsw_add is obsolete (no
bdevsw_add any more).
1999-05-09 08:10:17 +00:00
Dmitrij Tejblum
db72e05829 Fix a freelist trashing under following confitions:
- first program lock a region in a file,
- second program wait on the lock,
- first program extend the region,
- second program interrupted by a signal.
1999-05-08 22:46:46 +00:00
Doug Rabson
566643e39e Move the declaration of the interrupt type from the driver structure
to the BUS_SETUP_INTR call.
1999-05-08 21:59:43 +00:00
Peter Wemm
73d35bdd96 Change resource_set_*() to be more useful. BTW; resource_find() is a
bit odd, it looks like the wildcard stuff isn't right.
1999-05-08 18:08:59 +00:00
Peter Wemm
9c06a38614 Make sure the mem_range_AP_init() prototype is seen where it's needed, and
#ifdef SMP around it for fun.
1999-05-08 17:48:22 +00:00
Peter Wemm
4173e42044 Use KERNBASE for the load address of the kernel rather than magic constants
as it seems to work..  (at least on i386/elf).
1999-05-08 13:03:49 +00:00
Peter Wemm
b5b15c3ff0 First stages of a module dependency cleanup. This part fixes a
particularly annoying hack, namely having the linker bash the moduledata
to set the container pointer, preventing it being const.  In the process,
a stack of warnings were fixed and will probably allow a revisit of the
const C_SYSINIT() changes.  This explicitly registers modules in files or
preload areas with the module system first, and let them initialize via
SYSINIT/DECLARE_MODULE later in their SI_ORDER_xxx order.  The kludge of
finding the containing file is no longer needed since the registration
of modules onto the modules list is done in the context of initializing
the linker file.
1999-05-08 13:01:59 +00:00
Poul-Henning Kamp
1637aa4b1c Fix some of the places where too much inside knowledge about major/minor
layout and dev_t structure is being (ab)used.
1999-05-08 07:02:41 +00:00
Poul-Henning Kamp
4be2eb8c49 I got tired of seeing all the cdevsw[major(foo)] all over the place.
Made a new (inline) function devsw(dev_t dev) and substituted it.

Changed to the BDEV variant to this format as well: bdevsw(dev_t dev)

DEVFS will eventually benefit from this change too.
1999-05-08 06:40:31 +00:00
Dag-Erling Smørgrav
b83308b00b Nit fix. 1999-05-07 17:37:08 +00:00
Poul-Henning Kamp
46eede0058 Continue where Julian left off in July 1998:
Virtualize bdevsw[] from cdevsw.  bdevsw() is now an (inline)
        function.

        Join CDEV_MODULE and BDEV_MODULE to DEV_MODULE (please pay attention
        to the order of the cmaj/bmaj arguments!)

        Join CDEV_DRIVER_MODULE and BDEV_DRIVER_MODULE to DEV_DRIVER_MODULE
        (ditto!)

(Next step will be to convert all bdev dev_t's to cdev dev_t's
before they get to do any damage^H^H^H^H^H^Hwork in the kernel.)
1999-05-07 10:11:40 +00:00
Poul-Henning Kamp
e994c55884 Fix a goof in the #ifdef DEVFS case which was found by inspection,
it may have made things very difficult for people if they tried to
used DEVFS.
1999-05-07 09:10:10 +00:00
Poul-Henning Kamp
c48d17750f Introduce two functions: physread() and physwrite() and use these directly
in *devsw[] rather than the 46 local copies of the same functions.

(grog will do the same for vinum when he has time)
1999-05-07 07:03:47 +00:00
Poul-Henning Kamp
b0eeea2042 remove b_proc from struct buf, it's (now) unused.
Reviewed by:	dillon, bde
1999-05-06 20:00:34 +00:00
Peter Wemm
d5558c001a Fix up a few easy 'assignment used as truth value' and 'suggest parens
around && within ||' type warnings.  I'm pretty sure I have not masked
any problems here, I've committed real problem fixes seperately.
1999-05-06 18:44:42 +00:00
Peter Wemm
dfd5dee1b0 Add sufficient braces to keep egcs happy about potentially ambiguous
if/else nesting.
1999-05-06 18:13:11 +00:00
Poul-Henning Kamp
84c55b38e4 Remove unused fields from struct buf:
b_savekva
	b_validoff
	b_validend

Reviewed by:	dillon, bde
1999-05-06 17:06:41 +00:00
Bruce Evans
ea2b3e3d1b Fixed profiling of elf kernels. Made high resolution profiling compile
for elf kernels (it is broken for all kernels due to lack of egcs support).

Renaming of many assembler labels is avoided by declaring by declaring
the labels that need to be visible to gprof as having type "function"
and depending on the elf version of gprof being zealous about discarding
the others.  A few type declarations are still missing, mainly for SMP.

PR:		9413
Submitted by:	Assar Westerlund <assar@sics.se> (initial parts)
1999-05-06 09:44:57 +00:00
John Birrell
67481196cc Allow the init_path to be customised in an embedded system using the
INIT_PATH config option.

Also fix two bugs which caused an infinite loop in none of the programs
in the init_path were found. That code was obviously not tested!
1999-05-05 12:20:23 +00:00
Bill Fumerola
3d177f465a Add sysctl descriptions to many SYSCTL_XXXs
PR:		kern/11197
Submitted by:	Adrian Chadd <adrian@FreeBSD.org>
Reviewed by:	billf(spelling/style/minor nits)
Looked at by:	bde(style)
1999-05-03 23:57:32 +00:00
Alan Cox
4221e284a3 The VFS/BIO subsystem contained a number of hacks in order to optimize
piecemeal, middle-of-file writes for NFS.  These hacks have caused no
end of trouble, especially when combined with mmap().  I've removed
them.  Instead, NFS will issue a read-before-write to fully
instantiate the struct buf containing the write.  NFS does, however,
optimize piecemeal appends to files.  For most common file operations,
you will not notice the difference.  The sole remaining fragment in
the VFS/BIO system is b_dirtyoff/end, which NFS uses to avoid cache
coherency issues with read-merge-write style operations.  NFS also
optimizes the write-covers-entire-buffer case by avoiding the
read-before-write.  There is quite a bit of room for further
optimization in these areas.

The VM system marks pages fully-valid (AKA vm_page_t->valid =
VM_PAGE_BITS_ALL) in several places, most noteably in vm_fault.  This
is not correct operation.  The vm_pager_get_pages() code is now
responsible for marking VM pages all-valid.  A number of VM helper
routines have been added to aid in zeroing-out the invalid portions of
a VM page prior to the page being marked all-valid.  This operation is
necessary to properly support mmap().  The zeroing occurs most often
when dealing with file-EOF situations.  Several bugs have been fixed
in the NFS subsystem, including bits handling file and directory EOF
situations and buf->b_flags consistancy issues relating to clearing
B_ERROR & B_INVAL, and handling B_DONE.

getblk() and allocbuf() have been rewritten.  B_CACHE operation is now
formally defined in comments and more straightforward in
implementation.  B_CACHE for VMIO buffers is based on the validity of
the backing store.  B_CACHE for non-VMIO buffers is based simply on
whether the buffer is B_INVAL or not (B_CACHE set if B_INVAL clear,
and vise-versa).  biodone() is now responsible for setting B_CACHE
when a successful read completes.  B_CACHE is also set when a bdwrite()
is initiated and when a bwrite() is initiated.  VFS VOP_BWRITE
routines (there are only two - nfs_bwrite() and bwrite()) are now
expected to set B_CACHE.  This means that bowrite() and bawrite() also
set B_CACHE indirectly.

There are a number of places in the code which were previously using
buf->b_bufsize (which is DEV_BSIZE aligned) when they should have
been using buf->b_bcount.  These have been fixed.  getblk() now clears
B_DONE on return because the rest of the system is so bad about
dealing with B_DONE.

Major fixes to NFS/TCP have been made.  A server-side bug could cause
requests to be lost by the server due to nfs_realign() overwriting
other rpc's in the same TCP mbuf chain.  The server's kernel must be
recompiled to get the benefit of the fixes.

Submitted by:	Matthew Dillon <dillon@apollo.backplane.com>
1999-05-02 23:57:16 +00:00
Mark Murray
a8af2bd86b This routine was "use"ing File::Basename. This commit removes that
"use" and replaces it with equivalent inline code. The reason is that
Perl has some very nasty circular dependancies, and I am trying to
get the System Perl upgraded by one maintenance level.

The basic rule, until I can find a way to solve this, is that
the build tools MAY NOT use any library code; it must all be inline.
1999-05-02 08:55:27 +00:00
Mike Smith
4a034f21cd Add a hook that can be called to initialise a slave processor's memory
range attributes after they have been extracted from the master.

Hook up the i686 MP code to do this for each AP.

Be more careful about printing the default memory type for the i686.

Suggestions from: luoqi
1999-04-30 22:09:45 +00:00
Poul-Henning Kamp
07901f227b Add beer-ware license and $Id$
Noticed by:	dillon
1999-04-30 06:51:51 +00:00
Poul-Henning Kamp
430210c00b Make BOOTP to work again.
Submitted by:	dillon
Reviewed by:	phk
1999-04-30 06:30:15 +00:00
Dmitrij Tejblum
188554bba1 Set curproc at the end of proc0_init().
This patch also moves the bogus comment (the comment is still not quite
right) and (as a side effect) removes some verbose initialisations (we
depend on static initialisation to 0 for almost everything in proc0).

The alpha kernels are bootable again. The change  won't affect i386's
until machdep.c is changed.

Submitted by:	bde
1999-04-29 22:51:59 +00:00
Alan Cox
0043b4376a Address a performance problem in getnewbuf:
In heavy-writing situations, QUEUE_LRU can contain a large number
	of DELWRI buffers at its head.  These buffers must be moved
	to the tail if they cannot be written async in order to reduce
	the scanning time required to skip past these buffers in later
	getnewbuf() calls.

Submitted by:	Matthew Dillon <dillon@apollo.backplane.com>
1999-04-29 18:15:25 +00:00
Poul-Henning Kamp
75c1354190 This Implements the mumbled about "Jail" feature.
This is a seriously beefed up chroot kind of thing.  The process
is jailed along the same lines as a chroot does it, but with
additional tough restrictions imposed on what the superuser can do.

For all I know, it is safe to hand over the root bit inside a
prison to the customer living in that prison, this is what
it was developed for in fact:  "real virtual servers".

Each prison has an ip number associated with it, which all IP
communications will be coerced to use and each prison has its own
hostname.

Needless to say, you need more RAM this way, but the advantage is
that each customer can run their own particular version of apache
and not stomp on the toes of their neighbors.

It generally does what one would expect, but setting up a jail
still takes a little knowledge.

A few notes:

   I have no scripts for setting up a jail, don't ask me for them.

   The IP number should be an alias on one of the interfaces.

   mount a /proc in each jail, it will make ps more useable.

   /proc/<pid>/status tells the hostname of the prison for
   jailed processes.

   Quotas are only sensible if you have a mountpoint per prison.

   There are no privisions for stopping resource-hogging.

   Some "#ifdef INET" and similar may be missing (send patches!)

If somebody wants to take it from here and develop it into
more of a "virtual machine" they should be most welcome!

Tools, comments, patches & documentation most welcome.

Have fun...

Sponsored by:   http://www.rndassociates.com/
Run for almost a year by:       http://www.servetheweb.com/
1999-04-28 11:38:52 +00:00
Poul-Henning Kamp
02daf150a4 Add the jail system call. 1999-04-28 11:28:49 +00:00
Dmitrij Tejblum
604359cf9b s/static foo_devsw_installed = 0;/static int foo_devsw_installed;/.
(Edited automatically)
1999-04-28 10:54:24 +00:00
Luoqi Chen
5206bca10a Enable vmspace sharing on SMP. Major changes are,
- %fs register is added to trapframe and saved/restored upon kernel entry/exit.
- Per-cpu pages are no longer mapped at the same virtual address.
- Each cpu now has a separate gdt selector table. A new segment selector
  is added to point to per-cpu pages, per-cpu global variables are now
  accessed through this new selector (%fs). The selectors in gdt table are
  rearranged for cache line optimization.
- fask_vfork is now on as default for both UP and SMP.
- Some aio code cleanup.

Reviewed by:	Alan Cox	<alc@cs.rice.edu>
		John Dyson	<dyson@iquest.net>
		Julian Elischer	<julian@whistel.com>
		Bruce Evans	<bde@zeta.org.au>
		David Greenman	<dg@root.com>
1999-04-28 01:04:33 +00:00