freebsd-dev/sys/kern
Bruce Evans f0ebe4973f Scheduler fixes equivalent to the ones logged in the following NetBSD
commit to kern_synch.c:

  ----------------------------
  revision 1.55
  date: 1999/02/23 02:56:03;  author: ross;  state: Exp;  lines: +39 -10
  Scheduler bug fixes and reorganization
  * fix the ancient nice(1) bug, where nice +20 processes incorrectly
    steal 10 - 20% of the CPU, (or even more depending on load average)
  * provide a new schedclk() mechanism at a new clock at schedhz, so high
    platform hz values don't cause nice +0 processes to look like they are
    niced
  * change the algorithm slightly, and reorganize the code a lot
  * fix percent-CPU calculation bugs, and eliminate some no-op code

  === nice bug === Correctly divide the scheduler queues between niced and
  compute-bound processes. The current nice weight of two (sort of, see
  `algorithm change' below) neatly divides the USRPRI queues in half; this
  should have been used to clip p_estcpu, instead of UCHAR_MAX.  Besides
  being the wrong amount, clipping an unsigned char to UCHAR_MAX is a no-op,
  and it was done after decay_cpu() which can only _reduce_ the value.  It
  has to be kept <= NICE_WEIGHT * PRIO_MAX - PPQ or processes can
  scheduler-penalize themselves onto the same queue as nice +20 processes.
  (Or even a higher one.)

  === New schedclk() mechanism === Some platforms should be cutting down
  stathz before hitting the scheduler, since the scheduler algorithm only
  works right in the vicinity of 64 Hz. Rather than prescale hz, then scale
  back and forth by 4 every time p_estcpu is touched (each occurrence an
  abstraction violation), use p_estcpu without scaling and require schedhz
  to be generated directly at the right frequency. Use a default stathz (well,
  actually, profhz) / 4, so nothing changes unless a platform defines schedhz
  and a new clock.  Define these for alpha, where hz==1024, and nice was
  totally broke.

  === Algorithm change === The nice value used to be added to the
  exponentially-decayed scheduler history value p_estcpu, in _addition_ to
  being incorporated directly (with greater weight) into the priority calculation.
  At first glance, it appears to be a pointless increase of 1/8 the nice
  effect (pri = p_estcpu/4 + nice*2), but it's actually at least 3x that
  because it will ramp up linearly but be decayed only exponentially, thus
  converging to an additional .75 nice for a load average of one. I killed
  this; it makes the behavior hard to control, almost impossible to analyze,
  and the effect (~~nothing at all for the first second, then somewhat increased
  niceness after three seconds or more, depending on load average) pointless.

  === Other bugs === hz -> profhz in the p_pctcpu = f(p_cpticks) calculation.
  Collect scheduler functionality. Try to put each abstraction in just one
  place.
  ----------------------------

The details are a little different in FreeBSD:

=== nice bug ===   Fixing this is the main point of this commit.  We use
essentially the same clipping rule as NetBSD (our limit on p_estcpu
differs by a scale factor).  However, clipping at all is fundamentally
bad.  It gives free CPU to the hoggiest hogs once they reach the limit, and
reaching the limit is normal for long-running hogs.  This will be fixed
later.
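
The clipping rule can be illustrated with a small model in NetBSD-style
priority points. This is a minimal sketch, not the kernel code: the helper
names clip_estcpu() and queue_index() are hypothetical, PPQ = 4 is an
assumption (128 priorities spread over 32 queues), and both the base user
priority and the FreeBSD scale factor mentioned above are left out.

```c
/* Assumed values; the real kernel headers may differ. */
#define NICE_WEIGHT 2    /* priority points per nice level ("nice weight of two") */
#define PRIO_MAX    20   /* largest nice value */
#define PPQ         4    /* priorities per run queue (assumption) */

/* Limit named in the quoted log: NICE_WEIGHT * PRIO_MAX - PPQ. */
#define ESTCPU_LIMIT (NICE_WEIGHT * PRIO_MAX - PPQ)

/* Hypothetical helper: clip the CPU-usage estimate to the limit. */
static unsigned int clip_estcpu(unsigned int estcpu)
{
    return estcpu > ESTCPU_LIMIT ? ESTCPU_LIMIT : estcpu;
}

/* Hypothetical helper: run queue relative to the first user queue. */
static unsigned int queue_index(unsigned int estcpu, unsigned int nice)
{
    return (estcpu + NICE_WEIGHT * nice) / PPQ;
}
```

Under these assumptions an unclipped nice 0 hog with an estimate of 40 lands
on the same queue as an idle nice +20 process, while a clipped hog stays one
queue better. A hog that has reached the limit gains no further penalty,
which is the "free CPU" problem noted above.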

=== New schedclk() mechanism ===  We don't use the NetBSD schedclk()
(now schedclock()) mechanism.  We require (real)stathz to be about 128
and scale by an extra factor of 2 compared with NetBSD's statclock().
We scale p_estcpu instead of scaling the clock.  This is more accurate
and flexible.
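
One way to read the two approaches is that both aim for the same rate of
priority penalty per second of CPU use. The sketch below is only arithmetic
under that reading: points_per_second() is a hypothetical helper, and the
divisors (clock / 4 for NetBSD, p_estcpu / 8 for FreeBSD) are inferred from
the "stathz / 4" and "extra factor of 2" wording above rather than taken
from the sources.

```c
/*
 * Priority points gained per second by a process that is running on
 * every statistics tick.
 *   tick_hz        - frequency of the statistics clock
 *   clock_divisor  - how far the clock is divided before p_estcpu is bumped
 *   estcpu_divisor - how far p_estcpu is divided when computing priority
 */
static int points_per_second(int tick_hz, int clock_divisor, int estcpu_divisor)
{
    return tick_hz / clock_divisor / estcpu_divisor;
}
```

With these assumed numbers, points_per_second(64, 4, 1) models the NetBSD
scheme (divide the clock, use p_estcpu unscaled) and
points_per_second(128, 1, 8) models the FreeBSD scheme (count every tick,
scale p_estcpu); both come to 16.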

=== Algorithm change ===  Same change.
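
The ".75 nice" figure in the NetBSD log can be reproduced with a short
simulation of the removed term. This is a sketch under assumptions: it uses
the classic BSD decay factor 2*load / (2*load + 1), which the log does not
spell out, floating point instead of the kernel's fixed point, and a process
that uses no CPU at all, so that only the nice contribution is visible. The
function names are hypothetical.

```c
/* One once-per-second pass of the old scheme: decay history, then add nice. */
static double old_estcpu_step(double estcpu, double load, int nice)
{
    double decay = (2.0 * load) / (2.0 * load + 1.0);  /* assumed decay filter */
    return decay * estcpu + nice;
}

/* Extra priority contributed through p_estcpu after the given number of seconds. */
static double old_extra_priority(double load, int nice, int seconds)
{
    double estcpu = 0.0;
    for (int i = 0; i < seconds; i++)
        estcpu = old_estcpu_step(estcpu, load, nice);
    return estcpu / 4.0;  /* the p_estcpu/4 term from pri = p_estcpu/4 + nice*2 */
}
```

At a load average of one the decay factor is 2/3, so p_estcpu settles at
3 * nice and the extra priority at 0.75 * nice. For nice +20 that is 15
points on top of the 40 from nice*2, i.e. 3/8 of the direct term, matching
the "at least 3x" the naive 1/8 estimate.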

=== Other bugs ===  The p_pctcpu bug was fixed long ago.  We don't try as
hard to abstract functionality yet.

Related changes: the new limit on p_estcpu must be exported to kern_exit.c
for clipping in wait1().

Agreed with by:		dufault
1999-11-28 12:12:13 +00:00
bus_if.m
device_if.m
imgact_aout.c s/p_cred->pc_ucred/p_ucred/g 1999-11-21 12:38:21 +00:00
imgact_elf.c s/p_cred->pc_ucred/p_ucred/g 1999-11-21 12:38:21 +00:00
imgact_gzip.c useracc() the prequel: 1999-10-29 18:09:36 +00:00
imgact_shell.c
inflate.c
init_main.c struct mountlist and struct mount.mnt_list have no business being 1999-11-20 10:00:46 +00:00
init_sysent.c Cop on a bit and regenerate things correctly. 1999-11-18 20:45:04 +00:00
kern_acct.c
kern_clock.c Fixed some comments in statclock(). The previous commit made it clearer 1999-11-27 14:37:34 +00:00
kern_conf.c Zap devsw_module_handler(). 1999-11-08 08:10:00 +00:00
kern_descrip.c Only bother converting the stat structure if we intend to return it, 1999-11-18 08:08:28 +00:00
kern_environment.c Change the prototype of the strto* routines to make the second 1999-11-24 01:03:08 +00:00
kern_exec.c Add a sysctl to control if argv is disclosed to the world: 1999-11-26 08:27:16 +00:00
kern_exit.c s/p_cred->pc_ucred/p_ucred/g 1999-11-21 12:38:21 +00:00
kern_fork.c The at_exit and at_fork functions currently use a 'roll your own' 1999-11-19 21:29:03 +00:00
kern_intr.c
kern_jail.c
kern_kthread.c
kern_ktrace.c This is a partial commit of the patch from PR 14914: 1999-11-16 10:56:05 +00:00
kern_linker.c Tempt fate and stop index from converting a const char * into a char *. 1999-11-21 04:26:48 +00:00
kern_lock.c Correct a locking error in apause: It should always hold 1999-11-11 03:02:03 +00:00
kern_lockf.c Commit the remaining part of PR14914: 1999-11-16 16:28:58 +00:00
kern_malloc.c KAME netinet6 basic part(no IPsec,no V6 Multicast Forwarding, no UDP/TCP 1999-11-22 02:45:11 +00:00
kern_mib.c
kern_module.c A hack basically.. We have a bunch of code that used to call 1999-11-08 06:53:30 +00:00
kern_ntptime.c
kern_physio.c Change useracc() and kernacc() to use VM_PROT_{READ|WRITE|EXECUTE} for the 1999-10-30 06:32:05 +00:00
kern_proc.c Add a sysctl to control if argv is disclosed to the world: 1999-11-26 08:27:16 +00:00
kern_prot.c Introduce the new function 1999-11-21 19:03:20 +00:00
kern_random.c
kern_resource.c This is a partial commit of the patch from PR 14914: 1999-11-16 10:56:05 +00:00
kern_shutdown.c struct mountlist and struct mount.mnt_list have no business being 1999-11-20 10:00:46 +00:00
kern_sig.c Introduce the new function 1999-11-21 19:03:20 +00:00
kern_subr.c useracc() the prequel: 1999-10-29 18:09:36 +00:00
kern_switch.c
kern_synch.c Scheduler fixes equivalent to the ones logged in the following NetBSD 1999-11-28 12:12:13 +00:00
kern_syscalls.c
kern_sysctl.c Change useracc() and kernacc() to use VM_PROT_{READ|WRITE|EXECUTE} for the 1999-10-30 06:32:05 +00:00
kern_tc.c Fixed some comments in statclock(). The previous commit made it clearer 1999-11-27 14:37:34 +00:00
kern_threads.c
kern_time.c Change useracc() and kernacc() to use VM_PROT_{READ|WRITE|EXECUTE} for the 1999-10-30 06:32:05 +00:00
kern_timeout.c
kern_xxx.c
ksched.c
link_aout.c Take a shot at implementing the fix for PR 15014 for the a.out kernel 1999-11-28 12:06:29 +00:00
link_elf_obj.c Fix an embarrassing mistake in the kld symbol lookup for DDB. It should 1999-11-28 11:59:18 +00:00
link_elf.c Fix an embarrassing mistake in the kld symbol lookup for DDB. It should 1999-11-28 11:59:18 +00:00
Make.tags.inc
makedevops.pl Fix some bugs in user-end output and add a reference to the original 1999-11-22 14:40:04 +00:00
Makefile ${MACHINE} -> ${MACHINE_ARCH} 1999-11-14 13:54:44 +00:00
makesyscalls.sh
md5c.c
p1003_1b.c
posix4_mib.c
subr_autoconf.c
subr_blist.c useracc() the prequel: 1999-10-29 18:09:36 +00:00
subr_bus.c 'const'ify a bunch of pointers in the resource_*() functions for accessing 1999-11-18 06:05:30 +00:00
subr_clist.c
subr_devstat.c This is a partial commit of the patch from PR 14914: 1999-11-16 10:56:05 +00:00
subr_disk.c Conditionalise unwanted chattiness. 1999-11-19 23:34:01 +00:00
subr_disklabel.c
subr_diskmbr.c Fix a warning. 1999-11-09 21:35:10 +00:00
subr_diskslice.c
subr_dkbad.c
subr_eventhandler.c Commit the remaining part of PR14914: 1999-11-16 16:28:58 +00:00
subr_log.c
subr_module.c
subr_param.c
subr_prf.c
subr_prof.c
subr_rman.c Commit the remaining part of PR14914: 1999-11-16 16:28:58 +00:00
subr_scanf.c Change the prototype of the strto* routines to make the second 1999-11-24 01:03:08 +00:00
subr_smp.c Moved scheduling-related code to kern_synch.c so that it is easier to fix 1999-11-27 12:32:27 +00:00
subr_trap.c Passing "0" or "FALSE" as the fourth argument to vm_fault is wrong. It 1999-11-09 01:44:28 +00:00
subr_xxx.c
sys_generic.c
sys_pipe.c Update pipe code for fo_stat() entry point - pipe_stat() is now no longer 1999-11-08 03:28:49 +00:00
sys_process.c Introduce the new function 1999-11-21 19:03:20 +00:00
sys_socket.c Update socket file type for fo_stat(). soo_stat() becomes a fileops 1999-11-08 03:31:01 +00:00
syscalls.c Cop on a bit and regenerate things correctly. 1999-11-18 20:45:04 +00:00
syscalls.master modfind(char *) -> modfind(const char *) 1999-11-17 21:32:40 +00:00
sysv_ipc.c
sysv_msg.c
sysv_sem.c
sysv_shm.c useracc() the prequel: 1999-10-29 18:09:36 +00:00
tty_compat.c
tty_conf.c Now that Netgraph is in the system there are some cleanups we can do. 1999-10-23 04:28:11 +00:00
tty_cons.c Remove cdevsw_add() - the necessary make_dev() is already there. 1999-11-18 06:37:00 +00:00
tty_pty.c Revert peter's commit to remove cdevsw_add() - it was a bit premature 1999-11-21 02:54:54 +00:00
tty_snoop.c Remove cdevsw_add() - the make_dev() calls are already there. 1999-11-18 06:39:47 +00:00
tty_subr.c
tty_tb.c
tty_tty.c
tty.c This is a partial commit of the patch from PR 14914: 1999-11-16 10:56:05 +00:00
uipc_domain.c
uipc_mbuf.c Fix a warning. 1999-11-18 06:29:57 +00:00
uipc_proto.c
uipc_sockbuf.c
uipc_socket2.c
uipc_socket.c KAME netinet6 basic part(no IPsec,no V6 Multicast Forwarding, no UDP/TCP 1999-11-22 02:45:11 +00:00
uipc_syscalls.c General clean-up of socket.h and associated sources to synchronise up 1999-11-24 20:49:04 +00:00
uipc_usrreq.c This is a partial commit of the patch from PR 14914: 1999-11-16 10:56:05 +00:00
vfs_aio.c Convert various pieces of code to use vn_isdisk() rather than checking 1999-11-22 10:33:55 +00:00
vfs_bio.c Convert various pieces of code to use vn_isdisk() rather than checking 1999-11-22 10:33:55 +00:00
vfs_cache.c
vfs_cluster.c useracc() the prequel: 1999-10-29 18:09:36 +00:00
vfs_conf.c Retire MFS_ROOT and MFS_ROOT_SIZE options from the MFS implementation. 1999-11-26 20:08:44 +00:00
vfs_default.c Make vop_panic() a little more informative. 1999-11-07 15:09:49 +00:00
vfs_export.c Convert various pieces of code to use vn_isdisk() rather than checking 1999-11-22 10:33:55 +00:00
vfs_extattr.c struct mountlist and struct mount.mnt_list have no business being 1999-11-20 10:00:46 +00:00
vfs_init.c Move a couple of globals here where they are initialised, rather than 1999-11-01 23:54:07 +00:00
vfs_lookup.c
vfs_mount.c Retire MFS_ROOT and MFS_ROOT_SIZE options from the MFS implementation. 1999-11-26 20:08:44 +00:00
vfs_subr.c Convert various pieces of code to use vn_isdisk() rather than checking 1999-11-22 10:33:55 +00:00
vfs_syscalls.c struct mountlist and struct mount.mnt_list have no business being 1999-11-20 10:00:46 +00:00
vfs_vnops.c Ensure that garbage from the kernel stack does not wind up being 1999-11-18 08:14:20 +00:00
vnode_if.pl
vnode_if.sh
vnode_if.src Remove WILLRELE from VOP_SYMLINK 1999-11-13 20:58:17 +00:00