Commit Graph

75 Commits

Author SHA1 Message Date
alc
b7d6153751 Correct an error in the comments for init_param3().
Discussed with: silby
2008-07-04 19:36:58 +00:00
pjd
a902aa50c3 - Export HZ value via kern.hz sysctl (this is the same name as for the
loader tunable).
- Document other sysctls in this file and also mark them as loader tunable
  via CTLFLAG_RDTUN flag.

Reviewed by:	roberto
2008-05-09 07:42:02 +00:00
alfred
3dcb842f61 Export maxswzone, maxbcache, maxtsiz, dfldsiz, maxdsiz, dflssiz, maxssiz,
and sgrowsiz via sysctl.

MFC after: 1 week
2007-10-16 10:40:53 +00:00
kris
a72fd404e3 Partially revert revision 1.66, which contained a change that did not
correspond to the commit log.  It changed the maxswzone and maxbcache
parameters from int to long, without changing the extern definitions
in <sys/buf.h>.

In fact it's a good thing it did not, because other parts of the system
are not yet ready for this, and on large-memory sparc machines it causes
severe filesystem damage if you try.

The worst effect of the change was that the tunables controlling the
above variables stopped working.  These were necessary to allow such
large sparc64 machines (with >12GB RAM) to boot, since sparc64 did not
set a hard-coded upper limit on these parameters and they ended
up overflowing an int, causing an infinite loop at boot in bufinit().

Reviewed by:	mlaier
2005-10-14 19:15:10 +00:00
marius
dfe8329b58 Increase default HZ for sparc64 to 1000. 2005-04-16 15:07:41 +00:00
imp
20280f1431 /* -> /*- for copyright notices, minor format tweaks as necessary 2005-01-06 23:35:40 +00:00
bms
8ea3319e24 Fix the build. 2004-11-30 03:23:35 +00:00
peter
0cb38b1818 Switch from 1024hz to 1000hz on amd64 to match i386. 1024 is a bad
choice because it is so in sync with stathz (128hz or 4096hz etc).
2004-11-30 00:25:26 +00:00
des
e836fd23ea #include <vm/vm_param.h> instead of <machine/vmparam.h> (the former
includes the latter, but also declares variables which are defined
in kern/subr_param.c).

Change som VM parameters from quad_t to unsigned long.  They refer to
quantities (size limits for text, heap and stack segments) which must
necessarily be smaller than the size of the address space, so long is
adequate on all platforms.

MFC after:	1 week
2004-11-08 18:20:02 +00:00
marcel
c361afa258 Increase default HZ for ia64 to 1000. 2004-11-08 04:50:02 +00:00
phk
e582d20cf5 Increase default HZ for i386 to 1000 2004-11-06 11:33:43 +00:00
imp
74cf37bd00 Remove advertising clause from University of California Regent's license,
per letter dated July 22, 1999.

Approved by: core
2004-04-05 21:03:37 +00:00
alc
707630ec9c White space and wording changes to init_param3().
Mostly submitted by:	bde
2004-03-30 08:00:11 +00:00
alc
521fa57364 Revise the direct or optimized case to use uiomove_fromphys() by the reader
instead of ephemeral mappings using pmap_qenter() by the writer.  The
writer is still, however, responsible for wiring the pages, just not
mapping them.  Consequently, the allocation of KVA for the direct case is
unnecessary.  Remove it and the sysctls limiting it, i.e.,
kern.ipc.maxpipekvawired and kern.ipc.amountpipekvawired.  The number
of temporarily wired pages is still, however, limited by
kern.ipc.maxpipekva.

Note: On platforms lacking a direct virtual-to-physical mapping,
uiomove_fromphys() uses sf_bufs to cache ephemeral mappings.  Thus,
the number of available sf_bufs can influence the performance of pipes
on platforms such i386.  Surprisingly, I saw the greatest gain from this
change on such a machine: lmbench's pipe bandwidth result increased from
~1050MB/s to ~1850MB/s on my 2.4GHz, 400MHz FSB P4 Xeon.
2004-03-27 19:50:23 +00:00
peter
7a96caafd4 Set default HZ to 1024 for amd64. The comment in kern/tty.c doesn't
apply here because we have 64 bit longs and don't suffer the hz > 169
overflows.
2004-03-14 05:49:31 +00:00
silby
bd71f7b671 More pipe changes:
From alc:
Move pageable pipe memory to a seperate kernel submap to avoid awkward
vm map interlocking issues.  (Bad explanation provided by me.)

From me:
Rework pipespace accounting code to handle this new layout, and adjust
our default values to account for the fact that we now have a solid
limit on allocations.

Also, remove the "maxpipes" limit, as it no longer has a purpose.
(The limit on kva usage solves the problem of having two many pipes.)
2003-08-11 05:51:51 +00:00
silby
22ad6d5be5 Add init_param3() to subr_param. This function is called
immediately after the kernel map has been sized, and is
the optimal place for the autosizing of memory allocations
which occur within the kernel map to occur.

Suggested by:	bde
2003-07-11 00:01:03 +00:00
silby
fa9cd99702 Pull in the entire kmem_map size calculation from kern_malloc, rather
than the shortcircuited version I had been using, which only worked
properly on i386 & amd64.

Also, change an autoscale constant to account for the more correct
kmem_map size.

Problem noticed by:     mux
2003-07-08 18:59:21 +00:00
silby
bba10d998e Put some concrete limits on pipe memory consumption:
- Limit the total number of pipes so that we do not
  exhaust all vm objects in the kernel map.  When
  this limit is reached, a ratelimited message will
  be printed to the console.

- Put a soft limit on the amount of memory consumable
  by pipes.  Once the limit has been reached, all new
  pipes will be limited to 4K in size, rather than the
  default of 16K.

- Put a limit on the number of pages that may be used
  for high speed page flipping in order to reduce the
  amount of wired memory.  Pipe writes that occur
  while this limit is exceeded will fall back to
  non-page flipping mode.

The above values are auto-tuned in subr_param.c and
are scaled to take into account both the size of
physical memory and the size of the kernel map.

These limits help to reduce the "kernel resources exhausted"
panics that could be caused by opening a large
number of pipes.  (Pipes alone are no longer able
to exhaust all resources, but other kernel memory hogs
in league with pipes may still be able to do so.)

PR:			53627
Ideas / comments from:	hsu, tjr, dillon@apollo.backplane.com
MFC after:		1 week
2003-07-08 04:02:31 +00:00
obrien
3b8fff9e4c Use __FBSDID(). 2003-06-11 00:56:59 +00:00
peter
c3bdd669c3 Change hw.physmem and hw.usermem to unsigned long like they used to be
in the original hardwired sysctl implementation.

The buf size calculator still overflows an integer on machines with large
KVA (eg: ia64) where the number of pages does not fit into an int.  Use
'long' there.

Change Maxmem and physmem and related variables to 'long', mostly for
completeness.  Machines are not likely to overflow 'int' pages in the
near term, but then again, 640K ought to be enough for anybody.  This
comes for free on 32 bit machines, so why not?
2002-08-30 04:04:37 +00:00
phk
b6bf4c07cf Improve the implementation of adjtime(2).
Apply the change as a continuous slew rather than as a series of
discrete steps and make it possible to adjust arbitraryly huge
amounts of time in either direction.

In practice this is done by hooking into the same once-per-second
loop as the NTP PLL and setting a suitable frequency offset deducting
the amount slewed from the remainder.  If the remaining delta is
larger than 1 second we slew at 5000PPM (5msec/sec), for a delta
less than a second we slew at 500PPM (500usec/sec) and for the last
one second period we will slew at whatever rate (less than 500PPM)
it takes to eliminate the delta entirely.

The old implementation stepped the clock a number of microseconds
every HZ to acheive the same effect, using the same rates of change.

Eliminate the global variables tickadj, tickdelta and timedelta and
their various use and initializations.

This removes the most significant obstacle to running timecounter and
NTP housekeeping from a timeout rather than hardclock.
2002-04-15 12:23:11 +00:00
silby
e3a68020c7 Unconditionally limit maxproc so that it is not possible
to exhaust all kmaps.  The only reward for setting maxproc
to a value which will cause kmap exhaustion is a panic
during a forkbomb attack.

MFC after:	3 days
2002-03-07 04:50:36 +00:00
dillon
9371a9a23b Allow the kern.maxusers boot tuneable to be set to 0 (previously only
the kernel config's maxusers could be set to 0 for autosizing to work).
Reviewed by:	rwatson, imp
MFC after:	3 days
2002-02-06 01:19:19 +00:00
dillon
e1e10af6b7 Make the 'maxusers 0' auto-sizing code slightly more conservative. Change
from 1 megabyte of ram per user to 2 megabytes of ram per user, and
reduce the cap from 512 to 384.  512 leaves around 240 MB of KVM available
while 384 leaves 270 MB of KVM available.  Available KVM is important
in order to deal with zalloc and kernel malloc area growth.

Reviewed by:	mckusick
MFC: either before 4.5 if re's agree, or after 4.5
2002-01-25 01:54:16 +00:00
peter
dd0f3c5ca2 Proper fix for old config setting maxusers to 8. 2001-12-14 09:39:29 +00:00
dillon
62f062ea62 Too many people are compiling kernels with maxusers set to 0 without the new
config.  Hack the kernel to force auto-sizing if the old config is used.
2001-12-14 04:01:08 +00:00
silby
dc4fed395a Limit maxprocperuid to 9/10 maxproc, and limit maxfilesperproc to 9/10
maxfiles.  This should make local resource exhaustion attacks easier
to handle with a non-tweaked setup.

MFC after:	3 days
2001-12-13 20:00:45 +00:00
dillon
6fe4980d43 Allow maxusers to be specified as 0 in the kernel config, which will
cause the system to auto-size to between 32 and 512 depending on the
amount of memory.

MFC after:	1 week
2001-12-09 01:57:09 +00:00
dillon
27124b4079 Create a mutex pool API for short term leaf mutexes.
Replace the manual mutex pool in kern_lock.c (lockmgr locks) with the new API.
Replace the mutexes embedded in sxlocks with the new API.
2001-11-13 21:55:13 +00:00
ps
db0d5cd641 Make MAXTSIZ, DFLDSIZ, MAXDSIZ, DFLSSIZ, MAXSSIZ, SGROWSIZ loader
tunable.

Reviewed by:	peter
MFC after:	2 weeks
2001-10-10 23:06:54 +00:00
dillon
d2c2ab2442 Conditionalize VM_SWZONE_SIZE_MAX and VM_BCACHE_SIZE_MAX so MD sections
that don't define these constants don't break.
2001-08-20 16:29:13 +00:00
dillon
05c33a209b Limit the amount of KVM reserved for the buffer cache and for swap-meta
information.  The default limits only effect machines with > 1GB of ram
and can be overriden with two new kernel conf variables VM_SWZONE_SIZE_MAX
and VM_BCACHE_SIZE_MAX, or with loader variables kern.maxswzone and
kern.maxbcache.  This has the effect of leaving more KVM available for
sizing NMBCLUSTERS and 'maxusers' and should avoid tripups where a sysad
adds memory to a machine and then sees the kernel panic on boot due to
running out of KVM.

Also change the default swap-meta auto-sizing calculation to allocate half
of what it was previously allocating.  The prior defaults were way too high.
Note that we cannot afford to run out of swap-meta structures so we still
stay somewhat conservative here.
2001-08-20 00:41:12 +00:00
peter
df2f882214 Move param.c out of the conf directory and make it fully dynamic.
Tunables are now derived at boot time from maxusers.  ie: change maxusers
via a tunable and all the derivative settings change.  You can change
the other tunables individually as well.  Even hz etc is tunable.
2001-07-26 23:04:03 +00:00
bmilekic
5d710b296b Introduce numerous SMP friendly changes to the mbuf allocator. Namely,
introduce a modified allocation mechanism for mbufs and mbuf clusters; one
which can scale under SMP and which offers the possibility of resource
reclamation to be implemented in the future. Notable advantages:

 o Reduce contention for SMP by offering per-CPU pools and locks.
 o Better use of data cache due to per-CPU pools.
 o Much less code cache pollution due to excessively large allocation macros.
 o Framework for `grouping' objects from same page together so as to be able
   to possibly free wired-down pages back to the system if they are no longer
   needed by the network stacks.

 Additional things changed with this addition:

  - Moved some mbuf specific declarations and initializations from
    sys/conf/param.c into mbuf-specific code where they belong.
  - m_getclr() has been renamed to m_get_clrd() because the old name is really
    confusing. m_getclr() HAS been preserved though and is defined to the new
    name. No tree sweep has been done "to change the interface," as the old
    name will continue to be supported and is not depracated. The change was
    merely done because m_getclr() sounds too much like "m_get a cluster."
  - TEMPORARILY disabled mbtypes statistics displaying in netstat(1) and
    systat(1) (see TODO below).
  - Fixed systat(1) to display number of "free mbufs" based on new per-CPU
    stat structures.
  - Fixed netstat(1) to display new per-CPU stats based on sysctl-exported
    per-CPU stat structures. All infos are fetched via sysctl.

 TODO (in order of priority):

  - Re-enable mbtypes statistics in both netstat(1) and systat(1) after
    introducing an SMP friendly way to collect the mbtypes stats under the
    already introduced per-CPU locks (i.e. hopefully don't use atomic() - it
    seems too costly for a mere stat update, especially when other locks are
    already present).
  - Optionally have systat(1) display not only "total free mbufs" but also
    "total free mbufs per CPU pool."
  - Fix minor length-fetching issues in netstat(1) related to recently
    re-enabled option to read mbuf stats from a core file.
  - Move reference counters at least for mbuf clusters into an unused portion
    of the cluster itself, to save space and need to allocate a counter.
  - Look into introducing resource freeing possibly from a kproc.

Reviewed by (in parts): jlemon, jake, silby, terry
Tested by: jlemon (Intel & Alpha), mjacob (Intel & Alpha)
Preliminary performance measurements: jlemon (and me, obviously)
URL: http://people.freebsd.org/~bmilekic/mb_alloc/
2001-06-22 06:35:32 +00:00
phk
db841f6fcc Remove unneeded <stddef.h> #includes. 2000-10-29 16:57:42 +00:00
jasone
db944ecf00 For lockmgr mutex protection, use an array of mutexes that are allocated
and initialized during boot.  This avoids bloating sizeof(struct lock).
As a side effect, it is no longer necessary to enforce the assumtion that
lockinit()/lockdestroy() calls are paired, so the LK_VALID flag has been
removed.

Idea taken from:	BSD/OS.
2000-10-12 22:37:28 +00:00
peter
1adeb7ffb1 Move the MSG* and SEM* options to opt_sysvipc.h
Remove evil allocation macros from machdep.c (why was that there???) and
use malloc() instead.
Move paramters out of param.h and into the code itself.
Move a bunch of internal definitions from public sys/*.h headers (without
#ifdef _KERNEL even) into the code itself.

I had hoped to make some of this more dynamic, but the cost of doing
wakeups on all sleeping processes on old arrays was too frightening.
The other possibility is to initialize on the first use, and allow
dynamic sysctl changes to parameters right until that point. That would
allow /etc/rc.sysctl to change SEM* and MSG* defaults as we presently
do with SHM*, but without the nightmare of changing a running system.
2000-05-01 13:33:56 +00:00
peter
7f3e149212 Make sysv-style shared memory tuneable params fully runtime adjustable
via sysctl.  It's done pretty simply but it should be quite adequate.
Also move SHMMAXPGS from $machine/include/vmparam.h as the comments that
went with it were wrong... we don't allocate KVM space for the pages so
that comment is bogus..  The only practical limit is how much physical
ram you want to lock up as this stuff isn't paged out or swap backed.
2000-03-30 07:17:05 +00:00
green
56a46611e1 This is Bosko Milekic's mbuf allocation waiting code. Basically, this
means that running out of mbuf space isn't a panic anymore, and code
which runs out of network memory will sleep to wait for it.

Submitted by:	Bosko Milekic <bmilekic@dsuper.net>
Reviewed by:	green, wollman
1999-12-12 05:52:51 +00:00
peter
3b842d34e8 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
msmith
af7615d39a Move the initialisation/tuning of nmbclusters from param.c/machdep.c
into uipc_mbuf.c.  This reduces three sets of identical tunable code to
one set, and puts the initialisation with the mbuf code proper.

Make NMBUFs tunable as well.

Move the nmbclusters sysctl here as well.

Move the initialisation of maxsockets from param.c to uipc_socket2.c,
next to its corresponding sysctl.

Use the new tunable macros for the kern.vm.kmem.size tunable (this should have
been in a separate commit, whoops).
1999-07-05 08:52:54 +00:00
des
f9cbfc392a Allow setting MAXFILES in the kernel config. 1999-04-09 16:28:11 +00:00
dillon
5b976b6d8d Fixed problems with kernel config file overrides of sysv semaphore
parameters.  Prior to this fix a kernel config override would effect
only some of the kernel files, resulting in panics.

PR:	kern/9068
1998-12-14 08:34:55 +00:00
dg
b178f74f12 Implemented zero-copy TCP/IP extensions via sendfile(2) - send a
file to a stream socket. sendfile(2) is similar to implementations in
HP-UX, Linux, and other systems, but the API is more extensive and
addresses many of the complaints that the Apache Group and others have
had with those other implementations. Thanks to Marc Slemko of the
Apache Group for helping me work out the best API for this.
Anyway, this has the "net" result of speeding up sends of files over
TCP/IP sockets by about 10X (that is to say, uses 1/10th of the CPU
cycles) when compared to a traditional read/write loop.
1998-11-05 14:28:26 +00:00
bde
0c05d7ae1f Moved definition of fscale from param.c to kern_synch.c where it
should always have been (it has no user-servicable parts even at
compile time) and staticized it.
1998-07-11 13:06:41 +00:00
phk
107b074e23 Add 3 sysctl variables for future use by ps)1_ 1998-06-30 21:25:58 +00:00
bde
37526a29b0 Round tickadj up. This prevents tickadj from being 0 when HZ > 500,
which makes adjtime(2) useless and confuses xntpd(8) into refusing
to start even when it would use the kernel PLL instead of adjtime().
The result is the same as recommended by tickadj(8), at least when
HZ divides 10^6.  Of course, you wouldn't want to actually use
adjtime() when HZ is large.  In the silly boundary case of HZ == 10^6,
tickadj == tick == 1 so the clock stops while adjtime() is active.
1998-06-21 12:22:35 +00:00
wollman
bbc4497ada Convert socket structures to be type-stable and add a version number.
Define a parameter which indicates the maximum number of sockets in a
system, and use this to size the zone allocators used for sockets and
for certain PCBs.

Convert PF_LOCAL PCB structures to be type-stable and add a version number.

Define an external format for infomation about socket structures and use
it in several places.

Define a mechanism to get all PF_LOCAL and PF_INET PCB lists through
sysctl(3) without blocking network interrupts for an unreasonable
length of time.  This probably still has some bugs and/or race
conditions, but it seems to work well enough on my machines.

It is now possible for `netstat' to get almost all of its information
via the sysctl(3) interface rather than reading kmem (changes to follow).
1998-05-15 20:11:40 +00:00
guido
391bb65d14 Raise ncallout from NPROC + 16 to NPROC + 16 + MAXFILES. This shold
prevent a possible DOS attack. The proper fix (to dynamically grow
the callout list) is in the make.
Submitted by:	Paul Traina
1998-02-27 19:58:29 +00:00