Commit Graph

98 Commits

Author SHA1 Message Date
alc
c84b8f6e0c Modestly increase the maximum allowed size of the kmem map on i386.
Also, express this new maximum as a fraction of the kernel's address
space size rather than a constant so that increasing KVA_PAGES will
automatically increase this maximum.  As a side-effect of this change,
kern.maxvnodes will automatically increase by a proportional amount.

While I'm here ensure that this change doesn't result in an unintended
increase in maxpipekva on i386.  Calculate maxpipekva based upon the
size of the kernel address space and the amount of physical memory
instead of the size of the kmem map.  The memory backing pipes is not
allocated from the kmem map.  It is allocated from its own submap of
the kernel map.  In short, it has no real connection to the kmem map.
(In fact, the commit messages for the maxpipekva auto-sizing talk
about using the kernel map size, cf. r117325 and r117391, even though
the implementation actually used the kmem map size.)  Although the
calculation is now done differently, the resulting value for
maxpipekva should remain almost the same on i386.  However, on amd64,
the value will be reduced by 2/3.  This is intentional.  The recent
change to VM_KMEM_SIZE_SCALE on amd64 for the benefit of ZFS also had
the unnecessary side-effect of increasing maxpipekva.  This change is
effectively restoring maxpipekva on amd64 to its prior value.

Eliminate init_param3() since it is no longer used.
2011-03-23 16:38:29 +00:00
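As an illustration of the maxpipekva sizing described in c84b8f6e0c, the limit can be derived from physical memory and then capped by a fraction of the kernel address space, with no reference to the kmem map. This is a minimal sketch, not the committed code: the divisor of 64, the 512 KB floor, and the names size_maxpipekva, physmem_bytes and kva_bytes are assumptions made for the example.

```c
#include <stdint.h>

#define PIPEKVA_DIVISOR	64		/* assumed scaling factor */
#define PIPEKVA_FLOOR	(512 * 1024)	/* assumed lower bound, in bytes */

/*
 * Derive a pipe KVA limit from physical memory, apply a floor, then cap
 * it by the same fraction of the kernel address space.  The kmem map
 * size is deliberately not an input.
 */
int64_t
size_maxpipekva(int64_t physmem_bytes, int64_t kva_bytes)
{
	int64_t limit = physmem_bytes / PIPEKVA_DIVISOR;

	if (limit < PIPEKVA_FLOOR)
		limit = PIPEKVA_FLOOR;
	if (limit > kva_bytes / PIPEKVA_DIVISOR)
		limit = kva_bytes / PIPEKVA_DIVISOR;
	return (limit);
}
```

With these assumed constants, a machine with 4 GB of RAM and a 1 GB kernel address space is limited by the address space term, while a machine with a very large address space is limited by physical memory.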
pluknet
5f536fc1d3 Make MSGBUF_SIZE kernel option a loader tunable kern.msgbufsize.
Submitted by:	perryh pluto.rain.com (previous version)
Reviewed by:	jhb
Approved by:	kib (mentor)
Tested by:	universe
2011-01-21 10:26:26 +00:00
csjp
1e529a8eb9 Add Xen to the list of virtual vendors. In the non-PV (HVM) case this fixes
the virtualization detection, successfully disabling the clflush instruction.
This fixes insta-panics for XEN hvm users when the hw.clflush_disable
tunable is -1 or 0 (-1 by default).

Discussed with:	jhb
2010-08-06 15:04:40 +00:00
nwhitehorn
ecf1995ac7 Reverse the logic of the if statement that sets the default value of
HZ; the list of 1000 Hz platforms was getting unwieldy.

Suggested by:	marcel
2010-06-24 00:27:20 +00:00
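The inversion described in ecf1995ac7 can be pictured as switching from a list of 1000 Hz platforms to a list of exceptions. The sketch below is hypothetical: the real code tests compiler-defined macros in the preprocessor, and PLAT_EXAMPLE_SLOW stands in for whichever platforms keep the 100 Hz default, which this log does not name.

```c
/* Hypothetical platform tags used only for this illustration. */
enum platform {
	PLAT_I386,
	PLAT_AMD64,
	PLAT_IA64,
	PLAT_SPARC64,
	PLAT_POWERPC,
	PLAT_EXAMPLE_SLOW	/* placeholder for a 100 Hz platform */
};

/* Old shape: every 1000 Hz platform must be listed explicitly. */
int
default_hz_listed(enum platform p)
{
	switch (p) {
	case PLAT_I386:
	case PLAT_AMD64:
	case PLAT_IA64:
	case PLAT_SPARC64:
	case PLAT_POWERPC:
		return (1000);
	default:
		return (100);
	}
}

/* New shape: 1000 Hz is the default and only the exceptions are listed. */
int
default_hz_inverted(enum platform p)
{
	return (p == PLAT_EXAMPLE_SLOW ? 100 : 1000);
}
```

Both functions return the same values for the platforms shown; the second form simply no longer grows each time a port moves to 1000 Hz.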
nwhitehorn
da5a28c706 Move default HZ from 100 to 1000 on powerpc.
Reviewed by:	marcel
MFC after:	2 weeks
2010-06-23 23:26:14 +00:00
ivoras
14f9175723 Document the VM detection type and sysctl a bit better.
2010-03-02 23:57:42 +00:00
alc
83149d5d10 When running as a guest operating system, the FreeBSD kernel must assume
that the virtual machine monitor has enabled machine check exceptions.
Unfortunately, on AMD Family 10h processors the machine check hardware
has a bug (Erratum 383) that can result in a false machine check exception
when a superpage promotion occurs.  Thus, I am disabling superpage
promotion when the FreeBSD kernel is running as a guest operating system
on an AMD Family 10h processor.

Reviewed by:	jhb, kib
MFC after:	3 days
2010-02-27 18:00:57 +00:00
brooks
e7e1754f54 Don't enforce an upper bound on kern.ngroups. The INT_MAX-1 limit was
too high due to several overflows.  The actual limit is somewhere in the
neighborhood of INT_MAX/4 on 64-bit machines, but most systems could not
support such a limit due to a lack of memory and the cost of duplicate
credentials.

Reported by:	bde
2010-02-24 15:52:18 +00:00
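One plausible source of the overflows mentioned in e7e1754f54 is the byte size of a group array computed in int arithmetic. The sketch below assumes a 4-byte gid_t and an array of ngroups + 1 entries; the helper name groups_bytes_fit_int is hypothetical, and the log does not say which expressions actually overflowed.

```c
#include <stdbool.h>
#include <stdint.h>

#define GID_BYTES	4	/* assumed sizeof(gid_t) */

/*
 * Report whether the byte size of an (ngroups + 1)-entry gid array
 * still fits in a 32-bit int.  The product is formed in 64 bits so the
 * check itself cannot wrap.
 */
bool
groups_bytes_fit_int(int64_t ngroups)
{
	int64_t bytes = (ngroups + 1) * GID_BYTES;

	return (bytes <= INT32_MAX);
}
```

Under these assumptions the check starts failing at INT_MAX/4, which is consistent with the neighborhood quoted in the commit message.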
brooks
a093b41daf Replace the static NGROUPS=NGROUPS_MAX+1=1024 with a dynamic
kern.ngroups+1.  kern.ngroups can range from NGROUPS_MAX=1023 to
INT_MAX-1.  Given that the Windows group limit is 1024, this range
should be sufficient for most applications.

MFC after:	1 month
2010-01-12 07:49:34 +00:00
silby
13615958a8 Increase HZ_VM from 10 to 100. While 10 hz saves cpu time
under VM environments, it's too slow for FreeBSD to work
properly.  For example, ping at 10hz pings about every 600ms
instead of about every second.

Approved by:	re (kib)
2009-07-08 01:09:12 +00:00
jhb
ffcd13a80f Improve the description of a few sysctls.
Submitted by:	bde (partially)
MFC after:	3 days
2009-03-23 20:18:06 +00:00
jhb
db47507f01 Change the sysctls for maxbcache and maxswzone from int to long. I missed
this earlier since these sysctls don't exist in 7.x yet.
2009-03-12 17:23:02 +00:00
jhb
192cd27cf3 Export the current values of nbuf, ncallout, and nswbuf via read-only
sysctls that match the tunable names.

MFC after:	3 days
2009-03-12 17:21:58 +00:00
jhb
50289fd1c1 - Make maxpipekva a signed long rather than an unsigned long as overflow
is more likely to be noticed with signed types.
- Make amountpipekva a long as well to match maxpipekva.

Discussed with:	bde
2009-03-10 21:28:43 +00:00
jhb
80d9458a56 Adjust some variables (mostly related to the buffer cache) that hold
address space sizes to be longs instead of ints.  Specifically, the following
values are now longs: runningbufspace, bufspace, maxbufspace,
bufmallocspace, maxbufmallocspace, lobufspace, hibufspace, lorunningspace,
hirunningspace, maxswzone, maxbcache, and maxpipekva.  Previously, a
relatively small number (~ 44000) of buffers set in kern.nbuf would result
in integer overflows resulting either in hangs or bogus values of
hidirtybuffers and lodirtybuffers.  Now one has to overflow a long to see
such problems.  There was a check for a nbuf setting that would cause
overflows in the auto-tuning of nbuf.  I've changed it to always check and
cap nbuf but warn if a user-supplied tunable would cause overflow.

Note that this changes the ABI of several sysctls that are used by things
like top(1), etc., so any MFC would probably require some gross shims
to allow for that.

MFC after:	1 month
2009-03-09 19:35:20 +00:00
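The threshold of roughly 44000 buffers quoted in 80d9458a56 can be reproduced with a small calculation. The sketch assumes 16 KB of KVA per buffer and a high-water mark written as 3 * maxbufspace / 4, so that the intermediate product is what wraps; both are assumptions for illustration, and the log does not identify the exact expression that overflowed.

```c
#include <stdbool.h>
#include <stdint.h>

#define BKVASIZE_ASSUMED	16384	/* assumed KVA bytes per buffer */

/* Would the intermediate 3 * (nbuf * BKVASIZE) exceed a 32-bit int? */
bool
bufspace_overflows_int(int64_t nbuf)
{
	int64_t maxbufspace = nbuf * BKVASIZE_ASSUMED;

	return (3 * maxbufspace > INT32_MAX);
}

/* Largest nbuf for which that intermediate still fits in a 32-bit int. */
int64_t
nbuf_int_cap(void)
{
	return (INT32_MAX / (3 * (int64_t)BKVASIZE_ASSUMED));
}
```

With these assumptions the cap falls just below 44000, and the same arithmetic done in a 64-bit long has ample headroom.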
ivoras
4136fd8892 Document the relationship between enum VM_GUEST and the vm_guest_sysctl_names
array.

Approved by:	gnn (original version)
2008-12-30 23:49:54 +00:00
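The relationship documented in 4136fd8892 is the usual pairing of an enum with a parallel table of names. The following sketch shows one way to keep such a pair in step; the enumerator names, the strings, and the helper vm_guest_name are hypothetical and are not taken from the kernel source.

```c
#include <stddef.h>

/* Hypothetical guest types; the order must match the name table below. */
enum vm_guest_kind {
	VMG_NONE = 0,
	VMG_GENERIC,
	VMG_XEN,
	VMG_LAST	/* number of real entries, not a guest type */
};

static const char *const vm_guest_names[] = {
	"none",		/* VMG_NONE */
	"generic",	/* VMG_GENERIC */
	"xen",		/* VMG_XEN */
};

/* Fail the build if an enumerator is added without a matching name. */
_Static_assert(sizeof(vm_guest_names) / sizeof(vm_guest_names[0]) == VMG_LAST,
    "vm_guest_names must have one entry per enum vm_guest_kind value");

/* Map a guest type to its name, tolerating out-of-range input. */
const char *
vm_guest_name(int kind)
{
	if (kind < 0 || kind >= VMG_LAST)
		return ("unknown");
	return (vm_guest_names[kind]);
}
```

The compile-time assertion is a design choice of this sketch: it turns a forgotten table entry into a build error instead of an out-of-bounds read.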
bz
7d22a18291 Hide detect_virtual() along with the accompanying string
arrays under #ifndef XEN to make XEN config compile again.
In case of Xen vm_guest is hard coded.

Move the list for the vm_guest sysctl out of the restrictive
bounds as the sysctl is there in either case.
2008-12-27 17:19:16 +00:00
ivoras
c6f6eeca99 By popular request, stringify kern.vm_guest sysctl. Now it returns a
short, self-documenting string describing the detected virtual
environment.

Approved by:	gnn (mentor) (earlier version)
2008-12-18 15:34:38 +00:00
ivoras
b769de9274 Introduce a sysctl kern.vm_guest that reflects what the kernel knows about
it running under a virtual environment. This also introduces a globally
accessible variable vm_guest that can be used where appropriate in the
kernel to inspect this environment.

To make it easier for the long run, an enum VM_GUEST is also introduced,
which could possibly be factored out in a header somewhere (but the
question is where - vm/vm_param.h? sys/param.h?) so it eventually becomes
a part of the standard KPI. In any case, it's a start.

The purpose of all this isn't to absolutely detect that the OS is running
under a virtual environment (cf. "redpill") but to allow the parts of the
kernel and the userland that care about this particular aspect and can do
something useful depending on it to have a standardised interface. Reducing
kern.hz is one example but there are other things that could be done like
avoiding context switches, not using CPU instructions that are known to be
slow in emulation, possibly different strategies in VM (memory) allocation,
CPU scheduling, etc.

It isn't clear if the JAILS/VIMAGE functionality should also be exposed
by this particular mechanism (probably not since they're not "full"
virtual hardware environments). Sometime in the future another sysctl and
a variable could be introduced to reflect if the kernel supports any kind
of virtual hosting (e.g. VMWare VMI, Xen dom0).

Reviewed by:	silence from src-committers@, virtualization@, kmacy@
Approved by:	gnn (mentor)
Security:	Obscurity doesn't help.
2008-12-17 19:57:12 +00:00
jkim
bc7e5e240b - Detect Bochs BIOS variants and use HZ_VM as well.
- Free kernel environment variable after its use.
- Fix style(9) nits.
2008-12-08 18:39:59 +00:00
sobomax
2bddeb51d2 vm_pnames should be "const char *const[]".
Submitted by:	Christoph Mallon
2008-10-27 08:09:05 +00:00
sobomax
c9fd562aa0 vm_pnames has no reason to be global.
MFC after:	2 weeks
2008-10-27 06:34:41 +00:00
sobomax
6b076dc603 Default HZ value (1,000) on i386/amd64 is not very virtual machine friendly.
Due to the nature of the beast it causes a lot of unproductive overhead. This
is especially bad when running SMP kernel on VMWare with several virtual
processors - idle FreeBSD guest with SMP kernel takes 150% host CPU time on my
dual-core MacBook Pro when I am enabling two virtual CPUs, making even host
not very usable. Detect when we are running in the sandbox and reduce HZ
to 10 (can be adjusted via VM_HZ in the kernel config) in such cases. This
brings host CPU usage of idle FreeBSD/SMP on two virtual processors down
to 10%.

Detect most popular VM platforms out there - VMWare, Parallels, VirtualBox
and VirtualPC.

MFC after:	2 weeks
2008-10-27 06:25:02 +00:00
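The approach in 6b076dc603 amounts to matching an identification string against a table of known virtual platforms and choosing a lower tick rate on a match. The sketch below is illustrative only: the product strings, the macro names, and the functions looks_virtual and pick_hz are assumptions, and how the string is obtained from the firmware is left out.

```c
#include <stddef.h>
#include <string.h>

#define EXAMPLE_HZ_DEFAULT	1000
#define EXAMPLE_HZ_VM		10	/* value from this commit; raised later */

/* Illustrative product strings only; the real table may differ. */
static const char *const vm_product_names[] = {
	"VMware Virtual Platform",
	"Parallels Virtual Platform",
	"VirtualBox",
	"Virtual Machine",
	NULL
};

/* Return nonzero if the reported product string names a known VM. */
int
looks_virtual(const char *product)
{
	size_t i;

	if (product == NULL)
		return (0);
	for (i = 0; vm_product_names[i] != NULL; i++) {
		if (strcmp(product, vm_product_names[i]) == 0)
			return (1);
	}
	return (0);
}

/* Choose the tick rate: reduced under a detected VM, default otherwise. */
int
pick_hz(const char *product)
{
	return (looks_virtual(product) ? EXAMPLE_HZ_VM : EXAMPLE_HZ_DEFAULT);
}
```

An exact string match keeps the sketch simple; it also means an unrecognized hypervisor silently gets the default rate, which matches the best-effort intent described in the later kern.vm_guest commits.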
alc
b7d6153751 Correct an error in the comments for init_param3().
Discussed with: silby
2008-07-04 19:36:58 +00:00
pjd
a902aa50c3 - Export HZ value via kern.hz sysctl (this is the same name as for the
loader tunable).
- Document other sysctls in this file and also mark them as loader tunable
  via CTLFLAG_RDTUN flag.

Reviewed by:	roberto
2008-05-09 07:42:02 +00:00
alfred
3dcb842f61 Export maxswzone, maxbcache, maxtsiz, dfldsiz, maxdsiz, dflssiz, maxssiz,
and sgrowsiz via sysctl.

MFC after: 1 week
2007-10-16 10:40:53 +00:00
kris
a72fd404e3 Partially revert revision 1.66, which contained a change that did not
correspond to the commit log.  It changed the maxswzone and maxbcache
parameters from int to long, without changing the extern definitions
in <sys/buf.h>.

In fact it's a good thing it did not, because other parts of the system
are not yet ready for this, and on large-memory sparc machines it causes
severe filesystem damage if you try.

The worst effect of the change was that the tunables controlling the
above variables stopped working.  These were necessary to allow such
large sparc64 machines (with >12GB RAM) to boot, since sparc64 did not
set a hard-coded upper limit on these parameters and they ended
up overflowing an int, causing an infinite loop at boot in bufinit().

Reviewed by:	mlaier
2005-10-14 19:15:10 +00:00
marius
dfe8329b58 Increase default HZ for sparc64 to 1000.
2005-04-16 15:07:41 +00:00
imp
20280f1431 /* -> /*- for copyright notices, minor format tweaks as necessary
2005-01-06 23:35:40 +00:00
bms
8ea3319e24 Fix the build.
2004-11-30 03:23:35 +00:00
peter
0cb38b1818 Switch from 1024hz to 1000hz on amd64 to match i386. 1024 is a bad
choice because it is so in sync with stathz (128hz or 4096hz etc).
2004-11-30 00:25:26 +00:00
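One way to quantify the "in sync" concern in 0cb38b1818 is to count how often the two clocks tick at the same instant. If both start in phase, ticks at rates hz and stathz coincide gcd(hz, stathz) times per second. The sketch below computes that figure; the function names are hypothetical and the in-phase start is an assumption.

```c
/* Greatest common divisor by Euclid's algorithm. */
unsigned
gcd_u(unsigned a, unsigned b)
{
	while (b != 0) {
		unsigned t = a % b;

		a = b;
		b = t;
	}
	return (a);
}

/*
 * Number of instants per second at which a clock ticking at hz and a
 * clock ticking at stathz fire together, assuming both start in phase.
 */
unsigned
coincident_ticks_per_second(unsigned hz, unsigned stathz)
{
	return (gcd_u(hz, stathz));
}
```

With stathz at 128, a 1024 Hz clock coincides with every one of the 128 statistics ticks, while a 1000 Hz clock coincides with only 8 of them.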
des
e836fd23ea #include <vm/vm_param.h> instead of <machine/vmparam.h> (the former
includes the latter, but also declares variables which are defined
in kern/subr_param.c).

Change some VM parameters from quad_t to unsigned long.  They refer to
quantities (size limits for text, heap and stack segments) which must
necessarily be smaller than the size of the address space, so long is
adequate on all platforms.

MFC after:	1 week
2004-11-08 18:20:02 +00:00
marcel
c361afa258 Increase default HZ for ia64 to 1000.
2004-11-08 04:50:02 +00:00
phk
e582d20cf5 Increase default HZ for i386 to 1000
2004-11-06 11:33:43 +00:00
imp
74cf37bd00 Remove advertising clause from University of California Regent's license,
per letter dated July 22, 1999.

Approved by: core
2004-04-05 21:03:37 +00:00
alc
707630ec9c White space and wording changes to init_param3().
Mostly submitted by:	bde
2004-03-30 08:00:11 +00:00
alc
521fa57364 Revise the direct or optimized case to use uiomove_fromphys() by the reader
instead of ephemeral mappings using pmap_qenter() by the writer.  The
writer is still, however, responsible for wiring the pages, just not
mapping them.  Consequently, the allocation of KVA for the direct case is
unnecessary.  Remove it and the sysctls limiting it, i.e.,
kern.ipc.maxpipekvawired and kern.ipc.amountpipekvawired.  The number
of temporarily wired pages is still, however, limited by
kern.ipc.maxpipekva.

Note: On platforms lacking a direct virtual-to-physical mapping,
uiomove_fromphys() uses sf_bufs to cache ephemeral mappings.  Thus,
the number of available sf_bufs can influence the performance of pipes
on platforms such as i386.  Surprisingly, I saw the greatest gain from this
change on such a machine: lmbench's pipe bandwidth result increased from
~1050MB/s to ~1850MB/s on my 2.4GHz, 400MHz FSB P4 Xeon.
2004-03-27 19:50:23 +00:00
peter
7a96caafd4 Set default HZ to 1024 for amd64. The comment in kern/tty.c doesn't
apply here because we have 64 bit longs and don't suffer the hz > 169
overflows.
2004-03-14 05:49:31 +00:00
silby
bd71f7b671 More pipe changes:
From alc:
Move pageable pipe memory to a separate kernel submap to avoid awkward
vm map interlocking issues.  (Bad explanation provided by me.)

From me:
Rework pipespace accounting code to handle this new layout, and adjust
our default values to account for the fact that we now have a solid
limit on allocations.

Also, remove the "maxpipes" limit, as it no longer has a purpose.
(The limit on kva usage solves the problem of having too many pipes.)
2003-08-11 05:51:51 +00:00
silby
22ad6d5be5 Add init_param3() to subr_param. This function is called
immediately after the kernel map has been sized, and is
the optimal place for the autosizing of memory allocations
which occur within the kernel map to occur.

Suggested by:	bde
2003-07-11 00:01:03 +00:00
silby
fa9cd99702 Pull in the entire kmem_map size calculation from kern_malloc, rather
than the shortcircuited version I had been using, which only worked
properly on i386 & amd64.

Also, change an autoscale constant to account for the more correct
kmem_map size.

Problem noticed by:     mux
2003-07-08 18:59:21 +00:00
silby
bba10d998e Put some concrete limits on pipe memory consumption:
- Limit the total number of pipes so that we do not
  exhaust all vm objects in the kernel map.  When
  this limit is reached, a ratelimited message will
  be printed to the console.

- Put a soft limit on the amount of memory consumable
  by pipes.  Once the limit has been reached, all new
  pipes will be limited to 4K in size, rather than the
  default of 16K.

- Put a limit on the number of pages that may be used
  for high speed page flipping in order to reduce the
  amount of wired memory.  Pipe writes that occur
  while this limit is exceeded will fall back to
  non-page flipping mode.

The above values are auto-tuned in subr_param.c and
are scaled to take into account both the size of
physical memory and the size of the kernel map.

These limits help to reduce the "kernel resources exhausted"
panics that could be caused by opening a large
number of pipes.  (Pipes alone are no longer able
to exhaust all resources, but other kernel memory hogs
in league with pipes may still be able to do so.)

PR:			53627
Ideas / comments from:	hsu, tjr, dillon@apollo.backplane.com
MFC after:		1 week
2003-07-08 04:02:31 +00:00
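The soft limit introduced in bba10d998e can be sketched as a size choice made when a pipe is created. The 16K and 4K sizes come from the commit message; the function name new_pipe_size, the parameter names, and the way the soft limit is passed in are assumptions for illustration.

```c
#include <stddef.h>

#define PIPE_SIZE_DEFAULT	(16 * 1024)	/* normal pipe buffer size */
#define PIPE_SIZE_SMALL		(4 * 1024)	/* size once the soft limit is hit */

/*
 * Pick the buffer size for a new pipe.  Existing pipes are unaffected;
 * only pipes created after the soft limit is reached get the small size.
 */
size_t
new_pipe_size(long pipekva_in_use, long soft_limit)
{
	if (pipekva_in_use >= soft_limit)
		return (PIPE_SIZE_SMALL);
	return (PIPE_SIZE_DEFAULT);
}
```

Because the limit is soft, creation never fails in this sketch; memory pressure only shrinks what each new pipe may consume.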
obrien
3b8fff9e4c Use __FBSDID().
2003-06-11 00:56:59 +00:00
peter
c3bdd669c3 Change hw.physmem and hw.usermem to unsigned long like they used to be
in the original hardwired sysctl implementation.

The buf size calculator still overflows an integer on machines with large
KVA (eg: ia64) where the number of pages does not fit into an int.  Use
'long' there.

Change Maxmem and physmem and related variables to 'long', mostly for
completeness.  Machines are not likely to overflow 'int' pages in the
near term, but then again, 640K ought to be enough for anybody.  This
comes for free on 32 bit machines, so why not?
2002-08-30 04:04:37 +00:00
phk
b6bf4c07cf Improve the implementation of adjtime(2).
Apply the change as a continuous slew rather than as a series of
discrete steps and make it possible to adjust arbitrarily huge
amounts of time in either direction.

In practice this is done by hooking into the same once-per-second
loop as the NTP PLL and setting a suitable frequency offset deducting
the amount slewed from the remainder.  If the remaining delta is
larger than 1 second we slew at 5000PPM (5msec/sec), for a delta
less than a second we slew at 500PPM (500usec/sec) and for the last
one second period we will slew at whatever rate (less than 500PPM)
it takes to eliminate the delta entirely.

The old implementation stepped the clock a number of microseconds
every HZ to achieve the same effect, using the same rates of change.

Eliminate the global variables tickadj, tickdelta and timedelta and
their various use and initializations.

This removes the most significant obstacle to running timecounter and
NTP housekeeping from a timeout rather than hardclock.
2002-04-15 12:23:11 +00:00
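The slew schedule described in b6bf4c07cf can be modelled as a once-per-second step function. This is a simplified sketch working in whole microseconds rather than a frequency offset; the function names are hypothetical, and treating a remainder of exactly one second as the slow case is an assumption, since the message does not specify it.

```c
#include <stdint.h>

#define USEC_PER_SEC	1000000LL
#define FAST_SLEW_US	5000	/* 5000 PPM: 5 ms per second */
#define SLOW_SLEW_US	500	/* 500 PPM: 500 us per second */

/*
 * Apply one second of slewing to the outstanding adjustment and return
 * the number of microseconds slewed during that second (signed).
 */
int64_t
adjtime_slew_step(int64_t *remaining_us)
{
	int64_t delta = *remaining_us;
	int64_t mag = (delta < 0) ? -delta : delta;
	int64_t step;

	if (mag > USEC_PER_SEC)
		step = FAST_SLEW_US;	/* more than a second left */
	else if (mag > SLOW_SLEW_US)
		step = SLOW_SLEW_US;	/* less than a second left */
	else
		step = mag;		/* final period: finish exactly */
	if (delta < 0)
		step = -step;
	*remaining_us = delta - step;
	return (step);
}

/* Count how many one-second periods are needed to absorb delta_us. */
int64_t
adjtime_seconds_to_finish(int64_t delta_us)
{
	int64_t seconds = 0;

	while (delta_us != 0) {
		(void)adjtime_slew_step(&delta_us);
		seconds++;
	}
	return (seconds);
}
```

For example, an adjustment of 1200 microseconds takes three periods in this model: two at 500 PPM and a final one that removes the last 200 microseconds.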
silby
e3a68020c7 Unconditionally limit maxproc so that it is not possible
to exhaust all kmaps.  The only reward for setting maxproc
to a value which will cause kmap exhaustion is a panic
during a forkbomb attack.

MFC after:	3 days
2002-03-07 04:50:36 +00:00
dillon
9371a9a23b Allow the kern.maxusers boot tunable to be set to 0 (previously only
the kernel config's maxusers could be set to 0 for autosizing to work).
Reviewed by:	rwatson, imp
MFC after:	3 days
2002-02-06 01:19:19 +00:00
dillon
e1e10af6b7 Make the 'maxusers 0' auto-sizing code slightly more conservative. Change
from 1 megabyte of ram per user to 2 megabytes of ram per user, and
reduce the cap from 512 to 384.  512 leaves around 240 MB of KVM available
while 384 leaves 270 MB of KVM available.  Available KVM is important
in order to deal with zalloc and kernel malloc area growth.

Reviewed by:	mckusick
MFC: either before 4.5 if re's agree, or after 4.5
2002-01-25 01:54:16 +00:00
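The rule in e1e10af6b7 reduces to a division and a clamp. In the sketch below, the 2 MB per user ratio and the cap of 384 come from the commit message; the lower bound of 32 and the name autosize_maxusers are assumptions added so the example is complete.

```c
#include <stdint.h>

#define BYTES_PER_USER	(2LL * 1024 * 1024)	/* 2 MB of RAM per user */
#define MAXUSERS_CAP	384			/* cap from this commit */
#define MAXUSERS_FLOOR	32			/* assumed lower bound */

/* Auto-size maxusers from the amount of physical memory. */
int
autosize_maxusers(int64_t physmem_bytes)
{
	int64_t users = physmem_bytes / BYTES_PER_USER;

	if (users < MAXUSERS_FLOOR)
		users = MAXUSERS_FLOOR;
	if (users > MAXUSERS_CAP)
		users = MAXUSERS_CAP;
	return ((int)users);
}
```

Under this rule a 256 MB machine gets 128 users, and any machine with 768 MB or more reaches the cap.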
peter
dd0f3c5ca2 Proper fix for old config setting maxusers to 8.
2001-12-14 09:39:29 +00:00
dillon
62f062ea62 Too many people are compiling kernels with maxusers set to 0 without the new
config.  Hack the kernel to force auto-sizing if the old config is used.
2001-12-14 04:01:08 +00:00