Commit Graph

48 Commits

Author SHA1 Message Date
Kip Macy
93ee134a24 Integrate support for xen in to i386 common code.
MFC after:	1 month
2008-08-15 20:51:31 +00:00
Alan Cox
fdcd29b52b Enable the automatic creation of superpage reservations. 2008-03-26 03:12:00 +00:00
Alan Cox
b8e7fc24fe Add configuration knobs for the superpage reservation system. Initially,
the reservation will only be enabled on amd64.
2007-12-27 16:45:39 +00:00
Alan Cox
7bfda801a8 Change the management of cached pages (PQ_CACHE) in two fundamental
ways:

(1) Cached pages are no longer kept in the object's resident page
splay tree and memq.  Instead, they are kept in a separate per-object
splay tree of cached pages.  However, access to this new per-object
splay tree is synchronized by the _free_ page queues lock, not to be
confused with the heavily contended page queues lock.  Consequently, a
cached page can be reclaimed by vm_page_alloc(9) without acquiring the
object's lock or the page queues lock.

This solves a problem independently reported by tegge@ and Isilon.
Specifically, they observed the page daemon consuming a great deal of
CPU time because of pages bouncing back and forth between the cache
queue (PQ_CACHE) and the inactive queue (PQ_INACTIVE).  The source of
this problem turned out to be a deadlock avoidance strategy employed
when selecting a cached page to reclaim in vm_page_select_cache().
However, the root cause was really that reclaiming a cached page
required the acquisition of an object lock while the page queues lock
was already held.  Thus, this change addresses the problem at its
root, by eliminating the need to acquire the object's lock.

Moreover, keeping cached pages in the object's primary splay tree and
memq was, in effect, optimizing for the uncommon case.  Cached pages
are reclaimed far, far more often than they are reactivated.  Instead,
this change makes reclamation cheaper, especially in terms of
synchronization overhead, and reactivation more expensive, because
reactivated pages will have to be reentered into the object's primary
splay tree and memq.

(2) Cached pages are now stored alongside free pages in the physical
memory allocator's buddy queues, increasing the likelihood that large
allocations of contiguous physical memory (i.e., superpages) will
succeed.

Finally, as a result of this change long-standing restrictions on when
and where a cached page can be reclaimed and returned by
vm_page_alloc(9) are eliminated.  Specifically, calls to
vm_page_alloc(9) specifying VM_ALLOC_INTERRUPT can now reclaim and
return a formerly cached page.  Consequently, a call to malloc(9)
specifying M_NOWAIT is less likely to fail.

Discussed with: many over the course of the summer, including jeff@,
   Justin Husted @ Isilon, peter@, tegge@
Tested by: an earlier version by kris@
Approved by: re (kensmith)
2007-09-25 06:25:06 +00:00
Alan Cox
e5c45405f0 Add the machine-specific definitions for configuring the new physical
memory allocator.

Set the size of phys_avail[] and dump_avail[] using one of these
definitions.

Approved by:	re
2007-06-05 05:17:20 +00:00
Alan Cox
c155d5d059 Eliminate an unused definition. 2007-05-27 20:34:26 +00:00
Alan Cox
04a18977c8 Define every architecture as either VM_PHYSSEG_DENSE or
VM_PHYSSEG_SPARSE depending on whether the physical address space is
densely or sparsely populated with memory.  The effect of this
definition is to determine which of two implementations of
vm_page_array and PHYS_TO_VM_PAGE() is used.  The legacy
implementation is obtained by defining VM_PHYSSEG_DENSE, and a new
implementation that trades off time for space is obtained by defining
VM_PHYSSEG_SPARSE.  For now, all architectures except for ia64 and
sparc64 define VM_PHYSSEG_DENSE.  Defining VM_PHYSSEG_SPARSE on ia64
allows the entirety of my Itanium 2's memory to be used.  Previously,
only the first 1 GB could be used.  Defining VM_PHYSSEG_SPARSE on
sparc64 allows USIIIi-based systems to boot without crashing.

This change is a combination of Nathan Whitehorn's patch and my own
work in perforce.

Discussed with: kmacy, marius, Nathan Whitehorn
PR:		112194
2007-05-05 19:50:28 +00:00
Stephane E. Potvin
0e5179e441 Add support for specifying a minimal size for vm.kmem_size in the loader via
vm.kmem_size_min. Useful when using ZFS to make sure that vm.kmem size will
be at least 256mb (for example) without forcing a particular value via vm.kmem_size.

Approved by: njl (mentor)
Reviewed by: alc
2007-04-21 01:14:48 +00:00
Ruslan Ermilov
2e137367b4 Add the PG_NX support for i386/PAE.
Reviewed by:	alc
2007-04-06 18:15:03 +00:00
David E. O'Brien
9c737de401 Increase the scaling of VM_KMEM_SIZE_MAX.
Submitted by:	alc
2004-08-16 08:35:22 +00:00
Warner Losh
f36cfd49ad Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson
2004-04-07 20:46:16 +00:00
Peter Wemm
6ccf265bb0 Commit Bosko's patch to clean up the PSE/PG_G initialization to and
avoid problems with some Pentium 4 cpus and some older PPro/Pentium2
cpus.  There are several problems, some documented in Intel errata.
This patch:
1) moves the kernel to the second page in the PSE case.  There is an
errata that says that you Must Not point a 4MB page at physical
address zero on older cpus.  We avoided bugs here due to sheer luck.
2) sets up PSE page tables right from the start in locore, rather than
trying to switch from 4K to 4M (or 2M) pages part way through the boot
sequence at the same time that we're messing with PG_G.

For some reason, the pmap work over the last 18 months seems to tickle
the problems, and the PAE infrastructure changes disturb the cpu
bugs even more.

A couple of people have reported a problem with APM bios calls during
boot.  I'll work with people to get this resolved.

Obtained from:	bmilekic
2003-10-01 23:46:08 +00:00
Peter Wemm
4580c352e1 KPT_MIN_ADDRESS and KPT_MAX_ADDRESS are not used anywhere. And if they
were, they are not safe to use outside of the kernel since these values
can change at kernel compile time - ie: we do not want them compiled into
userland binaries.
2003-05-01 00:10:38 +00:00
Jake Burkholder
46ea68dd10 Better fix for previous previous which still allows the 4megs of kva at
the top of the address space to be reclaimed.  The problem is that with
the APTD gone the mapable kernel address space runs right to the end of
the 32 bit address space.  As a max this is 0x100000000, which can't be
represented in 32 bits, so we have to use ptd entry n-1 and pte offset
n-1, instead of ptd entry n and pte offset 0.  There's still 1 page we
can't use, but we gain just under 4 megs of kva (8 megs with PAE).

Sponsored by:	DARPA, Network Associates Laboratories
2003-04-07 14:27:19 +00:00
Jake Burkholder
5cd612b27e - Removed UMAXPTDI and UMAXPTEOFF.
- Changed VM_MAXUSER_ADDRESS to be defined in terms of PTDPTDI.  In order for
  assumptions about the recursive page table map to work it must be the base
  of the recursive map.  Any pte offset that's not NPTEPG will break these
  assumptions.

Sponsored by:	DARPA, Network Associates Laboratories
2003-02-24 20:29:52 +00:00
Peter Wemm
255108f385 Make sysv-style shared memory tuneable params fully runtime adjustable
via sysctl.  It's done pretty simply but it should be quite adequate.
Also move SHMMAXPGS from $machine/include/vmparam.h as the comments that
went with it were wrong... we don't allocate KVM space for the pages so
that comment is bogus..  The only practical limit is how much physical
ram you want to lock up as this stuff isn't paged out or swap backed.
2000-03-30 07:17:05 +00:00
Peter Wemm
c3aac50f28 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
David Greenman
6704748cf6 Increased max kmem to 200MB. This should fix some out-of-kmem panics on
large systems.
1999-07-24 22:26:42 +00:00
David Greenman
f4fabec6b0 Increased MAXTSIZ to 128MB...there are binaries that get quite large.
Increased DFLDSIZ to 128MB, as it is a better default.
Reviewed by:	jkh
1998-06-12 09:10:22 +00:00
John Dyson
d9bed5bee1 Try to dynamically size the VM_KMEM_SIZE (but is still able to be overridden
in a way identically as before.)  I had problems with the system properly
handling the number of vnodes when there is alot of system memory, and the
default VM_KMEM_SIZE.  Two new options "VM_KMEM_SIZE_SCALE" and
"VM_KMEM_SIZE_MAX" have been added to support better auto-sizing for systems
with greater than 128MB.

Add some accouting for vm_zone memory allocations, and provide properly
for vm_zone allocations out of the kmem_map.  Also move the vm_zone
allocation stats to the VM OID tree from the KERN OID tree.
1998-02-23 07:42:43 +00:00
John Dyson
95461b450d 1) Start using a cleaner and more consistant page allocator instead
of the various ad-hoc schemes.
2)	When bringing in UPAGES, the pmap code needs to do another vm_page_lookup.
3)	When appropriate, set the PG_A or PG_M bits a-priori to both avoid some
	processor errata, and to minimize redundant processor updating of page
	tables.
4)	Modify pmap_protect so that it can only remove permissions (as it
	originally supported.)  The additional capability is not needed.
5)	Streamline read-only to read-write page mappings.
6)	For pmap_copy_page, don't enable write mapping for source page.
7)	Correct and clean-up pmap_incore.
8)	Cluster initial kern_exec pagin.
9)	Removal of some minor lint from kern_malloc.
10)	Correct some ioopt code.
11)	Remove some dead code from the MI swapout routine.
12)	Correct vm_object_deallocate (to remove backing_object ref.)
13)	Fix dead object handling, that had problems under heavy memory load.
14)	Add minor vm_page_lookup improvements.
15)	Some pages are not in objects, and make sure that the vm_page.c can
	properly support such pages.
16)	Add some more page deficit handling.
17)	Some minor code readability improvements.
1998-02-05 03:32:49 +00:00
Jordan K. Hubbard
7416845fdd Bump MAXDSIZ to 512MB so that soft limits have a chance to actually
regulate this.
Reviewed by:	dyson
1997-10-27 00:38:46 +00:00
Tor Egge
7bcc0f3d66 Allow the kernel configuration file to override the amount of memory
available to the kernel (VM_KMEM_SIZE). The default (32 MB) is too low
when having 512 MB or more physical memory in a server environment. This is
relevant on systems where "panic: kmem_malloc: kmem_map too small" is a
problem.
1997-06-25 20:18:58 +00:00
Peter Wemm
a0c3795f19 Use UPAGES_HOLE instead of UPAGES in case it's changed some time.
Rename the PT* index KSTK* #defines to UMAX*, since we don't have a kernel
stack there any more..

These are used to calculate VM_MAXUSER_ADDRESS and USRSTACK, and really
do not want to be changed with UPAGES since BSD/OS 2.x binary compatability
depends on it.
1997-04-07 09:30:22 +00:00
Peter Wemm
6875d25465 Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.
1997-02-22 09:48:43 +00:00
Jordan K. Hubbard
1130b656e5 Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore.  This update would have been
insane otherwise.
1997-01-14 07:20:47 +00:00
John Dyson
d0aea04fe0 Let the VM system know that on certain arch's that VM_PROT_READ
also implies VM_PROT_EXEC.  We support it that way for now,
since the break system call by default gives VM_PROT_ALL.  Now
we have a better chance of coalesing map entries when mixing
mmap/break type operations.  This was contributing to excessive
numbers of map entries on the modula-3 runtime system.  The
problem is still not "solved", but the situation makes more
sense.

Eventually, when we work on architectures where VM_PROT_READ
is orthogonal to VM_PROT_EXEC, we will have to visit this
issue carefully (esp. regarding security issues.)
1996-12-30 05:31:21 +00:00
Poul-Henning Kamp
e911eafcba removed:
CLBYTES PD_SHIFT PGSHIFT NBPG PGOFSET CLSIZELOG2 CLSIZE pdei()
        ptei() kvtopte() ptetov() ispt() ptetoav() &c &c
new:
        NPDEPG

Major macro cleanup.
1996-05-02 14:21:14 +00:00
Poul-Henning Kamp
68c1eb1215 pte.h: Add the VADDR(pdi,pti) macro to construct virtual address from
page dir+table index.
pmap.h: remove NUPDE, it was wrong and not used.  Sanitize KSTKPTEOFF.
vmparam.h: Calculate virtual addr from PDI+PTI from pmap.h rather than
using magic math.  Remove UPDT, not used.
1996-04-30 12:02:12 +00:00
David Greenman
dc92971788 Killed some historical #define cruft that we've never used in FreeBSD:
UDOT_SZ
SYSPTSIZE
USRPTSIZE
MSGBUFPTECNT
DMMIN
DMMAX
DMTEXT
USRIOSIZE
VM_PHYS_SIZE
1996-03-12 15:37:58 +00:00
David Greenman
b64b660cd3 Made "NMBCLUSTERS" calculation dynamic and fixed bogus use of "NMBCLUSTERS"
in machdep.c (it should use the global nmbclusters). Moved the calculation
of nmbclusters into conf/param.c (same place where nmbclusters has always
been assigned), and made the calculation include an extra amount based
on "maxusers". NMBCLUSTERS can still be overrided in the kernel config
file as always, but this change will make that generally unnecessary. This
fixes the "bug" reports from people who have misconfigured kernels seeing
the network hang when the mbuf cluster pool runs out.

Reviewed by:	John Dyson
1995-05-25 07:41:28 +00:00
David Greenman
0d94caffca These changes embody the support of the fully coherent merged VM buffer cache,
much higher filesystem I/O performance, and much better paging performance. It
represents the culmination of over 6 months of R&D.

The majority of the merged VM/cache work is by John Dyson.

The following highlights the most significant changes. Additionally, there are
(mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to
support the new VM/buffer scheme.

vfs_bio.c:
Significant rewrite of most of vfs_bio to support the merged VM buffer cache
scheme.  The scheme is almost fully compatible with the old filesystem
interface.  Significant improvement in the number of opportunities for write
clustering.

vfs_cluster.c, vfs_subr.c
Upgrade and performance enhancements in vfs layer code to support merged
VM/buffer cache.  Fixup of vfs_cluster to eliminate the bogus pagemove stuff.

vm_object.c:
Yet more improvements in the collapse code.  Elimination of some windows that
can cause list corruption.

vm_pageout.c:
Fixed it, it really works better now.  Somehow in 2.0, some "enhancements"
broke the code.  This code has been reworked from the ground-up.

vm_fault.c, vm_page.c, pmap.c, vm_object.c
Support for small-block filesystems with merged VM/buffer cache scheme.

pmap.c vm_map.c
Dynamic kernel VM size, now we dont have to pre-allocate excessive numbers of
kernel PTs.

vm_glue.c
Much simpler and more effective swapping code.  No more gratuitous swapping.

proc.h
Fixed the problem that the p_lock flag was not being cleared on a fork.

swap_pager.c, vnode_pager.c
Removal of old vfs_bio cruft to support the past pseudo-coherency.  Now the
code doesn't need it anymore.

machdep.c
Changes to better support the parameter values for the merged VM/buffer cache
scheme.

machdep.c, kern_exec.c, vm_glue.c
Implemented a seperate submap for temporary exec string space and another one
to contain process upages. This eliminates all map fragmentation problems
that previously existed.

ffs_inode.c, ufs_inode.c, ufs_readwrite.c
Changes for merged VM/buffer cache.  Add "bypass" support for sneaking in on
busy buffers.

Submitted by:	John Dyson and David Greenman
1995-01-09 16:06:02 +00:00
David Greenman
a6cd0a2477 Increased SHMMAXPGS from 512 to 1024 now that there is plenty of kernel
virtual memory.
1994-09-23 07:00:12 +00:00
David Greenman
cde7257454 Eliminated a whole pile of ancient (we're taking 4.3BSD) VM system
related #define constants. Corrected incorrect VM_MAX_KERNEL_ADDRESS.

Reviewed by:	John Dyson
1994-09-12 11:38:31 +00:00
David Greenman
431be400a7 Got rid of some old, unused junk. 1994-09-01 03:16:40 +00:00
Rodney W. Grimes
26f9a76710 The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.
Reviewed by:	Rodney W. Grimes
Submitted by:	John Dyson and David Greenman
1994-05-25 09:21:21 +00:00
David Greenman
29360eb099 Changed dynamic stack grow code to grow by "SGROWSIZ" amount. Initially
allocate SGROWSIZ amount of stack. Also set vm_ssize to the initial
stack VM size. Increased DFLSSIZ stack rlimit default to 8MB.
1994-03-21 09:35:24 +00:00
David Greenman
7f8cb36869 "New" VM system from John Dyson & myself. For a run-down of the
major changes, see the log of any effected file in the sys/vm
directory (swap_pager.c for instance).
1994-01-14 16:25:31 +00:00
David Greenman
e10a618657 Increased maximum and default 'size' limits to more reasonable values. 1994-01-03 16:00:52 +00:00
Garrett Wollman
aaf08d94ca Make everything compile with -Wtraditional. Make it easier to distribute
a binary link-kit.  Make all non-optional options (pagers, procfs) standard,
and update LINT to reflect new symtab requirements.

NB: -Wtraditional will henceforth be forgotten.  This editing pass was
primarily intended to detect any constructions where the old code might
have been relying on traditional C semantics or syntax.  These were all
fixed, and the result of fixing some of them means that -Wall is now a
realistic possibility within a few weeks.
1993-12-19 00:55:01 +00:00
Garrett Wollman
6e393973f5 Made all header files idempotent and moved incorrect common data from
headers into a related source file.  Added cons.h as first step towards
moving i386/i386/cons.h to machine/cons.h where it belongs.
1993-11-07 17:43:17 +00:00
Rodney W. Grimes
d42d25c451 param.h:
Mark the fact that PGSHIFT and PDRSHIFT are really the same as
PG_SHIFT and PD_SHIFT, these should be collapsed some day soon.

Document that KERNBASE should really be KPTDPTDI << PDRSHIFT, for
now leave it as the constant 0xFE000000 until I make a seperate
common header file for this stuff (vmaddresses.h?)

Remove NKMEMCLUSTERS define, it was only being used to define
VM_KMEM_SIZE, so why have all the indirection.  Besides who wants
to work in CLBYTE sizes chuncks.


pmap.h:

Fix $Id$ and some other minor format clean ups.

Remove the XXX comment about NKPDE, since it now has the correct value
of 7.

Remove unused LASTPTDI and move the APTD into the very end of memory to
free up 4MB of kernel virtual address space.
Remove unused RSVDPTDI and free up 12MB of kernel virtual address space.


vmparam.h

Fix $Id$.

Increase SHMMAXPGS to 512 (2MB) now that there is room for it to be
bigger.  The XXX comment stays until the kernel moves down in memory
to free up enough space to use the proper default of 4MB.

VM_KMEM_SIZE is now a direct constant stating the size of the kernel
malloc region.  Increased the value from 3MB to 16MB.
1993-10-15 10:07:45 +00:00
David Greenman
89ccb410d4 Correct spelling of "SHMMAXPGS" so the config override will actually work. 1993-10-09 15:29:04 +00:00
Rodney W. Grimes
b6c78fe436 define SHMMAXPGS where it is suppose to be, you can over ride this with
a kernel config options "SHMAXPGS=xxx", default is currently 64 pages
due to limit kernel map space.
1993-09-27 00:36:57 +00:00
Rodney W. Grimes
b3b174f5cd Increased stack size to 8MB just to be on the real safe side. 1993-09-01 09:38:32 +00:00
Rodney W. Grimes
8373b81238 Changed MAXSSIZ from MAXDSIZ to 2MB 1993-08-28 09:19:01 +00:00
Charlie Root
c26491ecf4 Increased default data size (DFLDSIZ) to 16MB. Need to rebuild libutil,
kernel, ps and w for this to work!
1993-07-03 21:21:35 +00:00
Rodney W. Grimes
5b81b6b301 Initial import, 0.1 + pk 0.2.4-B1 1993-06-12 14:58:17 +00:00