Commit Graph

112 Commits

Author SHA1 Message Date
Peter Wemm
654bd0e802 Reduce the size of pv entries by 15%. This saves 1MB of KVA for mapping
pv entries per 1GB of user virtual memory.  (eg: if we had 1GB file was
mmaped into 30 processes, that would theoretically reduce the KVA demand by
30MB for pv entries.  In reality though, we limit pv entries so we don't
have that many at once.)

We used to store the vm_page_t for the page table page.  But we recently
had the pa of the ptp, or can calculate it fairly quickly.  If we wanted
to avoid the shift/mask operation in pmap_pde(), we could recover the
pa but that means we have to store it for a while.

This does not measurably change performance.

Suggested by:  alc
Tested by:  alc
2004-06-29 15:57:05 +00:00
Bruce Evans
4df7435a78 Include <sys/_lock.h>'s prerequisite <sys/queue.h> before including the
former, not after.
2004-06-20 00:33:14 +00:00
Alan Cox
4d831945a7 MFamd64
Introduce pmap locking to many of the pmap functions.
2004-06-16 07:03:15 +00:00
Alan Cox
8559e0a291 - Remove an unused declaration.
- Move a definition inside the scope of a #ifdef _KERNEL.
2004-06-13 03:44:11 +00:00
Alan Cox
0af0eeacb6 - pmap_kenter_temporary()'s first parameter, which is a physical address,
should be declared as vm_paddr_t not vm_offset_t.
2004-04-10 23:28:49 +00:00
Alan Cox
b14d6acced - pmap_kenter_temporary() is unused by machine-independent code. Therefore,
move its declaration to the machine-dependent header file on those
   machines that use it.  In principle, only i386 should have it.
   Alpha and AMD64 should use their direct virtual-to-physical mapping.
 - Remove pmap_kenter_temporary() from ia64.  It is unused.  Approved
   by: marcel@
2004-04-10 22:41:46 +00:00
Warner Losh
f36cfd49ad Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson
2004-04-07 20:46:16 +00:00
Alan Cox
c8607538c8 Remove avail_start on those platforms that no longer use it. (Only amd64
does anything with it beyond simple initialization.)
2004-04-05 04:08:00 +00:00
Alan Cox
925d2fedf5 Remove unused declarations. (Some time ago, these variables became fields
of vm/vm.h's struct kva_md_info.)
2004-03-07 07:13:15 +00:00
Alan Cox
5e4a2fc9aa - Similar to post-PAE RELENG_4 split pmap_pte_quick() into two cases,
pmap_pte() and pmap_pte_quick().  The distinction being based upon the
   locks that are held by the caller.  When the given pmap is not the
   current pmap, pmap_pte() should be used when Giant is held and
   pmap_pte_quick() should be used when the vm page queues lock is held.
 - When assigning to PMAP1 or PMAP2, include PG_A anf PG_M.
 - Reenable the inlining of pmap_is_current().

In collaboration with:	tegge
2003-11-08 03:01:26 +00:00
Bruce M Simpson
2bc7dd5661 Move pmap_resident_count() from the MD pmap.h to the MI pmap.h.
Add a definition of pmap_wired_count().
Add a definition of vmspace_wired_count().

Reviewed by:	truckman
Discussed with:	peter
2003-10-06 01:47:12 +00:00
Peter Wemm
6ccf265bb0 Commit Bosko's patch to clean up the PSE/PG_G initialization to and
avoid problems with some Pentium 4 cpus and some older PPro/Pentium2
cpus.  There are several problems, some documented in Intel errata.
This patch:
1) moves the kernel to the second page in the PSE case.  There is an
errata that says that you Must Not point a 4MB page at physical
address zero on older cpus.  We avoided bugs here due to sheer luck.
2) sets up PSE page tables right from the start in locore, rather than
trying to switch from 4K to 4M (or 2M) pages part way through the boot
sequence at the same time that we're messing with PG_G.

For some reason, the pmap work over the last 18 months seems to tickle
the problems, and the PAE infrastructure changes disturb the cpu
bugs even more.

A couple of people have reported a problem with APM bios calls during
boot.  I'll work with people to get this resolved.

Obtained from:	bmilekic
2003-10-01 23:46:08 +00:00
Alan Cox
f3fd831cdd - Eliminate the pte object.
- Use kmem_alloc_nofault() rather than kmem_alloc_pageable() to allocate
   KVA space for the page directory page(s).  Submitted by: tegge
2003-09-25 02:51:06 +00:00
Jake Burkholder
14ce5bd49b Use inlines for loading and storing page table entries. Use cmpxchg8b for
the PAE case to ensure idempotent 64 bit loads and stores.

Sponsored by:	DARPA, Network Associates Laboratories
2003-04-28 20:35:36 +00:00
Jake Burkholder
ac00210525 Remove invalid cast to vm_offset_t to avoid truncating a physical address
when doing pmap_kextract on a 2MB page.

Spotted by:	peter
Sponsored by:	DARPA, Network Associates Laboratories
2003-04-08 18:22:41 +00:00
Jake Burkholder
46ea68dd10 Better fix for previous previous which still allows the 4megs of kva at
the top of the address space to be reclaimed.  The problem is that with
the APTD gone the mapable kernel address space runs right to the end of
the 32 bit address space.  As a max this is 0x100000000, which can't be
represented in 32 bits, so we have to use ptd entry n-1 and pte offset
n-1, instead of ptd entry n and pte offset 0.  There's still 1 page we
can't use, but we gain just under 4 megs of kva (8 megs with PAE).

Sponsored by:	DARPA, Network Associates Laboratories
2003-04-07 14:27:19 +00:00
Jake Burkholder
d1d03c2b72 Bandaid fix for previous commit while I figure out why it broke. This
caused crashes early in boot on i386 UP machines.

Reported by:	phk
Pointy hat to:	jake
2003-04-04 10:09:44 +00:00
Jake Burkholder
163529c2b3 - Removed APTD and associated macros, it is no longer used.
BANG BANG BANG etc.

Sponsored by:	DARPA, Network Associates Laboratories
2003-04-03 23:44:35 +00:00
Peter Wemm
cc66ebe2a9 Commit a partial lazy thread switch mechanism for i386. it isn't as lazy
as it could be and can do with some more cleanup.  Currently its under
options LAZY_SWITCH.  What this does is avoid %cr3 reloads for short
context switches that do not involve another user process.  ie: we can
take an interrupt, switch to a kthread and return to the user without
explicitly flushing the tlb.  However, this isn't as exciting as it could
be, the interrupt overhead is still high and too much blocks on Giant
still.  There are some debug sysctls, for stats and for an on/off switch.

The main problem with doing this has been "what if the process that you're
running on exits while we're borrowing its address space?" - in this case
we use an IPI to give it a kick when we're about to reclaim the pmap.

Its not compiled in unless you add the LAZY_SWITCH option.  I want to fix a
few more things and get some more feedback before turning it on by default.

This is NOT a replacement for Bosko's lazy interrupt stuff.  This was more
meant for the kthread case, while his was for interrupts.  Mine helps a
little for interrupts, but his helps a lot more.

The stats are enabled with options SWTCH_OPTIM_STATS - this has been a
pseudo-option for years, I just added a bunch of stuff to it.

One non-trivial change was to select a new thread before calling
cpu_switch() in the first place.  This allows us to catch the silly
case of doing a cpu_switch() to the current process.  This happens
uncomfortably often.  This simplifies a bit of the asm code in cpu_switch
(no longer have to call choosethread() in the middle).  This has been
implemented on i386 and (thanks to jake) sparc64.  The others will come
soon.  This is actually seperate to the lazy switch stuff.

Glanced at by:  jake, jhb
2003-04-02 23:53:30 +00:00
Jake Burkholder
7ab9b220d9 - Add support for PAE and more than 4 gigs of ram on x86, dependent on the
kernel opition 'options PAE'.  This will only work with device drivers which
  either use busdma, or are able to handle 64 bit physical addresses.

Thanks to Lanny Baron from FreeBSD Systems for the loan of a test machine
with 6 gigs of ram.

Sponsored by:	DARPA, Network Associates Laboratories, FreeBSD Systems
2003-03-30 05:24:52 +00:00
Jake Burkholder
de54353fb8 - Remove invalid casts.
Sponsored by:	DARPA, Network Associates Laboratories
2003-03-30 01:44:16 +00:00
Jake Burkholder
aea57872f0 - Convert all uses of pmap_pte and get_ptbase to pmap_pte_quick. When
accessing an alternate address space this causes 1 page table page at
  a time to be mapped in, rather than using the recursive mapping technique
  to map in an entire alternate address space.  The recursive mapping
  technique changes large portions of the address space and requires global
  tlb flushes, which seem to cause problems when PAE is enabled.  This will
  also allow IPIs to be avoided when mapping in new page table pages using
  the same technique as is used for pmap_copy_page and pmap_zero_page.

Sponsored by:	DARPA, Network Associates Laboratories
2003-03-30 01:16:19 +00:00
Jake Burkholder
227f9a1c58 - Add vm_paddr_t, a physical address type. This is required for systems
where physical addresses larger than virtual addresses, such as i386s
  with PAE.
- Use this to represent physical addresses in the MI vm system and in the
  i386 pmap code.  This also changes the paddr parameter to d_mmap_t.
- Fix printf formats to handle physical addresses >4G in the i386 memory
  detection code, and due to kvtop returning vm_paddr_t instead of u_long.

Note that this is a name change only; vm_paddr_t is still the same as
vm_offset_t on all currently supported platforms.

Sponsored by:	DARPA, Network Associates Laboratories
Discussed with:	re, phk (cdevsw change)
2003-03-25 00:07:06 +00:00
Jake Burkholder
5501d40bb9 Made the prototypes for pmap_kenter and pmap_kremove MD. These functions
are machine dependent because they are not required to update the tlb when
mappings are added or removed, and doing so is machine dependent.
In addition, an implementation may require that pages mapped with pmap_kenter
have a backing vm_page_t, which is not necessarily true of all physical
pages, and so may choose to pass the vm_page_t to pmap_kenter instead of the
physical address in order to make this requirement clear.
2003-03-16 04:16:03 +00:00
Alan Cox
8480cd45a5 Remove some long unused declarations. (For example, the PV flags have not
been used since revision 1.8, roughly nine years ago.)
2003-02-27 20:13:20 +00:00
Jake Burkholder
0f1a7e05a2 - Added inlines pmap_is_current, pmap_is_alternate and pmap_set_alternate
for testing and setting the current and alternate address spaces.
- Changed PTDpde and APTDpde to arrays to support multiple page directory
  pages.

ponsored by:	DARPA, Network Associates Laboratories
2003-02-25 19:40:21 +00:00
Jake Burkholder
5cd612b27e - Removed UMAXPTDI and UMAXPTEOFF.
- Changed VM_MAXUSER_ADDRESS to be defined in terms of PTDPTDI.  In order for
  assumptions about the recursive page table map to work it must be the base
  of the recursive map.  Any pte offset that's not NPTEPG will break these
  assumptions.

Sponsored by:	DARPA, Network Associates Laboratories
2003-02-24 20:29:52 +00:00
Jake Burkholder
ef49a94104 Previous commit missed a 1 that should be NGPTD, and an NPDEPG that should
be NPDEPTD.  Grumble.

Sponsored by:	DARPA, Network Associates Laboratories
2003-02-23 22:12:08 +00:00
Jake Burkholder
910548dea7 - Added macros NPGPTD, NBPTD, and NPDEPTD, for dealing with the size of the
page directory.
- Use these instead of the magic constants 1 or PAGE_SIZE where appropriate.
  There are still numerous assumptions that the page directory is exactly
  1 page.

Sponsored by:	DARPA, Network Associates Laboratories
2003-02-23 21:20:00 +00:00
Jake Burkholder
e29632c9e1 - Added macros PDESHIFT and PTESHIFT, use these instead of magic constants
in locore.
- Removed the macros PTESIZE and PDESIZE, use sizeof instead in C.

Sponsored by:	DARPA, Network Associates Laboratories
2003-02-23 09:45:50 +00:00
Alan Cox
01a06ce250 The root of the splay tree maintained within the pm_pteobj always refers
to the last accessed pte page.  Thus, the pm_ptphint is redundant and can
be removed.
2003-02-22 23:43:08 +00:00
Alan Cox
8365d52166 o Introduce pmap_page_is_mapped(). Its purpose is to obsolete
the PG_MAPPED flag.
2002-08-05 03:40:28 +00:00
Peter Wemm
f1b665c8fe Revive backed out pmap related changes from Feb 2002. The highlights are:
- It actually works this time, honest!
- Fine grained TLB shootdowns for SMP on i386.  IPI's are very expensive,
  so try and optimize things where possible.
- Introduce ranged shootdowns that can be done as a single IPI.
- PG_G support for i386
- Specific-cpu targeted shootdowns.  For example, there is no sense in
  globally purging the TLB cache for where we are stealing a page from
  the local unshared process on the local cpu.  Use pm_active to track
  this.
- Add some instrumentation for the tlb shootdown code.
- Rip out SMP code from <machine/cpufunc.h>
- Try and fix some very bogus PG_G and PG_PS interactions that were bad
  enough to cause vm86 bios calls to break.  vm86 depended on our existing
  bugs and this was the cause of the VESA panics last time.
- Fix the silly one-line error that caused the 'panic: bad pte' last time.
- Fix a couple of other silly one-line errors that should have caused more
  pain than they did.

Some more work is needed:
- pmap_{zero,copy}_page[_idle].  These can be done without IPI's if we
  have a hook in cpu_switch.
- The IPI handlers need some cleanup.  I have a bogus %ds load that can
  be avoided.
- APTD handling is rather bogus and appears to be a large source of
  global TLB IPI shootdowns for no really good reason.

I see speedups of between 1.5% and ~4% on buildworlds in a while 1 loop.
I expect to see a bigger difference when there is significant pageout
activity or the system otherwise has memory shortages.

I have backed out a few optimizations that I had been using over the last
few days in order to be a little more conservative.  I'll revisit these
again over the next few days as the dust settles.

New option:  DISABLE_PG_G - In case I missed something.
2002-07-12 07:56:11 +00:00
Peter Wemm
fcdc053233 Cosmetic. Remove #if 0 definition of vtophys() - it predates 4MB pages.
Remove avtophys(), it isn't referenced anywhere.
2002-07-08 08:14:28 +00:00
Peter Wemm
db17c6fc07 Tidy up some loose ends.
i386/ia64/alpha - catch up to sparc64/ppc:
- replace pmap_kernel() with refs to kernel_pmap
- change kernel_pmap pointer to (&kernel_pmap_store)
  (this is a speedup since ld can set these at compile/link time)
all platforms (as suggested by jake):
- gc unused pmap_reference
- gc unused pmap_destroy
- gc unused struct pmap.pm_count
(we never used pm_count - we track address space sharing at the vmspace)
2002-04-29 07:43:16 +00:00
Alfred Perlstein
b63dc6ad47 Remove __P. 2002-03-20 05:48:58 +00:00
Peter Wemm
d1693e1701 Back out all the pmap related stuff I've touched over the last few days.
There is some unresolved badness that has been eluding me, particularly
affecting uniprocessor kernels.  Turning off PG_G helped (which is a bad
sign) but didn't solve it entirely.  Userland programs still crashed.
2002-02-27 09:51:33 +00:00
Peter Wemm
6bd95d70db Work-in-progress commit syncing up pmap cleanups that I have been working
on for a while:
- fine grained TLB shootdown for SMP on i386
- ranged TLB shootdowns.. eg: specify a range of pages to shoot down with
  a single IPI, since the IPI is very expensive.  Adjust some callers
  that used to trigger this inside tight loops to do a ranged shootdown
  at the end instead.
- PG_G support for SMP on i386 (options ENABLE_PG_G)
- defer PG_G activation till after we decide what we are going to do with
  PSE and the 4MB pages at the start of the kernel.  This should solve
  some rumored strangeness about stale PG_G entries getting stuck
  underneath the 4MB pages.
- add some instrumentation for the fine TLB shootdown
- convert some asm instruction wrappers from functions to inlines.  gcc
  seems to do a fair bit better with this.
- [temporarily!] pessimize the tlb shootdown IPI handlers.  I will fix
  this again shortly.

This has been working fairly well for me for a while, but I have tweaked
it again prior to commit since my last major testing round.  The only
outstanding problem that I know of is PG_G related, which is why there
is an option for it (not on by default for SMP).  I have seen a world
speedups by a few percent (as much as 4 or 5% in one case) but I have
*not* accurately measured this - I am a bit sceptical of these numbers.
2002-02-25 23:49:51 +00:00
Peter Wemm
963131fe0a Tidy up some warnings 2002-02-25 21:42:23 +00:00
Peter Wemm
6a3e90ef2f Some more tidy-up of stray "unsigned" variables instead of p[dt]_entry_t
etc.
2002-02-20 01:05:57 +00:00
Peter Wemm
6729cb8800 Start bringing i386/pmap.c into line with cleanups that were done to
alpha pmap.  In particular -
- pd_entry_t and pt_entry_t are now u_int32_t instead of a pointer.
  This is to enable cleaner PAE and x86-64 support down the track sor
  that we can change the pd_entry_t/pt_entry_t types to 64 bit entities.
- Terminate "unsigned *ptep, pte" with extreme prejudice and use the
  correct pt_entry_t/pd_entry_t types.
- Various other cosmetic changes to match cleanups elsewhere.
- This eliminates a boatload of casts.
- use VM_MAXUSER_ADDRESS in place of UPT_MIN_ADDRESS in a couple of places
  where we're testing user address space limits.  Assuming the page tables
  start directly after the end of user space is not a safe assumption.
There is still more to go.
2001-11-17 01:38:32 +00:00
Peter Wemm
f83fbaf22d Introduce a new option, KVA_SPACE, which can be used to reconfigure
the size of the kernel virtual address space relatively painlessly.
Userland will adapt via the exported kernbase symbol.  Increasing
this causes the user part of address space to reduce.
2001-09-21 06:23:03 +00:00
Peter Wemm
083e9ed543 Increase NKPT from 17 to 30. This fixes the 4GB ram boot panic on both
-current and RELENG_4 with GENERIC.

NKPT is the number of initial bootstrap page table pages we create for
the kernel during startup. Once VM is up, we resize it as needed, but
with 4G ram, the size of the vm_page_t structures was pushing it over
the limit.  The fact that trimmed down kernels boot on 4G ram machines
suggests that we were pretty close to the edge.

The "30" is arbitary, but smaller than the 'nkpt' variable on all
machines that I checked.
2000-11-30 01:53:02 +00:00
Tor Egge
d9b05734cc Prepare for a cleanup of pmap module API pollution introduced by the
suggested fix in PR 12378.

Keep track of all existing pmaps independent of existing processes.

This allows for a process to temporarily connect to a different address
space without the risk of missing an update of the original address space if
the kernel grows.

pmap_pinit2() is no longer needed on the i386 platform but is left as a
stub until the alpha pmap code is updated.

PR:		12378
2000-08-16 21:24:44 +00:00
Jake Burkholder
e39756439c Back out the previous change to the queue(3) interface.
It was not discussed and should probably not happen.

Requested by:		msmith and others
2000-05-26 02:09:24 +00:00
Jake Burkholder
740a1973a6 Change the way that the queue(3) structures are declared; don't assume that
the type argument to *_HEAD and *_ENTRY is a struct.

Suggested by:	phk
Reviewed by:	phk
Approved by:	mdodd
2000-05-23 20:41:01 +00:00
Peter Wemm
0385347c1a Implement an optimization of the VM<->pmap API. Pass vm_page_t's directly
to various pmap_*() functions instead of looking up the physical address
and passing that.  In many cases, the first thing the pmap code was doing
was going to a lot of trouble to get back the original vm_page_t, or
it's shadow pv_table entry.

Inspired by: John Dyson's 1998 patches.

Also:
Eliminate pv_table as a seperate thing and build it into a machine
dependent part of vm_page_t.  This eliminates having a seperate set of
structions that shadow each other in a 1:1 fashion that we often went to
a lot of trouble to translate from one to the other. (see above)
This happens to save 4 bytes of physical memory for each page in the
system.  (8 bytes on the Alpha).

Eliminate the use of the phys_avail[] array to determine if a page is
managed (ie: it has pv_entries etc).  Store this information in a flag.
Things like device_pager set it because they create vm_page_t's on the
fly that do not have pv_entries.  This makes it easier to "unmanage" a
page of physical memory (this will be taken advantage of in subsequent
commits).

Add a function to add a new page to the freelist.  This could be used
for reclaiming the previously wasted pages left over from preloaded
loader(8) files.

Reviewed by:	dillon
2000-05-21 12:50:18 +00:00
Peter Wemm
664a31e496 Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL"
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot).  This is consistant with the other
BSD's who made this change quite some time ago.  More commits to come.
1999-12-29 04:46:21 +00:00
Peter Wemm
7ba4a77549 Reclaim UPAGES_HOLE (8k) that was chopped out of process address space.
The UPAGES have not been there since Jan '96, but the hole was preserved
for BSD/OS binary compatability.  This has been fixed other ways (%ebx
now has a pointer to PS_STRINGS), and the stack is nowhere near where
it used to be so this hack isn't required anymore.
1999-12-11 10:54:06 +00:00
Peter Wemm
1a16554b8f Make pmap_mapdev() deal with non-page-aligned requests.
Add a corresponding pmap_unmapdev() to release the KVM back to kernel_map.
1999-09-11 20:31:32 +00:00