Commit Graph

324 Commits

Author SHA1 Message Date
Peter Wemm
e6592ee55c Collect N identical (or near identical) mkdumpheader() implementations into
one, as threatened in the comment.  Textdump magic can be passed in.
2008-10-01 22:08:53 +00:00
Marius Strobl
6f04e7b9aa Remove ipi_all() and ipi_self() as the former hasn't been used at
all to date and the latter also is only used in ia64 and powerpc
code which no longer serves a real purpose after bring-up and just
can be removed as well. Note that architectures like sun4u also
provide no means of implementing IPI'ing a CPU itself natively
in the first place.

Suggested by:	jhb
Reviewed by:	arch, grehan, jhb
2008-09-28 18:34:14 +00:00
Marius Strobl
d8f4dd34d6 Work around Cheetah+ erratum 34 (USIII+ erratum #10) by relocating
the locked entry in it16 slot 0, which typically is occupied by the
PROM, and manually entering locked entries in slots != 0.

Thanks to Hubert Feyrer for donating the Blade 2000 this change was
developed on.
2008-09-10 20:07:08 +00:00
Marius Strobl
fd9d647916 MFsparc64: r177642
Remove sysbeep() from the non-beeping archs.
2008-09-02 21:35:57 +00:00
Marius Strobl
b12572f400 Resurrect clock.c from r164371. 2008-09-02 21:28:04 +00:00
Ed Schouten
bc093719ca Integrate the new MPSAFE TTY layer to the FreeBSD operating system.
The last half year I've been working on a replacement TTY layer for the
FreeBSD kernel. The new TTY layer was designed to improve the following:

- Improved driver model:

  The old TTY layer has a driver model that is not abstract enough to
  make it friendly to use. A good example is the output path, where the
  device drivers directly access the output buffers. This means that an
  in-kernel PPP implementation must always convert network buffers into
  TTY buffers.

  If a PPP implementation would be built on top of the new TTY layer
  (still needs a hooks layer, though), it would allow the PPP
  implementation to directly hand the data to the TTY driver.

- Improved hotplugging:

  With the old TTY layer, it isn't entirely safe to destroy TTY's from
  the system. This implementation has a two-step destructing design,
  where the driver first abandons the TTY. After all threads have left
  the TTY, the TTY layer calls a routine in the driver, which can be
  used to free resources (unit numbers, etc).

  The pts(4) driver also implements this feature, which means
  posix_openpt() will now return PTY's that are created on the fly.

- Improved performance:

  One of the major improvements is the per-TTY mutex, which is expected
  to improve scalability when compared to the old Giant locking.
  Another change is the unbuffered copying to userspace, which is both
  used on TTY device nodes and PTY masters.

Upgrading should be quite straightforward. Unlike previous versions,
existing kernel configuration files do not need to be changed, except
when they reference device drivers that are listed in UPDATING.

Obtained from:		//depot/projects/mpsafetty/...
Approved by:		philip (ex-mentor)
Discussed:		on the lists, at BSDCan, at the DevSummit
Sponsored by:		Snow B.V., the Netherlands
dcons(4) fixed by:	kan
2008-08-20 08:31:58 +00:00
John Baldwin
70d12a18f2 Export 'struct pcpu' to userland w/o requiring _KERNEL. A few ports
already define _KERNEL to get to this and I'm about to add hooks to
libkvm to access per-CPU data.

MFC after:	1 week
2008-08-19 19:53:52 +00:00
Bjoern A. Zeeb
603724d3ab Commit step 1 of the vimage project, (network stack)
virtualization work done by Marko Zec (zec@).

This is the first in a series of commits over the course
of the next few weeks.

Mark all uses of global variables to be virtualized
with a V_ prefix.
Use macros to map them back to their global names for
now, so this is a NOP change only.

We hope to have caught at least 85-90% of what is needed
so we do not invalidate a lot of outstanding patches again.

Obtained from:	//depot/projects/vimage-commit2/...
Reviewed by:	brooks, des, ed, mav, julian,
		jamie, kris, rwatson, zec, ...
		(various people I forgot, different versions)
		md5 (with a bit of help)
Sponsored by:	NLnet Foundation, The FreeBSD Foundation
X-MFC after:	never
V_Commit_Message_Reviewed_By:	more people than the patch
2008-08-17 23:27:27 +00:00
Marius Strobl
0b1bfc4986 - Reimplement {d,i}tlb_enter() and {d,i}tlb_va_to_pa() in C. There's
no particular reason for them to be implemented in assembler and
  having them in C allows easier extension as well as using more C
  macros and {d,i}tlb_slot_max rather than hard-coding magic (and
  actually spitfire-only) values.
- Fix the compilation of pmap_print_tte().
- Change pmap_print_tlb() to use ldxa() rather than re-rolling it
  inline as well as TLB_DAR_SLOT and {d,i}tlb_slot_max rather than
  hardcoding magic (and actually spitfire-only) values.
- While at it, suffix the above mentioned functions with "_sun4u" to
  underline they're architecture-specific.
- Use __FBSDID and macros instead of magic values in locore.S.
- Remove unused includes and smp_stack in locore.S.
2008-08-07 22:46:25 +00:00
Ed Schouten
200d80cd74 Disconnect drivers that haven't been ported to MPSAFE TTY yet.
As clearly mentioned on the mailing lists, there is a list of drivers
that have not been ported to the MPSAFE TTY layer yet. Remove them from
the kernel configuration files. This means people can now still use
these drivers if they explicitly put them in their kernel configuration
file, which is good.

People should keep in mind that after August 10, these drivers will not
work anymore. Even though owners of the hardware are capable of getting
these drivers working again, I will see if I can at least get them to a
compilable state (if time permits).
2008-08-03 10:32:17 +00:00
Xin LI
dbd47f1592 Add HWPMC_HOOKS to GENERIC kernels, this makes hwpmc.ko work out
of the box.
2008-07-07 22:55:11 +00:00
Marius Strobl
1239136645 Given that sun4u uses sparc64/sparc64/in_cksum.c, use the sparc64
<machine/in_cksum.h> here also.

MFC after:	3 days
2008-06-25 21:03:26 +00:00
Ed Schouten
721351876c Remove the unused major/minor numbers from iodev and memdev.
Now that st_rdev is being automatically generated by the kernel, there
is no need to define static major/minor numbers for the iodev and
memdev. We still need the minor numbers for the memdev, however, to
distinguish between /dev/mem and /dev/kmem.

Approved by:	philip (mentor)
2008-06-25 07:45:31 +00:00
David E. O'Brien
99f233296d Use the "options " spelling (vs. "options<TAB>") so that commented lines
line up nicely.
2008-05-21 03:36:53 +00:00
Alan Cox
1ec1304bdb Retire pmap_addr_hint(). It is no longer used. 2008-05-18 04:16:57 +00:00
Alan Cox
2d17f90775 Add a stub for pmap_align_superpage() on machines that don't (yet)
implement pmap-level support for superpages.
2008-05-09 23:31:42 +00:00
Peter Wemm
43d7128c14 Expand kdb_alt_break a little, most commonly used with the option
ALT_BREAK_TO_DEBUGGER.  In addition to "Enter ~ ctrl-B" (to enter the
debugger), there is now "Enter ~ ctrl-P" (force panic) and
"Enter ~ ctrl-R" (request clean reboot, ala ctrl-alt-del on syscons).

We've used variations of this at work.  The force panic sequence is
best used with KDB_UNATTENDED for when you just want it to dump and
get on with it.

The reboot request is a safer way of getting into single user than
a power cycle.  eg: you've hosed the ability to log in (pam, rtld, etc).
It gives init the reboot signal, which causes an orderly reboot.

I've taken my best guess at what the !x86 and non-sio code changes
should be.

This also makes sio release its spinlock before calling KDB/DDB.
2008-05-04 23:29:38 +00:00
Marius Strobl
0b5a77c6a4 Remove an header which is unused for sun4v.
MFC after:	3 days
2008-05-02 17:44:18 +00:00
Marius Strobl
b7ee09f7b0 Remove the MD isa_irq_pending() and the underlying PCI-specific
infrastructure. Its only consumer ever was sio(4) and thus was
unused on sparc64 since removing the last traces of sio(4) in
sparc64 configuration files in favor for uart(4) over three
years ago. If similar functionality is required again it should
be brought back as an MD intr_pending() which works for all
busses by using for example interrupt controller hooks.
2008-04-26 11:01:38 +00:00
Jeff Roberson
6c47aaae12 - Add an integer argument to idle to indicate how likely we are to wake
from idle over the next tick.
 - Add a new MD routine, cpu_wake_idle() to wakeup idle threads who are
   suspended in cpu specific states.  This function can fail and cause the
   scheduler to fall back to another mechanism (ipi).
 - Implement support for mwait in cpu_idle() on i386/amd64 machines that
   support it.  mwait is a higher performance way to synchronize cpus
   as compared to hlt & ipis.
 - Allow selecting the idle routine by name via sysctl machdep.idle.  This
   replaces machdep.cpu_idle_hlt.  Only idle routines supported by the
   current machine are permitted.

Sponsored by:	Nokia
2008-04-25 05:18:50 +00:00
Poul-Henning Kamp
0051271e12 Make genclock standard on all platforms.
Thanks to: grehan & marcel for platform support on ia64 and ppc.
2008-04-21 10:09:55 +00:00
Jeff Roberson
9b33b154b5 - Add the interrupt vector number to intr_event_create so MI code can
lookup hard interrupt events by number.  Ignore the irq# for soft intrs.
 - Add support to cpuset for binding hardware interrupts.  This has the
   side effect of binding any ithread associated with the hard interrupt.
   As per restrictions imposed by MD code we can only bind interrupts to
   a single cpu presently.  Interrupts can be 'unbound' by binding them
   to all cpus.

Reviewed by:	jhb
Sponsored by:	Nokia
2008-04-11 03:26:41 +00:00
John Baldwin
1ee1b68792 Add a MI intr_event_handle() routine for the non-INTR_FILTER case. This
allows all the INTR_FILTER #ifdef's to be removed from the MD interrupt
code.
- Rename the intr_event 'eoi', 'disable', and 'enable' hooks to
  'post_filter', 'pre_ithread', and 'post_ithread' to be less x86-centric.
  Also, add a comment describe what the MI code expects them to do.
- On amd64, i386, and powerpc this is effectively a NOP.
- On arm, don't bother masking the interrupt unless the ithread is
  scheduled in the non-INTR_FILTER case to match what INTR_FILTER did.
  Also, don't bother unmasking the interrupt in the post_filter case if
  we never masked it.  The INTR_FILTER case had been doing this by having
  arm_unmask_irq for the post_filter (formerly 'eoi') hook.
- On ia64, stray interrupts are now masked for the non-INTR_FILTER case.
  They were already masked in the INTR_FILTER case.
- On sparc64, use the a NULL pre_ithread hook and use intr_enable_eoi() for
  both the 'post_filter' and 'post_ithread' hooks to match what the
  non-INTR_FILTER code did.
- On sun4v, retire the ithread wrapper hack by using an appropriate
  'post_ithread' hook instead (it's what 'post_ithread'/'enable' was
  designed to do even in 5.x).

Glanced at by:	piso
Reviewed by:	marius
Requested by:	marius [1], [5]
Tested on:	amd64, i386, arm, sparc64
2008-04-05 19:58:30 +00:00
Doug Rabson
fa9d9930ca Add kernel module support for nfslockd and krpc. Use the module system
to detect (or load) kernel NLM support in rpc.lockd. Remove the '-k'
option to rpc.lockd and make kernel NLM the default. A user can still
force the use of the old user NLM by building a kernel without NFSLOCKD
and/or removing the nfslockd.ko module.
2008-03-27 11:54:20 +00:00
John Birrell
e483943791 When building a kernel module, define MAXCPU the same as SMP so
that modules work with and without SMP.
2008-03-27 05:03:26 +00:00
Poul-Henning Kamp
8c0b6469bf Remove two variables which are handled MI now. 2008-03-26 20:28:52 +00:00
Poul-Henning Kamp
e465985885 The "free-lance" timer in the i8254 is only used for the speaker
these days, so de-generalize the acquire_timer/release_timer api
to just deal with speakers.

The new (optional) MD functions are:
	timer_spkr_acquire()
	timer_spkr_release()
and
	timer_spkr_setfreq()

the last of which configures the timer to generate a tone of a given
frequency, in Hz instead of 1/1193182th of seconds.

Drop entirely timer2 on pc98, it is not used anywhere at all.

Move sysbeep() to kern/tty_cons.c and use the timer_spkr*() if
they exist, and do nothing otherwise.

Remove prototypes and empty acquire-/release-timer() and sysbeep()
functions from the non-beeping archs.

This eliminate the need for the speaker driver to know about
i8254frequency at all.  In theory this makes the speaker driver MI,
contingent on the timer_spkr_*() functions existing but the driver
does not know this yet and still attaches to the ISA bus.

Syscons is more tricky, in one function, sc_tone(), it knows the hz
and things are just fine.

In the other function, sc_bell() it seems to get the period from
the KDMKTONE ioctl in terms if 1/1193182th second, so we hardcode
the 1193182 and leave it at that.  It's probably not important.

Change a few other sysbeep() uses which obviously knew that the
argument was in terms of i8254 frequency, and leave alone those
that look like people thought sysbeep() took frequency in hertz.

This eliminates the knowledge of i8254_freq from all but the actual
clock.c code and the prof_machdep.c on amd64 and i386, where I think
it would be smart to ask for help from the timecounters anyway [TBD].
2008-03-26 20:09:21 +00:00
Poul-Henning Kamp
b416a29043 Remove old sysctl stuff which is long gone in other arch's. 2008-03-26 13:03:51 +00:00
Pawel Jakub Dawidek
ab35440fa1 Oops. Use atomic_add_long() for atomic_fetchadd_long() (not atomic_add_int())
for sparc64 and sun4v.

Noticed by:	marius
2008-03-19 07:27:24 +00:00
John Baldwin
07f7fccaaf Catch up to intr_event_create() prototype change.
Pointy hat:	jhb
2008-03-18 13:31:45 +00:00
Pawel Jakub Dawidek
6eb4157ffc Implement atomic_fetchadd_long() for all architectures and document it.
Reviewed by:	attilio, jhb, jeff, kris (as a part of the uidinfo_waitfree.patch)
2008-03-16 21:20:50 +00:00
John Baldwin
eaf86d1678 Add preliminary support for binding interrupts to CPUs:
- Add a new intr_event method ie_assign_cpu() that is invoked when the MI
  code wishes to bind an interrupt source to an individual CPU.  The MD
  code may reject the binding with an error.  If an assign_cpu function
  is not provided, then the kernel assumes the platform does not support
  binding interrupts to CPUs and fails all requests to do so.
- Bind ithreads to CPUs on their next execution loop once an interrupt
  event is bound to a CPU.  Only shared ithreads are bound.  We currently
  leave private ithreads for drivers using filters + ithreads in the
  INTR_FILTER case unbound.
- A new intr_event_bind() routine is used to bind an interrupt event to
  a CPU.
- Implement binding on amd64 and i386 by way of the existing pic_assign_cpu
  PIC method.
- For x86, provide a 'intr_bind(IRQ, cpu)' wrapper routine that looks up
  an interrupt source and binds its interrupt event to the specified CPU.
  MI code can currently (ab)use this by doing:

	intr_bind(rman_get_start(irq_res), cpu);

  however, I plan to add a truly MI interface (probably a bus_bind_intr(9))
  where the implementation in the x86 nexus(4) driver would end up calling
  intr_bind() internally.

Requested by:	kmacy, gallatin, jeff
Tested on:	{amd64, i386} x {regular, INTR_FILTER}
2008-03-14 19:41:48 +00:00
Jeff Roberson
32c9d3a767 - Rather than repeating the same preemption code everywhere call the scheduler
specific sched_preempt() routine.
2008-03-10 01:32:48 +00:00
Jeff Roberson
81aa71755b - Remove the old smp cpu topology specification with a new, more flexible
tree structure that encodes the level of cache sharing and other
   properties.
 - Provide several convenience functions for creating one and two level
   cpu trees as well as a default flat topology.  The system now always
   has some topology.
 - On i386 and amd64 create a seperate level in the hierarchy for HTT
   and multi-core cpus.  This will allow the scheduler to intelligently
   load balance non-uniform cores.  Presently we don't detect what level
   of the cache hierarchy is shared at each level in the topology.
 - Add a mechanism for testing common topologies that have more information
   than the MD code is able to provide via the kern.smp.topology tunable.
   This should be considered a debugging tool only and not a stable api.

Sponsored by:	Nokia
2008-03-02 07:58:42 +00:00
Ruslan Ermilov
007b1b7bae Add a wrapper function that bound checks writes to the dump device. 2008-01-28 19:04:07 +00:00
Alan Cox
eb2a051720 Add an access type parameter to pmap_enter(). It will be used to implement
superpage promotion.

Correct a style error in kmem_malloc(): pmap_enter()'s last parameter is
a Boolean.
2008-01-03 07:34:34 +00:00
Alan Cox
b8e7fc24fe Add configuration knobs for the superpage reservation system. Initially,
the reservation will only be enabled on amd64.
2007-12-27 16:45:39 +00:00
Robert Watson
3de213cc00 Add a new 'why' argument to kdb_enter(), and a set of constants to use
for that argument.  This will allow DDB to detect the broad category of
reason why the debugger has been entered, which it can use for the
purposes of deciding which DDB script to run.

Assign approximate why values to all current consumers of the
kdb_enter() interface.
2007-12-25 17:52:02 +00:00
Joseph Koshy
0da7aa7a7d Add stubs to unbreak LINT. 2007-12-07 13:45:47 +00:00
Robert Watson
3c90d1ea74 Break out stack(9) from ddb(4):
- Introduce per-architecture stack_machdep.c to hold stack_save(9).
- Introduce per-architecture machine/stack.h to capture any common
  definitions required between db_trace.c and stack_machdep.c.
- Add new kernel option "options STACK"; we will build in stack(9) if it is
  defined, or also if "options DDB" is defined to provide compatibility
  with existing users of stack(9).

Add new stack_save_td(9) function, which allows the capture of a stacktrace
of another thread rather than the current thread, which the existing
stack_save(9) was limited to.  It requires that the thread be neither
swapped out nor running, which is the responsibility of the consumer to
enforce.

Update stack(9) man page.

Build tested:	amd64, arm, i386, ia64, powerpc, sparc64, sun4v
Runtime tested:	amd64 (rwatson), arm (cognet), i386 (rwatson)
2007-12-02 20:40:35 +00:00
John Birrell
a9445e17cc Adjust the padding to account for the change of size of the MI part
of struct pcpu.
2007-11-29 20:50:40 +00:00
Attilio Rao
573c6b82df Make ADAPTIVE_GIANT as the default in the kernel and remove the option.
Currently, Giant is not too much contented so that it is ok to treact it
like any other mutexes.

Please don't forget to update your own custom config kernel files.

Approved by:	cognet, marcel (maintainers of arches where option is
		not enabled at the moment)
2007-11-28 05:50:45 +00:00
John Birrell
3aabc4d901 __builtin_stdarg_start was renamed to __builtin_va_start a long
time ago (2002 according to the gcc log). Using the proper name
fixes a warning in src/lib/libc/gen/ulimit.c about the second
argument of va_start() not being the last named (when it really
was).
2007-11-19 07:34:57 +00:00
Alan Cox
59677d3c0e Prevent the leakage of wired pages in the following circumstances:
First, a file is mmap(2)ed and then mlock(2)ed.  Later, it is truncated.
Under "normal" circumstances, i.e., when the file is not mlock(2)ed, the
pages beyond the EOF are unmapped and freed.  However, when the file is
mlock(2)ed, the pages beyond the EOF are unmapped but not freed because
they have a non-zero wire count.  This can be a mistake.  Specifically,
it is a mistake if the sole reason why the pages are wired is because of
wired, managed mappings.  Previously, unmapping the pages destroys these
wired, managed mappings, but does not reduce the pages' wire count.
Consequently, when the file is unmapped, the pages are not unwired
because the wired mapping has been destroyed.  Moreover, when the vm
object is finally destroyed, the pages are leaked because they are still
wired.  The fix is to reduce the pages' wired count by the number of
wired, managed mappings destroyed.  To do this, I introduce a new pmap
function pmap_page_wired_mappings() that returns the number of managed
mappings to the given physical page that are wired, and I use this
function in vm_object_page_remove().

Reviewed by: tegge
MFC after: 6 weeks
2007-11-17 22:52:29 +00:00
Marcel Moolenaar
0c3967e7fe o Rename cpu_thread_setup() to cpu_thread_alloc() to better
communicate that it relates to (is called by) thread_alloc()
o  Add cpu_thread_free() which is called from thread_free()
   to counter-act cpu_thread_alloc().

i386:	Have cpu_thread_free() call cpu_thread_clean() to
	preserve behaviour.
ia64:	Have cpu_thread_free() call mtx_destroy() for the
	mutex initialized in cpu_thread_alloc().

PR: ia64/118024
2007-11-14 20:21:54 +00:00
Julian Elischer
431f890614 generally we are interested in what thread did something as
opposed to what process. Since threads by default have teh name of the
process unless over-written with more useful information, just print the
thread name instead.
2007-11-14 06:21:24 +00:00
Marius Strobl
5077aaca20 Adjust the padding of struct pcpu to src/sys/sys/pcpu.h rev 1.23. 2007-11-11 12:30:56 +00:00
Konstantin Belousov
89b57fcf01 Fix for the panic("vm_thread_new: kstack allocation failed") and
silent NULL pointer dereference in the i386 and sparc64 pmap_pinit()
when the kmem_alloc_nofault() failed to allocate address space. Both
functions now return error instead of panicing or dereferencing NULL.

As consequence, vmspace_exec() and vmspace_unshare() returns the errno
int. struct vmspace arg was added to vm_forkproc() to avoid dealing
with failed allocation when most of the fork1() job is already done.

The kernel stack for the thread is now set up in the thread_alloc(),
that itself may return NULL. Also, allocation of the first process
thread is performed in the fork1() to properly deal with stack
allocation failure. proc_linkup() is separated into proc_linkup()
called from fork1(), and proc_linkup0(), that is used to set up the
kernel process (was known as swapper).

In collaboration with:	Peter Holm
Reviewed by:	jhb
2007-11-05 11:36:16 +00:00
Julian Elischer
3745c395ec Rename the kthread_xxx (e.g. kthread_create()) calls
to kproc_xxx as they actually make whole processes.
Thos makes way for us to add REAL kthread_create() and friends
that actually make theads. it turns out that most of these
calls actually end up being moved back to the thread version
when it's added. but we need to make this cosmetic change first.

I'd LOVE to do this rename in 7.0  so that we can eventually MFC the
new kthread_xxx() calls.
2007-10-20 23:23:23 +00:00
Marius Strobl
55aaf894e8 Make the PCI code aware of PCI domains (aka PCI segments) so we can
support machines having multiple independently numbered PCI domains
and don't support reenumeration without ambiguity amongst the
devices as seen by the OS and represented by PCI location strings.
This includes introducing a function pci_find_dbsf(9) which works
like pci_find_bsf(9) but additionally takes a domain number argument
and limiting pci_find_bsf(9) to only search devices in domain 0 (the
only domain in single-domain systems). Bge(4) and ofw_pcibus(4) are
changed to use pci_find_dbsf(9) instead of pci_find_bsf(9) in order
to no longer report false positives when searching for siblings and
dupe devices in the same domain respectively.
Along with this change the sole host-PCI bridge driver converted to
actually make use of PCI domain support is uninorth(4), the others
continue to use domain 0 only for now and need to be converted as
appropriate later on.
Note that this means that the format of the location strings as used
by pciconf(8) has been changed and that consumers of <sys/pciio.h>
potentially need to be recompiled.

Suggested by:	jhb
Reviewed by:	grehan, jhb, marcel
Approved by:	re (kensmith), jhb (PCI maintainer hat)
2007-09-30 11:05:18 +00:00
Christian Brueffer
4fabde5686 Use the correct expanded name for SCTP.
PR:		116496
Submitted by:	koitsu
Reviewed by:	rrs
Approved by:	re (kensmith)
2007-09-26 20:05:07 +00:00
Alan Cox
7bfda801a8 Change the management of cached pages (PQ_CACHE) in two fundamental
ways:

(1) Cached pages are no longer kept in the object's resident page
splay tree and memq.  Instead, they are kept in a separate per-object
splay tree of cached pages.  However, access to this new per-object
splay tree is synchronized by the _free_ page queues lock, not to be
confused with the heavily contended page queues lock.  Consequently, a
cached page can be reclaimed by vm_page_alloc(9) without acquiring the
object's lock or the page queues lock.

This solves a problem independently reported by tegge@ and Isilon.
Specifically, they observed the page daemon consuming a great deal of
CPU time because of pages bouncing back and forth between the cache
queue (PQ_CACHE) and the inactive queue (PQ_INACTIVE).  The source of
this problem turned out to be a deadlock avoidance strategy employed
when selecting a cached page to reclaim in vm_page_select_cache().
However, the root cause was really that reclaiming a cached page
required the acquisition of an object lock while the page queues lock
was already held.  Thus, this change addresses the problem at its
root, by eliminating the need to acquire the object's lock.

Moreover, keeping cached pages in the object's primary splay tree and
memq was, in effect, optimizing for the uncommon case.  Cached pages
are reclaimed far, far more often than they are reactivated.  Instead,
this change makes reclamation cheaper, especially in terms of
synchronization overhead, and reactivation more expensive, because
reactivated pages will have to be reentered into the object's primary
splay tree and memq.

(2) Cached pages are now stored alongside free pages in the physical
memory allocator's buddy queues, increasing the likelihood that large
allocations of contiguous physical memory (i.e., superpages) will
succeed.

Finally, as a result of this change long-standing restrictions on when
and where a cached page can be reclaimed and returned by
vm_page_alloc(9) are eliminated.  Specifically, calls to
vm_page_alloc(9) specifying VM_ALLOC_INTERRUPT can now reclaim and
return a formerly cached page.  Consequently, a call to malloc(9)
specifying M_NOWAIT is less likely to fail.

Discussed with: many over the course of the summer, including jeff@,
   Justin Husted @ Isilon, peter@, tegge@
Tested by: an earlier version by kris@
Approved by: re (kensmith)
2007-09-25 06:25:06 +00:00
Alan Cox
6bce07ae73 It has been observed on the mailing lists that the different categories
of pages don't sum to anywhere near the total number of pages on amd64.
This is for the most part because uma_small_alloc() pages have never been
counted as wired pages, like their kmem_malloc() brethren.  They should
be.  This changes fixes that.

It is no longer necessary for the page queues lock to be held to free
pages allocated by uma_small_alloc().  I removed the acquisition and
release of the page queues lock from uma_small_free() on amd64 and ia64
weeks ago.  This patch updates the other architectures that have
uma_small_alloc() and uma_small_free().

Approved by: re (kensmith)
2007-09-15 18:47:02 +00:00
Attilio Rao
0b2e598c14 This is a follow-up, cleaning-up commit about recent changes involving
topology foo functions.
Working at the patch for topology problems in ia32/amd64 evicted some
problems regarding functions ordering in the SI_SUB_CPU family of
SYSINIT'ed subsystems.
In order to avoid problems with new modified to involved functions, a
correct ordering is not semantically specified for SI_SUB_CPU functions
(for a larger view of the issue please visit:
http://lists.freebsd.org/pipermail/freebsd-current/2007-July/075409.html )

Discussed with: peter
Tested by: kris, Rui Paulo <rpaulo@FreeBSD.org>
Approved by: jeff
Approved by: re
2007-09-11 22:54:09 +00:00
Peter Wemm
c5b102f584 Fix warning - add missing #include
Submitted by:	mjacob
Approved by:	re (rwatson)
2007-07-06 00:41:53 +00:00
Marius Strobl
838f76c0a9 - Restore the machine independency of sys/dev/ofw/openfirm.{c,h} by
moving OF_set_mmfsa_traptable() (SUNW,set-trap-table with the two
  arguments used here is specific to sun4v) to MD code.
- In sys/dev/ofw/openfirm.h remove prototypes for unimplemented
  functions and unused Solaris compatibility macros.
2007-06-16 22:30:38 +00:00
Alan Cox
2446e4f02c Enable the new physical memory allocator.
This allocator uses a binary buddy system with a twist.  First and
foremost, this allocator is required to support the implementation of
superpages.  As a side effect, it enables a more robust implementation
of contigmalloc(9).  Moreover, this reimplementation of
contigmalloc(9) eliminates the acquisition of Giant by
contigmalloc(..., M_NOWAIT, ...).

The twist is that this allocator tries to reduce the number of TLB
misses incurred by accesses through a direct map to small, UMA-managed
objects and page table pages.  Roughly speaking, the physical pages
that are allocated for such purposes are clustered together in the
physical address space.  The performance benefits vary.  In the most
extreme case, a uniprocessor kernel running on an Opteron, I measured
an 18% reduction in system time during a buildworld.

This allocator does not implement page coloring.  The reason is that
superpages have much the same effect.  The contiguous physical memory
allocation necessary for a superpage is inherently colored.

Finally, the one caveat is that this allocator does not effectively
support prezeroed pages.  I hope this is temporary.  On i386, this is
a slight pessimization.  However, on amd64, the beneficial effects of
the direct-map optimization outweigh the ill effects.  I speculate
that this is true in general of machines with a direct map.

Approved by:	re
2007-06-16 04:57:06 +00:00
Xin LI
a2346f7c3c Enable SCTP by default for GENERIC kernels in order to give it
more exposure.  The current state of SCTP implementation is
considered to be ready for 32-bit platforms, but still need some
work/testing on 64-bit platforms.

Approved by:	re (kensmith)
Discussed with:	rrs
2007-06-14 17:14:27 +00:00
Kip Macy
d84d0dfee6 fix cassert failure by adjusting padding 2007-06-12 21:19:12 +00:00
Marcel Moolenaar
2b39bb4f4f Use default options for default partitioning schemes, rather than
making the relevant files standard. This avoids duplication and
makes it easier to override/disable unwanted schemes. Since ARM
doesn't have a DEFAULTS configuration file, leave the source
files for the BSD and MBR partitioning schemes in files.arm for
now.
2007-06-11 00:38:06 +00:00
Marcel Moolenaar
01bd17cc99 Add kdb_cpu_sync_icache(), intended to synchronize instruction
caches with data caches after writing to memory. This typically
is required to make breakpoints work on ia64 and powerpc. For
those architectures the function is implemented.
2007-06-09 21:55:17 +00:00
Robert Watson
68d4cc614a Enable AUDIT by default in the GENERIC kernel, allowing security event
auditing to be turned on without a kernel recompile, just an rc.conf
option.

Approved by:	re (kensmith)
Obtained from:	TrustedBSD Project
2007-06-08 20:29:07 +00:00
Jeff Roberson
1b1618fb12 - Change comments and asserts to reflect the removal of the global
scheduler lock.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:57:32 +00:00
Jeff Roberson
e4b5aee3a8 Commit 10/14 of sched_lock decomposition.
- Use sched_throw() rather than replicating the same cpu_throw() code for
   each architecture.  This also allows the scheduler to use any locking it
   may want to.
 - Use the thread_lock() rather than sched_lock when preempting.
 - The scheduler lock is not required to synchronize release_aps.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:56:08 +00:00
Attilio Rao
6759608248 Rework the PCPU_* (MD) interface:
- Rename PCPU_LAZY_INC into PCPU_INC
- Add the PCPU_ADD interface which just does an add on the pcpu member
  given a specific value.

Note that for most architectures PCPU_INC and PCPU_ADD are not safe.
This is a point that needs some discussions/work in the next days.

Reviewed by: alc, bde
Approved by: jeff (mentor)
2007-06-04 21:38:48 +00:00
Alan Cox
c63f556284 Add the machine-specific definitions for configuring the new physical
memory allocator.

Approved by:	re
2007-06-03 23:33:11 +00:00
Attilio Rao
2feb50bf7d Revert VMCNT_* operations introduction.
Probabilly, a general approach is not the better solution here, so we should
solve the sched_lock protection problems separately.

Requested by: alc
Approved by: jeff (mentor)
2007-05-31 22:52:15 +00:00
Paolo Pisati
3401f2c1df In some particular cases (like in pccard and pccbb), the real device
handler is wrapped in a couple of functions - a filter wrapper and an
ithread wrapper. In this case (and just in this case), the filter
wrapper could ask the system to schedule the ithread and mask the
interrupt source if the wrapped handler is composed of just an ithread
handler: modify the "old" interrupt code to make it support
this situation, while the "new" interrupt code is already ok.

Discussed with: jhb
2007-05-31 19:25:35 +00:00
Pyun YongHyeon
590f73f72e Honor maxsegsz of less than a page size in a DMA tag. Previously it
used to return PAGE_SIZE without respect to restrictions of a DMA tag.
This affected all of the busdma load functions that use
_bus_dmamap_loader_buffer() as their back-end.

Reviewed by:	scottl
2007-05-29 06:30:26 +00:00
Kip Macy
fdd54a53de remove unneccessary curcpu reference in setting mmfsa 2007-05-25 01:55:51 +00:00
Kip Macy
7275e4bf4d move trap table initialization for cpu0 into sparc64_init 2007-05-25 01:21:40 +00:00
Kip Macy
905a24b997 Add some early diagnostics under bootverbose
bootverbose is not getting set early enough so hardcode for the moment
2007-05-23 05:22:58 +00:00
Kip Macy
38354c989f restore interrupts to working order after INTR_THREAD changes
- ithread_wrapper was being treated as a wrapper for fast interrupts when
  in fact it was intended for ithread interrupts
2007-05-22 06:17:55 +00:00
Jeff Roberson
80b200da28 - rename VMCNT_DEC to VMCNT_SUB to reflect the count argument.
Suggested by:	julian@
Contributed by:	attilio@
2007-05-20 22:33:42 +00:00
Marius Strobl
84edfa5dd8 Given that these sparc64 (as in sun4u) specific headers only exist
in the sun4v source in order to be able to compile the source which
is shared between sparc64 and sun4v just #include the sparc64
version here instead of duplicating it.
This is based on the approach taken by pc98 headers in order to
compile the source shared between i386 and pc98.
2007-05-20 13:19:32 +00:00
Marius Strobl
ebf9df0158 Delete the unused/not really used sparc64 (as in sun4u) cache.h,
iommureg.h (which already began to bitrot) and iommuvar.h from the
sun4v source and adjust some of the source which is shared between
sparc64 and sun4v as appropriate.
2007-05-20 13:06:45 +00:00
Marius Strobl
331a66091a Delete a remnant of the old sparc64 nexus(4) which was never used for sun4v. 2007-05-20 09:58:16 +00:00
Marius Strobl
aea97983ee Remove superfluous inclusion of machine/ver.h. 2007-05-20 09:31:31 +00:00
Marius Strobl
fb6b415c96 Make previous revision compile. 2007-05-20 09:21:29 +00:00
Jeff Roberson
222d01951f - define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating
vmcnts.  This can be used to abstract away pcpu details but also changes
   to use atomics for all counters now.  This means sched lock is no longer
   responsible for protecting counts in the switch routines.

Contributed by:		Attilio Rao <attilio@FreeBSD.org>
2007-05-18 07:10:50 +00:00
Marius Strobl
ac474f9545 - Add bits for userland profiling. For sun4u this is compile-tested only.
- Replace magic 14 with PIL_TICK.
2007-05-11 23:43:55 +00:00
Alan Cox
04a18977c8 Define every architecture as either VM_PHYSSEG_DENSE or
VM_PHYSSEG_SPARSE depending on whether the physical address space is
densely or sparsely populated with memory.  The effect of this
definition is to determine which of two implementations of
vm_page_array and PHYS_TO_VM_PAGE() is used.  The legacy
implementation is obtained by defining VM_PHYSSEG_DENSE, and a new
implementation that trades off time for space is obtained by defining
VM_PHYSSEG_SPARSE.  For now, all architectures except for ia64 and
sparc64 define VM_PHYSSEG_DENSE.  Defining VM_PHYSSEG_SPARSE on ia64
allows the entirety of my Itanium 2's memory to be used.  Previously,
only the first 1 GB could be used.  Defining VM_PHYSSEG_SPARSE on
sparc64 allows USIIIi-based systems to boot without crashing.

This change is a combination of Nathan Whitehorn's patch and my own
work in perforce.

Discussed with: kmacy, marius, Nathan Whitehorn
PR:		112194
2007-05-05 19:50:28 +00:00
Stephane E. Potvin
0e5179e441 Add support for specifying a minimal size for vm.kmem_size in the loader via
vm.kmem_size_min. Useful when using ZFS to make sure that vm.kmem size will
be at least 256mb (for example) without forcing a particular value via vm.kmem_size.

Approved by: njl (mentor)
Reviewed by: alc
2007-04-21 01:14:48 +00:00
Pawel Jakub Dawidek
fef2a25971 Remove trailing '.' for consistency! 2007-04-10 21:40:13 +00:00
Pawel Jakub Dawidek
57bcf75fd2 Add UFS_GJOURNAL options to the GENERIC kernel.
Approved by:	re (kensmith)
2007-04-10 16:49:41 +00:00
Alexander Kabaev
b27c252dcf Remove extern struct pcb stoppcbs[] declaration from this file.
It breaks GCC 4.1 compiles and does not appear to be required.
2007-04-05 18:34:11 +00:00
Alan Cox
c640357f04 Push down the implementation of PCPU_LAZY_INC() into the machine-dependent
header file.  Reimplement PCPU_LAZY_INC() on amd64 and i386 making it
atomic with respect to interrupts.

Reviewed by: bde, jhb
2007-03-11 05:54:29 +00:00
Paolo Pisati
ef544f6312 o break newbus api: add a new argument of type driver_filter_t to
bus_setup_intr()

o add an int return code to all fast handlers

o retire INTR_FAST/IH_FAST

For more info: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=465712+0+current/freebsd-current

Reviewed by: many
Approved by: re@
2007-02-23 12:19:07 +00:00
Brooks Davis
983f970981 Include GEOM_LABEL in GENERIC. It's very useful and not well publicized
enough.

Approved by:	pjd
2007-02-09 19:03:18 +00:00
Marcel Moolenaar
1d3aed33e8 Evolve the ctlreq interface added to geom_gpt into a generic
partitioning class that supports multiple schemes. Current
schemes supported are APM (Apple Partition Map) and GPT.
Change all GEOM_APPLE anf GEOM_GPT options into GEOM_PART_APM
and GEOM_PART_GPT (resp).

The ctlreq interface supports verbs to create and destroy
partitioning schemes on a disk; to add, delete and modify
partitions; and to commit or undo changes made.
2007-02-07 18:55:31 +00:00
Kip Macy
6d449d27d9 Add support for IPI_PREEMPT in order to enable use of the ULE scheduler 2007-02-02 05:00:21 +00:00
Kip Macy
d5ab3ef787 match against both dirty and writeable for marking page dirty 2007-02-02 04:57:11 +00:00
Ruslan Ermilov
a35dcad5cb MFsparc64: Add .cvsignore file here too. 2007-01-30 10:50:55 +00:00
Marius Strobl
0ca3609e30 Convert the remainder of the low hanging fruits regarding including
headers in .S directly rather than getting to their macros through
genassym.c/assym.s so there are less headers genassym.c has to be
kept in sync with.
While at it fix some stytle(9) bugs (indentation, prototype format,
sort headers, etc) and remove trailing whitespace.
2007-01-19 11:15:34 +00:00
Marius Strobl
23e81b7e03 - Rename UPA_BUS_SPACE to NEXUS_BUS_SPACE; besides an UPA bus, nexus(4)
may also reflect a Fireplane/Safari or JBus bus (or a virtual bus which
  in turn reflects a JBus bus or something like that...).
- In the both the sparc64 and sun4v bus_machdep.c use __FBSDID.
- Spell SBus the official way in comments.
- Replace hardcoded function names (all of which were actually outdated)
  in panic and status strings with __func__.
- Fix whitespace nits.
2007-01-18 18:32:26 +00:00
Marius Strobl
441b9412d6 Remove the compat shims for the ISA old-stlye in{b,w,l}()/out{b,w,l}()
and friends along with all hacks required to implement them. None of
the drivers currently built (as part of GENERIC, LINT or modules) on
sparc64 or sun4v and none of those we might want to use there in
future uses them, AFAICT there actually never was a driver hooked up
to the sparc64 or sun4v build that correctly used these functions
(and it looks like that due to a bug read{b,w,l}()/write{b,w,l}() and
the other functions working on a memory handle never actually worked on
sun4v). All they ever were good for on sparc64 and sun4v was erroneously
dragging in dependencies on isa(4) in drivers like f.e. dpt(4), si(4)
and syscons(4) in source files that supposedly were bus-neutral and
hiding issues with drivers like f.e. ng_bt3c(4) that used these
functions with busses other than isa(4) and therefore couldn't work on
these platforms.
2007-01-18 13:52:44 +00:00
Warner Losh
fed32d7544 Remove 3rd clause, renumber, ok per email 2007-01-12 07:26:21 +00:00
Christian S.J. Peron
90339ccb12 Invert the logic inside of two KASSERTS which resulted in two kernel panics
for circumstances which are quite normal.

Discussed with:	kmacy
2006-12-31 02:50:07 +00:00
Xin LI
44ecc8e382 Fix build 2006-12-25 17:03:04 +00:00
Kip Macy
35d16ac000 - add ranged shootdowns when fewer than 64 mappings are being invalidated 2006-12-25 02:05:52 +00:00
Kip Macy
4654a1f82f - remove all calls to sched_pin and sched_unpin as they are only useful to
pmap on i386
- check for change in executable status in pmap_enter
- pmap_qenter and pmap_qremove only need to invalidate the range if one
  of the pages has been referenced
- remove pmap_kenter/pmap_kremove as they were only used by pmap_qenter
  and pmap_qremove
- in pmap_copy don't copy wired bit to destination pmap
- mpte was unused in pmap_enter_object - remove
- pmap_enter_quick_locked is not called on the kernel_pmap, remove check
- move pmap_remove_write specific logic out of tte_clear_phys_bit
- in pmap_protect check for removal of execute bit
- panic in the presence of a wired page in pmap_remove_all
- pmap_zero_range can call hwblkclr if offset is zero and size is PAGE_SIZE
- tte_clear_virt_bit is only used by pmap_change_wiring - thus it can be
  greatly simplified
- pmap_invalidate_page need only be called in tte_clear_phys_bit if there
  is a match with flags
- lock the pmap in tte_clear_phys_bit so that clearing the page bits is
  atomic with invalidating the page

- these changes result in 100s reduction in buildworld from a malloc backed
  disk to a malloc backed disk - ~2.5%
2006-12-24 08:03:27 +00:00
Kip Macy
5e296dbf52 Don't count on the first phys_avail range being greater than zero 2006-12-24 07:47:10 +00:00
Kip Macy
83e3f6ad3e - resizing the tte_hash in pmap_copy is not likely to occur
- the implementation also made the mistake of assuming the
  dst_pmap is the current pmap
2006-12-24 01:56:35 +00:00
Kip Macy
a7a6fa29bd reduce padding to compensate for recent change to sys/pcpu.h (tinderbox fix) 2006-12-20 20:18:07 +00:00
Kip Macy
de87749b4a remove unneeded operations in tsb_set_tte_real - the function is
only used early in initialization so SMP safeness isn't really an
issue
2006-12-18 07:46:59 +00:00
Kip Macy
0ebc11deba add declaration for new helper function 2006-12-18 07:25:26 +00:00
Kip Macy
b4935cbceb add helper function for finding a virtual device node in a machine
description
2006-12-18 07:22:25 +00:00
Kip Macy
0578eca08a push trap conversion up into tl1_trap to further simplify spill / fill fault
handling
2006-12-18 02:40:23 +00:00
Kip Macy
0622a9e491 Simplify spill/fill fault handling by updating tl1_trap register
usage to conform to that of tl0_trap - the separate code path
for unaligned faults was never getting used (and evidently doesn't
work), so ifdef out for now
2006-12-18 02:04:43 +00:00
Kip Macy
9c50a94180 remove TRAP_TRACING code that wasn't getting used
pc_caller is no longer part of pcpu
2006-12-17 03:51:12 +00:00
Kip Macy
7e3cb9f8ce GC unused fields in pcpu 2006-12-17 02:04:19 +00:00
Kip Macy
8dd3d530e0 replace PCPU_GET(cpuid) with curcpu and PCPU_GET(curthread) with curthread 2006-12-17 01:31:56 +00:00
Kip Macy
9af9f82482 eliminate use of curpmap except where protected by critical_{enter, exit} 2006-12-17 01:30:53 +00:00
Kip Macy
bbca332e0c make unmap_perm_addr conform to declaration 2006-12-17 01:22:51 +00:00
Kip Macy
7abbc00f33 eliminate extra branches by making better use of branch delay
slots and annulling
2006-12-17 01:22:09 +00:00
Kip Macy
8bba8b74ae - Remove PCPU references by passing field as a reference to _tte_hash_lookup.
- The PCPU usage was to ensure that there were no faults on the stack while
  the tte_hash_bucket lock was held - but this can be avoided by making sure
  the address on the stack is already referenced.
- PCPU removal obviates the need for critical_{enter, exit}
2006-12-17 01:01:52 +00:00
Kip Macy
a3fa231792 Protect consistency of all internal functions in tte_hash.c using PCPU_{GET,SET}
with critical_enter, critical_exit
revert previous change to pmap.c now that tte_hash_resize is protected internally
2006-12-16 08:38:02 +00:00
Kip Macy
f4d27eaf8f tte_hash_resize implicitly expects to be protected from preemption -
put under spinlock_enter
2006-12-16 08:23:14 +00:00
Kip Macy
e7c1a6ce81 change PTL trap type name to assist in tracking down prablems in tl1_trap 2006-12-16 08:01:13 +00:00
Kip Macy
7cfff9ae9a - KASSERT takes two arguments
- a cast is needed to quiet warnings
2006-12-16 07:51:33 +00:00
Kip Macy
8c3bc2c180 - make better use of branch delay slots in exception.S
- rename skip_utrap to tl0_skip_utrap to indicate its use by the fill trap fault handler
- handle a null kstack by switching to the idle threads stack and then going to trap
- correctly handle a unaligned or unmapped stack during a fill trap
- save off some extra data in the pcpu pad in ptl1_panic
- add an assert that PCB is valid in vm_machdep.c
2006-12-16 06:43:24 +00:00
Kip Macy
d8b5b86300 - make intent behind skip check clearer
- protect pmap_ipi with spinlock_enter when resizing tte_hash
2006-12-16 02:41:05 +00:00
Kip Macy
5bd2c4e059 don't return directly to copyin and friends when we hit certain types of faults
this fixes the unkillable syscall in stress2
2006-12-16 02:40:19 +00:00
Kip Macy
a23c97ad89 workaround kernel malloc's brittleness
- don't shuffle phys_avail following kernel to the beginning if the
  range is less than what would remain in a 256MB page (248MB)
2006-12-12 03:50:06 +00:00
Kip Macy
8401edeb32 - provide a more informative panic if mdesc_update() fails
- handle some cases where the return value of mdesc_update() is not zero
  when it should be
2006-12-12 02:50:12 +00:00
Kip Macy
9ac317cd2e - remove vestigial reference to mra[i]
- partition phys_avail along 4GB boundaries as possible workaround for hardware
  problems causing watchdog panics
2006-12-12 01:16:17 +00:00
Kip Macy
abbeb75f29 make size of pad non-zero so that trap-tracing code doesn't overwrite the
base of our stack
2006-12-11 04:50:25 +00:00
Kip Macy
2e05e7d021 KTR entry contained invalid context reference - ifdef out 2006-12-10 18:09:44 +00:00
Kip Macy
160dc4ccc8 remove more uses of trap_conversion to get more meaningful trap messages
add a printf for when we fault on the direct area (should never happen)
2006-12-10 06:00:09 +00:00
Kip Macy
504baf688e better handle the case of hw.physmemstart being hw.physmem not being set,
previously we were acting as if physmem was being set when it was not
2006-12-10 04:14:29 +00:00
Kip Macy
42b20c8aa2 Add hw.physmemstart loader variable to enable the user to specify the address
at which the kernel should start allocating physical memory. The primary
purpose of this is to test 64-bit cleanness of the data path by setting
hw.physmemstart=4G so that all physical allocations are above 4GB. AMD64
and i386/PAE could also benefit from having this option.
2006-12-10 01:52:46 +00:00
Kip Macy
90e405668e Fix handling of the hw.physmem loader variable use real_phys_avail[] which
is already bounded by hw.physmem to calculate phys_avail[] - previously only
real_phys_avail[] was being bound by hw.physmem so we were allocating memory
that wasn't mapped in the direct map
2006-12-09 23:11:30 +00:00
Kip Macy
76cb7acf30 - remove restriction on OFW kernel allocations being 4M
- shuffle memory range following kernel to the beginning of phys_avail
- have the direct area use 256MB pages where possible
- remove dead code from the end of pmap_bootstrap
- have pmap_alloc_contig_pages check all memory ranges in phys_avail before
  giving up

- informal benchmarking indicates a ~5% speedup on buildworld
2006-12-09 05:22:22 +00:00
Kip Macy
5b1bbad223 fix CID 1671 by freeing listp before exit from vnex_attach 2006-12-07 02:09:06 +00:00
Kip Macy
7e8123da0e fix CID 1670 by freeing pointer listp before returning 2006-12-07 02:05:13 +00:00
Kip Macy
1abcd84428 fix CID 1672 by initializing variable clock 2006-12-07 02:04:25 +00:00
Kip Macy
e177c79351 Fix CID 1669 by removing dead sf_buf code 2006-12-07 02:03:18 +00:00
Julian Elischer
ad1e7d285a Threading cleanup.. part 2 of several.
Make part of John Birrell's KSE patch permanent..
Specifically, remove:
Any reference of the ksegrp structure. This feature was
never fully utilised and made things overly complicated.
All code in the scheduler that tried to make threaded programs
fair to unthreaded programs.  Libpthread processes will already
do this to some extent and libthr processes already disable it.

Also:
Since this makes such a big change to the scheduler(s), take the opportunity
to rename some structures and elements that had to be moved anyhow.
This makes the code a lot more readable.

The ULE scheduler compiles again but I have no idea if it works.

The 4bsd scheduler still reqires a little cleaning and some functions that now do
ALMOST nothing will go away, but I thought I'd do that as a separate commit.

Tested by David Xu, and Dan Eischen using libthr and libpthread.
2006-12-06 06:34:57 +00:00
Kip Macy
4598ab64b0 - separate out rounding memory ranges to 4M boundaries from OFW memory allocation bug workaround
- create real_phys_avail which includes all memory ranges to be added to the direct map
- merge in nucleus memory to real_phys_avail
- distinguish between tag VA and index VA in tsb_set_tte_real for cases where page_size != index_page_size
- clean up direct map loop
2006-12-04 19:35:40 +00:00
Kip Macy
63d8511da8 recent changes have caused TRAP_TRACING to induce corruption
disable until the issue has been tracked down
2006-12-04 05:06:47 +00:00
John Birrell
e0b651251d Turn console printf buffering into a kernel option and only on
by default for sun4v where it is absolutely required.

This change moves the buffer from struct pcpu to the stack to avoid
using the critical section which created a LOR in a couple of cases
due to interaction with the tty code and kqueue. The LOR can't be
fixed with the critical section and the pcpu buffer can't be used
without the critical section.

Putting the buffer on the stack was my initial solution, but it was
pointed out that the stress on the stack might cause problems
depending on the call path. We don't have a way of creating tests
for those possible cases, so it's best to leave this as an option
for the time being. In time we may get enough data to enable this
option more generally.
2006-11-30 04:17:05 +00:00
Kip Macy
547ba2302a - add separate variable for enabling printing of ranges
- simplify handling of rounding phys_avail ranges to 4M boundaries (needed for all
  memory to be in the direct mapped area)
2006-11-29 19:31:23 +00:00
Kip Macy
8751d0554c - Explicitly name the fields in pcb that we use to store trap state for later
retrieval, rather than using pad
- save the fault address in sfar for use by the alignment fixup handler
- mask off the trap number, so the context id doesn't confuse the UT_MAX
  comparison

This change fixes alignment fixup handling which is needed for traceroute
to work in spite of its copious unaligned accesses
2006-11-29 05:18:19 +00:00
Kip Macy
b3592a5ffb We no longer need to remap hardware trap numbers to sparc64 trap numbers
as this happens much earlier in trap handling.

The fact that we continued to do this when it was no longer necessary caused
breapoint to map to SIGILL as opposed to SIGTRAP :-(.
2006-11-29 04:52:51 +00:00
Kip Macy
a60e5bb6eb re-enable tte hash resize, corruption was caused by a missing htole32 in mpt_cam.c 2006-11-27 06:51:51 +00:00
Kip Macy
cca9a263e3 tte hash resizing may be causing errors when building - disable for now 2006-11-27 02:17:33 +00:00
Kip Macy
2aa7f6c1b2 Declare hypervisor system initiated reset function
as needed by the previous commit :-/
2006-11-26 22:47:52 +00:00
Kip Macy
0c41445f77 Fix "shutdown -r" and "shutdown -h"
- "shutdown -r" will reset the system
- "shutdown -h" will power off the system
We don't drop into OFW as newer versions of solaris don't do this either
2006-11-26 22:31:23 +00:00
Kip Macy
a06bcaf96b - remove dead code
- revert a previous change to pmap_enter where we
  could skip invalidates on unmanaged pages
2006-11-26 07:54:44 +00:00
Kip Macy
548d785ec5 add interrupt cookie hypervisor functions 2006-11-26 04:37:49 +00:00