Commit Graph

324 Commits

Author SHA1 Message Date
Christian Brueffer
4fabde5686 Use the correct expanded name for SCTP.
PR:		116496
Submitted by:	koitsu
Reviewed by:	rrs
Approved by:	re (kensmith)
2007-09-26 20:05:07 +00:00
Alan Cox
7bfda801a8 Change the management of cached pages (PQ_CACHE) in two fundamental
ways:

(1) Cached pages are no longer kept in the object's resident page
splay tree and memq.  Instead, they are kept in a separate per-object
splay tree of cached pages.  However, access to this new per-object
splay tree is synchronized by the _free_ page queues lock, not to be
confused with the heavily contended page queues lock.  Consequently, a
cached page can be reclaimed by vm_page_alloc(9) without acquiring the
object's lock or the page queues lock.

This solves a problem independently reported by tegge@ and Isilon.
Specifically, they observed the page daemon consuming a great deal of
CPU time because of pages bouncing back and forth between the cache
queue (PQ_CACHE) and the inactive queue (PQ_INACTIVE).  The source of
this problem turned out to be a deadlock avoidance strategy employed
when selecting a cached page to reclaim in vm_page_select_cache().
However, the root cause was really that reclaiming a cached page
required the acquisition of an object lock while the page queues lock
was already held.  Thus, this change addresses the problem at its
root, by eliminating the need to acquire the object's lock.

Moreover, keeping cached pages in the object's primary splay tree and
memq was, in effect, optimizing for the uncommon case.  Cached pages
are reclaimed far, far more often than they are reactivated.  Instead,
this change makes reclamation cheaper, especially in terms of
synchronization overhead, and reactivation more expensive, because
reactivated pages will have to be reentered into the object's primary
splay tree and memq.

(2) Cached pages are now stored alongside free pages in the physical
memory allocator's buddy queues, increasing the likelihood that large
allocations of contiguous physical memory (i.e., superpages) will
succeed.

Finally, as a result of this change long-standing restrictions on when
and where a cached page can be reclaimed and returned by
vm_page_alloc(9) are eliminated.  Specifically, calls to
vm_page_alloc(9) specifying VM_ALLOC_INTERRUPT can now reclaim and
return a formerly cached page.  Consequently, a call to malloc(9)
specifying M_NOWAIT is less likely to fail.

Discussed with: many over the course of the summer, including jeff@,
   Justin Husted @ Isilon, peter@, tegge@
Tested by: an earlier version by kris@
Approved by: re (kensmith)
2007-09-25 06:25:06 +00:00
Alan Cox
6bce07ae73 It has been observed on the mailing lists that the different categories
of pages don't sum to anywhere near the total number of pages on amd64.
This is for the most part because uma_small_alloc() pages have never been
counted as wired pages, like their kmem_malloc() brethren.  They should
be.  This changes fixes that.

It is no longer necessary for the page queues lock to be held to free
pages allocated by uma_small_alloc().  I removed the acquisition and
release of the page queues lock from uma_small_free() on amd64 and ia64
weeks ago.  This patch updates the other architectures that have
uma_small_alloc() and uma_small_free().

Approved by: re (kensmith)
2007-09-15 18:47:02 +00:00
Attilio Rao
0b2e598c14 This is a follow-up, cleaning-up commit about recent changes involving
topology foo functions.
Working at the patch for topology problems in ia32/amd64 evicted some
problems regarding functions ordering in the SI_SUB_CPU family of
SYSINIT'ed subsystems.
In order to avoid problems with new modified to involved functions, a
correct ordering is not semantically specified for SI_SUB_CPU functions
(for a larger view of the issue please visit:
http://lists.freebsd.org/pipermail/freebsd-current/2007-July/075409.html )

Discussed with: peter
Tested by: kris, Rui Paulo <rpaulo@FreeBSD.org>
Approved by: jeff
Approved by: re
2007-09-11 22:54:09 +00:00
Peter Wemm
c5b102f584 Fix warning - add missing #include
Submitted by:	mjacob
Approved by:	re (rwatson)
2007-07-06 00:41:53 +00:00
Marius Strobl
838f76c0a9 - Restore the machine independency of sys/dev/ofw/openfirm.{c,h} by
moving OF_set_mmfsa_traptable() (SUNW,set-trap-table with the two
  arguments used here is specific to sun4v) to MD code.
- In sys/dev/ofw/openfirm.h remove prototypes for unimplemented
  functions and unused Solaris compatibility macros.
2007-06-16 22:30:38 +00:00
Alan Cox
2446e4f02c Enable the new physical memory allocator.
This allocator uses a binary buddy system with a twist.  First and
foremost, this allocator is required to support the implementation of
superpages.  As a side effect, it enables a more robust implementation
of contigmalloc(9).  Moreover, this reimplementation of
contigmalloc(9) eliminates the acquisition of Giant by
contigmalloc(..., M_NOWAIT, ...).

The twist is that this allocator tries to reduce the number of TLB
misses incurred by accesses through a direct map to small, UMA-managed
objects and page table pages.  Roughly speaking, the physical pages
that are allocated for such purposes are clustered together in the
physical address space.  The performance benefits vary.  In the most
extreme case, a uniprocessor kernel running on an Opteron, I measured
an 18% reduction in system time during a buildworld.

This allocator does not implement page coloring.  The reason is that
superpages have much the same effect.  The contiguous physical memory
allocation necessary for a superpage is inherently colored.

Finally, the one caveat is that this allocator does not effectively
support prezeroed pages.  I hope this is temporary.  On i386, this is
a slight pessimization.  However, on amd64, the beneficial effects of
the direct-map optimization outweigh the ill effects.  I speculate
that this is true in general of machines with a direct map.

Approved by:	re
2007-06-16 04:57:06 +00:00
Xin LI
a2346f7c3c Enable SCTP by default for GENERIC kernels in order to give it
more exposure.  The current state of SCTP implementation is
considered to be ready for 32-bit platforms, but still need some
work/testing on 64-bit platforms.

Approved by:	re (kensmith)
Discussed with:	rrs
2007-06-14 17:14:27 +00:00
Kip Macy
d84d0dfee6 fix cassert failure by adjusting padding 2007-06-12 21:19:12 +00:00
Marcel Moolenaar
2b39bb4f4f Use default options for default partitioning schemes, rather than
making the relevant files standard. This avoids duplication and
makes it easier to override/disable unwanted schemes. Since ARM
doesn't have a DEFAULTS configuration file, leave the source
files for the BSD and MBR partitioning schemes in files.arm for
now.
2007-06-11 00:38:06 +00:00
Marcel Moolenaar
01bd17cc99 Add kdb_cpu_sync_icache(), intended to synchronize instruction
caches with data caches after writing to memory. This typically
is required to make breakpoints work on ia64 and powerpc. For
those architectures the function is implemented.
2007-06-09 21:55:17 +00:00
Robert Watson
68d4cc614a Enable AUDIT by default in the GENERIC kernel, allowing security event
auditing to be turned on without a kernel recompile, just an rc.conf
option.

Approved by:	re (kensmith)
Obtained from:	TrustedBSD Project
2007-06-08 20:29:07 +00:00
Jeff Roberson
1b1618fb12 - Change comments and asserts to reflect the removal of the global
scheduler lock.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:57:32 +00:00
Jeff Roberson
e4b5aee3a8 Commit 10/14 of sched_lock decomposition.
- Use sched_throw() rather than replicating the same cpu_throw() code for
   each architecture.  This also allows the scheduler to use any locking it
   may want to.
 - Use the thread_lock() rather than sched_lock when preempting.
 - The scheduler lock is not required to synchronize release_aps.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:56:08 +00:00
Attilio Rao
6759608248 Rework the PCPU_* (MD) interface:
- Rename PCPU_LAZY_INC into PCPU_INC
- Add the PCPU_ADD interface which just does an add on the pcpu member
  given a specific value.

Note that for most architectures PCPU_INC and PCPU_ADD are not safe.
This is a point that needs some discussions/work in the next days.

Reviewed by: alc, bde
Approved by: jeff (mentor)
2007-06-04 21:38:48 +00:00
Alan Cox
c63f556284 Add the machine-specific definitions for configuring the new physical
memory allocator.

Approved by:	re
2007-06-03 23:33:11 +00:00
Attilio Rao
2feb50bf7d Revert VMCNT_* operations introduction.
Probabilly, a general approach is not the better solution here, so we should
solve the sched_lock protection problems separately.

Requested by: alc
Approved by: jeff (mentor)
2007-05-31 22:52:15 +00:00
Paolo Pisati
3401f2c1df In some particular cases (like in pccard and pccbb), the real device
handler is wrapped in a couple of functions - a filter wrapper and an
ithread wrapper. In this case (and just in this case), the filter
wrapper could ask the system to schedule the ithread and mask the
interrupt source if the wrapped handler is composed of just an ithread
handler: modify the "old" interrupt code to make it support
this situation, while the "new" interrupt code is already ok.

Discussed with: jhb
2007-05-31 19:25:35 +00:00
Pyun YongHyeon
590f73f72e Honor maxsegsz of less than a page size in a DMA tag. Previously it
used to return PAGE_SIZE without respect to restrictions of a DMA tag.
This affected all of the busdma load functions that use
_bus_dmamap_loader_buffer() as their back-end.

Reviewed by:	scottl
2007-05-29 06:30:26 +00:00
Kip Macy
fdd54a53de remove unneccessary curcpu reference in setting mmfsa 2007-05-25 01:55:51 +00:00
Kip Macy
7275e4bf4d move trap table initialization for cpu0 into sparc64_init 2007-05-25 01:21:40 +00:00
Kip Macy
905a24b997 Add some early diagnostics under bootverbose
bootverbose is not getting set early enough so hardcode for the moment
2007-05-23 05:22:58 +00:00
Kip Macy
38354c989f restore interrupts to working order after INTR_THREAD changes
- ithread_wrapper was being treated as a wrapper for fast interrupts when
  in fact it was intended for ithread interrupts
2007-05-22 06:17:55 +00:00
Jeff Roberson
80b200da28 - rename VMCNT_DEC to VMCNT_SUB to reflect the count argument.
Suggested by:	julian@
Contributed by:	attilio@
2007-05-20 22:33:42 +00:00
Marius Strobl
84edfa5dd8 Given that these sparc64 (as in sun4u) specific headers only exist
in the sun4v source in order to be able to compile the source which
is shared between sparc64 and sun4v just #include the sparc64
version here instead of duplicating it.
This is based on the approach taken by pc98 headers in order to
compile the source shared between i386 and pc98.
2007-05-20 13:19:32 +00:00
Marius Strobl
ebf9df0158 Delete the unused/not really used sparc64 (as in sun4u) cache.h,
iommureg.h (which already began to bitrot) and iommuvar.h from the
sun4v source and adjust some of the source which is shared between
sparc64 and sun4v as appropriate.
2007-05-20 13:06:45 +00:00
Marius Strobl
331a66091a Delete a remnant of the old sparc64 nexus(4) which was never used for sun4v. 2007-05-20 09:58:16 +00:00
Marius Strobl
aea97983ee Remove superfluous inclusion of machine/ver.h. 2007-05-20 09:31:31 +00:00
Marius Strobl
fb6b415c96 Make previous revision compile. 2007-05-20 09:21:29 +00:00
Jeff Roberson
222d01951f - define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating
vmcnts.  This can be used to abstract away pcpu details but also changes
   to use atomics for all counters now.  This means sched lock is no longer
   responsible for protecting counts in the switch routines.

Contributed by:		Attilio Rao <attilio@FreeBSD.org>
2007-05-18 07:10:50 +00:00
Marius Strobl
ac474f9545 - Add bits for userland profiling. For sun4u this is compile-tested only.
- Replace magic 14 with PIL_TICK.
2007-05-11 23:43:55 +00:00
Alan Cox
04a18977c8 Define every architecture as either VM_PHYSSEG_DENSE or
VM_PHYSSEG_SPARSE depending on whether the physical address space is
densely or sparsely populated with memory.  The effect of this
definition is to determine which of two implementations of
vm_page_array and PHYS_TO_VM_PAGE() is used.  The legacy
implementation is obtained by defining VM_PHYSSEG_DENSE, and a new
implementation that trades off time for space is obtained by defining
VM_PHYSSEG_SPARSE.  For now, all architectures except for ia64 and
sparc64 define VM_PHYSSEG_DENSE.  Defining VM_PHYSSEG_SPARSE on ia64
allows the entirety of my Itanium 2's memory to be used.  Previously,
only the first 1 GB could be used.  Defining VM_PHYSSEG_SPARSE on
sparc64 allows USIIIi-based systems to boot without crashing.

This change is a combination of Nathan Whitehorn's patch and my own
work in perforce.

Discussed with: kmacy, marius, Nathan Whitehorn
PR:		112194
2007-05-05 19:50:28 +00:00
Stephane E. Potvin
0e5179e441 Add support for specifying a minimal size for vm.kmem_size in the loader via
vm.kmem_size_min. Useful when using ZFS to make sure that vm.kmem size will
be at least 256mb (for example) without forcing a particular value via vm.kmem_size.

Approved by: njl (mentor)
Reviewed by: alc
2007-04-21 01:14:48 +00:00
Pawel Jakub Dawidek
fef2a25971 Remove trailing '.' for consistency! 2007-04-10 21:40:13 +00:00
Pawel Jakub Dawidek
57bcf75fd2 Add UFS_GJOURNAL options to the GENERIC kernel.
Approved by:	re (kensmith)
2007-04-10 16:49:41 +00:00
Alexander Kabaev
b27c252dcf Remove extern struct pcb stoppcbs[] declaration from this file.
It breaks GCC 4.1 compiles and does not appear to be required.
2007-04-05 18:34:11 +00:00
Alan Cox
c640357f04 Push down the implementation of PCPU_LAZY_INC() into the machine-dependent
header file.  Reimplement PCPU_LAZY_INC() on amd64 and i386 making it
atomic with respect to interrupts.

Reviewed by: bde, jhb
2007-03-11 05:54:29 +00:00
Paolo Pisati
ef544f6312 o break newbus api: add a new argument of type driver_filter_t to
bus_setup_intr()

o add an int return code to all fast handlers

o retire INTR_FAST/IH_FAST

For more info: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=465712+0+current/freebsd-current

Reviewed by: many
Approved by: re@
2007-02-23 12:19:07 +00:00
Brooks Davis
983f970981 Include GEOM_LABEL in GENERIC. It's very useful and not well publicized
enough.

Approved by:	pjd
2007-02-09 19:03:18 +00:00
Marcel Moolenaar
1d3aed33e8 Evolve the ctlreq interface added to geom_gpt into a generic
partitioning class that supports multiple schemes. Current
schemes supported are APM (Apple Partition Map) and GPT.
Change all GEOM_APPLE anf GEOM_GPT options into GEOM_PART_APM
and GEOM_PART_GPT (resp).

The ctlreq interface supports verbs to create and destroy
partitioning schemes on a disk; to add, delete and modify
partitions; and to commit or undo changes made.
2007-02-07 18:55:31 +00:00
Kip Macy
6d449d27d9 Add support for IPI_PREEMPT in order to enable use of the ULE scheduler 2007-02-02 05:00:21 +00:00
Kip Macy
d5ab3ef787 match against both dirty and writeable for marking page dirty 2007-02-02 04:57:11 +00:00
Ruslan Ermilov
a35dcad5cb MFsparc64: Add .cvsignore file here too. 2007-01-30 10:50:55 +00:00
Marius Strobl
0ca3609e30 Convert the remainder of the low hanging fruits regarding including
headers in .S directly rather than getting to their macros through
genassym.c/assym.s so there are less headers genassym.c has to be
kept in sync with.
While at it fix some stytle(9) bugs (indentation, prototype format,
sort headers, etc) and remove trailing whitespace.
2007-01-19 11:15:34 +00:00
Marius Strobl
23e81b7e03 - Rename UPA_BUS_SPACE to NEXUS_BUS_SPACE; besides an UPA bus, nexus(4)
may also reflect a Fireplane/Safari or JBus bus (or a virtual bus which
  in turn reflects a JBus bus or something like that...).
- In the both the sparc64 and sun4v bus_machdep.c use __FBSDID.
- Spell SBus the official way in comments.
- Replace hardcoded function names (all of which were actually outdated)
  in panic and status strings with __func__.
- Fix whitespace nits.
2007-01-18 18:32:26 +00:00
Marius Strobl
441b9412d6 Remove the compat shims for the ISA old-stlye in{b,w,l}()/out{b,w,l}()
and friends along with all hacks required to implement them. None of
the drivers currently built (as part of GENERIC, LINT or modules) on
sparc64 or sun4v and none of those we might want to use there in
future uses them, AFAICT there actually never was a driver hooked up
to the sparc64 or sun4v build that correctly used these functions
(and it looks like that due to a bug read{b,w,l}()/write{b,w,l}() and
the other functions working on a memory handle never actually worked on
sun4v). All they ever were good for on sparc64 and sun4v was erroneously
dragging in dependencies on isa(4) in drivers like f.e. dpt(4), si(4)
and syscons(4) in source files that supposedly were bus-neutral and
hiding issues with drivers like f.e. ng_bt3c(4) that used these
functions with busses other than isa(4) and therefore couldn't work on
these platforms.
2007-01-18 13:52:44 +00:00
Warner Losh
fed32d7544 Remove 3rd clause, renumber, ok per email 2007-01-12 07:26:21 +00:00
Christian S.J. Peron
90339ccb12 Invert the logic inside of two KASSERTS which resulted in two kernel panics
for circumstances which are quite normal.

Discussed with:	kmacy
2006-12-31 02:50:07 +00:00
Xin LI
44ecc8e382 Fix build 2006-12-25 17:03:04 +00:00
Kip Macy
35d16ac000 - add ranged shootdowns when fewer than 64 mappings are being invalidated 2006-12-25 02:05:52 +00:00
Kip Macy
4654a1f82f - remove all calls to sched_pin and sched_unpin as they are only useful to
pmap on i386
- check for change in executable status in pmap_enter
- pmap_qenter and pmap_qremove only need to invalidate the range if one
  of the pages has been referenced
- remove pmap_kenter/pmap_kremove as they were only used by pmap_qenter
  and pmap_qremove
- in pmap_copy don't copy wired bit to destination pmap
- mpte was unused in pmap_enter_object - remove
- pmap_enter_quick_locked is not called on the kernel_pmap, remove check
- move pmap_remove_write specific logic out of tte_clear_phys_bit
- in pmap_protect check for removal of execute bit
- panic in the presence of a wired page in pmap_remove_all
- pmap_zero_range can call hwblkclr if offset is zero and size is PAGE_SIZE
- tte_clear_virt_bit is only used by pmap_change_wiring - thus it can be
  greatly simplified
- pmap_invalidate_page need only be called in tte_clear_phys_bit if there
  is a match with flags
- lock the pmap in tte_clear_phys_bit so that clearing the page bits is
  atomic with invalidating the page

- these changes result in 100s reduction in buildworld from a malloc backed
  disk to a malloc backed disk - ~2.5%
2006-12-24 08:03:27 +00:00
Kip Macy
5e296dbf52 Don't count on the first phys_avail range being greater than zero 2006-12-24 07:47:10 +00:00
Kip Macy
83e3f6ad3e - resizing the tte_hash in pmap_copy is not likely to occur
- the implementation also made the mistake of assuming the
  dst_pmap is the current pmap
2006-12-24 01:56:35 +00:00
Kip Macy
a7a6fa29bd reduce padding to compensate for recent change to sys/pcpu.h (tinderbox fix) 2006-12-20 20:18:07 +00:00
Kip Macy
de87749b4a remove unneeded operations in tsb_set_tte_real - the function is
only used early in initialization so SMP safeness isn't really an
issue
2006-12-18 07:46:59 +00:00
Kip Macy
0ebc11deba add declaration for new helper function 2006-12-18 07:25:26 +00:00
Kip Macy
b4935cbceb add helper function for finding a virtual device node in a machine
description
2006-12-18 07:22:25 +00:00
Kip Macy
0578eca08a push trap conversion up into tl1_trap to further simplify spill / fill fault
handling
2006-12-18 02:40:23 +00:00
Kip Macy
0622a9e491 Simplify spill/fill fault handling by updating tl1_trap register
usage to conform to that of tl0_trap - the separate code path
for unaligned faults was never getting used (and evidently doesn't
work), so ifdef out for now
2006-12-18 02:04:43 +00:00
Kip Macy
9c50a94180 remove TRAP_TRACING code that wasn't getting used
pc_caller is no longer part of pcpu
2006-12-17 03:51:12 +00:00
Kip Macy
7e3cb9f8ce GC unused fields in pcpu 2006-12-17 02:04:19 +00:00
Kip Macy
8dd3d530e0 replace PCPU_GET(cpuid) with curcpu and PCPU_GET(curthread) with curthread 2006-12-17 01:31:56 +00:00
Kip Macy
9af9f82482 eliminate use of curpmap except where protected by critical_{enter, exit} 2006-12-17 01:30:53 +00:00
Kip Macy
bbca332e0c make unmap_perm_addr conform to declaration 2006-12-17 01:22:51 +00:00
Kip Macy
7abbc00f33 eliminate extra branches by making better use of branch delay
slots and annulling
2006-12-17 01:22:09 +00:00
Kip Macy
8bba8b74ae - Remove PCPU references by passing field as a reference to _tte_hash_lookup.
- The PCPU usage was to ensure that there were no faults on the stack while
  the tte_hash_bucket lock was held - but this can be avoided by making sure
  the address on the stack is already referenced.
- PCPU removal obviates the need for critical_{enter, exit}
2006-12-17 01:01:52 +00:00
Kip Macy
a3fa231792 Protect consistency of all internal functions in tte_hash.c using PCPU_{GET,SET}
with critical_enter, critical_exit
revert previous change to pmap.c now that tte_hash_resize is protected internally
2006-12-16 08:38:02 +00:00
Kip Macy
f4d27eaf8f tte_hash_resize implicitly expects to be protected from preemption -
put under spinlock_enter
2006-12-16 08:23:14 +00:00
Kip Macy
e7c1a6ce81 change PTL trap type name to assist in tracking down prablems in tl1_trap 2006-12-16 08:01:13 +00:00
Kip Macy
7cfff9ae9a - KASSERT takes two arguments
- a cast is needed to quiet warnings
2006-12-16 07:51:33 +00:00
Kip Macy
8c3bc2c180 - make better use of branch delay slots in exception.S
- rename skip_utrap to tl0_skip_utrap to indicate its use by the fill trap fault handler
- handle a null kstack by switching to the idle threads stack and then going to trap
- correctly handle a unaligned or unmapped stack during a fill trap
- save off some extra data in the pcpu pad in ptl1_panic
- add an assert that PCB is valid in vm_machdep.c
2006-12-16 06:43:24 +00:00
Kip Macy
d8b5b86300 - make intent behind skip check clearer
- protect pmap_ipi with spinlock_enter when resizing tte_hash
2006-12-16 02:41:05 +00:00
Kip Macy
5bd2c4e059 don't return directly to copyin and friends when we hit certain types of faults
this fixes the unkillable syscall in stress2
2006-12-16 02:40:19 +00:00
Kip Macy
a23c97ad89 workaround kernel malloc's brittleness
- don't shuffle phys_avail following kernel to the beginning if the
  range is less than what would remain in a 256MB page (248MB)
2006-12-12 03:50:06 +00:00
Kip Macy
8401edeb32 - provide a more informative panic if mdesc_update() fails
- handle some cases where the return value of mdesc_update() is not zero
  when it should be
2006-12-12 02:50:12 +00:00
Kip Macy
9ac317cd2e - remove vestigial reference to mra[i]
- partition phys_avail along 4GB boundaries as possible workaround for hardware
  problems causing watchdog panics
2006-12-12 01:16:17 +00:00
Kip Macy
abbeb75f29 make size of pad non-zero so that trap-tracing code doesn't overwrite the
base of our stack
2006-12-11 04:50:25 +00:00
Kip Macy
2e05e7d021 KTR entry contained invalid context reference - ifdef out 2006-12-10 18:09:44 +00:00
Kip Macy
160dc4ccc8 remove more uses of trap_conversion to get more meaningful trap messages
add a printf for when we fault on the direct area (should never happen)
2006-12-10 06:00:09 +00:00
Kip Macy
504baf688e better handle the case of hw.physmemstart being hw.physmem not being set,
previously we were acting as if physmem was being set when it was not
2006-12-10 04:14:29 +00:00
Kip Macy
42b20c8aa2 Add hw.physmemstart loader variable to enable the user to specify the address
at which the kernel should start allocating physical memory. The primary
purpose of this is to test 64-bit cleanness of the data path by setting
hw.physmemstart=4G so that all physical allocations are above 4GB. AMD64
and i386/PAE could also benefit from having this option.
2006-12-10 01:52:46 +00:00
Kip Macy
90e405668e Fix handling of the hw.physmem loader variable use real_phys_avail[] which
is already bounded by hw.physmem to calculate phys_avail[] - previously only
real_phys_avail[] was being bound by hw.physmem so we were allocating memory
that wasn't mapped in the direct map
2006-12-09 23:11:30 +00:00
Kip Macy
76cb7acf30 - remove restriction on OFW kernel allocations being 4M
- shuffle memory range following kernel to the beginning of phys_avail
- have the direct area use 256MB pages where possible
- remove dead code from the end of pmap_bootstrap
- have pmap_alloc_contig_pages check all memory ranges in phys_avail before
  giving up

- informal benchmarking indicates a ~5% speedup on buildworld
2006-12-09 05:22:22 +00:00
Kip Macy
5b1bbad223 fix CID 1671 by freeing listp before exit from vnex_attach 2006-12-07 02:09:06 +00:00
Kip Macy
7e8123da0e fix CID 1670 by freeing pointer listp before returning 2006-12-07 02:05:13 +00:00
Kip Macy
1abcd84428 fix CID 1672 by initializing variable clock 2006-12-07 02:04:25 +00:00
Kip Macy
e177c79351 Fix CID 1669 by removing dead sf_buf code 2006-12-07 02:03:18 +00:00
Julian Elischer
ad1e7d285a Threading cleanup.. part 2 of several.
Make part of John Birrell's KSE patch permanent..
Specifically, remove:
Any reference of the ksegrp structure. This feature was
never fully utilised and made things overly complicated.
All code in the scheduler that tried to make threaded programs
fair to unthreaded programs.  Libpthread processes will already
do this to some extent and libthr processes already disable it.

Also:
Since this makes such a big change to the scheduler(s), take the opportunity
to rename some structures and elements that had to be moved anyhow.
This makes the code a lot more readable.

The ULE scheduler compiles again but I have no idea if it works.

The 4bsd scheduler still reqires a little cleaning and some functions that now do
ALMOST nothing will go away, but I thought I'd do that as a separate commit.

Tested by David Xu, and Dan Eischen using libthr and libpthread.
2006-12-06 06:34:57 +00:00
Kip Macy
4598ab64b0 - separate out rounding memory ranges to 4M boundaries from OFW memory allocation bug workaround
- create real_phys_avail which includes all memory ranges to be added to the direct map
- merge in nucleus memory to real_phys_avail
- distinguish between tag VA and index VA in tsb_set_tte_real for cases where page_size != index_page_size
- clean up direct map loop
2006-12-04 19:35:40 +00:00
Kip Macy
63d8511da8 recent changes have caused TRAP_TRACING to induce corruption
disable until the issue has been tracked down
2006-12-04 05:06:47 +00:00
John Birrell
e0b651251d Turn console printf buffering into a kernel option and only on
by default for sun4v where it is absolutely required.

This change moves the buffer from struct pcpu to the stack to avoid
using the critical section which created a LOR in a couple of cases
due to interaction with the tty code and kqueue. The LOR can't be
fixed with the critical section and the pcpu buffer can't be used
without the critical section.

Putting the buffer on the stack was my initial solution, but it was
pointed out that the stress on the stack might cause problems
depending on the call path. We don't have a way of creating tests
for those possible cases, so it's best to leave this as an option
for the time being. In time we may get enough data to enable this
option more generally.
2006-11-30 04:17:05 +00:00
Kip Macy
547ba2302a - add separate variable for enabling printing of ranges
- simplify handling of rounding phys_avail ranges to 4M boundaries (needed for all
  memory to be in the direct mapped area)
2006-11-29 19:31:23 +00:00
Kip Macy
8751d0554c - Explicitly name the fields in pcb that we use to store trap state for later
retrieval, rather than using pad
- save the fault address in sfar for use by the alignment fixup handler
- mask off the trap number, so the context id doesn't confuse the UT_MAX
  comparison

This change fixes alignment fixup handling which is needed for traceroute
to work in spite of its copious unaligned accesses
2006-11-29 05:18:19 +00:00
Kip Macy
b3592a5ffb We no longer need to remap hardware trap numbers to sparc64 trap numbers
as this happens much earlier in trap handling.

The fact that we continued to do this when it was no longer necessary caused
breapoint to map to SIGILL as opposed to SIGTRAP :-(.
2006-11-29 04:52:51 +00:00
Kip Macy
a60e5bb6eb re-enable tte hash resize, corruption was caused by a missing htole32 in mpt_cam.c 2006-11-27 06:51:51 +00:00
Kip Macy
cca9a263e3 tte hash resizing may be causing errors when building - disable for now 2006-11-27 02:17:33 +00:00
Kip Macy
2aa7f6c1b2 Declare hypervisor system initiated reset function
as needed by the previous commit :-/
2006-11-26 22:47:52 +00:00
Kip Macy
0c41445f77 Fix "shutdown -r" and "shutdown -h"
- "shutdown -r" will reset the system
- "shutdown -h" will power off the system
We don't drop into OFW as newer versions of solaris don't do this either
2006-11-26 22:31:23 +00:00
Kip Macy
a06bcaf96b - remove dead code
- revert a previous change to pmap_enter where we
  could skip invalidates on unmanaged pages
2006-11-26 07:54:44 +00:00
Kip Macy
548d785ec5 add interrupt cookie hypervisor functions 2006-11-26 04:37:49 +00:00
Kip Macy
44d22a127a The mountroot prompt will drop into ddb if we don't recognize error codes from
getchar correctly - we also need to check for HUP and BREAK
2006-11-25 06:29:46 +00:00
Christian S.J. Peron
5cd3e3457f Make sure we do not sleep while locks are held. Change the malloc(9)
flags from M_WAITOK to M_NOWAIT. This should not cause any problems
since the calling code appears to properly handle failed allocations.

Discussed with:	kmacy
2006-11-24 22:14:37 +00:00
Kip Macy
f523369da9 kernel will not compile without genclock, thus move to DEFAULTS 2006-11-24 20:56:43 +00:00
Kip Macy
4305064c65 - implement remaining pci functions
- fix build errors
2006-11-24 20:47:29 +00:00
Kip Macy
234e688e0c Implement mmu functions and cpu_mondo_send
fix some more kernel compile fallout
2006-11-24 18:50:56 +00:00
Kip Macy
217112e529 comment all remaining documented hypervisor functions except for msi
implement performance counter functions
2006-11-24 07:49:15 +00:00
Kip Macy
01e461dcc2 document and comment all functions outside of MMU and MSI services
from those, implement all those whose arguments don't require save/restore
2006-11-24 07:11:24 +00:00
Kip Macy
17b84f855f - Comment most of the remaining hypercalls in hcall.S
- implement hypercalls returning a single value
- start fixing the fallout of the recent changes needed to get
  the kernel compiling again
2006-11-24 05:27:49 +00:00
Kip Macy
16c87dea9e add comments for cpu configuration hypervisor calls 2006-11-24 02:37:51 +00:00
Kip Macy
ac39496f20 move CDDL licensed machine description support routine files to cddl directory
update files.sun4v accordingly
2006-11-24 01:56:46 +00:00
Kip Macy
c4a6aff220 Add in initial clean room implementation of hypervisor interfaces 2006-11-23 23:47:53 +00:00
Kip Macy
ac4bca68bc Remove system critical files with CDDL origin
with the plan being to create clean room versions
2006-11-23 21:22:06 +00:00
Kip Macy
39c747a0e8 separate out legitimately CDDL code - optimized routines taken from
opensolaries
2006-11-23 19:58:06 +00:00
Kip Macy
4bd2dcb8aa Add watchdog support 2006-11-23 04:59:29 +00:00
Kip Macy
9f349fb200 Add in missing hypercall numbers 2006-11-23 04:38:14 +00:00
Kip Macy
d4abe2870a re-name misnamed single character console interfaces
add in multi character console interfaces
2006-11-23 04:18:21 +00:00
Kip Macy
bcbf936a19 Add hypervisor interfaces for logical domain channels from the hypervisor API docs
remove bogus CDDL
2006-11-23 03:52:39 +00:00
Kip Macy
185877cc43 In contrast to the non-obvious and flexible nature of the optimized bcopy in t1_copy.S (which
shall retain its CDDL copyright, and thus likely be removed from GENERIC) I have removed the CDDL
from hcall.S because there is zero flexibility in the implementation of hypercalls as they derive
directly from the hypervisor interface which is not copyrighted (ironically the source for the
hypervisor itself is BSD licensed).

It is best to start any bikeshed about this as soon as possible.

Discussed with: bsdimp
2006-11-23 02:25:16 +00:00
Kip Macy
88277acb0b Integrate, but do not enable support for dynamically resizing TSBs 2006-11-22 05:54:24 +00:00
Kip Macy
c469ef9bde pmap_track_modified has been removed from other architectures -
likewise remove from sun4v
2006-11-22 04:50:55 +00:00
Kip Macy
19a3dd4f75 reduce whining from LINT by removing another GPL sound driver 2006-11-22 04:35:58 +00:00
Kip Macy
34ec92870b add support for resizing the the tte_hash of multi-threaded processes 2006-11-22 04:33:34 +00:00
Kip Macy
9872e24f3c remove unused field from pcpu structure 2006-11-22 04:27:24 +00:00
Kip Macy
270917fafa remove dead code from tsb.c
switch tsbscratch over to using order of number of pages as opposed to actual number of pages
switch tsb.c over to using wrappers for contig page allocation
2006-11-22 04:13:30 +00:00
Kip Macy
793555da5e move contiguous allocation and free routines from tte_hash.c into pmap.c 2006-11-22 03:35:37 +00:00
Kip Macy
367e547fee Add tte_hash and tsb update handlers for handling tte_hash and tsb
resizing across cpus
2006-11-22 01:47:58 +00:00
Kip Macy
3cf970e7bc Add mechanism to track TSB misses in tsb miss handler
Remove unused debug code
2006-11-22 00:18:22 +00:00
Kip Macy
3ad9c2127b remove unused fields
Approved by: scottl (standing in for mentor rwatson)
2006-11-18 19:23:37 +00:00
Kip Macy
1a667f5b30 eeprom has been removed from sun4v - remove from NOTES 2006-11-18 17:16:02 +00:00
Kip Macy
8c8a01d714 Remove two completely unused files
Reviewed by: jb (mentor rwatson)
2006-11-18 07:28:47 +00:00
Kip Macy
fda40eb204 Remove two more duplicated files
Reviewed by: jb (mentor is rwatson)
2006-11-18 07:24:56 +00:00
Kip Macy
1eea142b6f remove 13 (largely) redundant files and switch to the sparc64/sparc64 version
Reviewed by: jb (mentor rwatson)
2006-11-18 07:10:52 +00:00
Kip Macy
4570c1c11b Resize the hash table upwards if the number of collision entries is greater than 1/4 of the
total
it is possible that the resize threshold should be revised upwards

Approved by: scottl (standing in for mentor rwatson)
2006-11-16 07:50:33 +00:00
Kip Macy
827686013d Heavily re-factor tte_hash to remove redundant code
Add hash resizing support - doesn't quite work yet
2006-11-15 06:29:52 +00:00
Kip Macy
e5cedd89dc add trap trace to tl1 trap 2006-11-15 03:53:27 +00:00
Kip Macy
33719d6b50 add trap tracing to dev_mondo 2006-11-15 03:20:12 +00:00
Kip Macy
f30d482097 add trap tracing to cpu mondo handler and tsb miss handler 2006-11-15 03:16:30 +00:00
Ruslan Ermilov
26af9ac7d0 Fix a comment. 2006-11-13 06:26:57 +00:00
Christian S.J. Peron
430e6e77f0 Enable syscall auditing for sun4v the arch by implementing the
AUDIT_SYSCALL_ENTER/EXIT macros.

Discussed with:	kmacy
2006-11-13 04:38:57 +00:00
Kip Macy
f719846d36 Add time-of-day support to sun4v 2006-11-13 01:02:18 +00:00
Alan Cox
44b8bd66f9 Make pmap_enter() responsible for setting PG_WRITEABLE instead
of its caller.  (As a beneficial side-effect, a high-contention
acquisition of the page queues lock in vm_fault() is eliminated.)
2006-11-12 21:48:34 +00:00
Kip Macy
9d6220e622 Support up to 4 nucleus mappings to workaround issue hit by jb@ when booted
off of CD
2006-11-12 01:21:15 +00:00
Kip Macy
7c0435b933 MUTEX_PROFILING has been generalized to LOCK_PROFILING. We now profile
wait (time waited to acquire) and hold times for *all* kernel locks. If
the architecture has a system synchronized TSC, the profiling code will
use that - thereby minimizing profiling overhead. Large chunks of profiling
code have been moved out of line, the overhead measured on the T1 for when
it is compiled in but not enabled is < 1%.

Approved by: scottl (standing in for mentor rwatson)
Reviewed by: des and jhb
2006-11-11 03:18:07 +00:00
John Birrell
8328fa1871 Enable ata and atapicd now those work on sun4v. 2006-11-09 08:49:13 +00:00
Kip Macy
42013c462c move panic_bad_hcall to its use site in support.S in attempt to un-break the
tinderbox
2006-11-08 22:16:05 +00:00
Kip Macy
dc4468ed17 Fix for ithread interrupt handling. Don't reset the interrupt vector until
after the interrupt has been handled. Also move panic_bad_hcall to local to
avoid complaints from the linker on the tinderbox.

Approved by: scottl (substituting for mentor rwatson)
2006-11-08 22:09:58 +00:00
Robert Watson
acd3428b7d Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges.  These may
require some future tweaking.

Sponsored by:           nCircle Network Security, Inc.
Obtained from:          TrustedBSD Project
Discussed on:           arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
                        Alex Lyashkov <umka at sevcity dot net>,
                        Skip Ford <skip dot ford at verizon dot net>,
                        Antoine Brodin <antoine dot brodin at laposte dot net>
2006-11-06 13:42:10 +00:00
John Birrell
8391a99bf7 Remove the KDTRACE option again because of the complaints about having
it as a default.

For the record, the KDTRACE option caused _no_ additional source files
to be compiled in; certainly no CDDL source files. All it did was to
allow existing BSD licensed kernel files to include one or more CDDL
header files.

By removing this from DEFAULTS, the onus is on a kernel builder to add
the option to the kernel config, possibly by including GENERIC and
customising from there. It means that DTrace won't be a feature
available in FreeBSD by default, which is the way I intended it to be.

Without this option, you can't load the dtrace module (which contains
the dtrace device and the DTrace framework). This is equivalent to
requiring an option in a kernel config before you can load the linux
emulation module, for example.

I think it is a mistake to have DTrace ported to FreeBSD, but not
to have it available to everyone, all the time. The only exception
to this is the companies which distribute systems with FreeBSD embedded.
Those companies will customise their systems anyway. The KDTRACE
option was intended for them, and only them.
2006-11-04 23:50:12 +00:00
John Birrell
1f80cd9398 Build in kernel support for loading DTrace modules by default. This
adds the hooks that DTrace modules register with, and adds a few functions
which have the dtrace_ prefix to allow the DTrace FBT (function boundary
trace) provider to avoid tracing because they are called from the DTtrace
probe context.

Unlike other forms of tracing and debug, DTrace support in the kernel
incurs negligible run-time cost.

I think the only reason why anyone wouldn't want to have kernel support
enabled for DTrace would be due to the license (CDDL) under which DTrace
is released.
2006-11-04 04:58:10 +00:00
Kip Macy
45897edf72 - map hardware trap numbers to those used by by sparc64 for inter-compatibility
and to make user-level trap handlers work
- add new trap entry to trap table to enable fast fetching of floating point trap
  context
- remove unused debug code
- map unimplemented floating point trap to SIGFPE

Approved by: scottl (standing in for mentor rwatson)
2006-11-03 23:41:53 +00:00
John Birrell
34408d484b The relocation definitions are now defined in the machine independent
elf_common.h so that one arch can identify relocations on another
arch.
2006-11-03 23:03:46 +00:00
Kip Macy
1df1b94714 Fix initialization sequence for console
Fix commenting convention slightly
Approved by: rwatson (mentor)
Reviewed by: jb
2006-11-03 07:29:09 +00:00
Kip Macy
00a8f0b4ff make sure physmem is initialized
add clarifying comments
Reviewed by: jb
Approved by: rwatson (mentor)
2006-11-03 07:27:55 +00:00
John Birrell
fd77f832c7 Add a low level function to write a string to the hypervisor
console directly.

Discussed with: kmacy
2006-11-03 06:31:56 +00:00
John Birrell
a836d5f134 Spaces to tabs. (I shouldn't copy and paste from diff output to a terminal)
Noticed by: pjd
2006-11-01 21:33:17 +00:00
John Birrell
0415f45d48 Add the trap-trace function for the hypervisor. 2006-11-01 08:14:14 +00:00
Marius Strobl
a1610d83b8 In the replacement text of the __bswapN_const() macros encapsulate the
argument in parentheses so these macros are safe to use and invocations
with an expression as the argument like __bswap32_const(42 << 23 | 13)
work as expected. Additionally, mask all the individually shifted bytes
as appropriate so the bytes which exceed the width of the respective
__bswapN_const() macro in invocations like __bswap16_const(0xdead600d)
are ignored like it's the case with the corresponding __bswapN_var()
function.

MFC after:	3 days
2006-10-30 21:50:11 +00:00
John Birrell
013d6d8cb4 Add 'options KSE' to the kernel config DEFAULTS on all arches/machines
except sun4v.

This change makes the transition from a default to an option more
transparent and is an attempt to head off all the compliants that are
likely from people who don't read UPDATING, based on experience with
the io/mem change.

Submitted by:	scottl@
2006-10-26 22:05:25 +00:00
John Birrell
8460a577a4 Make KSE a kernel option, turned on by default in all GENERIC
kernel configs except sun4v (which doesn't process signals properly
with KSE).

Reviewed by:	davidxu@
2006-10-26 21:42:22 +00:00
Ruslan Ermilov
837f167eb2 Move "device splash" back to MI NOTES and "files", it's MI. 2006-10-23 13:23:14 +00:00
Ruslan Ermilov
9fae53b036 Mechanically kill redundant nodevice/nooption/nomakeoption, i.e.,
those that do not exist in MI NOTES or switched on/off in the MD
NOTES.
2006-10-23 09:45:22 +00:00
Xin LI
e3af7c042c Fix build: remove (now) unnecessary PG_BUSY check, it's handled by vm object locking. 2006-10-22 16:33:43 +00:00
John Birrell
1e118f7ee8 Comment out a debug entry which doesn't compile. Needed to fix LINT. 2006-10-17 03:53:38 +00:00
David Xu
3c573376e8 rename casuptr to casuword. 2006-10-17 03:05:17 +00:00
John Birrell
89eb1586fe Comment out 'device isa'.
Add a lot of nodevice entries for things that depend on isa, kbd and
other PC-centric things.
2006-10-16 22:06:59 +00:00
John Birrell
6a7d6d5826 Ignore the uart device. 2006-10-13 21:55:30 +00:00
John Birrell
2fe1d072b3 sun4v uses the sparc64 version of this file. 2006-10-13 21:30:18 +00:00
John Birrell
ede64266d2 Attempt to fix the sun4v tinderbox build (which unhelpfully doesn't
report errors -- just warnings).

Submitted by: ru@
2006-10-13 09:55:17 +00:00
Kip Macy
ae5c4e2140 Fix console and update to new api.
The style changes are required as a result of the CONSOLE_DRIVER macro.

Approved by: rwatson (mentor)
2006-10-13 06:45:50 +00:00
Kip Macy
f53c03a0e3 This won't be needed. There is already userland infrastructure for fpu emulation
for sparc64.
2006-10-12 06:00:34 +00:00
Ruslan Ermilov
5b7be20f58 Fix CPU value. 2006-10-11 04:14:41 +00:00
Kip Macy
25e328499c kernel clean up to make the sun4v kernel build
Reviewed by: jmg
Approved by: rwatson (mentor)
2006-10-09 04:45:19 +00:00
Simon L. B. Nielsen
4517aab293 - Remove SCHED_ULE from GENERIC to better avoid foot-shooting by
unsuspecting users.
- Add a comment in NOTES about experimental status of SCHED_ULE.
- Make warning about experimental status in sched_ule(4) a bit
  stronger.

Suggested and reviewed by:	dougb
Discussed on:			developers
MFC after:			3 days
2006-10-05 20:31:58 +00:00
Kip Macy
2405f16615 placate Grim Reaper with sun4v support 2006-10-05 06:14:28 +00:00