due to them being faster in certain cases. Therefore we need to save
and restore the v8 %y register around traps in kernel mode as well as
traps in usermode.
Tested by: obrien, tmm
until we do some testing to see what's best. This gives a massive reduction
in system time for processes with a relatively large working set. The size
of the tsb directly affects the rss size that a user process can keep mapped.
When it starts to get full replacements occur and the process takes a lot of
soft vm faults. Increasing the default from 1 page to 2 gives the following
before and after numbers for compiling vfs_bio.c:
before:
14.27 real 6.56 user 5.69 sys
after:
8.57 real 6.11 user 1.62 sys
This should make self hosted builds more tolerable.
I don't believe anyone is quite using the sparc64 kernel sources in CVS
yet -- things aren't just quite ready (but almost). So this commit should
be OK to make.
window to the user stack while in a nested kernel trap. We do this for
entry to the kernel from user mode, but if we get an interrupt in kernel
mode while there are still user windows in the cpu, and we attempt to spill
to the user stack, we may take too many nested traps and overflow the trap
stack, causing a red state exception. This is needed by upcoming changes
to allow the user tsb to not be locked in the tlb.
Reviewed by: tmm
virtual page number in a much more convenient way; all in one piece. This
greatly simplifies the comparison for a matching tte, and allows the fault
handlers to be much simpler due to not having to load wierd masks.
Rewrite the tlb fault handlers to account for the new format. These are also
written to allow faults on the user tsb inside of the fault handlers; the
kernel fault handler must be aware of this and not clobber the other's
registers. The faults do not yet occur due to other support that is needed
(and still under my desk).
Bug fixes from: tmm
Most of the contents are commented out as they are as-yet untested.
However, I wanted the contents to match our other arches, so that when
people make changes to {i386,alpha,ia64}, they will also make the same
changes here.
pmap_qenter and pmap_qremove in preference to pmap_kenter/pmap_kremove.
The former maps in multiple pages at a time, and so can do a ranged
flush. Don't assume that pmap_kenter and pmap_kremove will flush the tlb,
even though they still do. It will not once the MI code is updated to use
pmap_qenter and pmap_qremove.
will be used to reduce the number of tlb shootdown ipis in an smp system
by sending one ipi for a whole range of pages, instead of one per page.
Munge the context demap operations slightly to support demapping a non-primary
context.
If we don't do this here there's a 1 instruction race where an interrupt
could come in and crash the user process due to having no stack.
2. Pass %fsr to the user trap handler in %l4. Since %fsr can only be loaded
from or stored to memory, we need to do some contortions and temporarily
save it to the alternate global stack.
3. Reload the pcb and pcpu registers for traps in kernel mode, for sanity.
Submitted by: tmm (1, 2)
While in userland, keep the thread's ucred reference in a shadow
field so that the usual place to store it is NULL.
If DIAGNOSTIC is not set, the thread ucred is kept valid until the next
kernel entry, at which time it is checked against the process cred
and possibly corrected. Produces a BIG speedup in
kernels with INVARIANTS set. (A previous commit corrected it
for the non INVARIANTS case already)
Reviewed by: dillon@freebsd.org
deprecated in favor of the POSIX-defined lowercase variants.
o Change all occurrences of NTOHL() and associated marcros in the
source tree to use the lowercase function variants.
o Add missing license bits to sparc64's <machine/endian.h>.
Approved by: jake
o Clean up <machine/endian.h> files.
o Remove unused __uint16_swap_uint32() from i386's <machine/endian.h>.
o Remove prototypes for non-existent bswapXX() functions.
o Include <machine/endian.h> in <arpa/inet.h> to define the
POSIX-required ntohl() family of functions.
o Do similar things to expose the ntohl() family in libstand, <netinet/in.h>,
and <sys/param.h>.
o Prepend underscores to the ntohl() family to help deal with
complexities associated with having MD (asm and inline) versions, and
having to prevent exposure of these functions in other headers that
happen to make use of endian-specific defines.
o Create weak aliases to the canonical function name to help deal with
third-party software forgetting to include an appropriate header.
o Remove some now unneeded pollution from <sys/types.h>.
o Add missing <arpa/inet.h> includes in userland.
Tested on: alpha, i386
Reviewed by: bde, jake, tmm
patch from a year ago: give file flags their own type. This does not
(yet) change the type used by system calls or library functions.
The underlying type was chosen to match what is returned by stat().
support for managing both streaming caches on psycho pairs).
Use explicit bus space accesses instead of mapping the device memory into
kva.
Move DVMA allocation to the map creation/dma memory allocation functions.
disable interrupts completely, and stxa_sync(), which performs a store
immediately followed by a membar #Sync with interrupts disabled (this
is needed for writes to diagnostic registers).
this is a low-functionality change that changes the kernel to access the main
thread of a process via the linked list of threads rather than
assuming that it is embedded in the process. It IS still embeded there
but remove all teh code that assumes that in preparation for the next commit
which will actually move it out.
Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,
some arches and the syscall table is machine-independent. It was
(bogusly) conditional on COMPAT_43, so this usually makes no difference.
ia64: in addition:
- replace the bogus cloned comment before osigreturn() by a correct one.
osigreturn() is just a stub fo ia64's.
- fix the formatting of cloned comment before sigreturn().
- fix the return code. use nosys() instead of returning ENOSYS to get
the same semantics as if the syscall is not in the syscall table.
Generating SIGSYS is actually correct here.
- fix style bugs.
powerpc: copy the cleaned up ia64 stub. This mainly fixes a bogus comment.
sparc64: copy the cleaned up the ia64 stub, since there was no stub before.
cpu(s) into the kernel, and sync-ing them up to "kernel" mode so we can
send them ipis, which also work.
Thanks to John Baldwin for providing me with access to the hardware
that made this possible.
Parts obtained from: bsd/os
Call critical_enter/critical_exit around (fast) interrupt handlers. All
non-threaded interrupts are fast, and the threaded interrupt scheduler is
itself a fast interrupt.
Assert that an interrupt handler we are about to call is non-zero.
Be paranoid about restoring the users global registers. Do it as the
last thing before switching to alternate globals (when we magically get
our preloaded registers back), and do it with interrupts disabled. Any
kind of kernel trap when the globals are not setup properly is bad news.
Don't save and restore the kernel g6, it invariably points to the current
pcb now.
data word in an interrupt packet is non-zero, it points to code to execute
to handle the ipi, so jump to it instead of enqueueing the packet. It
is unclear if we will need queued ipis.
Interrupt g7 now points to pcpu, instead of to the per-cpu interrupt queue
itself, so use that instead. Interrupt g6 is no longer reserved.
parameters needed for smp support.
If we are not the boot processor, jump to the smp startup code instead.
Implement a per-cpu panic stack, which is used for bootstrapping both
primary and secondary processors and during faults on the kernel stack.
Arrange the per-cpu page like the pcb, with the struct pcpu at the end
of the page and the panic stack before it.
Use the boot processor's panic stack for calling sparc64_init.
Split the code to set preloaded global registers and to map the kernel
tsb out into functions, which non-boot processors can call.
Allocate the kstack for thread0 dynamically in pmap_bootstrap, and give
it a guard page too.
to the current pcb.
Remove interrupt global defines; they use PCPU_REG now.
Move ATOMIC_INC_INT here from exception.s, add ATOMIC_DEC_INT.
Add a KASSERT macro for use in assembler.
substantial fraction of the number of entries of tte's in the tsb
would need to be looked up, traverse the tsb instead. This is crucial
in some places, e.g. when swapping out a process, where a certain
pmap_remove() call would take very long time to complete without this.
2. Implement pmap_qenter_flags(), which will become used later
3. Reactivate the instruction cache flush done when mapping as executable.
This is required e.g. when executing files via NFS, but is known to
cause problems on UltraSPARC-IIe CPU's. If you have such a CPU, you
will need to comment this call out for now.
Submitted by: jake (3)
struct ofw_nexus_reg. Implement UPA device memory management in the
nexus driver.
Adapt the psycho driver to these changes, and do some minor cleanup work
while being there.
for certain user pages, stores to kernel pages would not update the
affected cache lines, which would sometimes cause the wrong data to be
returned for loads from kernel pages. This was especially fatal when
the addresses affected held the kernel stack pointer, and a random
value was loaded into it.
Fix a harmless off by one error in a dcache_inval_phys call.
Fix a potential race in setting up the per-cpu pointer if the special
restore fails on return to user mode fails and we need to trap back
into the kernel to fault in more stack.
Remove debug code.
an efficient way for the kernel to bounce certain mundane traps back to
userland for handling there. A user trap handler returns directly to the
trapping user code, rather than going through the kernel again. Only a
handful of instructions are actually executed in kernel mode.
Implement sysarch(SPARC_UTRAP_INSTALL).
Add code to handle sharing of the user trap table across forks and unsharing
at exec.
This can be used to implement efficient tracking of floating point register
usage in userland, fe by a thread library, and to handle alignment fault
fixups and instruction emulation in userland, for which the code may need
to be different for 32bit and 64bit binaries.
something wrong with the kernel stack.
Add code to check the kernel stack pointer in various important places
and try hard not to go down in flames if its wrong.
Add fields to md_page for tracking virtual page color, and pv entry
lists.
Fix pmap_track_modified to work for non-kernel pmaps. This is due to
kernel virtual addresses potentially overlapping with userland addresses.
much less magic, fragile, broken. Use ttes rather than sttes.
We still use the replacement scheme used by the original code, which
is pretty cool.
Many crucial bug fixes from: tmm
of sttes. Also removes many differences between this and the other pmaps.
Reserve the kva space used by the openfirmware translations.
Use physical addresses directly in pmap_zero_page and pmap_copy_page, now
that we have the cache line shooting support.
Add code to track the virtual cachability of mapped pages. The dmmu
requires that multiple mappings of the same phsyical address have the
save virtual address bits up to a colour boundary. Violating this
requires all mappings to be mapped uncacheable. We do not yet handle
the case of a badly aliased mapping becoming cachable again.
Many crucial bug fixes from: tmm
the registers so we don't uselessly save them over and over again for
each context switch until another floating point instruction is executed.
Use a non-specific tlb slot for the tsb, which needs to have a locked
entry.
Remove overly verbose traces.
Add macros to atomically increment an integer variable in the data
section and to atomically set a bit in a tte. Note that the latter
does not return the new value.
Rewrite RESUME_SPILLFILL_MAGIC to use more sensical calculations, and
to preserve all alternate globals religiously. Must now be called on
alternate globals.
Defer switching to the kernel stack until inside the syscall, trap,
interrupt wrappers. Splitting the windows is all that's really urgent.
Adapt to new trap types.
Add %xcc where appropriate in order to not use v8 opcodes inadvertantly
(which work fine).
Modify the low level tlb fault handlers to operate on a tsb made up of
ttes, not sttes. This effectively makes the tsb twice as large.
After atomically updating tte bits in memory, also set the bit in the
register that holds the data which will be loaded into the tlb. The
macro returns the old value.
Use the preloaded mmu global which holds the address of the current
user tsb.
Add back a low level protection fault handler instead of just punting
into the vm system. This effectively saves a soft fault per COW fault.
Add a trace to intr_enqueue.
Pass arguments to the trap, interrupt, syscall wrappers in the out
registers instead of some on the stack, some in registers.
Use the preloaded alternate global pcb register.
2. Make trap_pfault more like it is on other architectures.
3. Fix a bug in syscall() which caused system calls with more than
six arguments that are called through the wild card syscall to
have their arguments scrambled. This affected mmap due to the
(bogus) wrapper in libc.
Submitted by: tmm (3)
Add some traces that can be useful but are also very loud.
Use defines for offsets into jmpbuf instead of magic numbers.
Fix a style bug.
Fixup comments.
specified by the sparc abi. We use numerically higher values for all
internal kernel types.
Remove soft trap types which need to be exposed to userland. They will
move to utrap.h.
their duration. This is still only effective as long as they are
only used in the static kernel. Code in modules may cause instruction
faults which makes these break in different ways anyway.
2. Add a load bearing membar #Sync.
3. Add an inline for demapping an entire context.
Submitted by: tmm (1, 2)
Bloat trapframe with many extra fields so we don't need extra structures.
Use small data types where possible.
Remove second copy of TF_DONE.
Remove mmuframe.
- The MD functions critical_enter/exit are renamed to start with a cpu_
prefix.
- MI wrapper functions critical_enter/exit maintain a per-thread nesting
count and a per-thread critical section saved state set when entering
a critical section while at nesting level 0 and restored when exiting
to nesting level 0. This moves the saved state out of spin mutexes so
that interlocking spin mutexes works properly.
- Most low-level MD code that used critical_enter/exit now use
cpu_critical_enter/exit. MI code such as device drivers and spin
mutexes use the MI wrappers. Note that since the MI wrappers store
the state in the current thread, they do not have any return values or
arguments.
- mtx_intr_enable() is replaced with a constant CRITICAL_FORK which is
assigned to curthread->td_savecrit during fork_exit().
Tested on: i386, alpha
- The MI portions of struct globaldata have been consolidated into a MI
struct pcpu. The MD per-CPU data are specified via a macro defined in
machine/pcpu.h. A macro was chosen over a struct mdpcpu so that the
interface would be cleaner (PCPU_GET(my_md_field) vs.
PCPU_GET(md.md_my_md_field)).
- All references to globaldata are changed to pcpu instead. In a UP kernel,
this data was stored as global variables which is where the original name
came from. In an SMP world this data is per-CPU and ideally private to each
CPU outside of the context of debuggers. This also included combining
machine/globaldata.h and machine/globals.h into machine/pcpu.h.
- The pointer to the thread using the FPU on i386 was renamed from
npxthread to fpcurthread to be identical with other architectures.
- Make the show pcpu ddb command MI with a MD callout to display MD
fields.
- The globaldata_register() function was renamed to pcpu_init() and now
init's MI fields of a struct pcpu in addition to registering it with
the internal array and list.
- A pcpu_destroy() function was added to remove a struct pcpu from the
internal array and list.
Tested on: alpha, i386
Reviewed by: peter, jake