of sttes. Also removes many differences between this and the other pmaps.
Reserve the kva space used by the openfirmware translations.
Use physical addresses directly in pmap_zero_page and pmap_copy_page, now
that we have the cache line shooting support.
Add code to track the virtual cachability of mapped pages. The dmmu
requires that multiple mappings of the same phsyical address have the
save virtual address bits up to a colour boundary. Violating this
requires all mappings to be mapped uncacheable. We do not yet handle
the case of a badly aliased mapping becoming cachable again.
Many crucial bug fixes from: tmm
the registers so we don't uselessly save them over and over again for
each context switch until another floating point instruction is executed.
Use a non-specific tlb slot for the tsb, which needs to have a locked
entry.
Remove overly verbose traces.
Add macros to atomically increment an integer variable in the data
section and to atomically set a bit in a tte. Note that the latter
does not return the new value.
Rewrite RESUME_SPILLFILL_MAGIC to use more sensical calculations, and
to preserve all alternate globals religiously. Must now be called on
alternate globals.
Defer switching to the kernel stack until inside the syscall, trap,
interrupt wrappers. Splitting the windows is all that's really urgent.
Adapt to new trap types.
Add %xcc where appropriate in order to not use v8 opcodes inadvertantly
(which work fine).
Modify the low level tlb fault handlers to operate on a tsb made up of
ttes, not sttes. This effectively makes the tsb twice as large.
After atomically updating tte bits in memory, also set the bit in the
register that holds the data which will be loaded into the tlb. The
macro returns the old value.
Use the preloaded mmu global which holds the address of the current
user tsb.
Add back a low level protection fault handler instead of just punting
into the vm system. This effectively saves a soft fault per COW fault.
Add a trace to intr_enqueue.
Pass arguments to the trap, interrupt, syscall wrappers in the out
registers instead of some on the stack, some in registers.
Use the preloaded alternate global pcb register.
2. Make trap_pfault more like it is on other architectures.
3. Fix a bug in syscall() which caused system calls with more than
six arguments that are called through the wild card syscall to
have their arguments scrambled. This affected mmap due to the
(bogus) wrapper in libc.
Submitted by: tmm (3)
Add some traces that can be useful but are also very loud.
Use defines for offsets into jmpbuf instead of magic numbers.
Fix a style bug.
Fixup comments.
specified by the sparc abi. We use numerically higher values for all
internal kernel types.
Remove soft trap types which need to be exposed to userland. They will
move to utrap.h.
their duration. This is still only effective as long as they are
only used in the static kernel. Code in modules may cause instruction
faults which makes these break in different ways anyway.
2. Add a load bearing membar #Sync.
3. Add an inline for demapping an entire context.
Submitted by: tmm (1, 2)
Bloat trapframe with many extra fields so we don't need extra structures.
Use small data types where possible.
Remove second copy of TF_DONE.
Remove mmuframe.
- The MD functions critical_enter/exit are renamed to start with a cpu_
prefix.
- MI wrapper functions critical_enter/exit maintain a per-thread nesting
count and a per-thread critical section saved state set when entering
a critical section while at nesting level 0 and restored when exiting
to nesting level 0. This moves the saved state out of spin mutexes so
that interlocking spin mutexes works properly.
- Most low-level MD code that used critical_enter/exit now use
cpu_critical_enter/exit. MI code such as device drivers and spin
mutexes use the MI wrappers. Note that since the MI wrappers store
the state in the current thread, they do not have any return values or
arguments.
- mtx_intr_enable() is replaced with a constant CRITICAL_FORK which is
assigned to curthread->td_savecrit during fork_exit().
Tested on: i386, alpha
- The MI portions of struct globaldata have been consolidated into a MI
struct pcpu. The MD per-CPU data are specified via a macro defined in
machine/pcpu.h. A macro was chosen over a struct mdpcpu so that the
interface would be cleaner (PCPU_GET(my_md_field) vs.
PCPU_GET(md.md_my_md_field)).
- All references to globaldata are changed to pcpu instead. In a UP kernel,
this data was stored as global variables which is where the original name
came from. In an SMP world this data is per-CPU and ideally private to each
CPU outside of the context of debuggers. This also included combining
machine/globaldata.h and machine/globals.h into machine/pcpu.h.
- The pointer to the thread using the FPU on i386 was renamed from
npxthread to fpcurthread to be identical with other architectures.
- Make the show pcpu ddb command MI with a MD callout to display MD
fields.
- The globaldata_register() function was renamed to pcpu_init() and now
init's MI fields of a struct pcpu in addition to registering it with
the internal array and list.
- A pcpu_destroy() function was added to remove a struct pcpu from the
internal array and list.
Tested on: alpha, i386
Reviewed by: peter, jake
o Hide nonstandard functions and types in <netinet/in.h> when
_POSIX_SOURCE is defined.
o Add some missing types (required by POSIX.1-200x) to <netinet/in.h>.
o Restore vendor ID from Rev 1.1 in <netinet/in.h> and make use of new
__FBSDID() macro.
o Fix some miscellaneous issues in <arpa/inet.h>.
o Correct final argument for the inet_ntop() function (POSIX.1-200x).
o Get rid of the namespace pollution from <sys/types.h> in
<arpa/inet.h>.
Reviewed by: fenner
Partially submitted by: bde
in asm files.
2. Temporarily cause subnormal operands in floating point operations
to be treated as zeros so that comlpetion of the operation does not
need to be emulated.
3. Catch fp_exception_other and correctly skip over the unfinished
instruction, but basically ignore them. Emulating the instruction
is not yet supported.
4. Zero td_retval[1] as well in syscall().
Submitted by: tmm (2, 3)
2. Add a TF_DONE macro, which fiddles a trapframe to make the retry on
return from traps act like a done (advance past the trapping
instruction instead of re-executing).
3. Flush the windows before entering the debugger, since it is no
longer done in the breakpoint trap vector.
4. Print a warning if trace <pid> is attempted, it is not yet implemented.
5. Print traps better and decode system calls in traces.
Submitted by: rwatson (4)
to determine if a process is using floating point. in order to avoid
sign extending a 13 bit immediate.
2. We don't need to context switch cwp anymore, it is better to just
fiddle the save tstate on return from traps. See exception.s 1.10
and 1.12.
3. Completely remove pcb_cwp.
4. Implement vmapbuf, vunmapbuf and vm_fault_quick. Completely remove
TODOs from vm_machdep.c (yay!).
Submitted by: tmm (1, 3, 4)
Obtained from: existing archs (4)
space from kernel space and from an alternate address space to kernel
space.
2. Remove the unused and unprototyped physcopy() and physzero() and replace
with the more versatile ascopy() and aszero(), inspired by the above.
These can be used to copy and zero physical pages of memory without mapping
them into kernel space first.
3. Use magic numbers for the offsets in the jmpbuf structure like other
platforms.
4. Use SET.
Submitted by: tmm (1, 4)
in the window trap vectors were mixed up. All this did is cause unnecesary
traps and look wierd in traces. Superfluous traps happen a lot in normal
operation, so we are rather good at recovering from them.
2. Store the arguments for a ktr trace in the right place.
3. Use a generic trap vector for breakpoints. It should not be special.
4. Save the frame pointer in the trap frame for kernel traps if DDB is compiled
in, otherwsie we don't save the out registers for kernel traps and stack
traces can't go through nested traps.
5. Apply the same fix to the return from kernel mode trap code as for user
mode traps. Ensure that the window we're returning to is the same one
that we restore to by fiddling the cwp in the saved tstate. This requires
that we transfer the values loaded from the trap frame into alternate
globals before restore-ing, but doing so is not very expensive and not
worth worrying about. Not changing the saved cwp can result in the register
values magically changing on return from traps if we happen to have slept
and the windows don't work out exactly the same. Fix the trace just before
the retry to account for different register usage.
6. Use a SET macro for loading address constants rather than a variation of
set and setx. set only works for 32 bit constants, while setx works for
64 bit constants as well, but produces bloated code when unnecessary.
Gas always generates the canonical 2 register, 6 instruction form, even
when it could be optimized; set uses 1 register and 2 instructions. At
the moment we assume that the kernel binary is below 4GB so set is
always sufficient, but the macro allows it to be configured. Note that
this has nothing to do with 32 vs. 64 bit address space, it only applies
to addresses of symbols which are known at compile/link time.
Submitted by: tmm (6)
this case, the firmware trap table needs to be restored). Make use of
it in cpu_halt() and cpu_reset(), and make cpu_reset() reboot the kernel
that was used previously insead of behaving like cpu_halt().
Add a shutdown_final event handler that turns the power off if requested.
unaligned accesses, and instr.h, which contrains definitions for the
sparc64 instruction set (partly from NetBSD).
Make use of some definitions from instr.h in db_disasm.c.
o Make <stdint.h> a symbolic link to <sys/stdint.h>.
o Move most of <sys/inttypes.h> into <sys/stdint.h>, as per C99.
o Remove <sys/inttypes.h>.
o Adjust includes in sys/types.h and boot/efi/include/ia64/efibind.h
to reflect new location of integer types in <sys/stdint.h>.
o Remove previously symbolicly linked <inttypes.h>, instead create a
new file.
o Add MD headers <machine/_inttypes.h> from NetBSD.
o Include <sys/stdint.h> in <inttypes.h>, as required by C99; and
include <machine/_inttypes.h> in <inttypes.h>, to fill in the
remaining requirements for <inttypes.h>.
o Add additional integer types in <machine/ansi.h> and
<machine/limits.h> which are included via <sys/stdint.h>.
Partially obtain from: NetBSD
Tested on: alpha, i386
Discussed on: freebsd-standards@bostonradio.org
Reviewed by: bde, fenner, obrien, wollman
userland. The per thread ucred reference is immutable and thus needs no
locks to be read. However, until all the proc locking associated with
writes to p_ucred are completed, it is still not safe to use the per-thread
reference.
Tested on: x86 (SMP), alpha, sparc64
{set,fill}_{,fp,db}regs() fixup:
- Add dummy {set,fill}_dbregs() on architectures that don't have them.
- KSEfy the powerpc versions (struct proc -> struct thread).
- Some architectures had the prototypes in md_var.h, some in reg.h, and
some in both; for consistency, move them to reg.h on all platforms.
These functions aren't really MD (the implementation is MD, but the interface
is MI), so they should move to an MI header, but I haven't figured out which
one yet.
Run-tested on i386, build-tested on Alpha, untested on other platforms.
was used. This resulted in bogus bad window traps (invalid wstate).
Add a trace to sfsr traps (alignment among other things).
Use KTR_TRAP instead of KTR_CT1.
Use the right registers when storing the values of various
mmu registers into the trap frame. This fixes a bug where sometimes
the context number reported by a fault would be garbage. Sometimes
it would be zero for faults on user address space so the kernel would
wrongly think that it was a fault on kernel address space and fail.
Use the preloaded registers in the vectored interrupt trap instead
of reading pointers from memory. Remove traces due to register
pressure and excess verbosity. We can probably still sneak in one
trace. Remove some debug code.
Go back to using the tsb register during kernel page table lookups.
This is the best way to not have to have the address of the kernel tsb be
a compile time constant. We lie and say we have 1 page tsb when really
its much larger. This way the hardware provides bits 13-22 of the
virtual address (the lower 9 bits of the virtual page number) in the
form of the address of the tte corresponding to the fault address in
the (1 page) kernel tsb. With some clever arithmetic we can then get
bits 22 and up from the tte tag and add them to the tte address in
order to index massive tsbs (basically unlimited).
Add traps for physical address hardware watchpoints.
Don't try to pass the window state from the trap table entry point
all the way down to the common trap code. Its too easy to clobber
and reading it again doesn't cost much.
Fixup some traces.
Fiddle the cwp bits on return from the kernel to user mode so that
the window we are returning to is always the same as the one we
restore to in the trap code. Strictly speaking this is not necessary,
it only affects return from fork and exec, but setting up the windows
right would require hard coding the right cwp values in cpu_fork and
setregs, basically hard coding the number of frames between syscall and
tl0_ret. The result of getting it wrong is usually a spill to an invalid
stack pointer; either 0 or pointing into kernel space. This should also
alleviate the need to context switch the cwp.
Transfer the trap state from locals to alternate globals in the trap
return code so that we can do a restore and rotate the windows before
reloading the trap registers. If the restore fails we'll trap back
into the kernel, so there's no point in loading the trap registers
before hand. Its is crucial that the window trap recovery code not
clobber the alternate globals.
boundary. It must be on at least an 8 byte boundary so that the length
of the signal code is a multiple of 8 (well aligned). The size is used
in the calculation of the address of the argument and environment vectors
on the user stack; getting it wrong results in the string pointers being
misaligned and causes alignment faults in getenv() among other things.
Allocate a regular stack frame below the signal frame on the user stack
and join up the frame pointer to the previous frame. This fixes longjmp-ing
out of signal handlers. Longjmp traverses the stack upwards in order to
find the right frame to return to, so the frame pointers must join up
seamlessly. I thought this would just work, but obviously the frame
needs to be below the signal frame, not above it like before. Account
for the extra space in the signal code.
Preload pointers to interrupt data structures in interrupt globals.
This avoids the need to load the pointers from memory in the vectored
interrupt trap handler.
Transfer the first 2 out registers into td_retval in setregs. We use
the same registers for system call arguments as return values, so these
registers got clobbered by the system call return values on return from
execve. They now get clobbered by the right values. We must put the values
in both the out registers in the trapframe and in td_retval because init
calls exec but fails to transfer the return value into the out registers.
This fixes a bug where the first exec after init would pass junk to the
c runtime, instead of a pointer to the argument strings. A better solution
would be to return EJUSTRETURN on success from execve.
Adjust for change in pmap_bootstraps prototype.
Map the message buffer after the trap table is setup. We will fault
on it immediately.
Don't use a hard coded address constant for the virtual address of the
kernel tsb. Allocate kernel virtual address space for the kernel tsb
at runtime.
Remove unused parameter to pmap_bootstrap.
Adapt pmap.c to use KVA_PAGES.
Map the message buffer too.
Add some traces.
Implement pmap_protect.
be used to index tables of counters.
Remove intr_dispatch() inline, it is implemented directly in tl*_intr now.
Count stray interrupts in a table of counters like intrcnt.
Disable interrupts briefly when setting up the interrupt vector table.
We must disable interrupts completely, not just raise the pil.
Pass pointers to the intr_vector structures rather than a vector number
to sched_ithd and intr_stray.
the existence of the __gnuc_va_list type[*] because our compiler is GCC.
[*] __gnuc_va_list is defined in the GCC ginclude/stdarg.h replacement
headerwhich we don't use.
Until now, the ptrace syscall was implemented as a wrapper that called
various functions in procfs depending on which ptrace operation was
requested. Most of these functions were themselves wrappers around
procfs_{read,write}_{,db,fp}regs(), with only some extra error checks,
which weren't necessary in the ptrace case anyway.
This commit moves procfs_rwmem() from procfs_mem.c into sys_process.c
(renaming it to proc_rwmem() in the process), and implements ptrace()
directly in terms of procfs_{read,write}_{,db,fp}regs() instead of
having it fake up a struct uio and then call procfs_do{,db,fp}regs().
It also moves the prototypes for procfs_{read,write}_{,db,fp}regs()
and proc_rwmem() from proc.h to ptrace.h, and marks all procfs files
except procfs_machdep.c as "optional procfs" instead of "standard".
Handle overlap in bcopy.
Add routines for copying and zeroing pages using physical addresses
directly.
Remove all the hacks to account for calling the firmware on its own
trap table, we use the kernel trap table. There is still a problem
with OF_exit().
easier and hopefully this code is done changing radically.
Don't use the mmu tlb register to address the kernel page table, nor
the 8k pointer register. The hardware will do some of the page table
lookup by storing the the base address in an internal register and
calculating the address of the tte in the table. However it is limited
to a 1 meg tsb, which only maps 512 megs. The kernel page table only
has one level, so its easy to just do it by hand, which has the advantage
of supporting abitrary amounts of kvm and only costs a few more instructions.
Increase kvm to 1 gig now that its easy to do so and so we don't waste
most of a 4 meg page.
Fix some traces. Fix more proc locking.
Call tsb_stte_promote if we get a soft fault on a mapping in the upper
levels of the tsb. If there is an invalid or unreferenced mapping
in the primary tsb, it will be replaced.
Immediately fail for faults occuring in {f,s}uswintr.
one 4 meg page can map both the kernel and the openfirmware mappings.
Add the openfirmware mappings to the kernel tsb so we can call the firmware
on the kernel trap table and access kernel memory normally.
Implement pmap_swapout_proc, pmap_swapin_proc, pmap_swapout_thread,
pmap_swapin_thread, pmap_activate, pmap_page_exists, and pmap_phys_address.
Add a guard page at the bottom of the kernel stack. Its unclear how easy
it will be to detect these faults and do something useful.
Setup the registers on exec how the c runtime expects.
Implement various {fill,set}_*regs.
Fix proc locking.
will be private to each CPU.
- Re-style(9) the globaldata structures. There really needs to be a MI
struct pcpu that has a MD struct mdpcpu member at some point.
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
it to the MI area. KSE touched cpu_wait() which had the same change
replicated five ways for each platform. Now it can just do it once.
The only MD parts seemed to be dealing with fpu state cleanup and things
like vm86 cleanup on x86. The rest was identical.
XXX: ia64 and powerpc did not have cpu_throw(), so I've put a functional
stub in place.
Reviewed by: jake, tmm, dillon
with user windows in kernel mode. We split the windows using %otherwin,
but instead of spilling user window directly to the pcb, we attempt to
spill to user space. If this fails because a stack page is not resident
(or the stack is smashed), the fault handler at tl 2 will detect the
situation and resume at tl 1 again where recovery code can spill to the
pcb. Any windows that have been saved to the pcb will be copied out to
the user stack on return from kernel mode.
Add a first stab at 32 bit window handling. This uses much of the same
recovery code as above because the alignment of the stack pointer is used
to detect 32 bit code. Attempting to spill a 32 bit window to a 64 bit
stack, or vice versa, will cause an alignment fault. The recovery code
then changes the window state to vector to a 32 bit spill/fill handler
and retries the faulting instruction.
Add ktr traces in useful places during trap processing.
Adjust comments to reflect new code and add many more.
Remove the modified tte bit and add a softwrite bit. Mappings are only
writeable if they have been written to, thus in general modify just
duplicates the write bit. The softwrite bit makes it easier to distinguish
mappings which should be writeable but are not yet modified.
Move the exec bit down one, it was being sign extended when used as an
immediate operand.
Use the lock bit to mean tsb page and remove the tsb bit. These are the
only form of locked (tsb) entries we support and we need to conserve bits
where possible.
Implement pmap_copy_page and pmap_is_modified and friends.
Detect mappings that are being being upgraded from read-only to read-write
due to copy-on-write and update the write bit appropriately.
Make trap_mmu_fault do the right thing for protection faults, which is
necessary to implement copy on write correctly. Also handle a bunch
more userland trap types and add ktr traces.
Instead introduce the [M] prefix to existing keywords. e.g.
MSTD is the MP SAFE version of STD. This is prepatory for a
massive Giant lock pushdown. The old MPSAFE keyword made
syscalls.master too messy.
Begin comments MP-Safe procedures with the comment:
/*
* MPSAFE
*/
This comments means that the procedure may be called without
Giant held (The procedure itself may still need to obtain
Giant temporarily to do its thing).
sv_prepsyscall() is now MP SAFE and assumed to be MP SAFE
sv_transtrap() is now MP SAFE and assumed to be MP SAFE
ktrsyscall() and ktrsysret() are now MP SAFE (Giant Pushdown)
trapsignal() is now MP SAFE (Giant Pushdown)
Places which used to do the if (mtx_owned(&Giant)) mtx_unlock(&Giant)
test in syscall[2]() in */*/trap.c now do not. Instead they
explicitly unlock Giant if they previously obtained it, and then
assert that it is no longer held to catch broken system calls.
Rebuild syscall tables.
o Unify <machine/endian.h>'s across all architectures.
o Make bswapXX() functions use a different spelling of u_int16_t and
friends to reduce namespace pollution. The bswapXX() functions
don't actually exist, but we'll probably import these at some
point. Atleast one driver (if_de) depends on bswapXX() for big
endian cases.
o Deprecate byteorder(3) prototypes from <sys/types.h>, these are
now prototyped indirectly in <arpa/inet.h>.
o Deprecate in_addr_t and in_port_t typedefs in <sys/types.h>, these
are now typedef'd in <arpa/inet.h>.
o Change byteorder(3) prototypes to use standards compliant uint32_t
(spelled __uint32_t to reduce namespace pollution).
o Document new preferred headers and standards compliance.
Discussed with: bde
PR: 29946
Reviewed by: bmilekic
during trap handlers.
Implement ptrace_set_pc FWIW.
Initialize the pcb window scratch area in setregs(), and setup user
registers as specified by the SCD.
Submitted by: tmm
kernel from usermode. The remaining user windows are spilled
to the pcb as necessary. The user land window fault handlers
fill directly from the pcb on return.
Add system call entry points.
Submitted by: tmm
the process of exiting the kernel. The ast() function now loops as long
as the PS_ASTPENDING or PS_NEEDRESCHED flags are set. It returns with
preemption disabled so that any further AST's that arrive via an
interrupt will be delayed until the low-level MD code returns to user
mode.
- Use u_int's to store the tick counts for profiling purposes so that we
do not need sched_lock just to read p_sticks. This also closes a
problem where the call to addupc_task() could screw up the arithmetic
due to non-atomic reads of p_sticks.
- Axe need_proftick(), aston(), astoff(), astpending(), need_resched(),
clear_resched(), and resched_wanted() in favor of direct bit operations
on p_sflag.
- Fix up locking with sched_lock some. In addupc_intr(), use sched_lock
to ensure pr_addr and pr_ticks are updated atomically with setting
PS_OWEUPC. In ast() we clear pr_ticks atomically with clearing
PS_OWEUPC. We also do not grab the lock just to test a flag.
- Simplify the handling of Giant in ast() slightly.
Reviewed by: bde (mostly)
Only set sticks (and acquire sched_lock) on entry from user mode.
Add handlers for all kinds of mmu misses, and for interrupts from
user mode.
Acquire Giant before calling into the vm system so this runs with
invariants.
Try to get the restrictions for page faults on user memory from
kernel mode right.
Only set pcb_onfault and return to the alternate return code if
this is actually a fault on user memory from kernel mode.
2. Use the upcoming "tick" interface.
3. Save a call frame as well as a trap frame on proc0's initial stack.
4. Setup a pointer to the per-cpu interrupt queue.
5. Install the per-cpu pointer in interrupt and alternate globals as well.
6. Flush out setregs so exec works.
Submitted by: tmm (3, 5, 6)
2. Add spill and fill handlers for spills to the user stack on entry
to the kernel.
3. Add code to handle instruction mmu misses from user mode.
4. Add code to handle level interrupts from kernel mode and vectored
interrupt traps from either.
5. Save the pil in the trapframe on entry from kernel mode and restore
it on return.
Submitted by: tmm (1, 2)
are a really nasty interface that should have been killed long ago
when 'ptrace(PT_[SG]ETREGS' etc came along. The entity that they
operate on (struct user) will not be around much longer since it
is part-per-process and part-per-thread in a post-KSE world.
gdb does not actually use this except for the obscure 'info udot'
command which does a hexdump of as much of the child's 'struct user'
as it can get. It carries its own #defines so it doesn't break
compiles.
addresses. It helps to use the physical address that the virtual address
actually maps to (doh!). Comment out some code that crashes.
Found independently by: tmm
- mostly complete kernel pmap support, and tested but currently turned
off userland pmap support
- low level assembly language trap, context switching and support code
- fully implemented atomic.h and supporting cpufunc.h
- some support for kernel debugging with ddb
- various header tweaks and filling out of machine dependent structures
to a new architecture. This is the base of the sparc64 port, but contains
limited machine dependent code, and can be used a base for ports. Included
are:
- standard machine dependent headers, tweaked for a 64 bit, big endian
architecture, including empty versions of all the machine dependent
structures
- a machine independent atomic.h, which can be used until a port has
support for interrupts and the operations really need to be atomic
- stub versions of all the machine dependent functions, which panic
when called and print out the name of the function that needs to
be implemented. functions which are normally in assembly files are
not included, but this should reduce the number of different undefined
references on the first few compiles from hundreds to 5 or 6
Given minimal startup code and console support it should be trivial to
make this compile and run the first few sysinits on almost any architecture.
Requested by: alfred, imp, jhb