switch to it before calling mi_startup(). The bootstack is WAY too small
for running acpica during probe/attach. While here, pass modulep/physfree
to the startup routine, rather than writing to the global variables in
locore.S.
Approved by: re (amd64/*)
The current name is confusing, because it indicates to
the client that a bus_dmamap_sync() operation is not
necessary when the flag is specified, which is wrong.
The main purpose of this flag is to hint the underlying
architecture that DMA memory should be mapped in a coherent
way, but the architecture can ignore it. But if the
architecture does supports coherent mapping of memory, then
it makes bus_dmamap_sync() calls cheap.
This flag is the same as the one in NetBSD's Bus DMA.
Reviewed by: gibbs, scottl, des (implicitly)
Approved by: re@ (jhb)
disassembler has not been updated yet, and will do some very strange
things. It does tracebacks (without function arguments due to regparm
calling conventions) if -fno-omit-frame-pointer is used (to come later).
This achieves basic functionality.
Approved by: re (amd64/* blanket)
BUS_DMASYNC_ definitions remain as before. The does not change the ABI,
and reverts the API to be a bit more compatible and flexible. This has
survived a full 'make universe'.
Approved by: re (bmah)
having their stack at the 512GB mark. Give 4GB of user VM space for 32
bit apps. Note that this is significantly more than on i386 which gives
only about 2.9GB of user VM to a process (1GB for kernel, plus page
table pages which eat user VM space).
Approved by: re (blanket)
systems. Of note:
- Implement a direct mapped region using 2MB pages. This eliminates the
need for temporary mappings when getting ptes. This supports up to
512GB of physical memory for now. This should be enough for a while.
- Implement a 4-tier page table system. Most of the infrastructure is
there for 128TB of userland virtual address space, but only 512GB is
presently enabled due to a mystery bug somewhere. The design of this
was heavily inspired by the alpha pmap.c.
- The kernel is moved into the negative address space(!).
- The kernel has 2GB of KVM available.
- Provide a uma memory allocator to use the direct map region to take
advantage of the 2MB TLBs.
- Fixed some assumptions in the bus_space macros about the ability
to fit virtual addresses in an 'int'.
Notable missing things:
- pmap_growkernel() should be able to grow to 512GB of KVM by expanding
downwards below kernbase. The kernel must be at the top 2GB of the
negative address space because of gcc code generation strategies.
- need to fix the >512GB user vm code.
Approved by: re (blanket)
- Fix visibilty test for LONG_BIT and WORD_BIT. `#if defined(__FOO_VISIBLE)'
is alays wrong because __FOO_VISIBLE is always defined (to 0 for
invisibility).
sys/<arch>/include/limits.h
sys/<arch>/include/_limits.h:
- Style fixes.
Submitted by: bde
Reviewed by: bsdmike
Approved by: re (scottl)
load_gs() calls into a single place that is less likely to go wrong.
Eliminate the per-process context switching of MSR_GSBASE, because it
should be constant for a single cpu. Instead, save/restore it during
the loading of the new %gs selector for the new process.
Approved by: re (amd64/* blanket)
stolen from the ia64/ia32 code (indeed there was a repocopy), but I've
redone the MD parts and added and fixed a few essential syscalls. It
is sufficient to run i386 binaries like /bin/ls, /usr/bin/id (dynamic)
and p4. The ia64 code has not implemented signal delivery, so I had
to do that.
Before you say it, yes, this does need to go in a common place. But
we're in a freeze at the moment and I didn't want to risk breaking ia64.
I will sort this out after the freeze so that the common code is in a
common place.
On the AMD64 side, this required adding segment selector context switch
support and some other support infrastructure. The %fs/%gs etc code
is hairy because loading %gs will clobber the kernel's current MSR_GSBASE
setting. The segment selectors are not used by the kernel, so they're only
changed at context switch time or when changing modes. This still needs
to be optimized.
Approved by: re (amd64/* blanket)
- Move struct sigacts out of the u-area and malloc() it using the
M_SUBPROC malloc bucket.
- Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(),
sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared().
- Remove the p_sigignore, p_sigacts, and p_sigcatch macros.
- Add a mutex to struct sigacts that protects all the members of the struct.
- Add sigacts locking.
- Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now
that sigacts is locked.
- Several in-kernel functions such as psignal(), tdsignal(), trapsignal(),
and thread_stopped() are now MP safe.
Reviewed by: arch@
Approved by: re (rwatson)
we do not have to run so long with interrupts disabled. This involved
creating tf_addr in the trapframe. Reorganize the trap stubs so that
they consistently reserve the stack space and initialize any missing
bits.
Approved by: re (amd64 stuff)
bus_dma MD code for AMD64. (And a trivial ifdef update in dev/kbd because
of this). More updates are needed here to take advantage of the 64 bit
instructions.
Approved by: re (blanket amd64/*)
value on entry and exit. This isn't as easy as it sounds because when
we recursively trap or interrupt, we have to avoid duplicating the
swapgs instruction or we end up back with the userland %gs. I implemented
this by testing TF_CS to see if we're coming from supervisor mode
already, and check for returning to supervisor. To avoid a race with
interrupts in the brief period after beginning executing the handler and
before the swapgs, convert all trap gates to interrupt gates, and reenable
interrupts immediately after the swapgs. I am not happy with this.
There are other possible ways to do this that should be investigated.
(eg: storing the GS.base MSR value in the trapframe)
Add some sysarch functions to let the userland code get to this.
Approved by: re (blanket amd64/*)
could use different versions of the math code depending on whether there
was real floating point hardware or math emulation. Since the fpu is
part of the core specification on amd64, there is no need for this here.
Approved by: re (blanket amd64/*)
should really be renamed to fpu.h and npx.c to fpu.c since its part of
the core architecture on amd64 systems, not an isa 'numeric processor
extension'.
that interrupts be disabled and remain disabled until %cr2 is read.
Otherwise we can preempt and another process can fault, and by the
time we read %cr2, we see a different processes fault address. This
Greatly Confuses vm_fault() (to say the least). The i386 port has
got this marked as a bug workaround for a Cyrix CPU, which is what
lead me astray. Its actually necessary for preemption, regardless
of whether Cyrix cpus had a bug or not.
call handler before it was safe. It was possible for to lose context
and for something else to clobber the PCPU scratch variable. This
moves the interrupt enable *way* too late, but its better safe than
sorry for the moment.