pmap.c, and is potentially the cause of hangs reported on machines with a
small amount of memory. On machines with sufficient RAM, and without a lot
of processes running, this situation would probably never occur.
Testing is still incomplete, but it is obviously wrong so remove the
offending code now.
The issue of what to do when both the primary and secondary hash overflow
is still open.
Reported by: Dan Kresja at windriver dot com, via alc
Make part of John Birrell's KSE patch permanent..
Specifically, remove:
Any reference of the ksegrp structure. This feature was
never fully utilised and made things overly complicated.
All code in the scheduler that tried to make threaded programs
fair to unthreaded programs. Libpthread processes will already
do this to some extent and libthr processes already disable it.
Also:
Since this makes such a big change to the scheduler(s), take the opportunity
to rename some structures and elements that had to be moved anyhow.
This makes the code a lot more readable.
The ULE scheduler compiles again but I have no idea if it works.
The 4bsd scheduler still reqires a little cleaning and some functions that now do
ALMOST nothing will go away, but I thought I'd do that as a separate commit.
Tested by David Xu, and Dan Eischen using libthr and libpthread.
was written into a user's address space. The fix is to modify uiomove_fromphys
to sync the icache when an executable user-space page is written into.
Alan Cox suggested that there should probably be a higher-level interface
to this in the ptrace code, but agreed that this is an OK short-term solution.
Files changed:
pmap.h - declaration of pmap_page_executable()
pmap_dispatch.c - pass through the page_executable call to the mmu object
mmu_oea.c - implement the page_executable method by examining the PTE_EXEC
field in the vm_page_t
uio_machdep.c - in uiomove_fromphys(), if the op was a UIO_WRITE to user-space,
and if the page is executable, sync the icache since this is at the least
a breakpoint-write from gdb.
Reported by: marcel
Tested by: marcel, grehan on g3+g4
Discussed with: alc
MFC after: 2 weeks
Split subr_clock.c in two parts (by repo-copy):
subr_clock.c contains generic RTC and calendaric stuff. etc.
subr_rtc.c contains the newbus'ified RTC interface.
Centralize the machdep.{adjkerntz,disable_rtc_set,wall_cmos_clock}
sysctls and associated variables into subr_clock.c. They are
not machine dependent and we have generic code that relies on being
present so they are not even optional.
Originally, I had adopted sparc64's name, pmap_clear_write(), for the
function that is now pmap_remove_write(). However, this function is more
like pmap_remove_all() than like pmap_clear_modify() or
pmap_clear_reference(), hence, the name change.
The higher-level rationale behind this change is described in
src/sys/amd64/amd64/pmap.c revision 1.567. The short version is that I'm
trying to clean up and fix our support for execute access.
Reviewed by: marcel@ (ia64)
mark system calls as being MPSAFE:
- Stop conditionally acquiring Giant around system call invocations.
- Remove all of the 'M' prefixes from the master system call files.
- Remove support for the 'M' prefix from the script that generates the
syscall-related files from the master system call files.
- Don't explicitly set SYF_MPSAFE when registering nfssvc.
implementations and adjust some of the checks while I'm here:
- Add a new check to make sure we don't return from a syscall in a critical
section.
- Add a new explicit check before userret() to make sure we don't return
with any locks held. The advantage here is that we can include the
syscall number and name in syscall() whereas that info is not available
in userret().
- Drop the mtx_assert()'s of sched_lock and Giant. They are replaced by
the more general checks just added.
MFC after: 2 weeks
This avoids that mem.c has to include ofw_machdep.h, including
all OFW related headers.
o Provide a stub for OF_decode_addr(), which is used by low-level
console drivers to obtain a tag and handle given a OFW phandle.
This is different from sparc64, where a fake bus tag needs to be
created explicitly.
There is a race with the current locking scheme and removing
it should have no measurable performance impact.
This fixes page faults leading to panics in pmap_enter_quick_locked()
on amd64/i386.
Reviewed by: alc,jhb,peter,ps
Rename struct thread's td_sticks to td_pticks, we will need the
other name for more appropriately named use shortly. Reduce it
from uint64_t to u_int.
Clear td_pticks whenever we enter the kernel instead of recording
its value as reference for userret(). Use the absolute value of
td->pticks in userret() and eliminate third argument.
to old-style signals, to be the DAR register for DSI miss exceptions.
This gives the address of the access rather than the instruction
address. The behaviour is now the same as on i386.
Found by: libsigsegv tests
- provide an interface (macros) to the page coloring part of the VM system,
this allows to try different coloring algorithms without the need to
touch every file [1]
- make the page queue tuning values readable: sysctl vm.stats.pagequeue
- autotuning of the page coloring values based upon the cache size instead
of options in the kernel config (disabling of the page coloring as a
kernel option is still possible)
MD changes:
- detection of the cache size: only IA32 and AMD64 (untested) contains
cache size detection code, every other arch just comes with a dummy
function (this results in the use of default values like it was the
case without the autotuning of the page coloring)
- print some more info on Intel CPU's (like we do on AMD and Transmeta
CPU's)
Note to AMD owners (IA32 and AMD64): please run "sysctl vm.stats.pagequeue"
and report if the cache* values are zero (= bug in the cache detection code)
or not.
Based upon work by: Chad David <davidc@acns.ab.ca> [1]
Reviewed by: alc, arch (in 2004)
Discussed with: alc, Chad David, arch (in 2004)
handling code so the stack trace unwinders don't start trying to
go into user-space.
Found by trying to create core dumps with a KTR_COMPILE/KTR_GEOM
kernel, which results in a stack_save() call in the ast() coredump
path - this created a panic, and then calling 'trace' in ddb resulted
in the black screen of death after printing out most of the backtrace.
passing a pointer to an opaque clockframe structure and requiring the
MD code to supply CLKF_FOO() macros to extract needed values out of the
opaque structure, just pass the needed values directly. In practice this
means passing the pair (usermode, pc) to hardclock() and profclock() and
passing the boolean (usermode) to hardclock_cpu() and hardclock_process().
Other details:
- Axe clockframe and CLKF_FOO() macros on all architectures. Basically,
all the archs were taking a trapframe and converting it into a clockframe
one way or another. Now they can just extract the PC and usermode values
directly out of the trapframe and pass it to fooclock().
- Renamed hardclock_process() to hardclock_cpu() as the latter is more
accurate.
- On Alpha, we now run profclock() at hz (profhz == hz) rather than at
the slower stathz.
- On Alpha, for the TurboLaser machines that don't have an 8254
timecounter, call hardclock() directly. This removes an extra
conditional check from every clock interrupt on Alpha on the BSP.
There is probably room for even further pruning here by changing Alpha
to use the simplified timecounter we use on x86 with the lapic timer
since we don't get interrupts from the 8254 on Alpha anyway.
- On x86, clkintr() shouldn't ever be called now unless using_lapic_timer
is false, so add a KASSERT() to that affect and remove a condition
to slightly optimize the non-lapic case.
- Change prototypeof arm_handler_execute() so that it's first arg is a
trapframe pointer rather than a void pointer for clarity.
- Use KCOUNT macro in profclock() to lookup the kernel profiling bucket.
Tested on: alpha, amd64, arm, i386, ia64, sparc64
Reviewed by: bde (mostly)
the interface. This allows run-time selection of MMU code, based
on CPU-type detection, or tunable-overrides when testing new code.
Pre-requisite for G5 support.
conf/files.powerpc
- remove pmap.c
- add mmu_if.h, mmu_oea.c, pmap_dispatch.c
powerpc/include/mmuvar.h
- definitions for MMU implementations
powerpc/include/pmap.h
- remove pmap_pte_spill declaration
- add pmap_mmu_install declaration
- size the phys_avail array
- pmap_bootstrapped is now global-scope
powerpc/powerpc/machdep.c
- call kobj_machdep_init early in the boot sequence to allow
kobj usage prior to SI_SUB_LOCK
- install the OEA pmap code. This will be moved to CPU-specific
init code in the future.
powerpc/powerpc/mmu_if.m
- Kobj MMU interface definitions
powerpc/powerpc/pmap_dispatch.c
- central dispatch for pmap calls
- contains the global mmu kobj and the routine to locate the
the mmu implementation and init the kobj
OpenFirmware. FreeBSD/ppc uses SPRG0 as the per-cpu data area pointer,
and SPRG1-3 as temporary registers during exception handling. There
have been a few instances where OpenFirmware does require these to
be part of it's context, such as cd-booting an eMac.
reported by: many
MFC after: 3 days
changes in MD code are trivial, before this change, trapsignal and
sendsig use discrete parameters, now they uses member fields of
ksiginfo_t structure. For sendsig, this change allows us to pass
POSIX realtime signal value to user code.
2. Remove cpu_thread_siginfo, it is no longer needed because we now always
generate ksiginfo_t data and feed it to libpthread.
3. Add p_sigqueue to proc structure to hold shared signals which were
blocked by all threads in the proc.
4. Add td_sigqueue to thread structure to hold all signals delivered to
thread.
5. i386 and amd64 now return POSIX standard si_code, other arches will
be fixed.
6. In this sigqueue implementation, pending signal set is kept as before,
an extra siginfo list holds additional siginfo_t data for signals.
kernel code uses psignal() still behavior as before, it won't be failed
even under memory pressure, only exception is when deleting a signal,
we should call sigqueue_delete to remove signal from sigqueue but
not SIGDELSET. Current there is no kernel code will deliver a signal
with additional data, so kernel should be as stable as before,
a ksiginfo can carry more information, for example, allow signal to
be delivered but throw away siginfo data if memory is not enough.
SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can
not be caught or masked.
The sigqueue() syscall allows user code to queue a signal to target
process, if resource is unavailable, EAGAIN will be returned as
specification said.
Just before thread exits, signal queue memory will be freed by
sigqueue_flush.
Current, all signals are allowed to be queued, not only realtime signals.
Earlier patch reviewed by: jhb, deischen
Tested on: i386, amd64
pmap_bootstrap by using the sync;isync big hammer to make sure
all prior operations have completed.
Reported by: Nathan Whitehorn <nathan at uchicago edu>
MFC after: 2 days
trap_subr.S: declare a stub for the a-unavailable trap
that does an absolute jump to the vector-assist trap.
This is due to the fact that the vec-unavail trap
doesn't start at a 256-byte boundary, so the trick of
masking the bottom 8 bits of the link register to identify
the interrupt doesn't work, so let the vec-assist
case handle Altivec-disabled for the time being.
Note that this will be fixed in the future with a much
smaller vector code-stub (< 16 bytes) that will allow
use of strange vector offsets that are also present in
4xx processors, and also allow smaller differences in
vector codepaths on the G5.
trap.c: Treat altivec-unavailable/assist process traps as SIGILL.
Not quite correct, since altivec-assist should really be a panic,
but it is fine for the moment due to the above measure.
machdep.c Install the stub code for the altivec-unavailable trap, and
the standard trap code at the altivec-assist.
Reported by: Andreas Tobler <toa at pop agri ch>
MFC after: 3 days
address, writting non-canonical address can cause kernel a panic,
by restricting base values to 0..VM_MAXUSER_ADDRESS, ensuring
only canonical values get written to the registers.
Reviewed by: peter, Josepha Koshy < joseph.koshy at gmail dot com >
Approved by: re (scottl)
vm_page's machine-dependent fields. Use this function in
vm_pageq_add_new_page() so that the vm_page's machine-dependent and
machine-independent fields are initialized at the same time.
Remove code from pmap_init() for initializing the vm_page's
machine-dependent fields.
Remove stale comments from pmap_init().
Eliminate the Boolean variable pmap_initialized from the alpha, amd64,
i386, and ia64 pmap implementations. Its use is no longer required
because of the above changes and earlier changes that result in physical
memory that is being mapped at initialization time being mapped without
pv entries.
Tested by: cognet, kensmith, marcel