418 Commits

Author SHA1 Message Date
marcel
a37b953058 In cpu_set_user_tls(), properly set the thread pointer. It is 0x7000
bytes after the end of the TCB, which is itself 8 bytes.
2006-09-01 06:05:40 +00:00
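The arithmetic in the commit above can be sketched as follows. This is an illustration only, not the actual cpu_set_user_tls() code: the macro and function names are hypothetical, and only the two constants (an 8-byte TCB and a 0x7000-byte bias) come from the commit message.

```c
#include <stdint.h>

/* Hypothetical names; the two values are taken from the commit message. */
#define EX_TCB_SIZE   8u       /* size of the TCB in bytes */
#define EX_TP_BIAS    0x7000u  /* distance from the end of the TCB to the thread pointer */

/* Given the address of the TCB, compute the thread pointer value. */
static uintptr_t
ex_thread_pointer(uintptr_t tcb_base)
{
	/* End of the TCB, then 0x7000 bytes further on. */
	return (tcb_base + EX_TCB_SIZE + EX_TP_BIAS);
}
```

For a TCB at 0x10000, ex_thread_pointer() yields 0x17008.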
davidxu
87b5aa08ee Implement casuword32 (compare and set user integer). Thanks to Marcel
Moolenaar, who wrote the IA64 version of casuword32.
2006-08-28 02:28:15 +00:00
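A user-space model of the compare-and-set contract named in the commit above, written with C11 atomics. This is a sketch under assumptions, not the kernel routine: the function name is hypothetical, the kernel version operates on a user-space address, and the fault handling such a routine needs is omitted here.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical model: if *p equals oldval, store newval.
 * Returns the value that was observed in *p, so the caller
 * knows the swap happened when the result equals oldval.
 */
static uint32_t
ex_cas32(_Atomic uint32_t *p, uint32_t oldval, uint32_t newval)
{
	uint32_t observed = oldval;

	/* On failure, 'observed' is overwritten with the current value. */
	atomic_compare_exchange_strong(p, &observed, newval);
	return (observed);
}
```

A caller would typically retry in a loop until the returned value matches the oldval it passed in.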
sobomax
aac2334c62 Use proper trap code for the EXC_ALI traps. This fixes SIGBUS during
unaligned 64-bit loads/stores.

MFC after:	2 weeks
2006-08-03 22:44:46 +00:00
alc
a152234cf9 Complete the transition from pmap_page_protect() to pmap_remove_write().
Originally, I had adopted sparc64's name, pmap_clear_write(), for the
function that is now pmap_remove_write().  However, this function is more
like pmap_remove_all() than like pmap_clear_modify() or
pmap_clear_reference(), hence, the name change.

The higher-level rationale behind this change is described in
src/sys/amd64/amd64/pmap.c revision 1.567.  The short version is that I'm
trying to clean up and fix our support for execute access.

Reviewed by: marcel@ (ia64)
2006-08-01 19:06:06 +00:00
jhb
3a707d012d Retire SYF_ARGMASK and remove both SYF_MPSAFE and SYF_ARGMASK. sy_narg is
now back to just being an argument count.
2006-07-28 20:22:58 +00:00
jhb
c62c38439f Now that all system calls are MPSAFE, retire the SYF_MPSAFE flag used to
mark system calls as being MPSAFE:
- Stop conditionally acquiring Giant around system call invocations.
- Remove all of the 'M' prefixes from the master system call files.
- Remove support for the 'M' prefix from the script that generates the
  syscall-related files from the master system call files.
- Don't explicitly set SYF_MPSAFE when registering nfssvc.
2006-07-28 19:05:28 +00:00
jhb
12302c47d0 Unify the checking for lock misbehavior in the various syscall()
implementations and adjust some of the checks while I'm here:
- Add a new check to make sure we don't return from a syscall in a critical
  section.
- Add a new explicit check before userret() to make sure we don't return
  with any locks held.  The advantage here is that we can include the
  syscall number and name in syscall() whereas that info is not available
  in userret().
- Drop the mtx_assert()'s of sched_lock and Giant.  They are replaced by
  the more general checks just added.

MFC after:	2 weeks
2006-07-27 22:32:30 +00:00
jhb
39705fd8c6 Add missing ptrace(2) system-call stops to various syscall()
implementations.

MFC after:	1 week
2006-07-27 19:50:16 +00:00
marcel
98879d5954 o  Move the prototype of mem_valid() from ofw_machdep.h to md_var.h.
   This saves mem.c from having to include ofw_machdep.h and, with it,
   all OFW related headers.
o  Provide a stub for OF_decode_addr(), which is used by low-level
   console drivers to obtain a tag and handle given an OFW phandle.
   This is different from sparc64, where a fake bus tag needs to be
   created explicitly.
2006-07-26 17:12:54 +00:00
marcel
987baddcf7 Include needed clock.h. 2006-07-26 17:06:39 +00:00
alc
3150e69985 Add synchronization to moea_zero_page() and moea_zero_page_area().
Remove the acquisition and release of Giant from moea_zero_page_idle().

Tested by: grehan@
2006-07-10 07:03:37 +00:00
alc
c05d97a892 Eliminate the acquisition and release of Giant from moea_extract_and_hold()
and moea_protect().

Tested by: grehan@ and rink@
2006-07-01 23:24:32 +00:00
alc
93b3a3577a Synchronize accesses to the PTEG table.
Add many lock assertions.

Tested by: grehan@
2006-06-25 19:07:01 +00:00
rink
d9c30e05f5 Prevent 'mutex not owned' panic on boot if INVARIANTS is in the kernel. This
makes the GENERIC kernel boot on ppc.

Reviewed by:	grehan
Approved by:	imp (mentor)
MFC after:	1 week
2006-06-17 20:10:32 +00:00
ups
b3a7439a45 Remove mpte optimization from pmap_enter_quick().
There is a race with the current locking scheme and removing
it should have no measurable performance impact.
This fixes page faults leading to panics in pmap_enter_quick_locked()
on amd64/i386.

Reviewed by: alc,jhb,peter,ps
2006-06-15 01:01:06 +00:00
alc
12b0f2baa2 Correct a typo in the previous revision. 2006-06-06 02:02:10 +00:00
alc
ff4adb11fe Introduce the function pmap_enter_object(). It maps a sequence of resident
pages from the same object.  Use it in vm_map_pmap_enter() to reduce the
locking overhead of premapping objects.

Reviewed by: tegge@
2006-06-05 20:35:27 +00:00
phk
ef310efff8 Since DELAY() was moved, most <machine/clock.h> #includes have been
unnecessary.
2006-05-16 14:37:58 +00:00
phk
7f5f12015d Remove straggling reference to CPU_ macros 2006-05-11 17:51:10 +00:00
phk
74f8e63a10 Simplify system time accounting for profiling.
Rename struct thread's td_sticks to td_pticks, we will need the
other name for more appropriately named use shortly.  Reduce it
from uint64_t to u_int.

Clear td_pticks whenever we enter the kernel instead of recording
its value as reference for userret().  Use the absolute value of
td->pticks in userret() and eliminate third argument.
2006-02-08 08:09:17 +00:00
grehan
713f710c8d Set the siginfo si_addr field, and also the mysterious 3rd parameter
to old-style signals, to be the DAR register for DSI miss exceptions.
This gives the address of the access rather than the instruction
address. The behaviour is now the same as on i386.

Found by:  libsigsegv tests
2006-01-07 01:55:12 +00:00
netchild
507a9b3e93 MI changes:
 - provide an interface (macros) to the page coloring part of the VM system,
   which allows trying different coloring algorithms without the need to
   touch every file [1]
 - make the page queue tuning values readable: sysctl vm.stats.pagequeue
 - autotuning of the page coloring values based upon the cache size instead
   of options in the kernel config (disabling of the page coloring as a
   kernel option is still possible)

MD changes:
 - detection of the cache size: only IA32 and AMD64 (untested) contain
   cache size detection code, every other arch just comes with a dummy
   function (this results in the use of default values, as was the
   case without the autotuning of the page coloring)
 - print some more info on Intel CPU's (like we do on AMD and Transmeta
   CPU's)

Note to AMD owners (IA32 and AMD64): please run "sysctl vm.stats.pagequeue"
and report if the cache* values are zero (= bug in the cache detection code)
or not.

Based upon work by:	Chad David <davidc@acns.ab.ca> [1]
Reviewed by:		alc, arch (in 2004)
Discussed with:		alc, Chad David, arch (in 2004)
2005-12-31 14:39:20 +00:00
grehan
b14a151985 Mark the return address of the call to ast() in the generic trap
handling code so the stack trace unwinders don't start trying to
go into user-space.

Found by trying to create core dumps with a KTR_COMPILE/KTR_GEOM
kernel, which results in a stack_save() call in the ast() coredump
path - this created a panic, and then calling 'trace' in ddb resulted
in the black screen of death after printing out most of the backtrace.
2005-12-23 13:05:27 +00:00
jhb
cb0d490ebe Tweak how the MD code calls the fooclock() methods some. Instead of
passing a pointer to an opaque clockframe structure and requiring the
MD code to supply CLKF_FOO() macros to extract needed values out of the
opaque structure, just pass the needed values directly.  In practice this
means passing the pair (usermode, pc) to hardclock() and profclock() and
passing the boolean (usermode) to hardclock_cpu() and hardclock_process().
Other details:
- Axe clockframe and CLKF_FOO() macros on all architectures.  Basically,
  all the archs were taking a trapframe and converting it into a clockframe
  one way or another.  Now they can just extract the PC and usermode values
  directly out of the trapframe and pass it to fooclock().
- Renamed hardclock_process() to hardclock_cpu() as the latter is more
  accurate.
- On Alpha, we now run profclock() at hz (profhz == hz) rather than at
  the slower stathz.
- On Alpha, for the TurboLaser machines that don't have an 8254
  timecounter, call hardclock() directly.  This removes an extra
  conditional check from every clock interrupt on Alpha on the BSP.
  There is probably room for even further pruning here by changing Alpha
  to use the simplified timecounter we use on x86 with the lapic timer
  since we don't get interrupts from the 8254 on Alpha anyway.
- On x86, clkintr() shouldn't ever be called now unless using_lapic_timer
  is false, so add a KASSERT() to that effect and remove a condition
  to slightly optimize the non-lapic case.
- Change prototype of arm_handler_execute() so that its first arg is a
  trapframe pointer rather than a void pointer for clarity.
- Use KCOUNT macro in profclock() to lookup the kernel profiling bucket.

Tested on:	alpha, amd64, arm, i386, ia64, sparc64
Reviewed by:	bde (mostly)
2005-12-22 22:16:09 +00:00
grehan
376ec343cb Fix compile warning: pmap_bootstrap is now declared extern in pmap.h,
remove redundant declaration.
2005-11-11 09:32:27 +00:00
grehan
bcff233215 Name change from pmap_* to moea_* to fit into the new order of
mmu implementation.

This code handles the 32-bit 'OEA' MMU found on G2/G3/G4 PPC cores.
2005-11-08 06:49:45 +00:00
grehan
eff5b98fc4 Insert a layer of indirection to the pmap code, using a kobj for
the interface. This allows run-time selection of MMU code, based
on CPU-type detection, or tunable-overrides when testing new code.

Pre-requisite for G5 support.

conf/files.powerpc
  - remove pmap.c
  - add mmu_if.h, mmu_oea.c, pmap_dispatch.c

powerpc/include/mmuvar.h
  - definitions for MMU implementations

powerpc/include/pmap.h
  - remove pmap_pte_spill declaration
  - add pmap_mmu_install declaration
  - size the phys_avail array
  - pmap_bootstrapped is now global-scope

powerpc/powerpc/machdep.c
  - call kobj_machdep_init early in the boot sequence to allow
    kobj usage prior to SI_SUB_LOCK
  - install the OEA pmap code. This will be moved to CPU-specific
    init code in the future.

powerpc/powerpc/mmu_if.m
  - Kobj MMU interface definitions

powerpc/powerpc/pmap_dispatch.c
  - central dispatch for pmap calls
  - contains the global mmu kobj and the routine to locate
    the mmu implementation and init the kobj
2005-11-08 06:48:08 +00:00
grehan
7d06f79e60 Copy SPRG0-3 registers at boot-time and restore when calling into
OpenFirmware. FreeBSD/ppc uses SPRG0 as the per-cpu data area pointer,
and SPRG1-3 as temporary registers during exception handling. There
have been a few instances where OpenFirmware does require these to
be part of its context, such as cd-booting an eMac.

reported by:	many
MFC after:	3 days
2005-10-30 21:29:59 +00:00
davidxu
3fbdb3c215 1. Change prototype of trapsignal and sendsig to use ksiginfo_t *; most
   changes in MD code are trivial. Before this change, trapsignal and
   sendsig used discrete parameters; now they use member fields of the
   ksiginfo_t structure. For sendsig, this change allows us to pass
   POSIX realtime signal value to user code.

2. Remove cpu_thread_siginfo, it is no longer needed because we now always
   generate ksiginfo_t data and feed it to libpthread.

3. Add p_sigqueue to proc structure to hold shared signals which were
   blocked by all threads in the proc.

4. Add td_sigqueue to thread structure to hold all signals delivered to
   thread.

5. i386 and amd64 now return POSIX standard si_code, other arches will
   be fixed.

6. In this sigqueue implementation, the pending signal set is kept as before;
   an extra siginfo list holds additional siginfo_t data for signals.
   Kernel code that uses psignal() still behaves as before and will not fail
   even under memory pressure. The only exception is when deleting a signal:
   we should call sigqueue_delete to remove the signal from the sigqueue
   rather than SIGDELSET. Currently no kernel code delivers a signal
   with additional data, so the kernel should be as stable as before.
   A ksiginfo can carry more information, for example to allow a signal to
   be delivered but throw away the siginfo data if there is not enough memory.
   SIGKILL and SIGSTOP have a fast path in sigqueue_add, because they cannot
   be caught or masked.
   The sigqueue() syscall allows user code to queue a signal to a target
   process; if resources are unavailable, EAGAIN is returned as the
   specification says.
   Just before a thread exits, signal queue memory is freed by
   sigqueue_flush.
   Currently, all signals are allowed to be queued, not only realtime signals.

Earlier patch reviewed by: jhb, deischen
Tested on: i386, amd64
2005-10-14 12:43:47 +00:00
grehan
1a84fecd12 Fix boot-time hang/panic on G3 systems when modifying IBAT0 in
pmap_bootstrap by using the sync;isync big hammer to make sure
all prior operations have completed.

Reported by:	Nathan Whitehorn <nathan at uchicago edu>
MFC after:	2 days
2005-09-10 21:03:10 +00:00
alc
39788de49e Pass a value of type vm_prot_t to pmap_enter_quick() so that it can
determine whether the mapping should permit execute access.
2005-09-03 18:20:20 +00:00
grehan
0fa9c00728 Temporary band-aid to fix hang when a process executes Altivec instructions.
trap_subr.S:  declare a stub for the a-unavailable trap
              that does an absolute jump to the vector-assist trap.
              This is due to the fact that the vec-unavail trap
              doesn't start at a 256-byte boundary, so the trick of
              masking the bottom 8 bits of the link register to identify
              the interrupt doesn't work, so let the vec-assist
              case handle Altivec-disabled for the time being.

              Note that this will be fixed in the future with a much
              smaller vector code-stub (< 16 bytes) that will allow
              use of strange vector offsets that are also present in
              4xx processors, and also allow smaller differences in
              vector codepaths on the G5.

trap.c:       Treat altivec-unavailable/assist process traps as SIGILL.
              Not quite correct, since altivec-assist should really be a panic,
              but it is fine for the moment due to the above measure.

machdep.c:    Install the stub code for the altivec-unavailable trap, and
              the standard trap code at the altivec-assist.

Reported by:	Andreas Tobler <toa at pop agri ch>
MFC after:	3 days
2005-07-30 11:14:31 +00:00
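The link-register masking trick described in the commit above can be illustrated with a small sketch. The function name is hypothetical, and the 0xf20 offset used in the note below is assumed here to be the AltiVec-unavailable vector; the point is only that clearing the bottom 8 bits recovers the vector when it starts on a 256-byte boundary and loses information when it does not.

```c
#include <stdint.h>

/* Hypothetical helper: clear the bottom 8 bits of the saved link register. */
static uint32_t
ex_vector_from_lr(uint32_t lr)
{
	return (lr & ~0xffu);
}
```

A link register value of 0x0324, taken inside a handler that starts at 0x0300, maps back to 0x0300. A value of 0x0f24, taken inside a handler assumed to start at 0x0f20, maps to 0x0f00 instead, which is why the stub jumps to a vector that is 256-byte aligned.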
davidxu
bc8b519d0f Validate that the value written into {FS,GS}.base is a canonical
address; writing a non-canonical address can cause a kernel panic.
Restricting base values to 0..VM_MAXUSER_ADDRESS ensures that
only canonical values get written to the registers.

Reviewed by: peter, Joseph Koshy <joseph.koshy at gmail dot com>
Approved by: re (scottl)
2005-07-10 23:31:11 +00:00
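A sketch of the check described in the commit above, under stated assumptions: the names are hypothetical, the limit value assumes a 47-bit user address space, and the upper bound is treated as exclusive. It is not the code that was committed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed illustrative limit: top of a 47-bit user address space. */
#define EX_VM_MAXUSER_ADDRESS 0x0000800000000000ULL

/* An address is canonical when bits 63..47 are all zero or all one. */
static bool
ex_is_canonical(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* the 17 most significant bits */

	return (top == 0 || top == 0x1ffff);
}

/* Accept only bases inside the user range; all of these are canonical. */
static bool
ex_user_base_ok(uint64_t base)
{
	return (base < EX_VM_MAXUSER_ADDRESS);
}
```

Restricting to the user range is stricter than the canonical test alone: it also rejects canonical addresses in the upper (kernel) half.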
alc
2d109601cb Introduce a procedure, pmap_page_init(), that initializes the
vm_page's machine-dependent fields.  Use this function in
vm_pageq_add_new_page() so that the vm_page's machine-dependent and
machine-independent fields are initialized at the same time.

Remove code from pmap_init() for initializing the vm_page's
machine-dependent fields.

Remove stale comments from pmap_init().

Eliminate the Boolean variable pmap_initialized from the alpha, amd64,
i386, and ia64 pmap implementations.  Its use is no longer required
because of the above changes and earlier changes that result in physical
memory that is being mapped at initialization time being mapped without
pv entries.

Tested by: cognet, kensmith, marcel
2005-06-10 03:33:36 +00:00
davidxu
2155a04472 Change cpu_set_kse_upcall to a more generic style, so we can reuse it
in other code. Add cpu_set_user_tls and use it to tweak the user register
and set up user TLS. I had wanted to merge it into cpu_set_kse_upcall,
but since cpu_set_kse_upcall is also used by M:N threads, which may
not need this feature, I wrote a separate cpu_set_user_tls.
2005-04-23 02:32:32 +00:00
ps
9d5eb9620c Don't enter the debugger if KDB_UNATTENDED is set or if
debug.debugger_on_panic=0.

MFC after:	2 weeks
2005-04-20 20:52:46 +00:00
jhb
f9da7305b5 Use PCPU_LAZY_INC() for cnt.v_{intr,trap,syscalls} rather than atomic
operations in some places and simple non-per CPU math in others.
2005-04-12 23:18:54 +00:00
jhb
8ab8d7de10 Change an instance of md_savecrit to md_saved_msr that I missed. 2005-04-08 14:26:55 +00:00
jhb
41cadaa11e Divorce critical sections from spinlocks. Critical sections as denoted by
critical_enter() and critical_exit() are now solely a mechanism for
deferring kernel preemptions.  They no longer have any effect on
interrupts.  This means that standalone critical sections are now very
cheap as they are simply unlocked integer increments and decrements for the
common case.

Spin mutexes now use a separate KPI implemented in MD code: spinlock_enter()
and spinlock_exit().  This KPI is responsible for providing whatever MD
guarantees are needed to ensure that a thread holding a spin lock won't
be preempted by any other code that will try to lock the same lock.  For
now all archs continue to block interrupts in a "spinlock section" as they
did formerly in all critical sections.  Note that I've also taken this
opportunity to push a few things into MD code rather than MI.  For example,
critical_fork_exit() no longer exists.  Instead, MD code ensures that new
threads have the correct state when they are created.  Also, we no longer
try to fixup the idlethreads for APs in MI code.  Instead, each arch sets
the initial curthread and adjusts the state of the idle thread it borrows
in order to perform the initial context switch.

This change is largely a big NOP, but the cleaner separation it provides
will allow for more efficient alternative locking schemes in other parts
of the kernel (bare critical sections rather than per-CPU spin mutexes
for per-CPU data for example).

Reviewed by:	grehan, cognet, arch@, others
Tested on:	i386, alpha, sparc64, powerpc, arm, possibly more
2005-04-04 21:53:56 +00:00
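The "unlocked integer increments and decrements" mentioned in the commit above can be modelled roughly as below. This is a simplified sketch with hypothetical structure and function names; it leaves out everything the real MI and MD code does beyond the nesting counter and a deferred-preemption flag.

```c
/* Hypothetical per-thread state for the model. */
struct ex_thread {
	int critnest;	/* critical section nesting depth */
	int owepreempt;	/* a preemption was deferred while nested */
};

/* Entering a critical section is a plain increment; no lock is taken. */
static void
ex_critical_enter(struct ex_thread *td)
{
	td->critnest++;
}

/*
 * Leaving is a plain decrement.  Returns 1 when the outermost section
 * was just left and a deferred preemption should now be performed.
 */
static int
ex_critical_exit(struct ex_thread *td)
{
	td->critnest--;
	if (td->critnest == 0 && td->owepreempt) {
		td->owepreempt = 0;
		return (1);
	}
	return (0);
}
```

Interrupts are not touched anywhere in this model, which matches the commit's statement that critical sections now only defer preemption.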
grehan
f7e419df97 Include <sys/signalvar.h> for trapsignal prototype. 2005-03-15 11:41:55 +00:00
grehan
d3c3434c1c Replaced previous hw.physmem extraction with des's mods to
getenv_ulong() - much simpler.

Pointed out by:	des
2005-03-07 07:31:20 +00:00
grehan
e685aa6ce9 physmem is a much better indicator for 'real' memory on PPC than Maxmem
since there are often significant holes in the memory map due to the
kernel, loader and OFW data structures not being included: Maxmem is
the highest available, so can be misleading.
2005-03-07 01:52:24 +00:00
grehan
98946b0cc7 Allow user to undersize memory with hw.physmem loader variable.
Obtained from:  i386/machdep.c:getmemsize()
2005-03-07 01:46:06 +00:00
grehan
98882623fa Catch up with "physical memory" sysctl change.
(MFi386: rev 1.608)
2005-03-01 07:59:24 +00:00
grehan
2773bd509b Catch the case where the idle loop is entered with interrupts disabled,
causing a hard hang.
2005-02-28 09:49:00 +00:00
grehan
78aa487988 - switch pcpu to a struct declaration ala amd64. It may be more efficient to
  cache-align this struct, but that's a topic for a far-in-the-future
  commit.
- eliminate commented-out reference to a non-existent pcpu field.
2005-02-28 08:47:51 +00:00
grehan
ef9c4cef54 Correctly set kernelname for kern.bootfile sysctl
Noticed by:	gad
Code stolen from: sparc64
2005-02-28 07:14:13 +00:00
grehan
45535fe884 Add PVO_FAKE flag to pvo entries for PG_FICTITIOUS mappings, to
avoid trying to reverse-map a device physical address to the
vm_page array and walking into non-existent vm weeds.

found by:  Xorg server exiting
2005-02-25 02:42:15 +00:00
njl
2958530007 Finish the job of sorting all includes and fix the build by including
malloc.h before proc.h on sparc64.  Noticed by das@

Compiled on:	alpha, amd64, i386, pc98, sparc64
2005-02-06 01:55:08 +00:00
njl
cd2bcf063b Sort includes a little so that bus.h comes before cpu.h (for device_t). 2005-02-04 06:58:09 +00:00