- The MD functions critical_enter/exit are renamed to start with a cpu_
prefix.
- MI wrapper functions critical_enter/exit maintain a per-thread nesting
count and a per-thread critical section saved state set when entering
a critical section while at nesting level 0 and restored when exiting
to nesting level 0. This moves the saved state out of spin mutexes so
that interlocking spin mutexes works properly.
- Most low-level MD code that used critical_enter/exit now use
cpu_critical_enter/exit. MI code such as device drivers and spin
mutexes use the MI wrappers. Note that since the MI wrappers store
the state in the current thread, they do not have any return values or
arguments.
- mtx_intr_enable() is replaced with a constant CRITICAL_FORK which is
assigned to curthread->td_savecrit during fork_exit().
Tested on: i386, alpha
- The MI portions of struct globaldata have been consolidated into a MI
struct pcpu. The MD per-CPU data are specified via a macro defined in
machine/pcpu.h. A macro was chosen over a struct mdpcpu so that the
interface would be cleaner (PCPU_GET(my_md_field) vs.
PCPU_GET(md.md_my_md_field)).
- All references to globaldata are changed to pcpu instead. In a UP kernel,
this data was stored as global variables which is where the original name
came from. In an SMP world this data is per-CPU and ideally private to each
CPU outside of the context of debuggers. This also included combining
machine/globaldata.h and machine/globals.h into machine/pcpu.h.
- The pointer to the thread using the FPU on i386 was renamed from
npxthread to fpcurthread to be identical with other architectures.
- Make the show pcpu ddb command MI with a MD callout to display MD
fields.
- The globaldata_register() function was renamed to pcpu_init() and now
init's MI fields of a struct pcpu in addition to registering it with
the internal array and list.
- A pcpu_destroy() function was added to remove a struct pcpu from the
internal array and list.
Tested on: alpha, i386
Reviewed by: peter, jake
o Hide nonstandard functions and types in <netinet/in.h> when
_POSIX_SOURCE is defined.
o Add some missing types (required by POSIX.1-200x) to <netinet/in.h>.
o Restore vendor ID from Rev 1.1 in <netinet/in.h> and make use of new
__FBSDID() macro.
o Fix some miscellaneous issues in <arpa/inet.h>.
o Correct final argument for the inet_ntop() function (POSIX.1-200x).
o Get rid of the namespace pollution from <sys/types.h> in
<arpa/inet.h>.
Reviewed by: fenner
Partially submitted by: bde
pointless and would be inadequate for SMP systems. We will rely on the
VM system's locks to serialise this for now.
* Change pmap_remove() so that if the range being removed is larger than
the number of pages mapped by the pmap, we iterate over the currently
mapped pages instead of over the virtual address range. This should
make a difference when removing large virtual address ranges from an
address space.
Protect against an infinite loop when prefaulting pages. This can
happen when the vm system maps past the end of an object or tries
to map a zero length object, the pmap layer misses the fact that
offsets wrap into negative numbers and we get stuck.
Make flushing dirty pages work correctly on filesystems that
unexpectedly do not complete writes even with sync I/O requests.
This should help the behavior of mmaped files when using
softupdates (and perhaps in other circumstances also.)
reading cr.ivr, as well as writing to cr.eoi.
o use global variables to pass information to os_boot_rendez
so that it doesn't have to jump through hoops to find it
out. This avoids traps on the AP without it even being
initialized. This fixes SMP configurations.
o Move the probing of the MADT to the end of cpu_startup,
instead of at the start of cpu_mp_probe. We need to probe
the MADT for non-SMP configurations as well. This fixes
uniprocessor configurations.
o Serialize AP wake-up by waiting for the AP. We need to do
this since we use global variables to for the AP to use.
As a side-effect, we can use printf() more easily to see
what's going on.
It doesn't help us catch overflowing vector entries at compile time.
Instead use the .org directive. The last entry in the IVT doesn't
strictly need to be limited to 256 bytes, but doing so allows the
the VHPT to be placed immediately following the IVT without wasting
any space due to alignment.
* Re-organise RID allocation so that we don't accidentally give a RID
to two different processes. Also randomise the order to try to reduce
collisions in VHPT and TLB. Don't allocate RIDs for regions which are
unused.
* Allocate space for VHPT based on the size of physical memory. More
tuning is needed here.
* Add sysctl instrumentation for VHPT - see sysctl vm.stats.vhpt
* Fix a bug in pmap_prefault() which prevented it from actually adding
pages to the pmap.
* Remove ancient dead debugging code.
* Add DDB commands for examining translation registers and region
registers.
The first change fixes the 'free/cache page %p was dirty' panic which I
have been seeing when the system is put under moderate load. It also
fixes the negative RSS values in ps which have been confusing me for a
while.
With this set of changes the ia64 port is reliable enough to build its
own kernels, even with a 20-way parallel build. Next stop buildworld.
actually the address of the function descriptor. The fdesc has both
the address of the function and it's corresponding gp value. Now
that we have a gp value, use it instead of passing 0.
o Make <stdint.h> a symbolic link to <sys/stdint.h>.
o Move most of <sys/inttypes.h> into <sys/stdint.h>, as per C99.
o Remove <sys/inttypes.h>.
o Adjust includes in sys/types.h and boot/efi/include/ia64/efibind.h
to reflect new location of integer types in <sys/stdint.h>.
o Remove previously symbolicly linked <inttypes.h>, instead create a
new file.
o Add MD headers <machine/_inttypes.h> from NetBSD.
o Include <sys/stdint.h> in <inttypes.h>, as required by C99; and
include <machine/_inttypes.h> in <inttypes.h>, to fill in the
remaining requirements for <inttypes.h>.
o Add additional integer types in <machine/ansi.h> and
<machine/limits.h> which are included via <sys/stdint.h>.
Partially obtain from: NetBSD
Tested on: alpha, i386
Discussed on: freebsd-standards@bostonradio.org
Reviewed by: bde, fenner, obrien, wollman
When we truncate the msgbuf size because the last chunk is too small,
correctly terminate the phys_avail[] array - the VM system tests
the *end* for zero, not the start. This leads the VM startup to
attempt to recreate a duplicate set of pages for all physical memory.
XXX the msgbuf handling is suspiciously different on i386 vs
alpha/ia64...
* Implement a fairly simplistic parser for unwinding stack frames.
* Use unwind records for DDB's 'trace' command. Also add support for
tracing past exceptions to the context which generated the exception.
The stack unwind code requires a toolchain based on binutils-2.11.2 or
later and gcc-3.0.1 or later.
starts at offset 8; not 6. Hence the structure is 12 bytes and
not 10 bytes. Adjust the definition so that the ProcessorEnabled
flag is moved from bit 15 to bit 31 in the Flags field.
The definition now matches ACPI 2.0 Errata 1.5.
do it as a side-effect of probing for MP hardware. This allows
us to scan for local SAPICs early (especially before MBUF
initialization).
o Fix the Local SAPIC structure so that matches the Local SAPIC
table entry. Now that the Local SAPIC info is the same as the
Local APIC info, stop dumping the Local APIC entries.
o For every Local SAPIC entry in the MADT that's not disabled,
let the SMP code know about it. They represent actual CPUs.
o Register the OS_BOOT_RENDEZ entry point and provide a (bogus)
implementation for the entry point.
o Provide a mapping for internal IPI numbers to ExtINT vectors.
o In a MP system, announce the CPUs and start them by sending
IPI_AP_WAKEUP to each of them. Not that it makes a difference
at this time :-)
o Miscellaneous style fixes and other adjustments.
This emulates APM device node interface APIs (mainly ioctl) and
provides APM services for the applications. The goal is to support
most of APM applications without any changes.
Implemented ioctls in this commit are:
- APMIO_SUSPEND (mapped ACPI S3 as default but changable by sysctl)
- APMIO_STANDBY (mapped ACPI S1 as default but changable by sysctl)
- APMIO_GETINFO and APMIO_GETINFO_OLD
- APMIO_GETPWSTATUS
With above, many APM applications which get batteries, ac-line
info. and transition the system into suspend/standby mode (such as
wmapm, xbatt) should work with ACPI enabled kernel (if ACPI works well :-)
Reviewed by: arch@, audit@ and some guys
userland. The per thread ucred reference is immutable and thus needs no
locks to be read. However, until all the proc locking associated with
writes to p_ucred are completed, it is still not safe to use the per-thread
reference.
Tested on: x86 (SMP), alpha, sparc64
pcb_onfault is not set) or arrange to restart at the location in
pcb_onfault.
This ought to help the stability of a system under moderate load. It
certainly stops DDB from hanging the kernel when it tries to access a
non-present page.
{set,fill}_{,fp,db}regs() fixup:
- Add dummy {set,fill}_dbregs() on architectures that don't have them.
- KSEfy the powerpc versions (struct proc -> struct thread).
- Some architectures had the prototypes in md_var.h, some in reg.h, and
some in both; for consistency, move them to reg.h on all platforms.
These functions aren't really MD (the implementation is MD, but the interface
is MI), so they should move to an MI header, but I haven't figured out which
one yet.
Run-tested on i386, build-tested on Alpha, untested on other platforms.
- Add dummy {set,fill}_dbregs() on architectures that don't have them.
- KSEfy the powerpc versions (struct proc -> struct thread).
- Some architectures had the prototypes in md_var.h, some in reg.h, and
some in both; for consistency, move them to reg.h on all platforms.
These functions aren't really MD (the implementation is MD, but the interface
is MI), so they should move to an MI header, but I haven't figured out which
{set,fill}_{,fp,db}regs() fixup:
- Add dummy {set,fill}_dbregs() on architectures that don't have them.
- KSEfy the powerpc versions (struct proc -> struct thread).
- Some architectures had the prototypes in md_var.h, some in reg.h, and
some in both; for consistency, move them to reg.h on all platforms.
These functions aren't really MD (the implementation is MD, but the interface
is MI), so they should move to an MI header, but I haven't figured out which
one yet.
Run-tested on i386, build-tested on Alpha, untested on other platforms.
structure. This makes it possible to pre-allocate PTEs for the kernel,
which is necessary for a reliable implementation of pmap_kenter(). This
also avoids wasting space (about 48 bytes per page) for kernel mappings
and user mappings of memory-mapped devices.
This also fixes a bug with the previous version where the implementation
required the pv_entry structure to be physically contiguous but did not
enforce this (the structure size was not a power of two). This meant
that the pv_entry free list was quickly corrupted as soon as the system
was even mildly loaded.
the existence of the __gnuc_va_list type[*] because our compiler is GCC.
[*] __gnuc_va_list is defined in the GCC ginclude/stdarg.h replacement
headerwhich we don't use.
- Only release Giant in trap() if we locked it, otherwise we could release
Giant in a kernel trap if we didn't get it for a page fault and the
previous frame had grabbed the lock.
- Only get Giant for !MP safe syscalls.
be set. We need to check isr.w before isr.r so that we can correctly
handle a cmpxchg to a copy-on-write page.
This fixes the hang-after-fork problem for dynamically linked programs.
C calling conventions. This allows crt1.c to be written nearly without
any inline assembler.
* Initialise cpu_model[] so that the hw.model sysctl works properly.
Until now, the ptrace syscall was implemented as a wrapper that called
various functions in procfs depending on which ptrace operation was
requested. Most of these functions were themselves wrappers around
procfs_{read,write}_{,db,fp}regs(), with only some extra error checks,
which weren't necessary in the ptrace case anyway.
This commit moves procfs_rwmem() from procfs_mem.c into sys_process.c
(renaming it to proc_rwmem() in the process), and implements ptrace()
directly in terms of procfs_{read,write}_{,db,fp}regs() instead of
having it fake up a struct uio and then call procfs_do{,db,fp}regs().
It also moves the prototypes for procfs_{read,write}_{,db,fp}regs()
and proc_rwmem() from proc.h to ptrace.h, and marks all procfs files
except procfs_machdep.c as "optional procfs" instead of "standard".
* Don't get confused when memory regions don't lie on page boundaries -
remember our page size is typically larger than the firmware's page size.
* Add a function ia64_running_in_simulator() which is intended to detect
whether the kernel is running in SKI or on real hardware.
will be private to each CPU.
- Re-style(9) the globaldata structures. There really needs to be a MI
struct pcpu that has a MD struct mdpcpu member at some point.
* Use the bootinfo's memory map if present instead of hard-coding SKI's
memory map.
* Record the location of the I/O Port Space if present in the memory map.
to locore to process the @fptr relocations in the dynamic executable.
* Don't initialise the timer until *after* we install the timecounter to
avoid a race between timecounter initialisation and hardclock.
* Tidy up bootinfo somewhat including adding sanity checks for when the
kernel is loaded without a recognisable bootinfo.
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
* Switch to proc0's stack and backing store before calling ia64_init
so that we don't rely on the loader's stack at all.
* Change kernel entry point name from locorestart to __start.
it to the MI area. KSE touched cpu_wait() which had the same change
replicated five ways for each platform. Now it can just do it once.
The only MD parts seemed to be dealing with fpu state cleanup and things
like vm86 cleanup on x86. The rest was identical.
XXX: ia64 and powerpc did not have cpu_throw(), so I've put a functional
stub in place.
Reviewed by: jake, tmm, dillon
o Unify <machine/endian.h>'s across all architectures.
o Make bswapXX() functions use a different spelling of u_int16_t and
friends to reduce namespace pollution. The bswapXX() functions
don't actually exist, but we'll probably import these at some
point. Atleast one driver (if_de) depends on bswapXX() for big
endian cases.
o Deprecate byteorder(3) prototypes from <sys/types.h>, these are
now prototyped indirectly in <arpa/inet.h>.
o Deprecate in_addr_t and in_port_t typedefs in <sys/types.h>, these
are now typedef'd in <arpa/inet.h>.
o Change byteorder(3) prototypes to use standards compliant uint32_t
(spelled __uint32_t to reduce namespace pollution).
o Document new preferred headers and standards compliance.
Discussed with: bde
PR: 29946
Reviewed by: bmilekic
the process of exiting the kernel. The ast() function now loops as long
as the PS_ASTPENDING or PS_NEEDRESCHED flags are set. It returns with
preemption disabled so that any further AST's that arrive via an
interrupt will be delayed until the low-level MD code returns to user
mode.
- Use u_int's to store the tick counts for profiling purposes so that we
do not need sched_lock just to read p_sticks. This also closes a
problem where the call to addupc_task() could screw up the arithmetic
due to non-atomic reads of p_sticks.
- Axe need_proftick(), aston(), astoff(), astpending(), need_resched(),
clear_resched(), and resched_wanted() in favor of direct bit operations
on p_sflag.
- Fix up locking with sched_lock some. In addupc_intr(), use sched_lock
to ensure pr_addr and pr_ticks are updated atomically with setting
PS_OWEUPC. In ast() we clear pr_ticks atomically with clearing
PS_OWEUPC. We also do not grab the lock just to test a flag.
- Simplify the handling of Giant in ast() slightly.
Reviewed by: bde (mostly)
are a really nasty interface that should have been killed long ago
when 'ptrace(PT_[SG]ETREGS' etc came along. The entity that they
operate on (struct user) will not be around much longer since it
is part-per-process and part-per-thread in a post-KSE world.
gdb does not actually use this except for the obscure 'info udot'
command which does a hexdump of as much of the child's 'struct user'
as it can get. It carries its own #defines so it doesn't break
compiles.
dynamic symbol table buckets and chains. The sparc64 toolchain uses 32
bit .hash entries, unlike other 64 bits architectures (alpha), which use
64 bit entries.
Discussed with: dfr, jdp
were indices in a dense array. The cpuids are a sparse set and treat
them as such, setting up containers only for CPUs activated during
mb_init().
- Fix netstat(1) and systat(1) to treat the per-CPU stats area as a sparse
map, in accordance with the above.
This allows us to properly boot with certain CPUs disactivated. However, if
we later decide to re-activate said CPUs, we will barf until we decide to
implement CPU spinon/spinoff callback hooks to allow for said CPUs' per-CPU
containers to get configured on their activation.
Reported by: mjacob
Partially (sys/ diffs) Submitted by: mjacob
'dwatch'. The new commands install hardware watchpoints if supported
by the architecture and if there are enough registers to cover the
desired memory area.
No objection by: audit@, hackers@
MFC after: 2 weeks
Also removed some spl's and added some VM mutexes, but they are not actually
used yet, so this commit does not really make any operational changes
to the system.
vm_page.c relates to vm_page_t manipulation, including high level deactivation,
activation, etc... vm_pageq.c relates to finding free pages and aquiring
exclusive access to a page queue (exclusivity part not yet implemented).
And the world still builds... :-)
(this commit is just the first stage). Also add various GIANT_ macros to
formalize the removal of Giant, making it easy to test in a more piecemeal
fashion. These macros will allow us to test fine-grained locks to a degree
before removing Giant, and also after, and to remove Giant in a piecemeal
fashion via sysctl's on those subsystems which the authors believe can
operate without Giant.
.cvsignore file for [A-Za-z]* to keep these directories around rather
than waste a file on .keepme. This should also make people's built
trees place nice with CVS.
Idea for .cvsignore: peter (although I suggested the regexp)
Pointed out by: Makoto MATSUSHITA-san <matusita@jp.FreeBSD.org>
Llama's costuming by: Fernamdo Llamas
lock until after grabbing the sched_lock to avoid CURSIG racing with
psignal.
- Don't grab Giant for addupc_task() as it isn't needed.
Reported by: tegge (signal race), bde (addupc_task a while back)
- move the sysctl code to kern_intr.c
- do not use INTRCNT_COUNT, but rather eintrcnt - intrcnt to determine
the length of the intrcnt array
- move the declarations of intrnames, eintrnames, intrcnt and eintrcnt
from machine-dependent include files to sys/interrupt.h
- remove the hw.nintr sysctl, it is not needed.
- fix various style bugs
Requested by: bde
Reviewed by: bde (some time ago)
systems were repo-copied from sys/miscfs to sys/fs.
- Renamed the following file systems and their modules:
fdesc -> fdescfs, portal -> portalfs, union -> unionfs.
- Renamed corresponding kernel options:
FDESC -> FDESCFS, PORTAL -> PORTALFS, UNION -> UNIONFS.
- Install header files for the above file systems.
- Removed bogus -I${.CURDIR}/../../sys CFLAGS from userland
Makefiles.
- Attach a writable sysctl to bootverbose (debug.bootverbose) so it can be
toggled after boot.
- Move the printf of the version string to a SI_SUB_COPYRIGHT SYSINIT just
afer the display of the copyright message instead of doing it by hand in
three MD places.
registers better. Hold sched_lock not only for checking the flag but
also while performing the actual operation to ensure the process doesn't
get swapped out by another CPU while we the operation is being performed.
If for some reason DEVFS is undesired, the "NODEVFS" option is
needed now.
Pending any significant issues, DEVFS will be made mandatory in
-current on july 1st so that we can start reaping the full
benefits of having it.
been made machine independent and various other adjustments have been made
to support Alpha SMP.
- It splits the per-process portions of hardclock() and statclock() off
into hardclock_process() and statclock_process() respectively. hardclock()
and statclock() call the *_process() functions for the current process so
that UP systems will run as before. For SMP systems, it is simply necessary
to ensure that all other processors execute the *_process() functions when the
main clock functions are triggered on one CPU by an interrupt. For the alpha
4100, clock interrupts are delievered in a staggered broadcast fashion, so
we simply call hardclock/statclock on the boot CPU and call the *_process()
functions on the secondaries. For x86, we call statclock and hardclock as
usual and then call forward_hardclock/statclock in the MD code to send an IPI
to cause the AP's to execute forwared_hardclock/statclock which then call the
*_process() functions.
- forward_signal() and forward_roundrobin() have been reworked to be MI and to
involve less hackery. Now the cpu doing the forward sets any flags, etc. and
sends a very simple IPI_AST to the other cpu(s). AST IPIs now just basically
return so that they can execute ast() and don't bother with setting the
astpending or needresched flags themselves. This also removes the loop in
forward_signal() as sched_lock closes the race condition that the loop worked
around.
- need_resched(), resched_wanted() and clear_resched() have been changed to take
a process to act on rather than assuming curproc so that they can be used to
implement forward_roundrobin() as described above.
- Various other SMP variables have been moved to a MI subr_smp.c and a new
header sys/smp.h declares MI SMP variables and API's. The IPI API's from
machine/ipl.h have moved to machine/smp.h which is included by sys/smp.h.
- The globaldata_register() and globaldata_find() functions as well as the
SLIST of globaldata structures has become MI and moved into subr_smp.c.
Also, the globaldata list is only available if SMP support is compiled in.
Reviewed by: jake, peter
Looked over by: eivind
- Introduce lock classes and lock objects. Each lock class specifies a
name and set of flags (or properties) shared by all locks of a given
type. Currently there are three lock classes: spin mutexes, sleep
mutexes, and sx locks. A lock object specifies properties of an
additional lock along with a lock name and all of the extra stuff needed
to make witness work with a given lock. This abstract lock stuff is
defined in sys/lock.h. The lockmgr constants, types, and prototypes have
been moved to sys/lockmgr.h. For temporary backwards compatability,
sys/lock.h includes sys/lockmgr.h.
- Replace proc->p_spinlocks with a per-CPU list, PCPU(spinlocks), of spin
locks held. By making this per-cpu, we do not have to jump through
magic hoops to deal with sched_lock changing ownership during context
switches.
- Replace proc->p_heldmtx, formerly a list of held sleep mutexes, with
proc->p_sleeplocks, which is a list of held sleep locks including sleep
mutexes and sx locks.
- Add helper macros for logging lock events via the KTR_LOCK KTR logging
level so that the log messages are consistent.
- Add some new flags that can be passed to mtx_init():
- MTX_NOWITNESS - specifies that this lock should be ignored by witness.
This is used for the mutex that blocks a sx lock for example.
- MTX_QUIET - this is not new, but you can pass this to mtx_init() now
and no events will be logged for this lock, so that one doesn't have
to change all the individual mtx_lock/unlock() operations.
- All lock objects maintain an initialized flag. Use this flag to export
a mtx_initialized() macro that can be safely called from drivers. Also,
we on longer walk the all_mtx list if MUTEX_DEBUG is defined as witness
performs the corresponding checks using the initialized flag.
- The lock order reversal messages have been improved to output slightly
more accurate file and line numbers.
and change the u_int mtx_saveintr member of struct mtx to a critical_t
mtx_savecrit.
- On the alpha we no longer need a custom _get_spin_lock() macro to avoid
an extra PAL call, so remove it.
- Partially fix using mutexes with WITNESS in modules. Change all the
_mtx_{un,}lock_{spin,}_flags() macros to accept explicit file and line
parameters and rename them to use a prefix of two underscores. Inside
of kern_mutex.c, generate wrapper functions for
_mtx_{un,}lock_{spin,}_flags() (only using a prefix of one underscore)
that are called from modules. The macros mtx_{un,}lock_{spin,}_flags()
are mapped to the __mtx_* macros inside of the kernel to inline the
usual case of mutex operations and map to the internal _mtx_* functions
in the module case so that modules will use WITNESS and KTR logging if
the kernel is compiled with support for it.
sections.
- Add implementations of the critical_enter() and critical_exit() functions
and remove restore_intr() and save_intr().
- Remove the somewhat bogus disable_intr() and enable_intr() functions on
the alpha as the alpha actually uses a priority level and not simple bit
flag on the CPU.
Make the name cache hash as well as the nfsnode hash use it.
As a special tweak, create an unsigned version of register_t. This allows
us to use a special tweak for the 64 bit versions that significantly
speeds up the i386 version (ie: int64 XOR int64 is slower than int64
XOR int32).
The code layout is a little strange for the string function, but I was
able to get between 5 to 10% improvement over the original version I
started with. The layout affects gcc code generation choices and this way
was fastest on x86 and alpha.
Note that 'CPUTYPE=p3' etc makes a fair difference to this. It is
around 45% faster with -march=pentiumpro on a p6 cpu.
if we hold a spin mutex, since we can trivially get into deadlocks if we
start switching out of processes that hold spinlocks. Checking to see if
interrupts were disabled was a sort of cheap way of doing this since most
of the time interrupts were only disabled when holding a spin lock. At
least on the i386. To fix this properly, use a per-process counter
p_spinlocks that counts the number of spin locks currently held, and
instead of checking to see if interrupts are disabled in the witness code,
check to see if we hold any spin locks. Since child processes always
start up with the sched lock magically held in fork_exit(), we initialize
p_spinlocks to 1 for child processes. Note that proc0 doesn't go through
fork_exit(), so it starts with no spin locks held.
Consulting from: cp
- Don't try to grab Giant before postsig() in userret() as it is no longer
needed.
- Don't grab Giant before psignal() in ast() but get the proc lock instead.
supported architectures such as the alpha. This allows us to save
on kernel virtual address space, TLB entries, and (on the ia64) VHPT
entries. pmap_map() now modifies the passed in virtual address on
architectures that do not support direct-mapped segments to point to
the next available virtual address. It also returns the actual
address that the request was mapped to.
- On the IA64 don't use a special zone of PV entries needed for early
calls to pmap_kenter() during pmap_init(). This gets us in trouble
because we end up trying to use the zone allocator before it is
initialized. Instead, with the pmap_map() change, the number of needed
PV entries is small enough that we can get by with a static pool that is
used until pmap_init() is complete.
Submitted by: dfr
Debugging help: peter
Tested by: me
- Remove unneeded spl()'s around mi_switch() in userret().
- Don't hold sched_lock across addupc_task().
- Remove the MD function child_return() now that the MI function
fork_return() is used instead.
- Use TRAPF_USERMODE() instead of dinking with the trapframe directly to
check for ast's in kernel mode.
- Check astpending(curproc) and resched_wanted() in ast() and return if
neither is true.
- Use astoff() rather than setting the non-existent per-cpu variable
astpending to 0 to clear an ast.
for us.
- Change the switch_trampoline() to call fork_exit() passing in the
required arguments instead of calling the fork trampoline callout
function directly.
Warning: this hasn't been tested.
Looked over by: dfr
- All processes go into the same array of queues, with different
scheduling classes using different portions of the array. This
allows user processes to have their priorities propogated up into
interrupt thread range if need be.
- I chose 64 run queues as an arbitrary number that is greater than
32. We used to have 4 separate arrays of 32 queues each, so this
may not be optimal. The new run queue code was written with this
in mind; changing the number of run queues only requires changing
constants in runq.h and adjusting the priority levels.
- The new run queue code takes the run queue as a parameter. This
is intended to be used to create per-cpu run queues. Implement
wrappers for compatibility with the old interface which pass in
the global run queue structure.
- Group the priority level, user priority, native priority (before
propogation) and the scheduling class into a struct priority.
- Change any hard coded priority levels that I found to use
symbolic constants (TTIPRI and TTOPRI).
- Remove the curpriority global variable and use that of curproc.
This was used to detect when a process' priority had lowered and
it should yield. We now effectively yield on every interrupt.
- Activate propogate_priority(). It should now have the desired
effect without needing to also propogate the scheduling class.
- Temporarily comment out the call to vm_page_zero_idle() in the
idle loop. It interfered with propogate_priority() because
the idle process needed to do a non-blocking acquire of Giant
and then other processes would try to propogate their priority
onto it. The idle process should not do anything except idle.
vm_page_zero_idle() will return in the form of an idle priority
kernel thread which is woken up at apprioriate times by the vm
system.
- Update struct kinfo_proc to the new priority interface. Deliberately
change its size by adjusting the spare fields. It remained the same
size, but the layout has changed, so userland processes that use it
would parse the data incorrectly. The size constraint should really
be changed to an arbitrary version number. Also add a debug.sizeof
sysctl node for struct kinfo_proc.
Some things needed bits of <i386/include/lock.h> - cy.c now has its
own (only) copy of the COM_(UN)LOCK() macros, and IMASK_(UN)LOCK()
has been moved to <i386/include/apic.h> (AKA <machine/apic.h>).
Reviewed by: jhb
mtx_enter(lock, type) becomes:
mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)
similarily, for releasing a lock, we now have:
mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.
The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.
Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:
MTX_QUIET and MTX_NOSWITCH
The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:
mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.
Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.
Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.
Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.
Finally, caught up to the interface changes in all sys code.
Contributors: jake, jhb, jasone (in no particular order)
o Use objdump instead of gensetdefs(1) to build the linker sets.
o Allow overriding of nm and objdump in resp. genassym.sh and
gensetdefs.pl for non-native toolchains.
Reviewed by: arch
Perl improvements: Jos Backus <josb@cncdsl.com>, benno
interrupt threads to run with it always >= 1, so that malloc can
detect M_WAITOK from "interrupt" context. This is also necessary
in order to context switch from sched_ithd() directly.
Reviewed By: peter
initialization until after malloc() is safe to call, then iterate through
all mutexes and complete their initialization.
This change is necessary in order to avoid some circular bootstrapping
dependencies.