- stop using the evil 'struct trapframe' argument for mi_startup()
(formerly main()). There are much better ways of doing it.
- do not use prepare_usermode() - setregs() in execve() will do it
all for us as long as the p_md.md_regs pointer is set. (which is
now done in machdep.c rather than init_main.c. The Alpha port did it
this way all along and is much cleaner).
- collect all the magic %cr0 etc register settings into one place and
have the AP's call that instead of using magic numbers (!!) that keep
changing over and over again.
- Make it safe to call kthread_create() earlier, including during the
device probe sequence. It doesn't need the callback mechanism that
NetBSD's version uses.
- kthreads created this way are root-less as they exist before the root
filesystem is mounted. init(1) is set up so that it aquires the root
pointers prior to running. If other kthreads want filesystem acccess
we can make this code more generic.
- set all threads start times once we have decided what time it is.
- init uses a trampoline rather than the evil prepare_usermode() hack.
- kern_descrip.c has a couple of tweaks to deal with forking when there
is no rootdir or cwd etc.
- adjust the early SYSINIT() sequence so that a few prereqisites are in
place. eg: make sure the run queue is initialized before doing forks.
With this, the USB code can easily create a kthread to do the device
tree discovery. (I have tested it, it works nicely).
There are still some open issues before this is truely useful.
- tsleep() does not like working before the clock is running. It
sort-of tries to spin wait, but it can do more useful things now.
- stopping a kthread in kld code at unload time is "interesting" but
we have a solution for that.
The Alpha code needs no changes for this. It already uses pretty much the
same strategies, but a little cleaner.
Don't allow cpu entries in the MP table to contain APIC IDs out of range.
Don't write outside array boundaries if an IO APIC entry in the MP table
contains an APIC ID out of range.
Assign APIC IDs for all IO APICs according to section 3.6.6 in the
Intel MP spec:
- If the current APIC ID on an IO APIC doesn't conflict with other
IO APICs or CPUs, that APIC ID should be used. The copy of the MP
table must be updated if the corresponding APIC ID in the MP table
is different.
- If the current APIC ID was in conflict with other units, the
corresponding APIC ID specified in the MP table is checked for conflict.
- If a conflict is still found then fall back to using a new unique ID.
The copy of the MP table must be updated.
- IDs out of range is considered to be in conflict.
During these operations, the IO_TO_ID array cannot be used, since any
conflict would have caused information loss. The array is then corrected,
since all APIC ID conflicts should have been resolved.
PR: 20312, 18919
errors were normally harmless because they were in unreachable code
and gcc apparently doesn't check the syntax inside asm statements
that it optimizes away.
Further experimentation showed that some Dell 2450 machines with the
prevention kludge installed still got T_RESERVED traps. CPU interrupt
vector 0x7A was observed to be triggered. This might have been the
bitwise OR of two different vectors sent from each of the IOAPICs at
the same time.
IOAPIC #0: 0x68 --> irq 8: RTC timer interrupt
IOAPIC #1: 0x32 --> irq 18: scsi host adapter or network interface
----
0x7a --> T_RESERVED
Both IOAPICs had ID 0.
Appendix B.3 in the MP spec indicates that the operating system is
responsible for assigning unique IDs to the IOAPICs.
The enclosed patch programs the IOAPIC IDs according to the IOAPIC
entries in the MP table.
Submitted by: tegge
to various pmap_*() functions instead of looking up the physical address
and passing that. In many cases, the first thing the pmap code was doing
was going to a lot of trouble to get back the original vm_page_t, or
it's shadow pv_table entry.
Inspired by: John Dyson's 1998 patches.
Also:
Eliminate pv_table as a seperate thing and build it into a machine
dependent part of vm_page_t. This eliminates having a seperate set of
structions that shadow each other in a 1:1 fashion that we often went to
a lot of trouble to translate from one to the other. (see above)
This happens to save 4 bytes of physical memory for each page in the
system. (8 bytes on the Alpha).
Eliminate the use of the phys_avail[] array to determine if a page is
managed (ie: it has pv_entries etc). Store this information in a flag.
Things like device_pager set it because they create vm_page_t's on the
fly that do not have pv_entries. This makes it easier to "unmanage" a
page of physical memory (this will be taken advantage of in subsequent
commits).
Add a function to add a new page to the freelist. This could be used
for reclaiming the previously wasted pages left over from preloaded
loader(8) files.
Reviewed by: dillon
specifies the instruction's operation size. GCC will default to 32-bit
operands reguardless of the prototype (ie, formal parameters' type)
of an inline function.
- Add support for using the PCI BIOS functions for configuration space
accesses, and make this the default.
- Make PNPBIOS the default (obsoletes the PNPBIOS config option).
- Add two new boot-time tunables to disable each of the above.
via sysctl. It's done pretty simply but it should be quite adequate.
Also move SHMMAXPGS from $machine/include/vmparam.h as the comments that
went with it were wrong... we don't allocate KVM space for the pages so
that comment is bogus.. The only practical limit is how much physical
ram you want to lock up as this stuff isn't paged out or swap backed.
includes one of bus_at386.h and bus_pc98.h. Becuase only bus_pc98.h
supports indirect pio and bus_at386.h is identical to old bus.h, there
is no functional change in PC-AT's kernels. That is, it cannot cause
performance loss.
Submitted by: nyan
Reviewed by: imp
bde and luoqi provided useful comments for earlier version.
syscall path inward. A system call may select whether it needs the MP
lock or not (the default being that it does need it).
A great deal of conditional SMP code for various deadended experiments
has been removed. 'cil' and 'cml' have been removed entirely, and the
locking around the cpl has been removed. The conditional
separately-locked fast-interrupt code has been removed, meaning that
interrupts must hold the CPL now (but they pretty much had to anyway).
Another reason for doing this is that the original separate-lock for
interrupts just doesn't apply to the interrupt thread mechanism being
contemplated.
Modifications to the cpl may now ONLY occur while holding the MP
lock. For example, if an otherwise MP safe syscall needs to mess with
the cpl, it must hold the MP lock for the duration and must (as usual)
save/restore the cpl in a nested fashion.
This is precursor work for the real meat coming later: avoiding having
to hold the MP lock for common syscalls and I/O's and interrupt threads.
It is expected that the spl mechanisms and new interrupt threading
mechanisms will be able to run in tandem, allowing a slow piecemeal
transition to occur.
This patch should result in a moderate performance improvement due to
the considerable amount of code that has been removed from the critical
path, especially the simplification of the spl*() calls. The real
performance gains will come later.
Approved by: jkh
Reviewed by: current, bde (exception.s)
Some work taken from: luoqi's patch
was using them exits.
Don't allow a user process to cause the kernel to take a TRCTRAP on a
user space address.
Reviewed by: jlemon, sef
Approved by: jkh
the low level interrupt handler number should be used. Change
setup_apic_irq_mapping() to allocate low level interrupt handler X (Xintr${X})
for any ISA interrupt X mentioned in the MP table.
Remove an assumption in the driver for the system clock (clock.c) that
interrupts mentioned in the MP table as delivered to IOAPIC #0 intpin Y
is handled by low level interrupt handler Y (Xintr${Y}) but don't assume
that low level interrupt handler 0 (Xintr0) is used.
Don't allocate two low level interrupt handlers for the system clock.
Reviewed by: NOKUBI Hirotaka <hnokubi@yyy.or.jp>
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot). This is consistant with the other
BSD's who made this change quite some time ago. More commits to come.
The UPAGES have not been there since Jan '96, but the hole was preserved
for BSD/OS binary compatability. This has been fixed other ways (%ebx
now has a pointer to PS_STRINGS), and the stack is nowhere near where
it used to be so this hack isn't required anymore.
and extend. The new function containing the code is named schedclock()
as in NetBSD, but it has slightly different semantics (it already handles
incrementation of p->p_cpticks, and it should handle any calling frequency).
Agreed with in principle by: dufault
to use a locked cmpexg when unlocking a lock that we already hold, since
nobody else can touch the lock while we hold it. Second, it is not
necessary to use a locked cmpexg when locking a lock that we already
hold, for the same reason. These changes will allow MP locks to be used
recursively without impacting performance.
Modify two procedures that are called only by assembly and are already
NOPROF entries to pass a critical argument in %edx instead of on the
stack, removing a significant amount of code from the critical path
as a consequence.
Reviewed by: Alfred Perlstein <bright@wintelcom.net>, Peter Wemm <peter@netplex.com.au>
Historically, the documentation of extended asm was lacking, namely you
should NOT specify the same register as an input, and a clobber.
If the register is clobbered, it should be specified as an output as well,
e.g., by linking input and output through the "number" notation.
(Beware of lvalues, some local variables needed...)
URL:http://egcs.cygnus.com/faq.html
In versions up to egcs-1.1.1, the compiler did not even warn about it,
but it was liable to output bad code. Newer egcs are pickier and simply
refuse to swallow such code.
Note, since *addr changes, it needs to be an output operand.
We might be excessive in saying that all memory has changed.
Obtained from: OpenBSD
w/extra thanks to Marc Espie <Marc.Espie@liafa.jussieu.fr>
the countdown register.
this should not be necessary but there are broken laptops that
do not restore the countdown register on resume.
when it happnes, it messes up the hardclock interval and system clock,
which leads to the infamous "calcru: negative time" problem.
Submitted by: kjc, iwasaki
Reviewed by: Steve O'Hara-Smith <steveo@eircom.net> and committers.
Obtained from: PAO3
* Change the hack used on the alpha for mapping devices into DENSE or
BWX memory spaces to a simpler one. Its still a hack and should be
a seperate api to explicitly map the resource.
* Add $FreeBSD$ as necessary.
can provide the correct context to each signal handler.
Fix broken sigsuspend(): don't use p_oldsigmask as a flag, use SAS_OLDMASK
as we did before the linuxthreads support merge (submitted by bde).
Move ps_sigstk from to p_sigacts to the main proc structure since signal
stack should not be shared among threads.
Move SAS_OLDMASK and SAS_ALTSTACK flags from sigacts::ps_flags to proc::p_flag.
Move PS_NOCLDSTOP and PS_NOCLDWAIT flags from proc::p_flag to procsig::ps_flag.
Reviewed by: marcel, jdp, bde
o Remove unused defines from genassym.c that were needed
by the trampoline.
o Add load_gs_param function to support.s that catches
a fault when %gs is loaded with an invalid descriptor.
The function returns EFAULT in that case.
o Remove struct trapframe from mcontext_t and replace it
with the list of registers.
o Modify sendsig and sigreturn accordingly.
This commit contains a patch by bde.
Reviewed by: luoqi, jdp
struct sigcontext and ucontext_t/mcontext_t are defined in such
a way that both (ie struct sigcontext and ucontext_t) can be
passed on to sigreturn. The signal handler is still given a
ucontext_t for maximum flexibility.
For backward compatibility sigreturn restores the state for the
alternate signal stack from sigcontext.sc_onstack and not from
ucontext_t.uc_stack. A good way to determine which value the
application has set and thus which value to use, is still open
for discussion.
NOTE: This change should only affect those binaries that use
sigcontext and/or ucontext_t. In the source tree itself
this is only doscmd. Recompilation is required for those
applications.
This commit also fixes a lot of style bugs without hopefully
adding new ones.
NOTE: struct sigaltstack.ss_size now has type size_t again. For
some reason I changed that into unsigned int.
Parts submitted by: bde
sigaltstack bug found by: bde