to userland in the signal handler that were not being iflled out before, but
should and can be.
This part of sendsig could be slightly refactored to use an MI interface, or
ideally, *sendsig*() would have an API change to accept a siginfo_t, which
would be filled out by an MI function in the level above sendsig, and said MI
function would make a small call into MD code to fill out the MD parts (some
of which may be bogus, such as the si_addr stuff in some places). This would
eventually make it possible for parts of the kernel sending signals to set up
a siginfo with meaningful information.
Reviewed by: mux
MFC after: 2 weeks
sysentvec. Initialized all fields of all sysentvecs, which will allow
them to be used instead of constants in more places. Provided stack
fixup routines for emulations that previously used the default.
in the original hardwired sysctl implementation.
The buf size calculator still overflows an integer on machines with large
KVA (eg: ia64) where the number of pages does not fit into an int. Use
'long' there.
Change Maxmem and physmem and related variables to 'long', mostly for
completeness. Machines are not likely to overflow 'int' pages in the
near term, but then again, 640K ought to be enough for anybody. This
comes for free on 32 bit machines, so why not?
These types are unlikely to ever become very MD. They include:
clockid_t, ct_rune_t, fflags_t, intrmask_t, mbstate_t, off_t, pid_t,
rune_t, socklen_t, timer_t, wchar_t, and wint_t.
While moving them, make a few adjustments (submitted by bde):
o __ct_rune_t needs to be precisely `int', not necessarily __int32_t,
since the arg type of the ctype functions is int.
o __rune_t, __wchar_t and __wint_t inherit this via a typedef of
__ct_rune_t.
o Some minor wording changes in the comment blocks for ct_rune_t and
mbstate_t.
Submitted by: bde (partially)
called <machine/_types.h>.
o <machine/ansi.h> will continue to live so it can define MD clock
macros, which are only MD because of gratuitous differences between
architectures.
o Change all headers to make use of this. This mainly involves
changing:
#ifdef _BSD_FOO_T_
typedef _BSD_FOO_T_ foo_t;
#undef _BSD_FOO_T_
#endif
to:
#ifndef _FOO_T_DECLARED
typedef __foo_t foo_t;
#define _FOO_T_DECLARED
#endif
Concept by: bde
Reviewed by: jake, obrien
Check if the trapped pc is inside of the demarked sections to implement
fault recovery for copyin etc, instead of pcb_onfault. Handle recovery
from data access exceptions as well as page faults.
Inspired by: bde's sys.dif
This is an architecture that present a thing message passing interface
to the OS. You can query as to how many ports and what kind are attached
and enable them and so on.
A less grand view is that this is just another way to package SCSI (SPI or
FC) and FC-IP into a one-driver interface set.
This driver support the following hardware:
LSI FC909: Single channel, 1Gbps, Fibre Channel (FC-SCSI only)
LSI FC929: Dual Channel, 1-2Gbps, Fibre Channel (FC-SCSI only)
LSI 53c1020: Single Channel, Ultra4 (320M) (Untested)
LSI 53c1030: Dual Channel, Ultra4 (320M)
Currently it's in fair shape, but expect a lot of changes over the
next few weeks as it stabilizes.
Credits:
The driver is mostly from some folks from Jeff Roberson's company- I've
been slowly migrating it to broader support that I it came to me as.
The hardware used in developing support came from:
FC909: LSI-Logic, Advansys (now Connetix)
FC929: LSI-Logic
53c1030: Antares Microsystems (they make a very fine board!)
MFC after: 3 weeks
conventions for _mcount and __cyg_profile_func_enter are different, so
statistical profiling kernels build and link but don't actually work.
IWBNI one could tell gcc to only generate calls to the former.
Define uintfptr_t properly for userland, but not for the kernel (I hope).
<stdint.h>. Previously, parts were defined in <machine/ansi.h> and
<machine/limits.h>. This resulted in two problems:
(1) Defining macros in <machine/ansi.h> gets in the way of that
header only defining types.
(2) Defining C99 limits in <machine/limits.h> adds pollution to
<limits.h>.
userland for libc/gmon to compile, so the typedef in <machine/types.h>
isn't good enough. This is really ugly since we end up with the
actual value which uintfptr_t is typedef'd from, in multiple places.
This is bug for bug compatible with the other FreeBSD architectures.
Noticed by: sparc64 tinderbox
basically maps all of physical memory 1:1 to a range of virtual addresses
outside of normal kva. The advantage of doing this instead of accessing
phsyical addresses directly is that memory accesses will go through the
data cache, and will participate in the normal cache coherency algorithm
for invalidating lines in our own and in other cpus' data caches. So
we don't have to flush the cache manually or send IPIs to do so on other
cpus. Also, since the mappings never change, we don't have to flush them
from the tlb manually.
This makes pmap_copy_page and pmap_zero_page MP safe, allowing the idle
zero proc to run outside of giant.
Inspired by: ia64
handler in the kernel at the same time. Also, allow for the
exec_new_vmspace() code to build a different sized vmspace depending on
the executable environment. This is a big help for execing i386 binaries
on ia64. The ELF exec code grows the ability to map partial pages when
there is a page size difference, eg: emulating 4K pages on 8K or 16K
hardware pages.
Flesh out the i386 emulation support for ia64. At this point, the only
binary that I know of that fails is cvsup, because the cvsup runtime
tries to execute code in pages not marked executable.
Obtained from: dfr (mostly, many tweaks from me).
of them, and couple them by always performing all operations on all
present IOMMUs. This is required because with the current API there
is no way to determine on which bus a busdma operation is performed.
While being there, clean up the iommu code a bit.
This should be a step in the direction of allow some of larger machines
to work; tests have shown that there still seem to be problems left.
o Assert that the page queues lock is held in vm_page_unwire().
o Make vm_page_lock_queues() and vm_page_unlock_queues() visible
to kernel loadable modules.
choosethread() in MI C code instead of doing it in in assembly in all the
various cpu_switch() functions. This fixes problems on ia64 and sparc64.
Reviewed by: julian, peter, benno
Tested on: i386, alpha, sparc64
itself; this causes undefined behaviour on UltraSPARCs. In particular,
the interrupt packet data words will not necessarily be delivered
correctly, which would result in a crash.
This bug also caused the cache-flushing work to be done twice on the
triggering CPU (when it did not cause crashes).
Reviewed by: jake
hardly MD, since all our platforms share the same macro. It's not
really compiler dependent either, but this helps in reducing
<machine/ansi.h> to only type definitions.
threaded VM pagezero kthread outside of Giant. For some platforms, this
is really easy since it can just use the direct mapped region. For others,
IPI sending is involved or there are other issues, so grab Giant when
needed.
We still have preemption issues to deal with, but Alan Cox has an
interesting suggestion on how to minimize the problem on x86.
Use Luigi's hack for preserving the (lack of) priority.
Turn the idle zeroing back on since it can now actually do something useful
outside of Giant in many cases.
pmap_swapin_proc/pmap_swapout_proc functions from the MD pmap code
and use a single equivalent MI version. There are other cleanups
needed still.
While here, use the UMA zone hooks to keep a cache of preinitialized
proc structures handy, just like the thread system does. This eliminates
one dependency on 'struct proc' being persistent even after being freed.
There are some comments about things that can be factored out into
ctor/dtor functions if it is worth it. For now they are mostly just
doing statistics to get a feel of how it is working.
we just have to deal with the kstack when told to. We do not have a
UMA-managed cache for the proc struct and its associated upage yet. So,
go back to the old lazy mechanism. Note that if UMA destroys pages that
used to contain proc structures, we'll lose the corresponding upage
forever. (zones never did this - once a page was allocated, it stayed
attached to the proc zone forever)
and function) with existing configuration choices. Arguably if
ALT_BREAK_TO_DEBUGGER was present, so should have been
BREAK_TO_DEBUGGER. Regardless, it broke the option sort order in
these kernel configuration files.
Requested by: bde
The ability to schedule multiple threads per process
(one one cpu) by making ALL system calls optionally asynchronous.
to come: ia64 and power-pc patches, patches for gdb, test program (in tools)
Reviewed by: Almost everyone who counts
(at various times, peter, jhb, matt, alfred, mini, bernd,
and a cast of thousands)
NOTE: this is still Beta code, and contains lots of debugging stuff.
expect slight instability in signals..
installed with pmap_kenter_flags, since the physical addresses may not
have an associated vm_page. Add a function to do this.
Tested by: Tomi Vainio <Tomi.Vainio@Sun.COM>
obtained, when all other scheduling activity is suspended. This is needed
on sparc64 to deactivate the vmspace of the exiting process on all cpus.
Otherwise if another unrelated process gets the exact same vmspace structure
allocated to it (same address), its address space will not be activated
properly. This seems to fix some spontaneous signal 11 problems with smp
on sparc64.
dcache aliasing. A page that already had more than 1 mapping of the
same virtual colour would not be correctly uncached.
Noticed by: Artur Grabowski <art@openbsd.org>
implementations can provide a base zero ffs function if they wish.
This changes
#define RQB_FFS(mask) (ffs64(mask))
foo = RQB_FFS(mask) - 1;
to
#define RQB_FFS(mask) (ffs64(mask) - 1)
foo = RQB_FFS(mask);
On some platforms we can get the "- 1" for free, eg: those that use the
C code for ffs64().
Reviewed by: jake (in principle)
magic numbers. Use stxa_sync instead of stxa; membar #Sync; to ensure
that no instruction is placed between the two. This can cause random
corruption even though interrupts are already disabled.
code. Both tasks are not always performed completely by the firmware.
The former is required to get some e450 models to boot; the latter fixes
the repeated fifo underruns with hme(4)s and gem(4)s observed on some
machines (and probably performance problems with other peripherals as
well).
in their tlb which the prom doesn't clear out, so we have to do so manually
before mapping the kernel page table or the cpu can hang due various
conditions which cause undefined behaviour from the tlb.
- ktrace no longer requires Giant so do ktrace syscall events before and
after acquiring and releasing Giant, respectively.
- For i386, ia32 syscalls on ia64, powerpc, and sparc64, get rid of the
goto bad hack and instead use the model on ia64 and alpha were we
skip the actual syscall invocation if error != 0. This fixes a bug
where if we the copyin() of the arguments failed for a syscall that
was not marked MP safe, we would try to release Giant when we had
not acquired it.
the pv lists in the vm_page, even unmanaged kernel mappings. This is so
that the virtual cachability of these mappings can be tracked when a page
is mapped to more than one virtual address. All virtually cachable
mappings of a physical page must have the same virtual colour, or illegal
alises can be created in the data cache. This is a bit tricky because we
still have to recognize managed and unmanaged mappings, even though they
are all on the pv lists.
value of the tag or data field.
Add macros for getting the page shift, size and mask for the physical page
that a tte maps (which may be one of several sizes).
Use the new cache functions for invalidating single pages.
a floating point instruction into a 6-bit register number for
double and quad arguments.
Make use of the new INSFPdq_RN macro where apporpriate; this
is required for correctly handling the "high" fp registers
(>= %f32).
Fix a number of bugs related to the handling of the high registers
which were caused by using __fpu_[gs]etreg() where __fpu_[gs]etreg64()
should be used (the former can only access the low, single-precision,
registers).
Submitted by: tmm
Rearrange things slightly so that the contents of the tag access
register are read and restored outside of the macros. The intention
is to pass the page size to look up as an argument to the macros.
i386/ia64/alpha - catch up to sparc64/ppc:
- replace pmap_kernel() with refs to kernel_pmap
- change kernel_pmap pointer to (&kernel_pmap_store)
(this is a speedup since ld can set these at compile/link time)
all platforms (as suggested by jake):
- gc unused pmap_reference
- gc unused pmap_destroy
- gc unused struct pmap.pm_count
(we never used pm_count - we track address space sharing at the vmspace)
the symbol index defined by the relocation. The elf_lookup() support
function is to be used by elf_reloc() when symbol lookups need to be
done. The elf_lookup() function operates on the symbol index and
will do a symbol name based lookup when such is required, otherwise
it uses the symbol index directly. This solves the problem seen on
ia64 where the symbol hash table does not contain local symbols and
a symbol name based lookup would fail for those symbols.
Don't pass the symbol name to elf_reloc(), as it isn't used any more.
user stack in response to a failed window fill, allowing the process to be
killed if its wrong. This caused user programs which misalign their stack
pointer to get stuck in an infinite loop at the kernel-userland boundary,
which is mostly harmless.
The same thing causes a fatal RED state exception on OpenBSD and probably
NetBSD.
Inspired by: art@openbsd.org
and pmap_copy_page(). This gets rid of a couple more physical addresses
in upper layers, with the eventual aim of supporting PAE and dealing with
the physical addressing mostly within pmap. (We will need either 64 bit
physical addresses or page indexes, possibly both depending on the
circumstances. Leaving this to pmap itself gives more flexibilitly.)
Reviewed by: jake
Tested on: i386, ia64 and (I believe) sparc64. (my alpha was hosed)
_BYTE_ORDER. These are far more useful than their non-underscored
equivalents as these can be used in restricted namespace environments.
Mark the non-underscored variants as deprecated.
and add some compatibility defines. Add fields for ins and locals to
struct reg also for the same reason; these aren't filled in yet because
getting at those registers sucks and I'd rather not save them in the
trapframe just for this. Reorder struct reg to be ABI compatible as
well. Add needed include of machine/emul.h.
This gets pmdb (poor man's debugger) from OpenBSD mostly compiling but it
doesn't work yet :(
most cases NULL is passed, but in some cases such as network driver locks
(which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used.
Tested on: i386, alpha, sparc64
they aren't in the usual path of execution for syscalls and traps.
The main complication for this is that we have to set flags to control
ast() everywhere that changes the signal mask.
Avoid locking in userret() in most of the remaining cases.
Submitted by: luoqi (first part only, long ago, reorganized by me)
Reminded by: dillon
various machdep.c's to being declared in kern_mutex.c.
- Add a new function mutex_init() used to perform early initialization
needed for mutexes such as setting up thread0's contested lock list
and initializing MI mutexes. Change the various MD startup routines
to call this function instead of duplicating all the code themselves.
Tested on: alpha, i386
hold the kernel text, data and loader metadata by not using a fixed slot
to store the TSB page(s) into. Enter fake 8k page entries into the kernel
TSB that cover the 4M kernel page(s), sot that pmap_kenter() will work
without having to treat these pages as a special case.
Problem reported by: mjacob, obrien
Problem spotted and 4M page handling proposed by: jake
and cpu_critical_exit() and moves associated critical prototypes into their
own header file, <arch>/<arch>/critical.h, which is only included by the
three MI source files that need it.
Backout and re-apply improperly comitted syntactical cleanups made to files
that were still under active development. Backout improperly comitted program
structure changes that moved localized declarations to the top of two
procedures. Partially re-apply one of the program structure changes to
move 'mask' into an intermediate block rather then in three separate
sub-blocks to make the code more readable. Re-integrate bug fixes that Jake
made to the sparc64 code.
Note: In general, developers should not gratuitously move declarations out
of sub-blocks. They are where they are for reasons of structure, grouping,
readability, compiler-localizability, and to avoid developer-introduced bugs
similar to several found in recent years in the VFS and VM code.
Reviewed by: jake
code can use it. This takes a single constant argument and fails to compile
if it is 0 (false). The main application of this is to make assertions about
structure sizes at compile time, in order to validate assumptions made in
other code. Examples:
CTASSERT(sizeof(struct foo) == FOO_SIZEOF);
CTASSERT(sizeof(struct foo) == (1 << FOO_SHIFT));
Requested by: jhb, phk
dump the trace buffer feasible.
- Remove KTR_EXTEND. This changes the format of the trace entries when
activated, making writing a userland tool which is not tied to a specific
kernel configuration difficult.
- Use get_cyclecount() for timestamps. nanotime() is much too heavy weight
and requires recursion protection due to ktr traces occuring as a result
of ktr traces. KTR_VERBOSE may still require recursion protection, which
is now conditional on it.
- Allow KTR_CPU to be overridden by MD code. This is so that it is possible
to trace early in startup before pcpu and/or curthread are setup.
- Add a version number for the ktr interface. A userland tool can check this
to detect mismatches.
- Use an array for the parameters to make decoding in userland easier.
- Add file and line recording to the non-extended traces now that the extended
version is no more.
These changes will break gdb macros to decode the extended version of the
trace buffer which are floating around. Users of these macros should either
use the show ktr command in ddb, or use the userland utility which can be run
on a core dump.
Approved by: jhb
Tested on: i386, sparc64
back into the calling MD code. The MD code must ensure no races between
checking the astpening flag and returning to usermode.
Submitted by: peter (ia64 bits)
Tested on: alpha (peter, jeff), i386, ia64 (peter), sparc64
with this flag. Remove the dup_list and dup_ok code from subr_witness. Now
we just check for the flag instead of doing string compares.
Also, switch the process lock, process group lock, and uma per cpu locks over
to this interface. The original mechanism did not work well for uma because
per cpu lock names are unique to each zone.
Approved by: jhb
disablement assumptions in kern_fork.c by adding another API call,
cpu_critical_fork_exit(). Cleanup the td_savecrit field by moving it
from MI to MD. Temporarily move cpu_critical*() from <arch>/include/cpufunc.h
to <arch>/<arch>/critical.c (stage-2 will clean this up).
Implement interrupt deferral for i386 that allows interrupts to remain
enabled inside critical sections. This also fixes an IPI interlock bug,
and requires uses of icu_lock to be enclosed in a true interrupt disablement.
This is the stage-1 commit. Stage-2 will occur after stage-1 has stabilized,
and will move cpu_critical*() into its own header file(s) + other things.
This commit may break non-i386 architectures in trivial ways. This should
be temporary.
Reviewed by: core
Approved by: core