are machine dependent because they are not required to update the tlb when
mappings are added or removed, and doing so is machine dependent.
In addition, an implementation may require that pages mapped with pmap_kenter
have a backing vm_page_t, which is not necessarily true of all physical
pages, and so may choose to pass the vm_page_t to pmap_kenter instead of the
physical address in order to make this requirement clear.
branches:
Initialize struct cdevsw using C99 sparse initializtion and remove
all initializations to default values.
This patch is automatically generated and has been tested by compiling
LINT with all the fields in struct cdevsw in reverse order on alpha,
sparc64 and i386.
Approved by: re(scottl)
Fixed memory leak in the "nodevice" option implementation.
Use these instead of sed(1) in MD NOTES.
Use a single makefile (sys/conf/makeLINT.mk) to generate
LINT for all architectures. (Previous versions missed
the LINT dependency on Makefile, and i386 version also
missed the dependency on ${NOTES}.)
Fixed bugs in the previous NOTES conversion using the
"nodevice" token and sed(1):
- i386 LINT lost "device pst".
- pc98 LINT lost SC_*, MAXCONS and KBD_DISABLE_KEYMAP_LOAD
options, and got needless DPT_* options.
- Added nooptions PPC_DEBUG, PPC_PROBE_CHIPSET, KBD_INSTALL_CDEV
to sparc64 LINT so that it has a chance to config(8).
This basically returns us to where we were before.
- Get rid of the useless atop() / pmap_phys_address() detour. The
device mmap handlers must now give back the physical address
without atop()'ing it.
- Don't borrow the physical address of the mapping in the returned
int. Now we properly pass a vm_offset_t * and expect it to be
filled by the mmap handler when the mapping was successful. The
mmap handler must now return 0 when successful, any other value
is considered as an error. Previously, returning -1 was the only
way to fail. This change thus accidentally fixes some devices
which were bogusly returning errno constants which would have been
considered as addresses by the device pager.
- Garbage collect the poorly named pmap_phys_address() now that it's
no longer used.
- Convert all the d_mmap_t consumers to the new API.
I'm still not sure wheter we need a __FreeBSD_version bump for this,
since and we didn't guarantee API/ABI stability until 5.1-RELEASE.
Discussed with: alc, phk, jake
Reviewed by: peter
Compile-tested on: LINT (i386), GENERIC (alpha and sparc64)
Runtime-tested on: i386
group number properly based on the board id. Perform dummy reads of
registers after writing to flush the hardware write buffers.
This gets the soon to be committed zs attachment working.
UltraSPARCs, and an eeprom attachment for fhc, which allows the date
to be set properly on these machines. Central is a wierd bus which
seems to only ever have 1 fhc attached to it. FHC (FireHose Controller)
is another wierd bus with various things on it depending where its attached.
The fhc attached to central has eeprom and zs, and the fhcs which attach
directly to nexus have simm-status, environment and other nodes, none of
which I'll probably ever have documentation for.
Thanks to Ade Lovett for providing access to an 8 cpu e4500.
I was in two minds as to where to put them in the first case..
I should have listenned to the other mind.
Submitted by: parts by davidxu@
Reviewed by: jeff@ mini@
o Add a MD header private to libc called _fpmath.h; this header
contains bitfield layouts of MD floating-point types.
o Add a MI header private to libc called fpmath.h; this header
contains bitfield layouts of MI floating-point types.
o Add private libc variables to lib/libc/$arch/gen/infinity.c for
storing NaN values.
o Add __double_t and __float_t to <machine/_types.h>, and provide
double_t and float_t typedefs in <math.h>.
o Add some C99 manifest constants (FP_ILOGB0, FP_ILOGBNAN, HUGE_VALF,
HUGE_VALL, INFINITY, NAN, and return values for fpclassify()) to
<math.h> and others (FLT_EVAL_METHOD, DECIMAL_DIG) to <float.h> via
<machine/float.h>.
o Add C99 macro fpclassify() which calls __fpclassify{d,f,l}() based
on the size of its argument. __fpclassifyl() is never called on
alpha because (sizeof(long double) == sizeof(double)), which is good
since __fpclassifyl() can't deal with such a small `long double'.
This was developed by David Schultz and myself with input from bde and
fenner.
PR: 23103
Submitted by: David Schultz <dschultz@uclink.Berkeley.EDU>
(significant portions)
Reviewed by: bde, fenner (earlier versions)
uio segment is empty. In this case no dma segment is create by
bus_dmamap_load_buffer, but the calling routine clears the first flag.
Under certain combinations of addresses of the first and second mbuf/uio
buffer this leads to corrupted DMA segment descriptors. This was already
fixed by tmm in sparc64/sparc64/iommu.c.
PR: kern/47733
Reviewed by: sam
Approved by: jake (mentor)
statclock based on profhz when profiling is enabled MD, since most platforms
don't use this anyway. This removes the need for statclock_process, whose
only purpose was to subdivide profhz, and gets the profiling clock running
outside of sched_lock on platforms that implement suswintr.
Also changed the interface for starting and stopping the profiling clock to
do just that, instead of changing the rate of statclock, since they can now
be separate.
Reviewed by: jhb, tmm
Tested on: i386, sparc64
I'm not convinced there is anything major wrong with the patch but
them's the rules..
I am using my "David's mentor" hat to revert this as he's
offline for a while.
counterparts to bus_dmamem_alloc() and bus_dmamem_free(). This allows
the caller to specify the size of the allocation instead of it defaulting
to the max_size field of the busdma tag.
This is intended to aid in converting drivers to busdma. Lots of
hardware cannot understand scatter/gather lists, which forces the
driver to copy the i/o buffers to a single contiguous region
before sending it to the hardware. Without these new methods, this
would require a new busdma tag for each operation, or a complex
internal allocator/cache for each driver.
Allocations greater than PAGE_SIZE are rounded up to the next
PAGE_SIZE by contigmalloc(), so this is not suitable for multiple
static allocations that would be better served by a single
fixed-length subdivided allocation.
Reviewed by: jake (sparc64)
data structure called kse_upcall to manage UPCALL. All KSE binding
and loaning code are gone.
A thread owns an upcall can collect all completed syscall contexts in
its ksegrp, turn itself into UPCALL mode, and takes those contexts back
to userland. Any thread without upcall structure has to export their
contexts and exit at user boundary.
Any thread running in user mode owns an upcall structure, when it enters
kernel, if the kse mailbox's current thread pointer is not NULL, then
when the thread is blocked in kernel, a new UPCALL thread is created and
the upcall structure is transfered to the new UPCALL thread. if the kse
mailbox's current thread pointer is NULL, then when a thread is blocked
in kernel, no UPCALL thread will be created.
Each upcall always has an owner thread. Userland can remove an upcall by
calling kse_exit, when all upcalls in ksegrp are removed, the group is
atomatically shutdown. An upcall owner thread also exits when process is
in exiting state. when an owner thread exits, the upcall it owns is also
removed.
KSE is a pure scheduler entity. it represents a virtual cpu. when a thread
is running, it always has a KSE associated with it. scheduler is free to
assign a KSE to thread according thread priority, if thread priority is changed,
KSE can be moved from one thread to another.
When a ksegrp is created, there is always N KSEs created in the group. the
N is the number of physical cpu in the current system. This makes it is
possible that even an userland UTS is single CPU safe, threads in kernel still
can execute on different cpu in parallel. Userland calls kse_create to add more
upcall structures into ksegrp to increase concurrent in userland itself, kernel
is not restricted by number of upcalls userland provides.
The code hasn't been tested under SMP by author due to lack of hardware.
Reviewed by: julian
1.) Fix an off-by-one in the DVMA space handling, which would make it
possible to allocate one page beyond the end of the DVMA area.
This page was aliased to the first page. Apparently, this bug was
responsible for the trashed nvram/eeprom some people were reporting,
in conjunction with a number of unfortunate coincidences.
2.) Fix broken boundary and and lowaddr calculations.
3.) Fix a memory leak on an error path.
4.) Update a outdated comment to reflect the introduction of IOMMU_MAX_PRE,
make the usage of IOMMU_MAX_PRE more consistent and KASSERT that the
preallocation size is not 0.
5.) Fix a case where an error return was lost.
6.) When signalling an error to the caller by invoking the callback, do
not use a segment pointer of NULL for compatability with existing
drivers.
Also, increase the maximum segment number to 64; it is rather arbitrary,
with the exception of the of the stack space consumed by the segment
array.
Special thanks go to Harti Brandt <brandt@fokus.fraunhofer.de> for
spotting 4 and 5, and testing many iterations of patches.
Pointy hats to: tmm
constants where flag bits (as in NetBSD), although they are consecutively
numbered in FreeBSD. This would cause unnecessary flushing in the
BUS_DMASYNC_POSTWRITE case, but was otherwise mostly harmless.
indicate that uma_small_alloc should not. This code should be refactored so
that there is not so much cross arch duplication.
Reviewed by: jake
Spotted by: tmm
Tested on: alpha, sparc64
Pointy hat to: jeff and everyone who cut and pasted the bad code. :-)
metadata. This fixes module dependency resolution by the kernel linker on
sparc64, where the relocations for the metadata are different than on other
architectures; the relative offset is in the addend of an Elf_Rela record
instead of the original value of the location being patched.
Also fix printf formats in debug code.
Submitted by: Hartmut Brandt <brandt@fokus.gmd.de>
PR: 46732
Tested on: alpha (obrien), i386, sparc64
portable copy. Note that pmap_extract() must be used instead of
pmap_kextract().
This is precursor work to a reorganization of vmapbuf() to close remaining
user/kernel races (which can lead to a panic).
map. Use this new feature to implement iommu_dvmamap_load_mbuf() and
iommu_dvmamap_load_uio() functions in terms of a new helper function,
iommu_dvmamap_load_buffer(). Reimplement the iommu_dvmamap_load()
to use it, too.
This requires some changes to the map format; in addition to that,
remove unused or redundant members.
Add SBus and Psycho wrappers for the new functions, and make them
available through the respective DMA tags.
_nexus_dmamap_load_buffer()
- implement nexus_dmamap_load() in terms of _nexus_dmamap_load_buffer().
Note that this is untested, as this code is not currently used (but
might be later for UPA devices).
- move BUS_DMAMAP_NSEGS to bus_private.h
- disable the ecache flushing in nexus_dmamap_sync(); it should not be
needed, although the docs are not entirely clear on that.
- move some constants into iommureg.h
- correct some comments
- use KASSERT() in one place instead of rolling our own
- take a sanity check out of #ifdef DIAGNOSTIC
- fix a syntax error in normally #ifdef'ed out debug code
- tweak the announce message a bit
- remove '\n's from a few panic() calls
- don't use the DVMA base adress the firmware reports; instead, figure
it out from the appropriate register on Sabres and let the IOMMU code
choose it on Psychos. This also makes the IOMMU TSB size freely
selectable.
2.) pass the requesting child device (instead of the bus one) up when
handling interrupt resources
3.) remeber to mark the resource list entry as unused in
sbus_release_resource().
Reported by: scottl (3)
With a 1 byte transmit fifo, 3 byte receive fifo, and wierd multiplexed I/O
designed for a Z80 cpu, this chip redefines suckage.
Based on the openbsd and netbsd drivers. Only really works as a console,
modem support is not complete since I can't test it.
- Restore %g6 and %g7 for kernel traps if we are returning to prom code.
This allows complex traps (ones that call into C code) to be handled from
the prom.
be used for zones that allocate objects of less 1 page. The biggest advantage
of this is that all of a sudden the majority of kernel malloc-ed data doesn't
need kva allocated for it. Besides microbenchmarks I haven't seen a measurable
performance improvement from doing this.
useful for accessing more than 1 page of contiguous physical memory, and
to use 4mb tlb entries instead of 8k. This requires that the system only
use the direct mapped addresses when they have the same virtual colour as
all other mappings of the same page, instead of being able to choose the
colour and cachability of the mapping.
- Adapt the physical page copying and zeroing functions to account for not
being able to choose the colour or cachability of the direct mapped
address. This adds a lot more cases to handle. Basically when a page has
a different colour than its direct mapped address we have a choice between
bypassing the data cache and using physical addresses directly, which
requires a cache flush, or mapping it at the right colour, which requires
a tlb flush. For now we choose to map the page and do the tlb flush.
This will allows the direct mapped addresses to be used for more things
that don't require normal pmap handling, including mapping the vm_page
structures, the message buffer, temporary mappings for crash dumps, and will
provide greater benefit for implementing uma_small_alloc, due to the much
greater tlb coverage.
- Put the kernel tsb before before the kernel load address, below
VM_MIN_KERNEL_ADDRESS, instead of after the kernel where it consumes
usable kva. This is magic mapped so the virtual address is irrelevant,
it just needs to be out of the way.
a mapping belongs to by setting it in the vm_page_t structure that backs
the tsb page that the tte for a mapping is in. This allows the pmap that
a mapping belongs to to be found without keeping a pointer to it in the
tte itself.
- Remove the pmap pointer from struct tte and use the space to make the
tte pv lists doubly linked (TAILQs), like on other architectures. This
makes entering or removing a mapping O(1) instead of O(n) where n is the
number of pmaps a page is mapped by (including kernel_pmap).
- Use atomic ops for setting and clearing bits in the ttes, now that they
return the old value and can be easily used for this purpose.
- Use __builtin_memset for zeroing ttes instead of bzero, so that gcc will
inline it (4 inline stores using %g0 instead of a function call).
- Initially set the virtual colour for all the vm_page_ts to be equal to their
physical colour. This will be more useful once uma_small_alloc is
implemented, but basically pages with virtual colour equal to phsyical
colour are easier to handle at the pmap level because they can be safely
accessed through cachable direct virtual to physical mappings with that
colour, without fear of causing illegal dcache aliases.
In total these changes give a minor performance improvement, about 1%
reduction in system time during buildworld.
namely the ones for the timers, error handling and power management.
The registers for the timers, power management and PCI bus b errors are
reserved on Sabres (US-IIi) and can lead to false matches there.
Since all of them are never used for devices on the bus, they can be omitted
safely.
Approved by: re
import, as it breaks the relocation kernel modules built with the new
binutils.
Note that this, together with the binutils import, marks a kernel module
flag day on sparc64: modules built with the old binutils will not work
with new kernels and vice versa. Mismatches will result in panics.
Approved by: re
register to the one of the processor doing the interrupt setup. This
is required since this field is preinitialized to 0, but there exist
machines which have no processor with a MID of 0 (e.g. e450s with 1 or 2
processors).
Add some more macros for handle the interrupt mapping registers, and
rename some existing ones for consistency.
Approved by: re
are nevers used for PCI interrupts, but can cause false matches since
they are fully programmable.
2.) Skip the mapping registers for slot a2 and a3 on "psycho" bridges,
since they are not present there. Again, this could cause false matches,
which would result in the interrupt being delivered at most once.
Submitted by: jake (2)
Approved by: re
this is now done on all machines except for some known problematic ones.
Add an additional guard to make sure that the interrupt numbers are
in the correct range before swizzling. This should catch any remaining
models for which the swizzle is inappropriate.
Correct the swizzle calculation to account for the fact that the parent
interrupt numbers to be swizzled are 1-based.
Approved by: re
i386 cpu_thread_exit(). This resulted in a panic with WITNESS
since we need to hold Giant to call kmem_free(), and we weren't
helding it anymore in cpu_thread_exit(). We now do this from a
new MD function, cpu_thread_dtor(), called by thread_dtor().
Approved by: re@
Suggested by: jhb
Previously these were libc functions but were requested to
be made into system calls for atomicity and to coalesce what
might be two entrances into the kernel (signal mask setting
and floating point trap) into one.
A few style nits and comments from bde are also included.
Tested on alpha by: gallatin
to reflect its new location, and add page queue and flag locking.
Notes: (1) alpha, i386, and ia64 had identical implementations
of pmap_collect() in terms of machine-independent interfaces;
(2) sparc64 doesn't require it; (3) powerpc had it as a TODO.
1. At least some Netra t1 models have PCI buses with no associated
interrupt map, but obviously expect the PCI swizzle to be done with
the interrupt number from the higher level as intpin. In this case,
the mapping also needs to continue at parent bus nodes.
To handle that, add a quirk table based on the "name" property of
the root node to avoid breaking other boxen. This property is now
retrieved and printed at boot.
2. On SPARCengine Ultra AX machines, interrupt numbers are not mapped
at all, and full interrupt numbers (not just INOs) are given in
the interrupt properties. This is more or less cosmetical; the
PCI interrupt numbers would be wrong, but the psycho resource
allocation method would pass the right numbers on anyway.
Tested by: mux (1), Maxim Mazurok <maxim@km.ua> (2)
not look like the prerequisites to fill it in properly will be in the tree
for the upcoming release, but it's mostly done, so there is no need for these
to stay around to remind us.
for sparc64 from trap #9 to trap #65. This is one of the ABI "blessed"
system call vectors and is different from any other system that we might
want to emulate, making the emulation easier by reducing the number of
code paths that need to be shared. Compatibility with old applications
is provided with COMPAT_FREEBSD4.
Add defines for a few special traps that we may need to implement for
compatibility with 32bit applications, and add comments on which vectors
are used for what in other systems, and which are available.
Pass magic flags to trap() for deprecated or unimplemented system call
vectors so they will deliver SIGSYS instead of SIGILL.
This piggy backs nicely with the recent sigaction(2) system call number
change, and provided the rules are followed for upgrading past it, this
change should not be noticed.
handling clean and functional as 5.x evolves. This allows some of the
nasty bandaids in the 5.x codepaths to be unwound.
Encapsulate 4.x signal handling under COMPAT_FREEBSD4 (there is an
anti-foot-shooting measure in place, 5.x folks need this for a while) and
finish encapsulating the older stuff under COMPAT_43. Since the ancient
stuff is required on alpha (longjmp(3) passes a 'struct osigcontext *'
to the current sigreturn(2), instead of the 'ucontext_t *' that sigreturn
is supposed to take), add a compile time check to prevent foot shooting
there too. Add uniform COMPAT_43 stubs for ia64/sparc64/powerpc.
Tested on: i386, alpha, ia64. Compiled on sparc64 (a few days ago).
Approved by: re
streaming cache. This bug could have the potential to cause data
corruption on systems with Psycho U2P bridges (Sabre bridges have no
streaming cache).
However, due to the usual driver architecture, it is believed that
corruption did occur only in rare cases (if at all).
trap types and signals to send. Rearrange KASSERTs to better handle faults
early before curthread is setup, or in the case that it gets corrupted or
set to 0.
so that there is ony one copy of it. Fix that one copy
so that KSEs with no mailbox in a KSE program are not a cause
of page faults (this can legitmatly happen).
Submitted by: (parts) davidxu
same size. Add some fields that previously overlapped with something else
or were missing.
- Make struct regs and struct mcontext (minus floating point) the same as
struct trapframe so converting between them is easy (null).
- Add space for saving floating point state to struct mcontext. This requires
that it be 64 byte aligned.
- Add assertions that none of these structures change size, as they are part
of the ABI.
- Remove some dead code in sendsig().
- Save and restore %gsr in struct trapframe. Remember to restore %fsr.
- Add some comments to exception.S.
as sparc64/sparc64/dump_machdep.c a while back).
Other than ia64 (which uses ELF), sparc64 uses a homegrown format for
the dumps (headers are required because the physical address and size of
the tsb must be noted, and because physical memory may be discontiguous);
ELF would not offer any advantages here.
Reviewed by: jake
The primary reason for this is to allow MD code to process machine
specific attributes, segments or sections in the ELF file and
update machine specific state accordingly. An immediate use of this
is in the ia64 port where unwind information is updated to allow
debugging and tracing in/across modules. Note that this commit
does not add the functionality to the ia64 port. See revision 1.9
of ia64/ia64/elf_machdep.c.
Validated on: alpha, i386, ia64
ACL configuration changes, this shouldn't result in different code paths
for file systems not explicitly configured for ACLs by the system
administrator. For UFS1, administrators must still recompile their
kernel to add support for extended attributes; for UFS2, it's sufficient
to enable ACLs using tunefs or at mount-time (tunefs preferred for
reliability reasons). UFS2, for a variety of reasons, including
performance and reliability, is the preferred file system for use with
ACLs.
Approved by: re
bits that might be set in the firmware tte data field, and set the soft
flag TD_EXEC to mark the page executable. Failing to do the latter would
cause fatal instruction faults in the prom in certain situations.
Reviewed by: jake
recognized compat properties. This should make the psycho driver attach
properly on SPARCengine Ultra AX machines.
Switch to a table-driven logic to recognize the ID's, since their number
is now large enough to justify this.
These changes are analogous to those made in NetBSD r.1.35, but
implemented a bit differently.
NB: But it will enable it in all kernels not having options "NO_GEOM"
Put the GEOM related options into the intended order.
Add "options NO_GEOM" to all kernel configs apart from NOTES.
In some order of controlled fashion, the NO_GEOM options will be
removed, architecture by architecture in the coming days.
There are currently three known issues which may force people to
need the NO_GEOM option:
boot0cfg/fdisk:
Tries to update the MBR while it is being used to control
slices. GEOM does not allow this as a direct operation.
SCSI floppy drives:
Appearantly the scsi-da driver return "EBUSY" if no media
is inserted. This is wrong, it should return ENXIO.
PC98:
It is unclear if GEOM correctly recognizes all variants of
PC98 disklabels. (Help Wanted! I have neither docs nor HW)
These issues are all being worked.
Sponsored by: DARPA & NAI Labs.
doesn't give them enough stack to do much before blowing away the pcb.
This adds MI and MD code to allow the allocation of an alternate kstack
who's size can be speficied when calling kthread_create. Passing the
value 0 prevents the alternate kstack from being created. Note that the
ia64 MD code is missing for now, and PowerPC was only partially written
due to the pmap.c being incomplete there.
Though this patch does not modify anything to make use of the alternate
kstack, acpi and usb are good candidates.
Reviewed by: jake, peter, jhb
sparc v9 ABI. The Elf_Rela records for local symbols appear to already
have the symbol's value added in to the addend field, even though the ABI
specifies we need to lookup the symbol and add its value too. This breaks
text relocations in klds because the symbol's value is added twice, and
the resulting address points off into nowhere land, so for now just use
the addend.
Tested by: rwatson
so that it is MI. Allow nfs_mountroot to return an error if the nfs_diskless
struct is not valid, rather than panicing later on. Call nfs_setup_diskless()
from nfs_mountroot if NFS_ROOT is defined, like bootpc_init(). Removed legacy
root mount support for sparc64, and enabled NFS_ROOT by default.
MD function is just a wrapper around db_stack_trace_cmd() that prints out
a backtrace of curthread. Currently, this function is only implemented
on i386 and alpha (and the alpha version isn't quite tested yet, will do
that in a bit). Other changes:
- For i386, fix a bug in the raw frame address case. The eip we extract
from the passed in frame address does not match the frame we received.
Thus, instead of printing a bogus frame with the wrong eip, go ahead
and advance frame down to the same frame as the eip we are using.
- For alpha, attempt to add a way of doing a raw trace for alpha. Instead
of passing a frame address in 'addr', pass in a pointer to a structure
containing PC and KSP and use those to start the backtrace. The alpha
db_print_backtrace() uses asm to read in the current PC and KSP values
into such a request.
Tested on: i386
Requested by: many
under way to move the remnants of the a.out toolchain to ports. As the
comment in src/Makefile said, this stuff is deprecated and one should not
expect this to remain beyond 4.0-REL. It has already lasted WAY beyond
that.
Notable exceptions:
gcc - I have not touched the a.out generation stuff there.
ldd/ldconfig - still have some code to interface with a.out rtld.
old as/ld/etc - I have not removed these yet, pending their move to ports.
some includes - necessary for ldd/ldconfig for now.
Tested on: i386 (extensively), alpha
next step is to allow > 1 to be allocated per process. This would give
multi-processor threads. (when the rest of the infrastructure is
in place)
While doing this I noticed libkvm and sys/kern/kern_proc.c:fill_kinfo_proc
are diverging more than they should.. corrective action needed soon.
to userland in the signal handler that were not being iflled out before, but
should and can be.
This part of sendsig could be slightly refactored to use an MI interface, or
ideally, *sendsig*() would have an API change to accept a siginfo_t, which
would be filled out by an MI function in the level above sendsig, and said MI
function would make a small call into MD code to fill out the MD parts (some
of which may be bogus, such as the si_addr stuff in some places). This would
eventually make it possible for parts of the kernel sending signals to set up
a siginfo with meaningful information.
Reviewed by: mux
MFC after: 2 weeks
sysentvec. Initialized all fields of all sysentvecs, which will allow
them to be used instead of constants in more places. Provided stack
fixup routines for emulations that previously used the default.
in the original hardwired sysctl implementation.
The buf size calculator still overflows an integer on machines with large
KVA (eg: ia64) where the number of pages does not fit into an int. Use
'long' there.
Change Maxmem and physmem and related variables to 'long', mostly for
completeness. Machines are not likely to overflow 'int' pages in the
near term, but then again, 640K ought to be enough for anybody. This
comes for free on 32 bit machines, so why not?
These types are unlikely to ever become very MD. They include:
clockid_t, ct_rune_t, fflags_t, intrmask_t, mbstate_t, off_t, pid_t,
rune_t, socklen_t, timer_t, wchar_t, and wint_t.
While moving them, make a few adjustments (submitted by bde):
o __ct_rune_t needs to be precisely `int', not necessarily __int32_t,
since the arg type of the ctype functions is int.
o __rune_t, __wchar_t and __wint_t inherit this via a typedef of
__ct_rune_t.
o Some minor wording changes in the comment blocks for ct_rune_t and
mbstate_t.
Submitted by: bde (partially)
called <machine/_types.h>.
o <machine/ansi.h> will continue to live so it can define MD clock
macros, which are only MD because of gratuitous differences between
architectures.
o Change all headers to make use of this. This mainly involves
changing:
#ifdef _BSD_FOO_T_
typedef _BSD_FOO_T_ foo_t;
#undef _BSD_FOO_T_
#endif
to:
#ifndef _FOO_T_DECLARED
typedef __foo_t foo_t;
#define _FOO_T_DECLARED
#endif
Concept by: bde
Reviewed by: jake, obrien
Check if the trapped pc is inside of the demarked sections to implement
fault recovery for copyin etc, instead of pcb_onfault. Handle recovery
from data access exceptions as well as page faults.
Inspired by: bde's sys.dif
This is an architecture that present a thing message passing interface
to the OS. You can query as to how many ports and what kind are attached
and enable them and so on.
A less grand view is that this is just another way to package SCSI (SPI or
FC) and FC-IP into a one-driver interface set.
This driver support the following hardware:
LSI FC909: Single channel, 1Gbps, Fibre Channel (FC-SCSI only)
LSI FC929: Dual Channel, 1-2Gbps, Fibre Channel (FC-SCSI only)
LSI 53c1020: Single Channel, Ultra4 (320M) (Untested)
LSI 53c1030: Dual Channel, Ultra4 (320M)
Currently it's in fair shape, but expect a lot of changes over the
next few weeks as it stabilizes.
Credits:
The driver is mostly from some folks from Jeff Roberson's company- I've
been slowly migrating it to broader support that I it came to me as.
The hardware used in developing support came from:
FC909: LSI-Logic, Advansys (now Connetix)
FC929: LSI-Logic
53c1030: Antares Microsystems (they make a very fine board!)
MFC after: 3 weeks
conventions for _mcount and __cyg_profile_func_enter are different, so
statistical profiling kernels build and link but don't actually work.
IWBNI one could tell gcc to only generate calls to the former.
Define uintfptr_t properly for userland, but not for the kernel (I hope).
<stdint.h>. Previously, parts were defined in <machine/ansi.h> and
<machine/limits.h>. This resulted in two problems:
(1) Defining macros in <machine/ansi.h> gets in the way of that
header only defining types.
(2) Defining C99 limits in <machine/limits.h> adds pollution to
<limits.h>.
userland for libc/gmon to compile, so the typedef in <machine/types.h>
isn't good enough. This is really ugly since we end up with the
actual value which uintfptr_t is typedef'd from, in multiple places.
This is bug for bug compatible with the other FreeBSD architectures.
Noticed by: sparc64 tinderbox
basically maps all of physical memory 1:1 to a range of virtual addresses
outside of normal kva. The advantage of doing this instead of accessing
phsyical addresses directly is that memory accesses will go through the
data cache, and will participate in the normal cache coherency algorithm
for invalidating lines in our own and in other cpus' data caches. So
we don't have to flush the cache manually or send IPIs to do so on other
cpus. Also, since the mappings never change, we don't have to flush them
from the tlb manually.
This makes pmap_copy_page and pmap_zero_page MP safe, allowing the idle
zero proc to run outside of giant.
Inspired by: ia64
handler in the kernel at the same time. Also, allow for the
exec_new_vmspace() code to build a different sized vmspace depending on
the executable environment. This is a big help for execing i386 binaries
on ia64. The ELF exec code grows the ability to map partial pages when
there is a page size difference, eg: emulating 4K pages on 8K or 16K
hardware pages.
Flesh out the i386 emulation support for ia64. At this point, the only
binary that I know of that fails is cvsup, because the cvsup runtime
tries to execute code in pages not marked executable.
Obtained from: dfr (mostly, many tweaks from me).
of them, and couple them by always performing all operations on all
present IOMMUs. This is required because with the current API there
is no way to determine on which bus a busdma operation is performed.
While being there, clean up the iommu code a bit.
This should be a step in the direction of allow some of larger machines
to work; tests have shown that there still seem to be problems left.
o Assert that the page queues lock is held in vm_page_unwire().
o Make vm_page_lock_queues() and vm_page_unlock_queues() visible
to kernel loadable modules.
choosethread() in MI C code instead of doing it in in assembly in all the
various cpu_switch() functions. This fixes problems on ia64 and sparc64.
Reviewed by: julian, peter, benno
Tested on: i386, alpha, sparc64
itself; this causes undefined behaviour on UltraSPARCs. In particular,
the interrupt packet data words will not necessarily be delivered
correctly, which would result in a crash.
This bug also caused the cache-flushing work to be done twice on the
triggering CPU (when it did not cause crashes).
Reviewed by: jake
hardly MD, since all our platforms share the same macro. It's not
really compiler dependent either, but this helps in reducing
<machine/ansi.h> to only type definitions.
threaded VM pagezero kthread outside of Giant. For some platforms, this
is really easy since it can just use the direct mapped region. For others,
IPI sending is involved or there are other issues, so grab Giant when
needed.
We still have preemption issues to deal with, but Alan Cox has an
interesting suggestion on how to minimize the problem on x86.
Use Luigi's hack for preserving the (lack of) priority.
Turn the idle zeroing back on since it can now actually do something useful
outside of Giant in many cases.
pmap_swapin_proc/pmap_swapout_proc functions from the MD pmap code
and use a single equivalent MI version. There are other cleanups
needed still.
While here, use the UMA zone hooks to keep a cache of preinitialized
proc structures handy, just like the thread system does. This eliminates
one dependency on 'struct proc' being persistent even after being freed.
There are some comments about things that can be factored out into
ctor/dtor functions if it is worth it. For now they are mostly just
doing statistics to get a feel of how it is working.
we just have to deal with the kstack when told to. We do not have a
UMA-managed cache for the proc struct and its associated upage yet. So,
go back to the old lazy mechanism. Note that if UMA destroys pages that
used to contain proc structures, we'll lose the corresponding upage
forever. (zones never did this - once a page was allocated, it stayed
attached to the proc zone forever)
and function) with existing configuration choices. Arguably if
ALT_BREAK_TO_DEBUGGER was present, so should have been
BREAK_TO_DEBUGGER. Regardless, it broke the option sort order in
these kernel configuration files.
Requested by: bde
The ability to schedule multiple threads per process
(one one cpu) by making ALL system calls optionally asynchronous.
to come: ia64 and power-pc patches, patches for gdb, test program (in tools)
Reviewed by: Almost everyone who counts
(at various times, peter, jhb, matt, alfred, mini, bernd,
and a cast of thousands)
NOTE: this is still Beta code, and contains lots of debugging stuff.
expect slight instability in signals..
installed with pmap_kenter_flags, since the physical addresses may not
have an associated vm_page. Add a function to do this.
Tested by: Tomi Vainio <Tomi.Vainio@Sun.COM>
obtained, when all other scheduling activity is suspended. This is needed
on sparc64 to deactivate the vmspace of the exiting process on all cpus.
Otherwise if another unrelated process gets the exact same vmspace structure
allocated to it (same address), its address space will not be activated
properly. This seems to fix some spontaneous signal 11 problems with smp
on sparc64.
dcache aliasing. A page that already had more than 1 mapping of the
same virtual colour would not be correctly uncached.
Noticed by: Artur Grabowski <art@openbsd.org>
implementations can provide a base zero ffs function if they wish.
This changes
#define RQB_FFS(mask) (ffs64(mask))
foo = RQB_FFS(mask) - 1;
to
#define RQB_FFS(mask) (ffs64(mask) - 1)
foo = RQB_FFS(mask);
On some platforms we can get the "- 1" for free, eg: those that use the
C code for ffs64().
Reviewed by: jake (in principle)
magic numbers. Use stxa_sync instead of stxa; membar #Sync; to ensure
that no instruction is placed between the two. This can cause random
corruption even though interrupts are already disabled.
code. Both tasks are not always performed completely by the firmware.
The former is required to get some e450 models to boot; the latter fixes
the repeated fifo underruns with hme(4)s and gem(4)s observed on some
machines (and probably performance problems with other peripherals as
well).
in their tlb which the prom doesn't clear out, so we have to do so manually
before mapping the kernel page table or the cpu can hang due various
conditions which cause undefined behaviour from the tlb.
- ktrace no longer requires Giant so do ktrace syscall events before and
after acquiring and releasing Giant, respectively.
- For i386, ia32 syscalls on ia64, powerpc, and sparc64, get rid of the
goto bad hack and instead use the model on ia64 and alpha were we
skip the actual syscall invocation if error != 0. This fixes a bug
where if we the copyin() of the arguments failed for a syscall that
was not marked MP safe, we would try to release Giant when we had
not acquired it.
the pv lists in the vm_page, even unmanaged kernel mappings. This is so
that the virtual cachability of these mappings can be tracked when a page
is mapped to more than one virtual address. All virtually cachable
mappings of a physical page must have the same virtual colour, or illegal
alises can be created in the data cache. This is a bit tricky because we
still have to recognize managed and unmanaged mappings, even though they
are all on the pv lists.
value of the tag or data field.
Add macros for getting the page shift, size and mask for the physical page
that a tte maps (which may be one of several sizes).
Use the new cache functions for invalidating single pages.
a floating point instruction into a 6-bit register number for
double and quad arguments.
Make use of the new INSFPdq_RN macro where apporpriate; this
is required for correctly handling the "high" fp registers
(>= %f32).
Fix a number of bugs related to the handling of the high registers
which were caused by using __fpu_[gs]etreg() where __fpu_[gs]etreg64()
should be used (the former can only access the low, single-precision,
registers).
Submitted by: tmm
Rearrange things slightly so that the contents of the tag access
register are read and restored outside of the macros. The intention
is to pass the page size to look up as an argument to the macros.
i386/ia64/alpha - catch up to sparc64/ppc:
- replace pmap_kernel() with refs to kernel_pmap
- change kernel_pmap pointer to (&kernel_pmap_store)
(this is a speedup since ld can set these at compile/link time)
all platforms (as suggested by jake):
- gc unused pmap_reference
- gc unused pmap_destroy
- gc unused struct pmap.pm_count
(we never used pm_count - we track address space sharing at the vmspace)
the symbol index defined by the relocation. The elf_lookup() support
function is to be used by elf_reloc() when symbol lookups need to be
done. The elf_lookup() function operates on the symbol index and
will do a symbol name based lookup when such is required, otherwise
it uses the symbol index directly. This solves the problem seen on
ia64 where the symbol hash table does not contain local symbols and
a symbol name based lookup would fail for those symbols.
Don't pass the symbol name to elf_reloc(), as it isn't used any more.
user stack in response to a failed window fill, allowing the process to be
killed if its wrong. This caused user programs which misalign their stack
pointer to get stuck in an infinite loop at the kernel-userland boundary,
which is mostly harmless.
The same thing causes a fatal RED state exception on OpenBSD and probably
NetBSD.
Inspired by: art@openbsd.org
and pmap_copy_page(). This gets rid of a couple more physical addresses
in upper layers, with the eventual aim of supporting PAE and dealing with
the physical addressing mostly within pmap. (We will need either 64 bit
physical addresses or page indexes, possibly both depending on the
circumstances. Leaving this to pmap itself gives more flexibilitly.)
Reviewed by: jake
Tested on: i386, ia64 and (I believe) sparc64. (my alpha was hosed)
_BYTE_ORDER. These are far more useful than their non-underscored
equivalents as these can be used in restricted namespace environments.
Mark the non-underscored variants as deprecated.
and add some compatibility defines. Add fields for ins and locals to
struct reg also for the same reason; these aren't filled in yet because
getting at those registers sucks and I'd rather not save them in the
trapframe just for this. Reorder struct reg to be ABI compatible as
well. Add needed include of machine/emul.h.
This gets pmdb (poor man's debugger) from OpenBSD mostly compiling but it
doesn't work yet :(