are machine dependent because they are not required to update the tlb when
mappings are added or removed, and doing so is machine dependent.
In addition, an implementation may require that pages mapped with pmap_kenter
have a backing vm_page_t, which is not necessarily true of all physical
pages, and so may choose to pass the vm_page_t to pmap_kenter instead of the
physical address in order to make this requirement clear.
in geom_disk.c.
As a side effect this makes a lot of #include <sys/devicestat.h>
lines not needed and some biofinish() calls can be reduced to
biodone() again.
not save (restore) the global pointer (GP) in the jmpbuf in setjmp
(longjmp) because it's not needed in general. GP is considered a
scratch register at callsites and hence is always restored after a
call (when it's possible that the call resolves to a symbol in a
different loadmodule; otherwise GP does not have to be saved and
restored at all), including calls to setjmp/longjmp. There's just
one problem with this now that we use setjmp/longjmp for context
switching: A new context must have GP defined properly for the
thread's entry point. This means that we need to put GP in the
jmpbuf and consequently that we have to restore is in longjmp.
This automaticly requires us to save it as well.
When setjmp/longjmp isn't used for context switching, this can be
reverted again.
the J_SIG0 field. While here, rename J_SIG0 to J_SIGSET and
remove J_SIG1. The main reason for this change is that the
128-bit sigset_t is now aligned on a 16-byte boundary, which
allows us to use 16-byte atomic loads and stores on CPUs that
support it. The removal of J_SIG1 is done to avoid confusion:
it is never accessed and should not be. Renaming J_SIG0 to
J_SIGSET is the icing on the cake that's better done now than
later.
branches:
Initialize struct cdevsw using C99 sparse initializtion and remove
all initializations to default values.
This patch is automatically generated and has been tested by compiling
LINT with all the fields in struct cdevsw in reverse order on alpha,
sparc64 and i386.
Approved by: re(scottl)
- Get rid of the useless atop() / pmap_phys_address() detour. The
device mmap handlers must now give back the physical address
without atop()'ing it.
- Don't borrow the physical address of the mapping in the returned
int. Now we properly pass a vm_offset_t * and expect it to be
filled by the mmap handler when the mapping was successful. The
mmap handler must now return 0 when successful, any other value
is considered as an error. Previously, returning -1 was the only
way to fail. This change thus accidentally fixes some devices
which were bogusly returning errno constants which would have been
considered as addresses by the device pager.
- Garbage collect the poorly named pmap_phys_address() now that it's
no longer used.
- Convert all the d_mmap_t consumers to the new API.
I'm still not sure wheter we need a __FreeBSD_version bump for this,
since and we didn't guarantee API/ABI stability until 5.1-RELEASE.
Discussed with: alc, phk, jake
Reviewed by: peter
Compile-tested on: LINT (i386), GENERIC (alpha and sparc64)
Runtime-tested on: i386
dev_t to the method functions.
The dev_t can still be found at struct consdev *->cn_dev.
Add a void *cn_arg element to struct consdev which the drivers can use
for retrieving their softc.
I was in two minds as to where to put them in the first case..
I should have listenned to the other mind.
Submitted by: parts by davidxu@
Reviewed by: jeff@ mini@
is already in pages, so we should not convert from bytes to pages.
The result of this bug was bad scaling of the VHPT relative to the
available memory.
Submitted by: Arun Sharma <arun@sharma-home.net>
o Add a MD header private to libc called _fpmath.h; this header
contains bitfield layouts of MD floating-point types.
o Add a MI header private to libc called fpmath.h; this header
contains bitfield layouts of MI floating-point types.
o Add private libc variables to lib/libc/$arch/gen/infinity.c for
storing NaN values.
o Add __double_t and __float_t to <machine/_types.h>, and provide
double_t and float_t typedefs in <math.h>.
o Add some C99 manifest constants (FP_ILOGB0, FP_ILOGBNAN, HUGE_VALF,
HUGE_VALL, INFINITY, NAN, and return values for fpclassify()) to
<math.h> and others (FLT_EVAL_METHOD, DECIMAL_DIG) to <float.h> via
<machine/float.h>.
o Add C99 macro fpclassify() which calls __fpclassify{d,f,l}() based
on the size of its argument. __fpclassifyl() is never called on
alpha because (sizeof(long double) == sizeof(double)), which is good
since __fpclassifyl() can't deal with such a small `long double'.
This was developed by David Schultz and myself with input from bde and
fenner.
PR: 23103
Submitted by: David Schultz <dschultz@uclink.Berkeley.EDU>
(significant portions)
Reviewed by: bde, fenner (earlier versions)
uio segment is empty. In this case no dma segment is create by
bus_dmamap_load_buffer, but the calling routine clears the first flag.
Under certain combinations of addresses of the first and second mbuf/uio
buffer this leads to corrupted DMA segment descriptors. This was already
fixed by tmm in sparc64/sparc64/iommu.c.
PR: kern/47733
Reviewed by: sam
Approved by: jake (mentor)
statclock based on profhz when profiling is enabled MD, since most platforms
don't use this anyway. This removes the need for statclock_process, whose
only purpose was to subdivide profhz, and gets the profiling clock running
outside of sched_lock on platforms that implement suswintr.
Also changed the interface for starting and stopping the profiling clock to
do just that, instead of changing the rate of statclock, since they can now
be separate.
Reviewed by: jhb, tmm
Tested on: i386, sparc64
likely not present under the simulator. If multiple partitions are
present on the virtual disk, then the 'a' partition would be the
most logical choice. Nowadays partitions are GPT based, which would
make the assumption of a disklabel even more questionable. Given
all the possible scenarios, assuming a raw "device" seems best.
saving and restoring ia32 specific registers when switching
context and ia32 support has not been compiled-in. The primary
reason for this change is that one of the ia32 registers (ar.fcr)
is wrongly marked as invalid by the simulator. Now that we avoid
using the register when possible, usability is improved. The
secundary reason is that it saves us 7 loads and stores.
Note that the PCB will continue to have room for these registers,
irrespective of the IA32 option. There are no benefits that make
it worthwhile.
and instead add platform, firmware and EFI stubs to the loader.
The net effect of this change is that besides a special console and
disk driver, the kernel has no knowledge of the simulator. This has
the following advantages:
o Simulator support is much harder to break,
o It's easier to make use of more feature complete simulators.
This would only need a change in the simulator specific loader,
o Running SMP kernels within the simulator. Note that ski at this
time does not simulate IPIs, so there's no way to start APs.
The platform, firmware and EFI stubs describe the following hardware:
o 4 CPU Itanium,
o 128 MB RAM within the 4GB address space,
o 64 MB RAM above the 4GB address space.
NOTE: The stubs in the skiloader describe a machine that should in
parts be defined by the simulator. Things like processor interrupt
block and AP wakeup vector cannot be choosen at random because they
require interpretation by the simulator. Currently the simulator is
ignorant of this.
This change introduces an unofficial SSC call SSC_SAL_SET_VECTORS
which is ignored by the simulator.
Tested with: ski (version 0.943 for linux)
I'm not convinced there is anything major wrong with the patch but
them's the rules..
I am using my "David's mentor" hat to revert this as he's
offline for a while.
I belive it got here by copy&paste and I see no signs in the source
code that BIO_DELETE was dealt with correctly and can only wonder
what kind of trouble this may have caused.
data structure called kse_upcall to manage UPCALL. All KSE binding
and loaning code are gone.
A thread owns an upcall can collect all completed syscall contexts in
its ksegrp, turn itself into UPCALL mode, and takes those contexts back
to userland. Any thread without upcall structure has to export their
contexts and exit at user boundary.
Any thread running in user mode owns an upcall structure, when it enters
kernel, if the kse mailbox's current thread pointer is not NULL, then
when the thread is blocked in kernel, a new UPCALL thread is created and
the upcall structure is transfered to the new UPCALL thread. if the kse
mailbox's current thread pointer is NULL, then when a thread is blocked
in kernel, no UPCALL thread will be created.
Each upcall always has an owner thread. Userland can remove an upcall by
calling kse_exit, when all upcalls in ksegrp are removed, the group is
atomatically shutdown. An upcall owner thread also exits when process is
in exiting state. when an owner thread exits, the upcall it owns is also
removed.
KSE is a pure scheduler entity. it represents a virtual cpu. when a thread
is running, it always has a KSE associated with it. scheduler is free to
assign a KSE to thread according thread priority, if thread priority is changed,
KSE can be moved from one thread to another.
When a ksegrp is created, there is always N KSEs created in the group. the
N is the number of physical cpu in the current system. This makes it is
possible that even an userland UTS is single CPU safe, threads in kernel still
can execute on different cpu in parallel. Userland calls kse_create to add more
upcall structures into ksegrp to increase concurrent in userland itself, kernel
is not restricted by number of upcalls userland provides.
The code hasn't been tested under SMP by author due to lack of hardware.
Reviewed by: julian
indicate that uma_small_alloc should not. This code should be refactored so
that there is not so much cross arch duplication.
Reviewed by: jake
Spotted by: tmm
Tested on: alpha, sparc64
Pointy hat to: jeff and everyone who cut and pasted the bad code. :-)
metadata. This fixes module dependency resolution by the kernel linker on
sparc64, where the relocations for the metadata are different than on other
architectures; the relative offset is in the addend of an Elf_Rela record
instead of the original value of the location being patched.
Also fix printf formats in debug code.
Submitted by: Hartmut Brandt <brandt@fokus.gmd.de>
PR: 46732
Tested on: alpha (obrien), i386, sparc64
portable copy. Note that pmap_extract() must be used instead of
pmap_kextract().
This is precursor work to a reorganization of vmapbuf() to close remaining
user/kernel races (which can lead to a panic).
and declare them extern in interrupt.c. This eliminates the need
for ia64_add_sapic(), which is called from sapic.c.
While here, reformat ia64_enable() in interrupt.c to improve
indentation and add a sysctl (machdep.apic) to dump the I/O APIC
entries currently programmed into all I/O APICs. The latter can
help analyze interrupt problems.
Note that the sysctl is not intended as a userland (software)
interface. It may be changed in the future to include counters
so that vmstat -i can make use of it. It may also be removed...
copies of the reload. Note that we use the precomputed itm_reload value
so that we can avoid a division in the kernel. The ia64 cpu does not
have integer divide, so this would have been done by a floating point
operation.
CLOCK_VECTOR and define it as 254, not 255. Vector 255 is already
in use as the AP wakeup vector on the HP rx2600.
This needs to be made more dynamic. The likelyhood of vector 254
being in use is pretty small, but we already have code to assign
vectors to IPIs (see sal.c) and it's preobably better to have a
centralized "vector manager" that hands out vectors based on
some imput (like priority).
handleclock itself is trivial.
While here, replace (itc_frequency+hz/2)/hz with itm_reload for
consistency. There's now a single place where we determine the
ITM reload value.
interrupt block). We use the previously hardcoded address as a
default only, but will otherwise use whatever ACPI tells us.
The address can be found in the MADT table header or in the
LAPIC override table entry.
space most of the time, but handles machines with lots of I/O
(S)APICs. We cannot make this more dynamic without breaking the
interface with vmstat. Hence, we need to fix the interface first.
name of unused entries from "intr XXX" to "#XXX". This makes it
easier to debug interrupt problems, because vmstat can be hacked
more easily to dump all interrupt entries that are in use and not
those that have had interrupts.
devices aren't necessarily mapped within 4GB. I/O port addresses
are offsets into the memory mapped I/O port space, which is not
larger than 16MB. No need to convert those to 64 bit types.
o Make the URL of the handbook match reality
o Improve some comments (either wording or formatting)
o Sync with i386: comment-out DDB, INVARIANTS, INVARIANT_SUPPORT
o Add some more SCSI/RAID controllers:
ahd, mpt, asr, ciss, dpt, iir, mly, ida
o Remove support for the parallel port
o Add NICs: em, bge
o Remove NICs: ste, tl, tx, vr, wb
o Enable USB support again, except of the UHCI host controller.
UHCI still hangs the BigSur (=HP i2000) machines, and makes
them useless. The OHCI controller works fine. Note that newer
ia64 boxes based on the Intel host controllers (UHCI or EHCI)
still won't have USB support. We really need to import the
EHCI host controller from NetBSD...
of the `machdep.acpi_root' sysctl. This is required on ia64
because the root pointer hardly ever, if at all, lives in the
first MB of memory and also because scanning the first MB of
memory can cause machine checks.
This provides a save and reliable way for ACPI tools to work
with the tables if ACPI support is present in the kernel. On
ia64 ACPI is non-optional.
end up with a dump offset that's smaller than the start of the
dump device and either clobber data in preceding partitions or
try to write beyond the end of the medium (unsigned wrap).
Implement legacy behaviour to never write to the first 64KB as
that is where metadata (ie disklabels) may reside.
The HCDP table is one (non-proprietary) way for the platform to
inform the OS about headless operation. This field would normally
hold the address as can be found by scanning the EFI system table,
which we also pass to the kernel. The apparent duplication allows
us to synthesize a HCDP table in the loader by whatever means we
can think of, including relocating the platform table into pre-
mapped address space. In short: it gives us more freedom.
Approved by: re (blanket)
of that, there's some nasty process corruption when running with
SMP.
Note that this was already in effect for the 5.0-RC1 kernels in
the form of a local patch.
Approved by: re (blanket)
Add function map_port_space() to map the memory mapped I/O port
range as uncacheable virtual memory and call it prior to probing
for a console. This removes the dependency on the loader to have
done this for us. Note that this change does not include doing
the same for APs.
Approved by: re (blanket)
the kernel itself, but SAL on Itanium2 machines spontaneously
rebooted the machine.
Approved by: re (blanket)
Submitted by: Arun Sharma <adsharma@unix-os.sc.intel.com>
i386 cpu_thread_exit(). This resulted in a panic with WITNESS
since we need to hold Giant to call kmem_free(), and we weren't
helding it anymore in cpu_thread_exit(). We now do this from a
new MD function, cpu_thread_dtor(), called by thread_dtor().
Approved by: re@
Suggested by: jhb
- Clear the PG_WRITEABLE flag in pmap_page_protect() if write access is
being removed. Return immediately if write access is being removed and
PG_WRITEABLE is already clear.
Previously these were libc functions but were requested to
be made into system calls for atomicity and to coalesce what
might be two entrances into the kernel (signal mask setting
and floating point trap) into one.
A few style nits and comments from bde are also included.
Tested on alpha by: gallatin
to worry about ABI vs released systems yet. This is mostly transparent
since there is no significant exposure in the syscall interface. The
things that go wrong are mostly userland stuff - time(&intvariable).
Reviewed by: dfr, marcel
Approved by: re (jhb)
to reflect its new location, and add page queue and flag locking.
Notes: (1) alpha, i386, and ia64 had identical implementations
of pmap_collect() in terms of machine-independent interfaces;
(2) sparc64 doesn't require it; (3) powerpc had it as a TODO.
Don't force 16-byte alignment at run-time. Do it at compile-time.
This saves us the pointer fiddling by the setjmp functions and
reduces complexity. While here, increase the jmp_buf by 16 bytes
to an even 512 bytes. Coincidentally, due to the way alignment
was handled prior to this change, the jmp_buf has not changed in
size, but only in how the space is used. Prior to this change
the 16 bytes were reserved for enforcing alignment; now they are
reserved by us for future extensions.
Therefore, this ABI breaker is relatively save: the failure is
always an alignment trap.
sysctls to MI code; this reduces code duplication and makes all of them
available on sparc64, and the latter two on powerpc.
The semantics by the i386 and pc98 hw.availpages is slightly changed:
previously, holes between ranges of available pages would be included,
while they are excluded now. The new behaviour should be more correct
and brings i386 in line with the other architectures.
Move physmem to vm/vm_init.c, where this variable is used in MI code.
not look like the prerequisites to fill it in properly will be in the tree
for the upcoming release, but it's mostly done, so there is no need for these
to stay around to remind us.