architecture from page queue lock to a hashed array of page locks
(based on a patch by Jeff Roberson), I've implemented page lock
support in the MI code and have only moved vm_page's hold_count
out from under page queue mutex to page lock. This changes
pmap_extract_and_hold on all pmaps.
Supported by: Bitgravity Inc.
Discussed with: alc, jeffr, and kib
Clearing a page table entry's accessed bit and setting the page's
PG_REFERENCED flag in pmap_protect() can't really be justified, so
don't do it. Moreover, on ia64, don't set the page's dirty field
unless pmap_protect() is removing write access.
In the end, it does help fixing /dev/io usage from multithreaded
processes.
- On i386 and amd64 the old behaviour is kept but multithreaded
processes must use the new interface in order to work well.
- Support for the other architectures is greatly improved, where
necessary, by the necessity to define very small things now.
Manpage update will happen shortly.
Sponsored by: Sandvine Incorporated
PR: threads/116181
Reviewed by: emaste, marcel
MFC after: 3 weeks
pmap_ts_referenced() is not always appropriate for checking whether or
not pages have been referenced because it clears any reference bits
that it encounters. For example, in mincore(), clearing the reference
bits has two negative consequences. First, it throws off the activity
count calculations performed by the page daemon. Specifically, a page
on which mincore() has called pmap_ts_referenced() looks less active
to the page daemon than it should. Consequently, the page could be
deactivated prematurely by the page daemon. Arguably, this problem
could be fixed by having mincore() duplicate the activity count
calculation on the page. However, there is a second problem for which
that is not a solution. In order to clear a reference on a 4KB page,
it may be necessary to demote a 2/4MB page mapping. Thus, a mincore()
by one process can have the side effect of demoting a superpage
mapping within another process!
The sequence number is used as the name of a sysctl node,
under which we add the MCA records using the CPU id as the
leaf name.
Add the hw.mca.inject sysctl to provide a way to inject
MC errors and trigger machine checks.
PR: ia64/113102
o Switch to ITANIUM2 has the cpu. This has absolutely no effect
on the code, but makes for a better example.
o Drop COMPAT_FREEBSD6. We're tier 2, so you're supposed to run
8-stable or newer.
o Add PREEMPTION. It works now.
o Remove HWPMC_HOOKS. We don't have support for hwpmc yet.
o Add a bunch of new devices: atapist, hptiop, amr, ips, twa, igb,
ixgbe, ae, age, alc, ale, bce, bfe, et, jme, msk, nge, sk, ste,
stge, tx, vge, axe, rue, udav, fwip, and all USB serial.
o Remove "legacy" devices: le, vx, dc, pcn, rl, sis.
Make sure to the module list is a superset of what goes into GENERIC.
with PCI busses. Remove nexus_read_ivar() and nexus_write_ivar()
to give default behaviour. Remove <machine/nexusvar.h> as well,
because there's nothing in it that's being used.
to ia64_enable_intr(). This reduces confusion with intr_disable() and
intr_restore().
Have configure_final() call ia64_finalize_intr() instead of enable_intr()
in preparation of adding support for binding interrupts to all CPUs.
have the BSP use IPIs to trigger clock interrupts on the APs.
This allows us to run on hardware configurations for which the
ITC has non-uniform frequencies across CPUs.
While here, change the clock XIV to type IPI so as to protect
the interrupt delivery against CPU re-balancing once that's
implemented.
to the image_params struct instead of several members of that struct
individually. This makes it easier to expand its arguments in the future
without touching all platforms.
Reviewed by: jhb
other than in a potentially dangerous KASSERT.
o Hand-inline pmap_remove_page() as it's only called from 1 place and
the abstraction that pmap_remove_page() provides is not enough to
warrant the obfuscation. Eliminate the dangerous KASSERT in the
process.
o In pmap_remove_pte(), remove the KASSERT for pmap being the current
one as it's not safe in the face of CPU migration.
than in a KASSERT. The KASSERT is broken in that it's done outside the
critical section and as such isn't protected against CPU migration.
Improve pmap_invalidate_page() as follows:
o calculate vhpt_ofs inside the critical region for exactly the same
reason.
o calculate the tag outside the FOREACH loop, as it's loop-invariant.
This is more efficient.
o Replace the test and set with an atomic cmpset operation because we
are changing other CPU's VHPT tables and this avoids invalidating
after the entry got modified. Not necessarily a problem, but better
safe than sorry.
before we grab the mutex. Don't assert that they must be disabled at
that point. We pretty much bypass all logic in that case anyway and
leave immediately, so there's no harm.
preemption doesn't happen until after all pending interrupt have
been services.
While here again, simplify the EOI handling by doing it after we
call the XIV-specific handlers, rather than in each of them. The
original thought was that we may want to do an EOI first and the
actual IPI handling next, but that's mostly a micro-optimization.
cycles. This serves 2 purposes:
1. It prevents preemption and CPU migration while running SAL code.
2. It reduces the chance of stack overflows: we're supposed to enter
SAL with at least 16KB of either memory- or register stack space,
which we can't do without switching to a different stack.
This is not for multiple inclusion purposes, because _regset.h already
handles this, but to enable inclusion of the MD header by cross-tools
on non-ia64 installations. The cross-tool can include _regset.h itself
before including MD headers that depend on it.
o Introduce XIV, eXternal Interrupt Vector, to differentiate from
the interrupts vectors that are offsets in the IVT (Interrupt
Vector Table). There's a vector for external interrupts, which
are based on the XIVs.
o Keep track of allocated and reserved XIVs so that we can assign
XIVs without hardcoding anything. When XIVs are allocated, an
interrupt handler and a class is specified for the XIV. Classes
are:
1. architecture-defined: XIV 15 is returned when no external
interrupt are pending,
2. platform-defined: SAL reports which XIV is used to wakeup
an AP (typically 0xFF, but it's 0x12 for the Altix 350).
3. inter-processor interrupts: allocated for SMP support and
non-redirectable.
4. device interrupts (i.e. IRQs): allocated when devices are
discovered and are redirectable.
o Rewrite the central interrupt handler to call the per-XIV
interrupt handler and rename it to ia64_handle_intr(). Move
the per-XIV handler implementation to the file where we have
the XIV allocation/reservation. Clock interrupt handling is
moved to clock.c. IPI handling is moved to mp_machdep.c.
o Drop support for the Intel 8259A because it was broken. When
XIV 0 is received, the CPU should initiate an INTA cycle to
obtain the interrupt vector of the 8259-based interrupt. In
these cases the interrupt controller we should be talking to
WRT to masking on signalling EOI is the 8259 and not the I/O
SAPIC. This requires adriver for the Intel 8259A which isn't
available for ia64. Thus stop pretending to support ExtINTs
and instead panic() so that if we come across hardware that
has an Intel 8259A, so have something real to work with.
o With XIVs for IPIs dynamically allocatedi and also based on
priority, define the IPI_* symbols as variables rather than
constants. The variable holds the XIV allocated for the IPI.
o IPI_STOP_HARD delivers a NMI if possible. Otherwise the XIV
assigned to IPI_STOP is delivered.
a long time and has gone unnoticed just as long, because I kept
using sched_4bsd (due to sched_ule not working with preemption),
but GENERIC had sched_ule by default -- including SMP.
While here, remove unused inclusion of <machine/clock.h>, remove
totally bogus inclusion of <i386/include/specialreg.h>.
COMPAT_43TTY enables the sgtty interface. Even though its exposure has
only been removed in FreeBSD 8.0, it wasn't used by anything in the base
system in FreeBSD 5.x (possibly even 4.x?). On those releases, if your
ports/packages are less than two years old, they will prefer termios
over sgtty.
for upcoming 64-bit PowerPC and MIPS support. This renames the COMPAT_IA32
option to COMPAT_FREEBSD32, removes some IA32-specific code from MI parts
of the kernel and enhances the freebsd32 compatibility code to support
big-endian platforms.
Reviewed by: kib, jhb
o Assign vectors based on priority, because vectors have
implied priority in hardware.
o Use unordered memory accesses to the I/O sapic and use
the acceptance form of the mf instruction.
o Remove the sapicreg.h and sapicvar.h headers. All definitions
in sapicreg.h are private to sapic.c and all definitions in
sapicvar.h are either private or interface functions. Move the
interface functions to intr.h.
o Hide the definition of struct sapic.
o Eliminate IA64_PHYS_TO_RR6 and change all places where the macro is used
by calling either bus_space_map() or pmap_mapdev().
o Implement bus_space_map() in terms of pmap_mapdev() and implement
bus_space_unmap() in terms of pmap_unmapdev().
o Have ia64_pib hold the uncached virtual address of the processor interrupt
block throughout the kernel's life and access the elements of the PIB
through this structure pointer.
This is a non-functional change with the exception of using ia64_ld1() and
ia64_st8() to write to the PIB. We were still using assignments, for which
the compiler generates semaphore reads -- which cause undefined behaviour
for uncacheable memory. Note also that the memory barriers in ipi_send() are
critical for proper functioning.
With all the mapping of uncached memory done by pmap_mapdev(), we can keep
track of the translations and wire them in the CPU. This then eliminates
the need to reserve a whole region for uncached I/O and it eliminates
translation traps for device I/O accesses.
the 'debugging' section of any HEAD kernel and enable for the mainstream
ones, excluding the embedded architectures.
It may, of course, enabled on a case-by-case basis.
Sponsored by: Sandvine Incorporated
Requested by: emaste
Discussed with: kib
path. When the taken branch leaves the kernel and enters the process,
we still need to execute the instruction at that address. Don't raise
SIGTRAP when we branch into the process, but enable single-stepping
instead.