for now. It introduces a OFW PCI bus driver and a generic OFW PCI-PCI
bridge driver. By utilizing these, the PCI handling is much more elegant
now.
The advantages of the new approach are:
- Device enumeration should hopefully be more like on Solaris now,
so unit numbers should match what's printed on the box more
closely.
- Real interrupt routing is implemented now, so cardbus bridges
etc. have at least a chance to work.
- The quirk tables are gone and have been replaced by (hopefully
sufficient) heuristics.
- Much cleaner code.
There was also a report that previously bogus interrupt assignments
are fixed now, which can be attributed to the new heuristics.
A pitfall, and the reason why this is not the default yet, is that
it changes device enumeration, as mentioned above, which can make
it necessary to change the system configuration if more than one
unit of a device type is present (on a system with two hme cars,
for example, it is possible that hme0 becomes hme1 and vice versa
after enabling the option). Systems with multiple disk controllers
may need to be booted into single user (and require manual specification
of the root file system on boot) to adjust the fstab.
Nevertheless, I would like to encourage users to use this option,
so that it can be made the default soon.
In detail, the changes are:
- Introduce an OFW PCI bus driver; it inherits most methods from the
generic PCI bus driver, but uses the firmware for enumeration,
performs additional initialization for devices and firmware-specific
interrupt routing. It also implements an OFW-specific method to allow
child devices to get their firmware nodes.
- Introduce an OFW PCI-PCI bridge driver; again, it inherits most
of the generic PCI-PCI bridge driver; it has it's own method for
interrupt routing, as well as some sparc64-specific methods (one to
get the node again, and one to adjust the bridge bus range, since
we need to reenumerate all PCI buses).
- Convert the apb driver to the new way of handling things.
- Provide a common framework for OFW bridge drivers, used be the two
drivers above.
- Provide a small common framework for interrupt routing (for all
bridge types).
- Convert the psycho driver to the new framework; this gets rid of a
bunch of old kludges in pci_read_config(), and the whole
preinitialization (ofw_pci_init()).
- Convert the ISA MD part and the EBus driver to the new way
interrupts and nodes are handled.
- Introduce types for firmware interrupt properties.
- Rename the old sparcbus_if to ofw_pci_if by repo copy (it is only
required for PCI), and move it to a more correct location (new
support methodsx were also added, and an old one was deprecated).
- Fix a bunch of minor bugs, perform some cleanups.
In some cases, I introduced some minor code duplication to keep the
new code clean, in hopes that the old code will be unifdef'ed soon.
Reviewed in part by: imp
Tested by: jake, Marius Strobl <marius@alchemy.franken.de>,
Sergey Mokryshev <mokr@mokr.net>,
Chris Jackman <cjackNOSPAM@klatsch.org>
Info on u30 firmware provided by: kris
This commit has two pieces. One half is the watchdog kernel code which lives
primarily in hardclock() in sys/kern/kern_clock.c. The other half is a userland
daemon which, when run, will keep the watchdog from firing while the userland
is intact and functioning.
Approved by: jeff (mentor)
with a new implementation that has a mostly reentrant "addchar"
routine, supports multiple message buffers in the kernel, and hides
the implementation details from callers.
The new code uses a kind of sequence number to represend the current
read and write positions in the buffer. This approach (suggested
mainly by bde) permits the read and write pointers to be maintained
separately, which reduces the number of atomic operations that are
required. The "mostly reentrant" above refers to the way that while
it is now always safe to have any number of concurrent writers,
readers could see the message buffer after a writer has advanced
the pointers but before it has witten the new character.
Discussed on: freebsd-arch
redundant paths to the same device.
This class reacts to a label in the first sector of the device,
which is created the following way:
# "0123456789abcdef012345..."
# "<----magic-----><-id-...>
echo "GEOM::FOX someid" | dd of=/dev/da0 conv=sync
NB: Since the fact that multiple disk devices are in fact the same
device is not known to GEOM, the geom taste/spoil process cannot
fully catch all corner cases and this module can therefore be
confused if you do the right wrong things.
NB: The disk level drivers need to do the right thing for this to
be useful, and that is not by definition currently the case.
toggle several media options (sonet/sdh, for example) with ifconfig and
to see the carrier state in ifconfig's output. It gives also read/write
access (given the right privilegs) to the S/Uni registers to user space
programs.
userland, and the kernel. In the kernel by way of the 'ident[]' variable
akin to all the other stuff generated by newvers.sh. In userland it is
available to sysctl consumers via KERN_IDENT or 'kern.ident'. It is exported
by uname(1) by the -i flag.
Reviewed by: hackers@
section to stop gcc generating the dwarf2 .eh_frame unwind tables. It
is dead weight for the time being. Maybe it can be used to perform
stack traces and/or get the location of function arguments in ddb, but
that requires a dwarf2 runtime interpreter, which we do not have.
Approved by: re (amd64 "safe" bits)
systems. Of note:
- Implement a direct mapped region using 2MB pages. This eliminates the
need for temporary mappings when getting ptes. This supports up to
512GB of physical memory for now. This should be enough for a while.
- Implement a 4-tier page table system. Most of the infrastructure is
there for 128TB of userland virtual address space, but only 512GB is
presently enabled due to a mystery bug somewhere. The design of this
was heavily inspired by the alpha pmap.c.
- The kernel is moved into the negative address space(!).
- The kernel has 2GB of KVM available.
- Provide a uma memory allocator to use the direct map region to take
advantage of the 2MB TLBs.
- Fixed some assumptions in the bus_space macros about the ability
to fit virtual addresses in an 'int'.
Notable missing things:
- pmap_growkernel() should be able to grow to 512GB of KVM by expanding
downwards below kernbase. The kernel must be at the top 2GB of the
negative address space because of gcc code generation strategies.
- need to fix the >512GB user vm code.
Approved by: re (blanket)
prime objectives are:
o Implement a syscall path based on the epc inststruction (see
sys/ia64/ia64/syscall.s).
o Revisit the places were we need to save and restore registers
and define those contexts in terms of the register sets (see
sys/ia64/include/_regset.h).
Secundairy objectives:
o Remove the requirement to use contigmalloc for kernel stacks.
o Better handling of the high FP registers for SMP systems.
o Switch to the new cpu_switch() and cpu_throw() semantics.
o Add a good unwinder to reconstruct contexts for the rare
cases we need to (see sys/contrib/ia64/libuwx)
Many files are affected by this change. Functionally it boils
down to:
o The EPC syscall doesn't preserve registers it does not need
to preserve and places the arguments differently on the stack.
This affects libc and truss.
o The address of the kernel page directory (kptdir) had to
be unstaticized for use by the nested TLB fault handler.
The name has been changed to ia64_kptdir to avoid conflicts.
The renaming affects libkvm.
o The trapframe only contains the special registers and the
scratch registers. For syscalls using the EPC syscall path
no scratch registers are saved. This affects all places where
the trapframe is accessed. Most notably the unaligned access
handler, the signal delivery code and the debugger.
o Context switching only partly saves the special registers
and the preserved registers. This affects cpu_switch() and
triggered the move to the new semantics, which additionally
affects cpu_throw().
o The high FP registers are either in the PCB or on some
CPU. context switching for them is done lazily. This affects
trap().
o The mcontext has room for all registers, but not all of them
have to be defined in all cases. This mostly affects signal
delivery code now. The *context syscalls are as of yet still
unimplemented.
Many details went into the removal of the requirement to use
contigmalloc for kernel stacks. The details are mostly CPU
specific and limited to exception_save() and exception_restore().
The few places where we create, destroy or switch stacks were
mostly simplified by not having to construct physical addresses
and additionally saving the virtual addresses for later use.
Besides more efficient context saving and restoring, which of
course yields a noticable speedup, this also fixes the dreaded
SMP bootup problem as a side-effect. The details of which are
still not fully understood.
This change includes all the necessary backward compatibility
code to have it handle older userland binaries that use the
break instruction for syscalls. Support for break-based syscalls
has been pessimized in favor of a clean implementation. Due to
the overall better performance of the kernel, this will still
be notived as an improvement if it's noticed at all.
Approved by: re@ (jhb)
ia64 only uses relocations with addend, remove the sections specific to
non-addend relocations (.rel.*). Also remove C++ specific sections.
Approved by: re@ (blanket)
stolen from the ia64/ia32 code (indeed there was a repocopy), but I've
redone the MD parts and added and fixed a few essential syscalls. It
is sufficient to run i386 binaries like /bin/ls, /usr/bin/id (dynamic)
and p4. The ia64 code has not implemented signal delivery, so I had
to do that.
Before you say it, yes, this does need to go in a common place. But
we're in a freeze at the moment and I didn't want to risk breaking ia64.
I will sort this out after the freeze so that the common code is in a
common place.
On the AMD64 side, this required adding segment selector context switch
support and some other support infrastructure. The %fs/%gs etc code
is hairy because loading %gs will clobber the kernel's current MSR_GSBASE
setting. The segment selectors are not used by the kernel, so they're only
changed at context switch time or when changing modes. This still needs
to be optimized.
Approved by: re (amd64/* blanket)