communicate that it relates to (is called by) thread_alloc()
o Add cpu_thread_free() which is called from thread_free()
to counter-act cpu_thread_alloc().
i386: Have cpu_thread_free() call cpu_thread_clean() to
preserve behaviour.
ia64: Have cpu_thread_free() call mtx_destroy() for the
mutex initialized in cpu_thread_alloc().
PR: ia64/118024
silent NULL pointer dereference in the i386 and sparc64 pmap_pinit()
when the kmem_alloc_nofault() failed to allocate address space. Both
functions now return error instead of panicing or dereferencing NULL.
As consequence, vmspace_exec() and vmspace_unshare() returns the errno
int. struct vmspace arg was added to vm_forkproc() to avoid dealing
with failed allocation when most of the fork1() job is already done.
The kernel stack for the thread is now set up in the thread_alloc(),
that itself may return NULL. Also, allocation of the first process
thread is performed in the fork1() to properly deal with stack
allocation failure. proc_linkup() is separated into proc_linkup()
called from fork1(), and proc_linkup0(), that is used to set up the
kernel process (was known as swapper).
In collaboration with: Peter Holm
Reviewed by: jhb
Blade 1500/SX1500 boards have inherited the firmware bug of the
AX1105 mainboards to not include an interrupt map entry for the
parallel port controller (for the AX1105 the heuristic code for
E450s probably erroneously kicks in and guesses an interrupt).
- Take advantage of bus_generic_setup_intr(9).
- Fix some whitespace bugs.
in the way we implement handling of relocations.
As for the kernel part this fixes the loading of lots of modules,
which failed to load due to unresolvable symbols when built after
the GCC 4.2.0 import. This wasn't due to a change in GCC itself
though but one of several changes in configuration done along the
import. Specfically, HAVE_AS_REGISTER_PSEUDO_OP, which causes GCC
to denote global registers used for scratch purposes and in turn
GAS uses R_SPARC_OLO10 relocations for, is now defined.
While at it replace some more ELF_R_TYPE which should have been
ELF64_R_TYPE_ID but didn't cause problems so far.
- Sync a sanity check between kernel and rtld(1) and change it to be
maintenance free regarding the type used for the lookup table.
- Sprinkle const on lookup tables.
- Use __FBSDID.
Reported and tested by: yongari
MFC after: 5 days
a consequence of sparc64/sparc64/vm_machdep.c revision 1.76. It occurs
when uma_small_free() frees a page. The solution has two parts: (1) Mark
pages allocated with VM_ALLOC_NOOBJ as PG_UNMANAGED. (2) Defer the lock
assertion in pmap_page_is_mapped() until after PG_UNMANAGED is tested.
This is safe because both PG_UNMANAGED and PG_FICTITIOUS are immutable
flags, i.e., they do not change state between the time that a page is
allocated and freed.
Approved by: re (kensmith)
PR: 116794
support machines having multiple independently numbered PCI domains
and don't support reenumeration without ambiguity amongst the
devices as seen by the OS and represented by PCI location strings.
This includes introducing a function pci_find_dbsf(9) which works
like pci_find_bsf(9) but additionally takes a domain number argument
and limiting pci_find_bsf(9) to only search devices in domain 0 (the
only domain in single-domain systems). Bge(4) and ofw_pcibus(4) are
changed to use pci_find_dbsf(9) instead of pci_find_bsf(9) in order
to no longer report false positives when searching for siblings and
dupe devices in the same domain respectively.
Along with this change the sole host-PCI bridge driver converted to
actually make use of PCI domain support is uninorth(4), the others
continue to use domain 0 only for now and need to be converted as
appropriate later on.
Note that this means that the format of the location strings as used
by pciconf(8) has been changed and that consumers of <sys/pciio.h>
potentially need to be recompiled.
Suggested by: jhb
Reviewed by: grehan, jhb, marcel
Approved by: re (kensmith), jhb (PCI maintainer hat)
33MHz for calculating the latency timer values for its children.
Inspired by NetBSD doing the same and Linux as well as OpenSolaris
using a similar approach.
While at it rename a variable and change its type to be more
appropriate fuer values of PCI properties so the variable can be
more easily reused.
- Initialize the cache line size register of PCI devices to a
legal value; the cache line size is limited to 64 bytes by the
Fireplane/Safari, JBus and UPA interconnection busses. Setting
it to an unsupported value caused bad performance at least with
GEM as it causes them to not do cache line bursts and to not
issue cache line commands on the PCI bus.
Approved by: re (kensmith)
MFC after: 1 week
ways:
(1) Cached pages are no longer kept in the object's resident page
splay tree and memq. Instead, they are kept in a separate per-object
splay tree of cached pages. However, access to this new per-object
splay tree is synchronized by the _free_ page queues lock, not to be
confused with the heavily contended page queues lock. Consequently, a
cached page can be reclaimed by vm_page_alloc(9) without acquiring the
object's lock or the page queues lock.
This solves a problem independently reported by tegge@ and Isilon.
Specifically, they observed the page daemon consuming a great deal of
CPU time because of pages bouncing back and forth between the cache
queue (PQ_CACHE) and the inactive queue (PQ_INACTIVE). The source of
this problem turned out to be a deadlock avoidance strategy employed
when selecting a cached page to reclaim in vm_page_select_cache().
However, the root cause was really that reclaiming a cached page
required the acquisition of an object lock while the page queues lock
was already held. Thus, this change addresses the problem at its
root, by eliminating the need to acquire the object's lock.
Moreover, keeping cached pages in the object's primary splay tree and
memq was, in effect, optimizing for the uncommon case. Cached pages
are reclaimed far, far more often than they are reactivated. Instead,
this change makes reclamation cheaper, especially in terms of
synchronization overhead, and reactivation more expensive, because
reactivated pages will have to be reentered into the object's primary
splay tree and memq.
(2) Cached pages are now stored alongside free pages in the physical
memory allocator's buddy queues, increasing the likelihood that large
allocations of contiguous physical memory (i.e., superpages) will
succeed.
Finally, as a result of this change long-standing restrictions on when
and where a cached page can be reclaimed and returned by
vm_page_alloc(9) are eliminated. Specifically, calls to
vm_page_alloc(9) specifying VM_ALLOC_INTERRUPT can now reclaim and
return a formerly cached page. Consequently, a call to malloc(9)
specifying M_NOWAIT is less likely to fail.
Discussed with: many over the course of the summer, including jeff@,
Justin Husted @ Isilon, peter@, tegge@
Tested by: an earlier version by kris@
Approved by: re (kensmith)
- p_sflag was mostly protected by PROC_LOCK rather than the PROC_SLOCK or
previously the sched_lock. These bugs have existed for some time.
- Allow swapout to try each thread in a process individually and then
swapin the whole process if any of these fail. This allows us to move
most scheduler related swap flags into td_flags.
- Keep ki_sflag for backwards compat but change all in source tools to
use the new and more correct location of P_INMEM.
Reported by: pho
Reviewed by: attilio, kib
Approved by: re (kensmith)
of pages don't sum to anywhere near the total number of pages on amd64.
This is for the most part because uma_small_alloc() pages have never been
counted as wired pages, like their kmem_malloc() brethren. They should
be. This changes fixes that.
It is no longer necessary for the page queues lock to be held to free
pages allocated by uma_small_alloc(). I removed the acquisition and
release of the page queues lock from uma_small_free() on amd64 and ia64
weeks ago. This patch updates the other architectures that have
uma_small_alloc() and uma_small_free().
Approved by: re (kensmith)
with the INTR_FILTER-enabled MI code. Basically this consists of
registering an interrupt controller (of which there can be multiple
and optionally different ones either per host-to-foo bridge or shared
amongst host-to-foo bridges in any one machine) along with an interrupt
vector as specific argument for all the interrupt vectors used by a
given host-to-foo bridge (roughly similar to registering interrupt
sources on amd64 and i386), providing functions to enable, clear and
disable the interrupts of the children beneath the bridge.
This also includes:
- No longer entering a critical section in tl0_intr() and tl1_intr()
for executing interrupt handlers but rather let the handlers enter
it themselves so in the case of intr_event_handle() we don't enter
a nested critical section.
- Adding infrastructure for binding delivery of interrupt vectors to
specific CPUs which later on can be interfaced with the code from
amd64/i386 for binding interrupts to specific CPUs.
- Getting rid of the wrapper hack introduced along the lines of the
API changes for INTR_FILTER which as a side-effect caused interrupts
associated with ithread handlers only to get the elevated priority
of those associated with filters ("fast handlers") (this removes the
hack also in the non-INTR_FILTER case).
- Disabling (by not clearing) an interrupt in the interrupt controller
until all associated handlers have been executed, which is crucial
for the typical locking strategy of NIC drivers in order to work
correctly in case of shared interrupts. This was a more or less
theoretical problem on sparc64 though, as shared interrupts are
rather uncommon there except for the on-board SCCs and UARTs.
Note that due to the behavior of at least of some of the interrupt
controllers used on sparc64 an enable+EOI instead of a disable+EOI
approach (as implied by the INTR_FILTER MI code and implemented on
other architectures) is used as the latter can cause lost interrupts
or in the worst case interrupt starvation.
o Correct a typo in sbus_alloc_resource() which caused (pass-through)
allocations to only work down to the grandchildren of the bus, which
wasn't a real problem so far as we don't support any devices which are
great-grandchildren or greater of a U2S bridge, yet.
o In fhc(4) use bus_{read,write}_4() instead of bus_space_{read,write}_4()
in order to get rid of sc_bh and sc_bt in the fhc_softc. Also get rid
of some other unneeded members in fhc_softc.
Reviewed by: marcel (earlier version)
Approved by: re (kensmith)
instead of per IOMMU, so we no longer need to program all of them
identically in systems having multiple IOMMUs. This continues the
rototilling of the nexus(4) done about 5 months ago, which amongst
others changed nexus(4) and the drivers for host-to-foo bridges
to provide bus_get_dma_tag methods, allowing to handle DMA tags in
a hierarchical way and to link them with devices.
This still doesn't move the silicon bug workarounds for Sabre (and
in the uncommitted schizo(4) for Tomatillo) bridges into special
bus_dma_tag_create() and bus_dmamap_sync() methods though, as w/o
fully newbus'ified bus_dma_tag_create() and bus_dma_tag_destroy()
this still requires too much hackery, i.e. per-child parent DMA
tags in the parent driver.
- Let the host-to-foo drivers supply the maximum physical address
of the IOMMU accompanying the bridges. Previously iommu(4) hard-
coded an upper limit of 16GB, which actually only applies to the
IOMMUs of the Hummingbird and Sabre bridges. The Psycho variants
as well as the U2S in fact can can translate to up to 2TB, i.e.
translate to 41-bit physical addresses. According to the recently
available Tomatillo documentation these bridges even translate to
43-bit physical addresses and hints at the Schizo bridges doing
43 bits as well.
This fixes the issue the FreeBSD 6.0 todo list item "Max RAM on
sparc64" was refering to and pretty much obsoletes the lack of
support for bounce buffers on sparc64.
Thanks to Nathan Whitehorn for pointing me at the Tomatillo manual.
Approved by: re (kensmith)
print a one line error message. Add some comments on not being able to
trust the day of week field (I'll act on these comments in a follow up
commit).
Approved by: re
MFC after: 3 weeks
new code and third party modules which try to depend on it.
- Initialize sched_lock in sched_4bsd.c.
- Declare sched_lock in sparc64 pmap.c and assert that we're compiling
with SCHED_4BSD to prevent accidental crashes from running ULE. This
is the sole remaining file outside of the scheduler that uses the
global sched_lock.
Approved by: re
allowing the driver for the host-PCI-bridge to indicate that
reenumeration of the PCI busses isn't supported by returning
-1 instead of a valid PCI bus number. This is needed in order
support both Tomatillo, which don't support reenumeration and
thus are apparently intended to be used for independently
numbered PCI domains only, and Psycho bridges, whose busses
need to be reenumerated on at least some E450, without the
#ifndef currently used for sun4v in order to support multiple
independently PCI domains. The actual allocation/incrementation
of the PCI bus numbers is now done in psycho(4), though it
no longer establish a mapping between bus numbers and device
nodes like ofw_pci_alloc_busno() did as that functionality
wasn't used (but can easily brought back if really needed).
The now no longer used sys/sparc64/pci/ofw_pci.c is also
removed from sys/conf/files.sun4v as ofw_pci_alloc_busno()
wasn't used there in the first place.
- In ofw_pci_default_{adjust_busrange,intr_pending}() sanity
check that the device has a parent before passing it on.
- Make psycho_softcs static to sys/sparc64/pci/psycho.c as
it's not used outside of that module.
- In sys/sparc64/pci/ofw_pcib_subr.c remove the superfluous
inclusion of opt_global.h and correct the debug output for
adjusting the subordinate bus number.
instead of using the PCI bus number, like it's already done for
sun4v in order to deal properly with independently numbered PCI
domains which can't be reenumerated (in the case of sun4u f.e.
Tomatillo bridges). For machines where we need to reenumerate
all PCI busses this change obviously introduces the theoretical
cosmetic problem that the device number of the PCI bus no longer
equals to its PCI bus number. In practice this doesn't happen
as both are assigned linearly and in parallel.
handlers as filter/"fast" handlers so shutdown_nice() can
acquire the process lock.
- Use bus_{read,write}_8() instead of bus_space_{read,write}_8()
in order to get rid of sc_bushandle and sc_bustag in the softc.
- Remove the banal and outdated comment above sbus_filter_stub().
allowing it to be a filter/"fast" handler. Locking the interrupt
handlers with a spin lock is mainly a requirement in schizo(4)
but as we ought to register the spin lock anyway it should not
hurt to take advantage of it in psycho(4).
- Pass both a driver_filter_t and a driver_intr_t argument to
psycho_set_intr(), allowing to get rid of the FAST interrupt
flag hack.
- Don't register the over-temperature interrupt handler as filter/
"fast" handler so shutdown_nice() can acquire the process lock.
- Use bus_{read,write}_8() instead of bus_space_{read,write}_8()
in order to get rid of sc_bushandle and sc_bustag in the softc.
- Correct the debug output for adjusting the subordinate bus number.
- Remove the banal and outdated above psycho_filter_stub().
- Fix some white space nits.
These CPUs use an enhanced layout of the interrupt vector dispatch
and dispatch status registers in order to allow sending IPIs to
multiple targets simultaneously. Thus support for these CPUs was
put in a newly added cheetah_ipi_selected(). This is intended to
be pointed to by cpu_ipi_selected, which now is a function pointer,
in order to avoid cpu_impl checks once booted. Alternatively it
can point to spitfire_ipi_selected(), which was renamed from
cpu_ipi_selected(). Consequently cpu_ipi_send() was also renamed
to spitfire_ipi_send() (there's no need for a cheetah equivalent
of this so far). Initialization of the cpu_ipi_selected pointer
and other requirements is done in mp_init(), which was renamed
from mp_tramp_alloc(), as cpu_mp_start() isn't called on UP
systems while cpu_ipi_selected() is. As a side-effect this allows
to make mp_tramp static to sys/sparc64/sparc64/mp_machdep.c.
For the sake of avoiding #ifdef SMP and for keeping the history in
place cheetah_ipi_selected() and spitfire_ipi_{selected,send}()
where not put into/moved to sys/sparc64/sparc64/{cheetah,spitfire}.c
- Add some CTASSERTs and KASSERTs ensuring that MAXCPU doesn't
exceed the data types we use to store the CPU bit fields or the
number of USIII and greater CPUs supported by the current
cheetah_ipi_selected() implementation (which for JBus-CPUs is
only 4; that should be fine though as according to OpenSolaris
there are no sun4u machines with more than 4 JBus-CPUs).
- In cpu_mp_start() don't enumerate and start more than MAXCPU CPUs
as we can't handle more than that.
- In cpu_mp_start() check for upa-portid vs. portid depending on
cpu_impl for consistency with nexus(4).
- In spitfire_ipi_selected() add KASSERTs ensuring that a CPU isn't
told to IPI itself as sun4u CPUs just can't do that.
- In spitfire_ipi_send() do a MEMBAR #Sync after writing the
interrupt vector data as we want to make sure the payload was
actually written before we trigger the dispatch.
- In spitfire_ipi_send() also verify IDR_BUSY when checking whether
the dispatch was successful as it has to be cleared for this to
be the case.
- Remove some redundant variables.
RTC function of a National Semiconductor PC87317/PC97317. This
consists of using the century register the same way Solaris does
for compatibility reasons. Once there is a MD power(4) we'd also
want to interface the APC (Advanced Power Control) functionality
of the same chip function with it.
- Use a macro for the device description and take advantage of
ISA_PNP_PROBE() setting the device description.
- Use the generated typedefs for the prototypes of the device
interface functions.
reasons outlined in the comment removed along with it, because the
OFW hostid has no real meaning for FreeBSD and mainly so the OFW
hostid is not confused with the FreeBSD hostid.
more exposure. The current state of SCTP implementation is
considered to be ready for 32-bit platforms, but still need some
work/testing on 64-bit platforms.
Approved by: re (kensmith)
Discussed with: rrs
making the relevant files standard. This avoids duplication and
makes it easier to override/disable unwanted schemes. Since ARM
doesn't have a DEFAULTS configuration file, leave the source
files for the BSD and MBR partitioning schemes in files.arm for
now.
caches with data caches after writing to memory. This typically
is required to make breakpoints work on ia64 and powerpc. For
those architectures the function is implemented.
- Use sched_throw() rather than replicating the same cpu_throw() code for
each architecture. This also allows the scheduler to use any locking it
may want to.
- Use the thread_lock() rather than sched_lock when preempting.
- The scheduler lock is not required to synchronize release_aps.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
- Rename PCPU_LAZY_INC into PCPU_INC
- Add the PCPU_ADD interface which just does an add on the pcpu member
given a specific value.
Note that for most architectures PCPU_INC and PCPU_ADD are not safe.
This is a point that needs some discussions/work in the next days.
Reviewed by: alc, bde
Approved by: jeff (mentor)
Probabilly, a general approach is not the better solution here, so we should
solve the sched_lock protection problems separately.
Requested by: alc
Approved by: jeff (mentor)
handler is wrapped in a couple of functions - a filter wrapper and an
ithread wrapper. In this case (and just in this case), the filter
wrapper could ask the system to schedule the ithread and mask the
interrupt source if the wrapped handler is composed of just an ithread
handler: modify the "old" interrupt code to make it support
this situation, while the "new" interrupt code is already ok.
Discussed with: jhb
used to return PAGE_SIZE without respect to restrictions of a DMA tag.
This affected all of the busdma load functions that use
_bus_dmamap_loader_buffer() as their back-end.
Reviewed by: scottl
Note on dcons:
To enable dcons in kernel, put the following lines in /boot/loader.conf.
You may also want to enable dcons in /etc/ttys.
boot_multicons="YES"
#Force dcons to be the high-level console if a firewire bus presents.
#hw.firewire.dcons_crom.force_console=1
FireWire/dcons support in loader will come shortly.
(i386/amd64 only)
same way it was enabled for Linux binares in linuxulator.
This allows binaries built with -pie. Many ports auto-detect -fPIE support
in GCC 4.2 and build binaries FreeBSD was unable to run.
referenced outside of mp_machdep.c
- Replace a magic 14 with the newly added IDC_ITID_SHIFT macro.
- Remove the global mp_boot_mid variable as it's not really necessary
and just replacing it with PCPU_GET(mid) doesn't have any impact on
performance once booted.
- Replace PCPU_GET(cpuid) with the curcpu shortcut.
- Replace hardcoded function names in panic strings etc with __func__
so they don't need to be updated when renaming the function.
- Use register_t instead of u_long for variables used to hold the
return value of intr_disable() so we don't need to apply any
knowledge about the actual width of that value here.
- Improve the wording of some comments.
- Fix several style(9) bugs.
- Use __FBSDID in identcpu.c.
- Remove #ifndef SUN4V around global cpu_impl variable; it doesn't
hurt on sun4v for now and once setPQL2() is gone sun4v can stop
sharing identcpu.c with sparc64, making the reminder of this file
also sparc64-only again. [1]
Submitted by: kmacy [1]
iommureg.h (which already began to bitrot) and iommuvar.h from the
sun4v source and adjust some of the source which is shared between
sparc64 and sun4v as appropriate.
vmcnts. This can be used to abstract away pcpu details but also changes
to use atomics for all counters now. This means sched lock is no longer
responsible for protecting counts in the switch routines.
Contributed by: Attilio Rao <attilio@FreeBSD.org>
VM_PHYSSEG_SPARSE depending on whether the physical address space is
densely or sparsely populated with memory. The effect of this
definition is to determine which of two implementations of
vm_page_array and PHYS_TO_VM_PAGE() is used. The legacy
implementation is obtained by defining VM_PHYSSEG_DENSE, and a new
implementation that trades off time for space is obtained by defining
VM_PHYSSEG_SPARSE. For now, all architectures except for ia64 and
sparc64 define VM_PHYSSEG_DENSE. Defining VM_PHYSSEG_SPARSE on ia64
allows the entirety of my Itanium 2's memory to be used. Previously,
only the first 1 GB could be used. Defining VM_PHYSSEG_SPARSE on
sparc64 allows USIIIi-based systems to boot without crashing.
This change is a combination of Nathan Whitehorn's patch and my own
work in perforce.
Discussed with: kmacy, marius, Nathan Whitehorn
PR: 112194
functions with CPUs they apply to only, otherwise default to the
plain C functions. This is modeled in a way so that f.e. a Cheetah
version of these functions can be inserted easily.
the UPA_IMR2 resource is also shared with/a subset of the Schizo PCI
bus B CSR bank. I'm not entirely sure how this previously managed to
escape testing...
vm.kmem_size_min. Useful when using ZFS to make sure that vm.kmem size will
be at least 256mb (for example) without forcing a particular value via vm.kmem_size.
Approved by: njl (mentor)
Reviewed by: alc
GETATTRs being generated - one from lookup()/namei() and the other
from nfs_open() (for cto consistency). This change eliminates the
GETATTR in nfs_open() if an otw GETATTR was done from the namei()
path. Instead of extending the vop interface, we timestamp each attr
load, and use this to detect whether a GETATTR was done from namei()
for this syscall. Introduces a thread-local variable that counts the
syscalls made by the thread and uses <pid, tid, thread syscalls> as
the attrload timestamp. Thanks to jhb@ and peter@ for a discussion on
thread state that could be used as the timestamp with minimal overhead.
sun4v nexus(4) in turn is based on):
o Change nexus(4) to manage the resources of its children so the
respective device drivers don't need to figure them out of OFW
themselves.
o Change nexus(4) to provide the ofw_bus KOBJ interface instead of
using IVARs for supplying the OFW node and the subset of standard
properties of its children. Together with the previous change this
also allows to fully take advantage of newbus in that drivers like
fhc(4), which attach on multiple parent busses, no longer require
different bus front-ends as obtaining the OFW node and properties
as well as resource allocation works the same for all supported
busses. As such this change also is part 4/4 of allowing creator(4)
to work in USIII-based machines as it allows this driver to attach
on both nexus(4) and upa(4). On the other hand removing these IVARs
breaks API compatibility with the powerpc nexus(4) but which isn't
that bad as a) sparc64 currently doesn't share any device driver
hanging off of nexus(4) with powerpc and b) they were no longer
compatible regarding OFW-related extensions at the pci(4) level
since quite some time.
o Provide bus_get_dma_tag methods in nexus(4) and its children in
order to handle DMA tags in a hierarchical way and get rid of the
sparc64_root_dma_tag kludge. Together with the previous two items
this changes also allows to completely get rid of the nexus(4)
IVAR interface. It also includes:
- pushing the constraints previously specified by the nexus_dmatag
down into the DMA tags of psycho(4) and sbus(4) as it's their
IOMMUs which induce these restrictions (and nothing at the
nexus(4) or anything that would warrant specifying them there),
- fixing some obviously wrong constraints of the psycho(4) and
sbus(4) DMA tags, which happened to not actually be used with
the sparc64_root_dma_tag kludge in place and therefore didn't
cause problems so far,
- replacing magic constants for constraints with macros as far
as it is obvious as to where they come from.
This doesn't include taking advantage of the newbus way to get
the parent DMA tags implemented by this change in order to divorce
the IOTSBs of the PCI and SBus IOMMUs or for implementing the
workaround for the DMA sync bug in Sabre (and Tomatillo) bridges,
yet, though.
o Get rid of the notion that nexus(4) (mostly) reflects an UPA bus
by replacing ofw_upa.h and with ofw_nexus.h (which was repo-copied
from ofw_upa.h) and renaming its content, which actually applies to
all of Fireplane/Safari, JBus and UPA (in the host bus case), as
appropriate.
o Just use M_DEVBUF instead of a separate M_NEXUS malloc type for
allocating the device info for the children of nexus(4). This is
done in order to not need to export M_NEXUS when deriving drivers
for subordinate busses from the nexus(4) class.
o Use the DEFINE_CLASS_0() macro to declare the nexus(4) driver so
we can derive subclasses from it.
o Const'ify the nexus_excl_name and nexus_excl_type arrays as well
as add 'associations' and 'rsc', which are pseudo-devices without
resources and therefore of no real interest for nexus(4), to the
former.
o Let the nexus(4) device memory rman manage the entire 64-bit address
space instead of just the UPA_MEMSTART to UPA_MEMEND subregion as
Fireplane/Safari- and JBus-based machines use multiple ranges,
which can't be as easily divided as in the case of UPA (limiting
the address space only served for sanity checking anyway).
o Use M_WAITOK instead of M_NOWAIT when allocating the device info
for children of nexus(4) in order to give one less opportunity
for adding devices to nexus(4) to fail.
o While adapting the drivers affected by the above nexus(4) changes,
change them to take advantage of rman_get_rid() instead of caching
the RIDs assigned to allocated resources, now that the RIDs of
resources are correctly set.
o In iommu(4) and nexus(4) replace hard-coded functions names, which
actually became outdated in several places, in panic strings and
status massages with __func__. [1]
o Use driver_filter_t in prototypes where appropriate.
o Add my copyright to creator(4), fhc(4), nexus(4), psycho(4) and
sbus(4) as I changed considerable amounts of these drivers as well
as added a bunch of new features, workarounds for silicon bugs etc.
o Fix some white space nits.
Due to lack of access to Exx00 hardware, these changes, i.e. central(4)
and fhc(4), couldn't be runtime tested on such a machine. Exx00 are
currently reported to panic before trying to attach nexus(4) anyway
though.
PR: 76052 [1]
Approved by: re (kensmith)
partitioning class that supports multiple schemes. Current
schemes supported are APM (Apple Partition Map) and GPT.
Change all GEOM_APPLE anf GEOM_GPT options into GEOM_PART_APM
and GEOM_PART_GPT (resp).
The ctlreq interface supports verbs to create and destroy
partitioning schemes on a disk; to add, delete and modify
partitions; and to commit or undo changes made.
work when we start requiring this.
- Don't specify an alignment when creating our own parent DMA tag;
the supported DMA engines require no alignment constraint (f.e. the
LANCE child does though) and it's no inherited by the child DMA
tags anyway (which probably is a bug though).
- Fix whitespace nits.
headers in .S directly rather than getting to their macros through
genassym.c/assym.s so there are less headers genassym.c has to be
kept in sync with.
While at it fix some stytle(9) bugs (indentation, prototype format,
sort headers, etc) and remove trailing whitespace.
may also reflect a Fireplane/Safari or JBus bus (or a virtual bus which
in turn reflects a JBus bus or something like that...).
- In the both the sparc64 and sun4v bus_machdep.c use __FBSDID.
- Spell SBus the official way in comments.
- Replace hardcoded function names (all of which were actually outdated)
in panic and status strings with __func__.
- Fix whitespace nits.
and friends along with all hacks required to implement them. None of
the drivers currently built (as part of GENERIC, LINT or modules) on
sparc64 or sun4v and none of those we might want to use there in
future uses them, AFAICT there actually never was a driver hooked up
to the sparc64 or sun4v build that correctly used these functions
(and it looks like that due to a bug read{b,w,l}()/write{b,w,l}() and
the other functions working on a memory handle never actually worked on
sun4v). All they ever were good for on sparc64 and sun4v was erroneously
dragging in dependencies on isa(4) in drivers like f.e. dpt(4), si(4)
and syscons(4) in source files that supposedly were bus-neutral and
hiding issues with drivers like f.e. ng_bt3c(4) that used these
functions with busses other than isa(4) and therefore couldn't work on
these platforms.
bus hanging off from the Fireplane/Safari bus in some USIII machines.
This is part 3/4 of allowing creator(4) to work in these machines.
The little info needed on how to configure the bridge and to work
around the incorrect values contained in the `interrupts' properties
of its children were obtained form OpenSolaris.
The separate bus front-end was inherited from the OpenBSD creator(4),
which at that time had a mainbus(4) (for USI/II machines, which use
an UPA interconnection bus as the nexus) and an upa(4) (for USIII
machines, which use a subordinate/slave UPA bus hanging off from the
Fireplane/Safari interconnection bus) front-end. With FreeBSD and
newbus there is/will be no need to have two separate bus front-ends
for these busses, so we can easily coallapse the shared front-end
and the back-end into a single source file (note that the FreeBSD
creator_upa.c was misnomer anyway; based on what it actually attached
to that should have been creator_nexus.c), actually OpenBSD meanwhile
also has moved to a shared front-end and a single source file. Due
to the low-level console support creator.c also wasn't free from bus
related things before.
While at it, also split sys/sparc64/creator/creator.h into a
sys/dev/fb/creatorreg.h that only contains register macros and move
the structures to the top of sys/dev/fb/creator.c as suggested by
style(9) so creator(4) is no longer scattered over two directories.
- Use OF_decode_addr()/sparc64_fake_bustag() to obtain the bus tags and
handles for the low-level console support instead of hardcoding
support for AFB/FFB hanging off from nexus(4) only. This is part 2/4
of allowing creator(4) to work in USIII machines (which have a UPA
bus hanging off from the Fireplane/Safari bus reflected by the nexus),
which already makes it work as the low-level console there.
- Allocate resources in the bus attach routine regardless of whether
creator(4) is used as for the low-level console and thus the required
bus tags and handles have been already obtained or not so the resources
are marked as taken in the respective RMAN.
- For both obtaining the bus tags and handles for the low-level console
support as well as allocating the corresponding resources in the
regular bus attach routine don't bother to get all for the maximum of
24 register banks but only (for) the two tag/handle pairs required for
providing the video interface for syscons(4) support. If we can't
allocate the rest of them just limit the memory range accessible via
creator_fb_mmap() accordingly.
- Sanity check the memory range spanned by the first and last resources
and the resources in between as far as possible, as the XFree86/Xorg
sunffb(4) expects to be able to access the whole region, even though
the backing resources are actually non-continuous. Limit and check
the memory range accessible via creator_fb_mmap() accordingly.
- Reduce the size of buffers for OFW properties to what they actually
need to hold.
- Rename some tables to creator_<foo> for consistency.
- Also for the sizes in the creator_fb_mmap() mapping table entries use
macros for consistency, add macros for the remaining register banks
for completeness.
nexus (which might or might not reflect an UPA interconnection bus;
accordingly UPA_BUS_SPACE should be renamed to NEXUS_BUS_SPACE at a
later point) and subordinate/slave UPA busses. This is part 1/4 of
allowing creator(4) to work in USIII machines (which have a UPA bus
hanging off from the Fireplane/Safari bus reflected by the nexus).
- Clear the PCI AFSR and status error bits as previous errors still
might be indicated.
- Set up the PCI control and diagnostic registers according to the
capabilities, workarounds, etc of/for specific revisions of the
supported bridges. This includes no longer setting Hummingbird-/
Sabre-specific bits in the PCI control register but preserving
what the firmware has initialized them to like OpenSolaris does.
Previously we were setting these bits according to the example in
the Sabre documentation, which I doubt is appropriate for all
Sabre based designs and especially not for Hummingbirds. This
also includes not enabling bus parking unless the firmware tells
us to.
- Set the PCI latency timer register as this isn't always done by
the firmware.
o Remove a redundant argument from psycho_set_intr() and in this
function check the return value of bus_setup_intr(). [2]
o Let psycho_setup_intr() return ENOMEM instead of 0 when it can't
allocate memory for the interrupt wrapper stub and EINVAL instead
of 0 if it can't find the interrupt vector in the interrupt map.
o Add a workaround for a bug of the Sabre-APB-combination where it
doesn't drain DMA write data for devices behind additional PCI-PCI
bridges underneath the APB PCI-PCI bridge. This workaround (do
things necessary in order to achieve a manual drain when coherency
is required) is currently implemented in psycho_setup_intr() and
psycho_intr_stub() (for easy MFC'ing) and therefore is only applied
for interrupt handlers. This should be moved to psycho(4)-specific
bus_dma_tag_create() and bus_dmamap_sync() methods, respectively,
once this driver is converted to make use of BUS_GET_DMA_TAG(), so
the workaround is also applied for polling(4) callbacks. [3]
o Fix some minor style issues.
Info from: OpenSolaris [1]
Info from: Linux, OpenBSD, OpenSolaris [3]
Suggested by: Coverity Prevent (CID 682) [2]
MFC after: 1 month
firmware (mainly 'pmu' and its 'lomp' dupe found in a couple of
later USII{e,i}-based machines) by checking whether a device with
the same triple of bus number, slot and function already has been
added. This is the simple yet effective approach introduced in
OpenBSD some time ago, but which has the flaw that it assumes
that the device and its dupe(s) found in the OFW device tree are
equal or at least the one encountered first is in some way the
more important one (this is the case with 'pmu' and 'lomp'; the
'pmu' node has couple of properties and children while the 'lomp'
one misses most of these). If there's ever a device/dupe pair
where we don't encounter the more important node first, we'll
probably need to introduce a quirk list in order to add the
desired device but prevent its dupe(s) from being added.
MFC after: 1 week
Make part of John Birrell's KSE patch permanent..
Specifically, remove:
Any reference of the ksegrp structure. This feature was
never fully utilised and made things overly complicated.
All code in the scheduler that tried to make threaded programs
fair to unthreaded programs. Libpthread processes will already
do this to some extent and libthr processes already disable it.
Also:
Since this makes such a big change to the scheduler(s), take the opportunity
to rename some structures and elements that had to be moved anyhow.
This makes the code a lot more readable.
The ULE scheduler compiles again but I have no idea if it works.
The 4bsd scheduler still reqires a little cleaning and some functions that now do
ALMOST nothing will go away, but I thought I'd do that as a separate commit.
Tested by David Xu, and Dan Eischen using libthr and libpthread.
by default for sun4v where it is absolutely required.
This change moves the buffer from struct pcpu to the stack to avoid
using the critical section which created a LOR in a couple of cases
due to interaction with the tty code and kqueue. The LOR can't be
fixed with the critical section and the pcpu buffer can't be used
without the critical section.
Putting the buffer on the stack was my initial solution, but it was
pointed out that the stress on the stack might cause problems
depending on the call path. We don't have a way of creating tests
for those possible cases, so it's best to leave this as an option
for the time being. In time we may get enough data to enable this
option more generally.
retrieval, rather than using pad
- save the fault address in sfar for use by the alignment fixup handler
- mask off the trap number, so the context id doesn't confuse the UT_MAX
comparison
This change fixes alignment fixup handling which is needed for traceroute
to work in spite of its copious unaligned accesses
it as a default.
For the record, the KDTRACE option caused _no_ additional source files
to be compiled in; certainly no CDDL source files. All it did was to
allow existing BSD licensed kernel files to include one or more CDDL
header files.
By removing this from DEFAULTS, the onus is on a kernel builder to add
the option to the kernel config, possibly by including GENERIC and
customising from there. It means that DTrace won't be a feature
available in FreeBSD by default, which is the way I intended it to be.
Without this option, you can't load the dtrace module (which contains
the dtrace device and the DTrace framework). This is equivalent to
requiring an option in a kernel config before you can load the linux
emulation module, for example.
I think it is a mistake to have DTrace ported to FreeBSD, but not
to have it available to everyone, all the time. The only exception
to this is the companies which distribute systems with FreeBSD embedded.
Those companies will customise their systems anyway. The KDTRACE
option was intended for them, and only them.
adds the hooks that DTrace modules register with, and adds a few functions
which have the dtrace_ prefix to allow the DTrace FBT (function boundary
trace) provider to avoid tracing because they are called from the DTtrace
probe context.
Unlike other forms of tracing and debug, DTrace support in the kernel
incurs negligible run-time cost.
I think the only reason why anyone wouldn't want to have kernel support
enabled for DTrace would be due to the license (CDDL) under which DTrace
is released.
as we have no use for that info. Instead let this function return the
keyboard ID and verify at its invocation in sunkbd_configure() that we're
talking to a Sun type 4/5/6 keyboard, i.e. a keyboard supported by this
driver.
- Add an option SUNKBD_EMULATE_ATKBD whose code is based on the respective
code in ukbd(4) and like UKBD_EMULATE_ATSCANCODE causes this driver to
emit AT keyboard/KB_101 compatible scan codes in K_RAW mode as assumed by
kbdmux(4). Unlike UKBD_EMULATE_ATSCANCODE, SUNKBD_EMULATE_ATKBD also
triggers the use of AT keyboard maps and thus allows to use the map files
in share/syscons/keymaps with this driver at the cost of an additional
translation (in ukbd(4) this just is the way of operation).
- Implement an option SUNKBD_DFLT_KEYMAP, which like the equivalent options
of the other keyboard drivers allows to specify the default in-kernel
keyboard map. For obvious reasons this made to only work when also using
SUNKBD_EMULATE_ATKBD.
- Implement sunkbd_check(), sunkbd_check_char() and sunkbd_clear_state(),
which are also required for interoperability with kbdmux(4).
- Implement K_CODE mode and FreeBSD keypad compose.
- As a minor hack define KBD_DFLT_KEYMAP also in the !SUNKBD_EMULATE_ATKBD
case so we can obtain fkey_tab from <dev/kbd/kbdtables.h> rather than
having to duplicate it and #ifdef some more code.
- Don't use the TX-buffer for writing the two command bytes for setting the
keyboard LEDs as this consequently requires a hardware FIFO that is at
least two bytes in depth, which the NMOS-variant of the Zilog SCCs doesn't
have. Thus use an inlined version of uart_putc() to consecutively write
the command bytes (a cleaner approach would be to do this via the soft
interrupt handler but that variant wouldn't work while in ddb(4)). [1]
- Fix some minor style(9) bugs.
PR: 90316 [1]
Reviewed by: marcel [1]
a lock to prevent interspersed strings written from different CPUs
at the same time.
To avoid putting a buffer on the stack or having to malloc one,
space is incorporated in the per-cpu structure. The buffer
size if 128 bytes; chosen because it's the next power of 2 size
up from 80 characters.
String writes to the console are buffered up the end of the line
or until the buffer fills. Then the buffer is flushed to all
console devices.
Existing low level console output via cnputc() is unaffected by
this change. ithread calls to log() are also unaffected to avoid
blocking those threads.
A minor change to the behaviour in a panic situation is that
console output will still be buffered, but won't be written to
a tty as before. This should prevent interspersed panic output
as a number of CPUs panic before we end up single threaded
running ddb.
Reviewed by: scottl, jhb
MFC after: 2 weeks
argument in parentheses so these macros are safe to use and invocations
with an expression as the argument like __bswap32_const(42 << 23 | 13)
work as expected. Additionally, mask all the individually shifted bytes
as appropriate so the bytes which exceed the width of the respective
__bswapN_const() macro in invocations like __bswap16_const(0xdead600d)
are ignored like it's the case with the corresponding __bswapN_var()
function.
MFC after: 3 days
The 'nooption' kernel config entry has to be used to turn KSE off now.
This isn't my preferred way of dealing with this, but I'll defer to
scottl's experience with the io/mem kernel option change and the grief
experienced over that.
Submitted by: scottl@
except sun4v.
This change makes the transition from a default to an option more
transparent and is an attempt to head off all the compliants that are
likely from people who don't read UPDATING, based on experience with
the io/mem change.
Submitted by: scottl@
: # Options for atkbd:
: options ATKBD_DFLT_KEYMAP # specify the built-in keymap
: makeoptions ATKBD_DFLT_KEYMAP=jp.106
[...]
: nooption ATKBD_DFLT_KEYMAP
: nomakeoption ATKBD_DFLT_KEYMAP
(Previously the option was inherited from MI NOTES.) So my tool in
rev. 1.26 reduced this to removing all "ATKBD_DFLT_KEYMAP" lines,
leaving the option effectively disabled as it was before, but since
it's actually supported on sparc64, turn it on now.