832 Commits

Author SHA1 Message Date
Nathan Whitehorn
c226d0b31a Fix two return values damaged by copy/paste. 2013-11-12 01:28:38 +00:00
Nathan Whitehorn
e39c26a950 Use the same implementation of copyinout.c for both AIM and Book-E. This
fixes some bugs in both implementations related to validity checks on
mapping bounds.
2013-11-11 23:37:16 +00:00
Nathan Whitehorn
bdac436008 Follow up r223485, which made AIM use the ABI thread pointer instead of
PCPU fields for curthread, by doing the same to Book-E. This closes
some potential races switching between CPUs. As a side effect, it turns out
the AIM and Book-E swtch.S implementations were the same to within a few
registers, so move that to powerpc/powerpc.

MFC after: 3 months
2013-11-11 17:37:50 +00:00
Andreas Tobler
48f22b9682 Prepare for 64-bit, i.e., use Elf_*hdr instead of the 32-bit ones. 2013-11-10 22:42:56 +00:00
Justin Hibbits
c17f21575c Clamp the dump block size to the dump device max I/O size. 2013-11-07 21:02:57 +00:00
Justin Hibbits
d6bff760cd Make the powerpc dumpsys() more consistent with the other architectures.
MFC after:	10.0-RELEASE
2013-11-06 15:56:03 +00:00
Konstantin Belousov
80938e75f0 Add bus_dmamap_load_ma() function to load map with the array of
vm_pages.  Provide trivial implementation which forwards the load to
_bus_dmamap_load_phys() page by page.  Right now all architectures use
bus_dmamap_load_ma_triv().

Tested by:	pho (as part of the functional patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
2013-10-27 21:39:16 +00:00
Nathan Whitehorn
33724f17d2 Interrelated improvements to early boot mappings:
- Remove explicit requirement that the SOC registers be found except as an
  optimization (although the MPC85XX LAW drivers still require they be found
  externally, which should change).
- Remove magic CCSRBAR_VA value.
- Allow bus_machdep.c's early-boot code to handle non 1:1 mappings and
  systems not in real-mode or global 1:1 maps in early boot.
- Allow pmap_mapdev() on Book-E to reissue previous addresses if the
  area is already mapped. Additionally have it check all mappings, not
  just the CCSR area.

This allows the console on e500 systems to actually work on systems where
the boot loader was not kind enough to set up a 1:1 mapping before starting
the kernel.
2013-10-26 18:18:14 +00:00
Nathan Whitehorn
b27a2490bc Remove dead reference to PSL_MBO. 2013-10-25 14:38:46 +00:00
Nathan Whitehorn
d4602c7200 Remove some #ifdef and duplication in the MSR bit definitions. This adds
some security features to the Book-E kernel as well.
2013-10-25 14:37:15 +00:00
Nathan Whitehorn
544234026d Allow PIC drivers to translate firmware sense codes for themselves. This
is designed to replace the tables in dev/fdt/fdt_ARCH.c, but will not
happen quite yet.
2013-10-24 15:37:32 +00:00
Nathan Whitehorn
a8126ae500 Factor out MI portions of the PowerPC nexus device into /sys/dev/ofw. The
sparc64 driver will be modified to use this shortly.
2013-10-23 20:00:14 +00:00
Nathan Whitehorn
f214848258 Add two new interfaces to ofw_bus:
- ofw_bus_map_intr()
  Maps an (iparent, IRQ) tuple to a system-global interrupt number in some
  platform dependent way. This is meant to be implemented as a replacement
  for [FDT_]MAP_IRQ() that is an MI interface that knows about the bus
  hierarchy.
- ofw_bus_config_intr()
  Configures an interrupt (previously mapped) based on firmware sense flags.
  This replaces manual interpretation of the sense field in bus drivers and
  will, in a follow-up, allow that interpretation to be redirected to the PIC
  drivers where it belongs. This will eventually replace the tables in
  /sys/dev/fdt/fdt_ARCH.c

The PowerPC/AIM code has been converted to use these globally, with an
implementation in terms of MAP_IRQ() and powerpc_config_intr(), assuming
OpenPIC, at the bus root in nexus(4). The ofw_bus_config_intr() will shortly
be integrated into pic_if.m and bounced through nexus into the PIC tree.

FDT integration will happen significantly later due to larger testing
requirements. This patch in general also lays the groundwork for the removal
of /sys/dev/fdt/fdt_ARCH.c and machine/fdt.h.
2013-10-23 17:24:21 +00:00
Nathan Whitehorn
081431ad8f Use OF_getencprop() in preference to OF_getprop() for numerical quantities.
Since all supported PowerPC systems are big-endian, this is a no-op, but
this is preparatory work to moving this to /sys/dev/ofw.
2013-10-23 14:06:41 +00:00
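The endianness point in the commit above can be illustrated with a minimal sketch. It assumes that Open Firmware numeric properties are arrays of big-endian 32-bit cells and that an "encoded" getter converts each cell to host byte order; the helper names cell_to_host() and decode_cells() are hypothetical and are not the kernel's OF_getencprop() implementation.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decode one big-endian 32-bit cell into host order.
 * Building the value byte by byte gives the same result on any host, which
 * is why the conversion is a no-op on a big-endian machine.
 */
uint32_t
cell_to_host(const uint8_t *p)
{
	return (((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3]);
}

/* Decode ncells consecutive cells from raw property bytes into out[]. */
void
decode_cells(const uint8_t *raw, uint32_t *out, size_t ncells)
{
	for (size_t i = 0; i < ncells; i++)
		out[i] = cell_to_host(raw + 4 * i);
}
```

Code that reads the raw bytes directly as a uint32_t would only be correct on big-endian hosts, which is the portability concern the commit prepares for.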
Nathan Whitehorn
c6f776c7e4 Ignore registers on devices where the reg property is malformed. Issue a
warning if this happens under bootverbose. This prevents some
strange-looking entries in dmesg for SMU devices on Apple G5 systems.
2013-10-22 15:47:13 +00:00
Nathan Whitehorn
7a759c54e8 Catch up on 6 years of improvements in Open Firmware nexus devices by
importing the sparc64 one. At least 90% of this code is MI and will be
moved into /sys/dev/ofw at some point in the future.
2013-10-22 14:11:16 +00:00
Nathan Whitehorn
17593f8612 Standards-conformance and code deduplication:
- Use bus reference phandles in place of FDT offsets as IRQ domain keys
- Unify the identical macio/fdt/mambo OpenPIC drivers into one
- Be more forgiving (following ePAPR) about what we need from the device
  tree to identify an OpenPIC
- Correctly map all IRQs into an interrupt domain
- Set IRQ_*_CONFORM for interrupts on an unknown PIC type instead of
  failing attachment for that device.
2013-10-22 14:07:57 +00:00
Nathan Whitehorn
09e5acd4bb Use standard ofw_bus helpers instead of reinventing the wheel. 2013-10-21 18:47:02 +00:00
Nathan Whitehorn
72c775da6c Fix 80-column line wrapping in a comment. 2013-10-21 00:58:35 +00:00
Nathan Whitehorn
e4cf0633b8 Since the PS3 port was committed, the AIM nexus device works perfectly fine
on all PowerPC platforms, whether or not they have Open Firmware. Remove
some more duplication and have there be only one nexus driver.
2013-10-20 18:40:55 +00:00
Nathan Whitehorn
228f09b3ef Replace the two almost-exactly-identical AIM and Book-E clock.c
implementations with a single one after the application of a very small
amount of #ifdef.
2013-10-20 16:37:03 +00:00
Nathan Whitehorn
1cfdc97153 Unify the AIM and Book-E vm_machdep.c implementations, which previously
differed only with respect to the AIM version not following style(9) and
some additional features for 64-bit systems and machines with direct maps
in the AIM implementation that are no-ops on Book-E (at least for now).
2013-10-20 16:14:03 +00:00
Alan Cox
e57a196dbf Eliminate the declaration for a method that is no longer used. (This
change should have been a part of r255724.)

Reminded by:	nathan
Approved by:	re (gjb)
2013-09-26 15:36:20 +00:00
Alan Cox
deb179bb4c The pmap function pmap_clear_reference() is no longer used. Remove it.
pmap_clear_reference() has had exactly one caller in the kernel for
several years, more precisely, since FreeBSD 8.  Now, that call no
longer exists.

Approved by:	re (kib)
Sponsored by:	EMC / Isilon Storage Division
2013-09-20 04:30:18 +00:00
Nathan Whitehorn
5d548e66ff Add POWER7+ and POWER8 to the CPU ID table.
Approved by:	re (kib)
2013-09-17 17:29:56 +00:00
Nathan Whitehorn
58aa4de0aa Make sure to copy segments back to the segs array if non-NULL. This is
relied upon by bus_dmamap_load_mbuf_sg() (i.e. all network drivers).

Approved by:	re (kib)
MFC after:	2 weeks
2013-09-17 17:29:07 +00:00
Nathan Whitehorn
1aff10b99e Fix bug in busdma: if segs is a preexisting buffer, we memcpy it
into the DMA map. The length of the buffer had not yet been
initialized, however, so this would copy gibberish unless it
happened to be right by chance. This bug mostly only affected
systems with IOMMUs.

Approved by:	re (gjb)
MFC after:	3 days
2013-09-16 14:32:56 +00:00
Nathan Whitehorn
c84bb047d4 Raise artificial limits on number of CPUs and number of interrupts.
Approved by:	re (kib)
2013-09-09 12:52:34 +00:00
Nathan Whitehorn
c5915fdc44 Add POWER CPUs to the kernel's knowledge. This does not imply we currently
actually run on any machines with POWER CPUs but avoids closing that door
unnecessarily.

Approved by:	re (kib)
2013-09-09 12:51:24 +00:00
Nathan Whitehorn
0658fe8ce1 Add hook called when every new processor is brought online -- including the
BSP -- so that platform modules have a chance to add the new CPU to any
internal bookkeeping.

Approved by:	re (kib)
2013-09-09 12:49:19 +00:00
Alan Cox
51321f7c31 Significantly reduce the cost, i.e., run time, of calls to madvise(...,
MADV_DONTNEED) and madvise(..., MADV_FREE).  Specifically, introduce a new
pmap function, pmap_advise(), that operates on a range of virtual addresses
within the specified pmap, allowing for a more efficient implementation of
MADV_DONTNEED and MADV_FREE.  Previously, the implementation of
MADV_DONTNEED and MADV_FREE relied on per-page pmap operations, such as
pmap_clear_reference().  Intuitively, the problem with this implementation
is that the pmap-level locks are acquired and released and the page table
traversed repeatedly, once for each resident page in the range
that was specified to madvise(2).  A more subtle flaw with the previous
implementation is that pmap_clear_reference() would clear the reference bit
on all mappings to the specified page, not just the mapping in the range
specified to madvise(2).

Since our malloc(3) makes heavy use of madvise(2), this change can have a
measurable impact.  For example, the system time for completing a parallel
"buildworld" on a 6-core amd64 machine was reduced by about 1.5% to 2.0%.

Note: This change only contains pmap_advise() implementations for a subset
of our supported architectures.  I will commit implementations for the
remaining architectures after further testing.  For now, a stub function is
sufficient because of the advisory nature of pmap_advise().

Discussed with: jeff, jhb, kib
Tested by:      pho (i386), marcel (ia64)
Sponsored by:   EMC / Isilon Storage Division
2013-08-29 15:49:05 +00:00
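The locking argument in the commit above can be sketched with a toy model. Everything here is hypothetical (struct toy_pmap, the toy_* functions, and the lock counter are illustrations, not the kernel's pmap code); the sketch only contrasts one lock acquisition per page with one acquisition per range, and does not model the second flaw about clearing the reference bit on unrelated mappings.

```c
#include <stddef.h>

#define TOY_NPAGES 64

/* Hypothetical toy pmap: a lock-acquisition counter and one bit per page. */
struct toy_pmap {
	unsigned lock_acquisitions;
	unsigned char referenced[TOY_NPAGES];
};

/* Per-page style: every call pays for one lock/unlock cycle. */
void
toy_clear_reference(struct toy_pmap *pm, size_t page)
{
	pm->lock_acquisitions++;	/* stands in for lock + unlock */
	pm->referenced[page] = 0;
}

/* Old approach: loop over the pages, paying the lock cost for each one. */
void
toy_advise_per_page(struct toy_pmap *pm, size_t start, size_t end)
{
	for (size_t p = start; p < end && p < TOY_NPAGES; p++)
		toy_clear_reference(pm, p);
}

/* Range approach: take the lock once, then walk the whole range. */
void
toy_advise_range(struct toy_pmap *pm, size_t start, size_t end)
{
	pm->lock_acquisitions++;	/* single lock + unlock for the range */
	for (size_t p = start; p < end && p < TOY_NPAGES; p++)
		pm->referenced[p] = 0;
}
```

For a 16-page range, the per-page variant counts 16 acquisitions in this model and the range variant counts one, which is the cost difference the commit message describes.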
Jeff Roberson
5df87b21d3 Replace kernel virtual address space allocation with vmem. This provides
transparent layering and better fragmentation.

 - Normalize functions that allocate memory to use kmem_*
 - Those that allocate address space are named kva_*
 - Those that operate on maps are named kmap_*
 - Implement recursive allocation handling for kmem_arena in vmem.

Reviewed by:	alc
Tested by:	pho
Sponsored by:	EMC / Isilon Storage Division
2013-08-07 06:21:20 +00:00
Andrey V. Elsukov
dbd4437b06 Include sys/systm.h after sys/param.h.
Suggested by:	pluknet
2013-07-15 15:40:57 +00:00
Rui Paulo
51091a0763 Fix a KTR_BUSDMA format string. 2013-06-18 06:55:58 +00:00
Konstantin Belousov
ee75e7de7b Implement the concept of the unmapped VMIO buffers, i.e. buffers which
do not map the b_pages pages into buffer_map KVA.  The use of the
unmapped buffers eliminates the need to perform TLB shootdown for
mapping on the buffer creation and reuse, greatly reducing the amount
of IPIs for shootdown on big-SMP machines and eliminating up to 25-30%
of the system time on i/o intensive workloads.

The unmapped buffer should be explicitly requested by the GB_UNMAPPED
flag by the consumer.  For unmapped buffer, no KVA reservation is
performed at all. The consumer might request unmapped buffer which
does have a KVA reserve, to manually map it without recursing into
buffer cache and blocking, with the GB_KVAALLOC flag.

When the mapped buffer is requested and unmapped buffer already
exists, the cache performs an upgrade, possibly reusing the KVA
reservation.

Unmapped buffer is translated into unmapped bio in g_vfs_strategy().
Unmapped bio carry a pointer to the vm_page_t array, offset and length
instead of the data pointer.  The provider which processes the bio
should explicitly specify a readiness to accept unmapped bio,
otherwise g_down geom thread performs the transient upgrade of the bio
request by mapping the pages into the new bio_transient_map KVA
submap.

The bio_transient_map submap claims up to 10% of the buffer map, and
the total buffer_map + bio_transient_map KVA usage stays the
same. Still, it could be manually tuned by kern.bio_transient_maxcnt
tunable, in the units of the transient mappings.  Eventually, the
bio_transient_map could be removed after all geom classes and drivers
can accept unmapped i/o requests.

Unmapped support can be turned off by the vfs.unmapped_buf_allowed
tunable, disabling which makes the buffer (or cluster) creation
requests to ignore GB_UNMAPPED and GB_KVAALLOC flags.  Unmapped
buffers are only enabled by default on the architectures where
pmap_copy_page() was implemented and tested.

In the rework, filesystem metadata is no longer subject to the
maxbufspace limit. Since the metadata buffers are always mapped, the
buffers still have to fit into the buffer map, which provides a
reasonable (but practically unreachable) upper bound on it. The
non-metadata buffer allocations, both mapped and unmapped, are
accounted against maxbufspace, as before. Effectively, this means that
the maxbufspace is enforced on mapped and unmapped buffers separately.
The pre-patch bufspace limiting code did not work, because
buffer_map fragmentation does not allow the limit to be reached.

At Jeff Roberson's request, the getnewbuf() function was split into
smaller single-purpose functions.

Sponsored by:	The FreeBSD Foundation
Discussed with:	jeff (previous version)
Tested by:	pho, scottl (previous version), jhb, bf
MFC after:	2 weeks
2013-03-19 14:13:12 +00:00
Konstantin Belousov
e8a4a618cf Add pmap function pmap_copy_pages(), which copies the content of the
pages around, taking array of vm_page_t both for source and
destination.  Starting offsets and total transfer size are specified.

The function implements an optimal algorithm for copying using the
platform-specific optimizations.  For instance, on the architectures
where the direct map is available, no transient mappings are created;
for i386 the per-cpu ephemeral page frame is used.  The code was
typically borrowed from the pmap_copy_page() for the same
architecture.

Only i386/amd64, powerpc aim and arm/arm-v6 implementations were
tested at the time of commit. High-level code, not committed yet to
the tree, ensures that the use of the function is only allowed after
explicit enablement.

For sparc64, the existing code has known issues and a stub is added
instead, to allow the kernel linking.

Sponsored by:	The FreeBSD Foundation
Tested by:	pho (i386, amd64), scottl (amd64), ian (arm and arm-v6)
MFC after:	2 weeks
2013-03-14 20:18:12 +00:00
Davide Italiano
acccf7d8b4 MFcalloutng:
When a CPU becomes idle, cpu_idleclock() calculates the time to the next
timer event in order to reprogram the hw timer. Return that time in
sbintime_t to the caller and pass it to acpi_cpu_idle(), where it can be
used as one more factor (quite precise) to estimate further sleep time and
choose the optimal sleep state. This is a preparatory change for further
callout improvements that will be committed in the next days.

The commit is not targeted for MFC.
2013-02-28 10:46:54 +00:00
Konstantin Belousov
dd0b4fb6d5 Reform the busdma API so that new types may be added without modifying
every architecture's busdma_machdep.c.  It is done by unifying the
bus_dmamap_load_buffer() routines so that they may be called from MI
code.  The MD busdma is then given a chance to do any final processing
in the complete() callback.

The cam changes unify the bus_dmamap_load* handling in cam drivers.

The arm and mips implementations are updated to track virtual
addresses for sync().  Previously this was done in a type specific
way.  Now it is done in a generic way by recording the list of
virtuals in the map.

Submitted by:	jeff (sponsored by EMC/Isilon)
Reviewed by:	kan (previous version), scottl,
	mjacob (isp(4), no objections for target mode changes)
Discussed with:	     ian (arm changes)
Tested by:	marius (sparc64), mips (jmallet), isci(4) on x86 (jharris),
	amd64 (Fabian Keil <freebsd-listen@fabiankeil.de>)
2013-02-12 16:57:20 +00:00
John Baldwin
2db99100a4 Improve the handling of static DMA buffers that use non-default memory
attributes (currently just BUS_DMA_NOCACHE):
- Don't call pmap_change_attr() on the returned address, instead use
  kmem_alloc_contig() to ask the VM system for memory with the requested
  attribute.
- As a result, always use kmem_alloc_contig() for non-default memory
  attributes, even for sub-page allocations.  This requires adjusting
  bus_dmamem_free()'s logic for determining which free routine to use.
- For x86, add a new dummy bus_dmamap that is used for static DMA
  buffers allocated via kmem_alloc_contig().  bus_dmamem_free() can then
  use the map pointer to determine which free routine to use.
- For powerpc, add a new flag to the allocated map (bus_dmamem_alloc()
  always creates a real map on powerpc) to indicate which free routine
  should be used.

Note that the BUS_DMA_NOCACHE handling in powerpc is currently #ifdef'd out.
I have left it disabled but updated it to match x86.

Reviewed by:	scottl
MFC after:	1 month
2012-08-03 13:50:29 +00:00
Alan Cox
8d9e6d9f93 Avoid recursion on the pvh global lock in the aim oea pmap.
Correct the return type of the pmap_ts_referenced() implementations.

Reported by:	jhibbits [1]
Tested by:	andreast
2012-07-10 22:10:21 +00:00
Rafal Jaworowski
17f4cae4a5 Let us manage differences of Book-E PowerPC variations i.e. vendor /
implementation specific vs. the common architecture definition.

Bring PPC4XX defines (PSL, SPR, TLB). Note the new definitions under
BOOKE_PPC4XX are not used in the code yet.

This change set is not supposed to affect existing E500 support, it's just
another reorg step before bringing support for E500mc, E5500 and PPC465.

Obtained from:	AppliedMicro, Freescale, Semihalf
2012-05-27 10:25:20 +00:00
Rafal Jaworowski
0a67fa33d6 Move OpenPIC FDT bus glue to a shared location, so that other PowerPC
platforms can use it, not only MPC85XX.

This is just reorg, no functional changes.
2012-05-26 21:02:49 +00:00
Rafal Jaworowski
2f6bd24181 Rename e500 prefix to match other Book-E CPU variations. CPU id tidbits for
the new cores.

Obtained from:	Freescale, Semihalf.
2012-05-26 13:36:18 +00:00
Rafal Jaworowski
21e7982efd Missing vm_paddr_t bits which should have been part of r235936. 2012-05-25 15:13:55 +00:00
Rafal Jaworowski
20b7961267 Fix physical address type to vm_paddr_t. 2012-05-24 21:13:24 +00:00
Nathan Whitehorn
ccc4a5c761 Replace the list of PVOs owned by each PMAP with an RB tree. This simplifies
range operations like pmap_remove() and pmap_protect() as well as allowing
simple operations like pmap_extract() not to involve any global state.
This substantially reduces lock coverage for the global table lock and
improves concurrency.
2012-05-20 14:33:28 +00:00
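Why an ordered structure simplifies range operations, as the commit above claims, can be shown with a small sketch. A sorted array with binary search is swapped in here for the RB tree purely for brevity; the function names and the use of plain 64-bit addresses as keys are assumptions for illustration, not the kernel's PVO code.

```c
#include <stddef.h>
#include <stdint.h>

/* Index of the first element >= key in a sorted array (binary search). */
size_t
lower_bound(const uint64_t *va, size_t n, uint64_t key)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (va[mid] < key)
			lo = mid + 1;
		else
			hi = mid;
	}
	return (lo);
}

/*
 * Remove every address in [start, end) from the sorted array and return
 * the new element count. Both ends of the range are found by an ordered
 * lookup rather than by testing every entry, as an unordered list would
 * require.
 */
size_t
remove_range(uint64_t *va, size_t n, uint64_t start, uint64_t end)
{
	size_t first, last;

	if (end <= start)
		return (n);
	first = lower_bound(va, n, start);
	last = lower_bound(va, n, end);
	for (size_t i = last; i < n; i++)
		va[first + (i - last)] = va[i];	/* close the gap */
	return (n - (last - first));
}
```

An array still pays for shifting the tail after a removal; a balanced tree avoids that cost, but the ordered-lookup idea is the same.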
Nathan Whitehorn
a1f8f44820 Remove dead code. The routines in atomic.S did not work properly anyway, and
were everywhere unused. If we turn out to need them, they should be
reimplemented.

MFC after:	2 weeks
2012-04-22 18:56:56 +00:00
Nathan Whitehorn
13d47f302f Replace eieio; sync for creating bus-space memory barriers with sync.
sync performs a strict superset of the functions of eieio, so using both
is redundant. While here, expand bus barriers to all bus_space operations,
since many drivers do not correctly use bus_space_barrier().

In principle, we can also replace sync just with eieio, for a significant
performance increase, but it remains to be seen whether any poorly-written
drivers currently depend on the side effects of sync to properly function.

MFC after:	1 week
2012-04-22 18:54:51 +00:00
Nathan Whitehorn
88fe385600 Do not restore the register holding the TLS pointer when doing various
usermode context switches (long jumps and ucontext operations). If these
are used across threads, multiple threads can end up with the same TLS base.
Madness will then result.

This makes behavior on PPC match that on x86 systems and on Linux.

MFC after:	10 days
2012-04-11 00:00:40 +00:00
John Baldwin
831ce4cb3d - Change contigmalloc() to use the vm_paddr_t type instead of an unsigned
long for specifying a boundary constraint.
- Change bus_dma tags to use bus_addr_t instead of bus_size_t for boundary
  constraints.

These allow boundary constraints to be fully expressed for cases where
sizeof(bus_addr_t) != sizeof(bus_size_t).  Specifically, it allows a
driver to properly specify a 4GB boundary in a PAE kernel.

Note that this cannot be safely MFC'd without a lot of compat shims due
to KBI changes, so I do not intend to merge it.

Reviewed by:	scottl
2012-03-01 19:58:34 +00:00
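The type-width problem described in the commit above can be sketched as follows. The typedefs are hypothetical stand-ins for a 32-bit bus_size_t and a 64-bit bus_addr_t on a PAE-style system, and treating a boundary of 0 as "no constraint" is an assumption of this sketch; none of this is the actual busdma code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types on a PAE-style system. */
typedef uint32_t narrow_size_t;	/* like a 32-bit bus_size_t */
typedef uint64_t wide_addr_t;	/* like a 64-bit bus_addr_t */

#define BOUNDARY_4GB	(UINT64_C(1) << 32)

/*
 * True if [addr, addr + len) spans a multiple of a power-of-two boundary.
 * A boundary of 0 is treated as "no constraint" in this sketch.
 */
bool
crosses_boundary(wide_addr_t addr, wide_addr_t len, wide_addr_t boundary)
{
	wide_addr_t mask;

	if (boundary == 0 || len == 0)
		return (false);
	mask = ~(boundary - 1);
	return ((addr & mask) != ((addr + len - 1) & mask));
}

/* Storing the boundary in the narrow type silently truncates 4 GB to 0. */
narrow_size_t
narrow_boundary(wide_addr_t boundary)
{
	return ((narrow_size_t)boundary);
}
```

With the wide type, a buffer starting at 0xFFFFF000 with length 0x2000 is correctly flagged as crossing the 4 GB line; after truncation to the narrow type the boundary becomes 0 and the same buffer passes unchecked.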