Commit Graph

1907 Commits

Author SHA1 Message Date
alc
6eeaee04e4 The page flag PGA_WRITEABLE is set and cleared exclusively by the pmap
layer, but it is read directly by the MI VM layer.  This change introduces
pmap_page_is_write_mapped() in order to completely encapsulate all direct
access to PGA_WRITEABLE in the pmap layer.

Aesthetics aside, I am making this change because amd64 will likely begin
using an alternative method to track write mappings, and having
pmap_page_is_write_mapped() in place allows me to make such a change
without further modification to the MI VM layer.

As an added bonus, tidy up some nearby comments concerning page flags.

Reviewed by:	kib
MFC after:	6 weeks
2012-06-16 18:56:19 +00:00
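
A minimal sketch of what the new helper could look like, assuming it is a
pmap-layer macro that simply wraps the existing aflags test (the real
definition lives in the vm/pmap headers and may differ in detail):

    /* Hedged sketch: the MI VM layer asks the pmap layer instead of
     * reading PGA_WRITEABLE directly. */
    #define pmap_page_is_write_mapped(m)    \
            (((m)->aflags & PGA_WRITEABLE) != 0)

    /* Typical MI-layer use, e.g. before marking a page dirty: */
    if (pmap_page_is_write_mapped(m))
            vm_page_dirty(m);
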
raj
4542634c3c Panic openly if we cannot retrieve memory information from the device tree.
This is a critical condition and can lead to all sorts of mysterious hangs if
not handled.

Obtained from:	Semihalf
Also reported by: thompsa
2012-05-30 18:05:48 +00:00
raj
76640cb48b Extract vendor specific Book-E pieces into separate files and have a common
skeleton (maybe we should kobj-tize this one day).

Note the PPC4xx bit is not connected to the build yet.

Obtained from:	AppliedMicro, Semihalf.
2012-05-30 17:34:40 +00:00
raj
39bb31b171 Remove redundant check; we catch ULE platform support in the common
sys/kern/sched_ule.c
2012-05-27 10:32:10 +00:00
raj
7136f7f893 Let us manage differences between Book-E PowerPC variations, i.e. vendor /
implementation specific vs. the common architecture definition.

Bring PPC4XX defines (PSL, SPR, TLB). Note the new definitions under
BOOKE_PPC4XX are not used in the code yet.

This change set is not supposed to affect existing E500 support; it's just
another reorg step before bringing support for E500mc, E5500 and PPC465.

Obtained from:	AppliedMicro, Freescale, Semihalf
2012-05-27 10:25:20 +00:00
raj
f58c0d7427 Import eSDHC driver for Freescale integrated controller.
Obtained from:	Freescale, Semihalf
Written by:	Michal Dubiel
2012-05-26 21:07:15 +00:00
raj
dc343feae3 Move OpenPIC FDT bus glue to a shared location, so that other PowerPC
platforms can use it, not only MPC85XX.

This is just reorg, no functional changes.
2012-05-26 21:02:49 +00:00
raj
e87745389c Retrieve CPU number info from the device tree.
Obtained from:	Freescale, Semihalf.
2012-05-26 13:42:55 +00:00
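
Conceptually this amounts to walking the /cpus node of the device tree; a
hedged sketch using the standard Open Firmware interface (the function name
is illustrative, not the committed one):

    #include <dev/ofw/openfirm.h>

    /* Count the child nodes of /cpus; fall back to one CPU if absent. */
    static int
    fdt_cpu_count(void)
    {
            phandle_t cpus, child;
            int ncpu = 0;

            if ((cpus = OF_finddevice("/cpus")) == -1)
                    return (1);
            for (child = OF_child(cpus); child != 0; child = OF_peer(child))
                    ncpu++;
            return (ncpu > 0 ? ncpu : 1);
    }
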
raj
8c2ef762a7 Rename e500 prefix to match other Book-E CPU variations. CPU id tidbits for
the new cores.

Obtained from:	Freescale, Semihalf.
2012-05-26 13:36:18 +00:00
raj
0557f549f6 Provide SPR definitions for newer Book-E (E500mc, E5500, PPC465).
Obtained from:	Freescale, Semihalf.
2012-05-26 12:39:23 +00:00
raj
9e109ca411 Unify SPR defines formatting, no functional changes. 2012-05-26 12:15:13 +00:00
raj
c4e990d23d Update HID defines for E500mc and E5500 CPU cores.
Obtained from:	Freescale, Semihalf
2012-05-25 21:12:24 +00:00
raj
cb9537b464 Fix physical address type to vm_paddr_t also for powerpc64. 2012-05-25 18:17:26 +00:00
raj
0c51e3a0d5 Missing vm_paddr_t bits which should have been part of r235936. 2012-05-25 15:13:55 +00:00
bz
9254329b05 Add a missing " to get closer to compiling. 2012-05-24 23:46:17 +00:00
nwhitehorn
cb2f55559e Atomic operation acquire barriers also need to be isync on 64-bit systems. 2012-05-24 22:14:39 +00:00
marcel
3184129b6e Revert isync for ILP32 to sync as per my original change that I discussed
with Nathan. Leave __ATOMIC_ACQ as an isync as per Nathan.
2012-05-24 22:06:00 +00:00
bz
cd8b136e12 MFp4 bz_ipv6_fast:
in_cksum.h required ip.h to be included for struct ip.  To be
  able to use some general checksum functions like in_addword()
  in a non-IPv4 context, limit the (also exported to user space)
  IPv4-specific functions to the cases where the ip.h header is
  present and IPVERSION is defined (to 4).

  We should consider more general checksum (updating) functions
  to also allow easier incremental checksum updates in the L3/4
  stack and firewalls, as well as ponder further requirements by
  certain NIC drivers needing slightly different pseudo values
  in offloading cases.  Thinking in terms of a better "library".

  Sponsored by:	The FreeBSD Foundation
  Sponsored by:	iXsystems

Reviewed by:	gnn (as part of the whole)
MFC After:	3 days
2012-05-24 22:00:48 +00:00
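
The resulting header split looks roughly like the sketch below; in_addword()
stands in for the generic helpers and the #if guard is the point being made,
while the exact prototypes are an assumption rather than a quote from the
commit:

    #include <sys/types.h>

    /* Generic one's-complement helper: usable without netinet/ip.h. */
    u_short in_addword(u_short, u_short);

    /* IPv4-specific pieces need struct ip, so expose them only when the
     * consumer has already included netinet/ip.h (which defines IPVERSION). */
    #if defined(IPVERSION) && IPVERSION == 4
    u_short in_cksum_hdr(const struct ip *);
    #endif
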
marcel
5ba5b2308f A few improvements:
1.  Define all registers. These definitions are needed to support
    the FCM driver for direct-connect NAND.
2.  Repurpose lbc_read_reg() and lbc_write_reg() for use by localbus
    attached device drivers. Use bus_space functions directly in the
    lbc driver itself.
3.  Be smarter about programming LAWs and mapping memory. The ranges
    defined in the FDT are per bank (= chip select) and since we can
    have up to 8 banks, we could easily use more than 8 LAWs or TLB
    entries when per-bank memory ranges need multiple LAWs or TLBs
    due to alignment or size constraints.
    We now combine all memory ranges into the fewest possible set of
    contiguous regions and program the hardware for that. Thus, a
    cleverly written FDT with 8 devices may still only need 1 LAW or
    1 TLB entry. Note that the memory ranges can be assigned randomly
    to the banks. We sort as we build to handle that.
4.  Support the FCM when programming the OR register. This is mostly
    for documentation purposes as we do not have a way to define the
    mode for a bank.
5.  Remove Semihalf-ism: do not define DEBUG (only to undefine it
    again).
2012-05-24 21:23:13 +00:00
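
Item 3 boils down to sorting the per-bank windows by start address and folding
overlapping or adjacent ones together; a stand-alone sketch of that merge
(types and names are illustrative, not the lbc driver's own):

    #include <stdlib.h>

    struct range {
            unsigned long start;
            unsigned long size;
    };

    static int
    range_cmp(const void *a, const void *b)
    {
            const struct range *ra = a, *rb = b;

            return ((ra->start > rb->start) - (ra->start < rb->start));
    }

    /* Sort by start address and coalesce overlapping/adjacent ranges;
     * returns how many merged ranges are left at the front of the array. */
    static int
    merge_ranges(struct range *r, int n)
    {
            int i, m;

            if (n == 0)
                    return (0);
            qsort(r, n, sizeof(*r), range_cmp);
            for (i = 1, m = 0; i < n; i++) {
                    if (r[i].start <= r[m].start + r[m].size) {
                            unsigned long end = r[i].start + r[i].size;

                            if (end > r[m].start + r[m].size)
                                    r[m].size = end - r[m].start;
                    } else
                            r[++m] = r[i];
            }
            return (m + 1);
    }

With this, a cleverly written FDT whose bank ranges all touch collapses to a
single region, and hence a single LAW or TLB entry.
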
raj
91f8a79888 Fix physical address type to vm_paddr_t. 2012-05-24 21:13:24 +00:00
marcel
fece2de465 Remove Semihalf-ism. DEBUG is a kernel configuration option. It
should not be defined in source files.
2012-05-24 21:09:38 +00:00
marcel
4015ece7fb Just return if the size of the window is 0. This can happen when the
FDT does not define all ranges possible for a particular node (e.g.
PCI).
While here, only update the trgt_mem and trgt_io pointers if there's
no error. This avoids knowingly writing an invalid target (= -1).
2012-05-24 21:07:10 +00:00
marcel
2c73b7699a Either the I/O port range or the memory mapped I/O range may not be
defined in the FDT. The range will have a zero size in that case.
2012-05-24 21:01:35 +00:00
marcel
eeea3251f8 o Rename kernload_ap to bp_kernelload. This is to introduce a common prefix
for variables that live in the boot page.
o   Add bp_trace (yes, it's in the boot page) that gets zeroed before we
    try to wake a core and to which the core being woken can write markers
    so that we know where the core was in case it doesn't wake up. The
    boot code does not yet write markers (to follow).
o   Disable the boot page translation to allow the last 4K page to be used
    for whatever we please. It would get mapped otherwise.
o   Fix kernstart in the case of SMP. The start argument is typically page
    aligned due to the alignment requirements that come with having a boot
    page. The point of using trunc_page is that we get the actual load
    address given that the entry point is immediately following the ELF
    headers. In the SMP case this ended up exactly 4K after the load
    address. Hence subtracting 1 from start.
2012-05-24 20:58:40 +00:00
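
The kernstart fix in the last item reduces to a one-line expression; a sketch
of the idea (the surrounding Book-E machdep code is not shown):

    /*
     * Normally the entry point sits just past the ELF headers, so
     * truncating to a page boundary recovers the load address.  With a
     * boot page the entry point is page aligned and lies exactly one page
     * past the load address, so back up one byte before truncating; both
     * cases then yield the real load address.
     */
    kernstart = trunc_page(start - 1);
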
marcel
c933e51f6c Fix the memory barriers for CPUs that do not like lwsync and wedge or cause
exceptions early enough during boot that the kernel will do the same.
Use lwsync only when compiling for LP64 and revert to the more proven isync
when compiling for ILP32. Note that in the end (i.e. between revision 222198
and this change) ILP32 changed from using sync to using isync. As per Nathan
the isync is needed to make sure I/O accesses are properly serialized with
locks and isync tends to be more efficient than sync.

While here, undefine __ATOMIC_ACQ and __ATOMIC_REL at the end of the file
so as not to leak their definitions.

Discussed with: nwhitehorn
2012-05-24 20:45:44 +00:00
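
In macro form the choice described here comes out roughly as below (a sketch
of the shape, not the header verbatim; note that the acquire side is revisited
by 3184129b6e and cb2f55559e above):

    /* Barriers used by the atomic(9) acquire/release variants. */
    #ifdef __powerpc64__
    #define __ATOMIC_ACQ()  __asm __volatile("lwsync" : : : "memory")
    #define __ATOMIC_REL()  __asm __volatile("lwsync" : : : "memory")
    #else   /* ILP32: some cores mishandle lwsync, use the proven isync */
    #define __ATOMIC_ACQ()  __asm __volatile("isync" : : : "memory")
    #define __ATOMIC_REL()  __asm __volatile("isync" : : : "memory")
    #endif

    /* ... atomic_*_acq_* / atomic_*_rel_* bodies use the macros here ... */

    /* Undefined again at the end of the header so the names do not leak. */
    #undef __ATOMIC_ACQ
    #undef __ATOMIC_REL
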
nwhitehorn
e83623fb1f Replace the list of PVOs owned by each PMAP with an RB tree. This simplifies
range operations like pmap_remove() and pmap_protect() as well as allowing
simple operations like pmap_extract() not to involve any global state.
This substantially reduces lock coverage for the global table lock and
improves concurrency.
2012-05-20 14:33:28 +00:00
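
For reference, the sys/tree.h pattern this moves to looks roughly like the
following (field and function names here are illustrative, not the pmap's):

    #include <sys/types.h>
    #include <sys/tree.h>

    struct pvo_entry {
            RB_ENTRY(pvo_entry) pvo_plink;  /* per-pmap tree linkage */
            vm_offset_t         pvo_vaddr;  /* mapped virtual address */
            /* ... */
    };

    static int
    pvo_vaddr_cmp(struct pvo_entry *a, struct pvo_entry *b)
    {
            return ((a->pvo_vaddr > b->pvo_vaddr) -
                (a->pvo_vaddr < b->pvo_vaddr));
    }

    RB_HEAD(pvo_tree, pvo_entry);
    RB_GENERATE(pvo_tree, pvo_entry, pvo_plink, pvo_vaddr_cmp);

    /* A range operation such as pmap_remove() can then start at
     * RB_NFIND() for the low address and RB_NEXT() until it passes the
     * high address, instead of scanning a per-pmap list. */
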
nwhitehorn
68e9eabdbf Fix final bugs in memory barriers on PowerPC:
- Use isync/lwsync unconditionally for acquire/release. Use of isync
  guarantees a complete memory barrier, which is important for serialization
  of bus space accesses with mutexes on multi-processor systems.
- Go back to using sync as the I/O memory barrier, which solves the same
  problem as above with respect to mutex release using lwsync, while not
  penalizing non-I/O operations like a return to sync on the atomic release
  operations would.
- Place an acquisition barrier around thread lock acquisition in
  cpu_switchin().
2012-05-04 16:00:22 +00:00
dim
81a3e2b46d Add a convenience macro for the returns_twice attribute, and apply it to
the prototypes of the appropriate functions (getcontext, savectx,
setjmp, sigsetjmp and vfork).

MFC after:	2 weeks
2012-04-29 11:04:31 +00:00
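
The convenience macro follows the usual sys/cdefs.h pattern; a sketch (the
exact macro name and compiler-version guard are assumptions):

    #include <sys/types.h>

    /* Expands to the attribute on compilers that understand it. */
    #if defined(__GNUC__) && __GNUC__ >= 4
    #define __returns_twice __attribute__((__returns_twice__))
    #else
    #define __returns_twice
    #endif

    /* Callers of such functions may observe two returns, so the compiler
     * must not cache values in registers across the call, e.g.: */
    __returns_twice pid_t vfork(void);
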
nwhitehorn
98252594c8 Fix build on 32-bit systems. 2012-04-28 14:42:49 +00:00
nwhitehorn
d4577289f5 After switching mutexes to use lwsync, they no longer provide sufficient
guarantees on acquire for the tlbie mutex. Conversely, the TLB invalidation
sequence provides guarantees that do not need to be redundantly applied on
release. Roll a small custom lock that is just right. Simultaneously,
convert the SLB tree changes back to lwsync, as changing them to sync
was a misdiagnosis of the tlbie barrier problem this commit actually fixes.
2012-04-28 00:12:23 +00:00
nwhitehorn
337d2f2292 Switch the default I/O memory barrier to eieio, as it should be. This
does not appear to cause any problems due to fixes elsewhere.

MFC after:	2 months
2012-04-24 13:37:43 +00:00
nwhitehorn
d5494b21d8 Revert r234581 for this file. The lockless SLB tree code does in fact need
a heavyweight sync instead of a lightweight sync to function properly.
Thanks to mdf for the clarification.
2012-04-24 13:36:41 +00:00
nwhitehorn
1f2cd3bf47 Fix copy-and-paste error in r230400.
MFC after: 3 days
2012-04-23 20:53:50 +00:00
nwhitehorn
ff8c08b546 Fix missing header for powerpc_iomb().
Pointy hat to:	me
2012-04-23 15:47:07 +00:00
nwhitehorn
265701f492 Provide a clearer split between read/write and acquire/release barriers.
This should really, actually be correct now.
2012-04-22 22:27:35 +00:00
nwhitehorn
19e4decf83 Correctly specify assembler constraints for synchronization instructions.
MFC after: 3 days
2012-04-22 21:55:19 +00:00
nwhitehorn
f4ccf1d6d0 Clarify what we are doing in r234583 a little better: eieio and isync do
not provide general barriers, but only barriers in the context of the
atomic sequences here. As such, make them private and keep the global
*mb() routines using a variant of sync.
2012-04-22 21:11:01 +00:00
nwhitehorn
9735a82985 On non-64-bit systems (which generally don't have lwsync), use eieio and
isync to implement read and write barriers, following Appendix B.2 of
Book II of the architecture manual. This provides a 25% speed increase
to fork() on the PowerPC G4.
2012-04-22 20:23:34 +00:00
nwhitehorn
086dac3dc0 Use lwsync to provide memory barriers on systems that support it instead
of sync (lwsync is an alternate encoding of sync on systems that do not
support it, providing graceful fallback). This provides more than an order
of magnitude reduction in the time required to acquire or release a mutex.

MFC after:	2 months
2012-04-22 19:00:51 +00:00
nwhitehorn
0e093d4003 Remove dead code. The routines in atomic.S did not work properly anyway, and
were unused everywhere. If we turn out to need them, they should be
reimplemented.

MFC after:	2 weeks
2012-04-22 18:56:56 +00:00
nwhitehorn
666d3956e9 Replace eieio; sync for creating bus-space memory barriers with sync.
sync performs a strict superset of the functions of eieio, so using both
is redundant. While here, expand bus barriers to all bus_space operations,
since many drivers do not correctly use bus_space_barrier().

In principle, we can also replace sync just with eieio, for a significant
performance increase, but it remains to be seen whether any poorly-written
drivers currently depend on the side effects of sync to properly function.

MFC after:	1 week
2012-04-22 18:54:51 +00:00
nwhitehorn
d394621163 Avoid a lock order reversal in pmap_extract_and_hold() from relocking
the page. This PMAP requires an additional lock besides the PMAP lock
in pmap_extract_and_hold(), which vm_page_pa_tryrelock() did not release.

Suggested by:	kib
MFC after:	4 days
2012-04-22 17:58:30 +00:00
nwhitehorn
5af196c3d0 Organize some members of ucontext_t in the same order they are in the
trap frame. These are usually not used, and so this changes very little.

MFC after:	5 days
2012-04-21 14:39:47 +00:00
nwhitehorn
8f2d167ee3 Make sure all pending operations have completed on the existing thread
before (potentially) migrating it to a different CPU.

MFC after:	5 days
2012-04-20 23:01:36 +00:00
nwhitehorn
d89ad2c78a We don't need kcopy() in any of the remaining places it is used, so
remove it.

MFC after:	2 weeks
2012-04-11 22:23:50 +00:00
nwhitehorn
e1fa94aa4e Only manipulate the PGA_EXECUTABLE flag on managed pages. This is a proxy
for whether the page is physical. On dense phys mem systems (32-bit),
PHYS_TO_VM_PAGE() will not return NULL for device memory pages if device
memory is above physical memory even if there is no allocated vm_page.
Attempting to use the returned page could then cause either memory
corruption or a page fault.
2012-04-11 21:56:55 +00:00
nwhitehorn
31bc0a8bff Fix error in r233949. Synchronizing icaches on uncacheable pages turns out
not to be a good idea, and of course the PV entry list for a page is never
empty after the page has been mapped.
2012-04-11 20:28:05 +00:00
nwhitehorn
5ff92b1d9c Do not restore the register holding the TLS pointer when doing various
usermode context switches (long jumps and ucontext operations). If these
are used across threads, multiple threads can end up with the same TLS base.
Madness will then result.

This makes behavior on PPC match that on x86 systems and on Linux.

MFC after:	10 days
2012-04-11 00:00:40 +00:00
nwhitehorn
a8563567c4 Execute an initial ptesync if and only if the PTE is actually being
invalidated, as opposed to a ref/changed bit update.
2012-04-06 22:33:13 +00:00
nwhitehorn
6c37128bdd Substantially reduce the scope of the locks held in pmap_enter(), which
improves concurrency slightly.
2012-04-06 18:18:48 +00:00