freebsd-dev

Author	SHA1	Message	Date
Rafal Jaworowski	2ae7b3e42d	Unify SPR defines formatting, no functional changes.	2012-05-26 12:15:13 +00:00
Rafal Jaworowski	ec0453765b	Update HID defines for E500mc and E5500 CPU cores. Obtained from: Freescale, Semihalf	2012-05-25 21:12:24 +00:00
Rafal Jaworowski	d7c8c7fdfb	Fix physical address type to vm_paddr_t also for powerpc64.	2012-05-25 18:17:26 +00:00
Rafal Jaworowski	21e7982efd	Missing vm_paddr_t bits which should have been part of r235936.	2012-05-25 15:13:55 +00:00
Bjoern A. Zeeb	08c5f3303d	Add a missing " to get closer to compiling.	2012-05-24 23:46:17 +00:00
Nathan Whitehorn	270dc329b7	Atomic operation acquire barriers also need to be isync on 64-bit systems.	2012-05-24 22:14:39 +00:00
Marcel Moolenaar	7097794901	Revert isync for ILP32 to sync as per my original change that I discussed with Nathan. Leave __ATOMIC_ACQ as an isync as per Nathan.	2012-05-24 22:06:00 +00:00
Bjoern A. Zeeb	920b965865	MFp4 bz_ipv6_fast: in_cksum.h required ip.h to be included for struct ip. To be able to use some general checksum functions like in_addword() in a non-IPv4 context, limit the (also exported to user space) IPv4 specific functions to the times, when the ip.h header is present and IPVERSION is defined (to 4). We should consider more general checksum (updating) functions to also allow easier incremental checksum updates in the L3/4 stack and firewalls, as well as ponder further requirements by certain NIC drivers needing slightly different pseudo values in offloading cases. Thinking in terms of a better "library". Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems Reviewed by: gnn (as part of the whole) MFC After: 3 days	2012-05-24 22:00:48 +00:00
Marcel Moolenaar	f6703dd295	A few improvements: 1. Define all registers. These definitions are needed to support the FCM driver for direct-connect NAND. 2. Repurpose lbc_read_reg() and lbc_write_reg() for use by localbus attached device drivers. Use bus_space functions directly in the lbc driver itself. 3. Be smarter about programming LAWs and mapping memory. The ranges defined in the FDT are per bank (= chip select) and since we can have up to 8 banks, we could easily use more than 8 LAWs or TLB entries when per-bank memory ranges need multiple LAWs or TLBs due to alignment or size constraints. We now combine all memory ranges into the fewest possible set of contiguous regions and program the hardware for that. Thus, a cleverly written FDT with 8 devices may still only need 1 LAW or 1 TLB entry. Note that the memory ranges can be assigned randomly to the banks. We sort as we build to handle that. 4. Support the FCM when programming the OR register. This is mostly for documentation purposes as we do not have a way to define the mode for a bank. 5. Remove Semihalf-ism: do not define DEBUG (only to undefine it again).	2012-05-24 21:23:13 +00:00
Rafal Jaworowski	20b7961267	Fix physical address type to vm_paddr_t.	2012-05-24 21:13:24 +00:00
Marcel Moolenaar	5704576a0a	Remove Semihalf-ism. DEBUG is a kernel configuration option. It should not be defined in source files.	2012-05-24 21:09:38 +00:00
Marcel Moolenaar	e845939dc1	Just return if the size of the window is 0. This can happen when the FDT does not define all ranges possible for a particular node (e.g. PCI). While here, only update the trgt_mem and trgt_io pointers if there's no error. This avoids that we knowingly write an invalid target (= -1).	2012-05-24 21:07:10 +00:00
Marcel Moolenaar	05917fee1b	Either the I/O port range or the memory mapped I/O range may not be defined in the FDT. The range will have a zero size in that case.	2012-05-24 21:01:35 +00:00
Marcel Moolenaar	a45d9127bd	o Rename kernload_ap to bp_kernelload. This is to introduce a common prefix for variables that live in the boot page. o Add bp_trace (yes, it's in the boot page) that gets zeroed before we try to wake a core and to which the core being woken can write markers so that we know where the core was in case it doesn't wake up. The boot code does not yet write markers (to follow). o Disable the boot page translation to allow the last 4K page to be used for whatever we please. It would get mapped otherwise. o Fix kernstart in the case of SMP. The start argument is typically page aligned due to the alignment requirements that come with having a boot page. The point of using trunc_page is that we get the actual load address given that the entry point is immediately following the ELF headers. In the SMP case this ended up exactly 4K after the load address. Hence subtracting 1 from start.	2012-05-24 20:58:40 +00:00
Marcel Moolenaar	df0bef25eb	Fix the memory barriers for CPUs that do not like lwsync and wedge or cause exceptions early enough during boot that the kernel will do the same. Use lwsync only when compiling for LP64 and revert to the more proven isync when compiling for ILP32. Note that in the end (i.e. between revision 222198 and this change) ILP32 changed from using sync to using isync. As per Nathan the isync is needed to make sure I/O accesses are properly serialized with locks and isync tends to be more efficient than sync. While here, undefine __ATOMIC_ACQ and __ATOMIC_REL at the end of the file so as not to leak their definitions. Discussed with: nwhitehorn	2012-05-24 20:45:44 +00:00
Nathan Whitehorn	ccc4a5c761	Replace the list of PVOs owned by each PMAP with an RB tree. This simplifies range operations like pmap_remove() and pmap_protect() as well as allowing simple operations like pmap_extract() not to involve any global state. This substantially reduces lock coverages for the global table lock and improves concurrency.	2012-05-20 14:33:28 +00:00
Nathan Whitehorn	bc96dccc69	Fix final bugs in memory barriers on PowerPC: - Use isync/lwsync unconditionally for acquire/release. Use of isync guarantees a complete memory barrier, which is important for serialization of bus space accesses with mutexes on multi-processor systems. - Go back to using sync as the I/O memory barrier, which solves the same problem as above with respect to mutex release using lwsync, while not penalizing non-I/O operations like a return to sync on the atomic release operations would. - Place an acquisition barrier around thread lock acquisition in cpu_switchin().	2012-05-04 16:00:22 +00:00
Dimitry Andric	460378bf13	Add a convenience macro for the returns_twice attribute, and apply it to the prototypes of the appropriate functions (getcontext, savectx, setjmp, sigsetjmp and vfork). MFC after: 2 weeks	2012-04-29 11:04:31 +00:00
Nathan Whitehorn	284ea61312	Fix build on 32-bit systems.	2012-04-28 14:42:49 +00:00
Nathan Whitehorn	50e13823c8	After switching mutexes to use lwsync, they no longer provide sufficient guarantees on acquire for the tlbie mutex. Conversely, the TLB invalidation sequence provides guarantees that do not need to be redundantly applied on release. Roll a small custom lock that is just right. Simultaneously, convert the SLB tree changes back to lwsync, as changing them to sync was a misdiagnosis of the tlbie barrier problem this commit actually fixes.	2012-04-28 00:12:23 +00:00
Nathan Whitehorn	de63b4d2d5	Switch the default I/O memory barrier to eieio, as it should be. This does not appear to cause any problems due to fixes elsewhere. MFC after: 2 months	2012-04-24 13:37:43 +00:00
Nathan Whitehorn	8387bb0c78	Revert r234581 for this file. The lockless SLB tree code does in fact need a heavyweight sync instead of a lightweight sync to function properly. Thanks to mdf for the clarification.	2012-04-24 13:36:41 +00:00
Nathan Whitehorn	51a6f57e4a	Fix copy-and-paste error in r230400. MFC after: 3 days	2012-04-23 20:53:50 +00:00
Nathan Whitehorn	2a134f71d1	Fix missing header for powerpc_iomb(). Pointy hat to: me	2012-04-23 15:47:07 +00:00
Nathan Whitehorn	a4cbf436e7	Provide a clearer split between read/write and acquire/release barriers. This should really, actually be correct now.	2012-04-22 22:27:35 +00:00
Nathan Whitehorn	14758466eb	Correctly specify assembler constraints for synchronization instructions. MFC after: 3 days	2012-04-22 21:55:19 +00:00
Nathan Whitehorn	a6349a998d	Clarify what we are doing in r234583 a little better: eieio and isync do not provide general barriers, but only barriers in the context of the atomic sequences here. As such, make them private and keep the global *mb() routines using a variant of sync.	2012-04-22 21:11:01 +00:00
Nathan Whitehorn	83ae3d5531	On non-64-bit systems (which generally don't have lwsync), use eieio and isync to implement read and write barriers, following Appendix B.2 of Book II of the architecture manual. This provides a 25% speed increase to fork() on the PowerPC G4.	2012-04-22 20:23:34 +00:00
Nathan Whitehorn	6f26a88999	Use lwsync to provide memory barriers on systems that support it instead of sync (lwsync is an alternate encoding of sync on systems that do not support it, providing graceful fallback). This provides more than an order of magnitude reduction in the time required to acquire or release a mutex. MFC after: 2 months	2012-04-22 19:00:51 +00:00
Nathan Whitehorn	a1f8f44820	Remove dead code. The routines in atomic.S did not work properly anyway, and were everywhere unused. If we turn out to need them, they should be reimplemented. MFC after: 2 weeks	2012-04-22 18:56:56 +00:00
Nathan Whitehorn	13d47f302f	Replace eieio; sync for creating bus-space memory barriers with sync. sync performs a strict superset of the functions of eieio, so using both is redundant. While here, expand bus barriers to all bus_space operations, since many drivers do not correctly use bus_space_barrier(). In principle, we can also replace sync just with eieio, for a significant performance increase, but it remains to be seen whether any poorly-written drivers currently depend on the side effects of sync to properly function. MFC after: 1 week	2012-04-22 18:54:51 +00:00
Nathan Whitehorn	0b852c03eb	Avoid a lock order reversal in pmap_extract_and_hold() from relocking the page. This PMAP requires an additional lock besides the PMAP lock in pmap_extract_and_hold(), which vm_page_pa_tryrelock() did not release. Suggested by: kib MFC after: 4 days	2012-04-22 17:58:30 +00:00
Nathan Whitehorn	fbd21ea620	Organize some members of ucontext_t in the same order they are in the trap frame. These are usually not used, and so this changes very little. MFC after: 5 days	2012-04-21 14:39:47 +00:00
Nathan Whitehorn	c13aac3896	Make sure all pending operations have completed on the existing thread before (potentially) migrating it to a different CPU. MFC after: 5 days	2012-04-20 23:01:36 +00:00
Nathan Whitehorn	e3c2930d36	We don't need kcopy() in any of the remaining places it is used, so remove it. MFC after: 2 weeks	2012-04-11 22:23:50 +00:00
Nathan Whitehorn	b6aeb1ab97	Only manipulate the PGA_EXECUTABLE flag on managed pages. This is a proxy for whether the page is physical. On dense phys mem systems (32-bit), VM_PHYS_TO_PAGE will not return NULL for device memory pages if device memory is above physical memory even if there is no allocated vm_page. Attempting to use the returned page could then cause either memory corruption or a page fault.	2012-04-11 21:56:55 +00:00
Nathan Whitehorn	805bee55eb	Fix error in r233949. Synchronizing icaches on uncacheable pages turns out not to be a good idea, and of course the PV entry list for a page is never empty after the page has been mapped.	2012-04-11 20:28:05 +00:00
Nathan Whitehorn	88fe385600	Do not restore the register holding the TLS pointer when doing various usermode context switches (long jumps and ucontext operations). If these are used across threads, multiple threads can end up with the same TLS base. Madness will then result. This makes behavior on PPC match that on x86 systems and on Linux. MFC after: 10 days	2012-04-11 00:00:40 +00:00
Nathan Whitehorn	b7d0d1fabf	Execute an initial ptesync if and only if the PTE is actually being invalidated, as opposed to a ref/changed bit update.	2012-04-06 22:33:13 +00:00
Nathan Whitehorn	348bc07000	Substantially reduce the scope of the locks held in pmap_enter(), which improves concurrency slightly.	2012-04-06 18:18:48 +00:00
Nathan Whitehorn	57bd5cce62	Reduce the frequency that the PowerPC/AIM pmaps invalidate instruction caches, by invalidating kernel icaches only when needed and not flushing user caches for shared pages. Suggested by: kib MFC after: 2 weeks	2012-04-06 16:03:38 +00:00
Nathan Whitehorn	629e40e45e	Give the kernel pmap lock a different name than user pmap locks. It has (slightly) different semantics and renaming it prevents a (harmless) WITNESS warning during bootup for 32-bit kernels on 64-bit CPUs. MFC after: 5 days	2012-04-06 16:00:37 +00:00
John Baldwin	1f22be4547	- Rename VM_MEMATTR_UNCACHED to VM_MEMATTR_WEAK_UNCACHEABLE on x86 to be less ambiguous and more clearly identify what it means. This attribute is what Intel refers to as UC-, and its only difference relative to normal UC memory is that a WC MTRR will override a UC- PAT entry causing the memory to be treated as WC, whereas a UC PAT entry will always override the MTRR. - Remove the VM_MEMATTR_UNCACHED alias from powerpc.	2012-03-29 16:51:22 +00:00
Nathan Whitehorn	13b5e92e01	Allow multiple inclusion of trap.h. This has always been broken, but until recently never caused problems.	2012-03-29 02:02:14 +00:00
Fabien Thomas	f5f9340b98	Add software PMC support. New kernel events can be added at various locations for sampling or counting. This will, for example, allow easy system profiling on any processor with known tools like pmcstat(8). Simultaneous usage of software PMC and hardware PMC is possible, for example looking at lock acquire failures and page faults while sampling on instructions. Sponsored by: NETASQ MFC after: 1 month	2012-03-28 20:58:30 +00:00
Nathan Whitehorn	7e55df27cb	More PMAP performance improvements: skip 256 MB segments entirely if they are not mapped during ranged operations and reduce the scope of the tlbie lock only to the actual tlbie instruction instead of the entire sequence. There are a few more optimization possibilities here as well.	2012-03-28 17:25:29 +00:00
Nathan Whitehorn	a3e9e259b3	Make sure to call vm_page_dirty() before the pmap lock is released to prevent a race where another process could conclude the page was clean. Submitted by: alc	2012-03-27 01:26:00 +00:00
Nathan Whitehorn	5afcb4c91e	More PMAP concurrency improvements: replace the table lock and (almost) all uses of the page queues mutex with a new rwlock that protects the page table and the PV lists. This reduces system time during a parallel buildworld by 35%. Reviewed by: alc	2012-03-27 01:24:18 +00:00
Nathan Whitehorn	e71dfa7b84	More PMAP performance improvements: on powerpc64, when TLBIE can be run with exceptions enabled, leave them enabled and use a regular mutex to guard TLB invalidations instead of a spinlock.	2012-03-25 06:01:34 +00:00
Nathan Whitehorn	d456d3e31f	Only call vm_page_dirty() on pages that are writable in order not to confuse the VM.	2012-03-24 22:32:19 +00:00
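
Commit f6703dd295 above describes sorting the per-bank memory ranges and combining them into the fewest possible contiguous regions before programming LAWs or TLB entries. The following is a minimal sketch of that idea only, not the code from the lbc driver. The names struct mem_range, range_cmp() and merge_ranges() are hypothetical, and the sketch sorts once up front rather than sorting as it builds. It also ignores the alignment and size constraints that real LAWs and TLB entries impose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical type: one per-bank memory range as it might come from the FDT. */
struct mem_range {
	uint64_t start;	/* first address of the range */
	uint64_t size;	/* length in bytes */
};

/* qsort comparator: order ranges by ascending start address. */
static int
range_cmp(const void *a, const void *b)
{
	const struct mem_range *ra = a, *rb = b;

	if (ra->start < rb->start)
		return (-1);
	return (ra->start > rb->start);
}

/*
 * Sort the ranges in place, then merge neighbours that touch or overlap.
 * Returns the number of contiguous regions left at the front of the array.
 */
static size_t
merge_ranges(struct mem_range *r, size_t n)
{
	uint64_t cur_end, end;
	size_t i, out;

	assert(r != NULL || n == 0);
	if (n == 0)
		return (0);
	qsort(r, n, sizeof(*r), range_cmp);
	out = 0;
	for (i = 1; i < n; i++) {
		cur_end = r[out].start + r[out].size;
		if (r[i].start <= cur_end) {
			/* Touching or overlapping: grow the current region. */
			end = r[i].start + r[i].size;
			if (end > cur_end)
				r[out].size = end - r[out].start;
		} else {
			/* Gap: start a new region. */
			r[++out] = r[i];
		}
	}
	return (out + 1);
}
```

Under these assumptions, three adjacent 1 MB banks listed in any order collapse into a single 3 MB region, so one window could cover all of them, while a bank at an unrelated address stays a separate region.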

1900 Commits