Commit Graph

262 Commits

Author SHA1 Message Date
Warner Losh
dfb7d4cdef Merge definitions for ARM9E, ARM10 and ARM11 processors from p4 (which
got them from NetBSD).
2007-10-18 05:06:58 +00:00
Olivier Houchard
258f866cbf Define _ARM_ARCH_5E too, so that we know if pld/strd/ldrd are available.
MFC After:	3 days
2007-10-13 12:04:10 +00:00
Alan Cox
7bfda801a8 Change the management of cached pages (PQ_CACHE) in two fundamental
ways:

(1) Cached pages are no longer kept in the object's resident page
splay tree and memq.  Instead, they are kept in a separate per-object
splay tree of cached pages.  However, access to this new per-object
splay tree is synchronized by the _free_ page queues lock, not to be
confused with the heavily contended page queues lock.  Consequently, a
cached page can be reclaimed by vm_page_alloc(9) without acquiring the
object's lock or the page queues lock.

This solves a problem independently reported by tegge@ and Isilon.
Specifically, they observed the page daemon consuming a great deal of
CPU time because of pages bouncing back and forth between the cache
queue (PQ_CACHE) and the inactive queue (PQ_INACTIVE).  The source of
this problem turned out to be a deadlock avoidance strategy employed
when selecting a cached page to reclaim in vm_page_select_cache().
However, the root cause was really that reclaiming a cached page
required the acquisition of an object lock while the page queues lock
was already held.  Thus, this change addresses the problem at its
root, by eliminating the need to acquire the object's lock.

Moreover, keeping cached pages in the object's primary splay tree and
memq was, in effect, optimizing for the uncommon case.  Cached pages
are reclaimed far, far more often than they are reactivated.  Instead,
this change makes reclamation cheaper, especially in terms of
synchronization overhead, and reactivation more expensive, because
reactivated pages will have to be reentered into the object's primary
splay tree and memq.

(2) Cached pages are now stored alongside free pages in the physical
memory allocator's buddy queues, increasing the likelihood that large
allocations of contiguous physical memory (i.e., superpages) will
succeed.

Finally, as a result of this change long-standing restrictions on when
and where a cached page can be reclaimed and returned by
vm_page_alloc(9) are eliminated.  Specifically, calls to
vm_page_alloc(9) specifying VM_ALLOC_INTERRUPT can now reclaim and
return a formerly cached page.  Consequently, a call to malloc(9)
specifying M_NOWAIT is less likely to fail.

Discussed with: many over the course of the summer, including jeff@,
   Justin Husted @ Isilon, peter@, tegge@
Tested by: an earlier version by kris@
Approved by: re (kensmith)
2007-09-25 06:25:06 +00:00
Olivier Houchard
75f66155bf Twist the RAS logic a bit to avoid branching.
MFC After:	1 week
Approved by:	re (blanket)
2007-09-22 14:23:52 +00:00
Olivier Houchard
4168e66b1f In __bswap16_var(), make sure the 16 upper bits are cleared; while
optimizing, gcc4 doesn't always do so.

Reported by:	Nathan Whitehorn
Approved by:	re (blanket)
2007-09-09 11:58:38 +00:00
Olivier Houchard
5f78cb4a35 XScale core 3 definitions.
Approved by:	re (blanket)
2007-07-27 14:54:27 +00:00
Olivier Houchard
e905513c06 Fix the cache mode description.
Approved by:	re (blanket)
2007-07-27 14:45:33 +00:00
Olivier Houchard
b4db6fd942 Properly handle supersections.
Make sure we cache entries in the L2 cache.

Approved by:	re (blanket)
2007-07-27 14:45:04 +00:00
Olivier Houchard
425b5be335 Add a new set of functions to handle L2 cache. Make them no-op for every
CPU except Xscale core 3.

Approved by:	re (blanket)
2007-07-27 14:39:41 +00:00
Olivier Houchard
d076bcf203 The iop34x has 128 interrupts. 2007-06-16 15:03:33 +00:00
Olivier Houchard
10d8c18005 Introduce pmap_kenter_supersection(), which maps 16MB super-sections into
the kernel pmap.
Document a bit more the behavior of the xscale core 3.
2007-06-11 21:29:26 +00:00
Marcel Moolenaar
01bd17cc99 Add kdb_cpu_sync_icache(), intended to synchronize instruction
caches with data caches after writing to memory. This typically
is required to make breakpoints work on ia64 and powerpc. For
those architectures the function is implemented.
2007-06-09 21:55:17 +00:00
Jeff Roberson
4736604759 - PCPU_ADD is no longer spelled with LAZY_ in the middle.
Submitted by:	attilio
2007-06-06 23:23:47 +00:00
Attilio Rao
6759608248 Rework the PCPU_* (MD) interface:
- Rename PCPU_LAZY_INC into PCPU_INC
- Add the PCPU_ADD interface which just does an add on the pcpu member
  given a specific value.

Note that for most architectures PCPU_INC and PCPU_ADD are not safe.
This is a point that needs some discussions/work in the next days.

Reviewed by: alc, bde
Approved by: jeff (mentor)
2007-06-04 21:38:48 +00:00
Alan Cox
9211deca08 Add the machine-specific definitions for configuring the new physical
memory allocator.

Approved by:	re
2007-06-04 08:02:22 +00:00
Alan Cox
66ab556097 Eliminate some unused definitions that came from NetBSD. 2007-05-28 21:04:22 +00:00
Olivier Houchard
705fda849d Use __mcount() instead of _mcount() to reduce diffs with NetBSD. 2007-05-19 16:20:37 +00:00
Olivier Houchard
fe85f6cee8 Switch the kernel's pmap domain from 15 to 0.
This should be a no-op, and this is needed for xscale core 3 supersections
support, as they are always part of the domain 0
2007-05-19 12:47:34 +00:00
Alan Cox
04a18977c8 Define every architecture as either VM_PHYSSEG_DENSE or
VM_PHYSSEG_SPARSE depending on whether the physical address space is
densely or sparsely populated with memory.  The effect of this
definition is to determine which of two implementations of
vm_page_array and PHYS_TO_VM_PAGE() is used.  The legacy
implementation is obtained by defining VM_PHYSSEG_DENSE, and a new
implementation that trades off time for space is obtained by defining
VM_PHYSSEG_SPARSE.  For now, all architectures except for ia64 and
sparc64 define VM_PHYSSEG_DENSE.  Defining VM_PHYSSEG_SPARSE on ia64
allows the entirety of my Itanium 2's memory to be used.  Previously,
only the first 1 GB could be used.  Defining VM_PHYSSEG_SPARSE on
sparc64 allows USIIIi-based systems to boot without crashing.

This change is a combination of Nathan Whitehorn's patch and my own
work in perforce.

Discussed with: kmacy, marius, Nathan Whitehorn
PR:		112194
2007-05-05 19:50:28 +00:00
Kevin Lo
4eaa43e6f4 Remove __P 2007-03-21 03:28:16 +00:00
Alan Cox
c640357f04 Push down the implementation of PCPU_LAZY_INC() into the machine-dependent
header file.  Reimplement PCPU_LAZY_INC() on amd64 and i386 making it
atomic with respect to interrupts.

Reviewed by: bde, jhb
2007-03-11 05:54:29 +00:00
Paolo Pisati
ef544f6312 o break newbus api: add a new argument of type driver_filter_t to
bus_setup_intr()

o add an int return code to all fast handlers

o retire INTR_FAST/IH_FAST

For more info: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=465712+0+current/freebsd-current

Reviewed by: many
Approved by: re@
2007-02-23 12:19:07 +00:00
Olivier Houchard
47010239a8 - Add bounce pages for arm, largely based on the i386 implementation.
- Add a default parent dma tag, similar to what has been done for sparc64.
- Before invalidating the dcache in POSTREAD, save the bits which are in the
same cachelines than our buffers, but not part of it, and restore them after
the invalidation.
2007-01-17 00:53:05 +00:00
Bernd Walter
69b40f4db3 MFp4: Add missing atomic functions
Based on a patch by: des
2007-01-05 02:50:27 +00:00
Olivier Houchard
2feb83cec2 Introduce CPU_XSCALE_CORE3, as XScale Core 3 is significally different than
regular Xscale (it has no mini data cache, has armv6-style 16MB
supersections, and can address 36bits).
Define it for i81342.
2006-11-30 23:30:40 +00:00
Sam Leffler
588a2322a9 correct bus space unmap prototype
Reviewed by:	cognet, imp
MFC after:	1 month
2006-11-19 23:46:50 +00:00
Ruslan Ermilov
26af9ac7d0 Fix a comment. 2006-11-13 06:26:57 +00:00
Alan Cox
cc0d48ffb6 Eliminate unused global variables. 2006-11-11 20:57:52 +00:00
Olivier Houchard
676b1fbdbf Identify the xscale 81342. 2006-11-07 22:36:57 +00:00
Olivier Houchard
2c7b82c9dd Add atomic_cmpset_acq_32. 2006-11-07 11:53:44 +00:00
John Birrell
6825d60738 PR:
Submitted by:
Reviewed by:
Approved by:
Obtained from:
MFC after:
Security:
Move the relocation definitions to the common elf header so that DTrace
can use them on one architecture targeted to a different one.

Add the additional ELF types defines in Sun's "Linker and Libraries"
manual.
2006-10-04 21:37:10 +00:00
Poul-Henning Kamp
f645b0b51c First part of a little cleanup in the calendar/timezone/RTC handling.
Move relevant variables to <sys/clock.h> and fix #includes as necessary.

Use libkern's much more time- & spamce-efficient BCD routines.
2006-10-02 12:59:59 +00:00
Alexander Kabaev
d9cb97ff9d Use __builtin_va_start instead of __builtin_stdarg_start. GCC4 obsoletes
the former and  __builtin_va_start was present in all GCC version 3.1 and
later.
2006-09-21 01:37:02 +00:00
Olivier Houchard
4731df1ee7 Remove dead code, already defined in sys/cdef.h
Spotted out by:	bde
2006-08-30 11:45:07 +00:00
Alan Cox
b554f899bd Eliminate unused definitions. (They came from NetBSD.)
Discussed with: cognet, grehan, marcel
2006-08-25 23:51:11 +00:00
Olivier Houchard
11d1528ce0 Finally bring it support for the i80219 XScale processor.
Submitted by:	Max M. Boyarov <m.boyarov bsd by>
2006-08-24 23:51:28 +00:00
Olivier Houchard
ba282be9f3 Use ELFDATA2MSB if we're building big endian.
Noticed by:	Oleksandr Tymoshenko <gonzo freebsd org>
2006-08-24 23:00:03 +00:00
Olivier Houchard
49953e11d7 Rewrite ARM_USE_SMALL_ALLOC so that instead of the current behavior, it maps
whole the physical memory, cached, using 1MB section mappings. This reduces
the address space available for user processes a bit, but given the amount of
memory a typical arm machine has, it is not (yet) a big issue.
It then provides a uma_small_alloc() that works as it does for architectures
which have a direct mapping.
2006-08-08 20:59:38 +00:00
Olivier Houchard
9c8cab3814 Define BYTE_MSF if we're compiling a big endian kernel, so that DDB can
correctly disassemble instructions on big endian.
2006-07-27 11:41:37 +00:00
Olivier Houchard
be050429a3 Add remote GDB bits for arm. 2006-07-14 00:50:51 +00:00
Alan Cox
ed48a217f6 Add partial pmap locking.
Eliminate the unused allpmaps list.

Tested by: cognet@
2006-06-06 04:32:20 +00:00
Olivier Houchard
b2adc703fd Don't #error if no CPU is defined but we're not compiling the kernel. 2006-06-02 09:39:06 +00:00
Olivier Houchard
27b45ae819 Don't enable the FIQ in enable_interrupts() if F32_bit is not specified.
This has been committed by mistake.

Reported by:	ssouhlal
2006-06-01 16:17:44 +00:00
Olivier Houchard
c712f1ef5b Ooops arm10 is armv5, not armv4.
Submitted by:	kevlo
2006-05-31 13:06:08 +00:00
Olivier Houchard
87adbb81cc Include machine/cpuconf.h in pmap.h in order to get ARM_NMMUS defined,
to appease -Wundef.
2006-05-31 11:57:37 +00:00
Olivier Houchard
ec21307611 Add definitions for atomic_subtract_rel_32, atomic_add_rel_32 and
atomic_load_acq_32, needed for hwpmc.
2006-05-15 13:08:12 +00:00
Olivier Houchard
ef4d5877dd Switch to a 64bit time_t, while it's not a big problem to do so.
Suggested by:	imp
2006-05-15 00:17:27 +00:00
Olivier Houchard
d5d776c16b Resurrect Skyeye support :
Add a new option, SKYEYE_WORKAROUNDS, which as the name suggests adds
workarounds for things skyeye doesn't simulate. Specifically :
- Use USART0 instead of DBGU as the console, make it not use DMA, and           manually provoke an interrupt when we're done in the transmit function.
- Skyeye maintains an internal counter for clock, but apparently there's
no way to access it, so hack the timecounter code to return a value which
is increased at every clock interrupts. This is gross, but I didn't find a
better way to implement timecounters without hacking Skyeye to get the
counter value.
- Force the write-back of PTEs once we're done writing them, even if they
are supposed to be write-through. I don't know why I have to do that.
2006-05-13 23:41:16 +00:00
Poul-Henning Kamp
5405ab4889 Clean out sysctl machdep.* related defines.
The cmos clock related stuff should really be in MI code.
2006-05-11 17:29:25 +00:00
Olivier Houchard
b8986f5675 Disable/enable fiqs as well as irqs. 2006-04-13 14:25:28 +00:00
Olivier Houchard
174329aff2 MFp4: Don't write-back the PTEs if they are mapped write-through, this was
apparently only needed because skyeye has bugs in its cache emulation.
2006-04-09 20:03:03 +00:00
Olivier Houchard
c0e239dead MFp4: Forget the asm inlined version of in_cksum_hdr(). It doesn't work if
the pointer is unaligned, and it just doesn't worth it.
2006-03-09 23:33:59 +00:00
Olivier Houchard
2456c0ea88 Try to honor BUS_DMA_COHERENT : if the flag is set, normally allocate memory
with malloc() or contigmalloc() as usual, but try to re-map the allocated
memory into a VA outside the KVA, non-cached, thus making the calls to
bus_dmamap_sync() for these buffers useless.
2006-03-01 23:04:25 +00:00
Olivier Houchard
123f34932c Use memory clobbers, to be on the safe side.
Suggested by:	jhb
2006-02-06 18:29:05 +00:00
Olivier Houchard
697e7cb715 Backout rev 1.12. It would have been a good thing, if gcc was smart enough
not to generate bad code.
2006-02-05 22:06:12 +00:00
Warner Losh
d5e61c97a6 By popular demand, move __HAVE_ACPI and __PCI_REROUTE_INTERRUPT into
param.h.  Per request, I've placed these just after the
_NO_NAMESPACE_POLLUTION ifndef.  I've not renamed anything yet, but
may since we don't need the __.

Submitted by: bde, jhb, scottl, many others.
2006-01-09 06:05:57 +00:00
Warner Losh
501755f4f6 Define __HAVE_ACPI and/or __PCI_REROUTE_INTERRUPT, as appropriate for
each platform.  These will be used in the pci code in preference to
the complicated #ifdefs we have there now.
2006-01-01 20:59:28 +00:00
John Baldwin
b439e431bf Tweak how the MD code calls the fooclock() methods some. Instead of
passing a pointer to an opaque clockframe structure and requiring the
MD code to supply CLKF_FOO() macros to extract needed values out of the
opaque structure, just pass the needed values directly.  In practice this
means passing the pair (usermode, pc) to hardclock() and profclock() and
passing the boolean (usermode) to hardclock_cpu() and hardclock_process().
Other details:
- Axe clockframe and CLKF_FOO() macros on all architectures.  Basically,
  all the archs were taking a trapframe and converting it into a clockframe
  one way or another.  Now they can just extract the PC and usermode values
  directly out of the trapframe and pass it to fooclock().
- Renamed hardclock_process() to hardclock_cpu() as the latter is more
  accurate.
- On Alpha, we now run profclock() at hz (profhz == hz) rather than at
  the slower stathz.
- On Alpha, for the TurboLaser machines that don't have an 8254
  timecounter, call hardclock() directly.  This removes an extra
  conditional check from every clock interrupt on Alpha on the BSP.
  There is probably room for even further pruning here by changing Alpha
  to use the simplified timecounter we use on x86 with the lapic timer
  since we don't get interrupts from the 8254 on Alpha anyway.
- On x86, clkintr() shouldn't ever be called now unless using_lapic_timer
  is false, so add a KASSERT() to that affect and remove a condition
  to slightly optimize the non-lapic case.
- Change prototypeof  arm_handler_execute() so that it's first arg is a
  trapframe pointer rather than a void pointer for clarity.
- Use KCOUNT macro in profclock() to lookup the kernel profiling bucket.

Tested on:	alpha, amd64, arm, i386, ia64, sparc64
Reviewed by:	bde (mostly)
2005-12-22 22:16:09 +00:00
Olivier Houchard
b34658e8a9 A #define is not enough, we need to cast from u_long * to uint32_t *. 2005-12-09 22:58:07 +00:00
Olivier Houchard
858b811f34 Define atomic_whatever_long 2005-12-09 22:33:20 +00:00
Ruslan Ermilov
224d140293 Drop _MACHINE_ARCH and _MACHINE defines (not to be confused with
MACHINE_ARCH and MACHINE).  Their purpose was to be able to test
in cpp(1), but cpp(1) only understands integer type expressions.
Using such unsupported expressions introduced a number of subtle
bugs, which were discovered by compiling with -Wundef.
2005-12-06 13:27:21 +00:00
Olivier Houchard
ce4210d673 Use a magic number to know we were started from the elf wrapper.
Add a dummy _start function to make the non-elf version of the wrapper work.
2005-11-24 02:27:55 +00:00
Olivier Houchard
94d8cf9916 Force pmap to write-back the pte cacheline after each pte modification,
even if the pte is supposed to be cached in write through mode (might be a
skyeye bug, I'll have to check).
2005-11-21 19:10:44 +00:00
Olivier Houchard
f9126cfb8f Add an alternate ID for the arm920t (the real solution is to have
per-cpu class masks, but oh well).
2005-11-21 19:06:25 +00:00
Olivier Houchard
9e581686a3 There's no need to include <machine/asmacros.h> here. 2005-11-08 13:01:29 +00:00
Olivier Houchard
812779897c MFi386 rev 1.536 (sort of)
Move what can be moved (UMA zones creation, pv_entry_* initialization) from
pmap_init2() to pmap_init().
Create a new function, pmap_postinit(), called from cpu_startup(), to do the
L1 tables allocation.
pmap_init2() is now empty for arm as well.
2005-11-06 16:10:28 +00:00
John Baldwin
21aa010bb5 Whitespace. 2005-10-14 18:36:49 +00:00
John Baldwin
43e2ef2bb6 Change the userland atomic operations on arm to use memory operands for
the modified memory rather than using register operands that held a pointer
to the memory.  The biggest effect is that we now correctly tell the
compiler that these functions change the memory that these functions
modify.

Reviewed by:	cognet
2005-10-14 18:07:45 +00:00
Olivier Houchard
db7db23dd8 dump_avail has nothing to do with ARM_USE_SMALL_ALLOC, so move its
declaration out of the #ifdef.
2005-10-04 16:29:31 +00:00
Olivier Houchard
b834efd591 Provide a dump_avail[] variable, which contains the page ranges to be
dumped.

For iq31244_machdep.c, attempt to recognize hints provided by the elf
trampoline.
2005-10-03 14:15:50 +00:00
Olivier Houchard
0122bd1470 Add a new API to let platform-specific ports provide functions for big
copy/zeroing.
2005-10-03 14:12:10 +00:00
Olivier Houchard
93d18f4760 asm versions of in_cksum_hdr() and in_pseudo(). 2005-10-03 14:06:44 +00:00
John Baldwin
3c2bc2bf26 Add a new atomic_fetchadd() primitive that atomically adds a value to a
variable and returns the previous value of the variable.

Tested on:	i386, alpha, sparc64, arm (cognet)
Reviewed by:	arch@
Submitted by:	cognet (arm)
MFC after:	1 week
2005-09-27 17:39:11 +00:00
Stefan Farfeleder
a1f85d7f83 Move MINSIGSTKSZ from <machine/signal.h> to <machine/_limits.h> and rename
it to __MINSIGSTKSZ.  Define MINSIGSTKSZ in <sys/signal.h>.

This is done in order to use MINSIGSTKSZ for the macro PTHREAD_STACK_MIN
in <pthread.h> (soon <limits.h>) without having to include the whole
<sys/signal.h> header.

Discussed with:		bde
2005-08-20 16:44:41 +00:00
Warner Losh
95e4208ebf msdosfs_conv.c references cmos_wall_clock and adjkerntz. Since these
are 0 for arm, define them as such to make msdosfs_conv.c compile
again on arm.
2005-07-27 21:19:28 +00:00
John Baldwin
d9610574a2 Add extra constraints to tell the compiler that the memory be modified
in the arm __swp() and sparc64 casa() and casax() functions is actually
being used as an input and output and not just the value of the register
that points to the memory location.  This was the underlying source of
the mbuf refcount problems on sparc64 a while back.  For arm this should be
a nop because __swp() has a constraint to clobber all memory which can
probably be removed now.

Reviewed by:	alc, cognet
MFC after:	1 week
2005-07-27 20:01:45 +00:00
John Baldwin
e11fe02dfb Use a + constraint modifier for a register arg in __bswap16_var().
Reviewed by:	cognet
2005-07-27 19:59:21 +00:00
John Baldwin
122eceef61 Convert the atomic_ptr() operations over to operating on uintptr_t
variables rather than void * variables.  This makes it easier and simpler
to get asm constraints and volatile keywords correct.

MFC after:	3 days
Tested on:	i386, alpha, sparc64
Compiled on:	ia64, powerpc, amd64
Kernel toolchain busted on:	arm
2005-07-15 18:17:59 +00:00
John Baldwin
dc802c0628 Fix a typo.
Approved by:	re (scottl)
2005-06-23 21:54:17 +00:00
Joseph Koshy
f263522a45 MFP4:
- Implement sampling modes and logging support in hwpmc(4).

- Separate MI and MD parts of hwpmc(4) and allow sharing of
  PMC implementations across different architectures.
  Add support for P4 (EMT64) style PMCs to the amd64 code.

- New pmcstat(8) options: -E (exit time counts) -W (counts
  every context switch), -R (print log file).

- pmc(3) API changes, improve our ability to keep ABI compatibility
  in the future.  Add more 'alias' names for commonly used events.

- bug fixes & documentation.
2005-06-09 19:45:09 +00:00
Olivier Houchard
f60e923b23 - MFp4: modify slightly the arm intr API, there's arm CPUs with more than 32
interrupts.
- Implement teardown methods where appropriate.
2005-06-09 12:26:20 +00:00
Olivier Houchard
56e472e2b5 Add a new arm-specific option, ARM_USE_SMALL_ALLOC. If defined, it provides
an implementation of uma_small_alloc() which tries to preallocate memory
1MB per 1MB, and maps it into a section mapping.
2005-06-07 23:04:24 +00:00
Olivier Houchard
094df9739b Bring in bits I forgot while importing write back support for arm9. 2005-06-03 19:49:53 +00:00
Yoshihiro Takahashi
d4fcf3cba5 Remove bus_{mem,p}io.h and related code for a micro-optimization on i386
and amd64.  The optimization is a trivial on recent machines.

Reviewed by:	-arch (imp, marcel, dfr)
2005-05-29 04:42:30 +00:00
Olivier Houchard
e59bc6b04e s/_KLD_MODULE/KLD_MODULE/ 2005-05-26 16:05:22 +00:00
Olivier Houchard
08a94fbcf9 Remove bits specific to CPUs we won't support (< armv4). 2005-05-25 13:46:32 +00:00
Olivier Houchard
0f18d3256d Use asm versions of in_cksum() and friends. 2005-05-24 21:44:34 +00:00
Olivier Houchard
fdc05f7913 Asm version of bswap16().
Obtained from: 	NetBSD
2005-05-24 21:43:16 +00:00
Olivier Houchard
fa7e20fdd4 Make sure we clean the RAS start address once we're done.
This fixes the random segfaults which occurs at high interrupts rate.
2005-05-24 21:42:31 +00:00
Marcel Moolenaar
ff7125a623 Add empty header (except of the multiple-inclusion protection) to
get hwpmc(4) to compile on this platform.
2005-04-20 18:44:53 +00:00
Warner Losh
06db52b609 Break out the definition of bus_space_{tag,handle}_t and a few other types
into _bus.h to help with name space polution from including all of bus.h.
In a few days, I'll commit changes to the MI code to take advantage of thse
sepration (after I've made sure that these changes don't break anything in
the main tree, I've tested in my trees, but you never know...).

Suggested by: bde (in 2002 or 2003 I think)
Reviewed in principle by: jhb
2005-04-18 21:45:34 +00:00
Olivier Houchard
2d93998b00 Import a basic implementation of the restartable atomic sequences to provide
atomic operations to userland (this is OK for UP only, but SMP is still so
far away).
2005-04-07 22:03:04 +00:00
Olivier Houchard
139e3f7c33 - Try harder to report dirty page.
- Garbage-collect pmap_update(), it became quite useless.
2005-04-07 22:01:53 +00:00
John Baldwin
c6a37e8413 Divorce critical sections from spinlocks. Critical sections as denoted by
critical_enter() and critical_exit() are now solely a mechanism for
deferring kernel preemptions.  They no longer have any affect on
interrupts.  This means that standalone critical sections are now very
cheap as they are simply unlocked integer increments and decrements for the
common case.

Spin mutexes now use a separate KPI implemented in MD code: spinlock_enter()
and spinlock_exit().  This KPI is responsible for providing whatever MD
guarantees are needed to ensure that a thread holding a spin lock won't
be preempted by any other code that will try to lock the same lock.  For
now all archs continue to block interrupts in a "spinlock section" as they
did formerly in all critical sections.  Note that I've also taken this
opportunity to push a few things into MD code rather than MI.  For example,
critical_fork_exit() no longer exists.  Instead, MD code ensures that new
threads have the correct state when they are created.  Also, we no longer
try to fixup the idlethreads for APs in MI code.  Instead, each arch sets
the initial curthread and adjusts the state of the idle thread it borrows
in order to perform the initial context switch.

This change is largely a big NOP, but the cleaner separation it provides
will allow for more efficient alternative locking schemes in other parts
of the kernel (bare critical sections rather than per-CPU spin mutexes
for per-CPU data for example).

Reviewed by:	grehan, cognet, arch@, others
Tested on:	i386, alpha, sparc64, powerpc, arm, possibly more
2005-04-04 21:53:56 +00:00
Olivier Houchard
7fc53c7b12 Bring in a version of float.h more correct for softfloat. 2005-03-20 00:34:24 +00:00
Scott Long
5974e5c71c Refactor the bus_dma header files so that the interface is described in
sys/bus_dma.h instead of being copied in every single arch.  This slightly
reorders a flag that was specific to AXP and thus changes the ABI there.
The interface still relies on bus_space definitions found in <machine/bus.h>
so it cannot be included on its own yet, but that will be fixed at a later
date.  Add an MD <machine/bus_dma.h> for ever arch for consistency and to
allow for future MD augmentation of the API.  sparc64 makes heavy use of
this right now due to its different bus_dma implemenation.
2005-03-14 16:46:28 +00:00
Joerg Wunsch
a5f50ef9e4 netchild's mega-patch to isolate compiler dependencies into a central
place.

This moves the dependency on GCC's and other compiler's features into
the central sys/cdefs.h file, while the individual source files can
then refer to #ifdef __COMPILER_FEATURE_FOO where they by now used to
refer to #if __GNUC__ > 3.1415 && __BARC__ <= 42.

By now, GCC and ICC (the Intel compiler) have been actively tested on
IA32 platforms by netchild.  Extension to other compilers is supposed
to be possible, of course.

Submitted by:	netchild
Reviewed by:	various developers on arch@, some time ago
2005-03-02 21:33:29 +00:00
Olivier Houchard
f4c01f1508 Instead of using sysarch() to store-retrieve the tp, add a magic address,
ARM_TP_ADDRESS, where the tp will be stored. On CPUs that support it, a cache
line will be allocated and locked for this address, so that it will never go
to RAM. On CPUs that does not, a page is allocated for it (it will be a bit
slower, and is wrong for SMP, but should be fine for UP).
The tp is still stored in the mdthread struct, and at each context switch,
ARM_TP_ADDRESS gets updated.

Suggested by:   davidxu
2005-02-26 18:59:01 +00:00
Olivier Houchard
b6e4194946 Add the field in the md part of the struct thread required by ARM_[GET|SET]_TP. 2005-02-26 00:02:14 +00:00
Olivier Houchard
a74985cdd4 Implement two new sysarch for arm, ARM_GET_TP and ARM_SET_TP, to work around
the lack of tls on arm.
2005-02-25 22:56:16 +00:00