Commit Graph

1485 Commits

Author SHA1 Message Date
Alan Cox
65336314cf In get_pv_entry() use PMAP_LOCK() instead of PMAP_TRYLOCK() when deadlock
cannot possibly occur.
2005-11-13 02:17:05 +00:00
Alan Cox
7a35a21e7b Reimplement the reclamation of PV entries. Specifically, perform
reclamation synchronously from get_pv_entry() instead of
asynchronously as part of the page daemon.  Additionally, limit the
reclamation to inactive pages unless allocation from the PV entry zone
or reclamation from the inactive queue fails.  Previously, reclamation
destroyed mappings to both inactive and active pages.  get_pv_entry()
still, however, wakes up the page daemon when reclamation occurs.  The
reason being that the page daemon may move some pages from the active
queue to the inactive queue, making some new pages available to future
reclamations.

Print the "reclaiming PV entries" message at most once per minute, but
don't stop printing it after the fifth time.  This way, we do not give
the impression that the problem has gone away.

Reviewed by: tegge
2005-11-09 08:19:21 +00:00
Alan Cox
e9cb1037da Begin and end the initialization of pvzone in pmap_init().
Previously, pvzone's initialization was split between pmap_init() and
pmap_init2().  This split initialization was the underlying cause of
some UMA panics during initialization.  Specifically, if the UMA boot
pages was exhausted before the pvzone was fully initialized, then UMA,
through no fault of its own, would use an inappropriate back-end
allocator leading to a panic.  (Previously, as a workaround, we have
increased the UMA boot pages.)  Fortunately, there is no longer any
reason that pvzone's initialization cannot be completed in
pmap_init().

Eliminate a check for whether pv_entry_high_water has been initialized
or not from get_pv_entry().  Since pvzone's initialization is
completed in pmap_init(), this check is no longer needed.

Use cnt.v_page_count, the actual count of available physical pages,
instead of vm_page_array_size to compute the maximum number of pv
entries.

Introduce the vm.pmap.pv_entries tunable on alpha and ia64.

Eliminate some unnecessary white space.

Discussed with: tegge (item #1)
Tested by: marcel (ia64)
2005-11-04 18:03:24 +00:00
Alan Cox
fcf67b0496 Remove the remaining spl*() calls. Add some assertions. Eliminate some
excessive white space.
2005-11-03 07:51:02 +00:00
Robert Watson
5bb84bc84b Normalize a significant number of kernel malloc type names:
- Prefer '_' to ' ', as it results in more easily parsed results in
  memory monitoring tools such as vmstat.

- Remove punctuation that is incompatible with using memory type names
  as file names, such as '/' characters.

- Disambiguate some collisions by adding subsystem prefixes to some
  memory types.

- Generally prefer lower case to upper case.

- If the same type is defined in multiple architecture directories,
  attempt to use the same name in additional cases.

Not all instances were caught in this change, so more work is required to
finish this conversion.  Similar changes are required for UMA zone names.
2005-10-31 15:41:29 +00:00
Marcel Moolenaar
6739824a02 Remove a stray return statement in the interrupt dispatch function
that caused a premature exit after calling a fast interrupt handler
and bypassing a much needed critical_exit() and the scheduling of
the interrupt thread for non-fast handlers. In short: unbreak :-)
2005-10-30 17:23:01 +00:00
John Baldwin
e0f66ef861 Reorganize the interrupt handling code a bit to make a few things cleaner
and increase flexibility to allow various different approaches to be tried
in the future.
- Split struct ithd up into two pieces.  struct intr_event holds the list
  of interrupt handlers associated with interrupt sources.
  struct intr_thread contains the data relative to an interrupt thread.
  Currently we still provide a 1:1 relationship of events to threads
  with the exception that events only have an associated thread if there
  is at least one threaded interrupt handler attached to the event.  This
  means that on x86 we no longer have 4 bazillion interrupt threads with
  no handlers.  It also means that interrupt events with only INTR_FAST
  handlers no longer have an associated thread either.
- Renamed struct intrhand to struct intr_handler to follow the struct
  intr_foo naming convention.  This did require renaming the powerpc
  MD struct intr_handler to struct ppc_intr_handler.
- INTR_FAST no longer implies INTR_EXCL on all architectures except for
  powerpc.  This means that multiple INTR_FAST handlers can attach to the
  same interrupt and that INTR_FAST and non-INTR_FAST handlers can attach
  to the same interrupt.  Sharing INTR_FAST handlers may not always be
  desirable, but having sio(4) and uhci(4) fight over an IRQ isn't fun
  either.  Drivers can always still use INTR_EXCL to ask for an interrupt
  exclusively.  The way this sharing works is that when an interrupt
  comes in, all the INTR_FAST handlers are executed first, and if any
  threaded handlers exist, the interrupt thread is scheduled afterwards.
  This type of layout also makes it possible to investigate using interrupt
  filters ala OS X where the filter determines whether or not its companion
  threaded handler should run.
- Aside from the INTR_FAST changes above, the impact on MD interrupt code
  is mostly just 's/ithread/intr_event/'.
- A new MI ddb command 'show intrs' walks the list of interrupt events
  dumping their state.  It also has a '/v' verbose switch which dumps
  info about all of the handlers attached to each event.
- We currently don't destroy an interrupt thread when the last threaded
  handler is removed because it would suck for things like ppbus(8)'s
  braindead behavior.  The code is present, though, it is just under
  #if 0 for now.
- Move the code to actually execute the threaded handlers for an interrrupt
  event into a separate function so that ithread_loop() becomes more
  readable.  Previously this code was all in the middle of ithread_loop()
  and indented halfway across the screen.
- Made struct intr_thread private to kern_intr.c and replaced td_ithd
  with a thread private flag TDP_ITHREAD.
- In statclock, check curthread against idlethread directly rather than
  curthread's proc against idlethread's proc. (Not really related to intr
  changes)

Tested on:	alpha, amd64, i386, sparc64
Tested on:	arm, ia64 (older version of patch by cognet and marcel)
2005-10-25 19:48:48 +00:00
Ade Lovett
8d228514fb Specifically panic() in the case where pmap_insert_entry() fails to
get a new pv under high system load where the available pv entries
have been exhausted before the pagedaemon has a chance to wake up
to reclaim some.

Prior to this, the NULL pointer dereference ended up causing
secondary panics with rather less than useful resulting tracebacks.

Reviewed by:	alc, jhb
MFC after:	1 week
2005-10-21 19:42:43 +00:00
Poul-Henning Kamp
7423b2b40c Make ttyconsolemode() call ttsetwater() so that drivers don't have to. 2005-10-16 20:58:22 +00:00
David Xu
9104847f21 1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, most
changes in MD code are trivial, before this change, trapsignal and
   sendsig use discrete parameters, now they uses member fields of
   ksiginfo_t structure. For sendsig, this change allows us to pass
   POSIX realtime signal value to user code.

2. Remove cpu_thread_siginfo, it is no longer needed because we now always
   generate ksiginfo_t data and feed it to libpthread.

3. Add p_sigqueue to proc structure to hold shared signals which were
   blocked by all threads in the proc.

4. Add td_sigqueue to thread structure to hold all signals delivered to
   thread.

5. i386 and amd64 now return POSIX standard si_code, other arches will
   be fixed.

6. In this sigqueue implementation, pending signal set is kept as before,
   an extra siginfo list holds additional siginfo_t data for signals.
   kernel code uses psignal() still behavior as before, it won't be failed
   even under memory pressure, only exception is when deleting a signal,
   we should call sigqueue_delete to remove signal from sigqueue but
   not SIGDELSET. Current there is no kernel code will deliver a signal
   with additional data, so kernel should be as stable as before,
   a ksiginfo can carry more information, for example, allow signal to
   be delivered but throw away siginfo data if memory is not enough.
   SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can
   not be caught or masked.
   The sigqueue() syscall allows user code to queue a signal to target
   process, if resource is unavailable, EAGAIN will be returned as
   specification said.
   Just before thread exits, signal queue memory will be freed by
   sigqueue_flush.
   Current, all signals are allowed to be queued, not only realtime signals.

Earlier patch reviewed by: jhb, deischen
Tested on: i386, amd64
2005-10-14 12:43:47 +00:00
Poul-Henning Kamp
2628fdabad Eliminate need for __RMAN_RESOURCE_VISIBLE
Reviewed by:	marcel@
2005-10-06 17:39:18 +00:00
Robert Watson
5f419982c2 Back out alpha/alpha/trap.c:1.124, osf1_ioctl.c:1.14, osf1_misc.c:1.57,
osf1_signal.c:1.41, amd64/amd64/trap.c:1.291, linux_socket.c:1.60,
svr4_fcntl.c:1.36, svr4_ioctl.c:1.23, svr4_ipc.c:1.18, svr4_misc.c:1.81,
svr4_signal.c:1.34, svr4_stat.c:1.21, svr4_stream.c:1.55,
svr4_termios.c:1.13, svr4_ttold.c:1.15, svr4_util.h:1.10,
ext2_alloc.c:1.43, i386/i386/trap.c:1.279, vm86.c:1.58,
unaligned.c:1.12, imgact_elf.c:1.164, ffs_alloc.c:1.133:

Now that Giant is acquired in uprintf() and tprintf(), the caller no
longer leads to acquire Giant unless it also holds another mutex that
would generate a lock order reversal when calling into these functions.
Specifically not backed out is the acquisition of Giant in nfs_socket.c
and rpcclnt.c, where local mutexes are held and would otherwise violate
the lock order with Giant.

This aligns this code more with the eventual locking of ttys.

Suggested by:	bde
2005-09-28 07:03:03 +00:00
Peter Wemm
add121a476 Implement 32 bit getcontext/setcontext/swapcontext on amd64. I've added
stubs for ia64 to keep it compiling.  These are used by 32 bit apps such
as gdb.
2005-09-27 18:04:20 +00:00
John Baldwin
3c2bc2bf26 Add a new atomic_fetchadd() primitive that atomically adds a value to a
variable and returns the previous value of the variable.

Tested on:	i386, alpha, sparc64, arm (cognet)
Reviewed by:	arch@
Submitted by:	cognet (arm)
MFC after:	1 week
2005-09-27 17:39:11 +00:00
Robert Watson
84d2b7df26 Add GIANT_REQUIRED and WITNESS sleep warnings to uprintf() and tprintf(),
as they both interact with the tty code (!MPSAFE) and may sleep if the
tty buffer is full (per comment).

Modify all consumers of uprintf() and tprintf() to hold Giant around
calls into these functions.  In most cases, this means adding an
acquisition of Giant immediately around the function.  In some cases
(nfs_timer()), it means acquiring Giant higher up in the callout.

With these changes, UFS no longer panics on SMP when either blocks are
exhausted or inodes are exhausted under load due to races in the tty
code when running without Giant.

NB: Some reduction in calls to uprintf() in the svr4 code is probably
desirable.

NB: In the case of nfs_timer(), calling uprintf() while holding a mutex,
or even in a callout at all, is a bad idea, and will generate warnings
and potential upset.  This needs to be fixed, but was a problem before
this change.

NB: uprintf()/tprintf() sleeping is generally a bad ideas, as is having
non-MPSAFE tty code.

MFC after:	1 week
2005-09-19 16:51:43 +00:00
Christian S.J. Peron
33cdc78d01 Introduce a kernel config for the Mandatory Access Control framework.
This kernel config briefly describes some of the major MAC policies
available on FreeBSD. The hope is that this will raise the awareness
about MAC and get more people interested.

Discussed with:	scottl
2005-09-18 03:15:36 +00:00
Alan Cox
ac31d065a6 Eliminate unused definitions. 2005-09-11 20:51:15 +00:00
David E. O'Brien
2a191126de Canonize the include of acpi.h. 2005-09-11 18:39:03 +00:00
Marcel Moolenaar
8115693121 Merge db_interface.c and db_trace.c into db_machdep.c. 2005-09-10 03:18:51 +00:00
Marcel Moolenaar
216e80c2ba Move the prototypes of db_md_set_watchpoint(), db_md_clr_watchpoint()
and db_md_list_watchpoints() to ddb/ddb.h.
2005-09-10 03:01:25 +00:00
Marcel Moolenaar
464d16ddf0 Move the ia32_sigcode structure from ia32_sigtramp.c to ia32_signal.c.
It's a bit excessive to have it in a file of its own.
2005-09-10 02:12:49 +00:00
Marcel Moolenaar
0522a40412 Remove redundant $FreeBSD$ 2005-09-10 01:13:33 +00:00
Marcel Moolenaar
87a59250b5 Change the High FP lock from a sleep lock to a spin lock. We can
take the lock from interrupt context, which causes an implicit
lock order reversal. We've been using the lock carefully enough
that making it a spin lock should not be harmful.
2005-09-09 19:18:36 +00:00
Marcel Moolenaar
cca2e0f1cc Milestone: enable SMP by default. 2005-09-05 21:36:28 +00:00
Marcel Moolenaar
ab870058d7 o In pmap_remove_pte: always invalidate the page. Previously the page
was not invalidated if the PTE was not actually being removed.  In
   an UP kernel this didn't cause problems, because the new mapping
   would preempt the old one. In an SMP kernel this could lead to the
   use of stale translations when processes move between CPUs at the
   "right" moment.  This fixes the last of the obvious SMP problems
   and it should be safe to enable SMP by default now.
o  In pmap_remove_pte: minor code refactoring to avoid duplication.
o  Test all PTE pointers against NULL. Don't use implicit boolean
   tests.
2005-09-05 21:32:02 +00:00
Marcel Moolenaar
5280c8c2ab o s/vhpt_size/pmap_vhpt_log2size/g
o  s/vhpt_base/pmap_vhpt_base/g
o  s/vhpt_bucket/pmap_vhpt_bucket/g
o  Declare the above in <machine/pmap.h>
o  Move the vm.stats.vhpt.* sysctls to machdep.vhpt.*
o  Create a tunable machdep.vhpt.log2size, with corresponding sysctl.
   The tunable allows the user to specify the VHPT size from the loader.
o  Don't keep track of the number of PTEs in the VHPT. Calculate the
   population when necessary by iterating the buckets and summing up
   the length of the buckets.
o  Don't perform the tpa instruction with a bucket lock held. The
   instruction can (theoretically) fault and locking is not needed.
2005-09-03 23:53:50 +00:00
Marcel Moolenaar
43be3aac7a Fix collision chain termination checks. The result of IA64_PHYS_TO_RR7
is never 0, so one cannot test for a NULL pointer after a physical
address is translated into a virtual pointer with said macro. Instead,
keep the physical address around and test it against 0. Note that
this obviously implies that a PTE can never be allocated at physical
address 0. This isn't exactly guaranteed, but hasn't been a problem
so far. We test the physical address against 0 for as long as the ia64
port exists...
2005-09-03 19:43:15 +00:00
Alan Cox
ba8bca610c Pass a value of type vm_prot_t to pmap_enter_quick() so that it determine
whether the mapping should permit execute access.
2005-09-03 18:20:20 +00:00
Stefan Farfeleder
a1f85d7f83 Move MINSIGSTKSZ from <machine/signal.h> to <machine/_limits.h> and rename
it to __MINSIGSTKSZ.  Define MINSIGSTKSZ in <sys/signal.h>.

This is done in order to use MINSIGSTKSZ for the macro PTHREAD_STACK_MIN
in <pthread.h> (soon <limits.h>) without having to include the whole
<sys/signal.h> header.

Discussed with:		bde
2005-08-20 16:44:41 +00:00
Marcel Moolenaar
d41a7ed490 Remove the execute permission for stacks. 2005-08-14 23:17:59 +00:00
Marcel Moolenaar
a812f8435a o s/pmap_lpte_/pmap_/g
o  Remove pmap_is_referenced(). It was already compiled-out.
2005-08-13 21:16:38 +00:00
Marcel Moolenaar
86257f240a Fix the problem with the IPI for the lazy context switching of the
high FP registers. It was not that the IPI got lost due to the
perceived unreliability of the IPI delivery, but rather that the
IPI was not assigned a vector (ugh). Sending a 0 vector to a CPU
results in a stray external interrupt.
Add a KASSERT to ipi_send() to catch this. The initialization of
the IPIs could be better, but it's not at all sure what the future
of the code is. Avoid wasting a lot of time on something that is
going to be rewritten anyway.
2005-08-13 21:08:32 +00:00
Marcel Moolenaar
4630415a47 Improve SMP support:
o  Allocate a VHPT per CPU. The VHPT is a hash table that the CPU
   uses to look up translations it can't find in the TLB. As such,
   the VHPT serves as a level 1 cache (the TLB being a level 0 cache)
   and best results are obtained when it's not shared between CPUs.
   The collision chain (i.e. the hash bucket) is shared between CPUs,
   as all buckets together constitute our collection of PTEs. To
   achieve this, the collision chain does not point to the first PTE
   in the list anymore, but to a hash bucket head structure. The
   head structure contains the pointer to the first PTE in the list,
   as well as a mutex to lock the bucket. Thus, each bucket is locked
   independently of each other. With at least 1024 buckets in the VHPT,
   this provides for sufficiently finei-grained locking to make the
   ssolution scalable to large SMP machines.
o  Add synchronisation to the lazy FP context switching. We do this
   with a seperate per-thread lock. On SMP machines the lazy high FP
   context switching without synchronisation caused inconsistent
   state, which resulted in a panic. Since the use of the high FP
   registers is not common, it's possible that races exist. The ia64
   package build has proven to be a good stress test, so this will
   get plenty of exercise in the near future.
o  Don't use the local ID of the processor we want to send the IPI to
   as the argument to ipi_send(). use the struct pcpu pointer instead.
   The reason for this is that IPI delivery is unreliable. It has been
   observed that sending an IPI to a CPU causes it to receive a stray
   external interrupt. As such, we need a way to make the delivery
   reliable. The intended solution is to queue requests in the target
   CPU's per-CPU structure and use a single IPI to inform the CPU that
   there's a new entry in the queue. If that IPI gets lost, the CPU
   can check it's queue at any convenient time (such as for each
   clock interrupt). This also allows us to send requests to a CPU
   without interrupting it, if such would be beneficial.

With these changes SMP is almost working. There are still some random
process crashes and the machine can hang due to having the IPI lost
that deals with the high FP context switch.

The overhead of introducing the hash bucket head structure results
in a performance degradation of about 1% for UP (extra pointer
indirection). This is surprisingly small and is offset by gaining
reasonably/good scalable SMP support.
2005-08-06 20:28:19 +00:00
Marcel Moolenaar
045f23cd0d Reduce the default MAXCPU from 16 to 4. This is in preparation of
allocating a VHPT per CPU. Since we don't yet know how many CPUs
are actually in the system at the time we need to allocate the
VHPTs, we allocate for MAXCPU processors. This can result in a
lot of wasted space for 2-way machines. So, for now, limit MAXCPU
to something smaller until we have something more dynamic.
2005-08-06 19:59:23 +00:00
Marcel Moolenaar
cbef4d0edc For ia64_ptc_{e,g,ga,l}(), use instruction serialization. We
typically don't know what the TLB described and need to assume
that it affects the fetching of instructions.
2005-08-06 19:54:31 +00:00
Jeff Roberson
8d511e2a05 - Add support for saving stack traces and displaying them via printf(9)
and KTR.

Contributed by:		Antoine Brodin <antoine.brodin@laposte.net>
Concept code from:	Neal Fachan <neal@isilon.com>
2005-08-03 04:27:40 +00:00
John Baldwin
122eceef61 Convert the atomic_ptr() operations over to operating on uintptr_t
variables rather than void * variables.  This makes it easier and simpler
to get asm constraints and volatile keywords correct.

MFC after:	3 days
Tested on:	i386, alpha, sparc64
Compiled on:	ia64, powerpc, amd64
Kernel toolchain busted on:	arm
2005-07-15 18:17:59 +00:00
Ken Smith
22e59cec3b Add recently invented COMPAT_FREEBSD5 option.
MFC after:	3 days
2005-07-14 15:39:06 +00:00
David Xu
740fd64d65 Validate if the value written into {FS,GS}.base is a canonical
address, writting non-canonical address can cause kernel a panic,
by restricting base values to 0..VM_MAXUSER_ADDRESS, ensuring
only canonical values get written to the registers.

Reviewed by: peter, Josepha Koshy < joseph.koshy at gmail dot com >
Approved by: re (scottl)
2005-07-10 23:31:11 +00:00
Marcel Moolenaar
7906787a5f Enhance ia64_flush_dirty() to handle the case in which td != curthread.
This case is triggered with ptrace(2) and the PT_SETREGS function.
Change the return type of the function to int so that errors can be
passed on to the caller.

Approved by: re (scottl)
2005-07-05 17:12:18 +00:00
Marcel Moolenaar
a2aeb24eff Implement functions calls from within DDB on ia64. On ia64 a function
pointer doesn't point to the first instruction of that function, but
rather to a descriptor. The descriptor has the address of the first
instruction, as well as the value of the global pointer. The symbol
table doesn't know anything about descriptors, so if you lookup the
name of a function you get the address of the first instruction. The
cast from the address, which is the result of the symbol lookup, to a
function pointer as is done in db_fncall is therefore invalid.
Abstract this detail behind the DB_CALL macro. By default DB_CALL is
defined as db_fncall_generic, which yields the old behaviour. On ia64
the macro is defined as db_fncall_ia64, in which a descriptor is
constructed to yield a valid function pointer.

While here, introduce DB_MAXARGS. DB_MAXARGS replaces the existing
(local) MAXARGS. The DB_MAXARGS macro can be defined by platforms to
create a convenient maximum. By default this will be the legacy 10.
On ia64 we define this macro to be 8, for 8 is the maximum number of
arguments that can be passed in registers. This avoids having to
implement spilling of arguments on the memory stack.

Approved by: re (dwhite)
2005-07-02 23:52:37 +00:00
Marcel Moolenaar
5116398a06 Fix a buglet that was present in the ia64 code and that got inherited
by amd64 and i386: For buffered writes we collect data and write it
out a ${DEV_BSIZE}-sized block at a time. The fragsz variable is used
to keep track of how much data we have collected in the buffer so far
and it's reset to zero immediately after writing a block to the dump
device.
When the last, possibly partially filled buffer is flushed, we didn't
reset fragsz to 0 and as such would stop reflecting reality. Since we
currently only need to do buffered writes once, this isn't a problem.
However, when kernel dumps are made by hand (say by callling doadump
from within DDB), the improperly cleared state from the first call to
dumpsys causes the next call to dumpsys to create an invalid code file.
This change resets fragsz after flushing the partially filled buffer so
that it fixes the two problems at once.

Approved by: re (scottl)
2005-07-02 19:57:31 +00:00
Peter Wemm
62919d788b Jumbo-commit to enhance 32 bit application support on 64 bit kernels.
This is good enough to be able to run a RELENG_4 gdb binary against
a RELENG_4 application, along with various other tools (eg: 4.x gcore).
We use this at work.

ia32_reg.[ch]: handle the 32 bit register file format, used by ptrace,
	procfs and core dumps.
procfs_*regs.c: vary the format of proc/XXX/*regs depending on the client
	and target application.
procfs_map.c: Don't print a 64 bit value to 32 bit consumers, or their
	sscanf fails.  They expect an unsigned long.
imgact_elf.c: produce a valid 32 bit coredump for 32 bit apps.
sys_process.c: handle 32 bit consumers debugging 32 bit targets.  Note
	that 64 bit consumers can still debug 32 bit targets.

IA64 has got stubs for ia32_reg.c.

Known limitations: a 5.x/6.x gdb uses get/setcontext(), which isn't
implemented in the 32/64 wrapper yet.  We also make a tiny patch to
gdb pacify it over conflicting formats of ld-elf.so.1.

Approved by:	re
2005-06-30 07:49:22 +00:00
Marcel Moolenaar
c31450b00d Handle B-unit break instructions. The break.b is unique in that the
immediate is not saved by the architecture. Any of the break.{mifx}
instructions have their immediate saved in cr.iim on interruption.
Consequently, when we handle the break interrupt, we end up with a
break value of 0 when it was a break.b. The immediate is important
because it distinguishes between different uses of the break and
which are defined by the runtime specification.
The bottomline is that when the GNU debugger replaces a B-unit
instruction with a break instruction in the inferior, we would not
send the process a SIGTRAP when we encounter it, because the value
is not one we recognize as a debugger breakpoint.

This change adds logic to decode the bundle in which the break
instruction lives whenever the break value is 0. The assumption
being that it's a break.b and we fetch the immediate directly out
of the instruction. If the break instruction was not a break.b,
but any of break.{mifx} with an immediate of 0, we would be doing
unnecessary work. But since a break 0 is invalid, this is not a
problem and it will still result in a SIGILL being sent to the
process.

Approved by: re (scottl)
2005-06-27 23:51:38 +00:00
Marcel Moolenaar
fc37111e5d Replace the existing copyright notice with my own. Over the years I've
changed this file so much that it's equivalent to a rewrite, and I'm not
talking about any of the cosmetic changes of course.

Approved by: re (scottl)
2005-06-27 23:34:35 +00:00
Marcel Moolenaar
9701d67eb8 Cosmetic: s/u_int64_t/uint64_t/g
Approved by: re (scottl)
2005-06-27 23:29:06 +00:00
David E. O'Brien
c3e0dfa1f8 Add .cvsignore files just like in sys/<arch>/compiled, this keeps CVS from
questing kernel config files not in CVS.

Approved by:	re(kensmith)
2005-06-20 16:52:59 +00:00
Marcel Moolenaar
442add308f Define IPI_PREEMPT. Update a nearby comment while I'm here. 2005-06-12 19:03:01 +00:00
Alan Cox
1c245ae7d1 Introduce a procedure, pmap_page_init(), that initializes the
vm_page's machine-dependent fields.  Use this function in
vm_pageq_add_new_page() so that the vm_page's machine-dependent and
machine-independent fields are initialized at the same time.

Remove code from pmap_init() for initializing the vm_page's
machine-dependent fields.

Remove stale comments from pmap_init().

Eliminate the Boolean variable pmap_initialized from the alpha, amd64,
i386, and ia64 pmap implementations.  Its use is no longer required
because of the above changes and earlier changes that result in physical
memory that is being mapped at initialization time being mapped without
pv entries.

Tested by: cognet, kensmith, marcel
2005-06-10 03:33:36 +00:00
Joseph Koshy
f263522a45 MFP4:
- Implement sampling modes and logging support in hwpmc(4).

- Separate MI and MD parts of hwpmc(4) and allow sharing of
  PMC implementations across different architectures.
  Add support for P4 (EMT64) style PMCs to the amd64 code.

- New pmcstat(8) options: -E (exit time counts) -W (counts
  every context switch), -R (print log file).

- pmc(3) API changes, improve our ability to keep ABI compatibility
  in the future.  Add more 'alias' names for commonly used events.

- bug fixes & documentation.
2005-06-09 19:45:09 +00:00
Marcel Moolenaar
470cd51ee6 Create nexus in configure_first() instead of in configure(). This
makes sure that sysinit tasks that run after configure_first(),
but before configure() have a nexus to hang devices off.
2005-05-29 23:44:22 +00:00
Marcel Moolenaar
a0c51afb16 Call cninit_finish() in configure_final(). 2005-05-29 22:48:41 +00:00
Yoshihiro Takahashi
d4fcf3cba5 Remove bus_{mem,p}io.h and related code for a micro-optimization on i386
and amd64.  The optimization is a trivial on recent machines.

Reviewed by:	-arch (imp, marcel, dfr)
2005-05-29 04:42:30 +00:00
Yoshihiro Takahashi
b22bf66063 - Move bus dependent defines to {isa,cbus}_dmareg.h.
- Use isa/isareg.h rather than <arch>/isa/isa.h.

Tested on: i386, pc98
2005-05-14 10:14:56 +00:00
Marcel Moolenaar
6fab4fece2 Don't define _MACHINE_BUS_MEMIO_H_ nor _MACHINE_BUS_PIO_H_. 2005-05-10 02:59:24 +00:00
David Xu
21fc316430 Change cpu_set_kse_upcall to more generic style, so we can reuse it
in other codes. Add cpu_set_user_tls, use it to tweak user register
and setup user TLS. I ever wanted to merge it into cpu_set_kse_upcall,
but since cpu_set_kse_upcall is also used by M:N threads which may
not need this feature, so I wrote a separated cpu_set_user_tls.
2005-04-23 02:32:32 +00:00
Marcel Moolenaar
8773a80baf Sanity the RTC code:
o  Remove the clock interface. Not only does it conflict with the MI
   version when device genclock is added to the kernel, it was also
   not possible to have more than 1 clock device. This of course would
   have been a problem if we actually had more than 1 clock device.
   In short: we don't need a clock interface and if we do eventually,
   we should be using the MI one.
o  Rewrite inittodr() and resettodr() to take into account that:
   1)  We use the EFI interface directly.
   2)  time_t is 64-bit and we do need to make sure we can determine
       leap years from year 2100 and on. Add a nice explanation of
       where leap years come from and why.
   3)  This rewrite happened in 2005 so any date prior to 1/1/2005
       (either M/D/Y or D/M/Y) is bogus. Reprogram the EFI clock with
       1/1/2005 in that case.
   4)  The EFI clock has a high probability of being correct, so
       only (further) correct the EFI clock when the file system time
       is larger. That should never happen in a time-synchronised world.
       Complain when EFI lost 2 days or more.

Replace the copyright notice now that I (pretty much) rewrote all of
this file.
2005-04-22 05:04:58 +00:00
Marcel Moolenaar
ff7125a623 Add empty header (except of the multiple-inclusion protection) to
get hwpmc(4) to compile on this platform.
2005-04-20 18:44:53 +00:00
Warner Losh
06db52b609 Break out the definition of bus_space_{tag,handle}_t and a few other types
into _bus.h to help with name space polution from including all of bus.h.
In a few days, I'll commit changes to the MI code to take advantage of thse
sepration (after I've made sure that these changes don't break anything in
the main tree, I've tested in my trees, but you never know...).

Suggested by: bde (in 2002 or 2003 I think)
Reviewed in principle by: jhb
2005-04-18 21:45:34 +00:00
Marcel Moolenaar
02b47ea204 Add a kpte command to DDB. It dumps the PTE of a KVA. This helps
to analyze faults and TLB/VHPT inconsistencies.
2005-04-16 23:38:32 +00:00
Marcel Moolenaar
e190f6efc8 Return better "error" values for UWX_BOTTOM and UWX_ABI_FRAME in
unw_step(). Both errors denote the end of a stack trace (i.e. no
prior frame), but are otherwise not error conditions.
Have db_trace() return 0 when the trace ends due to one of these
return codes as they are really normal termination conditions.

This change especially improves the output of the "show thread"
command in DDB when there are threads in fork_trampoline() and
previously db_trace() would return an error, causing the show
command to emit '***'.
2005-04-16 05:38:59 +00:00
Marcel Moolenaar
64c92ba929 Initialize curthread before we save the APs MCA state. Saving the
MCA state requires a spin lock, which requires a valid curthread.
This change allows SMP kernels to boot into multi-user again.

While here, update the copyright notice and use __FBSDID for the
revision string.
2005-04-15 00:21:23 +00:00
John Baldwin
aa9aa68d2f Use PCPU_LAZY_INC() for cnt.v_{intr,trap,syscalls} rather than atomic
operations in some places and simple non-per CPU math in others.
2005-04-12 23:18:54 +00:00
Marcel Moolenaar
a08d773359 Dot the i's:
1  Move the debug.clock_adjust_* sysctls to debug.clock.adjust_* to
   make it easier to get only the clock statistics.
2  Make the sysctls read-only [suggested by Marius].
3  When determining the new clock adjustment, we checked for an error
   either larger than 12.5% or smaller than 12.5%. We left out an error
   of exactly 12.5%. For errors larger than 12.5% we adjust the clock
   reload value in such a way that the next clock interrupt would be
   early (as in premature). For errors less than 12.5% we stopped the
   adjustment.
   The current algorithm doesn't benefit from excluding an error of
   exactly 12.5%. Change the code to stop adjusting the clock if the
   error is *not* larger than 12.5% [suggested by Marius].

Discussed with: marius@
2005-04-12 18:50:57 +00:00
John Baldwin
c6a37e8413 Divorce critical sections from spinlocks. Critical sections as denoted by
critical_enter() and critical_exit() are now solely a mechanism for
deferring kernel preemptions.  They no longer have any affect on
interrupts.  This means that standalone critical sections are now very
cheap as they are simply unlocked integer increments and decrements for the
common case.

Spin mutexes now use a separate KPI implemented in MD code: spinlock_enter()
and spinlock_exit().  This KPI is responsible for providing whatever MD
guarantees are needed to ensure that a thread holding a spin lock won't
be preempted by any other code that will try to lock the same lock.  For
now all archs continue to block interrupts in a "spinlock section" as they
did formerly in all critical sections.  Note that I've also taken this
opportunity to push a few things into MD code rather than MI.  For example,
critical_fork_exit() no longer exists.  Instead, MD code ensures that new
threads have the correct state when they are created.  Also, we no longer
try to fixup the idlethreads for APs in MI code.  Instead, each arch sets
the initial curthread and adjusts the state of the idle thread it borrows
in order to perform the initial context switch.

This change is largely a big NOP, but the cleaner separation it provides
will allow for more efficient alternative locking schemes in other parts
of the kernel (bare critical sections rather than per-CPU spin mutexes
for per-CPU data for example).

Reviewed by:	grehan, cognet, arch@, others
Tested on:	i386, alpha, sparc64, powerpc, arm, possibly more
2005-04-04 21:53:56 +00:00
Maxim Sobolev
6bcf003260 Add USB Communication Device Class Ethernet driver. Originally written for
FreeBSD based on aue(4) it was picked by OpenBSD, then from OpenBSD ported
to NetBSD and finally NetBSD version merged with original one goes into
FreeBSD.

Obtained from:  http://www.gank.org/freebsd/cdce/
                NetBSD
                OpenBSD
2005-03-22 14:52:40 +00:00
Nate Lawson
ac5f2dab74 s/SLIST/STAILQ to catch up with changes to resource lists.
Missed by:	imp
2005-03-20 06:55:49 +00:00
Murray Stokely
991f5121f0 Add a comment to note that pseudo-device bpf is required for DHCP.
This is mentioned in the Handbook but it is not as obvious to new
users why bpf is needed compared to the other largely self-explanatory
items in GENERIC.

PR:		conf/40855
MFC after:	1 week
2005-03-18 15:24:00 +00:00
Ian Dowse
60719a1a44 Split configure() into 3 separate steps like we do on other
architectures. This makes it possible to insert hooks before and
after the device attachment step.

Tested thanks to:	marcel
2005-03-18 09:45:43 +00:00
Scott Long
5974e5c71c Refactor the bus_dma header files so that the interface is described in
sys/bus_dma.h instead of being copied in every single arch.  This slightly
reorders a flag that was specific to AXP and thus changes the ABI there.
The interface still relies on bus_space definitions found in <machine/bus.h>
so it cannot be included on its own yet, but that will be fixed at a later
date.  Add an MD <machine/bus_dma.h> for ever arch for consistency and to
allow for future MD augmentation of the API.  sparc64 makes heavy use of
this right now due to its different bus_dma implemenation.
2005-03-14 16:46:28 +00:00
Scott Long
8bf0837c7a Remove dead code. 2005-03-07 02:18:52 +00:00
Joerg Wunsch
a5f50ef9e4 netchild's mega-patch to isolate compiler dependencies into a central
place.

This moves the dependency on GCC's and other compiler's features into
the central sys/cdefs.h file, while the individual source files can
then refer to #ifdef __COMPILER_FEATURE_FOO where they by now used to
refer to #if __GNUC__ > 3.1415 && __BARC__ <= 42.

By now, GCC and ICC (the Intel compiler) have been actively tested on
IA32 platforms by netchild.  Extension to other compilers is supposed
to be possible, of course.

Submitted by:	netchild
Reviewed by:	various developers on arch@, some time ago
2005-03-02 21:33:29 +00:00
Marcel Moolenaar
f685f62c98 Make sure fpswa_iface equals NULL when bootinfo.bi_fpswa equals 0.
We need to be able to test for the (possible) non-existence of the
FPSWA code.

PR: ia64/77591
Submitted by: Christian Kandeler (christian dot kandeler at hob dot de)
MFC after: 1 day
2005-03-02 20:29:04 +00:00
Wes Peters
95e2054492 Attempt to doff the pointy hat: implement 'hw.realmem' on remaining
architectures.  Pointed out by O'Brien, ScottL via email.

Reviewed by:	obrien (various)
2005-03-01 21:55:27 +00:00
Xin LI
130d7d9ffb Remove acpi_perf from {ARCH}/conf/NOTES, to make tinderbox happy.
Reported by:	tinderbox
Inspired by:	acpi_perf build structure removal commit
2005-02-25 07:10:37 +00:00
Ruslan Ermilov
3971d2cf5e Use a common multi-inclusion protection, and add such a
protection to alpha/include/exec.h.
2005-02-19 21:16:48 +00:00
Marcel Moolenaar
3ec2e857c1 s/descr/oid_descr/ 2005-02-09 04:48:23 +00:00
Poul-Henning Kamp
0c3c54da63 Since we are quite unlikely to ever face another platform which
uses the i8237 without trying to emulate the PC architecture move
the register definitions for the i8237 chip into the central include
file for the chip, except for the PC98 case which is magic.

Add new isa_dmatc() function which tells us as cheaply as possible
if the terminal count has been reached for a given channel.
2005-02-06 13:46:39 +00:00
Nate Lawson
3888a87205 Finish the job of sorting all includes and fix the build by including
malloc.h before proc.h on sparc64.  Noticed by das@

Compiled on:	alpha, amd64, i386, pc98, sparc64
2005-02-06 01:55:08 +00:00
Nate Lawson
69bc96f231 Build cpufreq and acpi_perf on platforms that are likely to be able to
use them.
2005-02-05 21:01:09 +00:00
Marcel Moolenaar
6fb59928a6 Include sys/bus.h before sys/cpu.h. The latter needs device_t. 2005-02-04 06:38:58 +00:00
Nate Lawson
4c4381e288 Add an implementation of cpu_est_clockrate(9). This function estimates the
current clock frequency for the given CPU id in units of Hz.
2005-02-04 05:32:56 +00:00
Warner Losh
1f0ce611b3 nit in /*- 2005-01-31 08:16:45 +00:00
Marcel Moolenaar
b71bca0f84 Fix handling of post increment: Either the first or second operand
is the register with the memory address, and it's that register's
value we need to increment or decrement.

MFC after: 3 days
2005-01-27 06:01:44 +00:00
Scott Long
9580b6766e Fix compile errors. Bah. 2005-01-18 11:06:34 +00:00
Scott Long
a9e9b7e47b Fix an assignment that I missed in the last commit. 2005-01-15 20:03:59 +00:00
Scott Long
33072f4de7 Add bus_dmamap_load_mbuf_sg() to ia64 2005-01-15 19:26:17 +00:00
John Baldwin
f4ef9cec40 - Remove some OBE comments regarding cpu_exit(). cpu_exit() is no longer
the last action of kern_exit().  Instead, it is a MD callout to cleanup
  per-process state during exit.
- Add notes of concern to Alpha and ia64 about the possible need to drop
  fp state in cpu_thread_exit() rather than in cpu_exit() since it is
  per-thread state rather than per-process.
2005-01-14 20:13:04 +00:00
Warner Losh
86cb007f9f /* -> /*- for copyright notices, minor format tweaks as necessary 2005-01-06 22:18:23 +00:00
Marcel Moolenaar
2fa9a15eca Further enhance the handling of misaligned loads and stores:
o  implement double-extended and single precision loads and stores,
o  implement double precision stores,
o  replace the machdep.unaligned_print sysctl with debug.unaligned_print
   and change the default value to 0,
o  replace the machdep.unaligned_sigbus sysctl with debug.unaligned_test,
o  Remmove the fillfd() function. The function is trvial enough for
   inline assembly.

The debug.unaligned_test sysctl is used to test the emulation of
misaligned loads and stores. When PSR.ac is 0, the CPU will handle
misaligned memory accesses itselfi and we don't get an exception
for it. When PSR.ac is 1, the process needs to be signalled and we
should not emulate. The sysctl takes effect when PSR.ac is 1 and
tells us that we should emulate and not send a signal.

PR: 72268
MFC after: 1 week
2005-01-02 00:20:54 +00:00
Alan Cox
1f70d62298 Modify pmap_enter_quick() so that it expects the page queues to be locked
on entry and it assumes the responsibility for releasing the page queues
lock if it must sleep.

Remove a bogus comment from pmap_enter_quick().

Using the first change, modify vm_map_pmap_enter() so that the page queues
lock is acquired and released once, rather than each time that a page
is mapped.
2004-12-23 20:16:11 +00:00
Alan Cox
85f5b24573 In the common case, pmap_enter_quick() completes without sleeping.
In such cases, the busying of the page and the unlocking of the
containing object by vm_map_pmap_enter() and vm_fault_prefault() is
unnecessary overhead.  To eliminate this overhead, this change
modifies pmap_enter_quick() so that it expects the object to be locked
on entry and it assumes the responsibility for busying the page and
unlocking the object if it must sleep.  Note: alpha, amd64, i386 and
ia64 are the only implementations optimized by this change; arm,
powerpc, and sparc64 still conservatively busy the page and unlock the
object within every pmap_enter_quick() call.

Additionally, this change is the first case where we synchronize
access to the page's PG_BUSY flag and busy field using the containing
object's lock rather than the global page queues lock.  (Modifications
to the page's PG_BUSY flag and busy field have asserted both locks for
several weeks, enabling an incremental transition.)
2004-12-15 19:55:05 +00:00
Marcel Moolenaar
7126a966b3 Fix the last of the instability and the cause of the annoying
"vm_fault: fault on nofault entry, addr: %lx" panic. The problem was a
stale PTE in the TLB that marked the page as not present, even though
we had a good PTE in the VHPT. We typically don't yet insert PTEs in
the TLB. We do that lazily. The CPU will look for the PTE in the VHPT
when there's no PTE in the TLB. Unfortunately this doesn't handle the
case of the stale PTE in the TLB. The quick fix is to invalidate the
TLB (sloppily) when the VHPT doesn't contain a valid PTE. This is also
the only case that may cause a PTE in the TLB that marks a page as
non-present.
2004-12-12 19:27:58 +00:00
Marcel Moolenaar
3579953091 Use primitive types to avoid creating an artificial header dependency:
o  s/u_long/unsigned long/
o  s/uint32_t/unsigned int/g
o  s/uint64_t/unsigned long/g

Trigger case: multimedia/mpeg2codec
2004-12-11 06:15:12 +00:00
Marcel Moolenaar
f5929532f1 Don't obtain the HCDP address directly from the bootinfo structure.
Use a function to keep the details at arms length from uart(4).
2004-12-08 05:46:54 +00:00
Marcel Moolenaar
bcc5241c43 Change gdb_cpu_setreg() to not take the value to which to set the
specified register, but a pointer to the in-memory representation of
that value. The reason for this is twofold:
1. Not all registers can be represented by a register_t. In particular
   FP registers fall in that category. Passing the new register value
   by reference instead of by value makes this point moot.
2. When we receive a G or P packet, both are for writing a register,
   the packet will have the register value in target-byte order and
   in the memory representation (modulo the fact that bytes are sent
   as 2 printable hexadecimal numbers of course). We only need to
   decode the packet to have a pointer to the register value.

This change fixes the bug of extracting the register value of the P
packet as a hexadecimal number instead of as a bit array. The quick
(and dirty) fix to bswap the register value in gdb_cpu_setreg() as
it has been added on i386 and amd64 can therefore be removed and has
in fact been that.

Tested on: alpha, amd64, i386, ia64, sparc64
2004-12-01 06:40:35 +00:00
Marcel Moolenaar
c0678028d7 Whitespace fixes:
o  Remove a bogus comment that relates to alpha.
o  s/u_int64_t/uint64_t/g
o  Add bi_spare2 to make the internal padding explicit.
o  Move BOOTINFO_MAGIC after the field it applies to.
2004-11-28 04:34:17 +00:00
David Schultz
6004362e66 Don't include sys/user.h merely for its side-effect of recursively
including other headers.
2004-11-27 06:51:39 +00:00
Marcel Moolenaar
2ba0042660 Remove struct ia64_itir and use a plain old uint64_t instead. 2004-11-21 21:40:08 +00:00
David Schultz
ab44ebf537 Remove UAREA_PAGES.
Reviewed by:	arch@
2004-11-20 02:29:50 +00:00
David Schultz
11111b709f U areas are going away, so don't allocate one for process 0.
Reviewed by:	arch@
2004-11-20 02:29:25 +00:00
David Schultz
ff3fd2e764 user.h is included only to get pcb.h, so use the latter directly instead. 2004-11-20 02:28:14 +00:00
Marcel Moolenaar
a0c88fb1d9 Remove the BR tag. When the machine doesn't have the DIG64 HCDP
table with console settings, we now only need to know at which
address the UART lives. Leaving the baudrate unspecified results
in us using the baudrate at which the UART operates. This removes
one parameter that can interfere with a successful installation
out of the box.
2004-11-14 23:42:48 +00:00
John Baldwin
d39d4a6e64 - Change the ddb paging "support" to use a variable (db_lines_per_page) to
control the number of lines per page rather than a constant.  The variable
  can be examined and changed in ddb as '$lines'.  Setting the variable to
  0 will effectively turn off paging.
- Change db_putchar() to force out pending whitespace before outputting
  newlines and carriage returns so that one can rub out content on the
  current line via '\r     \r' type strings.
- Change the simple pager to rub out the --More-- prompt explicitly when
  the routine exits.
- Add some aliases to the simple pager to make it more compatible with
  more(1): 'e' and 'j' do a single line.  'd' does half a page, and
  'f' does a full page.

MFC after:	1 month
Inspired by:	kris
2004-11-01 22:15:15 +00:00
Poul-Henning Kamp
32204b9721 Use bioq_takefirst() 2004-10-23 12:44:19 +00:00
Poul-Henning Kamp
95bc568977 Add new function ttyinitmode() which sets our systemwide default
modes on a tty structure.

Both the ".init" and the current settings are initialized allowing
the function to be used both at attach and open time.

The function takes an argument to decide if echoing should be enabled.
Echoing should not be enabled for regular physical serial ports
unless they are consoles, in which case they should be configured
by ttyconsolemode() instead.

Use the new function throughout.
2004-10-18 21:51:27 +00:00
Nate Lawson
8f528832e5 Print flags in the nexus for child devices. 2004-10-14 22:36:47 +00:00
Nate Lawson
31ad3b8802 Move the code for halting the CPU (acpi_cpu_c1) into machdep files.
This removes the last MD portion of acpi_cpu.c.

MFC after:	2 weeks
2004-10-11 05:39:15 +00:00
Marcel Moolenaar
07cf947238 Add the Madison II, which is the second generation Madison. The Madison II
is model 2 in the Itanium 2 family and has up to 9MB of L3 cache and clocks
higher than 1.5Ghz. There's no LV variant AFAICT.
2004-10-06 02:43:28 +00:00
Alan Cox
8ceb3dcb60 The physical address stored in the vm_page is page aligned. There is no
need to mask off the page offset bits.  (This operation made some sense
prior to i386/i386/pmap.c revision 1.254 when we passed a physical address
rather than a vm_page pointer to pmap_enter().)
2004-10-03 00:16:43 +00:00
Alan Cox
07b3303943 Eliminate unnecessary uses of PHYS_TO_VM_PAGE() from pmap_enter(). These
uses predate the change in the pmap_enter() interface that replaced the
page's physical address by the address of its vm_page structure.  The
PHYS_TO_VM_PAGE() was being used to compute the address of the same vm_page
structure that was being passed in.
2004-10-02 07:34:58 +00:00
Marcel Moolenaar
287e12f172 ...And fix WITNESS builds: declare syscallnames. 2004-09-26 20:39:56 +00:00
Marcel Moolenaar
feae534e49 Fix INVARIANTS build: Include <machine/cpu.h>. 2004-09-26 00:38:56 +00:00
Marcel Moolenaar
03bfdd1362 Move the IA-32 trap handling from trap() to ia32_trap(). Move the
ia32_syscall() function along with it to ia32_trap.c. When COMPAT_IA32
is not defined, we'll raise SIGEMT instead.
2004-09-25 04:27:44 +00:00
Marcel Moolenaar
0c32530bb7 Redefine a PTE as a 64-bit integral type instead of a struct of
bit-fields. Unify the PTE defines accordingly and update all
uses.
2004-09-23 00:05:20 +00:00
Marcel Moolenaar
0675c65d6b s/u_int#_t/uint#_t/g 2004-09-22 23:12:46 +00:00
Marcel Moolenaar
08d3edb315 For the atomic_{add|clear|set|subtract} family of inlines, return the
old or previous value instead of void. This is not as is documented
in atomic(9), but is API (and ABI) compatible and simply makes sense.
This feature will primarily be used for atomic PTE updates in PMAP/ng.
2004-09-22 19:58:43 +00:00
Marcel Moolenaar
5c48823c36 MFp4: various style fixes, including
o  s/u_int/uint/g
o  s/#define<sp>/#define<tab>/g
o  indent macro definitions
o  Improve vertical spacing
o  Globally align line continuation character
2004-09-22 19:47:42 +00:00
John Baldwin
76764432e4 - Add support for "paging" in stack trace output. That is, when you do
a stack trace from ddb, the output will pause with a '--More--' prompt
  every 18 lines.  If you hit Enter, it will print another line and prompt
  again.  If you hit space it will output another page and then prompt.
  If you hit 'q' or 'x' it will abort the rest of the stack trace.
- Fix the sparc64 userland stack trace to honor the total count of lines
  to print.  This is useful if your trace happens to walk back onto
  0xdeadc0de and gets stuck in an endless loop.

MFC after:	1 month
Tested on:	i386, alpha, sparc64
2004-09-20 19:05:32 +00:00
Marcel Moolenaar
13e6668525 MFp4:
Completely remove the remaining EFI includes and add our own (type)
definitions instead. While here, abstract more of the internals by
providing interface functions.
2004-09-19 03:50:46 +00:00
Alan Cox
7580b56bdc Release the page queues lock earlier in pmap_protect() and pmap_remove() in
order to reduce contention.
2004-09-18 22:56:58 +00:00
Marcel Moolenaar
9f9ae8ebb7 Provide our own FPSWA definitions, instead of depending on the Intel
EFI headers and put them all in <machine/fpu.h>. The Intel EFI headers
conflict with the Intel ACPI headers (duplicate type definitions), so
are being phased out in the kernel.
2004-09-17 22:19:41 +00:00
Marcel Moolenaar
cba8d3ae49 Remove useless inclusion of <machine/fpu.h> 2004-09-17 20:42:45 +00:00
Poul-Henning Kamp
7ce1979be6 Add new a function isa_dma_init() which returns an errno when it fails
and which takes a M_WAITOK/M_NOWAIT flag argument.

Add compatibility isa_dmainit() macro which whines loudly if
isa_dma_init() fails.

Problem uncovered by:	tegge
2004-09-15 12:09:50 +00:00
Marcel Moolenaar
fa3b7cae8d Catch up with other platforms: switch the default scheduler to 4BSD. 2004-09-12 05:50:32 +00:00
Scott Long
50736a153b Fix a problem with tag->boundary inheritence that has existed since day one
and was propagated to nearly every platform.  The boundary of the child needs
to consider the boundary of the parent and pick the minimum of the two, not
the maximum.  However, if either is 0 then pick the appropriate one.
This bug was exposed by a recent change to ATA, which should now be fixed by
this change.  The alignment and maxsegsz tag attributes likely also need
a similar review in the near future.

This is a MT5 candidate.

Reviewed by: marcel
Submitted by: sos (in part)
2004-09-08 04:54:19 +00:00
Marcel Moolenaar
566d143be0 Sync the busdma code with i386. The most tangible upshot is that
the alignment and boundary constraints are being respected, which
fixes the reported ATA problems with SiI chips.
I consider the busdma implementation worrisome nonetheless. Not
only is there too much MI code duplicated in MD files, there's a
lot of questionable code. I smell a wholesale, cross-platform
overhaul coming...

MT5 candidate.
2004-09-08 02:55:04 +00:00
Julian Elischer
ed062c8d66 Refactor a bunch of scheduler code to give basically the same behaviour
but with slightly cleaned up interfaces.

The KSE structure has become the same as the "per thread scheduler
private data" structure. In order to not make the diffs too great
one is #defined as the other at this time.

The KSE (or td_sched) structure is  now allocated per thread and has no
allocation code of its own.

Concurrency for a KSEGRP is now kept track of via a simple pair of counters
rather than using KSE structures as tokens.

Since the KSE structure is different in each scheduler, kern_switch.c
is now included at the end of each scheduler. Nothing outside the
scheduler knows the contents of the KSE (aka td_sched) structure.

The fields in the ksegrp structure that are to do with the scheduler's
queueing mechanisms are now moved to the kg_sched structure.
(per ksegrp scheduler private data structure). In other words how the
scheduler queues and keeps track of threads is no-one's business except
the scheduler's. This should allow people to write experimental
schedulers with completely different internal structuring.

A scheduler call sched_set_concurrency(kg, N) has been added that
notifies teh scheduler that no more than N threads from that ksegrp
should be allowed to be on concurrently scheduled. This is also
used to enforce 'fainess' at this time so that a ksegrp with
10000 threads can not swamp a the run queue and force out a process
with 1 thread, since the current code will not set the concurrency above
NCPU, and both schedulers will not allow more than that many
onto the system run queue at a time. Each scheduler should eventualy develop
their own methods to do this now that they are effectively separated.

Rejig libthr's kernel interface to follow the same code paths as
linkse for scope system threads. This has slightly hurt libthr's performance
but I will work to recover as much of it as I can.

Thread exit code has been cleaned up greatly.
exit and exec code now transitions a process back to
'standard non-threaded mode' before taking the next step.
Reviewed by:	scottl, peter
MFC after:	1 week
2004-09-05 02:09:54 +00:00
Marcel Moolenaar
44af2aa001 Add aac(4) and aacp(4). The driver is 64-bit clean for roughly a year
now and has been mentioned on the freebsd-ia64 list.
2004-09-02 18:05:26 +00:00
Julian Elischer
5995adc206 Remove an unneeded argument..
The removed argument could trivially be derived from the remaining one.
That in turn should be the same as curthread, but it is possible that curthread could be expensive to derive on some syste,s so leave it as an argument.
Having both proc and thread as an argumen tjust gives an opportunity for
them to get out sync.

MFC after:	3 days
2004-08-31 07:34:54 +00:00
Julian Elischer
99e9dcb817 Remove sched_free_thread() which was only used
in diagnostics. It has outlived its usefulness and has started
causing panics for people who turn on DIAGNOSTIC, in what is otherwise
good code.

MFC after:	2 days
2004-08-31 06:12:13 +00:00
Alan Cox
bfa15df9ba Remove unnecessary check for curthread == NULL. 2004-08-30 03:52:05 +00:00
Marcel Moolenaar
224407d6a8 s/ENTRY/ENTRY_NOPROFILE/g for particular functions that do not follow
the C calling convention or are otherwise not regular functions. This
allows us to boot a profiling kernel.
2004-08-30 01:32:28 +00:00
Marcel Moolenaar
82ecff453f Catch up with the drive-by renaming of IA32 to COMPAT_IA32. Missed
11 days ago when all the other places were fixed and finally caught
by the tinderbox run...
2004-08-27 21:57:00 +00:00
Marcel Moolenaar
0f2fe153bc Move the kernel-specific logic to adjust frompc from MI to MD. For
these two reasons:
1. On ia64 a function pointer does not hold the address of the first
   instruction of a functions implementation. It holds the address
   of a function descriptor. Hence the user(), btrap(), eintr() and
   bintr() prototypes are wrong for getting the actual code address.
2. The logic forces interrupt, trap and exception entry points to
   be layed-out contiguously. This can not be achieved on ia64 and is
   generally just bad programming.

The MCOUNT_FROMPC_USER macro is used to set the frompc argument to
some kernel address which represents any frompc that falls outside
the kernel text range. The macro can expand to ~0U to bail out in
that case.
The MCOUNT_FROMPC_INTR macro is used to set the frompc argument to
some kernel address to represent a call to a trap or interrupt
handler. This to avoid that the trap or interrupt handler appear to
be called from everywhere in the call graph. The macro can expand
to ~0U to prevent adjusting frompc. Note that the argument is selfpc,
not frompc.

This commit defines the macros on all architectures equivalently to
the original code in sys/libkern/mcount.c. People can take it from
here...

Compile-tested on: alpha, amd64, i386, ia64 and sparc64
Boot-tested on: i386
2004-08-27 19:42:35 +00:00
Alan Cox
8991a235cb The machine-independent parts of the virtual memory system always pass a
valid pmap to the pmap functions that require one.  Remove the checks for
NULL.  (These checks have their origins in the Mach pmap.c that was
integrated into BSD.  None of the new code written specifically for
FreeBSD included them.)
2004-08-27 19:06:17 +00:00
Andre Oppermann
c21fd23260 Always compile PFIL_HOOKS into the kernel and remove the associated kernel
compile option.  All FreeBSD packet filters now use the PFIL_HOOKS API and
thus it becomes a standard part of the network stack.

If no hooks are connected the entire packet filter hooks section and related
activities are jumped over.  This removes any performance impact if no hooks
are active.

Both OpenBSD and DragonFlyBSD have integrated PFIL_HOOKS permanently as well.
2004-08-27 15:16:24 +00:00
Marcel Moolenaar
04f093dde7 Get a step closer to profiling the kernel by fixing the definitions
of the MCOUNT_ENTER, MCOUNT_EXIT and MCOUNT_DECL defines. Also make
sure there's a prototype of _MCOUNT_DECL(). This allows us to build
a kernel. There are still unresolved symbols, so linking fails.
2004-08-25 08:03:48 +00:00
Marcel Moolenaar
f0556e70bb Make profiling actually work. The gcc compiler emits a call to the
_mcount() stub when profiling is enabled. Emit this code sequence
for assembly routines as welli (MCOUNT definition in <machine/asm.h>.
We do not pass the GOT entry however as the 4th argument, because it's
not used. The _mcount() stub calls __mcount(), which does the actual
work. Define _MCOUNT_DECL to define __mcount. We do not have an
implementation of mcount(), so we define MCOUNT as empty, but have a
weak alias to _mcount() in _mcount.S.
Note that the _mcount() stub in the kernel is slightly different from
the stub in userland. This is because we do not have to worry about
nested routines in the kernel.
2004-08-25 07:42:34 +00:00
Nate Lawson
dc6851d588 Catch up with i386 nexus.c rev 1.59: add bus_get_resource_list(). 2004-08-24 19:22:54 +00:00
David E. O'Brien
6cda6c4a35 sr(4) definately won't work on IA64. 2004-08-24 18:31:27 +00:00
Arun Sharma
2d24da614a The existing code fails some corner cases. Replace it with
ia64_bsp_adjust() which has been tested to work in all cases for
arbitrary (bsp, nslots) combinations.

reviewed by: marcel@
2004-08-16 22:09:58 +00:00
Marcel Moolenaar
344bbdbd54 As I said: the previous commit was untested... Remove an #endif which
should have ceased to exist when its corresponding #if was removed.
2004-08-16 19:05:08 +00:00
Marcel Moolenaar
97752b2cbd Catch up with the drive-by renaming of IA32 to COMPAT_IA32. It must
have been rush hour...

While here, move COMPAT_IA32 from opt_global.h to opt_compat.h like on
amd64. Consequently, it's unsafe to use the option in pcb.h. We now
unconditionally have the ia32 specific registers in the PCB.

This commit is untested.
2004-08-16 18:54:23 +00:00
Arun Sharma
646c6dd2c0 ITC.{i,d} instructions use format M41 not M42.
reviewed by: marcel@
2004-08-16 18:41:24 +00:00
Marcel Moolenaar
c66fdb617d Allocate memory in the unwinder with M_NOWAIT. We may need to provide
backtraces with locks held.
2004-08-14 05:00:37 +00:00
Marcel Moolenaar
3a00930042 In set_regs(), flush the dirty registers onto the backingstore before
we update the registers. That way we don't have any dirty registers to
worry about and also know that bsp=bspstore, which makes updating the
RSE related registers predictable.
This is not the end of it. We need more validity checks, but for now
this allows us to complete the gdb testsuite without crashing the
kernel.
2004-08-11 05:29:13 +00:00
Marcel Moolenaar
4da47b2fec Add __elfN(dump_thread). This function is called from __elfN(coredump)
to allow dumping per-thread machine specific notes. On ia64 we use this
function to flush the dirty registers onto the backingstore before we
write out the PRSTATUS notes.

Tested on: alpha, amd64, i386, ia64 & sparc64
Not tested on: arm, powerpc
2004-08-11 02:35:06 +00:00
Marcel Moolenaar
b4b7c60d70 Better preserve the original protection for the mappings we maintain.
The hardware always gives read access for privilege level 0, which
means that we cannot use the hardware access rights and privilege
level in the PTE to test whether there's a change in protection.  So,
we save the original vm_prot_t in the PTE as well.
Add pmap_pte_prot() to set the proper access rights and privilege
level on the PTE given a pmap and the requested protection.

The above allows us to compare the protection in pmap_extract_and_hold()
which was missing. While in pmap_extract_and_hold(), add pmap locking.

While here, clean up most (i.e. all but one) PTE macros we inherited
from alpha. They were either unused, used inconsistently, badly named
or simply weren't beneficial. We save the wired and managed state of
the PTE in distinct (bit) fields.

While in pte.h, s/u_int64_t/uint64_t/g

pmap locking obtained from: alc@
feedback & review by: alc@
2004-08-09 20:44:41 +00:00
Marcel Moolenaar
47a86e3cd4 Implement single stepping when we leave the kernel through the EPC syscall
path. The basic problem is that we cannot set the single stepping flag
directly, because we don't leave the kernel via an interrupt return. So,
we need another way to set the single stepping flag.
The way we do this is by enabling the lower-privilege transfer trap, which
gets raised when we drop the privilege level. However, since we're still
running in kernel space (sec), we're not yet done. We clear the lower-
privilege transfer trap, enable the taken-branch trap and continue exiting
the kernel until we branch into user space.
Given the current code, there's a total of two traps this way before
we can raise SIGTRAP.
2004-08-08 00:28:07 +00:00