Commit Graph

4440 Commits

Author SHA1 Message Date
alc
a618275b13 Modify pmap_enter_quick() so that it expects the page queues to be locked
on entry and it assumes the responsibility for releasing the page queues
lock if it must sleep.

Remove a bogus comment from pmap_enter_quick().

Using the first change, modify vm_map_pmap_enter() so that the page queues
lock is acquired and released once, rather than each time that a page
is mapped.
2004-12-23 20:16:11 +00:00
alc
6e5db3a043 Use vtopde() instead of pmap_pde() in pmap_kextract(); vtopde() is smaller
and faster in cases, such as pmap_kextract(), where the pde is known to
exist.
2004-12-21 19:25:56 +00:00
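
A minimal sketch of the difference, assuming the amd64 recursive page-table
mapping (PDmap); the 27-bit index mask is written out literally and is
illustrative.  pmap_pde() walks down from the PML4 and validates each level,
while vtopde() indexes the recursively mapped page directory directly, which
is only safe when the PDE is known to exist, as in pmap_kextract().

    /*
     * Sketch only: one shift-and-index into the recursive PD window
     * (PDmap) instead of a multi-level walk.  The mask covers the
     * 9+9+9 index bits that select the PD entry.
     */
    static __inline pd_entry_t *
    vtopde(vm_offset_t va)
    {
            return (PDmap + ((va >> PDRSHIFT) & ((1ul << 27) - 1)));
    }
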
alc
ede2fb9751 In the common case, pmap_enter_quick() completes without sleeping.
In such cases, the busying of the page and the unlocking of the
containing object by vm_map_pmap_enter() and vm_fault_prefault() is
unnecessary overhead.  To eliminate this overhead, this change
modifies pmap_enter_quick() so that it expects the object to be locked
on entry and it assumes the responsibility for busying the page and
unlocking the object if it must sleep.  Note: alpha, amd64, i386 and
ia64 are the only implementations optimized by this change; arm,
powerpc, and sparc64 still conservatively busy the page and unlock the
object within every pmap_enter_quick() call.

Additionally, this change is the first case where we synchronize
access to the page's PG_BUSY flag and busy field using the containing
object's lock rather than the global page queues lock.  (Modifications
to the page's PG_BUSY flag and busy field have asserted both locks for
several weeks, enabling an incremental transition.)
2004-12-15 19:55:05 +00:00
peter
e6d23e6014 MFi386: rev 1.12: re-allow fast interrupts to cause preemption 2004-12-06 22:56:15 +00:00
alc
dbfbcad51b Replace (inlined) pmap_pte() calls with smaller, faster code where
possible, such as the inner loop of pmap_copy().

Remove two comments that apply to i386 but not amd64.
2004-12-04 22:02:31 +00:00
alc
833856a521 For efficiency eliminate the call to pmap_pte() from pmap_protect()'s and
pmap_remove()'s inner loop.  Instead, call pmap_pde_to_pte(), a new
function, prior to the inner loop.

Reviewed by: peter@, tegge@
2004-12-02 04:06:40 +00:00
marcel
c106bd9120 Change gdb_cpu_setreg() to not take the value to which to set the
specified register, but a pointer to the in-memory representation of
that value. The reason for this is twofold:
1. Not all registers can be represented by a register_t. In particular
   FP registers fall in that category. Passing the new register value
   by reference instead of by value makes this point moot.
2. When we receive a G or P packet (both are for writing a register),
   the packet will have the register value in target-byte order and
   in the memory representation (modulo the fact that bytes are sent
   as 2 printable hexadecimal numbers of course). We only need to
   decode the packet to have a pointer to the register value.

This change fixes the bug of extracting the register value of the P
packet as a hexadecimal number instead of as a bit array. The quick
(and dirty) fix to bswap the register value in gdb_cpu_setreg(), as
it had been added on i386 and amd64, can therefore be removed and has
in fact been removed.

Tested on: alpha, amd64, i386, ia64, sparc64
2004-12-01 06:40:35 +00:00
peter
c701aac0d6 Remove unused cnt variable for the SMP case. Trim some excessive blank
lines while here.
2004-11-30 20:25:46 +00:00
peter
c3b3223504 Update the gdb register extraction support to use the pcb wherever
possible, like on i386.  Caller-saved and callee-saved registers are
handled differently.
2004-11-30 00:55:49 +00:00
peter
3b740d0ce9 MFi386: join the %cr0 setup line now that i386 has lost the I386 ifdefs. 2004-11-29 23:27:07 +00:00
peter
9c467ceeec Take advantage of the shutdown processing being wired to the BSP and
eliminate the evil cpu_reset_proxy code now that it will never be
activated.  i386 should pick this up as well.
2004-11-29 23:25:56 +00:00
scottl
c9eff643a7 Don't flag alignment constraints as a reason for bouncing. This fixes the
trigger for other misbehaviour in the sym driver that was causing freezes at
boot.  Thanks to phk@ for reporting and testing this.
2004-11-29 14:49:27 +00:00
das
130bed6547 Don't include sys/user.h merely for its side-effect of recursively
including other headers.
2004-11-27 06:51:39 +00:00
scottl
374ba89ea3 Remove an extra #include 2004-11-21 06:28:35 +00:00
scottl
3018b80090 Consolidate all of the bounce tests into the BUS_DMA_COULD_BOUNCE flag.
Allocate the bounce zone at either tag creation or map creation to help
avoid null-pointer derefs later on.  Track total pages per zone so that
each zone can get a minimum allocation at tag creation time instead of
being defeated by mis-behaving tags that suck up the max amount.
2004-11-21 04:15:26 +00:00
das
901a681d75 Remove references to U area and garbage collect includes.
Reviewed by:	arch@
2004-11-20 02:30:59 +00:00
das
90a65a896e Remove UAREA_PAGES.
Reviewed by:	arch@
2004-11-20 02:29:50 +00:00
das
8375566745 U areas are going away, so don't allocate one for process 0.
Reviewed by:	arch@
2004-11-20 02:29:25 +00:00
scottl
34b1372302 Revert part of rev 1.56. Tag boundaries are handled by splitting segments,
not through bouncing.
2004-11-19 17:51:29 +00:00
scottl
8d0c676b01 MFi386 rev 1.63-1.64:
Use tag-specific pools of bounce pages instead of a single global pool.
2004-11-10 03:49:24 +00:00
peter
56b95b228c MFi386 1.238 (jhb): Allow hints to disable cpus 2004-11-05 18:25:22 +00:00
peter
44e0a1bb90 MFi386:
rev 1.61 (scottl):  Add KTR tracing
rev 1.62 (scottl):  Optimize (td->pmap, inlines, etc)
2004-11-05 18:24:01 +00:00
scottl
f85ea86e71 Don't use atomic ops to increment interrupt stats. This was only done on
amd64 and i386 anyway.  The stats are only kept for informational purposes.
2004-11-03 18:03:06 +00:00
andre
8b6661b126 Reduce annoying SCSI probing delay from 15 to 5 seconds in all GENERIC kernels.
Discussed on:	-current
2004-11-02 20:57:20 +00:00
jhb
a9860ec891 - Change the ddb paging "support" to use a variable (db_lines_per_page) to
control the number of lines per page rather than a constant.  The variable
  can be examined and changed in ddb as '$lines'.  Setting the variable to
  0 will effectively turn off paging.
- Change db_putchar() to force out pending whitespace before outputting
  newlines and carriage returns so that one can rub out content on the
  current line via '\r     \r' type strings.
- Change the simple pager to rub out the --More-- prompt explicitly when
  the routine exits.
- Add some aliases to the simple pager to make it more compatible with
  more(1): 'e' and 'j' do a single line.  'd' does half a page, and
  'f' does a full page.

MFC after:	1 month
Inspired by:	kris
2004-11-01 22:15:15 +00:00
des
22d75b597a Add TUNABLE_LONG and TUNABLE_ULONG, and use the latter for the
hw.pci.host_mem_start tunable.  Add comments to TUNABLE_INT and
TUNABLE_QUAD recommending against their use.

MFC after:	3 weeks
2004-10-31 15:50:33 +00:00
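
A short usage sketch of the new macro against the tunable named above; the
variable name and default value here are illustrative, not the committed code.

    /* Register an unsigned long loader tunable; the default is only
     * overridden when hw.pci.host_mem_start is set at the loader. */
    static u_long pci_host_mem_start = 0x80000000UL;
    TUNABLE_ULONG("hw.pci.host_mem_start", &pci_host_mem_start);
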
des
871722fc59 Whitespace cleanup 2004-10-31 15:02:53 +00:00
simokawa
59e7ed6ecd MFi386: preserve dcons buffer passed by loader. 2004-10-28 12:16:03 +00:00
peter
7f88af1f93 Raise MAXDSIZ from 8G to 32G. The old limit was just an arbitrary choice
that was greater than 4G.  I originally used the same values as i386 in
order to save opening a new PML4 page slot, but in the day of gigabytes
of memory, worrying about a 4K page seems futile.  Moving from 8 to 32G
moves the page to a different index; it doesn't increase the number of
pages used.
2004-10-27 17:21:15 +00:00
njl
2982f3e0a0 Print flags in the nexus for child devices. 2004-10-14 22:36:47 +00:00
peter
46a2ab741b MFi386: sync with latest updates 2004-10-11 21:51:27 +00:00
njl
abd4abd5bd Move the code for halting the CPU (acpi_cpu_c1) into machdep files.
This removes the last MD portion of acpi_cpu.c.

MFC after:	2 weeks
2004-10-11 05:39:15 +00:00
alc
417a40f2bf Make pte_load_store() an atomic operation in all cases, not just i386 PAE.
Restructure pmap_enter() to prevent the loss of a page modified (PG_M) bit
in a race between processors.  (This restructuring assumes the newly atomic
pte_load_store() for correct operation.)

Reviewed by: tegge@
PR: i386/61852
2004-10-08 08:23:43 +00:00
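
A hedged sketch of what an atomic pte_load_store() amounts to on amd64 (the
committed implementation may differ in detail): the exchange is a single
implicitly locked instruction, so the old PTE, including any PG_M bit set by
another processor, is returned atomically with the store of the new PTE.

    static __inline pt_entry_t
    pte_load_store(pt_entry_t *ptep, pt_entry_t newpte)
    {
            pt_entry_t oldpte;

            /* xchg carries an implicit lock prefix; the swap is atomic. */
            __asm __volatile("xchgq %0,%1"
                : "=r" (oldpte), "+m" (*ptep)
                : "0" (newpte));
            return (oldpte);
    }
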
jhb
ce2d3f89af Rework how we store process times in the kernel such that we always store
the raw values including for child process statistics and only compute the
system and user timevals on demand.

- Fix the various kern_wait() syscall wrappers to only pass in a rusage
  pointer if they are going to use the result.
- Add a kern_getrusage() function for the ABI syscalls to use so that they
  don't have to play stackgap games to call getrusage().
- Fix the svr4_sys_times() syscall to just call calcru() to calculate the
  times it needs rather than calling getrusage() twice with associated
  stackgap, etc.
- Add a new rusage_ext structure to store raw time stats such as tick counts
  for user, system, and interrupt time as well as a bintime of the total
  runtime.  A new p_rux field in struct proc replaces the same inline fields
  from struct proc (i.e. p_[isu]ticks, p_[isu]u, and p_runtime).  A new p_crux
  field in struct proc contains the "raw" child time usage statistics.
  ruadd() has been changed to handle adding the associated rusage_ext
  structures as well as the values in rusage.  Effectively, the values in
  rusage_ext replace the ru_utime and ru_stime values in struct rusage.  These
  two fields in struct rusage are no longer used in the kernel.
- calcru() has been split into a static worker function calcru1() that
  calculates appropriate timevals for user and system time as well as updating
  the rux_[isu]u fields of a passed in rusage_ext structure.  calcru() uses a
  copy of the process' p_rux structure to compute the timevals after updating
  the runtime appropriately if any of the threads in that process are
  currently executing.  It also now only locks sched_lock internally while
  doing the rux_runtime fixup.  calcru() now only requires the caller to
  hold the proc lock and calcru1() only requires the proc lock internally.
  calcru() also no longer allows callers to ask for an interrupt timeval
  since none of them actually did.
- calcru() now correctly handles threads executing on other CPUs.
- A new calccru() function computes the child system and user timevals by
  calling calcru1() on p_crux.  Note that this means that any code that wants
  child times must now call this function rather than reading from p_cru
  directly.  This function also requires the proc lock.
- This finishes the locking for rusage and friends so some of the Giant locks
  in exit1() and kern_wait() are now gone.
- The locking in ttyinfo() has been tweaked so that a shared lock of the
  proctree lock is used to protect the process group rather than the process
  group lock.  By holding this lock until the end of the function we now
  ensure that the process/thread that we pick to dump info about will no
  longer vanish while we are trying to output its info to the console.

Submitted by:	bde (mostly)
MFC after:	1 month
2004-10-05 18:51:11 +00:00
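
A minimal sketch of the shape of the rusage_ext structure described above;
the field names and types are inferred from the commit text, not copied from
the committed header.

    struct rusage_ext {
            struct bintime  rux_runtime;    /* total real (run) time */
            u_int64_t       rux_uticks;     /* statclock hits in user mode */
            u_int64_t       rux_sticks;     /* statclock hits in system mode */
            u_int64_t       rux_iticks;     /* statclock hits processing intr */
            u_int64_t       rux_uu;         /* last computed user time, usec */
            u_int64_t       rux_su;         /* last computed system time, usec */
            u_int64_t       rux_iu;         /* last computed intr time, usec */
    };
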
alc
52911b00b3 Undo revision 1.251. This change was a performance pessimizing work-around
that is no longer required.  (In fact, it is not clear that it was ever
required in HEAD or RELENG_4, only RELENG_3 required a work-around.)  Now,
as before revision 1.251, if the preexisting PTE is invalid, pmap_enter()
does not call pmap_invalidate_page() to update the TLB(s).

Note: Even with this change, the handling of a copy-on-write fault is
inefficient; in such cases pmap_enter() calls pmap_invalidate_page() twice.

Discussed with: bde@
PR: kern/16568
2004-10-03 20:14:07 +00:00
alc
c4db706631 The physical address stored in the vm_page is page aligned. There is no
need to mask off the page offset bits.  (This operation made some sense
prior to i386/i386/pmap.c revision 1.254 when we passed a physical address
rather than a vm_page pointer to pmap_enter().)
2004-10-03 00:16:43 +00:00
alc
19377ec887 Eliminate unnecessary uses of PHYS_TO_VM_PAGE() from pmap_enter(). These
uses predate the change in the pmap_enter() interface that replaced the
page's physical address by the address of its vm_page structure.  The
PHYS_TO_VM_PAGE() was being used to compute the address of the same vm_page
structure that was being passed in.
2004-10-02 07:34:58 +00:00
alc
de84da4673 Remove an unused declaration. (I should have included this change in
revision 1.486.)
2004-10-02 05:58:32 +00:00
alc
118a3c283b Prevent the unexpected deallocation of a page table page while performing
pmap_copy().  This entails additional locking in pmap_copy() and the
addition of a "flags" parameter to the page table page allocator for
specifying whether it may sleep when memory is unavailable.  (Already,
pmap_copy() checks the availability of memory, aborting if it is scarce.
In theory, another CPU could, however, allocate memory between
pmap_copy()'s check and the call to the page table page allocator,
causing the current thread to release its locks and sleep.  This change
makes this scenario impossible.)

Reviewed by: tegge@
2004-09-29 19:20:40 +00:00
peter
52704bafb2 MFi386: rev 1.239 - invalidate tlb after pte update 2004-09-29 01:59:10 +00:00
peter
883b7abe1a MFi386: rev 1.236 - improve panic message for a busted mptable 2004-09-29 01:58:24 +00:00
peter
56d687195a Like on i386, use the definition of struct bios_smap from machine/pc/bios.h
again.
2004-09-24 01:11:11 +00:00
peter
04c2ef2193 Converge towards i386. I originally resisted creating <machine/pc/bios.h>
because it was mostly irrelevant - except for the silly BIOS_PADDRTOVADDR
etc macros.  Along the way of working around this, I missed a few things.

* Make syscons properly inherit the bios capslock/shiftlock/etc state like
  i386 does.  Note that we cannot inherit the bios key repeat rate because
  that requires a bios call (which is impossible for us).
* Give syscons the ability to beep on amd64.  Oops.

While here, make bios.c compile and add it to files.amd64.
2004-09-24 01:08:34 +00:00
peter
09b56c9832 Severely strip down the repocopied i386/bios.c and bios.h files. It turns
out that bios_sigsearch() etc is useful for finding tables in roms.
2004-09-24 00:42:36 +00:00
alc
a6bec8ad06 Correct a long-standing error in _pmap_unwire_pte_hold() affecting
multiprocessors.  Specifically, the error is conditioning the call to
pmap_invalidate_page() on whether the pmap is active on the current CPU.
This call must be unconditional.  Regardless of whether the pmap is active
on the CPU performing _pmap_unwire_pte_hold(), it could be active on another
CPU.  For example, a call to pmap_remove_all() by the page daemon could
result in a call to _pmap_unwire_pte_hold() with the pmap inactive on the
current CPU and active on another CPU.  In such circumstances, failing to
call pmap_invalidate_page() results in a stale TLB entry on the other CPU
that still maps the now deallocated page table page.  What happens next is
typically a mysterious panic in pmap_enter() by the other CPU, either
"pmap_enter: attempted pmap_enter on 4MB page" or "pmap_enter: pte vanished,
va: 0x%lx".  Both occur because the former page table page has been recycled
and allocated to a new purpose.  Consequently, it no longer contains zeroes.

See also Peter's i386/i386/pmap.c revision 1.448 and the related e-mail
thread last year.

Many thanks to the engineers at Sandvine for providing clear and concise
information until all of the pieces of the puzzle fell into place and
for testing an earlier patch.

MT5 Candidate
2004-09-22 05:01:48 +00:00
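
Schematically, the fix removes the "is this pmap active on the current CPU?"
test around the shootdown; a sketch of the corrected call site, with the
surrounding code elided:

    /*
     * The flush must be unconditional: even if this pmap is not active
     * on the CPU running _pmap_unwire_pte_hold(), another CPU may still
     * hold a TLB entry mapping the page table page being freed.
     */
    pmap_invalidate_page(pmap, va);
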
peter
fd5eab6b91 MFi386: adapt rev 1.19 (debugger fixes) 2004-09-22 01:27:06 +00:00
peter
994869267e Minor sync-up with i386. Catch up on de-quoting and de-counting after
config changes.
2004-09-22 01:04:54 +00:00
peter
5e49be005a MFi386: add ispfw (except using correct device<tab><tab>ispfw format,
<space><tab> is for the options line)
2004-09-22 00:44:13 +00:00
jhb
e487fab495 - Add support for "paging" in stack trace output. That is, when you do
a stack trace from ddb, the output will pause with a '--More--' prompt
  every 18 lines.  If you hit Enter, it will print another line and prompt
  again.  If you hit space it will output another page and then prompt.
  If you hit 'q' or 'x' it will abort the rest of the stack trace.
- Fix the sparc64 userland stack trace to honor the total count of lines
  to print.  This is useful if your trace happens to walk back onto
  0xdeadc0de and gets stuck in an endless loop.

MFC after:	1 month
Tested on:	i386, alpha, sparc64
2004-09-20 19:05:32 +00:00
alc
a58cafd11e Simplify the reference counting of page table pages. Specifically, use
the page table page's wired count rather than its hold count to contain
the reference count.  My rationale for this change is based on several
factors:

1. The machine-independent and pmap layers used the same hold count field
   in subtly different ways.  The machine-independent layer uses the hold
   count to implement a form of ephemeral wiring that is used by pipes,
   physio, etc.  In other words, subsystems where we wish to temporarily
   block a page from being swapped out while it is mapped into the kernel's
   address space.  Such pages are never removed from the page queues.
   Instead, the page daemon recognizes a non-zero hold count to mean "hands
   off this page."  In contrast, page table pages are never in the page
   queues; they are wired from birth to death.  The hold count was being
   used as a kind of reference count, specifically, the number of valid
   page table entries within the page.  Not surprisingly, these two
   different uses imply different synchronization rules: in the machine-
   independent layer access to the hold count requires the page queues
   lock; whereas in the pmap layer the pmap lock is required.  Thus,
   continued use by the pmap layer of vm_page_unhold(), which asserts that
   the page queues lock is held, made no sense.

2. _pmap_unwire_pte_hold() was too forgiving in its handling of the wired
   count.  An unexpected wired count on a page table page was ignored and
   the underlying page leaked.

3. In a word, microoptimization.  Using the wired count exclusively, rather
   than a combination of the wired and hold counts, makes the code slightly
   smaller and faster.

Reviewed by: tegge@
2004-09-19 21:20:01 +00:00
alc
78d7eda383 Remove an outdated assertion from _pmap_allocpte(). (When vm_page_alloc()
succeeds, the page's queue field is unconditionally set to PQ_NONE by
vm_pageq_remove_nowakeup().)
2004-09-19 02:39:31 +00:00
alc
3b2a3888c0 Release the page queues lock earlier in pmap_protect() and pmap_remove() in
order to reduce contention.
2004-09-18 22:56:58 +00:00
phk
1795816cf7 Add a new function isa_dma_init() which returns an errno when it fails
and which takes a M_WAITOK/M_NOWAIT flag argument.

Add compatibility isa_dmainit() macro which whines loudly if
isa_dma_init() fails.

Problem uncovered by:	tegge
2004-09-15 12:09:50 +00:00
phk
fe0362c5d2 Remove now unused #include files. 2004-09-15 12:02:35 +00:00
alc
acf520c2b0 Use an atomic op to update the pte in pmap_protect(). This is to prevent
the loss of a page modified (PG_M) bit in a race between processors.

Quoting Tor:
	One scenario where the old code could cause a lost PG_M bit is a
	multithreaded linux program (or FreeBSD program using the
	linuxthreads port) where one thread was starting a subprocess.
	The thread doing fork() would call vmspace_fork(), which would then
	call vm_map_copy_entry() which would call pmap_protect() on an area
	possibly accessed by other threads.

Additionally, make the clearing of PG_M by pmap_protect() unconditional if
write permission is removed.  Previously, PG_M could persist on a read-only
unmanaged page.  That seems inconsistent and confusing.

In collaboration with: tegge@

MT5 candidate
PR: 61852
2004-09-12 20:20:40 +00:00
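
In outline, the race-free update is a compare-and-set retry loop rather than a
plain read-modify-write of the PTE.  A sketch under the usual amd64 pmap names
(PG_RW, PG_M, atomic_cmpset_long); the committed code carries more detail,
such as transferring PG_M to the vm_page.

    pt_entry_t obits, pbits;

    retry:
    obits = pbits = *pte;
    if ((prot & VM_PROT_WRITE) == 0)
            pbits &= ~(PG_RW | PG_M);       /* drop write and modified bits */
    if (pbits != obits &&
        !atomic_cmpset_long((u_long *)pte, obits, pbits)) {
            /* Another CPU changed the PTE (e.g. set PG_M); retry. */
            goto retry;
    }
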
scottl
136dd28a2e Double the number of kernel page tables for amd64 and for i386/PAE. The old
value was only enough for 8GB of RAM, the new value can do 16GB.  This still
isn't optimal since it doesn't scale.  Fixing this for amd64 looks to be
fairly easy, but for i386 will be quite difficult.

Reviewed by: peter
2004-09-11 01:31:26 +00:00
wpaul
a2f7a53a34 Add device driver support for the VIA Networking Technologies
VT6122 gigabit ethernet chip and integrated 10/100/1000 copper PHY.
The vge driver has been added to GENERIC for i386, pc98 and amd64,
but not to sparc or ia64 since I don't have the ability to test
it there. The vge(4) driver supports VLANs, checksum offload and
jumbo frames.

Also added the lge(4) and nge(4) drivers to GENERIC for i386 and
pc98 since I was in the neighborhood. There's no reason to leave them
out anymore.
2004-09-10 20:57:46 +00:00
alc
ffa5e13a9f Use atomic ops in pmap_clear_ptes() to prevent SMP races that could
result in the loss of an accessed or modified bit from the pte.

In collaboration with: tegge@

MT5 candidate
2004-09-08 18:58:29 +00:00
scottl
e23092c9a9 Fix a problem with tag->boundary inheritance that has existed since day one
and was propagated to nearly every platform.  The boundary of the child needs
to consider the boundary of the parent and pick the minimum of the two, not
the maximum.  However, if either is 0 then pick the appropriate one.
This bug was exposed by a recent change to ATA, which should now be fixed by
this change.  The alignment and maxsegsz tag attributes likely also need
a similar review in the near future.

This is a MT5 candidate.

Reviewed by: marcel
Submitted by: sos (in part)
2004-09-08 04:54:19 +00:00
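
The corrected inheritance rule reduces to taking the minimum of the parent's
and the new tag's boundary, with zero meaning "no restriction".  A sketch of
the relevant piece of bus_dma_tag_create(); the surrounding code is elided.

    if (parent != NULL && parent->boundary != 0) {
            if (boundary == 0)
                    boundary = parent->boundary;
            else
                    boundary = MIN(boundary, parent->boundary);
    }
    newtag->boundary = boundary;
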
scottl
061851f776 Switch the default scheduler to 4BSD to match what will go into RELENG_5 soon.
It can be switched back once 5.3 is tested and released.  Also turn on
PREEMPTION as many of the stability problems with it have been fixed.

MT5: 3 days.
2004-09-07 22:37:43 +00:00
julian
5813d27029 Refactor a bunch of scheduler code to give basically the same behaviour
but with slightly cleaned up interfaces.

The KSE structure has become the same as the "per thread scheduler
private data" structure. In order to not make the diffs too great
one is #defined as the other at this time.

The KSE (or td_sched) structure is  now allocated per thread and has no
allocation code of its own.

Concurrency for a KSEGRP is now kept track of via a simple pair of counters
rather than using KSE structures as tokens.

Since the KSE structure is different in each scheduler, kern_switch.c
is now included at the end of each scheduler. Nothing outside the
scheduler knows the contents of the KSE (aka td_sched) structure.

The fields in the ksegrp structure that are to do with the scheduler's
queueing mechanisms are now moved to the kg_sched structure.
(per ksegrp scheduler private data structure). In other words how the
scheduler queues and keeps track of threads is no-one's business except
the scheduler's. This should allow people to write experimental
schedulers with completely different internal structuring.

A scheduler call sched_set_concurrency(kg, N) has been added that
notifies the scheduler that no more than N threads from that ksegrp
should be allowed to be concurrently scheduled. This is also
used to enforce 'fairness' at this time so that a ksegrp with
10000 threads cannot swamp the run queue and force out a process
with 1 thread, since the current code will not set the concurrency above
NCPU, and both schedulers will not allow more than that many
onto the system run queue at a time. Each scheduler should eventually
develop its own methods to do this now that they are effectively separated.

Rejig libthr's kernel interface to follow the same code paths as
libkse for scope system threads. This has slightly hurt libthr's performance
but I will work to recover as much of it as I can.

Thread exit code has been cleaned up greatly.
exit and exec code now transitions a process back to
'standard non-threaded mode' before taking the next step.
Reviewed by:	scottl, peter
MFC after:	1 week
2004-09-05 02:09:54 +00:00
scottl
d9af98161a Turn PREEMPTION into a kernel option. Make sure that it's defined if
FULL_PREEMPTION is defined.  Add a runtime warning to ULE if PREEMPTION is
enabled (code inspired by the PREEMPTION warning in kern_switch.c).  This
is a possible MT5 candidate.
2004-09-02 18:59:15 +00:00
julian
8354ba9e3a Give the 4bsd scheduler the ability to wake up idle processors
when there is new work to be done.

MFC after:	5 days
2004-09-01 06:42:02 +00:00
julian
e9d9514975 Give setrunqueue() and sched_add() more of a clue as to
where they are coming from and what is expected from them.

MFC after:	2 days
2004-09-01 02:11:28 +00:00
julian
2782d4b3fc Remove an unneeded argument.
The removed argument could trivially be derived from the remaining one.
That in turn should be the same as curthread, but it is possible that
curthread could be expensive to derive on some systems, so leave it as an
argument.  Having both proc and thread as arguments just gives an
opportunity for them to get out of sync.

MFC after:	3 days
2004-08-31 07:34:54 +00:00
julian
ee753ed190 Remove sched_free_thread() which was only used
in diagnostics. It has outlived its usefulness and has started
causing panics for people who turn on DIAGNOSTIC, in what is otherwise
good code.

MFC after:	2 days
2004-08-31 06:12:13 +00:00
peter
43365c0b41 Add the mp_watchdog hooks, although it locks up my SMP test box. It might
be useable to somebody.
2004-08-30 23:33:33 +00:00
alc
6b508bc507 Remove unnecessary check for curthread == NULL. 2004-08-30 03:52:05 +00:00
obrien
0fe47008f6 s/smp_rv_mtx/smp_ipi_mtx/g
Requested by:	jhb
2004-08-28 00:49:55 +00:00
arved
78d5f4b4e2 Fix a comment, IA32 was renamed to COMPAT_IA32
Approved by:	marcel
2004-08-27 21:29:20 +00:00
marcel
01fd13440d Move the kernel-specific logic to adjust frompc from MI to MD. For
these two reasons:
1. On ia64 a function pointer does not hold the address of the first
   instruction of a function's implementation. It holds the address
   of a function descriptor. Hence the user(), btrap(), eintr() and
   bintr() prototypes are wrong for getting the actual code address.
2. The logic forces interrupt, trap and exception entry points to
   be laid out contiguously. This cannot be achieved on ia64 and is
   generally just bad programming.

The MCOUNT_FROMPC_USER macro is used to set the frompc argument to
some kernel address which represents any frompc that falls outside
the kernel text range. The macro can expand to ~0U to bail out in
that case.
The MCOUNT_FROMPC_INTR macro is used to set the frompc argument to
some kernel address to represent a call to a trap or interrupt
handler. This avoids having the trap or interrupt handler appear to
be called from everywhere in the call graph. The macro can expand
to ~0U to prevent adjusting frompc. Note that the argument is selfpc,
not frompc.

This commit defines the macros on all architectures equivalently to
the original code in sys/libkern/mcount.c. People can take it from
here...

Compile-tested on: alpha, amd64, i386, ia64 and sparc64
Boot-tested on: i386
2004-08-27 19:42:35 +00:00
alc
d0b59a4d47 The machine-independent parts of the virtual memory system always pass a
valid pmap to the pmap functions that require one.  Remove the checks for
NULL.  (These checks have their origins in the Mach pmap.c that was
integrated into BSD.  None of the new code written specifically for
FreeBSD included them.)
2004-08-27 19:06:17 +00:00
andre
d243747d92 Always compile PFIL_HOOKS into the kernel and remove the associated kernel
compile option.  All FreeBSD packet filters now use the PFIL_HOOKS API and
thus it becomes a standard part of the network stack.

If no hooks are connected the entire packet filter hooks section and related
activities are jumped over.  This removes any performance impact if no hooks
are active.

Both OpenBSD and DragonFlyBSD have integrated PFIL_HOOKS permanently as well.
2004-08-27 15:16:24 +00:00
jhb
325fe79e0c Correct the arguments to kern_sigaltstack() as they were reversed.
PR:		kern/68079
Submitted by:	Georg-W. Koltermann gwk at rahn-koltermann dot de
2004-08-24 20:52:52 +00:00
njl
62d6f572a8 Catch up with i386 nexus.c rev 1.59: add bus_get_resource_list(). 2004-08-24 19:22:54 +00:00
peter
dc3c5ad492 It is now an error to call pmap_unuse_pt without the paddr of the pde
that contained the pte.
2004-08-24 00:17:52 +00:00
peter
878672b652 Oops, I forgot to have the idle loop call mp_grab_cpu_hlt() on the amd64
SMP case.
2004-08-24 00:16:43 +00:00
peter
326b7f663e Commit Doug White and Alan Cox's fix for the cross-ipi smp deadlock.
We were obtaining different spin mutexes (which disable interrupts after
acquisition) and spin waiting for delivery.  For example, KSE processes
do LDT operations which use smp_rendezvous, while other parts of the
system are doing things like tlb shootdowns with a different mutex.

This patch uses the common smp_rendezvous mutex for all MD home-grown
IPIs that spinwait for delivery.  Having the single mutex means that
the spin loop to acquire it will enable interrupts periodically, thus
avoiding the cross-ipi deadlock.

Obtained from: dwhite, alc
Reviewed by:   jhb
2004-08-23 21:39:29 +00:00
peter
b23c8d6fe5 Sync with i386 - Optimize intr_execute_handlers a bit etc. 2004-08-16 23:12:30 +00:00
peter
f1342b61c3 Sync with i386 - remove unused includes 2004-08-16 23:10:46 +00:00
peter
846b1ee92d Sync with i386 - get the softc via the devclass rather than caching the dev 2004-08-16 23:10:18 +00:00
peter
2bd0b02c86 Sync with i386 - add ADAPTIVE_GIANT, remove pcic 2004-08-16 22:59:24 +00:00
peter
7f3727c1da Sync with i386 - add foot shooting protection for the DDB/KDB thing. 2004-08-16 22:57:47 +00:00
peter
a6185d4b05 Sync with i386 - set rbp reg to 0 for upcalls as a frame marker, not that
it is guaranteed to be used in userland though.
2004-08-16 22:57:13 +00:00
peter
9f0b195d9f Sync with i386 - trace syscall entry/exit times, and a cosmetic fix. 2004-08-16 22:56:20 +00:00
peter
89798174d7 Sync with i386 - fix bounds check in lapic_create() 2004-08-16 22:55:32 +00:00
peter
42ec62e265 Sync with i386 - pass resource requests up to parent 2004-08-16 22:54:50 +00:00
peter
03a8108c60 Sync with i386 - s/cpu_swtch/cpu_switch/ 2004-08-16 22:53:29 +00:00
peter
b543ac08f9 Sync with i386 - don't count needed bounce pages if loading a buffer that
was created with bus_dmamem_alloc()
2004-08-16 22:53:03 +00:00
peter
3948d5b6c5 Sync with i386 - cosmetic fixes 2004-08-16 22:52:02 +00:00
peter
4fb7f8b389 Catch up with i386 - remove lots of no longer used symbolic constants 2004-08-16 22:51:44 +00:00
peter
2800cddd5f Sync with i386 2004-08-16 22:51:13 +00:00
obrien
9dc119c4b6 Complete 'IA32' -> 'COMPAT_IA32' change for the Linuxulator32. 2004-08-16 12:51:33 +00:00
tjr
72f8d44703 Un-comment LINPROCFS. 2004-08-16 12:39:27 +00:00
obrien
cdbbacbd49 I missed an 'IA32' in the documentation. 2004-08-16 11:15:46 +00:00
obrien
6cd572e133 I'm not sure what tjr envisioned for turning on FreeBSD/i386 rt support,
but make it COMPAT_IA32 for now.
Fix the 'DEBUG' argument code to unbreak the amd64 LINT build.
2004-08-16 11:09:59 +00:00
obrien
9d0b1233c4 Fix the 'DEBUG' argument code to unbreak the amd64 LINT build. 2004-08-16 10:54:25 +00:00
tjr
5847463612 Regen. 2004-08-16 08:07:06 +00:00
tjr
8a2532c456 Add preliminary support for running 32-bit Linux binaries on amd64, enabled
with the COMPAT_LINUX32 option. This is largely based on the i386 MD Linux
emulations bits, but also builds on the 32-bit FreeBSD and generic IA-32
binary emulation work.

Some of this is still a little rough around the edges, and will need to be
revisited before 32-bit and 64-bit Linux emulation support can coexist in
the same kernel.
2004-08-16 07:55:06 +00:00
rwatson
6f9a1cc98b Preemptive anti-footshooting: cause a #error if MP_WATCHDOG is compiled
with SCHED_ULE.
2004-08-15 20:32:40 +00:00
rwatson
abf6ea2973 Add an "options MP_WATCHDOG" to i386. This option allows one of the
logical CPUs on a system to be used as a dedicated watchdog to cause a
drop to the debugger and/or generate an NMI to the boot processor if
the kernel ceases to respond.  A sysctl enables the watchdog running
out of the processor's idle thread; a callout is launched to reset a
timer in the watchdog.  If the callout fails to reset the timer for ten
seconds, the watchdog will fire.  The sysctl allows you to select which
CPU will run the watchdog.

A sample "debug.leak_schedlock" is included, which causes a sysctl to
spin holding sched_lock in order to trigger the watchdog.  On my Xeons,
the watchdog is able to detect this failure mode and break into the
debugger, which cannot otherwise be done without an NMI button.

This option does not currently work with sched_ule due to ule's push
notion of scheduling, similar to machdep.hlt_logical_cpus failing to
work with that scheduler.

On face value, this might seem somewhat inefficient, but there are a
lot of dual-processor Xeons with HTT around, so using one as a watchdog
for testing is not as inefficient as one might fear.
2004-08-15 18:02:09 +00:00
ambrisko
30b952654a Fix the memory scaling bug when basemem was converted to Kbytes from
bytes for AMD64.  Otherwise the AP will be started at 640K which
won't work.  Bug found on a Xeon 64bit system.
2004-08-13 22:30:55 +00:00
davidxu
54683a131e Mark end of frames. 2004-08-11 23:23:05 +00:00
marcel
fbbaea5f90 Add __elfN(dump_thread). This function is called from __elfN(coredump)
to allow dumping per-thread machine specific notes. On ia64 we use this
function to flush the dirty registers onto the backingstore before we
write out the PRSTATUS notes.

Tested on: alpha, amd64, i386, ia64 & sparc64
Not tested on: arm, powerpc
2004-08-11 02:35:06 +00:00
davidxu
8c3963846d As AMD64 architecture volume 1 chapter 3.1.2 says, high 32 bits of %rflags
are reserved; they can be written with anything, but they always read
as zero.  We should simulate this in set_regs() as if we were reading/writing
the real hardware %rflags register.
2004-08-10 12:15:27 +00:00
davidxu
9ec724bfc3 In syscall, always make a copy of parameters from the trapframe. This is
because some syscalls using set_mcontext can sneakily change
parameters, and later when those syscalls reference parameters,
they would wrongly use register values in mcontext_t.

Approved by: peter
2004-08-09 23:57:59 +00:00
alc
1ac3389eba With the advent of pmap locking it makes sense for pmap_copy() to be less
forgiving about inconsistencies in the source pmap.  Also, remove a new-
line character terminating a nearby panic string.
2004-08-08 00:31:58 +00:00
scottl
ec3b17da1d Move the definition of M_MEMDESC to a non-optional file. This allows
kernel configurations without the 'mem' device to compile.
2004-08-07 06:21:37 +00:00
alc
584ed250ae Eliminate a variable that became unused in the i386 to amd64 conversion. 2004-08-07 04:31:26 +00:00
markm
b665823aa4 MFi386: Fix mem device. Grrr. 2004-08-06 07:22:36 +00:00
markm
cce2e351df MFi386: sort out the mem device. Grrrr. 2004-08-06 07:20:32 +00:00
markm
124b176fef Fix module builds for i386 and amd64. 2004-08-04 18:30:31 +00:00
alc
97dc3be9d1 Post-locking clean up/simplification, particularly, the elimination of
vm_page_sleep_if_busy() and the page table page's busy flag as a
synchronization mechanism on page table pages.

Also, relocate the inline pmap_unwire_pte_hold() so that it can be used
to shorten _pmap_unwire_pte_hold() on alpha and amd64.  This places
pmap_unwire_pte_hold() next to a comment that more accurately describes
it than _pmap_unwire_pte_hold().
2004-08-04 18:04:44 +00:00
markm
f516045149 Making a loadable null.ko for /dev/(null|zero) proved rather
unpopular, so remove this (mis)feature.

Encouragement provided by:	jhb (and others)
2004-08-03 19:24:54 +00:00
mux
35780dc21a Instead of calling ia32_pause() conditionally on __i386__ or __amd64__
being defined, define and use a new MD macro, cpu_spinwait().  It only
expands to something on i386 and amd64, so the compiled code should be
identical.

Name of the macro found by:	jhb
Reviewed by:	jhb
2004-08-03 18:44:27 +00:00
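
A sketch of the macro and a typical call site.  On i386 and amd64 it expands
to ia32_pause() (the PAUSE instruction); elsewhere it expands to nothing.  The
ap_started flag below is purely illustrative.

    #if defined(__i386__) || defined(__amd64__)
    #define cpu_spinwait()  ia32_pause()
    #else
    #define cpu_spinwait()  /* nothing */
    #endif

    /* Typical spin-wait loop: */
    while (atomic_load_acq_int(&ap_started) == 0)
            cpu_spinwait();
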
dfr
8354a9598d Add style(9) foolishness. 2004-08-03 08:21:48 +00:00
markm
f2823af1c4 Diff reduction WRT i386 version. 2004-08-02 20:36:47 +00:00
dfr
090a3e72f1 Add definitions for TLS relocations. 2004-08-02 19:12:17 +00:00
obrien
2a58f76a64 Fix the build by providing 'PHYS_TO_DMAP' and 'M_MEMDESC'. 2004-08-02 02:07:20 +00:00
markm
b8bae5430c Add the I/O device for those architectures that have it. 2004-08-01 19:37:34 +00:00
scottl
30cf65f35d Turn off PREEMPTION by default while it gets debugged. It's been causing
4 weeks of problems including deadlocks and instant panics.  Note that the
real bugs are likely in the scheduler.
2004-08-01 14:31:45 +00:00
markm
a6c822020d Break out the MI part of the /dev/[k]mem and /dev/io drivers into
their own directory and module, leaving the MD parts in the MD
area (the MD parts _are_ part of the modules). /dev/mem and /dev/io
are now loadable modules, thus taking us one step further towards
a kernel created entirely out of modules. Of course, there is nothing
preventing the kernel from having these statically compiled.
2004-08-01 11:40:54 +00:00
davidxu
3720a53426 Turn on PCB_FULLCTX for set_mcontext; functions like kse_switchin
need to fully restore an asynchronous context which did not come
from a fast syscall.
2004-07-31 14:02:29 +00:00
alc
add997e1ff Add pmap locking to pmap_object_init_pt(). 2004-07-31 06:42:05 +00:00
ps
1be1b43db4 MFia64:
Fix -O builds with gcc 3.4 by defining ffs as __builtin_ffs instead
of creating an inline function that just calls __builtin_ffs.
2004-07-30 16:44:29 +00:00
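
The change amounts to replacing the inline wrapper with a plain macro, roughly:

    /*
     * Before: a static __inline int ffs(int) that merely returned
     * __builtin_ffs(mask), which broke -O builds with gcc 3.4.
     * After: let the preprocessor substitute the builtin directly.
     */
    #define ffs(x)  __builtin_ffs(x)
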
alc
ff80a7c7f2 Advance the state of pmap locking on alpha, amd64, and i386.
- Enable recursion on the page queues lock.  This allows calls to
   vm_page_alloc(VM_ALLOC_NORMAL) and UMA's obj_alloc() with the page
   queues lock held.  Such calls are made to allocate page table pages
   and pv entries.
 - The previous change enables a partial reversion of vm/vm_page.c
   revision 1.216, i.e., the call to vm_page_alloc() by vm_page_cowfault()
   now specifies VM_ALLOC_NORMAL rather than VM_ALLOC_INTERRUPT.
 - Add partial locking to pmap_copy().  (As a side-effect, pmap_copy()
   should now be faster on i386 SMP because it no longer generates IPIs
   for TLB shootdown on the other processors.)
 - Complete the locking of pmap_enter() and pmap_enter_quick().  (As of now,
   all changes to a user-level pmap on alpha, amd64, and i386 are performed
   with appropriate locking.)
2004-07-29 18:56:31 +00:00
kan
c1bceab802 Use newly added __used attribute to keep static function symbol from
being eliminated.
2004-07-29 18:02:28 +00:00
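
A minimal illustration of the annotation; the function name is made up, and
__used is the <sys/cdefs.h> spelling of GCC's "used" attribute.

    /* Without __used, the compiler would discard this static symbol
     * because it is referenced only from inline assembly. */
    static void __used
    profiling_marker(void)
    {
    }
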
phk
98d8f3741c Move a relic to its correct location(s): Put nfs diskless initialization
calls with the code they call.  (Yet another example of mindless copy&paste).
2004-07-28 21:54:57 +00:00
rwatson
4ab080249a Pass a thread argument into cpu_critical_{enter,exit}() rather than
dereference curthread.  It is called only from critical_{enter,exit}(),
which already dereferences curthread.  This doesn't seem to affect SMP
performance in my benchmarks, but improves MySQL transaction throughput
by about 1% on UP on my Xeon.

Head nodding:	jhb, bmilekic
2004-07-27 16:41:01 +00:00
imp
5a3e9fbaf3 Remove ahb, aha, ie, le and wl devices. They are all ISA/EISA only.
I went ahead and left in the ISA cards that also have pccard
attachments.  There's no way that these devices could attach.

OK'd by: peter
2004-07-22 22:29:45 +00:00
imp
0973e758b4 There is no pcic device on amd64. OLDCARD isn't supported,
NEWCARD will call it something different, and there are no ISA add-in
devices.
2004-07-22 22:28:34 +00:00
marcel
f77d7b9449 Unify db_stack_trace_cmd(). All it did was look up the thread given
the thread ID and call db_trace_thread().
Since arm has all the logic in db_stack_trace_cmd(), rename the
new DB_COMMAND function to db_stack_trace to avoid conflicts on
arm.
While here, have db_stack_trace parse its own arguments so that
we can use a more natural radix for IDs. If the ID is not a thread
ID, or more precisely when no thread exists with that ID, check whether
there's a process with that ID and return the first thread in it.
This makes it easier to print stack traces from the ps output.

requested by: rwatson@
tested on: amd64, i386, ia64
2004-07-21 05:07:09 +00:00
alc
55d33c6b7d Remove the allpmaps list. It's unused.
Reviewed by: peter@
2004-07-20 02:40:56 +00:00
jhb
d4ea20b7e1 As a temporary hack, turn off deferred preemptions that are the result of
a fast interrupt handler doing an swi_sched().  This fixed the lockups I
saw on my laptop when using xmms in KDE and on rwatson's MySQL benchmarks
on SMP.  This will eventually be removed and/or modified when I figure out
what the root cause is and fix that.
2004-07-19 16:37:47 +00:00
das
86c293bf54 Make FLT_ROUNDS correctly reflect the dynamic rounding mode. 2004-07-19 08:17:25 +00:00
scottl
7ad60316cd Enable ADAPTIVE_MUTEXES by default by changing the sense of the option to
NO_ADAPTIVE_MUTEXES.  This option has been enabled by default on amd64 for
quite some time, and has been extensively tested on i386 and sparc64.  It
shows measurable performance gains in many circumstances, and few negative
effects.  It would be nice in the future if adaptive mutexes actually went
to sleep after a certain amount of spinning, but that will require quite a
bit more testing.
2004-07-18 15:59:03 +00:00
maxim
531ff1b33d In -CURRENT pseudo devices are not statically assigned at compile time,
remove a stale comment.

PR:		kern/62285
2004-07-18 09:03:12 +00:00
ps
8f0acfb821 Fix the build. pcm is no more. 2004-07-16 21:48:30 +00:00
alc
123cfa6b64 Push down the acquisition and release of the page queues lock into
pmap_protect() and pmap_remove().  In general, they require the lock in
order to modify a page's pv list or flags.  In some cases, however,
pmap_protect() can avoid acquiring the lock.
2004-07-15 18:00:43 +00:00
peter
363f0d38d5 Like on i386, eliminate pv_ptem (which was suggested by alc). This
reduces the size of the pv_entry structure a small but significant amount.

This is implemented a little differently because it isn't so cheap to get
the physical address of the page table page on amd64.  Instead of it
being directly accessible from the top level page directory, it is now
two additional tree levels down.  However, in almost all cases we
recently had the physical address of the page table page a short while
before we needed it, but it slipped through our fingers.  This patch
saves it for when we do need it.  Also, for the one case where we do not
have the ptp paddr, we are always running in curproc context and so we
can do a vtopte-like trick.  I've implemented vtopde() for this purpose.

There is still a CYA entry in pmap_unuse_pt() that needs to be removed.  I
think it can be removed now but I forgot to test with it gone.
2004-07-14 07:13:35 +00:00
davidxu
0a29616acf Add ptrace_clear_single_step(); alpha has had it for years.  The function
will be used by ptrace to clear a thread's single-step state.
2004-07-13 07:22:56 +00:00
alc
ea94acfaa6 Push down the acquisition and release of the page queues lock into
pmap_remove_pages().  (The implementation of pmap_remove_pages() is
optional.  If pmap_remove_pages() is unimplemented, the acquisition and
release of the page queues lock is unnecessary.)

Remove spl calls from the alpha, arm, and ia64 pmap_remove_pages().
2004-07-13 02:49:22 +00:00
marcel
62f362bd8f MFi386: rev 1.213 -- fix DELAY while the debugger is active.
This also fixes the (runtime) breakage introduced in the previous
commit that was the result of a botched merge. This hasn't even
been compile-tested...
2004-07-11 18:07:55 +00:00
marcel
f18ce8babb Add options KDB and GDB. KDB takes on the function of what DDB used
to be. Both DDB and GDB specify which KDB backends to include.
2004-07-11 03:20:09 +00:00
marcel
1a65a2060a Remove the now unused GDB stubs. See src/sys/gdb/* for the new KDB
backend.
2004-07-11 01:47:26 +00:00
marcel
aae5483213 Mega update for the KDB framework: turn DDB into a KDB backend.
Most of the changes are a direct result of adding thread awareness.
Typically, DDB_REGS is gone. All registers are taken from the
trapframe and backtraces use the PCB based contexts. DDB_REGS was
defined to be a trapframe on all platforms anyway.
Thread awareness introduces the following new commands:
	thread X	switch to thread X (where X is the TID),
	show threads	list all threads.

The backtrace code has been made more flexible so that one can
create backtraces for any thread by giving the thread ID as an
argument to trace.

With this change, ia64 has support for breakpoints.
2004-07-10 23:47:20 +00:00
marcel
1339cad36a MFi386: don't fake the time counter when the debugger is active.
This breaks the fundamental property of DELAY(). Instead, avoid
grabbing clock_lock when kdb_active is non-zero.
2004-07-10 22:42:22 +00:00
marcel
5d30d6766f Remove obsolete prototype of kdb_trap(). 2004-07-10 22:39:56 +00:00
marcel
a2239c0d3c Update for the KDB framework:
o  Make debugging support conditional upon KDB instead of DDB.
o  Remove implementation of Debugger().
o  Don't make setjump() and longjump() conditional upon DDB.
o  s/ddb_on_nmi/kdb_on_nmi/g
o  Call kdb_reenter() when kdb_active is non-zero. Call kdb_trap()
   otherwise.
2004-07-10 22:39:17 +00:00
marcel
006d404cf7 Implement makectx(). The makectx() function is used by KDB to create
a PCB from a trapframe for purposes of unwinding the stack. The PCB
is used as the thread context and all but the thread that entered the
debugger has a valid PCB.
This function can also be used to create a context for the threads
running on the CPUs that have been stopped when the debugger got
entered. This however is not done at the time of this commit.
2004-07-10 19:56:00 +00:00
marcel
60b53542e9 Introduce the KDB debugger frontend. The frontend provides a framework
in which multiple (presumably different) debugger backends can be
configured and which provides basic services to those backends.
Besides providing services to backends, it also serves as the single
point of contact for any and all code that wants to make use of the
debugger functions, such as entering the debugger or handling of the
alternate break sequence. For this purpose, the frontend has been
made non-optional.
All debugger requests are forwarded or handed over to the current
backend, if applicable. Selection of the current backend is done by
the debug.kdb.current sysctl. A list of configured backends can be
obtained with the debug.kdb.available sysctl. One can enter the
debugger by writing to the debug.kdb.enter sysctl.
2004-07-10 18:40:12 +00:00
marcel
6e0dcca8a9 Introduce the GDB debugger backend for the new KDB framework. The
backend improves over the old GDB support in the following ways:
o  Unified implementation with minimal MD code.
o  A simple interface for devices to register themselves as debug
   ports, ala consoles.
o  Compression by using run-length encoding.
o  Implements GDB threading support.
2004-07-10 17:47:22 +00:00
brian
aae31dbf32 Change the following environment variables to kernel options:
bootp -> BOOTP
    bootp.nfsroot -> BOOTP_NFSROOT
    bootp.nfsv3 -> BOOTP_NFSV3
    bootp.compat -> BOOTP_COMPAT
    bootp.wired_to -> BOOTP_WIRED_TO

- i.e. back out the previous commit.  It's already possible to
pxeboot(8) with a GENERIC kernel.

Pointed out by: dwmalone
2004-07-08 22:35:36 +00:00
brian
2821a50eaa Change the following kernel options to environment variables:
BOOTP -> bootp
    BOOTP_NFSROOT -> bootp.nfsroot
    BOOTP_NFSV3 -> bootp.nfsv3
    BOOTP_COMPAT -> bootp.compat
    BOOTP_WIRED_TO -> bootp.wired_to

This lets you PXE boot with a GENERIC kernel by putting this sort of thing
in loader.conf:

    bootp="YES"
    bootp.nfsroot="YES"
    bootp.nfsv3="YES"
    bootp.wired_to="bge1"

or even setting the variables manually from the OK prompt.
2004-07-08 13:40:33 +00:00
peter
e3e493024d MFi386: various io apic cleanups 2004-07-08 01:42:49 +00:00
peter
dd3c90cb13 MFi386: use rman access methods instead of groping around inside
struct resource
2004-07-08 01:34:24 +00:00
peter
fc114e00d8 MFi386: whitespace nit fix (spare blank line) 2004-07-08 01:32:25 +00:00
peter
e5ce867c01 MFi386: fix up CR0 settings 2004-07-08 01:31:13 +00:00
peter
bb56fe721b MFi386: 1.57: transparently respect alignment/boundary tags 2004-07-08 01:28:33 +00:00
alc
d7d59aa241 Simplify the control flow in pmap_extract(), enabling the elimination of a
PMAP_UNLOCK() call.
2004-07-07 16:47:58 +00:00
alc
fda3c26a4c White space and style changes only. 2004-07-07 02:23:46 +00:00
alc
88e9582a49 Style changes to pmap_extract(). 2004-07-06 02:33:11 +00:00
jhb
696704716d Implement preemption of kernel threads natively in the scheduler rather
than as one-off hacks in various other parts of the kernel:
- Add a function maybe_preempt() that is called from sched_add() to
  determine if a thread about to be added to a run queue should be
  preempted to directly.  If it is not safe to preempt or if the new
  thread does not have a high enough priority, then the function returns
  false and sched_add() adds the thread to the run queue.  If the thread
  should be preempted to but the current thread is in a nested critical
  section, then the flag TDF_OWEPREEMPT is set and the thread is added
  to the run queue.  Otherwise, mi_switch() is called immediately and the
  thread is never added to the run queue since it is switched to directly.
  When exiting an outermost critical section, if TDF_OWEPREEMPT is set,
  then clear it and call mi_switch() to perform the deferred preemption.
- Remove explicit preemption from ithread_schedule() as calling
  setrunqueue() now does all the correct work.  This also removes the
  do_switch argument from ithread_schedule().
- Do not use the manual preemption code in mtx_unlock if the architecture
  supports native preemption.
- Don't call mi_switch() in a loop during shutdown to give ithreads a
  chance to run if the architecture supports native preemption since
  the ithreads will just preempt DELAY().
- Don't call mi_switch() from the page zeroing idle thread for
  architectures that support native preemption as it is unnecessary.
- Native preemption is enabled on the same archs that supported ithread
  preemption, namely alpha, i386, and amd64.

This change should largely be a NOP for the default case as committed
except that we will do fewer context switches in a few cases and will
avoid the run queues completely when preempting.

Approved by:	scottl (with his re@ hat)
2004-07-02 20:21:44 +00:00
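
In rough outline, the new hand-off in sched_add() looks like the sketch below;
it is based on the description above rather than the committed code, and the
priority and ksegrp bookkeeping around it is elided.

    void
    sched_add(struct thread *td)
    {
            mtx_assert(&sched_lock, MA_OWNED);

            /*
             * maybe_preempt() either switches to td immediately and
             * returns non-zero, or sets TDF_OWEPREEMPT when we are in a
             * nested critical section, or declines because td's priority
             * is not high enough.  Only in the latter cases does td go
             * onto a run queue.
             */
            if (maybe_preempt(td))
                    return;
            runq_add(&runq, td->td_kse);
    }
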
imp
d698489e91 We need to make resources visible here as well. 2004-06-30 19:24:26 +00:00
njl
53a70792e9 Add machdep quirks functions. On i386, this disables acpi on systems with
BIOS dates earlier than Jan 1, 1999.  Add prototypes and quirks flags.
2004-06-30 04:42:29 +00:00
jhb
8371bbfa53 Fetch the actual acpi0 device_t and use device_is_attached() to see if
it's alive rather than trying to fetch its softc pointer via its devclass.

Glanced at by:	imp, njl
2004-06-23 17:59:01 +00:00
alc
a7339dd630 Implement the protection check required by the pmap_extract_and_hold()
specification.  This enables the elimination of Giant from that function.
2004-06-23 04:37:14 +00:00
alc
1450b79b4f - Simplify pmap_remove_pages(), eliminating unnecessary indirection.
- Simplify the locking of pmap_is_modified() by converting control flow to
   data flow.
2004-06-20 20:57:06 +00:00
alc
d711478366 Add pmap locking to pmap_is_prefaultable(). 2004-06-20 06:11:00 +00:00
bde
da4e7c693b Backed out previous commit. Blind substitution of dev_t by `struct cdev *'
was just wrong here because the dev_t's are user dev_t's.
2004-06-20 03:52:50 +00:00
alc
72c4b0bd2f Remove unused pt_entry_ts. Remove an unneeded semicolon. 2004-06-19 19:09:08 +00:00
bde
e041a584a6 Include <sys/_lock.h>'s prerequisite <sys/queue.h> before including the
former, not after.

Don't hide this bug by including <sys/queue.h> in <sys/_lock.h>.
2004-06-19 14:58:35 +00:00
peter
8c286596fc Try harder to give new processes a clean initial fpu state. fpu_cleanstate
wasn't actually clean, it was saving the xmm registers as left over by the
bios.  fninit() doesn't clear those.

In fpudna(), instead of doing a fninit() and forgetting to load the initial
mxcsr, do a full fxrstor(&fpu_cleanstate).  Otherwise we hand over whatever
random values are left in the xmm registers by the last user.

I'm not certain of whether this is excessive paranoia or not, but there was
an outright bug in neglecting to set the mxcsr value that caused awk to
SIGFPE in some case.  Especially for Tim Robbins. :-)

i386 probably should do something about the mxcsr settings too.

Found by:  tjr
2004-06-18 04:01:54 +00:00
njl
bbd8cdb07b Revert last change. If acpi is loaded or compiled into the kernel, its
devclass will be present even if the driver was disabled by a hint.  Using
device_get_softc() provides the right info even if it's overkill.

Explained by:	jhb
2004-06-17 17:27:37 +00:00
alc
4c2e464ea2 Do not preset PG_BUSY on VM_ALLOC_NOOBJ pages. Such pages are not
accessible through an object.  Thus, PG_BUSY serves no purpose.
2004-06-17 06:16:58 +00:00
phk
dfd1f7fd50 Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.
2004-06-16 09:47:26 +00:00
alc
315177e2ec Add some lock assertions. Lock a small part of pmap_enter(). 2004-06-16 07:51:19 +00:00
alc
d82b0bb246 Correct an error in the implementation of pmap_is_prefaultable(). When I
introduced this function in revision 1.441, I inverted one of the
comparisons.
2004-06-16 03:11:24 +00:00
alc
72b65ac70d Remove a stale comment. 2004-06-15 19:28:40 +00:00
alc
1ccccce78d Add pmap locking to pmap_extract(), pmap_mincore(), and pmap_remove(). 2004-06-15 07:41:44 +00:00
njl
8d89806526 We only need the devclass_find() result, not the softc. 2004-06-15 02:12:12 +00:00
alc
c14fd5dbce Introduce pmap locking to many of the pmap functions. There is more to
come later.
2004-06-14 01:17:50 +00:00
obrien
a57bdc7ec6 The majority of FreeBSD/amd64 machines are SMP, so use ADAPTIVE_MUTEXES
by default to improve performance.
2004-06-13 23:03:57 +00:00
alc
a73d5d0e50 Prevent the loss of a PG_M bit through an SMP race in pmap_ts_referenced(). 2004-06-13 21:59:42 +00:00
alc
3a5f5107c8 Remove dead or unneeded code, e.g., spl calls. 2004-06-13 19:48:38 +00:00
alc
91c65a303c - Remove an unused declaration.
- Move a definition inside the scope of a #ifdef _KERNEL.
2004-06-13 03:44:11 +00:00
alc
f6af690bde In a multiprocessor, the PG_W bit in the pte must be changed atomically.
Otherwise, the setting of the PG_M bit by one processor could be lost if
another processor is simultaneously changing the PG_W bit.

Reviewed by:	tegge@
2004-06-12 20:01:48 +00:00
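
Concretely, the wiring update becomes an atomic read-modify-write on the PTE
instead of a separate load, modify and store; a sketch:

    /* An atomic bit update cannot race with another CPU setting PG_M. */
    if (wired && (*pte & PG_W) == 0)
            atomic_set_long((u_long *)pte, PG_W);
    else if (!wired && (*pte & PG_W) != 0)
            atomic_clear_long((u_long *)pte, PG_W);
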
phk
86602fc06c Deorbit COMPAT_SUNOS.
We inherited this from the sparc32 port of BSD4.4-Lite1.  We have neither
a sparc32 port nor a SunOS4.x compatibility desire these days.
2004-06-11 11:16:26 +00:00
peter
d7de00696f Argh. Add the mini-stack-frame back in for mcount's benefit for syscall
stubs.
2004-06-10 22:02:26 +00:00
peter
0fe04edc1b Make profiling work for varargs functions.. %al is an additional argument
which indicates the number of xmm registers used in the varargs.  This
stops the explosion that happened when profiling printf() etc.
2004-06-10 22:00:58 +00:00
peter
59fbae3b46 Insta-MFi386: ignore disabled cpu apic id's entirely 2004-06-10 21:30:08 +00:00
jhb
ee8370535a - Use the correct devclass name ("acpi" vs "ACPI") to detect if acpi0 is
present and thus that the PnPBIOS probe should be skipped instead of
  having ACPI zero out the PnPBIOStable pointer.
- Make the PnPBIOStable pointer static to i386/i386/bios.c now that that is
  the only place it is used.
2004-06-10 20:43:04 +00:00
jhb
f7c8770deb Remove atdevbase and replace its remaining uses with direct references to
KERNBASE instead.
2004-06-10 20:31:00 +00:00
peter
f9a170c34a In pmap_extract_and_hold(), there is no need to mask off PG_FRAME because
pmap_extract() already does it.
In pmap_enter(), opa has already been masked so don't do it again.
Wrap a long line (recent transgression).
Use trunc_page() in pmap_mapdev() instead of anding with PG_FRAME, since
that is what we really meant.

Submitted by:  alc (first item)
2004-06-08 02:20:40 +00:00
peter
0ef9412643 Fix my silly typo in asm statement in previous commit. 2004-06-08 01:35:48 +00:00
peter
6d14a160e7 Argh. Remove stray number that slipped into the previous commit. 2004-06-08 01:20:37 +00:00
peter
7740067a4a Reapply rev 1.151 after enable sse/fpuinit order fixed in mp_machdep.c
Obtained from:  das
2004-06-08 01:14:39 +00:00
peter
5cebad93ca Set up the fpu *after* enabling SSE mode on AP's
Submitted by: (argh, I can't find the email)
2004-06-08 01:07:51 +00:00
peter
35d8561b4c Initial PG_NX support (no-execute page bit)
- export the rest of the cpu features (and amd's features).
- turn on EFER_NXE, depending on the NX amd feature bit
- reorg the identcpu stuff a bit in order to stop treating the
  amd features as second class features (since it is now a primary feature
  bit set) and make it easier to export.
2004-06-08 01:02:52 +00:00
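
A hedged sketch of the EFER_NXE part of this change; rdmsr()/wrmsr() and the MSR_EFER, EFER_NXE and AMDID_NX constants come from <machine/cpufunc.h> and <machine/specialreg.h>, while amd_feature and pg_nx stand in for the identcpu/pmap globals and may not match the exact names used:

    #include <sys/types.h>
    #include <machine/cpufunc.h>        /* rdmsr(), wrmsr() */
    #include <machine/specialreg.h>     /* MSR_EFER, EFER_NXE, AMDID_NX */

    extern u_int amd_feature;           /* AMD extended CPUID feature flags */
    static uint64_t pg_nx;              /* stand-in for the pmap's no-execute PTE bit */

    static void
    enable_nx_if_supported(void)
    {
            if (amd_feature & AMDID_NX) {
                    wrmsr(MSR_EFER, rdmsr(MSR_EFER) | EFER_NXE);
                    pg_nx = 1ULL << 63; /* bit 63 of a PTE/PDE means "no execute" */
            }
    }
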
peter
8a41fbc207 Mask the pte with PG_FRAME before passing it to PHYS_TO_VM_PAGE().  PG_NX
lives in the top 12 'available' bits.  atop() in the PHYS_TO_VM_PAGE()
macro only masks off the lower bits (by accident) and the upper bits
in the 64 bit ptes turn into "interesting" index values.
2004-06-08 00:29:42 +00:00
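
The fix amounts to stripping the flag bits before the physical-to-page conversion; a small kernel-context sketch (the PG_FRAME value is illustrative, and pt_entry_t/PHYS_TO_VM_PAGE() are assumed to come from the pmap and vm_page headers):

    #define PG_FRAME        0x000ffffffffff000UL    /* physical-frame bits only */

    static vm_page_t
    pte_to_vm_page(pt_entry_t pte)
    {
            /* Without the mask, PG_NX and the other high bits get folded into
             * the array index that PHYS_TO_VM_PAGE()/atop() computes. */
            return (PHYS_TO_VM_PAGE(pte & PG_FRAME));
    }
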
peter
e0ac92157b Use trunc_page(va) when we mean it rather than anding it with PG_FRAME
(which doesn't work all that well when there are bits at the top that are
 masked by PG_FRAME)
2004-06-08 00:11:32 +00:00
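
The same distinction in concrete numbers (standalone example; the PAGE_MASK and PG_FRAME values are shown for illustration): trunc_page() clears only the page-offset bits, while PG_FRAME also clears the top bits and so destroys the upper half of a canonical kernel virtual address.

    #include <stdint.h>

    #define PAGE_MASK       0xfffUL
    #define trunc_page(x)   ((x) & ~PAGE_MASK)
    #define PG_FRAME        0x000ffffffffff000UL

    static void
    example(void)
    {
            uint64_t va   = 0xffffffff80201234UL;   /* a kernel virtual address */
            uint64_t good = trunc_page(va);         /* 0xffffffff80201000 */
            uint64_t bad  = va & PG_FRAME;          /* 0x000fffff80201000: high bits gone */
            (void)good; (void)bad;
    }
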
peter
33636f4f34 Fix a serious problem that manifested during swap, and a few other times.
pmap_remove() would be called with a huge range and we'd stride across
it in only 2MB chunks.  This would manifest as massive cpu time and a
largely unresponsive system during hard swap.  Instead, check the higher
page directories which means we can run pmap_remove() in just a few
hundred loop iterations instead of millions since we can process
address space in chunks of 512GB and 1GB as well as 2MB.

Eternal thanks to:  tmm
2004-06-07 23:51:20 +00:00
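
The shape of the fix, as a heavily simplified sketch (the pmap_pml4e()/pmap_pdpe() lookups stand in for however the higher-level directory entries are fetched; the NB* constants follow the amd64 page-table geometry):

    #define NBPDR   (1UL << 21)         /* mapped by one PDE: 2MB */
    #define NBPDP   (1UL << 30)         /* mapped by one PDP entry: 1GB */
    #define NBPML4  (1UL << 39)         /* mapped by one PML4 entry: 512GB */

    static void
    remove_range(pmap_t pmap, vm_offset_t sva, vm_offset_t eva)
    {
            vm_offset_t va, va_next;

            for (va = sva; va < eva; va = va_next) {
                    if ((*pmap_pml4e(pmap, va) & PG_V) == 0) {
                            va_next = (va + NBPML4) & ~(NBPML4 - 1);   /* skip 512GB */
                            continue;
                    }
                    if ((*pmap_pdpe(pmap, va) & PG_V) == 0) {
                            va_next = (va + NBPDP) & ~(NBPDP - 1);     /* skip 1GB */
                            continue;
                    }
                    va_next = (va + NBPDR) & ~(NBPDR - 1);
                    /* ... remove the 4KB PTEs covered by this 2MB PDE ... */
            }
    }

With empty higher-level entries skipped this way, tearing down a sparse range takes a few hundred iterations rather than millions of 2MB steps.
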
peter
101e49d30a Be a little more consistent in the naming of the PML4 defines. 2004-06-07 23:47:59 +00:00
das
51a41f503f Back out revision 1.150, since dwmalone reports that it causes a panic
upon startup on his machine.
2004-06-06 09:16:02 +00:00
das
46f1d333d1 Initialize the MXCSR to the appropriate default value at startup.
Tested on:	tjr
2004-06-05 03:13:39 +00:00
phk
89408985ea Add new bios_string() which will hunt for a string inside a given range
of the BIOS.  This can be used for finding arbitrary magic in the BIOS
in order to recognize particular platforms.
2004-06-03 22:36:24 +00:00
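
As a rough illustration only (the name and signature below are made up, not the real bios_string() API), such a helper is essentially a bounded substring scan over a mapped BIOS range:

    #include <stddef.h>
    #include <string.h>

    static const char *
    scan_range_for_string(const char *base, size_t len, const char *needle)
    {
            size_t nlen = strlen(needle);

            for (size_t off = 0; off + nlen <= len; off++)
                    if (memcmp(base + off, needle, nlen) == 0)
                            return (base + off);    /* platform magic found */
            return (NULL);
    }
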
peter
1c9355a69a MFi386: add ixgp device 2004-06-03 21:40:41 +00:00
peter
14c9e7f788 MFi386: apic intpin programming updates etc. 2004-06-03 20:25:05 +00:00
peter
e1d648f932 MFi386: remove debug printf 2004-06-03 20:22:48 +00:00
peter
dc5552e5f9 Move module.h include to the same place as on i386 for diff reduction. 2004-06-03 20:21:30 +00:00
peter
fb5cc8376e MFi386: move cpu_nameclass struct next to its only consumer 2004-06-03 20:18:15 +00:00
tjr
48c79c9521 Remove checks for curthread == NULL - it can't happen. 2004-06-03 10:22:47 +00:00
phk
c0b3b891ee Add missing <sys/module.h> instances which were shadowed by the nested
include in <sys/kernel.h>
2004-06-03 05:58:30 +00:00
tjr
7a46b27935 Move TDF_DEADLKTREAT into td_pflags (and rename it accordingly) to avoid
having to acquire sched_lock when manipulating it in lockmgr(), uiomove(),
and uiomove_fromphys().

Reviewed by:	jhb
2004-06-03 01:47:37 +00:00
tjr
80d36400ed Move TDF_SA from td_flags to td_pflags (and rename it accordingly)
so that it is no longer necessary to hold sched_lock while
manipulating it.

Reviewed by:	davidxu
2004-06-02 07:52:36 +00:00
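
The point of this move and the TDF_DEADLKTREAT one above, sketched with illustrative flag names: td_pflags is only ever touched by the owning thread, so plain stores suffice, whereas td_flags is shared and (at this point in the tree) had to be modified under sched_lock.

    #include <sys/param.h>
    #include <sys/lock.h>
    #include <sys/mutex.h>
    #include <sys/proc.h>

    #define TDP_EXAMPLE     0x0001      /* private flag: only curthread uses it */
    #define TDF_EXAMPLE     0x0001      /* shared flag: visible to other CPUs */

    static void
    flag_example(struct thread *td)
    {
            /* Private flags: no lock needed. */
            td->td_pflags |= TDP_EXAMPLE;
            /* ... do the work ... */
            td->td_pflags &= ~TDP_EXAMPLE;

            /* Shared flags: serialized by sched_lock here. */
            mtx_lock_spin(&sched_lock);
            td->td_flags |= TDF_EXAMPLE;
            mtx_unlock_spin(&sched_lock);
    }
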
phk
d6f7d2bde6 Add some missing <sys/module.h> includes which are masked by the
one on death-row in <sys/kernel.h>
2004-05-30 17:57:46 +00:00
alc
c8b1bb8032 MFi386 revision 1.6
Reenable ithread preemption for interrupts that occur while executing in
 the kernel.
2004-05-30 04:49:39 +00:00
tjr
3793bc5d48 Implement __bb_init_func. This is a fairly straightforward conversion
of the i386 version.
2004-05-29 01:13:28 +00:00
alc
87de22f863 Remove a broken micro-optimization from pmap_enter(). The ill effect
of this micro-optimization occurs when we call pmap_enter() to wire an
already mapped page.  Because of the micro-optimization, we fail to
mark the PTE as wired.  Later, on teardown of the address space,
pmap_remove_pages() destroys the PTE before vm_fault_unwire() has
unwired the page.  (pmap_remove_pages() is not supposed to destroy
wired PTEs.  They are destroyed by a later call to pmap_remove().)
Thus, the page becomes lost.

Note: The page is not lost if the application called munlock(2), only
if it relies on teardown of the address space to unwire its pages.

For the historically inclined, this bug was introduced by a
megacommit, revision 1.182, roughly six years ago.

Leak observed by: green@ and dillon independently
Patch submitted by: dillon at backplane dot com
Reviewed by: tegge@
MFC after: 1 week
2004-05-28 19:42:02 +00:00
tmm
7b769ce88f Retire cpu_sched_exit(); it is not used any more. 2004-05-26 12:09:39 +00:00
bde
8be6dace2b Quick fix for overflow when tsc_freq >= 2^31. "int profrate" in struct
gmon and struct gmonhdr was originally just to represent the kernel
(profiling) clock frequency and it remains poorly suited to representing
the frequencies of fast counters like the TSC.  It broke a year or two
ago.  This quick fix keeps it working for another year or month or two
until TSC frequencies can exceed 2^32, by dividing the frequency by 2.
Dividing the frequency by 4 would work for a little longer but would
lose a little too much precision.
2004-05-26 09:43:38 +00:00
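
The arithmetic in a standalone example (not the kernel code): a 2.2 GHz TSC frequency no longer fits in the signed 32-bit profrate field, but half of it does, and keeps fitting until frequencies reach 2^32.

    #include <stdint.h>
    #include <stdio.h>

    int
    main(void)
    {
            uint64_t tsc_freq = 2200000000ULL;              /* 2.2 GHz > 2^31 - 1 */
            int32_t broken = (int32_t)tsc_freq;             /* wraps negative on typical ABIs */
            int32_t quick_fix = (int32_t)(tsc_freq / 2);    /* fits until tsc_freq >= 2^32 */

            printf("profrate as-is: %d, halved: %d\n", broken, quick_fix);
            return (0);
    }
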
bde
2f8036c976 Oops, ".align 4" for the data section in the previous commit should
have been ".p2align 4".  This bug is cosmetic since the data section
happens to be empty.
2004-05-24 12:42:16 +00:00
bde
5e80bb386e Fixed profiling of trap, syscall and interrupt handlers and some
ordinary functions, essentially by backing out half of rev.1.115 of
amd64/exception.S.  The handlers must be between certain labels for
the purposes of profiling, and this was broken by scattering them in
separately compiled .S files, especially for ordinary functions that
ended up between the labels.  Merge the files by #including them as
before, except with different pathnames and better comments and
organization.  Changes to the scattered files are minimal -- just
move the labels to the file that does the #includes.

This also partly fixes profiling of IPIs -- all IPI handlers are now
correctly classified as interrupt handlers, but many are still missing
mcount calls.
2004-05-24 12:08:56 +00:00
bde
f15855a457 Don't repeat the definition of IDTVEC(). It is in asmacros.h. 2004-05-24 11:28:11 +00:00
bde
aead6fc992 Added profiling support for Xint0x80_syscall. 2004-05-23 19:06:15 +00:00
bde
01dd640f6d Adjusted for amd64 after repo-copy. The adjustments are routine, except:
- perfmon headers must be avoided until perfmon is supported.
- all call-used registers including return registers must be preserved
  by .mcount(), etc., not quite as in profile.h.  __cyg_profile_func_*()
  don't require this, but they are (mis)implemented as aliases for
  .mcount(), etc. so they preserve the registers.
- i386 ifdefs related to perfmon have not been adjusted yet.
2004-05-23 18:27:14 +00:00
bde
858eefda2e Restored FAKE_MCOUNT() and MEXITCOUNT invocations and adjusted them for
amd64 as necessary.  This is routine, except:
- the FAKE_MCOUNT($bintr) in doreti was missing the '$'.  This gave a
  a garbage address made up of padding bytes (with the nop byte 0x90 as
  the MSB) instead of the intended address of bintr.  This accidentally
  worked on i386's because (0x90 << 24) is close enough to bintr, but
  it doesn't work on amd64's because (0x90 << 56) is much further away
  from bintr.
- the FAKE_MCOUNT($btrap) in calltrap was similarly broken.  It hasn't
  been needed since FreeBSD-1, so just delete it.
2004-05-23 17:18:48 +00:00
bde
f11d5307d3 Adjusted FAKE_MCOUNT()s for amd64. This is needed for both ordinary
and high resolution profiling of interrupt handlers.  The adjustments
are routine once the magic stack offset 13*4 is decoded to be TF_RIP
(there were originally more types of stack frames so using TF_EIP for
one of them wouldn't have been much simpler).

Removed garbage comments attached to some of the FAKE_MCOUNT()s.
2004-05-23 16:23:29 +00:00
bde
e3c5341a54 Spell "retq" as "ret" in pagezero() like it is everywhere else, so
that the usual macro for "ret" hides the detail of calling .mexitcount
before returning.

Fixed missing call to .mexitcount in lgdt().  This was missing on
i386's, mainly because lgdt() uses lret[q] insted of ret.  This is
very unimportant since lgdt() is not (normally?) called until after
profiling is initialized.
2004-05-23 14:56:02 +00:00
bde
37bf942f68 MFi386 (1.103 and 1.104: fixed some problems in high resolution profiling
and improved some comments).  Also, made the documented {f,s}uword()
functions the standard entry points and the undocumented {f,s}uword64()
functions alternative entry points, like {f,s}uword32() for i386's.  The
bitrot in the comments was a little larger here -- there are new undocumented
32-bit sub-word functions, not just renaming of 16-bit functions from
documented ones to undocumented ones.
2004-05-21 16:50:57 +00:00
bde
5e189e571a MFi386 (1.37: GUPROF calibration macros; only routine adjustments needed). 2004-05-20 16:22:57 +00:00
peter
acd31bc19e Like on i386, clear the last three entries in the pml4 page when doing a
pmap_release(), and put it in the free queue marked as already zeroed.
2004-05-19 21:55:37 +00:00
bde
a128646d2f Fixed the type of fptrdiff_t. It needs to be 64 bits in theory, and in
practice too since kernel addresses are almost 2^64 higher than most
user addresses.
2004-05-19 16:19:11 +00:00
bde
1808931bc7 Fixed some style bugs (mainly misalignment of backslashes). 2004-05-19 16:04:26 +00:00
bde
f309a749f5 Moved most of the "MI" definitions and declarations from <machine/profile.h>
to <sys/gmon.h>.  Cleaned them up a little by not attempting to ifdef
for incomplete and out of date support for GUPROF in userland, as in
the sparc64 version.
2004-05-19 15:41:26 +00:00
peter
fecf8b1849 Unbreak builds without DDB. Bad Bruce! No cookie! :-) 2004-05-19 01:23:48 +00:00
peter
18fd75c1c3 The 'call mcount' hooks that gcc inserts when profiling are in a place that
cannot handle the scratch registers being trashed.  So we have to preserve
them ourselves.
2004-05-18 22:52:32 +00:00
stefanf
e784f59f15 <stdint.h> should define WINT_M{AX,IN} independently of whether WCHAR_MIN is
defined.  Otherwise, first including <wchar.h> and then <stdint.h> leads to no
WINT_M{AX,IN} at all.

PR:		64956
Approved by:	das (mentor)
2004-05-18 16:04:57 +00:00
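
The failure mode is easy to reproduce in a standalone program (this restates the PR's scenario, nothing more): with the pre-fix header, including <wchar.h> first left WINT_MAX undefined after <stdint.h>.

    #include <wchar.h>      /* defines WCHAR_MIN, which used to suppress the block */
    #include <stdint.h>     /* should still provide WINT_MAX and WINT_MIN */
    #include <stdio.h>

    int
    main(void)
    {
    #ifdef WINT_MAX
            printf("WINT_MAX is defined\n");
    #else
            printf("WINT_MAX is missing\n");        /* the pre-fix behaviour */
    #endif
            return (0);
    }
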
bde
28faba2e27 Fixed DDB_NOKLDSYM on amd64's:
machdep.c:
Initialize the symbol table pointers, not quite like for other arches.

db_elf.c:
Don't claim to be an i486 in the fake ELF header.
2004-05-18 05:30:06 +00:00
peter
4070a4a8d4 Turn on modules for amd64. Fear. 2004-05-17 22:13:14 +00:00
peter
d0605f2d9e Deal with REL records, which have the addend embedded in variable-sized targets
rather than in a RELA table.  I don't know if binutils will ever generate
REL records, but just in case...
2004-05-17 21:16:49 +00:00
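
For reference, the standard ELF64 entry layouts (equivalent to the definitions in the ELF headers, written with fixed-width types here): a RELA entry carries its addend explicitly, while a REL entry expects the linker to read the addend out of the relocation target, whose width depends on the relocation type.

    #include <stdint.h>

    typedef struct {
            uint64_t r_offset;      /* where the relocation is applied */
            uint64_t r_info;        /* symbol index and relocation type */
            int64_t  r_addend;      /* explicit addend */
    } Elf64_Rela;

    typedef struct {
            uint64_t r_offset;
            uint64_t r_info;        /* no addend field: it is embedded in the target */
    } Elf64_Rel;
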
peter
8b77ed77bd Checkpoint some of what I was starting to tinker with for having some
different context support for 32-bit vs 64-bit processes.  This simply omits
the save/restore of the segment selector registers for non-32-bit
processes.  This avoids the rdmsr/wrmsr juggling when restoring %gs
clobbers the kernel msr that holds the gsbase.

However, I suspect it might be better to conditionally do this at
user<->kernel transition where we wouldn't need to do the juggling in the
first place.  Or have per-thread extended context save/restore hooks.
2004-05-16 22:43:57 +00:00
peter
be73fb5080 Kill the LAZYPMAP ifdefs. While they worked, they didn't do anything
to help the AMD cpus (which have a hardware tlb flush filter).  I held
off to see what the 64 bit Intel cpus did, but it doesn't seem to help
much there either.  Oh well, store it in the Attic.
2004-05-16 22:11:50 +00:00
peter
a3df9de5e8 Converge some more with i386. 2004-05-16 21:27:29 +00:00
peter
f0d8d38564 MFi386: add rue and twa 2004-05-16 20:57:01 +00:00
peter
cd0ccfc778 MFi386: avoid partial register references, for what it's worth. 2004-05-16 20:46:13 +00:00
peter
172c58b998 For consistency with i386, have pmap_kenter_temporary() take a vm_paddr_t
argument.  It is actually the same type on amd64 (vm_paddr_t = vm_offset_t)
but this reduces the i386<->amd64 diffs a little.
2004-05-16 20:44:41 +00:00
peter
c6a708cab1 MFi386: numerous interrupt and acpi updates 2004-05-16 20:30:47 +00:00
peter
ea4215c521 Make a small revision to the api between the elf linker core and the
elf_reloc() backends for two reasons.  First, to support the possibility
of there being two elf linkers in the kernel (eg: amd64), and second, to
pass the relocbase explicitly (for relocating .o format kld files).
2004-05-16 20:00:28 +00:00
njl
3d06d54b9d Make unnecessary globals static and remove unused includes.
Pointed out by:	cscout
2004-05-06 02:18:58 +00:00