Commit Graph

4441 Commits

Author SHA1 Message Date
Warner Losh
812fb8f294 Get rid of #ifdef for legacy system. Move that into the MD code.
Export minimal symbols to allow this to happen.
2004-12-24 23:03:17 +00:00
Alan Cox
1f70d62298 Modify pmap_enter_quick() so that it expects the page queues to be locked
on entry and it assumes the responsibility for releasing the page queues
lock if it must sleep.

Remove a bogus comment from pmap_enter_quick().

Using the first change, modify vm_map_pmap_enter() so that the page queues
lock is acquired and released once, rather than each time that a page
is mapped.
2004-12-23 20:16:11 +00:00
Alan Cox
ead42fc389 Use vtopde() instead of pmap_pde() in pmap_kextract(); vtopde() is smaller
and faster in cases, such as pmap_kextract(), where the pde is known to
exist.
2004-12-21 19:25:56 +00:00
Alan Cox
85f5b24573 In the common case, pmap_enter_quick() completes without sleeping.
In such cases, the busying of the page and the unlocking of the
containing object by vm_map_pmap_enter() and vm_fault_prefault() is
unnecessary overhead.  To eliminate this overhead, this change
modifies pmap_enter_quick() so that it expects the object to be locked
on entry and it assumes the responsibility for busying the page and
unlocking the object if it must sleep.  Note: alpha, amd64, i386 and
ia64 are the only implementations optimized by this change; arm,
powerpc, and sparc64 still conservatively busy the page and unlock the
object within every pmap_enter_quick() call.

Additionally, this change is the first case where we synchronize
access to the page's PG_BUSY flag and busy field using the containing
object's lock rather than the global page queues lock.  (Modifications
to the page's PG_BUSY flag and busy field have asserted both locks for
several weeks, enabling an incremental transition.)
2004-12-15 19:55:05 +00:00
Peter Wemm
2f6a1b4744 MFi386: rev 1.12: re-allow fast interrupts to cause preemption 2004-12-06 22:56:15 +00:00
Alan Cox
e07a123caf Replace (inlined) pmap_pte() calls with smaller, faster code where
possible, such as the inner loop of pmap_copy().

Remove two comments that apply to i386 but not amd64.
2004-12-04 22:02:31 +00:00
Alan Cox
664c816978 For efficiency eliminate the call to pmap_pte() from pmap_protect()'s and
pmap_remove()'s inner loop.  Instead, call pmap_pde_to_pte(), a new
function, prior to the inner loop.

Reviewed by: peter@, tegge@
2004-12-02 04:06:40 +00:00
Marcel Moolenaar
bcc5241c43 Change gdb_cpu_setreg() to not take the value to which to set the
specified register, but a pointer to the in-memory representation of
that value. The reason for this is twofold:
1. Not all registers can be represented by a register_t. In particular
   FP registers fall in that category. Passing the new register value
   by reference instead of by value makes this point moot.
2. When we receive a G or P packet, both are for writing a register,
   the packet will have the register value in target-byte order and
   in the memory representation (modulo the fact that bytes are sent
   as 2 printable hexadecimal numbers of course). We only need to
   decode the packet to have a pointer to the register value.

This change fixes the bug of extracting the register value of the P
packet as a hexadecimal number instead of as a bit array. The quick
(and dirty) fix to bswap the register value in gdb_cpu_setreg() as
it has been added on i386 and amd64 can therefore be removed and has
in fact been that.

Tested on: alpha, amd64, i386, ia64, sparc64
2004-12-01 06:40:35 +00:00
Peter Wemm
6210b1477c Remove unused cnt variable for the SMP case. Trim some excessive blank
lines while here.
2004-11-30 20:25:46 +00:00
Peter Wemm
a649898dd8 Update the gdb register extraction support to use the pcb wherever
possible, like on i386.  Registers are handled differently for caller
vs callee saved registers.
2004-11-30 00:55:49 +00:00
Peter Wemm
40d315c6c9 MFi386: join the %cr0 setup line now that i386 has lost the I386 ifdefs. 2004-11-29 23:27:07 +00:00
Peter Wemm
545d0f0638 Take advantage of the shutdown processing being wired to the BSP and
eliminate the evil cpu_reset_proxy code now that it will never be
activated.  i386 should pick this up as well.
2004-11-29 23:25:56 +00:00
Scott Long
ee8d8ca5c1 Don't flag alignment constraints as a reason for bouncing. This fixes the
trigger for other misbehaviour in the sym driver that was causing freezes at
boot.  Thanks to phk@ for reporting and testing this.
2004-11-29 14:49:27 +00:00
David Schultz
6004362e66 Don't include sys/user.h merely for its side-effect of recursively
including other headers.
2004-11-27 06:51:39 +00:00
Scott Long
4c10e55d26 Remove an extra #include 2004-11-21 06:28:35 +00:00
Scott Long
25590d09b9 Consolidate all of the bounce tests into the BUS_DMA_COULD_BOUNCE flag.
Allocate the bounce zone at either tag creation or map creation to help
avoid null-pointer derefs later on.  Track total pages per zone so that
each zone can get a minimum allocation at tag creation time instead of
being defeated by mis-behaving tags that suck up the max amount.
2004-11-21 04:15:26 +00:00
David Schultz
6484fde022 Remove references to U area and garbage collect includes.
Reviewed by:	arch@
2004-11-20 02:30:59 +00:00
David Schultz
ab44ebf537 Remove UAREA_PAGES.
Reviewed by:	arch@
2004-11-20 02:29:50 +00:00
David Schultz
11111b709f U areas are going away, so don't allocate one for process 0.
Reviewed by:	arch@
2004-11-20 02:29:25 +00:00
Scott Long
e835255791 Revert part of rev 1.56. Tag boundaries are handled by splitting segments,
not through bouncing.
2004-11-19 17:51:29 +00:00
Scott Long
48ad03b872 MFi386 rev 1.63-1.64:
Use tag-specific pools of bounce pages instead of a single global pool.
2004-11-10 03:49:24 +00:00
Peter Wemm
9ad69266e1 MFi386 1.238 (jhb): Allow hints to disable cpus 2004-11-05 18:25:22 +00:00
Peter Wemm
7681443a26 MFi386:
rev 1.61 (scottl):  Add KTR tracing
rev 1.62 (scottl):  Optimize (td->pmap, inlines, etc)
2004-11-05 18:24:01 +00:00
Scott Long
0971df6e14 Don't use atomic ops to increment interrupt stats. This was only done on
amd64 and i386 anyways.  The stats are only kept for informational purposes.
2004-11-03 18:03:06 +00:00
Andre Oppermann
32672ba88d Reduce annoying SCSI probing delay from 15 to 5 seconds in all GENRIC kernels.
Discussed on:	-current
2004-11-02 20:57:20 +00:00
John Baldwin
d39d4a6e64 - Change the ddb paging "support" to use a variable (db_lines_per_page) to
control the number of lines per page rather than a constant.  The variable
  can be examined and changed in ddb as '$lines'.  Setting the variable to
  0 will effectively turn off paging.
- Change db_putchar() to force out pending whitespace before outputting
  newlines and carriage returns so that one can rub out content on the
  current line via '\r     \r' type strings.
- Change the simple pager to rub out the --More-- prompt explicitly when
  the routine exits.
- Add some aliases to the simple pager to make it more compatible with
  more(1): 'e' and 'j' do a single line.  'd' does half a page, and
  'f' does a full page.

MFC after:	1 month
Inspired by:	kris
2004-11-01 22:15:15 +00:00
Dag-Erling Smørgrav
b0e1e474f7 Add TUNABLE_LONG and TUNABLE_ULONG, and use the latter for the
hw.pci.host_mem_start tunable.  Add comments to TUNABLE_INT and
TUNABLE_QUAD recommending against their use.

MFC after:	3 weeks
2004-10-31 15:50:33 +00:00
Dag-Erling Smørgrav
38228f7221 Whitespace cleanup 2004-10-31 15:02:53 +00:00
Hidetoshi Shimokawa
1107015620 MFi386: preserve dcons buffer passed by loader. 2004-10-28 12:16:03 +00:00
Peter Wemm
3904b13fab Raise MAXDSIZ from 8G to 32G. The old limit was just an arbitary choice
that was greater than 4G.  I originally used the same values as i386 in
order to save opening a new PML4 page slot, but in the day of gigabytes
of memory, worrying about a 4K page seems futile.  Moving from 8 to 32G
moves the page to a different index, it doesn't increase the number of
pages used.
2004-10-27 17:21:15 +00:00
Nate Lawson
8f528832e5 Print flags in the nexus for child devices. 2004-10-14 22:36:47 +00:00
Peter Wemm
598f31f75a MFi386: sync with latest updates 2004-10-11 21:51:27 +00:00
Nate Lawson
31ad3b8802 Move the code for halting the CPU (acpi_cpu_c1) into machdep files.
This removes the last MD portion of acpi_cpu.c.

MFC after:	2 weeks
2004-10-11 05:39:15 +00:00
Alan Cox
aced26ce6e Make pte_load_store() an atomic operation in all cases, not just i386 PAE.
Restructure pmap_enter() to prevent the loss of a page modified (PG_M) bit
in a race between processors.  (This restructuring assumes the newly atomic
pte_load_store() for correct operation.)

Reviewed by: tegge@
PR: i386/61852
2004-10-08 08:23:43 +00:00
John Baldwin
78c85e8dfc Rework how we store process times in the kernel such that we always store
the raw values including for child process statistics and only compute the
system and user timevals on demand.

- Fix the various kern_wait() syscall wrappers to only pass in a rusage
  pointer if they are going to use the result.
- Add a kern_getrusage() function for the ABI syscalls to use so that they
  don't have to play stackgap games to call getrusage().
- Fix the svr4_sys_times() syscall to just call calcru() to calculate the
  times it needs rather than calling getrusage() twice with associated
  stackgap, etc.
- Add a new rusage_ext structure to store raw time stats such as tick counts
  for user, system, and interrupt time as well as a bintime of the total
  runtime.  A new p_rux field in struct proc replaces the same inline fields
  from struct proc (i.e. p_[isu]ticks, p_[isu]u, and p_runtime).  A new p_crux
  field in struct proc contains the "raw" child time usage statistics.
  ruadd() has been changed to handle adding the associated rusage_ext
  structures as well as the values in rusage.  Effectively, the values in
  rusage_ext replace the ru_utime and ru_stime values in struct rusage.  These
  two fields in struct rusage are no longer used in the kernel.
- calcru() has been split into a static worker function calcru1() that
  calculates appropriate timevals for user and system time as well as updating
  the rux_[isu]u fields of a passed in rusage_ext structure.  calcru() uses a
  copy of the process' p_rux structure to compute the timevals after updating
  the runtime appropriately if any of the threads in that process are
  currently executing.  It also now only locks sched_lock internally while
  doing the rux_runtime fixup.  calcru() now only requires the caller to
  hold the proc lock and calcru1() only requires the proc lock internally.
  calcru() also no longer allows callers to ask for an interrupt timeval
  since none of them actually did.
- calcru() now correctly handles threads executing on other CPUs.
- A new calccru() function computes the child system and user timevals by
  calling calcru1() on p_crux.  Note that this means that any code that wants
  child times must now call this function rather than reading from p_cru
  directly.  This function also requires the proc lock.
- This finishes the locking for rusage and friends so some of the Giant locks
  in exit1() and kern_wait() are now gone.
- The locking in ttyinfo() has been tweaked so that a shared lock of the
  proctree lock is used to protect the process group rather than the process
  group lock.  By holding this lock until the end of the function we now
  ensure that the process/thread that we pick to dump info about will no
  longer vanish while we are trying to output its info to the console.

Submitted by:	bde (mostly)
MFC after:	1 month
2004-10-05 18:51:11 +00:00
Alan Cox
caa665aae3 Undo revision 1.251. This change was a performance pessimizing work-around
that is no longer required.  (In fact, it is not clear that it was ever
required in HEAD or RELENG_4, only RELENG_3 required a work-around.)  Now,
as before revision 1.251, if the preexisting PTE is invalid, pmap_enter()
does not call pmap_invalidate_page() to update the TLB(s).

Note: Even with this change, the handling of a copy-on-write fault is
inefficient, in such cases pmap_enter() calls pmap_invalidate_page() twice.

Discussed with: bde@
PR: kern/16568
2004-10-03 20:14:07 +00:00
Alan Cox
8ceb3dcb60 The physical address stored in the vm_page is page aligned. There is no
need to mask off the page offset bits.  (This operation made some sense
prior to i386/i386/pmap.c revision 1.254 when we passed a physical address
rather than a vm_page pointer to pmap_enter().)
2004-10-03 00:16:43 +00:00
Alan Cox
07b3303943 Eliminate unnecessary uses of PHYS_TO_VM_PAGE() from pmap_enter(). These
uses predate the change in the pmap_enter() interface that replaced the
page's physical address by the address of its vm_page structure.  The
PHYS_TO_VM_PAGE() was being used to compute the address of the same vm_page
structure that was being passed in.
2004-10-02 07:34:58 +00:00
Alan Cox
bbda1f18d9 Remove an unused declaration. (I should have included this change in
revision 1.486.)
2004-10-02 05:58:32 +00:00
Alan Cox
0a752e9843 Prevent the unexpected deallocation of a page table page while performing
pmap_copy().  This entails additional locking in pmap_copy() and the
addition of a "flags" parameter to the page table page allocator for
specifying whether it may sleep when memory is unavailable.  (Already,
pmap_copy() checks the availability of memory, aborting if it is scarce.
In theory, another CPU could, however, allocate memory between
pmap_copy()'s check and the call to the page table page allocator,
causing the current thread to release its locks and sleep.  This change
makes this scenario impossible.)

Reviewed by: tegge@
2004-09-29 19:20:40 +00:00
Peter Wemm
ffee5dac09 MFi386: rev 1.239 - invalidate tlb after pte update 2004-09-29 01:59:10 +00:00
Peter Wemm
083e5bdc72 MFi386: rev 1.236 - improve panic message for a busted mptable 2004-09-29 01:58:24 +00:00
Peter Wemm
6fdf763cef Like on i386, use the definition of struct bios_smap from machine/pc/bios.h
again.
2004-09-24 01:11:11 +00:00
Peter Wemm
2169193596 Converge towards i386. I originally resisted creating <machine/pc/bios.h>
because it was mostly irrelevant - except for the silly BIOS_PADDRTOVADDR
etc macros.  Along the way of working around this, I missed a few things.

* Make syscons properly inherit the bios capslock/shiftlock/etc state like
  i386 does.  Note that we cannot inherit the bios key repeat rate because
  that requires a bios call (which is impossible for us).
* Give syscons the ability to beep on amd64.  Oops.

While here, make bios.c compile and add it to files.amd64.
2004-09-24 01:08:34 +00:00
Peter Wemm
c3277f936c Severely strip down the repocopied i386/bios.c and bios.h files. It turns
out that bios_sigsearch() etc is useful for finding tables in roms.
2004-09-24 00:42:36 +00:00
Alan Cox
a971139680 Correct a long-standing error in _pmap_unwire_pte_hold() affecting
multiprocessors.  Specifically, the error is conditioning the call to
pmap_invalidate_page() on whether the pmap is active on the current CPU.
This call must be unconditional.  Regardless of whether the pmap is active
on the CPU performing _pmap_unwire_pte_hold(), it could be active on another
CPU.  For example, a call to pmap_remove_all() by the page daemon could
result in a call to _pmap_unwire_pte_hold() with the pmap inactive on the
current CPU and active on another CPU.  In such circumstances, failing to
call pmap_invalidate_page() results in a stale TLB entry on the other CPU
that still maps the now deallocated page table page.  What happens next is
typically a mysterious panic in pmap_enter() by the other CPU, either
"pmap_enter: attempted pmap_enter on 4MB page" or "pmap_enter: pte vanished,
va: 0x%lx".  Both occur because the former page table page has been recycled
and allocated to a new purpose.  Consequently, it no longer contains zeroes.

See also Peter's i386/i386/pmap.c revision 1.448 and the related e-mail
thread last year.

Many thanks to the engineers at Sandvine for providing clear and concise
information until all of the pieces of the puzzle fell into place and
for testing an earlier patch.

MT5 Candidate
2004-09-22 05:01:48 +00:00
Peter Wemm
7789933b6a MFi386: adapt rev 1.19 (debugger fixes) 2004-09-22 01:27:06 +00:00
Peter Wemm
b05ba1f73d Minor sync-up with i386. Catch up on de-quoting and de-counting after
config changes.
2004-09-22 01:04:54 +00:00
Peter Wemm
46ec31b083 MFi386: add ispfw (except using correct device<tab><tab>ispfw format,
<space><tab> is for the options line)
2004-09-22 00:44:13 +00:00
John Baldwin
76764432e4 - Add support for "paging" in stack trace output. That is, when you do
a stack trace from ddb, the output will pause with a '--More--' prompt
  every 18 lines.  If you hit Enter, it will print another line and prompt
  again.  If you hit space it will output another page and then prompt.
  If you hit 'q' or 'x' it will abort the rest of the stack trace.
- Fix the sparc64 userland stack trace to honor the total count of lines
  to print.  This is useful if your trace happens to walk back onto
  0xdeadc0de and gets stuck in an endless loop.

MFC after:	1 month
Tested on:	i386, alpha, sparc64
2004-09-20 19:05:32 +00:00
Alan Cox
de6c3db01f Simplify the reference counting of page table pages. Specifically, use
the page table page's wired count rather than its hold count to contain
the reference count.  My rationale for this change is based on several
factors:

1. The machine-independent and pmap layers used the same hold count field
   in subtly different ways.  The machine-independent layer uses the hold
   count to implement a form of ephemeral wiring that is used by pipes,
   physio, etc.  In other words, subsystems where we wish to temporarily
   block a page from being swapped out while it is mapped into the kernel's
   address space.  Such pages are never removed from the page queues.
   Instead, the page daemon recognizes a non-zero hold count to mean "hands
   off this page."  In contrast, page table pages are never in the page
   queues; they are wired from birth to death.  The hold count was being
   used as a kind of reference count, specifically, the number of valid
   page table entries within the page.  Not surprisingly, these two
   different uses imply different synchronization rules: in the machine-
   independent layer access to the hold count requires the page queues
   lock; whereas in the pmap layer the pmap lock is required.  Thus,
   continued use by the pmap layer of vm_page_unhold(), which asserts that
   the page queues lock is held, made no sense.

2. _pmap_unwire_pte_hold() was too forgiving in its handling of the wired
   count.  An unexpected wired count on a page table page was ignored and
   the underlying page leaked.

3. In a word, microoptimization.  Using the wired count exclusively, rather
   than a combination of the wired and hold counts, makes the code slightly
   smaller and faster.

Reviewed by: tegge@
2004-09-19 21:20:01 +00:00
Alan Cox
8478ea241b Remove an outdated assertion from _pmap_allocpte(). (When vm_page_alloc()
succeeds, the page's queue field is unconditionally set to PQ_NONE by
vm_pageq_remove_nowakeup().)
2004-09-19 02:39:31 +00:00
Alan Cox
7580b56bdc Release the page queues lock earlier in pmap_protect() and pmap_remove() in
order to reduce contention.
2004-09-18 22:56:58 +00:00
Poul-Henning Kamp
7ce1979be6 Add new a function isa_dma_init() which returns an errno when it fails
and which takes a M_WAITOK/M_NOWAIT flag argument.

Add compatibility isa_dmainit() macro which whines loudly if
isa_dma_init() fails.

Problem uncovered by:	tegge
2004-09-15 12:09:50 +00:00
Poul-Henning Kamp
5757a0b985 Remove now unused #include files. 2004-09-15 12:02:35 +00:00
Alan Cox
031102cc7b Use an atomic op to update the pte in pmap_protect(). This is to prevent
the loss of a page modified (PG_M) bit in a race between processors.

Quoting Tor:
	One scenario where the old code could cause a lost PG_M bit is a
	multithreaded linux program (or FreeBSD program using the
	linuxthreads port) where one thread was starting a subprocess.
	The thread doing fork() would call vmspace_fork(), which would then
	call vm_map_copy_entry() which would call pmap_protect() on an area
	possibly accessed by other threads.

Additionally, make the clearing of PG_M by pmap_protect() unconditional if
write permission is removed.  Previously, PG_M could persist on a read-only
unmanaged page.  That seems inconsistent and confusing.

In collaboration with: tegge@

MT5 candidate
PR: 61852
2004-09-12 20:20:40 +00:00
Scott Long
9e0c3bdf64 Double the number of kernel page tables for amd64 and for i386/PAE. The old
value was only enough for 8GB of RAM, the new value can do 16GB.  This still
isn't optimal since it doesn't scale.  Fixing this for amd64 looks to be
fairly easy, but for i386 will be quite difficult.

Reviewed by: peter
2004-09-11 01:31:26 +00:00
Bill Paul
a07bd003bf Add device driver support for the VIA Networking Technologies
VT6122 gigabit ethernet chip and integrated 10/100/1000 copper PHY.
The vge driver has been added to GENERIC for i386, pc98 and amd64,
but not to sparc or ia64 since I don't have the ability to test
it there. The vge(4) driver supports VLANs, checksum offload and
jumbo frames.

Also added the lge(4) and nge(4) drivers to GENERIC for i386 and
pc98 since I was in the neighborhood. There's no reason to leave them
out anymore.
2004-09-10 20:57:46 +00:00
Alan Cox
e232eb8288 Use atomic ops in pmap_clear_ptes() to prevent SMP races that could
result in the loss of an accessed or modified bit from the pte.

In collaboration with: tegge@

MT5 candidate
2004-09-08 18:58:29 +00:00
Scott Long
50736a153b Fix a problem with tag->boundary inheritence that has existed since day one
and was propagated to nearly every platform.  The boundary of the child needs
to consider the boundary of the parent and pick the minimum of the two, not
the maximum.  However, if either is 0 then pick the appropriate one.
This bug was exposed by a recent change to ATA, which should now be fixed by
this change.  The alignment and maxsegsz tag attributes likely also need
a similar review in the near future.

This is a MT5 candidate.

Reviewed by: marcel
Submitted by: sos (in part)
2004-09-08 04:54:19 +00:00
Scott Long
444ba94513 Switch the default scheduler to 4BSD to match what will go into RELENG_5 soon.
It can be switched back once 5.3 is tested and released.  Also turn on
PREEMPTION as many of the stability problems with it have been fixed.

MT5: 3 days.
2004-09-07 22:37:43 +00:00
Julian Elischer
ed062c8d66 Refactor a bunch of scheduler code to give basically the same behaviour
but with slightly cleaned up interfaces.

The KSE structure has become the same as the "per thread scheduler
private data" structure. In order to not make the diffs too great
one is #defined as the other at this time.

The KSE (or td_sched) structure is  now allocated per thread and has no
allocation code of its own.

Concurrency for a KSEGRP is now kept track of via a simple pair of counters
rather than using KSE structures as tokens.

Since the KSE structure is different in each scheduler, kern_switch.c
is now included at the end of each scheduler. Nothing outside the
scheduler knows the contents of the KSE (aka td_sched) structure.

The fields in the ksegrp structure that are to do with the scheduler's
queueing mechanisms are now moved to the kg_sched structure.
(per ksegrp scheduler private data structure). In other words how the
scheduler queues and keeps track of threads is no-one's business except
the scheduler's. This should allow people to write experimental
schedulers with completely different internal structuring.

A scheduler call sched_set_concurrency(kg, N) has been added that
notifies teh scheduler that no more than N threads from that ksegrp
should be allowed to be on concurrently scheduled. This is also
used to enforce 'fainess' at this time so that a ksegrp with
10000 threads can not swamp a the run queue and force out a process
with 1 thread, since the current code will not set the concurrency above
NCPU, and both schedulers will not allow more than that many
onto the system run queue at a time. Each scheduler should eventualy develop
their own methods to do this now that they are effectively separated.

Rejig libthr's kernel interface to follow the same code paths as
linkse for scope system threads. This has slightly hurt libthr's performance
but I will work to recover as much of it as I can.

Thread exit code has been cleaned up greatly.
exit and exec code now transitions a process back to
'standard non-threaded mode' before taking the next step.
Reviewed by:	scottl, peter
MFC after:	1 week
2004-09-05 02:09:54 +00:00
Scott Long
9923b511ed Turn PREEMPTION into a kernel option. Make sure that it's defined if
FULL_PREEMPTION is defined.  Add a runtime warning to ULE if PREEMPTION is
enabled (code inspired by the PREEMPTION warning in kern_switch.c).  This
is a possible MT5 candidate.
2004-09-02 18:59:15 +00:00
Julian Elischer
6804a3ab6d Give the 4bsd scheduler the ability to wake up idle processors
when there is new work to be done.

MFC after:	5 days
2004-09-01 06:42:02 +00:00
Julian Elischer
2630e4c90c Give setrunqueue() and sched_add() more of a clue as to
where they are coming from and what is expected from them.

MFC after:	2 days
2004-09-01 02:11:28 +00:00
Julian Elischer
5995adc206 Remove an unneeded argument..
The removed argument could trivially be derived from the remaining one.
That in turn should be the same as curthread, but it is possible that curthread could be expensive to derive on some syste,s so leave it as an argument.
Having both proc and thread as an argumen tjust gives an opportunity for
them to get out sync.

MFC after:	3 days
2004-08-31 07:34:54 +00:00
Julian Elischer
99e9dcb817 Remove sched_free_thread() which was only used
in diagnostics. It has outlived its usefulness and has started
causing panics for people who turn on DIAGNOSTIC, in what is otherwise
good code.

MFC after:	2 days
2004-08-31 06:12:13 +00:00
Peter Wemm
5fe155d0fd Add the mp_watchdog hooks, although it locks up my SMP test box. It might
be useable to somebody.
2004-08-30 23:33:33 +00:00
Alan Cox
bfa15df9ba Remove unnecessary check for curthread == NULL. 2004-08-30 03:52:05 +00:00
David E. O'Brien
dd68efd05b s/smp_rv_mtx/smp_ipi_mtx/g
Requested by:	jhb
2004-08-28 00:49:55 +00:00
Tilman Keskinoz
0039290c1f Fix a comment, IA32 was renamed to COMPAT_IA32
Approved by:	marcel
2004-08-27 21:29:20 +00:00
Marcel Moolenaar
0f2fe153bc Move the kernel-specific logic to adjust frompc from MI to MD. For
these two reasons:
1. On ia64 a function pointer does not hold the address of the first
   instruction of a functions implementation. It holds the address
   of a function descriptor. Hence the user(), btrap(), eintr() and
   bintr() prototypes are wrong for getting the actual code address.
2. The logic forces interrupt, trap and exception entry points to
   be layed-out contiguously. This can not be achieved on ia64 and is
   generally just bad programming.

The MCOUNT_FROMPC_USER macro is used to set the frompc argument to
some kernel address which represents any frompc that falls outside
the kernel text range. The macro can expand to ~0U to bail out in
that case.
The MCOUNT_FROMPC_INTR macro is used to set the frompc argument to
some kernel address to represent a call to a trap or interrupt
handler. This to avoid that the trap or interrupt handler appear to
be called from everywhere in the call graph. The macro can expand
to ~0U to prevent adjusting frompc. Note that the argument is selfpc,
not frompc.

This commit defines the macros on all architectures equivalently to
the original code in sys/libkern/mcount.c. People can take it from
here...

Compile-tested on: alpha, amd64, i386, ia64 and sparc64
Boot-tested on: i386
2004-08-27 19:42:35 +00:00
Alan Cox
8991a235cb The machine-independent parts of the virtual memory system always pass a
valid pmap to the pmap functions that require one.  Remove the checks for
NULL.  (These checks have their origins in the Mach pmap.c that was
integrated into BSD.  None of the new code written specifically for
FreeBSD included them.)
2004-08-27 19:06:17 +00:00
Andre Oppermann
c21fd23260 Always compile PFIL_HOOKS into the kernel and remove the associated kernel
compile option.  All FreeBSD packet filters now use the PFIL_HOOKS API and
thus it becomes a standard part of the network stack.

If no hooks are connected the entire packet filter hooks section and related
activities are jumped over.  This removes any performance impact if no hooks
are active.

Both OpenBSD and DragonFlyBSD have integrated PFIL_HOOKS permanently as well.
2004-08-27 15:16:24 +00:00
John Baldwin
ef36ad6921 Correct the arguments to kern_sigaltstack() as they were reversed.
PR:		kern/68079
Submitted by:	Georg-W. Koltermann gwk at rahn-koltermann dot de
2004-08-24 20:52:52 +00:00
Nate Lawson
dc6851d588 Catch up with i386 nexus.c rev 1.59: add bus_get_resource_list(). 2004-08-24 19:22:54 +00:00
Peter Wemm
648bfe0b75 It is now an error to call pmap_unuse_pt without the paddr of the pde
that contained the pte.
2004-08-24 00:17:52 +00:00
Peter Wemm
a32f55d629 Oops, I forgot to have the idle loop call mp_grab_cpu_hlt() on the amd64
SMP case.
2004-08-24 00:16:43 +00:00
Peter Wemm
f1009e1e1f Commit Doug White and Alan Cox's fix for the cross-ipi smp deadlock.
We were obtaining different spin mutexes (which disable interrupts after
aquisition) and spin waiting for delivery.  For example, KSE processes
do LDT operations which use smp_rendezvous, while other parts of the
system are doing things like tlb shootdowns with a different mutex.

This patch uses the common smp_rendezvous mutex for all MD home-grown
IPIs that spinwait for delivery.  Having the single mutex means that
the spinloop to aquire it will enable interrupts periodically, thus
avoiding the cross-ipi deadlock.

Obtained from: dwhite, alc
Reviewed by:   jhb
2004-08-23 21:39:29 +00:00
Peter Wemm
1c0dea0f6e Sync with i386 - Optimize intr_execute_handlers a bit etc. 2004-08-16 23:12:30 +00:00
Peter Wemm
a9cd97ba05 Sync with i386 - remove unused includes 2004-08-16 23:10:46 +00:00
Peter Wemm
e88022749d Sync with i386 - get the softc via the devclass rather than caching the dev 2004-08-16 23:10:18 +00:00
Peter Wemm
deefc3c4f5 Sync with i386 - add ADAPTIVE_GIANT, remove pcic 2004-08-16 22:59:24 +00:00
Peter Wemm
9727279886 Sync with i386 - add foot shooting protection for the DDB/KDB thing. 2004-08-16 22:57:47 +00:00
Peter Wemm
b93b95f67f Sync with i386 - set rbp reg to 0 for upcalls as a frame marker, not that
it is guaranteed to be used in userland though.
2004-08-16 22:57:13 +00:00
Peter Wemm
717209c708 Sync with i386 - trace syscall entry/exit times, and a cosmetic fix. 2004-08-16 22:56:20 +00:00
Peter Wemm
81f6665ec9 Sync with i386 - fix bounds check in lapic_create() 2004-08-16 22:55:32 +00:00
Peter Wemm
526d706192 Sync with i386 - pass resource requests up to parent 2004-08-16 22:54:50 +00:00
Peter Wemm
2c87e00194 Sync with i386 - s/cpu_swtch/cpu_switch/ 2004-08-16 22:53:29 +00:00
Peter Wemm
e4b3358b15 Sync with i386 - dont count needed bounce pages if loading a buffer that
was created with bud_dmamem_alloc()
2004-08-16 22:53:03 +00:00
Peter Wemm
6bd599a050 Sync with i386 - cosmetic fixes 2004-08-16 22:52:02 +00:00
Peter Wemm
66d2648494 Catch up with i386 - remove lots of no longer used symbolic constants 2004-08-16 22:51:44 +00:00
Peter Wemm
fd57ce88db Sync with i386 2004-08-16 22:51:13 +00:00
David E. O'Brien
7a071c6a49 Complete 'IA32' -> 'COMPAT_IA32' change for the Linuxulator32. 2004-08-16 12:51:33 +00:00
Tim J. Robbins
7a47419763 Un-comment LINPROCFS. 2004-08-16 12:39:27 +00:00
David E. O'Brien
ce55a234ee I missed an 'IA32' in the documentation. 2004-08-16 11:15:46 +00:00
David E. O'Brien
c680f6b12d I'm not sure what tjr envisioned for turning on FreeBSD/i386 rt support,
but make it COMPAT_IA32 for now.
Fix the 'DEBUG' argument code to unbreak the amd64 LINT build.
2004-08-16 11:09:59 +00:00
David E. O'Brien
186b870df3 Fix the 'DEBUG' argument code to unbreak the amd64 LINT build. 2004-08-16 10:54:25 +00:00
Tim J. Robbins
6766a2386d Regen. 2004-08-16 08:07:06 +00:00
Tim J. Robbins
ea0fabbc4f Add preliminary support for running 32-bit Linux binaries on amd64, enabled
with the COMPAT_LINUX32 option. This is largely based on the i386 MD Linux
emulations bits, but also builds on the 32-bit FreeBSD and generic IA-32
binary emulation work.

Some of this is still a little rough around the edges, and will need to be
revisited before 32-bit and 64-bit Linux emulation support can coexist in
the same kernel.
2004-08-16 07:55:06 +00:00
Robert Watson
273ad9acbc Preemptive anti-footshooting: cause a #error if MP_WATCHDOG is compiled
with SCHED_ULE.
2004-08-15 20:32:40 +00:00
Robert Watson
a632deec30 Add an "options MP_WATCHDOG" to i386. This option allows one of the
logical CPUs on a system to be used as a dedicated watchdog to cause a
drop to the debugger and/or generate an NMI to the boot processor if
the kernel ceases to respond.  A sysctl enables the watchdog running
out of the processor's idle thread; a callout is launched to reset a
timer in the watchdog.  If the callout fails to reset the timer for ten
seconds, the watchdog will fire.  The sysctl allows you to select which
CPU will run the watchdog.

A sample "debug.leak_schedlock" is included, which causes a sysctl to
spin holding sched_lock in order to trigger the watchdog.  On my Xeons,
the watchdog is able to detect this failure mode and break into the
debugger, which cannot otherwise be done without an NMI button.

This option does not currently work with sched_ule due to ule's push
notion of scheduling, similar to machdep.hlt_logical_cpus failing to
work with that scheduler.

On face value, this might seem somewhat inefficient, but there are a
lot of dual-processor Xeons with HTT around, so using one as a watchdog
for testing is not as inefficient as one might fear.
2004-08-15 18:02:09 +00:00
Doug Ambrisko
4f1132a88a Fix the memory scaling bug when basemem was converted to Kbytes from
bytes for AMD64.  Otherwise the AP will be started at 640K which
won't work.  Bug found on a Xeon 64bit system.
2004-08-13 22:30:55 +00:00
David Xu
91574a97b7 Mark end of frames. 2004-08-11 23:23:05 +00:00
Marcel Moolenaar
4da47b2fec Add __elfN(dump_thread). This function is called from __elfN(coredump)
to allow dumping per-thread machine specific notes. On ia64 we use this
function to flush the dirty registers onto the backingstore before we
write out the PRSTATUS notes.

Tested on: alpha, amd64, i386, ia64 & sparc64
Not tested on: arm, powerpc
2004-08-11 02:35:06 +00:00
David Xu
767d9fcf7c As AMD64 architecture volume 1 chapter 3.1.2 says, high 32 bits of %rflags
are resevered, they can be written with anything, but they always read
as zero, we should simulate it in set_regs() as we are reading/writting
real hardware %rflags register.
2004-08-10 12:15:27 +00:00
David Xu
d421c2dc36 In syscall, always make a copy of parameters from trapframe, this
becauses some syscalls using set_mcontext can sneakily change
parameters and later when those syscalls references parameters,
they will wrongly use register values in mcontext_t.

Approved by: peter
2004-08-09 23:57:59 +00:00
Alan Cox
a9cb79ba4e With the advent of pmap locking it makes sense for pmap_copy() to be less
forgiving about inconsistencies in the source pmap.  Also, remove a new-
line character terminating a nearby panic string.
2004-08-08 00:31:58 +00:00
Scott Long
066853b70e Move the definition of M_MEMDESC to a non-optional file. This allows
kernels configurations without the 'mem' device to compile.
2004-08-07 06:21:37 +00:00
Alan Cox
606ae16f13 Eliminate a variable that became unused in the i386 to amd64 conversion. 2004-08-07 04:31:26 +00:00
Mark Murray
1e10d9c135 MFi386: Fix mem device. Grrr. 2004-08-06 07:22:36 +00:00
Mark Murray
b1b7862b7b MFi386: sort out the mem device. Grrrr. 2004-08-06 07:20:32 +00:00
Mark Murray
a20ad05beb Fix module builds for i386 and amd64. 2004-08-04 18:30:31 +00:00
Alan Cox
1b3b9cfe1d Post-locking clean up/simplification, particularly, the elimination of
vm_page_sleep_if_busy() and the page table page's busy flag as a
synchronization mechanism on page table pages.

Also, relocate the inline pmap_unwire_pte_hold() so that it can be used
to shorten _pmap_unwire_pte_hold() on alpha and amd64.  This places
pmap_unwire_pte_hold() next to a comment that more accurately describes
it than _pmap_unwire_pte_hold().
2004-08-04 18:04:44 +00:00
Mark Murray
d23a262fc5 Making a loadable null.ko for /dev/(null|zero) proved rather
unpopular, so remove this (mis)feature.

Encouragement provided by:	jhb (and others)
2004-08-03 19:24:54 +00:00
Maxime Henrion
9f1b87f106 Instead of calling ia32_pause() conditionally on __i386__ or __amd64__
being defined, define and use a new MD macro, cpu_spinwait().  It only
expands to something on i386 and amd64, so the compiled code should be
identical.

Name of the macro found by:	jhb
Reviewed by:	jhb
2004-08-03 18:44:27 +00:00
Doug Rabson
595a88c6af Add style(9) foolishness. 2004-08-03 08:21:48 +00:00
Mark Murray
ae17d056a1 Diff reduction WRT i386 version. 2004-08-02 20:36:47 +00:00
Doug Rabson
4d84a58d1d Add definitions for TLS relocations. 2004-08-02 19:12:17 +00:00
David E. O'Brien
b4adaf5679 Fix the build by providing 'PHYS_TO_DMAP' and 'M_MEMDESC'. 2004-08-02 02:07:20 +00:00
Mark Murray
b6527a3667 Add the I/O device for those architectures that have it. 2004-08-01 19:37:34 +00:00
Scott Long
9352fe30a0 Turn off PREEMPTION by default while it gets debugged. It's been causing
4 weeks of problems including deadlocks and instant panics.  Note that the
real bugs are likely in the scheduler.
2004-08-01 14:31:45 +00:00
Mark Murray
8ab2f5ecc5 Break out the MI part of the /dev/[k]mem and /dev/io drivers into
their own directory and module, leaving the MD parts in the MD
area (the MD parts _are_ part of the modules). /dev/mem and /dev/io
are now loadable modules, thus taking us one step further towards
a kernel created entirely out of modules. Of course, there is nothing
preventing the kernel from having these statically compiled.
2004-08-01 11:40:54 +00:00
David Xu
e3086a21e8 Turn on PCB_FULLCTX for set_mcontext, functions like kse_switchin
needs to fully restore asynchronous context which did not come
from fast syscall.
2004-07-31 14:02:29 +00:00
Alan Cox
c6bf9f0455 Add pmap locking to pmap_object_init_pt(). 2004-07-31 06:42:05 +00:00
Paul Saab
bc35f5dc9e MFia64:
Fix -O builds with gcc 3.4 by defining ffs as __builtin_ffs instead
of creating an inline function that just calls __builtin_ffs.
2004-07-30 16:44:29 +00:00
Alan Cox
a087914310 Advance the state of pmap locking on alpha, amd64, and i386.
- Enable recursion on the page queues lock.  This allows calls to
   vm_page_alloc(VM_ALLOC_NORMAL) and UMA's obj_alloc() with the page
   queues lock held.  Such calls are made to allocate page table pages
   and pv entries.
 - The previous change enables a partial reversion of vm/vm_page.c
   revision 1.216, i.e., the call to vm_page_alloc() by vm_page_cowfault()
   now specifies VM_ALLOC_NORMAL rather than VM_ALLOC_INTERRUPT.
 - Add partial locking to pmap_copy().  (As a side-effect, pmap_copy()
   should now be faster on i386 SMP because it no longer generates IPIs
   for TLB shootdown on the other processors.)
 - Complete the locking of pmap_enter() and pmap_enter_quick().  (As of now,
   all changes to a user-level pmap on alpha, amd64, and i386 are performed
   with appropriate locking.)
2004-07-29 18:56:31 +00:00
Alexander Kabaev
8b5ae4db0d Use newly added __used attribute to keep static function symbol from
being eliminated.
2004-07-29 18:02:28 +00:00
Poul-Henning Kamp
0658bb8ef8 Move a relic to its correct location(s): Put nfs diskless initialization
calls with the code they call.  (Yet another example of mindless copy&paste).
2004-07-28 21:54:57 +00:00
Robert Watson
1a8cfbc450 Pass a thread argument into cpu_critical_{enter,exit}() rather than
dereference curthread.  It is called only from critical_{enter,exit}(),
which already dereferences curthread.  This doesn't seem to affect SMP
performance in my benchmarks, but improves MySQL transaction throughput
by about 1% on UP on my Xeon.

Head nodding:	jhb, bmilekic
2004-07-27 16:41:01 +00:00
Warner Losh
2eed2482ba Remove ahb, aha, ie, le and wl devices. They are all ISA/EISA only.
I went ahead and left in the ISA cards that also have pccard
attachments.  There's no way that these devices could attach.

OK'd by: peter
2004-07-22 22:29:45 +00:00
Warner Losh
6d1275e568 There is no pcic device on amd64. OLDCARD isn't supported, and
NEWCARD will call it something different.  and there are no ISA add-in
devices.
2004-07-22 22:28:34 +00:00
Marcel Moolenaar
fd32d93b97 Unify db_stack_trace_cmd(). All it did was look up the thread given
the thread ID and call db_trace_thread().
Since arm has all the logic in db_stack_trace_cmd(), rename the
new DB_COMMAND function to db_stack_trace to avoid conflicts on
arm.
While here, have db_stack_trace parse its own arguments so that
we can use a more natural radix for IDs. If the ID is not a thread
ID, or more precisely when no thread exists with the ID, try if
there's a process with that ID and return the first thread in it.
This makes it easier to print stack traces from the ps output.

requested by: rwatson@
tested on: amd64, i386, ia64
2004-07-21 05:07:09 +00:00
Alan Cox
fa543780cc Remove the allpmaps list. It's unused.
Reviewed by: peter@
2004-07-20 02:40:56 +00:00
John Baldwin
788195c186 As a temporary hack, turn off deferred preemptions that are the result of
a fast interrupt handler doing an swi_sched().  This fixed the lockups I
saw on my laptop when using xmms in KDE and on rwatson's MySQL benchmarks
on SMP.  This will eventually be removed and/or modified when I figure out
what the root cause is and fix that.
2004-07-19 16:37:47 +00:00
David Schultz
479f8d2214 Make FLT_ROUNDS correctly reflect the dynamic rounding mode. 2004-07-19 08:17:25 +00:00
Scott Long
701f140800 Enable ADAPTIVE_MUTEXES by default by changing the sense of the option to
NO_ADAPTIVE_MUTEXES.  This option has been enabled by default on amd64 for
quite some time, and has been extensively tested on i386 and sparc64.  It
shows measurable performance gains in many circumstances, and few negative
effects.  It would be nice in t he future if adaptive mutexes actually went
to sleep after a certain amount of spinning, but that will require quite a
bit more testing.
2004-07-18 15:59:03 +00:00
Maxim Konovalov
aa355a2679 In -CURRENT pseudo devices are not statically assigned at compile time,
remove a stale comment.

PR:		kern/62285
2004-07-18 09:03:12 +00:00
Paul Saab
a146798856 Fix the build. pcm is no more. 2004-07-16 21:48:30 +00:00
Alan Cox
3d2e54c317 Push down the acquisition and release of the page queues lock into
pmap_protect() and pmap_remove().  In general, they require the lock in
order to modify a page's pv list or flags.  In some cases, however,
pmap_protect() can avoid acquiring the lock.
2004-07-15 18:00:43 +00:00
Peter Wemm
6897c4aef7 Like on i386, eliminate pv_ptem (which was suggested by alc). This
reduces the size of the pv_entry structure a small but significant amount.

This is implemented a little differently because it isn't so cheap to get
the physical address of the page tabke page on amd64.. instead of it
being directly accessible from the top level page directory, it is now
two additional tree levels down.  However.. In almost all cases, we
recently had the physical address if the page table page a short while
before we needed it, but it slipped through our fingers.  This patch
saves it for when we do need it.  Also, for the one case where we do not
have the ptp paddr, we are always running in curproc context and so we
can do a vtopte-like trick.  I've implemented vtopde() for this purpose.

There is still a CYA entry in pmap_unuse_pt() that needs to be removed.  I
think it can be removed now but I forgot to test with it gone.
2004-07-14 07:13:35 +00:00
David Xu
53dbf30349 Add ptrace_clear_single_step(), alpha already has it for years, the function
will be used by ptrace to clear a thread's single step state.
2004-07-13 07:22:56 +00:00
Alan Cox
ce8da3091f Push down the acquisition and release of the page queues lock into
pmap_remove_pages().  (The implementation of pmap_remove_pages() is
optional.  If pmap_remove_pages() is unimplemented, the acquisition and
release of the page queues lock is unnecessary.)

Remove spl calls from the alpha, arm, and ia64 pmap_remove_pages().
2004-07-13 02:49:22 +00:00
Marcel Moolenaar
35e454b36c MFi386: rev 1.213 -- fix DELAY while the debugger is active.
This also fixes the (runtime) breakage introduced in the previous
commit that was the result of a botched merge. This hasn't even
been compile-tested...
2004-07-11 18:07:55 +00:00
Marcel Moolenaar
8bcb1e9e84 Add options KDB and GDB. KDB takes on the function of what DDB used
to be. Both DDB and GDB specify which KDB backends to include.
2004-07-11 03:20:09 +00:00
Marcel Moolenaar
1ca618fcaa Remove the now unused GDB stubs. See src/sys/gdb/* for the new KDB
backend.
2004-07-11 01:47:26 +00:00
Marcel Moolenaar
37224cd3fc Mega update for the KDB framework: turn DDB into a KDB backend.
Most of the changes are a direct result of adding thread awareness.
Typically, DDB_REGS is gone. All registers are taken from the
trapframe and backtraces use the PCB based contexts. DDB_REGS was
defined to be a trapframe on all platforms anyway.
Thread awareness introduces the following new commands:
	thread X	switch to thread X (where X is the TID),
	show threads	list all threads.

The backtrace code has been made more flexible so that one can
create backtraces for any thread by giving the thread ID as an
argument to trace.

With this change, ia64 has support for breakpoints.
2004-07-10 23:47:20 +00:00
Marcel Moolenaar
417e3e7b6f MFi386: don't fake the time counter when the debugger is active.
This breaks the fundamental property of DELAY(). Instead, avoid
grabbing clock_lock when kdb_active is non-zero.
2004-07-10 22:42:22 +00:00
Marcel Moolenaar
f9c8fc6017 Remove obsolete prototype of kdb_trap(). 2004-07-10 22:39:56 +00:00
Marcel Moolenaar
4b0cc40d57 Update for the KDB framework:
o  Make debugging support conditional upon KDB instead of DDB.
o  Remove implementation of Debugger().
o  Don't make setjump() and longjump() conditional upon DDB.
o  s/ddb_on_nmi/kdb_on_nmi/g
o  Call kdb_reenter() when kdb_active is non-zero. Call kdb_trap()
   otherwise.
2004-07-10 22:39:17 +00:00
Marcel Moolenaar
5a39cbaf69 Implement makectx(). The makectx() function is used by KDB to create
a PCB from a trapframe for purposes of unwinding the stack. The PCB
is used as the thread context and all but the thread that entered the
debugger has a valid PCB.
This function can also be used to create a context for the threads
running on the CPUs that have been stopped when the debugger got
entered. This however is not done at the time of this commit.
2004-07-10 19:56:00 +00:00
Marcel Moolenaar
cbc174356c Introduce the KDB debugger frontend. The frontend provides a framework
in which multiple (presumably different) debugger backends can be
configured and which provides basic services to those backends.
Besides providing services to backends, it also serves as the single
point of contact for any and all code that wants to make use of the
debugger functions, such as entering the debugger or handling of the
alternate break sequence. For this purpose, the frontend has been
made non-optional.
All debugger requests are forwarded or handed over to the current
backend, if applicable. Selection of the current backend is done by
the debug.kdb.current sysctl. A list of configured backends can be
obtained with the debug.kdb.available sysctl. One can enter the
debugger by writing to the debug.kdb.enter sysctl.
2004-07-10 18:40:12 +00:00
Marcel Moolenaar
72d44f31a6 Introduce the GDB debugger backend for the new KDB framework. The
backend improves over the old GDB support in the following ways:
o  Unified implementation with minimal MD code.
o  A simple interface for devices to register themselves as debug
   ports, ala consoles.
o  Compression by using run-length encoding.
o  Implements GDB threading support.
2004-07-10 17:47:22 +00:00
Brian Somers
0ac4013324 Change the following environment variables to kernel options:
bootp -> BOOTP
    bootp.nfsroot -> BOOTP_NFSROOT
    bootp.nfsv3 -> BOOTP_NFSV3
    bootp.compat -> BOOTP_COMPAT
    bootp.wired_to -> BOOTP_WIRED_TO

- i.e. back out the previous commit.  It's already possible to
pxeboot(8) with a GENERIC kernel.

Pointed out by: dwmalone
2004-07-08 22:35:36 +00:00
Brian Somers
59e1ebc9b5 Change the following kernel options to environment variables:
BOOTP -> bootp
    BOOTP_NFSROOT -> bootp.nfsroot
    BOOTP_NFSV3 -> bootp.nfsv3
    BOOTP_COMPAT -> bootp.compat
    BOOTP_WIRED_TO -> bootp.wired_to

This lets you PXE boot with a GENERIC kernel by putting this sort of thing
in loader.conf:

    bootp="YES"
    bootp.nfsroot="YES"
    bootp.nfsv3="YES"
    bootp.wired_to="bge1"

or even setting the variables manually from the OK prompt.
2004-07-08 13:40:33 +00:00
Peter Wemm
d8ad50b704 MFi386: various io apic cleanups 2004-07-08 01:42:49 +00:00
Peter Wemm
335e282d3c MFi386: use rman access methods instead of groping around inside
struct resource
2004-07-08 01:34:24 +00:00
Peter Wemm
2e37b53aba MFi386: whitespace nit fix (spare blank line) 2004-07-08 01:32:25 +00:00
Peter Wemm
f08560a569 MFi386: fix up CR0 settings 2004-07-08 01:31:13 +00:00
Peter Wemm
6ad90df2e3 MFi386: 1.57: transparently respect alignment/boundary tags 2004-07-08 01:28:33 +00:00
Alan Cox
26a965568d Simplify the control flow in pmap_extract(), enabling the elimination of a
PMAP_UNLOCK() call.
2004-07-07 16:47:58 +00:00
Alan Cox
03c0ca74ee White space and style changes only. 2004-07-07 02:23:46 +00:00
Alan Cox
c9a217d2ed Style changes to pmap_extract(). 2004-07-06 02:33:11 +00:00
John Baldwin
0c0b25ae91 Implement preemption of kernel threads natively in the scheduler rather
than as one-off hacks in various other parts of the kernel:
- Add a function maybe_preempt() that is called from sched_add() to
  determine if a thread about to be added to a run queue should be
  preempted to directly.  If it is not safe to preempt or if the new
  thread does not have a high enough priority, then the function returns
  false and sched_add() adds the thread to the run queue.  If the thread
  should be preempted to but the current thread is in a nested critical
  section, then the flag TDF_OWEPREEMPT is set and the thread is added
  to the run queue.  Otherwise, mi_switch() is called immediately and the
  thread is never added to the run queue since it is switch to directly.
  When exiting an outermost critical section, if TDF_OWEPREEMPT is set,
  then clear it and call mi_switch() to perform the deferred preemption.
- Remove explicit preemption from ithread_schedule() as calling
  setrunqueue() now does all the correct work.  This also removes the
  do_switch argument from ithread_schedule().
- Do not use the manual preemption code in mtx_unlock if the architecture
  supports native preemption.
- Don't call mi_switch() in a loop during shutdown to give ithreads a
  chance to run if the architecture supports native preemption since
  the ithreads will just preempt DELAY().
- Don't call mi_switch() from the page zeroing idle thread for
  architectures that support native preemption as it is unnecessary.
- Native preemption is enabled on the same archs that supported ithread
  preemption, namely alpha, i386, and amd64.

This change should largely be a NOP for the default case as committed
except that we will do fewer context switches in a few cases and will
avoid the run queues completely when preempting.

Approved by:	scottl (with his re@ hat)
2004-07-02 20:21:44 +00:00
Warner Losh
9a879a35fb We need to make resources visible here as well. 2004-06-30 19:24:26 +00:00
Nate Lawson
1a26ea7f2c Add machdep quirks functions. On i386, this disables acpi on systems with
BIOS dates earlier than Jan 1, 1999.  Add prototypes and quirks flags.
2004-06-30 04:42:29 +00:00
John Baldwin
a7cd01df0e Fetch the actual acpi0 device_t and use device_is_attached() to see if
it's alive rather than trying to fetch its softc pointer via its devclass.

Glanced at by:	imp, njl
2004-06-23 17:59:01 +00:00
Alan Cox
db2a1d54cc Implement the protection check required by the pmap_extract_and_hold()
specification.  This enables the elimination of Giant from that function.
2004-06-23 04:37:14 +00:00
Alan Cox
dc8beb5358 - Simplify pmap_remove_pages(), eliminating unnecessary indirection.
- Simplify the locking of pmap_is_modified() by converting control flow to
   data flow.
2004-06-20 20:57:06 +00:00
Alan Cox
1ec4b75936 Add pmap locking to pmap_is_prefaultable(). 2004-06-20 06:11:00 +00:00
Bruce Evans
4c5f10a672 Backed out previous commit. Blind substitution of dev_t by `struct cdev *'
was just wrong here because the dev_t's are user dev_t's.
2004-06-20 03:52:50 +00:00
Alan Cox
785f2cdf57 Remove unused pt_entry_ts. Remove an unneeded semicolon. 2004-06-19 19:09:08 +00:00
Bruce Evans
7a637a637e Include <sys/_lock.h>'s prerequisite <sys/queue.h> before including the
former, not after.

Don't hide this bug by including <sys/queue.h> in <sys/_lock.h>.
2004-06-19 14:58:35 +00:00
Peter Wemm
a5de0db8b5 Try harder to give new processes a clean initial fpu state. fpu_cleanstate
wasn't actually clean, it was saving the xmm registers as left over by the
bios.  fninit() doesn't clear those.

In fpudna(), instead of doing a fninit() and forgetting to load the initial
mxcsr, do a full fxrstor(&fpu_cleanstate).  Otherwise we hand over whatever
random values are left in the xmm registers by the last user.

I'm not certain of whether this is excessive paranoia or not, but there was
an outright bug in neglecting to set the mxcsr value that caused awk to
SIGFPE in some case.  Especially for Tim Robbins. :-)

i386 probably should do something about the mxcsr setings too.

Found by:  tjr
2004-06-18 04:01:54 +00:00
Nate Lawson
37c55a039a Revert last change. If acpi is loaded or compiled into the kernel, its
devclass will be present even if the driver was disabled by a hint.  Using
device_get_softc() provides the right info even if it's overkill.

Explained by:	jhb
2004-06-17 17:27:37 +00:00
Alan Cox
d45f21f31a Do not preset PG_BUSY on VM_ALLOC_NOOBJ pages. Such pages are not
accessible through an object.  Thus, PG_BUSY serves no purpose.
2004-06-17 06:16:58 +00:00
Poul-Henning Kamp
89c9c53da0 Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.
2004-06-16 09:47:26 +00:00
Alan Cox
d7c3bd4918 Add some lock assertions. Lock a small part of pmap_enter(). 2004-06-16 07:51:19 +00:00
Alan Cox
82d8e6f5a0 Correct an error in the implementation of pmap_is_prefaultable(). When I
introduced this function in revision 1.441, I inverted one of the
comparisons.
2004-06-16 03:11:24 +00:00
Alan Cox
7b9d474460 Remove a stale comment. 2004-06-15 19:28:40 +00:00
Alan Cox
0dec7f69a6 Add pmap locking to pmap_extract(), pmap_mincore(), and pmap_remove(). 2004-06-15 07:41:44 +00:00
Nate Lawson
345281bc43 We only need the devclass_find() result, not the softc. 2004-06-15 02:12:12 +00:00
Alan Cox
50f91a9445 Introduce pmap locking to many of the pmap functions. There is more to
come later.
2004-06-14 01:17:50 +00:00
David E. O'Brien
2a79761d77 The majority of FreeBSD/amd64 machines are SMP, so use ADAPTIVE_MUTEXES
by default to improve performance.
2004-06-13 23:03:57 +00:00
Alan Cox
7881f95056 Prevent the loss of a PG_M bit through an SMP race in pmap_ts_referenced(). 2004-06-13 21:59:42 +00:00
Alan Cox
b34ec165b4 Remove dead or unneeded code, e.g., spl calls. 2004-06-13 19:48:38 +00:00
Alan Cox
8559e0a291 - Remove an unused declaration.
- Move a definition inside the scope of a #ifdef _KERNEL.
2004-06-13 03:44:11 +00:00
Alan Cox
2d0dc0fcd6 In a multiprocessor, the PG_W bit in the pte must be changed atomically.
Otherwise, the setting of the PG_M bit by one processor could be lost if
another processor is simultaneously changing the PG_W bit.

Reviewed by:	tegge@
2004-06-12 20:01:48 +00:00
Poul-Henning Kamp
1930e303cf Deorbit COMPAT_SUNOS.
We inherited this from the sparc32 port of BSD4.4-Lite1.  We have neither
a sparc32 port nor a SunOS4.x compatibility desire these days.
2004-06-11 11:16:26 +00:00
Peter Wemm
a520047095 Argh. Add the mini-stack-frame back in for mcount's benefit for syscall
stubs.
2004-06-10 22:02:26 +00:00
Peter Wemm
6d05d7c75a Make profiling work for varargs functions.. %al is an additional argument
which indicates the number of xmm registers used in the varargs.  This
stops the explosion that happened when profiling printf() etc.
2004-06-10 22:00:58 +00:00
Peter Wemm
11b000253e Insta-MFi386: ignore disabled cpu apic id's entirely 2004-06-10 21:30:08 +00:00
John Baldwin
bad4ce7d91 - Use the correct devclass name ("acpi" vs "ACPI") to detect if acpi0 is
present and thus that the PnPBIOS probe should be skipped instead of
  having ACPI zero out the PnPBIOStable pointer.
- Make the PnPBIOStable pointer static to i386/i386/bios.c now that that is
  the only place it is used.
2004-06-10 20:43:04 +00:00
John Baldwin
092a5c4530 Remove atdevbase and replace it's remaining uses with direct references to
KERNBASE instead.
2004-06-10 20:31:00 +00:00
Peter Wemm
591506d322 In pmap_extract_and_hold(), there is no need to mask off PG_FRAME because
pmap_extract() already does it.
In pmap_enter(), opa has already been masked so don't do it again.
Wrap a long line (recent transgression).
Use trunc_page() in pmap_mapdev() instead of anding with PG_FRAME, since
that is what we really meant.

Submitted by:  alc (first item)
2004-06-08 02:20:40 +00:00
Peter Wemm
71e0fe3abc Fix my silly typo in asm statement in previous commit. 2004-06-08 01:35:48 +00:00
Peter Wemm
576bb07aaa Argh. Remove stray number that slipped into the previous commit. 2004-06-08 01:20:37 +00:00
Peter Wemm
96a7759e99 Reapply rev 1.151 after enable sse/fpuinit order fixed in mp_machdep.c
Obtained from:  das
2004-06-08 01:14:39 +00:00
Peter Wemm
18154cd6f8 Set up the fpu *after* enabling SSE mode on AP's
Submitted by: (argh, I can't find the email)
2004-06-08 01:07:51 +00:00
Peter Wemm
430e272c7e Initial PG_NX support (no-execute page bit)
- export the rest of the cpu features (and amd's features).
- turn on EFER_NXE, depending on the NX amd feature bit
- reorg the identcpu stuff a bit in order to stop treating the
  amd features as second class features (since it is now a primary feature
  bit set) and make it easier to export.
2004-06-08 01:02:52 +00:00
Peter Wemm
7d95d34bb7 Mask pte's with PG_FRAME before passing it to PHYS_TO_VM_PAGE().. PG_NX
lives in the top 12 'available' bits.  atop() in the PHYS_TO_VM_PAGE()
macro only masks off the lower bits (by accident) and the upper bits
in the 64 bit ptes turn into "interesting" index values.
2004-06-08 00:29:42 +00:00
Peter Wemm
74567f8f85 Use trunc_page(va) when we mean it rather than anding it with PG_FRAME
(which doesn't work all that well when there are bits at the top that are
 masked by PG_FRAME)
2004-06-08 00:11:32 +00:00
Peter Wemm
c2490f2dfb Fix a serious problem that manifested during swap, and a few other times.
pmap_remove() would be called with a huge range and we'd stride across
it in only 2MB chunks.  This would manifest as massive cpu time and a
largely unresponsive system during hard swap.  Instead, check the higher
page directories which means we can run pmap_remove() in just a few
hundred loop iterations instead of millions since we can process
address space in chunks of 512GB and 1GB as well as 2MB.

Eternal thanks to:  tmm
2004-06-07 23:51:20 +00:00
Peter Wemm
b8168edefc Be a little more consistent in the naming of the PML4 defines. 2004-06-07 23:47:59 +00:00
David Schultz
8c2267ec9a Back out revision 1.150, since dwmalone reports that it causes a panic
upon startup on his machine.
2004-06-06 09:16:02 +00:00
David Schultz
ad070467cd Initialize the MXCSR to the appropriate default value at startup.
Tested on:	tjr
2004-06-05 03:13:39 +00:00
Poul-Henning Kamp
79005bbdbe Add new bios_string() which will hunt for a string inside a given range
of the BIOS.  This can be used for finding arbitrary magic in the BIOS
in order to recognize particular platforms.
2004-06-03 22:36:24 +00:00
Peter Wemm
74cfa96999 MFi386: add ixgp device 2004-06-03 21:40:41 +00:00
Peter Wemm
ea10166e8a MFi386: apic intpin programming updates etc. 2004-06-03 20:25:05 +00:00
Peter Wemm
bfe14b3edc MFi386: remove debug printf 2004-06-03 20:22:48 +00:00
Peter Wemm
b5eb4e5196 Move module.h include to the same place as on i386 for diff reduction. 2004-06-03 20:21:30 +00:00
Peter Wemm
9248fc7bc0 MFi386: move cpu_nameclass struct next to its only consumer 2004-06-03 20:18:15 +00:00
Tim J. Robbins
cc05397ffc Remove checks for curthread == NULL - it can't happen. 2004-06-03 10:22:47 +00:00
Poul-Henning Kamp
fd360128ff Add missing <sys/module.h> instances which were shadowed by the nested
include in <sys/kernel.h>
2004-06-03 05:58:30 +00:00
Tim J. Robbins
fa2a4d0595 Move TDF_DEADLKTREAT into td_pflags (and rename it accordingly) to avoid
having to acquire sched_lock when manipulating it in lockmgr(), uiomove(),
and uiomove_fromphys().

Reviewed by:	jhb
2004-06-03 01:47:37 +00:00
Tim J. Robbins
aa0aa7a113 Move TDF_SA from td_flags to td_pflags (and rename it accordingly)
so that it is no longer necessary to hold sched_lock while
manipulating it.

Reviewed by:	davidxu
2004-06-02 07:52:36 +00:00
Poul-Henning Kamp
41ee9f1c69 Add some missing <sys/module.h> includes which are masked by the
one on death-row in <sys/kernel.h>
2004-05-30 17:57:46 +00:00
Alan Cox
b59f545aa2 MFi386 revision 1.6
Reenable ithread preemption for interrupts that occur while executing in
 the kernel.
2004-05-30 04:49:39 +00:00
Tim J. Robbins
402705521a Implement __bb_init_func. This is a fairly straightforward conversion
of the i386 version.
2004-05-29 01:13:28 +00:00
Alan Cox
662d471da6 Remove a broken micro-optimization from pmap_enter(). The ill effect
of this micro-optimization occurs when we call pmap_enter() to wire an
already mapped page.  Because of the micro-optimization, we fail to
mark the PTE as wired.  Later, on teardown of the address space,
pmap_remove_pages() destroys the PTE before vm_fault_unwire() has
unwired the page.  (pmap_remove_pages() is not supposed to destroy
wired PTEs.  They are destroyed by a later call to pmap_remove().)
Thus, the page becomes lost.

Note: The page is not lost if the application called munlock(2), only
if it relies on teardown of the address space to unwire its pages.

For the historically inclined, this bug was introduced by a
megacommit, revision 1.182, roughly six years ago.

Leak observed by: green@ and dillon independently
Patch submitted by: dillon at backplane dot com
Reviewed by: tegge@
MFC after: 1 week
2004-05-28 19:42:02 +00:00
Thomas Moestl
65e29c4822 Retire cpu_sched_exit(); it is not used any more. 2004-05-26 12:09:39 +00:00
Bruce Evans
026afdcc05 Quick fix for overflow when tsc_freq >= 2^31. "int profrate" in struct
gmon and struct gmonhdr was originally just to represent the kernel
(profiling) clock frequency and it remains poorly suited to representing
the frequencies of fast counters like the TSC.  It broke a year or two
ago.  This quick fix keeps it working for another year or month or two
until TSC frequencies can exceed 2^32, by dividing the frequency by 2.
Dividing the frequency by 4 would work for a little longer but would
lose a little too much precision.
2004-05-26 09:43:38 +00:00
Bruce Evans
d7d197e0f9 Oops, ".align 4" for the data section in the previous commit should
have been ".p2align 4".  This bug is cosmetic since the data section
happens to be empty.
2004-05-24 12:42:16 +00:00
Bruce Evans
003d5d66b1 Fixed profiling of trap, syscall and interrupt handlers and some
ordinary functions, essentially by backing out half of rev.1.115 of
amd64/exception.S.  The handlers must be between certain labels for
the purposes of profiling, and this was broken by scattering them in
separately compiled .S files, especially for ordinary functions that
ended up between the labels.  Merge the files by #including them as
before, except with different pathnames and better comments and
organization.  Changes to the scattered files are minimal -- just
move the labels to the file that does the #includes.

This also partly fixes profiling of IPIs -- all IPI handlers are now
correctly classified as interrupt handlers, but many are still missing
mcount calls.
2004-05-24 12:08:56 +00:00
Bruce Evans
909ca1671d Don't repeat the definition of IDTVEC(). It is in asmacros.h. 2004-05-24 11:28:11 +00:00
Bruce Evans
7a9e253666 Added profiling support for Xint0x80_syscall. 2004-05-23 19:06:15 +00:00
Bruce Evans
5eb1e23a45 Adjusted for amd64 after repo-copy. The adjustments are routine, except:
- perfmon headers must be avoided until perfmon is supported.
- all call-used registers including return registers must be preserved
  by .mcount(), etc., not quite as in profile.h.  __cyg_profile_func_*()
  don't require this, but they are (mis)implemented as aliases for
  .mcount(), etc. so they preserve the registers.
- i386 ifdefs related to perfmon have not been adjusted yet.
2004-05-23 18:27:14 +00:00
Bruce Evans
03d5ca33db Restored FAKE_MCOUNT() and MEXITCOUNT invocations and adjusted them for
amd64 as necessary.  This is routine, except:
- the FAKE_MCOUNT($bintr) in doreti was missing the '$'.  This gave a
  a garbage address made up of padding bytes (with the nop byte 0x90 as
  the MSB) instead of the intended address of bintr.  This accidentally
  worked on i386's because (0x90 << 24) is close enough to bintr, but
  it doesn't work on amd64's because (0x90 << 56) is much further away
  from bintr.
- the FAKE_MCOUNT($btrap) in calltrap was similarly broken.  It hasn't
  been needed since FreeBSD-1, so just delete it.
2004-05-23 17:18:48 +00:00
Bruce Evans
5423714950 Adjusted FAKE_MCOUNT()s for amd64. This is needed for both ordinary
and high resolution profiling of interrupt handlers.  The adjustments
are routine once the magic stack offset 13*4 is decoded to be TF_RIP
(there were originally more types of stack frames so using TF_EIP for
one of them wouldn't have been much simpler).

Removed garbage comments attached to some of the FAKE_MCOUNT()s.
2004-05-23 16:23:29 +00:00
Bruce Evans
e2960917e1 Spell "retq" as "ret" in pagezero() like it is everywhere, else so
that the usual macro for "ret" hides the detail of calling .mexitcount
before returning.

Fixed missing call to .mexitcount in lgdt().  This was missing on
i386's, mainly because lgdt() uses lret[q] insted of ret.  This is
very unimportant since lgdt() is not (normally?) called until after
profiling is initialized.
2004-05-23 14:56:02 +00:00
Bruce Evans
69820c2a56 MFi386 (1.103 and 1.104: fixed some problems in high resolution profiling
and improved some comments).  Also, made the documented {f,s}uword()
functions the standard entry points and the undocumented {f,s}uword64()
functions alternative entry points, like {f,s}uword32() for i386's.  The
bitrot in the comments was a little larger here -- there are new undocumented
32-bit sub-word functions, not just renaming of 16-bit functions from
documented ones to undocumented ones.
2004-05-21 16:50:57 +00:00
Bruce Evans
5a8f125ad9 MFi386 (1.37: GUPROF calibration macros; only routine adjustments needed). 2004-05-20 16:22:57 +00:00
Peter Wemm
8dd1279c31 Like on i386, clear the last three entries in the pml4 page when doing a
pmap_release(), and put it the free queue marked as already zeroed.
2004-05-19 21:55:37 +00:00
Bruce Evans
8693960479 Fixed the type of fptrdiff_t. It needs to be 64 bits in theory, and in
practice too since kernel addresses are almost 2^64 higher than most
user addresses.
2004-05-19 16:19:11 +00:00
Bruce Evans
19b5915afa Fixed some style bugs (mainly misalignment of backslashes). 2004-05-19 16:04:26 +00:00
Bruce Evans
b2321e7cdb Moved most of the "MI" definitions and declarations from <machine/profile.h>
to <sys/gmon.h>.  Cleaned them up a little by not attempting to ifdef
for incomplete and out of date support for GUPROF in userland, as in
the sparc64 version.
2004-05-19 15:41:26 +00:00
Peter Wemm
eba9b48b10 Unbreak builds without DDB. Bad Bruce! No cookie! :-) 2004-05-19 01:23:48 +00:00
Peter Wemm
2079cde964 The 'call mcount' hooks that gcc inserts when profiling are in a place that
cannot handle the scratch registers being trashed.  So we have to preserve
them ourselves.
2004-05-18 22:52:32 +00:00
Stefan Farfeleder
b1aa0ba527 <stdint.h> should define WINT_M{AX,IN} independent from whether WCHAR_MIN is
defined.  Otherwise first including <wchar.h> and then <stdint.h> leads to no
WINT_M{AX,IN} at all.

PR:		64956
Approved by:	das (mentor)
2004-05-18 16:04:57 +00:00
Bruce Evans
130ff9c31a Fixed DDB_NOKLDSYM on amd64's:
machdep.c:
Initialize the symbol table pointers, not quite like for other arches.

db_elf.c:
Don't claim to be an i486 in the fake ELF header.
2004-05-18 05:30:06 +00:00
Peter Wemm
922013a665 Turn on modules for amd64. Fear. 2004-05-17 22:13:14 +00:00
Peter Wemm
910bb7dbe9 Deal with REL records that have the addend embedded variable sized targets
rather than the RELA table.  I dont know if bintutils will ever generate
REL records, but just in case.....
2004-05-17 21:16:49 +00:00
Peter Wemm
df4fd27737 Checkpoint some of what I was starting to tinker with for having some
different context support for 32 vs 64 bit processes.  This simply omits
the save/restore of the segment selector registers for non 32 bit
processes.  This avoids the rdmsr/rwmsr juggling when restoring %gs
clobbers the kernel msr that holds the gsbase.

However, I suspect it might be better to conditionally do this at
user<->kernel transition where we wouldn't need to do the juggling in the
first place.  Or have per-thread extended context save/restore hooks.
2004-05-16 22:43:57 +00:00
Peter Wemm
12c1418ccf Kill the LAZYPMAP ifdefs. While they worked, they didn't do anything
to help the AMD cpus (which have a hardware tlb flush filter).  I held
off to see what the 64 bit Intel cpus did, but it doesn't seem to help
much there either.  Oh well, store it in the Attic.
2004-05-16 22:11:50 +00:00
Peter Wemm
7be2e3e26d Converge some more with i386. 2004-05-16 21:27:29 +00:00
Peter Wemm
2a779082bb MFi386: add rue and twa 2004-05-16 20:57:01 +00:00
Peter Wemm
5119532b56 MFi386: avoid partial register references, for what its worth. 2004-05-16 20:46:13 +00:00
Peter Wemm
792e29ba26 For consistency with i386, have pmap_kenter_temporary() take a vm_paddr_t
argument.  It is actually the same type on amd64 (vm_paddr_t = vm_offset_t)
but this reduces the i386<->amd64 diffs a little.
2004-05-16 20:44:41 +00:00
Peter Wemm
463e5aa66e MFi386: numerous interrupt and acpi updates 2004-05-16 20:30:47 +00:00
Peter Wemm
e8855d4f97 Make a small revision to the api between the elf linker core and the
elf_reloc() backends for two reasons.  First, to support the possibility
of there being two elf linkers in the kernel (eg: amd64), and second, to
pass the relocbase explicitly (for relocating .o format kld files).
2004-05-16 20:00:28 +00:00