39955 Commits

Author SHA1 Message Date
Peter Wemm
cbd667fa2f Update comments. Note that the kernel is at -1GB, not -2GB as erroneously
implied by the previous commit.  KVM is still only 1GB until
pmap_growkernel() learns about the extra page table level.

Approved by:  re (blanket)
2003-05-23 06:35:45 +00:00
Peter Wemm
f229f5cf85 As suggested by the gdb folks, pad the 'struct fpreg' to a full 512 bytes
to match the native fxsave/fxrstor object size since that's apparently what
the Linux/NetBSD folks do.
2003-05-23 06:31:56 +00:00
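
The padding described above can be checked at compile time. The following is a minimal sketch, assuming a hypothetical layout named `fpreg_sketch`; the field names and split are illustrative and are not taken from the committed header. Only the 512-byte total comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout for illustration only (not the committed definition).
 * The fxsave/fxrstor save area is 512 bytes in total.
 */
struct fpreg_sketch {
    uint64_t env[4];       /*  32 bytes: control/status words and pointers */
    uint8_t  acc[8][16];   /* 128 bytes: x87/MMX registers */
    uint8_t  xmm[16][16];  /* 256 bytes: XMM registers */
    uint8_t  pad[96];      /*  96 bytes: padding up to the full 512 bytes */
};

/* Fail the build if the structure ever drifts from the fxsave size. */
_Static_assert(sizeof(struct fpreg_sketch) == 512,
    "fpreg_sketch must match the 512-byte fxsave area");

/* Helper so callers (and tests) can query the size at run time. */
size_t fpreg_sketch_size(void)
{
    return sizeof(struct fpreg_sketch);
}
```
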
Peter Wemm
637068b1d3 Low risk amd64 fix. Use a vm_offset_t for the virtual location of the
buffer space instead of a u_int32_t.  Otherwise the upper 32 bits of
the address space get truncated and syscons blows up.

Approved by:	re (safe, low risk amd64 fixes)
2003-05-23 05:10:49 +00:00
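
The truncation this commit fixes can be illustrated in user space. This is a sketch only: `vm_offset_sketch_t` is a stand-in for `vm_offset_t` on a 64-bit platform, and the address used in the usage note is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for vm_offset_t on a 64-bit platform (assumption for this sketch). */
typedef uint64_t vm_offset_sketch_t;

/* Storing a 64-bit virtual address in a 32-bit variable drops the upper half. */
uint64_t store_in_u32(uint64_t va)
{
    uint32_t narrow = (uint32_t)va;   /* upper 32 bits are lost here */
    return narrow;
}

/* Storing it in a pointer-sized integer keeps the whole address. */
uint64_t store_in_vm_offset(uint64_t va)
{
    vm_offset_sketch_t wide = va;
    assert(sizeof(wide) == sizeof(va));
    return wide;
}
```

With a hypothetical kernel address such as `0xffffffff800b8000`, the first helper returns `0x800b8000`, which no longer points into the kernel's negative address space, while the second returns the address unchanged.
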
Peter Wemm
9f0c4ab393 Deal with the user VM space expanding. 32 bit applications do not like
having their stack at the 512GB mark.  Give 4GB of user VM space for 32
bit apps.  Note that this is significantly more than on i386 which gives
only about 2.9GB of user VM to a process (1GB for kernel, plus page
table pages which eat user VM space).

Approved by: re (blanket)
2003-05-23 05:07:33 +00:00
Peter Wemm
3c9a3c9ca3 Major pmap rework to take advantage of the larger address space on amd64
systems.  Of note:
- Implement a direct mapped region using 2MB pages.  This eliminates the
  need for temporary mappings when getting ptes.  This supports up to
  512GB of physical memory for now.  This should be enough for a while.
- Implement a 4-tier page table system.  Most of the infrastructure is
  there for 128TB of userland virtual address space, but only 512GB is
  presently enabled due to a mystery bug somewhere.  The design of this
  was heavily inspired by the alpha pmap.c.
- The kernel is moved into the negative address space(!).
- The kernel has 2GB of KVM available.
- Provide a uma memory allocator to use the direct map region to take
  advantage of the 2MB TLBs.
- Fixed some assumptions in the bus_space macros about the ability
  to fit virtual addresses in an 'int'.

Notable missing things:
- pmap_growkernel() should be able to grow to 512GB of KVM by expanding
  downwards below kernbase.  The kernel must be at the top 2GB of the
  negative address space because of gcc code generation strategies.
- need to fix the >512GB user vm code.

Approved by:	re (blanket)
2003-05-23 05:04:54 +00:00
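
The sizes quoted in this commit follow from standard amd64 paging arithmetic: 4KB pages and 9 index bits per page table level. The sketch below works through the numbers. The helper names are hypothetical, and reading the 512GB figure as the span of a single top-level entry is an interpretation, not something the commit states.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH  12   /* 4KB base pages */
#define LEVEL_BITS_SKETCH   9   /* 512 entries per table */

/* Bytes mapped by one entry at a level (1 = lowest table, 4 = top level). */
uint64_t bytes_per_entry(int level)
{
    assert(level >= 1 && level <= 4);
    return (uint64_t)1 << (PAGE_SHIFT_SKETCH + LEVEL_BITS_SKETCH * (level - 1));
}

/* Total virtual address space reachable through four levels (48 bits). */
uint64_t total_va_bytes(void)
{
    return (uint64_t)1 << (PAGE_SHIFT_SKETCH + LEVEL_BITS_SKETCH * 4);
}

/* An address 2GB below the top of the 64-bit space ("-2GB"). */
uint64_t minus_2gb(void)
{
    return (uint64_t)0 - ((uint64_t)2 << 30);
}
```

A level-2 entry covers 2MB (the large page size used for the direct map), a level-4 entry covers 512GB, and half of the 48-bit space is 128TB, which matches the userland figure in the message.
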
Greg Lehey
74f2cc2c9c Change the way the plex lock mutexes work. Previously they were part
of the struct plex, which tore apart the mutex linked lists when the
plex table was expanded.  Now we maintain a pool of mutexes (currently
32) to be shared by all plexes.  This is still a lot better than the
splhigh() method used in other architectures.

expand_table: Add parameters file and line if we're debugging.

Approved by: re (jhb)
2003-05-23 01:15:55 +00:00
Greg Lehey
93573e2e76 Change the way the plex lock mutexes work. Previously they were part
of the struct plex, which tore apart the mutex linked lists when the
plex table was expanded.  Now we maintain a pool of mutexes (currently
32) to be shared by all plexes.  This is still a lot better than the
splhigh() method used in other architectures.

Add and clarify comments.

Approved by: re (jhb)
2003-05-23 01:15:30 +00:00
Greg Lehey
7db14b2ff2 expand_table: Add parameters file and line if we're debugging.
MMalloc, vinum_meminfo: Use strlcpy to copy file name.

Approved by: re (jhb)
2003-05-23 01:15:01 +00:00
Greg Lehey
d026346c86 Change the way the plex lock mutexes work. Previously they were part
of the struct plex, which tore apart the mutex linked lists when the
plex table was expanded.  Now we maintain a pool of mutexes (currently
32) to be shared by all plexes.  This is still a lot better than the
splhigh() method used in other architectures.

Approved by: re (jhb)
2003-05-23 01:14:35 +00:00
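
The pooled-mutex idea can be sketched in user space with POSIX threads. This is not the vinum code: the names are hypothetical, and mapping a plex to a mutex by taking the plex number modulo the pool size is an assumption made for illustration. The point is that the mutexes live in a fixed array, so growing the plex table never moves them.

```c
#include <assert.h>
#include <pthread.h>

#define PLEX_MUTEXES_SKETCH 32   /* pool size named in the commit message */

/* The pool is static, so expanding the plex table cannot relocate a mutex. */
static pthread_mutex_t plex_mutex_pool[PLEX_MUTEXES_SKETCH];

/* Initialise every mutex in the pool once at start-up. */
void plex_pool_init(void)
{
    for (int i = 0; i < PLEX_MUTEXES_SKETCH; i++)
        pthread_mutex_init(&plex_mutex_pool[i], NULL);
}

/* Pick the shared mutex for a plex (assumed mapping: plexno modulo pool size). */
pthread_mutex_t *plex_lock_for(int plexno)
{
    assert(plexno >= 0);
    return &plex_mutex_pool[plexno % PLEX_MUTEXES_SKETCH];
}
```

Two plexes whose numbers differ by a multiple of 32 share a mutex under this mapping, which costs some contention but keeps the number of locks bounded.
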
Greg Lehey
8a697ff435 detachobject: Update volume config after detaching a plex.
update_volume_config: Remove redundant diskconfig parameter.

Approved by: re (jhb)
2003-05-23 01:14:13 +00:00
Greg Lehey
cb5eba5e09 Change the way the plex lock mutexes work. Previously they were part
of the struct plex, which tore apart the mutex linked lists when the
plex table was expanded.  Now we maintain a pool of mutexes (currently
32) to be shared by all plexes.  This is still a lot better than the
splhigh() method used in other architectures.

update_volume_config: Remove redundant diskconfig parameter.

expand_table: Add parameters file and line if we're debugging.

Approved by: re (jhb)
2003-05-23 01:13:43 +00:00
Greg Lehey
f7b76dc815 Change many strcpys to strlcpys, etc.
Submitted by:	   Ted Unangst <tedu@stanford.edu>

Correct some inaccurate and badly formatted comments.

config_subdisk: If our drive is down, ensure that the subdisk is
		crashed.  Previously it was possible for the subdisk
		to be up when the drive was down.

Change the way the plex lock mutexes work.  Previously they were part
of the struct plex, which tore apart the mutex linked lists when the
plex table was expanded.  Now we maintain a pool of mutexes (currently
32) to be shared by all plexes.  This is still a lot better than the
splhigh() method used in other architectures.

update_volume_config: Remove redundant diskconfig parameter.

Approved by: re (jhb)
2003-05-23 01:13:10 +00:00
Peter Wemm
997f3bfc2a Merge from i386/trap.c rev 1.252. Use td_critnest instead of the
spinlocks count for explicitly enabling interrupts.

Approved by:	re (blanket)
2003-05-22 20:09:50 +00:00
Bernd Walter
cdc95e1bb8 Calculate routed interrupts using the slot number from the device and
not that of the bridge.

Approved by:	re (jhb)
2003-05-22 17:45:26 +00:00
Mike Barcroft
6f9622a926 Fix two misuses of __BSD_VISIBLE.
Submitted by:	bde
Approved by:	re
2003-05-22 17:07:57 +00:00
Julian Elischer
faaa20f639 When we are spilling threads out of the run queue during panic, make sure we
keep the thread state variable consistent with its real state.
i.e. Don't say it's on the run queue when it isn't.

Also clarify the associated comment.

Turns a double panic back to a single panic :-/

Approved by:	re@ (jhb)
2003-05-21 18:53:25 +00:00
Poul-Henning Kamp
67fd2837cd Return ENXIO if the softc pointer is NULL; in all likelihood the
disk is in the process of disappearing.

Approved by:	re/rwats*
2003-05-21 18:52:29 +00:00
Paul Saab
3284b9ee87 Make ciss usable under PAE
Approved by:	re (scottl)
2003-05-21 07:17:06 +00:00
Paul Saab
487a8c7e61 - Make this work with PAE.
- Atomically load and clear the status block so we don't miss an
  update.
  Submitted by: jdp

Approved by:	re (scottl)
2003-05-21 07:00:49 +00:00
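
Why the load and the clear must be a single atomic step can be shown with C11 atomics. This is a sketch under assumptions: `status_word`, `device_post_status()` and `driver_take_status()` are hypothetical names, and the driver itself uses kernel primitives rather than `<stdatomic.h>`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical status word that the device side sets and the driver consumes. */
static _Atomic uint32_t status_word;

/* Device side: post new status bits. */
void device_post_status(uint32_t bits)
{
    atomic_fetch_or(&status_word, bits);
}

/*
 * Driver side: read and clear in one atomic exchange.  With a separate
 * load followed by a store of zero, bits posted between the two steps
 * would be wiped out without ever being seen.
 */
uint32_t driver_take_status(void)
{
    return atomic_exchange(&status_word, 0);
}
```
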
Nate Lawson
742d91f211 Quirk for Hitachi DVD USB drive. It returns "invalid field in cdb" for
normal INQUIRY requests so enable the NO_INQUIRY quirk.

Submitted by:	Lars Eggert <larse@ISI.EDU>
Approved by:	re (scottl)
2003-05-21 00:22:07 +00:00
John Baldwin
7f4725bd09 The per-CPU spinlocks list is only maintained when WITNESS is enabled.
Thus, treat all page faults while in a critical section as fatal rather
than just those that occur with a non-empty spinlocks list.  All such page
faults are fatal anyways.  Calling trap_fatal() earlier increases the
chances of getting more useful panic messages and a possible DDB prompt.

Approved by:	re (scottl)
2003-05-20 20:50:33 +00:00
Nate Lawson
2f8f9581dd Remove a redundant quirk. Instead, we wildcard all Asahi Optical chips.
Approved by:	re
2003-05-20 18:04:42 +00:00
Marcel Moolenaar
bfaccb767c o  Fix a definite bogon: the dirty bit fault, instruction access
   fault and data access fault install the PTE in question into
   the VHPT table. However, a post-increment was missing and we
   wrote the raw PTE data into the pagesize/access key field.
   This leaves a corrupt VHPT entry.
o  While here, remove the explicit cache purge. Insertion into
   the translation implicitly purges any overlapping entries.
o  Make sure there's a cycle break between the itc and the rfi.
o  Whitespace fixes.
2003-05-20 06:57:20 +00:00
Marcel Moolenaar
14d2ae56c7 Rename the "IA64 ITC" counter to "ITC" counter. We don't call the
"TSC" counter on i386 "I386 TSC".

Approved by: re@ (blanket)
2003-05-20 06:51:20 +00:00
Marcel Moolenaar
9b9ce577d4 Prevent corruption of the VHPT collision chain by protecting it with
a mutex. The only volatile chain operations are insertion and deletion
but since updating an existing PTE also updates the VHPT entry itself,
and we have the VHPT mutex in both other cases, we also lock when we
update an existing PTE even though no chain operation is involved.
Note that we perform the insertion and deletion carefully enough that
we don't need to lock traversals. If we need to lock traversals, we
also need to lock from the exception handler, which we can't without
creating a trapframe.

We're now able to withstand a -j8 buildworld. More work is needed to
withstand Murphy fields. In other words: we still have a bogon...

Approved by: re@ (blanket)
2003-05-20 02:52:41 +00:00
Peter Wemm
62d8fb93d0 Deal with the possibility of negative available space from the file server
to avoid Bad Things(TM) happening (eg: df crashing with a floating point
exception).

Submitted by:	Harold Gutch <logix@foobar.franken.de>
Approved by:	re (scottl)
2003-05-19 22:35:00 +00:00
Peter Wemm
3830dc4629 Another x86-64 comment fixup
Approved by:	re (blanket amd64 stuff)
2003-05-19 22:19:02 +00:00
Peter Wemm
92f0cd89a0 s/x86_64/amd64/ in comments in header.
Approved by:	re (blanket amd64)
2003-05-19 22:15:30 +00:00
Alexander Kabaev
980ded9a7d sys/sys/limits.h:
- Fix visibility test for LONG_BIT and WORD_BIT.  `#if defined(__FOO_VISIBLE)'
   is always wrong because __FOO_VISIBLE is always defined (to 0 for
   invisibility).

sys/<arch>/include/limits.h
sys/<arch>/include/_limits.h:

 - Style fixes.

Submitted by:	bde
Reviewed by:	bsdmike
Approved by:	re (scottl)
2003-05-19 20:29:07 +00:00
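
The pitfall named in this commit is easy to reproduce. In the sketch below, `SKETCH_FOO_VISIBLE` is a hypothetical stand-in for a `__FOO_VISIBLE` macro that is always defined and set to 0 when the feature should be hidden.

```c
#include <assert.h>

/* Always defined; 0 means "not visible" (stand-in for __FOO_VISIBLE). */
#define SKETCH_FOO_VISIBLE 0

/* Wrong: tests only whether the macro exists, which is always true. */
#if defined(SKETCH_FOO_VISIBLE)
#define WRONG_TEST_EXPOSES 1
#else
#define WRONG_TEST_EXPOSES 0
#endif

/* Right: tests the macro's value. */
#if SKETCH_FOO_VISIBLE
#define RIGHT_TEST_EXPOSES 1
#else
#define RIGHT_TEST_EXPOSES 0
#endif

/* Report what each form of the test decided. */
int wrong_test_exposes(void) { return WRONG_TEST_EXPOSES; }
int right_test_exposes(void) { return RIGHT_TEST_EXPOSES; }
```
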
Søren Schmidt
e1750fb855 Print the right position on disk errors
Approved by: re@
2003-05-19 13:43:12 +00:00
Søren Schmidt
c9f5649b3e Unbork the chip locating code.
Approved by: re@
2003-05-19 13:42:23 +00:00
Marcel Moolenaar
b8c4149cff Turn pmap_install_pte() into a critical section. We better not get
interrupted while writing into the VHPT table. While here, make sure
memory accesses are properly ordered. Tag invalidation must happen
first so that the hardware VHPT walker will not be able to match
this entry while we're updating it and we have to make sure the
new tag gets written only after the PTE is completely updated.

Approved by: re (blanket)
2003-05-19 08:02:36 +00:00
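
The ordering requirement above (invalidate the tag, update the body, publish the new tag last) can be sketched with C11 atomics. This is an illustration only: the structure, the field names and the invalid-tag value are hypothetical, the kernel code uses ia64 fences and a critical section instead, and a hardware walker sits outside the C memory model.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical "never matches" tag value for this sketch. */
#define TAG_INVALID_SKETCH ((uint64_t)1 << 63)

/* Hypothetical entry layout; not the real VHPT entry definition. */
struct vhpt_entry_sketch {
    uint64_t         pte;
    uint64_t         itir;
    _Atomic uint64_t tag;
};

void install_entry(struct vhpt_entry_sketch *e,
    uint64_t pte, uint64_t itir, uint64_t tag)
{
    /* 1. Invalidate first so a concurrent reader cannot match the entry. */
    atomic_store_explicit(&e->tag, TAG_INVALID_SKETCH, memory_order_relaxed);
    /* Keep the body writes below from moving ahead of the invalidation. */
    atomic_thread_fence(memory_order_seq_cst);

    /* 2. Update the body while the entry cannot match. */
    e->pte = pte;
    e->itir = itir;

    /* 3. Publish the new tag only after the body is completely written. */
    atomic_store_explicit(&e->tag, tag, memory_order_release);
}
```
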
Marcel Moolenaar
a75b99ea2d Unconditionally set pcb_current_pmap. WIP versions of the code
previously committed cleared pcb_current_pmap prior to changing
the region registers, but that was removed before committing.
Since we don't normally (at all?) pass a NULL pointer, the bug
was mostly harmless. Fix it while I'm here...

I'm here because we need to have data serialization after writing
to the region registers. Not doing so was likely the cause of the
hangs we were experiencing. General exceptions in cpu_switch may
also be caused by the lack of serialization.

Approved by: re (blanket)
2003-05-19 06:05:30 +00:00
Marcel Moolenaar
dc0bde0f18 pmap_install() needs to be atomic WRT context switching. Protect
switching user regions (region 0-4) with schedlock. Avoid unnecessary
recursion on schedlock by moving the core functionality to another
function (pmap_switch()) where we assert schedlock is held. Turn
pmap_install() into a wrapper that grabs schedlock. This minimizes
the number of callsites that need to be changed.
Since we already have schedlock in cpu_switch() and cpu_throw(),
have them call pmap_switch() directly. These were also the only two
calls to pmap_install() outside pmap.c, so make pmap_install() static
and remove its prototype from pmap.h

Approved by: re (blanket)
2003-05-19 04:16:30 +00:00
Greg Lehey
4555a3de62 print_config:
Change config format slightly to save plex preferences correctly.

vinum_scandisk: reinitialise volatile pointer after function call.
This is the "deafc0de" bug.

Approved by: re (scottl)
2003-05-19 02:21:31 +00:00
David Schultz
e92686d065 If we seem to be out of VM, don't allow the pagedaemon to kill
processes in the first pass.  Among other things, this will give
us a chance to launder vnode-backed pages before concluding that
we need more swap.  This is particularly useful for systems that
have no swap.

While here, update a comment and remove some long-unused code.

Reported by:	Lucky Green <shamrock@cypherpunks.to>
Suggested by:	dillon
Approved by:	re (rwatson)
2003-05-19 00:51:07 +00:00
Alan Cox
7f758dabbb Lock the vm object when performing vm_object_page_clean().
Approved by:	re (rwatson)
2003-05-18 22:02:51 +00:00
Bernd Walter
d7a1c636e1 Recreate devnodes on USB_SET_ALTINTERFACE ioctl.
This fixes net/pppoa port for Alcatel Speedtouch devices.

Submitted by: Jay Cornwall <jay@evilrealms.net>
Tested by: Francois Rogler <francois@rogler.org>
Approved by: re (scottl)
2003-05-18 21:22:00 +00:00
Ruslan Ermilov
517f3f1ae5 There's just no reason to not have these in GENERIC.
Found by:	release/*/drivers.conf cleaning script
Approved by:	re (scottl)
2003-05-18 20:39:15 +00:00
Søren Schmidt
05688ceccc Support the ICH5 SATA part.
Fix HPT374 UDMA133 timing.
Fix Promise ID.
Cosmetics on probe print for Promise & HPT.

Approved by: re
2003-05-18 16:45:48 +00:00
Søren Schmidt
27409aa046 Add string for SATA150
Approved by: re
2003-05-18 16:43:08 +00:00
Søren Schmidt
347ebe4c41 Add define for SATA150
Approved by: re
2003-05-18 16:40:38 +00:00
Alan Cox
1c500307d1 Reduce the size of a vm object by converting its shadow list from a TAILQ
to a LIST.

Approved by:	re (rwatson)
2003-05-18 04:10:16 +00:00
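
The space saving comes from the list head: a `TAILQ` head stores a first pointer and a last pointer, while a `LIST` head stores only a first pointer. The sketch below measures both using `<sys/queue.h>`; the type names are hypothetical and unrelated to the real vm object definition.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

struct obj_sketch;   /* element type; only pointers to it are needed here */

/* Declare one head type of each kind for the same element type. */
TAILQ_HEAD(shadow_tailq_head_sketch, obj_sketch);
LIST_HEAD(shadow_list_head_sketch, obj_sketch);

/* Size of a TAILQ head: first pointer plus pointer to the last link. */
size_t tailq_head_size(void)
{
    return sizeof(struct shadow_tailq_head_sketch);
}

/* Size of a LIST head: first pointer only. */
size_t list_head_size(void)
{
    return sizeof(struct shadow_list_head_sketch);
}
```

Assuming the default (non-debug) queue macros, the saving is one pointer per object that embeds such a head. The per-element linkage is two pointers in both cases.
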
Scott Long
8c33536c7f Add the MUTEX_NOINLINE option that explicitly de-inlines the mutex
operations.

Submitted by:	jhb
2003-05-18 03:46:30 +00:00
Ruslan Ermilov
2f0e162dc0 Fixed the markup and wording of the kern.ipc.nsfbufs tunable.
(It does not modify NSFBUFS, but just overrides it if set.)

Approved by:	re (blanket)
2003-05-17 22:17:23 +00:00
Marcel Moolenaar
040c5b92bb Remove unused files. cpu_switch() and cpu_throw(), normally in swtch.s,
can be found in machdep.c.

Approved: re@
2003-05-17 04:55:04 +00:00
Peter Wemm
5c0fe26236 Actually get all the bits for sd_hibase.. it was 16 bits short. oops.
Approved by:	re (amd64/* blanket)
2003-05-17 02:05:10 +00:00
Peter Wemm
728ec271c1 Fix a bug in the AMD64 trampoline. I misunderstood the implicit
32->64 bit zero extend.  This changes a movl to an orq.

Approved by:	re (amd64 bits)
2003-05-17 00:30:51 +00:00
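
On amd64, writing a 32-bit register clears the upper 32 bits of the corresponding 64-bit register. The C model below illustrates why a 32-bit move can discard upper bits that a 64-bit OR would keep. It is a model only: the function names are hypothetical, and it does not reproduce the trampoline code.

```c
#include <assert.h>
#include <stdint.h>

/* Models a 32-bit move into a 64-bit register: the result is zero-extended. */
uint64_t model_movl(uint64_t reg, uint32_t value)
{
    (void)reg;                 /* previous contents are discarded entirely */
    return (uint64_t)value;    /* upper 32 bits become zero */
}

/* Models a 64-bit OR: existing upper bits in the register are preserved. */
uint64_t model_orq(uint64_t reg, uint64_t value)
{
    return reg | value;
}
```
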
Marcel Moolenaar
f2c49dd248 Revamp of the syscall path, exception and context handling. The
prime objectives are:
o  Implement a syscall path based on the epc instruction (see
   sys/ia64/ia64/syscall.s).
o  Revisit the places where we need to save and restore registers
   and define those contexts in terms of the register sets (see
   sys/ia64/include/_regset.h).

Secondary objectives:
o  Remove the requirement to use contigmalloc for kernel stacks.
o  Better handling of the high FP registers for SMP systems.
o  Switch to the new cpu_switch() and cpu_throw() semantics.
o  Add a good unwinder to reconstruct contexts for the rare
   cases we need to (see sys/contrib/ia64/libuwx)

Many files are affected by this change. Functionally it boils
down to:
o  The EPC syscall doesn't preserve registers it does not need
   to preserve and places the arguments differently on the stack.
   This affects libc and truss.
o  The address of the kernel page directory (kptdir) had to
   be unstaticized for use by the nested TLB fault handler.
   The name has been changed to ia64_kptdir to avoid conflicts.
   The renaming affects libkvm.
o  The trapframe only contains the special registers and the
   scratch registers. For syscalls using the EPC syscall path
   no scratch registers are saved. This affects all places where
   the trapframe is accessed. Most notably the unaligned access
   handler, the signal delivery code and the debugger.
o  Context switching only partly saves the special registers
   and the preserved registers. This affects cpu_switch() and
   triggered the move to the new semantics, which additionally
   affects cpu_throw().
o  The high FP registers are either in the PCB or on some
   CPU. Context switching for them is done lazily. This affects
   trap().
o  The mcontext has room for all registers, but not all of them
   have to be defined in all cases. This mostly affects signal
   delivery code now. The *context syscalls are as of yet still
   unimplemented.

Many details went into the removal of the requirement to use
contigmalloc for kernel stacks. The details are mostly CPU
specific and limited to exception_save() and exception_restore().
The few places where we create, destroy or switch stacks were
mostly simplified by not having to construct physical addresses
and additionally saving the virtual addresses for later use.

Besides more efficient context saving and restoring, which of
course yields a noticeable speedup, this also fixes the dreaded
SMP bootup problem as a side-effect. The details of which are
still not fully understood.

This change includes all the necessary backward compatibility
code to have it handle older userland binaries that use the
break instruction for syscalls. Support for break-based syscalls
has been pessimized in favor of a clean implementation. Due to
the overall better performance of the kernel, this will still
be noticed as an improvement if it's noticed at all.

Approved by: re@ (jhb)
2003-05-16 21:26:42 +00:00
Don Lewis
1e9bc9f889 Detect that a vnode has been reclaimed while vflush() was waiting to lock
the vnode and restart the loop.  Vflush() is vulnerable since it does not
hold a reference to the vnode and it holds no other locks while waiting
for the vnode lock.  The vnode will no longer be on the list when the
loop is restarted.

Approved by:	re (rwatson)
2003-05-16 19:46:51 +00:00