Commit Graph

40224 Commits

Author SHA1 Message Date
Hiten Pandya
b77c32a07e Rename BUS_DMAMEM_NOSYNC to BUS_DMA_COHERENT.
The current name is confusing, because it indicates to
the client that a bus_dmamap_sync() operation is not
necessary when the flag is specified, which is wrong.

The main purpose of this flag is to hint the underlying
architecture that DMA memory should be mapped in a coherent
way, but the architecture can ignore it.  But if the
architecture does supports coherent mapping of memory, then
it makes bus_dmamap_sync() calls cheap.

This flag is the same as the one in NetBSD's Bus DMA.

Reviewed by: gibbs, scottl, des (implicitly)
Approved by: re@ (jhb)
2003-05-30 20:40:33 +00:00
Robert Watson
6d7f268ad1 rpc.lockd stability workaround: remove PCATCH from the tsleep() in
nfs_lock.c.  Right now, if we permit a signal to interrupt the sleep,
we will slip the lock and no process on that client, the server, or
any other client will be able to acquire the lock.  This can happen,
for example, if a user hits Ctrl-C or Ctrl-T while a process is
waiting for the lock.  By removing PCATCH, we prevent that from
happening, at the cost of not permitting a user-requested lock abort:
also nasty.  However, a user interface bug might be preferable to a
serious semantic bug, so we go with that for now.

We need to teach the rpc.lockd/kernel protocol how to abort lock
requests, and rpc.lockd how to handle aborted lock requests; patches
for the kernel bit are floating around, but no rpc.lockd bit yet.

Approved by:	re (scottl)
2003-05-30 17:15:56 +00:00
Robert Watson
c2ea1fec5b Make sure all character pointers are properly initialized; this was
mismerged from the MAC tree, and didn't get picked up because warnings
are not normally fatal in per-module builds, only when they are linked
into a kernel (such as LINT).

Reported by:	des and the technicolor tinderbox
Approved by:	re (scottl)
2003-05-30 17:02:36 +00:00
Scott Long
dfc36ded78 Add a new bootloader menu. Pull in screen.4th and frames.4th from the
examples directory to support it.  This is installed only on i386 for
now.  It will be enabled in a later commit.

Approved by:	re
2003-05-30 09:29:24 +00:00
Scott Long
9d5be300d3 Add support for the upcoming 2410SA card.
Approved by:	re (telecon)
2003-05-30 09:22:19 +00:00
Scott Long
95c9929a3b aic79xx.c:
Use the special LUNLEN_SINGLE_LEVEL constant for
	post Rev A4 hardware for single byte luns.  Without
	this change, Rev B hardware would place the single
	byte of lun data in byte 0 of the lun structure when
	it should be in byte 1.  Since there are few if any
	devices on the market that support multiple luns in
	target mode, the corrupted lun field (which was only
	corrupted for non-zero luns) wasn't hurting us.

Approved by: re	(rwatson)
2003-05-30 02:15:15 +00:00
Scott Long
6ee007e145 Fix a reported case of severe data corruption:
aic79xx.h:
aic79xx.reg:
	Return the SCB_TAG field to 16byte alignment.
	It seems that on some PCI systems, SCBs are not
	transferred correctly to the controller with
	the previous placement of the SCB_TAG field.

Approved by:	re (rwatson)
2003-05-30 02:14:22 +00:00
Peter Wemm
edd1f930aa Update the kernel compile flags inside the .if ${MACHINE_ARCH} == "amd64"
section to stop gcc generating the dwarf2 .eh_frame unwind tables.  It
is dead weight for the time being.  Maybe it can be used to perform
stack traces and/or get the location of function arguments in ddb, but
that requires a dwarf2 runtime interpreter, which we do not have.

Approved by:	re (amd64 "safe" bits)
2003-05-30 01:06:58 +00:00
Peter Wemm
ec2343a8e1 Add ddb machdep bits.
Approved by:	re (amd64 bits)
2003-05-30 01:03:43 +00:00
Peter Wemm
5c980babcd Nasty 'make it compile' port to amd64. Note that it needs some other
wire protocol for the extra registers.  I should probably just remove it
from here for now since its quite useless.

Approved by:	re (amd64/* blanket)
2003-05-30 01:02:52 +00:00
Peter Wemm
5feb2148ba Initial port to amd64 after repocopy from i386. Note that the
disassembler has not been updated yet, and will do some very strange
things.  It does tracebacks (without function arguments due to regparm
calling conventions) if -fno-omit-frame-pointer is used (to come later).
This achieves basic functionality.

Approved by:	re (amd64/* blanket)
2003-05-30 01:01:07 +00:00
Peter Wemm
0afbc83dfd Add setjmp/longjmp for ddb 2003-05-30 00:58:48 +00:00
Bernd Walter
6445c6bdf1 Correct the fix in rev 1.70
Some lines were misslocated

Submitted by:	Jay Cornwall <jay@evilrealms.net>
Approved by:	re (rwatson)
2003-05-29 23:47:12 +00:00
Robert Watson
7792fe5719 Use strsep() in preference to manual string parsing for Biba and MLS
label internalization.  Use sensible variable names.  Include comments.
Doesn't fix any known bugs, but may fix unknown ones.

Approved by:	re (scottl)
2003-05-29 22:51:52 +00:00
Maxime Henrion
193f2edbf9 When loading a module that contains a sysctl which is already compiled
in the kernel, the sysctl_register() call would fail, as expected.
However, when unloading this module again, the kernel would then panic
in sysctl_unregister().  Print a message error instead.

Submitted by:	Nicolai Petri <nicolai@catpipe.net>
Reviewed by:	imp
Approved by:	re@ (jhb)
2003-05-29 21:19:18 +00:00
David Malone
0f7e5f778a Add an INVARIENTS only check to make sure Giant is held if mbuf
allocation is attempted with M_TRYWAIT.

Reviewed by:	bmilekic
Approved by:	re (scottl)
2003-05-29 18:38:24 +00:00
David Malone
de1cab2b60 Grab giant in sendit rather than kern_sendit because sockargs may
allocate mbufs with M_TRYWAIT, which may require Giant.

Reviewed by:	bmilekic
Approved by:	re (scottl)
2003-05-29 18:36:26 +00:00
Thomas Moestl
9078f61c55 Completely disable interrupts (not just raise %pil) when calculating the
value to be written into tick_compare in tick_hardclock(). While
we were taking care that the value to be written was at least TICK_GRACE
ticks in the future, a vector interrupt could happen between calculating
the value and writing it. If it took longer than TICK_GRACE to complete
(which is doubtful for a single device-triggered vector interrupt, but
quite likely for some IPIs), the value written would be in the past
and tick interrupts (which drive hardclock and statclock) would stop
until %tick wraps around, which takes a long time.
Also, increase TICK_GRACE from 1000 to 10000 for good measure.

Reported by:	kris
Reviewed by:	jake
Approved by:	re (scottl)
2003-05-29 17:49:21 +00:00
Marcel Moolenaar
3a8c4f9f9c Move the sysctls of the misalignment handler to where they belong
and use OID_AUTO instead of fixed IDs.

Approved by: re@ (blanket)
2003-05-29 06:30:36 +00:00
Marcel Moolenaar
12cd60b726 Fix what I think is a cut-n-paste bug: use OID_AUTO for the
print_usertrap sysctl instead of CPU_UNALIGNED_PRINT. The
latter is used already.

Approved by: re@ (blanket)
2003-05-29 05:09:15 +00:00
Nate Lawson
2ce0a0b9ec This commit was generated by cvs2svn to compensate for changes in r115367,
which included commits to RCS files with non-trunk default branches.
2003-05-28 17:32:31 +00:00
Nate Lawson
d0e9cc3b3a Revert to using TABLE_ID_DSDT as the default. It looks like the dynamic
ID allocation is not there yet.  This fixes a few warnings about \_OS_ not
being found and an S3 freeze for one user.
Re-staticize AcpiNsRemoveReference() since it is not needed elsewhere.

Approved by:	re (scottl)
2003-05-28 17:32:31 +00:00
Ian Dowse
ad6adb4f18 In cluster_wbuild(), initialise b_iocmd to BIO_WRITE before calling
buf_start() to avoid triggering a panic in softdep_disk_io_initiation()
if b_iocmd happened to be BIO_READ. The later initialisation of
b_iocmd in cluster_wbuild() could probably be moved to before the
buf_start() call, but this patch keeps the change as simple as
possible.

This is reported to fix occasional "softdep_disk_io_initiation: read"
panics, especially on NFS servers.

Reported by:	Nick Hilliard <nick@netability.ie>
Tested by:	Nick Hilliard <nick@netability.ie>
Approved by:	re (rwatson)
2003-05-28 13:22:10 +00:00
Mike Silbersack
17d6531977 Replace a handrolled defrag function with m_defrag. The handrolled
function couldn't handle chains of > MCLBYTES, and it had a bug which
caused corruption and panics in certain low mbuf situations.

Additionally, change the failure case so that looutput returns ENOBUFS
rather than attempting to pass on non-defragmented mbuf chains.

Finally, remove the printf which would happen every time the low memory
situation occured.  It served no useful purpose other than to clue me
in as to what was causing the panic in question. :)

MFC after:	4 days
2003-05-28 02:04:33 +00:00
Peter Wemm
5e1b7df5cf Update AMD Features vector to include NX (page table entry no-execute bit)
and LM (long mode) etc.
2003-05-27 21:59:56 +00:00
John Baldwin
4fb8dd97a7 Fix support for 256 MB aperture sizes on chipsets such as the 845 and
865.  The APSIZE register has a variable-sized field of enabled bits.
To figure out how many bits a specific host bridge supports, write the
maximum width and see how many bits are set in the hardware.  We then
use this mask for setting and getting the aperture size.  Prior to this,
the agp(4) driver would treat an aperture size of 256 MB as 128 MB and
would not allocate enough physical memory for the GART as a result.

MFC after:	3 days
Sponsored by:	The Weather Channel
Approved by:	re (rwatson)
2003-05-27 20:13:44 +00:00
John Baldwin
ebca65b627 Grr, fix compile. The bane of trying to split out patches into two
commits.

Reported by:	Lukas Ertl <l.ertl@univie.ac.at>
With hat:	re
Pointy hat to:	jhb
2003-05-27 19:42:18 +00:00
Nate Lawson
006b3ddb51 Fix false AE_NOT_FOUND messages, reported in NetBSD port-i386/20897.
NetBSD dsmethod.c rev 1.7

Fix parent-child loop problem
Fix a reference count problem that may cause unexpected memory free
Intel 20030512 ACPICA drop (nsalloc.c)

Approved by:	re (jhb)
Obtained from:	NetBSD, Intel
Reported by:	mbr, kochi AT netbsd.org
2003-05-27 19:19:05 +00:00
Nate Lawson
480170d0b4 This commit was generated by cvs2svn to compensate for changes in r115351,
which included commits to RCS files with non-trunk default branches.
2003-05-27 19:19:05 +00:00
John Baldwin
6705889407 Fix compile: the type is spelled bus_dmasync_op_t rather than
bus_dmamap_sync_t.

With hat:	re
2003-05-27 18:32:24 +00:00
John Baldwin
e9ff34a5e2 Add support for the Intel 865 chipset.
MFC after:	3 days
Sponsored by:	The Weather Channel
Approved by:	re (murray)
2003-05-27 18:23:56 +00:00
Scott Long
bf423d4637 Remove the redundant declaration of bus_dmasync_op_t. 2003-05-27 16:34:52 +00:00
Marcel Moolenaar
81d77e2eed A flushrs must be the first in an instruction group.
Approved by: re@ (blanket)
2003-05-27 07:10:58 +00:00
Scott Long
7e71df9339 Bring back bus_dmasync_op_t. It is now a typedef to an int, though the
BUS_DMASYNC_ definitions remain as before.  The does not change the ABI,
and reverts the API to be a bit more compatible and flexible.  This has
survived a full 'make universe'.

Approved by:	re (bmah)
2003-05-27 04:59:59 +00:00
Marcel Moolenaar
1093ceb088 Have the unwinder allocate memory with M_NOWAIT. The unwinder is
used by DDB and we cannot know in advance whether it's save to
sleep. It often enough isn't. We may want to pre-allocate space
to cover the most common cases without having to use malloc at
all, but that requires some analysis. We leave that for later.

Approved by: re@ (blanket)
2003-05-27 01:15:16 +00:00
Marcel Moolenaar
a47e5d473b Fix fu{byte|word*} and su{byte|word*}:
o  If the address was not within user space we jumped to fusufault
   where we would clear pcb_onfault and return 0. There are two
   bugs here:
   1. We never got to the point where we assigned the address of
      pcb_onfault to r15, which means that we would clobber some
      random memory location, including I/O space or ROM.
   2. We're supposed to return -1 on error.
o  Make sure we have proper memory ordering for setting pcb_onfault,
   doing the memory access to user space and clearing pcb_onfault.
   For the fu* family of functions this means that we need a mf
   instruction, because we don't have acquire semantics on stores
   and release semantics on loads (hence st;ld cannot be ordered
   without intermediate mf).

While here, implement casuptr() so that we are a (small) step
closer to supporting libthr and deobfuscate the non-implementation
of {f|s}uswintr.

Approved by: re@ (blanket)
2003-05-27 01:00:12 +00:00
Marcel Moolenaar
941a057663 Revision 1.99 of this file changed the allocation request from
VM_ALLOC_INTERRUPT to VM_ALLOC_SYSTEM. There was no mention of
this in commit log as it was considered harmless. Guess what:
it does harm. WITNESS showed that we can not safely grab the
page queue lock in vm_page_alloc() in all cases as we may have
to sleep on it. Revert the request to VM_ALLOC_INTERRUPT to
circumvent this. We panic if vm_page_alloc returns 0. I'm not
entirely happy about this, but we have bigger fish to fry.

Approved by: re@ (blanket)
2003-05-26 22:54:18 +00:00
Justin T. Gibbs
177799b596 This driver supports the 2920C not the 2920.
Make this clear in our card identification string.

PR: kern/50428
Approved by: RE
2003-05-26 21:45:09 +00:00
Justin T. Gibbs
8ed30d5b45 Consistently use #ifdef for testing AHC_TARGET_MODE.
Approved by: RE
2003-05-26 21:44:03 +00:00
Justin T. Gibbs
662152ce16 aic79xx.c:
aic79xx_osm.h:
aic7xxx_osm.h:
	Explicitly define functions that take no arguments
	with "(void)"

Approved by: RE
2003-05-26 21:43:29 +00:00
Justin T. Gibbs
333f04d935 Correct/Simplify ignore wide residue message handling
aic79xx.c:
	In ahd_handle_ign_wide_residue():
	o Use SCB_XFERLEN_ODD SCB field to determine transfer
	  "oddness" rather than the DATA_COUNT_ODD logic.
	  SCB_XFERLEN_ODD is toggled on every ignore wide
	  residue message so that multiple ignore wide residue
	  messages for the same transaction are properly supported.
	o If the sg list has been exausted, the sequencer
	  doesn't bother to update the residual data count
	  since it is known to be zero.  Perform the zeroing
	  manually before calculating the remaining data count.
	o Use multibyte in/out macros instead of shifting/masking
	  by hand.

aic79xx_inline.h:
	In ahd_setup_scb_common(), setup the SCB_XFERLEN_ODD field.

aic79xx.reg:
	Use the SCB_TASK_ATTRIBUTE field as a bit field in the
	non-packetized case.  We currently only define one bit,
	SCB_XFERLEN_ODD.

	Remove the ODD_SEG bit field that was used to carry the odd
	transfer length information through the SG cache.  This
	is obviated by SCB_XFERLEN_ODD field.

	Remove the DATA_COUNT_ODD scratch ram byte that was used
	dynamicaly compute data transfer oddness.  This is obviated
	by SCB_XFERLEN_ODD field.

aic79xx.seq:
	Remove all updates to the DATA_COUNT_ODD scratch ram field.
	Remove all uses of ODD_SEG.  These two save quite a few
	sequencer instructions.

	Use SCB_XFERLEN_ODD to validate the end of transfer
	ignore wide residue message case.
2003-05-26 21:26:52 +00:00
Justin T. Gibbs
645ca9e9f6 FIFOEMP can lag LAST_SEG_DONE in the Ultra2 and U160
hardware.  Wait a few extra clocks for FIFOEMP to assert
before calling an overrun.

Approved by: RE
2003-05-26 21:24:55 +00:00
Justin T. Gibbs
92931c12ff Correct/Simplify ignore wide residue message handling
aic7xxx.c:
	In ahc_handle_ign_wide_residue():
	o Use SCB_XFERLEN_ODD SCB field to determine transfer
	  "oddness" rather than the DATA_COUNT_ODD logic.
	  SCB_XFERLEN_ODD is toggled on every ignore wide
	  residue message so that multiple ignore wide residue
	  messages for the same transaction are properly supported.
	o If the sg list has been exausted, the sequencer
	  doesn't bother to update the residual data count
	  since it is known to be zero.  Perform the zeroing
	  manually before calculating the remaining data count.
	o Ensure that SG_LIST_NULL is cleared in the
	  residual sg pointer for "mid-transfer" ignore
	  wide residue cases.
	o Use multibyte in/out macros instead of shifting/masking
	  by hand.

aic7xxx.h:
	Modify the SCB_GET_LUN() macro to mask the lun hardware
	SCB field with LID.  This leaves two bits in the LUN
	field that can be used for other purposes.

aic7xxx.reg:
	Change LID to be 0x3F.  This is the maximum supported
	lun size for non-packetized SCSI.  Map the top bit
	of the lun to SCB_XFERLEN_ODD.  The host must set
	this bit whenever a transfer is an odd length.

	Remove the ODD_SEG bit field that was used to carry the odd
	transfer length information through the SG cache.  This
	is obviated by SCB_XFERLEN_ODD field.

	Remove the DATA_COUNT_ODD scratch ram byte that was used
	dynamicaly compute data transfer oddness.  This is obviated
	by SCB_XFERLEN_ODD field.

aic7xxx.seq:
	Be more careful in our handling of the SCB_LUN field.  It
	must be masked with LID if only lun information is desired.

	Remove all updates to the DATA_COUNT_ODD scratch ram field.
	Remove all uses of ODD_SEG.  These two save quite a few
	sequencer instructions.

	Use SCB_XFERLEN_ODD to validate the end of transfer
	ignore wide residue message case.

aic7xxx_inline.h:
	In ahc_queue_scb(), setup the SCB_XFERLEN_ODD field.

Approved by: RE
2003-05-26 21:24:01 +00:00
Justin T. Gibbs
e4e6e6d6ea Fix disabling of PCI parity error interrupts. We need to set
FAILDIS in the SEQCTL register, not the HCNTRL register.

aic7xxx.c:
	Remeber SEQCTL settings in the "seqctl" field of our
	softc.  seqctl defaults to just having FASTMODE set,
	but the bus attachments can override this.

aic7xxx.h:
	Add the seqctl softc field.

aic7xxx_pci.c:
	Update the seqctl softc field and manually update SEQCTL
	when to many PCI errors occur

Approved by: RE
2003-05-26 21:20:47 +00:00
Justin T. Gibbs
a3f571b832 Change hadling of the Rev. A packetized lun output bug
to be more efficient by having the sequencer copy the
single byte of valid lun data into the long lun field.

aic79xx.c:
	Memset our hardware SCB to 0 so that untouched
	fields don't confuse diagnostic output.  With the
	old method for handling the Rev A bug, if the long
	lun field was not 0, this could result in bogus
	lun information being sent to drives.

	Use the same SCB transfer size for all chip types
	now that the long lun is not DMA'ed to the chip.

aic79xx.seq:
	Add code to copy lun information for Rev.A hardware.

aic79xx_inline.h:
	Remove host update of the long_lun field on every
	packetized command.
2003-05-26 21:18:48 +00:00
Justin T. Gibbs
197696e939 Add 7901B support.
Sort IDs based on chip type.

Remove IROC IDs.  We'll switch to using the IROC masks
if/when we want to start attaching to IROC controllers.

Approved by: RE
2003-05-26 21:15:52 +00:00
Justin T. Gibbs
8089f0f033 Fixup spelling of "coalesce" and derivatives.
Approved by: RE
2003-05-26 21:10:58 +00:00
Justin T. Gibbs
3baccea690 Remove stray K&R style function definition.
Approved by: RE
2003-05-26 21:09:15 +00:00
Scott Long
5cf33ce608 Fix two typos from the last commit 2003-05-26 16:59:00 +00:00
Scott Long
0dccf2239d De-orbit bus_dmamem_alloc_size from here too.
Pointed out by:	des
Pointy hat to:	me
2003-05-26 14:38:48 +00:00
Scott Long
c87d464f28 De-orbit bus_dmamem_alloc_size(). It's a hack and was never used anyways.
No need for it to pollute the 5.x API any further.

Approved by:	re (bmah)
2003-05-26 04:00:52 +00:00
Peter Wemm
a9a0bbad19 Copy the va_list in sbuf_vprintf() before passing it to vsnprintf(),
because we could fail due to a small buffer and loop and rerun.  If this
happens, then the vsnprintf() will have already taken the arguments off
the va_list.  For i386 and others, this doesn't matter because the
va_list type is a passed as a copy.  But on powerpc and amd64, this is
fatal because the va_list is a reference to an external structure that
keeps the vararg state due to the more complicated argument passing system.
On amd64, arguments can be passed as follows:
First 6 int/pointer type arguments go in registers, the rest go on
  the memory stack.
Float and double are similar, except using SSE registers.
long double (80 bit precision) are similar except using the x87 stack.
Where the 'next argument' comes from depends on how many have been
processed so far and what type it is.  For amd64, gcc keeps this state
somewhere that is referenced by the va_list.

I found a description that showed the va_copy was required here:
http://mirrors.ccs.neu.edu/cgi-bin/unixhelp/man-cgi?va_end+9
The single unix spec doesn't mention va_copy() at all.

Anyway, the problem was that the sysctl kern.geom.conf* nodes would panic
due to walking off the end of the va_arg lists in vsnprintf.  A better fix
would be to have sbuf_vprintf() use a single pass and call kvprintf()
with a callback function that stored the results and grew the buffer
as needed.

Approved by:	re (scottl)
2003-05-25 19:03:08 +00:00
Jeff Roberson
0003d1b74e - Create a new lock, umtx_lock, for use instead of the proc lock for
protecting the umtx queues.  We can't use the proc lock because we need
   to hold the lock across calls to casuptr, which can fault.

Approved by:	re
2003-05-25 18:18:32 +00:00
Poul-Henning Kamp
43f0db6cc5 Don't do silly thing if the disk_create() event gets canceled.
Approved by:	re/scottl
2003-05-25 16:57:10 +00:00
Jeff Roberson
30fd5d085d - Reset the free ent to NULL if we have consumed the last free entry. This
fixes a problem where we would overwrite old data if we ran out of free
   entries.

Submitted by:	sam
Approved by:	re (scottl)
2003-05-25 08:48:42 +00:00
Don Lewis
263c8abeb9 Beat vnode locking in the NFS server code into submission. This change
is not pretty, but it fixes the code so that it no longer violates the
vnode locking rules in the VFS API and doesn't trip any of the locking
assertions enabled by the DEBUG_VFS_LOCKS kernel configuration option.
There is one report that this patch fixed a "locking against myself"
panic on an NFS server that was tripped by a diskless client.

Approved by:	re (scottl)
2003-05-25 06:17:33 +00:00
Don Lewis
a35e7eaa1a Always set the hardware parse bit in the IPCB structure when this
structure, which is new to the 82550 and 82551, is used to transmit
a packet.  This appears to fix the packet truncation problem that was
observed when using 82550-based fxp cards to transmit ICMP or fragmented
UDP packets of certain lengths which only had one to three bytes in the
second and final mbuf of the packet.  This matches a note in the "Intel
8255x 10/100 Mbps Ethernet Controller Family Open Source Software Developer
Manual", which says that the hardware parse bit should be set when sending
these types of packets.

There have also been unconfirmed reports of similar problems when
transmitting TCP packets, which should not be affected by the above
mentioned change because the hardware parse bit was already being set
if the stack requested hardware checksumming of the packet.  If the
problem remains, the use of the IPCB structure can be disabled to
cause the driver to fall back to using the older 82559 interface with
82550-based cards by setting
        hint.fxp.UNIT_NUMBER.ipcbxmit_disable
to a non-zero value at boot time, or using kenv to set this variable
before using kldload to load the fxp driver.

Approved by:	re (jhb)
2003-05-25 05:04:26 +00:00
Marcel Moolenaar
dc0545462e Now that we define user mode as any IP address that isn't in the
kernel's VA regions, we cannot limit the use of break-based
syscalls to user mode only. The signal trampolines are in the
gateway page, which is mapped into the process address space in
region 5 and thus is kernel space.

We don't special case the gateway page here. Allow break-based
syscalls from anywhere in the kernel VA space.

Approved by: re@ (blanket)
2003-05-25 01:01:28 +00:00
Warner Losh
f9aedaa4ba Ignore the 'must allocate below 1MB' flag for the TPL_BAR_REG. It is
set on realtek cards, but they work without it (and don't work with
it).  The standard seems to imply that this is just a hint anyway, so
this should be harmless.  It doesn't appear to be set on any other
cardbus cards that I have (or have seen).

This should make the rl based CardBus cards work again.  I've been
running it for about a month now.

Approved by: re@ (jhb)
2003-05-24 23:23:41 +00:00
Marcel Moolenaar
d7f827116f Fix a source of instability specific to an EPC userland. We return
to userland with interrupts disabled until we restore PSR. However,
it has been observed that interrupts do actually happen before they
are enabled again. This is a bit surprising and I don't know yet
what's going on exactly. Nevertheless, the code was not crafted
carefully enough to allow interrupts to happen and we could
clobber the kernel stack of another thread when interrupts did
happen.

This is what happens: we restore the (memory) stack pointer (sp)
and the register stack base prior to restoring ar.k6 and ar.k7.
This is not a problem if interrupts don't happen between setting
sp/ar.bspstore and ar.k6/ar.k7. Alas, interrupts can happen.
Since sp/ar.bspstore already point to the userland stacks, we
need to switch to the kernel stack in interrupt. However, ar.k6
and ar.k7 have not been set, which means that we were switching
to some unrelated kstack and happily clobbered the trapframe
present there if the thread to which the kstack belonged was
in kernel mode or otherwise we could have our trapframe clobbered
if that other thread enters the kernel. Nasty either way.

We now carefully restore ar.k6 prior to restoring ar.bspstore and
likewise for ar.k7 and sp. All we need is the guarantee that an
interrupt does not clobber ar.k6 or ar.k7 before we're back in
userland. That has been achieved by restoring ar.k6/ar.k7
unconditionally (see exception.s)

While here, remove the disabling of interrupts on EPC entry. It
was added as a way to "resolve" the crashes until it was understood
what was going on. I think I achieved the latter, so we can remove
the patch. Note that setting up a trapframe with interrupts
enabled has it's own share of corner cases, but it's better to
properly fixed those than to keep a mostly wrong patch around
because we're afraid to remove it...

Approved by: re@ (blanket)
2003-05-24 22:53:10 +00:00
Marcel Moolenaar
a7b90d80fc Be more careful how we restore interrupts. Don't rewrite most of the
PSR only to achieve setting PSR.i back to it's previous value. It
makes it impossible to change any of the 30+ other unrelated bits
when done between intr_disable() and intr_restore(). That's bad.

Instead have intr_disable() return 1 when interrupts were previously
enabled and 0 otherwise and only enable interrupts in intr_restore()
when given a non-0 value.

This change specifically disallows using intr_restore() to disable
interrupts. The reason is simple: interrupts only need to be restored
after they are being disabled, which means that intr_restore() is
called with interrupts disabled and we only need to enable them if
they were previously enabled.

This change does not fix any bugs, other than that it bugged me...

Approved by: re@ (blanket)
2003-05-24 21:44:24 +00:00
Marcel Moolenaar
95f2dbba40 Consistently us the same metric to differentiate between kernel mode
and user mode. We need to take into account that the EPC syscall path
introduces a grey area in which one can argue either way, including a
third: neither.

We now use the region in which the IP address lies. Regions 5, 6 and 7
are kernel VA regions and if the IP lies any any of those regions we
assume we're in kernel mode. Hence, we can be in kernel mode even if
we're not on the kernel stack and/or have user privileges. There're
gremlins living in the twilight zone :-)

For the EPC syscall path this particularly means that the process
leaves user mode the moment it calls into the gateway page. This
makes the most sense because from a process' point of view the call
represents a request to the kernel for some service and that service
has been performed if the call returns. With the metric we picked,
this also means that we're back in user mode IFF the call returns.

Approved by: re@ (blanket)
2003-05-24 21:16:19 +00:00
Marcel Moolenaar
fb4aa34f3b Unconditionally restore ar.k7 (memory stack) and ar.k6 (register stack)
when returning from an interrupt. Both registers are used on interrupt
to switch to the right kernel stack, but other than that they are not
used. This means we only have to make sure they contain proper values
while in user mode. As such, we conditionally restored these registers
based on whether we returned to userland or not. A nice property of
conditionally restoring ar.k6 and ar.k7 is that it introduces two
invariants: ar.k6 always points to the bottom of the kernel stack and
ar.k7 always points to the top of the kernel stack (immediately below
the PCB we have there).

However, the EPC syscall path introduces an irregularity: there's no
"thin red line" between user and kernel. There's a grey area that's a
couple of instructions wide. Any interruption in that grey area is
bound to see an inconsistent state. One such state is that we're in
kernel space for all practical purposes, but we still need to have
ar.k6 and ar.k7 restored as if we're in userland.

Thus: restore ar.k6 and ar.k7 unconditionally at the cost of losing
a valuable invariant. Both registers now hold the extend of the
usable portion of the kernel stack at any interrupt nesting, which
when in userland mean the bottom and the top of the kstack.
2003-05-24 20:51:55 +00:00
Peter Wemm
3ebd9b48ce Stop profiled libc from exploding, matching gcc's generated code.
Approved by: re (amd64/* blanket)
2003-05-24 18:24:03 +00:00
Marcel Moolenaar
d1d7df1905 Fix an alpha inheritance bug:
On alpha, PAL is involved in context management and after wiring
the CPU (in alpha_init()) a context switch was performed to tell
PAL about the context. This was bogusly brought over to ia64
where it introduced bugs, because we restored the context from
a mostly uninitialized PCB.

The cleanup constitutes:
o  Remove the unused arguments from ia64_init().
o  Don't return from ia64_init(), but instead call mi_startup()
   directly. This reduces the amount of muckery in assembly and
   also allows for the next bullet:
o  Save our currect context prior to calling mi_startup(). The
   reason for this is that many threads are created from thread0
   by cloning the PCB. By saving our context in the PCB, we have
   something sane to clone. It also ensures that a cloned thread
   that does not alter the context in any way will return to
   the saved context, where we're ready for the eventuality with
   a nice, user unfriendly panic().

The cleanup fixes at least the following bugs:
o  Entering mi_startup() with the RSE in enforced lazy mode.
o  Re-execution of ia64_init() in certain "lab" conditions.

While here, add proper unwind directives to __start() so that
the unwind knows it has reached the bottom of the (call) stack.

Approved by: re@ (blanket)
2003-05-24 00:17:34 +00:00
Marcel Moolenaar
ca125f9c17 Fix a (new) source of instability:
When interrupting a kernel context, we don't need to switch stacks
(memory nor register). As such, we were also not restoring the
register stack pointer (ar.bspstore). This, however, fails to be
valid in 1 situation: when we interrupt a register stack switch as
is being done in restorectx(). The problem is that restorectx()
needs to have ar.bsp == ar.bspstore before it can assign the new
value to ar.bspstore. This is achieved by doing a loadrs prior to
assigning to ar.bspstore. If we take an interrupt in between the
loadrs and the assignment and we don't make sure we restore the
ar.bspstore prior to returning from the interrupt, we switch
stacks with possibly non-zero dirty registers, which means that
the new frame pointer (ar.bsp) will be invalid.

So, instead of jumping over the restoration of the register frame
pointer and related registers, we conditionalize it based on whether
we return to kernel context or user context. A future performance
tweak is possible by only restoring ar.bspstore when returning to
kernel mode *and* when the RSE is in enforced lazy mode. One cannot
assume ar.bsp == ar.bspstore if the RSE is not in enforced lazy mode
anyway.

While here (well, not quite) don't unconditionally assign to
ar.bspstore in exception_save. Only do that when we actually switch
stacks. It can only harm us to do it unconditionally.

Approved by: re@ (blanket)
2003-05-23 23:55:31 +00:00
Marcel Moolenaar
42b919d4a6 In swapctx(), put the RSE in enforced lazy mode before we flush the
register stack. There's nothing really wrong with flushing before
putting the RSE in enforced lazy mode, provided you don't depend on
ar.bspstore being equal to ar.bsp when the RSE has been put in
enforced lazy more. The small window between the flush and setting
the RSE may be sufficient to have the RSE eagerly increase the dirty
region (and hence cause ar.bspstore != ar.bsp) or have an interrupt
that may even get the laziest RSE to do something.

Anyway: we don't depend on ar.bspstore being equal to ar.bsp, so
nothing was and is broken. But the code was non-intuitive and
easily confuses. This is a source of future bugs.

Note: the advantage of not depending on ar.bspstore is that there's
some recilience against an interrupted flushrs. Clobbering is limited
to stacked register contents only, not to RSE address clobbering.

Approved: re@ (blanket)
2003-05-23 23:16:43 +00:00
Alan Cox
2e05d89828 Make the maximum number of vnodes a function of both the physical memory
size and the kernel's heap size, specifically, vm_kmem_size.  This
function allows a maximum of 40% of the vm_kmem_size to be used for
vnodes and vm objects.  This is a conservative bound based upon recent
problem reports.  (In other words, a slight increase in this percentage
may be safe.)

Finally, machines with less than ~3GB of RAM should be unaffected
by this change, i.e., the maximum number of vnodes should remain
the same.  If necessary, machines with 3GB or more of RAM can increase
the maximum number of vnodes by increasing vm_kmem_size.

Desired by:	scottl
Tested by:	jake
Approved by:	re (rwatson,scottl)
2003-05-23 19:54:02 +00:00
Peter Wemm
d9cd1af4aa Typo fix. oops.
Submitted by:  jmallett
Approved by:   re (blanket amd64/*)
2003-05-23 06:36:46 +00:00
Peter Wemm
cbd667fa2f Update comments. Note that the kernel is at -1GB, not -2GB as erroniously
implied by the previous commit.  KVM is still only 1GB until
pmap_growkernel() learns about the extra page table level.

Approved by:  re (blanket)
2003-05-23 06:35:45 +00:00
Peter Wemm
f229f5cf85 As suggested by the gdb folks, pad the 'struct fpreg' to a full 512 bytes
to match the native fxsave/fxrstor object size since thats apparently what
the Linux/NetBSD folks do.
2003-05-23 06:31:56 +00:00
Peter Wemm
637068b1d3 Low risk amd64 fix. Use a vm_offset_t for the virtual location of the
buffer space instead of a u_int32_t.  Otherwise the upper 32 bits of
the address space get truncated and syscons blows up.

Approved by:	re (safe, low risk amd64 fixes)
2003-05-23 05:10:49 +00:00
Peter Wemm
9f0c4ab393 Deal with the user VM space expanding. 32 bit applications do not like
having their stack at the 512GB mark.  Give 4GB of user VM space for 32
bit apps.  Note that this is significantly more than on i386 which gives
only about 2.9GB of user VM to a process (1GB for kernel, plus page
table pages which eat user VM space).

Approved by: re (blanket)
2003-05-23 05:07:33 +00:00
Peter Wemm
3c9a3c9ca3 Major pmap rework to take advantage of the larger address space on amd64
systems.  Of note:
- Implement a direct mapped region using 2MB pages.  This eliminates the
  need for temporary mappings when getting ptes.  This supports up to
  512GB of physical memory for now.  This should be enough for a while.
- Implement a 4-tier page table system.  Most of the infrastructure is
  there for 128TB of userland virtual address space, but only 512GB is
  presently enabled due to a mystery bug somewhere.  The design of this
  was heavily inspired by the alpha pmap.c.
- The kernel is moved into the negative address space(!).
- The kernel has 2GB of KVM available.
- Provide a uma memory allocator to use the direct map region to take
  advantage of the 2MB TLBs.
- Fixed some assumptions in the bus_space macros about the ability
  to fit virtual addresses in an 'int'.

Notable missing things:
- pmap_growkernel() should be able to grow to 512GB of KVM by expanding
  downwards below kernbase.  The kernel must be at the top 2GB of the
  negative address space because of gcc code generation strategies.
- need to fix the >512GB user vm code.

Approved by:	re (blanket)
2003-05-23 05:04:54 +00:00
Greg Lehey
74f2cc2c9c Change the way the plex lock mutexes work. Previously they were part
of the struct plex, which tore apart the mutex linked lists when the
plex table was expanded.  Now we maintain a pool of mutexes (currently
32) to be shared by all plexes.  This is still a lot better than the
splhigh() method used in other architectures.

expand_table: Add parameters file and line if we're debugging.

Approved by: re (jhb)
2003-05-23 01:15:55 +00:00
Greg Lehey
93573e2e76 Change the way the plex lock mutexes work. Previously they were part
of the struct plex, which tore apart the mutex linked lists when the
plex table was expanded.  Now we maintain a pool of mutexes (currently
32) to be shared by all plexes.  This is still a lot better than the
splhigh() method used in other architectures.

Add and clarify comments.

Approved by: re (jhb)
2003-05-23 01:15:30 +00:00
Greg Lehey
7db14b2ff2 expand_table: Add parameters file and line if we're debugging.
MMalloc, vinum_meminfo: Use strlcpy to copy file name.

Approved by: re (jhb)
2003-05-23 01:15:01 +00:00
Greg Lehey
d026346c86 Change the way the plex lock mutexes work. Previously they were part
of the struct plex, which tore apart the mutex linked lists when the
plex table was expanded.  Now we maintain a pool of mutexes (currently
32) to be shared by all plexes.  This is still a lot better than the
splhigh() method used in other architectures.

Approved by: re (jhb)
2003-05-23 01:14:35 +00:00
Greg Lehey
8a697ff435 detachobject: Update volume config after detaching a plex.
update_volume_config: Remove redundant diskconfig parameter.

Approved by: re (jhb)
2003-05-23 01:14:13 +00:00
Greg Lehey
cb5eba5e09 Change the way the plex lock mutexes work. Previously they were part
of the struct plex, which tore apart the mutex linked lists when the
plex table was expanded.  Now we maintain a pool of mutexes (currently
32) to be shared by all plexes.  This is still a lot better than the
splhigh() method used in other architectures.

update_volume_config: Remove redundant diskconfig parameter.

expand_table: Add parameters file and line if we're debugging.

Approved by: re (jhb)
2003-05-23 01:13:43 +00:00
Greg Lehey
f7b76dc815 Change many strcpys to strlcpys, etc.
Submitted by:	   Ted Unangst <tedu@stanford.edu>

Correct some inaccurate and badly formatted comments.

config_subdisk: If our drive is down, ensure that the subdisk is
		crashed.  Previously it was possible for the subdisk
		to be up when the drive was down.

Change the way the plex lock mutexes work.  Previously they were part
of the struct plex, which tore apart the mutex linked lists when the
plex table was expanded.  Now we maintain a pool of mutexes (currently
32) to be shared by all plexes.  This is still a lot better than the
splhigh() method used in other architectures.

update_volume_config: Remove redundant diskconfig parameter.

Approved by: re (jhb)
2003-05-23 01:13:10 +00:00
Peter Wemm
997f3bfc2a Merge from i386/trap.c rev 1.252. Use td_critnest instead of the
spinlocks count for explicitly enabling interrupts.

Approved by:	re (blanket)
2003-05-22 20:09:50 +00:00
Bernd Walter
cdc95e1bb8 Calculate routed interrupts using the slot number from the device and
not that of the bridge.

Approved by:	re (jhb)
2003-05-22 17:45:26 +00:00
Mike Barcroft
6f9622a926 Fix two misuses of __BSD_VISIBLE.
Submitted by:	bde
Approved by:	re
2003-05-22 17:07:57 +00:00
Julian Elischer
faaa20f639 When we are spilling threads out of the run queue during panic, make sure we
keep the thread state variable consistent with its real state.
i.e. Don't say it's on the run queue when it isn't.

Also clarify the associated comment.

Turns a double panic back to a single panic :-/

Approved by:	re@ (jhb)
2003-05-21 18:53:25 +00:00
Poul-Henning Kamp
67fd2837cd Return ENXIO if the softc pointer is NULL, in all likelyhood the
disk is in the process of disappearing.

Approved by:	re/rwats*
2003-05-21 18:52:29 +00:00
Paul Saab
3284b9ee87 Make ciss usable under PAE
Approved by:	re (scottl)
2003-05-21 07:17:06 +00:00
Paul Saab
487a8c7e61 - Make this work with PAE.
- atomically load and clear the status block so we dont miss an
  update.
  Submitted by: jdp

Approved by:	re (scottl)
2003-05-21 07:00:49 +00:00
Nate Lawson
742d91f211 Quirk for Hitachi DVD USB drive. It returns "invalid field in cdb" for
normal INQUIRY requests so enable the NO_INQUIRY quirk.

Submitted by:	Lars Eggert <larse@ISI.EDU>
Approved by:	re (scottl)
2003-05-21 00:22:07 +00:00
John Baldwin
7f4725bd09 The per-CPU spinlocks list is only maintained when WITNESS is enabled.
Thus, treat all page faults while in a critical section as fatal rather
than just those that occur with a non-empty spinlocks list.  All such page
faults are fatal anyways.  Calling trap_fatal() earlier increases the
chances of getting more useful panic messages and a possible DDB prompt.

Approved by:	re (scottl)
2003-05-20 20:50:33 +00:00
Nate Lawson
2f8f9581dd Remove a redundant quirk. Instead, we wildcard all Asahi Optical chips.
Approved by:	re
2003-05-20 18:04:42 +00:00
Marcel Moolenaar
bfaccb767c o Fix a definite bogon: the dirty bity fault, instruction access
failt and data access fault install the PTE in question into
   the VHPT table. However, a post-increment was missing and we
   wrote the raw PTE data into the pagesize/access key field.
   This leaves a corrupt VHPT entry.
o  While here, remove the explicit cache purge. Insertion into
   the translation implicitly purges any overlapping entries.
o  Make sure there's a cycle break between the itc and the rfi.
o  Whitespace fixes.
2003-05-20 06:57:20 +00:00
Marcel Moolenaar
14d2ae56c7 Rename the "IA64 ITC" counter to "ITC" counter. We don't call the
"TSC" counter on i386 "I386 TSC".

Approved by: re@ (blanket)
2003-05-20 06:51:20 +00:00
Marcel Moolenaar
9b9ce577d4 Prevent corruption of the VHPT collision chain by protecting it with
a mutex. The only volatile chain operations are insertion and deletion
but since updating an existing PTE also updates the VHPT entry itself,
and we have the VHPT mutex in both other cases, we also lock when we
update an existing PTE even though no chain operation is involved.
Note that we perform the insertion and deletion careful enough that
we don't need to lock traversals. If we need to lock traversals, we
also need to lock from the exception handler, which we can't without
creating a trapframe.

We're now able to withstand a -j8 buildworld. More work is needed to
withstand Murphy fields. In other words: we still have a bogon...

Approved by: re@ (blanket)
2003-05-20 02:52:41 +00:00
Peter Wemm
62d8fb93d0 Deal with the possibility of negative available space from the file server
to avoid Bad Things(TM) happening (eg: df crashing with a floating point
exception).

Submitted by:	Harold Gutch <logix@foobar.franken.de>
Approved by:	re (scottl)
2003-05-19 22:35:00 +00:00
Peter Wemm
3830dc4629 Another x86-64 comment fixup
Approved by:	re (blanket amd64 stuff)
2003-05-19 22:19:02 +00:00
Peter Wemm
92f0cd89a0 s/x86_64/amd64/ in comments in header.
Approved by:	re (blanket amd64)
2003-05-19 22:15:30 +00:00
Alexander Kabaev
980ded9a7d sys/sys/limits.h:
- Fix visibilty test for LONG_BIT and WORD_BIT.  `#if defined(__FOO_VISIBLE)'
   is alays wrong because __FOO_VISIBLE is always defined (to 0 for
   invisibility).

sys/<arch>/include/limits.h
sys/<arch>/include/_limits.h:

 - Style fixes.

Submitted by:	bde
Reviewed by:	bsdmike
Approved by:	re (scottl)
2003-05-19 20:29:07 +00:00
Søren Schmidt
e1750fb855 Print the right position on disk errors
Approved by: re@
2003-05-19 13:43:12 +00:00
Søren Schmidt
c9f5649b3e Unbork the chip locating code.
Approved by: re@
2003-05-19 13:42:23 +00:00
Marcel Moolenaar
b8c4149cff Turn pmap_install_pte() into a critical section. We better not get
interrupted while writing into the VHPT table. While here, make sure
memory accesses a properly ordered. Tag invalidation must happen
first so that the hardware VHPT walker will not be able to match
this entry while we're updating it and we have to make sure the new
new tag gets written only after the PTE is completely updated.

Approved by: re (blanket)
2003-05-19 08:02:36 +00:00
Marcel Moolenaar
a75b99ea2d Unconditionally set pcb_current_pmap. WIP versions of the code
previously committed cleared pcb_current_pmap prior to changing
the region registers, but that was removed before committing.
Since we don't normally (at all?) pass a NULL pointer, the bug
was mostly harmless. Fix it while I'm here...

I'm here because we need to have data serialization after writing
to the region registers. Not doing so was likely the cause of the
hangs we were experiencing. General exceptions in cpu_switch may
also be caused by the lack of serialization.

Approved by: re (blanket)
2003-05-19 06:05:30 +00:00
Marcel Moolenaar
dc0bde0f18 pmap_install() needs to be atomic WRT to context switching. Protect
switching user regions (region 0-4) with schedlock. Avoid unnecessary
recursion on schedlock by moving the core functionality to another
function (pmap_switch()) where we assert schedlock is held. Turn
pmap_install() into a wrapper that grabs schedlock. This minimizes
the number of callsites that need to be changed.
Since we already have schedlock in cpu_switch() and cpu_throw(),
have them call pmap_switch() directly. These were also the only two
calls to pmap_install() outside pmap.c, so make pmap_install() static
and remove its prototype from pmap.h

Approved by: re (blanket)
2003-05-19 04:16:30 +00:00
Greg Lehey
4555a3de62 print_config:
Change config format slightly to save plex preferences correctly.

vinum_scandisk: reinitialise volatile pointer after function call.
This is the "deafc0de" bug.

Approved by: re (scottl)
2003-05-19 02:21:31 +00:00
David Schultz
e92686d065 If we seem to be out of VM, don't allow the pagedaemon to kill
processes in the first pass.  Among other things, this will give
us a chance to launder vnode-backed pages before concluding that
we need more swap.  This is particularly useful for systems that
have no swap.

While here, update a comment and remove some long-unused code.

Reported by:	Lucky Green <shamrock@cypherpunks.to>
Suggested by:	dillon
Approved by:	re (rwatson)
2003-05-19 00:51:07 +00:00
Alan Cox
7f758dabbb Lock the vm object when performing vm_object_page_clean().
Approved by:	re (rwatson)
2003-05-18 22:02:51 +00:00
Bernd Walter
d7a1c636e1 Recreate devnodes on USB_SET_ALTINTERFACE ioctl.
This fixes net/pppoa port for Alcatel Speedtouch devices.

Submitted by: Jay Cornwall <jay@evilrealms.net>
Tested by: Francois Rogler <francois@rogler.org>
Approved by: re (scottl)
2003-05-18 21:22:00 +00:00
Ruslan Ermilov
517f3f1ae5 There's just no reason to not have these in GENERIC.
Found by:	release/*/drivers.conf cleaning script
Approved by:	re (scottl)
2003-05-18 20:39:15 +00:00
Søren Schmidt
05688ceccc Support the ICH5 SATA part.
Fix HPT374 UDMA133 timing.
Fix Promise ID.
Cosmetics on probe print for Promise & HPT.

Approved by: re
2003-05-18 16:45:48 +00:00
Søren Schmidt
27409aa046 Add string for SATA150
Approved by: re
2003-05-18 16:43:08 +00:00
Søren Schmidt
347ebe4c41 Add define for SATA150
Approved by: re
2003-05-18 16:40:38 +00:00
Alan Cox
1c500307d1 Reduce the size of a vm object by converting its shadow list from a TAILQ
to a LIST.

Approved by:	re (rwatson)
2003-05-18 04:10:16 +00:00
Scott Long
8c33536c7f Add the MUTEX_NOINLINE option that explicitely de-inlines the mutex
operations.

Submitted by:	jhb
2003-05-18 03:46:30 +00:00
Ruslan Ermilov
2f0e162dc0 Fixed the markup and wording of the kern.ipc.nsfbufs tunable.
(It does not modify NSFBUFS, but just overrides it if set.)

Approved by:	re (blanket)
2003-05-17 22:17:23 +00:00
Marcel Moolenaar
040c5b92bb Remove unused files. cpu_switch() and cpu_throw(), normally in swtch.s,
can be found in machdep.c.

Approved: re@
2003-05-17 04:55:04 +00:00
Peter Wemm
5c0fe26236 Actually get all the bits for sd_hibase.. it was 16 bits short. oops.
Approved by:	re (amd64/* blanket)
2003-05-17 02:05:10 +00:00
Peter Wemm
728ec271c1 Fix a bug in the AMD64 trampoline. I misunderstood the implicit
32->64 bit zero extend.  This changes a movl to an orq.

Approved by:	re (amd64 bits)
2003-05-17 00:30:51 +00:00
Marcel Moolenaar
f2c49dd248 Revamp of the syscall path, exception and context handling. The
prime objectives are:
o  Implement a syscall path based on the epc inststruction (see
   sys/ia64/ia64/syscall.s).
o  Revisit the places were we need to save and restore registers
   and define those contexts in terms of the register sets (see
   sys/ia64/include/_regset.h).

Secundairy objectives:
o  Remove the requirement to use contigmalloc for kernel stacks.
o  Better handling of the high FP registers for SMP systems.
o  Switch to the new cpu_switch() and cpu_throw() semantics.
o  Add a good unwinder to reconstruct contexts for the rare
   cases we need to (see sys/contrib/ia64/libuwx)

Many files are affected by this change. Functionally it boils
down to:
o  The EPC syscall doesn't preserve registers it does not need
   to preserve and places the arguments differently on the stack.
   This affects libc and truss.
o  The address of the kernel page directory (kptdir) had to
   be unstaticized for use by the nested TLB fault handler.
   The name has been changed to ia64_kptdir to avoid conflicts.
   The renaming affects libkvm.
o  The trapframe only contains the special registers and the
   scratch registers. For syscalls using the EPC syscall path
   no scratch registers are saved. This affects all places where
   the trapframe is accessed. Most notably the unaligned access
   handler, the signal delivery code and the debugger.
o  Context switching only partly saves the special registers
   and the preserved registers. This affects cpu_switch() and
   triggered the move to the new semantics, which additionally
   affects cpu_throw().
o  The high FP registers are either in the PCB or on some
   CPU. context switching for them is done lazily. This affects
   trap().
o  The mcontext has room for all registers, but not all of them
   have to be defined in all cases. This mostly affects signal
   delivery code now. The *context syscalls are as of yet still
   unimplemented.

Many details went into the removal of the requirement to use
contigmalloc for kernel stacks. The details are mostly CPU
specific and limited to exception_save() and exception_restore().
The few places where we create, destroy or switch stacks were
mostly simplified by not having to construct physical addresses
and additionally saving the virtual addresses for later use.

Besides more efficient context saving and restoring, which of
course yields a noticable speedup, this also fixes the dreaded
SMP bootup problem as a side-effect. The details of which are
still not fully understood.

This change includes all the necessary backward compatibility
code to have it handle older userland binaries that use the
break instruction for syscalls. Support for break-based syscalls
has been pessimized in favor of a clean implementation. Due to
the overall better performance of the kernel, this will still
be notived as an improvement if it's noticed at all.

Approved by: re@ (jhb)
2003-05-16 21:26:42 +00:00
Don Lewis
1e9bc9f889 Detect that a vnode has been reclaimed while vflush() was waiting to lock
the vnode and restart the loop.  Vflush() is vulnerable since it does not
hold a reference to the vnode and it holds no other locks while waiting
for the vnode lock.  The vnode will no longer be on the list when the
loop is restarted.

Approved by:	re (rwatson)
2003-05-16 19:46:51 +00:00
Marcel Moolenaar
baf74b8876 o In pmap_install, don't prevent switching the pmap if we're
switching to kernel_pmap. The pmap is not special enough.
o  Clear the active bit on the pmap we're switching out.
o  Fix some nearby style(9) bugs.

Approved by: re@
2003-05-16 07:57:44 +00:00
Alan Cox
f820bc501e Use vm_object_deallocate(), not vm_pager_deallocate(), to destroy a
vm object.  (vm_pager_deallocate() does not, in fact, destroy a vm object.)

Approved by:	re (scottl)
Reviewed by:	phk
2003-05-16 07:28:27 +00:00
Marcel Moolenaar
906f065725 Indent a comment. This makes 1.100.
Still approved by: re@ (blanket)
2003-05-16 07:05:08 +00:00
Marcel Moolenaar
164d4986fd Turn pmap_growkernel() into a critical section. While here, initialize
kernel_vm_end in pmap_bootstrap. Don't delay the initialization until
we need to grow the kernel VM space. This BTW happens twice before
we enter either single- or multi-user mode. Don't adjust kernel_vm_end
while growing based on whether the KPT contains a non-NULL entry. We
trust kernel_vm_end to be correct and we make sure it's still correct
after growing.
Define virtual_avail and virtual_end in terms of VM_MIN_KERNEL_ADDRESS
and VM_MAX_KERNEL_ADDRESS (resp). Don't hardcode region knowledge.
2003-05-16 07:03:15 +00:00
Marcel Moolenaar
8cc31ae5be Revamp the RID allocation code:
o  Limit the size of the region ID map to 64KB. This gives a bitmap
   that is large enough to keep track of 2^19 numbers. The minimal map
   size is 32KB. The reason we limit the map size is that processor
   models may have implemented a 24-bit region ID, which would give
   a 2MB bitmap while the maximum number of allocations is always
   less than PID_MAX*5, which is less than 2^19.
o  Allocate all region IDs up-front. The slight downside of reserving
   more RIDs then a process needs (3 for ia64 native and 1 for ia32)
   is preferable over the call to pmap_ensure_rid() where RIDs are
   allocated on demand. On SMP systems this may lead to a race
   condition.
o  When allocating a region ID, don't use arc4random(). We're not
   interested in randomness or uniform distribution across the
   spectrum. We only need uniqueness. Random numbers may easily
   collide when the number of allocated RIDs is high, creating a
   possibly unbounded retry rate.
2003-05-16 06:40:40 +00:00
Marcel Moolenaar
75189cff08 Move the conditional definition of KSTACK_MAX_PAGES up ahead where
it's more visible.

Approved by: re@ (blanket)
2003-05-16 06:17:34 +00:00
Marcel Moolenaar
5551d84398 Sync the linker script with the one used by default for userland. Since
ia64 only uses relocations with addend, remove the sections specific to
non-addend relocations (.rel.*). Also remove C++ specific sections.

Approved by: re@ (blanket)
2003-05-16 06:03:45 +00:00
Murray Stokely
a8a084fc17 Add variables for missing network drivers.
PR:		kern/51911
Submitted by:	David Yeske <dyeske@yahoo.com>
Approved by:	re
2003-05-16 04:31:00 +00:00
Murray Stokely
4001e1ee2e Add E-Tech ISA PnP modem ID.
PR:		kern/36692
Submitted by:	Theo van Klaveren <t.vanklaveren@student.utwente.nl>
Approved by:	re (murray)
MFC After:	3 days
2003-05-16 04:04:04 +00:00
David E. O'Brien
04ddc5dea6 Run $S/kern/genassym.sh with the correct NM.
Approved by:	re(blanket)
2003-05-16 02:27:17 +00:00
David E. O'Brien
8d542cb56d Fix long standing bug that prevents the PT_CONTINUE, PT_KILL and
PT_DETACH ptrace(2) requests from functioning as advertised in the
manual page.  As described in kern/35175, the PT_DETACH request will,
under certain circumstances, pass an unwanted signal on to the traced
process upan detaching from it.  The PT_CONTINUE request will
sometimes fail if you make it pass a signal that has "properties" that
differ from the properties of the signal that origionally caused the
traced process to be stopped.  Since PT_KILL is nothing than
PT_CONTINUE with SIGKILL, it is broken too.  In the PT_KILL case, this
leads to an unkillable process.

PR:		44011
Submitted by:	Mark Kettenis <kettenis@chello.nl>
Approved by:	re(jhb)
2003-05-16 01:34:23 +00:00
Robert Watson
98b2788832 Add a tunable/sysctl "hw.fxp_noflow" which disables flow control support
on if_fxp cards.  When flow control is enabled, if the operating system
doesn't acknowledge the packet buffer filling, the card will begin to
generate ethernet quench packets, but appears to get into a feedback
loop of some sort, hosing local switches.  This is a temporary workaround
for 5.1: the ability to configure flow control should probably be
exposed by some or another management interface on ethernet link layer
devices.

Approved by:	re (bmah)
Reviewed by:	mux
2003-05-16 01:13:16 +00:00
Thomas Moestl
a93b6bf5e9 In cpu_fork(), initialize pcb_psl for the new process to PSL_KERNEL,
instead of taking the (userland) eflags from the trap frame and masking
out PSL_I. There is no need to inherit any flags from the forking process;
the old method however can cause flags set in userland for the forking
process to be bogusly set in kernel mode when the newly forked process
runs for the first time (in particular PSL_T, which is set for userland
when the process is single-stepped; this would cause trace traps in
kernel mode).

Approved by:	re (jhb)
2003-05-16 01:10:33 +00:00
Robert Watson
c1dca9ab07 VOP_PATHCONF() requires a vnode lock; this patch adds locking to
fpathconf(). The lock is held for direct calls to VOP_PATHCONF() in
pathconf() already.

Approved by:	re (jhb)
Pointed out by:	DEBUG_VFS_LOCKS
2003-05-15 21:13:08 +00:00
Robert Watson
7042ac8cd7 This change grabs the vnode lock for NFS client vnodes when calling
VOP_SETATTR() or VOP_GETATTR(); without these locks (a) VFS_DEBUG_LOCKS
will panic, and (b) it may be possible to corrupt entries in the cached
vnode attributes in the nfsnode, since nfsnode attribute cache data is
also protected by the vnode lock.

Approved by:	re (jhb)
Pointed out by:	VFS_DEBUG_LOCKS
2003-05-15 21:12:08 +00:00
Robert Watson
62d4b85ec1 Jeff added locking assertions that the VV_ flags on vnodes were modified
only while holding appropriate vnode locks.  This patch slides the lock
release for ufs_extattr_enable() to continue to hold the active vnode lock
on a backing file until after the flag change; it also acquires a vnode
lock when disabling an attribute and hence clearing a flag on the backing
vnode.  This permits VFS_DEBUG_LOCKS to run UFS1 extended attributes
without panicking, as well as preventing a potential race and vnode flag
problem.

Approved by:	re (jhb)
Pointed out by:	DEBUG_VFS_LOCKS
2003-05-15 21:07:33 +00:00
Bosko Milekic
11583f6c93 Make the mb_alloc low-watermark sysctl-tunable read-only and make
netstat(1) not display it for now because its effects are not yet
completely implemented and we're about to cut 5.2-RELEASE.
This is temporary.

Approved by: re (scottl, rwatson)
2003-05-15 19:05:28 +00:00
Julian Elischer
95f04def4b fix a cut-n-paste error.
in the case where the bridge node was closed down but a timeout
still applied to it, the final reference to the node was freeing the private
data structure using the wrong malloc type.

Approved by:	re@
2003-05-15 18:51:28 +00:00
Nate Lawson
d6061de923 Generalize a quirk for Asahi Optical-based cameras (i.e. Pentax). It appears
all of the Optio series have the same problems.  It might be a better
approach eventually to add wildcard support to USB quirks.

PR:		kern/50271, kern/46369
Approved by:	re (rwatson)
2003-05-15 17:36:22 +00:00
Nate Lawson
f410510b09 Add a quirk for OTi USB flash key.
PR:		kern/51825
Approved by:	re (rwatson)
2003-05-15 17:35:35 +00:00
Thomas Moestl
18100346d1 Miscellaneous fixes:
- Fix compilation without GEM_DEBUG.
- Do not #define GEM_DEBUG by default; it adds overhead (due to bzero()ing
  RX space) and is not needed any more, since the driver is quite stable
  now.
- Fix watchdog timeouts when failing to load TX packets.
- Do not forcibly limit the number of descriptors used for a packet to
  GEM_NTXSEGS, by passing this number to bus_dma_tag_create(). There is
  no requirement for a limit any lower than the total number of
  available descriptors, and the present limit caused network problems
  due to mbuf chains requiring more descriptors.
  GEM_NTXSEGS is still used to estimate the interrupt window size, for
  which we just need an estimate.

Approved by:	re (rwatson)
2003-05-15 16:57:55 +00:00
Martin Blapp
f956e0b3f0 Only use a SIA/SYM media info block if no MII block is detected.
The submitter of PR 32118 told me that this patch also fixes autoselecting
for znyx 4 port cards (10baseT, 100baseTX did work already).

PR:		32118
Reviewed by:	imp
Approved by:	rwatson (re)
2003-05-15 16:53:29 +00:00
Marcel Moolenaar
794518cd6d This file creates register sets based on the runtime specification.
The advantage of using register sets is that you don't focus on each
register seperately, but instead instroduce a level of abstraction.
This reduces the chance of errors, and also simplifies the code.
The register sers form the basis of everything register.
The sets in this file are:

struct _special
contains all of the control related registers, such as instruction
pointer and stack pointer. It also contains interrupt specific registers
like the faulting address. The set is roughly split in 3 groups. The
first contains the registers that define a context or thread. This is
the only group that the kernel needs to switch threads.  The second group
contains registers needed in addition to the first group needed to switch
userland threads. This group contains the thread pointer and the FP control
register. The third group contains those registers we need for execption
handling and are used on top of the first two groups.

struct _callee_saved, struct _callee_saved_fp
These sets contain the preserved registers, including the NaT after
spilling. The general registers (including branch registers) are
seperated from the FP registers for ptrace(2).

struct _caller_saved, struct _caller_saved_fp
These sets contain the scratch registers based on SDM 2.1, This means that
both ar.csd and ar.ccd are included here, even though they contain ia32
segment register descriptions. We keep seperate NaT bits for scratch and
preserved registers, because they are never saved/restored at the same
time.

struct _high_fp
The upper 96 FP registers that can be enabled/disabled seperately on
the CPU from the lower 32 FP registers. Due to the size of this set,
we treat them specially, even though they are defined as scratch
registers.

CVS ----------------------------------------------------------------------
2003-05-15 08:36:03 +00:00
Marcel Moolenaar
4bae872201 This file contains elementary context related functions used to
save and restore "sets" of registers in various places.
The restorectx and swapctx functions are used by cpu_switch()
and deal with the special registers, as well as the preserved
registers.
The *callee_saved* functions are used to save and restore the
preserved registers (integer and floating-point). They are
useful for signal delivery and ptrace support.
The save_high_fp and restore_high_fp functions are used to
"load" and "unload" to and from the CPU as part of lazy context
switching.
The ia32 specific context functions have been kept with the ia32
code.

Approved by: re@ (blanket)
2003-05-15 08:08:32 +00:00
Marcel Moolenaar
1d67adffd6 This file contains the code that implements the syscall path based
on the epc instruction. The epc instruction, given the permissions
of the page in which the epc is located, allows the privilege level
to be increased with little or no overhead. The previous privilege
level is recorded in the current frame marker and is restored by
a regular (function) return.
Since the epc instruction has to live in a page with non-standard
properties, we hardwire a "gateway" page in the address space. The
address of the gateway page is exported to userland in ar.k7. This
allows us to rewire the page without breaking the ABI.
The syscall stubs in libc are regular function calls that slightly
differ from the normal runtime. The difference is mostly to simplify
the stubs themselves by by moving some of the logic to the kernel.
The libc stubs call into the gateway page (offset 0), from where the
kernel trampolines to the code that sets up a minimal trapframe and
arranges to execute from the kernel stack.
The way back is basicly the same. The kernel returns to the gateway
page, whereby privilege is dropped, and jumps back to the syscall
stub.
Only the special registers are saved in the trapframe. None of the
scratch registers are preserved and since the kernel follows the
same runtime model, none of the preserved registers are saved.
Future enhancements can include the implementation of lightweight
syscalls, where kernel functions are performed without setting up
a trapframe. Good candidates are the *context syscalls for example.

Now that there's a gateway page from which code can be executed in
a non-privileged context, we also have the ideal place to put the
signal trampolines. By moving the signal trampolines from the user
stack to the gateway page, we open up the doors to unexecutable
stacks. The gateway page contains signal trampolines for both the
"legacy" break-based syscall code and the new and improved epc-
based syscall code.

Approved: re@ (blanket)
2003-05-15 07:51:22 +00:00
Alan Cox
4a0d6dfd2c Initialize logical_cpus_mask when the logical CPUs are enumerated in
the mptable.  (Previously, logical_cpus_mask was only initialized if
the hyperthreading fixup was executed.)

Approved by:	re (jhb)
Reviewed by:	ps
2003-05-15 05:12:24 +00:00
Marcel Moolenaar
2a9fc22645 This commit was generated by cvs2svn to compensate for changes in r115013,
which included commits to RCS files with non-trunk default branches.
2003-05-15 05:04:44 +00:00
Marcel Moolenaar
35859e5946 This is beta4 of libuwx; an ia64 stack unwinder. This code is made
available by Hewlett-Packard under the MIT license. The unwinder is
small, clean and fast and needed little adaptation for use in the
kernel.

This import has embedded in it the changes needed to make it build
in a kernel environment.

To optimize the common case, the kernel will minimize the number
of registers saved by not saving the preserved registers. In case
access to preserved registers is needed (signal handling, ptrace)
the kernel will unwind to the context of the syscall or exception.
For this we need an unwinder.

Approved by: re (blanket)
2003-05-15 05:04:44 +00:00
Juli Mallett
7bbf05a2c3 Clear up that COMPAT_43 may not do the same thing on every architecture
and clear up that COMPAT_SUNOS is similarly MI, and does something
relatively similar.

Approved by:	re/rwatson
2003-05-15 02:10:30 +00:00
Peter Wemm
c0a54ff621 Collect the nastiness for preserving the kernel MSR_GSBASE around the
load_gs() calls into a single place that is less likely to go wrong.

Eliminate the per-process context switching of MSR_GSBASE, because it
should be constant for a single cpu.  Instead, save/restore it during
the loading of the new %gs selector for the new process.

Approved by:	re (amd64/* blanket)
2003-05-15 00:23:40 +00:00
Peter Wemm
be52ef1399 Use compile time constants for things like PTmap[] etc because they're
about to move outside of the +/- 2GB range

Suggested by:	jake
Approved by:	re (amd64/* blanket)
2003-05-15 00:20:17 +00:00
Maxime Henrion
4d340ec485 GCC 3.3 complains about anonymous structures in unions, so
give the fxp_ipcb structure a name in the fxp_rfa structure.

Submitted by:	peter
Approved by:	re (jhb)
2003-05-14 20:33:41 +00:00
John Baldwin
aa7ba84232 Fix a typo that broke the pc98 kernel build.
Reported by:	des@'s tinderbox
Pointy hat to:	jhb
Approved by:	re (blanket/scottl)
2003-05-14 20:21:42 +00:00
John Baldwin
ce130a9573 Add <sys/queue.h> to unbreak world.
Approved by:	re (scottl)
2003-05-14 15:00:24 +00:00
Thomas Quinot
b3c957133a In atapi_cam_reinit_bus, only call reinit_bus if the ATAPI channel
has already been registered with ATAPI/CAM (else there is nothing
to do). atapi_cam_reinit_bus may be called before the bus is
registered if an ATAPI command times out during the boot sequence.

PR:		i386/51421
Reviewed by:	roberto
Approved by:	re (rwatson)
MFC after:	1 week
2003-05-14 14:20:22 +00:00
Wilko Bulte
5adbf8fb4d add support for NetMos 4S0P PCI: 4S, 0P
tested on -current: ceri
tested on -stable:  wilko

approved: re (scottl)
2003-05-14 09:37:46 +00:00
Peter Wemm
e14528b349 Regen
Approved by: re (amd64 blanket)
2003-05-14 04:11:25 +00:00
Peter Wemm
d85631c4ac Add BASIC i386 binary support for the amd64 kernel. This is largely
stolen from the ia64/ia32 code (indeed there was a repocopy), but I've
redone the MD parts and added and fixed a few essential syscalls.  It
is sufficient to run i386 binaries like /bin/ls, /usr/bin/id (dynamic)
and p4.  The ia64 code has not implemented signal delivery, so I had
to do that.

Before you say it, yes, this does need to go in a common place.  But
we're in a freeze at the moment and I didn't want to risk breaking ia64.
I will sort this out after the freeze so that the common code is in a
common place.

On the AMD64 side, this required adding segment selector context switch
support and some other support infrastructure.  The %fs/%gs etc code
is hairy because loading %gs will clobber the kernel's current MSR_GSBASE
setting.  The segment selectors are not used by the kernel, so they're only
changed at context switch time or when changing modes.  This still needs
to be optimized.

Approved by:	re (amd64/* blanket)
2003-05-14 04:10:49 +00:00
Peter Wemm
5d5ca6d75e Fix some misunderstandings about 64 bit extension.
Fix fuword/suword - they're supposed to be 'long' - ie: point them
at fuword64/suword64 instead of the incorrect 32 bit versions.
2003-05-14 03:38:13 +00:00
Paul Saab
13d56a9a90 p_sigignore moved into struct sigacts. move one which was missed.
Approved by:	re (scottl)
2003-05-14 00:03:55 +00:00
John Baldwin
90af4afacb - Merge struct procsig with struct sigacts.
- Move struct sigacts out of the u-area and malloc() it using the
  M_SUBPROC malloc bucket.
- Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(),
  sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared().
- Remove the p_sigignore, p_sigacts, and p_sigcatch macros.
- Add a mutex to struct sigacts that protects all the members of the struct.
- Add sigacts locking.
- Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now
  that sigacts is locked.
- Several in-kernel functions such as psignal(), tdsignal(), trapsignal(),
  and thread_stopped() are now MP safe.

Reviewed by:	arch@
Approved by:	re (rwatson)
2003-05-13 20:36:02 +00:00
John Baldwin
25b4d3a8a6 In setitimer(2), if the it_value of the new itimer value is clear, then
don't add the current time to it, but leave it as clear so that when the
timer is disabled, the it_value is always clear.

Reviewed by:	bde
Approved by:	re (rwatson)
2003-05-13 19:21:46 +00:00
John Baldwin
dea7cce585 Add some extra #ifdef stubs so that this compiles on 4.8.
Approved by:	re (rwatson/bmah)
2003-05-13 16:59:46 +00:00
Yoshihiro Takahashi
3e4e484918 Move the ips driver from ${MACHINE_ARCH} == "i386" to ${MACHINE} == "i386".
Approved by:	re (scottl)
2003-05-13 11:26:08 +00:00
Alan Cox
099e981aa1 Optimize the use of splay in gbincore(). During a "make buildworld" the
desired buffer is found at one of the roots more than 60% of the time.
Thus, checking both roots before performing either splay eliminates
unnecessary splays on the first tree splayed.

Approved by:	re (jhb)
2003-05-13 04:36:02 +00:00
Poul-Henning Kamp
3eb8c738fd When a disk disappears, destroy the class from the event thread
to avoid race condtion.

Approved by:	re/rwatson
2003-05-12 20:15:28 +00:00
Martin Blapp
7eac366be1 Add support for 3Com OfficeConnect 10/100B.
PR:		49059, 50747
Submitted by:	Dax Eckenberg <daxbert@dweebsoft.com>
Reviewed by:	imp, jhb
Approved by:	jhb
MFC after:	2 weeks
2003-05-12 19:50:21 +00:00
Peter Wemm
8a6d52c3f8 Really stop the loader from trying to load the acpi module by lying and
pretending that it is already here.

Approved by:	re (amd64/* stuff)
2003-05-12 18:37:56 +00:00
Peter Wemm
0fe93e7480 For the page fault handler, save %cr2 in the outer trap handler so that
we do not have to run so long with interrupts disabled.  This involved
creating tf_addr in the trapframe.  Reorganize the trap stubs so that
they consistently reserve the stack space and initialize any missing
bits.

Approved by:	re (amd64 stuff)
2003-05-12 18:33:19 +00:00
Peter Wemm
0f6241620b Sync ucontext with reality. The struct trapframe changes need to be
reflected here.

Approved by:	re (blanket amd64/*)
2003-05-12 18:23:04 +00:00
Maxime Henrion
72490791a8 Fix the unaligned access problems that some people saw on alpha
by using a __packed keyword for the fxp_rfa structure.  The Intel
guys who designed this structure with unaligned fields deserve
to be shot.

Tested by:	kris
Approved by:	re@ (jhb)
2003-05-12 18:15:33 +00:00
Nate Lawson
014ed75b27 Move some printfs under bootverbose since they are not true errors.
Approved by:	re (bmah)
2003-05-12 16:54:55 +00:00
Søren Schmidt
69d691371a Fix typo (that even got cut/pasted 2 times)
Found by:	phk
Approved by:	re@
2003-05-12 16:43:13 +00:00
Poul-Henning Kamp
df1970aa55 Fix an off-by-1 error.
Found by:	FlexeLint
Reviewed by:	sos
Approved by:	re/rwatson
2003-05-12 15:26:05 +00:00
Poul-Henning Kamp
87b1831f1d Bail out if there were not two loadable sections. Add XXX comment about
one other issue.

Approved by:	re/rwatson.
2003-05-12 15:08:10 +00:00
Robert Watson
1964fb9ba2 Remove bogus locking from DDB's "show lockedvnods" command: using
synchronization primitives from inside DDB is generally a bad idea,
and in this case it frequently results in panics due to DDB commands
being executed from the sio fast interrupt context on a serial
console.  Replace the locking with a note that a lack of locking
means that DDB may get see inconsistent views of the mount and vnode
lists, which could also result in a panic.  More frequently,
though, this avoids a panic than causes it.

Discussed with ages ago:	bde
Approved by:			re (scottl)
2003-05-12 14:37:47 +00:00
Peter Wemm
ab6859fd2f Fix lookup of module metadata on amd64 systems. While this is in
common code, the non-trivial part is #ifdef'ed and only executes when
loading amd64 kernels. The rest is trivial but needed for the the amd64
case. (Two variables changed from char ** to Elf_Addr).

Approved by:	re (amd64 "low-risk" stuff)
2003-05-12 05:48:09 +00:00
Poul-Henning Kamp
1282e9acea Don't pass NULL pointer to memset if we are compiled with DIAGNOSTIC
Approved by:	re/rwatson
2003-05-12 05:09:56 +00:00
Poul-Henning Kamp
3d5371a1a6 Don't #define memset() to bzero(), it is far too prone to bite somebody.
Approved by:	re/scottl
2003-05-12 05:08:38 +00:00
Peter Wemm
063107e21d Revert leftover AMD64 disable-acpi-module stuff. 2003-05-12 04:57:05 +00:00
Murray Stokely
281b971b68 Regen.
Approved by:	re
2003-05-12 04:27:22 +00:00
Peter Wemm
e9b193dc33 AMD64 physical space is much larger than i386, de-i386 the bus_space and
bus_dma MD code for AMD64.  (And a trivial ifdef update in dev/kbd because
of this).  More updates are needed here to take advantage of the 64 bit
instructions.

Approved by:	re (blanket amd64/*)
2003-05-12 02:44:37 +00:00
Peter Wemm
bf1e897425 Give a %fs and %gs to userland. Use swapgs to obtain the kernel %GS.base
value on entry and exit.  This isn't as easy as it sounds because when
we recursively trap or interrupt, we have to avoid duplicating the
swapgs instruction or we end up back with the userland %gs.  I implemented
this by testing TF_CS to see if we're coming from supervisor mode
already, and check for returning to supervisor. To avoid a race with
interrupts in the brief period after beginning executing the handler and
before the swapgs, convert all trap gates to interrupt gates, and reenable
interrupts immediately after the swapgs.  I am not happy with this.
There are other possible ways to do this that should be investigated.
(eg: storing the GS.base MSR value in the trapframe)

Add some sysarch functions to let the userland code get to this.

Approved by:	re (blanket amd64/*)
2003-05-12 02:37:29 +00:00
Hidetoshi Shimokawa
96c7c6dd58 Make it compiled on 4-stable.
Approved by: re (scottl)
2003-05-12 00:42:28 +00:00
Josef Karthauser
8e274c38c2 Extend the digital camera support (umass) to the PENTAX Optio 330GS.
Submitted by:	Jan-Oliver Neumann <neumannj@arcor.de>
By way of:	n_hibma
Approved by:	re (jhb & bmah)
MFC After:	7 days
2003-05-11 23:55:28 +00:00
Peter Wemm
85983c59cd Call it an AMD64 Processor, not a Hammer. Also, it seems that the cpuid
model numbers are wider than I first thought.

Approved by: re (blanket amd64/*)
2003-05-11 23:01:04 +00:00
Peter Wemm
f75b005a99 I missed another printf format error while extracting the patch.
Approved by: re (blanket amd64/*)
2003-05-11 22:55:40 +00:00
Peter Wemm
eeee69d45c Make atdevbase long for the KERNBASE > 4GB case
Approved by: re (amd64/* blanket)
2003-05-11 22:53:43 +00:00
Peter Wemm
573044a926 For amd64 kernels, repeat the 1GB mapping over the entire address space
instead of just at 0GB and 1GB marks.  This gives more flexibility for
the choice of KERNBASE.

Approved by:	re (amd64 stuff)
2003-05-11 22:42:29 +00:00
Peter Wemm
5a337b2589 Fix printf format errors that were undetected due to using the standard
FSF compiler during early development.
2003-05-11 22:40:25 +00:00
Peter Wemm
5048926df9 Export PML4SHIFT and PDPSHIFT
Approved by: re (blanket amd64/*)
2003-05-11 22:39:40 +00:00
Peter Wemm
4ce3e250ce Since compiling natively, the compile environment has been less forgiving
about silly typos.  Use the correct comment sequences.
2003-05-11 22:38:54 +00:00
Matthew N. Dodd
598d45be84 Provide exec_linux_setregs() to override exec_setregs().
Linux initializes %gs to 0.  Mimic this behavior.

Submitted by:	 Christian Zander <zander@minion.de>
Reviewed by:	 jake
Approved by:	 re
2003-05-11 21:51:11 +00:00
Hidetoshi Shimokawa
6902ee83c7 - Use moderate gap counts listed in IEEE1394a.
- Simplify and correct the bus manager election process.
- Check link_active when choosing cycle master.
- Fix location of the cmr bit.

Approved by: re (scottl)
2003-05-11 10:32:20 +00:00
Scott Long
3bd9d6f570 Hook up the ips module 2003-05-11 06:40:09 +00:00
Scott Long
1b20702e45 Add notes about the 'ips' driver. 2003-05-11 06:39:05 +00:00
Scott Long
21157fae1c Add files for the 'ips' driver. 2003-05-11 06:37:52 +00:00
Scott Long
2aedd662d8 Add the 'ips' driver for the IBM (now Adaptec) ServeRAID controller
series.  This driver was generously developed and released by David
Jeffreys and Adaptec.  I've updated it to work with 5.x and fixed a
few bugs.

MFC After:	1 week
2003-05-11 06:36:49 +00:00
Scott Long
5639836dcf garbage collect the reserved major for the ips disk device. GEOM makes
it unneeded.
2003-05-11 06:18:33 +00:00
Julian Elischer
335d40c8ff Last commit of the bluetooth upgrade. (this patch was forgotten in the first
commit)

Submitted by: Maksim Yevmenkin <m_evmenkin@yahoo.com>
Approved by: re@
2003-05-10 22:11:25 +00:00
Julian Elischer
f2bb1cae36 Part one of undating the bluetooth code to the newest version
Submitted by:   Maksim Yevmenkin <m_evmenkin@yahoo.com>
Approved by: re@
2003-05-10 21:44:42 +00:00
Bosko Milekic
969bab3efb Make m_freem() just use m_free() instead of duplicating the code. The
reason for the duplication was that m_freem() was meant to eventually
be optimized to hold the lock of the cache being freed to as long as
possible across frees but the difficulty of implementing said
optimization right now is too high, given that in some cases (see MAC
and non-cluster external buffers), we need to call into other subsytems,
something not permissible when the cache lock is held.

This change minimizes code duplication while keeping at least the
atomic mbuf+cluster free optimization.

Suggested by: luigi
2003-05-10 18:08:23 +00:00
Søren Schmidt
60ad94dea4 Add a couble new Intel PCI id's
Approved by: re@
2003-05-10 14:49:19 +00:00
Peter Wemm
b2744ab9c4 Remove special hacks for FSF cross tools now that it builds natively. 2003-05-10 01:12:24 +00:00
Peter Wemm
0fe0f2515b Provide a fake varargs implementation for lint's benefit. This way
it can see the intent of the va_* macros, even though it cannot work.

Approved by:	re (blanket amd64/*)
2003-05-10 00:55:15 +00:00
Peter Wemm
e1ef71de2b Remove _ARCH_INDIRECT ifdefs. They existed for lib/msun/* on i386, which
could use different versions of the math code depending on whether there
was real floating point hardware or math emulation.  Since the fpu is
part of the core specification on amd64, there is no need for this here.

Approved by:	re (blanket amd64/*)
2003-05-10 00:53:34 +00:00
Peter Wemm
2e4f687a1d bcopyb() isn't used on amd64 kernel (it only exists for i386/pcvt)
Approved by:	re (blanket amd64/*)
2003-05-10 00:51:29 +00:00
Peter Wemm
5826a47e9b Finish translating i386/support.s into amd64 asm - replace bcopy etc with
asm versions.  This yields about a 5% kernel compile time speedup.
2003-05-10 00:49:56 +00:00
Poul-Henning Kamp
4da6e74ce4 When a GEOM (/dev-)device is closed and we find that I/O requests are
still outstanding, give them a chance to complete.

If after 10 seconds we still find outstanding I/O requests, complete
the close with a console warning that the system is likely to panic
later on.

This is a workaround for umount -f not quite doing the right thing.

Approved by:    re/scottl
2003-05-09 21:25:28 +00:00
John Baldwin
b1bf1c3a98 Remove Giant from kern_sigsuspend() and osigsuspend() as these should now
be MP safe.

Approved by:	re (scottl)
2003-05-09 19:11:32 +00:00
Peter Wemm
395e65aa29 Include the MXCSR initial values, based on the AMD docs. This file
should really be renamed to fpu.h and npx.c to fpu.c since its part of
the core architecture on amd64 systems, not an isa 'numeric processor
extension'.
2003-05-09 18:28:05 +00:00
Peter Wemm
14426b9c3b Turn syscons on now that it works, so that anybody trying to run this
can see something.  Probing for keyboard still works for auto serial
console mode.
2003-05-09 18:26:06 +00:00
Peter Wemm
7edc7b0d3b Trivial addition of __amd64__ to the ifdefs for platforms that use
i386-style vga console support.

Approved by:  re
2003-05-09 18:24:40 +00:00
Mike Silbersack
b9697d572f Redefine M_FREELIST to be 0x8000; 0x4000 conflicted with two other
uses of m_flags in the kernel.  (A future commit will move all
private m_flags users here so they're obvious without a great
deal of searching.)

This should fix the mbuf double-free panics those using ppp or
ipfw reset rules have been seeing since the double-free detection
code went in.
2003-05-09 02:15:52 +00:00
Alan Cox
3a12f5da1f Give the kmem object's mutex a unique name, instead of "vm object",
to avoid false reports of lock-order reversal with a system map mutex.

Approved by:	re (jhb)
2003-05-09 02:13:23 +00:00
Robert Watson
b2aef57123 Rename MAC_MAX_POLICIES to MAC_MAX_SLOTS, since the variables and
constants in question refer to the number of label slots, not the
maximum number of policies that may be loaded.  This should reduce
confusion regarding an element in the MAC sysctl MIB, as well as
make it more clear what the affect of changing the compile-time
constants is.

Approved by:	re (jhb)
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-05-08 19:49:42 +00:00
John Baldwin
258dbbab69 Acquire Giant at the start of the raid rebuild kthreads.
Reported by:	Masachika ISHIZUKA <ishizuka@ish.org>
Reviewed by:	sos
Approved by:	re (bmah)
2003-05-08 16:38:14 +00:00
Peter Wemm
b3f7680e49 Oops. Turn T_PAGEFLT back into an interrupt gate. It is *critical*
that interrupts be disabled and remain disabled until %cr2 is read.
Otherwise we can preempt and another process can fault, and by the
time we read %cr2, we see a different processes fault address.  This
Greatly Confuses vm_fault() (to say the least).  The i386 port has
got this marked as a bug workaround for a Cyrix CPU, which is what
lead me astray.  Its actually necessary for preemption, regardless
of whether Cyrix cpus had a bug or not.
2003-05-08 08:25:51 +00:00
Peter Wemm
34da59975b Exclude sys/boot for amd64. There are still toolchain issues to deal
with.  In theory, gcc -m32 should work, but for now, do not tempt fate.

Approved by:	re (scottl)
2003-05-08 06:35:39 +00:00
Greg Lehey
108b696afe ioctl VINUM_READCONFIG: Don't lock configuration here. vinum_scandisk
needs to do it anyway to handle the startup case.  This is
            part of a fix for the recently reported hangs.

Approved by:  re (scottl)
2003-05-08 00:36:20 +00:00
Peter Wemm
2dbe628162 Leave space for the 128 byte red-zone on the stack. 2003-05-08 00:13:24 +00:00
Peter Wemm
f3b234157e #include <machine/metadata.h> was missing; add it 2003-05-08 00:12:37 +00:00
Peter Wemm
9c43b77ff5 Fix a preemption race. I was reenabling interrupts in the fast system
call handler before it was safe.  It was possible for to lose context
and for something else to clobber the PCPU scratch variable.  This
moves the interrupt enable *way* too late, but its better safe than
sorry for the moment.
2003-05-08 00:05:00 +00:00
Paul Saab
e0ced69666 - Change the full Asic revision defines to CHIPID to better since the
ASIC revision is really the major number of the CHIPID.  Also store
  the chipid, asic rev and chip revision in the softc for later use.

- The write twice to send producer index workaround only applies to
  the 5700_BX chips, so only do it there.
  Requested by: jdp

- Do not initalize the LED's to 0x00.  The default configuration
  the chip comes up in should yeild proper operation of the LED's.
  Confirmed by: John Cagle <john.cagle@hp.com>

Approved by:	re (blanket)
2003-05-07 21:51:13 +00:00
Robert Watson
41a17fe326 Clean up locking for the MAC Framework:
(1) Accept that we're now going to use mutexes, so don't attempt
    to avoid treating them as mutexes.  This cleans up locking
    accessor function names some.

(2) Rename variables to _mtx, _cv, _count, simplifying the naming.

(3) Add a new form of the _busy() primitive that conditionally
    makes the list busy: if there are entries on the list, bump
    the busy count.  If there are no entries, don't bump the busy
    count.  Return a boolean indicating whether or not the busy
    count was bumped.

(4) Break mac_policy_list into two lists: one with the same name
    holding dynamic policies, and a new list, mac_static_policy_list,
    which holds policies loaded before mac_late and without the
    unload flag set.  The static list may be accessed without
    holding the busy count, since it can't change at run-time.

(5) In general, prefer making the list busy conditionally, meaning
    we pay only one mutex lock per entry point if all modules are
    on the static list, rather than two (since we don't have to
    lower the busy count when we're done with the framework).  For
    systems running just Biba or MLS, this will halve the mutex
    accesses in the network stack, and may offer a substantial
    performance benefits.

(6) Lay the groundwork for a dynamic-free kernel option which
    eliminates all locking associated with dynamically loaded or
    unloaded policies, for pre-configured systems requiring
    maximum performance but less run-time flexibility.

These changes have been running for a few weeks on MAC development
branch systems.

Approved by:	re (jhb)
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-05-07 17:49:24 +00:00
John Baldwin
ace85d0a3c Style nits.
Approved by:	re (bmah)
2003-05-07 17:21:38 +00:00
Poul-Henning Kamp
8b7e2de80c #include <sys/resource.h> to limit ports damage.
Approved by:	re/rwatson
2003-05-07 15:26:43 +00:00
Poul-Henning Kamp
2cc9686e52 Hide the "ENOMEM" notice messages behind bootverbose. They are still
a valuable debugging tool for certain kinds of problems.

Approved by:	re/scottl
2003-05-07 05:37:31 +00:00
Robert Watson
430c635447 Correct a bug introduced with reduced TCP state handling; make
sure that the MAC label on TCP responses during TIMEWAIT is
properly set from either the socket (if available), or the mbuf
that it's responding to.

Unfortunately, this is made somewhat difficult by the TCP code,
as tcp_twstart() calls tcp_twrespond() after discarding the socket
but without a reference to the mbuf that causes the "response".
Passing both the socket and the mbuf works arounds this--eventually
it might be good to make sure the mbuf always gets passed in in
"response" scenarios but working through this provided to
complicate things too much.

Approved by:	re (scottl)
Reviewed by:	hsu
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-05-07 05:26:27 +00:00
Robert Watson
688fe1d954 Trim a call to mac_create_mbuf_from_mbuf() since m_tag meta-data
copying for mbuf headers now works properly in m_dup_pkthdr(), so
we don't need to do an explicit copy.

Approved by:	re (jhb)
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-05-06 20:34:04 +00:00
Poul-Henning Kamp
c9e629297a Fix the WARNING for wrong rawoffset, I tested incompatible units.
Approved by:	re/jhb
2003-05-06 19:36:13 +00:00
John Baldwin
6e52134c70 Add PCI ID's for the Intel ICH5 (82801EB) chipset.
Approved by:	re (murray)
Sponsored by:	The Weather Channel
2003-05-06 19:31:56 +00:00
John Baldwin
d40a3f6f47 Add PCI ID's for the 4 USB hubs on the ICH5 controller.
Approved by:	re (murray)
2003-05-06 19:30:41 +00:00
Scott Long
bc2de3dac0 We are now in 5.1-BETA 2003-05-06 03:55:24 +00:00
Dag-Erling Smørgrav
88b1e0bc5b Fix a printf() format error which broke the ia64 GENERIC build. 2003-05-06 03:55:12 +00:00
Alan Cox
658ad5fff5 Lock the vm_object when performing vm_pager_deallocate(). 2003-05-06 02:45:28 +00:00
Olivier Houchard
0be8f80357 Don't call timeout() in sis_tick(), this is done earlier by mii_tick(), and it
leads to a panic at unload time, as we own 2 instances of callout and
untimeout() only one.
Will I'm there, remove a call to callout_handler_init(), one is enough.

Reviewed by:	wpaul
2003-05-06 02:00:01 +00:00
John Baldwin
01de25134f Tweak the clearing of TDF_DEADLKTREAT so that we only bother grabbing the
lock and clearing the flag if it was clear when uiomove() was called.
2003-05-05 21:27:29 +00:00
John Baldwin
854dc8c2a1 Mostly sort the includes. 2003-05-05 21:26:25 +00:00
Poul-Henning Kamp
069accaa6a Put descriptive comments on the GEOM_* options 2003-05-05 21:21:31 +00:00
John Baldwin
18440c7fe7 Lock the proc lock around calls to tdsignal() in the sigwait() family of
syscalls.
2003-05-05 21:18:10 +00:00
John Baldwin
6711f10fb6 Make issignal() private to kern_sig.c since it is only called from cursig()
and cursig() is now a function rather than a macro.
2003-05-05 21:16:28 +00:00
John Baldwin
e668d8d834 Remove TD_ON_RUNQ() from a check to make sure Giant is not held when
calling mi_switch().  The kernel would panic on an earlier KASSERT() in
mi_switch() if TD_ON_RUNQ() was true.
2003-05-05 21:12:36 +00:00
David Malone
710c5645af Split sendit into two parts. The first part, still called sendit, that
does the copyin stuff and then calls the second part kern_sendit to do
the hard work. Don't bother holding Giant during the copyin phase.

The intent of this is to allow the Linux emulator to impliment send*
syscalls without using the stackgap.
2003-05-05 20:33:38 +00:00
David E. O'Brien
2875867356 Fix usages of %ll[dx] with typedef'ed created types.
In the kernel it is wrong 99.9 times out of 100 to use %ll rather than cast
to intmax_t and use %j.
2003-05-05 16:56:44 +00:00
Hartmut Brandt
2102bdf21a Define a link layer MIB for ATM. Most fields of this MIB are needed by
ILMI daemons. Factor out common softc fields for all ATM interfaces that
need to be externally visible into an ifatm structure and make the midway
driver using this structure and fill the MIB.
2003-05-05 16:35:52 +00:00
Poul-Henning Kamp
af3e2db5de Avoid double-free panic.
Tripped up:	DougB
2003-05-05 15:52:11 +00:00
Robert Watson
587ffa4508 Clean up proc locking in procfs: make sure the proc lock is held before
entering sys_process.c debugging primitives, or we violate assertions.
Also, be more careful about releasing the process lock around calls
to uiomove() which may sleep waiting for paging machinations or
related notions.  We may want to defer the uiomove() in at least
one case, but jhb will look into that at a later date.

Reported by:	Philippe Charnier <charnier@xp11.frmug.org>
Reviewed by:	jhb
2003-05-05 15:12:51 +00:00
Hidetoshi Shimokawa
d7398f2363 Write to RESET_START register if TARGET_RESET ORB doesn't work for timeout. 2003-05-05 14:50:24 +00:00
Hidetoshi Shimokawa
2b68d77fdd Don't panic for FWXF_START state in fw_xfer_unload(). 2003-05-05 10:14:52 +00:00
Søren Schmidt
b4074fb04d Add a missing ~ when clearing flags in close.
PR: 35392
2003-05-05 10:11:17 +00:00