freebsd-dev

Author	SHA1	Message	Date
Poul-Henning Kamp	43f0db6cc5	Don't do silly thing if the disk_create() event gets canceled. Approved by: re/scottl	2003-05-25 16:57:10 +00:00
Jeff Roberson	30fd5d085d	- Reset the free ent to NULL if we have consumed the last free entry. This fixes a problem where we would overwrite old data if we ran out of free entries. Submitted by: sam Approved by: re (scottl)	2003-05-25 08:48:42 +00:00
Don Lewis	263c8abeb9	Beat vnode locking in the NFS server code into submission. This change is not pretty, but it fixes the code so that it no longer violates the vnode locking rules in the VFS API and doesn't trip any of the locking assertions enabled by the DEBUG_VFS_LOCKS kernel configuration option. There is one report that this patch fixed a "locking against myself" panic on an NFS server that was tripped by a diskless client. Approved by: re (scottl)	2003-05-25 06:17:33 +00:00
Don Lewis	a35e7eaa1a	Always set the hardware parse bit in the IPCB structure when this structure, which is new to the 82550 and 82551, is used to transmit a packet. This appears to fix the packet truncation problem that was observed when using 82550-based fxp cards to transmit ICMP or fragmented UDP packets of certain lengths which only had one to three bytes in the second and final mbuf of the packet. This matches a note in the "Intel 8255x 10/100 Mbps Ethernet Controller Family Open Source Software Developer Manual", which says that the hardware parse bit should be set when sending these types of packets. There have also been unconfirmed reports of similar problems when transmitting TCP packets, which should not be affected by the above mentioned change because the hardware parse bit was already being set if the stack requested hardware checksumming of the packet. If the problem remains, the use of the IPCB structure can be disabled to cause the driver to fall back to using the older 82559 interface with 82550-based cards by setting hint.fxp.UNIT_NUMBER.ipcbxmit_disable to a non-zero value at boot time, or using kenv to set this variable before using kldload to load the fxp driver. Approved by: re (jhb)	2003-05-25 05:04:26 +00:00
Marcel Moolenaar	dc0545462e	Now that we define user mode as any IP address that isn't in the kernel's VA regions, we cannot limit the use of break-based syscalls to user mode only. The signal trampolines are in the gateway page, which is mapped into the process address space in region 5 and thus is kernel space. We don't special case the gateway page here. Allow break-based syscalls from anywhere in the kernel VA space. Approved by: re@ (blanket)	2003-05-25 01:01:28 +00:00
Warner Losh	f9aedaa4ba	Ignore the 'must allocate below 1MB' flag for the TPL_BAR_REG. It is set on realtek cards, but they work without it (and don't work with it). The standard seems to imply that this is just a hint anyway, so this should be harmless. It doesn't appear to be set on any other cardbus cards that I have (or have seen). This should make the rl based CardBus cards work again. I've been running it for about a month now. Approved by: re@ (jhb)	2003-05-24 23:23:41 +00:00
Marcel Moolenaar	d7f827116f	Fix a source of instability specific to an EPC userland. We return to userland with interrupts disabled until we restore PSR. However, it has been observed that interrupts do actually happen before they are enabled again. This is a bit surprising and I don't know yet what's going on exactly. Nevertheless, the code was not crafted carefully enough to allow interrupts to happen and we could clobber the kernel stack of another thread when interrupts did happen. This is what happens: we restore the (memory) stack pointer (sp) and the register stack base prior to restoring ar.k6 and ar.k7. This is not a problem if interrupts don't happen between setting sp/ar.bspstore and ar.k6/ar.k7. Alas, interrupts can happen. Since sp/ar.bspstore already point to the userland stacks, we need to switch to the kernel stack in interrupt. However, ar.k6 and ar.k7 have not been set, which means that we were switching to some unrelated kstack and happily clobbered the trapframe present there if the thread to which the kstack belonged was in kernel mode or otherwise we could have our trapframe clobbered if that other thread enters the kernel. Nasty either way. We now carefully restore ar.k6 prior to restoring ar.bspstore and likewise for ar.k7 and sp. All we need is the guarantee that an interrupt does not clobber ar.k6 or ar.k7 before we're back in userland. That has been achieved by restoring ar.k6/ar.k7 unconditionally (see exception.s). While here, remove the disabling of interrupts on EPC entry. It was added as a way to "resolve" the crashes until it was understood what was going on. I think I achieved the latter, so we can remove the patch. Note that setting up a trapframe with interrupts enabled has its own share of corner cases, but it's better to properly fix those than to keep a mostly wrong patch around because we're afraid to remove it... Approved by: re@ (blanket)	2003-05-24 22:53:10 +00:00
Marcel Moolenaar	a7b90d80fc	Be more careful how we restore interrupts. Don't rewrite most of the PSR only to achieve setting PSR.i back to its previous value. It makes it impossible to change any of the 30+ other unrelated bits when done between intr_disable() and intr_restore(). That's bad. Instead have intr_disable() return 1 when interrupts were previously enabled and 0 otherwise and only enable interrupts in intr_restore() when given a non-0 value. This change specifically disallows using intr_restore() to disable interrupts. The reason is simple: interrupts only need to be restored after they have been disabled, which means that intr_restore() is called with interrupts disabled and we only need to enable them if they were previously enabled. This change does not fix any bugs, other than that it bugged me... Approved by: re@ (blanket)	2003-05-24 21:44:24 +00:00
Marcel Moolenaar	95f2dbba40	Consistently use the same metric to differentiate between kernel mode and user mode. We need to take into account that the EPC syscall path introduces a grey area in which one can argue either way, including a third: neither. We now use the region in which the IP address lies. Regions 5, 6 and 7 are kernel VA regions and if the IP lies in any of those regions we assume we're in kernel mode. Hence, we can be in kernel mode even if we're not on the kernel stack and/or have user privileges. There're gremlins living in the twilight zone :-) For the EPC syscall path this particularly means that the process leaves user mode the moment it calls into the gateway page. This makes the most sense because from a process' point of view the call represents a request to the kernel for some service and that service has been performed if the call returns. With the metric we picked, this also means that we're back in user mode IFF the call returns. Approved by: re@ (blanket)	2003-05-24 21:16:19 +00:00
Marcel Moolenaar	fb4aa34f3b	Unconditionally restore ar.k7 (memory stack) and ar.k6 (register stack) when returning from an interrupt. Both registers are used on interrupt to switch to the right kernel stack, but other than that they are not used. This means we only have to make sure they contain proper values while in user mode. As such, we conditionally restored these registers based on whether we returned to userland or not. A nice property of conditionally restoring ar.k6 and ar.k7 is that it introduces two invariants: ar.k6 always points to the bottom of the kernel stack and ar.k7 always points to the top of the kernel stack (immediately below the PCB we have there). However, the EPC syscall path introduces an irregularity: there's no "thin red line" between user and kernel. There's a grey area that's a couple of instructions wide. Any interruption in that grey area is bound to see an inconsistent state. One such state is that we're in kernel space for all practical purposes, but we still need to have ar.k6 and ar.k7 restored as if we're in userland. Thus: restore ar.k6 and ar.k7 unconditionally at the cost of losing a valuable invariant. Both registers now hold the extent of the usable portion of the kernel stack at any interrupt nesting, which when in userland means the bottom and the top of the kstack.	2003-05-24 20:51:55 +00:00
Peter Wemm	3ebd9b48ce	Stop profiled libc from exploding, matching gcc's generated code. Approved by: re (amd64/* blanket)	2003-05-24 18:24:03 +00:00
Marcel Moolenaar	d1d7df1905	Fix an alpha inheritance bug: On alpha, PAL is involved in context management and after wiring the CPU (in alpha_init()) a context switch was performed to tell PAL about the context. This was bogusly brought over to ia64 where it introduced bugs, because we restored the context from a mostly uninitialized PCB. The cleanup constitutes: o Remove the unused arguments from ia64_init(). o Don't return from ia64_init(), but instead call mi_startup() directly. This reduces the amount of muckery in assembly and also allows for the next bullet: o Save our current context prior to calling mi_startup(). The reason for this is that many threads are created from thread0 by cloning the PCB. By saving our context in the PCB, we have something sane to clone. It also ensures that a cloned thread that does not alter the context in any way will return to the saved context, where we're ready for the eventuality with a nice, user unfriendly panic(). The cleanup fixes at least the following bugs: o Entering mi_startup() with the RSE in enforced lazy mode. o Re-execution of ia64_init() in certain "lab" conditions. While here, add proper unwind directives to __start() so that the unwind knows it has reached the bottom of the (call) stack. Approved by: re@ (blanket)	2003-05-24 00:17:34 +00:00
Marcel Moolenaar	ca125f9c17	Fix a (new) source of instability: When interrupting a kernel context, we don't need to switch stacks (memory nor register). As such, we were also not restoring the register stack pointer (ar.bspstore). This, however, fails to be valid in 1 situation: when we interrupt a register stack switch as is being done in restorectx(). The problem is that restorectx() needs to have ar.bsp == ar.bspstore before it can assign the new value to ar.bspstore. This is achieved by doing a loadrs prior to assigning to ar.bspstore. If we take an interrupt in between the loadrs and the assignment and we don't make sure we restore the ar.bspstore prior to returning from the interrupt, we switch stacks with possibly non-zero dirty registers, which means that the new frame pointer (ar.bsp) will be invalid. So, instead of jumping over the restoration of the register frame pointer and related registers, we conditionalize it based on whether we return to kernel context or user context. A future performance tweak is possible by only restoring ar.bspstore when returning to kernel mode and when the RSE is in enforced lazy mode. One cannot assume ar.bsp == ar.bspstore if the RSE is not in enforced lazy mode anyway. While here (well, not quite) don't unconditionally assign to ar.bspstore in exception_save. Only do that when we actually switch stacks. It can only harm us to do it unconditionally. Approved by: re@ (blanket)	2003-05-23 23:55:31 +00:00
Marcel Moolenaar	42b919d4a6	In swapctx(), put the RSE in enforced lazy mode before we flush the register stack. There's nothing really wrong with flushing before putting the RSE in enforced lazy mode, provided you don't depend on ar.bspstore being equal to ar.bsp when the RSE has been put in enforced lazy mode. The small window between the flush and setting the RSE may be sufficient to have the RSE eagerly increase the dirty region (and hence cause ar.bspstore != ar.bsp) or have an interrupt that may even get the laziest RSE to do something. Anyway: we don't depend on ar.bspstore being equal to ar.bsp, so nothing was and is broken. But the code was non-intuitive and easily confuses. This is a source of future bugs. Note: the advantage of not depending on ar.bspstore is that there's some resilience against an interrupted flushrs. Clobbering is limited to stacked register contents only, not to RSE address clobbering. Approved: re@ (blanket)	2003-05-23 23:16:43 +00:00
Alan Cox	2e05d89828	Make the maximum number of vnodes a function of both the physical memory size and the kernel's heap size, specifically, vm_kmem_size. This function allows a maximum of 40% of the vm_kmem_size to be used for vnodes and vm objects. This is a conservative bound based upon recent problem reports. (In other words, a slight increase in this percentage may be safe.) Finally, machines with less than ~3GB of RAM should be unaffected by this change, i.e., the maximum number of vnodes should remain the same. If necessary, machines with 3GB or more of RAM can increase the maximum number of vnodes by increasing vm_kmem_size. Desired by: scottl Tested by: jake Approved by: re (rwatson,scottl)	2003-05-23 19:54:02 +00:00
Peter Wemm	d9cd1af4aa	Typo fix. oops. Submitted by: jmallett Approved by: re (blanket amd64/*)	2003-05-23 06:36:46 +00:00
Peter Wemm	cbd667fa2f	Update comments. Note that the kernel is at -1GB, not -2GB as erroneously implied by the previous commit. KVM is still only 1GB until pmap_growkernel() learns about the extra page table level. Approved by: re (blanket)	2003-05-23 06:35:45 +00:00
Peter Wemm	f229f5cf85	As suggested by the gdb folks, pad the 'struct fpreg' to a full 512 bytes to match the native fxsave/fxrstor object size since that's apparently what the Linux/NetBSD folks do.	2003-05-23 06:31:56 +00:00
Peter Wemm	637068b1d3	Low risk amd64 fix. Use a vm_offset_t for the virtual location of the buffer space instead of a u_int32_t. Otherwise the upper 32 bits of the address space get truncated and syscons blows up. Approved by: re (safe, low risk amd64 fixes)	2003-05-23 05:10:49 +00:00
Peter Wemm	9f0c4ab393	Deal with the user VM space expanding. 32 bit applications do not like having their stack at the 512GB mark. Give 4GB of user VM space for 32 bit apps. Note that this is significantly more than on i386 which gives only about 2.9GB of user VM to a process (1GB for kernel, plus page table pages which eat user VM space). Approved by: re (blanket)	2003-05-23 05:07:33 +00:00
Peter Wemm	3c9a3c9ca3	Major pmap rework to take advantage of the larger address space on amd64 systems. Of note: - Implement a direct mapped region using 2MB pages. This eliminates the need for temporary mappings when getting ptes. This supports up to 512GB of physical memory for now. This should be enough for a while. - Implement a 4-tier page table system. Most of the infrastructure is there for 128TB of userland virtual address space, but only 512GB is presently enabled due to a mystery bug somewhere. The design of this was heavily inspired by the alpha pmap.c. - The kernel is moved into the negative address space(!). - The kernel has 2GB of KVM available. - Provide a uma memory allocator to use the direct map region to take advantage of the 2MB TLBs. - Fixed some assumptions in the bus_space macros about the ability to fit virtual addresses in an 'int'. Notable missing things: - pmap_growkernel() should be able to grow to 512GB of KVM by expanding downwards below kernbase. The kernel must be at the top 2GB of the negative address space because of gcc code generation strategies. - need to fix the >512GB user vm code. Approved by: re (blanket)	2003-05-23 05:04:54 +00:00
Greg Lehey	74f2cc2c9c	Change the way the plex lock mutexes work. Previously they were part of the struct plex, which tore apart the mutex linked lists when the plex table was expanded. Now we maintain a pool of mutexes (currently 32) to be shared by all plexes. This is still a lot better than the splhigh() method used in other architectures. expand_table: Add parameters file and line if we're debugging. Approved by: re (jhb)	2003-05-23 01:15:55 +00:00
Greg Lehey	93573e2e76	Change the way the plex lock mutexes work. Previously they were part of the struct plex, which tore apart the mutex linked lists when the plex table was expanded. Now we maintain a pool of mutexes (currently 32) to be shared by all plexes. This is still a lot better than the splhigh() method used in other architectures. Add and clarify comments. Approved by: re (jhb)	2003-05-23 01:15:30 +00:00
Greg Lehey	7db14b2ff2	expand_table: Add parameters file and line if we're debugging. MMalloc, vinum_meminfo: Use strlcpy to copy file name. Approved by: re (jhb)	2003-05-23 01:15:01 +00:00
Greg Lehey	d026346c86	Change the way the plex lock mutexes work. Previously they were part of the struct plex, which tore apart the mutex linked lists when the plex table was expanded. Now we maintain a pool of mutexes (currently 32) to be shared by all plexes. This is still a lot better than the splhigh() method used in other architectures. Approved by: re (jhb)	2003-05-23 01:14:35 +00:00
Greg Lehey	8a697ff435	detachobject: Update volume config after detaching a plex. update_volume_config: Remove redundant diskconfig parameter. Approved by: re (jhb)	2003-05-23 01:14:13 +00:00
Greg Lehey	cb5eba5e09	Change the way the plex lock mutexes work. Previously they were part of the struct plex, which tore apart the mutex linked lists when the plex table was expanded. Now we maintain a pool of mutexes (currently 32) to be shared by all plexes. This is still a lot better than the splhigh() method used in other architectures. update_volume_config: Remove redundant diskconfig parameter. expand_table: Add parameters file and line if we're debugging. Approved by: re (jhb)	2003-05-23 01:13:43 +00:00
Greg Lehey	f7b76dc815	Change many strcpys to strlcpys, etc. Submitted by: Ted Unangst <tedu@stanford.edu> Correct some inaccurate and badly formatted comments. config_subdisk: If our drive is down, ensure that the subdisk is crashed. Previously it was possible for the subdisk to be up when the drive was down. Change the way the plex lock mutexes work. Previously they were part of the struct plex, which tore apart the mutex linked lists when the plex table was expanded. Now we maintain a pool of mutexes (currently 32) to be shared by all plexes. This is still a lot better than the splhigh() method used in other architectures. update_volume_config: Remove redundant diskconfig parameter. Approved by: re (jhb)	2003-05-23 01:13:10 +00:00
Peter Wemm	997f3bfc2a	Merge from i386/trap.c rev 1.252. Use td_critnest instead of the spinlocks count for explicitly enabling interrupts. Approved by: re (blanket)	2003-05-22 20:09:50 +00:00
Bernd Walter	cdc95e1bb8	Calculate routed interrupts using the slot number from the device and not that of the bridge. Approved by: re (jhb)	2003-05-22 17:45:26 +00:00
Mike Barcroft	6f9622a926	Fix two misuses of __BSD_VISIBLE. Submitted by: bde Approved by: re	2003-05-22 17:07:57 +00:00
Julian Elischer	faaa20f639	When we are spilling threads out of the run queue during panic, make sure we keep the thread state variable consistent with its real state. i.e. Don't say it's on the run queue when it isn't. Also clarify the associated comment. Turns a double panic back to a single panic :-/ Approved by: re@ (jhb)	2003-05-21 18:53:25 +00:00
Poul-Henning Kamp	67fd2837cd	Return ENXIO if the softc pointer is NULL, in all likelihood the disk is in the process of disappearing. Approved by: re/rwats*	2003-05-21 18:52:29 +00:00
Paul Saab	3284b9ee87	Make ciss usable under PAE Approved by: re (scottl)	2003-05-21 07:17:06 +00:00
Paul Saab	487a8c7e61	- Make this work with PAE. - Atomically load and clear the status block so we don't miss an update. Submitted by: jdp Approved by: re (scottl)	2003-05-21 07:00:49 +00:00
Nate Lawson	742d91f211	Quirk for Hitachi DVD USB drive. It returns "invalid field in cdb" for normal INQUIRY requests so enable the NO_INQUIRY quirk. Submitted by: Lars Eggert <larse@ISI.EDU> Approved by: re (scottl)	2003-05-21 00:22:07 +00:00
John Baldwin	7f4725bd09	The per-CPU spinlocks list is only maintained when WITNESS is enabled. Thus, treat all page faults while in a critical section as fatal rather than just those that occur with a non-empty spinlocks list. All such page faults are fatal anyways. Calling trap_fatal() earlier increases the chances of getting more useful panic messages and a possible DDB prompt. Approved by: re (scottl)	2003-05-20 20:50:33 +00:00
Nate Lawson	2f8f9581dd	Remove a redundant quirk. Instead, we wildcard all Asahi Optical chips. Approved by: re	2003-05-20 18:04:42 +00:00
Marcel Moolenaar	bfaccb767c	o Fix a definite bogon: the dirty bit fault, instruction access fault and data access fault install the PTE in question into the VHPT table. However, a post-increment was missing and we wrote the raw PTE data into the pagesize/access key field. This leaves a corrupt VHPT entry. o While here, remove the explicit cache purge. Insertion into the translation implicitly purges any overlapping entries. o Make sure there's a cycle break between the itc and the rfi. o Whitespace fixes.	2003-05-20 06:57:20 +00:00
Marcel Moolenaar	14d2ae56c7	Rename the "IA64 ITC" counter to "ITC" counter. We don't call the "TSC" counter on i386 "I386 TSC". Approved by: re@ (blanket)	2003-05-20 06:51:20 +00:00
Marcel Moolenaar	9b9ce577d4	Prevent corruption of the VHPT collision chain by protecting it with a mutex. The only volatile chain operations are insertion and deletion but since updating an existing PTE also updates the VHPT entry itself, and we have the VHPT mutex in both other cases, we also lock when we update an existing PTE even though no chain operation is involved. Note that we perform the insertion and deletion carefully enough that we don't need to lock traversals. If we need to lock traversals, we also need to lock from the exception handler, which we can't without creating a trapframe. We're now able to withstand a -j8 buildworld. More work is needed to withstand Murphy fields. In other words: we still have a bogon... Approved by: re@ (blanket)	2003-05-20 02:52:41 +00:00
Peter Wemm	62d8fb93d0	Deal with the possibility of negative available space from the file server to avoid Bad Things(TM) happening (eg: df crashing with a floating point exception). Submitted by: Harold Gutch <logix@foobar.franken.de> Approved by: re (scottl)	2003-05-19 22:35:00 +00:00
Peter Wemm	3830dc4629	Another x86-64 comment fixup Approved by: re (blanket amd64 stuff)	2003-05-19 22:19:02 +00:00
Peter Wemm	92f0cd89a0	s/x86_64/amd64/ in comments in header. Approved by: re (blanket amd64)	2003-05-19 22:15:30 +00:00
Alexander Kabaev	980ded9a7d	sys/sys/limits.h: - Fix visibility test for LONG_BIT and WORD_BIT. `#if defined(__FOO_VISIBLE)' is always wrong because __FOO_VISIBLE is always defined (to 0 for invisibility). sys/<arch>/include/limits.h sys/<arch>/include/_limits.h: - Style fixes. Submitted by: bde Reviewed by: bsdmike Approved by: re (scottl)	2003-05-19 20:29:07 +00:00
Søren Schmidt	e1750fb855	Print the right position on disk errors Approved by: re@	2003-05-19 13:43:12 +00:00
Søren Schmidt	c9f5649b3e	Unbork the chip locating code. Approved by: re@	2003-05-19 13:42:23 +00:00
Marcel Moolenaar	b8c4149cff	Turn pmap_install_pte() into a critical section. We better not get interrupted while writing into the VHPT table. While here, make sure memory accesses are properly ordered. Tag invalidation must happen first so that the hardware VHPT walker will not be able to match this entry while we're updating it and we have to make sure the new tag gets written only after the PTE is completely updated. Approved by: re (blanket)	2003-05-19 08:02:36 +00:00
Marcel Moolenaar	a75b99ea2d	Unconditionally set pcb_current_pmap. WIP versions of the code previously committed cleared pcb_current_pmap prior to changing the region registers, but that was removed before committing. Since we don't normally (at all?) pass a NULL pointer, the bug was mostly harmless. Fix it while I'm here... I'm here because we need to have data serialization after writing to the region registers. Not doing so was likely the cause of the hangs we were experiencing. General exceptions in cpu_switch may also be caused by the lack of serialization. Approved by: re (blanket)	2003-05-19 06:05:30 +00:00
Marcel Moolenaar	dc0bde0f18	pmap_install() needs to be atomic WRT context switching. Protect switching user regions (region 0-4) with schedlock. Avoid unnecessary recursion on schedlock by moving the core functionality to another function (pmap_switch()) where we assert schedlock is held. Turn pmap_install() into a wrapper that grabs schedlock. This minimizes the number of callsites that need to be changed. Since we already have schedlock in cpu_switch() and cpu_throw(), have them call pmap_switch() directly. These were also the only two calls to pmap_install() outside pmap.c, so make pmap_install() static and remove its prototype from pmap.h. Approved by: re (blanket)	2003-05-19 04:16:30 +00:00
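
The intr_disable()/intr_restore() contract described in commit a7b90d80fc can be illustrated with a small user-space model. This is only a sketch under stated assumptions: sim_psr_i, sim_intr_disable() and sim_intr_restore() are hypothetical stand-ins, and a plain integer replaces the real PSR.i bit, so none of this is the actual ia64 code.

```c
/* Hypothetical model: 1 means interrupts enabled, 0 means disabled. */
int sim_psr_i = 1;

/*
 * Disable interrupts and report the previous state:
 * 1 if they were enabled, 0 otherwise.
 */
int sim_intr_disable(void)
{
    int was_enabled = (sim_psr_i != 0) ? 1 : 0;

    sim_psr_i = 0;
    return was_enabled;
}

/*
 * Re-enable interrupts only when the saved state is non-zero.
 * A zero argument is a no-op, so this call can never be used
 * to disable interrupts.
 */
void sim_intr_restore(int was_enabled)
{
    if (was_enabled != 0)
        sim_psr_i = 1;
}
```

With this shape, nested disable/restore pairs behave as expected: the inner restore receives 0 and leaves interrupts off, and only the outermost restore turns them back on.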

39971 Commits
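
The visibility-test pitfall noted in commit 980ded9a7d can be reproduced with the preprocessor alone. The sketch below is an assumption-laden illustration: DEMO_VISIBLE is a hypothetical stand-in for a __FOO_VISIBLE-style macro, and the helper names are invented for the example.

```c
/* Hypothetical visibility macro: always defined, 0 means "hidden". */
#define DEMO_VISIBLE 0

/* Wrong test: defined() is true even when the value is 0. */
#if defined(DEMO_VISIBLE)
#define WRONG_TEST_EXPOSES 1
#else
#define WRONG_TEST_EXPOSES 0
#endif

/* Right test: check the value of the macro, not its existence. */
#if DEMO_VISIBLE
#define RIGHT_TEST_EXPOSES 1
#else
#define RIGHT_TEST_EXPOSES 0
#endif

/* Report whether each style of test would expose the guarded names. */
int wrong_test_exposes(void) { return WRONG_TEST_EXPOSES; }
int right_test_exposes(void) { return RIGHT_TEST_EXPOSES; }
```

Because the macro is defined as 0 rather than left undefined, only the value-based test keeps the guarded declarations hidden.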