and ndis_halt_nic(). It's been disabled for some time anyway, and
it turns out there's a possible deadlock in NdisMInitializeTimer() when
acquiring the miniport block lock to modify the timer list: it's
possible for a driver to call NdisMInitializeTimer() when the miniport
block lock has already been acquired by an earlier piece of code. You
can't acquire the same spinlock twice, so this can deadlock.
Also, implement MmMapIoSpace() and MmUnmapIoSpace(), and make
NdisMMapIoSpace() and NdisMUnmapIoSpace() use them. There are some
drivers that want MmMapIoSpace() and MmUnmapIoSpace() so that they can
map arbitrary register spaces not directly associated with their
device resources. For example, there's an Atheros driver for
a miniPci card (0x168C:0x1014) on the IBM Thinkpad x40 that wants
to map some I/O spaces at 0xF00000 and 0xE00000 which are held by
the acpi0 device. I don't know what it wants these ranges for,
but if it can't map and access them, the MiniportInitialize() method
fails.
o Oxford Semiconductor PCI Dual Port Serial
o Netmos Nm9845 PCI Bridge with Dual UART
Add PCI IDs for single-port cards:
o Various SIIG Cyber Serial
o Oxford Semiconductor OXCB950 UART
Update description as per puc(4).
immediately, back off to the next higher Cx sleep state. Some machines
with a Via chipset report a valid C3 but a register read doesn't actually
halt the CPU. This would cause the machine to appear unresponsive as it
repeatedly called cpu_idle() which immediately returned. Causing interrupts
(i.e. by pressing the power button) would cause the system to make forward
progress, showing that it wasn't actually hung.
Also, enable interrupts a little earlier. We don't need them disabled
to calculate the delta time for the read.
Reported by: silby
MFC after: 2 weeks
and increase flexibility to allow various different approaches to be tried
in the future.
- Split struct ithd up into two pieces. struct intr_event holds the list
of interrupt handlers associated with interrupt sources.
struct intr_thread contains the data relative to an interrupt thread.
Currently we still provide a 1:1 relationship of events to threads
with the exception that events only have an associated thread if there
is at least one threaded interrupt handler attached to the event. This
means that on x86 we no longer have 4 bazillion interrupt threads with
no handlers. It also means that interrupt events with only INTR_FAST
handlers no longer have an associated thread either.
- Renamed struct intrhand to struct intr_handler to follow the struct
intr_foo naming convention. This did require renaming the powerpc
MD struct intr_handler to struct ppc_intr_handler.
- INTR_FAST no longer implies INTR_EXCL on all architectures except for
powerpc. This means that multiple INTR_FAST handlers can attach to the
same interrupt and that INTR_FAST and non-INTR_FAST handlers can attach
to the same interrupt. Sharing INTR_FAST handlers may not always be
desirable, but having sio(4) and uhci(4) fight over an IRQ isn't fun
either. Drivers can always still use INTR_EXCL to ask for an interrupt
exclusively. The way this sharing works is that when an interrupt
comes in, all the INTR_FAST handlers are executed first, and if any
threaded handlers exist, the interrupt thread is scheduled afterwards.
This type of layout also makes it possible to investigate using interrupt
filters ala OS X where the filter determines whether or not its companion
threaded handler should run.
- Aside from the INTR_FAST changes above, the impact on MD interrupt code
is mostly just 's/ithread/intr_event/'.
- A new MI ddb command 'show intrs' walks the list of interrupt events
dumping their state. It also has a '/v' verbose switch which dumps
info about all of the handlers attached to each event.
- We currently don't destroy an interrupt thread when the last threaded
handler is removed because it would suck for things like ppbus(8)'s
braindead behavior. The code is present, though, it is just under
#if 0 for now.
- Move the code to actually execute the threaded handlers for an interrrupt
event into a separate function so that ithread_loop() becomes more
readable. Previously this code was all in the middle of ithread_loop()
and indented halfway across the screen.
- Made struct intr_thread private to kern_intr.c and replaced td_ithd
with a thread private flag TDP_ITHREAD.
- In statclock, check curthread against idlethread directly rather than
curthread's proc against idlethread's proc. (Not really related to intr
changes)
Tested on: alpha, amd64, i386, sparc64
Tested on: arm, ia64 (older version of patch by cognet and marcel)
to the actual dates when code actually changed. Also add special case
link state change handling for RELENG_5, which doesn't have
if_link_state_change(). No actual operational changes are done.
devices can be opened multiple times simultaneously but we're
expected to be able to do so by the rest of the loader.
This fixes booting from disks attached to the on-board SCSI
controller of Sun Ultra 1 (previously this triggered a trap)
and probably also of AX1115 boards.
- While here, remove unused variables and add empty lines where
style(9) requires such.
Tested on: powerpc (grehan), sparc64
MFC after: 1 month
Try to make everyone happy: David (to have debug kernels installed
by default), Warner (to be able to override that), and myself (for
actually making it all work and to be consistent).
Now, if kernel was configured for debugging (through DEBUG=-g in
the kernel config file or "config -g"), doing "make install" will
install debug versions of kernel and module objects with their
canonical names,
kernel.debug -> /boot/kernel/kernel
if_fxp.ko.debug -> /boot/kernel/if_fxp.ko
Installing a kernel not configured for debugging, or debug kernel
with INSTALL_NODEBUG variable defined, will install non-debug
kernel and module objects.
Also, restore the install.debug and reinstall.debug targets that
are part of the existing API (they cause some additional gdb(1)
scripts to be installed).
of the PCIR_HDRTYPE register. It's the value returned from this
read access that determines whether or not we decide a device is
present at the current slot index. For some reason that I can't
adequately explain, this read fails on my machine when probing the
USB controller on my machine (which happens a multifunction device
at slot index 3 hung off the PCI-PCI bridge on the AMD8111 (bus
index 1)). The read will return 0xFF even though it should return
0x80 to indicate the presence of a multifunction device.
As near as I can tell, there's some timing issue involved with reading
the 'dead' slot indexes 0 through 2 that causes the read of the actual
device at slot 3 to fail. I tried a couple of different tricks to
correct the problem (the patch to amd64/pci/pci_cfgreg.c fixes it
for the amd64 arch), but adding this delay is the only thing that
always allows the USB controllers to be correctly probed 100% of the
time. Whatever the problem is, it's likely confined to the AMD8111
chipset. However, a simple 1us delay is fairly harmless and should
have no side effects for other hardware. I consider this to be
voodoo, but it's fairly benign voodoo and it makes my USB keyboard
and mouse work again.
Note that this is the second time that I've had to resort to a
1us delay to fix a PCI-related problem with this AMD8111/Opteron
system (the first being a fix I made a while back to the NDISulator).
It's possible the delay really belongs in the cfgreg code itself,
or that pci_cfgreg needs some custom hackery for an errata in the
8111. (I checked but couldn't find any documented errata on AMD's
site that could account for these problems.)
other OSes (Solaris, Linux, VxWorks). It's not necessary to write a 0
to the config address register when using config mechanism 1 to turn
off config access. In fact, it can be downright troublesome, since it
seems to confuse the PCI-PCI bridge in the AMD8111 chipset and cause
it to sporadically botch reads from some devices. This is the cause
of the missing USP ports problem I was experiencing with my Sun Opteron
system.
Also correct the case for mechanism 2: it's only necessary to write
a 0 to the ENABLE port.
- Move hardware counter reading/zeroing to hme_tick(). This saves
8 register access per interrupt. [1]
- Use imax macro for getting max. argument between two integers.
- Invoke bus_dmamap_sync(9) first before freeing mbuf.
- Check driver queue first to reduce locking operation in hme_start_locked()
and interrupt handler.
- Simplyfy watchdog timer setup in interrupt handler.
- Don't log normal errors such as RX overrun. If we have DMA stuck
condition, reinitialize the driver and log it.
Reviewed by: marius
Obtained from: OpenBSD [1]
confused with the Credit Card Adapter II and its spawn (which the xe
driver supports). These changes get my card probing and attaching. I
recently won one of these (and a NEC rebadged version) in an lot
auction. The NEC didn't work, so I took it apart and found the
MB86960A chip and then modified if_fe_pccard.c to attach. I can't
test this card further since I have no dongle for this card.
be installed. It should have been optional to install a non-debug
one, just like it was formerly optional to install a debug one. In
order to do that, most of 1.84 had to go.
Instead, make installing the debug kernel the default, but create a
new option INSTALL_NODEBUG for those people that have small /
partitions and good source control habits.
This preserves the behavior of 1.84 while allowing it to be overriden
for people (like me) that do not have the time to upgrade to get a
bigger / and also don't have time for stupid makefile tricks when
upgrading their older system, but still want a kernel.debug around if
things go south.
IPI_STOP IPIs.
- Change the i386 and amd64 MD IPI code to send an NMI if STOP_NMI is
enabled if an attempt is made to send an IPI_STOP IPI. If the kernel
option is enabled, there is also a sysctl to change the behavior at
runtime (debug.stop_cpus_with_nmi which defaults to enabled). This
includes removing stop_cpus_nmi() and making ipi_nmi_selected() a
private function for i386 and amd64.
- Fix ipi_all(), ipi_all_but_self(), and ipi_self() on i386 and amd64 to
properly handle bitmapped IPIs as well as IPI_STOP IPIs when STOP_NMI is
enabled.
- Fix ipi_nmi_handler() to execute the restart function on the first CPU
that is restarted making use of atomic_readandclear() rather than
assuming that the BSP is always included in the set of restarted CPUs.
Also, the NMI handler didn't clear the function pointer meaning that
subsequent stop and restarts could execute the function again.
- Define a new macro HAVE_STOPPEDPCBS on i386 and amd64 to control the use
of stoppedpcbs[] and always enable it for i386 and amd64 instead of
being dependent on KDB_STOP_NMI. It works fine in both the NMI and
non-NMI cases.
still works. Also, this is consistent with 'show pcpu' vs
'show allpcpu'. (And 'show allstacks' on OS X for that matter.)
- Add 'bt' as an alias for 'trace'. We already have a 'where' alias as
well, so this makes it easier for gdb-wired hands to work in ddb.
Ok'd by: rwatson (1)
Requested by: scottl (2)
MFC after: 1 day
is on AC power (i.e. not a laptop). This allows power_profile to run once
for desktop systems as well, for instance, to set C3 or CPU frequency.
MFC after: 2 weeks
promisc flag from the member interface, this is a no-op anyway since the
interface is disappearing. The driver may have already released
its resources such as miibus and this is likely to panic the kernel.
Submitted and tested by: Wojciech A. Koszek
MFC after: 2 weeks
from being reclaimed before it was wired. Use pmap_extract_and_hold()
instead of pmap_extract() and retain the hold on the page until it has been
wired.
clock are supported. I have plan to merge XSI timer ITIMER_REAL and other
two CPU timers into the new code, current three slots are available for
the XSI timers.
The SIGEV_THREAD notification type is not supported yet because our
sigevent struct lacks of two member fields:
sigev_notify_function
sigev_notify_attributes
I have found the sigevent is used in AIO, so I won't add the two members
unless the AIO code is adjusted.
2. Introduce flags KSI_EXT and KSI_INS. The flag KSI_EXT allows a ksiginfo
to be managed by outside code, the KSI_INS indicates sigqueue_add should
directly insert passed ksiginfo into queue other than copy it.
OR the flags bits with the driver managed status flags. This fixes an
issue where RUNNING flags would not be reported to processes, which
conflicts with the flags information provided by ifconfig(8).
support the CM-battery interface. Smart batteries can eventually be
supported without ACPI via a separate SMBus interface. The ACPI interface
uses the embedded controller for reading/writing to the SMBus, and normal
ASL definitions for locating the battery controller (since SMBus can't be
enumerated.) Also import definitions for the smart battery interface.
This was written by Hans Petter Selasky with minor cleanups from myself.
Submitted by: Hans Petter Selasky <hselasky / c2i.net>
* Use ACPI_BATT_UNKNOWN instead of constants
* Use maxunit instead of a count of devices since we may have sparse
battery devices in the future. Only userland should be using unit
numbers anyway, so provide a translation function. (Kernel use of
batteries should be restricted to looking up a device_t and calling
methods directly.
* Don't check acpi_BatteryIsPresent() in acpi_battery. Leave it up to
the hardware-specific driver (i.e. cmbat) since smart batteries seem
to not report the "battery present" flag.
* Convert mA to mW if the battery uses those units. CM-batteries only
used mW so this deficiency went unnoticed.
* Clean strings reported in the battery info from any control chars.
* Only dereference the unit from ioctl_arg if the full struct is present.
Unit wouldn't have been used later if it wasn't present but this is
cleaner. Translate the unit if it's not ACPI_BATTERY_ALL_UNITS.
* bzero structs before returning them to usermode for future compat.
Most of this work was submitted by Hans Petter Selasky and then majorly
reworked by myself.
Submitted by: Hans Petter Selasky <hselasky / c2i.net>
vm_object_backing_scan() was not written to handle. Specifically, a wired
page within a backing object that is shadowed by a page within the shadow
object. Handle this state by removing the wired page from the backing
object. The wired page will be freed by socow_iodone().
Stop masking errors: If a page is being freed by vm_object_backing_scan(),
assert that it is no longer mapped rather than quietly destroying any
mappings.
Tested by: Harald Schmalzbauer
too. This fixes problem when connected prefixes overlap.
Obtained from: OpenBSD (rev. 1.40 by claudio);
[ I came to this fix myself, and then found out that
OpenBSD had already fixed it the same way.]
attach routine. Go ahead and ask for it in the probe routine and be
just as wrong as all the other cards that ask for it there...
# this gets the RTL8019 on a SBC at work fully functional. 6.0 still treats
# the 8019 as a generic NE-2000, so these changes aren't relevant there.
This avoids the need for sched_bind() in the default case so that you
can start up the NDIS subsystem at boot time when only CPU 0 is running.
There are potentially ways to fix it so that the DPC threads aren't
started until after the other CPUs are launched, but doing it correctly
is tricky. You need to defer the startup of the ntoskrnl subsystem
(ntoskrnl_libinit()), not just defer ndis_attach().
For now, I don't think it will make much difference having just the
single DPC thread (I started out with just one anyway). Note that this
turns the KeSetTargetProcessorDpc() routine into a no-op, since the
CPU number in struct kdpc is now ignored.
ed_probe_rtl80x9. In the pci case we call ed_probe_rtl80x9 first. In
the PCI case we were using the correct nic_offset by accident because
softc is initialized to zero. In the isa case we were using the wrong
value by accident, since ed_probe_WD80x3 sets the offset value to
0x10. This lead to the identification routines failing. Fix this
problem by always initalizing the nic_offset and asic_offset before
making ed_{asic,nic}_{in,out}* calls.
get a new pv under high system load where the available pv entries
have been exhausted before the pagedaemon has a chance to wake up
to reclaim some.
Prior to this, the NULL pointer dereference ended up causing
secondary panics with rather less than useful resulting tracebacks.
Reviewed by: alc, jhb
MFC after: 1 week
in reiserfs_lookup() that was used to fix an actual race in
ufs_lookup.c:1.78. This is not currently a hazard, but the
bug would be activated by marking reiserfs as MPSAFE.
Reviewed by: mux (mentor)
MFC after: 2 weeks
is KeRaiseIrql(newirql, &oldirql), not oldirql = KeRaiseIrql(newirql).
(The macro ultimately translates to KfRaiseIrql() which does use
the latter API, so this has no effect on generated code.)
Also, wait for thread termination the right way: kthread_exit()
will ultimately do a wakeup(td->td_proc). This is the event we
should wait on. Eliminate the previous synchronization machinery
for this since it was never guaranteed to work correctly.
to (max block - 1) * bsize. For DEV_BSIZE, this doubles the limit from
0.5 TB to 1 TB. For the old 4.4 FFS case, decrease the limit from 0.5 TB
to 2 GB - 1. Older systems had a 32 bit off_t so they couldn't access the
larger files anyway.
Collaboration with: bde
processor, to insure DPC thread 0 runs on CPU0, DPC thread 1 runs on
CPU1, and so on.
Elevate the priority of the workitem threads, though don't use as
high a priority as the DPC threads.
available kernel malloc types. Quite useful for post-mortem debugging of
memory leaks without a dump device configured on a panicked box.
MFC after: 2 weeks
- Destroy mutex in case of attach failure. [1]
- Lock properly em_watchdog(). [1]
- Lock properly em_sysctl_int_delay(). [1]
- Remove unused global adapter linked list.
- Remove unused dma_size field from struct em_dma_alloc.
- Do not touch interface statistics, that must be edited
only by upper layers. [1]
Submitted by: yongari [1]
o Do not mask the RX overrun interrupt.
o Rewrite em_intr():
- Axe EM_MAX_INTR.
- Cycle acknowledging interrupts and processing
packets until zero interrupt cause register is
read.
- If RX overrun comes in log this fact. [ NetBSD also
resets adapter in this case, but my tests showed that
this is not needed and only pessimizes behavior under
heavy load. ]
- Since almost all functions is rewritten, style the
remaining lines.
This fixes em(4) interfaces wedging under high load.
In collaboration with: wpaul, cognet
Obtained from: NetBSD
to unload the usb.ko module after boot if it was originally preloaded
from "/boot/loader.conf". When processing preloaded modules, the
linker erroneously added self-dependencies the each module's reference
count. That prevented usb.ko's reference count from ever going to 0,
so it could not be unloaded.
Sponsored by Isilon Systems.
Reviewed by: pjd, peter
MFC after: 1 week
- disable IPv6 operation if DAD fails for some EUI-64 link-local addresses.
- export get_hw_ifid() (and rename it) as a subroutine for this process.
Obtained from: KAME
Reviewd by: ume, gnn
MFC after: 2 week
Fix a debugging printf to printf after a variable is first assigned,
not before.
These are purely build fixes, and need inspection to make sure they
were what the original author of the previous changes intended.
flag. This fixes panic, when 'ifconfig em0 down' was called and it calls
em_stop() while the em_process_receive_interrupts() has temporarily
dropped the lock.
- fixed typos
- improved some comment descriptions
- use NULL, instead of 0, to denote a NULL pointer
- avoid embedding a magic number in the code
- use nd6log() instead of log() to record NDP-specific logs
- nuked an unnecessay white space
Obtained from: KAME
MFC after: 1 day
sampling rate:
- Improve vchan chn_setspeed() strategy. Try to avoid FEEDER_RATE
on parent channel if the requested value is not supported
by the hardware.
- Fix vchan default speed calculation. In any case, vchan should
rely on parent bufsoft speed instead of bufhard since it is
possible that the entire feeder chain might involve FEEDER_RATE.
This is possible under extreme, rare condition if the above
chn_setspeed() strategy failed.
Approved by: netchild (mentor)
- Change ndis_return() from a DPC to a workitem so that it doesn't
run at DISPATCH_LEVEL (with the dispatcher lock held).
- In if_ndis.c, submit packets to the stack via (*ifp->if_input)() in
a workitem instead of doing it directly in ndis_rxeof(), because
ndis_rxeof() runs in a DPC, and hence at DISPATCH_LEVEL. This
implies that the 'dispatch level' mutex for the current CPU is
being held, and we don't want to call if_input while holding
any locks.
- Reimplement IoConnectInterrupt()/IoDisconnectInterrupt(). The original
approach I used to track down the interrupt resource (by scanning
the device tree starting at the nexus) is prone to problems when
two devices share an interrupt. (E.g removing ndis1 might disable
interrupts for ndis0.) The new approach is to multiplex all the
NDIS interrupts through a common internal dispatcher (ntoskrnl_intr())
and allow IoConnectInterrupt()/IoDisconnectInterrupt() to add or
remove interrupts from the dispatch list.
- Implement KeAcquireInterruptSpinLock() and KeReleaseInterruptSpinLock().
- Change the DPC and workitem threads to use the KeXXXSpinLock
API instead of mtx_lock_spin()/mtx_unlock_spin().
- Simplify the NdisXXXPacket routines by creating an actual
packet pool structure and using the InterlockedSList routines
to manage the packet queue.
- Only honor the value returned by OID_GEN_MAXIMUM_SEND_PACKETS
for serialized drivers. For deserialized drivers, we now create
a packet array of 64 entries. (The Microsoft DDK documentation
says that for deserialized miniports, OID_GEN_MAXIMUM_SEND_PACKETS
is ignored, and the driver for the Marvell 8335 chip, which is
a deserialized miniport, returns 1 when queried.)
- Clean up timer handling in subr_ntoskrnl.
- Add the following conditional debugging code:
NTOSKRNL_DEBUG_TIMERS - add debugging and stats for timers
NDIS_DEBUG_PACKETS - add extra sanity checking for NdisXXXPacket API
NTOSKRNL_DEBUG_SPINLOCKS - add test for spinning too long
- In kern_ndis.c, always start the HAL first and shut it down last,
since Windows spinlocks depend on it. Ntoskrnl should similarly be
started second and shut down next to last.
called during early init before cninit().
Tested on: i386, alpha, sparc64
Reviewed by: phk, imp
Reported by: Divacky Roman xdivac02 at stud dot fit dot vutbr dot cz
MFC after: 1 week
the descriptors set.
- In em_process_receive_interrupts(), call bus_dmamap_sync() for the
descriptors set each time we modify one descriptor, instead of doing it only
at the function exit, to make sure the adapters know he can re-use the
descriptor.
This helps on arm with write-back data cache (and possibly on other arches
with bounce pages, I don't know) under heavy network load. Without this,
if we attempt to process more than num_rx_desc descriptors, the adapter
would just stop processing rx interrupts.
While here, support up to four sections because it was trivial to do
and cheap. (One pointer per section).
For amd64 with "-fpic -shared" format .ko files, using a single PT_LOAD
section is important to avoid wasting about 1MB of KVM and physical ram
for the 'gap' between the two PT_LOAD sections. amd64 normally uses
.o format kld files and isn't affected normally. But -fpic -shared modules
are actually possible to produce and load... (And with a bugfix to
binutils, we can build and use plain -shared .ko files without -fpic)
i386 only wastes 4K per .ko file, so that isn't such a big deal there.
and fs.base. We always update pcb.pcb_gsbase and pcb.pcb_fsbase
when user wants to set them, in context switch routine, we only need to
write them into registers, we never have to read them out from registers
when thread is switched away. Since rdmsr is a serialization instruction,
micro benchmark shows it is worthy to do.
Reviewed by: peter, jhb
the former is the ISA part, not the latter.
MFC After 6.0 is unfrozen (this bug doesn't exist in 6.0 because I didn't
MFC the rtl80x9 changes for ISA due to an error on my part)
cache_lookup() has returned a ref'ed and locked vnode since
vfs_cache.c:1.96, dated Tue Mar 29 12:59:06 2005 UTC. This change
is similar to the change made to smbfs_lookup() in smbfs_vnops.c:1.58.
Tested by: "Antony Mawer" ant AT mawer.org
MFC after: 2 weeks
I benchmarked this by simultaneously extracting 4 large tarballs (basically
world images) on a 4-processor AMD64 system, in a malloc-backed md.
With this patch, system time was reduced by 43%, and wall clock time by 33%.
Submitted by: jeff
MFC after: 1 week
cd9660_lookup() that was used to fix an actual race in ufs_lookup.c:1.78.
This is not currently a hazard, but the bug would be activated by
marking cd9660 as MPSAFE.
Requested by: bde
ext2_lookup() that was used to fix an actual race in ufs_lookup.c:1.78.
This is not currently a hazard, but the bug would be activated by
marking ext2fs as MPSAFE.
Requested by: bde
MFC after: 2 weeks
to the 100/1000 BCM5400 phy. This fixes the problem with
the GEM port not syncing up on Sawtooth G4's.
Obtained from: NetBSD
Reported by: Ben Rosengart <ben + freebsd org at narcissus net>
so we are ready for mpsafevfs=1 by default on sparc64 too. I have been
running this on all my sparc64 machines for over 6 months, and have not
encountered MD problems.
MFC after: 1 week
the kernel by wrapping all targets for fake opt_*.h files in
.if defined(KERNBUILDDIR). Thus, such fake files won't be
created at all if modules are built with the kernel.
Some modules undergo cleanup like removing unused or unneeded
options or .h files, without which they wouldn't build this way
or the other.
Reviewed by: ru
Tested by: no binary changes in modules built alone
Tested on: i386 sparc64 amd64
provided in the kernel build directory, fix modules that were
failing to build this way due to not quite correct kernel option
usage. In particular:
ng_mppc.c uses two complementary options, both of which are listed
in sys/conf/files. Ideally, there should be a separate option for
including ng_mppc.c in kernel build, but now only
NETGRAPH_MPPC_ENCRYPTION is usable anyway, the other one requires
proprietary files.
nwfs and smbfs were trying to ensure they were built with proper
network components, but the check was rather questionable.
Discussed with: ru
- Add newer CPUID definitions for future use.
Many thanks to Mike Tancsa <mike at sentex dot net> for providing test
cases for Intel Pentium D and AMD Athlon 64 X2.
Approved by: anholt (mentor)
case by saving the value of dp->i_ino before unlocking the vnode
for the current directory and passing the saved value to VFS_VGET().
Without this change, another thread can overwrite dp->i_ino after
the current directory is unlocked, causing ufs_lookup() to lock
and return the wrong vnode in place of the vnode for its parent
directory. A deadlock can occur if dp->i_ino was changed to a
subdirectory of the current directory because the root to leaf vnode
lock ordering will be violated. A vnode lock can be leaked if
dp->i_ino was changed to point to the current directory, which
causes the current vnode lock for the current directory to be
recursed, which confuses lookup() into calling vrele() when it
should be calling vput().
The probability of this bug being triggered seems to be quite low
unless the sysctl variable debug.vfscache is set to 0.
Reviewed by: jhb
MFC after: 2 weeks
amd64, and is a factor of 3 less than the value previously auto-sized on
a 12GB machine, which would cause an overflow in calculations involving the
maxbcache int, causing bufinit() to loop forever at boot.
Reviewed by: mlaier, peter
correspond to the commit log. It changed the maxswzone and maxbcache
parameters from int to long, without changing the extern definitions
in <sys/buf.h>.
In fact it's a good thing it did not, because other parts of the system
are not yet ready for this, and on large-memory sparc machines it causes
severe filesystem damage if you try.
The worst effect of the change was that the tunables controlling the
above variables stopped working. These were necessary to allow such
large sparc64 machines (with >12GB RAM) to boot, since sparc64 did not
set a hard-coded upper limit on these parameters and they ended
up overflowing an int, causing an infinite loop at boot in bufinit().
Reviewed by: mlaier
cards and teach the re(4) driver to attach to revision 3 cards.
Submitted by: Fredrik Lindberg fli+freebsd-current at shapeshifter dot se
MFC after: 2 weeks
Reviewed by: imp, mdodd
originally wrote it for 4.x and hasn't really had the time to fully update
it to 5.x and later. Also, the author doesn't use the hardware anymore as
well. If someone does need this driver they can always resurrect it from
the Attic.
Requested by: Frank Mayhar frank at exit dot com
the modified memory rather than using register operands that held a pointer
to the memory. The biggest effect is that we now correctly tell the
compiler that these functions change the memory that these functions
modify.
Reviewed by: cognet
do not support the GETINFO immediate command, unlike just about every other
variant of the hardware. Also document some magic values and fix some minor
nearby whitespace.
MFC After: 3 days
changes in MD code are trivial, before this change, trapsignal and
sendsig use discrete parameters, now they uses member fields of
ksiginfo_t structure. For sendsig, this change allows us to pass
POSIX realtime signal value to user code.
2. Remove cpu_thread_siginfo, it is no longer needed because we now always
generate ksiginfo_t data and feed it to libpthread.
3. Add p_sigqueue to proc structure to hold shared signals which were
blocked by all threads in the proc.
4. Add td_sigqueue to thread structure to hold all signals delivered to
thread.
5. i386 and amd64 now return POSIX standard si_code, other arches will
be fixed.
6. In this sigqueue implementation, pending signal set is kept as before,
an extra siginfo list holds additional siginfo_t data for signals.
kernel code uses psignal() still behavior as before, it won't be failed
even under memory pressure, only exception is when deleting a signal,
we should call sigqueue_delete to remove signal from sigqueue but
not SIGDELSET. Current there is no kernel code will deliver a signal
with additional data, so kernel should be as stable as before,
a ksiginfo can carry more information, for example, allow signal to
be delivered but throw away siginfo data if memory is not enough.
SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can
not be caught or masked.
The sigqueue() syscall allows user code to queue a signal to target
process, if resource is unavailable, EAGAIN will be returned as
specification said.
Just before thread exits, signal queue memory will be freed by
sigqueue_flush.
Current, all signals are allowed to be queued, not only realtime signals.
Earlier patch reviewed by: jhb, deischen
Tested on: i386, amd64
The receive function em_process_receive_interrupts() unlocks the
adapter while ether_input() processes the packet, and then locks
it back. In the meantime, em_init() may be called, either from
em_watchdog() from softclock interrupt or from the ifconfig(8)
program. The em_init() resets the card, in particular it sets
adapter->next_rx_desc_to_check to 0 and resets hardware RX Head
and Tail descriptor pointers. The loop in
em_process_receive_interrupts() does not expect these things to
change, and a mess may result.
This fixes long wedges of em(4) interfaces receive part under high
load and IP fastforwarding enabled.
PR: kern/87418
Submitted by: Dmitrij Tejblum <tejblum yandex-team.ru>
that the following funtions are not used, wrap in '#ifdef noused' for the
moment.
bstp_enable_change_detection
bstp_disable_change_detection
bstp_set_bridge_priority
bstp_set_port_priority
bstp_set_path_cost
the ExCA spec, and close cousins:
o Write an activate routine that works.
o merge a couple of items from oldcard before they are lost
o write a deactivate routine
I suspect we're still a ways away from having this work, but maybe for
6.1/5.5?
- move the function pointer definitions to if_bridgevar.h
- move most of the logic to the new BRIDGE_INPUT and BRIDGE_OUTPUT macros
- remove unneeded functions from if_bridgevar.h and sort a little.
previous revision only restored the MP optimization.
Describe the optimization strategy for TLB invalidations in a comment.
Reviewed by: ups@
MFC after: 3 days
Use bridge_ifdetach() to notify the bridge that a member has been detached. The
bridge can then remove it from its interface list and not try to send out via a
dead pointer.
as a Novell NE-2000. This is necessary for unpatched qemu working
correctly. qemu claims to be a RTL8029, but doesn't implement the
RTL8029 specific registers at this time. I've created patches for
that, but there's no reason we can't use qemu's emulation w/o these
patches. This should make life easier for those folks that boot
FreeBSD via qemu.
ed_probe_generic8390 where we're calling it. It will be done as part
of ed_probe_Novel_generic after things are setup in a way that
ed_probe_generic8390 will grok.
o Fix operator precedence botch that causes a panic when setting the media
type for 10baseT connections.
o Save the type of device so that it prints with the rest of the probe.
# this should make it work with qemu again, but only if it has my patches
# to actually implement the RTL8029 specific registers.
- Use device_printf() and if_printf() and remove nge_unit.
- Use callout_init_mtx() and remove nge_tick_locked() as nge_tick() is now
always called with the driver lock held.
- Use M_ZERO to contigmalloc() when allocating nge_ldata. It was possible
for the random garbage to be used in certain cases otherwise.
- Cleanup attach error handling including no longer leaking nge_ldata.
- Add locking to the ifmedia callouts.
- Lock accesses to if_hwassist and if_capenable in nge_ioctl().
Submitted by: Yuriy N. Shkandybin jura at networks dot ru (1, 3, 4)
Tested by: Yuriy N. Shkandybin jura at networks dot ru
MFC after: 3 days
cloner. This ensures that ifc->ifc_units is not prematurely freed in
if_clone_detach() before the clones are destroyed, resulting in memory modified
after free. This could be triggered with if_vlan.
Assert that all cloners have been destroyed when freeing the memory.
Change all simple cloners to destroy their clones with ifc_simple_destroy() on
module unload so the reference count is properly updated. This also cleans up
the interface destroy routines and allows future optimisation.
Discussed with: brooks, pjd, -current
Reviewed by: brooks
notifications when LIO operations completed. These were the problems
with LIO event complete notification:
- Move all LIO/AIO event notification into one general function
so we don't have bugs in different data paths. This unification
got rid of several notification bugs one of which if kqueue was
used a SIGILL could get sent to the process.
- Change the LIO event accounting to count all AIO request that
could have been split across the fast path and daemon mode.
The prior accounting only kept track of AIO op's in that
mode and not the entire list of operations. This could cause
a bogus LIO event complete notification to occur when all of
the fast path AIO op's completed and not the AIO op's that
ended up queued for the daemon.
Suggestions from: alc
auto-start, set cnp.cn_lkflags to LK_EXCLUSIVE. This flag must now
be set so that lockmgr knows what kind of lock to acquire, and it
will panic if not specified. This resulted in a panic when using
extended attributes on UFS1 as of locking work present in the 6.x
branch.
This is a RELENG_6_0 merge candidate.
Reported by: lofi
MFC after: 3 days
TLB shootdown requirements. Otherwise a CPU may not get the needed
TLB invalidation.
The PTE valid and access flags can not be used here to avoid TLB
shootdowns unless sf->cpumask == all_cpus.
( Otherwise some CPUs may still hold an even older entry in the TLB)
Since sf_buf_alloc mappings are normally always used this is
also not really useful and presetting accessed and modified
allows the CPU to speculatively load the entry into the TLB.
Both bugs can cause random data corruption.
MFC after: 3 days
based on XMAC II chip should be ready for this in their initial
mode of operation, and Yukon-based NICs are configured so by
the driver.
PR: kern/79998
MFC after: 1 month
semantics, and then was reused for next node, it still would be applied
as writer again.
To fix the regression the decision is made never to alter item->el_flags
after the item has been allocated. This requires checking for overrides
both in ng_dequeue() and in ng_snd_item().
Details:
- Caller of the ng_apply_item() knows what is the current access to
node and specifies it to ng_apply_item(). The latter drops the
given access after item has beem applied.
- ng_dequeue() needs to be supplied with int pointer, where it stores
the obtained access on node.
- Check for node/hook access overrides in ng_dequeue().
generic sounding CIS "PCMCIA", "FAST ETHERENT CARD" and a bogus MANFID
code (0xffff and 0x1090). However, since I'm not aware of 'generic'
cards that aren't NE-2000oids, go with that and hope for the best.
First and most importantly, I threw out the thread priority-twiddling
implementation of KeRaiseIrql()/KeLowerIrq()/KeGetCurrentIrql() in
favor of a new scheme that uses sleep mutexes. The old scheme was
really very naughty and sought to provide the same behavior as
Windows spinlocks (i.e. blocking pre-emption) but in a way that
wouldn't raise the ire of WITNESS. The new scheme represents
'DISPATCH_LEVEL' as the acquisition of a per-cpu sleep mutex. If
a thread on cpu0 acquires the 'dispatcher mutex,' it will block
any other thread on the same processor that tries to acquire it,
in effect only allowing one thread on the processor to be at
'DISPATCH_LEVEL' at any given time. It can then do the 'atomic sit
and spin' routine on the spinlock variable itself. If a thread on
cpu1 wants to acquire the same spinlock, it acquires the 'dispatcher
mutex' for cpu1 and then it too does an atomic sit and spin to try
acquiring the spinlock.
Unlike real spinlocks, this does not disable pre-emption of all
threads on the CPU, but it does put any threads involved with
the NDISulator to sleep, which is just as good for our purposes.
This means I can now play nice with WITNESS, and I can safely do
things like call malloc() when I'm at 'DISPATCH_LEVEL,' which
you're allowed to do in Windows.
Next, I completely re-wrote most of the event/timer/mutex handling
and wait code. KeWaitForSingleObject() and KeWaitForMultipleObjects()
have been re-written to use condition variables instead of msleep().
This allows us to use the Windows convention whereby thread A can
tell thread B "wake up with a boosted priority." (With msleep(), you
instead have thread B saying "when I get woken up, I'll use this
priority here," and thread A can't tell it to do otherwise.) The
new KeWaitForMultipleObjects() has been better tested and better
duplicates the semantics of its Windows counterpart.
I also overhauled the IoQueueWorkItem() API and underlying code.
Like KeInsertQueueDpc(), IoQueueWorkItem() must insure that the
same work item isn't put on the queue twice. ExQueueWorkItem(),
which in my implementation is built on top of IoQueueWorkItem(),
was also modified to perform a similar test.
I renamed the doubly-linked list macros to give them the same names
as their Windows counterparts and fixed RemoveListTail() and
RemoveListHead() so they properly return the removed item.
I also corrected the list handling code in ntoskrnl_dpc_thread()
and ntoskrnl_workitem_thread(). I realized that the original logic
did not correctly handle the case where a DPC callout tries to
queue up another DPC. It works correctly now.
I implemented IoConnectInterrupt() and IoDisconnectInterrupt() and
modified NdisMRegisterInterrupt() and NdisMDisconnectInterrupt() to
use them. I also tried to duplicate the interrupt handling scheme
used in Windows. The interrupt handling is now internal to ndis.ko,
and the ndis_intr() function has been removed from if_ndis.c. (In
the USB case, interrupt handling isn't needed in if_ndis.c anyway.)
NdisMSleep() has been rewritten to use a KeWaitForSingleObject()
and a KeTimer, which is how it works in Windows. (This is mainly
to insure that the NDISulator uses the KeTimer API so I can spot
any problems with it that may arise.)
KeCancelTimer() has been changed so that it only cancels timers, and
does not attempt to cancel a DPC if the timer managed to fire and
queue one up before KeCancelTimer() was called. The Windows DDK
documentation seems to imply that KeCantelTimer() will also call
KeRemoveQueueDpc() if necessary, but it really doesn't.
The KeTimer implementation has been rewritten to use the callout API
directly instead of timeout()/untimeout(). I still cheat a little in
that I have to manage my own small callout timer wheel, but the timer
code works more smoothly now. I discovered a race condition using
timeout()/untimeout() with periodic timers where untimeout() fails
to actually cancel a timer. I don't quite understand where the race
is, using callout_init()/callout_reset()/callout_stop() directly
seems to fix it.
I also discovered and fixed a bug in winx32_wrap.S related to
translating _stdcall calls. There are a couple of routines
(i.e. the 64-bit arithmetic intrinsics in subr_ntoskrnl) that
return 64-bit quantities. On the x86 arch, 64-bit values are
returned in the %eax and %edx registers. However, it happens
that the ctxsw_utow() routine uses %edx as a scratch register,
and x86_stdcall_wrap() and x86_stdcall_call() were only preserving
%eax before branching to ctxsw_utow(). This means %edx was getting
clobbered in some cases. Curiously, the most noticeable effect of this
bug is that the driver for the TI AXC110 chipset would constantly drop
and reacquire its link for no apparent reason. Both %eax and %edx
are preserved on the stack now. The _fastcall and _regparm
wrappers already handled everything correctly.
I changed if_ndis to use IoAllocateWorkItem() and IoQueueWorkItem()
instead of the NdisScheduleWorkItem() API. This is to avoid possible
deadlocks with any drivers that use NdisScheduleWorkItem() themselves.
The unicode/ansi conversion handling code has been cleaned up. The
internal routines have been moved to subr_ntoskrnl and the
RtlXXX routines have been exported so that subr_ndis can call them.
This removes the incestuous relationship between the two modules
regarding this code and fixes the implementation so that it honors
the 'maxlen' fields correctly. (Previously it was possible for
NdisUnicodeStringToAnsiString() to possibly clobber memory it didn't
own, which was causing many mysterious crashes in the Marvell 8335
driver.)
The registry handling code (NdisOpen/Close/ReadConfiguration()) has
been fixed to allocate memory for all the parameters it hands out to
callers and delete whem when NdisCloseConfiguration() is called.
(Previously, it would secretly use a single static buffer.)
I also substantially updated if_ndis so that the source can now be
built on FreeBSD 7, 6 and 5 without any changes. On FreeBSD 5, only
WEP support is enabled. On FreeBSD 6 and 7, WPA-PSK support is enabled.
The original WPA code has been updated to fit in more cleanly with
the net80211 API, and to eleminate the use of magic numbers. The
ndis_80211_setstate() routine now sets a default authmode of OPEN
and initializes the RTS threshold and fragmentation threshold.
The WPA routines were changed so that the authentication mode is
always set first, followed by the cipher. Some drivers depend on
the operations being performed in this order.
I also added passthrough ioctls that allow application code to
directly call the MiniportSetInformation()/MiniportQueryInformation()
methods via ndis_set_info() and ndis_get_info(). The ndis_linksts()
routine also caches the last 4 events signalled by the driver via
NdisMIndicateStatus(), and they can be queried by an application via
a separate ioctl. This is done to allow wpa_supplicant to directly
program the various crypto and key management options in the driver,
allowing things like WPA2 support to work.
Whew.