- lock its own locks and drop Giant.
- create its own taskqueue thread.
- split interrupt routine
- use interrupt filter as a fast interrupt.
- run watchdog timer in taskqueue so that it should be
serialized with the bottom half.
- add extra sanity check for transaction labels.
disable ad-hoc workaround for unknown tlabels.
- add sleep/wakeup synchronization primitives
- don't reset OHCI in fwohci_stop()
seems not enough to verify its consistencies.
- Define AC97_MIXER_SIZE as SOUND_MIXER_NRDEVICES (25), since we
don't need more than that. Stop doing wild and random guess about
its size since we're stricly bound to it.
application specific SEND_OP_COND (CMD55 + ACMD41), go ahead and allow
100 tries. This gives a timeout of a second rather than the ~100ms
the old style produces.
I've had one old 16MB SD card which needs the extra time. I've now
had reports from the field that other cards need this too.
Originally done at BSDcan 2007 while waiting to give my embedding
madness minitalk.
- Use thread_lock() rather than sched_lock for per-thread scheduling
sychronization.
- Use the per-process spinlock rather than the sched_lock for per-process
scheduling synchronization.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
sysctl_handle_int is not sizeof the int type you want to export.
The type must always be an int or an unsigned int.
Remove the instances where a sizeof(variable) is passed to stop
people accidently cut and pasting these examples.
In a few places this was sysctl_handle_int was being used on 64 bit
types, which would truncate the value to be exported. In these
cases use sysctl_handle_quad to export them and change the format
to Q so that sysctl(1) can still print them.
- In the ioctl path let command get queued up and return
when complete _without_ blocking the driving waiting for
the response. This way the driver doesn't "lock up" for
~30s during a flash command. Submitted by scottl.
- Add a guard so that if a DCMD of 0 is sent down the ioctl
path don't send it to the controller. Return with a
status of OK. This is a little strange since MegaCli
doesn't seem to like something and will issue some DCMD
of 0. This doesn't happen under Linux. So the emulation
needs to be improved but I'm not sure what. Another strange
thing is that when a DCMD of 0 gets issued under i386 the
controller returns OK but in amd64 the context is messed
up.
- Add a guard so the context has to be with-in the legal
limit so we get a reasonable error assertion versus random
panic.
It's going to be a challenge to figure out why MegaCli is not totally
happy and then sends some bogus commands. This means that flashing
firmware via the Linux tool won't work since it generates a DCMD of
0 when it should be opening the firmware for a flash update. Without
this problem flashing works fine. This means there is no publicly
available tool to upgrade the RAID firmware under FreeBSD right now.
I plan to MFC all of the mfi changes to 6.X shortly. This might not
include the SCSI pass-through changes.
Submitted by: scottl
Reviewed by: scottl
MFC after: 3 days
playtone() so that it uses times of 1/100ths of a second.
Now 'time echo T60ABC >/dev/speaker' takes ~3 seconds.
MFC after: 2 weeks
Problem noted by: dwmalone
GEMs is unable to discriminate UDP from TCP packets such that
it can generate 0x0000 checksum value for the UDP datagram. So the
UDP checksum offload was disabled by default. You can enable it
by setting link0 flag with ifconfig(8).
o bus_dma(9) clean up. It now correctly set number of required DMA
segments/size and removed incorrect use of BUS_DMA_ALLOCNOW flag
in static allocations done via bus_dmamem_alloc(9).
o Implemented ALTQ(9) support.
o Implemented Tx side bus_dmamap_load_mbuf_sg(9) which can remove
several book keeping chores orginated from call-back mechanism.
Therefore gem_txdma_callback() was removed and its functionality
was reimplemented in gem_load_txmbuf().
o Don't set GEM_TD_START_OF_PACKET flag until all remaining mbuf
chains are set. I think it was a long standing bug and it caused
fluctuating interrupts/CPU usage patterns while netperf test
is in progress. Previously it seems that we race with the device.
Because I don't have a documentation for GEM I'm not sure this is
correct but almost all other documentations I have stated this
implications on setting SOP mark in descriptor ring(e.g. hme(4)).
o Borrowed gem_defrag() from ath(4) which is supposed to be much
faster than m_defrag(9) since it's not need to defrag all
mbuf chains.
o gem_load_txmbuf() was changed to allow passed mbuf chains to free.
Caller of gem_load_txmbuf() correctly handles freed mbuf chains.
o In gem_start_locked(), added checks for availability of Tx
descriptors before trying to load DMA maps which could save CPU
cycles when number of available descriptors are low. Also, simplyfy
IFF_DRV_OACTIVE detection logic.
o Removed hard-coded function names in CTR macros and replaced it
with __func__.
o Moved statistics counter register access to gem_tick() to reduce
number of PCI bus accesses. There is no reason to update statistics
counters in interrupt handler.
o Removed unnecessary call of gem_start_locked() in gem_ioctl().
Reviewed by: grehan (initial version), marius (with improvements and suggestions)
Tested by: grehan (ppc), marius(sparc64)
own entry in the softc. This should allow more of cbb_pci_intr() to
migrate to a new cbb_pci_filt() so that we don't have to run cbb's ISR
in almost every case we get an interrupt. We can't just move
cbb_pci_intr into cbb_pci_filt because it does things that aren't safe
to do from a fast interrupt handler, err I mean from a filter. This is
an important first step.
# I wonder if I need to make cardok volatile or not.
mpt.h:
Add support for reading extended configuration pages.
mpt_cam.c:
Do a top level topology scan on the SAS controller. If any SATA
device are discovered in this scan, send a passthrough FIS to set
the write cache. This is controllable through the following
tunable at boot:
hw.mpt.enable_sata_wc:
-1 = Do not configure, use the controller default
0 = Disable the write cache
1 = Enable the write cache
The default is -1. This tunable is just a hack and may be
deprecated in the future.
Turning on the write cache alleviates the write performance problems with
SATA that many people have observed. It is not recommend for those who
value data reliability! I cannot stress this strongly enough. However,
it is useful in certain circumstances, and it brings the performence in line
with what a generic SATA controller running under the FreeBSD ATA driver
provides (and the ATA driver has had the WC enabled by default for years).
Things can get ugly without it due to uninitialized class. RELENG_6 need
a simmilar, but different treatment as well.
err.. perhaps we should teach devclass_get_maxunit() to return -1 ?
MFC after: 1 day
o If we don't have a filter, also check to make sure the card is there before
calling the scheduled ISR. This is necessary to help old drivers whose
ISRs can't cope with being called with the hardware missing, which sadly
still exist in the tree. This is the main reason why we have an extra
layer of indirection for cardbus interrupts.
o If the card is no longer present, mark the interrupt as 'handled' rather
than 'stray' because this accounts for why the interrupt happened. Stray
isn't all bad, since there are other filters that would claim it...
o Fix some comments
+ Add comment about why we check for CARD_OK and touch the hardware in both
the filter and ISR.
+ add a note about why we don't care about Giant
+ also note that giant can't be taken out in a filter...
+ Some minor formatting nits on very long comments.
While in the suspend path, this means the idle thread will just return
immediately rather than trying to enter C1-n. This helps in the case where
the chipset is powered down before the rest of the system and reads from
the cpu sleep registers begin returning immediately, causing the logic that
catches bad C2/C3 behavior to kick in. Observed on my Panasonic Y4.
MFC after: 3 days
(j/i) was being used and it was being incremented, not decremented as before.
Factor out this code into a common function and call it from both the common
and per-CPU case.
MFC after: 1 day
The global lock is a memory region shared with the BIOS and thus
has some strange behavior like the fact that the sleep is 1 ms max.
We use standard mutexes to synchronize with the SCI so acquiring
the global lock after locking the mutex resulted in a witness
warning.
To deal with this for now, acquire the global lock before all other
locks, similar to Giant. This should fix the witness "sleeping
with mutex held" issue on boot that occurred after the last ACPI-CA
import. In the future, we hope to move to the new mutex interface
in ACPI-CA instead of the pseudo-semaphore version we have now.
Reviewed by: jkim
this change both simplifies the code and plugs a hole where the devise
was reset without keeping the management controller at bay :) Second,
the 82571 LAA reset problem was incomplete, this addition is necessary.
Just one of those days :)
- Rework the entire pcm_channel structure:
* Remove rarely used link placeholder, instead, make each pcm_channel
as head/link of each own/each other. Unlock - Lock sequence due to
sleep malloc has been reduced.
* Implement "busy" queue which will contain list of busy/active
channels. This greatly reduce locking contention for example while
servicing interrupt for hardware with many channels or when virtual
channels reach its 256 peak channels.
- So I heard you like v chan ... O RLY?
Welcome to Virtual **Record** Channels (vrec, rec vchans, vchans for
recording, Rec-Chan, you decide), the ultimate solutions for your
nagging O_RDWR full-duplex wannabe (note: flash plugins) monopolizing
single record channel causing EBUSY. Vrec works exactly like Vchans
(or, should I rename it to "Vplay" :) , except that it operates on the
opposite direction (recording). Up to 256 vrecs (like vchans) are
possible.
Notes:
* Relocate dev.pcm.%d.{vchans,vchanformat,vchanrate} to each of its
respective node/direction:
dev.pcm.%d.play.* for "play" (cdev = dsp%d.vp%d)
dev.pcm.%d.rec.* for "record" (cdev = dsp%d.vr%d)
* Don't expect that it will magically give you ability to split
"recording source" (eg: 1 channel for cdrom, 1 channel for mic,
etc). Just admit that you only have a *single* recording source /
channel. Please bug your hardware vendor instead :)
- Bump maxautovchans from 4 to 16. For a full-fledged multimedia
desktop/workstation with too many soundservers installed (esound,
artsd, jackd, pulse/polypaudio, ding-dong pling plong mudkip fuh fuh,
etc), 4 seems inadequate. There will be no memory penalty here, since
virtual channels are allocate only by demand.
- Nuke/Rework the entire statically created cdev entries. Everything is
clonable through snd own clone manager which designed to withstand many
kind of abusive devfs droids such as:
* while : ; do /bin/test -e /dev/dsp ; done
* jot 16777216 0 | while read x ; do ls /dev/dsp0.$x ; done
* hundreds (could be thousands) concurrent threads/process opening
"/dev/dsp" (previously, this might result EBUSY even with just
3 contesting threads/procs).
o Reusable clone objects (instead of creating new one like there's no
tomorrow) after certain expiration deadline. The clone allocator will
decide whether to reuse, share, or creating new clone.
o Automatic garbage collector.
- Dynamic unit magic allocator. Maximum attached soundcards can be tuned
using tunable "hw.snd.maxunit" (Default to 512). Minimum is 16, and
maximum is 2048.
- ..other fixes, mostly related to concurrency issues.
joel@ will do the manpage updates on sound(4).
Have fun.
Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation
argument from being file descriptor index into the pointer to struct file.
Proposed and reviewed by: jhb
Reviewed by: daichi (unionfs)
Approved by: re (kensmith)
- Coverity Prevent(tm) CID 1906 a bogus use of bzero where unneeded.
- ICH8 systems autoneg to 100 rather than 1000, this can also be
seen in 82573, the logic was backwards.
- On new 82575 quadports half duplex tx speed is slow... this was due
to overwriting TCTL reg rather than adding bits.
OpenBSD's if_ral.c.
I didn't make the LINKSYS4 -> CISCOLINKSYS name change, nor did I
include the RALINK RT2573 that's supported by the rum(4) driver. I
didn't merge any code changes either.