- Move all scheduler locking into the schedulers utilizing a technique
similar to solaris's container locking.
- A per-process spinlock is now used to protect the queue of threads,
thread count, suspension count, p_sflags, and other process
related scheduling fields.
- The new thread lock is actually a pointer to a spinlock for the
container that the thread is currently owned by. The container may
be a turnstile, sleepqueue, or run queue.
- thread_lock() is now used to protect access to thread related scheduling
fields. thread_unlock() unlocks the lock and thread_set_lock()
implements the transition from one lock to another.
- A new "blocked_lock" is used in cases where it is not safe to hold the
actual thread's lock yet we must prevent access to the thread.
- sched_throw() and sched_fork_exit() are introduced to allow the
schedulers to fix-up locking at these points.
- Add some minor infrastructure for optionally exporting scheduler
statistics that were invaluable in solving performance problems with
this patch. Generally these statistics allow you to differentiate
between different causes of context switches.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
Now, we assume no more sched_lock protection for some of them and use the
distribuited loads method for vmmeter (distribuited through CPUs).
Reviewed by: alc, bde
Approved by: jeff (mentor)
- Rename PCPU_LAZY_INC into PCPU_INC
- Add the PCPU_ADD interface which just does an add on the pcpu member
given a specific value.
Note that for most architectures PCPU_INC and PCPU_ADD are not safe.
This is a point that needs some discussions/work in the next days.
Reviewed by: alc, bde
Approved by: jeff (mentor)
sysctl_handle_int is not sizeof the int type you want to export.
The type must always be an int or an unsigned int.
Remove the instances where a sizeof(variable) is passed to stop
people accidently cut and pasting these examples.
In a few places this was sysctl_handle_int was being used on 64 bit
types, which would truncate the value to be exported. In these
cases use sysctl_handle_quad to export them and change the format
to Q so that sysctl(1) can still print them.
value into a variable of the right type and then printing it via
an intmax_t. This makes avoids some duplication and makes it easy
to add a new integer format Q for printing things of type CTLTYPE_QUAD.
- In the ioctl path let command get queued up and return
when complete _without_ blocking the driving waiting for
the response. This way the driver doesn't "lock up" for
~30s during a flash command. Submitted by scottl.
- Add a guard so that if a DCMD of 0 is sent down the ioctl
path don't send it to the controller. Return with a
status of OK. This is a little strange since MegaCli
doesn't seem to like something and will issue some DCMD
of 0. This doesn't happen under Linux. So the emulation
needs to be improved but I'm not sure what. Another strange
thing is that when a DCMD of 0 gets issued under i386 the
controller returns OK but in amd64 the context is messed
up.
- Add a guard so the context has to be with-in the legal
limit so we get a reasonable error assertion versus random
panic.
It's going to be a challenge to figure out why MegaCli is not totally
happy and then sends some bogus commands. This means that flashing
firmware via the Linux tool won't work since it generates a DCMD of
0 when it should be opening the firmware for a flash update. Without
this problem flashing works fine. This means there is no publicly
available tool to upgrade the RAID firmware under FreeBSD right now.
I plan to MFC all of the mfi changes to 6.X shortly. This might not
include the SCSI pass-through changes.
Submitted by: scottl
Reviewed by: scottl
MFC after: 3 days
when figuring out what the real interpreter is for an
interpreted command. That is, check whether we can read
the script file in the first place and, if so, make sure
we got a valid shebang line from it.
1. Pass locking flags to VFS_ROOT().
2. Check v_mountedhere while the vnode is locked.
3. Always return locked vnode on success.
Change 1 fixes problem reported by Stephen M. Rumble - after
zfs_vfsops.c,1.9 change, zfs_root() no longer locks the vnode
unconditionally and traverse() didn't pass right lock type to
VFS_ROOT(). The result was that kernel paniced when .zfs/ directory
was accessed via NFS.
playtone() so that it uses times of 1/100ths of a second.
Now 'time echo T60ABC >/dev/speaker' takes ~3 seconds.
MFC after: 2 weeks
Problem noted by: dwmalone
- removed unused structure members
- fixed a minor bug that the ECN code point may not be restored correctly
Approved by: ume (mentor)
MFC after: 1 week
GEMs is unable to discriminate UDP from TCP packets such that
it can generate 0x0000 checksum value for the UDP datagram. So the
UDP checksum offload was disabled by default. You can enable it
by setting link0 flag with ifconfig(8).
o bus_dma(9) clean up. It now correctly set number of required DMA
segments/size and removed incorrect use of BUS_DMA_ALLOCNOW flag
in static allocations done via bus_dmamem_alloc(9).
o Implemented ALTQ(9) support.
o Implemented Tx side bus_dmamap_load_mbuf_sg(9) which can remove
several book keeping chores orginated from call-back mechanism.
Therefore gem_txdma_callback() was removed and its functionality
was reimplemented in gem_load_txmbuf().
o Don't set GEM_TD_START_OF_PACKET flag until all remaining mbuf
chains are set. I think it was a long standing bug and it caused
fluctuating interrupts/CPU usage patterns while netperf test
is in progress. Previously it seems that we race with the device.
Because I don't have a documentation for GEM I'm not sure this is
correct but almost all other documentations I have stated this
implications on setting SOP mark in descriptor ring(e.g. hme(4)).
o Borrowed gem_defrag() from ath(4) which is supposed to be much
faster than m_defrag(9) since it's not need to defrag all
mbuf chains.
o gem_load_txmbuf() was changed to allow passed mbuf chains to free.
Caller of gem_load_txmbuf() correctly handles freed mbuf chains.
o In gem_start_locked(), added checks for availability of Tx
descriptors before trying to load DMA maps which could save CPU
cycles when number of available descriptors are low. Also, simplyfy
IFF_DRV_OACTIVE detection logic.
o Removed hard-coded function names in CTR macros and replaced it
with __func__.
o Moved statistics counter register access to gem_tick() to reduce
number of PCI bus accesses. There is no reason to update statistics
counters in interrupt handler.
o Removed unnecessary call of gem_start_locked() in gem_ioctl().
Reviewed by: grehan (initial version), marius (with improvements and suggestions)
Tested by: grehan (ppc), marius(sparc64)
own entry in the softc. This should allow more of cbb_pci_intr() to
migrate to a new cbb_pci_filt() so that we don't have to run cbb's ISR
in almost every case we get an interrupt. We can't just move
cbb_pci_intr into cbb_pci_filt because it does things that aren't safe
to do from a fast interrupt handler, err I mean from a filter. This is
an important first step.
# I wonder if I need to make cardok volatile or not.
mpt.h:
Add support for reading extended configuration pages.
mpt_cam.c:
Do a top level topology scan on the SAS controller. If any SATA
device are discovered in this scan, send a passthrough FIS to set
the write cache. This is controllable through the following
tunable at boot:
hw.mpt.enable_sata_wc:
-1 = Do not configure, use the controller default
0 = Disable the write cache
1 = Enable the write cache
The default is -1. This tunable is just a hack and may be
deprecated in the future.
Turning on the write cache alleviates the write performance problems with
SATA that many people have observed. It is not recommend for those who
value data reliability! I cannot stress this strongly enough. However,
it is useful in certain circumstances, and it brings the performence in line
with what a generic SATA controller running under the FreeBSD ATA driver
provides (and the ATA driver has had the WC enabled by default for years).
Things can get ugly without it due to uninitialized class. RELENG_6 need
a simmilar, but different treatment as well.
err.. perhaps we should teach devclass_get_maxunit() to return -1 ?
MFC after: 1 day
o If we don't have a filter, also check to make sure the card is there before
calling the scheduled ISR. This is necessary to help old drivers whose
ISRs can't cope with being called with the hardware missing, which sadly
still exist in the tree. This is the main reason why we have an extra
layer of indirection for cardbus interrupts.
o If the card is no longer present, mark the interrupt as 'handled' rather
than 'stray' because this accounts for why the interrupt happened. Stray
isn't all bad, since there are other filters that would claim it...
o Fix some comments
+ Add comment about why we check for CARD_OK and touch the hardware in both
the filter and ISR.
+ add a note about why we don't care about Giant
+ also note that giant can't be taken out in a filter...
+ Some minor formatting nits on very long comments.