133171 Commits

Author SHA1 Message Date
jeff
b980923789 Commit 11/14 of sched_lock decomposition.
- There is no globally visible scheduler lock any longer.  For now the
   watchdog can only check Giant.  This model of checking particular locks
   is flawed and should be revisited.  Other metrics should be considered.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:56:33 +00:00
jeff
20e0f793e8 Commit 10/14 of sched_lock decomposition.
- Use sched_throw() rather than replicating the same cpu_throw() code for
   each architecture.  This also allows the scheduler to use any locking it
   may want to.
 - Use the thread_lock() rather than sched_lock when preempting.
 - The scheduler lock is not required to synchronize release_aps.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:56:08 +00:00
jeff
3a4cdc52dc Commit 10/14 of sched_lock decomposition.
- Add new spinlocks to support thread_lock() and adjust ordering.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:55:45 +00:00
jeff
0e873af7bd Commit 9/14 of sched_lock decomposition.
- Attempt to return the ttyinfo() selection algorithm to something sane
   as it has been broken and disabled for some time.  Adapt this algorithm
   in such a way that it does not conflict with per-cpu scheduler locking.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:55:32 +00:00
jeff
b8fc17c4ae Commit 8/14 of sched_lock decomposition.
- Use a global umtx spinlock to protect the sleep queues now that there
   is no global scheduler lock.
 - Use thread_lock() to protect thread state.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:50 +00:00
jeff
b5c8e4a113 Commit 7/14 of sched_lock decomposition.
- Use thread_lock() rather than sched_lock for per-thread scheduling
   sychronization.
 - Use the per-process spinlock rather than the sched_lock for per-process
   scheduling synchronization.
 - Use a global kse spinlock to protect upcall and thread assignment.  The
   per-process spinlock can not be used because this lock must be acquired
   via mi_switch() where we already hold a thread lock.  The kse spinlock
   is a leaf lock ordered after the process and thread spinlocks.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:27 +00:00
jeff
2fc4033486 Commit 6/14 of sched_lock decomposition.
- Use thread_lock() rather than sched_lock for per-thread scheduling
   sychronization.
 - Use the per-process spinlock rather than the sched_lock for per-process
   scheduling synchronization.
 - Replace the tail-end of fork_exit() with a scheduler specific routine
   which can do the appropriate lock manipulations.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:53:34 +00:00
jeff
d72d912582 Commit 5/14 of sched_lock decomposition.
- Protect the cp_time tick counts with atomics instead of a global lock.
   There will only be one atomic per tick and this allows all processors
   to execute softclock concurrently.
 - In softclock, protect access to rusage and td_*tick data with the
   thread_lock(), expanding the scope of the thread lock over the whole
   function.
 - Do some creative re-arranging in hardclock() to avoid excess locking.
 - Protect the p_timer fields with the per-process spinlock.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:53:06 +00:00
jeff
8931208c40 Commit 4/14 of sched_lock decomposition.
- Use thread_lock() rather than sched_lock for per-thread scheduling
   sychronization.
 - Use the per-process spinlock rather than the sched_lock for per-process
   scheduling synchronization.
 - Move some common code into thread_suspend_switch() to handle the
   mechanics of suspending a thread.  The locking here is incredibly
   convoluted and should be simplified.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:52:24 +00:00
jeff
15c2dd7a1f Commit 3/14 of sched_lock decomposition.
- Add a per-turnstile spinlock to solve potential priority propagation
   deadlocks that are possible with thread_lock().
 - The turnstile lock order is defined as the exact opposite of the
   lock order used with the sleep locks they represent.  This allows us
   to walk in reverse order in priority_propagate and this is the only
   place we wish to multiply acquire turnstile locks.
 - Use the turnstile_chain lock to protect assigning mutexes to turnstiles.
 - Change the turnstile interface to pass back turnstile pointers to the
   consumers.  This allows us to reduce some locking and makes it easier
   to cancel turnstile assignment while the turnstile chain lock is held.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:51:44 +00:00
jeff
ea7c909871 Commit 2/14 of sched_lock decomposition.
- Adapt sleepqueues to the new thread_lock() mechanism.
 - Delay assigning the sleep queue spinlock as the thread lock until after
   we've checked for signals.  It is illegal for a thread to return in
   mi_switch() with any lock assigned to td_lock other than the scheduler
   locks.
 - Change sleepq_catch_signals() to do the switch if necessary to simplify
   the callers.
 - Simplify timeout handling now that locking a sleeping thread has the
   side-effect of locking the sleepqueue.  Some previous races are no
   longer possible.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:50:56 +00:00
jeff
186ae07cb6 Commit 1/14 of sched_lock decomposition.
- Move all scheduler locking into the schedulers utilizing a technique
   similar to solaris's container locking.
 - A per-process spinlock is now used to protect the queue of threads,
   thread count, suspension count, p_sflags, and other process
   related scheduling fields.
 - The new thread lock is actually a pointer to a spinlock for the
   container that the thread is currently owned by.  The container may
   be a turnstile, sleepqueue, or run queue.
 - thread_lock() is now used to protect access to thread related scheduling
   fields.  thread_unlock() unlocks the lock and thread_set_lock()
   implements the transition from one lock to another.
 - A new "blocked_lock" is used in cases where it is not safe to hold the
   actual thread's lock yet we must prevent access to the thread.
 - sched_throw() and sched_fork_exit() are introduced to allow the
   schedulers to fix-up locking at these points.
 - Add some minor infrastructure for optionally exporting scheduler
   statistics that were invaluable in solving performance problems with
   this patch.  Generally these statistics allow you to differentiate
   between different causes of context switches.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:50:30 +00:00
attilio
9bd4fdf7ce Do proper "locking" for missing vmmeters part.
Now, we assume no more sched_lock protection for some of them and use the
distribuited loads method for vmmeter (distribuited through CPUs).

Reviewed by: alc, bde
Approved by: jeff (mentor)
2007-06-04 21:45:18 +00:00
attilio
e333d0ff0e Rework the PCPU_* (MD) interface:
- Rename PCPU_LAZY_INC into PCPU_INC
- Add the PCPU_ADD interface which just does an add on the pcpu member
  given a specific value.

Note that for most architectures PCPU_INC and PCPU_ADD are not safe.
This is a point that needs some discussions/work in the next days.

Reviewed by: alc, bde
Approved by: jeff (mentor)
2007-06-04 21:38:48 +00:00
dwmalone
771efb08f5 Despite several examples in the kernel, the third argument of
sysctl_handle_int is not sizeof the int type you want to export.
The type must always be an int or an unsigned int.

Remove the instances where a sizeof(variable) is passed to stop
people accidently cut and pasting these examples.

In a few places this was sysctl_handle_int was being used on 64 bit
types, which would truncate the value to be exported.  In these
cases use sysctl_handle_quad to export them and change the format
to Q so that sysctl(1) can still print them.
2007-06-04 18:25:08 +00:00
dwmalone
360dc5b21c Add a function for exporting 64 bit types. 2007-06-04 18:14:28 +00:00
dwmalone
79a4f2f433 Use common code for printing ints and longs by coppying the sysctl
value into a variable of the right type and then printing it via
an intmax_t. This makes avoids some duplication and makes it easy
to add a new integer format Q for printing things of type CTLTYPE_QUAD.
2007-06-04 18:02:23 +00:00
marcel
c118c8a64b Revert to the previous version where the return value of uart_getenv()
is being ignored. It's optional and the lack of environment variable
is not an error condition.
2007-06-04 17:53:42 +00:00
brueffer
3f6ea62ed6 Xref altq(4). 2007-06-04 16:59:11 +00:00
ambrisko
7261dfb656 Add in a couple of things:
-	In the ioctl path let command get queued up and return
	when complete _without_ blocking the driving waiting for
	the response.  This way the driver doesn't "lock up" for
	~30s during a flash command.  Submitted by scottl.
      -	Add a guard so that if a DCMD of 0 is sent down the ioctl
	path don't send it to the controller.  Return with a
	status of OK.  This is a little strange since MegaCli
	doesn't seem to like something and will issue some DCMD
	of 0.  This doesn't happen under Linux.  So the emulation
	needs to be improved but I'm not sure what.  Another strange
	thing is that when a DCMD of 0 gets issued under i386 the
	controller returns OK but in amd64 the context is messed
	up.
      -	Add a guard so the context has to be with-in the legal
	limit so we get a reasonable error assertion versus random
	panic.

It's going to be a challenge to figure out why MegaCli is not totally
happy and then sends some bogus commands.  This means that flashing
firmware via the Linux tool won't work since it generates a DCMD of
0 when it should be opening the firmware for a flash update.  Without
this problem flashing works fine.  This means there is no publicly
available tool to upgrade the RAID firmware under FreeBSD right now.

I plan to MFC all of the mfi changes to 6.X shortly.  This might not
include the SCSI pass-through changes.

Submitted by:	scottl
Reviewed by:	scottl
MFC after:	3 days
2007-06-04 16:39:22 +00:00
mav
2f12bfb53b No need to update link queue stats when round-robin algorithm enabled.
Approved by:	glebius (mentor)
2007-06-04 13:50:09 +00:00
yar
68cc2f890e Be robust to a bogus script specification or contents
when figuring out what the real interpreter is for an
interpreted command.  That is, check whether we can read
the script file in the first place and, if so, make sure
we got a valid shebang line from it.
2007-06-04 11:39:35 +00:00
pjd
f28297d01f Reimplement traverse() helper function:
1. Pass locking flags to VFS_ROOT().
2. Check v_mountedhere while the vnode is locked.
3. Always return locked vnode on success.

Change 1 fixes problem reported by Stephen M. Rumble - after
zfs_vfsops.c,1.9 change, zfs_root() no longer locks the vnode
unconditionally and traverse() didn't pass right lock type to
VFS_ROOT(). The result was that kernel paniced when .zfs/ directory
was accessed via NFS.
2007-06-04 11:31:46 +00:00
brian
e40209916c Now that tone & delay times are correct (independent of hz), adjust
playtone() so that it uses times of 1/100ths of a second.

Now 'time echo T60ABC >/dev/speaker' takes ~3 seconds.

MFC after:		2 weeks
Problem noted by:	dwmalone
2007-06-04 09:27:13 +00:00
brian
287d4ad6fa Speaker durations are specified in 1/100ths of a second according to
spkr(4).

PR:		70610, 67995
Submitted by:	dada at sbox dot tugraz dot at (modulo one fix)
MFC after:	2 weeks
2007-06-04 08:33:18 +00:00
alc
ccfe66d477 Add the machine-specific definitions for configuring the new physical
memory allocator.

Approved by:	re
2007-06-04 08:02:22 +00:00
scottl
741062ba0a Track an update in the MPI headers that was missed earlier. 2007-06-04 06:18:07 +00:00
jinmei
eff13571bc cleanup about the reassembly structures and routine:
- removed unused structure members
  - fixed a minor bug that the ECN code point may not be restored correctly

Approved by:	ume (mentor)
MFC after:	1 week
2007-06-04 06:06:35 +00:00
yongari
f5e3f2b6d0 gem(4) supports altq(4) now. 2007-06-04 06:01:57 +00:00
yongari
8183c40824 o Implemented Rx/Tx checksum offload. The simple checksum logic in
GEMs is unable to discriminate UDP from TCP packets such that
  it can generate 0x0000 checksum value for the UDP datagram. So the
  UDP checksum offload was disabled by default. You can enable it
  by setting link0 flag with ifconfig(8).
o bus_dma(9) clean up. It now correctly set number of required DMA
  segments/size and removed incorrect use of BUS_DMA_ALLOCNOW flag
  in static allocations done via bus_dmamem_alloc(9).
o Implemented ALTQ(9) support.
o Implemented Tx side bus_dmamap_load_mbuf_sg(9) which can remove
  several book keeping chores orginated from call-back mechanism.
  Therefore gem_txdma_callback() was removed and its functionality
  was reimplemented in gem_load_txmbuf().
o Don't set GEM_TD_START_OF_PACKET flag until all remaining mbuf
  chains are set. I think it was a long standing bug and it caused
  fluctuating interrupts/CPU usage patterns while netperf test
  is in progress. Previously it seems that we race with the device.
  Because I don't have a documentation for GEM I'm not sure this is
  correct but almost all other documentations I have stated this
  implications on setting SOP mark in descriptor ring(e.g. hme(4)).
o Borrowed gem_defrag() from ath(4) which is supposed to be much
  faster than m_defrag(9) since it's not need to defrag all
  mbuf chains.
o gem_load_txmbuf() was changed to allow passed mbuf chains to free.
  Caller of gem_load_txmbuf() correctly handles freed mbuf chains.
o In gem_start_locked(), added checks for availability of Tx
  descriptors before trying to load DMA maps which could save CPU
  cycles when number of available descriptors are low. Also, simplyfy
  IFF_DRV_OACTIVE detection logic.
o Removed hard-coded function names in CTR macros and replaced it
  with __func__.
o Moved statistics counter register access to gem_tick() to reduce
  number of PCI bus accesses. There is no reason to update statistics
  counters in interrupt handler.
o Removed unnecessary call of gem_start_locked() in gem_ioctl().

Reviewed by:	grehan (initial version), marius (with improvements and suggestions)
Tested by:	grehan (ppc), marius(sparc64)
2007-06-04 06:01:04 +00:00
imp
64652f7b07 Migrate from setting a CARD_OK flag in a shared word, to setting its
own entry in the softc.  This should allow more of cbb_pci_intr() to
migrate to a new cbb_pci_filt() so that we don't have to run cbb's ISR
in almost every case we get an interrupt.  We can't just move
cbb_pci_intr into cbb_pci_filt because it does things that aren't safe
to do from a fast interrupt handler, err I mean from a filter.  This is
an important first step.

# I wonder if I need to make cardok volatile or not.
2007-06-04 05:59:44 +00:00
scottl
83d530c093 Free the portinfo object on unload. 2007-06-04 04:35:04 +00:00
imp
4a281aa6a5 Don't register cb_func_filt if the client driver doesn't have a filter.
ditto for the isr.

Reviewed/Suggested by: simokawa-san
2007-06-04 03:13:24 +00:00
darrenr
27a50eee47 Remove files no longer required to build IPFilter 2007-06-04 03:07:34 +00:00
darrenr
a33069b532 Merge IPFilter 4.1.23 back to HEAD
See src/contrib/ipfilter/HISTORY for details of changes since 4.1.13
2007-06-04 02:54:36 +00:00
darrenr
1dd4fa592d This commit was generated by cvs2svn to compensate for changes in r170263,
which included commits to RCS files with non-trunk default branches.
2007-06-04 02:50:28 +00:00
darrenr
e2e28d4361 Import IPFilter 4.1.23 to vendor branch.
See src/contrib/ipfilter/HISTORY for details of changes since 4.1.13
2007-06-04 02:50:28 +00:00
darrenr
b8cc98bd6c Import IPFilter 4.1.23 to vendor branch.
See src/contrib/ipfilter/HISTORY for details of changes since 4.1.13
2007-06-04 02:50:28 +00:00
alc
4d78df42fa Add the machine-specific definitions for configuring the new physical
memory allocator.

Approved by:	re
2007-06-04 02:32:07 +00:00
grog
a0b8b93ae8 Use correct comment syntax for $FreeBSD$. This file gets put through
cpp, not a shell script.

Pointy hat to: grog
2007-06-04 01:44:07 +00:00
delphij
d7966d0f30 Regen. 2007-06-04 01:43:25 +00:00
delphij
2908bdb1f1 Resolve conflicts. 2007-06-04 01:43:11 +00:00
delphij
9b5d103b5b This commit was generated by cvs2svn to compensate for changes in r170256,
which included commits to RCS files with non-trunk default branches.
2007-06-04 01:42:54 +00:00
delphij
7672cb6e48 /home/delphij/m 2007-06-04 01:42:54 +00:00
truckman
05dd7cf626 Nuke man page links that were orphaned by vendor branch import of
TrustedBSD OpenBSM 1.0 alpha 6.

MFC after:	3 days
2007-06-03 23:36:23 +00:00
alc
038cbd3091 Add the machine-specific definitions for configuring the new physical
memory allocator.

Approved by:	re
2007-06-03 23:33:11 +00:00
alc
0c68a06c26 Add the machine-specific definitions for configuring the new physical
memory allocator.

Set the size of phys_avail[] and dump_avail[] using one of these
definitions.

Approved by:	re
2007-06-03 23:18:29 +00:00
scottl
6cbc02a681 mpt.c:
mpt.h:
	Add support for reading extended configuration pages.
mpt_cam.c:
	Do a top level topology scan on the SAS controller.  If any SATA
	device are discovered in this scan, send a passthrough FIS to set
	the write cache.  This is controllable through the following
	tunable at boot:

	hw.mpt.enable_sata_wc:
		-1 = Do not configure, use the controller default
		 0 = Disable the write cache
		 1 = Enable the write cache

	The default is -1.  This tunable is just a hack and may be
	deprecated in the future.

Turning on the write cache alleviates the write performance problems with
SATA that many people have observed.  It is not recommend for those who
value data reliability!  I cannot stress this strongly enough.  However,
it is useful in certain circumstances, and it brings the performence in line
with what a generic SATA controller running under the FreeBSD ATA driver
provides (and the ATA driver has had the WC enabled by default for years).
2007-06-03 23:13:05 +00:00
scottl
d39b8d1da5 Update to MPI 1.5.16 2007-06-03 22:58:27 +00:00
alc
67d75e2c91 Prepare for the new physical memory allocator: Change the way that the
physical page's color is obtained.

Approved by:	re
2007-06-03 19:39:38 +00:00