Commit Graph

128 Commits

Author SHA1 Message Date
jhb
df1acf8cfc Add a swi_remove() function to teardown software interrupt handlers. For
now it just calls intr_event_remove_handler(), but at some point it might
also be responsible for tearing down interrupt events created via swi_add.
2005-10-26 15:51:05 +00:00
jhb
e20e5c07ce Reorganize the interrupt handling code a bit to make a few things cleaner
and increase flexibility to allow various different approaches to be tried
in the future.
- Split struct ithd up into two pieces.  struct intr_event holds the list
  of interrupt handlers associated with interrupt sources.
  struct intr_thread contains the data relative to an interrupt thread.
  Currently we still provide a 1:1 relationship of events to threads
  with the exception that events only have an associated thread if there
  is at least one threaded interrupt handler attached to the event.  This
  means that on x86 we no longer have 4 bazillion interrupt threads with
  no handlers.  It also means that interrupt events with only INTR_FAST
  handlers no longer have an associated thread either.
- Renamed struct intrhand to struct intr_handler to follow the struct
  intr_foo naming convention.  This did require renaming the powerpc
  MD struct intr_handler to struct ppc_intr_handler.
- INTR_FAST no longer implies INTR_EXCL on all architectures except for
  powerpc.  This means that multiple INTR_FAST handlers can attach to the
  same interrupt and that INTR_FAST and non-INTR_FAST handlers can attach
  to the same interrupt.  Sharing INTR_FAST handlers may not always be
  desirable, but having sio(4) and uhci(4) fight over an IRQ isn't fun
  either.  Drivers can always still use INTR_EXCL to ask for an interrupt
  exclusively.  The way this sharing works is that when an interrupt
  comes in, all the INTR_FAST handlers are executed first, and if any
  threaded handlers exist, the interrupt thread is scheduled afterwards.
  This type of layout also makes it possible to investigate using interrupt
  filters ala OS X where the filter determines whether or not its companion
  threaded handler should run.
- Aside from the INTR_FAST changes above, the impact on MD interrupt code
  is mostly just 's/ithread/intr_event/'.
- A new MI ddb command 'show intrs' walks the list of interrupt events
  dumping their state.  It also has a '/v' verbose switch which dumps
  info about all of the handlers attached to each event.
- We currently don't destroy an interrupt thread when the last threaded
  handler is removed because it would suck for things like ppbus(8)'s
  braindead behavior.  The code is present, though, it is just under
  #if 0 for now.
- Move the code to actually execute the threaded handlers for an interrrupt
  event into a separate function so that ithread_loop() becomes more
  readable.  Previously this code was all in the middle of ithread_loop()
  and indented halfway across the screen.
- Made struct intr_thread private to kern_intr.c and replaced td_ithd
  with a thread private flag TDP_ITHREAD.
- In statclock, check curthread against idlethread directly rather than
  curthread's proc against idlethread's proc. (Not really related to intr
  changes)

Tested on:	alpha, amd64, i386, sparc64
Tested on:	arm, ia64 (older version of patch by cognet and marcel)
2005-10-25 19:48:48 +00:00
jhb
feeb07e6ae Don't disallow sleeping for handlers on swi's since some swi handlers
(like CAM) do sleep in their handlers.

Requested by:	scottl
2005-09-15 20:08:21 +00:00
jhb
e535e11c9f - Add a new simple facility for marking the current thread as being in a
state where sleeping on a sleep queue is not allowed.  The facility
  doesn't support recursion but uses a simple private per-thread flag
  (TDP_NOSLEEPING).  The sleepq_add() function will panic if the flag is
  set and INVARIANTS is enabled.
- Use this new facility to replace the g_xup and g_xdown mutexes that were
  (ab)used to achieve similar behavior.
- Disallow sleeping in interrupt threads when invoking interrupt handlers.

MFC after:	1 week
Reviewed by:	phk
2005-09-15 19:05:37 +00:00
jhb
a5e3094bcf Simplify the storming logic and remove a variable as a result.
Approved by:	re (dwhite)
2005-06-20 19:32:23 +00:00
jhb
f9da7305b5 Use PCPU_LAZY_INC() for cnt.v_{intr,trap,syscalls} rather than atomic
operations in some places and simple non-per CPU math in others.
2005-04-12 23:18:54 +00:00
imp
20280f1431 /* -> /*- for copyright notices, minor format tweaks as necessary 2005-01-06 23:35:40 +00:00
jhb
7b611b0cb2 Stop explicitly touching td_base_pri outside of the scheduler and simply
set a thread's priority via sched_prio() when that is the desired action.
The schedulers will start managing td_base_pri internally shortly.
2004-12-30 20:29:58 +00:00
jhb
9c6e87707d Don't bother exiting storming mode once a second to see if it has gone
away, instead only exit storming mode when an interrupt stops firing long
enough for the ithread to exit the loop and go back to sleep.

Tested by:	macrus (cruder version)
2004-11-17 14:39:41 +00:00
jhb
9933c3bdef Adjust the interrupt storm handling code to better handle a storm. When
a storm is detected, enter "storming" mode which throttles the interrupt
source such that the handlers are run once every clock tick.  Previously
we allowed a full set of storm_threshold interations through the handler
before going back to sleep.  Also, this currently will intentionally exit
storming mode once a second to see if the storm has passed.

Tested by:	marcus
Discussed with:	bde
2004-11-16 16:09:46 +00:00
jhb
c19696c197 - Make setting of IT_ENTROPY a bit simpler in ithread_update().
- Tweak the updating of the ithread name in ithread_update() so that the
  '+' and '*' characters for device names that were too short only get
  added at the end after as many device names as possible were fit into
  the allocated space.  Prior to this, some long devices would result
  in '+' chars showing up between two different devices rather than at the
  end.
2004-11-05 19:11:24 +00:00
jhb
43d29263e9 Revert most of 1.109. Although it improved the situation on one particular
motherboard, in practice the changes resulted in many false positives for
heavy network loads, etc. resulting in poor performance.  Also, the
motherboard referenced in the 1.109 log has other problems and simply does
not seem to work with the APIC enabled even with the changes in 1.109.  The
correct fix for that board seems to be to not use the APIC at all.  One
thing kept from 1.109 is that throttled interrupts are now effectively
polled on every clock tick rather than just 10 times per second.

MFC after:	1 month
Tested by:	Shunsuke SHINOMIYA shino at fornext dot org
2004-11-03 22:11:20 +00:00
jhb
a9860ec891 - Change the ddb paging "support" to use a variable (db_lines_per_page) to
control the number of lines per page rather than a constant.  The variable
  can be examined and changed in ddb as '$lines'.  Setting the variable to
  0 will effectively turn off paging.
- Change db_putchar() to force out pending whitespace before outputting
  newlines and carriage returns so that one can rub out content on the
  current line via '\r     \r' type strings.
- Change the simple pager to rub out the --More-- prompt explicitly when
  the routine exits.
- Add some aliases to the simple pager to make it more compatible with
  more(1): 'e' and 'j' do a single line.  'd' does half a page, and
  'f' does a full page.

MFC after:	1 month
Inspired by:	kris
2004-11-01 22:15:15 +00:00
julian
5813d27029 Refactor a bunch of scheduler code to give basically the same behaviour
but with slightly cleaned up interfaces.

The KSE structure has become the same as the "per thread scheduler
private data" structure. In order to not make the diffs too great
one is #defined as the other at this time.

The KSE (or td_sched) structure is  now allocated per thread and has no
allocation code of its own.

Concurrency for a KSEGRP is now kept track of via a simple pair of counters
rather than using KSE structures as tokens.

Since the KSE structure is different in each scheduler, kern_switch.c
is now included at the end of each scheduler. Nothing outside the
scheduler knows the contents of the KSE (aka td_sched) structure.

The fields in the ksegrp structure that are to do with the scheduler's
queueing mechanisms are now moved to the kg_sched structure.
(per ksegrp scheduler private data structure). In other words how the
scheduler queues and keeps track of threads is no-one's business except
the scheduler's. This should allow people to write experimental
schedulers with completely different internal structuring.

A scheduler call sched_set_concurrency(kg, N) has been added that
notifies teh scheduler that no more than N threads from that ksegrp
should be allowed to be on concurrently scheduled. This is also
used to enforce 'fainess' at this time so that a ksegrp with
10000 threads can not swamp a the run queue and force out a process
with 1 thread, since the current code will not set the concurrency above
NCPU, and both schedulers will not allow more than that many
onto the system run queue at a time. Each scheduler should eventualy develop
their own methods to do this now that they are effectively separated.

Rejig libthr's kernel interface to follow the same code paths as
linkse for scope system threads. This has slightly hurt libthr's performance
but I will work to recover as much of it as I can.

Thread exit code has been cleaned up greatly.
exit and exec code now transitions a process back to
'standard non-threaded mode' before taking the next step.
Reviewed by:	scottl, peter
MFC after:	1 week
2004-09-05 02:09:54 +00:00
julian
e9d9514975 Give setrunqueue() and sched_add() more of a clue as to
where they are coming from and what is expected from them.

MFC after:	2 days
2004-09-01 02:11:28 +00:00
rwatson
6dcf7eb45e Annotate call to DELAY() in interrupt storm mitigation as being
something to revisit.

Approved by:	re (scottl)
2004-08-17 04:09:09 +00:00
rwatson
6680706c2b In ithread_schedule(), when we plan to go harvest some entropy as
a result of scheduling an ithread, cut a KTR_INTR trace record so
that it's clear in tracing interrupt activity where and when the
entropy harvesting code is invoked.
2004-08-06 03:39:28 +00:00
jhb
696704716d Implement preemption of kernel threads natively in the scheduler rather
than as one-off hacks in various other parts of the kernel:
- Add a function maybe_preempt() that is called from sched_add() to
  determine if a thread about to be added to a run queue should be
  preempted to directly.  If it is not safe to preempt or if the new
  thread does not have a high enough priority, then the function returns
  false and sched_add() adds the thread to the run queue.  If the thread
  should be preempted to but the current thread is in a nested critical
  section, then the flag TDF_OWEPREEMPT is set and the thread is added
  to the run queue.  Otherwise, mi_switch() is called immediately and the
  thread is never added to the run queue since it is switch to directly.
  When exiting an outermost critical section, if TDF_OWEPREEMPT is set,
  then clear it and call mi_switch() to perform the deferred preemption.
- Remove explicit preemption from ithread_schedule() as calling
  setrunqueue() now does all the correct work.  This also removes the
  do_switch argument from ithread_schedule().
- Do not use the manual preemption code in mtx_unlock if the architecture
  supports native preemption.
- Don't call mi_switch() in a loop during shutdown to give ithreads a
  chance to run if the architecture supports native preemption since
  the ithreads will just preempt DELAY().
- Don't call mi_switch() from the page zeroing idle thread for
  architectures that support native preemption as it is unnecessary.
- Native preemption is enabled on the same archs that supported ithread
  preemption, namely alpha, i386, and amd64.

This change should largely be a NOP for the default case as committed
except that we will do fewer context switches in a few cases and will
avoid the run queues completely when preempting.

Approved by:	scottl (with his re@ hat)
2004-07-02 20:21:44 +00:00
jhb
1b16b181d1 - Change mi_switch() and sched_switch() to accept an optional thread to
switch to.  If a non-NULL thread pointer is passed in, then the CPU will
  switch to that thread directly rather than calling choosethread() to pick
  a thread to choose to.
- Make sched_switch() aware of idle threads and know to do
  TD_SET_CAN_RUN() instead of sticking them on the run queue rather than
  requiring all callers of mi_switch() to know to do this if they can be
  called from an idlethread.
- Move constants for arguments to mi_switch() and thread_single() out of
  the middle of the function prototypes and up above into their own
  section.
2004-07-02 19:09:50 +00:00
bde
e02f078768 Detect interrupt storms better. The storm detection didn't work at all
with an ASUS A7N8X-E motherboard in APIC mode, since storming interrupts
don't repeat immediately.  Use DELAY(1) to wait a bit for them to repeat.
This affects all systems.  Only delay for the first
(10 * intr_storm_threshold) interrupts (per interrupt handler) so that
this is only a pessimization while warming up.  Throttle after calling
the sub-handlers instead of before so that the long delay given by
throttling can be used instead of the DELAY(1) to detect storms after
warming up.

Reduced the throttling period from 1/10 second to 1/hz seconds so that
throttling doesn't destroy performance so much.  Interrupts that are
detected as storming are effectively handled by polling at a frequency
of hz Hz.  On A7N8X-E's there is another hardware or configuration bug
that makes the throttled frequency closer to 2*hz Hz.
2004-06-05 18:27:28 +00:00
bde
62ea68b046 Fixed some style bugs in previous commit (mainly an insertion sort error
for declarations, and poorly worded messages).

Fixed some nearby style bugs (unsorted declarations).
2004-04-17 02:46:05 +00:00
jhb
24e7821d24 - Enable (unmask) interrupt sources earlier in the ithread loop.
Specifically, we used to enable the source after locking sched_lock
  and just before we had already decided to do a context switch.
  This meant that an ithread could never process more than one interrupt
  per context switch.  Enabling earlier in the loop before sched_lock is
  acquired allows an ithread to handle multiple interrupts per context
  switch if interrupts fire very rapidly.  For the case of heavy interrupt
  load this can reduce the number of context switches (and thus overhead)
  as well as reduce interrupt latency.
- Now that we can handle multiple interrupts per context switch, add simple
  interrupt storm protection to threaded interrupts.  If X number of
  consecutive interrupts are triggered before the itherad voluntarily
  yields to another thread, then the interrupt thread will sleep with the
  associated interrupt source disabled (masked) for 1/10th of a second.
  The default value of X is 500, but it can be tweaked via the tunable/
  sysctl hw.intr_storm_threshold.  If an interrupt storm is detected, then
  a message is output to the kernel console on the first occurrence per
  interrupt thread.  Interrupt storm protection can be disabled completely
  by setting this value to 0.  There is no scientific reasoning for the
  1/10th of a second or 500 interrupts values, so they may require tweaking
  at some point in the future.

Tested by:	rwatson (an earlier version w/o the storm protection)
Tested by:	mux (reportedly made a machine with two PCI interrupts
		storming usable rather than hard locked)
Reviewed by:	imp
2004-04-16 20:25:40 +00:00
jhb
2642ed4029 kthread_exit() no longer requires Giant, so don't force callers to acquire
Giant just to call kthread_exit().

Requested by:	many
2004-03-05 22:42:17 +00:00
jeff
c85cdc3d0f - Add a flags parameter to mi_switch. The value of flags may be SW_VOL or
SW_INVOL.  Assert that one of these is set in mi_switch() and propery
   adjust the rusage statistics.  This is to simplify the large number of
   users of this interface which were previously all required to adjust the
   proper counter prior to calling mi_switch().  This also facilitates more
   switch and locking optimizations.
 - Change all callers of mi_switch() to pass the appropriate paramter and
   remove direct references to the process statistics.
2004-01-25 03:54:52 +00:00
truckman
b028f9e268 If a device attach routine fails during boot and calls bus_teardown_intr(),
ithread_remove_handler() may fail to remove the interrupt handler if
it decides to let the ithread do the removal.  The problem is that during
boot "cold" is set, which causes msleep() to return immediately.  This
will cause ithread_remove_handler() to fail to wait for the ithread
to do the removal from the handler TAILQ before freeing the handler
back to the heap.  Bad things will happen when some other user of the
TAILQ, such as ithread_add_handler() or the actual ithread attempts to use
the freed handler.  Fix the problem by forcing ithread_remove_handler()
to do the actual removal itself if the "cold" flag is set.

Reviewed by:	jhb
2004-01-13 22:55:46 +00:00
markm
6a2f4748c4 Fix a major faux pas of mine. I was causing 2 very bad things to
happen in interrupt context; 1) sleep locks, and 2) malloc/free
calls.

1) is fixed by using spin locks instead.

2) is fixed by preallocating a FIFO (implemented with a STAILQ)
   and using elements from this FIFO instead. This turns out
   to be rather fast.

OK'ed by:	re (scottl)
Thanks to:	peter, jhb, rwatson, jake
Apologies to:	*
2003-11-20 15:35:48 +00:00
markm
832b97971f Hackfix to patch around a kernel panic I introduced. Real fix to
follow. In the meanwhile, we are not harvesting interrupt entropy.

Approved by:	re (jhb)
2003-11-18 14:35:43 +00:00
peter
38ebd79a92 Expand the argument to the ithread enable/disable helper hooks from an
int to something big enough to hold a pointer.  amd64 needs this.
2003-11-17 06:08:10 +00:00
jhb
da7c66355b Don't require INTR_FAST handlers to be exclusive in the MI layer. Instead,
let the MD code choose whether or not to implement such a policy.  The new
i386 interrupt code allows multiple FAST handlers for a given source for
example.  However, the code does not allow FAST and non-FAST handlers to be
mixed.
2003-11-03 22:42:58 +00:00
jhb
7d3a70bca2 - Add a DDB command 'show intrcnt' to show the non-zero interrupt counts.
- Add a DDB function to dump the contents of an ithread and optionally
  details about each handler in that ithread.  This function can be used
  by MD code to implement DDB commands that display information about
  interrupt sources and their registered handlers.
2003-10-24 21:05:30 +00:00
scottl
5d14736333 Make swi_vm be INTR_MPSAFE. On all platforms, it is only used to activate
busdma_swi().  Now that busdma_swi() uses driver-provided locking, this
should be safe.
2003-07-01 16:00:38 +00:00
obrien
3b8fff9e4c Use __FBSDID(). 2003-06-11 00:56:59 +00:00
phk
2048912526 Remove unused variable(s).
Found by:       FlexeLint
2003-05-31 20:29:34 +00:00
julian
a954908b35 Move the flag that indicates an idle thread from the KSE to the thread.
It was always referenced via the thread anyhow.

Reviewed by:	jhb (a LOOOOONG time ago)
2003-05-02 00:33:12 +00:00
jhb
313b87d41a Add some locking in for a few proc and thread fields. 2003-04-17 22:25:35 +00:00
jhb
e7a906488e Use local struct proc variables to reduce repeated td->td_proc dereferences
and improve readability.
2003-04-17 22:02:47 +00:00
jhb
ac139f5914 Adjust a KTR trace to log thread state instead of proc state as that is
more relevant.
2003-04-17 22:01:01 +00:00
jhb
e87dfc0cde Add a WITNESS_WARN() call to verify that we hold no locks after running
a handler from an interrupt thread.
2003-03-04 21:01:42 +00:00
imp
cf874b345d Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
julian
af55753a06 Move a bunch of flags from the KSE to the thread.
I was in two minds as to where to put them in the first case..
I should have listenned to the other mind.

Submitted by:	 parts by davidxu@
Reviewed by:	jeff@ mini@
2003-02-17 09:55:10 +00:00
alfred
268bc18ef2 Fix crash dumps on ata and scsi.
To fix scsi, don't wait for ithreads if we're dumping, it makes the
debugger sad.

To fix ata, use what appears to be a polling method if we're dumping,
I stole this from tmm but added code to ensure that this change is
only in effect while dumping.

Tested by: des
2003-02-14 13:10:40 +00:00
alfred
bf8e8a6e8f Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
jake
3a802b19f9 Don't put a newline in KTR traces. 2002-12-28 23:22:22 +00:00
robert
399c4103ec Instead of (sizeof(source_buffer) - 1) bytes, copy at most
(sizeof(destination_buffer) - 1) bytes into the destination buffer.
This was not harmful because they currently both provide space for
(MAXCOMLEN + 1) bytes.
2002-10-17 21:02:02 +00:00
robert
1e0cdb534a Use strlcpy() instead of strncpy() to copy NUL terminated strings
for safety and consistency.
2002-10-17 20:03:38 +00:00
scottl
3a150bca9c Some kernel threads try to do significant work, and the default KSTACK_PAGES
doesn't give them enough stack to do much before blowing away the pcb.
This adds MI and MD code to allow the allocation of an alternate kstack
who's size can be speficied when calling kthread_create.  Passing the
value 0 prevents the alternate kstack from being created.  Note that the
ia64 MD code is missing for now, and PowerPC was only partially written
due to the pmap.c being incomplete there.
Though this patch does not modify anything to make use of the alternate
kstack, acpi and usb are good candidates.

Reviewed by:	jake, peter, jhb
2002-10-02 07:44:29 +00:00
phk
1dfc2c167f Be consistent about "static" functions: if the function is marked
static in its prototype, mark it static at the definition too.

Inspired by:    FlexeLint warning #512
2002-09-28 17:15:38 +00:00
jake
30830d4d2e Removed unneeded include (missed in last revision). 2002-09-22 06:05:23 +00:00
jake
e54737666a Moved netisr code from kern/kern_intr.c to net/netisr.c as threatened in a
comment.
2002-09-22 05:56:41 +00:00
julian
5702a380a5 Completely redo thread states.
Reviewed by:	davidxu@freebsd.org
2002-09-11 08:13:56 +00:00