mi_switch(9): update to current day
The function itself and much of the information in this page remains relevant, but many details need to be fixed.

- Update function signatures
- Update the list of major uses of mi_switch() (it is not exhaustive)
- Document 'flags' argument and its possible values
- Document thread lock requirement for callers
- Thread runtime limits are out of scope now, no need to describe them
- Remove outdated information w.r.t. KSE, runqueue, non-preemptible kernel, etc
- Update the description of cpu_switch() and its responsibilities

PR: 149574
Reviewed by: kib
Discussed with: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D38185
parent 1029dab634
commit 175db7b582
@@ -2,10 +2,14 @@
 .\"
 .\" Copyright (c) 1996 The NetBSD Foundation, Inc.
 .\" All rights reserved.
+.\" Copyright (c) 2023 The FreeBSD Foundation
 .\"
 .\" This code is derived from software contributed to The NetBSD Foundation
 .\" by Paul Kranenburg.
 .\"
+.\" Portions of this documentation were written by Mitchell Horne
+.\" under sponsorship from the FreeBSD Foundation.
+.\"
 .\" Redistribution and use in source and binary forms, with or without
 .\" modification, are permitted provided that the following conditions
 .\" are met:
@@ -29,7 +33,7 @@
 .\"
 .\" $FreeBSD$
 .\"
-.Dd November 24, 1996
+.Dd January 9, 2023
 .Dt MI_SWITCH 9
 .Os
 .Sh NAME
@@ -41,96 +45,171 @@
 .In sys/param.h
 .In sys/proc.h
 .Ft void
-.Fn mi_switch "void"
+.Fn mi_switch "int flags"
 .Ft void
-.Fn cpu_switch "void"
+.Fn cpu_switch "struct thread *oldtd" "struct thread *newtd" "struct mtx *lock"
 .Ft void
-.Fn cpu_throw "void"
+.Fn cpu_throw "struct thread *oldtd" "struct thread *newtd"
 .Sh DESCRIPTION
 The
 .Fn mi_switch
-function implements the machine independent prelude to a thread context
+function implements the machine-independent prelude to a thread context
 switch.
-It is called from only a few distinguished places in the kernel
-code as a result of the principle of non-preemptable kernel mode execution.
+It is the single entry point for every context switch and is called from only
+a few distinguished places in the kernel.
+The context switch is, by necessity, always performed by the switched thread,
+even when the switch is initiated from elsewhere; e.g. preemption requested via
+Inter-Processor Interrupt (IPI).
+.Pp
 The various major uses of
-.Nm
+.Fn mi_switch
 can be enumerated as follows:
 .Bl -enum -offset indent
 .It
 From within a function such as
 .Xr cv_wait 9 ,
-.Xr mtx_lock 9 ,
+.Xr sleepq_wait 9
 or
-.Xr tsleep 9
+.Fn turnstile_wait
 when the current thread
 voluntarily relinquishes the CPU to wait for some resource or lock to become
 available.
 .It
-After handling a trap
-(e.g.\& a system call, device interrupt)
-when the kernel prepares a return to user-mode execution.
-This case is
-typically handled by machine dependent trap-handling code after detection
-of a change in the signal disposition of the current process, or when a
-higher priority thread might be available to run.
-The latter event is
-communicated by the machine independent scheduling routines by calling
-the machine defined
-.Fn need_resched .
+Involuntary preemption due to arrival of a higher-priority thread.
+.It
+At the tail end of
+.Xr critical_exit 9 ,
+if preemption was deferred due to the critical section.
+.It
+Within the TDA_SCHED AST handler, when rescheduling before the return to
+usermode was requested.
+There are several reasons for this, a notable one coming from
+.Fn sched_clock
+when the running thread has exceeded its time slice.
 .It
 In the signal handling code
 (see
 .Xr issignal 9 )
 if a signal is delivered that causes a process to stop.
 .It
 When a thread dies in
 .Xr thread_exit 9
 and control of the processor can be passed to the next runnable thread.
 .It
 In
-.Xr thread_suspend_check 9
+.Fn thread_suspend_check
 where a thread needs to stop execution due to the suspension state of
 the process as a whole.
+.It
+In
+.Xr kern_yield 9
+when a thread wants to voluntarily relinquish the processor.
 .El
 .Pp
+The
+.Va flags
+argument to
 .Fn mi_switch
-records the amount of time the current thread has been running in the
-process structures and checks this value against the CPU time limits
-allocated to the process
-(see
-.Xr getrlimit 2 ) .
-Exceeding the soft limit results in a
-.Dv SIGXCPU
-signal to be posted to the process, while exceeding the hard limit will
-cause a
-.Dv SIGKILL .
+indicates the context switch type.
+One of the following must be passed:
+.Bl -tag -offset indent -width "SWT_REMOTEWAKEIDLE"
+.It Dv SWT_OWEPREEMPT
+Switch due to delayed preemption after exiting a critical section.
+.It Dv SWT_TURNSTILE
+Switch after propagating scheduling priority to the owner of a resource.
+.It Dv SWT_SLEEPQ
+Begin waiting on a
+.Xr sleepqueue 9 .
+.It Dv SWT_RELINQUISH
+Yield call.
+.It Dv SWT_NEEDRESCHED
+Rescheduling was requested.
+.It Dv SWT_IDLE
+Switch from the idle thread.
+.It Dv SWT_IWAIT
+A kernel thread which handles interrupts has finished work and must wait for
+interrupts to schedule additional work.
+.It Dv SWT_SUSPEND
+Thread suspended.
+.It Dv SWT_REMOTEPREEMPT
+Preemption by a higher-priority thread, initiated by a remote processor.
+.It Dv SWT_REMOTEWAKEIDLE
+Idle thread preempted, initiated by a remote processor.
+.It Dv SWT_BIND
+The running thread has been bound to another processor and must be switched
+out.
+.El
 .Pp
-If the thread is still in the
-.Dv TDS_RUNNING
-state,
-.Fn mi_switch
-will put it back onto the run queue, assuming that
-it will want to run again soon.
-If it is in one of the other
-states and KSE threading is enabled, the associated
-.Em KSE
-will be made available to any higher priority threads from the same
-group, to allow them to be scheduled next.
+In addition to the switch type, callers must specify the nature of the
+switch by performing a bitwise OR with one of the
+.Dv SW_VOL
+or
+.Dv SW_INVOL
+flags, but not both.
+Respectively, these flags denote whether the context switch is voluntary or
+involuntary on the part of the current thread.
+For an involuntary context switch in which the running thread is
+being preempted, the caller should also pass the
+.Dv SW_PREEMPT
+flag.
 .Pp
-After these administrative tasks are done,
+Upon entry to
+.Fn mi_switch ,
+the current thread must be holding its assigned thread lock.
+It may be unlocked as part of the context switch.
+After they have been rescheduled and execution resumes, threads will exit
 .Fn mi_switch
-hands over control to the machine dependent routine
-.Fn cpu_switch ,
-which will perform the actual thread context switch.
+with their thread lock unlocked.
 .Pp
+.Fn mi_switch
+records the amount of time the current thread has been running before handing
+control over to the scheduler, via
+.Fn sched_switch .
+After selecting a new thread to run, the scheduler will call
+.Fn cpu_switch
+to perform the low-level context switch.
+.Pp
 .Fn cpu_switch
-first saves the context of the current thread.
-Next, it calls
-.Fn choosethread
-to determine which thread to run next.
-Finally, it reads in the saved context of the new thread and starts to
-execute the new thread.
+is the machine-dependent function that performs the actual switch from the
+running thread
+.Fa oldtd
+to the chosen thread
+.Fa newtd .
+First, it saves the context of
+.Fa oldtd
+to its Process Control Block,
+.Po
+PCB
+.Vt struct pcb
+.Pc ,
+pointed at by
+.Va oldtd->td_pcb .
+The function then updates important per-CPU state such as the
+.Dv curthread
+variable, and activates
+.Fa newtd\&'s
+virtual address space using its associated
+.Xr pmap 9
+structure.
+Finally, it reads in the saved context from
+.Fa newtd\&'s
+PCB.
+CPU instruction flow continues in the new thread context, on
+.Fa newtd\&'s
+kernel stack.
+The return from
+.Fn cpu_switch
+can be understood as a completion of the function call initiated by
+.Fa newtd
+when it was previously switched out, at some point in the distant (relative to
+CPU time) past.
+.Pp
+The
+.Fa mtx
+argument to
+.Fn cpu_switch
+is used to pass the mutex which will be stored as
+.Fa oldtd\&'s
+thread lock at the moment that
+.Fa oldtd
+is completely switched out.
+This is an implementation detail of
+.Fn sched_switch .
 .Pp
 .Fn cpu_throw
 is similar to
@@ -140,19 +219,18 @@ This function is useful when the kernel does not have an old thread
 context to save, such as when CPUs other than the boot CPU perform their
 first task switch, or when the kernel does not care about the state of the
 old thread, such as in
-.Fn thread_exit
+.Xr thread_exit 9
 when the kernel terminates the current thread and switches into a new
-thread.
-.Pp
-To protect the
-.Xr runqueue 9 ,
-all of these functions must be called with the
-.Va sched_lock
-mutex held.
+thread,
+.Fa newtd .
+The
+.Fa oldtd
+argument is unused.
 .Sh SEE ALSO
 .Xr cv_wait 9 ,
+.Xr critical_exit 9 ,
 .Xr issignal 9 ,
+.Xr kern_yield 9 ,
 .Xr mutex 9 ,
-.Xr runqueue 9 ,
-.Xr tsleep 9 ,
-.Xr wakeup 9
+.Xr pmap 9 ,
+.Xr sleepqueue 9 ,
+.Xr thread_exit 9
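
Reviewer's sketch (not part of the commit): the new text on the flags argument says a caller passes one SWT_* switch type OR'ed with exactly one of SW_VOL or SW_INVOL, and adds SW_PREEMPT when an involuntary switch is a preemption. The user-space C below models those rules so they can be checked outside the kernel. The numeric values, SW_TYPE_MASK, and the helper switch_flags_valid() are assumptions for illustration only; the authoritative definitions are in sys/proc.h, and the kernel does not necessarily validate flags this way. Treating SW_PREEMPT together with SW_VOL as invalid, and treating a zero switch type as "no type given", are likewise inferences from the text rather than documented behaviour.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative values only (assumption). The real definitions live in
 * FreeBSD's sys/proc.h and may differ.
 */
#define SW_TYPE_MASK	0x00ff	/* assumed: low byte carries the SWT_* type */
#define SW_VOL		0x0100	/* voluntary switch */
#define SW_INVOL	0x0200	/* involuntary switch */
#define SW_PREEMPT	0x0400	/* involuntary switch caused by preemption */

#define SWT_OWEPREEMPT	2	/* delayed preemption after a critical section */
#define SWT_SLEEPQ	4	/* begin waiting on a sleepqueue */
#define SWT_RELINQUISH	6	/* yield call */

/*
 * Hypothetical helper, not a kernel API: returns true when 'flags'
 * follows the rules stated in the page.
 */
static bool
switch_flags_valid(int flags)
{
	bool vol = (flags & SW_VOL) != 0;
	bool invol = (flags & SW_INVOL) != 0;

	/* Exactly one of SW_VOL / SW_INVOL must be set, never both. */
	if (vol == invol)
		return (false);
	/* Inference: SW_PREEMPT only accompanies an involuntary switch. */
	if ((flags & SW_PREEMPT) != 0 && !invol)
		return (false);
	/* A switch type must be present (assumes all types are nonzero). */
	return ((flags & SW_TYPE_MASK) != 0);
}

/* Flag combinations of the kind the page describes. */
void
example_checks(void)
{
	/* Sleeping on a sleepqueue is a voluntary switch. */
	assert(switch_flags_valid(SW_VOL | SWT_SLEEPQ));
	/* Deferred preemption is involuntary and a preemption. */
	assert(switch_flags_valid(SW_INVOL | SW_PREEMPT | SWT_OWEPREEMPT));
	/* Both SW_VOL and SW_INVOL at once is rejected. */
	assert(!switch_flags_valid(SW_VOL | SW_INVOL | SWT_RELINQUISH));
}
```

In kernel code the same composition would appear directly at the call site, for example a yield path passing SW_VOL | SWT_RELINQUISH to mi_switch() while holding the thread lock, as the page requires.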