It turns out relocating the symbol table itself can cause issues, like fbt
crashing because it applies the offsets to the kernel twice.
This had been previously brought up in rS333447 when the stoffs hack was
added, but I had been unaware of this and reimplemented symtab relocation.
Instead of relocating the symbol table, keep track of the relocation base
in ddb, so the ddb symbols behave like the kernel linker-provided symbols.
This is intended to be NFC on platforms other than PowerPC, which do not
use fully relocatable kernels. (The relbase will always be 0)
* Remove the rest of the stoffs hack.
* Remove my half-baked displace_symbol_table() function.
* Extend ddb initialization to cope with having a relocation offset on the
kernel symbol table.
* Fix my kernel-as-initrd hack to work with booke64 by using a temporary
mapping to access the data.
* Fix another instance of __powerpc__ that is actually RELOCATABLE_KERNEL.
* Change the behavior or X_db_symbol_values to apply the relocation base
when updating valp, to match link_elf_symbol_values() behavior.
Reviewed by: jhibbits
Sponsored by: Tag1 Consulting, Inc.
Differential Revision: https://reviews.freebsd.org/D25223
- Clear the current thread's TLS pointer on exec. Previously the TLS
pointer (and register) remain unchanged.
- Explicitly clear the TLS pointer when new threads are created.
- Make md_tls_tcb_offset per-process instead of per-thread.
The layout of the TLS and TCB are identical for all threads in a
process, it is only the TLS pointer values themselves that vary by
thread. This also makes setting md_tls_tcb_offset in
cpu_set_user_tls() redundant with the setting in exec_setregs(), so
only set it in exec_setregs().
Submitted by: Alfredo Mazzinghi (1)
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24957
Use this in GELI to print out a different message when accelerated
software such as AESNI is used vs plain software crypto.
While here, simplify the logic in GELI a bit for determing which type
of crypto driver was chosen the first time by examining the
capabilities of the matched driver after a single call to
crypto_newsession rather than making separate calls with different
flags.
Reviewed by: delphij
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25126
I noticed that unaligned accesses were returning garbage values.
Give test data like this:
char testdata[] = { 0x12, 0x34, 0x56, 0x78, 0x9a, 0xbc, 0xde, 0xf1, 0x23, 0x45, 0x67, 0x89, 0xab, 0xcd, 0xef, 0x5a };
Iterating through uint32_t space 1 byte at a time should
look like this:
freebsd-carambola2:/mnt# ./test
Hello, world!
offset 0 pointer 0x410b00 value 0x12345678 0x12345678
offset 1 pointer 0x410b01 value 0x3456789a 0x3456789a
offset 2 pointer 0x410b02 value 0x56789abc 0x56789abc
offset 3 pointer 0x410b03 value 0x789abcde 0x789abcde
offset 4 pointer 0x410b04 value 0x9abcdef1 0x9abcdef1
offset 5 pointer 0x410b05 value 0xbcdef123 0xbcdef123
offset 6 pointer 0x410b06 value 0xdef12345 0xdef12345
offset 7 pointer 0x410b07 value 0xf1234567 0xf1234567
.. but to begin with it looked like this:
offset 0 value 0x12345678
offset 1 value 0x00410a9a
offset 2 value 0x00419abc
offset 3 value 0x009abcde
offset 4 value 0x9abcdef1
offset 5 value 0x00410a23
offset 6 value 0x00412345
offset 7 value 0x00234567
The amusing reason? The compiler is generating the lwr/lwl incorrectly.
Here's an example after I tried to replace the two macros with a single
invocation and offset, rather than having the compiler compile in addiu
to s3 - but the bug is the same:
1044: 8a620003 lwl v0,0(s3)
1048: 9a730000 lwr s3,3(s3)
.. which is just totally trashy and wrong.
This explicitly tells the compiler to treat the output as being read
and written to, which is what lwl/lwr does with the destination
register.
I think a subsequent commit should unify these macros to skip an addiu,
but that can be a later commit.
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D25040
This reapplies logical r360944 and r360946 (reverting r360955), with fixed
copystr() stand-in replacement macro. Eventually the goal is to convert
consumers and kill the macro, but for a first step it helps if the macro is
correct.
Prior commit message:
Unlike the other copy*() functions, it does not serve to copy from one
address space to another or protect against potential faults. It's just
an older incarnation of the now-more-common strlcpy().
Add a coccinelle script to tools/ which can be used to mechanically
convert existing instances where replacement with strlcpy is trivial.
In the two cases which matched, fuse_vfsops.c and union_vfsops.c, the
code was further refactored manually to simplify.
Replace the declaration of copystr() in systm.h with a small macro
wrapper around strlcpy (with correction from brooks@ -- thanks).
Remove N redundant MI implementations of copystr. For MIPS, this
entailed inlining the assembler copystr into the only consumer,
copyinstr, and making the latter a leaf function.
Reviewed by: jhb (earlier version)
Discussed with: brooks (thanks!)
Differential Revision: https://reviews.freebsd.org/D24672
Match other architectures and print CPU information during
cpu_startup(). In particular, this prints the information after the
message buffer is initialized which allows it to be retrieved after
boot via dmesg(8).
While here, add some extern declarations to <machine/md_var.h> in
place of duplicated declarations in various source files.
Reviewed by: brooks
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24936
Rather than walking all of cpu_switch looking for the sequence of
instructions to patch, add a global label at the location that needs
the patch applied.
Reviewed by: brooks, Alfredo Mazzinghi <alfredo.mazzinghi_cl.cam.ac.uk>
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24931
Unlike the other copy*() functions, it does not serve to copy from one
address space to another or protect against potential faults. It's just
an older incarnation of the now-more-common strlcpy().
Add a coccinelle script to tools/ which can be used to mechanically
convert existing instances where replacement with strlcpy is trivial.
In the two cases which matched, fuse_vfsops.c and union_vfsops.c, the
code was further refactored manually to simplify.
Replace the declaration of copystr() in systm.h with a small macro
wrapper around strlcpy.
Remove N redundant MI implementations of copystr. For MIPS, this
entailed inlining the assembler copystr into the only consumer,
copyinstr, and making the latter a leaf function.
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D24672
There are no in-kernel consumers.
Reviewed by: cem
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24775
It no longer has any in-kernel consumers via OCF. smbfs still uses
single DES directly, so sys/crypto/des remains for that use case.
Reviewed by: cem
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24773
There are no longer any in-kernel consumers. The software
implementation was also a non-functional stub.
Reviewed by: cem
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24771
Although a few drivers supported this algorithm, there were never any
in-kernel consumers. cryptosoft and cryptodev never supported it,
and there was not a software xform auth_hash for it.
Reviewed by: cem
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24767
This is stuff I've been running for a couple years. It's inspired by changes
I found in the linux ag71xx ethernet driver.
* Delay between stopping DMA and checking to see if it's stopped; this gives
the hardware time to do its thing.
* Non-final frames in the chain need to be a multiple of 4 bytes in size.
Ensure this is the case when assembling a TX DMA list.
* Add counters for tx/rx underflow and too-short packets.
* Log if TX/RX DMA couldn't be stopped when resetting the MAC.
* Add some more debugging / logging around TX/RX ring bits.
Tested:
* AR7240, AR7241
* AR9344 (TL-WDR3600/TL-WDR4300 APs)
* AR9331 (Carambola 2)
pmap_emulate_modify() was assuming that no changes to the pmap could
take place between the TLB signaling the fault and
pmap_emulate_modify()'s acquisition of the pmap lock, but that's clearly
not even true in the uniprocessor case, nevermind the SMP case.
Submitted by: Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24523
MipsDoTLBMiss() will load a segmap entry or pde, check that it isn't
zero, and then chase that pointer to a physical page. If that page has
been freed in the interim, it will read garbage and go on to populate
the TLB with it.
This can happen because pmap_unwire_ptp zeros out the pde and
vm_page_free_zero()s the ptp (or, recursively, zeros out the segmap
entry and vm_page_free_zero()s the pdp) without interlocking against
MipsDoTLBMiss(). The pmap is locked, and pvh_global_lock may or may not
be held, but this is not enough. Solve this issue by inserting TLB
shootdowns within _pmap_unwire_ptp(); as MipsDoTLBMiss() runs with IRQs
deferred, the IPIs involved in TLB shootdown are sufficient to ensure
that MipsDoTLBMiss() sees either a zero segmap entry / pde or a non-zero
entry and the pointed-to page still not freed.
Submitted by: Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24491
If DTRACE is enabled at compile time, all kernel breakpoint traps are
first given to dtrace to see if they are triggered by a FBT probe.
Previously if dtrace didn't recognize the trap, it was silently
ignored breaking the handling of other kernel breakpoint traps such as
the debug.kdb.enter sysctl. This only returns early from the trap
handler if dtrace recognizes the trap and handles it.
Submitted by: Nicolò Mazzucato <nicomazz97@gmail.com>
Reviewed by: markj
Obtained from: CheriBSD
Differential Revision: https://reviews.freebsd.org/D24478
The sole in-tree user of this flag has been retired, so remove this
complexity from all drivers. While here, add a helper routine drivers
can use to read the current request's IV into a local buffer. Use
this routine to replace duplicated code in nearly all drivers.
Reviewed by: cem
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D24450
The use of "int" here caused the compiler to believe that it needs to
insert a "sll $n, $n, 0" to sign extend as part of the implicit cast
to uint64_t.
Submitted by: Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Reviewed by: brooks, arichardson
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24457
Rather then using the racy useracc() followed by direct access to
userspace memory, perform a copyin() and use the result if it succeeds.
Reviewed by: jhb
MFC after: 3 weeks
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24410
Modern debuggers and process tracers use ptrace() rather than procfs
for debugging. ptrace() has a supserset of functionality available
via procfs and new debugging features are only added to ptrace().
While the two debugging services share some fields in struct proc,
they each use dedicated fields and separate code. This results in
extra complexity to support a feature that hasn't been enabled in the
default install for several years.
PR: 244939 (exp-run)
Reviewed by: kib, mjg (earlier version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D23837
The log for the failure contained errors like this:
| In file included from ${SRCTOP}/sys/mips/nlm/dev/net/xlpge.c:34:
| In file included from ${SRCTOP}/sys/sys/systm.h:44:
| In file included from ./machine/atomic.h:849:
| ${SRCTOP}/sys/sys/_atomic_subword.h:222:37: error: unknown type name 'u_long'; did you mean 'long'?
| atomic_testandset_acq_long(volatile u_long *p, u_int v)
| ^~~~~~
| long
And similar "unknown type name" errors for u_int, not recognizing bool as a type, etc.
This was caused by including <sys/param.h> too far down; move it up where it belongs.
While here, add a blank line after '__FBSDID()', in keeping with convention.
Reviewed by: emaste
Sponsored by: Panasas
Differential Revision: https://reviews.freebsd.org/D24242
- The linked list of cryptoini structures used in session
initialization is replaced with a new flat structure: struct
crypto_session_params. This session includes a new mode to define
how the other fields should be interpreted. Available modes
include:
- COMPRESS (for compression/decompression)
- CIPHER (for simply encryption/decryption)
- DIGEST (computing and verifying digests)
- AEAD (combined auth and encryption such as AES-GCM and AES-CCM)
- ETA (combined auth and encryption using encrypt-then-authenticate)
Additional modes could be added in the future (e.g. if we wanted to
support TLS MtE for AES-CBC in the kernel we could add a new mode
for that. TLS modes might also affect how AAD is interpreted, etc.)
The flat structure also includes the key lengths and algorithms as
before. However, code doesn't have to walk the linked list and
switch on the algorithm to determine which key is the auth key vs
encryption key. The 'csp_auth_*' fields are always used for auth
keys and settings and 'csp_cipher_*' for cipher. (Compression
algorithms are stored in csp_cipher_alg.)
- Drivers no longer register a list of supported algorithms. This
doesn't quite work when you factor in modes (e.g. a driver might
support both AES-CBC and SHA2-256-HMAC separately but not combined
for ETA). Instead, a new 'crypto_probesession' method has been
added to the kobj interface for symmteric crypto drivers. This
method returns a negative value on success (similar to how
device_probe works) and the crypto framework uses this value to pick
the "best" driver. There are three constants for hardware
(e.g. ccr), accelerated software (e.g. aesni), and plain software
(cryptosoft) that give preference in that order. One effect of this
is that if you request only hardware when creating a new session,
you will no longer get a session using accelerated software.
Another effect is that the default setting to disallow software
crypto via /dev/crypto now disables accelerated software.
Once a driver is chosen, 'crypto_newsession' is invoked as before.
- Crypto operations are now solely described by the flat 'cryptop'
structure. The linked list of descriptors has been removed.
A separate enum has been added to describe the type of data buffer
in use instead of using CRYPTO_F_* flags to make it easier to add
more types in the future if needed (e.g. wired userspace buffers for
zero-copy). It will also make it easier to re-introduce separate
input and output buffers (in-kernel TLS would benefit from this).
Try to make the flags related to IV handling less insane:
- CRYPTO_F_IV_SEPARATE means that the IV is stored in the 'crp_iv'
member of the operation structure. If this flag is not set, the
IV is stored in the data buffer at the 'crp_iv_start' offset.
- CRYPTO_F_IV_GENERATE means that a random IV should be generated
and stored into the data buffer. This cannot be used with
CRYPTO_F_IV_SEPARATE.
If a consumer wants to deal with explicit vs implicit IVs, etc. it
can always generate the IV however it needs and store partial IVs in
the buffer and the full IV/nonce in crp_iv and set
CRYPTO_F_IV_SEPARATE.
The layout of the buffer is now described via fields in cryptop.
crp_aad_start and crp_aad_length define the boundaries of any AAD.
Previously with GCM and CCM you defined an auth crd with this range,
but for ETA your auth crd had to span both the AAD and plaintext
(and they had to be adjacent).
crp_payload_start and crp_payload_length define the boundaries of
the plaintext/ciphertext. Modes that only do a single operation
(COMPRESS, CIPHER, DIGEST) should only use this region and leave the
AAD region empty.
If a digest is present (or should be generated), it's starting
location is marked by crp_digest_start.
Instead of using the CRD_F_ENCRYPT flag to determine the direction
of the operation, cryptop now includes an 'op' field defining the
operation to perform. For digests I've added a new VERIFY digest
mode which assumes a digest is present in the input and fails the
request with EBADMSG if it doesn't match the internally-computed
digest. GCM and CCM already assumed this, and the new AEAD mode
requires this for decryption. The new ETA mode now also requires
this for decryption, so IPsec and GELI no longer do their own
authentication verification. Simple DIGEST operations can also do
this, though there are no in-tree consumers.
To eventually support some refcounting to close races, the session
cookie is now passed to crypto_getop() and clients should no longer
set crp_sesssion directly.
- Assymteric crypto operation structures should be allocated via
crypto_getkreq() and freed via crypto_freekreq(). This permits the
crypto layer to track open asym requests and close races with a
driver trying to unregister while asym requests are in flight.
- crypto_copyback, crypto_copydata, crypto_apply, and
crypto_contiguous_subsegment now accept the 'crp' object as the
first parameter instead of individual members. This makes it easier
to deal with different buffer types in the future as well as
separate input and output buffers. It's also simpler for driver
writers to use.
- bus_dmamap_load_crp() loads a DMA mapping for a crypto buffer.
This understands the various types of buffers so that drivers that
use DMA do not have to be aware of different buffer types.
- Helper routines now exist to build an auth context for HMAC IPAD
and OPAD. This reduces some duplicated work among drivers.
- Key buffers are now treated as const throughout the framework and in
device drivers. However, session key buffers provided when a session
is created are expected to remain alive for the duration of the
session.
- GCM and CCM sessions now only specify a cipher algorithm and a cipher
key. The redundant auth information is not needed or used.
- For cryptosoft, split up the code a bit such that the 'process'
callback now invokes a function pointer in the session. This
function pointer is set based on the mode (in effect) though it
simplifies a few edge cases that would otherwise be in the switch in
'process'.
It does split up GCM vs CCM which I think is more readable even if there
is some duplication.
- I changed /dev/crypto to support GMAC requests using CRYPTO_AES_NIST_GMAC
as an auth algorithm and updated cryptocheck to work with it.
- Combined cipher and auth sessions via /dev/crypto now always use ETA
mode. The COP_F_CIPHER_FIRST flag is now a no-op that is ignored.
This was actually documented as being true in crypto(4) before, but
the code had not implemented this before I added the CIPHER_FIRST
flag.
- I have not yet updated /dev/crypto to be aware of explicit modes for
sessions. I will probably do that at some point in the future as well
as teach it about IV/nonce and tag lengths for AEAD so we can support
all of the NIST KAT tests for GCM and CCM.
- I've split up the exising crypto.9 manpage into several pages
of which many are written from scratch.
- I have converted all drivers and consumers in the tree and verified
that they compile, but I have not tested all of them. I have tested
the following drivers:
- cryptosoft
- aesni (AES only)
- blake2
- ccr
and the following consumers:
- cryptodev
- IPsec
- ktls_ocf
- GELI (lightly)
I have not tested the following:
- ccp
- aesni with sha
- hifn
- kgssapi_krb5
- ubsec
- padlock
- safe
- armv8_crypto (aarch64)
- glxsb (i386)
- sec (ppc)
- cesa (armv7)
- cryptocteon (mips64)
- nlmsec (mips64)
Discussed with: cem
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D23677
When I implemented MD DYNAMIC parsing, I was originally passing a
linker_file_t so that the MD code could relocate pointers.
However, it turns out this isn't even filled in until later, so it was
always 0.
Just pass the load base (ef->address) directly, as that's really the only
thing we were interested in in the first place.
This fixes a crash on RB800 where it was trying to write to an unmapped
address when updating the GOT.
Reviewed by: jhibbits
Sponsored by: Tag1 Consulting, Inc.
Differential Revision: https://reviews.freebsd.org/D24105
Clang does not recognize some of the GCC optimization options that are
used to compile ucore_app.bin. This is required to switch MIPS to
compile with LLVM by default (D23204).
Reviewed By: imp
Differential Revision: https://reviews.freebsd.org/D24092
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT
Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718
switch over to opt-in instead of opt-out for epoch.
Instead of IFF_NEEDSEPOCH, provide IFF_KNOWSEPOCH. If driver marks
itself with IFF_KNOWSEPOCH, then ether_input() would not enter epoch
when processing its packets.
Now this will create recursive entrance in epoch in >90% network
drivers, but will guarantee safeness of the transition.
Mark several tested drivers as IFF_KNOWSEPOCH.
Reviewed by: hselasky, jeff, bz, gallatin
Differential Revision: https://reviews.freebsd.org/D23674
After r355784 the td_oncpu field is no longer synchronized by the thread
lock, so the stack capture interrupt cannot be delievered precisely.
Fix this using a loop which drops the thread lock and restarts if the
wrong thread was sampled from the stack capture interrupt handler.
Change the implementation to use a regular interrupt instead of an NMI.
Now that we drop the thread lock, there is no advantage to the latter.
Simplify the KPIs. Remove stack_save_td_running() and add a return
value to stack_save_td(). On platforms that do not support stack
capture of running threads, stack_save_td() returns EOPNOTSUPP. If the
target thread is running in user mode, stack_save_td() returns EBUSY.
Reviewed by: kib
Reported by: mjg, pho
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23355
supposedly may call into ether_input() without network epoch.
They all need to be reviewed before 13.0-RELEASE. Some may need
be fixed. The flag is not planned to be used in the kernel for
a long time.
Instead of re-deriving the value of SR using logic similar to
exec_set_regs(), just inherit the value from the existing thread
similar to fork().
Reviewed by: brooks
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D23059
- Use ksi_addr directly as si_addr in the siginfo instead of the
'badvaddr' register.
- Remove a duplicate assignment of si_code.
- Use ksi_addr as the 4th argument to the old-style handler instead of
'badvaddr'.
Reviewed by: brooks, kevans
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D23013
This is a lock-based emulation of 64-bit atomics for kernel use, split off
from an earlier patch by jhibbits.
This is needed to unblock future improvements that reduce the need for
locking on 64-bit platforms by using atomic updates.
The implementation allows for future integration with userland atomic64,
but as that implies going through sysarch for every use, the current
status quo of userland doing its own locking may be for the best.
Submitted by: jhibbits (original patch), kevans (mips bits)
Reviewed by: jhibbits, jeff, kevans
Differential Revision: https://reviews.freebsd.org/D22976
off the stack, initialized to default values, and then filled in with
driver-specific values, all without having to worry about the numerous
other fields in the tag. The resulting template is then passed into
busdma and the normal opaque tag object created. See the man page for
details on how to initialize a template.
Templates do not support tag filters. Filters have been broken for many
years, and only existed for an ancient make/model of hardware that had a
quirky DMA engine. Instead of breaking the ABI/API and changing the
arugment signature of bus_dma_tag_create() to remove the filter arguments,
templates allow us to ignore them, and also significantly reduce the
complexity of creating and managing tags.
Reviewed by: imp, kib
Differential Revision: https://reviews.freebsd.org/D22906
r356043 missed a couple of references in machdep parts... arguably, these
lines could probably be dropped as the softc is likely still zero'd at this
point.
Pointy hat: kevans
So it turns out that sometime in the past I removed the GPIO bits here
and was going to move it into a module in order to save a little space.
However, it turns out that was a mistake on this particular AP - it
uses a pair of GPIO lines to control the two receive LNAs on the 2GHz
radio and without them enabled the radio is a LOT DEAF.
With this re-introduced (and some replacement userland tools to save
space, *cough* cpio/libarchive) I can actually use these chipsets
again as a 2G station. Without the LNA the AP was seeing a per-radio
RSSI upstairs here of around 3-5dB, with the LNA on it's around 15dB,
more than enough to actually use wifi upstairs and also in line with
the other Atheros / Intel devices I have up here.
Big oopsie to Adrian. Big, big oopsie.
fix an assert violation introduced in r355784. Without this spinlock_exit()
may see owepreempt and switch before reducing the spinlock count. amd64
had been optimized to do a single critical enter/exit regardless of the
number of spinlocks which avoided the problem and this optimization had
not been applied elsewhere.
Reported by: emaste
Suggested by: rlibby
Discussed with: jhb, rlibby
Tested by: manu (arm64)
Don't hold the scheduler lock while doing context switches. Instead we
unlock after selecting the new thread and switch within a spinlock
section leaving interrupts and preemption disabled to prevent local
concurrency. This means that mi_switch() is entered with the thread
locked but returns without. This dramatically simplifies scheduler
locking because we will not hold the schedlock while spinning on
blocked lock in switch.
This change has not been made to 4BSD but in principle it would be
more straightforward.
Discussed with: markj
Reviewed by: kib
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D22778