foreign per-CPU pages in cpu_ipi_send() in order to get the module IDs
of the other CPUs can cause a page fault. If this happens when doing a
TLB shootdown while dealing with another page fault this causes a panic
due to the recursive page fault. As I don't spot other code that assumes
or requires that accessing foreign per-CPU pages must not page fault
solve this by adding a statically allocated (and therefore locked in the
kernel pages) array which establishes a FreeBSD CPU ID -> module ID
relation and use that in cpu_ipi_selected() (instead of statically
allocating the per-CPU pages which would just waste memory on say a dual
CPU machine as sun4u theoretically supports up to 128 CPUs or wasting
dTLB slots for the foreign per-CPU pages). [1]
- Fix a potential race in cpu_ipi_send(); as we don't serialize the access
to cpu_ipi_selected() between MI and MD use (only MI-MI and MD-MD) we
might catch the NACK bit caused by sending another IPI. Solve this by
checking the NACK bit in the contents of the interrupt dispatch status
reg read while interrupts were still turned off instead of reading that
reg anew after interrupts were turned on again. This is also what the
CPU docs suggest to do.
- Add a workaround for the SpitFire erratum #54 bug (affecting interrupt
dispatch). While public info regarding what this CPU bug actually causes
is not available testing shows that with the workaround in place it's
less likely to get a "couldn't send ipi" panic, it doesn't solve these
panics entirely though. [2]
Reported by: kris [1]
Some clue from: kmacy [1]
Info from: Linux, OpenSolaris [2]
Additional testing by: kris
MFC after: 3 days