Marcel Moolenaar 4630415a47 Improve SMP support:
o  Allocate a VHPT per CPU. The VHPT is a hash table that the CPU
   uses to look up translations it can't find in the TLB. As such,
   the VHPT serves as a level 1 cache (the TLB being a level 0 cache)
   and best results are obtained when it's not shared between CPUs.
   The collision chain (i.e. the hash bucket) is shared between CPUs,
   as all buckets together constitute our collection of PTEs. To
   achieve this, the collision chain does not point to the first PTE
   in the list anymore, but to a hash bucket head structure. The
   head structure contains the pointer to the first PTE in the list,
   as well as a mutex to lock the bucket. Thus, each bucket is locked
   independently of each other. With at least 1024 buckets in the VHPT,
   this provides for sufficiently finei-grained locking to make the
   ssolution scalable to large SMP machines.
o  Add synchronisation to the lazy FP context switching. We do this
   with a seperate per-thread lock. On SMP machines the lazy high FP
   context switching without synchronisation caused inconsistent
   state, which resulted in a panic. Since the use of the high FP
   registers is not common, it's possible that races exist. The ia64
   package build has proven to be a good stress test, so this will
   get plenty of exercise in the near future.
o  Don't use the local ID of the processor we want to send the IPI to
   as the argument to ipi_send(). use the struct pcpu pointer instead.
   The reason for this is that IPI delivery is unreliable. It has been
   observed that sending an IPI to a CPU causes it to receive a stray
   external interrupt. As such, we need a way to make the delivery
   reliable. The intended solution is to queue requests in the target
   CPU's per-CPU structure and use a single IPI to inform the CPU that
   there's a new entry in the queue. If that IPI gets lost, the CPU
   can check it's queue at any convenient time (such as for each
   clock interrupt). This also allows us to send requests to a CPU
   without interrupting it, if such would be beneficial.

With these changes SMP is almost working. There are still some random
process crashes and the machine can hang due to having the IPI lost
that deals with the high FP context switch.

The overhead of introducing the hash bucket head structure results
in a performance degradation of about 1% for UP (extra pointer
indirection). This is surprisingly small and is offset by gaining
reasonably/good scalable SMP support.
2005-08-06 20:28:19 +00:00

44 lines
923 B
C

/*
* $FreeBSD$
*/
#ifndef _MACHINE_SMP_H_
#define _MACHINE_SMP_H_
#ifdef _KERNEL
/*
* Interprocessor interrupts for SMP. The following values are indices
* into the IPI vector table. The SAL gives us the vector used for AP
* wake-up. We base the other vectors on that. Keep IPI_AP_WAKEUP at
* index 0. See sal.c for details.
*/
/* Architecture specific IPIs. */
#define IPI_AP_WAKEUP 0
#define IPI_HIGH_FP 1
#define IPI_MCA_CMCV 2
#define IPI_MCA_RENDEZ 3
#define IPI_TEST 4
/* Machine independent IPIs. */
#define IPI_AST 5
#define IPI_RENDEZVOUS 6
#define IPI_STOP 7
#define IPI_PREEMPT 8
#define IPI_COUNT 9
#ifndef LOCORE
struct pcpu;
extern int ipi_vector[];
void ipi_all(int ipi);
void ipi_all_but_self(int ipi);
void ipi_selected(cpumask_t cpus, int ipi);
void ipi_self(int ipi);
void ipi_send(struct pcpu *, int ipi);
#endif /* !LOCORE */
#endif /* _KERNEL */
#endif /* !_MACHINE_SMP_H */