This directory contains mpn functions for various SPARC chips. Code that
runs only on version 8 SPARC implementations is in the v8 subdirectory.

RELEVANT OPTIMIZATION ISSUES

Load and Store timing

On most early SPARC implementations, the ST instruction takes multiple
cycles, while an STD takes just a single cycle more than an ST. For the CPUs
in SPARCstation I and II, the times are 3 and 4 cycles, respectively.
Therefore, combining two ST instructions into an STD when possible is a
significant optimization.

Later SPARC implementations have single-cycle ST.
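
The SPARCstation I and II figures above can be turned into a rough
store-cost estimate. The sketch below is only an illustration: the function
names are hypothetical, only store cycles are counted, and every pair of
adjacent words is assumed to be doubleword aligned so that STD can be used.

```c
#include <stddef.h>

/* Cycle costs quoted above for the CPUs in SPARCstation I and II. */
enum { ST_CYCLES = 3, STD_CYCLES = 4 };

/* Hypothetical helper: cycles spent storing n words using ST only. */
unsigned long
store_cycles_st (size_t n)
{
  return (unsigned long) n * ST_CYCLES;
}

/* Hypothetical helper: cycles spent storing n words when each adjacent
   pair is combined into one STD, with a single ST for an odd leftover
   word.  Assumes every pair is doubleword aligned, as STD requires.  */
unsigned long
store_cycles_std (size_t n)
{
  return (unsigned long) (n / 2) * STD_CYCLES
         + (unsigned long) (n % 2) * ST_CYCLES;
}
```

Under these assumptions, storing 8 words costs 24 cycles with ST alone and
16 cycles with STD pairs, a saving of one third of the store time.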

On SuperSPARC, we can perform just one memory instruction per cycle, even
though up to two integer instructions can be executed in its pipeline. For
programs that perform so many memory operations that there are not enough
non-memory operations to issue in parallel with all memory operations, using
LDD and STD when possible helps, since each moves two words with a single
memory instruction.
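
To see why LDD and STD help when memory instructions are the bottleneck, it
is enough to count the memory instructions a loop must issue. The sketch
below assumes a hypothetical loop that loads two source limbs and stores one
result limb per 32-bit limb (the access pattern of an add or subtract loop).
It ignores alignment, loop overhead and all other pipeline constraints, and
the function names are illustrative only.

```c
#include <stddef.h>

/* Assumed access pattern: 2 loads + 1 store per 32-bit limb. */
enum { MEM_ACCESSES_PER_LIMB = 3 };

/* Memory instructions needed for n limbs using word LD/ST only. */
unsigned long
mem_insns_word (size_t n)
{
  return (unsigned long) n * MEM_ACCESSES_PER_LIMB;
}

/* Memory instructions needed for n limbs when limbs are handled in
   pairs with LDD/STD.  An odd leftover limb still costs one memory
   instruction per access, so it is counted like a full pair.  */
unsigned long
mem_insns_doubleword (size_t n)
{
  return (unsigned long) (n / 2 + n % 2) * MEM_ACCESSES_PER_LIMB;
}
```

With at most one memory instruction per cycle, each count is a lower bound
on the cycles for the loop. For 100 limbs the bound drops from 300 cycles
(3 cycles/limb) to 150 cycles (1.5 cycles/limb) under the assumptions above.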

STATUS

1. On a SuperSPARC, mpn_lshift and mpn_rshift run at 3 cycles/limb, or 2.5
   cycles/limb asymptotically. We could optimize speed for special counts
   by using ADDXCC.

2. On a SuperSPARC, mpn_add_n and mpn_sub_n run at 2.5 cycles/limb, or 2
   cycles/limb asymptotically.

3. mpn_mul_1 runs at what is believed to be optimal speed.

4. On SuperSPARC, mpn_addmul_1 and mpn_submul_1 could both be improved by a
   cycle by avoiding one of the add instructions. See a29k/addmul_1.

The speed of the code for other SPARC implementations is uncertain.
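
STATUS item 1 mentions using ADDXCC for special shift counts. The text does
not say which counts are meant; the sketch below assumes a count of 1, where
a left shift is the same as adding a number to itself with carry
propagation. It is a portable C model with a hypothetical function name, not
the code in this directory, and it assumes 32-bit limbs stored low limb
first.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: shift the n-limb number at src left by one bit and
   write it to dst, by adding each limb to itself plus the carry from
   the limb below.  Returns the bit shifted out of the top limb.
   dst may be the same array as src.  */
uint32_t
lshift1_by_add (uint32_t *dst, const uint32_t *src, size_t n)
{
  uint32_t carry = 0;
  size_t i;

  for (i = 0; i < n; i++)
    {
      uint32_t limb = src[i];
      /* The sum an add-with-carry instruction would produce. */
      dst[i] = limb + limb + carry;
      /* The carry out of limb + limb + carry is the top bit of limb. */
      carry = limb >> 31;
    }
  return carry;
}
```

For example, the two-limb input {0x80000001, 0x00000001} becomes
{0x00000002, 0x00000003} with a returned carry of 0.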