Alexandre Savard | 1b09e31 | 2012-08-07 20:33:29 -0400

Note that the UNROLL option makes the 'inner' des loop unroll all 16 rounds
instead of the default 4.
RISC1 and RISC2 are two alternatives for the inner loop and
PTR means to use pointer arithmetic instead of array indexing.

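To make the UNROLL and PTR options concrete, here is a minimal sketch on a toy loop. It is not the library's code: toy_round() is a made-up stand-in for a cipher round, and all function names are hypothetical. It only shows the three loop shapes the options choose between: 4 rounds per iteration with array indexing, all 16 rounds written out, and 4 rounds per iteration walking the key schedule with a pointer.

```c
#include <stdint.h>

#define ROUNDS 16

/* Toy stand-in for one cipher round. This is NOT the DES round function. */
static uint32_t toy_round(uint32_t x, uint32_t k)
{
    x ^= k;
    return (x << 3) | (x >> 29);   /* rotate left by 3 */
}

/* Default shape: 4 rounds per loop iteration, array indexing. */
uint32_t rounds_unroll4(uint32_t x, const uint32_t ks[ROUNDS])
{
    int i;
    for (i = 0; i < ROUNDS; i += 4) {
        x = toy_round(x, ks[i]);
        x = toy_round(x, ks[i + 1]);
        x = toy_round(x, ks[i + 2]);
        x = toy_round(x, ks[i + 3]);
    }
    return x;
}

/* UNROLL shape: all 16 rounds written out, no loop counter at all. */
uint32_t rounds_unroll16(uint32_t x, const uint32_t ks[ROUNDS])
{
#define R(n) x = toy_round(x, ks[n])
    R(0);  R(1);  R(2);  R(3);
    R(4);  R(5);  R(6);  R(7);
    R(8);  R(9);  R(10); R(11);
    R(12); R(13); R(14); R(15);
#undef R
    return x;
}

/* PTR shape: same 4-round body, but stepping a pointer through the keys. */
uint32_t rounds_unroll4_ptr(uint32_t x, const uint32_t *ks)
{
    const uint32_t *end = ks + ROUNDS;
    while (ks < end) {
        x = toy_round(x, *ks++);
        x = toy_round(x, *ks++);
        x = toy_round(x, *ks++);
        x = toy_round(x, *ks++);
    }
    return x;
}
```

All three functions compute the same value; which shape runs fastest depends on the compiler and CPU, which is what the table below records.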
OS - CPU - compiler - options                                   des/s      k/s
FreeBSD - Pentium Pro 200mhz - gcc 2.7.2.2 - assembler        577,000  4620k/s
IRIX 6.2 - R10000 195mhz - cc (-O3 -n32) - UNROLL RISC2 PTR   496,000  3968k/s
solaris 2.5.1 usparc 167mhz?? - SC4.0 - UNROLL RISC1 PTR [1]  459,400  3672k/s
FreeBSD - Pentium Pro 200mhz - gcc 2.7.2.2 - UNROLL RISC1     433,000  3468k/s
solaris 2.5.1 usparc 167mhz?? - gcc 2.7.2 - UNROLL            380,000  3041k/s
linux - pentium 100mhz - gcc 2.7.0 - assembler                281,000  2250k/s
NT 4.0 - pentium 100mhz - VC 4.2 - assembler                  281,000  2250k/s
AIX 4.1? - PPC604 100mhz - cc - UNROLL                        275,000  2200k/s
IRIX 5.3 - R4400 200mhz - gcc 2.6.3 - UNROLL RISC2 PTR        235,300  1882k/s
IRIX 5.3 - R4400 200mhz - cc - UNROLL RISC2 PTR               233,700  1869k/s
NT 4.0 - pentium 100mhz - VC 4.2 - UNROLL RISC1 PTR           191,000  1528k/s
DEC Alpha 165mhz?? - cc - RISC2 PTR [2]                       181,000  1448k/s
linux - pentium 100mhz - gcc 2.7.0 - UNROLL RISC1 PTR         158,500  1268k/s
HPUX 10 - 9000/887 - cc - UNROLL [3]                          148,000  1190k/s
solaris 2.5.1 - sparc 10 50mhz - gcc 2.7.2 - UNROLL           123,600   989k/s
IRIX 5.3 - R4000 100mhz - cc - UNROLL RISC2 PTR               101,000   808k/s
DGUX - 88100 50mhz(?) - gcc 2.6.3 - UNROLL                     81,000   648k/s
solaris 2.4 486 50mhz - gcc 2.6.3 - assembler                  65,000   522k/s
HPUX 10 - 9000/887 - k&r cc (default compiler) - UNROLL PTR    76,000   608k/s
solaris 2.4 486 50mhz - gcc 2.6.3 - UNROLL RISC2               43,500   344k/s
AIX - old slow one :-) - cc -                                  39,000   312k/s

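The two numeric columns are roughly related by the DES block size: each DES operation processes one 64-bit (8-byte) block. The sketch below converts one to the other, assuming "k" means 1000 bytes; that reading is an assumption, and the function name is hypothetical. Several rows in the table differ from this conversion by a few k/s, so the two figures may have been measured separately.

```c
/* DES works on 64-bit blocks, i.e. 8 bytes per operation. */
#define DES_BLOCK_BYTES 8L

/* Convert DES operations per second to k/s.
 * Assumption: "k" is 1000 bytes (not stated in the table). */
long des_per_sec_to_k_per_sec(long des_per_sec)
{
    return des_per_sec * DES_BLOCK_BYTES / 1000L;
}
```

For example, the IRIX 6.2 row lists 496,000 des/s, and 496,000 * 8 / 1000 = 3968, matching its 3968k/s entry.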
Notes.
[1] For the ultra sparc, SunC 4.0
    cc -xtarget=ultra -xarch=v8plus -Xa -xO5, running 'des_opts'
    gives a speed of 344,000 des/s while 'speed' gives 459,000 des/s.
    I'll record the higher since it is coming from the library but it
    is all rather weird.
[2] Similar to the ultra sparc ([1]), 181,000 for 'des_opts' vs 175,000.
[3] I was unable to get access to this machine when it was not heavily loaded.
    As such, my timing program was never able to get more than 30% of the CPU.
    This would cause the program to give much lower speed numbers because
    it would be 'fighting' to stay in the cache with the other CPU-burning
    processes.
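One way to spot the situation in note [3] is to compare the CPU time a benchmark consumed with the wall-clock time that elapsed. The sketch below is an illustration only, not the timing code used for these numbers; both function names are hypothetical. It uses the standard clock() and time() calls, and time() only has one-second resolution, so the work being measured should run for many seconds.

```c
#include <time.h>

/* Percentage of wall-clock time that was spent on the CPU.
 * Returns -1.0 when the wall-clock interval is too short to measure. */
double cpu_share_percent(double cpu_seconds, double wall_seconds)
{
    if (wall_seconds <= 0.0)
        return -1.0;
    return 100.0 * cpu_seconds / wall_seconds;
}

/* Run work() once and report the CPU share it obtained. */
double measure_cpu_share(void (*work)(void))
{
    time_t wall_start = time(NULL);
    clock_t cpu_start = clock();

    work();

    clock_t cpu_end = clock();
    time_t wall_end = time(NULL);

    return cpu_share_percent((double)(cpu_end - cpu_start) / CLOCKS_PER_SEC,
                             difftime(wall_end, wall_start));
}
```

A result near 100 suggests the machine was otherwise idle. A result near 30, as in note [3], means the figure should not simply be scaled up, since cache contention with other processes also slows the time that was obtained.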