In 1998, I was briefly given remote access to an Ardent Titan vector computer. IIRC, I was told that it had two 33 MHz MIPS CPUs controlling a vector unit with 8192-word vectors. The intent was to try to implement bitslice DES on it, as part of John the Ripper. Unfortunately, I did not approach the actual task until it was too late - I was told that the system had died. However, I did happen to try it out briefly and I saved some files from it, which I think are of historical value (which is why I am writing this).
Downloadable files:
And here they are right on the wiki:
This XORs a lot of data, which is relevant to the task mentioned above.
#define N 10000

int x[N], y[N], z[N];

int main()
{
	int i, j;

	for (i = 0; i < N; i++) {
		x[i] = i;
		y[i] = i + 1;
		z[i] = i + 2;
	}

	for (j = 0; j < N; j++, x[j] += z[j])
	for (i = 0; i < N; i++)
		z[i] = x[i] ^ y[i];

	for (i = 0; i < N; i++)
	if (i % (N / 10) == 0) printf("%d\n", z[i]);

	return 0;
}
Vectorized Results From File vx.c
Origin -- Line 9

Line  Stmt  Time  Program
   *     *     3  $$ds = 32;
   *     *     6  if ($$B1 < 64 & $$B1 >= 2) $$ds = $$B1>>1;
   9     *     5  DO PARALLEL ($$ip = 0; $$ip != 9999; $$ip += $$ds) {
   9     *    11    $$rp = MIN(9999, $$ip - 1 + $$ds);
   *     *     9    $$vl = $$rp - $$ip + 1;
   9     *     6    DO VECTOR ($$I1 = $$ip; $$I1 != $$rp; $$I1++) {
  10     6    16      x[$$I1] = $$I1;
  11     7    19      y[$$I1] = $$I1 + 1;
  12     8    19      z[$$I1] = $$I1 + 2;
                    }
                  }
No directives were found.
--------------------------------------
Vectorized Results From File vx.c
Origin -- Line 15

Line  Stmt  Time  Program
   *    21     3  $$B2 = 10000;
  15     *     4  for ($$I1 = 0; $$I1 != 9999; $$I1++) {
   *     *     3    $$ds = 32;
   *     *     6    if ($$B2 < 64 & $$B2 >= 2) $$ds = $$B2>>1;
  16     *     5    DO PARALLEL ($$ip = 0; $$ip != 9999; $$ip += $$ds) {
  16     *    11      $$rp = MIN(9999, $$ip - 1 + $$ds);
   *     *     9      $$vl = $$rp - $$ip + 1;
  16     *     6      DO VECTOR ($$I2 = $$ip; $$I2 != $$rp; $$I2++) {
  17    24    44        z[$$I2] = x[$$I2] ^ y[$$I2];
                      }
                    }
  15    33    46    x[1 + $$I1] = x[1 + $$I1] + z[1 + $$I1];
                  }
No directives were found.
--------------------------------------
Vectorized Results From File vx.c
Origin -- Line 19

Line  Stmt  Time  Program
  19    44     4  while (i < 10000) {
  19    45     6    $$B1 = $$B1 + 1;
  20    46    10    if ((i % 1000) == 0) {
  20    47    18      printf("%d\n", z[i]);
                    }
  19    48     4    $$4 = i;
  19    49     6    i = $$4 + 1;
                  }
Loop was not analyzed for the following reasons:
  1) This loop contains 1 function calls.
This is MIPS with some instructions using the vector unit - these have "v" in their names and in the register names:
# -S output
	.text
	.comm x,40000
	.comm y,40000
	.comm z,40000
	.ltcomm $$103,4
	.ltcomm $$102,4
	.set noat
	.set noreorder
	.set nofpuwait
	.globl main
main:
	fsw
	move $t6, $0
	sw $t6, <vlength>
	addiu $sp, $sp, 0xffffff98
	addiu $t6, $0, 0x1
	addiu $t7, $0, 0x20
	addiu $t5, $0, 0x2
	addiu $t4, $0, 0x139
	sw $t4, 84($sp)
	sw $t5, 80($sp)
	sw $t7, 72($sp)
	sw $t6, 76($sp)
	sw $s1, 92($sp)
	sw $s0, 96($sp)
	sw $ra, 100($sp)
	addiu $a0, $0, 0x139
	jal _parbegin
	nop
	fsw
	lw $t5, 76($sp)
	lw $t6, 80($sp)
	sw $t5, $vl0.0.1
	sw $t6, $vl0.0
	la $t7, _index
	lw $t5, 72($sp)
	lw $t4, 84($sp)
	la $t9, $$102
	fst $f1, 0($t9)
	la $t9, $$103
	fst $f0, 0($t9)
	csabdw
	nop
	nop
	nop
	nop
$L1:
	sw $t7, <SemAddr>
	lw $t3, <SemValue>
	nop
	subu $t6, $0, $t3
	beq $t3, $0, $L2
	nop
	subu $t6, $t4, $t6
	mult $t6, $t5
	mflo $t6
	addiu $t3, $t6, 0xffffffff
	addu $t3, $t3, $t5
	slti $t9, $t3, 0x270f
	bne $t9, $0, $L3
	nop
	addiu $t3, $0, 0x270f
$L3:
	subu $t3, $t3, $t6
	addiu $t3, $t3, 0x1
	sw $t3, <vlength>
	addiu $t9, $0, 0x4020
	sw $t9, <reg_a>	# <reg_a> <= $v1.1
	addiu $t9, $0, 0x4a00
	sw $t9, 0x5c40($0)	# fvlda $v1.1, 0($t9)
	sw $t6, $vl0.0.2
	ivadd $v0.1, $v1.1, [$v0.0.2]
	sll $t6, $t6, 2
	la $t1, x
	addu $t1, $t1, $t6
	addiu $t9, $0, 0x20
	sw $t9, <reg_d>	# <reg_d> <= $v0.1
	sw $t1, 0x5e40($0)	# fvst $v0.1, 0($t1)
	la $t2, y
	addu $t2, $t2, $t6
	ivadd $v3.1, $v0.1, [$v0.0.1]
	ori $t9,$0,0xc020
	sw $t9, <reg_d>	# <reg_d> <= $v3.1
	sw $t2, 0x5e40($0)	# fvst $v3.1, 0($t2)
	la $t3, z
	addu $t3, $t3, $t6
	ivadd $v2.1, $v0.1, [$v0.0]
	ori $t9,$0,0x8020
	sw $t9, <reg_d>	# <reg_d> <= $v2.1
	sw $t3, 0x5e40($0)	# fvst $v2.1, 0($t3)
	j $L1
	nop
$L2:
	jal _barrier
	nop
	fsw
	move $t7, $0
	sw $t7, <vlength>
	la $t7, x+4
	la $t6, z+4
	move $t5, $0
	sw $t5, 68($sp)
	sw $t6, 64($sp)
	sw $t7, 60($sp)
	csd
	nop
	nop
	nop
	nop
$L4:
	addiu $a0, $0, 0x139
	jal _parbegin
	nop
	fsw
	la $t7, _index
	csabdw
	nop
	nop
	nop
	nop
$L5:
	sw $t7, <SemAddr>
	lw $t5, <SemValue>
	nop
	subu $t6, $0, $t5
	beq $t5, $0, $L6
	nop
	addiu $t9, $0, 0x139
	subu $t6, $t9, $t6
	sll $t6, $t6, 5
	addiu $t5, $t6, 0xffffffff
	addiu $t5, $t5, 0x20
	slti $t9, $t5, 0x270f
	bne $t9, $0, $L7
	nop
	addiu $t5, $0, 0x270f
$L7:
	subu $t5, $t5, $t6
	sll $t6, $t6, 2
	addiu $t5, $t5, 0x1
	sw $t5, <vlength>
	la $t3, x
	addu $t3, $t3, $t6
	addiu $t9, $0, 0x20
	sw $t9, <reg_b>	# <reg_b> <= $v0.1
	sw $t3, 0x5d40($0)	# fvldb $v0.1, 0($t3)
	la $t4, y
	addu $t4, $t4, $t6
	ori $t9,$0,0x8020
	sw $t9, <reg_a>	# <reg_a> <= $v2.1
	sw $t4, 0x5c40($0)	# fvlda $v2.1, 0($t4)
	la $t5, z
	addu $t5, $t5, $t6
	fsub $f31, $f31, $f31
	fsub $f31, $f31, $f31
	lvxor $v1.1, $v0.1, $v2.1
	addiu $t9, $0, 0x4020
	sw $t9, <reg_d>	# <reg_d> <= $v1.1
	sw $t5, 0x5e40($0)	# fvst $v1.1, 0($t5)
	j $L5
	nop
$L6:
	jal _barrier
	nop
	fsw
	lw $t7, 60($sp)
	lw $t6, 64($sp)
	lw $t5, 0($t7)
	lw $t4, 0($t6)
	addiu $t6, $t6, 0x4
	addu $t5, $t5, $t4
	sw $t5, 0($t7)
	addiu $t7, $t7, 0x4
	sw $t7, 60($sp)
	lw $t7, 68($sp)
	sw $t6, 64($sp)
	addiu $t7, $t7, 0x1
	csd
	nop
	nop
	nop
	nop
	slti $t9, $t7, 0x2710
	bne $t9, $0, $L4
	sw $t7, 68($sp)
	move $s1, $0
	la $s0, z
	addiu $t7, $0, 0x3e8
$L8:
	div $0, $s1, $t7
	mfhi $t6
	nop
	nop
	bne $t6, $0, $L9
	sll $a1, $s1, 2
	addu $a1, $s0, $a1
	lw $a1, 0($a1)
	la $a0, $$5$12
	jal printf
	nop
	fsw
$L9:
	addiu $s1, $s1, 0x1
	slti $t8, $s1, 0x2710
	bne $t8, $0, $L8
	addiu $t7, $0, 0x3e8
	lw $ra, 100($sp)
	lw $s1, 92($sp)
	lw $s0, 96($sp)
	addiu $v0, $0, 0x0
	j $ra
	addiu $sp, $sp, 0x68
	nop
	nop
	.data
$$5$12:
	.word 0x25640a00	# 627313152 % d nl nul
	.word 0
	.ident
	.word 0x76782e63	# 1987587683 v x . c
	.word 0x3a613765	# 979449701 : a 7 e
	.word 0x39366130	# 959865136 9 6 a 0
	.word 0x360a0000	# 906625024 6 nl nul nul
	.word 0x76782e63	# 1987587683 v x . c
	.word 0x204f5054	# 542068820 sp O P T
	.word 0x494f4e53	# 1229934163 I O N S
	.word 0x20303030	# 540028976 sp 0 0 0
	.word 0x30303030	# 808464432 0 0 0 0
	.word 0x30303030	# 808464432 0 0 0 0
	.word 0x30303030	# 808464432 0 0 0 0
	.word 0x30303039	# 808464441 0 0 0 9
	.word 0x38363830	# 943077424 8 6 8 0
	.word 0x30613130	# 811675952 0 a 1 0
	.word 0x30323532	# 808596786 0 2 5 2
	.word 0x30303030	# 808464432 0 0 0 0
	.word 0x31303830	# 825243696 1 0 8 0
	.word 0x30303030	# 808464432 0 0 0 0
	.word 0x30303030	# 808464432 0 0 0 0
	.word 0x30303030	# 808464432 0 0 0 0
	.word 0x30303030	# 808464432 0 0 0 0
	.word 0x300a0000	# 805961728 0 nl nul nul
# end of assembler output
The man pages for cc(1) and as(1), the way I happened to save them (formatted right on the Ardent Titan system):
CC(1)            (C Programming Language Utilities)            CC(1)

NAME
     cc - C compiler

SYNOPSIS
     cc [ options ] [ files ] [ options ] [ files ]

DESCRIPTION
     The cc command is an interface to the Titan 1500/3000
     Compilation System. The compilation tools consist of a
     preprocessor, compiler, beautifier, assembler, and link
     editor. The cc command processes the supplied options and
     then executes the various tools with the proper arguments.
     The cc command accepts several types of files as arguments:

     Files whose names end with .c are taken to be C source
     programs and may be preprocessed, compiled, optimized,
     assembled, and link edited. The compilation process may be
     stopped after the completion of any pass if the appropriate
     options are supplied. If the compilation process runs through
     the assembler then an object program is produced and is left
     in the file whose name is that of the source with .o
     substituted for .c. However, the .o file is normally deleted
     if a single C program is compiled and then immediately link
     edited. In the same way, files whose names end in .s are taken
     to be assembly source programs, and may be assembled and link
     edited; and files whose names end in .i are taken to be
     preprocessed C source programs and may be compiled, optimized,
     assembled and link edited. Files whose names do not end in .c,
     .s or .i are handed to the link editor.

     Since the cc command usually creates files in the current
     directory during the compilation process, it is necessary to
     run the cc command in a directory in which a file can be
     created.

     The following options are interpreted by cc:

     -c
          Suppress the link editing phase of the compilation, and
          do not remove any produced object files.

     -Dname
          Define name to have the value of 1, to the preprocessor.

     -Dname=val
          Define name to have the value of val, to the
          preprocessor.

     -E
          Run only cpp(1) on the named C programs, and send the
          result to the standard output.

(printed 9/2/92)      Kubota Pacific Computer Inc.           Page 1

CC(1)            (C Programming Language Utilities)            CC(1)

     -full_report
          Produce a detailed vectorizer report.

     -g
          Generate additional information needed for the use of
          dbg(1). Force optimization level to zero.

     -I
          Suppress the default searching for preprocessor included
          files in /usr/include.

     -Idir
          Search for include files in dir.

     -i
          Suppress the automatic production of #ident information.

     -inline
          Instruct the compiler to enable function inlining.

     -Npaths=name.in
          Instruct the compiler to make use of the database of
          functions listed in the catalog name.in as the source for
          inlining.

     -NW
          Suppress compiler warnings.

     -n
          Suppress the standard C startup routine.

     -O0
          Turn off all optimizations.

     -O1
          Perform common subexpression elimination and instruction
          scheduling. If nothing is specified, this -O1 is the
          default setting of compiler optimization level.

     -O2
          Perform -O1 and vectorization.

     -O3
          Perform -O2 and parallelization.

     -O
          This is synonymous with -O1.

     -o filename
          Place the output into filename.

     -P
          Run only cpp(1) on the named C programs and leave the
          result in corresponding files suffixed .i. This option is
          passed to cpp(1).

     -p
          Generate code to profile the loaded program during
          execution. (See prof(1) and mkprof(1).)

     -ploop
          Generate code that allows loops within a single routine

(printed 9/2/92)      Kubota Pacific Computer Inc.           Page 2

CC(1)            (C Programming Language Utilities)            CC(1)

          to be profiled separately.

     -r
          Produce a relocatable output file.

     -S
          Compile and do not assemble the named C programs, and
          leave the assembler output in corresponding files
          suffixed .s.

     -safe=loops
          Guarantee that all for loops within the program have
          upper bounds that do not vary within the loop.

     -safe=parms
          Declare that input arguments do not have hidden aliases.

     -safe=ptrs
          Declare that pointers do not have hidden aliases.

     -subcheck
          Produce code to check at runtime to ensure that each
          array element accessed is actually part of the
          appropriate array. However, at optimization level 02 and
          higher, this option ignores the vector mask. This means
          that some operations may generate subscriptranges that
          are not actually in the code.

     -Uname
          Undefine name.

     -V
          Print version information.

     -v
          Generate more messages tracking the progress of the
          compilation.

     -vector_c
          This is equivalent to specifying -safe=parms -safe=loops.

     -vreport
          Invoke the vector reporting facility and tell the user
          what vectorization has been done. A detailed listing is
          provided for each loop nest and includes suggestions for
          achieving better performance.

     -vsummary
          Invoke the vector reporting facility and tell the user
          what vectorization has been done. Print out what
          statements are and are not vectorized in each loop. This
          output is in Fortran-like notation.

     -w
          Suppress warning messages during compilation.

(printed 9/2/92)      Kubota Pacific Computer Inc.           Page 3

CC(1)            (C Programming Language Utilities)            CC(1)

     -43
          Use this option to get 4.3 BSD header files and
          libraries.

     The cc command recognizes -B hhhhhhh, -D hhhhhhh, -esym, -L,
     -Ldir, -ltag, -m, -N, -ofilename, -opct, -p, -r, -s,
     -T hhhhhhh, -t, -uname, and -yname and passes these options
     and their arguments directly to the loader. See the manual
     pages for cpp(1) and ld(1) for descriptions.

     Other arguments are taken to be C compatible object programs,
     typically produced by an earlier cc run, or perhaps libraries
     of C compatible routines and are passed directly to the link
     editor. These programs, together with the results of any
     compilations specified, are link edited (in the order given)
     to produce an executable program with name a.out.

FILES
     file.c          C source file
     file.o          object file
     file.s          assembly language file
     a.out           link edited output
     /lib/crt0.o     start-up routine
     TMPDIR/*        temporary files
     /lib/cpp        preprocessor, cpp(1)
     /bin/as         assembler, as(1)
     /bin/ld         link editor, ld(1)
     /lib/libc.a     standard C library

     TMPDIR is usually /usr/tmp but can be redefined by setting the
     environment variable TMPDIR [see tempnam() in tmpnam(3S)].

SEE ALSO
     as(1), dbg(1), ld(1), cpp(1), mkprof(1), prof(1)
     Kernighan, B. W., and Ritchie, D. M., The C Programming
     Language, Prentice-Hall, 1978.
     Harbison, S. P., and Steele, G. L. Jr., C: A Reference Manual,
     Prentice-Hall, Second Edition, 1987.

NOTES
     By default, the return value from a compiled C program is
     completely random. The only two guaranteed ways to return a
     specific value is to explicitly call exit(2) or to leave the
     function main() with a ``return expression;'' construct.

(printed 9/2/92)      Kubota Pacific Computer Inc.           Page 4
AS(1)           (Software Generation System Utilities)          AS(1)

NAME
     as - common assembler

SYNOPSIS
     as [options] [input] output

DESCRIPTION
     Note: This program differs from most UNIX assemblers because
     it may be used as a filter.

     The as command assembles the named file. The following flags
     may be specified in any order:

     -i filename
          Specifies a name for the input filename. (However, the
          input is still stdin .)

     -o objfile
          Put the output of the assembly in objfile. By default,
          the output file name is formed by removing the .s suffix,
          if there is one, from the input file name specified with
          the -i option and appending a .o suffix. If there is no
          -i option, the default output file is named a.out.

     -S
          Produce on the standard output a disassembled version of
          the input.

     -V
          Write the version number of the assembler being run on
          the standard error output.

SEE ALSO
     cc(1), ld(1), nm(1), strip(1), tmpnam(3S), a.out(4)

NOTES
     Note: Writing assembly code that correctly uses the floating
     point or vector units involves subtle issues of
     synchronization that are best left to the compiler. Use of
     this option is strongly discouraged.

     Wherever possible, the assembler should be accessed through a
     compilation system interface program such as cc(1). In this
     case, the C preprocessor is run, giving a rudimentary macro
     and include capability.

(printed 9/2/92)      Kubota Pacific Computer Inc.           Page 1