In 1998, I was briefly given remote access to an Ardent Titan vector computer. IIRC, I was told that it had two 33 MHz MIPS CPUs controlling a vector unit with 8192-word vectors. The intent was to try to implement bitslice DES on it, as part of John the Ripper. Unfortunately, I did not get to the actual task until it was too late - I was told that the system had died. However, I did try the system out briefly and saved some files from it, which I think are of historical value (which is why I am writing this).
Downloadable files:
And here they are right on the wiki:
This XORs a lot of data, which is relevant to the task mentioned above.
#define N 10000

int x[N], y[N], z[N];

int main()
{
	int i, j;

	for (i = 0; i < N; i++) {
		x[i] = i;
		y[i] = i + 1;
		z[i] = i + 2;
	}

	for (j = 0; j < N; j++, x[j] += z[j])
		for (i = 0; i < N; i++)
			z[i] = x[i] ^ y[i];

	for (i = 0; i < N; i++)
		if (i % (N / 10) == 0) printf("%d\n", z[i]);

	return 0;
}
Vectorized Results From File vx.c
Origin -- Line 9
Line Stmt Time Program
* * 3 $$ds = 32;
* * 6 if ($$B1 < 64 & $$B1 >= 2) $$ds = $$B1>>1;
9 * 5 DO PARALLEL ($$ip = 0; $$ip != 9999; $$ip += $$ds) {
9 * 11 $$rp = MIN(9999, $$ip - 1 + $$ds);
* * 9 $$vl = $$rp - $$ip + 1;
9 * 6 DO VECTOR ($$I1 = $$ip; $$I1 != $$rp; $$I1++) {
10 6 16 x[$$I1] = $$I1;
11 7 19 y[$$I1] = $$I1 + 1;
12 8 19 z[$$I1] = $$I1 + 2;
}
}
No directives were found.
--------------------------------------
Vectorized Results From File vx.c
Origin -- Line 15
Line Stmt Time Program
* 21 3 $$B2 = 10000;
15 * 4 for ($$I1 = 0; $$I1 != 9999; $$I1++) {
* * 3 $$ds = 32;
* * 6 if ($$B2 < 64 & $$B2 >= 2) $$ds = $$B2>>1;
16 * 5 DO PARALLEL ($$ip = 0; $$ip != 9999; $$ip += $$ds) {
16 * 11 $$rp = MIN(9999, $$ip - 1 + $$ds);
* * 9 $$vl = $$rp - $$ip + 1;
16 * 6 DO VECTOR ($$I2 = $$ip; $$I2 != $$rp; $$I2++) {
17 24 44 z[$$I2] = x[$$I2] ^ y[$$I2];
}
}
15 33 46 x[1 + $$I1] = x[1 + $$I1] + z[1 + $$I1];
}
No directives were found.
--------------------------------------
Vectorized Results From File vx.c
Origin -- Line 19
Line Stmt Time Program
19 44 4 while (i < 10000)
{
19 45 6 $$B1 = $$B1 + 1;
20 46 10 if ((i % 1000) == 0)
{
20 47 18 printf("%d\n", z[i]);
}
19 48 4 $$4 = i;
19 49 6 i = $$4 + 1;
}
Loop was not analyzed for the following reasons:
1) This loop contains 1 function calls.
This is MIPS with some instructions using the vector unit - these have “v” in their names and in the register names:
# -S output
	.text
	.comm x,40000
	.comm y,40000
	.comm z,40000
	.ltcomm $$103,4
	.ltcomm $$102,4
	.set noat
	.set noreorder
	.set nofpuwait
	.globl main
main:
	fsw
	move $t6, $0
	sw $t6, <vlength>
	addiu $sp, $sp, 0xffffff98
	addiu $t6, $0, 0x1
	addiu $t7, $0, 0x20
	addiu $t5, $0, 0x2
	addiu $t4, $0, 0x139
	sw $t4, 84($sp)
	sw $t5, 80($sp)
	sw $t7, 72($sp)
	sw $t6, 76($sp)
	sw $s1, 92($sp)
	sw $s0, 96($sp)
	sw $ra, 100($sp)
	addiu $a0, $0, 0x139
	jal _parbegin
	nop
	fsw
	lw $t5, 76($sp)
	lw $t6, 80($sp)
	sw $t5, $vl0.0.1
	sw $t6, $vl0.0
	la $t7, _index
	lw $t5, 72($sp)
	lw $t4, 84($sp)
	la $t9, $$102
	fst $f1, 0($t9)
	la $t9, $$103
	fst $f0, 0($t9)
	csabdw
	nop
	nop
	nop
	nop
$L1:
	sw $t7, <SemAddr>
	lw $t3, <SemValue>
	nop
	subu $t6, $0, $t3
	beq $t3, $0, $L2
	nop
	subu $t6, $t4, $t6
	mult $t6, $t5
	mflo $t6
	addiu $t3, $t6, 0xffffffff
	addu $t3, $t3, $t5
	slti $t9, $t3, 0x270f
	bne $t9, $0, $L3
	nop
	addiu $t3, $0, 0x270f
$L3:
	subu $t3, $t3, $t6
	addiu $t3, $t3, 0x1
	sw $t3, <vlength>
	addiu $t9, $0, 0x4020
	sw $t9, <reg_a>	# <reg_a> <= $v1.1
	addiu $t9, $0, 0x4a00
	sw $t9, 0x5c40($0)	# fvlda $v1.1, 0($t9)
	sw $t6, $vl0.0.2
	ivadd $v0.1, $v1.1, [$v0.0.2]
	sll $t6, $t6, 2
	la $t1, x
	addu $t1, $t1, $t6
	addiu $t9, $0, 0x20
	sw $t9, <reg_d>	# <reg_d> <= $v0.1
	sw $t1, 0x5e40($0)	# fvst $v0.1, 0($t1)
	la $t2, y
	addu $t2, $t2, $t6
	ivadd $v3.1, $v0.1, [$v0.0.1]
	ori $t9,$0,0xc020
	sw $t9, <reg_d>	# <reg_d> <= $v3.1
	sw $t2, 0x5e40($0)	# fvst $v3.1, 0($t2)
	la $t3, z
	addu $t3, $t3, $t6
	ivadd $v2.1, $v0.1, [$v0.0]
	ori $t9,$0,0x8020
	sw $t9, <reg_d>	# <reg_d> <= $v2.1
	sw $t3, 0x5e40($0)	# fvst $v2.1, 0($t3)
	j $L1
	nop
$L2:
	jal _barrier
	nop
	fsw
	move $t7, $0
	sw $t7, <vlength>
	la $t7, x+4
	la $t6, z+4
	move $t5, $0
	sw $t5, 68($sp)
	sw $t6, 64($sp)
	sw $t7, 60($sp)
	csd
	nop
	nop
	nop
	nop
$L4:
	addiu $a0, $0, 0x139
	jal _parbegin
	nop
	fsw
	la $t7, _index
	csabdw
	nop
	nop
	nop
	nop
$L5:
	sw $t7, <SemAddr>
	lw $t5, <SemValue>
	nop
	subu $t6, $0, $t5
	beq $t5, $0, $L6
	nop
	addiu $t9, $0, 0x139
	subu $t6, $t9, $t6
	sll $t6, $t6, 5
	addiu $t5, $t6, 0xffffffff
	addiu $t5, $t5, 0x20
	slti $t9, $t5, 0x270f
	bne $t9, $0, $L7
	nop
	addiu $t5, $0, 0x270f
$L7:
	subu $t5, $t5, $t6
	sll $t6, $t6, 2
	addiu $t5, $t5, 0x1
	sw $t5, <vlength>
	la $t3, x
	addu $t3, $t3, $t6
	addiu $t9, $0, 0x20
	sw $t9, <reg_b>	# <reg_b> <= $v0.1
	sw $t3, 0x5d40($0)	# fvldb $v0.1, 0($t3)
	la $t4, y
	addu $t4, $t4, $t6
	ori $t9,$0,0x8020
	sw $t9, <reg_a>	# <reg_a> <= $v2.1
	sw $t4, 0x5c40($0)	# fvlda $v2.1, 0($t4)
	la $t5, z
	addu $t5, $t5, $t6
	fsub $f31, $f31, $f31
	fsub $f31, $f31, $f31
	lvxor $v1.1, $v0.1, $v2.1
	addiu $t9, $0, 0x4020
	sw $t9, <reg_d>	# <reg_d> <= $v1.1
	sw $t5, 0x5e40($0)	# fvst $v1.1, 0($t5)
	j $L5
	nop
$L6:
	jal _barrier
	nop
	fsw
	lw $t7, 60($sp)
	lw $t6, 64($sp)
	lw $t5, 0($t7)
	lw $t4, 0($t6)
	addiu $t6, $t6, 0x4
	addu $t5, $t5, $t4
	sw $t5, 0($t7)
	addiu $t7, $t7, 0x4
	sw $t7, 60($sp)
	lw $t7, 68($sp)
	sw $t6, 64($sp)
	addiu $t7, $t7, 0x1
	csd
	nop
	nop
	nop
	nop
	slti $t9, $t7, 0x2710
	bne $t9, $0, $L4
	sw $t7, 68($sp)
	move $s1, $0
	la $s0, z
	addiu $t7, $0, 0x3e8
$L8:
	div $0, $s1, $t7
	mfhi $t6
	nop
	nop
	bne $t6, $0, $L9
	sll $a1, $s1, 2
	addu $a1, $s0, $a1
	lw $a1, 0($a1)
	la $a0, $$5$12
	jal printf
	nop
	fsw
$L9:
	addiu $s1, $s1, 0x1
	slti $t8, $s1, 0x2710
	bne $t8, $0, $L8
	addiu $t7, $0, 0x3e8
	lw $ra, 100($sp)
	lw $s1, 92($sp)
	lw $s0, 96($sp)
	addiu $v0, $0, 0x0
	j $ra
	addiu $sp, $sp, 0x68
	nop
	nop
	.data
$$5$12:
	.word 0x25640a00	# 627313152 % d nl nul
	.word 0
	.ident
	.word 0x76782e63	# 1987587683 v x . c
	.word 0x3a613765	# 979449701 : a 7 e
	.word 0x39366130	# 959865136 9 6 a 0
	.word 0x360a0000	# 906625024 6 nl nul nul
	.word 0x76782e63	# 1987587683 v x . c
	.word 0x204f5054	# 542068820 sp O P T
	.word 0x494f4e53	# 1229934163 I O N S
	.word 0x20303030	# 540028976 sp 0 0 0
	.word 0x30303030	# 808464432 0 0 0 0
	.word 0x30303030	# 808464432 0 0 0 0
	.word 0x30303030	# 808464432 0 0 0 0
	.word 0x30303039	# 808464441 0 0 0 9
	.word 0x38363830	# 943077424 8 6 8 0
	.word 0x30613130	# 811675952 0 a 1 0
	.word 0x30323532	# 808596786 0 2 5 2
	.word 0x30303030	# 808464432 0 0 0 0
	.word 0x31303830	# 825243696 1 0 8 0
	.word 0x30303030	# 808464432 0 0 0 0
	.word 0x30303030	# 808464432 0 0 0 0
	.word 0x30303030	# 808464432 0 0 0 0
	.word 0x30303030	# 808464432 0 0 0 0
	.word 0x300a0000	# 805961728 0 nl nul nul
# end of assembler output
The cc(1) and as(1) man pages, the way I happened to save them (formatted right on the Ardent Titan system):
CC(1) (C Programming Language Utilities) CC(1)
NAME
cc - C compiler
SYNOPSIS
cc [ options ] [ files ] [ options ] [ files ]
DESCRIPTION
The cc command is an interface to the Titan 1500/3000
Compilation System. The compilation tools consist of a
preprocessor, compiler, beautifier, assembler, and link
editor. The cc command processes the supplied options and
then executes the various tools with the proper arguments.
The cc command accepts several types of files as arguments:
Files whose names end with .c are taken to be C source
programs and may be preprocessed, compiled, optimized,
assembled, and link edited. The compilation process may be
stopped after the completion of any pass if the appropriate
options are supplied. If the compilation process runs
through the assembler then an object program is produced and
is left in the file whose name is that of the source with .o
substituted for .c. However, the .o file is normally
deleted if a single C program is compiled and then
immediately link edited. In the same way, files whose names
end in .s are taken to be assembly source programs, and may
be assembled and link edited; and files whose names end in
.i are taken to be preprocessed C source programs and may be
compiled, optimized, assembled and link edited. Files whose
names do not end in .c, .s or .i are handed to the link
editor.
Since the cc command usually creates files in the current
directory during the compilation process, it is necessary to
run the cc command in a directory in which a file can be
created.
The following options are interpreted by cc:
-c Suppress the link editing phase of the compilation, and
do not remove any produced object files.
-Dname
Define name to have the value of 1, to the
preprocessor.
-Dname=val
Define name to have the value of val, to the
preprocessor.
-E Run only cpp(1) on the named C programs, and send the
result to the standard output.
(printed 9/2/92) Kubota Pacific Computer Inc. Page 1
-full_report
Produce a detailed vectorizer report.
-g Generate additional information needed for the use of
dbg(1). Force optimization level to zero.
-I Suppress the default searching for preprocessor
included files in /usr/include.
-Idir
Search for include files in dir.
-i Suppress the automatic production of #ident
information.
-inline
Instruct the compiler to enable function inlining.
-Npaths=name.in
Instruct the compiler to make use of the database of
functions listed in the catalog name.in as the source
for inlining.
-NW Suppress compiler warnings.
-n Suppress the standard C startup routine.
-O0 Turn off all optimizations.
-O1 Perform common subexpression elimination and
instruction scheduling. If nothing is specified, this
-O1 is the default setting of compiler optimization
level.
-O2 Perform -O1 and vectorization.
-O3 Perform -O2 and parallelization.
-O This is synonymous with -O1.
-o filename
Place the output into filename.
-P Run only cpp(1) on the named C programs and leave the
result in corresponding files suffixed .i. This option
is passed to cpp(1).
-p Generate code to profile the loaded program during
execution. (See prof(1) and mkprof(1).)
-ploop
Generate code that allows loops within a single routine
to be profiled separately.
-r Produce a relocatable output file.
-S Compile and do not assemble the named C programs, and
leave the assembler output in corresponding files
suffixed .s.
-safe=loops
Guarantee that all for loops within the program have
upper bounds that do not vary within the loop.
-safe=parms
Declare that input arguments do not have hidden
aliases.
-safe=ptrs
Declare that pointers do not have hidden aliases.
-subcheck
Produce code to check at runtime to ensure that each
array element accessed is actually part of the
appropriate array. However, at optimization level 02
and higher, this option ignores the vector mask. This
means that some operations may generate subscriptranges
that are not actually in the code.
-Uname
Undefine name.
-V Print version information.
-v Generate more messages tracking the progress of the
compilation.
-vector_c
This is equivalent to specifying -safe=parms
-safe=loops.
-vreport
Invoke the vector reporting facility and tell the user
what vectorization has been done. A detailed listing
is provided for each loop nest and includes suggestions
for achieving better performance.
-vsummary
Invoke the vector reporting facility and tell the user
what vectorization has been done. Print out what
statements are and are not vectorized in each loop.
This output is in Fortran-like notation.
-w Suppress warning messages during compilation.
-43 Use this option to get 4.3 BSD header files and
libraries.
The cc command recognizes -B hhhhhhh, -D hhhhhhh, -esym, -L,
-Ldir, -ltag, -m, -N, -ofilename, -opct, -p, -r, -s, -T
hhhhhhh, -t, -uname, and -yname and passes these options and
their arguments directly to the loader. See the manual
pages for cpp(1) and ld(1) for descriptions.
Other arguments are taken to be C compatible object
programs, typically produced by an earlier cc run, or
perhaps libraries of C compatible routines and are passed
directly to the link editor. These programs, together with
the results of any compilations specified, are link edited
(in the order given) to produce an executable program with
name a.out.
FILES
file.c C source file
file.o object file
file.s assembly language file
a.out link edited output
/lib/crt0.o start-up routine
TMPDIR/* temporary files
/lib/cpp preprocessor, cpp(1)
/bin/as assembler, as(1)
/bin/ld link editor, ld(1)
/lib/libc.a standard C library
TMPDIR is usually /usr/tmp but can be redefined by setting
the environment variable TMPDIR [see tempnam() in
tmpnam(3S)].
SEE ALSO
as(1), dbg(1), ld(1), cpp(1), mkprof(1), prof(1)
Kernighan, B. W., and Ritchie, D. M., The C Programming
Language, Prentice-Hall, 1978. Harbison, S. P., and Steele,
G. L. Jr., C: A Reference Manual, Prentice-Hall, Second
Edition, 1987.
NOTES
By default, the return value from a compiled C program is
completely random. The only two guaranteed ways to return a
specific value is to explicitly call exit(2) or to leave the
function main() with a ``return expression;'' construct.
(printed 9/2/92) Kubota Pacific Computer Inc. Page 4
AS(1) (Software Generation System Utilities) AS(1)
NAME
as - common assembler
SYNOPSIS
as [options] [input] output
DESCRIPTION
Note: This program differs from most UNIX assemblers because
it may be used as a filter.
The as command assembles the named file. The following
flags may be specified in any order:
-i filename Specifies a name for the input filename.
(However, the input is still stdin .)
-o objfile Put the output of the assembly in objfile. By
default, the output file name is formed by
removing the .s suffix, if there is one, from
the input file name specified with the -i option
and appending a .o suffix. If there is no -i
option, the default output file is named a.out.
-S Produce on the standard output a disassembled
version of the input.
-V Write the version number of the assembler being
run on the standard error output.
SEE ALSO
cc(1), ld(1), nm(1), strip(1), tmpnam(3S), a.out(4)
NOTES
Note: Writing assembly code that correctly uses the floating
point or vector units involves subtle issues of
synchronization that are best left to the compiler. Use of
this option is strongly discouraged.
Wherever possible, the assembler should be accessed through
a compilation system interface program such as cc(1). In
this case, the C preprocessor is run, giving a rudimentary
macro and include capability.
(printed 9/2/92) Kubota Pacific Computer Inc. Page 1