How does a gang of ISPC program instances maintain the state of local variables for each of its instances? Does this mean that a SIMD instruction would use more registers to hold that information for its instances than a scalar instruction would?
Note that in a regular scalar program, a local variable is stored in a register. For example, a float variable named value might get stored in R0.
Compiling the ISPC program down to a vector program is similar. Now the programCount unique values of the variable value, one for each program instance in the gang, are stored in a vector register of length programCount. Let's call that register V0. So the i'th element of V0 is the value of the local variable value for the i'th program instance in the gang.
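To make that mapping concrete, here is a minimal sketch in Python that emulates the layout rather than compiling anything. A plain list stands in for the vector register, the helper names `make_vector_register` and `value_for_instance` are hypothetical, and the gang width of 8 is an assumption (the real programCount depends on the compilation target).

```python
# Emulation only: a Python list stands in for a vector register.
PROGRAM_COUNT = 8  # assumed gang width (hypothetical; target-dependent in ISPC)


def make_vector_register(per_instance_values):
    """Pack one value per program instance into a single 'vector register'."""
    assert len(per_instance_values) == PROGRAM_COUNT
    return list(per_instance_values)


def value_for_instance(register, program_index):
    """Lane i of the register holds the local variable for instance i."""
    return register[program_index]


# Scalar program: one local variable, held in one scalar register.
R0 = 3.0

# Gang of instances: still one register for `value`, but PROGRAM_COUNT lanes wide.
# For illustration, each instance's `value` is its program index times 1.5.
V0 = make_vector_register([i * 1.5 for i in range(PROGRAM_COUNT)])
```

In this sketch, `value_for_instance(V0, 2)` reads the copy of value that belongs to the third program instance. The point of the emulation is that the gang's copies of value share one wide register instead of occupying programCount separate scalar registers.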
@Julie I believe you're correct; ISPC parallelizes your code, performing arithmetic operations simultaneously.
During lecture, it was mentioned that ISPC is not implemented using threads. Instead, at compile time, all programCount instances are turned into vector instructions (i.e. SIMD).
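A rough way to picture that transformation is sketched below, again as a Python emulation. The per-instance computation `x * 2.0 + 1.0` and every function name here are hypothetical, the gang width of 8 is an assumption, and the list comprehensions only stand in for single SIMD instructions (Python itself runs them one element at a time).

```python
PROGRAM_COUNT = 8  # assumed gang width (hypothetical)


def broadcast(scalar):
    """Copy one scalar into every lane of a 'vector register'."""
    return [scalar] * PROGRAM_COUNT


def vec_mul(a, b):
    """Stand-in for one SIMD multiply covering all lanes at once."""
    return [x * y for x, y in zip(a, b)]


def vec_add(a, b):
    """Stand-in for one SIMD add covering all lanes at once."""
    return [x + y for x, y in zip(a, b)]


def per_instance_program(x):
    """What the programmer writes: the logic for a single program instance."""
    return x * 2.0 + 1.0


def gang_program(xs):
    """What the compiled code resembles: one instruction stream for the whole gang.

    No threads are created; two vector operations serve every instance.
    """
    doubled = vec_mul(xs, broadcast(2.0))
    return vec_add(doubled, broadcast(1.0))
```

Calling `gang_program` on a list of PROGRAM_COUNT inputs gives the same results as calling `per_instance_program` once per instance, which is the sense in which the gang is an abstraction over a single vectorized instruction stream.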
What distinguishes local variables from variables available to all program instances in a gang?
Are the program instances running ISPC code concurrently, or are they running it in parallel? My understanding is that if you have, say, 8 ISPC program instances and 8 ALUs in a core, these 8 program instances would be running in parallel.