Two questions here:
Why is the entire CUDA program not SPMD? (Here it says that the single thread block level is SPMD.) In my mind, SPMD means "multiple threads running this program." Is the caveat that all the threads have to be concurrently scheduled for us to call the program SPMD?
Also, what are the subtle but notable differences between these models of execution that this slide hints at?