Having sufficient parallel work to utilize all available execution units seems to fit certain problem domains better than others. If your problem can't be parallelized, are you out of luck?
@apappu, I think slide 79 might help clarify what @kayvon means by "modern parallel processors". Essentially, there are multiple cores, each running its own independent instruction streams, and the instructions themselves are SIMD. In the example on slide 79, he had 16 cores, each with 4 threads (4 execution contexts / instruction streams) and 8 ALUs (8-wide SIMD capability), so 16 × 4 × 8 = 512 independent pieces of work are needed to utilize all available execution units.
In other words, I think "modern processors" implies SIMD processing.
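As a quick sanity check of the slide 79 arithmetic, here is a small Python sketch. The function and parameter names are my own (hypothetical, not from the lecture); it only multiplies the three factors mentioned above.

```python
def independent_work_needed(cores: int, threads_per_core: int, simd_width: int) -> int:
    """Pieces of independent work needed to fill every SIMD lane
    of every execution context on every core.

    Hypothetical helper for illustration only; the names are assumptions.
    """
    return cores * threads_per_core * simd_width


# Numbers from the slide 79 example: 16 cores, 4 threads per core, 8-wide SIMD.
total = independent_work_needed(cores=16, threads_per_core=4, simd_width=8)
print(total)  # 512
```

Dropping any one factor (say, running scalar code, so `simd_width=1`) shrinks the amount of work the chip can usefully take on, which is why all three kinds of parallelism matter here.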
I'm confused here about what point 1 means by execution units per core. Don't execution units per core concern SIMD (point 2)? Or is this referring to superscalar parallelism? If so, does point 1 mean something like: "have at least n threads of work (where n = number of cores), and have enough independent operations in each thread that we can maximally utilize the core's superscalar abilities"?