Slide 12 of 60

qwerty

The objective here is to have the components of this system share the underlying cache memory system so they can communicate efficiently. There are multiple CPU cores, some sort of integrated GPU, and components for media processing. The CPU cores are efficient at executing threads of (potentially complex) control, whereas the GPU is optimized for data-parallel processing. Finally, the media processor is optimized for image and video processing.

yarrow2

I'm curious about the placement of the shared LLC. It seems to be perfectly positioned among the 4 CPU cores, but relatively far from the GPU. Is there a reason for this? Judging from the next few slides on heterogeneous systems, it seems like we can get away with having discrete GPUs, so I hypothesize that GPUs can access the cache less frequently and focus more on optimizing for arithmetic intensity, but I'm not completely sure.
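
To make "arithmetic intensity" concrete, here is a rough sketch that estimates it as floating-point operations per byte moved to or from memory. The function names and the traffic model are illustrative assumptions (4-byte floats, each operand touched exactly once, no cache effects), not measurements of the chip on the slide.

```python
def arithmetic_intensity(flops: float, bytes_moved: float) -> float:
    """FLOPs performed per byte transferred to or from memory."""
    return flops / bytes_moved


def saxpy_intensity(n: int, bytes_per_elem: int = 4) -> float:
    # y[i] = a * x[i] + y[i]: 2 FLOPs per element.
    # Assumed traffic: read x[i], read y[i], write y[i].
    return arithmetic_intensity(2 * n, 3 * n * bytes_per_elem)


def matmul_intensity(n: int, bytes_per_elem: int = 4) -> float:
    # n x n matrix multiply: about 2 * n^3 FLOPs.
    # Assumed ideal traffic: read A and B once, write C once.
    return arithmetic_intensity(2 * n**3, 3 * n * n * bytes_per_elem)


if __name__ == "__main__":
    print(f"saxpy:  {saxpy_intensity(1_000_000):.3f} FLOPs/byte")
    print(f"matmul: {matmul_intensity(1024):.1f} FLOPs/byte")
```

Under these assumptions, saxpy stays at 1/6 FLOPs per byte regardless of size, while the ideal matrix multiply grows as n/6. A workload with higher intensity does more math per memory access, so it is less sensitive to how far the processor sits from the cache or memory.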
