On the V100 chip, how are we actually able to get 163,840 pieces of data to the processors at the same time? I'd imagine that with that many cores, there's bound to be some sort of asymmetry in when things are executed.