Slide 23 of 60

michzrrr

The idea here is that a major inefficiency we have not addressed in the class so far is the overhead of running these parallel computations. E.g., even with SIMD, only about 5-10% of the work goes toward the actual computation. Ideally, we could hardcode our operations directly in hardware and skip most of this overhead.
