Previous | Next --- Slide 65 of 86
Back to Lecture Thumbnails
sidhikab

Tensor cores implement 4x4 matrix multiplication through specialized hardware. This is done through a circuit that is designed to do 4x4 or these days 8x4 matrix multiplication on NVIDIA chips. It can be thought of as a small spatial architecture.

amohamdy

How common are these complex instructions in specialized hardware?

Please log in to leave a comment.