I suppose this answers my question from last lecture! Very interesting how the design here has been architected to allow for very fast context switching.
In practice, is the decision between many threads with few registers (this slide) or few threads with many registers (next slide) made at a silicon level or can it be configured dynamically e.g. by the CPU or operating system?
kayvonf
In general, CPU designs tend to have a fixed number of hardware threads per core, determined at chip design time. For most Intel CPUs, that number is 2.
GPUs adopt designs where the number of threads is determined at runtime by system software. For example, let's say a modern GPU core has 16 KB of on-chip storage for execution contexts, and say a program for the GPU is compiled to use 256 bytes of registers per thread. Then the core could support concurrent execution of 16 KB / 256 bytes = 64 threads running that compiled binary.
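To make the arithmetic concrete, here is a minimal sketch of that calculation. The function name and the 512-byte case are made up for illustration; real GPUs have additional limits (such as a maximum thread count per core) that this ignores.

```python
def max_concurrent_threads(context_storage_bytes: int, bytes_per_thread: int) -> int:
    """Number of thread execution contexts that fit in a core's on-chip storage.

    Hypothetical helper: models only the register-storage limit described above.
    """
    if bytes_per_thread <= 0:
        raise ValueError("bytes_per_thread must be positive")
    # Each thread needs its own copy of the registers, so divide and round down.
    return context_storage_bytes // bytes_per_thread


CONTEXT_STORAGE_BYTES = 16 * 1024  # 16 KB, the figure from the example above

# Program compiled to use 256 bytes of registers per thread.
print(max_concurrent_threads(CONTEXT_STORAGE_BYTES, 256))  # 64

# Assumed variation: a program that needs twice the registers per thread
# can keep only half as many threads resident.
print(max_concurrent_threads(CONTEXT_STORAGE_BYTES, 512))  # 32
```

This is the same trade-off as in the question: the storage is fixed in silicon, but how it gets divided between "many threads with few registers" and "few threads with many registers" depends on the compiled program.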