Visual computing tasks such as computational imaging, image/video understanding, generative AI, and real-time 3D graphics are key responsibilities of modern computer systems, ranging from sensor-rich smartphones and autonomous robots to large datacenters. These workloads demand exceptional system efficiency, and this course examines the key ideas, techniques, and challenges associated with designing the parallel, heterogeneous systems that accelerate visual computing applications. The course is intended both for systems students interested in architecting efficient graphics, image processing, and computer vision platforms (new hardware architectures as well as domain-optimized programming frameworks for those platforms) and for graphics, vision, and AI students who wish to understand throughput computing principles so they can design new algorithms that map efficiently to these machines.
Apr 02 | Discussion of modern visual computing applications, a design exercise
Apr 04 | Algorithms for taking raw sensor pixels to an RGB image: demosaicing, sharpening, correcting lens aberrations, multi-shot alignment/merging, image filtering, multi-scale processing with Gaussian and Laplacian pyramids, HDR (local tone mapping); see the pyramid sketch below
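A minimal numpy sketch of the Gaussian/Laplacian pyramid construction mentioned above. This is not course code: the 5-tap binomial kernel, helper names, and image size are my own choices for illustration. It builds both pyramids with a separable blur and checks that collapsing the Laplacian pyramid reproduces the input.

```python
import numpy as np

# 5-tap binomial kernel, a cheap standard approximation of a Gaussian.
K = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0

def blur(img):
    """Separable 5-tap blur with edge replication (illustrative, unoptimized)."""
    out = img.astype(np.float64)
    for axis in (0, 1):
        out = np.apply_along_axis(
            lambda v: np.convolve(np.pad(v, 2, mode="edge"), K, mode="valid"),
            axis, out)
    return out

def downsample(img):
    """Blur, then drop every other row and column."""
    return blur(img)[::2, ::2]

def upsample(img, shape):
    """Nearest-neighbor 2x upsample followed by a blur, cropped to `shape`."""
    up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)[:shape[0], :shape[1]]
    return blur(up)

def gaussian_pyramid(img, levels):
    pyr = [img.astype(np.float64)]
    for _ in range(levels - 1):
        pyr.append(downsample(pyr[-1]))
    return pyr

def laplacian_pyramid(img, levels):
    """Each level stores the detail lost between adjacent Gaussian levels."""
    g = gaussian_pyramid(img, levels)
    lap = [g[i] - upsample(g[i + 1], g[i].shape) for i in range(levels - 1)]
    lap.append(g[-1])  # the coarsest Gaussian level is kept as-is
    return lap

def collapse(lap):
    """Invert the Laplacian pyramid back into the full-resolution image."""
    img = lap[-1]
    for detail in reversed(lap[:-1]):
        img = upsample(img, detail.shape) + detail
    return img

if __name__ == "__main__":
    img = np.random.rand(128, 96)
    assert np.allclose(collapse(laplacian_pyramid(img, 4)), img)
```

Local tone mapping operators typically attenuate the coarse levels and boost the fine levels of such a pyramid before collapsing it.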
Apr 09 | The Frankencamera, modern camera APIs, advanced image analysis for photography (portrait mode, autofocus, etc.)
Apr 11 | Balancing locality, parallelism, and work; fusion and tiling; design of the Halide domain-specific language; automatically scheduling image processing pipelines; see the tiling/fusion sketch below
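The locality/parallelism/work trade-off can be made concrete by comparing a "compute everything, then consume it" schedule with a tiled, fused one for a two-stage blur. The numpy sketch below mimics by hand what a tiled producer/consumer schedule would generate; it is an illustration of the idea, not Halide code, and the tile size and function names are arbitrary.

```python
import numpy as np

def blur_x(img):
    """Stage 1: 3-tap horizontal blur with edge replication."""
    p = np.pad(img, ((0, 0), (1, 1)), mode="edge")
    return (p[:, :-2] + p[:, 1:-1] + p[:, 2:]) / 3.0

def blur_y(img):
    """Stage 2: 3-tap vertical blur with edge replication."""
    p = np.pad(img, ((1, 1), (0, 0)), mode="edge")
    return (p[:-2, :] + p[1:-1, :] + p[2:, :]) / 3.0

def two_stage_root(img):
    """'Compute-root' style schedule: materialize all of stage 1, then run
    stage 2. No redundant work, but the intermediate is image-sized, so the
    producer's output has left the cache before the consumer reads it."""
    return blur_y(blur_x(img))

def two_stage_tiled(img, tile=32):
    """Fused/tiled schedule: for each output tile, compute only the slice of
    stage 1 that tile needs (plus a one-pixel halo). A little stage-1 work is
    recomputed at tile borders in exchange for a cache-sized intermediate."""
    h, w = img.shape
    out = np.empty_like(img, dtype=np.float64)
    for y0 in range(0, h, tile):
        for x0 in range(0, w, tile):
            y1, x1 = min(y0 + tile, h), min(x0 + tile, w)
            # Bounds inference: stage 2 needs stage-1 rows [y0-1, y1], and
            # stage 1 needs input columns [x0-1, x1], clamped to the image.
            py0, py1 = max(y0 - 1, 0), min(y1 + 1, h)
            px0, px1 = max(x0 - 1, 0), min(x1 + 1, w)
            patch = img[py0:py1, px0:px1]
            # Replicate-pad only where the halo was clipped at the image border,
            # so the tile sees the same neighborhood as the full-image version.
            patch = np.pad(patch,
                           ((1 - (y0 - py0), 1 - (py1 - y1)),
                            (1 - (x0 - px0), 1 - (px1 - x1))), mode="edge")
            stage1 = (patch[:, :-2] + patch[:, 1:-1] + patch[:, 2:]) / 3.0
            out[y0:y1, x0:x1] = (stage1[:-2] + stage1[1:-1] + stage1[2:]) / 3.0
    return out

if __name__ == "__main__":
    img = np.random.rand(130, 97)   # deliberately not a multiple of the tile size
    assert np.allclose(two_stage_root(img), two_stage_tiled(img))
```

The tiles are independent, so the outer loop also exposes coarse-grained parallelism; choosing the tile size is exactly the kind of decision a scheduling language or autoscheduler makes.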
Apr 16 | Detailed look at Halide's scheduling algebra
Apr 18 | Data-layout optimizations, scheduling decisions, fusion optimizations, modern libraries (like CUTLASS); see the blocked-GEMM sketch below
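Loop blocking is the core locality idea that GEMM libraries such as CUTLASS apply, in hierarchical form, at the threadblock, warp, and instruction levels. The sketch below is a plain numpy illustration of cache blocking written for this syllabus, not CUTLASS code; the tile size is arbitrary.

```python
import numpy as np

def matmul_blocked(A, B, tile=64):
    """Cache-blocked GEMM: each (tile x tile) block of C is accumulated from
    products of A and B tiles, so every loaded tile is reused ~tile times
    before it is evicted. GPU GEMM libraries apply the same decomposition
    recursively down the memory hierarchy."""
    M, K = A.shape
    K2, N = B.shape
    assert K == K2
    C = np.zeros((M, N), dtype=np.result_type(A, B))
    for i0 in range(0, M, tile):
        for j0 in range(0, N, tile):
            acc = np.zeros((min(tile, M - i0), min(tile, N - j0)), dtype=C.dtype)
            for k0 in range(0, K, tile):
                # One A tile and one B tile stay "hot" for this whole inner step.
                a = A[i0:i0 + tile, k0:k0 + tile]
                b = B[k0:k0 + tile, j0:j0 + tile]
                acc += a @ b
            C[i0:i0 + tile, j0:j0 + tile] = acc
    return C

if __name__ == "__main__":
    A = np.random.rand(200, 150)
    B = np.random.rand(150, 130)
    assert np.allclose(matmul_blocked(A, B), A @ B)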
Apr 23 | GPUs, TPUs, special instructions for DNN evaluation (and their efficiency vs. custom ASICs), choice of precision in arithmetic, modern commercial DNN accelerators, flexibility vs. efficiency trade-offs; see the int8 quantization sketch below
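One way to see the arithmetic-precision trade-off is symmetric int8 quantization of a matrix-vector product with a wide (int32) accumulator, which is the general recipe behind low-precision accelerator datapaths. This is a simplified, self-contained sketch of that recipe, not any vendor's actual pipeline; the scaling scheme and sizes are my own choices.

```python
import numpy as np

def quantize_sym_int8(x):
    """Symmetric per-tensor quantization: one float scale, int8 values."""
    m = float(np.max(np.abs(x)))
    scale = m / 127.0 if m > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def int8_matvec(W, x):
    """Quantize weights and activations to int8, accumulate in int32, and
    rescale back to float: less precision per multiply, far less data moved."""
    Wq, w_scale = quantize_sym_int8(W)
    xq, x_scale = quantize_sym_int8(x)
    acc = Wq.astype(np.int32) @ xq.astype(np.int32)   # wide accumulator
    return acc.astype(np.float64) * (w_scale * x_scale)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W, x = rng.standard_normal((256, 512)), rng.standard_normal(512)
    ref = W @ x
    approx = int8_matvec(W, x)
    # The relative error for a well-scaled layer is typically a few percent.
    print(np.max(np.abs(approx - ref)) / np.max(np.abs(ref)))
```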
Apr 25 | The importance of predictable control in content creation. Techniques for inserting new forms of control into generative image synthesis, and the role of human-interpretable abstractions.
Apr 30 | Modern techniques for generating images efficiently with generative AI: Stable Diffusion, low-dimensional spaces, consistency matching; see the diffusion-sampling sketch below
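To make the cost of diffusion sampling concrete, here is a minimal DDPM-style ancestral sampling loop in numpy. The trained denoiser is replaced by the analytically optimal noise predictor for 1-D Gaussian toy data, so the loop runs and can be sanity-checked without any training; every constant and name here is illustrative. The point is the long chain of sequential denoiser calls, which is exactly the cost that low-dimensional latent spaces and consistency-style distillation attack.

```python
import numpy as np

# Minimal DDPM-style ancestral sampler (illustrative only). The denoiser is
# the analytically optimal noise predictor for 1-D data x0 ~ N(MU0, S0^2).
MU0, S0 = 2.0, 1.0
T = 1000
betas = np.linspace(1e-4, 0.02, T)        # standard linear beta schedule
alphas = 1.0 - betas
abar = np.cumprod(alphas)                 # alpha-bar_t

def eps_hat(x_t, t):
    """Optimal E[eps | x_t] for Gaussian data (stands in for a trained network)."""
    var_t = abar[t] * S0**2 + (1.0 - abar[t])   # variance of the noised marginal
    score = -(x_t - np.sqrt(abar[t]) * MU0) / var_t
    return -np.sqrt(1.0 - abar[t]) * score

def sample(n, rng):
    """Run the reverse process from pure noise down to t = 0: T sequential
    denoiser evaluations per batch of samples."""
    x = rng.standard_normal(n)                  # x_T ~ N(0, 1)
    for t in range(T - 1, -1, -1):
        mean = ((x - betas[t] / np.sqrt(1.0 - abar[t]) * eps_hat(x, t))
                / np.sqrt(alphas[t]))
        noise = rng.standard_normal(n) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise    # sigma_t = sqrt(beta_t)
    return x

if __name__ == "__main__":
    xs = sample(20000, np.random.default_rng(0))
    # The printed mean/std should be close to the data distribution (2.0, 1.0).
    print(xs.mean(), xs.std())
```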
May 02 | Video generation (like Sora), generating 3D content, virtual worlds, generating programs
May 07 | LLM-based problem-solving agents, systems and platforms for developing AI agents; see the agent-loop sketch below
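Most agent platforms share the same core loop: the model proposes either a tool call or a final answer, the system executes the tool, and the observation is appended to the context. The sketch below is a generic, self-contained illustration of that loop; `call_llm` and the tool registry are hypothetical stubs, not any specific framework's API.

```python
import json

# Hypothetical model call: a real platform would send `messages` to an LLM API
# and get back either a tool invocation or a final answer. Stubbed so it runs.
def call_llm(messages):
    if any(m["role"] == "tool" for m in messages):
        return json.dumps({"final": messages[-1]["content"]})
    return json.dumps({"tool": "calculator", "args": {"expr": "6 * 7"}})

# Toy tool registry; `calculator` evaluates arithmetic expressions only.
TOOLS = {"calculator": lambda args: str(eval(args["expr"], {"__builtins__": {}}))}

def run_agent(task, max_steps=8):
    """Generic decide -> act -> observe loop shared by most agent frameworks."""
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        decision = json.loads(call_llm(messages))                   # decide
        if "final" in decision:
            return decision["final"]
        observation = TOOLS[decision["tool"]](decision["args"])     # act
        messages.append({"role": "tool", "content": observation})   # observe
    return "step budget exhausted"

if __name__ == "__main__":
    print(run_agent("What is 6 * 7?"))   # prints 42
```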
May 09 | Training agents in virtual worlds, simulation engines for training agents, throughput-maximized engines, sim-to-real issues, hybrid RL-LLM systems; see the batched-simulation sketch below
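Throughput-oriented training engines step thousands of environment instances per call with batched array operations rather than one Python object per world. The numpy sketch below shows that structure with an invented 1-D point-mass environment; the dynamics, class name, and reward are made up purely for illustration.

```python
import numpy as np

class BatchedPointMassEnv:
    """Toy batched environment: N independent 1-D point masses must reach the
    origin. All worlds live in flat arrays (structure-of-arrays layout) and
    are advanced by a single vectorized update per step."""

    def __init__(self, num_worlds, rng):
        self.rng = rng
        self.pos = rng.uniform(-1.0, 1.0, num_worlds)
        self.vel = np.zeros(num_worlds)

    def step(self, actions, dt=0.05):
        """actions: one acceleration per world, clipped to [-1, 1]."""
        self.vel += dt * np.clip(actions, -1.0, 1.0)
        self.pos += dt * self.vel
        reward = -np.abs(self.pos)            # closer to the origin is better
        done = np.abs(self.pos) < 0.01
        # Finished episodes are reset in place so the batch never shrinks.
        self.pos[done] = self.rng.uniform(-1.0, 1.0, done.sum())
        self.vel[done] = 0.0
        return self.pos.copy(), reward, done

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    env = BatchedPointMassEnv(num_worlds=4096, rng=rng)
    for _ in range(100):
        # A trivial proportional controller standing in for a policy network.
        obs, reward, done = env.step(actions=-2.0 * env.pos - env.vel)
    print(np.abs(env.pos).mean())   # average distance to the goal after 100 steps
```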
May 14 | Discussion of high-throughput systems like Madrona and pixel-based systems like DeepMind's Genie
May 16 | Discussion: Data-Driven vs. Traditional Modeling-Driven World Simulation
An in-class debate about the viability of data-driven neural approaches to world simulation versus more traditional methods that model surfaces, materials, lighting, and physics with explicit representations.
May 21 | Guest Speaker: Learning to Play Counter-Strike
A guest lecture from David Durst, who has created bots for the game Counter-Strike that exhibit movement and teamwork patterns typical of skilled human players.
May 23 | H.264 video representation/encoding, parallel encoding, motivations for ASIC acceleration, ML-based compression methods, emerging opportunities for compression when machines, not humans, will observe most images; see the motion-estimation sketch below
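Block-based motion estimation is the arithmetic heart of an H.264 encoder and a good way to see why ASIC acceleration is attractive: even this tiny full-search version performs hundreds of SAD evaluations per macroblock. The numpy sketch below is a generic full-search estimator written for illustration, not the H.264 reference algorithm; block size, search radius, and the test frames are arbitrary.

```python
import numpy as np

def motion_estimate(ref, cur, block=16, radius=8):
    """Full-search block motion estimation: for every block x block macroblock
    in `cur`, find the offset within +/- radius in `ref` that minimizes the
    sum of absolute differences (SAD)."""
    h, w = cur.shape
    vectors = np.zeros((h // block, w // block, 2), dtype=np.int32)
    for by in range(h // block):
        for bx in range(w // block):
            y0, x0 = by * block, bx * block
            target = cur[y0:y0 + block, x0:x0 + block].astype(np.int32)
            best, best_sad = (0, 0), np.inf
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    y, x = y0 + dy, x0 + dx
                    if y < 0 or x < 0 or y + block > h or x + block > w:
                        continue                    # candidate leaves the frame
                    cand = ref[y:y + block, x:x + block].astype(np.int32)
                    sad = int(np.abs(target - cand).sum())
                    if sad < best_sad:
                        best, best_sad = (dy, dx), sad
            vectors[by, bx] = best
    return vectors

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.integers(0, 256, (64, 64), dtype=np.uint8)
    cur = np.roll(ref, shift=(3, -2), axis=(0, 1))   # frame shifted down 3, left 2
    mv = motion_estimate(ref, cur)
    # Blocks whose true match lies inside the frame recover (dy, dx) = (-3, 2).
    print(mv[1:, :3].reshape(-1, 2))
```

Each macroblock's search is independent of the others, which is what makes the stage amenable to both parallel software encoders and fixed-function hardware.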
May 28 | Scene representations such as NeRF, dense volumes, sparse octrees, neural hash grids, 3D Gaussians. Gaussian splatting and its performance optimization. Ray casting vs. rasterization. See the volume-rendering sketch below.
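NeRF-style ray marching and Gaussian splatting both terminate in the same front-to-back alpha compositing of samples along a ray (or of splats over a pixel). The numpy sketch below implements that emission-absorption compositing step on made-up sample data; it is a generic illustration, not any particular renderer's code.

```python
import numpy as np

def composite_rays(sigmas, colors, deltas):
    """Front-to-back emission-absorption compositing, the final step of
    NeRF-style volume rendering (and, with per-splat alphas, of splatting).

    sigmas: (rays, samples)     volume density at each sample along each ray
    colors: (rays, samples, 3)  RGB emitted at each sample
    deltas: (rays, samples)     spacing between adjacent samples
    """
    alpha = 1.0 - np.exp(-sigmas * deltas)            # opacity of each segment
    # Transmittance: fraction of light surviving all earlier segments on the ray.
    trans = np.cumprod(1.0 - alpha, axis=1)
    trans = np.concatenate([np.ones_like(trans[:, :1]), trans[:, :-1]], axis=1)
    weights = alpha * trans                           # per-sample contribution
    rgb = (weights[..., None] * colors).sum(axis=1)   # composited pixel color
    opacity = weights.sum(axis=1)                     # 1 - background visibility
    return rgb, opacity

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sigmas = rng.uniform(0.0, 2.0, (4, 64))
    colors = rng.uniform(0.0, 1.0, (4, 64, 3))
    deltas = np.full((4, 64), 0.05)
    rgb, opacity = composite_rays(sigmas, colors, deltas)
    print(np.round(rgb, 3), np.round(opacity, 3))     # opacities lie in [0, 1]
```

The difference between the representations is mostly in how the per-ray samples are produced (marching a dense or sparse volume, querying a hash grid, or rasterizing sorted Gaussians), not in this compositing step.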
May 30 | Guest Lecture: OpenAI's Sora
A guest lecture from Tim Brooks, one of the creators of Sora.
Jun 05 | Term Project Information