Stanford CS348K, Spring 2023

Visual computing tasks such as computational imaging, image/video understanding, and real-time 3D graphics are key responsibilities of modern computer systems ranging from sensor-rich smart phones and autonomous robots to large datacenters. These workloads demand exceptional system efficiency, and this course examines the key ideas, techniques, and challenges associated with the design of parallel, heterogeneous systems that accelerate visual computing applications. This course is intended for systems students interested in architecting efficient graphics, image processing, and computer vision platforms (both new hardware architectures and domain-optimized programming frameworks for these platforms) and for graphics, vision, and machine learning students who wish to understand throughput computing principles to design new algorithms that map efficiently to these machines.

Basic Info
Tues/Thurs 10:30-11:50pm
Location: STLC 105
Instructor: Kayvon Fatahalian
Welcome to CS348K Spring 2023. Please see the course info page for more info on policies and logistics, as well as answers to common questions like "Am I prepared to take this class?" This is a paper-reading and in-class discussion-based course, so live attendance is expected of all participants.
Spring 2023 Schedule
Apr 04
Discussion of modern visual computing applications, basic computer architecture review
Apr 06
Algorithms for taking raw sensor pixels to an RGB image: demosaicing, sharpening, correcting lens aberrations, multi-shot alignment/merging, image filtering
Apr 11
Multi-scale processing with Gaussian and Laplacian pyramids, HDR (local tone mapping), burst image processing techniques (align and merge)
Apr 13
The Frankencamera, modern camera APIs, advanced image analysis for photography (portrait mode, autofocus, etc.)
Apr 18
Balancing locality, parallelism, and work, fusion and tiling, design of the Halide domain-specific language, automatically scheduling image processing pipelines
Apr 20
Popular DNN trunks and topologies, where the compute lies in modern networks, data layout optimizations, scheduling decisions, modern code generation frameworks
Apr 25
Understanding modern optimization of transformers and attention. Lingering inefficiencies in designs. Memory footprint issues.
Apr 27
GPUs, TPUs, special instructions for DNN evaluation (and their efficiency vs custom ASIC), choice of precision in arithmetic, modern commercial DNN accelerators, flexibility vs efficiency trade-offs
May 02
Systematic approaches to generating supervision (Snorkel, Overton, Ludwig). Specifying models at a higher level of abstraction than DNN architecture graphs.
May 04
H.264 video representation/encoding, parallel encoding, motivations for ASIC acceleration, ML-based compression methods, emerging opportunities for compression when machines, not humans, will observe most images
May 09
System design issues for building a video conferencing system: reducing latency, bandwidth, etc. How real-time video analysis will enable richer video-based applications.
May 11
The light field, challenges of reconstructing geometry, initial discussion of NeRF algorithms
May 16
Discussion of the arc of NeRF papers, and the combination of neural and non-neural representations.
May 18
How simulation is being used to train agents to learn skills and problem solving.
May 23
How might systems for rendering and simulating virtual worlds be architected differently to more efficiently execute the computations needed for training agents?
May 25
Basics of diffusion models. Survey of ways to control diffusion models.
May 30
Performance Optimization of Diffusion-Based Image Generation
Performance and efficiency issues related to diffusion models.
Jun 01
Real-Time GPU Accelerated Ray Tracing
Ray tracing workload characteristics, memory coherence and SIMD execution challenges of ray tracing, modern hardware acceleration (on NVIDIA RTX GPUs), neural denoising and post-processing
Jun 01
Prompt Engineering: System Building Using LLMs as a Primitive
The emerging area of architecting systems around LLMs as a primitive
TBD Term Project Information