Stanford CS149, Winter 2019
PARALLEL COMPUTING
This page contains lecture slides and recommended readings for the Winter 2019 offering of CS149.
(motivations for parallel chip decisions, challenges of parallelizing code)
Further Reading:
- The Future of Microprocessors. by K. Olukotun and L. Hammond, ACM Queue 2005
- Power: A First-Class Architectural Design Constraint. by Trevor Mudge IEEE Computer 2001
(forms of parallelism: multicore, SIMD, threading + understanding latency and bandwidth)
Further Reading:
- CPU DB: Recording Microprocessor History. A. Danowitz, K. Kelley, J. Mao, J.P. Stevenson, M. Horowitz, ACM Queue 2005. (You can also take a peak at the CPU DB website)
- The Compute Architecture of Intel Processor Graphics. Intel Technical Report, 2015 (a very nice description of a modern throughput processor)
- Intel's Haswell CPU Microarchitecture. D. Kanter, 2013 (realworldtech.com article)
- NVIDIA GP100 Pascal Whitepaper. NVIDIA Technical Report 2016
(ways of thinking about parallel programs, and their corresponding hardware implementations)
Further Reading: (some fun systems)
(the thought process of parallelizing a program)
(achieving good work distribution while minimizing overhead, scheduling Cilk programs with work stealing)
Further Reading:
- CilkPlus documentation
- Scheduling Multithreaded Computations by Work Stealing. by Blumofe and Leiserson, JACM 1999
- Implementation of the Cilk 5 Multi-Threaded Language. by Frigo et al. PLDI 1998
- Intel Thread Building Blocks
(message passing, async vs. blocking sends/receives, pipelining, increasing arithmetic intensity, avoiding contention)
(CUDA programming abstractions, and how they are implemented on modern GPUs)
Further Reading:
- You may enjoy the free Udacity Course: Intro to Parallel Programming Using CUDA, by Luebke and Owens
- The Thrust Library is a useful collection library for CUDA.
- Rise of the Graphics Processor. D. Blythe (Proceedings of IEEE 2008) a nice overview of GPU history.
- NVIDIA GeForce GTX 1080 Whitepaper. NVIDIA Technical Report 2016
- NVIDIA Tesla P100 Whitepaper. NVIDIA Technical Report 2016
- The Compute Architecture of Intel Processor Graphics. Intel Technical Report, 2015 (a very nice description of a modern Intel integrated GPU)
- Pascal Tuning Guide. NVIDIA CUDA Documentation
(definition of memory coherence, invalidation-based coherence using MSI and MESI, maintaining coherence with multi-level caches, false sharing)