The idea of using a directory is to reduce the number of processors we have to send messages to when a cache line is updated. The directory keeps track (using presence bits) of which processors have the line in their caches. Instead of broadcasting to everyone, we only send messages to the processors indicated by the directory's presence bits.
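To make the presence-bit idea concrete, here is a minimal sketch of a toy directory. It is not a real protocol implementation: the class name `Directory`, its methods, and the choice to invalidate other sharers on a write are all assumptions made for illustration.

```python
class Directory:
    """Toy directory: one presence-bit vector (an int bitmask) per cache line."""

    def __init__(self, num_processors):
        self.num_processors = num_processors
        self.presence = {}  # line address -> bitmask of processors holding the line

    def record_read(self, line, proc):
        # Set the presence bit for the processor that just loaded the line.
        self.presence[line] = self.presence.get(line, 0) | (1 << proc)

    def sharers(self, line):
        # Decode the bitmask into a list of processor ids.
        bits = self.presence.get(line, 0)
        return [p for p in range(self.num_processors) if bits & (1 << p)]

    def record_write(self, line, writer):
        # Assumption: an invalidation-based protocol. Only the current sharers
        # (other than the writer) need a message, not every processor.
        targets = [p for p in self.sharers(line) if p != writer]
        self.presence[line] = 1 << writer  # the writer is now the only holder
        return targets


directory = Directory(num_processors=8)
directory.record_read(0x40, 1)
directory.record_read(0x40, 5)
targets = directory.record_write(0x40, 1)  # only processor 5 gets a message
```

With 8 processors, snooping would put this write in front of all 7 other caches, while the directory sends a single message to processor 5.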
Is there any disadvantage to using directory-based coherence instead of snooping-based coherence, besides the memory overhead of L3 keeping track of all the cache lines in a directory?
Why is directory coherence so hard to design?
So when we are only dealing with a single core, the directory entries for each cache line are stored in the L3 cache, but if we are dealing with multiple cores, the directory is located in main memory? Is the extra cost of going to memory to look up the entry worth it, or does it end up being better to just go ahead and broadcast to all processors?
@tigerpanda, I believe that multiple cores on a single processor chip share an L3 cache (this is what the diagram shows). However, if we have multiple separate CPUs, I think the directory is distributed across the processors, and information about updates is sent through the interconnect between them.
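As a rough sketch of what "distributed" could mean here: one common scheme, assumed for this example and not taken from the diagram, is to give every cache line a "home" node that holds its directory entry, chosen by interleaving line addresses across nodes. The function name `home_node` and the 64-byte line size are hypothetical.

```python
LINE_SIZE = 64  # bytes per cache line (assumed value)


def home_node(address, num_nodes, line_size=LINE_SIZE):
    """Return the node assumed to hold the directory entry for this address.

    Lines are interleaved round-robin across nodes by line number.
    """
    line_number = address // line_size
    return line_number % num_nodes


# With 4 nodes, consecutive lines map to nodes 0, 1, 2, 3, 0, ...
homes = [home_node(addr, 4) for addr in (0, 64, 128, 192, 256)]
```

A requester would then send its coherence message to `home_node(address, num_nodes)` over the interconnect instead of broadcasting, and that node would consult its slice of the directory.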
More information on directories is available here: https://www.cs.cmu.edu/afs/cs/academic/class/15418-s19/www/lectures/13_directory.pdf