Slide 19 of 50

Martingale

Why would P1's access be more costly on NUMA?

minglotus

The answer is mostly in the definition of NUMA itself. From Wikipedia (https://en.wikipedia.org/wiki/Non-uniform_memory_access):

> Non-uniform memory access (NUMA) is a computer memory design used in multiprocessing, where the memory access time depends on the memory location relative to the processor. Under NUMA, a processor can access its own local memory faster than non-local memory (memory local to another processor or memory shared between processors). The benefits of NUMA are limited to particular workloads, notably on servers where the data is often associated strongly with certain tasks or users.

So in a NUMA system, if a0-7 is stored in memory local to P1 and a8-15 is stored in memory local to P2, then any access by one processor to elements held in the other processor's memory could be more costly than an access to its own local memory.
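To make that concrete, here is a rough toy cost model, not a measurement. The latency numbers and the helper names (`home_node`, `access_cost`) are made up for illustration; the only thing taken from the discussion above is the assumed placement of a0-7 near P1 and a8-15 near P2.

```python
# Toy cost model only: both latencies are assumed, illustrative numbers.
LOCAL_COST = 1    # assumed cost of touching memory local to the processor
REMOTE_COST = 3   # assumed cost of touching memory local to the other processor


def home_node(i):
    """Assumed placement: a[0..7] lives near P1 (node 0), a[8..15] near P2 (node 1)."""
    return 0 if i < 8 else 1


def access_cost(proc, indices):
    """Total cost for processor `proc` (0 = P1, 1 = P2) to touch the given elements."""
    return sum(LOCAL_COST if home_node(i) == proc else REMOTE_COST for i in indices)


# P1 touches eight elements that all live in its own memory.
local_only = access_cost(0, range(0, 8))

# P1 touches eight elements, half of which live in P2's memory.
mixed = access_cost(0, [0, 1, 2, 3, 8, 9, 10, 11])

print(local_only, mixed)
```

Both cases do the same number of accesses, but the mixed one comes out more expensive for P1 simply because of where the data lives. How big the gap is on a real machine depends on the hardware.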

ghostcow

This is a great example of the potential downside of over-optimizing without carefully examining the hardware and other properties of the system an algorithm runs on. Here, on two cores, the better spatial (and temporal) locality of the simpler algorithm would likely give it better performance than the fancier algorithm presented earlier.
