Slide 26 of 83
riglesias

Would an accurate way of describing the benefit of "message passing" be that "it's more easily scalable"? It seems like there's very little hardware/software overhead to simply sending and receiving a message, relative to the cost of trying to redesign a chip to support a larger / more robust shared memory model.

(I guess one downside is that the copying may be more costly, but I'm not sure how much of an issue this is in practice.)
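For concreteness, here is a minimal sketch of what explicit message passing looks like using MPI (assuming an MPI installation and two processes launched with mpirun); the send/receive pair is exactly where the copy happens:

```cpp
// Minimal message-passing sketch using MPI (assumes two ranks, e.g.
// launched with: mpirun -np 2 ./a.out). The data is explicitly copied
// from the sender's buffer into the receiver's buffer; there is no
// shared address space between the two processes.
#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int value = 0;
    if (rank == 0) {
        value = 42;
        // Rank 0 sends one int to rank 1; the runtime copies the data.
        MPI_Send(&value, 1, MPI_INT, /*dest=*/1, /*tag=*/0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        // Rank 1 blocks until the message arrives in its own buffer.
        MPI_Recv(&value, 1, MPI_INT, /*source=*/0, /*tag=*/0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        std::printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}
```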

huangda

I wonder how this affects latency. I would imagine that a switch connecting cores to memory would be super fast. Does message passing have an impact on memory latency? And if message passing were used between cores that already share memory, what would the performance hit be?

shaan0924

Is the benefit of message passing over a shared address space based on what needs to be shared? For example, if there is a large dataset that all cores need access to, then a shared address space would be better, but if there are only some values that need to be accessed by a certain subset of cores, then message passing is more useful. Am I on the right track here?

gmudel

@shaan0924 I think that you are totally correct. I'll also add that shared memory can be quite dangerous due to race conditions and whatnot -- you have to make sure that this memory is being written to and read from appropriately, which usually requires locks, semaphores, etc. A quick sketch of the classic hazard and its fix is below.
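Here is a minimal sketch (a hypothetical shared counter, not anything from the slide): two threads increment shared state, and the lock is the only thing preventing lost updates:

```cpp
// Sketch of the classic shared-memory hazard: two threads increment a
// shared counter. Without synchronization, the read-modify-write can
// interleave and lose updates; guarding it with a mutex restores
// correctness.
#include <iostream>
#include <mutex>
#include <thread>

int counter = 0;          // shared state
std::mutex counter_lock;  // protects counter

void increment_many() {
    for (int i = 0; i < 100000; i++) {
        std::lock_guard<std::mutex> guard(counter_lock);  // remove this line -> data race
        counter++;
    }
}

int main() {
    std::thread t1(increment_many);
    std::thread t2(increment_many);
    t1.join();
    t2.join();
    // With the lock this reliably prints 200000; without it, usually less.
    std::cout << counter << "\n";
    return 0;
}
```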

subscalar

At the scale of the cluster depicted in the image on this slide, the dataset is likely too large to fit on a single machine, and so a networking layer needs to be introduced. This blurs the distinction between a shared address space and message passing, since the shared memory model could very well be an abstraction built on top of message passing.

My guess is that you'd need mechanisms so that different parts of the shared address space can be worked on independently and kept synchronized across several machines. Race conditions at the thread level (local to one machine) would be superseded by protocol-level efforts to keep memory consistent. I'd appreciate it if somebody with experience in distributed systems could clarify what considerations need to be made for such a programming model.
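To make the "abstraction built on top of message passing" idea concrete, here is a toy sketch (the names and structure are hypothetical, and a real distributed shared memory system would add caching, coherence, and failure handling): one thread owns the storage, and "loads" and "stores" are really round-trip messages to it:

```cpp
// Toy illustration (not a real DSM system) of layering a shared-address
// abstraction on top of message passing: one thread owns the memory and
// services load/store requests that arrive as messages.
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct Request { bool is_store; int addr; int value; };

// Minimal blocking message channel.
template <typename T>
class Channel {
    std::queue<T> q_;
    std::mutex m_;
    std::condition_variable cv_;
public:
    void send(T msg) {
        { std::lock_guard<std::mutex> g(m_); q_.push(msg); }
        cv_.notify_one();
    }
    T recv() {
        std::unique_lock<std::mutex> g(m_);
        cv_.wait(g, [this] { return !q_.empty(); });
        T msg = q_.front(); q_.pop();
        return msg;
    }
};

int main() {
    Channel<Request> requests;  // client -> owner
    Channel<int> replies;       // owner -> client (load results)

    // The "owner" holds the actual storage; no other thread touches it.
    std::thread owner([&] {
        std::vector<int> memory(16, 0);
        for (int i = 0; i < 2; i++) {  // service two requests, then exit
            Request r = requests.recv();
            if (r.is_store) memory[r.addr] = r.value;
            else replies.send(memory[r.addr]);
        }
    });

    // The "client" performs what looks like a store and a load, but each
    // is really a message to the owning thread.
    requests.send({true, 3, 99});   // store(addr=3, value=99)
    requests.send({false, 3, 0});   // load(addr=3)
    std::cout << "loaded " << replies.recv() << "\n";  // prints 99

    owner.join();
    return 0;
}
```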

alexder

Also, a lot of modern-day data centers and cloud providers use microservices, where each component of the system is contained separately and information is passed between components as messages. The good thing about this design is that it's extremely adaptable and scalable. On the other hand, message passing can make debugging and root-cause analysis extremely difficult, because errors tend to propagate through the system rather than staying localized.

akdhawan

Are there any C/C++ operations that are inherently atomic, in the sense that they translate directly to a single assembly instruction?

kayvonf

@akdhawan - I wanted to point out that this is a good example of why you should be careful to distinguish abstraction from implementation. In modern C++, atomic<T> is used to define datatypes that support atomic operations (https://en.cppreference.com/w/cpp/atomic/atomic). By definition, these operations are atomic.

Your question is about implementation: are these atomic operations implemented by a single atomic machine instruction? The answer is complex, since even a single machine instruction may not be atomic with respect to all the operations a CPU must perform to execute it. But there are certain atomic operations that the C++ compiler can compile down to efficient atomic instruction sequences. For example, if your datatype is atomic<int>, the compiler will almost certainly be smart enough to special-case addition on variables of that type, lowering it to an atomic machine primitive such as an atomic add (a lock-prefixed add on x86).
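As a small illustrative sketch (assuming an x86-64 target; the exact instruction depends on the architecture, and you can inspect the generated assembly with a disassembler or godbolt.org), both levels are visible at once: the abstraction guarantees atomicity, and the implementation typically realizes it with a single lock-prefixed instruction:

```cpp
// Atomicity at the abstraction level (std::atomic<T>) and what it
// typically compiles to: on x86-64, fetch_add on an atomic<int> usually
// lowers to a single lock-prefixed add (or xadd) instruction. One
// instruction, but the lock prefix is what makes the hardware guarantee
// atomicity across cores.
#include <atomic>
#include <iostream>

std::atomic<int> counter{0};

int main() {
    counter.fetch_add(1);  // atomic by definition of the abstraction
    counter++;             // equivalent: operator++ on atomic<int> is also atomic
    std::cout << counter.load() << "\n";  // prints 2
    return 0;
}
```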
