What is the difference between a node, a server, and a machine?
@potato I'm also wondering this – I can imagine some general differences, but the words seem to be used in very similar contexts in most cases.
@potato I think for our purposes we can assume node = server = machine.
From lecture, each server rack contains several nodes/servers, and there's a network switch at the top of the rack that lets nodes from different racks communicate with each other. When we say we're utilizing 1000 nodes for our distributed tasks, these nodes may be spread over several racks.
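To make the rack layout above concrete, here is a rough sketch of how 1000 nodes might map onto racks. The figure of 40 nodes per rack and the contiguous numbering of nodes are my own assumptions for illustration; the lecture did not give them.

```python
import math

# Hypothetical rack capacity; not a figure from the lecture.
NODES_PER_RACK = 40


def racks_needed(num_nodes, nodes_per_rack=NODES_PER_RACK):
    """Smallest number of racks that can hold num_nodes nodes."""
    return math.ceil(num_nodes / nodes_per_rack)


def rack_of(node_id, nodes_per_rack=NODES_PER_RACK):
    """Rack index of a node, assuming nodes are packed into racks in id order."""
    return node_id // nodes_per_rack


def same_rack(a, b, nodes_per_rack=NODES_PER_RACK):
    """True if two nodes sit under the same top-of-rack switch."""
    return rack_of(a, nodes_per_rack) == rack_of(b, nodes_per_rack)


print(racks_needed(1000))  # 25 racks under the 40-per-rack assumption
print(same_rack(0, 39))    # True: both in rack 0
print(same_rack(39, 40))   # False: traffic has to leave rack 0
```

Under these assumptions, a 1000-node job spans 25 racks, so most pairs of nodes talk to each other across racks rather than within one.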
An interesting fact about the racks used in cloud computing centers: they are all the same size, mainly for two reasons. 1. The design of the cooling system is much simpler, cheaper, and more efficient when each rack can emit at most the same amount of heat. 2. It's cheaper to buy in bulk (:
What is the lifespan of a commodity cluster like this?
Given the CPU<->SSD bandwidth is comparable to CPU<->CPU, is there a common abstraction for widely shared files?
@rthomp The lifespan of a cluster is typically about 5 years, which seems to be the industry standard. A well-maintained cluster could last longer, but it won't necessarily remain cost-effective. You can find more about this here: https://www.datacenterknowledge.com/how/why-expected-server-lifetime-eye-beholder
It is interesting that communication across the network is about as fast as communication between nodes within a rack. I suppose this makes it easier to transfer data.
From lecture: a node is a single computer with its main memory and disk. It has a single operating system.
Key observations from this slide are: 1. Bandwidth is higher on the shorter paths - 'Node to its own DRAM' > 'Node to its own SSD' > or ~= 'Node to SSDs of other nodes on the same rack' > 'Node to nodes on other racks' 2. The professor emphasized the similar bandwidths of 'Node to its own SSD' and 'Node to SSDs on the same rack'
Are there any alternatives to organizing a cluster into nodes? What gives the hierarchical organization an advantage?
The biggest difference between a commodity cluster and a high-performance computing cluster is the network bandwidth.
From lecture: A key insight is that the bandwidth from CPU to SSD is roughly the same as from CPU to other nodes across the network, meaning you can often access data on other nodes' disks about as fast as you could on your own disk in these commodity clusters.
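A quick back-of-the-envelope sketch of why this matters: read time is just data size divided by bandwidth, so equal bandwidths mean equal read times. The bandwidth numbers below are placeholders I made up to match the ordering discussed in this thread; they are not the values from the slide.

```python
# Hypothetical bandwidths in GB/s; placeholders, not the slide's numbers.
BANDWIDTH_GB_PER_S = {
    "own DRAM": 20.0,
    "own SSD": 2.0,
    "SSD of a node on the same rack": 2.0,
    "node on another rack": 1.0,
}


def transfer_seconds(size_gb, bandwidth_gb_per_s):
    """Time to move size_gb of data over a path, ignoring latency."""
    return size_gb / bandwidth_gb_per_s


def read_times(size_gb, bandwidths=BANDWIDTH_GB_PER_S):
    """Transfer time in seconds for each path in the bandwidth table."""
    return {path: transfer_seconds(size_gb, bw) for path, bw in bandwidths.items()}


for path, seconds in read_times(10).items():
    print(f"{path}: {seconds:.1f} s")
```

With these made-up numbers, reading 10 GB from a rack neighbor's SSD takes as long as reading it from the local SSD, which is the point from lecture: where the data lives within the rack matters much less than you might expect.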