In the MapReduce model, all intermediate values are written to disk. For applications that reuse intermediate results iteratively, this becomes inefficient. Spark's RDDs address this by keeping intermediate results in memory.
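A minimal Scala sketch of this idea: cache an RDD once and reuse it across iterations, instead of re-reading inputs from disk on every pass as a chain of MapReduce jobs would. The input path, parsing logic, and iteration count below are placeholders, not anything from the text.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object IterativeCacheSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("rdd-cache-sketch").setMaster("local[*]"))

    // Parse the data once and keep the resulting RDD in memory.
    // Without cache(), every action below would re-read and re-parse the file,
    // much like successive MapReduce jobs reloading intermediate files from disk.
    val points = sc.textFile("hdfs:///data/points.txt") // placeholder path
      .map(_.split(",").map(_.toDouble))
      .cache()

    var threshold = 10.0
    for (_ <- 1 to 5) {
      // Each iteration reuses the in-memory RDD instead of hitting disk again.
      val count = points.filter(p => p.sum > threshold).count()
      println(s"threshold=$threshold count=$count")
      threshold *= 2
    }

    sc.stop()
  }
}
```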
bmehta22
This was made to deal with the size of the Netflix Prize dataset efficiently!