Spark promises to up-end Hadoop, but in a good way

Date: February 12, 2015

  • Not only can Spark work with data in memory, making queries up to 100x faster than MapReduce, but Spark queries on disk also run up to 10x faster.
  • One of the biggest problems with big data is that the technology is either insanely expensive, insanely complicated, or both.
  • This is why traditional Hadoop powerhouses like Cloudera have maintained some commitment to MapReduce as they’ve dramatically increased their commitment to Spark.
  • Without even hitting its teenage years, Hadoop is being replaced by Apache Spark, a superior data processing engine that overcomes some of Hadoop’s core MapReduce limitations.
  • Spark, as Stoica told me, is much easier, largely because of how it interacts with other systems.