Date: February 12, 2015
- Not only can Spark work with data in memory, making queries up to 100x faster than MapReduce, but Spark queries on disk also run up to 10x faster.
- One of the biggest problems with big data is that the technology is either insanely expensive, insanely complicated, or both.
- This is why traditional Hadoop powerhouses like Cloudera have maintained some commitment to MapReduce as they’ve dramatically increased their commitment to Spark.
- Before even reaching its teenage years, Hadoop is being replaced by Apache Spark, a superior data processing engine that overcomes some of the core limitations of Hadoop’s MapReduce.
- Spark, as Stoica told me, is much easier to use, largely because of how it interacts with other systems.