What is Apache Spark RDD

RDD stands for Resilient Distributed Dataset. It is a distributed dataset that can recover from failures.
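
To make the recovery idea concrete, here is a minimal sketch in plain Python. It is not Spark code and does not use the Spark API: `ToyRDD`, `from_list`, and `lose_partition` are hypothetical names invented for illustration. The sketch assumes that each dataset remembers how its partitions were derived (its lineage), so a lost partition can be recomputed rather than restored from a replica.

```python
from typing import Callable, List, Optional


class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, not the Spark API)."""

    def __init__(self, num_partitions: int, compute: Callable[[int], List[int]]):
        self.num_partitions = num_partitions
        # Lineage: a recipe that rebuilds partition i from its parent data.
        self._compute = compute
        # Materialized partitions; None means "not computed yet, or lost".
        self._cache: List[Optional[List[int]]] = [None] * num_partitions

    @classmethod
    def from_list(cls, data: List[int], num_partitions: int) -> "ToyRDD":
        # Assumption: the source list plays the role of durable input storage.
        chunks = [data[i::num_partitions] for i in range(num_partitions)]
        return cls(num_partitions, lambda i: list(chunks[i]))

    def map(self, fn: Callable[[int], int]) -> "ToyRDD":
        # The child keeps a reference to its parent instead of copying data.
        return ToyRDD(
            self.num_partitions,
            lambda i: [fn(x) for x in self.partition(i)],
        )

    def partition(self, i: int) -> List[int]:
        # Recompute from lineage whenever the partition is missing.
        if self._cache[i] is None:
            self._cache[i] = self._compute(i)
        return self._cache[i]

    def lose_partition(self, i: int) -> None:
        # Simulate a node failure that wipes one partition.
        self._cache[i] = None

    def collect(self) -> List[int]:
        return [x for i in range(self.num_partitions) for x in self.partition(i)]


numbers = ToyRDD.from_list([1, 2, 3, 4, 5, 6], num_partitions=3)
squares = numbers.map(lambda x: x * x)

before = squares.collect()
squares.lose_partition(1)   # simulated failure
after = squares.collect()   # partition 1 is rebuilt from lineage
```

In this sketch `before` and `after` hold the same values, because the lost partition is rebuilt by replaying the `map` step over its parent partition. A real cluster adds scheduling, shuffles, and durable input storage on top of this basic idea.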

Understanding Apache Spark Architecture

Apache Spark is an open-source cluster computing framework. It is often cited as running some workloads up to 100 times faster in memory and up to 10 times faster on disk than Apache Hadoop MapReduce.