Apache Spark RDD reduceByKey transformation

reduceByKey(func) converts a dataset of (K, V) pairs into a dataset of (K, V) pairs in which the values for each key are aggregated using the given reduce function func, which must be of type (V, V) => V.
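To make the aggregation concrete, here is a minimal sketch in Scala. It assumes a local SparkSession; the object name ReduceByKeyExample, the method totals, and the sample fruit counts are made up for illustration and do not come from the article.

```scala
import org.apache.spark.sql.SparkSession

object ReduceByKeyExample {
  // Sums the values for each key. The sample data is hypothetical.
  def totals(spark: SparkSession): Map[String, Int] = {
    val pairs = spark.sparkContext.parallelize(
      Seq(("apple", 2), ("banana", 1), ("apple", 3), ("banana", 4)))
    // reduceByKey merges all values that share a key using the given function.
    pairs.reduceByKey(_ + _).collect().toMap
  }

  def main(args: Array[String]): Unit = {
    // local[*] runs Spark inside this JVM; an assumption made for the demo.
    val spark = SparkSession.builder()
      .appName("reduceByKey-sketch")
      .master("local[*]")
      .getOrCreate()
    try println(totals(spark)) finally spark.stop()
  }
}
```

With this input the result maps apple to 5 and banana to 5. The reduce function should be associative and commutative, because Spark combines values within each partition before merging the partial results across partitions.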

Apache Spark RDD groupBy transformation

As per the Apache Spark documentation, groupBy returns an RDD of grouped items, where each group consists of a key and a sequence of the elements that map to that key.
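A short Scala sketch of this behaviour follows. It assumes a local SparkSession; the names GroupByExample and byParity and the choice of grouping numbers by parity are illustrative only.

```scala
import org.apache.spark.sql.SparkSession

object GroupByExample {
  // Groups the numbers 1 to 6 under the keys "even" and "odd".
  def byParity(spark: SparkSession): Map[String, List[Int]] = {
    val numbers = spark.sparkContext.parallelize(1 to 6)
    numbers
      // The function passed to groupBy computes the key for each element.
      .groupBy(n => if (n % 2 == 0) "even" else "odd")
      // Order inside a group is not guaranteed, so sort for a stable result.
      .mapValues(_.toList.sorted)
      .collect()
      .toMap
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("groupBy-sketch")
      .master("local[*]")
      .getOrCreate()
    try println(byParity(spark)) finally spark.stop()
  }
}
```

Here the key "even" maps to List(2, 4, 6) and "odd" to List(1, 3, 5). groupBy shuffles every element across the cluster, so when the goal is an aggregate per key, reduceByKey is usually the cheaper choice.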

Apache Spark RDD filter transformation

As per the Apache Spark documentation, filter(func) returns a new dataset formed by selecting those elements of the source on which func returns true.
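As a minimal sketch of filter in Scala, assuming a local SparkSession (the names FilterExample and evens are hypothetical, chosen for this illustration):

```scala
import org.apache.spark.sql.SparkSession

object FilterExample {
  // Keeps only the even numbers from 1 to 10.
  def evens(spark: SparkSession): List[Int] = {
    val numbers = spark.sparkContext.parallelize(1 to 10)
    // filter keeps an element only when the predicate returns true for it.
    numbers.filter(_ % 2 == 0).collect().toList
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("filter-sketch")
      .master("local[*]")
      .getOrCreate()
    try println(evens(spark)) finally spark.stop()
  }
}
```

The result contains 2, 4, 6, 8 and 10. Like other transformations, filter is lazy: nothing is computed until an action such as collect() is called.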

What is Apache Spark RDD

RDD stands for Resilient Distributed Dataset. It is a distributed dataset that can recover from failures: Spark tracks the lineage of each RDD and recomputes lost partitions from it.

Manipulating String columns in a DataFrame

In this video we will look at how to manipulate String columns in a DataFrame. The demo uses Spark 2.4 and Scala.
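For readers who want a feel for the topic before watching, here is a small Scala sketch using a few built-in functions from org.apache.spark.sql.functions. It is not taken from the video; the object name StringColumnsExample, the column names, and the sample rows are assumptions made for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat_ws, length, upper}

object StringColumnsExample {
  // Builds a tiny DataFrame and derives three columns from its String columns.
  def transform(spark: SparkSession): DataFrame = {
    import spark.implicits._ // enables toDF on local Scala collections
    val people = Seq(("ada", "lovelace"), ("alan", "turing"))
      .toDF("first_name", "last_name")
    people
      // Join two String columns with a single space between them.
      .withColumn("full_name", concat_ws(" ", col("first_name"), col("last_name")))
      // Convert a String column to upper case.
      .withColumn("first_upper", upper(col("first_name")))
      // Count the characters in a String column.
      .withColumn("last_length", length(col("last_name")))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("string-columns-sketch")
      .master("local[*]")
      .getOrCreate()
    try transform(spark).show() finally spark.stop()
  }
}
```

Each withColumn call returns a new DataFrame, so the original columns stay unchanged alongside the derived ones.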

Working with AVRO data in Apache Spark

In this video we will look at how to work with AVRO data in Apache Spark. The demo uses Spark 2.4 and Scala.