Manipulating String Columns in a DataFrame

In this video we will understand how to manipulate string columns in a DataFrame. The demo uses Spark 2.4 with Scala.
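As a rough idea of what string manipulation on a DataFrame can look like, here is a minimal sketch using the built-in functions from `org.apache.spark.sql.functions`. The sample data, the column names (`first_name`, `last_name`, `full_name`, `name_length`) and the local master are assumptions made up for illustration; they are not taken from the video.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, concat_ws, initcap, length, trim, upper}

object StringColumnsSketch {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumption: no cluster is needed).
    val spark = SparkSession.builder()
      .appName("string-columns-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data with messy whitespace and casing.
    val people = Seq(("  alice ", "smith"), ("BOB", "jones"))
      .toDF("first_name", "last_name")

    val cleaned = people
      // Strip surrounding spaces, then capitalize the first letter of each word.
      .withColumn("first_name", initcap(trim(col("first_name"))))
      // Convert the last name to upper case.
      .withColumn("last_name", upper(col("last_name")))
      // Join the two columns with a single space between them.
      .withColumn("full_name", concat_ws(" ", col("first_name"), col("last_name")))
      // Count the characters in the combined name.
      .withColumn("name_length", length(col("full_name")))

    cleaned.show(false)
    spark.stop()
  }
}
```

Each `withColumn` call returns a new DataFrame, so the steps can be chained or split apart as needed.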

Working with Avro data in Apache Spark

In this video we will understand how to work with Avro data in Apache Spark. The demo uses Spark 2.4 with Scala.
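A small sketch of writing and reading Avro files is shown here, assuming the external `spark-avro` module is on the classpath (in Spark 2.4 it is not bundled by default and is typically added with something like `--packages org.apache.spark:spark-avro_2.11:2.4.0`; adjust the coordinates to your own Spark and Scala build). The sample data, column names and temporary output directory are hypothetical.

```scala
import java.nio.file.Files
import org.apache.spark.sql.{SaveMode, SparkSession}

object AvroSketch {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumption).
    val spark = SparkSession.builder()
      .appName("avro-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical output location inside a fresh temporary directory.
    val outputPath = Files.createTempDirectory("avro-sketch").resolve("events").toString

    // Hypothetical sample data.
    val events = Seq((1, "login"), (2, "logout")).toDF("id", "event")

    // Write the DataFrame as Avro files; requires the spark-avro module.
    events.write.mode(SaveMode.Overwrite).format("avro").save(outputPath)

    // Read the Avro files back; the schema comes from the files themselves.
    val loaded = spark.read.format("avro").load(outputPath)
    loaded.printSchema()
    loaded.show()

    spark.stop()
  }
}
```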

Working with CSV data in Apache Spark

In this video we will understand how to work with CSV data in Apache Spark. The demo uses Spark 2.4 with Scala.
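For orientation, the following sketch writes a small DataFrame out as CSV and reads it back with a header row and schema inference. The data, column names and temporary directory are assumptions for illustration only, and the options shown (`header`, `inferSchema`) are just two of the CSV options Spark offers.

```scala
import java.nio.file.Files
import org.apache.spark.sql.{SaveMode, SparkSession}

object CsvSketch {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumption).
    val spark = SparkSession.builder()
      .appName("csv-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical output location inside a fresh temporary directory.
    val csvPath = Files.createTempDirectory("csv-sketch").resolve("people").toString

    // Hypothetical sample data.
    val people = Seq(("Alice", 34), ("Bob", 45)).toDF("name", "age")

    // Write CSV files with a header line.
    people.write.mode(SaveMode.Overwrite).option("header", "true").csv(csvPath)

    // Read them back, treating the first line as column names and
    // letting Spark infer column types from the data.
    val loaded = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv(csvPath)

    loaded.printSchema()
    loaded.show()

    spark.stop()
  }
}
```

Schema inference needs an extra pass over the data, so for large files an explicit schema passed via `spark.read.schema(...)` is often the better choice.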

Working With Hive Metastore in Apache Spark

In this lecture we will learn how to work with the Hive Metastore in Apache Spark. We will read a table from the Hive Metastore in Spark and also create a table using the saveAsTable API.
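The general shape of this workflow can be sketched as follows, assuming Spark was built with Hive support and the `spark-hive` module is on the classpath. The table name `demo_employees` and the sample data are hypothetical. Without a `hive-site.xml`, Spark falls back to a local embedded metastore in the working directory, which is enough for trying this out but is not how a shared cluster metastore is configured.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object HiveMetastoreSketch {
  def main(args: Array[String]): Unit = {
    // enableHiveSupport() connects the session to a Hive Metastore.
    val spark = SparkSession.builder()
      .appName("hive-metastore-sketch")
      .master("local[*]")
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data.
    val employees = Seq((1, "Alice"), (2, "Bob")).toDF("id", "name")

    // Create (or replace) a managed table registered in the metastore.
    // "demo_employees" is a made-up table name.
    employees.write.mode(SaveMode.Overwrite).saveAsTable("demo_employees")

    // Read the table back through the metastore by name.
    val fromMetastore = spark.table("demo_employees")
    fromMetastore.show()

    // List the tables the metastore knows about in the current database.
    spark.sql("SHOW TABLES").show()

    spark.stop()
  }
}
```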

Installing ELK Stack on CentOS 8

ELK stands for Elasticsearch, Logstash, and Kibana. These are the three components of the ELK stack: Logstash collects and processes the data, Elasticsearch indexes and stores it, and Kibana visualizes it.

Understanding Apache Spark Architecture

Apache Spark is an open-source cluster computing framework that is often cited as running up to 100 times faster in memory and up to 10 times faster on disk than Apache Hadoop MapReduce.

Important HDFS shell commands

I’m going to walk you through some important HDFS shell commands that can be used to manage files in the Hadoop Distributed File System (HDFS). These commands are also important if you are planning to take the CCA-175 certification exam.