Spark Archives - Page 2 of 2

Working with CSV data in Apache Spark

In this video we will learn how to work with CSV data in Apache Spark. The demo uses Spark 2.4 and the Scala language.
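As a rough idea of what reading CSV looks like in Spark 2.4 with Scala, here is a minimal sketch. It is not taken from the video: the object name `CsvSketch`, the helper `readCsv` and the sample rows are hypothetical, and it assumes the file has a header row and that a local master is acceptable for a demo.

```scala
import java.nio.file.Files
import org.apache.spark.sql.{DataFrame, SparkSession}

object CsvSketch {
  // Hypothetical helper: reads a CSV file that has a header row
  // and asks Spark to infer the column types from the data.
  def readCsv(spark: SparkSession, path: String): DataFrame =
    spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv(path)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("csv-sketch")
      .master("local[*]") // assumption: local mode, for demonstration only
      .getOrCreate()

    // Made-up sample data written to a temporary file so the sketch is self-contained.
    val file = Files.createTempFile("people", ".csv")
    Files.write(file, "name,age\nAlice,30\nBob,25\n".getBytes("UTF-8"))

    val df = readCsv(spark, file.toString)
    df.printSchema()
    df.show()

    spark.stop()
  }
}
```

Schema inference costs an extra pass over the data, so for large files an explicit schema passed through `.schema(...)` is usually the better choice.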

Working With Hive Metastore in Apache Spark

In this lecture we will learn how to work with the Hive Metastore in Apache Spark. We will read a table from the Hive Metastore in Spark and also create a table using the saveAsTable API.
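The two operations mentioned here, creating a table with saveAsTable and reading it back, might look roughly like the sketch below. It is an illustration and not the lecture's code: the object name, the helper `roundTrip`, the table name `people_demo` and the sample rows are hypothetical. It assumes the spark-hive module is on the classpath; when no Hive deployment is configured, Spark falls back to a local embedded metastore.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object HiveMetastoreSketch {
  // Hypothetical helper: writes a small DataFrame as a metastore table,
  // reads it back by name and returns the row count.
  def roundTrip(spark: SparkSession): Long = {
    import spark.implicits._

    // Made-up sample data.
    val people = Seq(("Alice", 30), ("Bob", 25)).toDF("name", "age")

    // Register the data as a managed table in the metastore,
    // replacing any earlier table with the same name.
    people.write.mode(SaveMode.Overwrite).saveAsTable("people_demo")

    // Read the table back through the metastore.
    spark.table("people_demo").count()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-metastore-sketch")
      .master("local[*]")   // assumption: local mode, for demonstration only
      .enableHiveSupport()  // requires the spark-hive module
      .getOrCreate()

    println(roundTrip(spark))
    spark.stop()
  }
}
```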

Working with JSON data in Apache Spark

In this video we will learn how to work with JSON data in Apache Spark.

Working with Parquet File Format in Spark

In this lecture we will learn how to work with Parquet File Format in Spark.

Manipulating Dates in Dataframe using Spark API

Manipulating dates in a DataFrame with the Spark API functions from_unixtime(), unix_timestamp(), to_date(), hour(), minute() and second().
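To show how these six functions fit together, here is a small sketch. The object name, the helper `withDateParts`, the column names and the sample timestamp are all hypothetical; the input is assumed to be a string column in `yyyy-MM-dd HH:mm:ss` format.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions._

object DateFunctionsSketch {
  // Hypothetical helper: derives date and time parts from a string column
  // named "event_time" holding values such as "2019-03-15 10:20:30".
  def withDateParts(df: DataFrame): DataFrame =
    df
      // Parse the string into seconds since the Unix epoch.
      .withColumn("epoch_seconds", unix_timestamp(col("event_time"), "yyyy-MM-dd HH:mm:ss"))
      // Format the epoch seconds back into a timestamp string.
      .withColumn("round_trip", from_unixtime(col("epoch_seconds")))
      // Keep only the calendar date.
      .withColumn("event_date", to_date(col("event_time")))
      // Extract the individual time fields.
      .withColumn("hour", hour(col("event_time")))
      .withColumn("minute", minute(col("event_time")))
      .withColumn("second", second(col("event_time")))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("date-functions-sketch")
      .master("local[*]") // assumption: local mode, for demonstration only
      .getOrCreate()
    import spark.implicits._

    val events = Seq("2019-03-15 10:20:30").toDF("event_time")
    withDateParts(events).show(truncate = false)

    spark.stop()
  }
}
```

Both unix_timestamp() and from_unixtime() interpret values in the session time zone, so the epoch value depends on that setting while the round-tripped string does not.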

Understanding DataFrame abstraction in Apache Spark

In this video we will look at the DataFrame abstraction in Spark.

Understanding Apache Spark Architecture

Apache Spark is an open-source cluster computing framework that is often cited as running some workloads up to 100 times faster in memory and up to 10 times faster on disk than Apache Hadoop MapReduce.

How to set up a Spark 2.4 cluster on Google Cloud using Dataproc

How to set up a Spark 2.4 cluster on Google Cloud using Dataproc. Step 1: create a new project. Step 2: create a new cluster using Dataproc.

Installing Apache Spark on Windows

In this post I will show you how to install Apache Spark on a Windows machine. By the end of this tutorial you'll be able to use Spark with Scala on Windows.