Spark SQL and Spark 3 using Scala Hands-On with Labs

Spark SQL and Spark 3 using Scala Hands-On with Labs, available at $79.99, has an average rating of 4.32, with 232 lectures, based on 2858 reviews, and has 21723 subscribers.

Enroll now: Spark SQL and Spark 3 using Scala Hands-On with Labs

Summary

Title: Spark SQL and Spark 3 using Scala Hands-On with Labs

Price: $79.99

Average Rating: 4.32

Number of Lectures: 232

Number of Published Lectures: 232

Number of Curriculum Items: 232

Number of Published Curriculum Objects: 232

Original Price: $22.99

Quality Status: approved

Status: Live

What You Will Learn

  • All the HDFS Commands that are relevant to validate files and folders in HDFS.
  • Enough Scala to work on Data Engineering Projects using Scala as the programming language
  • Spark Dataframe APIs to solve the problems using Dataframe style APIs.
  • Basic Transformations such as Projection, Filtering, Total as well as Aggregations by Keys using Spark Dataframe APIs
  • Inner as well as outer joins using Spark Data Frame APIs
  • Ability to use Spark SQL to solve the problems using SQL style syntax.
  • Basic Transformations such as Projection, Filtering, Total as well as Aggregations by Keys using Spark SQL
  • Inner as well as outer joins using Spark SQL
  • Basic DDL to create and manage tables using Spark SQL
  • Basic DML or CRUD Operations using Spark SQL
  • Create and Manage Partitioned Tables using Spark SQL
  • Manipulating Data using Spark SQL Functions
  • Advanced Analytical or Windowing Functions to perform aggregations and ranking using Spark SQL

Who Should Attend

  • Any IT aspirant/professional willing to learn Data Engineering using Apache Spark
  • Python Developers who want to learn Spark using Scala to add additional skill to be a Data Engineer
  • Java or Scala Developers to learn Spark using Scala to add Data Engineering Skills to their profile

Target Audiences

  • Any IT aspirant/professional willing to learn Data Engineering using Apache Spark
  • Python Developers who want to learn Spark using Scala to add additional skill to be a Data Engineer
  • Java or Scala Developers to learn Spark using Scala to add Data Engineering Skills to their profile

As part of this course, you will learn all the key skills needed to build Data Engineering Pipelines using Spark SQL and Spark Data Frame APIs, with Scala as the programming language. This course used to be the CCA 175 Spark and Hadoop Developer course for certification exam preparation. As of 10/31/2021, the exam has been sunset, and we have renamed the course to Spark SQL and Spark 3 using Scala, as it covers industry-relevant topics beyond the scope of the certification.

    About Data Engineering

    Data Engineering is nothing but processing data according to our downstream needs. We need to build different pipelines, such as Batch Pipelines and Streaming Pipelines, as part of Data Engineering. All roles related to Data Processing are consolidated under Data Engineering. Conventionally, these roles were known as ETL Development, Data Warehouse Development, etc. Apache Spark has evolved into a leading technology for Data Engineering at scale.

    I have prepared this course for anyone who would like to transition into a Data Engineer role using Spark (Scala). I am a Data Engineering Solution Architect with proven experience in designing solutions using Apache Spark.

    Let us go through the details of what you will be learning in this course. Keep in mind that the course is built around a lot of hands-on tasks, which will give you enough practice using the right tools. There are also tons of tasks and exercises for you to evaluate yourself.

    Setup of Single Node Big Data Cluster

    Many of you would like to transition to Big Data from conventional technologies such as Mainframes, Oracle PL/SQL, etc., and you might not have access to Big Data Clusters. It is very important that you set up the environment in the right manner. Don’t worry if you do not have a cluster handy; we will guide you through support via Udemy Q&A.

  • Setup Ubuntu-based AWS Cloud9 Instance with the right configuration

  • Ensure Docker is setup

  • Setup Jupyter Lab and other key components

  • Setup and Validate Hadoop, Hive, YARN, and Spark

    Are you feeling a bit overwhelmed about setting up the environment? Don’t worry! We will provide complimentary lab access for up to 2 months. Here are the details.

    Training is delivered in an interactive environment. You will get 2 weeks of lab access to begin with. If you like the environment and acknowledge it by providing a 5* rating and feedback, the lab access will be extended by an additional 6 weeks (2 months in total). Feel free to send an email to support@itversity.com to get complimentary lab access. Also, if your employer provides a multi-node environment, we will help you set up the material for practice as part of a live session. On top of Q&A support, we also provide the required support via live sessions.

    A Quick Recap of Scala

    This course requires decent knowledge of Scala. To make sure you understand Spark from a Data Engineering perspective, we have added a module to quickly warm up with Scala. If you are not familiar with Scala, we suggest you first go through relevant courses on Scala as a programming language.
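To give a flavor of the Scala constructs the warm-up module focuses on (case classes, collections, and basic map reduce style operations), here is a minimal, self-contained sketch; the `Order` case class and sample data are illustrative and not taken from the course material:

```scala
// Illustrative warm-up: a case class plus collection transformations,
// the style of Scala used throughout Data Engineering work.
case class Order(id: Int, status: String, amount: Double)

val orders = List(
  Order(1, "COMPLETE", 100.0),
  Order(2, "PENDING", 50.0),
  Order(3, "COMPLETE", 25.0)
)

// Filter, map, and reduce: total amount of completed orders
val completedTotal = orders
  .filter(_.status == "COMPLETE")
  .map(_.amount)
  .sum

// Group by status and count the orders in each group
val countsByStatus = orders
  .groupBy(_.status)
  .map { case (status, os) => (status, os.size) }

println(completedTotal) // 125.0
println(countsByStatus)
```

The same filter/map/group-by thinking carries over almost directly to Spark Data Frame APIs, which is why the recap emphasizes collections.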

    Data Engineering using Spark SQL

    Let us deep-dive into Spark SQL to understand how it can be used to build Data Engineering Pipelines. Spark SQL gives us the ability to leverage the distributed computing capabilities of Spark, coupled with easy-to-use, developer-friendly SQL-style syntax.

  • Getting Started with Spark SQL

  • Basic Transformations using Spark SQL

  • Managing Spark Metastore Tables – Basic DDL and DML

  • Managing Spark Metastore Tables – DML and Partitioning

  • Overview of Spark SQL Functions

  • Windowing Functions using Spark SQL
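The topics above can be sketched in a single hedged example. It assumes a local Spark 3 installation is on the classpath, and the table, column names, and data are invented for illustration, not taken from the course labs:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical sketch: assumes Spark 3 is available locally.
val spark = SparkSession.builder()
  .appName("SparkSQLSketch")
  .master("local[*]")
  .getOrCreate()

import spark.implicits._

// Register a small illustrative data set as a temporary view
// so it can be queried with SQL-style syntax.
Seq(
  ("2021-01-01", "COMPLETE", 100.0),
  ("2021-01-01", "CLOSED", 50.0),
  ("2021-01-02", "COMPLETE", 75.0)
).toDF("order_date", "order_status", "amount")
  .createOrReplaceTempView("orders")

// Filtering, aggregation by key, and a windowing function for ranking
spark.sql("""
  SELECT order_date,
         sum(amount) AS revenue,
         rank() OVER (ORDER BY sum(amount) DESC) AS revenue_rank
  FROM orders
  WHERE order_status IN ('COMPLETE', 'CLOSED')
  GROUP BY order_date
""").show()

spark.stop()
```

In the course itself, the same pattern is applied against Spark Metastore tables rather than temporary views.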

    Data Engineering using Spark Data Frame APIs

    Spark Data Frame APIs are an alternative way of building Data Engineering applications at scale leveraging distributed computing capabilities of Spark. Data Engineers from application development backgrounds might prefer Data Frame APIs over Spark SQL to build Data Engineering applications.

  • Data Processing Overview using Spark Data Frame APIs leveraging Scala as Programming Language

  • Processing Column Data using Spark Data Frame APIs leveraging Scala as Programming Language

  • Basic Transformations using Spark Data Frame APIs leveraging Scala as Programming Language – Filtering, Aggregations, and Sorting

  • Joining Data Sets using Spark Data Frame APIs leveraging Scala as Programming Language

    All the demos are given on our state-of-the-art Big Data cluster. You can avail of one month of complimentary lab access by reaching out to support@itversity.com with a Udemy receipt.
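As a companion to the SQL-style approach, the Data Frame API topics above might look like the following sketch. Again, this assumes a local Spark 3 installation, and all data sets and column names are invented for illustration:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Hypothetical sketch: assumes Spark 3 is available locally.
val spark = SparkSession.builder()
  .appName("DataFrameSketch")
  .master("local[*]")
  .getOrCreate()

import spark.implicits._

// Illustrative data sets
val orders = Seq(
  (1, 100, "COMPLETE"),
  (2, 200, "PENDING")
).toDF("order_id", "customer_id", "order_status")

val customers = Seq(
  (100, "Alice"),
  (300, "Bob")
).toDF("customer_id", "customer_name")

// Basic transformations: filtering, aggregation by key, and sorting
orders.filter($"order_status" === "COMPLETE")
  .groupBy($"order_status")
  .agg(count(lit(1)).alias("order_count"))
  .orderBy($"order_count".desc)
  .show()

// Outer join: customers are retained even without matching orders
customers.join(orders, Seq("customer_id"), "left_outer").show()

spark.stop()
```

Developers from application backgrounds often find this chained-method style more natural than embedded SQL strings, which is the trade-off this module explores.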

    Course Curriculum

    Chapter 1: Introduction

    Lecture 1: CCA 175 Spark and Hadoop Developer – Curriculum

    Chapter 2: Setting up Environment using AWS Cloud9

    Lecture 1: Getting Started with Cloud9

    Lecture 2: Creating Cloud9 Environment

    Lecture 3: Warming up with Cloud9 IDE

    Lecture 4: Overview of EC2 related to Cloud9

    Lecture 5: Opening ports for Cloud9 Instance

    Lecture 6: Associating Elastic IPs to Cloud9 Instance

    Lecture 7: Increase EBS Volume Size of Cloud9 Instance

    Lecture 8: Setup Jupyter Lab on Cloud9

    Lecture 9: [Commands] Setup Jupyter Lab on Cloud9

    Chapter 3: Setting up Environment – Overview of GCP and Provision Ubuntu VM

    Lecture 1: Signing up for GCP

    Lecture 2: Overview of GCP Web Console

    Lecture 3: Overview of GCP Pricing

    Lecture 4: Provision Ubuntu VM from GCP

    Lecture 5: Setup Docker

    Lecture 6: Why are we setting up Python and Jupyter Lab for a Scala-related course?

    Lecture 7: Validating Python

    Lecture 8: Setup Jupyter Lab

    Chapter 4: Setup Hadoop on Single Node Cluster

    Lecture 1: Introduction to Single Node Hadoop Cluster

    Lecture 2: Setup Prerequisites

    Lecture 3: [Commands] – Setup Prerequisites

    Lecture 4: Setup Passwordless Login

    Lecture 5: [Commands] – Setup Passwordless Login

    Lecture 6: Download and Install Hadoop

    Lecture 7: [Commands] – Download and Install Hadoop

    Lecture 8: Configure Hadoop HDFS

    Lecture 9: [Commands] – Configure Hadoop HDFS

    Lecture 10: Start and Validate HDFS

    Lecture 11: [Commands] – Start and Validate HDFS

    Lecture 12: Configure Hadoop YARN

    Lecture 13: [Commands] – Configure Hadoop YARN

    Lecture 14: Start and Validate YARN

    Lecture 15: [Commands] – Start and Validate YARN

    Lecture 16: Managing Single Node Hadoop

    Lecture 17: [Commands] – Managing Single Node Hadoop

    Chapter 5: Setup Hive and Spark on Single Node Cluster

    Lecture 1: Setup Data Sets for Practice

    Lecture 2: [Commands] – Setup Data Sets for Practice

    Lecture 3: Download and Install Hive

    Lecture 4: [Commands] – Download and Install Hive

    Lecture 5: Setup Database for Hive Metastore

    Lecture 6: [Commands] – Setup Database for Hive Metastore

    Lecture 7: Configure and Setup Hive Metastore

    Lecture 8: [Commands] – Configure and Setup Hive Metastore

    Lecture 9: Launch and Validate Hive

    Lecture 10: [Commands] – Launch and Validate Hive

    Lecture 11: Scripts to Manage Single Node Cluster

    Lecture 12: [Commands] – Scripts to Manage Single Node Cluster

    Lecture 13: Download and Install Spark 2

    Lecture 14: [Commands] – Download and Install Spark 2

    Lecture 15: Configure Spark 2

    Lecture 16: [Commands] – Configure Spark 2

    Lecture 17: Validate Spark 2 using CLIs

    Lecture 18: [Commands] – Validate Spark 2 using CLIs

    Lecture 19: Validate Jupyter Lab Setup

    Lecture 20: [Commands] – Validate Jupyter Lab Setup

    Lecture 21: Integrate Spark 2 with Jupyter Lab

    Lecture 22: [Commands] – Integrate Spark 2 with Jupyter Lab

    Lecture 23: Download and Install Spark 3

    Lecture 24: [Commands] – Download and Install Spark 3

    Lecture 25: Configure Spark 3

    Lecture 26: [Commands] – Configure Spark 3

    Lecture 27: Validate Spark 3 using CLIs

    Lecture 28: [Commands] – Validate Spark 3 using CLIs

    Lecture 29: Integrate Spark 3 with Jupyter Lab

    Lecture 30: [Commands] – Integrate Spark 3 with Jupyter Lab

    Chapter 6: Scala Fundamentals

    Lecture 1: Introduction and Setting up of Scala

    Lecture 2: Setup Scala on Windows

    Lecture 3: Basic Programming Constructs

    Lecture 4: Functions

    Lecture 5: Object Oriented Concepts – Classes

    Lecture 6: Object Oriented Concepts – Objects

    Lecture 7: Object Oriented Concepts – Case Classes

    Lecture 8: Collections – Seq, Set and Map

    Lecture 9: Basic Map Reduce Operations

    Lecture 10: Setting up Data Sets for Basic I/O Operations

    Lecture 11: Basic I/O Operations and using Scala Collections APIs

    Lecture 12: Tuples

    Lecture 13: Development Cycle – Create Program File

    Lecture 14: Development Cycle – Compile source code to jar using SBT

    Lecture 15: Development Cycle – Setup SBT on Windows

    Lecture 16: Development Cycle – Compile changes and run jar with arguments

    Lecture 17: Development Cycle – Setup IntelliJ with Scala

    Lecture 18: Development Cycle – Develop Scala application using SBT in IntelliJ

    Chapter 7: Overview of Hadoop HDFS Commands

    Lecture 1: Getting help or usage of HDFS Commands

    Lecture 2: Listing HDFS Files

    Lecture 3: Managing HDFS Directories

    Lecture 4: Copying files from local to HDFS

    Lecture 5: Copying files from HDFS to local

    Lecture 6: Getting File Metadata

    Lecture 7: Previewing Data in HDFS File

    Lecture 8: HDFS Block Size

    Lecture 9: HDFS Replication Factor

    Lecture 10: Getting HDFS Storage Usage

    Instructors

  • Durga Viswanatha Raju Gadiraju – CEO at ITVersity and CTO at Analytiqs, Inc
  • Madhuri Gadiraju
  • Sathvika Dandu
  • Pratik Kumar
  • Sai Varma
  • Phani Bhushan Bozzam

Rating Distribution

  • 1 stars: 81 votes
  • 2 stars: 124 votes
  • 3 stars: 400 votes
  • 4 stars: 1038 votes
  • 5 stars: 1215 votes

Frequently Asked Questions

    How long do I have access to the course materials?

    You can view and review the lecture materials indefinitely, like an on-demand channel.

    Can I take my courses with me wherever I go?

    Definitely! If you have an internet connection, courses on Udemy are available on any device at any time. If you don’t have an internet connection, some instructors also let their students download course lectures. That’s up to the instructor though, so make sure you get on their good side!