Data Analytics With Hadoop And Spark
Course Description
Overview
This course introduces Apache Spark. Students will learn how Spark fits into the Big Data ecosystem and how to use Spark for data analysis. The class is taught in Python using the Jupyter environment.
Objectives
- Spark ecosystem
- Spark Shell
- Spark Data structures (RDD / Dataframe / Dataset)
- Spark SQL
- Modern data formats and Spark
- Spark & Hadoop & Hive
Audience
Prerequisites
- Analyst background (familiarity with SQL, scripting, etc.)
Topics
- Big Data, Hadoop, Spark
- Spark concepts and architecture
- Spark components overview
- Labs: Installing and running Spark
- Spark shell
- Spark web UIs
- Analyzing dataset – part 1
- Labs: Spark shell exploration
- Partitions
- Distributed execution
- Operations: transformations and actions
- Labs: Unstructured data analytics using RDDs
- Caching overview
- Various caching mechanisms available in Spark
- In memory file systems
- Caching use cases and best practices
- Labs: Benchmark of caching performance
- Dataframes Intro
- Loading structured data (JSON, CSV) using Dataframes
- Using schema
- Specifying schema for Dataframes
- Labs: Dataframes, Datasets, Schema
- Spark SQL concepts and overview
- Defining tables and importing datasets
- Querying data using SQL
- Handling various storage formats: JSON / Parquet / ORC
- Labs: querying structured data using SQL; evaluating data formats
- Hadoop Primer: HDFS / YARN
- Hadoop + Spark architecture
- Running Spark on Hadoop YARN
- Processing HDFS files using Spark
- Spark & Hive
- These are group workshops
- Attendees will work on solving real-world data analysis problems using Spark
Related Courses
Hadoop for Systems Administrators (OSUN-660)
- Duration: 3 Days
- Delivery Format: Classroom Training, Online Training
- Price: 1,755.00 USD
Hadoop for Developers (EJHD-125)
- Duration: 3
- Delivery Format: Classroom Training, Online Training
- Price: 2,100.00 USD