Cloudera Data Analyst Training: Using Pig, Hive, and Impala with Hadoop

Course content updated by LearnQuest
Price: 3,195 USD
Duration: 4 Days
Delivery Format: Classroom Training, Online Training
Cloudera Training


Course Description


Cloudera University's four-day data analyst training course focusing on Apache Pig and Hive and Cloudera Impala will teach you to apply traditional data analytics and business intelligence skills to big data. Cloudera presents the tools data professionals need to access, manipulate, transform, and analyze complex data sets using SQL and familiar scripting languages.



Topics covered include:

  • The features that Pig, Hive, and Impala offer for data acquisition, storage, and analysis
  • The fundamentals of Apache Hadoop and data ETL (extract, transform, load), ingestion, and processing with Hadoop tools
  • How Pig, Hive, and Impala improve productivity for typical analysis tasks
  • Joining diverse datasets to gain valuable business insight
  • Performing real-time, complex queries on datasets





    Hadoop Fundamentals

    • The Motivation for Hadoop
    • Hadoop Overview
    • Data Storage: HDFS
    • Distributed Data Processing: YARN, MapReduce, and Spark
    • Data Processing and Analysis: Pig, Hive, and Impala
    • Data Integration: Sqoop
    • Other Hadoop Data Tools
    • Exercise Scenarios Explanation

    Introduction to Pig

    • What Is Pig?
    • Pig's Features
    • Pig Use Cases
    • Interacting with Pig

    Basic Data Analysis with Pig

    • Pig Latin Syntax
    • Loading Data
    • Simple Data Types
    • Field Definitions
    • Data Output
    • Viewing the Schema
    • Filtering and Sorting Data
    • Commonly-Used Functions

    Processing Complex Data with Pig

    • Storage Formats
    • Complex/Nested Data Types
    • Grouping
    • Built-In Functions for Complex Data
    • Iterating Grouped Data

    Multi-Dataset Operations with Pig

    • Techniques for Combining Data Sets
    • Joining Data Sets in Pig
    • Set Operations
    • Splitting Data Sets

    Pig Troubleshooting and Optimization

    • Troubleshooting Pig
    • Logging
    • Using Hadoop's Web UI
    • Data Sampling and Debugging
    • Performance Overview
    • Understanding the Execution Plan
    • Tips for Improving the Performance of Your Pig Jobs

    Introduction to Hive and Impala

    • What Is Hive?
    • What Is Impala?
    • Schema and Data Storage
    • Comparing Hive to Traditional Databases
    • Hive Use Cases

    Querying with Hive and Impala

    • Databases and Tables
    • Basic Hive and Impala Query Language Syntax
    • Data Types
    • Differences Between Hive and Impala Query Syntax
    • Using Hue to Execute Queries
    • Using the Impala Shell

    Data Management

    • Data Storage
    • Creating Databases and Tables
    • Loading Data
    • Altering Databases and Tables
    • Simplifying Queries with Views
    • Storing Query Results

    Data Storage and Performance

    • Partitioning Tables
    • Choosing a File Format
    • Managing Metadata
    • Controlling Access to Data

    Relational Data Analysis with Hive and Impala

    • Joining Datasets
    • Common Built-In Functions
    • Aggregation and Windowing

    Working with Impala

    • How Impala Executes Queries
    • Extending Impala with User-Defined Functions
    • Improving Impala Performance

    Analyzing Text and Complex Data with Hive

    • Complex Values in Hive
    • Using Regular Expressions in Hive
    • Sentiment Analysis and N-Grams
    • Conclusion

    Hive Optimization

    • Understanding Query Performance
    • Controlling Job Execution Plan
    • Bucketing
    • Indexing Data

    Extending Hive

    • SerDes
    • Data Transformation with Custom Scripts
    • User-Defined Functions
    • Parameterized Queries

    Choosing the Best Tool for the Job

    • Comparing MapReduce, Pig, Hive, Impala, and Relational Databases
    • Which to Choose?

    Related Courses

    • HBase for Developers

      • Duration: 3 Days
      • Delivery Format: Classroom Training, Online Training
      • Price: 2,100.00 USD
