Designing and Building Big Data Applications

Course content updated by LearnQuest
Price: 3,195 USD
Duration: 4 Days
Delivery Format: Classroom Training, Online Training
Cloudera Training

Course Description


Cloudera University's four-day course for designing and building Big Data applications prepares you to analyze and solve real-world problems using Apache Hadoop and associated tools in the enterprise data hub (EDH). You will work through the entire process of designing and building solutions, including ingesting data, determining the appropriate file format for storage, processing the stored data, and presenting the results to the end user in an easy-to-digest form. Go beyond MapReduce to use additional elements of the EDH and develop converged applications that are highly relevant to the business.



  • Creating a data set with Kite SDK
  • Developing custom Flume components for data ingestion
  • Managing a multi-stage workflow with Oozie
  • Analyzing data with Crunch
  • Writing user-defined functions for Hive and Impala
  • Transforming data with Morphlines
  • Indexing data with Cloudera Search






      Application Architecture

      • Scenario Explanation
      • Understanding the Development Environment
      • Identifying and Collecting Input Data
      • Selecting Tools for Data Processing and Analysis
      • Presenting Results to the User

      Defining and Using Data Sets

      • Metadata Management
      • What is Apache Avro?
      • Avro Schemas
      • Avro Schema Evolution
      • Selecting a File Format
      • Performance Considerations

      Using the Kite SDK Data Module

      • What is the Kite SDK?
      • Fundamental Data Module Concepts
      • Creating New Data Sets Using the Kite SDK
      • Loading, Accessing, and Deleting a Data Set

      Importing Relational Data with Apache Sqoop

      • What is Apache Sqoop?
      • Basic Imports
      • Limiting Results
      • Improving Sqoop's Performance
      • Sqoop 2

      Capturing Data with Apache Flume

      • What is Apache Flume?
      • Basic Flume Architecture
      • Flume Sources
      • Flume Sinks
      • Flume Configuration
      • Logging Application Events to Hadoop

      Developing Custom Flume Components

      • Flume Data Flow and Common Extension Points
      • Custom Flume Sources
      • Developing a Flume Pollable Source
      • Developing a Flume Event-Driven Source
      • Custom Flume Interceptors
      • Developing a Header-Modifying Flume Interceptor
      • Developing a Filtering Flume Interceptor
      • Writing Avro Objects with a Custom Flume Interceptor

      Managing Workflows with Apache Oozie

      • The Need for Workflow Management
      • What is Apache Oozie?
      • Defining an Oozie Workflow
      • Validation, Packaging, and Deployment
      • Running and Tracking Workflows Using the CLI
      • Hue UI for Oozie

      Processing Data Pipelines with Apache Crunch

      • What is Apache Crunch?
      • Understanding the Crunch Pipeline
      • Comparing Crunch to Java MapReduce
      • Working with Crunch Projects
      • Reading and Writing Data in Crunch
      • Data Collection API Functions
      • Utility Classes in the Crunch API

      Working with Tables in Apache Hive

      • What is Apache Hive?
      • Accessing Hive
      • Basic Query Syntax
      • Creating and Populating Hive Tables
      • How Hive Reads Data
      • Using the RegexSerDe in Hive

      Developing User-Defined Functions

      • What are User-Defined Functions?
      • Implementing a User-Defined Function
      • Deploying Custom Libraries in Hive
      • Registering a User-Defined Function in Hive

      Executing Interactive Queries with Impala

      • What is Impala?
      • Comparing Hive to Impala
      • Running Queries in Impala
      • Support for User-Defined Functions
      • Data and Metadata Management

      Understanding Cloudera Search

      • What is Cloudera Search?
      • Search Architecture
      • Supported Document Formats

      Indexing Data with Cloudera Search

      • Collection and Schema Management
      • Morphlines
      • Indexing Data in Batch Mode
      • Indexing Data in Near Real Time

      Presenting Results to Users

      • Solr Query Syntax
      • Building a Search UI with Hue
      • Accessing Impala through JDBC
      • Powering a Custom Web Application with Impala and Search
