Aster Data Basics
Course Description
Overview
This Aster Data Basics course covers the challenges of big data and the solutions that the Aster Data platform provides. It discusses in detail how Aster Data tables are created, loaded and queried, so users are better prepared for large table joins and advanced analytics. It also discusses options for tuning queries with logical partitioning and columnar capabilities. The course is 50% hands-on training and 50% lively discussion.
Objectives
- Be familiar with Aster Data terms and understand the challenges that big data brings
- Be able to discuss the different options for creating Aster Data tables and the impact of their design on specific environments
- Understand how to design a fundamental Aster Data solution
- Have hands-on experience with querying Aster Data
- Be able to discuss what to do and what not to do in order to query and design an Aster Data solution effectively
- Learn about MapReduce and its functionality and strengths
- Gain extensive experience and knowledge in running a wide variety of Aster Data analytics
- Have the confidence necessary to be a solid performer on Aster Data
Audience
- Anyone involved in an Aster Data project, or anyone who is or will soon be working with big data
- Managers, DBAs, developers and any IT or business professionals who are involved in managing, designing or querying an Aster Data system
Prerequisites
- None - any and all who are interested and enthused are welcome
Topics
- Big Data is here
- The Financials from Big Data
- There are four Axes to Big Data
- The Sources of Big Data
- Big Data outside the Data Warehouse
- Introduction to MapReduce
- MapReduce Details
- New Data + New Analytics = New Capabilities
- What is a Data Scientist?
- A Day in the Life of a Data Scientist
- Fact Tables
- Dimension Tables
- Tables that Hash
- Tables that Duplicate
- The three types of tables
- Regular or Persistent Tables
- Temporary Tables
- Analytic Tables
- Rules for Distribution Keys
- Partitioning a Table
- Logical Partitioning
- Automatic Logical Partitioning
- Partition by Range
- Partition by List
- Columnar Tables
- When to use Columnar Tables
- Eight Fundamental Rules for Modeling Big Data in Aster Database
- Introduction to Data Modeling in Aster Database
- Dimensionalize Your Schema
- Use Columnar Tables When Appropriate
- Distribute Your Data with Joins in Mind
- Replicate Common, Frequently Joined Data
- Split Data into Child Partitions
- Verticalize Your Schema
- Index Your Tables
- Consider Using a Denormalized Data Model
- Data Modeling FAQ
- Three Principles for Aster Big Data
- Move as Little Data as Possible
- Don’t Read Irrelevant Data
- Don’t Do Redundant Processing
- Distribute Your Data with Joins in Mind
- Join your Tables on the same Hash
- Join your smaller Tables with Replication
- Use Denormalization When Appropriate
- Joins are better than Subqueries
- Move the subselect from the WHERE to the FROM Clause
- Use Inner Joins to Shrink the size of data on which Outer Joins operate
- Avoid Per-Row Functions and Group Rows First
- Avoid Joins, Group BYs, and WHERE clauses on non-native data types
- EXPLAIN is your friend
- Understand How Joins Work
- Join Tables on Distribution Key if Possible
- Use Group BY and Count Distinct on the Distribution Key
- The Three Types of Queries in which Redistribution May Happen
- on a Fact Table with Group BY or Distinct
- Multiple Queries together with UNION or INTERSECT
- The Three Types of Table Scans
- Scans
- Scans
- Index Scan
- The Three ways a Join Happens between Two Tables
- Loop
- Joins
- Joins
- What is SQL-MapReduce?
- What Does SQL-MapReduce Do?
- SQL-MR Example – Correlation
- SQL-MR Syntax Section
- SQL-MR Syntax
- SQL-MR Syntax – From Function
- SQL-MR Syntax – On Clause
- SQL-MR Syntax – Partition by Clause
- SQL-MR Syntax – Order by Clause
- SQL-MR Syntax – Argument Clause
- SQL-MR Syntax – Sessionize Example
- Basic Primitive – SQL-MR Function
- Pre-Packaged SQL-MapReduce Analytic Functions
- Categories of Analytic Functions
- Cluster Analysis Section
- Canopy Clustering
- Canopy Example
- Kmeans Clustering
- Kmeans Training Example
- Kmeansplot Predict Example
- Minhash Clustering
- Minhash Example
- Associative Analysis Section
- Market Basket Generator
- Market Basket Syntax
- Market Basket Example
- Collaborative Filtering
- Cfilter Example
- Statistical Analysis Section
- Linear Regression
- Linear Regression – Training
- Linear Regression – Prediction
- Generalized Linear Model
- GLM Score
- In-Line Lab
- Logistic Regression
- Logistic Regression – Training
- Logistic Regression – Prediction
- Logistic Regression Example
- Correlation
- Knearest Neighbor (knn)
- Knearest Neighbor Example
- Histogram
- Histogram Example: Continuous, Non-Overlap, Equal
- Histogram Example: Discrete, Overlap, Unequal
- Predictive Analysis Section
- Naïve Bayes
- Naïve Bayes Concept
- Naïve Bayes Syntax
- Naïve Bayes Example
- Support Vector Machines (SVM)
- SVM Example
- Decision Tree
- Decision Tree Syntax
- Decision Tree Example 1 (Regression)
- Decision Tree Example 2 (Classification)
- Decision Tree Example 3 (Binary)
- Creating Graphs from Cfilter and nPath
- GraphGen Concept and Syntax
- GraphGen Charts – Sankey
- GraphGen Charts – Chord
- GraphGen Charts – Tree
- GraphGen Charts – Sigma
- Categories of Analytic Functions
- Time Series and Attribution Analysis Section
- Sessionization
- Sessionization Example
- Attribution – The Big Picture
- Attribution Syntax
- Attribution Arguments
- Window Size – Rows
- Window Size – Seconds
- Window Size – Rows & Seconds
- MODEL1/MODEL2 and Distribution Models
- MODEL1/MODEL2 Types
- Model1/Model2 Can Use Event or K
- Using Event with Weight, Model and Parameters
- What Happens If We Don’t Have 8 Rows?
- Using K with Segment_rows
- Using K with Segment_seconds
- Advanced Attribution – Using Multiple Models
- Multiple Models Example
- Advanced Attribution – Multiple Input
- Multiple Input Examples
- Graph Analysis Section
- Single Source Shortest Path (SSSP)
- SSSP Example
- The nTree
- An nTree Example
- Another nTree Example
- Social Media Functions
- Pagerank
- Text Analysis Section
- Text Parser
- Text Parser Example
- An nGram
- An nGram Example
- Levenshtein Distance
- Named Entity Recognition (NER)
- Named Entity Recognition Example
- Extract Sentiment
- Extract Sentiment Example 1
- Extract Sentiment Example 2
- Naïve Bayes Text Classifier
- Naïve Bayes Text Training Set
- Export Naïve Bayes Model and Install
- Naïve Bayes Text Predict
- Naïve Bayes Text Answer Set
- Text Classifier Overview
- TextClassifierTrainer
- TextClassifier
- Data Transformation Section
- Data Transformation – Antiselect
- Data Transformation – Unpack
- Data Transformation – Pack
- Data Transformation – Multicase (Input & Output)
- Data Transformation – Pivot
- Data Transformation – Apache Log Parsing
- Data Transformation – XML Parsing