title
Please take a moment to fill out this form. We will get back to you as soon as possible.
All fields marked with an asterisk (*) are mandatory.
Aster Data Architecture and Basics
Course Description
Overview
This Aster Data Architecture and Basics course is designed to provide students with a deeper knowledge and understanding of the Aster Data Architecture and basic terms associated with an Aster Data system. Students will learn the Aster Data Architecture and basics starting at the most basic level and going to the most advanced level with many examples.Objectives
- Understand the Aster Data Architecture
- Discover four options for Aster Data Table Design
- Apply Temporary and Analytic Tables
- Identify SQL-MapReduce
- Name how Aster Processes Data
- Know how Joins Work Inside the Aster Engine
- Indicate Aster Modeling Rules
- Examine nPath in Action
- Define Indexes
- Demonstrate The Multi-Case Function
- Operate Aster Windows Functions
- Show the Fundamental SQL Commands That Work on Aster
- Prepare SQL-MapReduce Examples - Linear Regression
- Recall Tera-Tom’s Top Tips
Audience
- Anyone who has a desire to learn the Aster Data Architecture and basics from beginners to an advanced audience.
Prerequisites
- None
Topics
- What is Parallel Processing?
- Aster Data is a Parallel Processing System
- Each vworker holds a Portion of Every Table
- The Rows of a Table are Spread Across All vworkers
- The Aster Data Architecture
- The Queen, Worker, Loader, Backup Node
- The Aster Architecture Interconnect
- Backup and Loader Nodes Do Not use the Interconnect
- The Aster Architecture has Spare Nodes
- The Aster Architecture Allows Flexibility based on Need
- Aster Data Provides Four Fundamental Hardware Strengths
- Replication Failover
- Data is Compressed on Data Transfers
- Aster Utilizes Dual Optimizers
- Aster Allows a Hybrid of SQL and MapReduce
- MapReduce History
- What is MapReduce?
- What is SQL-MR?
- Support for Mixed Workload Management and Prioritization
- Aster Tables are defined as Fact or Dimension when Created
- Fact Table
- Dimension Table are Replicated
- Aster Data has Fact and Dimension Tables
- Aster Tables are defined as Fact or Dimension when Created
- Fact and Dimension Tables can be Hashed by the same Key
- Distribution Key Rules
- Aster Data Uses a Hash Formula
- The Hash Formula, Hash Map and vworker
- Placing rows on the vworker
- Placing rows on the vworker Continued
- A Review of the Hashing Process
- Like Data Hashes to the Same vworker
- Distribution Key Data Types
- Run ANALYZE to COLLECT STATISTICS on a Table
- Some Examples of ANALYZE
- What Columns to Analyze
- When a Table is Created, a Table Header is Created
- Every vworker has the Exact Same Tables
- All Aster Tables are spread across All vworkers
- The Table Header and the Data Rows are Stored Separately
- A vworker Stores the Rows of a Table inside a Data Block
- To Read Rows, a vworker Moves the Data Block into Memory
- A Full Table Scan Means All vworkers must Read All Rows
- The “Achilles Heel”, or Slowest Process, is Block Transfer
- Each Table has a Distribution Key
- A Query Using the Distribution Key uses a Single vworker.
- As Rows are Added, a Data Block will Eventually Split
- A Full Table Scan Means All vworkers Read All Blocks
- Distribution Key Query uses One vworker
- Each vworker Can Have Many Blocks for a Single Table
- A Full Table Scan Means All vworkers Read All Blocks
- There are Four Options to Aster Table Design
- Straight up Distribute by Hash
- Straight up Distribute by Hash - Problems
- Straight up Distribute by Replication
- Partition the Table with Logical Partitioning
- This Partitioned Table Sorts Rows by Month of Order_Date
- An All vworkers Retrieve By Way of a Single Partition
- You can Partition a Table by Range or by List
- A Partitioned By List Example with Three Tactical Queries
- Aster Data Multi-Level Partitioning
- Aster Allows for Multi-Level Partitioning
- SQL Commands for Logical Partitioning as One Table
- What Partitions are on my Table?
- What does a Columnar Table look like?
- A Comparison of Data for Normal Vs. Columnar
- A Columnar Table is best for Queries with Few Columns
- When to use a Columnar Table
- The Joining of Two Tables
- Aster Moves Joining Rows to the Same vworker
- Because of the Join Rule – Dimension Table are Replicated
- The Two Different Philosophies for Table Join Design
- What Could You Do If Two Tables Joined 1000 Times a Day?
- Fact and Dimension Tables can be Hashed by the same Key
- Joining Two Tables with the same PK/FK Distribution Key
- A Join With Co-Location
- A Performance Tuning Technique for Large Joins
- The Joining of Two Tables with an Additional WHERE Clause
- Aster Performs Joins Using Three Different Methods
- The Hash, Merge, and Nested Loop Joins
- Aster has Three Types of Data
- Create a Permanent Table Using Create Table AS (CTAS)
- Create a Logically Partitioned Table and Populate It
- Create a Temporary Table with using Create Table AS (CTAS)
- A Temporary Table in Action
- A Temporary Table That Uses an Insert/Select
- Create an Analytic Table Using an Insert/Select
- Create an Analytic Table Using CREATE TABLE AS (CTAS)
- Operations that Invalidate an Analytic Table
- If an Analytic Table is Invalid
- Tera-Tom History
- Modeling Rules for Aster Data
- Three Principles that Govern the Modeling Rules
- Modeling Rule 1 – Dimensionalize your Model
- A Dimensional Model is called a 'Star Schema'
- To Read a Data Block, a vworker Moves the Block to Memory
- A Dimensional Model Moves Less Mass into Memory
- Which Move From Disk to Memory Would You Choose?
- Vworkers transfer their Fact Table into Memory in Parallel
- Modeling Rule 2 – Use Columnar
- Which Move From Disk to Memory Would You Choose?
- Let's Discuss Modeling and Joins at the Simplest Level
- Let's Discuss Joins at the Simplest Level
- Modeling Rule 3 – Distribute your Tables Based on Joins
- The Two Different Philosophies for Table Join Design
- Facts are Hashed and most often the Dimension is Replicated
- Fact and Dimension Tables can be Hashed by the same Key
- Joining Two Tables with the same PK/FK Primary Index
- A Join With No Redistribution or Duplication
- Modeling Rule 4 – Replicate Dimension Tables
- Modeling Rule 5 – Partition Your Tables
- Modeling Rule 6 – Make Fact Tables Skinny
- Modeling Rule 6 – Make Fact Tables Skinny Example
- Modeling Rule 7 – Index Your Tables
- Modeling Rule 8 – Denormalize based on Your Environment.
- Tera-Tom's Top Tips
- Tera-Tom's Top Tips # 2
- Tera-Tom's Top Tips #3
- Tera-Tom's Top Tips #4
- When the GROUP BY Column is NOT the Distribution Key
- Example of GROUP BY Column is NOT the Distribution Key
- Tera-Tom's Top Tips #5
- Tera-Tom's Top Tips #6 – Use EXPLAIN
- Query Plan and Estimates
- Explain Plan Showing a Hash Join
- Explain Plan Showing a Merge Join
- Explain Plan Showing a Nested Loop Join
- There are Only Three Types of Scans
- Guidelines for Indexes
- The B-Tree Index
- Which Columns Might You Create an Index?
- A Visual of an Index (Conceptually)
- A Query Using an Index Uses All vworkers
- Multicolumn indexes
- A NUSI BITMAP Theory
- A NUSI Bitmap in Action
- Indexes on Expressions
- Indexes on Extracts of Dates
- GiST Indexes
- Five Operational Tips for Efficient Indexing
- REINDEX
- createCompressedIndexOnCompressedTableByDefault Flag
- Cumulative Sum - Major and Minor Sort Key(s)
- The ANSI CSUM – Getting a Sequential Number
- The ANSI OLAP – Reset with a PARTITION BY Statement
- PARTITION BY only Resets a Single OLAP not ALL of them
- ANSI Moving Sum is Current Row and Preceding n Rows
- How ANSI Moving SUM Handles the Sort
- Moving SUM every 3-rows vs. a Continuous Sum
- Moving Average
- Partition By Resets an ANSI OLAP
- Moving Average Using BETWEEN
- Moving Difference using ANSI Syntax
- Moving Difference using ANSI Syntax with Partition By
- RANK Defaults to Ascending Order
- Getting RANK to Sort in DESC Order
- You can use Window Functions in Expressions
- RANK() OVER and PARTITION BY
- DENSE_RANK() OVER
- PERCENT_RANK() OVER
- RANK With ORDER BY SUM()
- COUNT OVER for a Sequential Number
- The MAX OVER Command
- MAX OVER with PARTITION BY Reset
- The MIN OVER Command
- The Row_Number Command
- NTILE
- CUME_DIST
- LEAD
- LAG
- FIRST_VALUE
- LAST_VALUE
- NTH_VALUE
- SUM(SUM(n))
- MapReduce History
- What is MapReduce?
- What is SQL-MapReduce?
- SQL-MapReduce Input and Output
- Subtle SQL-MapReduce Processing
- Aster Data Provides an Analytic Foundation
- Path, Text, and Statistical Analysis
- Segmentation (Data Mining)
- Graph Analysis
- Transformation of Data
- Sessionize
- Tokenize
- SQL-MapReduce Function… nPath
- nPath SELECT Clause
- nPath ON Clause
- nPath PARTITION BY Expression
- nPath DIMENSION Expression
- nPath ORDER BY Expression
- nPath MODE Clause has Overlapping or NonOverlapping
- nPath PATTERN Clause
- Pattern Operators
- Matching Patterns Which Repeat
- nPath SYMBOLS Clause
- nPath RESULTS Clause
- Adding an Aggregate to nPath Results
- SQL-MapReduce Examples:
- Use Regular SQL
- Create Objects
- Subquery
- Query as Input
- Functions
- Functions in Derived Tables
- SMAVG
- Pack Function
- Pack Function (Continued)
- Pivot Columns
Self-Paced Training Info
Learn at your own pace with anytime, anywhere training
- Same in-demand topics as instructor-led public and private classes.
- Standalone learning or supplemental reinforcement.
- e-Learning content varies by course and technology.
- View the Self-Paced version of this outline and what is included in the SPVC course.
- Learn more about e-Learning
Course Added To Shopping Cart
bla
bla
bla
bla
bla
bla
Self-Paced Training Terms & Conditions
Exam Terms & Conditions
Sorry, there are no classes that meet your criteria.
Please contact us to schedule a class.
STOP! Before You Leave
Save 0% on this course!
Take advantage of our online-only offer & save 0% on any course !
Promo Code skip0 will be applied to your registration
Purchase Information
title
Please take a moment to fill out this form. We will get back to you as soon as possible.
All fields marked with an asterisk (*) are mandatory.