Duration: 5 Days
In this course, you will learn to design and implement the Greenplum environment and gain the information needed to install, configure, and manage the Greenplum database system. You will be introduced to the Greenplum environment, consisting of the Greenplum Database and supported systems. You will learn the fundamental concepts of data warehousing and business intelligence, and how Greenplum helps solve business problems in managing and analyzing big data. You will evaluate logical models and business requirements to determine the best physical design for a Greenplum database.
What You Will Learn
- Greenplum features, benefits, and architecture in terms of the shared-nothing and Massively Parallel Processing (MPP) design
- Support redundancy and high availability with Greenplum
- Data models used in data warehousing and how data is stored in Greenplum
- Install, initialize, validate, and configure Greenplum Database
- Manage database objects and workload management processes by defining and managing roles, privileges, and resource queues
- Use table partitioning as a design methodology for handling large tables
- Load data into a Greenplum database instance using external tables, the SQL COPY and INSERT commands, and parallel load utilities
- Use data manipulation language and data query language to access, manage, and query data
- Perform system administrative tasks, including managing and checking the state of the Greenplum database, its data, and the distribution of data
- Perform backup and restoration of Greenplum data
- Distribute and store data in Greenplum using a distribution key and partitioning
- Use EXPLAIN and EXPLAIN ANALYZE to see how the Greenplum query optimizer plans and executes a submitted query
- Improve query performance by keeping statistics up to date and tuning the database for sampling size and error conditions
- Determine when it's best to use an index and what type of index to use
- Improve query performance by following a number of performance enhancement tips
Audience
This course is intended for anyone who currently performs, or plans to perform, any of the following tasks:
- Install Greenplum Database
- Design and develop for Greenplum Database implementation
- Administer and manage the Greenplum Database
Prerequisites
- Basic UNIX or Linux command-line navigation and administration skills
- Database query language basics, including basic SQL knowledge for accessing database objects
- Fundamental relational database concepts
Course Outline
1. Greenplum Fundamental Concepts
- Basics of Data Warehousing
- Greenplum Concepts, Features, and Benefits
- Greenplum Architecture
- Shared Nothing and MPP Implementation
2. Database Installation and Initialization
- Systems Preparation and Verification
- Greenplum Database Initialization
3. Greenplum Database Tools, Utilities, and Internals
- psql Client and Greenplum Utilities
- Greenplum Performance Monitor
- Greenplum Database Server Configuration
- Greenplum Database Internals
4. Defining and Securing the User Database
- Data Definition Language
- Data Manipulation and Data Query Language
- Roles, Privileges, and Resources
5. Data Loading and Distribution
- Data Loading
- Table Partitioning
6. Database Management and Archiving
- Managing the Greenplum Database
- Backups and Restores
7. Data Modeling and Design
- Data Modeling
- Physical Design Decisions
8. Performance Analysis and Tuning
- JOIN Tables
- Database Tuning
- EXPLAIN Plan
- Improve Performance with Statistics
- Indexing Strategies
9. Developing Reports Using Advanced SQL
- Advanced Reporting Using Online Analytical Processing (OLAP)
- PostgreSQL Functions
- Advanced SQL Topics and Performance Tips
Course Labs
This course includes hands-on labs that give participants practical experience with the topics covered.