Data Lake Architectures & Data Management Principles

Big Data | Intermediate

10 videos | 34m 8s
Includes Assessment
Earns a Badge


A key component of wrangling data is the data lake framework. In this 10-video Skillsoft Aspire course, learners discover how to implement data lakes for real-time data management. Explore data ingestion, data processing, and data lifecycle management with Amazon Web Services (AWS) and open-source ecosystem products. Begin by examining real-time big data architectures and how to implement Lambda and Kappa architectures to manage real-time big data. View the benefits of adopting the Zaloni data lake reference architecture. Examine approaches to data ingestion and compare the benefits of the Avro and Parquet file formats. Explore data ingestion with Sqoop and the data processing strategies provided by MapReduce V2, Hive, Pig, and YARN for processing data with data lakes. Learn how to derive value from data lakes and describe the benefits of critical roles. Learners will explore the steps involved in the data lifecycle and the significance of archival policies. Finally, learn how to implement an archival policy to transition between S3 and Glacier, depending on the adopted policies. Close the course with an exercise on ingesting data and implementing an archival policy.

WHAT YOU WILL LEARN

Implement Lambda and Kappa architectures to manage real-time big data

Identify the benefits of adopting the Zaloni data lake reference architecture

Describe data ingestion approaches and compare the benefits of the Avro and Parquet file formats

Demonstrate how to ingest data using Sqoop

Describe the data processing strategies provided by MapReduce V2, Hive, Pig, and YARN for processing data with data lakes

Recognize how to derive value from data lakes and describe the benefits of critical roles

Describe the steps involved in the data life cycle and the significance of archival policies

Implement an archival policy to transition between S3 and Glacier, depending on adopted policies

Ingest data using Sqoop and implement an archival policy to transition from S3 to Glacier
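
As a rough illustration of the Sqoop ingestion objective, the sketch below only assembles the command line for a `sqoop import` run and prints it. It is not taken from the course: the helper name `build_sqoop_import`, the JDBC URL, the table, the user, and the paths are all hypothetical placeholders.

```python
import shlex


def build_sqoop_import(jdbc_url, table, target_dir, username,
                       password_file, file_format="avro", mappers=4):
    """Assemble the argv for a `sqoop import` run (hypothetical helper)."""
    # Sqoop selects the output file format with a dedicated flag.
    format_flags = {
        "avro": "--as-avrodatafile",
        "parquet": "--as-parquetfile",
    }
    if file_format not in format_flags:
        raise ValueError(f"unsupported file format: {file_format}")
    return [
        "sqoop", "import",
        "--connect", jdbc_url,             # source database (JDBC URL)
        "--username", username,
        "--password-file", password_file,  # avoids a password on the command line
        "--table", table,                  # source table to import
        "--target-dir", target_dir,        # destination directory in the lake
        "--num-mappers", str(mappers),     # number of parallel import tasks
        format_flags[file_format],
    ]


if __name__ == "__main__":
    # All values below are made-up examples.
    argv = build_sqoop_import(
        jdbc_url="jdbc:mysql://db.example.com/sales",
        table="orders",
        target_dir="/data/lake/raw/orders",
        username="etl_user",
        password_file="/user/etl/.db_password",
        file_format="parquet",
    )
    print(shlex.join(argv))
```

Switching `file_format` between `"avro"` and `"parquet"` is one simple way to compare the two formats on the same source table.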

IN THIS COURSE

2m 9s

4m 5s

Find out how to implement Lambda and Kappa architectures to manage real-time big data.
3. Data Lake Reference Architecture

2m 11s

In this video, find out how to identify the benefits of adopting Zaloni's data lake reference architecture.
4. Data Ingestion and File Formats

4m 44s

After completing this video, you will be able to describe data ingestion approaches and compare the benefits of Avro and Parquet file formats.
5. Ingestion Using Sqoop

5m 55s

In this video, you will learn how to ingest data using Sqoop.
6. Data Processing Strategies

3m 42s

Upon completion of this video, you will be able to describe the data processing strategies provided by MapReduce V2, Hive, Pig, and YARN for processing data with data lakes.
7. Deriving Value from Data Lakes

2m 32s

Upon completion of this video, you will be able to recognize how to derive value from data lakes and describe the benefits of critical roles.
8. Data Life Cycle

2m 28s

Upon completion of this video, you will be able to describe the steps involved in the data life cycle and the significance of archival policies.
9. S3 and Glacier

4m 4s

In this video, learn how to implement an archival policy to transition between S3 and Glacier, depending on the policies you have adopted.
10. Exercise: Ingest Data and Implement Archival Policy

2m 19s

In this video, find out how to ingest data using Sqoop and implement an archival policy to transition from S3 to Glacier.
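
To give a sense of what an S3-to-Glacier archival policy can look like, here is a minimal sketch that builds an S3 lifecycle configuration as a plain Python dictionary. It is an illustration under stated assumptions, not the course's own solution: the helper name `build_glacier_lifecycle`, the `raw/orders/` prefix, and the day counts are hypothetical.

```python
import json


def build_glacier_lifecycle(prefix, days_to_glacier, days_to_expire=None):
    """Build an S3 lifecycle configuration dict (hypothetical helper)."""
    if days_to_glacier < 0:
        raise ValueError("days_to_glacier must be non-negative")
    rule = {
        # Rule ID derived from the prefix, e.g. "archive-raw-orders".
        "ID": "archive-" + (prefix.strip("/").replace("/", "-") or "all"),
        "Filter": {"Prefix": prefix},   # objects the rule applies to
        "Status": "Enabled",
        # Move objects to the Glacier storage class after N days.
        "Transitions": [
            {"Days": days_to_glacier, "StorageClass": "GLACIER"},
        ],
    }
    if days_to_expire is not None:
        if days_to_expire <= days_to_glacier:
            raise ValueError("expiration must come after the transition")
        # Optionally delete objects once the retention period ends.
        rule["Expiration"] = {"Days": days_to_expire}
    return {"Rules": [rule]}


if __name__ == "__main__":
    # Example values only: archive after 30 days, delete after one year.
    config = build_glacier_lifecycle("raw/orders/", 30, days_to_expire=365)
    print(json.dumps(config, indent=2))
    # With boto3 installed and AWS credentials configured, a dictionary of
    # this shape could be applied to a bucket (the bucket name is made up):
    #   import boto3
    #   boto3.client("s3").put_bucket_lifecycle_configuration(
    #       Bucket="my-data-lake", LifecycleConfiguration=config)
```

Keeping the policy as data makes it easy to review the adopted retention periods before anything is applied to a bucket.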

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft offers you the opportunity to earn a digital badge upon successful completion of some of our courses. Badges can be shared on any social network or business platform.

Digital badges are yours to keep, forever.

