Hadoop HDFS Getting Started

Apache Hadoop | Beginner
  • 12 videos | 1h 14m 36s
  • Includes Assessment
  • Earns a Badge
Rating: 4.7 (45 users)
Explore the concepts behind analyzing large data sets in this 12-video Skillsoft Aspire course on Hadoop and the Hadoop Distributed File System (HDFS), which together enable efficient parallel processing of big data on a distributed cluster. The course is purely conceptual: it contains no labs, and it provides just enough background on Hadoop and its components to understand how Hadoop and HDFS allow big data to be processed in parallel. It opens by explaining vertical and horizontal scaling, then discusses the functions Hadoop serves in horizontally scaling data processing tasks. Learners explore the roles of YARN, MapReduce, and HDFS, covering how HDFS keeps track of where the pieces of large files are distributed, how data is replicated, and how HDFS works with Zookeeper, a tool maintained by the Apache Software Foundation that provides coordination and synchronization for distributed systems, along with other distributed-computing services such as naming and configuration management. The course closes with a look at Spark, a data analytics engine for distributed data processing.
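To make the description above concrete, here is a minimal sketch (not part of the course) of how a Java client can ask HDFS where the blocks of a large file are stored and how many replicas each one has. It assumes the Hadoop client libraries are on the classpath and that core-site.xml/hdfs-site.xml point at a reachable cluster; the path /data/large-file.csv is a hypothetical example.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockLocationsDemo {
        public static void main(String[] args) throws Exception {
            // Picks up core-site.xml / hdfs-site.xml from the classpath,
            // including fs.defaultFS pointing at the NameNode.
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // Hypothetical file; replace with any path that exists in your cluster.
            Path file = new Path("/data/large-file.csv");
            FileStatus status = fs.getFileStatus(file);

            // The NameNode reports, for each block, the DataNodes holding a replica.
            BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
            System.out.println("Replication factor: " + status.getReplication());
            for (BlockLocation block : blocks) {
                System.out.printf("Block at offset %d (%d bytes) on nodes: %s%n",
                        block.getOffset(), block.getLength(),
                        String.join(", ", block.getHosts()));
            }
            fs.close();
        }
    }

This is the bookkeeping the course describes: the NameNode knows which DataNodes hold each piece of the file, which is what lets many machines read and process those pieces in parallel.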

WHAT YOU WILL LEARN

  • Recognize the need to process massive datasets at scale
  • Describe the benefits of horizontal scaling for processing big data and the challenges of this approach
  • Recall the features of a distributed cluster that address the challenges of horizontal scaling
  • Identify the features of HDFS that enable large datasets to be distributed across a cluster
  • Describe the simple and high-availability architectures of HDFS and the implementations for each of them
  • Identify the role of Hadoop's MapReduce in processing chunks of big datasets in parallel (a brief code sketch follows this list)
  • Recognize the role of the YARN resource negotiator in enabling Map and Reduce operations to execute on a cluster
  • Describe the steps involved in resource allocation and job execution for operations on a Hadoop cluster
  • Recall how Apache Zookeeper enables the HDFS NameNode and YARN ResourceManager to run in high-availability mode
  • Identify various technologies that integrate with Hadoop and simplify the task of big data processing
  • Recognize the key features of distributed clusters, HDFS, and the inputs and outputs of the Map and Reduce phases
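
For readers who want to see what the Map and Reduce phases referenced above look like in practice, here is a minimal word-count sketch against Hadoop's Java MapReduce API. It is illustrative only, not course material; the input and output HDFS paths are supplied as command-line arguments.

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        // Map phase: each mapper works on one chunk (input split) of the data
        // and emits (word, 1) pairs.
        public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                StringTokenizer tokens = new StringTokenizer(value.toString());
                while (tokens.hasMoreTokens()) {
                    word.set(tokens.nextToken());
                    context.write(word, ONE);
                }
            }
        }

        // Reduce phase: all counts for the same word are grouped together
        // and summed, producing (word, total) pairs.
        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable count : values) {
                    sum += count.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setCombinerClass(SumReducer.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));    // HDFS input directory
            FileOutputFormat.setOutputPath(job, new Path(args[1]));  // HDFS output directory
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

Packaged as a jar, a job like this would typically be submitted with hadoop jar, after which YARN schedules the map tasks (one per input split) and the reduce tasks across the cluster.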

IN THIS COURSE

  • 2m 17s
  • 4m 29s
    After completing this video, you will be able to recognize the need to process massive datasets quickly.
  • 3.  Horizontal Scaling for Big Data
    7m 12s
    After completing this video, you will be able to describe the benefits of horizontal scaling for processing big data and the challenges of this approach.
  • 4.  Distributed Clusters and Horizontal Scaling
    8m 1s
    After completing this video, you will be able to recall the features of a distributed cluster that address the challenges of horizontal scaling.
  • 5.  Overview of HDFS
    4m 52s
    In this video, find out how to identify the features of HDFS that enable large datasets to be distributed across a cluster.
  • 6.  HDFS Architectures
    6m 51s
    Upon completion of this video, you will be able to describe the simple and high-availability architectures of HDFS and the implementations for each of them.
  • 7.  MapReduce for HDFS
    8m 24s
    In this video, you will identify the role of Hadoop's MapReduce in processing chunks of big datasets in parallel.
  • 8.  YARN for HDFS
    6m 49s
    Upon completion of this video, you will be able to recognize the role of the YARN resource negotiator in enabling Map and Reduce operations to execute on a cluster.
  • 9.  The Mechanism of Resource Allocation in Hadoop
    2m 43s
    Upon completion of this video, you will be able to describe the steps involved in resource allocation and job execution for operations on a Hadoop cluster.
  • 10.  Apache Zookeeper for HDFS
    8m 25s
    Upon completion of this video, you will be able to recall how Apache Zookeeper enables the HDFS NameNode and YARN ResourceManager to run in high-availability mode (a configuration sketch follows this list).
  • 11.  The Hadoop Ecosystem
    8m 9s
    In this video, you will identify various technologies that integrate with Hadoop and simplify the task of big data processing.
  • 12.  Exercise: An Introduction to HDFS
    6m 26s
    After completing this video, you will be able to recognize the key features of distributed clusters, HDFS, and the inputs and outputs of the Map and Reduce phases.
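
As a companion to videos 6 and 10, the following sketch shows the standard HDFS high-availability settings as a Java client might supply them programmatically; in practice these properties normally live in hdfs-site.xml and core-site.xml. The nameservice ID mycluster, the NameNode IDs nn1/nn2, and all hostnames below are hypothetical placeholders, not values from the course.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class HaClientDemo {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();

            // Logical name for the HA pair of NameNodes and their RPC addresses.
            conf.set("dfs.nameservices", "mycluster");
            conf.set("dfs.ha.namenodes.mycluster", "nn1,nn2");
            conf.set("dfs.namenode.rpc-address.mycluster.nn1", "namenode1.example.com:8020");
            conf.set("dfs.namenode.rpc-address.mycluster.nn2", "namenode2.example.com:8020");

            // Client-side failover: the proxy provider works out which NameNode is active.
            conf.set("dfs.client.failover.proxy.provider.mycluster",
                    "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");

            // Automatic failover is coordinated through a Zookeeper ensemble.
            conf.setBoolean("dfs.ha.automatic-failover.enabled", true);
            conf.set("ha.zookeeper.quorum",
                    "zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181");

            // Clients address the nameservice, not an individual NameNode host.
            conf.set("fs.defaultFS", "hdfs://mycluster");

            FileSystem fs = FileSystem.get(conf);
            System.out.println("Home directory: " + fs.getHomeDirectory());
            fs.close();
        }
    }

Automatic failover additionally requires a ZKFailoverController process alongside each NameNode; those controllers use the Zookeeper ensemble to elect which NameNode is active, which is the coordination role the course attributes to Zookeeper.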

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

For some of our courses, Skillsoft gives you the opportunity to earn a digital badge upon successful completion; the badge can be shared on any social network or business platform.

Digital badges are yours to keep, forever.
