Advanced Operations Using Hadoop MapReduce
Apache Hadoop | Intermediate
- 9 videos | 48m 16s
- Includes Assessment
- Earns a Badge
In this Skillsoft Aspire course, explore how MapReduce can be used to extract the five most expensive vehicles in a data set, then build an inverted index for the words appearing in a set of text files. Begin by defining a vehicle type that represents automobiles and can be stored in a Java PriorityQueue, then configure a Mapper that uses a PriorityQueue to keep the five most expensive automobiles it has processed from the data set. Learn how to use a PriorityQueue in the Reducer to receive the five most expensive automobiles from each mapper and write the top five overall to the output, then execute the application to verify the results. Next, explore how to use the MapReduce framework to generate an inverted index, and configure the Reducer and Driver for the inverted index application. You then run the application and examine the inverted index on HDFS (Hadoop Distributed File System). The course concludes with an exercise on advanced operations using MapReduce.
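The top-five pattern described above can be illustrated in plain Java, without a Hadoop cluster. This is a minimal sketch rather than the course's code: the `Vehicle` fields, the `topN` helper, and the sample data are all assumptions made for illustration. Each "mapper" keeps a bounded min-heap of its most expensive vehicles, and a single "reducer" applies the same logic to the candidates it receives.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

final class TopVehiclesSketch {

    // Hypothetical vehicle type; the field names are assumptions, not the course's.
    static final class Vehicle implements Comparable<Vehicle> {
        final String name;
        final double price;

        Vehicle(String name, double price) {
            this.name = name;
            this.price = price;
        }

        // Natural order is ascending price, so a PriorityQueue keeps the
        // cheapest vehicle at its head (a min-heap on price).
        @Override
        public int compareTo(Vehicle other) {
            return Double.compare(price, other.price);
        }
    }

    // Keeps at most n vehicles in the heap; whenever it grows past n,
    // the cheapest one is evicted, leaving the n most expensive seen so far.
    static List<Vehicle> topN(Iterable<Vehicle> vehicles, int n) {
        PriorityQueue<Vehicle> heap = new PriorityQueue<>();
        for (Vehicle v : vehicles) {
            heap.add(v);
            if (heap.size() > n) {
                heap.poll();
            }
        }
        List<Vehicle> result = new ArrayList<>(heap);
        result.sort(Collections.reverseOrder()); // most expensive first
        return result;
    }

    public static void main(String[] args) {
        // Two in-memory lists stand in for the input splits of two mappers.
        List<Vehicle> split1 = Arrays.asList(
                new Vehicle("A", 21000.0), new Vehicle("B", 54000.0),
                new Vehicle("C", 18500.0), new Vehicle("D", 76000.0),
                new Vehicle("E", 32000.0), new Vehicle("F", 41000.0));
        List<Vehicle> split2 = Arrays.asList(
                new Vehicle("G", 99000.0), new Vehicle("H", 15000.0),
                new Vehicle("I", 63000.0), new Vehicle("J", 27000.0),
                new Vehicle("K", 88000.0), new Vehicle("L", 47000.0));

        // "Mapper" stage: each split contributes only its own top five.
        List<Vehicle> candidates = new ArrayList<>();
        candidates.addAll(topN(split1, 5));
        candidates.addAll(topN(split2, 5));

        // "Reducer" stage: the same bounded heap picks the overall top five.
        for (Vehicle v : topN(candidates, 5)) {
            System.out.println(v.name + "\t" + v.price);
        }
    }
}
```

Because every mapper forwards at most five records, the reducer sees a small candidate set regardless of input size. In a real job this typically relies on all candidates reaching one reducer, which is a configuration assumption here rather than something the sketch enforces.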
WHAT YOU WILL LEARN
- Define a vehicle type that can be used to represent automobiles to be stored in a Java PriorityQueue
- Configure a Mapper to use a PriorityQueue to store the five most expensive vehicles it has processed from the data set
- Use a PriorityQueue in the Reducer of the application to receive the five most expensive automobiles from each mapper and write the top five vehicles overall to the output
- Execute the application and examine the output on HDFS to confirm that the five most expensive automobiles have been written out
- Define the Mapper for a MapReduce application to build an inverted index from a set of text files
- Configure the Reducer and the Driver for the inverted index application
- Run the application and examine the inverted index on HDFS
- Recognize the data structures and configurations involved when extracting the top N values from a data set
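For readers unfamiliar with the term, an inverted index maps each word to the files in which it appears. The following is a minimal in-memory sketch in plain Java, not the course's Hadoop implementation: the `build` method, the file names, and the tokenization rule (lower-casing and splitting on non-word characters) are assumptions for illustration. The inner loop plays the role of the Mapper emitting (word, file name) pairs, and the grouping into a set per word plays the role of the Reducer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

final class InvertedIndexSketch {

    // Builds word -> sorted set of file names from fileName -> contents.
    static Map<String, TreeSet<String>> build(Map<String, String> files) {
        Map<String, TreeSet<String>> index = new TreeMap<>();
        for (Map.Entry<String, String> file : files.entrySet()) {
            // "Map" step: tokenize the file and pair each word with its file name.
            for (String word : file.getValue().toLowerCase().split("\\W+")) {
                if (word.isEmpty()) {
                    continue; // split can yield an empty leading token
                }
                // "Reduce" step: collect the distinct file names for each word.
                index.computeIfAbsent(word, k -> new TreeSet<>()).add(file.getKey());
            }
        }
        return index;
    }

    public static void main(String[] args) {
        // Hypothetical input files, held in memory instead of on HDFS.
        Map<String, String> files = new LinkedHashMap<>();
        files.put("a.txt", "Hadoop stores data");
        files.put("b.txt", "MapReduce processes data on Hadoop");

        // One line per word: the word, a tab, then the files containing it.
        build(files).forEach((word, names) ->
                System.out.println(word + "\t" + String.join(",", names)));
    }
}
```

A `TreeSet` is used so that each file name is listed once per word and in a stable order; a Hadoop job would achieve the grouping through the shuffle between Mapper and Reducer instead.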
IN THIS COURSE
- 2m 30s
- 6m 45s: Find out how to define a vehicle type that can be used to represent automobiles to be stored in a Java PriorityQueue.
- 5m 31s: In this video, learn how to configure a Mapper to use a PriorityQueue to store the five most expensive vehicles it has processed from the dataset.
- 6m 29s: In this video, learn how to use a PriorityQueue in the Reducer of the application to receive the five most expensive automobiles from each mapper and write the top 5 vehicles overall to the output.
- 5m 7s: Find out how to execute the application, and examine the output on HDFS to confirm that the five most expensive automobiles have been written out.
- 6m 5s: In this video, you will define the Mapper for a MapReduce application to build an inverted index from a set of text files.
- 5m 31s: In this video, you will learn how to configure the Reducer and the Driver for the inverted index application.
- 5m 1s: Learn how to run the application and examine the inverted index on a Hadoop Distributed File System.
- 5m 17s: Upon completion of this video, you will be able to recognize the data structures and configurations involved when extracting the top N values from a data set.
EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE
Skillsoft gives you the opportunity to earn a digital badge upon successful completion of some of our courses. Badges can be shared on any social network or business platform.
Digital badges are yours to keep, forever.