Bucketing & Window Functions with Hive

Apache Hive 2.3.2    |    Intermediate
  • 9 videos | 1h 3m 14s
  • Includes Assessment
  • Earns a Badge
Rating 4.2 of 6 users
Learners explore how Apache Hive query execution can be optimized, including techniques such as bucketing data sets, in this Skillsoft Aspire course. Using window functions to extract meaningful insights from data is also covered. This 9-video course assumes previous work with partitions in Hive, as well as a conceptual understanding of how buckets can improve query performance. Learners begin by focusing on how to use bucketing to process big data efficiently. They then take a look at HDFS (Hadoop Distributed File System) by navigating to the shell of the Hadoop master node; from there, they use the hadoop fs -ls command to examine the contents of the table's directory and observe three subdirectories corresponding to three partitions based on the value of the category column. Next, learners explore how to combine partitioning and bucketing to further improve query performance. Finally, learners explore the concept of windowing, which helps users analyze a subset of ordered data, and see how this technique can be implemented in Hive.
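
As a rough illustration of the partitioning-plus-bucketing idea described above, here is a minimal Python sketch (not Hive code). It assumes a simplified model in which an integer key is assigned to a bucket by taking its non-negative value modulo the number of buckets. The column names (category, product_id), the sample rows, and the choice of 4 buckets are hypothetical and do not come from the course.

```python
from collections import defaultdict


def bucket_for(key: int, num_buckets: int) -> int:
    """Simplified model of bucket assignment for an integer column:
    mask the key to a non-negative 31-bit value, then take it modulo
    the bucket count."""
    return (key & 0x7FFFFFFF) % num_buckets


def layout(rows, num_buckets):
    """Group rows by (partition value, bucket index), mimicking a table that
    is partitioned by category and bucketed by product_id."""
    files = defaultdict(list)
    for category, product_id in rows:
        files[(category, bucket_for(product_id, num_buckets))].append(product_id)
    return dict(files)


# Hypothetical sample data: (category, product_id)
rows = [
    ("phones", 101),
    ("phones", 102),
    ("laptops", 201),
    ("laptops", 204),
    ("tablets", 301),
]

table = layout(rows, num_buckets=4)
for (category, bucket), ids in sorted(table.items()):
    print(f"category={category} / bucket {bucket}: {ids}")
```

In this model each distinct category value maps to its own subdirectory, and each bucket to one file inside it, so a query that filters on both columns only needs to read a small slice of the data.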

WHAT YOU WILL LEARN

  • Implement bucketing for a Hive table and explore the structure of the table and bucket on HDFS
  • Apply both bucketing and partitioning to a table and describe the structure of such a table on HDFS
  • Extract further performance from Hive queries by sorting the contents of buckets
  • Work with samples of a Hive table by dividing it into buckets
  • Perform join operations on three or more tables by chaining the joins
  • Implement a window function to calculate running totals on an ordered dataset
  • Apply a window function within a partition of your dataset
  • Apply bucketing to Hive tables to boost query performance, and use window functions
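
The two window-function objectives above (a running total over an ordered dataset, and the same calculation within a partition) can be sketched in plain Python. This is only an illustration of the semantics, not Hive code; the column names (category, day, revenue) and the sample rows are hypothetical.

```python
def running_total(rows, partitioned=False):
    """Compute a running sum of revenue over rows ordered by day.

    Roughly mirrors a window expression such as
      SUM(revenue) OVER (ORDER BY day
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
    and, when partitioned=True, the same expression with
    PARTITION BY category added, so the total restarts for each category.
    """
    def sort_key(row):
        category, day, _ = row
        return (category if partitioned else "", day)

    totals = {}
    result = []
    for category, day, revenue in sorted(rows, key=sort_key):
        key = category if partitioned else None
        totals[key] = totals.get(key, 0) + revenue
        result.append((category, day, revenue, totals[key]))
    return result


# Hypothetical sample data: (category, day, revenue)
rows = [
    ("phones", 1, 100),
    ("laptops", 2, 300),
    ("phones", 3, 50),
    ("laptops", 4, 200),
]

for row in running_total(rows):
    print(row)
for row in running_total(rows, partitioned=True):
    print(row)
```

Without a partition the total accumulates across the whole ordered dataset; with a partition it accumulates separately per category.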

IN THIS COURSE

  • 2m 9s
  • 8m 58s
    In this video, you will learn how to implement bucketing for a Hive table and explore the structure of the table and bucket on HDFS.
  • 3.  Using Bucketing and Partitioning Together in Hive
    8m 13s
    Learn how to apply both bucketing and partitioning to a table and describe the structure of such a table on HDFS.
  • 4.  Sorting a Bucket's Contents in Hive
    4m 41s
    Learn how to extract better performance from Hive queries by sorting the contents of buckets.
  • 5.  Sampling a Table in Hive
    7m 42s
    Learn how to work with samples of a Hive table by dividing it into buckets.
  • 6.  Joining Multiple Tables in Hive
    7m 16s
    In this video, learn how to perform join operations on three or more tables by chaining the joins.
  • 7.  Introducing Window Functions in Hive
    9m 31s
    In this video, find out how to implement a window function to calculate running totals on an ordered dataset.
  • 8.  Window Functions with Partitions in Hive
    9m 23s
    During this video, you will learn how to apply a window function within a partition of your dataset.
  • 9.  Exercise: Bucketing and Window Functions in Hive
    5m 22s
    In this video, you will learn how to use bucketing and window functions to improve query performance in Hive.

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft gives you the opportunity to earn a digital badge upon successful completion of some of our courses. Badges can be shared on any social network or business platform.

Digital badges are yours to keep, forever.
