Data Engineering on Microsoft Azure: Data Partitioning

Azure    |    Intermediate
  • 11 videos | 1h 2m 57s
  • Includes Assessment
  • Earns a Badge
Rating 4.7 of 49 users
Partitioning data is key to ensuring efficient processing. In this course, you'll explore what data partitioning is and the strategies for implementation. You'll learn about transactional and analytical workloads and how to determine the best strategy for your files and table storage. Then, you'll examine design patterns for efficiency and performance. You'll learn about partitioning dedicated SQL pools in Azure Synapse Analytics and partitioning data lakes. Finally, you'll learn how data sharding across multiple data stores can be used for improving transaction performance. This course is one in a collection that prepares learners for the Microsoft Data Engineering on Microsoft Azure (DP-203) exam.

WHAT YOU WILL LEARN

  • Discover the key concepts covered in this course
  • Recognize data partitioning concepts
  • Describe data partitioning strategies for different services
  • Compare transactional and analytical workloads to determine which data store and partitioning strategy to implement
  • Recognize criteria for determining how to partition files for efficient distribution and querying
  • Describe how to partition a table to ensure efficient scalability for analytical workloads
  • Describe how the index table and materialized view design patterns can increase the efficiency and performance of queries
  • Describe the table partitions used by Azure Synapse Analytics and how to size them, and recognize the differences from SQL Server
  • Describe when to implement partitioning at the storage layer
  • Describe how data sharding distributes load over multiple data stores
  • Summarize the key concepts covered in this course

IN THIS COURSE

  • 1m 33s
    Explore what data partitioning is and define strategies for implementation. You'll also cover transactional and analytical workloads and determine the best strategy for your files and table storage. Then you'll explore design patterns for efficiency and performance. Next, you'll explore partitioning dedicated SQL pools in Azure Synapse Analytics and partitioning data lakes.
  • 6m 53s
    Here, you'll learn more about partitioning data. The focus is on the physical partitioning of data rather than virtual partitioning within the same data table. You'll learn the purpose of partitioning data. You'll also learn about functional partitioning and what it's used for. Partitioning improves availability, and it can improve security by separating sensitive data from non-sensitive data.
  • 3.  Data Partitioning Strategies
    6m 40s
    Take a look at the data partitioning strategies for several Azure products. Explore Azure SQL Database, Azure Cosmos DB, Azure Table storage, Azure Search, and Azure Service Bus. Learn how to implement an elastic pool to achieve high horizontal scalability. An elastic pool breaks data into shards that can be stored across multiple SQL Database instances.
  • 4.  Transactional and Analytical Workloads
    6m 35s
    Here, you'll look at data store and partitioning strategies. You'll see how different workloads are served by these strategies and how data is stored. You'll learn about transactional workloads, the category that most common application operations fall under. You'll also learn more about analytical workloads, which are used for finding useful patterns in data.
  • 5.  File Partition Strategies
    6m 5s
    Discover Apache Spark, which is popular for processing large datasets, and Apache Hive, an equally popular data warehouse. The two technologies are often combined, or data is processed in Spark and then written to Hive. Here, you'll look at some file partition strategies for a data pipeline that uses Apache Spark and Apache Hive.
  • 6.  Partitioning Strategies for Table Storage
    7m 12s
    Here, you'll look at partitioning strategies for Azure Table storage. You'll discuss table entities, which are analogous to table rows. You'll learn about table partitioning, partition sizing, and the pros and cons of specific partition sizing strategies. You'll also learn to perform a table partition stress test.
  • 7.  Designing for Efficiency and Performance
    8m 12s
    Here, you'll look at designing data stores for efficiency and performance. You'll learn about the index table pattern, which can optimize query performance by making data easier to find. You'll also look at the materialized view pattern, which represents data in a form that is easier for a query to consume than the way it's stored in the data schema.
  • 8.  Partitioning Azure Synapse Analytics Dedicated Pools
    4m 50s
    Here, you'll look at partitioning dedicated SQL pools in Azure Synapse Analytics. You'll learn the advantages such partitioning holds for queries and for loading data. You'll cover considerations when sizing partitions. Finally, you'll look at the details of partition splitting and switching.
  • 9.  Partitioning Data Lakes
    7m
    Discover partitioning of storage in data lakes. Explore Azure Table storage, which allows you to store entities and query them by a unique key. Learn about Azure Blob Storage, which allows you to store any kind of document as a blob, be it structured, semi-structured, or unstructured data. Explore Azure Queue Storage, which allows for message handling.
  • 10.  Data Sharding for Scaling
    7m 1s
    Here, you'll investigate data sharding for scaling. You'll discuss the limitations of not sharding, which means storing all of your data on a single physical server. Next, you'll discuss sharding and how it can be beneficial. Then you'll discuss the characteristics of three common sharding strategies. Finally, you'll look at factors to consider when implementing an effective sharding strategy.
  • 11.  Course Summary
    54s
    You've examined data partitioning strategies and designing for performance. You explored data partitioning and the strategies employed. You looked at transactional and analytical workloads and file partition strategies. You discovered partitioning strategies for table storage and designing for efficiency and performance. You explored partitioning in Azure Synapse Analytics. You learned about partitioning data lakes and data sharding for scaling.

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft gives you the opportunity to earn a digital badge upon successful completion of some of our courses. The badge can be shared on any social network or business platform.

Digital badges are yours to keep, forever.
