Data Engineering on Microsoft Azure: Logical Data Structures

Azure | Intermediate
  • 11 videos | 1h 31m 37s
  • Includes Assessment
  • Earns a Badge
Rating 4.7 of 34 users
Logical data structures, also called entity-relationship models, give a high-level view of data and the relationships within it. In this course, you'll learn about the stages of data lake maturity. You'll explore temporal database tables and how to manage them. You'll also learn how to define slowly changing dimensions and how to implement them. You'll then move on to explore logical file and folder structures for data ingestion. You'll discover how PolyBase can be used to connect to external tables. Finally, you'll explore best practices for accelerating queries. This course is one of a collection that prepares learners for the Data Engineering on Microsoft Azure (DP-203) exam.

WHAT YOU WILL LEARN

  • Discover the key concepts covered in this course
  • Define key concepts for maturing data lake storage structures
  • Describe how system-versioned temporal tables are used for point-in-time analysis
  • Create and manage system-versioned temporal tables in an Azure SQL database
  • Describe the different types of slowly changing dimensions
  • Build a slowly changing dimension type 1 deployment
  • Build a slowly changing dimension type 2 deployment
  • Define an effective logical file and folder structure for efficient data ingestion and manipulation
  • Use PolyBase to build an external table
  • Describe best practices for accelerating queries against data in Azure Data Lake Storage Gen2
  • Summarize the key concepts covered in this course
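
To make the point-in-time idea behind system-versioned temporal tables a little more concrete, here is a minimal Python sketch. It is only a toy model of the concept (a current table plus a history table with validity periods), not T-SQL and not how SQL Server or Azure SQL Database implements the feature. The names TemporalTable, Version, upsert, as_of, and MAX_TIME are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional

# Assumption: rows that are still current use an "end of time" sentinel.
MAX_TIME = datetime.max


@dataclass(frozen=True)
class Version:
    key: int
    value: str
    valid_from: datetime
    valid_to: datetime


class TemporalTable:
    """Toy model: one current table plus one history table."""

    def __init__(self) -> None:
        self.current: Dict[int, Version] = {}
        self.history: List[Version] = []

    def upsert(self, key: int, value: str, now: datetime) -> None:
        old = self.current.get(key)
        if old is not None:
            # Close the old version's period and move it to history.
            self.history.append(Version(old.key, old.value, old.valid_from, now))
        self.current[key] = Version(key, value, now, MAX_TIME)

    def as_of(self, key: int, when: datetime) -> Optional[str]:
        """Return the value that was valid at `when`, or None if none was."""
        candidates = list(self.history)
        if key in self.current:
            candidates.append(self.current[key])
        for version in candidates:
            if version.key == key and version.valid_from <= when < version.valid_to:
                return version.value
        return None


table = TemporalTable()
table.upsert(1, "Seattle", datetime(2023, 1, 1))
table.upsert(1, "Redmond", datetime(2023, 6, 1))
city_in_march = table.as_of(1, datetime(2023, 3, 15))  # served from history
city_in_july = table.as_of(1, datetime(2023, 7, 1))    # served from current
```

In this sketch the caller never writes to the history table directly; the table moves the old version there on every update, which mirrors the "database manages the versioning" idea in the objectives above.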

IN THIS COURSE

  • 1m 27s
    In this video, you’ll learn more about your instructor and this course. In this course, you’ll learn the stages of data lake maturity. Then, you’ll explore temporal database tables, as well as slowly changing dimensions and how to implement them. You’ll also learn about logical file and folder structures for data ingestion. Next, you’ll learn how PolyBase can be used to connect to external tables. Finally, you’ll learn best practices for accelerating queries.
  • 7m 51s
    In this video, you’ll learn more about data lake maturity. You’ll learn about data ponds, which are collections of data puddles in a project. You’ll see how data lakes move beyond data tailored to a specific project or purpose to self-service data that anyone who needs it can access. You’ll also learn that a data ocean is a data lake at the enterprise level and serves as a source of self-service data.
  • 3.  Temporal Database Tables
    7m 10s
    In this video, you’ll learn more about temporal database tables. You’ll learn that temporal tables store current data in a current table and historical data in a history table. They’re sometimes called system-versioned tables because the database manages the versioning of records. Here, you’ll learn about temporal, or system-versioned, tables as they’re implemented by SQL Server.
  • 4.  Managing Temporal Tables
    10m 59s
    In this video, you’ll watch a demo. You’ll learn to create temporal tables in Azure SQL Database. Temporal tables store records like a normal table, but when records are updated, the history of those records is stored in a separate history table. This way, temporal queries can be performed. This can be useful if you need to see the state of records at a specific time or if you want to query for trends over time.
  • 5.  Slowly Changing Dimensions
    8m 16s
    In this video, you’ll learn more about slowly changing dimensions. You’ll learn these are data dimensions whose values change rarely, if ever. You’ll see there are several methods for managing slowly changing dimensions, depending on your requirements. The first, Type 0, is the passive method. This means the value is written once and never changes. Another option is Type 1. This involves overwriting old values with the new ones in the database.
  • 6.  Building a Slowly Changing Dimension Type 1
    15m 17s
    In this video, you’ll watch a demo. You’ll see how to use a data flow in Azure Data Factory to implement slowly changing dimensions. Slowly changing dimensions are values in a database that are expected to change infrequently. In this demo, you’ll replace the values of slowly changing dimensions and then record when that data was updated. First, you’ll open Azure, and then open your SQL database instance in the Query editor.
  • 7.  Building a Slowly Changing Dimension Type 2
    15m 49s
    In this video, you’ll watch a demo. You’ll learn how to use a data flow in Azure Data Factory to implement slowly changing dimensions. Here, you’ll maintain the history of slowly changing dimensions so previous values can be queried. You’ll open a SQL database instance in Azure. Onscreen, you’ll see some tables set up. These are the tables you’ll use as source and destination tables for your data flow.
  • 8.  Folder Structures
    7m 8s
    In this video, you’ll learn about folder structures in a data lake. You’ll see a data lake folder structure works much like the folder structure in your Windows operating system, with nested folders and documents. There are several factors to consider when setting up your folder structure in a data lake. First, you’ll need to consider business needs. The data must be accessible to those who need it. Then, you’ll want to consider security needs.
  • 9.  Using External Tables
    11m 43s
    In this video, you’ll watch a demo. In this demo, you’ll learn how to use PolyBase external tables to import data into Synapse Analytics. You’ll see that external tables are not physical tables. They’re mappings to external resources such as blob storage. PolyBase with external tables is one of the fastest ways to import data into Synapse Analytics from an external source. You’ll learn the PolyBase process involves three tables.
  • 10.  Accelerating Queries
    5m 1s
    In this video, you’ll learn about accelerating queries. You’ll see that in Azure, this feature can be used by clients when they’re querying a data lake. It works with two concepts, predicates and column projections. Predicates are expressions that evaluate to either true or false. Records are returned only if the predicates evaluate to true. Column projections filter out unwanted columns from the resulting records, returning only the columns of interest.
  • 11.  Course Summary
    58s
    In this video, you’ll summarize what you’ve learned in the course. In this course, you’ve learned how to define mature, logical data structures for big data implementations. You explored key concepts for maturing data lake storage structures, temporal database tables and how to manage them, slowly changing dimensions and how to build Type 1 and Type 2 slowly changing dimensions, and effective logical file and folder structures. You also learned to use PolyBase to build external tables.
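
The difference between the Type 1 and Type 2 approaches described in the videos above can be sketched in a few lines of Python. This is an illustrative in-memory model only, not the Azure Data Factory data flow built in the demos. The column names (customer_id, city, updated_on, valid_from, valid_to, is_current) and the helper functions apply_type1 and apply_type2 are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class DimRow:
    customer_id: int                  # business key (hypothetical column)
    city: str                         # the slowly changing attribute
    valid_from: date
    valid_to: Optional[date] = None   # None means the row is still current
    is_current: bool = True


def apply_type1(dim: Dict[int, dict], customer_id: int, city: str, today: date) -> None:
    """Type 1: overwrite the old value and record when it was updated."""
    dim[customer_id] = {"city": city, "updated_on": today}


def apply_type2(dim: List[DimRow], customer_id: int, city: str, today: date) -> None:
    """Type 2: expire the current row and append a new one, keeping history."""
    for row in dim:
        if row.customer_id == customer_id and row.is_current:
            if row.city == city:
                return  # value unchanged, nothing to do
            row.valid_to = today
            row.is_current = False
    dim.append(DimRow(customer_id, city, valid_from=today))


type1_dim: Dict[int, dict] = {}
apply_type1(type1_dim, 42, "Seattle", date(2023, 1, 1))
apply_type1(type1_dim, 42, "Redmond", date(2023, 6, 1))

type2_dim: List[DimRow] = []
apply_type2(type2_dim, 42, "Seattle", date(2023, 1, 1))
apply_type2(type2_dim, 42, "Redmond", date(2023, 6, 1))
```

After both updates, the Type 1 structure holds a single row with only the latest city, while the Type 2 structure holds two rows for the same customer: an expired "Seattle" row and a current "Redmond" row, so the earlier value can still be queried.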

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft is providing you the opportunity to earn a digital badge upon successful completion of some of our courses, which can be shared on any social network or business platform.

Digital badges are yours to keep, forever.
