Data Engineering on Microsoft Azure: Logical Data Structures
Azure | Intermediate
- 11 videos | 1h 31m 37s
- Includes Assessment
- Earns a Badge
Logical data structures, also called entity-relationship models, define a high-level view of data and the relationships within it. In this course, you'll learn about the stages of data lake maturity. You'll explore temporal database tables and how to manage them. You'll also learn how to define slowly changing dimensions and how to implement them. You'll then move on to explore logical file and folder structures for data ingestion. You'll discover how PolyBase can be used to connect to external tables. Finally, you'll explore best practices for accelerating queries. This course is one in a collection that prepares learners for the Data Engineering on Microsoft Azure (DP-203) exam.
WHAT YOU WILL LEARN
- Discover the key concepts covered in this course
- Define key concepts for maturing data lake storage structures
- Describe how system-versioned temporal tables are used for point-in-time analysis
- Create and manage system-versioned temporal tables in an Azure SQL database
- Describe the different types of slowly changing dimensions
- Build a slowly changing dimension type 1 deployment
- Build a slowly changing dimension type 2 deployment
- Define an effective logical file and folder structure for efficient data ingestion and manipulation
- Use PolyBase to build an external table
- Describe best practices for accelerating queries against data in Azure Data Lake Storage Gen2
- Summarize the key concepts covered in this course
IN THIS COURSE
-
1m 27s | In this video, you’ll learn more about your instructor and this course. In this course, you’ll learn the stages of data lake maturity. Then, you’ll explore temporal database tables, along with slowly changing dimensions and how to implement them. You’ll also learn about logical file and folder structures for data ingestion. Next, you’ll learn how PolyBase can be used to connect to external tables. Finally, you’ll learn best practices for accelerating queries.
-
7m 51s | In this video, you’ll learn more about data lake maturity. You’ll learn about data ponds, which are collections of data puddles in a project. You’ll see how data lakes move beyond data customized to a specific project or purpose, toward self-service data that can be sourced by anyone who needs it. You’ll also learn that a data ocean is a data lake at the enterprise level and serves as a source for self-service data.
-
7m 10s | In this video, you’ll learn more about temporal database tables. You’ll learn that temporal tables store current data in a current table and historical data in a history table. They’re sometimes called system-versioned tables because the database manages the versioning of records. Here, you’ll learn about temporal, or system-versioned, tables as they're implemented by SQL Server.
-
10m 59s | In this video, you’ll watch a demo. You’ll learn to create temporal tables in Azure SQL Database. Temporal tables store records like a normal table, but when records are updated, the previous versions are stored in a separate history table. This way, temporal queries can be performed. This can be useful if you need to see the state of records at a specific time or if you want to query for trends over time.
-
8m 16s | In this video, you’ll learn more about slowly changing dimensions. You’ll learn these are data dimensions that change rarely, if ever. You’ll see there are several methods for managing slowly changing dimensions, depending on your requirements. The first, Type 0, is the passive method. This means the value is written once and never changes. Another option is Type 1. This involves overwriting old values with the new ones in the database.
-
15m 17s | In this video, you’ll watch a demo. You’ll see how to use a data flow in Azure Data Factory to implement slowly changing dimensions. Slowly changing dimensions are values in a database that are expected to change infrequently. In this demo, you'll replace the values of slowly changing dimensions and then record when that data was updated. First, you’ll open Azure, and then open your SQL database instance in the Query editor.
-
15m 49s | In this video, you’ll watch a demo. You’ll learn how to use a data flow in Azure Data Factory to implement slowly changing dimensions. Here, you’ll maintain the history of slowly changing dimensions so previous values can be queried. You’ll open a SQL database instance in Azure. Onscreen, you’ll see some tables set up. These are the tables you’ll use as source and destination tables for your data flow.
-
7m 8s | In this video, you’ll learn about folder structures in a data lake. You’ll see a data lake folder structure works much like the folder structure in your Windows operating system, with nested folders and documents. There are several factors to consider when setting up your folder structure in a data lake. First, you’ll need to consider business needs. The data must be accessible to those who need it. Then, you’ll want to consider security needs.
-
11m 43s | In this video, you’ll watch a demo. In this demo, you’ll learn how to use a PolyBase external table to import data into Synapse Analytics. You’ll see that external tables are not physical tables. They're mappings to external resources such as blob storage. PolyBase with external tables is one of the fastest ways to import data into Synapse Analytics from an external source. You’ll learn the PolyBase process involves three tables.
-
5m 1s | In this video, you’ll learn about accelerating queries. You’ll see that in Azure, query acceleration can be used by clients when they're querying a data lake. It works with two concepts, predicates and column projections. Predicates are expressions that evaluate to either true or false. Records are returned only if the predicates evaluate to true. Column projections filter out unwanted columns in the resulting records, returning only the columns of interest.
-
58s | In this video, you’ll summarize what you’ve learned in the course. In this course, you’ve learned how to define mature, logical data structures for big data implementations. You explored key concepts for maturing data lake storage structures, temporal database tables and how to manage them, slowly changing dimensions and how to build type 1 and type 2 slowly changing dimensions, and effective logical file and folder structures. You also learned to use PolyBase to build external tables.
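The temporal-table behaviour described in the video list above (a current table, a history table, and point-in-time queries) can be illustrated with a small Python sketch. This is a toy model only, not how SQL Server or Azure SQL Database implements system versioning; the `TemporalTable` class, its method names, and the sample data are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime

# Assumption: datetime.max stands in for the open-ended "valid to" of a current row.
OPEN_ENDED = datetime.max


@dataclass
class RowVersion:
    key: int
    value: str
    valid_from: datetime
    valid_to: datetime


class TemporalTable:
    """Toy model: one current table plus one history table (hypothetical)."""

    def __init__(self):
        self.current = {}   # key -> RowVersion holding the live value
        self.history = []   # closed RowVersion objects, oldest first

    def upsert(self, key, value, now):
        old = self.current.get(key)
        if old is not None:
            # Close the old version and move it to the history table.
            old.valid_to = now
            self.history.append(old)
        self.current[key] = RowVersion(key, value, now, OPEN_ENDED)

    def as_of(self, key, when):
        """Return the value that was valid at `when`, or None if there was none."""
        candidates = list(self.history)
        if key in self.current:
            candidates.append(self.current[key])
        for row in candidates:
            if row.key == key and row.valid_from <= when < row.valid_to:
                return row.value
        return None


table = TemporalTable()
table.upsert(1, "Seattle", datetime(2021, 1, 1))
table.upsert(1, "Redmond", datetime(2021, 6, 1))

print(table.as_of(1, datetime(2021, 3, 1)))  # Seattle
print(table.as_of(1, datetime(2021, 7, 1)))  # Redmond
```

The point-in-time lookup checks both tables because the version valid at a given moment may be either the live row or one that has since been moved to history.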
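The difference between the Type 1 and Type 2 approaches to slowly changing dimensions mentioned above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the Azure Data Factory data flow used in the demos; the function names, the column names (`start_date`, `end_date`, `is_current`), and the customer data are all hypothetical.

```python
from datetime import date


def apply_type1(dimension, key, new_value):
    """Type 1: overwrite the old value in place; no history is kept."""
    dimension[key] = new_value


def apply_type2(rows, key, new_value, effective):
    """Type 2: expire the current row and append a new one, keeping history."""
    for row in rows:
        if row["key"] == key and row["is_current"]:
            row["is_current"] = False
            row["end_date"] = effective
    rows.append({
        "key": key,
        "value": new_value,
        "start_date": effective,
        "end_date": None,      # open-ended while the row is current
        "is_current": True,
    })


# Type 1: only the latest city survives.
customer_city = {101: "Austin"}
apply_type1(customer_city, 101, "Denver")
print(customer_city)  # {101: 'Denver'}

# Type 2: the old city remains queryable as an expired row.
customer_rows = [{
    "key": 101,
    "value": "Austin",
    "start_date": date(2020, 1, 1),
    "end_date": None,
    "is_current": True,
}]
apply_type2(customer_rows, 101, "Denver", date(2021, 5, 1))
for row in customer_rows:
    print(row["value"], row["is_current"])
```

Type 1 is the cheaper option when previous values are never needed; Type 2 costs extra rows but lets earlier values be queried.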
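The two concepts behind accelerating queries, predicates and column projections, can also be shown with plain Python. The sketch below only mimics the idea on an in-memory list; it does not call any Azure service, and the `filter_and_project` function and the sample records are hypothetical.

```python
def filter_and_project(records, predicate, columns):
    """Keep rows where the predicate is true, then keep only the named columns."""
    return [
        {column: record[column] for column in columns}
        for record in records
        if predicate(record)
    ]


sales = [
    {"id": 1, "region": "west", "amount": 120, "note": "priority"},
    {"id": 2, "region": "east", "amount": 80, "note": "standard"},
    {"id": 3, "region": "west", "amount": 45, "note": "standard"},
]

# Predicate: an expression that is either true or false for each record.
large_west_sales = filter_and_project(
    sales,
    predicate=lambda r: r["region"] == "west" and r["amount"] > 100,
    columns=["id", "amount"],
)
print(large_west_sales)  # [{'id': 1, 'amount': 120}]
```

Applying the predicate and the projection before data is handed back means the caller receives fewer rows and fewer columns than the full data set contains.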
EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE
Skillsoft is providing you the opportunity to earn a digital badge upon successful completion of some of our courses, which can be shared on any social network or business platform.
Digital badges are yours to keep, forever.