Microsoft Fabric: Spark & the Capacity Metrics App for Lakehouses

Microsoft Fabric 2024    |    Expert
  • 15 videos | 1h 57m 51s
  • Includes Assessment
  • Earns a Badge
Spark is a key technology on Fabric Lakehouses, and a fundamental part of the DP-600 exam curriculum. In this course, you'll learn how Apache Spark integrates with Microsoft Fabric to handle large-scale data processing through distributed computing. First, learn about Spark pools and study the role of the T-SQL endpoint. Create Fabric shortcuts, set up storage accounts, enable hierarchical namespaces, and use Shared Access Signatures (SAS) to link these sources and build Delta tables from the connected data. Next, create notebooks with Apache Spark in Microsoft Fabric, run PySpark and SparkSQL commands, monitor resource usage, and learn how to associate lakehouses with notebooks. Finally, explore the Microsoft Fabric Capacity Metrics App, tracking capacity units (CUs), managing SKUs, and handling overages and throttling. Complete the course by installing the app, entering your Fabric capacity ID, and using charts to analyze utilization metrics. This course is part of a series that prepares learners for Exam DP-600: Implementing Analytics Solutions Using Microsoft Fabric.

WHAT YOU WILL LEARN

  • Discover the key concepts covered in this course
  • Outline key attractions, features, and terms related to Apache Spark and Spark on Fabric
  • Outline the role of the T-SQL endpoint and semantic models in working with data lakehouses
  • Define Fabric shortcuts, enumerate their types, and outline the role they play in lakehouses
  • Create an ADLS Gen2 storage account from Azure and connect to it from Fabric via a shortcut
  • Create Delta tables based on data connected via a shortcut and study how updates propagate through to the Delta table
  • Create OneLake shortcuts with an Amazon S3 bucket as the underlying data source
  • Analyze Spark pools, starter pools, and Spark environments, and run Spark on Fabric in high concurrency mode
  • Create a notebook, associate a lakehouse with it, and then run various PySpark and SparkSQL commands
  • Perform grouping and aggregation operations in PySpark and SparkSQL
  • Write DataFrames out to managed Delta tables, as well as to Parquet and JSON files
  • Analyze Fabric capacities, SKUs, and capacity units (CUs) and outline responses to overages, from overage protection to background rejection
  • Install the Microsoft Fabric Capacity Metrics App from Microsoft AppSource and enter your Fabric capacity ID to complete the process
  • Analyze utilization and other usage metrics, as well as throttling and overages, in the Fabric Capacity Metrics App
  • Summarize the key concepts covered in this course
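
As a rough illustration of the capacity objectives above (CUs, SKUs, and the progression from overage protection to background rejection), here is a minimal Python sketch. It is not taken from the course. The function names are hypothetical, and the thresholds (10 minutes, 60 minutes, 24 hours of consumed future capacity) and the 30-second evaluation window are assumptions based on Microsoft's published throttling guidance, so check them against the current documentation before relying on them.

```python
# Illustrative sketch only: helper names are hypothetical, and the thresholds
# below are assumptions drawn from Microsoft's throttling guidance.

SECONDS_PER_TIMEPOINT = 30  # assumed evaluation window used by the metrics app


def cu_seconds_per_timepoint(sku_cus: int) -> int:
    """CU-seconds available in one timepoint for a SKU (e.g. F64 has 64 CUs)."""
    return sku_cus * SECONDS_PER_TIMEPOINT


def throttling_stage(future_minutes_consumed: float) -> str:
    """Map minutes of future capacity already consumed to a throttling stage."""
    if future_minutes_consumed <= 10:
        return "overage protection"      # no throttling applied yet
    if future_minutes_consumed <= 60:
        return "interactive delay"       # interactive jobs are delayed
    if future_minutes_consumed <= 24 * 60:
        return "interactive rejection"   # interactive jobs are rejected
    return "background rejection"        # background jobs are rejected too


if __name__ == "__main__":
    print(cu_seconds_per_timepoint(64))
    for minutes in (5, 45, 120, 1500):
        print(minutes, throttling_stage(minutes))
```

The stage names match the ones listed in the objectives; the numeric cut-offs are the part most likely to change over time.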

IN THIS COURSE

  • 2m 14s
    In this video, you will discover the key concepts covered in this course. FREE ACCESS
  • 5m 55s
    After completing this video, you will be able to outline key attractions, features and terms related to Apache Spark and Spark on Fabric. FREE ACCESS
  • 3.  The T-SQL Endpoint & Semantic Models
    7m 21s
    In this video, find out how to outline the role of the T-SQL endpoint and semantic models in working with data lakehouses. FREE ACCESS
  • 4.  Data Lakehouse Shortcuts
    5m 6s
    During this video, discover how to define Fabric shortcuts, enumerate their types and outline the role they play in lakehouses. FREE ACCESS
  • 5.  Creating a Shortcut in Fabric
    11m 17s
    In this video, you will learn how to create an ADLS Gen2 storage account from Azure and connect to it from Fabric via a shortcut. FREE ACCESS
  • 6.  Creating Dynamic Shortcuts in Fabric
    8m 16s
    Learn how to create delta tables based on data connected via a shortcut and study how updates propagate through to the delta table. FREE ACCESS
  • 7.  Creating OneLake Shortcuts for Amazon S3 Buckets
    12m 53s
    Upon completion of this video, you will be able to create OneLake shortcuts with an Amazon S3 bucket as the underlying data source. FREE ACCESS
  • 8.  Apache Spark in Microsoft Fabric
    6m 32s
    Find out how to analyze Spark pools, starter pools, Spark environments and run Spark on Fabric in high concurrency mode. FREE ACCESS
  • 9.  Working with Lakehouses, Notebooks, & Spark Commands
    10m 42s
    In this video, learn how to create a notebook, associate a lakehouse with it, and then run various PySpark and SparkSQL commands. FREE ACCESS
  • 10.  Grouping & Aggregating in SparkSQL & PySpark
    8m 7s
    During this video, discover how to perform grouping and aggregation operations in PySpark and SparkSQL. FREE ACCESS
  • 11.  Parquet, JSON Files, & Delta Tables Dataframes
    9m 40s
    After completing this video, you will be able to write dataframes out to managed Delta tables, as well as to parquet and JSON files. FREE ACCESS
  • 12.  The Fabric Capacity Metrics App
    7m 27s
    Learn how to analyze Fabric capacities, SKUs, and capacity units (CUs) and outline responses to overages, from overage protection to background rejection. FREE ACCESS
  • 13.  Installing the Microsoft Fabric Capacity Metrics App
    8m 9s
    In this video, you will learn how to install the Microsoft Fabric Capacity Metrics App from Microsoft AppSource and enter your Fabric capacity ID to complete the process. FREE ACCESS
  • 14.  Using the Microsoft Fabric Capacity Metrics App
    11m 44s
    Find out how to analyze utilization and other usage metrics, as well as throttling and overages in the Fabric Capacity Metrics App. FREE ACCESS
  • 15.  Course Summary
    2m 27s
    In this video, we will summarize the key concepts covered in this course. FREE ACCESS

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft offers you the opportunity to earn a digital badge upon successful completion of some of our courses, which can be shared on any social network or business platform.

Digital badges are yours to keep, forever.