Microsoft Fabric: Spark Configuration & Delta Tables

Microsoft Fabric 2024    |    Expert
  • 12 videos | 1h 47m 48s
  • Earns a Badge
Rating 4.0 of 2 users (2)
Spark is essential both in Microsoft Fabric and for the DP-600 certification test. In this course, you'll learn how to create Python scripts for Spark batch jobs in Fabric, writing ETL transformations and modifying them for batch execution. You'll configure and monitor Spark batch jobs, analyze job logs, use the Spark History Server, and learn how to schedule jobs with retry policies. Next, you'll configure starter pools and explore high concurrency sessions. You'll customize Spark settings in Fabric, and create custom Spark pools and environments, linking them to notebooks. After that, you'll focus on Delta tables, working with version history and table contents. You'll use the DESCRIBE HISTORY command to analyze table versions and explore time travel by viewing and restoring data to specific versions or timestamps with SparkSQL and PySpark. Finally, you'll explore the differences between managed and external Delta tables. This course is part of a series that prepares learners for Exam DP-600: Implementing Analytics Solutions Using Microsoft Fabric.
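The first hands-on videos build a Python script for a Spark batch ETL job that writes out a Delta table. A minimal sketch of such a script is below; the source path, table name, and the "amount" column are illustrative assumptions, not taken from the course, and the Spark imports are deferred into the functions so the sketch can be read and imported without a Spark runtime.

```python
# etl_sales.py -- hypothetical batch ETL script for a Fabric Spark job definition.
# The source path, target table, and "amount" column are illustrative assumptions.

def run_etl(spark, source_path: str, target_table: str) -> None:
    """Read raw CSV, clean one column, and write the result as a Delta table."""
    from pyspark.sql import functions as F  # deferred import: needs a Spark runtime

    df = (
        spark.read.option("header", "true").csv(source_path)
        .withColumn("amount", F.col("amount").cast("double"))  # enforce numeric type
        .filter(F.col("amount").isNotNull())                   # drop rows that failed the cast
    )
    df.write.format("delta").mode("overwrite").saveAsTable(target_table)

def main() -> None:
    from pyspark.sql import SparkSession  # provided by the Fabric Spark runtime

    spark = SparkSession.builder.appName("sales_etl").getOrCreate()
    run_etl(spark, "Files/raw/sales.csv", "sales_clean")
    spark.stop()

# Entry point for batch execution; the deployed script would end with:
# if __name__ == "__main__":
#     main()
```

Packaged as the main definition file of a Fabric Spark job definition, a script shaped like this runs as a scheduled batch job, while the same `run_etl` function can be called from a notebook session for interactive debugging.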

WHAT YOU WILL LEARN

  • Discover the key concepts covered in this course
  • Compose a Python script for an ETL transformation that writes out a Delta table, adding in code for additional Spark setup
  • Create, configure, save, and monitor a Spark job and analyze the output
  • Analyze logs and error messages for a failed Spark job in Fabric, view the Spark History Server, and schedule Spark jobs
  • Customize Spark settings for the starter pool and demonstrate the use of high concurrency sessions to speed up PySpark notebooks
  • Create and configure a custom Spark pool, create a custom Spark environment relying on that pool, and use it in a notebook
  • Perform a series of insert, update, and delete operations on a Delta table, and observe how the version history and Delta table directory contents change in response
  • Run SparkSQL and PySpark commands to view data in a Delta table as either a specific version or at a specific UTC timestamp, then restore to a chosen version
  • Contrast managed Delta tables with external Delta tables
  • Create managed Delta tables from Spark and analyze the data, metadata, and properties
  • Create external Delta tables from Spark and contrast the properties and deletion semantics of managed and external Delta tables
  • Summarize the key concepts covered in this course
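The version-history and time-travel objectives above use standard Delta Lake SQL: DESCRIBE HISTORY, VERSION AS OF, TIMESTAMP AS OF, and RESTORE TABLE. As a rough sketch, the helpers below render those statements as strings; the table name "sales" is an illustrative assumption, and in a Fabric notebook each string would be executed with spark.sql(...).

```python
# Sketch: rendering Delta time-travel and restore statements as SparkSQL text.
# The table name "sales" used in the example call is an illustrative assumption.

def history_query(table: str) -> str:
    """List a Delta table's version history (one row per commit)."""
    return f"DESCRIBE HISTORY {table}"

def version_query(table: str, version: int) -> str:
    """Read the table as it was at a specific version number."""
    return f"SELECT * FROM {table} VERSION AS OF {version}"

def timestamp_query(table: str, ts_utc: str) -> str:
    """Read the table as it was at a specific UTC timestamp."""
    return f"SELECT * FROM {table} TIMESTAMP AS OF '{ts_utc}'"

def restore_statement(table: str, version: int) -> str:
    """Roll the table back to a chosen version."""
    return f"RESTORE TABLE {table} TO VERSION AS OF {version}"

# In a Fabric notebook each statement would run via spark.sql(...), e.g.
#   spark.sql(version_query("sales", 2)).show()
print(restore_statement("sales", 2))  # RESTORE TABLE sales TO VERSION AS OF 2
```

The PySpark DataFrame reader offers the equivalent `versionAsOf` / `timestampAsOf` options, e.g. `spark.read.option("versionAsOf", 2).table("sales")` in recent Delta Lake releases.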

IN THIS COURSE

  • 2m 18s
    In this video, you will discover the key concepts covered in this course.
  • 8m 27s
    Learn how to compose a Python script for an ETL transformation that writes out a Delta table, adding in code for additional Spark setup.
  • Locked
    3.  Creating and Monitoring Spark Batch Jobs
    11m 27s
    In this video, learn how to create, configure, save, and monitor a Spark job and analyze the output.
  • Locked
    4.  The Spark History Server & Scheduling Spark Jobs
    10m 33s
    Find out how to analyze logs and error messages for a failed Spark job in Fabric, view the Spark History Server, and schedule Spark jobs.
  • Locked
    5.  Configuring Starter Pools & High Concurrency
    9m 52s
    Discover how to customize Spark settings for the starter pool and demonstrate the use of high concurrency sessions to speed up PySpark notebooks.
  • Locked
    6.  Creating & Using Custom Spark Pools & Environments
    11m 37s
    In this video, find out how to create and configure a custom Spark pool, create a custom Spark environment relying on that pool, and use it in a notebook.
  • Locked
    7.  Viewing Versions & History of Delta Tables
    10m 39s
    Learn how to perform a series of insert, update, and delete operations on a Delta table, and observe how the version history and Delta table directory contents change in response.
  • Locked
    8.  Time Travelling with Delta Tables
    11m 38s
    Find out how to run SparkSQL and PySpark commands to view data in a Delta table as either a specific version or at a specific UTC timestamp, then restore to a chosen version.
  • Locked
    9.  Managed & External Delta Tables
    8m 4s
    In this video, learn how to contrast managed Delta tables with external Delta tables.
  • Locked
    10.  Creating Managed Delta Tables from Spark
    10m 18s
    In this video, discover how to create managed Delta tables from Spark and analyze the data, metadata, and properties.
  • Locked
    11.  Creating External Delta Tables from Spark
    10m 35s
    Discover how to create external Delta tables from Spark and contrast the properties and deletion semantics of managed and external Delta tables.
  • Locked
    12.  Course Summary
    2m 21s
    In this video, we will summarize the key concepts covered in this course.
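The closing videos contrast managed and external Delta tables, where the key behavioral difference is what DROP TABLE removes. A minimal sketch of the two DDL shapes is below; the table names and the OneLake ABFSS path are illustrative assumptions, not taken from the course.

```python
# Sketch: SparkSQL DDL contrasting managed vs. external Delta tables.
# Table names and the LOCATION path are illustrative assumptions.

# Managed: Spark owns both the metastore entry and the underlying files.
managed_ddl = "CREATE TABLE sales_managed (id INT, amount DOUBLE) USING DELTA"

# External: Spark owns only the metastore entry; the files live at LOCATION.
external_ddl = (
    "CREATE TABLE sales_external (id INT, amount DOUBLE) USING DELTA "
    "LOCATION 'abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<lakehouse>/Tables/sales_external'"
)

# DROP TABLE semantics differ:
#  - managed:  deletes both the metadata and the data files.
#  - external: removes only the metastore entry; the Delta files stay at
#    LOCATION and the table can be re-registered later.
print("LOCATION" in managed_ddl, "LOCATION" in external_ddl)  # False True
```

In a Fabric notebook each statement would run via `spark.sql(...)`; the deletion semantics can then be observed by dropping each table and inspecting the lakehouse file view.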

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft is providing you the opportunity to earn a digital badge upon successful completion of some of our courses, which can be shared on any social network or business platform.

Digital badges are yours to keep, forever.
