GCP Data Engineer Pro: Dataset Processing

Google Cloud 2024    |    Intermediate
  • 19 videos | 2h 14m 36s
  • Includes Assessment
  • Earns a Badge
Rating 4.0 of 3 users
In the intricate world of data management, an oil pipeline offers a vivid analogy for the flow of information. Just as oil must be carefully extracted, transported, and refined, data requires meticulous processing to maximize its value. In this course, learn about big data processing, including Dataproc cluster options, creating a Dataproc cluster and running a Spark job, and using Dataprep flows to profile, transform, and sample data. Next, discover how to build and deploy robust Dataflow pipelines and the options for fine-tuning Dataflow pipeline performance. Finally, explore the setup and management of Data Fusion instances and pipelines, create a Data Fusion pipeline, and examine the pricing models for Dataproc, Dataflow, Dataprep, and Data Fusion. This course is one of a collection that prepares learners for the Google Professional Data Engineer exam.

WHAT YOU WILL LEARN

  • Discover the key concepts covered in this course
  • Identify the key differences between Google Cloud services that represent part of managing the controlled inflow of data
  • Outline the optimization options of Dataproc for different situations
  • Build a cluster in Dataproc and run a Spark job
  • Recognize the purpose and basic configuration options of a Dataprep flow
  • Outline the workflow associated with using Dataprep
  • Create flows in Dataprep, import data, and gather samples before running a job
  • Identify the use case and components of the Dataflow service
  • Build a Dataflow pipeline creation script and examine its contents
  • Create a Dataflow pipeline deployment script and examine its contents
  • Identify the use cases for selecting various options available when running a Dataflow pipeline
  • Outline the use case and components of the Data Fusion service
  • Recognize the options associated with Data Fusion instance and pipeline creation
  • Identify the options available for configuring instances, pipelines, and connections in Data Fusion
  • Create a pipeline in Data Fusion using the Studio tools
  • Identify the choices that affect the cost of Dataproc, Dataflow, Dataprep, and Data Fusion
  • Recognize common issues that can arise with Dataproc and how to troubleshoot them
  • Outline common issues that can arise with Dataflow and how to troubleshoot them
  • Summarize the key concepts covered in this course

IN THIS COURSE

  • 1m 29s
    In this video, we will discover the key concepts covered in this course.
  • 5m 22s
    Through this video, you will be able to identify the key differences between Google Cloud services that represent part of managing the controlled inflow of data.
  • 3.  Dataproc Cluster Options
    11m 18s
    After completing this video, you will be able to outline the optimization options of Dataproc for different situations.
  • 4.  Creating a Dataproc Cluster and Running a Spark Job
    8m 24s
    Learn how to build a cluster in Dataproc and run a Spark job.
  • 5.  Dataprep Flows
    8m 43s
    Upon completion of this video, you will be able to recognize the purpose and basic configuration options of a Dataprep flow.
  • 6.  Profile, Transform, and Sample Data
    11m 43s
    In this video, we will outline the workflow associated with using Dataprep.
  • 7.  Using Dataprep
    7m 15s
    Discover how to create flows in Dataprep, import data, and gather samples before running a job.
  • 8.  Google Cloud Dataflow
    9m 41s
    After completing this video, you will be able to identify the use case and components of the Dataflow service.
  • 9.  Building a Dataflow Pipeline
    6m 19s
    Find out how to build a Dataflow pipeline creation script and examine its contents.
  • 10.  Deploying a Dataflow Pipeline
    5m 19s
    During this video, you will learn how to create a Dataflow pipeline deployment script and examine its contents.
  • 11.  Dataflow Pipeline Options
    8m 27s
    Upon completion of this video, you will be able to identify the use cases for selecting various options available when running a Dataflow pipeline.
  • 12.  Google Cloud Data Fusion
    7m 53s
    In this video, we will outline the use case and components of the Data Fusion service.
  • 13.  Data Fusion Setup
    7m 22s
    Through this video, you will recognize the options associated with Data Fusion instance and pipeline creation.
  • 14.  Data Fusion Management
    6m 45s
    After completing this video, you will be able to identify the options available for configuring instances, pipelines, and connections in Data Fusion.
  • 15.  Creating a Data Fusion Pipeline
    8m 15s
    During this video, discover how to create a pipeline in Data Fusion using the Studio tools.
  • 16.  Pricing for Dataproc, Dataflow, Dataprep, and Data Fusion
    8m 36s
    In this video, you will identify the choices that affect the cost of Dataproc, Dataflow, Dataprep, and Data Fusion.
  • 17.  Dataproc Troubleshooting
    5m 13s
    Through this video, you will recognize common issues that can arise with Dataproc and how to troubleshoot them.
  • 18.  Dataflow Troubleshooting
    5m 22s
    Upon completion of this video, you will be able to outline common issues that can arise with Dataflow and how to troubleshoot them.
  • 19.  Course Summary
    1m 10s
    In this video, we will summarize the key concepts covered in this course.

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft offers you the opportunity to earn a digital badge upon successful completion of some of our courses. Badges can be shared on any social network or business platform.

Digital badges are yours to keep, forever.
