GCP Data Engineer Pro: Dataset Processing
Google Cloud 2024 | Intermediate
- 19 videos | 2h 14m 36s
- Includes Assessment
- Earns a Badge
Picturing the flow of information as an oil pipeline is a useful analogy for data management: just as oil must be carefully extracted, transported, and refined, data must be carefully processed to maximize its value. In this course, learn about big data processing, including Dataproc cluster options, creating a Dataproc cluster and running a Spark job, and Dataprep flows and the profiling, transforming, and sampling of data. Next, discover how to build and deploy robust Dataflow pipelines and explore the options for fine-tuning Dataflow pipeline performance. Finally, explore the setup and management of Data Fusion instances and pipelines, create a Data Fusion pipeline, and examine the pricing models for Dataproc, Dataprep, and Dataflow. This course is one of a collection that prepares learners for the Google Professional Data Engineer exam.
WHAT YOU WILL LEARN
- Discover the key concepts covered in this course
- Identify the key differences between Google Cloud services that represent part of managing the controlled inflow of data
- Outline the optimization options of Dataproc for different situations
- Build a cluster in Dataproc and run a Spark job
- Recognize the purpose and basic configuration options of a Dataprep flow
- Outline the workflow associated with using Dataprep
- Create flows in Dataprep, import data, and gather samples before running a job
- Identify the use case and components of the Dataflow service
- Build a Dataflow pipeline creation script and examine its contents
- Create a Dataflow pipeline deployment script and examine its contents
- Identify the use cases for selecting various options available when running a Dataflow pipeline
- Outline the use case and components of the Data Fusion service
- Recognize the options associated with Data Fusion instance and pipeline creation
- Identify the options available for configuring instances, pipelines, and connections in Data Fusion
- Create a pipeline in Data Fusion using the Studio tools
- Identify the choices that affect the cost of Dataproc, Dataflow, Dataprep, and Data Fusion
- Recognize common issues that can arise with Dataproc and how to troubleshoot them
- Outline common issues that can arise with Dataflow and how to troubleshoot them
- Summarize the key concepts covered in this course
IN THIS COURSE
- 1m 29s: In this video, we will discover the key concepts covered in this course.
- 5m 22s: Through this video, you will be able to identify the key differences between Google Cloud services that represent part of managing the controlled inflow of data.
- 11m 18s: After completing this video, you will be able to outline the optimization options of Dataproc for different situations.
- 8m 24s: Learn how to build a cluster in Dataproc and run a Spark job.
- 8m 43s: Upon completion of this video, you will be able to recognize the purpose and basic configuration options of a Dataprep flow.
- 11m 43s: In this video, we will outline the workflow associated with using Dataprep.
- 7m 15s: Discover how to create flows in Dataprep, import data, and gather samples before running a job.
- 9m 41s: After completing this video, you will be able to identify the use case and components of the Dataflow service.
- 6m 19s: Find out how to build a Dataflow pipeline creation script and examine its contents.
- 5m 19s: During this video, you will learn how to create a Dataflow pipeline deployment script and examine its contents.
- 8m 27s: Upon completion of this video, you will be able to identify the use cases for selecting various options available when running a Dataflow pipeline.
- 7m 53s: In this video, we will outline the use case and components of the Data Fusion service.
- 7m 22s: Through this video, you will recognize the options associated with Data Fusion instance and pipeline creation.
- 6m 45s: After completing this video, you will be able to identify the options available for configuring instances, pipelines, and connections in Data Fusion.
- 8m 15s: During this video, discover how to create a pipeline in Data Fusion using the Studio tools.
- 8m 36s: In this video, you will identify the choices that affect the cost of Dataproc, Dataflow, Dataprep, and Data Fusion.
- 5m 13s: Through this video, you will recognize common issues that can arise with Dataproc and how to troubleshoot them.
- 5m 22s: Upon completion of this video, you will be able to outline common issues that can arise with Dataflow and how to troubleshoot them.
- 1m 10s: In this video, we will summarize the key concepts covered in this course.
EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE
Skillsoft offers you the opportunity to earn a digital badge upon successful completion of some of our courses. Badges can be shared on any social network or business platform.
Digital badges are yours to keep, forever.