Using Apache Spark for AI Development
Apache Spark | Intermediate
- 13 videos | 36m 52s
- Includes Assessment
- Earns a Badge
Spark is a leading open-source cluster-computing framework used for distributed data processing and machine learning. Although not primarily designed for AI, Spark allows you to take advantage of data parallelism and the large distributed systems used in AI development. AI practitioners should recognize when Spark is the right choice for a particular application. In this course, you'll explore advanced techniques for working with Apache Spark and identify the key advantages of using Spark over other platforms. You'll define resilient distributed datasets (RDDs) and explore several workflows related to them. You'll then learn how to work with a Spark DataFrame, identifying its features and use cases. Finally, you'll learn how to create a machine learning pipeline using Spark ML Pipelines.
WHAT YOU WILL LEARN
- Discover the key concepts covered in this course
- Identify cases in which it is advantageous to use Spark over other platforms
- Define a resilient distributed dataset and identify typical sources of data
- Specify the unique features of a resilient distributed dataset
- Describe how to create a resilient distributed dataset
- List possible operations with resilient distributed datasets and define their roles
- List potential sources of data for a Spark DataFrame and outline how to import these into Spark
- Name the features of a Spark DataFrame and some useful operations with which to use it
- Outline how to create a Spark DataFrame
- Specify how Spark ML Pipelines can be used for creating and tuning ML models
- Describe fundamental concepts of Spark ML Pipelines
- Create an ML pipeline using Spark ML Pipelines
- Summarize the key concepts covered in this course
IN THIS COURSE
- 2m 46s
- 5m: In this video, you will identify cases in which it is advantageous to use Spark over other platforms. FREE ACCESS
- 3m 22s: Learn how to define a resilient distributed dataset and identify typical sources of data. FREE ACCESS
- 2m 2s: Upon completion of this video, you will be able to specify the unique features of a resilient distributed dataset. FREE ACCESS
- 2m 43s: After completing this video, you will be able to describe how to create a resilient distributed dataset. FREE ACCESS
- 2m 53s: After completing this video, you will be able to list possible operations with resilient distributed datasets and define their roles. FREE ACCESS
- 1m 58s: After completing this video, you will be able to list potential sources of data for a Spark DataFrame and outline how to import these into Spark. FREE ACCESS
- 1m 42s: Upon completion of this video, you will be able to name the features of a Spark DataFrame and some useful operations to use with it. FREE ACCESS
- 2m 46s: In this video, you will learn how to create a Spark DataFrame. FREE ACCESS
- 3m 55s: Upon completion of this video, you will be able to specify how Spark ML Pipelines can be used for creating and tuning machine learning models. FREE ACCESS
- 2m: Upon completion of this video, you will be able to describe fundamental concepts of Spark ML Pipelines. FREE ACCESS
- 4m 55s: In this video, you will create an ML pipeline using Spark ML Pipelines. FREE ACCESS
- 51s: In this video, we will summarize the key concepts covered in this course. FREE ACCESS
EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE
Skillsoft offers you the opportunity to earn a digital badge upon successful completion of some of our courses. You can share the badge on any social network or business platform.
Digital badges are yours to keep, forever.