Streaming Data Architectures: Processing Streaming Data with Spark
Data Science | Intermediate
- 11 videos | 52m 10s
- Includes Assessment
- Earns a Badge
Process streaming data with Spark, the analytics engine that can run on Hadoop. In this course, you will discover how to develop Spark applications that work with streaming data and generate output. Topics include the following: configuring a streaming data source; using Netcat and writing applications to process the data stream; the effects of Update mode on your stream processing application's output; writing a monitoring application that listens for new files added to a directory; comparing Append mode output with Update mode output; developing applications that limit the files processed in each trigger; using Spark's Complete mode for output; performing aggregation operations on streaming data with the DataFrame API; and processing streaming data with Spark SQL queries.
WHAT YOU WILL LEARN
- Install the latest available version of PySpark
- Configure a streaming data source using Netcat and write an application to process the stream
- Describe the effects of using Update mode for the output of your stream processing application
- Write an application to listen for new files being added to a directory and process them as soon as they come in
- Compare Append mode output to Update mode output and distinguish between the two
- Develop applications that limit the files processed in each trigger and use Spark's Complete mode for the output
- Perform aggregation operations on streaming data using the DataFrame API
- Work with Spark SQL to process streaming data using SQL queries
- Define and apply standard, reusable transformations for streaming data
- Recall the key ways to use Spark for streaming data and explore the ways to process streams and generate output
IN THIS COURSE
- 2m 10s
- 2m 39s | Learn how to install the latest version of PySpark. FREE ACCESS
- 8m 53s | In this video, you will configure a streaming data source using Netcat and write an application to process the stream. FREE ACCESS
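As a rough illustration of what such an application can look like, the sketch below reads lines from a Netcat listener through Spark's built-in socket source and splits them into words. The helper names (`socket_source_options`, `read_lines`, `split_into_words`, `start_console_query`), the host, and the port are assumptions made for this example and are not taken from the course.

```python
def socket_source_options(host="localhost", port=9999):
    """Options for Spark's socket source (meant for testing, not production)."""
    return {"host": host, "port": str(port)}


def read_lines(spark, host="localhost", port=9999):
    """Return a streaming DataFrame with a single string column named `value`."""
    return (
        spark.readStream.format("socket")
        .options(**socket_source_options(host, port))
        .load()
    )


def split_into_words(lines):
    """Turn each incoming line into one row per space-separated word."""
    # Imported here so the helpers above can be loaded without Spark installed.
    from pyspark.sql.functions import explode, split

    return lines.select(explode(split(lines["value"], " ")).alias("word"))


def start_console_query(df, mode="append"):
    """Print each micro-batch to the console and return the running query."""
    return df.writeStream.outputMode(mode).format("console").start()
```

To try it, start a listener with `nc -lk 9999` in one terminal, create a `SparkSession` in another, call `start_console_query(split_into_words(read_lines(spark)))`, and then call `awaitTermination()` on the returned query.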
- 3m 32s | Upon completion of this video, you will be able to describe the effects of using the Update mode for the output of your stream processing application. FREE ACCESS
- 7m 50s | In this video, you will learn how to write an application to listen for new files being added to a directory and process them as soon as they are added. FREE ACCESS
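A minimal sketch of a directory-watching reader follows, assuming the incoming files are CSV with a header row. Spark's streaming file sources expect the schema to be supplied up front, so the example builds a DDL schema string first. The function names and the example column names are hypothetical.

```python
def csv_schema_ddl(columns):
    """Build a DDL schema string such as 'name STRING, amount DOUBLE'."""
    return ", ".join(f"{name} {dtype}" for name, dtype in columns)


def watch_directory(spark, path, columns):
    """Return a streaming DataFrame that picks up CSV files as they land in `path`."""
    return (
        spark.readStream.schema(csv_schema_ddl(columns))
        .option("header", "true")
        .csv(path)
    )


def print_new_rows(df):
    """Write only the newly arrived rows of each trigger to the console."""
    return df.writeStream.outputMode("append").format("console").start()
```

For example, `watch_directory(spark, "/tmp/incoming", [("name", "STRING"), ("amount", "DOUBLE")])` would treat every file dropped into the (assumed) `/tmp/incoming` directory as new input.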
- 2m 15s | Learn how to compare the output of Append to the output of Update mode, and distinguish between the two. FREE ACCESS
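One way to build intuition for the difference is a plain-Python model of what each trigger emits. The sketch below does not call Spark; it only imitates the behavior for a running word count, with Complete mode included for contrast. The function names are invented for this illustration. For a query without aggregation, Update mode behaves like Append mode, which is what `simulate_append` models.

```python
from collections import Counter


def simulate_output_modes(batches):
    """Model the rows a running word count would emit per trigger.

    Plain-Python illustration only; it does not use Spark.
    """
    totals = Counter()
    update_rows, complete_rows = [], []
    for batch in batches:
        totals.update(batch)
        # Update: only the keys whose counts changed in this trigger.
        update_rows.append({word: totals[word] for word in set(batch)})
        # Complete: the entire result table, re-emitted on every trigger.
        complete_rows.append(dict(totals))
    return update_rows, complete_rows


def simulate_append(batches):
    """Append on a query without aggregation: each trigger emits only its new rows."""
    return [list(batch) for batch in batches]
```

With batches `[["a", "b", "a"], ["b", "c"]]`, the second trigger emits only `b` and `c` under the Update model, while the Complete model re-emits `a` as well.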
- 6m 35s | Find out how to develop applications that limit the files processed in each trigger and use Spark's Complete mode for the output. FREE ACCESS
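The following sketch shows one possible shape for this, assuming a CSV file source. It uses the file source's `maxFilesPerTrigger` option to cap how many new files each micro-batch reads, then writes a grouped count in Complete mode. The helper names and default values are assumptions for illustration.

```python
def file_source_options(max_files_per_trigger=1, header=True):
    """Options that cap the number of new files read in each trigger."""
    if max_files_per_trigger < 1:
        raise ValueError("max_files_per_trigger must be at least 1")
    return {
        "maxFilesPerTrigger": str(max_files_per_trigger),
        "header": str(header).lower(),
    }


def read_csv_stream(spark, path, schema_ddl, max_files_per_trigger=1):
    """Streaming CSV reader; `schema_ddl` is a string like 'word STRING'."""
    return (
        spark.readStream.schema(schema_ddl)
        .options(**file_source_options(max_files_per_trigger))
        .csv(path)
    )


def start_complete_counts(df, column):
    """Complete mode needs an aggregation, so count rows per value of `column`."""
    counts = df.groupBy(column).count()
    return counts.writeStream.outputMode("complete").format("console").start()
```

With the cap set to 1, dropping several files into the directory at once should produce one micro-batch per file, and each batch prints the full, updated count table.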
- 4m 6s | In this video, you will learn how to perform aggregation operations on streaming data using the DataFrame API. FREE ACCESS
- 4m 59s | Find out how to work with Spark SQL in order to process streaming data using SQL queries. FREE ACCESS
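As a hedged sketch of the general pattern, a streaming DataFrame can be registered as a temporary view and then queried with `spark.sql`, which returns another streaming DataFrame. The view name, the `word` column, and the helper names below are assumptions for this example.

```python
def word_totals_sql(view="words"):
    """SQL text that counts rows per word in the given temporary view."""
    return f"SELECT word, COUNT(*) AS total FROM {view} GROUP BY word"


def query_with_sql(spark, streaming_df, view="words"):
    """Register the stream as a temp view and aggregate it with SQL.

    Assumes `streaming_df` has a column named `word`.
    """
    streaming_df.createOrReplaceTempView(view)
    return spark.sql(word_totals_sql(view))
```

Because the query aggregates, the result would typically be written with `outputMode("complete")` or `outputMode("update")`.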
- 4m 41s | Learn how to define and apply standard, reusable transformations for streaming data. FREE ACCESS
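One common way to package reusable transformations, sketched here under assumptions, is to write each step as a function that takes a DataFrame and returns a DataFrame, then chain the steps. The `compose` helper and the two example steps are hypothetical and assume an input column named `value`.

```python
from functools import reduce


def compose(*steps):
    """Chain single-argument transformations, applied left to right."""
    return lambda df: reduce(lambda acc, step: step(acc), steps, df)


def drop_blank_lines(df):
    """Keep only rows whose `value` column contains non-whitespace text."""
    from pyspark.sql.functions import col, length, trim

    return df.where(length(trim(col("value"))) > 0)


def add_ingest_time(df):
    """Tag every row with the time at which it was processed."""
    from pyspark.sql.functions import current_timestamp

    return df.withColumn("ingested_at", current_timestamp())


# A reusable pipeline that works on batch and streaming DataFrames alike.
clean_lines = compose(drop_blank_lines, add_ingest_time)
```

On Spark 3.0 or later, the same steps can also be chained with `df.transform(drop_blank_lines).transform(add_ingest_time)`.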
- 4m 30s | After completing this video, you will be able to recall the key ways to use Spark for streaming data and explore the ways to process streams and generate output. FREE ACCESS
EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE
Skillsoft offers you the opportunity to earn a digital badge upon successful completion of some of our courses. Badges can be shared on any social network or business platform.
Digital badges are yours to keep, forever.