Spark in Action, Second Edition

8h 10m
Jean-Georges Perrin
Manning Publications
2020

The Spark distributed data processing platform provides an easy-to-implement tool for ingesting, streaming, and processing data from any source. In Spark in Action, Second Edition, you’ll learn to take advantage of Spark’s core features and incredible processing speed, with applications including real-time computation, lazy evaluation, and machine learning. Spark skills are a hot commodity in enterprises worldwide, and with Spark’s powerful and flexible Java APIs, you can reap all the benefits without first learning Scala or Hadoop.

About the technology

Analyzing enterprise data starts by reading, filtering, and merging files and streams from many sources. The Spark data processing engine handles this variety and volume like a champ, delivering speeds up to 100 times faster than Hadoop MapReduce on some in-memory workloads. Thanks to SQL support, an intuitive interface, and a straightforward multilanguage API, you can use Spark without learning a complex new ecosystem.

About the book

Spark in Action, Second Edition, teaches you to create end-to-end analytics applications. In this entirely new book, you’ll learn from interesting Java-based examples, including a complete data pipeline for processing NASA satellite data. And you’ll discover Java, Python, and Scala code samples hosted on GitHub that you can explore and adapt, plus appendixes that give you a cheat sheet for installing tools and understanding Spark-specific terms.

What's inside

Writing Spark applications in Java
Spark application architecture
Ingestion through files, databases, streaming, and Elasticsearch
Querying distributed datasets with Spark SQL

About the reader

This book does not assume previous experience with Spark, Scala, or Hadoop.

About the author

Jean-Georges Perrin is an experienced data and software architect. He is France’s first IBM Champion and has been honored for 12 consecutive years.

In this book

Foreword
About This Book
About the Cover Illustration
So, What is Spark, Anyway?
Architecture and Flow
The Majestic Role of the Dataframe
Fundamentally Lazy
Building a Simple App for Deployment
Deploying Your Simple App
Ingestion from Files
Ingestion from Databases
Advanced Ingestion—Finding Data Sources and Building Your Own
Ingestion Through Structured Streaming
Working with SQL
Transforming Your Data
Transforming Entire Documents
Extending Transformations with User-Defined Functions
Aggregating Your Data
Cache and Checkpoint—Enhancing Spark’s Performances
Exporting Data and Building Full Data Pipelines
Exploring Deployment Constraints—Understanding the Ecosystem


