Hadoop for Dummies
- 6h 39m
- Dirk deRoos, et al.
- John Wiley & Sons (US)
- 2014
Let Hadoop For Dummies help harness the power of your data and rein in the information overload
Big data has become big business, and companies and organizations of all sizes are struggling to find ways to retrieve valuable information from their massive data sets without becoming overwhelmed. Enter Hadoop and this easy-to-understand For Dummies guide. Hadoop For Dummies helps readers understand the value of big data, make a business case for using Hadoop, navigate the Hadoop ecosystem, and build and manage Hadoop applications and clusters.
- Explains the origins of Hadoop, its economic benefits, and its functionality and practical applications
- Helps you find your way around the Hadoop ecosystem, program MapReduce, utilize design patterns, and get your Hadoop cluster up and running quickly and easily
- Details how to use Hadoop applications for data mining, web analytics and personalization, large-scale text processing, data science, and problem-solving
- Shows you how to improve the value of your Hadoop cluster, maximize your investment in Hadoop, and avoid common pitfalls when building your Hadoop cluster
From programmers challenged with building and maintaining affordable, scalable data systems to administrators who must deal with huge volumes of information effectively and efficiently, this how-to has something to help you with Hadoop.
About the Authors
Dirk deRoos, B.C.S., B.A., is IBM's World-Wide Technical Sales Leader for IBM's Hadoop offering: BigInsights. Dirk provides technical guidance to IBM's technical sales community and helps customers build solutions featuring BigInsights and Apache Hadoop.
Paul C. Zikopoulos, B.A., M.B.A., is the VP of Big Data and Technical Sales at IBM. He's an award-winning speaker and writer who has penned 18 books and 350+ articles. Independent groups often recognize Paul as a thought leader, with nominations to SAP's "Top 50 Big Data Twitter Influencers", Big Data Republic's "Most Influential", Onalytica's "Top 100", and AnalyticsWeek's "Thought Leader in Big Data and Analytics" lists. Technopedia listed him as a "Big Data Expert to Follow", and he was consulted on the topic of Big Data by the popular TV show 60 Minutes.
Bruce Brown, B.Sc. (C.S.), is a Big Data Technical Specialist at IBM, where he provides training to IBM's world-wide technical community and helps clients achieve success with BigInsights and Apache Hadoop.
Rafael Coss, M.C.S., is a Solution Architect and manages IBM's World Wide Big Data Enablement team based in IBM Silicon Valley Lab, where he's responsible for the technical development of partnerships and customer advocates for IBM's Big Data portfolio.
Roman B. Melnyk, Ph.D., is a senior member of the IBM DB2 Information Development team. Roman edited DB2 10.5 with BLU Acceleration: New Dynamic In-Memory Analytics for the Era of Big Data; Harness the Power of Big Data: The IBM Big Data Platform; Warp Speed, Time Travel, Big Data, and More!: DB2 10 for Linux, UNIX, and Windows New Features; and Apache Derby - Off to the Races. Roman co-authored DB2 Version 8: The Official Guide; DB2: The Complete Reference; DB2 Fundamentals Certification for Dummies; and DB2 for Dummies.
In this Book
- Introduction
- Introducing Hadoop and Seeing What It's Good for
- Common Use Cases for Big Data in Hadoop
- Setting up Your Hadoop Environment
- Storing Data in Hadoop: The Hadoop Distributed File System
- Reading and Writing Data
- MapReduce Programming
- Frameworks for Processing Data in Hadoop: YARN and MapReduce
- Pig: Hadoop Programming Made Easier
- Statistical Analysis in Hadoop
- Developing and Scheduling Application Workflows with Oozie
- Hadoop and the Data Warehouse: Friends or Foes?
- Extremely Big Tables: Storing Data in HBase
- Applying Structure to Hadoop Data with Hive
- Integrating Hadoop with Relational Databases Using Sqoop
- The Holy Grail: Native SQL Access to Hadoop Data
- Deploying Hadoop
- Administering Your Hadoop Cluster
- Ten Hadoop Resources Worthy of a Bookmark
- Ten Reasons to Adopt Hadoop