Data Mining and Predictive Analytics, Second Edition

  • 11h 17m
  • Chantal D. Larose, Daniel T. Larose
  • John Wiley & Sons (US)
  • 2015

Learn methods of data analysis and their application to real-world data sets

This updated second edition introduces data mining methods and models, including association rules, clustering, neural networks, logistic regression, and multivariate analysis. The authors apply a unified “white box” approach that walks readers through the operations and nuances of each method using small data sets, so readers can gain insight into the inner workings of the method under review. Chapters also provide hands-on analysis problems, giving readers an opportunity to apply their newly acquired data mining expertise to real problems using large, real-world data sets.

Data Mining and Predictive Analytics, Second Edition:

  • Offers comprehensive coverage of association rules, clustering, neural networks, logistic regression, multivariate analysis, and the R statistical programming language
  • Features over 750 chapter exercises, allowing readers to assess their understanding of the new material
  • Provides a detailed case study that brings together the lessons learned in the book

Data Mining and Predictive Analytics, Second Edition will appeal to computer science and statistics students, as well as to students in MBA programs and chief executives.

About the Authors

Daniel T. Larose is Professor of Mathematical Sciences and Director of the Data Mining programs at Central Connecticut State University. He has published several books, including Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage (Wiley, 2007) and Discovering Knowledge in Data: An Introduction to Data Mining (Wiley, 2005). In addition to his scholarly work, Dr. Larose is a consultant in data mining and statistical analysis, working with many high-profile clients, including Microsoft, Forbes Magazine, the CIT Group, KPMG International, Computer Associates, and Deloitte, Inc.

Chantal D. Larose is a Ph.D. candidate in Statistics at the University of Connecticut. Her research focuses on the imputation of missing data and model-based clustering. She has taught undergraduate statistics since 2011, and is a statistical consultant for DataMiningConsultant.com, LLC.

In this Book

  • An Introduction to Data Mining and Predictive Analytics
  • Data Preprocessing
  • Exploratory Data Analysis
  • Dimension-Reduction Methods
  • Univariate Statistical Analysis
  • Multivariate Statistics
  • Preparing to Model the Data
  • Simple Linear Regression
  • Multiple Regression and Model Building
  • k-Nearest Neighbor Algorithm
  • Decision Trees
  • Neural Networks
  • Logistic Regression
  • Naïve Bayes and Bayesian Networks
  • Model Evaluation Techniques
  • Cost-Benefit Analysis Using Data-Driven Costs
  • Cost-Benefit Analysis for Trinary and K-Nary Classification Models
  • Graphical Evaluation of Classification Models
  • Hierarchical and K-Means Clustering
  • Kohonen Networks
  • BIRCH Clustering
  • Measuring Cluster Goodness
  • Association Rules
  • Segmentation Models
  • Ensemble Methods—Bagging and Boosting
  • Model Voting and Propensity Averaging
  • Genetic Algorithms
  • Imputation of Missing Data
  • Case Study, Part 1—Business Understanding, Data Preparation, and EDA
  • Case Study, Part 2—Clustering and Principal Components Analysis
  • Case Study, Part 3—Modeling and Evaluation for Performance and Interpretability
  • Case Study, Part 4—Modeling and Evaluation for High Performance Only