Thinking Data Science: A Data Science Practitioner's Guide

3h 44m
Poornachandra Sarang
Springer
2023

This definitive guide to Machine Learning projects answers the questions an aspiring or experienced data scientist frequently has: Which technology should I use for ML development? Should I use GOFAI, ANN/DNN, or Transfer Learning? Can I rely on AutoML for model development? What if the client provides gigabytes or terabytes of data for developing analytic models? How do I handle high-frequency dynamic datasets? This book gives the practitioner a consolidation of the entire data science process in a single “Cheat Sheet”.

The challenge for a data scientist is to extract meaningful information from huge datasets that will help to create better strategies for businesses. Many Machine Learning algorithms and Neural Networks are designed to do analytics on such datasets. Deciding which algorithm to use for a given dataset is daunting for a data scientist. Although there is no single answer to this question, a systematic approach to problem solving is necessary. This book describes the various ML algorithms conceptually and defines a process for selecting ML/DL models. The consolidation of available algorithms and techniques for designing efficient ML models is the key aspect of this book. Thinking Data Science will help practising data scientists, academicians, researchers, and students who want to build ML models using the appropriate algorithms and architectures, whether the data is small or big.

About the Author

Poornachandra Sarang, in an IT career spanning four decades, has consulted for large IT organizations on the design and architecture of systems using state-of-the-art technologies. He has authored several books covering a wide range of emerging technologies. Dr. Sarang is a Ph.D. advisor for Computer Science and Engineering and is on the thesis advisory committee for aspiring doctoral candidates. He has designed and delivered courses and curricula for universities at the postgraduate level, as well as courses and workshops on emerging technologies for industry. He is a familiar face at technical and research conferences, delivering both keynote and technical talks.

In this Book

Data Science Process
Dimensionality Reduction: Creating Manageable Training Datasets
Regression Analysis: A Well-Studied Statistical Technique for Predictive Analysis
Decision Tree: A Supervised Learning Algorithm for Classification
Ensemble: Bagging and Boosting: Improving Decision Tree Performance by Ensemble Methods
K-Nearest Neighbors: A Supervised Learning Algorithm for Classification and May Be Regression
Naive Bayes: A Supervised Learning Algorithm for Classification
Support Vector Machines: A Supervised Learning Algorithm for Classification and Regression
Centroid-Based Clustering: Clustering Algorithms for Hard Clustering
Connectivity-Based Clustering: Clustering Built on a Tree-Type Structure
Gaussian Mixture Model: A Probabilistic Clustering Model for Datasets with Mixture of Gaussian Blobs
Density-Based Clustering: Density-Based Spatial Clustering
BIRCH: Divide and Conquer
CLARANS: Clustering Large Datasets with Randomized Search
Affinity Propagation Clustering: A Gossip-Style Algorithm for Clustering
STING & CLIQUE: Density and Grid Based Clustering
Artificial Neural Networks: A Noticeable Evolution in AI
ANN-Based Applications: Text and Image Dataset Processing for ANN Applications
Automated Tools: Data Scientist’s Aid for Designing Classical and ANN-Based Models
Data Scientist’s Ultimate Workflow: A Quick Summary on a Data Scientist’s Approach to Model Development
