Building ML Training Sets: Preprocessing Datasets for Classification

Machine Learning | Beginner

6 videos | 43m 46s
Includes Assessment
Earns a Badge


In this course, learners can explore how to implement scaling techniques such as standardizing and normalizing on continuous data, and label encoding on the target, in order to get the best out of machine learning algorithms. The course also examines dimensionality reduction by using Principal Component Analysis (PCA). Start this 6-video course by using the pandas library to load a CSV data set into a data frame and scale continuous features by using a standard scaler. You will then learn how to build and evaluate a support vector classifier in scikit-learn, use pandas and Seaborn to generate a heat map, and spot the correlations between features in a data set. Discover how to apply PCA to reduce the number of dimensions in your input data and obtain the explained variance of each principal component. In the course's final tutorial, you will explore how to apply normalization and PCA to data sets and build a classification model with the principal components of scaled data. The concluding exercise involves processing data for classification.

WHAT YOU WILL LEARN

Use the pandas library to load a CSV dataset into a dataframe and scale the continuous features using a standard scaler

Build and evaluate a support vector classifier in scikit-learn, use pandas and Seaborn to generate a heatmap, and spot the correlations between features in a dataset

Apply the technique of principal component analysis to reduce the number of dimensions in your input data and obtain the explained variance of each principal component

Apply normalization and PCA to a dataset and build a classification model with the principal components of scaled data

Encode the target column of a dataset containing certain values, identify the features of normalization, enumerate reasons for using PCA, split data into training and test sets using scikit-learn, and identify one method of viewing correlations in a dataset using pandas and Seaborn

IN THIS COURSE

2m 54s

8m 13s

During this video, you will learn how to use the pandas library to load a CSV dataset into a dataframe and scale the continuous features using a standard scaler.

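As a rough illustration of the load-and-scale step described above, the sketch below reads a small CSV into a dataframe and standardizes its continuous columns with scikit-learn's `StandardScaler`. The inline CSV text and the column names (`height_cm`, `weight_kg`, `label`) are hypothetical stand-ins and are not taken from the course.

```python
import io

import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical inline CSV; in practice this would be a path such as "data.csv".
CSV_TEXT = """height_cm,weight_kg,label
150,50,a
160,60,b
170,70,a
180,80,b
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# Scale only the continuous feature columns and leave the target untouched.
feature_cols = ["height_cm", "weight_kg"]
scaler = StandardScaler()
scaled = pd.DataFrame(
    scaler.fit_transform(df[feature_cols]),
    columns=feature_cols,
    index=df.index,
)

# After standardizing, each column has mean 0 and unit (population) variance.
print(scaled.describe())
```

In a real workflow the scaler would normally be fitted on the training split only and then reused to transform the test split.
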
3. Spotting Correlations in a Dataset

6m 42s

Learn how to build and evaluate a support vector classifier in scikit-learn, use pandas and Seaborn to generate a heatmap, and identify the correlations between features in a dataset.

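One possible shape for this workflow is sketched below. It assumes scikit-learn's bundled wine dataset as a stand-in (the course's own dataset is not named here), trains an `SVC` on standardized features, and computes the feature correlation matrix with pandas. The helper `plot_heatmap` is a hypothetical name; it is defined but not called, so Seaborn is only needed when a plot is actually wanted.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: the wine dataset stands in for the course's data.
data = load_wine(as_frame=True)
X, y = data.data, data.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Fit the scaler on the training split only, then train and evaluate the SVC.
scaler = StandardScaler().fit(X_train)
clf = SVC(kernel="rbf")
clf.fit(scaler.transform(X_train), y_train)
accuracy = accuracy_score(y_test, clf.predict(scaler.transform(X_test)))
print(f"test accuracy: {accuracy:.3f}")

# Pairwise Pearson correlations between the features.
corr = X.corr()

# Find the most strongly correlated pair of distinct features.
abs_corr = corr.abs().to_numpy().copy()
np.fill_diagonal(abs_corr, 0.0)
i, j = np.unravel_index(abs_corr.argmax(), abs_corr.shape)
strongest = (corr.index[i], corr.columns[j], float(corr.iloc[i, j]))
print("strongest pair:", strongest)


def plot_heatmap(corr_matrix):
    """Hypothetical helper: draw the correlation matrix as a Seaborn heatmap."""
    import seaborn as sns  # imported lazily so the rest runs without Seaborn

    return sns.heatmap(corr_matrix, annot=True, cmap="coolwarm")
```

Strongly correlated feature pairs in the heatmap are a hint that some dimensions carry overlapping information, which is the motivation for PCA in the next video.
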
4. Principal Component Analysis

7m 33s

In this video, you will learn how to apply the technique of Principal Component Analysis to reduce the number of dimensions in your input data and obtain the explained variance of each principal component.

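A minimal sketch of this step, again assuming the scikit-learn wine dataset as a stand-in and an arbitrary choice of three components: the features are standardized, projected with `PCA`, and the share of variance captured by each component is read from `explained_variance_ratio_`.

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: the wine dataset (13 features) stands in for the course's data.
X = load_wine().data

# PCA is sensitive to feature scale, so standardize first.
X_scaled = StandardScaler().fit_transform(X)

# Keep three principal components (an arbitrary choice for illustration).
pca = PCA(n_components=3)
X_reduced = pca.fit_transform(X_scaled)

# Fraction of total variance explained by each retained component.
ratios = pca.explained_variance_ratio_
for idx, ratio in enumerate(ratios, start=1):
    print(f"PC{idx}: {ratio:.3f}")
print(f"total retained: {ratios.sum():.3f}")
```
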
5. Normalizing a Dataset

9m 22s

During this video, you will learn how to apply normalization and PCA to a dataset and build a classification model with the principal components of scaled data.

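The sketch below chains these steps in a scikit-learn pipeline. It assumes that "normalization" means scaling each sample to unit norm with `Normalizer`; min-max scaling with `MinMaxScaler` is another common reading. The wine dataset, the two-component choice, and the `SVC` classifier are likewise illustrative assumptions rather than the course's exact setup.

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer
from sklearn.svm import SVC

# Assumption: the wine dataset stands in for the course's data.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Normalize each row to unit norm, project onto two principal components,
# then classify using those components as the only inputs.
model = make_pipeline(Normalizer(), PCA(n_components=2), SVC())
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"test accuracy on 2 principal components: {accuracy:.3f}")
```

Wrapping the steps in a pipeline keeps the normalizer and PCA fitted on the training split only, so the test split is transformed without leaking information.
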
6. Exercise: Processing Data for Classification

9m 2s

Find out how to encode the target column of a dataset containing certain values, identify the features of normalization, enumerate reasons for using PCA, split data into training and test sets using scikit-learn, and identify one method of viewing correlations in a dataset using pandas and Seaborn.
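Two of the exercise tasks, encoding the target column and splitting the data, might look like the sketch below. The tiny dataframe, its column names, and the `cat`/`dog` labels are hypothetical and exist only to make the example self-contained.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Hypothetical data: two continuous features and a string-valued target.
df = pd.DataFrame(
    {
        "feature_a": [0.1, 0.4, 0.35, 0.8, 0.6, 0.9, 0.2, 0.7],
        "feature_b": [1.0, 0.9, 0.7, 0.2, 0.4, 0.1, 0.8, 0.3],
        "species": ["cat", "dog", "cat", "dog", "cat", "dog", "cat", "dog"],
    }
)

# LabelEncoder maps each distinct target value to an integer,
# assigning codes in sorted order of the class names.
encoder = LabelEncoder()
y = encoder.fit_transform(df["species"])
X = df[["feature_a", "feature_b"]]

# Hold out 25% of the rows for testing, keeping class proportions equal.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

print(dict(zip(encoder.classes_, encoder.transform(encoder.classes_))))
print(len(X_train), "training rows,", len(X_test), "test rows")
```

`encoder.inverse_transform` converts predicted integer codes back to the original labels when results need to be reported.
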

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft gives you the opportunity to earn a digital badge upon successful completion of some of its courses. The badge can be shared on any social network or business platform.

Digital badges are yours to keep, forever.


