Machine & Deep Learning Algorithms: Imbalanced Datasets Using Pandas ML
Machine Learning | Intermediate
- 12 videos | 1h 23m 4s
- Includes Assessment
- Earns a Badge
The imbalanced-learn library, which integrates with Pandas ML, offers several techniques for addressing class imbalance in datasets used for classification. In this course, explore oversampling, undersampling, and combinations of the two. Begin by using Pandas ML to explore a dataset in which samples are not evenly distributed across target classes. Then apply oversampling with the RandomOverSampler class in the imbalanced-learn library, build a classification model with the oversampled data, and evaluate its performance. Next, learn how to create a balanced dataset with the Synthetic Minority Oversampling Technique (SMOTE) and how to undersample a dataset by applying the Near Miss, Cluster Centroids, and Neighborhood Cleaning Rule techniques. Then look at ensemble classifiers for imbalanced data, apply combination samplers, and find correlations in a dataset. Finally, learn how to build a multilabel classification model, explore principal component analysis (PCA), and combine oversampling with PCA when building a classification model. The concluding exercise involves working with imbalanced datasets.
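To make the idea of random oversampling concrete, here is a minimal standard-library sketch. It is not the imbalanced-learn implementation of RandomOverSampler; the function name `random_oversample` and the toy data are assumptions made for illustration only. It duplicates randomly chosen minority samples until every class is as large as the majority class.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen samples until every class matches the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        # Rows that currently belong to this class.
        rows = [x for x, lbl in zip(X, y) if lbl == label]
        # Draw with replacement to fill the gap up to the majority size.
        for _ in range(target - count):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy data: eight samples of class 0, two of class 1.
X = [[float(i)] for i in range(10)]
y = [0] * 8 + [1] * 2

X_res, y_res = random_oversample(X, y)
print(Counter(y))      # class counts before resampling
print(Counter(y_res))  # class counts after resampling
```

Because the added rows are exact copies, a model trained on the result sees no new information about the minority class, only more weight on what was already there.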
WHAT YOU WILL LEARN
- Use Pandas ML to explore a dataset where the samples are not evenly distributed across the target classes
- Apply the technique of oversampling using the RandomOverSampler class in the imbalanced-learn library, build a classification model with the oversampled data, and evaluate its performance
- Create a balanced dataset using the Synthetic Minority Oversampling Technique and build and evaluate a classification model with that data
- Perform undersampling operations on a dataset by applying the Near Miss, Cluster Centroids, and Neighborhood Cleaning Rule techniques
- Use the EasyEnsembleClassifier and BalancedRandomForestClassifier available in the imbalanced-learn library to build classification models with imbalanced data
- Apply a combination of oversampling and undersampling using the SMOTETomek and SMOTEENN techniques
- Use Pandas and Seaborn to visualize the correlated fields in a dataset
- Train and evaluate a classification model to predict the quality ratings of red wines
- Transform a dataset containing multiple features to a handful of principal components and build a classification model using the reduced dimensions of the dataset
- Combine the use of oversampling and PCA in building a classification model
- Recall the techniques used by algorithms for undersampling and oversampling data and the use of combined samplers
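The Synthetic Minority Oversampling Technique named above creates new minority samples by interpolating between an existing minority sample and one of its nearest minority neighbors. The following is a rough standard-library sketch of that interpolation step, not the imbalanced-learn SMOTE class; the name `smote_sketch`, the parameter `k`, and the sample points are hypothetical.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority samples to the base point, excluding itself.
        others = [p for p in minority if p is not base]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        # Pick a random position on the segment between base and neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Hypothetical minority-class samples with two features each.
minority = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.0], [2.5, 2.5]]
new_points = smote_sketch(minority, n_new=4)
print(new_points)
```

Unlike random oversampling, each synthetic point is new, but it always lies on a line segment between two existing minority samples.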
IN THIS COURSE
- 2m 18s
- 7m 36s: Learn how to use Pandas ML to explore a dataset where the samples are not evenly distributed across the target classes.
- 8m 35s: In this video, you will learn how to apply the technique of oversampling using the RandomOverSampler class in the imbalanced-learn library. You will also build a classification model with the oversampled data and evaluate its performance.
- 4m 14s: In this video, you will learn how to create a balanced dataset using the Synthetic Minority Oversampling Technique and build and evaluate a classification model with that data.
- 8m 51s: Learn how to perform undersampling operations on a dataset by applying the Near Miss, Cluster Centroids, and Neighborhood Cleaning Rule techniques.
- 6m 15s: In this video, learn how to use the EasyEnsembleClassifier and BalancedRandomForestClassifier available in the imbalanced-learn library to build classification models with imbalanced data.
- 7m 22s: In this video, learn how to apply a combination of oversampling and undersampling using the SMOTETomek and SMOTEENN techniques.
- 8m 18s: In this video, find out how to use Pandas and Seaborn to visualize the correlation between fields in a dataset.
- 3m 22s: In this video, you will learn how to train and evaluate a classification model to predict the quality ratings of red wines.
- 9m 41s: In this video, you will transform a dataset containing multiple features to a handful of principal components and build a classification model using the reduced dimensions of the dataset.
- 8m 57s: In this video, you will learn how to use oversampling and PCA together to build a classification model.
- 7m 34s: Upon completion of this video, you will be able to recall the techniques used by algorithms for undersampling and oversampling data, as well as the use of combined samplers.
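As a small aid for recalling how combined samplers clean up after oversampling, the sketch below finds Tomek links: pairs of samples from different classes that are each other's nearest neighbor. Combined samplers such as SMOTETomek remove such pairs after SMOTE has been applied. This is a standard-library illustration under assumed toy data, not the library's code, and the function name `tomek_links` is hypothetical.

```python
import math


def tomek_links(X, y):
    """Return index pairs (i, j) that are mutual nearest neighbors with different labels."""

    def nearest(i):
        # Index of the closest other sample to sample i.
        return min((j for j in range(len(X)) if j != i),
                   key=lambda j: math.dist(X[i], X[j]))

    links = []
    for i in range(len(X)):
        j = nearest(i)
        # Keep each pair once (i < j), and only if the relation is mutual.
        if i < j and nearest(j) == i and y[i] != y[j]:
            links.append((i, j))
    return links


# Hypothetical one-dimensional data: samples 2 and 3 sit on the class boundary.
X = [[0.0], [1.0], [4.0], [4.5], [8.0], [9.0]]
y = [0, 0, 0, 1, 1, 1]
print(tomek_links(X, y))  # → [(2, 3)]
```

Dropping one or both members of each link widens the gap between the classes near the boundary.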
EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE
Skillsoft offers you the opportunity to earn a digital badge upon successful completion of some of our courses. Badges can be shared on any social network or business platform.
Digital badges are yours to keep, forever.