Statistical Analysis and Modeling in R: Building Regularized Models & Ensemble Models
R Programming | Expert
- 14 videos | 1h 31m 11s
- Includes Assessment
- Earns a Badge
Understanding the bias-variance trade-off allows data scientists to build generalizable models that perform well on test data. Machine learning models are considered a good fit if they can extract general patterns or dominant trends in the training data and use these to make predictions on unseen instances. Use this course to discover what it means for your model to be a good fit for the training data. Identify underfit and overfit models and what the bias-variance trade-off represents in machine learning. Mitigate overfitting on training data using regularized regression models, train and evaluate models built using ridge regression, lasso regression, and ElasticNet regression, and implement ensemble learning using the random forest model. When you're done with this course, you'll have the skills and knowledge to train models that learn general patterns using regularized models and ensemble learning.
WHAT YOU WILL LEARN
- Discover the key concepts covered in this course
- Recall characteristics of overfitted and underfitted models
- Describe the bias-variance trade-off
- Examine and interpret the data for regression
- Perform ordinary least squares (OLS) regression
- Prepare data to build regularized regression models
- Perform and evaluate ridge regression
- Perform and evaluate lasso regression
- Perform and evaluate ElasticNet regression
- Outline the main characteristics of ensemble learning
- Examine and visualize data for regression
- Perform regression using decision trees
- Perform regression using random forest
- Summarize the key concepts covered in this course
IN THIS COURSE
-
2m 7s | In this video, you’ll learn more about the course and your instructor. In this course, you’ll learn what it means for your model to be a good fit for the training data. You’ll discover the characteristics of underfit and overfit models and what the bias-variance trade-off represents in machine learning. You’ll learn to mitigate overfitting on the training data using regularized regression models. You’ll use ridge regression, lasso regression, and ElasticNet regression. FREE ACCESS
-
9m 51s | In this video, you’ll learn more about machine learning models. These models are said to be good models if they’re a good fit for your underlying data. Machine learning algorithms can be divided into two categories: supervised learning and unsupervised learning algorithms. Regression and classification are examples of supervised learning techniques which are trained using labeled training data. FREE ACCESS
-
6m 18s | In this video, you’ll learn more about the bias-variance trade-off. First, you’ll look at the different kinds of errors that can exist within machine learning models. Errors in machine learning models can be categorized into three broad categories: bias errors, variance errors, and irreducible errors. Bias errors are erroneous assumptions made by machine learning algorithms. FREE ACCESS
-
9m 21s | In this video, you’ll watch a demo. Here, you’ll explore regularized regression models. You’ll see regularized regression models tweak the objective function that OLS regression uses. In addition to the original objective function for OLS regression, regularized models add a penalty term to this objective function. This penalty term will penalize complex coefficients in your model. Regularized regression models allow you to mitigate the effects of overfitting on the training data. FREE ACCESS
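As a rough sketch of the idea, the penalized objective can be written in R as the OLS loss plus a penalty term. The function names here are illustrative, not from the course, and an L2 (ridge-style) penalty is shown:

```r
# Ordinary least squares loss: the sum of squared residuals
ols_loss <- function(beta, X, y) sum((y - X %*% beta)^2)

# Regularized loss: OLS loss plus a penalty on large coefficients
# (L2/ridge penalty shown; a lasso-style penalty would use
# lambda * sum(abs(beta)) instead)
ridge_loss <- function(beta, X, y, lambda) {
  ols_loss(beta, X, y) + lambda * sum(beta^2)
}
```

Setting lambda to 0 recovers plain OLS; larger values of lambda penalize complex coefficients more heavily.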
-
5m 41s | In this video, you’ll watch a demo. You’ll split the data you’re working with into training data and test data. First, you’ll invoke set.seed and pass in 1. Then, you’ll use sample.split to split admission.data. Your target variable, Chance.of.Admit, is specified as an input argument. You’ll see the SplitRatio is 80%. sample.split will generate a mask with TRUE values for 80% of the records. FREE ACCESS
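The split described above might look like the following sketch. The sample.split function comes from the caTools package (an assumption; the course doesn't name the package), and the admission.data frame here is a hypothetical stand-in for the course's dataset:

```r
library(caTools)  # assumed package providing sample.split()

# Hypothetical stand-in for the course's admission.data data frame
set.seed(1)
admission.data <- data.frame(
  GRE.Score       = round(runif(100, 290, 340)),
  Chance.of.Admit = runif(100, 0.3, 0.97)
)

# Generates a logical mask with TRUE for 80% of the records
mask  <- sample.split(admission.data$Chance.of.Admit, SplitRatio = 0.8)
train <- admission.data[mask, ]
test  <- admission.data[!mask, ]
```

Subsetting with the mask and its negation gives the training and test partitions.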
-
4m 31s | In this video, you’ll watch a demo covering regularized regression models. First, you’ll need to pre-process your data to fit your ridge, lasso, and ElasticNet models. These are the functions you'll be using for your regularized regression models. These functions require data in the form of a model matrix. model.matrix creates a design matrix by expanding factors in the data frame into a set of dummy variables, expanding interactions similarly. FREE ACCESS
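model.matrix is base R, so a small illustration is easy to sketch. The toy data frame below is hypothetical, not from the course:

```r
# Toy data frame with a numeric outcome and a categorical predictor
df <- data.frame(
  chance = c(0.8, 0.6, 0.9, 0.7),
  rating = factor(c("low", "high", "medium", "low"))
)

# model.matrix() builds the design matrix, expanding the factor
# 'rating' into dummy columns and dropping the reference level ("high")
X <- model.matrix(chance ~ ., data = df)
colnames(X)  # "(Intercept)" "ratinglow" "ratingmedium"
```

The resulting numeric matrix X is the form the regularized-regression functions expect.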
-
10m 42s | In this video, you’ll learn more about performing ridge regression in R. You’ll see regularized regression models mitigate overfitting on the training data. This allows the model to learn more general patterns that exist in the data rather than specific patterns which don’t have much predictive ability. FREE ACCESS
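A minimal ridge sketch, assuming the glmnet package (the course doesn't name it, but glmnet is a common choice that takes a model matrix) and synthetic data:

```r
library(glmnet)  # assumed package; provides glmnet() and cv.glmnet()

# Synthetic data: 100 observations, 4 predictors
set.seed(1)
X <- matrix(rnorm(400), nrow = 100, ncol = 4)
y <- X %*% c(2, -1, 0.5, 0) + rnorm(100)

# alpha = 0 selects ridge regression (pure L2 penalty);
# cv.glmnet() cross-validates to pick the penalty strength lambda
cv_fit      <- cv.glmnet(X, y, alpha = 0)
ridge_model <- glmnet(X, y, alpha = 0, lambda = cv_fit$lambda.min)
preds       <- predict(ridge_model, newx = X)
```

Ridge shrinks all coefficients toward zero without eliminating any of them outright.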
-
9m 19s | In this video, you’ll learn more about performing lasso regression in R. Here, you’ll start with ordinary least squares regression. You’ve learned the objective function of this regression is to minimize the sum of squared errors. Lasso regression is a regularized regression model, which means its objective function is similar to that of ordinary least squares regression but with an additional penalty for complex coefficients. FREE ACCESS
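A lasso sketch under the same assumption (glmnet package, synthetic data; neither is from the course), where only two of the four predictors carry signal:

```r
library(glmnet)  # assumed package for lasso regression

set.seed(1)
X <- matrix(rnorm(400), nrow = 100, ncol = 4)
# Only the first two predictors truly matter here
y <- X %*% c(3, -2, 0, 0) + rnorm(100)

# alpha = 1 selects lasso (L1 penalty); unlike ridge, lasso can
# shrink unhelpful coefficients exactly to zero
lasso_model <- glmnet(X, y, alpha = 1, lambda = 0.5)
coef(lasso_model)  # irrelevant predictors are typically driven to zero
```

That exact-zero behavior is why lasso doubles as a variable-selection technique.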
-
6m 43s | In this video, you’ll learn more about performing ElasticNet regression in R. ElasticNet regression can be thought of as a combination of lasso and ridge regression. The penalty used by ElasticNet is a combination of L1 regularization and L2 regularization. You still need to specify the tuning parameter lambda, but lambda is then multiplied by a combination of the L1 and L2 penalties. ElasticNet regularization incorporates penalties from both. FREE ACCESS
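An ElasticNet sketch, again assuming glmnet and synthetic data. In glmnet, the alpha parameter sets the mix between the two penalties:

```r
library(glmnet)  # assumed package for ElasticNet regression

set.seed(1)
X <- matrix(rnorm(400), nrow = 100, ncol = 4)
y <- X %*% c(2, -1, 0.5, 0) + rnorm(100)

# 0 < alpha < 1 blends the L1 (lasso) and L2 (ridge) penalties;
# alpha = 0.5 weights the two penalties equally
enet_model <- glmnet(X, y, alpha = 0.5, lambda = 0.1)
preds      <- predict(enet_model, newx = X)
```

alpha = 0 would recover pure ridge and alpha = 1 pure lasso, making ElasticNet a continuum between the two.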
-
7m 36s | In this video, you’ll learn more about ensemble learning. You’ll learn that in statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than can be obtained from any of the individual learning algorithms alone. Ensemble learning techniques try to harness the wisdom of crowds: with multiple models, the combined predictions are better than the prediction of any individual model. FREE ACCESS
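For regression, the simplest form of that "wisdom of crowds" idea is averaging. The toy prediction vectors below are hypothetical, just to show the mechanics:

```r
# Hypothetical predictions from three individual regression models
pred_model_a <- c(5.1, 6.8, 4.2)
pred_model_b <- c(4.9, 7.0, 4.6)
pred_model_c <- c(5.3, 6.6, 4.4)

# The ensemble prediction averages the individual models,
# which tends to cancel out their individual errors (variance)
ensemble_pred <- (pred_model_a + pred_model_b + pred_model_c) / 3
```

Random forests apply this same averaging over many decision trees, each trained on a different bootstrap sample of the data.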
-
4m 54s | In this video, you’ll watch a demo. Onscreen, you’ll see you need to use ggplot to set up a bar plot showing how many countries you have from different regions. This gives you a quick overview of the countries in your dataset. You’ll see the different colors represent the different regions: Western Europe, Sub-Saharan Africa, and Asia. Here, you’ll see how the HappinessScore is distributed across regions. FREE ACCESS
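A bar plot along these lines might be set up as follows; the small happiness data frame is a hypothetical stand-in for the course's dataset:

```r
library(ggplot2)

# Hypothetical stand-in for the course's happiness dataset
happiness <- data.frame(
  Country = c("A", "B", "C", "D", "E", "F"),
  Region  = c("Western Europe", "Sub-Saharan Africa", "Asia",
              "Western Europe", "Asia", "Sub-Saharan Africa"),
  HappinessScore = c(7.5, 4.3, 5.9, 7.1, 5.5, 4.1)
)

# geom_bar() counts rows per Region; fill colors each region differently
p <- ggplot(happiness, aes(x = Region, fill = Region)) +
  geom_bar() +
  labs(title = "Number of countries per region", y = "Count")
```

Printing p renders one colored bar per region, with bar height equal to the number of countries in that region.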
-
5m 45s | In this video, you’ll watch a demo. You’ll first build a decision tree regression model, and then move on to the random forest model. First, you’ll invoke set.seed and pass in 3. Then, you’ll invoke sample.split, specify your target variable, which is HappinessScore, and your SplitRatio of 0.8. This will give you a mask with TRUE and FALSE values. You’ll see you have 125 records to train your regression model. FREE ACCESS
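A decision-tree sketch of those steps. The course doesn't name the tree package, so rpart is an assumption here, as are the caTools package for sample.split and the synthetic happiness data:

```r
library(caTools)  # assumed package providing sample.split()
library(rpart)    # assumed decision-tree package (not named in the course)

# Hypothetical stand-in for the course's happiness dataset
set.seed(3)
happiness <- data.frame(
  GDP            = runif(150, 0.3, 1.8),
  Family         = runif(150, 0.2, 1.6),
  HappinessScore = runif(150, 3, 8)
)

# Split into 80% training and 20% test data via a logical mask
mask  <- sample.split(happiness$HappinessScore, SplitRatio = 0.8)
train <- happiness[mask, ]
test  <- happiness[!mask, ]

# method = "anova" fits a regression (not classification) tree
tree_model <- rpart(HappinessScore ~ ., data = train, method = "anova")
preds      <- predict(tree_model, newdata = test)
```

The fitted tree predicts by averaging the training scores that fall into each leaf.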
-
6m 2s | In this video, you’ll watch a demo. Here, you’ll run a random forest regression. To build a random forest model, you’ll need to install an additional package. You’ll invoke install.packages for the package "randomForest". This can be found on line 115. Now, you’ll include this package in your current program by invoking the library function for randomForest. This contains the function you’ll use to build your random forest model. FREE ACCESS
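A minimal random forest sketch using the randomForest package named in the demo; the training data frame below is a hypothetical stand-in for the course's dataset:

```r
# install.packages("randomForest")  # one-time install, as in the demo
library(randomForest)

# Hypothetical stand-in for the course's training data
set.seed(3)
train <- data.frame(
  GDP            = runif(125, 0.3, 1.8),
  Family         = runif(125, 0.2, 1.6),
  HappinessScore = runif(125, 3, 8)
)

# ntree controls how many trees the ensemble averages over
rf_model <- randomForest(HappinessScore ~ ., data = train, ntree = 200)
preds    <- predict(rf_model, newdata = train)
```

Because the target is numeric, randomForest fits a regression forest automatically and averages the trees' predictions.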
-
2m 22s | In this video, you’ll summarize what you’ve learned in the course. In this course, you’ve learned to keep your models from overfitting on the training data. You also learned about the bias-variance trade-off that every data scientist has to keep in mind while building and training machine learning models. You saw how to mitigate overfitting for regression models using regularization. Finally, you explored ensemble learning. FREE ACCESS
EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE
Skillsoft is providing you the opportunity to earn a digital badge upon successful completion of some of our courses, which can be shared on any social network or business platform.
Digital badges are yours to keep, forever.