Datasets in R: Selecting, Filtering, Ordering, & Grouping Data
R Programming | Intermediate
- 12 videos | 1h 34m 30s
- Includes Assessment
- Earns a Badge
Data analysis often requires performing a series of complex transformations. R makes this hassle-free via the forward pipe operator for chaining operations, data selection and filtering based on conditional operations, and grouping and aggregating options to compute summaries. Learn how to carry out all these operations in this course. Tasks you'll carry out include using logical and relational operators to perform conditional filtering, sampling records at random, and computing the top N records based on values in a variable. You'll also learn to use the forward pipe operator in the magrittr package and tibbles, the next-generation data frame, to store and transform your data. You'll round this course off by performing ordering, grouping, and aggregations on your data. When you're finished, you'll have a solid grasp of complex operations on data frames and be able to apply these concepts using the R programming language.
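As a rough illustration of the chaining described above, the sketch below filters, groups, summarizes, and orders a small tibble in one pipeline. It assumes the dplyr package is installed; the `customers` data and its column names (`segment`, `age`, `balance`) are made up for this example and are not the course's dataset.

```r
library(dplyr)  # re-exports the %>% pipe from magrittr and tibble() from tibble

# Hypothetical data, invented for illustration only.
customers <- tibble(
  segment = c("A", "B", "A", "B", "A"),
  age     = c(34, 51, 45, 29, 62),
  balance = c(1200, 600, 800, 1500, 400)
)

summary_by_segment <- customers %>%
  filter(age > 30, balance > 0) %>%   # keep rows matching both conditions
  group_by(segment) %>%               # one group per segment
  summarize(
    n_customers  = n(),               # row count per group
    mean_balance = mean(balance)      # average balance per group
  ) %>%
  arrange(desc(mean_balance))         # highest average first

print(summary_by_segment)
```

Each step receives the result of the previous one as its first argument, so the pipeline reads top to bottom in the order the operations are applied.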
WHAT YOU WILL LEARN
- Discover the key concepts covered in this course
- Edit data frame columns to be of the right data type
- Select variables from data frames
- Filter data using relational operators
- Use the select() function and chaining to filter data in tibbles
- Use the %>% operator and the filter() function to filter tibbles
- Sample rows using sample() and select top n rows using top_n()
- Change columns to be of their logically correct data type
- Use the order() and arrange() functions to sort data frames
- Create crosstabs and view the aggregate statistics of data frames
- View aggregate statistics of tibbles with summarize() and group_by()
- Summarize the key concepts covered in this course
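For readers who want a feel for the base R side of these objectives, here is a minimal sketch of sorting with order(), building a crosstab with table(), and computing a per-group statistic with aggregate(). The `car_data` data frame and its columns are hypothetical and chosen only for illustration.

```r
# Hypothetical car data; names and values are illustrative only.
car_data <- data.frame(
  make  = c("audi", "bmw", "audi", "honda"),
  fuel  = c("gas", "diesel", "gas", "gas"),
  price = c(17450, 16430, 13950, 7295),
  stringsAsFactors = FALSE
)

# order() returns the row positions that sort by price (ascending by default).
cars_by_price <- car_data[order(car_data$price), ]

# Negating a numeric column sorts in descending order instead.
cars_by_price_desc <- car_data[order(-car_data$price), ]

# table() builds a contingency table of counts for each make/fuel combination.
make_by_fuel <- table(car_data$make, car_data$fuel)

# aggregate() applies a summary function to each group defined by the formula.
mean_price_by_make <- aggregate(price ~ make, data = car_data, FUN = mean)

print(cars_by_price)
print(make_by_fuel)
print(mean_price_by_make)
```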
IN THIS COURSE
- 2m 5s | In this video, you’ll learn about the instructor and the course. In this course, you’ll learn to select and filter only the data you want to work with using different selection criteria and filtering techniques. You’ll use logical and relational operators to perform conditional filtering. You’ll sample records at random and compute the top n records based on the values in a variable. You’ll also learn how you can chain operations on your data frame. | FREE ACCESS
- 7m 6s | In this video, you’ll watch a demo. In this demo, you’ll perform a range of operations in R that allow you to select and filter data stored in data frames or in data-frame-like formats. The file where you’ll write your code is called SelectionAndFiltering.R. First, you’ll run rm(list=ls()) to get rid of any objects that are currently in memory. | FREE ACCESS
- 6m 22s | In this video, you’ll watch a demo. In this demo, you’ll perform selection and filtering operations using raw data frames. First, you’ll take a look at the column names in your data. You’ll invoke the colnames function and pass in your data frame. You’ll see there are a total of 13 columns, starting with CLIENTNUM. The 13th column is the total revolving balance of a customer. | FREE ACCESS
- 10m 49s | In this video, you’ll watch a demo. In this demo, you’ll see that you can also select rows in a data frame based on a condition. Rather than specifying the index values of the rows, you can specify that you’d like to select rows that match a certain condition. Onscreen, you’ll see square brackets used to index into the bank churners data frame. | FREE ACCESS
- 10m 27s | In this video, you’ll watch a demo. In this demo, you’ll use functions and packages from the tidyverse to perform selection and filtering on your data frames. Instead of working with data frames, you’ll work with the tibble format. In the tidyverse, which is a collection of R packages for data science, data is stored in tibbles rather than data frames. You’ll see tibbles and data frames are alike because they store data in a tabular format. | FREE ACCESS
- 8m 19s | In this video, you’ll watch a demo. In this demo, you’ll learn about the different functions available in the dplyr package. These allow you to slice and filter your data. The first function you’ll look at is the slice function. Slice allows you to specify the index values of the records you want to select. The slice operation is performed on the bank.churners tibble, and you’ll specify this using the forward pipe operator. | FREE ACCESS
- 7m 17s | In this video, you’ll watch a demo. In this demo, you’ll explore functions from the dplyr package which allow you to sample records at random. You’ll discover the sample_n function allows you to sample any number of records from the original data frame. Onscreen, you’ll see how to feed the bank.churners tibble to sample_n using the forward pipe operator. sample_n takes as an input argument the number of samples you want from the original data. | FREE ACCESS
- 7m 17s | In this video, you’ll watch a demo. In this demo, you’ll cover a range of operations you can perform on R data frames and tibbles. You’ll learn to order your data, group your data, and then perform aggregations. You’ll start by looking at different techniques you can use to order or sort your data, starting with clean data frames and moving on to tibbles. First, you’ll run rm(list=ls()) to get rid of any existing objects in memory. | FREE ACCESS
- 9m 21s | In this video, you’ll watch a demo. In this demo, you’ll use the order function to sort the records in your data frame. You’ll see you can use it to order by any column. Onscreen, you’ll see the data frame records ordered by the price of the car. The default is ascending order, from the lowest price to the highest. | FREE ACCESS
- 12m 8s | In this video, you’ll watch a demo. In this demo, you’ll move on to grouping operations. First, you’ll perform grouping without using functions from tidyverse packages, relying instead on functions from base R. You’ll see that if you want to group records and see the counts of records in different categories, the easiest way to do this is to build a contingency table using the table function. | FREE ACCESS
- 11m 21s | In this video, you’ll watch a demo. In this demo, you’ll learn how to compute aggregations on your data using functions from the dplyr package. The dplyr package offers functions to quickly summarize and aggregate your data. You’ll look at functions in the dplyr package that work with data frames and with tibbles. Since you’re in the tidyverse, you’ll use the tibble for all your aggregations. | FREE ACCESS
- 1m 59s | In this video, you’ll summarize what you’ve learned in the course. In this course, you’ve learned a wide variety of data transformation and manipulation techniques, using R data frames and tibbles from the tidyverse. You explored the different techniques used to select columns or variables in a data frame. You learned the data types associated with the variables in your data and performed indexing operations to select specific rows and columns. | FREE ACCESS
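The base R techniques mentioned in the demos above (clearing the workspace aside) can be sketched roughly as follows: fixing a column's data type, selecting rows by condition with square brackets, selecting columns by name, and sampling rows at random. The `clients` data frame here is hypothetical and only loosely modeled on a customer table; it is not the course's bank churners data.

```r
set.seed(42)  # makes the random sample reproducible

# Hypothetical data, invented for illustration only.
clients <- data.frame(
  id      = 1:6,
  gender  = c("M", "F", "F", "M", "F", "M"),
  age     = c(45, 49, 51, 40, 26, 44),
  balance = c(777, 864, 0, 2517, 0, 1247),
  stringsAsFactors = FALSE
)

# Inspect column names, then convert a character column to a factor.
print(colnames(clients))
clients$gender <- factor(clients$gender)

# A logical vector in the row position keeps only the matching rows.
older_with_balance <- clients[clients$age > 42 & clients$balance > 0, ]

# A character vector in the column position selects columns by name.
id_and_age <- clients[, c("id", "age")]

# sample() draws row positions at random, without replacement by default.
sampled <- clients[sample(nrow(clients), 3), ]

print(older_with_balance)
print(sampled)
```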
EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE
Skillsoft is providing you the opportunity to earn a digital badge upon successful completion of some of our courses, which can be shared on any social network or business platform.
Digital badges are yours to keep, forever.