Learning R-Code Review on Subsetting Data

Time:10-25

Instructions:

  • Create an object called response that contains the name of the variable that will be considered the response variable.
  • Create a vector called predictors that contains the names of the variables that will be used as predictors.

Assignment Code:

# Create response object
response = 'price'

# Create predictors object
predictors = c("carat","cut","clarity","color","depth")

Feedback:
Now that you've identified what variables are of interest, you can subset the data to only include those columns.

Subsetting the Data

Context:
There are various ways to subset a data frame to only include specific columns. Here, you can use the objects response and predictors to indicate the names of the columns to keep. The function select(), from the {dplyr} package, is a good way to accomplish this task. It requires two arguments:

  • the name of the data frame being used
  • the column names to be included in the subset, or the objects that contain that information (response and predictors), separated by a comma
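
As a rough illustration of that two-argument form, here is a minimal sketch. The data frame toy and its values are hypothetical and stand in for the real data; only select() and the pipe come from {dplyr}.

```r
library(dplyr)

# Hypothetical toy data frame, used only for illustration
toy <- data.frame(
  price = c(326, 334, 423),
  carat = c(0.23, 0.29, 0.31),
  depth = c(61.5, 63.3, 62.8),
  table = c(55, 58, 57)
)

# First argument: the data frame; remaining arguments: the columns to keep
subset_direct <- select(toy, price, carat)

# The pipe passes toy as the first argument, so this call is equivalent
subset_piped <- toy %>% select(price, carat)

names(subset_direct)
identical(subset_direct, subset_piped)
```

Both calls keep the same two columns; the piped form simply moves the data frame to the left of %>%.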

Instructions:

  • Use select() to create a subset of myData that contains only the columns of interest, and store this subset in an object called myData_subset.

Assignment Code:

# Subset the data
myData_subset %>% select(response,predictors)

Could anyone tell me where I am going wrong on subsetting? Is it that my objects were created incorrectly? Thanks a lot.

CodePudding user response:

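Try this. Your line never references myData: it pipes myData_subset (which does not exist yet) into select() and does not store the result anywhere. Also, passing an external character vector directly to select() is ambiguous, so wrap it in all_of():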

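library(dplyr)  # provides select(), all_of() and %>%

response = 'price'

predictors = c("carat","cut","clarity","color","depth")

# Start from myData and store the result in myData_subset
myData_subset <- myData %>% 
  select(all_of(response), all_of(predictors))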
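
To see why the bare vector is ambiguous, here is a small sketch under stated assumptions: toy is a hypothetical data frame (not your myData) that deliberately contains a column named predictors. With a bare name, select() looks for a column of that name before it looks for an object in your environment; all_of() tells it to use the external character vector.

```r
library(dplyr)

# Hypothetical toy data: note the column that happens to be named "predictors"
toy <- data.frame(
  price = c(326, 334),
  carat = c(0.23, 0.29),
  depth = c(61.5, 63.3),
  predictors = c("x", "y")
)

response <- "price"
predictors <- c("carat", "depth")

# Bare name: the column called "predictors" wins over the external vector
by_name <- toy %>% select(predictors)

# all_of(): the external character vectors are used as column names
by_vector <- toy %>% select(all_of(response), all_of(predictors))

names(by_name)
names(by_vector)
```

all_of() also raises an error if any of the listed names is missing from the data frame, which helps catch typos in predictors early.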