Instructions:
- Create an object called response that contains the name of the variable that will be considered the response variable.
- Create a vector called predictors that contains the names of the variables that will be used as predictors.
Assignment Code:
# Create response object
response = 'price'
# Create predictors object
predictors = c("carat","cut","clarity","color","depth")
Feedback:
Now that you've identified what variables are of interest, you can subset the data to only include those columns.
Subsetting the Data
Context:
There are various ways to subset a data frame to only include specific columns. Here, you can use the objects response and predictors to indicate the names of the columns to keep. The function select(), from the {dplyr} package, is a good way to accomplish this task. It requires two arguments:
- the name of the data frame being used
- the column names to be included in the subset, or the objects that contain that information (response and predictors), separated by a comma
Instructions:
- Use select() to create a subset of myData that contains only the columns of interest, and store this subset in an object called myData_subset.
Assignment Code:
# Subset the data
myData_subset %>% select(response,predictors)
Could anyone tell me where I am going wrong on subsetting? Is it that my objects were created incorrectly? Thanks a lot.
CodePudding user response:
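Try this. Your objects are fine; the problem is the subsetting line. It pipes myData_subset (which does not exist yet) into select() and never assigns the result, so you need myData on the left of the pipe and an assignment to myData_subset. Also, using an external vector in selections is ambiguous, so wrap it in all_of():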
response = 'price'
predictors = c("carat","cut","clarity","color","depth")
myData_subset <- myData %>%
  # response is an external character vector too, so wrap it in all_of() as well
  select(all_of(response), all_of(predictors))
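
If you want to check the approach without the course data, here is a minimal, self-contained sketch. The small myData below is a made-up stand-in (I don't know what your real data looks like beyond the column names in your question), with one extra column, table, added only to show that it gets dropped.

```r
library(dplyr)

# Hypothetical stand-in for myData: made-up rows, plus an extra
# column ("table") that should not survive the subset
myData <- data.frame(
  price   = c(326, 340, 355),
  carat   = c(0.23, 0.21, 0.29),
  cut     = c("Ideal", "Premium", "Good"),
  clarity = c("SI2", "SI1", "VS2"),
  color   = c("E", "E", "I"),
  depth   = c(61.5, 59.8, 62.4),
  table   = c(55, 61, 58)
)

response   <- "price"
predictors <- c("carat", "cut", "clarity", "color", "depth")

# all_of() tells select() that these are character vectors of column
# names stored outside the data frame, not columns named
# "response" or "predictors"
myData_subset <- myData %>%
  select(all_of(response), all_of(predictors))

names(myData_subset)
# "price" "carat" "cut" "clarity" "color" "depth"
```

all_of() is strict: it throws an error if any of the names is missing from the data frame, which is usually what you want in an assignment like this. Without dplyr, myData[, c(response, predictors)] selects the same columns.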