I have a data frame with 854 observations and 47 variables (India_Summary). I want to create another data frame that contains only three of those 47 columns: 'MEMSEXCOV1', 'PostSecAvailable', and 'TertiaryYears'.
I thought I could simply use this (assuming I am just naming the new df 'India_Summary2'):
India_Summary2 <- India_Summary[['MEMSEXCOV1', 'PostSecAvailable', 'TertiaryYears']]
The error I receive is:
Error in `[[.default`(col, i, exact = exact) : subscript out of bounds.
I tried using an equal sign instead:
India_Summary2 = India_Summary[['MEMSEXCOV1', 'PostSecAvailable', 'TertiaryYears']]
and I receive the below error:
Error in `[[.default`(col, i, exact = exact) : subscript out of bounds
In addition: Warning messages:
1: In doTryCatch(return(expr), name, parentenv, handler) :
display list redraw incomplete
2: In doTryCatch(return(expr), name, parentenv, handler) :
invalid graphics state
3: In doTryCatch(return(expr), name, parentenv, handler) :
invalid graphics state
CodePudding user response:
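Your code looks like Python. In R, double brackets [[ ]] extract a single element, so they can't be used to pick several columns at once. I'd recommend using the dplyr package. You'd have something like this: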
library(dplyr)
India_Summary2 <- India_Summary %>%
  select(MEMSEXCOV1, PostSecAvailable, TertiaryYears)
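To see why the original attempt fails, here is a minimal base R sketch. The data frame df and its columns a, b and c are made up for illustration. It only shows that [[ ]] pulls out one column as a vector, while single brackets with a character vector return a data frame.

```r
# Hypothetical toy data frame, not the asker's data
df <- data.frame(a = 1:3, b = 4:6, c = 7:9)

# [[ ]] extracts exactly one column and returns it as a plain vector
one_column <- df[["a"]]
is.data.frame(one_column)   # FALSE

# [ ] with a character vector keeps several columns as a data frame
two_columns <- df[c("a", "b")]
is.data.frame(two_columns)  # TRUE
```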
CodePudding user response:
You haven't provided any of your data, and Justin already provided a solution using the dplyr package. It's impossible to know whether this will work for you since your data is not available, so I'll show a way to do it with the iris dataset already in R, using a method that doesn't require libraries.
First, the data. I can inspect the first rows with head(iris):
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
I want Sepal.Length and Sepal.Width, and base R gives me two ways to get them. The first is matrix notation, which indexes a data frame as [row, column]. Since I only want the columns Sepal.Length and Sepal.Width (columns 1 and 2), I leave the row position empty and write [, column].
#### Subset by Matrix Notation ####
iris.2 <- iris[,c(1,2)]
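The same notation also accepts column names instead of positions, which is probably closer to what the question needs. The sketch below uses a small stand-in for India_Summary, since the real data was not posted. The values and the extra column OtherVariable are invented for illustration; only the three column names come from the question.

```r
# Hypothetical stand-in for India_Summary; all values are made up
India_Summary <- data.frame(
  MEMSEXCOV1       = c(1, 2, 1),
  PostSecAvailable = c(TRUE, FALSE, TRUE),
  TertiaryYears    = c(3, 0, 4),
  OtherVariable    = c("a", "b", "c")
)

# Character vector of the columns to keep
wanted <- c("MEMSEXCOV1", "PostSecAvailable", "TertiaryYears")

# Empty row position keeps every row; the name vector picks the columns
India_Summary2 <- India_Summary[, wanted]

names(India_Summary2)
```

Selecting by name is usually safer than selecting by position, because the code keeps working if the column order changes.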
Alternatively, I can do the same thing by naming the columns I want in the select argument of subset.
#### Subset with Function ####
iris.2 <- subset(iris,
                 select = c("Sepal.Length",
                            "Sepal.Width"))
Both achieve the same thing. If I now use head(iris.2), I only see two columns:
  Sepal.Length Sepal.Width
1          5.1         3.5
2          4.9         3.0
3          4.7         3.2
4          4.6         3.1
5          5.0         3.6
6          5.4         3.9
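As a quick sanity check that the two approaches agree, the results can be compared directly. This is only a sketch; the names by_position and by_name are mine, chosen so that neither result overwrites the other.

```r
# Select the first two columns by position
by_position <- iris[, c(1, 2)]

# Select the same two columns by name with subset()
by_name <- subset(iris, select = c("Sepal.Length", "Sepal.Width"))

# TRUE if both data frames hold the same columns and values
same_result <- isTRUE(all.equal(by_position, by_name))
same_result
```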