Creating a subset of a data frame with specific years

Time:03-02

I am using a data frame from WDI and am attempting to clean a merged dataset.

One of the two merged datasets only has values for 2000, 2005 and 2010, so I would like a subsetted data frame that includes only those years (for each country, etc.).

My code is as follows:

WB_Merge1 = subset(WB_Merge, select = c(year==2000 | year==2005 | year==2010))

However, when I run it in R, it creates a data frame that still has all 5502 observations but no variables.

Could anyone help? Many thanks.

CodePudding user response:

You just used the wrong argument. To select rows, you want subset=.

subset(dat, subset=c(year == 2000 | year == 2005 | year == 2010))

Or more concise:

subset(dat, subset=year %in% c(2000, 2005, 2010))
#    year          x          z
# 1  2000 -0.4703161 0.62147778
# 6  2005 -0.6667708 0.03479132
# 11 2010 -0.8059292 0.43732005

select= is for the columns.

subset(dat, subset=year %in% c(2000, 2005, 2010), select=c(year, z))
#    year          z
# 1  2000 0.62147778
# 6  2005 0.03479132
# 11 2010 0.43732005
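As for why the original call keeps every row but no columns: inside select=, base R's subset() evaluates the expression with each column name standing for its position, so the comparison is made on a column index rather than on the year values. A small sketch on the same sample data (the names wrong and right are just for illustration):

```r
# Same sample data as in this answer.
set.seed(42)
dat <- data.frame(year = 2000:2022, x = rnorm(23), z = runif(23))

# Inside select=, each column name stands for its position (year is 1, x is 2,
# z is 3), so this is really 1 == 2000 | 1 == 2005 | 1 == 2010, i.e. FALSE.
wrong <- subset(dat, select = c(year == 2000 | year == 2005 | year == 2010))
dim(wrong)   # all 23 rows kept, 0 columns selected

# The same condition passed as subset= is evaluated on the year values.
right <- subset(dat, subset = year == 2000 | year == 2005 | year == 2010)
dim(right)   # 3 rows, 3 columns
```

A single FALSE as the column index selects no columns, which matches the "all observations but no variables" result from the question.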

Note that if you provide the arguments in the right order, you may leave out the argument names and just do:

subset(dat, year %in% c(2000, 2005, 2010), c(year, z))

Data:

set.seed(42)
dat <- data.frame(year=2000:2022, x=rnorm(23), z=runif(23))
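Since the question mentions several countries, here is a rough sketch of the same filter on panel-style data. The country codes, the value column and the names panel, keep_years and panel_sub are all made up for illustration and are not taken from WDI:

```r
# Hypothetical panel: three made-up country codes, one row per country and year.
panel <- expand.grid(year = 2000:2010, country = c("AAA", "BBB", "CCC"),
                     stringsAsFactors = FALSE)
panel$value <- seq_len(nrow(panel))  # placeholder values, not real WDI data

keep_years <- c(2000, 2005, 2010)
panel_sub <- subset(panel, year %in% keep_years)

# Each country should be left with exactly the three requested years.
table(panel_sub$country)
```

The condition is applied row by row, so no grouping by country is needed; every country keeps whichever of the requested years it has.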

CodePudding user response:

Try:

library(dplyr)
WB_Merge1 <- filter(WB_Merge, year %in% c(2000, 2005, 2010))

CodePudding user response:

Another solution with which():

Sample data:

set.seed(42)
data <- data.frame(year=2000:2022, x=rnorm(23), z=runif(23))

Sample code:

new.data <- data[ which( data$year == 2000 | data$year == 2005 | data$year == 2010) , ]

Output:

   year          x         z
1  2000  1.3709584 0.8877549
6  2005 -0.1061245 0.3467482
11 2010  1.3048697 0.6772768
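One practical reason to wrap the condition in which() is missing values. A minimal sketch, assuming a made-up data frame d with one NA in year (none of these names come from the question):

```r
# Hypothetical data with one missing year, to show the difference.
d <- data.frame(year = c(2000, NA, 2005, 2007), x = 1:4)

# Plain logical indexing: the NA comparison yields NA, which returns a row of NAs.
with_na <- d[d$year == 2000 | d$year == 2005, ]

# which() keeps only the positions that are TRUE, so the NA is dropped.
with_which <- d[which(d$year == 2000 | d$year == 2005), ]

# %in% returns FALSE for NA, so it also avoids the NA row.
with_in <- d[d$year %in% c(2000, 2005), ]

nrow(with_na)     # 3
nrow(with_which)  # 2
nrow(with_in)     # 2
```

If the year column has no missing values, all three forms return the same rows.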