Creating a subset of a data frame with specific years

I am using a data frame from WDI and am attempting to clean a merged dataset.

One of the two merged datasets only has values for 2000, 2005, and 2010, so I would like a subsetted data frame that includes only those years (for each country, etc.).

My code is as follows:

WB_Merge1 = subset(WB_Merge, select = c(year==2000 | year==2005 | year==2010))

However, when I run it in R, it creates a data frame that still has all 5502 observations but no variables.

Could anyone help? Many thanks.

CodePudding user response：

You used the wrong argument. To select rows, use subset=.

subset(dat, subset=c(year == 2000 | year == 2005 | year == 2010))

Or more concise:

subset(dat, subset=year %in% c(2000, 2005, 2010))
#    year          x          z
# 1  2000 -0.4703161 0.62147778
# 6  2005 -0.6667708 0.03479132
# 11 2010 -0.8059292 0.43732005

select= is for the columns.

subset(dat, subset=year %in% c(2000, 2005, 2010), select=c(year, z))
#    year          z
# 1  2000 0.62147778
# 6  2005 0.03479132
# 11 2010 0.43732005

Note that if you provide the arguments in the right order, you can leave out the argument names:

subset(dat, year %in% c(2000, 2005, 2010), c(year, z))

Data:

set.seed(42)
dat <- data.frame(year=2000:2022, x=rnorm(23), z=runif(23))
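
As for why the original call returned every row but no variables: inside select=, subset() evaluates column names as column positions, so the condition compares a position number with 2000, 2005 and 2010 and comes out FALSE, which selects no columns. A minimal sketch of this, assuming the toy data above (the name res is only for illustration):

```r
# Toy data, same construction as the example data above
set.seed(42)
dat <- data.frame(year = 2000:2022, x = rnorm(23), z = runif(23))

# Inside select=, `year` stands for the column position (here 1),
# so the condition is 1 == 2000 | 1 == 2005 | 1 == 2010, i.e. FALSE.
# A logical FALSE as column selector keeps no columns; rows are untouched.
res <- subset(dat, select = c(year == 2000 | year == 2005 | year == 2010))

dim(res)  # all 23 rows, 0 columns
```

This matches the symptom in the question: the row count is unchanged because no row filter was applied, and the column filter selected nothing.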

CodePudding user response：

Try:

library(dplyr)
WB_Merge1 <- filter(WB_Merge, year %in% c(2000, 2005, 2010))

CodePudding user response：

Another solution, using which():

Sample data:

set.seed(42)
data <- data.frame(year=2000:2022, x=rnorm(23), z=runif(23))

Sample code:

new.data <- data[ which( data$year == 2000 | data$year == 2005 | data$year == 2010) , ]

Output:

   year          x         z
1  2000  1.3709584 0.8877549
6  2005 -0.1061245 0.3467482
11 2010  1.3048697 0.6772768
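
One reason to wrap the condition in which() is missing values: if year contains NA, plain logical indexing with == returns an extra all-NA row, while which() drops it. A small sketch, assuming a made-up data frame d with one missing year (d and keep are illustrative names, not from the question):

```r
# Hypothetical toy data with one missing year
d <- data.frame(year = c(2000, NA, 2005, 2007, 2010), x = 1:5)

# The comparison yields NA wherever year is NA
keep <- d$year == 2000 | d$year == 2005 | d$year == 2010

n_logical <- nrow(d[keep, ])         # NA in the index adds an all-NA row
n_which   <- nrow(d[which(keep), ])  # which() keeps only the TRUE positions
n_in      <- nrow(d[d$year %in% c(2000, 2005, 2010), ])  # %in% gives FALSE for NA

c(n_logical, n_which, n_in)
```

If the year column has no missing values, all three forms return the same rows.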