I am using a data frame from WDI and am attempting to clean a merged dataset.
One of the two merged datasets only has values for 2000, 2005, and 2010, so I would like a subsetted data frame that only includes those years (for each country, etc.).
My code is as follows:
WB_Merge1 = subset(WB_Merge, select = c(year==2000 | year==2005 | year==2010))
However, when I run it in R, it creates a data frame that still has all 5502 observations but no variables.
Could anyone help? Many thanks.
CodePudding user response:
You just used the wrong argument. To select rows, you want subset=.
subset(dat, subset=c(year == 2000 | year == 2005 | year == 2010))
Or more concise:
subset(dat, subset=year %in% c(2000, 2005, 2010))
# year x z
# 1 2000 -0.4703161 0.62147778
# 6 2005 -0.6667708 0.03479132
# 11 2010 -0.8059292 0.43732005
select= is for the columns.
subset(dat, subset=year %in% c(2000, 2005, 2010), select=c(year, z))
# year z
# 1 2000 0.62147778
# 6 2005 0.03479132
# 11 2010 0.43732005
Note that if you provide the arguments in the right order, you may leave out the argument names and just do:
subset(dat, year %in% c(2000, 2005, 2010), c(year, z))
Data:
set.seed(42)
dat <- data.frame(year=2000:2022, x=rnorm(23), z=runif(23))
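As for why the original call returned every row but no columns, here is a minimal sketch. The toy data frame df and its values are made up for illustration. As far as I can tell from the source of subset.data.frame, the select expression is evaluated with each column name standing for its column position, so year becomes 2 and the whole condition collapses to a single FALSE, which selects no columns.

```r
# Toy data (hypothetical values); column names loosely mirror the question.
df <- data.frame(country = c("A", "A", "B"),
                 year    = c(2000, 2001, 2005),
                 gdp     = c(1, 2, 3))

# Inside `select`, `year` stands for its column position (2 here), so the
# condition is 2 == 2000 | 2 == 2005 | 2 == 2010, i.e. FALSE.
# Indexing columns by FALSE keeps none of them; rows are untouched.
wrong <- subset(df, select = c(year == 2000 | year == 2005 | year == 2010))
dim(wrong)

# Passed to `subset`, the same kind of condition is evaluated on the
# column values and filters rows instead.
right <- subset(df, subset = year %in% c(2000, 2005, 2010))
dim(right)
```

Here dim(wrong) reports 3 rows and 0 columns, which matches the "all observations but no variables" symptom from the question.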
CodePudding user response:
Try:
library(dplyr)
WB_Merge1 <- filter(WB_Merge, year %in% c(2000, 2005, 2010))
CodePudding user response:
Another solution, using which():
Sample data:
set.seed(42)
data <- data.frame(year=2000:2022, x=rnorm(23), z=runif(23))
Sample code:
new.data <- data[ which( data$year == 2000 | data$year == 2005 | data$year == 2010) , ]
Output:
year x z
1 2000 1.3709584 0.8877549
6 2005 -0.1061245 0.3467482
11 2010 1.3048697 0.6772768
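One reason to wrap the condition in which() is missing values. The sketch below uses a made-up data frame d with an NA year (an assumption, the sample data above has none). Plain logical indexing turns an NA in the condition into an all-NA row, while which() only returns the positions that are TRUE.

```r
# Hypothetical data with a missing year.
d <- data.frame(year = c(2000, NA, 2005, 2007), x = 1:4)

# The comparison yields NA wherever year is NA: TRUE NA TRUE FALSE
keep <- d$year == 2000 | d$year == 2005 | d$year == 2010

with_na <- d[keep, ]         # the NA index produces an extra all-NA row
no_na   <- d[which(keep), ]  # which() drops the NA and keeps rows 1 and 3

nrow(with_na)
nrow(no_na)
```

subset() behaves like the which() version here, since it also discards rows where the condition is NA.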