I have a set of panel data, df1, that take the following form:
df1:
year state eligibility coverage
1990 AL .87 .70
1991 AL .78 .61
1992 AL .82 .63
1993 AL .79 .69
1994 AL .82 .73
1990 AK .91 .88
1991 AK .83 .79
1992 AK .82 .71
1993 AK .77 .69
1994 AK .82 .73
I need to add a variable "professionalism" from a different set of data, df2, but the problem is that df2 only has observations measured on even years. df2 thus takes the following form:
df2:
year state professionalism
1990 AL 1.33
1992 AL 1.40
1994 AL 1.42
1990 AK -0.92
1992 AK -0.98
1994 AK -1.02
Is there a function R that will add the odd years into df2, copying the value of the year 1, producing the following output:
df2':
year state professionalism
1990 AL 1.33
1991 AL 1.40
1992 AL 1.40
1993 AL 1.42
1994 AL 1.42
1990 AK -0.92
1991 AK -0.98
1992 AK -0.98
1993 AK -1.02
1994 AK -1.02
I can then merge the professionalism variable from the new df2' to df1... is this possible?
CodePudding user response:
We can use complete
with fill
library(dplyr)
library(tidyr)
df2 %>%
complete(year = 1990:1994, state) %>%
group_by(state) %>%
fill(professionalism, .direction = "updown") %>%
ungroup %>%
arrange(state, year)
-output
# A tibble: 10 x 3
year state professionalism
<int> <chr> <dbl>
1 1990 AK -0.92
2 1991 AK -0.98
3 1992 AK -0.98
4 1993 AK -1.02
5 1994 AK -1.02
6 1990 AL 1.33
7 1991 AL 1.4
8 1992 AL 1.4
9 1993 AL 1.42
10 1994 AL 1.42
data
df2 <- structure(list(year = c(1990L, 1992L, 1994L, 1990L, 1992L, 1994L
), state = c("AL", "AL", "AL", "AK", "AK", "AK"), professionalism = c(1.33,
1.4, 1.42, -0.92, -0.98, -1.02)), class = "data.frame", row.names = c(NA,
-6L))
CodePudding user response:
I think the easiest way to do it is to create a new column in df1 that rounds the year to an even value, and then left_join
the data from df2:
library(tidyverse)
#Setting up example data
df1 <- tribble(
~year, ~state, ~eligibility, ~coverage,
1990, "AL", .87, .70,
1991, "AL", .78, .61,
1992, "AL", .82, .63,
1993, "AL", .79, .69,
1994, "AL", .82, .73,
1990, "AK", .91, .88,
1991, "AK", .83, .79,
1992, "AK", .82, .71,
1993, "AK", .77, .69,
1994, "AK", .82, .73)
df2 <- tribble(
~year, ~state, ~professionalism,
1990, "AL", 1.33,
1992, "AL", 1.40,
1994, "AL", 1.42,
1990, "AK", -0.92,
1992, "AK", -0.98,
1994, "AK", -1.02)
#Create a "year even" variable in df1, then left join from df2
df1 <- df1 %>% mutate(year_even = ceiling(year/2)*2)
df1 <- left_join(df1, df2, by = c("year_even" = "year", "state" = "state"))