I wasn't sure how to word this question.
I have a data frame called p08 that shows how each state voted in the 2008 election. The indicator variable is named 'DemStatus', where 1 == voted Democrat and 0 == voted Republican.
I want to label each state as Republican or Democrat for all four years between elections. For instance, Alabama voted Republican in 2008, so I want to label it as 0 (Republican) for 2008, 2009, 2010, and 2011.
I accomplished this by copying the data frame and naming it something else for each year. However, this is a very tedious process since I have election data from the 90s until 2020.
QUESTION: Is there a faster/simpler way to accomplish exactly what I have below?
p08=structure(list(STATE = c("Alabama", "Alaska", "Arizona", "Arkansas",
"California", "Colorado", "Connecticut", "Delaware", "Dist. of Col.",
"Florida", "Georgia", "Hawaii", "Idaho", "Illinois", "Indiana",
"Iowa", "Kansas", "Kentucky", "Louisiana", "Maine", "Maryland",
"Massachusetts", "Michigan", "Minnesota", "Mississippi", "Missouri",
"Montana", "Nebraska", "Nevada", "New Hampshire", "New Jersey",
"New Mexico", "New York", "North Carolina", "North Dakota", "Ohio",
"Oklahoma", "Oregon", "Pennsylvania", "Rhode Island", "South Carolina",
"South Dakota", "Tennessee", "Texas", "Utah", "Vermont", "Virginia",
"Washington", "West Virginia", "Wisconsin", "Wyoming"), YEAR = c(2008,
2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008,
2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008,
2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008,
2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008,
2008, 2008, 2008, 2008, 2008, 2008), DemStatus = c(0, 0, 0, 0,
1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0,
0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1,
1, 1, 0, 1, 0)), row.names = 10:60, class = "data.frame")
party09=p08
party09$YEAR=2009
party10=p08
party10$YEAR=2010
party11=p08
party11$YEAR=2011
library(dplyr) # provides bind_rows()
party08_11 = bind_rows(p08,party09,party10,party11)
CodePudding user response:
Another way would be to create a sequence of years for each row. This can be applied to any dataset regardless of which or how many election years it contains, provided that each result should always cover the election year plus the following 3 years.
library(data.table)
p08 <- setDT(p08)[, .(STATE, YEAR = seq(YEAR, YEAR + 3L), DemStatus), by = 1:nrow(p08)][, nrow := NULL]
Output (showing first and last 5 rows):
STATE YEAR DemStatus
1: Alabama 2008 0
2: Alabama 2009 0
3: Alabama 2010 0
4: Alabama 2011 0
5: Alaska 2008 0
---
200: Wisconsin 2011 1
201: Wyoming 2008 0
202: Wyoming 2009 0
203: Wyoming 2010 0
204: Wyoming 2011 0
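To illustrate the claim that this works with several election years at once, here is a minimal sketch on a small toy table. The table `elections` and the grouping column `row_id` are names made up for this example, and the four rows are only a stand-in for the real data.

```r
library(data.table)

# Toy stand-in for a table holding more than one election (illustrative only).
elections <- data.table(
  STATE = c("Alabama", "Alabama", "Colorado", "Colorado"),
  YEAR = c(2004L, 2008L, 2004L, 2008L),
  DemStatus = c(0, 0, 0, 1)
)

# Group by row so that seq() sees one YEAR at a time; STATE and DemStatus
# (length 1 per group) are recycled across the four generated years.
expanded <- elections[
  , .(STATE, YEAR = seq(YEAR, YEAR + 3L), DemStatus),
  by = .(row_id = seq_len(nrow(elections)))
][, row_id := NULL]

print(expanded)
```

Naming the grouping column explicitly (`row_id`) avoids having to know what name data.table derives for an unnamed `by` expression.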
CodePudding user response:
Do you mean something like this?
library(dplyr)
p08 %>%
left_join(tibble(YEAR=2008, YEAR1=2008:2011), by = "YEAR") %>%
mutate(YEAR = YEAR1, YEAR1 = NULL)
# # A tibble: 204 x 3
# STATE YEAR DemStatus
# <chr> <int> <dbl>
# 1 Alabama 2008 0
# 2 Alabama 2009 0
# 3 Alabama 2010 0
# 4 Alabama 2011 0
# 5 Alaska 2008 0
# 6 Alaska 2009 0
# 7 Alaska 2010 0
# 8 Alaska 2011 0
# 9 Arizona 2008 0
# 10 Arizona 2009 0
# # ... with 194 more rows
(I'm inferring dplyr based on your use of bind_rows.)
Alternatively, using tidyr::complete:
p08 %>%
tidyr::complete(STATE, YEAR = 2008:2011) %>%
group_by(STATE) %>%
mutate(DemStatus = na.omit(DemStatus)[1]) %>%
ungroup()
# # A tibble: 204 x 3
# STATE YEAR DemStatus
# <chr> <dbl> <dbl>
# 1 Alabama 2008 0
# 2 Alabama 2009 0
# 3 Alabama 2010 0
# 4 Alabama 2011 0
# 5 Alaska 2008 0
# 6 Alaska 2009 0
# 7 Alaska 2010 0
# 8 Alaska 2011 0
# 9 Arizona 2008 0
# 10 Arizona 2009 0
# # ... with 194 more rows
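Another possible variant, sketched here on toy data rather than the real p08, repeats each row with tidyr::uncount and offsets the year. The tibble `votes` and the helper column `offset` are names invented for this example. It assumes every result should cover exactly four years.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the question's data (illustrative values only).
votes <- tibble(
  STATE = c("Alabama", "Colorado"),
  YEAR = c(2008, 2008),
  DemStatus = c(0, 1)
)

expanded <- votes %>%
  uncount(4, .id = "offset") %>%        # repeat every row 4 times; offset runs 1..4
  mutate(YEAR = YEAR + offset - 1) %>%  # election year plus 0, 1, 2, 3
  select(-offset)

print(expanded)
```

Because the years are derived from each row's own YEAR, this should also work when the data frame stacks several elections, with no need to spell out the year range.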
CodePudding user response:
In base R, you can do the same using a cross join (merge with by = NULL):
merge(p08[-2], list(YEAR=2008:2011), by = NULL) # -2 means remove year
STATE DemStatus YEAR
1 Alabama 0 2008
2 Alaska 0 2008
3 Arizona 0 2008
4 Arkansas 0 2008
5 California 1 2008
6 Colorado 1 2008
: : : :
If you do not know the column index of the year, you can simply do:
merge(subset(p08, select = -YEAR), list(YEAR = 2008:2011), by =NULL)
Note that if you do not care about the column names, this can be as simple as
merge(p08, 2008:2011)
or even
merge(p08, 2008:2011)[-2]
STATE DemStatus y
1 Alabama 0 2008
2 Alaska 0 2008
3 Arizona 0 2008
4 Arkansas 0 2008
5 California 1 2008
6 Colorado 1 2008
7 Connecticut 1 2008
8 Delaware 1 2008
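The cross join pairs every row with every listed year, so it fits a data frame holding a single election. For several elections stacked in one data frame, a rough base R sketch is to repeat each row and add an offset to its own year. The names `elections` and `term` are made up for this example, and the data are a toy stand-in.

```r
# Toy data with two elections per state (illustrative stand-in only).
elections <- data.frame(
  STATE = c("Alabama", "Alabama", "Colorado", "Colorado"),
  YEAR = c(2004, 2008, 2004, 2008),
  DemStatus = c(0, 0, 0, 1)
)

term <- 4  # assumed number of years each election result applies to

# Repeat each row `term` times, then add 0, 1, 2, 3 to its election year.
expanded <- elections[rep(seq_len(nrow(elections)), each = term), ]
expanded$YEAR <- expanded$YEAR + rep(seq_len(term) - 1, times = nrow(elections))
rownames(expanded) <- NULL

print(expanded)
```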