I want to create a variable that includes every year from startyear to (endyear - 1). My data looks like this:
country | leader | startyear | endyear |
---|---|---|---|
US | Eisenhower | 1953 | 1961 |
US | Kennedy | 1961 | 1963 |
I want my data to look like this:
country | leader | startyear | endyear | year |
---|---|---|---|---|
US | Eisenhower | 1953 | 1961 | 1953 |
US | Eisenhower | 1953 | 1961 | 1954 |
US | Eisenhower | 1953 | 1961 | 1955 |
US | Eisenhower | 1953 | 1961 | 1956 |
US | Eisenhower | 1953 | 1961 | 1957 |
US | Eisenhower | 1953 | 1961 | 1958 |
US | Eisenhower | 1953 | 1961 | 1959 |
US | Eisenhower | 1953 | 1961 | 1960 |
US | Kennedy | 1961 | 1963 | 1961 |
US | Kennedy | 1961 | 1963 | 1962 |
I have many countries in the data set, and I want to transform the whole data set with the same code.
CodePudding user response:
We can build the sequence (`:`) for each row and then unnest the resulting list column:
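library(dplyr)
library(purrr)
library(tidyr)

df1 %>%
  # build startyear:(endyear - 1) for each row as a list column
  mutate(year = map2(startyear, endyear - 1, `:`)) %>%
  # expand the list column to one row per year
  unnest(year)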
Output:
# A tibble: 10 × 5
country leader startyear endyear year
<chr> <chr> <int> <int> <int>
1 US Eisenhower 1953 1961 1953
2 US Eisenhower 1953 1961 1954
3 US Eisenhower 1953 1961 1955
4 US Eisenhower 1953 1961 1956
5 US Eisenhower 1953 1961 1957
6 US Eisenhower 1953 1961 1958
7 US Eisenhower 1953 1961 1959
8 US Eisenhower 1953 1961 1960
9 US Kennedy 1961 1963 1961
10 US Kennedy 1961 1963 1962
Data:
df1 <- structure(list(country = c("US", "US"), leader = c("Eisenhower",
"Kennedy"), startyear = c(1953L, 1961L), endyear = c(1961L, 1963L
)), class = "data.frame", row.names = c(NA, -2L))
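
If you would rather not depend on purrr and tidyr, the same expansion can be sketched in base R. This is only one possible variant, not part of the answer above: it assumes endyear >= startyear in every row, and the names n_years and out are made up for illustration. Because it works row by row, it should not matter how many countries the data set contains. One difference worth knowing: a row with startyear equal to endyear is simply dropped here, whereas `:` would count downwards (for example 1953:1952 gives 1953, 1952).

```r
# Same example data as df1 above
df1 <- data.frame(
  country   = c("US", "US"),
  leader    = c("Eisenhower", "Kennedy"),
  startyear = c(1953L, 1961L),
  endyear   = c(1961L, 1963L)
)

# Number of years each row covers: startyear, ..., endyear - 1
# (assumption: endyear >= startyear, so this is never negative)
n_years <- df1$endyear - df1$startyear

# Repeat each row once per year it covers
out <- df1[rep(seq_len(nrow(df1)), n_years), ]

# sequence() returns 1..n for each n; shift so each run starts at startyear
out$year <- out$startyear + sequence(n_years) - 1L

# Reset the row names left over from the row repetition
rownames(out) <- NULL
out
```

For the two example rows this gives the same 10 rows as the tidyverse version, as a plain data.frame rather than a tibble.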