I have a dataframe akin to the following:
Date | England | ...3 | Wales | ...5 |
---|---|---|---|---|
2021-04-01 | 145 | e | 23 | |
2021-04-02 | 200 | e | 90 | s |
2021-04-03 | 120 | 66 | e |
There are 365 rows.
The columns continue such that they alternate between a location (eg, England) and a generic column name (eg, ...3). There are 190 locations in total, so there are 381 columns. I wanted to use the unite() function from tidyverse to merge each pair of columns with a "-" separator. Eg, "145-e" for the first row for England and "23- " for Wales. The date column should remain as it is. Is there a way to loop this so that each of the 180 pairs unify together? Thank you.
CodePudding user response:
One option would be to first rename your columns in a consistent manner. Then reshape your data to long which makes it easy to apply tidyr::unite
and if desired convert back to wide format afterwards:
# Rename columns
names(dat)[seq(3, ncol(dat), by = 2)] <- paste(names(dat)[seq(2, ncol(dat), by = 2)], "2", sep = "_")
names(dat)[seq(2, ncol(dat), by = 2)] <- paste(names(dat)[seq(2, ncol(dat), by = 2)], "1", sep = "_")
library(tidyr)
dat |>
pivot_longer(-Date, names_to = c("location", ".value"), names_sep = "_") |>
replace_na(list(`2` = "")) |>
unite(value, `1`, `2`, sep = "-") |>
pivot_wider(names_from = location, values_from = value)
#> # A tibble: 3 × 3
#> Date England Wales
#> <chr> <chr> <chr>
#> 1 2021-04-01 145-e 23-
#> 2 2021-04-02 200-e 90-s
#> 3 2021-04-03 120- 66-e
DATA
dat <- data.frame(
Date = c("2021-04-01", "2021-04-02", "2021-04-03"),
England = c(145L, 200L, 120L),
...3 = c("e", "e", NA),
Wales = c(23L, 90L, 66L),
...5 = c(NA, "s", "e")
)