I know how to duplicate columns in R but wonder if there is a more elegant way for a specific task that I'm doing.
I have multiple dataset. Some have value by Year, some have values by QuarterYear. Given that I need to do calculation to all the dataset by QuarterYear, I need to change dataset that have values by Year only to repeat the value by QuarterYear.
How my df look:
Item Year2020 Year2021
A 1 4
B 2 5
C 3 6
What I'm trying to achieve
Item 1Q20 2Q20 3Q20 4Q20 1Q21 2Q21 3Q21 4Q21
A 1 1 1 1 4 4 4 4
B 2 2 2 2 5 5 5 5
C 3 3 3 3 6 6 6 6
How I achieve it:
#recreate df
df <- data.frame(Item = c("A", "B", "C"),
Year2020 = c(1, 2, 3),
Year2021 = c(4, 5, 6))
#replicate Year column 3 times and change names
df <- df %>%
cbind(replicate(3, df$`Year2020`)) %>%
rename(`1Q20` = `Year2020`,
`2Q20` = `1`,
`3Q20` = `2`,
`4Q20` = `3`) %>%
cbind(replicate(3, df$`Year2021`)) %>%
rename(`1Q21` = `Year2021`,
`2Q21` = `1`,
`3Q21` = `2`,
`4Q21` = `3`)
#rearrange col
df <- df[,c(1:2,4:6,3,7:9)]
Thankfully I only have two year columns but I have to do it to a lot of dataset and wonder if there is more neat way to this.
CodePudding user response:
Do not know what is elegant for you, I like a pivot_*
workflow:
library(tidyverse)
df %>%
pivot_longer(-Item) %>%
slice(rep(1:n(), each = 4L)) %>%
group_by(Item, name) %>%
mutate(name = paste0(1:n(), "Q", str_extract(name, "[0-9]{2}$"))) %>%
pivot_wider()
# # A tibble: 3 × 9
# # Groups: Item [3]
# Item `1Q20` `2Q20` `3Q20` `4Q20` `1Q21` `2Q21` `3Q21` `4Q21`
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 A 1 1 1 1 4 4 4 4
# 2 B 2 2 2 2 5 5 5 5
# 3 C 3 3 3 3 6 6 6 6
CodePudding user response:
Here's a programmatic base R option:
idx <- which(grepl("Year", colnames(df)))
notidx <- setdiff(seq(ncol(df)), idx)
years <- gsub("Year20", "", colnames(df)[idx])
Q <- paste0(1:4, "Q", rep(years, each = 4))
df <- df[sort(c(notidx, rep(idx, each = 4)))]
colnames(df)[-grepl("Year", colnames(df))] <- Q
# Item 1Q20 2Q20 3Q20 4Q20 1Q21 2Q21 3Q21 4Q21
# 1 A 1 1 1 1 4 4 4 4
# 2 B 2 2 2 2 5 5 5 5
# 3 C 3 3 3 3 6 6 6 6
CodePudding user response:
df %>%
mutate(`1Q20` = Year2020,
`2Q20` = Year2020,
`3Q20` = Year2020,
`4Q20` = Year2020,
`1Q21` = Year2021,
`2Q21` = Year2021,
`3Q21` = Year2021,
`4Q21` = Year2021)
CodePudding user response:
cbind(df$Item,rep(df %>% dplyr::select(contains("Year")),4) %>%
as.data.frame() %>%
rename(Q1_20=ends_with("20"),
Q1_21=ends_with("21"),
Q2_20=ends_with("20.1"),
Q2_21=ends_with("21.1")))