I have a data frame that has several columns, each with sub-columns, and it looks like this:
I want to transform this data frame and turn the second row, which can be considered a sub-column header, into a new column:
For reproducibility, here is the code that creates the data frame. In it, the row holding the reporting months (`df[1, ]`, the second row if you count the header) should be treated as the sub-column row.
df <- data.frame(
  c("Reporting Month", "Type 1", "Type 2", "Type 3", "Type 4"),
  c(202201, 30, 77, 73, 56),
  c(202202, 5, 47, 91, 30),
  c(202203, 44, 10, 68, 62),
  c(202201, 13, 86, 26, 45),
  c(202202, 70, 3, 92, 5),
  c(202203, 29, 61, 8, 23),
  c(202201, 65, 76, 70, 55),
  c(202202, 27, 41, 73, 75),
  c(202203, 10, 47, 72, 12)
)
colnames(df) <- c("ID", rep("Metric 1", 3), rep("Metric 2", 3), rep("Metric 3", 3))
Let me know if anything is unclear.
CodePudding user response:
If the tidyverse is OK, then perhaps something like this:
library(dplyr, warn.conflicts = FALSE)
library(tidyr)
# combine colnames with 1st row
colnames(df) <- paste(colnames(df), df[1, ], sep = "_")
# [1] "ID_Reporting Month" "Metric 1_202201" "Metric 1_202202" "Metric 1_202203" ...
df %>%
  # drop 1st row (4x10 data.frame)
  slice(-1) %>%
  # pivot longer, month & metric to separate columns (36x4 tibble)
  pivot_longer(cols = starts_with("Metric"), names_to = c("metric", "month"), names_sep = "_") %>%
  # pivot wider, (3) metrics to their own columns (12x5 tibble)
  pivot_wider(names_from = "metric") %>%
  rename(ID = `ID_Reporting Month`, `Reporting Month` = month) %>%
  arrange(`Reporting Month`, ID)
Result:
#> # A tibble: 12 × 5
#>    ID     `Reporting Month` `Metric 1` `Metric 2` `Metric 3`
#>    <chr>  <chr>                  <dbl>      <dbl>      <dbl>
#>  1 Type 1 202201                    30         13         65
#>  2 Type 2 202201                    77         86         76
#>  3 Type 3 202201                    73         26         70
#>  4 Type 4 202201                    56         45         55
#>  5 Type 1 202202                     5         70         27
#>  6 Type 2 202202                    47          3         41
#>  7 Type 3 202202                    91         92         73
#>  8 Type 4 202202                    30          5         75
#>  9 Type 1 202203                    44         29         10
#> 10 Type 2 202203                    10         61         47
#> 11 Type 3 202203                    68          8         72
#> 12 Type 4 202203                    62         23         12
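Note that `Reporting Month` comes out as a character column, because it was parsed from the column names. If a numeric month is preferred, one possible variation (a sketch, assuming tidyr 1.1.0 or later, where `pivot_longer()` has a `names_transform` argument) is to convert it during the pivot. The name `res` is only used here for illustration:

```r
library(dplyr, warn.conflicts = FALSE)
library(tidyr)

# Same input as in the question (stringsAsFactors made explicit for R < 4.0)
df <- data.frame(
  c("Reporting Month", "Type 1", "Type 2", "Type 3", "Type 4"),
  c(202201, 30, 77, 73, 56),
  c(202202, 5, 47, 91, 30),
  c(202203, 44, 10, 68, 62),
  c(202201, 13, 86, 26, 45),
  c(202202, 70, 3, 92, 5),
  c(202203, 29, 61, 8, 23),
  c(202201, 65, 76, 70, 55),
  c(202202, 27, 41, 73, 75),
  c(202203, 10, 47, 72, 12),
  stringsAsFactors = FALSE
)
colnames(df) <- c("ID", rep("Metric 1", 3), rep("Metric 2", 3), rep("Metric 3", 3))

# combine colnames with 1st row, as above
colnames(df) <- paste(colnames(df), df[1, ], sep = "_")

res <- df %>%
  slice(-1) %>%
  pivot_longer(
    cols = starts_with("Metric"),
    names_to = c("metric", "month"),
    names_sep = "_",
    # convert the month part of each column name to integer
    names_transform = list(month = as.integer)
  ) %>%
  pivot_wider(names_from = "metric") %>%
  rename(ID = `ID_Reporting Month`, `Reporting Month` = month) %>%
  arrange(`Reporting Month`, ID)
```

The rest of the pipeline is unchanged; only the type of `Reporting Month` differs (`<int>` instead of `<chr>`).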
Created on 2022-10-14 with reprex v2.0.2
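If the tidyverse is not an option, roughly the same reshaping can be sketched in base R. This is only one possible approach, not the method used above, and it assumes that every month has its metric columns in the same order (true for the sample data). The helper names `metric`, `month`, `ids`, `vals` and `out` are made up for this example:

```r
# Same input as in the question (stringsAsFactors made explicit for R < 4.0)
df <- data.frame(
  c("Reporting Month", "Type 1", "Type 2", "Type 3", "Type 4"),
  c(202201, 30, 77, 73, 56),
  c(202202, 5, 47, 91, 30),
  c(202203, 44, 10, 68, 62),
  c(202201, 13, 86, 26, 45),
  c(202202, 70, 3, 92, 5),
  c(202203, 29, 61, 8, 23),
  c(202201, 65, 76, 70, 55),
  c(202202, 27, 41, 73, 75),
  c(202203, 10, 47, 72, 12),
  stringsAsFactors = FALSE
)
colnames(df) <- c("ID", rep("Metric 1", 3), rep("Metric 2", 3), rep("Metric 3", 3))

metric <- colnames(df)[-1]                   # metric label of each value column
month  <- as.character(unlist(df[1, -1]))    # month stored in the first row
ids    <- df[-1, 1]                          # "Type 1" ... "Type 4"
vals   <- unname(as.matrix(df[-1, -1]))      # 4 x 9 numeric matrix of values

# build one 4-row block per month, then stack the blocks
out <- do.call(rbind, lapply(unique(month), function(m) {
  block <- vals[, month == m, drop = FALSE]  # one column per metric
  colnames(block) <- metric[month == m]
  data.frame(ID = ids, `Reporting Month` = m, block,
             check.names = FALSE, stringsAsFactors = FALSE)
}))
rownames(out) <- NULL
```

`out` should then be a 12 x 5 data frame with the same columns and row order as the tibble shown in the result.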
CodePudding user response:
Perhaps something like this?
cbind(df[1, ], df)