Similar to unite()
do to columns, is there a way to combine rows across columns for specific rows separating values with semicolon?
In the example below, IC_1
and IC_2
were combined as a new row, values between brackets and separeted by ;
structure(list(treatment = c("product", "product", "product",
"product", "control", "control", "control", "control"), variable = c("A",
"B", "IC_1", "IC_2", "A", "B", "IC_1", "IC_2"), X1 = 1:8, X2 = 8:15,
X3 = 16:23), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-8L))
treatment variable X1 X2 X3
<chr> <chr> <int> <int> <int>
1 product A 1 8 16
2 product B 2 9 17
3 product IC_1 3 10 18
4 product IC_2 4 11 19
5 control A 5 12 20
6 control B 6 13 21
7 control IC_1 7 14 22
8 control IC_2 8 15 23
Desired output:
treatment variable X1 X2 X3
<chr> <chr> <chr> <chr> <chr>
1 product A 1 8 16
2 product B 2 9 17
3 product IC [3;4][10;11][18:19]
4 control A 5 12 20
5 control B 6 13 21
6 control IC [7;8][14;15][22;23]
CodePudding user response:
You could check to see whether variable contains "IC" and group on that, then use paste
to glue "IC" values together. Note, however, that this will change columns X1:X3
to character data.
library(tidyverse)
df %>%
group_by(treatment, variable = ifelse(grepl('IC', variable), 'IC', variable)) %>%
summarize(across(X1:X3, ~ifelse(length(.x) == 1, as.character(.x), paste(.x, collapse = ';')))) %>%
mutate(across(X1:X3, ~ifelse(grepl(';', .x), sprintf('[%s]', .x), .x)))
treatment variable X1 X2 X3
<chr> <chr> <chr> <chr> <chr>
1 control A 5 12 20
2 control B 6 13 21
3 control IC [7;8] [14;15] [22;23]
4 product A 1 8 16
5 product B 2 9 17
6 product IC [3;4] [10;11] [18;19]
CodePudding user response:
Here is one more dplyr
solution:
First we remove _1
and _2
to be able to create groups.
Then we group_by
and apply an ifelse
statement across the columns that start with X
afterwords do some data wrangling.
library(dplyr)
library(stringr)
df %>%
mutate(row = row_number()) %>%
mutate(variable = str_remove(variable, '\\_\\d')) %>%
group_by(treatment, variable) %>%
mutate(across(starts_with("X"), ~ifelse(
variable == "IC", paste0("[", ., ";",lead(.), "]"),as.character(.)))
) %>%
slice(1) %>%
arrange(row) %>%
select(-row)
treatment variable X1 X2 X3
<chr> <chr> <chr> <chr> <chr>
1 product A 1 8 16
2 product B 2 9 17
3 product IC [3;4] [10;11] [18;19]
4 control A 5 12 20
5 control B 6 13 21
6 control IC [7;8] [14;15] [22;23]