I have a dataframe with some rows that differ by a single value, like below:
V1 V2 V3 V4 V5
A B C D E
A B C D F
Is there a convenient way to search for rows that differ only in V5 and then merge these two rows, pasting together the values that differed in V5 to produce:
V1 V2 V3 V4 V5
A B C D E,F
CodePudding user response:
As you're only interested in merging on the fifth column (as indicated in the comments), one solution is to group by the first four and concatenate the fifth.
E.g. using dplyr
:
library(dplyr)
df |>
group_by(across(1:4)) |>
summarise(V5 = paste0(unique(V5), collapse = ",")) |>
ungroup()
Output:
# A tibble: 2 × 5
V1 V2 V3 V4 V5
<chr> <chr> <chr> <chr> <chr>
1 A B C A F
2 A B C D E,F
Data:
library(readr)
df <- read_table("V1 V2 V3 V4 V5
A B C D E
A B C D F
A B C A F")