I have an R data frame that looks like this:
rs2288741 rs1821185 rs1432315 ID
T A T A
T C T A
G C C B
T C T B
G A C C
G A C C
T A C D
T C T D
I need to paste together the values in each column for rows that share the same "ID". The output should look like this:
rs2288741 rs1821185 rs1432315 ID
TT AC TT A
GT CC CT B
GG AA CC C
TT AC CT D
Is there an easy way to do this?
CodePudding user response:
A data.table solution:
library(data.table)
# mydata is the data frame from the question
setDT(mydata)[, lapply(.SD, paste0, collapse = ""), by = .(ID)]
# ID rs2288741 rs1821185 rs1432315
# 1: A TT AC TT
# 2: B GT CC CT
# 3: C GG AA CC
# 4: D TT AC CT
CodePudding user response:
If you use the tidyverse, you could do:
library(tidyverse)
df %>%
group_by(ID) %>%
summarize(across(everything(), ~ paste(.x, collapse = ''))) %>%
select(2:4, 1)
#> # A tibble: 4 x 4
#> rs2288741 rs1821185 rs1432315 ID
#> <chr> <chr> <chr> <chr>
#> 1 TT AC TT A
#> 2 GT CC CT B
#> 3 GG AA CC C
#> 4 TT AC CT D
Created on 2022-04-11 by the reprex package (v2.0.1)
CodePudding user response:
A base R version could be
aggregate(. ~ ID, data = df, function(x) paste(x, collapse = ""))
#> ID rs2288741 rs1821185 rs1432315
#> 1 A TT AC TT
#> 2 B GT CC CT
#> 3 C GG AA CC
#> 4 D TT AC CT
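None of the answers above show how `df` (or `mydata`) is built, so here is a minimal, self-contained sketch that recreates the question's data and collapses it per ID. It uses the `by =` interface of `aggregate()` rather than the formula call, which is just an alternative spelling of the same idea. The name `res` and the final column reordering are my own additions to match the layout asked for in the question.

```r
# Example data copied from the question.
df <- data.frame(
  rs2288741 = c("T", "T", "G", "T", "G", "G", "T", "T"),
  rs1821185 = c("A", "C", "C", "C", "A", "A", "A", "C"),
  rs1432315 = c("T", "T", "C", "T", "C", "C", "C", "T"),
  ID        = c("A", "A", "B", "B", "C", "C", "D", "D"),
  stringsAsFactors = FALSE
)

# All columns except the grouping column.
value_cols <- setdiff(names(df), "ID")

# Paste the values of each column together within each ID group.
res <- aggregate(df[value_cols], by = df["ID"], FUN = paste, collapse = "")

# aggregate() puts the grouping column first; move ID to the end
# so the layout matches the one requested in the question.
res <- res[, c(value_cols, "ID")]
print(res)
```

The printed result should have one row per ID, with the same values as the desired output in the question.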