I have a data frame that looks like this:
Nicknames | Names |
---|---|
Fonse, Fons | Alfons |
Fonse, Fonsi | Alfons |
Gustel, Gustl, Guste, | August |
Baldi | Balthasar |
Hausl, Baldi | Balthasar |
Flore, Flori | Florian |
I would like to merge the rows that share the same name, so that the result is:
Nicknames | Names |
---|---|
Fonse, Fons, Fonse, Fonsi | Alfons |
Gustel, Gustl, Guste, | August |
Baldi, Hausl, Baldi | Balthasar |
Flore, Flori | Florian |
I was able to create a subset of the duplicates, but I don't know how to combine them:
nick2 <- subset(nick, any(duplicated(nick$Names)))
Here is the data as a CSV file: https://github.com/Garybertrand/nick
CodePudding user response:
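This should solve your problem. It groups the rows by Names with data.table and pastes the nicknames of each group into one string:
library(data.table)
library(dplyr)
# setDT() converts df to a data.table by reference; select() then
# restores the original column order (Nicknames first).
setDT(df)[, list(Nicknames = paste(Nicknames, collapse = ', ')),
          by = c('Names')] %>%
  select(Nicknames, Names)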
CodePudding user response:
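You can also use base R. Here unique(df) drops rows that are exact duplicates, and aggregate() pastes the remaining nicknames together for each name: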
aggregate(Nicknames ~ Names, unique(df), paste, collapse = ", ")
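To try the base R call without downloading the CSV, here is a minimal sketch that rebuilds the table from the question by hand. The data frame name df and the variable merged are just placeholders, and the final column reordering is an assumption about the layout you want.

```r
# Sample data typed in from the question's table (the trailing comma in
# the "Gustel" row is kept exactly as posted).
df <- data.frame(
  Nicknames = c("Fonse, Fons", "Fonse, Fonsi", "Gustel, Gustl, Guste,",
                "Baldi", "Hausl, Baldi", "Flore, Flori"),
  Names = c("Alfons", "Alfons", "August",
            "Balthasar", "Balthasar", "Florian"),
  stringsAsFactors = FALSE
)

# unique(df) removes rows that are identical in every column;
# paste(..., collapse = ", ") joins the nickname strings within each name.
merged <- aggregate(Nicknames ~ Names, unique(df), paste, collapse = ", ")

# aggregate() returns the grouping column first, so reorder the columns
# to match the layout in the question.
merged <- merged[, c("Nicknames", "Names")]
print(merged)
```

Note that this joins the nickname strings as they are, so a nickname that appears in two rows for the same name (such as Fonse) shows up twice, which matches the desired output in the question.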
CodePudding user response:
A short tidyverse solution would look like this:
library(tidyverse)
df %>%
  group_by(Names) %>%
  summarize(Nicknames = paste0(Nicknames, collapse = ", "))
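If you would rather not have the same nickname listed twice for one name, a possible variant is to split the strings into individual nicknames before joining them. This is only a sketch: collapse_unique is a hypothetical helper written for this answer (it is not part of dplyr), and the sample data is typed in from the question's table. It goes beyond the output you asked for, which keeps the repeats.

```r
library(dplyr)

# Sample data typed in from the question's table.
df <- data.frame(
  Nicknames = c("Fonse, Fons", "Fonse, Fonsi", "Gustel, Gustl, Guste,",
                "Baldi", "Hausl, Baldi", "Flore, Flori"),
  Names = c("Alfons", "Alfons", "August",
            "Balthasar", "Balthasar", "Florian"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split every entry on commas, trim whitespace,
# drop empty pieces and repeated nicknames, then join what is left.
collapse_unique <- function(x) {
  parts <- trimws(unlist(strsplit(x, ",", fixed = TRUE)))
  paste(unique(parts[nzchar(parts)]), collapse = ", ")
}

merged <- df %>%
  group_by(Names) %>%
  summarize(Nicknames = collapse_unique(Nicknames), .groups = "drop") %>%
  select(Nicknames, Names)

print(merged)
```

With this version the Alfons row contains Fonse only once, and the trailing comma in the August row is dropped as a side effect of the split.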