How to merge duplicated rows

Time:11-24

I have a data frame that looks like this:

Nicknames               Names
Fonse, Fons             Alfons
Fonse, Fonsi            Alfons
Gustel, Gustl, Guste,   August
Baldi                   Balthasar
Hausl, Baldi            Balthasar
Flore, Flori            Florian

I would like to merge the rows that share the same name, so that the result is:

Nicknames                   Names
Fonse, Fons, Fonse, Fonsi   Alfons
Gustel, Gustl, Guste,       August
Baldi, Hausl, Baldi         Balthasar
Flore, Flori                Florian

I was able to create a subset of the duplicates, but I don't know how to combine them:

# keep every row whose name occurs more than once
nick2 <- subset(nick, Names %in% Names[duplicated(Names)])

Here is the data as a CSV file: https://github.com/Garybertrand/nick

CodePudding user response:

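This should solve your problem. It groups the rows by Names and pastes the nicknames of each group together:

library(data.table)
library(dplyr)

# setDT() converts df to a data.table by reference
setDT(df)[, list(Nicknames = paste(Nicknames, collapse = ', ')),
          by = c('Names')] %>%
  select(Nicknames, Names)  # put the columns back in their original order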

CodePudding user response:

You can also use base R. Calling unique(df) first drops rows that are exact duplicates of each other:

aggregate(Nicknames ~ Names, unique(df), paste, collapse = ", ")
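
To try this without downloading the CSV, here is a minimal sketch that types the sample rows in by hand. The data frame is an assumption based on the table in the question (I dropped the trailing comma after "Guste"), and `merged` is just a name I picked.

```r
# Sample data typed in by hand from the question (not read from the CSV).
df <- data.frame(
  Nicknames = c("Fonse, Fons", "Fonse, Fonsi", "Gustel, Gustl, Guste",
                "Baldi", "Hausl, Baldi", "Flore, Flori"),
  Names = c("Alfons", "Alfons", "August",
            "Balthasar", "Balthasar", "Florian"),
  stringsAsFactors = FALSE
)

# One row per name; the nicknames of each group are joined with ", ".
merged <- aggregate(Nicknames ~ Names, unique(df), paste, collapse = ", ")

# aggregate() puts the grouping column first, so restore the question's order.
merged <- merged[, c("Nicknames", "Names")]
print(merged)
```

Note that the merged string for Alfons still contains "Fonse" twice, which matches the output asked for in the question.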

CodePudding user response:

A short tidyverse solution looks like this:

library(tidyverse)

df %>% 
  group_by(Names) %>% 
  summarize(Nicknames = paste0(Nicknames, collapse = ", "))
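
All of the approaches above keep a nickname that appears in more than one row (for example "Fonse" shows up twice for Alfons), which is what the question asked for. If you would rather list each nickname once, something along these lines might work. `merge_nicknames` is a hypothetical helper name, and it assumes the nicknames are always separated by commas.

```r
# Hypothetical helper: split on commas, trim spaces, drop blanks and repeats,
# then join what is left with ", ".
merge_nicknames <- function(x) {
  parts <- trimws(unlist(strsplit(x, ",", fixed = TRUE)))
  parts <- parts[nzchar(parts)]          # ignore empty pieces, e.g. from "a, b,"
  paste(unique(parts), collapse = ", ")
}

merge_nicknames(c("Fonse, Fons", "Fonse, Fonsi"))
# [1] "Fonse, Fons, Fonsi"
```

It can replace the paste call in any of the answers, e.g. summarize(Nicknames = merge_nicknames(Nicknames)).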