Home > Software engineering >  How to concatenate values of a column B grouped by the other column A SQL and R
How to concatenate values of a column B grouped by the other column A SQL and R

Time:09-06

I have this dataframe

df <- data.frame(scientific_name = c("Mandevilla grazielae",
                                     "Aosa parvifolia",
                                     "Mandevilla grazielae",
                                     "Dyckia ebracteata"), 
                 collection_number = c("254", "658","445", "568"))

And I'd like to combine the collection number of the plants in a column named "vouchers"

scientific_name       vouchers
Mandevilla grazielae  254, 445
Aosa parvifolia       658
Dyckia ebracteata     568

So far this is what I was able to do:

x <- sqldf('SELECT * FROM df GROUP BY scientific_name ORDER BY scientific_name ASC')

And then I started to learn more...

x <- sqldf('SELECT scientific_name, COUNT(collection_number) 
            FROM df GROUP BY scientific_name ORDER BY scientific_name ASC')

I'd like to know if there's something like the COUNT function above to concatenate the column "collection_number" in a "vouchers" column. I also tried CONCAT and it did't work.

I'd be very grateful!

CodePudding user response:

In base R, use aggregate with paste

aggregate(cbind(vouchers = collection_number) ~ scientific_name, df, toString)

-output

        scientific_name vouchers
1      Aosa parvifolia      658
2    Dyckia ebracteata      568
3 Mandevilla grazielae 254, 445
  • Related