I have this dataframe
df <- data.frame(scientific_name = c("Mandevilla grazielae",
"Aosa parvifolia",
"Mandevilla grazielae",
"Dyckia ebracteata"),
collection_number = c("254", "658","445", "568"))
And I'd like to combine the collection number of the plants in a column named "vouchers"
scientific_name vouchers
Mandevilla grazielae 254, 445
Aosa parvifolia 658
Dyckia ebracteata 568
So far this is what I was able to do:
x <- sqldf('SELECT * FROM df GROUP BY scientific_name ORDER BY scientific_name ASC')
And then I started to learn more...
x <- sqldf('SELECT scientific_name, COUNT(collection_number)
FROM df GROUP BY scientific_name ORDER BY scientific_name ASC')
I'd like to know if there's something like the COUNT function above to concatenate the column "collection_number" in a "vouchers" column. I also tried CONCAT and it did't work.
I'd be very grateful!
CodePudding user response:
In base R
, use aggregate
with paste
aggregate(cbind(vouchers = collection_number) ~ scientific_name, df, toString)
-output
scientific_name vouchers
1 Aosa parvifolia 658
2 Dyckia ebracteata 568
3 Mandevilla grazielae 254, 445