Home > Blockchain >  Merging cells in R based on a value in another row, with character format (not numeric)
Merging cells in R based on a value in another row, with character format (not numeric)

Time:08-03

I would like to merge several cells in a dataframe, based on a value in another column. However, I would also like to keep the data and not lose any values when merging, please see below.

I've tried to use the aggregate function but I don't think it works as my desired column is in the format character, rather than numeric.

Current dataframe format:

id     food
1      salt
1      chicken
1      pasta
2      rice
2      broccoli
3      cream
3      pasta
3      garlic
3      lemon
...    ...

Desired dataframe format:

id     food
1      salt chicken pasta
2      rice broccoli
3      cream pasta garlic lemon
...    ...

I was wondering if it was even possible to merge cells that have character/string properties, and if so how would I go about doing this?

Thank you very much

Data

dat  <- structure(list(id = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L), food = c("salt",
"chicken", "pasta", "rice", "broccoli", "cream", "pasta", "garlic",
"lemon")), class = "data.frame", row.names = c(NA, -9L))

CodePudding user response:

You can do this with dplyr:

library(dplyr)
dat  |>
    group_by(id)  |>
    summarise(food = paste(food, collapse = " "))
# A tibble: 3 x 2
#      id food
#   <int> <chr>
# 1     1 salt chicken pasta      
# 2     2 rice broccoli
# 3     3 cream pasta garlic lemon

Alternatively here is a data.table approach:

library(data.table)
setDT(dat)

dat[, .(food = paste(food, collapse = " ")), by = id]
#    id                     food
# 1:  1       salt chicken pasta
# 2:  2            rice broccoli
# 3:  3 cream pasta garlic lemon

And for the sake of completeness a base R approach:

aggregate(
    list(food = dat$food), 
    by = list(id = dat$id), 
    paste
)
#   id                        food
# 1  1        salt, chicken, pasta
# 2  2              rice, broccoli
# 3  3 cream, pasta, garlic, lemon

CodePudding user response:

The code from SamR is excellent,

no matter what, do not use mine,

Conc<-NULL
for (i in 1:max(unique(dat[,1]))){
  Conc[i]<- paste(subset.data.frame(dat, dat[,1]==i)[,2], sep = "", collapse = " ")
}

df <- as.data.frame(matrix(c(1:max(unique(dat[,1])),Conc), nrow= max(unique(dat[,1])),ncol = 2))
colnames(df)<-c("ID", "Food")
  • Related