Home > Enterprise >  R grouping eror
R grouping eror

Time:09-28

I have a data which i need to group on column ID and then concatenate the rows in R studio.

Could anyone please help?

Current Data

ID   Name   Diagnosis    Medicine
1    A      a            a
1    A      b            a
1    A      c            b 
2    B      d            b
2    B      e            c
3    C      f            d

Result Expected

ID   Name   Diagnosis    Medicine
1    A      a|b|c         a|a|b
2    B      d|e           b|c
3    C      f             d

Code written till now is diagnoses<-read_xlsx("excel file.xlsx")

diagnoses <-diagnoses[,c( "ID" ,"Name" ,"Diagnosis" ,"Medicine"),] head(diagnoses)

aggregate(cbind(Diagnosis, Medicine)~ID, df, paste0, collapse = '|') write.xlsx(diagnoses, "Diagnoses.xlsx")

I have integer variables, bit variable, nvarchar variable)

I am getting error :( Please help

CodePudding user response:

Try groupby with mutate:

df %>% 
  group_by(ID, Name) %>% 
  summarize(Diagnosis = paste0(Diagnosis, collapse = "|"),
            Medicine = paste0(Medicine, collapse = "|")) 

Output:

     ID Name  Diagnosis Medicine
  <int> <chr> <chr>     <chr>   
1     1 A     a|b|c     a|a|b   
2     2 B     d|e       b|c     
3     3 C     f         d       

CodePudding user response:

If you read the data in diagnoses you should use the same variable in aggregate function in place of df. Another important thing is to save the output in a variable after running the function.

Try -

diagnoses<- readxl::read_xlsx("excel file.xlsx")
result <- aggregate(cbind(Diagnosis, Medicine)~ID, diagnoses,paste0, collapse = '|')
openxlsx::write.xlsx(result, "Diagnoses.xlsx")
  •  Tags:  
  • r
  • Related