I have a data which i need to group on column ID and then concatenate the rows in R studio.
Could anyone please help?
Current Data
ID Name Diagnosis Medicine
1 A a a
1 A b a
1 A c b
2 B d b
2 B e c
3 C f d
Result Expected
ID Name Diagnosis Medicine
1 A a|b|c a|a|b
2 B d|e b|c
3 C f d
Code written till now is diagnoses<-read_xlsx("excel file.xlsx")
diagnoses <-diagnoses[,c( "ID" ,"Name" ,"Diagnosis" ,"Medicine"),] head(diagnoses)
aggregate(cbind(Diagnosis, Medicine)~ID, df, paste0, collapse = '|') write.xlsx(diagnoses, "Diagnoses.xlsx")
I have integer variables, bit variable, nvarchar variable)
I am getting error :( Please help
CodePudding user response:
Try groupby
with mutate
:
df %>%
group_by(ID, Name) %>%
summarize(Diagnosis = paste0(Diagnosis, collapse = "|"),
Medicine = paste0(Medicine, collapse = "|"))
Output:
ID Name Diagnosis Medicine
<int> <chr> <chr> <chr>
1 1 A a|b|c a|a|b
2 2 B d|e b|c
3 3 C f d
CodePudding user response:
If you read the data in diagnoses
you should use the same variable in aggregate
function in place of df
. Another important thing is to save the output in a variable after running the function.
Try -
diagnoses<- readxl::read_xlsx("excel file.xlsx")
result <- aggregate(cbind(Diagnosis, Medicine)~ID, diagnoses,paste0, collapse = '|')
openxlsx::write.xlsx(result, "Diagnoses.xlsx")