How can I remove duplicates by ID? For each ID, the column Drug must contain unique values only. Any help is appreciated.
dat <- read.table(text="Id Drug
A Meropenem
A Ampicillin
A Augmentin
A Meropenem
A Ampicillin
A Augmentin
B Meropenem
B Ampicillin
B Augmentin", header=TRUE)
This is the desired output:
dat.desired <- read.table(text="Id Drug
A Meropenem
A Ampicillin
A Augmentin
B Meropenem
B Ampicillin
B Augmentin", header=TRUE)
CodePudding user response:
Using group_by() in dplyr lets you remove duplicates within each group only, so a drug is dropped only when it repeats for the same Id.
library(dplyr)
dat %>% group_by(Id) %>% filter( !duplicated(Drug))
Id Drug
<chr> <chr>
1 A Meropenem
2 A Ampicillin
3 A Augmentin
4 B Meropenem
5 B Ampicillin
6 B Augmentin
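As a possible alternative (a minimal sketch, not part of the approach above), dplyr's distinct() keeps one row per Id/Drug combination without an explicit group_by(). The .keep_all = TRUE argument only matters if the data had further columns, which is an assumption here; the variable name res is just for illustration.

```r
library(dplyr)

dat <- read.table(text = "Id Drug
A Meropenem
A Ampicillin
A Augmentin
A Meropenem
A Ampicillin
A Augmentin
B Meropenem
B Ampicillin
B Augmentin", header = TRUE)

# Keep the first row for each distinct (Id, Drug) pair.
# .keep_all = TRUE would retain any other columns; dat has none,
# so it is optional for this example.
res <- distinct(dat, Id, Drug, .keep_all = TRUE)
nrow(res)
```

Listing the columns explicitly makes it clear which combination has to be unique, which helps if more columns are added later.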
CodePudding user response:
unique() selects the unique rows of a data frame. Since dat contains only the Id and Drug columns, this drops the repeated Id/Drug pairs:
d2 <- unique(dat)
## test it: TRUE if every cell matches the desired output
all(d2 == dat.desired)
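If the data frame had more columns than Id and Drug, unique() would compare whole rows. A hedged base R sketch for that case uses duplicated() on just the two columns of interest; the name d3 is only for illustration.

```r
dat <- read.table(text = "Id Drug
A Meropenem
A Ampicillin
A Augmentin
A Meropenem
A Ampicillin
A Augmentin
B Meropenem
B Ampicillin
B Augmentin", header = TRUE)

dat.desired <- read.table(text = "Id Drug
A Meropenem
A Ampicillin
A Augmentin
B Meropenem
B Ampicillin
B Augmentin", header = TRUE)

# Flag rows whose (Id, Drug) pair has already appeared, then drop them.
d3 <- dat[!duplicated(dat[, c("Id", "Drug")]), ]

# Subsetting keeps the original row names (1, 2, 3, 7, 8, 9); reset them.
rownames(d3) <- NULL

# Single TRUE/FALSE check against the desired output.
all(d3 == dat.desired)
```

Resetting the row names is optional for the comparison, but it makes the printed result line up with dat.desired.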