I want to replace the values in the 'Grade' column with NA for every row whose 'ID' value is duplicated.
This is my data frame currently:
ID Name Grade
1001 Mary 10
1002 John 9
1002 John 10
1003 James 12
And this is what I want the data frame to look like:
ID Name Grade
1001 Mary 10
1002 John NA
1002 John NA
1003 James 12
How would I go about accomplishing this?
Thanks!
CodePudding user response:
You may try:
library(dplyr)
df %>%
  group_by(ID) %>%
  # n() is the number of rows in the current ID group
  mutate(Grade = ifelse(n() > 1, NA, Grade))
ID Name Grade
<int> <chr> <int>
1 1001 Mary 10
2 1002 John NA
3 1002 John NA
4 1003 James 12
CodePudding user response:
Here are a couple of base R options:
- Using duplicated.
df$Grade[duplicated(df$ID) | duplicated(df$ID, fromLast = TRUE)] <- NA
df
# ID Name Grade
#1 1001 Mary 10
#2 1002 John NA
#3 1002 John NA
#4 1003 James 12
- Using table.
df$Grade[df$ID %in% names(Filter(function(x) x > 1, table(df$ID)))] <- NA
You can also use dplyr for the first option:
library(dplyr)
df <- df %>%
  mutate(Grade = replace(Grade, duplicated(ID) |
                           duplicated(ID, fromLast = TRUE), NA))
df
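In case it helps, here is a small base R sketch of why the duplicated approach above needs both calls. The data frame is rebuilt from the table in the question (the integer column types are an assumption), and fwd, bwd and dup are just illustrative names. duplicated() on its own flags only the second and later occurrences of a value, while fromLast = TRUE flags every occurrence except the last one, so combining them with | marks every row of a repeated ID.

```r
# Sample data rebuilt from the question (column types are assumed)
df <- data.frame(
  ID    = c(1001L, 1002L, 1002L, 1003L),
  Name  = c("Mary", "John", "John", "James"),
  Grade = c(10L, 9L, 10L, 12L)
)

# Flags the second and later occurrences of each ID
fwd <- duplicated(df$ID)                   # FALSE FALSE  TRUE FALSE

# Scans from the end, so it flags every occurrence except the last
bwd <- duplicated(df$ID, fromLast = TRUE)  # FALSE  TRUE FALSE FALSE

# Either flag set means the ID appears more than once
dup <- fwd | bwd                           # FALSE  TRUE  TRUE FALSE

# Blank out Grade for every row of a repeated ID
df$Grade[dup] <- NA
df
```

Using only duplicated(df$ID) would leave the first John row (Grade 9) untouched, which is not what the question asks for.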